Class MDSet

java.lang.Object
chemaxon.descriptors.MDSet

@PublicApi public class MDSet extends Object
MDSet combines several MolecularDescriptors into one entity. The purpose of this class is to allow dissimilarity calculations to be performed on various MolecularDescriptors simultaneously. This improves predictive power over individual descriptors and is more efficient than performing the calculations one by one.
MDSet objects can be compared against each other by dissimilarity metrics. The dissimilarity coefficient is obtained as the weighted sum of the dissimilarity coefficients of the pair-wise comparison of components. Weights are stored in the MDSetParameters class, aggregated by this class.
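The weighted sum described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `Component`, `ScalarComponent` and `WeightedSumSketch` are hypothetical stand-ins, and the weights are passed as a plain array rather than read from an MDSetParameters object.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a descriptor component; not the ChemAxon API. */
interface Component {
    float dissimilarity(Component other);
}

/** Toy component: a single number, compared by absolute difference. */
final class ScalarComponent implements Component {
    final float value;

    ScalarComponent(float value) { this.value = value; }

    @Override
    public float dissimilarity(Component other) {
        return Math.abs(value - ((ScalarComponent) other).value);
    }
}

final class WeightedSumSketch {
    /** Weighted sum of pair-wise component dissimilarities; components are matched by position. */
    static float dissimilarity(List<Component> a, List<Component> b, float[] weights) {
        if (a.size() != b.size() || a.size() != weights.length) {
            throw new IllegalArgumentException("component and weight counts must match");
        }
        float sum = 0f;
        for (int i = 0; i < a.size(); i++) {
            sum += weights[i] * a.get(i).dissimilarity(b.get(i));
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Component> x = Arrays.asList(new ScalarComponent(1f), new ScalarComponent(4f));
        List<Component> y = Arrays.asList(new ScalarComponent(3f), new ScalarComponent(5f));
        // 0.5 * |1 - 3| + 2 * |4 - 5| = 3.0
        System.out.println(dissimilarity(x, y, new float[] {0.5f, 2f}));
    }
}
```

Matching components by position is why the order of components matters when two sets are compared.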
MDSet instances are associated with (and calculated from) molecular structures. This connection between the original Molecule and its MDSet objects is preserved by the unique identifier of the molecule, which is also stored in the MDSet object.
Besides MolecularDescriptor components, an MDSet object can take an arbitrary number of external, user-defined float values. Typically, these are calculated by third-party software and stored in SDfile tags or database columns. These values are used in dissimilarity calculations but they are never modified.
Remark: the term Set is slightly misleading since components constituting the MDSet are ordered. Tuple or Record would be more appropriate though probably quite unusual in a cheminformatics context.
Since:
JChem 2.0
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    float
    dissim
    Dissimilarity measured against another set.
  • Constructor Summary

    Constructors
    Constructor
    Description
    MDSet()
    Creates an empty MDSet object.
    MDSet(int nComponents)
    Creates an empty MDSet object capable of storing a given number of MolecularDescriptor components.
    MDSet(int nComponents, int nUserData)
    Creates an empty MDSet object capable of storing a given number of MolecularDescriptor components and the given number of user defined (external) data.
    MDSet(MDSet c)
    Copy constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    addDescriptor(MolecularDescriptor descriptor)
    Appends the next component to the MDSet object.
    Object
    clone()
    Clones the object.
    void
    generate(Molecule mol)
    Generates the MDSet from the given molecular structure.
    MolecularDescriptor
    getDescriptor(int index)
    Gets a specified component of the MDSet.
    float
    getDissimilarity(MDSet other)
    Calculates the dissimilarity between two MDSet objects.
    int
    getId()
    Gets the identifier of the MDSet.
    float
    getLowerBound(Object o)
    Gives a lower bound estimation for the value of getDissimilarity( final Object o ).
    String
    getNaturalId()
    Gets the natural identifier of the source Molecule of the MDSet.
    MDSetParameters
    getParameters()
    Gets the current parameter settings.
    float[]
    getUserData()
    Deprecated, for removal: This API element is subject to removal in a future version.
    since 2.3
    float
    getUserData(int index)
    Deprecated, for removal: This API element is subject to removal in a future version.
    since 2.3
    static MDSet
    newInstance(String[] componentTypes)
    Gets a new MDSet instance constituted of the specified components.
    static MDSet
    newInstance(String[] componentTypes, File[] params)
    Gets a new MDSet instance constituted of the specified components.
    static MDSet
    newInstance(String[] componentTypes, String[] params)
    Gets a new MDSet instance constituted of the specified components.
    void
    setDescriptor(int componentIndex, MolecularDescriptor md)
    Sets a given component of the MDSet.
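    void
    setDescriptors(MolecularDescriptor[] descriptors)
    Sets all components of the MDSet.
    void
    setId(int id)
    Sets the unique internal identifier of the MDSet object.
    void
    setNaturalId(String id)
    Sets the natural identifier of the MDSet object.
    void
    setParameters(MDSetParameters params)
    Sets the parameters of the MDSet.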
    void
    setSize(int nComponents)
    Sets the number of MolecularDescriptor components in the MDSet.
    void
    setSize(int nComponents, int nUserData)
    Sets the number of MolecularDescriptor components and the number of user defined (external) data in the MDSet.
    void
    setUserData(float[] userData)
    Deprecated, for removal: This API element is subject to removal in a future version.
    since 2.3
    void
    setUserData(int dataIndex, float userData)
    Deprecated, for removal: This API element is subject to removal in a future version.
    since 2.3
    int
    size()
    Gets the number of components constituting the MDSet.

    Methods inherited from class java.lang.Object

    equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • dissim

      public float dissim
      Dissimilarity measured against another set.
  • Constructor Details

    • MDSet

      public MDSet()
      Creates an empty MDSet object. It can be initialized by calling setSize( int nComponents ) and setParameters( final MDSetParameters params ).
    • MDSet

      public MDSet(MDSet c)
      Copy constructor. Creates an identical object, in which components are cloned, but parameters are not cloned.
      Parameters:
      c - an MDSet object to be copied
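The copy semantics described above (components cloned, parameters not cloned) can be illustrated with a small sketch. It reads "not cloned" as "the copy shares the same parameter object", which is an assumption. `ToyDescriptor`, `ToyParameters` and `ToySet` are hypothetical stand-ins, not the ChemAxon classes.

```java
/** Hypothetical stand-in for a descriptor component. */
final class ToyDescriptor {
    final float[] values;

    ToyDescriptor(float... values) { this.values = values; }

    /** Returns an independent copy with its own value array. */
    ToyDescriptor copy() { return new ToyDescriptor(values.clone()); }
}

/** Hypothetical stand-in for the parameter object holding the weights. */
final class ToyParameters {
    final float[] weights;

    ToyParameters(float... weights) { this.weights = weights; }
}

final class ToySet {
    final ToyDescriptor[] components;
    final ToyParameters params;

    ToySet(ToyDescriptor[] components, ToyParameters params) {
        this.components = components;
        this.params = params;
    }

    /** Copy constructor: components are copied one by one, the parameter object is shared. */
    ToySet(ToySet other) {
        this.components = new ToyDescriptor[other.components.length];
        for (int i = 0; i < components.length; i++) {
            this.components[i] = other.components[i].copy();
        }
        this.params = other.params; // same reference (assumed meaning of "not cloned")
    }

    public static void main(String[] args) {
        ToySet a = new ToySet(new ToyDescriptor[] {new ToyDescriptor(1f, 2f)}, new ToyParameters(1f));
        ToySet b = new ToySet(a);
        b.components[0].values[0] = 9f; // does not touch a's component
        b.params.weights[0] = 5f;       // visible through a as well
        System.out.println(a.components[0].values[0] + " " + a.params.weights[0]);
    }
}
```

Under this reading, changing the weights through one set also changes them for every copy made from it, while descriptor data stays independent.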
    • MDSet

      public MDSet(int nComponents)
      Creates an empty MDSet object capable of storing a given number of MolecularDescriptor components. Components should be added by setDescriptor( final MolecularDescriptor descriptor ).
      Parameters:
      nComponents - number of components in the MDSet object
    • MDSet

      public MDSet(int nComponents, int nUserData)
      Creates an empty MDSet object capable of storing a given number of MolecularDescriptor components and the given number of user defined (external) data. Components should be added by setDescriptor( final MolecularDescriptor descriptor ).
      Parameters:
      nComponents - number of components in the MDSet object
      nUserData - number of further floating point values
  • Method Details

    • newInstance

      public static MDSet newInstance(String[] componentTypes)
      Gets a new MDSet instance constituted of the specified components. MDSetParameters are set to default.
      Parameters:
      componentTypes - type names of the components
      Returns:
      a new object
    • newInstance

      public static MDSet newInstance(String[] componentTypes, String[] params)
      Gets a new MDSet instance constituted of the specified components. Components are parametrized with the given parameter settings.
      Parameters:
      componentTypes - type names of the components
      params - parameter strings
      Returns:
      a new object; or null, if the required class could not be instantiated
    • newInstance

      public static MDSet newInstance(String[] componentTypes, File[] params)
      Gets a new MDSet instance constituted of the specified components. Components are parametrized from the given parameter files.
      Parameters:
      componentTypes - type names of the components
      params - parameter files
      Returns:
      a new object; or null, if the required class could not be instantiated
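Because both parametrized newInstance variants may return null rather than throw, callers should check the result before using it. The sketch below shows that calling pattern with a hypothetical stand-in factory (`createOrNull`, `createAllOrNull`) that instantiates classes by name through reflection. How newInstance itself resolves component type names is not stated here, so the reflective lookup is only an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class NullCheckSketch {
    /** Hypothetical factory: returns null instead of throwing when instantiation fails. */
    static Object createOrNull(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null; // class missing, no accessible no-arg constructor, or constructor failed
        }
    }

    /** Instantiates every name in order, or returns null if any of them fails. */
    static List<Object> createAllOrNull(String[] classNames) {
        List<Object> result = new ArrayList<>();
        for (String name : classNames) {
            Object instance = createOrNull(name);
            if (instance == null) {
                return null;
            }
            result.add(instance);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> components =
                createAllOrNull(new String[] {"java.util.ArrayList", "no.such.Type"});
        if (components == null) {
            // Handle the failure explicitly instead of running into a NullPointerException later.
            System.out.println("could not create all components");
        }
    }
}
```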
    • clone

      public Object clone()
      Clones the object.
      Overrides:
      clone in class Object
      Returns:
      a new, identical MDSet instance
    • setSize

      public void setSize(int nComponents, int nUserData)
      Sets the number of MolecularDescriptor components and the number of user defined (external) data in the MDSet.
      Parameters:
      nComponents - number of components in the MDSet object
      nUserData - number of further floating point values
    • setSize

      public void setSize(int nComponents)
      Sets the number of MolecularDescriptor components in the MDSet.
      Parameters:
      nComponents - number of components in the MDSet object
    • setId

      public void setId(int id)
      Sets the unique internal identifier of the MDSet object.
      Parameters:
      id - unique identifier
    • getId

      public int getId()
      Gets the identifier of the MDSet.
      Returns:
      the identifier
    • setNaturalId

      public void setNaturalId(String id)
      Sets the natural identifier of the MDSet object. This identifier is taken from a Molecule (from an SDfile tag).
      Parameters:
      id - unique identifier
    • getNaturalId

      public String getNaturalId()
      Gets the natural identifier of the source Molecule of the MDSet.
      Returns:
      the identifier
    • setParameters

      public void setParameters(MDSetParameters params)
      Sets the parameters of the MDSet. Note that this has no effect on the parameters of individual MolecularDescriptor components in the MDSet.
      Parameters:
      params - new parameters for this MDSet.
    • getParameters

      public MDSetParameters getParameters()
      Gets the current parameter settings.
      Returns:
      the parameters of the MDSet
    • addDescriptor

      public void addDescriptor(MolecularDescriptor descriptor)
      Appends the next component to the MDSet object.
      Parameters:
      descriptor - the next component of the MDSet
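Since the components of an MDSet are ordered, "appending the next component" can be pictured as filling a fixed number of slots from left to right. The sketch below is a hypothetical model of that idea, not the library's code: `OrderedSlotsSketch` stores strings instead of descriptors, and rejecting an append once all slots are filled is an assumed behaviour that this page does not specify.

```java
/** Hypothetical model of a fixed-size, ordered component container. */
final class OrderedSlotsSketch {
    private final String[] slots; // stand-in for the descriptor components
    private int next = 0;         // position the next append goes to

    OrderedSlotsSketch(int nComponents) {
        slots = new String[nComponents];
    }

    /** Appends to the first unused position (assumption: a full container rejects the call). */
    void add(String component) {
        if (next >= slots.length) {
            throw new IllegalStateException("all slots are filled");
        }
        slots[next++] = component;
    }

    /** Overwrites the position chosen by the caller. */
    void set(int index, String component) {
        slots[index] = component;
    }

    String get(int index) { return slots[index]; }

    int size() { return slots.length; }

    public static void main(String[] args) {
        OrderedSlotsSketch set = new OrderedSlotsSketch(2);
        set.add("first");
        set.add("second");
        System.out.println(set.get(0) + ", " + set.get(1));
    }
}
```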
    • setDescriptors

      public void setDescriptors(MolecularDescriptor[] descriptors)
      Sets all components of the MDSet.
      Parameters:
      descriptors - MDSet components, they are not cloned
    • setDescriptor

      public void setDescriptor(int componentIndex, MolecularDescriptor md)
      Sets a given component of the MDSet.
      Parameters:
      componentIndex - index of the component to be set
      md - the MolecularDescriptor type of the specified component
    • size

      public int size()
      Gets the number of components constituting the MDSet.
      Returns:
      number of components
    • getDescriptor

      public MolecularDescriptor getDescriptor(int index)
      Gets a specified component of the MDSet.
      Parameters:
      index - component index
      Returns:
      the selected component
    • generate

      public void generate(Molecule mol) throws MDGeneratorException
      Generates the MDSet from the given molecular structure.
      Parameters:
      mol - the molecule to generate from.
      Throws:
      MDGeneratorException - when failed to generate one of the components
    • getDissimilarity

      public float getDissimilarity(MDSet other)
      Calculates the dissimilarity between two MDSet objects. The dissimilarity value is the weighted sum of the component-wise dissimilarity values.
      Parameters:
      other - an MDSet object to which this one is compared. Its type is Object in order to implement the Clusterable interface.
      Returns:
      the dissimilarity coefficient calculated
    • getLowerBound

      public float getLowerBound(Object o)
      Gives a lower bound estimation for the value of getDissimilarity( final Object o ). This method is implemented to satisfy the requirements of the Clusterable interface.
      Parameters:
      o - the MDSet object to which this one is compared. Its type is Object in order to implement the Clusterable interface.
      Returns:
      the lower bound estimation of the dissimilarity coefficient
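A lower bound is typically useful for pruning: if even the cheap lower bound is no better than the best dissimilarity found so far, the exact value need not be computed. The sketch below illustrates that general technique with hypothetical stand-ins (`Point`, `Result`, `nearest`). It says nothing about how MDSet computes its bound or how ChemAxon's clustering code uses it.

```java
final class LowerBoundSketch {
    /** Hypothetical item: a point in the plane. */
    static final class Point {
        final float x;
        final float y;

        Point(float x, float y) { this.x = x; this.y = y; }

        /** Exact (more expensive) dissimilarity: Euclidean distance. */
        float dissimilarity(Point o) {
            return (float) Math.hypot(x - o.x, y - o.y);
        }

        /** Cheap lower bound: the distance along x never exceeds the Euclidean distance. */
        float lowerBound(Point o) {
            return Math.abs(x - o.x);
        }
    }

    /** Index of the nearest candidate and the number of exact evaluations it took. */
    static final class Result {
        final int index;
        final int exactEvaluations;

        Result(int index, int exactEvaluations) {
            this.index = index;
            this.exactEvaluations = exactEvaluations;
        }
    }

    static Result nearest(Point query, Point[] candidates) {
        int best = -1;
        int evaluations = 0;
        float bestDissim = Float.POSITIVE_INFINITY;
        for (int i = 0; i < candidates.length; i++) {
            if (query.lowerBound(candidates[i]) >= bestDissim) {
                continue; // cannot beat the current best, skip the exact computation
            }
            evaluations++;
            float d = query.dissimilarity(candidates[i]);
            if (d < bestDissim) {
                bestDissim = d;
                best = i;
            }
        }
        return new Result(best, evaluations);
    }

    public static void main(String[] args) {
        Point[] candidates = {new Point(1f, 0f), new Point(5f, 0f), new Point(0.5f, 0.5f)};
        Result r = nearest(new Point(0f, 0f), candidates);
        System.out.println("nearest index: " + r.index + ", exact evaluations: " + r.exactEvaluations);
    }
}
```

In this example the second candidate is skipped because its lower bound (5) already exceeds the best exact value found so far (1), so only two of the three exact dissimilarities are evaluated.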
    • setUserData

      @Deprecated(forRemoval=true) @SubjectToRemoval(date=JUL_01_2025) public void setUserData(float[] userData)
      Deprecated, for removal: This API element is subject to removal in a future version.
      since 2.3
      Sets all user defined float values in the MDSet.
      Parameters:
      userData - user defined floating point data values
    • setUserData

      @Deprecated(forRemoval=true) @SubjectToRemoval(date=JUL_01_2025) public void setUserData(int dataIndex, float userData)
      Deprecated, for removal: This API element is subject to removal in a future version.
      since 2.3
      Sets a given user defined float value in the MDSet.
      Parameters:
      dataIndex - index of the data value to be set
      userData - user defined floating point data value
    • getUserData

      @Deprecated(forRemoval=true) @SubjectToRemoval(date=JUL_01_2025) public float getUserData(int index)
      Deprecated, for removal: This API element is subject to removal in a future version.
      since 2.3
      Gets the value of a user defined data component.
      Parameters:
      index - data component index
      Returns:
      value of user defined data component
    • getUserData

      @Deprecated(forRemoval=true) @SubjectToRemoval(date=JUL_01_2025) public float[] getUserData()
      Deprecated, for removal: This API element is subject to removal in a future version.
      since 2.3
      Gets the value of all user defined data components.
      Returns:
      array of values of user defined data components