Class MolecularDescriptor

java.lang.Object
chemaxon.descriptors.MolecularDescriptor
All Implemented Interfaces:
Cloneable
Direct Known Subclasses:
BCUT, ChemicalFingerprint, CustomDescriptor, ECFP, PharmacophoreFingerprint, ReactionFingerprint, ScalarDescriptor, ShapeDescriptor

@PublicAPI public abstract class MolecularDescriptor extends Object implements Cloneable
Generic definition of molecular descriptors. The MolecularDescriptor class models all kinds of structural keys, fingerprints (hashed, pharmacophoric), MDL keys and many others that can be implemented in derived classes; some of these are implemented in JChem. For the sake of generality, the MolecularDescriptor class does not introduce operations that manipulate descriptors at the "atomic" level: cells (or bins) of descriptors cannot be accessed for either reading or writing. This is because cells in different chemical descriptors can have different types (for example bit, integer or floating point value).
Operations between different MolecularDescriptor derivatives are not supported, though for the sake of efficiency no extra type checking is introduced (other than provided by the language itself).
Derived classes should define their own dissimilarity metrics for the sake of efficiency. This is one reason why this class does not define any dissimilarity metric; the other is that different MolecularDescriptor subclasses may have different representations (for instance integer array vs. float array).
Since:
JChem 2.0
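The design described above (no cell-level access in the base class, one metric per representation, no extra type checking) can be pictured with the minimal sketch below. All class names in it are hypothetical stand-ins written for this illustration; they are not the ChemAxon classes, and the two metrics (Tanimoto and Euclidean) are assumptions chosen only because they suit bit and float cells respectively.

```java
import java.util.BitSet;

/** Hypothetical stand-ins for illustration; these are NOT the ChemAxon classes. */
public class DescriptorSketch {

    /** Base type: no cell-level access and no metric of its own. */
    public abstract static class Descriptor {
        public abstract float getDissimilarity(Descriptor other);
    }

    /** Bit-valued cells, compared with Tanimoto dissimilarity. */
    public static final class BitDescriptor extends Descriptor {
        private final BitSet bits = new BitSet();

        public BitDescriptor(int... setBits) {
            for (int b : setBits) {
                bits.set(b);
            }
        }

        @Override
        public float getDissimilarity(Descriptor other) {
            // Plain cast and no further type check: passing another subclass
            // fails with ClassCastException, the check the language provides.
            BitSet o = ((BitDescriptor) other).bits;
            BitSet common = (BitSet) bits.clone();
            common.and(o);
            BitSet union = (BitSet) bits.clone();
            union.or(o);
            int u = union.cardinality();
            return u == 0 ? 0f : 1f - (float) common.cardinality() / u;
        }
    }

    /** Float-valued cells, compared with Euclidean distance (equal lengths assumed). */
    public static final class FloatDescriptor extends Descriptor {
        private final float[] values;

        public FloatDescriptor(float... values) {
            this.values = values.clone();
        }

        @Override
        public float getDissimilarity(Descriptor other) {
            float[] o = ((FloatDescriptor) other).values;
            double sum = 0;
            for (int i = 0; i < values.length; i++) {
                double d = values[i] - o[i];
                sum += d * d;
            }
            return (float) Math.sqrt(sum);
        }
    }

    public static void main(String[] args) {
        Descriptor a = new BitDescriptor(1, 2, 3);
        Descriptor b = new BitDescriptor(2, 3, 4);
        System.out.println(a.getDissimilarity(b));
        Descriptor x = new FloatDescriptor(0f, 0f);
        Descriptor y = new FloatDescriptor(3f, 4f);
        System.out.println(x.getDissimilarity(y));
    }
}
```

Because each subclass works directly on its own representation, the metric needs no generic per-cell accessor, which is the efficiency argument made in the description.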
  • Field Details

    • params

      protected MDParameters params
      Parameter settings related to the descriptor. Instances of the MolecularDescriptor class with the same parameter settings share a common MDParameters object; however, MolecularDescriptor objects having different parameters are also allowed in the same application. MDParameters can be specialized through inheritance, so classes derived from MolecularDescriptor are recommended to derive their own MDParameters subclasses.
  • Constructor Details

    • MolecularDescriptor

      protected MolecularDescriptor(MDParameters parameters)
      Creates a new MolecularDescriptor with the given parameters.
      Parameters:
      parameters - parameter settings of the descriptor to be created
      Since:
      JChem 2.2
    • MolecularDescriptor

      protected MolecularDescriptor()
      Default constructor, creates an empty object.
    • MolecularDescriptor

      protected MolecularDescriptor(MolecularDescriptor c)
      Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
      Parameters:
      c - a MolecularDescriptor to be copied
  • Method Details

    • newInstance

      public static MolecularDescriptor newInstance(String descriptorTypeName, String parameters)
      Creates a MolecularDescriptor specified by its name and XML parameter configuration.
      Parameters:
      descriptorTypeName - predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
      parameters - XML parameter configuration
      Returns:
      a new instance of the appropriate MolecularDescriptor
    • newInstanceSupplier

      @Deprecated(forRemoval=true) @SubjectToRemoval(date=JUL_01_2025) public static final com.google.common.base.Supplier<MolecularDescriptor> newInstanceSupplier(String descriptorTypeName)
      Deprecated, for removal: This API element is subject to removal in a future version.
      This method will be removed. Call newInstance(String) instead in a lambda.
      Creates a Supplier of the specified molecular descriptor. See newInstance(String).
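      The migration suggested in the deprecation note amounts to wrapping the factory call in a lambda. The sketch below shows the pattern with java.util.function.Supplier. The nested Descriptor class is a hypothetical stand-in so that the example compiles without the library, and "SomeType" is a placeholder, not a known predefined type name.

```java
import java.util.function.Supplier;

public class SupplierMigrationSketch {

    /** Hypothetical stand-in for MolecularDescriptor, NOT the ChemAxon class. */
    public static class Descriptor {
        private final String typeName;

        private Descriptor(String typeName) {
            this.typeName = typeName;
        }

        /** Stand-in for the static factory newInstance(String). */
        public static Descriptor newInstance(String descriptorTypeName) {
            return new Descriptor(descriptorTypeName);
        }

        public String getTypeName() {
            return typeName;
        }
    }

    /** Replacement for a supplier-returning factory: wrap the call in a lambda. */
    public static Supplier<Descriptor> supplierOf(String descriptorTypeName) {
        return () -> Descriptor.newInstance(descriptorTypeName);
    }

    public static void main(String[] args) {
        Supplier<Descriptor> supplier = supplierOf("SomeType"); // placeholder name
        Descriptor first = supplier.get();
        Descriptor second = supplier.get();
        System.out.println(first != second); // each get() builds a new instance
    }
}
```

      With the real library on the classpath, the same shape would presumably read `() -> MolecularDescriptor.newInstance(name)`.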
    • newInstance

      public static MolecularDescriptor newInstance(String descriptorTypeName)
      Creates a MolecularDescriptor specified by its name. The descriptor is created with the default parameter settings.
      Parameters:
      descriptorTypeName - predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
      Returns:
      a new instance of the appropriate MolecularDescriptor
    • newInstanceFromXML

      public static MolecularDescriptor newInstanceFromXML(String parameters)
      Creates a new MolecularDescriptor instance according to the given parameter string. The parameter string is an XML configuration that specifies the type of the MolecularDescriptor as well as its parameter settings.
      Parameters:
      parameters - XML configuration string
      Returns:
      a new instance of the appropriate MolecularDescriptor
    • clone

      public abstract MolecularDescriptor clone()
      Creates a new instance with identical internal state.
      Overrides:
      clone in class Object
      Returns:
      the newly copied object
    • getName

      public String getName()
      Gets the name of the descriptor. The name is not the same as the class name; it is more readable and meaningful for end users.
      Returns:
      the descriptor's name depending on its actual type
    • getShortName

      public String getShortName()
      Gets the short name of the descriptor.
      Returns:
      the short name used in text outputs (tables etc.)
    • getParametersClassName

      public String getParametersClassName()
      Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
      Returns:
      the name of the parameters class
    • setParameters

      public void setParameters(MDParameters parameters)
      Sets the parameters for an already created MolecularDescriptor.
      Parameters:
      parameters - parameter settings to apply to the descriptor
    • setParameters

      public abstract void setParameters(String parameters) throws MDParametersException
      Sets the parameters for an already created MolecularDescriptor.
      Parameters:
      parameters - parameter settings of the descriptor. If it is null or empty, then the default settings will be used.
      Throws:
      MDParametersException
    • getParameters

      public MDParameters getParameters()
      Gets the parameters associated with the object.
      Returns:
      parameter settings
    • needsConfig

      public boolean needsConfig()
      Indicates whether the class takes parameters from a configuration file. Derived classes have to override this method appropriately.
      Returns:
      true; most descriptor classes must have a configuration (file)
      Since:
      JChem 2.2
    • setScreeningConfiguration

      public void setScreeningConfiguration(String config) throws MDParametersException
      Sets the screening configuration. Overwrites old parameters with the new ones; parameters not affected by the screening configuration remain unchanged.
      Parameters:
      config - screening configuration string
      Throws:
      MDParametersException
    • toData

      public abstract byte[] toData()
      Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
      Returns:
      binary representation of the descriptor
    • fromData

      public abstract void fromData(byte[] dbRepr)
      Builds a MolecularDescriptor object from its external (database) representation.
      Parameters:
      dbRepr - an array generated by toData()
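      The contract between toData() and fromData(byte[]) is a round trip: the array produced by one must rebuild an equivalent descriptor through the other. The sketch below illustrates this for a hypothetical bit-based stand-in class. The byte layout shown (that of java.util.BitSet) is an assumption made for the example; the layout used by actual JChem descriptors is not specified here.

```java
import java.util.BitSet;

/** Hypothetical bit-based stand-in, NOT a ChemAxon class. */
public class DataRoundTripSketch {
    private BitSet bits = new BitSet();

    public void setBit(int index) {
        bits.set(index);
    }

    public boolean getBit(int index) {
        return bits.get(index);
    }

    /** External form, suitable for a binary database column. */
    public byte[] toData() {
        return bits.toByteArray();
    }

    /** Rebuilds the internal state from an array produced by toData(). */
    public void fromData(byte[] dbRepr) {
        bits = BitSet.valueOf(dbRepr);
    }

    public static void main(String[] args) {
        DataRoundTripSketch original = new DataRoundTripSketch();
        original.setBit(3);
        original.setBit(17);

        DataRoundTripSketch restored = new DataRoundTripSketch();
        restored.fromData(original.toData());
        System.out.println(restored.getBit(3) && restored.getBit(17));
    }
}
```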
    • toString

      public abstract String toString()
      Creates the string representation of a MolecularDescriptor object. This string value is stored in SDfiles, though the use of this string is not limited to this purpose. Typically, this string is compact, for instance zero values are not necessarily printed.
      Overrides:
      toString in class Object
      Returns:
      a formatted string of the descriptor
    • toDecimalString

      public abstract String toDecimalString()
      Creates the string representation of a MolecularDescriptor object. This string value contains all values of the descriptor (including all zeros), values are separated by tabs.
      Returns:
      a formatted string of the descriptor
    • toBinaryString

      public String toBinaryString()
      Creates the binary string representation of a MolecularDescriptor object.
      Returns:
      a 0,1 string of the descriptor
      Since:
      JChem 2.3
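      To make the difference between the decimal and binary string forms concrete, the sketch below formats a small array of integer cells both ways. It is only an illustration: the helper class is hypothetical, and the rule that any non-zero cell becomes '1' in the binary form is an assumption, since the exact binarization is not described here.

```java
import java.util.StringJoiner;

/** Hypothetical helpers, NOT part of the ChemAxon API. */
public class StringFormsSketch {

    /** All cell values, zeros included, separated by tabs. */
    public static String toDecimalString(int[] cells) {
        StringJoiner joiner = new StringJoiner("\t");
        for (int cell : cells) {
            joiner.add(Integer.toString(cell));
        }
        return joiner.toString();
    }

    /** One character per cell; assumes a non-zero cell maps to '1'. */
    public static String toBinaryString(int[] cells) {
        StringBuilder sb = new StringBuilder(cells.length);
        for (int cell : cells) {
            sb.append(cell != 0 ? '1' : '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] cells = {0, 2, 0, 1};
        System.out.println(toDecimalString(cells));
        System.out.println(toBinaryString(cells)); // prints "0101"
    }
}
```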
    • fromString

      public abstract void fromString(String descr) throws ParseException
      Builds a molecular descriptor from its string representation. Typically used when an SDfile is read.
      Parameters:
      descr - descriptor string, previously generated by toString()
      Throws:
      ParseException
    • toFloatArray

      public abstract float[] toFloatArray()
      Creates the float array representation of a MolecularDescriptor object. This array contains all values of the descriptor (including all zeros) in the elements of the array.
      Returns:
      a formatted float array of the descriptor
      Since:
      JChem 2.0.1
    • fromFloatArray

      public abstract void fromFloatArray(float[] descr)
      Builds a molecular descriptor from its float array representation. Typically used when a hypothesis is created.
      Parameters:
      descr - descriptor represented in a float array (e.g. generated by toFloatArray())
      Since:
      JChem 2.0.1
    • generate

      public String[] generate(Molecule m) throws MDGeneratorException
      Creates the descriptor for the given Molecule.
      Returns:
      property names set in the molecule passed during generation
      Throws:
      MDGeneratorException - when failed to generate descriptor
    • generate

      public String[] generate(String molRepr) throws MDGeneratorException
      Creates the descriptor for the molecule given by its string representation.
      Parameters:
      molRepr - String representation of the molecule, e.g. SMILES
      Returns:
      property names set in the molecule passed during generation
      Throws:
      MDGeneratorException - when failed to generate descriptor
    • getAtomSetColors

      public Color[] getAtomSetColors()
      Determines the coloring of atoms. This coloring does not reflect element types, instead other categories related to the specific descriptor are considered. Therefore, whenever the parameters are changed, it is advisable to call getAtomSetColors() to obtain the current coloring scheme. Typically, the coloring scheme is defined in the MDParameters object associated with the MolecularDescriptor object.
      Returns:
      array of colors of different atom categories
    • getAtomSetNames

      public String[] getAtomSetNames()
    • getAtomSetIndexes

      public int[] getAtomSetIndexes(Molecule m)
      Gets the individual atom color indexes. This allows color mapping and visualization of various properties encoded into the MolecularDescriptor. Prior to this method, getAtomSetColors() has to be called to obtain a color array. This method returns a per-atom color index array, and the returned indexes refer to the color array returned by getAtomSetColors().
      Parameters:
      m - a molecule to assign atom colors to
      Returns:
      array of color indexes (indexed by atom indexes)
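      The relationship between the two arrays is a simple indirection: the per-atom index array selects entries of the color array. The sketch below resolves one color per atom. The helper class and the sample values are hypothetical, and it assumes every index is a valid position in the color array.

```java
import java.awt.Color;

/** Hypothetical helper, NOT part of the ChemAxon API. */
public class AtomColoringSketch {

    /**
     * Resolves one color per atom.
     *
     * @param setColors      colors of the atom categories (as from getAtomSetColors())
     * @param atomSetIndexes per-atom category indexes (as from getAtomSetIndexes(m))
     */
    public static Color[] colorsPerAtom(Color[] setColors, int[] atomSetIndexes) {
        Color[] result = new Color[atomSetIndexes.length];
        for (int atom = 0; atom < atomSetIndexes.length; atom++) {
            result[atom] = setColors[atomSetIndexes[atom]];
        }
        return result;
    }

    public static void main(String[] args) {
        Color[] setColors = {Color.RED, Color.BLUE, Color.GRAY}; // sample categories
        int[] atomSetIndexes = {2, 0, 0, 1};                     // sample four-atom molecule
        for (Color c : colorsPerAtom(setColors, atomSetIndexes)) {
            System.out.println(c);
        }
    }
}
```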
    • getDissimilarityMetrics

      public abstract String[] getDissimilarityMetrics()
      Gets the dissimilarity metric names in an array.
      This method must be overridden by derived classes so that the returned metrics array depends on the dynamic type. (This is needed because the metrics[] array is a class variable, but class variables are shared among all derived classes.)
      Returns:
      the metrics array
    • getDefaultDissimilarityMetricThresholds

      public abstract float[] getDefaultDissimilarityMetricThresholds()
      Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
      Returns:
      array of dissimilarity threshold values
    • getDissimilarityMetricIndex

      public int getDissimilarityMetricIndex(String metricName) throws IllegalArgumentException
      Gets the internal index of the given metric.
      Parameters:
      metricName - name of a metric
      Returns:
      index of the specified metric
      Throws:
      IllegalArgumentException
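      A name-to-index lookup of this kind can be sketched as a linear search over the names returned by getDissimilarityMetrics(). The helper below is hypothetical, the metric names are sample values, and exact, case-sensitive matching is an assumption; the library may compare names differently.

```java
/** Hypothetical helper, NOT part of the ChemAxon API. */
public class MetricIndexSketch {

    /**
     * Returns the position of metricName in metrics.
     *
     * @throws IllegalArgumentException if the name is not in the array
     */
    public static int metricIndex(String[] metrics, String metricName) {
        for (int i = 0; i < metrics.length; i++) {
            if (metrics[i].equals(metricName)) { // assumed: exact, case-sensitive match
                return i;
            }
        }
        throw new IllegalArgumentException("Unknown metric: " + metricName);
    }

    public static void main(String[] args) {
        String[] metrics = {"Tanimoto", "Euclidean"}; // sample names
        System.out.println(metricIndex(metrics, "Euclidean")); // prints 1
    }
}
```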
    • getNumberOfWeights

      public int getNumberOfWeights(String dissimilarityMetricName) throws IllegalArgumentException
      Gets the number of weight factors used by the specified metric. This method can be applied to the dissimilarity metrics provided by the MolecularDescriptor class or its derived classes, but not to parameterized metrics.
      Parameters:
      dissimilarityMetricName - name as returned by getDissimilarityMetrics()
      Returns:
      number of weights the metric uses
      Throws:
      IllegalArgumentException - if the given parameter is not a valid metric name
    • getNumberOfMetrics

      public int getNumberOfMetrics()
      Gets the number of parameterized metrics available for the particular descriptor.
      Returns:
      number of metrics implemented in this class
    • getMetricIndex

      public int getMetricIndex(String metricName) throws IllegalArgumentException
      Gets the index of the given parameterized metric.
      Parameters:
      metricName - name of a metric
      Returns:
      index of the specified metric
      Throws:
      IllegalArgumentException - when given metric name is not valid
    • getDefaultThreshold

      public float getDefaultThreshold(int metricIndex)
      Gets a metric-dependent default threshold value. The actual value of this hard-wired parameter is not important, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
      Parameters:
      metricIndex - index of a parameterized metric
    • getThreshold

      public float getThreshold(int metricIndex)
      Gets a metric-dependent default threshold value. Ideally, this value should be based on statistics, though the actual value is not too critical, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
      Parameters:
      metricIndex - index of a parameterized metric
    • getThreshold

      public float getThreshold()
      Gets threshold value of the current parameterized metric.
      Returns:
      threshold value
    • getMetricName

      public String getMetricName()
      Gets the name of the current parameterized metric.
      Returns:
      name of the current metric
    • getMetricName

      public String getMetricName(int metricIndex)
      Gets the name of the parameterized metric specified by its index. Note: this method is kept for backward compatibility.
      Parameters:
      metricIndex - metric index
      Returns:
      name of the given metric
    • getDefaultMetricIndex

      public int getDefaultMetricIndex()
      Gets the index of the default metric. The default metric is the available metric most commonly used for the given MolecularDescriptor.
      Returns:
      metric index of the default metric
    • getDissimilarity

      public abstract float getDissimilarity(MolecularDescriptor other)
      Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric. The default metric is set in the corresponding MDParameters object. In the case of asymmetric distances, swapping the two descriptors can make a big difference.
      Parameters:
      other - a descriptor, to which the dissimilarity ratio is measured
      Returns:
      dissimilarity ratio
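      The remark about asymmetric distances can be illustrated with a Tversky-style dissimilarity on bit sets. This is a sketch under stated assumptions: the helper class is hypothetical, and Tversky is used only as a well-known example of an asymmetric measure, not as a statement about which metrics a given descriptor provides. With alpha = 1 and beta = 0, bits present only in the first argument are penalized while bits present only in the second are ignored.

```java
import java.util.BitSet;

/** Hypothetical helper, NOT part of the ChemAxon API. */
public class AsymmetricDissimilaritySketch {

    /** Builds a BitSet with the given bit positions set. */
    public static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    /** 1 - |A&B| / (|A&B| + alpha*|A\B| + beta*|B\A|); 0 when both sets are empty. */
    public static float tversky(BitSet a, BitSet b, float alpha, float beta) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet onlyA = (BitSet) a.clone();
        onlyA.andNot(b);
        BitSet onlyB = (BitSet) b.clone();
        onlyB.andNot(a);
        float c = common.cardinality();
        float denominator = c + alpha * onlyA.cardinality() + beta * onlyB.cardinality();
        return denominator == 0f ? 0f : 1f - c / denominator;
    }

    public static void main(String[] args) {
        BitSet large = bits(1, 2, 3, 4);
        BitSet small = bits(1, 2);
        System.out.println(tversky(large, small, 1f, 0f)); // prints 0.5
        System.out.println(tversky(small, large, 1f, 0f)); // prints 0.0
    }
}
```

      The two calls differ only in argument order, yet one reports the descriptors as half dissimilar and the other as identical, which is why the order of the receiver and the argument matters for asymmetric metrics.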
    • getDissimilarity

      public abstract float getDissimilarity(MolecularDescriptor other, int parametrizedMetricIndex)
      Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; apart from that, it is the same as getDissimilarity(MolecularDescriptor other).
      Parameters:
      other - a descriptor, to which the dissimilarity ratio is measured
      parametrizedMetricIndex - the index of the parameterized metric to use
      Returns:
      dissimilarity ratio
    • getLowerBound

      public float getLowerBound(MolecularDescriptor other)
      Calculates an estimate for the minimum value of the distance using the default distance metric.
    • main

      @SubjectToRemoval(date=JAN_01_2025) @Deprecated(forRemoval=true) public static void main(String[] args)
      Deprecated, for removal: This API element is subject to removal in a future version.
      Will be removed, no replacement.