Package chemaxon.descriptors
Class MolecularDescriptor
- java.lang.Object
-
- chemaxon.descriptors.MolecularDescriptor
-
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
BCUT, ChemicalFingerprint, CustomDescriptor, ECFP, PharmacophoreFingerprint, ReactionFingerprint, ScalarDescriptor, ShapeDescriptor
@PublicAPI public abstract class MolecularDescriptor extends Object implements Cloneable
Generic definition of molecular descriptors. The MolecularDescriptor class models all kinds of structural keys, fingerprints (hashed, pharmacophoric), MDL keys and many others that can be implemented in derived classes; some of these are implemented in JChem. For the sake of generality, the MolecularDescriptor class does not introduce operations that manipulate descriptors at the "atomic" level; that is, cells (or bins) of descriptors cannot be accessed either for reading or for writing. This is because cells in various chemical descriptors can have different types (for example bit, integer or floating point value).
Operations between different MolecularDescriptor derivatives are not supported, though for the sake of efficiency no extra type checking is introduced (other than that provided by the language itself).
Derived classes should define their own dissimilarity metrics for the sake of efficiency. This is one reason why this class does not define any dissimilarity metric; the other is that different MolecularDescriptor subclasses may have different representations (for instance integer array vs. float array).
- Since:
- JChem 2.0
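The design described in the class description can be pictured with a small self-contained sketch. ToyDescriptor, ToyBitDescriptor and ToyFloatDescriptor are hypothetical stand-ins, not part of chemaxon.descriptors. Assuming one bit-valued and one float-valued cell type, they only illustrate why each derived class supplies its own metric and why mixing different derivatives is left unchecked.

```java
import java.util.BitSet;

/** Hypothetical stand-in for an abstract descriptor base class; not the ChemAxon API. */
abstract class ToyDescriptor {
    /** Each subclass supplies a metric suited to its own cell type. */
    abstract float dissimilarity(ToyDescriptor other);
}

/** Bit-valued cells: Tanimoto dissimilarity computed on the set bits. */
final class ToyBitDescriptor extends ToyDescriptor {
    private final BitSet bits;

    ToyBitDescriptor(BitSet bits) {
        this.bits = (BitSet) bits.clone();
    }

    @Override
    float dissimilarity(ToyDescriptor other) {
        // No type check beyond the cast: mixing subclasses fails at run time.
        BitSet o = ((ToyBitDescriptor) other).bits;
        BitSet and = (BitSet) bits.clone();
        and.and(o);
        BitSet or = (BitSet) bits.clone();
        or.or(o);
        int union = or.cardinality();
        return union == 0 ? 0f : 1f - (float) and.cardinality() / union;
    }
}

/** Float-valued cells: Euclidean distance; both arrays are assumed to have equal length. */
final class ToyFloatDescriptor extends ToyDescriptor {
    private final float[] values;

    ToyFloatDescriptor(float[] values) {
        this.values = values.clone();
    }

    @Override
    float dissimilarity(ToyDescriptor other) {
        float[] o = ((ToyFloatDescriptor) other).values;
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            double d = values[i] - o[i];
            sum += d * d;
        }
        return (float) Math.sqrt(sum);
    }
}
```

Because the two toy subclasses store bits and floats respectively, a single metric in the base class would have to convert one representation into the other, which is the cost the class description avoids.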
-
-
Field Summary
Modifier and Type | Field | Description
protected MDParameters | params | Parameter settings related to the descriptor.
-
Constructor Summary
Constructor | Description
MolecularDescriptor() | Default constructor, creates an empty object.
MolecularDescriptor(MDParameters parameters) | Creates a new MolecularDescriptor with the given parameters.
MolecularDescriptor(MolecularDescriptor c) | Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
-
Method Summary
Modifier and Type | Method | Description
abstract MolecularDescriptor | clone() | Creates a new instance with identical internal state.
abstract void | fromData(byte[] dbRepr) | Builds a MolecularDescriptor object from its external (database) representation.
abstract void | fromFloatArray(float[] descr) | Builds a molecular descriptor from its float array representation.
abstract void | fromString(String descr) | Builds a molecular descriptor from its string representation.
String[] | generate(Molecule m) | Creates the descriptor for the given Molecule.
String[] | generate(String molRepr) | Creates the descriptor for the given Molecule.
Color[] | getAtomSetColors() | Determines the coloring of atoms.
int[] | getAtomSetIndexes(Molecule m) | Gets the individual atom color indexes.
String[] | getAtomSetNames() |
abstract float[] | getDefaultDissimilarityMetricThresholds() | Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
int | getDefaultMetricIndex() | Gets the index of the default metric.
float | getDefaultThreshold(int metricIndex) | Gets a metric-dependent default threshold value.
abstract float | getDissimilarity(MolecularDescriptor other) | Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric.
abstract float | getDissimilarity(MolecularDescriptor other, int parametrizedMetricIndex) | Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; apart from that it is the same as getDissimilarity(MolecularDescriptor other).
int | getDissimilarityMetricIndex(String metricName) | Gets the internal index of the given metric.
abstract String[] | getDissimilarityMetrics() | Gets the dissimilarity metric names in an array.
float | getLowerBound(MolecularDescriptor other) | Calculates an estimate for the minimum value of the distance using the default distance metric.
int | getMetricIndex(String metricName) | Gets the index of the given parameterized metric.
String | getMetricName() | Gets the name of the current parameterized metric.
String | getMetricName(int metricIndex) | Gets the name of a parameterized metric specified by its index.
String | getName() | Gets the name of the descriptor.
int | getNumberOfMetrics() | Gets the number of parameterized metrics available for the particular descriptor.
int | getNumberOfWeights(String dissimilarityMetricName) | Gets the number of weight factors used by the specified metric.
MDParameters | getParameters() | Gets the parameters associated with the object.
String | getParametersClassName() | Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
String | getShortName() | Gets the short name of the descriptor.
float | getThreshold() | Gets the threshold value of the current parameterized metric.
float | getThreshold(int metricIndex) | Gets a metric-dependent default threshold value.
static void | main(String[] args) |
boolean | needsConfig() | Indicates whether the class takes parameters from a configuration file.
static MolecularDescriptor | newInstance(String descriptorTypeName) | Creates a MolecularDescriptor specified by its name.
static MolecularDescriptor | newInstance(String descriptorTypeName, String parameters) | Creates a MolecularDescriptor specified by its name and XML parameter configuration.
static MolecularDescriptor | newInstanceFromXML(String parameters) | Creates a new MolecularDescriptor instance according to the given parameter string.
static com.google.common.base.Supplier<MolecularDescriptor> | newInstanceSupplier(String descriptorTypeName) | Creates a Supplier of the specified molecular descriptor.
void | setParameters(MDParameters parameters) | Sets the parameters for an already created MolecularDescriptor.
abstract void | setParameters(String parameters) | Sets the parameters for an already created MolecularDescriptor.
void | setScreeningConfiguration(String config) | Sets the screening configuration.
String | toBinaryString() | Creates the binary string representation of a MolecularDescriptor object.
abstract byte[] | toData() | Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
abstract String | toDecimalString() | Creates the string representation of a MolecularDescriptor object.
abstract float[] | toFloatArray() | Creates the float array representation of a MolecularDescriptor object.
abstract String | toString() | Creates the string representation of a MolecularDescriptor object.
-
-
-
Field Detail
-
params
protected MDParameters params
Parameter settings related to the descriptor. Instances of the MolecularDescriptor class with the same parameter settings share a common MDParameters object; however, MolecularDescriptor objects having different parameters are also allowed in the same application. MDParameters can be specialized through inheritance, so classes derived from MolecularDescriptor are recommended to derive their own MDParameters subclasses.
-
-
Constructor Detail
-
MolecularDescriptor
public MolecularDescriptor(MDParameters parameters)
Creates a new MolecularDescriptor with the given parameters.
- Parameters:
parameters - parameter settings of the descriptor to be created
- Since:
- JChem 2.2
-
MolecularDescriptor
public MolecularDescriptor()
Default constructor, creates an empty object.
-
MolecularDescriptor
public MolecularDescriptor(MolecularDescriptor c)
Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
- Parameters:
c - a MolecularDescriptor to be copied
-
-
Method Detail
-
newInstance
public static final MolecularDescriptor newInstance(String descriptorTypeName, String parameters)
Creates a MolecularDescriptor specified by its name and XML parameter configuration.
- Parameters:
descriptorTypeName - predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
parameters - XML parameter configuration
- Returns:
- a new instance of the appropriate MolecularDescriptor
-
newInstanceSupplier
public static final com.google.common.base.Supplier<MolecularDescriptor> newInstanceSupplier(String descriptorTypeName)
Creates a Supplier of the specified molecular descriptor. See newInstance(String).
-
newInstance
public static final MolecularDescriptor newInstance(String descriptorTypeName)
Creates a MolecularDescriptor specified by its name. The descriptor is created with the default parameter settings.
- Parameters:
descriptorTypeName - predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
- Returns:
- a new instance of the appropriate MolecularDescriptor
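The lookup rule for descriptorTypeName (predefined name first, class name otherwise) might work roughly as sketched here. This is a minimal illustration, not ChemAxon's implementation: ToyFactory and the short names "SB" and "AL" are invented, and plain JDK classes stand in for descriptor classes.

```java
import java.util.ArrayList;

/** Hypothetical sketch of name-based lookup with a class-name fallback. */
final class ToyFactory {
    private ToyFactory() { }

    static Object newInstance(String typeName) {
        switch (typeName) {
            // Assumed predefined short names, for illustration only.
            case "SB":
                return new StringBuilder();
            case "AL":
                return new ArrayList<String>();
            default:
                // Not a predefined name: treat it as a fully qualified class name.
                try {
                    return Class.forName(typeName).getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("Unknown type: " + typeName, e);
                }
        }
    }
}
```

In this sketch, ToyFactory.newInstance("SB") resolves through the predefined branch, while ToyFactory.newInstance("java.util.HashMap") falls through to reflection and requires a public no-argument constructor.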
-
newInstanceFromXML
public static final MolecularDescriptor newInstanceFromXML(String parameters)
Creates a new MolecularDescriptor instance according to the given parameter string. The parameter string is an XML configuration that specifies the type of the MolecularDescriptor as well as its parameter settings.
- Parameters:
parameters - XML configuration string
- Returns:
- a new instance of the appropriate MolecularDescriptor
-
clone
public abstract MolecularDescriptor clone()
Creates a new instance with identical internal state.
-
getName
public String getName()
Gets the name of the descriptor. The name is not the same as the class name; it is more readable and meaningful for end users.
- Returns:
- the descriptor's name depending on its actual type
-
getShortName
public String getShortName()
Gets the short name of the descriptor.
- Returns:
- the short name used in text outputs (tables etc.)
-
getParametersClassName
public String getParametersClassName()
Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
- Returns:
- the name of the parameters class
-
setParameters
public void setParameters(MDParameters parameters)
Sets the parameters for an already created MolecularDescriptor.
- Parameters:
parameters - parameter settings of the descriptor
-
setParameters
public abstract void setParameters(String parameters) throws MDParametersException
Sets the parameters for an already created MolecularDescriptor.
- Parameters:
parameters - parameter settings of the descriptor. If it is null or empty, the default settings are used.
- Throws:
MDParametersException
-
getParameters
public MDParameters getParameters()
Gets the parameters associated with the object.
- Returns:
- parameter settings
-
needsConfig
public boolean needsConfig()
Indicates whether the class takes parameters from a configuration file. Derived classes have to override this method appropriately.
- Returns:
- true; most descriptor classes must have a configuration (file)
- Since:
- JChem 2.2
-
setScreeningConfiguration
public void setScreeningConfiguration(String config) throws MDParametersException
Sets the screening configuration. Overwrites old parameters with the new ones; parameters not affected by the screening configuration remain unchanged.
- Parameters:
config - screening configuration string
- Throws:
MDParametersException
-
toData
public abstract byte[] toData()
Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
- Returns:
- binary representation of the descriptor
-
fromData
public abstract void fromData(byte[] dbRepr)
Builds a MolecularDescriptor object from its external (database) representation.
- Parameters:
dbRepr - an array generated by toData()
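The round trip between toData() and fromData(byte[]) can be pictured with a self-contained sketch. ToyStoredDescriptor is a hypothetical class, and the byte layout (a length prefix followed by the float cells) is an assumption made for illustration; the actual database format of any concrete descriptor is not specified here.

```java
import java.nio.ByteBuffer;

/** Hypothetical descriptor with float cells and an assumed binary layout. */
final class ToyStoredDescriptor {
    private float[] cells = new float[0];

    ToyStoredDescriptor() { }

    ToyStoredDescriptor(float[] cells) {
        this.cells = cells.clone();
    }

    /** Writes the cell count followed by every cell value. */
    byte[] toData() {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + cells.length * Float.BYTES);
        buf.putInt(cells.length);
        for (float c : cells) {
            buf.putFloat(c);
        }
        return buf.array();
    }

    /** Restores the state from an array produced by toData(). */
    void fromData(byte[] dbRepr) {
        ByteBuffer buf = ByteBuffer.wrap(dbRepr);
        float[] restored = new float[buf.getInt()];
        for (int i = 0; i < restored.length; i++) {
            restored[i] = buf.getFloat();
        }
        cells = restored;
    }

    float[] toFloatArray() {
        return cells.clone();
    }
}
```

A typical use would create an empty instance with the default constructor and then call fromData on bytes read back from storage, matching the "empty object" role of the default constructor described earlier.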
-
toString
public abstract String toString()
Creates the string representation of a MolecularDescriptor object. This string value is stored in SDfiles, though the use of this string is not limited to this purpose. Typically, this string is compact; for instance, zero values are not necessarily printed.
-
toDecimalString
public abstract String toDecimalString()
Creates the string representation of a MolecularDescriptor object. This string value contains all values of the descriptor (including all zeros); values are separated by tabs.
- Returns:
- a formatted string of the descriptor
-
toBinaryString
public String toBinaryString()
Creates the binary string representation of a MolecularDescriptor object.
- Returns:
- a 0,1 string of the descriptor
- Since:
- JChem 2.3
-
fromString
public abstract void fromString(String descr) throws ParseException
Builds a molecular descriptor from its string representation. Typically used when an SDfile is read.
- Parameters:
descr - descriptor string, previously generated by toString()
- Throws:
ParseException
-
toFloatArray
public abstract float[] toFloatArray()
Creates the float array representation of a MolecularDescriptor object. This array contains all values of the descriptor (including all zeros) in the elements of the array.
- Returns:
- a formatted float array of the descriptor
- Since:
- JChem 2.0.1
-
fromFloatArray
public abstract void fromFloatArray(float[] descr)
Builds a molecular descriptor from its float array representation. Typically used when a hypothesis is created.
- Parameters:
descr - descriptor represented in a float array (e.g. generated by toFloatArray())
- Since:
- JChem 2.0.1
-
generate
public String[] generate(Molecule m) throws MDGeneratorException
Creates the descriptor for the given Molecule.
- Returns:
- property names set in the molecule passed during generation
- Throws:
MDGeneratorException
- if the descriptor cannot be generated
-
generate
public String[] generate(String molRepr) throws MDGeneratorException
Creates the descriptor for the given Molecule.
- Parameters:
molRepr - String representation of the molecule, e.g. SMILES
- Returns:
- property names set in the molecule passed during generation
- Throws:
MDGeneratorException
- if the descriptor cannot be generated
-
getAtomSetColors
public Color[] getAtomSetColors()
Determines the coloring of atoms. This coloring does not reflect element types; instead, other categories related to the specific descriptor are considered. Therefore, whenever the parameters are changed, it is advisable to call getAtomSetColors() to obtain the current coloring scheme. Typically, the coloring scheme is defined in the MDParameters object associated with the MolecularDescriptor object.
- Returns:
- array of colors of different atom categories
-
getAtomSetNames
public String[] getAtomSetNames()
-
getAtomSetIndexes
public int[] getAtomSetIndexes(Molecule m)
Gets the individual atom color indexes. This allows color mapping and visualization of various properties encoded into the MolecularDescriptor. Prior to this method, getAtomSetColors() has to be called to obtain a color array. This method returns a per-atom color index array, and the returned indexes refer to the color array returned by getAtomSetColors().
- Parameters:
m - a molecule to assign atom colors to
- Returns:
- array of color indexes (indexed by atom indexes)
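The relationship between the color array and the per-atom index array can be sketched as a simple lookup. ToyAtomColoring, its category names and its colors are invented for illustration, and packed RGB integers stand in for the Color objects that getAtomSetColors() returns.

```java
/** Hypothetical helper that resolves per-atom color indexes against a color array. */
final class ToyAtomColoring {
    // Assumed atom categories and packed RGB colors, for illustration only.
    static final String[] SET_NAMES = {"other", "donor", "acceptor"};
    static final int[] SET_COLORS = {0x808080, 0x0000FF, 0xFF0000};

    private ToyAtomColoring() { }

    /**
     * Maps each atom to a color: atomSetIndexes[i] is the position in
     * setColors of the color assigned to atom i.
     */
    static int[] colorsPerAtom(int[] setColors, int[] atomSetIndexes) {
        int[] perAtom = new int[atomSetIndexes.length];
        for (int atom = 0; atom < atomSetIndexes.length; atom++) {
            perAtom[atom] = setColors[atomSetIndexes[atom]];
        }
        return perAtom;
    }
}
```

Since the indexes are only meaningful relative to one particular color array, the array should be fetched again after any parameter change before the indexes are resolved.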
-
getDissimilarityMetrics
public abstract String[] getDissimilarityMetrics()
Gets the dissimilarity metric names in an array.
This method must be overridden by derived classes in order to get the metrics array depending on the dynamic type. (This is needed because the metrics[] array is a class variable, but class variables are shared among all derived classes.)
- Returns:
- the metrics array
-
getDefaultDissimilarityMetricThresholds
public abstract float[] getDefaultDissimilarityMetricThresholds()
Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
- Returns:
- array of dissimilarity threshold values
-
getDissimilarityMetricIndex
public int getDissimilarityMetricIndex(String metricName) throws IllegalArgumentException
Gets the internal index of the given metric.
- Parameters:
metricName - name of a metric
- Returns:
- index of the specified metric
- Throws:
IllegalArgumentException
-
getNumberOfWeights
public int getNumberOfWeights(String dissimilarityMetricName) throws IllegalArgumentException
Gets the number of weight factors used by the specified metric. This method can be applied to the dissimilarity metrics provided by the MolecularDescriptor class or its derived classes, but not to parameterized metrics.
- Parameters:
dissimilarityMetricName - name as returned by getDissimilarityMetrics()
- Returns:
- number of weights the metric uses
- Throws:
IllegalArgumentException
- if the given parameter is not a valid metric name
-
getNumberOfMetrics
public int getNumberOfMetrics()
Gets the number of parameterized metrics available for the particular descriptor.
- Returns:
- number of metrics implemented in this class
-
getMetricIndex
public int getMetricIndex(String metricName) throws IllegalArgumentException
Gets the index of the given parameterized metric.
- Parameters:
metricName - name of a metric
- Returns:
- index of the specified metric
- Throws:
IllegalArgumentException
- if the given metric name is not valid
-
getDefaultThreshold
public float getDefaultThreshold(int metricIndex)
Gets a metric-dependent default threshold value. The actual value of this hard-wired parameter is not important, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
- Parameters:
metricIndex - index of a parameterized metric
-
getThreshold
public float getThreshold(int metricIndex)
Gets a metric-dependent default threshold value. Ideally, this value should be based on statistics, though the actual value is not too critical, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
- Parameters:
metricIndex - index of a parameterized metric
-
getThreshold
public float getThreshold()
Gets the threshold value of the current parameterized metric.
- Returns:
- threshold value
-
getMetricName
public String getMetricName()
Gets the name of the current parameterized metric.
- Returns:
- name of the current metric
-
getMetricName
public String getMetricName(int metricIndex)
Gets the name of a parameterized metric specified by its index. Note: this method is kept for backward compatibility.
- Parameters:
metricIndex - metric index
- Returns:
- name of the given metric
-
getDefaultMetricIndex
public int getDefaultMetricIndex()
Gets the index of the default metric. The default metric is the one among the available metrics that is most commonly used for the given MolecularDescriptor.
- Returns:
- metric index of the default metric
-
getDissimilarity
public abstract float getDissimilarity(MolecularDescriptor other)
Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric. The default metric is set in the corresponding MDParameters object. In the case of asymmetric distances, swapping the two descriptors can make a big difference.
- Parameters:
other - a descriptor to which the dissimilarity ratio is measured
- Returns:
- dissimilarity ratio
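The remark about asymmetric distances can be made concrete with a Tversky-style dissimilarity on bit sets. This is a sketch under assumptions: ToyTversky is a hypothetical class, and the Tversky index is used only as a well-known example of an asymmetric measure; which metrics a particular descriptor offers is reported by getDissimilarityMetrics().

```java
import java.util.BitSet;

/** Hypothetical Tversky-style dissimilarity; asymmetric whenever alpha != beta. */
final class ToyTversky {
    private ToyTversky() { }

    /**
     * Returns 1 - c / (c + alpha * a + beta * b), where c counts the bits
     * common to both sets, a the bits only in query and b the bits only in target.
     */
    static float dissimilarity(BitSet query, BitSet target, float alpha, float beta) {
        BitSet common = (BitSet) query.clone();
        common.and(target);
        BitSet onlyQuery = (BitSet) query.clone();
        onlyQuery.andNot(target);
        BitSet onlyTarget = (BitSet) target.clone();
        onlyTarget.andNot(query);

        float c = common.cardinality();
        float denominator = c + alpha * onlyQuery.cardinality() + beta * onlyTarget.cardinality();
        return denominator == 0f ? 0f : 1f - c / denominator;
    }
}
```

With alpha = 1 and beta = 0, a query with bits {0, 1, 2} measured against a target with bit {0} yields about 0.67, while the swapped call yields 0, so the order of the two descriptors matters.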
-
getDissimilarity
public abstract float getDissimilarity(MolecularDescriptor other, int parametrizedMetricIndex)
Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; apart from that it is the same as getDissimilarity(MolecularDescriptor other).
- Parameters:
other - a descriptor to which the dissimilarity ratio is measured
parametrizedMetricIndex - the index of the parameterized metric to use
- Returns:
- dissimilarity ratio
- See Also:
MDParameters, PFParameters
-
getLowerBound
public float getLowerBound(MolecularDescriptor other)
Calculates an estimate for the minimum value of the distance using the default distance metric.
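How such an estimate is computed is not described here. As one possible illustration, a lower bound for the Tanimoto dissimilarity of two bit fingerprints can be derived from their bit counts alone: the intersection cannot exceed the smaller count and the union cannot be smaller than the larger count. ToyLowerBound is a hypothetical class, and the choice of Tanimoto is an assumption.

```java
/** Hypothetical bit-count lower bound for the Tanimoto dissimilarity. */
final class ToyLowerBound {
    private ToyLowerBound() { }

    /**
     * Similarity is at most min(a, b) / max(a, b), so the dissimilarity
     * is at least 1 - min(a, b) / max(a, b).
     */
    static float tanimotoLowerBound(int bitCountA, int bitCountB) {
        int max = Math.max(bitCountA, bitCountB);
        if (max == 0) {
            return 0f; // two empty fingerprints are treated as identical
        }
        return 1f - (float) Math.min(bitCountA, bitCountB) / max;
    }
}
```

A cheap bound of this kind lets a search skip the full dissimilarity calculation whenever the bound already exceeds the threshold in use.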
-
main
public static void main(String[] args)
-
-