Package chemaxon.descriptors
Class MolecularDescriptor
java.lang.Object
chemaxon.descriptors.MolecularDescriptor
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
BCUT, ChemicalFingerprint, CustomDescriptor, ECFP, PharmacophoreFingerprint, ReactionFingerprint, ScalarDescriptor, ShapeDescriptor
Generic definition of molecular descriptors. The MolecularDescriptor class models all kinds of structural keys, fingerprints (hashed, pharmacophoric), MDL keys and many others that can be implemented in derived classes; some of these are implemented in JChem.
For the sake of generality, the MolecularDescriptor class does not introduce operations that manipulate descriptors at the "atomic" level; that is, cells (or bins) of descriptors cannot be accessed either for reading or for writing. This is because cells in various chemical descriptors can have different types (for example bit, integer or floating point value).
Operations between different MolecularDescriptor derivatives are not supported, though for the sake of efficiency no extra type checking is introduced (other than that provided by the language itself).
Derived classes should define their own dissimilarity metrics for the sake of efficiency. This is one reason why this class does not define any dissimilarity metric; the other is that different MolecularDescriptor subclasses may have different representations (for instance integer array vs. float array).
- Since:
- JChem 2.0
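The division of labour described above can be illustrated with a minimal sketch. All class names below (SketchDescriptor, BitSketchDescriptor, FloatSketchDescriptor) are hypothetical stand-ins and are not part of chemaxon.descriptors; the choice of Tanimoto and Euclidean metrics is likewise an assumption made for illustration. The point is only that each subclass computes dissimilarity directly on its own cell type, and that mixing subclasses is caught by nothing more than the language's cast.

```java
import java.util.BitSet;

// Hypothetical stand-in for the base class: it fixes neither a metric nor a cell type.
abstract class SketchDescriptor {
    abstract float getDissimilarity(SketchDescriptor other);
}

// Bit-based descriptor with a Tanimoto dissimilarity: 1 - |A and B| / |A or B|.
final class BitSketchDescriptor extends SketchDescriptor {
    private final BitSet bits;

    BitSketchDescriptor(BitSet bits) {
        this.bits = (BitSet) bits.clone();
    }

    @Override
    float getDissimilarity(SketchDescriptor other) {
        // Only the language-level cast guards against mixing descriptor types.
        BitSet o = ((BitSketchDescriptor) other).bits;
        BitSet common = (BitSet) bits.clone();
        common.and(o);
        BitSet union = (BitSet) bits.clone();
        union.or(o);
        int unionSize = union.cardinality();
        return unionSize == 0 ? 0f : 1f - (float) common.cardinality() / unionSize;
    }
}

// Float-based descriptor with a Euclidean distance (arrays assumed to have equal length).
final class FloatSketchDescriptor extends SketchDescriptor {
    private final float[] cells;

    FloatSketchDescriptor(float... cells) {
        this.cells = cells.clone();
    }

    @Override
    float getDissimilarity(SketchDescriptor other) {
        float[] o = ((FloatSketchDescriptor) other).cells;
        double sum = 0.0;
        for (int i = 0; i < cells.length; i++) {
            double d = cells[i] - o[i];
            sum += d * d;
        }
        return (float) Math.sqrt(sum);
    }
}
```

In this sketch, passing a FloatSketchDescriptor to a BitSketchDescriptor fails with a ClassCastException, which mirrors the "no extra type checking" remark above.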
-
Field Summary
Modifier and Type Field: Description
protected MDParameters params: Parameter settings related to the descriptor.
-
Constructor Summary
Modifier Constructor: Description
protected MolecularDescriptor(): Default constructor, creates an empty object.
protected MolecularDescriptor(MDParameters parameters): Creates a new MolecularDescriptor with the given parameters.
protected MolecularDescriptor(MolecularDescriptor c): Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
-
Method Summary
Modifier and Type Method: Description
abstract MolecularDescriptor clone(): Creates a new instance with identical internal state.
abstract void fromData(byte[] dbRepr): Builds a MolecularDescriptor object from its external (database) representation.
abstract void fromFloatArray(float[] descr): Builds a molecular descriptor from its float array representation.
abstract void fromString(String descr): Builds a molecular descriptor from its string representation.
String[] generate: Creates the descriptor for the given Molecule.
String[] generate: Creates the descriptor for the given Molecule.
Color[] getAtomSetColors(): Determines the coloring of atoms.
int[] getAtomSetIndexes: Gets the individual atom color indexes.
String[] getAtomSetNames
abstract float[] getDefaultDissimilarityMetricThresholds(): Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
int getDefaultMetricIndex(): Gets the index of the default metric.
float getDefaultThreshold(int metricIndex): Gets a metric dependent default threshold value.
abstract float getDissimilarity(MolecularDescriptor other): Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric.
abstract float getDissimilarity(MolecularDescriptor other, int parametrizedMetricIndex): Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; apart from that it is the same as getDissimilarity(MolecularDescriptor other).
int getDissimilarityMetricIndex(String metricName): Gets the internal index of the given metric.
abstract String[] getDissimilarityMetrics(): Gets the dissimilarity metric names in an array.
float getLowerBound(MolecularDescriptor other): Calculates an estimate for the minimum value of the distance using the default distance metric.
int getMetricIndex(String metricName): Gets the index of the given parameterized metric.
getMetricName: Gets the name of the current parameterized metric.
getMetricName(int metricIndex): Gets the name of a parameterized metric specified by its index.
getName(): Gets the name of the descriptor.
int getNumberOfMetrics(): Gets the number of parameterized metrics available for the particular descriptor.
int getNumberOfWeights(String dissimilarityMetricName): Gets the number of weight factors used by the specified metric.
getParameters: Gets the parameters associated with the object.
getParametersClassName: Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
getShortName: Gets the short name of the descriptor.
float getThreshold(): Gets threshold value of the current parameterized metric.
float getThreshold(int metricIndex): Gets a metric dependent default threshold value.
static void main(String[] args): Deprecated, for removal: This API element is subject to removal in a future version. Will be removed, no replacement.
boolean needsConfig(): Indicates if class takes parameters from configuration file.
static MolecularDescriptor newInstance(String descriptorTypeName): Creates a MolecularDescriptor specified by its name.
static MolecularDescriptor newInstance(String descriptorTypeName, String parameters): Creates a MolecularDescriptor specified by its name and XML parameter.
static MolecularDescriptor newInstanceFromXML(String parameters): Creates a new MolecularDescriptor instance according to the given parameter string.
void setParameters(MDParameters parameters): Sets the parameters for an already created MolecularDescriptor.
abstract void setParameters(String parameters): Sets the parameters for an already created MolecularDescriptor.
void setScreeningConfiguration(String config): Sets the screening configuration.
toBinaryString: Creates the binary string representation of a MolecularDescriptor object.
abstract byte[] toData(): Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
abstract String toDecimalString: Creates the string representation of a MolecularDescriptor object.
abstract float[] toFloatArray(): Creates the float array representation of a MolecularDescriptor object.
abstract String toString(): Creates the string representation of a MolecularDescriptor object.
-
Field Details
-
params
Parameter settings related to the descriptor. Instances of the MolecularDescriptor class with the same parameter setting share a common MDParameters object; however, MolecularDescriptor objects having different parameters are also allowed in the same application. MDParameters can be specialized with inheritance, thus classes derived from MolecularDescriptor are recommended to derive their own MDParameters subclasses.
-
-
Constructor Details
-
MolecularDescriptor
Creates a new MolecularDescriptor with the given parameters.
- Parameters:
parameters
- parameter settings of the descriptor to be created
- Since:
- JChem 2.2
-
MolecularDescriptor
protected MolecularDescriptor()
Default constructor, creates an empty object.
-
MolecularDescriptor
Copy constructor, creates an identical copy of the MolecularDescriptor passed as a parameter.
- Parameters:
c
- a MolecularDescriptor to be copied
-
-
Method Details
-
newInstance
Creates a MolecularDescriptor specified by its name and XML parameter.
- Parameters:
descriptorTypeName
- predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
parameters
- XML parameter configuration
- Returns:
- a new instance of the appropriate MolecularDescriptor
-
newInstance
Creates a MolecularDescriptor specified by its name. The descriptor is created with the default parameter settings.
- Parameters:
descriptorTypeName
- predefined type name or class name. If the name does not match any of the predefined names, it is assumed to be a class name.
- Returns:
- a new instance of the appropriate MolecularDescriptor
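The name resolution rule stated for descriptorTypeName (try the predefined names first, otherwise treat the argument as a class name) might look roughly like the following sketch. It does not reproduce JChem's actual lookup: the class DescriptorFactorySketch, the short names "Text" and "Bits", and the use of IllegalArgumentException for unknown names are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.function.Supplier;

final class DescriptorFactorySketch {
    // Hypothetical short names; the predefined names JChem recognizes are not listed on this page.
    private static final Map<String, Supplier<Object>> PREDEFINED = Map.of(
            "Text", () -> new StringBuilder(),
            "Bits", () -> new java.util.BitSet());

    static Object newInstance(String typeName) {
        Supplier<Object> predefined = PREDEFINED.get(typeName);
        if (predefined != null) {
            return predefined.get();
        }
        try {
            // No predefined match: treat the argument as a fully qualified class name.
            return Class.forName(typeName).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Assumed error handling; the real method's behaviour is not documented here.
            throw new IllegalArgumentException("Unknown descriptor type: " + typeName, e);
        }
    }
}
```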
-
newInstanceFromXML
Creates a new MolecularDescriptor instance according to the given parameter string. The parameter string is an XML configuration that specifies the type of the MolecularDescriptor as well as its parameter settings.
- Parameters:
parameters
- XML configuration string
- Returns:
- a new instance of the appropriate MolecularDescriptor
-
clone
Creates a new instance with identical internal state.
-
getName
Gets the name of the descriptor. The name is not the same as the class name: it is nicer, more readable and meaningful for end users too.
- Returns:
- the descriptor's name depending on its actual type
-
getShortName
Gets the short name of the descriptor.
- Returns:
- the short name used in text outputs (tables etc.)
-
getParametersClassName
Gets the name of the parameters class corresponding to the descriptor (prefixed with the package name as getClass().getName() would return).
- Returns:
- the name of the parameters class
-
setParameters
Sets the parameters for an already created MolecularDescriptor.
- Parameters:
parameters
- parameter settings of the descriptor to be created
-
setParameters
Sets the parameters for an already created MolecularDescriptor.
- Parameters:
parameters
- parameter settings of the descriptor. If it is null or empty, then the default settings will be used.
- Throws:
MDParametersException
-
getParameters
Gets the parameters associated with the object.
- Returns:
- parameter settings
-
needsConfig
public boolean needsConfig()
Indicates whether the class takes parameters from a configuration file. Derived classes have to override this method appropriately.
- Returns:
- true, most descriptor classes must have a configuration (file)
- Since:
- JChem 2.2
-
setScreeningConfiguration
Sets the screening configuration. Overwrites old parameters with the new ones; parameters not affected by the screening configuration remain unchanged.
- Parameters:
config
- screening configuration string
- Throws:
MDParametersException
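The overwrite rule described for setScreeningConfiguration (new values replace old ones, untouched parameters survive) is the usual merge semantics. A minimal sketch follows, assuming parameters can be modelled as string key/value pairs; the class ScreeningConfigSketch and the map model are assumptions, since the real configuration string format is not described on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ScreeningConfigSketch {
    // Returns a copy of the current parameters with the screening values laid over them.
    static Map<String, String> apply(Map<String, String> current, Map<String, String> screening) {
        Map<String, String> merged = new LinkedHashMap<>(current);
        merged.putAll(screening); // only the keys named by the screening configuration are overwritten
        return merged;
    }
}
```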
-
toData
public abstract byte[] toData()
Converts the internal (memory) representation of a MolecularDescriptor instance into an external format that can be stored in a database.
- Returns:
- binary representation of the descriptor
-
fromData
public abstract void fromData(byte[] dbRepr)
Builds a MolecularDescriptor object from its external (database) representation.
- Parameters:
dbRepr
- an array generated by toData()
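The contract between toData() and fromData(byte[]) is a round trip: whatever one writes, the other must be able to read back. The sketch below shows one way a subclass with float cells could satisfy it. The class DataRoundTripSketch and the length-prefixed layout are assumptions; the byte layout actually used by JChem descriptors is not documented here.

```java
import java.nio.ByteBuffer;

final class DataRoundTripSketch {
    // Assumed layout: a 4-byte cell count followed by 4 bytes per float cell.
    static byte[] toData(float[] cells) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * cells.length);
        buf.putInt(cells.length);
        for (float c : cells) {
            buf.putFloat(c);
        }
        return buf.array();
    }

    // Reads back exactly what toData wrote.
    static float[] fromData(byte[] dbRepr) {
        ByteBuffer buf = ByteBuffer.wrap(dbRepr);
        float[] cells = new float[buf.getInt()];
        for (int i = 0; i < cells.length; i++) {
            cells[i] = buf.getFloat();
        }
        return cells;
    }
}
```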
-
toString
Creates the string representation of a MolecularDescriptor object. This string value is stored in SDfiles, though the use of this string is not limited to this purpose. Typically, this string is compact; for instance, zero values are not necessarily printed.
-
toDecimalString
Creates the string representation of a MolecularDescriptor object. This string value contains all values of the descriptor (including all zeros); values are separated by tabs.
- Returns:
- a formatted string of the descriptor
-
toBinaryString
Creates the binary string representation of a MolecularDescriptor object.
- Returns:
- a 0,1 string of the descriptor
- Since:
- JChem 2.3
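The three string forms above differ in how much of the descriptor they spell out. The following sketch contrasts them on a small integer array. Only the tab-separated layout of the decimal form comes from the descriptions above; the class StringFormsSketch, the "index:value" compact layout, and the rule that any non-zero cell prints as 1 are assumptions made for illustration.

```java
final class StringFormsSketch {
    // All cells, zeros included, separated by tabs (as described for toDecimalString).
    static String toDecimalString(int[] cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(cells[i]);
        }
        return sb.toString();
    }

    // One character per cell; printing every non-zero cell as '1' is an assumption.
    static String toBinaryString(int[] cells) {
        StringBuilder sb = new StringBuilder(cells.length);
        for (int cell : cells) {
            sb.append(cell != 0 ? '1' : '0');
        }
        return sb.toString();
    }

    // Hypothetical compact form: "index:value" pairs for non-zero cells only.
    static String toCompactString(int[] cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            if (cells[i] == 0) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(i).append(':').append(cells[i]);
        }
        return sb.toString();
    }
}
```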
-
fromString
Builds a molecular descriptor from its string representation. Typically used when an SDfile is read.
- Parameters:
descr
- descriptor string, previously generated by toString()
- Throws:
ParseException
-
toFloatArray
public abstract float[] toFloatArray()
Creates the float array representation of a MolecularDescriptor object. This array contains all values of the descriptor (including all zeros) in the elements of the array.
- Returns:
- a formatted float array of the descriptor
- Since:
- JChem 2.0.1
-
fromFloatArray
public abstract void fromFloatArray(float[] descr)
Builds a molecular descriptor from its float array representation. Typically used when a hypothesis is created.
- Parameters:
descr
- descriptor represented in a float array (e.g. generated by toFloatArray())
- Since:
- JChem 2.0.1
-
generate
Creates the descriptor for the given Molecule.
- Returns:
- property names set in the molecule passed during generation
- Throws:
MDGeneratorException
- when the descriptor could not be generated
-
generate
Creates the descriptor for the given Molecule.
- Parameters:
molRepr
- String representation of the molecule, e.g. SMILES
- Returns:
- property names set in the molecule passed during generation
- Throws:
MDGeneratorException
- when the descriptor could not be generated
-
getAtomSetColors
Determines the coloring of atoms. This coloring does not reflect element types; instead, other categories related to the specific descriptor are considered. Therefore, whenever the parameters are changed, it is advisable to call getAtomSetColors() to obtain the current coloring scheme. Typically, the coloring scheme is defined in the MDParameters object associated with the MolecularDescriptor object.
- Returns:
- array of colors of different atom categories
-
getAtomSetNames
-
getAtomSetIndexes
Gets the individual atom color indexes. This allows color mapping and visualization of various properties encoded into the MolecularDescriptor. Prior to this method, getAtomSetColors() has to be called to obtain a color array. This method returns a per-atom color index array, and the returned indexes refer to the color array returned by getAtomSetColors.
- Parameters:
m
- a molecule to assign atom colors to
- Returns:
- array of color indexes (indexed by atom indexes)
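The relationship between the two arrays is a plain lookup: the index array from getAtomSetIndexes selects, for each atom, an entry of the palette from getAtomSetColors. A minimal sketch follows; the class AtomColorSketch is hypothetical, and colors are modelled as packed RGB ints rather than the Color[] the real method returns, purely to keep the example self-contained.

```java
final class AtomColorSketch {
    // palette: one color per atom category (stand-in for the getAtomSetColors() result).
    // atomSetIndexes: one palette index per atom (stand-in for the getAtomSetIndexes result).
    static int[] colorPerAtom(int[] palette, int[] atomSetIndexes) {
        int[] colors = new int[atomSetIndexes.length];
        for (int atom = 0; atom < atomSetIndexes.length; atom++) {
            colors[atom] = palette[atomSetIndexes[atom]];
        }
        return colors;
    }
}
```

Because the indexes only make sense against the palette they were generated for, the palette should be fetched again whenever the descriptor parameters change, as the text above advises.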
-
getDissimilarityMetrics
Gets the dissimilarity metric names in an array.
This method must be overridden by derived classes in order to get the metrics array depending on the dynamic type. (This is needed because the metrics[] array is a class variable, but class variables are shared among all derived classes.)
- Returns:
- the metrics array
-
getDefaultDissimilarityMetricThresholds
public abstract float[] getDefaultDissimilarityMetricThresholds()
Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
- Returns:
- array of dissimilarity threshold values
-
getDissimilarityMetricIndex
Gets the internal index of the given metric.
- Parameters:
metricName
- name of a metric
- Returns:
- index of the specified metric
- Throws:
IllegalArgumentException
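A name-to-index lookup with the documented failure mode can be sketched as a linear search over the names array. The class MetricIndexSketch is hypothetical, and exact, case-sensitive matching is an assumption; the page does not say how the real method compares names.

```java
final class MetricIndexSketch {
    // metricNames stands in for the array returned by getDissimilarityMetrics().
    static int indexOf(String[] metricNames, String metricName) {
        for (int i = 0; i < metricNames.length; i++) {
            if (metricNames[i].equals(metricName)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unknown metric: " + metricName);
    }
}
```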
-
getNumberOfWeights
Gets the number of weight factors used by the specified metric. This method can be applied to the dissimilarity metrics provided by the MolecularDescriptor class or its derived classes, but not to parameterized metrics.
- Parameters:
dissimilarityMetricName
- name as returned by getDissimilarityMetrics()
- Returns:
- number of weights the metric uses
- Throws:
IllegalArgumentException
- if the given parameter is not a valid metric name
-
getNumberOfMetrics
public int getNumberOfMetrics()
Gets the number of parameterized metrics available for the particular descriptor.
- Returns:
- number of metrics implemented in this class
-
getMetricIndex
Gets the index of the given parameterized metric.
- Parameters:
metricName
- name of a metric
- Returns:
- index of the specified metric
- Throws:
IllegalArgumentException
- when the given metric name is not valid
-
getDefaultThreshold
public float getDefaultThreshold(int metricIndex)
Gets a metric dependent default threshold value. The actual value of this hard-wired parameter is not important, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
- Parameters:
metricIndex
- index of a parameterized metric
-
getThreshold
public float getThreshold(int metricIndex)
Gets a metric dependent default threshold value. Ideally, this value should be based on statistics, though the actual value is not too critical, since it is only used in user interfaces to simplify the use of applications for beginners. Note: this method is kept for compatibility reasons.
- Parameters:
metricIndex
- index of a parameterized metric
-
getThreshold
public float getThreshold()
Gets the threshold value of the current parameterized metric.
- Returns:
- threshold value
-
getMetricName
Gets the name of the current parameterized metric.
- Returns:
- name of the current metric
-
getMetricName
Gets the name of a parameterized metric specified by its index. Note: this method is kept for backward compatibility.
- Parameters:
metricIndex
- metric index
- Returns:
- name of the given metric
-
getDefaultMetricIndex
public int getDefaultMetricIndex()
Gets the index of the default metric. The default metric is the one among the available metrics that is most commonly used for the given MolecularDescriptor.
- Returns:
- metric index of the default metric
-
getDissimilarity
Calculates the dissimilarity ratio between two MolecularDescriptor objects using the default metric. The default metric is set in the corresponding MDParameters object. In the case of asymmetric distances, swapping the two descriptors can make a big difference.
- Parameters:
other
- a descriptor, to which the dissimilarity ratio is measured
- Returns:
- dissimilarity ratio
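The remark about asymmetric distances can be made concrete with the Tversky index, a standard asymmetric similarity measure on sets. Whether any particular descriptor's default metric is a Tversky variant is not stated on this page, so the sketch below is only an illustration; the class AsymmetricSketch and its alpha and beta parameters are hypothetical names.

```java
import java.util.BitSet;

final class AsymmetricSketch {
    // Tversky dissimilarity: 1 - c / (c + alpha*|A\B| + beta*|B\A|), where c = |A and B|.
    static float tversky(BitSet a, BitSet b, float alpha, float beta) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet onlyA = (BitSet) a.clone();
        onlyA.andNot(b);
        BitSet onlyB = (BitSet) b.clone();
        onlyB.andNot(a);
        float c = common.cardinality();
        float denominator = c + alpha * onlyA.cardinality() + beta * onlyB.cardinality();
        return denominator == 0f ? 0f : 1f - c / denominator;
    }
}
```

With alpha = 1 and beta = 0, a four-bit set measured against a two-bit subset of itself gives 0.5, while the same pair with the arguments swapped gives 0, which is the kind of difference the description warns about.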
-
getDissimilarity
Calculates the dissimilarity between two MolecularDescriptor objects using the specified metric; apart from that it is the same as getDissimilarity(MolecularDescriptor other).
- Parameters:
other
- a descriptor, to which the dissimilarity ratio is measured
parametrizedMetricIndex
- the index of the parameterized metric to be used
- Returns:
- dissimilarity ratio
- See Also:
-
getLowerBound
Calculates an estimate for the minimum value of the distance using the default distance metric.
-
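A lower-bound estimate is useful when it is much cheaper than the full distance, because pairs whose bound already exceeds a threshold can be skipped. The sketch below shows one well-known bound for the Tanimoto dissimilarity of bit sets that needs only the two bit counts. It is an illustration under assumptions: the page does not say how getLowerBound is computed, and the class LowerBoundSketch is hypothetical.

```java
import java.util.BitSet;

final class LowerBoundSketch {
    // Full Tanimoto dissimilarity: 1 - |A and B| / |A or B|.
    static float tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionSize = union.cardinality();
        return unionSize == 0 ? 0f : 1f - (float) common.cardinality() / unionSize;
    }

    // |A and B| <= min(|A|, |B|) and |A or B| >= max(|A|, |B|),
    // so the dissimilarity can never be smaller than 1 - min/max.
    static float lowerBound(int cardinalityA, int cardinalityB) {
        int max = Math.max(cardinalityA, cardinalityB);
        return max == 0 ? 0f : 1f - (float) Math.min(cardinalityA, cardinalityB) / max;
    }
}
```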
main
@SubjectToRemoval(date=JAN_01_2025) @Deprecated(forRemoval=true) public static void main(String[] args)
Deprecated, for removal: This API element is subject to removal in a future version. Will be removed, no replacement.
-