Package chemaxon.descriptors
Class MDSimilarity
java.lang.Object
chemaxon.descriptors.MDSimilarity
- All Implemented Interfaces:
chemaxon.license.Licensable
Performs similarity comparisons between MDSets (see MDSet), for example sets of chemical fingerprints and/or pharmacophore fingerprints. Comparisons can be performed once all query descriptor sets against which the target descriptor sets will be compared have been added, the metrics to be used have been set, and the filtering options have been chosen. If filtering is applied, the corresponding thresholds must also be given.
After a comparison, results can be retrieved by calling getDissimilarityCoeff() or getDissimilarityCoeffs().
Typical usage:
    MDSimilarity similarity = new MDSimilarity();

    // Add queries from MDReader
    similarity.addQueries( queryReader );

    // Setup metrics and thresholds
    for ( int d = 0; d < descriptorCount; d++ ) {
        for ( int m = 0; m < metricIndices[ d ].length; m++ ) {
            similarity.useMetric( d, metricIndices[ d ][ m ], thresholds[ d ][ m ] );
        }
    }

    // Setup filtering
    if ( andMetrics ) similarity.passWithAllMetrics();
    else similarity.passWithOneMetric();
    if ( andDescriptors ) similarity.passWithAllDescriptors();
    else similarity.passWithOneDescriptor();

    // Setup result writer (table writer in this case)
    MDSimilarityTableWriter twr = new MDSimilarityTableWriter( outputStream, precision );
    if ( !verboseSet ) {
        twr.setVerbosity( verbose );
        twr.setVerboseFrequency( verboseFreq );
        verboseSet = true;
    }
    twr.setPrintId( generateId );
    if ( idTagName != null ) {
        twr.setPrintNaturalId( true );
        twr.setNaturalIdName( idTagName );
    }
    twr.setPrecision( precision );
    similarity.addResultWriter( twr );

    // Perform comparisons, results are written into the specified result writer
    similarity.compare( targetReader );
- Since:
- JChem 2.0
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
void | addQueries(MDReader queryReader) | Adds new query molecules as their set of descriptors from a chemical descriptor reader.
void | addQueries(MDSet[] queries) | Adds new query molecules as their set of descriptors from an array.
void | addQuery(MDSet query) | Adds a new query molecule as its set of descriptors.
void | addResultWriter(MDSimilarityResultWriter rwr) | Adds a MDSimilarityResultWriter object.
boolean | compare(int mdIndex, int metricIndex, MDSet target) | Compares a target descriptor against all queries added prior to the call of this method using the given metric of the given descriptor.
int | compare(MDReader targetReader) | Compares a list of target descriptor sets (read by a molecular descriptor reader) against all queries added prior to the call of this method the same way as compareQueries( MolecularDescriptor target ) but for each target.
boolean | compare(MDSet target) | Compares a target descriptor set (for instance from a database) against all queries added prior to the call of this method.
float | getDissimilarityCoeff(int queryIndex, int mdIndex, int metricIndex) | Retrieves query dissimilarity coefficients (one at a time) of the last compareQueries() or compare() method called.
float[][] | getDissimilarityCoeffs(int queryIndex) | Retrieves query dissimilarity coefficients with all metrics and one query of the last compareQueries() or compare() method called.
float[] | getDissimilarityCoeffs(int queryIndex, int mdIndex) | Retrieves query dissimilarity coefficients with all metrics and one descriptor of the last compareQueries() or compare() method called.
int | getNrOfQueries() | Gets the number of queries that have already been added.
int | getNrOfUsedMetrics(int mdIndex) | Returns the number of metrics used with the given molecular descriptor in similarity calculations.
MDSet | getQuery(int queryIndex) | Gets a query.
boolean | isComponentWise() | Checks the component-wise flag.
boolean | isLicensed() |
boolean | isPassWithAllDescriptors() | Tells whether filtering of target descriptor sets is set to pass only if each descriptor in the set passes.
boolean | isPassWithAllMetrics() | Tells whether filtering of target descriptor sets is set to pass only if dissimilarity calculated with each metric used with the descriptor is under the required threshold.
boolean | isPassWithOneDescriptor() | Tells whether filtering of target descriptor sets is set to pass if at least one descriptor in the set passes.
boolean | isPassWithOneMetric() | Tells whether filtering of target descriptor sets is set to pass if dissimilarity calculated with at least one metric used with the descriptor is under the required threshold.
boolean | isUsedMetric(int mdIndex, int metricIndex) | Returns whether the given metric is used with the given molecular descriptor in similarity calculations.
void | passWithAllDescriptors() | In the following searches the descriptor set of a target molecule passes the comparison with a query descriptor set, if all descriptors of the set have passed the corresponding comparisons.
void | passWithAllMetrics() | In the following searches a target molecule's molecular descriptor passes the comparison with a corresponding query descriptor, if all dissimilarity coefficients (distances calculated with each metric) between these descriptors are under the previously given threshold.
void | passWithOneDescriptor() | In the following searches the descriptor set of a target molecule passes the comparison with a query descriptor set, if at least one descriptor of the set has passed the corresponding comparisons.
void | passWithOneMetric() | In the following searches a target molecule's molecular descriptor passes the comparison with a corresponding query descriptor, if at least one dissimilarity coefficient between these descriptors is under the previously given threshold.
void | setComponentWise(boolean componentWise) | Sets MDSet evaluation mode.
void | setLicenseEnvironment |
void | setThreshold(float threshold) | Sets threshold for descriptor set mode.
float | threshold(int mdIndex, int metricIndex) | Returns the acceptance threshold of the given metric for the given molecular descriptor.
void | useMetric(int mdIndex, int metricIndex) | Uses the specified metric for the specified molecular descriptor with the dissimilarity threshold stored in the corresponding parameters settings.
void | useMetric(int mdIndex, int metricIndex, float threshold) | Uses the specified metric for the specified molecular descriptor along with the given dissimilarity threshold.
-
Constructor Details
-
MDSimilarity
public MDSimilarity()
Creates a new instance. Allocates internal storage.
-
-
Method Details
-
setComponentWise
public void setComponentWise(boolean componentWise)
Sets MDSet evaluation mode. The default is composite (descriptor set) mode, in which one dissimilarity value is calculated for each descriptor set: the selected or default metric is applied to each component, and the weighted sum of the resulting dissimilarity values is taken. In component-wise mode each component of a descriptor set yields its own dissimilarity value, and these values are kept independent in screening (i.e. they are not summed).
- Parameters:
componentWise
- indicates component-wise evaluation mode
- Since:
- JChem 2.2
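The difference between the two modes can be illustrated with a small standalone sketch. It does not use the JChem API: the class EvaluationModeSketch, the weights and the thresholds are hypothetical, the weighted sum is taken literally from the description above, and the strict "under the threshold" comparison is an assumption.

```java
// Standalone illustration only; not part of chemaxon.descriptors.
class EvaluationModeSketch {

    // Composite (descriptor set) mode: one value per descriptor set,
    // computed as the weighted sum of the per-component dissimilarities.
    static float composite(float[] componentDissimilarities, float[] weights) {
        if (componentDissimilarities.length != weights.length) {
            throw new IllegalArgumentException("one weight per component is required");
        }
        float sum = 0f;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * componentDissimilarities[i];
        }
        return sum;
    }

    // Component-wise mode: the values stay independent, so each one is
    // checked against its own threshold instead of being summed.
    // Assumption: "passes" means strictly under the threshold.
    static boolean[] componentWise(float[] componentDissimilarities, float[] thresholds) {
        if (componentDissimilarities.length != thresholds.length) {
            throw new IllegalArgumentException("one threshold per component is required");
        }
        boolean[] passed = new boolean[componentDissimilarities.length];
        for (int i = 0; i < passed.length; i++) {
            passed[i] = componentDissimilarities[i] < thresholds[i];
        }
        return passed;
    }
}
```

With two components, composite() produces a single number to compare against the threshold given to setThreshold(), whereas componentWise() keeps one pass/fail decision per component.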
-
addResultWriter
Adds a MDSimilarityResultWriter object. An MDSimilarity instance can have an arbitrary number and type of such MDSimilarityResultWriters, and all of them are invoked (in the order in which they were added) after each target MDSet has been processed.
- Parameters:
rwr
- a result writer object
- Since:
- JChem 2.2
-
addQuery
Adds a new query molecule as its set of descriptors. The number of queries is not limited, but it is expected to be significantly smaller than the number of targets. In typical usage the number of queries does not exceed 10.
Once a query is added, it cannot be withdrawn. All added queries must be composed of the same kinds of descriptors.
- Parameters:
query
- Query descriptor set; it is not cloned.
-
addQueries
Adds new query molecules as their set of descriptors from an array.
- Parameters:
queries
- Array of query descriptor sets; it is not cloned.
-
addQueries
Adds new query molecules as their set of descriptors from a chemical descriptor reader.
- Parameters:
queryReader
- Molecular descriptor set reader of the queries.
- Throws:
MDReaderException
- when failed reading the next descriptor set
- Since:
- JChem 2.2
-
getQuery
Gets a query.
- Parameters:
queryIndex
- The index of the query (in order of addition), from 0 to getNrOfQueries() - 1 (both inclusive).
- Returns:
- The set of molecular descriptors of the query
-
getNrOfQueries
public int getNrOfQueries()
Gets the number of queries that have already been added.
- Returns:
- Number of query descriptors.
-
setThreshold
public void setThreshold(float threshold)
Sets threshold for descriptor set mode. (Component-wise mode uses different threshold values for each descriptor component and metric.)
- Parameters:
threshold
- similarity threshold
- Since:
- JChem 2.2
-
useMetric
public void useMetric(int mdIndex, int metricIndex, float threshold)
Uses the specified metric for the specified molecular descriptor along with the given dissimilarity threshold.
- Parameters:
mdIndex
- Index of the molecular descriptor in the set.
metricIndex
- Index of the metric.
threshold
- Maximum dissimilarity allowed.
-
useMetric
public void useMetric(int mdIndex, int metricIndex)
Uses the specified metric for the specified molecular descriptor with the dissimilarity threshold stored in the corresponding parameters settings.
- Parameters:
mdIndex
- Index of the molecular descriptor in the set.
metricIndex
- Index of the metric.
-
isUsedMetric
public boolean isUsedMetric(int mdIndex, int metricIndex)
Returns whether the given metric is used with the given molecular descriptor in similarity calculations.
- Parameters:
mdIndex
- Index of the molecular descriptor in the set.
metricIndex
- Index of the metric.
- Returns:
- Metric in use flag.
-
getNrOfUsedMetrics
public int getNrOfUsedMetrics(int mdIndex)
Returns the number of metrics used with the given molecular descriptor in similarity calculations.
- Parameters:
mdIndex
- Index of the molecular descriptor in the set.
- Returns:
- Number of metrics in use with the descriptor.
-
threshold
public float threshold(int mdIndex, int metricIndex)
Returns the acceptance threshold of the given metric for the given molecular descriptor.
- Parameters:
mdIndex
- Index of the molecular descriptor in the set.
metricIndex
- Index of the metric.
- Returns:
- Threshold value, or -1.0F if the metric is not used.
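How useMetric(), isUsedMetric(), getNrOfUsedMetrics() and threshold() relate to one another can be sketched with a tiny standalone registry. This is not the JChem implementation: the class MetricRegistrySketch and its fixed-size constructor are hypothetical, and the sketch assumes real thresholds are never negative so that -1.0F can mark an unused metric.

```java
// Standalone illustration only; not part of chemaxon.descriptors.
class MetricRegistrySketch {

    // Sentinel reported for a metric that has not been selected.
    private static final float NOT_USED = -1.0f;

    // thresholds[mdIndex][metricIndex]
    private final float[][] thresholds;

    // Hypothetical constructor: sizes are fixed up front to keep the sketch short.
    MetricRegistrySketch(int descriptorCount, int metricCount) {
        thresholds = new float[descriptorCount][metricCount];
        for (float[] row : thresholds) {
            java.util.Arrays.fill(row, NOT_USED);
        }
    }

    // Selects a metric for a descriptor and records its maximum allowed dissimilarity.
    void useMetric(int mdIndex, int metricIndex, float threshold) {
        thresholds[mdIndex][metricIndex] = threshold;
    }

    boolean isUsedMetric(int mdIndex, int metricIndex) {
        return thresholds[mdIndex][metricIndex] != NOT_USED;
    }

    // Counts the metrics selected for one descriptor.
    int getNrOfUsedMetrics(int mdIndex) {
        int count = 0;
        for (float t : thresholds[mdIndex]) {
            if (t != NOT_USED) {
                count++;
            }
        }
        return count;
    }

    // Returns the stored threshold, or -1.0F when the metric is not used.
    float threshold(int mdIndex, int metricIndex) {
        return thresholds[mdIndex][metricIndex];
    }
}
```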
-
isComponentWise
public boolean isComponentWise()
Checks the component-wise flag.
- Returns:
- true if screening works in component-wise mode
- Since:
- JChem 2.2
-
passWithAllMetrics
public void passWithAllMetrics()
In subsequent searches, a target molecule's molecular descriptor passes the comparison with the corresponding query descriptor only if all dissimilarity coefficients (distances calculated with each metric) between these descriptors are under the previously given threshold. If this flag is not set, one coefficient under the threshold is enough for passing (default).
-
isPassWithAllMetrics
public boolean isPassWithAllMetrics()
Tells whether filtering of target descriptor sets is set to pass only if dissimilarity calculated with each metric used with the descriptor is under the required threshold.
- Returns:
- true if the condition is met
-
passWithOneMetric
public void passWithOneMetric()
In subsequent searches, a target molecule's molecular descriptor passes the comparison with the corresponding query descriptor if at least one dissimilarity coefficient between these descriptors is under the previously given threshold. This is the default setting.
-
isPassWithOneMetric
public boolean isPassWithOneMetric()
Tells whether filtering of target descriptor sets is set to pass if dissimilarity calculated with at least one metric used with the descriptor is under the required threshold.
- Returns:
- true if the condition is met
-
passWithAllDescriptors
public void passWithAllDescriptors()
In subsequent searches, the descriptor set of a target molecule passes the comparison with a query descriptor set only if all descriptors of the set have passed the corresponding comparisons. If this flag is not set, one passing descriptor from the set is enough for passing (default).
-
isPassWithAllDescriptors
public boolean isPassWithAllDescriptors()
Tells whether filtering of target descriptor sets is set to pass only if each descriptor in the set passes.
- Returns:
- true if the condition is met
-
passWithOneDescriptor
public void passWithOneDescriptor()
In subsequent searches, the descriptor set of a target molecule passes the comparison with a query descriptor set if at least one descriptor of the set has passed the corresponding comparison. This is the default setting.
-
isPassWithOneDescriptor
public boolean isPassWithOneDescriptor()
Tells whether filtering of target descriptor sets is set to pass if at least one descriptor in the set passes.
- Returns:
- true if the condition is met
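The four pass settings combine into a two-level rule: metrics are combined per descriptor, then descriptors are combined per set. The following standalone sketch shows that rule under stated assumptions. It is not the JChem implementation: the class PassRuleSketch and the boolean parameters are hypothetical, and "under the threshold" is read as a strict comparison.

```java
// Standalone illustration only; not part of chemaxon.descriptors.
class PassRuleSketch {

    // coeffs[m] and thresholds[m] belong to metric m of one descriptor.
    // allMetrics == true mimics passWithAllMetrics(), false mimics passWithOneMetric().
    static boolean descriptorPasses(float[] coeffs, float[] thresholds, boolean allMetrics) {
        boolean any = false;
        boolean all = true;
        for (int m = 0; m < coeffs.length; m++) {
            boolean under = coeffs[m] < thresholds[m]; // assumption: strict comparison
            any |= under;
            all &= under;
        }
        return allMetrics ? all : any;
    }

    // coeffs[d][m]: descriptor d, metric m.
    // allDescriptors == true mimics passWithAllDescriptors(), false mimics passWithOneDescriptor().
    static boolean setPasses(float[][] coeffs, float[][] thresholds,
                             boolean allMetrics, boolean allDescriptors) {
        boolean any = false;
        boolean all = true;
        for (int d = 0; d < coeffs.length; d++) {
            boolean passed = descriptorPasses(coeffs[d], thresholds[d], allMetrics);
            any |= passed;
            all &= passed;
        }
        return allDescriptors ? all : any;
    }
}
```

Calling setPasses(coeffs, thresholds, false, false) corresponds to the default configuration, in which a single coefficient under its threshold is enough for the whole descriptor set to pass.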
-
compare
Compares a target descriptor against all queries added prior to the call of this method, using the given metric of the given descriptor. The results of the comparison (the dissimilarity coefficients) are stored internally, but only the results of the last comparison are kept; earlier values are discarded. It is therefore the responsibility of the user of this class to obtain the required values by calling getDissimilarityCoeff() or getDissimilarityCoeffs() after the comparison is performed.
The method can be used for filtering purposes, in which case its return value indicates whether the current target descriptor set is filtered out or not. Threshold values are set separately with useMetric().
- Parameters:
mdIndex
- Index of the molecular descriptor.
metricIndex
- Index of the metric.
target
- Target descriptor set.
- Returns:
- Target passed filtering or not.
- Throws:
RuntimeException
- in case of invalid configuration
-
compare
Compares a target descriptor set (for instance from a database) against all queries added prior to the call of this method. The results of the comparison (the dissimilarity coefficients) are stored internally, but only the results of the last comparison are kept; earlier values are discarded. It is therefore the responsibility of the user of this class to obtain the required values by calling getDissimilarityCoeff() or getDissimilarityCoeffs() after the comparison is performed.
The method can be used for filtering purposes, in which case its return value indicates whether the current target descriptor set is filtered out or not. Threshold values are set separately with useMetric().
- Parameters:
target
- Target descriptor set.
- Returns:
- Target passed filtering or not.
- Throws:
RuntimeException
- in case of invalid configuration
-
compare
Compares a list of target descriptor sets (read by a molecular descriptor reader) against all queries added prior to the call of this method, the same way as compareQueries( MolecularDescriptor target ) but for each target.
Processing the results is the responsibility of the class implementing the MDSimilarityResultWriter interface.
Before the processing of targets starts, the open() procedure of MDSimilarityResultWriter is executed; after each target is processed, the write() procedure is invoked; and after the processing has ended, the close() procedure is invoked.
- Parameters:
targetReader
- Reader of target descriptor sets.
- Returns:
- Number of targets that passed filtering.
- Throws:
MDReaderException
- when failed reading the next descriptor set
RuntimeException
- in case of invalid configuration
- Since:
- JChem 2.2
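The open / write / close sequence described above can be sketched without the JChem classes. Everything in this block is hypothetical: the nested Writer interface only mirrors the three procedure names (the real MDSimilarityResultWriter signatures are not given here), targets are reduced to one precomputed dissimilarity value each, and the strict threshold comparison is an assumption.

```java
// Standalone illustration only; not part of chemaxon.descriptors.
class CompareLoopSketch {

    // Hypothetical stand-in for a result writer.
    interface Writer {
        void open();
        void write(float dissimilarity, boolean passed);
        void close();
    }

    // Helper that records the order in which the callbacks arrive.
    static class RecordingWriter implements Writer {
        private final String name;
        private final java.util.List<String> log;

        RecordingWriter(String name, java.util.List<String> log) {
            this.name = name;
            this.log = log;
        }

        public void open() { log.add(name + ".open"); }
        public void write(float dissimilarity, boolean passed) { log.add(name + ".write"); }
        public void close() { log.add(name + ".close"); }
    }

    // Opens every writer, notifies all writers (in the order added) after
    // each target, closes every writer, and returns the number of targets
    // that passed filtering.
    static int compareAll(float[] targetDissimilarities, float threshold,
                          java.util.List<? extends Writer> writers) {
        for (Writer w : writers) {
            w.open();
        }
        int passedCount = 0;
        for (float d : targetDissimilarities) {
            boolean passed = d < threshold; // assumption: strict comparison
            if (passed) {
                passedCount++;
            }
            for (Writer w : writers) {
                w.write(d, passed);
            }
        }
        for (Writer w : writers) {
            w.close();
        }
        return passedCount;
    }
}
```

Passing two RecordingWriter instances that share one log list makes the call order visible: both writers are opened first, each is written to once per target, and both are closed at the end.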
-
getDissimilarityCoeff
public float getDissimilarityCoeff(int queryIndex, int mdIndex, int metricIndex)
Retrieves query dissimilarity coefficients (one at a time) of the last compareQueries() or compare() method called.
- Parameters:
queryIndex
- Index of the query molecule. Query molecules are numbered from 0 to getNrOfQueries() - 1, in the same order as added with addQuery().
mdIndex
- Index of molecular descriptor component in the set.
metricIndex
- Index of the metric.
- Returns:
- Value of the dissimilarity coefficient.
-
getDissimilarityCoeffs
public float[] getDissimilarityCoeffs(int queryIndex, int mdIndex)
Retrieves query dissimilarity coefficients with all metrics and one descriptor of the last compareQueries() or compare() method called.
- Parameters:
queryIndex
- Index of the query molecule. Query molecules are numbered from 0 to getNrOfQueries() - 1, in the same order as added with addQuery().
mdIndex
- Index of molecular descriptor component in the set.
- Returns:
- Array of dissimilarity coefficients, one for each metric.
-
getDissimilarityCoeffs
public float[][] getDissimilarityCoeffs(int queryIndex)
Retrieves query dissimilarity coefficients with all metrics and one query of the last compareQueries() or compare() method called.
- Parameters:
queryIndex
- Index of the query molecule. Query molecules are numbered from 0 to getNrOfQueries() - 1, in the same order as added with addQuery().
- Returns:
- Array of dissimilarity coefficients with each descriptor and metric.
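Because only the results of the last comparison are kept, values that are needed later have to be read before the next target is compared. The standalone sketch below illustrates that behaviour and how the three getters could relate to one another. It is not the JChem implementation: the class LastResultSketch, the store() method, and the [query][descriptor][metric] index order are assumptions made for illustration.

```java
// Standalone illustration only; not part of chemaxon.descriptors.
class LastResultSketch {

    // Assumed layout: last[queryIndex][mdIndex][metricIndex]
    private float[][][] last = new float[0][][];

    // Hypothetical stand-in for what a comparison does with its results:
    // the new coefficients replace the previous ones entirely.
    void store(float[][][] coeffs) {
        last = coeffs;
    }

    // One coefficient at a time.
    float coeff(int queryIndex, int mdIndex, int metricIndex) {
        return last[queryIndex][mdIndex][metricIndex];
    }

    // All metrics of one descriptor for one query.
    float[] coeffs(int queryIndex, int mdIndex) {
        return last[queryIndex][mdIndex];
    }

    // All descriptors and metrics for one query.
    float[][] coeffs(int queryIndex) {
        return last[queryIndex];
    }
}
```

After two consecutive store() calls only the second set of coefficients can be retrieved, which is why a caller would copy or write out the values between comparisons.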
-
isLicensed
public boolean isLicensed()
- Specified by:
isLicensed
in interface chemaxon.license.Licensable
-
setLicenseEnvironment
- Specified by:
setLicenseEnvironment
in interface chemaxon.license.Licensable
-