@PublicAPI public class MDHitEvaluator extends java.lang.Object
Retrieves statistical information from a test screen on a set of molecules. Statistical information supplied:
Basic input:
There are two possible modes of use. The first is intended for a smaller number of molecules and offers fast retrieval of statistical information in several ways: all dissimilarity values are calculated in advance and stored to enable fast queries.
The second uses the 'memory-safe' methods: dissimilarities are calculated on the fly each time a query function is called, and they are not stored in memory.
Typical usage, not memory-safe mode (dissimilarities are precalculated and stored):
    MDHitEvaluator evaluator = new MDHitEvaluator( similarity );
    evaluator.setSelectivityAsymmetryFactor( 0.3f );
    int functionIndex = evaluator.getEvaluatorFunctionIndex( "SelectivityEffectiveness" );
    evaluator.setCurrentEvaluatorFunction( functionIndex );
    // Precalculate and store all dissimilarity values once
    evaluator.calcDissimilarity( testReader, targetReader );
    int nSimilars = evaluator.getNumberOfSimilars();
    // Parenthesize before casting: (int) 0.3 * nSimilars would always be 0
    float E1 = evaluator.evaluateByMetric( descrIndex, metrIndex, (int) ( 0.3 * nSimilars ), (int) ( 0.8 * nSimilars ) );
    float E2 = evaluator.evaluateByMetric( descrIndex, metrIndex, (int) ( 0.5 * nSimilars ), nSimilars );
Memory-safe mode (dissimilarities are recalculated on every call):
    MDHitEvaluator evaluator = new MDHitEvaluator( similarity );
    evaluator.setSelectivityAsymmetryFactor( 0.3f );
    int functionIndex = evaluator.getEvaluatorFunctionIndex( "SelectivityEffectiveness" );
    evaluator.setCurrentEvaluatorFunction( functionIndex );
    // The readers are passed in, so nothing is stored between calls
    float E = evaluator.evaluateByMetric( descrIndex, metrIndex, 50.0F, testReader, targetReader );
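The trade-off between the two modes can be illustrated with a small, library-independent sketch. Everything in it is hypothetical: the class `DissimilarityModesSketch`, its two helper methods, the use of a plain `float[]` and a `Supplier` of iterators in place of `MDReader`, and the assumption that a hit is a dissimilarity value less than or equal to the threshold. It only mimics the idea of "store once and query many times" versus "recompute on every query"; it is not how `MDHitEvaluator` is implemented.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

final class DissimilarityModesSketch {
    // Stored mode: values are kept in memory so repeated queries are cheap.
    private final float[] stored;

    DissimilarityModesSketch(float[] precalculated) {
        this.stored = precalculated.clone();
    }

    /** Counts hits using the values held in memory. */
    int countHitsStored(float threshold) {
        int hits = 0;
        for (float d : stored) {
            if (d <= threshold) {
                hits++;
            }
        }
        return hits;
    }

    /**
     * Memory-safe mode: the source is re-read on every call and only one
     * value is held at a time, at the cost of repeating the work per query.
     */
    static int countHitsStreaming(Supplier<Iterator<Float>> source, float threshold) {
        int hits = 0;
        Iterator<Float> it = source.get();
        while (it.hasNext()) {
            if (it.next() <= threshold) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        DissimilarityModesSketch storedMode =
                new DissimilarityModesSketch(new float[] {0.1f, 0.4f, 0.7f});
        Supplier<Iterator<Float>> source =
                () -> Arrays.asList(0.1f, 0.4f, 0.7f).iterator();
        System.out.println(storedMode.countHitsStored(0.5f));
        System.out.println(countHitsStreaming(source, 0.5f));
    }
}
```

Both calls see the same values, so they report the same count; the difference is only where the cost is paid (memory for the stored variant, repeated reading for the streaming one).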
Modifier and Type | Field and Description |
---|---|
java.lang.String[] | evaluatorFunctions |
Constructor and Description |
---|
MDHitEvaluator(MDSimilarity similarity): Creates a new instance, allocates storage. |
Modifier and Type | Method and Description |
---|---|
void | calcDissimilarity(MDReader similarSetReader, MDReader dissimilarSetReader): Precalculates dissimilarity values. |
int[] | calcMetricDistribution(int descrIndex, int metricIndex, float lowerBound, float upperBound, int nHistograms, float[] metricValues): Retrieves the distribution of the given metric from the dissimilarity values calculated by a previous call to calcDissimilarity(). |
int[] | calcMetricDistribution(int descrIndex, int metricIndex, float lowerBound, float upperBound, int nHistograms, float[] metricValues, MDReader similarSetReader, MDReader dissimilarSetReader): Retrieves the distribution of the given metric from the dissimilarity values calculated by a screen using the given two molecular descriptor readers. |
float | evaluateByMetric(int descrIndex, int metricIndex, float minPercentageOfSimilarHits): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the percentage of similar hits relative to the total number of similars must be greater than or equal to the given percentage. |
float | evaluateByMetric(int descrIndex, int metricIndex, float minPercentageOfSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the percentage of similar hits relative to the total number of similars must be greater than or equal to the given percentage. |
float | evaluateByMetric(int descrIndex, int metricIndex, int nSimilarHits): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the given number of similars must be found. |
float | evaluateByMetric(int descrIndex, int metricIndex, int fromNSimilarHits, int toNSimilarHits): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the number of similar hits must be between the given numbers. |
float | evaluateByMetric(int descrIndex, int metricIndex, int fromNSimilarHits, int toNSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the number of similar hits must be between the given numbers. |
float | evaluateByMetric(int descrIndex, int metricIndex, int nSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader): Returns the value of the current evaluator function for a screen of the similar set and the dissimilar set with the given descriptor and metric, when the given number of similars must be found. |
int | getCurrentEvaluatorFunction(): Gets the index of the current evaluator function. |
int | getEvaluatorFunctionIndex(java.lang.String name): Gets the index of the evaluator function from its name. |
java.lang.String | getEvaluatorFunctionName(int index): Gets the name of the evaluator function from its index. |
java.util.ArrayList[] | getInsertedDissimilars(): Returns lists of dissimilars which have dissimilarity values lower than the similars. |
int | getNextDissimilarHit(): Retrieves ids of target hits found in a previous screen or evaluation one by one. |
int | getNextSimilarHit(): Retrieves ids of known similar hits found in a previous screen or evaluation one by one. |
int | getNumberOfDissimilarHits(): Returns the number of hits from the set of target molecules, found in a previous evaluation or screen. |
int | getNumberOfDissimilars(): Returns the number of target molecules (read by dissimilarReader previously). |
int | getNumberOfSimilarHits(): Returns the number of hits from the known similar molecules, found in a previous evaluation or screen. |
int | getNumberOfSimilars(): Returns the number of known similar molecules (read by similarReader previously). |
float | getSelectivityAsymmetryFactor(): Returns the value of the asymmetry factor (weight) of the evaluator function selectivity effectiveness. |
float | getThreshold(int descrIndex, int metricIndex): Returns the threshold set by the last screen (given by the user as a parameter) or evaluation (set by the evaluation). |
void | resetDissimilarHits(): Resets target hits found in a previous screen or evaluation so that they can be retrieved again one by one. |
void | resetSimilarHits(): Resets known similar hits found in a previous screen or evaluation so that they can be retrieved again one by one. |
float[] | screen(int descrIndex, int metricIndex, float threshold): Screens the similar set and the dissimilar set with the given descriptor, metric and threshold. |
float[] | screen(int descrIndex, int metricIndex, float threshold, MDReader similarSetReader, MDReader dissimilarSetReader): Screens the similar set and the dissimilar set with the given descriptor, metric and threshold. |
void | setCurrentEvaluatorFunction(int index): Sets the evaluator function, the value of which is returned in each evaluate call. |
void | setSelectivityAsymmetryFactor(float alpha): Sets the asymmetry factor (weight) of the evaluator function selectivity effectiveness. |
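The hit-count variants of evaluateByMetric require that a given number of known similars be found, which implies that some threshold is derived from that number. The sketch below shows one plausible way to do this and is only an assumption: the class `ThresholdByHitsSketch`, both helper methods, and the rule "threshold is the k-th smallest dissimilarity among the similars, and a hit is a value at or below it" are all hypothetical and are not taken from the library. The evaluator function itself (for example selectivity effectiveness) is deliberately left out because its formula is not given here.

```java
import java.util.Arrays;

final class ThresholdByHitsSketch {
    /**
     * Returns the smallest threshold at which at least nSimilarHits of the
     * known similars are hits (assuming hit means value <= threshold).
     */
    static float thresholdFor(float[] similarDissimilarities, int nSimilarHits) {
        if (nSimilarHits < 1 || nSimilarHits > similarDissimilarities.length) {
            throw new IllegalArgumentException("nSimilarHits out of range: " + nSimilarHits);
        }
        float[] sorted = similarDissimilarities.clone();
        Arrays.sort(sorted);
        return sorted[nSimilarHits - 1];
    }

    /** Counts the values that are hits at the given threshold. */
    static int countHits(float[] dissimilarities, float threshold) {
        int hits = 0;
        for (float d : dissimilarities) {
            if (d <= threshold) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        float[] similars = {0.30f, 0.10f, 0.50f, 0.20f};
        float[] targets = {0.05f, 0.25f, 0.40f, 0.90f};
        float threshold = thresholdFor(similars, 3);
        System.out.println("threshold = " + threshold);
        System.out.println("similar hits = " + countHits(similars, threshold));
        System.out.println("target hits = " + countHits(targets, threshold));
    }
}
```

An evaluator function would then combine the two hit counts into a single score; with the real class, the threshold that was actually used is available through getThreshold.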
public MDHitEvaluator(MDSimilarity similarity)
Parameters:
similarity - A complete MDSimilarity object with added queries
public void setCurrentEvaluatorFunction(int index)
Parameters:
index - Index of evaluator function
public int getCurrentEvaluatorFunction()
public int getEvaluatorFunctionIndex(java.lang.String name) throws java.lang.IllegalArgumentException
Parameters:
name - Name of evaluator function
Throws:
java.lang.IllegalArgumentException
public java.lang.String getEvaluatorFunctionName(int index)
Parameters:
index - Index of evaluator function
public void setSelectivityAsymmetryFactor(float alpha) throws java.lang.IllegalArgumentException
Parameters:
alpha - Value of the asymmetry factor
Throws:
java.lang.IllegalArgumentException
public float getSelectivityAsymmetryFactor()
public void calcDissimilarity(MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
Parameters:
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
public float[] screen(int descrIndex, int metricIndex, float threshold)
To be called only if calcDissimilarity() has been called previously.
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
threshold - Threshold value for selecting hits
public float[] screen(int descrIndex, int metricIndex, float threshold, MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
threshold - Threshold value for selecting hits
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
public float evaluateByMetric(int descrIndex, int metricIndex, int nSimilarHits)
See getThreshold( descrIndex, metricIndex ). To be called only if calcDissimilarity() has been called previously.
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
nSimilarHits - Number of known similars required as hits
public float evaluateByMetric(int descrIndex, int metricIndex, int nSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
See getThreshold( descrIndex, metricIndex ).
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
nSimilarHits - Number of known similars required as hits
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
public float evaluateByMetric(int descrIndex, int metricIndex, int fromNSimilarHits, int toNSimilarHits)
See getNumberOfSimilarHits( descrIndex, metricIndex ) and getThreshold( descrIndex, metricIndex ). To be called only if calcDissimilarity() has been called previously.
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
fromNSimilarHits - Minimal number of known similars required as hits
toNSimilarHits - Maximal number of known similars required as hits
public float evaluateByMetric(int descrIndex, int metricIndex, float minPercentageOfSimilarHits)
See getNumberOfSimilarHits( descrIndex, metricIndex ) and getThreshold( descrIndex, metricIndex ). To be called only if calcDissimilarity() has been called previously.
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
minPercentageOfSimilarHits - Minimal percentage of known similars required as hits compared to total number of similars
public float evaluateByMetric(int descrIndex, int metricIndex, int fromNSimilarHits, int toNSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
See getNumberOfSimilarHits( descrIndex, metricIndex ) and getThreshold( descrIndex, metricIndex ).
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
fromNSimilarHits - Minimal number of known similars required as hits
toNSimilarHits - Maximal number of known similars required as hits
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
public float evaluateByMetric(int descrIndex, int metricIndex, float minPercentageOfSimilarHits, MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
See getNumberOfSimilarHits( descrIndex, metricIndex ) and getThreshold( descrIndex, metricIndex ).
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
minPercentageOfSimilarHits - Minimal percentage of known similars required as hits compared to total number of similars
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
public int getNumberOfSimilars()
public int getNumberOfDissimilars()
public int getNumberOfSimilarHits()
public int getNumberOfDissimilarHits()
public void resetSimilarHits()
public void resetDissimilarHits()
public int getNextSimilarHit()
public int getNextDissimilarHit()
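The reset and get-next methods above suggest a cursor-style retrieval of hit ids. The sketch below shows one way such a loop could be written, bounded by getNumberOfSimilarHits() so that it makes no assumption about what getNextSimilarHit() returns once the hits are exhausted. `HitSource` is a hypothetical stand-in interface that copies only the three method names used here, and `fakeSource` is an invented in-memory implementation so the example runs without the library; neither is part of the API.

```java
import java.util.ArrayList;
import java.util.List;

final class HitIterationSketch {
    /** Hypothetical stand-in exposing only the three calls used below. */
    interface HitSource {
        void resetSimilarHits();
        int getNumberOfSimilarHits();
        int getNextSimilarHit();
    }

    /** Rewinds the cursor, then reads exactly as many ids as there are hits. */
    static List<Integer> collectSimilarHits(HitSource source) {
        source.resetSimilarHits();
        int n = source.getNumberOfSimilarHits();
        List<Integer> ids = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            ids.add(source.getNextSimilarHit());
        }
        return ids;
    }

    /** Invented in-memory source, for demonstration only. */
    static HitSource fakeSource(int... ids) {
        final int[] copy = ids.clone();
        return new HitSource() {
            private int cursor = 0;

            @Override
            public void resetSimilarHits() {
                cursor = 0;
            }

            @Override
            public int getNumberOfSimilarHits() {
                return copy.length;
            }

            @Override
            public int getNextSimilarHit() {
                return copy[cursor++];
            }
        };
    }

    public static void main(String[] args) {
        HitSource source = fakeSource(4, 7, 9);
        System.out.println(collectSimilarHits(source));
        // Resetting first makes a second pass return the same ids.
        System.out.println(collectSimilarHits(source));
    }
}
```

The same pattern would apply to the dissimilar (target) hits with resetDissimilarHits(), getNumberOfDissimilarHits() and getNextDissimilarHit().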
public float getThreshold(int descrIndex, int metricIndex)
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
public java.util.ArrayList[] getInsertedDissimilars()
public int[] calcMetricDistribution(int descrIndex, int metricIndex, float lowerBound, float upperBound, int nHistograms, float[] metricValues)
Retrieves the distribution of the given metric from the dissimilarity values calculated by a previous call to calcDissimilarity().
The distribution is returned as the number of dissimilarities falling into each of the (nHistograms - 2) equal-size intervals between lowerBound and upperBound, plus two extra intervals: one for values lower than the given lower bound and one for values greater than the given upper bound.
The i-th interval is defined as: [ metricValues[ i ], metricValues[ i + 1 ] ].
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
lowerBound - Lower bound for dissimilarity distribution
upperBound - Upper bound for dissimilarity distribution
nHistograms - Refinement of distribution: number of histograms (including the two extra histograms)
metricValues - Outgoing parameter! Must be allocated previously with length (nHistograms + 1), contains endpoints of the dissimilarity value intervals
public int[] calcMetricDistribution(int descrIndex, int metricIndex, float lowerBound, float upperBound, int nHistograms, float[] metricValues, MDReader similarSetReader, MDReader dissimilarSetReader) throws MDReaderException
The i-th interval is defined as: [ metricValues[ i ], metricValues[ i + 1 ] ].
Parameters:
descrIndex - Index of molecular descriptor
metricIndex - Index of metric (of the given molecular descriptor)
lowerBound - Lower bound for dissimilarity distribution
upperBound - Upper bound for dissimilarity distribution
nHistograms - Refinement of distribution: number of histograms (including the two extra histograms)
metricValues - Outgoing parameter! Must be allocated previously with length (nHistograms + 1), contains endpoints of the dissimilarity value intervals
similarSetReader - Reader of the test set of known similars
dissimilarSetReader - Reader of the set of target molecules (where similars are sought)
Throws:
MDReaderException - if the record couldn't be read.
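The binning scheme described for calcMetricDistribution (nHistograms - 2 equal-size intervals between the bounds, two extra intervals outside them, and an endpoint array of length nHistograms + 1) can be sketched independently of the library as follows. This is an illustration under stated assumptions, not the library's implementation: the class `MetricDistributionSketch` and its `distribution` method are hypothetical, the outermost endpoints are assumed to be negative and positive infinity, and values exactly equal to a bound are assumed to fall into the inner intervals.

```java
final class MetricDistributionSketch {
    /**
     * Counts values per interval and fills metricValues (length nHistograms + 1)
     * with the interval endpoints. Bin 0 holds values below lowerBound and
     * bin nHistograms - 1 holds values above upperBound.
     */
    static int[] distribution(float[] values, float lowerBound, float upperBound,
                              int nHistograms, float[] metricValues) {
        if (nHistograms < 3) {
            throw new IllegalArgumentException("nHistograms must be at least 3");
        }
        if (upperBound <= lowerBound) {
            throw new IllegalArgumentException("upperBound must exceed lowerBound");
        }
        if (metricValues.length != nHistograms + 1) {
            throw new IllegalArgumentException("metricValues must have length nHistograms + 1");
        }
        int inner = nHistograms - 2;
        float width = (upperBound - lowerBound) / inner;

        // Endpoints: assumed open-ended outer intervals, evenly spaced inner ones.
        metricValues[0] = Float.NEGATIVE_INFINITY;
        for (int i = 0; i <= inner; i++) {
            metricValues[i + 1] = lowerBound + i * width;
        }
        metricValues[nHistograms - 1] = upperBound; // avoid rounding drift
        metricValues[nHistograms] = Float.POSITIVE_INFINITY;

        int[] counts = new int[nHistograms];
        for (float v : values) {
            int bin;
            if (v < lowerBound) {
                bin = 0;
            } else if (v > upperBound) {
                bin = nHistograms - 1;
            } else {
                // Clamp so that v == upperBound lands in the last inner bin.
                bin = 1 + Math.min(inner - 1, (int) ((v - lowerBound) / width));
            }
            counts[bin]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        float[] endpoints = new float[6];
        float[] values = {0.1f, 0.3f, 0.5f, 0.7f, 0.75f, 0.9f};
        int[] counts = distribution(values, 0.2f, 0.8f, 5, endpoints);
        System.out.println(java.util.Arrays.toString(counts));
        System.out.println(java.util.Arrays.toString(endpoints));
    }
}
```

Note that the caller allocates the endpoint array before the call, mirroring the "outgoing parameter" convention of metricValues described above.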