Package chemaxon.descriptors
Class MDGenerator
- java.lang.Object
-
- chemaxon.descriptors.MDGenerator
-
- Direct Known Subclasses:
BCUTGenerator, CFGenerator, ECFPGenerator, PFGenerator, RFGenerator, ShapeGenerator
@PublicAPI public abstract class MDGenerator extends Object
Base class for all kinds of MolecularDescriptor generators. Its purpose is two-fold: (1) it defines the interface for all generator classes (that is, which methods must be implemented), and (2) it implements functions that gather statistical data on the descriptors generated, together with retrieval functions for these statistics.
- Since:
- JChem 2.1
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected boolean createStatistics
indicates if statistical data has to be gathered during generation
protected int[] density
protected int[] freqCount
protected int maxNonEmptyId
protected float maxNonEmptyPercent
protected int minNonEmptyId
protected float minNonEmptyPercent
protected int molCount
variables to collect statistical data in
protected float sumNonEmptyPercent
-
Constructor Summary
Constructors Constructor Description MDGenerator()
Creates an object.
-
Method Summary
Methods (Modifier and Type, Method, Description)
protected int calcFreqCount(MolecularDescriptor d)
Calculates the absolute frequency count per cell and stores it in freqCount[].
abstract String[] generate(Molecule m, MolecularDescriptor d)
Generates the molecular descriptor for the given molecule.
float getAverageNonZeroRatio()
Gets the average percentage of cells that have a non-zero value, taking into account all descriptors generated since the initialization of the generator.
int getBrightestMolId()
Gets the id of the molecule that had the maximum number of non-zero cells among all descriptors generated since the initialization of the generator object.
int getDarkestMolId()
Gets the id of the molecule that had the minimum number of non-zero cells among all descriptors generated since the initialization of the generator object.
int[] getDensityCounts()
Gets the array of bit densities.
int[] getFrequencyCounts()
Gets the absolute frequency count array for all descriptors generated.
float getMaximumBitRatio()
Gets the maximum percentage of non-zero cells in the descriptors generated.
float getMinimumBitRatio()
Gets the minimum percentage of non-zero cells in the descriptors generated.
int getMoleculeCount()
Gets the number of molecules processed (that is, the number of descriptors generated) since the initialization of the object.
void setCreateStatistics(boolean createStatistics)
Sets the create statistics flag.
protected void updateStatistics(MolecularDescriptor d)
Updates the statistics gathered on the fingerprints generated.
-
-
-
Field Detail
-
createStatistics
protected boolean createStatistics
indicates if statistical data has to be gathered during generation
-
molCount
protected int molCount
variables to collect statistical data in
-
minNonEmptyPercent
protected float minNonEmptyPercent
-
minNonEmptyId
protected int minNonEmptyId
-
maxNonEmptyPercent
protected float maxNonEmptyPercent
-
maxNonEmptyId
protected int maxNonEmptyId
-
sumNonEmptyPercent
protected float sumNonEmptyPercent
-
freqCount
protected int[] freqCount
-
density
protected int[] density
-
-
Method Detail
-
generate
public abstract String[] generate(Molecule m, MolecularDescriptor d) throws MDGeneratorException
Generates the molecular descriptor for the given molecule. The MolecularDescriptor provided is updated, so it must be allocated and initialized by the client of this class.
- Parameters:
m
- molecule for which the descriptor is created
d
- the generated descriptor
- Returns:
- names of tags (properties) added
- Throws:
MDGeneratorException
- if the descriptor cannot be generated for any reason
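The contract described for generate (the caller allocates the descriptor, the generator fills it in and reports the tags it added) can be illustrated with stand-in types. This is a minimal sketch and not JChem code: SketchDescriptor, SketchGeneratorException, SketchGenerator and LetterCountGenerator are hypothetical names, and a plain String stands in for Molecule.

```java
// Hypothetical stand-in for MolecularDescriptor: a fixed-length array of cells.
class SketchDescriptor {
    final int[] cells;

    SketchDescriptor(int length) {
        cells = new int[length];
    }
}

// Hypothetical stand-in for MDGeneratorException.
class SketchGeneratorException extends Exception {
    SketchGeneratorException(String message) {
        super(message);
    }
}

// Mirrors the shape of the abstract generate method: fill d, return tag names.
abstract class SketchGenerator {
    abstract String[] generate(String molecule, SketchDescriptor d)
            throws SketchGeneratorException;
}

// Toy generator (an assumption for illustration): counts each letter of the
// input string into a cell chosen by the character code modulo the length.
class LetterCountGenerator extends SketchGenerator {
    @Override
    String[] generate(String molecule, SketchDescriptor d)
            throws SketchGeneratorException {
        if (molecule == null || molecule.isEmpty()) {
            throw new SketchGeneratorException("empty molecule");
        }
        // The descriptor is owned by the caller, so clear it before refilling.
        java.util.Arrays.fill(d.cells, 0);
        for (char c : molecule.toCharArray()) {
            if (Character.isLetter(c)) {
                d.cells[c % d.cells.length]++;
            }
        }
        // This toy generator adds no tags, so it returns an empty array.
        return new String[0];
    }
}
```

A caller would create one SketchDescriptor, pass it to generate for each input, and read the cells afterwards; reusing a single descriptor is only safe here because the toy generator clears it first.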
-
setCreateStatistics
public void setCreateStatistics(boolean createStatistics)
Sets the create statistics flag.
- Parameters:
createStatistics
- new value for the create statistics flag
- Since:
- JChem 2.1
-
updateStatistics
protected void updateStatistics(MolecularDescriptor d)
Updates the statistics gathered on the fingerprints generated.
- Parameters:
d
- newly generated MolecularDescriptor
- Since:
- JChem 2.1
-
calcFreqCount
protected int calcFreqCount(MolecularDescriptor d)
Calculates the absolute frequency count per cell and stores it in freqCount[]. Also returns the number of non-zero cells in the descriptor.
- Parameters:
d
- descriptor in which non-zero cells should be counted
- Returns:
- number of non-zero cells
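The per-cell counting described for calcFreqCount can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: FreqCountSketch is a hypothetical class, and a descriptor is modelled as a plain int[] of cells rather than a MolecularDescriptor.

```java
// Hypothetical sketch: freqCount[i] is the number of descriptors seen so far
// in which cell i was non-zero.
class FreqCountSketch {
    final int[] freqCount;

    FreqCountSketch(int descriptorLength) {
        freqCount = new int[descriptorLength];
    }

    // Updates freqCount for one descriptor and returns the number of
    // non-zero cells in that descriptor.
    int calcFreqCount(int[] cells) {
        if (cells.length != freqCount.length) {
            throw new IllegalArgumentException("descriptor length mismatch");
        }
        int nonZero = 0;
        for (int i = 0; i < cells.length; i++) {
            if (cells[i] != 0) {
                freqCount[i]++;
                nonZero++;
            }
        }
        return nonZero;
    }
}
```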
-
getMoleculeCount
public int getMoleculeCount()
Gets the number of molecules processed (that is, the number of descriptors generated) since the initialization of the object.
- Returns:
- number of molecules processed
-
getAverageNonZeroRatio
public float getAverageNonZeroRatio()
Gets the average percentage of cells that have a non-zero value, taking into account all descriptors generated since the initialization of the generator.
- Returns:
- relative number of bits set in descriptors
-
getMaximumBitRatio
public float getMaximumBitRatio()
Gets the maximum percentage of non-zero cells in the descriptors generated.
- Returns:
- maximum bits set, relative to descriptor length
-
getBrightestMolId
public int getBrightestMolId()
Gets the id of the molecule that had the maximum number of non-zero cells among all descriptors generated since the initialization of the generator object.
- Returns:
- unique molecule identifier (a consecutive index from zero)
-
getMinimumBitRatio
public float getMinimumBitRatio()
Gets the minimum percentage of non-zero cells in the descriptors generated.
- Returns:
- minimum bits set, relative to descriptor length
-
getDarkestMolId
public int getDarkestMolId()
Gets the id of the molecule that had the minimum number of non-zero cells among all descriptors generated since the initialization of the generator object.
- Returns:
- unique molecule identifier (a consecutive index from zero)
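The minimum, maximum and average ratios, together with the ids of the brightest and darkest molecules, could be tracked along the following lines. This is a sketch under assumptions: RatioStatsSketch is a hypothetical class, its field names are borrowed from the field list of this page only for readability, ids are assigned as consecutive indices from zero, and on ties the earliest molecule is kept. How the library resolves ties is not stated in this page.

```java
// Hypothetical sketch of running min/max/average bookkeeping.
class RatioStatsSketch {
    int molCount = 0;
    float minNonEmptyPercent = Float.MAX_VALUE;
    int minNonEmptyId = -1;
    float maxNonEmptyPercent = -1f;
    int maxNonEmptyId = -1;
    float sumNonEmptyPercent = 0f;

    // Records one descriptor, given its non-zero cell count and total length.
    void update(int nonZeroCells, int descriptorLength) {
        if (descriptorLength <= 0) {
            throw new IllegalArgumentException("descriptor length must be positive");
        }
        float percent = 100f * nonZeroCells / descriptorLength;
        int id = molCount; // consecutive index from zero
        if (percent < minNonEmptyPercent) {
            minNonEmptyPercent = percent;
            minNonEmptyId = id; // darkest molecule so far
        }
        if (percent > maxNonEmptyPercent) {
            maxNonEmptyPercent = percent;
            maxNonEmptyId = id; // brightest molecule so far
        }
        sumNonEmptyPercent += percent;
        molCount++;
    }

    // Average percentage of non-zero cells; 0 when nothing was recorded.
    float average() {
        return molCount == 0 ? 0f : sumNonEmptyPercent / molCount;
    }
}
```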
-
getDensityCounts
public int[] getDensityCounts()
Gets the array of bit densities. The array can be indexed from 0 to 10. Element i holds the number of descriptors in which the percentage of non-zero cells is between 10 * i and 10 * i + 10.
- Returns:
- array of density counts
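The density array behaves like a histogram with 11 bins of width 10 percent. The sketch below shows one way to fill such a histogram. DensitySketch is a hypothetical class, and the boundary handling is an assumption: each bin is treated as half-open, so a value of exactly 10 * i falls into bin i and a value of exactly 100 falls into bin 10. The text above does not say which bin a boundary value belongs to.

```java
// Hypothetical sketch of an 11-bin density histogram over percentages.
class DensitySketch {
    final int[] density = new int[11];

    // Records the percentage of non-zero cells of one descriptor.
    void record(float percent) {
        // The negated form also rejects NaN.
        if (!(percent >= 0f && percent <= 100f)) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        // Assumed half-open bins [10 * i, 10 * i + 10); exactly 100 maps to 10.
        int i = (int) (percent / 10f);
        density[i]++;
    }
}
```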
-
getFrequencyCounts
public int[] getFrequencyCounts()
Gets the absolute frequency count array for all descriptors generated. Each element of the array stores the number of descriptors in which the corresponding cell had a non-zero value.
- Returns:
- per-cell frequency count array
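Because the counts are absolute, a client may want to divide them by the number of molecules processed to obtain, for each cell, the fraction of descriptors in which it was non-zero. The helper below is a hypothetical sketch of that calculation. In real use its two arguments would presumably come from getFrequencyCounts() and getMoleculeCount(); FrequencySketch and relativeFrequencies are not part of the documented API.

```java
// Hypothetical helper that normalizes absolute per-cell counts.
class FrequencySketch {
    // Returns, per cell, the fraction of molecules in which the cell was
    // non-zero. Returns all zeros when no molecule has been processed.
    static float[] relativeFrequencies(int[] frequencyCounts, int moleculeCount) {
        float[] result = new float[frequencyCounts.length];
        if (moleculeCount <= 0) {
            return result;
        }
        for (int i = 0; i < frequencyCounts.length; i++) {
            result[i] = (float) frequencyCounts[i] / moleculeCount;
        }
        return result;
    }
}
```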
-
-