@PublicAPI public class CFGenerator extends MDGenerator
The CFGenerator class generates topological fingerprints of molecular graphs.
Basic concepts
A binary string (a series of 0s and 1s) is constructed from the local
connectivity of atoms and from atom types. The length of the string is a
predefined constant parameter. Two further parameters influence the
fingerprints generated: the number of bonds, which determines the size of the
local neighborhood of atoms taken into account by providing an upper limit for
the length of paths starting from each atom; and the number of bits to be set
in the fingerprint for each property identified.
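The three parameters above can be illustrated with a small, self-contained sketch. It is not the CFGenerator algorithm: the class name ToyPathFingerprint, the string encoding of paths, and the hash-seeded choice of bit positions are all assumptions made for illustration, and the molecule is a plain adjacency list rather than a Molecule object.

```java
import java.util.BitSet;
import java.util.Random;

/** Toy path-based hashed fingerprint; NOT the actual CFGenerator algorithm. */
class ToyPathFingerprint {
    private final int length;         // fingerprint length in bits
    private final int maxBonds;       // upper limit for path length, in bonds
    private final int bitsPerPattern; // bit positions drawn for each pattern

    ToyPathFingerprint(int length, int maxBonds, int bitsPerPattern) {
        this.length = length;
        this.maxBonds = maxBonds;
        this.bitsPerPattern = bitsPerPattern;
    }

    /** atoms[i] is the type of atom i; adj[i] lists the neighbours of atom i. */
    BitSet generate(String[] atoms, int[][] adj) {
        BitSet fp = new BitSet(length);
        boolean[] onPath = new boolean[atoms.length];
        for (int start = 0; start < atoms.length; start++) {
            onPath[start] = true;
            extend(atoms, adj, start, atoms[start], 0, onPath, fp);
            onPath[start] = false;
        }
        return fp;
    }

    // Depth-first enumeration of simple paths of at most maxBonds bonds.
    // Each path is visited from both ends; a real implementation would
    // probably canonicalize the direction (assumption).
    private void extend(String[] atoms, int[][] adj, int last, String pattern,
                        int bonds, boolean[] onPath, BitSet fp) {
        setBits(pattern, fp);
        if (bonds == maxBonds) {
            return;
        }
        for (int next : adj[last]) {
            if (!onPath[next]) {
                onPath[next] = true;
                extend(atoms, adj, next, pattern + "-" + atoms[next],
                       bonds + 1, onPath, fp);
                onPath[next] = false;
            }
        }
    }

    // Seed a pseudo-random generator with the pattern's hash so that the same
    // pattern always maps to the same positions. Draws may coincide, so at
    // most bitsPerPattern bits are set per pattern.
    private void setBits(String pattern, BitSet fp) {
        Random rnd = new Random(pattern.hashCode());
        for (int i = 0; i < bitsPerPattern; i++) {
            fp.set(rnd.nextInt(length));
        }
    }

    public static void main(String[] args) {
        // Heavy atoms of ethanol: C-C-O
        String[] atoms = {"C", "C", "O"};
        int[][] adj = {{1}, {0, 2}, {1}};
        BitSet fp = new ToyPathFingerprint(64, 2, 2).generate(atoms, adj);
        System.out.println(fp);
    }
}
```

Raising maxBonds adds longer patterns and can only add bits, while raising bitsPerPattern makes each pattern more distinctive at the cost of a denser fingerprint.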
Typical usage
To keep memory usage low, one instance of this class can generate
fingerprints for a series of molecular graphs through consecutive calls to
the generate() method.
In most cases the generator is not intended to be used directly. When molecules
are taken from files or databases, the corresponding MolecularDescriptor objects
can be generated by the appropriate MDReader object.
Alternatively, MolecularDescriptor.generate( final Molecule )
is the simplest way to obtain a descriptor corresponding to a molecular
structure.
CFGenerator gen = new CFGenerator();
ChemicalFingerprint fp = new ChemicalFingerprint( new CFParameters() );
Molecule mol = getFirstMoleculeFromSomewhere();
while ( mol != null ) {
    gen.generate( mol, fp );
    doSomethingWith( fp );
    mol = getNextMoleculeFromSomewhere();
}
createStatistics, density, freqCount, maxNonEmptyId, maxNonEmptyPercent, minNonEmptyId, minNonEmptyPercent, molCount, sumNonEmptyPercent
Constructor and Description |
---|
CFGenerator()
Creates a new instance of
CFGenerator which can be used to
generate chemical fingerprints for an arbitrary number of molecules. |
CFGenerator(int length)
Deprecated.
since 5.4
|
Modifier and Type | Method and Description |
---|---|
protected int |
calcFreqCount(MolecularDescriptor d)
Updates statistics gathered on the generated fingerprints and gets the
number of non-zero cells.
|
java.lang.String[] |
generate(Molecule m,
int[] aidxs,
MolecularDescriptor d)
Generates the partial chemical fingerprint for the given molecule.
|
java.lang.String[] |
generate(Molecule m,
MolecularDescriptor d)
Generates the chemical fingerprint for the given molecule.
|
getAverageNonZeroRatio, getBrightestMolId, getDarkestMolId, getDensityCounts, getFrequencyCounts, getMaximumBitRatio, getMinimumBitRatio, getMoleculeCount, setCreateStatistics, updateStatistics
public CFGenerator()
Creates a new instance of CFGenerator which can be used to
generate chemical fingerprints for an arbitrary number of molecules.
@Deprecated public CFGenerator(int length)
Creates a new instance of CFGenerator which can be used to
generate chemical fingerprints for an arbitrary number of molecules.
length - length of the chemical fingerprint in bits
public java.lang.String[] generate(Molecule m, MolecularDescriptor d) throws MDGeneratorException
Generates the chemical fingerprint for the given molecule. The
ChemicalFingerprint object is not allocated; the MolecularDescriptor
provided as a method parameter is updated (it has to be allocated and
initialized by the client of this class).
generate in class MDGenerator
m - molecule for which the fingerprint is created
d - the chemical fingerprint generated
ChemicalFingerprint
MDGeneratorException - in the case of any failure to generate the descriptor
public java.lang.String[] generate(Molecule m, int[] aidxs, MolecularDescriptor d) throws MDGeneratorException
Generates the partial chemical fingerprint for the given molecule. The
ChemicalFingerprint object is not allocated; the MolecularDescriptor
provided as a method parameter is updated (it has to be allocated and
initialized by the client of this class).
A partial fingerprint is a fingerprint for paths containing the given atoms. The algorithm performs the full path enumeration over the molecule, but sets bits in the resulting fingerprint only for paths containing the given atoms.
m - molecule for which the fingerprint is created
aidxs - atom indexes that define the partial fingerprint generation
d - the chemical fingerprint generated
ChemicalFingerprint
MDGeneratorException
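The partial fingerprint idea can be sketched without the library. The following is a toy model, not the CFGenerator implementation: the class ToyPartialFingerprint, the adjacency-list molecule, and the single hashed bit per path are illustrative assumptions. It also assumes that a path qualifies when it contains at least one of the given atoms, which the description above does not state explicitly.

```java
import java.util.BitSet;

/** Toy partial fingerprint; NOT the actual CFGenerator algorithm. */
class ToyPartialFingerprint {
    private final String[] atoms; // atom types
    private final int[][] adj;    // adj[i] lists the neighbours of atom i
    private final int length;     // fingerprint length in bits
    private final int maxBonds;   // upper limit for path length, in bonds
    private boolean[] selected;   // atoms named in aidxs
    private boolean[] onPath;     // atoms on the path being enumerated
    private BitSet fp;

    ToyPartialFingerprint(String[] atoms, int[][] adj, int length, int maxBonds) {
        this.atoms = atoms;
        this.adj = adj;
        this.length = length;
        this.maxBonds = maxBonds;
    }

    /** Enumerates every path, but records only those touching aidxs. */
    BitSet generate(int[] aidxs) {
        selected = new boolean[atoms.length];
        for (int a : aidxs) {
            selected[a] = true;
        }
        onPath = new boolean[atoms.length];
        fp = new BitSet(length);
        for (int start = 0; start < atoms.length; start++) {
            onPath[start] = true;
            walk(start, atoms[start], 0, selected[start]);
            onPath[start] = false;
        }
        return fp;
    }

    // 'hit' is true once the current path contains a selected atom
    // (assumption: one selected atom is enough for the path to count).
    private void walk(int last, String pattern, int bonds, boolean hit) {
        if (hit) {
            fp.set(Math.floorMod(pattern.hashCode(), length));
        }
        if (bonds == maxBonds) {
            return;
        }
        for (int next : adj[last]) {
            if (!onPath[next]) {
                onPath[next] = true;
                walk(next, pattern + "-" + atoms[next], bonds + 1,
                     hit || selected[next]);
                onPath[next] = false;
            }
        }
    }

    public static void main(String[] args) {
        // Heavy atoms of ethanol: C-C-O; restrict to paths through the oxygen.
        String[] atoms = {"C", "C", "O"};
        int[][] adj = {{1}, {0, 2}, {1}};
        ToyPartialFingerprint gen = new ToyPartialFingerprint(atoms, adj, 64, 2);
        System.out.println(gen.generate(new int[] {2}));
    }
}
```

Because the enumeration is unchanged and only the bit-setting step is filtered, a partial fingerprint in this model is always a subset of the full one, and selecting every atom reproduces the full fingerprint.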
protected int calcFreqCount(MolecularDescriptor d)
Updates statistics gathered on the generated fingerprints and gets the number of non-zero cells.
calcFreqCount in class MDGenerator
d - newly generated MolecularDescriptor
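As a rough illustration of what counting non-zero cells while updating statistics can look like, here is a minimal sketch. It does not reproduce the bookkeeping in MDGenerator, which is not shown on this page: the class ToyFingerprintStats and its fields are hypothetical, and the fingerprint is modelled as a java.util.BitSet rather than a MolecularDescriptor.

```java
import java.util.BitSet;

/** Toy running statistics over fingerprints; hypothetical, not MDGenerator. */
class ToyFingerprintStats {
    private final int length;      // fingerprint length in bits
    private int molCount;          // fingerprints seen so far
    private double minPercent = Double.POSITIVE_INFINITY;
    private double maxPercent = Double.NEGATIVE_INFINITY;
    private double sumPercent;

    ToyFingerprintStats(int length) {
        this.length = length;
    }

    /** Records one fingerprint and returns its number of non-zero bits. */
    int update(BitSet fp) {
        int nonZero = fp.cardinality();
        double percent = 100.0 * nonZero / length;
        molCount++;
        sumPercent += percent;
        minPercent = Math.min(minPercent, percent);
        maxPercent = Math.max(maxPercent, percent);
        return nonZero;
    }

    int getMoleculeCount() { return molCount; }
    double getMinPercent() { return minPercent; }
    double getMaxPercent() { return maxPercent; }

    /** Average percentage of non-zero bits; 0 when nothing was recorded. */
    double getAveragePercent() {
        return molCount == 0 ? 0.0 : sumPercent / molCount;
    }

    public static void main(String[] args) {
        ToyFingerprintStats stats = new ToyFingerprintStats(8);
        BitSet a = new BitSet(8);
        a.set(0, 2); // bits 0 and 1
        BitSet b = new BitSet(8);
        b.set(0, 4); // bits 0 to 3
        stats.update(a);
        stats.update(b);
        System.out.println(stats.getAveragePercent());
    }
}
```

Keeping only running minimum, maximum, and sum values means the statistics cost constant memory no matter how many fingerprints one generator instance processes.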