Class MDGenerator

java.lang.Object
chemaxon.descriptors.MDGenerator
Direct Known Subclasses:
BCUTGenerator, CFGenerator, ECFPGenerator, PFGenerator, RFGenerator, ShapeGenerator

@PublicApi public abstract class MDGenerator extends Object
Base class for all kinds of MolecularDescriptor generators. Its purpose is two-fold: (1) it defines the interface for all generator classes (that is, which methods must be implemented), and (2) it implements functions that gather statistical data on the descriptors generated, together with retrieval functions for these statistics.
Since:
JChem 2.1
  • Field Details

    • createStatistics

      protected boolean createStatistics
      Indicates whether statistical data has to be gathered during generation.
    • molCount

      protected int molCount
      This field and the fields that follow collect the statistical data.
    • minNonEmptyPercent

      protected float minNonEmptyPercent
    • minNonEmptyId

      protected int minNonEmptyId
    • maxNonEmptyPercent

      protected float maxNonEmptyPercent
    • maxNonEmptyId

      protected int maxNonEmptyId
    • sumNonEmptyPercent

      protected float sumNonEmptyPercent
    • freqCount

      protected int[] freqCount
    • density

      protected int[] density
  • Constructor Details

    • MDGenerator

      protected MDGenerator()
      Creates an object.
  • Method Details

    • generate

      public abstract String[] generate(Molecule m, MolecularDescriptor d) throws MDGeneratorException
      Generates the molecular descriptor for the given molecule. The MolecularDescriptor provided is updated (thus it has to be allocated and initialized by the client of this class).
      Parameters:
      m - molecule for which the descriptor is created
      d - the generated descriptor
      Returns:
      names of tags (properties) added
      Throws:
      MDGeneratorException - in the case of any failures to generate the descriptor
    • setCreateStatistics

      public void setCreateStatistics(boolean createStatistics)
      Toggles the create statistics flag.
      Parameters:
      createStatistics - new value for the create statistics flag
      Since:
      JChem 2.1
    • updateStatistics

      protected void updateStatistics(MolecularDescriptor d)
      Updates statistics gathered on fingerprints generated.
      Parameters:
      d - newly generated MolecularDescriptor
      Since:
      JChem 2.1
    • calcFreqCount

      protected int calcFreqCount(MolecularDescriptor d)
      Calculates the absolute frequency count of each cell and stores it in freqCount[]. Also returns the number of non-zero cells in the descriptor.
      Parameters:
      d - descriptor in which non-zero cells should be counted
      Returns:
      number of non-zero cells
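
The counting described for calcFreqCount can be illustrated with a minimal sketch. It is not the library's implementation: the descriptor is modelled as a plain int[] of cell values, and the class and method names (FreqCountSketch, countNonZeroCells) are hypothetical.

```java
// Hypothetical stand-in: a descriptor is modelled as a plain int[] of cell values.
final class FreqCountSketch {
    /**
     * Increments freqCount[i] for every non-zero cell i of the descriptor
     * and returns the number of non-zero cells found in it.
     * Assumes freqCount is at least as long as cells.
     */
    static int countNonZeroCells(int[] cells, int[] freqCount) {
        int nonZero = 0;
        for (int i = 0; i < cells.length; i++) {
            if (cells[i] != 0) {
                freqCount[i]++; // one more descriptor has this cell set
                nonZero++;
            }
        }
        return nonZero;
    }
}
```

Calling the method once per descriptor with the same freqCount array accumulates the per-cell counts across all descriptors.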
    • getMoleculeCount

      public int getMoleculeCount()
      Gets the number of molecules processed (that is, the number of descriptors generated) since the initialization of the object.
      Returns:
      number of molecules processed
    • getAverageNonZeroRatio

      public float getAverageNonZeroRatio()
      Gets the average percentage of cells that have a non-zero value, taking into account all descriptors generated since the initialization of the generator.
      Returns:
      relative number of bits set in descriptors
    • getMaximumBitRatio

      public float getMaximumBitRatio()
      Gets the maximum percentage of non-zero cells in descriptors generated.
      Returns:
      maximum bits set, relative to descriptor length
    • getBrightestMolId

      public int getBrightestMolId()
      Gets the id of the molecule that had the maximum number of non-zero cells among all descriptors generated since the initialization of the generator object.
      Returns:
      unique molecule identifier (a consecutive index from zero)
    • getMinimumBitRatio

      public float getMinimumBitRatio()
      Gets the minimum percentage of non-zero cells in descriptors generated.
      Returns:
      minimum bits set, relative to descriptor length
    • getDarkestMolId

      public int getDarkestMolId()
      Gets the id of the molecule that had the minimum number of non-zero cells among all descriptors generated since the initialization of the generator object.
      Returns:
      unique molecule identifier (a consecutive index from zero)
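
The average, maximum and minimum ratios and the two molecule ids can be maintained as running values, as in the sketch below. The field names mirror the documented protected fields, but the update logic, the initial values, and the tie-breaking (the first molecule seen keeps the id) are assumptions, not the library's code. NonEmptyStatsSketch and its methods are hypothetical names.

```java
// Hypothetical stand-in for the statistics bookkeeping.
final class NonEmptyStatsSketch {
    int molCount = 0;
    float minNonEmptyPercent = Float.MAX_VALUE;
    int minNonEmptyId = -1;
    float maxNonEmptyPercent = -1f;
    int maxNonEmptyId = -1;
    float sumNonEmptyPercent = 0f;

    /** Records one descriptor that has nonZero non-zero cells out of length cells. */
    void update(int nonZero, int length) {
        float percent = 100f * nonZero / length;
        int id = molCount; // assumption: ids are consecutive indices starting at zero
        if (percent < minNonEmptyPercent) { // strict comparison: first seen wins ties
            minNonEmptyPercent = percent;
            minNonEmptyId = id;
        }
        if (percent > maxNonEmptyPercent) {
            maxNonEmptyPercent = percent;
            maxNonEmptyId = id;
        }
        sumNonEmptyPercent += percent;
        molCount++;
    }

    /** Average percentage of non-zero cells over all recorded descriptors. */
    float average() {
        return molCount == 0 ? 0f : sumNonEmptyPercent / molCount;
    }
}
```

Keeping a running sum rather than every individual ratio means the average stays available at any point without storing per-molecule data.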
    • getDensityCounts

      public int[] getDensityCounts()
      Gets the array of bit density. The array can be indexed from 0 to 10. Index i returns the number of descriptors in which the ratio of non-zero cells is between 10 * i and 10 * i + 10 percent.
      Returns:
      array of density counts
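
One way to read the binning described for getDensityCounts is sketched below. It assumes that the lower bound of each bin is inclusive and that index 10 holds descriptors with every cell non-zero; the source does not state either point. DensitySketch and densityCounts are hypothetical names.

```java
// Hypothetical stand-in: bins percentages of non-zero cells into 11 buckets.
final class DensitySketch {
    /** Returns an 11-element array; element i counts the percentages that fall in bin i. */
    static int[] densityCounts(float[] nonEmptyPercents) {
        int[] density = new int[11]; // indices 0 to 10
        for (float percent : nonEmptyPercents) {
            int bin = (int) (percent / 10f); // assumption: 10 * i <= percent < 10 * i + 10
            density[Math.min(bin, 10)]++;    // guard against values above 100
        }
        return density;
    }
}
```

Under these assumptions, a descriptor with 25 percent of its cells set is counted at index 2.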
    • getFrequencyCounts

      public int[] getFrequencyCounts()
      Gets the absolute frequency count array for all descriptors generated. Each element of the array stores the number of descriptors in which the corresponding cell had a non-zero value.
      Returns:
      per-cell frequency count array
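
A per-cell frequency count array of this kind can be post-processed with plain array code, for example to find cells that were never set or to convert the counts to relative frequencies. The sketch below works on an int[] and a molecule count passed in by the caller; FrequencySketch and its methods are hypothetical helpers, not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers operating on a per-cell frequency count array.
final class FrequencySketch {
    /** Returns the indices of cells that were zero in every descriptor. */
    static List<Integer> neverSetCells(int[] frequencyCounts) {
        List<Integer> unused = new ArrayList<>();
        for (int i = 0; i < frequencyCounts.length; i++) {
            if (frequencyCounts[i] == 0) {
                unused.add(i);
            }
        }
        return unused;
    }

    /** Converts absolute counts to the fraction of descriptors in which each cell was set. */
    static double[] relativeFrequencies(int[] frequencyCounts, int moleculeCount) {
        if (moleculeCount <= 0) {
            throw new IllegalArgumentException("moleculeCount must be positive");
        }
        double[] relative = new double[frequencyCounts.length];
        for (int i = 0; i < frequencyCounts.length; i++) {
            relative[i] = (double) frequencyCounts[i] / moleculeCount;
        }
        return relative;
    }
}
```

With a real generator, the inputs would presumably be the results of getFrequencyCounts() and getMoleculeCount().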