Class DescriptorGenerator

java.lang.Object
chemaxon.descriptors.DescriptorGenerator

@PublicApi public class DescriptorGenerator extends Object
Simple class for generating molecular descriptors (fingerprints). The main purpose of this class is to provide a lightweight common interface for creating various molecular descriptors and obtaining them in different formats.

Typical usage

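      // Create a generator for a predefined descriptor type and adjust one parameter.
      DescriptorGenerator gen = new DescriptorGenerator("ECFP");
      gen.setParameter("Length", "512");
      Molecule mol = getFirstMoleculeFromSomewhere();
      while (mol != null) {
          // generate(Molecule) throws MDGeneratorException on failure.
          gen.generate(mol);
          // Read the result in whichever representations the descriptor type supports.
          doSomethingWith(gen.getAsString());
          doSomethingWith(gen.getAsBitSet());
          mol = getNextMoleculeFromSomewhere();
      }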
 
Since:
JChem 5.4
  • Constructor Details

    • DescriptorGenerator

      public DescriptorGenerator(String descrType)
      Creates a new instance using the given descriptor type with its default configuration parameters.
      Parameters:
      descrType - Predefined type name or class name of the desired molecular descriptor type. The list of available descriptor types can be obtained using getDescriptorTypes(). If the given string does not match any of the predefined names, it is assumed to be a class name.
      Throws:
      RuntimeException - if the given name does not match any predefined descriptor type and no derived class of MolecularDescriptor with that name can be initialized.
    • DescriptorGenerator

      public DescriptorGenerator(String descrType, String configString) throws MDParametersException
      Creates a new instance using the given descriptor type with the given XML configuration.
      Parameters:
      descrType - Predefined type name or class name of the desired molecular descriptor type. The list of available descriptor types can be obtained using getDescriptorTypes(). If the given string does not match any of the predefined names, it is assumed to be a class name.
      configString - XML configuration string for the selected descriptor type.
      Throws:
      RuntimeException - if the given name does not match any predefined descriptor type and no derived class of MolecularDescriptor with that name can be initialized.
      MDParametersException - if the XML configuration is invalid.
  • Method Details

    • getDescriptorTypes

      public static String[] getDescriptorTypes()
      Returns the list of the built-in molecular descriptor types. The returned array contains the short names of the descriptors. The long names can be obtained using getDescriptorLongName(String).
    • getDescriptorLongName

      public static String getDescriptorLongName(String descrType)
      Returns the long name for the given molecular descriptor type.
      Parameters:
      descrType - Predefined short name of a descriptor type. The list of available short names can be obtained using getDescriptorTypes().
      Throws:
      IllegalArgumentException - if the given parameter is not an available descriptor type.
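A minimal sketch of pairing each short name with its long name. The helper class DescriptorNames is hypothetical and not part of JChem. It takes the lookup as a function, so it compiles without the library on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of the JChem API.
final class DescriptorNames {
    private DescriptorNames() {}

    /** Maps every short name to the value returned by the lookup, keeping the input order. */
    static Map<String, String> longNames(String[] shortNames,
                                         Function<String, String> longNameLookup) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String shortName : shortNames) {
            result.put(shortName, longNameLookup.apply(shortName));
        }
        return result;
    }
}
```

With JChem available, the call would presumably look like DescriptorNames.longNames(DescriptorGenerator.getDescriptorTypes(), DescriptorGenerator::getDescriptorLongName).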
    • setParameter

      public void setParameter(String paramName, String paramValue)
      Sets a parameter of the current descriptor configuration. Only a few main parameters of each descriptor type can be set this way, namely those stored as attributes of a designated element in the XML configuration. To specify other parameters, pass a full XML configuration to the constructor of the class.
      Parameters:
      paramName - the name of the parameter, which must be the same as the attribute name in the XML configuration.
      paramValue - the new value of the parameter.
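Because setParameter takes both the name and the value as strings, several settings can be applied from a map in one pass. The sketch below is a hypothetical helper (ParameterApplier is not a JChem class). It accepts any two-argument setter, so it does not depend on the library.

```java
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of the JChem API.
final class ParameterApplier {
    private ParameterApplier() {}

    /** Passes each entry to the setter, converting the value with String.valueOf. */
    static void apply(Map<String, ?> params, BiConsumer<String, String> setter) {
        for (Map.Entry<String, ?> entry : params.entrySet()) {
            setter.accept(entry.getKey(), String.valueOf(entry.getValue()));
        }
    }
}
```

Given a DescriptorGenerator named gen, the intended call is ParameterApplier.apply(params, gen::setParameter). Which parameter names are valid still depends on the descriptor type's XML configuration.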
    • setStandardizer

      public void setStandardizer(Standardizer standardizer)
      Sets the standardizer object to be used during descriptor generation. This method replaces any standardizer defined earlier, whether by a previous call to this method or by the configuration parameters of the descriptor.
      Parameters:
      standardizer - the standardizer object
      Since:
      JChem 5.12
    • generate

      public void generate(Molecule mol) throws MDGeneratorException
      Generates the descriptor for the given molecule.
      Parameters:
      mol - the molecule.
      Throws:
      MDGeneratorException - if descriptor generation fails.
    • generate

      public void generate(Molecule mol, int[] atoms) throws MDGeneratorException
      Generates a partial descriptor for the given molecule. The generated descriptor will contain only those features that are related to the given atoms of the input molecule.

      Currently, only ChemicalFingerprint supports this kind of partial descriptor generation. UnsupportedOperationException is thrown for all other descriptor types.

      Parameters:
      mol - the molecule.
      atoms - indexes of the selected atoms.
      Throws:
      MDGeneratorException - if descriptor generation fails.
      UnsupportedOperationException - if the selected descriptor type does not support partial generation.
      Since:
      JChem 5.4.1
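The atoms argument is a plain int array of atom indexes. The sketch below shows one way to build such an array from a collection of selected indexes. AtomSelection is a hypothetical helper. Zero-based indexing, the range check, and the sorting and de-duplication are assumptions made here, not documented requirements of generate(Molecule, int[]).

```java
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical helper, not part of the JChem API.
final class AtomSelection {
    private AtomSelection() {}

    /**
     * Returns the selected indexes as a sorted array without duplicates.
     * Assumes zero-based indexes in the range [0, atomCount).
     */
    static int[] toIndexArray(Collection<Integer> selected, int atomCount) {
        TreeSet<Integer> unique = new TreeSet<>(selected);
        for (int index : unique) {
            if (index < 0 || index >= atomCount) {
                throw new IllegalArgumentException("Atom index out of range: " + index);
            }
        }
        return unique.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The resulting array would then be passed as the second argument, for example gen.generate(mol, AtomSelection.toIndexArray(picked, atomCount)), where picked and atomCount come from the calling code.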
    • getAsString

      public String getAsString()
      Returns the generated descriptor in its native string representation. This method is applicable to all descriptor types.
    • getAsFloatArray

      public float[] getAsFloatArray() throws UnsupportedOperationException
      Returns the generated descriptor in a float array representation if it is available.
      Throws:
      UnsupportedOperationException - if no appropriate conversion can be applied for the selected descriptor type.
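Since getAsString() works for every descriptor type while getAsFloatArray() may throw UnsupportedOperationException, callers may want to try a preferred representation and fall back to another. A minimal sketch of that pattern follows. RepresentationFallback is a hypothetical helper, written against java.util.function.Supplier so it does not need the library to compile.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the JChem API.
final class RepresentationFallback {
    private RepresentationFallback() {}

    /**
     * Returns the preferred value, or the fallback value if the preferred
     * getter throws UnsupportedOperationException. Other exceptions propagate.
     */
    static <T> T orFallback(Supplier<? extends T> preferred, Supplier<? extends T> fallback) {
        try {
            return preferred.get();
        } catch (UnsupportedOperationException e) {
            return fallback.get();
        }
    }
}
```

With a DescriptorGenerator named gen, a call could look like RepresentationFallback.<Object>orFallback(gen::getAsFloatArray, gen::getAsString). The caller then has to check which type came back.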
    • getAsIntArray

      public int[] getAsIntArray() throws UnsupportedOperationException
      Returns the generated descriptor in an int array representation if it is available.
      Throws:
      UnsupportedOperationException - if this representation is not supported by the selected descriptor type.
    • getAsBitSet

      public BitSet getAsBitSet() throws UnsupportedOperationException
      Returns the generated descriptor in a BitSet representation if it is available.
      Throws:
      UnsupportedOperationException - if this representation is not supported by the selected descriptor type.
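One common use of a BitSet fingerprint is comparing two of them. The sketch below computes the Tanimoto coefficient with plain java.util.BitSet operations. It is an illustration only: BitSetSimilarity is a hypothetical helper rather than a JChem class, and returning 1.0 for two empty bit sets is a convention chosen here.

```java
import java.util.BitSet;

// Hypothetical helper, not part of the JChem API.
final class BitSetSimilarity {
    private BitSetSimilarity() {}

    /** Tanimoto coefficient: shared set bits divided by bits set in either input. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone(); // clone so the inputs stay unmodified
        common.and(b);
        int shared = common.cardinality();
        int union = a.cardinality() + b.cardinality() - shared;
        // Two empty fingerprints are treated as identical (an assumption).
        return union == 0 ? 1.0 : (double) shared / union;
    }
}
```

Each argument would typically be the value returned by getAsBitSet() after generating the descriptor for one molecule. Because the generator holds only the most recent result, the first BitSet should be stored before generate is called again.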