Class PharmacophoreFingerprint

  • All Implemented Interfaces:
    chemaxon.license.Licensable, Cloneable

    @PublicAPI
    public class PharmacophoreFingerprint
    extends MolecularDescriptor
    implements chemaxon.license.Licensable
    The PharmacophoreFingerprint class implements 2D pharmacophoric fingerprints. Such fingerprints (which are chemical descriptors) are constructed from sequences of histograms, each of which has the same number of bars. (Each bar represents a descriptor cell.) The number of histograms is determined by the number of pharmacophore types (also often referred to as features or properties). If the number of distinct pharmacophore features (for instance H-donor, H-acceptor, charge etc.) is n, then the number of histograms is n*(n+1)/2, one for each unordered pair of feature types (a type may be paired with itself).
    Pharmacophoric point types can be customized by the user of the software and are specified in an external configuration file; see the user documentation for details.
    The total number of bars (or bins) in one histogram (that is, the number of cells in the descriptor) is determined by two distance values: the minimal and maximal distances of pharmacophoric point pairs (atom pairs). Since fingerprints handled by this class are two-dimensional, distances are considered as topological distances (that is, the distance of two atoms in the same molecule is equal to the number of edges in the shortest path connecting the two nodes corresponding to the two atoms in the chemical graph of the molecule). (This implies that chemical graphs should be connected.) Atom pairs closer to each other than the minimal distance are regarded as being the minimal distance apart (and similarly for distances greater than the maximal distance).
    Thus the number of bars in one histogram is equal to: maximal distance - minimal distance + 1 .
    The above described three configuration parameters (minimal and maximal distance, and the number of pharmacophore types) have substantial influence on the size of the pharmacophoric fingerprints. When this class is instantiated these params have to be provided in a PFParameters object.
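    The size arithmetic described above can be sketched in a few lines. This is an illustrative, standalone helper and not part of the ChemAxon API; the class and method names are hypothetical and only restate the formulas given in this description.

```java
// Illustrative sketch only; FingerprintSizeSketch is a hypothetical helper,
// not a class of the library documented here.
final class FingerprintSizeSketch {

    /** Number of histograms for n feature types: n*(n+1)/2. */
    static int histogramCount(int featureCount) {
        return featureCount * (featureCount + 1) / 2;
    }

    /** Number of bars in one histogram: maximal distance - minimal distance + 1. */
    static int binsPerHistogram(int minDist, int maxDist) {
        return maxDist - minDist + 1;
    }

    /** Total number of cells: histograms times bars per histogram. */
    static int totalBins(int featureCount, int minDist, int maxDist) {
        return histogramCount(featureCount) * binsPerHistogram(minDist, maxDist);
    }

    /** Distances outside [minDist, maxDist] are treated as the nearest limit. */
    static int clampDistance(int dist, int minDist, int maxDist) {
        return Math.max(minDist, Math.min(maxDist, dist));
    }

    public static void main(String[] args) {
        // Two feature types (for example a and d), topological distances 1..10.
        System.out.println(histogramCount(2));        // 3
        System.out.println(binsPerHistogram(1, 10));  // 10
        System.out.println(totalBins(2, 1, 10));      // 30
        System.out.println(clampDistance(14, 1, 10)); // 10
    }
}
```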
    Besides fingerprint size, two further circumstances determine the internal logical structure of fingerprints: the order of the histograms in the fingerprint, and the order of histogram bars in one histogram. Histograms are ordered by pharmacophore type symbols, that is, if H-bond acceptor is denoted by a, and H-donor property by d (and there are no more features specified), then the order of histograms is: a-a, a-d, d-d (and according to the formula introduced above, the number of histograms is 2*(2+1)/2 = 3). Histogram bars are ordered from left to right by distance value (from minimal to maximal distance).
    This fingerprint structure results in a unique (well-defined, unambiguous) representation that enables the canonical numbering (indexing) of individual bins. This is vital in accessing cells efficiently. Otherwise, if only symbolic keys (in contrast to integer index numbers) could be used (for example ('a','d',3) ) a dramatic loss of efficiency in retrieving information from fingerprints would be experienced. Therefore it is crucial to introduce distinct symbols for different pharmacophore types in the XML configuration file and also to use the same symbols when fingerprints are generated and when they are used in dissimilarity calculations. Otherwise, the interpretation (meaning) of the fingerprints could be significantly different.
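    One way to turn a symbolic key such as ('a','d',3) into an integer bin index is sketched below. It assumes that feature indices follow the symbol order, that histograms are stored contiguously in the order a-a, a-d, d-d, and that bars within a histogram run from minimal to maximal distance. The layout actually used by this class may differ in detail; the class and method names are hypothetical.

```java
// Illustrative sketch only; BinIndexSketch is a hypothetical helper and the
// storage layout it assumes is inferred from the ordering described above.
final class BinIndexSketch {

    /** Position of the (fa, fb) histogram among all unordered feature pairs. */
    static int histogramIndex(int fa, int fb, int featureCount) {
        int lo = Math.min(fa, fb);
        int hi = Math.max(fa, fb);
        // Pairs starting with features 0..lo-1 come first, then the offset inside row lo.
        return lo * featureCount - lo * (lo - 1) / 2 + (hi - lo);
    }

    /** Canonical index of the bin for feature pair (fa, fb) at distance dist. */
    static int binIndex(int fa, int fb, int dist,
                        int featureCount, int minDist, int maxDist) {
        int clamped = Math.max(minDist, Math.min(maxDist, dist));
        int binsPerHistogram = maxDist - minDist + 1;
        return histogramIndex(fa, fb, featureCount) * binsPerHistogram
                + (clamped - minDist);
    }

    public static void main(String[] args) {
        // Features: 0 = a, 1 = d; distances 1..10.
        System.out.println(binIndex(0, 1, 3, 2, 1, 10));  // 12
        System.out.println(binIndex(1, 0, 3, 2, 1, 10));  // 12, pair order is irrelevant
        System.out.println(binIndex(1, 1, 99, 2, 1, 10)); // 29, distance clamped to 10
    }
}
```

    Because every (feature, feature, distance) triple maps to exactly one integer, lookups reduce to a single array access.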

    Operations

    Three main groups of operations (methods) can be distinguished:

    • Direct bin manipulation: put value in a bin, increase the value stored in a bin, retrieve the value stored in a bin.
    • Conversion methods: string representations, extracting into database format and building up from string and database formats.
    • (Dis)similarity metrics: these compare two fingerprints and calculate a distance value (dissimilarity ratio or coefficient) between them.
    Since:
    JChem 2.0
    • Field Detail

      • fp

        protected float[] fp
        storage for the fingerprint
    • Constructor Detail

      • PharmacophoreFingerprint

        public PharmacophoreFingerprint()
        Creates a new, empty instance of PharmacophoreFingerprint without allocating internal storage.
      • PharmacophoreFingerprint

        public PharmacophoreFingerprint​(PFParameters params)
        Creates a new instance of PharmacophoreFingerprint according to the parameters given.
        Parameters:
        params - parameters used in fingerprint generation and handling
      • PharmacophoreFingerprint

        public PharmacophoreFingerprint​(String params)
        Creates a new instance of PharmacophoreFingerprint according to the parameters given.
        Parameters:
        params - parameter settings
      • PharmacophoreFingerprint

        public PharmacophoreFingerprint​(PharmacophoreFingerprint pfp)
        Copy constructor. An identical copy of the pharmacophore fingerprint passed is created; the copy and the original share the same PFParameters object.
        Parameters:
        pfp - fingerprint to be copied
    • Method Detail

      • isLicensed

        public boolean isLicensed()
        Returns information about the licensing of the product.
        Specified by:
        isLicensed in interface chemaxon.license.Licensable
        Returns:
        true if the product is correctly licensed
      • setLicenseEnvironment

        public void setLicenseEnvironment​(String env)
        Sets the license environment.
        Specified by:
        setLicenseEnvironment in interface chemaxon.license.Licensable
      • getName

        public String getName()
        Gets the name of the PharmacophoreFingerprint object. The name is not the same as the class name; it is more readable and meaningful for end users.
        Overrides:
        getName in class MolecularDescriptor
        Returns:
        the nice, external name for PharmacophoreFingerprint class objects
      • getShortName

        public String getShortName()
        Gets the short name of the descriptor.
        Overrides:
        getShortName in class MolecularDescriptor
        Returns:
        the short name used in text outputs (tables etc.)
      • getParametersClassName

        public String getParametersClassName()
        Gets the name of the parameters class corresponding to the descriptor.
        Overrides:
        getParametersClassName in class MolecularDescriptor
        Returns:
        the name of the parameters class
      • setParameters

        public void setParameters​(MDParameters parameters)
        Sets parameters, allocates internal storage if needed and cleans the descriptor.
        Overrides:
        setParameters in class MolecularDescriptor
        Parameters:
        parameters - fingerprint parameters
        Since:
        JChem 2.2
      • toData

        public byte[] toData()
        Converts a PharmacophoreFingerprint object into a byte array. This format can be referred to as an "external representation" since it serves as the data format for storing fingerprints in databases.
        Use the fromData() method to build the pharmacophore fingerprint from this "external" representation.
        Specified by:
        toData in class MolecularDescriptor
        Returns:
        byte array representation of the fingerprint object
      • fromData

        public void fromData​(byte[] dbRepr)
        Builds a PharmacophoreFingerprint from an external data format, created by a previous call to toData().
        Specified by:
        fromData in class MolecularDescriptor
        Parameters:
        dbRepr - "external" representation of PharmacophoreFingerprint
      • decompress

        protected byte[] decompress​(byte[] data)
        Uncompresses input byte array and stores the uncompressed array in params.data. This is the reverse of compress( final byte[] ). Checks header (first byte) and decompresses only if the value of the first byte is ZERO_SEQUENCE_COMPRESSION_CODE. Otherwise null is returned.
        Parameters:
        data - compressed data
      • generate

        public String[] generate​(Molecule m)
                          throws MDGeneratorException
        Creates the PharmacophoreFingerprint descriptor from the given Molecule. Calls the generator created by the corresponding MDParameters class.
        Overrides:
        generate in class MolecularDescriptor
        Returns:
        property names set in the molecule passed during generation
        Throws:
        MDGeneratorException - when failed to generate descriptor
      • inc

        public final void inc​(int fa,
                              int fb,
                              int dist)
        Increments the histogram corresponding to two features ('fa'-'fb') and a distance, 'dist'. Pharmacophore features (types, properties) are not used directly, but instead their indices (as introduced by PSymbols class) have to be provided for the sake of efficiency. Distance values are normalized in this method to fall within the minimum and maximum distance range, as specified by the previously given parameters.
        If the bin is already full its value is not changed.
        Parameters:
        fa - feature index of one of the features
        fb - feature index of the other pharmacophore feature
        dist - distance value of the two features
      • inc

        public final void inc​(int fa,
                              int fb,
                              int dist,
                              int nrRotBonds)
        The fuzzy version of inc( int fa, int fb, int dist ). The contents of all bins in the (fa,fb) histogram are incremented with the appropriate value depending on the distance, the number of rotatable bonds, and the fuzzy smoothing factor.
        Parameters:
        fa - feature index of one of the features
        fb - feature index of the other pharmacophore feature
        dist - distance value of the two features
        nrRotBonds - number of rotatable bonds on the path connecting the two pharmacophoric points
      • inc

        public final void inc​(int fa,
                              int fb,
                              int dist,
                              float[] incr)
        The fuzzy version of inc( int fa, int fb, int dist ). The contents of all bins in the (fa,fb) histogram are incremented with the appropriate value depending on the user-defined fuzzy smoothing vector.
        Parameters:
        fa - feature index of one of the features
        fb - feature index of the other pharmacophore feature
        dist - distance value of the two features
        incr - distance-dependent fuzzy increments
      • inc

        public final void inc​(int bin)
        Increments the content of the specified histogram bin by one. No overflow check is performed for the sake of efficiency (in normal use no overflow should occur, since 2^32-1 is large enough for molecules having about 90000 atoms). See the class description for the exact meaning of the bin index.
        Parameters:
        bin - index of the bin to be incremented by one
      • put

        public final void put​(int bin,
                              int newValue)
        Stores the given value in the specified histogram bin. The previous value of the bin is discarded.
        Parameters:
        bin - index of the bin in which the value is stored
        newValue - value to be stored in the given bin
      • put

        public final void put​(int bin,
                              float newValue)
        Stores the given value in the specified histogram bin. The previous value of the bin is discarded.
        Parameters:
        bin - index of the bin in which the value is stored
        newValue - value to be stored in the given bin
      • get

        public final float get​(int fa,
                               int fb,
                               int dist)
        Gets the histogram bar height of two features ('fa'-'fb') corresponding to the given distance 'dist'. Distance values have to be normalized before calling this method!
        Parameters:
        fa - feature index of one of the features
        fb - feature index of the other pharmacophore feature
        dist - distance value of the two features
        Returns:
        height (value) of the histogram bar (column) corresponding to the input arguments
      • get

        public final float get​(int bin)
        Gets the content of the specified histogram bin. See the description of PharmacophoreFingerprint class for the meaning of the bin index.
        Parameters:
        bin - index of the bin queried
        Returns:
        the value stored in the specified bin
      • clear

        public final void clear()
        Clears the fingerprint: sets all bins to store zero value.
      • toString

        public final String toString​(String sep,
                                     boolean nonZeroOnly)
        Creates the string representation of the pharmacophore fingerprint. The output format is different than in toString: <feature symbol> ' ' <feature symbol> @ <distance> '=' <value> <sep> ... . Note that this text representation cannot be converted back into pharmacophore fingerprint data.
        Parameters:
        sep - separator character printed between two bins
        nonZeroOnly - bins containing zero values are not printed
        Returns:
        the string representation of the fingerprint
      • toHistogramString

        public final String toHistogramString​(String sep,
                                              boolean nonZeroOnly)
        Creates the string representation of the fingerprint. Depending on the parameter settings, either all bins are printed, or only the bins of those histograms that contain at least one non-zero bin (that is, whose feature pair occurs at least once).
        The format is: <feature symbol> ' ' <feature symbol> '=' '|' b1 b2 ... bn '|' <separator>, where bi denotes the value stored in bin i.
        Parameters:
        sep - separator string to be printed between histograms
        nonZeroOnly - if true, only histograms containing a non-zero value are printed; otherwise all histograms are printed
        Returns:
        the string representation of the fingerprint
      • toDecimalString

        public final String toDecimalString()
        Converts the fingerprint into a string of decimal numbers. All bins are printed in an unstructured way; values are simply separated by tabs.
        Specified by:
        toDecimalString in class MolecularDescriptor
        Returns:
        decimal string representation of the fingerprint
      • toFloatArray

        public float[] toFloatArray()
        Creates the float array representation of a MolecularDescriptor object. This array contains all values of the descriptor (including all zeros) in the elements of the array.
        Specified by:
        toFloatArray in class MolecularDescriptor
        Returns:
        float array of the fingerprint cells
        Since:
        JChem 2.0.1
      • fromFloatArray

        public void fromFloatArray​(float[] descr)
        Builds a molecular descriptor from its float array representation. Typically used when a hypothesis is created.
        Specified by:
        fromFloatArray in class MolecularDescriptor
        Parameters:
        descr - descriptor represented in a float array (e.g. generated by toFloatArray())
        Since:
        JChem 2.0.1
      • getAtomSetColors

        public Color[] getAtomSetColors()
        Determines the coloring of atoms. This coloring reflects pharmacophore point types rather than element types. This method should be called after each call of setParameters(), as that may change the coloring scheme to be applied.
        Overrides:
        getAtomSetColors in class MolecularDescriptor
        Returns:
        array of colors of different pharmacophore point types
      • getAtomSetIndexes

        public int[] getAtomSetIndexes​(Molecule m)
        Gets the individual atom colors by pharmacophore point type.
        Overrides:
        getAtomSetIndexes in class MolecularDescriptor
        Parameters:
        m - a molecule to assign pharmacophore point colors to
        Returns:
        array of color indexes indexed by atom indexes
      • getDefaultDissimilarityMetricThresholds

        public float[] getDefaultDissimilarityMetricThresholds()
        Gets the default dissimilarity threshold values for all dissimilarity metrics defined.
        Specified by:
        getDefaultDissimilarityMetricThresholds in class MolecularDescriptor
        Returns:
        array of dissimilarity threshold values
      • getEuclidean

        public final float getEuclidean​(PharmacophoreFingerprint f)
        Calculates the Euclidean distance. The dissimilarity coefficient returned ranges from 0 to MAX_FLOAT, this coefficient is not normalized.
        Parameters:
        f - another fingerprint from which the distance is measured
        Returns:
        dissimilarity coefficient
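        For orientation, the unnormalized Euclidean distance between two histogram vectors of equal length can be sketched as follows. This is the textbook formula applied to plain float arrays (such as those returned by toFloatArray()), not the code of this class; the class and method names are hypothetical.

```java
// Illustrative sketch only; EuclideanSketch is a hypothetical helper.
final class EuclideanSketch {

    /** Square root of the sum of squared bin-wise differences. */
    static float euclidean(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("fingerprints differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return (float) Math.sqrt(sum);
    }

    public static void main(String[] args) {
        float[] first = {0f, 3f, 0f};
        float[] second = {4f, 0f, 0f};
        System.out.println(euclidean(first, second)); // 5.0
    }
}
```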
      • getWeightedEuclidean

        public final float getWeightedEuclidean​(PharmacophoreFingerprint f)
        Calculates the weighted Euclidean distance. Weights are taken from the associated PFParameters.
        Parameters:
        f - a fingerprint from which the distance is measured
        Returns:
        dissimilarity coefficient
      • getWeightedAsymmetricEuclidean

        public final float getWeightedAsymmetricEuclidean​(PharmacophoreFingerprint f)
        Calculates the weighted asymmetric Euclidean distance. Weights and asymmetry ratio are taken from the associated PFParameters.
        Parameters:
        f - a fingerprint from which the distance is measured
        Returns:
        dissimilarity coefficient
      • getSymmetricFBPA

        public final float getSymmetricFBPA​(PharmacophoreFingerprint f)
        Calculates the symmetric FBPA convolution product based distance of this fingerprint from another (given as parameter).
        Parameters:
        f - the fingerprint from which the distance is measured
        Returns:
        Euclidean distance (dissimilarity measure)
      • getAsymmetricFBPA

        public final float getAsymmetricFBPA​(PharmacophoreFingerprint f)
        Calculates the asymmetric FBPA convolution product based distance of this fingerprint from another (given as parameter).
        Parameters:
        f - the reference fingerprint (denoted by M)
        Returns:
        the Euclidean distance (dissimilarity measure)
      • getTanimoto

        public final float getTanimoto​(PharmacophoreFingerprint f)
        Calculates the Tanimoto metric (adapted to histograms).
        Parameters:
        f - the distance from f is calculated
        Returns:
        the tanimoto distance (dissimilarity measure)
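        A common way to adapt the Tanimoto coefficient to histograms is to take the bin-wise minimum as the "intersection" and the bin-wise maximum as the "union". The sketch below uses that generalization on plain float arrays; whether this class uses exactly this form is an assumption, and the class and method names are hypothetical.

```java
// Illustrative sketch only; HistogramTanimotoSketch is a hypothetical helper
// and the min/max generalization is an assumption, not documented behavior.
final class HistogramTanimotoSketch {

    /** Returns 1 - sum(min(a_i, b_i)) / sum(max(a_i, b_i)). */
    static float tanimotoDissimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("fingerprints differ in length");
        }
        double minSum = 0.0;
        double maxSum = 0.0;
        for (int i = 0; i < a.length; i++) {
            minSum += Math.min(a[i], b[i]);
            maxSum += Math.max(a[i], b[i]);
        }
        if (maxSum == 0.0) {
            return 0.0f; // two empty fingerprints are treated as identical
        }
        return (float) (1.0 - minSum / maxSum);
    }

    public static void main(String[] args) {
        float[] first = {2f, 0f, 1f};
        float[] second = {1f, 1f, 1f};
        // min sum = 2, max sum = 4
        System.out.println(tanimotoDissimilarity(first, second)); // 0.5
    }
}
```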
      • getTversky

        public float getTversky​(PharmacophoreFingerprint f)
        Calculates the Tversky dissimilarity index. Note that the value returned is a dissimilarity, not a similarity.
        Parameters:
        f - the distance from f is calculated
        Returns:
        the Tversky dissimilarity index as float
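        To make the distinction between similarity and dissimilarity concrete, the sketch below computes the standard Tversky similarity on histograms and returns one minus that value. It assumes the bin-wise minimum as the shared part and takes the weights alpha and beta as explicit arguments; how this class obtains its weights, and the exact formula it uses, are not stated here. The class and method names are hypothetical.

```java
// Illustrative sketch only; TverskySketch is a hypothetical helper and the
// histogram adaptation shown is an assumption, not documented behavior.
final class TverskySketch {

    /** Returns 1 - common / (common + alpha * onlyA + beta * onlyB). */
    static float tverskyDissimilarity(float[] a, float[] b, double alpha, double beta) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("fingerprints differ in length");
        }
        double common = 0.0;
        double onlyA = 0.0;
        double onlyB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double shared = Math.min(a[i], b[i]);
            common += shared;
            onlyA += a[i] - shared; // part of a not matched by b
            onlyB += b[i] - shared; // part of b not matched by a
        }
        double denominator = common + alpha * onlyA + beta * onlyB;
        if (denominator == 0.0) {
            return 0.0f; // nothing to compare, treated as identical
        }
        return (float) (1.0 - common / denominator);
    }

    public static void main(String[] args) {
        float[] first = {2f, 0f, 1f};
        float[] second = {1f, 1f, 1f};
        // common = 2, onlyA = 1, onlyB = 1
        System.out.println(tverskyDissimilarity(first, second, 1.0, 1.0)); // 0.5
    }
}
```

        With alpha = beta = 1 this reduces to the min/max form of the Tanimoto dissimilarity.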
      • getScaledTanimoto

        public final float getScaledTanimoto​(PharmacophoreFingerprint f,
                                             PharmacophoreFingerprint hypothesis)
        Calculates the scaled Tanimoto metric (adapted to histograms).
        Parameters:
        f - the distance is measured from f
        Returns:
        the tanimoto distance (dissimilarity measure)
      • index

        public int index​(int fa,
                         int fb,
                         int dist)
        Calculates the index of the bin specified by the arguments.
        Parameters:
        fa - index of the first pharmacophore point type
        fb - index of the second (other) pharmacophore point type
        dist - distance of the pharmacophore points
        Returns:
        index of the specified bin
      • getDissimilarity

        public float getDissimilarity​(MolecularDescriptor fp2)
        Calculates the dissimilarity between two pharmacophore fingerprints using the default distance measure.
        Specified by:
        getDissimilarity in class MolecularDescriptor
        Parameters:
        fp2 - the other pharmacophore fingerprint
        Returns:
        dissimilarity ratio
      • getDissimilarity

        public float getDissimilarity​(MolecularDescriptor fp2,
                                      int metricIndex)
        Calculates the dissimilarity between two pharmacophore fingerprints using the specified parametrized distance metric.
        Specified by:
        getDissimilarity in class MolecularDescriptor
        Parameters:
        fp2 - the pharmacophore fingerprint from which the distance is measured
        metricIndex - index of the parametrized metric to be used
        Returns:
        the dissimilarity ratio
        See Also:
        MDParameters, PFParameters
      • getLowerBound

        public float getLowerBound​(MolecularDescriptor fp2)
        Calculates the lower bound estimate of the dissimilarity from the given fingerprint. This method is required by Diffable; see the remarks at getDissimilarity( final Object fp2 ) for further explanation. In the case of PharmacophoreFingerprint a good estimate for the minimum distance cannot be obtained efficiently (that is, significantly faster than calculating the proper distance), therefore 0 is returned. This trivial distance bound estimation will lead to calling getDistance.
        Overrides:
        getLowerBound in class MolecularDescriptor
        Parameters:
        fp2 - pharmacophore fingerprint from which distance is measured
        Returns:
        estimate of the minimum distance
      • isSubsetOf

        public boolean isSubsetOf​(PharmacophoreFingerprint d)
        Checks if this fingerprint is a subset of another fingerprint that is passed as method parameter. A histogram (fingerprint) is considered to be a subset of another, if none of its bars is higher than that of the other's.
        Parameters:
        d - a descriptor which is supposed to be a superset
        Returns:
        true if this descriptor is a subset of the parameter
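        The subset rule stated above (no bar higher than the corresponding bar of the other fingerprint) can be sketched on plain float arrays as follows. The class and method names are hypothetical, and the sketch assumes both fingerprints were generated with the same parameters, so that bins with equal index have the same meaning.

```java
// Illustrative sketch only; SubsetSketch is a hypothetical helper.
final class SubsetSketch {

    /** True if no bar of candidate is higher than the matching bar of other. */
    static boolean isSubset(float[] candidate, float[] other) {
        if (candidate.length != other.length) {
            throw new IllegalArgumentException("fingerprints differ in length");
        }
        for (int i = 0; i < candidate.length; i++) {
            if (candidate[i] > other[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        float[] superset = {1f, 3f, 2f};
        System.out.println(isSubset(new float[] {1f, 0f, 2f}, superset)); // true
        System.out.println(isSubset(new float[] {1f, 4f, 2f}, superset)); // false
    }
}
```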
      • getMaxDist

        public float getMaxDist()
      • getMinDist

        public float getMinDist()
      • getResolution

        public float getResolution()
      • getNumberOfFeatures

        public int getNumberOfFeatures()
      • getSymbol

        public String getSymbol​(int feature)
      • get

        public float get​(int feature1,
                         int feature2,
                         float dist)
      • getAliasNames

        public List<String> getAliasNames()