Class Standardizer


  • @PublicAPI
    public class Standardizer
    extends Object
    Performs the standardization determined by the XML configuration file, or simple action string configuration.
    See the documentation for details:

    API usage examples:

    1. A simple example:
       Standardizer standardizer = new Standardizer(new File("standardize.xml"));
       standardizer.standardize(molecule);
       
    2. A more complex example: import molecules, standardize them with a final clean action to arrange changed atoms, store the standardization information in molecule properties, and export the molecules in SDF format:
       // create Standardizer based on an XML configuration file
       Standardizer standardizer = new Standardizer(new File("config.xml"));
       try {
           Molecule molecule = null;
           MolExporter exporter = new MolExporter(System.out, "sdf");
           MolImporter importer = new MolImporter("mols.sdf");
           while ((molecule = importer.read()) != null) {
      
              // standardize molecule
              standardizer.standardize(molecule);
      
              // get applied task indexes
              int[] appliedTaskIndexes = standardizer.getAppliedTaskIndexes();
      
              // get applied task IDs
              String[] appliedTaskIdentifiers = standardizer.getAppliedTaskIDs();
      
              // store applied task indexes and IDs in molecule properties
              StringBuilder indexPropertyValue = new StringBuilder();
              for (int i = 0; i < appliedTaskIndexes.length; ++i) {
                  indexPropertyValue.append(appliedTaskIndexes[i]);
                  indexPropertyValue.append(" ");
              }
              StringBuilder identifierPropertyValue = new StringBuilder();
              for (int i = 0; i < appliedTaskIdentifiers.length; ++i) {
                  identifierPropertyValue.append(appliedTaskIdentifiers[i]);
                  identifierPropertyValue.append(" ");
              }
              molecule.setProperty("TASK_INDEXES", indexPropertyValue.toString());
              molecule.setProperty("TASK_IDS", identifierPropertyValue.toString());
      
              // write output
              exporter.write(molecule);
           }
           importer.close();
           exporter.close();
       } catch (LicenseException e) {
           e.printStackTrace();
       } catch (IOException e) {
           e.printStackTrace();
       }
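
    The two loops in the example above leave a trailing space at the end of each property value. If that matters for downstream parsing, the values can be joined with the standard library instead. The following is a minimal sketch; the class name, the method names and the sample values are illustrative assumptions and are not part of the Standardizer API:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Illustrative helper, not part of the ChemAxon API.
final class TaskPropertyFormat {

    // Joins task indexes with single spaces, without a trailing space.
    static String joinIndexes(int[] indexes) {
        return Arrays.stream(indexes)
                .mapToObj(String::valueOf)
                .collect(Collectors.joining(" "));
    }

    // Joins task IDs with single spaces, without a trailing space.
    static String joinIds(String[] ids) {
        return String.join(" ", ids);
    }

    public static void main(String[] args) {
        // Sample values only; real values would come from
        // getAppliedTaskIndexes() and getAppliedTaskIDs().
        System.out.println(joinIndexes(new int[] {0, 2, 3}));
        System.out.println(joinIds(new String[] {"first", "second"}));
    }
}
```

    The returned strings could then be passed to molecule.setProperty(...) in place of the StringBuilder results.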
       
    Configuration
    Defines a sequence of standardizer actions that should be executed; see StandardizerConfiguration.
    Configuration error
    If the provided configuration contains errors, the standardize(Molecule) method throws IllegalArgumentException by default. If setIgnoreConfigurationErrors(boolean) has been called with true beforehand, faulty actions are not executed, and no error message or exception is produced.
    To check a configuration for errors, obtain it with the getConfiguration() method and test its validity with StandardizerConfiguration.isValid().
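
    The difference between the default (strict) behavior and the ignore-errors behavior can be illustrated with a toy model. This sketch does not use or reproduce the ChemAxon implementation: the Action class, the actionsToRun method and the action names are hypothetical stand-ins chosen only to show the described semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the documented error handling; all types here are hypothetical.
final class ConfigurationErrorDemo {

    // Stand-in for one configured action: a name plus a validity flag.
    static final class Action {
        final String name;
        final boolean valid;

        Action(String name, boolean valid) {
            this.name = name;
            this.valid = valid;
        }
    }

    // Returns the names of the actions that would be executed.
    // Strict mode (ignoreErrors == false) rejects the whole configuration
    // if any action is invalid; lenient mode skips the invalid actions.
    static List<String> actionsToRun(List<Action> configuration, boolean ignoreErrors) {
        List<String> runnable = new ArrayList<>();
        for (Action action : configuration) {
            if (action.valid) {
                runnable.add(action.name);
            } else if (!ignoreErrors) {
                throw new IllegalArgumentException("Invalid action: " + action.name);
            }
        }
        return runnable;
    }

    public static void main(String[] args) {
        List<Action> configuration = Arrays.asList(
                new Action("first", true),
                new Action("broken", false),
                new Action("last", true));

        // Lenient mode: the faulty action is silently skipped.
        System.out.println(actionsToRun(configuration, true));

        // Strict mode: the faulty action causes an exception.
        try {
            actionsToRun(configuration, false);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```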
    Standardizer actions
    All built-in standardizer actions can be used with the Standardizer. To use external standardizer actions, place the XML factory-configuration file userdefinedstandardizers.xml, which lists the user-defined actions, in the $HOME/chemaxon folder. For more information, see StandardizerActionFactory.
    Built-in standardizer actions
    Built-in standardizer actions are located in the packages chemaxon.standardizer.actions and chemaxon.standardizer.advancedactions.
    Concurrent usage
    Standardizer works on a single thread. To process many target molecules, we advise creating several Standardizer instances (by cloning an existing one) and processing the molecules concurrently. For more information, see ConcurrentStandardizerProcessor.
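
    The "one instance per worker" pattern described here can be sketched with the standard java.util.concurrent classes. In this sketch, Processor is a hypothetical stand-in for a non-thread-safe object with a copy constructor; it is not the Standardizer class, and the code is not how ConcurrentStandardizerProcessor is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerWorkerInstanceDemo {

    // Hypothetical stand-in for a processor that must not be shared between threads.
    static final class Processor {
        private final String suffix;

        Processor(String suffix) {
            this.suffix = suffix;
        }

        // Copy constructor: creates an independent instance with the same setup.
        Processor(Processor other) {
            this.suffix = other.suffix;
        }

        String process(String input) {
            return input + suffix;
        }
    }

    // Splits the inputs into at most `threads` chunks; each chunk is handled
    // by its own copy of the prototype. Results keep the input order.
    static List<String> processAll(Processor prototype, List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            int chunk = (inputs.size() + threads - 1) / threads;
            for (int start = 0; start < inputs.size(); start += chunk) {
                final List<String> slice =
                        inputs.subList(start, Math.min(start + chunk, inputs.size()));
                final Processor own = new Processor(prototype); // one instance per worker
                Callable<List<String>> task = () -> {
                    List<String> out = new ArrayList<>();
                    for (String input : slice) {
                        out.add(own.process(input));
                    }
                    return out;
                };
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                results.addAll(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Processing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(processAll(new Processor("!"), inputs, 2));
    }
}
```

    With the real API, the per-worker copy would presumably be created with the Standardizer(Standardizer) constructor; that mapping is an assumption of this sketch.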
    Logging
    Log messages are written to a Logger, which can be obtained with the StandardizerLogger.getLogger() method. Currently, only Level.WARNING and Level.SEVERE messages are generated.
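
    One way to collect such messages is to attach a custom handler to the logger. The sketch below assumes that StandardizerLogger.getLogger() returns a java.util.logging.Logger (the Level constants mentioned above suggest this, but it is an assumption). To stay self-contained, it uses a logger with the made-up name "example.standardizer" as a stand-in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LogCollectorDemo {

    // Handler that keeps every accepted record in memory.
    static final class CollectingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public synchronized void publish(LogRecord record) {
            // isLoggable applies the handler's level (and filter, if any).
            if (isLoggable(record)) {
                records.add(record);
            }
        }

        @Override
        public void flush() {
            // Nothing is buffered.
        }

        @Override
        public void close() {
            // No resources to release.
        }
    }

    // Attaches a handler that accepts WARNING and SEVERE records.
    static CollectingHandler attach(Logger logger) {
        CollectingHandler handler = new CollectingHandler();
        handler.setLevel(Level.WARNING);
        logger.addHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        // Stand-in logger; with the real API this would presumably be
        // the logger returned by StandardizerLogger.getLogger().
        Logger logger = Logger.getLogger("example.standardizer");
        CollectingHandler handler = attach(logger);

        logger.warning("sample warning");
        logger.severe("sample error");

        for (LogRecord record : handler.records) {
            System.out.println(record.getLevel() + ": " + record.getMessage());
        }
    }
}
```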
    Since:
    5.11
    See Also:
    ConcurrentStandardizerProcessor
    • Constructor Detail

      • Standardizer

        public Standardizer​(String configuration)
                     throws IllegalArgumentException
        Initializes a new standardizer with auto-recognized XML or action string configuration

        Configurations containing references to molecule files do not work this way! If your configuration contains references to external files, use a constructor that provides information on the path (e.g. Standardizer(File) or Standardizer(String, String)).

        Parameters:
        configuration - the configuration defined by action string or XML
        Throws:
        IllegalArgumentException - on invalid configuration
      • Standardizer

        public Standardizer​(String configuration,
                            String path)
                     throws IllegalArgumentException
        Initializes a new standardizer with auto-recognized XML or action string configuration
        Parameters:
        configuration - the configuration defined by action string or XML
        path - the root path of referenced contents in the configuration
        Throws:
        IllegalArgumentException - on invalid configuration
      • Standardizer

        public Standardizer​(Standardizer standardizer)
        Initializes a new standardizer based on an existing standardizer
        Parameters:
        standardizer - an initialized standardizer instance as a base
        Throws:
        IllegalArgumentException - on invalid configuration
    • Method Detail

      • isLicensed

        public boolean isLicensed()
        Gets whether the standardizer is licensed
        Returns:
        whether the standardizer is licensed
      • setLicenseEnvironment

        public void setLicenseEnvironment​(String env)
        Sets the license environment. For internal usage only.
        Parameters:
        env - the license environment
      • getOldToNew

        public int[] getOldToNew()
        Returns the old -> new atom index mapping.
        oldToNew[i]==j means that the i-th atom of the old molecule corresponds to the j-th atom of the new molecule.
        Returns:
        the old -> new atom index mapping
      • getNewToOld

        public int[] getNewToOld()
        Returns the new -> old atom index mapping.
        newToOld[i]==j means that the i-th atom of the new molecule corresponds to the j-th atom of the old molecule.
        Returns:
        the new -> old atom index mapping
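
        The two mappings are inverses of each other for atoms that exist in both molecules. A minimal sketch of that relationship using plain arrays follows. The value -1 for an atom without a counterpart is an assumption of this sketch, since the documentation above does not state how such atoms are marked, and the sample data is made up.

```java
import java.util.Arrays;

final class AtomIndexMappingDemo {

    // Assumed marker for "no corresponding atom"; not confirmed by the API docs.
    static final int NO_ATOM = -1;

    // Derives the new -> old mapping from an old -> new mapping.
    static int[] invert(int[] oldToNew, int newAtomCount) {
        int[] newToOld = new int[newAtomCount];
        Arrays.fill(newToOld, NO_ATOM);
        for (int oldIndex = 0; oldIndex < oldToNew.length; oldIndex++) {
            int newIndex = oldToNew[oldIndex];
            if (newIndex != NO_ATOM) {
                newToOld[newIndex] = oldIndex;
            }
        }
        return newToOld;
    }

    public static void main(String[] args) {
        // Made-up case: the old molecule has 4 atoms and atom 1 was removed,
        // so old atoms 2 and 3 moved down to new indexes 1 and 2.
        int[] oldToNew = {0, NO_ATOM, 1, 2};
        int[] newToOld = invert(oldToNew, 3);
        System.out.println(Arrays.toString(newToOld));
    }
}
```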
      • setFinalClean

        @Deprecated
        public void setFinalClean​(int dim)
                           throws IllegalArgumentException
        Deprecated.
        use ConfigurationUtility.setFinalClean(StandardizerConfiguration, int) on the configuration returned by the method getConfiguration()
        Sets final clean task: partial, target-only.
        Parameters:
        dim - the clean dimension; set -1 to use the original molecule dimension
        Throws:
        IllegalArgumentException
      • setFinalClean

        @Deprecated
        public void setFinalClean​(int dim,
                                  boolean partial)
                           throws IllegalArgumentException
        Deprecated.
        use ConfigurationUtility.setFinalClean(StandardizerConfiguration, int, boolean) on the configuration returned by the method getConfiguration()
        Sets final clean task: target-only.
        Parameters:
        dim - the clean dimension; partial clean currently works only in 2D, so if set to 3, a full clean is performed
        partial - true if only the changed atoms should be cleaned, false if a full clean is needed; if the clean dimension differs from the molecule dimension, a full clean is always performed
        Throws:
        IllegalArgumentException
      • standardize

        public List<Changes> standardize​(Molecule mol)
                                  throws chemaxon.license.LicenseException,
                                         IllegalArgumentException
        Standardizes one input molecule by performing the standardization actions according to the configuration. The input molecule is modified in place; no new molecule object is created. If the input molecule is a reaction, each reactant and product is standardized.
        Parameters:
        mol - the input molecule to be standardized
        Returns:
        the list of changes applied by the standardizer (each entry of the list is a change set of a standardizer action execution)
        Throws:
        chemaxon.license.LicenseException - if there is no valid license
        IllegalArgumentException - if the molecule parameter or the current configuration is not valid
      • getAppliedTaskIndexes

        public int[] getAppliedTaskIndexes()
        Returns the indexes of tasks applied to the last input molecule. Indexing is 0-based; an index equal to the number of tasks denotes the final clean action.
        Returns:
        the indexes of tasks applied to the last input molecule
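
        The indexing rule can be illustrated with a small helper that labels each returned index. This is a sketch only: the class, the method names, the task count and the sample indexes are hypothetical, and the helper relies solely on the rule stated above (an index equal to the number of tasks means the final clean action).

```java
final class AppliedTaskIndexDemo {

    // True if the index refers to the final clean action rather than a configured task.
    static boolean isFinalClean(int index, int taskCount) {
        return index == taskCount;
    }

    // Produces a short label for an applied task index.
    static String describe(int index, int taskCount) {
        if (index < 0 || index > taskCount) {
            throw new IllegalArgumentException("Index out of range: " + index);
        }
        return isFinalClean(index, taskCount) ? "final clean" : "task " + index;
    }

    public static void main(String[] args) {
        int taskCount = 3;           // made-up number of configured tasks
        int[] applied = {0, 2, 3};   // made-up result of getAppliedTaskIndexes()
        for (int index : applied) {
            System.out.println(describe(index, taskCount));
        }
    }
}
```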
      • getAppliedTaskIDs

        public String[] getAppliedTaskIDs()
        Returns the IDs of tasks applied to the last input molecule. The ID FINAL_CLEAN_ID corresponds to the final clean action.
        Returns:
        the IDs of tasks applied to the last input molecule
      • setFinalStereoFix

        public void setFinalStereoFix​(boolean clean)
        Sets to perform final stereo fixes.
        Parameters:
        clean - if true then final stereo fix is performed
      • setActiveGroups

        @Deprecated
        public void setActiveGroups​(String[] groups)
        Deprecated.
        use ConfigurationUtility.filterGroups(StandardizerConfiguration, String[]) on the configuration returned by the method getConfiguration()
        Sets the active groups of the runners
        Parameters:
        groups - the list of names of active groups
      • getActiveGroups

        @Deprecated
        public String[] getActiveGroups()
        Deprecated.
        use ConfigurationUtility.filterGroups(StandardizerConfiguration, String[]) on the configuration returned by the method getConfiguration()
        Gets the active groups of the runners
        Returns:
        the list of names of active groups
      • getConfiguration

        public StandardizerConfiguration getConfiguration()
        Gets the configuration of the standardizer. Note that modifications to the returned configuration are not applied to the standardizer.
        Returns:
        the configuration of the standardizer
      • getLicenseEnvironment

        public String getLicenseEnvironment()
        Gets the license environment of the standardizer
        Returns:
        the license environment of the standardizer
      • isStereoFix

        public boolean isStereoFix()
        Gets whether the final stereo fix should be applied
        Returns:
        whether the final stereo fix should be applied
      • setIgnoreConfigurationErrors

        public void setIgnoreConfigurationErrors​(boolean ignore)
        Sets whether the configuration errors should be ignored by the standardization process.
        When configuration errors are ignored, invalid standardizer actions in the configuration are not executed, however all valid actions are to be executed.
        When configuration errors are not ignored, IllegalArgumentException is thrown when the standardize(Molecule) method is called with an invalid configuration.
        Parameters:
        ignore - true if configuration errors should be ignored