Class RGroupDecomposition

  • All Implemented Interfaces:
    chemaxon.license.Licensable, SearchConstants, StereoConstants, SimpleSearcher

    @PublicAPI
    public class RGroupDecomposition
    extends StandardizedMolSearch
    implements chemaxon.license.Licensable

    R-group decomposition.

    Given a scaffold structure with attached R1, R2, ... nodes as query, determines the possible R-group decompositions of given target molecule. A decomposition consists of the matching scaffold and the R1, R2, ... ligands with attachment points. Each decomposition corresponds to a group hit, see Search.findFirstGroup() and Search.findNextGroup().

    After setting the query, the target, and possibly the search options, call findFirstDecomposition() and findNextDecomposition() to get the decompositions. Call Decomposition.equals(java.lang.Object) to filter equivalent decompositions that provide the same ligands (but correspond to different group hits). Color the decomposition by calling Decomposition.color() or Decomposition.color(java.lang.String).

    To create a result table with all decompositions right away, call findLigandTable(int, int, java.lang.String). Alternatively, get the table header with query, target and R-atoms by calling getLigandTableHeader(int, int, java.lang.String), and then add only a single table row corresponding to the first decomposition by calling findLigandTableRow(int, java.lang.String). Different ligand attachment types can be set with setAttachmentType(int).

    As of JChem 5.3, extra ligands without matching R-atom are not allowed. As of JChem 5.3, a query without R-atoms will be automatically modified in setQuery(chemaxon.struc.Molecule): R-atoms will be added in place of implicit hydrogens by addRGroupsInPlaceOfImplHs(chemaxon.struc.Molecule). The R-grouped query can be retrieved by getRGroupedQuery().

    If the original query contains R-atoms then the default R-atom matching behavior is SearchConstants.UNDEF_R_MATCHING_GROUP_H, otherwise the default matching behavior of the automatically added R-atoms is SearchConstants.UNDEF_R_MATCHING_GROUP_H_EMPTY (the empty set matching is allowed here because we expect that ligands are attached to some of the implicit hydrogens in the original query, but not necessarily all implicit hydrogens have corresponding ligands).

    Search options:

    The options below have different default values than in MolSearch:

    API usage examples:

    query: the query molecule
    target: the target molecule

    • Set parameters, get decomposition results, color target according to decomposition in SDF tag:
       RGroupDecomposition rgd = new RGroupDecomposition();
      
       // set search options
       rgd.getSearchOptions().setRLigandEqualityCheck(false);
       rgd.getSearchOptions().setBridgingRAllowed(false);
       rgd.getSearchOptions().setUndefinedRAtom(SearchOptions.UNDEF_R_MATCHING_GROUP);
      
       // set query and target
       rgd.setQuery(query);
       rgd.setTarget(target);
      
       // find decompositions, output colored and aligned target
       MolExporter exporter = new MolExporter(System.out, "sdf:-a");
       ArrayList<Decomposition> list = new ArrayList<Decomposition>();
       Decomposition d = rgd.findFirstDecomposition();
       while (d != null) {
           boolean duplicate = false;
           for (int i=list.size()-1; i >= 0; --i) {
               if (d.equals(list.get(i))) {
                   duplicate = true;
                   break;
               }
           }
           if (!duplicate) {
               list.add(d);
               d.color("DMAP");
               exporter.write(d.getTarget());
           }
           d = rgd.findNextDecomposition();
       }
       exporter.close();
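
The duplicate check in the loop above relies only on equals(). As a minimal sketch, the same filtering step can be factored into a generic helper. Plain strings stand in for Decomposition objects here, and the names DistinctByEquals and distinct are hypothetical, not part of the JChem API. The helper deliberately avoids HashSet, because this page does not say whether Decomposition overrides hashCode().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JChem: keeps the first item of each
// equals()-equivalence class, in encounter order. Items must be non-null.
final class DistinctByEquals {

    static <T> List<T> distinct(Iterable<T> items) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            boolean duplicate = false;
            for (T seen : kept) {
                if (item.equals(seen)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Strings stand in for Decomposition objects.
        List<String> hits = Arrays.asList("R1=C;R2=O", "R1=O;R2=C", "R1=C;R2=O");
        System.out.println(distinct(hits));
    }
}
```

The pairwise comparison is quadratic in the number of hits, which should be acceptable for the handful of decompositions a single target typically yields.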
       
    • Get non-scaffold ligands, standardize both query and targets, output ligands in first decomposition:
       // standardize query and targets to find hits
       // in most cases aromatization only will be sufficient
       Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
      
       RGroupDecomposition rgd = new RGroupDecomposition();
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
      
       // standardize and set query
       st.standardize(query);
       rgd.setQuery(query);
      
       // standardize and set target
       st.standardize(target);
       rgd.setTarget(target);
      
       // process decomposition, output ligands
       Decomposition d = rgd.findDecomposition();
       if (d != null) {
           Molecule[] ligands = d.getLigands();
           for (Molecule ligand : ligands) {
               if (ligand != null) {
                   System.out.println(ligand.toFormat("smiles"));
               }
           }
       }
       
    • Color decomposition in atom sets, align target and ligands according to query, output in MRV:
       RGroupDecomposition rgd = new RGroupDecomposition();
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_POINT);
       rgd.setAlign(true);
       rgd.setQuery(query);
       rgd.setTarget(target);
       MolExporter exporter = new MolExporter(System.out, "mrv:-a");
      
       Decomposition d = rgd.findDecomposition();
       if (d != null) {
           d.color();
           Molecule[] ligands = d.getLigands();
           Molecule scaffold = d.getScaffold();
           for (Molecule ligand : ligands) {
               if (ligand != null) {
                   exporter.write(ligand);
               } else {
                   exporter.write(scaffold);
               }
           }
       }
      
       exporter.close();
       
    • Image export with target coloring:
       // standardize query and targets to find hits
       // in most cases aromatization only will be sufficient
       Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
      
       // init RGroupDecomposition
       RGroupDecomposition rgd = new RGroupDecomposition();
      
       // set target and ligand alignment according to query
       rgd.setAlign(true);
      
       // set attachment data in atom labels as "R1", "R2", ...
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_RLABEL);
      
       // standardize and set query
       st.standardize(query);
       rgd.setQuery(query);
      
       // standardize and set target
       st.standardize(target);
       rgd.setTarget(target);
      
       // process decompositions
       int index = 0;
       Decomposition d = null;
       if ((d = rgd.findDecomposition()) != null) {
           ++index;
      
           // color in atom sets
           d.color();
      
           // get the colored target
           Molecule mol = d.getTarget();
      
           // convert to PNG with default set colors
           // 'mono' option is needed to suppress the default CPK atom coloring
           byte[] png = mol.toBinFormat("png:-a,mono,setcolors");
      
           // write to file
           BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("img-"+index+".png"));
           os.write(png, 0, png.length);
           os.flush();
           os.close();
      
           // get the query from the decomposition (possibly with automatically added R-atoms):
           Molecule rgquery = d.getQuery();
      
           // get the colored ligands
           Molecule[] ligands = d.getLigands();
           for (int i=0; i < ligands.length; ++i) {
               if (ligands[i] != null) {
                   png = ligands[i].toBinFormat("png:-a,mono,setcolors");
                   os = new BufferedOutputStream(
                           new FileOutputStream("img-"+index+"-R"+rgquery.getAtom(i).getRgroup()+".png"));
                   os.write(png, 0, png.length);
                   os.flush();
                   os.close();
               }
           }
       }
       
    • Get a colored ligand table with coloring data set in the "DMAP" SDF tag:
       RGroupDecomposition rgd = new RGroupDecomposition();
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_ATOM); // attachment type: any-atoms
       rgd.setQuery(query);
       rgd.setTarget(target);
       Molecule[][] ligandTable = rgd.findLigandTable(RGroupDecomposition.HEADER_RGROUP,
           RGroupDecomposition.COL_MOLECULE | RGroupDecomposition.COL_SCAFFOLD, "DMAP");
       
    • Get a colored ligand table for different targets, one row for each target corresponding to the first decomposition (note that this is null if there is no hit), align target and ligands according to query, coloring data set in atom sets (for MRV output):
       RGroupDecomposition rgd = new RGroupDecomposition();
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
       rgd.setAlign(true);
       rgd.setQuery(query);
       Molecule[][] ligandTable = new Molecule[targets.length+1][];
       ligandTable[0] = rgd.getLigandTableHeader(RGroupDecomposition.HEADER_MAP,
           RGroupDecomposition.COL_MOLECULE);
       for (int i=0; i < targets.length; ++i) {
           rgd.setTarget(targets[i]);
           ligandTable[i+1] = rgd.findLigandTableRow(RGroupDecomposition.COL_MOLECULE);
       }
       
    • Get ligands corresponding to R1 atoms:
       // standardize query and targets to find hits
       // in most cases this is not needed, since the
       // default standardization (aromatization only) will be sufficient
       Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
      
       // init RGroupDecomposition, set query
       RGroupDecomposition rgd = new RGroupDecomposition();
       rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
       rgd.getSearchOptions().setRLigandEqualityCheck(false);
      
       // standardize and set query
       st.standardize(query);
       rgd.setQuery(query);
      
       // standardize and set target
       st.standardize(target);
       rgd.setTarget(target);
      
       // process decomposition
       Decomposition d = rgd.findDecomposition();
       if (d != null) {
      
           // get query from decomposition, possibly R-atoms added automatically:
           Molecule rgquery = d.getQuery();
      
           // get the ligands:
           Molecule[] ligands = d.getLigands();
      
           // look for R1 ligands:
           for (int i=0; i < ligands.length; ++i) {
               if ((rgquery.getAtom(i).getAtno() == MolAtom.RGROUP) &&
                   (rgquery.getAtom(i).getRgroup() == 1)) {
                   if (ligands[i] != null) {
                       System.out.println(ligands[i].toFormat("smiles"));
                   }
               }
           }
       }
       
    Since:
    JChem 3.0
    • Field Detail

      • COL_MOLECULE

        public static final int COL_MOLECULE
        Constant for adding query-target column to ligand table.
        See Also:
        Constant Field Values
      • COL_SCAFFOLD

        public static final int COL_SCAFFOLD
        Constant for adding scaffold column to ligand table.
        See Also:
        Constant Field Values
      • HEADER_NONE

        public static final int HEADER_NONE
        Constant for header type: no header.
        See Also:
        Constant Field Values
      • HEADER_RGROUP

        public static final int HEADER_RGROUP
        Constant for header type: header with Rgroup atoms (cannot be exported to SMILES).
        See Also:
        Constant Field Values
      • HEADER_MAP

        public static final int HEADER_MAP
        Constant for header type: header with any atoms mapped by Rgroup indexes.
        See Also:
        Constant Field Values
      • SEPARATOR

        public static final String SEPARATOR
        Separator in atom color code property string representation.
        See Also:
        Constant Field Values
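
The examples on this page combine the COL_* constants with a bitwise OR (COL_MOLECULE | COL_SCAFFOLD), which suggests they act as bit flags. The sketch below shows how such an addCols value can be tested and used to work out how many optional columns precede the ligand columns. The numeric values 1 and 2 are assumptions chosen for illustration (the real values are listed under "Constant Field Values"), and the class ColumnFlagsSketch with its methods is hypothetical.

```java
// Stand-in constants: the values 1 and 2 are assumptions for illustration,
// not the real RGroupDecomposition constant values.
final class ColumnFlagsSketch {
    static final int COL_MOLECULE = 1;
    static final int COL_SCAFFOLD = 2;

    // True if every bit of 'flag' is set in 'addCols'.
    static boolean has(int addCols, int flag) {
        return (addCols & flag) == flag;
    }

    // Number of optional leading columns (query/target, scaffold)
    // that come before the ligand columns in a table row.
    static int leadingColumns(int addCols) {
        int n = 0;
        if (has(addCols, COL_MOLECULE)) n++;
        if (has(addCols, COL_SCAFFOLD)) n++;
        return n;
    }

    public static void main(String[] args) {
        int addCols = COL_MOLECULE | COL_SCAFFOLD;
        System.out.println(leadingColumns(addCols));
    }
}
```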
    • Constructor Detail

      • RGroupDecomposition

        public RGroupDecomposition()
        Constructor.
    • Method Detail

      • addRGroupsInPlaceOfImplHs

        public static Set<MolAtom> addRGroupsInPlaceOfImplHs​(Molecule query)
        Adds different rgroup atoms connected by any-bonds to query molecule in place of all implicit H-s. In this way ligands will be accepted and stored at all possible attachments. As of JChem 5.3, these R-atoms are added automatically if the query does not contain undefined R-atoms. In this case R-atoms are allowed to match the empty set. Call this before setQuery(chemaxon.struc.Molecule).
        Parameters:
        query - is the query molecule to be set
        Returns:
        those RAtom neighbours on which no s* property can be put because the RAtom can match on hydrogen.
      • addRGroupsInPlaceOfImplHs

        public static Set<MolAtom> addRGroupsInPlaceOfImplHs​(Molecule query,
                                                             int bondType)
        Adds different rgroup atoms connected by the specified bond type to query molecule in place of all implicit H-s. Bond types: see constants in MolBond. In this way ligands will be accepted and stored at all possible attachments. Call this before setQuery(chemaxon.struc.Molecule).
        Parameters:
        query - is the query molecule to be set
        bondType - is the attachment bond type
        Returns:
      • setAlign

        public void setAlign​(boolean align)
        Sets alignment. Default: false.
        Parameters:
        align - is true if ligands and target should be aligned (cleaned)
        Since:
        JChem 5.3
      • isMarkushQuery

        public boolean isMarkushQuery()
        Returns true if query contains Markush features which will be enumerated (all features except defined R-groups and homologies).
        Returns:
        true if query contains Markush features other than defined R-groups and homologies
      • setQuery

        public void setQuery​(Molecule mol)
        Specifies the query structure to search for. If the query does not contain undefined R-atoms then addRGroupsInPlaceOfImplHs(chemaxon.struc.Molecule) is called to put R-atoms in place of implicit hydrogens - in this case undefined R-atoms are allowed to match the empty set, apart from heavy atom groups or hydrogen atoms. If the query contains R-atoms, these are allowed to match heavy atom groups or hydrogen atoms, but not the empty set.
        Specified by:
        setQuery in interface SimpleSearcher
        Overrides:
        setQuery in class MolSearch
        Parameters:
        mol - the standardized query structure. See note on aromatic bonds.
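
The rule for the default R-atom matching behavior described above can be summarized in a few lines. This is only a sketch of the decision: the enum mirrors the names of the SearchConstants values mentioned on this page but is a stand-in, and DefaultRMatchingSketch and defaultFor are hypothetical names.

```java
// Stand-in sketch of the default-selection rule; the enum mirrors the names
// of the SearchConstants values but is not the JChem type.
final class DefaultRMatchingSketch {

    enum RMatching { GROUP_H, GROUP_H_EMPTY }

    // Query already has undefined R-atoms: they match heavy atom groups or hydrogens.
    // Otherwise R-atoms are added automatically and may also match the empty set.
    static RMatching defaultFor(boolean queryHasUndefinedRAtoms) {
        return queryHasUndefinedRAtoms ? RMatching.GROUP_H : RMatching.GROUP_H_EMPTY;
    }

    public static void main(String[] args) {
        System.out.println(defaultFor(true));   // GROUP_H
        System.out.println(defaultFor(false));  // GROUP_H_EMPTY
    }
}
```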
      • getQuery

        public Molecule getQuery()
        Retrieves the original query structure.
        Overrides:
        getQuery in class MolSearch
        Returns:
        the original query structure
      • getRGroupedQuery

        public Molecule getRGroupedQuery()
        Returns the R-grouped query. This is the original query if it contains undefined R-atoms, otherwise it is its R-grouped version. This is the query structure used in the search process.
        Returns:
        the R-grouped query
        Since:
        JChem 5.3
      • findLigandTable

        public Molecule[][] findLigandTable​(int headerType,
                                            int addCols)
                                     throws SearchException
        Returns the ligand table. Format:
         query    scaffold        R1         R2         R3    ...
         ---------------------------------------------------------
         target   scaffold1    ligand1.1  ligand1.2  ligand1.3
         target   scaffold2    ligand2.1  ligand2.2  ligand2.3
         
        in a Molecule[][] array (table):
        • table[0]: the table header with Rgroups as 1-atom molecules
        • table[i], i > 0: target molecule, scaffold, ligands
        If there are more Rgroup ligands with the same rgroup index in the query (e.g. multiple R1 groups) then corresponding ligands will be fused. If there is no Rgroup ligand with an rgroup index then the corresponding table entry is null. Color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map to enable SMILES export.
        Parameters:
        headerType - is the header type:
        addCols - additional columns to include:
        Returns:
        the molecule table
        Throws:
        SearchException - on search error
        Since:
        JChem 5.3
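
Because table entries can be null when no ligand exists for an R-group index, code that walks the table needs a null check per cell. The following sketch shows one way to render such a table as tab-separated text. String cells stand in for Molecule objects (for example, SMILES strings exported from each cell), and LigandTableSketch with its methods is hypothetical.

```java
import java.util.StringJoiner;

// Stand-in sketch: String cells take the place of Molecule objects.
final class LigandTableSketch {

    // Renders one row as tab-separated text; null cells become "-".
    static String renderRow(String[] row) {
        StringJoiner joiner = new StringJoiner("\t");
        for (String cell : row) {
            joiner.add(cell == null ? "-" : cell);
        }
        return joiner.toString();
    }

    // Renders the whole table, header row (index 0) first.
    static String render(String[][] table) {
        StringJoiner lines = new StringJoiner("\n");
        for (String[] row : table) {
            lines.add(renderRow(row));
        }
        return lines.toString();
    }

    public static void main(String[] args) {
        String[][] table = {
            {"query", "scaffold", "R1", "R2"},
            {"target", "scaffold1", "CC", null},   // no R2 ligand in this row
        };
        System.out.println(render(table));
    }
}
```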
      • findLigandTable

        public Molecule[][] findLigandTable​(int headerType,
                                            int addCols,
                                            String colorTag)
                                     throws SearchException
        Returns the ligand table. Format:
         query    scaffold        R1         R2         R3    ...
         ---------------------------------------------------------
         target   scaffold1    ligand1.1  ligand1.2  ligand1.3
         target   scaffold2    ligand2.1  ligand2.2  ligand2.3
         
        in a Molecule[][] array (table):
        • table[0]: the table header with Rgroups as 1-atom molecules
        • table[i], i > 0: target molecule, scaffold, ligands
        If there are more Rgroup ligands with the same rgroup index in the query (e.g. multiple R1 groups) then corresponding ligands will be fused. If there is no Rgroup ligand with an rgroup index then the corresponding table entry is null. If the colorTag is specified then a color map symbol is set in the molecule property for each atom, separated by SEPARATOR characters:
        • the least ligand attachment query atom Rgroup index for ligand atoms
        • 0 for scaffold atoms
        You can run MView with setting atom colors from the SDF tag DMAP, symbol - color pairs given in file Colors.ini:
         mview -t DMAP -p Colors.ini ligands.sdf
         
        By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map to enable SMILES export.
        Parameters:
        headerType - is the header type:
        addCols - additional columns to include:
        colorTag - is the molecule property name for setting the color map, null if color should be stored in atom sets
        Returns:
        the molecule table
        Throws:
        SearchException - on search error
        Since:
        JChem 5.3
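
When a colorTag is used, the property value holds one symbol per atom: 0 for scaffold atoms and an R-group index for ligand atoms. The sketch below parses such a value and counts atoms per symbol. It assumes every symbol is a decimal integer and that the value is non-empty. The separator ";" in the demo is an assumption (real code would pass RGroupDecomposition.SEPARATOR), and ColorMapSketch with its methods is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical parser for a color map property value; not part of JChem.
final class ColorMapSketch {

    // Splits a color map property value into one symbol per atom.
    static int[] parse(String colorMap, String separator) {
        String[] parts = colorMap.split(Pattern.quote(separator));
        int[] symbols = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            symbols[i] = Integer.parseInt(parts[i].trim());
        }
        return symbols;
    }

    // Counts atoms per symbol: key 0 is the scaffold, key n the Rn ligand.
    static Map<Integer, Integer> atomsPerSymbol(int[] symbols) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int s : symbols) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // ";" is an assumed separator, used here only for the demo input.
        int[] symbols = parse("0;0;0;1;1;2", ";");
        System.out.println(atomsPerSymbol(symbols));
    }
}
```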
      • findLigandTableRow

        public Molecule[] findLigandTableRow​(int addCols)
                                      throws SearchException
        Returns a ligand table row with ligands corresponding to the first search hit. Format:
         target    scaffold        ligand R1     ligand R2     ligand R3    ...
         
        The target and the scaffold columns are optional. The color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Attachment points in ligands are represented according to attachment type set in setAttachmentType(int).
        Parameters:
        addCols - additional columns to include:
        Returns:
        the table row
        Throws:
        SearchException - on search error
        Since:
        JChem 5.3
      • findLigandTableRow

        public Molecule[] findLigandTableRow​(int addCols,
                                             String colorTag)
                                      throws SearchException
        Returns a ligand table row with ligands corresponding to the first search hit. Format:
         target    scaffold        ligand R1     ligand R2     ligand R3    ...
         
        The target and the scaffold columns are optional. By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Attachment points in ligands are represented according to attachment type set in setAttachmentType(int).
        Parameters:
        addCols - additional columns to include:
        colorTag - is the molecule property name for setting the color map, null if color should be stored in atomsets
        Returns:
        the table row
        Throws:
        SearchException - on search error
        Since:
        JChem 5.3
      • getLigandTableHeader

        public Molecule[] getLigandTableHeader​(int headerType,
                                               int addCols)
        Returns ligand table header. Format:
         query    scaffold        R1         R2         R3    ...
         
        The query and the scaffold columns are optional. Color maps are stored in atom sets, which can be written in MRV format and are automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map or by rgroup atoms.
        Parameters:
        headerType - is the header type:
        • HEADER_RGROUP for header with rgroup atoms (cannot be exported to SMILES)
        • HEADER_MAP for mapped any atoms (for SMILES export)
        addCols - additional columns to include:
        Since:
        JChem 5.1.1
      • getLigandTableHeader

        public Molecule[] getLigandTableHeader​(int headerType,
                                               int addCols,
                                               String colorTag)
        Returns ligand table header. Format:
         query    scaffold        R1         R2         R3    ...
         
        The query and the scaffold columns are optional. By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map or by rgroup atoms.
        Parameters:
        headerType - is the header type:
        • HEADER_RGROUP for header with rgroup atoms (cannot be exported to SMILES)
        • HEADER_MAP for mapped any atoms (for SMILES export)
        addCols - additional columns to include:
        colorTag - is the molecule property name for setting the color map, null if color should be stored in atomsets
      • isLicensed

        public boolean isLicensed()
        Specified by:
        isLicensed in interface chemaxon.license.Licensable
        Overrides:
        isLicensed in class MolSearch
      • setLicenseEnvironment

        public void setLicenseEnvironment​(String env)
        Specified by:
        setLicenseEnvironment in interface chemaxon.license.Licensable
        Overrides:
        setLicenseEnvironment in class MolSearch
      • setSearchOptions

        public void setSearchOptions​(MolSearchOptions options)
        Description copied from class: MolSearch
        Sets search options. This function makes a copy of the given search options object, thus modification of the original object does not affect future searches unless this method is called again.
        Overrides:
        setSearchOptions in class MolSearch
        Parameters:
        options - search options. A copy of this object will be stored.
        See Also:
        Search.getSearchOptions()
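
The copy-on-set behavior described for setSearchOptions can be illustrated with stand-in types. This is a sketch of the general pattern only: Options, Searcher and the exactMatch field are hypothetical and do not correspond to JChem classes or options.

```java
// Stand-in types illustrating copy-on-set semantics; they are not JChem classes.
final class CopyOnSetSketch {

    static final class Options {
        boolean exactMatch;

        Options() { }

        Options(Options other) {          // copy constructor
            this.exactMatch = other.exactMatch;
        }
    }

    static final class Searcher {
        private Options options = new Options();

        // Stores a copy, so later changes to 'given' are not seen here.
        void setOptions(Options given) {
            this.options = new Options(given);
        }

        boolean usesExactMatch() {
            return options.exactMatch;
        }
    }

    public static void main(String[] args) {
        Options opts = new Options();
        Searcher searcher = new Searcher();
        searcher.setOptions(opts);

        opts.exactMatch = true;                          // not seen by the searcher
        System.out.println(searcher.usesExactMatch());   // false

        searcher.setOptions(opts);                       // now the change is applied
        System.out.println(searcher.usesExactMatch());   // true
    }
}
```

The usage examples on this page take the other route and modify the object returned by getSearchOptions() directly.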