Class RGroupDecomposition

All Implemented Interfaces:
chemaxon.license.Licensable, SearchConstants, StereoConstants, SimpleSearcher

@PublicAPI public class RGroupDecomposition extends StandardizedMolSearch implements chemaxon.license.Licensable

R-group decomposition.

Given a scaffold structure with attached R1, R2, ... nodes as query, determines the possible R-group decompositions of given target molecule. A decomposition consists of the matching scaffold and the R1, R2, ... ligands with attachment points. Each decomposition corresponds to a group hit, see Search.findFirstGroup() and Search.findNextGroup().

After setting the query, the target, and possibly the search options, call findFirstDecomposition() and findNextDecomposition() to get the decompositions. Call Decomposition.equals(java.lang.Object) to filter equivalent decompositions that provide the same ligands (but correspond to different group hits). Color the decomposition by calling Decomposition.color() or Decomposition.color(java.lang.String).

To create a result table with all decompositions right away, call findLigandTable(int, int, java.lang.String). Alternatively, get the table header with query, target and R-atoms by calling getLigandTableHeader(int, int, java.lang.String), and then add a single table row corresponding to the first decomposition by calling findLigandTableRow(int, java.lang.String). Different ligand attachment types can be set in setAttachmentType(int).

As of JChem 5.3, extra ligands without a matching R-atom are not allowed. Also as of JChem 5.3, a query without R-atoms is automatically modified in setQuery(chemaxon.struc.Molecule): R-atoms are added in place of implicit hydrogens by addRGroupsInPlaceOfImplHs(chemaxon.struc.Molecule). The R-grouped query can be retrieved by getRGroupedQuery().

If the original query contains R-atoms then the default R-atom matching behavior is SearchConstants.UNDEF_R_MATCHING_GROUP_H, otherwise the default matching behavior of the automatically added R-atoms is SearchConstants.UNDEF_R_MATCHING_GROUP_H_EMPTY (the empty set matching is allowed here because we expect that ligands are attached to some of the implicit hydrogens in the original query, but not necessarily all implicit hydrogens have corresponding ligands).

Search options:

The options below have different default values than in MolSearch:

API usage examples:

query: the query molecule
target: the target molecule

  • Set parameters, get decomposition results, color target according to decomposition in SDF tag:
     RGroupDecomposition rgd = new RGroupDecomposition();
    
     // set search options
     rgd.getSearchOptions().setRLigandEqualityCheck(false);
     rgd.getSearchOptions().setBridgingRAllowed(false);
     rgd.getSearchOptions().setUndefinedRAtom(SearchOptions.UNDEF_R_MATCHING_GROUP);
    
     // set query and target
     rgd.setQuery(query);
     rgd.setTarget(target);
    
     // find decompositions, output colored and aligned target
     MolExporter exporter = new MolExporter(System.out, "sdf:-a");
     ArrayList<Decomposition> list = new ArrayList<>();
     Decomposition d = rgd.findFirstDecomposition();
     while (d != null) {
         boolean duplicate = false;
         for (int i=list.size()-1; i >= 0; --i) {
             if (d.equals(list.get(i))) {
                 duplicate = true;
                 break;
             }
         }
         if (!duplicate) {
             list.add(d);
             d.color("DMAP");
             exporter.write(d.getTarget());
         }
         d = rgd.findNextDecomposition();
     }
     exporter.close();
     
  • Get non-scaffold ligands, standardize both query and targets, output ligands in first decomposition:
     // standardize query and targets to find hits
     // in most cases aromatization only will be sufficient
     Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
    
     RGroupDecomposition rgd = new RGroupDecomposition();
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
    
     // standardize and set query
     st.standardize(query);
     rgd.setQuery(query);
    
     // standardize and set target
     st.standardize(target);
     rgd.setTarget(target);
    
     // process decomposition, output ligands
     Decomposition d = rgd.findDecomposition();
     if (d != null) {
         Molecule[] ligands = d.getLigands();
         for (Molecule ligand : ligands) {
             if (ligand != null) {
                 System.out.println(ligand.toFormat("smiles"));
             }
         }
     }
     
  • Color decomposition in atom sets, align target and ligands according to query, output in MRV:
     RGroupDecomposition rgd = new RGroupDecomposition();
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_POINT);
     rgd.setAlign(true);
     rgd.setQuery(query);
     rgd.setTarget(target);
     MolExporter exporter = new MolExporter(System.out, "mrv:-a");
    
     Decomposition d = rgd.findDecomposition();
     if (d != null) {
         d.color();
         Molecule[] ligands = d.getLigands();
         Molecule scaffold = d.getScaffold();
         for (Molecule ligand : ligands) {
              if (ligand != null) {
                  exporter.write(ligand);
              } else {
                  exporter.write(scaffold);
              }
         }
     }
    
     exporter.close();
     
  • Image export with target coloring:
     // standardize query and targets to find hits
     // in most cases aromatization only will be sufficient
     Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
    
     // init RGroupDecomposition
     RGroupDecomposition rgd = new RGroupDecomposition();
    
     // set target and ligand alignment according to query
     rgd.setAlign(true);
    
     // set attachment data in atom labels as "R1", "R2", ...
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_RLABEL);
    
     // standardize and set query
     st.standardize(query);
     rgd.setQuery(query);
    
     // standardize and set target
     st.standardize(target);
     rgd.setTarget(target);
    
     // process decompositions
     int index = 0;
     Decomposition d = null;
     if ((d = rgd.findDecomposition()) != null) {
         ++index;
    
         // color in atom sets
         d.color();
    
         // get the colored target
         Molecule mol = d.getTarget();
    
         // convert to PNG with default set colors
          // 'mono' option is needed to suppress the default CPK atom coloring
         byte[] png = mol.toBinFormat("png:-a,mono,setcolors");
    
         // write to file
         BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("img-"+index+".png"));
         os.write(png, 0, png.length);
         os.flush();
         os.close();
    
         // get the query from the decomposition (possibly with automatically added R-atoms):
         Molecule rgquery = d.getQuery();
    
         // get the colored ligands
         Molecule[] ligands = d.getLigands();
         for (int i=0; i < ligands.length; ++i) {
              if (ligands[i] != null) {
                  png = ligands[i].toBinFormat("png:-a,mono,setcolors");
                  os = new BufferedOutputStream(
                          new FileOutputStream("img-"+index+"-R"+rgquery.getAtom(i).getRgroup()+".png"));
                  os.write(png, 0, png.length);
                  os.flush();
                  os.close();
              }
         }
     }
     
  • Get a colored ligand table with coloring data set in the "DMAP" SDF tag:
     RGroupDecomposition rgd = new RGroupDecomposition();
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_ATOM); // attachment type: any-atoms
     rgd.setQuery(query);
     rgd.setTarget(target);
     Molecule[][] ligandTable = rgd.findLigandTable(RGroupDecomposition.HEADER_RGROUP,
         RGroupDecomposition.COL_MOLECULE | RGroupDecomposition.COL_SCAFFOLD, "DMAP");
     
  • Get a colored ligand table for different targets, one row for each target corresponding to the first decomposition (note that a row is null if there is no hit), align target and ligands according to query, coloring data set in atom sets (for MRV output):
     RGroupDecomposition rgd = new RGroupDecomposition();
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
     rgd.setAlign(true);
     rgd.setQuery(query);
     Molecule[][] ligandTable = new Molecule[targets.length+1][];
     ligandTable[0] = rgd.getLigandTableHeader(RGroupDecomposition.HEADER_MAP,
         RGroupDecomposition.COL_MOLECULE);
     for (int i=0; i < targets.length; ++i) {
         rgd.setTarget(targets[i]);
         ligandTable[i+1] = rgd.findLigandTableRow(RGroupDecomposition.COL_MOLECULE);
     }
     
  • Get ligands corresponding to R1 atoms:
     // standardize query and targets to find hits
     // in most cases this is not needed, since the
     // default standardization (aromatization only) will be sufficient
     Standardizer st = new Standardizer("aromatize..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]");
    
     // init RGroupDecomposition, set query
     RGroupDecomposition rgd = new RGroupDecomposition();
     rgd.setAttachmentType(SearchConstants.ATTACHMENT_MAP);
     rgd.getSearchOptions().setRLigandEqualityCheck(false);
    
     // standardize and set query
     st.standardize(query);
     rgd.setQuery(query);
    
     // standardize and set target
     st.standardize(target);
     rgd.setTarget(target);
    
     // process decomposition
     Decomposition d = rgd.findDecomposition();
     if (d != null) {
    
         // get query from decomposition, possibly R-atoms added automatically:
         Molecule rgquery = d.getQuery();
    
         // get the ligands:
         Molecule[] ligands = d.getLigands();
    
         // look for R1 ligands:
         for (int i=0; i < ligands.length; ++i) {
             if ((rgquery.getAtom(i).getAtno() == MolAtom.RGROUP) &&
                 (rgquery.getAtom(i).getRgroup() == 1)) {
                 if (ligands[i] != null) {
                     System.out.println(ligands[i].toFormat("smiles"));
                 }
             }
         }
     }
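
The first/next loop with equals-based duplicate filtering used in the first example can be factored into a small generic helper. The following is a minimal sketch in plain Java with no ChemAxon dependency; DecompositionLoop, Finder and collectDistinct are hypothetical names introduced for illustration and are not part of the JChem API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the JChem API. */
final class DecompositionLoop {

    /** A search step that may throw a checked exception of type E. */
    interface Finder<T, E extends Exception> {
        T find() throws E;
    }

    /**
     * Calls first once, then next repeatedly until null is returned,
     * keeping only results that are not equals() to an earlier result.
     */
    static <T, E extends Exception> List<T> collectDistinct(
            Finder<T, E> first, Finder<T, E> next) throws E {
        List<T> distinct = new ArrayList<>();
        for (T hit = first.find(); hit != null; hit = next.find()) {
            // Linear scan on purpose: it relies only on equals(), so it does
            // not require hashCode() to be consistent with equals().
            if (!distinct.contains(hit)) {
                distinct.add(hit);
            }
        }
        return distinct;
    }
}
```

Assuming the method signatures listed in this page, a call would presumably look like DecompositionLoop.collectDistinct(rgd::findFirstDecomposition, rgd::findNextDecomposition), with E inferred as SearchException.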
     
Since:
JChem 3.0
  • Field Details

    • COL_MOLECULE

      public static final int COL_MOLECULE
      Constant for adding query-target column to ligand table.
    • COL_SCAFFOLD

      public static final int COL_SCAFFOLD
      Constant for adding scaffold column to ligand table.
    • HEADER_NONE

      public static final int HEADER_NONE
      Constant for header type: no header.
    • HEADER_RGROUP

      public static final int HEADER_RGROUP
      Constant for header type: header with Rgroup atoms (cannot be exported to SMILES).
    • HEADER_MAP

      public static final int HEADER_MAP
      Constant for header type: header with any atoms mapped by Rgroup indexes.
    • SEPARATOR

      public static final String SEPARATOR
      Separator in atom color code property string representation.
  • Constructor Details

    • RGroupDecomposition

      public RGroupDecomposition()
      Constructor.
  • Method Details

    • setAttachmentType

      public void setAttachmentType(int type)
      Parameters:
      type - is the attachment point representation type
    • addRGroupsInPlaceOfImplHs

      public static Set<MolAtom> addRGroupsInPlaceOfImplHs(Molecule query)
      Adds different rgroup atoms connected by any-bonds to query molecule in place of all implicit H-s. In this way ligands will be accepted and stored at all possible attachments. As of JChem 5.3, these R-atoms are added automatically if the query does not contain undefined R-atoms. In this case R-atoms are allowed to match the empty set. Call this before setQuery(chemaxon.struc.Molecule).
      Parameters:
      query - is the query molecule to be set
      Returns:
      the R-atom neighbours on which no s* property can be set, because the R-atom can match a hydrogen.
    • addRGroupsInPlaceOfImplHs

      public static Set<MolAtom> addRGroupsInPlaceOfImplHs(Molecule query, int bondType)
      Adds different rgroup atoms connected by the specified bond type to query molecule in place of all implicit H-s. Bond types: see constants in MolBond. In this way ligands will be accepted and stored at all possible attachments. Call this before setQuery(chemaxon.struc.Molecule).
      Parameters:
      query - is the query molecule to be set
      bondType - is the attachment bond type
    • setAlign

      public void setAlign(boolean align)
      Sets alignment. Default: false.
      Parameters:
      align - is true if ligands and target should be aligned (cleaned)
      Since:
      JChem 5.3
    • isMarkushQuery

      public boolean isMarkushQuery()
      Returns true if query contains Markush features which will be enumerated (all features except defined R-groups and homologies).
      Returns:
      true if query contains Markush features other than defined R-groups and homologies
    • setQuery

      public void setQuery(Molecule mol)
      Specifies the query structure to search for. If the query does not contain undefined R-atoms then addRGroupsInPlaceOfImplHs(chemaxon.struc.Molecule) is called to put R-atoms in place of implicit hydrogens - in this case undefined R-atoms are allowed to match the empty set, apart from heavy atom groups or hydrogen atoms. If the query contains R-atoms, these are allowed to match heavy atom groups or hydrogen atoms, but not the empty set.
      Specified by:
      setQuery in interface SimpleSearcher
      Overrides:
      setQuery in class MolSearch
      Parameters:
      mol - the standardized query structure. See note on aromatic bonds.
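
The matching rule described above can be summarized as a small decision function. This is a hedged sketch that only models the rule as stated in this description; DefaultRMatching, LigandKind and accepts are hypothetical names and do not exist in JChem.

```java
/** Hypothetical model of the rule described for setQuery; not JChem code. */
final class DefaultRMatching {

    /** What an undefined R-atom is matched against in the target. */
    enum LigandKind { HEAVY_ATOM_GROUP, HYDROGEN, EMPTY_SET }

    /**
     * @param queryHadRAtoms true if the original query already contained
     *                       R-atoms, false if they were added automatically
     *                       in place of implicit hydrogens
     * @param kind           what the R-atom would be matched against
     * @return true if the match is allowed under the default behavior
     */
    static boolean accepts(boolean queryHadRAtoms, LigandKind kind) {
        if (kind == LigandKind.EMPTY_SET) {
            // Only automatically added R-atoms may match the empty set.
            return !queryHadRAtoms;
        }
        // Heavy atom groups and hydrogens are accepted in both cases.
        return true;
    }
}
```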
    • setTarget

      public void setTarget(Molecule mol)
      Specifies the target molecule to search in.
      Specified by:
      setTarget in interface SimpleSearcher
      Overrides:
      setTarget in class MolSearch
      Parameters:
      mol - the possibly standardized target molecule. See note on aromatic bonds.
    • getQuery

      public Molecule getQuery()
      Retrieves the original query structure.
      Overrides:
      getQuery in class MolSearch
      Returns:
      the original query structure
    • getRGroupedQuery

      public Molecule getRGroupedQuery()
      Returns the R-grouped query. This is the original query if it contains undefined R-atoms, otherwise it is its R-grouped version. This is the query structure used in the search process.
      Returns:
      the R-grouped query
      Since:
      JChem 5.3
    • getRLigandCount

      public int getRLigandCount() throws IllegalStateException
      Returns the number of R-atoms in the R-grouped query (see getRGroupedQuery()). Should be called after the query is set by setQuery(chemaxon.struc.Molecule). This is the number of R-ligands in the decompositions returned by findDecomposition().
      Returns:
      the number of R-atoms in the R-grouped query
      Throws:
      IllegalStateException - if the query is not set
      Since:
      JChem 5.12
    • findFirstDecomposition

      public Decomposition findFirstDecomposition() throws SearchException
      Finds the first decomposition result.
      Returns:
      the decomposition corresponding to the first group hit, or null if there is no hit
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
    • findNextDecomposition

      public Decomposition findNextDecomposition() throws SearchException
      Finds the next decomposition result.
      Returns:
      the decomposition corresponding to the next group hit, or null if there is no hit
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
    • findDecomposition

      public Decomposition findDecomposition() throws SearchException
      Finds the first decomposition result. Just a shorthand for findFirstDecomposition().
      Returns:
      the decomposition corresponding to the first group hit, or null if there is no hit
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
    • findDecomposition

      public Decomposition findDecomposition(boolean first) throws SearchException
      Finds a decomposition result.
      Parameters:
      first - true if first decomposition, false if next decomposition
      Returns:
      the decomposition corresponding to the first/next group hit, or null if there is no hit
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
    • findLigandTable

      public Molecule[][] findLigandTable(int headerType, int addCols) throws SearchException
      Returns the ligand table. Format:
       query    scaffold        R1         R2         R3    ...
       ---------------------------------------------------------
       target   scaffold1    ligand1.1  ligand1.2  ligand1.3
       target   scaffold2    ligand2.1  ligand2.2  ligand2.3
       
      in a Molecule[][] array (table):
      • table[0]: the table header with Rgroups as 1-atom molecules
      • table[i], i > 0: target molecule, scaffold, ligands
      If there are multiple Rgroup ligands with the same rgroup index in the query (e.g. multiple R1 groups) then corresponding ligands will be fused. If there is no Rgroup ligand with an rgroup index then the corresponding table entry is null. Color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map to enable SMILES export.
      Parameters:
      headerType - is the header type:
      addCols - additional columns to include:
      Returns:
      the molecule table
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
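
Code that consumes the table has to handle null entries, since a cell is null when no ligand belongs to that rgroup index. The sketch below walks such a table; it uses String cells (for example SMILES strings) as a stand-in for Molecule so that it runs without ChemAxon classes. LigandTablePrinter and render are hypothetical names.

```java
import java.util.StringJoiner;

/** Hypothetical helper; String cells stand in for Molecule objects. */
final class LigandTablePrinter {

    /** Renders the table as tab-separated lines; null cells become "-". */
    static String render(String[][] table) {
        StringBuilder out = new StringBuilder();
        for (String[] row : table) {
            if (row == null) {
                // A row can be null when it was filled per target and the
                // target had no hit; such rows are skipped here.
                continue;
            }
            StringJoiner line = new StringJoiner("\t");
            for (String cell : row) {
                line.add(cell == null ? "-" : cell);
            }
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```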
    • findLigandTable

      public Molecule[][] findLigandTable(int headerType, int addCols, String colorTag) throws SearchException
      Returns the ligand table. Format:
       query    scaffold        R1         R2         R3    ...
       ---------------------------------------------------------
       target   scaffold1    ligand1.1  ligand1.2  ligand1.3
       target   scaffold2    ligand2.1  ligand2.2  ligand2.3
       
      in a Molecule[][] array (table):
      • table[0]: the table header with Rgroups as 1-atom molecules
      • table[i], i > 0: target molecule, scaffold, ligands
      If there are multiple Rgroup ligands with the same rgroup index in the query (e.g. multiple R1 groups) then corresponding ligands will be fused. If there is no Rgroup ligand with an rgroup index then the corresponding table entry is null. If the colorTag is specified then a color map symbol is set in the molecule property for each atom, separated by SEPARATOR characters:
      • the least ligand attachment query atom Rgroup index for ligand atoms
      • 0 for scaffold atoms
      You can run MView with setting atom colors from the SDF tag DMAP, symbol - color pairs given in file Colors.ini:
       mview -t DMAP -p Colors.ini ligands.sdf
       
      By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map to enable SMILES export.
      Parameters:
      headerType - is the header type:
      addCols - additional columns to include:
      colorTag - is the molecule property name for setting the color map, null if color should be stored in atomsets
      Returns:
      the molecule table
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
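
A color map property of the kind described above (one symbol per atom, 0 for scaffold atoms, an Rgroup index for ligand atoms) can be read back with plain string handling. This is a minimal sketch under two assumptions: the symbols are decimal integers, and the separator is passed in by the caller (with the real API that would presumably be RGroupDecomposition.SEPARATOR, whose value is not stated here). ColorMapParser and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical parser for a per-atom color map string; not JChem code. */
final class ColorMapParser {

    /** Splits the property value into one integer symbol per atom. */
    static int[] parse(String colorMap, String separator) {
        if (colorMap.isEmpty()) {
            return new int[0];
        }
        // Quote the separator so that regex metacharacters are taken literally.
        String[] parts = colorMap.split(Pattern.quote(separator), -1);
        int[] symbols = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            symbols[i] = Integer.parseInt(parts[i].trim());
        }
        return symbols;
    }

    /** Counts atoms per symbol: key 0 is the scaffold, key n is Rgroup n. */
    static Map<Integer, Integer> atomsPerGroup(int[] symbols) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int symbol : symbols) {
            counts.merge(symbol, 1, Integer::sum);
        }
        return counts;
    }
}
```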
    • findLigandTableRow

      public Molecule[] findLigandTableRow(int addCols) throws SearchException
      Returns a ligand table row with ligands corresponding to the first search hit. Format:
       target    scaffold        ligand R1     ligand R2     ligand R3    ...
       
      The target and the scaffold columns are optional. The color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Attachment points in ligands are represented according to attachment type set in setAttachmentType(int).
      Parameters:
      addCols - additional columns to include:
      Returns:
      the table row
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
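
Because the target and scaffold columns are optional, the position of the first ligand column depends on addCols. The sketch below assumes that the column constants are independent bit flags (consistent with the COL_MOLECULE | COL_SCAFFOLD usage in the examples) and that columns follow the order shown in the format line. The flag values 1 and 2 are hypothetical placeholders, not the real constant values, and RowLayout is a hypothetical name.

```java
/** Hypothetical layout helper; flag values are placeholders, not JChem's. */
final class RowLayout {

    // Assumed single-bit flags; the real constant values are not given here.
    static final int COL_MOLECULE = 1;
    static final int COL_SCAFFOLD = 2;

    /** Returns the index of the first ligand column for the given flags. */
    static int firstLigandColumn(int addCols) {
        int index = 0;
        if ((addCols & COL_MOLECULE) != 0) {
            index++; // the target (or query) column comes first
        }
        if ((addCols & COL_SCAFFOLD) != 0) {
            index++; // the scaffold column follows
        }
        return index;
    }
}
```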
    • findLigandTableRow

      public Molecule[] findLigandTableRow(int addCols, String colorTag) throws SearchException
      Returns a ligand table row with ligands corresponding to the first search hit. Format:
       target    scaffold        ligand R1     ligand R2     ligand R3    ...
       
      The target and the scaffold columns are optional. By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Attachment points in ligands are represented according to attachment type set in setAttachmentType(int).
      Parameters:
      addCols - additional columns to include:
      colorTag - is the molecule property name for setting the color map, null if color should be stored in atomsets
      Returns:
      the table row
      Throws:
      SearchException - on search error
      Since:
      JChem 5.3
    • getLigandTableHeader

      public Molecule[] getLigandTableHeader(int headerType, int addCols)
      Returns ligand table header. Format:
       query    scaffold        R1         R2         R3    ...
       
      The query and the scaffold columns are optional. Color maps are stored in atom sets which can be written in MRV format and are automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map or by rgroup atoms.
      Parameters:
      headerType - is the header type:
      • HEADER_RGROUP for header with rgroup atoms (cannot be exported to SMILES)
      • HEADER_MAP for mapped any atoms (for SMILES export)
      addCols - additional columns to include:
      Since:
      JChem 5.1.1
    • getLigandTableHeader

      public Molecule[] getLigandTableHeader(int headerType, int addCols, String colorTag)
      Returns ligand table header. Format:
       query    scaffold        R1         R2         R3    ...
       
      The query and the scaffold columns are optional. By default, color map is stored in atom sets which can be written in MRV format and is automatically handled by MView. See MolAtom.getSetSeq(). Rgroups in the table header are represented by any-atoms with rgroup index set in atom map or by rgroup atoms.
      Parameters:
      headerType - is the header type:
      • HEADER_RGROUP for header with rgroup atoms (cannot be exported to SMILES)
      • HEADER_MAP for mapped any atoms (for SMILES export)
      addCols - additional columns to include:
      colorTag - is the molecule property name for setting the color map, null if color should be stored in atomsets
    • isLicensed

      public boolean isLicensed()
      Specified by:
      isLicensed in interface chemaxon.license.Licensable
      Overrides:
      isLicensed in class MolSearch
    • setLicenseEnvironment

      public void setLicenseEnvironment(String env)
      Specified by:
      setLicenseEnvironment in interface chemaxon.license.Licensable
      Overrides:
      setLicenseEnvironment in class MolSearch
    • setSearchOptions

      public void setSearchOptions(MolSearchOptions options)
      Description copied from class: MolSearch
      Sets search options. This function makes a copy of the given search options object, thus modification of the original object does not affect future searches unless this method is called again.
      Overrides:
      setSearchOptions in class MolSearch
      Parameters:
      options - search options. A copy of this object will be stored.
    • setParameters

      protected void setParameters() throws SearchException
      Overrides:
      setParameters in class MolSearch
      Throws:
      SearchException