@PublicAPI public abstract class MaxCommonSubstructure extends java.lang.Object implements chemaxon.license.Licensable
An instance of the default MCS algorithm implementation can be created using newInstance() or newInstance(McsSearchOptions). The provided MCS algorithms are powerful heuristic methods, which typically find large common substructures in a short time. However, they do not always provide the exact optimal result due to the complexity of the MCS problem (especially for large molecules). Furthermore, as the algorithms perform randomized search, different results might be obtained for equivalent molecule representations.
Warning: MCS algorithms do not perform transformations on the input molecules, so you should be aware of aromatization (and other standardization actions) before using them.
Features: Query and target structures play essentially the same role in MCS search, except for query features: a query molecule may contain generic query atoms (A, Q, M, X, list atom, not list atom, etc.) and query bonds (any, single or double, etc.), but query properties (e.g., valence, hydrogen count) are ignored. If exact query atom/bond matching is set to true, then generic atoms and bonds are allowed in both molecules, but they are matched exactly. Reactions are also supported, but Markush structures are not.
Typical usage:
    MaxCommonSubstructure mcs = MaxCommonSubstructure.newInstance();
    mcs.setMolecules(queryMol, targetMol);
    McsSearchResult result = mcs.find();
    System.out.println("Atoms in MCS: " + result.getAtomCount());
    System.out.println("Bonds in MCS: " + result.getBondCount());
    System.out.println("MCS molecule: " + MolExporter.exportToFormat(result.getAsMolecule(), "smiles"));
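The usage above retrieves a single result with find(). When several mappings are wanted, this page recommends calling hasNextResult() before each nextResult() to avoid an exception. The following is a minimal sketch of that calling pattern only. ResultSource, ListBackedSource and ResultLoopSketch are hypothetical stand-ins written for this illustration (they are not ChemAxon classes), so the loop can be run without the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in mirroring the hasNextResult()/nextResult() pair; not a ChemAxon type. */
interface ResultSource<T> {
    boolean hasNextResult();
    T nextResult();
}

/** Hypothetical in-memory source, used only to make the sketch runnable. */
final class ListBackedSource<T> implements ResultSource<T> {
    private final Iterator<T> it;

    ListBackedSource(List<T> results) {
        this.it = results.iterator();
    }

    @Override
    public boolean hasNextResult() {
        return it.hasNext();
    }

    @Override
    public T nextResult() {
        if (!it.hasNext()) {
            // Mirrors the documented behaviour: no more results means IllegalStateException.
            throw new IllegalStateException("no more results to return");
        }
        return it.next();
    }
}

final class ResultLoopSketch {
    /** Collects at most maxResults results, checking hasNextResult() before every nextResult(). */
    static <T> List<T> collect(ResultSource<T> source, int maxResults) {
        List<T> out = new ArrayList<>();
        while (out.size() < maxResults && source.hasNextResult()) {
            out.add(source.nextResult());
        }
        return out;
    }

    public static void main(String[] args) {
        ResultSource<String> source =
                new ListBackedSource<>(Arrays.asList("mapping-1", "mapping-2", "mapping-3"));
        System.out.println(collect(source, 2));
    }
}
```

Capping the number of collected results is a design choice of this sketch. It keeps the loop bounded when many equivalent mappings exist.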
Modifier and Type | Field and Description |
---|---|
static long | DEFAULT_RANDOM_SEED: Default random seed. |
protected static java.util.logging.Logger | LOG: Logger object. |
protected Molecule | queryMol: Query molecule (it is neither cloned nor modified). |
protected long | randomSeed: Random seed (0 means using the current time). |
protected SearchMode | searchMode: Search mode. |
protected McsSearchOptions | searchOpts: Search options. |
protected Molecule | targetMol: Target molecule (it is neither cloned nor modified). |
protected long | timeLimit: Time limit in milliseconds (-1 means disabled). |
Modifier | Constructor and Description |
---|---|
protected | MaxCommonSubstructure(McsSearchOptions searchOpts) |
Modifier and Type | Method and Description |
---|---|
float | calculateSimilarityUpperBound(): Calculates an upper bound on the similarity of the query and target molecules with respect to the specified search options. |
int | calculateUpperBound(): Calculates an upper bound on the number of bonds the maximum common substructure may contain with respect to the specified search options. |
McsSearchResult | find(): Performs MCS search according to the specified options. |
protected abstract McsSearchResult | findMcs(com.chemaxon.search.sss.Matcher matcher): Finds the MCS and all related data (fragments and mappings). |
Molecule | getQuery(): Gets the query structure. |
long | getRandomSeed(): Gets the random seed value. |
SearchMode | getSearchMode(): Gets the current search mode. |
McsSearchOptions | getSearchOptions(): Returns the search options used by this instance. |
Molecule | getTarget(): Gets the target structure. |
long | getTimeLimit(): Gets the maximum allowed MCS search time in milliseconds. |
boolean | hasNextResult(): Returns whether there are more results available. |
boolean | isLicensed(): Returns information about the licensing of the product. |
static MaxCommonSubstructure | newInstance(): Creates a new instance of MCS search algorithm using the default search options. |
static MaxCommonSubstructure | newInstance(McsSearchOptions searchOpts): Creates a new instance of MCS search algorithm using the given search options. |
McsSearchResult | nextResult(): Finds the next MCS search result according to the specified options. |
void | setLicenseEnvironment(java.lang.String env): Sets the license environment. |
void | setMolecules(Molecule query, Molecule target): Sets the two molecular structures to be matched. |
void | setQuery(Molecule query): Sets the query structure. |
void | setRandomSeed(long seed): Sets the random seed value. |
void | setSearchMode(SearchMode mode): Sets the search mode that controls the running time and the accuracy of the algorithm. |
void | setTarget(Molecule target): Sets the target structure. |
void | setTimeLimit(long maxMilliseconds): Sets the maximum allowed time for MCS search. |
public static final long DEFAULT_RANDOM_SEED
protected long randomSeed
protected SearchMode searchMode
protected long timeLimit
protected Molecule queryMol
protected Molecule targetMol
protected final McsSearchOptions searchOpts
protected static final java.util.logging.Logger LOG
protected MaxCommonSubstructure(McsSearchOptions searchOpts)
public static MaxCommonSubstructure newInstance()
Creates a new instance of MCS search algorithm using the default search options. The MaxCliqueMcs implementation is used, which turned out to be the most efficient and robust according to our benchmark tests.
See also: newInstance(McsSearchOptions)
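public static MaxCommonSubstructure newInstance(McsSearchOptions searchOpts)
Creates a new instance of MCS search algorithm using the given search options. The MaxCliqueMcs implementation is used, which turned out to be the most efficient and robust according to our benchmark tests.
Parameters:
searchOpts - the search options (not null)
public final boolean isLicensed()
Returns information about the licensing of the product.
Specified by: isLicensed in interface chemaxon.license.Licensable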
public final void setLicenseEnvironment(java.lang.String env)
Sets the license environment.
Specified by: setLicenseEnvironment in interface chemaxon.license.Licensable
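public final void setMolecules(Molecule query, Molecule target)
Sets the two molecular structures to be matched.
Parameters:
query - query molecule (not null)
target - target molecule (not null)
public final void setQuery(Molecule query)
Sets the query structure.
Parameters:
query - query molecule (not null)
public final void setTarget(Molecule target)
Sets the target structure.
Parameters:
target - target molecule (not null)
public final Molecule getQuery()
Gets the query structure.
public final Molecule getTarget()
Gets the target structure.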
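public final McsSearchOptions getSearchOptions()
Returns the search options used by this instance. See McsSearchOptions.
public final void setSearchMode(SearchMode mode)
Sets the search mode that controls the running time and the accuracy of the algorithm (see SearchMode.NORMAL). For more information, see SearchMode.
Parameters:
mode - search mode (not null)
public final SearchMode getSearchMode()
Gets the current search mode. See SearchMode.
public final void setTimeLimit(long maxMilliseconds)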
Sets the maximum allowed time for MCS search. See nextResult() (or hasNextResult()).
This is an optional limit, which is set to 1 minute by default. You can use a negative parameter value to disable it.
If the search process seems to be too slow, consider using FAST search mode instead of decreasing the time limit (see setSearchMode(SearchMode)).
Parameters:
maxMilliseconds - maximum running time in milliseconds (negative value means disabled)
public final long getTimeLimit()
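Gets the maximum allowed MCS search time in milliseconds. See setTimeLimit().
public final void setRandomSeed(long seed)
Sets the random seed value.
Parameters:
seed - random seed
public final long getRandomSeed()
Gets the random seed value.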
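The randomSeed field is documented on this page as "Random seed (0 means using the current time)". A minimal sketch of that convention with java.util.Random follows. SeedSketch, createRandom and firstDraws are hypothetical names for this illustration, and the code is an assumption about how such a convention could look, not the library's implementation. It only shows why a fixed non-zero seed makes a pseudo-random sequence repeatable, while 0 gives a time-dependent one.

```java
import java.util.Arrays;
import java.util.Random;

final class SeedSketch {
    /** Hypothetical helper: 0 selects a time-based seed, any other value is used as given. */
    static Random createRandom(long seed) {
        long effectiveSeed = (seed == 0L) ? System.currentTimeMillis() : seed;
        return new Random(effectiveSeed);
    }

    /** Draws a short sequence so that two generators can be compared. */
    static int[] firstDraws(Random random, int count) {
        int[] draws = new int[count];
        for (int i = 0; i < count; i++) {
            draws[i] = random.nextInt(1000);
        }
        return draws;
    }

    public static void main(String[] args) {
        int[] a = firstDraws(createRandom(42L), 5);
        int[] b = firstDraws(createRandom(42L), 5);
        // The same non-zero seed yields the same sequence.
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```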
public final McsSearchResult find()
Performs MCS search according to the specified options.
If multiple MCS results are desired, use hasNextResult() and nextResult().
Returns:
McsSearchResult object containing the found common substructure
Throws:
java.lang.IllegalStateException - if the input molecules are not set before calling this method or the method is called more than once for the same input molecules
java.util.concurrent.CancellationException - if the thread has been interrupted
public final boolean hasNextResult()
Returns whether there are more results available. See nextResult().
Returns:
whether nextResult() can be called to obtain a new McsSearchResult object
Throws:
java.lang.IllegalStateException - if the input molecules are not set before calling this method
java.util.concurrent.CancellationException - if the thread has been interrupted
public final McsSearchResult nextResult()
Finds the next MCS search result according to the specified options.
If multiple search results are desired, this method can be called repeatedly. The results typically correspond to equivalent common substructures but with different mappings. (See McsSearchResult for more information.)
When this method is called multiple times, hasNextResult() should be called first to avoid an exception.
The search state is reset after a call to setQuery(Molecule), setTarget(Molecule) or setMolecules(Molecule, Molecule).
Returns:
McsSearchResult object containing the found common substructure
Throws:
java.lang.IllegalStateException - if the input molecules are not set before the first call to this method or there are no more results to return
java.util.concurrent.CancellationException - if the thread has been interrupted
public final int calculateUpperBound()
Calculates an upper bound on the number of bonds the maximum common substructure may contain with respect to the specified search options. See McsSearchResult.getBondCount().
As this method is much faster than executing the MCS algorithm (find()), it can be used for pre-filtering.
Throws:
java.lang.IllegalStateException - if the input molecules are not set
java.util.concurrent.CancellationException - if the thread has been interrupted
public final float calculateSimilarityUpperBound()
Calculates an upper bound on the similarity of the query and target molecules with respect to the specified search options. See McsSearchResult.getSimilarity().
As this method is much faster than executing the MCS algorithm (find()), it can be used for pre-filtering.
Throws:
java.lang.IllegalStateException - if the input molecules are not set
java.util.concurrent.CancellationException - if the thread has been interrupted
protected abstract McsSearchResult findMcs(com.chemaxon.search.sss.Matcher matcher)
Finds the MCS and all related data (fragments and mappings).
This method is called from find(), nextResult(), and hasNextResult() with a properly initialized matcher object.
Parameters:
matcher - the matcher object, which also contains the preprocessed query and target molecules
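calculateUpperBound() and calculateSimilarityUpperBound() are described on this page as much faster than find() and therefore usable for pre-filtering. The sketch below illustrates that idea in isolation. BoundedSearch, searchIfPromising and CountingStub are hypothetical stand-ins (not ChemAxon types), and find() is reduced here to returning a bond count so that the example runs without the library.

```java
import java.util.OptionalInt;

/** Hypothetical stand-in for a search offering a cheap bound and an expensive exact run. */
interface BoundedSearch {
    int calculateUpperBound(); // cheap: upper bound on the bond count of the result
    int find();                // expensive: bond count of the result actually found
}

final class PreFilterSketch {
    /** Runs the expensive search only if the cheap bound can still reach minBonds. */
    static OptionalInt searchIfPromising(BoundedSearch search, int minBonds) {
        if (search.calculateUpperBound() < minBonds) {
            return OptionalInt.empty(); // the result cannot reach minBonds, so find() is skipped
        }
        int bonds = search.find();
        return bonds >= minBonds ? OptionalInt.of(bonds) : OptionalInt.empty();
    }

    /** Hypothetical stub with fixed values that counts how often find() is invoked. */
    static final class CountingStub implements BoundedSearch {
        final int bound;
        final int actual;
        int findCalls = 0;

        CountingStub(int bound, int actual) {
            this.bound = bound;
            this.actual = actual;
        }

        @Override
        public int calculateUpperBound() {
            return bound;
        }

        @Override
        public int find() {
            findCalls++;
            return actual;
        }
    }

    public static void main(String[] args) {
        CountingStub pair = new CountingStub(5, 4);
        System.out.println(searchIfPromising(pair, 8).isPresent()); // prints false
        System.out.println(pair.findCalls);                          // prints 0
    }
}
```

Because the bound is only an upper limit, a pair that passes the filter may still fall short of the threshold, which is why the sketch re-checks the count returned by find().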