@PublicAPI public class MDParameters extends java.lang.Object
MolecularDescriptor parameter settings. This class serves as the base class for the parameter classes of specific MolecularDescriptor derivatives.
MolecularDescriptor objects for the sake of memory efficiency.
MolecularDescriptor class - is as follows: the derived class name begins with the name of the corresponding MolecularDescriptor class and is postfixed by Parameters. For instance, the parameters class of the descriptor class MDXyZ is XyZParameters.
MDParameters provides extensive functionality for processing XML configuration files; however, parameter classes extending MDParameters do not necessarily have to use XML for storing parameters.
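To make the XML-based configuration more concrete, the following is a minimal sketch of how ParametrizedMetric elements might be read from a configuration string. It uses the JDK's built-in DOM parser rather than dom4j so that it stays self-contained. The class name MetricConfigSketch, the attribute name Name and the use of IllegalArgumentException are assumptions made for illustration and are not taken from MDParameters itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the documented API.
final class MetricConfigSketch {

    /**
     * Returns the "Name" attribute (assumed attribute name) of every
     * ParametrizedMetric element, in document order.
     */
    static List<String> metricNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("ParametrizedMetric");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("Name"));
            }
            return names;
        } catch (Exception e) {
            // Stand-in for a library-specific exception such as MDParametersException.
            throw new IllegalArgumentException("configuration is not well-formed", e);
        }
    }
}
```

A malformed string ends up in the catch branch, which mirrors the documented behaviour of reporting configurations that are not well-formed.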
MDParameters plays an important role in providing so-called Screening Configurations for the dissimilarity calculations. Such configurations contain so-called parametrized metrics that are based on dissimilarity metrics implemented in classes that extend the MolecularDescriptor class. It is important to make a clear distinction between these two categories: dissimilarity metrics are the basis of the parametrized metrics. MDParameters stores metric parameters and provides services for retrieving and storing parametrized metrics and for accessing them by either name or index.
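The distinction between a dissimilarity metric and a parametrized metric built on top of it can be illustrated with a small registry that keeps parallel lists, much like the parametrizedMetrics and thresholds fields listed below. This is only a sketch under assumptions: the class ParametrizedMetricRegistry and all of its method names are hypothetical and do not reproduce the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry: one entry per parametrized metric, stored in parallel lists.
final class ParametrizedMetricRegistry {
    private final List<String> names = new ArrayList<>();       // user-defined symbolic names
    private final List<String> baseMetrics = new ArrayList<>(); // underlying dissimilarity metric
    private final List<Float> thresholds = new ArrayList<>();   // per-metric threshold
    private int current = -1;                                   // index of the metric in use

    /** Appends a parametrized metric and returns its index. */
    int add(String name, String baseMetric, float threshold) {
        names.add(name);
        baseMetrics.add(baseMetric);
        thresholds.add(threshold);
        return names.size() - 1;
    }

    /** Looks a metric up by name; returns -1 if the name is unknown. */
    int indexOf(String name) {
        return names.indexOf(name);
    }

    /** Makes the metric at the given index the current one. */
    void select(int index) {
        if (index < 0 || index >= names.size()) {
            throw new IllegalArgumentException("invalid metric index: " + index);
        }
        current = index;
    }

    String baseMetric(int index) {
        return baseMetrics.get(index);
    }

    float threshold(int index) {
        return thresholds.get(index);
    }

    /** Threshold of the currently selected metric. */
    float currentThreshold() {
        if (current < 0) {
            throw new IllegalStateException("no metric selected");
        }
        return thresholds.get(current);
    }

    int size() {
        return names.size();
    }
}
```

Two parametrized metrics may share the same base metric (for example Tanimoto) while differing in their threshold, which is the point of keeping the two categories apart.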
Modifier and Type | Field and Description |
---|---|
protected java.util.ArrayList | asymmetryFactors: asymmetry ratio of parametrized asymmetric metrics |
protected int | cellSize: size (number of bits) of one descriptor cell |
protected java.util.ArrayList | cellwiseWeights: is cell weights for parametrized metrics |
protected java.lang.String | configFilePath: location of the configuration file |
protected int | currentMetricIndex: index of the parametrized metric currently in use |
protected byte[] | data: buffer for external data format generation, used in MolecularDescriptor.toData() |
protected java.text.NumberFormat | decForm: to format floating point output |
static float | DEFAULT_ASYMMETRY_FACTOR |
static int | DEFAULT_OUTPUT_PRECISION |
static float | DEFAULT_SCALE_FACTOR: constants, default parameter values |
static float | DEFAULT_WEIGHT |
protected float | defaultWeight: value for all missing weight parameters |
protected org.dom4j.Document | document: contains the XML document |
protected MDGenerator | generator: generates MolecularDescriptors |
protected int | internalSize: required memory size of one descriptor instance |
protected int | length: the length of the descriptor: the number of cells |
protected MolecularDescriptor | md: this object is needed to access default dissimilarity functions |
protected java.util.ArrayList | metricIndexes: convert parameterized indexes to MolecularDescriptor metric indexes |
protected java.util.ArrayList | normalized: flags indicating if the metric is normalized or not |
protected int | outputPrecision: number of fraction digits in floating point output format |
protected java.util.List | parametrizedMetricNodes |
protected java.util.ArrayList | parametrizedMetrics: symbolic names (mnemonics) of parametrized metrics |
protected org.dom4j.Element | parametrizedMetricsNode |
protected java.util.ArrayList | scaleFactors: scale factor of scalable parametrized metrics |
protected org.dom4j.Node | screeningConfigurationNode |
protected org.dom4j.Element | similarityNode: node holding the similarity calculations related parameters |
protected Standardizer | standardizer: transform molecules into standard form before descriptor generation |
protected org.dom4j.Node | standardizerConfigurationNode: node defining the Standardizer configuration |
protected java.util.ArrayList | thresholds: dissimilarity threshold values |
protected java.util.ArrayList<java.lang.Float> | tverskyA: alpha values for Tversky dissimilarity |
protected java.util.ArrayList<java.lang.Float> | tverskyB: beta values for Tversky dissimilarity |
protected java.util.ArrayList | weights: weights for parametrized metrics |
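The tverskyA and tverskyB fields hold the alpha and beta values of Tversky dissimilarity. As a reminder of what these two parameters control, here is a minimal sketch of the standard Tversky index on bit sets, T = c / (c + alpha * a + beta * b), where c is the number of common bits and a and b are the bits unique to the query and the target. The dissimilarity is taken as 1 - T. The class TverskySketch, the use of java.util.BitSet and the handling of the zero-denominator case are assumptions. How MolecularDescriptor subclasses actually compute the metric is not stated in this page.

```java
import java.util.BitSet;

// Hypothetical illustration of the textbook Tversky formula on bit sets.
final class TverskySketch {

    /** Returns 1 - Tversky index for the given query and target fingerprints. */
    static float dissimilarity(BitSet query, BitSet target, float alpha, float beta) {
        BitSet common = (BitSet) query.clone();
        common.and(target);
        int c = common.cardinality();              // bits set in both
        int onlyQuery = query.cardinality() - c;   // bits set only in the query
        int onlyTarget = target.cardinality() - c; // bits set only in the target
        float denominator = c + alpha * onlyQuery + beta * onlyTarget;
        if (denominator == 0f) {
            // Assumed convention: two empty sets are identical, anything else is maximally distant.
            return (query.isEmpty() && target.isEmpty()) ? 0f : 1f;
        }
        return 1f - c / denominator;
    }
}
```

With alpha = beta = 1 the formula reduces to the Tanimoto coefficient. Unequal alpha and beta make the measure asymmetric, which is why two separate values are stored per parametrized metric.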
Modifier | Constructor and Description |
---|---|
protected | MDParameters(): Creates and initializes an empty object. |
Modifier and Type | Method and Description |
---|---|
void | addParameters(java.io.File parameterFile): Sets parameters from an XML config file keeping all previous settings. |
void | addParameters(java.lang.String parameterString): Sets parameters from an XML string representation keeping all previous settings. |
int | addParametrizedMetric(java.lang.String name, java.lang.String metric, java.lang.String activeFamily): Expands the set of parametrized metrics with a new item. |
protected org.dom4j.Element | addParametrizedMetricNode(java.lang.String name, java.lang.String activeFamily, java.lang.String metric): Adds a ParametrizedMetric node to the DOM tree. |
protected void | addParametrizedMetricsNode(): Adds the ParametrizedMetrics node to the DOM tree. |
protected int | appendParametrizedMetric(java.lang.String name, java.lang.String metric): Extends internal data with a new parametrized metric. |
protected void | checkDocumentVersion(java.lang.String docType, java.lang.String version): Checks if the document is the right version. |
void | fromFile(java.io.File parameterFile): Sets parameters from an XML file. |
void | fromString(java.lang.String parameterString): Sets parameters from a string representation. |
float | getAsymmetryFactor(): Gets the asymmetry factor used in the current parametrized asymmetric metrics. |
int | getCellSize(): Gets the number of bits of an atomic cell in the descriptor. |
int | getCurrentMetricIndex() |
byte[] | getData(): Gets the byte array which is used for conversions between internal and external data formats. |
java.text.DecimalFormat | getDecForm(): Gets the formatter object that is capable of formatting fractions with given precision. |
java.lang.String | getDefaultDocumentFrame(): Gets the default XML configuration string. |
static java.lang.String | getDefaultStandardizerConfiguration(): Gets the default configuration of the standardizer. |
static java.lang.String | getDescriptorTypeName(java.lang.String xmlConfig): Takes the descriptor type name from the root element of the XML configuration. |
int | getInternalMetricIndex(): Gets the MolecularDescriptor specific metric index of the current parametrized metric. |
int | getInternalSize(): Gets the required memory size to store the descriptor according to the specified parameters. |
int | getLength(): Returns the number of cells forming the descriptor. |
int | getMetricIndex(java.lang.String name): Gets the index of the given parametrized metric. |
java.lang.String | getMetricName(): Gets the user defined symbolic name of the current parametrized metric. |
java.lang.String | getMetricName(int metricIndex): Gets the user defined symbolic name of the specified parametrized metric. |
int | getNumberOfMetrics(): Gets the total number of parametrized metrics available in the present configuration. |
int | getNumberOfWeights(): Gets the number of weights the current parametrized metric takes. |
protected int | getNumberOfWeights(int parametrizedMetricIndex): Gets the number of weight factors used by the specified metric. |
float | getScaleFactor(): Gets the scale factor used in the current parametrized scalable metrics. |
MolecularDescriptor | getScalingHypothesis(): Gets the scaling hypothesis used in scaled metrics. |
java.lang.String | getScreeningConfigurationString(java.lang.String nodeName, java.lang.String attrib, java.lang.String value): Returns parts of the parameter values in string. |
float | getThreshold(): Gets the threshold value being set for the current parametrized version. |
float | getThreshold(int metricIndex): Gets a metric dependent threshold value. |
float | getTverskyAlpha(): Gets Tversky alpha value for the given parametrized metric. |
float | getTverskyBeta(): Gets Tversky beta value for the given parametrized metric. |
float[] | getWeights(): Gets all weights for the given parametrized metric. |
protected boolean | importNodes(org.dom4j.Document doc, boolean merge): Imports nodes from the specified Document into the current (main) Document. |
protected void | initParameters(): Initializes object after configuration parameters are loaded. |
boolean | isAsymmetric(): Returns whether current parametrized metric is asymmetric or not. |
boolean | isCellwiseWeights(): Gets boolean telling whether cell weights are to be generated for current parametrized metric. |
boolean | isNormalized(): Returns whether current parametrized metric is normalized or not. |
boolean | isScaled(): Returns whether current parametrized metric is scaled or not. |
boolean | isStandardizationMandatory(): Checks if standardization of molecules is mandatory for the corresponding MolecularDescriptor before descriptor generation. |
boolean | isWeighted(): Returns whether current parametrized metric is weighted or not. |
protected void | processDocument(boolean all): Searches the DOM tree for relevant nodes and sets internal variables to some of these nodes for the sake of easier information processing. |
protected void | readFromXmlFile(java.io.File file, boolean merge, boolean all): Reads configuration from XML file. |
protected void | readFromXmlString(java.lang.String xml, boolean merge, boolean all): Reads configuration from XML string. |
protected void | readMetricParameters(): Processes all ParametrizedMetric nodes in the DOM tree. |
protected void | readMetricWeights(org.dom4j.Element parametrizedMetric, int metricIndex) |
protected void | readValues(boolean all): Picks attribute values from the document tree that are relevant to the actual MDParameters sub-class. |
void | setAsymmetryFactor(float af): Sets the value of the asymmetry factor of the current parametrized metric. |
void | setCellSize(int cellSize): Sets the size (number of bits) of the bins (cells). |
void | setCellwiseWeights(boolean c): Sets boolean telling whether cell weights are to be generated for current parametrized metric. |
void | setCreateStatistics(boolean createStatistics): Toggles the create statistics flag of the MDGenerator object. |
void | setCurrentParametrizedMetric(int metricIndex): Selects the specified parametrized metric to be the current. |
void | setLength(int length): Sets the length (number of cells) of the descriptor. |
void | setNormalized(boolean yes): Toggles the normalized flag of the current parametrized metric. |
void | setOutputPrecision(int precision): Specifies the output precision for floating point values. |
void | setParameters(java.io.File parametersFile): Sets parameters from an XML file representation overwriting all previous settings with the new ones. |
void | setParameters(java.lang.String parametersString): Sets parameters from an XML string representation overwriting all previous parameters settings with the new ones. |
void | setScaleFactor(float scaleFactor): Sets scaleFactor used with the current parametrized metrics. |
void | setScalingHypothesis(MolecularDescriptor scalingHypothesis): Sets (stores) the specified scaling hypothesis. |
void | setThreshold(float th): Sets the value of the threshold of the current parametrized metric. |
void | setWeights(float[] w): Sets the cell-wise weight factors for the current parametrized metric. |
Molecule | standardize(Molecule m): Standardizes the Molecule and returns the standardized form. |
java.lang.String | toString(): Returns the parameter values in string. |
protected java.lang.String | toString(org.dom4j.Node node): Returns parts of the parameter values in string. |
protected void | writeMetricParameter(java.util.ArrayList pl, java.lang.String attr, int mi, boolean useDecForm): Writes a given parameter of the specified metric into the corresponding tree node. |
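The relationship between setOutputPrecision(int) and getDecForm() can be illustrated with a formatter that fixes the number of fraction digits. The sketch below is an assumption about how such a formatter could be configured with java.text.DecimalFormat. The class OutputPrecisionSketch is hypothetical, and whether the library pads with trailing zeros or uses a locale-independent decimal separator is not stated in this page.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper showing a fixed-precision DecimalFormat.
final class OutputPrecisionSketch {

    /** Builds a formatter that always prints exactly {@code precision} fraction digits. */
    static DecimalFormat formatter(int precision) {
        if (precision < 0) {
            throw new IllegalArgumentException("precision must not be negative");
        }
        // Locale.ROOT keeps '.' as the decimal separator regardless of the default locale.
        DecimalFormat format = new DecimalFormat("0", DecimalFormatSymbols.getInstance(Locale.ROOT));
        format.setMinimumFractionDigits(precision);
        format.setMaximumFractionDigits(precision);
        return format;
    }
}
```

For example, formatter(3).format(0.123456) yields 0.123 and formatter(2).format(1.5) yields 1.50.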
public static final float DEFAULT_SCALE_FACTOR
public static final float DEFAULT_ASYMMETRY_FACTOR
public static final float DEFAULT_WEIGHT
public static final int DEFAULT_OUTPUT_PRECISION
protected int cellSize
protected int length
protected int internalSize
protected byte[] data
buffer for external data format generation, used in MolecularDescriptor.toData()
protected java.lang.String configFilePath
protected org.dom4j.Document document
protected org.dom4j.Node standardizerConfigurationNode
protected org.dom4j.Element similarityNode
protected org.dom4j.Node screeningConfigurationNode
protected org.dom4j.Element parametrizedMetricsNode
protected java.util.List parametrizedMetricNodes
protected java.util.ArrayList parametrizedMetrics
protected java.util.ArrayList metricIndexes
protected java.util.ArrayList scaleFactors
protected java.util.ArrayList<java.lang.Float> tverskyA
protected java.util.ArrayList<java.lang.Float> tverskyB
protected java.util.ArrayList asymmetryFactors
protected java.util.ArrayList thresholds
protected java.util.ArrayList normalized
protected float defaultWeight
protected java.util.ArrayList weights
protected java.util.ArrayList cellwiseWeights
protected int outputPrecision
protected int currentMetricIndex
protected MolecularDescriptor md
protected java.text.NumberFormat decForm
protected MDGenerator generator
generates MolecularDescriptors
protected Standardizer standardizer
protected MDParameters()
protected void initParameters()
public void fromString(java.lang.String parameterString) throws MDParametersException
parameterString - configuration parameters in string
MDParametersException - when the parameter string is not well-formed
public void fromFile(java.io.File parameterFile) throws MDParametersException
configFilePath.
parameterFile - initialized parameter file
MDParametersException - failed to process parameter file
public void addParameters(java.lang.String parameterString) throws MDParametersException
parameterString - parameters in string
MDParametersException - when the parameter string is not well-formed
public void addParameters(java.io.File parameterFile) throws MDParametersException
parameterFile - parameter file
MDParametersException - when the parameter string is not well-formed
public void setParameters(java.lang.String parametersString) throws MDParametersException
parametersString - parameters in string
MDParametersException - when the parameter string is not well-formed
public void setParameters(java.io.File parametersFile) throws MDParametersException
configFilePath.
parametersFile - parameters file
MDParametersException - when the parameter string is not well-formed
public java.lang.String toString()
toString in class java.lang.Object
MDParametersException - when creating the parameter string fails
public java.lang.String getScreeningConfigurationString(java.lang.String nodeName, java.lang.String attrib, java.lang.String value) throws MDParametersException
nodeName - name of the node to be printed
attrib - attribute name
value - value of the attribute
MDParametersException - when creating the parameter string fails
protected java.lang.String toString(org.dom4j.Node node) throws MDParametersException
node - root node of the subtree to be printed
MDParametersException - when creating the parameter string fails
public void setCellSize(int cellSize)
cellSize - the width of one (and each) cell (bin) in bits
public void setLength(int length) throws MDParametersException
length - the required length (cell count)
MDParametersException - if argument is not positive
public void setScalingHypothesis(MolecularDescriptor scalingHypothesis)
scalingHypothesis - the consensus hypothesis used for scaling
public void setScaleFactor(float scaleFactor)
scaleFactor - the new value of the scaleFactor
public void setAsymmetryFactor(float af)
af - asymmetry factor
public void setThreshold(float th)
th - dissimilarity threshold value
public void setWeights(float[] w)
w - weights
public void setCellwiseWeights(boolean c)
c - true if cell weights
public void setNormalized(boolean yes)
yes - true, if the metric is normalized
public void setOutputPrecision(int precision)
getDecForm().
precision - number of digits after the decimal point
public void setCurrentParametrizedMetric(int metricIndex)
metricIndex - index of the selected parametrized metric
public void setCreateStatistics(boolean createStatistics)
Toggles the create statistics flag of the MDGenerator object.
createStatistics - new value for the create statistics flag
public int addParametrizedMetric(java.lang.String name, java.lang.String metric, java.lang.String activeFamily) throws MDParametersException
name - symbolic name of the parametrized metric
metric - name of the metric (like Tanimoto, Euclidean etc)
activeFamily - name of the active compounds family
MDParametersException
public int getCellSize()
public int getLength()
public int getInternalSize()
public byte[] getData()
public int getCurrentMetricIndex()
public int getNumberOfMetrics()
public int getNumberOfWeights()
protected int getNumberOfWeights(int parametrizedMetricIndex) throws java.lang.IllegalArgumentException
MolecularDescriptor class or its derived classes, but not to parametrized metric.
parametrizedMetricIndex - parametrized metric index
java.lang.IllegalArgumentException - if the given parameter is not a valid metric index
public float getThreshold(int metricIndex)
getThreshold() is kept for compatibility reasons.
metricIndex - index of a parametrized metric
public float getThreshold()
public MolecularDescriptor getScalingHypothesis()
public int getInternalMetricIndex()
public java.lang.String getMetricName()
public java.lang.String getMetricName(int metricIndex)
public int getMetricIndex(java.lang.String name)
name - name of the parametrized metric
public float getScaleFactor()
public float getAsymmetryFactor()
public float[] getWeights()
public float getTverskyAlpha()
public float getTverskyBeta()
public boolean isCellwiseWeights()
public java.text.DecimalFormat getDecForm()
setOutputPrecision( int precision ).
public boolean isScaled()
public boolean isAsymmetric()
public boolean isWeighted()
public boolean isNormalized()
public boolean isStandardizationMandatory()
Checks if standardization of molecules is mandatory for the corresponding MolecularDescriptor before descriptor generation. This method always returns true. Derived classes should override it when standardization is not obligatory.
public static java.lang.String getDefaultStandardizerConfiguration()
MolecularDescriptor. The default on this top level is aromatization and dehydrogenization, but derived parameter classes may overload this behaviour.
public java.lang.String getDefaultDocumentFrame()
public Molecule standardize(Molecule m)
Standardizes the Molecule and returns the standardized form. The standardization is configured via XML. StandardizerConfiguration is the corresponding XML tag. If no standardizer is set up, null is returned.
m - molecular structure to be standardized
protected void readFromXmlFile(java.io.File file, boolean merge, boolean all) throws MDParametersException
configFilePath.
file - the XML file to read configuration data from
merge - merge config from file into already existing parameters or overwrite existing parameter values
all - process the complete document or only the ScreeningConfiguration tag
MDParametersException - in the case of any failure
protected void readFromXmlString(java.lang.String xml, boolean merge, boolean all) throws MDParametersException
xml - the XML string to get the configuration data from
merge - merge config from file into already existing parameters or overwrite existing parameter values
all - process the complete document or only the ScreeningConfiguration tag
MDParametersException - in the case of any failure
protected void checkDocumentVersion(java.lang.String docType, java.lang.String version) throws MDParametersException
docType - the required document type
version - the expected version number
MDParametersException
protected void processDocument(boolean all) throws MDParametersException
all - process the complete document or only the ScreeningConfiguration tag
MDParametersException
protected void readValues(boolean all) throws MDParametersException
Picks attribute values from the document tree that are relevant to the actual MDParameters sub-class.
all - process the complete document or only the ScreeningConfiguration tag
MDParametersException
protected void readMetricParameters() throws MDParametersException
Processes all ParametrizedMetric nodes in the DOM tree. Reads parameterized metric names and associated parameter settings and stores them in data members for faster and easier access in getter methods.
MDParametersException - if one of the nodes is not well-formed
protected void readMetricWeights(org.dom4j.Element parametrizedMetric, int metricIndex) throws MDParametersException
MDParametersException
protected void writeMetricParameter(java.util.ArrayList pl, java.lang.String attr, int mi, boolean useDecForm)
pl - list of parameters (for all metric indexes)
attr - name of the attribute which the parameter corresponds to
mi - index of the metric
useDecForm - use precision for writing floating point values
protected int appendParametrizedMetric(java.lang.String name, java.lang.String metric)
name - name of the parametrized metric
metric - dissimilarity metric name (as defined in its implementor class)
protected void addParametrizedMetricsNode()
Adds the ParametrizedMetrics node to the DOM tree.
protected org.dom4j.Element addParametrizedMetricNode(java.lang.String name, java.lang.String activeFamily, java.lang.String metric)
Adds a ParametrizedMetric node to the DOM tree.
name - name of the parameterized metric, given by the user
activeFamily - name of the active compound family (e.g. ACE)
metric - name of the dissimilarity metric
protected boolean importNodes(org.dom4j.Document doc, boolean merge)
Imports nodes from the specified Document into the current (main) Document. New nodes can either be merged into the existing ones without removing them, or new nodes may overwrite existing nodes.
doc - import nodes from this document
merge - merge (add new) or overwrite (replace with new) existing nodes
public static java.lang.String getDescriptorTypeName(java.lang.String xmlConfig)
xmlConfig - configuration string
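The difference between merging and overwriting, which underlies both the merge flag of importNodes and the addParameters / setParameters pair, can be sketched with plain maps standing in for DOM nodes. Everything in this example is hypothetical: the class ImportModeSketch, the map-based representation, and in particular the choice that an incoming entry replaces an existing entry with the same key in merge mode. The page does not say how conflicts are resolved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for node import: keys play the role of node names.
final class ImportModeSketch {

    /**
     * merge == true : keep existing entries and add the incoming ones
     *                 (assumed: incoming wins on a key conflict).
     * merge == false: discard existing entries and keep only the incoming ones.
     */
    static Map<String, String> importSettings(Map<String, String> existing,
                                              Map<String, String> incoming,
                                              boolean merge) {
        Map<String, String> result = new LinkedHashMap<>();
        if (merge) {
            result.putAll(existing);
        }
        result.putAll(incoming);
        return result;
    }
}
```

Under this reading, addParameters corresponds to merge == true, because it keeps all previous settings, and setParameters corresponds to merge == false, because it overwrites them.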