Package chemaxon.formats
Class MFileFormatUtil
- java.lang.Object
-
- chemaxon.formats.MFileFormatUtil
-
@PublicAPI public class MFileFormatUtil extends Object
File format related utility functions.
- Since:
- Marvin 4.1, 12/15/2005
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static int MOLMOVIE
Read multi-molecule files as movies.
static int MULTISET
The multi-molecule file really contains multiple atom sets of one molecule.
static int NOMOLMOVIE
Do not read multi-molecule XYZ files as movies.
-
Constructor Summary
Constructors (Constructor, Description):
MFileFormatUtil()
-
Method Summary
Methods (Modifier and Type, Method, Description):
static String[] convertToSmilingFormat(Molecule m)
Tries to convert a molecule to a SMILES related format.
static String[] convertToSmilingFormat(MProp p)
Tries to convert a property to text with a SMILES related format argument.
static MolExportModule createExportModule(String fmt)
Creates an export module for the specified format.
static MolExportModule createExportModule(String fmt, String enc)
Creates an export module for the specified format with the specified encoding.
static MRecordReader createRecordReader(InputStream is, String opts)
Creates a record reader for an input stream.
static MRecordReader createRecordReader(InputStream is, String opts, String enc, String path)
Creates a record reader for an input stream.
static MFileFormat[] findFormats(String fmt, long flags, long mask)
Gets a list of formats.
static String[] getEncodingFromOptions(String fmtopts)
Gets the encoding that was explicitly given as an import option.
static String getFileExtensionLC(File f)
Gets the file extension in lower case.
static String getFileExtensionLC(String fname)
Gets the file extension in lower case.
static MFileFormat getFormat(String fmt)
Gets the file format descriptor for the specified codename.
static List<String> getFormatNamesWithExtension(String fileName)
static String getKnownExtension(String fname)
Returns the file extension if it is a known extension.
static String[] getMolfileExtensions()
Gets the array of known molecule file extensions.
static String[] getMolfileFormats()
Gets the array of known molecule file formats.
static String getMostLikelyMolFormat(String fname)
Gets the most likely molecule file format from the file name extension.
static String getUnguessableFormat(String fname)
Gets the file format from the file name extension for formats that are not guessable from the file content.
static boolean isOutputCleanable(String fmt)
Tests whether the specified output format is cleanable.
static boolean isSubFormatOf(String f, String other)
Tests whether a format is a sub-format of another format.
static boolean isURLOrFileName(String s)
Tests whether the specified string is a URL (absolute or relative) or file name.
static int preprocessFormatAndOptions(String[] fmtopts)
Parses options like "MULTISET", "MOLMOVIE" or "NOMOLMOVIE".
static String recognizeOneLineFormat(String s)
Recognizes a one-line string as CxSMILES, CxSMARTS, AbbrevGroup, Peptide or IUPAC name.
static String recognizeOneLineFormat(String s, MFileFormat... forbiddeneFormats)
Recognizes a one-line string as CxSMILES, CxSMARTS, AbbrevGroup, Peptide or IUPAC name.
static void registerFormat(MFileFormat mff)
Registers a user defined file format.
static String[] splitFileAndOptions(String arg)
Parses "file{options}" strings used in molecule file import.
static String[] splitFormatAndOptions(String opts)
Parses "format:options" strings used in molecule file import and export.
static void testEncoding(String enc)
Tests whether the given charset name is supported by this JVM.
-
-
-
Field Detail
-
MULTISET
public static final int MULTISET
The multi-molecule file really contains multiple atom sets of one molecule.
- See Also:
- Constant Field Values
-
MOLMOVIE
public static final int MOLMOVIE
Read multi-molecule files as movies.
- Since:
- Marvin 5.2, 02/12/2009
- See Also:
- Constant Field Values
-
NOMOLMOVIE
public static final int NOMOLMOVIE
Do not read multi-molecule XYZ files as movies.
- Since:
- Marvin 5.2, 02/12/2009
- See Also:
- Constant Field Values
-
-
Method Detail
-
isSubFormatOf
public static boolean isSubFormatOf(String f, String other)
Tests whether a format is a sub-format of another format.
- Parameters:
f
- the format codename
other
- the other format
- Returns:
- true if it is a format variant of f
- Since:
- Marvin 4.1, 04/07/2006
-
splitFileAndOptions
public static String[] splitFileAndOptions(String arg)
Parses "file{options}" strings used in molecule file import.
- Parameters:
arg
- string containing the filename and the options (if any)
- Returns:
- a two-element array containing the filename and the options.
-
splitFormatAndOptions
public static String[] splitFormatAndOptions(String opts)
Parses "format:options" strings used in molecule file import and export. Examples:
splitFormatAndOptions("xyz:f1.4") returns {"xyz", "f1.4"}
splitFormatAndOptions("f1.4") returns {null, "f1.4"}
splitFormatAndOptions("xyz:") returns {"xyz", ""}
splitFormatAndOptions("gzip:xyz:f1.4") returns {"gzip", "xyz:f1.4"}
The colon can be omitted in the case of Marvin's built-in input formats. Example:
splitFormatAndOptions("xyz") returns {"xyz", ""}
Colons after the first equality sign are ignored. This allows options whose parameter can contain a colon (e.g. URLs). Example:
splitFormatAndOptions("param=https://chemaxon.com") returns {null, "param=https://chemaxon.com"}
- Parameters:
opts
- string containing the format and the options
- Returns:
- an array containing the format(s) and the options.
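The documented examples can be reproduced with a small stand-alone sketch. It is not ChemAxon's implementation: the class FormatOptionsSketch, its split method and the KNOWN_FORMATS list are hypothetical stand-ins. It models only the rules stated above: the first colon separates format from options, a bare built-in format name is accepted without a colon, and colons after the first equality sign are ignored.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative re-implementation of the documented splitting rules; not ChemAxon code. */
final class FormatOptionsSketch {
    // Hypothetical stand-in for the list of built-in format codenames.
    private static final List<String> KNOWN_FORMATS = Arrays.asList("xyz", "mol", "sdf");

    static String[] split(String opts) {
        // Only colons located before the first '=' count as separators.
        int eq = opts.indexOf('=');
        String head = eq < 0 ? opts : opts.substring(0, eq);
        int colon = head.indexOf(':');
        if (colon >= 0) {
            // Split at the first colon only; everything after it stays in the options.
            return new String[] {opts.substring(0, colon), opts.substring(colon + 1)};
        }
        if (eq < 0 && KNOWN_FORMATS.contains(opts)) {
            // A bare built-in format name needs no trailing colon.
            return new String[] {opts, ""};
        }
        // No format recognized: the whole string is treated as options.
        return new String[] {null, opts};
    }

    public static void main(String[] args) {
        String[] inputs = {"xyz:f1.4", "f1.4", "xyz:", "gzip:xyz:f1.4", "xyz",
                           "param=https://chemaxon.com"};
        for (String s : inputs) {
            System.out.println(s + " -> " + Arrays.toString(split(s)));
        }
    }
}
```

Because only the first colon is used, a compound string such as "gzip:xyz:f1.4" would need a second call on the options part to peel off the inner format.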
-
preprocessFormatAndOptions
public static int preprocessFormatAndOptions(String[] fmtopts)
Parses options like "MULTISET", "MOLMOVIE" or "NOMOLMOVIE". Example:
String[] fmtopts = splitFormatAndOptions("gzip:xyz:MULTISET,f1.4");
// fmtopts == {"gzip", "xyz:MULTISET,f1.4"}
int result = preprocessFormatAndOptions(fmtopts);
// fmtopts == {"gzip", "xyz:f1.4"}, result == MULTISET
- Parameters:
fmtopts
- two-element array containing the format and the options
- Returns:
- flags corresponding to the options
- See Also:
splitFormatAndOptions(java.lang.String), MULTISET, MOLMOVIE, NOMOLMOVIE
-
getEncodingFromOptions
public static String[] getEncodingFromOptions(String fmtopts)
Gets the encoding that was explicitly given as an import option. The format is enc{name}, where name is a charset name supported by Java.
- Parameters:
fmtopts
- the input format and options
- Returns:
- two-element array: the first element is the encoding, the second contains the remaining import options.
- Throws:
IllegalCharsetNameException
- if the encoding is illegal
UnsupportedCharsetException
- if the encoding is unsupported
-
testEncoding
public static void testEncoding(String enc) throws IllegalArgumentException
Tests whether the given charset name is supported by this JVM.
- Parameters:
enc
- the name of the charset
- Throws:
IllegalArgumentException
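In the JDK, both IllegalCharsetNameException and UnsupportedCharsetException are subclasses of IllegalArgumentException, so a caller can tell an illegal name from an unsupported one with java.nio.charset.Charset alone. The sketch below uses only JDK calls. Whether testEncoding performs its check this way is an assumption, and CharsetCheckSketch and classify are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

/** JDK-only illustration of checking a charset name; not ChemAxon code. */
final class CharsetCheckSketch {
    /** Returns "ok", "illegal" or "unsupported" for a non-null charset name. */
    static String classify(String enc) {
        try {
            Charset.forName(enc);
            return "ok";
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed in charset names.
            return "illegal";
        } catch (UnsupportedCharsetException e) {
            // The name is well formed, but this JVM has no such charset.
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "bad name!", "x-no-such-charset-123"}) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```

Catching IllegalArgumentException alone, as in the declared signature of testEncoding, covers both cases when the distinction does not matter.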
-
getUnguessableFormat
public static String getUnguessableFormat(String fname)
Gets the file format from the file name extension for formats that are not guessable from the file content. Used to distinguish SMARTS and SMILES.
- Parameters:
fname
- the filename
- Returns:
- the file format or null if the file contents can be used to recognize the format
-
getFileExtensionLC
public static String getFileExtensionLC(File f)
Gets the file extension in lower case.
- Parameters:
f
- the file
- Returns:
- the extension in lower case
-
getFileExtensionLC
public static String getFileExtensionLC(String fname)
Gets the file extension in lower case.
- Parameters:
fname
- the filename
- Returns:
- the extension in lower case
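As a rough illustration of what "extension in lower case" can mean, the following JDK-only sketch takes the text after the last dot of the last path segment and lower-cases it. It is not the library's code. ExtensionSketch and extensionLowerCase are hypothetical names, and returning an empty string when there is no extension is a choice of this sketch, since the documentation does not say what the real method returns in that case.

```java
import java.util.Locale;

/** JDK-only illustration of extracting a lower-case extension; not ChemAxon code. */
final class ExtensionSketch {
    static String extensionLowerCase(String fname) {
        // Ignore dots that belong to directory names.
        int slash = Math.max(fname.lastIndexOf('/'), fname.lastIndexOf('\\'));
        int dot = fname.lastIndexOf('.');
        // No dot in the last segment, a leading dot (hidden file) or a trailing dot:
        // this sketch reports "no extension" as an empty string (an assumption).
        if (dot <= slash + 1 || dot == fname.length() - 1) {
            return "";
        }
        // Locale.ROOT avoids locale-specific case mappings.
        return fname.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Benzene.MOL", "a/b/c.tar.GZ", "dir.v2/readme", ".hidden"}) {
            System.out.println(name + " -> \"" + extensionLowerCase(name) + "\"");
        }
    }
}
```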
-
getMostLikelyMolFormat
public static String getMostLikelyMolFormat(String fname)
Gets the most likely molecule file format from the file name extension.
- Parameters:
fname
- the filename
- Returns:
- the file format or null if the format cannot be determined from the file name
-
getKnownExtension
public static String getKnownExtension(String fname)
Returns the file extension if it is a known extension. Known extensions are the following: mrv t gz mol mol2 rgf rxn csmol csrgf csrxn sdf cssdf rdf smi smiles sma smarts cml xml xyz txt html htm cgi gif jpg jpeg msbmp png svg svgz
- Parameters:
fname
- the filename
- Returns:
- the extension
-
getMolfileExtensions
public static String[] getMolfileExtensions()
Gets the array of known molecule file extensions.
- Returns:
- the array of known molecule file extensions
-
getMolfileFormats
public static String[] getMolfileFormats()
Gets the array of known molecule file formats.
- Returns:
- the array of known molecule file formats
-
isOutputCleanable
public static boolean isOutputCleanable(String fmt) throws SecurityException
Tests whether the specified output format is cleanable. For a non-cleanable output format, cleaning is meaningless because coordinates are not stored.
- Parameters:
fmt
- the format string
- Returns:
- true if the specified output format is cleanable, false otherwise
- Throws:
SecurityException
- Since:
- Marvin 4.1, 02/13/2006
-
registerFormat
public static void registerFormat(MFileFormat mff)
Registers a user defined file format. The MFileFormat.F_USER_DEFINED flag is automatically set.
- Parameters:
mff
- the file format
- Since:
- Marvin 5.0, 05/23/2007
-
getFormat
public static MFileFormat getFormat(String fmt)
Gets the file format descriptor for the specified codename.
- Parameters:
fmt
- the format codename
- Returns:
- the descriptor or null if not found
- Since:
- Marvin 5.0, 05/23/2007
-
findFormats
public static MFileFormat[] findFormats(String fmt, long flags, long mask)
Gets a list of formats.
- Parameters:
fmt
- the format name or null if not important
flags
- select formats of which the specified flags are set
mask
- only bits specified here are taken into account
- Returns:
- the list
- Since:
- Marvin 5.0, 05/24/2007
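One plausible reading of the flags and mask parameters is a masked comparison: a format is selected when its flag bits agree with flags on every bit that is set in mask. The sketch below illustrates that reading only. The actual selection logic is not shown in this documentation, and FlagMaskSketch, matches and the three flag constants are hypothetical, not MFileFormat constants.

```java
/** Illustration of a masked flag comparison; an assumed reading, not ChemAxon code. */
final class FlagMaskSketch {
    // Hypothetical flag bits used only for this example.
    static final long CAN_READ = 1L;
    static final long CAN_WRITE = 1L << 1;
    static final long CUSTOM = 1L << 2;

    /** True when formatFlags and flags agree on every bit selected by mask. */
    static boolean matches(long formatFlags, long flags, long mask) {
        return (formatFlags & mask) == (flags & mask);
    }

    public static void main(String[] args) {
        long readWrite = CAN_READ | CAN_WRITE;
        // Require CAN_READ to be set; ignore all other bits.
        System.out.println(matches(readWrite, CAN_READ, CAN_READ));
        // Require CUSTOM to be clear: flags has the bit unset, mask selects it.
        System.out.println(matches(readWrite, 0L, CUSTOM));
        // Require CUSTOM to be set, which readWrite does not satisfy.
        System.out.println(matches(readWrite, CUSTOM, CUSTOM));
    }
}
```

Under this reading, a mask of 0 selects every format, and setting a bit in mask while leaving it clear in flags selects formats that lack that flag.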
-
createRecordReader
public static MRecordReader createRecordReader(InputStream is, String opts) throws MolFormatException, IOException
Creates a record reader for an input stream.
- Parameters:
is
- the input stream
opts
- input options or null
- Returns:
- the record reader or null if the format was not recognized
- Throws:
IllegalCharsetNameException
- if an illegal encoding is used
UnsupportedCharsetException
- if an unsupported encoding is used
SecurityException
- if the module cannot be loaded because of a firewall problem
MolFormatException
IOException
- Since:
- Marvin 5.0, 06/03/2007
- See Also:
MFileFormat.createRecordReader(MolInputStream, String)
-
createRecordReader
public static MRecordReader createRecordReader(InputStream is, String opts, String enc, String path) throws MolFormatException, IOException
Creates a record reader for an input stream.
- Parameters:
is
- the input stream
opts
- input options or null
enc
- the input encoding or null
path
- the file path (it can also be a URL) or null
- Returns:
- the record reader or null if the format was not recognized
- Throws:
IllegalCharsetNameException
- if an illegal encoding is used
UnsupportedCharsetException
- if an unsupported encoding is used
SecurityException
- if the module cannot be loaded because of a firewall problem
MolFormatException
IOException
- Since:
- Marvin 5.0, 06/03/2007, Marvin 5.3
- See Also:
MFileFormat.createRecordReader(MolInputStream, String)
-
createExportModule
public static MolExportModule createExportModule(String fmt) throws MolExportException
Creates an export module for the specified format.
- Parameters:
fmt
- the format name
- Throws:
SecurityException
- if the module cannot be loaded because of a firewall problem
MolExportException
- See Also:
MFileFormat.createExportModule()
-
createExportModule
public static MolExportModule createExportModule(String fmt, String enc) throws MolExportException
Creates an export module for the specified format with the specified encoding.
- Parameters:
fmt
- the format name
enc
- the encoding
- Throws:
SecurityException
- if the module cannot be loaded because of a firewall problem
MolExportException
- See Also:
MFileFormat.createExportModule()
-
convertToSmilingFormat
public static String[] convertToSmilingFormat(Molecule m) throws MolExportException
Tries to convert a molecule to a SMILES related format. SMILES, SMARTS, CxSMILES and CxSMARTS are tried in this order.
- Returns:
- the result of the first successful conversion, the 0th array element is the converted text, the 1st element is the format
- Throws:
MolExportException
- if conversion was not successful
- Since:
- Marvin 5.0, 11/11/2007
-
convertToSmilingFormat
public static String[] convertToSmilingFormat(MProp p) throws MolExportException
Tries to convert a property to text with a SMILES related format argument. SMILES, SMARTS, CxSMILES and CxSMARTS are tried in this order.
- Returns:
- the result of the first successful conversion, the 0th array element is the converted text, the 1st element is the format
- Throws:
MolExportException
- if conversion was not successful
- Since:
- Marvin 5.0, 11/11/2007
-
recognizeOneLineFormat
public static String recognizeOneLineFormat(String s)
Recognizes a one-line string as CxSMILES, CxSMARTS, AbbrevGroup, Peptide or IUPAC name.
- Parameters:
s
- the input string
- Returns:
- the most probable format or null
- Since:
- Marvin 4.1, 04/06/2006
-
recognizeOneLineFormat
public static String recognizeOneLineFormat(String s, MFileFormat... forbiddeneFormats)
Recognizes a one-line string as CxSMILES, CxSMARTS, AbbrevGroup, Peptide or IUPAC name.
- Parameters:
s
- the input string
forbiddeneFormats
- the list of MFileFormat that should not be recognized
- Returns:
- the most probable format or null
- Since:
- Marvin 4.1, 04/06/2006
-
isURLOrFileName
public static boolean isURLOrFileName(String s)
Tests whether the specified string is a URL (absolute or relative) or file name.
- Parameters:
s
- the string
- Returns:
- true if it is a URL or file name, false otherwise
-
-