Class MolImporter

    • Constructor Detail

      • MolImporter

        public MolImporter​(InputStream is,
                           String opts)
                    throws IOException,
                           MolFormatException
        Create a molecule importer for an input stream. Begins reading the input stream and determines the file format. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. Other parts of the option string are passed to the import module. The input character encoding can also be set in "enc{encoding}" form.
        Parameters:
        is - the input stream to read
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
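        The prefix rules described above can be sketched as a small classifier. This is a stand-alone illustration and not ChemAxon's parser: the class OptionPrefixSketch and its Mode enum are hypothetical names, and the logic is a plain startsWith test on the option string, which is all the description states.

```java
// Hypothetical helper: models only the documented prefix rules of the option string.
final class OptionPrefixSketch {
    enum Mode { MULTISET, MOLMOVIE, NOMOLMOVIE, DEFAULT }

    static Mode classify(String opts) {
        // null means automatic format recognition and default options
        if (opts == null) return Mode.DEFAULT;
        if (opts.startsWith("MULTISET")) return Mode.MULTISET;
        if (opts.startsWith("NOMOLMOVIE")) return Mode.NOMOLMOVIE;
        if (opts.startsWith("MOLMOVIE")) return Mode.MOLMOVIE;
        // anything else is left to the import module
        return Mode.DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(classify("MULTISET"));
        System.out.println(classify("NOMOLMOVIE"));
        System.out.println(classify(null));
    }
}
```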
      • MolImporter

        public MolImporter​(InputStream is,
                           String opts,
                           String enc)
                    throws IOException,
                           MolFormatException
        Create a molecule importer for an input stream. Begins reading the input stream and determines the file format. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. Other parts of the option string are passed to the import module. The input character encoding can also be set in "enc{encoding}" form.
        Parameters:
        is - the input stream to read
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - charset name or null
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        Since:
        Marvin 3.5.5, 01/02/2006
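        The two charset exceptions listed above are the standard java.nio.charset ones. A minimal sketch, assuming the enc value is resolved the way Charset.forName resolves names (how MolImporter does this internally is not stated here), shows which names trigger which exception. CharsetCheckSketch is a hypothetical helper.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper: checks a charset name before passing it as the enc argument.
final class CharsetCheckSketch {
    static String check(String enc) {
        if (enc == null) return "default"; // null is documented as allowed
        try {
            return "ok:" + Charset.forName(enc).name();
        } catch (IllegalCharsetNameException e) {
            return "illegal";     // the name contains characters not allowed in charset names
        } catch (UnsupportedCharsetException e) {
            return "unsupported"; // the name is well formed but no such charset is installed
        }
    }

    public static void main(String[] args) {
        System.out.println(check("UTF-8"));
        System.out.println(check("bad name!"));
        System.out.println(check("x-no-such-charset-0000"));
    }
}
```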
      • MolImporter

        public MolImporter​(InputStream is,
                           String opts,
                           String enc,
                           String fileName)
                    throws IOException,
                           MolFormatException
        Create a molecule importer for an input stream. Begins reading the input stream and determines the file format. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. Other parts of the option string are passed to the import module. The input character encoding can also be set in "enc{encoding}" form.
        Parameters:
        is - the input stream to read
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - charset name or null
        fileName - the original filename the stream is reading from
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        Since:
        Marvin 5.8
      • MolImporter

        public MolImporter​(File f,
                           String opts)
                    throws IOException
        Create a molecule importer for a file. Begins reading the input stream and determines the file format. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. The input character encoding can also be set in "enc{encoding}" form. Other parts of the option string are passed to the import module.
        Parameters:
        f - the file to read
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        See Also:
        tell(), close()
      • MolImporter

        public MolImporter​(File f)
                    throws IOException
        Create a molecule importer for a file. Begins reading the input stream and determines the file format. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. The input character encoding can also be set in "enc{encoding}" form. Other parts of the option string are passed to the import module.
        Parameters:
        f - the file to read
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        Since:
        Marvin 6.3
        See Also:
        tell(), close()
      • MolImporter

        public MolImporter​(String fname)
                    throws IOException,
                           MolFormatException
        Create a molecule importer for a file. Begins reading the input stream and determines the file format. The filename string can contain options in the "file{options}" form. If the option string starts with the substring "MULTISET", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. If it starts with "MOLMOVIE", then the molecules read are taken to be frames of a molecule movie (the default for the XYZ format). If it starts with "NOMOLMOVIE", then multi-molecule XYZ files are not interpreted as molecule movies. Other parts of the option string are passed to the import module. The input character encoding can also be set in "enc{encoding}" form.
        Parameters:
        fname - name of the file to read
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        See Also:
        tell(), close()
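        The "file{options}" form mentioned above can be pictured with a small splitter. This is only a sketch of the notation, not the library's own parsing: FileSpecSketch is a hypothetical class, and it assumes the options start at the first opening brace and that the string ends with the closing brace.

```java
// Hypothetical helper: splits "file{options}" into its two parts.
final class FileSpecSketch {
    final String file;
    final String options; // null when no options were given

    private FileSpecSketch(String file, String options) {
        this.file = file;
        this.options = options;
    }

    static FileSpecSketch parse(String spec) {
        int open = spec.indexOf('{');
        // Assumption: options are present only if a brace pair encloses the tail of the string.
        if (open > 0 && spec.endsWith("}")) {
            String file = spec.substring(0, open);
            String options = spec.substring(open + 1, spec.length() - 1);
            return new FileSpecSketch(file, options);
        }
        return new FileSpecSketch(spec, null);
    }

    public static void main(String[] args) {
        FileSpecSketch s = parse("mols.sdf{MULTISET}");
        System.out.println(s.file + " / " + s.options);
    }
}
```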
      • MolImporter

        public MolImporter​(String fname,
                           Object component,
                           String msg)
                    throws IOException,
                           MolFormatException
        Create a molecule importer with a progress monitor. Begins reading the input stream and determines the file format. The filename string can contain options in the "file{options}" form. If the option string starts with "MULTISET" or "MULTISET,", then all the molecules in the stream are merged into one molecule object containing multiple atom sets. The input character encoding can also be set in "enc{encoding}" form. Other parts of the option string are passed to the import module.
        Parameters:
        fname - name of the file to read
        component - the parent component
        msg - displayed message, where %p is replaced by the file path
        Throws:
        IOException - If an I/O error occurred when determining the file format.
        MolFormatException - If the molecule file is in a format that cannot be read
        IllegalCharsetNameException - if illegal encoding is used
        UnsupportedCharsetException - if unsupported encoding is used
        See Also:
        tell(), close()
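        The %p placeholder in msg can be illustrated with a literal substitution. This is a sketch under the assumption that the replacement is a plain text substitution of every %p occurrence; ProgressMessageSketch is a hypothetical name and the path below is made up for the example.

```java
// Hypothetical helper: fills the %p placeholder of a progress message.
final class ProgressMessageSketch {
    static String format(String msg, String path) {
        // String.replace substitutes every literal occurrence of "%p"
        return msg.replace("%p", path);
    }

    public static void main(String[] args) {
        System.out.println(format("Loading %p ...", "/data/mols.sdf"));
    }
}
```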
    • Method Detail

      • getFileName

        public String getFileName()
        Gets the name of the input file.
        Returns:
        the name of the input file
      • getFile

        public File getFile()
        Gets the file object for the input.
        Returns:
        the File or null (if the input is not a File)
      • getOptions

        public String getOptions()
        Gets the import options.
        Returns:
        the options
      • isGrabbingEnabled

        @Deprecated
        public boolean isGrabbingEnabled()
        Deprecated.
        as of Marvin 6.2. It has no effect on the code.
        Tests whether molecule file content grabbing is enabled.
        Returns:
        true if enabled, false if disabled
        Since:
        4.0, 01/05/2005
      • setGrabbingEnabled

        @Deprecated
        public void setGrabbingEnabled​(boolean v)
        Deprecated.
        as of Marvin 6.2. It has no effect on the code.
        Enables or disables molecule file content grabbing.
        Parameters:
        v - true enables, false disables it
        Since:
        4.0, 01/05/2005
      • getGrabbedMoleculeString

        public String getGrabbedMoleculeString()
        Gets the last grabbed molecule string with LF style line endings by default. If the "noLF" import option was set for MolImporter, then original line endings are kept. E.g. new MolImporter(stream, "mrv:noLF");
        Returns:
        the molecule as a string
        Since:
        4.0, 01/05/2005
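        The difference between the default LF style and the "noLF" option can be pictured as follows. This is an illustrative sketch, not the library's code: LineEndingSketch is a hypothetical class, and it assumes that "LF style" means CRLF and lone CR are both converted to LF.

```java
// Hypothetical helper: models LF normalization versus keeping original line endings.
final class LineEndingSketch {
    static String grabbed(String raw, boolean noLF) {
        if (noLF) return raw; // original line endings are kept
        // Convert CRLF first so that it does not become two line feeds.
        return raw.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String raw = "line1\r\nline2\rline3\n";
        System.out.println(grabbed(raw, false).equals("line1\nline2\nline3\n"));
    }
}
```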
      • isMultiSet

        public boolean isMultiSet()
        Are the imported molecules merged into one multi-set molecule?
        Returns:
        true if the input is a multi-set molecule
      • isMolMovie

        public boolean isMolMovie()
        Are the imported molecules treated as frames of a molecule movie?
        Returns:
        true if the input is a molecule movie
        Since:
        Marvin 5.2, 02/12/2009
      • getMolStream

        public Stream<Molecule> getMolStream()
        Creates a Molecule Stream with the iterator of the importer. Only one iterator can exist at a time, so only one stream can exist at a time.
        Specified by:
        getMolStream in interface chemaxon.marvin.io.formats.MoleculeImporterIface
        Overrides:
        getMolStream in class MDocSource
      • getMDocumentStream

        public Stream<MDocument> getMDocumentStream()
        Creates an MDocument Stream with the iterator of the importer. Only one iterator can exist at a time, so only one stream can exist at a time.
        Specified by:
        getMDocumentStream in interface chemaxon.marvin.io.formats.MoleculeImporterIface
        Overrides:
        getMDocumentStream in class MDocSource
      • setThreadCount

        public void setThreadCount​(int threadCount)
                            throws IllegalStateException
        Sets the number of threads for concurrent processing. Default: the number of CPUs, single-threaded processing if there is 1 CPU.
        Parameters:
        threadCount - the number of threads, set 0 for the number of CPUs, 1 for single-threaded mode
        Throws:
        IllegalStateException - if concurrent processing has already started or if an object input stream is used instead of a record importer
        Since:
        Marvin 5.3
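        The threadCount convention (0 for the number of CPUs, 1 for single-threaded mode) can be sketched like this. ThreadCountSketch is a hypothetical helper; rejecting negative values is an assumption of the sketch, since the description does not say how they are handled.

```java
// Hypothetical helper: resolves the documented threadCount convention to an actual count.
final class ThreadCountSketch {
    static int effective(int threadCount) {
        if (threadCount < 0) {
            // Assumption: negative values are not meaningful.
            throw new IllegalArgumentException("threadCount must be >= 0");
        }
        // 0 means "use the number of CPUs"
        return threadCount == 0
                ? Runtime.getRuntime().availableProcessors()
                : threadCount;
    }

    public static void main(String[] args) {
        System.out.println(effective(0)); // number of CPUs on this machine
        System.out.println(effective(1)); // single-threaded
    }
}
```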
      • getQueryMode

        public boolean getQueryMode()
        Gets query mode. SMILES strings are imported as SMARTS if query mode is set.
        Returns:
        query mode
        Since:
        Marvin 3.3, 11/14/2003
      • setQueryMode

        public void setQueryMode​(boolean q)
        Sets query mode. SMILES strings are imported as SMARTS if query mode is set.
        Parameters:
        q - query mode
        Since:
        Marvin 3.3, 11/14/2003
      • read

        public Molecule read()
                      throws IOException
        Read the next molecule.
        Specified by:
        read in interface chemaxon.marvin.io.formats.MoleculeImporterIface
        Returns:
        the next molecule, or null at end of file
        Throws:
        IOException - If an I/O error occurred
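        The usual way to consume such a reader is to call read() until it returns null and to close the source afterwards. The sketch below shows that loop with try-with-resources. To keep it runnable without the ChemAxon library it uses a hypothetical RecordSource interface and an in-memory ListSource stub in place of MolImporter; only the null-at-end contract and the Closeable behaviour are taken from this page.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ReadLoopSketch {
    /** Hypothetical stand-in for the documented contract: read() returns null at end of input. */
    interface RecordSource<T> extends Closeable {
        T read() throws IOException;
    }

    /** In-memory stub used instead of a real importer. */
    static final class ListSource implements RecordSource<String> {
        private final Iterator<String> it;
        boolean closed;

        ListSource(List<String> records) { this.it = records.iterator(); }

        @Override public String read() { return it.hasNext() ? it.next() : null; }

        @Override public void close() { closed = true; }
    }

    /** Reads every record, then closes the source even if reading fails. */
    static <T> List<T> readAll(RecordSource<T> source) {
        List<T> out = new ArrayList<>();
        try (RecordSource<T> s = source) {
            T record;
            while ((record = s.read()) != null) {
                out.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        ListSource src = new ListSource(Arrays.asList("CCO", "c1ccccc1"));
        System.out.println(readAll(src).size() + " records, closed=" + src.closed);
    }
}
```

        With the real class the same shape would apply, since close() is specified by AutoCloseable.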
      • createMol

        public Molecule createMol()
        Creates a target molecule object for import.
        Returns:
        new target molecule object
        Since:
        Marvin 3.4, 05/08/2004
      • nextDoc

        public MDocument nextDoc()
                          throws IOException
        Reads the next document.
        Specified by:
        nextDoc in class MDocSource
        Returns:
        the next document or null at end of file
        Throws:
        IOException - If an I/O error occurred
        Since:
        Marvin 4.1, 04/14/2006
      • readMol

        @Deprecated
        public Molecule readMol​(Molecule mol)
                         throws MolFormatException,
                                IOException
        Deprecated.
        As of Marvin 14.7.7. Use read() instead.
        Read the next molecule. All the nodes, edges, and properties are removed from mol before reading. If the 'mol' parameter is not null then processing is single-threaded.
        Parameters:
        mol - target molecule object
        Returns:
        the molecule if success, null at end of file
        Throws:
        IOException - If an I/O error occurred
        MolFormatException
      • read

        @Deprecated
        public boolean read​(Molecule mol)
                     throws IOException
        Deprecated.
        As of Marvin 14.7.7. Use read() instead.
        Read the next molecule. All the nodes, edges, and properties are removed from mol before reading. If the 'mol' parameter is not null then the processing is single-threaded. This method requires the parameter 'mol' molecule to be created with MolImporter.createMol() method.
        Parameters:
        mol - target molecule object
        Returns:
        true after success, false at end of file
        Throws:
        IOException - If an I/O error occurred
      • seekRecord

        public void seekRecord​(int k,
                               MProgressMonitor pmon)
                        throws EOFException,
                               IOException
        Seeks to the specified record. This method should not be called before calling setThreadCount(int). Backward seeking (rewinding) is only possible if the underlying input stream is seekable. Note that in concurrent mode the import is not rewindable. Forward seeking is always possible. Seeking terminates before reaching the specified position if the user cancels the progress dialog.
        Specified by:
        seekRecord in class MDocSource
        Parameters:
        k - position
        pmon - progress monitor or null
        Throws:
        EOFException - if end of file reached while trying to seek
        IOException - if read error occurred
        Since:
        Marvin 4.1, 04/19/2006
        See Also:
        isRewindable(), setThreadCount(int)
      • isEndReached

        public boolean isEndReached()
        Tests whether the end of input is already reached.
        Specified by:
        isEndReached in class MDocSource
        Returns:
        true if the end was reached, false otherwise
        Since:
        Marvin 4.1, 06/18/2006
      • estimateNumRecords

        public int estimateNumRecords()
        Estimates the total number of records. If the end of file is already reached, then it returns the exact value. Otherwise, in case of a file with known length, it extrapolates from the last read record index and the value of the file pointer at the last read position. If the input is a stream with unknown total length, then it returns two times the current highest record number.
        Specified by:
        estimateNumRecords in class MDocSource
        Returns:
        estimated number of records or -1 at the beginning of file
        Since:
        Marvin 4.1, 04/18/2006
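        The three cases in the description can be written out as a small function. This is a sketch of the stated rules only: RecordEstimateSketch and its parameters are hypothetical, a non-positive fileLength stands for "unknown length", and the extrapolation is assumed to be linear (records read so far scaled by file length over file pointer).

```java
// Hypothetical helper: models the documented estimation rules.
final class RecordEstimateSketch {
    static int estimate(boolean endReached, int recordsRead, long filePointer, long fileLength) {
        if (endReached) return recordsRead;   // exact value once the end is reached
        if (recordsRead <= 0) return -1;      // nothing read yet: beginning of file
        if (fileLength > 0 && filePointer > 0) {
            // Known length: assume records are spread evenly over the file.
            return (int) (recordsRead * fileLength / filePointer);
        }
        return 2 * recordsRead;               // unknown length: twice the highest record number
    }

    public static void main(String[] args) {
        System.out.println(estimate(false, 10, 1000L, 10000L));
        System.out.println(estimate(false, 10, 1000L, -1L));
    }
}
```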
      • tell

        public long tell()
                  throws IOException
        Returns the current file offset.
        Returns:
        the file pointer
        Throws:
        IOException - if the position cannot be determined
      • getLineCount

        public int getLineCount()
        Gets the current line number. This method should not be called before calling setThreadCount(int).
        Returns:
        the line number
        See Also:
        setThreadCount(int)
      • getRecordCount

        public int getRecordCount()
        Gets the current record number.
        Specified by:
        getRecordCount in class MDocSource
        Returns:
        the record number
        Since:
        Marvin 4.1, 04/18/2006
      • getRecordCountMax

        public int getRecordCountMax()
        Gets the total number of records read.
        Specified by:
        getRecordCountMax in class MDocSource
        Returns:
        the number of records
        Since:
        Marvin 4.1, 04/18/2006
      • close

        public void close()
                   throws IOException
        Close the underlying input stream. IMPORTANT: call this after reading molecules to close concurrent processing properly.
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface Closeable
        Specified by:
        close in interface chemaxon.marvin.io.formats.MoleculeImporterIface
        Overrides:
        close in class MDocSource
        Throws:
        IOException - If an I/O error has occurred.
      • getFormat

        public String getFormat()
        Get the file format.
        Returns:
        "mrv", "mol", "csmol", "sdf", "cssdf", "rdf", "csrdf", "smiles", "sybyl", "mol2", "pdb", "xyz", "cube", "inchi", "gzip:{inner file format}" or "chemaxon.struc.Molecule" if imported from ObjectInputStream (serialized molecule)
      • importMol

        public static Molecule importMol​(byte[] b)
                                  throws MolFormatException
        Read a molecule from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the molecule file contents
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        public static Molecule importMol​(byte[] b,
                                         String opts,
                                         String enc)
                                  throws MolFormatException
        Read a molecule from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the molecule file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - encoding or null
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
        Since:
        Marvin 5.0, 12/27/2007
      • importMol

        public static boolean importMol​(byte[] b,
                                        Molecule mol)
                                 throws MolFormatException
        Read a molecule from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the molecule file contents
        mol - target molecule object
        Returns:
        true in case of successful reading, false if no more molecules
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        public static boolean importMol​(byte[] b,
                                        String opts,
                                        String enc,
                                        Molecule mol)
                                 throws MolFormatException
        Read a molecule from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the molecule file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - encoding or null
        mol - target molecule object
        Returns:
        true in case of successful reading, false if no more molecules
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
        Since:
        Marvin 5.0, 12/27/2007
      • importDoc

        public static MDocument importDoc​(byte[] b)
                                   throws MolFormatException
        Reads a document from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the file contents
        Returns:
        the document or null if no document found in input
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
        Since:
        Marvin 4.1.8, 04/20/2007
      • importDoc

        public static MDocument importDoc​(byte[] b,
                                          String opts,
                                          String enc)
                                   throws MolFormatException
        Reads a document from a byte array. If the array contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        b - the file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - encoding or null
        Returns:
        the document or null if no document found in input
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
        Since:
        Marvin 5.0, 12/27/2007
      • importMol

        public static Molecule importMol​(String s)
                                  throws MolFormatException
        Read a molecule from a string. If the string contains multiple molecules (it is an SDfile for example), read only the first one. If the format is known, it is faster to use importMol(String, String) to avoid wasting time with format recognition. Processing is single-threaded.
        Parameters:
        s - the molecule file contents
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        public static Molecule importMol​(String s,
                                         String opts)
                                  throws MolFormatException
        Read a molecule from a string. If the string contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        s - the molecule file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        public static Molecule importMol​(String s,
                                         chemaxon.formats.ImportOptions options)
                                  throws MolFormatException
        Read a molecule from a string with the given options. If the string contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded. If an encoding is specified in the options, its value is ignored, since the input is already a string and needs no decoding.
        Parameters:
        s - the molecule file contents
        options - options defined by an options object
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        @Deprecated
        public static Molecule importMol​(String s,
                                         String opts,
                                         String enc)
                                  throws MolFormatException
        Deprecated.
        (Since Marvin 5.5) There is no need to specify an encoding for a String input. Instead, if you have a String to import, call importMol(String, String); if you have a byte array, call importMol(byte[], String, String).
        Read a molecule from a string. If the string contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        s - the molecule file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - encoding or null
        Returns:
        the molecule
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
        Since:
        Marvin 5.0, 12/27/2007
      • importMol

        public static boolean importMol​(String s,
                                        Molecule mol)
                                 throws MolFormatException
        Read a molecule from a string. If the string contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        s - the file contents
        mol - target molecule object
        Returns:
        true in case of successful reading, false if no more molecules
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • importMol

        @Deprecated
        public static boolean importMol​(String s,
                                        String opts,
                                        String enc,
                                        Molecule mol)
                                 throws MolFormatException
        Deprecated.
        (Since Marvin 5.5) There is no need to specify an encoding for a String input. Instead, if you have a String to import, call importMol(String, String); if you have a byte array, call importMol(byte[], String, String, Molecule).
        Read a molecule from a string. If the string contains multiple molecules (it is an SDfile for example), read only the first one. Processing is single-threaded.
        Parameters:
        s - the file contents
        opts - the file format and/or options separated by a colon; use null for automatic format recognition and default options
        enc - encoding or null
        mol - target molecule object
        Returns:
        true in case of successful reading, false if no more molecules
        Throws:
        MolFormatException - If the molecule file is in a format that cannot be read
      • getGlobalProperties

        public MPropertyContainer getGlobalProperties()
        Gets the container of global properties that was retrieved earlier from the input stream. Only MRV import supports global properties. They are read when the record importer is initialized.
        Returns:
        global properties in a container or null.
        Since:
        Marvin 5.0, 06/05/2007
      • parseMRV

        public static MDocument parseMRV​(String sval)
                                  throws IOException
        Parses a document from a string in Marvin Document (MRV) format.
        Parameters:
        sval - the string
        Returns:
        the imported document
        Throws:
        IOException
        Since:
        Marvin 5.8, 08/25/2011