Class DocumentAnnotator

java.lang.Object
chemaxon.naming.document.annotate.DocumentAnnotator
All Implemented Interfaces:
chemaxon.marvin.io.formats.MoleculeImporterIface, AutoCloseable

@PublicAPI public class DocumentAnnotator extends Object implements chemaxon.marvin.io.formats.MoleculeImporterIface, AutoCloseable
Generate a chemically annotated HTML view of a document.
  • Constructor Details

    • DocumentAnnotator

      public DocumentAnnotator(File sourceDocument) throws FileNotFoundException
      Constructs a DocumentAnnotator to annotate the given document file.

      The document format (PDF, HTML, XML) will be auto-detected.

      Parameters:
      sourceDocument - the document to annotate
      Throws:
      FileNotFoundException - if the document does not exist
    • DocumentAnnotator

      public DocumentAnnotator(InputStream sourceDocument) throws IOException
      Constructs a DocumentAnnotator to annotate the given document.

      The document format (PDF, HTML, XML) will be auto-detected.

      To annotate plain text, use new DocumentAnnotator(sourceDocument, DocumentAnnotator.DocumentType.TXT) instead.

      Parameters:
      sourceDocument - the document to annotate
      Throws:
      IOException
    • DocumentAnnotator

      public DocumentAnnotator(InputStream sourceDocument, DocumentAnnotator.DocumentType documentType)
      Constructs a DocumentAnnotator to annotate the given document.
      Parameters:
      sourceDocument - the document to annotate
      documentType - the type of the source document
  • Method Details

    • fromPlainText

      public static DocumentAnnotator fromPlainText(Reader source)
      Constructs a DocumentAnnotator to annotate the given text.
      Parameters:
      source - a Reader representing the source document
    • setOptions

      public void setOptions(DocumentAnnotatorOptions options)
      Sets the options used for document annotation.
    • isAnnotationSupported

      public boolean isAnnotationSupported()
      Checks whether annotation is supported for the current document type.
      Returns:
      true if the document type is supported, false otherwise
    • setAnnotatedOutput

      public boolean setAnnotatedOutput(File destination) throws IOException
      Sets the destination file to which an annotated version of the source document should be written.

      Not all types of source documents are supported for annotation, so this method can return false, in which case no annotated document will be generated.

      Returns:
      true if annotation is supported for the source document, false otherwise.
      Throws:
      IOException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason
    • setAnnotatedOutput

      public boolean setAnnotatedOutput(OutputStream destination)
      Sets the destination output stream to which an annotated version of the source document should be written.

      Not all types of source documents are supported for annotation, so this method can return false, in which case no annotated document will be generated.

      Returns:
      true if annotation is supported for the source document, false otherwise.
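
      Because setAnnotatedOutput can return false, a caller may want to check the result before expecting any output. The following is a minimal sketch of that pattern, not the library's own code. The Annotator interface and toyAnnotator helper are hypothetical stand-ins that mirror the documented signatures of setAnnotatedOutput(OutputStream), process() and close(), so the example compiles without the ChemAxon library. Skipping process() in the unsupported case is an assumption of this sketch; the documentation above does not say whether processing is still useful there.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class AnnotatedOutputSketch {

    /** Hypothetical stand-in mirroring the documented method signatures. */
    interface Annotator extends AutoCloseable {
        boolean setAnnotatedOutput(OutputStream destination);
        void process() throws IOException;
        @Override
        void close() throws IOException;
    }

    /**
     * Requests annotated output, then processes the whole document.
     * Returns false when the source type cannot be annotated, so the caller
     * knows that nothing was written to the destination.
     */
    static boolean annotateTo(Annotator annotator, OutputStream destination) throws IOException {
        try (Annotator a = annotator) {
            if (!a.setAnnotatedOutput(destination)) {
                return false; // unsupported source type: do not expect any output
            }
            a.process();
            return true;
        }
    }

    /** Toy annotator for illustration only: writes a fixed marker when supported. */
    static Annotator toyAnnotator(boolean supported) {
        return new Annotator() {
            private OutputStream out;

            @Override
            public boolean setAnnotatedOutput(OutputStream destination) {
                if (supported) {
                    out = destination;
                }
                return supported;
            }

            @Override
            public void process() throws IOException {
                if (out != null) {
                    out.write("<html>annotated</html>".getBytes(StandardCharsets.UTF_8));
                }
            }

            @Override
            public void close() {
                // nothing to release in the toy implementation
            }
        };
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        boolean written = annotateTo(toyAnnotator(true), buffer);
        System.out.println("supported=" + written + ", bytes=" + buffer.size());

        boolean skipped = annotateTo(toyAnnotator(false), new ByteArrayOutputStream());
        System.out.println("supported=" + skipped);
    }
}
```

      With the real class, annotateTo would take a DocumentAnnotator directly; the try-with-resources form works because the class implements AutoCloseable.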
    • setAnnotatedOutputDirectory

      public File setAnnotatedOutputDirectory(File annotateDirectory) throws IOException
      Sets the directory where the annotated document and associated resources will be placed.

      The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.

      Parameters:
      annotateDirectory - the destination directory
      Returns:
      the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
      Throws:
      IOException - if the directory cannot be used to store files.
    • setAnnotatedOutputDirectory

      public File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension) throws IOException
      Sets the directory where the annotated document and associated resources will be placed.

      The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.

      Parameters:
      annotateDirectory - the destination directory
      Returns:
      the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
      Throws:
      IOException - if the directory cannot be used to store files.
    • setD2SOptions

      public void setD2SOptions(String options)
      Sets the options string to use for document annotation.
    • usePopups

      public void usePopups(boolean addPopups)
      Sets whether a popup should be generated for each hit.
      Parameters:
      addPopups - whether a popup should be generated for each hit
    • setMolconvert

      public void setMolconvert(File molconvert)
    • setCustomHtmlToXmlConverter

      public void setCustomHtmlToXmlConverter(chemaxon.naming.document.annotate.XmlToHtmlConverter customConverter)
      Provides a custom XML-to-HTML converter to be used instead of the default one.
    • read

      public Molecule read() throws IOException
      Finds and returns the next structure in the source document, or returns null when the end of the document has been reached.

      If an annotated document is being generated, calling read() may also cause the corresponding portion of the annotated document to be written to the destination.

      Specified by:
      read in interface chemaxon.marvin.io.formats.MoleculeImporterIface
      Returns:
      the next structure found in the source document, or null if the document has been fully processed.
      Throws:
      IOException
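
      The read() contract described above (return the next structure, or null once the document is exhausted) lends itself to a drain loop inside try-with-resources. The sketch below illustrates that loop under stated assumptions: StructureSource and fixedSource are hypothetical stand-ins that mirror the documented read() and close() signatures so the example compiles without the ChemAxon library. With the real class, the element type would be Molecule.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    /** Hypothetical stand-in mirroring the documented read()/close() signatures. */
    interface StructureSource<T> extends AutoCloseable {
        /** Returns the next structure, or null at the end of the document. */
        T read() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** Calls read() until it returns null, collecting the results, then closes the source. */
    static <T> List<T> readAll(StructureSource<T> source) throws IOException {
        List<T> found = new ArrayList<>();
        try (StructureSource<T> s = source) {
            T next;
            while ((next = s.read()) != null) {
                found.add(next);
            }
        }
        return found;
    }

    /** Toy source backed by a fixed list of names, for illustration only. */
    static StructureSource<String> fixedSource(String... names) {
        Iterator<String> it = Arrays.asList(names).iterator();
        return new StructureSource<String>() {
            @Override
            public String read() {
                return it.hasNext() ? it.next() : null;
            }

            @Override
            public void close() {
                // nothing to release in the toy implementation
            }
        };
    }

    public static void main(String[] args) throws IOException {
        // prints [benzene, aspirin]
        System.out.println(readAll(fixedSource("benzene", "aspirin")));
    }
}
```

      Collecting every structure into a list is a choice made for the sketch; for large documents, handling each structure inside the loop avoids holding them all in memory.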
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface chemaxon.marvin.io.formats.MoleculeImporterIface
      Throws:
      IOException
    • process

      public void process() throws IOException
      Processes the complete input at once, generating the annotated document.
      Throws:
      IOException
    • setResourceDirectory

      public void setResourceDirectory(File resourceDirectory) throws IOException
      Sets the directory where resources will be stored.
      Throws:
      IOException - if the directory cannot be used to store files.