Class DocumentAnnotator
- All Implemented Interfaces:
- MoleculeImporterIface, AutoCloseable
- 
Nested Class Summary
Nested Classes
- static enum: Document type enum for DocumentAnnotator.
- static interface: The interface for a DocumentAnnotator progress event listener.
- static final class: Represents the state of progress during a document annotation process.
- 
Field Summary
Fields
- 
Constructor Summary
Constructors
- DocumentAnnotator(File sourceDocument): Constructs a DocumentAnnotator to annotate the given document file.
- DocumentAnnotator(InputStream sourceDocument): Constructs a DocumentAnnotator to annotate the given document.
- DocumentAnnotator(InputStream sourceDocument, DocumentAnnotator.DocumentType documentType): Constructs a DocumentAnnotator to annotate the given document.
- 
Method Summary
- void close(): Close the underlying input stream.
- static DocumentAnnotator fromPlainText(Reader source): Constructs a DocumentAnnotator to annotate the given text.
- boolean isAnnotationSupported(): Checks whether annotation is supported for the current document type.
- void process(): Process the complete input at once, generating the annotated document.
- read(): Find the next structure in the source document and return it, or null when the end of the document has been reached.
- boolean setAnnotatedOutput(File destination): Set the destination file where an annotated version of the source document should be written.
- boolean setAnnotatedOutput(OutputStream destination): Set the destination output stream where an annotated version of the source document should be written.
- File setAnnotatedOutputDirectory(File annotateDirectory): Set the directory where the annotated document and associated resources will be placed.
- File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension): Set the directory where the annotated document and associated resources will be placed.
- void setCustomHtmlToXmlConverter(XmlToHtmlConverter customConverter): Provide a custom converter from XML to HTML, to be used instead of the default one.
- void setD2SOptions(String options): Sets the options string to use for document annotation.
- void setMolconvert(File molconvert)
- void setOptions(DocumentAnnotatorOptions options): Sets the options used for document annotation.
- void setResourceDirectory(File resourceDirectory): Set the directory where resources will be stored.
- void usePopups(boolean addPopups)
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface chemaxon.marvin.io.formats.MoleculeImporterIface: getMDocumentStream, getMolStream
- 
Field Details
MOL_UID_PROPERTY
 
 
- 
- 
Constructor Details
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document file. The document format (PDF, HTML, XML) will be auto-detected.
- Parameters:
- sourceDocument - the document to annotate
- Throws:
- FileNotFoundException - if the document does not exist
 
- 
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document. The document format (PDF, HTML, XML) will be auto-detected. To annotate plain text, use: new DocumentAnnotator(sourceDocument, DocumentAnnotator.DocumentType.TXT)
- Parameters:
- sourceDocument - the document to annotate
- Throws:
- IOException
 
- 
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document.
- Parameters:
- sourceDocument - the document to annotate
- documentType - the type of the source document
 
 
- 
- 
Method Details
fromPlainText
Constructs a DocumentAnnotator to annotate the given text.
- Parameters:
- source - a Reader representing the source document
 
- 
setOptions
Sets the options used for document annotation.
- 
isAnnotationSupported
public boolean isAnnotationSupported()
Checks whether annotation is supported for the current document type.
- Returns:
- true if the document type is supported, false otherwise
 
- 
setAnnotatedOutput
Set the destination file where an annotated version of the source document should be written. Not all types of source document are supported for annotation, so this method can return false, in which case no annotated document will be generated.
- Returns:
- true if annotation is supported for the source document, false otherwise.
- Throws:
- IOException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason
 
- 
setAnnotatedOutput
Set the destination output stream where an annotated version of the source document should be written. Not all types of source document are supported for annotation, so this method can return false, in which case no annotated document will be generated.
- Returns:
- true if annotation is supported for the source document, false otherwise.
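The boolean result described above can be checked before any processing is attempted. The following is a minimal sketch of that pattern, not ChemAxon code: `AnnotationTarget` and `FakeTarget` are hypothetical stand-ins that only mirror the documented `setAnnotatedOutput(OutputStream)` and `process()` signatures, so the example runs without the library on the classpath. Skipping `process()` when the result is false is a design choice of this sketch, not documented behaviour.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class AnnotatedOutputSketch {

    /** Hypothetical stand-in for the two DocumentAnnotator methods used here. */
    interface AnnotationTarget {
        boolean setAnnotatedOutput(OutputStream destination);
        void process() throws IOException;
    }

    /** Fake implementation used only to exercise the sketch. */
    static final class FakeTarget implements AnnotationTarget {
        private final boolean supported;
        private OutputStream out;

        FakeTarget(boolean supported) {
            this.supported = supported;
        }

        @Override
        public boolean setAnnotatedOutput(OutputStream destination) {
            if (supported) {
                out = destination;
            }
            return supported;
        }

        @Override
        public void process() throws IOException {
            if (out != null) {
                out.write("annotated".getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    /**
     * Requests annotated output and calls process() only when the request
     * was accepted. Returns true if annotated output was requested successfully.
     */
    static boolean annotateIfSupported(AnnotationTarget annotator, OutputStream out)
            throws IOException {
        if (!annotator.setAnnotatedOutput(out)) {
            // false means no annotated document will be generated
            return false;
        }
        annotator.process();
        return true;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean written = annotateIfSupported(new FakeTarget(true), out);
        System.out.println("annotated output written: " + written + ", bytes: " + out.size());

        boolean skipped = annotateIfSupported(new FakeTarget(false), new ByteArrayOutputStream());
        System.out.println("unsupported type handled: " + !skipped);
    }
}
```

With a real DocumentAnnotator the same check would apply to the value returned by setAnnotatedOutput; the fake classes exist only to make the sketch executable.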
 
- 
setAnnotatedOutputDirectory
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
- annotateDirectory - the destination directory
- Returns:
- the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
- IOException - if the directory cannot be used to store files.
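Because this method signals an unsupported document type by returning null rather than by throwing, callers may want to make the null case explicit. The sketch below wraps the result in an `Optional`; `DirectoryTarget`, `chooseOutput`, the directory name, and the file name produced by the fake are all hypothetical and stand in for the real class so that the example is self-contained.

```java
import java.io.File;
import java.io.IOException;
import java.util.Optional;

public class OutputDirectorySketch {

    /** Hypothetical stand-in for the single-argument setAnnotatedOutputDirectory. */
    interface DirectoryTarget {
        File setAnnotatedOutputDirectory(File annotateDirectory) throws IOException;
    }

    /** Turns the documented null result into an empty Optional. */
    static Optional<File> chooseOutput(DirectoryTarget annotator, File directory)
            throws IOException {
        return Optional.ofNullable(annotator.setAnnotatedOutputDirectory(directory));
    }

    public static void main(String[] args) throws IOException {
        File directory = new File("annotated-out"); // illustrative path, never created here

        // Fake that accepts the directory and picks an arbitrary document name.
        DirectoryTarget supported = dir -> new File(dir, "document.html");
        // Fake that behaves like an unsupported source document type.
        DirectoryTarget unsupported = dir -> null;

        System.out.println(chooseOutput(supported, directory)
                .map(File::getName)
                .orElse("no annotated output"));
        System.out.println(chooseOutput(unsupported, directory)
                .map(File::getName)
                .orElse("no annotated output"));
    }
}
```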
 
- 
setAnnotatedOutputDirectory
public File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension) throws IOException
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
- annotateDirectory - the destination directory
- Returns:
- the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
- IOException - if the directory cannot be used to store files.
 
- 
setD2SOptions
Sets the options string to use for document annotation.
- 
usePopups
public void usePopups(boolean addPopups)
- Parameters:
- addPopups - whether a popup should be generated for each hit
 
- 
setMolconvert
- 
setCustomHtmlToXmlConverter
Provide a custom converter from XML to HTML, to be used instead of the default one.
- 
read
Find the next structure in the source document and return it, or null when the end of the document has been reached. If an annotated document is being generated, calling read() might also cause the corresponding portion of the annotated document to be written to the destination.
- Specified by:
- read in interface MoleculeImporterIface
- Returns:
- the next structure found in the source document, or null if the document has been fully processed.
- Throws:
- IOException - If an error occurred during reading.
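The read-until-null contract, together with the AutoCloseable interface listed for this class, suggests a loop inside try-with-resources. The sketch below shows that shape under stated assumptions: `StructureSource` and `FakeSource` are hypothetical stand-ins that mirror only the documented `read()` and `close()` behaviour, and the element type is left generic because the return type of read() is not shown in this text.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    /** Hypothetical stand-in mirroring the documented read()/close() contract. */
    interface StructureSource<T> extends AutoCloseable {
        /** Returns the next item, or null once the document is exhausted. */
        T read() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** Fake source backed by a fixed list; it records whether close() ran. */
    static final class FakeSource implements StructureSource<String> {
        private final Iterator<String> items;
        boolean closed;

        FakeSource(String... names) {
            this.items = Arrays.asList(names).iterator();
        }

        @Override
        public String read() {
            return items.hasNext() ? items.next() : null;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Drains the source until read() returns null, then closes it. */
    static <T> List<T> readAll(StructureSource<T> source) throws IOException {
        List<T> found = new ArrayList<>();
        try (StructureSource<T> s = source) {
            T next;
            while ((next = s.read()) != null) {
                found.add(next);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        FakeSource source = new FakeSource("aspirin", "caffeine");
        List<String> names = readAll(source);
        System.out.println(names + " closed=" + source.closed);
    }
}
```

Using try-with-resources means close() runs even if read() throws, which matches the note in this document that close should be called after reading.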
 
- 
close
Description copied from interface: MoleculeImporterIface
Close the underlying input stream. IMPORTANT: call this after reading molecules to close concurrent processing properly.
- Specified by:
- close in interface AutoCloseable
- Specified by:
- close in interface MoleculeImporterIface
- Throws:
- IOException - If an error has occurred.
 
- 
process
Process the complete input at once, generating the annotated document.
- Throws:
- IOException
 
- 
setResourceDirectory
Set the directory where resources will be stored.
- Throws:
- IOException - if the directory cannot be used to store files.
 
 
-