Class DocumentAnnotator
- All Implemented Interfaces:
MoleculeImporterIface, AutoCloseable
-
Nested Class Summary
Nested Classes
- static enum: Document type enum for DocumentAnnotator.
- static interface: The interface for a DocumentAnnotator progress event listener.
- static final class: Represents the state of progress during a document annotation process.
-
Field Summary
Fields -
Constructor Summary
Constructors
- DocumentAnnotator(File sourceDocument): Constructs a DocumentAnnotator to annotate the given document file.
- DocumentAnnotator(InputStream sourceDocument): Constructs a DocumentAnnotator to annotate the given document.
- DocumentAnnotator(InputStream sourceDocument, DocumentAnnotator.DocumentType documentType): Constructs a DocumentAnnotator to annotate the given document.
-
Method Summary
Methods
- void close(): Close the underlying input stream.
- static DocumentAnnotator fromPlainText(Reader source): Constructs a DocumentAnnotator to annotate the given text.
- boolean isAnnotationSupported(): Checks whether annotation is supported for the current document type.
- void process(): Process the complete input at once, generating the annotated document.
- read(): Find the next structure in the source document and return it, or null when the end of the document has been reached.
- boolean setAnnotatedOutput(File destination): Set the destination file where an annotated version of the source document should be written.
- boolean setAnnotatedOutput(OutputStream destination): Set the destination output stream where an annotated version of the source document should be written.
- File setAnnotatedOutputDirectory(File annotateDirectory): Set the directory where the annotated document and associated resources will be placed.
- File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension): Set the directory where the annotated document and associated resources will be placed.
- void setCustomHtmlToXmlConverter(XmlToHtmlConverter customConverter): Provide a custom converter from XML to HTML, to be used instead of the default one.
- void setD2SOptions(String options): Sets the options string to use for document annotation.
- void setMolconvert(File molconvert)
- void setOptions(DocumentAnnotatorOptions options): Sets the options used for document annotation.
- void setResourceDirectory(File resourceDirectory): Set the directory where resources will be stored.
- void usePopups(boolean addPopups)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface chemaxon.marvin.io.formats.MoleculeImporterIface
getMDocumentStream, getMolStream
-
Field Details
-
MOL_UID_PROPERTY
-
Constructor Details
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document file. The document format (PDF, HTML, XML) will be auto-detected.
- Parameters:
sourceDocument - the document to annotate
- Throws:
FileNotFoundException - if the document does not exist
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document. The document format (PDF, HTML, XML) will be auto-detected.
To annotate plain text, use:
new DocumentAnnotator(sourceDocument, DocumentAnnotator.DocumentType.TXT)
- Parameters:
sourceDocument - the document to annotate
- Throws:
IOException
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document.
- Parameters:
sourceDocument - the document to annotate
documentType - the type of the source document
-
-
Method Details
-
fromPlainText
Constructs a DocumentAnnotator to annotate the given text.
- Parameters:
source - a Reader representing the source document
-
setOptions
Sets the options used for document annotation. -
isAnnotationSupported
public boolean isAnnotationSupported()
Checks whether annotation is supported for the current document type.
- Returns:
- true if the document type is supported, false otherwise
-
setAnnotatedOutput
Set the destination file where an annotated version of the source document should be written. Not all types of source document support annotation, so this method can return false, in which case no annotated document will be generated.
- Returns:
- true if annotation is supported for the source document, false otherwise.
- Throws:
IOException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason
-
setAnnotatedOutput
Set the destination output stream where an annotated version of the source document should be written. Not all types of source document support annotation, so this method can return false, in which case no annotated document will be generated.
- Returns:
- true if annotation is supported for the source document, false otherwise.
-
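Because setAnnotatedOutput can return false, callers may want to check the result before processing. The following is a minimal sketch of that check, not the library's own code. `AnnotatingSource` and `AnnotateIfSupported` are hypothetical names introduced here. The interface only mirrors the documented `setAnnotatedOutput(File)` and `process()` signatures so the sketch compiles without the library. Skipping `process()` when annotation is unsupported is an assumption about what a caller wants, not documented behaviour.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in mirroring two documented signatures, so the sketch
// compiles without the library on the classpath.
interface AnnotatingSource {
    boolean setAnnotatedOutput(File destination) throws IOException;
    void process() throws IOException;
}

final class AnnotateIfSupported {
    private AnnotateIfSupported() {}

    // Returns true if annotated output was accepted and processing ran;
    // returns false without processing when annotation is unsupported.
    static boolean run(AnnotatingSource source, File destination) throws IOException {
        if (!source.setAnnotatedOutput(destination)) {
            return false; // no annotated document will be generated
        }
        source.process();
        return true;
    }
}
```

With the real class, the same check would apply directly to the boolean returned by `setAnnotatedOutput`.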
setAnnotatedOutputDirectory
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
annotateDirectory - the destination directory
- Returns:
- the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
IOException - if the directory cannot be used to store files.
-
setAnnotatedOutputDirectory
public File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension) throws IOException
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
annotateDirectory - the destination directory
- Returns:
- the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
IOException - if the directory cannot be used to store files.
-
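Since setAnnotatedOutputDirectory returns null when annotation is not supported, one way to make that case explicit is to wrap the result in an Optional. This is only a sketch. `DirectoryTarget` and `OutputLocation` are hypothetical names, and the interface merely mirrors the documented two-argument signature so the code compiles on its own.

```java
import java.io.File;
import java.io.IOException;
import java.util.Optional;

// Hypothetical stand-in mirroring the documented two-argument signature.
interface DirectoryTarget {
    File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension)
            throws IOException;
}

final class OutputLocation {
    private OutputLocation() {}

    // Wraps the nullable return value: an empty Optional means annotation
    // is not supported for the source document.
    static Optional<File> choose(DirectoryTarget target, File directory,
                                 boolean keepOriginalExtension) throws IOException {
        return Optional.ofNullable(
                target.setAnnotatedOutputDirectory(directory, keepOriginalExtension));
    }
}
```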
setD2SOptions
Sets the options string to use for document annotation. -
usePopups
public void usePopups(boolean addPopups)
- Parameters:
addPopups - whether a popup should be generated for each hit
-
setMolconvert
-
setCustomHtmlToXmlConverter
Provide a custom converter from XML to HTML, to be used instead of the default one. -
read
Find the next structure in the source document and return it, or null when the end of the document has been reached. If an annotated document is being generated, calling read() may also cause the corresponding portion of the annotated document to be written to the destination.
- Specified by:
read in interface MoleculeImporterIface
- Returns:
- the next structure found in the source document, or null if the document has been fully processed.
- Throws:
IOException - If an error occurred during reading.
-
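The read-until-null contract described above, combined with the AutoCloseable interface, suggests a try-with-resources loop. The sketch below illustrates that pattern under stated assumptions. `StructureSource` and `ReadLoop` are hypothetical names, and the generic type `T` stands in for whatever structure type read() returns, which this page does not show.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the documented read()/close() contract, so the
// sketch compiles without the library. T stands in for the structure type.
interface StructureSource<T> extends AutoCloseable {
    T read() throws IOException;   // null signals the end of the document

    @Override
    void close() throws IOException;
}

final class ReadLoop {
    private ReadLoop() {}

    // Calls read() until it returns null, collecting every hit, and closes
    // the source afterwards even if read() throws.
    static <T> List<T> readAll(StructureSource<T> source) throws IOException {
        List<T> hits = new ArrayList<>();
        try (StructureSource<T> s = source) {
            T next;
            while ((next = s.read()) != null) {
                hits.add(next);
            }
        }
        return hits;
    }
}
```

Closing through try-with-resources means close() runs on every exit path from the loop.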
close
Description copied from interface: MoleculeImporterIface
Close the underlying input stream. IMPORTANT: call this after reading molecules to close concurrent processing properly.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface MoleculeImporterIface
- Throws:
IOException - If an error has occurred.
-
process
Process the complete input at once, generating the annotated document.
- Throws:
IOException
-
setResourceDirectory
Set the directory where resources will be stored.
- Throws:
IOException - if the directory cannot be used to store files.
-