Class DocumentAnnotator
- All Implemented Interfaces:
MoleculeImporterIface, AutoCloseable
-
Nested Class Summary
Modifier and Type: Description
static enum: Document type enum for DocumentAnnotator.
static interface: The interface for a DocumentAnnotator progress event listener.
static final class: Represents the state of progress during a document annotation process.
-
Field Summary
-
Constructor Summary
Constructor: Description
DocumentAnnotator(File sourceDocument): Constructs a DocumentAnnotator to annotate the given document file.
DocumentAnnotator(InputStream sourceDocument): Constructs a DocumentAnnotator to annotate the given document.
DocumentAnnotator(InputStream sourceDocument, DocumentAnnotator.DocumentType documentType): Constructs a DocumentAnnotator to annotate the given document.
-
Method Summary
Modifier and Type, Method: Description
void close(): Close the underlying input stream.
static DocumentAnnotator fromPlainText(Reader source): Constructs a DocumentAnnotator to annotate the given text.
boolean isAnnotationSupported(): Checks whether annotation is supported for the current document type.
void process(): Process the complete input at once, generating the annotated document.
read(): Find the next structure in the source document and return it, or null when the end of the document has been reached.
boolean setAnnotatedOutput(File destination): Set the destination file where an annotated version of the source document should be written.
boolean setAnnotatedOutput(OutputStream destination): Set the destination output stream where an annotated version of the source document should be written.
File setAnnotatedOutputDirectory(File annotateDirectory): Set the directory where the annotated document and associated resources will be placed.
File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension): Set the directory where the annotated document and associated resources will be placed.
void setCustomHtmlToXmlConverter(XmlToHtmlConverter customConverter): Provide a custom converter from XML to HTML, to be used instead of the default one.
void setD2SOptions(String options): Sets the options string to use for document annotation.
void setMolconvert(File molconvert)
void setOptions(DocumentAnnotatorOptions options): Sets the options used for document annotation.
void setResourceDirectory(File resourceDirectory): Set the directory where resources will be stored.
void usePopups(boolean addPopups)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface chemaxon.marvin.io.formats.MoleculeImporterIface
getMDocumentStream, getMolStream
-
Field Details
-
MOL_UID_PROPERTY
-
-
Constructor Details
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document file. The document format (PDF, HTML, XML) will be auto-detected.
- Parameters:
sourceDocument - the document to annotate
- Throws:
FileNotFoundException - if the document does not exist
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document. The document format (PDF, HTML, XML) will be auto-detected.
To annotate plain text, use:
new DocumentAnnotator(sourceDocument, DocumentAnnotator.DocumentType.TXT)
- Parameters:
sourceDocument - the document to annotate
- Throws:
IOException
-
DocumentAnnotator
Constructs a DocumentAnnotator to annotate the given document.
- Parameters:
sourceDocument - the document to annotate
documentType - the type of the source document
-
-
Method Details
-
fromPlainText
Constructs a DocumentAnnotator to annotate the given text.
- Parameters:
source - a Reader representing the source document
-
setOptions
Sets the options used for document annotation.
-
isAnnotationSupported
public boolean isAnnotationSupported()
Checks whether annotation is supported for the current document type.
- Returns:
true if the document type is supported, false otherwise
-
setAnnotatedOutput
Set the destination file where an annotated version of the source document should be written. Because not all source document types are supported for annotation, this method can return false, in which case no annotated document will be generated.
- Returns:
true if annotation is supported for the source document, false otherwise.
- Throws:
IOException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason
-
setAnnotatedOutput
Set the destination output stream where an annotated version of the source document should be written. Because not all source document types are supported for annotation, this method can return false, in which case no annotated document will be generated.
- Returns:
true if annotation is supported for the source document, false otherwise.
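The false return described above is easy to overlook, so a caller would typically branch on it. The following is only a sketch: `AnnotationTarget` is a hypothetical stand-in for the single `setAnnotatedOutput(OutputStream)` method, so that the example compiles without the ChemAxon library. The mode labels are likewise assumptions made for illustration, not part of the documented API.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

public class AnnotatedOutputSketch {

    /** Hypothetical stand-in for the boolean-returning setter documented above. */
    interface AnnotationTarget {
        /** Returns false when the source document type cannot be annotated. */
        boolean setAnnotatedOutput(OutputStream destination);
    }

    /** Tries to register the destination and reports which mode the caller ends up in. */
    static String chooseMode(AnnotationTarget annotator, OutputStream destination) {
        if (annotator.setAnnotatedOutput(destination)) {
            return "annotate";
        }
        // No annotated document will be generated; the caller must not expect output here.
        return "skip-annotation";
    }

    public static void main(String[] args) {
        OutputStream out = new ByteArrayOutputStream();
        AnnotationTarget supported = destination -> true;    // toy: annotation supported
        AnnotationTarget unsupported = destination -> false; // toy: annotation unsupported
        System.out.println(chooseMode(supported, out));   // prints annotate
        System.out.println(chooseMode(unsupported, out)); // prints skip-annotation
    }
}
```

With a real DocumentAnnotator, the same check would be made on the value returned by setAnnotatedOutput before relying on any annotated output.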
-
setAnnotatedOutputDirectory
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
annotateDirectory - the destination directory
- Returns:
the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
IOException - if the directory cannot be used to store files.
-
setAnnotatedOutputDirectory
public File setAnnotatedOutputDirectory(File annotateDirectory, boolean keepOriginalExtension) throws IOException
Set the directory where the annotated document and associated resources will be placed. The name of the annotated document is based on the source document file name, if known. Otherwise, it will be an arbitrary file name, which is returned as the result of this method.
- Parameters:
annotateDirectory - the destination directory
- Returns:
the File inside the directory where the main document will be stored, or null if annotation is not supported for the source document.
- Throws:
IOException - if the directory cannot be used to store files.
-
setD2SOptions
Sets the options string to use for document annotation.
-
usePopups
public void usePopups(boolean addPopups)
- Parameters:
addPopups - whether a popup should be generated for each hit
-
setMolconvert
-
setCustomHtmlToXmlConverter
Provide a custom converter from XML to HTML, to be used instead of the default one.
-
read
Find the next structure in the source document and return it, or null when the end of the document has been reached. If an annotated document is being generated, calling read() may also cause the corresponding portion of the annotated document to be written to the destination.
- Specified by:
read in interface MoleculeImporterIface
- Returns:
the next structure found in the source document, or null if the document has been fully processed.
- Throws:
IOException - if an error occurred during reading.
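The read-until-null contract described above, combined with AutoCloseable, suggests a loop inside try-with-resources. The following is a minimal sketch of that pattern only. `StructureSource` is a hypothetical stand-in for the `read()`/`close()` part of DocumentAnnotator, and plain strings stand in for structures, so that the example compiles without the ChemAxon library. Neither is part of the real API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadLoopSketch {

    /** Hypothetical stand-in for the read()/close() contract documented above. */
    interface StructureSource<T> extends AutoCloseable {
        /** Returns the next structure, or null once the document is exhausted. */
        T read() throws IOException;

        @Override
        void close() throws IOException;
    }

    /** Calls read() until it returns null, then closes the source. */
    static <T> List<T> readAll(StructureSource<T> source) throws IOException {
        List<T> found = new ArrayList<>();
        // try-with-resources guarantees close() runs even if read() throws
        try (StructureSource<T> s = source) {
            T next;
            while ((next = s.read()) != null) {
                found.add(next);
            }
        }
        return found;
    }

    /** Toy source backed by a fixed list, used only for this demo. */
    static StructureSource<String> fixedSource(String... items) {
        Iterator<String> it = Arrays.asList(items).iterator();
        return new StructureSource<String>() {
            @Override public String read() { return it.hasNext() ? it.next() : null; }
            @Override public void close() { }
        };
    }

    public static void main(String[] args) throws IOException {
        List<String> hits = readAll(fixedSource("benzene", "ethanol"));
        System.out.println(hits); // prints [benzene, ethanol]
    }
}
```

In real use, a DocumentAnnotator instance would take the place of the toy source. Closing it in try-with-resources also covers the note on close() about calling it after reading.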
-
close
Description copied from interface: MoleculeImporterIface
Close the underlying input stream. IMPORTANT: call this after reading molecules to close concurrent processing properly.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface MoleculeImporterIface
- Throws:
IOException - if an error has occurred.
-
process
Process the complete input at once, generating the annotated document.
-
setResourceDirectory
Set the directory where resources will be stored.
- Throws:
IOException - if the directory cannot be used to store files.
-