public class PDFHighlighter extends PDFTextStripper
Fields inherited from class PDFTextStripper: charactersByArticle, document, output, outputEncoding, systemLineSeparator

| Constructor and Description |
|---|
| PDFHighlighter() - Default constructor. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | endPage(PDPage pdPage) - End a page. |
| void | generateXMLHighlight(PDDocument pdDocument, String[] sWords, Writer xmlOutput) - Generate an XML highlight string based on the PDF. |
| void | generateXMLHighlight(PDDocument pdDocument, String highlightWord, Writer xmlOutput) - Generate an XML highlight string based on the PDF. |
| static void | main(String[] args) - Command line application. |
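
The two generateXMLHighlight overloads take one or more search words and write their results to a Writer. The sketch below illustrates that general idea with the standard library only; it is not PDFBox code. The class name HighlightSketch, the method writeHighlights, the per-page string input, and the element and attribute names in the output are all hypothetical and do not reproduce the format PDFHighlighter actually emits. Literal, case-insensitive matching is likewise an assumption made for the example.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: not PDFBox code and not PDFBox's output format.
class HighlightSketch {

    // pages: already-extracted text, one string per page (hypothetical input).
    // words: the words to search for; xmlOutput: where match records are written.
    static void writeHighlights(List<String> pages, String[] words, Writer xmlOutput)
            throws IOException {
        xmlOutput.write("<highlights>\n");
        for (int pageIndex = 0; pageIndex < pages.size(); pageIndex++) {
            String text = pages.get(pageIndex);
            for (String word : words) {
                // Quote the word so it is matched literally, ignoring case (an assumption).
                Pattern pattern = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
                Matcher matcher = pattern.matcher(text);
                while (matcher.find()) {
                    // Record the zero-based page index, character offset, and match length.
                    xmlOutput.write("  <match page=\"" + pageIndex
                            + "\" pos=\"" + matcher.start()
                            + "\" len=\"" + (matcher.end() - matcher.start()) + "\"/>\n");
                }
            }
        }
        xmlOutput.write("</highlights>\n");
        xmlOutput.flush();
    }

    public static void main(String[] args) throws IOException {
        List<String> pages = Arrays.asList("Hello world", "World of PDF");
        writeHighlights(pages, new String[] {"world"}, new OutputStreamWriter(System.out));
    }
}
```

Passing a StringWriter instead of the OutputStreamWriter captures the result in memory, which mirrors how the Writer parameter of generateXMLHighlight lets the caller choose the destination.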
Methods inherited from class PDFTextStripper: endArticle, endDocument, getAddMoreFormatting, getArticleEnd, getArticleStart, getAverageCharTolerance, getCharactersByArticle, getCurrentPageNo, getDropThreshold, getEndBookmark, getEndPage, getIndentThreshold, getLineSeparator, getListItemPatterns, getOutput, getPageEnd, getPageSeparator, getPageStart, getParagraphEnd, getParagraphStart, getSeparateByBeads, getSortByPosition, getSpacingTolerance, getStartBookmark, getStartPage, getSuppressDuplicateOverlappingText, getText, getText, getWordSeparator, handleLineSeparation, inspectFontEncoding, isParagraphSeparation, matchListItemPattern, matchPattern, processPage, processPages, processTextPosition, resetEngine, setAddMoreFormatting, setArticleEnd, setArticleStart, setAverageCharTolerance, setDropThreshold, setEndBookmark, setEndPage, setIndentThreshold, setLineSeparator, setListItemPatterns, setPageEnd, setPageSeparator, setPageStart, setParagraphEnd, setParagraphStart, setShouldSeparateByBeads, setSortByPosition, setSpacingTolerance, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, startArticle, startArticle, startDocument, startPage, writeCharacters, writeLineSeparator, writePage, writePageEnd, writePageSeperator, writePageStart, writeParagraphEnd, writeParagraphSeparator, writeParagraphStart, writeString, writeString, writeText, writeText, writeWordSeparator

Methods inherited from class PDFStreamEngine: getColorSpaces, getCurrentPage, getFonts, getGraphicsStack, getGraphicsState, getGraphicsStates, getResources, getTextLineMatrix, getTextMatrix, getTotalCharCnt, getValidCharCnt, getXObjects, isForceParsing, processEncodedText, processOperator, processOperator, processStream, processSubStream, registerOperatorProcessor, setColorSpaces, setFonts, setForceParsing, setGraphicsStack, setGraphicsState, setGraphicsStates, setTextLineMatrix, setTextMatrix

public PDFHighlighter() throws IOException
Throws:
IOException - If there is an error constructing this class.

public void generateXMLHighlight(PDDocument pdDocument, String highlightWord, Writer xmlOutput) throws IOException
Parameters:
pdDocument - The PDF to find words in.
highlightWord - The word to search for.
xmlOutput - The resulting output xml file.
Throws:
IOException - If there is an error reading from the PDF, or writing to the XML.

public void generateXMLHighlight(PDDocument pdDocument, String[] sWords, Writer xmlOutput) throws IOException
Parameters:
pdDocument - The PDF to find words in.
sWords - The words to search for.
xmlOutput - The resulting output xml file.
Throws:
IOException - If there is an error reading from the PDF, or writing to the XML.

protected void endPage(PDPage pdPage) throws IOException
Overrides:
endPage in class PDFTextStripper
Parameters:
pdPage - The page we are about to process.
Throws:
IOException - If there is any error writing to the stream.

public static void main(String[] args) throws IOException
Parameters:
args - The command line arguments to the application.
Throws:
IOException - If there is an error generating the highlight file.

Copyright © 2002-2015 The Apache Software Foundation. All Rights Reserved.