public class PDFHighlighter extends PDFTextStripper
Fields inherited from class PDFTextStripper: charactersByArticle, document, LINE_SEPARATOR, output

| Constructor and Description | 
|---|
| PDFHighlighter() Default constructor. | 
| Modifier and Type | Method and Description | 
|---|---|
| protected void | endPage(PDPage pdPage) End a page. | 
| void | generateXMLHighlight(PDDocument pdDocument, String[] sWords, Writer xmlOutput) Generate an XML highlight string based on the PDF. | 
| void | generateXMLHighlight(PDDocument pdDocument, String highlightWord, Writer xmlOutput) Generate an XML highlight string based on the PDF. | 
| static void | main(String[] args) Command line application. | 
| protected void | showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) This method was originally written by Ben Litchfield for PDFStreamEngine. | 
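
The two generateXMLHighlight overloads in the summary above differ only in whether they take one search word or an array of them. The sketch below illustrates that pairing and the general idea of recording where each word occurs in a page's text. It uses only the Java standard library and is not the PDFBox implementation: the class name HighlightSketch, the method writeHighlights, the pageTexts argument (standing in for text already extracted from a PDF), and the element and attribute names in the output are all assumptions made for illustration. The XML that PDFHighlighter actually emits is not described on this page.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; hypothetical names, not part of PDFBox.
final class HighlightSketch {

    /**
     * Writes one <loc> entry for every occurrence of each word in each page's text.
     * The output format here is an assumption, not the format PDFHighlighter produces.
     */
    static void writeHighlights(String[] pageTexts, String[] words, Writer out) throws IOException {
        out.write("<highlights>\n");
        for (int page = 0; page < pageTexts.length; page++) {
            for (String word : words) {
                // Pattern.quote matches the word literally; the flag ignores case.
                Pattern pattern = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
                Matcher matcher = pattern.matcher(pageTexts[page]);
                while (matcher.find()) {
                    int length = matcher.end() - matcher.start();
                    out.write("  <loc page=\"" + page + "\" pos=\"" + matcher.start()
                            + "\" len=\"" + length + "\"/>\n");
                }
            }
        }
        out.write("</highlights>\n");
        out.flush();
    }

    /** Single-word convenience overload that delegates to the array version. */
    static void writeHighlights(String[] pageTexts, String word, Writer out) throws IOException {
        writeHighlights(pageTexts, new String[] { word }, out);
    }

    public static void main(String[] args) throws IOException {
        StringWriter xml = new StringWriter();
        writeHighlights(new String[] { "PDF text. More pdf text." }, "pdf", xml);
        System.out.print(xml);
    }
}
```

Run as written, the main method prints two entries for page 0, at character positions 0 and 15, each of length 3. Writing to a Writer rather than returning a String mirrors the xmlOutput parameter in the summary above, so the caller decides whether the result goes to a file or to memory.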
Methods inherited from class PDFTextStripper: endArticle, endDocument, getAddMoreFormatting, getArticleEnd, getArticleStart, getAverageCharTolerance, getCharactersByArticle, getCurrentPageNo, getDropThreshold, getEndBookmark, getEndPage, getIndentThreshold, getLineSeparator, getListItemPatterns, getOutput, getPageEnd, getPageStart, getParagraphEnd, getParagraphStart, getSeparateByBeads, getSortByPosition, getSpacingTolerance, getStartBookmark, getStartPage, getSuppressDuplicateOverlappingText, getText, getWordSeparator, matchPattern, processPage, processPages, processTextPosition, setAddMoreFormatting, setArticleEnd, setArticleStart, setAverageCharTolerance, setDropThreshold, setEndBookmark, setEndPage, setIndentThreshold, setLineSeparator, setListItemPatterns, setPageEnd, setPageStart, setParagraphEnd, setParagraphStart, setShouldSeparateByBeads, setSortByPosition, setSpacingTolerance, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, startArticle, startArticle, startDocument, startPage, writeCharacters, writeLineSeparator, writePage, writePageEnd, writePageStart, writeParagraphEnd, writeParagraphSeparator, writeParagraphStart, writeString, writeString, writeText, writeWordSeparator

Methods inherited from class PDFStreamEngine: addOperator, applyTextAdjustment, beginText, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getResources, getTextLineMatrix, getTextMatrix, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showForm, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator

public PDFHighlighter() throws IOException
Throws:
IOException - If there is an error constructing this class.

public void generateXMLHighlight(PDDocument pdDocument, String highlightWord, Writer xmlOutput) throws IOException
Parameters:
pdDocument - The PDF to find words in.
highlightWord - The word to search for.
xmlOutput - The resulting output XML file.
Throws:
IOException - If there is an error reading from the PDF, or writing to the XML.

public void generateXMLHighlight(PDDocument pdDocument, String[] sWords, Writer xmlOutput) throws IOException
Parameters:
pdDocument - The PDF to find words in.
sWords - The words to search for.
xmlOutput - The resulting output XML file.
Throws:
IOException - If there is an error reading from the PDF, or writing to the XML.

protected void endPage(PDPage pdPage) throws IOException
Overrides:
endPage in class PDFTextStripper
Parameters:
pdPage - The page we are about to process.
Throws:
IOException - If there is any error writing to the stream.

public static void main(String[] args) throws IOException
Parameters:
args - The command line arguments to the application.
Throws:
IOException - If there is an error generating the highlight file.

protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Overrides:
showGlyph in class PDFStreamEngine
Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
Throws:
IOException - if the glyph cannot be processed

Copyright © 2002–2016 The Apache Software Foundation. All rights reserved.