public class PDFMarkedContentExtractor extends PDFStreamEngine
| Constructor | Description |
|---|---|
| PDFMarkedContentExtractor() | Instantiates a new PDFMarkedContentExtractor object. |
| PDFMarkedContentExtractor(String encoding) | Constructor that takes the encoding the output will be written in. |
| Modifier and Type | Method | Description |
|---|---|---|
| void | beginMarkedContentSequence(COSName tag, COSDictionary properties) | |
| void | endMarkedContentSequence() | |
| List<PDMarkedContent> | getMarkedContents() | |
| void | processPage(PDPage page) | This will initialise and process the contents of the stream. |
| protected void | processTextPosition(TextPosition text) | This will process a TextPosition object and add the text to the list of characters on a page. |
| protected void | showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) | This method was originally written by Ben Litchfield for PDFStreamEngine. |
| void | xobject(PDXObject xobject) | |
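
The begin/end callbacks in the table above suggest a stack-style pairing: each beginMarkedContentSequence opens a sequence, each endMarkedContentSequence closes the most recent one, and text seen in between belongs to the innermost open sequence. The following is a minimal, library-free sketch of that idea. It is not the PDFBox implementation; `MarkedContentSketch`, `Node`, `begin`, `end`, and `text` are hypothetical names introduced here for illustration, and the rule that text outside any sequence is dropped is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of nested marked-content collection (not PDFBox code).
class MarkedContentSketch {

    // Stand-in for a marked-content node: a tag plus its ordered contents,
    // where each item is either a String (text) or a nested Node.
    static final class Node {
        final String tag;
        final List<Object> contents = new ArrayList<>();

        Node(String tag) {
            this.tag = tag;
        }
    }

    private final List<Node> topLevel = new ArrayList<>();
    private final Deque<Node> open = new ArrayDeque<>();

    // Models a "begin marked content" callback: attach the new node to its
    // parent (or to the top-level list) and make it the innermost open node.
    void begin(String tag) {
        Node node = new Node(tag);
        if (open.isEmpty()) {
            topLevel.add(node);
        } else {
            open.peek().contents.add(node);
        }
        open.push(node);
    }

    // Models an "end marked content" callback: close the innermost open node.
    // An unmatched end is ignored rather than treated as an error.
    void end() {
        if (!open.isEmpty()) {
            open.pop();
        }
    }

    // Models text arriving from the content stream. Assumption: text outside
    // any open sequence is not recorded.
    void text(String s) {
        if (!open.isEmpty()) {
            open.peek().contents.add(s);
        }
    }

    // Returns the top-level sequences collected so far.
    List<Node> getMarkedContents() {
        return topLevel;
    }

    public static void main(String[] args) {
        MarkedContentSketch sketch = new MarkedContentSketch();
        sketch.begin("P");
        sketch.text("Hello ");
        sketch.begin("Span");
        sketch.text("world");
        sketch.end();
        sketch.end();
        sketch.text("outside any sequence");
        sketch.begin("Artifact");
        sketch.end();

        for (Node node : sketch.getMarkedContents()) {
            System.out.println(node.tag + " -> " + node.contents.size() + " item(s)");
        }
    }
}
```

With the real class, the corresponding flow would be to call processPage(page) and then read the result from getMarkedContents(); the callbacks are driven by the content stream rather than invoked by hand as in this sketch.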
Methods inherited from class PDFStreamEngine: addOperator, applyTextAdjustment, beginText, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getResources, getTextLineMatrix, getTextMatrix, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showForm, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator

public PDFMarkedContentExtractor() throws IOException
Throws: IOException

public PDFMarkedContentExtractor(String encoding) throws IOException
Parameters: encoding - The encoding that the output will be written in.
Throws: IOException

public void beginMarkedContentSequence(COSName tag, COSDictionary properties)
public void endMarkedContentSequence()
public void xobject(PDXObject xobject)
protected void processTextPosition(TextPosition text)
Parameters: text - The text to process.

public List<PDMarkedContent> getMarkedContents()
public void processPage(PDPage page) throws IOException
Overrides: processPage in class PDFStreamEngine
Parameters: page - the page to process
Throws: IOException - if there is an error accessing the stream.

protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Overrides: showGlyph in class PDFStreamEngine
Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
Throws: IOException - if the glyph cannot be processed

Copyright © 2002–2017 The Apache Software Foundation. All rights reserved.