public class PDFParser extends BaseParser
| Modifier and Type | Field and Description | 
|---|---|
| protected boolean | isFDFDocment | 
| protected XrefTrailerResolver | xrefTrailerResolver Collects all Xref/trailer objects and resolves them into a single object using the startxref reference. |
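
The resolving step described for `xrefTrailerResolver` can be pictured with a small stand-alone sketch. It is not the PDFBox implementation: `XrefResolverSketch`, `Section`, `sectionAt`, and `resolve` are hypothetical names, and the sketch assumes that sections are chained through a previous-section offset (the `/Prev` trailer entry of the PDF format), with entries from newer sections taking precedence.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the idea behind xrefTrailerResolver; not the PDFBox class. */
final class XrefResolverSketch {

    /** One xref section: object number -> byte offset, plus the offset of the previous section. */
    static final class Section {
        final Map<Integer, Long> offsets = new HashMap<>();
        long prev = -1; // -1 means there is no earlier section (no /Prev entry)
    }

    // Sections keyed by the byte position at which each one starts in the file.
    private final Map<Long, Section> sections = new HashMap<>();

    /** Returns the section recorded at the given byte position, creating it if needed. */
    Section sectionAt(long bytePos) {
        return sections.computeIfAbsent(bytePos, k -> new Section());
    }

    /** Walks back from the startxref position; entries from newer sections win. */
    Map<Integer, Long> resolve(long startXref) {
        Map<Integer, Long> merged = new HashMap<>();
        long pos = startXref;
        // The hop bound guards against a cyclic previous-section chain in a damaged file.
        for (int hops = 0; pos >= 0 && hops <= sections.size(); hops++) {
            Section s = sections.get(pos);
            if (s == null) {
                break; // dangling reference: stop with what has been merged so far
            }
            for (Map.Entry<Integer, Long> e : s.offsets.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
            pos = s.prev;
        }
        return merged;
    }
}
```

Collecting every section first and merging only once the startxref offset is known keeps the collection step independent of the order in which sections are encountered.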

Fields inherited from class BaseParser: DEF, document, ENDOBJ, ENDSTREAM, forceParsing, pdfSource, PROP_PUSHBACK_SIZE

| Constructor and Description |
|---|
| PDFParser(InputStream input) Constructor. |
| PDFParser(InputStream input, RandomAccess rafi) Constructor to allow control over RandomAccessFile. |
| PDFParser(InputStream input, RandomAccess rafi, boolean force) Constructor to allow control over RandomAccessFile. |
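
According to its parameter description, the `force` flag of the three-argument constructor makes the parser skip corrupt objects and continue at the next object. The following is a minimal sketch of that control flow only, not PDFBox code: `ForceParsingSketch` and `parseAll` are hypothetical, and a `NumberFormatException` stands in for a corrupt object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a "force" flag; not PDFBox code. */
final class ForceParsingSketch {

    /** Parses each token as an integer "object"; NumberFormatException stands in for corruption. */
    static List<Integer> parseAll(List<String> rawObjects, boolean force) {
        List<Integer> parsed = new ArrayList<>();
        for (String raw : rawObjects) {
            try {
                parsed.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                if (!force) {
                    throw e; // strict mode: the first corrupt object aborts parsing
                }
                // force mode: drop the corrupt object and continue with the next one
            }
        }
        return parsed;
    }
}
```

With `force` set, the caller receives a partial result in place of an exception, so it may be worth checking afterwards whether anything was skipped.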
| Modifier and Type | Method and Description | 
|---|---|
| void | clearResources() Release all used resources. |
| COSDocument | getDocument() This will get the document that was parsed. |
| FDFDocument | getFDFDocument() This will get the FDF document that was parsed. |
| PDDocument | getPDDocument() This will get the PD document that was parsed. |
| protected boolean | isContinueOnError(Exception e) Returns true if parsing should be continued. |
| void | parse() This will parse the stream and populate the COSDocument object. |
| protected void | parseHeader() |
| protected boolean | parseStartXref() This will parse the startxref section from the stream. |
| protected boolean | parseTrailer() This will parse the trailer from the stream and add it to the state. |
| void | parseXrefStream(COSStream stream, long objByteOffset) Fills XrefTrailerResolver with data of the given stream. |
| void | parseXrefStream(COSStream stream, long objByteOffset, boolean isStandalone) Fills XrefTrailerResolver with data of the given stream. |
| protected boolean | parseXrefTable(long startByteOffset) This will parse the xref table from the stream and add it to the state. The XrefTable contents are ignored. |
| protected void | readVersionInTrailer(COSDictionary parsedTrailer) The document catalog can also have a /Version parameter which overrides the version specified in the header if, and only if, it is greater. |
| void | setTempDirectory(File tmpDir) This is the directory where pdfbox will create a temporary file for storing pdf document stream in. |
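
The version rule stated for `readVersionInTrailer` (the catalog `/Version` overrides the header version only when it is greater) can be written as a one-line comparison. The helper below is a hypothetical illustration, not the PDFBox method; representing versions as `float` and using `null` for a missing catalog entry are assumptions of this sketch.

```java
/** Hypothetical helper illustrating the header/catalog version rule; not PDFBox code. */
final class VersionRuleSketch {

    /**
     * Returns the version to use, given the header version and an optional
     * catalog /Version value (null when the catalog has none).
     */
    static float effectiveVersion(float headerVersion, Float catalogVersion) {
        if (catalogVersion != null && catalogVersion > headerVersion) {
            return catalogVersion; // the catalog wins only when strictly greater
        }
        return headerVersion;
    }
}
```

A catalog value equal to or lower than the header value leaves the header version in effect.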

Methods inherited from class BaseParser: isClosing, isClosing, isEndOfName, isEOL, isEOL, isWhitespace, isWhitespace, parseBoolean, parseCOSArray, parseCOSDictionary, parseCOSName, parseCOSStream, parseCOSString, parseCOSString, parseDirObject, readExpectedString, readGenerationNumber, readInt, readLine, readLong, readObjectNumber, readString, readString, readStringNumber, readUntilEndStream, setDocument, skipSpaces

protected boolean isFDFDocment
protected XrefTrailerResolver xrefTrailerResolver

public PDFParser(InputStream input) throws IOException
input - The input stream that contains the PDF document.
IOException - If there is an error initializing the stream.

public PDFParser(InputStream input, RandomAccess rafi) throws IOException
input - The input stream that contains the PDF document.
rafi - The RandomAccessFile to be used in internal COSDocument
IOException - If there is an error initializing the stream.

public PDFParser(InputStream input, RandomAccess rafi, boolean force) throws IOException
input - The input stream that contains the PDF document.
rafi - The RandomAccessFile to be used in internal COSDocument
force - When true, the parser will skip corrupt pdf objects and will continue parsing at the next object in the file
IOException - If there is an error initializing the stream.

public void setTempDirectory(File tmpDir)
tmpDir - The directory to create scratch files needed to store pdf document streams.

protected boolean isContinueOnError(Exception e)
e - The exception if available. Can be null if there is no exception available.

public void parse() throws IOException
IOException - If there is an error reading from the stream or corrupt data is found.

protected void parseHeader() throws IOException
IOException

public COSDocument getDocument() throws IOException
IOException - If there is an error getting the document.

public PDDocument getPDDocument() throws IOException
IOException - If there is an error getting the document.

public FDFDocument getFDFDocument() throws IOException
IOException - If there is an error getting the document.

protected boolean parseStartXref() throws IOException
IOException - If an IO error occurs.

protected boolean parseXrefTable(long startByteOffset) throws IOException
startByteOffset - the offset to start at
IOException - If an IO error occurs.

protected boolean parseTrailer() throws IOException
IOException - If an IO error occurs.

protected void readVersionInTrailer(COSDictionary parsedTrailer)
parsedTrailer - the parsed catalog in the trailer

public void parseXrefStream(COSStream stream, long objByteOffset) throws IOException
stream - the stream to be read
objByteOffset - the offset to start at
IOException - if there is an error parsing the stream

public void parseXrefStream(COSStream stream, long objByteOffset, boolean isStandalone) throws IOException
stream - the stream to be read
objByteOffset - the offset to start at
isStandalone - should be set to true if the stream is not part of a hybrid xref table
IOException - if there is an error parsing the stream

public void clearResources()
clearResources in class BaseParser

Copyright © 2002–2016 The Apache Software Foundation. All rights reserved.