org.apache.pdfbox.pdmodel
Class PDDocument

java.lang.Object
  extended by org.apache.pdfbox.pdmodel.PDDocument
All Implemented Interfaces:
Pageable

public class PDDocument
extends Object
implements Pageable

This is the in-memory representation of the PDF document. You must call close() on this object when you are done using it.

Version:
$Revision: 1.47 $
Author:
Ben Litchfield

Field Summary
 
Fields inherited from interface java.awt.print.Pageable
UNKNOWN_NUMBER_OF_PAGES
 
Constructor Summary
PDDocument()
          Constructor, creates a new PDF Document with no pages.
PDDocument(COSDocument doc)
          Constructor that uses an existing document.
 
Method Summary
 void addPage(PDPage page)
          This will add a page to the document.
 void clearWillEncryptWhenSaving()
          Deprecated. Do not rely on this method anymore. It is the responsibility of COSWriter to hold this state.
 void close()
          This will close the underlying COSDocument object.
 void decrypt(String password)
          This will decrypt a document.
 void encrypt(String ownerPassword, String userPassword)
          This will mark a document to be encrypted.
 AccessPermission getCurrentAccessPermission()
          Returns the access permissions granted when the document was decrypted.
 COSDocument getDocument()
          This will get the low level document.
 PDDocumentCatalog getDocumentCatalog()
          This will get the document catalog.
 PDDocumentInformation getDocumentInformation()
          This will get the document info dictionary.
 PDEncryptionDictionary getEncryptionDictionary()
          This will get the encryption dictionary for this document.
 int getNumberOfPages()
          
 String getOwnerPasswordForEncryption()
          Deprecated. Do not rely on this method anymore.
 int getPageCount()
          Deprecated. Use the getNumberOfPages method instead.
 PageFormat getPageFormat(int pageIndex)
          
 Map getPageMap()
          This will return the Map containing the mapping from object IDs to page numbers.
 Printable getPrintable(int pageIndex)
          
 SecurityHandler getSecurityHandler()
          Get the security handler that is used for document encryption.
 String getUserPasswordForEncryption()
          Deprecated. Do not rely on this method anymore.
 PDPage importPage(PDPage page)
          This will import and copy the contents from another location.
 boolean isAllSecurityToBeRemoved()
           
 boolean isEncrypted()
          This will tell if this document is encrypted or not.
 boolean isOwnerPassword(String password)
          Deprecated.  
 boolean isUserPassword(String password)
          Deprecated.  
static PDDocument load(File file)
          This will load a document from a file.
static PDDocument load(File file, RandomAccess scratchFile)
          This will load a document from a file.
static PDDocument load(InputStream input)
          This will load a document from an input stream.
static PDDocument load(InputStream input, boolean force)
          This will load a document from an input stream.
static PDDocument load(InputStream input, RandomAccess scratchFile)
          This will load a document from an input stream.
static PDDocument load(InputStream input, RandomAccess scratchFile, boolean force)
          This will load a document from an input stream.
static PDDocument load(String filename)
          This will load a document from a file.
static PDDocument load(String filename, boolean force)
          This will load a document from a file.
static PDDocument load(String filename, RandomAccess scratchFile)
          This will load a document from a file.
static PDDocument load(URL url)
          This will load a document from a url.
static PDDocument load(URL url, boolean force)
          This will load a document from a url.
static PDDocument load(URL url, RandomAccess scratchFile)
          This will load a document from a url.
 void openProtection(DecryptionMaterial pm)
          Tries to decrypt the document in memory using the provided decryption material.
 void print()
          This will send the PDF document to a printer.
 void print(PrinterJob printJob)
           
 void protect(ProtectionPolicy pp)
          Protects the document with the protection policy pp.
 boolean removePage(int pageNumber)
          Remove the page from the document.
 boolean removePage(PDPage page)
          Remove the page from the document.
 void save(OutputStream output)
          This will save the document to an output stream.
 void save(String fileName)
          This will save this document to the filesystem.
 void setAllSecurityToBeRemoved(boolean allSecurityToBeRemoved)
           
 void setDocumentInformation(PDDocumentInformation info)
          This will set the document information for this document.
 void setEncryptionDictionary(PDEncryptionDictionary encDictionary)
          This will set the encryption dictionary for this document.
 void silentPrint()
          This will send the PDF to the default printer without prompting the user for any printer settings.
 void silentPrint(PrinterJob printJob)
          This will send the PDF to the default printer without prompting the user for any printer settings.
 boolean wasDecryptedWithOwnerPassword()
          Deprecated. Use getCurrentAccessPermission instead.
 boolean willEncryptWhenSaving()
          Deprecated. Do not rely on this method anymore. It is the responsibility of COSWriter to hold this state.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PDDocument

public PDDocument()
           throws IOException
Constructor, creates a new PDF Document with no pages. You need to add at least one page for the document to be valid.

Throws:
IOException - If there is an error creating this document.
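
The note about needing at least one page can be illustrated with a minimal sketch. It uses only the constructor and the addPage, save, getNumberOfPages and close methods listed on this page. The no-argument PDPage constructor, the package of COSVisitorException, and the names BlankDocumentSketch and writeBlankDocument are assumptions made for illustration and should be checked against the PDFBox version in use.

```java
import java.io.IOException;

import org.apache.pdfbox.exceptions.COSVisitorException; // assumed package
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

class BlankDocumentSketch {
    /** Writes a one-page PDF to fileName and returns the page count at save time. */
    static int writeBlankDocument(String fileName) throws IOException, COSVisitorException {
        PDDocument document = new PDDocument();
        try {
            // A freshly constructed document has no pages, so add one before saving.
            document.addPage(new PDPage()); // assumption: PDPage has a no-argument constructor
            document.save(fileName);
            return document.getNumberOfPages();
        } finally {
            // Release the underlying COSDocument even if saving failed.
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeBlankDocument("blank.pdf"));
    }
}
```

The try/finally form is used because this page lists Pageable as the only implemented interface, so a try-with-resources statement may not be available for this class.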

PDDocument

public PDDocument(COSDocument doc)
Constructor that uses an existing document. The COSDocument that is passed in must be valid.

Parameters:
doc - The COSDocument that this document wraps.
Method Detail

getPageMap

public final Map getPageMap()
This will return the Map containing the mapping from object IDs to page numbers.

Returns:
the pageMap

addPage

public void addPage(PDPage page)
This will add a page to the document. This is a convenience method, that will add the page to the root of the hierarchy and set the parent of the page to the root.

Parameters:
page - The page to add to the document.

removePage

public boolean removePage(PDPage page)
Remove the page from the document.

Parameters:
page - The page to remove from the document.
Returns:
true if the page was found, false otherwise.

removePage

public boolean removePage(int pageNumber)
Remove the page from the document.

Parameters:
pageNumber - The 0-based index of the page to remove.
Returns:
true if the page was found, false otherwise.

importPage

public PDPage importPage(PDPage page)
                  throws IOException
This will import and copy the contents from another location. Currently the content stream is stored in a scratch file, and the scratch file is associated with the document. If you are adding a page to this document from another document and want to copy the contents to this document's scratch file, use this method; otherwise just use the addPage method.

Parameters:
page - The page to import.
Returns:
The page that was imported.
Throws:
IOException - If there is an error copying the page.

getDocument

public COSDocument getDocument()
This will get the low level document.

Returns:
The document that this layer sits on top of.

getDocumentInformation

public PDDocumentInformation getDocumentInformation()
This will get the document info dictionary. This is guaranteed to not return null.

Returns:
The document's /Info dictionary.

setDocumentInformation

public void setDocumentInformation(PDDocumentInformation info)
This will set the document information for this document.

Parameters:
info - The updated document information.

getDocumentCatalog

public PDDocumentCatalog getDocumentCatalog()
This will get the document catalog. This is guaranteed to not return null.

Returns:
The document's /Root dictionary.

isEncrypted

public boolean isEncrypted()
This will tell if this document is encrypted or not.

Returns:
true if this document is encrypted.

getEncryptionDictionary

public PDEncryptionDictionary getEncryptionDictionary()
                                               throws IOException
This will get the encryption dictionary for this document. This will still return the parameters if the document was decrypted. If the document was never encrypted then this will return null. As the encryption architecture in PDF documents is pluggable, this returns an abstract class, but the only supported subclass at this time is a PDStandardEncryption object.

Returns:
The encryption dictionary (most likely a PDStandardEncryption object).
Throws:
IOException - If there is an error determining which security handler to use.

setEncryptionDictionary

public void setEncryptionDictionary(PDEncryptionDictionary encDictionary)
                             throws IOException
This will set the encryption dictionary for this document.

Parameters:
encDictionary - The encryption dictionary (most likely a PDStandardEncryption object).
Throws:
IOException - If there is an error determining which security handler to use.

isUserPassword

public boolean isUserPassword(String password)
                       throws IOException,
                              CryptographyException
Deprecated. 

This will determine if this is the user password. This only applies when the document is encrypted and uses standard encryption.

Parameters:
password - The plain text user password.
Returns:
true if the password passed in matches the user password used to encrypt the document.
Throws:
IOException - If there is an error determining if it is the user password.
CryptographyException - If there is an error in the encryption algorithms.

isOwnerPassword

public boolean isOwnerPassword(String password)
                        throws IOException,
                               CryptographyException
Deprecated. 

This will determine if this is the owner password. This only applies when the document is encrypted and uses standard encryption.

Parameters:
password - The plain text owner password.
Returns:
true if the password passed in matches the owner password used to encrypt the document.
Throws:
IOException - If there is an error determining if it is the owner password.
CryptographyException - If there is an error in the encryption algorithms.

decrypt

public void decrypt(String password)
             throws CryptographyException,
                    IOException,
                    InvalidPasswordException
This will decrypt a document. This method is provided for compatibility reasons only. Users should use the new security layer instead, in particular the openProtection method.

Parameters:
password - Either the user or owner password.
Throws:
CryptographyException - If there is an error decrypting the document.
IOException - If there is an error getting the stream data.
InvalidPasswordException - If the password is not a user or owner password.

wasDecryptedWithOwnerPassword

public boolean wasDecryptedWithOwnerPassword()
Deprecated. Use getCurrentAccessPermission instead.

This will tell if the document was decrypted with the master (owner) password. This entry is invalid if the PDF was not decrypted.

Returns:
true if the PDF was decrypted with the master (owner) password.

encrypt

public void encrypt(String ownerPassword,
                    String userPassword)
             throws CryptographyException,
                    IOException
This will mark a document to be encrypted. The actual encryption will occur when the document is saved. This method is provided for compatibility reasons only. Users should use the new security layer instead (see the protect and openProtection methods).

Parameters:
ownerPassword - The owner password to encrypt the document.
userPassword - The user password to encrypt the document.
Throws:
CryptographyException - If an error occurs during encryption.
IOException - If there is an error accessing the data.

getOwnerPasswordForEncryption

public String getOwnerPasswordForEncryption()
Deprecated. Do not rely on this method anymore.

The owner password that was passed into the encrypt method. You should never use this method. This will no longer be valid once encryption has occurred.

Returns:
The owner password passed to the encrypt method.

getUserPasswordForEncryption

public String getUserPasswordForEncryption()
Deprecated. Do not rely on this method anymore.

The user password that was passed into the encrypt method. You should never use this method. This will no longer be valid once encryption has occurred.

Returns:
The user password passed to the encrypt method.

willEncryptWhenSaving

public boolean willEncryptWhenSaving()
Deprecated. Do not rely on this method anymore. It is the responsibility of COSWriter to hold this state.

Internal method to determine if the document will be encrypted when it is saved.

Returns:
True if encrypt has been called and the document has not been saved yet.

clearWillEncryptWhenSaving

public void clearWillEncryptWhenSaving()
Deprecated. Do not rely on this method anymore. It is the responsibility of COSWriter to hold this state.

This should only be called by the COSWriter after encryption has completed.


load

public static PDDocument load(URL url)
                       throws IOException
This will load a document from a url.

Parameters:
url - The url to load the PDF from.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(URL url,
                              boolean force)
                       throws IOException
This will load a document from a url. Allows for skipping corrupt PDF objects.

Parameters:
url - The url to load the PDF from.
force - When true, the parser will skip corrupt PDF objects and will continue parsing at the next object in the file.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(URL url,
                              RandomAccess scratchFile)
                       throws IOException
This will load a document from a url.

Parameters:
url - The url to load the PDF from.
scratchFile - A location to store temp PDFBox data for this document.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(String filename)
                       throws IOException
This will load a document from a file.

Parameters:
filename - The name of the file to load.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(String filename,
                              boolean force)
                       throws IOException
This will load a document from a file. Allows for skipping corrupt PDF objects.

Parameters:
filename - The name of the file to load.
force - When true, the parser will skip corrupt PDF objects and will continue parsing at the next object in the file.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(String filename,
                              RandomAccess scratchFile)
                       throws IOException
This will load a document from a file.

Parameters:
filename - The name of the file to load.
scratchFile - A location to store temp PDFBox data for this document.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(File file)
                       throws IOException
This will load a document from a file.

Parameters:
file - The file to load.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(File file,
                              RandomAccess scratchFile)
                       throws IOException
This will load a document from a file.

Parameters:
file - The file to load.
scratchFile - A location to store temp PDFBox data for this document.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(InputStream input)
                       throws IOException
This will load a document from an input stream.

Parameters:
input - The stream that contains the document.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(InputStream input,
                              boolean force)
                       throws IOException
This will load a document from an input stream. Allows for skipping corrupt PDF objects.

Parameters:
input - The stream that contains the document.
force - When true, the parser will skip corrupt PDF objects and will continue parsing at the next object in the file.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(InputStream input,
                              RandomAccess scratchFile)
                       throws IOException
This will load a document from an input stream.

Parameters:
input - The stream that contains the document.
scratchFile - A location to store temp PDFBox data for this document.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

load

public static PDDocument load(InputStream input,
                              RandomAccess scratchFile,
                              boolean force)
                       throws IOException
This will load a document from an input stream. Allows for skipping corrupt PDF objects.

Parameters:
input - The stream that contains the document.
scratchFile - A location to store temp PDFBox data for this document.
force - When true, the parser will skip corrupt PDF objects and will continue parsing at the next object in the file.
Returns:
The document that was loaded.
Throws:
IOException - If there is an error reading from the stream.

save

public void save(String fileName)
          throws IOException,
                 COSVisitorException
This will save this document to the filesystem.

Parameters:
fileName - The file to save as.
Throws:
IOException - If there is an error saving the document.
COSVisitorException - If an error occurs while generating the data.

save

public void save(OutputStream output)
          throws IOException,
                 COSVisitorException
This will save the document to an output stream.

Parameters:
output - The stream to write to.
Throws:
IOException - If there is an error writing the document.
COSVisitorException - If an error occurs while generating the data.

getPageCount

public int getPageCount()
Deprecated. Use the getNumberOfPages method instead.

This will return the total page count of the PDF document. Note: This method is deprecated in favor of the getNumberOfPages method, which is a required method of the Pageable interface. This method will be removed in a future version of PDFBox.

Returns:
The total number of pages in the PDF document.

getNumberOfPages

public int getNumberOfPages()

Specified by:
getNumberOfPages in interface Pageable

getPageFormat

public PageFormat getPageFormat(int pageIndex)

Specified by:
getPageFormat in interface Pageable

getPrintable

public Printable getPrintable(int pageIndex)

Specified by:
getPrintable in interface Pageable

print

public void print(PrinterJob printJob)
           throws PrinterException
Parameters:
printJob - The printer job.
Throws:
PrinterException - If there is an error while sending the PDF to the printer, or you do not have permissions to print this document.
See Also:
print()

print

public void print()
           throws PrinterException
This will send the PDF document to a printer. The printing functionality depends on the org.apache.pdfbox.pdfviewer.PageDrawer functionality. The PageDrawer is a work in progress, so some PDFs will print correctly and some will not. This is a convenience method that creates the java.awt.print.PrinterJob. PDDocument implements the java.awt.print.Pageable interface and PDPage implements the java.awt.print.Printable interface, so advanced printing can be done through those interfaces instead of this method.

Throws:
PrinterException - If there is an error while sending the PDF to the printer, or you do not have permissions to print this document.
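
The remark about printing through the Pageable interface can be sketched with the JDK's java.awt.print types alone. The helper names below (PageableSketch, totalHeight, printWithDialog, blankPages) are illustrative and not part of PDFBox, and blankPages is a hypothetical stand-in for a PDDocument so that the example runs without the library. Since this page states that PDDocument implements Pageable, a PDDocument instance could presumably be passed wherever a Pageable is expected here.

```java
import java.awt.Graphics;
import java.awt.print.PageFormat;
import java.awt.print.Pageable;
import java.awt.print.Printable;
import java.awt.print.PrinterException;
import java.awt.print.PrinterJob;

class PageableSketch {
    /** Sums the page heights (in points, 1/72 inch) of any Pageable. */
    static double totalHeight(Pageable pages) {
        double total = 0;
        for (int i = 0; i < pages.getNumberOfPages(); i++) {
            total += pages.getPageFormat(i).getHeight();
        }
        return total;
    }

    /** Hands a Pageable to a PrinterJob; returns false if the user cancels the dialog. */
    static boolean printWithDialog(Pageable pages) throws PrinterException {
        PrinterJob job = PrinterJob.getPrinterJob();
        job.setPageable(pages);
        if (!job.printDialog()) {
            return false;
        }
        job.print();
        return true;
    }

    /** Hypothetical stand-in for a PDDocument: a fixed number of default-format pages. */
    static Pageable blankPages(final int count) {
        return new Pageable() {
            public int getNumberOfPages() {
                return count;
            }

            public PageFormat getPageFormat(int pageIndex) {
                return new PageFormat();
            }

            public Printable getPrintable(int pageIndex) {
                return new Printable() {
                    public int print(Graphics g, PageFormat format, int index) {
                        // Draws nothing; only reports that the page exists.
                        return Printable.PAGE_EXISTS;
                    }
                };
            }
        };
    }
}
```

Only totalHeight can be exercised without a printer; printWithDialog needs a graphical environment and an installed print service.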

silentPrint

public void silentPrint()
                 throws PrinterException
This will send the PDF to the default printer without prompting the user for any printer settings.

Throws:
PrinterException - If there is an error while printing.
See Also:
print()

silentPrint

public void silentPrint(PrinterJob printJob)
                 throws PrinterException
This will send the PDF to the default printer without prompting the user for any printer settings.

Parameters:
printJob - A printer job definition.
Throws:
PrinterException - If there is an error while printing.
See Also:
print()

close

public void close()
           throws IOException
This will close the underlying COSDocument object.

Throws:
IOException - If there is an error releasing resources.
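
A common way to make sure close() is always reached is a try/finally block around the work done on the document. The following is a minimal sketch using only load(String), getNumberOfPages and close as listed on this page; the names PageCountSketch and countPages are illustrative.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;

class PageCountSketch {
    /** Loads the PDF at fileName, returns its page count, and always closes it. */
    static int countPages(String fileName) throws IOException {
        // If load itself throws, there is no document to close yet.
        PDDocument document = PDDocument.load(fileName);
        try {
            return document.getNumberOfPages();
        } finally {
            // Closes the underlying COSDocument whether or not the body succeeded.
            document.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countPages(args[0]));
    }
}
```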

protect

public void protect(ProtectionPolicy pp)
             throws BadSecurityHandlerException
Protects the document with the protection policy pp. This method only marks the document for encryption; the document content is actually encrypted when the document is saved.

Parameters:
pp - The protection policy.
Throws:
BadSecurityHandlerException - If there is an error during protection.
See Also:
StandardProtectionPolicy, PublicKeyProtectionPolicy
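
The mark-then-save behaviour might be used as in the sketch below. It is written under several assumptions that are not stated on this page: that StandardProtectionPolicy has a constructor taking the owner password, the user password and an AccessPermission; that AccessPermission has a no-argument constructor and a setCanPrint method; and that these classes live in the packages imported here. ProtectSketch and saveProtected are illustrative names. Some PDFBox versions may also need a cryptography provider on the classpath at run time.

```java
import java.io.IOException;

import org.apache.pdfbox.exceptions.COSVisitorException; // assumed package
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.AccessPermission; // assumed package
import org.apache.pdfbox.pdmodel.encryption.BadSecurityHandlerException; // assumed package
import org.apache.pdfbox.pdmodel.encryption.StandardProtectionPolicy; // assumed package

class ProtectSketch {
    /** Marks the document for password protection, then saves it to fileName. */
    static void saveProtected(PDDocument document, String fileName,
                              String ownerPassword, String userPassword)
            throws BadSecurityHandlerException, IOException, COSVisitorException {
        // Assumed API: permissions granted to someone opening with the user password.
        AccessPermission permissions = new AccessPermission();
        permissions.setCanPrint(false);

        // protect() only records the policy on the document.
        document.protect(new StandardProtectionPolicy(ownerPassword, userPassword, permissions));

        // The content is encrypted while it is written out.
        document.save(fileName);
    }
}
```

The caller keeps ownership of the document in this sketch and is still expected to call close() on it afterwards.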

openProtection

public void openProtection(DecryptionMaterial pm)
                    throws BadSecurityHandlerException,
                           IOException,
                           CryptographyException
Tries to decrypt the document in memory using the provided decryption material.

Parameters:
pm - The decryption material (password or certificate).
Throws:
BadSecurityHandlerException - If there is an error during decryption.
IOException - If there is an error reading cryptographic information.
CryptographyException - If there is an error during decryption.
See Also:
StandardDecryptionMaterial, PublicKeyDecryptionMaterial
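
Opening a password-protected file could look roughly like the sketch below, which combines load, isEncrypted, openProtection, getCurrentAccessPermission and close from this page. It assumes that StandardDecryptionMaterial has a constructor taking the password as a String, that AccessPermission offers a canPrint method, and that the classes live in the packages imported here; none of that is stated on this page. OpenProtectionSketch and canPrint are illustrative names.

```java
import java.io.IOException;

import org.apache.pdfbox.exceptions.CryptographyException; // assumed package
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.BadSecurityHandlerException; // assumed package
import org.apache.pdfbox.pdmodel.encryption.StandardDecryptionMaterial; // assumed package

class OpenProtectionSketch {
    /** Opens fileName with the given password and reports whether printing is permitted. */
    static boolean canPrint(String fileName, String password)
            throws IOException, BadSecurityHandlerException, CryptographyException {
        PDDocument document = PDDocument.load(fileName);
        try {
            if (document.isEncrypted()) {
                // Assumed constructor: password-based decryption material.
                document.openProtection(new StandardDecryptionMaterial(password));
            }
            // Assumed accessor on AccessPermission; the returned object is read-only.
            return document.getCurrentAccessPermission().canPrint();
        } finally {
            document.close();
        }
    }
}
```

For a document that was never encrypted, openProtection is skipped and getCurrentAccessPermission is described on this page as returning owner-level permissions.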

getCurrentAccessPermission

public AccessPermission getCurrentAccessPermission()
Returns the access permissions granted when the document was decrypted. If the document was not decrypted, this method returns the access permission for a document owner (i.e. can do everything). The returned object is in read-only mode so that permissions cannot be changed. Methods providing access to content should rely on this object to verify if the current user is allowed to proceed.

Returns:
the access permissions for the current user on the document.

getSecurityHandler

public SecurityHandler getSecurityHandler()
Get the security handler that is used for document encryption.

Returns:
The handler used to encrypt/decrypt the document.

isAllSecurityToBeRemoved

public boolean isAllSecurityToBeRemoved()

setAllSecurityToBeRemoved

public void setAllSecurityToBeRemoved(boolean allSecurityToBeRemoved)


Copyright © 2002-2010 The Apache Software Foundation. All Rights Reserved.