public class LucenePDFDocument extends Object
| Lucene Field Name | Description | 
|---|---|
| path | File system path if loaded from a file | 
| url | URL to PDF document | 
| contents | Entire contents of PDF document, indexed but not stored | 
| summary | First 500 characters of content | 
| modified | The modified date/time according to the url or path | 
| uid | A unique identifier for the Lucene document. | 
| CreationDate | From PDF meta-data if available | 
| Creator | From PDF meta-data if available | 
| Keywords | From PDF meta-data if available | 
| ModificationDate | From PDF meta-data if available | 
| Producer | From PDF meta-data if available | 
| Subject | From PDF meta-data if available | 
| Trapped | From PDF meta-data if available | 
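
The `summary` row above describes a simple truncation rule. As a minimal sketch of that rule only (the class `SummarySketch`, the method `summarize`, and the constant `SUMMARY_LENGTH` are hypothetical names, not part of PDFBox or Lucene), it could look like this:

```java
/** Hypothetical helper, not part of PDFBox: illustrates the "summary" field rule. */
final class SummarySketch {
    // Assumed limit, taken from the field table ("First 500 characters of content").
    static final int SUMMARY_LENGTH = 500;

    /** Returns at most the first SUMMARY_LENGTH characters of the extracted text. */
    static String summarize(String contents) {
        int end = Math.min(contents.length(), SUMMARY_LENGTH);
        return contents.substring(0, end);
    }

    public static void main(String[] args) {
        // Build a 1200-character stand-in for text extracted from a PDF.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 1200; i++) {
            text.append('x');
        }
        System.out.println(summarize(text.toString()).length()); // prints 500
    }
}
```

Text shorter than the limit is returned unchanged, so short documents would get a summary equal to their full contents under this assumption.
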
| Modifier and Type | Field and Description | 
|---|---|
| static org.apache.lucene.document.FieldType | TYPE_STORED_NOT_INDEXED: Not indexed, tokenized, stored. | 
| Constructor and Description | 
|---|
| LucenePDFDocument(): Constructor. | 
| Modifier and Type | Method and Description | 
|---|---|
| org.apache.lucene.document.Document | convertDocument(File file): Take a reference to a PDF document and create a Lucene document. | 
| org.apache.lucene.document.Document | convertDocument(InputStream is): Convert the PDF stream to a Lucene document. | 
| org.apache.lucene.document.Document | convertDocument(URL url): Convert the document from a PDF to a Lucene document. | 
| static String | createUID(File file): Create a UID for the given file. | 
| static String | createUID(URL url, long time): Create a UID for the given file using the given time. | 
| static org.apache.lucene.document.Document | getDocument(File file): Get a Lucene document from a PDF file. | 
| static org.apache.lucene.document.Document | getDocument(InputStream is): Get a Lucene document from a PDF file. | 
| static org.apache.lucene.document.Document | getDocument(URL url): Get a Lucene document from a PDF file. | 
| void | setTextStripper(PDFTextStripper aStripper): Set the text stripper that will be used during extraction. | 
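
The two `createUID` overloads in the table derive an identifier from a document's location and a time. The sketch below shows one way such an identifier could be built using only the JDK. It is an illustration under stated assumptions, not the PDFBox implementation: the class `UidSketch`, the method name `createUid`, the separator character, and the use of the file's last-modified time are all hypothetical choices.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical sketch, not the PDFBox implementation of createUID. */
final class UidSketch {
    // Assumed separator: NUL is not a legal URL character, so it divides
    // the location from the timestamp without ambiguity.
    private static final char SEPARATOR = '\u0000';

    /** Combines the URL's external form with the given time in milliseconds. */
    static String createUid(URL url, long timeMillis) {
        return url.toExternalForm() + SEPARATOR + timeMillis;
    }

    /** Assumption: for a file, use its URL form and its last-modified time. */
    static String createUid(File file) throws MalformedURLException {
        return createUid(file.toURI().toURL(), file.lastModified());
    }

    public static void main(String[] args) throws MalformedURLException {
        File pdf = new File("example.pdf"); // hypothetical path
        // Replace the separator with a visible character for display only.
        System.out.println(createUid(pdf).replace(SEPARATOR, '|'));
    }
}
```

Including the time in the identifier means that a document modified after indexing would yield a different UID, which lets an indexer tell a stale entry from a current one.
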
public static final org.apache.lucene.document.FieldType TYPE_STORED_NOT_INDEXED

public void setTextStripper(PDFTextStripper aStripper)
Parameters: aStripper - The new PDF text stripper.

public org.apache.lucene.document.Document convertDocument(InputStream is) throws IOException
Parameters: is - The input stream.
Throws: IOException - If there is an error converting the PDF.

public org.apache.lucene.document.Document convertDocument(File file) throws IOException
Parameters: file - A reference to a PDF document.
Throws: IOException - If there is an exception while converting the document.

public org.apache.lucene.document.Document convertDocument(URL url) throws IOException
Parameters: url - A URL to a PDF document.
Throws: IOException - If there is an error while converting the document.

public static org.apache.lucene.document.Document getDocument(InputStream is) throws IOException
Parameters: is - The stream to read the PDF from.
Throws: IOException - If there is an error parsing or indexing the document.

public static org.apache.lucene.document.Document getDocument(File file) throws IOException
Parameters: file - The file to get the document for.
Throws: IOException - If there is an error parsing or indexing the document.

public static org.apache.lucene.document.Document getDocument(URL url) throws IOException
Parameters: url - The URL of the file to get the document for.
Throws: IOException - If there is an error parsing or indexing the document.

public static String createUID(URL url, long time)
Parameters: url - The URL of the file to create a UID for.
time - The time to use for the UID.

Copyright © 2002–2018 The Apache Software Foundation. All rights reserved.