org.apache.pdfbox.util
Class TextNormalize

java.lang.Object
  extended by org.apache.pdfbox.util.TextNormalize

public class TextNormalize
extends Object

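This class normalizes text in several ways: converting lines from presentation order to logical order, normalizing presentation forms, and normalizing diacritics. It uses the ICU4J library if that library is on the classpath; if it is not, each method returns its input string unchanged.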
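
The optional-dependency behavior described above can be illustrated with a minimal sketch. It is not the code of this class; the class name OptionalLibraryCheck and the method isOnClasspath are hypothetical, and com.ibm.icu.text.Normalizer is used only as an example of a class shipped in ICU4J.

```java
final class OptionalLibraryCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is located but not initialized (second argument is false).
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OptionalLibraryCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Library missing or unusable: callers fall back to a no-op path.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: probing one well-known ICU4J class stands in for "ICU4J is present".
        boolean icuPresent = isOnClasspath("com.ibm.icu.text.Normalizer");
        System.out.println("ICU4J available: " + icuPresent);
    }
}
```

A caller written this way can return the original string when the probe fails, which matches the fallback documented for the methods below.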

Version:
$Revision: 1.0 $
Author:
Brian Carrier

Constructor Summary
TextNormalize(String encoding)
           
 
Method Summary
 String makeLineLogicalOrder(String str, boolean isRtlDominant)
          Takes a line of text in presentation order and converts it to logical order.
 String normalizeDiac(String str)
          Normalizes diacritics; for example, converts non-combining diacritic characters to their combining counterparts.
 String normalizePres(String str)
          Normalize the presentation forms of characters in the string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextNormalize

public TextNormalize(String encoding)
Parameters:
encoding - The encoding in which the text will eventually be written (or null)
Method Detail

makeLineLogicalOrder

public String makeLineLogicalOrder(String str,
                                   boolean isRtlDominant)
Takes a line of text in presentation order and converts it to logical order. For most scripts the presentation and logical orders are the same. For right-to-left scripts such as Arabic and Hebrew they differ, and when a line mixes RTL and LTR text the Unicode BIDI algorithm must be used to determine how to map between them.

Parameters:
str - Presentation form of the line to convert (i.e. the leftmost char is the first char)
isRtlDominant - true if the PAGE has a dominant right-to-left ordering
Returns:
Logical form of string (or original string if ICU4J library is not on classpath)
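
To make the difference between presentation and logical order concrete, here is a minimal sketch that uses only the JDK's java.text.Bidi. It is not this method's implementation: the class name LogicalOrderSketch and the method toLogicalOrder are hypothetical, the isRtlDominant parameter is not modeled, and only the simplest case (a line that is entirely right-to-left) is handled. Mixed-direction lines are returned unchanged because they need the full BIDI algorithm.

```java
import java.text.Bidi;

final class LogicalOrderSketch {

    // Converts a purely right-to-left line from presentation order
    // (leftmost char first) to logical order by reversing it.
    static String toLogicalOrder(String visual) {
        if (visual.isEmpty()) {
            return visual;
        }
        Bidi bidi = new Bidi(visual, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isRightToLeft()) {
            // Simplification: a plain reversal also reorders combining marks,
            // which a real implementation would have to handle.
            return new StringBuilder(visual).reverse().toString();
        }
        // Left-to-right or mixed text: left untouched in this sketch.
        return visual;
    }

    public static void main(String[] args) {
        // Hebrew "shalom" as it appears on the page, leftmost char first.
        String visual = "\u05DD\u05D5\u05DC\u05E9";
        String logical = toLogicalOrder(visual);
        System.out.println(logical.equals("\u05E9\u05DC\u05D5\u05DD"));
    }
}
```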

normalizePres

public String normalizePres(String str)
Normalize the presentation forms of characters in the string. For example, convert the single "fi" ligature to "f" and "i".

Parameters:
str - String to normalize
Returns:
Normalized string (or original string if ICU4J library is not on classpath)
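
The ligature example above can be reproduced with the JDK's java.text.Normalizer. This is a sketch of the idea rather than this method's implementation: the class name PresentationFormSketch and the method expandPresentationForms are hypothetical, and restricting the conversion to the Unicode presentation-form blocks (U+FB00 to U+FDFF and U+FE70 to U+FEFF) is an assumption made for the example.

```java
import java.text.Normalizer;

final class PresentationFormSketch {

    // True for chars in the Alphabetic and Arabic presentation-form blocks.
    static boolean isPresentationForm(char c) {
        return (c >= '\uFB00' && c <= '\uFDFF') || (c >= '\uFE70' && c <= '\uFEFF');
    }

    // Replaces each presentation-form char with its NFKC (compatibility)
    // normalization; all other chars are copied through unchanged.
    static String expandPresentationForms(String str) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (isPresentationForm(c)) {
                out.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKC));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+FB01 is the single "fi" ligature character.
        System.out.println(expandPresentationForms("\uFB01nd")); // prints "find"
    }
}
```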

normalizeDiac

public String normalizeDiac(String str)
Normalizes diacritics; for example, converts non-combining diacritic characters to their combining counterparts.

Parameters:
str - String to normalize
Returns:
Normalized string (or original string if ICU4J library is not on classpath)
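
As an illustration of what converting a non-combining diacritic to its combining counterpart means, the sketch below uses the JDK's java.text.Normalizer. It is not this method's implementation: the class name DiacriticSketch and the method toCombining are hypothetical, and treating only characters of type MODIFIER_SYMBOL is an assumption made to keep the example small.

```java
import java.text.Normalizer;

final class DiacriticSketch {

    // Replaces spacing diacritics (e.g. U+00B4 ACUTE ACCENT) with their
    // combining counterparts (e.g. U+0301 COMBINING ACUTE ACCENT).
    static String toCombining(String str) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.getType(c) == Character.MODIFIER_SYMBOL) {
                // The compatibility decomposition of a spacing diacritic is
                // typically SPACE + combining mark; trim() drops the space.
                // Chars without a decomposition (such as '^') are unchanged.
                String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKD);
                out.append(decomposed.trim());
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String result = toCombining("e\u00B4");
        // The spacing acute accent has become a combining acute accent.
        System.out.println(result.equals("e\u0301")); // prints "true"
    }
}
```

The result is left decomposed: the base letter and the combining mark remain two separate code points.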


Copyright © 2002-2010 The Apache Software Foundation. All Rights Reserved.