com.healthmarketscience.jackcess.scsu
Class Expand
java.lang.Object
com.healthmarketscience.jackcess.scsu.SCSU
com.healthmarketscience.jackcess.scsu.Expand
public class Expand
- extends SCSU
Reference decoder for the Standard Compression Scheme for Unicode (SCSU).
Notes on the Java implementation
Java's byte data type is always signed. The following workarounds are
required:
Copying a byte to an int variable and adding 256 to negative values gives an integer in the
range 0-255.
Values of char range from 0x0000 to 0xFFFF in Java. Arithmetic on char values is unsigned.
Extended characters require an int to store them. The sign is not an issue because only 1024*1024
+ 65536 code points exist in total (0x0000 to 0x10FFFF), far below the limit of a signed int.
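The workarounds above can be sketched as follows. This is an illustration only, not code taken from the class: UnsignedByteSketch and its methods are hypothetical names. It shows the add-256 approach from the notes next to the equivalent masking idiom, and checks the code point limit against the standard Character.MAX_CODE_POINT constant.

```java
// Hypothetical helper class for illustration; not part of the jackcess API.
final class UnsignedByteSketch {

    // Workaround from the notes: copy the byte to an int and add 256 if negative.
    static int toUnsigned(byte b) {
        int value = b;          // sign-extended, range -128..127
        if (value < 0) {
            value += 256;       // shift negative values into 128..255
        }
        return value;
    }

    // Equivalent idiom: mask off the sign-extended upper bits.
    static int toUnsignedMask(byte b) {
        return b & 0xFF;
    }

    // Largest code point: 1024*1024 + 65536 values in total, counted from zero.
    static int maxCodePoint() {
        return 1024 * 1024 + 65536 - 1;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xE4;                        // stored as -28
        System.out.println(toUnsigned(raw));           // prints 228
        System.out.println(toUnsignedMask(raw));       // prints 228
        System.out.println(maxCodePoint() == Character.MAX_CODE_POINT); // prints true
    }
}
```

Both conversions give the same result for every byte value. The masking form is the more common idiom in current Java code.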
Method Summary
int | bytesRead()
static char | charFromTwoBytes(byte hi, byte lo)
Assembles a char from two bytes. In Java, bytes are signed quantities, while chars are unsigned.
int | charsWritten()
java.lang.String | expand(byte[] in)
Expands a byte array containing compressed Unicode.
void | reset()
Called to start with new input without creating a new instance.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Expand
public Expand()
charFromTwoBytes
public static char charFromTwoBytes(byte hi,
byte lo)
- Assembles a char from two bytes. In Java, bytes are signed quantities, while chars are unsigned.
- Parameters:
hi
- most significant byte
lo
- least significant byte
- Returns:
- the character
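A minimal sketch of how two signed bytes can be combined into an unsigned char. It is an assumption about the technique, not the library's actual source: the class name CharFromTwoBytesSketch and the naive method are hypothetical and exist only to show why each byte must be masked before shifting.

```java
// Hypothetical illustration; the real implementation may differ.
final class CharFromTwoBytesSketch {

    // Mask each byte to 0..255 first so sign extension cannot corrupt the result.
    static char charFromTwoBytes(byte hi, byte lo) {
        return (char) (((hi & 0xFF) << 8) | (lo & 0xFF));
    }

    // Naive version without masking: a negative lo byte sign-extends
    // and overwrites the high byte with 0xFF.
    static char naive(byte hi, byte lo) {
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        byte hi = (byte) 0x20;
        byte lo = (byte) 0xAC;   // negative as a signed byte
        System.out.println(Integer.toHexString(charFromTwoBytes(hi, lo))); // prints 20ac
        System.out.println(Integer.toHexString(naive(hi, lo)));            // prints ffac
    }
}
```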
expand
public java.lang.String expand(byte[] in)
throws IllegalInputException,
EndOfInputException
- Expands a byte array containing SCSU-compressed Unicode and returns the decoded text as a String.
- Throws:
IllegalInputException
EndOfInputException
reset
public void reset()
- Call reset to start with new input without creating a new instance.
- Overrides:
reset
in class SCSU
charsWritten
public int charsWritten()
bytesRead
public int bytesRead()
Copyright 2005-2011 by Gerry Weirich, Elexis