Elexis: The leading open-source medical practice software
in the German-speaking world
Javadoc for Elexis version 2.1.7.dev of 01.09.2013

com.healthmarketscience.jackcess.scsu
Class Expand

java.lang.Object
  extended by com.healthmarketscience.jackcess.scsu.SCSU
      extended by com.healthmarketscience.jackcess.scsu.Expand

public class Expand
extends SCSU

Reference decoder for the Standard Compression Scheme for Unicode (SCSU)

Notes on the Java implementation

A limitation of Java is the exclusive use of a signed byte data type. The following workarounds are required:
Copying a byte to an integer variable and adding 256 for 'negative' bytes gives an integer in the range 0-255.
Values of char are between 0x0000 and 0xFFFF in Java. Arithmetic on char values is unsigned.
Extended characters require an int to store them. The sign is not an issue because only 1024*1024 + 65536 extended characters exist.
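The byte workaround described above can be sketched as follows. The class and method names are hypothetical and are not part of the Expand API; the sketch only illustrates that masking with 0xFF gives the same result as adding 256 to a negative byte.

```java
// Hypothetical helper class, for illustration only; not part of jackcess.
final class UnsignedByteDemo {

    /** Widen a signed Java byte to its unsigned value in the range 0-255. */
    static int toUnsigned(byte b) {
        // Masking with 0xFF clears the sign-extended upper bits.
        return b & 0xFF;
    }

    /** The same conversion, written the way the note describes it. */
    static int toUnsignedByAdding(byte b) {
        int i = b;                      // sign-extended copy, range -128..127
        return i < 0 ? i + 256 : i;     // shift 'negative' bytes into 128..255
    }

    public static void main(String[] args) {
        byte tag = (byte) 0xE0;         // stored as -32 in a Java byte
        System.out.println(toUnsigned(tag));
        System.out.println(toUnsignedByAdding(tag));
    }
}
```

Both methods return the same value for every possible byte; the masking form is the more common idiom in current Java code.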


Field Summary
protected  int iIn
          input cursor used by the following functions
protected  int iOut
          string buffer length used by the following functions
 
Constructor Summary
Expand()
           
 
Method Summary
 int bytesRead()
           
static char charFromTwoBytes(byte hi, byte lo)
          assemble a char from two bytes. In Java, bytes are signed quantities, while chars are unsigned.
 int charsWritten()
           
protected  void defineExtendedWindow(char chOffset)
          (re-)define (and select) a window as an extended dynamic window. The surrogate area in Unicode allows access to 2**20 codes beyond the first 64K codes by combining one of 1024 characters from the High Surrogate Area with one of 1024 characters from the Low Surrogate Area (see Unicode 2.0 for the details).
protected  void defineWindow(int iWindow, byte bOffset)
          (re-)define (and select) a dynamic window. A sliding window position cannot start at any Unicode value, so rather than providing an absolute offset, this function takes an index value which selects among the possible starting values.
 java.lang.String expand(byte[] in)
          expand a byte array containing compressed Unicode
protected  java.lang.String expandSingleByte(byte[] in)
          expand portion of the input that is in single byte mode
protected  int expandUnicode(byte[] in, int iCur, java.lang.StringBuilder sb)
          expand input that is in Unicode mode
 void reset()
          reset is called to start with new input, without creating a new instance
 
Methods inherited from class com.healthmarketscience.jackcess.scsu.SCSU
getCurrentWindow, isCompressible, selectWindow
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

iOut

protected int iOut
string buffer length used by the following functions


iIn

protected int iIn
input cursor used by the following functions

Constructor Detail

Expand

public Expand()
Method Detail

defineWindow

protected void defineWindow(int iWindow,
                            byte bOffset)
                     throws IllegalInputException
(re-)define (and select) a dynamic window. A sliding window position cannot start at any Unicode value, so rather than providing an absolute offset, this function takes an index value which selects among the possible starting values. Most scripts in Unicode start on or near a half-block boundary, so the default behaviour is to multiply the index by 0x80. Han, Hangul, Surrogates and other scripts between 0x3400 and 0xDFFF show very poor locality, so no sliding window can be set there. A jumpOffset is added to the index value to skip that region, and only 167 index values in total are required to select all eligible half-blocks. Finally, a few scripts straddle half-block boundaries. For them, a table of fixed offsets is used, and the index values from 0xF9 to 0xFF select these special offsets. After (re-)defining a window's location, the window is selected so that it is ready for use. Recall that all windows are of the same length (128 code positions).

Parameters:
iWindow - index of the window to be (re-)defined
bOffset - index for the new offset value
Throws:
IllegalInputException
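
The index-to-offset mapping described for defineWindow can be sketched as below. This is an illustration under stated assumptions, not the code of this class: the class name, constant names, and the use of IllegalArgumentException (in place of IllegalInputException) are hypothetical, and the threshold 0x68, the jump of 0xAC00, and the seven fixed offsets are taken from the published SCSU specification rather than from this page.

```java
// Hypothetical sketch of the SCSU window offset table; not the jackcess source.
final class WindowOffsetSketch {

    // Fixed offsets selected by index values 0xF9..0xFF (per the SCSU specification).
    private static final int[] FIXED_OFFSETS = {
        0x00C0, 0x0250, 0x0370, 0x0530, 0x3040, 0x30A0, 0xFF60
    };
    private static final int GAP_THRESHOLD = 0x68;   // first index mapped past the skipped region
    private static final int GAP_OFFSET = 0xAC00;    // jump that skips 0x3400..0xDFFF
    private static final int RESERVED_START = 0xA8;  // 0xA8..0xF8 select nothing
    private static final int FIXED_THRESHOLD = 0xF9; // first index into FIXED_OFFSETS

    /** Translate the offset index byte into the window's starting code position. */
    static int offsetForIndex(byte bOffset) {
        int index = bOffset & 0xFF;                   // treat the byte as unsigned
        if (index == 0) {
            throw new IllegalArgumentException("index 0x00 is reserved");
        }
        if (index < GAP_THRESHOLD) {
            return index * 0x80;                      // half-blocks 0x0080..0x3380
        }
        if (index < RESERVED_START) {
            return index * 0x80 + GAP_OFFSET;         // half-blocks 0xE000..0xFF80
        }
        if (index >= FIXED_THRESHOLD) {
            return FIXED_OFFSETS[index - FIXED_THRESHOLD];
        }
        throw new IllegalArgumentException("reserved index 0x" + Integer.toHexString(index));
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(offsetForIndex((byte) 0x68)));
    }
}
```

The two linear ranges cover 0x67 + 0x40 = 167 index values, which matches the count given in the description above.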

defineExtendedWindow

protected void defineExtendedWindow(char chOffset)
(re-)define (and select) a window as an extended dynamic window. The surrogate area in Unicode allows access to 2**20 codes beyond the first 64K codes by combining one of 1024 characters from the High Surrogate Area with one of 1024 characters from the Low Surrogate Area (see Unicode 2.0 for the details). The tags SDX and UDX set the window such that each subsequent byte in the range 80 to FF represents a surrogate pair. The following diagram shows how the bits in the two bytes following the SDX or UDX, and a subsequent data byte, map onto the bits in the resulting surrogate pair.

    hbyte     lbyte     data
    nnnwwwww  zzzzzyyy  1xxxxxxx

    high-surrogate    low-surrogate
    110110wwwwwzzzzz  110111yyyxxxxxxx

Parameters:
chOffset - Since the top three bits of chOffset are not needed to set the location of the extended window, they are used instead to select the window, thereby reducing the number of needed command codes. The bottom 13 bits of chOffset are used to calculate the offset, which is combined with a 7-bit input data byte to yield the 20 bits expressed by each surrogate pair.
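
The bit layout for extended windows can be checked with a small sketch. The class and method names here are hypothetical, and the formula offset = 0x10000 + 0x80 * (13-bit value) is an assumption derived from the bit diagram and the SCSU specification, not taken from this class's source.

```java
// Hypothetical sketch of the extended-window arithmetic; not the jackcess source.
final class ExtendedWindowSketch {

    /** Top three bits (nnn) of the first byte select the window. */
    static int windowIndex(byte hbyte) {
        return (hbyte & 0xFF) >>> 5;
    }

    /** Bottom 13 bits (wwwwwzzzzzyyy) give the window start above the first 64K codes. */
    static int windowOffset(byte hbyte, byte lbyte) {
        int thirteenBits = ((hbyte & 0x1F) << 8) | (lbyte & 0xFF);
        return 0x10000 + thirteenBits * 0x80;
    }

    /** Combine the window offset with a data byte (1xxxxxxx) into a surrogate pair. */
    static char[] decode(byte hbyte, byte lbyte, byte data) {
        int codePoint = windowOffset(hbyte, lbyte) + (data & 0x7F);
        return Character.toChars(codePoint);   // returns {high surrogate, low surrogate}
    }

    public static void main(String[] args) {
        char[] pair = decode((byte) 0x20, (byte) 0x01, (byte) 0x81);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]);
    }
}
```

Letting Character.toChars build the pair avoids assembling the surrogate bits by hand, while producing the same 110110... and 110111... patterns as the diagram.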

expandUnicode

protected int expandUnicode(byte[] in,
                            int iCur,
                            java.lang.StringBuilder sb)
                     throws IllegalInputException,
                            EndOfInputException
expand input that is in Unicode mode

Parameters:
in - input byte array to be expanded
iCur - starting index
sb - string buffer to which to append expanded input
Returns:
the index for the last byte processed
Throws:
IllegalInputException
EndOfInputException

charFromTwoBytes

public static char charFromTwoBytes(byte hi,
                                    byte lo)
assemble a char from two bytes. In Java, bytes are signed quantities, while chars are unsigned.

Parameters:
hi - most significant byte
lo - least significant byte
Returns:
the character
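
A minimal sketch of how two signed bytes can be combined into an unsigned char follows. It is an assumption about one reasonable implementation, placed in a hypothetical class, and is not copied from the Expand source.

```java
// Hypothetical re-implementation for illustration; not the jackcess source.
final class TwoByteCharSketch {

    /** Assemble a char from a most significant and a least significant byte. */
    static char charFromTwoBytes(byte hi, byte lo) {
        // Mask each byte before combining; an unmasked negative 'lo' would be
        // sign-extended and overwrite the high byte with 1 bits.
        return (char) (((hi & 0xFF) << 8) | (lo & 0xFF));
    }

    public static void main(String[] args) {
        char c = charFromTwoBytes((byte) 0x00, (byte) 0xE9);
        System.out.printf("%04X%n", (int) c);
    }
}
```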

expandSingleByte

protected java.lang.String expandSingleByte(byte[] in)
                                     throws IllegalInputException,
                                            EndOfInputException
expand portion of the input that is in single byte mode

Throws:
IllegalInputException
EndOfInputException

expand

public java.lang.String expand(byte[] in)
                        throws IllegalInputException,
                               EndOfInputException
expand a byte array containing compressed Unicode

Throws:
IllegalInputException
EndOfInputException

reset

public void reset()
reset is called to start with new input, without creating a new instance

Overrides:
reset in class SCSU

charsWritten

public int charsWritten()

bytesRead

public int bytesRead()
