com.ibm.icu.text
Class UnicodeDecompressor
- com.ibm.icu.text.SCSU
public final class UnicodeDecompressor
implements com.ibm.icu.text.SCSU
A decompression engine implementing the Standard Compression Scheme
for Unicode (SCSU) as outlined in
Unicode Technical
Report #6.
USAGE
The static methods on
UnicodeDecompressor may be used in a
straightforward manner to decompress simple strings:
byte [] compressed = ... ; // get compressed bytes from somewhere
String result = UnicodeDecompressor.decompress(compressed);
The static methods have a fairly large memory footprint.
For finer-grained control over memory usage,
UnicodeDecompressor offers more powerful APIs allowing
iterative decompression:
// Decompress an array "bytes" of length "len" using a buffer of 512 chars
// to the Writer "out"
UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
final int BUFSIZE = 512; // a local variable cannot be declared static
char [] charBuffer = new char [ BUFSIZE ];
int charsWritten = 0;
int [] bytesRead = new int [1];
int totalBytesDecompressed = 0;
int totalCharsWritten = 0;
do {
// do the decompression
charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed,
len, bytesRead,
charBuffer, 0, BUFSIZE);
// do something with the current set of chars
out.write(charBuffer, 0, charsWritten);
// update the no. of bytes decompressed
totalBytesDecompressed += bytesRead[0];
// update the no. of chars written
totalCharsWritten += charsWritten;
} while(totalBytesDecompressed < len);
myDecompressor.reset(); // reuse decompressor
Decompression is performed according to the standard set forth in
Unicode Technical Report #6.
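As a rough illustration of what single-byte mode in Unicode Technical Report #6 involves, the following self-contained sketch decodes a small subset of SCSU: pass-through ASCII bytes, bytes 0x80 to 0xFF mapped through the active dynamic window, and the SC0 to SC7 window-selection tags. The class name ScsuSingleByteSketch and the method decodeSubset are invented for this example. It is not the ICU implementation, and it rejects every other tag (quoting, window definition, Unicode mode).

```java
class ScsuSingleByteSketch {
    // Initial offsets of the eight dynamic windows, as listed in UTR #6.
    static final int[] DEFAULT_WINDOWS = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    // Decodes only literals, window bytes, and SC0..SC7 (hypothetical helper).
    static String decodeSubset(byte[] in) {
        StringBuilder out = new StringBuilder();
        int active = 0; // dynamic window 0 is active in the initial state
        for (byte b : in) {
            int v = b & 0xFF;
            if (v >= 0x80) {
                // High bytes index into the active dynamic window.
                out.append((char) (DEFAULT_WINDOWS[active] + (v - 0x80)));
            } else if (v >= 0x20 || v == 0x00 || v == 0x09 || v == 0x0A || v == 0x0D) {
                // Printable ASCII plus NUL, TAB, LF, and CR pass through unchanged.
                out.append((char) v);
            } else if (v >= 0x10 && v <= 0x17) {
                // SC0..SC7: make dynamic window (v - 0x10) the active one.
                active = v - 0x10;
            } else {
                throw new IllegalArgumentException(
                    "tag not handled by this sketch: 0x" + Integer.toHexString(v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // SC2 selects the window at 0x0400, then six one-byte Cyrillic letters follow.
        byte[] sample = {0x12, (byte) 0x9C, (byte) 0xBE, (byte) 0xC1,
                         (byte) 0xBA, (byte) 0xB2, (byte) 0xB0};
        String expected = "\u041C\u043E\u0441\u043A\u0432\u0430";
        System.out.println(decodeSubset(sample).equals(expected));
    }
}
```

The sample shows where the compression comes from: after a one-byte window switch, each Cyrillic letter costs one byte instead of two.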
Fields inherited from interface com.ibm.icu.text.SCSU:
ARMENIANINDEX, COMPRESSIONOFFSET, GREEKINDEX, HALFWIDTHKATAKANAINDEX, HIRAGANAINDEX, INVALIDCHAR, INVALIDWINDOW, IPAEXTENSIONINDEX, KATAKANAINDEX, LATININDEX, MAXINDEX, NUMSTATICWINDOWS, NUMWINDOWS, RESERVEDINDEX, SCHANGE0, SCHANGE1, SCHANGE2, SCHANGE3, SCHANGE4, SCHANGE5, SCHANGE6, SCHANGE7, SCHANGEU, SDEFINE0, SDEFINE1, SDEFINE2, SDEFINE3, SDEFINE4, SDEFINE5, SDEFINE6, SDEFINE7, SDEFINEX, SINGLEBYTEMODE, SQUOTE0, SQUOTE1, SQUOTE2, SQUOTE3, SQUOTE4, SQUOTE5, SQUOTE6, SQUOTE7, SQUOTEU, SRESERVED, UCHANGE0, UCHANGE1, UCHANGE2, UCHANGE3, UCHANGE4, UCHANGE5, UCHANGE6, UCHANGE7, UDEFINE0, UDEFINE1, UDEFINE2, UDEFINE3, UDEFINE4, UDEFINE5, UDEFINE6, UDEFINE7, UDEFINEX, UNICODEMODE, UQUOTEU, URESERVED, sOffsetTable, sOffsets
static String | decompress(byte[] buffer) - Decompress a byte array into a String.
static char[] | decompress(byte[] buffer, int start, int limit) - Decompress a byte array into a Unicode character array.
int | decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit) - Decompress a byte array into a Unicode character array.
void | reset() - Reset the decompressor to its initial state.
UnicodeDecompressor
public UnicodeDecompressor()
Create a UnicodeDecompressor.
Sets all windows to their default values.
decompress
public static String decompress(byte[] buffer)
Decompress a byte array into a String.
Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
See Also:
decompress(byte[], int, int)
decompress
public static char[] decompress(byte[] buffer,
int start,
int limit)
Decompress a byte array into a Unicode character array.
Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
Returns:
A character array containing the decompressed characters.
decompress
public int decompress(byte[] byteBuffer,
int byteBufferStart,
int byteBufferLimit,
int[] bytesRead,
char[] charBuffer,
int charBufferStart,
int charBufferLimit)
Decompress a byte array into a Unicode character array.
This function will either completely fill the output buffer,
or consume the entire input.
Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return its first element holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at minimum two characters in size.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
Returns:
The number of Unicode characters written to charBuffer.
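The calling contract above (each call either fills the output buffer or consumes the remaining input, and reports its progress through bytesRead) can be exercised without the library. The sketch below uses a hypothetical stand-in, WIDENING, that only widens bytes to chars and performs no SCSU decoding. The names ChunkDecoder, WIDENING, and drain are assumptions made for this example. Because ChunkDecoder mirrors the parameter list documented above, a method reference to a real decompressor instance should fit the same shape, though that is not tested here.

```java
class ChunkedDecompressSketch {
    // Hypothetical interface mirroring the seven-argument signature documented above.
    interface ChunkDecoder {
        int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit,
                       int[] bytesRead,
                       char[] charBuffer, int charBufferStart, int charBufferLimit);
    }

    // Stand-in decoder: copies bytes to chars until the input or the output space runs out.
    static final ChunkDecoder WIDENING =
        (in, inStart, inLimit, bytesRead, out, outStart, outLimit) -> {
            int n = Math.min(inLimit - inStart, outLimit - outStart);
            for (int i = 0; i < n; i++) {
                out[outStart + i] = (char) (in[inStart + i] & 0xFF);
            }
            if (bytesRead != null) {
                bytesRead[0] = n; // report how many input bytes were consumed
            }
            return n; // number of chars written
        };

    // Calls the decoder repeatedly until all input is consumed; returns the call count.
    static int drain(ChunkDecoder decoder, byte[] bytes, int len,
                     StringBuilder sink, int bufSize) {
        char[] charBuffer = new char[bufSize];
        int[] bytesRead = new int[1];
        int consumed = 0;
        int calls = 0;
        while (consumed < len) {
            int written = decoder.decompress(bytes, consumed, len, bytesRead,
                                             charBuffer, 0, bufSize);
            sink.append(charBuffer, 0, written);
            consumed += bytesRead[0]; // advance by what the decoder reported
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] data = "hello, world".getBytes(java.nio.charset.StandardCharsets.ISO_8859_1);
        StringBuilder sink = new StringBuilder();
        int calls = drain(WIDENING, data, data.length, sink, 4);
        System.out.println(sink + " in " + calls + " calls"); // hello, world in 3 calls
    }
}
```

Advancing the input offset by bytesRead[0], not by the number of chars written, matters with a real decompressor, where the two counts generally differ.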
reset
public void reset()
Reset the decompressor to its initial state.
Copyright (c) 2006 IBM Corporation and others.