Name: bb33257 Date: 11/25/97
The word-break tables (that is, the tables used by the BreakIterator
returned by BreakIterator.getWordInstance(); the line-breaking
tables are fine) treat CJK characters in a Japanese-specific way:
an arbitrary run of Kanji characters, optionally followed by an
arbitrary run of Hiragana characters, optionally followed by an
arbitrary run of Katakana characters, is treated as a single
"word" by the word-break iterator. Chinese text uses neither
Hiragana nor Katakana, so under this rule whole paragraphs
(instead of individual ideographs) are treated as "words" for
the purposes of double-click selection and "find whole words"
operations. Chinese will therefore require its own state tables
for word breaking.
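
One way to observe the behavior described above is to list the segments the
word-break iterator reports for a run of ideographs. The following is a
minimal sketch, not part of the original report: the class name
WordBreakDemo, the helper segments(), and the sample strings are made up for
illustration. It uses only BreakIterator.getWordInstance(Locale), setText(),
first() and next(). The boundaries reported depend on the JDK version and
its break tables, so no expected output is shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; the name and helper are illustrative only.
class WordBreakDemo {

    // Returns the substrings between consecutive word-break boundaries.
    static List<String> segments(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Four Han ideographs (U+4E2D U+6587 U+6587 U+672C), no kana.
        String chinese = "\u4e2d\u6587\u6587\u672c";
        // Two Kanji, four Hiragana, four Katakana: the pattern the tables expect.
        String japanese = "\u6f22\u5b57\u3072\u3089\u304c\u306a\u30ab\u30bf\u30ab\u30ca";

        // With the tables described in this report, the Chinese run would come
        // back as one segment; other JDK versions may split it differently.
        System.out.println(segments(chinese, Locale.CHINESE));
        System.out.println(segments(japanese, Locale.JAPANESE));
    }
}
```

Running main on an affected JDK and comparing the number of segments with
the number of ideographs in the Chinese sample shows whether the run is
being treated as a single "word".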
Dictionary-based break iterators may also be needed for Korean and Japanese.
###@###.### 11/2/04 18:15 GMT