Bug Database
Bug ID: 6730652
Votes 0
Synopsis CharsetEncoder.canEncode(char) returns incorrect values for some Charsets
Category java:char_encodings
Reported Against
Release Fixed 7(b71)
State 10-Fix Delivered, bug
Priority: 3-Medium
Related Bugs
Submit Date 28-JUL-2008
Description
CharsetEncoder.canEncode(ch) != CharsetEncoder.canEncode("" + ch) for some Charsets?

I ran this simple test with JDK 5 and JDK 6 on Mac OS X and Windows XP:

SortedMap<String, Charset> map = Charset.availableCharsets();
for (Charset cs : map.values()) {
    try {
        CharsetEncoder encoder = cs.newEncoder();

        // The char overload and the CharSequence overload should agree.
        char ch = '\u5185';
        if (encoder.canEncode(ch) != encoder.canEncode("" + ch))
            System.out.println("Encoder: " + cs.name() + " failed");
    } catch (Exception e) {
        // Charsets that do not support encoding throw from newEncoder(); skip them.
        //System.out.println("Encoder: " + cs.name() + " Error");
    }
}

It failed for the following encoders.

JDK 5 (1.5.0_15), Windows XP:
Encoder: ISO-2022-KR failed
Encoder: x-IBM1381 failed
Encoder: x-IBM1383 failed
Encoder: x-IBM942 failed
Encoder: x-IBM942C failed
Encoder: x-IBM943 failed
Encoder: x-IBM943C failed

JDK 6 (1.6.0_04), Windows XP:
Encoder: x-ISO-2022-CN-CNS failed

JDK 5 (1.5.0_13), Mac OS X:
Encoder: MacRoman failed
Encoder: x-IBM1381 failed
Encoder: x-IBM1383 failed
Encoder: x-IBM942 failed
Encoder: x-IBM942C failed
Encoder: x-IBM943 failed
Encoder: x-IBM943C failed
Posted Date : 2008-07-28 21:15:06.0
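
The test above only shows that the two overloads disagree; it does not show which answer is wrong. A minimal sketch of one way to find out, not part of the original report, is to compare both answers against a strict encode attempt on the same character. The class name CanEncodeProbe and its methods are hypothetical and used only for illustration.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not from the bug report.
final class CanEncodeProbe {

    // Returns true if a strict encode of the single char succeeds,
    // i.e. without substituting a replacement byte sequence.
    static boolean reallyEncodes(Charset cs, char ch) {
        CharsetEncoder strict = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            strict.encode(CharBuffer.wrap(new char[] { ch }));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Reports both canEncode answers next to the result of the strict encode.
    static String probe(Charset cs, char ch) {
        CharsetEncoder enc = cs.newEncoder();
        boolean byChar = enc.canEncode(ch);
        boolean bySeq = enc.canEncode(String.valueOf(ch));
        boolean actual = reallyEncodes(cs, ch);
        return cs.name() + ": canEncode(char)=" + byChar
                + ", canEncode(CharSequence)=" + bySeq
                + ", encode=" + actual;
    }
}
```

Calling CanEncodeProbe.probe(cs, '\u5185') for each charset reported above would show, per encoder, whether the char overload or the CharSequence overload is the one that disagrees with the actual encoding result. As in the original test, newEncoder() may throw for charsets that do not support encoding, so callers iterating over all charsets should guard against that.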
Work Around
N/A
Evaluation
The only failure in the latest 6uX (JDK 6 update) release is "Encoder: x-ISO-2022-CN-CNS failed".
The root cause is that the x-ISO-2022-CN-CNS encoder fails to encode CNS plane 3 characters, and the test code point \u5185 is one of them.
Posted Date : 2009-05-03 02:31:38.0

The failure in the attached test case was fixed a while ago by other changes; however, FindCanEncodeBug still fails. The cause is that ISO2022.canEncode() invokes ISOEncoder.canEncode(), and in the ISO2022_CN_CNS case the ISOEncoder is the EUC_TW encoder. ISO2022_CN_CNS supports only the first 3 planes of EUC_TW, so canEncode() fails when it is given a character from any other plane.
Posted Date : 2009-08-14 20:45:42.0
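
The source of FindCanEncodeBug is not included in this report. As a rough sketch only, and not the actual test, a scan of the following shape would surface the kind of disagreement described above: it walks every char value and records those for which canEncode(char) differs from the result of a strict encode. The class name CanEncodeScan and the method mismatches are hypothetical.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.ArrayList;
import java.util.List;

// Hypothetical scan, not the FindCanEncodeBug test referred to above.
final class CanEncodeScan {

    // Returns every char for which canEncode(char) disagrees with
    // the outcome of actually encoding that char in strict mode.
    static List<Character> mismatches(Charset cs) {
        // Separate encoders: one answers canEncode, one performs real encodes.
        CharsetEncoder quick = cs.newEncoder();
        CharsetEncoder strict = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        List<Character> bad = new ArrayList<Character>();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            char ch = (char) c;
            boolean encodes;
            try {
                // encode(CharBuffer) resets the encoder before each run.
                strict.encode(CharBuffer.wrap(new char[] { ch }));
                encodes = true;
            } catch (CharacterCodingException e) {
                encodes = false;
            }
            if (quick.canEncode(ch) != encodes) {
                bad.add(ch);
            }
        }
        return bad;
    }
}
```

For an encoder whose canEncode(char) is consistent with its encoding loop, the returned list is empty. For an encoder that delegates canEncode() to a wider charset than it can actually emit, as described for ISO2022_CN_CNS and EUC_TW, the list would be expected to contain the characters from the unsupported planes.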