|
|
Bug ID:
|
4170549
|
|
Votes
|
11
|
|
Synopsis
|
Print{Stream,Writer} classes display characters incorrectly (win)
|
|
Category
|
java:classes_io
|
|
Reported Against
|
1.2, 1.4, 1.1.6, 8.1ur1
|
|
Release Fixed
|
|
|
State
|
6-Fix Understood, bug
|
|
Priority:
|
4-Low
|
|
Related Bugs
|
4201263, 4670764, 6549619, 6236312
|
|
Submit Date
|
01-SEP-1998
|
|
Description
|
*Symptoms:
In the following testcase, the "\u00e9" does not print correctly (it should
be e-acute) when printed using PrintWriter.
---- Top of File ----
import java.io.*;

public class testacc {
    public static void main(java.lang.String s[]) {
        PrintWriter pout = new PrintWriter(System.out, true);
        String data1 = "\u00e9lan";
        String data2 = "Úlan";
        System.out.println(data1);
        System.out.println(data2);
        System.out.println("---");
        pout.println(data1);
        pout.println(data2);
    }
}
======================================================================
Posted Date : 2006-02-01 08:06:10.0
|
|
Work Around
|
Use the "chcp" command to change the console code page to 1252. Unfortunately,
this only works on Windows NT. -- xxxxx@xxxxx 9/10/1998
|
|
Evaluation
|
This is most likely due to the fact that Win32 separates the encoding used for
the console (the "OEM" encoding, by default) from the encoding used for all
other operations (the "ANSI" encoding, by default).
This could be fixed on Windows NT by using SetConsoleCP() and SetConsoleOutputCP()
during startup to set the console code page to the ANSI code page. This will
not work on Windows 95/98 because these procedures are ineffective on those
systems.
Another alternative is to use the OEM code page everywhere on Windows 95/98,
and the ANSI code page everywhere on NT.
-- xxxxx@xxxxx 9/1/1998
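The OEM/ANSI mismatch described above can be reproduced without a Windows console. The sketch below assumes that the ANSI code page is windows-1252 and the OEM code page is IBM850 (a common pairing on Western European installations; other locales use other pairs), and that the running JDK provides both charsets. The class and method names are invented for illustration.

```java
import java.nio.charset.Charset;

// Illustrative only: these names are not part of any JDK API.
class CodePageMismatch {
    // Assumed code pages for a Western European Windows setup.
    static final Charset ANSI = Charset.forName("windows-1252");
    static final Charset OEM = Charset.forName("IBM850");

    // Encode the text as the runtime would (ANSI), then decode the
    // resulting bytes as the console would (OEM).
    static String asShownOnConsole(String text) {
        byte[] bytes = text.getBytes(ANSI);
        return new String(bytes, OEM);
    }

    public static void main(String[] args) {
        String shown = asShownOnConsole("\u00e9lan");
        // Print code points rather than the characters themselves, so the
        // result does not depend on how this program's own output is encoded.
        for (int i = 0; i < shown.length(); i++) {
            System.out.printf("U+%04X%n", (int) shown.charAt(i));
        }
    }
}
```

Under these assumptions the e-acute (U+00E9) is written as byte 0xE9, which the OEM code page maps to a different character, so the first code point printed is not U+00E9.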
Further investigation has shown the above approach to be impractical. The
SetConsoleCP() and SetConsoleOutputCP() procedures permanently change the code
page for the current console rather than just for the process that's using it.
If we were to take this approach we'd have to arrange to revert the console
code page via an on-exit hook. This would work most of the time, but fail if
the VM were terminated with extreme prejudice. This approach would,
unfortunately, still only be effective for Windows NT.
Another possibility is to separate the notion of the platform's default
encoding for files, named by the "file.encoding" property, from that used by
the standard input, output, and error streams. With this approach we could
either use the OEM encoding for those streams or follow the advice given in the
Win32 API docs to use straight Unicode when interacting with the console,
though I don't know if this would actually work on Windows 95/98, where Unicode
support is a bit weak.
This is a pretty significant change that is too late to consider for JDK 1.2.
We should revisit this issue, along with the other problems surrounding the
System.{in,out,err} streams, in the next feature release.
-- xxxxx@xxxxx 9/10/1998
|
|
Comments
|
Submitted On 14-SEP-1999
mthornton
You should (probably) only use the console's code page when the stream
(System.out, System.err, System.in) is attached to the console and not when it
has been redirected. To operate correctly under NT the code should check for
the code page used by the current console.
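One way to picture this suggestion is a small selection step that picks the console charset only when the stream is attached to a console. This is a rough sketch, not a proposed fix: System.console() (Java 6 and later) is used as an approximate test for "attached to a console", its behaviour differs between JDK versions, and IBM850 is an assumed console code page. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper; not part of the JDK.
class StreamEncodingChooser {
    // Use the console charset only when the stream goes to a console;
    // otherwise (for example, when redirected to a file) use the default.
    static Charset choose(boolean attachedToConsole,
                          Charset consoleCharset,
                          Charset defaultCharset) {
        return attachedToConsole ? consoleCharset : defaultCharset;
    }

    public static void main(String[] args) {
        // Approximation: a null Console has traditionally meant that the
        // standard streams are not attached to an interactive console.
        boolean attached = System.console() != null;
        // Assumption: the console uses IBM850; a real implementation would
        // have to query the current console's code page instead.
        Charset console = Charset.forName("IBM850");
        Charset chosen = choose(attached, console, Charset.defaultCharset());
        System.out.println(chosen.name());
    }
}
```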
Submitted On 19-MAY-2000
gberche
As a workaround, I copied the code for java.io.PrintStream into a custom class,
BinaryPrintStream, and specified the right encoding when creating the charout member.
I also replaced calls to flushBuffer() with calls to flush().
Then I stacked up this converter to the stream used by stdout with the
code below:
/** Utility method to patch the string converter of PrintStream by stacking
 * an instance of BinaryPrintStream to the stdout OutputStream
 */
public static void patchStdout() {
    BinaryPrintStream newOut = new BinaryPrintStream(System.out);
    System.setOut(newOut);
}
The workaround seems to work: System.out uses the converter I defined in BinaryPrintStream.
Let me know if you need details.
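On JDK 1.4 and later, java.io.PrintStream has a constructor that takes an encoding name, so a similar effect may be achievable without copying the class. The following is a minimal sketch under that assumption, not the code described in the comment above. The class and method names are hypothetical, and "IBM850" is an assumed console code page that would need to match the actual console.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper; not the BinaryPrintStream class from the comment.
class StdoutPatch {
    // Wrap a byte stream in a PrintStream that encodes characters with
    // the named charset (PrintStream constructor available since 1.4).
    static PrintStream withEncoding(OutputStream out, String encoding) {
        try {
            return new PrintStream(out, true, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the console expects IBM850. Writing to the raw
        // file descriptor avoids stacking on the existing System.out.
        OutputStream rawOut = new FileOutputStream(FileDescriptor.out);
        System.setOut(withEncoding(rawOut, "IBM850"));
        System.out.println("\u00e9lan");
    }
}
```

Whether the text then displays correctly still depends on the console's actual code page and font, as later comments on this bug point out.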
Submitted On 18-MAR-2001
synkronix
If there were a console.encoding and graphics.encoding in addition to
file.encoding, then System.out, etc. could use console.encoding, and
graphics components could use graphics.encoding. By default for
most platforms, they could be identical to file.encoding, but in Windows
(and others where relevant), console.encoding could be the MS-DOS
encoding of the console, and graphics.encoding could be Cp1252.
On the 390, Java does an automatic translation between Unicode
and EBCDIC; this really messed us up originally when we added
international support. This method would allow such behavior
of Windows (where it's not done, but should be) and OS/390
(where it's done, but behind your back without a programmatic
way of determining it).
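The lookup this proposal implies might resemble the sketch below. Note that "console.encoding" is only the property name suggested in this comment; the sketch treats it as hypothetical and does not assume any JDK defines it. The class and method names are likewise made up for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper illustrating the proposed property lookup.
class ConsoleEncodingProperty {
    // Return the value of the proposed "console.encoding" property if it
    // is set; otherwise fall back to "file.encoding", and finally to the
    // name of the platform default charset.
    static String consoleEncodingName() {
        String fallback = System.getProperty("file.encoding",
                Charset.defaultCharset().name());
        return System.getProperty("console.encoding", fallback);
    }

    public static void main(String[] args) {
        // With no -Dconsole.encoding=... on the command line, this prints
        // the same value as file.encoding.
        System.out.println(consoleEncodingName());
    }
}
```

Run with, for example, -Dconsole.encoding=IBM850 to see the override take effect.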
Submitted On 30-DEC-2004
The_Crusher
Still present in 1.5.
Also setting the console's code page to cp1252 with "chcp 1252" doesn't seem to make any difference in Windows XP Home Edition.
Submitted On 22-AUG-2006
Benio_B
This still happens when you write to a file with the encoding set to cp1252.
Submitted On 11-JAN-2007
FredP
To make the command "chcp 1252" work on Windows XP, you need to change the font of the console to a TrueType font like "Lucida Console".
Submitted On 16-AUG-2007
yecril71pl
CHCP 1252 is not an option because CMD.EXE's prompt is OEM so you get it all wrong when you change to ANSI.
|
|
|
 |