Bug Database
Bug ID: 4210199
Votes: 2
Synopsis: RFE: Numerals are always Arabic (Roman)
Category: java:classes_2d
Reported Against: 1.2
Release Fixed: 1.4(merlin-beta)
State: 10-Fix Delivered, request for enhancement
Priority: 4-Low
Related Bugs: 4337267
Submit Date: 09-FEB-1999
Description




When numerals are in an Arabic context, for example when they are 
surrounded by Arabic letters, they should have Hindi shapes (Unicode
values from \u0660 to \u0669). Currently the BIDI algorithm always
sets numeral shapes to the Arabic (Roman) shapes (Unicode values
\u0030 to \u0039). Shaping the numbers is the responsibility of the
BIDI algorithm as specified by the Unicode standard.
Note that shaping the numbers should only happen in Arabic blocks 
and not in Hebrew blocks, since Hebrew always uses the Roman numerals.

This is very important because the Hindi numerals are the only numerals
known in most of the Arab countries, especially in the Gulf region.

I suggest that there should be an attribute in the TextAttribute class 
as follows:
TextAttribute.NUMERALS_SHAPE
and it could be set to:
TextAttribute.NUMERALS_SHAPE_ROMAN //numerals are always Roman
TextAttribute.NUMERALS_SHAPE_HINDI //numerals are always Hindi
TextAttribute.NUMERALS_SHAPE_CONTEXT //the BIDI algorithm will shape the numerals depending on the context they're in
(Review ID: 53978)
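
For illustration only, the three proposed modes could behave roughly as in the sketch below. The class NumeralShapeSketch and the enum Mode are hypothetical names that mirror the suggested NUMERALS_SHAPE_* values; they are not JDK API. The sketch also assumes that "context" means the most recent letter before the digit, which is simpler than the full surrounding-text rule described above.

```java
/** Hypothetical sketch of the three proposed shaping modes; not a JDK API. */
final class NumeralShapeSketch {

    /** Illustrative stand-ins for the proposed NUMERALS_SHAPE_* values. */
    enum Mode { ROMAN, HINDI, CONTEXT }

    /** Returns a copy of text with ASCII digits shaped according to mode. */
    static String shape(String text, Mode mode) {
        StringBuilder out = new StringBuilder(text.length());
        // Assumption: before any letter is seen, the context is Roman.
        boolean arabicContext = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                // Only letters in the basic Arabic block turn the context on.
                // Hebrew and Latin letters turn it off, so Hebrew keeps Roman digits.
                arabicContext =
                    Character.UnicodeBlock.of(c) == Character.UnicodeBlock.ARABIC;
            }
            boolean asciiDigit = c >= '0' && c <= '9';
            boolean useHindi = mode == Mode.HINDI
                || (mode == Mode.CONTEXT && arabicContext);
            // Map U+0030..U+0039 onto U+0660..U+0669 when Hindi shapes apply.
            out.append(asciiDigit && useHindi ? (char) (c - '0' + '\u0660') : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Arabic letters followed by digits, shaped by context.
        System.out.println(shape("\u0627\u0628 12", Mode.CONTEXT));
        // Hebrew letter followed by digits: digits are left as they are.
        System.out.println(shape("\u05d0 12", Mode.CONTEXT));
    }
}
```

Spaces and punctuation do not change the context in this sketch, so digits separated from an Arabic word by a space are still shaped.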
======================================================================
Work Around
N/A
Evaluation
Must address this in the TextLayout and Swing bidi algorithms.
  xxxxx@xxxxx   1999-06-30

The problem is that the clients want to continue using the ASCII numerals and not convert to the Hindi numerals themselves.  The arguments are: 1) client data from external sources, while nominally 'Unicode', stores numeric data as ASCII because their software can't handle the other numerals in Unicode; 2) keyboards for Arabic don't generate the numerals in the Hindi block, but instead generate the numerals in the Roman block, making it difficult to enter the correct text using 'off the shelf' components such as the Swing text components.  The second problem really shouldn't be the client's responsibility.  While Unicode includes codes to turn on 'national' digit shapes, these are deprecated (because they are stateful) and we don't support them.  Since we rely on the OS for keyboard support, changing the keyboard handling for all platforms is rather error-prone, though it is an option.  And we still face client reluctance to convert their numeric data.

Some form of Attribute support could handle this.  There are lots of numerals, and even in the Arabic block there are different sets of numerals for different languages (the Persian digits at U+06F0), so it could be argued (and I would argue) that the numeric shaping should be language-dependent, and that instead of having explicit attributes for each number type as well as a bidi-contextual form, there should be only 'explicit' and 'language dependent', where 'language dependent' depends on either explicit language tagging (more attributes) or language analysis.  'Explicit' is the default and how Unicode prefers to handle things.  'Language dependent' would trigger examining an attribute for language.  If it is not present we synthesize a language based on paragraph context (we do script analysis for OpenType anyway).  If the result is 'arabic' we use the numerals in the standard Arabic block, if it is 'persian' we use the numerals from the extended Arabic block, etc.
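
A minimal sketch of the 'explicit' versus 'language dependent' idea follows. All names here (LanguageDigitsSketch, zeroDigitFor, guessLanguage) are hypothetical, the language tags "ar" and "fa" are an assumed encoding of 'arabic' and 'persian', and the fallback analysis is deliberately crude: script alone cannot tell Persian from Arabic, so untagged Arabic-script text is treated as Arabic.

```java
/** Hypothetical sketch of language-dependent digit selection; not a JDK API. */
final class LanguageDigitsSketch {

    /** Zero digit for a language tag, or ASCII '0' when no substitution is known. */
    static char zeroDigitFor(String language) {
        switch (language) {
            case "ar": return '\u0660'; // ARABIC-INDIC DIGIT ZERO
            case "fa": return '\u06F0'; // EXTENDED ARABIC-INDIC DIGIT ZERO (Persian)
            default:   return '0';      // leave digits unchanged
        }
    }

    /** Crude stand-in for paragraph analysis: any Arabic-block character means "ar". */
    static String guessLanguage(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (Character.UnicodeBlock.of(text.charAt(i)) == Character.UnicodeBlock.ARABIC) {
                return "ar";
            }
        }
        return "en";
    }

    /**
     * explicit == true leaves the text untouched. Otherwise the digits follow
     * the language tag, or a synthesized one when the tag is null.
     */
    static String shape(String text, boolean explicit, String language) {
        if (explicit) {
            return text;
        }
        char zero = zeroDigitFor(language != null ? language : guessLanguage(text));
        char[] chars = text.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] >= '0' && chars[i] <= '9') {
                chars[i] = (char) (zero + (chars[i] - '0'));
            }
        }
        return new String(chars);
    }
}
```

Unlike the bidi-contextual form, this variant shapes every ASCII digit in the text once a language has been chosen.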

Unfortunately some writing systems write numbers differently, not just with different numerals, though many just use variant character forms.  This wouldn't handle that, though it might lead to the expectation that it would: for instance, that Roman numbers embedded in Chinese would use a traditional Chinese representation (X 100 Y 10 Z) instead of X Y Z.

So I think this needs a bit more investigation.
  xxxxx@xxxxx   1999-07-07

The reporter is incorrect in stating that "Shaping the numbers is the responsibility of the BIDI algorithm as specified by the Unicode standard."  This is not the case: the Bidi algorithm only deals with character positioning, not shaping.  Shaping is the responsibility of the rendering system and is outside Unicode's domain per se.  That said, we do perform shaping of Arabic text, and also lam-alef ligature substitution.

See the suggested fix.
  xxxxx@xxxxx   2000-02-07

The simplest thing to do is to add a new attribute and perform contextual shaping based on a small set of fixed values.  If other people want contextual shaping, we can expand the set of values.  Clients want contextual shaping; a generic implementation is probably overkill and, if not carefully designed, could easily be abused.
  xxxxx@xxxxx   2000-04-25
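
This page does not say which API delivered the fix. As a hedged pointer, the JDK has shipped java.awt.font.NumericShaper since 1.4, and it performs contextual digit substitution of the kind discussed here. The sketch below assumes that class is available; the wrapper class NumericShaperDemo and its method name are made up for the example.

```java
import java.awt.font.NumericShaper;

/** Small demo of contextual digit shaping with java.awt.font.NumericShaper. */
final class NumericShaperDemo {

    /** Shapes ASCII digits to Arabic-Indic digits where the preceding text is Arabic. */
    static String shapeInArabicContext(String text) {
        // Contextual shaper limited to the Arabic range; the starting context is European.
        NumericShaper shaper = NumericShaper.getContextualShaper(NumericShaper.ARABIC);
        char[] chars = text.toCharArray();
        shaper.shape(chars, 0, chars.length); // modifies the array in place
        return new String(chars);
    }

    public static void main(String[] args) {
        String shaped = shapeInArabicContext("\u0627\u0628 123");
        // Print code points so the result is readable on any console.
        for (char c : shaped.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

A shaper of this kind can also be attached to styled text as the value of TextAttribute.NUMERIC_SHAPING, which leaves the stored characters as ASCII and changes only how they are rendered.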
Comments
  

Submitted On 08-NOV-1999
Hani ZIAD
This is an important missing feature that interests most of the Arab world. I
can also confirm that the national numbers (Persian) do not appear either on
screen or in printed text. This is true whether we set the numeral substitution
to context or national in the "Regional Settings" tab of the control
panel.


