2009/9/29 wynfield:
>
> Though I'm not up on the details involved here, I will give
> you feedback on the request for information about the locale issue,
> because it affects quick accessibility and usage of Japanese-language
> documents.
>
> Either of the following two values would be acceptable, but I feel that
> the UTF-8 charset is becoming more and more widely adopted.
>     LANG=ja    -> UTF-8
>     LANG=ja_JP -> UTF-8
>
> Also, the following would be suitable if possible:
>     LANG=ja    -> iso-2022-jp
>     LANG=ja_JP -> iso-2022-jp

Thanks for the feedback!  Now, Windows knows three different variants of
iso-2022-jp.  Do you know which one is preferred?

  CP50220: ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
  CP50221: ISO 2022 Japanese with halfwidth Katakana; Japanese
           (JIS-Allow 1 byte Kana)
  CP50222: ISO 2022 Japanese JIS X 0201-1989; Japanese
           (JIS-Allow 1 byte Kana - SO/SI)

Also, Wikipedia has this to say:

  "Since ISO 2022 is a stateful encoding, a program can not jump in the
  middle of a block of text to search, insert or delete characters.
  This makes manipulation of the text very cumbersome and slow when
  compared to non-stateful encodings.  Any jump in the middle of the
  text may require a back up to the previous escape sequence before the
  bytes following the escape sequence can be interpreted."

Doesn't that make it very difficult to use with standard Unix tools?

Andy