From: "Juan Manuel Guerrero"
Organization: Darmstadt University of Technology
To: JT Williams, Eli Zaretskii, salvador
Date: Mon, 6 Aug 2001 19:22:00 +0200
Subject: Re: gettext port
CC: djgpp-workers AT delorie DOT com
X-mailer: Pegasus Mail for Windows (v2.54DE)
Message-ID: <378E2A966FC@HRZ1.hrz.tu-darmstadt.de>
Reply-To: djgpp-workers AT delorie DOT com

On Sun, 5 Aug 2001 08:41:26 -0500, JT Williams wrote:
> -: > In the `sed.mo' file for german I see that u" is represented
> -: > by ascii 252 and o" by ascii 246. But this is not correct for
> -: > either cp437 or cp850 (u" is 129 and o" is 148 in each).
> -:
> -: See the Content-type header of the file: it probably says that the
> -: file is in ISO-8859-1. The conversion to cp850 is done on the fly by
> -: libiconv, since on MS-DOS, the default for de locale is cp850.
>
> Using cp850 does not help, because as far as the display of german text is
> concerned, cp437 and cp850 are equivalent. In fact, given the character
> encoding of `sed.mo', *none* of the six codepages supplied with DOS 5 can
> correctly display the german text from `sed.mo'.
>
> I have read the detailed post by Juan several times, but I still cannot
> determine if the above would indicate that something is broken, or just
> not possible under djgpp.

I have reinspected the sources of libintl.a to recall all the things I
have forgotten about this issue. The interesting function is
localcharset.c:locale_charset(). This function tries to determine the
locale charset to be used by calling nl_langinfo(), if available. If
that function is not available, the locale charset is determined by
checking the environment variables LC_ALL, LC_CTYPE and LANG, in that
order.

The interesting variable is LANG. This variable may contain an alias
like es for Spanish or de_CH for German as spoken in Switzerland. This
alias is resolved into a codepage using charset.alias. At the same
time, LANG can be set directly to a codepage.
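
To make the lookup order described above concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions, not
the actual code of localcharset.c: the nl_langinfo() path is left out,
the CHARSET_ALIAS table is a hypothetical two-entry excerpt (the real
charset.alias may differ), the helper names normalize_codepage() and
recode() are invented for this example, and the .mo file is assumed to
be in ISO-8859-1 as suggested in the quoted text.

```python
# Hypothetical excerpt of a charset.alias table: locale alias -> charset.
# Only the two pairings mentioned in this thread are listed.
CHARSET_ALIAS = {
    "de": "CP850",
    "ru_RU": "KOI8-R",
}


def normalize_codepage(value):
    """Return 'CPnnn' if value names a codepage directly, else None.

    Accepts the three spellings 437, CP437 and cp437.
    """
    name = value.upper()
    if name.startswith("CP"):
        name = name[2:]
    return "CP" + name if name.isdigit() else None


def locale_charset(env):
    """Pick the charset from LC_ALL, LC_CTYPE and LANG, in that order.

    env is a plain dict standing in for the process environment.
    The first variable that is set and non-empty decides the result.
    """
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(variable)
        if value:
            direct = normalize_codepage(value)
            if direct is not None:
                return direct
            # Not a codepage: treat it as an alias; unknown aliases
            # are returned unchanged in this sketch.
            return CHARSET_ALIAS.get(value, value)
    return None


def recode(data, source="iso-8859-1", target="cp437"):
    """Recode bytes from the assumed .mo charset to the target charset."""
    return data.decode(source).encode(target)


# u-umlaut and o-umlaut as the bytes 252 and 246 reported for sed.mo.
umlauts = bytes([252, 246])
charset = locale_charset({"LANG": "CP437", "LANGUAGE": "de"})
print(charset)                                 # CP437
print(list(recode(umlauts, target=charset)))   # [129, 148]
```

With LANG=de instead, the sketch resolves the alias to CP850, and the
recoded bytes are again 129 and 148, which matches the observation
that cp437 and cp850 agree for these two characters.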

This means it is possible to set LANG=437. At least the following ways
of specifying a codepage directly using LANG are allowed, AFAIK:

  LANG=437
  LANG=CP437
  LANG=cp437

All of these LANG settings are OK for codepage 437. Of course, the same
applies to all the other codepages. In this particular case the .mo
file will be recoded to codepage 437.

To solve your difficulty I would suggest the following lines for your
djgpp.env:

  LANG=CP437
  LANGUAGE=de

The first line selects the appropriate locale charset to be used during
runtime recoding. The second line is evaluated by the function
dcigettext.c:dcigettext() and is used to build the path to the .mo file
containing the translated strings. Btw, something like LANGUAGE=de:en
does not make much sense. Usually there is no en (English) subdir in
the share/locale tree because the English strings are _always_ in the
binaries, and they are used by default if the translations cannot be
found.

I have tested this with the binaries of gtxt039b.zip, recode35b.zip and
sed3028b.zip in the cases where CP850 or CP437 is loaded (MSDOS 6.22).
This works fine for me. Nothing is broken in gtxt039, licv17 or
sed3028.

It should be noted that it is possible to set LANG=CP866 to override
the setting: ru_RU KOI8-R.

Regards,
Guerrero, Juan M.