Mail Archives: cygwin/2009/09/22/02:47:55

Date: Tue, 22 Sep 2009 07:47:44 +0100
Message-ID: <416096c60909212347r7e03a4f3q7d518ff7e8bce55d@mail.gmail.com>
Subject: Re: The C locale
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: cygwin AT cygwin DOT com

2009/9/22 Lapo Luchini:
> Andy Koppe wrote:
>> This way, the non-ASCII needs of most users are covered
>> out-of-the-box [...]
>> Windows filenames show up correctly in Cygwin as long as they're
>> limited to the ANSI codepage.
>
> I fail to see how that is a desirable thing.
> Filesystem is UTF-16, Cygwin is now Unicode-aware, but anything that
> doesn't fit ANSI is thrown away [...]?

No, it isn't. UTF-16 filename characters that can't be represented in
the current charset are encoded by a ^N followed by the character's
UTF-8 representation.
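
To make the ^N scheme concrete, here is a minimal Python sketch of the
idea. It is not Cygwin's actual code: the function name encode_filename
is hypothetical, and a Python str (code points) stands in for the
UTF-16 filename.

```python
SO = b"\x0e"  # ^N, the ASCII "Shift Out" control character

def encode_filename(name: str, charset: str) -> bytes:
    """Hypothetical helper illustrating the ^N fallback described above."""
    out = bytearray()
    for ch in name:
        try:
            # Character exists in the current charset: use it directly.
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Not representable: emit ^N followed by the UTF-8 bytes.
            out += SO + ch.encode("utf-8")
    return bytes(out)

# U+00E4 (a-umlaut) exists in ISO-8859-1, so it is a single byte.
print(encode_filename("b\u00e4h", "iso-8859-1").hex(" "))  # 62 e4 68
# U+20AC (euro sign) does not, so it becomes ^N plus three UTF-8 bytes.
print(encode_filename("b\u20ach", "iso-8859-1").hex(" "))  # 62 0e e2 82 ac 68
```

Nothing is thrown away: the original character can be recovered from
the bytes after the ^N marker.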

The current C locale, on the other hand, simply represents all
non-ASCII characters as UTF-8, even though the application charset is
ISO-8859-1. This means that even those characters that can be
represented in the application charset show up incorrectly. For
example, a Windows filename "bäh" turns into "bÅ¤h" in the C locale,
while it shows up correctly with explicitly set ISO-8859-1 or CP1252.
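
The mismatch can be reproduced outside Cygwin with a few lines of
Python. This is only an illustration of the mechanism, assuming the
bytes end up being displayed as ISO-8859-1; the exact garbled glyphs
depend on the charset of whatever displays them.

```python
name = "b\u00e4h"  # a three-character filename containing a-umlaut

# What the C locale hands to the application: UTF-8 bytes.
raw = name.encode("utf-8")

# What an ISO-8859-1 application makes of those bytes: each byte is
# taken as one character, so the two-byte sequence becomes two glyphs.
shown = raw.decode("iso-8859-1")
print(len(name), len(shown))  # 3 4

# With the charset set explicitly, the filename arrives as ISO-8859-1
# bytes and round-trips unchanged.
correct = name.encode("iso-8859-1").decode("iso-8859-1")
print(correct == name)  # True
```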


> As a user, the ability to show correctly formatted UTF-8 filenames is
> one of the features I most appreciated in Cygwin-1.7

That ability isn't going anywhere. As before, you need to set your
locale to one with a UTF-8 charset to get full UTF-8 support.

Btw, are you actually using the C locale?

Andy
