Date: Wed, 13 May 2009 15:26:26 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin programs doesn't support non-ASCII filenames
Message-ID: <20090513132626.GH21324@calimero.vinschen.de>
References: <20090509100231 DOT GR21324 AT calimero DOT vinschen DOT de>

On May  9 23:12, Lenik wrote:
> This is a new test don't use cygpath:
> C:\Profiles\Shecti> set LANG=& bash -c "cat ??????"
> cat: ??????: No such file or directory
>
> C:\Profiles\Shecti> set LANG=zh_CN.GB2312& bash -c "cat ??????"
> cat: ??????: No such file or directory
>
> C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "cat ??????"
> 123
>
> C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "cat ??????"
> 123
>
> C:\Profiles\Shecti> set LANG=& bash -c "d ??????"
> /mnt/c/Profiles/Shecti/?????? doesn't exist!
>
> C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "d ??????"
> /mnt/c/Profiles/Shecti/?????? doesn't exist!
>
> C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "d ??????"
> /mnt/c/Profiles/Shecti/?????? doesn't exist!

Your example is puzzling me no end.  All of the above commands are
using the filename in UTF-8 encoding but given byte by byte.

When I paste such a filename into the console, I get two Chinese
characters.  They look like empty squares since I don't have a matching
OEM charset installed on my machine (and `chcp 936' fails with "Invalid
codepage").
But they are two characters, and when I use them in the above commands,
ls as well as d work fine, independent of the LANG setting.

I don't understand how you pasted the UTF-8 value into the console and
got the single byte values in a visible way.

Hmm.  Did you paste the output of an ls command?  If so, that can't work
correctly.  The result is a filename which is different from the
original filename.  That only works by chance.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
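
The "UTF-8 given byte by byte" effect discussed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the file name below is hypothetical (the real name in the thread is not visible), and Latin-1 stands in for whatever single-byte code page the console used, chosen because it maps every byte value to a character.

```python
# Hypothetical two-character file name; the real one in the thread is unknown.
name = "中文"

# UTF-8 form of the name: three bytes per character for these two characters.
utf8_bytes = name.encode("utf-8")

# Treat every byte as its own character, as a single-byte code page would.
# Latin-1 is an assumption; it is used because it defines all 256 byte values.
bytewise = utf8_bytes.decode("latin-1")

# Encoding that byte-wise string as UTF-8 again yields a different byte
# sequence, i.e. a different file name from the original.
roundtrip = bytewise.encode("utf-8")

print(len(name), len(utf8_bytes), len(bytewise), len(roundtrip))
print(roundtrip == utf8_bytes)
```

The two-character name becomes six separate characters once its UTF-8 bytes are read one at a time, and converting those six characters back to UTF-8 no longer reproduces the original bytes. That is consistent with a pasted name only matching the file on disk by chance.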