Date: Thu, 14 May 2009 15:26:01 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8
Message-ID: <20090514132601.GC21324@calimero.vinschen.de>

On May 14 21:39, IWAMURO Motonori wrote:
> 2009/5/14 Corinna Vinschen :
> >> > Should the following part not be modified?
> >> >
> >> > winsup/cygwin/fhandler_console.cc:
> >> > > dev_state->con_mbtowc = __mbtowc;
> >> > > dev_state->con_wctomb = __wctomb;
> >>
> >> I'd rather not.  It only affects the console and if LANG=C I'd rather
> >> see the single bytes which make up the path instead of the corresponding
> >> UTF-8 character.
> >
> > Hm, maybe I misunderstood.  In which manner should this be modified?
>
> I think:
>
> dev_state->con_mbtowc = __mbtowc == __ascii_mbtowc ? __utf8_mbtowc : __mbtowc;
> dev_state->con_wctomb = __wctomb == __ascii_wctomb ? __utf8_wctomb : __wctomb;

Oh, ok.  So I understood right.  But that's exactly what I didn't want
to do.  The idea is that, even though UTF-8 is used for the filename
conversion, the console should default to standard ASCII behaviour,
unless you specify another charset before starting the first Cygwin
process in the console.  I'm also wondering if we should perhaps only
allow either ASCII or UTF-8 as console charsets, but for now I don't
want to touch this more than necessary.

I just found that the console I/O doesn't work well for non-ASCII chars
anyway.  The core function which echoes input to the terminal only
handles single-byte chars, which can be easily reproduced using
copy/paste.  Oh well.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
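
To make the difference between the two console policies discussed above concrete, here is a minimal, self-contained sketch.  It is not the Cygwin code: `mbtowc_fn`, `ascii_mbtowc`, `utf8_mbtowc`, `console_inherit` and `console_prefer_utf8` are hypothetical stand-ins with simplified signatures, assumed only for illustration.  The real `__mbtowc` hooks in newlib take different arguments, and the UTF-8 decoder below does not reject overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified converter type (NOT newlib's real signature):
   decode one character from s[0..n), store it in *out, and return the
   number of bytes consumed, or -1 on error. */
typedef int (*mbtowc_fn)(unsigned *out, const unsigned char *s, size_t n);

/* ASCII-style stand-in: every byte is passed through as one character. */
static int ascii_mbtowc(unsigned *out, const unsigned char *s, size_t n)
{
    if (n == 0)
        return -1;
    *out = s[0];
    return 1;
}

/* Minimal UTF-8 stand-in (no overlong or surrogate checks). */
static int utf8_mbtowc(unsigned *out, const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned cp;

    if (n == 0)
        return -1;
    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else
        return -1;                  /* stray continuation or invalid lead byte */
    if (n < len)
        return -1;                  /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;              /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *out = cp;
    return (int)len;
}

/* Policy kept in the mail: the console uses whatever the process uses,
   so with LANG=C the individual bytes stay visible. */
static mbtowc_fn console_inherit(mbtowc_fn process_conv)
{
    assert(process_conv != NULL);
    return process_conv;
}

/* Policy from the quoted proposal: fall back to UTF-8 whenever the
   process-wide converter is the ASCII one. */
static mbtowc_fn console_prefer_utf8(mbtowc_fn process_conv)
{
    assert(process_conv != NULL);
    return process_conv == ascii_mbtowc ? utf8_mbtowc : process_conv;
}
```

With the two bytes 0xC3 0xA9 as input, the converter returned by `console_inherit(ascii_mbtowc)` consumes a single byte and yields 0xC3, while the one returned by `console_prefer_utf8(ascii_mbtowc)` consumes both bytes and yields U+00E9.  That is the "single bytes instead of the corresponding UTF-8 character" distinction drawn in the quoted text.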