Date: Wed, 13 May 2009 21:46:45 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8
Message-ID: <20090513194645.GV21324@calimero.vinschen.de>
In-Reply-To: <3f0ad08d0905131213k6c8f1b25h3322f20b5bb80631@mail.gmail.com>
Reply-To: cygwin AT cygwin DOT com

On May 14 04:13, IWAMURO Motonori wrote:
> 2009/5/14 Corinna Vinschen :
> > I already wrote that patch, see
> > http://cygwin.com/ml/cygwin-cvs/2009-q2/msg00066.html
> > It seems to do what you are proposing.
>
> I read it and built cygwin1.dll. It seems to work correctly.
>
> Should the following part not be modified?
>
> winsup/cygwin/fhandler_console.cc:
> > dev_state->con_mbtowc = __mbtowc;
> > dev_state->con_wctomb = __wctomb;

I'd rather not.
It only affects the console and if LANG=C I'd rather see the single bytes
which make up the path instead of the corresponding UTF-8 character.

> But I think the patch solves only the case of UTF-8 in the thread
> starting at http://cygwin.com/ml/cygwin/2009-05/msg00245.html.
>
> It is necessary to separate the following variables for the library
> and for the system to support encoding that is not UTF-8.
>
> - __mb_cur_max
> - lc_ctype_charset
> - __mbtowc
> - __wctomb

I understand what you're up to, but right now I'm not really sure that
this is the way to go. I had this idea as well at one point, but,
thinking about it, I see a couple of potential problems. I don't want
to decouple the libraries' idea of a string from the application's idea.

I tried various scenarios with the current solution and they all worked
ok, one way or the other. I'm sure there are still some which don't
work, but before doing what you propose, I'd rather see explicit
failures. And have some time to discuss whether these are something the
user can or even should fix or workaround alone.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
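
The console remark above (single bytes under LANG=C versus one UTF-8
character) can be illustrated with a small, self-contained sketch in C.
This is not Cygwin code: decode_fn, decode_byte, decode_utf8 and
count_chars are hypothetical names that only stand in for the idea of a
swappable conversion hook such as con_mbtowc, and the real __mbtowc has a
different, newlib-specific signature. The UTF-8 decoder is deliberately
minimal and does not reject overlong forms or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: decode one character from s (n bytes available),
   store it in *out, return the number of bytes consumed or -1 on error.
   Assumption for illustration only; not the signature of newlib's __mbtowc. */
typedef int (*decode_fn)(uint32_t *out, const unsigned char *s, size_t n);

/* Byte-wise view, as one would see under LANG=C: each byte is a character. */
int decode_byte(uint32_t *out, const unsigned char *s, size_t n)
{
    if (n == 0)
        return -1;
    *out = s[0];
    return 1;
}

/* Minimal UTF-8 view: combine a lead byte and its continuation bytes. */
int decode_utf8(uint32_t *out, const unsigned char *s, size_t n)
{
    int len;
    uint32_t cp;

    if (n == 0)
        return -1;
    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte sequence */
        len = 2; cp = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte sequence */
        len = 3; cp = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte sequence */
        len = 4; cp = s[0] & 0x07;
    } else {
        return -1;                          /* stray continuation or invalid byte */
    }
    if ((size_t)len > n)
        return -1;                          /* truncated sequence */
    for (int i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;                      /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *out = cp;
    return len;
}

/* Count how many characters a console using the given hook would show.
   An undecodable byte is counted as one character and skipped. */
size_t count_chars(decode_fn decode, const unsigned char *s, size_t n)
{
    size_t count = 0;

    while (n > 0) {
        uint32_t cp;
        int used = decode(&cp, s, n);
        if (used < 0)
            used = 1;
        s += used;
        n -= (size_t)used;
        count++;
    }
    return count;
}
```

For the file name bytes C3 A4 2E 74 78 74 (U+00E4 followed by ".txt" in
UTF-8), count_chars reports 6 characters with decode_byte and 5 with
decode_utf8, which is the difference between seeing the raw bytes and
seeing the corresponding UTF-8 character.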