Date: Wed, 2 Feb 2011 17:35:16 +0100
From: Corinna Vinschen
To: cygwin AT cygwin DOT com, bug-gnulib AT gnu DOT org, bug-coreutils AT gnu DOT org
Subject: Re: 16-bit wchar_t on Windows and Cygwin
Message-ID: <20110202163516.GI2675@calimero.vinschen.de>
In-Reply-To: <20110202162801.GH2675@calimero.vinschen.de>

On Feb 2 17:28, Corinna Vinschen wrote:
> On Feb 2 17:02, Bruno Haible wrote:
> > But if you say that the application should convert UTF-16 surrogates
> > to UTF-32 before calling iswalpha: That's certainly a requirement
> > for Cygwin 1.7.x application that want to support the entire Unicode
> > character set. But it's outside of POSIX, and many GNU programs will
> > not want to include this added complexity. Just try to apply this
> > suggestion to gnulib's quotearg.c, then estimate the time someone
> > would need to apply it also to regcomp.c, strftime.c, mbscasestr.c,
> > coreutils/src/wc.c, and so on.
>
> Cygwin's regcomp is taken from FreeBSD and is UTF-16 capable, including
> surrogate handling. It only required two changes in the code.
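
For reference, the surrogate-to-UTF-32 step mentioned in the quoted text could look roughly like the sketch below. This is only an illustration under stated assumptions: utf16_decode is a hypothetical helper name, it is not taken from Cygwin, FreeBSD's regcomp, or gnulib, and passing an unpaired surrogate through unchanged is just one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Cygwin or gnulib): decode one code
   point from a UTF-16 buffer of n units.  Stores the code point in *cp
   and returns the number of 16-bit units consumed (0 if n == 0). */
static size_t utf16_decode(const uint16_t *s, size_t n, uint32_t *cp)
{
    if (n == 0)
        return 0;

    uint16_t hi = s[0];
    if (hi >= 0xD800 && hi <= 0xDBFF && n >= 2) {
        uint16_t lo = s[1];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            /* Valid pair: 10 bits from each half, offset by 0x10000. */
            *cp = 0x10000
                  + (((uint32_t)(hi - 0xD800) << 10)
                     | (uint32_t)(lo - 0xDC00));
            return 2;
        }
    }

    /* BMP character, or an unpaired surrogate passed through as-is
       (assumption: the caller decides how to treat those). */
    *cp = hi;
    return 1;
}
```

The decoding itself is short. The cost raised in the quoted text is that every caller iterating over wchar_t units would need to go through a step like this before classifying a character.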

Btw, I would sure be glad if Cygwin used a 4-byte wchar_t as well.
The problem is that this requires too many changes at once to work
right, and it would introduce a lot of backward compatibility problems
which would have to be handled. If only the ones who decided that
wchar_t in Cygwin should have the same size as WCHAR_T in the underlying
Windows had thought twice about the implications...


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat