From: Greg Chicares
Date: Sun, 19 Sep 2010 22:34:01 +0000
To: cygwin AT cygwin DOT com
Subject: Re: awk gsub problem
Message-ID: <4C968FD9.30104@sbcglobal.net>

On 2010-09-19 20:33Z, Lee wrote:
> [...awk character ranges are locale-sensitive...]
>
> Was the reply from the upstream maintainer answered on a mailing list?
> (& if so, which one?)  I'd like to understand the problem they're
> solving..  I get the idea of "[[:lower:]]" working regardless of
> collating order of the current char set, but how "[a-z]" gets
> translated to something like "[aAbBcCdD...zZ]" boggles my mind.  It
> seems like they had to have gone out of their way to translate [a-z]
> into a case-insensitive RE.

Discussed here:
  http://www.gnu.org/manual/gawk/html_node/Character-Lists.html#Character-Lists
And here's the same 'aAbBcC' question for 'ls' on Solaris:
  http://groups.google.com/group/comp.unix.solaris/browse_thread/thread/8526e1b6eb18fb31/
It's not specific to gawk.

>        --traditional
>               Traditional Unix awk regular expressions are matched.  The GNU
>               operators are not special, interval expressions are not
>               available, and neither are the POSIX character classes
>               ([[:alnum:]] and so on).

That option doesn't override the locale; to do that, see:
  http://www.gnu.org/manual/gawk/html_node/Locales.html#Locales
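
To illustrate the locale override, here is a minimal sketch, assuming a POSIX shell and some awk on the PATH. The sample input and the variable name `result` are made up for illustration. Setting LC_ALL=C for the one command should make [a-z] a plain byte-value range, so uppercase letters are left alone:

```shell
# Run awk in the C locale for this single command only.
# In the C locale, [a-z] covers just the 26 ASCII lowercase letters.
result=$(printf '%s\n' 'Hello World' | LC_ALL=C awk '{ gsub(/[a-z]/, "x"); print }')

# Show the substituted line; the capitals H and W are untouched.
printf '%s\n' "$result"   # prints "Hxxxx Wxxxx"
```

Without the LC_ALL=C prefix, the result depends on the current locale and on the awk version, which is the behavior discussed above.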