Mail Archives: cygwin/2010/09/22/16:53:10
Date: | Wed, 22 Sep 2010 21:52:53 +0100
|
Subject: | Re: Instead of a gripe, a memory-jog.
|
From: | Andy Koppe <andy DOT koppe AT gmail DOT com>
|
To: | cygwin AT cygwin DOT com
On 22 September 2010 14:29, SJ Wright wrote:
> Andy Koppe wrote:
>>
>> On 22 September 2010 00:29, SJ Wright wrote:
>>
>>>>
>>>> Yes. I noticed where I had the territory mis-cased the next time I ran
>>>> wget. In the line that identified the file and URL for each download,
>>>> double-quotes and other punctuation became garbage characters, where they
>>>> hadn't been when I either had *no* LANG variable set or a correctly-written
>>>> one. So now it's fixed. Thanks again.
>>>>
>>
>> If LANG (and also LC_ALL and LC_CTYPE) aren't set, Cygwin defaults to
>> UTF-8. It's better to have it set though, because some programs such
>> as emacs default to plain ol' ASCII if the locale isn't set. That's
>> why LANG is set to C.UTF-8 during login shell startup (by
>> /etc/defaults/etc/profile.d/lang.sh). In other words, you shouldn't
>> have to worry about it.
>>
>>
>>>
>>> Spoke too soon on the wget matter. Since setting a LANG variable in the
>>> first place (and evidently the right place, or else this wouldn't be a
>>> "matter"), I've been seeing garbage text -- I prefer to call it "drone text"
>>> -- in place of quotation marks during normal (non-verbose and not set to
>>> "quiet") downloads. Here's a sample:
>>>
>>>>
>>>> Saving to: â€œgae77-7748-244-958stck.jpgâ€
>>>>
>>
>> That looks like wget is using UTF-8 yet your terminal is using
>> ISO-8859-1. The Cygwin console as well as all the terminals shipped
>> with Cygwin (except for rxvt) use UTF-8 by default. With other
>> terminals, you might have to select it somewhere in their options.
>
> Well, my LANG is C.UTF-8, and the garbage in wget turned back into single-
> and double-quotes as soon as I added the command to .wgetrc I mentioned. So
> it turns out, at least in my case, that "local_encoding=UTF-8" does
> something positive with how commands/running task steps are displayed.

The use of fancy Unicode quotes in wget is actually controlled by the
locale setting, i.e. LANG and relatives. LANG=C.UTF-8 gives you ASCII
quotes, whereas LANG=en_US.UTF-8 results in Unicode quotes. As far as
I can see, the local_encoding setting has no bearing on this.

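To make "LANG and relatives" concrete, here is a minimal Python sketch. The first function follows the standard POSIX precedence (LC_ALL overrides LC_CTYPE, which overrides LANG). The second, expected_quotes, is a hypothetical helper that only encodes the two cases observed above; it is not how wget itself decides, and any other locale is reported as "unspecified".

```python
def effective_ctype(env):
    """Return the locale name that governs character handling, or None.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Empty values count as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


def expected_quotes(locale_name):
    """Hypothetical helper: quote style per the two cases described above.

    Assumption: this mirrors the observed behaviour only, not wget's code.
    """
    if not locale_name:
        return "unspecified"
    language, _, codeset = locale_name.partition(".")
    codeset = codeset.split("@")[0]  # drop any @modifier
    is_utf8 = codeset.upper().replace("-", "") == "UTF8"
    if language == "C" and is_utf8:
        return "ascii"       # LANG=C.UTF-8 gives ASCII quotes
    if language == "en_US" and is_utf8:
        return "unicode"     # LANG=en_US.UTF-8 gives Unicode quotes
    return "unspecified"     # not covered by the observation


if __name__ == "__main__":
    import os
    name = effective_ctype(os.environ)
    print(name, expected_quotes(name))
```

Run in the shell you launch wget from, it prints the locale name that is actually in effect, which is handy when LC_ALL or LC_CTYPE silently overrides LANG.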
> This was, coincidentally, in rxvt that all of this was happening. I've yet
> to try it in MinTTY. I don't expect much of a difference: these are
> 'peripheral' variables set and if a UTF-8 works from two directions in a
> term that isn't built to like it, then 'how much better should it be in one
> that does?' is not even a question worth asking, imo.

Well, if you're not using anything beyond ASCII, then no, rxvt's lack
of UTF-8 support doesn't matter. But don't come back crying when you
do encounter more funny letters.

Rxvt actually uses CP1252, which is MS's extended version of
ISO-8859-1, hence appropriate locale settings for rxvt use one of
those charsets, e.g. LANG=en_US.CP1252.

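The garbage in the quoted "Saving to:" line is what UTF-8 quotation marks look like when their bytes are read as CP1252. A minimal Python sketch of that mismatch follows; the function name is made up for illustration, and dropping bytes that CP1252 leaves undefined (such as 0x9D) is an assumption, since a real terminal may render them differently.

```python
def as_seen_in_cp1252(text):
    """Encode text as UTF-8, then decode the resulting bytes as CP1252.

    Bytes with no CP1252 mapping are dropped (errors="ignore"); this is an
    assumption about display, not a statement of what rxvt does with them.
    """
    return text.encode("utf-8").decode("cp1252", errors="ignore")


if __name__ == "__main__":
    # U+201C and U+201D are the left and right double quotation marks.
    line = "Saving to: \u201cgae77-7748-244-958stck.jpg\u201d"
    print(as_seen_in_cp1252(line))
```

Each Unicode quote is three bytes in UTF-8 (E2 80 9C and E2 80 9D), so a single-byte charset shows up to three characters per quote. Plain ASCII text passes through unchanged, which is why the problem only appears with non-ASCII characters.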
Andy