Mail Archives: cygwin/2009/03/30/07:36:51

Message-ID: <BLU113-W19297AEE1A7E9B35BDB826BE8D0@phx.gbl>
From: Mike Marchywka <marchywka AT hotmail DOT com>
To: <cygwin AT cygwin DOT com>
Subject: RE: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?
Date: Mon, 30 Mar 2009 08:36:19 -0400
In-Reply-To: <20090330121043.GT12738@calimero.vinschen.de>
References: <B7436E984B004D5EB2413E4174482335 AT SEMENTINA> <20090330121043 DOT GT12738 AT calimero DOT vinschen DOT de>






----------------------------------------
> Date: Mon, 30 Mar 2009 14:10:43 +0200
> From: corinna-cygwin AT cygwin DOT com
> To: cygwin AT cygwin DOT com
> Subject: Re: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?
>
> On Mar 30 13:48, Michael Moser wrote:
>> I need to mangle a file containing "8-bit ASCII" characters (i.e. the
>> file contains also characters in the upper 8-bit range, namely a few
>> umlauts as well as some french accented characters).
>>
>> Strange enough, the SED version that came as part of cygwin emits the
>> result of the mangling using 16-bit characters (I believe those are
>> Unicode-16 characters, but not sure. The Hexeditor shows each second
>> byte as always 00, except for the first two bytes which read FF FE).
>
> This is very likely not Cygwin's sed. Do you have another sed in $PATH
> by any chance? I tried with input files containing german umlauts and
> sed does not convert to wide char and it does not produce a BOM marker
> at the start of the file.
>

On a related note, "which" sometimes gives (or gave) confusing results if you
don't have the relevant "x" permissions set
( chmod 777 fixes everything LOL ).
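
To make Corinna's question about a second sed in $PATH concrete, here is a rough sketch that walks $PATH by hand and prints every file with the given name, flagging any that lack the execute bit. The function name find_all is made up for this illustration and is not a Cygwin tool; this is only one way to do the check, not the way "which" works internally.

```shell
#!/bin/sh
# find_all is a hypothetical helper name, not part of Cygwin.
# Print every regular file called "$1" found in the PATH directories,
# in search order, and mark the ones without the execute bit.
find_all() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        candidate=$dir/$name
        if [ -f "$candidate" ]; then
            if [ -x "$candidate" ]; then
                echo "$candidate"
            else
                echo "$candidate (not executable)"
            fi
        fi
    done
    IFS=$old_ifs
}

find_all sed
```

If more than one line comes back, the first executable entry is the one the shell runs. In bash, "type -a sed" gives a similar listing without the helper.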

I know I've never seen this problem with sed, and in fact I
use it with no special options to strip the 0x00 bytes from Unicode
files, and it works fine ( apparently the Windows registry
is Unicode, and if you ever need a set of verbose, highly
redundant strings it is a good place to look ).
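
A minimal sketch of what that looks like, assuming GNU sed (the \x00 escape is a GNU extension) and a UTF-16LE file like the one Michael describes, with FF FE as the first two bytes. The function names has_utf16le_bom and strip_nuls are invented for this example.

```shell
#!/bin/sh
# has_utf16le_bom and strip_nuls are hypothetical helper names.

# Succeed if the file starts with the UTF-16LE byte order mark FF FE.
has_utf16le_bom() {
    # od prints the first two bytes as hex; tr removes the padding.
    [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "fffe" ]
}

# Delete every NUL byte from the file and write the result to stdout.
# Assumes GNU sed; LC_ALL=C makes sed treat the input as plain bytes.
strip_nuls() {
    LC_ALL=C sed 's/\x00//g' "$1"
}
```

Note that this only drops the zero bytes: the two BOM bytes are still at the front of the output, and the result is readable only when every character fits in a single byte. For a real conversion, something like "iconv -f UTF-16 -t UTF-8" is the safer route.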


>
> Corinna
>
> --
> Corinna Vinschen Please, send mails regarding Cygwin to
> Cygwin Project Co-Leader cygwin AT cygwin DOT com
> Red Hat
>
> --
> Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
> Problem reports: http://cygwin.com/problems.html
> Documentation: http://cygwin.com/docs.html
> FAQ: http://cygwin.com/faq/
>


