Message-ID: <49D0BF21.9000304@gmail.com>
Date: Mon, 30 Mar 2009 13:46:25 +0100
From: Dave Korn <dave DOT korn DOT cygwin AT googlemail DOT com>
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?
References: <B7436E984B004D5EB2413E4174482335 AT SEMENTINA> <20090330121043 DOT GT12738 AT calimero DOT vinschen DOT de>
In-Reply-To: <20090330121043.GT12738@calimero.vinschen.de>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Corinna Vinschen wrote:
> On Mar 30 13:48, Michael Moser wrote:
>> I need to mangle a file containing "8-bit ASCII" characters (i.e. the
>> file contains also characters in the upper 8-bit range, namely a few
>> umlauts as well as some french accented characters).
>>
>> Strange enough, the SED version that came as part of cygwin emits the
>> result of the mangling using 16-bit characters (I believe those are
>> Unicode-16 characters, but not sure. The Hexeditor shows each second
>> byte as always 00, except for the first two bytes which read FF FE).
>
> This is very likely not Cygwin's sed.  Do you have another sed in $PATH
> by any chance?  I tried with input files containing german umlauts and
> sed does not convert to wide char and it does not produce a BOM marker
> at the start of the file.

  Another possibility is that wordpad or notepad has tried to be clever and
gone and unexpectedly saved the original source file in UTF-16.  Did you
verify the original source file in a hexeditor too, Michael?

    cheers,
      DaveK
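
For anyone who would rather script the check than open a hex editor, the
following is a minimal sketch in Python (not something anyone in the thread
posted). It looks at the first bytes of a file for a byte-order mark; FF FE is
the UTF-16 little-endian mark described above. The function names
`detect_bom` and `detect_file_bom` are made up for this illustration.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Return a label for the byte-order mark at the start of data.

    Hypothetical helper for illustration only. It recognises the BOMs
    defined in the standard codecs module and nothing else.
    """
    # Check the 4-byte UTF-32 marks first: the UTF-32 LE mark also
    # begins with FF FE, so the order of these tests matters.
    if data.startswith(codecs.BOM_UTF32_LE) or data.startswith(codecs.BOM_UTF32_BE):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8-sig"
    return "none"


def detect_file_bom(path: str) -> str:
    """Read the first four bytes of the file at path and classify them."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

Running `detect_file_bom` on both the input file and the sed output should
show which of the two already carries the FF FE mark. A result of "none" only
means no BOM was found; it does not prove the file is 8-bit text.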