Date: Mon, 30 Mar 2009 14:10:43 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?

On Mar 30 13:48, Michael Moser wrote:
> I need to mangle a file containing "8-bit ASCII" characters (i.e. the
> file contains also characters in the upper 8-bit range, namely a few
> umlauts as well as some french accented characters).
>
> Strange enough, the SED version that came as part of cygwin emits the
> result of the mangling using 16-bit characters (I believe those are
> Unicode-16 characters, but not sure. The Hexeditor shows each second
> byte as always 00, except for the first two bytes which read FF FE).

This is very likely not Cygwin's sed.  Do you have another sed in $PATH
by any chance?  I tried with input files containing German umlauts:
sed does not convert them to wide characters, and it does not write a
BOM at the start of the file.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
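
The two checks discussed in this message (does the output start with the bytes FF FE, and is there more than one sed in $PATH) could be scripted roughly as follows. This is only a sketch in Python, not something from the original thread; the function names and the idea of looking for both "sed" and "sed.exe" are assumptions made for illustration.

```python
import os

# FF FE is the UTF-16 little-endian byte order mark (BOM).
# Note: UTF-32 LE starts with FF FE 00 00, so it also matches this prefix.
UTF16_LE_BOM = b"\xff\xfe"


def starts_with_utf16le_bom(data):
    """Return True if the byte string begins with FF FE."""
    return data.startswith(UTF16_LE_BOM)


def looks_like_utf16le_text(data):
    """Heuristic for the symptom in the report: a leading FF FE, followed by
    byte pairs whose second byte is always 00 (true for ASCII and Latin-1
    range characters encoded as UTF-16 LE)."""
    if not starts_with_utf16le_bom(data):
        return False
    body = data[2:]
    return len(body) % 2 == 0 and all(b == 0 for b in body[1::2])


def find_all_on_path(names, path=None):
    """Return every existing file called one of `names` in the directories
    of `path` (defaults to $PATH), in search order.  More than one hit for
    sed means a different sed may be shadowing the expected one."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits
```

To use it, read the first bytes of the suspicious output file in binary mode and pass them to looks_like_utf16le_text(), then call find_all_on_path(["sed", "sed.exe"]) and see which entry comes first. From a shell prompt, `type -a sed` gives the same PATH information.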