Mail Archives: cygwin/2004/04/02/16:55:02

Date: Fri, 2 Apr 2004 16:54:35 -0500 (EST)
From: Igor Pechtchanski <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: Arifi Koseoglu <arifi AT tnn DOT net>
cc: cygwin AT cygwin DOT com
Subject: Re: [Q] Use of UTF-8 in cygwin bash shell scripts
In-Reply-To: <CKEEILAKADKCNPNMDCJPIEAECAAA.arifi@tnn.net>
Message-ID: <Pine.GSO.4.56.0404021644160.2840@slinky.cs.nyu.edu>
References: <CKEEILAKADKCNPNMDCJPIEAECAAA DOT arifi AT tnn DOT net>

On Sat, 3 Apr 2004, Arifi Koseoglu wrote:

> Hello everyone.
>
> I have a question regarding the use of UTF-8 in a Cygwin bash shell script
> under Windows XP and 2000 (does the behavior differ between 2000 and XP?).
>
> I have a bash script automatically generated with a Perl program, which is
> supposed to copy files from one disk to another and at the same time replace
> all international characters in the filename and path with English
> counterparts (for example, c with cedilla becomes c).
>
> The lines in the shell script are all of the form:
>
> cp "source path with international chars in it" "target with no
> international chars"
>
> The shell script is generated/saved in UTF-8 encoding. (since it has to
> properly contain the international chars). By the way, with international I
> mean the additional characters in the Turkish alphabet - but the same
> question should apply to all non-English alphabets.
>
> Now, I cannot get the script to work. I can 'ls' the files using
>
> $ ls "source path with international chars in it"
>
> the listing displays the Turkish characters properly, however whenever I go
> ahead to execute the script, bash complains that "source path with
> international chars in it" cannot be found.
>
> What am I missing? Does bash not support scripts encoded in UTF-8? Should I
> use another Unicode encoding (and how)? Or should I trash this method and try
> something else (what?). There are thousands of files to be renamed.
>
> I would deeply appreciate any pointers. Many thanks in advance.
> Best,
> Arifi

Bash doesn't support UTF-8.  You might get away with saving the script in
the 8-bit encoding that matches your codepage.  Alternatively, just iterate
over the directory contents and rename each file from Perl.
	Igor
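
The second suggestion (walk the directory and rename each file from a
script, instead of generating cp lines for bash) could look roughly like
the sketch below.  It is written in Python rather than Perl, purely as an
illustration.  The names TURKISH_TO_ASCII, ascii_name and rename_tree are
made up for this example, the letter mapping is an assumption about which
replacements are wanted, and the sketch assumes a Python 3 interpreter that
sees file names as Unicode in precomposed form.

```python
import os

# Assumed mapping of Turkish letters to ASCII counterparts; extend as needed.
TURKISH_TO_ASCII = str.maketrans({
    "ç": "c", "Ç": "C",
    "ğ": "g", "Ğ": "G",
    "ı": "i", "İ": "I",
    "ö": "o", "Ö": "O",
    "ş": "s", "Ş": "S",
    "ü": "u", "Ü": "U",
})


def ascii_name(name):
    """Return name with the mapped Turkish letters replaced by ASCII ones."""
    return name.translate(TURKISH_TO_ASCII)


def rename_tree(root, dry_run=True):
    """Rename files and directories under root whose names contain mapped letters.

    Returns a list of (old_path, new_path) pairs. With dry_run=True nothing
    on disk is changed, so the plan can be reviewed first.
    """
    planned = []
    # Walk bottom-up so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = ascii_name(name)
            if new_name == name:
                continue  # nothing to replace in this name
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.exists(new_path):
                continue  # skip rather than overwrite an existing entry
            planned.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return planned
```

Calling rename_tree(some_dir) first with the default dry_run=True returns
the planned renames without touching anything; passing dry_run=False
performs them.  Names that would collide with an existing entry are skipped,
so those need to be handled by hand.
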
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton
