Date: Fri, 2 Apr 2004 16:54:35 -0500 (EST)
From: Igor Pechtchanski
Reply-To: cygwin AT cygwin DOT com
To: Arifi Koseoglu
Cc: cygwin AT cygwin DOT com
Subject: Re: [Q] Use of UTF-8 in cygwin bash shell scripts

On Sat, 3 Apr 2004, Arifi Koseoglu wrote:

> Hello everyone.
>
> I have a question regarding the use of UTF-8 in a cygwin-bash shell script
> under Windows XP and 2000 (does the behavior differ between 2000 and XP?).
>
> I have a bash script automatically generated with a Perl program, which is
> supposed to copy files from one disk to another and at the same time replace
> all international characters in the filename and path with English
> counterparts (for example c with cedilla becomes c).
>
> The lines in the shell script are all of the form:
>
> cp "source path with international chars in it" "target with no
> international chars"
>
> The shell script is generated/saved in UTF-8 encoding (since it has to
> properly contain the international chars). By the way, with international I
> mean the additional characters in the Turkish alphabet - but the same
> question should apply to all non-English alphabets.
>
> Now, I cannot get the script to work. I can 'ls' the files using
>
> $ ls "source path with international chars in it"
>
> the listing displays the Turkish characters properly, however whenever I go
> ahead to execute the script, bash complains that "source path with
> international chars in it" cannot be found.
>
> What am I missing? Does bash not support scripts encoded in UTF-8?
> Should I use another Unicode encoding (and how?) Or should I trash this
> method and try something else (what?). There are thousands of files to be
> renamed.
>
> I will appreciate any pointers deeply. Many thanks in advance.
> Best,
> Arifi

Bash doesn't support UTF-8. You might get away with using the appropriate
8-bit encoding based on your codepage. Alternatively, just iterate over the
directory contents and rename each file from Perl.
	Igor
-- 
http://cs.nyu.edu/~pechtcha/
pechtcha AT cs DOT nyu DOT edu
igor AT watson DOT ibm DOT com
Igor Pechtchanski, Ph.D.

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster." -- Patrick Naughton
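
The second suggestion above (iterate over the directory contents and handle each file from a script, so that no non-ASCII path ever has to appear inside a bash script) might look roughly like the sketch below. It is written in Python rather than the Perl mentioned in the reply, and it copies instead of renaming, since the original question was about copying to another disk. The names ascii_fold, copy_tree_folded and EXTRA_MAP are hypothetical. The mapping table is an assumption that covers only the Turkish dotless i; other letters without a Unicode decomposition would pass through unchanged.

```python
import os
import shutil
import unicodedata

# Assumption: letters with no Unicode decomposition need an explicit entry.
# Only the Turkish dotless i is listed; extend the table for other alphabets.
EXTRA_MAP = str.maketrans({"\u0131": "i"})


def ascii_fold(name):
    """Replace accented letters in one path component with their base letters."""
    decomposed = unicodedata.normalize("NFKD", name.translate(EXTRA_MAP))
    # Drop the combining marks (cedilla, breve, diaeresis, dot above, ...).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def copy_tree_folded(src_root, dst_root):
    """Copy every file under src_root to dst_root, folding each path component.

    Returns a list of (source, target) pairs. Raises FileExistsError if a
    target already exists, for example when two names fold to the same result.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        parts = [] if rel_dir == os.curdir else rel_dir.split(os.sep)
        target_dir = os.path.join(dst_root, *[ascii_fold(p) for p in parts])
        os.makedirs(target_dir, exist_ok=True)
        for filename in filenames:
            source = os.path.join(dirpath, filename)
            target = os.path.join(target_dir, ascii_fold(filename))
            if os.path.exists(target):
                raise FileExistsError(target)
            shutil.copy2(source, target)
            copied.append((source, target))
    return copied


# Example call with hypothetical paths:
# copy_tree_folded("/cygdrive/d/source", "/cygdrive/e/target")
```

Because the file names are read from the filesystem at run time, the script itself stays pure ASCII. Whether the names come back correctly still depends on how the runtime in use decodes Windows file names, so a dry run on a small directory is a sensible first step.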