Mail Archives: djgpp-workers/1998/02/11/13:53:02

Message-ID: <34E1F2BE.24BB@tempe.vlsi.com>
Date: Wed, 11 Feb 1998 11:49:34 -0700
From: Charles Marslett <charles DOT marslett AT tempe DOT vlsi DOT com>
Reply-To: charles DOT marslett AT tempe DOT vlsi DOT com
Organization: VLSI Technology, Inc.
MIME-Version: 1.0
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
CC: djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980211114656 DOT 15677A-100000 AT is>

Eli Zaretskii wrote:
> 
> On Tue, 10 Feb 1998, Vik Heyndrickx wrote:
> 
> > - All DOS compilers that I know about (not many), use 'unsigned char' by
> > default. SGI uses 'unsigned char' ;)
> 
> It would be interesting to know why did GCC choose signed char for
> x86.  Does anybody know?  Should we ask the GCC maintainers?  Or maybe
> somebody can tell what are the advantages of signed char?
> 
> The reason I think this would be educational is that Vik lists so many
> disadvantages of this choice, it almost makes you think GCC is dumb.

Well, Microsoft C compilers, Watcom C compilers and the traditional K&R
compilers all default to signed.  These are 3 of 4 major sources (Borland,
I think, defaults to unsigned unless you use the IDE, then it defaults to
whatever you used last, and Symantec defaults to unsigned -- but it has
so many idiosyncrasies that Symantec code is virtually a language of its
own anyway).  So signed is the traditional treatment if you want the code
to be most portable (in or out of the PC world).  For that matter, the
pre-ANSI compilers had no 'signed' keyword, so if the default was not
signed, there was no mechanism to create a signed byte value.

So the decision may be too historical or parochial, but not implicitly
dumb.
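
For anyone who wants to check what a given compiler does, here is a
minimal sketch.  The function names are made up for illustration, and it
assumes 8-bit bytes and the usual wrap-around when an out-of-range value
is converted to a signed type (the conversion is implementation-defined):

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed on this compiler, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Store the byte 0xE9 in a plain char and widen it to int.
   The result depends on the compiler's default signedness. */
int widen_plain_char(void)
{
    assert(CHAR_BIT == 8);      /* assumption of this sketch */
    char c = (char)0xE9;
    return c;
}

/* With the ANSI 'signed' keyword the byte is signed regardless of the
   default; pre-ANSI compilers had no way to spell this. */
int widen_signed_char(void)
{
    signed char c = (signed char)0xE9;
    return c;
}

/* 'unsigned char' is always non-negative. */
int widen_unsigned_char(void)
{
    unsigned char c = 0xE9;
    return c;
}
```

With a signed default, widen_plain_char() should match
widen_signed_char(); with an unsigned default it should match
widen_unsigned_char().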

> > - A char can be used as an array subscript, especially in translation
> > tables. Most of the time (99%) the user does not expect that this value
> > can be negative.
> 
> This unexpected effect is only understandable if a user thinks that
> char type is somehow ``magical'' because it represents printable
> characters.  But that is not how C defines them: in C they are just
> small integers.
> 
> Also, ANSI allows an index of -1.

As a matter of fact, several 68K compilers I used (a decade ago, I have
to admit) used 257-entry tables for the is* macros for exactly this
reason: one extra slot so that an index of -1 (EOF) is valid.  If called
with an unsigned char (or int) value, the full 256 character set was
supported, but if called with a default (signed) char value, only 7-bit
ASCII was supported.  And of course, traditional ASCII is a 7-bit code,
so "the user does not expect that this value can be negative" is a
reasonable interpretation of ASCII characters in signed bytes.  Of
course, ISO and Unicode characters do not always fit into the 7 bits of
a signed positive byte (Unicode doesn't even fit in an unsigned byte).
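
A rough sketch of that kind of table, as I remember the idea rather than
any particular compiler's source.  All the my_* names and flag bits are
invented for the example, and the upper-case loop assumes an ASCII-like
character set where 'A'..'Z' are contiguous:

```c
#include <assert.h>
#include <limits.h>

#define MY_UPPER 0x01           /* hypothetical flag bits */
#define MY_DIGIT 0x02

/* One slot for index -1 (EOF) plus one per unsigned char value:
   257 entries when bytes are 8 bits wide. */
static unsigned char my_ctype[UCHAR_MAX + 2];

void my_ctype_init(void)
{
    int c;
    for (c = 'A'; c <= 'Z'; c++)        /* assumes contiguous letters */
        my_ctype[c + 1] |= MY_UPPER;
    for (c = '0'; c <= '9'; c++)        /* digits are always contiguous */
        my_ctype[c + 1] |= MY_DIGIT;
}

/* Valid arguments are -1 (EOF) and 0..UCHAR_MAX.  A plain signed char
   holding an 8-bit character arrives as a negative value below -1 and
   would index outside the table, hence the assert. */
int my_isupper(int c)
{
    assert(c >= -1 && c <= UCHAR_MAX);
    return (my_ctype + 1)[c] & MY_UPPER;
}

int my_isdigit(int c)
{
    assert(c >= -1 && c <= UCHAR_MAX);
    return (my_ctype + 1)[c] & MY_DIGIT;
}
```

Call my_ctype_init() once before the lookups.  A caller holding a plain
char would write my_isupper((unsigned char)ch) to stay inside the table.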

> > - If the user want his program to behave in an implementation specific
> > way he can always specify "-funsigned-char" or "-fsigned-char" at the
> > command line.
> 
> Or even edit their lib/specs to make it the default.
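
Code that should not care which of the two switches is in effect can
cast before subscripting.  A minimal sketch of a translation table done
that way; the table and function names are hypothetical, and the
lower-case loop assumes 'a'..'z' are contiguous as in ASCII:

```c
#include <assert.h>
#include <limits.h>

/* One entry per possible byte value. */
static unsigned char to_upper_table[UCHAR_MAX + 1];

void build_table(void)
{
    int i;
    assert('z' - 'a' == 25);    /* ASCII-like layout assumed */
    for (i = 0; i <= UCHAR_MAX; i++)
        to_upper_table[i] = (unsigned char)i;   /* identity by default */
    for (i = 'a'; i <= 'z'; i++)
        to_upper_table[i] = (unsigned char)(i - 'a' + 'A');
}

/* The cast makes the subscript non-negative whether plain char is
   signed or unsigned, so the result is the same under either
   -fsigned-char or -funsigned-char. */
char translate(char c)
{
    return (char)to_upper_table[(unsigned char)c];
}
```

After build_table(), translate('q') gives 'Q' and bytes above 0x7F pass
through unchanged, with either default.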
