[OT] Re: 0xdeadbeef vs 0xdeadbeefL

From: Gabriel Paubert (paubert_at_iram.es)
Date: 07/08/04

    Date:	Thu, 8 Jul 2004 11:32:50 +0200
    To: tom st denis <tomstdenis@yahoo.com>
    
    

    On Wed, Jul 07, 2004 at 11:53:40AM -0700, tom st denis wrote:
    > --- Christoph Hellwig <hch@infradead.org> wrote:
    > > On Wed, Jul 07, 2004 at 11:41:50AM -0700, tom st denis wrote:
    > > > Um, actually "char" like "int" and "long" in C99 is signed. So
    > > while
    > > > you can write
    > > >
    > > > signed int x = -3;
    > > >
    > > > You don't have to. in fact if you "have" to then your compiler is
    > > > broken. Now I know that GCC offers "unsigned chars" but that's an
    > > > EXTENSION not part of the actual standard.
    > >
    > > ------------------------------ snip -----------------------------
    > > [#15] The three types char, signed char, and unsigned char
    > > are collectively called the character types. The
    > > implementation shall define char to have the same range,
    > > representation, and behavior as either signed char or
    > > unsigned char.35)
    > > ------------------------------ snip -----------------------------
    >
    > Right. Didn't know that. Whoa. So in essence "char" is not a safe
    > type.

    It depends what you use it for, but it typically is not.

    The _very_ common mistake is assigning the result of fgetc/getc/getchar
    (which are defined to return either an _unsigned_ char converted to an
    int, or EOF) to a plain char and then comparing it with -1 (the usual
    value of EOF) to check for end of file:

    1) it will never detect the EOF if the char is unsigned (PPC)

    2) it will stop on a ÿ (that's a y with a diaeresis, 0xFF in Latin-1)
    on Intel, where char is signed. This character is infrequent in the
    languages I use, but it does occasionally occur.

    Of course people who only use plain 7 bit ASCII never hit the bug,
    but as soon as you go into Latin-$n encodings you may hit them (I'm
    only restricting myself to character sets based on the Latin alphabet).

    And no, the solution is not to pass -fsigned-char or -funsigned-char
    as an option to GCC. Most of the time that only changes the kind of bugs
    that are hidden in the code, and 2) above is statistically harder to
    hit than 1).
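
    To make the two failure modes concrete, here is a minimal sketch in C.
    The names count_bytes and count_bytes_signed_char are made up for
    illustration. The second function deliberately uses signed char so that
    it reproduces case 2) whatever the platform default is; it assumes the
    usual (implementation-defined) conversion of 255 to -1.

```c
#include <assert.h>
#include <stdio.h>

/* Correct: keep the result of getc() in an int, so that EOF (a negative
   int) stays distinct from the byte 0xFF (returned as 255). */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char */

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* Buggy, for comparison: the result is narrowed to signed char first.
   On common compilers the byte 0xFF becomes -1 and is mistaken for EOF.
   With unsigned char instead, the comparison would never be true. */
long count_bytes_signed_char(FILE *fp)
{
    long n = 0;
    signed char c;

    assert(fp != NULL);
    while ((c = (signed char)getc(fp)) != EOF)
        n++;
    return n;
}
```

    Given a stream holding the three bytes 'a', 0xFF, 'b', the first
    function counts all three, while the second is expected to stop at the
    0xFF byte on compilers that convert 255 to -1 (GCC does).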

    >
    > > > As for writing portable code, um, jacka#!, BitKeeper, you know,
    > > that
    > > > thingy that hosts the Linux kernel? Yeah it uses LibTomCrypt. Why
    > > not
    > > > goto http://libtomcrypt.org and find out who the author is. Oh
    > > yeah,
    > > > that would be me. Why not email Wayne Scott [who has code in
    > > > LibTomCrypt btw...] and ask him about it?

    Yes, I know, and I use BK. But given that you insult me for knowing
    the C rules better than you do, I'm seriously considering switching
    to subversion or arch instead.

    Argh, I've mentioned BK. There should be a Godwin's law equivalent
    for BitKeeper on lkml ;-)

    > > >
    > > > Who elses uses LibTomCrypt? Oh yeah, Sony, Gracenote, IBM [um Joy
    > > > Latten can chip in about that], Intel, various schools including
    > > > Harvard, Stanford, MIT, BYU, ...
    > >
    > > Tons of people use windows aswell. You just showed that you don't
    > > know
    > > C well enough, so maybe someone should better do an audit for your
    > > code ;-)
    >
    > To be honest I didn't know that above. That's why I'm always explicit.
    > [btw my code builds in MSVC, BCC and ICC as well].
    >
    > You don't need to know such details to be able to develop in C. I'm
    > sure if you walked into [say] Redhat and gave an "on the spot C quiz"
    > about obscure rules they would fail. You have to use some common sense
    > and apply the more relevant rules.

    Well, I consider the rules about plain char to be among the most
    relevant, since I've been hit by them _way_ _more_ often than by any
    other poorly known C rule.

    And finally, I'd personally prefer char to be unsigned, for several
    reasons:
    - its name, which suggests that it is an enumeration of symbols.
    - strcmp and friends do their comparisons using _unsigned_ char,
    despite the fact that the prototypes declare plain char parameters.
    - the aforementioned fgetc/getc/getchar issues.
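
    The strcmp point can be sketched as follows. Both helpers are
    hypothetical and only for illustration: cmp_as_unsigned orders bytes
    the way the standard specifies for strcmp, while cmp_as_signed shows
    what a hand-written loop over plain char would do on a platform where
    char is signed.

```c
#include <assert.h>
#include <string.h>

/* Order two strings with bytes interpreted as unsigned char, which is
   how the C standard defines the result of strcmp. */
int cmp_as_unsigned(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    assert(a != NULL && b != NULL);
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    return (*p > *q) - (*p < *q);
}

/* Same loop with bytes interpreted as signed char: bytes above 0x7F
   become negative and sort before plain ASCII. */
int cmp_as_signed(const char *a, const char *b)
{
    const signed char *p = (const signed char *)a;
    const signed char *q = (const signed char *)b;

    assert(a != NULL && b != NULL);
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    return (*p > *q) - (*p < *q);
}
```

    For the strings "\xff" and "a", cmp_as_unsigned agrees with strcmp
    (the first string sorts after the second), whereas cmp_as_signed puts
    them in the opposite order.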
      
      
    BTW, this signed/unsigned mess is a reason for some weirdness like
    tables with 384 entries in libc/ctype/ctype.h:

    /* These are defined in ctype-info.c.
       The declarations here must match those in localeinfo.h.

       In the thread-specific locale model (see `uselocale' in <locale.h>)
       we cannot use global variables for these as was done in the past.
       Instead, the following accessor functions return the address of
       each variable, which is local to the current thread if multithreaded.

       These point into arrays of 384, so they can be indexed by any `unsigned
       char' value [0,255]; by EOF (-1); or by any `signed char' value
       [-128,-1). ISO C requires that the ctype functions work for `unsigned
       char' values and for EOF; we also support negative `signed char' values
       for broken old programs.
       
     [snipped]
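
    For code that does not want to rely on that compatibility hack, the
    usual idiom is to convert to unsigned char before calling any ctype
    function. A minimal sketch, with count_alpha being a made-up helper
    name:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in s. The cast keeps the argument of
   isalpha() within the range of unsigned char, as ISO C requires, even
   when plain char is signed and the string contains bytes above 0x7F. */
size_t count_alpha(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

    In the default "C" locale only the 26 unaccented letters in each case
    count as alphabetic, so a Latin-1 byte such as 0xE9 is skipped rather
    than being used as a negative table index.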

    Not specifying the signedness of plain char is one of C's original
    mistakes, and statistically the one that affects me most often.

            Gabriel (the only good char is the unsigned char)
     

