www.delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1994/11/02/21:35:49

Date: Thu, 3 Nov 94 04:07:26 JST
From: Stephen Turnbull <turnbull AT shako DOT sk DOT tsukuba DOT ac DOT jp>
To: ptc AT ironwood DOT cray DOT com
Cc: DJGPP AT Sun DOT SOE DOT Clarkson DOT edu, (DJGPP, mailing, list)
Subject: Re:Is it random number sequence?

I *hope* that you have a reliable source that you can cite to me and
flame me right back into the ground, because if my understanding is
correct, *you are seriously confused*.  (This of course is no biggie,
I was quite confused myself, as we'll see.)
    My source is my 90% recollection of Donald Knuth's discussion in
Seminumerical Algorithms.  An oldie but still goodie.
    The one sensible thing you wrote is "the numbers you get depend on
how good the random number generator is."  But that's the point of
the thread:  the DJGPP implementation apparently fails one reasonable
test very badly.  My point in my post was that this is a common way
for generators to fail, and that according to Knuth taking higher
order bits is a good idea.  This may have been falsified by more
recent research; a quick scan of some of the standard references (ie,
I grepped for "rand" in my DJGPP archives---we've been through this
before, as usual) didn't show it though.
    You say it's not a good idea to play with bits.  To get a narrow
range pseudo-random sequence with a long period, there is no
alternative.  (You could do something that is truly random, as far as
we know, such as use the decays in a Geiger counter.  That's not good;
we need a pseudo-random sequence for many applications.)  For example,
you suggest

	printf(" %d", (int)( 8 * (double) rand() / RAND_MAX));

Now, all three factors are converted to double, which has 55 bits of
precision on both Intel and Motorola machines, dunno Sparcs, as I
recall.  Since 8*RANDMAX has at most 19 bits, the multiplication is
effectively integer arithmentic.  I believe that all of these machines
are radix 2 floating point; that means with RAND_MAX = 65536, the
division is a 16 place right shift, except for the rounding.  Ie, in
the case of DJGPP, except for a possible rounding issue, your
algorithm can be accomplished as (dropping the printf())

        ((rand() << 3) & (7 << lg RAND_MAX)) >> lg RAND_MAX

or somewhat more efficiently

        (rand() & (7 << (lg RAND_MAX - 3))) >> (lg RAND_MAX - 3)

or really efficiently

        rand() >> (lg RAND_MAX - 3)

where 'lg' is the logarithm base 2.  (So this formula doesn't quite
work for the Sun.)  Ie, what you're really doing is taking the top
three bits of rand().
    I suggest that you stipulate (at least for the list, I'll argue it
in private if you want) that the difference between this and the case
where RAND_MAX = 32767 is too small to worry about for our purposes
(remember that the bad behavior was 0 1 2 3 0 1 2 3 ad infinitum,
apparently), and the rounding issue can be handled by an addition (I'm
pretty sure).  I do *not* mean that the low order bits will be
similarly cyclical; in fact I would guess that the problem with the
DJGPP generator is exactly that use of a nice power of two as
RAND_MAX.  Generators where RAND_MAX is odd should have better
behavior in the low order bits.  I do mean that taking the high order
bits is effectively the same as dividing by a high power of two - 1
for long periods of the sequence.
    What I had in mind but didn't think about long enough to get right
was a generalization of the middle formula

        (rand() & (7 << k)) >> k

where 0 <= k <= lg RAND_MAX - 3.  k = 0 is likely to be bad, but large
k will be acceptable for most generators actually in use.  k =
RAND_MAX - 3 is probably best, as rand is (presumably, I haven't
looked at the library source) actually the bottom 16 bits of a
sequence of 32 bit numbers, probably generated by a linear
congruential generator.  Thus, your algorithm actually amounts to
taking the middle three bits from a pseudo-random sequence of 32-bit
numbers, more or less what I suggested.
    Note that your method is computationally very expensive, as it
involves an int to double cast and a double to int cast for every
draw, as well as two floating point operations.
    --Steve

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019