\ Mersenne Twister
( )
MODULE: MERSENNE_TWISTER
REQUIRE 'th ~ygrek/lib/neilbawd/toolbelt.f
0 [IF] =======================================================
Wil Baden 2000-05-27
Billions and Billions of Random Numbers
In 1997, Makoto Matsumoto and Takuji Nishimura developed the
Mersenne Twister. The Mersenne Twister is "designed with
consideration of the flaws of various existing generators,"
has a super-astronomical period of 2^19937 - 1, gives a
sequence that is 623-dimensionally equidistributed, and "has
passed many stringent tests, including the die-hard test of
G. Marsaglia and the load test of P. Hellekalek and S.
Wegenkittl." It is efficient in memory usage (typically
using 2506 bytes of static data, and the code is quite short
as well). It generates random numbers in batches of 624 at a
time, so the caching and pipelining of modern systems is
exploited. It is also divide- and mod-free.
And it is fast. In C, it's four times faster than the
inadequate Standard C Library function `rand`. Elko
Tchernev reports 100 million random numbers in 22 seconds
using SwiftForth.
For full information, http://www.math.keio.ac.jp/~matumoto/emt.html
It is easy to port the C code to Standard Forth.
------------------------------------------------------- [THEN]
0 [IF] =======================================================
A Forth program for MT19937: Integer version (1999/10/28)
`GENRAND` generates one pseudorandom unsigned integer (32bit)
which is uniformly distributed among 0 to 2^32-1 for each
call. `seed SGRENRAND` sets initial values in the working
area of 624 words. Before `GENRAND`, `seed SGENRAND` must be
called once. (`seed` is any 32-bit integer.)
Coded in C by Takuji Nishimura, considering the suggestions by
Topher Cooper and Marc Rieffel in July-Aug. 1997.
Converted to Standard Forth by Wil Baden, 2000-05-15.
C version copyright (C) 1997, 1999 Makoto Matsumoto and
Takuji Nishimura. When you use this, send an email to:
M. Matsumoto
with an appropriate reference to your work.
REFERENCE
M. Matsumoto and T. Nishimura,
"Mersenne Twister: A 623-Dimensionally Equidistributed Uniform
Pseudo-Random Number Generator",
ACM Transactions on Modeling and Computer Simulation,
Vol. 8, No. 1, January 1998, pp 3--30.
------------------------------------------------------- [THEN]
0 [IF] =======================================================
Needed from Tool Belt
NEEDS
'th
/NEEDS
------------------------------------------------------- [THEN]
: PRIME ( n -- flag )
DUP 1 AND 0= IF 2 = EXIT THEN
1 ( n d)
BEGIN 2 +
2DUP DUP * U< IF 2DROP TRUE EXIT THEN
2DUP MOD 0=
UNTIL 2DROP ( )
FALSE ;
\ Period parameters
624 CONSTANT MTN \ "N" has been renamed "MTN".
397 CONSTANT MTM \ "M" has been renamed "MTM".
0x09908B0DF CONSTANT Matrix-A \ Constant vector A
0x080000000 CONSTANT Upper-Mask \ Most significant bits
0x07FFFFFFF CONSTANT Lower-Mask \ Least significant bits
\ Tempering parameters
0x09D2C5680 CONSTANT Tempering-Mask-B
0x0EFC60000 CONSTANT Tempering-Mask-C
: Tempering-Shift-U 11 RSHIFT ;
: Tempering-Shift-S 7 LSHIFT ;
: Tempering-Shift-T 15 LSHIFT ;
: Tempering-Shift-L 18 RSHIFT ;
CREATE MT MTN CELLS ALLOT \ The array for the state vector.
CREATE MTI -1 ,
\ Unsigned MTI > MTN means MT[...] isn't initialized.
EXPORT
\ Initializing the array with a seed.
: SGENRAND ( seed -- )
MTN 0 DO
DUP 0x0FFFF0000 AND I 'th MT !
69069 * 1+
DUP 0x0FFFF0000 AND 16 RSHIFT I 'th MT @ OR I 'th MT !
69069 * 1+
LOOP DROP ( )
MTN MTI ! ;
0 [IF] =======================================================
Initialization by `SGENRAND` is an example. Theoretically,
there are 2^19937-1 possible states as an initial state. The
following function allows choosing any of 2^19937-1 states.
Essential bits in `seed-array[]` are the following 19937
bits: `seed-array[0]&Upper-Mask`, `seed-array[1]`, ...,
`seed-array[MTN-1]`. `seed-array[0]&Lower-Mask` is discarded.
Theoretically, `seed-array[0]&Upper-Mask`, `seed-array[1]`,
..., `seed-array[MTN-1]` can take any values except all zeros.
------------------------------------------------------- [THEN]
: LSGENRAND ( &seed-array -- )
\ The length of seed-array[] must be at least MTN cells.
MTN 0 DO
I 'th OVER @ I 'th MT !
LOOP DROP ( )
MTN MTI ! ;
: GENRAND ( -- u )
MTI @ MTN U< 0= IF \ Generate MTN words at one time.
MTI @ MTN = 0= IF \ If SGENRAND hasn't been called,
4357 SGENRAND \ a default initial seed is used.
THEN
\ {0...N-M-1}
MTN MTM - 0 DO
I 'th MT @ Upper-Mask AND
I 1+ 'th MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
I MTM + 'th MT @ XOR
I 'th MT ! ( )
LOOP
\ {N-M...N-2}
MTN 1- MTN MTM - DO
I 'th MT @ Upper-Mask AND
I 1+ 'th MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
I MTM + MTN - 'th MT @ XOR
I 'th MT ! ( )
LOOP
\ N-1, 0
MTN 1- 'th MT @ Upper-Mask AND
MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
MTM 1- 'th MT @ XOR
MTN 1- 'th MT ! ( )
0 MTI !
THEN
MTI @ 'th MT @ 1 MTI +! ( u)
DUP Tempering-Shift-U XOR
DUP Tempering-Shift-S Tempering-Mask-B AND XOR
DUP Tempering-Shift-T Tempering-Mask-C AND XOR
DUP Tempering-Shift-L XOR ;
0 [IF] =======================================================
The speed and staggering features make this a powerful
candidate for Monte Carlo simulation and cryptography.
2^19937-1 is a big number, suitable for billions and billions
of pseudo-random numbers. How big can be seen by writing it
in decimal -- 6002 digits.
------------------------------------------------------- [THEN]
\ `u GENRANDMAX` yields a uniform random integer < `u`.
: GENRANDMAX ( u -- n )
GENRAND UM* NIP ;
\ Mersenne Twister for Floating Point.
: FGENRAND ( F: -- 0. <= r <= 1. )
GENRAND 0 D>F S" 2.3283064370807974E-10 F*" EVALUATE
;
: FGENRAND-1 ( F: -- 0. <= r < 1. )
GENRAND 0 D>F S" 2.3283064365386963E-10 F*" EVALUATE
;
0 [IF] =======================================================
A _Mersenne number_ is a number of the form 2^_w_-1. A _Mersenne
prime_ is a Mersenne number that is prime. A necessary but not
sufficient condition for a Mersenne prime is that _w_ is prime.
Another condition will make it sufficient.
As of May 2000, there are 38 known Mersenne primes. 2^19937-1
is the 24th and has 6002 decimal digits. The 38th is
2^6972593-1 and I don't know how many decimal digits.
Finding Primes
\ Lukas-Lehmer Test for Mersenne Primes
: Lukas-Lehmer ( w -- flag )
DUP 2 < IF 1 = EXIT THEN
DUP 1 AND 0= IF 2 = EXIT THEN
DUP PRIME NOT IF DROP FALSE EXIT THEN
1 OVER LSHIFT 1- SWAP ( 2^w-1 w)
4 SWAP 1- 1 DO ( 2^w-1 u)
DUP UM* -2 M+ THIRD UM/MOD DROP
LOOP
NIP 0= ;
\ Mersenne primes in 32 bits.
MARKER ONCE
: RUN CR ." \ "
CR 32 2 DO
I Lukas-Lehmer IF 1 I LSHIFT 1- . THEN
LOOP ;
RUN ONCE
\ 3 7 31 127 8191 131071 524287 2147483647
2^61-1 and 2^89-1 are the next two Mersenne primes.
For cryptography, replace `SGENRAND` with a function using a
secure hash to fill MT.
`COUNTER SGENRAND` or `uCOUNTER XOR SGENRAND` is usually good
enough to initialize simple randomizing.
For best performance, use macros to define ancillary words.
: Tempering-Shift-U S" 11 RSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-S S" 7 LSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-T S" 15 LSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-L S" 18 RSHIFT " EVALUATE ; IMMEDIATE
How Mersenne Twister Works
One way to define a linear congruential series is to pick
numbers _m_ and _p_, where _p_ is a prime, and _m_ is a "generator" for
it. A generator for a prime _p_ is a number such that its
powers modulo _p_ yield all positive numbers less than _p_. Thus
you can start with any positive number less than _p_, and
continue multiplying by _m_ to get all positive numbers less
than _p_.
For example, 3 and 5 are the generators for 7.
MARKER Run-and-Forgotten
7 VALUE The-Prime
: Generator-Check ( -- )
The-Prime 1 DO CR
I BEGIN ( x)
DUP . DUP 1 >
WHILE
I * The-Prime MOD
REPEAT DROP ( )
LOOP ;
Generator-Check Run-and-Forgotten
1
2 4 1
3 2 6 4 5 1
4 2 1
5 4 6 2 3 1
6 1
In the Mersenne Twister, instead of numbers, we are working
with vectors of 19937 bits. In this, the equivalent of a
generator is a 19937 by 19937 matrix that can successively
multiply any of the vectors that is not all 0 to obtain every
vector that is not all 0. The arithmetic is done modulo 2.
Addition is _x + y mod 2_ and multiplication is _x * y mod 2_.
These are the same as _x xor y_ and _x and y_.
In the program, `Matrix-A` is a value that can be used to form
such a matrix. The once-every-624-times part of the program
takes this value and does calculations that are equivalent to
multiplying by the full matrix.
This gives 2^19937-1 combinations of 19937 bits.
The matrix has the following form, but 19937 by 19937.
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
a b c d e f
The done-every-time part twists the output to obscure the
algebraic connection between successive elements. It's
equivalent to multiplying by an invertible 19937 by 19937
matrix.
------------------------------------------------------- [THEN]
;MODULE
\EOF
: RUN 1000 0 DO I 5 MOD 0= IF CR THEN GENRAND U. CR LOOP ;
REQUIRE ms@ lib/include/facil.f
10000000 VALUE #N
: TEST
ms@ DUP SGENRAND
#N DUP 0 DO
GENRAND DROP
LOOP
SWAP ms@ - ABS / . ." pseudorandom numbers in 1 ms"
;
\ Celeron 3.2 GHz - 5 000 000 numbers/second