Diehard tests
The diehard tests are a battery of statistical tests for measuring the quality of a random number generator. They were developed by George Marsaglia over several years and first published in 1995 on a CD-ROM of random numbers. In 2006, the original diehard tests were extended into the dieharder tests.
History
An initial battery of randomness tests for RNGs was suggested in the 1969 first edition of The Art of Computer Programming by Donald Knuth (Volume 2, Chapter 3.3: Statistical tests). Knuth's tests were then supplanted by George Marsaglia's Diehard tests (1995) consisting of fifteen different tests. The inability to modify the test parameters or add new tests led to the development of the TestU01 library, introduced in 2007 by Pierre L’Ecuyer and Richard Simard of the Université de Montréal.
Test overview
Test descriptions
[…] m³ / (4n). Experience shows n must be quite large, say n ≥ 2¹⁸, for comparing the results to the Poisson distribution with that mean. This test uses n = 2²⁴ and m = 2⁹, so that the underlying distribution for j is taken to be Poisson with λ = 2²⁷ / 2²⁶ = 2. A sample of 500 values of j is taken, and a chi-square goodness of fit test provides a p-value. The first test uses bits 1–24 (counting from the left) from integers in the specified file. Then the file is closed and reopened. Next, bits 2–25 are used to provide birthdays, then 3–26 and so on to bits 9–32. Each set of bits provides a p-value, and the nine p-values provide a sample for a KSTEST.

[…] (j − 141909) / 428 should be a standard normal variate (z score) that leads to a uniform [0,1) p-value. The test is repeated twenty times.

[…] (OBS − EXP)² / EXP on counts for 5- and 4-letter cell counts.

[…] (OBS − EXP)² / EXP on counts for 5- and 4-letter cell counts.

[…] (k − 3523) / 21.9 should be a standard normal variable, which, converted to a uniform variable, provides input to a KSTEST based on a sample of 10.

[…] (n² − n) / 2 pairs of points. If the points are truly independent uniform, then d², the square of the minimum distance, should be (very close to) exponentially distributed with mean 0.995. Thus 1 − exp(−d² / 0.995) should be uniform on [0,1) and a KSTEST on the resulting 100 values serves as a test of uniformity for random points in the square. Test numbers ≡ 0 (mod 5) are printed, but the KSTEST is based on the full set of 100 random choices of 8000 points in the 10000×10000 square.

[…] 1 − exp(−r³ / 30), then a KSTEST is done on the 20 p-values.

[…] ≤ 6, 7, ..., 47, ≥ 48 are used to provide a chi-square test for cell frequencies.

[…] 200000p(1 − p), with p = 244 / 495. Throws necessary to complete the game can vary from 1 to infinity, but counts for all > 21 are lumped with 21. A chi-square test is made on the number-of-throws cell counts.
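As a rough check on some of the figures quoted in these descriptions, the sketch below recomputes them exactly with Python's `fractions` module. It assumes that the first description is of the birthday spacings test (m birthdays in a year of n days) and that the last is of the craps test played under the standard rules (win on 7 or 11, lose on 2, 3 or 12, otherwise keep rolling until the point or a 7 appears). Neither those rules nor the function names `birthday_lambda`, `craps_win_probability` and `die_from_uint32` come from the text; they are illustrative assumptions.

```python
from fractions import Fraction


def birthday_lambda(m: int, n: int) -> Fraction:
    """Poisson mean m^3 / (4n) for the count j (assumed birthday spacings test)."""
    return Fraction(m ** 3, 4 * n)


def craps_win_probability() -> Fraction:
    """Exact chance of winning one game of craps (standard rules assumed)."""
    # Probability of each total with two fair dice: (6 - |total - 7|) / 36.
    prob = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}
    # Immediate win on a first roll of 7 or 11.
    win = prob[7] + prob[11]
    # Otherwise the point must come up again before a 7 does.
    for point in (4, 5, 6, 8, 9, 10):
        win += prob[point] * prob[point] / (prob[point] + prob[7])
    return win


def die_from_uint32(x: int) -> int:
    """Float a 32-bit integer to [0,1), multiply by 6, add 1 to the integer part."""
    return int(x / 2 ** 32 * 6) + 1


if __name__ == "__main__":
    print(birthday_lambda(2 ** 9, 2 ** 24))   # 2
    p = craps_win_probability()
    print(p)                                  # 244/495
    games = 200_000
    print(float(games * p), float(games * p * (1 - p)))  # mean and variance of wins
    print(die_from_uint32(0), die_from_uint32(2 ** 32 - 1))
```

Exact rational arithmetic is used so that the comparison with 2 and with 244/495 is not blurred by floating-point rounding.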
Each 32-bit integer from the test file provides the value for the throw of a die, by floating to [0,1), multiplying by 6 and taking 1 plus the integer part of the result.

Most of the tests in DIEHARD return a p-value, which should be uniform on [0,1) if the input file contains truly independent random bits. Those p-values are obtained by p = F(X), where F is the assumed distribution of the sample random variable X, often normal. But that assumed F is just an asymptotic approximation, for which the fit will be worst in the tails. Thus you should not be surprised by occasional p-values near 0 or 1, such as 0.0012 or 0.9983. When a bit stream really FAILS BIG, you will get p-values of 0 or 1 to six or more places. Nominally, a p < 0.025 or p > 0.975 means that the RNG has "failed the test at the 0.05 level". Since there are many tests, however, such p-values are not unlikely: a number of them are expected among the hundreds that DIEHARD produces, even if the random number generator is perfect.
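To illustrate why a few extreme p-values are expected even from a perfect generator, the sketch below treats the p-values as independent and exactly uniform on [0,1). That is an idealization which need not hold exactly for real DIEHARD output. Under it, the number of p-values outside [0.025, 0.975] is binomial with success probability 0.05. The figure of 200 p-values is a hypothetical stand-in for "hundreds", and the function names are illustrative.

```python
import random


def expected_extreme(n_pvalues: int, level: float = 0.05) -> float:
    """Expected number of p-values outside [level/2, 1 - level/2]."""
    return n_pvalues * level


def prob_at_least_one(n_pvalues: int, level: float = 0.05) -> float:
    """Chance that at least one of n independent uniform p-values is extreme."""
    return 1.0 - (1.0 - level) ** n_pvalues


def count_extreme(pvalues, level: float = 0.05) -> int:
    """Count the p-values that fall in either tail of width level/2."""
    lo, hi = level / 2, 1 - level / 2
    return sum(1 for p in pvalues if p < lo or p > hi)


if __name__ == "__main__":
    n = 200  # hypothetical number of p-values in one run
    print("expected extreme p-values:", expected_extreme(n))
    print("P(at least one extreme):", prob_at_least_one(n))

    # Simulate one run from an idealized perfect generator.
    rng = random.Random(12345)  # fixed seed only for reproducibility
    simulated = [rng.random() for _ in range(n)]
    print("extreme p-values in one simulated run:", count_extreme(simulated))
```

On this reading, a single p-value just outside the 0.025 to 0.975 band says little by itself, whereas p-values of 0 or 1 to six or more places would be extremely unlikely under the same assumption.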
This article is derived from Wikipedia and licensed under CC BY-SA 4.0.