A brief and understandable guide to pseudo-random number generators and specific models for security

Abstract: The generation of random sequences is the basis of simulation and can be used in many different areas such as Statistics, Computer Science, Systems Management and Control, Biology, Particle Physics, Cryptography or Cyber-Security, among others. It is crucial that the generated numbers be random or, at least, behave as such. The fundamental statistical properties required for such sequences are randomness and independence and, from a cryptographic perspective, unpredictability. There is a variety of methods to generate these sequences; the main ones are physical and arithmetic methods. In this work, a detailed study of the main arithmetic methods is carried out. In addition, the need for secure sequence generation is analyzed, and new lines of ongoing research focusing on applications in the Internet of Things and on new generator designs are described.


Introduction
Random numbers are the essential basis of simulation. In general terms, the randomness involved in a model is obtained through sequences of numbers that are intended to appear random (we will go into this aspect in more detail later), that come from a Uniform distribution in the interval (0,1), denoted U(0, 1), and that are obtained through various generators. These random numbers are then conveniently transformed to simulate the different probability distributions required in the model.
In the early days of statistical simulation studies, random number tables were used to carry out the analysis. Later this methodology changed with the rise of computers, and more efficient ways of obtaining sequences were established. Simulation is very useful not only in statistics and in applications such as re-sampling or stochastic processes (mainly due to the complexity of the problem or the impossibility of its treatment by analytical methods), but also in other disciplines where simulation studies are of vital importance, for example in the study of physical and biological processes and in the area of cryptography and lightweight cryptography, among others. It is in cryptography that the generated sequences become critically important: for example, they are used in key distribution scenarios such as Kerberos [140], temporary key generation, secure key generation, public key generation [171], encryption, etc. Depending on the field of application, the desired conditions for the sequences of numbers to be used will be different.
A fairly general definition of the term simulation is as follows: reproduction of a real problem in an environment (computer or other device) controlled by the experimenter.
Simulation will be appropriate:
• When complex models, so common today, that involve a great amount of information need to be analyzed.
• To observe irreversible alterations in an environment, which can prevent, for example, the irreversible damage caused by direct intervention in certain ecosystems.
• To guide the investigation of a phenomenon or even theoretical results by evaluating the feasibility or ineffectiveness of certain alternatives.
• To verify the importance of some variables over others in a given context.
• When possible adverse effects need to be prevented before implementing a certain policy.
• When direct experimentation in situ is costly or impossible.
• To validate certain analytical solutions and the applied methods.
• When there is no analytical procedure to address a problem or when, even if there is one, it is costly to implement.
On the other hand, simulation will not be suitable when:
• The problem has a clear and simple analytical solution, i.e. there is an analytical procedure for its resolution and it can be applied efficiently.
• The costs of the simulation exceed the possible savings that its implementation could bring.
As we mentioned beforehand, the numbers generated are intended to be random. In this respect, we can find in the literature a widespread classification of the different types of random numbers (see for example [175] or [194]): the so-called "true" random numbers, pseudo-random numbers and quasi-random numbers. True random numbers are sequences generated from phenomena with intrinsic randomness. They do not need any initial seed to be obtained and are expected to show no patterns or correlations between values. Their main drawback is that they are very costly to generate and, in many cases, time and/or hardware resources are limited. Therefore, starting from a certain amount of random information, one seeks to extend it and generate very long sequences of "random" numbers in some alternative way. This is how the so-called pseudo-random numbers arise. They are generated by devices or algorithms that, given an input called the seed, produce long sequences of random-looking numbers. Because of the way they are generated, they are reproducible. There is also a third category, quasi-random numbers. They are not designed to appear random, but to be uniformly distributed. One of the objectives of these numbers is to reduce and control errors in Monte Carlo simulations (for more details see [168]). In this case, as the dimension increases, statistical problems arise when verifying the goodness of fit of the obtained sequences by hypothesis testing [119]. Table 1 describes the principal characteristics of these types.
There are two main types of generators (a classification based on their properties, architecture and type of implementation) that usually appear in the literature (see for example [183]):
• Physical generators (true random number generators, TRNGs): these are physical devices that use external sources to generate random numbers (hardware or natural phenomena). The most commonly used are based on electrical circuits equipped with a noise source (often a resistor or a semiconductor diode) that is amplified, sampled and compared with a reference signal to produce bit streams. These random bits are joined together to form bytes, integers or real numbers as required. The source must be chosen carefully to ensure effective randomness (e.g., a pulse source that has some kind of periodic pattern should not be used). All of them have the fundamental characteristic that they produce unrepeatable sequences. The advantages and disadvantages of TRNGs can be seen in Table 2. The output sequences of TRNGs can be used directly as random sequences or can be used as input to a pseudo-random number generator.
• Arithmetic generators (pseudo-random number generators, PRNGs): these are deterministic algorithms that are executed on computers. There are two main sub-types, linear and non-linear. The generated sequences are periodic (the length of the period depends on the type of generator and on the selection of the parameters involved in the equation or equations of the generator). In this work we will focus on such generators.
Table 3 shows the principal differences between TRNGs and PRNGs.

140 E. Almaraz Luengo

Table 1 Types of random numbers.
True random numbers:
• based on physical sources
• do not need an initial sequence (seed)
• they are expected to have no periodic pattern or correlation between the obtained data
Pseudo-random numbers:
• they are generated in a deterministic way, that is, they are constructed through generation algorithms and an initial seed
• they have the appearance of being values that come from independent realizations of a U(0, 1) random variable (r.v.)
• the way they are generated makes them reproducible
Quasi-random numbers:
• are obtained through specific algorithms
• the obtained sequences are distributed uniformly in the unit square or in the unit cube
• their main disadvantage is that as the dimension increases there are no specific hypothesis tests to assess the goodness of the obtained sequences [119]

Table 2 Pros and cons of the use of a TRNG.
Pros:
• Once a sequence has been generated, it can never be recreated by anyone else, even if they have the same device that first created it.
• This feature is very convenient, for example, to protect the secrecy of communications.
• Once the sequence has been generated it is stored and distributed only to two correspondents, who can use it to communicate secretly using stream ciphering.
Cons:
• The difficulty of distributing the encryption key to the two correspondents using a secure channel.
• The necessity to have a physical element that is more or less bulky, expensive and difficult to produce.
• It is very difficult to design a device or a program that produces a bit-stream free of biases and correlations.
• In statistical simulation: non-reproducibility.

The generation of random (or pseudo-random) numbers also makes it possible to generate values of other random variables by means of certain transformations. In fact, if U is a U(0, 1) random variable and F(x) is a distribution function, then $X = F^{-1}(U)$ is a random variable with distribution function F(x), being

$$F^{-1}(u) = \inf\{x : F(x) \ge u\}, \quad 0 < u < 1.$$

Indeed, if we denote by $F_X(x)$ the distribution function of X, then by the definition of $F^{-1}$,

$$F_X(x) = P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)),$$

and as U is a uniform random variable in the interval (0, 1) and F(x) ∈ (0, 1) we obtain that $F_X(x) = F(x)$, as we wanted to prove. So if we can simulate a U(0, 1) random variable it is possible (at least from a theoretical point of view) to simulate any other random variable.
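The transformation of a U(0, 1) variable through the inverse of a distribution function can be illustrated with a minimal sketch. It assumes the exponential distribution, F(x) = 1 − exp(−rate · x), as the target, and uses Python's built-in generator as a stand-in uniform source; the function names, the rate and the seed are hypothetical choices made only for illustration.

```python
import math
import random


def exponential_inverse_cdf(u: float, rate: float) -> float:
    """Inverse of F(x) = 1 - exp(-rate * x), valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


def sample_exponential(n: int, rate: float, seed: int = 12345) -> list[float]:
    """Draw n exponential variates as X = F^{-1}(U), with U ~ U(0, 1)."""
    rng = random.Random(seed)  # stand-in uniform source (assumption)
    return [exponential_inverse_cdf(rng.random(), rate) for _ in range(n)]


samples = sample_exponential(100_000, rate=2.0)
sample_mean = sum(samples) / len(samples)  # should be close to 1 / rate = 0.5
```

The same pattern applies to any distribution whose inverse distribution function can be evaluated; when it cannot, other transformation methods are needed.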
It is also possible to obtain sequences of independent and identically distributed (i.i.d.) random variables by using sequences of i.i.d. random bits; see for example Chapter XV of [40]. Thus, as before, the problem of obtaining random sequences of numbers can be reduced to that of obtaining sequences of random bits.
The importance and necessity of working with the concept of randomness, random and pseudo-random numbers is evident in many areas of knowledge and has been for a long time. From a theoretical/conceptual point of view regarding random and pseudo-random numbers, we could highlight different investigations. For example, the works of [31], [93], [96], [128] and [65] on the concept of randomness, random sequences and their construction are noteworthy. Papers [77] and [195] on the generation of equidistributed or uniform numbers or papers [79] and [129] on the generation of random numbers and algorithms are also worth mentioning. Finally, as a sample, we could point out the works on the generation of pseudo-random numbers, their concept and analysis [84], [37], [198], [139], [177], [158], [43] and [16] among others.
It is also possible to find a large number of papers devoted to particular methods of generating random and pseudo-random sequences; these will be discussed in the following sections of this paper, with special attention to secure random numbers. Finally, the literature contains numerous works on statistical hypothesis tests that can be used to check the properties of the sequences obtained with different procedures. In this sense, [4] describes the existing statistical batteries, the definitions of the tests involved and the software that can be used to perform the verifications.
The principal types of studies in the context of randomness are:
• Theoretical/conceptual perspective on random and pseudo-random numbers.
• Tables of random numbers, their analysis and extraction methods.
• Methods of generation of random and pseudo-random sequences.
• Methods for testing random numbers.
• Applications of random numbers and security.
The aim of this work is to:
• Clarify the concept of randomness and the need to work with random sequences in different contexts. It will clearly distinguish what requirements are placed on the sequences generated depending on the context in which they are used, whether from a purely statistical point of view with applications in simulation and computation or from a cryptographic point of view.
• Describe in detail the most important models of PRNGs, emphasizing their properties and pointing out the context in which they are the most suitable for use.
• Point out the strengths and weaknesses of the generators described, and describe new designs and current lines of research.
This paper is organized as follows: Section 2 gives a historical development of random and pseudo-random number generation emphasizing the fact that arithmetic generators will be analyzed, Section 3 explains the different types of pseudo-random number generators (classical generation methods not currently used, linear and non-linear congruential methods) together with their fundamental properties. Section 4 describes the new designs of PRNGs and new lines of investigation and finally Section 5 gives the main results of this research.

Historical development
When working with a simulation, cryptography or system security problem (among others) it is necessary to include a source of randomness. During the first half of the 20th century, various physical procedures were used, such as coin tosses, dice, experiments with cards or urns, mechanical procedures such as spinners to extract numbers (or, in general, samples) randomly, or electrical circuits based on vacuum tubes with random pulses. During the second half of the 20th century a large number of works appeared proposing physical generators of random numbers. In some cases the random numbers were published in table form. One of the first papers in this line can be found in [196]. Tippett proposed a table of non-uniform random numbers, consisting of 41600 digits arranged in 10400 four-digit numbers. The table proposed by Tippett has been analyzed in different works such as [211], [42], [64] and more recently in [179]. Other well-known tables in the literature are, for example, those proposed by Fisher and Yates in 1938 [56], analyzed in [178] or in [180] among others, and (see Table 4) the table of the Rand Corporation [32], analyzed for example in [182]. One of the advantages of Rand Corporation's table compared to those existing up to then was precisely its large size. The digits in the table were obtained from an electrical circuit whose operation was similar to that of a roulette wheel. The circuit consisted of a random frequency pulse source providing an average of 100000 pulses per second. Once per second this train of randomly spaced pulses was connected to a 5-bit counter. The counting was done during a constant time interval of less than one second. Once the counter had reached its maximum value (31), the next pulse would take it to the value 0, then 1 and so on. This cyclic behavior is similar to that of a 32-number roulette wheel that spins repeatedly through all the numbers until it stops at one of them.
The time during which the pulses were counted was calculated so that, on average, the counter completed approximately 3000 cycles. Once the pulse counting time interval was over, the number stored in the counter (0 ≤ n ≤ 31) was converted to decimal. When n ≥ 20 it was discarded; if n < 20, the least significant digit of the number was the random digit, so that each decimal digit corresponds to exactly two of the retained counter values. This digit was stored on a punched card. The circuits had to be adjusted several times until sequences with suitable statistical properties were obtained. Figure 1 shows the random digit generation using the RAND Corporation method.
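The counter-and-discard scheme just described can be mimicked in software with a rough sketch. Several things here are assumptions rather than details of the original circuit: the pulse count is approximated by a rounded normal variate with mean 96000 (about 3000 turns of a 32-state counter), the sketch keeps the 20 counter values 0 to 19 so that every digit corresponds to exactly two values, and Python's built-in generator stands in for the physical noise source. The names `rand_style_digit` and `mean_pulses` are hypothetical.

```python
import math
import random


def rand_style_digit(rng: random.Random, mean_pulses: float = 96_000.0) -> int:
    """Return one decimal digit using a counter-and-discard scheme.

    mean_pulses is a hypothetical parameter (about 3000 turns of a
    32-state counter); the pulse count is approximated by a rounded
    normal variate, which is a modelling assumption.
    """
    while True:
        pulses = max(0, round(rng.gauss(mean_pulses, math.sqrt(mean_pulses))))
        n = pulses % 32          # state of the 5-bit counter
        if n < 20:               # keep 20 of the 32 states, discard the rest
            return n % 10        # least significant decimal digit


rng = random.Random(2024)        # stand-in for the physical pulse source
digits = [rand_style_digit(rng) for _ in range(10_000)]
counts = [digits.count(d) for d in range(10)]
```

Because the spread of the pulse count is much larger than 32, the counter state is close to uniform over its 32 values, which is what makes the retained digits approximately equiprobable.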
Another physical generator of random numbers, in this particular case of random (independent) bits, is due to Vincent in 1970 [199]. The method he proposes consists of counting the number of randomly generated pulses during a certain time interval. If the sum is even, the value 1 is selected and if it is odd, the value 0 is selected. Pulses can be obtained in various ways such as measurement through a radioactive source detector or through a circuit that detects peaks above a certain preset threshold connected to the output of a noise source.
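Vincent's generator satisfies that the number of pulses k counted in a time interval of length s follows a Poisson distribution with parameter λs:

$$P(k) = e^{-\lambda s}\,\frac{(\lambda s)^k}{k!}, \quad k = 0, 1, 2, \ldots$$

The difference between the probability that the number of pulses is even and the probability that the number of pulses is odd is equal to:

$$\sum_{k=0}^{\infty} (-1)^k e^{-\lambda s}\,\frac{(\lambda s)^k}{k!} = e^{-2\lambda s}.$$

Since $e^{-2\lambda s} < 10^{-13}$ for a value of λs as small as 15, the probability of getting a 0 is practically the same as that of getting a 1, which makes Vincent's method very satisfactory.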
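The size of the parity bias can be checked numerically. The sketch below assumes that the pulse count is Poisson distributed and sums the probabilities of even and odd counts directly; the function name, the number of terms and the variable `mean` (which plays the role of λs) are hypothetical choices for illustration.

```python
import math


def parity_probabilities(mean: float, terms: int = 400) -> tuple[float, float]:
    """Return (P(even), P(odd)) for a Poisson count with the given mean."""
    p_even = p_odd = 0.0
    term = math.exp(-mean)          # P(k = 0)
    for k in range(terms):
        if k % 2 == 0:
            p_even += term
        else:
            p_odd += term
        term *= mean / (k + 1)      # P(k + 1) obtained from P(k)
    return p_even, p_odd


mean = 15.0                                  # plays the role of lambda * s
p_even, p_odd = parity_probabilities(mean)
bias_closed_form = math.exp(-2.0 * mean)     # e^{-2 lambda s}, about 9.4e-14
```

For a mean of 15 both probabilities agree with 1/2 to roughly thirteen decimal places, so the bias is far below what any practical statistical test could detect.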
Other noise sources that have been used for physical random number generators include a biased Zener diode operating at the elbow of its characteristic curve, the decay of radioactive sources, radio signals received when tuning a radio receiver to a frequency at which no station is transmitting, etc.
The interest in building powerful physical random number generators that produce appropriate sequences is still strong at present. In fact, the sequences obtained with this type of generators are unpredictable and aperiodic. Given the characteristics of the sequences obtained with this type of generator, its application is of special interest in areas such as cryptography (data generation, key encryption, etc.). However, the application of physical generators in simulation studies has some disadvantages such as the necessity of high capacity memories or the slowness of the procedure. This is why other techniques have been developed to help overcome these undesirable effects. In fact, since the middle of the 20th century, and in parallel with the development of tables, other types of arithmetic (deterministic) generators were introduced, which have the advantage of being sequential and faster procedures.

Pseudo-random number generators
The most suitable and reliable method of generating random numbers is to use deterministic algorithms that have some solid mathematical basis. These algorithms produce a sequence of numbers that resembles that of a sequence of realizations of independent and identically distributed random variables according to a U (0, 1) random variable, although it is not really so. That is why such numbers are called pseudo-random and the algorithm that produces them is called pseudo-random number generator.
A good pseudo-random number generator should have some important properties that are described in Table 5.
Most arithmetic generators are usually very fast and require little data storage capacity. However, there are certain generators that do not satisfy the properties of uniformity and independence of the obtained sequences.

Table 5 Properties of a good PRNG.
P1 The sequence of values it provides should resemble a sequence of independent realizations of a U(0, 1) r.v.
P2 From the point of view of statistical simulation, the results must be reproducible.¹ From the point of view of cryptographic applications, the sequences must be unpredictable.²
P3 The sequence of generated values should have a non-repetitive cycle as long as possible.
P4 The generator must be fast and must occupy a small amount of internal memory.
P5 It is desirable that the generator be portable.
¹ That is, starting with the same initial conditions, the same sequence must be obtained. This makes it possible to debug possible failures of the model or to simulate different alternatives of the model under the same conditions.
² We are emphasizing the necessary security of the systems that make use of the generated sequences. See for example [186], [15] or [68]. In [19] and in [99] the problem of predicting the output of a pseudo-random number generator is considered.

There are several hypothesis tests that can be used to verify the statistical properties of randomness (autocorrelation test, runs test, etc.) and uniformity (for example, the Chi-Square goodness-of-fit test or the Kolmogorov-Smirnov test). In the literature we can find different sets or groupings of these tests that are called test batteries or test suites. Among the best known are NIST SP 800-22 [176], Diehard [127], Dieharder [21], ENT [201] and TestU01 [113], among others. For a detailed analysis of the most popular ones see [4].
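As an illustration of the uniformity tests mentioned above, the following minimal sketch computes the Pearson Chi-Square statistic for the hypothesis that a sample comes from U(0, 1). It is not taken from any of the cited batteries; the number of bins, the seed and the use of Python's built-in generator as the generator under test are assumptions made for the example.

```python
import random


def chi_square_uniform(samples: list[float], bins: int = 10) -> float:
    """Pearson chi-square statistic for H0: samples come from U(0, 1)."""
    observed = [0] * bins
    for u in samples:
        index = min(int(u * bins), bins - 1)   # guard against u == 1.0
        observed[index] += 1
    expected = len(samples) / bins
    return sum((o - expected) ** 2 / expected for o in observed)


rng = random.Random(7)                          # stand-in generator (assumption)
statistic = chi_square_uniform([rng.random() for _ in range(10_000)])
# With 10 bins there are 9 degrees of freedom; the 5% critical value is 16.919.
uniformity_not_rejected = statistic < 16.919
```

A single test of this kind only checks one aspect of the sequence, which is why the batteries cited above combine many different tests.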
In the case of cryptographic applications, it is also necessary that the sequence generators be able to withstand severe attacks, even if part of their initial or current state is available to an attacker. We will discuss later what additional conditions are required in this case when discussing Cryptographically Secure Pseudo-random Number Generators.
There are several well-known books in which the topic of random and pseudorandom numbers is treated, for example Chapter 3 of [97], Chapter 1 of [67], Chapter 7 of [6], Chapter 3 of [109], Chapter 7 of [101] among others as well as many papers that will be discussed in next sections.
Below we will describe the best known generators in the literature.

The middle-square method
This method is due to von Neumann. It was originally presented by the author in 1949 at a conference and published in 1951 [200]. It is fundamentally only of historical interest, since it is not currently used due to the weaknesses of the sequences obtained. These have very short periods: if the procedure is repeated a sufficient number of times, the method will generate the same number repeatedly, or return to a number previously obtained in the sequence and cycle indefinitely, or even degenerate to 0.
The algorithm consists of the following steps:
• Take a positive integer $x_0$ with 2n digits.
• Square it to obtain a 4n-digit number (if necessary, the number is padded with zeros on the left).
• Extract the middle 2n digits of the resulting number; let $x_1$ be this number (it will be considered the random number).
• Use that number as the seed for the next iteration.
The pseudo-random numbers are obtained by dividing the terms of the sequence by $10^{2n}$. The disadvantage of this procedure is that the generated numbers can be repeated cyclically after a short cycle and can even degenerate to 0 very fast. Another important drawback is that the sequence obtained is not "random", because it can be predicted directly from the seed.
This type of generator has been discarded today and more sophisticated systems are used.
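On the other hand, we have noted that this method can degenerate to 0 very fast (see [35]). For example, if we take $x_0 = 1009$ (so that 2n = 4), the obtained sequence is: 0180, 0324, 1049, 1004, 0080, 0064, 0040, 0016, 0002, 0000, and from then on it remains at 0.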
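The steps of the middle-square method can be sketched as follows. The function name and its parameters are hypothetical; `half_digits` plays the role of n, so the state has 2n digits and its square is treated as a zero-padded number of 4n digits.

```python
def middle_square(seed: int, half_digits: int, count: int) -> list[int]:
    """Return `count` iterates of the middle-square method.

    The state has 2 * half_digits digits; its square is treated as a
    zero-padded number of 4 * half_digits digits, of which the middle
    2 * half_digits are kept.
    """
    width = 2 * half_digits
    values = []
    x = seed
    for _ in range(count):
        # Drop the lowest n digits, then keep the next 2n digits.
        x = (x * x // 10 ** half_digits) % 10 ** width
        values.append(x)
    return values


iterates = middle_square(1009, half_digits=2, count=10)
uniforms = [x / 10 ** 4 for x in iterates]   # divide by 10^(2n)
```

Starting from the seed 1009, the iterates are 180, 324, 1049, 1004, 80, 64, 40, 16, 2 and 0, after which the sequence stays at 0.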

Lehmer's method
This method consists of the following steps:
• An integer $x_0$ of n digits is taken as a seed.
• Another integer, c, of k digits is taken. Usually k < n.
• Calculate $x_0 \cdot c$, a number of at most n + k digits.
• The k digits on the left are separated from $x_0 \cdot c$, and the number they form is subtracted from the number formed by the remaining n digits, resulting in $x_1$.
• This process is repeated as many times as necessary.
The sequence of pseudo-random numbers is obtained as follows:

$$u_i = \frac{x_i}{10^n}, \quad i = 1, 2, \ldots$$

This method has some drawbacks, the most noteworthy being:
• Possible appearance of negative iterants.
• Also short cycles often occur (in particular, zero is an absorbing value of this generator).
Due to these weaknesses, this type of generator is not used in practice and other methods have been designed to avoid these problems.
Example 3 Let us select the following values: $x_0 = 2000$ and $c = 50$ (so n = 4 and k = 2). Then $x_0 c = 100000$, so $x_1 = 0000 - 10 = -10 < 0$, which is not admissible. This is a simple example in which a negative iterant appears.
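A minimal sketch of one step of this method is given below, following the convention of Example 3, in which the number formed by the left digits is subtracted from the number formed by the remaining n digits. The function names are hypothetical, and the sequence helper stops as soon as a negative iterant appears, which is a choice made for this illustration.

```python
def lehmer_step(x: int, c: int, n: int) -> int:
    """One step: (lowest n digits of x * c) minus (the digits to their left)."""
    product = x * c
    left, right = divmod(product, 10 ** n)
    return right - left


def lehmer_sequence(seed: int, c: int, n: int, count: int) -> list[int]:
    """Iterate the step up to `count` times, stopping at a negative iterant."""
    values, x = [], seed
    for _ in range(count):
        x = lehmer_step(x, c, n)
        values.append(x)
        if x < 0:                # the method is not defined beyond this point
            break
    return values


example_3 = lehmer_step(2000, 50, 4)   # 0000 - 10 = -10
```

The sketch also makes the absorbing behaviour of zero visible: once an iterant equals 0, every later iterant is 0 as well.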

Linear congruential methods
These methods are due to Lehmer (1951) [115]. The methods of generating pseudo-random numbers called congruentials are based on the mathematical concept of congruent numbers.
Two numbers a and b are said to be congruent modulo m if m divides a − b; this is represented by a ≡ b mod m. Congruence is an equivalence relation. Indeed:
• Reflexivity: a ≡ a mod m, since m divides a − a = 0.
• Symmetry: if a ≡ b mod m then b ≡ a mod m, since m divides a − b if and only if it divides b − a.
• Transitivity: if a ≡ b mod m and b ≡ c mod m then a ≡ c mod m, since a − c = (a − b) + (b − c).
Although these congruential generators have limited capacity to produce very long number sequences that can pass as sequences of independent values from a U(0, 1) random variable, they are considered a basic element in other more efficient generators.
Among this type of generators the most popular are the linear congruential methods, but non-linear methods can also be found in the literature. From the cryptographic point of view, linear congruential generators are easily predictable (see [166]): it is possible to obtain the values of the parameters in polynomial time given a sufficiently long string of generated numbers. Therefore, this type of generator is not recommended for cryptographic applications [156]. However, it can be used as an intermediate element in more complex generator designs, which may allow its use in cryptography, as we will see later.
Within this type of methods we can distinguish the multiplicative congruential methods and mixed congruential methods.

Multiplicative congruential method
This method starts with an initial positive value $x_0$ called the seed and two positive integer values a (multiplier) and m (modulus) with 0 < a < m and $x_0 < m$. The following numbers of the sequence are then obtained by means of the expression:

$$x_{n+1} \equiv a x_n \bmod m, \quad n \ge 0. \qquad (4)$$

Each $x_n \in \{0, 1, \ldots, m - 1\}$, and $u_n = x_n / m$ is called a pseudo-random number, which is taken as a value of a U(0, 1) random variable.
The expression (4) for the integers is equivalent to:

$$x_{n+1} = a x_n - m \left\lfloor \frac{a x_n}{m} \right\rfloor.$$

There are many papers in the literature that study these generators and their principal properties: see for example [45], [59] or [76] among others.
The performance of a multiplicative congruential generator depends on the selection of its parameters. Generally a and m are chosen in such a way that the following conditions (conditions 3.3.1) are satisfied:
• For any initial seed, the resulting sequence will appear to be a sequence of i.i.d. values according to a U(0, 1) random variable (that is, the sequence passes the goodness-of-fit test when the theoretical distribution is U(0, 1), and the values pass the independence test(s)).
• The sequence must have a long period.
• The values must be obtained in an efficient way from a computational point of view.
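A multiplicative congruential generator can be sketched in a few lines. The default parameters below, a = 16807 and m = 2^31 − 1, are values that appear elsewhere in this section and are used here only as an illustrative choice; the function name and the generator-based interface are hypothetical.

```python
from typing import Iterator


def multiplicative_congruential(seed: int, a: int = 16807,
                                m: int = 2**31 - 1) -> Iterator[tuple[int, float]]:
    """Yield (x_n, u_n) with x_n = a * x_{n-1} mod m and u_n = x_n / m."""
    if not 0 < seed < m:
        raise ValueError("seed must satisfy 0 < seed < m")
    x = seed
    while True:
        x = (a * x) % m
        yield x, x / m


gen = multiplicative_congruential(seed=1)
first_states = [next(gen)[0] for _ in range(3)]   # starts 16807, 282475249, ...
```

Python integers do not overflow, so the product a · x can be formed directly; in fixed-width arithmetic the same recurrence needs a technique that avoids overflow, which is one reason the efficiency condition above matters.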
Some well-known selections of the parameters are described in Table 6.
As can be seen in Table 6, there is a lot of research on the possible values of the parameters. Another interesting reference in this context is [107], in which several tables of parameters (depending on the size) with good performance with respect to the spectral test are given.

Table 6 Selections of the parameters a (multiplier) and m (modulus).
• a = λ, m = $2^q$ or $10^q$ [33]. For a q-place binary or decimal machine. In this case λ ∈ Z is prime to the modulus.
• a = 5, m = $2^{35}$ [11]. An algorithm is proposed for generating pseudo-random numbers with these parameters.
• a = $a_p$, m = p [83]. The modulus is the largest prime p within accumulator capacity and $a_p$ is a primitive root of p.
• a, m = $2^k$ [7] (also see [80] and [100]). The modulus is selected in this way because of the binary base of most digital computers.
• a, m = $p_m$. For example [83] or [172]. $p_m$ is the largest prime number that fits the computer word size.
• a = a*, m = m* [191]. m* and a* are the values that maximize the period and minimize the correlation of the generated sequence.
• $a_1$, $a_2 = 63036016$. The case with $a_1$ is faster and has less risk of memory overflow; it was proposed in [117] and is widely used. The case with $a_2$ has better statistical properties but is computationally more problematic, especially regarding memory overflow. A selection that is used occasionally is the Mersenne prime $2^{61} - 1$.
• [160]. Used by some FORTRAN implementations.
• a, m = p [187]. The modulus is a prime number.
• m = $2^{32}$. These selections are very popular and were recommended by Marsaglia in 1972. Used in RN32 [85].
• [157]. The authors make some comments about this selection instead of a = 16807.
• a = $2^{q_1} \pm 2^{q_2}$, m = $2^p - 1$ [208]. The author states that the proposed generators with these values for m and a are very fast and pass several statistical tests.
• a = $m - 2^{q_1} \pm 2^{q_2}$, m = $2^p - 1$. Paper [208] was revised in [112], where the weakness of the generator proposed in [208] is shown: if $a = \pm 2^{q_1} \pm 2^{q_2}$, the numbers of ones in the binary representations of $x_{n-1}$ and $x_n$ (their Hamming weights) turn out to be very dependent and, therefore, this dependence extends to the bits of the generated $u_n$. The sequences generated according to [208] do not pass the Hamming weights test.

Another important aspect is to know the period of the generated sequence. In regard to this point, the following result is relevant: the multiplicative congruential generator attains its maximum period, m − 1, when:
• m is a prime number.
• $a \not\equiv 0 \bmod m$ and $a^{(m-1)/q} \not\equiv 1 \bmod m$ for each prime factor q of m − 1.
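For a prime modulus, the condition that a is not congruent to 0 and that a^((m−1)/q) is not congruent to 1 modulo m for each prime factor q of m − 1 can be checked mechanically. The sketch below does so and compares the outcome with the period found by direct iteration for a small prime; it assumes that m is prime (this is not verified by the code), and the function names are hypothetical.

```python
def prime_factors(n: int) -> set[int]:
    """Distinct prime factors of n by trial division (fine for small n)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def has_maximum_period(a: int, m: int) -> bool:
    """Check the conditions on the multiplier a; m is assumed to be prime."""
    if a % m == 0:
        return False
    return all(pow(a, (m - 1) // q, m) != 1 for q in prime_factors(m - 1))


def period(a: int, m: int, seed: int = 1) -> int:
    """Cycle length of x -> a * x mod m from seed (a must be invertible mod m)."""
    x, steps = (a * seed) % m, 1
    while x != seed:
        x = (a * x) % m
        steps += 1
    return steps


small_cases = {a: (has_maximum_period(a, 31), period(a, 31)) for a in (2, 3)}
```

With m = 31, the multiplier 3 satisfies the condition and gives period 30, whereas the multiplier 2 fails it and gives period 5.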
In [97] another result related with the period of a multiplicative congruential generator for the case of m = 2 β for any positive integer β, is explained:

Theorem 2 (Knuth [97])
The multiplicative congruential method $x_{n+1} \equiv a x_n \bmod 2^{\beta}$ has maximum period $m/4 = 2^{\beta - 2}$, attained if and only if $x_0$ is odd and a ≡ 3 mod 8 or a ≡ 5 mod 8.
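The statement of Theorem 2 can be verified by brute force for a small modulus. The sketch below takes β = 6 (an arbitrary choice for illustration) and computes the cycle length for every odd multiplier below 16 with seed 1; the function name is hypothetical.

```python
def cycle_length(a: int, m: int, seed: int) -> int:
    """Length of the cycle eventually entered by x -> a * x mod m."""
    first_seen = {}
    x, step = seed, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x) % m
        step += 1
    return step - first_seen[x]


beta = 6
m = 2 ** beta                       # m = 64, so m / 4 = 16
periods = {a: cycle_length(a, m, seed=1) for a in range(1, 16, 2)}
```

In this small case the period 16 = m/4 is reached exactly for the multipliers congruent to 3 or 5 modulo 8, and an even seed such as 2 gives a shorter cycle.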
In practice, a period of at least $10^9$ is recommended, so the modulus would need to be at least $10^9$. However, this figure is still not adequate, since at the speed of current computers such a period can be exhausted very quickly.
Some references about the period of these types of generators are [17], where the properties of the period in a multiplicative congruential generator with $a^{\delta} \equiv 1 \bmod n$ are studied, and [23], where the properties of the period of the generator with $a^{\theta(m)} \equiv 1 \bmod m$, with a and m relatively prime, are studied.

Mixed congruential method
The expression in this case is more general:

$$x_{n+1} \equiv (a x_n + b) \bmod m, \quad n \ge 0, \qquad (7)$$

where a, m ∈ Z⁺ with 0 < a < m, 0 ≤ b < m and $x_0 < m$. Here b is called the increment. Note that, as in the previous situation (multiplicative congruential method), the numbers obtained with expression (7) are completely determined by m, a, b and $x_0$. In fact, it is verified that (for a > 1):

$$x_n \equiv \left( a^n x_0 + b\,\frac{a^n - 1}{a - 1} \right) \bmod m,$$

but with an appropriate selection of these parameters the sequence can be considered to be formed by realizations of a U(0, 1) random variable (that is, the generated sequence passes the goodness-of-fit test when the theoretical distribution is U(0, 1)).

Table 7 Selections of the parameters a, m and b.
• [72], Section 2: selection of a, b and m in order to get maximum period.
• [3]: several cases for different selections of the parameters are studied.
• [82]: selection of the parameters when working on a binary machine.

As in the previous case, it is important to know the period of the generated sequence. In this regard, we highlight the following result:
Theorem 4 The mixed congruential generator (7) has full period m, for any seed $x_0$, if and only if:
• b and m are relatively prime;
• a − 1 is a multiple of every prime q that divides m;
• a − 1 is a multiple of 4 if m is a multiple of 4.
Example 5 Let us consider the following mixed congruential method:

$$x_{n+1} \equiv 5 x_n + 7 \bmod 200 \qquad (9)$$

with seed $x_0 = 3$. This generator has period 8, but the maximum period is not reached because $m = 200 = 2^3 \cdot 5^2$, so q = 2, 5, and a − 1 = 4 is not a multiple of 5.
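The period in Example 5 can be checked by direct iteration. The sketch below also tests the classical full-period conditions for a mixed congruential generator (b relatively prime to m, a − 1 divisible by every prime factor of m, and a − 1 divisible by 4 when m is); taking these as the conditions the text refers to is an assumption of this illustration, as are the function names and the second parameter set (a = 21, b = 7).

```python
from math import gcd


def prime_factors(n: int) -> set[int]:
    """Distinct prime factors of n by trial division (fine for small n)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def satisfies_full_period_conditions(a: int, b: int, m: int) -> bool:
    """Classical conditions for x -> (a * x + b) mod m to have period m."""
    if gcd(b, m) != 1:
        return False
    if any((a - 1) % q != 0 for q in prime_factors(m)):
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False
    return True


def cycle_length(a: int, b: int, m: int, seed: int) -> int:
    """Length of the cycle eventually entered by x -> (a * x + b) mod m."""
    first_seen = {}
    x, step = seed, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + b) % m
        step += 1
    return step - first_seen[x]


example_5_period = cycle_length(5, 7, 200, seed=3)      # cycle of length 8
full_period = cycle_length(21, 7, 200, seed=3)          # hypothetical a = 21
```

With a = 21 the difference a − 1 = 20 is a multiple of 2, 5 and 4, so the conditions hold and the generator runs through all 200 residues before repeating.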
It has been proved that a good selection for b is b ≡ 5 mod 8 when working in the binary system, or b ≡ 21 mod 200 when working in the decimal system. More specifically, b must be an odd integer and relatively prime to m.
As for the selection of the seed, this is irrelevant (in general, see Theorem 4) in the sense that the value of this parameter does not seem to affect the statistical properties of the generated sequences.
There are several works that analyze parameters in a mixed (linear) congruential generator. Some of the most well known values are given in Table 7.
Other well-known references that analyze this issue are the following. In [142] a method is given to search systematically for multipliers which are optimal with respect to statistical independence of pairs. In [18] a systematic search method is used in order to obtain the values of the multipliers that are optimal with respect to statistical independence of pairs of successive pseudo-random numbers. An interesting paper involving multiplicative congruential generators is [58], where an exhaustive search method to find optimal full-period multipliers for the multiplicative congruential generator with $m = 2^{31} - 1$ is described. In this work, the concept of optimal multiplier is related to the concept of optimal distance. In [39] a method based on Stern sequences [190] and continued fractions is described to find optimal multipliers for linear congruential pseudo-random number generators. In [22] two systematic search methods are employed to find multipliers which are optimal with respect to an upper bound for the discrepancy of pairs of successive pseudo-random numbers.
Theorem 4 summarizes some of the principal properties that should be verified in order to get periods as long as possible. As we have commented before, it is very important to work with generators that have long periods. One possibility to increase the period of a linear congruential generator is shuffling. In this context, in [121] it is suggested to shuffle the output of the congruential generator using another, simpler generator. In [9] an algorithm for the shuffling of uniform deviates is presented. The principal steps of the algorithm are:
• Set k as the length of the table T. Fill the table with
An inherent property of congruential generators is that they produce a lattice structure, as can be seen in Figure 2.
In this grid structure a series of lines can be identified where all the pairs in the series are located. Depending on the distance between these lines, the pairs are distributed more or less evenly in the plane. Generally, the greater the distance, the worse the generator. This distance is determined by the parameter a.
This structure can be detected by the spectral test [34] or by the lattice test [126]. Interesting references which study the lattice structure in the case of a multiplicative congruential generator are [125] and [12]. Some interesting papers where the correlation of the generated values of a mixed congruential generator is studied are [33], [72] and [41].
With the previously mentioned IBM generator, all the triplets of consecutive numbers of the series (5 − 10 10 ) fall in only 15 planes. Marsaglia demonstrated in 1968 that the maximum number of parallel hyperplanes that a multiplicative linear congruential generator can produce is $(n!\,m)^{1/n}$, where n is the number of consecutive numbers in the subsequences considered. Note that the number of hyperplanes decreases rapidly as the dimension n of the space increases.
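As a rough numerical illustration of Marsaglia's bound, the following Python sketch evaluates $(n!\,m)^{1/n}$ for several dimensions and counts the planes occupied by triplets of a multiplicative generator. It assumes that the IBM generator referred to above is RANDU, $x_{i+1} = 65539\,x_i \bmod 2^{31}$ (the recurrence is not restated in this section); all function names are hypothetical.

```python
import math

M = 2**31   # modulus of RANDU (assumed to be the IBM generator meant above)
A = 65539   # RANDU multiplier, equal to 2**16 + 3


def marsaglia_bound(m, n):
    """Upper bound (n! * m)**(1/n) on the number of parallel hyperplanes."""
    return (math.factorial(n) * m) ** (1.0 / n)


def randu(seed, count):
    """Return `count` successive integers of x_{i+1} = A * x_i mod M."""
    values, x = [], seed
    for _ in range(count):
        x = (A * x) % M
        values.append(x)
    return values


def plane_indices(xs):
    """Indices c of the planes 9*x_i - 6*x_{i+1} + x_{i+2} = c * M hit by triplets."""
    indices = set()
    for a, b, c in zip(xs, xs[1:], xs[2:]):
        total = 9 * a - 6 * b + c
        # Always a multiple of M, because (A - 3)**2 = 2**32 is 0 mod M.
        assert total % M == 0
        indices.add(total // M)
    return indices


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, round(marsaglia_bound(M, n), 1))
    print("planes used by triplets:", len(plane_indices(randu(1, 20000))))
```

Under this assumption, the linear relation among three consecutive values confines every triplet to one of at most 15 parallel planes, far fewer than the bound for n = 3 would allow.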
With a shuffling procedure such as the one described in [121], it is possible to break up the lattice structure.
There are several articles in the literature that study the fundamental properties of linear congruential generators, for example [141], [142] or [143].
If different random number sequences (strings) are available, a separate one should be used for each random parameter [102]. In this case, the length of each string must be taken into account to avoid overlaps. In [102], different seeds for strings of the mentioned generator, separated by 100000 pseudo-random numbers, can be found.
Although congruential generators are the most used in practice, other types of generators have been developed with the objective of obtaining longer periods and better statistical properties. Often, however, a congruential generator with properly chosen parameters can work as well as more complicated alternatives.

Multiple recursive generators
Congruential generators can be generalized to higher-order linear recursions by considering the following relationship:
$$x_n = (a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_k x_{n-k}) \bmod m \qquad (10)$$
where $k, m \in \mathbb{Z}^+$ and $a_j \in \mathbb{Z}_m$; the sequence of pseudo-random numbers is then taken as $u_n = x_n/m$ (see [104] and [110]). The study of this type of generator is associated with the study of the characteristic polynomial
$$P(z) = z^k - a_1 z^{k-1} - \cdots - a_k \qquad (11)$$
over the finite field $\mathbb{Z}_m$. When m is prime and the polynomial is primitive over $\mathbb{Z}_m$, the period of the generator is $m^k - 1$ (the maximum possible period in this class of generators) (see [97]). It is possible to generalize expression (10) a little further by adding an addend $b \in \mathbb{Z}_m$, with the result:
$$x_n = (a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_k x_{n-k} + b) \bmod m \qquad (12)$$
In [110] the authors study multiple recursive generators to find those with good properties related to structure and computational efficiency; their work shows various possibilities depending on the parameter values.
In [87], different multiple recursive generators of orders one, two and three that possess good properties of randomness and homogeneity are studied. The authors consider different values of the generator parameters and apply statistical tests such as serial autocorrelation, runs test or chi-square test among others to verify these properties.
The computational efficiency of multiple recursive generators can be improved by choosing some of the coefficients $a_j$ in $\{0, 1, -1\}$. In [38] a portable Fast Multiple Recursive Generator (FMRG) is proposed, and it is claimed that its properties are better than those of classical linear congruential generators.
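A minimal Python sketch of a multiple recursive generator follows, assuming the usual form $x_n = (a_1 x_{n-1} + \cdots + a_k x_{n-k}) \bmod m$. The toy parameters (m = 3, k = 2, $a_1 = a_2 = 1$) are chosen only so that the period can be checked by brute force; they are not taken from the cited works, and the function names are hypothetical.

```python
from itertools import islice


def mrg(coeffs, m, seed):
    """Yield x_n = (a_1*x_{n-1} + ... + a_k*x_{n-k}) mod m.

    coeffs = (a_1, ..., a_k); seed = (x_0, ..., x_{k-1}), oldest value first.
    """
    state = tuple(seed)
    while True:
        # reversed(state) lists x_{n-1}, ..., x_{n-k}, matching a_1, ..., a_k.
        x = sum(a * s for a, s in zip(coeffs, reversed(state))) % m
        state = state[1:] + (x,)
        yield x


def mrg_period(coeffs, m, seed):
    """Number of steps until the k-tuple state returns to the seed.

    Assumes a_k is invertible mod m, so that the state map is a bijection
    and the seed state is guaranteed to recur.
    """
    start = tuple(seed)
    state, steps = start, 0
    while True:
        x = sum(a * s for a, s in zip(coeffs, reversed(state))) % m
        state = state[1:] + (x,)
        steps += 1
        if state == start:
            return steps


if __name__ == "__main__":
    print(list(islice(mrg((1, 1), 3, (0, 1)), 8)))
    print("period:", mrg_period((1, 1), 3, (0, 1)))
```

With these toy values the state returns to the seed after 8 steps, which equals $m^k - 1 = 3^2 - 1$, the maximum mentioned above. Both coefficients lie in $\{0, 1, -1\}$, so each step needs only additions.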
A particular case of this type of generator is that in which m = 2, based on a sequence of zeros and ones generated according to the recursive formula (Linear Feedback Shift Register generator, LFSR):
$$x_n = (a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_k x_{n-k}) \bmod 2 \qquad (13)$$
where $a_i \in \{0, 1\}$ for all i = 1, ..., k; in this case, the $x_i$ represent bits that constitute the binary representation of an integer. This generator was introduced by Tausworthe in 1965 [193]. Its properties and characteristics have been widely studied in the literature (see for example [205], [55], [159], [197] or [69], among others). LFSRs are widely used in areas such as communications and cryptography, even though they are not cryptographically secure in principle.
To overcome this problem, they are used as building blocks in more sophisticated constructions of PRNGs. The most common techniques in this context are:
• Combine the output of several LFSRs by using a nonlinear function.
• Use the output of one LFSR (or a combination of several LFSRs) to clock one (or a combination of) other LFSR(s).
However, the design of such a combination has to be done with great care and must be analyzed with cryptanalytic techniques.
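The first technique can be sketched in Python as follows. The LFSR uses the recurrence $x_n = \sum_t x_{n-t} \bmod 2$ over a set of tap positions, and the nonlinear combiner is the textbook Geffe multiplexer, which is not taken from the source and is itself known to be vulnerable to correlation attacks; it is used here only as an illustration of why such combinations need careful analysis. The tap sets and names are illustrative assumptions.

```python
from itertools import islice


def lfsr(taps, state):
    """Yield bits x_n = XOR of x_{n-t} for t in taps (arithmetic mod 2).

    state = (x_{n-k}, ..., x_{n-1}), oldest bit first, with k = max(taps).
    """
    state = list(state)
    while True:
        bit = 0
        for t in taps:
            bit ^= state[-t]
        state = state[1:] + [bit]
        yield bit


def lfsr_period(taps, state):
    """Number of steps until the window of k bits equals the initial state."""
    start = tuple(state)
    gen = lfsr(taps, state)
    window, steps = start, 0
    while True:
        window = window[1:] + (next(gen),)
        steps += 1
        if window == start:
            return steps


def geffe(gen_a, gen_b, gen_c):
    """Nonlinear combiner: output b when a == 1, otherwise c."""
    for a, b, c in zip(gen_a, gen_b, gen_c):
        yield (a & b) ^ ((a ^ 1) & c)


def keystream(count):
    """Combine three small LFSRs of degrees 3, 4 and 5 (toy sizes)."""
    a = lfsr((2, 3), (0, 0, 1))
    b = lfsr((3, 4), (0, 0, 0, 1))
    c = lfsr((3, 5), (0, 0, 0, 0, 1))
    return list(islice(geffe(a, b, c), count))


if __name__ == "__main__":
    print(lfsr_period((2, 3), (0, 0, 1)),
          lfsr_period((3, 4), (0, 0, 0, 1)),
          lfsr_period((3, 5), (0, 0, 0, 0, 1)))
    print(keystream(32))
```

Each toy register attains the maximal period $2^k - 1$ for its degree, and the combined stream repeats after at most the product of the three periods.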
A particular case of (13) that appears frequently in the literature is: resulting from a trinomial. A generalization of (13) was made by Lewis and Payne in 1973 [118], who called it the Generalized Feedback Shift Register pseudo-random number generator (GFSR). The authors claim that the GFSR is capable of producing multidimensional pseudo-random numbers of arbitrarily long period and at higher speed than other pseudo-random generators such as Lehmer's or Kendall's. In their work, they also carry out a detailed study of the generator, paying special attention to the period and to statistical properties such as correlations, among others.
Other authors who have studied this type of generator include [63], [131], [132] and [133]. In the last of these papers ([133]), the pseudo-random number generator known as the "Mersenne twister", well known for its quality, is defined. Its name comes from the fact that the period length of the generated sequence is a Mersenne prime. There are at least two variants of this algorithm, differing only in the size of the Mersenne primes used. The most recent and most widely used is the Mersenne Twister MT19937, with a word size of 32 bits. Its period is $2^{19937} - 1$. There is another variant with 64-bit words, MT19937-64, which generates a different sequence. Two interesting references which study such generators are [155] and [111].
The Mersenne twister in its original form is not recommended for cryptographic applications: it is based on a linear recursion, and any sequence of pseudo-random numbers generated by a linear recursion is insecure, since the rest of the outputs can be predicted from a sufficiently long sub-sequence of the outputs. For statistical simulations, however, it is recommended.
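The predictability argument can be made concrete with a short Python sketch. Python's `random` module is documented as using MT19937; the sketch inverts the output tempering of 624 consecutive 32-bit outputs and loads the recovered words into a second generator. The layout of the state tuple passed to `setstate` (version 3, 624 words plus an index) is a CPython implementation detail and is assumed here; the helper names are hypothetical.

```python
import random

N = 624  # number of 32-bit words in the MT19937 state


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit word x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)   # each pass fixes `shift` more high bits
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit word x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)   # each pass fixes `shift` more low bits
    return x


def untemper(y):
    """Recover the raw state word from one tempered MT19937 output."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_from_outputs(outputs):
    """Build a generator that continues the sequence after N observed outputs."""
    state = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # Index N forces a state regeneration on the next call (assumed layout).
    clone.setstate((3, state + (N,), None))
    return clone


if __name__ == "__main__":
    victim = random.Random(2024)
    observed = [victim.getrandbits(32) for _ in range(N)]
    clone = clone_from_outputs(observed)
    print(clone.getrandbits(32) == victim.getrandbits(32))
```

No knowledge of the seed is needed: the observed outputs alone determine every later output, which is why a linear generator of this kind should not be used to produce keys.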

Lagged Fibonacci generator
The lagged Fibonacci generator is a particular case of multiple recursive generator. It is defined as:
$$x_i = (x_{i-j} + x_{i-k}) \bmod m \qquad (15)$$
In the particular case in which j = 1, k = 2 the generator is known as the Old Fibonacci generator. This generator tends to have a period greater than m, but is unacceptable from a statistical point of view: the arrangement of three consecutive output values $u_{i-2} < u_i < u_{i-1}$ can never be produced, whereas such a structure should occur with probability 1/6 in the case of a "perfect" random number generator [20]. However, under certain selections of the parameters, generator (15) can perform well. For example, if m is a prime number and k > j > 0, then the length of the cycle can be as long as $m^k - 1$ (see [5]), and if $m = 2^p$, the maximum period can be $(2^k - 1)2^{p-1}$. An example of this generator is: with an approximate period $2^{129}$.
Some references where this type of generator is studied are, for example: [73], where additive generators and their properties are studied; [202] and [122], where the properties concerning the period of the Fibonacci generator are studied; [71], which studies the case $x_i \equiv x_{i-1} + x_{i-n} \bmod 1$ with j > n and its equivalent statement for binary machines, $x_i \equiv x_{i-1} + x_{i-n} \bmod 2^r$, j > n, $0 \le x_i < 2^r$, r being the number of bits used to encode each fixed-point number; and [137], where the additive congruential method is studied for the case j = 2, k = 3 and m a prime number.
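The defect of the Old Fibonacci generator can be checked empirically. The Python sketch below assumes the additive form $x_i = (x_{i-j} + x_{i-k}) \bmod m$ and counts how often the ordering $x_{i-2} < x_i < x_{i-1}$ occurs, comparing with Python's built-in generator as a reference. The modulus and seeds are arbitrary illustrative choices.

```python
import random
from itertools import islice


def lagged_fibonacci(j, k, m, seed):
    """Yield x_i = (x_{i-j} + x_{i-k}) mod m, with 0 < j < k.

    seed = (x_0, ..., x_{k-1}), oldest value first.
    """
    state = list(seed)
    while True:
        x = (state[-j] + state[-k]) % m
        state = state[1:] + [x]
        yield x


def count_pattern(xs):
    """Count consecutive triples with x_{i-2} < x_i < x_{i-1}."""
    return sum(1 for a, b, c in zip(xs, xs[1:], xs[2:]) if a < c < b)


if __name__ == "__main__":
    m = 2**31 - 1
    fib = list(islice(lagged_fibonacci(1, 2, m, (12345, 67890)), 60000))
    ref_rng = random.Random(1)
    ref = [ref_rng.randrange(m) for _ in range(60000)]
    print("Old Fibonacci:", count_pattern(fib) / (len(fib) - 2))
    print("reference    :", count_pattern(ref) / (len(ref) - 2))
```

The count for j = 1, k = 2 is exactly zero: without wrap-around the new value is at least $x_{i-1}$, and with wrap-around it is smaller than $x_{i-2}$, so it can never fall strictly between the two.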
A common way in which this generator is generalized is by using some binary operator •:
$$x_i = x_{i-j} \bullet x_{i-k}$$
with $0 \le x_i < m$ and 0 < j < k < i. A particular and widely used operator is the XOR (⊕) operator:
$$x_i = x_{i-j} \oplus x_{i-k}$$

and it is denoted by: Then a sequence of binary integers $x_1, x_2, \ldots$ is defined as follows: That is, l consecutive $t_i$ are strung together to form $x_i$ as a number in base 2. The recurrence for the $x_i$ is the same as the recurrence for the $t_i$, that is: where the ⊕ operation is performed bitwise. Then, the pseudo-random number $u_i$, which is taken as a value of a U(0, 1) random variable, is:

Combined multiple recursive generators
The period of linear congruential generators with modulus around $2^{31}$ may be insufficiently long for certain applications. In fact, this length can be used up in a few minutes on a PC. However, congruential algorithms can be combined to increase the period of the generation cycle. The principal idea is to combine two or more multiplicative congruential generators in order to obtain longer sequences with "good" statistical properties.
The disadvantage of using a combined generator is the associated computational cost, which is higher than that required for a simple congruential generator.
In [103] the following result is shown, which suggests how this can be done:

Lemma 1 (L'Ecuyer [103])
Let $W_i$, i = 1, ..., l, be l independent discrete random variables, where $W_1$ is a discrete uniform random variable between 0 and d − 1, $d \in \mathbb{Z}^+$. Then
$$W = \left(\sum_{i=1}^{l} W_i\right) \bmod d$$
follows a discrete uniform law between 0 and d − 1.
This lemma can be applied to form combined generators in the following way: let $X_{i,j}$, j = 1, ..., l, be the i-th output from l different multiplicative congruential generators, where the j-th generator has prime modulus $m_j$ and period $m_j - 1$. Then the j-th generator produces integers $x_{i,j}$ that can be considered approximately as realizations of a discrete uniform random variable $X_{i,j}$ on the integers from 1 to $m_j - 1$, and $W_{i,j} = X_{i,j} - 1$ is approximately uniformly distributed on the integers from 0 to $m_j - 2$. In [103] the following combined generator is proposed: The maximum period for this generator is: In [103] the properties of the generator in (24) are studied, paying attention to the period and to the possible values that can be selected for the multipliers and moduli in order to achieve periods as long as possible while verifying the desired statistical properties. In addition, the proposed generators are tested with Knuth's statistical test battery.
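The following Python sketch illustrates both ideas. The first function checks Lemma 1 exactly on a small example, computing the distribution of $(W_1 + \cdots + W_l) \bmod d$ with rational arithmetic. The second is a two-component combined generator in the form commonly attributed to [103]; the combination rule (difference of the two components modulo $m_1 - 1$) and the constants are assumptions taken from the widely quoted version of that generator, since the formula itself is not reproduced above.

```python
from fractions import Fraction
from itertools import islice


def sum_mod_d_distribution(dists, d):
    """Exact distribution of (W_1 + ... + W_l) mod d for independent W_j.

    Each element of `dists` is a list of d probabilities (Fractions).
    """
    result = [Fraction(0)] * d
    result[0] = Fraction(1)
    for dist in dists:
        new = [Fraction(0)] * d
        for s, ps in enumerate(result):
            for w, pw in enumerate(dist):
                new[(s + w) % d] += ps * pw
        result = new
    return result


def combined_mlcg(seed1, seed2):
    """Yield U(0,1) values from two multiplicative generators combined mod m1 - 1.

    The moduli, multipliers and combination rule are assumed values.
    """
    m1, a1 = 2147483563, 40014
    m2, a2 = 2147483399, 40692
    x1, x2 = seed1, seed2
    while True:
        x1 = (a1 * x1) % m1
        x2 = (a2 * x2) % m2
        z = (x1 - x2) % (m1 - 1)
        # Map z = 0 to m1 - 1 so that the output never equals 0.
        yield (z if z > 0 else m1 - 1) / m1


if __name__ == "__main__":
    d = 6
    uniform = [Fraction(1, d)] * d
    skewed = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6),
              Fraction(0), Fraction(0), Fraction(0)]
    print(sum_mod_d_distribution([uniform, skewed, skewed], d))
    print(list(islice(combined_mlcg(12345, 67890), 3)))
```

In the first example only one of the summands is uniform, yet the sum modulo d is exactly uniform, which is the property the combination relies on.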
In [121] testing methods for pseudo-random numbers generated by linear (multiplicative and mixed) congruential methods and by other methods are discussed. In particular, the authors propose the following sequence $\{x_i\}$, built in the following way: and $u_0 = 1$, $v_0 = 0$. A table of 128 values of the sequence $\{u_i\}$ is generated and then, to generate a value of $x_i$, the first seven bits of $v_i$ are used as an index to get $x_i$ from the table. Another interesting reference describing a method of generating pseudo-random uniform numbers based on the combination of two congruential generators is [204].
In [10] the following generator is analyzed: where k, l, p, q, $u_i$ and $v_i$ are positive integers. The sequence $\{x_i\}$ is generated by combining the sequences $\{u_i\}$ and $\{v_i\}$. In this paper, several statistical tests are also applied to the resulting sequence $\{x_i\}$.
Another well-known combined generator is due to Wichmann and Hill ([206], [207]). It has a period of order $10^{12}$ and is defined as: And take: This combination is based on the following results:
• If $U_1, \ldots, U_k$ are independent and identically distributed U(0, 1) random variables, then the fractional part of $U_1 + \ldots + U_k$ also follows a U(0, 1) distribution and $U_1 + \ldots$
• If $u_1, \ldots, u_k$ are generated by congruential methods with periods $c_1, \ldots, c_k$, respectively, then the fractional part of $u_1 + \ldots + u_k$ has a period equal to the least common multiple of $c_1, \ldots, c_k$.
Generator (31) has a long period (but not as long as stated in [207]). It is portable and efficient, although the code can present some numerical difficulties on some particular computer architectures (see [134]).
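As an illustration of the two results above, the following Python sketch implements a three-component generator of the Wichmann-Hill type. The multipliers and moduli are the values commonly quoted for the 1982 version of this generator and are assumed here, since formula (31) is not reproduced above; the period is computed as the least common multiple of the component periods, assuming each component attains its full period $m - 1$. `math.lcm` with several arguments requires Python 3.9 or later.

```python
import math
from itertools import islice

MODULI = (30269, 30307, 30323)   # assumed Wichmann-Hill moduli (all prime)
MULTIPLIERS = (171, 172, 170)    # assumed Wichmann-Hill multipliers


def wichmann_hill(seeds):
    """Yield U(0,1) values: fractional part of the sum of three scaled LCGs."""
    state = list(seeds)
    while True:
        state = [(a * x) % m for a, x, m in zip(MULTIPLIERS, state, MODULI)]
        yield sum(x / m for x, m in zip(state, MODULI)) % 1.0


def combined_period():
    """lcm of the component periods, each assumed to be m - 1."""
    return math.lcm(*(m - 1 for m in MODULI))


if __name__ == "__main__":
    print(list(islice(wichmann_hill((1, 2, 3)), 3)))
    print("period:", combined_period())
```

Because the three component periods share common factors, the least common multiple is smaller than their product, which is consistent with the remark that the period is of order $10^{12}$.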
In [105] and in [108] another design of combined generator is presented. I different multiple recursive generators of the form $z_{i,j} \equiv a_1 z_{i-1,j} + a_2 z_{i-2,j} + \ldots + a_q z_{i-q,j} \bmod m_i$, i = 1, ..., I, are considered. Given specific constants $\delta_1, \ldots, \delta_I$, it is built: And taking $u_j = y_j/m_1$. The parameter values of this type of generator must be chosen very carefully. It is possible to obtain long periods and sequences verifying good statistical properties. An interesting study concerning this last point can be seen in [108].
In [114] the following combined generator is studied: with $a_1 = 1403580$, $b_1 = 810728$, $m_1 = 2^{32} - 209$, $a_2 = 527612$, $b_2 = 1370589$, $m_2 = 2^{32} - 22853$. These types of generators allow the simultaneous acquisition of multiple strings, each of which can be divided into many consecutive long sub-strings. The objective is to obtain sequences whose statistical properties are better than those obtained from each of the independent generators that compose it.
The period length of this generator is far superior to those of the previous ones: To generate different strings and sub-strings, two positive integers v and w are chosen, and z = v + w is defined. Then, the cycle l is divided into contiguous strings of length $Z = 2^z$, and each in turn is divided into $V = 2^v$ sub-strings of length $W = 2^w$. Suitable values are v = 51 and w = 76, so $W = 2^{76}$ and $Z = 2^{127}$. For this particular generator the following values can be used as initial default seeds: $(x_{1,n-3}, x_{1,n-2}, x_{1,n-1}, x_{2,n-3}, x_{2,n-2}, x_{2,n-1}) = (12345, 12345, 12345, 12345, 12345, 12345)$. Other parameters suitable for this type of generator can be found in [108].
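A Python sketch of this generator is given below. The constants and the default seed are those listed above; the shape of the two order-3 recurrences and the final combination are not reproduced in the text, so they are assumed here to be the ones usually given for the generator known as MRG32k3a: $x_{1,n} = (a_1 x_{1,n-2} - b_1 x_{1,n-3}) \bmod m_1$, $x_{2,n} = (a_2 x_{2,n-1} - b_2 x_{2,n-3}) \bmod m_2$ and $u_n = ((x_{1,n} - x_{2,n}) \bmod m_1)/(m_1 + 1)$.

```python
from itertools import islice

M1 = 2**32 - 209
M2 = 2**32 - 22853
A1, B1 = 1403580, 810728
A2, B2 = 527612, 1370589


def combined_mrg(seed=(12345,) * 6):
    """Yield U(0,1) values from two order-3 recurrences combined by subtraction.

    seed = (x1[n-3], x1[n-2], x1[n-1], x2[n-3], x2[n-2], x2[n-1]).
    The recurrence shape is an assumption, as explained in the lead-in.
    """
    s1, s2 = list(seed[:3]), list(seed[3:])
    while True:
        x1 = (A1 * s1[1] - B1 * s1[0]) % M1   # uses x1[n-2] and x1[n-3]
        x2 = (A2 * s2[2] - B2 * s2[0]) % M2   # uses x2[n-1] and x2[n-3]
        s1 = s1[1:] + [x1]
        s2 = s2[1:] + [x2]
        z = (x1 - x2) % M1
        # Map z = 0 to M1 so that the output lies strictly inside (0, 1).
        yield (z if z > 0 else M1) / (M1 + 1)


def stream_layout(v=51, w=76):
    """Lengths (Z, V, W) of the string and sub-string partition in the text."""
    return 2 ** (v + w), 2 ** v, 2 ** w


if __name__ == "__main__":
    print(list(islice(combined_mrg(), 3)))
    Z, V, W = stream_layout()
    print(Z == 2**127, V == 2**51, W == 2**76)
```

With v = 51 and w = 76, each string of length $2^{127}$ contains $2^{51}$ sub-strings of length $2^{76}$, which is what allows many non-overlapping sequences to be assigned to different parts of a simulation.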
Specific attention should be paid to the generation of random numbers to be used in parallel calculations [62].

Matrix congruential generators
It is possible to design other types of generators that produce better sequences than those obtained by classical congruential generators by means of a matrix design. Thus, the so-called Matrix Congruential Generators are developed. These are defined as: where $x_n$ is a d-dimensional vector and A and B are d × d matrices. The elements of the vectors and the matrices are integers z ∈ {1, . . . , m − 1}. The case B = 0, $x_0 \neq 0$ and m a prime number was studied in [61], [74] and [144].
The lattice structure of the sequences generated by this type of generator is studied in [2].
Of particular interest is the study in [38] that looks at this type of generator and proposes a fast generation method that the authors call Fast Matrix Congruential Generator (FMCG). It is claimed that FMCGs produce better sequences than classical linear congruential generators and are easily implemented and computationally efficient.
It is possible to generalize the expression of a multiple recursive generator (10) to higher dimensions, and so the multiple recursive matrix random number generator is defined as: where $x_n$ is a d-dimensional vector and $A_i$, i = 1, ..., k, are d × d matrices. Some interesting references about matrix congruential generators are [147] and [148].

Non-linear congruential generators
The linear congruential generators have some weaknesses related to regularity, which is why other non-linear congruential methods have been developed to try to overcome these deficiencies. While it is true that their generation requires more computational effort than linear models, with technological progress and the existence of increasingly better, more efficient computers with more memory, this is not a major problem.
The generic expression of a non-linear congruential generator is: where g is a non-linear deterministic function. As in the case of linear congruential generators, the sequences obtained with this formula are integers between 0 and m − 1. From these, the sequence of values in (0, 1) is obtained by taking $u_n = x_n/m$. Next, we will discuss some of the most well-known ones.

Inversive congruential generators
They were introduced in [53], and three fundamental types can be distinguished according to whether the modulus is a prime number, a power of 2 or a power of an odd prime number (the latter introduced in [51]). They are defined as: with $0 \le x_i < m$, and The sequence of values in the interval (0, 1) is obtained by the quotient $x_i/m$. As in the case of linear congruential generators, it is important to determine the length of the sequences that can be obtained with this type of generator. In fact, in [53] the authors describe an efficient method for the calculation of the period of these generators when the modulus is a prime number p. In particular, they give the following result:
Theorem 5 (Eichenauer, J. and Lehn, J. [53])
Let a and b be the coefficients of the inversive non-linear congruential method, chosen such that $z^2 - bz - a$ is a primitive polynomial over $\mathbb{Z}_p$. Then, the sequence $x_i/p$ has maximum period and its value is p.
If the inversive congruential method has a maximum period, then the uniformity test for equidistribution is passed in [0, 1) as shown in [142]. This aspect related to the uniformity properties of the sequences obtained using this type of generators has also been studied in [53] and [52]. In relation to the independence of the obtained values, a detailed analysis is made in [145] and [146].
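For a prime modulus, the behaviour of the period can be explored with a short Python sketch. It assumes the usual definition of the inversive step, $x_{i+1} = (a\,\bar{x}_i + b) \bmod p$, where $\bar{x}$ is the multiplicative inverse of x modulo p and $\bar{0} = 0$; this definition and the toy parameters are assumptions, since the defining equations are not reproduced above.

```python
def icg_step(x, a, b, p):
    """One step x -> (a * inv(x) + b) mod p, with inv(0) defined as 0."""
    inv = pow(x, p - 2, p)   # Fermat inverse; evaluates to 0 when x == 0 (p > 2)
    return (a * inv + b) % p


def icg_period(a, b, p, x0=0):
    """Length of the cycle through x0 (the map is a permutation when a != 0 mod p)."""
    x, steps = icg_step(x0, a, b, p), 1
    while x != x0:
        x = icg_step(x, a, b, p)
        steps += 1
    return steps


def full_period_parameters(p):
    """All (a, b) with 1 <= a < p, 0 <= b < p whose sequence has period p."""
    return [(a, b) for a in range(1, p) for b in range(p)
            if icg_period(a, b, p) == p]


if __name__ == "__main__":
    print(icg_period(2, 2, 5))        # the orbit of 0 is 0, 2, 3, 1, 4
    print(full_period_parameters(7))
```

Since the step is a permutation of $\{0, \ldots, p-1\}$, the period can never exceed p, and an exhaustive search over small primes shows which parameter pairs attain it.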
In the case in which the modulus m is a power of 2, the period of the generated sequence has also been studied. In this respect, the key conditions are given in [54] in the following theorem: In relation to the independence property tested with the serial test, results can be found in [146] and [50] which show, among others, some disadvantages of generators with a power-of-two modulus compared with those with a prime modulus. Another disadvantage is pointed out in [49] and refers to the existence of regular structures within the generated points.
As explained above, the last type of inversive congruential generator, whose modulus is a power of a prime number, was introduced in 1990. The period of this type has also been studied, as stated in the following theorem from [47].
Theorem 7 (Eichenauer-Herrmann, J. [47])
Consider an inversive congruential generator with modulus m = p k , p ≥ 3 and k ≥ 2. Let λ ≥ 2 and η be an integer such that y λ ≡ y 0 + pη mod p 2 . For α, β integers let us consider a sequence of integers (y n ) n≥0 such that y 0 ≡ y 0 mod p and y n+1 ≡ (a + pα)y − n + b + pβ mod m, n ≥ 0. Then the inversive congruential sequence (y n ) n≥0 has maximal period l if and only if
The properties of uniformity and independence are studied in [46]. An interesting survey about this type of generator is [48]. With regard to the use of this type of generator, it should be noted that it is not very widespread. In the case of inversive congruential generators, there are some means of accelerating the calculations (see [70]). The randomness of the sequences is better than in the case of linear congruential generators, although their performance in passing the tests based on spacings was not as good, as stated by L'Ecuyer [106], and therefore their use is not recommended in most tests.

Knuth's non-linear congruential generator
It was proposed by Knuth in 1998 [97] and is defined as: The method can be generalized to higher-order polynomials although, in practice, there seems to be no advantage in doing so. The particular case of a = 0 and b = 0, with m the product of two different large prime numbers, has been studied in [13] and [14]; in these articles the generator is defined and its main characteristics are shown. Blum, Blum, and Shub demonstrated the following result about the unpredictability of their generator: if $m = p_1 p_2$, where $p_1$ and $p_2$ are primes with $p_1 \neq p_2$ that verify $p_i \equiv 3 \bmod 4$, i = 1, 2, then the sequence provided by this generator is not predictable in polynomial time without knowing the values of $p_i$, i = 1, 2. This result has important applications in the field of cryptography, because it rests on the computational difficulty of distinguishing quadratic residues from non-quadratic residues modulo m. On the contrary, for statistical purposes this generator is not recommended, because the non-linear operation cannot be computed efficiently. An interesting reference that studies the properties of this generator is [36].
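The squaring generator of Blum, Blum and Shub can be sketched in a few lines of Python. The sketch assumes the usual form $x_{i+1} = x_i^2 \bmod m$ with the least significant bit of each state taken as output; the primes 11 and 23 are toy values chosen only so that the arithmetic can be followed by hand, and offer no security whatsoever.

```python
from math import gcd


def blum_blum_shub(p, q, seed, count):
    """Return `count` (state, bit) pairs of x_{i+1} = x_i**2 mod (p*q).

    Expects distinct primes p, q congruent to 3 mod 4 (primality is not
    checked here) and a seed coprime to p*q.
    """
    if p == q or p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must be distinct and congruent to 3 mod 4")
    m = p * q
    if gcd(seed, m) != 1:
        raise ValueError("seed must be coprime to p*q")
    x = (seed * seed) % m          # start from a quadratic residue
    pairs = []
    for _ in range(count):
        pairs.append((x, x & 1))   # output the least significant bit
        x = (x * x) % m
    return pairs


if __name__ == "__main__":
    for state, bit in blum_blum_shub(11, 23, 3, 6):
        print(state, bit)
```

Each output bit costs one modular squaring of a number as large as m, which for cryptographic sizes of m is far more expensive than one step of a linear generator; this is the efficiency drawback mentioned above.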
Another particular case of this generator is when the parameter values are d = a = 1, c = 0 and $m = 2^k$, $k \in \mathbb{Z}^+$. In this case the generator turns out to be very similar to the one defined by the middle-square method, although it has better statistical properties. The period of this quadratic generator is at most m.

More general formulations
In addition to the above-mentioned generators, formulations in terms of more general functions g can be found in the literature. These may be general non-linear functions or mixtures in which linear and inversive addends are combined. With respect to the first case, the formulation would be: This type of generator, with non-linear functions g and with the modulus m being a prime number, was studied in detail in [52].
Another example is the case in which linear and inversive addends are combined. This type was studied in [89] and can be defined as: with $0 \le x_i < m$, $a, b, c \in \mathbb{Z}_m$ and $m = 2^k$, $k \ge 3$. The sequence of pseudo-random numbers is obtained by $x_i/m$. In [89] the authors study the period of the obtained sequences and formulate the next theorem:
An important case of this type of non-linear generator is the Non-linear Feedback Shift Register (NLFSR) generator. The equation of this model is: where f can be any non-linear function in L variables. For computational purposes the most recommended case is the binary case, in which each cell contains a bit and f is a Boolean function. It is known that any binary sequence generated by an NLFSR has a period that can be at most equal to $2^L$ (sequences that reach this period are known as De Bruijn sequences), and that any periodic sequence can be produced by such a register. (49) where ∧ represents the AND operator and ⊕ the XOR operator, and the initial conditions $x_0 = 0$, $x_1 = 1$ and $x_2 = 0$. In this case the obtained sequence is
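A binary NLFSR with L = 3 cells that attains the maximum period $2^L = 8$ can be sketched in Python as follows. The feedback function used here is a hypothetical example built from AND and XOR operations; it is not claimed to be the function of equation (49), which is not reproduced above. Only the initial conditions $x_0 = 0$, $x_1 = 1$, $x_2 = 0$ are taken from the text.

```python
def feedback(x0, x1, x2):
    """Hypothetical nonlinear feedback: x0 XOR x1 XOR ((NOT x1) AND (NOT x2))."""
    return x0 ^ x1 ^ ((x1 ^ 1) & (x2 ^ 1))


def nlfsr_cycle(f, state):
    """Return the successive register states until `state` recurs."""
    start = tuple(state)
    current, seen = start, []
    while True:
        seen.append(current)
        # Shift left by one cell and append the new feedback bit.
        current = current[1:] + (f(*current),)
        if current == start:
            return seen
        if current in seen:
            raise ValueError("the start state does not lie on a cycle")


if __name__ == "__main__":
    cycle = nlfsr_cycle(feedback, (0, 1, 0))
    print(len(cycle), [s[0] for s in cycle])
```

The register passes through all eight possible states, including the all-zero state that a linear register of the same length can never leave, so the output bits form a De Bruijn sequence of order 3.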

New lines of research: other methods for generating pseudo-random numbers
As mentioned before, one of the fundamental areas in which the generation of pseudo-random numbers is crucial is cryptography and lightweight cryptography. The latter responds to security requirements in environments with limited hardware and software resources, for example the Internet of Things (IoT). Applications in key security and encryption [184], [188] are noteworthy. From a cryptographic point of view, it is essential that the generators used are secure. Recalling the properties that a good PRNG must fulfill (see Table 5), the unpredictability property must be verified. This implies that knowledge of any subsequence of a generated sequence does not allow the complete sequence to be calculated or estimated, and that knowledge of some internal state does not allow the preceding or subsequent numbers to be calculated. In this sense, it is possible to find in the literature the definition of Cryptographically Secure Pseudo-random Number Generators (CSPRNGs). CSPRNGs are a special type of PRNG with the property of unpredictability. This means that, given n consecutive bits of the key, there is no polynomial-time algorithm that can predict the next bit with a probability of success greater than 50 %.
A CSPRNG satisfies the requirements of an ordinary PRNG, but not vice versa. CSPRNGs must also satisfy that: (1) their statistical properties are good (they pass statistical randomness tests) and (2) they hold up under severe attacks, even if part of their initial state or current state is available to an attacker.
Every CSPRNG:
• Must satisfy the next-bit test: given the first k bits of a random sequence, there is no polynomial-time algorithm that can predict the (k + 1)-th bit with a probability of success greater than 0.5 [91]. A generator that passes the next-bit test will pass any other polynomial-time statistical test of randomness [210].
• Must withstand state compromise extensions: if some or all of its state information is revealed, it must be impossible to reconstruct the sequence of random numbers generated prior to the compromise. Moreover, if there is entropy input while running, it should be impossible to use knowledge of the input state to predict future state conditions.
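In practice, this distinction usually translates into using different interfaces for simulation and for security. As a minimal illustration in Python, whose documentation states that the `random` module should not be used for security purposes and recommends the `secrets` module instead, the sketch below draws key material from the operating system's CSPRNG and keeps a seeded generator only for reproducible simulation. The function names are hypothetical.

```python
import random
import secrets


def session_key(num_bytes=32):
    """Return key material drawn from the operating system's CSPRNG."""
    return secrets.token_bytes(num_bytes)


def simulation_stream(seed, count):
    """Return reproducible U(0,1) values for simulation (never for secrets)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


if __name__ == "__main__":
    print(session_key().hex())
    print(simulation_stream(42, 3))
```

Reproducibility from a seed is a desirable feature in simulation and a fatal one for key generation, which is why the two uses are kept apart.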
As described in the previous sections, certain PRNGs are not suitable for cryptographic uses, although there are designs that are theoretically proved to be secure (see [43], [44] or [1], among others). However, most of the generators implemented in operating systems and cryptographic libraries are not based on the main security models of theoretical PRNGs and present weaknesses that make them vulnerable to known attacks (as can be the case of the PRNGs of Linux OS, OpenSSL, Android, OpenJDK, Bouncycastle and IBM). For example, certain PRNGs are vulnerable to machine learning attacks: in [86] it is described how long short-term memory (LSTM) networks turn out to be efficient at breaking an ordinary PRNG.
The main attacks that PRNGs can suffer from are (see [92]):
• Direct cryptanalytic attack: it is carried out when the sequence generated by the PRNG is not completely indistinguishable from a truly random sequence. In this case, an attacker could deduce what type of PRNG has been used and perhaps the key that governs it. This attack is only feasible when a certain number of the generated sequence numbers can be observed directly or if the sequence can be found out indirectly. If the PRNG were used exclusively to generate keys for other secure encryption algorithms, such as TDEA or AES, it would not be possible to deduce these keys, so it would be impossible to attack the PRNG that generated them.
• Input-based attack: this attack occurs when an attacker is able to use knowledge about the PRNG input sequence to cryptanalyze it. This attack can be implemented in several ways: (i) known input attack, (ii) chosen input attack and (iii) repeated input attack.
• State compromise extension attacks: these attempt to extend the benefits of a previous successful attack in which a state S of the generator has been recovered.
When working with PRNGs, it is recommended to apply a cryptographic digest function (hash function) to the output of a PRNG if it is vulnerable to direct cryptanalytic attacks. This may increase the security of such a sequence, but decreases the speed of the generator. Another recommendation is to apply a hash function to the input, together with a counter, before using it. This helps prevent most chosen-input attacks. Inputs should be combined with a timestamp (bit by bit, mod 2) and then hashed before being loaded into the PRNG. It is also possible to generate a new PRNG input state from time to time. For PRNGs such as ANSI X9.17, which leave a large part of their state unchanged after initialization, it is desirable to generate a new input state from the current one in the PRNG. In this way, the PRNG can reinitialize itself, given sufficient time and entropy. Finally, it is essential to pay special attention to the starting and seeding points of PRNGs, in order to guarantee the privacy (confidentiality) of the state of the generator.
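The first two recommendations can be sketched in Python with the standard `hashlib` module. The sketch is only illustrative: the choice of SHA-256, the byte layout, the use of concatenation to combine the input with the counter and timestamp, and the toy linear congruential generator used as the underlying PRNG are all assumptions, and the result should not be read as a vetted CSPRNG design.

```python
import hashlib
import time


def toy_lcg(seed, a=1103515245, c=12345, m=2**31):
    """Toy linear congruential generator standing in for a weak PRNG."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


class HashedOutput:
    """Expose only SHA-256 digests of the raw output of a generator."""

    def __init__(self, raw_generator):
        self._raw = raw_generator   # any iterator of non-negative integers

    def next_block(self):
        raw = next(self._raw)
        data = raw.to_bytes((raw.bit_length() + 7) // 8 or 1, "big")
        return hashlib.sha256(data).digest()


def condition_input(entropy, counter, timestamp_ns=None):
    """Hash an input together with a counter and a timestamp before seeding."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    material = (counter.to_bytes(8, "big")
                + timestamp_ns.to_bytes(16, "big")
                + entropy)
    return hashlib.sha256(material).digest()


if __name__ == "__main__":
    out = HashedOutput(toy_lcg(1))
    print(out.next_block().hex())
    print(condition_input(b"sensor reading", counter=1).hex())
```

Hashing the output hides the raw state from an observer at the cost of one digest per block, and hashing inputs with a counter means that an attacker who chooses the input does not directly choose what is loaded into the generator.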
In this context, new techniques have been developed that combine the methods explained in the previous section with others that use deep learning or new designs (which also incorporate TRNGs or use other algorithms such as genetic algorithms).
In IoT applications, the PRNGs used often have security vulnerabilities, i.e. they are not CSPRNGs. This is precisely due to hardware and software limitations. The generation of CSPRNGs in this context is a powerful and growing line of research. Table 8 shows some designs of cryptographically secure PRNGs together with a brief description and references. As can be seen, their mathematical basis is founded on some of the models discussed in the previous section, improved with coupling operations, incorporation of several interconnected generators, XOR operations or combinations with TRNGs among others.
As examples of the use of deep-learning techniques, we could highlight [86] and [203]. In [86], a method for generating pseudo-random numbers is proposed that uses neural networks, in particular recurrent neural networks with long short-term memory. On the other hand, in [203] a design is proposed that combines two PRNGs and a physical unclonable function (PUF) for the generation of pseudo-random number sequences that are robust against LSTM attacks.
Examples of research that focuses on designing generators that combine TRNGs and PRNGs include [66] and [88], among others. In [66] a generator is designed that combines a TRNG and a PRNG. The authors opt for this methodology because, according to their design, they manage to use the advantages that both types of generators provide, and in this way they obtain sequences that pass the usual statistical tests. They use a true random number as a seed for the PRNG. Specifically, they start by extracting neuro-signals (using a low-resolution, cheap and portable encephalogram) that are used as seeds for a linear congruential generator. A similar problem is addressed in [88]. The authors design a PRNG that uses as seed (nearly) random sources produced by the computer.
An example of research in which a PRNG is combined with a genetic algorithm is [192]. In particular, an LFSR is used.
[185] Can be used in other low-power embedded networks. Problems: it has significant security concerns (see [185]):
→ the exact entropy introduced by the transmitted and received messages in the network is undefined
→ how frequently it is necessary to apply the re-keying algorithm is not defined
→ the CRC of the transmitted packets has a very low entropy
→ the packets can be modified by an attacker
→ the generated sequence has a short period
→ the internal secret state of the generator can be easily recovered by eavesdropping only two consecutive outputs

Conclusions
The generation of random sequences, as discussed above, is essential in several areas of knowledge. The essential statistical properties sought in these sequences are randomness, independence and uniformity (in which case any type of sequence can be generated from any other random variable). To the above properties we would add that of unpredictability when working in cryptography. From a practical point of view, and assuming that an arithmetic generator is used, it is also desirable that the period be as long as possible.
Focusing on the initial objectives of this research, we can affirm that they have been achieved:
• The concept of randomness and the main types of generators have been explained in detail: TRNGs, PRNGs and the subclass CSPRNGs. It has been emphasized which properties are desirable for the sequences to fulfill depending on the context in which they are to be used.
• This work has analyzed in detail the different types of arithmetic generators, which are characterized by a solid mathematical basis. They constitute a set of algorithms that allow the generation of sequences of pseudo-random numbers. Two main sub-types have been developed: linear and non-linear. The most important generators in the literature have been defined, from the most rudimentary methods with the worst properties to the most sophisticated ones with better statistical properties and longer periods.