Some Remarks and Tests on the DH1 Cryptosystem Based on Automata Compositions

Dömösi and Horváth in their previous works (see [Dömösi and Horváth, 2015a] and [Dömösi and Horváth, 2015b]) introduced new block ciphers based on the Gluškov-type product of automata. In what follows we refer to the cipher in [Dömösi and Horváth, 2015a] as the first Dömösi-Horváth cryptosystem, or DH1-cipher for short, and to the cipher in [Dömösi and Horváth, 2015b] as the second Dömösi-Horváth cryptosystem, or DH2-cipher for short. In this paper we investigate some properties of the DH1-cipher. However, we do not discuss all details of the definition and motivation of the DH1-cipher.


Introduction and problem statement
Dömösi and Horváth in their previous works (see [Dömösi and Horváth, 2015a] and [Dömösi and Horváth, 2015b]) introduced new block ciphers based on the Gluškov-type product of automata. In what follows we refer to the cipher in [Dömösi and Horváth, 2015a] as the first Dömösi-Horváth cryptosystem, or DH1-cipher for short, and to the cipher in [Dömösi and Horváth, 2015b] as the second Dömösi-Horváth cryptosystem, or DH2-cipher for short. In this paper we investigate some properties of the DH1-cipher. However, we do not discuss all details of the definition and motivation of the DH1-cipher.
Both systems use the following simple idea: consider a giant permutation automaton whose set of states and set of inputs both consist of all strings of a given length over a non-trivial alphabet, that is, of all possible plaintext/ciphertext blocks. Moreover, consider a cryptographically secure pseudo random number generator with a large period which, once it gets a truly random seed, serves a sequence of pseudo random strings as inputs for the automaton. For each plaintext block the system calculates the new state into which the actual pseudo random string takes the automaton from the state identified with the actual plaintext block. The string identified with the new state will be the ciphertext block assigned to the considered plaintext block. The ciphertext is the concatenation of the generated ciphertext blocks. The giant size of the automaton makes it infeasible to break the system by brute force.
For all notions and notations not defined in this paper we refer to the monographs [Dömösi and Nehaniv, 2005; Menezes and Vanstone, 1996].
The cryptosystem discussed here is a block cipher. Since the key automaton is a permutation automaton, for every ciphertext there exists exactly one plaintext making the encryption and decryption unambiguous. Moreover, there is a huge number of corresponding encoded messages to each plaintext so that several encryptions of the same plaintext yield several distinct ciphertexts.
Given the DH1-cipher described above, a natural task is to investigate the statistical properties of the system from several perspectives. For instance, the avalanche effect of the system, a property naturally required in the profession, may be tested by several classical hypothesis tests. Some early results are given in [Dömösi et al., 2017], which confirm that the avalanche effect is fulfilled. However, further tests can and should also be used, in particular those that test whether the output can be distinguished from 'true' random sources. That is why we turned in this paper to the well-known NIST package of statistical tests, which can be considered a 'standard' in the profession for such purposes. Our main aim is to give the results of the NIST tests for the cryptosystem at issue (Section 5). To this end we describe the system (Section 3) together with some theoretical background (Section 2), as well as the necessary details of the experimental analysis done for the tests (Section 4). We show that the system passed all statistical tests in the NIST package.

Theoretical background
Automata are systems that can be used for the transmission of information of a certain type. In a wider sense, every system that accepts signals from its environment and, as a result, changes its internal state can be considered an automaton. By an automaton we mean a deterministic finite automaton without outputs. The automaton $\mathcal{A} = (A, \Sigma, \delta)$ consists of the finite set of states $A$, the finite set of input signals $\Sigma$, and the transition function $\delta : A \times \Sigma \to A$, which is often written in matrix form. The transition matrix of $\mathcal{A}$ has as many rows as there are input signals and as many columns as there are states, and its entries are states. For the sake of simplicity we assume that $A$ and $\Sigma$ are ordered sets. The $j$-th element of the $i$-th row of the transition matrix is the state that the transition function assigns to the pair consisting of the $j$-th state and the $i$-th input signal. If this element is $a$, we say that the $i$-th input signal takes the automaton from its $j$-th state to state $a$. (It is also usual to say that the automaton goes from its $j$-th state to state $a$ by the effect of the $i$-th input signal.) The rows of the transition matrix can be identified with the input signals of the automaton, its columns with its states, and the transition matrix itself with the transition function.
If all the rows of the transition matrix are permutations of the state set then we have a permutation automaton.
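The definition above can be checked mechanically on a transition matrix stored row by row. The following is a minimal sketch, assuming states are encoded as integers and `matrix[i][j]` holds the state reached from the `j`-th state under the `i`-th input; the function names and the two small example matrices are hypothetical and only serve as illustration.

```python
def is_permutation_automaton(matrix, states):
    """Return True if every row of the transition matrix permutes `states`.

    matrix[i][j] is the state reached from states[j] under the i-th input.
    """
    expected = sorted(states)
    return all(sorted(row) == expected for row in matrix)


def has_no_row_repetition(matrix):
    """Equivalent check: no input sends two distinct states to the same state."""
    return all(len(set(row)) == len(row) for row in matrix)


# Hypothetical 3-state, 2-input examples.
STATES = [0, 1, 2]
PERMUTATION_MATRIX = [[1, 2, 0], [0, 2, 1]]
NON_PERMUTATION_MATRIX = [[1, 1, 0], [0, 2, 1]]  # first row repeats state 1

print(is_permutation_automaton(PERMUTATION_MATRIX, STATES))
print(is_permutation_automaton(NON_PERMUTATION_MATRIX, STATES))
```

For a finite state set the two checks agree, since a row without repetition that only contains states must list every state exactly once.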
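Lemma 1. An automaton $\mathcal{A} = (A, \Sigma, \delta)$ is a permutation automaton if and only if for any states $a, b \in A$ and input $x \in \Sigma$, $\delta(a, x) = \delta(b, x)$ implies $a = b$.
Proof. Suppose that $\mathcal{A}$ is a permutation automaton. Then all rows of its transition matrix are permutations of the state set, so none of the rows has a repetition. Therefore, for any states $a, b \in A$ and input $x \in \Sigma$, $\delta(a, x) = \delta(b, x)$ implies $a = b$. Conversely, assume that for any states $a, b \in A$ and input $x \in \Sigma$, $\delta(a, x) = \delta(b, x)$ implies $a = b$. Then none of the rows of the transition matrix has a repetition. Since the state set is finite, all rows are permutations of the state set. This completes the proof.
Let $\mathcal{A}_i = (A_i, \Sigma_i, \delta_i)$, $i \in \{1, \dots, n\}$, be automata, let $\Sigma$ be a finite nonvoid set, and let $\varphi_i : A_1 \times \cdots \times A_n \times \Sigma \to \Sigma_i$, $i \in \{1, \dots, n\}$, be feedback functions. The Gluškov-type product of the automata $\mathcal{A}_i$ with respect to the feedback functions $\varphi_i$ ($i \in \{1, \dots, n\}$) is defined to be the automaton $\mathcal{A} = \mathcal{A}_1 \times \cdots \times \mathcal{A}_n(\Sigma, (\varphi_1, \dots, \varphi_n))$ with state set $A = A_1 \times \cdots \times A_n$, input set $\Sigma$, and transition function $\delta((a_1, \dots, a_n), x) = (\delta_1(a_1, \varphi_1(a_1, \dots, a_n, x)), \dots, \delta_n(a_n, \varphi_n(a_1, \dots, a_n, x)))$ for all $(a_1, \dots, a_n) \in A$ and $x \in \Sigma$.
Figure 1: Gluškov-type product.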
Next we define the concept of temporal product of automata. It is a model for multichannel automata networks where the network may cyclically change its internal structure during its work on each channel.
Let $\mathcal{A}_t = (A, \Sigma_t, \delta_t)$, $t = 1, 2$, be automata having a common state set $A$. Take a finite nonvoid set $\Sigma$ and a mapping $\varphi$ of $\Sigma$ into $\Sigma_1 \times \Sigma_2$. Then the automaton $\mathcal{A} = (A, \Sigma, \delta)$ is a temporal product (t-product) of $\mathcal{A}_1$ by $\mathcal{A}_2$ with respect to $\Sigma$ and $\varphi$ if for any $a \in A$ and $x \in \Sigma$, $\delta(a, x) = \delta_2(\delta_1(a, x_1), x_2)$, where $(x_1, x_2) = \varphi(x)$ (see also Figure 2). The concept of temporal product is generalized in the natural way to an arbitrary finite family of $n > 0$ automata $\mathcal{A}_t$ ($t = 1, \dots, n$), all with the same state set $A$, for any mapping $\varphi : \Sigma \to \prod_{t=1}^{n} \Sigma_t$, by defining $\delta(a, x) = \delta_n(\cdots \delta_2(\delta_1(a, x_1), x_2), \cdots, x_n)$ when $\varphi(x) = (x_1, \dots, x_n)$. In particular, a temporal product of automata with a single factor is just a (one-to-many) relabeling of the input letters of some input-subautomaton of its factor.
Lemma 2. Every temporal product of permutation automata is a permutation automaton.
Proof. It is clear from the above remark that every temporal product of permutation automata with a single factor is a permutation automaton. Now let $\mathcal{A}_t = (A, \Sigma_t, \delta_t)$, $t = 1, 2$, be permutation automata with the same state set $A$. Consider a temporal product of $\mathcal{A}_1$ and $\mathcal{A}_2$ with respect to an arbitrary input set $\Sigma$ and mapping $\varphi : \Sigma \to \Sigma_1 \times \Sigma_2$. We prove that for any $a, b \in A$ and $z \in \Sigma$ with $\varphi(z) = (x, y)$, $\delta_2(\delta_1(a, x), y) = \delta_2(\delta_1(b, x), y)$ implies $a = b$. Indeed, let $\delta_1(a, x) = c$ and $\delta_1(b, x) = d$. Recall that $\mathcal{A}_2$ is a permutation automaton. Therefore, by Lemma 1, $\delta_2(c, y) = \delta_2(d, y)$ implies $c = d$. On the other hand, $\mathcal{A}_1$ is also a permutation automaton. Thus, by Lemma 1, $c = d$ with $\delta_1(a, x) = c$ and $\delta_1(b, x) = d$ implies $a = b$. Applying Lemma 1 again, we obtain that the temporal product of $\mathcal{A}_1$ and $\mathcal{A}_2$ with respect to $\Sigma$ and $\varphi$ is a permutation automaton. Therefore our statement holds for all temporal products having two factors. Now we consider a temporal product of permutation automata $\mathcal{A}_1, \dots, \mathcal{A}_n$, $n > 2$, with respect to a given set $\Sigma$ and mapping $\varphi$.
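Define the mappings $\varphi_1 : \Sigma \to \Sigma_1 \times \Sigma_2$ and $\varphi_i : \Sigma \to \Sigma \times \Sigma_{i+1}$, $i = 2, \dots, n-1$, by $\varphi_1(x) = (x_1, x_2)$ and $\varphi_i(x) = (x, x_{i+1})$ whenever $\varphi(x) = (x_1, \dots, x_n)$. Let $\mathcal{B}_1$ denote the temporal product of $\mathcal{A}_1$ and $\mathcal{A}_2$ with respect to $\Sigma$ and $\varphi_1$, $\mathcal{B}_2$ denote the temporal product of $\mathcal{B}_1$ and $\mathcal{A}_3$ with respect to $\Sigma$ and $\varphi_2$, ..., and $\mathcal{B}_{n-1}$ denote the temporal product of $\mathcal{B}_{n-2}$ and $\mathcal{A}_n$ with respect to $\Sigma$ and $\varphi_{n-1}$.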
Then, using the fact that our statement holds for all temporal products with two factors, we obtain that all of $\mathcal{B}_1, \dots, \mathcal{B}_{n-1}$ are permutation automata. On the other hand, it is clear that $\mathcal{B}_{n-1}$ is equal to the temporal product of the permutation automata $\mathcal{A}_1, \dots, \mathcal{A}_n$ with respect to $\Sigma$ and $\varphi$. Thus the proof is complete.
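Lemma 2 can be illustrated on a small example. The sketch below is only an illustration under stated assumptions: transition tables are stored as dictionaries `table[letter][state]`, the mapping $\varphi$ is a dictionary from product inputs to tuples of factor inputs, and the automata `T1`, `T2` and the mapping `PHI` are hypothetical toy data, not taken from the cryptosystem.

```python
def temporal_product(tables, phi):
    """Build the transition table of a temporal product.

    tables[t][letter][a] is the state reached from a in the t-th factor;
    phi[x] = (x_1, ..., x_n) assigns one input letter per factor.
    Returns product[x][a] = delta_n(... delta_1(a, x_1) ..., x_n).
    """
    n_states = len(next(iter(tables[0].values())))
    product = {}
    for x, letters in phi.items():
        row = []
        for start in range(n_states):
            state = start
            for table, letter in zip(tables, letters):
                state = table[letter][state]
            row.append(state)
        product[x] = row
    return product


def is_permutation_row(row):
    """True if the row lists every state 0..len(row)-1 exactly once."""
    return sorted(row) == list(range(len(row)))


# Hypothetical permutation automata over the common state set {0, 1, 2, 3}.
T1 = {"p": [1, 2, 3, 0], "q": [1, 0, 3, 2]}
T2 = {"r": [3, 2, 1, 0], "s": [0, 2, 1, 3]}
PHI = {0: ("p", "r"), 1: ("q", "s"), 2: ("p", "s")}

PRODUCT = temporal_product([T1, T2], PHI)
print(PRODUCT)
print(all(is_permutation_row(row) for row in PRODUCT.values()))
```

Each row of the product is a composition of two permutations of the state set, which is why the final check succeeds for any choice of permutation factors.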
Given a function $f : X_1 \times \cdots \times X_n \to Y$, we say that $f$ is really independent of its $i$-th variable if for every pair [...]
If $|V| = n$ then we also say that $D$ is a digraph of order $n$. If $V$ can be decomposed into two disjoint (nonempty) subsets $V_1$, $V_2$ such that $V_1$ is the set of all targets and $V_2$ is the set of all sources, then we say that $D$ is a bipartite digraph. If the bipartite digraph $D$ has neither branches nor collapses, then we say that $D$ is a simple bipartite digraph.
An important property of key-automata is explained in the following result.
Theorem 1. Every key-automaton is a permutation automaton.

Proof.
Let $\mathcal{B} = (\Sigma^n, (\Sigma^n)^{2\log_2 n}, \delta_{\mathcal{B}})$ be a key-automaton. By definition, it is a temporal product of automata $\mathcal{A}_{D_1}, \dots, \mathcal{A}_{D_{2\log_2 n}}$ with respect to $(\Sigma^n)^{2\log_2 n}$ and the identity map $\varphi : (\Sigma^n)^{2\log_2 n} \to (\Sigma^n)^{2\log_2 n}$ as defined above. By Lemma 2, it is enough to prove that each of $\mathcal{A}_{D_1}, \dots, \mathcal{A}_{D_{2\log_2 n}}$ is a permutation automaton.
Consider an automaton $\mathcal{A}_D = (\Sigma^n, \Sigma^n, \delta_D)$ with $\mathcal{A}_D \in \{\mathcal{A}_{D_1}, \dots, \mathcal{A}_{D_{2\log_2 n}}\}$ and the simple bipartite digraph $D = (V, E)$ assigned to $\mathcal{A}_D$. Let $V_1$ denote the set of targets and $V_2$ the set of sources of $D$ as before.
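Suppose that the input $x_1 \cdots x_n \in \Sigma^n$ takes $\mathcal{A}_D$ from both of the states $a_1 \cdots a_n$ and $a'_1 \cdots a'_n$ into the same state $b_1 \cdots b_n$, and let $i \in V_1$, $j \in V_2$ with $(j, i) \in E$. Then, by the effect of its input $(a_j \oplus x_j, x_i)$, the $i$-th component of $\mathcal{A}_D$ goes from its state $a_i$ into state $b_i$, and similarly, by the effect of its input $(a'_j \oplus x_j, x_i)$, it goes from its state $a'_i$ into state $b_i$. But then, by the effect of its input $(b_i \oplus x_i, x_j)$, the $j$-th component of $\mathcal{A}_D$ goes from its state $a_j$ into state $b_j$, and similarly, by the effect of the same input $(b_i \oplus x_i, x_j)$, it goes from its state $a'_j$ into state $b_j$.
Recall that $\mathcal{A}_j$ is a permutation automaton. Therefore, applying Lemma 1, $a_j = a'_j$. Using our previous assumptions we can then derive that by the effect of its input $(a_j \oplus x_j, x_i)$ the $i$-th component of $\mathcal{A}_D$ goes from its state $a'_i$ into state $b_i$. On the other hand, we assumed that by the effect of its input $(a_j \oplus x_j, x_i)$ the $i$-th component of $\mathcal{A}_D$ goes from its state $a_i$ into state $b_i$. Applying Lemma 1 again we obtain that $a_i = a'_i$. Thus $a_1 \cdots a_n = a'_1 \cdots a'_n$, and by Lemma 1 $\mathcal{A}_D$ is a permutation automaton.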
The basic idea of the DH1 cryptosystem is to use a finite automaton and a pseudo random generator. The set of states of the automaton consists of all possible plaintext/ciphertext blocks and the input set of the automaton contains all possible pseudo random blocks. The size of the pseudo random blocks is the same as the size of the plaintext/ciphertext blocks. For each plaintext block the pseudo random generator generates the next pseudo random block and the automaton transforms the plaintext block into a ciphertext block by the effect of the pseudo random block.
The key is the transition matrix of the automaton.
It is easy to see that the key must be a permutation automaton, since this property guarantees unambiguous decryption. This condition is satisfied by Theorem 1.
On the other hand, we can have more than one corresponding ciphertext for each plaintext even if we use the same key-automaton. The reason is that we can change the pseudo random numbers generated by the pseudo random generator. We can save a secret number $n$ as a part of the key, and before encryption we can choose a (public) random number $m$. This number $m$ will be the first block of the ciphertext, and before encryption and decryption the seed of the pseudo random number generator can be calculated from $n$ and $m$ with an XOR operation ($n \oplus m$). This way each encryption process uses different pseudo random numbers and yields a different ciphertext for the same plaintext.
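The seed derivation described above amounts to a bytewise XOR of two equally long strings. The following minimal sketch assumes 16-byte values and uses Python's `secrets` module as the source of the public random number; the function name `derive_seed` and the block size constant are illustrative assumptions, not part of the original system description.

```python
import secrets

BLOCK_BYTES = 16  # assumption: 128-bit secret number and public number


def derive_seed(secret_n: bytes, public_m: bytes) -> bytes:
    """Return the generator seed n XOR m for equally long byte strings."""
    if len(secret_n) != len(public_m):
        raise ValueError("secret and public numbers must have the same length")
    return bytes(a ^ b for a, b in zip(secret_n, public_m))


# The secret n is part of the key; a fresh public m is drawn per encryption
# and transmitted as the first ciphertext block.
secret_n = secrets.token_bytes(BLOCK_BYTES)
public_m = secrets.token_bytes(BLOCK_BYTES)
seed = derive_seed(secret_n, public_m)

# The receiver recomputes the same seed from the stored n and the received m.
print(derive_seed(secret_n, public_m) == seed)
```

Because XOR is its own inverse, anyone who knows both the seed and $m$ also learns $n$, so in this scheme the seed has to stay as secret as $n$ itself.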
The problem with this idea is the following. Modern block ciphers operate on fixed-length groups of bits called blocks. The size of the blocks is at least 128 bits (16 bytes), so the size of the transition matrix of the automaton is huge, namely $2^{128} \times 2^{128} \times 16$ bytes, which is impossible to store in memory or on a hard disk. The solution is to use an automata network. A Gluškov-type product of automata consists of smaller component automata and is able to simulate the operation of a huge automaton. In this case we only need to store the transition matrix of the isomorphic component automata, the structure of the composition, and the secret number $n$ used to calculate the seed of the pseudo random number generator.
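As a quick check of the storage estimate, the size of the full transition matrix can be written as a single power of two. The decimal approximation is added here only for orientation.

```latex
\[
\underbrace{2^{128}}_{\text{rows (inputs)}} \times
\underbrace{2^{128}}_{\text{columns (states)}} \times
\underbrace{2^{4}}_{\text{bytes per cell}}
= 2^{260}\ \text{bytes} \approx 1.85 \times 10^{78}\ \text{bytes}.
\]
```

By comparison, the component transition matrix reported in the experimental section has $2^{16} \times 2^{8}$ one-byte cells, that is, $2^{24}$ bytes.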

Encryption and decryption
A symmetric cryptosystem consists of the following:
- a set of plaintexts $P$,
- a set of ciphertexts $C$,
- a key space $K$,
- an encryption function $e : P \times K \to C$, and
- a decryption function $d : C \times K \to P$.
Furthermore, the following property must hold for each $x \in P$ and $k \in K$: $d(e(x, k), k) = x$. Moreover, the cryptosystem is called an approved block cipher if and only if the elements of the set of plaintexts and the set of ciphertexts are at least 128 bits long ($|P| \ge 2^{128}$ and $|C| \ge 2^{128}$).
Our cryptosystem is a block cipher. Both the encryption and the decryption apparatus have a pseudo random generator and a key-automaton.
The encryption procedure is the following. Before the encryption procedure, the pseudo random generator gets its initialization vector as a true random string $r_1 \cdots r_n \in \Sigma^n$, where the pseudo random alphabet $\Sigma$ is also the plaintext and the ciphertext alphabet simultaneously. This initialization vector will also be the first block of the ciphertext.
Then the apparatus reads the plaintext block-by-block and, after reading the next plaintext block $a_1 \cdots a_n \in \Sigma^n$ (the first block first), it generates the second, third, and further blocks of the ciphertext in the following way.
The apparatus takes the key-automaton $\mathcal{B} = (\Sigma^n, (\Sigma^n)^{2\log_2 n}, \delta_{\mathcal{B}})$ into the state $a_1 \cdots a_n \in \Sigma^n$ which coincides with the actual one, i.e. the last received plaintext block.
The last state $a_{2\log_2 n,1} \cdots a_{2\log_2 n,n}$ will be the generated ciphertext block of the plaintext block $a_1 \cdots a_n$.
The $i$-th transition $a_{i,1} \cdots a_{i,n} = \delta_D(a_{i-1,1} \cdots a_{i-1,n}, w_i)$ will be performed in the following way.
Recall that $\mathcal{A}_D$ is a Gluškov product $\mathcal{A}_D = \mathcal{A}_1 \times \cdots \times \mathcal{A}_n(\Sigma^n, (\varphi_1, \dots, \varphi_n))$ of appropriate permutation automata $\mathcal{A}_m = (\Sigma, \Sigma^2, \delta_m)$, $m = 1, \dots, n$, that are state isomorphic to each other, so that for an appropriate bipartite digraph $D = (V, E)$ with the set $V_1$ of targets and $V_2$ of sources we have, for every $(j, i) \in E$ with $i \in V_1$ and $j \in V_2$,
$$a_{k,i} = \delta_i(a_{k-1,i}, (a_{k-1,j} \oplus x_j, x_i)), \qquad a_{k,j} = \delta_j(a_{k-1,j}, (a_{k,i} \oplus x_i, x_j)), \tag{1}$$
where $w_k = x_1 \cdots x_n \in \Sigma^n$ is the actual pseudo random string. Using the transition matrix of $\mathcal{A}_i$, from $a_{k-1,i}, a_{k-1,j}, x_i, x_j$ we can determine $a_{k,i}$ for every $i \in V_1$, $(j, i) \in E$. Moreover, after calculating the values $a_{k,i}$ ($i \in V_1$), using the transition table of $\mathcal{A}_j$, from $a_{k-1,j}, a_{k,i}, x_i, x_j$ we can determine $a_{k,j}$ for every $j \in V_2$, $(j, i) \in E$.
Then, concatenating the calculated blocks, we will get the ciphertext.
The decryption procedure is the following. As before, prior to decryption the pseudo random generator gets the first ciphertext block as its initialization vector $r_1 \cdots r_n \in \Sigma^n$.
Then the apparatus reads the ciphertext block-by-block and, after reading the next ciphertext block $c_1 \cdots c_n \in \Sigma^n$ (the first block first), it generates the blocks of the plaintext in the following way.
The apparatus determines the state $a_1 \cdots a_n \in \Sigma^n$ of the key-automaton $\mathcal{B} = (\Sigma^n, (\Sigma^n)^{2\log_2 n}, \delta_{\mathcal{B}})$ from which $\mathcal{B}$ is taken into the state $c_1 \cdots c_n \in \Sigma^n$ by the effect of $2\log_2 n$ consecutive strings in $\Sigma^n$ generated by the pseudo random generator.
Thus the pseudo random generator should generate $2\log_2 n$ pseudo random strings $w_1, \dots, w_{2\log_2 n} \in \Sigma^n$, and going back from the last member $w_{2\log_2 n}$ to the first one $w_1$ the following procedure is performed.
The last state $a_{0,1} \cdots a_{0,n}$ will be the generated plaintext block of the ciphertext block $c_1 \cdots c_n$.
The state $a_{i-1,1} \cdots a_{i-1,n}$ preceding the $i$-th state transition $a_{i,1} \cdots a_{i,n} = \delta_D(a_{i-1,1} \cdots a_{i-1,n}, w_i)$ is determined in the following way.
Recall again that $\mathcal{A}_D$ is a Gluškov product $\mathcal{A}_D = \mathcal{A}_1 \times \cdots \times \mathcal{A}_n(\Sigma^n, (\varphi_1, \dots, \varphi_n))$ of appropriate permutation automata $\mathcal{A}_m = (\Sigma, \Sigma^2, \delta_m)$, $m = 1, \dots, n$, that are state isomorphic to each other, so that for an appropriate bipartite digraph $D = (V, E)$ with the set $V_1$ of targets and $V_2$ of sources we conclude as in (1).
Recall also that all of $\mathcal{A}_1, \dots, \mathcal{A}_n$ are permutation automata. Therefore, for every $a_{k,i}, a_{k,j}, x_i, x_j$ with $j \in V_2$, $(j, i) \in E$, there exists exactly one $a_{k-1,j}$ with $a_{k,j} = \delta_j(a_{k-1,j}, (a_{k,i} \oplus x_i, x_j))$. Thus, using the transition table we can unambiguously determine $a_{k-1,j}$ for every $j \in V_2$. Moreover, for every $a_{k,i}, a_{k-1,j}, x_i, x_j$ with $i \in V_1$, $(j, i) \in E$, there exists exactly one $a_{k-1,i}$ with $a_{k,i} = \delta_i(a_{k-1,i}, (a_{k-1,j} \oplus x_j, x_i))$. Therefore, using the transition table again we can unambiguously determine $a_{k-1,i}$ as well for every $i \in V_1$.
Then by concatenating the determined plaintext blocks we will get the plaintext back.
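The round structure described above can be sketched in a few lines. This is a toy illustration under several assumptions that are not taken from the paper: components have 4-bit states, all components share one randomly generated table (a simplification of "state isomorphic"), the four digraphs in `DIGRAPHS` are hypothetical pairings of sources and targets, and `random.Random` stands in for the pseudo random generator although it is not cryptographically secure. Each edge `(j, i)` updates the target `i` first and then the source `j`, and decryption undoes the rounds in reverse order.

```python
import random

STATE_BITS = 4                  # assumption: 4-bit components keep the table small
N_STATES = 1 << STATE_BITS


def random_component(rng):
    """Random permutation automaton (Sigma, Sigma^2, delta) as a lookup table.

    table[(u, v)][a] is the state reached from a under the input pair (u, v).
    """
    table = {}
    for u in range(N_STATES):
        for v in range(N_STATES):
            row = list(range(N_STATES))
            rng.shuffle(row)
            table[(u, v)] = row
    return table


def invert(table):
    """Invert every row: inverse[(u, v)][b] is the unique predecessor of b."""
    inverse = {}
    for pair, row in table.items():
        inv_row = [0] * len(row)
        for a, b in enumerate(row):
            inv_row[b] = a
        inverse[pair] = inv_row
    return inverse


def encrypt_round(state, w, edges, table):
    """One transition; every edge (j, i) has source j and target i."""
    new = list(state)
    for j, i in edges:
        new[i] = table[(state[j] ^ w[j], w[i])][state[i]]   # target first
        new[j] = table[(new[i] ^ w[i], w[j])][state[j]]     # then source
    return new


def decrypt_round(state, w, edges, inverse):
    """Undo one transition: recover the source first, then the target."""
    old = list(state)
    for j, i in edges:
        old[j] = inverse[(state[i] ^ w[i], w[j])][state[j]]
        old[i] = inverse[(old[j] ^ w[j], w[i])][state[i]]
    return old


def encrypt_block(block, ws, digraphs, table):
    for w, edges in zip(ws, digraphs):
        block = encrypt_round(block, w, edges, table)
    return block


def decrypt_block(block, ws, digraphs, inverse):
    for w, edges in zip(reversed(ws), reversed(digraphs)):
        block = decrypt_round(block, w, edges, inverse)
    return block


rng = random.Random(1)          # stand-in generator, NOT cryptographically secure
TABLE = random_component(rng)
INVERSE = invert(TABLE)
# Hypothetical pairings for n = 4 components and 2*log2(4) = 4 rounds.
DIGRAPHS = [[(1, 0), (3, 2)], [(2, 0), (3, 1)], [(0, 1), (2, 3)], [(0, 2), (1, 3)]]
WS = [[rng.randrange(N_STATES) for _ in range(4)] for _ in DIGRAPHS]

plain = [3, 14, 7, 0]
cipher = encrypt_block(plain, WS, DIGRAPHS, TABLE)
print(decrypt_block(cipher, WS, DIGRAPHS, INVERSE) == plain)
```

The round trip works because every table row is a permutation, so each lookup in `encrypt_round` has a unique inverse lookup in `decrypt_round`.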
To sum up, the discussed cryptosystem is a block cipher. Because of Theorem 1, for every ciphertext there exists exactly one plaintext making the encryption and decryption unambiguous. Moreover, there is a huge number of corresponding encoded messages to each plaintext so that several encryptions of the same plaintext yield several distinct ciphertexts.

Experimental results
The practical test was done using 16-byte (128-bit) input blocks, output blocks and pseudo random blocks. First we present the size of the key space, then we continue with the test results on the speed of the algorithm, and finally on the effectiveness of the avalanche effect.
Using the above mentioned parameters with 256 possible states (1-byte states) we need 16 automata, each having a transition matrix of $2^{16} = 65536$ rows and $2^{8} = 256$ columns. Each cell of the matrix contains 1 byte of data (one state). The size of the matrix is 16 megabytes and the number of possible matrices is $(256!)^{65536}$, where the exclamation mark denotes the factorial. This is far more than enough protection against brute-force attacks. When we use isomorphic automata this huge number increases further to $(256!)^{65536} \cdot (256!)^{15} = (256!)^{65551}$ possible keys. Using the above mentioned parameters with half-byte (4-bit) states, we need 32 automata, each having a transition matrix of $2^{8} = 256$ rows and $2^{4} = 16$ columns, and each cell of the matrix contains half a byte of data. In this case the size of the matrix is only 2 kilobytes and the number of possible matrices is $(16!)^{256}$. Using isomorphic automata this can be increased to $(16!)^{287}$ possible keys, which is still more than enough against brute-force attacks. However, we recommend the 8-bit version, because the number of calculations during the encoding and decoding process is smaller and the avalanche effect is stronger.
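The table sizes and key counts above are easier to compare on a logarithmic scale. The sketch below recomputes them, assuming the key count has the form $(s!)^{r + c - 1}$ with $s$ states, $r = s^2$ rows and $c$ components, which matches the exponents 65551 and 287 quoted in the text; the function names are illustrative.

```python
import math


def log2_factorial(n):
    """log2(n!) computed via the log-gamma function to avoid huge integers."""
    return math.lgamma(n + 1) / math.log(2)


def table_bytes(state_bits):
    """Size of one component transition matrix in bytes."""
    rows = 2 ** (2 * state_bits)     # one row per input pair
    cols = 2 ** state_bits           # one column per state
    return rows * cols * state_bits // 8


def key_bits(state_bits, components):
    """log2 of the key count (s!)^(rows + components - 1)."""
    states = 2 ** state_bits
    rows = states ** 2
    return (rows + components - 1) * log2_factorial(states)


for bits, components in ((8, 16), (4, 32)):
    print(bits, "bit states:", table_bytes(bits), "bytes per table,",
          round(key_bits(bits, components)), "bits of key space")
```

Both variants are far beyond the reach of exhaustive search, so the choice between them rests on speed and diffusion rather than on key space size.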
The practical test of the encoding and decoding algorithm was done on an average desktop PC (3.1 GHz Intel Core i3-2100 processor, 4 gigabytes of RAM). The program we used was a well-written C# implementation. The results of the speed tests of the 8-bit version can be seen in Table 1.
The results of the speed tests show that on an average PC the encoding speed is more than 4 megabytes per second, and the decoding speed is about the same.
The avalanche effect is a very important property of block ciphers. A block cipher is said to have the avalanche effect when a small change in the plaintext block results in a significant change in the corresponding ciphertext block and, conversely, a small change in the ciphertext block results in a significant change in the corresponding plaintext block. We tested the avalanche effect in the following way. We chose 1000000 random plaintext blocks, encoded them, then changed 1 bit in each plaintext block, encoded again, and calculated the number of different bytes in the ciphertext blocks pair-wise. We also tested the opposite case: we chose 1000000 random ciphertext blocks, decoded them, then changed 1 bit in each ciphertext block, decoded again, and calculated the number of different bytes in each plaintext block pair-wise. During the first test we used just the first two rounds of encoding and decoding. The results can be seen in Table 2. When we change only one bit in the plaintext block, the difference between the corresponding ciphertext blocks is very large in the majority of cases. The same effect can be seen in the opposite case: changing one bit in the ciphertext block results in a very large difference in the plaintext block as well. Although this was a good result, we also made a further test with the full 4-round algorithm. The results can be seen in Table 3.
Furthermore, we calculated the optimal avalanche effect. For this, we chose 2 × 1000000 completely random blocks and then calculated the difference between them pair-wise. The results are in Table 4. We can assume that using the 8-bit version of the algorithm with 128-bit blocks and 4 rounds, the algorithm has the maximal avalanche effect and an appropriate speed (4 megabytes per second). Of course the speed of the algorithm depends on the hardware, the programming language and the actual program code as well.
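The measurement used in these tests, the number of differing bytes between two blocks, and the random baseline can be sketched as follows. This is only an illustration: the helper names are hypothetical, `random.Random` is used for reproducibility, and the cipher itself is not included. For two independent uniformly random 16-byte blocks each byte position differs with probability 255/256, so the expected number of differing bytes is 16 · 255/256 = 15.9375.

```python
import random


def differing_bytes(x: bytes, y: bytes) -> int:
    """Number of positions at which two equally long blocks differ."""
    return sum(a != b for a, b in zip(x, y))


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of the block with one bit inverted."""
    data = bytearray(block)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)


def random_baseline(pairs: int, block_bytes: int = 16, seed: int = 0) -> float:
    """Mean number of differing bytes over pairs of random blocks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(pairs):
        x = bytes(rng.randrange(256) for _ in range(block_bytes))
        y = bytes(rng.randrange(256) for _ in range(block_bytes))
        total += differing_bytes(x, y)
    return total / pairs


block = bytes(range(16))
print(differing_bytes(block, flip_bit(block, 5)))   # a one-bit change touches one byte
print(random_baseline(20000), 16 * 255 / 256)
```

A cipher with a full avalanche effect should produce a distribution of differing bytes after a one-bit input change that is close to this random baseline.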

The NIST test
The National Institute of Standards and Technology (NIST) published a statistical package consisting of 15 statistical tests that were developed to test the randomness of arbitrarily long binary sequences produced by either hardware or software based cryptographic random or pseudo random number generators. For each statistical test a set of P-values was produced. Given a significance level α, if the P-value is less than or equal to α then the test suggests that the observed data is inconsistent with our null hypothesis, i.e. the 'hypothesis of randomness', so we reject it. We used α = 0.01, as is common for such problems in cryptography. An α of 0.01 indicates that one would expect 1 sequence in 100 to be rejected under the null hypothesis. Hence a P-value exceeding 0.01 means that the sequence is considered to be random, and a P-value less than or equal to 0.01 leads to the conclusion that the sequence is non-random.
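To make the decision rule concrete, the simplest test of the suite, the frequency (monobit) test, can be written in a few lines. This is a minimal sketch of that one test as described in the NIST documentation, not the implementation used for the experiments; the function names are illustrative, and the 10-bit input is the small worked example from the NIST documentation.

```python
import math


def monobit_p_value(bits: str) -> float:
    """P-value of the frequency (monobit) test for a string of '0'/'1'."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1 and sum; a random sequence keeps |s| small.
    s = sum(1 if b == "1" else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(2 * n))


def is_considered_random(p_value: float, alpha: float = 0.01) -> bool:
    """Decision rule from the text: reject when the P-value is <= alpha."""
    return p_value > alpha


p = monobit_p_value("1011010101")
print(round(p, 6), is_considered_random(p))
```

A heavily biased sequence, for instance one consisting only of ones, drives the P-value towards zero and is rejected at α = 0.01.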
One of the criteria used to evaluate the AES candidate algorithms was their demonstrated suitability as random number generators. That is, the evaluation of their output utilizing statistical tests should not provide any means by which to distinguish them computationally from a truly random source. Randomness testing was performed using the same parameters as for the AES candidates in order to achieve the most reliable and comparable results. First the input parameters, such as the sequence length, sample size, and significance level, were fixed. Namely, these parameters were set at $2^{20}$ bits, 300 binary sequences, and α = 0.01, respectively. Furthermore, Table 5 shows the length parameters we used.
In order to analyze the output of the algorithm we encrypted the Rockyou database, which contains more than 32 million cleartext passwords (see e.g. [Tihanyi et al., 2015]). Applying the NIST tests to the encrypted file, it turned out that the output of the algorithm cannot be distinguished in polynomial time from true random sources by statistical tests. The exact p-values of the evaluation of the ciphertext are shown in Table 6. We also tested the uniformity of the distribution of the p-values obtained by the statistical tests included in the NIST package, which is a usual requirement in the literature (see e.g. [Rukhin et al., 2010]). The uniformity of the p-values provides no additional information about the type of the cryptosystem. We have also shown that the proportions of binary sequences which passed at the 0.01 level lie in the required confidence interval (see e.g. [Rukhin et al., 2010]).
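The proportion check mentioned above can be reproduced numerically. The sketch assumes the acceptance interval given in the NIST documentation, $\hat{p} \pm 3\sqrt{\hat{p}(1-\hat{p})/m}$ with $\hat{p} = 1 - \alpha$ and $m$ the number of tested sequences, and applies it to the sample size of 300 used here; the function names are illustrative.

```python
import math


def proportion_interval(alpha: float, m: int):
    """Acceptance interval for the proportion of sequences passing a test."""
    p_hat = 1 - alpha
    half_width = 3 * math.sqrt(p_hat * (1 - p_hat) / m)
    return p_hat - half_width, p_hat + half_width


def min_passing_sequences(alpha: float, m: int) -> int:
    """Smallest number of passing sequences inside the acceptance interval."""
    lower, _ = proportion_interval(alpha, m)
    return math.ceil(lower * m)


lower, upper = proportion_interval(0.01, 300)
print(round(lower, 6), round(upper, 6))
print(min_passing_sequences(0.01, 300))
```

Under these assumptions at least 292 of the 300 sequences have to pass each test for the proportion to be acceptable.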

Conclusions
The output of our crypto algorithm has passed all statistical tests of the NIST suite we performed, and we were not able to distinguish it from true random sources by statistical methods. Statistical analysis of a cryptosystem is a must-have requirement, and these test results are good indicators that further analyses can and should be done in order to check further properties. Cryptanalysis methods such as chosen-plaintext, known-plaintext and related-key attacks will be used to prove or disprove the strength of the cryptosystem. These problems are the subject of our future research.
Table 5: Length parameters used in the NIST tests.
Test                        Length parameter
Non-overlapping Template    9
Overlapping Template        9
Approximate Entropy         10
Serial                      16
Linear Complexity           500