Math.Comput.Sci. (2018) 12:319–337 https://doi.org/10.1007/s11786-018-0348-2
Mathematics in Computer Science
Smoothness Test for Polynomials Defined Over Small Characteristic Finite Fields Gora Adj · Isaac Canales-Martínez · Luis Rivera-Zamarripa · Francisco Rodríguez-Henríquez
Received: 6 June 2017 / Revised: 6 April 2018 / Accepted: 26 April 2018 / Published online: 27 June 2018 © Springer International Publishing AG, part of Springer Nature 2018
Abstract The problem of determining whether a polynomial defined over a finite field ring is smooth or not with respect to a given degree, is the most intensive arithmetic operation of the so-called descent phase of index-calculus algorithms. In this paper, we present an analysis and efficient implementation of Coppersmith’s smoothness test for polynomials defined over finite fields with characteristic three. As a case study, we review the best strategies for obtaining a fast field and polynomial arithmetic for polynomials defined over the ring Fq [X ], with q = 36 , and report the timings achieved by our library when computing the smoothness test applied to polynomials of several degrees defined in that ring. This software library was recently used in Adj et al. (Cryptology 2016. http://eprint. iacr.org/2016/914), as a building block for achieving a record computation of discrete logarithms over the 4841-bit field F36·509 . Keywords Smoothness test · Index-calculus · Polynomial rings Mathematics Subject Classification 11A51 · 11T06 · 11T71 · 11T55
1
Adleman mentioned that Ronald Rivest suggested him to use the term smoothness for this purpose.
G. Adj (B) · I. Canales-Martínez · F. Rodríguez-Henríquez Computer Science Department, CINVESTAV-IPN, Mexico City, Mexico e-mail:
[email protected] I. Canales-Martínez e-mail:
[email protected] F. Rodríguez-Henríquez e-mail:
[email protected] L. Rivera-Zamarripa Centro de Investigación en Computación del Instituto Politécnico Nacional CIC-IPN, Mexico City, Mexico e-mail:
[email protected]
320
G. Adj et al.
1 Introduction In [5], Addleman coined the term smooth number, as an integer that factors into small prime numbers smaller than a given bound.1 A few years later, in [12], Hellman and Reyneri generalized Adleman’s algorithm for Fq , with q = p m , thus extending the smoothness property to polynomials. Let Fq [x] denote the polynomial ring in the variable x over the finite field Fq of order q. Any polynomial f ∈ Fq [X ] of degree greater than zero can be expressed as a product of irreducible factors, f = a f 1e1 . . . f kek ,
(1)
where a ∈ Fq , and f 1 , . . . , f k are pairwise distinct monic irreducible polynomials over Fq [X ], and where e1 , . . . , ek are positive integers. The above factorization of the polynomial f is unique up to the order of the factors. Given f = a i f iei ∈ Fq [X ] and m ≥ 1, f is said to be m-smooth if the degree of each one of the factors f i is less or equal than m. Let f k be the irreducible factor of the polynomial f with the greatest degree. Then, by definition f is m-smooth for all m ≥ degree( f k ). Given a non-zero degree polynomial f defined over the polynomial ring Fq [X ], we are interested in knowing if f is m-smooth, for some fixed value m ≥ 1. For most cryptographic applications, one does not need to know the minimum smoothness value of the polynomial f , but rather, one simply wants to determine whether f is m-smooth or not. This question can be answered by following two different approaches. The first one consists of finding the exact factorization of f as shown in Eq. (1), and then verifying if all the irreducible factors have degree less or equal than m or not. The second option consists of performing a smoothness test over f . The output of this test will answer yes or no to the question of whether f is or not m-smooth, but will not find the exact factorization of f , nor its minimum smoothness value. The drawback associated with the factorization approach is that its associated computational cost tends to be significantly higher than performing a simpler smoothness test. On the other hand, the smoothness test may sometimes produce false positives (although never false negatives). In any case, as it is shown in this paper, for our case study the probability for the occurrence of false positives in the polynomial ring F36 [X ] is negligibly small. As mentioned above, finding whether a polynomial is m-smooth or not, is a crucial building block in the computation of discrete logarithms over finite field extensions that are based on index-calculus methods, as it is briefly explain next.
1.1 Index-Calculus Method for Computing Discrete Logarithms Over Small Characteristic Fields Let Fq∗ n denote the finite field of order N = q n − 1, where q = p , p is a prime number and , n are positive integers. The elements of Fq∗ n can be represented as polynomials of degree at most n − 1 over Fq . Given a generator g ∈ Fq , and an element h ∈ Fq , the discrete logarithm problem (DLP) in Fq∗ n is that of determining the integer ∈ [0, N − 1] satisfying h = g . In his 1979 landmark paper [5], Leonard Adleman presented an index-calculus algorithm that can be used to compute discrete logarithms over a prime field F p . Over the years, Adleman’s method was extended to address the computation of discrete logarithms over finite field extensions of the form Fq∗ n . In particular, in this paper we are interested in the problem of computing discrete logarithms over fields with small characteristic, where the prime p is chosen to be 3.2 The standard approach for computing the discrete logarithm of h using the index-calculus method can be summarized as follows. First, this algorithm finds the logarithms (mod N ) of all the elements in Fq∗ n included in the so-called factor base. Then, in a descent stage, logg h is expressed as a linear combination of logarithms of the elements in the factor base. The descent stage proceeds in several steps, each expressing the logarithm of a degree-D element as a linear combination of the logarithms of elements of degree ≤ m for some m < D. Most of these steps 2
Similar strategies applies when one selects p = 2.
Smoothness test for polynomials defined...
321
Continuedfraction descent
Classical Descent
Gr¨obner basis descent
Factor basis Fig. 1 Index-calculus descent phases for computing discrete logarithms over small characteristic finite fields of the form Fq n
end their computation when, after a typically very large number of trials, a polynomial with the required smoothness is finally found. Hence, the computational complexity of the descent stage directly depends on the computational cost of the smoothness test. Specifically targeting small characteristic fields, Joux [14] and Gölo˘glu et al. [9], presented a new DLP algorithm with a running time of L Q [ 41 + o(1), c], for some undetermined c. Their algorithm has three different descent stages (see Fig. 1), namely, the continued-fraction descent, the classical descent and the Gröbner basis descent.
1.2 Our Contributions In [1], a record computation of the discrete logarithm problem over the field F36·509 was reported. The authors of [1] shown that Fq=36 is the largest proper subfield of F36·509 where the index calculus method can be performed effectively. Hence, from the efficiency point of view it just makes sense to represent the field elements of F36·509 as univariate polynomials defined over the ring Fq [X ] partitioned by a 509-degree monic polynomial irreducible over Fq . In this setting, the main task in the continued-fraction descent shown in Fig. 1 is that of finding a randomly chosen 254-degree polynomial defined over the ring Fq [X ] that happens to be m-smooth, where m is a positive integer smaller than, say, 40. Since this task is computationally highly demanding, the efficient implementation of the smoothness test becomes a crucial ingredient for achieving the discrete logarithm computation over the field F36·509 in a reasonable time. Hence, the main contribution of this paper is to present all the design details towards an efficient implementation of the smoothness test for ring polynomials defined over Fq [X ]. In this paper we report a baseline reference C implementation of the smoothness test when dealing with ring polynomials of different degrees. Our implementation was supported by a careful analysis of the most efficient field and polynomial arithmetic algorithms, along with the smoothness test strategy outlined in [3,9]. Furthermore,
322
G. Adj et al.
we provide a rigorous analysis of the probability of incorrect outcomes of the smoothness test. We also report experiments showing that m-smooth polynomials of degree n, with n > m > 0, closely follow a Gaussian distribution as theoretically predicted by the analysis given in [3,8]. The remainder of this paper is organized as follows. In Sects. 2–3, we discuss the best possible approaches for a fast and efficient software implementation of the field and polynomial arithmetic over small finite fields of the form Fq , and polynomial rings of the form Fq [X ], respectively. Then, in Sect. 4 we provide a formal analysis of the probability of incorrect outcomes of Coppersmith’s smoothness test as well as a detailed description of how to implement it according to the approach proposed in [9]. In Sect. 5 we report the performance achieved by our software in the implementation of all three of them, the field and polynomial arithmetic, and the smoothness test for polynomials of different degrees defined over F36 [X ]. We conclude this Section by reporting the experimental computational effort associated to the task of finding a randomly chosen degree-254 polynomial in the ring F36 [X ] that happens to be 40-smooth. Finding such polynomial is the main task of the continued-fraction descent depicted in Fig. 1 when computing discrete logarithms over the 4841-bit field F36·509 (see [1] for more details). 2 Field Arithmetic Over Fq This Section describes how to perform the main arithmetic operations over small characteristic fields of the form Fq , with q = p , p a small prime. In particular, a detailed description of the sixth extension of the ternary field F3 , will be given. Field arithmetic over a sufficiently small finite field Fq can be performed via look-up tables. This method is independent of any particular field representation. The basic idea is that all possible pairwise combinations of the field elements can be pre-computed and stored in a look-up table. This way, addition and multiplication lookup tables require a memory space of q 2 field elements each, whereas the lookup tables for additive and multiplicative inverses require to store q field elements each. In the remaining of this section we will discuss two different ways of representing the field elements of Fq , and the best strategies for computing elementary arithmetic operations according to the field representation choice.
2.1 Polynomial Representation Let Fq be a prime extension field of the field F p . Since Fq ∼ = F p [X ]/(I ) for some degree-n polynomial I ∈ F p [X ] irreducible over F p , the elements in Fq can be represented as polynomials of degree < n with coefficients in F p . Field addition and substraction are especially efficient using this representation. However, field multiplication and exponentiation become relatively expensive. Although additive inverses can be computed at almost no cost, calculating multiplicative inverses is a relatively expensive operation. Hence, when using polynomial representation it is generally more optimal to perform field addition and subtraction on run-time and perform the other arithmetic operations using pre-computed look-up tables.
2.2 Exponential Representation Let g ∈ Fq be a generator of Fq∗ . Then, each nonzero element in a ∈ Fq∗ can be represented as a power of g, i.e, a = g i with i ∈ {0, . . . , q − 2}. One drawback of this representation is that the zero element cannot be represented as a power of g. Hence, the nonzero elements of Fq can be represented in exponential form by means of the mapping ψ : Fq → {0, . . . , q − 2} ∪ {∞}, defined as follows, ∞ if a = 0, ψ(a) = i if a = g i .
Smoothness test for polynomials defined...
323
The exponential representation leads to extremely fast arithmetic operations over Fq∗ . In particular, this representation allows to compute additive inverses efficiently. Nevertheless, field addition cannot be computed in a straightforward fashion. Let a = g m , b = g n ∈ Fq and e > 0, all the main arithmetic operations, except for the addition, can be computed using integer arithmetic modulo q − 1 as summarized next. Multiplication Additive inverse
a · b = g m · g n = g (m+n) mod q−1 − a = −1 · g m = g α g m = g (α+m) mod q−1 , where, α =
Multiplicative inverse Exponentiation
a −1 = (g m )−1 = g (−m) mod q−1
q −1 2
a e = (g m )e = g (me) mod q−1
Using the identity g (q−1)/2 = −1, it becomes easy to compute additive inverses. Notice also that the above operations have an exceptional case whenever a = 0 or b = 0, which needs to be handled separately. It is worth to remark that if the memory resources are plenty, then instead of actually computing on run-time the operations modulo q − 1 as listed above, one can always perform the field arithmetic Fq by means of look-up tables. In order to perform the field addition operation efficiently, the following strategies were considered, a. To use a look-up table for the addition b. To precompute two tables that allow us to go from exponential to polinomial representation and vice versa. By using a look-up table, the operands are translated from exponential to polynomial representation. Then, the addition is performed coefficient-wise and finally the result is translated to exponential representation c. To use the Zech’s algorithm (cf. Sect. 2.3)
2.3 Zech’s Logarithm The Zech’s logarithm (also known as Jacobi’s logarithm [13]), is a method for computing the addition when the elements are represented using exponential representation [16,17]. Basically, this method computes the addition of two elements using a multiplication and a precomputed lookup table. Once again, let us assume that the elements of the field Fq are represented in exponential form in base g. Then, the integer Zech’s function Z is defined as, g Z (m) = 1+g m . Let a = g m , b = g n ∈ Fq , then the addition/subtraction of these two elements can be computed as, a + b = g m + g n = g m g Z (n−m) = g m (1 + g (n−m) mod q−1 ), a − b = g m + (−g n ) = g m g Z (α+n−m) = g m (1 + g (α+n−m) mod q−1 ), where α = (q − 1)/2. Note that g Z (α) = (1 + g α ) = 0 = ∞. As it was discussed before, the special cases when a = 0 or b = 0 must be treated as special situations. This strategy requires the precomputation of a single lookup table storing a total of q values.
3 Arithmetic in the Polynomial Ring In this Section we discuss the polynomial arithmetic over a finite field Fq . The computational cost of the different algorithms are given in terms of the number of operations over the field Fq where the polynomial ring is defined.
324
G. Adj et al.
3.1 Addition and Subtraction n n Let f = i=0 f i X i and g = i=0 gi X i be two degree-n polynomials defined over Fq [X ]. Then, the addition h = f + g and the subtraction h = f − g are defined as, h=
n i=0
hi X i =
n n n ( f i + gi )X i and h = h i X i = ( f i − gi )X i . i=0
i=0
i=0
Since the cost of the polynomial addition and subtraction are similar, both operations will be referred as addition. The cost of polynomial addition is O(n) operations in Fq . 3.2 Multiplication n Let f = i=0 f i X i and g = mj=0 g j X j be two degree-n polynomials defined over Fq [X ]. Then, the polynomial multiplication h = f · g is defined as, h=
n+m k=0
h k X k , with h k =
fi g j .
0≤i≤n 0≤ j≤m i+ j=k
The polynomial multiplication can be performed by means of the schoolbook method at a quadratic computational complexity of O(n 2 ) field operations. More concretely, this classical strategy requires n 2 and (n − 1)2 field multiplications and additions, respectively. A second option is to perform a polynomial multiplication using the Karatsuba approach [15], which computes the polynomial multiplication at a cost of O(n log2 3 ) field operations. However, depending on the size of the operands, the Karatsuba approach tends to require more additions operations than what the schoolbook method would need. The number of field operations required by this method are n log2 3 and 6n log2 3 − 8n + 2 field multiplications and additions, respectively [18,21]. We stress that the field arithmetic costs associated to an exponential representation, lead to a rather unusual situation, namely, the fact that field addition is considerably more costly than field multiplication. This peculiar ratio has a sensible role in choosing the polynomial multiplication method. The solution that was implemented in this work consisted of combining the usage of the Karatsuba and the schoolbook multiplication strategies according to the following approach. First, the Karatsuba multiplication is performed until a certain pre-fixed level of recursion. After that, the schoolbook method takes over to complete the smaller degree products required by the first layer of the Karatsuba approach. Algorithm 1 applies this combined strategy, i.e., it computes the polynomial product by first using a Karatsuba approach until a certain threshold is reached. Then, the school-book method is adopted to end up the recursion. One can determine an optimal threshold by comparing the costs of combining the Karatsuba approach and the i schoolbook methods when multiplying n-coefficient polynomials after selecting as a potential limit, n/2 -coefficient polynomial multiplication for i ∈ {0, . . . , log2 n }. We assume an addition/multiplication ratio of 2.8 as supported by our experiments (cf. Table 2). Since the continued-fraction descent phase as described in [3] for the ring F36 [X ], requires the computation of polynomial products and divisions with up to 255 coefficients, we selected n = 255 for all of our experiments. Table 1 reports the computational costs of different combinations of the Karatsuba and the schoolbook approaches. In order to illustrate the figures presented in Table 1, let us consider the following example. Example 3.1 When i = 5, one can use the schoolbook method to multiply polynomials with n S B = 8 coefficients, and the Karatsuba strategy for computing the product of polynomials with n K = 32 coefficients. For this choice, the polynomial operands can be seen as a 32−coefficient polynomial, where each coefficient is in turn an eightcoefficient polynomial, where all these eight coefficients lie in Fq .
Smoothness test for polynomials defined...
325
Table 1 Study of different splittings for the computation of 256-coefficient polynomial product i
School-book method
Karatsuba method
nSB =
nK =
256 2i
# adds.
# mults.
Total cost (in Mq )
256 nSB
Cost (in Mq )
# adds.
# mults.
0
256
65,025
65,536
247,606.0
1
0
1
247,606
1
128
16,129
16,384
61,545.2
2
4
3
186,070
2
64
3969
4096
15,209.2
4
24
9
141,184
3
32
961
1024
3714.8
8
100
27
109,260
4
16
225
256
886.0
16
360
81
87,894
5
8
49
64
201.2
32
1204
243
75,862
6
4
9
16
41.2
64
3864
729
73,312
7
2
1
4
6.8
128
12,100
2187
82,632
8
1
0
1
1.0
256
37,320
6561
111,057
Mq denotes the cost of performing one field multiplication over F36 , and computing one field addition costs 2.8Mq . Columns 3 and 4 show the number of additions and multiplications required by the schoolbook method when dealing with polynomial operands having n S B coefficients as reported in Column 2. Columns 7–8 show the number of polynomial additions and multiplications required by the Karatsuba approach when dealing with polynomial operands having n K coefficients, where each coefficient is in turn an n S B -coefficient polynomial. In this case, the cost of a polynomial addition is of 2.8n S B Mq . The corresponding polynomial multiplication cost is shown in Column 5. The total cost of computing the 256-coefficient polynomial product according to the particular combination choice of the Karatsuba and the School-book method is reported in Column 9
Considering the space complexity of the schoolbook method, it is easy to see that a total of 64 and 49 multiplications and additions over Fq would be required. This is equivalent to (49 × 2.8) + 64 = 201.2 field multiplications, which will be denoted in the following as Mq . Furthermore, a polynomial multiplication of operands having n K = 32 coefficients require the computation of 243 and 1204 eight-coefficient polynomial multiplications and additions, respectively. As it was discussed above, the computational cost associated to each such polynomial multiplication and addition is of 201.2Mq and 2.8 × 8 = 22.4Mq . Hence, the total cost of multiplying two 255-coefficient polynomials using the above splitting between the Karatsuba and the schoolbook approaches is of, (1204 × 2.8 × 8) + (243 × 201.2) = 75,862Mq .
Algorithm 1 PolynomialMult Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], g = (gn , . . . , g0 ) ∈ Fq [X ] Output: h = (h 2n , . . . , h 0 ) ∈ Fq [X ] 1: if n ≤ Karatsuba_limit then 2: return PolynomialMultSchoolBook( f, g) 3: end if 4: m ← (n + 1)/2 5: f A , f B ← ( f n , . . . , f m ), ( f m−1 , . . . , f 0 ) 6: g A , g B ← (gn , . . . , gm ), (gm−1 , . . . , g0 ) 7: t0 ←PolynomialMult( f B , g B ) 8: t2 ←PolynomialMult( f A , g A ) 9: t1 ←PolynomialMult( f A + f B , gA + gB ) 10: return t2 X 2m + (t1 − t0 − t2 )X m + t0
326
G. Adj et al.
3.3 Division In this subsection we explained one classical way to compute this polynomial arithmetic operation, along with our own proposal for a more specialized reduction method. Theorem 3.1 (Polynomial division [16, Theorem 1.52]) Let g = 0 be a polynomial in Fq [X ], then for all f ∈ Fq [X ], there exist unique polynomials q, r ∈ Fq such that f = qg + r with degree(r ) < degree(g). Algorithm 2 performs the polynomial division operation f /g. In Step 4, a multiplicative inverse over the field Fq is computed. The operations in Steps 7–8 are performed at most n − m + 1 times. In Step 7, a multiplication over Fq is performed, and Step 8, computes qi · g using m multiplications and m additions over Fq , whereas the multiplication by X i is performed as a shift of i positions to the left. In total, Algorithm 2 performs about (2m+1)(n−m+1) field operations, plus the cost of the inverse multiplicative computation. As mentioned in [20], if n < 2m, then Algorithm 2 performs no more than 2m 2 + O(m) operations over Fq . Algorithm 2 PolynomialDiv [20] Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], g = (gm , . . . , g0 ) ∈ Fq [X ] with g = 0 Output: q, r ∈ Fq [X ] such that f = qg + r with degree(r ) < degree(g) 1: if n < m then 2: return 0, f {q = 0, r = f } 3: end if −1 4: r ← f , u ← gm 5: for i ← n − m, . . . , 0 do 6: if degree(r ) = m + i then 7: qi ← lc(r ) · u {lc(r ) denotes the leading coefficient of r } 8: r ← r − qi X i · g 9: else 10: qi ← 0 11: end if 12: end for 13: return q, r
3.4 Reduction Let f ∈ Fq [X ]/(P) be a polynomial of degree n ≥ . Then, the modulo reduction f mod P, can be computed via polynomial division, i.e., q, r = PolynomialDiv( f, P), where r ≡ f mod P. This operation has a computational cost of O(2 ) field operations over Fq . Without loss of generality, let us assume that P is monic. Then, P = X +
−1 i=0
pi X i ⇒ P =
−1
− pi X i ≡ X mod P.
i=0
Therefore, another option to compute f mod P, is to use P as shown in Algorithm 3. The computation of Step 3 requires multiplications and additions over Fq . The operations in Steps 5–7 are executed n − times. In Step 5, multiplication by X means to perform a left shift operation. Steps 6 and 7 require multiplications and additions over Fq each. In total, the execution of Algorithm 3 requires 2 + (4)(n − ) field operations over Fq . If n < 2, the computational cost of Algorithm 3 is of O(2 ) operations over Fq .
Smoothness test for polynomials defined...
327
Algorithm 3 Reduction Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], P = ( p , . . . , p0 ) ∈ Fq [X ] with n ≥ Output: r ∈ Fq [X ], the residue of f mod P 1: P ← (− p−1 , . . . , − p0 ) 2: Q ← P {Q ≡ X X mod P} 3: r ← ( f −1 , . . . , f 0 ) + f · Q 4: for i ← + 1, . . . , n do 5: Q ← Q · X {Q has degree } 6: Q ← (q−1 , . . . , q0 ) + q · P {to reduce q and then Q ≡ X X i mod P} 7: r ← r + f i · Q 8: end for 9: return r
3.5 Multiplicative Inverse Computation Let f ∈ Fq [X ]/(P) be a degree n polynomial. Then, by using the extended Euclidean algorithm with f and P, one obtains the polynomials d, s, t ∈ Fq [X ] such that d = gcd( f, P) and f s + Pt = d. If d = 1, f is not relatively prime with P and therefore, there does not exists a multiplicative inverse modulo P. If d = 1, then s is the multiplicative inverse of f mod P. The computational cost of the extended Euclidean algorithm is of O(n) operations over Fq , and therefore, the computation of multiplicative inverses over the ring has a computational cost of O(n) [19, Theorem 17.6].
4 Smoothness Test Without loss of generality, in this Section it is assumed that the polynomial f to be subject of the smoothness test is monic. Several known results [3,8] on the number of m-smooth polynomials over a finite field, for some m ≥ 1, are presented next. Some of the estimations are based on the heuristic argument that m−smooth polynomials of degree n for n > m > 0, are uniformly distributed among ordinary polynomials (cf. “Appendix A”).
4.1 About the Number of Smooth Polynomials The number of monic polynomials of degree n over a finite field Fq is q n . The number of monic irreducible polynomials of degree n over Fq is, Iq (n) =
1 μ(n/d)q d , n d|n
where μ is the Möbius function defined as, ⎧ if a has one or more repeated prime factors, ⎪ ⎨0 μ(a) = 1 if a = 1, ⎪ ⎩ k (−1) if a is the product of k different prime factors. The generating function for m-smooth monic polynomials in the ring Fq [X ], is giving as F(u, z) =
Iq () m uz 1+ , 1 − z
=1
328
G. Adj et al.
where u and z denote the number of distinct irreducible factors and the degree of the polynomial, respectively. From the above discussion, one can estimate the number of monic m-smooth polynomials of degree n over Fq [X ] that have exactly k distinct irreducible monic factors as, Nq (m, n, k) = [u k z n ]F(u, z), where [·] denotes the coefficient operator. Thus, we conclude that the number of m-smooth polynomials of degree n over Fq [X ] is giving as, Nq (m, n) = [z n ]F(1, z). For any given q, m and n, Nq (m, n) can be obtained by computing the first n +1 terms of the Taylor series expansion of F(1, z) and then extracting the coefficient of z n . In the same way, one can compute Nq (m, n, k) [3]. 4.2 Smoothness Test Theorem 4.1 (Smoothness test [7]) Let Fq be a finite field of characteristic p and f a monic polynomial in Fq [X ] of degree d, d > 0. For an integer m, 1 ≤ m ≤ d, define w(X ) = f (X ) ·
m
i
(X q − X )
mod f (X ).
(2)
i= m/2
If f is m-smooth, then w = 0; if f is not m-smooth, then w = 0 if and only if p divides the multiplicity of all irreducible factors of f with degree greater than m. i
Proof It is known that for any positive integer i, the polynomial X q − X is the product of all the monic irreducible m i (X q − X ) is the product of polynomials in Fq [X ] whose degree divides i. Thus, the polynomial P = i= m/2 all the monic irreducible polynomials of degree at most m. Since P never vanishes, we have w = 0 if and only if f = 0 or f | f P. One can see that f = 0 if and only f is a polynomial in X p ; and when f is not a polynomial in X p , the multiplicity in f of an irreducible factor of f with multiplicity e ≥ 1 in f is either e if e is a multiple of p or e − 1 otherwise. Therefore, w = 0 corresponds to f being a polynomial in X p or f | f P. Now suppose f | f P. This means that all the irreducible factors of f with multiplicity not a multiple of p are of degree at most m, because, for these factors, the multiplicity gap of 1 arising from the derivation of f could only have been filled during the multiplication by P, whose irreducible factors are all of degree at most m. Hence, if f is not m-smooth, w = 0 happens if and only if all the irreducible factors of degree greater than m are of multiplicity a multiple of p, since the latter statement is also realized when f is a polynomial in X p . Conversely, it is clear that when f is m-smooth we necessarily have w = 0. 4.3 Efficient Computation of the Smoothness Test In [10], Granger et. al. proposed an efficient strategy for computing the smoothness test of a polynomial f defined over the ring Fq [X ] via Eq. (2). Let q = p , where p is a small prime number and l a positive composed integer that can be expressed as, = r · s. Given f ∈ Fq [X ] of degree n, define the quotient ring Fq [X ]/( f ). For the sake of simplicity, in the following analysis it will be further assumed that the cost of elementary field arithmetic operations such as addition, multiplication and exponentiation have all the same O(1) computational cost.3 In the case of the ring arithmetic, it will be further assumed that the cost of performing a polynomial addition is O(n), whereas the polynomial ring operations of multiplication and division have each a cost of O(n 2 ). rs The approach of Granger et al. can be described as follows. The method first computes the powering [X p ] i followed by the exponentiations [X q ], for i = 1, . . . , m, according to the steps described next. 3
Compare with the computational cost analysis given in Sect. 2.
Smoothness test for polynomials defined...
329 r
1. Precomputation phase: For i = 1, . . . , n − 1, calculate X i p mod f r r r The exponentiations X p , X 2 p , . . . , X (n−1) p can be computed by consecutively multiplying by X . In general, a multiplication by X can be achieved by performing a shift operation possibly followed by a reduction modulo f . Notice further that a polynomial having the same degree of f can be reduced modulo f at a computational cost of m field multiplications and additions over Fq . Hence, the precomputation phase can be achieved by performing (n − 1)( pr − 1) shift operations at a computational cost of less than 2n 2 pr field operations over Fq . r 2. Computing a ring exponentiation a p Using the n − 1 precomputed elements of the former step, one can r compute the exponentation a p , for any arbitrary ring polynomial a ∈ Fq [X ]/( f ) as, n−1 pr n−1 r pr i (3) ai X = ai X i p . i=0
i=0
The computational cost of Eq. (3) is of n exponentiations and n scalar multiplications (i.e., the operation of multiplying a field element of Fq by a degree-(n − 1) polynomial). Hence, the total cost of this operation is of about 2n 2 + n field operations over Fq . i 3. Computing the powering [X q ], for i = 1, . . . , m Let us recall that q = pr s . Hence, applying repeatedly the rj i r si previous step, one can compute [X p ] for j ∈ {2, . . . , sm}, and therefore one obtains, [X q ] = [X p ] for i ∈ {1, . . . , m}. Thus, the computational cost of this step is of approximately, (2n 2 + n)sm = 2n 2 sm + nsm operations over Fq . The computational cost of multiplying and dividing two degree-(n −1) ring polynomials is of about n 2 operations over Fq each. Hence, a ring multiplication, i.e, the cost of multiplying two degree-n − 1 polynomials modulo f , m i has a cost of roughly 2n 2 field operations. The product f i= m/2 (X q − X ) mod f of Eq. (2), requires the computation of m/2 ring multiplications, which implies a total of n 2 m arithmetic operations over Fq . Summarizing, the cost of computing Eq. ( 2) can be estimated as, 2n 2 pr + 2n 2 sm + nsm + n 2 m arithmetic operations over Fq . 5 Experimental Timings In this Section, we report the performance achieved by our software in the implementation of all three of them, the field and polynomial arithmetic, and the smoothness test for polynomials of different degrees defined over F36 [X ]. We perform a total of 100,000 runs for the field and polynomial arithmetic operations over the field F36 and the ring F36 [X ], respectively. In the case of the smoothness test, we perform over 20,000 runs per each instance reported. All the measurements were performed using an Intel i7-3520M processor running at 2.9 GHz, using the openSUSE 13.2 with kernel 3.16.7-24 operating system. Our C library was compiled using GCC 4.8.3. 5.1 Field Arithmetic Timings We give here the experimental costs obtained for elementary arithmetic operations over the field F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2). 5.1.1 The field F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2) For the ternary field, F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2), an element a ∈ F36 is represented in polynomial 5 ai u i , where ai ∈ F3 = {0, 1, 2}. This field admits g = u as a generator. Then, field elements in form as, a = i=0 exponential representation are described by the mapping ψ : F36 → {0, . . . , 728} as, 0 if a = 0, ψ(a) = i if a = u i , where i ∈ {1, . . . , 728}.
330 Table 2 Execution time of the field arithmetic operations in F36 . The timings are given in clock cycles. The gray cells show our choice for implementing the field arithmetic operation
G. Adj et al.
additive inverse addition subtraction multiplication multiplicative inverse
Execution time (in clock cycles) Pre-computation Arith mod 728 5 5 14 (1) 17 (2) 20 14 5 5 5
Zech’s log 26 30 -
Notice that the zero element is represented precisely as zero, whereas non-zero elements are represented as powers of the generator in the range [1, 728], where for convenience the element 1 is represented as the 728-th power of u. Table 2 shows the execution timings associated to the elementary arithmetic operations in the field F36 . The first column shows the timings using lookup tables. In the case of the subtraction operation, the options (1) and (2) report the timings using the lookup table and the arithmetic modulo 728 strategies, respectively. The second column shows the timings corresponding to the arithmetic modulo 728 (except for the addition and the subtraction). Finally, the timings corresponding to the Zech’s logarithm approach are shown in the third column. The gray cells show our final choices for implementing the field arithmetic operations. As a consequence of the above analysis, the main arithmetic operations were computed as follows. Additive inverse This operation is computed via table lookup, where q = 36 field elements are stored. Addition The addition is computed using a lookup table, where (36 )2 field elements are stored. Subtraction The operation a − b is computed by querying the additive inverse lookup table for the operand b followed by querying the addition table for the elements a and −b. Multiplication Whenever a = 0 or b = 0, this operation returns zero. Otherwise, let a, b ∈ F36 , where a = u m and b = u n , then the product a · b is computed as, a · b = u (m+n) mod 728 . If (m + n) mod 728 = 0, then a · b = u 728 . Multiplicative inverse Let a be a non-zero field element represented as a = u m ∈ F36 , then a −1 = u −m mod 728 . If −m mod 728 = 0, then a −1 = u 728 . Exponentiation The exponentiation is computed using integer arithmetic modulo 728. Let a be a non-zero field element represented as a = u m ∈ F36 and e > 0, then a e = u (me) mod 728 . In the case that (me) mod 728 = 0, then a e = u 728 .
5.2 Polynomial Arithmetic Timings Here we present the timings measured for computing addition, subtraction, multiplication, division, reduction and the multiplicative inverse operations of elements belonging to the polynomial ring F36 [X ].
5.3 Multiplication Table 1 (see Sect. 3.2), shows that the best combination between the schoolbook and the Karatsuba multiplication methods occurs when the Karatsuba approach deals with 64-coefficient operands, where each operand is a 4coefficient polynomial, whose corresponding polynomial products and additions are resolved using the schoolbook method. In order to validate the above analysis, all the combinations shown in Table 1 were implemented. The experimental timings achieved by our software are reported in Table 3. Our implementation results show that the most efficient combination occurs when the Karatsuba approach deals with 32-coefficient operands, where each operand is an 8-coefficient polynomial. The discrepancy between the theoretical prediction presented in Sect. 3.2, and the experimental results can be explained from the specific way that the C compiler optimizes the arithmetic operations in order to make a better usage of the processor’s pipeline. Algorithm 1 shows our implementation of the polynomial multiplication.
Smoothness test for polynomials defined... Table 3 Clock cycle count for different 256-coefficient polynomial multiplication combinations of the Karatsuba and the schoolbook strategies
Table 4 Execution times for polynomial arithmetic ring over F36 [X ]. The timings are given in clock cycles
331
i
nSB =
0
256
1
1
128
2
447
2
64
4
375
3
32
8
323
4
16
16
288
5
8
32
248
6
4
64
261
7
2
128
382
8
1
256
733
Degree
nK =
256 2i
256 nSB
Execution time (in thousands of clock cycles) 571
Execution time (in clock cycles) Addition
Substraction
Multiplication
Division
254
1100
1200
248,000
670,400
200
850
990
181,100
409,400
191
825
940
149,400
379,400
182
790
900
146,700
339,200
173
750
850
140,430
307,200
164
720
820
132,900
275,800
155
680
765
114,300
247,150
146
640
730
110,550
219,100
137
600
695
102,900
193,800
128
565
650
88,500
169,300
119
535
605
77,450
147,100
110
500
560
62,730
128,600
92
420
485
48,640
89,070
83
400
440
44,500
73,000
74
355
395
36,550
58,900
73
350
380
36,100
57,100
72
345
375
35,600
55,600
71
340
370
34,410
54,100 52,700
70
335
365
34,330
69
330
360
33,900
51,200
68
325
355
33,600
49,800
67
320
350
32,700
48,500 47,400
66
315
345
32,320
65
310
340
31,250
45,700
64
305
335
30,000
44,400
63
300
330
26,600
43,600
62
295
325
26,500
42,300
332
G. Adj et al.
We present in Table 4, the clock cycle count obtained for the main polynomial ring arithmetic operations, namely, addition, subtraction, multiplication and division.
5.4 Smoothness Test Timings The smoothness test was computed using the Granger et al. method as presented in [10,11], and as it was described in Sect. 4.2. Since we did not count with a baseline reference, we considered that the smoothness test was showing a good performance in case that its execution time was considerably faster than that of the factorization cost using Magma. Unfortunately, Magma does not offer a way of measuring the execution time in terms of clock cycles. Hence, we were forced to compare the timings of both functions in milliseconds. For this comparison, we used Magma version 2.20-2. The results obtained are shown in Table 5. It is noted that in all the cases, our C implementation of the smoothness test indeed yields faster timings as compared with the Magma factorization function.
5.5 Cost of the Continued-Fraction Descent for the Discrete Logarithm Computation in the Field F36·509 The work described here was used as a building block for achieving a record computation of discrete logarithms over the 4841-bit field F36·509 . Complete details of this computation can be found in [1,2,6]. Let us recall that in order to compute discrete logarithms over the field F36·509 , a continued-fraction descent must be performed. The most costly task of this descent was to find by chance a randomly chosen degree-254 polynomial in the ring F36 [X ] that happens to be 40-smooth. However, the odds are overwhelmingly against this hope. Indeed, in order to be successful, this test must be repeated on an estimated number of 233.76 random degree-254 polynomials [1]. Additionally to this difficulty is that the cost of each smoothness test is relatively expensive. In Table 6, the running time for performing the smoothness test on polynomials with several degrees are reported. According to the data reported in Table 6, the software library developed in this work can compute a 40-smooth test on a degree-254 polynomial in about 265.7 millions clock cycles. Hence, the running time to complete the continued-fraction descent was estimated as 233.76 · (265.70 × 106 ) ≈ 261.82 clock cycles. Assuming a clock frequency of 2.8 GHz, this computation is equivalent to 46 core-years.4 Fortunately, this task is considered as “embarrassingly parallel”, which means that we can easily distribute this computation using a cluster. In this work, the continued-fraction descent was carried out by a cluster composed by several servers with a combined capacity of 270 cores. This allows us to complete this task in 71.45 calendar days, which was equivalent to roughly 51-core years. The computational costs associated with several of the descents shown in Fig. 1, which also use the algorithms and software described in this work are described at length in [6].
Appendix A: Experimental Distribution of the Smooth Polynomials In [8], Flajolet et al. analyzed the properties of random polynomials by using generating functions. In [3], Adj et al. reported concrete expressions for estimating the number of m-smooth polynomials of degree n in Fq [X ], denoted by Nq (m, n). 4
One core-year being the computation that one single core can carry out working uninterruptedly for one whole year.
Smoothness test for polynomials defined... Table 5 Smoothness test and Magma factorization comparison. The timings are given in milliseconds representing the average time execution of 5000 runs
Descent
333
Degree
Execution time (in milliseconds) Smoothness test
Continued-fraction
Classical first stage
Classical second stage
Magma Factorization
254
29.0986
141.3260
200
10.6662
72.2620
191
9.6892
62.8740
182
8.8332
53.3860
173
8.0016
46.5880
164
7.1851
41.4840
155
6.4297
36.0580
146
5.7323
31.0600
137
5.0665
27.3780
128
4.3788
20.1640
119
3.8145
16.0980
71
1.3987
3.9060
70
1.3623
3.8020
69
1.3270
3.6880
68
1.2895
3.6440
67
1.2534
3.5160
66
1.2162
3.3940
65
1.1785
3.0400
64
1.1314
2.9600
63
1.1065
2.9880
62
1.0764
2.8840
119
2.9452
16.0980
110
2.5101
13.3120
92
1.7828
8.3640
83
1.4623
6.1040
74
1.1779
4.7100
73
1.1453
4.3180
72
1.1080
4.2760
71
1.0812
3.9060
As it was discussed in Sect. 4, the number of monic degree-n polynomials in Fq [X ] is q n . The generating function for m-smooth monic polynomials in the ring Fq [X ], is giving as F(u, z) =
Iq () m uz 1+ , 1 − z
=1
where u and z denote the number of distinct irreducible factors and the degree of the polynomial, respectively. Finally, the number of m-smooth polynomials of degree n over Fq [X ] is giving as, Nq (m, n) = [z n ]F(1, z), where [·] denotes the coefficient operator. Notice that for any given q, m and n, Nq (m, n) can be obtained by computing the first n + 1 terms of the Taylor series expansion of F(1, z) and then extracting the coefficient of z n .
334 Table 6 Execution time of the smoothness test required by the continued-fraction and classical descents
G. Adj et al.
Descent
m (smoothness test)
Degree
Execution times (in millions of clock cycles)
Continued-fraction
40
254
265.70
191
32.70
182
29.51
173
26.79
164
24.14
155
21.62
146
19.22
137
17.02
Classical stage 1
Classical stage 2
21
15
128
14.75
200
35.63
119
12.79
71
4.71
70
4.59
69
4.46
68
4.34
67
4.22
66
4.08
65
3.96
64
3.80
63
3.70
62
3.60
119
10.14
110
8.65
92
6.13
83
5.03
74
4.02
73
3.92
72
3.81
71
3.72
Hence, the probability of a random degree-n polynomial in Fq [X ] to be m-smooth, is given as, Nq (m, n) . qn One can experimentally analyze the accuracy of the generating function Nq (m, n) in practice. To this end, we performed the following experiments: We randomly selected 62- and 71-degree 5000 polynomials from the ring F36 [X ], and counted the number of 21-smooth and 15-smooth polynomials, respectively. This experiment was repeated 1500 times for both degrees. We selected the polynomial ring F36 [X ], due to its cryptographic relevance for the discrete logarithm computations discussed and analyzed in [3,4]. The expected number of 21-smooth polynomials among 5000 random polynomials of degree 62 in F36 [X ], is given as, N36 (21, 62) · 5 000 ≈ 295. (36 )62 Our experimental results are shown in Table 7. The first column shows the number of 21-smooth polynomials found within a random sample of 5000 degree-62 polynomials, whereas the second column shows the number
Smoothness test for polynomials defined... Table 7 Number of 21-smooth polynomials among 5000 random degree-62 polynomials in F36 [X ]
# of 21-smooth polynomials
# of experiments
241–249
6
250–258
13
259–267
54
268–276
137
277–285
249
286–294
301
295–303
294
304–312
229
313–321
129
322–330
58
331–339
20
340–348
6
349–357
3
367–375 This experiment was repeated 1500 times
335
Total
1 1500
Fig. 2 Number of 21-smooth polynomials among 5000 random degree-62 polynomials in F36 [X ]
of experiments in which the range of the first column occurred. The experimental average number of 21-smooth polynomials of degree 62 observed in our experiment was of 294.5266 ≈ 295. Furthermore, we observed a Gaussian distribution with a standard deviation of 17.0372 as shown in Fig. 2. Now, the expected number of 15-smooth polynomials among 5000 random degree-71 polynomials from the ring F36 [X ], is given as, N36 (15, 71) · 5000 ≈ 5. (36 )71 Our experimental results are shown in Table 8. The first column shows the number of 15-smooth polynomials found within a random sample of 5000 degree-71 polynomials, whereas the second column shows the number of experiments in which the range of the first column occurred. The experimental average number of 15-smooth polynomials of degree 71 observed in our experiment was of 5.0140 ≈ 5. Furthermore, we observed a Gaussian distribution with a standard deviation of 2.2497 as shown in Fig. 3. In both cases, the average number of smooth polynomials obtained by the experiment is practically the same as that obtained using the generating function Nq (m, n). Although we performed only two experiments, we see that in those cases the generating function gives a reliable result on the number of m-smooth polynomials of degree n.
336 Table 8 Number of 15-smooth polynomials among 5000 random degree-71 polynomials in F36 [X ].
This experiment was performed 1500 times. In some experiments no 15-smooth polynomials were found
G. Adj et al.
# of 15-smooth polynomials
# of experiments
1
55
2
143
3
202
4
255
5
258
6
217
7
167
8
88
9
59
10
19
11
19
12
5
13
3
15 Total
1 1491
Fig. 3 Number of 15-smooth polynomials among 5000 random degree-71 polynomials in F36 [X ]
References 1. Adj, G., Canales-Martínez, I., Cruz-Cortés, N., Menezes, A., Oliveira, T., Rivera-Zamarripa, L., Rodríguez-Henríquez, F.: Computing discrete logarithms in cryptographically-interesting characteristic-three finite fields. Cryptology ePrint Archive, Report 2016/914, (2016). http://eprint.iacr.org/2016/914 2. Adj, G.: Logaritmo discreto en campos finitos de característica pequeña: atacando la criptografía basada en emparejamientos de Tipo 1. PhD thesis, CINVESTAV-IPN, 7 (2016). http://tinyurl.com/yan2ukwa 3. Adj, G., Menezes, A., Oliveira, T., Rodríguez-Henríquez, F.: Weakness of F36·509 for discrete logarithm cryptography. In: Cao, Z., Zhang, F. (eds.) Pairing-Based Cryptography—Pairing 2013, vol. 8365 of Lecture Notes in Computer Science, pp. 20–44. Springer International Publishing, New York (2014) 4. Adj, G., Menezes, A., Oliveira, T., Rodríguez-Henríquez, F.: Computing discrete logarithms in F36·137 and F36·163 using magma. In: Koç, C.K., Mesnager, S., Sava¸s, E. (eds.) Arithmetic of Finite Fields, volume 9061 of Lecture Notes in Computer Science, pp. 3–22. Springer International Publishing, New York (2015) 5. Adleman, L.: A Subexponential algorithm for the discrete logarithm problem with applications to cryptography. In: Proceedings of the 20th Annual Symposium on Foundations of Computer Science, pp. 55–60 (1979) 6. Canales-Martínez. I. A.: Implementación eficiente de prueba de suavidad para polinomios. Master’s thesis, CINVESTAV-IPN, 12 2015. http://tinyurl.com/y9p6xk7s
Smoothness test for polynomials defined...
337
7. Coppersmith, D.: Fast evaluation of logarithms in fields of characteristic two. IEEE Trans. Inf. Theory 30(4), 587–594 (1984) 8. Flajolet, P., Gourdon, X., Panario, D.: The complete analysis of a polynomial factorization algorithm over finite fields. J. Algorithms 40(1), 37–812 (2001) 9. Göloglu, F., Granger, R., McGuire, G., Zumbrägel, J.: On the function field sieve and the impact of higher splitting probabilities— application to discrete logarithms. In: Canetti, R., Garay, J.A. (eds) Advances in Cryptology—CRYPTO 2013—33rd Annual Cryptology Conference, volume 8043 of Lecture Notes in Computer Science. Springer, pp. 109–128 (2013) 10. Granger, R., Kleinjung, T., Zumbrägel, J.: Breaking ‘128-bit Secure’ supersingular binary curves (or how to solve discrete logarithms in F24·1223 and F212·367 ). Cryptology. ePrint Archive, Report 2014/119, (2014). http://eprint.iacr.org/2014/119 11. Granger, R., Kleinjung, T., Zumbrägel, J.: Breaking ‘128-bit Secure’ Supersingular Binary Curves (Or How to Solve Discrete Logarithms in F24·1223 and F212·367 ). In: Garay, Juan A., Gennaro, Rosario (eds.) Advances in Cryptology—CRYPTO 2014, volume 8617 of Lecture Notes in Computer Science, pp. 126–145. Springer, Berlin, Heidelberg (2014) 12. Hellman, M., Reyneri, J.: Fast computation of discrete logarithms in G F(q). In: Chaum, D., Rivest, R., Sherman, A. (eds) Advances in Cryptology: Proceedings of CRYPTO ’82. Springer US, pp. 3–13 (1983) 13. Jacobi, C.G.: Über die Kreistheilung und ihre Anwendung auf die Zahlentheorie. J. Reine Angew. Math. 30, 166–182 (1846) 14. Joux, A.: A new index calculus algorithm with complexity L( 41 + o(1)) in small characteristic. In: Lange, T., Lauter, K., Lison˘ek, P. (eds.) Selected Areas in Cryptography—SAC 2013. Lecture Notes in Computer Science, pp. 355–379. Springer, Berlin Heidelberg (May 2014) 15. Karatsuba, A.A., Ofman, Y.: Multiplication of multidigit numbers on automata. Sov. Phys. Dokl. 7, 595–7596 (1963) 16. Lidl, R., Niederreiter, H.: Finite fields, volume 20 of Encyclopedia of Mathematics and its Applications, 2nd edn. Cambridge University Press, Cambridge (1997) 17. Mullen, G.L., Panario, D.: Handbook of Finite Fields. Discrete Mathematics and Its Applications. CRC Press, Boca Raton (2013) 18. Rodríguez-Henríquez, F., Koç, Ç.K.: On fully parallel Karatsuba multipliers for GF(2m ). In: Tria, A., Choi, D. (eds.) International Conference on Computer Science and Technology, pp. 405–410. ACTA Press, Calgary (2003) 19. Shoup, V.: A Computational Introduction to Number Theory and Algebra, 2nd edn. Cambridge University Press, Cambridge (2009) 20. von zur Gathen J, Gerhard J, : Modern computer algebra, 3rd edn. Cambridge University Press, Cambridge (2013) 21. Weimerskirch, A., Paar, C.: Generalizations of the Karatsuba algorithm for efficient implementations. Cryptology. ePrint Archive, Report 2006/224, (2006). http://eprint.iacr.org/