Smoothness Test for Polynomials Defined Over Small Characteristic Finite Fields

The problem of determining whether a polynomial defined over a finite field ring is smooth or not with respect to a given degree, is the most intensiv...

2 downloads 188 Views 752KB Size

Download PDF

Math.Comput.Sci. (2018) 12:319–337 https://doi.org/10.1007/s11786-018-0348-2

Mathematics in Computer Science

Smoothness Test for Polynomials Defined Over Small Characteristic Finite Fields Gora Adj · Isaac Canales-Martínez · Luis Rivera-Zamarripa · Francisco Rodríguez-Henríquez

Received: 6 June 2017 / Revised: 6 April 2018 / Accepted: 26 April 2018 / Published online: 27 June 2018 © Springer International Publishing AG, part of Springer Nature 2018

Abstract The problem of determining whether a polynomial defined over a finite field ring is smooth or not with respect to a given degree, is the most intensive arithmetic operation of the so-called descent phase of index-calculus algorithms. In this paper, we present an analysis and efficient implementation of Coppersmith’s smoothness test for polynomials defined over finite fields with characteristic three. As a case study, we review the best strategies for obtaining a fast field and polynomial arithmetic for polynomials defined over the ring Fq [X ], with q = 36 , and report the timings achieved by our library when computing the smoothness test applied to polynomials of several degrees defined in that ring. This software library was recently used in Adj et al. (Cryptology 2016. http://eprint. iacr.org/2016/914), as a building block for achieving a record computation of discrete logarithms over the 4841-bit field F36·509 . Keywords Smoothness test · Index-calculus · Polynomial rings Mathematics Subject Classification 11A51 · 11T06 · 11T71 · 11T55

1

Adleman mentioned that Ronald Rivest suggested him to use the term smoothness for this purpose.

G. Adj (B) · I. Canales-Martínez · F. Rodríguez-Henríquez Computer Science Department, CINVESTAV-IPN, Mexico City, Mexico e-mail: [email protected] I. Canales-Martínez e-mail: [email protected] F. Rodríguez-Henríquez e-mail: [email protected] L. Rivera-Zamarripa Centro de Investigación en Computación del Instituto Politécnico Nacional CIC-IPN, Mexico City, Mexico e-mail: [email protected]

320

G. Adj et al.

1 Introduction In [5], Addleman coined the term smooth number, as an integer that factors into small prime numbers smaller than a given bound.1 A few years later, in [12], Hellman and Reyneri generalized Adleman’s algorithm for Fq , with q = p m , thus extending the smoothness property to polynomials. Let Fq [x] denote the polynomial ring in the variable x over the finite field Fq of order q. Any polynomial f ∈ Fq [X ] of degree greater than zero can be expressed as a product of irreducible factors, f = a f 1e1 . . . f kek ,

(1)

where a ∈ Fq , and f 1 , . . . , f k are pairwise distinct monic irreducible polynomials over Fq [X ], and where e1 , . . . , ek are positive integers. The above factorization of the polynomial f is unique up to the order of the factors. Given f = a i f iei ∈ Fq [X ] and m ≥ 1, f is said to be m-smooth if the degree of each one of the factors f i is less or equal than m. Let f k be the irreducible factor of the polynomial f with the greatest degree. Then, by definition f is m-smooth for all m ≥ degree( f k ). Given a non-zero degree polynomial f defined over the polynomial ring Fq [X ], we are interested in knowing if f is m-smooth, for some fixed value m ≥ 1. For most cryptographic applications, one does not need to know the minimum smoothness value of the polynomial f , but rather, one simply wants to determine whether f is m-smooth or not. This question can be answered by following two different approaches. The first one consists of finding the exact factorization of f as shown in Eq. (1), and then verifying if all the irreducible factors have degree less or equal than m or not. The second option consists of performing a smoothness test over f . The output of this test will answer yes or no to the question of whether f is or not m-smooth, but will not find the exact factorization of f , nor its minimum smoothness value. The drawback associated with the factorization approach is that its associated computational cost tends to be significantly higher than performing a simpler smoothness test. On the other hand, the smoothness test may sometimes produce false positives (although never false negatives). In any case, as it is shown in this paper, for our case study the probability for the occurrence of false positives in the polynomial ring F36 [X ] is negligibly small. As mentioned above, finding whether a polynomial is m-smooth or not, is a crucial building block in the computation of discrete logarithms over finite field extensions that are based on index-calculus methods, as it is briefly explain next.

1.1 Index-Calculus Method for Computing Discrete Logarithms Over Small Characteristic Fields Let Fq∗ n denote the finite field of order N = q n − 1, where q = p , p is a prime number and , n are positive integers. The elements of Fq∗ n can be represented as polynomials of degree at most n − 1 over Fq . Given a generator g ∈ Fq , and an element h ∈ Fq , the discrete logarithm problem (DLP) in Fq∗ n is that of determining the integer ∈ [0, N − 1] satisfying h = g . In his 1979 landmark paper [5], Leonard Adleman presented an index-calculus algorithm that can be used to compute discrete logarithms over a prime field F p . Over the years, Adleman’s method was extended to address the computation of discrete logarithms over finite field extensions of the form Fq∗ n . In particular, in this paper we are interested in the problem of computing discrete logarithms over fields with small characteristic, where the prime p is chosen to be 3.2 The standard approach for computing the discrete logarithm of h using the index-calculus method can be summarized as follows. First, this algorithm finds the logarithms (mod N ) of all the elements in Fq∗ n included in the so-called factor base. Then, in a descent stage, logg h is expressed as a linear combination of logarithms of the elements in the factor base. The descent stage proceeds in several steps, each expressing the logarithm of a degree-D element as a linear combination of the logarithms of elements of degree ≤ m for some m < D. Most of these steps 2

Similar strategies applies when one selects p = 2.

Smoothness test for polynomials defined...

321

Continuedfraction descent

Classical Descent

Gr¨obner basis descent

Factor basis Fig. 1 Index-calculus descent phases for computing discrete logarithms over small characteristic finite fields of the form Fq n

end their computation when, after a typically very large number of trials, a polynomial with the required smoothness is finally found. Hence, the computational complexity of the descent stage directly depends on the computational cost of the smoothness test. Specifically targeting small characteristic fields, Joux [14] and Gölo˘glu et al. [9], presented a new DLP algorithm with a running time of L Q [ 41 + o(1), c], for some undetermined c. Their algorithm has three different descent stages (see Fig. 1), namely, the continued-fraction descent, the classical descent and the Gröbner basis descent.

1.2 Our Contributions In [1], a record computation of the discrete logarithm problem over the field F36·509 was reported. The authors of [1] shown that Fq=36 is the largest proper subfield of F36·509 where the index calculus method can be performed effectively. Hence, from the efficiency point of view it just makes sense to represent the field elements of F36·509 as univariate polynomials defined over the ring Fq [X ] partitioned by a 509-degree monic polynomial irreducible over Fq . In this setting, the main task in the continued-fraction descent shown in Fig. 1 is that of finding a randomly chosen 254-degree polynomial defined over the ring Fq [X ] that happens to be m-smooth, where m is a positive integer smaller than, say, 40. Since this task is computationally highly demanding, the efficient implementation of the smoothness test becomes a crucial ingredient for achieving the discrete logarithm computation over the field F36·509 in a reasonable time. Hence, the main contribution of this paper is to present all the design details towards an efficient implementation of the smoothness test for ring polynomials defined over Fq [X ]. In this paper we report a baseline reference C implementation of the smoothness test when dealing with ring polynomials of different degrees. Our implementation was supported by a careful analysis of the most efficient field and polynomial arithmetic algorithms, along with the smoothness test strategy outlined in [3,9]. Furthermore,

322

G. Adj et al.

we provide a rigorous analysis of the probability of incorrect outcomes of the smoothness test. We also report experiments showing that m-smooth polynomials of degree n, with n > m > 0, closely follow a Gaussian distribution as theoretically predicted by the analysis given in [3,8]. The remainder of this paper is organized as follows. In Sects. 2–3, we discuss the best possible approaches for a fast and efficient software implementation of the field and polynomial arithmetic over small finite fields of the form Fq , and polynomial rings of the form Fq [X ], respectively. Then, in Sect. 4 we provide a formal analysis of the probability of incorrect outcomes of Coppersmith’s smoothness test as well as a detailed description of how to implement it according to the approach proposed in [9]. In Sect. 5 we report the performance achieved by our software in the implementation of all three of them, the field and polynomial arithmetic, and the smoothness test for polynomials of different degrees defined over F36 [X ]. We conclude this Section by reporting the experimental computational effort associated to the task of finding a randomly chosen degree-254 polynomial in the ring F36 [X ] that happens to be 40-smooth. Finding such polynomial is the main task of the continued-fraction descent depicted in Fig. 1 when computing discrete logarithms over the 4841-bit field F36·509 (see [1] for more details). 2 Field Arithmetic Over Fq This Section describes how to perform the main arithmetic operations over small characteristic fields of the form Fq , with q = p , p a small prime. In particular, a detailed description of the sixth extension of the ternary field F3 , will be given. Field arithmetic over a sufficiently small finite field Fq can be performed via look-up tables. This method is independent of any particular field representation. The basic idea is that all possible pairwise combinations of the field elements can be pre-computed and stored in a look-up table. This way, addition and multiplication lookup tables require a memory space of q 2 field elements each, whereas the lookup tables for additive and multiplicative inverses require to store q field elements each. In the remaining of this section we will discuss two different ways of representing the field elements of Fq , and the best strategies for computing elementary arithmetic operations according to the field representation choice.

2.1 Polynomial Representation Let Fq be a prime extension field of the field F p . Since Fq ∼ = F p [X ]/(I ) for some degree-n polynomial I ∈ F p [X ] irreducible over F p , the elements in Fq can be represented as polynomials of degree < n with coefficients in F p . Field addition and substraction are especially efficient using this representation. However, field multiplication and exponentiation become relatively expensive. Although additive inverses can be computed at almost no cost, calculating multiplicative inverses is a relatively expensive operation. Hence, when using polynomial representation it is generally more optimal to perform field addition and subtraction on run-time and perform the other arithmetic operations using pre-computed look-up tables.

2.2 Exponential Representation Let g ∈ Fq be a generator of Fq∗ . Then, each nonzero element in a ∈ Fq∗ can be represented as a power of g, i.e, a = g i with i ∈ {0, . . . , q − 2}. One drawback of this representation is that the zero element cannot be represented as a power of g. Hence, the nonzero elements of Fq can be represented in exponential form by means of the mapping ψ : Fq → {0, . . . , q − 2} ∪ {∞}, defined as follows, ∞ if a = 0, ψ(a) = i if a = g i .

Smoothness test for polynomials defined...

323

The exponential representation leads to extremely fast arithmetic operations over Fq∗ . In particular, this representation allows to compute additive inverses efficiently. Nevertheless, field addition cannot be computed in a straightforward fashion. Let a = g m , b = g n ∈ Fq and e > 0, all the main arithmetic operations, except for the addition, can be computed using integer arithmetic modulo q − 1 as summarized next. Multiplication Additive inverse

a · b = g m · g n = g (m+n) mod q−1 − a = −1 · g m = g α g m = g (α+m) mod q−1 , where, α =

Multiplicative inverse Exponentiation

a −1 = (g m )−1 = g (−m) mod q−1

q −1 2

a e = (g m )e = g (me) mod q−1

Using the identity g (q−1)/2 = −1, it becomes easy to compute additive inverses. Notice also that the above operations have an exceptional case whenever a = 0 or b = 0, which needs to be handled separately. It is worth to remark that if the memory resources are plenty, then instead of actually computing on run-time the operations modulo q − 1 as listed above, one can always perform the field arithmetic Fq by means of look-up tables. In order to perform the field addition operation efficiently, the following strategies were considered, a. To use a look-up table for the addition b. To precompute two tables that allow us to go from exponential to polinomial representation and vice versa. By using a look-up table, the operands are translated from exponential to polynomial representation. Then, the addition is performed coefficient-wise and finally the result is translated to exponential representation c. To use the Zech’s algorithm (cf. Sect. 2.3)

2.3 Zech’s Logarithm The Zech’s logarithm (also known as Jacobi’s logarithm [13]), is a method for computing the addition when the elements are represented using exponential representation [16,17]. Basically, this method computes the addition of two elements using a multiplication and a precomputed lookup table. Once again, let us assume that the elements of the field Fq are represented in exponential form in base g. Then, the integer Zech’s function Z is defined as, g Z (m) = 1+g m . Let a = g m , b = g n ∈ Fq , then the addition/subtraction of these two elements can be computed as, a + b = g m + g n = g m g Z (n−m) = g m (1 + g (n−m) mod q−1 ), a − b = g m + (−g n ) = g m g Z (α+n−m) = g m (1 + g (α+n−m) mod q−1 ), where α = (q − 1)/2. Note that g Z (α) = (1 + g α ) = 0 = ∞. As it was discussed before, the special cases when a = 0 or b = 0 must be treated as special situations. This strategy requires the precomputation of a single lookup table storing a total of q values.

3 Arithmetic in the Polynomial Ring In this Section we discuss the polynomial arithmetic over a finite field Fq . The computational cost of the different algorithms are given in terms of the number of operations over the field Fq where the polynomial ring is defined.

324

G. Adj et al.

3.1 Addition and Subtraction n n Let f = i=0 f i X i and g = i=0 gi X i be two degree-n polynomials defined over Fq [X ]. Then, the addition h = f + g and the subtraction h = f − g are defined as, h=

n i=0

hi X i =

n n n ( f i + gi )X i and h = h i X i = ( f i − gi )X i . i=0

i=0

i=0

Since the cost of the polynomial addition and subtraction are similar, both operations will be referred as addition. The cost of polynomial addition is O(n) operations in Fq . 3.2 Multiplication n Let f = i=0 f i X i and g = mj=0 g j X j be two degree-n polynomials defined over Fq [X ]. Then, the polynomial multiplication h = f · g is defined as, h=

n+m k=0

h k X k , with h k =

fi g j .

0≤i≤n 0≤ j≤m i+ j=k

The polynomial multiplication can be performed by means of the schoolbook method at a quadratic computational complexity of O(n 2 ) field operations. More concretely, this classical strategy requires n 2 and (n − 1)2 field multiplications and additions, respectively. A second option is to perform a polynomial multiplication using the Karatsuba approach [15], which computes the polynomial multiplication at a cost of O(n log2 3 ) field operations. However, depending on the size of the operands, the Karatsuba approach tends to require more additions operations than what the schoolbook method would need. The number of field operations required by this method are n log2 3 and 6n log2 3 − 8n + 2 field multiplications and additions, respectively [18,21]. We stress that the field arithmetic costs associated to an exponential representation, lead to a rather unusual situation, namely, the fact that field addition is considerably more costly than field multiplication. This peculiar ratio has a sensible role in choosing the polynomial multiplication method. The solution that was implemented in this work consisted of combining the usage of the Karatsuba and the schoolbook multiplication strategies according to the following approach. First, the Karatsuba multiplication is performed until a certain pre-fixed level of recursion. After that, the schoolbook method takes over to complete the smaller degree products required by the first layer of the Karatsuba approach. Algorithm 1 applies this combined strategy, i.e., it computes the polynomial product by first using a Karatsuba approach until a certain threshold is reached. Then, the school-book method is adopted to end up the recursion. One can determine an optimal threshold by comparing the costs of combining the Karatsuba approach and the i schoolbook methods when multiplying n-coefficient polynomials after selecting as a potential limit, n/2 -coefficient polynomial multiplication for i ∈ {0, . . . , log2 n }. We assume an addition/multiplication ratio of 2.8 as supported by our experiments (cf. Table 2). Since the continued-fraction descent phase as described in [3] for the ring F36 [X ], requires the computation of polynomial products and divisions with up to 255 coefficients, we selected n = 255 for all of our experiments. Table 1 reports the computational costs of different combinations of the Karatsuba and the schoolbook approaches. In order to illustrate the figures presented in Table 1, let us consider the following example. Example 3.1 When i = 5, one can use the schoolbook method to multiply polynomials with n S B = 8 coefficients, and the Karatsuba strategy for computing the product of polynomials with n K = 32 coefficients. For this choice, the polynomial operands can be seen as a 32−coefficient polynomial, where each coefficient is in turn an eightcoefficient polynomial, where all these eight coefficients lie in Fq .

Smoothness test for polynomials defined...

325

Table 1 Study of different splittings for the computation of 256-coefficient polynomial product i

School-book method

Karatsuba method

nSB =

nK =

256 2i

# adds.

# mults.

Total cost (in Mq )

256 nSB

Cost (in Mq )

# adds.

# mults.

0

256

65,025

65,536

247,606.0

1

0

1

247,606

1

128

16,129

16,384

61,545.2

2

4

3

186,070

2

64

3969

4096

15,209.2

4

24

9

141,184

3

32

961

1024

3714.8

8

100

27

109,260

4

16

225

256

886.0

16

360

81

87,894

5

8

49

64

201.2

32

1204

243

75,862

6

4

9

16

41.2

64

3864

729

73,312

7

2

1

4

6.8

128

12,100

2187

82,632

8

1

0

1

1.0

256

37,320

6561

111,057

Mq denotes the cost of performing one field multiplication over F36 , and computing one field addition costs 2.8Mq . Columns 3 and 4 show the number of additions and multiplications required by the schoolbook method when dealing with polynomial operands having n S B coefficients as reported in Column 2. Columns 7–8 show the number of polynomial additions and multiplications required by the Karatsuba approach when dealing with polynomial operands having n K coefficients, where each coefficient is in turn an n S B -coefficient polynomial. In this case, the cost of a polynomial addition is of 2.8n S B Mq . The corresponding polynomial multiplication cost is shown in Column 5. The total cost of computing the 256-coefficient polynomial product according to the particular combination choice of the Karatsuba and the School-book method is reported in Column 9

Considering the space complexity of the schoolbook method, it is easy to see that a total of 64 and 49 multiplications and additions over Fq would be required. This is equivalent to (49 × 2.8) + 64 = 201.2 field multiplications, which will be denoted in the following as Mq . Furthermore, a polynomial multiplication of operands having n K = 32 coefficients require the computation of 243 and 1204 eight-coefficient polynomial multiplications and additions, respectively. As it was discussed above, the computational cost associated to each such polynomial multiplication and addition is of 201.2Mq and 2.8 × 8 = 22.4Mq . Hence, the total cost of multiplying two 255-coefficient polynomials using the above splitting between the Karatsuba and the schoolbook approaches is of, (1204 × 2.8 × 8) + (243 × 201.2) = 75,862Mq .

Algorithm 1 PolynomialMult Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], g = (gn , . . . , g0 ) ∈ Fq [X ] Output: h = (h 2n , . . . , h 0 ) ∈ Fq [X ] 1: if n ≤ Karatsuba_limit then 2: return PolynomialMultSchoolBook( f, g) 3: end if 4: m ← (n + 1)/2 5: f A , f B ← ( f n , . . . , f m ), ( f m−1 , . . . , f 0 ) 6: g A , g B ← (gn , . . . , gm ), (gm−1 , . . . , g0 ) 7: t0 ←PolynomialMult( f B , g B ) 8: t2 ←PolynomialMult( f A , g A ) 9: t1 ←PolynomialMult( f A + f B , gA + gB ) 10: return t2 X 2m + (t1 − t0 − t2 )X m + t0

326

G. Adj et al.

3.3 Division In this subsection we explained one classical way to compute this polynomial arithmetic operation, along with our own proposal for a more specialized reduction method. Theorem 3.1 (Polynomial division [16, Theorem 1.52]) Let g = 0 be a polynomial in Fq [X ], then for all f ∈ Fq [X ], there exist unique polynomials q, r ∈ Fq such that f = qg + r with degree(r ) < degree(g). Algorithm 2 performs the polynomial division operation f /g. In Step 4, a multiplicative inverse over the field Fq is computed. The operations in Steps 7–8 are performed at most n − m + 1 times. In Step 7, a multiplication over Fq is performed, and Step 8, computes qi · g using m multiplications and m additions over Fq , whereas the multiplication by X i is performed as a shift of i positions to the left. In total, Algorithm 2 performs about (2m+1)(n−m+1) field operations, plus the cost of the inverse multiplicative computation. As mentioned in [20], if n < 2m, then Algorithm 2 performs no more than 2m 2 + O(m) operations over Fq . Algorithm 2 PolynomialDiv [20] Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], g = (gm , . . . , g0 ) ∈ Fq [X ] with g = 0 Output: q, r ∈ Fq [X ] such that f = qg + r with degree(r ) < degree(g) 1: if n < m then 2: return 0, f {q = 0, r = f } 3: end if −1 4: r ← f , u ← gm 5: for i ← n − m, . . . , 0 do 6: if degree(r ) = m + i then 7: qi ← lc(r ) · u {lc(r ) denotes the leading coefficient of r } 8: r ← r − qi X i · g 9: else 10: qi ← 0 11: end if 12: end for 13: return q, r

3.4 Reduction Let f ∈ Fq [X ]/(P) be a polynomial of degree n ≥ . Then, the modulo reduction f mod P, can be computed via polynomial division, i.e., q, r = PolynomialDiv( f, P), where r ≡ f mod P. This operation has a computational cost of O(2 ) field operations over Fq . Without loss of generality, let us assume that P is monic. Then, P = X +

−1 i=0

pi X i ⇒ P =

−1

− pi X i ≡ X mod P.

i=0

Therefore, another option to compute f mod P, is to use P as shown in Algorithm 3. The computation of Step 3 requires multiplications and additions over Fq . The operations in Steps 5–7 are executed n − times. In Step 5, multiplication by X means to perform a left shift operation. Steps 6 and 7 require multiplications and additions over Fq each. In total, the execution of Algorithm 3 requires 2 + (4)(n − ) field operations over Fq . If n < 2, the computational cost of Algorithm 3 is of O(2 ) operations over Fq .

Smoothness test for polynomials defined...

327

Algorithm 3 Reduction Input: f = ( f n , . . . , f 0 ) ∈ Fq [X ], P = ( p , . . . , p0 ) ∈ Fq [X ] with n ≥ Output: r ∈ Fq [X ], the residue of f mod P 1: P ← (− p−1 , . . . , − p0 ) 2: Q ← P {Q ≡ X X mod P} 3: r ← ( f −1 , . . . , f 0 ) + f · Q 4: for i ← + 1, . . . , n do 5: Q ← Q · X {Q has degree } 6: Q ← (q−1 , . . . , q0 ) + q · P {to reduce q and then Q ≡ X X i mod P} 7: r ← r + f i · Q 8: end for 9: return r

3.5 Multiplicative Inverse Computation Let f ∈ Fq [X ]/(P) be a degree n polynomial. Then, by using the extended Euclidean algorithm with f and P, one obtains the polynomials d, s, t ∈ Fq [X ] such that d = gcd( f, P) and f s + Pt = d. If d = 1, f is not relatively prime with P and therefore, there does not exists a multiplicative inverse modulo P. If d = 1, then s is the multiplicative inverse of f mod P. The computational cost of the extended Euclidean algorithm is of O(n) operations over Fq , and therefore, the computation of multiplicative inverses over the ring has a computational cost of O(n) [19, Theorem 17.6].

4 Smoothness Test Without loss of generality, in this Section it is assumed that the polynomial f to be subject of the smoothness test is monic. Several known results [3,8] on the number of m-smooth polynomials over a finite field, for some m ≥ 1, are presented next. Some of the estimations are based on the heuristic argument that m−smooth polynomials of degree n for n > m > 0, are uniformly distributed among ordinary polynomials (cf. “Appendix A”).

4.1 About the Number of Smooth Polynomials The number of monic polynomials of degree n over a finite field Fq is q n . The number of monic irreducible polynomials of degree n over Fq is, Iq (n) =

1 μ(n/d)q d , n d|n

where μ is the Möbius function defined as, ⎧ if a has one or more repeated prime factors, ⎪ ⎨0 μ(a) = 1 if a = 1, ⎪ ⎩ k (−1) if a is the product of k different prime factors. The generating function for m-smooth monic polynomials in the ring Fq [X ], is giving as F(u, z) =

Iq () m uz 1+ , 1 − z

=1

328

G. Adj et al.

where u and z denote the number of distinct irreducible factors and the degree of the polynomial, respectively. From the above discussion, one can estimate the number of monic m-smooth polynomials of degree n over Fq [X ] that have exactly k distinct irreducible monic factors as, Nq (m, n, k) = [u k z n ]F(u, z), where [·] denotes the coefficient operator. Thus, we conclude that the number of m-smooth polynomials of degree n over Fq [X ] is giving as, Nq (m, n) = [z n ]F(1, z). For any given q, m and n, Nq (m, n) can be obtained by computing the first n +1 terms of the Taylor series expansion of F(1, z) and then extracting the coefficient of z n . In the same way, one can compute Nq (m, n, k) [3]. 4.2 Smoothness Test Theorem 4.1 (Smoothness test [7]) Let Fq be a finite field of characteristic p and f a monic polynomial in Fq [X ] of degree d, d > 0. For an integer m, 1 ≤ m ≤ d, define w(X ) = f (X ) ·

m

i

(X q − X )

mod f (X ).

(2)

i= m/2

If f is m-smooth, then w = 0; if f is not m-smooth, then w = 0 if and only if p divides the multiplicity of all irreducible factors of f with degree greater than m. i

Proof It is known that for any positive integer i, the polynomial X q − X is the product of all the monic irreducible m i (X q − X ) is the product of polynomials in Fq [X ] whose degree divides i. Thus, the polynomial P = i= m/2 all the monic irreducible polynomials of degree at most m. Since P never vanishes, we have w = 0 if and only if f = 0 or f | f P. One can see that f = 0 if and only f is a polynomial in X p ; and when f is not a polynomial in X p , the multiplicity in f of an irreducible factor of f with multiplicity e ≥ 1 in f is either e if e is a multiple of p or e − 1 otherwise. Therefore, w = 0 corresponds to f being a polynomial in X p or f | f P. Now suppose f | f P. This means that all the irreducible factors of f with multiplicity not a multiple of p are of degree at most m, because, for these factors, the multiplicity gap of 1 arising from the derivation of f could only have been filled during the multiplication by P, whose irreducible factors are all of degree at most m. Hence, if f is not m-smooth, w = 0 happens if and only if all the irreducible factors of degree greater than m are of multiplicity a multiple of p, since the latter statement is also realized when f is a polynomial in X p . Conversely, it is clear that when f is m-smooth we necessarily have w = 0. 4.3 Efficient Computation of the Smoothness Test In [10], Granger et. al. proposed an efficient strategy for computing the smoothness test of a polynomial f defined over the ring Fq [X ] via Eq. (2). Let q = p , where p is a small prime number and l a positive composed integer that can be expressed as, = r · s. Given f ∈ Fq [X ] of degree n, define the quotient ring Fq [X ]/( f ). For the sake of simplicity, in the following analysis it will be further assumed that the cost of elementary field arithmetic operations such as addition, multiplication and exponentiation have all the same O(1) computational cost.3 In the case of the ring arithmetic, it will be further assumed that the cost of performing a polynomial addition is O(n), whereas the polynomial ring operations of multiplication and division have each a cost of O(n 2 ). rs The approach of Granger et al. can be described as follows. The method first computes the powering [X p ] i followed by the exponentiations [X q ], for i = 1, . . . , m, according to the steps described next. 3

Compare with the computational cost analysis given in Sect. 2.

Smoothness test for polynomials defined...

329 r

1. Precomputation phase: For i = 1, . . . , n − 1, calculate X i p mod f r r r The exponentiations X p , X 2 p , . . . , X (n−1) p can be computed by consecutively multiplying by X . In general, a multiplication by X can be achieved by performing a shift operation possibly followed by a reduction modulo f . Notice further that a polynomial having the same degree of f can be reduced modulo f at a computational cost of m field multiplications and additions over Fq . Hence, the precomputation phase can be achieved by performing (n − 1)( pr − 1) shift operations at a computational cost of less than 2n 2 pr field operations over Fq . r 2. Computing a ring exponentiation a p Using the n − 1 precomputed elements of the former step, one can r compute the exponentation a p , for any arbitrary ring polynomial a ∈ Fq [X ]/( f ) as, n−1 pr n−1 r pr i (3) ai X = ai X i p . i=0

i=0

The computational cost of Eq. (3) is of n exponentiations and n scalar multiplications (i.e., the operation of multiplying a field element of Fq by a degree-(n − 1) polynomial). Hence, the total cost of this operation is of about 2n 2 + n field operations over Fq . i 3. Computing the powering [X q ], for i = 1, . . . , m Let us recall that q = pr s . Hence, applying repeatedly the rj i r si previous step, one can compute [X p ] for j ∈ {2, . . . , sm}, and therefore one obtains, [X q ] = [X p ] for i ∈ {1, . . . , m}. Thus, the computational cost of this step is of approximately, (2n 2 + n)sm = 2n 2 sm + nsm operations over Fq . The computational cost of multiplying and dividing two degree-(n −1) ring polynomials is of about n 2 operations over Fq each. Hence, a ring multiplication, i.e, the cost of multiplying two degree-n − 1 polynomials modulo f , m i has a cost of roughly 2n 2 field operations. The product f i= m/2 (X q − X ) mod f of Eq. (2), requires the computation of m/2 ring multiplications, which implies a total of n 2 m arithmetic operations over Fq . Summarizing, the cost of computing Eq. ( 2) can be estimated as, 2n 2 pr + 2n 2 sm + nsm + n 2 m arithmetic operations over Fq . 5 Experimental Timings In this Section, we report the performance achieved by our software in the implementation of all three of them, the field and polynomial arithmetic, and the smoothness test for polynomials of different degrees defined over F36 [X ]. We perform a total of 100,000 runs for the field and polynomial arithmetic operations over the field F36 and the ring F36 [X ], respectively. In the case of the smoothness test, we perform over 20,000 runs per each instance reported. All the measurements were performed using an Intel i7-3520M processor running at 2.9 GHz, using the openSUSE 13.2 with kernel 3.16.7-24 operating system. Our C library was compiled using GCC 4.8.3. 5.1 Field Arithmetic Timings We give here the experimental costs obtained for elementary arithmetic operations over the field F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2). 5.1.1 The field F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2) For the ternary field, F36 = F3 [u]/(u 6 + 2u 4 + u 2 + 2u + 2), an element a ∈ F36 is represented in polynomial 5 ai u i , where ai ∈ F3 = {0, 1, 2}. This field admits g = u as a generator. Then, field elements in form as, a = i=0 exponential representation are described by the mapping ψ : F36 → {0, . . . , 728} as, 0 if a = 0, ψ(a) = i if a = u i , where i ∈ {1, . . . , 728}.

330 Table 2 Execution time of the field arithmetic operations in F36 . The timings are given in clock cycles. The gray cells show our choice for implementing the field arithmetic operation

G. Adj et al.

additive inverse addition subtraction multiplication multiplicative inverse

Execution time (in clock cycles) Pre-computation Arith mod 728 5 5 14 (1) 17 (2) 20 14 5 5 5

Zech’s log 26 30 -

Notice that the zero element is represented precisely as zero, whereas non-zero elements are represented as powers of the generator in the range [1, 728], where for convenience the element 1 is represented as the 728-th power of u. Table 2 shows the execution timings associated to the elementary arithmetic operations in the field F36 . The first column shows the timings using lookup tables. In the case of the subtraction operation, the options (1) and (2) report the timings using the lookup table and the arithmetic modulo 728 strategies, respectively. The second column shows the timings corresponding to the arithmetic modulo 728 (except for the addition and the subtraction). Finally, the timings corresponding to the Zech’s logarithm approach are shown in the third column. The gray cells show our final choices for implementing the field arithmetic operations. As a consequence of the above analysis, the main arithmetic operations were computed as follows. Additive inverse This operation is computed via table lookup, where q = 36 field elements are stored. Addition The addition is computed using a lookup table, where (36 )2 field elements are stored. Subtraction The operation a − b is computed by querying the additive inverse lookup table for the operand b followed by querying the addition table for the elements a and −b. Multiplication Whenever a = 0 or b = 0, this operation returns zero. Otherwise, let a, b ∈ F36 , where a = u m and b = u n , then the product a · b is computed as, a · b = u (m+n) mod 728 . If (m + n) mod 728 = 0, then a · b = u 728 . Multiplicative inverse Let a be a non-zero field element represented as a = u m ∈ F36 , then a −1 = u −m mod 728 . If −m mod 728 = 0, then a −1 = u 728 . Exponentiation The exponentiation is computed using integer arithmetic modulo 728. Let a be a non-zero field element represented as a = u m ∈ F36 and e > 0, then a e = u (me) mod 728 . In the case that (me) mod 728 = 0, then a e = u 728 .

5.2 Polynomial Arithmetic Timings Here we present the timings measured for computing addition, subtraction, multiplication, division, reduction and the multiplicative inverse operations of elements belonging to the polynomial ring F36 [X ].

5.3 Multiplication Table 1 (see Sect. 3.2), shows that the best combination between the schoolbook and the Karatsuba multiplication methods occurs when the Karatsuba approach deals with 64-coefficient operands, where each operand is a 4coefficient polynomial, whose corresponding polynomial products and additions are resolved using the schoolbook method. In order to validate the above analysis, all the combinations shown in Table 1 were implemented. The experimental timings achieved by our software are reported in Table 3. Our implementation results show that the most efficient combination occurs when the Karatsuba approach deals with 32-coefficient operands, where each operand is an 8-coefficient polynomial. The discrepancy between the theoretical prediction presented in Sect. 3.2, and the experimental results can be explained from the specific way that the C compiler optimizes the arithmetic operations in order to make a better usage of the processor’s pipeline. Algorithm 1 shows our implementation of the polynomial multiplication.

Smoothness test for polynomials defined... Table 3 Clock cycle count for different 256-coefficient polynomial multiplication combinations of the Karatsuba and the schoolbook strategies

Table 4 Execution times for polynomial arithmetic ring over F36 [X ]. The timings are given in clock cycles

331

i

nSB =

0

256

1

1

128

2

447

2

64

4

375

3

32

8

323

4

16

16

288

5

8

32

248

6

4

64

261

7

2

128

382

8

1

256

733

Degree

nK =

256 2i

256 nSB

Execution time (in thousands of clock cycles) 571

Execution time (in clock cycles) Addition

Substraction

Multiplication

Division

254

1100

1200

248,000

670,400

200

850

990

181,100

409,400

191

825

940

149,400

379,400

182

790

900

146,700

339,200

173

750

850

140,430

307,200

164

720

820

132,900

275,800

155

680

765

114,300

247,150

146

640

730

110,550

219,100

137

600

695

102,900

193,800

128

565

650

88,500

169,300

119

535

605

77,450

147,100

110

500

560

62,730

128,600

92

420

485

48,640

89,070

83

400

440

44,500

73,000

74

355

395

36,550

58,900

73

350

380

36,100

57,100

72

345

375

35,600

55,600

71

340

370

34,410

54,100 52,700

70

335

365

34,330

69

330

360

33,900

51,200

68

325

355

33,600

49,800

67

320

350

32,700

48,500 47,400

66

315

345

32,320

65

310

340

31,250

45,700

64

305

335

30,000

44,400

63

300

330

26,600

43,600

62

295

325

26,500

42,300

332

G. Adj et al.

We present in Table 4, the clock cycle count obtained for the main polynomial ring arithmetic operations, namely, addition, subtraction, multiplication and division.

5.4 Smoothness Test Timings The smoothness test was computed using the Granger et al. method as presented in [10,11], and as it was described in Sect. 4.2. Since we did not count with a baseline reference, we considered that the smoothness test was showing a good performance in case that its execution time was considerably faster than that of the factorization cost using Magma. Unfortunately, Magma does not offer a way of measuring the execution time in terms of clock cycles. Hence, we were forced to compare the timings of both functions in milliseconds. For this comparison, we used Magma version 2.20-2. The results obtained are shown in Table 5. It is noted that in all the cases, our C implementation of the smoothness test indeed yields faster timings as compared with the Magma factorization function.

5.5 Cost of the Continued-Fraction Descent for the Discrete Logarithm Computation in the Field F36·509 The work described here was used as a building block for achieving a record computation of discrete logarithms over the 4841-bit field F36·509 . Complete details of this computation can be found in [1,2,6]. Let us recall that in order to compute discrete logarithms over the field F36·509 , a continued-fraction descent must be performed. The most costly task of this descent was to find by chance a randomly chosen degree-254 polynomial in the ring F36 [X ] that happens to be 40-smooth. However, the odds are overwhelmingly against this hope. Indeed, in order to be successful, this test must be repeated on an estimated number of 233.76 random degree-254 polynomials [1]. Additionally to this difficulty is that the cost of each smoothness test is relatively expensive. In Table 6, the running time for performing the smoothness test on polynomials with several degrees are reported. According to the data reported in Table 6, the software library developed in this work can compute a 40-smooth test on a degree-254 polynomial in about 265.7 millions clock cycles. Hence, the running time to complete the continued-fraction descent was estimated as 233.76 · (265.70 × 106 ) ≈ 261.82 clock cycles. Assuming a clock frequency of 2.8 GHz, this computation is equivalent to 46 core-years.4 Fortunately, this task is considered as “embarrassingly parallel”, which means that we can easily distribute this computation using a cluster. In this work, the continued-fraction descent was carried out by a cluster composed by several servers with a combined capacity of 270 cores. This allows us to complete this task in 71.45 calendar days, which was equivalent to roughly 51-core years. The computational costs associated with several of the descents shown in Fig. 1, which also use the algorithms and software described in this work are described at length in [6].

Appendix A: Experimental Distribution of the Smooth Polynomials In [8], Flajolet et al. analyzed the properties of random polynomials by using generating functions. In [3], Adj et al. reported concrete expressions for estimating the number of m-smooth polynomials of degree n in Fq [X ], denoted by Nq (m, n). 4

One core-year being the computation that one single core can carry out working uninterruptedly for one whole year.

Smoothness test for polynomials defined... Table 5 Smoothness test and Magma factorization comparison. The timings are given in milliseconds representing the average time execution of 5000 runs

Descent

333

Degree

Execution time (in milliseconds) Smoothness test

Continued-fraction

Classical first stage

Classical second stage

Magma Factorization

254

29.0986

141.3260

200

10.6662

72.2620

191

9.6892

62.8740

182

8.8332

53.3860

173

8.0016

46.5880

164

7.1851

41.4840

155

6.4297

36.0580

146

5.7323

31.0600

137

5.0665

27.3780

128

4.3788

20.1640

119

3.8145

16.0980

71

1.3987

3.9060

70

1.3623

3.8020

69

1.3270

3.6880

68

1.2895

3.6440

67

1.2534

3.5160

66

1.2162

3.3940

65

1.1785

3.0400

64

1.1314

2.9600

63

1.1065

2.9880

62

1.0764

2.8840

119

2.9452

16.0980

110

2.5101

13.3120

92

1.7828

8.3640

83

1.4623

6.1040

74

1.1779

4.7100

73

1.1453

4.3180

72

1.1080

4.2760

71

1.0812

3.9060

As it was discussed in Sect. 4, the number of monic degree-n polynomials in Fq [X ] is q n . The generating function for m-smooth monic polynomials in the ring Fq [X ], is giving as F(u, z) =

Iq () m uz 1+ , 1 − z

=1

where u and z denote the number of distinct irreducible factors and the degree of the polynomial, respectively. Finally, the number of m-smooth polynomials of degree n over Fq [X ] is giving as, Nq (m, n) = [z n ]F(1, z), where [·] denotes the coefficient operator. Notice that for any given q, m and n, Nq (m, n) can be obtained by computing the first n + 1 terms of the Taylor series expansion of F(1, z) and then extracting the coefficient of z n .

334 Table 6 Execution time of the smoothness test required by the continued-fraction and classical descents

G. Adj et al.

Descent

m (smoothness test)

Degree

Execution times (in millions of clock cycles)

Continued-fraction

40

254

265.70

191

32.70

182

29.51

173

26.79

164

24.14

155

21.62

146

19.22

137

17.02

Classical stage 1

Classical stage 2

21

15

128

14.75

200

35.63

119

12.79

71

4.71

70

4.59

69

4.46

68

4.34

67

4.22

66

4.08

65

3.96

64

3.80

63

3.70

62

3.60

119

10.14

110

8.65

92

6.13

83

5.03

74

4.02

73

3.92

72

3.81

71

3.72

Hence, the probability of a random degree-n polynomial in Fq [X ] to be m-smooth, is given as, Nq (m, n) . qn One can experimentally analyze the accuracy of the generating function Nq (m, n) in practice. To this end, we performed the following experiments: We randomly selected 62- and 71-degree 5000 polynomials from the ring F36 [X ], and counted the number of 21-smooth and 15-smooth polynomials, respectively. This experiment was repeated 1500 times for both degrees. We selected the polynomial ring F36 [X ], due to its cryptographic relevance for the discrete logarithm computations discussed and analyzed in [3,4]. The expected number of 21-smooth polynomials among 5000 random polynomials of degree 62 in F36 [X ], is given as, N36 (21, 62) · 5 000 ≈ 295. (36 )62 Our experimental results are shown in Table 7. The first column shows the number of 21-smooth polynomials found within a random sample of 5000 degree-62 polynomials, whereas the second column shows the number

Smoothness test for polynomials defined... Table 7 Number of 21-smooth polynomials among 5000 random degree-62 polynomials in F36 [X ]

# of 21-smooth polynomials

# of experiments

241–249

6

250–258

13

259–267

54

268–276

137

277–285

249

286–294

301

295–303

294

304–312

229

313–321

129

322–330

58

331–339

20

340–348

6

349–357

3

367–375 This experiment was repeated 1500 times

335

Total

1 1500

Fig. 2 Number of 21-smooth polynomials among 5000 random degree-62 polynomials in F36 [X ]

of experiments in which the range of the first column occurred. The experimental average number of 21-smooth polynomials of degree 62 observed in our experiment was of 294.5266 ≈ 295. Furthermore, we observed a Gaussian distribution with a standard deviation of 17.0372 as shown in Fig. 2. Now, the expected number of 15-smooth polynomials among 5000 random degree-71 polynomials from the ring F36 [X ], is given as, N36 (15, 71) · 5000 ≈ 5. (36 )71 Our experimental results are shown in Table 8. The first column shows the number of 15-smooth polynomials found within a random sample of 5000 degree-71 polynomials, whereas the second column shows the number of experiments in which the range of the first column occurred. The experimental average number of 15-smooth polynomials of degree 71 observed in our experiment was of 5.0140 ≈ 5. Furthermore, we observed a Gaussian distribution with a standard deviation of 2.2497 as shown in Fig. 3. In both cases, the average number of smooth polynomials obtained by the experiment is practically the same as that obtained using the generating function Nq (m, n). Although we performed only two experiments, we see that in those cases the generating function gives a reliable result on the number of m-smooth polynomials of degree n.

336 Table 8 Number of 15-smooth polynomials among 5000 random degree-71 polynomials in F36 [X ].

This experiment was performed 1500 times. In some experiments no 15-smooth polynomials were found

G. Adj et al.

# of 15-smooth polynomials

# of experiments

1

55

2

143

3

202

4

255

5

258

6

217

7

167

8

88

9

59

10

19

11

19

12

5

13

3

15 Total

1 1491

Fig. 3 Number of 15-smooth polynomials among 5000 random degree-71 polynomials in F36 [X ]

References 1. Adj, G., Canales-Martínez, I., Cruz-Cortés, N., Menezes, A., Oliveira, T., Rivera-Zamarripa, L., Rodríguez-Henríquez, F.: Computing discrete logarithms in cryptographically-interesting characteristic-three finite fields. Cryptology ePrint Archive, Report 2016/914, (2016). http://eprint.iacr.org/2016/914 2. Adj, G.: Logaritmo discreto en campos finitos de característica pequeña: atacando la criptografía basada en emparejamientos de Tipo 1. PhD thesis, CINVESTAV-IPN, 7 (2016). http://tinyurl.com/yan2ukwa 3. Adj, G., Menezes, A., Oliveira, T., Rodríguez-Henríquez, F.: Weakness of F36·509 for discrete logarithm cryptography. In: Cao, Z., Zhang, F. (eds.) Pairing-Based Cryptography—Pairing 2013, vol. 8365 of Lecture Notes in Computer Science, pp. 20–44. Springer International Publishing, New York (2014) 4. Adj, G., Menezes, A., Oliveira, T., Rodríguez-Henríquez, F.: Computing discrete logarithms in F36·137 and F36·163 using magma. In: Koç, C.K., Mesnager, S., Sava¸s, E. (eds.) Arithmetic of Finite Fields, volume 9061 of Lecture Notes in Computer Science, pp. 3–22. Springer International Publishing, New York (2015) 5. Adleman, L.: A Subexponential algorithm for the discrete logarithm problem with applications to cryptography. In: Proceedings of the 20th Annual Symposium on Foundations of Computer Science, pp. 55–60 (1979) 6. Canales-Martínez. I. A.: Implementación eficiente de prueba de suavidad para polinomios. Master’s thesis, CINVESTAV-IPN, 12 2015. http://tinyurl.com/y9p6xk7s

Smoothness test for polynomials defined...

337

7. Coppersmith, D.: Fast evaluation of logarithms in fields of characteristic two. IEEE Trans. Inf. Theory 30(4), 587–594 (1984) 8. Flajolet, P., Gourdon, X., Panario, D.: The complete analysis of a polynomial factorization algorithm over finite fields. J. Algorithms 40(1), 37–812 (2001) 9. Göloglu, F., Granger, R., McGuire, G., Zumbrägel, J.: On the function field sieve and the impact of higher splitting probabilities— application to discrete logarithms. In: Canetti, R., Garay, J.A. (eds) Advances in Cryptology—CRYPTO 2013—33rd Annual Cryptology Conference, volume 8043 of Lecture Notes in Computer Science. Springer, pp. 109–128 (2013) 10. Granger, R., Kleinjung, T., Zumbrägel, J.: Breaking ‘128-bit Secure’ supersingular binary curves (or how to solve discrete logarithms in F24·1223 and F212·367 ). Cryptology. ePrint Archive, Report 2014/119, (2014). http://eprint.iacr.org/2014/119 11. Granger, R., Kleinjung, T., Zumbrägel, J.: Breaking ‘128-bit Secure’ Supersingular Binary Curves (Or How to Solve Discrete Logarithms in F24·1223 and F212·367 ). In: Garay, Juan A., Gennaro, Rosario (eds.) Advances in Cryptology—CRYPTO 2014, volume 8617 of Lecture Notes in Computer Science, pp. 126–145. Springer, Berlin, Heidelberg (2014) 12. Hellman, M., Reyneri, J.: Fast computation of discrete logarithms in G F(q). In: Chaum, D., Rivest, R., Sherman, A. (eds) Advances in Cryptology: Proceedings of CRYPTO ’82. Springer US, pp. 3–13 (1983) 13. Jacobi, C.G.: Über die Kreistheilung und ihre Anwendung auf die Zahlentheorie. J. Reine Angew. Math. 30, 166–182 (1846) 14. Joux, A.: A new index calculus algorithm with complexity L( 41 + o(1)) in small characteristic. In: Lange, T., Lauter, K., Lison˘ek, P. (eds.) Selected Areas in Cryptography—SAC 2013. Lecture Notes in Computer Science, pp. 355–379. Springer, Berlin Heidelberg (May 2014) 15. Karatsuba, A.A., Ofman, Y.: Multiplication of multidigit numbers on automata. Sov. Phys. Dokl. 7, 595–7596 (1963) 16. Lidl, R., Niederreiter, H.: Finite fields, volume 20 of Encyclopedia of Mathematics and its Applications, 2nd edn. Cambridge University Press, Cambridge (1997) 17. Mullen, G.L., Panario, D.: Handbook of Finite Fields. Discrete Mathematics and Its Applications. CRC Press, Boca Raton (2013) 18. Rodríguez-Henríquez, F., Koç, Ç.K.: On fully parallel Karatsuba multipliers for GF(2m ). In: Tria, A., Choi, D. (eds.) International Conference on Computer Science and Technology, pp. 405–410. ACTA Press, Calgary (2003) 19. Shoup, V.: A Computational Introduction to Number Theory and Algebra, 2nd edn. Cambridge University Press, Cambridge (2009) 20. von zur Gathen J, Gerhard J, : Modern computer algebra, 3rd edn. Cambridge University Press, Cambridge (2013) 21. Weimerskirch, A., Paar, C.: Generalizations of the Karatsuba algorithm for efficient implementations. Cryptology. ePrint Archive, Report 2006/224, (2006). http://eprint.iacr.org/

Smoothness Test for Polynomials Defined Over Small Characteristic Finite Fields

Recommend Documents