# Do Gaps Between Primes Affect RSA Keys?
Arjen Lenstra is one of the top computational number theorists in the world, especially expert on the state of the art in integer factoring. For example, he helped create the best general factorization method known to date—the number field sieve—see here for a previous discussion. He was also one of the authors of the great paper that introduced the LLL lattice algorithm, which was joint work with one of his brothers, Hendrik Lenstra, and with László Lovász.
Today we want to add some considerations to the discussion of Lenstra’s recent paper on RSA keys. The paper is joint work with James Hughes, Maxime Augier, Joppe Bos, Thorsten Kleinjung, and Christophe Wachter, and bears the title, “Ron was wrong, Whit was right.” Here Ron refers to Ron Rivest and Whit refers to Whitfield Diffie. Rivest is the “R” in RSA, of course, and Diffie is the Diffie in the Diffie–Hellman protocol. The former uses two primes and the latter uses one prime. The message is that crypto-systems that use one prime may be better than ones that use two primes—because the latter allow two users unwittingly to share one prime in a way that endangers both their systems. Here is how.
Perhaps there is one fundamental idea in factoring, an idea that goes back at least to Pierre de Fermat. Suppose that $N = pq$ where $p$ and $q$ are distinct primes. The approach is to try to factor $N$ by constructing another number $M$ so that the greatest common divisor (gcd) of $N$ and $M$ is equal to either $p$ or to $q$. There is good and bad news with this method. The good news is that if you can find such an $M$, then $N$ is easy to factor: just compute $\gcd(N, M)$ by Euclid’s algorithm and $p$ or $q$ will pop out. Since the gcd algorithm is quite efficient, this runs fast even if $N$ and $M$ are very large numbers. Thanks are due to Euclid, although the first proof that this algorithm runs fast is due to Gabriel Lamé in 1844. Perhaps this is one of the first algorithm-analysis results. The bad news is, where does $M$ come from? Fermat’s idea was to try to find $x$ and $y$ so that $x^2 \equiv y^2 \pmod{N}$, and then use $x - y$ and $x + y$ as potential values of $M$. Note that this succeeds provided $x \not\equiv \pm y \pmod{N}$. The difficulty is that finding such a pair of integers $x$ and $y$ is still hard, but there are tricks that can be used to make the search for them more efficient. In any event, the search for values of $M$ that will work is one of the achievements of modern computational number theory. There are a variety of more and more powerful methods that attempt to find an $M$. If there were a way to always find $M$ efficiently, then of course factoring would be easy. Even Peter Shor’s quantum factoring algorithm uses the power of quantum circuits to find such an $M$.
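As a concrete illustration of Fermat’s idea, here is a toy sketch in Python. It searches for $x$ with $x^2 - N$ a perfect square, which yields $x^2 \equiv y^2 \pmod N$; this is the classical textbook method, practical only for small or specially structured $N$, and it stands in for the far more sophisticated searches mentioned above.

```python
import math

def fermat_factor(N):
    """Toy Fermat factorization: find x with x^2 - N a perfect square,
    so x^2 = y^2 (mod N); then gcd(x - y, N) and gcd(x + y, N) are factors."""
    x = math.isqrt(N)
    if x * x < N:
        x += 1
    while True:
        y2 = x * x - N
        y = math.isqrt(y2)
        if y * y == y2:              # then N = (x - y)(x + y)
            return math.gcd(x - y, N), math.gcd(x + y, N)
        x += 1

print(fermat_factor(101 * 103))      # (101, 103)
```

The search starts at $\lceil\sqrt{N}\rceil$, so it terminates quickly exactly when the two prime factors are close together.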
Let’s turn to the recent attack on RSA. Recall RSA relies on having an $N = pq$, where $p$ and $q$ are secret primes. The value of $N$ is public, but the primes are secret and will remain so provided factoring is hard. Lenstra’s paper used a clever idea, an idea that has probably been around, but still a clever idea: instead of working hard to try to find an $M$ that will help factor $N$, just use values that are already public. Use the moduli $N'$ from other public keys—basically, play the RSA system against itself. If $N$ and $N'$ share a prime, $\gcd(N, N')$ finds it—bingo. This is really neat because these values are available, there are a lot of them, and there are reasons to believe they could be useful. The empirical result found in the paper is that using these public moduli as the $M$’s works quite well. The paper summarizes the result by ending with a riff on a quotation commonly attributed to Josef Stalin: “Factoring one 1024-bit RSA modulus would be historic. Factoring 12,720 such moduli is a statistic.” The original quotation is, “When one dies, it is a tragedy. When a million die, it is a statistic.” According to this wiki the quotation came from a 1981 Russian biography of Stalin that was used by an American historian, but it probably originates in a 1932 essay on the nature of jokes in France.
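The attack itself is almost embarrassingly simple to sketch. The toy below uses tiny hypothetical moduli, two of which share a prime by construction; the actual studies process millions of real moduli and use an asymptotically faster batch-gcd (product-tree) method rather than this quadratic pairwise loop.

```python
import math
from itertools import combinations

# Hypothetical toy moduli; the first two share the prime 10007 by construction.
moduli = [10007 * 10009, 10007 * 10037, 10039 * 10061]

# Play RSA against itself: gcd every pair of public moduli.
for N1, N2 in combinations(moduli, 2):
    g = math.gcd(N1, N2)
    if 1 < g < min(N1, N2):          # a shared prime breaks both keys
        print(f"shared prime {g} factors both {N1} and {N2}")
```

Once the shared prime $g$ is known, each affected modulus factors as $g \cdot (N/g)$, so both private keys fall at once.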
Lenstra put two extra charges into his broadside: having the story appear in the New York Times on the same Valentine’s Day as the paper’s e-print date, and the title “Ron was wrong, Whit was right.” A great title to get noticed, but not the best for search engines in the future—jump forward ten years and imagine saying, “I recall a paper that showed a problem with RSA, but I forget the title.” Right afterward there were attacks not on the results, but on their meaning and on the bearer of the news—Lenstra. I personally find this pushback to be unscientific and unwarranted. The paper shows that a small fraction of the RSA keys, 0.2%, are easy to break, while 99.8% are okay. But Lenstra and his team have been attacked on the ground that this observation is picayune—if only a small percent fail, what is the concern? This line strikes me as unfair, unreasonable, and wrong: cryptography here is predicated on the chance of total failure being negligible. I regard the following statement, which RSA issued Thursday, as wrongheaded insofar as it is seen to disclaim responsibility for considerations that should be on RSA’s watch as a security firm: “Our analysis confirms to us that the data does not point to a flaw in the algorithm, but instead points to the importance of proper implementation, especially regarding the exploding number of embedded devices that are connected to the Internet today.”
The nub of the argument is whether Lenstra’s contention—that relying on two primes is innately more dangerous than relying on one—is fair. There is no obvious counterpart to the GCD trick for systems that use one prime. Note that two primes are required for RSA, and this is exactly why the GCD method could work at all.
Clearly the facts speak for themselves: many RSA keys were broken. I do not accept that “only” a small percent were compromised. This reminds me of the scene in Dr. Strangelove where a US bomber has ignored orders to return to base and is on the way to attack the USSR and likely start WW III, and General Turgidson reassures the President: “I’m not saying we wouldn’t get our hair mussed. But I do say no more than ten to twenty million killed, tops.”
The argument then moves on to ask, what is the root cause, and what can be done to avoid these problems? The paper does not identify a main culprit, and the answers might truly emerge only from sifting how millions of RSA certificates were created—acts of checking that the authors disclaim for reasons of both security and time-intensiveness. We don’t have a main culprit either, but rather an observation about the scenery.
We wish to draw attention to one aspect of generating primes, and ask whether it may be interacting with more-salient problems found in the generation of keys, especially by smaller “consumer” devices. For more on the salient problems, see this excellent post by Nadia Heninger pointing to another in-depth study, this Y-Combinator thread, and this Secret Blogging Seminar note, among many good sources. We make an educated guess that many implementers start with a correctly seeded random number generator. Then they perform something like the following key loop: pick a random odd starting point $p$, and step upward by $p \leftarrow p + 2$ until a prime is found—we will consider different update steps later. This strategy of picking a random place and then starting a search from that place yields extremely biased primes. Whether this matters may depend on implementation details discussed below, but potentially it could have three ramifications:
- It could explain some, if not all, of the lost entropy in the key generation that Lenstra found.
- It could suggest a method that could break many more keys.
- Searching for primes with “extra” security properties, such as so-called safe primes and/or strong primes, could make the bias problem *worse*, not *better*.
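To make the guessed key loop concrete, here is a minimal sketch in Python. It is our reading of the strategy described above, not code from any audited library; the Miller–Rabin test and the exact update step are standard choices we supply for illustration.

```python
import random
import secrets

def probably_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def biased_prime(bits):
    """The guessed key loop: pick a random odd starting point with the
    top bit set, then walk upward by 2 until a prime appears.  A prime
    sitting above a large gap 'owns' every starting point inside that
    gap, so it is chosen disproportionately often; that is the bias."""
    p = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    while not probably_prime(p):
        p += 2
    return p
```

The probability that `biased_prime` returns a given prime $p$ is proportional to the gap below $p$, which is exactly the non-uniformity discussed next.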
To understand the bias, let’s recall an old problem from Martin Gardner’s first book of mathematical recreations: “A young man reaches a subway platform at a random moment each Saturday afternoon. Brooklyn and Bronx trains arrive equally often—each regularly every 10 minutes. Yet on the average he goes to Brooklyn nine times out of ten. Why?”
The answer is that while the uptown trains to the Bronx may dutifully arrive on the :00, :10, through :50 every hour, the downtown trains to Brooklyn could be coming at :09, :19, …, :59. Each Brooklyn train thus has a nine-minute lead interval before the next Bronx train, so a random arrival falls into a Brooklyn lead nine times out of ten.
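Gardner’s puzzle is easy to check by simulation; the schedule offsets below are the illustrative ones from the story.

```python
import random

def destination(t):
    """Bronx trains arrive at :00, :10, ..., :50; Brooklyn trains at
    :09, :19, ..., :59.  The rider boards whichever train comes first."""
    return "Brooklyn" if (t % 10) < 9 else "Bronx"

# Arrive at a uniformly random minute of the hour, many times over.
trips = [destination(random.uniform(0, 60)) for _ in range(100_000)]
share = trips.count("Brooklyn") / len(trips)
print(round(share, 2))               # close to 0.9
```

Within each ten-minute cycle, nine minutes of arrival times lead to a Brooklyn train and only one to a Bronx train, matching the nine-out-of-ten answer.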
The distribution of primes is much more uneven than that of the trains, and some primes may have lead intervals—the gaps below them—that are vastly larger than average. Call such primes *leaders*.
If the first prime were to come a fifth of the way into the searched range of starting points, then one-fifth of the random starting points would yield that same prime. Of course, saying “a fifth” is a huge exaggeration. For sufficiently large $x$ there is always a prime between $x$ and $x + x^{\theta}$, where $\theta = 0.525$ is known to work. Some conjectures push $\theta$ essentially down to $1/2$, perhaps with dependence on the Riemann Hypothesis. There is even a much stronger conjecture by Harald Cramér in 1936 that the gaps are no more than order $(\log x)^2$, and numerical evidence has so far supported this out to about $4 \times 10^{18}$. Whether this makes the problem go away is where we get into concrete details.
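At small scales we can watch actual gaps stay well under Cramér’s benchmark. The sieve below finds the largest gap between consecutive primes up to $10^6$ and compares it with $(\ln x)^2$; the bound and the comparison are standard facts, not taken from the paper.

```python
import math

def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    return [i for i, flag in enumerate(sieve) if flag]

ps = primes_up_to(1_000_000)
max_gap = max(b - a for a, b in zip(ps, ps[1:]))   # largest gap below 10^6
cramer = math.log(ps[-1]) ** 2                     # Cramér benchmark (ln x)^2
print(max_gap, round(cramer))                      # 114 versus about 191
```

So even the record gap in this range is comfortably below the conjectured $(\ln x)^2$ ceiling.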
We can do one quick calculation to argue the gap issue should not matter, even with gaps as big as Cramér’s bound allows. Suppose we are picking random 512-bit primes, and use one 64-bit random seed to select a starting point between $2^{511}$ and $2^{512}$. If we space the $2^{64}$ starting points equally, then they are $2^{447}$ apart. Meanwhile, the largest gaps are no more than about $(\ln 2^{512})^2 \approx 126{,}000$. Thus no gap has two random starting points—it is not close—so we should be dealing from a pool of fully distinct primes.
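The back-of-envelope arithmetic can be spelled out; the 512-bit and 64-bit figures are the ones assumed in the paragraph above.

```python
import math

# 2^64 equally spaced starting points across the 512-bit range [2^511, 2^512):
spacing = (2**512 - 2**511) // 2**64          # = 2^447, an astronomical distance

# Cramér-style ceiling on gaps near 512-bit numbers: (ln 2^512)^2
cramer_gap = (512 * math.log(2)) ** 2         # about 126,000

# No conceivable gap could contain two of the equally spaced points.
assert cramer_gap < spacing
```

The comparison is not remotely close: $2^{447}$ against roughly $1.26 \times 10^5$, a margin of hundreds of decimal orders of magnitude.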
Of course such a pool is not observed in the Lenstra and Heninger-et-al. studies. In one sense the gap bias should show up anyway—completely apart from the issue of multiple selection, leader primes are more likely to be chosen by any form of the above search loop. For the relatively small number of primes that were “outed” by the GCD calculations in the studies, we can feasibly compute the distance to the adjacent primes, lower and higher, and simply ask: Are the distances to the nearest neighbors of the primes caught in the studies significantly greater than the expected gap of about $\ln p$? Same question for gaps to the nearest “strong” and/or “safe” prime.
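Here is a toy version of that computation. It uses trial division, which only works at toy sizes; for 512-bit primes one would substitute a Miller–Rabin test, but the shape of the calculation is the same. The name `lead_interval` is ours.

```python
def is_prime(n):
    """Trial division; fine for toy sizes only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime(n):
    n += 1
    while not is_prime(n):
        n += 1
    return n

def prev_prime(n):
    n -= 1
    while not is_prime(n):
        n -= 1
    return n

def lead_interval(p):
    """Gap below p: how many starting points an upward search maps to p."""
    return p - prev_prime(p)

print(lead_interval(127))     # 14, versus an average gap near ln(127) = 4.8
```

A prime like 127, with a lead interval roughly three times the local average, is exactly the kind of “leader” the question asks about.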
As we said, a positive answer may not mean much—we would expect it anyway—but perhaps the magnitude of the gaps may shed light on why certain primes recurred so often. It is also possible to compare them with the non-shared primes from the same key sets, and we might like to compare them with the primes for keys that were not compromised at all.
One way to evade the literal problem of gaps is to use $p \leftarrow f(p)$ for some other update function $f$ besides $p + 1$ or $p + 2$. However, this may merely shuffle the gap issue without improving it. What happens to the gaps between primes under single-cycle permutations $f$ of the search range? Of course we can create permutations with arbitrarily large gaps by defining $f$ to alternate between long even-number and long odd-number stretches upon iteration, but does this happen for any truly natural choice of $f$?
In any event, a safer solution that avoids the gap problem is easy to describe: generate a new random candidate $p$ on each iteration, rather than incrementing by $1$ or $2$. We stop only when $p$ itself is prime, not doing a search at all. Then all primes in the chosen range are equally likely.
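A sketch of this safer loop, under the same illustrative assumptions as before; this is our code, not any particular library’s.

```python
import random
import secrets

def probably_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def unbiased_prime(bits):
    """Rejection sampling: a fresh random odd candidate every iteration,
    accepted only if it is already prime.  The expected number of tries
    is about (bits * ln 2) / 2, and every prime in the range is equally
    likely, so the gap bias disappears."""
    while True:
        p = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if probably_prime(p):
            return p
```

The price is more primality tests per key, since each rejected candidate is discarded rather than nudged toward the next prime.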
Does our question about the gaps neighboring the primes that were found in the GCD computations have a ready answer? Is it relevant? (We might call a simple-minded question whose relevance is also part of the question a “Pipsqueak.”) What happens to gaps between prime numbers when iterating low-complexity functions other than adding $1$ or $2$?