A few months ago, we published an article with the admittedly dramatic title of The Quantum Apocalypse: What is post-quantum cryptography, and why do we need it?
We wrote that article because quantum computers, which are widely misunderstood, often generate exciting and scary-sounding headlines.
That’s because quantum computers are not only weirdly different from everyday computers such as the laptops and mobile phones that we’re used to, but also pitched as being on the verge of achieving the apparently impossible.
One of the most dramatic claims made for quantum computers is that they may be able to defeat the encryption algorithms that we currently rely on to protect our web browsing against surveillance and tampering.
In regular computer storage devices, each bit (short for binary digit) can be in exactly one of two possible states at any time, representing 0 or 1, such as ELECTRICALLY CHARGED or NOT CHARGED; HOLE PUNCHED or NOT PUNCHED; ILLUMINATED or DARK.
With five bits, you can therefore represent 2×2×2×2×2 (2 to the power 5, or 2⁵) different possibilities for a total of 32; with 16 bits, you get 2¹⁶ values, or 65536; and so on.
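If you want to check those numbers for yourself, a few lines of C (a trivial illustration we’ve sketched here, not anything quantum) will print them out:

    #include <stdio.h>

    /* Print how many distinct values N bits can represent: 2 to the power N */
    int main(void) {
        for (int n = 1; n <= 16; n++) {
            printf("%2d bits -> %6u distinct values\n", n, 1u << n);
        }
        return 0;
    }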
But with N quantum bits, known jocularly as qubits, you can miraculously – in a perfect world, at least – keep track of a mind-boggling 2 to the power N (2ᴺ) different possible values, thanks to what are known as superpositions.
N qubits don’t just let you represent one of 2ᴺ different values at a time, like regular memory bits, but can essentially keep track of a ghostly, intertwined combination of many of those possible 2ᴺ values at the same time.
This has led to a popular but erroneous belief that a powerful enough quantum computer might be able to crack encryption by trying out every possible key at the same time.
The myth is that instead of trying to decrypt with the password AAAA, then AAAB, then AAAC, and so on until you hit the jackpot, you get to try all combinations from AAAA to ZZZZ at once, and therefore hit the jackpot on your first go.
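For comparison, here’s what the conventional one-at-a-time search looks like in a few lines of C. (The decryption test itself is left as a hypothetical try_decrypt() placeholder; this is a sketch of the looping, not a real password cracker.)

    #include <stdio.h>

    /* Conventional brute force: try every 4-letter password from AAAA to
       ZZZZ in sequence, for 26x26x26x26 = 456,976 candidates in all. */
    int main(void) {
        char pw[5] = "AAAA";     /* 4 letters plus a trailing NUL */
        long tried = 0;
        for (pw[0] = 'A'; pw[0] <= 'Z'; pw[0]++)
        for (pw[1] = 'A'; pw[1] <= 'Z'; pw[1]++)
        for (pw[2] = 'A'; pw[2] <= 'Z'; pw[2]++)
        for (pw[3] = 'A'; pw[3] <= 'Z'; pw[3]++) {
            tried++;
            /* if (try_decrypt(pw)) { ...hit the jackpot, stop here... } */
        }
        printf("candidates tried: %ld\n", tried);
        return 0;
    }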
However, the downside of superpositions, very greatly simplified, is that when you try to read them out, all the answers that remain in the superposition evaporate into the quantum ether except for one.
The superposition collapses, in the jargon, leaving not necessarily THE answer, merely AN answer, perhaps not the one you wanted.
A quantum computer that really could directly perform 2¹²⁸ different decryptions in parallel, thus trying out 2¹²⁸ different keys in one go, would indeed be super-fast.
But even if it came up with all possible decryptions at once internally, it would ultimately provide you with just one random decryption at the end, because you don’t get to dig around inside the superposition to select the answer you are after.
You’d need to use a conventional computer to check if the answer was right, and to try again if it wasn’t.
In other words, you’d still be stuck with trying over and over and over again until you got lucky, which is exactly what you can already do with a conventional computer, and without spending the millions or billions of dollars that this imaginary quantum computer would cost to build and run.
In fact, only a few very carefully chosen algorithms are suitable for quantum computers.
You need to find mathematical problems where the quantum calculations that you perform automatically eliminate sufficiently many wrong answers along the way that the one you get back at the end has a useful chance of being right. (Remember that you can’t peek inside the quantum computer to make IF-THIS-THEN-THAT decisions as you go, because doing so would collapse the superposition and destroy its internal state.)
You then check that answer with a conventional computer and repeat the “experiment” over and over until you get lucky.
This could speed things up, as long as you get rid of sufficiently many wrong answers before you collapse the superposition and end up with just a single output value.
As a simple example, imagine that you have a problem that takes up to a million iterations to solve with a conventional computer, because there are a million possible input conditions to try out.
Now imagine that you can encode those one million possibilities into a quantum superposition, and perform a quantum calculation that eliminates 999,000 wrong answers, leaving just the right answer and 999 wrong answers behind.
One of those 1000 remaining answers is what you will get out of the quantum part of the computation, but you don’t get to choose which one it is.
At this point, you can approach the problem either by having a million goes at it with your conventional computer, or by repeatedly having a go at it on the quantum computer, followed by verifying the answer with the conventional computer so you can detect when you get the correct result.
Assuming it takes, on average, 1000 tries with the quantum computer to end up with the right answer (999 times out of 1000 the answer will be one of the wrong ones), and provided that 1000 goes on the quantum computer are faster than a million goes on the conventional computer, the quantum computer should save you time overall.
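To put rough numbers on that, here’s a minimal simulation in C (our own sketch, with the 1-in-1000 figure from the example above baked in) that estimates how many “collapse and check” rounds you’d need on average:

    #include <stdio.h>
    #include <stdlib.h>

    /* Each simulated "experiment" collapses to one of 1000 surviving answers,
       chosen uniformly at random; exactly one of them (slot 0) is correct.
       We measure how many tries are needed, on average, to hit it. */
    int main(void) {
        const int survivors = 1000;   /* 1 right answer + 999 wrong ones */
        const int runs = 100000;      /* number of simulated cracking attempts */
        long long total = 0;

        srand(42);
        for (int r = 0; r < runs; r++) {
            int tries = 0;
            do {
                tries++;
            } while (rand() % survivors != 0);
            total += tries;
        }
        printf("average tries: %.1f (theory says about %d)\n",
               (double)total / runs, survivors);
        return 0;
    }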
The best-known quantum cryptographic cracking algorithm of this sort is called Shor’s algorithm, which can in theory give an exponential speedup in some of the mathematical operations needed for cracking both RSA and Elliptic Curve (EC) encryption.
AES encryption and most cryptographic hashes can’t be handled by Shor’s algorithm. For those, a different process called Grover’s algorithm might be able to give a quadratic speedup, which is the same as reducing the time needed to its square root.
A quadratic speedup can be counteracted simply by doubling the key size, say from 128 bits to 256 for AES, or the hash size, say from SHA-256 to SHA-512.
However, the exponential improvement that some people claim for Shor’s algorithm can’t be “solved” just by doubling RSA or EC key sizes, so completely different algorithms are needed instead. This fear that a quantum computer could outrun any strengthening available from key size changes alone is part of the collective concern about a so-called quantum apocalypse.
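The arithmetic behind that key-doubling advice is easy to sanity-check: a quadratic speedup turns 2ᴺ tries into roughly 2 to the power N/2, so doubling N wins the original difficulty back. A trivial C illustration of the exponent bookkeeping:

    #include <stdio.h>

    /* Grover's quadratic speedup: searching a space of 2^n keys takes on the
       order of sqrt(2^n) = 2^(n/2) quantum steps, so doubling n restores the
       original work factor. */
    int main(void) {
        int key_bits[] = { 128, 256 };
        for (int i = 0; i < 2; i++) {
            int n = key_bits[i];
            printf("%3d-bit key: classical ~2^%d tries, Grover ~2^%d steps\n",
                   n, n, n / 2);
        }
        return 0;
    }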
Cue widespread fear, uncertainty and doubt every time a paper turns up with the words “quantum” and “cryptography” in its title.
And that’s what happened a few weeks ago when journalists picked up on a paper out of China entitled Quantum Annealing Public Key Cryptographic Attack Algorithm, and turned it into red-hot news.
Breathless headlines appeared, such as Chinese researchers break RSA encryption with a quantum computer.
Really?
Just how close did this research really bring us to RSA being cracked?
The “secret sauce” in RSA encryption is what you might call one-way arithmetic: a mathematical function that is computationally infeasible to invert, inversion being the formal name for ‘running it in reverse’.
In RSA, you multiply two X-bit prime numbers P and Q together to make a 2X-bit product N (that is, N = P × Q), which is known as a semiprime because it has exactly two prime factors:
P and Q become the private key, which is needed for the process of decryption; N becomes the public key, which anyone can use for encryption.
Handily, this means you can openly share your public key with anyone who wants to send you confidential data, but as long as you keep the private key to yourself, only you can unscramble the data after it has arrived: there is no need to meet up first to agree on a shared secret.
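To make the public/private split concrete, here’s a toy RSA round trip in C, using the textbook demonstration primes P = 61 and Q = 53. (These numbers are hopelessly small for real security, and real RSA also involves padding and other machinery; this is purely a sketch of the mechanism.)

    #include <stdio.h>
    #include <stdint.h>

    /* Square-and-multiply modular exponentiation: computes (base^exp) mod m */
    static uint64_t powmod(uint64_t base, uint64_t exp, uint64_t m) {
        uint64_t result = 1 % m;
        base %= m;
        while (exp > 0) {
            if (exp & 1) result = (result * base) % m;
            base = (base * base) % m;
            exp >>= 1;
        }
        return result;
    }

    int main(void) {
        /* Toy parameters, far too small for real use: P = 61, Q = 53 */
        uint64_t n = 61 * 53;   /* public modulus N = 3233 */
        uint64_t e = 17;        /* public exponent */
        uint64_t d = 2753;      /* private exponent: e*d = 1 mod lcm(60,52) */
        uint64_t msg = 65;      /* the "plaintext", just a small number here */

        uint64_t c = powmod(msg, e, n);   /* encrypt with the public key (n, e) */
        uint64_t p = powmod(c, d, n);     /* decrypt with the private key d    */
        printf("plaintext %llu -> ciphertext %llu -> decrypted %llu\n",
               (unsigned long long)msg, (unsigned long long)c,
               (unsigned long long)p);
        return 0;
    }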
The one-wayness in RSA comes from the fact that doing the calculation P × Q to generate N is very easy, but splitting N back into its unique pair of prime factors P and Q is much harder.
In other words, knowing N is not enough for an attacker to work backwards to the values of P and Q that they would need to snoop on encrypted data.
Here’s why.
Numbers get harder to multiply together as they get bigger, but at a cost that goes up at worst with the square of their length.
Even with huge numbers, with hundreds or thousands of decimal digits, we can compute products surprisingly quickly, and we have efficient algorithms for multiplying together even truly huge numbers, far bigger than the ones needed for RSA keys.
For the data below, we multiplied together two prime numbers each with tens or hundreds of thousands of decimal digits (millions of bits), using a programming library called imath that is dedicated to handling huge integers:
In the chart above, the dotted line is a quadratic curve computed to fit as closely as possible to the individual data points.
The quadratic (which means based on squaring) nature of the relationship is clear from this alternative chart in which the numbers on the Y-axis are the square roots of the numbers above, and the dotted line is the best-fitting straight line through the resulting points:
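If you’d like to try the forward direction yourself, here’s a minimal timing sketch. We’ve used the GMP big-number library here rather than imath, simply because GMP is so widely packaged; the principle is the same, and random odd numbers serve as stand-ins for primes, since primality doesn’t affect multiplication speed. (Build with: cc mul.c -lgmp, where mul.c is whatever you call the file.)

    #include <stdio.h>
    #include <time.h>
    #include <gmp.h>

    /* Time the product of two random million-bit numbers: the "easy"
       forward direction of RSA's one-way arithmetic. */
    int main(void) {
        gmp_randstate_t rng;
        mpz_t p, q, n;

        gmp_randinit_default(rng);
        mpz_inits(p, q, n, NULL);

        mpz_urandomb(p, rng, 1000000);   /* a million random bits each */
        mpz_urandomb(q, rng, 1000000);

        clock_t t0 = clock();
        mpz_mul(n, p, q);                /* the multiplication under test */
        clock_t t1 = clock();

        printf("million-bit multiply took %.3f ms\n",
               1000.0 * (t1 - t0) / CLOCKS_PER_SEC);

        mpz_clears(p, q, n, NULL);
        gmp_randclear(rng);
        return 0;
    }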
But dividing each 2X-bit number N back into its X-bit components P and Q, known as factoring or factorizing, can only be done by trial and error.
The simplest algorithm involves trying every possible prime factor one after the other: 2, 3, 5, 7, 11, 13 and so on, all the way up to the square root of N, which has X bits.
(We don’t need to try dividing by factors greater than √N, on the grounds that if one of the factors is greater than √N, any and all other factors must be less than √N, so we would already have found them.)
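In C, that elementary approach fits in a dozen lines; this sketch finds the smallest factor of N by trial division up to √N, returning N itself if N turns out to be prime:

    #include <stdint.h>

    /* Trial division: find the smallest prime factor of n (n >= 2) by testing
       divisors up to sqrt(n). If none divides n, then n itself is prime. */
    static uint64_t smallest_factor(uint64_t n) {
        if (n % 2 == 0) return 2;
        for (uint64_t d = 3; d * d <= n; d += 2) {  /* odd candidates only */
            if (n % d == 0) return d;               /* found P; Q is n / d */
        }
        return n;
    }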
There are therefore at most 2 to the power X different possible X-bit factors to try, so the time needed for factoring, loosely speaking, goes up exponentially with the number of bits:
Below is an alternative view of the data using a logarithmic scale on the Y-axis (logarithms are the inverse of exponentials), a trick that creates a straight line if the numbers really do grow exponentially:
There are factoring techniques available that are much faster than the elementary approach of dividing a semiprime by every viable factor in turn until you find one (which automatically gives you the other factor, given that if N = PQ and you find P, then Q = N/P).
But even the best-known algorithms on conventional computers can’t escape from the runaway increase in the time needed to factorize ever-bigger semiprimes.
Here are the CPU times recorded using the specialized CADO-NFS factoring tool, which is based on the best-known algorithms today:
And here are the CADO factoring times with a logarithmic scale, so that each grid square on the X-axis involves adding 10 more digits to the semiprime product that we’re factoring, while each grid square on the Y-axis involves multiplying by 10 the CPU time required:
Adding more CPU cores and more networked computers to a CADO factoring task will at best linearly reduce the elapsed time needed to complete the job, but that’s not enough to keep up with the rate of growth in the complexity of the overall task as the prime factors increase in size.
Because multiplication goes up at most with the square of the number of bits in P and Q, calculating N = PQ with 100-bit primes has a cost on the order of 100², or 10,000 “units of difficulty”.
But factoring goes up exponentially, so splitting N back into P and Q costs on the order of 2¹⁰⁰ units of difficulty in comparison, which is about one followed by 30 decimal zeros, or one million million million million million.
That’s why calculating N from P and Q is feasible, but working out Q = N/P or P = N/Q from N alone is not, as long as P and Q are large enough.
In practice, we don’t use 200-bit RSA public keys, which would be the product of two 100-bit primes, but 2048-bit or 3072-bit public keys, with a correspondingly huge increase in the difficulty of cracking them.
2048-bit RSA semiprime public keys are therefore today’s table stakes for a quantum computer that would truly justify the headline “RSA cracked”.
In this much-hyped paper, however, the authors “cracked” an RSA “public key” with a paltry 22 bits.
That means they factored a 22-bit semiprime (7 digits) back into the two randomly generated 11-bit (4-digit) primes that were multiplied together to create it.
At this scale, of course, the difference between the multiplication N = PQ and the factorization Q = N/P is computationally unexciting.
If you want to put rough numbers on it, the ‘forward function work cost’ of the multiplication is 11², which is 121, and the ‘inverse function work cost’ of factoring is 2¹¹, or 2048, which is only about 16 times bigger.
That doesn’t compare to the difference between 100² and 2¹⁰⁰ that we mentioned above, and it is absurdly far away from the difference between 1024² and 2¹⁰²⁴, which are the corresponding difficulty units for multiplying (N = PQ) versus factoring (Q = N/P) when 2048-bit RSA keys are used.
So, what about 22-bit “public keys”?
We can already trivially process this sort of problem with an everyday computer.
Here’s a tiny, old-school C function to factorize 22-bit numbers, using an algorithm that goes back well over 2000 years, called the Sieve of Eratosthenes:
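A minimal smallest-prime-factor sieve along those lines (our own sketch of the idea; the exact code may have differed) looks like this:

    #include <stdint.h>
    #include <stdio.h>

    #define LIMIT (1u << 22)     /* 4,194,304: one past the largest 22-bit number */

    static uint32_t spf[LIMIT];  /* smallest prime factor of each n */

    /* Sieve of Eratosthenes, recording the smallest prime factor of every n */
    static void build_sieve(void) {
        for (uint32_t i = 2; i < LIMIT; i++) {
            if (spf[i] == 0) {                     /* i is prime */
                for (uint32_t j = i; j < LIMIT; j += i) {
                    if (spf[j] == 0) spf[j] = i;   /* first hit = smallest factor */
                }
            }
        }
    }

    /* Fully factor n by repeatedly dividing out its smallest prime factor */
    static int factor(uint32_t n, uint32_t *out) {
        int k = 0;
        while (n > 1) {
            out[k++] = spf[n];
            n /= spf[n];
        }
        return k;                /* number of prime factors found */
    }

    int main(void) {
        uint32_t factors[22];    /* a 22-bit number has at most 22 prime factors */
        uint64_t total = 0;
        build_sieve();
        for (uint32_t n = 1u << 21; n < LIMIT; n++) {   /* every 22-bit number */
            total += factor(n, factors);
        }
        printf("factored all 22-bit numbers; %llu prime factors in total\n",
               (unsigned long long)total);
        return 0;
    }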
Let’s time it:
Notice that we haven’t just factored a chosen 22-bit number in 0.659 seconds, but have fully factored all possible 22-bit numbers (there are just over two million of them) in that time.
That’s an average of about 300 nanoseconds to factor each 22-bit number.
With a tiny modification to skip numbers as soon as we know they don’t have exactly two 11-bit factors, which means they can’t be RSA semiprimes, the same code runs through all 22-bit candidate semiprimes in less than 80 milliseconds:
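The modification is small; reusing the spf[] table from the sieve sketch above, the filter might look like this (again our own reconstruction, not necessarily the original code):

    /* Keep only n = P*Q where P and Q are both 11-bit primes, i.e. the viable
       22-bit RSA-style semiprimes. Reuses spf[] and LIMIT from the sieve above.
       (Because n has 22 bits, the bounds P < 2048 and Q >= 1024 follow
       automatically once the checks below pass.) */
    for (uint32_t n = 1u << 21; n < LIMIT; n++) {
        uint32_t p = spf[n];                /* smallest prime factor of n */
        if (p < (1u << 10)) continue;       /* P must be at least 11 bits */
        uint32_t q = n / p;
        if (q >= (1u << 11)) continue;      /* Q must be at most 11 bits */
        if (spf[q] != q) continue;          /* Q must itself be prime */
        /* n is a viable 22-bit semiprime with factors p and q */
    }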
In fact, the factoring run above computes a complete N = PQ lookup table for every viable 22-bit RSA public key in those 0.079 seconds.
With a pre-computed list of semiprimes and their factors as shown above, we could factor any 22-bit RSA public key with a simple table lookup, which would probably take just a few nanoseconds on a commodity laptop.
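Indeed, with the spf[] table from the sketch above held in memory, that “lookup” really is just one array access and one division:

    /* Factoring a known-good 22-bit semiprime n is then a constant-time
       operation: no searching, no trial division, no quantum computer. */
    uint32_t p = spf[n];    /* smallest prime factor, straight from the table */
    uint32_t q = n / p;     /* and the other factor follows by division */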
I think we can all now easily answer the question, “Has RSA been cracked?” with a resounding “No,” whether we invoke quantum physics and quantum computing or not.
So, if you’re a cybersecurity journalist, please try to steer clear of Quantum Crypto Clickbait, because it leaves community-centered security advisors with needless debunking work to do.
There’s plenty of useful work we could be doing to help our friends, family, and colleagues get the basics right instead, such as getting faster at patching, safer at managing passwords, and more resilient against getting lured onto phishing sites or tricked into giving out information we should have kept to ourselves.
Thanks for listening, and stand down from beige alert!
Why not ask how SolCyber can help you do cybersecurity in the most human-friendly way? Don’t get stuck behind an ever-expanding convoy of security tools that leave you at the whim of policies and procedures that are dictated by the tools, even though they don’t suit your IT team, your colleagues, or your customers!
Paul Ducklin is a respected expert with more than 30 years of experience as a programmer, reverser, researcher and educator in the cybersecurity industry. Duck, as he is known, is also a globally respected writer, presenter and podcaster with an unmatched knack for explaining even the most complex technical issues in plain English. Read, learn, enjoy!
Featured DB5 image by Dynamic Wang via Unsplash.