How Does Math Keep Secrets?
Introduction
Can you keep a secret? Modern techniques for maintaining the confidentiality of information are based on mathematical problems that are inherently too difficult for anyone to solve without the right hints. Yet what does that mean when quantum computers capable of solving many problems astronomically faster are on the horizon? In this episode, host Janna Levin talks with computer scientist Boaz Barak about the cryptographic techniques that keep information confidential, and why “security through mathematics” beats “security through obscurity.”
Listen on Apple Podcasts, Spotify, TuneIn or your favorite podcasting app, or you can stream it from Quanta.
Transcript
[Theme plays]
JANNA LEVIN: We all have secrets we want to obscure, from childhood notes between friends, to Da Vinci’s notebooks, to the wartime messages famously cracked by Alan Turing and a cohort of English cryptographers. To share secrets with a friend, an ally, a co-conspirator, there is cryptography. There are codes and ciphers, ingenious means to safeguard information against prying eyes. But in lockstep, there are codebreakers and equally ingenious means to decipher the hidden information.
Cryptography has become crucial to modern life and commerce to protect our emails, our banks, and our national security. While developing more and more secure encryptions, researchers have recently made some unexpected discoveries that reveal deeper truths about the theoretical limits of secrecy itself.
I’m Janna Levin and this is “The Joy of Why,” a podcast from Quanta Magazine where I take turns at the mic with my cohost, Steve Strogatz, exploring the biggest questions in math and science today.
In this episode, theoretical computer scientist Boaz Barak demystifies cryptography as we ask: Is it possible to perfectly protect secrets?
[Theme fades out]
Boaz is the Gordon McKay professor of computer science at Harvard University. He’s also a member of the scientific advisory board for Quanta Magazine and the Simons Institute for the Theory of Computing. His research centers on cryptography, computational complexity, quantum computing and the foundations of machine learning.
Boaz, welcome to the show.
BOAZ BARAK: Thank you. Thank you very much.
LEVIN: So glad to have you. This is quite a challenging subject, and to open, I kind of want to do the opposite of encrypting this conversation. We’re here to share ideas. And so let’s start by opening the dictionary of terms here. What is cryptography? What are ciphers?
BARAK: So, cryptography’s meaning has really evolved over the years. I think in the early days, since humans began writing, they had this notion of secret writing, some ways to obscure their secrets. And cryptography was kind of synonymous with that, with basically encryption.
But then, more recently, since the 1970s, cryptography has really expanded and evolved, and it's no longer just encryption, it's also authentication, like digital signatures, and even fancier things like zero-knowledge proofs, multiparty secure computation, and many other ways to basically protect not just communication but also computation.
LEVIN: So it’s as though we figured out how to write language, and then we decide sometimes we don’t want to share our inner thoughts, and so we write secretly. And that must go back quite a long way. So what are some of the earliest encryption and decryption techniques?
BARAK: So a famous example in encryption is Caesar's cipher, which is attributed to Julius Caesar — I believe it predates him — a very, very simple system of obscuring data or messages, where the idea is just that you shift the letters of the alphabet. So, for example, the letter A maps to, say, the letter F, the letter B maps to G, the letter C maps to H, et cetera. And this is a very simplistic approach, which is not that hard to break.
Generally, the Caesar cipher is a special case of what's known as a substitution cipher, where you have some kind of a table mapping the letters of the message you're trying to encrypt (what we typically call the plaintext) into the encrypted form, which we call the ciphertext.
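To make the letter-shifting concrete, here is a minimal sketch of a Caesar-style shift cipher in Python; the five-letter shift and the sample message are illustrative, not from the episode:

```python
# Minimal sketch of a Caesar-style shift cipher (illustrative only).
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar(text: str, shift: int) -> str:
    """Shift each letter of `text` by `shift` positions; leave other characters alone."""
    shifted = ALPHABET[shift:] + ALPHABET[:shift]   # e.g. shift=5 gives "FGHIJ...E"
    table = str.maketrans(ALPHABET, shifted)        # the substitution table: A->F, B->G, ...
    return text.upper().translate(table)

ciphertext = caesar("ATTACK AT DAWN", 5)   # -> "FYYFHP FY IFBS"
plaintext = caesar(ciphertext, -5)         # decrypt by shifting back
print(ciphertext, "|", plaintext)
```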
And these types of substitution ciphers have been very common. One of the famous examples was used by Mary, Queen of Scots, when she was planning a coup against her cousin, Elizabeth. And substitution ciphers are not very secure. You can typically break them just by looking at how frequently each symbol appears in the ciphertext. For example, you can figure out that the most frequent symbol is probably the encoding of E, because that's the most frequent letter in English.
LEVIN: I was going to guess E.
BARAK: Yes. And using that, Queen Elizabeth’s spies managed to crack Mary’s cipher, and it didn’t end up well for Mary, who was executed.
And that has actually been kind of a typical story with encryption throughout the years, where someone comes up with an encryption scheme, they believe it is unbreakable, they use it for life-and-death applications. And it turns out that it is breakable. And typically when you use something for life-and-death applications and it doesn’t work, it doesn’t bode well for you.
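The frequency-count attack Barak described above is just as easy to sketch. A minimal illustration in Python, reusing the toy ciphertext from the Caesar example:

```python
# Minimal sketch of frequency analysis against a substitution cipher (illustrative only).
from collections import Counter

ciphertext = "FYYFHP FY IFBS"   # the Caesar-shifted "ATTACK AT DAWN" from the example above
counts = Counter(c for c in ciphertext if c.isalpha())

# In a long English text, the most common ciphertext symbol most likely stands for E,
# the next ones for T, A, O and so on, which lets you start guessing the substitution table.
for symbol, count in counts.most_common(3):
    print(symbol, count)   # F appears 4 times here, so F is a good first guess for a common letter
```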
LEVIN: Yeah. Dire consequences. To speak to that point, the 20th century really was a critical point in cryptography. I mean, there were two world wars where encryption and decryption played a really major role. And at the same time, maybe as a result, the field of cryptography began to become a very serious and important subject, both for intellectual and scientific reasons, but also for survival — for the fate of the world, right?
And we mentioned one of the central figures, like Alan Turing, and his role famously in cracking the German Enigma cipher. What else was really shifting in the significance of cryptography in the 20th century?
BARAK: So, from a cryptography point of view, I think, yes, the Enigma cipher, which was believed to be unbreakable by the Germans, partly because if you were trying to figure out how many combinations there were for the secret key of the Enigma, which involved setting the wires of several rotors …
LEVIN: It was kind of like a typewriter, almost.
BARAK: Yes, it looked exactly like a typewriter. When I teach at Harvard, I always have to bring up a photo of a typewriter, because these days students don't know what a typewriter looks like.
But it looked like a typewriter. But when you hit a key, something else would come out. So you hit the letter A, and maybe the letter Z would come out. And it wasn't a simple substitution cipher, in the sense that if you hit the letter A again, a different letter would come out. The state of the system was constantly evolving, which made it much harder to break.
And the secret key was actually the setting of the wires of the rotors inside this typewriter. So there were several rotors, which were wired in a certain way. And the number of possibilities for the secret key was absolutely enormous, something like 10^100, which means that even with today's computers, if we were trying to break the Enigma by brute force, by simply trying out all possibilities, the sun would die out and collapse before we were done.
Let alone with the computing devices that they had in the 1940s. So it took a lot of mathematical ingenuity by Alan Turing and many other people at Bletchley Park, and even before that some insights from the Polish intelligence services, to actually break the Enigma.
And one of the lessons that cryptography took from that is that trying to build security by having a very complicated system like the Enigma is actually not going to succeed. Cryptography transitioned into relying in some sense on simpler systems, but with a more sound mathematical analysis. And a key figure in bringing about the mathematical analysis of cryptography was Claude Shannon, who in the late 1940s wrote some influential papers, starting with a mathematical theory of encryption.
And then in the 1970s, people like [Whitfield] Diffie, [Martin] Hellman and [Ralph] Merkle — these are three separate people — started the mathematical theory of public key cryptography, which was then built up. And really in the late '70s and early '80s we started to have a mathematical theory of cryptography that, rather than being based on obscure problems, like the Enigma, was actually based on very simple problems, like, say, the problem of finding the prime factors of large integers, which has been around for thousands of years but which, despite this history, we still don't know an efficient algorithm for.
LEVIN: Now, it’s interesting that Alan Turing comes up not only in these world-changing crises over cracking codes, but also as the inventor of the modern concept of computation itself. You know, Turing thought about mechanizing thought, and it led him to the notion of a computer that wasn’t a human, which is how the term “computer” had originally been used. Computers were people who computed, and Alan Turing changed that to the idea of a machine that could compute. So you’re talking about these wonderful theoretical changes. How did cryptography change with the advent of actual machines that we now call computers?
BARAK: So, indeed, I think some of the earliest mechanical computers were built in Bletchley Park exactly for the task of automating some of the analysis in breaking the Enigma and other ciphers. And Alan Turing had a broader vision than that. So specific computing devices have always been around to mechanize computation to some extent.
But Alan Turing had this broader vision of a general-purpose computer. I should say that Charles Babbage had this same vision maybe 70 years earlier, but he had never gotten around to actually building the device.
And Turing had this idea that you could actually build this device that would be general purpose. In some sense, there is a notion of a universal computer, or as we call it today, a universal Turing machine, that is the same piece of hardware that can run arbitrary functionality by basically being supplied with software.
LEVIN: It is quite an amazing evolution. So now here we are, and cryptography plays a part in nearly everything we do — whether we’re fully aware of it or not. I mean, we use encryption to hide private messages, of course, but also to secure information, to compact it and secure it. It shows up in telecommunications, medical records, how photos are stored on our phones. So tell us a little bit about how cryptography is integrated into all of our everyday lives.
BARAK: Yeah. So people don’t realize, for example, that conversations such as the one you and I are having, and that millions of people are having using Zoom or other telecommunication frameworks, often go over wireless connections, which basically means that the signals we are transmitting are going through the air and anyone can pick them up. The reason our conversations are still secure is because they are encrypted.
Now, also, all of us basically carry around a device that both stores all of our private information and has a microphone and a camera that can record, potentially, all of our communication.
And, moreover, this device is a fully programmable computer. All our smartphones are fully programmable computers. And they can get over-the-air updates that completely change their functionality. The reason that, say, hackers don’t send us over-the-air updates that convert our smartphones into listening, recording and surveillance devices is because we use cryptography, and specifically digital signatures, so that the device — even when it gets a software update — can verify that the update actually came from the manufacturer.
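That signature check can be sketched in a few lines. This is a minimal illustration, assuming the third-party Python cryptography package and an Ed25519 key pair; the key names and the firmware payload are hypothetical, not any phone vendor's actual scheme:

```python
# Minimal sketch of signing and verifying a software update (illustrative only).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The manufacturer generates a key pair once; the public key ships baked into the device.
manufacturer_key = Ed25519PrivateKey.generate()
device_trusted_key = manufacturer_key.public_key()

update = b"firmware version 42"            # hypothetical update payload
signature = manufacturer_key.sign(update)  # created on the manufacturer's side

# On the device: accept the update only if the signature checks out.
try:
    device_trusted_key.verify(signature, update)
    print("update accepted")
except InvalidSignature:
    print("update rejected: not signed by the manufacturer")
```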
LEVIN: So fascinating that all of this is really part of our absolutely ordinary, everyday lives, not just life or death stuff. We’re going to take a little break and be right back after this message.
[Break insertion for ads]
LEVIN: Welcome back to “The Joy of Why.” We’re speaking with Boaz Barak about the art and science of cryptography.
Now, you’ve mentioned some of the ways where today we’re trying to protect information — and why we’re trying to protect information. So, what still makes us vulnerable? I mean, we have these algorithms, we have data that we can encrypt, we have all kinds of rules about passwords and user privacy. But what makes a system vulnerable? And how do they typically break?
BARAK: So that’s a great question. So there are generally two ways in which systems can be broken. So first of all, while we have these great encryption schemes, we actually don’t have a mathematical proof that they are secure. And proving that they are secure is actually related to some of the greatest open questions in computer science and science at large, and specifically the P-versus-NP question.
LEVIN: Can you remind us what P and NP mean?
BARAK: Yes. So the P means polynomial time and NP means non-deterministic polynomial time.
So P refers to the class of problems that can be solved efficiently by a computing device, whether that computing device is our standard digital computer or any other computing device. And NP refers to the class of problems whose solutions can be efficiently verified by a computing device. So the P-versus-NP question really asks whether every problem whose solution can be efficiently verified can actually also be efficiently found.
Now, intuitively, we think that P should be different than NP, that there are some problems where it’s much easier to verify a solution once someone gave it to you than to find it yourself from scratch. But we actually don’t have a mathematical proof that this is the case, even though it is widely conjectured.
And one example of a problem that is of this type is if someone gave you the secret key that could decrypt all of the communication between two parties, then you could easily verify that the secret key actually works. But finding out the secret key can take time, which, as I said, could potentially be longer than the time that it would take for the sun to die out.
So if P equals NP, then in some sense we could break every possible encryption scheme. But it is widely conjectured that P is not equal to NP. So we don’t have a proof that the encryption schemes that we use are unbreakable. And once in a while, people do manage to break the underlying cryptography, although the main cryptosystems that we are using have basically been unbroken since the 1970s.
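A concrete way to see the verify-versus-find gap is factoring, the hard problem Barak mentioned earlier. A minimal sketch in Python, with toy numbers that are purely illustrative:

```python
# Verifying a factorization is a single multiplication; finding one by brute force
# takes on the order of sqrt(N) trial divisions (toy numbers only).
P, Q = 1_000_000_007, 1_000_000_009   # two well-known primes
N = P * Q                             # easy to compute, easy to verify

def verify(p: int, q: int, n: int) -> bool:
    """Easy direction: checking a claimed factorization is one multiplication."""
    return p * q == n and 1 < p < n

def find_factor(n: int) -> int:
    """Hard direction: naive search is exponential in the number of digits of n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

print(verify(P, Q, N))   # True, instantly
print(find_factor(91))   # 7; fine for tiny numbers
# find_factor(N) would need roughly 10**9 trial divisions; for 2048-bit RSA moduli the
# same approach would outlast the sun, which is the asymmetry cryptography exploits.
```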
But it’s far more common to go around the cryptography. And that’s how hackers actually break into our systems. So one metaphor that I like to use is that cryptography, when properly implemented, is basically like a hundred-ton steel door, but the systems where we use it are basically like a wooden shack. So if you’re installing like a hundred-ton steel door on a wooden shack, then yes, a thief would not be able to break the door, but they might find a different way around it.
And the sheer complexity of modern systems means that it’s really hard to secure all sides of them. So hackers typically don’t break the cryptography, at least when it’s properly implemented, but go around the cryptography.
LEVIN: Fascinating, because a lot of this also dates back to those very deep concepts in mathematics about what’s provable and unprovable. And in a way, this is kind of the computing manifestation of that. That’s a whole other episode. Very deep stuff.
So before the 1970s, most cryptography was symmetric, in the sense that the same key would be used to encrypt and decrypt a message. Is that what you’re referring to, that since the 1970s, cryptography is asymmetric, where there’s a public key used for encryption and a private key used for decryption?
BARAK: Yes. So this was one of the major changes that happened in the 1970s. Before the 1970s, indeed, cryptography was basically what we call private key cryptography, where the sender and receiver share a secret key that they use for communication.
And this kind of worked okay for the military applications of cryptography, where you have a spy, and maybe before you send them off to a mission you give them a code book that they would use to communicate with the home base. But it doesn’t really work with the modern economic applications of cryptography.
So I don’t know about you, but, you know, I’m a Gmail user and I rely on encrypted communication between me and the Google servers to look at my email securely. But I’ve never actually paid a visit to Google headquarters to exchange with them a private key. And if every user of Google had to physically exchange a key with the Google service, the modern internet would not exist.
So in public key cryptography, two parties can communicate over an unsecured channel and exchange confidential information by basically having one party, let’s say the receiver, send their public key to the sender. And then the sender can use that public key to encrypt their message, send it over the unsecured channel. And the receiver can use their secret key to decrypt it.
Point being that, yes, people could be listening on this channel and they could all learn the public key — but the public key can only be used for encryption. It cannot be used for decryption.
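That flow, where the receiver publishes an encryption key and keeps the decryption key private, looks roughly like the following. A minimal sketch, assuming the third-party Python cryptography package and RSA with OAEP padding; the message is invented for illustration:

```python
# Minimal sketch of public-key encryption with RSA-OAEP (illustrative only).
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Receiver: generate a key pair and publish only the public half.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()   # safe to send over an unsecured channel

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Sender: anyone who sees the public key can encrypt, but cannot decrypt.
ciphertext = public_key.encrypt(b"meet at noon", oaep)

# Receiver: only the holder of the private key can recover the message.
print(private_key.decrypt(ciphertext, oaep))   # b'meet at noon'
```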
LEVIN: It does make me wonder, though — all of that’s quite amazing. That’s such tremendous progress. We exchanged Gmails today and I feel pretty confident nobody read our Gmails, not least because they weren’t that interesting, right? “I’ll see you there then,” you know, “I’m running late,” whatever. But are there theoretical limits to cryptography? And can things truly be unconditionally secure?
BARAK: So there are two answers to this question.
First of all, yes, it is known that public key cryptography can never be unconditionally secure, in the sense that it could always be broken by basically trying all possible keys. But trying all possible keys is an effort that scales exponentially with the key size. So, at very moderate key sizes, that already requires spending more compute cycles than there are atoms in the observable universe, or similarly astronomical quantities, which you are probably more familiar with than I am. So that’s not really a barrier.
So, theoretically, we could have a mathematical theorem that tells us that this cryptosystem cannot be broken by any attacker that would spend less than, say, a number of operations that scales like 2^n, where n is the length of the key.
We don’t have a theorem like that. And the reason is that such a theorem would in particular also imply that P is different from NP, which is a major question that we haven’t been able to resolve. So at the moment, we have to settle for conditional proofs of security, proofs that rest on certain conjectures.
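To put rough numbers on that exponential scaling, here is a back-of-the-envelope calculation; the operations-per-second figure is an assumed exascale rate, not a number from the episode:

```python
# Back-of-the-envelope: how long a brute-force search over a 128-bit key space would take.
SECONDS_PER_YEAR = 3.15e7
keyspace = 2 ** 128        # about 3.4e38 possible keys
ops_per_second = 1e18      # assume an exascale machine trying one key per operation

years = keyspace / ops_per_second / SECONDS_PER_YEAR
print(f"{years:.1e} years")   # ~1.1e13 years, versus roughly 5e9 years left for the sun
```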
LEVIN: Now, there’s also twists on this whole idea. Sometimes I don’t want to completely obscure everything that’s going on. Sometimes I want to let you know I know something. But I don’t necessarily want to reveal all, right? I believe these are known as zero-knowledge proofs. Can you expand on that? Explain to me why sometimes I want you to know I know, I want to prove to you that I know. I don’t want you to just take my word for it without revealing the information I’m protecting.
BARAK: Sure. Actually, let me give you an example in the context of nuclear disarmament. You know, Russia and the U.S. both have a huge stockpile of nuclear warheads, far more than necessary to destroy major parts of the world several times over. It’s very expensive. It’s actually in both countries’ interest to reduce this stockpile — and it’s also in the interest, obviously, of world safety, because the fewer warheads out there, the less likely we are to have a completely devastating nuclear war.
But part of the problem is that this is an equilibrium, and it’s hard to agree on reducing the stockpiles. Another part is, how do you verify that the other side really did, say, destroy the warheads? One solution to that is simple. You know, say, for example, the Russians declare we are going to destroy these hundred warheads.
They invite American inspectors to come to the warehouse where the warheads are stored and take a look at them, examine them, and then they go into whatever machine that destroys all of these things. That is great — except that the design of a nuclear warhead is one of the most classified secrets that the country has. And the Russians have no intention of letting American inspectors anywhere near opening up the warhead and examining it.
So then it becomes a question of, say, I have a barrel. Can I prove to you that there is a nuclear warhead inside the barrel without giving you any details of the exact design of this warhead?
And this is the type of question that zero-knowledge proofs are designed to address. So, you want to, say, prove that something satisfies a certain predicate. So, for example, this barrel contains a nuclear warhead, or maybe this number is a composite number, or it’s a composite number where one of the prime factors ends with the digit 7.
So, you have a certain object, and you want to prove that it satisfies a certain predicate without giving away the information such as the design of the warhead or the factorization of the number that really proves why the object satisfies this predicate.
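Full zero-knowledge proofs for arbitrary predicates are involved, but one classic toy protocol gives the flavor: proving you know a square root modulo N without revealing it, in the spirit of Fiat-Shamir identification. A minimal sketch with toy parameters, not the warhead verification scheme Barak describes:

```python
# One round of a classic zero-knowledge identification protocol (toy parameters only).
# The prover knows s with v = s^2 mod N, and convinces a verifier without revealing s.
import random

N = 3233            # toy modulus (61 * 53); real protocols use huge moduli
s = 123             # prover's secret
v = pow(s, 2, N)    # public value: v = s^2 mod N

def one_round() -> bool:
    r = random.randrange(1, N)        # prover picks a fresh random value
    x = pow(r, 2, N)                  # ... and commits to it by sending x = r^2 mod N
    b = random.randrange(2)           # verifier's random challenge bit
    y = (r * pow(s, b, N)) % N        # prover answers with r or r*s, depending on the bit
    return pow(y, 2, N) == (x * pow(v, b, N)) % N   # verifier checks y^2 == x * v^b mod N

# A cheating prover passes each round with probability at most about 1/2,
# so repeating the round many times makes cheating overwhelmingly unlikely.
print(all(one_round() for _ in range(20)))   # True for an honest prover
```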
LEVIN: Fascinating example, and unfortunately timely. Does this relate to the concept of “obfuscation” that I’ve been reading about?
BARAK: So obfuscation is kind of a vast generalization of a lot of things, including zero-knowledge proofs and others. Basically, the idea is: could you take, say, a computer program and transform it in such a way that it becomes like a virtual black box, in the sense that you will be able to run it, but you will not be able to examine its internals?
So you could have, say, some computer program that potentially takes as input some of the secret information, but only produces a 0 or 1 bit — does the secret information satisfy a certain predicate or not? So obfuscation can be used to achieve zero-knowledge proofs. It can be used to achieve a lot of other cryptographic primitives.
And one of them, for example, is the idea of secure multiparty computation, where the idea is, maybe, for example, you have a certain private input. I have a certain private input. Maybe, you know, you are a hospital and you have your own patient records; I am another hospital, we have our own patient records. For reasons of patient confidentiality, we cannot share with each other the patient records. But could we run some computation that at the end of it, we will learn some statistical information about both of the records without revealing any of the secret information? And this falls under secure multiparty computation.
LEVIN: Now, in particular, I’ve read the phrase “indistinguishability obfuscation.” Doesn’t exactly roll off the tongue, but I believe I’ve heard you refer to it as, you know, “the one cryptographic primitive to rule them all.”
BARAK: Yes. So with obfuscation, in some sense, if you could have it generically, you could basically do anything you want in cryptography. And unfortunately, in 2001, some colleagues and I proved that the most natural notion of obfuscation — virtual black-box obfuscation, which is kind of a mathematical translation of what I said before, that you take a program and you compile it in such a way that it is virtually a black box — is impossible to achieve.
But then we said, there are weaker notions of obfuscation, in particular this notion of indistinguishability obfuscation, which our impossibility proof didn’t apply to. But we had no idea whether that notion of indistinguishability obfuscation is possible to achieve and, if so, whether it would actually be useful for all of the applications.
So then in 2013, there was a breakthrough showing that, yes, indistinguishability obfuscation can be constructed and, in fact, that it can be useful for many applications. And since then there’s been a steady stream of works showing more and more applications of indistinguishability obfuscation, and also works trying to make it not just theoretically possible, but also practically feasible.
The latter is still very much a work in progress. So right now, the overhead is such that we cannot use it. But the theoretical feasibility shows that perhaps in the future, we will be able to improve the efficiency and make it usable.
LEVIN: Now, there’s this beautiful thing on the horizon that keeps looming and receives a lot of discussion. And that’s quantum computers, of which we have no large-scale examples yet. But how would these methods fare in a quantum world, against quantum technologies?
BARAK: So this is a fascinating question, and actually a very timely one at the time we are recording this interview. So, first of all, quantum computers are still computers. They can be mathematically modeled, and while they appear to give exponential speedups for some very structured computational problems, they are not a magic bullet that can speed up any computation whatsoever. In particular, essentially all of the secret key encryptions that we currently use will remain secure also against quantum computers.
However, maybe the most important and famous problem for which quantum computers can offer an exponential speedup is the integer factoring problem, which lies at the heart of the RSA encryption scheme.
That means that if scalable quantum computers are built, then all the public-key encryption schemes that are based on these number-theoretic objects — the RSA system, based on integer factoring; Diffie-Hellman, based on the discrete logarithm; and its variant based on elliptic curves — will all be broken.
Unfortunately, it’s very hard to construct a public-key encryption scheme, so we don’t have many candidates. That is one major family of candidates, and there is one other major family of candidates for public-key encryption, which is based on problems relating to lattices. These lattice-based problems are not known to be broken by quantum computers, so they form the basis for what is sometimes known as post-quantum cryptography, at least in the public-key encryption setting.
LEVIN: Now, you’ve already mentioned a number of other major techniques used in cryptography, and we don’t really have a chance to pick apart each of these, like what the lattice approach means, or the integer one. But I think we get the impression. They’re rooted deeply in fundamental mathematics, which I think is an important point that maybe not everyone realizes. The tools are somehow fundamental to how nature encodes math and how math encodes information in a deep way. And a lot of these complex techniques go back to fundamental mathematics and keep relying on that core discipline to progress.
BARAK: Absolutely. And one of the things I tell students when I lecture on cryptography is that we have moved from a “security through obscurity” to “security through mathematics.”
And it turns out that security through mathematics is much more reliable. So, even though today’s attackers have access to computational resources that far dwarf what people had at Bletchley Park, we still have cryptographic schemes that nobody has been able to break, which is why, as I said, attackers go around the cryptography instead.
And the reason is that rather than trying to build very complicated, esoteric systems like the Enigma, we are relying on simpler principles, but applied with mathematical understanding. And that is how we are getting more reliably secure.
LEVIN: Yeah, that’s a big change. So I guess it’s natural for me to ask here — maybe you’ve already implicitly answered that in prompting this question — but where is cryptography headed next?
BARAK: So there are maybe four different research strands in cryptography. One is expanding the reach of cryptography. So going beyond, say, secret writing to public key encryption, and then to even fancier things like zero-knowledge proofs, multiparty secure computation, obfuscation.
The second is bringing things from theory to practice. So taking theoretical constructions that initially have far too much overhead to be used in practice, and improving them.
The third is sharpening our understanding of the cryptographic schemes that we currently have by attempting to break them. So cryptanalysis, attempting to break cryptographic schemes, is also an important area of research.
And the fourth is really understanding the basic assumptions that we are using in cryptography, things like integer factoring and lattice-based assumptions, and maybe finding new assumptions: trying to understand their validity, and whether we can provide rigorous evidence for their correctness, or show that they are incorrect.
LEVIN: It’s a field that’s taken many turns, from the kind of natural instinctive urge to have secret notes to the depths of mathematics to computing to quantum computing. It’s really a fascinating subject. And, at this point, we have a question we like to ask, which is, what about this research brings you joy?
BARAK: When I started as a graduate student at the Weizmann Institute of Science, I didn’t actually intend to study cryptography. I didn’t know much about cryptography, and I thought it was a very technical field just having to do with number theory, which is nice, but not something that I was passionate about.
The thing that kind of blew my mind was when I took a cryptography class with Oded Goldreich, who became my advisor, and I realized that you could actually mathematically define what it means to be secure. And I just found it fascinating that this intuitive notion, which I had thought had no bearing on formal mathematics, can actually be captured by mathematics, and then we can actually prove things about it.
And this is the thing that I still find so fascinating about cryptography: that it brings math into places where we didn’t really think it would hold.
LEVIN: And it reminds me of these very deep ideas of a hundred years ago that we can prove that there are unknowable facts about math.
BARAK: Yes, some of the techniques are actually sometimes similar.
Maybe another reason why I find cryptography fascinating is that I think, as human beings and as scientists, we have a long history of being fascinated by impossibility results. So there was, you know, the impossibility of deriving the parallel postulate from the other axioms of geometry, the impossibility of trisecting an angle with just a compass and a straightedge, and of course Gödel’s theorems on the impossibility of proving all true facts about mathematics.
But cryptography is about the practical application of impossibility, which I kind of find really fascinating, that we take what we think of as fundamentally negative results that are just for intellectual curiosity and would have no practical implication whatsoever, and we turn them into something that we actually apply and are using every day to do commerce over the internet.
LEVIN: So compelling. Thank you so much. We’ve been speaking with computer scientist Boaz Barak about cryptography in the modern era. Boaz, thank you for joining us on “The Joy of Why.”
BARAK: Thank you very much.
[Theme plays]
LEVIN: Thanks for listening. If you’re enjoying “The Joy of Why” and you’re not already subscribed, hit the subscribe or follow button where you’re listening. You can also leave a review for the show — it helps people find this podcast.
“The Joy of Why” is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence on the selection of topics, guests or other editorial decisions in this podcast or in Quanta Magazine.
“The Joy of Why” is produced by PRX Productions. The production team is Caitlin Faulds, Livia Brock, Genevieve Sponsler and Merritt Jacob. The executive producer of PRX Productions is Jocelyn Gonzales. Morgan Church and Edwin Ochoa provided additional assistance.
From Quanta Magazine, John Rennie and Thomas Lin provided editorial guidance, with support from Matt Carlstrom, Samuel Velasco, Arleen Santana and Meghan Willcoxon.
Samir Patel is Quanta’s Editor in Chief.
Our theme music is from APM Music. Julian Lin came up with the podcast name. The episode art is by Peter Greenwood and our logo is by Jaki King and Kristina Armitage. Special thanks to the Columbia Journalism School and Bert Odom-Reed at the Cornell Broadcast Studios.
I’m your host, Janna Levin. If you have any questions or comments for us, please email us at [email protected]. Thanks for listening.
[Theme ends]