How Is AI Changing the Science of Prediction?
Introduction
Scientists routinely build quantitative models — of, say, the weather or an epidemic — and then use them to make predictions, which they can then test against the real thing. This work can reveal how well we understand complex phenomena, and also dictate where research should go next. In recent years, the remarkable successes of “black box” systems such as large language models suggest that it is sometimes possible to make successful predictions without knowing how something works at all. In this episode, noted statistician Emmanuel Candès and host Steven Strogatz discuss using statistics, data science and AI in the study of everything from college admissions to election forecasting to drug discovery.
Listen on Apple Podcasts, Spotify, TuneIn or your favorite podcasting app, or you can stream it from Quanta.
Transcript
[Theme plays]
STEVEN STROGATZ: Making predictions is a challenge woven into every part of our lives, often in ways we don’t even think about. Will it rain this afternoon? How will the stock market respond to the latest news? What would mom like for her birthday?
Typically, we build up a knowledge base and a theoretical understanding, at least in science, and apply what we know to predict future outcomes. But that approach faces sharp limitations, especially when the systems to be analyzed are profoundly complex and poorly understood.
I’m Steve Strogatz, and this is “The Joy of Why,” a podcast from Quanta Magazine where I take turns at the mic with my co-host, Janna Levin, exploring the biggest unanswered questions in math and science today.
For this episode, we’re joined by mathematician and statistician Emmanuel Candès to ask how are data science and machine learning helping us approach complex prediction problems like never before? And how confident, or skeptical, should we be in their predictions? Can we figure out ways to quantify that uncertainty?
[Theme ends]
Emmanuel is a chaired professor of mathematics and statistics at Stanford University. His work lies at the interface of math, statistics, information theory, signal processing and scientific computing. He’s a member of the U.S. National Academy of Sciences, and has received a MacArthur Fellowship, the Collatz Prize and the Lagrange Prize.
Emmanuel, welcome to “The Joy of Why.”
EMMANUEL CANDÈS: Thank you very much for having me, and since you mentioned the National Academy, let me start by congratulating you on your election. This is truly awesome.
STROGATZ: Oh, you’re too kind. Thank you. Well, I’m honored to be joining you and all of our other esteemed colleagues.
Well, so let us begin here by talking about something on the mind of just about everybody nowadays, machine learning models. We keep hearing so much about them. We know that they can pore through massive datasets and often pick up patterns that no human being could detect.
But these models, people refer to them often as “black boxes.” And I’m just wondering, would you yourself use this phrase, and if so, what do we mean by a black box?
CANDÈS: As you said, a machine learning algorithm takes as input data collected in the past and, given a set of features, tries to make a prediction about an unknown label.
So I have to say that the predictive modeling culture is as old as the field of statistics itself. Statisticians, starting with [Francis] Galton and [Karl] Pearson and [Ronald] Fisher, have been very focused on making predictions from data. But they used relatively simple models — models that could be analyzed mathematically, models that we teach at college, for which you can sometimes provide reliable inference.
But I don’t think I need to tell you that now we’re past these simple regressions. We’re using deep learning, gradient boosting, random forests — a lot of techniques that have become very popular, sometimes in combination. And these have become so complicated that they’re very difficult to analyze. We use the term “black box” to refer to algorithms that are so complex that they resist analysis. There are, of course, a lot of theoreticians who try to understand what’s happening inside the black box.
STROGATZ: Thank you. Wonderful explanation. It’s a whole new universe of statistics, it seems like.
CANDÈS: Absolutely. But it doesn’t mean that we have to trash what we’ve done so far. What my research group has been doing, and what a lot of groups are doing worldwide at the moment, is to try to get the output of these black boxes and treat them as statistical objects.
And so we see a whole branch of statistics that is reasoning about the output of these black boxes without making any modeling assumptions, so that the results of the analysis can be trusted, so that we can quantify uncertainty and make reliable decisions. And so all the stuff like the p values and the confidence intervals, they are present in one way or the other.
The concept of a p value — it’s essentially a measure that quantifies how surprised you should be by a certain experimental outcome. And in the context of black boxes, if a black box makes a prediction, I can still ask how surprised I should be by this prediction. And so I need to be able to quantify the element of surprise. So I would like to be able to transform the prediction into what you refer to as a p value so that I can actually calibrate what comes out of the black box.
So, surprisingly, we do not have to abandon what we’ve been doing. Mostly we’re moving towards a world with fewer parametric models, but the concept of having outcomes that are well calibrated — of quantifying uncertainties — is still there.
STROGATZ: It’s so interesting. I really like the way you put that, that it’s sort of like the black box can stay black. We don’t have to look underneath or inside the model to make sense of what’s going on analytically. So it’s like we are taking the old methodology, the old desires of traditional statistics to quantify uncertainty and rebuilding the theory for this new world of these black box models.
CANDÈS: Absolutely. And this rebuilding of the new world comes in many different flavors, but I’ll give you an example. Let’s imagine a world not too far in the future where people apply to colleges, for example, and because colleges, say, receive so many applications, we’re going to outsource at least part of the decision process to a black box. So let’s say that now students apply to Cornell, to your home university, and that somehow you decide to predict how well they will do at Cornell using a black box, right? And so the question is, how calibrated are these predictions?
But what you can do is, you can say, “Well, I’ve trained my model, and now I have reserved a set of students for which I know the outcome and I can actually see how the black box predicts these outcomes.” And now I can try to understand, for what kind of students is the error large? For what kind of student is it low? What kind of accuracy do I get from this black box?
And from that you can calibrate, now, when you’re going to use the black box to perhaps screen a few candidates. Because you have observed the mismatch between the black box predictions and the true outcomes on the test set, you can understand a bit the accuracy of the black box and what you can actually conclude. So by observing the output of the black box on a group of students for which you have the outcomes, the labels, you’re able to actually produce not a point prediction of how a student will do, but a prediction interval that contains the true performance a prescribed fraction of the time.
And when I say this, I say you do not have a model. You do not have a Gaussian distribution anywhere inside. You’re only using the fact that you draw students at random, you look at what the black box does on the random subsets, and you use these observations to actually generalize to unseen students.
And it’s very much a statistical spirit, which is, you collect the data of features of students applying, and what the black box says about these students. You’re learning from this to be able to say things that are valid into the future.
STROGATZ: Good. I really want to unpack this example. It’s so provocative. The language of features and labels, I think, is a little abstract, maybe. So let me see if I get what you’re saying.
But so if I imagine a cohort of high school seniors applying to Cornell or Stanford, your institution, either way, features might be things like their high school GPA, whether they played a varsity sport, whether they are African American or Latino, or male or female, all kinds of things like this. You would call those features?
CANDÈS: Yeah, these are features. These are essentially what’s in your application file, right? So these are what you know about the applicant that can be sort of digitized. But I think we live in a modern world now, and so a feature might be your essay. Because your essay will become a string of numbers. That’s the revolution around large language models. And so that is also a numerical feature that you can use to predict, you know, how well do you write English? What is the richness of your vocabulary? You know, there are lots of things you could use.
STROGATZ: Yes, but in terms of what the college might want to predict, just to make it simple, what if we said we want to predict the GPA upon graduation of the student?
CANDÈS: For example. Or it could be even simpler: Will the student graduate in four years? And so in this case, let’s look at your first example: You want to predict the GPA after two years of undergraduate education. And I can say, what does the black box say about these students? And so by looking at the distribution of errors — that is, the difference between the true GPA of the students and the black box predictions — I might have a sense of the typical errors that the black box makes on a random student.
And so when a new student comes in, I have a sense of the errors that I’m going to suffer and I can sort of — instead of giving you just a point prediction — I might give you an interval that likely contains the true outcome of the student. And to our surprise, it might be that for some applications or some students, this interval is short. We’re fairly confident of how well they’ll do. And for others, it might be wide.
STROGATZ: Mm hmm. In this case, the interval, what would the interval be in this example?
CANDÈS: One interval might be I’m predicting 2.9 to 3.9, so the center is around 3.4, and the other one I’m predicting 3.3 to 3.5, and the interval is much shorter. They have the same center, the prediction is the same, but the range is very different.
And so, you know, if I’m an admissions officer, this is something I’d like to know about my prediction engine, right? Like, how accurate are you? What level of uncertainty is associated with your point prediction?
If we’re doing finance and I have an investment strategy and I say, “Steven, I can promise you a 6% rate of return,” there’s a very different scenario between 6% plus or minus 1% and 6% plus or minus 10%. You might lose a lot of money and a lot of customers if you’re in the second category.
STROGATZ: Alright, good. So this example that you’ve given, either in the context of finance or GPA, really does help underscore why we care not just about means — or what we might call “point estimates” — but also intervals within which we might have high confidence. I mean, anyone can see, I hope, how valuable it is to be able to make predictions of intervals, not just numbers.
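[Editor’s note: The procedure Candès describes — hold out a calibration set with known outcomes, look at the black box’s errors on it, and turn those errors into a prediction interval — is often called split conformal prediction. Below is a minimal sketch in Python using made-up data and a generic scikit-learn regressor; it illustrates the idea and is not the specific system any admissions office would use.]

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical data: rows are students, columns are application features
# (high school GPA, test scores, ...); y is the college GPA to predict.
X = rng.normal(size=(2000, 5))
y = 3.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.3, size=2000)

# Split into a training set and a held-out calibration set.
X_train, y_train = X[:1000], y[:1000]
X_calib, y_calib = X[1000:], y[1000:]

# The "black box": any predictive model will do.
model = GradientBoostingRegressor().fit(X_train, y_train)

# Score the calibration students and record the absolute errors.
residuals = np.abs(y_calib - model.predict(X_calib))

# For 90% coverage, take the appropriate quantile of the calibration errors.
n = len(residuals)
alpha = 0.10
q = np.quantile(residuals, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction interval for a new applicant: point prediction plus or minus q.
x_new = rng.normal(size=(1, 5))
pred = model.predict(x_new)[0]
print(f"predicted GPA: {pred:.2f}, interval: [{pred - q:.2f}, {pred + q:.2f}]")
```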
So if we could, I’d like to move now to another real-world example outside of the collegiate setting, having to do with election forecasting. Just to be clear for our listeners, we are recording this podcast a few months before the 2024 U.S. elections. But this episode, we predict — if you’ll pardon the pun — will air sometime right in the aftermath of the election. So I’m sure this is something very much on the minds of our listeners.
And I know you have worked in this area, and your students, also. The question is, what insights can you give us into some of the complex models that are being used to forecast our elections?
CANDÈS: So perhaps first I should be clear. I don’t have really first-hand experience with forecasts of elections. I’m working with students at the Washington Post, with their data science desk, and they actually do the work. And I’m just going to be a messenger for this part of the conversation, if that’s all right.
STROGATZ: I would like to give some credit to the young people who are involved in this. And I also feel like you may be a little bit modest, as is a nice quality that you have. But isn’t it true that Lenny Bronner, who works at the Washington Post, and Stanford undergraduates built on some of these techniques that you helped develop?
CANDÈS: That is true. But as you know, when you actually work in the trenches on something of consequence, such as predicting the outcome of an election — even though the general principles are in some of the papers we wrote, there’s still an enormous amount of work that they’ve done to make it all work.
STROGATZ: OK, good. Thank you.
CANDÈS: So what a news organization will try to do essentially is, some polls close and some precincts are reporting and some counties begin to report. In fact, this is a very cool problem because the ballots are already in the box, so to speak, and you have not opened the box yet and you’d like to know what’s in there.
And a lot of the statistical work that is ongoing, for example at the Washington Post, which is the organization I know best, is they’re trying to predict unreported counties. And so, instead of giving their viewership a point estimate of, well, “Santa Clara will vote this way,” you can tally up the forecast for unreported counties, aggregate them at the state level, and have a very nuanced picture of how California will vote.
Now, how is this done? So, obviously we’re going to need to predict how counties are going to vote. And this is going to be based on a lot of features. Is it a predominantly urban county? Is it a rural county? What’s the level of education? What are the socioeconomic variables associated with the county? And most importantly, how did the county vote last time?
And so you’re using all these features, you’re trying to learn a model that can predict accurately how counties are going to vote. And that’s your black box, if you will, except that they use models that are not too complicated from what I’ve seen, that are fairly simple.
But then the second part is the calibration. Because you cannot just go on air [and] say, “Oh, you know, California will vote this way,” when, in fact, it’s just a point estimate, you know. This has enormous consequences if you get it wrong. And so what they will do is report a range of possible outcomes for the state of California that is dynamically updated as the election goes along — a range that truly reflects what they think will happen when the vote has been completely tallied.
And so it’s very cool what they’re doing, because they’re really projecting errors. They’re projecting uncertainty. And you can see that their uncertainty band, of course, narrows as more and more counties are becoming reported.
And they’re fairly faithful. They back-test them, as we say in the field — that is, they say, “OK, let’s see how this model would have worked in 2020.” And they want to make sure that the intervals they project contain the true labels, the true votes, a prescribed fraction of the time.
And so it’s all engineered very well. And I think, kudos to the Washington Post for being so respectful of their readership in not just giving you point estimates, but a real sense of the accuracy of their point estimates.
STROGATZ: Now, just to be dead clear about this, we’re not talking about forecasting the election based on polls a year in advance or anything like that. This is election-night forecasting based on the results that are coming in.
CANDÈS: Exactly. So the reader has to imagine that basically there are ballots in a box somewhere. And the only thing is that the box has not been opened yet. But I’ve seen similar boxes open elsewhere in other counties, other precincts. And I’m going to use this knowledge to make a prediction about what’s in this box. And it’s going to be a very well-calibrated prediction following the principles we laid out earlier. And you do have the right to use polls as features, as predictive variables in your model.
STROGATZ: I suppose you could. I think a lot of people out there may be skeptical of polls. We’ve seen how difficult it is to do polling. But then again, the model may take that into account. Maybe it doesn’t assign much weight.
CANDÈS: Exactly. The model will take this into account. Now, what’s kind of a bit tricky about polls is that polls might be different in different counties, right?
Typically, when you fit a statistical model, you’d like the features to be the same for all units in your dataset. So going back to the example we had about college admissions earlier, right? Everybody has a high school GPA. Everybody has a yes/no answer to “Are you on a varsity team?”
And so what might be tricky regarding your poll to use it as a feature is that some counties might have it, others may not. And so you have to be a bit careful about this.
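[Editor’s note: A toy sketch of the election-night idea Candès describes — fit a simple model on counties that have already reported, gauge its error on held-out reported counties, then aggregate predictions for the unreported counties into a state-level range. The data and model below are hypothetical; the Washington Post’s actual system is far more elaborate, in particular in how it handles correlation between counties.]

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Hypothetical counties: features could be % urban, education level and,
# most importantly, the candidate's vote share in the previous election.
n = 400
X = rng.normal(size=(n, 3))
share = np.clip(0.5 + 0.08 * X[:, 2] + 0.03 * X[:, 0]
                + rng.normal(scale=0.02, size=n), 0, 1)
votes_cast = rng.integers(10_000, 300_000, size=n)

reported = rng.random(n) < 0.4          # which counties have reported so far
fit = reported & (rng.random(n) < 0.7)  # most reported counties fit the model...
calib = reported & ~fit                 # ...the rest measure its error

model = Ridge().fit(X[fit], share[fit])
err_q = np.quantile(np.abs(share[calib] - model.predict(X[calib])), 0.95)

pred = model.predict(X[~reported])
known = np.sum(share[reported] * votes_cast[reported])
point = known + np.sum(pred * votes_cast[~reported])

# A deliberately conservative band: shift every unreported county's predicted
# share up or down by the calibrated error. (Real election models treat the
# correlation between counties much more carefully.)
lo = known + np.sum(np.clip(pred - err_q, 0, 1) * votes_cast[~reported])
hi = known + np.sum(np.clip(pred + err_q, 0, 1) * votes_cast[~reported])

total = votes_cast.sum()
print(f"projected share: {point/total:.3f}  range: [{lo/total:.3f}, {hi/total:.3f}]")
```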
STROGATZ: Good. All right. Let us take a little break here and we will be right back.
[Break for ad insertion]
STROGATZ: Welcome back. We’ve been speaking with Emmanuel Candès about statistics, prediction models, and the inherent uncertainties in them.
So let’s move along to another real-world example. I’m thinking here in the context of medical applications of prediction models. Drug discovery — that, of course, is very important, with life-and-death consequences. So, for example, there’s a move to generate artificial data using artificial intelligence to increase our sample size. It sounds hard to imagine that that could work, but apparently it can be a helpful strategy.
CANDÈS: So, what you’re asking is very, very interesting, and I think you’re touching again on the future of statistical science as a discipline.
Statistics has always been an empirical science that tries to make sense of the world around it. And so now we’re dealing with Gen AI, for example, or extremely fancy machine learning algorithms.
So to understand drugs we started in vivo: Like, we would just inject people with stuff. Then we did this in vitro. And now we’re moving in silico, as you point out, right? Which is that now we want to use algorithms to make predictions about what drugs will do.
And so let’s say you’re a big pharma company and you’re sitting on a huge library of compounds. It can be 400 million, 500 million. And you would like to know which of these compounds will actually bind to a target. So, what do you do?
Well, you could take your compounds one by one and test whether they bind to your target. But as you can imagine, this takes an enormous amount of time and money. So now people are using machine learning to guess whether they will bind. In the past few years, we’ve seen things like AlphaFold. We’ve seen a lot of models that try to predict the shape of a compound from just the sequence of amino acids, for example.
Now, that will not replace the physical experiment, but what machine learning does in this instance is prioritize the compounds that you should try first. One of the things we do in this area is to say, “OK, we’re going to train some extraordinarily fancy models — and they’re really black boxes. I mean, they’re so complicated, I have no idea what they do really, but they produce an affinity score, an affinity of a compound for a target.” And I say, “Can I trust this?”
And so without any statistical models, just looking at what the algorithm predicts on molecules it wasn’t trained on, we were able to select a data-adaptive threshold, if you will, such that if you select all the molecules whose predicted affinity is above this threshold, you’re guaranteed that 80% of what you’re given is actually of interest to you.
Downstream, you will do some real experiments on some real thing. But here, what’s very exciting is that AI can really speed up the prioritization of drugs that should be passed on to a lab.
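[Editor’s note: A simplified sketch, on simulated data, of a screening rule in the spirit of the conformal selection methods Candès’s group has worked on: compare each candidate molecule’s predicted affinity with the scores of calibration molecules known not to bind, turn that comparison into a p value, and apply a Benjamini-Hochberg cutoff so that roughly 80% of the prioritized molecules are expected to be true binders. The scores and labels below are made up.]

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical screen: an affinity score from some black-box model, plus a
# calibration set of molecules whose binding has been measured in the lab.
n_calib, n_test = 2000, 100_000
calib_score = rng.normal(size=n_calib)
calib_binds = rng.random(n_calib) < 1 / (1 + np.exp(-3 * calib_score))  # measured truth
test_score = rng.normal(size=n_test)

# Conformal-style p value for each test molecule: how does its score compare
# with the scores of calibration molecules that turned out NOT to bind?
nonbinder_scores = np.sort(calib_score[~calib_binds])
m = len(nonbinder_scores)
geq = m - np.searchsorted(nonbinder_scores, test_score, side="left")  # non-binders scoring >= test
pvals = (1 + geq) / (m + 1)

# Benjamini-Hochberg at level 0.2: among the molecules we pass to the lab,
# we expect at most ~20% to be duds (so ~80% should be real binders).
alpha = 0.2
order = np.argsort(pvals)
passing = np.nonzero(pvals[order] <= alpha * np.arange(1, n_test + 1) / n_test)[0]
selected = order[: passing[-1] + 1] if len(passing) else np.array([], int)
print(f"{len(selected)} molecules prioritized for physical testing")
```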
STROGATZ: No, it gives a whole new meaning to the concept of an educated guess. These are now brilliantly educated guesses that have to be tested, as you say.
CANDÈS: They still have to be tested. Now there’s another thing, which is perhaps this time a bit more scary, which is that, what if we use Gen AI to build what people might call “digital twins,” things that are not physical, but can be generated by generative AI. And so here, there’s a new line of research.
So, for example, suppose I want to study statistical properties of some drugs, right? And the problem is I have too few samples, right? Let’s say I want to estimate what fraction of drugs have a certain property. And the problem is I have a lot of sequences of amino acids for which I have not measured that property. And as you can imagine, the tendency is to use a predictive model, a black box, and replace the real measurement with a prediction and then pretend that it’s real data. And then average these predictions and say that’s the overall fraction of drugs that have the property. And that’s wrong, because this method introduces biases.
We want to use this predictive model, we want to use Gen AI, to fill in missing data, to possibly create new datasets. But at the same time, we need to understand how we can remove the biases to draw conclusions that are scientifically valid.
Let me give you an example. Let’s say I just want to estimate the mean of a random variable. So let’s call this y. And I have some features, let’s call them x. And so what I could try to learn is, I could fit a model to predict y from x. And now I can replace the true label, the true amount, with the prediction when I don’t have it.
And I could average those, but they’re going to be biased. But guess what? I can remove the bias because I have an estimate of the bias from the labeled data you gave me.
STROGATZ: Nice.
CANDÈS: And so if I do this correctly, I can effectively augment the sample size enormously. If my prediction has reasonable accuracy, then it’s as if I had a sample size which is much bigger. And so the level of accuracy of what I can tell you is much higher.
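[Editor’s note: A minimal sketch of the debiasing idea Candès describes, in the spirit of what is sometimes called prediction-powered inference: average the black box’s predictions over the large unlabeled pool, then correct that average using the errors observed on a small labeled sample. The data and model below are made up.]

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Hypothetical setup: a small labeled sample (x, y) and a large pool of
# unlabeled x for which we only have the model's predictions.
n_labeled, n_unlabeled = 200, 50_000
x_lab = rng.normal(size=(n_labeled, 4))
y_lab = x_lab @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(scale=1.0, size=n_labeled)
x_unl = rng.normal(size=(n_unlabeled, 4))

model = LinearRegression().fit(x_lab[:100], y_lab[:100])   # the "black box"

# Naive: average the predictions on the unlabeled pool and call that the mean of y.
naive = model.predict(x_unl).mean()

# Debiased estimate: correct the naive average using the average error
# observed on held-out labeled data.
held_x, held_y = x_lab[100:], y_lab[100:]
bias_correction = (held_y - model.predict(held_x)).mean()
debiased = naive + bias_correction

print(f"naive: {naive:.3f}   debiased: {debiased:.3f}")
```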
STROGATZ: Well, I can’t resist asking you since it’s such a, you know, a rare treat for us. You’re very well known for contributions to an area that people call “compressed sensing.” And I don’t know if it exactly fits into our discussion today, but I feel like I want to ask you to tell us how does compressed sensing and maybe its applications to medical imaging, to MRI or other things, does that fit into the framework we’re talking about? And even if it doesn’t, could you tell us a little about it?
CANDÈS: It doesn’t fit directly. I think compressed sensing is about the fact that sparsity is an important phenomenon. So what we’re seeing at the moment is people measure everything under the sun because we don’t know ultimately what will matter, right? And so we need people like you and me to sift through what matters. What compressed sensing says is that if we measure a lot of things, but if only a few things matter, and if we use the right algorithm of the kind suggested by compressed sensing theory, then we should be able to build a very accurate predictive model.
Like, we will understand that a lot of variables have no business in predicting the outcome, and it will quickly focus on variables that have something to say about the outcome, and then build a good predictive model from then on.
STROGATZ: So you’ve been using the word “sparsity.” In this context, does it mean all those variables that don’t matter, we can effectively set their contribution to zero?
CANDÈS: Exactly. So it’s saying that in this case, just for our audience, it might say that even though I measured a million genetic variants, the distribution of the phenotype does not depend on all million of them. Maybe it depends on 20, or 30. That’s sparsity.
And so the question that compressed sensing asks is: When something depends on a few, but unknown, variables from a long list, how do you go about finding them?
STROGATZ: Uh huh. So the technique or the method will identify which are the key 20 or whatever small number it is.
CANDÈS: Exactly. Exactly. Let’s think about this as almost a matrix problem, right? So I have a matrix; it has a million columns because these are all the genetic variations. And then I have a response, y, with one entry for each row of this matrix. If I want to solve a system y = Ax — like, which genetic variations matter to predict y? — well, classical theory will say that I need as many people as I have unknowns. But compressed sensing theory says, no, that’s not true. Because if you know ahead of time that only a few of these genetic variations matter, then you can deal with fewer people. And that’s why we can develop predictive models for phenotypes that do not need a million patients.
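[Editor’s note: A toy illustration, on simulated data, of the phenomenon Candès describes: far more candidate variables than observations, but only a handful truly matter, and an L1-penalized regression (the lasso), one of the workhorse algorithms of the compressed-sensing literature, finds them.]

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)

# Many more candidate variables (columns) than people (rows), but only a
# handful of columns actually matter -- that's sparsity.
n_people, n_variants, n_relevant = 300, 5000, 10
A = rng.normal(size=(n_people, n_variants))
beta = np.zeros(n_variants)
support = rng.choice(n_variants, n_relevant, replace=False)
beta[support] = rng.normal(scale=2.0, size=n_relevant)
y = A @ beta + rng.normal(scale=0.5, size=n_people)

# The lasso recovers the few relevant variables even though the linear
# system y = A x is wildly underdetermined.
fit = Lasso(alpha=0.1).fit(A, y)
found = np.flatnonzero(fit.coef_)
print(f"true relevant variables:        {sorted(support)}")
print(f"variables selected by the lasso: {sorted(found)}")
```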
STROGATZ: It’s wonderful. One of the big issues that seems to be everywhere in science these days is a crisis of reproducibility. And I just wonder if you have statistical comments for us about that.
CANDÈS: Yeah, it’s very interesting that you ask this. I think, first of all, I will make an observation about the reproducibility crisis. It occurs at a moment where people have enormous datasets at their disposal, usually prior to the formulation of scientific hypotheses, access to extremely fancy models that depend on billions of parameters. And so I would say to start with that it’s not a coincidence that this crisis occurs at this time, because I give you a dataset, you believe it’s gold. You’re going to try a model, it doesn’t pan out, and you’re going to try something else. And so you’re fine-tuning parameters, you’re fine-tuning a lot of things until something clicks, and there’s nothing wrong with that.
But I think what we need to do as [a] statistician — and there’s a lot of us that are working on things like this — is, how can we build safeguards around the freedom you have in selecting models, parameters, such that at the end of the day, the discoveries you claim have a chance of being reproduced by, let’s say, an independent experiment?
The statistical community is developing a lot of methods so that when you think you have something, you really do have something. And so this is a very exciting moment for the field: to develop methods that are not really quantifying the uncertainty in your prediction, but actually calibrating things in such a way that, when you report findings, we make sure that a good fraction of what you’re reporting is correct.
STROGATZ: Well, I would like to, sort of, now back out to a broader, like, even societal scale to think about education just for a minute. Every learned or educated citizen should know something about the ideas of probability and statistics, including in their modern incarnation that we’ve been talking about. And I wonder if you have thoughts about this. What could we be doing, as either educators or communicators, to promote greater statistical savvy?
CANDÈS: That’s a good question. I think what I see at lower levels of statistics teaching is a reliance on formulas — you know, which formula should I apply when? And I think that’s not helpful.
As a student, I learned, of course, mathematical reasoning, and that was important. And then a bit through high school and college, I learned physical reasoning. And that’s distinct from mathematical reasoning, and it’s extremely powerful. But in grad school at Stanford, I learned about this new thing called inductive reasoning, which is neither of the first two. And I think we need to be doing a good job at teaching this at an early stage.
What is inductive reasoning? It’s the ability to make generalizations out of particular observations. And how do we do this? OK, so I would promote an approach which is not too mathematical in nature, which is trying to make kids understand how it’s possible to generalize from a sample to a population, to individuals we haven’t seen yet, and what makes this possible.
There’s a bit of a tension between fields. Should we go towards more mathematics, or should we go more towards CS, where AI is mostly taking place? I think there’s a danger of losing the ability to reason statistically if we go either too much towards math or too much towards CS. It might be a bit abstract, what this is, but I find statistical reasoning extremely powerful, extremely beautiful.
Because I don’t want to talk about it in generality, I’ll give you one problem. And it’s a famous thing that happened in the ‘30s, I think. I think [Alexander Steven] Corbett was studying butterflies and he went to Malaysia for a year and he was a very conscientious man. So every day, he would observe species of butterfly and he wrote in a notebook, “This species I’ve seen once and this species I’ve seen twice, and this species I’ve seen three times,” and so on and so forth.
So he came back to England, and he approached one of the founding fathers of the field, R.A. Fisher, and he asked, “If I go back to Malaysia for six months, how many new species am I going to see?” This is a kind of question different from math. The answer is not in the question.
And I don’t think deep learning can be very helpful here. And that’s what statisticians do. And it is a very modern question: You have a lab and they’re looking at cancer cells. And they’re going to do exactly the same thing: “This is how many cancer cell types I’ve seen once. This is how many I’ve seen twice.” And they ask, “How many types have I not seen yet? And if I continue looking for cancer cells for six months or a year or two years, how many new types can I expect to see?” So this is what you learn when you study statistics. And I find it fascinating.
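[Editor’s note: One classical, model-free answer to this kind of question is the Good-Toulmin estimator, which grew out of the same line of work as Fisher’s answer to Corbett: if Φi is the number of species observed exactly i times so far, the expected number of new species after increasing the sampling effort by a factor t is roughly Φ1·t − Φ2·t² + Φ3·t³ − … The sketch below uses hypothetical counts, not Corbett’s actual data.]

```python
# Good-Toulmin estimator: phi[i] is the number of species seen exactly i
# times so far; t is the additional sampling effort as a fraction of the
# effort expended so far (t = 0.5 means half as much extra time in the field).
def new_species_estimate(phi, t):
    """Expected number of previously unseen species in the extra sampling."""
    return sum(((-1) ** (i + 1)) * (t ** i) * count for i, count in phi.items())

# Hypothetical counts: many species seen only once or twice, a few seen often.
phi = {1: 50, 2: 30, 3: 18, 4: 12, 5: 8}
print(new_species_estimate(phi, 0.5))
```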
STROGATZ: Oh, well, that’s just great. It’s really interesting to hear about the culture of statistics, how it’s distinct from that of math or computer science. Because nowadays, with the rise of what people are calling data science, there’s a kind of muddying of the waters. Who owns statistics? Why are we calling it data science? Why isn’t it statistics? I’m sure you have an opinion about this.
CANDÈS: Of course, because there are lots of activities in data science that you would not find traditionally represented in the stats department. So I have a colleague, Jure Leskovec, and he’s a very recognized data scientist.
STROGATZ: Cornell Ph.D. [Editor’s note: Leskovec was a Cornell postdoc, with a Ph.D. from Carnegie Mellon University.]
CANDÈS: Exactly, exactly. He’s a brilliant person. And so when Covid hit, people were calculating this beta number. Like, the model where you’re susceptible, exposed, infected, recovered, and you have these differential equations and, you know, if the beta number is greater than one, we have a problem, stuff like that, right?
And so this is a very macro model. And what Jure Leskovec did was create an enormous digital dataset. He tracked 100 million Americans in all major U.S. cities. And so he would see where they would go during the day, where they come home at night. And so instead of fitting the epidemiological model everybody knows at the global scale — which doesn’t really make sense, because the behavior in California and the behavior in Florida were very different — you can fit it at nodes on the graph, and so you’re going to fit a model which is adapted to the mobility of the people where you are.
And that is data science, because what Jure did, which you will not see in the stats department, is he basically tracked 100 million people for a few weeks. I would like to claim that I have some colleagues in the stats department who do something like this, but I cannot name any. And that is modern data science. This is not something I typically see in a stats department.
So my position is quite clear on this. Data science is much bigger than the traditional field of statistics, but statistics is one of its intellectual pillars.
STROGATZ: Oh, I’m so glad I asked you about that. I hit a goldmine with that one. But all right, you’ve already expressed your fascination with statistical thinking. Is there something in your research that brings you particular joy?
CANDÈS: Yeah, I think so. My job at Stanford is unique in the sense that the students I get to work with are phenomenal. I feel that it keeps me young. It keeps me alert. I don’t fall asleep because I just have to catch up with them all the time.
And I feel that it’s strange to say this on air, but I’m going to age better because of this, because like mentally, physically, they keep me fit. And it’s a joy to see them develop, become great scientists. Last year, I had two former students who received a MacArthur Fellowship in the same year. So the students I’ve got to work with are tremendously accomplished, and so it’s just a privilege. It’s a privilege to feel so much energy, so much enthusiasm for the subject, and selfishly I would say that it’s good for my health.
STROGATZ: Well, thank you very much. It’s been really fun to talk to you. We’ve been speaking with mathematician and statistician Emmanuel Candès. Thanks again for joining us on “The Joy of Why.”
CANDÈS: Thank you for your time. It’s been a pleasure.
[Theme plays]
STROGATZ: Thanks for listening. If you’re enjoying “The Joy of Why” and you’re not already subscribed, hit the subscribe or follow button where you’re listening. You can also leave a review for the show — it helps people find this podcast.
“The Joy of Why” is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence on the selection of topics, guests or other editorial decisions in this podcast or in Quanta Magazine.
“The Joy of Why” is produced by PRX Productions; the production team is Caitlin Faulds, Livia Brock, Genevieve Sponsler, and Merritt Jacob. The executive producer of PRX Productions is Jocelyn Gonzales. Morgan Church and Edwin Ochoa provided additional assistance.
From Quanta Magazine, John Rennie and Thomas Lin provided editorial guidance, with support from Matt Carlstrom, Samuel Velasco, Arleen Santana and Meghan Willcoxon. Samir Patel is Quanta’s editor in chief.
Our theme music is from APM Music. Julian Lin came up with the podcast name. The episode art is by Peter Greenwood and our logo is by Jaki King and Kristina Armitage. Special thanks to the Columbia Journalism School and Bert Odom-Reed at the Cornell Broadcast Studios.
I’m your host, Steve Strogatz. If you have any questions or comments for us, please email us at [email protected]. Thanks for listening.