What Can Birdsong Teach Us About Human Language?

November 21, 2024

We often consider spoken language to be a feature that distinguishes humans from other forms of animal life. Brain research, however, suggests that other creatures — including certain birds — share some of our neural circuitry related to language. In this episode, co-host Janna Levin explores the origins and underlying mechanisms of human speech and birdsong with neurobiologist and geneticist Erich Jarvis.

Peter Greenwood for Quanta Magazine

It’s fair to say that enjoyment of a podcast would be severely limited without the human capacity to create and understand speech. That capacity has often been cited as a defining characteristic of our species, and one that sets us apart in the long history of life on Earth. Yet we know that other species communicate in complex ways. Studies of the neurological foundations of language suggest that birdsong, or communication among bats or elephants, originates with brain structures similar to our own. So why do some species vocalize while others don’t? In this episode, Erich Jarvis, who studies behavior and neurogenetics at the Rockefeller University, chats with Janna Levin about the surprising connections between human speech, birdsong and dance.

Listen on Apple Podcasts, Spotify, TuneIn or your favorite podcasting app, or you can stream it from Quanta.

Transcript

[Theme plays]

JANNA LEVIN: All animals exhibit some form of communication, from the primitive hiss of a lizard to the complex gestures natural to chimps, or the songs shared by whales. But human language does seem exceptional, a vast and discrete cognitive leap.

Yet recent research is finding surprising neurological connections between our expressive speech and the types of communication innate to other animals, giving us new ideas about the biological and developmental origins of language.

I’m Janna Levin, and this is “The Joy of Why,” a podcast from Quanta Magazine, where I take turns at the mic with my co-host, Steve Strogatz, exploring the biggest questions in math and science today.

In this episode, we speak with neuroscientist Erich Jarvis about the evolution of language and the search for answers in unexpected places — like the songs of birds.

[Theme fades out]

Erich is a professor at the Rockefeller University and a Howard Hughes Medical Institute investigator. At Rockefeller, he directs the Field Research Center of Ethology and Ecology. He also directs the Neurogenetics Lab of Language and codirects the Vertebrate Genome Lab, where he studies song-learning birds and other species to gain insight into the mechanisms underlying language and vocal learning.

Erich, we’re so glad to have you here with us today.

ERICH JARVIS: Thank you very much, and thank you for that wonderful introduction.

LEVIN: Yes, we have the pleasure of actually being in person today, which is nice.

JARVIS: Yes, yes.

LEVIN: So I want to start with what we’re actually doing here, which is using language, and what our audience is doing, which is listening to words. I guess the first question is: Is language built in biologically, even genetically? And also, is it uniquely human?

JARVIS: So, the first part: Language is built-in genetically in us humans. We’re born with the capacity to learn how to produce and how to understand language, and pass it on culturally from one generation to the next. The actual detail is learned, but the actual plan in the brain is there.

Second part of your question: Is it, you know, special or unique to humans? It is specialized in humans, but certainly many components of what gives rise to language are not unique to humans. There’s a spectrum of abilities out there in other species that we share some aspects of with other species.

LEVIN: It really is quite amazing, the proliferation of human languages, and yet we share this common biological genetic root.

JARVIS: Yes, yes.

LEVIN: And is it incredibly complicated genetically, or is it kind of simple to identify, that there’s this invariant genetic component?

JARVIS: I’m going to say that it is complicated in terms of the genetics underlying language — particularly spoken language, which is even more rare than other kinds of communication. It’s complicated because we haven’t figured it out yet, but I don’t think it’s impossible to figure out because we’re getting there. And I say “we,” I mean the collective “we” of scientists in the field studying this question.

LEVIN: Now, can you walk me through the mechanics of language? What’s happening neurologically and physiologically when I’m both speaking to you, but also inventing the things I’m going to say and how I’m going to say it?

JARVIS: Well, language actually is like all kinds of different behaviors that we and other animals display. When we think of it only in the human experience, we tend to just think of language as a unitary behavior. But actually, when you look closely at the comparisons across species and in the brain, you realize language can be broken down into multiple components.

One component is understanding what someone is saying, perceiving those sounds, processing them — we call that auditory perception — and learning how to understand them. Another component would be like syntax, or sequencing of sounds with some rules. The other would be semantics, like meaning in the sounds. And the component that’s most rare is vocal-production learning, the ability to produce imitated sounds.

And so those are like five different components; there are more. And when you look in the brain, right, you can actually find that there are different brain circuits that control these different components. Not all of them are separate, but the two that I like to really highlight and make a distinction on is the auditory-learning brain circuits, different from the vocal-production learning brain circuits.

The auditory-learning brain circuits, many species have — like your dog can understand the word “sit,” siéntate in Spanish, and so forth. But your dog can’t say those words. But we can, and that’s the vocal production component that’s more rare.

LEVIN: [laughing] Right, he definitely understands, “You want to go out?”

JARVIS: Yes, yes.

LEVIN: Very excited about those words.

JARVIS: Yeah, and some breeds of dogs, you can get them to understand several hundred human words. But they can’t say one of them. So my thinking, and other people’s as well, is that auditory learning and the ability to understand complex sound combinations came earlier than the ability to produce those sounds.

LEVIN: Now, what is the difference between, let’s say, language and speech and what you refer to as vocal learning? Maybe we could pick that apart a little more.

JARVIS: Yes, I’ll have to say that the distinctions there are not, let’s say, uniformly or universally agreed upon or defined in our field. So, many species that have vocal organs — larynx in mammals, syrinx in birds — they produce sounds. Most of those species are producing innately determined sounds, like the color of your eyes is innately determined. You can put lenses in to change the colors, but that’s some kind of physical modification going on separate from the brain. And so this ability to produce learned sounds is an essential component of spoken language, but not the only one, right? But without the vocal learning, you won’t have spoken language.

Now, language broadly and speech. In the traditional linguistic sense, many people will say “speech” and “language” as if they’re two separate things. In our everyday experience as people, as citizens of this planet, when people say “language,” they’re thinking of speech. They’re thinking of perceiving it and producing it. They’re not thinking of sign language, writing and so forth. And so why is that the case?

Now people argue with me on this, but I’m going to say it. When you look in the brain and look at what brain cells, what circuits are being activated when we speak, and then when we express language through speech, through gesture, through facial communication and so forth. Really, what I think is going on is that there isn’t a separation really between speech and spoken language. All the grammatical rules, the syntax, all of that is using the same brain areas that are responsible for producing the sounds. All right? I don’t see a distinction there.

For gesturing, like sign language — now, some will disagree with me as well, but what I believe from the data that I see is that those brain regions are directly adjacent to the brain regions that are controlling the voice. And they’re using similar grammatical rules and so forth in those brain circuits in a parallel fashion that the spoken-language circuit uses.

So what I think is going on there evolutionarily is that our brain, our ancestors’ brains, and the brains of some animals, already have a lot of complex processing for syntactic-like structure of their behavior. And what happened with the spoken-language circuits is that, once vocal learning evolved, that new brain circuit adapted what was already preexisting.

LEVIN: I’m fascinated by that idea. I’ve been wondering about other aspects of the brain that must have had to develop in parallel or in some kind of feedback loop, with these other neuropathways.

JARVIS: Yeah, that’s right. So there’s this idea that the language circuit sort of evolved totally de novo as a separate language module in the brain, commanding everybody else what to do, and that it’s sort of autonomous. I totally disagree with that.

LEVIN: So you raised some interesting issues as well. One is the motor aspect of speaking and controlling your voice or speaking sign language. And then the structure of grammar, which also is built in. I think we all know that new dialects crop up and develop with languages, but they have their own grammar that has to be adhered to as well. You can tell if you’re not using slang properly and you’re laughed at because you’re making mistakes. So, can I see neurologically or biologically a difference between the motor function aspect of this versus the concept that there’s a grammar built in as well?

JARVIS: Yeah. And there are two parts to the answer itself. That is, I think that the spoken-language brain circuit is basically an advanced motor pathway controlling movement of the larynx, of the lips, of the jaw, and so forth. Not that dramatically different from other motor pathways for learning how to dance, learning how to walk, learning how to fly in birds. So I think it’s essentially an advanced motor-learning pathway.

And I say that because we and others have found evidence that the vocal-learning pathways — we can also call them spoken-language pathways in their analogous regions in other species that learn how to imitate sounds — is embedded within a motor pathway that controls body movement.

It’s not embedded in the visual system. It’s not embedded in the auditory system for hearing. It’s embedded in movement systems of the brain.

LEVIN: Literally where it is in the brain?

JARVIS: Literally positioned physically in more frontal lobe areas. The circuit for the production part of the sounds is more distributed to the front part of the brain in humans and mostly in the other species we’ve looked at.

So, a lot of people when they hear “movement control” about the brain, they automatically assume that the other stuff like grammar and syntax — all the rules that go with controlling those sounds — are all somewhere else in the brain in some higher-level cognitive areas. I disagree with that. I think they’re actually embedded, too, in the motor system. Being fed, yeah, by some of these other areas for perception, but I don’t think the motor systems that control body movement and sound are so primitive.

LEVIN: Fascinating. Now, there are a number of species that don’t have language in the way that we do, but that are capable of what you’re calling vocal learning. Can you tell me a little bit more about some examples of those species?

JARVIS: Yes, so, among us mammals, advanced vocal learners would be us humans, of course; bats, who sing ultrasonic vocalizations in a range that we can’t even hear for their learned sounds.

LEVIN: And that’s different from the echolocation.

JARVIS: That’s different from the echolocation. That’s right, yes.

LEVIN: So it’s a form of communication. It’s not just a map of the space.

JARVIS: No, no. Yeah, these are learned sounds that they use for various reasons other than trying to find where the wall is at, you know. So bats, the cetaceans — basically those are whales and dolphins, all right? And dolphins are just basically small whales.

And pinnipeds like sea lions, and elephants. The elephants are disputed, but there’s some evidence, you know. I mean, there are some elephants who are imitating human speech sounds, but they put their trunk in their mouth and they move their mouth in order to make the sounds, as opposed to do[ing] it voluntarily to control human speech production.

So those are mammals, and then there are three bird groups. The songbirds, like canaries and starlings and so forth. Parrots, like an African gray parrot. And hummingbirds of all species. They’re the smallest birds around, and they’re also vocal learners.

LEVIN: That’s an amazing range, from birds to aquatic mammals. And are all of these different species having a common source to their ability to demonstrate vocal learning?

JARVIS: No, that’s the surprising thing, is that all of these species, all five mammals and the three birds that I named, they appear to have evolved this ability of vocal learning independently of a common ancestor having it.

LEVIN: Really? That’s kind of amazing.

JARVIS: Yeah, so we call it parallel evolution or independent evolution.

LEVIN: That just strikes me as huge.

JARVIS: Yes, yes it does.

LEVIN: I think that’s good for astrobiology, that if there are other species that crop up, that language might be a natural part of the process.

JARVIS: That thought came to me and, just jumping a little ahead here, we and others found that the brain circuits, at least the species that we have looked at, also are convergent. You know, there are differences. Each lineage has evolved something different than the other. But a lot of the brain circuitry and the underlying genetics is convergent.

LEVIN: Wow.

JARVIS: And when we made those discoveries, I was thinking, this has got to be some type of suggestion that life could evolve multiple times in a similar way on some other planet.

LEVIN: Amazing. And why do you think that bats and whales share this in common?

JARVIS: Yeah, there’s so many theories about why did language evolve, or even why the vocal learning evolved, and what’s selected, and why isn’t it more common and so forth. No one really knows. I mean, you know, Steven Pinker thinks sexual selection or some kind of advanced ability for communication and survival.

I think that there’s something selecting for it, and there’s something selecting against it. And they may be more simple than you think. I think the fundamental thing selecting for the ability to imitate sounds is sexual selection, that is, mate attraction. When vocal learners tend to produce their variety of sounds and try to use it to attract the mate, the more diverse your vocal repertoire, the more likely you’re going to attract mates.

How do you get a diverse repertoire? You learn how to imitate sounds, and you also steal sounds from the environment, like mockingbirds do or African gray parrots do. So that’s what’s selecting for it.

What’s selecting against it? I think that a diversity of sounds not only maintains the auditory perception of the listening animal of your own species, but also of the predators, right? So the auditory system of predators is going to have a harder time habituating to this sound diversity that you’re trying to advertise from the top of a tree somewhere. So you’re more likely to be eaten and not survive.

LEVIN: So that’s not a good thing. It works in the wrong direction.

JARVIS: Yeah, so, then you ask why humans and dolphins and elephants and parrots and so forth? I think, you know, most of the vocal learning species are either near the top of their food chain or they’re vocalizing in high pitch range, like bats, where many other species can’t hear them.

And it turns out, we’ve done some phylogenetic studies to show that songbirds and parrots were descended from apex bird predators that are now extinct. But those were their ancestors, and maybe they were evolving this ability during that time and now have held on to it.

LEVIN: So they’re not so afraid because they were at the top of the food chain.

JARVIS: Yeah. And hummingbirds are pretty fast.

LEVIN: Right.

JARVIS: Those are my thoughts.

LEVIN: So, a bat communicating, can they learn what a whale is trying to say? Can a whale learn what a hummingbird is trying to communicate?

JARVIS: Yes and no. And the reason why I say yes is that you do have species, more likely closely related species, like, you can take a zebra finch and raise it with its cousin species the Bengalese finch, and the zebra finch young animal will pick up the Bengalese finch song. Not as good as a Bengalese finch because there are physical differences in its larynx and so forth, and even some of the brain circuitry may impose some limitations. But you can get species imitating other species, sounds being fostered, basically. Not only cross-fostered in your own species, cross-fostered with another species. Alright? And vocal learners will pick up those sounds. And they will communicate in whatever way they can, you know, not as good as your own species.

Now the reason why I say no is that, you know what, it’s even hard for us humans to understand another language if we’re not growing up with it — particularly a language that’s phylogenetically distant from the language you grew up with. And so just plopping me in the middle of a population of people speaking a different language is going to be a lot of effort.

LEVIN: Very challenging.

JARVIS: Yes, yeah.

LEVIN: And we see that it hardens with age.

JARVIS: That’s right, yes. So we have these critical periods for vocal-learning abilities. That’s why we can learn how to imitate sounds at a younger age, before puberty, before we get this hormone surge. Afterwards, the brain settles in, and it makes it harder. Not impossible, but harder, in all the vocal-learning species.

LEVIN: We’ll be right back after this message.

[Break]

LEVIN: Now, we’ve been talking about language, and I want to emphasize that a lot of your work is on birds. And in particular, as you’ve said, not all birds are vocal learners. You’ve mentioned the songbirds, the parrots and the hummingbirds. What sets them apart? Do we know why, even if their ancestry is in this apex predator, why only some of them — these three categories — develop vocal learning?

JARVIS: Yeah, beyond what I was saying about selection for and against, that I don’t know. But I can say in terms of the brain regions that control this behavior, what’s remarkable is that all three of those bird groups have exactly seven brain structures connected in a similar network pattern in the brain that controls the syrinx in birds. And they have genetic differences inside the vocal brain regions that differ in what we call gene regulation. So the up- and down-regulation of protein products. Okay, of certain genes that control connectivity, that control how fast the neurons communicate with each other, and so forth. We find differences in those brain regions in the same way in all three of the bird groups, even though they’re not closely related to each other.

LEVIN: Fascinating.

JARVIS: But there are other interesting correlations that all the vocal learners share, or at least some of them do, right? One is, we found that the more advanced the vocal-learning ability they have, the more advanced they are at problem solving — indicating there probably is some relationship between some other cognitive abilities and vocal learning and language. It’s usually been an assumption, but we actually found this at least among songbirds recently.

And another one is that all vocal learners seem to have their juvenile periods of life extended. They go through this altricial kind of growth where they have to be cared for by parents. They’re not like where they’re born, like a chicken, and can walk right away. So, I think that is happening because it’s necessary to be young with adults for a period of time to pick up the culture, to pick up the learned repertoire of vocalizations.

And then an unexpected one, but you know, one discovered about a decade ago, is that only vocal-learning species can learn how to dance. And that is to synchronize body movements to a rhythmic beat in music.

LEVIN: Wow. Is that the connection with the motor function aspect? What do they call that, synesthesia, when you mix your senses? So why would listening to something make you want to move your body? We don’t walk up to paintings and start dancing.

[Both laugh]

JARVIS: Yeah, very good. I wonder if there’s somebody that does that. That’s a good one. So, I think what’s going on there is that the larynx is the most rapidly firing muscle in the entire body. You need very good, tight auditory integration from your ears to the brain to integrate it with the brain pathways that control the larynx. That tight integration of sound and movement for the larynx, I think then basically contaminated the rest of the brain in vocal learners to now get tight integration between sound and not just muscle of the larynx, but muscles of the rest of the body. And now we can control our body movements to sound in more advanced ways than the non-vocal learning species.

LEVIN: So does rhythm itself or pitch play an important part in language? You seem to be saying that it does.

JARVIS: Yeah, in some societies there isn’t really much of a distinction between singing and dance. The two might go hand in hand and that would be consistent with a shared evolutionary history there.

Others think that actually dance itself can be a form of communication, and was a form of communication in early cultures. And by the way, I was once a dancer myself, so this excites me, and is another reason why I do this. So maybe there’s some kind of connection there.

LEVIN: I have wanted to ask this for quite a while. Parrots, are they understanding human language and to what extent? Or are they simply emulating a sound?

JARVIS: Yeah, parrots, you can teach them to understand meaning of human words. Irene Pepperberg’s work is the most famous for this. A lot of people just assume that these animals, whether they’re vocal learners or not, are just rambling off with random sounds with no meaning and so forth, right? It’s really hard to think that nature was built that way, you know, just does these things randomly for no reason at all.

In fact, what I think is going on is that, you know, from studies of vervet monkeys — [Robert M.] Seyfarth and [Dorothy L.] Cheney showed this a while ago — is that there’s these certain alarm calls that would mean an eagle in the sky or a snake on the ground to these animals in Africa. And, if you play these sounds through a speaker, you’ll see them look up for an eagle, look down on the ground for a snake.

So there are species that are producing innate vocalizations. So already, even without vocal learning, they have understanding of the meaning of sounds that they pass on culturally from one generation to the next. So I think meaning and sounds came before spoken language.

LEVIN: I wonder if pre-Homo sapiens also had this. Is there any sense that Neanderthals were also using vocal learning? Is that something we suspect to be the case?

JARVIS: I was just watching the Neanderthal documentary on Netflix yesterday. But what they were saying and what I’ve been believing, you know, from our own work, is the more and more you study Neanderthals, the more and more you really question whether they’re a separate species, or more or less like a hybrid kind of species where you’re in that gray zone where you’re starting to speciate, you haven’t quite become distinct but you can’t breed anymore. And I think that’s where Neanderthals were.

Now then the question becomes, yeah, they have different facial features, but still of all the species out there, they’re the closest looking to us, right? Well, what does that mean in terms of the brain, and the speech?

Well, fortunately people have recovered DNA from Neanderthal bones. It’s not as good quality as a living human. But it’s still decent-enough quality that you can get a good proportion of the genetic code of Neanderthal sequence completed. And all the genetic differences that we thus far have seen in humans that are in genes correlating with the presence of language, we see in Neanderthals as well.

So I wouldn’t be surprised if Neanderthals and our human ancestors that were living at the time were speaking to each other.

It could have been much simpler than what we have now, both biologically and culturally, but I wouldn’t be surprised. I mean, it’s hard for me to believe that’s not the case.

LEVIN: I could go on about that for a while.

[Both laugh]

But one of the projects that you’ve been pursuing aims to neuro-engineer song into birds that are typically songless, like pigeons, for example. I’d like to know how this is done and what the project’s revealing.

JARVIS: So from the technology standpoint, we know that some of the circuitry of the brain that controls learned vocalizations is different than [in] other species. And actually we’re finding more of it may not exactly be a difference that’s categorically a different binary, yes or no. Some of this circuitry is, let’s say, very weak in mice but a very robust connection in humans and songbirds and parrots.

And one of the ones that we’ve been focusing on that’s different in this way is the connection that goes from the cortical regions of the brain to the brainstem neurons that control the larynx, all right?

In us humans, in songbirds, in parrots there are hundreds of wires, basically, that go from the cortex that control those motor neurons that control the muscles for speaking, all right, or singing in these birds. Whereas for many decades, it was thought that mice and some other mammals had zero connections. Didn’t even have that cortical region. We found that mice actually do have a rudimentary structure of what we see in our speech circuits, just very primitive, so to speak, in its connectivity and its development.

And so what we’re trying to do is take some of those convergent genetic differences that we see in all the vocal learners looked at to date, and change those genes to be a similar way in the mouse brain to test the hypothesis that these genes formed that special connection, or enhance that special connection in others. If we are able to genetically modify this one important connection — I don’t think we’re going to get a full-blown mouse that’s speaking like Mickey Mouse or anything like that.

LEVIN: Yeah, I’m worried the rats in Central Park are going to start like yelling at tourists or something. [laughing]

JARVIS: That’s right. Yes. But I think we’ll get a step closer. That’s my prediction.

LEVIN: So you’re going to reveal something about the understanding of the biological basis, regardless.

JARVIS: That’s right. That’s 50 years or more of people hypothesizing the importance of such a connection difference to vocal behavior, so this will help us understand the principles of the molecules that are setting up these brain circuits, in studies that we can’t do with humans. And we’re trying to do it with the non-human species that you can study in the laboratory, like a mouse, because there’s so many advanced genetic tools that have been developed for them.

LEVIN: I want to ask about the tools. I’m unclear in understanding when you’re talking about a molecular level, if you’re talking about epigenetics, turning certain things on or off, or if you’re actually grafting genes from another species into the DNA.

JARVIS: Actually both, right? Because I think both are different. Where the actual genetic code of mostly regulatory regions — so regions that tell a gene to make more of it or make less of it, right? — I think that there’s a difference there in those regulatory regions that influences an epigenetic difference on those same regions.

LEVIN: So you can manipulate that at a molecular level.

JARVIS: That’s right, yeah. So this would involve grafting parts of human genes into the mouse. What would be good for us as scientists and society to have is a mammalian species that you can work with in the laboratory to study brain circuits involved in speech disorders, autism spectrum disorders associated with communication and so forth. Since mice don’t have these more advanced circuits, we can’t use them. We have to rely on the birds. And the birds have been very insightful for us. But their brain structures are different, and we’re never going to make a bird as a close model to human as a mammal. That’s one reason that’s also motivating for us, is can we make a mouse model for communication disorders, and then figure out how to repair them.

LEVIN: Now, we’ve talked about this feedback loop and the connection between language and problem solving. If you succeed in wiring with greater multiplicity between, as you said, the cortex and maybe some of the motor aspects of the brain, will the mouse have to necessarily develop greater problem-solving skills?

JARVIS: At face value, I don’t think it’s going to be a simple correlation because we’re trying to specifically focus on the pathways involved in vocal communication and not change the whole brain circuitry. Yeah, yeah, if we put a gene in the whole genome, we might affect something, but we see hundreds of genes that differ in humans relative to the other species.

LEVIN: So it wouldn’t be that in a specific mouse, it suddenly grows new parts of the brain? That’s not genetically possible, but it would have to be in some sense in the germ line passed down? And then it could allow generations of these mice …

JARVIS: Well, I guess while they start selecting upon each other, you know, we get them started.

LEVIN: Yeah, the chatty ones, right, reproduce more often. [laughs]

JARVIS: That’s right. Yes. Yeah, but, you know, this correlation between vocal learning and problem solving, it gets us the closest to think that, yeah, having this advanced form of communication made us humans a more sentient, advanced species, more intelligent and so forth. I’ve been describing vocal learning kind of like a binary trait, but if you really look carefully, you find that it’s more continuous. Alright? It’s not all or none.

LEVIN: I was exactly going to ask you that, whether it was a discrete jump to language or if there was a continuum? And if there is a continuum, are we undervaluing other animals’ capacities to understand because they don’t demonstrate every aspect of language?

JARVIS: Yes, my answer is yes to all those questions in that it is not a discrete jump to get to a spoken language. I think it’s like a stepladder function, where you have a jump, and then you stay there, and you have another jump, and you have another jump, but they’re small jumps, and those small jumps add up to a continuum amongst species.

But the continuum doesn’t necessarily have to be phylogenetically linear. What I mean by that, just because the species is related to you, like a chimpanzee, doesn’t mean that the chimpanzee ancestor was more primitive or more advanced, because of this parallel evolution I described earlier. Parrots can imitate us in ways that chimpanzees can’t, alright?

And so there’s a continuum out there amongst species that is not all genetically related according to the family tree. It’s partly related to the family tree, partly related to how the environment influences the evolution of that species.

LEVIN: [laughing] So, who do we communicate with more effectively, the chimpanzee or the parrot?

JARVIS: That’s a good question. I, you know what, that’s one thing I don’t really have an answer for.

[Both laugh]

LEVIN: That’s an experiment.

JARVIS: You know, I’m going to be biased and I’m going to say… Put it this way, vocally we’ll communicate better with a parrot. Gesturally, with the hands, we’ll communicate better with the chimpanzee. And why is that the case? Well, obviously, because the parrot can learn hundreds of learned vocalizations, right? Some species can go over a thousand, right, whereas the chimp can’t. But the chimpanzee can learn to do sign gestures. And can understand those words and so you can communicate with limited sign language, if you want to call it, with a chimpanzee.

LEVIN: You’ve mentioned sign language a couple of times, which is fascinating to me. I spoke sign language with my close cousin when we grew up, but it was signed English. It was the grammar of English. And when she got older and started going to deaf schools, she spoke a different grammar, American Sign Language, which I find really hard to follow. And so how much is sign language truly a language? Which, I mean, I’m very biased; I’ve seen every evidence in the world that it’s truly a language. And in what sense is it a representation of speech?

JARVIS: Yeah, that’s really a good question. And there are definitely people studying this. And so I think sign language is a language. It’s a form of learned communication that involves movement, but not the movement of the larynx.

Although, I wanted my colleagues to separate out the difference between moving the oral musculature and signing to try to pull apart what’s going on in the brain with those two behaviors. When I talk to people who study sign language, they say it’s almost impossible. It’s hard to not move the mouth and sign at the same time. And so I think because of that, there is a connection behaviorally and evolutionarily in speech and signing in humans.

What’s interesting is when you teach gorillas and chimpanzees to sign, I don’t see good evidence that they’re moving their oral musculature. And so in them, maybe it’s not as connected, evolutionarily — but in humans, it is. But that would mean that the oral movement part in humans came after the signing. And why might that be the case?

It goes back to what I was saying earlier, the brain pathways for producing spoken language are embedded in brain pathways controlling learned movement, including, I believe, signing. Some people say they actually intertwine. And I think this happened by a whole brain pathway duplication where the whole motor-learning circuitry that controls the gesturing and other body movements, right, replicated itself and got connected to the vocal organs. And this is also partly why they’re connected.

LEVIN: Interesting, because you’ve also mentioned understanding disorders in human beings and how to address certain disorders. So it’s my understanding that about 8% of children in the U.S. have some speech disorders or issues related to swallowing, which is connected; more than 3 million Americans stutter. How is your work relevant to human disorders, both the study and also the treatment?

JARVIS: Yeah. So this is why actually songbirds have been funded by the National Institutes of Health as a model for not only studying the basic science of speech, but also disorders. And what’s interesting is because of the convergent genetic changes we see in humans and songbirds and parrots as well, the underlying genes and brain circuitry, when something happens that’s wrong with them, right, you get a similar disorder. So convergent function is associated with convergent disorders.

An example is the most famous gene, this FOXP2, discovered by a colleague of mine, Simon Fisher, and others. This gene is one of those regulatory genes that modifies the expression or the amount of protein product of other genes in the brain that are involved in connections.

And when this gene, one copy of it, is disrupted in humans, those people have difficulty learning how to imitate speech. So their speech is very limited to a few words at a time. And even the words are hard to form. But they can understand speech well.

LEVIN: That must be frustrating.

JARVIS: That’s right, yeah. And so, when you manipulate this gene in songbirds, you can get similar deficits in communication, in learning, in vocalizations. If you manipulate this gene in mice, what was interesting is that we got effects on the vocal behavior, just not as dramatic. So that was consistent with the continuum hypothesis.

And so what I think is going on there is that old genes with us, in vertebrates for millions of years, have now become genetically modified to enhance these and make these novel brain circuits for vocal communication. And some of these old genes that even if you touch them a little bit and mess around with them a little bit, they have a dramatic effect on vocal communication in us vocal learners, but not as dramatic an effect on other behaviors.

LEVIN: Does this suggest that the path forward is gene therapy as opposed to, say, physical therapy?

JARVIS: Yeah, so I think both do work. Gene therapy is actually becoming not only possible, it’s happening, in humans. It’s amazing. I wouldn’t have said it before, but I’ll say it now: I wouldn’t be surprised at some point in the future, whether I’m around or not, we’ll be having gene therapy for some type of speech disorders, and physical therapy on top of that. So speech is a heavily controlled genetic behavior, but it’s also a culturally controlled behavior. And if you practice, you can overcome some of the genetic handicaps.

LEVIN: You mentioned that you were once a dancer. Did being a dancer spark your scientific interests?

JARVIS: I won’t say being a dancer sparked my scientific interest, but being a dancer prepared me to be a scientist. I now realize that the discipline that I learned as a dancer, in terms of practicing until you make it perfect, failing a lot before you succeed, it’s not quite a 9-to-5 job, being creative, all of these things, is really what a scientist needs to be. And so whenever I have some dancers or passionate artists want to join my group, I know they’re prepared.

LEVIN: They have what it takes. I’ve always wanted to know if being a great dancer, being a great singer is more in the mind or more in the instrument, the body.

JARVIS: Yeah. I think it’s a combination of three things, right? It’s the mind and the body working together. I don’t think everybody’s vocal cords are the same. Everybody’s brain is not the same. Just like, we’re diverse people in terms of height, color and so forth. You know, there’s diversity in our body, in our brains as well.

However, that doesn’t mean automatically it’s going to make somebody a great singer and somebody else not. There’s one thing also controlled by the mind that you really need to be great at almost anything: discipline. Right? And you need that discipline to really perfect your instrument, whatever part of that body is going to be, to do well at it. That discipline, who knows, it could be genetically controlled as well. And so I’m going to say that’s the ultimate part, is the discipline.

LEVIN: And now a question that we like to ask at “The Joy of Why” is, what about your research brings you joy?

JARVIS: Oh, that makes sense, “The Joy of Why,” yes.

[Both laugh]

Yes, um, I’m one of those people that really just likes learning, and it’s one reason why I went into science, because I’m always learning something new, I’m always involved in learning something new, involved in discovery.

I believe my science, and scientists in general, we’re doing good for the planet. We’re doing good for society. I know there’s a lot of science advancements that led to toxins in the environment that cause cancer. I don’t like that, you know, but I know we scientists also can do something about that.

My mother always told me when I was growing up, do something that has a positive impact on society. And I felt I can do that best as a scientist. And so, I get joy out of just knowing that I’m helping. You know, that what I’m doing someday is going to be helpful to somebody.

I’ll just add one thing because you’re an astrophysicist. Once I decided I was not going to be a professional dancer, although I still dance, I was going to be a scientist, then I was considering the origins of the universe was my interest, or how the brain works. And a number of years later, I’ve been getting invitations to [the] Astrophysicist Conference, American Society (something like that) for Astrophysicists, and I’m like fascinated by the common interest here of advanced behavior and language and evolution and what astrophysicists are interested in. So I’m hoping that our conversation together here is a sign of more of a marriage between those two fields.

LEVIN: I love the idea of working on a biology project now and then. It’s become such a fascinating field with the advances in genetics. I can see why there’s that comparison. It reminds me of the Emily Dickinson poem, which I’m probably going to butcher, but it’s something like, “The mind is wider than the sky because the one contains the other with ease and more besides.” [Editor’s note: A minor misquotation from memory; the poem can be read here.]

Thank you so much, Erich. We’ve been speaking with neuroscientist Erich Jarvis. It’s been such a pleasure to have you. Thank you for joining us.

JARVIS: You’re welcome. And it’s been great talking with you.

[Theme plays]

LEVIN: Thanks for listening. If you’re enjoying “The Joy of Why” and you’re not already subscribed, hit the subscribe or follow button where you’re listening. You can also leave a review for the show. It helps people find this podcast.

“The Joy of Why” is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence on the selection of topics, guests or other editorial decisions in this podcast or in Quanta Magazine.

“The Joy of Why” is produced by PRX Productions; the production team is Caitlin Faulds, Livia Brock, Genevieve Sponsler, and Merritt Jacob. The executive producer of PRX Productions is Jocelyn Gonzales. Morgan Church and Edwin Ochoa provided additional assistance.

From Quanta Magazine, John Rennie and Thomas Lin provided editorial guidance, with support from Matt Carlstrom, Samuel Velasco, Arleen Santana and Meghan Willcoxon. Samir Patel is Quanta’s editor in chief.

Our theme music is from APM Music. Julian Lin came up with the podcast name. The episode art is by Peter Greenwood and our logo is by Jaki King and Kristina Armitage. Special thanks to the Columbia Journalism School and Bert Odom-Reed at the Cornell Broadcast Studios.

I’m your host, Janna Levin. If you have any questions or comments for us, please email us at [email protected]. Thanks for listening.
