Astral Codex Ten - Your Book Review: How Language Began
[This is one of the finalists in the 2024 book review contest, written by an ACX reader who will remain anonymous until after voting is done. I’ll be posting about one of these a week for several months. When you’ve read them all, I’ll ask you to vote for a favorite, so remember which ones you liked]

I. THE GOD

You may have heard of a field known as "linguistics". Linguistics is supposedly the "scientific study of language", but this is completely wrong. To borrow a phrase from elsewhere, linguists are those who believe Noam Chomsky is the rightful caliph. Linguistics is what linguists study.

I'm only half-joking, because Chomsky’s impact on the study of language is hard to overstate. Consider the number of times his books and papers have been cited, a crude but serviceable measure of influence. At the time of writing, his Google Scholar page says he's been cited over 500,000 times. That’s a lot. It isn’t atypical for a hard-working professor at a top-ranked institution, after a career’s worth of work and with many people helping them do research and write papers, to end up with maybe 20,000 citations (= 0.04 Chomskys). Generational talents do better, but usually not by more than a factor of 5 or so. Consider a few more citation counts:
Yes, fields vary in ways that make these comparisons not necessarily fair: fields have different numbers of people, citation practices vary, and so on. There is also probably a considerable recency bias; for example, most biologists don’t cite Darwin every time they write a paper whose content relates to evolution. But 500,000 is still a mind-bogglingly huge number. Not many academics do better than Chomsky citation-wise. But there are a few, and you can probably guess why:
…well, okay, maybe I don’t entirely get Foucault’s number. Every humanities person must have an altar to him by their bedside or something.

Chomsky has been called “arguably the most important intellectual alive today” in a New York Times review of one of his books, and was voted the world’s top public intellectual in a 2005 poll. He’s the kind of guy who gets long and gushing introductions before his talks (this one is nearly twenty minutes long). All of this is just to say: he’s kind of a big deal. This is what he looks like. According to Wikipedia, the context for this picture is: “Noam Chomsky speaks about humanity's prospects for survival.”

Since around 1957, Chomsky has dominated linguistics. And this matters because he is kind of a contrarian with weird ideas. Is language for communicating? No, it’s mainly for thinking: (What Kind of Creatures Are We? Ch. 1, pg. 15-16)
Should linguists care about the interaction between culture and language? No, that’s essentially stamp-collecting: (Language and Responsibility, Ch. 2, pg. 56-57)
Did the human capacity for language evolve gradually? No, it suddenly appeared around 50,000 years ago after a freak gene mutation: (Language and Mind, third edition, pg. 183-184)
I think all of these positions are kind of insane for reasons that we will discuss later. (Side note: Chomsky’s proposal is essentially the hard takeoff theory of human intelligence.)

Most consequential of all, perhaps, are the ways Chomsky has influenced (i) what linguists mainly study, and (ii) how they go about studying it. Naively, since language involves many different components—including sound production and comprehension, intonation, gestures, and context, among many others—linguists might want to study all of these. While they do study all of these, Chomsky and his followers view grammar as by far the most important component of humans’ ability to understand and produce language, and accordingly make it their central focus. Roughly speaking, grammar refers to the set of language-specific rules that determine whether a sentence is well-formed. It goes beyond specifying word order (or ‘surface structure’, in Chomskyan terminology), since one needs to know more than just where words are placed in order to modify or extend a given sentence. Consider a pair of sentences Chomsky uses to illustrate this point in Aspects of the Theory of Syntax (pg. 22), his most cited work:

(1a) I expected John to be examined by a specialist.
(2a) I persuaded John to be examined by a specialist.

The words “expected” and “persuaded” appear in the same location in each sentence, but imply different ‘latent’ grammatical structures, or ‘deep structures’. One way to show this is to observe that a particular way of rearranging the words produces a sentence with the same meaning in the first case (1a = 1b), and a different meaning in the second (2a != 2b):

(1b) I expected a specialist to examine John.
(2b) I persuaded a specialist to examine John.

In particular, the target of persuasion is “John” in the case of (2a), and “the specialist” in the case of (2b). A full Chomskyan treatment of sentences like this would involve hierarchical tree diagrams, which permit a precise description of deep structure.

You may have encountered the famous sentence “Colorless green ideas sleep furiously.” It first appeared in Chomsky’s 1957 book Syntactic Structures, and the point is that even nonsense sentences can be grammatically well-formed, and that speakers can quickly assess the grammatical correctness of even nonsense sentences they’ve never seen before. To Chomsky, this is one of the most important facts to be explained about language.

A naive response to Chomsky’s preoccupation with grammar is: doesn’t real language involve a lot of non-grammatical stuff, like stuttering and slips of the tongue and midstream changes of mind? Of course it does, and Chomsky acknowledges this. To address this point, Chomsky has to move the goalposts in two important ways. First, he famously distinguishes competence from performance, and identifies the former as the subject of any serious theory of language: (Aspects of the Theory of Syntax, Ch. 1, pg. 4)
Moreover, he claims that grammar captures most of what we should mean when we talk about speakers’ linguistic competence: (Aspects of the Theory of Syntax, Ch. 1, pg. 24)
Another way Chomsky moves the goalposts is by distinguishing E-languages, like English and Spanish and Japanese, from I-languages, which only exist inside human minds. He claims that serious linguistics should be primarily interested in the latter. In a semi-technical book summarizing Chomsky’s theory of language, Cook and Newson write: (Chomsky’s Universal Grammar: An Introduction, pg. 13)
Not only should linguistics primarily be interested in studying I-languages, but to try and study E-languages at all may be a fool’s errand: (Chomsky’s Universal Grammar: An Introduction, pg. 13)
I Am Not A Linguist (IANAL), but this redefinition of the primary concern of linguistics seems crazy to me. Is studying a language like English as it is actually used really of no particular empirical significance? And this doesn’t seem to be a one-time hyperbole, but a representative claim. Cook and Newson continue: (Chomsky’s Universal Grammar: An Introduction, pg. 14)
So much for what linguists ought to study. How should they study it? The previous quote gives us a clue. Especially in the era before Chomsky (BC), linguists were more interested in description. Linguists were, at least in one view, people who could be dropped anywhere in the world and emerge with a tentative grammar of the local language six months later. (A notion like this is mentioned early in this video.) Linguists catalog myriad strange details about human languages, like the fact that some languages don’t appear to have words for relative directions, or “thank you”, or “yes” and “no”. After Chomsky's domination of the field (AD), there were a lot more theorists. While you could study language by going out into the field and collecting data, this was viewed as not the only, and maybe not even the most important, way to work. Diagrams of sentences proliferated. Chomsky, arguably the most influential linguist of the past hundred years, has never done fieldwork. In summary, to Chomsky and many of the linguists working in his tradition, the scientifically interesting component of language is grammatical competence, and real linguistic data only indirectly reflects it.

All of this matters because the dominance of Chomskyan linguistics has had downstream effects in adjacent fields like artificial intelligence (AI), evolutionary biology, and neuroscience. Chomsky has long been an opponent of the statistical learning tradition of language modeling, essentially claiming that it does not provide insight into what humans know about languages, and that engineering success probably can’t be achieved without explicitly incorporating important mathematical facts about the underlying structure of language. Chomsky’s ideas have motivated researchers to look for a “language gene” and “language areas” of the brain. Arguably, no one has yet found either—but more on that later.

How Chomsky attained this stranglehold on linguistics is an interesting sociological question, but not our main concern in the present work². The intent here is not to pooh-pooh Chomsky, either; brilliant and hard-working people are often wrong on important questions. Consider that his academic career began in the early 1950s—over 70 years ago!—when our understanding of language, anthropology, biology, neuroscience, and artificial intelligence, among many other things, was substantially more rudimentary.

Where are we going with this? All of this is context for understanding the ideas of a certain bomb-throwing terrorist blight on the face of linguistics: Daniel Everett. How Language Began is a book he wrote about, well, what language is and how it began. Everett is the anti-Chomsky.

II. THE MISSIONARY

We all love classic boy-meets-girl stories. Here’s one: boy meets girl at a rock concert, they fall in love, the boy converts to Christianity for the girl, then the boy and girl move to the Amazon jungle to dedicate the rest of their lives to saving the souls of an isolated hunter-gatherer tribe.

Daniel Everett is the boy in this story. The woman he married, Keren Graham, is the daughter of Christian missionaries and had formative experiences living in the Amazon jungle among the Sateré-Mawé people. At seventeen, Everett became a born-again Christian; at eighteen, he and Keren married; and over the next few years, they started a family and prepared to become full-fledged missionaries like Keren’s parents. First, Everett studied “Bible and Foreign Missions” at the Moody Bible Institute in Chicago.
After he finished his degree in 1975, the natural next step was to train more specifically to follow in the footsteps of Keren’s parents. In 1976, he and his wife enrolled in the Summer Institute of Linguistics (SIL) to learn translation techniques and to prepare more viscerally for life in the jungle:
Everett apparently had a gift for language-learning. This led SIL to invite Everett and his wife to work with the Pirahã people (pronounced pee-da-HAN), whose unusual language had thwarted all previous attempts to learn it. In 1977, Everett’s family moved to Brazil, and in December they met the Pirahã for the first time. As an SIL-affiliated missionary, Everett had two explicit goals: (i) translate the Bible into Pirahã, and (ii) convert as many Pirahã as possible to Christianity. But Everett’s first encounter with the Pirahã was cut short for political reasons: (Don’t Sleep, There Are Snakes, Ch. 1, pg. 13-14)
Everett became a linguist proper sort of by accident, mostly as an excuse to continue his missionary work. But he ended up developing a passion for it. In 1980, he completed Aspects of the Phonology of Pirahã, his master’s thesis at the State University of Campinas (UNICAMP) in Brazil. He went on to get a PhD in linguistics, also from UNICAMP, and in 1983 finished The Pirahã Language and Theory of Syntax, his dissertation. He continued studying the Pirahã and working as an academic linguist after that. In all, Everett spent around ten years of his life living with the Pirahã, spread out over some thirty-odd years. As he notes in Don’t Sleep, There Are Snakes: (Prologue, pg. xvii-xviii)
Everett did eventually learn their language, and it’s worth taking a step back to appreciate just how hard that task was. No Pirahã spoke Portuguese, apart from some isolated phrases they used for bartering. They didn’t speak any other language at all—just Pirahã. How do you learn another group’s language when you have no languages in common? The technical term is monolingual fieldwork. But this is just a fancy label for some combination of pointing at things, listening, crude imitation, and obsessively transcribing whatever you hear. For years. It doesn’t help that the Pirahã language seems genuinely hard to learn in a few different senses. First, it is probably conventionally difficult for Westerners to learn since it is a tonal language (two tones: high and low) with a small number of phonemes (building block sounds) and a few unusual sounds³. Second, there is no written language. Third, the language has a variety of ‘channels of discourse’, or ways of talking specialized for one or another cultural context. One of these is ‘whistle speech’; Pirahãs can communicate purely in whistles. This feature appears to be extremely useful during hunting trips: (Don’t Sleep, There Are Snakes, Ch. 11, pg. 187-188)
Fourth, important aspects of the language reflect core tenets of Pirahã culture in ways that one might not a priori expect. Everett writes extensively about the ‘immediacy of experience principle’ of Pirahã culture, which he summarizes as the idea that: (Don’t Sleep, There Are Snakes, Ch. 7, pg. 132)
One way the language reflects this is that the speaker must specify how they know something by affixing an appropriate suffix to verbs: (Don’t Sleep, There Are Snakes, Ch. 12, pg. 196)
Everett also convincingly links this cultural principle to the lack of Pirahã number words and creation myths. On the latter topic, Everett recalls the following exchange: (Don’t Sleep, There Are Snakes, Ch. 7, pg. 134)
And all of this is to say nothing of the manifold perils of the jungle: malaria, typhoid fever, dysentery, dangerous snakes, insects, morally gray river traders, and periodic downpours. If Indiana Jones braved these conditions for years, we would consider his stories rousing adventures. Everett did this while also learning one of the most unusual languages in the world. By the way, he did eventually sort of achieve his goal of translating the Bible. Armed with a solid knowledge of Pirahã, he was able to translate the New Testament’s Gospel of Mark. Since the Pirahã have no written language, he provided them with a recorded version, but did not get the reaction he expected: (Don’t Sleep, There Are Snakes, Ch. 17, pg. 267-268)
One reaction to hearing the gospel caught Everett even more off-guard: (Don’t Sleep, There Are Snakes, Ch. 17, pg. 269)
But the Pirahã had an even more serious objection to Jesus: (Don’t Sleep, There Are Snakes, Ch. 17, pg. 265-266)
In the end, Everett never converted a single Pirahã. But he did even worse than converting zero people—he lost his own faith after coming to believe that the Pirahã had a good point. After keeping this to himself for many years, he revealed his loss of faith to his family, which led to a divorce and to his children breaking contact with him for a number of years afterward.

But Everett losing his faith in the God of Abraham was only the beginning. Most importantly for us, he also lost his faith in the God of Linguistics—Noam Chomsky.

III. THE WAR

In 2005, Everett’s paper “Cultural constraints on grammar and cognition in Pirahã: Another look at the design features of human language” was published in the journal Current Anthropology. An outsider might expect an article like this, which made a technical observation about the apparent lack of a property called ‘recursion’ in the Pirahã language, to receive an ‘oh, neat’ sort of response. Languages can be pretty different from one another, after all. Mandarin lacks plurals. Spanish sentences can omit an explicit subject. This is one of those kinds of things. But the article ignited a firestorm of controversy that follows Everett to this day. Praise for Everett and his work on recursion in Pirahã:
Apparently he struck a nerve. And there is much more vitriol like this; see Pullum for the best (short) account of the beef I’ve found, along with sources for each quote except the last. On the whole affair, he writes:
I’m not going to rehash all of the details, but the conduct of many in the pro-Chomsky faction is pretty shocking. Highly recommended reading. Substantial portions of the books The Kingdom of Speech and Decoding Chomsky are also dedicated to covering the beef and related issues, although I haven’t read them. What’s going on? Assuming Everett is indeed acting in good faith, why did he get this reaction? As I said in the beginning, linguists are those who believe Noam Chomsky is the rightful caliph. Central to Chomsky’s conception of language is the idea that grammar reigns supreme, and that human brains have some specialized structure for learning and processing grammar. In the writing of Chomsky and others, this hypothetical component of our biological endowment is sometimes called the narrow faculty of language (FLN); this is to distinguish it from other (e.g., sensorimotor) capabilities relevant for practical language use. A paper by Hauser, Chomsky, and Fitch titled “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?” was published in the prestigious journal Science in 2002, just a few years earlier. The abstract contains the sentence:
Some additional context is that Chomsky had spent the past few decades simplifying his theory of language. A good account of this is provided in the first chapter of Chomsky’s Universal Grammar: An Introduction. By 2002, arguably not much was left: the core claims were that (i) grammar is supreme, and (ii) all grammar is recursive and hierarchical. More elaborate aspects of previous versions of Chomsky’s theory, like the idea that each language might be identified with different parameter settings of some ‘global’ model constrained by the human brain (the core idea of the so-called ‘principles and parameters’ formulation of universal grammar), were by now viewed as helpful and interesting but not necessarily fundamental. Hence, it stands to reason that evidence suggesting not all grammar is recursive could be perceived as a significant threat to the Chomskyan research program. If not all languages had recursion, then what would be left of Chomsky’s once-formidable theoretical apparatus?

Everett’s paper inspired a lively debate, with many arguing that he is lying, or misunderstands his own data, or misunderstands Chomsky, or some combination of those things. The most famous anti-Everett response is “Pirahã Exceptionality: A Reassessment” by Nevins, Pesetsky, and Rodrigues (NPR), which was published in the prestigious journal Language in 2009. This paper got a response from Everett, which led to an NPR response-to-the-response. To understand how contentious even the published form of this debate became, I reproduce in full the final two paragraphs of NPR’s response-response:
Two observations here. First, another statement about “serious” linguistics; why does that keep popping up? Second, wow. That’s the closest you can come to cursing someone out in a prestigious journal. Polemics aside, what’s the technical content of each side’s argument? Is Pirahã recursive or not? Much of the debate appears to hinge on two things:
Everett generally takes recursion to refer to the following property of many natural languages: one can construct sentences or phrases from other sentences and phrases. For example:

“The cat died.” -> “Alice said that [the cat died].” -> “Bob said that [Alice said that [the cat died]].”

In the above example, we can in principle generate infinitely many new sentences by writing “Z said X,” where X is the previous sentence and Z is some name. For clarity’s sake, one should probably distinguish between different ways to generate new sentences or phrases from old ones; Pullum mentions a few in the context of assessing Everett’s Pirahã recursion claims:
Regardless of the details, a generic prediction should be that there is no longest sentence in a language whose grammar is recursive. This doesn’t mean that one can utter an arbitrarily long sentence in real life⁴. Rather, it means that, given a member of some large set of sentences, one can always extend it. Everett takes the claim that “all natural human languages have recursion” to mean that, if there exists a natural human language without recursion, the claim is false. Or, slightly more subtly: if there exists a language that uses recursion so minimally that linguists have a hard time determining whether a corpus of linguistic data falsifies the claim, then sentence-level recursion is probably not a bedrock principle of human languages. I found the following anecdote from a 2012 paper of Everett’s enlightening:
He does explicitly claim (in the aforementioned paper and elsewhere) that Pirahã probably has no longest sentence, which is about the most generic anti-recursion statement one can make. Chomsky and linguists working in his tradition sometimes write in a way consistent with Everett’s conception of recursion, but sometimes don’t. For example, consider this random 2016 blogpost I found by a linguist in training:
To be clear, this usage of ‘recursion’ seems consistent with how many other Chomskyan linguists have used the term. And with all due respect to these researchers, I find this notion of recursion completely insane, because it would imply that (i) any language with more than one word in its sentences has recursion, and that (ii) all sentences are necessarily constructed recursively. The first implication means that “all natural human languages have recursion” reduces to the vacuously true claim that “all languages allow more than one word in their sentences”⁵. The second idea is more interesting, because it relates to how the brain constructs sentences, but as far as I can tell this claim cannot be tested using purely observational linguistic data. One would have to do some kind of experiment to check the order in which subjects mentally construct sentences, and ideally make brain activity measurements of some sort.

Aside from sometimes involving a strange notion of recursion, another feature of the Chomskyan response to Everett relates to the distinction we discussed earlier between so-called E-languages and I-languages. Consider the following exchange from a 2012 interview with Chomsky:
Chomsky makes claims like this elsewhere too. The argument is that, even if there were a language without a recursive grammar, that would not be inconsistent with his theory, since his theory is not about E-languages like English or Spanish or Pirahã. His theory only makes claims about I-languages, or equivalently about our innate language capabilities. But this is kind of a dumb rhetorical move. Either the theory makes predictions about real languages or it doesn’t. The statement that some languages in the world are arguably recursive is not a prediction; it’s an observation, and we didn’t need the theory to make it. What does it mean for the grammar of thought languages to be recursive? How do we test this? Can we test it by doing experiments involving real linguistic data, or not? If not, are we even still talking about language?

To this day, as one might expect, not everyone agrees with Everett that (i) Pirahã lacks a recursive hierarchical grammar, and that (ii) such a discovery would have any bearing at all on the truth or falsity of Chomskyan universal grammar. Given that languages can be pretty weird, among other reasons, I am inclined to side with Everett here. But where does that leave us? We do not just want to throw bombs and tell everyone their theories are wrong. Does Everett have an alternative to the Chomskyan account of what language is and where it came from? Yes, and it turns out he’s been thinking about this for a long time. How Language Began is his 2017 offering in this direction.

IV. THE BOOK

So what is language, anyway? Everett writes: (How Language Began, Ch. 1, pg. 15)
Okay, so far, so good. To the uninitiated, it looks like Everett is just listing all of the different things that are involved in language; so what? The point is that language is more than just grammar. He goes on to say this explicitly: (How Language Began, Ch. 1, pg. 16)
His paradigmatic examples here are Pirahã and Riau Indonesian, which appears to lack a hierarchical grammar, and which moreover apparently lacks a clear noun/verb distinction. You might ask: what does that even mean? I’m not 100% sure, since the linked Gil chapter appears formidable, but Wikipedia gives a pretty good example in the right direction:
Is “chicken” the subject of the sentence, the object of the sentence, or something else? Well, it depends on the context. What’s the purpose of language? Communication: (How Language Began, Introduction, pg. 5)
Did language emerge suddenly, as it does in Chomsky’s proposal, or gradually? Very gradually: (How Language Began, Introduction, pg. 7-8)
So far, we have a bit of a nothingburger. Language is for communication, and probably—like everything else!—emerged gradually over a long period of time. While these points are interesting as a contrast to Chomsky, they are not that surprising in and of themselves. But Everett’s work goes beyond taking the time to bolster common sense ideas on language origins. Two points he discusses at length are worth briefly exploring here. First, he offers a much more specific account of the emergence of language than Chomsky does, and draws on a mix of evidence from paleoanthropology, evolutionary biology, linguistics, and more. Second, he pretty firmly takes the Anti-Chomsky view on whether language is innate: (Preface, pg. xv)
These two points are not unrelated. Everett’s core idea is that language should properly be thought of as an invention rather than an innate human capability. You might ask: who invented it? Who shaped it? Lots of people, collaboratively, over a long time. In a word, culture. As Everett notes in the preface, “Language is the handmaiden of culture.” In any case, let’s discuss these points one at a time. First: the origins of language. There are a number of questions one might want to answer about how language began:
To Everett, the most important feature of language is not grammar or any particular properties of grammar, but the fact that it involves communication using symbols. What are symbols? (Ch. 1, pg. 17)
There are often rules for arranging symbols, but given how widely they can vary in practice, Everett views such rules as interesting but not fundamental. One can have languages with few rules (e.g., Riau) or complex rules (e.g., German); the key requirement for a language is that symbols are used to convey meaning. Where did symbols come from? To address this question, Everett adapts a theory due to the (in his view underappreciated) American polymath Charles Sanders Peirce: semiotics, the theory of signs. What are signs? (Ch. 1, pg. 16)
Everett, in the tradition of Peirce, distinguishes between several types of signs. The distinction is based on (i) whether the pairing is intentional, and (ii) whether the form of the sign is arbitrary. Indexes are non-intentional, non-arbitrary pairings of form and meaning (think: dog paw print). Icons are intentional, non-arbitrary pairings of form and meaning (think: a drawing of a dog paw print). Symbols are intentional, arbitrary pairings (think: the word “d o g” refers to a particular kind of real animal, but does not resemble anything about it).

Everett argues that symbols did not appear out of nowhere, but rather arose from a natural series of abstractions of concepts relevant to early humans. The so-called ‘semiotic progression’ that ultimately leads to symbols looks something like this:

indexes (dog paw print) -> icons (drawing of dog paw print) -> symbols (“d o g”)

This reminds me of what little I know about how written languages changed over time. For example, many Chinese characters used to look a lot more like the things they represented (icon-like), but became substantially more abstract (symbol-like) over time.

For a given culture and concept, the icon-to-symbol transition could’ve happened any number of ways. For example, early humans could’ve mimicked an animal’s cry to refer to it (icon-like, since this evokes a well-known physical consequence of some animal’s presence), but then gradually shifted to making a more abstract sound (symbol-like) over time. The index (non-intentional, non-arbitrary) to icon transition must have happened even earlier. This refers to whatever process led early humans to, for example, mimic a given animal’s cry in the first place, or to draw people on cave walls, or to collect rocks that resemble human faces. Is there a clear boundary between indexes, icons, and symbols? It doesn’t seem like it, since things like Chinese characters changed gradually over time. But Everett doesn’t discuss this point explicitly.

Why did we end up with certain symbols and not others? Well, there’s no good a priori reason to prefer “dog” over “perro” or “adsnofnowefn”, so Everett attributes the selection mostly to cultural forces. Everett suggests these forces shape language in addition to practical considerations, like the fact that, all else being equal, we prefer words that are not hundreds of characters long, because they would be too annoying to write or speak.

When did language—in the sense of communication using symbols—begin? Everett makes two kinds of arguments here. One kind of argument is that certain feats are hard enough that they probably required language in this sense. Another kind of argument relates to the ways human anatomy is known to have changed on evolutionary time scales. The feats Everett talks about are things like traveling long distances across continents, possibly even in a directed rather than random fashion; manufacturing nontrivial hand tools (e.g., Oldowan and Mousterian); building complex settlements (e.g., the one found at Gesher Benot Ya'aqov); controlling fire; and using boats to successfully navigate treacherous waters. Long before sapiens arose, paleoanthropological evidence suggests that our predecessors Homo erectus did all of these things. Everett argues that they might have had language over one million years ago⁶. This differs from Chomsky’s proposal by around an order of magnitude, time-wise, and portrays language as something not necessarily unique to modern humans.
In Everett’s view, Homo sapiens probably improved on the language technology bestowed upon them by their erectus ancestors, but did not invent it. Everett’s anatomy arguments relate mainly to the structure of the head and larynx (our ‘voice box’, an organ that helps us flexibly modulate the sounds we produce). Over the past two million years, our brains got bigger, our face and mouth became more articulate, our larynx changed in ways that gave us a clearer and more elaborate inventory of sounds, and our ears became better tuned to hearing those sounds. Here’s the kind of thing Everett writes on this topic: (Ch. 5, pg. 117)
Pretty neat and not something I would’ve thought about. What aspects of biology best explain all of this? Interestingly, at no point does Everett require anything like Chomsky’s faculty of language; his view is that language was primarily enabled by early humans being smart enough to make a large number of useful symbol-meaning associations, and social enough to perpetuate a nontrivial culture. Everett thinks cultural pressures forced humans to evolve bigger brains and better communication apparatus (e.g., eventually giving us modern hyoid bones to support clearer speech), which drove culture to become richer, which drove yet more evolution, and so on. Phew.

Let’s go back to the question of innateness before we wrap up. Everett’s answer to the innateness question is complicated and in some ways subtle. He agrees that certain features of the human anatomy evolved to support language (e.g., the pharynx and ears). He also agrees that modern humans are probably much better than Homo erectus at working with language, if indeed Homo erectus did have language. He mostly seems to take issue with the idea that some region of our brain is specialized for language. Instead, he thinks that our ability to produce and comprehend language is due to a mosaic of generally useful cognitive capabilities, like our ability to remember things for relatively long times, our ability to form and modify habits, and our ability to reason under uncertainty. This last capability seems particularly important since, as Everett points out repeatedly, most language-based communication is ambiguous, and it is important for participants to exploit cultural and contextual information to more reliably infer the intended messages of their conversation partners. Incidentally, this is a feature of language Chomskyan theory tends to neglect⁷.

Can’t lots of animals do all those things? Yes. Everett views the difference as one of degree, not necessarily of quality. What about language genes like FOXP2 and putative language areas like Broca’s and Wernicke’s areas? What about specific language impairments? Aren’t they clear evidence of language-specific human biology? Well, FOXP2 appears to be more related to speech control—a motor task. Broca’s and Wernicke’s areas are both involved in coordinating motor activity unrelated to speech. Specific language impairments, contrary to their name, also involve some other kind of deficit in the cases known to Everett.

I have to say, I am not 100% convinced by the brain arguments. I mean, come on, look at the videos of people with Broca’s aphasia or Wernicke’s aphasia. Also, I buy that Broca’s and Wernicke’s areas (or whatever other putative language areas are out there) are active during non-language-related behavior, or that they represent non-language-related variables. But this is also true of literally every other area we know of in the brain, including well-studied sensory areas like the primary visual cortex. It’s no longer news when people find variable X encoded in region Y-not-typically-associated-with-X. Still, I can’t dismiss Everett’s claim that there is no language-specific brain area. At this point, it’s hard to tell. The human brain is complicated, and there remains much that we don’t understand.

Overall, Everett tells a fascinatingly wide-ranging and often persuasive story. If you’re interested in what language is and how it works, you should read How Language Began.
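One more note on the “reasoning under uncertainty” point from a few paragraphs back, because it is easy to state and hard to visualize: inferring what a speaker meant from an ambiguous utterance plus context is, at bottom, something like Bayesian inference. Here is a toy sketch, not from the book; the words, contexts, and all the numbers are invented purely for illustration.

```python
# Toy priors: how plausible each reading of the ambiguous word "bank" is in each context.
# All numbers are invented for illustration.
prior_given_context = {
    "fishing trip":     {"riverbank": 0.90, "financial bank": 0.10},
    "loan application": {"riverbank": 0.05, "financial bank": 0.95},
}

# How likely a speaker is to choose the word "bank" for each intended meaning.
word_likelihood = {"riverbank": 0.5, "financial bank": 0.8}

def infer_meaning(context: str) -> dict:
    """P(meaning | word = 'bank', context), by Bayes' rule."""
    unnormalized = {
        meaning: word_likelihood[meaning] * prior
        for meaning, prior in prior_given_context[context].items()
    }
    total = sum(unnormalized.values())
    return {meaning: p / total for meaning, p in unnormalized.items()}

print(infer_meaning("fishing trip"))      # the riverbank reading wins
print(infer_meaning("loan application"))  # the financial reading wins
```

Everett’s claim, roughly, is that general-purpose machinery like this, together with memory, habit, and culture, does most of the work usually attributed to a dedicated language faculty.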
There’s a lot of interesting stuff in the book that I haven’t talked about, especially for someone unfamiliar with at least one of the areas Everett covers (evolution, paleoanthropology, theoretical linguistics, neuroanatomy, …). Especially fun are the chapters on aspects of language I don’t hear people talk about as much, like gestures and intonation. As I’ve tried to convey, Everett is well-qualified to write something like this, and has been thinking about these topics for a long time. He’s the kind of linguist most linguists wish they could be, and he’s worth taking seriously, even if you don’t agree with everything he says.

V. THE REVELATIONS

I want to talk about large language models now. Sorry. But you know I had to do this.

Less than two years ago at the time of writing, the shocking successes of ChatGPT put many commentators in an awkward position. Beyond all the quibbling about details (Does ChatGPT really understand? Doesn’t it fail at many tasks trivial for humans? Could ChatGPT or something like it be conscious?), the brute empirical fact remains that it can handle language comprehension and generation pretty well. And this is despite the conception of language underlying it—language use as a statistical learning problem, with no sentence diagrams or grammatical transformations in sight—being somewhat antithetical to the Chomskyan worldview.

Chomsky has frequently criticized the statistical learning tradition, with his main criticisms seeming to be that (i) statistical learning produces systems with serious defects, and (ii) succeeding at engineering problems does not tell us anything interesting about how the human brain handles language. These are reasonable criticisms, but I think they are essentially wrong. Statistical approaches succeeded where more directly Chomsky-inspired approaches failed, and it was never close. Large language models (LLMs) like ChatGPT are not perfect, but they’re getting better all the time, and the onus is on the critics to explain where they think the wall is. It’s conceivable that a completely orthogonal system designed according to the principles of universal grammar could outperform LLMs built according to the current paradigm—but this possibility is becoming vanishingly unlikely.

Why do statistical learning systems handle language so well? If Everett is right, the answer is in part because (i) training models on a large corpus of text and (ii) providing human feedback both give models a rich collection of what is essentially cultural information to draw upon. People like talking with ChatGPT not just because it knows things, but because it can talk like them. And that is only possible because, like humans, it has witnessed and learned from many, many, many conversations between humans. Statistical learning also allows these systems to appreciate context and reason under uncertainty, at least to some extent, since both of these are crucial factors in many of the conversations that appear in training data. These capabilities would be extremely difficult to implement by hand, and it’s not clear how a more Chomskyan approach would handle them, even if some kind of universal-grammar-based latent model otherwise worked fairly well. Chomsky’s claim that engineering success does not necessarily produce scientific insight is not uncommon, but a large literature speaks against it.
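To make the statistical-learning stance concrete, here is roughly the smallest possible language model: a bigram model, sketched in Python with a three-sentence toy corpus borrowed from the recursion example earlier in this review. The corpus and code are mine, purely for illustration; ChatGPT-style models replace the count table with a large neural network trained on vastly more text. Note that nothing about grammar, deep structure, or recursion is built in; whatever structure the output has is soaked up from the data.

```python
import random
from collections import defaultdict, Counter

# Toy "corpus"; a real model would be trained on billions of tokens.
corpus = [
    "the cat died",
    "alice said that the cat died",
    "bob said that alice said that the cat died",
]

# Count how often each word follows each other word (with start/end markers).
counts = defaultdict(Counter)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to how often it followed `prev` in the corpus."""
    options = counts[prev]
    return random.choices(list(options), weights=list(options.values()))[0]

def generate(max_len=20):
    """Generate a sentence by repeatedly sampling from the bigram table."""
    words, current = [], "<s>"
    while len(words) < max_len:
        current = next_word(current)
        if current == "</s>":
            break
        words.append(current)
    return " ".join(words)

print(generate())  # e.g. "alice said that the cat died"
```

The Chomskyan objection is that a system like this tells us nothing about what speakers actually know; the empirical surprise of the past few years is how far the scaled-up version of this idea gets you anyway.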
And funnily enough, given that Chomsky is ultimately interested in the mind, engineering successes have provided some of our most powerful tools for interrogating what the mind might look like. The key point is that artificial systems engineered to perform some particular task well are not black boxes; we can look inside them and tinker as we please. Studying the internal representations and computations of such networks has provided neuroscience with crucial insights in recent years, and such approaches are particularly helpful given how costly neuroscience experiments (which might involve, e.g., training animals and expensive recording equipment) can be. Lots of recent computational neuroscience follows this blueprint: build a recurrent neural network to solve a task neuroscientists study, train it somehow, then study its internal representations to generate hypotheses about what the brain might be doing. In principle, (open-source) LLMs and their internal representations can be interrogated in precisely the same way. I’m not sure what’s been done already, but I’m confident that work along these lines will become more common in the near future. Given that high-quality recordings of neural dynamics during natural language use are hard to come by, studying LLMs might be essential for understanding human-language-related neural computations.

When we peer inside language-competent LLMs, what will we find? This is a topic Everett doesn’t have much to say about, and on which Chomsky might actually be right. Whether we’re dealing with the brain or artificial networks, we can talk about the same thing at many different levels of description. In the case of the brain, we might talk in terms of interacting molecules, networks of electrically active neurons, or very many other effective descriptions. In the case of artificial networks, we can either talk about individual ‘neurons’, or some higher-level description that better captures the essential character of the underlying algorithm. Maybe LLMs, at least when trained on data from languages whose underlying rules can be parsimoniously described using universal grammar, effectively exploit sentence diagrams or construct recursive hierarchical representations of sentences using an operation like Merge. If anything like that is true, formalisms like Chomsky’s might still provide a useful way of talking about what LLMs do. Such descriptions might be said to capture the ‘mind’ of an LLM, since from a physicalist perspective the ‘mind’ is just a useful way of talking about a complex system of interacting neurons.

Regardless of who’s right and who’s wrong, the study of language is certainly interesting and we have a lot more to learn. Something Chomsky wrote in 1968 seems like an appropriate summary of the way forward: (Language and Mind, pg. 1)
1 Chomsky 1991b refers to “Linguistics and adjacent fields: a personal view”, a chapter of The Chomskyan Turn. I couldn’t access the original text, so this quote-of-a-quote will have to do.

2 Chomsky’s domination of linguistics is probably due to a combination of factors. First, he is indeed brilliant and prolific. Second, Chomsky’s theories promised to ‘unify’ linguistics and make it more like physics and other ‘serious’ sciences; for messy fields like linguistics, I assume this promise is extremely appealing. Third, he helped create and successfully exploited the cognitive zeitgeist that for the first time portrayed the mind as something that can be scientifically studied in the same way that atoms and cells can. Moreover, he was one of the first to make interesting connections between our burgeoning understanding of fields like molecular biology and neuroscience on the one hand, and language on the other. Fourth, Chomsky was not afraid to get into fights, which can be beneficial if you usually win.

3 One such sound is the bilabial trill, which kind of sounds like blowing a raspberry.

4 This reminds me of a math joke.

5 Why is this vacuously true? If, given some particular notion of ‘sentence’, the sentences of any language could only have one word at most, we would just define some other notion of ‘word collections’.

6 He and archaeologist Lawrence Barham provide a more self-contained argument in this 2020 paper.

7 A famous line at the beginning of Chomsky’s Aspects of the Theory of Syntax goes: “Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.”