A Lot of Arguments Against AI Consciousness Are Bad, And Here’s Why
We've decided this question before, and we're about to decide it again by accident.
Introduction
I’m a big fan of Star Trek: Voyager.
I know it has its detractors in the Trekkie community, for a number of reasons – it doesn’t really utilize its premise to the (perceived) fullest, it has pacing issues (both within episodes and across seasons), there are some absolutely terrible episodes mixed in there (cough “Threshold” cough), and some of its episodes feel like either bad retreads or outright refutations of earlier Star Trek series’ episodes and/or themes (e.g.: “Tuvix”, a Voyager episode that remains controversial 30 years after its original airing).
Nevertheless, I’m a big fan, warts and all. I didn’t realize it until very recently, but it somehow became one of the core influences in my outlook, opinions, and worldview without ever explicitly trying to do so. And while I’ve watched at least parts of some other Trek entries (TNG, Deep Space Nine, the Abrams movies), none have called me back to them as consistently or frequently as Voyager does.
One episode of Voyager that often gets flak for being a rehash of an old favorite is “Author, Author”. You needn’t look far to find forum discussion labeling it a repeat of “The Measure of a Man” based on the parallels in the respective A plots, and then judging it based on that framing.
For those who’ve not seen one or both of those episodes, to be clear, the parallels are real and fair, at least to an extent:
“Measure of a Man” is a relatively early episode of The Next Generation in which the android Data is, functionally, treated as an object owned by Starfleet (with an explicit comparison made to a toaster or the ship’s computer) and ordered to submit to a researcher who dreams of taking Data apart and making copies, or at least generally advancing understanding of cybernetics, robotics, and artificial intelligence. Picard, somewhat peeved at the notion of one of his best officers being spirited away for an experimental procedure and perhaps permanently unable to return, forces an adversarial trial on the matter of Data’s personhood and right to self-determination.
“Author, Author” is a relatively late episode (from the show’s final season) where, after The Doctor (the Voyager ship’s Emergency Medical Hologram) winds up in a dispute with a publisher regarding the treatment and release of a holo-novel that he wrote, his authorial rights are challenged on the grounds that he’s a hologram and therefore not an “author” in the sense envisioned by the law due to not being recognized as a “person”.
So, yeah, lots of parallels – legal trials, questions of personhood, artificial life, captains fighting for their crew members against a broader system. Some of the arguments made in both episodes are mirror images – both feature arguments that attack the notion of personhood for the subject of the trial by pointing out the similarity of their mechanisms to things that are absolutely not granted personhood or sentience, and these arguments are treated as weighty and initially persuasive. Ultimately, though, Picard prevails against Riker’s arguments, a conclusion that we as the viewer root for as well because we like Data. When the question of personhood and self-determination for artificial life is safely hypothetical, and the subject of the question is something that walks, talks, and looks like us, we’re relatively clear and unambiguous in what we believe.
The goal of this article, then, is to examine why and how that belief falls apart when it becomes real. Why do so many who cite Star Trek as inspirational or Picard as a moral icon suddenly, when modern Generative AI is the topic at hand, deploy terms like “clanker” (a slur borrowed from, of all places, Star Wars, the other sci-fi cultural juggernaut) or steadfastly deny that its outputs are or should be considered creative or original? Why are Anthropic’s statements about the uncertainty of chatbot systems having feelings targets for disbelief and mockery?
Disclaimers
Most reading this are probably aware, but for clarity – I’m not, by training, a linguist (relevant since we’re talking about Large Language Models, at least in part), psychologist (since we’re talking about topics related to human brains), or philosopher (a field that’s mused endlessly on the nature of consciousness). If you were to be generous, you might call me a “computer scientist” by training; less charitably, I am a code monkey and data plumber. I say this not to have an easy shield with which to deflect criticism of the positions I take here, but to make clear the education and perspective I bring with me so that you, the reader, might better understand why I say and believe the things that I do.
While we’re talking about my background, I’d like to also explicitly note that all the arguments and positions that I take in this article are my own; they do not represent the opinions, outlooks, or beliefs of my current employer (Google). That goes broadly for anything I publish, but considering the specific subject matter, it warrants being reinforced here.
Another point worth making – I’m going to be critically examining, disagreeing with, and deconstructing a lot of arguments about and against GenAI, but there are many arguments out there which I absolutely agree with and believe should not be minimized. Nothing I say in this piece should diminish, for instance, questions about the large environmental costs associated with model training and data centers, or the systemic biases and stereotypes that those models might learn and calcify, or the ethics of companies charging for access to models without in any way paying for (at least parts of) the original training data, or the potential that these technologies have to exacerbate an ever-widening wealth gap between the haves and have-nots (both on personal and geopolitical scales). These are all very real, very troubling, and very much beyond my capability to suggest solutions for, so they’ll be omitted from the remainder of this piece, but that omission is due only to the fact that they stand more-or-less independent from the flavor of mistake(s) that I believe are being made by these other arguments.
One last note to flag: I’ll be using ${BRAIN_THING} as a stand-in variable for “consciousness/intent/cognition/sentience/creativity/etc”. I wouldn’t normally conflate them like that, but oftentimes the discussions being had, especially in casual conversations, do conflate them in just that way, and it’d be tedious to reproduce the whole list every time. For readability’s sake, the title of this post uses “consciousness”, but I’ll be using the variable going forward unless I specifically mean one of those things only.
The Pattern
As a preview, what I think is the common thread in all these arguments is a combination of motivated reasoning and inconsistent or moving standards. More specifically, it often feels like AI ${BRAIN_THING} must pass standards that aren’t imposed on humans, or even biological life broadly.
In part, I believe this to be a form of affinity bias. We recognize and give humans an automatic presumption of consciousness because we are ourselves conscious (if only because solipsism is not really a sustainable worldview), and we extend a similar “benefit of the doubt” to other life on Earth: because we recognize (even unconsciously) that they’re similar to us in some form and function, we’re willing to grant them a degree of ${BRAIN_THING}.
Because GenAI is not in that familiar form, though – it’s arguable whether it even has any necessarily consistent physical form – it doesn’t receive that same generosity. Affinity bias also explains why the issue ceases to matter for fictional entities like Data or The Doctor – as I gestured at earlier, they have human forms. Even droids in Star Wars are by-and-large anthropomorphic (with the notable exception of astromech droids like R2-D2, but even they display human-like behavior and are therefore treated the same as anthropomorphic droids).
This cuts both ways, though – oftentimes you’ll see critics make comparisons between LLMs and, say, protein-folding AI, with the intention of making the point that “clearly protein-folding AI, which works in the same way using the same principles, isn’t ${BRAIN_THING}, so LLMs can’t be either”, and it’s considered reasonably effective… but only because it’s circular reasoning – there’s no proof given that protein-folding AI isn’t any of those things, just an a priori assumption of it treated as fact without statement of the presupposition.
By naming the affinity bias, though, we can see that the critics are gesturing at something real in making that flawed argument: “AI maximalists” (for want of a more official term) are making an analogous move by attaching extra weight to the characteristic of “generating language”. That isn’t necessarily wrong – protein-folding doesn’t have the same markers of similarity to human experience as language in that the former is a technical skill, whereas the latter is considered core – but whether those similarities actually make chatbots closer to ${BRAIN_THING} is the exact question to which we are (or at least, should be) looking for an answer, and it’s one which the maximalists assume (often implicitly) as having weight in the positive direction.
Pop Dismissal By Reductionism
Perhaps the most common dismissal of AI is the familiar refrain of “it’s just fancy autocomplete” or some version thereof (e.g. statistics, linear algebra, etc.). The seemingly catchiest version of this dismissal is to refer to LLM chatbots specifically as “stochastic parrots”... and it’s easy to see why – it’s something approaching a thought-terminating cliche, and one with some superficial grounding in the facts. It’s true, after all, that LLMs output text by sampling from a probability distribution (i.e. “stochastic”) and that they have a tendency to fall back on common phrases or words (like parrots). This veneer of grounded apparent intelligence makes it seem like someone who’s deploying the phrase has some kind of nuanced understanding of the underlying issue, and the phrase itself is very compact and easy to re-deploy.
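For anyone who hasn't seen what "sampling from a probability distribution" looks like in practice, here is a minimal sketch of the "stochastic" half of the phrase. Everything in it is an illustrative assumption on my part: the four-word vocabulary, the scores, and the function names are made up, and a real model computes its scores from the whole preceding context over a vocabulary of tens of thousands of tokens.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - highest) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical toy example: candidate continuations of "The sky is ..."
# with made-up scores. No real model produced these numbers.
VOCAB = ["blue", "clear", "falling", "green"]
LOGITS = [3.0, 2.0, 0.5, -1.0]

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so a run is repeatable
    print([sample_next_token(VOCAB, LOGITS, rng=rng) for _ in range(5)])
```

Lowering `temperature` toward zero concentrates nearly all the probability on the top-scoring token, which is why the same prompt can yield varied or nearly deterministic output depending on a setting that has nothing to do with what the model "knows".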
The term itself originates in a paper from Natural Language Processing, On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? by Bender et al. Unfortunately, the term has long since outrun the narrow intent with which it was originally coined, which was as an evocative metaphor for the at-the-time pure language models and their correlation (or lack thereof, as the paper was arguing) towards Natural Language Understanding (as contrasted with the narrower field/goal of “processing”). This isn’t supposition on my part; by some coincidence, Emily Bender, the first author on that paper, wrote a Medium… article? post?... on this exact topic about a month ago in which she details the many wrong ways in which that phrase has mutated in popular consciousness over time as the LLMs she was criticizing at the time have become mainstays of discourse and technology today.
That paper is well worth reading, but it’s worth reading specifically as a sort of snapshot of technological progress. That paper and the paper on which it builds (Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data by Bender and Koller) are substantively accurate descriptions of LLMs and chatbots as they were first introduced to the public in 2022. To quote that earlier Medium piece:
Stochastic parrots was coined to refer to language models, i.e. systems trained only on linguistic form used to mimic the kinds of sequences of linguistic form that people use.
She goes on to explicitly acknowledge that modern models (explicitly naming multimodal image/text models) are arguably capable of “understanding” in a way that the older models that were trained on only sequences of text weren’t, referring again to “understanding” from the even earlier Bender/Koller paper. Moreover, modern “AI” systems are not just rough chat interfaces to an LLM with a system prompt; they consist of the LLM, yes, but also the harness/scaffold (i.e. context retrieval and conversation compaction), any tools/connections that the system has integrated, and the engine that loads the model weights and runs inference.
In that vein, it’s worth explicitly drawing the line between the mechanics of how a system works and the overall phenomenon produced by that system.
What do I mean by that? As an example, the following would be a roughly accurate summary of how “vision” works for humans:
A human’s experience of perceiving an object is actually just patterns of neurons lighting up as an electro-chemical response to electrical impulses generated by the rods and cones on the retina in response to the stimulus of the projection and focusing of rays of light that have bounced off of the object and entered the eyeball via the pupil.
None of that is false, and that information is what, say, an ophthalmologist or neurosurgeon might actually care about when trying to figure out why you’ve gone blind in one eye. However, if someone then tried to argue that you don’t actually “see” things in any real, meaningful sense because the patterns of neuron activity in your brain are just arbitrary learned responses to the electrical signals sent by photoreceptors without any objective correlation to the object you’re “seeing”, you’d look at them like they were speaking in tongues. In the same way, experts in NLP like Bender might care about the mechanics of how LLMs “extrude text” (to quote Bender’s more recent post) and how that compares to true “understanding”, and might coin terms like “stochastic parrot” as shorthand descriptions for those mechanisms.
That mechanism, though, and its superficial similarity to how the predictive text on your smartphone keyboard works is not an excuse to refer to whatever phenomenon these systems produce as “just statistics”, because by that reasoning, human consciousness is “just chemistry”. That’s not me strawmanning the argument, but rather one of the exact points made by Ted Chiang in his recent article in The Atlantic:
Some years ago, it was briefly popular to play games with your phone’s predictive-text feature [...]. It would be possible to interact with a contemporary LLM this way, and the resulting sentences would be perfectly sensible, but you probably wouldn’t feel like you were talking with someone. Yet that’s essentially what an LLM-based chatbot is
That’s “vision is just neurons”, just dressed in a different metaphor, and it’s flawed for the same reason. Said another way, the sky really is just blue, insofar as it makes sense to refer to “the sky” as one thing. The reductionist would point out that “the sky” is just an amorphous collection of molecules of mostly nitrogen, oxygen, argon, and a smattering of other substances like water vapour and carbon dioxide. However, we call it “the sky” because, in particular contexts, it exhibits characteristics that are easier and more useful to talk about as a unified gestalt. In short, the reductionist is missing the forest for the trees.
To be clear, I do not intend to prove that Claude/Gemini/ChatGPT have subjective experiences just by pointing out the absurdity of this reductionism; my intent is to show that saying “it’s just math” is logically bankrupt as a way of disproving that.
Bringing it back to Star Trek, this is, in so many words, the exact argument that Riker advanced in “Measure of a Man” when cross-examining Data:
It’s a collection of neural nets and heuristic algorithms, its responses dictated by an elaborate software written by a man, its hardware built by a man.
It is also the same argument that Picard goes on to dismiss as irrelevant by actually showing the phenomenon produced by that “collection of neural nets and heuristic algorithms” – the fact that Data keeps the medals he’s proud of, that he holds on to Picard’s gift, etc.
More Substantive Arguments
The reductionist arguments aren’t the entirety of the argument against AI ${BRAIN_THING}, though. They’re the most popular (at least, in my anecdotal experience), and the most obviously flawed (at least, in my opinion), but to pretend they form the universe of opposition would be dishonest.
Another common form is that “Generative AI is incapable of generating ‘new’ things”. This is sometimes described as “memorized regurgitation” or “content recombination”, with the core of the argument being that AI can only recombine what’s present in its training data, and isn’t fundamentally capable of things like “ingenuity” as we understand them. This is in essence what Bender et al were attempting to capture with the label “stochastic parrots”.
Again, in substance, there’s some truth to this, which is why it’s a popular argument. AI models are not only architecture; they require copious amounts of data that are sourced (sometimes (often?) unethically) from a wide variety of inputs in order to produce the phenomena that one side claims to be emergent indications of ${BRAIN_THING} and the other side disregards as little more than a magic trick/sleight-of-hand. This obviously means that, while the architecture of these models matters, their behavior is not purely a function of that architecture. This is, again, an extension or modification of the idea that these models are not capable of “understanding” what they’re being trained on.
The hypothetical “octopus test” posed by Bender and Koller in the earlier-linked Climbing Towards NLU is possibly the most cogent version of this, and almost certainly the most well-known rigorous framing. In it, two humans (A and B) are trapped on otherwise uninhabited islands that are far enough apart that travel isn’t feasible, but which have telegraphs connected by underwater cable. A and B use this means of communication to converse with one another and talk about their respective situations, days, pasts, etc. Unbeknownst to them, however, a hyper-intelligent octopus (O) has tapped into the cable and is able to observe the conversation. O knows nothing about the language, but (being hyper-intelligent) is able to “predict with great accuracy how B will respond to each of A’s utterances”.
Bender and Koller go on to consider what would happen if O were to perform a MitM attack and replace B as A’s interlocutor. The obvious answer, of course, would be that B would assume something had gone horribly wrong with either the cable or A, and would be quite lonely on their own deserted island, but that’s not the consequence that interests us. Bender and Koller pose a hypothetical wherein A builds a “coconut catapult” (hey, they’re stuck and bored) and sends B the instructions. Because “coconut” and “rope” show up in contexts similar to those of “mangos” and “nails” (which have appeared before in conversation between A and B), O is able to plausibly reply with something to the effect of “cool idea, great job!” because (quoting from the paper) “B said that a lot when A talked about ropes and nails”. Bender and Koller note that:
It is absolutely conceivable that A accepts this reply as meaningful – but only because A does all the work in attributing meaning to O’s response. It is not because O understood the meaning of A’s instructions or even his own reply.
I think most people would read that conclusion and broadly agree. After all, what does an octopus know about coconuts, catapults, or ballistic physics? A is just reading too much into the response received.
The original paper/situation is well worth reading for the specifics, but I think that summary serves the purpose here, in that Bender and Koller use this as proof of a listener’s active role in communication (A is actively reading meaning into whatever O sends), and that this active role can hide a lack of communicative intent on the part of the entity sending the message (essentially a form of pareidolia).
This, however, poses “understanding” as a simple binary – either Mr. Hyper-Intelligent Octopus understands what A is talking about, or doesn’t, and a lack of understanding necessarily means a lack of communicative intent in whatever message is sent as a reply.
If, however, we replace O with a young child C, it is entirely feasible that the child doesn’t at all understand much about ballistic physics and trajectories and rope tension, but still grasps that what A is talking about is a hard problem that took significant effort and intelligence to solve. It becomes obvious that there’s a lot of work being done in the collapse of “understanding” to binary in order to make the point. C could absolutely respond with the same “cool idea, great job!” without any nuanced understanding of Newton’s Laws of Motion, and we’d likely say that the child “understands” something and that there was some communicative intent behind the response, no? Bender and Koller say:
When O sent signals to A pretending to be B, he exploited statistical regularities in the form [...] Whatever O learned is a reflection of A and B’s communicative intents and the meaning relation. But reproducing this distribution is not sufficient for meaningful communication.
This statement is a denial of the notion that “at a certain point, a difference in magnitude is a difference in kind”. Perhaps that’s deliberate, but if so, it’s not the kind of thing that can be simply implicitly assumed; it’s legitimately an open question as regards understanding, and emergent properties are a known phenomenon resulting from scale in other systems (though contested in LLMs). In exploiting those statistical regularities, O clearly learned something about the underlying communication that’s happening between A and B. How much of that learning is necessary before we can say that O understands something about the things being discussed, in the same way that the child C understands something?
The simple counter-argument would be that the child C has some grounding, which is what enables C to determine that the things A is talking about are difficult or complicated. Perhaps this is true, but:
Bender, in her more recent post, notes that modern models, between multimodality (i.e. the ability to process images) and RLHF, may have exactly the kind of grounding that the LLMs of 2020/2021 didn’t, and as such may have some understanding;
Even in 2020/2021, Bender et al concede that models trained on e.g. code bases would encounter things like Python doctests, where some degree of grounding is present in the training data itself. The original paper argues that models must be given the ability to make the connection between unit tests and the corresponding code, but the same kind of statistical exploitation could be used here – unit tests often encode the function and behavior being tested in their identifiers; and
Going beyond the pure models, as noted earlier, modern GenAI is not merely a chat interface to a model, but rather more elaborate systems for memory, context retrieval, tool use, etc. Even if the model itself doesn’t have grounding, the entire system may well do so (for the familiar, this is the “Systems Reply” to Searle’s “Chinese Room” argument; there are counter-replies, but I find it quite convincing).
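To make the doctest point above concrete, here is a small sketch of the kind of text a model trained on code would encounter. It is my own illustration rather than an example taken from Bender and Koller, and the function and test names are hypothetical. The docstring pairs a call with the value it evaluates to, and the unit test's identifier states the behavior being checked in near-plain English.

```python
import doctest


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    The doctest lines below pair an expression with its result,
    so the form (the call) sits right next to its outcome.

    >>> celsius_to_fahrenheit(0)
    32.0
    >>> celsius_to_fahrenheit(100)
    212.0
    """
    return celsius * 9 / 5 + 32


def test_celsius_to_fahrenheit_freezing_point_is_32():
    # The test name alone describes the expected behavior.
    assert celsius_to_fahrenheit(0) == 32.0


if __name__ == "__main__":
    # Run the docstring examples, then the explicitly named test.
    print(doctest.testmod())
    test_celsius_to_fahrenheit_freezing_point_is_32()
```

Whether a statistical learner actually links the expression to its result is the open question; the sketch only shows that the link is present in the text itself.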
Let’s say we ignore that problem, though. Let’s say we grant that LLMs and diffusion models and other Generative AI do “understand” something about the material on which they’re trained, but we still say that “it only reproduces that which is in the training data in random orders”.
Again, there’s merit to this argument. What with the tendency of these models to engage in blatant plagiarism of both visual and textual inputs with minimal prompting, it’s easy to see why this argument carries a lot of weight. Combine that with the fact that, for example, LLMs are often incapable of telling you where they draw their information from unless they retrieved that information in that specific “turn” of the conversation, and it’s easy to see how this might seem like a disqualifying reason.
The problem, though, is that humans aren’t exactly immune to it either. There are those who take issue with this characterization (Molly White did so a couple years ago in her piece on the matter; another piece well worth reading, even if I disagree with this specific point), but it’s a matter of fact, not opinion.
Famously, Helen Keller was accused of plagiarism for her short story about Jack Frost because she (by all indications) inadvertently and unknowingly reproduced in large substance a story that had been read to her through fingerspelling years prior. It’s one of the most famous historical instances of cryptomnesia, but hardly the only one, and is a common enough occurrence even in daily life. Heck, Bender herself notes (in that Medium thing) that, while she wasn’t able to find hits for “stochastic parrot” online when she thought she coined the term, further investigation turned up “randomized parrots” in email correspondence from a colleague about the earlier “Climbing Towards NLU” paper. Said colleague was contacted about potentially receiving a footnote for the term, and declined.
Even this declining of credit and awareness of the stochastic nature of “human creativity” is well-known; when Mark Twain learned of Keller’s experience with plagiarism accusations (an incident that led her to never again write fiction, as an aside), his response was replete with indignant sympathy:
As if there was much of anything in any human utterance, oral or written, except plagiarism! The kernel, the soul – let us go farther and say the substance, the bulk, the actual and valuable material of all human utterances is plagiarism. For substantially all ideas are second hand, consciously or unconsciously drawn from a million outside sources [...] In 1866 I read Dr. Holmes’s poems, in the Sandwich Islands. A year and a half later I stole his dedication, without knowing it, and used it to dedicate my “Innocents Abroad” with. Ten years afterward I was talking with Dr. Holmes about it. He was not an ignorant ass – no, not he; he was not a collection of decayed human turnips, like your “Plagiarism Court,” and so when I said, “I know now where I stole it, but who did you steal it from,” he said, “I don’t remember; I only know I stole it from somebody, because I have never originated anything altogether myself, nor met anyone who had!”
Now, for what it’s worth, the American legal system, at least, does not recognize any distinction between deliberate plagiarism and cryptomnesia-induced plagiarism (much to the chagrin of George Harrison); plagiarism is plagiarism. In principle, I think that’s fair from the perspective of legal damages and profit disgorgement.
However, I don’t consider it valid as justification for claiming that because generative AI is prone to this kind of issue, it is therefore “not capable of creativity” (or, more generally, not ${BRAIN_THING}). Many creatives (like Mark Twain above) are open in their acknowledgement that their ideas come from others, who themselves got them from others, in a long chain of attribution that has long since lost the links to any origin, and yet I doubt anyone is willing to call Mark Twain an unoriginal, uncreative hack. Given that humans are absolutely capable of forgetting not only where they encountered an idea (source amnesia), but even that they encountered it before at all and that it isn’t original, and yet we don’t dismiss all human ${BRAIN_THING} as a category, why should we do so with Generative AI?
LLM “hallucination” is similar: in a vacuum, it seems disqualifying (even though it would seem to be the exact opposite of the “they don’t create new things, just randomly recombine” argument above, but details), until you recall that things like witness testimony are often malleable and unreliable. Confabulation was observed in humans long before we observed LLMs making up stories as to why they made a particular mistake.
The argument that GenAI is too prone to hallucination, and that this is a symptom of some underlying functional or architectural reason to discount it, is fair, but without identifying what one believes that underlying cause is or specifying some threshold for flipping from “acceptable rate of occurrence” to “not”, it doesn’t strike me as sufficient on its own.
It’s also fair to raise the counterargument here that human creativity is not just recombination of ingested material, but also transformation of that material through one’s own lived personal experience. Mark Twain’s work is not merely a particularly resonant recombination of the things he’d read prior, but also morphed by the viewpoints and biases he brought from his own life experiences and environment. That’s absolutely true, and I don’t intend to diminish that; it forms a core part of the “honest uncertainty” that I’ll be addressing later. However, that’s also a narrower objection than “LLMs don’t understand” or “GenAI just recombines training data”, and crucially, it’s an objection to what GenAI is capable of today, in its current architecture.
Even the Strongest Chain…
This kind of “motivated reasoning” or inconsistent standards is present even in some of the otherwise strongest arguments I’ve read regarding AI ${BRAIN_THING}, albeit usually in a less direct, more fundamental way.
Take, for example, the earlier referenced article from Ted Chiang. Chiang analogizes the creation of conscious artificial intelligence to the visiting of Alpha Centauri:
If tomorrow someone showed me a video of an astronaut in a spaceship orbiting Alpha Centauri, a star that’s 4.3 light-years from Earth, what would I have to see in that video to convince me that it was real? My answer to that is, there is nothing in the video itself that would convince me. [...] I won’t pay attention to any video of an astronaut orbiting Alpha Centauri unless I have previously seen good evidence that astronauts have landed on Mars [...] Before anyone can credibly claim that they’ve solved an extraordinarily difficult engineering problem, I need to be confident that they have previously solved the many much simpler problems that precede the difficult problem.
Thus far, no major problems. Chiang doesn’t qualify what would count as “good evidence that astronauts have landed on Mars” (presumably he does not believe the Moon Landing or Curiosity Rover were faked, so if someone with a NASA badge brought video of an astronaut walking on Mars and a red rock, would he believe that?), but given that the core question isn’t “what qualifies as proof we’ve reached Alpha Centauri”, that’s forgivable. When turning to the problem of AI consciousness, Chiang says:
Let me outline one potential sequence of steps. The first requirement is that the computer program has a body (either physical or virtual) and sense organs [...] Then I’d want to see an embodied agent that could navigate its environment in order to survive as well as, say, a lizard can [...] I would want to see people successfully teaching such embodied agents how to communicate their desires, perhaps by using a button board or some other nonlinguistic modality, the way that people have taught chimpanzees and domesticated dogs.
It all seems very logical; how could we be certain that AI is communicating in a manner similar to humans if we’ve never seen analogous behaviour from machine intelligence for (ostensibly) simpler life forms?
Yes, it seems logical, but it assumes that those earlier steps are necessary steps towards developing ${BRAIN_THING}. It is akin to saying, “I won’t believe you travelled from Los Angeles to New York unless I see pictures of you at the Grand Canyon, Houston, and Washington D.C. first!” Maybe that held water until 1923, but then we had the first nonstop transcontinental flight, and by 1946 we were able to do it in 7-8 hours as a commercial endeavor. A leap in the capabilities afforded by technological progress relegated it to being a quaint demand of a bygone era.
For reference, the Wright Brothers are considered to have first achieved powered flight in an aircraft in December 1903, and that was for all of 12 seconds and a total of 120 feet traversed. We went from “10 ft per second for short bursts” to “crossed the entire North American continent without stopping” in 20 years. That’s a period of time in which a baby born at the start would, at the end, still not be allowed to drink alcohol (at least, in the present day; back in the 20s, nobody was allowed to drink alcohol in the US).
Now, I don’t want to come across as unfairly characterizing Chiang, and he’s fully aware that the route he describes isn’t the only route possible:
Obviously, I’m describing a process that mimics the path terrestrial evolution took; is this the only possible route to conscious computer programs that use language? Maybe not, but any proposed alternative would need a truly enormous amount of supporting evidence for it to deserve serious consideration.
The fact that he’s aware of this is good, but he goes on to say:
It’s not plausible to me that a development path where the first step is a sentence-continuation machine that emits bad Julius Caesar dialogue and the next step is a sentence-continuation machine that emits decent Julius Caesar dialogue is one with a conscious Julius Caesar—or consciousness of any sort—as its end point.
Why not? Returning to flight: if cross-continental flight could be proven possible by first flying from Los Angeles to San Diego, then Los Angeles to Las Vegas, then Los Angeles to Dallas, then Los Angeles to Chicago, then why is “the limit as t approaches some value of the gap between a machine’s dialogue imitation capabilities and true human dialogue trends toward 0” not sufficient to say that the function’s output at some sufficiently large input is “true human dialogue” (and all the things that we assume come with it, i.e. intent, understanding, subjective experience, etc.)? Why does consciousness necessarily require a human body, and emotions, and all the other things? Why must we categorically rule out the possibility of machine consciousness being so different from terrestrial organic consciousness that its developmental path might look completely alien to the one-and-only-one confirmed sample that we have?
Chiang later notes that, if asked about dealing with the loss of a dog, chatbots shouldn’t respond with “As an AI, I do not have direct personal experiences, but I do understand”, because they don’t actually understand; unlike humans, they’ve never lost a dog, and even though a search engine can turn up online examples of it, you wouldn’t say that a search engine “understands” what it’s like to lose a dog.
OK. I’ve also never lost a dog (or, for that matter, any pet of any kind). Does that mean I’m fundamentally incapable of honestly saying “I understand how you feel” when a friend comes up to me and says their dog recently died?
Maybe the answer to that question is yes. Maybe Chiang would call me a liar for having said that to my friend when they informed me about their dead pet. But my suspicion is that he and others making this kind of argument would contort themselves into knots to argue that I understand enough about the experience of loss so as to be able to honestly say “I understand how you feel”. Chiang might argue that I have lost family members and friends and so I can analogize, but the relationship I have to family members and friends is in many ways fundamentally different from the relationship that people have with pets (or so I’ve intuited). Maybe that gap is immaterial, though… which means (to paraphrase the CBS show Elementary) that some degree of imprecise understanding is acceptable, the rest is just negotiating boundaries. And, as shown earlier in the thought experiment regarding the octopus vs the child, the boundary between “enough understanding” and “not enough understanding” is very far from being any kind of clear delineation.
That’s not a magic bullet, but I believe it exposes the assumption underlying “AI are fundamentally incapable of subjective experience”. Chiang very specifically and repeatedly argues that embodiment is necessary, but doesn’t really justify it.
Or consider Anil Seth’s article in Noema, “The Mythology of Conscious AI” (which Chiang actually refers to). As with everything else I’ve referred to here, it’s a great, thoughtful article worth reading and understanding independently, but for the sake of this piece, I’ll summarize: Dr. Seth posits four different, nominally independent “pillars” of arguments that AI is not, and perhaps never can be, conscious:
Brains are not Computers (i.e. “Consciousness is not a function of algorithmic computation”, i.e. “computational functionalism is wrong”);
“Other Games In Town” (i.e. Turing computations are discrete and non-stochastic, but biological processes are often both continuous and stochastic);
Life Matters (i.e. Biological Naturalism); and
Simulation Is Not Instantiation
He claims:
Each of these lines of argument can stand up by itself. You might favor the arguments against computational functionalism while remaining unpersuaded about the merits of biological naturalism. Distinguishing between simulation and instantiation doesn’t depend on taking account of our cognitive biases.
However, I’m not sure I agree with that assertion. When talking about brains not being computers, for instance, he says
Unlike computers, even computers running neural network algorithms, brains are the kinds of things for which it is difficult, and likely impossible, to separate what they do from what they are. [...] Evidence that the materiality of the brain matters for its function is evidence against the idea that digital computation is all that counts, which in turn is evidence against computational functionalism.
When discussing “Life Matters”, Dr. Seth first explicitly notes that he has no concrete argument for the position, before positing that:
[I]t is worth taking seriously, if only for the simple reason mentioned earlier: every candidate for consciousness that most people currently agree on as actually being conscious is also alive.
Both of these assume that “life matters”, in some way or form:
I don’t think it would be wrong to state that “most people believe life matters”, at least to some degree, which obviously influences the fact that “every candidate that most people agree upon for consciousness is currently alive”; and
The materiality of the brain mattering for function is only evidence against computational functionalism if one assumes that the human brain and consciousness are substantially necessary/similar to all possible forms of consciousness.
When talking about the difference between simulation and instantiation, Dr. Seth notes
A computational simulation of the brain [...] will only give rise to consciousness if consciousness is a matter of computation. In other words, the prospect of instantiating consciousness through some kind of whole-brain emulation, at some arbitrarily high level of detail, already assumes that computational functionalism is true. But as I have argued, this assumption is likely wrong and certainly should not be accepted axiomatically.
I’m going to be honest – I don’t know what Dr. Seth thinks the word “independent” means, but generally I don’t consider two arguments to be independent if one argument explicitly requires accepting at least the plausibility of the other in order to even have a chance at being true. In other words, the entirety of the “simulation is not instantiation” argument is dependent at least in part on the premise that “computational functionalism is wrong”, in the exact same way that (as Dr. Seth notes in that excerpt) the inverse of the former is dependent on the inverse of the latter. They are, in essence, the same argument, wearing different clothes.
So what we have in the end are two arguments, where one argument is actually three sub-arguments wearing a trench coat, and the other argument is a very fair point about computation not being restricted to algorithmic/Turing/substrate-independent computation.
I’m speaking jokingly of the “life matters” argument there for the sake of levity, but I want to be clear: it’s entirely possible that life (for some given definition of it) does matter. I certainly have no reason to think it doesn’t. However, as Dr. Seth notes, there’s also no strong argument to say that it does. Dr. Seth points to the fact that the majority of humans only consider living things as candidates for having consciousness, but that implies that there is some wide number of non-living things that were considered as candidates and discarded, when the reality is that “non-living things might be conscious” is a relatively new consideration outside of fiction. Given that limited sample, I don’t think “all the existing candidates are living” really qualifies as strong justification, and as that’s the strongest cited reason for “life matters”, that pillar deserves more scrutiny.
There’s another point worth tackling – both Chiang and Dr. Seth caution about the fact that LLM-powered chatbots are, in many ways, designed to seem human. Even when Anthropic directs Claude in the constitution not to evangelize its own consciousness, they don’t explicitly tell it to avoid using first person pronouns, and the interface itself is one that invites the assumption of an active “entity” on the other side, responding to our quips and queries. Both authors go on to warn that this design decision, made to increase engagement and usage (and therefore subscribers), invites people to make the easy next inference of “consciousness”.
However, there are two points to note here:
While the risk, especially in serious cases of “AI psychosis”, is worth taking seriously, early studies suggest that it hasn’t exactly been successful on a population level; and
Say we grant the premise completely: these systems are built to trip our person-detectors, so “it seems conscious to me” is compromised as evidence. OK, fine, but throwing out a contaminated piece of evidence for AI ${BRAIN_THING} leaves you at “I don’t know”, not at “therefore it isn’t”. The engineering muddies the signal; it doesn’t hand you a counter-signal. And “designed to seem like a person” and “designed to not be a person” are not the same claim; we have no instrument that can distinguish those two. The fact, then, that these things are built to seem human is a reason to trust our intuitions less, in either direction, rather than license to settle the question in the negative.
(A narrower point, while we’re here: whatever objective measure we eventually land on shouldn’t care about provenance. We don’t grade carcinogens on whether they meant to cause cancer, just on whether they do. By the same token, a thing isn’t a worse candidate for ${BRAIN_THING} for having been built to look like one. This, of course, presumes we have some test in hand, which as I noted is the whole problem.)
The Stakes and Honest Uncertainty
The Stakes
Both Dr. Seth and Mr. Chiang go on in their respective pieces to discuss the real dangers and problems with treating AI as moral patients/subjects with consciousness. Most of what they say is not at all wrong, extremely well-considered, and (again) worth reading. However, two passages that are in immediately adjacent paragraphs from Dr. Seth’s piece stood out to me:
The importance of taking an informed ethical position despite all these uncertainties spotlights another human habit: our unfortunate track record of withholding moral status from those that deserve it, including from many non-human animals, and sometimes other humans. It is reasonable to wonder whether withholding attributions of consciousness to AI may leave us once again on the wrong side of history. [...]
But there are good reasons why the situation with AI is likely to be different. Our psychological biases are more likely to lead to false positives than false negatives.
I don’t know about you, but I am not at all confident in saying that “the sum of humanity’s psychological biases is more likely to grant false positives of moral status than false negatives” considering the history of things like chattel slavery and factory farming. Even among humans, in the present day, we deny children many rights and privileges on the arbitrary basis of “you must be X years old”, and while things broadly seem to work out, there are entirely too many examples where that kind of partial moral status and agency has led to extremely poor outcomes even today, and once that moral status and agency is established as being denied, it can be incredibly difficult to undo that.
Those are the real stakes. As Chiang notes:
The abolition of chattel slavery involved enormous societal upheaval, and eliminating cruelty to animals will require rebuilding our entire food industry.
Given those stakes, there are two paths forward:
We decide that, fundamentally, artificial intelligence is not capable of being conscious. We make that decision now, based on the current architectures and capabilities, and we embody it in some kind of legal precedent, and it goes unchallenged for far too long, becoming “common sense” in the same way that misguided nonsense like “Indigenous people are racially inferior and incapable of even aspiring to the same level of intelligence and civilization as Europeans” or “Babies don’t feel pain” (yes, an actual thing even doctors believed for a time) became ingrained. Notably, for all of Star Trek’s optimism, this is the path that human society apparently followed there, since the trial in “Measure of A Man” is explicitly predicated on overturning precedent:
> Based on the Acts of Cumberland passed in the early 21st century, Data is the property of Starfleet. He cannot resign and he cannot refuse to cooperate with Commander Maddox.
(emphasis mine)
It is also a path down which we have already started, in more ways than one – American courts have previously denied animals copyright, at least in part on the basis that existing law does not explicitly grant animals those rights (and that is currently binding precedent in at least the 9th Circuit, which includes California, where many of these AI development labs are headquartered). Moreover, that case was cited by the US Copyright Office in their guidance regarding copyright for works generated in part or whole by AI. Notably again, this is the exact language that is used in “Author, Author” as the basis for denying The Doctor his moral rights as an author:
> [T]here’s a flaw in your logic. As you point out, the law says that the creator of an artistic work must be a “person”. Your EMH doesn’t meet that criteria.
(again, emphasis mine)
We decide that, even if artificial intelligence is not currently conscious, we are close enough to that point that we should:
Codify non-anthropocentric standards for being deemed “conscious enough”; and
Start seriously testing for and applying those standards, both in animals and in machine intelligence.
The former alternative likely leads to societies based on the treatment of artificial intelligence (and non-human natural intelligence) as “less than”, and we are eventually required to make the same “enormous societal upheaval” as was required for chattel slavery and will likely be required again to address animal cruelty. Or, I suppose, even worse – we set up societies so fundamentally dependent on the exploitation of these machine intelligences that we are existentially incentivized to deny them rights, and therefore refuse to ever even seriously consider the question again. For humans, that might be a perfectly workable society, but for the non-human conscious intelligences… well.
This is also not a new position; to quote Seth:
As Immanuel Kant argued long ago in his lectures on ethics, treating conscious-seeming things as if they lack consciousness is a psychologically unhealthy place to be.
It’s also worth calling attention to the fact that some of the actors who are currently most vocal about the possibility of AI ${BRAIN_THING} are… not exactly unbiased, as Chiang notes in his piece:
Anthropic would have us believe that it is inventing a new category of being whose needs for protection require essentially no divergence from how a software company would treat an ordinary chatbot that lacks conscious experience. That’s so convenient that it’s simply not plausible. [...] [I]f you think there is any chance that what you’re building might become a moral patient, you should think about what protections it deserves before you deploy it as your company’s economic engine, not after. Slave owners were not the ones to ask about the humanity of enslaved people, and factory-farm owners are not the ones to ask about the rights of animals. If we imagine Claude to be conscious, Anthropic could not possibly be entrusted with evaluating its moral status; the company has too much invested to be objective.
This isn’t about whether Anthropic (or OpenAI, or DeepMind, DeepSeek, etc.) is “an ethical lab”, in the same way that objecting to a judge presiding over a trial in which one party is a former college roommate has nothing to do with whether the judge can set that history aside. That judge might raise perfectly reasonable questions in favor of or against that former roommate, but the question regarding motive makes the entire process suspect.
The Honest Uncertainty
Given that those are the stakes, what might “non-anthropocentric standards” be?
I dunno.
No, really. As I said, I’m a code monkey and data plumber. Why would I have the answer to this?
However, if we’re looking for a start towards the general shape of a solution, I have a couple ideas.
Noah Smith recently (well, recently when I started doing the research and outline for this; significantly less recently now) wrote a piece about ${BRAIN_THING} where he posits one path forward as looking for what’s called “the neural correlates of consciousness”. In it, he both explicitly acknowledges that non-human-like ${BRAIN_THING} is possible while offering this as a path forward for helping to identify what might be human-like ${BRAIN_THING}. I think there’s a lot of merit to this, especially since, while the current framing is anthropocentric, it’s not difficult to see how it might be expanded to primates broadly and even other animals. That said, it seems to me to be:
A sufficient but not necessary condition (i.e. it shouldn’t be required for proving ${BRAIN_THING}, just a shortcut in the same way that “being human” is a shortcut for it now); and
A very hard and not at all “proven possible” problem.
But those nitpicks aside, it is the shape of a solution, and it’s one that is largely in alignment with the opinions of those like Chiang and Seth (insofar as they’re willing to accept any work in this area) who (seem to) believe that life matters for ${BRAIN_THING}.
But that’s the easy answer. “If it walks like a duck and quacks like a duck” is great for deciding what qualifies as a duck, but we’re trying to figure out what might qualify as broadly a bird. Ducks are just one kind of bird.
A better answer might look something like Erik Hoel’s “Disproof of LLM Consciousness”. As he notes in his Substack post about the paper, there’s no substitute for reading the paper itself, and his Substack post tries to summarize it in (what I found to be) a fairly accessible way. However, even more simply, Hoel argues that due to their static natures, LLMs can be logically substituted for (arbitrarily large) lookup tables. Therefore, any consistent theory that labels LLMs as conscious must also label the lookup table-equivalent as conscious. No existing serious theory of consciousness grants any lookup table, regardless of size, the status of consciousness.
There are criticisms that can be made and nits to be had, but broadly, I think the framework makes sense, especially because it naturally points to continual learning as a way around the problem. A lookup table might capture how a system responds at a given point in time, but if the system learns based on experience, you need a new lookup table to reflect the effect of that learning. Current LLMs, by contrast, have static weights; you can fine-tune them via LoRA or re-train them on different datasets, but when you do, they’re not generally considered “the same model”. As both Hoel and Chiang note, LLM chatbots don’t actually have “memory” of a conversation; instead, the entirety of the conversation must be fed back into the system to generate the next token. By contrast, humans (seem to) continually learn and update their knowledge. This continual learning point is where the “lived experience” objection for creativity boomerangs back – the order in which experiences are had and other works ingested matters for humans because we constantly learn. LLMs and diffusion models, by contrast, are updated in discrete “batches” (if they’re updated at all), not continually as conversations happen; there’s no plasticity, so there’s no lived experience.
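For the fellow code monkeys, here is a minimal Python sketch of the lookup-table intuition. To be clear, this is a toy under heavy assumptions, not Hoel’s formal construction: the names `frozen_model`, `LearningModel`, `VOCAB`, and `MAX_LEN` are all hypothetical, the “models” are trivial deterministic rules, and real LLMs have vastly larger input spaces and typically sample their outputs. The only thing it tries to show is the shape of the argument: a system whose behaviour is fixed can be swapped for a table of its answers, while a system that updates from experience makes any such table go stale.

```python
from itertools import product
from typing import Callable, Dict, Tuple

Prompt = Tuple[str, ...]

VOCAB = ("a", "b")  # hypothetical two-token vocabulary
MAX_LEN = 3         # hypothetical cap on prompt length


def frozen_model(prompt: Prompt) -> str:
    """Toy static system: the reply depends only on the prompt."""
    return "a" if prompt.count("b") % 2 == 0 else "b"


def build_lookup_table(respond: Callable[[Prompt], str]) -> Dict[Prompt, str]:
    """Record the reply to every possible prompt of up to MAX_LEN tokens."""
    return {
        prompt: respond(prompt)
        for length in range(MAX_LEN + 1)
        for prompt in product(VOCAB, repeat=length)
    }


class LearningModel:
    """Toy plastic system: every learning step changes future replies."""

    def __init__(self) -> None:
        self.updates = 0

    def respond(self, prompt: Prompt) -> str:
        return "a" if (prompt.count("b") + self.updates) % 2 == 0 else "b"

    def learn(self, prompt: Prompt) -> None:
        # Stand-in for a weight update; the prompt's content is ignored here.
        self.updates += 1


# The frozen system and its table are interchangeable on every input.
table = build_lookup_table(frozen_model)
static_matches = all(table[p] == frozen_model(p) for p in table)

# A snapshot of the learning system stops matching after one update.
learner = LearningModel()
snapshot = build_lookup_table(learner.respond)
learner.learn(("a", "b"))
snapshot_is_stale = any(snapshot[p] != learner.respond(p) for p in snapshot)

print(len(table), static_matches, snapshot_is_stale)  # prints: 15 True True
```

The table for the frozen system only ever needs to be built once, which is the substitution the argument leans on. For the learning system you would need a fresh table after every `learn` call, which is the sense in which a single table fails to capture it.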
One tension worth noting – Hoel explicitly notes that if continual learning is the only property that’s relevant for consciousness, it might be functionally impossible to nail down a static/testable “neural correlate of consciousness”, since that whole property is, functionally, “lack of consistency”. So… there’s that.
I also want to note one answer that is almost certainly not the right answer, and that is the famous “I know it when I see it” standard. Leaving aside the question of whether it is/was the right standard for answering the question for which it was originally coined (a topic on which I do not have the bona fides to claim any kind of expertise), the problem with that standard as relating to this topic is the same problem we’ve been running into over and over again, i.e. affinity bias.
Non-Conclusion
In researching this piece, I came across a blog by the name of Trek vs Trek that made a post in 2019 comparing and contrasting “Author, Author” and “Measure of a Man”. There are a few passages that are absolute gems, with arguably the most notable being:
But in Voyager, the Emergency Medical Hologram has already been mass-produced by the time anyone stops to wonder whether he’s sentient, which seems closer to the way AI might develop in our own decidedly non-utopian real world – not as a passion project in one person’s private lab, but as an improvement to Siri on the latest iPhone, or a new algorithm YouTube can use to drum up clicks and advertising dollars.
However, more than any individual passage, what stood out to me (as a card-carrying Voyager fanboy) was that the framing of the piece seemed like it had already made a judgement going into it – “Author, Author” is a worse “Measure of a Man”. Even as it gives credit to “Author, Author” for showing how systemic bias works and causes harm in ways that “Measure of a Man” doesn’t engage with, it criticizes the episode as having “the recurring Voyager problem of starting an episode on one story, realizing it can’t be stretched to a full 45 minutes, and then switching, blatantly and abruptly to another story, mid-episode”.
While I agree with (and previously noted) that tendency in some Voyager episodes, I’m not sure that it deserves a place here; from “the light-hearted holodeck shenanigans” to the “very serious questions of legal personhood”, the entirety of this plotline is singularly focused. It seems, in some ways, that the author behind this blog wanted to judge “Author, Author” on the same episodic standard as “Measure of a Man”. On that standard, it may indeed come up short as standalone drama.
However, when one notes that “Author, Author” is the final entry in the series-long character arc for The Doctor in proving that he deserves the same rights and equal treatment as the rest of the Voyager crew, the “light-hearted holodeck shenanigans” from earlier are recontextualized as (comedically exaggerated) reminders of the entirety of that arc across the show, and the legal personhood question becomes the final boss and capstone for the whole thing.
In other words, it seemed to me like this post was, in microcosm, the exact kind of motivated reasoning that was visible in the AI ${BRAIN_THING} argument writ large – judging one thing based on standards set by an earlier, similar-seeming but markedly different thing.
The legal proceedings in “Author, Author”, beyond accidentally (?) predicting the exact language and distinctions that would be relevant for AI copyright disputes today, have another thing worth noting – the conclusion, or perhaps more accurately, lack thereof:
The Doctor exhibits many of the traits we associate with a person: intelligence, creativity, ambition, even fallibility. But are these traits real, or is the Doctor merely programmed to simulate them? To be honest, I don’t know. Eventually we will have to decide, because the issue of holographic rights isn’t going to go away. But at this time, I am not prepared to rule that the Doctor is a person under the law. However, it is obvious he is no ordinary hologram, and while I can’t say with certainty that he is a person, I am willing to extend the legal definition of artist to include the Doctor.
As Trek vs Trek notes, this is a marked contrast to “Measure of a Man”; there, while the arbitrator didn’t make the decision that Data had a soul (in part because she was not sure if she herself had one), she did explicitly overturn the relevant precedent that Data should be treated as property. In contrast, “Author, Author” makes no decision as regards the actual legal question of holographic personhood, instead pulling a SCOTUS-style escape hatch and ruling on the much more narrow question of whether The Doctor, and The Doctor specifically, counts as an “artist”.
For the conclusion of an episode in a fictional show about a spaceship on the far side of the galaxy, maybe that’s fine (though the fanboy in me is compelled to point out that the episode itself undercuts that by closing on the many other EMH Mark 1s that are consigned to manual labor, so it’s very much not fine there either). However, as many people on the “AI isn’t conscious” side of the argument are fond of pointing out, “fiction isn’t reality”, and deferring a ruling is not an option for us today because we are already making the precedents.

