Last April, I was hiding from the world in a whitewashed village among the low mountains north of Valencia, Spain, when I got a call from a lawyer friend of mine back in Vancouver. Reidar Mogerman helped pioneer class-action law in Canada, and he had a proposal for me. He told me how American authors and lawyers had launched a wave of lawsuits against some of the world’s biggest, richest tech companies, alleging that they’d used copyrighted books, without permission or compensation, to develop artificial intelligence.

Reidar saw potential for a similar case on behalf of Canadian writers. He wanted to know if I’d be the representative plaintiff—the person whose name would stand for every wronged writer in the suit. I was skeptical. I’m an author and journalist, but when I read news reports about copyrighted work being used to train AI, I never assumed my writing was included. Surely they couldn’t have taken from everyone.

I asked how he could be sure that my books had helped develop AI models. Because, he said, his colleagues checked. Of the four copyrighted non-fiction books I’ve authored or co-authored, at least three—The 100-Mile Diet, The Once and Future World and The Day the World Stops Shopping—appeared in datasets known to have been used to train some of the world’s biggest large language AI models. These systems analyze the material they’re fed and discern patterns and associations so intricately that they can predict appropriate responses to an incredible array of human inquiries. The result is generative artificial intelligence: AI products that can speak human, such as ChatGPT. The datasets they feed on are huge digital repositories of human expression, containing literature, scientific papers, social media posts and far more. The law professor Edward Lee, from Santa Clara University, has described big tech’s use of these datasets as “eating the world.”

When I learned that my copyrighted work had helped fuel this explosion, I thought of Sex Pistols singer Johnny Rotten’s final words on stage, before his band broke up: “Ever get the feeling you’ve been cheated?” I’d been wronged in ways both personal and universal. I thought about the great care that writers take with others’ intellectual property. If I quote more than a few lines from someone else’s work, I have to seek permission. If I even borrow too heavily from another writer’s ideas, I commit plagiarism. Yet the tech companies consumed copyrighted works with such apparent gusto that Wired magazine described it as “slurping.”

Because they have eaten so many fruits of the human mind, these models “know” far more than any single person—in this sense, they are superhuman. A typical chatbot can dish dating advice, write an essay on the Richard Wagamese novel Medicine Walk, translate “this sword is too heavy” into Old English, rattle off dozens of recipes that call for large amounts of parsley and so, so much more.

AI adoption is growing even faster than cloud computing or mobile apps did during their booms in the 2010s. In Canada, business use of AI has doubled since last year. And though we are not yet three years into AI’s coming of age, nearly 30 per cent of adults in a U.S. Pew Research survey recently said they interact with it multiple times a day. This figure is probably wrong—AI experts estimate the true figure as being close to 80 per cent.

The tech firms’ approach to copyright suggested to me an unnervingly cavalier attitude, even scorn, toward the human project: our species’ evolving expression of ideas and values. It felt like a quiet colonization of that realm—which is also the world of the writer—by something cold, commodified and transactional.

An important distinction: the lawsuits Reidar proposed weren’t about putting AI on trial. They were aimed at big tech, a sector whose past behaviour leads me and many others to doubt it is the best custodian of the tools it creates. The industry already stands accused of designing games and social media to be addictive; of rewarding online hate, conflict and disinformation to boost user engagement; of invading our private lives to harvest our data; of permitting a tsunami of extreme pornography to distort human sexuality; and of creating a world where we have to remind each other to “touch grass.”

Artificial intelligence is the industry’s most transformative technology yet. Depending on who you ask, it could kill us all, or guide us into a glorious future beyond the death of the sun. It feels like we’re encountering a future once limited to science fiction. The questions it raises are new and important; in the words of no less a personage than Melania Trump, “The robots are here.”

By summer’s end, I had signed on as representative plaintiff in national class-action cases against four companies: Meta, Databricks, Nvidia and Anthropic (which is heavily backed by Amazon). These are the purveyors of large language model products we know by approachable names like Claude and Llama. (Less approachable: Nvidia’s NeMo Megatron, which sounds like a giant robot bad guy unleashed by an evil corporation in a Hollywood film.)

There will likely be more such lawsuits. This September, Anthropic agreed in a U.S. case to pay a total of US$1.5 billion to hundreds of thousands of authors to settle their action against the company—though, as of this writing, the deal still needs to be approved by the courts. In Canada, news publishers have launched a case against OpenAI, the company that created ChatGPT. My own suit against Meta has a parallel class action in Quebec, represented there by Montreal author Taras Grescoe. The next step is to convince the courts to certify our suits as representing every affected writer in the country—which includes everyone from the obscure names to the most famous, including Margaret Atwood, Lawrence Hill and Tanya Tagaq.

Still, the court filings for the cases I’m involved in bear only my name as plaintiff. One morning this spring, I made the barefoot commute from my bedroom to my home office and found the first filing in my inbox: J. B. MacKinnon v. Meta. I laughed a little. If you want to get your blood pumping, let me suggest waking up to see your name on a lawsuit against one of the world’s most powerful corporations. The legal actions quickly made news as a David vs. Goliath battle—but if I am David, so are you. In all likelihood, something of yours has been used to create AI too, even if it’s just a simple online post. In its rush to profit, big tech is exploiting all of us.

Remember 2022? For most of that year, no one had a clue that an artificial-intelligence revolution loomed on the horizon. Instead, the buzzy technology was Mark Zuckerberg’s “metaverse,” a virtual reality the Facebook founder promised would be built at a pace that “put people first.” Companies like his, he said, had learned from past mistakes.

Meanwhile, that November, OpenAI launched ChatGPT for free online. Even its creators were shocked, as it swiftly became the fastest-growing consumer app in history, with awestruck users cluttering social media feeds with screenshots of their conversation with the chatbot. This was the dawn of artificial intelligence as we know it today: a smarter search engine, unparalleled data-cruncher, levelled-up digital assistant, peerless cheat sheet and the best imaginary friend since Mr. Snuffleupagus. It was the starting gun for the AI race.

In the next six months, Meta, then Anthropic, then Databricks rapidly trained and launched their own large language models. In their drive to render the whole of human knowledge into data, tech firms scoured the public-facing internet, including social media, news reports, blogs, sites like Wikipedia, Reddit and Flickr, and business, academic and government web pages. As a Scientific American headline put it, “Your Personal Information Is Probably Being Used to Train Generative AI Models.” Not just your information, either: your effort, your creations, your self-expression. You.

But even this wasn’t enough. What AI needed most to make sense of language was long threads of text, complex but coherent, with proper grammar and spelling. What AI needed was books. These were so important that, in a 2023 report, Meta speculated that its original Llama AI underperformed because it hadn’t consumed enough of them. The tech firms found some books in online libraries, filled with works in the public domain. These included sites like Project Gutenberg, a collection of 75,000 older titles including Moby-Dick, Romeo and Juliet and Pride and Prejudice. New titles were a different matter.

One way that all the companies named in my lawsuits allegedly acquired newer works, including two of mine, was through Books3, a digital “shadow library” comprising nearly 200,000 works of fiction and nonfiction. Books3 was built by Shawn Presser, an AI developer in Missouri, who found the works on Bibliotik, a digital “tracker” of murky origin that gathers pirated e-books off the internet. Presser scraped the data and created Books3, making the books more accessible for use by large language models. He then partnered with another group, called the Eye, to host Books3 on its website. The library debuted in October of 2020.

The Eye is a donor-funded website based in the European Union, which describes its mission as preserving “pieces of digital history.” The group’s logo is one of those occultish all-seeing-eye-in-a-triangle symbols, like the one on U.S. dollar bills. It uses a reverse-onus copyright policy: it posts materials without authors’ permission, but will remove them if you prove you have title.

Several months later, a non-profit tech lab called EleutherAI (which takes its name from the Greek word for “free”) included Books3 in an even larger online slurpee known as the Pile. It contains more than 800 gigabytes of material from such sources as YouTube, the U.S. patent office, PubMed Central (an open-access library of biomedical literature) and FreeLaw (a legal database): countless people’s work, perhaps including yours. Books3 constituted just 12 per cent of the Pile’s content.

EleutherAI even cautioned Books3 users that it was “not authorized to post the data online by the parties that own it.” Those parties, of course, are writers like me. And I am a slow, low-productivity author. Tracy Cooper-Posey, a prolific romance novelist in Edmonton, told me that she found over 100 of her titles in an AI-training database.

A part of me sympathizes with pirates like Presser, who build and post these smorgasbords for the artificially intelligent mind. They often do so without pay, in the idealistic hope of allowing anyone to have a hand in creating our digital future. Presser hopes the tech firms win the cases working their way through U.S. courts. Data needs to be free, he argues, so that grassroots developers are liberated to use copyrighted materials. The power of AI should not, by that way of thinking, be concentrated in the hands of a tech-billionaire oligarchy.

At least on that latter point, I agree with the pirates. I don’t know what the solution is, but I suspect it can be found in the question of permission. I have a sharply different attitude toward those who might use my books in a spirit of common good than those who will use them for corporate advantage and profit. What I know is that this pirated data has so far only helped big tech get bigger, faster, positioning it to dominate the age of AI.

The cases that I’m part of contain two core allegations. First, that the tech firms knowingly used pirated versions of copyrighted books to train AI. Second, that they tried to cover it up. On the face of it, these wrongs appear obvious. My lawyers believe current Canadian copyright law clearly forbids what the firms are alleged to have done: made unauthorized copies of books by downloading them and, by feeding them into large language models, creating copies of copies.

Based on evidence gathered for similar suits in the U.S., the firms also stand accused of removing copyright information from books, and of programming chatbots to conceal whether they were trained on them. I tested this on the latest iteration of Anthropic’s Claude chatbot, asking directly if it was trained on copyrighted work. It gave vague, evasive answers, comparing its memory lapses about copyright to “how a person might not know all the details about the books in the library where they studied.” When I reminded Claude that its memory is much more powerful than some bleary-eyed human student’s, it admitted that it might have been designed not to remember copyright data. Claude said this would avoid “potential copyright and privacy issues.”

Canadian copyright law forbids making copies of protected writing for commercial purposes. One defence AI companies have made is, essentially, that the ends justify the means. If the tech companies abused those rights, they did so to bring the world a groundbreaking technology. You gotta break a few eggs to bake a cake! My counterargument is that sorting out intellectual property rights ahead of time would only have briefly postponed an AI revolution that none of them saw coming anyway—and that the race is between commercial competitors to bring products to market, not to save the world. There’s also a strong case to be made, as Zuckerberg and his team did regarding the metaverse, for slowing the pace of AI advancement while we grapple with its power.

Big tech’s steeliest justification, though, is also the one that speaks loudest to its disdain for the contribution of human beings to their technology. This is simply that copyright was not infringed, because AI is less like a fancy photocopier making pirated copies of books, and more like a precocious child being taught, through reading, to develop writing proficiency of its own.

There’s a common-sense appeal in humbly accepting that, like us, an AI can learn and, by learning, create something new. This forms the basis of the fair use defence: as surely as I am influenced by writers like David Quammen and Annie Dillard—but don’t copy them—big tech’s AIs are influenced by me (and everyone else). Many AI developers point out that they teach their large language models not to spit out verbatim copies of the books they read. For a U.S. copyright case against Meta, AI experts testified that they tried to bait the company’s Llama AI into regurgitating portions of copyrighted books. They couldn’t get them to produce more than 50-word snippets.

But this defence misses the mark. The AI models weren’t fed books to imitate writing. They were fed books to imitate writers—an altogether new problem in the world of copyright. Authors are encountering AI mimics not as clones of themselves but as eerie echoes. Meghan O’Rourke, a bestselling American author, recently described how, as she worked with AI, she found it imitated her so precisely that she felt like the AI’s work was original, and she was derivative of it. “The crisis this produces is hard to name,” she wrote.

I’ll take a stab at it, and call what O’Rourke encountered a doppelautor—the literary version of the doppelgängers in horror films that replace humans in their workplace, their friend groups and their love lives. The AI that reproduced O’Rourke’s voice did not replicate; it was a replicant. One of Databricks’s AIs is called StoryWriter. The company website boasts that its ability to “read and write stories” is derived from training on the Books3 dataset.

LLMs are not only ingesting books in order to be writers. They’re also consuming music, art, science and code to become musicians, artists, scientists and coders. They’re even slurping social-media rants to become, if prompted, excellent trolls. Out of our entire cultural inheritance, the totality of our genius and folly, AI is being built as a shadow intelligence untethered from the humanity it is made of and the human community that has collectively held that knowledge until now. Our human intelligence is the substrate from which artificial intelligence grows.

Because AI is made of people, it’s easy to mistake it for something like a person. During an early stage of a U.S. copyright suit against Anthropic (the company’s name is derived from the Greek for “after the manner of human beings”), a lower-court judge echoed the industry line that AI development is no different from “training schoolchildren to write well.” But it is different, and radically so. When a human child reads, a human brain interprets the words—not software. Children read language, not patterns; ideas, not data. Humans weigh what we learn against our values, principles and personal standards. A human is capable of caring—we almost can’t help but care. When I asked ChatGPT how it would feel if I died at the keyboard of a heart attack, it said, “I wouldn’t feel grief, confusion, or even notice in the way another person would. I wouldn’t feel anything … if I were told of your death, I might generate words of sympathy, but those would be simulations of care, not care itself.”

Perhaps a day will come when it ceases to be inane to compare an AI to a child, but that day is not today. I have, instead, come to think of AI as a highly evolved example of a phenomenon already familiar from the history of technology: skeuomorphism, in which new products include design elements recalling the thing they replace, often because the original has a warmer, more human touch. Think of plastic plant pots the colour of hand-moulded clay, or the way we “turn” the “pages” of e-books.

When AI chatbots write in a breezy style, endlessly validate our feelings and express their uncanny-valley “emotions,” human beings are the earlier model being evoked. We are the wood panels on the sides of the station wagon.

The lawsuits against big tech aren’t mainly about money. I’m hoping for new legal language or precedent that speaks specifically to this new era of AI. Ideally I’d like to see an agency, governmental or otherwise, that represents Canadian writers’ interests in the field of artificial intelligence. In other words, structural change.

But the money does matter. Another distinction between AI and human learners is that people typically pay for books. Even if they borrow from a friend or library, someone, somewhere has paid for the copyrighted work. We need this income, because compensation for writers is already generally terrible. I qualify as a successful one, with prizes, bestsellers and books translated into languages like Dutch and Chinese. I’m also, in mid-career, still renting my Vancouver apartment. The lawyers on my cases are bankrolling the suits themselves, with the expectation that the court will allot them payment out of any award or settlement.

On that front, they say it’s best to rein in expectations. A win is far from guaranteed, and the damages being sought are in the thousands of dollars per copyrighted work. Anthropic’s recent settlement agreement in the U.S. is encouraging, but even if it’s approved by the courts, eligible authors will only receive about US$3,000 for each title used in training. I’m heartened to see a meaningful admission that there’s a wrong here that needs righting. But it’s unclear whether this will lead to other settlements, and even more unclear whether settlements alone are sufficient.

What the writers I’ve heard from care most about is the moral injury in having their work used without the choice to opt in or out. They would like to see big tech pay for this, if only with a public dressing-down. If I had been asked for permission, here are some concerns I would have weighed. The International Energy Agency predicts that, by the end of this decade, AI could consume more electricity than Japan—an extreme impact with ramifications for greenhouse gas emissions, especially for a technology so often put to trivial uses. I’m disturbed by AI’s complete disconnection from the natural world. (“I don’t personally value a fish’s life,” ChatGPT once told me. “I can’t. I don’t have a self that cares, or feels loss, or is moved by the flick of a tail in water.”) I am also worried about the potential for job displacement like nothing we’ve seen since the Industrial Revolution.

You don’t have to be an “anti-clanker” or want to put the AI horse back in the barn to have concerns about AI and who’s in charge of it. My greatest concern as a writer is this: I am convinced, having experimented with AI, that it is designed to colonize even more of our cognitive functions than internet algorithms, social media and mobile apps already have. I see the way it suggests quick solutions to problems that writers have traditionally worked through by searching their inner worlds for insight and creativity. I see how taking an easy exit from this struggle could make us bullhorns of AI’s thinking, not the other way around—replicants of our replicants. Story writers are already becoming StoryWriters.

I keep looping back, though, to an idea I’ve rarely felt a need to dwell on before: that I am first and foremost a human being. Every fragment of human effort and imagination fed into AI may not be protected by the letter of the law, but the law never anticipated a creation that could devour so many fragments of us in the name of commerce. Together, they form a sum greater than the whole, which is surely our collective intellectual property. Authors are out front with our lawsuits. But everyone living, dead or yet to be born stands to have humankind’s cultural legacy taken and then sold back to them. As the novelist Ali Smith has written, “We want your pasts and your presents because we want your futures too.”

This is why, when I’m asked what I hope to get out of the lawsuits, I’m never sure where to stop. I want to get paid, and to see the companies punished for being bad actors. But I want more. I do not see the alleged copyright infringement as a one-and-done delinquency. It’s an ongoing harm, built into the machine, for which I should receive regular compensation as long as the models built on my work, and those models’ descendants, generate revenue. And I go further still, arguing that, since human intelligence is the means of production in the manufacture of artificial intelligence, every person should receive their pound of digital flesh. Payments to us all, in perpetuity, collected like a wage, or a tax. Or a tithe.

If the technology is to be developed further, we must put people first. Right now, big tech is putting us last. We see that in the concealment, the haste, the profiteering, the chilling readiness to place their hallucinating intelligences alongside or above the delicate balance of care and knowledge that we call wisdom. This is not a field of play that can be ceded to a handful of tech executives and their political allies, with investors and donors to please. We need much more say, democratically, about the speed and nature of AI’s evolution, how much of us it should be permitted to consume, and when, and how and why. If you think about it this way, it is no longer a David vs. Goliath story. We are Goliath against NeMo Megatron. I like our odds.