Publishers Face Existential Threat: Large Language Models Exploit Content Without Credit or Revenue, Sparking Copyright Battle

Summary: Large Language Models (LLMs) like Bard and ChatGPT pose a significant threat to publishers as they answer reader queries, diverting traffic and revenue from publishers without credit. The article argues for copyright protection, suggesting LLMs should strike deals with publishers before using their content for training, emphasizing the need for a reevaluation of the notion that “information wants to be free.”

Publishers are all facing a huge threat from large language models (LLMs) like Bard and ChatGPT, because their prospective readers will get answers to their questions from the LLM rather than going to the publisher. The publisher will get neither revenue nor credit for this. Not even a lousy link.

This is a flagrant violation of the publisher’s copyright, and publishers need to make that case. The LLMs need to make a deal with the publisher before they use the publisher’s content for training, and until such a deal is made, publishers should cut off the LLMs.

“But it’s out there for free,” someone will say.

No, it’s not. Publishers make content available under a certain set of assumptions. One of those assumptions is that it’s being read by a person who is a potential customer. Another assumption may be that the person will see ads along with the content, might sign up for an e-newsletter, etc. In other words, the content is conditionally free, and the LLM is violating those conditions.

The publisher is making a business decision along these lines: “It’s worth it to me to have this information available on the internet because of the potential business I might get out of it.”

Consider me. I provide a lot of “free” information about publishing, technology, customer data platforms, and other things. I do this so that people will read, watch, or listen to my material and think, “Oh, this guy knows what he’s talking about. We should hire him to help us with this problem.”

It’s possible that someone will read everything I’ve written or said, understand it all, figure out the problem for themselves, and no longer need my services.

It’s possible … but it’s not likely. Most people don’t have the time to read it all, and even if they did, they aren’t likely to absorb it, or understand it in the larger context. It’s easier just to hire me.

The same is not true with an LLM. It’s a piece of cake for an LLM to read everything I’ve written, put it in the context of mountains of information from other sources, and spit out the Krehbielian answer — with counter-arguments, charts, pretty pictures, and additional information. (I suppose we’re still learning how good LLMs are at that, but it’s a decent bet they’ll be very good at it very soon.)

Some random person reading all my stuff is a negligible threat to my business. The LLM is a huge threat.

This is analogous to the old saying, “there’s no such thing as a free lunch.” My content isn’t really “free.” It’s provided on the condition that there’s a chance the reader will recommend or hire me.

That condition doesn’t apply when the reader is ChatGPT, Bard, or Bing, and that’s the fundamental inequity here. A publisher’s content is not provided for the purpose of putting the publisher out of business.

Publishers are in this mess because we’ve swallowed the wrong-headed idea from the copyright anarchists that “information wants to be free.” But that’s a larger topic, which I’ll explain in a talk today at the Global Media Congress, and in a forthcoming video. Stay tuned.
