OpenAI, a company built on ‘scraping’ content without permission, makes a copyright claim against a subreddit using its logo

OpenAI is a company built on the work of others. Techbros may hail CEO Sam Altman as some sort of digital messiah but, with apologies to Monty Python, really he’s just a very naughty boy, who understands that if OpenAI hoovers up as much content as it can to train its models, then all we can do is close the stable door long after the horse has bolted. OpenAI trains ChatGPT on copyrighted content by design, and dares society to try and stop it (on which note, good luck to the New York Times with its lawsuit).

One of the advantages of all that lovely venture capital flowing in is that OpenAI can afford all the lawyers it wants to fight these battles. But maybe there was something of a lull recently, because OpenAI has issued a “copyright complaint” against the r/ChatGPT subreddit for the use of the OpenAI logo.

The news came in a post to the subreddit (first spotted by 404 Media), which included a screenshot of this message received from Reddit:

“Hello Mods, We have received a copyright complaint from openai.com alleging unauthorized use of their copyrighted logos in r/ChatGPT. The ‘subreddit profile image’ does make use of the copyrighted content, which can lead to user confusion: please address the unauthorized copyrighted elements by May 16.”

The mods were asked to remove the OpenAI logo and reply confirming they’d done so.

“It does not seem wise for OpenAI to start enforcing copyright claims,” observes not_wyoming, a user on r/ChatGPT. “Ironic for a company who scraped the entire internet,” adds Elsa_Versailles. Nelculiungran says “this is so hypocritical it hurts…” and Kiwizoo replies “well considering they used all our Reddit posts to train the thing, I agree.”

The last comment refers to Reddit’s recent agreement to sell user data in order to train AI, which is currently the subject of an FTC enquiry. The most amusing responses to the copyright claim came from users who were inspired to prompt ChatGPT into generating OpenAI logos that wouldn’t infringe on the original.

The logo was removed from r/ChatGPT, but OpenAI subsequently relented, perhaps realising that going after its own enthusiasts over copyright was not a great look. The subreddit is already running a competition to create a new logo, though, with a winner to be chosen next week.

So what does OpenAI think about all this troublesome copyright malarkey? In a submission to the UK’s House of Lords earlier this year, the company put forward this position: “Because copyright today covers virtually every sort of human expression—including blog posts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials.” It went on to add that limiting the training data to public domain books and drawings would not produce AI models that “meet the needs of today’s citizens.”

The company has claimed to “respect the choices of creators and content owners on AI” with an opt-out model, and has made vague promises of a tool that will identify copyrighted material (though what it will actually do about it is unknown). Essentially it’s flipping the problem back onto the people and companies that are having their work scraped: OpenAI says it’s their responsibility to stop it and opt out.

All of which makes it amusing, even if in a slightly dystopian way, to see OpenAI get prissy about a bunch of ChatGPT fans using the company’s logo on a subreddit. One rule for AI, in other words, and another for you.