A federal judge’s recent order requires OpenAI to indefinitely preserve all ChatGPT data amid an ongoing copyright lawsuit, prompting OpenAI to appeal the decision. The company argues this “sweeping, unprecedented order” infringes on its users’ privacy rights, adding another layer to its complex legal battle with The New York Times over AI training data.
The legal challenge began in 2023, when The New York Times sued OpenAI and Microsoft, alleging copyright infringement through the use of its articles to train their large language models. OpenAI countered that the Times’ case was “without merit,” asserting that its training practices fall under the doctrine of “fair use.” OpenAI has laid out its position in its blog post about AI and journalism.
Previously, OpenAI’s policy was to retain chat logs only for users of its free and paid ChatGPT tiers who had not opted out of data collection. In May, however, the Times and other news organizations submitted a filing claiming OpenAI was engaged in the “substantial, ongoing” destruction of chat logs that could contain evidence of copyright violations. In response, Judge Ona Wang ordered OpenAI to preserve and segregate all ChatGPT logs that would otherwise have been deleted.
OpenAI’s Appeal: Prioritizing User Privacy and Data Integrity
In its court appeal, OpenAI contended that Judge Wang’s order “prevent[s] OpenAI from respecting its users’ privacy decisions.” As reported by Ars Technica, the company also labeled the Times’ accusations of data destruction as “unfounded.” OpenAI stated, “OpenAI did not ‘destroy’ any data, and certainly did not delete any data in response to litigation events. The order appears to have incorrectly assumed the contrary.”
OpenAI COO Brad Lightcap reinforced this stance in a public statement, asserting, “The [Times] and other plaintiffs have made a sweeping and unnecessary demand in their baseless lawsuit against us.” He elaborated that compelling OpenAI to retain all user data “abandons long-standing privacy norms and weakens privacy protections.”
Echoing these concerns, OpenAI CEO Sam Altman commented on X (formerly Twitter), describing the data retention request as “inappropriate” and one that “sets a bad precedent.” He further suggested the situation underscores the need for “AI privilege,” likening conversations with AI to confidential discussions with professionals like lawyers or doctors.
we have been thinking recently about the need for something like “AI privilege”; this really accelerates the need to have the conversation.
imo talking to an AI should be like talking to a lawyer or a doctor.
i hope society will figure this out soon.
— Sam Altman (@sama) June 6, 2025
Public Reaction and Broader Implications for AI Users
The court’s data retention order has reportedly caused concern among some users. OpenAI’s court filing, as noted by Ars Technica, referenced social media posts where individuals expressed anxiety over their privacy. One LinkedIn user cautioned clients to be “extra careful” about the information shared with ChatGPT. Similarly, a tweet criticized the ruling: “Wang apparently thinks the NY Times’ boomer copyright concerns trump the privacy of EVERY @OPENAI USER – insane!!!”
While some users might not share highly sensitive information with ChatGPT, many use the AI for personal matters, including as a therapy tool, for life advice, or even as a form of companionship. These users have a reasonable expectation that such personal interactions remain private, which makes clarity about how AI models handle and retain user data all the more important.
Balancing Copyright, Innovation, and Consent in AI Training
Despite OpenAI’s assertions that the Times’ case is baseless, the lawsuit highlights critical questions about how artificial intelligence is trained. The methods used to gather data for AI models have come under scrutiny before. For instance, Clearview AI reportedly scraped 30 billion images from platforms like Facebook to develop its facial recognition technology. There have also been reports of government entities using images of vulnerable populations to test facial recognition software.
Although these examples extend beyond journalism and copyright law, they underscore a growing societal need for transparent discussions. A key question emerging is whether AI companies like OpenAI should be required to obtain explicit consent to use content for training, rather than broadly scraping data from the internet.
The Path Forward: Navigating Legal and Ethical Frontiers
The legal tussle between OpenAI and The New York Times brings the conflict between intellectual property rights, evidence preservation, and individual user privacy into sharp focus. OpenAI frames its appeal against the sweeping retention order as a defense of user privacy, while the lawsuit itself forces a necessary conversation about the ethics of AI development and data use. The outcome of this case could set significant precedents for the AI industry, shaping how data is handled and how user privacy is weighed against other legal and commercial interests. The ongoing debate underscores the need for clear guidelines and ethical frameworks as AI technology continues to evolve.