Do AI Programs Store or Memorize Personal Data? German Regulator Says “No”

by John DiGiacomo

Partner

Internet Law

There are several ongoing legal controversies over AI software models, such as chatbots, and whether the training and output of such models violate copyright and data privacy laws and endanger personal and social freedoms. We wrote recently about the pending case of Andersen v. Stability AI, Ltd. (N.D. Cal.), which concerns whether AI-generated images infringe copyrights. The federal judge in that case recently allowed it to proceed beyond the motion-to-dismiss phase because the plaintiffs alleged that the AI program stored or contained compressed copies of billions of copyrighted images that had been downloaded and used for training. This was an allegation in the Amended Complaint that the judge was required to accept “as true.” Because this fact was “taken as true,” the court allowed the case to go forward on claims of direct infringement and induced infringement.

Relevant to this issue is a recently released report by a German regulator concluding that AI models do NOT memorize or store personal data such as names and birth dates. The regulator in question is the Hamburg Commissioner for Data Protection and Freedom of Information. The report addresses personal data privacy rights, but it presents a factual finding that might also be relevant to questions of copyright infringement.

The Hamburg Commissioner noted that generative AI software programs contain several interacting components, one of which is generally called a Large Language Model (“LLM”). LLMs are used for text-generating AI programs, and similar components are used for image- and video-generating AI models. The Hamburg Commissioner’s ultimate finding was that LLMs do not store or memorize personal data. Rather, “LLMs store highly abstracted and aggregated data points from training data and their relationships to each other, without concrete characteristics or references that ‘relate’ to individuals” (p. 6). Because of this, LLMs are not storing “personal data,” as that term is defined by EU personal and data privacy jurisprudence, because what is stored “lacks the necessary direct, targeted association to individuals….” In overly simplified terms, the LLMs store data that is disaggregated, abstracted, and disconnected. As such, there is no “personal data.”

Now, it must be said that the Hamburg Commissioner’s report is focused on a couple of very narrow questions: are LLMs engaged in the “processing” of personal data, and are they, themselves, subject to EU data privacy regulations? On that very narrow set of questions, the Hamburg Commissioner is suggesting that the answer is “no.” However, there is (or will be) a different answer when the whole AI program is considered, since the report admits that the output of the AI program “… may contain information relating to natural persons, especially if the prompt specifically asks for it.” Again, in overly simplified terms, the disaggregated data is re-aggregated, and the resulting output can contain information that identifies natural persons. That is “personal data” subject to EU privacy regulations.

In any event, it will be interesting to see how information and data are found to be stored in image-generating AI programs. The outcome of the various copyright cases may turn on the answer to that question.

Contact the AI, Internet Law, and Copyright Attorneys at Revision Legal

For more information, contact the experienced AI, Internet Law, and Copyright Lawyers at Revision Legal. You can contact us through the form on this page or call (855) 473-8474.
