The holders of copyrights for newspapers, magazines, books, and other publications are involved in numerous legal battles with owners of AI models over alleged copyright infringement. The plaintiff copyright owners claim that large language models have been trained on huge quantities of copyrighted materials without permission and, most importantly, without payment. The plaintiffs claim that such training is actionable copyright infringement, and they seek to recover the substantial money damages allowed under the Copyright Act.
Recently, some interesting factual nuances have been disclosed about the source of some of the copyrighted materials and about whether the United States’ restrictive copyright laws create a national security threat. It is probably not well known that there are a couple of enormous “illegal” or “shadow” online libraries containing tens of millions of books and other materials (including academic papers), all of which are covered by copyright protections.
As discussed here in some detail, the operators of these shadow libraries are fully aware of the copyright protections and admit that they are running an “illegal” library. However, they feel they have a moral obligation to ensure that “humanity’s heritage” is not lost. About 20 years ago, the largest such shadow library was called Z-Library. About 15 years ago, governments around the world succeeded in shutting down Z-Library, but not before at least one group had copied the whole library onto a new server under a new name. The current version is called Anna’s Library. Anna’s Library claims to have expanded the Z-Library collection to the point that it now contains 140 million items.
What is interesting for our purposes is that, over the last few years, many AI companies have contacted Anna’s Library seeking to use its collection as a source of training materials for their AI models. From the article linked above, Anna’s Library stated:
“Virtually all major companies building LLMs contacted us to train on our data. Most (but not all!) US-based companies reconsidered once they realized the illegal nature of our work. By contrast, Chinese firms have enthusiastically embraced our collection, apparently untroubled by its legality. This is notable given China’s role as a signatory to nearly all major international copyright treaties. We have given high-speed access to about 30 companies.” (emphasis added)
For additional details, see article here.
The foregoing revelations are interesting in and of themselves, adding a factual nuance to the AI-copyright cases.
However, just as interesting is the suggestion that a national security problem exists. As can be seen in the quote above, according to Anna’s Library, Chinese AI firms are not particularly concerned about the legality of using pirated books and illegal libraries. When it comes to creating and training AI models, this puts Chinese AI firms at a significant advantage over U.S. firms that are more leery of copyright complications. For example, this report (page 7) notes that Anna’s Library was used, in part, in the pretraining phase of an earlier version of China’s DeepSeek AI model.
So, the concern is that, with fewer constraints, AI firms in China and other countries will move more quickly to advance their AI models, leaving U.S. and other Western AI models to fall behind. This raises two questions: whether that constitutes a national security threat and what can be done about it. Many suggest that the obvious answer to the first question is “yes.” As for the second, one suggestion is to amend copyright laws to create exceptions for AI training.
Contact the Copyright and AI Attorneys at Revision Legal
For more information, contact the experienced Copyright and AI Lawyers at Revision Legal. You can contact us through the form on this page or call (855) 473-8474.