ANI vs OpenAI: Can India’s outdated copyright act deal with legal conundrums?

Last month, a group of digital news publishers in India sought to join a case filed by news agency ANI in the Delhi High Court accusing the US firm OpenAI of misusing material to which they own the copyright.

They contend that OpenAI’s ChatGPT – an artificial intelligence-powered “bot” that responds to user queries – has built its database illegally using text and visual material owned by them.

They also claim that the responses generated by ChatGPT are similar to their works but fail to attribute the source or misattribute it.

OpenAI, however, told the court that it only uses “publicly available data”.

This case in India echoes suits filed by publishers and media organisations around the world, accusing technology companies of violating copyright law to create AI products.

In India, OpenAI faces a major hurdle. India’s Copyright Act of 1957 and related jurisprudence favour the rights of authors and creators and were not designed to deal with the challenges posed by this emerging technology.

With India trying to pitch itself as a leader in AI, the single judge’s decision in the ANI case could have far-reaching policy ramifications. Requiring companies to obtain licences to access material on which AI models are trained may slow the development of the new technology. But allowing the unauthorised use of copyrighted materials may hurt creative industries.

AI tech vs creative industry

AI products, such as ChatGPT, have been trained on vast amounts of data gathered from the internet. The greater the quantity of data and the higher its quality, the better generative AI models perform.

However, Indian copyright law protects material that might be used as part of the datasets on which the AI product has been trained.

News publishers, for instance, are granted exclusive rights or “copyright” over their content, which involves a substantial investment of time and money to produce. Without such protection, others could start competing businesses by merely copying the articles without investing in news gathering, editing and other tasks related to producing a publication.

In the AI context, scraping publicly available news articles off the internet may involve temporarily reproducing the items in databases used to train the models. According to Indian law, such reproduction without permission is a copyright infringement. This is the case even if the output generated by the AI model is not a copyright violation.

This poses a legal conundrum: the use of copyright material to train models is a violation but the output may not be.

One of the central concerns before the Delhi High Court, therefore, will be whether the law should focus on the misuse of copyrighted content rather than its mere use.

In the New York Times OpenAI lawsuit, you can see how complex the relationship of training data to output can be. On one hand, they find that you can induce ChatGPT to produce exact content from famous Times articles, on the other, they show it also hallucinates false articles. pic.twitter.com/cY7cyZjd8r
— Ethan Mollick (@emollick) December 27, 2023

Copyright law limitations

At the same time, copyright protection is not absolute: it is subject to limitations that are for the benefit of society. In India, the limitations are narrowly tailored and determined by the legislature. Section 52 of the Copyright Act, 1957, lists 33 specific activities that do not require a licence from the copyright-holder.

These include fair use of works for criticism or reviews, news reporting, personal research and educational activities, as well as certain uses of government works and certain uses by libraries. There is, however, no express clause in Section 52 that permits the large-scale reproduction of copyrighted material to train AI models, especially those used commercially and for profit.

Unlike in India, copyright laws in the US, European Union and Singapore have provisions to respond to new technologies such as AI training. The US has a free-wheeling “fair use” doctrine that allows the unlicensed use of copyright-protected work in certain contexts. For instance, in the 2015 Google Books case, the “fair use doctrine” was used to permit large-scale data mining.

The European Union, on the other hand, does not allow AI companies to unconditionally use copyrighted content for training.

Singapore, meanwhile, in 2021 amended its copyright law to allow text and data mining for “computational data analysis”. This exception was introduced to make Singapore more attractive for AI research.

Copyright law is right-holder centric

In the absence of this kind of express exception, for the Delhi court to rule in favour of OpenAI, it may have to travel beyond the law and into the realm of policy. This is because Indian copyright law is strikingly right-holder centric in text, history and spirit.

In its text, the current exceptions to copyright infringement under Section 52 are activity specific, pre-defined by the legislature and do not include large-scale reproduction of copyright works for training AI models.

Historically, gaps that arise due to technological advancements have been filled by legislative amendments and not by courts. For example, in 2012, the legislature amended the Copyright Act to permit unlicensed transient or incidental storage of works for specific purposes that were required for the smooth functioning of the internet.

This meant that works that were reproduced temporarily or incidentally for the purpose of caching or linking, purely in the technical process of electronic transmission, were not considered copyright infringement. But the legislature did not exempt all transient or incidental storage – and this may be relevant in the context of training AI models.

In spirit, the Copyright Act encourages right-holders to control the use of their works in new markets. For example, when Hindi film songs were used as phone ringtones, the Copyright Act was amended in 2012 to allow right-holders to earn revenues from this emerging market. Though the songs were originally made for use only in movies at a time when ringtones did not exist, the legislature stepped in to strengthen author rights.

Similar logic might apply to articles that were written for newspapers but are being used to train AI models. There is a new market for the training of AI models and the spirit of the law appears to favour authors and enable them to license their rights in these new markets.

Moreover, the commercial use of ChatGPT and OpenAI’s for-profit motives may weaken its claims that it is not infringing on copyright protections.

Unfair competition, a fairer way to decide?

If the case does go to trial, as it is expected to in February or March, the judge will have to decide whether AI companies should seek permission from creators in India before reusing their work to train their models.

But Indian copyright law is not designed to resolve this question.

In such a situation, it may have been more realistic for the court to decide the case not on the basis of copyright law but on the principles of unfair competition. Unfair competition, in common law, is generally understood as any act of competition that is “contrary to honest practices in commercial matters”.

“Tortious” claims made outside of the Competition Act and the Copyright Act seek compensation for harms caused by another person or for misappropriating the value of an asset. Such claims are relevant when commercial value needs protection – where someone is “reaping where they have not sown” and when there are fact specific issues of “unfairness” and “competition”.

Such cases give courts considerable discretion in shaping practices that may be considered “contrary to honest commercial practices”. They are often case-, fact- and evidence-specific. This would be easier for courts to decide – instead of having to craft a national copyright policy that is actually the domain of the legislature.

Unfortunately, the Delhi High Court has categorically rejected such claims in intellectual property disputes. The court has been clear that the copyright statute pre-empts claims of unfair competition – which means that there can be no claim of unfair competition where a claim of copyright infringement has been made.

With so many legal dead-ends, as with other technological advances, market-based solutions may evolve faster than the law. Perhaps new features and the evolution of technology could pave the way for creative deal making between AI companies and creators.

Aparajita Lath is a lawyer and Assistant Professor of Law at the National Law School of India University, Bangalore.
