OpenAI, Google Could Be Charged for AI Training on Copyrighted Material in India
India’s Department for Promotion of Industry and Internal Trade has proposed a framework that would give AI companies access to all copyrighted data and works for training in exchange for royalties paid to a new collection agency made up of rights-holding organizations, with payments going to creators. According to the proposal, mandatory blanket licenses would likely reduce compliance costs for AI companies while ensuring that writers, musicians, artists, and other rights holders are compensated when their work is used to train commercial models.

This proposal from India emerges amid growing global concern over how leading AI companies train their models on copyrighted content scraped from the web. The practice has sparked a debate among authors, artists, news organizations, and other rights holders in the U.S. and Europe about its legality. Courts and regulators are still weighing whether such training counts as fair use or infringes copyright. In the meantime, many AI companies are taking advantage of the uncertainty, using large amounts of data without clear regulation while expanding globally, which raises further concerns among consumers about data privacy and security.
India is proposing a far more interventionist approach: granting AI companies automatic access to copyrighted data and requiring mandatory payments in exchange, a sharp departure from today's unregulated data scraping. Meanwhile, U.S. and European policymakers are still debating fair use limitations and transparency requirements.
The eight-member committee, created by the Indian government in late April, said the system would reduce legal uncertainty while ensuring that creators are compensated for their original content.
Defending the system in a 125-page submission, the committee wrote that the framework “aims to provide easy access to content for AI developers, reduce transaction costs… [and] ensure fair compensation for rightsholders.” According to the committee, this is the least burdensome way to handle large-scale data scraping for AI training.