November 20, 2024

News , Article

Ai

OpenAI: Copyrighted data ‘impossible’ to avoid for AI training

The company (OpenAI) asserted that complying with copyright laws in the context of training advanced AI tools like ChatGPT would be practically unfeasible due to the extensive scope of required training.

In its written testimony, OpenAI argued that the combination of stringent copyright regulations and the prevalence of copyrighted online content would render nearly all forms of human expression inaccessible for training data. This includes a wide range of content, from news articles and forum comments to digital images, making it challenging to find freely and legally usable data.

Attempts to create sophisticated AI models

OpenAI contended that any attempts to create sophisticated AI models while strictly avoiding copyright infringement would be unsuccessful. The company emphasized that restricting training data to public domain materials from over a century ago would not yield AI systems that meet the contemporary needs of users.

also read: AI advances risk facilitating cyber crime: top US officials

While OpenAI defended its current practices as compliant with existing regulations, it acknowledged that collaborations and compensation arrangements with publishers might be appropriate to “support and empower creators.” However, there was no indication that the company plans to significantly curtail its use of online data, including content behind paywalls such as journalism and literature.

Taking this stance has exposed OpenAI to a barrage of lawsuits, with media organizations like The New York Times alleging copyright violations.

However, despite the legal challenges, OpenAI seems unwilling to make fundamental changes to its data collection and training methods, citing the “impossible” constraints that self-imposed copyright limits would impose. Instead, the company aims to rely on expansive interpretations of fair use provisions to legally utilize extensive amounts of copyrighted data.

also read: UP Police On Alert In 7 Districts Along Nepal Border

At present, OpenAI is taking a calculated gamble against copyright maximalists, opting for an approach that involves extensive copying to propel ongoing AI development.