Training of AI models: Threats and opportunities

The use of data from the open web to train artificial intelligence systems is receiving greater scrutiny from regulators and privacy activists. Tom Cooper reports.

Generative artificial intelligence (GenAI) models, such as ChatGPT, need to ingest large quantities of data, usually from the open Internet. But can the potential data protection problems be mitigated while still allowing these increasingly valuable tools to be developed?

In April last year, Italy’s Data Protection Authority, the Garante, imposed and then lifted a ban on the popular ChatGPT tool(1), and its investigation continues. In June, Ireland’s data protection watchdog, the Data Protection Commission (DPC), “engaged intensively” with Meta over plans to train an artificial intelligence (AI) model using data shared by Facebook and Instagram users. Meta subsequently paused its plans.(2) Privacy group noyb had earlier filed 11 complaints with national regulators, including Ireland’s, over the change in Meta’s terms and conditions that would have required users to opt out of processing by Meta AI.(3)
