Apple denied using unethically obtained data to train Apple Intelligence but acknowledged using it for other projects.
On Tuesday, an AI research institute called EleutherAI was found to have been harvesting subtitles from YouTube videos without the creators’ explicit permission, as well as scraping data from Wikipedia, the UK Parliament, and Enron employee emails, and adding it all to a dataset called “The Pile.”
EleutherAI says its goal was to lower the barrier to AI development for people outside of big tech companies, but companies like Nvidia, Salesforce, and Apple all use the Pile to train various AI projects.
Apple has now said that while it did use the Pile, the dataset wasn’t used for Apple Intelligence. Instead, it was used to train the open-source OpenELM models the company released in April.
Apple later confirmed to AppleInsider that the OpenELM model isn’t used in any of the company’s AI or machine learning capabilities. Instead, the tech giant claims it developed OpenELM as a contribution to the research community.
The company also stated that the OpenELM model is not intended for use with Apple Intelligence, and that it has no plans to build a new version of the model.
Apple has repeatedly insisted that the sources of information for its artificial intelligence projects are ethical, and is known to pay millions of dollars to publishers and license images from photo library companies.