Earlier this week, an investigation revealed that Apple and other tech giants had used YouTube subtitles to train AI models. The subtitles came from more than 170,000 videos, including ones from creators such as MKBHD and MrBeast. Apple used the dataset to train its open-source OpenELM model, which was released in April.
However, Apple confirmed to 9to5Mac that OpenELM is not included in any of its AI or machine learning features, including Apple Intelligence.
Apple says it created the OpenELM model as a way to contribute to the research community and foster the development of open-source large language models. Apple researchers have described OpenELM as “the state-of-the-art open language model.”
According to Apple, OpenELM was created for research purposes only and will not be used to power Apple Intelligence features. The model has been open sourced and is widely available, including on Apple’s Machine Learning Research website.
Because OpenELM is not used as part of Apple Intelligence, the “YouTube Subtitles” dataset is not used to power Apple Intelligence. Apple has previously said that its Apple Intelligence models are trained “on licensed data, including data selected to power specific features, as well as public data collected by our web crawlers.”
Finally, Apple said that it has no plans to build a new version of the OpenELM model.
As Wired reported earlier this week, companies including Apple, Anthropic and NVIDIA have all trained AI models using this “YouTube subtitles” dataset, which is part of a larger collection from nonprofit EleutherAI called “The Pile.”