Updated: July 18, 2024, 4:44 PM EDT Salesforce provided a comment to Mashable in response to Wired’s report.
A new investigation, conducted by Proof News, claimed that tech giants such as Apple, Nvidia, Anthropic, and Salesforce are training their AI using data from “thousands of YouTube videos.” Wired claimed that subtitles from 173,000 YouTube videos had been taken without permission for these companies’ AI models.
The dataset, called “YouTube Subtitles,” includes transcripts of videos from educational channels such as Khan Academy, MIT, and Harvard, as well as The Wall Street Journal, NPR, the BBC, and others. Material from YouTube stars such as PewDiePie, Marques Brownlee, and MrBeast was also found.
Anthropic did not immediately respond to a request for comment, but Apple and Salesforce have responded to the Wired report.
Will Apple use this data for Apple Intelligence or other AI services?
The short answer is no, but here’s a longer answer for those of you who aren’t into TLDR:
In an email to Mashable, Apple confirmed that its open-source language model, OpenELM, did use the dataset, but not in the way that some might think.
The OpenELM project is part of Apple’s ongoing efforts to benefit the broader research community. In other words, according to Apple, the OpenELM model was created for research purposes only and does not support any of Apple’s machine learning or AI services, including Apple Intelligence.
For the uninitiated, Apple Intelligence is the company’s new suite of AI capabilities, announced at the Worldwide Developers Conference (WWDC) 2024, the annual event where Apple announces upcoming plans for its software products, including iOS and iPadOS.
Apple Intelligence, for example, summarizes the text of emails and text messages to help you communicate more quickly with friends, loved ones, coworkers, and others. It also supports entertainment-focused features such as Genmoji, which generates new iOS emoji from prompts, and Image Playground, where users can create AI-generated images on the fly.
iOS 18 is coming with a new Genmoji feature.
Credit: Apple
When it comes to consumer AI utilities, Apple highlighted that it offers websites the option to opt out of having their content used for AI training. Apple also said that its generative models are built and fine-tuned using high-quality data, including content licensed from publishers and stock image companies, as well as data publicly available on the web.
Simply put, Apple isn’t denying that its open source language model, OpenELM, used the dataset, but it wants to be clear that it won’t use it as the foundation for any of its AI services, including Apple Intelligence.
Salesforce claims academic use
Salesforce also shared its perspective in an email to Mashable.
“The Pile dataset referenced in the research paper was used to train AI models for academic research purposes in 2021,” a Salesforce representative said. “The dataset is publicly available and was released under a permissive license.”
What does Nvidia say?
We reached out to Nvidia for comment, but the company, known for incorporating AI into many of its gaming hardware and services, declined to provide a statement.
We’ll update this post if we hear anything from Anthropic.