
Apple woos publishers with $50 million deals for AI training content

Matthias Bastian
Image: The Apple logo constructed from newspapers, rendered in a glitch art style | DALL-E 3 prompted by THE DECODER

Professional texts are a valuable resource for training language models. Apple is the next major technology company to seek licensing deals with publishers.

Apple has reportedly reached out to several major publishers and news organizations in recent weeks to license their content for AI training. That's according to The New York Times, which cited four people familiar with the matter.

Apple is said to be offering multi-year deals worth at least $50 million. Among the companies Apple has approached are Condé Nast, NBC News and IAC, which owns People, The Daily Beast and Better Homes and Gardens.

However, some publishers are reportedly concerned that Apple's terms are too broad, covering extensive licensing of their archives for the money offered. Publishers would also be held legally liable for Apple's use of their content.

Another risk is that it is unclear how Apple will use generative AI in the news context. If Apple were to offer news-trained generative AI through its own hardware and software ecosystem and take over parts of the news business, publishers would effectively be creating their own competitor by making their content available to Apple.

This is similar to what is happening with artists, except that artists have never been asked if their material can be used to train AI, nor have they received any royalties.

At least Apple is asking

But Apple could find ways to train capable generative text systems without those licenses. Or it could do what Google or OpenAI do and just take what it needs without asking.

According to the NYT, this is why some news industry executives are optimistic that Apple's approach could ultimately lead to a useful partnership, since the company is asking for permission rather than using content without consent.

Recently, OpenAI announced two partnerships, one with Axel Springer and one with the Associated Press in the US. In both partnerships, OpenAI will receive journalistic content for AI training and provide the organizations with its AI technology to optimize production processes. In addition, content from Axel Springer publications will be distributed in ChatGPT.

Apple tries to catch up in AI

The recent AI hype has been dominated by OpenAI, Microsoft and Google, complemented by a few startups like Anthropic and Midjourney, and a vibrant open-source scene. Apple has been largely absent. But that is slowly changing.

Apple is reportedly working on AI tools, including a large language model framework called Ajax and a chatbot service called AppleGPT. The effort is said to have become an important part of the company's work.

According to Bloomberg, Apple is already using the language model internally. It is designed to help employees build prototypes, summarize text, and answer questions about the data it has been trained on.

Apple's language modeling team is said to be small, at about 16 people, but to have a big budget: millions of dollars reportedly go into AI training every day. The "Foundational Models" team is led by AI engineer John Giannandrea, whom Apple poached from Google in the spring of 2018 and promoted to the executive team the following winter.

Apple also recently released MLX, an efficient machine-learning framework for Apple chips. With MLX, users can locally train or fine-tune a Transformer language model, generate text with Mistral, create images with Stable Diffusion, and perform speech recognition with Whisper, among other features.
