OpenAI expands fine-tuning offering and believes in custom AI models for businesses

Apr 4, 2024

Midjourney prompted by THE DECODER

Key Points

OpenAI announces new features for self-service fine-tuning of GPT-3.5 via the API. According to OpenAI, thousands of companies have trained hundreds of thousands of custom models to date.
OpenAI is also expanding its custom model program. Working with OpenAI's technical teams, companies can use Assisted Fine-Tuning to develop custom GPT-4 models that go beyond the capabilities of the Fine-Tuning API and optimize performance for specific use cases.
Early testing and customer examples show the potential and limitations of GPT-4 Fine-Tuning. While some companies have seen significant improvements, challenges such as high latency and cost, as well as weaknesses in broad, open issues, remain.

OpenAI introduces new functions for API fine-tuning and expands its program for customer-specific models.

OpenAI has announced new features for self-service fine-tuning of GPT-3.5 via its API, which has been used by thousands of companies to train hundreds of thousands of models since its launch in August 2023, the company says.

New features include saving checkpoints during each training epoch, a new Playground interface for comparing model quality and performance, support for third-party platform integrations (starting with Weights and Biases), calculation of metrics across the entire validation dataset at the end of each session, and various improvements to the Fine-Tuning Dashboard.

Video: OpenAI

According to OpenAI, the most common use cases for fine-tuning include training a model to generate better code in a specific programming language, summarizing text in a specific format, or creating personalized content based on user behavior.

Indeed, a global job listing and job placement platform, used GPT-3.5 Turbo fine-tuning to send personalized recommendations to jobseekers, reducing costs and latency by reducing the number of tokens in the prompts by 80 percent and increasing the number of personalized job recommendations sent per month from one million to around 20 million.

OpenAI believes in customized AI models for companies - and is becoming a service provider

OpenAI is also continuing to develop its program for customer-specific models. As part of Assisted Fine-Tuning, the company's technical teams work with customers to implement techniques that go beyond the Fine-Tuning API, which is supposed to be particularly helpful for companies that need assistance in building efficient training data pipelines, evaluation systems, and customized parameters to maximize model performance for their use case.

According to OpenAI, after several weeks of collaborative work on GPT-4, South Korean telecommunications provider SK Telecom was able to increase call summary quality by 35 percent, intent recognition accuracy by 33 percent, and satisfaction scores from 3.6 to 4.5 (out of 5) compared to standard GPT-4.

Harvey, an AI tool for lawyers and an OpenAI investment, achieved an 83 percent increase in factual answers to legal questions by making adjustments throughout the training process, with lawyers preferring the outputs of the customized model compared to GPT-4 in 97 percent of cases.

Harvey GPT-4, without (left) and with fine-tuning (right). | Video: OpenAI

An independent test of GPT-4 fine-tuning by the data analysis platform Supersimple found that while fine-tuning improves task performance, there are challenges.

In the case of Supersimple, which achieved a 56 percent performance improvement over GPT-3.5, the benefits of fine-tuning GPT-4 were less significant than those observed when switching from GPT-3 to GPT-3.5. Additionally, the fine-tuned GPT-4 continued to show weaknesses in answering broad and open-ended questions, and had significantly higher latency and cost compared to GPT-3.5.

Fine-tuning can help models better understand content and extend the existing knowledge and capabilities of a model for a given task. Experts debate whether learning from many examples directly in the prompt ("many-shot in-context learning") might be more efficient than the comparatively more complex fine-tuning. Either way, it is easier to test.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

Source: OpenAI