OpenAI now allows developers to fine-tune GPT-4o. This customization aims to enhance performance for specific use cases and reduce costs.
According to OpenAI, fine-tuning enables adjustments to response structure and tone, as well as the ability to follow complex domain-specific instructions. Significant improvements can be achieved with just a few dozen examples in the training dataset, the company says.
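For context, a fine-tuning job of this kind is started through OpenAI's API. The sketch below uses the OpenAI Python SDK to write one chat-formatted training example to a JSONL file and launch a job. It is illustrative only: the file name and prompts are invented, and `gpt-4o-2024-08-06` is assumed to be the fine-tunable GPT-4o snapshot.

```python
# Minimal sketch of starting a GPT-4o fine-tuning job with the OpenAI Python SDK.
# Assumptions: "train.jsonl", the prompts, and the snapshot name are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each training example is one JSON line in the standard chat format.
example = {
    "messages": [
        {"role": "system", "content": "You are a support agent for AcmeDB."},
        {"role": "user", "content": "How do I reset my API key?"},
        {"role": "assistant", "content": "Open Settings -> API Keys and click 'Rotate'."},
    ]
}
with open("train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # a real dataset needs at least a few dozen lines

# Upload the dataset and start the fine-tuning job.
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumed fine-tunable snapshot
)
print(job.id, job.status)  # poll client.fine_tuning.jobs.retrieve(job.id) until it finishes
```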
OpenAI highlights two early applications of fine-tuning: Cosine's AI assistant Genie achieved a top score of 43.8 percent on the SWE-bench Verified benchmark for software engineering, and Distyl secured first place on the BIRD-SQL text-to-SQL benchmark, with its customized GPT-4o model reaching 71.83 percent accuracy.
Free training tokens available through September 23
OpenAI emphasizes that developers retain full control over their tuned models: according to the company, input data is used only to refine the developer's own model, never to train other models. At the same time, OpenAI applies safety measures to prevent misuse and monitors fine-tuned models for potential safety issues.
Fine-tuning is available to all developers on paid usage tiers. Training costs $25 per million tokens; inference with a tuned model costs $3.75 per million input tokens and $15 per million output tokens.
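To make those rates concrete, here is a back-of-the-envelope estimate. The per-million-token prices come from the figures above; every volume in the sketch (dataset size, epoch count, monthly inference traffic) is invented for illustration, and it assumes billed training tokens scale with the number of epochs.

```python
# Rough cost estimate for a GPT-4o fine-tuning job.
# Rates are from the article; all volumes below are assumed for illustration.
TRAIN_PER_M = 25.00    # $ per 1M training tokens
INPUT_PER_M = 3.75     # $ per 1M input tokens (inference)
OUTPUT_PER_M = 15.00   # $ per 1M output tokens (inference)

dataset_tokens = 500_000   # hypothetical dataset size
epochs = 3                 # training passes over the dataset
input_tokens = 2_000_000   # hypothetical monthly inference input
output_tokens = 500_000    # hypothetical monthly inference output

training_cost = dataset_tokens * epochs / 1e6 * TRAIN_PER_M
inference_cost = input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

print(f"Training:  ${training_cost:,.2f}")   # $37.50
print(f"Inference: ${inference_cost:,.2f}")  # $15.00 per month
```

Under these assumed volumes, the one-time training cost is modest next to ongoing inference, which is why the free-token promotion below mostly matters for teams iterating on many training runs.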
Fine-tuning is also available for GPT-4o mini. Until September 23, OpenAI is providing two million free training tokens per day for GPT-4o mini and one million free training tokens per day for GPT-4o.
Fine-tuning can help, but is not a cure-all
Fine-tuning improves an AI model's performance and tailors it to specific tasks. It can deepen the model's grasp of domain content and extend its knowledge and capabilities for particular applications. OpenAI's new fine-tuning options are part of a broader push to customize AI models for businesses.
An independent test by the data analytics platform Supersimple showed that a fine-tuned model can significantly outperform a standard model on specific tasks, though it still made mistakes. Notably, the performance boost from fine-tuning GPT-4 was smaller than the improvement seen in the jump from GPT-3 to GPT-3.5.