Ad
Short

OpenAI has introduced a new initiative called the "Pioneers Program" aimed at developing AI benchmarks tailored to specific industries. The company says the goal is to create evaluation methods that better reflect real-world use cases in areas such as law, finance, and healthcare—domains where existing benchmarks fall short. According to OpenAI, current AI benchmarks are often flawed. They tend to measure tasks that are difficult to interpret or overly susceptible to manipulation—criticisms that have also been directed at OpenAI itself. As reported previously, the company has faced scrutiny over its involvement in funding and promoting a prominent math evaluation dataset. In the coming months, OpenAI plans to collaborate with multiple companies to build domain-specific evaluation tools. These benchmarks will eventually be released publicly. The first cohort includes select startups focused on practical AI applications. Participating companies will also have the opportunity to work with OpenAI on improving model performance via reinforcement fine-tuning, a method the company recently introduced for customizing expert-level language models.

Ad
Ad
Ad
Ad
Short

OpenAI has introduced an Evals API that enables programmatic test creation and automation. The system lets developers integrate evaluations directly into their workflows while maintaining the same configuration options available in OpenAI's dashboard interface. Through API calls, teams can define test parameters, manage evaluation data, and rapidly refine prompts. While the API works with models from other companies, those models must support OpenAI's "Chat Completions API" format. Technical documentation is available in both the OpenAI Cookbook and API documentation.

Ad
Ad
Short

Amazon has introduced Nova Sonic, a new AI voice model designed to process speech natively and generate natural-sounding responses. The model reportedly matches the performance of leading speech models from OpenAI and Google in key metrics like speed, speech recognition, and call quality. The company has made Nova Sonic available through its Bedrock developer platform at what it claims is an 80% lower cost compared to OpenAI's GPT-4o, though OpenAI does offer a more affordable option with GPT-4o-Mini. Some components of Nova Sonic are already integrated into Amazon's Alexa+ service. According to Rohit Prasad, SVP and Chief Scientist for AGI at Amazon, the model stands out for its ability to handle speech recognition in challenging conditions and efficiently route user requests to various APIs.

Google News