Short

Wan2.2 A14B now tops the rankings for open-source video models, according to Artificial Analysis. It ranks seventh for text-to-video and fourteenth for image-to-video, with the lower image-to-video placement likely due to its 16 frames per second output, compared with the 24 fps some competitors deliver. Among open models, Wan2.2 A14B leads the field, but it still trails closed models like Veo 3 and Seedance 1.0 in overall performance. Depending on the provider, however, it is often far cheaper to run than those closed models.

Image: Artificial Analysis
Short

Uber Eats now manipulates food images using generative AI.

Uber Eats is now using generative AI to identify and enhance low-quality food photos on its menus. The technology does more than adjust lighting, resolution, or cropping: it can move food onto different plates or backgrounds, and even modify the food itself, making portions look bigger or digitally filling in gaps for a more polished look.

This approach goes further than traditional retouching or generic stock photos: the AI can generate convincing images of dishes that, in some cases, never existed in that form.

Image: Uber
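
Uber hasn't published details of its pipeline, so the following is only a generic sketch of the underlying technique, diffusion-based inpainting, which repaints a masked region of a photo from a text prompt. The checkpoint, file names, and prompt are illustrative assumptions, not Uber's system.

    import torch
    from diffusers import StableDiffusionInpaintPipeline
    from PIL import Image

    # Generic inpainting sketch -- NOT Uber's pipeline. The checkpoint is an
    # assumption; any diffusion inpainting model would demonstrate the idea.
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-inpainting",
        torch_dtype=torch.float16,
    ).to("cuda")

    photo = Image.open("dish.jpg").convert("RGB")     # original menu photo
    mask = Image.open("gap_mask.png").convert("RGB")  # white = region to repaint

    # The model fills only the masked region, e.g. a sparse-looking part of the
    # plate, guided by the text prompt.
    result = pipe(
        prompt="a generously filled plate of pasta, studio food photography",
        image=photo,
        mask_image=mask,
    ).images[0]
    result.save("dish_retouched.jpg")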
Short

Cohere's new Command A Vision model is designed to handle images, diagrams, PDFs, and other types of visual data. Cohere says the model outperforms GPT-4.1, Llama 4 Maverick, Pixtral Large, and Mistral Medium 3 on standard vision benchmarks.

The model's OCR can recognize both the text and the structure of documents such as invoices and forms, outputting the extracted data in structured JSON. Command A Vision can also process real-world images, like identifying potential risks in industrial environments, the company says.

Image: Cohere
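
Cohere's announcement doesn't include a request example; below is a minimal sketch of invoice extraction against the hosted model, assuming the model id and the image content-block shape of Cohere's v2 chat API (verify both against the current docs):

    import base64
    import cohere

    co = cohere.ClientV2()  # reads the CO_API_KEY environment variable

    with open("invoice.png", "rb") as f:
        image_uri = "data:image/png;base64," + base64.b64encode(f.read()).decode()

    # Model name and content-block shape are assumptions based on Cohere's
    # v2 chat API for its vision models -- check the docs before relying on them.
    response = co.chat(
        model="command-a-vision-07-2025",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract vendor, date, line items, and total from this invoice as JSON."},
                {"type": "image_url", "image_url": {"url": image_uri}},
            ],
        }],
    )
    print(response.message.content[0].text)

The prompt simply asks for JSON; in practice you would also validate the model's output against a schema before using it downstream.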

Command A Vision is available through the Cohere platform and for research on Hugging Face. The model can run locally with either two A100 GPUs or a single H100 using 4-bit quantization.
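
The two-A100-or-one-H100 figure points to standard 4-bit weight quantization. A minimal local-loading sketch with Hugging Face transformers and bitsandbytes, where the repo id and model class are assumptions to check against the model card:

    import torch
    from transformers import AutoModelForImageTextToText, AutoProcessor, BitsAndBytesConfig

    # Repo id and model class are assumptions -- verify on the model card.
    model_id = "CohereLabs/command-a-vision-07-2025"

    # 4-bit weight quantization via bitsandbytes is what makes a single H100
    # sufficient instead of two A100s.
    quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",
    )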

Short

Black Forest Labs and Krea AI have released FLUX.1 Krea [dev], an open text-to-image model designed to generate more realistic images with fewer of the exaggerated, AI-typical textures.

The model is based on FLUX.1 [dev] and remains fully compatible with its architecture. It was built for flexible customization and easy integration into downstream applications. Model weights are available on Hugging Face, with commercial licenses offered through the BFL Licensing Portal. Partners like FAL, Replicate, Runware, DataCrunch, and TogetherAI provide API access.
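
Because it keeps FLUX.1 [dev]'s architecture, the weights should load through the same diffusers pipeline class. A minimal sketch, with the repo id assumed from the release naming:

    import torch
    from diffusers import FluxPipeline

    # Repo id assumed from the release naming -- verify on Hugging Face.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-Krea-dev",
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    image = pipe(
        prompt="a rainy street at dusk, candid photo",
        guidance_scale=3.5,
        num_inference_steps=28,
    ).images[0]
    image.save("krea_test.png")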

Short

Google is rolling out Opal, a new experimental tool that lets users build AI-powered mini-apps with simple natural language prompts, no coding required.

Opal takes descriptions written in everyday language and automatically connects prompts, AI models, and other tools to turn them into working apps, displaying everything as a visual workflow.
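
Google hasn't published how Opal wires these pieces together, so purely as an illustration of the concept: such a mini-app can be modeled as a small graph of steps, each a prompt or tool call whose output feeds later steps. All names below are hypothetical.

    from dataclasses import dataclass, field

    # Purely illustrative -- not Opal's implementation. Each step is a node in
    # a workflow; "inputs" names earlier steps whose output feeds this one.
    @dataclass
    class Step:
        name: str
        kind: str       # "prompt" (LLM call) or "tool"
        template: str   # prompt text with {placeholders}
        inputs: list = field(default_factory=list)

    # "Summarize a URL and draft a tweet about it" as a two-step workflow.
    workflow = [
        Step("summarize", "prompt", "Summarize this page: {url}"),
        Step("draft_tweet", "prompt", "Write a tweet about: {summarize}", ["summarize"]),
    ]

    def run(workflow, context, call_model):
        # Walk the steps in order, filling each template from earlier outputs.
        for step in workflow:
            assert step.kind == "prompt"  # tool steps omitted in this sketch
            context[step.name] = call_model(step.template.format(**context))
        return context

    # Example with a stub model:
    out = run(workflow, {"url": "https://example.com"},
              call_model=lambda p: f"(model output for {p!r})")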

Once an app is built, users can share it with others, who only need a Google account to use it. Opal is launching as a public beta in the US, with plans to develop the tool further based on feedback from the community.
