Baidu unveiled AI smart glasses and an image generator designed to produce more accurate output than existing systems.
At the Baidu World 2024 conference, Baidu CEO Robin Li introduced I-RAG, a text-to-image system that aims to reduce inaccuracies in AI-generated images, such as output that doesn't match the text input or that contains non-existent elements.
As reported by Reuters and the Financial Times, I-RAG integrates Baidu's search capabilities and Retrieval Augmented Generation (RAG) to improve the alignment between text prompts and generated images.
When discussing the system, Li reportedly used the term "hallucinations," but in a different sense than is usual in AI. The term typically refers to AI text generators making false statements that appear factual, whereas Li was describing what image modelers usually call "prompt alignment": how accurately a model interprets and visualizes user instructions.
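Baidu has not published how I-RAG works internally, so the following is only a minimal sketch of the general retrieval-augmented idea, not Baidu's implementation. It assumes a toy in-memory corpus in place of a search engine and a simple word-overlap score in place of real ranking. All names (`Reference`, `CORPUS`, `retrieve`, `augment_prompt`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """A hypothetical retrieved record describing a real-world entity."""
    title: str
    description: str


# Toy stand-in for a search index (assumption: a real system would query a search engine).
CORPUS = [
    Reference("Great Wall of China", "stone and brick fortification winding over mountain ridges"),
    Reference("Oriental Pearl Tower", "Shanghai TV tower with pink spheres on concrete columns"),
    Reference("Giant panda", "black-and-white bear that eats bamboo"),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without trailing punctuation."""
    return {word.strip(".,!?").lower() for word in text.split()}


def retrieve(prompt: str, corpus: list[Reference], top_k: int = 1) -> list[Reference]:
    """Rank references by how many words their title shares with the prompt."""
    prompt_words = tokenize(prompt)
    scored = [(len(prompt_words & tokenize(ref.title)), ref) for ref in corpus]
    matches = [(score, ref) for score, ref in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [ref for _, ref in matches[:top_k]]


def augment_prompt(prompt: str, corpus: list[Reference]) -> str:
    """Append retrieved reference details to the prompt; return it unchanged if nothing matches."""
    refs = retrieve(prompt, corpus)
    if not refs:
        return prompt
    facts = "; ".join(f"{ref.title}: {ref.description}" for ref in refs)
    return f"{prompt}\nReference details: {facts}"


if __name__ == "__main__":
    print(augment_prompt("A giant panda riding a bicycle", CORPUS))
```

In a full pipeline, the augmented prompt would then be handed to an image model. That step is omitted here because it depends on the specific model and API.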
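Li also reported that Baidu's AI chatbot Ernie, which the company positions as China's answer to ChatGPT, now processes 1.5 billion user requests daily. It's not clear whether this represents growth over the 200 million users reported in May, since the two figures measure different things: each daily user would likely send multiple requests. If all 200 million users were active every day, for example, 1.5 billion requests would work out to about 7.5 requests per user.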
If there had been significant growth, Baidu would likely have stuck with the same metric for better comparability. In any case, that is far less usage than ChatGPT sees. Beyond its consumer offerings, Baidu is looking to add AI capabilities to its cloud and API services.
AI-powered wearable assistant
Baidu's hardware division Xiaodu revealed a new AI headset that functions as a personal assistant. According to Xiaodu CEO Li Ying, the device combines cameras with Ernie's voice model to enable various hands-free functions.
Users can capture photos and videos, track calorie intake, play music, and ask questions about their surroundings, features similar to those found in Meta's smart glasses.
You can watch the full Baidu World 2024 show here. A trailer for the device starts at 02:05:26.
The company also announced Miaoda, a code generation tool built on Baidu's language model. The system aims to make software development more accessible to users without extensive programming expertise.