Thinking Machines wants large language models to give consistent answers every time

AI startup Thinking Machines wants to make large language models more predictable. The team is studying why these models sometimes give different answers to the same question even when the temperature is set to 0, a setting that should make the model always pick the most probable next token and therefore return the same answer every time.

Even with the temperature set to 0, DeepSeek 3.1 generates different answers to the same query. | Image: Thinking Machines

According to Thinking Machines, GPU floating-point precision alone does not explain the problem; the team calls that explanation "not entirely wrong" but says it "doesn’t reveal the full picture." Server load also plays a role: under heavy load, the same model can produce slightly different results for the same input. To fix this, the team developed a custom inference method that keeps outputs consistent regardless of system load. This kind of predictable behavior could make AI-supported research more reliable.
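To make the precision point concrete, here is a minimal Python sketch of how the grouping of a floating-point sum can change its result, and how that can flip a temperature-0 choice. It is an illustration under stated assumptions, not Thinking Machines' method: `naive_sum`, `chunked_sum`, `greedy_pick`, the `chunk_size` parameter, and all numbers are hypothetical, and the idea that load changes how a sum is grouped is assumed here for illustration. The values are deliberately extreme so the effect is visible; in practice the differences would be far smaller.

```python
def naive_sum(values):
    """Add floats strictly left to right, rounding after every step."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk separately, then add the partial sums.

    Hypothetical stand-in for a computation whose grouping depends on
    how the work is split up (assumed here to vary with load).
    """
    partials = [
        naive_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


def greedy_pick(logits):
    """Temperature-0 decoding: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Illustrative contributions to one token's score. Floating-point
# addition is not associative, so the grouping changes the rounding.
contributions = [1e16, 1.0, -1e16, 1.0]

score_one_chunk = chunked_sum(contributions, chunk_size=4)
score_two_chunks = chunked_sum(contributions, chunk_size=2)

# A competing token with a fixed, made-up score of 0.5.
pick_one_chunk = greedy_pick([score_one_chunk, 0.5])
pick_two_chunks = greedy_pick([score_two_chunks, 0.5])

print(score_one_chunk, score_two_chunks)
print(pick_one_chunk, pick_two_chunks)
```

The same inputs yield two different scores and therefore two different greedy picks. If the grouping is held fixed (the same `chunk_size` on every call in this sketch), the result is identical on every run, which is the kind of load-independent behavior the article describes.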
