OpenAI's New Models o3 and o4-mini Show Increased ‘Hallucination’ Rates!

When AI Thinks Too Much: OpenAI’s o3 and o4-mini Struggle with Hallucination
Written By:
Anudeep Mahavadi

OpenAI has released its latest models, o3 and o4-mini, to improve reasoning performance and deliver clearer responses to user prompts. However, internal testing indicates that these models have higher hallucination rates than their predecessors.

Increased Hallucination Rates in o3 and o4-mini

The new models show notably high hallucination rates. According to OpenAI's internal evaluations on the PersonQA benchmark, o3 hallucinated in 33% of responses and o4-mini in 48%, while the older models o1 and o3-mini hallucinated in 16% and 14.8% of responses, respectively.
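To make concrete what a figure like "33% on PersonQA" means, here is a minimal sketch of how a hallucination rate on a QA benchmark could be computed. The QAResult structure and the exact-match grading rule are hypothetical illustrations, not OpenAI's actual evaluation code; real benchmark grading is typically more sophisticated.

```python
# Hypothetical sketch of computing a hallucination rate on a QA benchmark.
# The records and grading rule below are illustrative only; they are not
# OpenAI's actual PersonQA evaluation methodology.

from dataclasses import dataclass

@dataclass
class QAResult:
    question: str
    model_answer: str
    reference_answer: str

def is_hallucination(result: QAResult) -> bool:
    # Simplistic grading rule for illustration: count an answer as a
    # hallucination if it does not match the reference. Real evaluations
    # usually use more careful (often model-assisted) grading.
    return result.model_answer.strip().lower() != result.reference_answer.strip().lower()

def hallucination_rate(results: list[QAResult]) -> float:
    # Fraction of answers graded as hallucinations.
    if not results:
        return 0.0
    return sum(is_hallucination(r) for r in results) / len(results)

# Example: 1 fabricated answer out of 3 yields a rate of about 33%.
sample = [
    QAResult("Where was Ada Lovelace born?", "London", "London"),
    QAResult("In what year did she die?", "1852", "1852"),
    QAResult("Who was her father?", "Charles Babbage", "Lord Byron"),  # wrong claim
]
print(f"Hallucination rate: {hallucination_rate(sample):.0%}")  # prints 33%
```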

Why Could This Be Happening?

Experts have conjectured that the reinforcement learning methods used to train these models could amplify hallucinations. The sharp rise in hallucination rates raises questions about how much AI-generated material can be trusted, particularly in high-stakes fields where factual accuracy is critical.

Need for Further Research

OpenAI has acknowledged the problem and stated that further research is needed to understand why hallucination rates rise as reasoning models are scaled up. That is an important step, since understanding the cause is a prerequisite to building AI systems users can trust.

Need for Innovation in Light of Reasoning Errors

The o3 and o4-mini models may be more advanced in terms of AI reasoning, but their increased hallucination rates highlight the trade-off between innovation and accuracy. Further research and development will be needed to maximize the reliability of AI technologies.
