OpenAI's New Models o3 and o4-mini Show Increased ‘Hallucination’ Rates

When AI Thinks Too Much: OpenAI’s o3 and o4-mini Struggle with Hallucination

Written By: Anudeep Mahavadi

OpenAI has released its latest models, o3 and o4-mini, which aim to improve reasoning performance and give clearer responses to user prompts. However, internal testing indicates that these models hallucinate more often than older models.

Increased Hallucination Rates in o3 and o4-mini

Both o3 and o4-mini show high hallucination rates. According to internal evaluations, o3 hallucinated in 33% of responses on the PersonQA benchmark and o4-mini in 48%, compared with 16% for the older o1 and 14.8% for o3-mini.
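
To put these figures side by side, the short sketch below divides each reported rate by o1's rate. The rates are the ones quoted above. The helper name `relative_to` and the choice of o1 as the baseline are assumptions made for illustration and are not part of OpenAI's evaluation.

```python
# Hallucination rates on the PersonQA benchmark as quoted in this article,
# expressed as a percentage of responses.
rates = {"o1": 16.0, "o3-mini": 14.8, "o3": 33.0, "o4-mini": 48.0}


def relative_to(baseline: str, rates: dict[str, float]) -> dict[str, float]:
    """Return each model's rate divided by the baseline model's rate.

    Hypothetical helper for illustration only.
    """
    base = rates[baseline]
    return {model: rate / base for model, rate in rates.items()}


# Compare every model against o1 (an arbitrary baseline chosen for this sketch).
ratios = relative_to("o1", rates)
for model, ratio in ratios.items():
    print(f"{model}: {rates[model]:.1f}% of responses ({ratio:.2f}x the o1 rate)")
```

On these numbers, o3's rate is roughly twice o1's and o4-mini's is three times o1's.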

Why Could This Be Happening?

Experts have conjectured that the reinforcement learning methods used to train these models could amplify hallucinations. The sharp rise in hallucination rates raises the question of how far AI-generated material can be trusted, especially in high-stakes fields where accurate knowledge is critical.

Need for Further Research

OpenAI has acknowledged the problem and said that more research is needed to understand why hallucination rates rise as reasoning models are scaled up. That is an important step, since the first challenge is to establish whether AI systems can be trusted at all.

Need for Innovation in Light of Reasoning Errors

The o3 and o4-mini models may be more advanced in AI reasoning, but their higher hallucination rates point to a trade-off between innovation and accuracy. Further research and development will be needed to make AI technologies more reliable.
