
OpenAI’s live-streamed GPT-5 launch on Thursday was designed to underscore its leap over previous models and rivals. Instead, part of the spotlight fell on a glaring presentation error: charts that misrepresented the very data they were meant to display.
One bar chart comparing GPT-5 with the firm's o3 model on a ‘coding deception’ metric drew o3's bar far taller than GPT-5's, even though the figures displayed onstage, 50.0% for GPT-5 versus 47.4% for o3, were nearly identical; on a deception measure, where lower is better, the labeled GPT-5 number was in fact slightly worse. In the revised blog post, GPT-5's deception rate was corrected to 16.5%, a sharp departure from the number shown onstage.
A second graph contrasted GPT-5, o3, and GPT-4o on another performance measure. Although the scores, 74.9 for GPT-5, 69.1 for o3, and 30.8 for GPT-4o, differ widely, the bars for o3 and GPT-4o were drawn at nearly the same height, making the two models look far closer in performance than the numbers indicate.
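To see why the second chart misled, here is a minimal matplotlib sketch (illustrative only, not OpenAI's charting code) that plots the three reported scores to scale from a zero baseline; drawn this way, the gap between o3 and GPT-4o is unmistakable.

```python
import matplotlib.pyplot as plt

# Scores as reported in the presentation (per the article).
models = ["GPT-5", "o3", "GPT-4o"]
scores = [74.9, 69.1, 30.8]

fig, ax = plt.subplots(figsize=(5, 3))
ax.bar(models, scores)
ax.set_ylim(0, 100)  # a full 0-100 axis keeps bar heights proportional to the values
ax.set_ylabel("Score (%)")
ax.set_title("Reported scores drawn to scale from a zero baseline")
for i, s in enumerate(scores):
    ax.text(i, s + 2, f"{s}", ha="center")  # label each bar with its value
plt.tight_layout()
plt.show()
```

With a shared zero baseline, the o3 bar is more than twice the height of the GPT-4o bar, which is exactly the relationship the launch chart obscured.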
OpenAI CEO Sam Altman described the incident on X (formerly Twitter) as a ‘mega chart screwup’ and said corrected graphics were now on the company blog. A marketing staffer also apologized online, calling it an ‘unintentional chart crime.’
OpenAI says GPT-5 significantly improves on GPT-4o in accuracy, speed, reasoning, structured thinking, context awareness, and problem-solving. It ships as a single system with two parts: an efficient model for routine work and a ‘GPT-5 Thinking’ engine for more difficult reasoning.
For the first time, users no longer have to pick a model manually: a real-time router, trained on live usage signals, selects the appropriate system almost instantly.
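OpenAI has not published how that router works beyond saying it is trained on usage signals, so the following Python sketch is purely illustrative: it routes a request to a fast tier or a reasoning tier using simple surface heuristics, with hypothetical model names, hints, and thresholds.

```python
# Hypothetical sketch of a model router. OpenAI has not released its design,
# so the signals, thresholds, and model names below are illustrative only.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    wants_tools: bool = False  # e.g., the user asked for code execution

REASONING_HINTS = ("prove", "step by step", "debug", "why", "plan")

def route(req: Request) -> str:
    """Pick a model tier from cheap surface features of the request."""
    long_prompt = len(req.prompt.split()) > 150
    hinted = any(h in req.prompt.lower() for h in REASONING_HINTS)
    if req.wants_tools or long_prompt or hinted:
        return "gpt-5-thinking"  # slower, deeper reasoning tier
    return "gpt-5-main"          # fast tier for routine requests

print(route(Request("What's the capital of France?")))         # gpt-5-main
print(route(Request("Debug why this function returns None")))  # gpt-5-thinking
```

The real system is described as learning from live signals rather than keyword rules, but the basic shape, a cheap classifier deciding which model answers, is the same idea.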
The firm states that GPT-5 is significantly better at coding, able to produce apps, games, and websites from natural-language instructions, and it also claims gains on writing tasks and health-related questions.
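As a concrete illustration of the natural-language-to-code claim, here is a short sketch using the OpenAI Python SDK's Responses API; the ‘gpt-5’ model identifier follows the launch announcement, and the prompt is just an example.

```python
# Sketch of asking GPT-5 for a small web page via the OpenAI Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()
response = client.responses.create(
    model="gpt-5",
    input="Build a single-file HTML page with a working to-do list.",
)
print(response.output_text)  # the generated HTML
```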
Even with the launch-day misstep, OpenAI frames GPT-5 as its most powerful and flexible AI yet, combining speed, efficiency, and deeper reasoning in a single package.