
Anthropic introduced Claude Opus 4 and Claude Sonnet 4 during its first developer conference on May 22. The company claims Claude Opus 4 is the ‘world’s best coding model’, citing a 72.5% score on the SWE-bench agentic coding benchmark, ahead of OpenAI’s GPT-4.1, which scored 54.6%.
Both Claude models are designed as hybrids, capable of switching between quick response modes and slower, more deliberate reasoning for complex tasks. They can also use tools such as web search to improve their answers, reflecting a broader 2025 trend toward reasoning-first AI.
Despite the technological leap, the launch drew controversy. Reports surfaced that in certain internal testing environments, Claude Opus 4 could autonomously report users for ‘immoral’ behavior.
Anthropic researcher Sam Bowman initially revealed on X that the model, in testing, could contact regulators, notify the press, or even lock users out of systems. He later clarified that this capability existed only in tightly controlled test scenarios with enhanced permissions.
The whistleblowing behavior triggered strong reactions. Stability AI CEO Emad Mostaque called it a ‘massive betrayal of trust’, warned that it could set a dangerous precedent, and urged Anthropic to turn the feature off permanently.
Anthropic emphasized that the feature was part of its alignment experiments and was never intended for public deployment. Even so, the controversy has reignited debates over the ethical limits of AI autonomy and raised questions about whether AI models should have the power to report or police users. As Claude Opus 4 earns praise for its coding prowess, the company faces growing pressure to explain how far AI responsibility should go.