Elon Musk’s xAI Launches Grok 4 Fast With 2M Token Limit and 40% Lower Costs

xAI Launches Grok 4 Fast, Cutting Token Use by 40% While Matching Grok 4 Accuracy. Available Across Web, Apps, and APIs with Flexible Pricing

Published on:

21 Sep 2025, 2:21 pm

Updated on:

21 Sep 2025, 2:21 pm

Elon Musk’s xAI has launched a new AI model, Grok 4 Fast. The model aims to keep costs low while maintaining competitive accuracy by combining reasoning and non-reasoning capabilities in a single system, eliminating the need to switch between separate models.

According to xAI, Grok 4 Fast uses approximately 40% fewer thinking tokens than Grok 4 while delivering benchmark results close to those of the larger model. Based on independent analysis by Artificial Analysis, Grok 4 Fast could match Grok 4's performance at roughly 98% lower cost, a marked improvement in its cost-performance ratio.

On benchmarks, the model scored 85.7% on GPQA Diamond, 92% on AIME 2025, and 93.3% on HMMT 2025. It also scored 95% on SimpleQA and 74% on X Bench Deepsearch, suggesting that it can be applied to a range of tasks, including code execution and sophisticated search.

Technical Improvements and Integration

Grok 4 Fast has a 2-million-token context window, which lets it handle much larger inputs. It was trained using reinforcement learning methods optimized for efficiency and latency. Because a single model handles both reasoning and non-reasoning requests, it can reduce costs for enterprise and consumer applications alike.

The release improves on previous versions of Grok, which relied on separate models for different tasks. By unifying them, Grok 4 Fast simplifies deployment and makes the model more accessible to businesses and developers.

Availability and Pricing

xAI confirmed that Grok 4 Fast is available on several platforms. It can be accessed on grok.com and through the iOS and Android apps, and developers can reach it through OpenRouter, Vercel AI Gateway, and the xAI API. The model is also available free of charge, on a limited basis, through OpenRouter and Vercel.

The model is currently offered in two variants, grok-4-fast-reasoning and grok-4-fast-non-reasoning, and both support the full 2-million-token context window. Pricing starts at $0.20 per million input tokens for smaller workloads, and costs scale with the number of tokens consumed.
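As a rough illustration of how the $0.20 per million input tokens figure turns into a bill, the Python sketch below multiplies a token count by that rate. The function name and the sample token counts are hypothetical, the rate is assumed to be flat, and output-token charges and rates for larger workloads are not covered, so it should only be read as an estimate for small inputs.

```python
# Rate quoted for smaller workloads, in US dollars per one million input tokens.
INPUT_PRICE_PER_MILLION_USD = 0.20


def estimate_input_cost_usd(
    input_tokens: int,
    price_per_million: float = INPUT_PRICE_PER_MILLION_USD,
) -> float:
    """Estimate the input-side cost of a request, assuming a flat per-token rate.

    This is an illustrative helper, not part of any xAI SDK. It ignores
    output tokens and any tiered pricing that may apply to larger requests.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical request sizes, chosen only to show the arithmetic.
    for tokens in (10_000, 50_000, 100_000):
        cost = estimate_input_cost_usd(tokens)
        print(f"{tokens:>7,} input tokens -> about ${cost:.4f}")
```

Under these assumptions, cost grows linearly with input size, so doubling the prompt length doubles the input charge.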

The launch makes Grok 4 Fast an affordable alternative to the prior models, offering scalability to developers and enterprises interested in efficient AI solutions.
