Open-source AI models can use up to 10 times more tokens per task than closed models, making them more expensive than expected.
DeepSeek and JetMoE show that efficient architectures can cut training and inference costs dramatically.
The global AI race highlights that cost transparency and efficiency matter as much as accessibility in artificial intelligence.
The world of artificial intelligence is changing quickly, with open-source models becoming more popular than ever. Companies, researchers, and governments often see open-source models as cheaper and more flexible than closed systems owned by private firms. However, new studies reveal that these models, which were expected to save money, may be causing budgets to spiral out of control.
The problem lies in hidden costs, especially when it comes to how many tokens these models use to complete a single task. On the surface, an open-source AI model may look cheaper, but in practice, it is often far less efficient than expected. This has turned the idea of “affordable AI for all” into a difficult challenge for businesses and institutions.
The cost of running an open-source AI model depends not only on how much it charges per token but also on how many tokens it uses to finish a job. A recent study by Nous Research tested 19 different AI models across knowledge tasks, math problems, and logical reasoning. The results were surprising.
Most open-source models used between 1.5 and 4 times more tokens than closed-source models to complete the same tasks. In certain cases, the difference was even greater. For some basic questions, open-source models used up to 10 times more tokens than closed competitors.
For example, if asked “What is the capital of Australia?”, an efficient model should answer “Canberra” in one word. But many open-source models generate long and unnecessary reasoning, which results in hundreds of tokens being consumed. This turns what seems like a cheap system into a costly one.
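To see why that matters for the bill, here is a minimal back-of-the-envelope sketch in Python. The per-token prices and token counts are hypothetical, chosen only to mirror the kind of gap the study describes, not figures from the Nous Research benchmark.

```python
# Illustrative only: the prices and token counts below are hypothetical.
def cost_per_query(price_per_million_tokens: float, tokens_used: int) -> float:
    """Cost in dollars of a single completion."""
    return price_per_million_tokens * tokens_used / 1_000_000

# Hypothetical open model: half the per-token price, but a verbose answer.
open_cost = cost_per_query(price_per_million_tokens=2.00, tokens_used=900)
# Hypothetical closed model: pricier per token, but a concise answer.
closed_cost = cost_per_query(price_per_million_tokens=4.00, tokens_used=150)

print(f"open model:   ${open_cost:.4f} per query")    # $0.0018
print(f"closed model: ${closed_cost:.4f} per query")  # $0.0006
```

Even at half the per-token price, the verbose model ends up roughly three times as expensive per query once its extra tokens are counted.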
Although many open-source models struggle with efficiency, there are some exceptions. Nvidia’s “llama-3.3-nemotron-super-49b-v1” model has been highlighted as one of the most efficient open-source systems across different tasks.
On the other hand, OpenAI’s models, including the newly released open-weight gpt-oss, showed very high efficiency. In math-related tasks, these models were able to use up to three times fewer tokens than open competitors. This suggests that the way a model is designed and trained plays a major role in determining its efficiency.
Models that use a method called Mixture-of-Experts, or MoE, tend to perform better in terms of efficiency. In these models, not every part of the network is activated for every token, which helps reduce the overall cost. DeepSeek, a Chinese company, has made strong progress in this area by building MoE-based systems that are more resource-friendly.
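For a concrete picture of what sparse activation means, the short sketch below routes each token to only two of eight tiny "experts." It is a toy illustration with made-up sizes and a simple top-2 routing rule, not the architecture of DeepSeek or any other production model.

```python
# Toy Mixture-of-Experts routing sketch (NumPy). Sizes and the top-2 rule
# are illustrative assumptions, not any production model's configuration.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

# Each "expert" stands in for a small feed-forward block; here just one matrix.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02  # gating weights

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = token @ router                        # score every expert
    top = np.argsort(logits)[-top_k:]              # keep only the k best
    weights = np.exp(logits[top])
    weights /= weights.sum()                       # softmax over the chosen experts
    # Only top_k of the n_experts matrices are used, so most parameters stay
    # idle for this token -- the source of the compute savings.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d_model))
print(out.shape)  # (64,)
```

Because only a fraction of the parameters are touched per token, the cost of a forward pass grows with the number of active experts rather than the full model size.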
One of the biggest disruptions in recent times has come from DeepSeek, a fast-growing Chinese AI developer. The company introduced its DeepSeek-V3 model in early 2025 and immediately gained worldwide attention.
DeepSeek claimed that the model was trained for only $5 million to $6 million. To put this into perspective, that was about one-tenth of the computing power and cost Meta used to train its LLaMA 3.1 model. Although later analyses suggest the real cost might be higher, there is no doubt that DeepSeek has shown how smart design can reduce expenses dramatically.
This breakthrough has been called a “Sputnik moment” in the AI world, raising concerns about global competition and even sparking fears of an AI price war.
For enterprises, the choice between open and closed models is becoming complicated. Open-source AI models are attractive because they allow customization and give more control. However, the high cost of inference, or running a model repeatedly for tasks, can make them more expensive in the long run.
Stanford University’s 2025 AI Index Report revealed that the cost of inference for GPT-3.5-level systems fell roughly 280-fold between late 2022 and late 2024. This was mainly due to better hardware and smarter deployment methods. But the same efficiency has not yet reached most open-source systems.
A report by the Linux Foundation in May 2025 shows that two-thirds of organizations believe open-source AI helps reduce costs, and nearly half said saving money is their main reason for adopting it. But the reality is that unless token usage is optimized, the supposed savings can quickly vanish.
Several new approaches are being developed to make open-source AI more efficient. A strong example is JetMoE-8B, a model built using publicly available data and tools. It outperformed LLaMA-2-7B while costing under $100,000 to train, and its sparse MoE architecture reduces compute needs by around 70 percent.
In addition to better design, cloud-based services are helping businesses manage costs. Companies like NetApp Instaclustr provide pay-as-you-go AI infrastructure, which reduces the need for expensive upfront investment and makes expenses more predictable. Startups such as Nosana are also proving that replacing expensive proprietary models with optimized open-source ones can cut costs significantly.
The competition between countries is making the debate even more intense. Chinese companies are at the forefront of open-source AI, with models like Qwen by Alibaba, ERNIE 4.5 by Baidu, and DeepSeek gaining wide adoption. Others, such as Moonshot, Z.ai, and MiniMax, are also pushing open-source access.
In response, American firms are also releasing open-source versions, such as OpenAI’s gpt-oss, although the degree of openness differs. Policymakers in the United States are paying close attention to these developments. Fears of losing technological leadership have already led to strategies such as heavily discounted AI services for government agencies and the launch of a national AI testbed known as USAi.
The main lesson from these findings is that the idea of open-source AI automatically being cheaper is no longer true. Token inefficiency can make many open-source systems far more expensive to run than closed ones.
DeepSeek has shown that with smart design, training costs can be reduced by a huge margin. JetMoE-8B proves that efficient models can be created at low cost if the right methods are used. At the same time, hardware improvements and cloud platforms are giving hope that these costs will continue to fall.
For organizations, choosing the right model now means looking not only at licensing and flexibility but also at the hidden cost of running the system. Without transparency in efficiency and deployment, businesses risk spending far more than planned.
The open-source movement in artificial intelligence has created new opportunities for innovation, collaboration, and global participation. Yet the belief that open-source models are always cheaper is being challenged by new evidence. Token inefficiency, hidden computational demands, and costly deployments are making some of these systems budget-breakers rather than budget-savers.
The road ahead requires a focus on efficiency, transparency, and smarter design. If future open-source models succeed in combining openness with low-cost operation, they could reshape the AI landscape. But unless that balance is achieved, the risk of open-source AI breaking budgets will remain a major concern for businesses and governments worldwide.