Rising AI Bills Drive Firms Toward Cheaper, Smaller Models

Businesses that once embraced powerful, high-cost AI models are increasingly favouring smaller, cheaper alternatives after facing unpredictable and rising bills tied to usage-based pricing. Executives from major tech companies have signalled that lower-cost models can address much of corporate demand, prompting a shift toward routing tools and open-source options that assign tasks to the most cost-efficient model while preserving premium models for complex work.


Key Points

Companies are shifting workloads to smaller, lower-cost AI models as usage-based pricing and growing token consumption drive up task costs.
Routing tools and marketplaces like OpenRouter are being used to allocate tasks to the most cost-effective model; the open-source share of tokens processed on OpenRouter rose to 65% in June from 34% in January.
Open-source and some Chinese models are attracting interest due to far lower per-token prices, though security concerns could limit adoption in sensitive sectors such as cybersecurity.

Silicon Valley’s dominant, high-priced AI systems were long viewed by many companies as essential investments to keep pace with technological change. That view is now being reevaluated inside corporate IT departments, where an emerging consensus among senior executives is that lower-cost, smaller AI models can meet the bulk of enterprise needs at far lower expense.

Executives at major firms have publicly argued for this recalibration. Leaders including Microsoft’s Satya Nadella, Palo Alto Networks’ Nikesh Arora and Coinbase Global’s Brian Armstrong have suggested that cheaper, more compact models are sufficient for a large share of corporate tasks. This repositioning follows a period in which many organisations encouraged broad and intensive use of AI tools - a trend that has been referred to inside the industry as "tokenmaxxing" - treating rising consumption as a sign of productivity rather than an immediate cost driver.

That calculus has begun to change as AI vendors move away from flat subscription fees toward pricing tied to actual usage, measured in tokens - the units used to quantify AI work. While per-token prices are falling in some cases, companies report that the overall cost to complete common tasks has increased. Products and workflows now require more steps, longer text inputs and larger volumes of data, which drives up the tokens consumed per task and makes costs harder to predict and often higher than anticipated.
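The arithmetic behind that complaint can be sketched in a few lines. The figures below are hypothetical, chosen only to show how a task can become more expensive even when the per-token price is halved; they are not drawn from any vendor's price list, and the function name `task_cost` is an illustrative assumption.

```python
def task_cost(tokens_per_task: int, price_per_million_usd: float) -> float:
    """Cost in USD of one task: tokens consumed times the price per million tokens."""
    return tokens_per_task * price_per_million_usd / 1_000_000


# Hypothetical figures for illustration only (not from any vendor).
# Before: a short, single-step task at a higher per-token price.
before = task_cost(tokens_per_task=20_000, price_per_million_usd=5.00)
# After: the price per token is halved, but a multi-step workflow
# with longer inputs consumes five times as many tokens.
after = task_cost(tokens_per_task=100_000, price_per_million_usd=2.50)

print(f"before: ${before:.2f} per task")
print(f"after:  ${after:.2f} per task")
print(f"change: {after / before:.1f}x")
```

Under these assumed numbers, the per-token price drops by half while the cost per task rises, because token consumption grows faster than the price falls.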

Some companies have already felt the impact sharply. Reports indicate that Uber exhausted its entire 2026 AI budget within the first four months of the year after rapid internal adoption of AI coding tools, prompting management to impose limits on usage. "Changing the license model caught a lot of people by surprise," said Harold Byun, chief executive of BlueRock, a startup that helps companies operate AI systems more safely. He added that customers reported seeing spikes of 20% to 30% in over-budget spending immediately after license changes.

Rising operational costs and shifting strategies

Analysts and corporate IT teams say the way tasks are now structured - with more intermediate steps and greater data ingestion - contributes to the rising token counts per outcome. Industry research cited in recent commentary suggests those costs could become material: Gartner projects that AI-related coding expenses will exceed the average developer’s salary by 2028. A separate survey found three-quarters of executives expecting higher technology budgets this year, with nearly half anticipating double-digit increases.

In response, companies are looking for ways to route workloads to the most cost-effective model. Market tools such as OpenRouter act as marketplaces and routing layers, enabling organisations to send simpler tasks to lower-cost or open-source models, and reserve premium models for compute- and context-intensive uses like complex coding.
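A minimal sketch of that routing idea follows, assuming a simple rule: pick the cheapest model whose capability tier meets the task's requirement. It does not call OpenRouter or any real service. The model names, the capability tiers and the `route` function are hypothetical, and the two prices simply echo the per-million-token figures quoted elsewhere in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                 # hypothetical model label
    price_per_million: float  # USD per million tokens
    capability: int           # hypothetical tier; higher handles harder tasks


# Hypothetical catalog for illustration only.
CATALOG = [
    Model("low-cost-open-model", 0.18, 1),
    Model("premium-model", 4.00, 3),
]


def route(required_capability: int, catalog=CATALOG) -> Model:
    """Return the cheapest model whose capability tier meets the requirement."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model in the catalog can handle this task")
    return min(eligible, key=lambda m: m.price_per_million)


# A routine query goes to the cheaper model; complex coding goes to the premium one.
routine = route(required_capability=1)
complex_coding = route(required_capability=3)
print(routine.name, complex_coding.name)
```

A production router would likely weigh more than price and a single tier, for example latency, context length or data-handling policy, but the cheapest-eligible rule captures the basic trade-off described above.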

Usage data cited in financial research shows a substantial shift on such platforms: the open-source share of tokens processed on OpenRouter rose to 65% in June, up from 34% in January. That trend could advantage open-source model developers, including several Chinese providers that have gained traction with startups but previously struggled to penetrate larger enterprises amid security questions.

Open-source and Chinese models gain ground

Open-source and some Chinese models are attracting attention for their lower cost per token. Research notes that leading Chinese models on marketplaces were charging as little as $0.18 per million tokens, versus roughly $4 per million tokens for some of the top-tier offerings. Proponents of the lower-cost options argue that the performance gap has narrowed substantially; one industry executive estimated open-source models are "90% as good at 10% of the price."

BlueRock’s Byun observed a rapid compression of the capability gap, saying that where open-source models had previously trailed by more than a year, current estimates place that lag at roughly four months. Still, some analysts caution that concerns about the security posture of certain Chinese models will slow their adoption in sensitive sectors, particularly cybersecurity.

Reflecting varied enterprise needs, observers expect companies to take a multi-provider approach similar to cloud computing strategies - spreading workloads across several providers to balance price, performance and risk.

Market and vendor responses

Vendor pricing strategies are also adjusting to the evolving demand mix. There are reports that some leading AI firms are considering substantial price reductions on token usage in anticipation of competition. At the same time, executives warn that a broad move to cheaper pricing could dent revenue growth for AI firms, a concern that gains salience as some vendors prepare for potential public listings. One adviser noted that competition among major AI groups could trigger a price war as firms jockey for IPO readiness.

Investor sentiment around AI valuations showed volatility in the wake of these cost concerns, with technology stocks pulling back amid reassessments of the return on heavy spending and reports of delayed or weak public market outcomes for some AI-related listings.

What companies are doing now

Faced with unpredictable invoices and rising per-task costs, companies are increasingly routing routine or low-complexity queries and jobs to lower-cost models while saving premium models for work that genuinely requires greater context or compute, such as deep code generation. Tools that automatically assign tasks to models based on cost and capability are gaining traction as firms seek to constrain spending without sacrificing productivity.

That practical rebalancing reflects a broader corporate reassessment: leaders who once equated token volume with productivity are now weighing the economics of token consumption more carefully, and retooling procurement and usage monitoring to avoid surprise expenses.

For now, the industry is in transition: pricing models, model capabilities and corporate procurement practices are all shifting in response to mounting AI bills. How companies and vendors adapt to these pressures will shape the structure of enterprise AI deployment in the months ahead.

Risks

Unpredictable and higher AI bills due to usage-based pricing and increased token consumption per task - this primarily affects enterprise IT budgets and technology hardware and software spending.
Potential revenue pressure on AI vendors if widespread price cuts or a shift toward lower-cost models reduces their growth prospects - this affects technology stocks and firms preparing for IPOs.
Security and compliance concerns around Chinese and open-source models could impede adoption in regulated industries, including cybersecurity and other sensitive enterprise sectors.
