Tokenmaxxing and renting the frontier

For about a year, the instruction issued to engineers across big tech was uncomplicated.

Use as much of this AI stuff as you possibly can, the company card has no limit, go forth and consume.

Knowing better than to second-guess directives from the all-seeing, all-knowing C-suite, they went forth, and boy, did they consume.

They called it tokenmaxxing, and they kept leaderboards – up until recently, that is.

The underlying logic (more tokens burned = more work done = a more defensible career) was largely taken as gospel. Scrutinising token usage was the sort of party-pooping nobody was being rewarded for.

Then, somewhere in the last two to three months, bills the length of medieval scrolls started arriving, and those very same once-lauded leaderboards, those small monuments to consumption, were quietly disassembled overnight.

Celebrating quite how much you were spending on a frontier model provider turned out, on reflection, to be a slightly counterintuitive (even bonkers) thing to have been doing.

"People were flexing with how much they paid Anthropic," marvelled Jawad Shreim, co-founder and CEO of Dubai-based cloud cost-optimisation startup Milkstraw AI.

"What's your moat? People flex my bills."

But all good, untrammelled spending eventually comes to an abrupt end. What's followed over the past few weeks has been one of the faster corporate handbrake turns in recent memory.

Meta has told staff it will start rationing AI use after an "exponential" rise in costs, and has begun nudging everyone off third-party tools and onto its own cheaper models. Uber, having discovered that it had spent its entire projected annual AI budget in roughly four months, has now capped its coding tools at $1,500 a month per tool.

In short, CFOs everywhere have sat upright again.

But how are founders building across MENA thinking about token-spending caps and model usage?

Most also have to grapple with much stricter data residency rules, and now face a real problem of access to the latest frontier models: the Trump administration has pulled Anthropic's Mythos offline under export controls, and OpenAI's new GPT-5.6 family is available only to a handful of government-vetted partners. Against that backdrop, the question of how meaningful a role open source could play looms larger.

To get a better idea, we spoke to founders and operators at regional startups all the way from Seed to Series C.

A big thank you to Tewfik Cassis at Lean Technologies, Talal Bayaa at Bayzat, Anis Harb at Algebra AI, Nauman Ali at Orbii, Mohammad Al-Khalili at ClearGrid and Jawad Shreim at Milkstraw AI, for being so generous with their time and insights in speaking with us for this piece.

Building the product

Early-stage companies, at least regionally, aren't finding themselves staring down a nine-figure inference bill. By and large, based on our conversations, they aren't actually rationing anything.

For engineers it's still early nearly everywhere, and companies largely still seem content to let their best people use as much as they can, on the reasonable logic that an engineer shipping three or four times faster is worth almost any bill.

Tewfik Cassis, who controls the engineering opex budget at fintech-infrastructure company Lean Technologies, puts…
