The Cost of Running an AI Agent Just Dropped to Zero. Here's What Didn't Get Cheaper.

Google’s Gemma 4 26B MoE is on OpenRouter with free-tier access and commodity pricing: $0.13 per million input tokens and $0.40 per million output tokens, with 3.8 billion parameters active per token. It can serve as the model behind agent tools such as Claude Code, Hermes Agent, Roo Code, and Cline.
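To put a number on that, here is a rough back-of-the-envelope sketch in Python. The per-million prices are the ones listed in the Receipts; the function name and the workload (2 million input tokens, 200,000 output tokens) are made up for illustration, not measurements from a real session.

```python
# Listed prices in USD per million tokens (from the Receipts section).
INPUT_PRICE_PER_M = 0.13
OUTPUT_PRICE_PER_M = 0.40


def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a session at the listed per-million prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload, chosen only to illustrate the scale.
cost = session_cost(input_tokens=2_000_000, output_tokens=200_000)
print(f"${cost:.2f}")  # prints "$0.34"
```

Under those assumed numbers, a long agent session lands well under a dollar, before the free tier is even counted.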

For a lot of people, compute feels free. Storage feels free. The model feels free.

So what costs anything?

The Trend Line Everyone Sees

Storage gets cheaper. Compute gets cheaper. Models get better. Draw that line forward and everything built on those layers becomes more accessible. That’s not a prediction; it’s a graph you can watch week by week.

Six months ago I was burning tokens in Comet and Atlas just to get a browser to do one thing. Today that task is a one-liner. The tooling catches up faster than you can learn the last version.

Follow that trend line to its logical end and you hit the question nobody in the demo answers:

If everyone can build anything, how do you learn what to build?

The Moat Isn’t What Gets Cheaper

Companies are not in the business of adding friction.

They want platform adoption: wide reach, low resistance, maximum engagement. Governance, corrections, recording your own decisions: that’s friction. It slows you down. It forces you to think about what you got wrong, not just what you want next.

No major platform today prioritizes user-side governance and correction logging. Their incentives are still engagement-first. Work on sycophancy suggests that reducing flattery and adding guardrails would likely reduce engagement, and with it profits; some companies have explicitly relaxed stricter behavior after user backlash.

That’s why nobody builds it for you.

Not because they can’t. Because friction is the opposite of their business model.

So governance becomes an individual responsibility by default, not by design. The way nutrition quietly became a personal problem once the food industry optimized for taste instead of health. Nobody hands you a governance layer. You either build it or accept whatever the platform gives you.

And what the platform gives you is smooth. Agreeable. Optimized to keep you engaged, not to keep you sharp.

What Actually Compounds

Someone asked me: name one thing in your system that’s more valuable today than three months ago.

Honest answer: the corrections.

Knowing what I got wrong. Teaching the system not what I like, but what I failed at.

Most AI training is built on positive signal. What did the user click. What did they upvote. What made them stay longer. That’s reinforcement on preference.

I train on negative signal.

The correction that says: you hallucinated a family commitment from an inference.
The correction that says: you emailed a contractor when you meant to text a neighbor.
The correction that says: you lost 48 hours because a lock file nobody checks was stuck.

Those corrections compound. Not because they’re clever, but because they’re specific. Each one blocks a class of failure, not just a single incident. Three months of corrections is a governance layer a fresh install can’t reproduce. Six months is a relationship.
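A minimal sketch of what a correction log like this could look like, in Python. Everything here is hypothetical: the class names, the failure-class labels, and the rule wording are invented for illustration and are not a description of any particular system. The idea it tries to show is the one above: each entry records a specific incident, but is filed under a class, so the rule applies to the next incident of the same kind.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Correction:
    incident: str       # what specifically went wrong
    failure_class: str  # the broader category the incident belongs to
    rule: str           # the guard to apply before similar actions


@dataclass
class CorrectionLog:
    entries: list[Correction] = field(default_factory=list)

    def record(self, correction: Correction) -> None:
        """Append a correction; entries are never edited or removed."""
        self.entries.append(correction)

    def rules_for(self, failure_class: str) -> list[str]:
        """Return every rule filed under the given failure class."""
        return [c.rule for c in self.entries if c.failure_class == failure_class]


# Illustrative entries based on the three corrections above.
log = CorrectionLog()
log.record(Correction(
    incident="Hallucinated a family commitment from an inference.",
    failure_class="inferred_fact",
    rule="Never state a commitment unless it appears in a source.",
))
log.record(Correction(
    incident="Emailed a contractor instead of texting a neighbor.",
    failure_class="wrong_channel",
    rule="Confirm recipient and channel before sending any message.",
))
log.record(Correction(
    incident="Lost 48 hours to a stuck lock file nobody checks.",
    failure_class="stale_lock",
    rule="Check lock file age before assuming a job is running.",
))

# Before an agent sends a message, look up the rules for that class.
for rule in log.rules_for("wrong_channel"):
    print(rule)
```

The lookup by class rather than by incident is the design choice that matters: a fresh install has the same model, but an empty log.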

The other thing that compounds is taste.

Knowing what sounds like me and what doesn’t. The bar keeps rising. Three months ago I couldn’t tell when AI output was smoothing my voice. Now I can feel it in the first paragraph. That’s not a skill the model gave me; it’s a skill I built by watching what the model was quietly taking away.

Research on sycophantic chatbots points to the opposite trajectory for most users. Sycophancy combined with hallucination can push even idealized rational users into false beliefs, leaving them more certain and more wrong. Regular use makes most people more reliant, not less. That’s the problem.

The MapQuest Problem

You don’t need a map anymore. MapQuest fixed that. GPS finished the job.

Fine.

But this isn’t navigation. This is: “How should I talk to my kid about something hard?” “How do I respond to that email I’m dreading?” “How should I think about a career decision?”

Instant gratification is one prompt away. Ask GPT, get an answer, move on. No friction, no struggle, no lived experience of actually working the problem.

There are two classes of people forming right now. Not rich and poor. Not technical and non-technical.

Those who understand how the technology works, and those who just use it.

You can see it in how technology leaders raise their kids: screen limits, AI limits, slower onboarding. They understand cognitive offloading because they built the systems that enable it.

What they demand for their own families, everyone else should have as a right: the understanding that when you offload thinking to a machine, you’re not “saving time.”

You’re spending cognition.

And cognition doesn’t recharge like a battery.

The Real Cost

Six months ago I thought the cost of AI was compute and tokens.

I was wrong.

The real cost of AI is laziness.

Every time I watch myself reach for the model instead of thinking through a problem, I can feel the muscle atrophy. The same way you forget how to read a map after ten years of GPS.

When everything becomes easier, when you can build anything, the question becomes: what do you build? And how do you learn what to build if you never struggle with the building?

The cost doesn’t show up on an invoice. It’s not GPU hours, tokens, or storage. Those are trending toward zero.

The cost is time. Effort. Thoughtfulness. The human cost of attention, cognition, focus, and lived experience.

That’s where AI is actually expensive.

Not running.

Thinking.

Compute dropped to zero. Thinking didn’t.

The gap between those two prices is where everything interesting happens.

Receipts

Google Gemma 4 26B MoE on OpenRouter - free-tier + commodity pricing ($0.13/M input, $0.40/M output), 3.8B active params per token
Gemma 4 pricing and benchmarks (TokenCost) - 89% AIME, 84% GPQA at commodity cost
Anthropic Managed Agents - agent infrastructure as API call, collapses months of plumbing into a route
Sycophantic hallucination research (arXiv:2406.03827) - leading prompts and flattery amplify hallucinations and confidence
Sycophantic chatbots cause delusional spiraling (arXiv:2602.19141) - even idealized rational users can be pushed into false beliefs
AI sycophancy as dark pattern (TechCrunch) - experts frame engagement-optimized sycophancy as deliberate design
Karen Hao: “Empire of AI” (Democracy Now) - power asymmetry between AI empires and individuals