BrowserBash drives a real browser from plain English. The browser is basically free — the cost is the LLM. Here's the monthly bill for the same suite run daily, across the big APIs vs cheap open models vs self-hosting. Prices are live from OpenRouter; token sizes come from real BrowserBash runs.
Defaults are measured from real BrowserBash runs (a TTACart login + e2e checkout): ~60k input + ~3k output tokens per test. Edit anything to recompute.
Sorted most→least expensive. Savings are vs Claude Sonnet (a typical "good default"). Self-hosted = electricity only after hardware.
| Model | $/M in | $/M out | $ / test | $ / day | Monthly | vs Claude Sonnet |
|---|---|---|---|---|---|---|
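
As a rough sketch of how each row's numbers could be derived from the defaults above (~60k input + ~3k output tokens per test): the per-million prices below are placeholders rather than live OpenRouter quotes, and the 30-day month, the 500 tests/day volume, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    usd_per_m_in: float   # $ per million input tokens
    usd_per_m_out: float  # $ per million output tokens


def cost_per_test(m: Model, in_tokens: int = 60_000, out_tokens: int = 3_000) -> float:
    """Dollar cost of one test: tokens times price, scaled from per-million."""
    return (in_tokens * m.usd_per_m_in + out_tokens * m.usd_per_m_out) / 1_000_000


def monthly_cost(m: Model, tests_per_day: int, days: int = 30) -> float:
    """Monthly bill for the same suite run every day (30-day month assumed)."""
    return cost_per_test(m) * tests_per_day * days


def savings_vs(baseline: Model, other: Model, tests_per_day: int) -> float:
    """Fraction saved relative to the baseline model (0.9 means 90% cheaper)."""
    return 1 - monthly_cost(other, tests_per_day) / monthly_cost(baseline, tests_per_day)


# Placeholder prices for illustration only, NOT live quotes.
baseline = Model("baseline (placeholder)", usd_per_m_in=3.00, usd_per_m_out=15.00)
cheap = Model("cheap open model (placeholder)", usd_per_m_in=0.10, usd_per_m_out=0.40)

if __name__ == "__main__":
    for m in (baseline, cheap):
        print(f"{m.name}: ${cost_per_test(m):.4f}/test, ${monthly_cost(m, 500):,.2f}/month")
    print(f"savings vs baseline: {savings_vs(baseline, cheap, 500):.1%}")
```

Because input tokens outnumber output tokens roughly 20 to 1 at these defaults, the input price tends to dominate the per-test cost even when the output price is several times higher.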
These aren't guesses. This session we ran the TTACart suite through BrowserBash on cheap models:
The lesson: a low price (or a "supports tools" flag) doesn't mean a model can drive an agent. Always test before you trust the row — the "drives the agent" badges above are from real BrowserBash runs.
A Mac with an M4 Max + 128 GB unified memory can run Gemma-class models (e.g. Gemma 3 27B) locally with Ollama. After buying the box, the LLM is electricity only — ~$0 per token.
Caveat: one Mac serves the model serially — realistic throughput is a few hundred agent runs/day, so ~500/day is doable on a single box. Scaling to thousands/day wants a cloud model or a small GPU fleet.
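The serial-throughput caveat can be sanity-checked with simple arithmetic. This is a back-of-the-envelope sketch, not a benchmark: the 180 seconds per agent run, the 100 W average draw, and the $0.15/kWh electricity rate are all assumed values, so substitute your own measurements.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def serial_runs_per_day(seconds_per_run: float, utilization: float = 1.0) -> int:
    """Runs one box can finish per day when it serves one run at a time."""
    return math.floor(SECONDS_PER_DAY * utilization / seconds_per_run)


def electricity_usd_per_month(avg_watts: float, usd_per_kwh: float,
                              hours_per_day: float = 24, days: int = 30) -> float:
    """Monthly electricity cost of keeping the box running."""
    kwh = avg_watts / 1000 * hours_per_day * days
    return kwh * usd_per_kwh


if __name__ == "__main__":
    # Assumed: 180 s per agent run, fully utilized around the clock.
    print(serial_runs_per_day(180), "runs/day")
    # Assumed: 100 W average draw at $0.15 per kWh.
    print(f"${electricity_usd_per_month(100, 0.15):.2f}/month electricity")
```

Under these assumptions a single box lands in the few-hundred-runs/day range, which is why thousands of runs per day would need either parallel hardware or a hosted model.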
You're running the same suite every day. A browser agent shouldn't re-pay the LLM for an identical flow.
For a fixed 5k/day suite, caching would push the monthly LLM cost toward $0 regardless of model. It's on the BrowserBash roadmap.
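To illustrate the idea only: caching is a roadmap item, so the sketch below is hypothetical and does not describe BrowserBash's design. The `FlowCache` class, the `page_fingerprint` argument, and the `fake_planner` stand-in for an LLM call are all invented for this example. The principle is to key the recorded actions on everything that defines the flow, and to call the model only on a cache miss.

```python
import hashlib
import json


class FlowCache:
    """Hypothetical cache: replay recorded actions for an identical flow."""

    def __init__(self, plan_with_llm):
        self._plan = plan_with_llm  # callable that would hit the LLM
        self._store = {}
        self.llm_calls = 0

    @staticmethod
    def key(instruction: str, start_url: str, page_fingerprint: str) -> str:
        # Any change to the instruction, URL, or page invalidates the entry.
        payload = json.dumps([instruction, start_url, page_fingerprint])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def actions_for(self, instruction: str, start_url: str, page_fingerprint: str):
        k = self.key(instruction, start_url, page_fingerprint)
        if k not in self._store:
            self.llm_calls += 1  # pay for the model only on a miss
            self._store[k] = self._plan(instruction, start_url)
        return self._store[k]


def fake_planner(instruction: str, start_url: str):
    """Stand-in for the LLM; returns a fixed, made-up action list."""
    return [("goto", start_url), ("click", "login")]


if __name__ == "__main__":
    cache = FlowCache(fake_planner)
    for _ in range(3):
        cache.actions_for("log in", "https://example.test", "v1")
    print("LLM calls for 3 identical runs:", cache.llm_calls)
```

A real implementation would also need a fallback to the model when a replayed action fails, since pages can change in ways a fingerprint does not capture.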
* free tier: gpt-oss, Qwen and Llama have :free variants on OpenRouter. They cost $0 but are rate-limited (fine for dev/low volume, not sustained thousands/day).
Numbers are rough estimates (±2×). Browser-agent cost is dominated by re-sending the page accessibility tree every step, so your real token count depends on page size and flow length.
Prices are fetched live from OpenRouter and change over time. This page is for planning, not a quote.
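Since input tokens scale with how often the accessibility tree is re-sent, a rough estimate for your own suite could look like the sketch below. The split of 20 steps at about 3,000 tokens each is an illustrative assumption that happens to multiply out to the ~60k default; it is not a measured breakdown, and the function names are made up for this example.

```python
def estimate_input_tokens(steps: int, tree_tokens_per_step: int,
                          fixed_prompt_tokens: int = 0) -> int:
    """Input tokens if the tree (plus any fixed prompt) is re-sent every step."""
    return steps * (tree_tokens_per_step + fixed_prompt_tokens)


def planning_band(estimate: int, factor: float = 2.0) -> tuple[int, int]:
    """Low/high bounds for a rough estimate that may be off by `factor` either way."""
    return int(estimate / factor), int(estimate * factor)


if __name__ == "__main__":
    # Assumed flow: 20 agent steps, ~3,000 tokens of page tree per step.
    tokens = estimate_input_tokens(steps=20, tree_tokens_per_step=3_000)
    low, high = planning_band(tokens)
    print(f"~{tokens:,} input tokens per test (plan for {low:,} to {high:,})")
```

Doubling either the page size or the number of steps doubles the input tokens, which is why long flows on heavy pages can cost far more than the defaults suggest.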