We pointed BrowserBash at our own Playwright suite

The Testing Academy maintains a production Playwright + TypeScript framework that tests TTACart, our demo store — login, add to cart, full checkout, order confirmation. We took that exact end-to-end journey and rewrote it as one plain-English file, then ran it with a single command on a free local model — driving a real browser against the live app, the whole session recorded for replay.

Real repo, open source · Playwright 1.60 + TS · Same journey, plain English · $0 local model
  • 6 → 1 · Six page-object classes replaced by one plain-English file
  • 0 · CSS / data-test selectors to write or maintain
  • 1 · Command runs the whole login → checkout journey
  • $0 · Cost on a local Ollama model, with no API key and no grid

A real, production Playwright framework

AdvancePlaywrightFramework1x is our open-source, batteries-included Playwright + TypeScript suite: the Page Object Model, custom fixtures, Faker data factories, a Winston logger, Allure plus a custom TTA-branded HTML reporter, and a GitHub Actions pipeline. It exercises TTACart, a SauceDemo-style store at app.thetestingacademy.com/playwright/ttacart, across login, inventory, cart and a three-step checkout.

  • login.spec.ts — signs in as standard_user and asserts the form is gone.
  • e2e-checkout.spec.ts — the flagship: login → add item → cart → checkout → “Thank you for your order!”
  • apiTests/ — a serial CRUD flow (token → create → update) that runs green in CI.
  • Recorded by design — Playwright video on, per-step screenshots, traces on retry.
The point of the case study

This isn’t a toy. It’s a maintained suite that passes in CI. We wanted to know: could the same coverage be written by anyone on the team in plain English — and still drive the real browser? It can.

From six page objects to one paragraph

Before: the real e2e-checkout.spec.ts, with page objects, fixtures, selectors and assertions. After: the exact same journey as examples/ttacart_checkout_test.md, shipped in the BrowserBash repo. No selectors. No page objects. Just intent.

before · playwright + typescript
e2e-checkout.spec.ts
// e2e-checkout.spec.ts  ·  AdvancePlaywrightFramework1x
test.beforeEach(async ({ loginPage }) => {
  await loginPage.open();
  await loginPage.loginAs(
    credentials.standardUser, credentials.password);
});

test('should complete checkout successfully', async ({
  inventoryPage, cartPage, checkoutStepOnePage,
  checkoutStepTwoPage, checkoutCompletePage,
}) => {
  const customer = DataGenerator.checkoutCustomer();
  await inventoryPage.open();
  await inventoryPage.addToCart('test-allthethings-tshirt-red');
  await cartPage.open();
  expect(await cartPage.rowCount()).toBe(1);
  await cartPage.checkout();
  await checkoutStepOnePage.fillGuest(customer);
  await checkoutStepOnePage.continue();
  await checkoutStepTwoPage.finish();
  await checkoutCompletePage.assertOrderComplete();
});

// + 6 page objects (Login, Inventory, Cart, CheckoutStepOne,
//   CheckoutStepTwo, CheckoutComplete), fixtures, BasePage,
//   UtilElementLocator, data-test selectors, Faker factories…
after · browserbash markdown
ttacart_checkout_test.md
# TTACart end-to-end checkout

- Open the TTACart login page
- Log in as standard_user with the password tta_secret
- Go to the products inventory page
- Add the "Test.allTheThings() T-Shirt (Red)" to the cart
- Open the cart and verify it contains exactly 1 item
- Click Checkout
- Fill the checkout details: first name Pramod,
  last name Dutta, postal code 560001
- Continue to the order overview, then click Finish
- Verify the page shows "Thank you for your order!"
Same intent, masked secrets

Credentials shown here are TTACart’s public demo creds. For real apps, pass values as {{variables}} — BrowserBash masks them as ***** in every log line, event and summary.
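To make the {{variables}} idea concrete, here is a minimal TypeScript sketch of placeholder substitution followed by log masking. It is an illustration only, not BrowserBash's implementation: the helper names renderStep and maskSecrets, the Variables type, and the choice to leave unknown placeholders untouched are all assumptions.

```typescript
// Hypothetical helpers: a sketch of {{variable}} substitution and log masking.
// NOT BrowserBash's implementation; names and behaviour are assumptions.

type Variables = Record<string, string>;

/** Replace each {{name}} placeholder with its value; unknown names are left as-is. */
function renderStep(template: string, vars: Variables): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    return value ?? match;
  });
}

/** Replace every occurrence of a variable's value with a fixed mask. */
function maskSecrets(line: string, vars: Variables): string {
  // Mask longer values first so a short value that is a substring of a
  // longer one cannot leave part of the longer secret visible.
  const values = Object.values(vars)
    .filter((v) => v.length > 0)
    .sort((a, b) => b.length - a.length);
  return values.reduce((out, value) => out.split(value).join("*****"), line);
}

const vars: Variables = { username: "standard_user", password: "tta_secret" };
const step = renderStep("Log in as {{username}} with the password {{password}}", vars);
const logged = maskSecrets(`act: ${step}`, vars);
console.log(logged); // act: Log in as ***** with the password *****
```

The design point is that the real value only exists in the string handed to the browser, while anything written to a log passes through the mask first.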

One command runs the whole journey

No build step, no page.locator, no waiting code. The AI agent reads each line, finds the element on the live page, acts, and keeps the logged-in session alive from the first step to the confirmation screen.
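As a rough picture of what "reads each line" could involve, the sketch below splits a *_test.md file into a title and an ordered list of steps. It assumes only the layout visible in ttacart_checkout_test.md (one "# " title, "- " bullets, indented continuation lines); the parseTestMarkdown function and TestPlan type are hypothetical and say nothing about how BrowserBash actually parses the file.

```typescript
// Hypothetical parser for the *_test.md layout shown in this article.
// An illustration under stated assumptions, not BrowserBash's real parser.

interface TestPlan {
  title: string;
  steps: string[];
}

function parseTestMarkdown(source: string): TestPlan {
  let title = "";
  const steps: string[] = [];
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    if (line.startsWith("# ")) {
      title = line.slice(2).trim(); // "# Title" becomes the plan title
    } else if (line.startsWith("- ")) {
      steps.push(line.slice(2).trim()); // each bullet is one step
    } else if (steps.length > 0) {
      // A non-bullet line after a bullet continues the previous step.
      const last = steps.pop() ?? "";
      steps.push(`${last} ${line}`);
    }
  }
  return { title, steps };
}

const plan = parseTestMarkdown(
  [
    "# TTACart end-to-end checkout",
    "",
    "- Click Checkout",
    "- Fill the checkout details: first name Pramod,",
    "  last name Dutta, postal code 560001",
    '- Verify the page shows "Thank you for your order!"',
  ].join("\n"),
);
console.log(plan.title, plan.steps.length); // TTACart end-to-end checkout 3
```

Each entry in steps would then be handed to the agent in order, inside the same browser context.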

zsh — the whole run
$ browserbash testmd run \
  examples/ttacart_checkout_test.md --record --upload
Engine: stagehand (MIT, stagehand.dev)
Recording session video (--record)
 opening app.thetestingacademy.com/playwright/ttacart in local Chromium
 login · inventory · add to cart · checkout · finish
 one browser context across every step · recorded with --record
1. Write the intent
Plain-English steps in a committable *_test.md, or generate them from an existing spec.

2. Run one command
Local Chrome by default. Add --provider lambdatest or browserstack for a grid.

3. Drive a real browser
The agent finds elements live, and the session stays logged in through checkout.

4. Get a verdict + replay
An exit code, a Result.md, and a recorded video, uploaded to your dashboard.

Plain English in, real TTACart out

With --record, BrowserBash captures a session video and a screenshot on every engine (the builtin engine also saves a Playwright trace). Below is the actual frame BrowserBash captured driving a local Chromium against the live TTACart — not a mockup, the real app.

[Screenshot: app.thetestingacademy.com/playwright/ttacart/index.html]
Real --record capture: BrowserBash opening TTACart in a local Chromium. The CLI keeps everything on your machine until you add --upload.

Every run, recorded and replayable

Add --upload (after a one-time browserbash connect) and the run streams to your free BrowserBash dashboard: run history, status, the video replay, and a per-run page you can share with the team. This is the “showcase in a dashboard” part — a living record of every TTACart journey, not a wall of CI logs.

  • Run history — every objective, with pass / fail and duration.
  • Video replay — watch exactly what the agent saw, step by step.
  • Per-run share link — send a teammate the replay, not a stack trace.
  • Free tier — uploaded runs kept 15 days; optional retention for longer.

A failure tells you exactly where

Tests exist to catch regressions, so the failure path matters as much as the happy path. If TTACart ever stopped saying “Thank you for your order!”, BrowserBash marks that step failed, captures a screenshot at the point of failure, writes the reason to Result.md, and exits non-zero so CI goes red — no prose to parse.

a failed verdict
  ✓ [5] act: click Finish
  ✗ [6] verify the page shows "Thank you for your order!"
FAILED — expected text not found · screenshot saved
exit code 1 · Result.md written · CI fails the build
Exit code   Meaning
0           Passed: every step succeeded
1           Failed: an assertion or step did not pass
2           Error: the run could not execute
3           Timeout: the run exceeded its budget
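A CI wrapper can branch on these codes without reading any prose. The TypeScript sketch below encodes the four meanings listed above; the function names verdictFor, ciShouldFail and summarize are hypothetical, and treating any unlisted code as "unknown" is an assumption rather than documented behaviour.

```typescript
// Exit-code meanings are taken from the table above.
// Function names and the "unknown" fallback are illustrative assumptions.

type Verdict = "passed" | "failed" | "error" | "timeout" | "unknown";

const EXIT_VERDICTS: ReadonlyMap<number, Verdict> = new Map<number, Verdict>([
  [0, "passed"],  // every step succeeded
  [1, "failed"],  // an assertion or step did not pass
  [2, "error"],   // the run could not execute
  [3, "timeout"], // the run exceeded its budget
]);

/** Map a process exit code to a verdict; unlisted codes become "unknown". */
function verdictFor(exitCode: number): Verdict {
  return EXIT_VERDICTS.get(exitCode) ?? "unknown";
}

/** Any non-zero exit code should turn the CI job red. */
function ciShouldFail(exitCode: number): boolean {
  return exitCode !== 0;
}

/** One-line summary suitable for a CI log. */
function summarize(exitCode: number): string {
  const action = ciShouldFail(exitCode) ? "fail the build" : "continue";
  return `exit ${exitCode}: ${verdictFor(exitCode)} (${action})`;
}

console.log(summarize(1)); // exit 1: failed (fail the build)
```

Separating "failed" (code 1) from "error" and "timeout" (codes 2 and 3) lets a pipeline tell a product regression from a run that never completed.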

Free locally — or pennies on a hosted model

BrowserBash is model-agnostic and resolves in this order: a local Ollama model first, then your Anthropic or OpenRouter key if set. For TTACart we ran a local model — $0, fully private, nothing leaving the machine.

  • Free & local: ollama pull qwen3, then run. Best for short, direct flows; no keys, no cost.
  • Cheap & hosted: long multi-step journeys are most reliable on a stronger model. A budget OpenRouter model like deepseek/deepseek-chat or a Qwen model costs a few cents per run.
  • One flag to switch: --model openrouter/deepseek/deepseek-chat or --model ollama/qwen3. Same test file, your choice of brain.
  • No lock-in: swap models or grids without touching the test.
Honest note

Tiny local models are great for simple objectives; a full login-to-checkout journey is more reliable on a stronger (still cheap) hosted model. BrowserBash lets you pick per run — start free, scale up only when the flow demands it.
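The resolution order described in this section (an explicit --model choice, otherwise a local Ollama model, otherwise a hosted key) can be sketched as a small pure function. This is a hedged illustration, not BrowserBash's code: the input shape, the environment variable names ANTHROPIC_API_KEY and OPENROUTER_API_KEY, and checking Anthropic before OpenRouter are all assumptions, since the text does not specify them.

```typescript
// Hypothetical sketch of the model resolution order described in this section.
// The input shape, env var names and Anthropic-before-OpenRouter order
// are assumptions, not documented BrowserBash behaviour.

type Provider = "ollama" | "anthropic" | "openrouter";

interface ResolveInput {
  modelFlag?: string; // value of --model, e.g. "ollama/qwen3"
  localOllamaModels: string[]; // models already pulled locally
  env: Record<string, string | undefined>;
}

interface Resolution {
  provider: Provider;
  model?: string;
}

function isProvider(value: string): value is Provider {
  return value === "ollama" || value === "anthropic" || value === "openrouter";
}

function resolveModel(input: ResolveInput): Resolution | null {
  const { modelFlag, localOllamaModels, env } = input;

  // 1. An explicit --model value wins: "<provider>/<model>".
  if (modelFlag) {
    const slash = modelFlag.indexOf("/");
    const provider = slash === -1 ? modelFlag : modelFlag.slice(0, slash);
    if (slash === -1 || !isProvider(provider)) {
      throw new Error(`Unrecognized --model value: ${modelFlag}`);
    }
    return { provider, model: modelFlag.slice(slash + 1) };
  }

  // 2. Otherwise prefer a local Ollama model if one is available.
  const firstLocal = localOllamaModels[0];
  if (firstLocal !== undefined) return { provider: "ollama", model: firstLocal };

  // 3. Otherwise fall back to a hosted key, if one is set.
  if (env["ANTHROPIC_API_KEY"]) return { provider: "anthropic" };
  if (env["OPENROUTER_API_KEY"]) return { provider: "openrouter" };

  return null; // nothing usable was found
}

const picked = resolveModel({
  modelFlag: "openrouter/deepseek/deepseek-chat",
  localOllamaModels: ["qwen3"],
  env: {},
});
console.log(picked?.provider, picked?.model); // openrouter deepseek/deepseek-chat
```

Keeping the resolver free of side effects (the environment is passed in rather than read globally) makes the order easy to unit test.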

Run it on your own app

Install the CLI and turn your next test into a sentence.

npm install -g browserbash-cli