Testing AWS Cognito Hosted UI Login Flows With AI

Name: BrowserBash
Author: The Testing Academy

To test an AWS Cognito Hosted UI login flow with AI, you describe the journey in plain English and let an AI agent drive a real Chrome browser through every redirect: open your app, click the login button, sign in on the Cognito Hosted UI page, and confirm you land back on your dashboard. With BrowserBash, a free open-source CLI from The Testing Academy, you write the intent as a Markdown *_test.md file or a one-line browserbash run objective, and you pass the test username and password as masked variables so they never hit your logs. The agent reads the live page on each step to follow the hop from your single-page app out to the Cognito domain and back. The one piece that does not fully automate is the email or SMS confirmation code, which needs a human-in-the-loop step or a backdoor, and this guide is honest about exactly where that line sits.

Cognito is one of the most-tested-yet-hardest-to-test login flows in the AWS stack, because the Hosted UI lives on a domain you do not own, the markup is generated by AWS, and the round trip from your SPA to *.auth.<region>.amazoncognito.com and back crosses an origin boundary that trips up selector-based scripts. This guide walks the whole flow concretely.

What the Cognito Hosted UI flow actually looks like

Before automating anything, it helps to name the steps a real user walks through, because that sequence is exactly what you will describe to the agent. The Cognito Hosted UI is the AWS-hosted sign-in and sign-up page you get for free with a user pool. From the browser's seat, a typical hosted-UI sign-in looks like this:

1. The user lands on your single-page app and clicks a "Log in" or "Sign in" button.
2. The app redirects the browser to your Cognito domain, something like https://your-app.auth.us-east-1.amazoncognito.com/login?client_id=...&redirect_uri=...&response_type=code.
3. The Cognito Hosted UI renders an email or username field and a password field, plus a "Sign in" button and a "Sign up" link.
4. The user enters credentials and submits.
5. Cognito validates, then redirects the browser back to your app's redirect_uri with an authorization code (or tokens, depending on your flow) in the URL.
6. Your SPA exchanges that code for tokens, and the user lands on an authenticated screen: a dashboard, a profile, a home page.
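The redirect out and the redirect back are both just URLs with query parameters, which is useful to keep in mind when a run fails mid-hop. Here is a minimal Python sketch of the two ends, using only the three parameters shown in the example URL above. The function names, the client ID, and the /callback path are placeholders, and a real pool will usually add more parameters (scope, state, a PKCE challenge).

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_login_url(cognito_domain, client_id, redirect_uri):
    """Build the Hosted UI /login URL the app redirects the browser to."""
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",  # authorization-code flow, as in the example URL
    })
    return f"https://{cognito_domain}/login?{query}"


def extract_auth_code(callback_url):
    """Return the authorization code from the redirect back, or None if absent."""
    params = parse_qs(urlsplit(callback_url).query)
    codes = params.get("code")
    return codes[0] if codes else None


# Hypothetical values, for illustration only.
login_url = build_login_url(
    "your-app.auth.us-east-1.amazoncognito.com",
    "example-client-id",
    "https://app.example.com/callback",
)
```

If the URL the browser lands on after sign-in has no code parameter, the failure is in the redirect configuration, not in anything the agent clicked.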

Sign-up adds a fork: after the user fills the registration form, Cognito sends a confirmation code by email or SMS, and the user has to type that code back into a confirmation screen before the account becomes usable. That confirmation step is the honest hard part, and we will get to it.

The browser test's job is to prove that a human clicking "Log in" ends up authenticated on your app. The token exchange, the PKCE handshake, and the state parameter checks are real and important, but they happen in the requests between your app and Cognito rather than on screen, and they are better covered by integration tests against your callback. The slice we automate here is the visible journey across the redirect.

Why selector-based Cognito tests are fragile

If you have ever scripted a Cognito Hosted UI login in Playwright or Selenium, the pain is familiar, and it is worth naming the specific failure modes because the AI approach answers each one directly.

You do not own the markup. The email field, the password field, and the "Sign in" button live on AWS-generated pages. Cognito has more than one Hosted UI: the older classic UI and the newer managed login experience render different DOM, and AWS can change either without telling you. A test that is green today can go red after an AWS-side update, with nothing in your repo having changed.

The flow leaves your origin. The journey crosses from your SPA's domain to the amazoncognito.com domain and back. Selector scripts that assume a single origin can lose track of which page they are on, and "element not found" errors appear when the script is looking at the wrong document mid-redirect.

Timing is unpredictable. The redirect out, the Cognito page render, and the redirect back each take a variable amount of time depending on region, cold starts, and network. Scripts littered with fixed sleep calls are either slow (over-padded) or flaky (under-padded).

Secrets leak. A login test types a real password. In a naive script that password can end up in shell history, CI logs, and archived run transcripts that outlive the test by months. That is a security problem, separate from flakiness, and it is covered in depth in secret handling for AI browser tests in CI.

Intent-based AI testing addresses the first three by reading the live page each run, and gives you tooling for the fourth through variable masking. It does not magic away the confirmation-code problem, and the honest-limits section below says so plainly.

How BrowserBash drives the Cognito login

BrowserBash is a natural-language browser automation and testing CLI. You write a plain-English objective, an AI agent drives a real Chrome browser step by step (no selectors, no page objects), and it returns a pass or fail verdict plus any values it extracted. The agent finds elements through the accessibility tree (roles, accessible names, states) plus the DOM, not CSS classes, so a relocated "Sign in" button on the Cognito side is something it adapts to rather than something that snaps a hardcoded locator.
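To see why a role-and-name lookup survives layout changes that break a CSS path, consider this toy Python sketch. It is not how the agent decides what to click (that is model-driven), and the two page snapshots are invented for illustration rather than copied from real Cognito markup.

```python
def find_by_role(node, role, name_contains):
    """Depth-first search for nodes with a matching role and accessible name."""
    matches = []
    name = node.get("name", "")
    if node.get("role") == role and name_contains.lower() in name.lower():
        matches.append(node)
    for child in node.get("children", []):
        matches.extend(find_by_role(child, role, name_contains))
    return matches


# Two hypothetical snapshots of a sign-in page with different nesting.
classic = {"role": "main", "name": "", "children": [
    {"role": "form", "name": "", "children": [
        {"role": "textbox", "name": "Username"},
        {"role": "textbox", "name": "Password"},
        {"role": "button", "name": "Sign in"},
    ]},
]}
managed = {"role": "main", "name": "", "children": [
    {"role": "region", "name": "Sign in", "children": [
        {"role": "form", "name": "", "children": [
            {"role": "textbox", "name": "Email address"},
            {"role": "textbox", "name": "Password"},
        ]},
        {"role": "group", "name": "", "children": [
            {"role": "button", "name": "Sign in"},
        ]},
    ]},
]}
```

The same query, a button whose name contains "sign in", finds exactly one node in both trees even though the button sits at a different depth in each.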

BrowserBash is free and open source under Apache-2.0. Install from npm (you need Node 18+ and Chrome):

npm install -g browserbash-cli

There are two engines under the hood, and both re-derive what to click from the live page. The default engine is stagehand (MIT, by Browserbase), which observes the live DOM each step and decides the next action from what is rendered right then. The alternate builtin engine is an Anthropic tool-use loop that captures native Playwright traces and re-derives the selector on every action from a fresh snapshot, never cached across runs. To be precise about what this is and is not: BrowserBash does not patch or keep a saved selector script between runs. It re-derives from live state on every run, so when the Cognito Hosted UI shifts, the next run reads the new page instead of replaying a stale map.

The model story is local-first, and it matters for a multi-step flow like Cognito. By default --model auto resolves in order: a local Ollama install first (free, no keys, nothing leaves your machine), then ANTHROPIC_API_KEY, then OPENROUTER_API_KEY (free models exist there too). The honest caveat: very small local models (8B and under) get flaky on long, multi-step objectives, and a Cognito sign-in is exactly that, five or six dependent steps across two domains. The sweet spot is a 70B-class local model (Qwen3 or Llama 3.3) or a capable hosted model for the hardest flows. If your Cognito test wanders or gives up halfway, upsize the model first, not the prompt.
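The resolution order described above reads more easily as code. This Python sketch only illustrates that order and is not BrowserBash's implementation. The function name and the returned labels are made up for the example.

```python
def resolve_model_source(env, ollama_available):
    """Mirror the documented --model auto order: local first, then hosted keys."""
    if ollama_available:
        return "ollama"  # local install: free, no keys, nothing leaves the machine
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"
    return None  # nothing usable was found
```

One practical consequence: on a machine with Ollama installed, setting a hosted API key alone will not change which model runs under auto, so a Cognito flow that wanders on a small local model needs an explicit model choice rather than just a new key.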

A first run: the one-liner

The fastest way to see the flow work is a single browserbash run with an objective. Point it at your app's URL and describe the journey:

browserbash run "Open https://app.example.com, click the Log in button, \
sign in on the Cognito Hosted UI with email {{COGNITO_USER}} and password {{COGNITO_PASS}}, \
then confirm the dashboard heading is visible after the redirect back"

The {{COGNITO_USER}} and {{COGNITO_PASS}} are variables. Pass them as environment values or via your variables file, and BrowserBash masks them in logs and transcripts so the real password is never printed. The agent will open the app, click through to the Cognito domain, fill the fields it sees on the Hosted UI, submit, wait for the redirect back, and assert the dashboard heading. You get a pass or fail and a Result.md written for the run.

This is great for a quick check, but for anything you want to keep and run in CI, move it into a Markdown test file.
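Whichever form you use, a cheap pre-flight check saves a wasted run: confirm every {{placeholder}} in the objective has a value before launching the browser. Here is a minimal Python sketch. It assumes the variables are supplied as environment values under the same names as the placeholders, which you should verify against your own setup. The function name is hypothetical.

```python
import re

# Matches {{NAME}} placeholders like the ones used in the objective above.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def missing_variables(objective, env):
    """Return placeholder names in the objective that have no non-empty value in env."""
    seen = []
    for name in PLACEHOLDER.findall(objective):
        if name not in seen:
            seen.append(name)
    return [name for name in seen if not env.get(name)]
```

In a wrapper script you would call missing_variables(objective, os.environ) and stop with a clear message if the list is not empty, instead of letting the agent type a literal {{COGNITO_PASS}} into the password field.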

Writing the Cognito flow as a test.md file

BrowserBash tests are intent, not selectors. A *_test.md file has a title (#) and ordered or bulleted steps, and it supports {{variables}} and @import composition so you can reuse a login across many tests. Here is a complete cognito_login_test.md:

# Cognito Hosted UI sign-in

1. Go to https://app.example.com
2. Click the "Log in" button
3. Wait for the page to redirect to the Cognito Hosted UI
4. Type {{COGNITO_USER}} into the email or username field
5. Type {{COGNITO_PASS}} into the password field
6. Click the "Sign in" button
7. Wait for the browser to redirect back to app.example.com
8. Verify the heading "Welcome back" is visible on the dashboard

Run it:

browserbash testmd run ./cognito_login_test.md

Two details are worth calling out. Steps 3 and 7 say "wait for the redirect" in plain English. You do not write a fixed sleep. BrowserBash leans on Playwright's built-in auto-wait with a 15-second ceiling, so it waits for the page to actually settle rather than guessing a duration. That is the antidote to the timing fragility named earlier. Step 4 says "email or username field" rather than naming a selector, because the agent matches on the accessible name and role, so it works whether your pool is configured for email or username sign-in.
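The difference between a fixed sleep and a condition-based wait with a ceiling is easy to show in a few lines of Python. This is a generic illustration of the idea, not Playwright's or BrowserBash's implementation. The function name and the polling interval are assumptions, and the 15-second default simply mirrors the ceiling mentioned above.

```python
import time


def wait_until(condition, timeout=15.0, interval=0.1):
    """Poll condition() until it is truthy or the timeout ceiling is reached."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True  # return as soon as the page is ready, no padding
        if time.monotonic() >= deadline:
            return False  # ceiling hit: report failure instead of hanging
        time.sleep(interval)
```

A fast redirect returns almost immediately and a slow one gets the full ceiling, which is why this pattern is neither over-padded nor under-padded.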

Because the login is a *_test.md of its own, you can compose it into larger journeys with @import. A checkout test that needs an authenticated user starts with:

# Authenticated checkout

@import ./cognito_login_test.md

1. Click "Add to cart" on the first product
2. Go to the cart and click "Checkout"
3. Verify the order summary shows one item

The @import pulls the whole Cognito sign-in in as the opening steps, so you write the login intent once and reuse it everywhere. For the broader patterns of describing a login as intent, see AI login flow testing.

Handling secret masking properly

A login test types a real password, so masking is not optional. BrowserBash masks any {{variable}} value in its logs, its NDJSON agent stream, and the Result.md it writes, so the literal password does not appear in artifacts. You still have to get the value into the run safely. The clean pattern in CI is to keep COGNITO_USER and COGNITO_PASS in your CI provider's secret store and expose them as environment variables to the step, never inline in the command or committed to a file.
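Masking is worth verifying rather than trusting, especially for artifacts your CI archives. Here is a minimal Python sketch of a post-run check that looks for literal secret values in run artifacts. The function names are hypothetical, and the artifact paths are whatever your run produces. The sketch does not assume where BrowserBash writes them.

```python
from pathlib import Path


def find_leaked_secrets(text, secrets):
    """Return the names of secrets whose literal value appears in text."""
    return sorted(name for name, value in secrets.items() if value and value in text)


def scan_artifacts(paths, secrets):
    """Map each artifact path to the secret names found in it (clean files are omitted)."""
    leaks = {}
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        found = find_leaked_secrets(text, secrets)
        if found:
            leaks[str(path)] = found
    return leaks
```

Only the names of leaked secrets are reported, never the values, so the check itself cannot become a new leak in the CI log.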

A dedicated test user in a non-production user pool is the right move. Do not point a Hosted UI login test at a real customer account, and do not reuse your own admin credentials. Spin up a throwaway user whose only job is to be logged in by the test, so that even if something leaks, the blast radius is a disposable account in a test pool. The full treatment of secret hygiene, including what to do about CI log retention, is in secret handling for AI browser tests in CI.

Running it in CI

For pipelines, BrowserBash has flags built for machines. --agent switches output to NDJSON so your CI can parse each step. --headless runs without a visible window. --record captures a webm video plus screenshots, which is gold when a Cognito run fails on a CI box you cannot see. Exit codes are unambiguous: 0 pass, 1 fail, 2 error, 3 timeout, so your pipeline can branch on the result without scraping text.

browserbash testmd run ./cognito_login_test.md --agent --headless --record
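If your pipeline step is a script rather than a bare shell line, the exit codes and the NDJSON stream can be handled like this. It is a Python sketch under two assumptions: that the NDJSON is written to standard output, and that each event is a JSON object. The event field names are not specified here, so the parser does not rely on any. The function names are hypothetical.

```python
import json
import subprocess

# Exit codes as documented above: 0 pass, 1 fail, 2 error, 3 timeout.
VERDICTS = {0: "pass", 1: "fail", 2: "error", 3: "timeout"}


def verdict_for(exit_code):
    """Translate a BrowserBash exit code into a verdict label."""
    return VERDICTS.get(exit_code, "unknown")


def parse_ndjson(text):
    """Parse newline-delimited JSON, skipping blank lines and non-JSON noise."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output
        if isinstance(value, dict):
            events.append(value)
    return events


def run_cognito_test(test_file="./cognito_login_test.md"):
    """Run the test with the flags shown above and return (verdict, events)."""
    cmd = ["browserbash", "testmd", "run", test_file,
           "--agent", "--headless", "--record"]
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return verdict_for(completed.returncode), parse_ndjson(completed.stdout)
```

Keeping "fail" distinct from "error" and "timeout" lets a scheduled job page someone for a real login regression while merely retrying an infrastructure hiccup.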

A Result.md is written per run with the verdict and what the agent saw. If you want a shared view, --upload opts into a free cloud dashboard (runs kept 15 days), or run browserbash dashboard for a local one, no upload. You can also target where the browser runs with --provider local|cdp|browserbase|lambdatest|browserstack, so the same Cognito test can run on your laptop or on a grid.

One scheduling note specific to Cognito and login flows generally: a live Hosted UI sign-in is a good scheduled smoke test, the nightly or hourly run that catches "AWS changed the managed login page and our redirect broke" before a user does. It is a weaker fit for a blocking gate on every pull request, both because of the confirmation-code limit below and because hammering a real auth endpoint on every commit is not kind to rate limits.

The honest limits

Here is where BrowserBash struggles on this specific topic, stated plainly.

Email and SMS confirmation codes do not fully automate. Cognito sign-up sends a one-time code to an email inbox or a phone, and the agent cannot read your inbox or your SMS. There is no way around this with the browser alone. You have three realistic options:

1. A human-in-the-loop step: the run pauses, a person fetches the code and feeds it in, which is fine for a manual smoke test but not for unattended CI.
2. A backdoor: in a test user pool, use the Cognito admin APIs to provision an already-confirmed user out of band (AdminConfirmSignUp for a self-registered user, or AdminCreateUser followed by AdminSetUserPassword with a known, permanent password, since AdminCreateUser on its own leaves the account waiting for a password change), then have BrowserBash test only the sign-in of that pre-confirmed account.
3. A test inbox you can read programmatically (a catch-all address or a service with an API) that a small script polls for the code before handing it to the run.

The pattern for pausing a run to inject a code is covered in human-in-the-loop for CLI browser OTP and CAPTCHA. The honest summary: automate sign-in end to end, but provision confirmed users through the admin API rather than driving the email-code screen in unattended runs.
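For the backdoor option, provisioning can be sketched in Python with boto3's cognito-idp client. AdminCreateUser alone leaves the account in a forced-password-change state, so this sketch follows it with AdminSetUserPassword to make the password permanent. It assumes the pool signs in with email, and the function name, pool ID, and credentials are placeholders. The client is passed in so the function can be exercised without AWS access.

```python
def provision_confirmed_user(cognito_idp, user_pool_id, username, password):
    """Create a test user that can sign in immediately, with no emailed code."""
    cognito_idp.admin_create_user(
        UserPoolId=user_pool_id,
        Username=username,
        UserAttributes=[
            {"Name": "email", "Value": username},  # assumes email sign-in
            {"Name": "email_verified", "Value": "true"},
        ],
        MessageAction="SUPPRESS",  # do not send the invitation message
    )
    cognito_idp.admin_set_user_password(
        UserPoolId=user_pool_id,
        Username=username,
        Password=password,
        Permanent=True,  # skip the forced password change on first sign-in
    )


# Usage against a real test pool (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("cognito-idp", region_name="us-east-1")
#   provision_confirmed_user(client, "<your-test-pool-id>", user, password)
```

Run this from a setup job with admin credentials scoped to the test pool only, and keep it out of the browser test itself so the test never holds AWS keys.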

MFA challenges hit the same wall. If your pool enforces TOTP or SMS MFA, the second factor is the same human-in-the-loop or backdoor problem as the confirmation code. A TOTP secret you control can be turned into a code by a small helper and injected, but SMS MFA on a real number is not automatable in CI.
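The "small helper" for TOTP needs nothing beyond the Python standard library. This sketch follows RFC 6238 with HMAC-SHA1, a 30-second step, and six digits, which are the common authenticator defaults. Check that they match your pool before relying on it. How the resulting code reaches the run is up to your variable setup and is not shown.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, digits=6, step=30):
    """Compute an RFC 6238 TOTP code from a base32-encoded shared secret."""
    if for_time is None:
        for_time = time.time()
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # restore any stripped base32 padding
    key = base64.b32decode(cleaned)
    counter = int(for_time) // step
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation per RFC 4226
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % (10 ** digits)).zfill(digits)
```

Generate the code as late as possible before the run reaches the MFA screen, because a code minted near the end of a 30-second window can expire during the redirect.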

Hosted UI variants can confuse small models. Cognito's classic UI and the newer managed login render differently, and on a tiny local model the agent can occasionally misread which field is which. This is a model-capability issue, not a selector issue: a 70B-class or hosted model handles both layouts reliably, while an 8B model may stumble on a long flow. Upsize the model before you blame the test.

Bot detection and rate limits are real. Repeatedly hammering a Cognito sign-in from a CI data center can trip rate limiting or throttling. This is why the scheduled-smoke pattern beats per-commit gating for live auth.

None of these are unique to BrowserBash. Any tool driving a real Cognito Hosted UI hits the same confirmation-code and MFA walls, because the limit is "a browser cannot read your email," not anything about the framework. Playwright and Selenium face it too; the difference is that BrowserBash absorbs the markup churn and cross-origin redirect for you, while a selector script makes you maintain those by hand. Where Playwright shines is deterministic, fully-scripted control and a mature trace viewer, and BrowserBash's builtin engine actually emits native Playwright traces precisely so you keep that debugging surface. Pick the intent-based approach when the Cognito UI changes under you often; keep hand-written selectors where you own the markup and want pixel-exact control.

Putting it together

The end-to-end recipe for testing AWS Cognito login with AI is short. Install BrowserBash, write the sign-in as a cognito_login_test.md using plain-English steps and {{masked}} credentials, lean on the built-in auto-wait for the two redirects instead of sleeps, and run it headless with --agent --record in a scheduled CI job. For sign-up, provision a confirmed user through the Cognito admin API and test the sign-in of that account, rather than trying to read a one-time code from a browser. You get a flow that survives AWS-side UI changes because the agent re-reads the live page each run, and credentials that stay out of your logs because every variable is masked.

For more on what the CLI can do, see the features page, and for guided walkthroughs of other flows, the learn hub. If your login is social or enterprise rather than Cognito-native, the companion guide on how to test OAuth and SSO login with AI covers the Google and Okta variants of the same redirect dance.

FAQ

Can BrowserBash handle the Cognito email confirmation code automatically?

Not on its own, and no browser tool can, because the code arrives in an email inbox or by SMS that the browser cannot read. For unattended CI, provision an already-confirmed test user through the Cognito admin API (AdminConfirmSignUp or AdminCreateUser) and have BrowserBash test the sign-in of that account. For a manual smoke test, use a human-in-the-loop pause where a person fetches the code and feeds it in. A programmatically readable test inbox is the third option.

How does BrowserBash wait for the redirect back to my SPA without a fixed sleep?

You write "wait for the browser to redirect back" as a plain-English step, and BrowserBash relies on Playwright's built-in auto-wait with a 15-second ceiling. It waits for the page to actually settle after the Cognito redirect rather than guessing a duration, so you never hardcode a sleep that is either too slow or too flaky.

Are my Cognito credentials safe in the logs and CI artifacts?

Yes, if you pass them as {{variables}}. BrowserBash masks any variable value in its logs, its NDJSON agent stream, and the per-run Result.md, so the literal password is not printed. Keep the values in your CI secret store and expose them as environment variables, and use a disposable test user in a non-production pool rather than a real account.

Does this replace my Cognito integration tests?

No. The browser test proves that a human clicking "Log in" ends up authenticated on your app across the redirect. The token exchange, PKCE handshake, and callback validation happen in the requests between your app and Cognito, not on screen, and are better covered by integration tests against your callback endpoint. Treat the AI-driven Hosted UI test as a scheduled smoke test that catches AWS-side UI breakage, not as a substitute for your backend auth tests.