How to Test File Upload Fields With an AI Browser CLI

To test file upload fields with an AI browser CLI, you point a test at a local fixture file, tell the agent in plain English what to upload and where, then assert on the success state that appears after the upload completes. With BrowserBash, a free open-source CLI, that looks like a Markdown test file with steps such as "upload the file fixtures/avatar.png into the profile photo field" and "confirm the page shows Upload complete." The agent reads the live page, finds the upload control through the accessibility tree, sets the file on the underlying input, waits for the success or error state to render, and checks it. No CSS selectors, no manual waits, no scripting the file chooser dialog by hand. This guide walks through single uploads, multi-file widgets, drag-and-drop dropzones, progress and validation assertions, and the one place where browser automation genuinely cannot help you: the operating system's native file picker.

The honest framing first. File upload is one of the more mechanically awkward things to automate in any tool, AI or not, because the actual file selection happens in a dialog the browser does not control. BrowserBash works the same way Playwright does under the hood: it sets the file programmatically on the <input type="file"> element rather than driving the OS dialog. That covers the large majority of real upload widgets, but not a true native dialog with no input behind it, which is the whole point of the limits section below.

Why file upload trips up scripted tests

Most upload bugs are not in the happy path. They live in the states around it: the wrong file type silently accepted, the progress bar that never reaches 100 percent, the multi-file picker that keeps only the last file, the dropzone that highlights on hover but drops nothing. Selector-based scripts struggle here for the same reasons they struggle everywhere, plus a few specific to uploads.

The visible button is a decoy. What you click is usually a styled <label> or <button>. The real <input type="file"> is hidden off screen with display:none or opacity:0. A test that clicks the pretty button opens a dialog the browser cannot read.
The success state is asynchronous and varies. Some widgets show a thumbnail, some a filename chip, some a toast that vanishes after three seconds, some just enable Submit. Hard-coding a wait for one breaks on the next redesign.
Drag-and-drop is not a click. Dropzones listen for dragenter, dragover, and drop events carrying a DataTransfer object, which a normal click cannot reproduce.
Validation rewrites the DOM. Reject a 20 MB file and an error node is injected, shifting every index below it.

None of this is a defect. It is normal frontend behavior. The job of an upload test is to confirm behavior, not encode the markup: a valid file is accepted and a success state appears, an invalid file is rejected with the right message. BrowserBash moves the fragile part, finding and operating the control, out of your test and into a model that re-derives it against the live page on every run. It is not self-healing in the sense of patching a saved script: there is no saved selector to patch. Each run reads the current page from scratch. For more on how that plays against changing markup, see how BrowserBash handles dynamic UIs.
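To make the decoy-button pattern concrete, here is a minimal Python sketch, standard library only, that scans a snippet of markup for a label whose for attribute points at a file input. The sample markup, the class UploadMarkupScanner, and the function find_labelled_file_inputs are hypothetical names made up for this illustration. It shows the pattern only and is not how BrowserBash or Playwright resolve the input.

```python
from html.parser import HTMLParser

# Hypothetical markup: a styled label is what the user sees and clicks,
# while the real file input is hidden with display:none.
SAMPLE_MARKUP = """
<label for="photo" class="btn">Choose photo</label>
<input id="photo" type="file" accept="image/*" style="display:none">
"""


class UploadMarkupScanner(HTMLParser):
    """Collect file inputs by id, and the ids that labels point at."""

    def __init__(self) -> None:
        super().__init__()
        self.file_inputs: dict = {}    # id -> attribute dict
        self.label_targets: list = []  # values of <label for="...">

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "input" and (attr.get("type") or "").lower() == "file":
            self.file_inputs[attr.get("id")] = attr
        elif tag == "label" and attr.get("for"):
            self.label_targets.append(attr["for"])


def find_labelled_file_inputs(markup: str) -> list:
    """Return ids of file inputs that a label refers to via its for attribute."""
    scanner = UploadMarkupScanner()
    scanner.feed(markup)
    return [i for i in scanner.label_targets if i in scanner.file_inputs]
```

For the sample markup, find_labelled_file_inputs(SAMPLE_MARKUP) returns the single id photo. The sketch deliberately ignores inputs nested inside a label without a for attribute, which a real resolver would also need to handle.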

Setup: install and stage a fixture file

Install the CLI globally:

npm install -g browserbash-cli

Uploads need a real file on disk to point at. Keep your test fixtures next to your tests so paths stay portable and reviewers can see exactly what is being uploaded.

project/
  tests/
    upload_test.md
    multi_upload_test.md
  fixtures/
    avatar.png          # a small valid image
    resume.pdf          # a valid document
    huge.zip            # oversized, to trigger size validation
    notes.txt           # wrong type, to trigger type validation

A few small, purpose-named fixtures beat one big grab bag. Name them for the case they exercise (huge.zip, notes.txt) so the intent of each negative test is obvious from the filename alone.
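If you would rather generate fixtures than commit binaries, a small script can do it. The following is a sketch using only the Python standard library. The function names, the 8x8 gray image, and the 25 MB default size are assumptions chosen for illustration, so adjust the size to whatever limit your widget enforces. It does not create resume.pdf, which is left for you to supply.

```python
import struct
import zipfile
import zlib
from pathlib import Path


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png(width: int = 8, height: int = 8) -> bytes:
    """Return the bytes of a small solid-gray 8-bit RGB PNG."""
    # IHDR: width, height, bit depth 8, color type 2 (RGB), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline starts with filter byte 0, followed by RGB triples.
    row = b"\x00" + b"\x80\x80\x80" * width
    idat = zlib.compress(row * height)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", idat)
        + _png_chunk(b"IEND", b"")
    )


def stage_fixtures(root: Path, huge_bytes: int = 25 * 1024 * 1024) -> dict:
    """Write avatar.png, notes.txt and huge.zip under root; return their sizes."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "avatar.png").write_bytes(tiny_png())
    (root / "notes.txt").write_text("plain text, not an image\n", encoding="utf-8")
    # ZIP_STORED keeps the payload uncompressed so the file really is large.
    with zipfile.ZipFile(root / "huge.zip", "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("padding.bin", b"\0" * huge_bytes)
    return {p.name: p.stat().st_size for p in sorted(root.iterdir())}
```

Calling stage_fixtures(Path("fixtures")) from the project root writes the three files and returns a mapping of filename to size in bytes, which is handy for a quick sanity check before a run.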

Single-file upload: the happy path

The smallest useful test uploads one valid file and asserts the success state. Here is a complete upload_test.md:

# Profile photo upload

1. Go to https://example.com/account/profile
2. Upload the file ./fixtures/avatar.png into the profile photo field
3. Confirm the page shows a preview thumbnail of the uploaded image
4. Confirm an "Upload complete" message is visible
5. Confirm the Save button is now enabled

Run it:

browserbash testmd run ./tests/upload_test.md

Read step 2 the way the agent does. It does not look for a class or id. It reads the rendered page, finds the control whose accessible name and context say "profile photo," resolves the file input behind it, and sets ./fixtures/avatar.png on that input. Steps 3 through 5 are intent assertions against the accessibility tree and DOM: a thumbnail, the completion text, the enabled Save button. Because it re-reads the page after the upload action, it naturally waits for the thumbnail to render instead of racing it.

If you prefer a single objective without a Markdown file, the same thing works as a one-shot run:

browserbash run "On https://example.com/account/profile, upload ./fixtures/avatar.png \
as the profile photo and confirm an upload-complete message appears"

Both forms find elements through the accessibility tree (roles, accessible names, states) plus the DOM, never CSS classes, which is why the hidden-input-behind-a-styled-label pattern does not bother them. The agent operates the input regardless of how the visible button is dressed up.

Asserting on the right success signal

Be specific about what success means for your widget. "Confirm the upload worked" is weak; the agent has to guess. "Confirm a thumbnail of the uploaded image appears and the filename avatar.png is shown" is strong, because it names two concrete, observable facts. The more precisely you describe the rendered success state, the less the model has to infer, and the more deterministic your test becomes. This is the same discipline that makes good intent-based assertions across the board, covered in the Markdown test files tutorial.

Multi-file upload

A multi-file input is just an <input type="file" multiple>. The test pattern is the same; you list more than one file and assert on the collected set.

# Attach multiple documents

1. Go to https://example.com/tickets/new
2. Upload these files into the attachments field:
   ./fixtures/resume.pdf and ./fixtures/avatar.png
3. Confirm the attachments list shows exactly two items
4. Confirm "resume.pdf" appears in the list
5. Confirm "avatar.png" appears in the list
6. Confirm a remove button is shown next to each attachment

Three assertions earn their place here. Checking the count (step 3) catches the classic multi-upload bug where the widget keeps only the last file. Checking each filename (steps 4 and 5) catches the case where the count is right but a file was dropped or duplicated. Asking for the remove control (step 6) confirms the widget treated them as a managed set, not raw inputs.
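The reasoning behind pairing a count check with per-file checks can be sketched in a few lines of Python. The function name diagnose_attachments and its message strings are hypothetical. The agent does this kind of comparison against the rendered list for you, so this only spells out which bug each check catches.

```python
from collections import Counter


def diagnose_attachments(expected: list, observed: list) -> list:
    """Compare the filenames we uploaded with the filenames the list shows."""
    problems = []
    # Count check: catches a widget that keeps only the last file.
    if len(observed) != len(expected):
        problems.append(f"expected {len(expected)} items, saw {len(observed)}")
    # Name check: catches a dropped file even when the count looks right.
    missing = sorted(set(expected) - set(observed))
    if missing:
        problems.append("missing: " + ", ".join(missing))
    # Duplicate check: one file shown twice can hide a dropped one.
    dupes = sorted(name for name, n in Counter(observed).items() if n > 1)
    if dupes:
        problems.append("duplicated: " + ", ".join(dupes))
    return problems
```

With expected files resume.pdf and avatar.png, an observed list containing avatar.png twice passes the count check but is still reported as missing resume.pdf and duplicating avatar.png, which is why the count alone is not enough.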

If your widget supports adding files in batches, you can split the upload across steps and assert the list grows:

3. Upload ./fixtures/resume.pdf into the attachments field
4. Confirm the attachments list shows 1 item
5. Upload ./fixtures/avatar.png into the attachments field
6. Confirm the attachments list shows 2 items

That sequencing tests append behavior rather than replace behavior, which is a real and frequently broken distinction.

Drag-and-drop dropzones

Dropzones are where people assume an AI tool will fall over, and where the input-based approach quietly saves you. Most dropzones built on libraries like react-dropzone, Uppy, or FilePond render a hidden file input alongside the drop area precisely so keyboard and assistive-tech users can still upload. When that input exists, BrowserBash sets the file on it and the dropzone's own handler fires its success path, exactly as if you had dragged a file in.

# Drag-and-drop image dropzone

1. Go to https://example.com/upload
2. Upload ./fixtures/avatar.png into the drag-and-drop dropzone
3. Confirm the dropzone shows the file as added
4. Confirm a thumbnail or filename for avatar.png is visible
5. Confirm no "unsupported file" error is shown

The phrasing "into the drag-and-drop dropzone" is deliberate. It tells the agent which region of the page you mean, and the agent finds the associated input behind that region. You are testing the observable outcome (file added, thumbnail shown, no error), which is what you actually care about, rather than simulating the literal mouse drag.

The honest caveat: a small number of dropzones are pure drag targets with no backing input at all, accepting files only through real drop events. Those are rarer than they used to be because they fail accessibility audits, but they exist. For those, see the limits section.

Progress bars and slow uploads

Progress UI is timing-sensitive, and timing is where manual sleep calls go to rot. BrowserBash leans on Playwright's built-in auto-wait with a 15-second ceiling, so you assert on states rather than guessing durations. The reliable approach is to assert the terminal state, not a mid-flight percentage you cannot pin down.

# Large file upload progress

1. Go to https://example.com/upload
2. Upload ./fixtures/resume.pdf into the file field
3. Confirm a progress indicator appears
4. Confirm the upload reaches a completed state
5. Confirm the success message "File uploaded" is visible
6. Confirm the progress indicator is no longer shown

Step 3 confirms progress UI exists at all, a meaningful check on its own. Steps 4 through 6 wait for completion and confirm the transition: success text in, progress bar out. Do not write a step like "confirm progress shows 47 percent." That number is a moving target the test cannot deterministically catch, and asserting on it makes the test flaky by construction. Assert the start and end states; let auto-wait bridge the middle. If a backend is genuinely slow and an upload exceeds the 15-second ceiling, that surfaces as a timeout (exit code 3), which is information you want.
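The idea of asserting on a terminal state under a time ceiling, rather than on a mid-flight number, can be shown as a small polling loop. This is a generic Python sketch, not the auto-wait implementation in BrowserBash or Playwright. The function name wait_for_terminal, the state strings, and the polling interval are assumptions, and only the 15-second default mirrors the ceiling described above.

```python
import time
from typing import Callable, Iterable


def wait_for_terminal(
    read_state: Callable[[], str],
    terminal: Iterable[str] = ("complete", "error"),
    timeout_s: float = 15.0,
    interval_s: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll read_state until it returns a terminal state or the ceiling passes."""
    terminal_states = set(terminal)
    deadline = clock() + timeout_s
    while True:
        state = read_state()
        # Intermediate values such as "uploading 47%" are never asserted on;
        # they are simply polled past.
        if state in terminal_states:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

The clock and sleep parameters exist so the loop can be exercised with a fake clock instead of real waiting. In a test you care about two outcomes: the function returned a terminal state, or it raised TimeoutError, which corresponds to the timeout case.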

Validation: the negative cases that matter most

The negative path is where uploads most often ship broken, and it is the easiest part to test by intent. Wrong type and oversize are the two big ones.

# Upload validation

1. Go to https://example.com/account/profile

# Wrong file type
2. Upload ./fixtures/notes.txt into the profile photo field
3. Confirm an error like "only image files are allowed" is shown
4. Confirm no thumbnail is displayed
5. Confirm the Save button remains disabled

# Oversized file
6. Reload the page
7. Upload ./fixtures/huge.zip into the profile photo field
8. Confirm an error about file size or maximum size is shown
9. Confirm the file is not accepted

Two things make this robust. First, each negative case pairs a positive assertion (the error appears) with a negative one (no thumbnail, Save still disabled). A widget that shows an error but also quietly accepts the file is a real bug, and the paired assertion catches it. Second, the agent reads the actual error text from the page, so you can phrase the expected message loosely ("an error like ...") and let the model judge the match. One caveat on the fixtures: huge.zip is also the wrong type for an image field, so a widget that checks type before size may report a type error at step 8; use an oversized image instead if you need to isolate the size check. For deeper edge-case and validation strategy, see automating form validation testing for edge cases and the broader AI form-filling automation guide.
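It helps to picture what the widget's own checks might look like when choosing fixtures. The following Python sketch models a hypothetical validator. Real widgets run this logic in the browser, and the extension list, the 5 MB limit, and the second error message are assumptions rather than anything taken from a specific library.

```python
from pathlib import Path
from typing import Optional

# Assumed rules for a profile photo field; real limits vary by application.
IMAGE_EXTS = frozenset({".png", ".jpg", ".jpeg", ".gif", ".webp"})
MAX_BYTES = 5 * 1024 * 1024


def validate_upload(
    filename: str,
    size_bytes: int,
    allowed_exts: frozenset = IMAGE_EXTS,
    max_bytes: int = MAX_BYTES,
) -> Optional[str]:
    """Return an error message, or None if the file would be accepted."""
    # The type check runs first, so a wrong-type file never reaches the size check.
    if Path(filename).suffix.lower() not in allowed_exts:
        return "only image files are allowed"
    if size_bytes > max_bytes:
        return "file exceeds the maximum size"
    return None
```

Under these assumed rules, notes.txt and a 20 MB huge.zip both produce the type error, while a 20 MB file named huge.png produces the size error. The order of checks decides which message a fixture triggers, so pick fixtures that fail exactly one rule.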

Composing upload tests with @import and variables

Uploads rarely stand alone. They usually come after login and inside a larger flow. Two BrowserBash features keep that tidy: @import for reusing setup, and {{variables}} for parameterizing values, with secret masking in logs.

# Authenticated avatar upload

@import ./login_test.md

1. Go to https://example.com/account/profile
2. Upload {{avatar_path}} into the profile photo field
3. Confirm "Upload complete" is visible

The @import ./login_test.md line runs your existing login steps first, so the upload test does not re-describe authentication. The {{avatar_path}} variable lets one test file cover several fixtures from CI: a valid image in one job, a malformed one in another. Any value flagged as a secret is masked in logs, which matters if a signed upload URL or token shows up in your variables. The composition mechanics are the same ones detailed in the Markdown test files tutorial.
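As a mental model for what placeholder substitution and log masking do, here is a small Python sketch. It is not BrowserBash's implementation. The function names render and mask, the strictness about undefined variables, and the *** mask string are all assumptions made for illustration, and only the {{name}} placeholder syntax comes from the test file above.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(step: str, variables: dict) -> str:
    """Replace each {{name}} in a step with its value; fail on unknown names."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, step)


def mask(text: str, secrets: list) -> str:
    """Hide every occurrence of a secret value before the text is logged."""
    for secret in secrets:
        if secret:  # skip empty strings, which would match everywhere
            text = text.replace(secret, "***")
    return text
```

Rendering step 2 with avatar_path set to ./fixtures/avatar.png yields the same sentence as the hard-coded version of the test, so one file can serve several CI jobs by changing only the variable.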

Running upload tests in CI

Upload tests belong in your pipeline, and the CLI is built for it. Run headless, emit machine-readable output, and record artifacts so a failed upload is debuggable after the fact.

browserbash testmd run ./tests/upload_test.md --headless --agent --record

What each flag buys you:

--headless runs without a visible browser, the norm on CI runners.
--agent emits NDJSON, one structured event per line, so your pipeline can parse exactly what happened on each step.
--record saves a webm video plus screenshots, so when an upload fails you can watch the dropzone not light up instead of guessing.

Exit codes drive the gate cleanly: 0 pass, 1 fail, 2 error, 3 timeout. A Result.md is written per run for a human-readable summary. The opt-in --upload flag pushes results to a free cloud dashboard (runs kept 15 days); browserbash dashboard gives you a local one instead.

Model selection defaults to auto, resolving Ollama first, then ANTHROPIC_API_KEY, then OPENROUTER_API_KEY where free models exist. For short upload flows a small local model can be enough; for longer multi-step journeys a 70B-class model (Qwen3, Llama 3.3) or a hosted model is steadier, since small local models (8B and under) get flaky on long flows. Running fully local means nothing leaves the machine, which is worth weighing if your fixtures are sensitive. There is a fuller walkthrough of these tradeoffs on the learn page.
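Returning to the exit codes and the NDJSON output, a pipeline wrapper might look like the following Python sketch. The exit-code meanings and the command line come from this guide. Everything else is an assumption: that the NDJSON events arrive on stdout, that non-JSON lines can be skipped, and the helper names themselves. No event field names are assumed, because the event schema is not described here.

```python
import json
import subprocess

# Exit codes as documented above: 0 pass, 1 fail, 2 error, 3 timeout.
EXIT_MEANING = {0: "pass", 1: "fail", 2: "error", 3: "timeout"}


def classify_exit(code: int) -> str:
    """Map a process exit code to a readable outcome."""
    return EXIT_MEANING.get(code, f"unknown({code})")


def parse_ndjson(text: str) -> list:
    """Parse one JSON object per line, skipping blank or non-JSON lines."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray log lines mixed into the stream
        if isinstance(event, dict):
            events.append(event)
    return events


def run_upload_test(test_path: str) -> tuple:
    """Run one test file headless and return (outcome, parsed events).

    Assumes the --agent NDJSON stream is written to stdout.
    """
    proc = subprocess.run(
        ["browserbash", "testmd", "run", test_path,
         "--headless", "--agent", "--record"],
        capture_output=True,
        text=True,
    )
    return classify_exit(proc.returncode), parse_ndjson(proc.stdout)
```

Treating timeout separately from fail is the useful part of this design: a timeout on an upload step usually points at a slow backend, while a fail points at the widget's behavior.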

Honest limits

This is the section that matters most for uploads, because the failure mode is specific and worth stating plainly.

True OS-native file dialogs are out of reach. BrowserBash, like Playwright, uploads by setting the file on the underlying <input type="file"> element rather than driving the operating system's file chooser. Almost every web upload widget has an input behind it, so this is rarely a problem. But if a widget has no input at all and forces the actual OS dialog (some legacy desktop-bridge apps, certain plugin-based or Flash-era uploaders, occasional Electron wrappers), the input-based approach has nothing to attach to. No browser-context automation tool gets around this; it is a property of how browsers sandbox file selection.

Pure drag-only dropzones with no backing input. A dropzone that accepts files solely through real drop events with no hidden input cannot be driven by the set-file approach. These are increasingly rare because they break accessibility, but when you hit one the upload step will fail to find a target.

Reordering and rich post-upload editing. Dragging uploaded thumbnails into a new order, or cropping an image in an in-page editor, leans on fine-grained pointer gestures that are not the agent's strength. Assert the outcome where you can (the order changed, a cropped image saved) rather than the gesture.

Pixel-level content checks. The agent confirms a thumbnail or filename appears; it does not pixel-compare the preview against the source file. If your bug is "the preview is rotated 90 degrees," an intent assertion will not catch it. Pair the upload test with a visual-diff tool for that class of defect.

Genuinely slow backends. With a 15-second auto-wait ceiling and no manual sleeps, an upload that legitimately takes longer will time out (exit code 3). That is usually the correct behavior, but if you intentionally test large files against a slow staging server, account for it.

Where this leaves you: for input-backed single, multiple, and dropzone uploads with success, progress, and validation assertions, an AI browser CLI is a strong fit and removes a lot of selector pain. For native dialogs and pixel-level content verification, reach for a different tool and be glad you knew the boundary in advance.

FAQ

How do I point a BrowserBash test at a local file to upload?

Reference the file by path in a natural-language step, for example "Upload ./fixtures/avatar.png into the profile photo field." Keep fixtures in a folder next to your tests and use relative paths so the test runs the same on every machine and in CI. You can also parameterize the path with a {{variable}} so one test file covers several fixtures. The agent resolves the file input on the page and sets the file on it directly, the same mechanism Playwright uses.

Can it test drag-and-drop file dropzones?

Yes, for the common case. Most modern dropzones (react-dropzone, Uppy, FilePond, and similar) render a hidden file input behind the drop area for accessibility, and BrowserBash sets the file on that input, which fires the dropzone's normal success handler. Phrase the step as "upload the file into the drag-and-drop dropzone" and assert on the result. The exception is a pure drag-only zone with no backing input at all; those cannot be driven this way, but they are rare because they fail accessibility audits.

How do I assert on upload progress without flaky timing?

Assert on states, not percentages. Confirm a progress indicator appears, then confirm the upload reaches a completed state and the success message shows, then confirm the progress indicator is gone. Avoid asserting a specific mid-flight percentage, since that is a moving target the test cannot reliably catch. BrowserBash uses Playwright's auto-wait with a 15-second ceiling, so you describe the start and end states and let the wait bridge the middle. No manual sleeps.

What can it not do with file uploads?

It cannot drive a true operating-system file dialog, because it sets the file on the page's <input> element rather than clicking through the OS chooser. That covers most real widgets but not the few with no input behind them. It also will not pixel-compare a preview against the source file, so a rotated or visually corrupted preview that still renders needs a visual-diff tool. And uploads that legitimately exceed the 15-second auto-wait ceiling will time out, which is usually correct but worth planning for with very large files.