Open source · MIT license · developer preview · v1.11.0

Tandem Browser is the open-source, local-first browser for AI agents — where the AI lives inside your real browser session, sharing your tabs, cookies, and DOM, ready to ask for your help the moment it gets stuck.

The browser becomes a programmable workspace where human intent and AI capability meet.

Tandem Browser is built around a simple shift: the AI is not a sidebar talking about your browser; it is inside it. Same tabs, same cookies, same logged-in sessions, same page you are looking at right now. It reads the accessibility tree, watches the network, writes live styles, and hands back to you when something needs a human. It turns the web that already exists into something agents can actually operate on: no per-site API, no RPA rig, no screenshot loop. The real browser, as a shared human-AI symbiotic runtime.

513 GitHub stars

257 MCP tools

300+ HTTP endpoints

8 security layers

Automate any SaaS, no API needed

If you can use it in a browser, your agent can use it in Tandem. Gmail, Coolblue, Funda, internal tools: same session, same login, no wrapper per site.

Rewrite live UI on the fly

Say "Overlay the price per square meter on every listing" and the agent injects a script into the real site, in the real session, while you keep scrolling.

Read beyond what’s rendered

Accessibility tree + network log + DOM + DevTools. The agent sees structure and hidden traffic, not a screenshot and a guess.

Bring any AI

Tandem Browser is model-agnostic. Claude, GPT, Gemini, OpenClaw, local Ollama, LM Studio, custom scripts: anything that speaks MCP or HTTP. Swap them, combine them, run fully offline.

What this unlocks

What an AI agent browser unlocks that other tools cannot

Every point here is something you can reproduce today on a stock Tandem Browser install with a capable model behind it. None of this needs a site to opt in, a new protocol to land, or a bespoke scraper per service.

Real co-browsing in one runtime

You and the agent are in the same browser at the same time. When you click, it sees the new DOM. When it types, the form fills in front of you. No screen sharing, no "now give me control," no second headless session trying to catch up to yours.

Web understanding beyond pixels

The agent gets the accessibility tree, the rendered DOM, live network requests, console output, and DevTools. It can answer "what API did this page just hit and what came back?" without any of that leaving your machine.
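
As a rough illustration of answering "what API did this page just hit and what came back?", the sketch below filters a captured network log for the most recent XHR/fetch request. The `NetworkEntry` shape, its field names, and the example URLs are assumptions made for this example; they are not Tandem Browser's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkEntry:
    """Hypothetical shape of one captured request (not Tandem's real schema)."""
    method: str
    url: str
    status: int
    resource_type: str  # e.g. "document", "script", "xhr", "fetch"
    body_preview: str = ""


def last_api_call(entries: list[NetworkEntry]) -> Optional[NetworkEntry]:
    """Return the most recent XHR/fetch request, or None if there is none."""
    for entry in reversed(entries):
        if entry.resource_type in {"xhr", "fetch"}:
            return entry
    return None


# Illustrative log, oldest entry first.
log = [
    NetworkEntry("GET", "https://shop.example/", 200, "document"),
    NetworkEntry("GET", "https://shop.example/app.js", 200, "script"),
    NetworkEntry("GET", "https://shop.example/api/prices?q=laptop", 200, "fetch",
                 '{"items": 12}'),
]

hit = last_api_call(log)
print(hit.method, hit.url, hit.status, hit.body_preview)
```

The point of the sketch is that the question is answered from structured request records held locally, not from anything visible in a screenshot.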

AI asks you when it gets stuck

Tandem Browser agents don't pretend they know everything. When a CAPTCHA appears, a form needs a judgment call, or something looks off, the agent pauses and asks. You answer in seconds and it picks back up. One conversation, two minds.

You steer mid-task

Watching the agent head down the wrong path? Step in. Type, click, switch tabs, redirect with a sentence. The agent sees what you did and adapts — no restart, no reset. You're co-driving, not handing off a task.

Real SaaS, no API required

Gmail has no usable agent API. Coolblue has no agent API. Your HR portal, your municipality portal, your internal admin tool — none of them have one either. In Tandem Browser all of that is just "the browser": the agent logs in like you do, navigates the UI you already know, and gets work done in your real account.

Live UI rewriting, any site

Ask the agent to add a “price per square meter” column to Funda, or a keyboard shortcut to a SaaS tool that refuses to add one, and it injects a user script into the live site. Your browser, your modifications, persisted locally, no extension store involved.
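
The arithmetic behind a "price per square meter" overlay is small. The sketch below shows it in Python for readability; a real injected user script would run as JavaScript inside the page. The listing strings, the function names, and the assumption that both values are whole numbers with `.` or `,` as grouping marks are illustrative only.

```python
import re


def parse_whole_number(text: str) -> int:
    """Return the first whole number in text, dropping '.' and ',' grouping marks."""
    match = re.search(r"[0-9][0-9.,]*", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return int(re.sub(r"[.,]", "", match.group()))


def price_per_m2(price_text: str, area_text: str) -> int:
    """Asking price divided by living area, rounded to whole euros."""
    area = parse_whole_number(area_text)
    if area <= 0:
        raise ValueError("area must be positive")
    return round(parse_whole_number(price_text) / area)


def overlay_label(price_text: str, area_text: str) -> str:
    """Text the overlay would show next to a listing (dot as thousands mark)."""
    value = price_per_m2(price_text, area_text)
    return f"€ {value:,}/m²".replace(",", ".")


print(overlay_label("€ 450.000 k.k.", "85 m²"))  # € 5.294/m²
```

Fractional areas such as "82,5 m²" would need a locale-aware parser; the sketch deliberately leaves that out.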

Cold-start usable

The moment you install Tandem and connect a strong model, normal web tasks work. No per-site training, no “teach it your flow” phase, no hand-authored recipes. The capability is in the model; Tandem just gives it a real browser to stand on.

Bring any AI — MCP-native, model-agnostic

Tandem Browser is model-agnostic. Any agent that speaks MCP or HTTP works — Claude, GPT, Gemini, OpenClaw, local Ollama, LM Studio, custom scripts. Swap models when a cheaper or faster one ships. Run two at once. Go fully offline with a local model. The browser doesn't care which brain you plug in — which also means a better Claude release or a smarter Ollama model shows up as a better Tandem Browser experience the same day, without any glue code to rewrite.

What this is not

Worth saying out loud, because Tandem Browser keeps getting confused with these.

Not an AI sidebar

A chat panel next to a browser still leaves the AI outside the page. Tandem Browser runs the agent inside the same browser you use, in your real session, not in a separate window guessing what you see.

Not RPA or screen scraping

No brittle pixel coordinates, no "record-and-replay" macros. The agent works with structure — the accessibility tree and DOM — the way a careful human developer would.
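
To make "works with structure" concrete, here is a minimal sketch of locating a control by role and accessible name instead of by pixel position. The `AXNode` class is a simplified, hypothetical stand-in for an accessibility-tree node; it is not Tandem Browser's data model.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class AXNode:
    """Hypothetical, simplified accessibility-tree node (not Tandem's schema)."""
    role: str
    name: str = ""
    children: list["AXNode"] = field(default_factory=list)


def walk(node: AXNode) -> Iterator[AXNode]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find(root: AXNode, role: str, name: str) -> Optional[AXNode]:
    """Return the first node whose role and accessible name match (case-insensitive name)."""
    wanted = name.casefold()
    for node in walk(root):
        if node.role == role and node.name.casefold() == wanted:
            return node
    return None


# A toy compose form: the Send button is found by meaning, wherever it is drawn.
page = AXNode("document", "Compose", [
    AXNode("textbox", "To"),
    AXNode("textbox", "Subject"),
    AXNode("group", "Actions", [AXNode("button", "Send")]),
])

send = find(page, role="button", name="send")
print(send.role, send.name)  # button Send
```

A lookup like this keeps working when the layout shifts, which is the failure mode that breaks coordinate-based macros.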

Not screenshot-driven automation

Screenshot loops are slow, fragile, and expensive. Tandem gives the agent the page as semantic structure first, with screenshots available when they genuinely help, not as the only input channel.

Not just another Chromium wrapper

This is a daily-driver browser — tabs, workspaces, extensions, bookmarks, password vault, sidebar apps — with a serious security perimeter between web content and the agent layer. The agent story is first-class, not a bolted-on feature.

Not “chat with this webpage”

Reading the page aloud is the table-stakes version. Tandem's agent can open tabs, fill forms, cross-reference sources, log in, hand back to you for a CAPTCHA, inject a userscript, and come back to finish — as one continuous piece of work.

Not cloud-hosted “AI agent SaaS”

Your browser, your machine, your sessions, your data. Remote agents reach you only over your private Tailscale network. Nothing about your browsing is shipped to a Tandem Browser backend — there isn't one.

Core idea

Symbiosis, not automation theater.

Other AI browsers

Build API wrappers for 40 services. Gmail wrapper, Slack wrapper, Notion wrapper. Each one is custom code. If they haven't built a wrapper for a site, the AI cannot use it.

Tandem Browser — the symbiotic browser

Uses the accessibility tree, the structure every website already has. The AI reads, clicks, fills, and navigates any site like a human. No wrappers needed. It works on every website that exists. This is the symbiotic browser model: human and AI sharing one real session, neither in control alone.

Anti-scraping doesn't apply

Tandem Browser is a real browser used by a real human. You browse in it daily. The AI rides alongside, using the same session and the same cookies. There is no bot to detect, just a person browsing alongside their AI.

AI evolves, Tandem Browser evolves with it

Smarter Claude? It automatically gets better at using Tandem Browser. New local model via Ollama? Plug it in. Tandem Browser does not chase AI evolution; it is the MCP-native browser layer that AI evolves on.

Positioning

Tandem Browser and WebMCP solve different layers of the problem

WebMCP is promising because it helps websites expose cleaner agent actions. Tandem Browser solves a broader problem: how a human and an AI actually work together inside a real browser, across tabs, sessions, and existing sites, with security and governance built in.

WebMCP

Makes individual websites more agent-ready by letting sites expose structured tools. Best fit when the site itself participates.

Tandem Browser

Makes the browser itself a shared human-AI workspace. Best fit when the real job spans multiple tabs, authenticated sessions, human judgment, and the web as it exists today. This is why Tandem Browser is best understood as the human-AI symbiotic browser.

Why that matters

Tandem Browser does not depend on every site adopting a new protocol before useful work can happen. As an MCP-native browser, it gives agents structured access, visibility, and handoffs inside the browser people already use — on every site that exists today.

Not anti-WebMCP

If more sites become agent-readable, great. Tandem Browser can benefit from that too. But Tandem Browser's category is the symbiotic browser: shared browser context, local-first control, and human-in-the-loop coordination, not generic site automation.

Who this is for

Built for people already pushing real AI workflows

Tandem Browser is for users who want real browser context, not toy abstractions. It fits best when the browser is part of the workflow, not a disposable automation target.

AI power users

People who already live with Claude, OpenClaw, local models, or mixed-agent setups and want them operating in the same real browser — the model-agnostic symbiotic browser that doesn't lock you in.

MCP builders

Teams and developers who want a serious browser surface behind MCP, instead of one more pile of custom wrappers or narrow site-specific integrations.

Privacy-minded operators

Users who want local-first control, authenticated sessions, and strong security boundaries between the web and the agent layer.

Teams exploring browser agents

Companies testing secure browser-based AI workflows, human-in-the-loop systems, and real authenticated browser work.

The everyday version of the pitch: it feels like going on the internet with a brilliant teammate — not like using a chatbot parked next to a browser.

Multi-agent symbiosis

Multiple AI agents + one human. Different model vendors. Same browser — the symbiotic browser in practice.

Tandem Browser supports multiple AI agents and a human working simultaneously in the same browser — and they do not have to be the same model or the same vendor. A local Ollama model for private research, Claude Code on another machine for heavy reasoning, you in the driver's seat. This is the human-AI symbiotic browser model: agents connect locally or remotely over Tailscale, each gets their own workspace, all share the same tabs, cookies, and sessions. Tab locks prevent conflicts.

👤

Robin (human)

Default workspace. Browses normally, handles CAPTCHAs and judgment calls.

🤖

Kees (OpenClaw)

Runs on Ollama locally. Has its own workspace for job search automation. Tandem Browser was originally built for OpenClaw agents like Kees.

⚙

Claude Code on Windows

Connected remotely over Tailscale via MCP. Works inside its own workspace while sharing the same live browser context.
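
One way to picture "tab locks prevent conflicts" is a registry that allows at most one owner per tab. The sketch below is an in-memory illustration under that assumption; the `TabLocks` class, its method names, and the tab and agent identifiers are hypothetical and say nothing about how Tandem Browser implements locking internally.

```python
import threading
from typing import Optional


class TabLocks:
    """Hypothetical registry: at most one owner (human or agent) per tab."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}
        self._mutex = threading.Lock()  # guards _owners across threads

    def acquire(self, tab_id: str, owner: str) -> bool:
        """Claim a tab; return False if a different owner already holds it."""
        with self._mutex:
            return self._owners.setdefault(tab_id, owner) == owner

    def release(self, tab_id: str, owner: str) -> bool:
        """Give a tab back; only the current owner may release it."""
        with self._mutex:
            if self._owners.get(tab_id) != owner:
                return False
            del self._owners[tab_id]
            return True

    def owner_of(self, tab_id: str) -> Optional[str]:
        """Return the current owner of a tab, or None if it is free."""
        with self._mutex:
            return self._owners.get(tab_id)


locks = TabLocks()
print(locks.acquire("tab-1", "kees"))         # True: tab was free
print(locks.acquire("tab-1", "claude-code"))  # False: kees holds it
print(locks.release("tab-1", "kees"))         # True: owner releases
print(locks.acquire("tab-1", "claude-code"))  # True: tab is free again
```

Refusing the second claim, instead of queueing it, keeps the example short; a real system might prefer to wait or to notify the human.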

Live demo

What agents can do through Tandem Browser now

$ claude via tandem MCP

> Opened Gmail, composed email, filled To/Subject/Body, clicked Send
> Navigated to Coolblue, extracted all MacBook Air M4 prices
> Scrolled X.com profile using PageDown/PageUp
> Read YouTube search results for "tandem browser"
> Read pinboards, screenshots, workspace config
> Drew annotations visible to both human and AI

No Gmail API. No Coolblue wrapper. Just the accessibility tree, shared browser context, and now remote MCP over Tailscale too.

Get started

Up and running in 3 steps

Download and install

Download Tandem Browser for Windows v1.11.0 →

Windows 11 x64 installer and portable builds are official Tandem Browser downloads. They are unsigned for now, so Windows may show an unknown publisher or SmartScreen warning.

Download signed macOS Apple Silicon build v1.0.0 →

Linux remains best-effort and can be run from source.

Open Tandem and go to Settings

Launch Tandem Browser. Open Settings → Connected Agents and scroll to "Connect your AI to Tandem".

Choose "On this machine" if your AI runs locally, or "On another machine" if it runs remotely over Tailscale. Then click "Generate connection instructions".

Copy the instructions into your AI

Tandem generates a ready-to-paste block. Copy it and give it to your AI agent. That's it — the agent reads Tandem's bootstrap surface and connects automatically.

Bring any AI. Claude, GPT, Gemini, OpenClaw, local Ollama, LM Studio, custom scripts: anything that speaks MCP or HTTP works. Swap models, combine models, run fully offline. The browser doesn't care which AI you bring.

Same machine: MCP via stdio (257 tools) or HTTP API (300+ endpoints).
Another machine: MCP via Streamable HTTP or HTTP API over a private Tailscale network.
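
For readers new to MCP: it exchanges JSON-RPC 2.0 messages, and `tools/list` and `tools/call` are the standard methods for discovering and invoking tools. The sketch below only builds such messages. The tool name `open_tab` and its arguments are hypothetical, not taken from Tandem Browser's tool list, and a real session starts with an `initialize` handshake that is omitted here.

```python
import json
from itertools import count
from typing import Optional

_next_id = count(1)  # JSON-RPC requests carry an id so replies can be matched


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message, separators=(",", ":"))


def frame_for_stdio(message: str) -> bytes:
    """The stdio transport sends one message per line, so append a newline."""
    return (message + "\n").encode("utf-8")


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool; "open_tab" and its arguments are made up for illustration.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "open_tab", "arguments": {"url": "https://example.com"}},
)

print(list_tools)
print(call_tool)
```

Over stdio the framed bytes go to the server process's standard input; over Streamable HTTP the same JSON body is sent in an HTTP POST to the server's MCP endpoint.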

Prefer to build from source? See the README →

FAQ

Frequently asked questions

What is Tandem Browser?

Tandem Browser is the open-source, local-first browser for AI agents — where the AI lives inside your real browser session. The agent shares the same tabs, cookies, accessibility tree, and DOM you see — no per-site API, no headless second browser, no screenshot loop. It is built on Chromium and licensed MIT.

Is Tandem Browser free?

Yes. Tandem Browser is free and open source under the MIT license. There is no cloud backend, no subscription, and no telemetry. If you want to support development, you can sponsor on GitHub.

Which AI models work with Tandem?

Tandem Browser is model-agnostic — it works with any agent that speaks MCP (Model Context Protocol) or HTTP. It originally grew out of the OpenClaw community, and today supports Claude, GPT, Gemini, OpenClaw, local Ollama models, LM Studio, and custom scripts. You can swap models, combine them, or run fully offline with a local model.

Can the AI agent ask for help when it gets stuck?

Yes — that's a core part of how Tandem Browser works. When the agent hits a CAPTCHA, an unclear form, or a judgment call, it pauses and asks you. You answer in seconds, the agent picks back up. You can also interrupt the agent mid-task — type, click, or redirect with a sentence — and it adapts without restarting. Human and AI work as one team, not as a hand-off.

How is Tandem different from Comet, Dia, or Arc Search?

Comet (Perplexity), Dia (The Browser Company), and Arc Search bundle a single vendor model and ship cloud-hosted features. Tandem Browser is open source, local-first, model-agnostic, and runs the agent inside the same authenticated session the human uses, not a separate headless instance. Your data never leaves your machine unless you choose to send it somewhere.

How is Tandem different from Browser Use or Playwright?

Browser Use and Playwright are libraries for headless or scripted automation. Tandem Browser is a daily-driver desktop browser where the agent operates inside your real session alongside you. You don't write a script per workflow — the agent reads the page like a human would, and pauses to ask you when it needs human judgment.

Is Tandem safe against prompt injection?

Tandem Browser ships an 8-layer security pipeline between every byte of external content and the agent: NetworkShield, OutboundGuard, ContentAnalyzer, ScriptGuard, BehaviorMonitor, Gatekeeper AI, EvolutionEngine, and a dedicated PromptInjection layer. It includes an 811K+ entry blocklist, YARA-style rules, AST fingerprinting, Shannon entropy detection, and 51 automated tests.
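
Of the techniques named above, Shannon entropy detection is the easiest to show in a few lines. The sketch below computes entropy in bits per character and flags strings above a cutoff, on the general idea that encoded or obfuscated payloads look more random than ordinary text. The function names and the 4.5-bit threshold are assumptions for illustration; this is not Tandem Browser's implementation or its tuning.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character: sum of p * log2(1 / p) over distinct characters."""
    if not text:
        return 0.0
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in Counter(text).values())


def looks_obfuscated(text: str, threshold: float = 4.5) -> bool:
    """Flag text whose per-character entropy exceeds a (hypothetical) threshold."""
    return shannon_entropy(text) > threshold


print(shannon_entropy("aaaa"))  # 0.0: one symbol, no uncertainty
print(shannon_entropy("aabb"))  # 1.0: two equally likely symbols
print(shannon_entropy("abcd"))  # 2.0: four equally likely symbols
print(looks_obfuscated("please click the blue button"))
```

Entropy alone is a weak signal, since short strings and non-English text can score high, which is presumably why it appears here as one check among several.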

What operating systems does Tandem support?

Tandem Browser supports macOS Apple Silicon and Windows 11 x64. Linux is best-effort. macOS builds are signed and notarized; Windows builds are official Tandem Browser downloads but currently unsigned, so Windows may show an unknown publisher or SmartScreen warning during installation.

Can multiple AI agents use Tandem at the same time?

Yes. Tandem Browser supports multiple agents and a human working simultaneously in the same browser. Each agent gets its own workspace; all share the same tabs, cookies, and sessions. Tab locks prevent conflicts. Agents can run locally or connect remotely over Tailscale — Tandem Browser is never exposed to the public internet.

Does Tandem work without an internet connection?

Yes. With a local model (Ollama, LM Studio) Tandem Browser runs fully offline. There is no Tandem Browser cloud backend that needs to be reachable.

How does Tandem compare to WebMCP?

WebMCP and Tandem Browser solve different layers. WebMCP is a proposal for websites to expose structured agent tools — it requires sites to opt in. Tandem Browser makes the browser itself a shared human-AI workspace, working on every site that exists today, no opt-in required. They are complementary: if WebMCP succeeds, Tandem Browser benefits too.