Status: alpha prototype. APIs, rule contracts, and trace formats may change without notice. Pin a commit if you need stability.
This repository prototypes browser extension capabilities that improve the performance of browser-use agents:
- Token efficiency
- Security/Compliance: e.g., exposure to PII, data loss prevention, etc.
- Accuracy
Ideas explored in this repository:
- Masking/Redacting sensitive information on the webpage
- Blocking/Modifying dark patterns on the webpage
- Preprocessing webpage content to be more agent-friendly, e.g., hiding irrelevant content, hiding user-generated comments which could contain prompt-injection attacks, etc.
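
To make the masking idea concrete, here is a minimal sketch of text redaction. It is written in Python for brevity and is not the extension's implementation (the extension is TypeScript and works on the live DOM). The patterns, placeholder tokens, and the `mask_pii` name are illustrative assumptions.

```python
import re

# Deliberately simple patterns (assumptions for illustration); a real rule
# would need broader coverage and locale-aware formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace email addresses and US-style SSNs with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return SSN_RE.sub("[SSN]", text)
```

Replacing a value with a short token keeps the surrounding sentence readable for the agent while keeping the raw value out of the model's context.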
Prerequisites:
- Node ≥ 24 and Bun ≥ 1.3: extension and demo site
- uv: runs the Python scripts (each declares its own PEP 723 deps; the repo pins Python 3.14 via .python-version, but scripts work on 3.11+)
- Chrome / Chromium 148+: to load the unpacked extension

Repository layout:

| Path | What's there |
|---|---|
| extension/ | Chromium MV3 extension (Bun + TypeScript) |
| demo-site/ | Vite/React mock e-commerce site that exercises every rule |
| benchmark/ | Tasks, scenarios, and pricing for the agent benchmark harness |
| scripts/ | PEP 723 scripts: agent task runner, benchmark harness, trace tools |
| skills/ | Claude Code skills for installing, configuring, and diagnosing |

The Chromium MV3 extension lives in extension/. Build output
goes to extension/dist/, which is what you load as an unpacked extension at
chrome://extensions.
cd extension
bun install
bun run build

bun run watch rebuilds extension/dist/ whenever a file in extension/src/
changes:
cd extension
bun run watch

After each rebuild, click the reload icon for the extension at
chrome://extensions (or use a tool like Extensions Reloader)
and refresh any open tabs to pick up the new content script.
Rule unit tests run under Jest against a
jsdom DOM. They live alongside the source in
extension/src/rules/__tests__/.
cd extension
bun install
bun run test

Filter to a single suite with the standard Jest CLI, e.g.
bun run test -- pii-mask.
The ads-hide rule bundles a snapshot of EasyList's generic element-hiding
selectors (extension/src/rules/easylist-generic.generated.ts, ~13k selectors).
Refresh it when ad-network selectors drift:
cd extension
bun run fetch-easylist # alias for `uv run scripts/fetch_easylist.py`

The generated file is committed to keep builds deterministic and
offline-capable; pre-commit hooks and Biome skip *.generated.* files.
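
As a rough illustration of what "generic element-hiding selectors" means: in Adblock Plus filter syntax, a generic hiding rule is a line that starts with `##`, while domain-specific rules put a domain list before the `##` and exceptions use `#@#`. The sketch below extracts the generic selectors under that reading. It is not the logic of scripts/fetch_easylist.py, and the function name is hypothetical.

```python
def extract_generic_selectors(filter_text: str) -> list[str]:
    """Return the de-duplicated CSS selectors from generic `##` rules."""
    selectors: list[str] = []
    seen: set[str] = set()
    for raw_line in filter_text.splitlines():
        line = raw_line.strip()
        # Comments (!), headers ([Adblock ...]), network filters (||...),
        # domain-specific rules (example.com##...) and exceptions (#@#...)
        # do not start with "##", so they are skipped here.
        if not line.startswith("##"):
            continue
        selector = line[2:]
        if selector and selector not in seen:
            seen.add(selector)
            selectors.append(selector)
    return selectors
```

Preserving first-seen order keeps the output stable between refreshes, which fits the goal of a committed, deterministic generated file.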
Bundle extension/dist/ into a ZIP suitable for uploading via the
Browserbase extensions API:
cd extension
bun run build
bun run package # writes output/extension.zip at the repo root
# or specify an output path / directory:
bun run package -- ~/Downloads/agent-browser-shield.zip

The output/ directory is gitignored.
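
Before uploading, it can be worth checking that the archive is laid out like an unpacked extension, with manifest.json at the archive root rather than nested under a dist/ folder. The sketch below is an optional sanity check, not part of the repo's packaging script. The helper name is hypothetical, and the expectation that the upload wants a root-level manifest is an assumption based on how Chrome loads unpacked extensions.

```python
import zipfile
from pathlib import Path


def has_root_manifest(zip_path: Path) -> bool:
    """Return True if manifest.json sits at the top level of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # namelist() returns archive-relative paths, so a nested manifest
        # would appear as e.g. "dist/manifest.json" and not match.
        return "manifest.json" in archive.namelist()
```

For example, `has_root_manifest(Path("output/extension.zip"))` could be run from the repo root after `bun run package`.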
demo-site/ is a Vite/React mock e-commerce SPA ("RiverMart")
that deliberately packs the threats and dark patterns agent-browser-shield
defends against onto a few pages. Load it with and without the extension to see
the before/after difference.
Live deployment: https://shield-dark-pattern-demo.vercel.app/
To run it locally instead:
cd demo-site
bun install
bun run dev # http://localhost:5173

See demo-site/README.md for the per-page rule
coverage and Vercel deploy instructions.
scripts/agent_task.py runs a
Browserbase agent task via the
Stagehand Python SDK. The script declares its dependencies inline (PEP 723), so
uv will fetch them on first run.
Copy env.sample to .env and fill in the API keys, then:
# Without the extension
uv run scripts/agent_task.py --instruction "Find the top story on HN"
# With the agent-browser-shield extension uploaded and loaded into the session
# (requires running the packaging script above first)
uv run scripts/agent_task.py --with-extension \
--instruction "Find the top story on HN"

scripts/benchmark_run.py compares agent
performance across configurations (extension on/off, model vendor/size, step
budget) over a fixed task set, judges each result inline, and writes a run
bundle under output/results/<run_id>/.
scripts/benchmark_report.py renders an HTML
matrix with per-task side-by-side scenario diffs and a11y-tree comparisons.
# 1. Build + package the extension (only for scenarios with extension: true)
cd extension && bun run build && bun run package && cd ..
# 2. Run the benchmark
uv run scripts/benchmark_run.py \
--scenarios benchmark/scenarios.example.yaml \
--tasks benchmark/tasks.csv \
--concurrency 25 -n 3
# 3. Render the report
uv run scripts/benchmark_report.py --run-id <run_id> --open
# 4. Resume / repair an incomplete run (idempotent)
uv run scripts/benchmark_resume.py --run-id <run_id>

Old run artifacts under output/results/ and output/reports/ are gitignored;
prune them with uv run scripts/clean_artifacts.py (dry-run by default). See
benchmark/README.md for the full workflow, BU Bench
V1 fetch, and trace-bundle diagnostics.
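
To illustrate the kind of comparison the benchmark enables, the sketch below computes a per-scenario success rate from judged results. The record fields (`scenario`, `passed`) and the function name are hypothetical; the actual run-bundle format is documented in benchmark/README.md and may look quite different.

```python
from collections import defaultdict


def success_rate_by_scenario(results: list[dict]) -> dict[str, float]:
    """Map each scenario name to the fraction of its runs judged as passed."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for record in results:
        totals[record["scenario"]] += 1
        if record["passed"]:
            passes[record["scenario"]] += 1
    return {name: passes[name] / totals[name] for name in totals}
```

Because each task may be run several times per scenario, aggregating a rate rather than a single pass/fail gives a fairer comparison between extension-on and extension-off configurations.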
skills/ holds Claude Code skills for installing, configuring,
running tasks against, and diagnosing the extension. See each skill's SKILL.md
for invocation details.
See CONTRIBUTING.md for setup, expectations, and the
contributor-license-agreement workflow. New rules are a great place to start —
extension/src/rules/scarcity-hide.ts is a small worked example.
agent-browser-shield is source-available under
PolyForm Shield 1.0.0. Use it commercially, internally, or for
research at no cost — the only restriction is that you can't use it to build a
product that competes with agent-browser-shield or with a PixieBrix product
built on it. See LICENSING.md for details and how to obtain a
commercial license if you need one.
Please report vulnerabilities privately via GitHub's "Report a vulnerability" form. Do not open a public issue for security problems.