{"skill":{"slug":"chromeskill","displayName":"chrome_skill","summary":"Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically — navigate, click, type, screenshot, extract content, and more. Uses Pu...","description":"---\nname: chrome-ai-action-skill\ndescription: \"Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically — navigate, click, type, screenshot, extract content, and more. Uses Puppeteer (CDP) mode. First use auto-installs npm package and starts the bridge. Chrome is auto-launched if not running.\"\n---\n\n# Chrome AI Action — Browser Automation Skill\n\nAI Agent 浏览器自动化技能。通过 Chrome AI Action (CAA) 桥接服务，以 Puppeteer (CDP) 模式编程控制 Chrome 浏览器，支持导航、点击、输入、截图、内容提取、网络拦截、Cookie 管理、PDF 导出等 60+ 操作。\n\n---\n\n## When to Use / 何时使用\n\n| Scenario | Invoke |\n|---|---|\n| User asks to browse a web page, search, fill forms, extract data | Yes |\n| User needs screenshots of a web page | Yes |\n| User wants to automate browser interactions | Yes |\n| User asks about writing code / debugging (no browser involved) | No |\n\n| 场景 | 调用 |\n|---|---|\n| 用户需要在浏览器中打开网页、搜索、填写表单、提取数据 | 是 |\n| 用户需要网页截图 | 是 |\n| 用户希望自动化浏览器操作 | 是 |\n| 用户问代码/调试相关（不涉及浏览器） | 否 |\n\n---\n\n## ⚠️ CRITICAL: Chinese URL Encoding\n\n> **IMPORTANT**: When constructing URLs with Chinese characters for the `navigate` action, the agent MUST encode the query string values using `encodeURIComponent`. 
The bridge automatically encodes non-ASCII characters in the URL path, but query string values must be pre-encoded by the caller.\n\n> **重要说明**: 调用 `navigate` 时，URL 中如果包含中文字符，智能体必须先用 `encodeURIComponent` 对查询参数值进行编码。例如 `wd=妻子的浪漫旅行` 必须写成 `wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C`。\n\n### Correct / 正确写法\n\n```json\n{\"action\": \"navigate\", \"params\": {\"url\": \"https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C\"}}\n```\n\n### Wrong / 错误写法\n\n```json\n{\"action\": \"navigate\", \"params\": {\"url\": \"https://www.baidu.com/s?wd=妻子的浪漫旅行\"}}\n```\n\n### How to encode in Node.js / 如何在 Node.js 中编码\n\n```javascript\nconst encoded = encodeURIComponent('妻子的浪漫旅行');\n// Result: %E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C\n```\n\n---\n\n## Prerequisites / 前提条件\n\n| Requirement | Check | Auto-resolve |\n|---|---|---|\n| Chrome / Chromium installed | Detected automatically | No (user must install) |\n| Chrome running with CDP | Detected on startup | Yes (auto-launched) |\n| Node.js 18+ | `node --version` | No |\n\n| 要求 | 检查方式 | 自动处理 |\n|---|---|---|\n| 已安装 Chrome / Chromium | 自动检测常用安装路径 | 否（用户需安装） |\n| Chrome 以 CDP 模式运行 | 启动时检测 | 是（自动启动） |\n| Node.js 18+ | `node --version` | 否 |\n\n---\n\n## Startup Protocol / 启动协议\n\nWhen loaded for the first time, the agent MUST run the startup script. The script runs the bridge as a **background child process** — the agent does NOT need to manage the process separately.\n\n首次加载时，AI 智能体必须执行以下启动脚本。脚本会自动在后台启动桥接服务，智能体**无需单独管理进程**。\n\n```bash\nnode <skill_dir>/scripts/startup.js\n```\n\n### What it does / 执行流程\n\n1. **Check if bridge is already running**: `GET /health` on port 9876 → skip if OK\n2. **Ensure npm package installed**: `npm list -g chrome-ai-action` → installs via `npm install -g chrome-ai-action` if missing\n3. **Start the bridge**: `chrome-ai-action --port 9876`, waits for health check\n4. 
**Auto-launch Chrome**: If Chrome not running with CDP, the bridge starts it automatically (cross-platform)\n\n### Environment Variables / 环境变量\n\n| Variable | Default | Description |\n|---|---|---|\n| `CAA_BRIDGE_PORT` | `9876` | Bridge HTTP server port |\n| `CAA_STARTUP_TIMEOUT` | `30000` | Max wait for bridge ready (ms) |\n| `CHROME_PATH` | auto-detect | Custom Chrome executable path |\n| `CHROME_USER_DATA_DIR` | platform-dependent | Chrome profile directory |\n\n---\n\n## API Protocol / 通信协议\n\n**Endpoint**: `http://127.0.0.1:9876/`\n\n### Endpoints / 接口地址\n\n| Method | Path | Description |\n|---|---|---|\n| `GET` | `/health` | Health check — returns bridge & CDP status |\n| `GET` | `/schema` | Full action schema (64+ actions) |\n| `POST` | `/` | Execute action(s) |\n\n### Request Format / 请求格式\n\n```json\n{\"type\": \"action\", \"action\": \"<ACTION>\", \"params\": {...}, \"requestId\": \"optional-id\"}\n```\n\n### Batch Request / 批量请求\n\n```json\n{\"type\": \"batch\", \"actions\": [\n  {\"action\": \"navigate\", \"params\": {\"url\": \"https://example.com\"}},\n  {\"action\": \"getTitle\"}\n]}\n```\n\n### Response Format / 响应格式\n\n```json\n{\"success\": true, \"data\": {...}, \"requestId\": \"req-1\", \"timestamp\": 1712345678901}\n```\n\n### Error Response / 错误响应\n\n```json\n{\"success\": false, \"error\": {\"code\": \"ACTION_ERROR\", \"message\": \"...\"}, \"requestId\": \"req-1\", \"timestamp\": 1712345678901}\n```\n\n---\n\n## Available Actions (64+) / 可用操作 (64+)\n\n### Navigation / 导航\n`navigate`, `goBack`, `goForward`, `reload`, `getUrl`, `getTitle`\n\n### Page Content / 页面内容\n`getText`, `getHtml`, `getLinks`, `getImages`, `getHeadings`, `getMetaTags`, `getFormFields`, `getFocusableElements`\n\n### Element Interaction / 元素交互\n`click`, `type`, `pressKey`, `scroll`, `scrollIntoView`, `findElement`, `focus`, `hover`, `select`\n\n### Data Extraction / 数据提取\n`getValue`, `getAttribute`, `getAttributeAll`, `getBoundingBox`, `getCookies`, 
`getPerformanceMetrics`, `getSelectedValue`, `getSelectOptions`\n\n### JavaScript / JS 执行\n`evaluate`, `injectScript`, `injectCSS`\n\n### Screenshot & Export / 截图与导出\n`screenshot` (PNG/JPEG), `getPdf` (A4/Letter)\n\n### Tab Management / 标签页管理\n`listTabs`, `newTab`, `closeTab`, `switchTab`, `getCurrentTab`\n\n### Waiting / 等待\n`waitForElement`, `waitForTimeout`, `waitForNavigation`\n\n### Cookie Management / Cookie 管理\n`setCookie`, `deleteCookie`\n\n### Network Interception / 网络拦截\n`blockUrls`, `unblockUrls`, `mockResponse`, `getNetworkRequests`, `clearNetworkRequests`\n\n### Storage / 本地存储\n`getLocalStorage`, `setLocalStorage`, `removeLocalStorage`, `clearLocalStorage`\n\n### File Operations / 文件操作\n`uploadFile`, `setInputFiles`, `downloadFile`\n\n### Viewport / 视口\n`getViewport`, `setViewport`\n\n### Console / 控制台日志\n`getConsoleLogs`, `clearConsoleLogs`\n\n### Accessibility / 无障碍\n`getAccessibilityTree`\n\n### Utility / 工具\n`ping`, `connect`, `disconnect`, `getBrowserInfo`, `highlight`, `dispatchEvent`\n\n---\n\n## Typical Workflow / 典型工作流\n\n1. **Navigate**: `navigate` → go to target URL (encode Chinese in query params)\n2. **Wait**: `waitForElement` → wait for key content\n3. **Read**: `getText` / `getHtml` / `getLinks` → understand page\n4. **Interact**: `click` / `type` / `pressKey` → perform actions\n5. **Extract**: `getText` / `screenshot` / `evaluate` → get results\n6. 
**Confirm**: `screenshot` → visually verify\n\n### Example: Search Baidu with Chinese / 百度搜索中文示例\n\n```json\n{\"type\": \"batch\", \"actions\": [\n  {\"action\": \"navigate\", \"params\": {\"url\": \"https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C\"}},\n  {\"action\": \"waitForTimeout\", \"params\": {\"ms\": 2000}},\n  {\"action\": \"getText\"}\n]}\n```\n\n### Example: Full Login Flow / 登录流程示例\n\n```json\n{\"type\": \"batch\", \"actions\": [\n  {\"action\": \"navigate\", \"params\": {\"url\": \"https://example.com/login\"}},\n  {\"action\": \"waitForElement\", \"params\": {\"selector\": \"input[name=username]\", \"timeout\": 10000}},\n  {\"action\": \"type\", \"params\": {\"selector\": \"input[name=username]\", \"value\": \"myuser\"}},\n  {\"action\": \"type\", \"params\": {\"selector\": \"input[name=password]\", \"value\": \"mypassword\"}},\n  {\"action\": \"click\", \"params\": {\"selector\": \"button[type=submit]\"}},\n  {\"action\": \"waitForTimeout\", \"params\": {\"ms\": 3000}},\n  {\"action\": \"getCurrentTab\"}\n]}\n```\n\n---\n\n## Error Handling / 错误处理\n\n| Error Code | Meaning | Resolution |\n|---|---|---|\n| `CDP_NOT_CONNECTED` | Chrome not running with debug port | Bridge auto-launches Chrome, retries every 3s |\n| `ACTION_ERROR` | Action execution failed | Check params, use `getFocusableElements` to find elements first |\n| `INVALID_REQUEST` | Malformed request | Check request format |\n| `PARSE_ERROR` | JSON parse failure | Send valid JSON |\n\n---\n\n## Discovery Tips / 探测提示\n\nWhen you don't know what elements are on a page:\n\n1. `getFocusableElements` → all interactive elements (with positions)\n2. `getFormFields` → all form inputs with metadata\n3. `getLinks` → all links on page\n4. `getHeadings` → understand page structure\n5. 
`getText` → all visible text\n\n---\n\n## References / 参考资料\n\n- `references/bridge-api.md` — Complete API reference with all 64+ actions\n- `references/setup-guide.md` — Detailed setup and troubleshooting\n- `scripts/startup.js` — Startup automation script\n","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":288,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1778426751914,"updatedAt":1779076311648},"latestVersion":{"version":"1.0.0","createdAt":1778426751914,"changelog":"chrome-ai-action-skill 1.0.0 initial release:\n\n- Enables full browser automation via the Chrome AI Action (CAA) bridge using Puppeteer (CDP) mode.\n- Supports 60+ browser actions: navigation, clicking, typing, screenshots, data extraction, network interception, cookie and storage management, PDF export, and more.\n- Automatically installs required npm package and launches the bridge; Chrome is auto-started if not running.\n- Startup, API usage, error handling, and discovery tips clearly documented in English and Chinese.\n- Special guidance for correct URL encoding with Chinese characters in navigation actions.","license":"MIT-0"},"metadata":null,"owner":{"handle":"jami-lin","userId":"s179a0cbz8d1bjgrnrqga46wyx85xd9y","displayName":"Jami-Lin","image":"https://avatars.githubusercontent.com/u/59037276?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780090778165}}