{"skill":{"slug":"long-research","displayName":"Long Research","summary":"[BETA] Deep research that actually reads pages instead of summarizing search results. Tell it how long to research (10 min, 2 hours, all night) and it works...","description":"---\nname: long-research\ndescription: >\n  [BETA] Deep research that actually reads pages instead of summarizing search results.\n  Tell it how long to research (10 min, 2 hours, all night) and it works the full\n  duration — searching, reading every result, following leads, cracking forums,\n  cross-verifying findings, and writing progressively to a research file.\n  Tree-style exploration: each page read spawns new searches, like a human researcher.\n  Enforced read-to-search ratio prevents shallow search-spamming. Wall-clock time\n  commitment — it won't finish early. Self-audit gate blocks delivery until quality\n  checks pass. Works with web_search, web_fetch, and browser-use for JS-heavy sites.\ndependencies:\n  - browser-use\n---\n\n# Long Research\n\n> ⚠️ **BETA** — This skill works as described but is under active development. Review the notes below before installing.\n\n## Dependencies\n\n- **web_search** — any search provider (Perplexity Sonar, Brave, etc.)\n- **web_fetch** — built into most agent frameworks\n- **browser-use** (REQUIRED) — install via `pip install browser-use && browser-use install`. Used for JS-heavy sites, login-gated forums, and retailer pricing. The skill will not function fully without it.\n\n## Privacy & Trust Notes\n\n**browser-use remote mode:**\nThe browser-use cascade tries 3 modes in order: `chromium` (local, free) → `real` (local, free) → `remote` (cloud-hosted, burns API credits). Remote mode sends page content to browser-use.com's cloud infrastructure. If you care about privacy, you can:\n- Remove `remote` from the cascade and only use local modes\n- Run in Interactive mode so the agent asks before escalating to remote\n- The skill works fine with local-only browser modes for most sites\n\n**Login-gated forums and browser profiles:**\nThe skill includes patterns for extracting content from login-gated forums using `--profile` flags. Profiles persist cookies locally on your machine in browser-use's default profile directory. The agent does NOT attempt automated login — it uses browser-use's cookie persistence from previous manual sessions. If you haven't logged into a forum manually via browser-use before, the agent won't have access. No credentials are stored or transmitted by this skill.\n\n**Filesystem writes:**\nResearch output is written to `research/[topic]-[date].md` in your agent's working directory. Progressive writes happen every 3-5 tool calls as crash recovery. Files stay on your local machine — nothing is uploaded. If running in a shared environment, configure your agent's working directory to a safe location.\n\n**Sub-agent prompt injection surface:**\nThe skill mandates pasting full instructions into sub-agent task prompts. This means the entire SKILL.md (including your research query) is sent to whatever model provider your sub-agent uses. If you use external/remote model providers, be aware that your research queries and the full skill text are transmitted to those services. This is standard for any agent skill that delegates to sub-agents — but worth noting if your research topics are sensitive.\n\n**Recommended for first-time users:**\n- Run in **Interactive** mode (not Autonomous) until you're comfortable with the skill's behavior\n- Start with a short duration (10 min) to see how it works\n- Review the research file output before trusting findings\n\n## Activation\n\nWhen the user invokes this skill (e.g. \"do long research\", \"research X\", \"pull up long research\"), **immediately start the pre-flight checklist**. Do NOT ask \"what topic?\" separately — go straight into the pre-flight questions. If the user already provided a topic in their message, pre-fill it and ask the remaining questions.\n\n## Pre-Flight Checklist (MANDATORY — NO EXCEPTIONS)\n\nBefore starting ANY research, confirm all of these with the user:\n\n1. **Topic** — What exactly to research. Get specific. If not provided yet, ask now.\n2. **Duration** — How long to spend. Minimum 10 minutes, can be hours.\n3. **Autonomy** — Choose one. Default: Autonomous.\n   - **Autonomous** — No questions, log everything, report at end.\n   - **Interactive** — May ask clarifying questions during research. Pauses for answers.\n   - **Interactive-continue** — May ask clarifying questions, but if user doesn't reply within ~2 minutes, continue research with best judgment. Never block on silence.\n4. **Tools** — List in priority order (highest priority first). The plan block MUST reflect this ordering. Default priority:\n   1. **web_search** — discovery, overviews, forum consensus\n   2. **web_fetch** — reading specific review/forum/article pages\n   3. **browser-use** (REQUIRED dependency) — retailer pricing (Amazon, etc.), JS-heavy sites, anything web_fetch can't reach. See `references/browser-use-patterns.md`\n   \n   User may reorder (e.g., browser-use first for pricing-heavy research). Always include all three in the plan block — never omit browser-use.\n5. **Scheduling** — Now or delayed? If delayed, note time + timezone.\n6. **Output** — Where to deliver (this chat, specific topic, file only) and format (summary, full report, comparison table).\n7. **Clarifying questions** — Ask anything about the user's situation BEFORE starting. Don't assume hardware, location, budget, preferences, etc.\n\n⛔ **GATE: You MUST post the plan block below and receive explicit user approval before making ANY research tool calls.** Do not start research based on implied approval, partial answers, or \"just get going\" energy. The plan block IS the gate.\n\nPost this and wait for approval:\n```\n📋 Research Plan\n• Topic: [what]\n• Duration: [X] minutes/hours\n• Mode: Autonomous / Interactive / Interactive-continue\n• Tools: [list in priority order, e.g. web_search → web_fetch → browser-use]\n• Start: Now / [scheduled time]\n• Output: [where and format]\n• Questions: [anything unclear about user's situation]\n\nProceed? (yes to start)\n```\n\nIf the user answers clarifying questions but doesn't say \"proceed/yes/go\", re-post the updated plan and ask again.\n\n---\n\n## Spawning (CRITICAL — read this before delegating to a sub-agent)\n\n⛔ **The sub-agent MUST have the full skill instructions in its task prompt.** The root cause of most research failures is the orchestrator paraphrasing the skill instead of injecting it. A sub-agent that doesn't read this file will ignore every gate, every ratio, every enforcement rule.\n\n### How to spawn a research sub-agent:\n\n1. **Read this entire SKILL.md** into context (you're already doing this if you're reading this).\n2. **Construct the task prompt** with these MANDATORY sections in this order:\n\n```\nTASK PROMPT TEMPLATE (paste in this order — critical rules at TOP and BOTTOM):\n─────────────────────────────────────────────────────────────────────────────\n## ⛔ CRITICAL RULES (read first, enforce always)\n1. READ > SEARCH. After Seed, you cannot search unless you've read something since your last search.\n2. NEVER start synthesis while time remains. Run `date +%s` and check END_TS.\n3. browser-use cascade: chromium → real → remote. Try ALL 3 before giving up on any source.\n4. web_search returns URLS, not answers. Ignore Perplexity's synthesis text. Extract only the citation URLs.\n5. \"Not found\" IS a valid finding. If you can't find what was asked, say so honestly. Do NOT answer an adjacent question.\n\n## Research Task\n[The user's EXACT question, quoted verbatim. Do not paraphrase, soften, or broaden it.]\n\n## Task Anchor (re-read every 5 tool calls)\nYour job is to answer THIS question: \"[exact question again]\"\nEvery 5 tool calls, ask yourself: \"Does my last action directly serve THIS question?\"\nIf no → STOP current branch → re-orient.\n\n## User Context\n[Any relevant context — location, timeline, constraints. Keep separate from the task.]\n\n## Research Rules\n[Paste the FULL Execution section from SKILL.md — the complete Read-Driven Loop,\nbrowser-use Cascade, Forum-Cracking Playbook, Progressive Writing, Time Enforcement,\nNegative Results, Self-Audit, etc.]\n\n## ⛔ REMINDER (read last, enforce always)\n- You MUST read more pages than you search. Track: reads vs searches.\n- You MUST try all 3 browser modes before giving up.\n- You MUST NOT start synthesis while time remains.\n- You MUST deliver honest negative results, not drift to adjacent questions.\n- You MUST run the self-audit before delivering.\n─────────────────────────────────────────────────────────────────────────────\n```\n\n3. **Never summarize the rules.** Paste them. The sub-agent needs the actual enforcement language.\n4. **Quote the user's question verbatim** in both Task and Task Anchor. Do not rephrase \"find real life experience where X happened\" into \"research whether X is common.\"\n5. **Put critical rules at both TOP and BOTTOM** of the prompt — primacy and recency effects in attention mean middle sections get ignored.\n\n### Common spawning mistakes:\n- ❌ \"Research X for 10 minutes following the long-research skill\" — the sub-agent doesn't have the skill\n- ❌ Paraphrasing the question — changes what the agent optimizes for\n- ❌ Adding your own interpretation — \"Focus on category A\" when user said \"any category\"\n- ❌ Omitting browser-use cascade details — agent won't know to retry\n- ❌ Rules only in the middle of a long prompt — agent attention drops off\n- ✅ Critical rules at top AND bottom + full rules in middle + exact question quoted\n\n---\n\n## Execution\n\n### Where to Run\n- **Dedicated topic or sub-agent** — never in main session\n- Sub-agent for fully autonomous tasks\n- Dedicated topic if interactive (so user can engage)\n\n### The Read-Driven Loop (replaces the old \"Execution Flow\")\n\nThe fundamental architecture: **reading pages is the default action. Searching is only allowed when the URL queue is empty.** This prevents the search-addiction pattern where agents fire 15 searches and read 3 pages.\n\n```\n1. SETUP\n   ├─ START_TS=$(date +%s)\n   ├─ END_TS=$((START_TS + DURATION_SECONDS))\n   ├─ SEARCH_COUNT=0, READ_COUNT=0, WRITE_COUNT=0\n   ├─ Create URL_QUEUE (empty list)\n   └─ Post: \"🚀 Research started. Deadline: $(date -d @$END_TS '+%H:%M:%S')\"\n\n2. SURVEY PHASE (first 30% of committed time — BREADTH before depth)\n   ├─ Goal: IDENTIFY as many candidates/angles as possible. Do NOT deep-dive yet.\n   ├─ Run 3-5 broad searches with different angles/terminology\n   ├─ For each search: extract citation URLs + candidate names/items mentioned\n   ├─ Build a CANDIDATE LIST: every distinct option/entity mentioned across all sources\n   ├─ ⛔ GATE: Write #1 to file (Task Anchor + FULL candidate list + URL queue)\n   ├─ CANDIDATE GATE: You MUST have identified ≥10 candidates before deep-diving ANY of them.\n   │   If topic naturally has fewer (e.g., \"best 3 X\"), adjust to ≥2x the expected answer count.\n   ├─ WRITE_COUNT += 1\n   └─ Post: \"📋 Survey complete. [N] candidates identified: [list]. Now deep-diving.\"\n   \n   Example: Don't stop at the first 3 options you find. If researching [Topic], survey should\n   uncover at least 10-20 candidates across different categories, brands, or approaches.\n   THEN prioritize which to deep-dive based on relevance to the question.\n\n3. MAIN LOOP (repeat until time runs out)\n   │\n   ├─ IF URL_QUEUE is not empty:\n   │   ├─ Pop next URL from queue\n   │   ├─ RELEVANCE CHECK: Is this URL likely to contain what the Task Anchor asks for?\n   │   │   If clearly irrelevant (e.g., beginner guide when searching for expert anecdotes): SKIP, log as \"pruned: [reason]\"\n   │   ├─ READ it (web_fetch first, browser-use cascade if fails)\n   │   ├─ After reading, assess: \"Did this page help answer the Task Anchor?\" \n   │   │   If yes: READ_COUNT += 1 (productive read)\n   │   │   If no: DEAD_END_COUNT += 1 (log as dead end, don't count as productive)\n   │   ├─ Extract: findings + new URLs → add new URLs to queue\n   │   ├─ If read spawns a question that needs a NEW search:\n   │   │   └─ ALLOWED (because you just read something)\n   │   │       ├─ web_search → extract citation URLs → add to queue\n   │   │       └─ SEARCH_COUNT += 1\n   │   └─ Continue to next iteration\n   │\n   ├─ IF URL_QUEUE is empty AND you need more:\n   │   ├─ web_search with NEW terms (not repeating old queries)\n   │   ├─ SEARCH_COUNT += 1\n   │   ├─ Extract URLs → add to queue\n   │   └─ Go back to reading from queue\n   │\n   ├─ Every 3-5 tool calls: write findings to file, WRITE_COUNT += 1\n   │\n   ├─ Every 5 tool calls: \n   │   ├─ Re-read Task Anchor — am I still on-target?\n   │   ├─ Run: NOW=$(date +%s); REMAINING=$((END_TS - NOW))\n   │   ├─ Post: ⏱️ elapsed/committed (remaining) | 🔍 S | 📄 R | 📝 W | 🪦 D (dead ends) | R>S? yes/no\n   │   └─ Verify: READ_COUNT > SEARCH_COUNT? If not, STOP searching, READ.\n   │\n   ├─ ⛔ BROWSER-USE DEADLINE: By 50% of committed time, you MUST have attempted\n   │   at least 1 browser-use call. If you reach the halfway mark without one,\n   │   STOP everything and run a browser-use cascade on your most promising blocked URL.\n   │\n   ├─ ⛔ NO IDLE WAITING — NEVER USE `sleep` EXCEPT FOR browser-use PAGE LOADS (max 3s):\n   │   If you have time remaining, you MUST be making productive tool calls.\n   │   Running `sleep 60` or any sleep > 3s to fill time is a HARD FAIL.\n   │   \n   │   Before you even think about sleeping, run this MANDATORY checklist:\n   │   □ Have I tried the top 3 forums for this domain?\n   │   □ Have I tried old.reddit.com web_fetch on at least 1 subreddit?\n   │   □ Have I tried Google cache on at least 1 blocked URL?\n   │   □ Have I searched YouTube comments?\n   │   □ Have I searched in non-English languages (if relevant)?\n   │   □ Have I tried older threads (2020, 2021, 2022)?\n   │   □ Have I tried alternative community forums beyond the obvious ones?\n   │   □ Have I cross-verified my existing findings with a second source?\n   │   If ANY box is unchecked → DO THAT instead of sleeping.\n   │\n   └─ TIME CHECK: if $(date +%s) >= END_TS → exit loop, go to SYNTHESIS\n\n4. SYNTHESIS (only after time is up)\n   ├─ ⛔ GATE: Run date +%s. Print result. If NOW < END_TS, go back to step 3.\n   ├─ Write final version of research file\n   ├─ Include: Task Anchor, Executive Summary, Research Tree, Findings, Sources, Verification\n   └─ WRITE_COUNT += 1\n\n5. SELF-AUDIT (hard gate — blocks delivery)\n   ├─ Run every checklist item (see Self-Audit section below)\n   ├─ Fix any failures\n   └─ Write Process Compliance section\n\n6. DELIVER\n   ├─ Chat summary (<500 words)\n   ├─ Link to research file\n   └─ Final elapsed time from date +%s\n```\n\n**Why this works better than a ratio rule:**\n- The ratio is *architecturally enforced*: you can only search when the queue is empty OR after completing a read.\n- Searching is the exception, reading is the default.\n- The agent can't chain 5 searches in a row because the loop structure doesn't allow it.\n- The READ_COUNT > SEARCH_COUNT check every 5 calls catches any drift.\n\n### The Search Discipline\n\n**web_search returns URLs, not answers.**\n\nWhen web_search returns a response from Perplexity/Sonar:\n1. **Extract ONLY the citation URLs** from the response.\n2. **Add those URLs to your queue.**\n3. **Ignore the synthesized text.** Do not read it. Do not cite it. Do not base any finding on it.\n\nWhy: Perplexity fabricates quotes, hallucinates thread URLs, and presents unverified claims as consensus. If you read its synthesis, you'll unconsciously treat it as a finding. Don't read it. URLs only.\n\n**After Seed phase, you earn each search by reading:**\n- ✅ READ → SEARCH → READ → SEARCH (alternating)\n- ✅ READ → READ → READ → SEARCH (reading more is always fine)\n- ❌ SEARCH → SEARCH (never, unless in Seed)\n- ❌ SEARCH → SEARCH → READ (too late, you already chained)\n\n**Hard search cap:** Maximum searches = 1.5x committed minutes. For 10 min = max 15 searches.\nIf you hit the cap, you are BLOCKED from searching for the rest of the session. Read only.\nAt any checkpoint: if SEARCH_COUNT > 2x READ_COUNT → BLOCKED from searching until reads catch up.\n\n**PDF URLs:** If a URL ends in `.pdf`, use browser-use to render it. web_fetch returns binary garbage for PDFs. Skip PDFs entirely if browser-use is unavailable.\n\n### Source Balance: User Reports vs Official Research (MANDATORY)\n\n**For grey-area topics** (supplements, nootropics, off-label treatments, emerging tech, anything without large studies or established consensus):\n- User reports ARE the data. The person on a forum who's been using a product for 6 months IS a primary source.\n- **Hard rule: ≥40% of productive reads must be forum/user-report sources** (not academic papers, not marketing sites, not review articles).\n- Track: FORUM_READ_COUNT separately from STUDY_READ_COUNT.\n- At every 5-call checkpoint: `Forum reads / Total reads >= 0.4?` If not → next reads MUST be forums.\n- Forums to prioritize: Reddit (relevant subreddits), StackExchange, specialized community forums for the topic, niche discussion boards, practitioner blogs with comment sections\n\n**For well-studied topics** (established science, mainstream products):\n- Official research takes priority. Forum reads still valuable but ratio can be 30/70.\n\n### Adversarial Thinking (MANDATORY — prevents confirmation bias)\n\nFor EVERY candidate/option the research identifies as \"good\" or \"recommended\":\n1. **Search for the negative case.** Literally search: `\"[candidate] dangers\" OR \"[candidate] problems\" OR \"[candidate] risks\" OR \"[candidate] didn't work\"`\n2. **Follow the mechanism chain.** If something works by mechanism X, ask: \"What ELSE does mechanism X do? Could it cause harm?\"\n   - Example: If X promotes growth/healing via mechanism Y → Does Y also have unwanted side effects? Search it.\n   - Example: If X improves metric A → Does it also worsen metric B? Search it.\n   - Example: If X suppresses process P → Does that suppression cause downstream problems? Search it.\n3. **Devil's advocate search quota:** At least 1 adversarial search per 3 candidates identified. If you found 9 options, you need ≥3 \"what could go wrong\" searches.\n4. **Write a \"Risks & Concerns\" section** that is at LEAST 25% the length of the \"Benefits\" section. If your benefits section is 2000 words and risks is 200 words, you're not being honest.\n\n### Task Anchoring (MANDATORY — prevents drift)\n\nThe #1 failure mode is **task drift** — answering an adjacent, easier question.\n\n**Mechanism:**\n1. Write the user's **exact question** (verbatim) at the top of the research file under `## Task Anchor`.\n2. Every 5 tool calls, re-read it and ask: \"Does my last action directly help answer THIS question?\"\n3. If no → STOP the current branch → re-orient.\n\n**Examples of drift:**\n- User asks: \"find real life experience where [specific event] happened\"\n- ❌ Drift: researching general requirements (answers \"what's needed\" not \"who experienced it\")\n- ❌ Drift: reading official policy documents (answers \"what's the rule\" not \"what happened\")\n- ✅ On-target: finding a forum thread where someone says \"this happened to me: [specific event]\"\n\n**The test:** If your findings say \"according to official policy...\" instead of \"user X on forum Y reported that...\" — you've drifted.\n\n### Negative Results (MANDATORY — don't panic-pivot)\n\nSometimes the answer is: **\"I looked hard and didn't find it.\"**\n\nThis is a VALID research outcome. Do NOT:\n- ❌ Pivot to answering an adjacent question to have \"something to show\"\n- ❌ Present policy/official guidance as a substitute for the anecdotes you couldn't find\n- ❌ Inflate weak leads into findings\n- ❌ Quote Perplexity synthesis as if it were a real source\n\nInstead:\n- ✅ Document exactly what you searched for and where (queries + forums + pages read)\n- ✅ State clearly: \"After [N] searches and [M] page reads across [forums], I found zero anecdotes matching [exact question]\"\n- ✅ Report adjacent findings separately: \"While I didn't find X, I did find Y which is related but distinct\"\n- ✅ Assess WHY you might not have found it: rare event? wrong search terms? content behind paywalls? topic not discussed publicly?\n- ✅ Suggest what would need to happen to find it: \"This might require searching [specific forum] with a logged-in account\" or \"This may be too rare for public discussion\"\n\n**The summary for a negative result should lead with the negative.** Don't bury it under adjacent findings.\n\n---\n\n## Tool Rules\n\n### General\n- **Retailer sites (Amazon, Best Buy, etc.)**: ALWAYS browser-use with `--profile`. web_fetch cannot scrape Amazon.\n- **Forums/reviews**: web_fetch first, browser-use cascade if blocked.\n- **Discovery**: web_search (for URLs, not answers).\n- **browser-use is a hard requirement.** If not installed: `pip install browser-use && browser-use install`.\n\n### browser-use Cascade (try ALL 3 — no exceptions)\n\nbrowser-use has 3 browser modes. You MUST try them **in order**. Do NOT give up after one failure.\n\n**Use this exact bash pattern:**\n```bash\nURL=\"https://example.com\"\nSESSION=\"research\"\n\n# Attempt 1: chromium (free)\necho \">>> Trying chromium...\"\nbrowser-use --session $SESSION --browser chromium open \"$URL\" 2>&1 | tail -5\n# If \"url:\" appears in output → SUCCESS\n\n# Attempt 2: real (free)  \necho \">>> Trying real...\"\nbrowser-use --session $SESSION --browser real open \"$URL\" 2>&1 | tail -5\n\n# Attempt 3: remote (paid, last resort)\necho \">>> Trying remote...\"\nbrowser-use --session $SESSION --browser remote open \"$URL\" 2>&1 | tail -5\n```\n\n**Run all 3 in sequence. Stop at the first success. Log all attempts:**\n```\n[BROWSER CASCADE: {url}]\n  chromium: SUCCESS/FAIL (reason)\n  real: SUCCESS/FAIL (reason)\n  remote: SUCCESS/FAIL (reason)\n```\n\n⛔ If you get an **import error** (e.g., `BrowserConfig` not found), that's a code bug, not a site block. Fix the code (use `BrowserProfile`) and retry. Do NOT count code bugs as cascade failures.\n\n⛔ NEVER give up after 1 failure. NEVER skip to \"find alternative source\" without trying all 3 modes. If you log only 1 attempt, the self-audit will fail you.\n\n### web_fetch Failure Escalation\nWhen web_fetch returns 403, empty content, Cloudflare, or broken HTML:\n1. Run the **full browser-use cascade** (all 3 modes)\n2. Log: `[web_fetch BLOCKED: {url} → cascade: chromium=X, real=Y, remote=Z]`\n3. Only after all 3 fail: find an alternative source (but log that you tried)\n\n### Forum-Cracking Playbook\n\nForums are where real experiences live. They're also the hardest to access. Use these strategies IN ORDER:\n\n**Reddit (NOTE: as of 2026, Reddit blocks most automated access including old.reddit.com):**\n1. `old.reddit.com` via web_fetch (try first but expect 403)\n2. Google cache: web_search `cache:reddit.com/r/[sub]/comments/[id]`\n3. Reddit aggregation sites that compile Reddit reviews by topic\n4. browser-use cascade on `reddit.com` (last resort)\n5. web_search `site:reddit.com \"[exact phrase]\"` — use citation URLs, NOT Perplexity synthesis\n6. **Fallback forums when Reddit fails:** StackExchange, specialized community forums for the topic, niche discussion boards\n\n**Login-gated forums:**\n1. web_fetch (sometimes works for newer threads)\n2. browser-use chromium + JavaScript eval to extract post content:\n   ```bash\n   browser-use --session forum --browser chromium open \"https://forum.example.com/thread/[ID]\"\n   # Wait 2s for page load\n   browser-use --session forum --browser chromium eval \"Array.from(document.querySelectorAll('[data-role=commentContent], .post-body, .message-content')).map((e,i) => 'POST ' + i + ': ' + e.innerText.substring(0,600)).join('\\n===\\n')\"\n   ```\n3. This eval pattern extracts post content from login-gated pages — adapt selectors per forum.\n\n**Blocked forums:**\n1. web_fetch (may fail — if so, skip to browser-use)\n2. browser-use cascade\n3. Google cache\n\n**YouTube comments:**\n1. web_fetch on the video page (gets description + some comments)\n2. browser-use for full comment sections\n\n**Domain-specific community forums:**\n1. Identify the top 3-5 forums for the research topic via web_search\n2. web_fetch with /threads/ or /topic/ URLs, search via `site:[forum-domain]`\n3. browser-use chromium for search, eval to extract post content\n4. Look for practitioner blogs, review aggregation sites, and niche wikis\n5. Academic databases for study abstracts when relevant (web_fetch works for most)\n\n**Vendor research with browser-use:**\n- Use browser-use eval to extract product catalogs with prices\n- Example: `document.querySelectorAll('.product-miniature')` → map name + price\n- Always capture full catalog, not just target products\n\n**General fallback for any blocked source:**\n1. Google `cache:[url]` via web_fetch\n2. Wayback Machine: `https://web.archive.org/web/[url]`\n3. Google `site:[domain] \"[exact phrase]\"` to find alternative pages on the same site\n\n### Anecdote-Hunting Searches (for \"find real experiences\" tasks)\n\nWhen the user asks for **real-life experiences**, first-person accounts, or anecdotes:\n\n**Use personal-language queries:**\n```\n\"my experience with\" [topic] site:[relevant-forum]\n\"I tried\" OR \"I switched to\" OR \"after using\" [topic] forum\n\"my experience\" [topic] \"what happened\" forum\n\"I had to\" OR \"they told me\" OR \"turns out\" [topic keywords]\n```\n\n**DO NOT use policy-language queries:**\n```\n\"[topic] requirements overview\" ← returns official pages, not stories\n\"what is needed for [topic]\" ← returns guides, not experiences\n```\n\n**Source priority for anecdotes:**\n1. Specialized forums for the topic (use browser-use eval pattern if login-gated)\n2. Reddit (old.reddit.com → Google cache → browser-use)\n3. Niche community forums and discussion boards\n4. Q&A sites (real expert answers to real questions)\n5. YouTube comments\n6. Google `site:` searches with first-person language\n\n**The test:** Your findings should contain direct quotes from real people, with usernames and thread URLs. If your report says \"typically X happens...\" instead of \"user @JohnDoe42 on [Forum] reported: '...'\" — you've answered a different question.\n\n---\n\n## Source Verification (MANDATORY)\n\n- **Never trust Perplexity/Sonar synthesis.** Extract URLs only. Ignore the text.\n- When Sonar claims \"Reddit users say X\": treat it as a LEAD to find URLs, not a source.\n- Real user quotes must come from pages you actually fetched.\n- If you can't verify a forum claim: mark it `[⚠️ unverified forum consensus]`\n- ⛔ **FIREWALL:** Unverified claims CANNOT appear in Recommendation or Executive Summary. They can appear in Detailed Findings with ⚠️ tag only.\n\n---\n\n## Progressive Writing (MANDATORY — even for short sessions)\n\n⛔ **GATE: First write MUST happen after Seed phase.** You CANNOT proceed to tree traversal until you've written initial findings, URL queue, and Task Anchor to the research file.\n\n- **Write #1 (GATE):** After Seed — Task Anchor, URL queue, search terms. No more tool calls until this write lands.\n- **Subsequent writes:** Every 3-5 tool calls — append new findings in batches.\n- **Final write:** Synthesis phase — executive summary, tree, recommendation.\n- ⛔ **NEVER accumulate all findings and write once at the end.** If session crashes, all data is lost.\n- **≤15 min sessions:** minimum 3 writes. **>15 min:** minimum 1 write per 5 minutes.\n- **Count writes.** Every status update: `📝 [N] file writes`. If N=0 after 4+ tool calls → STOP and write NOW.\n\n---\n\n## Time Enforcement (WALL-CLOCK — NO GUESSING)\n\n**Setup:**\n```bash\nSTART_TS=$(date +%s)\nEND_TS=$((START_TS + DURATION_SECONDS))\necho \"START_TS=$START_TS\" > /tmp/research_time\necho \"END_TS=$END_TS\" >> /tmp/research_time\necho \"Deadline: $(date -d @$END_TS '+%H:%M:%S UTC')\"\n```\n\n**Every status check:**\n```bash\nsource /tmp/research_time\nNOW=$(date +%s)\necho \"Elapsed: $(( (NOW - START_TS) / 60 )) min | Remaining: $(( (END_TS - NOW) / 60 )) min\"\n```\n\n**Status format:**\n```\n⏱️ [elapsed] / [committed] ([remaining] left) | 🔍 [S] searches | 📄 [R] reads | 📝 [W] writes | R>S? [yes/no]\n```\n\nThe `R>S?` field: reads must exceed searches. If it says \"no\", stop searching and read.\n\n### Synthesis Gate (HARD — prevents finishing early)\n\n⛔ **Before starting synthesis, you MUST run this check:**\n```bash\nsource /tmp/research_time\nNOW=$(date +%s)\nif [ $NOW -lt $END_TS ]; then\n  echo \"BLOCKED: $((( END_TS - NOW ) / 60)) minutes remaining. Go back to research.\"\nelse\n  echo \"TIME UP: Proceed to synthesis.\"\nfi\n```\n\n**If the check prints \"BLOCKED\":** You CANNOT synthesize. Go back to the Main Loop. Use the remaining time for:\n- New search terms (synonyms, adjacent topics, different languages)\n- Older threads (add year ranges: \"2020\", \"2021\", etc.)\n- Adjacent communities (different forums)\n- Cross-verifying existing findings with additional sources\n- Following up on leads you deprioritized earlier\n- YouTube comments on relevant videos\n\n**If the check prints \"TIME UP\":** Proceed to synthesis.\n\n---\n\n## Context Management (see `references/context-management.md`)\n- Write findings to `research/[topic]-[date].md` progressively.\n- Chat context gets summaries only — detail goes to file.\n- Write checkpoints every 5-10 minutes (recovery anchors after compaction).\n- Long sessions WILL compact — that's expected and fine.\n- Never stop research due to context size.\n\n---\n\n## Pricing Data (when applicable)\nIf the research involves products/purchases:\n- **MUST hit at least one retailer** via browser-use for real pricing.\n- web_search price summaries are estimates, NOT verified.\n- Note currency, tax inclusion (e.g., regional pricing differences such as VAT inclusion vs exclusion), date of check.\n\n---\n\n## Output\n\n### Research File (`research/[topic]-[date].md`)\n```markdown\n# [Topic] Research\n**Date:** [timestamp]\n**Duration:** [actual time spent]\n**Tools used:** [list]\n**Searches:** [count] | **Deep reads:** [count] | **Ratio R:S:** [ratio]\n\n## Task Anchor\n> [User's exact question, quoted verbatim]\n\n## Executive Summary\n[3-5 bullet points — ONLY ✅ verified findings]\n[If negative result: lead with \"After X searches and Y page reads, no [specific thing] was found.\"]\n\n## Research Tree\n[SEARCH → READ → spawned → READ → etc. Shows traversal.]\n\n## Detailed Findings\n### [Category 1]\n[findings with source URLs, ✅/⚠️ markers]\n\n## Comparison Table (if applicable)\n\n## Sources\n[numbered list of URLs actually fetched — NOT search result URLs]\n\n## Source Verification\n[✅ verified (fetched actual page) | ⚠️ unverified (search synthesis) | 🔗 direct quote]\n\n## Confidence Notes\n[what's solid vs uncertain]\n\n## Recommendation\n[ONLY ✅ verified sources. If negative result, recommend next steps to find the answer.]\n\n## Process Compliance\n[Self-audit pass/fail for each gate + remediation taken]\n```\n\n### Chat Summary\nPost concise summary (<500 words) when done. Link to full research file.\n\n---\n\n## Post-Research Self-Audit (HARD GATE — blocks delivery)\n\n⛔ **You CANNOT post the chat summary until every box is checked.** Run the audit, fix failures, THEN deliver.\n\n### Process Gates\n- [ ] START_TS and END_TS set with `date +%s`\n- [ ] All time reports used real `date +%s` (no guessing)\n- [ ] Time commitment honored — verify: final `date +%s` ≥ END_TS\n- [ ] Synthesis Gate bash check was run and printed \"TIME UP\"\n- [ ] Task Anchor written to file and re-checked every 5 tool calls\n- [ ] **No idle waiting** — no sleep loops or timer countdowns. Every minute had active tool calls.\n\n### Read/Search Balance\n- [ ] READ_COUNT > SEARCH_COUNT — post actual numbers: `reads: [R] / searches: [S]`\n- [ ] No SEARCH→SEARCH chains outside Seed phase\n- [ ] Every search after Seed was preceded by at least 1 read\n- [ ] **Dead ends tracked** — reads on irrelevant pages logged as dead ends, not counted as productive reads. Post: `🪦 [D] dead ends`\n\n### Writing Gates\n- [ ] Write #1 happened after Seed (before tree traversal)\n- [ ] ≥3 writes for ≤15 min sessions, or 1/5min for longer\n- [ ] Research tree logged in file\n- [ ] Every status update included `📝 [N]` and `R>S? [yes/no]`\n\n### Tool Usage Gates (EVIDENCE REQUIRED — not honor system)\n⛔ **For each browser-use gate, you must paste the actual command and its output.** Checking a box without evidence is audit fraud.\n\n- [ ] browser-use cascade attempted — **PASTE the 3 commands and outputs below:**\n  ```\n  chromium attempt: [paste command + result]\n  real attempt: [paste command + result]  \n  remote attempt: [paste command + result]\n  ```\n- [ ] browser-use attempted before 50% of committed time (paste timestamp proof)\n- [ ] Every web_fetch failure escalated to full cascade — **list each 403 and what you did:**\n  ```\n  [url] → 403 → cascade: chromium=[result], real=[result], remote=[result]\n  ```\n- [ ] Reddit 403 → tried old.reddit.com first (paste the web_fetch attempt), then cascade\n- [ ] At least 1 successful browser-use interaction in the session\n\n### Source Quality Gates\n- [ ] Zero Perplexity synthesis text in Executive Summary or Recommendation\n- [ ] All sources have URLs from pages actually fetched\n- [ ] Claims marked ✅ verified / ⚠️ unverified — none unmarked\n- [ ] Recommendations backed ONLY by ✅ verified sources\n\n### Task Fidelity Gate\n- [ ] Re-read Task Anchor. Does the Executive Summary directly answer the exact question?\n- [ ] If user asked for anecdotes: report contains direct quotes with usernames + thread URLs\n- [ ] If mostly policy/guidance when anecdotes were asked: FAIL → go find stories\n- [ ] If negative result: summary LEADS with the negative, doesn't bury it under adjacent findings\n\n### Read Quality Gate\n- [ ] Count reads that actually helped answer the Task Anchor vs dead-end reads\n- [ ] At least 50% of reads must be on-topic (relevant to the actual question)\n- [ ] Dead ends logged with reason: `🪦 [url] — [why irrelevant]`\n- [ ] URL queue was filtered for relevance before reading (didn't blindly read every Perplexity citation)\n\n### Survey Breadth Gate\n- [ ] Survey Phase completed in first 30% of time\n- [ ] Candidate list has ≥10 items (or ≥2x expected answer count for smaller topics)\n- [ ] Candidates written to file BEFORE deep-diving any single one\n- [ ] Deep-dive prioritization justified (why these candidates over others)\n\n### Source Balance Gate (for grey-area topics)\n- [ ] FORUM_READ_COUNT / total reads ≥ 0.4 (40% user reports)\n- [ ] Post actual counts: `Forum reads: [F] / Study reads: [S] / Ratio: [F/(F+S)]`\n- [ ] Real user quotes with usernames included in findings\n- [ ] If ratio < 0.4: explain why (e.g., \"no forums exist for this topic\")\n\n### Adversarial Thinking Gate\n- [ ] At least 1 adversarial search per 3 candidates (e.g., 9 candidates = ≥3 \"dangers/risks\" searches)\n- [ ] Each recommended candidate has a \"Risks & Concerns\" subsection\n- [ ] Mechanism chains explored (if X promotes Y for benefit, does Y also cause Z harm?)\n- [ ] Risks section is ≥25% the length of Benefits section\n- [ ] Post: `Devil's advocate searches: [N] / Candidates: [M] / Ratio: [N/ceil(M/3)]`\n\n### Remediation (fix before delivering)\n| Failure | Fix |\n|---------|-----|\n| READ_COUNT ≤ SEARCH_COUNT | Read more pages now |\n| Missing writes | Write current findings immediately |\n| Unverified claims in recommendation | Remove or verify them |\n| Time not honored | Continue researching |\n| browser-use cascade incomplete | Run remaining modes now |\n| Task drift | Re-search with original question terms |\n| No anecdotes when requested | Forum searches with first-person language |\n| Only 1-2 browser modes tried | Try the remaining modes on any pending source |\n\n---\n\n## Anti-Patterns\n\n### Architecture violations\n- ❌ **Search addiction** — firing web_search to \"see what comes up\" instead of reading pages in the queue. The queue is your job. Search only when it's empty or a read spawns a new question.\n- ❌ **Reading Perplexity synthesis** — the synthesized text from web_search is noise. Extract URLs. Ignore text. If your findings cite \"according to Perplexity...\" you've failed.\n- ❌ **SEARCH→SEARCH chains** after Seed — architecturally forbidden. Each search must be earned by a read.\n- ❌ **Finishing early** — the Synthesis Gate bash check prevents this. Run it. **TIME IS A HARD CONTRACT.** If committed to 10 minutes, you work for 10 minutes. Finishing at 4 min on a 10-min commitment = CRITICAL FAILURE. If you think you're \"done\" early, you're NOT — run the mandatory checklist, search deeper, cross-verify, find more threads. Check `date +%s` against END_TS before ANY final output.\n- ❌ **Idle waiting / sleep commands** — NEVER use `sleep` for more than 3 seconds (browser-use page loads only). Running `sleep 60` to fill remaining time is a HARD FAIL that the audit WILL catch. The checklist of \"things to do when done\" is mandatory — run through it before claiming you're out of options.\n- ❌ **Garbage reads** — reading irrelevant pages just to inflate READ_COUNT. A beginner tutorial doesn't count as a \"read\" when researching expert-level anecdotes. Filter the URL queue for relevance BEFORE reading. Dead-end reads must be logged as dead ends, not productive reads.\n- ❌ **Audit fraud** — checking self-audit boxes for things you didn't do. If the audit says \"browser-use cascade attempted\" but you made zero browser-use calls, that's fabrication. The audit requires EVIDENCE (paste commands + outputs), not just checkboxes.\n\n### Content violations\n- ❌ **Answering a different question** — if user wants anecdotes, don't deliver policy summaries. Re-read the Task Anchor.\n- ❌ **Panic-pivoting on negative results** — \"not found\" is valid. Don't inflate weak leads.\n- ❌ **Unverified claims in recommendations** — if you didn't fetch the page, it can't go in the recommendation.\n- ❌ **Marketing claims as benchmarks** — distinguish clearly.\n- ❌ **Premature narrowing** — picking 3-5 candidates in Seed and ignoring everything else. The Survey Phase exists to prevent this. You MUST cast a wide net before deep-diving. If you're deep-diving a candidate before identifying ≥10 options, you've narrowed too early.\n- ❌ **All-research, no-users** — citing 10 academic papers and 0 forum threads. For grey-area topics, user reports are EQUALLY valuable. The 40% forum-read quota exists for a reason.\n- ❌ **Confirmation bias** — only searching for why something works, never for why it might harm. The Adversarial Thinking section is mandatory. Every \"this is great\" needs a \"but here's what could go wrong\" search.\n\n### Tool violations\n- ❌ **Giving up after 1 browser-use failure** — try ALL 3 modes. Log all attempts.\n- ❌ **Silently skipping blocked sources** — web_fetch 403 → full cascade, always.\n- ❌ **web_fetch on Amazon** — use browser-use with `--profile`.\n- ❌ **Zero browser-use calls** — must attempt at least once per session.\n\n### Process violations\n- ❌ **Running in main session** — always dedicated topic or sub-agent.\n- ❌ **Spawning sub-agent without full rules** — paraphrasing = agent ignores rules. Paste them.\n- ❌ **Paraphrasing user's question** — quote verbatim. \"Find real experiences\" ≠ \"research whether common\".\n- ❌ **Single write at end** — progressive writes mandatory. Session crashes lose everything.\n- ❌ **Fabricating elapsed time** — every timestamp from `date +%s`. No guessing.\n- ❌ **Skipping self-audit** — hard gate. No exceptions.\n- ❌ **Skipping pre-flight** — plan block must be posted and approved.\n","tags":{"latest":"1.2.0"},"stats":{"comments":0,"downloads":387,"installsAllTime":15,"installsCurrent":0,"stars":0,"versions":3},"createdAt":1771623938971,"updatedAt":1779077215497},"latestVersion":{"version":"1.2.0","createdAt":1771624743294,"changelog":"Beta notice. Dependency declarations. Privacy/trust docs.","license":null},"metadata":null,"owner":{"handle":"vanya1210","userId":"s177kxfsdw09ve02yxfw9b8anh8854g3","displayName":"vanya1210","image":"https://avatars.githubusercontent.com/u/9931203?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1779943568423}}