{"skill":{"slug":"legal-tos-differ","displayName":"Legal/TOS Diff-er","summary":"Fetches Terms of Service documents, stores snapshots, and performs semantic diffing to identify meaningful legal changes across Privacy Risks, Financial Chan...","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":77,"installsAllTime":0,"installsCurrent":0,"stars":1,"versions":1},"createdAt":1776068059923,"updatedAt":1776070618988},"latestVersion":{"version":"1.0.0","createdAt":1776068059923,"changelog":"# Legal/TOS Diff-er\n\n![OpenClaw Skill](https://img.shields.io/badge/OpenClaw-Skill-blue)\n![Node.js](https://img.shields.io/badge/Node.js-18%2B-green)\n![License](https://img.shields.io/badge/License-MIT-yellow)\n\nA semantic diff tool for Terms of Service and legal documents. Unlike standard text diffs that spot character changes, this skill understands legal meaning — catching when \"may\" becomes \"will\" in a data-sharing clause or when a forced arbitration clause quietly appears.\n\n## The Problem\n\nCompanies update their Terms of Service frequently, and the changes are often buried in pages of dense legal text. A standard code diff looks for character changes, but legal changes require **semantic understanding**:\n- Changing \"may share data\" to \"will share data\" is a single word, but a massive privacy shift\n- Adding \"mandatory arbitration\" to a dispute section strips users of their right to sue\n- Changing a refund policy from \"within 30 days\" to \"at our discretion\" eliminates a financial right\n\n## How It Works\n\n```\n┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐\n│  Fetch URL   │────▶│   Extract   │────▶│   Snapshot   │────▶│   Compare   │\n│  (node-fetch)│     │  (cheerio)  │     │   (JSON)     │     │  (Claude)   │\n└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘\n```\n\n1. **Fetch** — Retrieves the legal page HTML\n2. **Extract** — Two-pass engine strips noise (nav, ads, popups) and scores content blocks to isolate legal text\n3. **Snapshot** — Stores timestamped versions with SHA-256 hashes\n4. **Compare** — Outputs a structured prompt for Claude to semantically analyze changes\n\n## Change Categories\n\n| Category | What It Detects | Example |\n|----------|----------------|---------|\n| **Privacy Risks** | Data collection, sharing, tracking, cookies | \"may share\" → \"will share\" with third parties |\n| **Financial Changes** | Pricing, fees, billing, refunds, auto-renewal | \"30-day refund\" → \"at our discretion\" |\n| **User Rights** | Termination, ownership, arbitration, governing law | New mandatory arbitration clause |\n\n## Quick Start\n\n### Commands\n\n```\n# Track a new legal document\nadd_url --url \"https://example.com/terms\" --label \"Example Corp TOS\"\n\n# See what you're tracking\nlist_tracked\n\n# Capture the current version\nfetch_current --url \"https://example.com/terms\"\n\n# Compare current version against last snapshot\ndiff --url \"https://example.com/terms\"\n\n# Stop tracking\nremove_url --url \"https://example.com/terms\"\n```\n\n## Installation\n\n```bash\ncd legal-tos-differ\nnpm install\n```\n\nRequirements: Node.js 18+\n\n## Architecture\n\n### Extraction Engine\n\nThe extraction engine uses a two-pass approach with Cheerio:\n\n1. **Noise Removal** — Strips `<nav>`, `<footer>`, `<script>`, and elements with noise-related classes/IDs (sidebar, cookie, popup, etc.)\n2. **Content Scoring** — Scores remaining block elements by:\n   - Text density (legal text is text-heavy, not link-heavy)\n   - Legal keyword frequency (\"terms\", \"agreement\", \"liability\", etc.)\n   - Link density penalty (too many links = navigation, not legal text)\n   - Structural hints (`<main>`, `<article>`, legal-related IDs/classes)\n\n### Snapshot Storage\n\nSnapshots are stored as JSON files in `snapshots/`:\n\n```\nsnapshots/\n  registry.json                           # Tracked URLs metadata\n  example-com-terms-2026-04-11T17-00.json # Timestamped snapshot\n```\n\nEach snapshot includes the full extracted text, SHA-256 hash, and fetch metadata. The hash enables instant \"no changes\" detection without invoking the LLM.\n\n### Analysis Prompting\n\nThe skill builds a structured prompt that delegates semantic analysis to the Claude Code runtime. The prompt instructs the LLM to:\n- Ignore cosmetic changes (typos, formatting, reordering)\n- Ignore clarifying language that doesn't change legal meaning\n- Flag removals of user protections as higher severity\n- Quote specific old/new text for each change\n\n## License\n\nMIT","license":"MIT-0"},"metadata":null,"owner":{"handle":"liverock","userId":"s175ewnhnmrjf1y1sxk9d8r89s83wbxr","displayName":"Peter Lum","image":"https://avatars.githubusercontent.com/u/211891?v=4"},"moderation":{"isSuspicious":true,"isMalwareBlocked":false,"verdict":"suspicious","reasonCodes":["suspicious.env_credential_access","suspicious.vt_suspicious"],"summary":"Detected: suspicious.env_credential_access, suspicious.vt_suspicious","engineVersion":"v2.2.0","updatedAt":1776070618988}}