# Audit Framework — SEO Audit

The auditor's brain. Five-tier priority order, every check, when to short-circuit. Recipes (concrete `bdata` commands) live in `bdata-recipes.md`. Site-type extras live in `site-type-playbooks.md`. Report shape lives in `output-templates.md`.

## Priority Order (Short-Circuiting)

The audit walks tiers top-to-bottom; lower-tier findings are partial signal until higher-tier blockers are fixed.

1. **Crawlability & Indexation** — can Google find and index the site at all?
2. **Technical Foundations** — is the site fast, secure, mobile-friendly?
3. **On-Page Optimization** — titles, meta, headings, internal links.
4. **Content Quality** — does each page deserve to rank?
5. **Authority & Links** — out of scope for direct measurement; HTML red flags only.

When a Tier-1 critical issue is found (e.g., `Disallow: /` in robots.txt), the rule is: report it as the top priority, explicitly tell the user that lower-tier findings may be misleading until the Tier-1 issue is fixed, but **still run lower tiers and report what you find** — caveat each downstream section so the user has a full picture without taking the findings as definitive. Lower-tier findings still require an Evidence block per the Hard Rule below; if a check cannot run because the Tier-1 blockage prevents fetching the page, omit it from the report rather than fabricate a finding.

## Tier 1 — Crawlability & Indexation (always run)

- **robots.txt** — fetch with R-01. Parse for `Disallow:` directives that match indexable paths (defined as: any path appearing in the sitemap, or the homepage, or any path matching common SEO-important patterns — `/products/`, `/category/`, `/categories/`, `/blog/`, `/posts/`, `/articles/`, `/locations/`, `/services/`, `/about`, `/`). Standard utility paths (`/wp-admin`, `/cgi-bin`, `/api`, `/admin`, `/cart`, `/checkout`, `/account`) being disallowed is normal and should not be flagged. Also confirm a `Sitemap:` line is present.
- **sitemap.xml** — fetch with R-02. Direct check: exists, parseable, contains canonical and indexable URLs, uses `` if multilingual.
- **Indexation proxy** — R-12: `bdata search "site:" --json`. Approximate indexed-URL count from the SERP sample. Flag as critical when the indexed sample is below 30% of the sitemap URL count for sitemaps under 100 URLs, or when the absolute gap exceeds 100 URLs for larger sitemaps (e.g., 12 indexed vs. 847 in sitemap is critical; 6 indexed vs. 10 is not because `site:` returns a sample, not exact counts). Always note in the finding that `site:` is a sample, not a precise index count, and direct the user to Google Search Console's Coverage report for authoritative numbers (Out-of-Scope Notes).
- **Per-page robots and canonical** — from rendered HTML: ``, ``, self-referencing canonical. Flag `noindex` on indexable pages (using the same definition as the robots.txt rule above: pages that appear in the sitemap, the homepage, or pages matching common SEO-important path patterns). `noindex` on utility pages like `/cart` or `/account` is normal and should not be flagged.
- **Hreflang** (multilingual sites only) — R-08: parsed from rendered HTML head and sitemap. Self-referencing entry, reciprocal links, valid ISO codes, `x-default` present.

## Tier 2 — Technical Foundations

- **HTTPS** — final URL after redirects. If `bdata scrape http://` lands on `https://`, we're fine.
- **Mobile-friendliness** — viewport meta tag present, no fixed-width inline styles, responsive image markup (`srcset`, `