DocsForAI

v0.7.0

Crawl and read documentation websites using DocsForAI. Use when you need to learn a new library, framework, or tool by reading its official docs; when you wa...

by DaoXuan (@dx2331lxz)
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Pending
OpenClaw: Benign (medium confidence)
Purpose & Capability
The declared purpose (crawl and read docs) matches the requested binary (docsforai) and the uv/PyPI install. Minor inconsistencies: SKILL.md claims a latest version of 0.6.0, _meta.json shows 0.6.1, and the registry lists 0.7.0, so the version metadata is inconsistent; additionally, the skill's Source/Homepage fields are 'unknown'/'none' even though SKILL.md references PyPI and GitHub. These issues are not fatal but are worth verifying.
Instruction Scope
Instructions stay within the stated purpose: check for existing docs, crawl only when needed, write outputs to ~/.openclaw/workspace/skills/docsforai/docs/, and record entries in MEMORY.md. Two items to review: (1) the skill reads and appends to MEMORY.md, a user file that may contain other information, and (2) the examples use a 'read' command to display files, which may be a placeholder for the agent's viewer (typical shells use cat or less). The skill also makes network requests to arbitrary documentation URLs; this is expected for a crawler, but it implies network access and fetching external content.
Install Mechanism
Install uses 'uv' to install the docsforai PyPI package (which provides the docsforai binary), an expected mechanism. SKILL.md also suggests a pip fallback with --break-system-packages, which is risky for the system Python. Overall moderate risk: PyPI and GitHub are normal sources, but verify the package yourself and prefer the isolated uv install over the pip fallback.
Credentials
No environment variables or external credentials are requested. The skill only requires the ability to run the docsforai binary and to read/write under the user's ~/.openclaw workspace — these are proportionate to the stated function.
Persistence & Privilege
The skill persistently stores crawled documentation under ~/.openclaw/workspace/skills/docsforai/docs and appends entries to MEMORY.md; it does not request system-wide privileges nor 'always: true'. Persistent storage and modification of MEMORY.md are expected for a crawler but users should be aware of the persistent footprint and of any sensitive content in MEMORY.md that the agent will read.
Assessment
This skill appears to do what it says, but check a few things before installing:

1. Verify the docsforai PyPI/GitHub project yourself (author, recent releases, and source) rather than relying on SKILL.md metadata; the listed versions are inconsistent.
2. Prefer installing with 'uv' (isolated) instead of the pip fallback; the fallback suggests --break-system-packages, which can modify the system Python.
3. Be aware that it fetches arbitrary websites (network access) and persists files to ~/.openclaw/workspace/skills/docsforai/docs; make sure that location is acceptable.
4. The skill reads and appends to MEMORY.md; inspect that file for sensitive data before allowing the skill to access it.
5. Confirm the agent environment provides the expected file-display command (the examples use 'read', which may be a placeholder); if not, substitute a safe viewer such as cat or less.

For stronger assurance, review the docsforai package source on GitHub and run the crawler in an isolated environment (container or VM) first.




Runtime requirements

Bins: docsforai

Install

Install DocsForAI (PyPI)
Bins: docsforai
uv tool install docsforai

SKILL.md

DocsForAI — Documentation Crawler Skill

Crawl any documentation website into structured, persistent Markdown files and read them on demand — so you always work from accurate, up-to-date documentation rather than training-data guesses.

Source: https://pypi.org/project/docsforai/ | https://github.com/dx2331lxz/DocsForAI | Latest: 0.6.0


Install (one-time)

uv tool install docsforai   # recommended: isolated, no system Python pollution
pip install --break-system-packages docsforai  # fallback if uv unavailable

Verify: docsforai --version


Core Principles

Always use multi-md format. It preserves the site's original chapter hierarchy as individual files, so you can navigate to exactly the section you need without loading the entire documentation into context.

Output rule: docsforai writes directly to <output>/<site-name>/ — no extra subdirectory is created.

Docs are persistent. Once crawled, they live on disk across sessions. Check before crawling; never re-crawl what already exists.


Workflow

Step 1 — Check if docs already exist

Before doing anything else, check both the local filesystem and MEMORY.md:

ls ~/.openclaw/workspace/skills/docsforai/docs/

Also look up the 「已下载文档(DocsForAI)」 ("Downloaded Docs (DocsForAI)") section in MEMORY.md for a record of previously crawled sites and their paths.

If the site folder already exists → skip to Step 3.
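The Step 1 check can be sketched as a small shell guard. This is illustrative only: `docs_present` and the `DOCS_ROOT` override are hypothetical names, not part of DocsForAI.

```shell
# Sketch of the check-before-crawl guard. DOCS_ROOT defaults to the skill's
# docs directory; docs_present is a hypothetical helper, not a DocsForAI command.
DOCS_ROOT="${DOCS_ROOT:-$HOME/.openclaw/workspace/skills/docsforai/docs}"

docs_present() {
  # $1 = site name; succeeds if the site folder already exists on disk
  [ -d "$DOCS_ROOT/$1" ]
}

if docs_present vitepress; then
  echo "already downloaded -> skip to Step 3"
else
  echo "not found -> crawl it first (Step 2)"
fi
```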

Step 2 — Crawl (only if not already downloaded)

Always pass the skill's docs/ directory as -o. DocsForAI creates <site-name>/ inside it automatically.

docsforai crawl <URL> -f multi-md \
  -o ~/.openclaw/workspace/skills/docsforai/docs

Common examples:

| URL | Site name | Final path |
|---|---|---|
| https://vitepress.dev/guide | vitepress | docs/vitepress/ |
| https://docs.pydantic.dev | pydantic | docs/pydantic/ |
| https://docusaurus.io/docs | docusaurus | docs/docusaurus/ |
| https://react.dev/learn | react | docs/react/ |
| https://docs.python.org/3 | python | docs/python/ |

After crawling completes, proceed to Step 2b.

Step 2b — Record to MEMORY.md (required)

Append a row to the 「已下载文档(DocsForAI)」 ("Downloaded Docs (DocsForAI)") section in MEMORY.md. Create the section if it doesn't exist yet:

## 已下载文档(DocsForAI)

| Site | Local path | Crawled |
|---|---|---|
| vitepress | ~/.openclaw/workspace/skills/docsforai/docs/vitepress/ | 2026-04-02 |

Never overwrite existing rows — always append.
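Step 2b can be sketched as an append-only helper. `record_crawl` and the `MEMORY_FILE` override are illustrative names, not part of DocsForAI; the section heading is the literal one SKILL.md specifies.

```shell
# Sketch: append a crawl record to MEMORY.md, creating the section header
# (「已下载文档(DocsForAI)」, i.e. "Downloaded Docs (DocsForAI)") only once.
MEMORY_FILE="${MEMORY_FILE:-$HOME/.openclaw/workspace/MEMORY.md}"
mkdir -p "$(dirname "$MEMORY_FILE")"

record_crawl() {
  # $1 = site name, $2 = local path
  if ! grep -q '## 已下载文档(DocsForAI)' "$MEMORY_FILE" 2>/dev/null; then
    printf '\n## 已下载文档(DocsForAI)\n\n| Site | Local path | Crawled |\n|---|---|---|\n' >> "$MEMORY_FILE"
  fi
  # Never overwrite existing rows: only ever append
  printf '| %s | %s | %s |\n' "$1" "$2" "$(date +%F)" >> "$MEMORY_FILE"
}

record_crawl vitepress '~/.openclaw/workspace/skills/docsforai/docs/vitepress/'
```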

Step 3 — Map the structure

Before reading any file, get a full picture of the directory tree:

find ~/.openclaw/workspace/skills/docsforai/docs/<site-name> -name "*.md" | sort

Scan the output. Identify which subdirectories and files correspond to the topic you need. This costs nothing and saves you from loading irrelevant chapters.
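A quick way to summarize the tree is to count Markdown files per subdirectory, which shows where the bulk of a site's chapters live before you open anything. `map_structure` is an illustrative helper, not a DocsForAI command.

```shell
# Sketch: count Markdown files per subdirectory of a crawled site
# (directories with the most files sort first).
map_structure() {
  # $1 = crawled site directory
  find "$1" -name '*.md' | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn
}

map_structure "$HOME/.openclaw/workspace/skills/docsforai/docs/vitepress"
```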

Step 4 — Read on demand (the most important step)

Load only what is directly relevant to the current task. Follow this decision tree:

4a. You need a quick orientation

Read the top-level index first:

read ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/index.md

4b. You know roughly what you need

Read the specific chapter file directly:

read ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/guide/configuration.md
read ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/reference/api.md

4c. You need to find where something is documented

Search across all files for a keyword, then read only the matching file:

# Find which file covers a specific topic
grep -rl "defineConfig\|plugin\|vite" \
  ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/ | head -10

4d. You need to understand a full feature area

Read the section index, then follow up with the specific sub-pages you need:

# Read section overview
read ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/guide/index.md

# Then read only the sub-pages that apply
read ~/.openclaw/workspace/skills/docsforai/docs/<site-name>/guide/routing.md

Rules:

  • Never read the entire docs tree in one go
  • Stop reading once you have enough to proceed
  • If you read something and it's not what you needed, search more precisely rather than loading more files
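The search-then-read pattern in 4c can be sketched as a tiny pipeline. `best_match` is an illustrative helper; substitute the agent's file viewer for cat if one is available.

```shell
# Sketch: locate the single best candidate file for a keyword, then read
# only that file instead of loading the whole docs tree.
best_match() {
  # $1 = keyword, $2 = site docs directory
  grep -rl --include='*.md' "$1" "$2" | head -n 1
}

file=$(best_match defineConfig "$HOME/.openclaw/workspace/skills/docsforai/docs/vitepress")
if [ -n "$file" ]; then cat "$file"; fi
```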

When to Consult Docs (decision guide)

Use this skill proactively whenever you are about to:

| Situation | Action |
|---|---|
| Use an API you haven't used in this session | Read the relevant API reference page |
| Write configuration for a framework | Read the configuration guide |
| Debug an unexpected behavior | Search docs for the error or behavior, read the matching section |
| Use a CLI tool you're unfamiliar with | Read the CLI reference page |
| Implement a non-trivial feature | Read the feature's guide page before writing code |
| Upgrade a library version | Check migration or changelog docs first |

Do not guess at API signatures, config options, or CLI flags when the docs are available on disk. A 2-second read beats a hallucinated parameter.


CLI Reference

# Standard crawl
docsforai crawl <URL> -f multi-md -o <output-dir>

# Force framework type (skip auto-detection)
docsforai crawl <URL> --type nextdocs -f multi-md -o <output-dir>
docsforai crawl <URL> --type mkdocs -f multi-md -o <output-dir>

# Polite crawling (for rate-sensitive sites)
docsforai crawl <URL> -f multi-md --concurrency 2 --delay 0.5 -o <output-dir>

# Limit pages (generic mode only)
docsforai crawl <URL> -f multi-md --max-pages 100 -o <output-dir>
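The polite-crawl invocation above can be wrapped as a helper. `polite_crawl` is an illustrative name, and the command is echoed as a dry run; drop the echo once the flags look right.

```shell
# Sketch: compose the polite-crawl command from the reference above,
# printed rather than executed so it can be inspected first.
polite_crawl() {
  # $1 = URL, $2 = output directory
  echo docsforai crawl "$1" -f multi-md --concurrency 2 --delay 0.5 -o "$2"
}

polite_crawl https://docs.pydantic.dev "$HOME/.openclaw/workspace/skills/docsforai/docs"
```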

Supported Frameworks (auto-detected)

| Framework | Detection signal |
|---|---|
| VitePress | .VPSidebar CSS class / generator meta |
| Docsify | $docsify global variable; fetches raw .md source |
| Mintlify | x-llms-txt response header; single request for full content |
| Docusaurus | generator meta / .theme-doc-sidebar-container |
| mdBook | #mdbook-sidebar / ol.chapter |
| MkDocs | generator meta / .md-nav--primary (Material + default themes) |
| Starlight | #starlight__sidebar / .sl-markdown-content |
| GitBook | generator meta GitBook / sitemap-based discovery |
| NextDocs | /_next/ assets + .mdx-content; sitemap discovery + sidebar fallback |
| Feishu Docs | open.feishu.cn domain; internal API |
| Generic | BFS link traversal; fallback for any other site |

Tips

  • Mintlify sites fetch everything in one request — near-instant
  • Cloudflare-protected sites — DocsForAI auto-retries with system curl
  • Count total pages: find ~/.openclaw/workspace/skills/docsforai/docs/<site> -name "*.md" | wc -l
  • Re-crawl to refresh: delete the site folder first, then crawl again
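The refresh tip can be sketched as a delete-then-crawl helper. `refresh_site` and the `DOCS_ROOT` override are illustrative names, and the crawl itself is echoed as a dry run.

```shell
# Sketch: refresh a site's docs by removing its folder before re-crawling.
# The :? expansions abort rather than run rm -rf on an empty path.
refresh_site() {
  # $1 = site name, $2 = URL
  root="${DOCS_ROOT:-$HOME/.openclaw/workspace/skills/docsforai/docs}"
  rm -rf "${root:?}/${1:?}"
  echo docsforai crawl "$2" -f multi-md -o "$root"
}
```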

Files

