Install
openclaw skills install @wangzhiming1999/doc-snapshot-agent
Automatically illustrate Markdown documents by turning image markers into screenshots or generated images, then writing an image-enriched Markdown output. Use this skill when a document needs screenshots, AI-generated visuals, image placement, or end-to-end document illustration automation.
doc-snapshot-agent is a single entry-point skill for automatically adding images to Markdown documents.
It supports:
This package is intentionally published as one main skill plus supporting reference documents:
{baseDir}/references/browser-automation.md
{baseDir}/references/playwright-mcp.md
{baseDir}/references/site-explorer.md
{baseDir}/references/image-generation.md
Load this skill whenever the user asks to:
Input:
Image Summary table
Output:
All input, output, and cache paths are relative to a single project root directory ({project-root}).
At the very beginning of every run, ask the user which directory to use as the project root. If the user declines or says they have no preference, default to /tmp/doc-snapshot-agent.
Once confirmed, all subsequent paths in this skill (cases/, output/, .cache/, etc.) resolve under {project-root}/.
{project-root}/
├── cases/
│   └── {article-id}.md
├── output/
│   ├── {article-id}/
│   │   ├── raw/
│   │   │   ├── A1_example.png
│   │   │   └── A2_example.png
│   │   ├── A1_example.png
│   │   ├── A2_example.png
│   │   └── README.md
│   └── markdowns/
│       └── {article-id}.md
└── .cache/
    └── screenshots/
        └── {article-id}/
Conventions:
{project-root}/cases/ stores the source Markdown file.
{project-root}/output/{article-id}/raw/ stores original browser screenshots and should never be overwritten by later processing.
{project-root}/output/{article-id}/ stores final images referenced by Markdown.
{project-root}/output/markdowns/ stores the final illustrated Markdown.
{project-root}/.cache/screenshots/ stores reusable screenshot cache entries.
If the user specifies a different layout, follow the user instruction instead.
Some sites require authentication before the requested screenshot can be captured.
Read website credentials from environment variables using this pattern:
PLAYWRIGHT_CRED_{SERVICE}_{FIELD}
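The naming pattern above can be resolved with a small helper. The following is a minimal sketch, not part of the skill itself: the function names `credential_env_name` and `read_credential` are hypothetical, and the normalization of service and field names (uppercase, non-alphanumerics collapsed to underscores) is an assumption.

```python
import os
import re


def credential_env_name(service: str, field: str) -> str:
    """Build a PLAYWRIGHT_CRED_{SERVICE}_{FIELD} variable name."""

    def normalize(part: str) -> str:
        # Assumption: uppercase, with non-alphanumeric runs turned into "_".
        return re.sub(r"[^A-Za-z0-9]+", "_", part).strip("_").upper()

    return f"PLAYWRIGHT_CRED_{normalize(service)}_{normalize(field)}"


def read_credential(service: str, field: str, env=None) -> str:
    """Return the credential value, or raise KeyError if it is unset or empty."""
    env = os.environ if env is None else env
    name = credential_env_name(service, field)
    value = env.get(name)
    if not value:
        raise KeyError(f"Missing environment variable {name}")
    return value
```

Passing an explicit `env` mapping keeps the helper easy to test without touching the real process environment.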
Examples:
PLAYWRIGHT_CRED_FELO_EMAIL
PLAYWRIGHT_CRED_FELO_PASSWORD
Rules:
This skill must support both inline markers and summary tables.
### 📷 Screenshot: {marker-id} ({filename})
Use: {why this screenshot exists}
Processing: {post-processing instruction}
Difference: {optional distinction from similar screenshots}
Fields:
marker-id: unique screenshot identifier such as A1, B3-1, or D3
filename: base filename without the marker prefix
Use: what the screenshot should communicate
Processing: crop, resize, or other post-processing needs
Difference: optional explanation for how this screenshot differs from similar ones
Screenshot:
<!-- IMAGE: screenshot (https://example.com/app)
Description: Workspace dashboard showing project activity and team sidebar
Filename: workspace-dashboard.png
-->
Generated image:
<!-- IMAGE: generated
Description: Editorial illustration of a collaborative AI workflow with folders and browser windows
Filename: ai-workflow-hero.png
-->
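As an illustration, the HTML comment markers shown above could be extracted with a regular expression. This is a sketch under assumptions: the `ImageRequest` dataclass and `parse_comment_markers` function are hypothetical names, and the parser expects `Description:` and `Filename:` each on their own line, as in the two examples.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches "<!-- IMAGE: screenshot (url) ... -->" and "<!-- IMAGE: generated ... -->".
COMMENT_MARKER = re.compile(
    r"<!--\s*IMAGE:\s*(screenshot|generated)\s*(?:\(([^)]*)\))?(.*?)-->",
    re.DOTALL,
)
# Matches "Description: ..." and "Filename: ..." lines inside a marker body.
FIELD = re.compile(r"^\s*(Description|Filename):\s*(.+?)\s*$", re.MULTILINE)


@dataclass
class ImageRequest:
    kind: str                  # "screenshot" or "generated"
    filename: str
    description: str
    url: Optional[str] = None  # only present on screenshot markers


def parse_comment_markers(markdown: str) -> List[ImageRequest]:
    """Collect every <!-- IMAGE: ... --> marker in document order."""
    requests = []
    for match in COMMENT_MARKER.finditer(markdown):
        kind, url, body = match.group(1), match.group(2), match.group(3)
        fields = dict(FIELD.findall(body))
        requests.append(
            ImageRequest(
                kind=kind,
                filename=fields.get("Filename", ""),
                description=fields.get("Description", ""),
                url=url or None,
            )
        )
    return requests
```

Markers with a missing field come back with an empty string rather than raising, so the caller can decide whether to skip them or report them.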
A document may end with a summary table listing all required images:
## Image Summary
| # | Type | Description | Filename |
|---|------|-------------|----------|
| 1 | generated | Description... | `hero.png` |
| 2 | screenshot | Description... | `dashboard.png` |
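A summary table in this shape can be read row by row. The sketch below assumes the section heading is exactly `## Image Summary`, that cells contain no escaped pipe characters, and that filenames may be wrapped in backticks; `parse_image_summary` is a hypothetical helper name.

```python
from typing import Dict, List


def parse_image_summary(markdown: str) -> List[Dict[str, str]]:
    """Return one dict per data row of the Image Summary table."""
    rows: List[Dict[str, str]] = []
    in_section = False
    header = None
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Only rows under the "Image Summary" heading are considered.
            in_section = stripped[3:].strip().lower() == "image summary"
            header = None
            continue
        if not in_section or not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if header is None:
            header = [cell.lower() for cell in cells]
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |---|------|
        row = dict(zip(header, cells))
        row["filename"] = row.get("filename", "").strip("`")
        rows.append(row)
    return rows
```

Each row is keyed by the lowercased header text, so the `Type` column is available as `row["type"]`.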
Important:
Do not assume the workflow always starts from zero. Before doing any work, inspect the article state and continue from the right step.
For a given article id, inspect:
{project-root}/output/{article-id}/raw/*.png
{project-root}/output/{article-id}/*.png
{project-root}/output/{article-id}/README.md
{project-root}/output/markdowns/{article-id}.md
{project-root}/.cache/screenshots/{article-id}/
This check MUST run at the start of EVERY execution, not just the first time.
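The article-state inspection over the paths listed above could look like the following sketch. The function name `inspect_article_state` and the shape of the returned dictionary are assumptions; deciding which step to resume from is left to the caller.

```python
from pathlib import Path
from typing import Dict, List


def _names(directory: Path, pattern: str) -> List[str]:
    """Sorted file names matching pattern, or an empty list if the directory is absent."""
    if not directory.is_dir():
        return []
    return sorted(path.name for path in directory.glob(pattern))


def inspect_article_state(project_root: str, article_id: str) -> Dict[str, object]:
    """Report which artifacts already exist for an article id."""
    root = Path(project_root)
    out_dir = root / "output" / article_id
    return {
        "raw_screenshots": _names(out_dir / "raw", "*.png"),
        "final_images": _names(out_dir, "*.png"),
        "has_readme": (out_dir / "README.md").is_file(),
        "has_markdown": (root / "output" / "markdowns" / f"{article_id}.md").is_file(),
        "cached_entries": _names(root / ".cache" / "screenshots" / article_id, "*"),
    }
```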
Before any other work, verify that the Playwright MCP server is properly configured and running:
Check for Playwright MCP tools availability
Look for tools with the mcp__playwright__ prefix
Expected tools include mcp__playwright__browser_navigate, mcp__playwright__browser_snapshot, mcp__playwright__browser_screenshot
If tools are NOT detected, STOP immediately and guide the user to install:
Detect the current client environment and show the matching installation command:
Claude Code
claude mcp add playwright -- npx @playwright/mcp@latest
Codex
codex mcp add playwright -- npx @playwright/mcp@latest
VS Code / Cursor / Kiro (IDE with MCP settings UI)
Add to the MCP settings JSON (e.g. .vscode/mcp.json, .cursor/mcp.json, .kiro/settings/mcp.json). Note that VS Code's .vscode/mcp.json uses a top-level "servers" key in place of "mcpServers":
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
Standalone MCP Server (headless environments or worker processes)
npx @playwright/mcp@latest --port 8931
Then point the client config to:
{
  "mcpServers": {
    "playwright": {
      "url": "http://localhost:8931/mcp"
    }
  }
}
Grant Tool Permissions (Claude Code / Codex)
{
  "permissions": {
    "allow": ["mcp__playwright__*"]
  }
}
Ask the user to configure and restart the session
Do NOT proceed to Step 1 until this check passes
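The availability check can be expressed as a simple set comparison. This sketch assumes the runtime can supply a list of tool names; how that list is obtained depends on the client and is not shown. `REQUIRED_TOOLS` and `missing_playwright_tools` are hypothetical names, and the required set mirrors the three tools named in this check.

```python
from typing import Iterable, List

# The tool names listed in the check above.
REQUIRED_TOOLS = (
    "mcp__playwright__browser_navigate",
    "mcp__playwright__browser_snapshot",
    "mcp__playwright__browser_screenshot",
)


def missing_playwright_tools(available_tools: Iterable[str]) -> List[str]:
    """Return the required tool names that are absent from the runtime."""
    available = set(available_tools)
    return [tool for tool in REQUIRED_TOOLS if tool not in available]
```

An empty result means the check passes; any other result is the list to show the user alongside the installation instructions.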
After verifying Playwright MCP, ask the user:
Which directory should I use as the project root for this run?
If the user provides a path, use it as {project-root}.
If the user declines or has no preference, default to /tmp/doc-snapshot-agent.
Create the directory if it does not exist. All subsequent paths (cases/, output/, .cache/, scripts/, references/) resolve under {project-root}/.
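A minimal sketch of the project-root resolution, assuming the user's answer arrives as a string (or `None` when they decline). The helper name `resolve_project_root` is hypothetical.

```python
from pathlib import Path
from typing import Optional

DEFAULT_PROJECT_ROOT = "/tmp/doc-snapshot-agent"


def resolve_project_root(
    user_answer: Optional[str], default: str = DEFAULT_PROJECT_ROOT
) -> Path:
    """Use the user's directory if given, else the default, and create it."""
    answer = (user_answer or "").strip()
    root = Path(answer or default).expanduser()
    root.mkdir(parents=True, exist_ok=True)
    return root
```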
Read the source Markdown and merge image requirements from three sources:
<!-- IMAGE: ... --> markers
Image Summary table
For each image, record:
screenshot or generated
Also detect the target website or websites mentioned by the article.
Read {project-root}/references/playwright-mcp.md before interacting with the site
Run npx playwright install chromium before continuing
CRITICAL: Browser Tool Requirement
This skill uses only Playwright MCP tools for browser automation. Do NOT use:
Any browser tool that lacks the mcp__playwright__* prefix
All browser interactions must go through the Playwright MCP server tools:
mcp__playwright__browser_navigate
mcp__playwright__browser_snapshot
mcp__playwright__browser_screenshot
mcp__playwright__browser_click
mcp__playwright__browser_fill_form
If these tools are not available in the current runtime, the workflow cannot proceed. Ask the user to configure the Playwright MCP server first.
Bad screenshots usually come from navigating to the wrong page, not from using the wrong screenshot command.
Before capturing anything:
Check whether site knowledge already exists under:
$IMAGE_AGENT_SITE_KNOWLEDGE_DIR/
$IMAGE_AGENT_SITE_LEARNING_DIR/
Derive a stable site-key from the domain name:
memclaw.me -> memclaw
app.felo.ai -> felo
If {site-key}.md exists and is recent, read it before browsing.
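One way to reproduce the two site-key examples above is to take the label immediately before the top-level domain. This is a sketch, not the skill's confirmed rule: `derive_site_key` is a hypothetical name, and the approach would give the wrong label for multi-part suffixes such as `co.uk`.

```python
from urllib.parse import urlparse


def derive_site_key(url_or_domain: str) -> str:
    """Return the second-to-last host label, e.g. app.felo.ai -> felo."""
    # urlparse only fills in hostname when a scheme or "//" prefix is present.
    target = url_or_domain if "//" in url_or_domain else f"//{url_or_domain}"
    host = (urlparse(target).hostname or "").lower()
    labels = [label for label in host.split(".") if label]
    if len(labels) >= 2:
        return labels[-2]
    return labels[0] if labels else ""
```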
If site knowledge is missing or stale, perform a structured site exploration and save the findings into the site knowledge files. See {project-root}/references/site-explorer.md.
Map every screenshot description to a specific page or state.
Common mapping mistakes:
dashboard, session history, team members, or invite
Use the browser automation reference in {project-root}/references/browser-automation.md.
If Playwright MCP is available, also use {project-root}/references/playwright-mcp.md as the concrete execution guide for:
Typical flow:
{project-root}/output/{article-id}/raw/
Naming rule:
{marker-id}_{filename}
Example:
A1_workspace-dashboard.png
After taking each screenshot, verify that the captured image actually matches the description. Do not rely only on DOM text. Visual layout, modals, loading states, overlays, and empty panels must be checked against the real screenshot file.
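The `{marker-id}_{filename}` naming rule and the raw output directory combine into a single path. The helper below is a sketch with a hypothetical name, `raw_screenshot_path`; it assumes the filename already carries its extension.

```python
from pathlib import Path


def raw_screenshot_path(
    project_root: str, article_id: str, marker_id: str, filename: str
) -> Path:
    """Build {project-root}/output/{article-id}/raw/{marker-id}_{filename}."""
    # Path(...).name drops any directory part accidentally included in filename.
    name = Path(filename).name
    return Path(project_root) / "output" / article_id / "raw" / f"{marker_id}_{name}"
```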
Apply the requested processing instructions if present.
Typical operations:
raw/ into the final output directory
Principle:
raw/ keeps untouched originals
Files in {project-root}/output/{article-id}/ are the assets referenced by Markdown
This step has two jobs:
Heading marker example:
### 📷 Screenshot: A1 (workspace-dashboard.png)
Use: Show the authenticated workspace homepage
Processing: Full-width screenshot
becomes:

HTML comment marker example:
<!-- IMAGE: screenshot (https://example.com/app)
Description: Workspace dashboard showing Architecture Decisions
Filename: architecture-decisions.png
-->
becomes:

For images that appear only in the Image Summary table:
Common mistakes:
Example:
For a description such as "Share panel showing team members and invite controls", prefer the paragraph that mentions inviting teammates rather than the end of a general onboarding section.
For generated images, use the image-generation reference in {project-root}/references/image-generation.md and the bundled script in {project-root}/scripts/generate_image.py.
If generation succeeds, insert the normal Markdown image reference. If generation fails, insert a warning block:
> Warning: AI image generation failed for {filename}
The Image Summary block is workflow metadata and should not remain in the final illustrated Markdown.
Create {project-root}/output/{article-id}/README.md with metadata such as:
Suggested format:
# {article-id} Illustration Output
Article: {title}
Completed: {timestamp}
## Image Inventory
| Filename | Marker | Description | Size | Processing |
|----------|--------|-------------|------|------------|
| A1_example.png | A1 | Workspace dashboard | 1200x800 | resized |
## Notes
- Credentials source: environment variables
- Additional comments
## Remaining Work
- [ ] Any missing screenshot or failed generated image
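The suggested README format can be filled in from an image inventory. This is a sketch: `render_readme` is a hypothetical helper, and each inventory entry is assumed to be a dict with the keys `filename`, `marker`, `description`, `size`, and `processing`.

```python
from typing import Dict, Iterable


def render_readme(
    article_id: str,
    title: str,
    completed: str,
    images: Iterable[Dict[str, str]],
    remaining: Iterable[str] = (),
) -> str:
    """Render the README text in the suggested format."""
    lines = [
        f"# {article_id} Illustration Output",
        f"Article: {title}",
        f"Completed: {completed}",
        "## Image Inventory",
        "| Filename | Marker | Description | Size | Processing |",
        "|----------|--------|-------------|------|------------|",
    ]
    for image in images:
        lines.append(
            "| {filename} | {marker} | {description} | {size} | {processing} |".format(**image)
        )
    lines += ["## Notes", "- Credentials source: environment variables", "## Remaining Work"]
    # One unchecked task per missing screenshot or failed generated image.
    lines += [f"- [ ] {item}" for item in remaining]
    return "\n".join(lines) + "\n"
```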
Use a simple file-based screenshot cache:
{project-root}/.cache/screenshots/{article-id}/
When an image type is generated, do not mark it as missing by default. Generate it.
Prerequisites:
OPENROUTER_API_KEY is available
requests is installed
Default command:
python {project-root}/scripts/generate_image.py "{description}" -o "{project-root}/output/{article-id}/{filename}"
Use a stronger model for text-heavy images:
python {project-root}/scripts/generate_image.py "{description}" -o "{project-root}/output/{article-id}/{filename}" -m google/gemini-3-pro-image-preview
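The two commands above, together with the warning-block fallback, could be wrapped as follows. This is a sketch under stated assumptions: the script is assumed to signal failure through a non-zero exit code, `image_ref` is whatever path the final Markdown should use (supplied by the caller), and the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_generate_command(
    project_root: str,
    article_id: str,
    description: str,
    filename: str,
    model: Optional[str] = None,
) -> List[str]:
    """Assemble the generate_image.py invocation as an argument list."""
    root = Path(project_root)
    command = [
        sys.executable,
        str(root / "scripts" / "generate_image.py"),
        description,
        "-o",
        str(root / "output" / article_id / filename),
    ]
    if model:
        command += ["-m", model]  # e.g. a stronger model for text-heavy images
    return command


def generate_or_warn(
    command: List[str], filename: str, alt_text: str, image_ref: str
) -> str:
    """Run the command; return an image reference or the warning block."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:  # assumption: exit code 0 means success
        return f"![{alt_text}]({image_ref})"
    return f"> Warning: AI image generation failed for {filename}"
```

Passing the arguments as a list avoids shell quoting problems when a description contains quotes or other special characters.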
Generation prompt guidance:
Failure handling:
If the document is language-specific, make sure the captured website matches that language. If the site supports language switching, switch before taking screenshots.
Before taking screenshots:
When this skill finishes, return a concise summary containing:
Project root (ask user, default /tmp/doc-snapshot-agent):
{project-root}/
Input:
{project-root}/cases/{article-id}.md
Output:
{project-root}/output/{article-id}/raw/*.png
{project-root}/output/{article-id}/*.png
{project-root}/output/{article-id}/README.md
{project-root}/output/markdowns/{article-id}.md
Credentials:
PLAYWRIGHT_CRED_{SERVICE}_{FIELD}
Cache:
{project-root}/.cache/screenshots/{article-id}/
References:
{project-root}/references/browser-automation.md
{project-root}/references/playwright-mcp.md
{project-root}/references/site-explorer.md
{project-root}/references/image-generation.md
Bundled script:
{project-root}/scripts/generate_image.py