Junyi Doc Reader

Review

Audited by ClawScan on May 10, 2026.

Overview

This is mostly a coherent document-archiving skill, but its optional LLM mode has a privacy-contract mismatch that can send document chunks to a default external endpoint, and Feishu mode uses local app credentials.

Install only if you are comfortable with its document outputs being stored in your chosen vault and with Feishu mode using local Feishu app credentials. Keep DOC_READER_ALLOW_EXTERNAL=false for offline use; if you enable LLM insights, explicitly set DOC_READER_API_URL and DOC_READER_MODEL so you know exactly where document chunks are sent.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

A user may believe they must explicitly choose the LLM endpoint, while the skill can fall back to a default external provider when enrichment is enabled.

Why it was flagged

The skill's privacy claim, quoted below, is contradicted by the included code: `scripts/enricher.py` defines a default OpenAI URL, and `scripts/pipeline.py` can enable insights with only an API key and `DOC_READER_ALLOW_EXTERNAL=true`. This mismatch can mislead users about where document content may go.

Skill content
Enabling LLM enrichment requires the user to actively set all four environment variables ... `DOC_READER_API_URL` (user-specified endpoint, no hard-coded production URL)
Recommendation

Require DOC_READER_API_URL and DOC_READER_MODEL explicitly before enrichment, or change the privacy text to clearly state the default endpoint and model.
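
A minimal sketch of the first option, offered as an illustration rather than the skill's actual code. The helper `resolve_enrichment_config`, the exception class, and the key variable name `DOC_READER_API_KEY` are assumptions (the review does not show the skill's real key variable). The point is that enrichment fails closed when the endpoint or model is unset, with no fallback URL:

```python
import os


class EnrichmentConfigError(ValueError):
    """Raised when external enrichment is requested without explicit settings."""


def resolve_enrichment_config(env=None):
    """Return (url, model, key) for enrichment, or None when it is disabled.

    Hypothetical helper: DOC_READER_API_KEY is an assumed variable name.
    """
    env = os.environ if env is None else env
    if env.get("DOC_READER_ALLOW_EXTERNAL", "false").strip().lower() != "true":
        return None  # offline mode: nothing leaves the machine
    required = ("DOC_READER_API_URL", "DOC_READER_MODEL", "DOC_READER_API_KEY")
    missing = [name for name in required if not env.get(name, "").strip()]
    if missing:
        # No default endpoint: fail closed instead of falling back.
        raise EnrichmentConfigError("missing: " + ", ".join(missing))
    return tuple(env[name].strip() for name in required)
```

With a guard of this shape, setting only an API key and `DOC_READER_ALLOW_EXTERNAL=true` would raise an error naming the missing variables instead of silently contacting a default provider.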

What this means

Private document chunks can leave the local machine in enrichment mode, potentially to a provider the user did not explicitly configure.

Why it was flagged

When LLM enrichment is enabled, chunk text is sent to an LLM API endpoint, and the endpoint defaults to OpenAI if DOC_READER_API_URL is not set.

Skill content
DEFAULT_API_URL = "https://api.openai.com/v1/chat/completions" ... "content": USER_PROMPT_TEMPLATE.format(text=text) ... urllib.request.urlopen(req, timeout=60)
Recommendation

Leave DOC_READER_ALLOW_EXTERNAL=false unless external analysis is intended, and set DOC_READER_API_URL explicitly to the desired provider before using insights mode.

What this means

Using Feishu mode gives the skill access to Feishu app credentials for the selected account.

Why it was flagged

Feishu mode reads local app credentials to obtain an access token. This is purpose-aligned for fetching Feishu documents, but the credentials carry sensitive account authority.

Skill content
Only the two fields `channels.feishu.accounts[<account>].appId` / `.appSecret` in `~/.openclaw/openclaw.json` ... used to obtain a Feishu tenant_access_token
Recommendation

Use a least-privilege Feishu app/account and verify the requested `--account` before running Feishu imports.
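
One way to verify the account before an import is to look it up explicitly and stop if it is unknown. The sketch below is illustrative only: `select_feishu_credentials` is a hypothetical helper, and it assumes `accounts` is a JSON object keyed by account name, which the quoted path suggests but the review does not confirm.

```python
import json
from pathlib import Path


def select_feishu_credentials(config, account):
    """Return only (appId, appSecret) for one named account.

    Assumes `accounts` is a mapping keyed by account name (not confirmed).
    """
    accounts = config.get("channels", {}).get("feishu", {}).get("accounts", {})
    if account not in accounts:
        # Fail loudly rather than falling back to some other account.
        raise KeyError(f"unknown Feishu account {account!r}; known: {sorted(accounts)}")
    entry = accounts[account]
    return entry["appId"], entry["appSecret"]


def load_config(path="~/.openclaw/openclaw.json"):
    """Parse the local config file named in the skill's documentation."""
    return json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
```

Calling `select_feishu_credentials(load_config(), "work")` with a hypothetical account name `work` would return that account's two fields and nothing else from the file.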

What this means

Archived content may be reused by agents later if the output directory is part of an Obsidian vault or other searchable knowledge base.

Why it was flagged

The skill intentionally persists source text, chunks, and indexes for later agent use. This is expected for document archiving, but it means private or untrusted document content may remain available in future retrieval contexts.

Skill content
`chunks.jsonl`: precise retrieval and location ... `source.md`: use when full-text search is needed ... `ROOT_INDEX.md`: read this first to understand the document structure
Recommendation

Store outputs only in a vault/location appropriate for the document sensitivity, and review generated indexes before relying on them.

What this means

The skill depends on the behavior and trustworthiness of local pandoc/pdftotext installations when processing documents.

Why it was flagged

The converter runs local document-conversion binaries. This is central to the stated purpose and uses argument lists rather than shell execution, but users should know local tools are invoked.

Skill content
subprocess.run(["pandoc", "-f", "docx", "-t", "markdown", ...]) ... subprocess.run(["pdftotext", "-layout", input_path, "-"], ...)
Recommendation

Use trusted, up-to-date converter binaries and avoid processing untrusted documents in a sensitive environment.
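
For reference, the argument-list pattern noted in this finding can be sketched as follows. This is not the skill's converter: `pdftotext_argv` and `convert_pdf` are hypothetical names, and the 120-second timeout is an assumed value. The sketch checks that the binary is on `PATH`, passes the file as a single argv element so no shell parses it, and bounds the run time.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_argv(input_path):
    """Build the argument list; the path stays one argv element, never shell-parsed."""
    # An absolute path cannot be mistaken for a command-line option.
    return ["pdftotext", "-layout", str(Path(input_path).resolve()), "-"]


def convert_pdf(input_path, timeout=120):
    """Run pdftotext and return its stdout as text."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    result = subprocess.run(
        pdftotext_argv(input_path),
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed limit; bounds hangs on malformed input
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Because no shell is involved, a file name containing characters such as `;` or spaces reaches the converter unchanged. This does not protect against bugs in the converter itself, which is why the recommendation about trusted, up-to-date binaries still applies.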