paper-lark-report
v1.1.1 Fully automated daily/weekly research-paper report generation. Fetches the latest papers via arXiv RSS, retrieves full abstracts via the arXiv API, filters them with LLM semantic scoring, generates a source-grounded academic report, and pushes it to a Feishu Wiki.
MIT-0
Security Scan
OpenClaw
Suspicious (high confidence)
Purpose & Capability
The code and SKILL.md match the stated purpose: querying arXiv, preparing JSON for an LLM step, and creating Feishu Wiki docs. However, the skill metadata claims no required credentials or env vars while the Feishu integration relies on appId/appSecret stored in ~/.openclaw/openclaw.json. That mismatch between declared requirements and actual credential access is noteworthy.
Instruction Scope
SKILL.md documents the overall flow and mentions using openclaw.json to get Feishu tokens. The runtime scripts only perform network calls to arXiv and Feishu and local file reads/writes in the skill directory, plus one read of the user's home openclaw.json. The code does not perform broad file system enumeration, does not exfiltrate data to unexpected endpoints, and does not call external installers. It does, however, print part of the tenant token to stdout (token prefix), which could leak sensitive info to logs.
Install Mechanism
No install spec / remote downloads are used (instruction-only plus included scripts). No archives or third-party package installs are pulled in by the skill itself, so there is no high-risk install URL or extraction step.
Credentials
The skill declares no required env vars or primary credential, yet create_feishu_doc.py reads ~/.openclaw/openclaw.json to fetch channels.feishu.appId and appSecret and exchanges them for a tenant_access_token. Accessing a home-config JSON with potentially multiple credentials is not declared and increases exposure. The Feishu credentials themselves are proportionate to the stated Feishu publishing capability, but the skill should have declared this requirement and instructed where/how users must provide credentials (or use env vars).
Persistence & Privilege
always:false and no attempt to modify other skills or global agent settings. The skill writes only to its SKILL_DIR/data and processed_log, and registers created docs in a local doc_registry.json. The only cross-directory read is the user's ~/.openclaw/openclaw.json to obtain Feishu credentials.
What to consider before installing
Before installing or running this skill: (1) Understand that it will read ~/.openclaw/openclaw.json to obtain the Feishu appId/appSecret; inspect that file for other secrets and consider creating a dedicated Feishu app with minimal permissions. (2) The skill does not itself perform LLM scoring: it writes data/daily_papers.json and expects an LLM step to produce data/selected_papers.json (you or another skill must perform the scoring). (3) Check config.yaml and set feishu_space_id, feishu_parent_node/feishu_root, and research_direction appropriately. (4) Be aware the script prints a tenant token prefix to stdout (logged output); avoid running it where logs are publicly accessible. (5) If you prefer clearer boundaries, modify create_feishu_doc.load_token to read credentials from a dedicated skill config or explicit environment variables (e.g. FEISHU_APP_ID/FEISHU_APP_SECRET) and stop reading ~/.openclaw/openclaw.json. Review code before you run it.
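Point (5) could be implemented along the lines of the following sketch. The function name `load_feishu_credentials` and the env-var names FEISHU_APP_ID/FEISHU_APP_SECRET are suggestions, not part of the skill's current scripts:

```python
import os

def load_feishu_credentials():
    """Read Feishu app credentials from dedicated env vars instead of
    parsing the whole ~/.openclaw/openclaw.json (which may hold other secrets)."""
    app_id = os.environ.get("FEISHU_APP_ID")
    app_secret = os.environ.get("FEISHU_APP_SECRET")
    if not app_id or not app_secret:
        raise RuntimeError("Set FEISHU_APP_ID and FEISHU_APP_SECRET before running")
    return app_id, app_secret
```

This keeps the skill's credential surface limited to exactly the two values it needs.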
Tags: arxiv, lark, latest, paper-report, research
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
paper-lark-report
Fully automated daily/weekly research-paper report generation Skill. Precise arXiv API retrieval + LLM semantic scoring + Feishu Wiki push.
Installation
# Option 1: install via ClawHub
npx clawhub@latest install leogoat2004/paper-lark-report
# Option 2: install via the OpenClaw CLI
openclaw skills install leogoat2004/paper-lark-report
Core Flow
cron --(isolated session)--> run_daily()
├─ build_arxiv_query(research_direction)
├─ fetch_arxiv_papers(query, max_search_results=20)
├─ deduplicate (processed_ids.json)
├─ fetch_arxiv_details(filtered[:20])
└─ save data/daily_papers.json
      │
      ▼
LLM isolated session
├─ score (0-10)
├─ select Top max_daily_papers
├─ extract motivation + core_innovation from full_abstract (in Chinese)
├─ write data/selected_papers.json
├─ --save-selected
├─ feishu-create-doc skill creates the Wiki doc (as a child node)
└─ --register-doc
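The deduplication step against processed_ids.json can be sketched as follows (the function name `filter_new_papers` and the flat-list file layout are assumptions for illustration; the real script may store ids differently):

```python
import json
from pathlib import Path

def filter_new_papers(papers, processed_path):
    """Drop papers whose arXiv id is already in processed_ids.json,
    then persist the updated id set for the next run."""
    path = Path(processed_path)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = [p for p in papers if p["id"] not in seen]
    seen.update(p["id"] for p in fresh)
    path.write_text(json.dumps(sorted(seen)))
    return fresh
```

Only papers surviving this filter are passed to fetch_arxiv_details and written to data/daily_papers.json.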
Directory Structure
paper-lark-report/
├── SKILL.md
├── config.yaml
├── data/
│   ├── doc_registry.json
│   ├── processed_ids.json
│   ├── daily_papers.json      # candidate papers (LLM input)
│   ├── selected_papers.json   # selected results (with Chinese analysis)
│   └── doc_result.json        # token/url of the most recently created doc
├── processed_log/
│   └── YYYY-MM-DD.json        # daily archive (aggregated for the weekly report)
├── scripts/
│   ├── arxiv_search.py        # arXiv API
│   ├── paper_lark_report.py   # main entry point
│   └── create_feishu_doc.py   # Feishu Wiki creation (direct API)
└── templates/
    ├── daily_report.md        # daily report template (with Instructions)
    └── weekly_report.md       # weekly report template (with Instructions)
Configuration (config.yaml)
| Field | Description |
|---|---|
| feishu_space_id | Wiki space ID (integer, extracted from the URL) |
| feishu_parent_node | Parent node token; docs are created under the paper-lark-report node |
| research_direction | Free-text description of the research direction |
| max_search_results | Max papers fetched from arXiv per day (default 20) |
| max_daily_papers | Max papers selected for the daily report (default 3) |
| arxiv_paper_max_days | Maximum paper age in days (default 7) |
| daily_cron / weekly_cron | cron expressions (UTC+8) |
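A config.yaml following the table above might look like this (all values are illustrative placeholders, not defaults shipped with the skill):

```yaml
feishu_space_id: 7412345678901234567   # integer, taken from the Wiki space URL
feishu_parent_node: "wikcnXXXXXXXX"    # parent node token
research_direction: "multi-agent LLM systems"
max_search_results: 20
max_daily_papers: 3
arxiv_paper_max_days: 7
daily_cron: "0 9 * * *"                # UTC+8
weekly_cron: "0 10 * * 1"              # UTC+8
```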
arXiv Query Strategy
Query construction rule: abs:core_term AND (abs:term1 OR abs:term2 OR ...)
- The first term is the AND core; the rest are OR expansions
- Compound terms (e.g. multi-agent) are recognized as atomic units
- Generic words (towards/safe/efficient, etc.) are filtered out
- At most 8 terms
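The rule can be sketched in Python. The GENERIC stop-word set and the whitespace tokenization are simplifying assumptions; the skill's actual generic-word list and compound-word detection may differ:

```python
# Assumed stop-word list; the skill documents towards/safe/efficient as examples.
GENERIC = {"towards", "safe", "efficient", "for", "of", "and", "the"}

def build_arxiv_query(research_direction, max_terms=8):
    """First term is the AND core, the rest are OR expansions.
    Hyphenated compounds like 'multi-agent' stay atomic because we
    split on whitespace only."""
    terms = [t for t in research_direction.lower().split() if t not in GENERIC]
    terms = terms[:max_terms]
    if not terms:
        raise ValueError("no usable terms in research_direction")
    core, rest = terms[0], terms[1:]
    if not rest:
        return f"abs:{core}"
    ors = " OR ".join(f"abs:{t}" for t in rest)
    return f"abs:{core} AND ({ors})"
```

The resulting string is what the arXiv API accepts in its search_query parameter with the abs: field prefix.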
Verified Feishu Wiki API Notes
Node Creation
POST /wiki/v2/spaces/{space_id}/nodes
body: { obj_type: "docx", parent_node_token, node_type: "origin", title }
returns: { node_token, obj_token }
Note: the Wiki API ignores any obj_token passed in and always creates its own empty document; content must be written using the returned obj_token.
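Assembling that request can be sketched as below. `build_node_request` is illustrative (not a function in the skill's scripts), and the open.feishu.cn base URL is the standard Feishu Open Platform endpoint:

```python
def build_node_request(space_id, parent_node_token, title):
    """Build the URL and body for POST /wiki/v2/spaces/{space_id}/nodes.
    space_id must be an integer; the response's obj_token (not any value
    sent in the request) identifies the docx to write into."""
    url = f"https://open.feishu.cn/open-apis/wiki/v2/spaces/{space_id}/nodes"
    body = {
        "obj_type": "docx",
        "parent_node_token": parent_node_token,
        "node_type": "origin",
        "title": title,
    }
    return url, body
```

The caller would POST `body` to `url` with a Bearer tenant_access_token header and then write blocks against the returned obj_token.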
Block types usable for writing
| block_type | Type | Usable |
|---|---|---|
| 2 | text/paragraph | ✅ |
| 3 | heading1 | ✅ |
| 4 | heading2 | ✅ |
| 5 | heading3 | ✅ |
| 25 | divider | ❌ error 1770029 |
| 27 | callout | ❌ strict field validation |
| 31 | table | ❌ parameter structure mismatch |
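A writer can restrict itself to the verified-usable types up front. This is a sketch: the `elements`/`text_run` payload shape is an assumption about the docx blocks API, and `make_block` is not a function in the skill's scripts:

```python
# Verified-usable block types from the table above.
USABLE = {"text": 2, "heading1": 3, "heading2": 4, "heading3": 5}

def make_block(kind, content):
    """Build one docx block payload, rejecting the block types that
    failed verification (divider, callout, table)."""
    if kind not in USABLE:
        raise ValueError(f"block kind {kind!r} is not verified usable; use {sorted(USABLE)}")
    return {
        "block_type": USABLE[kind],
        kind: {"elements": [{"text_run": {"content": content}}]},
    }
```

Falling back to plain text blocks for dividers and tables avoids errors like 1770029 at write time.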
Key Parameters
- space_id: must be an integer, not a string
- parent_node: parent node token; the document is created under it
- Token acquisition: exchange channels.feishu.appId/appSecret from openclaw.json for a tenant_access_token
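The credential lookup can be sketched as follows. `read_feishu_app_credentials` is illustrative; the token endpoint shown is Feishu's standard internal-app tenant_access_token API:

```python
import json
from pathlib import Path

# Feishu internal-app token exchange endpoint; POST {"app_id", "app_secret"}
# here to receive a tenant_access_token in the JSON response.
TOKEN_URL = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal"

def read_feishu_app_credentials(config_path="~/.openclaw/openclaw.json"):
    """Pull channels.feishu.appId/appSecret out of openclaw.json,
    mirroring what create_feishu_doc.py does."""
    cfg = json.loads(Path(config_path).expanduser().read_text())
    feishu = cfg["channels"]["feishu"]
    return {"app_id": feishu["appId"], "app_secret": feishu["appSecret"]}
```

As noted in the scan above, reading the whole openclaw.json exposes more than these two fields; a dedicated config or env vars would be a narrower alternative.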
CLI
# Daily report (triggered by cron)
python3 scripts/paper_lark_report.py
# Weekly report
python3 scripts/paper_lark_report.py --weekly
# After the LLM has selected papers
python3 scripts/paper_lark_report.py --save-selected "YYYY-MM-DD" "data/selected_papers.json"
python3 scripts/paper_lark_report.py --register-doc "<node_token>" "<obj_token>" "<doc_url>"
