Article to Feishu

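Converts web articles into Feishu documents. Supports Toutiao (今日头条), cnblogs (博客园), WeChat Official Accounts (微信公众号), CSDN, and other sites. Images are downloaded automatically and inserted in their original order. **Use this skill when the user asks for any of the following**: - "Convert this article into a Feishu document" - "Import an article into Feishu" - "Save a web page to Feishu" - "Turn a link into a document" **Supported sites**: - Toutiao (m.toutiao.c...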

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Kevin @mywaystay
Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name/description match the included scripts and SKILL.md: scripts extract article HTML, extract/download images (with Referer handling), and the instructions show how the AI agent should create/update Feishu docs. There are no unrelated binaries, credentials, or surprising capabilities in the bundle.
Instruction Scope
Instructions tell the agent to fetch pages (curl / web_fetch / Jina Reader), extract/download images, and then call feishu_create_doc / feishu_update_doc / feishu_doc_media to build the document. The scripts operate only on provided URLs and temporary directories; they do not attempt to read other system files or hidden credentials. Note: the workflow requires network access and explicitly sends URLs/content to a third‑party service (r.jina.ai) when using the Jina Reader; that is expected by the SKILL.md but is a privacy surface to be aware of.
Install Mechanism
There is no install spec (instruction-only skill with bundled shell scripts). No downloads during installation; network calls occur at runtime when fetching pages/images. This is low install risk. The runtime use of curl to external hosts is expected for this functionality.
Credentials
The skill declares no required environment variables or credentials, yet runtime instructions expect the agent/platform to have Feishu integration (feishu_create_doc, feishu_doc_media) available — the skill doesn't declare or request Feishu credentials itself (this can be normal if the platform supplies connectors, but users should confirm). Also several scripts call the external Jina Reader (https://r.jina.ai/) which will receive the article URL/content and images are downloaded from original hosts — this transmits scraped data to third parties and should be considered a privacy/exfiltration risk for sensitive content.
Persistence & Privilege
The skill is not always:true, does not request persistent/privileged installation, and does not modify other skills or system-wide configs. It only creates and cleans up temporary directories during normal operation.
Assessment
This skill is internally consistent for converting public web articles to Feishu documents. Before installing, consider: (1) It sends article URLs/content to r.jina.ai (Jina Reader) by default — avoid using it on private or sensitive links if you don't want third parties to see the content. (2) The agent/platform must provide Feishu integration and credentials for feishu_create_doc/feishu_doc_media calls; the skill does not itself request or store those credentials. (3) The scripts download images from arbitrary hosts (network traffic and temporary files); review and test with non-sensitive pages first. If you need tighter privacy, modify the scripts to avoid r.jina.ai (fetch pages locally) and audit where network requests go.


Current version: v1.0.0
Tags: article, converter, document, feishu, latest


SKILL.md

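Web Article to Feishu Document

Converts any web article into a Feishu cloud document, automatically handling image hotlink protection and inserting images in their original order.

🚀 Quick Start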

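# 1. Download the article's images (hotlink protection is handled automatically)
bash {baseDir}/scripts/download_article_images.sh "$ARTICLE_URL" /tmp/article-img/

# 2. Fetch the article content (this one-liner extracts only the <title>; -P requires GNU grep)
curl -sL "$ARTICLE_URL" | grep -oP '<title>.*?</title>'
# Or use the web_fetch tool

# 3. The AI agent builds the document in segments
# - feishu_create_doc creates the document
# - feishu_update_doc mode=append appends text
# - feishu_doc_media action=insert inserts an image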

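📖 Workflow

┌──────────────────────────┐
│ 1. Fetch article content │  web_fetch or curl
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ 2. Extract image URLs    │  grep or a dedicated script
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ 3. Download images       │  send a Referer to pass hotlink protection
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ 4. Create the document   │  feishu_create_doc
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ 5. Build in segments     │  text → image → text...
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ 6. Clean up temp files   │  rm -rf /tmp/article-img/
└──────────────────────────┘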
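
Step 2 of the workflow (extracting image URLs) could look roughly like the sketch below. It is an illustration only, not the bundled extract_images.sh: the function name `extract_img_urls` is hypothetical, and it assumes the page uses plain `<img ... src="...">` tags with double-quoted attributes. Pages that lazy-load images through other attributes would need a different pattern.

```shell
# Hypothetical helper, not part of the bundled scripts.
# Reads HTML on stdin and prints every <img> src URL, one per line,
# in the order the images appear in the page.
extract_img_urls() {
  # grep -o emits each matching <img ... src="..."> fragment on its own line;
  # sed then keeps only the URL inside the src attribute.
  grep -oE '<img[^>]+src="[^"]+"' \
    | sed -E 's/.*src="([^"]+)"/\1/'
}
```

A typical call would be `curl -sL "$ARTICLE_URL" | extract_img_urls`. Because nothing in the pipeline sorts the output, the URLs keep their order of appearance.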

🔧 Utility Scripts

download_article_images.sh

A general-purpose image downloader that detects the site automatically and sets the correct Referer.

bash {baseDir}/scripts/download_article_images.sh <article_url> <output_dir> [referer]

Examples

# cnblogs article
bash {baseDir}/scripts/download_article_images.sh "https://www.cnblogs.com/xxx/p/123" /tmp/img/

# Toutiao
bash {baseDir}/scripts/download_article_images.sh "https://m.toutiao.com/is/xxx/" /tmp/img/

# Custom Referer
bash {baseDir}/scripts/download_article_images.sh "$URL" /tmp/img/ "https://example.com/"

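Automatically recognized sites

| Site | Referer |
| --- | --- |
| Toutiao (今日头条) | https://www.toutiao.com/ |
| cnblogs (博客园) | https://www.cnblogs.com/ |
| CSDN | https://blog.csdn.net/ |
| WeChat Official Accounts (微信公众号) | https://mp.weixin.qq.com/ |
| Jianshu (简书) | https://www.jianshu.com/ |
| Zhihu (知乎) | https://zhuanlan.zhihu.com/ |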
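
The mapping in this table can be expressed as a small lookup. The sketch below is an assumption about how such detection might work, not the bundled script's actual code: the function name `referer_for_url` is hypothetical, and falling back to the page's own origin for unknown sites is an illustrative choice.

```shell
# Hypothetical sketch of host-based Referer selection; the bundled
# download_article_images.sh may implement this differently.
referer_for_url() {
  local url="$1"
  case "$url" in
    *toutiao.com*)      echo "https://www.toutiao.com/" ;;
    *cnblogs.com*)      echo "https://www.cnblogs.com/" ;;
    *csdn.net*)         echo "https://blog.csdn.net/" ;;
    *mp.weixin.qq.com*) echo "https://mp.weixin.qq.com/" ;;
    *jianshu.com*)      echo "https://www.jianshu.com/" ;;
    *zhihu.com*)        echo "https://zhuanlan.zhihu.com/" ;;
    *)
      # Assumption: for an unknown site, use the page's own origin.
      echo "$url" | sed -E 's#^(https?://[^/]+).*#\1/#'
      ;;
  esac
}
```

The result can be passed to curl, for example `curl -sL -H "Referer: $(referer_for_url "$ARTICLE_URL")" "$IMG_URL" -o 01.jpg`.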

fetch_article.sh

Fetches article content through Jina AI Reader (suited to sites with anti-scraping measures).

bash {baseDir}/scripts/fetch_article.sh "https://m.toutiao.com/is/xxx/"

extract_images.sh

Extracts image URLs from an article.

bash {baseDir}/scripts/extract_images.sh "https://m.toutiao.com/is/xxx/"

download_images.sh

An image downloader specific to Toutiao.

bash {baseDir}/scripts/download_images.sh "https://m.toutiao.com/is/xxx/" /tmp/img/

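📝 Building the Document in Segments (Core)

Principle

Append text and images alternately so that each image lands in the correct position.

1. feishu_create_doc     → create the document with the title and opening text
2. feishu_update_doc     → append the first text segment
3. feishu_doc_media      → insert the first image
4. feishu_update_doc     → append the second text segment
5. feishu_doc_media      → insert the second image
... repeat until done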
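
One way to plan this alternation ahead of time is to scan the article's Markdown and list the text and image segments in order. The sketch below is illustrative only. It assumes the article is available as Markdown with each image on its own line in the form `![alt](url)`; the function name `build_plan` and its output format are hypothetical.

```shell
# Hypothetical sketch: reads Markdown on stdin and prints a build plan.
# Each standalone image line becomes "image NN"; each run of other
# non-empty lines becomes one "text START-END" entry (input line numbers).
build_plan() {
  awk '
    function emit_text() {
      if (start) { printf "text %d-%d\n", start, last; start = 0 }
    }
    /^!\[[^]]*\]\([^)]*\)[ \t]*$/ {
      emit_text(); printf "image %02d\n", ++n; next
    }
    NF { if (!start) start = NR; last = NR }
    END { emit_text() }
  '
}
```

Each `text` entry would correspond to one feishu_update_doc append, and each `image NN` entry to one feishu_doc_media insert of the matching numbered file.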

Complete example

# Step 1: download the images
bash {baseDir}/scripts/download_article_images.sh "$URL" /tmp/article-img/
# Output: 01.jpg, 02.jpg, 03.jpg...

# Step 2: create the document
feishu_create_doc title="Article title" markdown="Article opening..."

# Step 3: append the first segment
feishu_update_doc doc_id="xxx" mode=append markdown="## Section 1\n\nDescriptive text..."

# Step 4: insert an image
feishu_doc_media action=insert doc_id="xxx" file_path="/tmp/article-img/01.jpg" type=image align=center

# Step 5: keep appending
feishu_update_doc doc_id="xxx" mode=append markdown="More content..."

# Step 6: insert more images...
feishu_doc_media action=insert doc_id="xxx" file_path="/tmp/article-img/02.jpg" type=image align=center

# Step 7: clean up
rm -rf /tmp/article-img/

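🖼️ Image Handling Strategies

Choosing a strategy

| Scenario | Strategy | Notes |
| --- | --- | --- |
| Image URL is publicly accessible | <image url="..."/> | Simple and fast |
| Image is hotlink-protected | Download, then upload | Required! |
| Image URL expires | Download, then upload | Process promptly |
| Unsure | Download, then upload | Safest |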

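Direct URL reference

<image url="https://example.com/image.png" align="center" caption="description"/>

The system downloads the image and uploads it to Feishu automatically.

Local image upload (required for hotlink-protected images)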

{
  "action": "insert",
  "doc_id": "xxx",
  "file_path": "/tmp/article-img/01.jpg",
  "type": "image",
  "align": "center"
}

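⚠️ Notes

  1. Hotlink protection: most sites require a Referer header on image requests; the scripts handle this automatically
  2. Image order: files are named in their original order (01.jpg, 02.jpg...)
  3. Segmented building: feishu_doc_media insert can only append to the end of the document
  4. Temp cleanup: delete the temporary image directory when finished
  5. Image size: Feishu limits images to 20MB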
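
The size limit in note 5 can be checked before uploading. The sketch below is a hypothetical helper, not part of the bundled scripts. It assumes "20MB" means 20 × 1024 × 1024 bytes, which the source does not specify, and accepts an optional second argument to override the limit.

```shell
# Hypothetical helper: succeeds when FILE exists and is no larger than
# MAX_BYTES (default 20 * 1024 * 1024, an assumed reading of "20MB").
image_within_limit() {
  local file="$1"
  local max_bytes="${2:-$((20 * 1024 * 1024))}"
  [ -f "$file" ] || return 1
  local size
  # tr strips the padding that some wc implementations add.
  size=$(wc -c < "$file" | tr -d '[:space:]')
  [ "$size" -le "$max_bytes" ]
}
```

For example, `image_within_limit /tmp/article-img/01.jpg || echo "01.jpg is too large, skipping"` would flag an oversized file before the feishu_doc_media call.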

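🐛 FAQ

Images not showing up?

Cause: hotlink protection or an expired URL

Fix: download the images with download_article_images.sh, then upload them

Images out of order?

Cause: sort -u was used when extracting URLs, which scrambles their order

Fix: the script already downloads images in order of appearance and names the files by sequence number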
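
If you extract URLs by hand and still need to drop duplicates, a common alternative to sort -u is an awk filter that keeps the first occurrence of each line. This is a minimal sketch; the function name `dedupe_keep_order` is hypothetical and not taken from the bundled scripts.

```shell
# Hypothetical helper: removes duplicate lines from stdin while keeping
# the first occurrence of each line in its original position.
dedupe_keep_order() {
  # seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
  # is true only for first occurrences, and awk prints only those lines.
  awk '!seen[$0]++'
}
```

Unlike sort -u, this never reorders its input, so the numbering of downloaded files still matches the order of images in the article.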

Download failing?

# Test manually and check the Referer
curl -sL -H "Referer: https://www.toutiao.com/" "$IMG_URL" -o test.jpg

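📋 Site Characteristics

| Site | Anti-scraping | Hotlink protection | Recommended approach |
| --- | --- | --- | --- |
| Toutiao (今日头条) | | | Jina Reader + download images |
| cnblogs (博客园) | | | curl + download images |
| CSDN | | | Jina Reader + download images |
| WeChat Official Accounts (微信公众号) | | | Jina Reader + download images |
| Jianshu (简书) | | | Fetch directly |
| Zhihu (知乎) | | | Download images |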
