Link Library

v1.0.0

Personal knowledge base that captures web content (articles, tweets/threads, videos, podcasts, images, PDFs) and makes it retrievable for future conversation...


by 不白 (@nowhitestar)

MIT-0

Security Scan

VirusTotal

Suspicious


OpenClaw

Suspicious

high confidence

Purpose & Capability

The skill claims to capture web content and make it retrievable, which matches the instructions. However, the registry metadata declares no required binaries or credentials, while SKILL.md explicitly invokes many external tools (curl against r.jina.ai, yt-dlp, xreach, a local Python WeChat script, etc.). Those tools are necessary for the described capabilities but are not declared. This inconsistency should be resolved before the skill is trusted.

Instruction Scope

The instructions direct the agent to fetch remote content and to always save the "full original text" into a local library directory (~/.openclaw/workspace-main/library/). They also call third-party fetch endpoints (e.g., https://r.jina.ai/URL), run yt-dlp to download subtitles/media, and run a local script at ~/.agent-reach/tools/wechat-article-for-ai. SKILL.md also defines an auto-save policy under which some saves happen without confirmation. These behaviors expand the scope into network I/O, file writes, and potential disclosure of URLs/content to external services.


Install Mechanism

This is an instruction-only skill (no install spec), which carries low install risk on its own. However, the skill expects many third-party command-line tools and a local script, and gives no guidance on installing them. The lack of a declared install mechanism or dependency list is an operational and security gap: an operator may unknowingly run commands that fail, or run CLIs that nobody has reviewed.

Credentials

The skill declares no required environment variables or credentials, yet the fetch methods (xreach for Twitter/X, a WeChat Python script, yt-dlp for some sites) commonly require API tokens, cookies, or authenticated access. In addition, using remote services like r.jina.ai to fetch page text transmits user-shared URLs (and potentially page content) to a third party. With no declared credentials and no mention of where sensitive tokens are stored, the access the skill actually needs is ambiguous and out of proportion to what it declares.

Persistence & Privilege

The skill writes content persistently to a specific path under the user's home directory and mandates saving full original text (potentially storing sensitive or copyrighted material). Although the skill sets always: false, it allows autonomous invocation and includes auto-save rules that can save without explicit confirmation in some cases. Combined with network fetches to third parties, this increases privacy/exfiltration risk.

What to consider before installing

Before installing or enabling this skill: (1) Treat it as capable of writing persistent files under ~/.openclaw/workspace-main/library/ and of sending URLs/content to third parties (e.g., r.jina.ai) — do not use it with sensitive or corporate links until you trust those endpoints. (2) Confirm which binaries and local scripts it requires (yt-dlp, xreach, curl, python3, the local wechat script) and inspect/install them from trusted sources; the skill currently declares none. (3) Consider disabling or changing the auto-save policy — require explicit user confirmation before saving full original text. (4) If you need to use it, run it in a sandboxed account or VM and audit the files it writes and the network calls it makes. (5) What would increase confidence: an explicit dependency/install section, a clear list of required credentials and how they’re used/stored, and removal or opt-in control over third‑party fetch endpoints (or an option to fetch content locally rather than via r.jina.ai).
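Point (2) can be checked mechanically before the skill is enabled. The following is a minimal sketch, not part of the skill: the binary list and script path are taken from the scan text and SKILL.md, while the function name `missing_dependencies` and the assumption that the script's entry point is `main.py` in that directory are illustrative.

```python
import shutil
from pathlib import Path

# Tools named in the scan; the skill's registry metadata declares none of them.
REQUIRED_BINARIES = ["curl", "yt-dlp", "xreach", "python3"]
# Assumed entry point of the local WeChat script that SKILL.md invokes.
WECHAT_SCRIPT = Path.home() / ".agent-reach" / "tools" / "wechat-article-for-ai" / "main.py"


def missing_dependencies(binaries=REQUIRED_BINARIES, script=WECHAT_SCRIPT, which=shutil.which):
    """Return the binaries and scripts that cannot be found on this machine."""
    missing = [name for name in binaries if which(name) is None]
    if not Path(script).is_file():
        missing.append(str(script))
    return missing


if __name__ == "__main__":
    for item in missing_dependencies():
        print(f"missing: {item}")
```

Finding a binary on PATH says nothing about where it came from, so this complements rather than replaces inspecting each tool's source.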

Like a lobster shell, security has layers — review code before you run it.

latest: vk978r26x2zkss5sacv2nw3mgj9837w1s

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

Link Library — Personal Content Knowledge Base

Save web content with full original text, generate summaries and tags, retrieve semantically.

Core Rules

Always save original full text — summaries are for retrieval, originals are for re-reading
Detect interest, don't demand commands — if user engages with a link, offer to save
Twitter/X is first-class — tweets, threads, and articles are fully supported

Interest Detection

When user shares a link, evaluate interest signals:

Auto-save (no confirmation needed):

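User explicitly says save/bookmark/记一下 ("note this down")/放进知识库 ("put it in the knowledge base")
User asks "帮我总结一下" ("summarize this for me"); asking for a summary implies the link is worth saving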

Offer to save (ask once):

User shares link + positive commentary ("这篇不错" / "this one is good", "有意思" / "interesting", "学到了" / "learned something")
User asks follow-up questions about link content
User discusses link content substantively

Don't save:

User shares link just for quick reference in conversation
User says "不用保存" ("no need to save") or similar
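The three tiers above can be approximated with keyword matching. This is a rough sketch under stated assumptions: the function name `save_decision`, the return labels, and the English refusal phrases are hypothetical, and only the quoted Chinese phrases and the words save/bookmark come from the rules. Signals such as follow-up questions or substantive discussion need conversational context and are not modeled here.

```python
import re

# Phrases quoted in the rules above.
AUTO_SAVE = ("save", "bookmark", "记一下", "放进知识库", "帮我总结一下")
POSITIVE = ("这篇不错", "有意思", "学到了")
# "不用保存" is from the rules; the English variants are assumed examples of "or similar".
DONT_SAVE = ("不用保存", "don't save", "no need to save")


def save_decision(message: str) -> str:
    """Return 'skip', 'auto', 'offer', or 'none' for a message that shares a link."""
    # Drop URLs first so a word like "save" inside a link does not count as a signal.
    text = re.sub(r"https?://\S+", " ", message.lower())
    if any(phrase in text for phrase in DONT_SAVE):
        return "skip"  # an explicit refusal wins over everything else
    if any(phrase in text for phrase in AUTO_SAVE):
        return "auto"
    if any(phrase in text for phrase in POSITIVE):
        return "offer"
    return "none"
```

Refusals are checked first because "don't save" also contains the auto-save keyword "save".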

Data Location

All entries in ~/.openclaw/workspace-main/library/:

library/
├── articles/     # Web articles, blog posts, WeChat, Zhihu
├── tweets/       # Twitter/X posts and threads
├── videos/       # YouTube, Bilibili
├── podcasts/     # Podcast episodes
├── papers/       # Academic papers, PDFs
├── images/       # Infographics, visual content
└── misc/         # Everything else

Content Types & Fetch Methods

| Type | URL Patterns | Fetch Method | Template |
|------|--------------|--------------|----------|
| article | Generic web, blog, /post/ | `web_fetch` or `curl -s "https://r.jina.ai/URL"` | `article.md` |
| wechat | mp.weixin.qq.com | `cd ~/.agent-reach/tools/wechat-article-for-ai && python3 main.py "URL"` | `article.md` |
| tweet | x.com, twitter.com /status/ | `xreach tweet URL --json` | `tweet.md` |
| thread | x.com, twitter.com (thread) | `xreach thread URL --json` | `tweet.md` |
| video | youtube.com, youtu.be | `yt-dlp --dump-json "URL"` + subtitle extraction | `video.md` |
| bilibili | bilibili.com | `yt-dlp --dump-json "URL"` + subtitle extraction | `video.md` |
| paper | arxiv.org, .pdf links | `web_fetch` or browser | `paper.md` |
| podcast | Podcast platforms | `web_fetch` metadata | `podcast.md` |
| image | Image URLs | Download + describe | `image.md` |
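The URL-pattern column can be read as a small dispatch function. The sketch below is one possible interpretation, not the skill's actual matcher: the name `detect_type` and the list of image extensions are assumptions, podcast platforms are not enumerated in the table and so fall through to `article`, and a thread cannot be told apart from a single tweet by URL alone.

```python
from urllib.parse import urlparse

# Assumed set of image extensions; the table only says "Image URLs".
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")


def detect_type(url: str) -> str:
    """Map a URL to one of the content types in the table above."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    path = parsed.path.lower()

    if host == "mp.weixin.qq.com":
        return "wechat"
    if host in ("x.com", "twitter.com") and "/status/" in path:
        return "tweet"  # may still be a thread; the URL does not say
    if host in ("youtube.com", "youtu.be"):
        return "video"
    if host == "bilibili.com" or host.endswith(".bilibili.com"):
        return "bilibili"
    if host == "arxiv.org" or path.endswith(".pdf"):
        return "paper"
    if path.endswith(IMAGE_EXTENSIONS):
        return "image"
    return "article"  # generic fallback, also used for unrecognized platforms
```

`str.removeprefix` requires Python 3.9 or later.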

Twitter/X Fetch Details

# Single tweet
xreach tweet URL_OR_ID --json

# Full thread
xreach thread URL_OR_ID --json

# User timeline (for context)
xreach tweets @username -n 20 --json

Extract from JSON: full_text, user.screen_name, created_at, entities, media URLs. For threads: concatenate all tweets in order as full content.
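As an illustration of the extraction step, the sketch below joins a thread into one text body. The JSON shape is an assumption: it presumes `xreach ... --json` prints either one tweet object or a list of them, each carrying the `full_text`, `user.screen_name`, and `created_at` fields named above. The real output format should be checked before relying on this, and `thread_to_text` is a hypothetical name.

```python
import json


def thread_to_text(raw_json: str) -> dict:
    """Join the tweets of a thread, in the order given, into one text body."""
    tweets = json.loads(raw_json)
    if isinstance(tweets, dict):  # a single tweet rather than a list
        tweets = [tweets]
    first = tweets[0]
    return {
        "author": "@" + first["user"]["screen_name"],
        "date_published": first["created_at"],
        # Blank line between tweets keeps the thread readable as one document.
        "content": "\n\n".join(tweet["full_text"] for tweet in tweets),
    }
```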

Video Subtitle Extraction

# Download subtitles
yt-dlp --write-sub --write-auto-sub --sub-lang "zh-Hans,zh,en" \
  --convert-subs vtt --skip-download -o "/tmp/%(id)s" "URL"
# Then read the .vtt file as transcript
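Reading the .vtt file "as transcript" usually means dropping the header, cue timings, and inline markup. A minimal sketch of that cleanup follows; the function name `vtt_to_transcript` is hypothetical, and the handling of `Kind:`/`Language:` header lines and repeated caption lines reflects what auto-generated subtitles commonly contain rather than anything SKILL.md specifies.

```python
import re

# Cue timing lines such as "00:00:01.000 --> 00:00:03.000 align:start".
TIMING = re.compile(r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3} --> ")
# Inline markup such as <c>, </c>, or <00:00:01.500>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_transcript(vtt_text: str) -> str:
    """Reduce WebVTT subtitle text to plain transcript lines."""
    lines = []
    for raw in vtt_text.splitlines():
        line = TAG.sub("", raw).strip()
        if not line or TIMING.match(line):
            continue
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue  # file header and metadata, not spoken content
        if lines and lines[-1] == line:
            continue  # rolling captions often repeat the previous line
        lines.append(line)
    return "\n".join(lines)
```

With the `-o "/tmp/%(id)s"` template above, the subtitle file is expected to land next to that stem with a language suffix. The exact filename depends on which language yt-dlp found, so the path passed in should be discovered rather than hard-coded.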

Entry Structure

Every entry has two parts:

1. YAML Frontmatter (structured metadata)

title: "..."
source: "..."           # Platform/domain
url: "..."              # Original URL
author: "..."           # Author or @handle
date_published: "..."   # When content was created
date_saved: "..."       # When we saved it
last_updated: "..."     # Last modification
type: article|tweet|video|podcast|paper|image
tags: [tag1, tag2, ...]
status: unread|read|reviewed
priority: low|normal|high
related: []             # Paths to related entries

2. Markdown Body (content)

# {title}

## Summary
2-3 sentence summary.

## Key Points
- Point 1
- Point 2

## Original Content
THE FULL ORIGINAL TEXT — not truncated, not summarized.
This is the authoritative source for re-reading and quoting.

## Quotes
> Notable quotes worth highlighting

## Notes
Personal observations, connections, action items.

## Related
- [[library/tweets/related-tweet]]
- [[library/articles/related-article]]

⚠️ MANDATORY: Always save original full text in "Original Content" section. Summaries and key points are for quick retrieval. The original text is for accurate re-reading and quoting. Never skip saving the full content.
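The two-part entry structure can be assembled with the standard library alone. This is a sketch under assumptions: `render_entry` is a hypothetical helper, the `---` delimiters are the usual frontmatter convention rather than something shown above, and values are serialized with `json.dumps`, relying on JSON strings and lists being valid YAML flow syntax. The Quotes, Notes, and Related sections are omitted for brevity.

```python
import json

# Field order follows the frontmatter listing above.
FIELD_ORDER = [
    "title", "source", "url", "author", "date_published", "date_saved",
    "last_updated", "type", "tags", "status", "priority", "related",
]


def render_entry(meta: dict, summary: str, key_points: list, original: str) -> str:
    """Build an entry: YAML frontmatter followed by the Markdown body."""
    if not original.strip():
        # Enforces the mandatory rule: an entry without full text is rejected.
        raise ValueError("Original Content must not be empty")
    front = "\n".join(
        f"{key}: {json.dumps(meta[key], ensure_ascii=False)}"
        for key in FIELD_ORDER
        if key in meta
    )
    points = "\n".join(f"- {point}" for point in key_points)
    return (
        f"---\n{front}\n---\n\n"
        f"# {meta['title']}\n\n"
        f"## Summary\n{summary}\n\n"
        f"## Key Points\n{points}\n\n"
        f"## Original Content\n{original}\n"
    )
```

Passing `ensure_ascii=False` keeps Chinese titles and tags readable in the saved file instead of escaping them.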

Filename Convention

<slugified-title>-<YYYY-MM-DD>.md

Examples:

library/articles/yc-why-not-work-and-startup-2026-03-12.md
library/tweets/garry-tan-on-yc-advice-2026-03-13.md
library/videos/how-to-build-agents-2026-03-13.md
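One way to produce names in this shape is sketched below. The helper `entry_filename` is hypothetical, the `untitled` fallback is an assumption, and the convention does not say whether the date is the save date or the publication date, so the caller passes it in explicitly.

```python
import re
from datetime import date


def entry_filename(title: str, day: date) -> str:
    """Return '<slugified-title>-<YYYY-MM-DD>.md' for a title and a date."""
    # Collapse every run of non-word characters into one hyphen.
    # \w is Unicode-aware in Python 3, so CJK titles keep their characters.
    slug = re.sub(r"[^\w]+", "-", title.lower()).strip("-")
    return f"{slug or 'untitled'}-{day.isoformat()}.md"
```

For example, `entry_filename("How to Build Agents", date(2026, 3, 13))` gives `how-to-build-agents-2026-03-13.md`, matching the video example above.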

Save Workflow

Detect URL — Parse link from user message
Identify type — Match URL pattern to content type
Check dedup — memory_search("URL or title") to avoid duplicates
Fetch content — Use appropriate method from table above
Generate metadata — Title, summary, key points, tags (3-7)
Write entry — Use template, fill frontmatter + full original text
Confirm — Tell user: title, tags, and where it's saved
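The seven steps can be read as a small pipeline. The sketch below only fixes the order of operations; every step is injected as a callable because the real tools (`memory_search`, the fetch commands, the templates) live in the agent runtime. The name `save_link`, the callable signatures, and the message wording are all assumptions.

```python
import re

# Simplified URL pattern; real messages may need stricter parsing.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def save_link(message, *, identify, is_duplicate, fetch, summarize, write):
    """Run the save workflow on one user message; return a confirmation or None."""
    match = URL_RE.search(message)                # 1. detect URL
    if match is None:
        return None
    url = match.group(0).rstrip(".,)")            # trim trailing punctuation
    kind = identify(url)                          # 2. identify type
    if is_duplicate(url):                         # 3. dedup check
        return f"Already saved: {url}"
    original = fetch(kind, url)                   # 4. fetch full content
    meta = summarize(original)                    # 5. expects keys: title, tags
    path = write(kind, url, meta, original)       # 6. write entry
    tags = ", ".join(meta["tags"])
    return f"Saved '{meta['title']}' [{tags}] to {path}"  # 7. confirm
```

Keeping the steps as parameters makes it easy to swap in a confirmation prompt before step 6 if the auto-save policy is tightened.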

Search & Retrieval

# Semantic search
memory_search("创业方法论")          # "startup methodology"
memory_search("Garry Tan 的推文")    # "Garry Tan's tweets"
memory_search("AI agent 视频教程")   # "AI agent video tutorials"

# Read specific entry
memory_get("library/tweets/garry-tan-on-yc-advice-2026-03-13.md")

When returning search results, show:

Title + source + date
Summary (2 lines max)
Tags
Offer to show full original text
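A result in that shape might be rendered as follows. This is a sketch: `format_result` is a hypothetical helper, the entry is assumed to be a dict with the frontmatter keys plus a `summary` string, `date_published` is an arbitrary choice for the unspecified "date", and the wording of the final offer is illustrative.

```python
def format_result(entry: dict) -> str:
    """Render one search hit: header line, short summary, tags, and an offer."""
    # Cap the summary at two lines, as the display rules require.
    summary_lines = entry["summary"].strip().splitlines()[:2]
    return "\n".join([
        f"{entry['title']} | {entry['source']} | {entry['date_published']}",
        *summary_lines,
        "Tags: " + ", ".join(entry["tags"]),
        "Ask for the full original text to read the whole entry.",
    ])
```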

Writing Reference Mode

When user asks to write something using saved content:

Search library for relevant entries
Read full original text of top matches
Synthesize insights, cite sources inline
Format citations as [[library/type/entry-name]]
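The citation format is the entry path without its `.md` extension, wrapped in double brackets. A small sketch, with `citation` as a hypothetical helper name:

```python
from pathlib import PurePosixPath


def citation(entry_path: str) -> str:
    """Turn 'library/<type>/<name>.md' into a '[[library/<type>/<name>]]' link."""
    path = PurePosixPath(entry_path)
    if path.suffix == ".md":
        path = path.with_suffix("")  # drop only the Markdown extension
    return f"[[{path}]]"
```

Checking for `.md` explicitly avoids truncating a name that merely contains a dot.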

Templates

Located in templates/:

article.md — Web articles, blog posts, newsletters
tweet.md — Twitter/X posts and threads
video.md — Videos with transcript
podcast.md — Podcast episodes
paper.md — Academic papers
image.md — Visual content

Best Practices

Save originals religiously — summaries lose nuance
Tag consistently — reuse existing tags, keep vocabulary tight
Link related entries — build a knowledge graph over time
Don't over-ask — if interest is clear, just save and confirm

Files

7 total
