Install
openclaw skills install content-parserExtract and parse content from URLs. Triggers on: user provides a URL to extract content from, another skill needs to parse source material, "parse this URL", "extract content", "解析链接", "提取内容".
openclaw skills install content-parserExtract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or standalone content extraction.
shared/authentication.md for API key and headersshared/common-patterns.md for polling, errors, and interaction patternsshared/config-pattern.md before any interaction~/Downloads/ or .listenhub/ — save to the current working directoryFollow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Follow shared/config-pattern.md Step 0.
If file doesn't exist — ask location, then create immediately:
mkdir -p ".listenhub/content-parser"
echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
CONFIG_PATH=".listenhub/content-parser/config.json"
# (or $HOME/.listenhub/content-parser/config.json for global)
Then run Setup Flow below.
If file exists — read config, display summary, and confirm:
当前配置 (content-parser):
自动下载:{是 / 否}
Ask: "使用已保存的配置?" → 确认,直接继续 / 重新配置
autoDownload: trueautoDownload: falseSave immediately:
NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Free text input. Ask the user:
What URL would you like to extract content from?
Ask if the user wants to configure extraction options:
Question: "Do you want to configure extraction options?"
Options:
- "No, use defaults" — Extract with default settings
- "Yes, configure options" — Set summarize, maxLength, or Twitter tweet count
If "Yes", ask follow-up questions:
Summarize:
Ready to extract content:
URL: {url}
Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default
Proceed?
Wait for explicit confirmation before calling the API.
Validate URL: Must be HTTP(S). Normalize if needed (see references/supported-platforms.md)
Build request body:
{
"source": {
"type": "url",
"uri": "{url}"
},
"options": {
"summarize": true/false,
"maxLength": 5000,
"twitter": {
"count": 50
}
}
}
Omit options if user chose defaults.
Submit (foreground): POST /v1/content/extract → extract taskId
Tell the user extraction is in progress
Poll (background): Run the following exact bash command with run_in_background: true and timeout: 300000. Note: status field is .data.status (not processStatus), interval is 5s, values are processing/completed/failed:
TASK_ID="<id-from-step-3>"
for i in $(seq 1 60); do
RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
case "$STATUS" in
completed) echo "$RESULT"; exit 0 ;;
failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
*) sleep 5 ;;
esac
done
echo "TIMEOUT" >&2; exit 2
When notified, download and present result:
If autoDownload is true:
{taskId}-extracted.md to the current directory — full extracted content in markdown{taskId}-extracted.json to the current directory — full raw API response dataecho "$CONTENT_MD" > "${TASK_ID}-extracted.md"
echo "$RESULT" > "${TASK_ID}-extracted.json"
Present:
内容提取完成!
来源:{url}
标题:{metadata.title}
长度:~{character count} 字符
消耗积分:{credits}
已保存到当前目录:
{taskId}-extracted.md
{taskId}-extracted.json
Show a preview of the extracted content (first ~500 chars)
Offer to use content in another skill (e.g. /podcast, /tts)
Estimated time: 10-30 seconds depending on content size and platform.
shared/api-content-extract.mdreferences/supported-platforms.mdshared/common-patterns.md § Async Pollingshared/common-patterns.md § Error Handlingshared/config-pattern.mdUser: "Parse this article: https://en.wikipedia.org/wiki/Topology"
Agent workflow:
https://en.wikipedia.org/wiki/Topologycurl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://en.wikipedia.org/wiki/Topology"
}
}'
curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
-H "Authorization: Bearer $LISTENHUB_API_KEY"
User: "Extract recent tweets from @elonmusk, get 50 tweets"
Agent workflow:
https://x.com/elonmusk{"twitter": {"count": 50}}curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://x.com/elonmusk"
},
"options": {
"twitter": {
"count": 50
}
}
}'