Install
openclaw skills install web-collection通过云端连接器优先执行浏览器插件数据采集,也可回退到本地连接器;适用于抖音、TikTok、小红书、Amazon、Bilibili 的采集任务,以及 web-collection 首次上手、配置、付费使用说明和 QA 排障。
openclaw skills install web-collectionUse this skill for browser-extension collection tasks on:
The online documents are the source of truth for user-facing guidance. Do not duplicate their long-form onboarding, paid-access, UI operation, or QA content in this skill. When a user asks for guidance, read the relevant online document if tools can access it; if not, give the user the link and say the online document should be used as the current guide.
Decision rule:
For collection execution, keep using this SKILL.md and the bundled scripts as the agent contract. For complex or ambiguous execution requests, read references/learning-guide.md as an offline routing and recovery summary after checking whether the QA guide applies.
openclaw browser profile.id and token.local: talk to the local bridge directly and only run the local send-command scriptcloud: call the cloud connector dispatch API and only run the cloud send-command scriptcloud mode, do not rewrite the collection payload. Only wrap it in:
device_idactionpayloadpersonalSmart from this skill. Send personal plus deduplication.enabled=true; the new plugin switches to its smart personal export internally.This skill uses one preferences file:
$OPENCLAW_STATE_DIR/skill-state/web-collection/preferences.json
Fallback:
$HOME/.openclaw/skill-state/web-collection/preferences.json
Helper script:
bash {baseDir}/scripts/export_preference.sh show
bash {baseDir}/scripts/export_preference.sh check
bash {baseDir}/scripts/export_preference.sh apply-recommended
bash {baseDir}/scripts/export_preference.sh set-key defaultConnectionMode cloud
bash {baseDir}/scripts/export_preference.sh set-key defaultCloudDeviceId desktop-local-smoke-fix
bash {baseDir}/scripts/export_preference.sh set-key defaultCloudToken <user_api_key>
bash {baseDir}/scripts/export_preference.sh set-key defaultExportMode csv
bash {baseDir}/scripts/export_preference.sh set-key defaultDeduplicationEnabled true
bash {baseDir}/scripts/export_preference.sh set-key defaultDeduplicationStrategy keepOld
Required defaults:
defaultExportModedefaultMaxItemsdefaultFetchDetaildefaultDetailSpeedOptional defaults with built-in fallback:
defaultDeduplicationEnabled defaults to truedefaultDeduplicationStrategy defaults to keepOldMode-specific defaults:
local
cloud
https://i-sync.cn by defaultdefaultCloudDeviceId only when it is not already provided by environment variablesdefaultCloudToken only when it is not already provided by environment variablesrun.sh enforces this. If these are incomplete, collection must not start.
On first use:
cloud when the user does not specify a mode.cloud, collect these values only when they are not already available from environment variables or stored preferences:
defaultCloudDeviceIddefaultCloudToken推荐配置自己配置推荐配置, run:bash {baseDir}/scripts/export_preference.sh apply-recommended
自己配置, ask for all common values in one message, not one by one.check passes.Preferred cloud prompt:
检测到你要走云端分发,还需要这两个配置:
- device_id
- API token
请一次性发给我。
Preferred quick-reply prompt for common defaults:
常用配置还需要确认一次。
这些配置包括:
- 导出方式
- 默认采集条数
- 是否默认采集详情
- 默认采集速度
- 是否开启导出去重
- 去重保留策略
你可以直接用推荐配置,也可以自己配置。
[[quick_replies: 推荐配置, 自己配置]]
Preferred custom-config prompt:
好,我们一次性把默认配置定好。请直接按下面格式回复:
导出方式:CSV / 多维表格
默认采集条数:10 / 20 / 50 / 100
是否默认采集详情:是 / 否
默认采集速度:fast / medium / slow
是否开启导出去重:是 / 否
去重保留策略:保留原始数据 / 保留新数据
说明:
- 多维表格:适合查看、筛选、分享
- CSV:适合本地保存
- 采集详情:开启后结果更完整,但一般更慢
- 采集速度:推荐 fast
- 导出去重:推荐开启
- 保留原始数据:重复数据保留第一次导入版本
Recommended defaults:
cloud多维表格20truefasttruekeepOld(保留原始数据)Use cloud mode when the collection request should be sent to the platform backend first, and then dispatched to the user's connected local connector.
Cloud responsibilities:
/api/v1/connector/cloud/dispatchAuthorization: Bearer <user_api_key>device_idpayloadmaxItems=20, mode=search, interval=300, fetchDetail=true, detailSpeed=fast/api/v1/connector/cloud/commands/{command_id} for final status and result/api/v1/connector/cloud/commands?device_id=...result + task_updates as the source of completion snapshotDo not:
19820 port from the cloud pathpayload semanticsWhen collection fails, parameters look incomplete, or status is unclear, run connector checks in this order instead of guessing.
Layer 1: capability
GET /api/helpGET /api/routesGET /api/filters (or platform/method scoped)Layer 2: diagnostics
GET /api/statusGET /api/platform-stateGET /api/cloud/statusPOST /api/preflight with the final request bodyLayer 3: execution and tracking
POST /api/collectGET /api/tasks/:id (local mode)GET /api/v1/connector/cloud/commands/{command_id} (cloud mode, preferred)GET /api/v1/connector/cloud/commands?device_id=... (cloud fallback)POST /api/stop or POST /api/reset when stuckLocal command template (admin token required):
TOKEN="$(cat ~/.meixi-connector/bridge-admin-token.txt)"
curl -s -H "x-connector-admin-token: $TOKEN" "http://127.0.0.1:19820/api/status"
Cloud command template (async result):
curl -s -H "Authorization: Bearer <token_or_api_key>" \
"https://i-sync.cn/api/v1/connector/cloud/commands/<command_id>"
bitable
--export-target bitableexport.tableUrl on successpersonalexport.deduplication.enabled=true and strategy=keepOldcsv
--export-target csvDeduplication fields are owned by the connector/plugin. Do not ask the user to choose a field and do not pass a field from this skill.
For bitable export, this skill must treat personal export as a smart-export conversation, not as a plain success-or-fail step.
If collection succeeded but export failed:
新建表继续导出导出为CSVRecommended failure explanation:
采集已经完成,失败发生在导出到原多维表格这一步。
原表导出失败通常有几种原因:
- 原目标表不存在
- 原目标表字段结构与本次数据不一致
- 导出服务临时异常
为了避免这次结果丢失,我现在可以继续帮你导出。你可以选择:
- 新建表继续导出:在当前多维表格中创建一个新的数据表后继续导出
- 导出为CSV:直接把这次结果保存成 CSV 文件
[[quick_replies: 新建表继续导出, 导出为CSV]]
If the user chooses 导出为CSV, switch the follow-up run to CSV export.
If the user chooses 新建表继续导出 or 导出为CSV after a failed export and the previous result includes a connector taskId, do not start a new collection. Re-export from the cached task records:
bash {baseDir}/scripts/reexport_task.sh --task-id "<taskId>" --export-target csv
bash {baseDir}/scripts/reexport_task.sh --task-id "<taskId>" --export-target bitable --new-table
This uses POST /api/tasks/:id/export and must be described as reusing the previous task's cached records. If no task id is available, be honest that the skill cannot guarantee a no-recollect retry.
Expected bitable connector payload shape:
{
"autoExport": true,
"exportMode": "personal",
"export": {
"enabled": true,
"mode": "personal",
"deduplication": {
"enabled": true,
"strategy": "keepOld"
}
}
}
Do not use this shape:
{
"exportMode": "personalSmart"
}
Preferred wrapper:
bash {baseDir}/scripts/run.sh ...
The wrapper:
scripts/preflight_check.sh firstscripts/cloud_dispatch_loop.shscripts/collect_and_export_loop.shscripts/preflight_check.sh
scripts/run.sh
connection-modescripts/collect_and_export_loop.sh
scripts/cloud_dispatch_loop.sh
scripts/reexport_task.sh
scripts/export_preference.sh
references/learning-guide.md
Douyin keyword search:
bash {baseDir}/scripts/run.sh \
--platform douyin \
--keyword "AI" \
--ensure-bridge
Douyin keyword search via cloud dispatch:
bash {baseDir}/scripts/run.sh \
--connection-mode cloud \
--cloud-device-id desktop-local-smoke-fix \
--cloud-token '<user_api_key>' \
--platform douyin \
--keyword "AI员工"
Amazon keyword search:
bash {baseDir}/scripts/run.sh \
--platform amazon \
--keyword "Chinese porcelain" \
--ensure-bridge
Bilibili keyword search:
bash {baseDir}/scripts/run.sh \
--platform bilibili \
--keyword "古董" \
--ensure-bridge
Wrapper defaults:
douyin => videoKeywordtiktok => keywordSearchxiaohongshu => keywordSearchamazon => keywordSearchbilibili => keywordSearchSupported methods:
douyin: videoKeyword, creatorKeyword, creatorLink, creatorVideo, videoComment, videoInfo, videoLinktiktok: keywordSearch, userVideo, tiktokComment, tiktokCreatorKeyword, tiktokCreatorLinkxiaohongshu: keywordSearch, creatorNote, creatorLink, creatorKeyword, noteLink, noteCommentamazon: keywordSearch, productLink, productReviewbilibili: keywordSearch, videoInfo, creatorVideo, bilibiliCommentlocal mode:
pluginConnected=true/api/collectTASK_RUNNING via stop -> wait idle -> retry/api/tasks/<taskId> until completed or errorcloud mode:
/api/v1/connector/cloud/status?device_id=...action=collect to /api/v1/connector/cloud/dispatch/api/v1/connector/cloud/commands/{command_id} preferred)completed or terminal error stateresult and task_updates for records/count/export snapshot and include key fields in the final replyQuick query examples:
curl -H "Authorization: Bearer <token_or_api_key>" \
"https://i-sync.cn/api/v1/connector/cloud/commands?device_id=<device_id>"
curl -H "Authorization: Bearer <token_or_api_key>" \
"https://i-sync.cn/api/v1/connector/cloud/commands/<command_id>"
When successful:
local or cloud mode.cloud mode was used, include the command status.bitable and export.tableUrl exists, include the table link first.csv, explicitly say export mode is CSV.When bitable export is expected but no table link exists, explicitly say export did not finish correctly.
pluginConnected=false
local mode, ensure collect, status, and stop all use the same local base URLdefaultCloudTokendefaultCloudDeviceId and whether the local connector is onlineTASK_RUNNING
--force-stop-before-start