Deep Research Pro v2.2

v2.2.0

Conducts thorough research by downloading full PDFs, extracting structured data with original text quotes, verifying sources, and generating cross-validated...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for xueylee-dotcom/deep-research-v22.

Prompt Preview: Install & Setup
Install the skill "Deep Research Pro v2.2" (xueylee-dotcom/deep-research-v22) from ClawHub.
Skill page: https://clawhub.ai/xueylee-dotcom/deep-research-v22
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install deep-research-v22

ClawHub CLI


npx clawhub@latest install deep-research-v22
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Benign
medium confidence
Purpose & Capability
The name/description (full‑text extraction, quoting, source verification) align with the included scripts (extract-from-pdf.py, check-sourcing.sh, quality-score.py) and templates. There are no unrelated credentials or unusual binaries requested.
Instruction Scope
Instructions explicitly require downloading PDFs from arbitrary URLs, extracting text, copying 50+ character verbatim quotes from the source, and running a provenance check; these are coherent with the stated goal. Notes of caution: downloading and parsing arbitrary PDFs can expose the host to malformed or malicious files; the workflow promotes copying verbatim excerpts, which may have copyright implications; and check-sourcing.sh assumes a particular report/source file layout and relies on grep patterns.
!
Install Mechanism
There is no install spec even though the code depends on Python and either pdfplumber (a pip package) or pdftotext (and likely the poppler system library). SKILL.md lists those tools as "installed", but the registry metadata declares no required binaries or dependencies; this mismatch means the agent or user must ensure the environment has the needed packages. Also check that grep -P (PCRE) is available where the scripts run.
Credentials
The skill requests no environment variables or credentials and the scripts do not read secrets or other env vars. Network access to arbitrary URLs is required for the intended purpose; no external endpoints or hidden exfiltration channels are present in the code.
Persistence & Privilege
The skill is not always-on and does not request persistent system privileges or alter other skills. It writes temporary files (uses /tmp) and deletes them; that is normal for its function.
Assessment
This skill appears coherent with its stated purpose, but take these precautions before installing or executing it:

  • Ensure your environment has Python 3 plus pdfplumber (pip) or pdftotext + poppler; the registry declares no package/dependency requirements.
  • Run the workflow in an isolated environment (VM/container), because it downloads and opens arbitrary PDFs, and malformed PDFs can exploit local parsers.
  • Review the code (extract-from-pdf.py and check-sourcing.sh) yourself if you can; the scripts perform network downloads and local file writes but contain no hidden exfiltration. Verify the User-Agent and URL handling if you have network-policy constraints.
  • Be aware the skill enforces copying verbatim 50+ character quotes from sources; consider copyright/privacy rules when including such text in generated reports.
  • Confirm grep -P (PCRE) availability, or adjust the scripts for your environment.

If you need higher assurance, ask the author to add an explicit install spec (pip/apt/poppler), declare the required binaries, and validate/sanitize PDF URLs before they are fetched.


Tags: deep, latest, pdf-extraction, research, study
143 downloads
0 stars
1 version
Updated 1mo ago
v2.2.0
MIT-0

Skill: Deep Research Pro (v2.2 - True Depth)

Version: 2.2.0. Description: a true deep-research skill that enforces full-text parsing + structured extraction + provenance verification.

Core Principle

Without genuine reading of the original text, there is no deep research.


🔴 Mandatory Workflow (new in v2.2)

Step 1: Research Planning (must produce a file)

  • Generate research/plan.md
  • List at least 5 concrete search queries
  • Continue only after the user confirms the plan
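The Step 1 gate above can be sketched in a few lines of Python. This is an illustrative helper, not part of the shipped skill, and it assumes plan.md lists one search query per "- " bullet line:

```python
from pathlib import Path

MIN_QUERIES = 5  # Step 1 requires at least 5 concrete search queries


def plan_is_ready(plan_path: Path) -> bool:
    # Count bullet lines as queries; the user must still confirm the
    # plan manually before the workflow continues.
    queries = [line for line in plan_path.read_text().splitlines()
               if line.lstrip().startswith("- ")]
    return len(queries) >= MIN_QUERIES
```

A wrapper script could refuse to start Step 2 while `plan_is_ready` returns False.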

Step 2: Full-Text Parsing + Structured Extraction (the core!)

Do not skip this step!

For every valid source, you must:

  1. Fetch the full text

    # Use the extract-from-pdf.py script
    python3 scripts/extract-from-pdf.py card-001 "https://arxiv.org/pdf/xxx.pdf"
    
    • If a DOI/URL is available, try to download the PDF
    • If the full text cannot be obtained, mark full_text: false and skip the source
  2. Structured extraction (from the original PDF text)

    • Sample size: exact number
    • Primary outcome: exact value + unit + statistical significance
    • Cost impact: exact amount/percentage
    • Confidence interval: 95% CI
    • Original quote: at least 50 characters copied verbatim from the body text
  3. Update the card

    • Replace "to be extracted" placeholders with the real extracted data
    • Mark full_text: true/false
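The per-source flow above can be sketched as follows. This is a minimal illustration, not the shipped extract-from-pdf.py: the helper names are assumptions, and it mirrors the documented design of trying pdfplumber first, falling back to the pdftotext CLI, and enforcing the 50-character verbatim-quote rule:

```python
import subprocess
import urllib.request
from pathlib import Path

MIN_QUOTE_LEN = 50  # Step 2 requires >= 50 characters copied verbatim


def download_pdf(url: str, dest: Path) -> Path:
    # Fetch the PDF from the DOI/URL recorded on the card.
    with urllib.request.urlopen(url) as resp:
        dest.write_bytes(resp.read())
    return dest


def extract_text(pdf_path: Path) -> str:
    # Primary parser: pdfplumber; fallback: the pdftotext CLI (poppler).
    try:
        import pdfplumber  # imported lazily so the fallback still works
        with pdfplumber.open(pdf_path) as pdf:
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    except ImportError:
        out = subprocess.run(["pdftotext", str(pdf_path), "-"],
                             capture_output=True, text=True, check=True)
        return out.stdout


def quote_is_valid(quote: str, full_text: str) -> bool:
    # A quote counts only if it is long enough AND appears verbatim.
    return len(quote) >= MIN_QUOTE_LEN and quote in full_text


def render_card(card_id: str, fields: dict, full_text: bool) -> str:
    # Minimal card body; real cards carry more structured fields.
    lines = [f"id: {card_id}", f"full_text: {str(full_text).lower()}"]
    lines += [f"{k}: {v}" for k, v in fields.items()]
    return "\n".join(lines)
```

The lazy import keeps the pdftotext path usable on hosts without pdfplumber, which matters given that the registry declares no dependencies.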

Minimum Requirements

  • deep mode: at least 10 cards with full-text extraction
  • Quality threshold: post-extraction score ≥ 6/10

Step 3: Provenance Verification (mandatory check!)

Must be run before generating the report:

bash scripts/check-sourcing.sh reports/final-report.md sources/
  • Check that the data behind every [[card-xxx]] reference actually exists in that card
  • If any data cannot be traced to a source, refuse to generate the report
  • Re-run the verification after fixing
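The core of the check can be sketched in Python. The shipped check-sourcing.sh does this with grep; this pure-Python version is illustrative only and assumes each citation names a file sources/card-xxx.md:

```python
import re
from pathlib import Path

CARD_REF = re.compile(r"\[\[(card-\d+)\]\]")


def check_sourcing(report_text: str, sources_dir: Path) -> list[str]:
    # Return every card cited in the report that has no matching
    # sources/card-xxx.md file; an empty list means the report passes.
    missing = []
    for card_id in sorted(set(CARD_REF.findall(report_text))):
        if not (sources_dir / f"{card_id}.md").exists():
            missing.append(card_id)
    return missing
```

Per Step 3, a non-empty result should abort report generation until the missing cards are fixed.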

Step 4: Cross-Analysis

  • Generate analysis/synthesis.md
  • Identify at least 3 sets of contradictory data
  • Annotate the card source for every claim

Step 5: Report Generation

  • Generate reports/final-report.md
  • Annotate every data point with [[card-xxx]]
  • Append the provenance-check results at the end of the report

🔧 Tool Dependencies

| Tool | Purpose | Status |
| --- | --- | --- |
| pdfplumber | Primary PDF full-text parsing | ✅ installed |
| pdftotext | Fallback PDF parsing | ✅ installed |
| extract-from-pdf.py | Structured data extraction | ✅ created |
| check-sourcing.sh | Provenance verification | ✅ created |

📋 Execution Commands

Complete Workflow

# Step 1: Planning
# Edit research/plan.md and confirm the search queries

# Step 2: Search + extraction (loop over sources)
# For each source:
python3 scripts/extract-from-pdf.py card-001 "URL"
# Check the extraction result and fill in the card

# Step 3: Provenance verification
bash scripts/check-sourcing.sh reports/final-report.md sources/

# Step 4-5: Analysis and report
# Generate the final report

⚠️ Limitations

If the full text cannot be obtained (paywalled papers/reports):

  1. Mark the card full_text: false
  2. In the report, treat that source's data as reference only, never as a core conclusion
  3. Recommend manual review of the key data points
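The sources flagged for manual review can be collected mechanically. A minimal sketch, assuming the card layout used throughout this skill (sources/card-*.md files containing a "full_text:" line); the function name is illustrative:

```python
from pathlib import Path


def cards_needing_review(sources_dir: Path) -> list[str]:
    # Collect cards marked full_text: false so their data is treated as
    # reference-only and routed to a human reviewer.
    return sorted(card.stem for card in sources_dir.glob("card-*.md")
                  if "full_text: false" in card.read_text())
```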

📊 Version Comparison

| Dimension | v2.1 | v2.2 |
| --- | --- | --- |
| PDF parsing | none | ✅ mandatory |
| Data extraction | "to be extracted" placeholders | ✅ real extraction |
| Original quotes | template boilerplate | ✅ copied from the body text |
| Provenance check | none | ✅ mandatory verification |
| Report quality | citations without verification | citations + verification |

Quality Gate (v2.2, hardened)

# 1. Check the card count (>= 10 cards with full text)
FULLTEXT_COUNT=$(grep -l "full_text: true" sources/card-*.md 2>/dev/null | wc -l)
if [ "$FULLTEXT_COUNT" -lt 10 ]; then
 echo "❌ Error: fewer than 10 full-text cards; currently $FULLTEXT_COUNT"
 exit 1
fi

# 2. Check provenance
if ! bash scripts/check-sourcing.sh reports/final-report.md sources/; then
 echo "❌ Error: some data in the report cannot be traced to a source"
 exit 1
fi

# 3. Check for leftover placeholders
if grep -q "to be extracted" sources/card-*.md; then
 echo "❌ Error: cards still contain 'to be extracted' data"
 exit 1
fi

Skill version: 2.2.0 | Last updated: 2026-03-19
