Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

ArXiv Watcher for Music Research

v2.0.0

Search and summarize papers from ArXiv. Use when the user asks for the latest research, specific topics on ArXiv, or a daily summary of AI papers.

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The SKILL.md describes a full-featured ArXiv watcher (CS-only filtering, strict date range, chunking, deduplication, audit logs, and a CLI 'arxiv_watcher'), but the repository provides only a single 238-byte bash script (search_arxiv.sh) that issues a simple curl query to export.arxiv.org and does not implement category filters, date-range chunking, deduplication, file outputs, logging, or the advertised CLI. The _meta.json ownerId also differs from the registry ownerId, indicating sloppy packaging or an inconsistent bundle.
Instruction Scope
SKILL.md instructs the agent to create files under research/{domain}/search_results and to maintain comprehensive logs and audit trails. There are no instructions to read unrelated system files or credentials. However, the actual code does not create those files or perform the documented logging and processing steps — the instructions and implementation are out of sync.
Install Mechanism
No install spec (instruction-only) and a tiny shell script that calls the official ArXiv endpoint (export.arxiv.org). This is a low-risk install surface; nothing is downloaded from suspicious hosts. The only external network call (curl) is expected for this purpose.
Credentials
No environment variables, credentials, or config paths are requested. That is proportionate for an ArXiv search utility.
Persistence & Privilege
Skill is not marked always:true and does not request persistent system privileges. It does claim to create local files when used, which is reasonable for a search/logging tool, but those file-creation behaviors are not implemented in the provided script.
What to consider before installing
This skill overpromises: the README and SKILL.md describe many features (CS-only filtering, strict date-based chunking, deduplication, progress logs, and an arxiv_watcher CLI), but the package contains only a tiny search_arxiv.sh that performs a single curl query and implements none of them. The source is unknown and the _meta.json ownerId doesn't match the registry ownerId, both signs of a sloppy or incomplete package. Before using or installing:

  1. Ask the author for the full implementation or a trustworthy source link.
  2. Verify that a real CLI or wrapper exists, that it safely URL-encodes user input, and that it implements rate limiting and writes files only to explicit paths.
  3. Run the script in a sandboxed environment and inspect any files it creates.
  4. If you need the claimed features, prefer a repository that includes the implementation for chunking, deduplication, and logging, or implement those steps yourself.

If you proceed without these checks, be aware that you are trusting an incomplete and possibly mispackaged skill.

Like a lobster shell, security has layers — review code before you run it.


SKILL.md

ArXiv Watcher Skill

This skill provides systematic ArXiv paper search with structured query strategies, duplicate handling, and a comprehensive audit trail for academic research workflows.

Capabilities

  • Structured Search Strategy: Implements domain-specific search strategies based on research objectives
  • CS Category Filtering: Searches only within the arXiv CS (Computer Science) categories
  • Time Range Enforcement: Strictly adheres to specified date ranges
  • Duplicate Handling: Automatically merges results from multiple queries and removes duplicates
  • Rate Limit Management: Implements multi-round search strategies for large date ranges
  • Comprehensive Logging: Maintains detailed records of all search activities and results

Research Domain Configuration

Music Generation Search Strategy

Primary Objective: Systematically map the methodological landscape, data/training resources, evaluation benchmarks, and SOTA trends in music generation over the past two years.

Strong Relevance Keywords:

  • music generation
  • song generation
  • text-to-music
  • text-to-song
  • lyrics-to-music
  • image-to-music
  • video-to-music
  • video-guided music generation
  • text-to-midi
  • symbolic music generation
  • music synthesis

Weak Relevance Keywords (require additional verification):

  • editing
  • controllable generation
  • instruction-following

Related Paper Types (strong relevance if combined with music keywords):

  • survey
  • benchmark
  • evaluation
  • dataset
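
The keyword tiers above could be applied to a paper's title and abstract roughly as follows. This is a minimal sketch, not the skill's implementation: the function name `classify_relevance`, the label strings, and the reading of "combined with music keywords" as "co-occurring with a strong keyword" are all assumptions made for illustration.

```python
import re

STRONG_KEYWORDS = [
    "music generation", "song generation", "text-to-music", "text-to-song",
    "lyrics-to-music", "image-to-music", "video-to-music",
    "video-guided music generation", "text-to-midi",
    "symbolic music generation", "music synthesis",
]
WEAK_KEYWORDS = ["editing", "controllable generation", "instruction-following"]
PAPER_TYPES = ["survey", "benchmark", "evaluation", "dataset"]


def _normalize(text):
    """Lowercase and treat hyphens and whitespace runs as single spaces."""
    return re.sub(r"[-\s]+", " ", text.lower()).strip()


def _hits(text, terms):
    """Return the terms found in text as whole words or phrases."""
    normalized = _normalize(text)
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(_normalize(term)) + r"\b"
        if re.search(pattern, normalized):
            found.append(term)
    return found


def classify_relevance(title, abstract):
    """Hypothetical helper: label a paper using the three keyword tiers."""
    text = f"{title} {abstract}"
    strong = _hits(text, STRONG_KEYWORDS)
    weak = _hits(text, WEAK_KEYWORDS)
    types = _hits(text, PAPER_TYPES)
    if strong:
        label = "strong"              # paper types only add tags here
    elif weak:
        label = "needs_verification"  # weak keywords alone are not enough
    else:
        label = "none"
    return {"label": label, "matched_keywords": strong + weak, "paper_types": types}
```

Matching on word boundaries avoids false hits such as "editing" inside "crediting"; a plain substring test would flag those.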

Search Implementation

Query Construction

  • Base Query: Combines strong relevance keywords with OR logic
  • Category Filter: Restricts to cat:cs.* (Computer Science)
  • Date Filter: Uses submittedDate range with strict bounds
  • Duplicate Prevention: Tracks paper IDs across multiple queries
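
One way the query pieces above might be assembled and URL-encoded is sketched below. The helper names `build_search_query` and `build_url` are hypothetical. The `all:` field prefix, the `submittedDate:[... TO ...]` range syntax, and the `start`, `max_results`, `sortBy`, and `sortOrder` parameters follow the public arXiv API; the `cat:cs.*` wildcard is taken from the text above and is not verified here.

```python
from datetime import date
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_search_query(keywords, start, end, category="cs.*"):
    """Combine keywords with OR, then AND the category and date filters."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # Drop embedded double quotes so a keyword cannot break out of its phrase.
    phrases = " OR ".join('all:"{}"'.format(kw.replace('"', "")) for kw in keywords)
    # Inclusive bounds: 00:00 on the start day to 23:59 on the end day.
    window = f"submittedDate:[{start:%Y%m%d}0000 TO {end:%Y%m%d}2359]"
    return f"({phrases}) AND cat:{category} AND {window}"


def build_url(keywords, start, end, offset=0, page_size=100):
    """Return a fully URL-encoded request URL for one page of results."""
    params = {
        "search_query": build_search_query(keywords, start, end),
        "start": offset,
        "max_results": page_size,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Passing the whole query through `urlencode` keeps spaces, quotes, and brackets in user-supplied keywords out of the raw URL.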

Rate Limit Handling

For large date ranges (>6 months), implements multi-round strategy:

  1. Chunk by Quarter: Split date range into quarterly segments
  2. Sequential Queries: Execute queries sequentially with delay
  3. Merge Results: Combine and deduplicate across all segments
  4. Progress Tracking: Log completion status for each segment
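
The four steps above can be sketched as a date splitter plus a sequential runner. Both function names are hypothetical, the `fetch` callable stands in for whatever performs the actual query, and the three-second default delay is an assumption rather than a value stated in this document.

```python
import time
from datetime import date, timedelta


def quarter_chunks(start, end):
    """Split [start, end] into inclusive ranges on calendar-quarter boundaries."""
    if start > end:
        raise ValueError("start must not be after end")
    chunks = []
    cursor = start
    while cursor <= end:
        # First month of the quarter after the one containing cursor.
        next_month = 3 * ((cursor.month - 1) // 3) + 4
        year, month = cursor.year, next_month
        if next_month > 12:
            year, month = cursor.year + 1, next_month - 12
        quarter_end = date(year, month, 1) - timedelta(days=1)
        chunks.append((cursor, min(quarter_end, end)))
        cursor = quarter_end + timedelta(days=1)
    return chunks


def run_chunks(chunks, fetch, delay_seconds=3.0, sleep=time.sleep):
    """Query each chunk in order, pausing between requests and recording progress."""
    results, progress = [], []
    for index, (chunk_start, chunk_end) in enumerate(chunks):
        if index:
            sleep(delay_seconds)  # no pause before the first request
        papers = fetch(chunk_start, chunk_end)
        results.extend(papers)
        progress.append({
            "start": chunk_start.isoformat(),
            "end": chunk_end.isoformat(),
            "count": len(papers),
            "status": "done",
        })
    return results, progress
```

Merging and deduplicating the combined results is left to a separate step, since it also applies when several keyword queries cover the same date range.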

Result Processing

  • Deduplication: Remove papers appearing in multiple query results
  • Metadata Extraction: Extract title, authors, abstract, submission date, arXiv ID
  • Relevance Tagging: Tag papers with primary keywords that matched
  • Structured Output: Generate standardized paper list format
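
A minimal sketch of the deduplication and relevance-tagging steps, assuming each paper is a dict with the `arxiv_id` and `matched_keywords` fields used in the paper list format. The helper names are hypothetical, and treating different versions of one paper (a trailing `v2`, `v3`, and so on) as duplicates is an assumption.

```python
import re

_VERSION_SUFFIX = re.compile(r"v\d+$")


def base_id(arxiv_id):
    """Strip a trailing version suffix so all versions share one key."""
    return _VERSION_SUFFIX.sub("", arxiv_id.strip())


def merge_results(batches):
    """Merge query results, keeping the first copy of each paper.

    Returns (papers, duplicates_removed). Keywords matched by later copies
    are appended to the surviving record's matched_keywords.
    """
    merged = {}
    duplicates = 0
    for batch in batches:
        for paper in batch:
            key = base_id(paper["arxiv_id"])
            keywords = paper.get("matched_keywords", [])
            if key not in merged:
                merged[key] = {**paper, "arxiv_id": key, "matched_keywords": []}
            else:
                duplicates += 1
            seen = merged[key]["matched_keywords"]
            for keyword in keywords:
                if keyword not in seen:
                    seen.append(keyword)
    return list(merged.values()), duplicates
```

Returning the duplicate count alongside the papers gives the search log the "duplicates removed" figure without a second pass.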

Output Format

Local File Storage

  • Search Log: research/{domain}/search_results/arxiv_search_log.md
  • Paper List: research/{domain}/search_results/paper_list.json
  • Directory Structure: Automatically created if missing

Search Log Structure

Each search session includes:

  • Session Header: Date, domain, time range, search objective
  • Query Strategy: Detailed keyword combinations and search parameters
  • Execution Details: Query chunks, rate limit handling, completion status
  • Results Summary: Total papers found, duplicates removed, final count
  • Individual Results: Structured list of all papers with metadata
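
The five parts of a session entry could be rendered to Markdown along these lines. The document does not specify the log's exact layout, so the heading levels, field labels, and the `render_session` signature below are all assumptions.

```python
from datetime import date


def render_session(session_date, domain, start, end, objective,
                   queries, chunks, papers, duplicates_removed):
    """Return one search-session entry as Markdown text (hypothetical layout)."""
    lines = [
        f"## Search session {session_date.isoformat()}",
        "",
        f"- Domain: {domain}",
        f"- Time range: {start.isoformat()} to {end.isoformat()}",
        f"- Objective: {objective}",
        "",
        "### Query strategy",
        *[f"- `{query}`" for query in queries],
        "",
        "### Execution details",
        *[f"- {c['start']} to {c['end']}: {c['status']} ({c['count']} papers)"
          for c in chunks],
        "",
        "### Results summary",
        f"- Total papers found: {len(papers) + duplicates_removed}",
        f"- Duplicates removed: {duplicates_removed}",
        f"- Final count: {len(papers)}",
        "",
        "### Individual results",
        *[f"- {p['arxiv_id']}: {p['title']} ({p['submission_date']})"
          for p in papers],
        "",
    ]
    return "\n".join(lines)
```

Appending each rendered entry to arxiv_search_log.md, rather than overwriting the file, would keep earlier sessions available for audit.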

Paper List Format (JSON)

{
  "search_metadata": {
    "domain": "music_generation",
    "time_range": {"start": "2024-06-01", "end": "2026-02-27"},
    "keywords": ["music generation", "song generation", ...],
    "total_papers": 187,
    "search_date": "2026-02-28"
  },
  "papers": [
    {
      "title": "Paper Title",
      "authors": ["Author1", "Author2"],
      "abstract": "Paper abstract...",
      "arxiv_id": "2602.xxxxx",
      "submission_date": "2026-02-23",
      "matched_keywords": ["music generation"],
      "category": "cs.SD",
      "url": "https://arxiv.org/abs/2602.xxxxx"
    }
  ]
}
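
A sketch of writing this structure to the documented path, creating the directory if it is missing. The function name `write_paper_list`, the `root` parameter, and the restriction of `domain` to letters, digits, underscores, and hyphens are assumptions; the restriction is there so a domain value cannot redirect the write outside research/.

```python
import json
import re
from pathlib import Path


def write_paper_list(root, domain, start, end, keywords, papers, search_date):
    """Write paper_list.json under research/{domain}/search_results and return its path."""
    if not re.fullmatch(r"[A-Za-z0-9_-]+", domain):
        raise ValueError(f"unsafe domain name: {domain!r}")
    out_dir = Path(root) / "research" / domain / "search_results"
    out_dir.mkdir(parents=True, exist_ok=True)  # create missing directories
    payload = {
        "search_metadata": {
            "domain": domain,
            "time_range": {"start": start, "end": end},
            "keywords": list(keywords),
            "total_papers": len(papers),
            "search_date": search_date,
        },
        "papers": list(papers),
    }
    path = out_dir / "paper_list.json"
    text = json.dumps(payload, indent=2, ensure_ascii=False) + "\n"
    path.write_text(text, encoding="utf-8")
    return path
```

Deriving `total_papers` from the list length keeps the metadata count from drifting out of step with the papers array.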

Usage Examples

# Search music generation papers for full date range
arxiv_watcher --domain "music_generation" --start_date "2024-06-01" --end_date "2026-02-27" --keywords "music generation,song generation"

# Search recent papers only  
arxiv_watcher --domain "music_generation" --days 15 --keywords "music generation,song generation"
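
The security scan above reports that the bundle ships no `arxiv_watcher` executable. As an illustration only, the flags used in these examples could be parsed as sketched here; the `parse_args` name, the `today` parameter, and the interpretation of `--days N` as "from N days before today through today" are assumptions.

```python
import argparse
from datetime import date, timedelta


def _keyword_list(value):
    """Split a comma-separated string into trimmed, non-empty keywords."""
    keywords = [part.strip() for part in value.split(",") if part.strip()]
    if not keywords:
        raise argparse.ArgumentTypeError("at least one keyword is required")
    return keywords


def parse_args(argv=None, today=None):
    """Parse the flags shown in the usage examples (hypothetical CLI)."""
    parser = argparse.ArgumentParser(prog="arxiv_watcher")
    parser.add_argument("--domain", required=True)
    parser.add_argument("--keywords", required=True, type=_keyword_list)
    parser.add_argument("--start_date", type=date.fromisoformat)
    parser.add_argument("--end_date", type=date.fromisoformat)
    parser.add_argument("--days", type=int)
    args = parser.parse_args(argv)

    today = today or date.today()
    if args.days is not None:
        if args.start_date or args.end_date:
            parser.error("--days cannot be combined with --start_date/--end_date")
        if args.days < 0:
            parser.error("--days must not be negative")
        args.start_date = today - timedelta(days=args.days)
        args.end_date = today
    elif not (args.start_date and args.end_date):
        parser.error("give either --days or both --start_date and --end_date")
    if args.start_date > args.end_date:
        parser.error("--start_date must not be after --end_date")
    return args
```

Resolving `--days` into explicit start and end dates at parse time means the rest of a pipeline only ever deals with one kind of time range.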

Files Created

  • research/{domain}/search_results/arxiv_search_log.md
  • research/{domain}/search_results/paper_list.json
  • Directory structure automatically created if missing

Audit Trail Requirements

All search activities must include:

  • Complete query strategy documentation
  • Execution progress tracking
  • Duplicate handling records
  • Final result validation

Files

4 total