Academic Survey Self Improve

Security checks across static analysis, malware telemetry, and agentic risk

Overview

The skill matches its academic-survey purpose, but it deserves review because it compiles remotely sourced arXiv content into PDFs and overstates the reliability of its generated citations and quality scores.

Install only if you are comfortable with a skill that fetches arXiv metadata, writes local LaTeX/PDF files, and may invoke pdflatex. Verify generated citations and claims manually before using them academically, and avoid enabling the hourly schedule unless you intend ongoing background generation.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

VirusTotal findings are pending for this skill version.

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

A malicious or tampered paper record could break the build, alter the generated document, or potentially abuse LaTeX processing depending on the local TeX configuration.

Why it was flagged

The skill retrieves remote arXiv metadata, turns it into a LaTeX survey, and compiles it into a PDF. This is purpose-aligned, but the reviewed artifacts show no sanitization or sandboxing of untrusted metadata before a local compiler processes it.

Skill content

with urllib.request.urlopen(url, timeout=30) as response: ... paper = {'title': ..., 'summary': ...} ... tex_content = self._generate_survey(papers, analysis) ... pdf_file = self._compile_pdf(tex_file)

Recommendation

Use HTTPS where possible, escape LaTeX-special characters from all external metadata, compile with shell escape disabled in a sandboxed working directory, and ask the user before compiling remotely sourced content.
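
The escaping and compiler-flag parts of this recommendation can be sketched as follows. This is a minimal illustration, not the skill's own code: `escape_latex` and `pdflatex_command` are hypothetical helper names, and the sketch assumes a TeX Live-style `pdflatex` that accepts `-no-shell-escape`. Escaping and flags alone do not amount to sandboxing.

```python
import re
from typing import List

# Single-pass replacement table for LaTeX-special characters.
# A single pass avoids re-escaping the braces and backslashes that
# the replacements themselves introduce.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_LATEX_PATTERN = re.compile("|".join(re.escape(ch) for ch in _LATEX_SPECIALS))


def escape_latex(text: str) -> str:
    """Escape LaTeX-special characters in untrusted metadata (hypothetical helper)."""
    return _LATEX_PATTERN.sub(lambda m: _LATEX_SPECIALS[m.group(0)], text)


def pdflatex_command(tex_name: str) -> List[str]:
    """Build a pdflatex argument list with shell escape explicitly disabled."""
    return [
        "pdflatex",
        "-no-shell-escape",
        "-interaction=nonstopmode",
        "-halt-on-error",
        tex_name,
    ]


if __name__ == "__main__":
    print(escape_latex(r"Cost & benefit of \input{x}: 50%"))
    print(pdflatex_command("survey.tex"))
```

The argument list is meant to be passed to `subprocess.run` without `shell=True` and with `cwd` set to a dedicated working directory, so that a crafted title or abstract is treated as text rather than as TeX commands.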

What this means

Users may overtrust the generated survey, citations, or quality score and reuse inaccurate academic content without verification.

Why it was flagged

The improver labels added references as high-quality but generates them from fixed strings instead of retrieving them from arXiv. This conflicts with SKILL.md's broad claims about real arXiv citations and quality control.

Skill content

# Add 10 new high-quality references ... new_refs = self._generate_references(evaluation['metrics']) ... \\bibitem{Ref51} Smith, J., et al. "Advanced Techniques in Knowledge Injection." NeurIPS 2024.

Recommendation

Treat outputs as drafts only, verify every citation manually, and update the documentation to distinguish real arXiv-derived references from template or synthetic references.
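
One way to start separating arXiv-derived references from template ones is to flag every bibliography entry that carries no arXiv identifier. The sketch below is an assumption-laden triage aid: `split_references` is a hypothetical helper, it only recognizes new-style identifiers such as `arXiv:NNNN.NNNNN`, and the presence of an identifier does not prove the entry is real. Every citation still needs manual verification.

```python
import re
from typing import Dict, List

# New-style arXiv identifiers: four digits, a dot, four or five digits,
# and an optional version suffix (for example "v2").
_ARXIV_ID = re.compile(r"arXiv:\s*\d{4}\.\d{4,5}(v\d+)?", re.IGNORECASE)


def split_references(bibitems: List[str]) -> Dict[str, List[str]]:
    """Separate entries that carry an arXiv identifier from those that do not."""
    groups: Dict[str, List[str]] = {"has_arxiv_id": [], "needs_manual_check": []}
    for entry in bibitems:
        key = "has_arxiv_id" if _ARXIV_ID.search(entry) else "needs_manual_check"
        groups[key].append(entry)
    return groups


if __name__ == "__main__":
    entries = [
        # Entry quoted in the finding above: no arXiv identifier.
        r'\bibitem{Ref51} Smith, J., et al. "Advanced Techniques in Knowledge Injection." NeurIPS 2024.',
        # Placeholder entry, not a real paper.
        r"\bibitem{RefX} Placeholder entry. arXiv:0000.00000v1.",
    ]
    for group, items in split_references(entries).items():
        print(group, len(items))
```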

What this means

The skill may fail or invoke local tools that the registry metadata did not make obvious.

Why it was flagged

The skill needs local runtimes and an external API, while the registry metadata lists no required binaries or install spec. This appears purpose-aligned, but users should notice the undeclared runtime expectations.

Skill content

## 📦 Dependencies

- Python 3.6+
- LaTeX (pdflatex)
- TikZ (diagram generation)
- arXiv API (paper search)

Recommendation

Declare Python, pdflatex/LaTeX, network access to arXiv, and any expected output directories in metadata or install documentation.
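
Until the metadata declares these requirements, a user can check them locally before running the skill. The following is a minimal preflight sketch; `missing_binaries` and `python_is_supported` are hypothetical names, and the check covers only the interpreter version and executables on `PATH`, not TikZ or network access to arXiv.

```python
import shutil
import sys
from typing import Iterable, List


def missing_binaries(names: Iterable[str]) -> List[str]:
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def python_is_supported(minimum=(3, 6)) -> bool:
    """Check the running interpreter against the documented Python 3.6+ requirement."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing binaries:", missing_binaries(["pdflatex"]))
```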

What this means

If scheduled, the skill could keep generating files and consuming resources until the schedule is removed.

Why it was flagged

The hourly workflow is documented as an optional user-configured schedule, not automatic installation behavior. Still, it would create ongoing autonomous network, CPU, and disk activity if enabled.

Skill content

Configure a cron job to automatically generate a novel survey every hour. ... "schedule": {"kind": "every", "everyMs": 3600000}

Recommendation

Only enable the hourly job intentionally, monitor the output directory, and remove the schedule when no longer needed.
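
At the documented interval of 3600000 ms (one hour), an enabled schedule would run 24 times per day, so output can accumulate quickly. A simple way to monitor the output directory is sketched below. `directory_usage` and `over_budget` are hypothetical helpers, and the byte budget is an arbitrary value chosen by the user, not a setting of the skill.

```python
from pathlib import Path
from typing import Tuple


def directory_usage(output_dir: Path) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for all regular files under output_dir."""
    files = [p for p in output_dir.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def over_budget(output_dir: Path, max_bytes: int) -> bool:
    """True when the directory's total size exceeds the user-chosen budget."""
    _, total = directory_usage(output_dir)
    return total > max_bytes


if __name__ == "__main__":
    count, total = directory_usage(Path("."))
    print(f"{count} files, {total} bytes")
```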

What this means

Topic choices may be affected by prior runs or by edits to the history file if the output directory is shared.

Why it was flagged

The skill persists generated-topic history and reuses it for novelty checks. This is scoped and purpose-aligned, but it is persistent context that can influence future runs.

Skill content

self.topic_history_file = self.output_dir / 'topic_history.json' ... json.dump(self.topic_history, f, indent=2)

Recommendation

Keep the output directory private if needed, and review or delete topic_history.json when you want a fresh generation history.
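
Reviewing and resetting the history can be done with a few lines of Python. This sketch assumes the file is plain JSON, as the `json.dump` call in the excerpt suggests; `load_history` and `reset_history` are hypothetical helpers. It also assumes the skill tolerates a missing history file, which the reviewed excerpt does not confirm. The reset renames the file instead of deleting it, so the previous history stays recoverable.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_history(history_file: Path) -> Any:
    """Return the parsed history, or None when the file is absent."""
    if not history_file.exists():
        return None
    with history_file.open(encoding="utf-8") as f:
        return json.load(f)


def reset_history(history_file: Path) -> Optional[Path]:
    """Rename the history file to a .bak copy so the next run starts fresh."""
    if not history_file.exists():
        return None
    backup = history_file.with_name(history_file.name + ".bak")
    history_file.replace(backup)  # overwrites an older backup if one exists
    return backup


if __name__ == "__main__":
    path = Path("topic_history.json")  # adjust to the skill's output directory
    print(load_history(path))
```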