Install
openclaw skills install phy-write-academic-report
Write 40-100+ page academic reports (FYP, thesis, dissertation) with parallel Claude Code subagents. 3-wave pipeline: Wave 0 extracts data from your research repo, Wave 1 writes chapters in parallel (3-4x faster), Wave 2 compiles LaTeX with automated cross-reference auditing. Inherits academic writing standards from Nanda, Gopen & Swan, Lipton.
Turn a research repository into a publication-quality LaTeX thesis in 2-4 hours instead of 8-12, using a 3-wave parallel agent pipeline purpose-built for academic reports.
What this skill does: You point it at your research repo (code + experiment results). It launches parallel agents to extract data, write chapters simultaneously, then assembles and compiles a complete LaTeX report with proper cross-references, figures, and bibliography.
Validated: 86-page FYP report, 6 chapters + 3 appendices + 15 figures, produced in ~6 hours. Writing philosophy inherited from ml-paper-writing (Nanda, Farquhar, Gopen & Swan, Lipton, Steinhardt, Perez).
This rule is inherited from ml-paper-writing and is non-negotiable.
| Statistic | Source |
|---|---|
| 6-55% of AI-generated citations are fabricated | Multiple studies (varies by model/domain) |
| 100+ hallucinated refs in NeurIPS 2025 accepted papers | GPTZero analysis, Jan 2026 |
| 50+ hallucinated refs in ICLR 2026 submissions | GPTZero analysis, Feb 2026 |
| Only 26.5% of AI-generated references are entirely accurate | Paper-Checker 2026 survey |
| 206+ legal sanctions for AI-hallucinated citations in courts | As of July 2025 |
| 3 types: fully fabricated, chimeric (blended), modified real | CheckIfExist (arXiv 2602.15871) |
Universities increasingly treat fake citations as academic misconduct — failed assignments, course failure, or expulsion.
NEVER generate BibTeX entries from memory. ALWAYS fetch programmatically.
IF you cannot programmatically fetch a citation:
→ Mark it as [CITATION NEEDED] or [PLACEHOLDER - VERIFY]
→ Tell the author explicitly
→ NEVER invent a plausible-sounding reference
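The fetch-or-placeholder rule above can be sketched in a few lines of Python. This is a minimal sketch, not the skill's own implementation: it assumes the DOI is known and uses DOI content negotiation (requesting `application/x-bibtex` from doi.org). The names `fetch_bibtex` and `bibtex_or_placeholder` are hypothetical.

```python
import urllib.request

PLACEHOLDER = "[CITATION NEEDED]"


def fetch_bibtex(doi, timeout=10.0):
    """Ask doi.org for a BibTeX record via HTTP content negotiation."""
    request = urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/x-bibtex"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def bibtex_or_placeholder(doi, fetch=fetch_bibtex):
    """Return the fetched BibTeX entry, or a visible placeholder on failure.

    The function never invents an entry: any network error or unexpected
    response yields a placeholder the author must resolve by hand.
    """
    try:
        entry = fetch(doi).strip()
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return f"{PLACEHOLDER} (could not fetch DOI {doi})"
    if not entry.startswith("@"):
        return f"{PLACEHOLDER} (unexpected response for DOI {doi})"
    return entry
```

Passing `fetch` as a parameter keeps the fallback logic testable offline and lets you swap in another source without touching the placeholder rule.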
After writing, always run the citation checker before submission:
# Check a single .bib file
python scripts/citation_checker.py references.bib
# Check all .bib files in a report directory
python scripts/citation_checker.py path/to/report/
# JSON output (for CI pipelines)
python scripts/citation_checker.py references.bib --json
The checker uses a cascading 3-source verification pipeline:
CrossRef (140M+ DOIs) → Semantic Scholar (200M+ papers) → OpenAlex (240M+ works)
For each citation it:
Red flag detection catches:
See references/citation-workflow.md for the full API documentation and Python CitationManager class.
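The cascading lookup described above (try one source, fall through to the next) might look roughly like the sketch below. It is an illustration only: `verify_citation` and the resolver callables are hypothetical stand-ins, not the API of scripts/citation_checker.py, and each real resolver would wrap one of the three services.

```python
from typing import Callable, Optional

# A resolver takes a citation title and returns a metadata dict, or None.
Resolver = Callable[[str], Optional[dict]]


def verify_citation(title, resolvers):
    """Try each (name, resolver) pair in order and stop at the first match."""
    for name, resolve in resolvers:
        try:
            record = resolve(title)
        except Exception:
            # A failing source counts as "no answer"; fall through to the next.
            record = None
        if record:
            return {"status": "verified", "source": name, "record": record}
    return {"status": "unverified", "source": None, "record": None}


# Usage with stub resolvers standing in for the three services.
def crossref_stub(title):
    return None  # pretend CrossRef has no match


def semantic_scholar_stub(title):
    return {"title": title, "year": 2020}


result = verify_citation(
    "An Example Paper",
    [("crossref", crossref_stub), ("semantic_scholar", semantic_scholar_stub)],
)
```

A citation that no source can confirm comes back as `unverified`, which is the cue to mark it as a placeholder and not to keep it.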
| Scenario | Use This Skill | Use ml-paper-writing Instead |
|---|---|---|
| FYP / Final Year Project report | Yes | |
| MSc / PhD dissertation | Yes | |
| Technical report (20+ pages) | Yes | |
| Conference paper (8-12 pages) | | Yes |
| Workshop paper (4-6 pages) | | Yes |
Key difference: This skill orchestrates parallel subagents for long documents. Conference papers are short enough to write sequentially.
Wave 0: DATA PREPARATION          Wave 1: CHAPTER WRITING        Wave 2: ASSEMBLY
(5-6 parallel agents)             (3-4 parallel agents)          (1-2 sequential agents)
┌─ Agent 0A: Data consolidation   ┌─ Agent 1: Template + Ch1-2   ┌─ Agent 6: Merge + cross-ref
├─ Agent 0B: Codebase analysis    ├─ Agent 2: Ch3 (core work)    └─ Agent 7: Compile + review
├─ Agent 0C: System analysis      ├─ Agent 3: Ch4-5 (results)
├─ Agent 0D: Experiment history   └─ Agent 4: Ch6 + Appendices
├─ Agent 0E: Statistics
└─ Agent 0F: Figure generation
Why waves? Data must exist before prose. Prose must exist before assembly. Violating this order produces agents that hallucinate numbers or write without evidence.
Goal: Produce all data artifacts that chapter-writing agents will reference. Every claim in the report must trace back to a Wave 0 artifact.
| Agent | Input | Output | Purpose |
|---|---|---|---|
| 0A: Data Consolidation | Raw result files (JSON, CSV) | data/final_results.json | Single source of truth for all numbers |
| 0B: Codebase Analysis | Source code | data/codebase_analysis.md | Module map, LOC, complexity, key snippets |
| 0C: System Analysis | Architecture, pipeline code | data/system_analysis.md | How components connect, data flow |
| 0D: Experiment History | All experiment logs | data/experiment_history.md | Timeline, what changed, why |
| 0E: Statistics | Result files | data/statistics.md | Aggregate stats, distributions |
| 0F: Figure Generation | Data artifacts + style config | figures/*.pdf + figures/*.png | All publication-quality figures |
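Because every chapter depends on these artifacts, it can help to check for them mechanically before launching Wave 1. The sketch below assumes the file paths in the table are relative to the report directory; `missing_wave0_artifacts` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path

# Output paths taken from the Wave 0 agent table.
REQUIRED_FILES = [
    "data/final_results.json",
    "data/codebase_analysis.md",
    "data/system_analysis.md",
    "data/experiment_history.md",
    "data/statistics.md",
]


def missing_wave0_artifacts(report_dir):
    """Return the Wave 0 outputs that are absent or empty under report_dir."""
    root = Path(report_dir)
    missing = []
    for rel in REQUIRED_FILES:
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    figures = root / "figures"
    for ext in ("pdf", "png"):
        # Agent 0F should have produced at least one figure in each format.
        if not (figures.is_dir() and any(figures.glob(f"*.{ext}"))):
            missing.append(f"figures/*.{ext}")
    return missing
```

An empty list means the existence check passes. It does not verify that the numbers are right, so spot-checking values is still needed.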
Figures deserve a dedicated agent because:
# Recommended figure style
import matplotlib.pyplot as plt
import matplotlib
matplotlib.rcParams.update({
    'font.size': 11,
    'font.family': 'serif',
    'axes.labelsize': 12,
    'axes.titlesize': 13,
    'xtick.labelsize': 10,
    'ytick.labelsize': 10,
    'legend.fontsize': 10,
    'figure.figsize': (6.5, 4),
    'savefig.dpi': 300,
    'savefig.bbox': 'tight',
})
# Colorblind-safe palette (Okabe-Ito)
COLORS = ['#E69F00', '#56B4E9', '#009E73', '#F0E442',
          '#0072B2', '#D55E00', '#CC79A7', '#000000']
Output both formats: figure_name.pdf (for LaTeX) + figure_name.png (for preview).
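One way to make the two-format rule hard to forget is a small wrapper around `savefig`. This is a sketch under assumptions: `save_figure` is a hypothetical helper, and `fig` is expected to be a matplotlib `Figure` (or any object with a compatible `savefig` method).

```python
from pathlib import Path


def save_figure(fig, name, out_dir="figures"):
    """Save fig as <name>.pdf and <name>.png and return both paths.

    The output format is inferred from the file extension, so the rcParams
    set earlier (dpi, tight bounding box) apply to both files.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = [out / f"{name}.{ext}" for ext in ("pdf", "png")]
    for path in paths:
        fig.savefig(path)
    return paths
```

Typical use would be `fig, ax = plt.subplots()`, plot, then `save_figure(fig, "ablation_main")`, where the figure name is whatever the chapter will reference.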
Do NOT proceed to Wave 1 until:
- Numbers in final_results.json match known ground truth
Independent (can parallelize):
Ch1 (Introduction) ←→ Ch2 (Literature Review) [no dependency]
Ch3 (System/Methods) [needs 0B, 0C]
Ch6 (Conclusion) [needs 0A summary only]
Sequential (must wait):
Ch4 (Experimental Setup) → Ch5 (Results) [Ch5 needs Ch4's definitions]
Ch5 needs: 0A (data), 0D (history), 0E (stats), 0F (figures)
| Agent | Chapters | Depends On | Approx Pages |
|---|---|---|---|
| Agent 1 | Template + Front matter + Ch1 + Ch2 | Plan only | 15-20 |
| Agent 2 | Ch3 (System Design) | 0B, 0C | 12-18 |
| Agent 3 | Ch4 + Ch5 (Setup + Results) | 0A, 0D, 0E, 0F | 15-25 |
| Agent 4 | Ch6 + Appendices | 0A (summary) | 5-10 |
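The chapter ordering can be derived from the dependencies and need not be hard-coded. The sketch below assumes Wave 0 has already finished, so only the Ch4 → Ch5 ordering remains inside Wave 1. It uses the standard-library `graphlib` module (Python 3.9+); `parallel_batches` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Chapter -> chapters it must wait for (Wave 0 artifacts assumed complete).
WAVE1_DEPENDS_ON = {
    "Ch1": set(),
    "Ch2": set(),
    "Ch3": set(),
    "Ch4": set(),
    "Ch5": {"Ch4"},  # Ch5 needs Ch4's definitions
    "Ch6": set(),
}


def parallel_batches(depends_on):
    """Group nodes into batches; everything in one batch can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(parallel_batches(WAVE1_DEPENDS_ON))
# → [['Ch1', 'Ch2', 'Ch3', 'Ch4', 'Ch6'], ['Ch5']]
```

Assigning Ch4 and Ch5 to the same agent, as in the table above, is one simple way to honor the second batch without extra coordination.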
These principles from ml-paper-writing apply to every chapter:
The Narrative Principle (Nanda): Your report tells one story. Every chapter advances that story. If a section doesn't connect to the core contribution, cut it.
Sentence-Level Clarity (Gopen & Swan):
| Principle | Rule | Mnemonic |
|---|---|---|
| Subject-verb proximity | Keep subject and verb close | "Don't interrupt yourself" |
| Stress position | Emphasis at sentence end | "Save the best for last" |
| Topic position | Context at sentence start | "First things first" |
| Old before new | Familiar then unfamiliar | "Build on known ground" |
| One unit, one function | Each paragraph = one point | "One idea per container" |
| Action in verb | Use verbs, not nominalizations | "Verbs do, nouns sit" |
| Context before new | Explain before presenting | "Set the stage first" |
Word Choice (Lipton, Steinhardt):
Micro-Level Tips (Perez):
| Conference Paper | Thesis/Report |
|---|---|
| 1-1.5 page intro | 3-5 page intro with motivation + scope |
| Related Work section | Full Literature Review chapter |
| 8-12 pages total | 40-100+ pages total |
| 5-sentence abstract | 250-400 word abstract |
| Contribution bullets | Objectives & scope section |
| No project timeline | Gantt chart / project schedule |
| No appendices (usually) | 2-5 appendices with supplementary material |
\chapter{Introduction}
\section{Background}
% 1-2 pages: Establish the problem domain
% Start specific, not generic. No "AI has revolutionized..."
\section{Motivation}
% 0.5-1 page: Why this problem matters NOW
% Use the "map analogy" or similar concrete framing
\section{Objectives and Scope}
% 0.5 page: Numbered list of objectives
% Explicitly state what is IN and OUT of scope
\section{Project Schedule}
% Gantt chart figure (generated in Wave 0)
\section{Report Organization}
% Brief roadmap of remaining chapters
\chapter{Literature Review}
% Organize METHODOLOGICALLY, not paper-by-paper
% Group: "One line of work uses X [refs] whereas we use Y because..."
\section{Topic Area 1}
\section{Topic Area 2}
\section{Topic Area 3}
\section{Research Gap and Our Position}
% Explicitly state what's missing and how you fill it
% Include positioning figure/table if helpful
\chapter{System Design and Implementation}
\section{System Architecture}
% Architecture diagram (FIGURE — from Wave 0)
\section{Core Component 1}
% Code listings where relevant (use lstlisting or minted)
\section{Core Component 2}
\section{Technology Stack}
% TABLE: libraries, versions, purpose
\chapter{Experimental Setup}
\section{Dataset / Data Collection}
\section{Evaluation Methodology}
\section{Baselines and Conditions}
\section{Statistical Methods}
% TABLE: which test, why, assumptions
\chapter{Results and Analysis}
% For EACH result, explicitly state:
% 1. What claim it supports
% 2. The specific numbers
% 3. Statistical significance
\section{Main Results}
% FIGURE + TABLE for primary ablation/comparison
\section{Detailed Analysis 1}
\section{Detailed Analysis 2}
\section{Discussion}
% What worked, what didn't, WHY
\chapter{Conclusion and Future Work}
\section{Summary of Contributions}
% 3-5 numbered contributions, each 2-3 sentences
\section{Limitations}
% HONEST assessment. Claude undersells weaknesses by default.
% Explicitly prompt: "What are the real limitations?"
% Pre-empt criticisms. Honesty builds trust.
\section{Future Work}
% 2-4 concrete, actionable directions
% Not vague "further research" — specific next steps
Claude has a documented tendency to understate limitations. When writing the limitations section:
Use \input{} in main.tex to include chapter files:
\documentclass[12pt,a4paper]{report}
\input{preamble}
\begin{document}
\input{front_matter}
\tableofcontents
\listoffigures
\listoftables
\input{chapters/ch1_introduction}
\input{chapters/ch2_literature_review}
\input{chapters/ch3_system_design}
\input{chapters/ch4_experimental_setup}
\input{chapters/ch5_results}
\input{chapters/ch6_conclusion}
\bibliographystyle{plain}
\bibliography{references}
\appendix
\input{appendices/appendix_a}
\input{appendices/appendix_b}
\end{document}
With parallel agents writing chapters independently, duplicate labels are inevitable.
Run the automated audit script:
python scripts/cross_ref_audit.py report_dir/
This checks:
- Duplicate \label{} definitions
- Undefined \ref{} and \cite{} references
See scripts/cross_ref_audit.py for the full script.
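To show the idea behind such an audit, here is a minimal sketch of duplicate-label and undefined-reference detection. It is not the contents of scripts/cross_ref_audit.py: `audit_cross_refs` is a hypothetical function, it works on in-memory text, ignores `\cite{}` keys, and does not strip LaTeX comments.

```python
import re
from collections import defaultdict

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
REF_RE = re.compile(r"\\(?:ref|eqref|pageref|autoref)\{([^}]+)\}")


def audit_cross_refs(sources):
    """Audit LaTeX sources given as a {file name: text} mapping.

    Returns (duplicates, undefined): duplicates maps each label defined more
    than once to the files defining it; undefined lists referenced keys that
    no file defines.
    """
    defined_in = defaultdict(list)
    for fname, text in sources.items():
        for key in LABEL_RE.findall(text):
            defined_in[key].append(fname)
    duplicates = {k: files for k, files in defined_in.items() if len(files) > 1}
    undefined = sorted({
        key
        for text in sources.values()
        for key in REF_RE.findall(text)
        if key not in defined_in
    })
    return duplicates, undefined


duplicates, undefined = audit_cross_refs({
    "ch3.tex": r"\label{fig:arch} See Table~\ref{tab:stack}.",
    "ch5.tex": r"\label{fig:arch} As in Figure~\ref{fig:arch}.",
})
```

Here `fig:arch` is reported as a duplicate (defined in both chapters) and `tab:stack` as undefined.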
Tectonic is strongly recommended over BasicTeX/TeX Live for local compilation:
# Install (macOS)
brew install tectonic
# Compile (handles all passes automatically)
tectonic main.tex
# Or use the V2 command-line interface
tectonic -X compile main.tex
Why Tectonic?
- No sudo, no tlmgr install
See references/compilation-guide.md for alternatives and troubleshooting.
Final quality checks:
Post-Compilation Checklist:
- [ ] No undefined references (\ref, \cite)
- [ ] No duplicate labels
- [ ] All figures render at correct size
- [ ] Table of Contents is accurate
- [ ] List of Figures / Tables is complete
- [ ] Page numbers are correct
- [ ] Bibliography entries are complete
- [ ] Appendices are properly lettered
- [ ] No overfull/underfull hbox warnings (major ones)
- [ ] Consistent formatting across all chapters
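The first and ninth checklist items can be partly automated by scanning a LaTeX log. The sketch below assumes you have the log text available and that it uses the standard LaTeX warning wording. `scan_log` and the 10pt threshold for a "major" overfull box are hypothetical choices, and warnings that the log wraps across lines may be missed.

```python
import re

UNDEFINED_RE = re.compile(
    r"LaTeX Warning: (Reference|Citation) `([^']+)'.* undefined"
)
OVERFULL_RE = re.compile(r"Overfull \\hbox \((\d+(?:\.\d+)?)pt too wide\)")


def scan_log(log_text, overfull_threshold_pt=10.0):
    """Collect undefined references/citations and large overfull hboxes."""
    undefined = [
        (kind.lower(), key) for kind, key in UNDEFINED_RE.findall(log_text)
    ]
    major_overfull = [
        float(pt)
        for pt in OVERFULL_RE.findall(log_text)
        if float(pt) >= overfull_threshold_pt
    ]
    return {"undefined": undefined, "major_overfull_pt": major_overfull}
```

An empty `undefined` list and an empty `major_overfull_pt` list cover two checklist items; the rest still need a visual pass over the PDF.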
Use booktabs for professional tables:
\usepackage{booktabs}
\begin{table}[t]
\centering
\caption{Comparison of conditions. Best results in \textbf{bold}.}
\label{tab:main_results}
\begin{tabular}{lcc}
\toprule
Condition & Success Rate $\uparrow$ & p-value \\
\midrule
Baseline & 25.6\% & --- \\
Summary & 27.8\% & 0.839 \\
\textbf{URL} & \textbf{50.0\%} & $<$0.001 \\
\textbf{Tools} & \textbf{50.0\%} & $<$0.001 \\
\bottomrule
\end{tabular}
\end{table}
Rules:
\begin{figure}[t]
\centering
\includegraphics[width=0.85\textwidth]{figures/architecture.pdf}
\caption{System architecture showing the five core modules.
Arrows indicate data flow from browser automation (left)
through state abstraction to the output graph (right).}
\label{fig:architecture}
\end{figure}
A minimal, clean university thesis template is provided in templates/university-thesis/. It includes:
- Chapter files included via \input{}
Most universities provide their own LaTeX template. To adapt:
- Use the \input{} chapter structure from our template
| University Feature | Where to Adapt |
|---|---|
| Title page format | front_matter.tex — follow university spec exactly |
| Margin requirements | preamble.tex — use university's geometry settings |
| Font requirements | preamble.tex — usually Times New Roman or Computer Modern |
| Citation style | \bibliographystyle{} — university specifies (Harvard, APA, IEEE, etc.) |
| Appendix format | Check if university wants lettered (A, B, C) or numbered |
1. UNDERSTAND THE PROJECT
- Read the codebase, results, existing docs
- Identify the core contribution
2. PLAN THE REPORT
- Define chapter structure
- Map: which data → which chapter
- Identify figures needed
- Create the execution plan
3. WAVE 0: DATA PREPARATION
- Launch 5-6 parallel agents
- Wait for ALL to complete
- Verify outputs (spot-check numbers)
4. WAVE 1: CHAPTER WRITING
- Launch 3-4 parallel agents
- Each agent gets: chapter template + relevant Wave 0 data
- Independent chapters can run in parallel
5. WAVE 2: ASSEMBLY
- Merge chapters into main.tex
- Run cross_ref_audit.py
- Fix duplicate labels, undefined refs
- Compile with tectonic
- Quality review
6. ITERATE
- Author reviews output
- Targeted revisions (specific chapters/sections)
- Re-compile and verify
| Wave | Agents | Typical Duration | Notes |
|---|---|---|---|
| Wave 0 | 5-6 | 30-60 min | Depends on codebase size |
| Wave 1 | 3-4 | 60-90 min | Longest wave |
| Wave 2 | 1-2 | 20-40 min | Mostly automated |
| Total | | 2-4 hours | For ~80 page report |
Without parallel agents, the same report takes 8-12 hours.
- brew install tectonic: no sudo, handles packages automatically.
- grep '!' breaks in zsh due to history expansion. Use Python scripts for pattern matching.
| Issue | Solution |
|---|---|
| Duplicate \label{} across chapters | Run cross_ref_audit.py, rename with chapter prefix |
| Missing package in tectonic | Tectonic auto-downloads; if stuck, try tectonic -X compile |
| Figures too large / overlapping text | Use [width=0.85\textwidth] and [htbp] float placement |
| BibTeX not resolving | Run tectonic twice, or check .bib file syntax |
| Inconsistent notation across chapters | Define macros in preamble.tex, shared across all \input{} files |
| Agent writes without evidence | Wave 0 completion gate — never skip data preparation |
| Abstract too long for university | Keep to word limit; conference 5-sentence formula still works |
| Examiner criticizes missing limitations | Use the explicit limitations prompting strategy |
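For the duplicate-label fix in the table above, renaming with a chapter prefix can be scripted. This is a sketch only: `prefix_labels` is a hypothetical helper and the `ch3:` prefix format is an assumption. It rewrites labels defined in one chapter file together with the references to them in that same file.

```python
import re

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
LABEL_OR_REF_RE = re.compile(r"\\(label|ref)\{([^}]+)\}")


def prefix_labels(tex, prefix):
    r"""Prefix every label defined in tex, plus \ref{} uses of those labels.

    References to labels defined elsewhere are left untouched.
    """
    local_labels = set(LABEL_RE.findall(tex))

    def rename(match):
        command, key = match.group(1), match.group(2)
        if key in local_labels:
            return f"\\{command}{{{prefix}:{key}}}"
        return match.group(0)

    return LABEL_OR_REF_RE.sub(rename, tex)


before = r"\label{fig:arch} See Figure~\ref{fig:arch} and Table~\ref{tab:other}."
after = prefix_labels(before, "ch3")
```

References from other chapters to a renamed label are not updated by this sketch, so re-run the cross-reference audit afterwards to catch them.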
| Document | Contents |
|---|---|
| references/writing-guide.md | Gopen & Swan 7 principles, micro-tips, word choice |
| references/citation-workflow.md | Citation APIs, Python code, BibTeX management |
| Document | Contents |
|---|---|
| references/compilation-guide.md | Tectonic, latexmk, cross-ref audit, local compilation |
| references/parallel-pipeline.md | Wave architecture, agent orchestration, dependency graph |
| scripts/cross_ref_audit.py | Automated cross-reference and duplicate label checker |
| templates/university-thesis/ | Generic university thesis LaTeX template |