Install

    openclaw skills install phy-citation-checker

Verify academic citations against CrossRef, Semantic Scholar, and OpenAlex. Detects AI-hallucinated references, chimeric citations (real title + wrong authors), and suspicious patterns before submission.

Check a .bib file written with AI assistance:

    python scripts/citation_checker.py references.bib

Check all .bib files in a directory:

    python scripts/citation_checker.py path/to/report/

JSON output:

    python scripts/citation_checker.py references.bib --json

Verbose output:

    python scripts/citation_checker.py references.bib --verbose

Each citation is checked against three independent databases:
| Source | Coverage | Strength |
|---|---|---|
| CrossRef | 140M+ DOI-registered works | Best for journal/conference papers with DOIs |
| Semantic Scholar | 200M+ papers | Best author disambiguation, arXiv coverage |
| OpenAlex | 240M+ works | Broadest coverage, fully open |
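
To make the three lookups concrete, here is a minimal sketch that builds one title-search URL per database without fetching anything. The endpoints and query parameters are those of the public CrossRef, Semantic Scholar, and OpenAlex APIs. Whether `citation_checker.py` calls these exact endpoints is an assumption, and `build_lookup_urls` is a hypothetical helper name.

```python
from typing import Dict
from urllib.parse import urlencode


def build_lookup_urls(title: str, rows: int = 3) -> Dict[str, str]:
    """Return one title-search URL per database (nothing is fetched here)."""
    return {
        # CrossRef: free-text bibliographic query over DOI-registered works
        "crossref": "https://api.crossref.org/works?"
        + urlencode({"query.bibliographic": title, "rows": rows}),
        # Semantic Scholar Graph API: paper search, limited to the fields we compare
        "semantic_scholar": "https://api.semanticscholar.org/graph/v1/paper/search?"
        + urlencode({"query": title, "limit": rows, "fields": "title,authors,year"}),
        # OpenAlex: full-text search over works
        "openalex": "https://api.openalex.org/works?"
        + urlencode({"search": title, "per-page": rows}),
    }


if __name__ == "__main__":
    for source, url in build_lookup_urls("Attention Is All You Need").items():
        print(f"{source}: {url}")
```

Each URL can be passed to `requests.get(url, timeout=10)`. The response shapes differ per database, so each one needs its own parsing.
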
Verification logic:
When a citation's title matches a real paper but the authors don't overlap at all, it is flagged as a possible chimeric hallucination. This is the most dangerous type, because searching for the title on Google Scholar returns a real paper.
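
The chimeric check described above can be sketched as a title comparison followed by an author-overlap test. This is an illustration under stated assumptions, not the checker's actual code. The 0.9 title threshold, the "First Last" name order, the mapping of chimeric hits to `suspicious`, and the helper names are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse every non-alphanumeric run to a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def surnames(authors):
    """Take the last token of each name as the surname (assumes 'First Last' order)."""
    return {normalize(name).split()[-1] for name in authors if normalize(name)}


def classify(cited, found, title_threshold=0.9):
    """Compare a cited entry with a database hit; both are dicts with 'title' and 'authors'."""
    similarity = SequenceMatcher(
        None, normalize(cited["title"]), normalize(found["title"])
    ).ratio()
    if similarity < title_threshold:
        return "not_found"  # the hit is a different paper
    if surnames(cited["authors"]) & surnames(found["authors"]):
        return "verified"  # title matches and at least one surname overlaps
    return "suspicious"  # real title, zero author overlap: possible chimeric citation


real = {"title": "Attention Is All You Need", "authors": ["Ashish Vaswani", "Noam Shazeer"]}
chimeric = {"title": "Attention Is All You Need", "authors": ["Jane Doe", "John Roe"]}
print(classify(chimeric, real))
```

BibTeX entries often store names as "Last, First", so a real implementation would need to normalize name order before comparing surnames.
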
10.xxxx/)

Exit codes:

| Code | Meaning |
|---|---|
| 0 | All citations verified |
| 1 | One or more citations not found |
| 2 | Suspicious citations only (no hard failures) |
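
A wrapper script can act on these exit codes without parsing any output. The sketch below runs a command and translates its exit code using the table above. `describe_exit` is a hypothetical helper name, and the example assumes the checker is invoked exactly as shown elsewhere in this document.

```python
import subprocess
import sys

# Meanings copied from the exit-code table
EXIT_MEANINGS = {
    0: "all citations verified",
    1: "one or more citations not found",
    2: "suspicious citations only (no hard failures)",
}


def describe_exit(command):
    """Run a command and translate its exit code using the table above."""
    result = subprocess.run(command, capture_output=True, text=True)
    return EXIT_MEANINGS.get(
        result.returncode, f"unexpected exit code {result.returncode}"
    )


# Example call (requires the checker script and a references.bib file):
# print(describe_exit([sys.executable, "scripts/citation_checker.py", "references.bib"]))
```

One caveat: the Python interpreter itself exits with code 2 when it cannot open the script file, so a wrong script path can look like a "suspicious only" result. Checking that the script exists before running it avoids that ambiguity.
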

Dependencies:

    pip install requests

No API keys required: the checker uses the free tiers of all three databases.

| Category | Result | Description |
|---|---|---|
| Known-good | 9/10 (90%) | Famous ML papers (Vaswani, Devlin, Brown, He, etc.) |
| Known-bad | 10/10 (100%) | Fabricated papers with plausible titles |
| Chimeric | 5/5 (100%) | Real titles with wrong authors |
| False positive rate | 10% | 1 miss: unpublished tech report without DOI |
| False negative rate | 0% | No fake paper was ever verified |

The core guarantee held across this test set: no fake paper was marked as real.

Git pre-commit hook:

    #!/bin/bash
    # .git/hooks/pre-commit
    # Block the commit when any citation in references.bib cannot be found.
    python scripts/citation_checker.py references.bib --json > /tmp/cite_check.json
    # Read the not_found count from the JSON report summary.
    NOT_FOUND=$(python -c "import json; d=json.load(open('/tmp/cite_check.json')); print(d['summary']['not_found'])")
    if [ "$NOT_FOUND" -gt 0 ]; then
        echo "BLOCKED: $NOT_FOUND unfound citations. Run 'python scripts/citation_checker.py references.bib --verbose' to investigate."
        exit 1
    fi


CI workflow step:

    - name: Check citations
      run: |
        pip install requests
        # Keep going on a nonzero checker exit (for example under bash -e);
        # the report check below decides pass/fail.
        python scripts/citation_checker.py references.bib --json > citation_report.json || true
        python -c "
        import json, sys
        r = json.load(open('citation_report.json'))
        if r['summary']['not_found'] > 0:
            print(f'FAIL: {r[\"summary\"][\"not_found\"]} citations not found')
            sys.exit(1)
        print(f'PASS: {r[\"summary\"][\"verified\"]} verified, {r[\"summary\"][\"suspicious\"]} suspicious')
        "
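
The inline `python -c` snippet needs escaped quotes and is hard to test. One alternative is a small standalone script that applies the same rule to the JSON report. This is a sketch: the file name `check_report.py` and the function names are hypothetical, and it relies only on the `summary` keys used above (`not_found`, `verified`, `suspicious`).

```python
import json
import sys


def gate(report):
    """Return (exit_code, message) for a parsed --json report."""
    summary = report["summary"]
    if summary["not_found"] > 0:
        return 1, f"FAIL: {summary['not_found']} citations not found"
    return 0, f"PASS: {summary['verified']} verified, {summary['suspicious']} suspicious"


def main(path):
    """Load the report at `path`, print the verdict, and return the exit code."""
    with open(path, encoding="utf-8") as handle:
        code, message = gate(json.load(handle))
    print(message)
    return code


# Usage: python check_report.py citation_report.json
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

Keeping `gate` as a pure function means the pass/fail rule can be unit-tested without running the checker or touching the network.
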