Install
openclaw skills install @quqxui/information-extraction
Extract structured information from unstructured text through a semi-automatic pipeline. Support entity extraction, relation extraction, attribute extraction, and event extraction from plain text and Markdown. Use when converting raw text into triples, graph-ready records, or normalized structured facts from documents, notes, reports, transcripts, and web content copied as text.
Extract entity, relation, attribute, and event information from text into a normalized intermediate structure, then export triples in JSON, JSONL, or TSV.
Prefer this skill for:
If the user provides a file in another format, convert it to text first, then use this skill.
Default output should contain:
{
  "triples": [],
  "entities": [],
  "attributes": [],
  "events": [],
  "ambiguities": []
}
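A minimal sketch of how the default output container could be built and checked in Python. The helper names `empty_result` and `missing_keys` are hypothetical and are not part of the skill's scripts; only the five top-level keys come from the shape above.

```python
import json

# Top-level keys taken from the default output shape above.
REQUIRED_KEYS = ("triples", "entities", "attributes", "events", "ambiguities")


def empty_result():
    """Return a fresh container with every top-level key set to an empty list."""
    return {key: [] for key in REQUIRED_KEYS}


def missing_keys(result):
    """List the required top-level keys that are absent or are not lists."""
    return [key for key in REQUIRED_KEYS if not isinstance(result.get(key), list)]


if __name__ == "__main__":
    print(json.dumps(empty_result(), indent=2))
    # A partial result is reported as incomplete rather than silently accepted.
    print(missing_keys({"triples": [], "entities": []}))
```

Running a check like this before export makes an incomplete result visible early instead of producing a file with silently missing sections.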
Supported export formats: JSON, JSONL, and TSV.
See references/relation-taxonomy.md for predicates.
Use these record shapes during extraction.
Entity
{
  "id": "ent_001",
  "mention": "OpenAI",
  "canonical_name": "OpenAI",
  "type": "Organization",
  "evidence": "OpenAI published the GPT-4 Technical Report.",
  "confidence": 0.95
}
Relation
{
  "subject": "ent_001",
  "predicate": "published",
  "object": "ent_002",
  "evidence": "OpenAI published the GPT-4 Technical Report.",
  "confidence": 0.93
}
Attribute
{
  "entity_id": "ent_002",
  "attribute": "year",
  "value": "2023",
  "evidence": "The report was released in 2023.",
  "confidence": 0.87
}
Event
{
  "id": "ev_001",
  "type": "Publication",
  "trigger": "published",
  "participants": {
    "agent": "ent_001",
    "object": "ent_002"
  },
  "time": "2023",
  "location": null,
  "evidence": "OpenAI published the GPT-4 Technical Report in 2023.",
  "confidence": 0.92
}
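The record shapes above reference entities by id, so turning them into readable triples means resolving ids to canonical names. The sketch below shows one way this could be done. It is an illustration only: the actual mapping rules live in references/triple-mapping.md, which is not reproduced here, and the function `to_triples` plus the `canonical_name` and `type` given for `ent_002` are assumptions made for the example.

```python
def to_triples(entities, relations, attributes):
    """Resolve entity ids to canonical names and emit (subject, predicate, object) tuples."""
    names = {entity["id"]: entity["canonical_name"] for entity in entities}
    triples = []
    for rel in relations:
        # Relation records point at entities on both sides.
        triples.append((names[rel["subject"]], rel["predicate"], names[rel["object"]]))
    for attr in attributes:
        # Attribute records carry a literal value instead of an entity id.
        triples.append((names[attr["entity_id"]], attr["attribute"], attr["value"]))
    return triples


entities = [
    {"id": "ent_001", "canonical_name": "OpenAI", "type": "Organization"},
    # Assumed for illustration: the source only shows the id "ent_002".
    {"id": "ent_002", "canonical_name": "GPT-4 Technical Report", "type": "Document"},
]
relations = [{"subject": "ent_001", "predicate": "published", "object": "ent_002"}]
attributes = [{"entity_id": "ent_002", "attribute": "year", "value": "2023"}]

for triple in to_triples(entities, relations, attributes):
    print(triple)
# ('OpenAI', 'published', 'GPT-4 Technical Report')
# ('GPT-4 Technical Report', 'year', '2023')
```

A relation that points at an unknown entity id raises a `KeyError` here, which is deliberate in this sketch: a dangling reference is better surfaced than exported.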
Read references/pipeline.md for the end-to-end procedure.
Read references/schema.md for types and intermediate record structure.
Read references/relation-taxonomy.md before inventing new predicates.
Read references/triple-mapping.md when exporting final triples.
Read references/event-modeling.md when text describes complex events.
Read references/quality-checklist.md before final delivery.
python3 skills/information-extraction/scripts/extract.py --text "OpenAI published GPT-4." --output out.json
Or read from stdin:
echo "OpenAI published GPT-4." | python3 skills/information-extraction/scripts/extract.py --stdin --output out.json
python3 skills/information-extraction/scripts/normalize.py --input out.json --output normalized.json
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format json --output triples.json
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format jsonl --output triples.jsonl
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format tsv --output triples.tsv
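To make the difference between the JSONL and TSV targets concrete, here is a small standard-library sketch of both serializations. It is not the code of export_triples.py, whose internals are not shown here; the function names and the assumption that each triple is a dict with `subject`, `predicate`, and `object` keys are hypothetical.

```python
import csv
import io
import json


def to_jsonl(triples):
    """Serialize triples as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(t, ensure_ascii=False) + "\n" for t in triples)


def to_tsv(triples, fields=("subject", "predicate", "object")):
    """Serialize triples as tab-separated text with a header row."""
    buf = io.StringIO()
    # csv.writer handles quoting if a value happens to contain a tab or newline.
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for t in triples:
        writer.writerow([t[f] for f in fields])
    return buf.getvalue()


triples = [{"subject": "OpenAI", "predicate": "published", "object": "GPT-4"}]
print(to_jsonl(triples), end="")
print(to_tsv(triples), end="")
```

JSONL keeps each record self-describing and is easy to stream or append to, while TSV is convenient for spreadsheets and bulk graph loaders that expect fixed columns.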
This is a semi-automatic pipeline, not a claim of perfect extraction. The scripts provide scaffolding, normalization, and export. For high-stakes outputs, keep evidence and perform manual review.
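One way to act on this advice is to route low-confidence or evidence-free records to a manual review queue before export. The sketch below assumes records shaped like the examples above; the function `split_for_review` and the 0.9 threshold are hypothetical choices for illustration, not values defined by the skill.

```python
def split_for_review(records, threshold=0.9):
    """Split records into (accepted, needs_review) by evidence and confidence."""
    accepted, needs_review = [], []
    for rec in records:
        # A record without evidence or without a confidence score always goes to review.
        if rec.get("evidence") and rec.get("confidence", 0.0) >= threshold:
            accepted.append(rec)
        else:
            needs_review.append(rec)
    return accepted, needs_review


records = [
    {"predicate": "published", "evidence": "OpenAI published the GPT-4 Technical Report.", "confidence": 0.93},
    {"attribute": "year", "evidence": "The report was released in 2023.", "confidence": 0.87},
    {"predicate": "authored", "evidence": "", "confidence": 0.99},
]

accepted, needs_review = split_for_review(records)
print(len(accepted), "accepted;", len(needs_review), "for manual review")
# 1 accepted; 2 for manual review
```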