Install
openclaw skills install text-entity-relation-extractor
Extract entities and relationships from unstructured text and convert them into graph-ready structures such as triples, nodes, and edges.
Extract structured knowledge from unstructured text.
This skill analyzes natural language text and identifies entities and relationships that can be converted into knowledge graph structures such as nodes, edges, or semantic triples. It is useful for transforming documents, articles, transcripts, or raw text into graph-ready data suitable for knowledge graphs, semantic systems, or graph databases.
Input Text:
Elon Musk founded SpaceX in 2002. SpaceX is headquartered
in Hawthorne, California and develops reusable spacecraft.
The company employs over 9,000 people and has partnerships
with NASA for space exploration missions.
Extracted Entities:
Elon Musk → PERSON
SpaceX → ORGANIZATION
2002 → DATE
Hawthorne → LOCATION
California → LOCATION
NASA → ORGANIZATION
9,000 → QUANTITY
Extracted Relationships:
Elon Musk -[FOUNDED]-> SpaceX
SpaceX -[HEADQUARTERED_IN]-> Hawthorne
SpaceX -[HEADQUARTERED_IN]-> California
SpaceX -[DEVELOPS]-> Spacecraft
SpaceX -[EMPLOYS]-> 9,000 people
SpaceX -[PARTNERS_WITH]-> NASA
Generated RDF Triples:
:Elon_Musk a foaf:Person ;
    :founded :SpaceX .
:SpaceX a schema:Organization ;
    schema:foundingDate "2002"^^xsd:gYear ;
    schema:headquartersLocation :Hawthorne ;
    schema:numberOfEmployees 9000 ;
    schema:partnerOf :NASA .
:Hawthorne a schema:Place ;
    schema:containedInPlace :California .
Purpose: Identify and classify entities in text
Entity Types Supported:
Configuration:
ner:
  model: spacy  # one of: spacy, bert, custom
  entity_types:
    - PERSON
    - ORGANIZATION
    - LOCATION
  confidence_threshold: 0.7
  case_sensitive: true
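As a rough illustration of how the `entity_types` and `confidence_threshold` settings might act on NER output, the sketch below filters a list of candidate entities. The `Entity` class, the `filter_entities` helper, and the scores are hypothetical and are not part of the skill's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    confidence: float


def filter_entities(entities, entity_types, confidence_threshold=0.7):
    """Keep entities whose label is allowed and whose score meets the threshold."""
    allowed = set(entity_types)
    return [
        e for e in entities
        if e.label in allowed and e.confidence >= confidence_threshold
    ]


# Illustrative scores only; a real NER model would supply these.
candidates = [
    Entity("Elon Musk", "PERSON", 0.98),
    Entity("SpaceX", "ORGANIZATION", 0.95),
    Entity("2002", "DATE", 0.99),           # dropped: DATE is not in entity_types
    Entity("Hawthorne", "LOCATION", 0.65),  # dropped: below the 0.7 threshold
]

kept = filter_entities(candidates, ["PERSON", "ORGANIZATION", "LOCATION"], 0.7)
print([e.text for e in kept])
```

Raising the threshold trades recall for precision, so it is usually worth tuning against a small labeled sample from the target domain.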
Purpose: Identify and extract relationships between entities
Relationship Types:
Detection Methods:
Dependency Parsing:
- Extract based on syntactic dependencies
- Example: SUBJECT -[verb]-> OBJECT
Pattern Matching:
- Use predefined patterns
- Example: [PERSON] works at [ORGANIZATION]
Machine Learning:
- Train on annotated data
- Classify relationship types
Knowledge Extraction:
- Use external knowledge bases
- Semantic role labeling
Configuration:
relation_extraction:
  method: hybrid  # one of: dependency, pattern, ml, hybrid
  relationship_types:
    - WORKS_AT
    - LOCATED_IN
    - FOUNDED
    - OWNS
  confidence_threshold: 0.6
Purpose: Standardize and deduplicate entities
Operations:
Configuration:
normalization:
  lowercase: true
  remove_punctuation: true
  alias_mapping:
    USA: United States
    NYC: New York City
  deduplication:
    similarity_threshold: 0.85
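The normalization options above could be applied roughly as follows. This is a minimal sketch: the `normalize` and `deduplicate` helpers are hypothetical, and `difflib.SequenceMatcher` is only one possible similarity measure, chosen here because it ships with the standard library.

```python
import string
from difflib import SequenceMatcher

ALIAS_MAPPING = {"USA": "United States", "NYC": "New York City"}


def normalize(name, lowercase=True, remove_punctuation=True):
    """Expand known aliases, then apply the lowercase / punctuation options."""
    name = name.strip()
    name = ALIAS_MAPPING.get(name, name)
    if remove_punctuation:
        name = name.translate(str.maketrans("", "", string.punctuation))
    if lowercase:
        name = name.lower()
    return " ".join(name.split())  # collapse repeated whitespace


def deduplicate(names, similarity_threshold=0.85):
    """Greedy dedup: keep a name only if no already-kept name is too similar."""
    kept = []
    for name in map(normalize, names):
        is_duplicate = any(
            SequenceMatcher(None, name, other).ratio() >= similarity_threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(name)
    return kept


unique = deduplicate(
    ["USA", "United States", "NYC", "New York City.", "SpaceX", "Space X"]
)
print(unique)
```

Aliases are expanded before lowercasing so that the mapping keys can stay in their usual upper-case form.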
Purpose: Convert extracted knowledge to RDF triples
Components:
Example:
Elon_Musk -[FOUNDED]-> SpaceX
SpaceX -[HEADQUARTERS]-> Hawthorne
SpaceX -[EMPLOYEE_COUNT]-> 9000
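One way to turn the arrow notation above into RDF statements is sketched below. The `parse_triple` and `to_turtle` helpers and the `ex` prefix are assumptions made for illustration; the sketch also assumes entity names are already underscore-joined, as in the example, so they can be used directly as prefixed names.

```python
import re

# Matches lines such as "Elon_Musk -[FOUNDED]-> SpaceX".
ARROW = re.compile(r"^\s*(?P<s>.+?)\s*-\[(?P<p>\w+)\]->\s*(?P<o>.+?)\s*$")


def parse_triple(line):
    """Split one arrow-notation line into (subject, predicate, object)."""
    match = ARROW.match(line)
    if match is None:
        raise ValueError(f"not a triple: {line!r}")
    return match["s"], match["p"], match["o"]


def to_turtle(triples, prefix="ex"):
    """Serialize triples as Turtle statements; all-digit objects become integer literals."""
    lines = []
    for s, p, o in triples:
        obj = o if o.isdigit() else f"{prefix}:{o}"
        lines.append(f"{prefix}:{s} {prefix}:{p} {obj} .")
    return "\n".join(lines)


raw = [
    "Elon_Musk -[FOUNDED]-> SpaceX",
    "SpaceX -[HEADQUARTERS]-> Hawthorne",
    "SpaceX -[EMPLOYEE_COUNT]-> 9000",
]
triples = [parse_triple(line) for line in raw]
print(to_turtle(triples))
```

A production pipeline would more likely map each relation name to a term from an established vocabulary instead of minting one predicate per relation.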
Purpose: Build knowledge graph from triples
Output:
Pattern: Identify entity boundaries and types
Text: "Apple Inc. was founded in 1976 by Steve Jobs."
Extracted:
Apple Inc. → ORGANIZATION
1976 → DATE
Steve Jobs → PERSON
Pattern: Extract [Entity1] -[Relation]-> [Entity2]
Text: "Steve Jobs founded Apple Inc."
Extracted:
Steve Jobs -[FOUNDED]-> Apple Inc.
Type: FOUNDED
Confidence: 0.92
Pattern: Use syntactic structure to extract relations
Dependency: nsubj(VERB, PERSON), dobj(VERB, ORG)
Example:
Person → VERB → Organization
John → founded → Apple
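The subject-verb-object pattern can be sketched over a hand-written dependency parse, as below. The `Token` structure and `extract_svo` helper are hypothetical; in practice the `dep` and `head` values would come from a parser rather than being typed in by hand.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    dep: str   # dependency label, e.g. "nsubj", "dobj", "ROOT"
    head: int  # index of this token's head within the sentence


def extract_svo(tokens):
    """Pair each verb's nsubj with its dobj to form (subject, verb, object) triples."""
    subjects, objects = {}, {}
    for tok in tokens:
        if tok.dep == "nsubj":
            subjects[tok.head] = tok.text
        elif tok.dep == "dobj":
            objects[tok.head] = tok.text
    # Only verbs that have both a subject and a direct object yield a triple.
    verbs = sorted(subjects.keys() & objects.keys())
    return [(subjects[i], tokens[i].text, objects[i]) for i in verbs]


# Hand-written parse of "John founded Apple".
sentence = [
    Token("John", "nsubj", 1),
    Token("founded", "ROOT", 1),
    Token("Apple", "dobj", 1),
]
svo = extract_svo(sentence)
print(svo)
```

This sketch handles only single-token subjects and objects; multi-word entities would need the parser's noun chunks or the NER spans.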
Pattern: Use handcrafted extraction rules
Rule: [PERSON] works at [ORGANIZATION]
Match: "Alice works at Acme"
Extract: Alice -[WORKS_AT]-> Acme
Rule: [ORG] is located in [LOCATION]
Match: "Google is located in Mountain View"
Extract: Google -[LOCATED_IN]-> Mountain View
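The two rules above might be implemented with regular expressions along these lines. As a simplifying assumption, the sketch approximates the `[PERSON]`, `[ORG]`, and `[LOCATION]` slots with runs of capitalized words instead of checking NER labels; `RULES` and `extract_relations` are hypothetical names.

```python
import re

# Approximation of an entity slot: one or more capitalized words.
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"

RULES = [
    (re.compile(rf"(?P<head>{NAME}) works at (?P<tail>{NAME})"), "WORKS_AT"),
    (re.compile(rf"(?P<head>{NAME}) is located in (?P<tail>{NAME})"), "LOCATED_IN"),
]


def extract_relations(text):
    """Apply every rule to the text and collect (head, relation, tail) triples."""
    relations = []
    for pattern, relation in RULES:
        for match in pattern.finditer(text):
            relations.append((match["head"], relation, match["tail"]))
    return relations


text = "Alice works at Acme. Google is located in Mountain View."
relations = extract_relations(text)
for head, relation, tail in relations:
    print(f"{head} -[{relation}]-> {tail}")
```

Handcrafted rules like these tend to be precise but brittle, which is one reason to combine them with the other detection methods.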
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:Elon_Musk a foaf:Person ;
    foaf:name "Elon Musk" ;
    ex:founded ex:SpaceX .
ex:SpaceX a schema:Organization ;
    foaf:name "SpaceX" ;
    schema:foundingDate "2002"^^xsd:gYear ;
    schema:headquartersLocation ex:Hawthorne .
{
"nodes": [
{"id": "Elon Musk", "type": "PERSON", "properties": {"name": "Elon Musk"}},
{"id": "SpaceX", "type": "ORGANIZATION", "properties": {"name": "SpaceX", "founded": 2002}},
{"id": "Hawthorne", "type": "LOCATION", "properties": {"name": "Hawthorne"}}
],
"edges": [
{"source": "Elon Musk", "target": "SpaceX", "type": "FOUNDED", "confidence": 0.92},
{"source": "SpaceX", "target": "Hawthorne", "type": "HEADQUARTERED_IN", "confidence": 0.88}
]
}
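A structure like the JSON above could be assembled from typed entities and scored relations roughly as follows. The `build_graph` helper and its input shapes are assumptions for illustration, not the skill's actual interface.

```python
import json


def build_graph(entities, relations):
    """Build a nodes/edges dict from {name: type} entities and scored relations."""
    nodes = [
        {"id": name, "type": label, "properties": {"name": name}}
        for name, label in entities.items()
    ]
    edges = [
        {"source": source, "target": target, "type": relation, "confidence": score}
        for source, relation, target, score in relations
        # Skip edges whose endpoints were never extracted as entities.
        if source in entities and target in entities
    ]
    return {"nodes": nodes, "edges": edges}


entities = {
    "Elon Musk": "PERSON",
    "SpaceX": "ORGANIZATION",
    "Hawthorne": "LOCATION",
}
relations = [
    ("Elon Musk", "FOUNDED", "SpaceX", 0.92),
    ("SpaceX", "HEADQUARTERED_IN", "Hawthorne", 0.88),
    ("SpaceX", "PARTNERS_WITH", "NASA", 0.80),  # dropped: NASA is not in entities
]

graph = build_graph(entities, relations)
print(json.dumps(graph, indent=2))
```

Dropping edges with unknown endpoints keeps the graph loadable into stores that require both nodes to exist before an edge is created.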
| Entity 1 | Type 1 | Relationship | Entity 2 | Type 2 | Confidence |
|---|---|---|---|---|---|
| Elon Musk | PERSON | FOUNDED | SpaceX | ORGANIZATION | 0.92 |
| SpaceX | ORGANIZATION | HEADQUARTERED_IN | Hawthorne | LOCATION | 0.88 |
Entity Confidence:
Score = Model_Confidence × Type_Confidence × Normalization_Score
Range: 0.0 - 1.0
Threshold: Usually 0.6-0.8 for filtering
Relationship Confidence:
Score = Detection_Score × Entity_Confidence × Pattern_Match_Score
Factors:
- Model prediction confidence
- Dependency strength
- Pattern specificity
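The two scoring formulas can be sketched as plain products of factors in the range 0.0 to 1.0. The function names and input values below are hypothetical. Because the relationship formula names a single entity confidence, the example takes the lower of the two endpoint scores, which is an assumption and not something the formulas specify.

```python
import math


def product_score(**factors):
    """Multiply named factors after checking each lies in [0.0, 1.0]."""
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
    return math.prod(factors.values())


def entity_confidence(model_confidence, type_confidence, normalization_score):
    return product_score(
        model_confidence=model_confidence,
        type_confidence=type_confidence,
        normalization_score=normalization_score,
    )


def relationship_confidence(detection_score, entity_score, pattern_match_score):
    return product_score(
        detection_score=detection_score,
        entity_score=entity_score,
        pattern_match_score=pattern_match_score,
    )


# Illustrative inputs only.
subject_score = entity_confidence(0.9, 1.0, 1.0)
object_score = entity_confidence(0.8, 1.0, 1.0)
# Assumption: use the weaker endpoint as the entity confidence of the relation.
relation_score = relationship_confidence(0.9, min(subject_score, object_score), 1.0)
keep = relation_score >= 0.6
print(round(relation_score, 2), keep)
```

Since every factor is at most 1.0, a product can only shrink as factors are added, so thresholds may need to be lower for relationships than for entities.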
✓ Choose appropriate NER models for domain
✓ Validate extracted relationships
✓ Normalize entity names consistently
✓ Remove low-confidence extractions
✓ Handle entity disambiguation
✓ Document extraction patterns
✓ Test with domain-specific text
✓ Manage performance with long texts
✓ Validate against domain knowledge
✓ Monitor confidence scores
Extracted knowledge feeds into:
See extraction-patterns.md for detailed NER and relationship extraction patterns and example-extractions.md for complete real-world examples.
Version: 1.0.0