Knowledge Graph - Neo4j Integration

Connect to Neo4j graph databases and execute Cypher queries for storing, querying, and managing knowledge graph data using the property graph model. Full support for transactions, indexes, bulk operations, and result mapping.

Install

openclaw skills install neo4j-integration

Neo4j Integration

Connect to Neo4j graph databases and execute Cypher queries for efficient knowledge graph management.

This skill enables seamless interaction with Neo4j graph databases using the official Python driver. It provides connection management, query execution, transaction support, and result mapping for the property graph model.

Quick Start

Use When

  • Working with Neo4j-backed knowledge graphs
  • Executing Cypher queries on graph data
  • Creating or updating nodes and relationships
  • Importing graph data into Neo4j
  • Querying graph structures with complex patterns
  • Managing graph database transactions
  • Creating indexes for performance optimization
  • Building production graph applications

Inputs

  • Neo4j connection credentials (URI, username, password)
  • Cypher queries with optional parameters
  • Node/relationship definitions
  • Bulk data for import
  • Transaction context

Outputs

  • Query results (nodes, relationships, properties)
  • Execution statistics and metrics
  • Success/failure status
  • Record counts and performance data

Connection & Authentication

Supported Protocols

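bolt://        - Direct connection to a single server, unencrypted
neo4j://       - Routed connection for clusters and single instances, unencrypted (recommended)
neo4j+s://     - Routed connection with TLS, certificate verified against a trusted CA
neo4j+ssc://   - Routed connection with TLS, self-signed certificates accepted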
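
The scheme decides both routing and encryption, so it can help to inspect a URI before connecting. The following is a minimal sketch using only the standard library; `describe_scheme` is a hypothetical helper name, not part of the skill or the driver, and the table also includes `bolt+s` and `bolt+ssc`, which the official drivers accept although they are not listed above.

```python
from urllib.parse import urlparse

# Scheme -> (routing enabled, encrypted, certificate policy).
# bolt+s and bolt+ssc are included as an addition to the list above.
SCHEMES = {
    "bolt": (False, False, None),
    "bolt+s": (False, True, "ca-signed"),
    "bolt+ssc": (False, True, "self-signed"),
    "neo4j": (True, False, None),
    "neo4j+s": (True, True, "ca-signed"),
    "neo4j+ssc": (True, True, "self-signed"),
}


def describe_scheme(uri: str) -> dict:
    """Report what a connection URI implies about routing and encryption."""
    parsed = urlparse(uri)
    if parsed.scheme not in SCHEMES:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    routing, encrypted, certificates = SCHEMES[parsed.scheme]
    return {
        "scheme": parsed.scheme,
        "routing": routing,
        "encrypted": encrypted,
        "certificates": certificates,
        "host": parsed.hostname,
        "port": parsed.port or 7687,  # 7687 is the default Bolt port
    }
```

Calling `describe_scheme("neo4j+s://db.example.com:7687")` reports a routed, encrypted connection; the host name here is a placeholder.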

Connection Configuration

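config = {
    "uri": "neo4j://localhost:7687",
    "username": "neo4j",
    "password": "secure_password",  # load from an environment variable or secret store in practice
    "encrypted": True,
    "trust": "TRUST_ALL_CERTIFICATES"  # skips certificate verification; development only
}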
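
A configuration like this might be turned into a driver instance roughly as follows. This is a sketch assuming the official `neo4j` Python package; `driver_arguments` and `open_driver` are hypothetical helper names. The `encrypted` and `trust` keys are deliberately not forwarded, since how the skill translates them into driver options is not documented here.

```python
def driver_arguments(config: dict) -> tuple[str, dict]:
    """Split a skill-style config into a URI and driver keyword arguments."""
    auth = (config["username"], config["password"])
    return config["uri"], {"auth": auth}


def open_driver(config: dict):
    """Create a driver and fail early if the server cannot be reached."""
    # Imported lazily so driver_arguments stays usable without the package.
    from neo4j import GraphDatabase

    uri, kwargs = driver_arguments(config)
    driver = GraphDatabase.driver(uri, **kwargs)
    driver.verify_connectivity()  # raises on unreachable server or bad credentials
    return driver


# Usage (requires a running Neo4j server):
# with open_driver(config) as driver:
#     with driver.session() as session:
#         session.run("RETURN 1").consume()
```

Using the driver as a context manager closes its connection pool when the block exits.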

Connection Pool

  • Default pool size: 50 connections
  • Configurable connection limits
  • Automatic connection recycling
  • Health checks for stale connections
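
With the official Python driver, pool limits are passed as keyword arguments when the driver is created. The sketch below only builds that argument dictionary; `pool_settings` is a hypothetical helper, the default of 50 mirrors the figure quoted above, and the lifetime and timeout values are illustrative.

```python
def pool_settings(
    max_size: int = 50,
    max_lifetime_s: float = 3600.0,
    acquire_timeout_s: float = 60.0,
) -> dict:
    """Build connection-pool keyword arguments for GraphDatabase.driver."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return {
        # Upper bound on open connections per server.
        "max_connection_pool_size": max_size,
        # Connections older than this (seconds) are closed and replaced.
        "max_connection_lifetime": max_lifetime_s,
        # How long (seconds) a caller waits for a free connection.
        "connection_acquisition_timeout": acquire_timeout_s,
    }


# Usage (requires the neo4j package and a running server):
# driver = GraphDatabase.driver(uri, auth=auth, **pool_settings(max_size=20))
```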

Property Graph Model

Neo4j uses a property graph model with three core elements:

1. Nodes

Represent entities with labels and properties.

CREATE (p:Person {name: "Alice", age: 30, email: "alice@example.com"})
CREATE (c:Company {name: "TechCorp", industry: "Technology"})

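Characteristics:

  • Labels: Type classification (Person, Company, etc.); a node can carry several
  • Properties: Key-value pairs (name, age, and email above)
  • Identity: Neo4j assigns an internal ID; a property such as name serves as the application-level identifier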

2. Relationships

Connect nodes with typed, directed edges and properties.

CREATE (a:Person {name: "Alice"})-[:WORKS_AT {since: 2020}]->(c:Company {name: "TechCorp"})
CREATE (a)-[:KNOWS {strength: 0.8}]->(b:Person {name: "Bob"})

Characteristics:

  • Direction: Start node → End node
  • Type: Uppercase name (WORKS_AT, KNOWS, etc.)
  • Properties: Optional metadata
  • Stored with one direction but traversable in either: match with -[r]-> or <-[r]-, or use -[r]- to ignore direction

3. Properties

Attributes on nodes and relationships.

String: "text value"
Integer: 42
Float: 3.14
Boolean: true/false
DateTime: timestamp
List: [1, 2, 3]
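
When property values come from Python, it can be useful to check them before sending a query. The following sketch covers only the types listed above; `is_storable_property` is a hypothetical helper, and it relies on the rule that a Neo4j property list must hold values of a single simple type, with no nested lists or maps.

```python
from datetime import datetime

# Python counterparts of the scalar property types listed above.
SCALARS = (str, int, float, bool, datetime)


def is_storable_property(value) -> bool:
    """Return True if value fits one of the property types listed above."""
    if isinstance(value, SCALARS):
        return True
    if isinstance(value, list):
        # type() keeps bool and int apart, so [1, True] counts as mixed.
        kinds = {type(item) for item in value}
        return len(kinds) <= 1 and all(isinstance(item, SCALARS) for item in value)
    # Dicts, None, and arbitrary objects cannot be stored as a property.
    return False
```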

Core Cypher Query Patterns

MATCH - Find Data

MATCH (p:Person) WHERE p.age > 30 RETURN p.name, p.age
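
From Python, the literal 30 would normally become a query parameter. A minimal sketch, assuming a session object from the official driver (`session.run` and `Record.data()`); the function name `people_older_than` and the `$min_age` parameter are illustrative.

```python
QUERY = (
    "MATCH (p:Person) WHERE p.age > $min_age "
    "RETURN p.name AS name, p.age AS age"
)


def people_older_than(session, min_age: int) -> list[dict]:
    """Run the MATCH query with a parameter and return plain dictionaries."""
    # Keyword arguments to session.run are sent as query parameters,
    # so the value is never spliced into the Cypher text.
    result = session.run(QUERY, min_age=min_age)
    return [record.data() for record in result]


# Usage (requires an open driver):
# with driver.session() as session:
#     rows = people_older_than(session, 30)
```

The `AS` aliases give each returned dictionary the keys `name` and `age` instead of `p.name` and `p.age`.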

CREATE - Add Data

CREATE (p:Person {name: "Bob", age: 25}) RETURN p

MERGE - Create or Update

MERGE (p:Person {name: "Alice"}) SET p.age = 31 RETURN p

MATCH + CREATE - Add Relationships

MATCH (a:Person {name: "Alice"}), (b:Person {name: "Bob"})
CREATE (a)-[:KNOWS]->(b) RETURN a, b

DELETE - Remove Data

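MATCH (p:Person {name: "Old Person"}) DETACH DELETE p

DETACH DELETE removes the node together with its relationships; a plain DELETE fails while the node still has any.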

RETURN + ORDER BY + LIMIT

MATCH (p:Person) RETURN p ORDER BY p.age DESC LIMIT 10

Advanced Query Features

Aggregations

MATCH (p:Person) RETURN COUNT(p), AVG(p.age), MAX(p.age)

Collection Functions

MATCH (p:Person)-[:KNOWS]->(friends:Person)
RETURN p.name, COLLECT(friends.name) AS friend_list

Conditional Logic

MATCH (p:Person) 
RETURN p.name, CASE WHEN p.age > 30 THEN "Senior" ELSE "Junior" END AS level

Path Queries

MATCH path = (a:Person)-[:KNOWS*1..3]->(b:Person)
WHERE a.name = "Alice" AND b.name = "Bob"
RETURN path, LENGTH(path) AS hops

Graph Algorithms

The query ranks nodes by a pagerank property that a prior algorithm run has written to them.

MATCH (n:Person) WHERE n.pagerank IS NOT NULL RETURN n ORDER BY n.pagerank DESC

Transaction Management

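Simple Transaction

The :begin, :commit, and :rollback lines below are cypher-shell commands, not Cypher statements; through the driver, transactions are opened and closed via the session API.

:begin
CREATE (p:Person {name: "Alice"});
CREATE (c:Company {name: "TechCorp"});
:commit

Rollback on Error

:begin
CREATE (p:Person {name: "Alice"});
:rollback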
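
From Python, the same commit-or-rollback behavior is usually obtained through the driver. This is a minimal sketch, assuming version 5.x of the official driver, where `session.execute_write` runs a function inside a transaction, commits when it returns, and rolls back if it raises (4.x names this method `write_transaction`). The function names are hypothetical.

```python
def _create_pair(tx, person: str, company: str) -> None:
    """Transaction function: both statements succeed or neither is kept."""
    tx.run("CREATE (p:Person {name: $name})", name=person)
    tx.run("CREATE (c:Company {name: $name})", name=company)


def create_person_and_company(session, person: str, company: str) -> None:
    """Create both nodes atomically in one managed write transaction."""
    # Extra positional arguments are passed through to the transaction function.
    session.execute_write(_create_pair, person, company)


# Usage (requires an open driver):
# with driver.session() as session:
#     create_person_and_company(session, "Alice", "TechCorp")
```

Because the driver may re-run a transaction function after a transient failure, the function should avoid side effects outside the database.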

ACID Properties

  • Atomicity: All-or-nothing execution
  • Consistency: Graph constraints maintained
  • Isolation: Concurrent transactions do not see each other's uncommitted changes (read-committed by default)
  • Durability: Persistent storage

Indexes & Performance

Create Index

CREATE INDEX person_name FOR (p:Person) ON (p.name)
CREATE INDEX company_id FOR (c:Company) ON (c.id)
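
Labels and property keys in index statements are written into the Cypher text rather than sent as parameters, so generating these statements in code calls for validating the names first. A sketch under that assumption; `create_index_statement` is a hypothetical helper, and the `IF NOT EXISTS` clause assumes Neo4j 4.x or later.

```python
import re

# Conservative whitelist: letters, digits, underscore, not starting with a digit.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def create_index_statement(name: str, label: str, prop: str) -> str:
    """Build a CREATE INDEX statement from validated identifiers."""
    for part in (name, label, prop):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    # IF NOT EXISTS makes the statement safe to re-run.
    return f"CREATE INDEX {name} IF NOT EXISTS FOR (n:{label}) ON (n.{prop})"
```

The resulting string can be passed to `session.run` like any other statement.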

Index Types

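  • Range Index - Equality and range predicates on a property
  • Full-text Index - Tokenized text search over string properties
  • Lookup Index - Token lookup index that finds nodes by label and relationships by type
  • Unique Index - Backs a uniqueness constraint on a property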

Query Optimization

  1. Use indexes on filtered properties - WHERE clauses
  2. Avoid cartesian products - Connect MATCH patterns through relationships or shared variables
  3. Limit result sets - Use LIMIT clause
  4. Batch imports - Load data in chunks
  5. Profile queries - EXPLAIN/PROFILE for analysis
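
Profiling can also be scripted. A small sketch, assuming the official driver's `session.run(...).consume()` and the `profile` attribute of the returned summary; `profile_query` is a hypothetical helper. PROFILE executes the query, so it should not be pointed at write statements casually.

```python
def profile_query(session, query: str, **params):
    """Run the query under PROFILE and return the driver's profile tree."""
    # consume() discards remaining records and returns the result summary.
    summary = session.run("PROFILE " + query, **params).consume()
    return summary.profile


# Usage (requires an open driver):
# with driver.session() as session:
#     plan = profile_query(
#         session, "MATCH (p:Person) WHERE p.age > $age RETURN p", age=30
#     )
```

Replacing the `PROFILE` prefix with `EXPLAIN` yields the planned operators without running the query; the driver then reports them on `summary.plan` instead.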

Bulk Operations

Import CSV

LOAD CSV WITH HEADERS FROM "file:///data.csv" AS row
CREATE (p:Person {name: row.name, age: toInteger(row.age)})

Batch Create

UNWIND $nodes AS node
CREATE (n:Person {id: node.id, name: node.name})

Batch Update

UNWIND $updates AS update
MATCH (p:Person {id: update.id})
SET p.age = update.age
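
The `$nodes` parameter is typically filled in fixed-size chunks so that no single transaction grows too large. A minimal sketch, assuming a session from the official driver; `chunked` and `import_people` are hypothetical names, the batch size of 1000 is illustrative, and the query adds a `:Person` label as an assumption.

```python
from itertools import islice

BATCH_QUERY = (
    "UNWIND $nodes AS node "
    "CREATE (n:Person {id: node.id, name: node.name})"
)


def chunked(rows, size: int):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def import_people(session, rows, batch_size: int = 1000) -> int:
    """Send rows to Neo4j one batch per query; return the number of rows sent."""
    total = 0
    for batch in chunked(rows, batch_size):
        # consume() waits for the statement to finish before the next batch.
        session.run(BATCH_QUERY, nodes=batch).consume()
        total += len(batch)
    return total
```

Each element of `rows` is expected to be a dictionary with `id` and `name` keys, matching what the query reads from `node`.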

Result Mapping

Simple Results

MATCH (p:Person) RETURN p.name AS name, p.age AS age

Maps to:

[
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25}
]

Node Results

MATCH (p:Person) RETURN p

Maps to Python Node objects with:

  • id: Node internal ID (exposed as element_id in driver 5.x, where the integer id is deprecated)
  • labels: List of labels
  • properties: Dict of properties

Relationship Results

MATCH (a)-[r]->(b) RETURN r

Maps to Relationship objects with:

  • id: Relationship internal ID (exposed as element_id in driver 5.x, where the integer id is deprecated)
  • type: Relationship type
  • properties: Dict of properties
  • start_node_id: Source node ID
  • end_node_id: Target node ID
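
The field lists above can be produced from the driver's graph objects roughly as follows. This sketch assumes version 5.x of the official driver, where nodes and relationships expose `element_id`, behave like read-only mappings of their properties, and relationships carry `type`, `start_node`, and `end_node`. The two function names are hypothetical.

```python
def node_to_dict(node) -> dict:
    """Flatten a driver Node into the fields listed above."""
    return {
        "id": node.element_id,
        "labels": sorted(node.labels),  # the driver returns a frozenset
        "properties": dict(node),       # Node acts as a mapping of its properties
    }


def relationship_to_dict(rel) -> dict:
    """Flatten a driver Relationship into the fields listed above."""
    return {
        "id": rel.element_id,
        "type": rel.type,
        "properties": dict(rel),
        "start_node_id": rel.start_node.element_id,
        "end_node_id": rel.end_node.element_id,
    }
```

Plain dictionaries like these serialize directly to JSON, which the driver's own objects do not.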

Error Handling

Common Errors

| Error | Cause | Solution |
|---|---|---|
| Connection refused | Neo4j not running | Start Neo4j server |
| Authentication failed | Wrong credentials | Verify username/password |
| Syntax error | Invalid Cypher | Check query syntax |
| Constraint violation | Duplicate/invalid data | Check constraints |
| Timeout | Query too slow | Add indexes, optimize query |
| Out of memory | Too much data | Batch operations, paginate |

Retry Logic

  • Connection failures: Automatic retry with exponential backoff
  • Transient errors: Configurable retry attempts
  • Circuit breaker: Fail fast on persistent failures
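
The retry behavior described above can be sketched in a few lines of standard-library Python. This is an illustration of exponential backoff only, not the skill's actual implementation; `retry_with_backoff` and its parameters are hypothetical, and the circuit breaker is left out.

```python
import time


def retry_with_backoff(operation, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a retryable error wait base_delay * 2**n and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...


# Usage with the official driver (requires the neo4j package):
# from neo4j.exceptions import ServiceUnavailable, TransientError
# retry_with_backoff(lambda: session.run("RETURN 1").consume(),
#                    retry_on=(ServiceUnavailable, TransientError))
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.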

Best Practices

  • Use parameterized queries - Prevent injection attacks
  • Create appropriate indexes - Improve query performance
  • Batch large imports - Avoid memory exhaustion
  • Use transactions - Ensure data consistency
  • Profile queries - Identify performance bottlenecks
  • Close connections - Prevent resource leaks
  • Limit result sets - Avoid network overhead
  • Normalize node names - Prevent duplicate nodes
  • Document schemas - Maintain data governance
  • Monitor database - Track performance metrics

Integration Points

This skill integrates with:

  • GraphQL Graph Mapping - Expose Neo4j via GraphQL
  • Graph Query Optimization - Optimize Cypher queries
  • Schema Validation - Validate graph structure
  • CSV Graph Loader - Import CSV to Neo4j
  • Constraint Generator - Define database constraints
  • REST API Wrapper - Expose Neo4j as REST API

Recommended Libraries

Neo4j Python Driver

  • neo4j - Official driver (4.x/5.x)
  • neomodel - Python ORM for Neo4j
  • py2neo - Pythonic interface (community project, now end-of-life; prefer the official driver for new work)

Query Building

  • cypher-dsl-python - Build Cypher programmatically
  • ipython-cypher - Jupyter integration

Data Processing

  • pandas - Data frame operations
  • polars - Efficient data loading
  • networkx - Graph analysis

Visualization

  • graphistry - Interactive graph visualization
  • pyvis - Network visualization
  • neovis.js - Neo4j visualization

Related Skills

  • RDF Triple Store Integration - Alternative graph database
  • TigerGraph Connector - Distributed graph platform
  • JanusGraph Connector - Scalable graph database
  • GraphQL Graph Mapping - API layer on Neo4j
  • Graph Query Optimization - Improve query performance

Version: 1.0.0