Knowledge Graph - Janusgraph Connector

Data & APIs

Connect to JanusGraph distributed graph database to query, manage, and analyze graph data using Apache TinkerPop Gremlin traversal language

Install

openclaw skills install janusgraph-connector

JanusGraph Connector

Purpose

This skill enables interaction with a JanusGraph distributed graph database for querying, storing, managing, and analyzing knowledge graph data at scale.

JanusGraph is a highly scalable, distributed graph database built on the Apache TinkerPop stack that uses Gremlin as its graph traversal language. It supports multiple backend storage systems and is designed for enterprise-grade graph operations.

Key Capabilities

  • Query distributed graph data using Gremlin traversal language
  • Insert and manage vertices (nodes) and edges (relationships)
  • Execute multi-hop graph traversals
  • Manage transactions with ACID compliance
  • Create and manage database indexes
  • Analyze graph structures and patterns
  • Integrate with knowledge graph applications

When To Use This Skill

Use this skill when:

  • Querying JanusGraph: Executing Gremlin traversal queries against a JanusGraph instance
  • Loading Data: Inserting vertices and edges into JanusGraph
  • Graph Analysis: Analyzing graph structures, paths, and relationships
  • Distributed Graphs: Working with large-scale distributed graph data
  • Multi-hop Traversals: Finding paths and relationships across multiple hops
  • Graph Transactions: Managing atomic graph operations with rollback capability

Example Triggers

  • "Execute this Gremlin query against JanusGraph"
  • "Insert vertices with these properties"
  • "Create relationships between nodes"
  • "Find all neighbors of this vertex"
  • "Traverse the graph path from node X to node Y"
  • "Get all vertices with this label"
  • "Create a composite index on these properties"

Connection Configuration

Connection Parameters

{
  "host": "localhost",
  "port": 8182,
  "protocol": "ws",
  "traversal_source": "g",
  "timeout": 30,
  "max_pool_size": 10
}

Configuration Details

ParameterTypeDefaultDescription
hoststringlocalhostJanusGraph/Gremlin Server hostname
portinteger8182Gremlin Server port
protocolstringwsProtocol (ws for WebSocket)
traversal_sourcestringgGraph traversal source name
timeoutinteger30Connection timeout in seconds
max_pool_sizeinteger10Maximum connection pool size
usernamestringoptionalAuthentication username
passwordstringoptionalAuthentication password

Connection Methods

  • Gremlin Server WebSocket - Direct connection to Gremlin Server
  • Remote Traversal - Using remote graph traversal sources
  • Embedded Graph - Local in-process JanusGraph instance

Core Concepts

Graph Model

Vertices (Nodes)

  • Labeled entities in the graph
  • Contain properties (key-value pairs)
  • Uniquely identified by vertex ID
  • Example: Person("Alice", age: 30, email: "alice@example.com")

Edges (Relationships)

  • Directional connections between vertices
  • Have labels describing the relationship type
  • Support properties for relationship metadata
  • Example: Person -> KNOWS -> Person

Properties

  • Key-value metadata on vertices and edges
  • Support multiple data types (string, int, float, bool, date)
  • Can be indexed for performance
  • Example: name: "Alice", age: 30, since: "2020-01-15"

Labels

  • Classify vertices (e.g., "Person", "Product", "Location")
  • Classify edges (e.g., "KNOWS", "PURCHASED", "LOCATED_IN")
  • Enable efficient filtering and querying

Gremlin Query Language

Gremlin is a graph traversal language that:

  • Works across multiple graph databases (vendor-independent)
  • Provides functional composition API
  • Supports filtering, mapping, reducing operations
  • Enables complex multi-hop traversals
  • Language: DSL for Java, Python, JavaScript, etc.

TinkerPop Architecture

  • Graph - The graph database instance
  • Traversal - Sequence of steps to traverse the graph
  • Step - Individual operation (filter, map, reduce)
  • Traverser - Object moving through the traversal path

Core Gremlin Patterns

Vertex Queries (MATCH Operations)

Get all vertices

g.V()

Get vertices by label

g.V().hasLabel("Person")

Get vertices by property

g.V().has("name", "Alice")

Get vertices with multiple conditions

g.V().has("name", "Alice").has("age", gt(25))

Create Operations

Create a vertex

g.addV("Person")
  .property("name", "Alice")
  .property("age", 30)
  .property("email", "alice@example.com")

Create an edge

g.V().has("name", "Alice").addE("KNOWS")
  .to(g.V().has("name", "Bob"))
  .property("since", "2020-01-15")

Batch create vertices

g.addV("Person").property("name", "Alice")
g.addV("Person").property("name", "Bob")
g.addV("Person").property("name", "Charlie")

Relationship Traversals

Single-hop traversal

g.V().has("name", "Alice").out("KNOWS")

Multi-hop traversal

g.V().has("name", "Alice").repeat(out()).times(3)

Bidirectional traversal

g.V().has("name", "Alice").both("KNOWS")

Path finding

g.V().has("name", "Alice").repeat(out()).until(has("name", "Bob"))

Aggregations

Count vertices

g.V().count()

Group by property

g.V().group().by("age")

Calculate statistics

g.V().values("age").mean()

Filtering Operations

Comparison operators

g.V().has("age", gt(25))              // greater than
g.V().has("age", gte(25))             // greater than or equal
g.V().has("age", lt(30))              // less than
g.V().has("age", lte(30))             // less than or equal
g.V().has("age", neq(25))             // not equal

Text filters

g.V().has("name", startingWith("Al"))
g.V().has("email", endingWith("@example.com"))
g.V().has("name", containing("ice"))

List filters

g.V().has("status", within("active", "pending"))
g.V().has("status", without("deleted", "archived"))

Collections & Deduplication

Get property values

g.V().values("name")

Deduplicate results

g.V().values("age").dedup()

Collect into list

g.V().values("name").fold()

Sorting & Limiting

Sort results

g.V().order().by("name")
g.V().order().by("age", desc)

Limit results

g.V().limit(10)

Pagination

g.V().skip(20).limit(10)

Delete Operations

Delete a vertex

g.V().has("name", "Alice").drop()

Delete an edge

g.V().has("name", "Alice").outE("KNOWS").drop()

Delete all vertices of a label

g.V().hasLabel("Temporary").drop()

Update Operations

Update a property

g.V().has("name", "Alice").property("age", 31)

Add/update multiple properties

g.V().has("name", "Alice")
  .property("age", 31)
  .property("updated_at", 1681305600)

Advanced Features

Transaction Management

Begin transaction

connector.begin_transaction()

Commit transaction

connector.commit_transaction()

Rollback on error

connector.rollback_transaction()

ACID Properties

  • Atomicity: All-or-nothing operations
  • Consistency: Graph invariants maintained
  • Isolation: Transactions don't interfere
  • Durability: Committed data persists

Index Management

Composite Index (Fast exact-match lookups)

graph.index("Person_Name")
  .onType(Person.class)
  .add("name")
  .buildCompositeIndex()

Mixed Index (Full-text search, range queries)

graph.index("Person_Search")
  .onType(Person.class)
  .add("name", Mapping.TEXT.asParameter())
  .add("age", Mapping.DEFAULT.asParameter())
  .buildMixedIndex("search")

Edge Index

graph.index("KnowsIndex")
  .onType(KnowsEdge.class)
  .add("since")
  .buildCompositeIndex()

Vertex Centric Index

graph.index("OutKnows")
  .onType(Person.class)
  .direction(Direction.OUT)
  .label("knows")
  .buildCompositeIndex()

Batch Operations

Batch property updates

connector.batch_update_vertices(
    vertices=['v1', 'v2', 'v3'],
    properties={'status': 'processed'}
)

Bulk insert

vertices = [
    {'label': 'Person', 'properties': {'name': 'Alice', 'age': 30}},
    {'label': 'Person', 'properties': {'name': 'Bob', 'age': 25}},
]
connector.batch_create_vertices(vertices)

Result Mapping

Vertex mapping

class Vertex:
    id: str
    label: str
    properties: Dict[str, Any]

Edge mapping

class Edge:
    id: str
    label: str
    from_id: str
    to_id: str
    properties: Dict[str, Any]

Path mapping

class Path:
    vertices: List[Vertex]
    edges: List[Edge]
    length: int

Error Handling

Common Error Scenarios

ErrorCauseSolution
Connection refusedJanusGraph server not runningStart JanusGraph server
Query syntax errorInvalid Gremlin syntaxValidate query syntax
Timeout exceptionQuery too slowAdd indexes, limit traversal depth
Property not foundIncorrect property nameVerify property exists
Vertex not foundID doesn't existCheck vertex exists before operation
Transaction conflictConcurrent modificationSimplify or retry transaction
Index not foundIndex name incorrectCreate index or fix name

Error Handling Best Practices

  1. Validate Connections - Check connection health before operations
  2. Use Try-Catch - Wrap operations in error handlers
  3. Retry Logic - Implement exponential backoff for transient failures
  4. Logging - Log all errors for debugging
  5. Graceful Degradation - Handle missing data gracefully

Best Practices

1. Connection Management

✅ Reuse connections via connection pooling
✅ Close connections properly when done
✅ Set appropriate timeouts
✅ Monitor connection health

2. Query Optimization

✅ Use indexes on filtered properties
✅ Avoid unbounded traversals
✅ Limit result sets explicitly
✅ Use parameterized queries

3. Data Management

✅ Use meaningful labels and property names
✅ Maintain referential integrity
✅ Batch operations for bulk loads
✅ Clean up temporary data

4. Transaction Handling

✅ Keep transactions short and focused
✅ Commit frequently for better concurrency
✅ Handle rollback scenarios
✅ Use appropriate isolation levels

5. Performance

✅ Create indexes on high-cardinality properties
✅ Monitor query execution time
✅ Use vertex-centric indexes for edge traversals
✅ Limit traversal depth in long-running queries

6. Scalability

✅ Distribute data across multiple servers
✅ Use appropriate backend storage (Cassandra for large scale)
✅ Partition data by domain when possible
✅ Monitor resource utilization

7. Security

✅ Authenticate connections properly
✅ Encrypt sensitive data
✅ Use prepared statements/parameter binding
✅ Apply principle of least privilege

8. Maintenance

✅ Regularly backup graph data
✅ Monitor index efficiency
✅ Clean up unused vertices/edges
✅ Monitor transaction logs


Integration with Related Skills

Neo4j Integration

  • Alternative property graph database using Cypher
  • Use Neo4j for strong ACID transactions
  • Use JanusGraph for distributed scale

GraphQL Graph Mapping

  • Expose JanusGraph via GraphQL API
  • Automatic schema generation from graph structure

Graph Query Optimization

  • Optimize Gremlin queries for performance
  • Analyze query execution plans

CSV Graph Loader

  • Bulk import CSV data into JanusGraph
  • Transform CSV to graph structure

REST API Wrapper

  • Expose JanusGraph as REST API
  • Create custom endpoints for common queries

Graph Constraint Generator

  • Define constraints on vertices and edges
  • Enforce data integrity rules

Libraries & Dependencies

Core Libraries

LibraryPurpose
gremlin-pythonGremlin language bindings for Python
python-websocketWebSocket client for Gremlin Server
pydanticData validation and typing

Optional Libraries

LibraryPurpose
pandasData transformation and analysis
networkxAdditional graph analysis
tinkerpop-coreTinkerPop framework (for embedding)

Installation

pip install gremlin-python pydantic

Expected Benefits

Using this skill enables:

Scalability - Manage graphs at enterprise scale
Flexibility - Multiple backend storage options
Performance - Optimized graph traversals
ACID Compliance - Reliable transactions
Distributed Deployment - High availability
Advanced Analytics - Complex graph algorithms
Vendor Independence - TinkerPop abstraction layer


Quick Reference

Connection & Session Management

connector = JanusGraphConnector()
connector.connect(config)
result = connector.execute_query(query)
connector.close()

Common Queries

# Get all vertices of a type
g.V().hasLabel('Person')

# Find specific vertex
g.V().has('name', 'Alice')

# Get neighbors
g.V().has('name', 'Alice').out('KNOWS')

# Create vertex
g.addV('Person').property('name', 'Alice')

# Create edge
g.V().has('name', 'Alice').addE('KNOWS').to(...)

Indexes

connector.create_index(
    name='PersonName',
    properties=['name'],
    index_type='composite'
)

Transactions

connector.begin_transaction()
# ... operations ...
connector.commit_transaction()

Related Skills

  • Neo4j Integration - Property graph database using Cypher
  • GraphQL Graph Mapping - GraphQL API for graphs
  • Graph Query Optimization - Query performance tuning
  • CSV Graph Loader - Bulk data import
  • REST API Wrapper - REST interface for graphs
  • RDF Triple Store Integration - RDF/OWL graph support
  • Graph Constraint Generator - Constraint management

Resources


Version: 1.0.0
Last Updated: April 12, 2026