Langgraph Architecture

v1.0.1

Guides architectural decisions for LangGraph applications. Use when deciding between LangGraph vs alternatives, choosing state management strategies, designi...

by Kevin Anderson (@anderskev)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for anderskev/langgraph-architecture.

Install the skill "Langgraph Architecture" (anderskev/langgraph-architecture) from ClawHub.
Skill page: https://clawhub.ai/anderskev/langgraph-architecture
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install langgraph-architecture

ClawHub CLI


npx clawhub@latest install langgraph-architecture

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability

The skill's name and description match the SKILL.md content: architectural guidance for LangGraph (state, routing, persistence, multi-agent patterns). Nothing requested (no env vars, no binaries, no config paths) is disproportionate to that purpose.

Instruction Scope

SKILL.md contains recommendations and example code snippets (Python types, reducers, checkpointer choices, streaming modes). The instructions do not direct the agent to read local files, access secrets, call external endpoints, or perform actions beyond giving guidance and sample code.

Install Mechanism

No install spec and no code files. The skill is instruction-only, so nothing will be written to disk or fetched during install.

Credentials

The skill declares no required environment variables, no primary credential, and no config paths. This is proportionate for a documentation/architecture guidance skill.

Persistence & Privilege

always is false and there are no signs the skill modifies agent configs or requests persistent presence. The default ability for the agent to invoke the skill autonomously is normal and not a concern here given the skill's benign footprint.

Assessment

This skill appears to be pure guidance and low-risk: it asks for no credentials and installs nothing. You can safely use it for design advice, but treat the code examples as templates: review and test them before copying into production. Also prefer skills from known sources if you need long-term maintenance or security guarantees.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97ekgqm627wjwh13pr908amhd85b92f
179 downloads · 0 stars · 2 versions
Updated 6d ago
MIT-0

LangGraph Architecture Decisions

When to Use LangGraph

Use LangGraph When You Need:

  • Stateful conversations - Multi-turn interactions with memory
  • Human-in-the-loop - Approval gates, corrections, interventions
  • Complex control flow - Loops, branches, conditional routing
  • Multi-agent coordination - Multiple LLMs working together
  • Persistence - Resume from checkpoints, time travel debugging
  • Streaming - Real-time token streaming, progress updates
  • Reliability - Retries, error recovery, durability guarantees

Consider Alternatives When:

| Scenario | Alternative | Why |
| --- | --- | --- |
| Single LLM call | Direct API call | Overhead not justified |
| Linear pipeline | LangChain LCEL | Simpler abstraction |
| Stateless tool use | Function calling | No persistence needed |
| Simple RAG | LangChain retrievers | Built-in patterns |
| Batch processing | Async tasks | Different execution model |

State Schema Decisions

TypedDict vs Pydantic

| TypedDict | Pydantic |
| --- | --- |
| Lightweight, faster | Runtime validation |
| Dict-like access | Attribute access |
| No validation overhead | Type coercion |
| Simpler serialization | Complex nested models |

Recommendation: Use TypedDict for most cases. Use Pydantic when you need validation or complex nested structures.
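
To make the trade-off concrete, the sketch below shows that a TypedDict is a plain dict at runtime, so a wrongly typed value is accepted silently. The `validate_state` helper is hypothetical and only stands in for the kind of check a validating model such as Pydantic would perform for you.

```python
from typing import TypedDict, get_type_hints


class State(TypedDict):
    query: str
    attempts: int


def validate_state(state: dict, schema: type) -> list:
    """Hypothetical helper: return one message per missing or wrongly typed field."""
    errors = []
    for name, expected in get_type_hints(schema).items():
        if name not in state:
            errors.append(f"{name}: missing")
        elif not isinstance(state[name], expected):
            actual = type(state[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {actual}")
    return errors


# No error is raised here: TypedDict performs no runtime validation.
bad = State(query="hello", attempts="three")
problems = validate_state(bad, State)
print(type(bad).__name__, problems)
```

If a silent failure like this would be costly in your graph, that is a signal to pay the validation overhead.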

Reducer Selection

| Use Case | Reducer | Example |
| --- | --- | --- |
| Chat messages | add_messages | Handles IDs, RemoveMessage |
| Simple append | operator.add | Annotated[list, operator.add] |
| Keep latest | None (LastValue) | field: str |
| Custom merge | Lambda | Annotated[list, lambda a, b: ...] |
| Overwrite list | Overwrite | Bypass reducer |
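
The sketch below illustrates, in plain Python, how reducers change the way a node's update is merged into existing state. `apply_update` and `merge_unique` are hypothetical names for this illustration; this is a toy stand-in for the merge step, not LangGraph's implementation.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


def merge_unique(old: list, new: list) -> list:
    """Custom reducer: append only items not already present, keeping order."""
    return old + [item for item in new if item not in old]


class State(TypedDict):
    log: Annotated[list, operator.add]    # simple append
    tags: Annotated[list, merge_unique]   # custom merge
    status: str                           # no reducer: latest value wins


def apply_update(state: dict, update: dict, schema: type) -> dict:
    """Toy merge step: use the field's reducer if it has one, else overwrite."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        merged[key] = reducer(state[key], value) if reducer and key in state else value
    return merged


state = {"log": ["start"], "tags": ["a"], "status": "running"}
update = {"log": ["step 1"], "tags": ["a", "b"], "status": "done"}
state = apply_update(state, update, State)
print(state)
```

Fields with a reducer accumulate, while the unannotated `status` field is simply replaced.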

State Size Considerations

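from typing import Annotated, TypedDict

from langgraph.graph.message import add_messages
from langgraph.store.base import BaseStore

# SMALL STATE (< 1MB) - Put in state
class State(TypedDict):
    messages: Annotated[list, add_messages]
    context: str

# LARGE DATA - Use Store (alternative definition of State)
class State(TypedDict):
    messages: Annotated[list, add_messages]
    document_ref: str  # Key of the document in the store

NAMESPACE = ("documents",)  # Example namespace; items are addressed by (namespace, key)

def node(state: State, *, store: BaseStore):
    item = store.get(NAMESPACE, state["document_ref"])  # Item, or None if missing
    doc = item.value if item else None
    # Process doc here without bloating checkpoints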
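
As a rough illustration of why references keep checkpoints small, the sketch below compares the serialized size of state that inlines a document with state that holds only a key. The dict-based `store` is a hypothetical stand-in for a real Store, and JSON length is only an approximation of checkpoint size.

```python
import json

# Hypothetical stand-in for a Store: a dict keyed by (namespace, key).
store = {}

document = "lorem ipsum " * 50_000   # 600,000 characters of text
namespace = ("documents",)
store[(namespace, "doc-1")] = document

inline_state = {"messages": [], "document": document}
ref_state = {"messages": [], "document_ref": "doc-1"}

# Approximate what each checkpoint would have to serialize at every step.
inline_bytes = len(json.dumps(inline_state).encode())
ref_bytes = len(json.dumps(ref_state).encode())
print(inline_bytes, ref_bytes)

# A node resolves the reference only when it needs the payload.
loaded = store[(namespace, ref_state["document_ref"])]
```

Because state is written at each checkpoint, the inlined payload is paid for repeatedly while the reference is paid for once.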

Graph Structure Decisions

Single Graph vs Subgraphs

Single Graph when:

  • All nodes share the same state schema
  • Simple linear or branching flow
  • < 10 nodes

Subgraphs when:

  • Different state schemas needed
  • Reusable components across graphs
  • Team separation of concerns
  • Complex hierarchical workflows
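
When the parent and subgraph use different state schemas, one common approach is to call the subgraph from inside a parent node and translate state in both directions. The sketch below shows that mapping in plain Python; `research_subgraph` is a hypothetical stand-in for a compiled subgraph's `invoke`, and the field names are illustrative.

```python
from typing import List, TypedDict


class ParentState(TypedDict):
    question: str
    answer: str


class ResearchState(TypedDict):
    topic: str
    findings: List[str]


def research_subgraph(state: ResearchState) -> ResearchState:
    """Hypothetical stand-in for invoking a compiled subgraph."""
    return {"topic": state["topic"], "findings": [f"note about {state['topic']}"]}


def research_node(state: ParentState) -> dict:
    """Parent node: map parent state in, run the subgraph, map the result back out."""
    result = research_subgraph({"topic": state["question"], "findings": []})
    return {"answer": "; ".join(result["findings"])}


update = research_node({"question": "checkpointing", "answer": ""})
print(update)
```

The parent never sees `topic` or `findings`, which keeps the two schemas free to evolve separately.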

Conditional Edges vs Command

| Conditional Edges | Command |
| --- | --- |
| Routing based on state | Routing + state update |
| Separate router function | Decision in node |
| Clearer visualization | More flexible |
| Standard patterns | Dynamic destinations |

from typing import Literal

from langgraph.types import Command

# Conditional Edge - when routing is the focus
def router(state) -> Literal["a", "b"]:
    return "a" if state["condition"] else "b"  # "condition" is an example state field
builder.add_conditional_edges("node", router)

# Command - when combining routing with updates
def node(state) -> Command[Literal["next"]]:
    return Command(goto="next", update={"step": state["step"] + 1})

Static vs Dynamic Routing

Static Edges (add_edge):

  • Fixed flow known at build time
  • Clearer graph visualization
  • Easier to reason about

Dynamic Routing (add_conditional_edges, Command, Send):

  • Runtime decisions based on state
  • Agent-driven navigation
  • Fan-out patterns
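
The difference between the two routing styles can be sketched with a toy executor. Everything here (`run`, `static_edges`, `routers`, the node names) is hypothetical and written in plain Python; it is not LangGraph's execution model, only an illustration of fixed successors versus successors computed from state.

```python
from typing import Callable, Dict, List, Tuple

nodes: Dict[str, Callable[[dict], dict]] = {
    "classify": lambda s: {"kind": "math" if any(c.isdigit() for c in s["text"]) else "chat"},
    "math": lambda s: {"reply": "calculating"},
    "chat": lambda s: {"reply": "chatting"},
}

# Static edges: the successor is fixed when the graph is built.
static_edges = {"math": "END", "chat": "END"}

# Dynamic routing: the successor is computed from state at run time.
routers: Dict[str, Callable[[dict], str]] = {"classify": lambda s: s["kind"]}


def run(start: str, state: dict) -> Tuple[dict, List[str]]:
    """Toy executor: use a router if the node has one, else follow its static edge."""
    current, path = start, []
    while current != "END":
        path.append(current)
        state = {**state, **nodes[current](state)}
        current = routers[current](state) if current in routers else static_edges[current]
    return state, path


final, path = run("classify", {"text": "what is 2 + 2"})
print(path, final["reply"])
```

The static part of the path can be read off `static_edges` without running anything, which is what makes static graphs easier to visualize and reason about.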

Persistence Strategy

Checkpointer Selection

| Checkpointer | Use Case | Characteristics |
| --- | --- | --- |
| InMemorySaver | Testing only | Lost on restart |
| SqliteSaver | Development | Single file, local |
| PostgresSaver | Production | Scalable, concurrent |
| Custom | Special needs | Implement BaseCheckpointSaver |
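
To show what a checkpointer does at its core, here is a deliberately tiny sketch that saves and loads state per thread using the standard library's `sqlite3`. `TinyCheckpointer` is a hypothetical class for illustration; it does not implement BaseCheckpointSaver and, unlike real savers, keeps only the latest state rather than a checkpoint history.

```python
import json
import sqlite3
from typing import Optional


class TinyCheckpointer:
    """Illustrative only: stores the latest state per thread_id as JSON."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints (thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def put(self, thread_id: str, state: dict) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def get(self, thread_id: str) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


saver = TinyCheckpointer(sqlite3.connect(":memory:"))
saver.put("thread-1", {"step": 1})
saver.put("thread-1", {"step": 2})
print(saver.get("thread-1"), saver.get("thread-2"))
```

An in-memory database like this one is lost on exit, matching the "testing only" row; pointing the connection at a file gives the single-file, local behavior of the development row.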

Checkpointing Scope

# Full persistence (default)
graph = builder.compile(checkpointer=checkpointer)

# Subgraph options (pass exactly one value for checkpointer)
subgraph = sub_builder.compile(
    checkpointer=None,     # Default: inherit from parent
    # checkpointer=True,   # Independent checkpointing
    # checkpointer=False,  # No checkpointing (runs atomically)
)

When to Disable Checkpointing

  • Short-lived subgraphs that should be atomic
  • Subgraphs with incompatible state schemas
  • Performance-critical paths without need for resume

Multi-Agent Architecture

Supervisor Pattern

Best for:

  • Clear hierarchy
  • Centralized decision making
  • Different agent specializations
          ┌─────────────┐
          │  Supervisor │
          └──────┬──────┘
    ┌────────┬───┴───┬────────┐
    ▼        ▼       ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Agent1│ │Agent2│ │Agent3│ │Agent4│
└──────┘ └──────┘ └──────┘ └──────┘
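
The supervisor loop in the diagram can be sketched in plain Python. The worker names and the rule-based `supervisor` function are hypothetical; in a real system the supervisor would typically be an LLM choosing among agents, and each worker would be a graph node.

```python
from typing import Callable, Dict

# Hypothetical worker agents; real ones would call an LLM or tools.
workers: Dict[str, Callable[[str], str]] = {
    "researcher": lambda task: f"facts for '{task}'",
    "writer": lambda task: f"draft for '{task}'",
}


def supervisor(state: dict) -> str:
    """Centralized decision: pick the next worker, or FINISH when all have reported."""
    for name in workers:
        if name not in state["results"]:
            return name
    return "FINISH"


def run(task: str) -> dict:
    state = {"task": task, "results": {}}
    while (choice := supervisor(state)) != "FINISH":
        # Control always returns to the supervisor after a worker finishes.
        state["results"][choice] = workers[choice](state["task"])
    return state


final = run("LangGraph persistence")
print(list(final["results"]))
```

Workers never call each other directly, which is the property that distinguishes this pattern from peer-to-peer coordination.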

Peer-to-Peer Pattern

Best for:

  • Collaborative agents
  • No clear hierarchy
  • Flexible communication
┌──────┐     ┌──────┐
│Agent1│◄───►│Agent2│
└──┬───┘     └───┬──┘
   │             │
   ▼             ▼
┌──────┐     ┌──────┐
│Agent3│◄───►│Agent4│
└──────┘     └──────┘

Handoff Pattern

Best for:

  • Sequential specialization
  • Clear stage transitions
  • Different capabilities per stage
┌────────┐    ┌────────┐    ┌────────┐
│Research│───►│Planning│───►│Execute │
└────────┘    └────────┘    └────────┘
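
A handoff chain like the one above can be sketched with stages that each return a state update together with the name of the next stage, loosely mirroring a node that returns both an update and a destination. The stage functions and the `run` loop are hypothetical plain-Python stand-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each stage returns (state update, next stage name or None to stop).
StageResult = Tuple[dict, Optional[str]]


def research(state: dict) -> StageResult:
    return {"notes": f"notes on {state['goal']}"}, "planning"


def planning(state: dict) -> StageResult:
    return {"plan": f"plan from {state['notes']}"}, "execute"


def execute(state: dict) -> StageResult:
    return {"done": True}, None


stages: Dict[str, Callable[[dict], StageResult]] = {
    "research": research,
    "planning": planning,
    "execute": execute,
}


def run(goal: str) -> Tuple[dict, List[str]]:
    state, current, visited = {"goal": goal}, "research", []
    while current is not None:
        visited.append(current)
        update, current = stages[current](state)
        state = {**state, **update}
    return state, visited


final, visited = run("ship feature")
print(visited, final["plan"])
```

Each stage decides its own successor, so there is no central coordinator to consult between steps.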

Streaming Strategy

Stream Mode Selection

| Mode | Use Case | Data |
| --- | --- | --- |
| updates | UI updates | Node outputs only |
| values | State inspection | Full state each step |
| messages | Chat UX | LLM tokens |
| custom | Progress/logs | Your data via StreamWriter |
| debug | Debugging | Tasks + checkpoints |
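
The practical difference between `updates` and `values` is the shape of each chunk. The toy generator below imitates those two shapes in plain Python; the `stream` function and step names are hypothetical, and it is not a substitute for LangGraph's actual streaming behavior.

```python
from typing import Callable, Iterator, List, Tuple

steps: List[Tuple[str, Callable[[dict], dict]]] = [
    ("draft", lambda s: {"text": "hello"}),
    ("polish", lambda s: {"text": s["text"].upper(), "done": True}),
]


def stream(state: dict, mode: str) -> Iterator[dict]:
    """Toy generator imitating the chunk shape of two stream modes."""
    for name, node in steps:
        update = node(state)
        state = {**state, **update}
        if mode == "updates":
            yield {name: update}   # only what this node changed, keyed by node name
        elif mode == "values":
            yield dict(state)      # full state after this step
        else:
            raise ValueError(f"unsupported mode: {mode}")


updates = list(stream({"user": "ada"}, "updates"))
values = list(stream({"user": "ada"}, "values"))
print(updates)
print(values)
```

The `updates` chunks stay small as state grows, which is why that mode suits UI refreshes, while `values` repeats the whole state and suits inspection.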

Subgraph Streaming

# Stream from subgraphs
async for chunk in graph.astream(
    input,
    stream_mode="updates",
    subgraphs=True  # Include subgraph events
):
    namespace, data = chunk  # namespace indicates depth

Human-in-the-Loop Design

Interrupt Placement

| Strategy | Use Case |
| --- | --- |
| interrupt_before | Approval before action |
| interrupt_after | Review after completion |
| interrupt() in node | Dynamic, contextual pauses |

Resume Patterns

# Simple resume (same thread)
graph.invoke(None, config)

# Resume with value
graph.invoke(Command(resume="approved"), config)

# Resume specific interrupt
graph.invoke(Command(resume={interrupt_id: value}), config)

# Modify state and resume
graph.update_state(config, {"field": "new_value"})
graph.invoke(None, config)
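
One design consequence worth seeing in code: when a node pauses with interrupt() and is later resumed, the node runs again from its beginning, so any work placed before the pause is repeated. The sketch below models that with a plain exception; `Interrupt`, `approval_node`, and `invoke` are hypothetical stand-ins, not LangGraph's API.

```python
class Interrupt(Exception):
    """Raised by a node to pause and surface a question to a human."""


side_effects = []


def approval_node(state: dict, resume_value=None) -> dict:
    side_effects.append("prepared request")   # runs again on resume
    if resume_value is None:
        raise Interrupt(f"Approve '{state['action']}'?")
    return {"approved": resume_value == "approved"}


def invoke(state: dict, resume_value=None) -> dict:
    """Toy runner: on interrupt, report the question instead of a state update."""
    try:
        return approval_node(state, resume_value)
    except Interrupt as pause:
        return {"__interrupt__": str(pause)}


state = {"action": "delete records"}
first = invoke(state)                # pauses and asks for approval
second = invoke(state, "approved")   # node re-runs from the top with the resume value
print(first, second, side_effects)
```

Because the code before the pause ran twice, it is safest to keep non-idempotent work (payments, emails, writes) after the interrupt or in a separate node.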

Gates (sequenced)

Complete in order before treating a LangGraph design as locked in. Each step has an objective pass condition (artifact or explicit “none”), not an honor-system “we considered it.”

  1. Alternatives. Pass: For the workload, either (a) at least one row from Consider Alternatives When was evaluated and rejected with a one-line reason, or (b) the use case clearly matches Use LangGraph When You Need and does not fit a “consider alternative” row.
  2. State contract. Pass: Every state field has an assigned reducer (or default/LastValue) documented in the same place as the schema; large payloads are references or Store-backed, not inlined blobs (see State Size Considerations).
  3. Checkpointer. Pass: The saver type is chosen for the target environment per Checkpointer Selection (e.g. production is not InMemorySaver unless explicitly test-only).
  4. Loops and flaky nodes. Pass: recursion_limit (or equivalent) is set for any graph that can cycle; per-node RetryPolicy or a documented “no retries” choice exists for external calls (see Retry Configuration).

Error Handling Strategy

Retry Configuration

# Per-node retry
RetryPolicy(
    initial_interval=0.5,
    backoff_factor=2.0,
    max_interval=60.0,
    max_attempts=3,
    retry_on=lambda e: isinstance(e, (APIError, TimeoutError))
)

# Multiple policies (first match wins)
builder.add_node("node", fn, retry_policy=[
    RetryPolicy(retry_on=RateLimitError, max_attempts=5),
    RetryPolicy(retry_on=Exception, max_attempts=2),
])
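
To see what these parameters imply, the sketch below computes the waits between attempts for an exponential backoff with a cap, and shows first-match-wins policy selection. It assumes `max_attempts` counts the first attempt and ignores jitter; `backoff_schedule` and `pick_policy` are hypothetical helpers, not part of LangGraph.

```python
from typing import List, Tuple


def backoff_schedule(
    initial_interval: float, backoff_factor: float, max_interval: float, max_attempts: int
) -> List[float]:
    """Waits between attempts (no jitter); nothing is waited after the final attempt."""
    return [
        min(max_interval, initial_interval * backoff_factor**n)
        for n in range(max_attempts - 1)
    ]


def pick_policy(error: Exception, policies: List[Tuple[type, int]]) -> int:
    """First match wins: return max_attempts of the first policy whose type matches."""
    for error_type, max_attempts in policies:
        if isinstance(error, error_type):
            return max_attempts
    return 1  # no policy matched: a single attempt, no retries


waits = backoff_schedule(0.5, 2.0, 60.0, 3)
long_waits = backoff_schedule(0.5, 2.0, 60.0, 10)
policies = [(TimeoutError, 5), (Exception, 2)]
print(waits, max(long_waits))
print(pick_policy(TimeoutError(), policies), pick_policy(KeyError(), policies))
```

Ordering matters in the policy list: putting the broad `Exception` entry first would shadow every more specific policy after it.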

Fallback Patterns

from typing import Literal

from langgraph.graph import END  # END is the string "__end__"

def node_with_fallback(state):
    try:
        return primary_operation(state)
    except PrimaryError:
        return fallback_operation(state)

# Or use conditional edges for complex fallback routing
def route_on_error(state) -> Literal["retry", "fallback", "__end__"]:
    if state.get("error") and state["attempts"] < 3:
        return "retry"
    if state.get("error"):
        return "fallback"
    return END
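
A routing function like this is easy to check in isolation before wiring it into a graph. The sketch below exercises the three branches; it defines `END` locally as a stand-in so that it runs without LangGraph installed, and defaulting a missing `attempts` to 0 is an assumption of this sketch.

```python
from typing import Literal

END = "__end__"  # local stand-in for langgraph.graph.END


def route_on_error(state: dict) -> Literal["retry", "fallback", "__end__"]:
    if state.get("error") and state.get("attempts", 0) < 3:
        return "retry"
    if state.get("error"):
        return "fallback"
    return END


decisions = [
    route_on_error({"error": "timeout", "attempts": 1}),   # under budget: retry
    route_on_error({"error": "timeout", "attempts": 3}),   # budget spent: fallback
    route_on_error({"error": None, "attempts": 1}),        # no error: finish
]
print(decisions)
```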

Scaling Considerations

Horizontal Scaling

  • Use PostgresSaver for shared state
  • Consider LangGraph Platform for managed infrastructure
  • Use stores for large data outside checkpoints

Performance Optimization

  1. Minimize state size - Use references for large data
  2. Parallel nodes - Fan out when possible
  3. Cache expensive operations - Use CachePolicy
  4. Async everywhere - Use ainvoke, astream

Resource Limits

from typing import TypedDict

from langgraph.managed import RemainingSteps

# Set recursion limit
config = {"recursion_limit": 50}
graph.invoke(input, config)

# Track remaining steps in state
class State(TypedDict):
    remaining_steps: RemainingSteps

def check_budget(state):
    if state["remaining_steps"] < 5:
        return "wrap_up"
    return "continue"
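
The budget check can be traced with a toy loop. Here `remaining_steps` is computed by hand from the limit to imitate a managed value; the `run` function is a hypothetical plain-Python stand-in rather than how LangGraph counts steps internally.

```python
from typing import List


def check_budget(state: dict) -> str:
    return "wrap_up" if state["remaining_steps"] < 5 else "continue"


def run(recursion_limit: int) -> List[str]:
    """Toy loop: stop cycling once the remaining budget drops below the threshold."""
    decisions = []
    for step in range(recursion_limit):
        state = {"remaining_steps": recursion_limit - step}
        decision = check_budget(state)
        decisions.append(decision)
        if decision == "wrap_up":
            break  # hand off to a summarizing node instead of hitting the limit
    return decisions


decisions = run(recursion_limit=8)
print(decisions)
```

Wrapping up a few steps early leaves room for the final nodes to run, rather than having the graph fail at the limit with partial work.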

Decision Checklist

After Gates (sequenced), before implementing:

  1. Is LangGraph the right tool? (vs simpler alternatives)
  2. State schema defined with appropriate reducers?
  3. Persistence strategy chosen? (dev vs prod checkpointer)
  4. Streaming needs identified?
  5. Human-in-the-loop points defined?
  6. Error handling and retry strategy?
  7. Multi-agent coordination pattern? (if applicable)
  8. Resource limits configured?
