Install
openclaw skills install ai-friendly-architecture-design
Use when system needs to handle AI uncertainty, Agent types must be selected, APIs will be consumed by AI, or architecture must support probabilistic outputs and dynamic planning
AI Friendly architecture enables traditional systems to handle AI's inherent uncertainty through three paradigm shifts: deterministic→probabilistic, structured→semantic, and static→dynamic. This skill guides agents to apply these principles correctly and avoid common anti-patterns.
Core principle: Use appropriate architecture for the problem—don't over-engineer with AI when traditional solutions suffice.
Use when:
Do NOT use when:
Traditional: Output follows y=f(x) mapping—binary success/failure.
AI Friendly: Output emerges from model + prompt + context + environment. Design goal: converge probabilistic output to an acceptable "safe interval" through RAG, prompt engineering, and evaluation mechanisms.
Design implication: Don't expect exact schema compliance from AI outputs. Build validation and fallback mechanisms.
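The validation-and-fallback idea can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `label`/`confidence` schema, the `model` callable, and the retry count are all assumptions made for the example.

```python
import json
from typing import Callable, Optional

# Hypothetical output schema for this example: field name -> expected type.
REQUIRED_FIELDS = {"label": str, "confidence": float}


def parse_model_output(raw: str) -> Optional[dict]:
    """Return the parsed payload if it matches the expected shape, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            return None
    if not 0.0 <= data["confidence"] <= 1.0:
        return None
    return data


def call_with_fallback(
    model: Callable[[str], str],
    prompt: str,
    fallback: dict,
    max_attempts: int = 3,
) -> dict:
    """Retry until the output validates, then fall back to a safe default."""
    for _ in range(max_attempts):
        result = parse_model_output(model(prompt))
        if result is not None:
            return result
    return fallback
```

The caller always receives a payload of the expected shape: either a validated model answer or the explicit fallback, never raw unvalidated text.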
Traditional: Input must match predefined Schema exactly (JSON field types). System boundary is a rigid wall.
AI Friendly: System understands natural language and unstructured data. Responds based on intent, not format. System boundary becomes an elastic membrane.
Design implication: Design interfaces that accept flexible inputs and translate intent to actions.
Traditional: Execution paths defined by hardcoded if-else logic or rules. Behavior is enumerable and verifiable.
AI Friendly: System makes decisions based on models, can reason about current state, decompose tasks, and respond to unknown changes without human intervention.
Design implication: Shift from "rules" to "planning"—grant systems autonomy for intelligent task orchestration.
┌─────────────────────────────────────────────────────────────┐
│ Quality & Stability Layer │
│ (AI Observability, Evaluation, Security) │
├─────────────────────────────────────────────────────────────┤
│ Application Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────────────────┐ │
│ │ Agent │ │ Intent │ │ Session │ │
│ │ Layer │ │ Layer │ │ Layer │ │
│ └──────────┘ └──────────┘ └──────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Capability Layer │
│ (MCP, RAG, Function Calling) │
├─────────────────────────────────────────────────────────────┤
│ Foundation Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────────────────┐ │
│ │ Model │ │ Knowledge│ │ Tool Management │ │
│ │Management│ │Management│ │ │ │
│ └──────────┘ └──────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Provider Selection:
| Factor | Consideration |
|---|---|
| Latency requirements | Regional providers vs global (OpenAI, Anthropic) |
| Cost | Per-token pricing, batch discounts, small model for simple tasks |
| Data privacy | On-premise (Ollama, vLLM) vs cloud API |
| Capability | Task-specific: code (Codex), vision (GPT-4V), reasoning (Claude) |
Failover Strategy:
Cost Optimization:
Reference: The ReAct pattern is from the paper ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022).
Three Agent types for different scenarios (this is one common taxonomy—other frameworks may use different classifications):
| Agent Type | Capability | Use Case |
|---|---|---|
| BaseAgent | Fixed workflow, no dynamic planning | Simple chatbots, AI Workflows |
| ReActAgent | Thought→Action→Observation loop | Reasoning tasks with tool use |
| PlanAgent | Global planning + ReAct execution | Complex tasks requiring strategy |
Note: "PlanAgent" is a common architectural pattern but not a standardized term—different frameworks may name it differently.
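To make the Thought→Action→Observation loop concrete, here is a minimal sketch. The `decide` callable stands in for the model call, and `scripted_decide` and `lookup_price` are hypothetical placeholders so the example runs without a model; none of these names come from a specific framework.

```python
from typing import Callable, Dict, List, Tuple

# A step is either ("act", tool_name, tool_input) or ("finish", answer, "").
Step = Tuple[str, str, str]


def react_loop(
    decide: Callable[[str, List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Alternate between choosing an action and observing its result."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, name, arg = decide(task, observations)  # Thought -> Action
        if kind == "finish":
            return name  # for "finish" steps the second slot holds the answer
        tool = tools.get(name)
        if tool is None:
            observations.append(f"error: unknown tool {name}")
        else:
            observations.append(tool(arg))  # Observation fed into the next step
    return "stopped: step limit reached"


def scripted_decide(task: str, observations: List[str]) -> Step:
    """Stand-in for a model: look up a price once, then answer."""
    if not observations:
        return ("act", "lookup_price", "sku-123")
    return ("finish", f"The price is {observations[-1]}", "")


tools = {"lookup_price": lambda sku: "9.99"}
print(react_loop(scripted_decide, tools, "What does sku-123 cost?"))
```

The step limit matters in practice: without it, a model that never emits a finish step would loop indefinitely.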
ReAct + Plan Combination:
Required only for multi-intent scenarios. Handles:
Also performs query rewriting and expansion for intent optimization.
Manages conversation state and user context. Feeds data into Context Engineering.
Responsibilities:
What it does NOT do:
Session Layer provides the data; Context Engineering decides how to use it.
Three coordination patterns:
| Pattern | Decision Point | Use Case |
|---|---|---|
| Centralized | Single coordinator agent | Clear task decomposition |
| Decentralized | Peer-to-peer negotiation | Collaborative problem-solving |
| Hybrid | Mixed coordination | Complex domains with sub-domains |
MOE (Mixture of Experts) Pattern:
Split interfaces into atomic tools matching Agent reasoning patterns:
❌ getProductWithInventoryAndPricing(id)
✅ getProduct(id)
✅ getInventory(productId)
✅ getPricing(productId)
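A minimal sketch of the atomic split, assuming in-memory dictionaries in place of real services. The tool names mirror the list above; the data and the `call_tool` helper are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical in-memory data standing in for real backend services.
PRODUCTS = {"123": {"id": "123", "name": "Widget"}}
INVENTORY = {"123": 42}
PRICING = {"123": 9.99}


def get_product(product_id: str) -> dict:
    return PRODUCTS[product_id]


def get_inventory(product_id: str) -> int:
    return INVENTORY[product_id]


def get_pricing(product_id: str) -> float:
    return PRICING[product_id]


# Each tool answers exactly one question, so an Agent can call only
# what the current reasoning step needs.
TOOLS: Dict[str, Callable[[str], object]] = {
    "getProduct": get_product,
    "getInventory": get_inventory,
    "getPricing": get_pricing,
}


def call_tool(name: str, product_id: str) -> object:
    """Dispatch a tool call by name, rejecting names that are not registered."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](product_id)
```

An Agent answering "is it in stock?" calls only `getInventory`, while a pricing question calls only `getPricing`. Each tool can also be tested in isolation.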
Human-readable names, flat KV structure, core parameters only:
// ❌ Bad: Nested, complex
{"product": {"identifiers": {"sku_id": "123"}, "filters": {"status": "active"}}}
// ✅ Good: Flat, clear
{"sku_id": "123", "status": "active"}
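One way to enforce the flat shape is to check tool parameters before dispatch. The sketch below is an assumption about how such a check might look; `validate_flat_params` and the allowed-key set are illustrative names, not part of any framework.

```python
from typing import Any, Dict, List, Set

SCALAR_TYPES = (str, int, float, bool)


def validate_flat_params(params: Dict[str, Any], allowed: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems: List[str] = []
    for key, value in params.items():
        if key not in allowed:
            problems.append(f"unexpected parameter: {key}")
        elif not isinstance(value, SCALAR_TYPES):
            problems.append(f"{key} must be a scalar, got {type(value).__name__}")
    return problems


allowed_keys = {"sku_id", "status"}
flat = {"sku_id": "123", "status": "active"}
nested = {"product": {"identifiers": {"sku_id": "123"}}}
print(validate_flat_params(flat, allowed_keys))
print(validate_flat_params(nested, allowed_keys))
```

Returning the problems as text, rather than raising immediately, lets the caller feed them back to the Agent as an observation so it can correct the call.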
Context Engineering selects, organizes, and compresses information optimally within context windows. It is more impactful than model selection for production systems.
*Based on production AI Review system results. See Real-World Impact section.
Store past successful cases with their decision reasoning. Retrieve similar cases via vector search to guide current decisions.
*Based on production CogentAI system results. See Real-World Impact section.
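The case-library retrieval step can be sketched with plain cosine similarity. This assumes each case already carries an embedding produced elsewhere; the `Case` fields and `retrieve_similar` are hypothetical names, and a production system would more likely use a vector database than a linear scan.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Case:
    summary: str                 # what happened
    reasoning: str               # why the decision was made
    embedding: Sequence[float]   # assumed to come from an embedding model


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_similar(
    query_embedding: Sequence[float], library: List[Case], top_k: int = 2
) -> List[Case]:
    """Return the top_k cases most similar to the query, best first."""
    ranked = sorted(
        library,
        key=lambda case: cosine(query_embedding, case.embedding),
        reverse=True,
    )
    return ranked[:top_k]
```

The retrieved cases, including their stored reasoning, would then be placed in the prompt as worked examples for the current decision.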
Multiple models vote with confidence scores. Use when single-model accuracy is insufficient.
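A minimal sketch of confidence-weighted voting follows. The `min_share` threshold and the choice to return `None` when no answer is dominant are assumptions for illustration; the source does not specify how ties or weak majorities should be handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def weighted_vote(
    votes: Iterable[Tuple[str, float]], min_share: float = 0.5
) -> Optional[str]:
    """Pick the answer with the largest summed confidence.

    Each vote is (answer, confidence in [0, 1]). Returns None when the
    winner holds less than min_share of the total confidence, which
    signals that the decision should be escalated (for example, to a
    human or to a rules fallback).
    """
    totals: Dict[str, float] = defaultdict(float)
    for answer, confidence in votes:
        totals[answer] += confidence
    total = sum(totals.values())
    if total == 0:
        return None
    winner = max(totals, key=lambda answer: totals[answer])
    return winner if totals[winner] / total >= min_share else None
```

For example, votes of `("approve", 0.9)`, `("reject", 0.6)`, and `("approve", 0.7)` give "approve" 1.6 of 2.2 total confidence, which clears the default threshold.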
| Type | Scope | Implementation | Use Case |
|---|---|---|---|
| Long-term memory | Cross-session | Persistent store (DB/vector) | User preferences, history |
| Short-term memory | Current session | In-context window | Current task context |
| Working memory | Current step | Scratchpad pattern | Intermediate reasoning |
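The three memory scopes in the table can be sketched as one small class. This is an illustrative assumption about structure only: a dictionary stands in for the persistent store, and the method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


class AgentMemory:
    def __init__(self, short_term_limit: int = 3) -> None:
        # Long-term: survives sessions (a dict stands in for a DB or vector store).
        self.long_term: Dict[str, str] = {}
        # Short-term: recent turns of the current session, bounded like a context window.
        self.short_term: Deque[str] = deque(maxlen=short_term_limit)
        # Working: scratchpad for the current step only.
        self.working: List[str] = []

    def remember_turn(self, message: str) -> None:
        """Add a turn; the oldest turn is dropped once the limit is reached."""
        self.short_term.append(message)

    def note(self, thought: str) -> None:
        """Record intermediate reasoning for the current step."""
        self.working.append(thought)

    def finish_step(self) -> List[str]:
        """Return the scratchpad contents and clear it for the next step."""
        notes, self.working = self.working, []
        return notes

    def end_session(self, user_id: str, summary: str) -> None:
        """Persist a session summary and reset session-scoped state."""
        self.long_term[user_id] = summary
        self.short_term.clear()
```

Keeping the scopes separate makes the lifetime of each piece of state explicit: working notes die with the step, turns die with the session, and only deliberate summaries are persisted.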
| Technique | When to Use | Trade-off |
|---|---|---|
| Standard RAG | Simple Q&A over documents | Low complexity, moderate accuracy |
| GraphRAG | Entity relationships, knowledge graphs | Higher complexity, better associative retrieval |
| Dynamic context pruning | Long conversations, large knowledge bases | Reduces noise, may lose relevant context |
| Hybrid retrieval (dense + sparse) | Mixed structured/unstructured data | Best recall, more infrastructure |
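For the hybrid retrieval row, one common way to merge a dense ranking with a sparse one is reciprocal rank fusion (RRF). The source does not name a fusion method, so RRF is an assumption here, as is the conventional constant `k = 60`.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so
    documents ranked well by more than one retriever rise to the top.
    Ties are broken alphabetically to keep the output deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]    # hypothetical vector-search results
sparse_hits = ["b", "d", "a"]   # hypothetical keyword-search results
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

RRF uses only rank positions, so it avoids having to normalize similarity scores that live on different scales across retrievers.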
Is the task pattern-recurring?
├─ Yes → Build historical case library
└─ No → Is single-model accuracy sufficient?
    ├─ Yes → Standard RAG + memory management
    └─ No → Hybrid decision-making
        └─ Complex entity relationships?
            ├─ Yes → GraphRAG
            └─ No → Standard RAG + dynamic pruning
| # | Question | If Yes | If No |
|---|---|---|---|
| 1 | Is the task deterministic? | Traditional architecture (MVC/DDD) | → Q2 |
| 2 | Does it need language understanding? | → Q3 | Rules or traditional ML |
| 3 | Are there hard performance constraints (<100ms)? | Hybrid: AI offline + rules online, caching | → Q4 |
| 4 | Are there strict cost constraints? | Hybrid: AI + rules, small models, caching | → Q5 |
| 5 | Does it need dynamic tool selection? | → Q6 | Single LLM call or AI Workflow |
| 6 | Does it need multi-step reasoning? | PlanAgent + ReActAgent | ReActAgent |
| 7 | Multiple domains? | Multi-Agent (MOE pattern) | Single Agent |
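The question table can be read as a short decision function. The sketch below encodes that reading; `TaskProfile` and `recommend_architecture` are hypothetical names. The table does not say how Q7 is reached from Q6, so this sketch assumes Q7 is applied after Q6 whatever its answer.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    deterministic: bool = False
    needs_language_understanding: bool = True
    hard_latency_limit: bool = False      # e.g. a <100ms budget
    strict_cost_limit: bool = False
    dynamic_tool_selection: bool = False
    multi_step_reasoning: bool = False
    multiple_domains: bool = False


def recommend_architecture(task: TaskProfile) -> str:
    """Walk questions 1-7 in order and return the first matching recommendation."""
    if task.deterministic:                                   # Q1
        return "Traditional architecture (MVC/DDD)"
    if not task.needs_language_understanding:                # Q2
        return "Rules or traditional ML"
    if task.hard_latency_limit:                              # Q3
        return "Hybrid: AI offline + rules online, caching"
    if task.strict_cost_limit:                               # Q4
        return "Hybrid: AI + rules, small models, caching"
    if not task.dynamic_tool_selection:                      # Q5
        return "Single LLM call or AI Workflow"
    # Q6 picks the agent type; Q7 (assumed to follow Q6) picks the scope.
    agent = "PlanAgent + ReActAgent" if task.multi_step_reasoning else "ReActAgent"
    scope = "Multi-Agent (MOE pattern)" if task.multiple_domains else "Single Agent"
    return f"{agent}, {scope}"
```

The early returns reflect the ordering of the table: cheaper, more predictable architectures are considered first, and Agents are recommended only when every earlier exit has been ruled out.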
Q1: Is the task deterministic?
Q2: Does it need language understanding?
Q3: Are there hard performance constraints?
Q4: Are there strict cost constraints?
Q5: Does it need dynamic tool selection?
Q6: Does it need multi-step reasoning?
Q7: Multiple domains?
Constraint Strategy Table:
| Constraint | Strategy | Example | Priority |
|---|---|---|---|
| Latency <100ms | AI offline training + rules online, caching | Real-time recommendation | Hard |
| High volume (10M+/day) | Caching + small models + rules fallback | API gateway | Soft |
| Limited budget | Hybrid AI+rules, model tiering | Internal tools | Soft |
| Data privacy | On-premise models, data anonymization | Medical/financial | Hard |
| Performance + Cost | Small models + caching, reduce AI scope to high-ROI tasks | Budget real-time system | Hard > Soft |
| Performance + Privacy | On-premise small models + caching, accept lower accuracy | Hospital system | Hard > Hard |
| Cost + Privacy | On-premise small models, rules for low-value decisions | Internal medical tool | Soft < Hard |
Rule: Hard constraints (latency, privacy) beat soft constraints (cost, accuracy). When constraints conflict, optimize for the hard constraint first.
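The rule can be sketched as a simple ordering step. The hard and soft sets below are taken from the rule itself; `order_constraints` is a hypothetical helper, and treating unknown constraint names as an error is an assumption.

```python
from typing import List

HARD = {"latency", "privacy"}
SOFT = {"cost", "accuracy"}


def order_constraints(constraints: List[str]) -> List[str]:
    """Return constraints with hard ones first.

    Python's sort is stable, so the caller's order is preserved within
    the hard group and within the soft group.
    """
    unknown = [c for c in constraints if c not in HARD | SOFT]
    if unknown:
        raise ValueError(f"unclassified constraints: {unknown}")
    return sorted(constraints, key=lambda c: c not in HARD)


print(order_constraints(["cost", "privacy", "accuracy", "latency"]))
```

Design work then proceeds down the ordered list: satisfy each hard constraint first, and tune for the soft ones only within whatever room remains.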
| Mistake | Why Wrong | Fix |
|---|---|---|
| Using Multi-Agent for simple FAQ | Over-engineering, adds latency/cost | Use single Agent with knowledge base |
| Complex nested API for Agent tools | Agents can't parse deep structures | Atomic tools with flat parameters |
| Skipping Intent Layer for multi-intent queries | Agent can't distinguish user goals | Add Intent Layer with query rewriting |
| Using ReAct for deterministic tasks | Unnecessary reasoning overhead | Use BaseAgent or workflow |
| Ignoring Context Engineering | Poor model performance | Build case libraries, hybrid decisions |
| Ignoring performance constraints | AI inference latency breaks SLA | Use hybrid architecture, see Constraint Strategy Table |
| Ignoring cost constraints | Unsustainable AI spend at scale | Model tiering, caching, rules fallback |
| "To use AI" as architecture goal | No business value | Define specific problems AI solves |
| Excuse | Reality |
|---|---|
| "Multi-Agent is always better" | Coordination overhead isn't justified for simple tasks |
| "ReAct can handle everything" | Deterministic tasks don't need dynamic reasoning |
| "AI Friendly API is too much work" | Atomic tools are easier to maintain and test |
| "Context Engineering is optional" | Memory and context are more important than model choice |
| "We need AI for everything" | Traditional architecture handles deterministic logic better |
| "Paradigm shifts are just theory" | They explain WHY the patterns work—skip them at your peril |
| "Context Engineering is just RAG" | Includes memory, hybrid decisions, case libraries beyond RAG |
| "Intent Layer is optional" | Required for multi-intent scenarios—Agent alone can't distinguish |
| "This only applies to LLM systems" | Principles apply to any AI system with uncertainty |
From production systems:
Note: These metrics are illustrative examples from the source article, not independently verified measurements. Results will vary based on implementation quality, data, and domain.