Install
openclaw skills install uplo-devops
AI-powered DevOps knowledge management. Search runbooks, infrastructure documentation, CI/CD pipelines, and incident response procedures with structured extraction.
It is 3 AM. PagerDuty is screaming. The on-call engineer who has seen this exact failure pattern left the company four months ago. The runbook exists somewhere, maybe in Confluence, maybe in a GitHub repo, maybe in a Notion page that someone bookmarked. UPLO DevOps eliminates this scramble by indexing runbooks, post-incident reviews, infrastructure documentation, CI/CD configurations, and architecture decision records into a single searchable layer that works when you need it most.
get_identity_context
This loads your team assignments (platform, SRE, application), on-call rotation status, and access tier. Some production configurations and credentials documentation are restricted by clearance.
Grab active directives — these include change freeze windows, incident commander designations, and infrastructure migration deadlines:
get_directives
The payments service is returning 503 errors. The on-call engineer has not worked on payments before.
search_knowledge query="payments service 503 error runbook troubleshooting steps"
Check for previous incidents with similar symptoms:
search_with_context query="payments service outage 503 timeout database connection pool previous incidents root cause"
If the runbook suggests checking the connection pool but the current configuration is unclear:
search_knowledge query="payments service database connection pool configuration pgbouncer settings production"
After resolving:
log_conversation summary="Resolved payments 503 outage; root cause was pgbouncer max_client_conn exceeded after traffic spike; matched PIR-2024-087 pattern; increased pool to 200" topics='["incident","payments","pgbouncer","connection-pool"]' tools_used='["search_knowledge","search_with_context"]'
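In the log_conversation call above, topics and tools_used are JSON arrays wrapped in single quotes, while summary is a plain double-quoted string. A minimal sketch of building such a line programmatically follows. The helper format_tool_call is hypothetical and not part of the skill, and the quoting rules are inferred only from the examples in this document.

```python
import json


def format_tool_call(tool, **args):
    """Render a tool call in the `name key=value` style used in this document.

    Assumption: strings are double-quoted and lists are JSON-encoded inside
    single quotes, as in the log_conversation example. Embedded quotes in
    string values are not escaped by this sketch.
    """
    parts = [tool]
    for key, value in args.items():
        if isinstance(value, (list, tuple)):
            # Compact separators reproduce the ["a","b"] form shown above.
            encoded = json.dumps(list(value), separators=(",", ":"))
            parts.append(f"{key}='{encoded}'")
        else:
            parts.append(f'{key}="{value}"')
    return " ".join(parts)


line = format_tool_call(
    "log_conversation",
    summary="Resolved payments 503 outage",
    topics=["incident", "payments"],
    tools_used=["search_knowledge", "search_with_context"],
)
print(line)
```

Encoding the lists with json.dumps rather than by hand avoids malformed arrays when a topic contains a comma or a quote.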
The platform team is moving from self-managed Kafka to a managed streaming service. The tech lead needs to scope the blast radius.
search_with_context query="Kafka consumers producers services dependencies topic configuration"
Find the ADRs that led to the original Kafka deployment:
search_knowledge query="architecture decision record ADR Kafka event streaming selection rationale"
Check current SLOs and whether the migration might violate them:
search_knowledge query="event streaming SLO latency throughput requirements Kafka p99"
export_org_context
search_knowledge — Your go-to during incidents. When you need a specific runbook, a configuration reference, or a known procedure, this is the fastest path. Latency matters at 3 AM. Example: search_knowledge query="redis cluster failover runbook manual promotion steps"
search_with_context — For investigation and planning. "What services depend on this database?" or "Has this failure happened before?" require traversing relationships between services, incidents, and infrastructure components. Example: search_with_context query="auth-service dependencies upstream downstream database cache"
get_directives — Change freeze windows, incident escalation policies, and migration deadlines surface here. Checking before a production change can prevent a career-limiting mistake.
flag_outdated — Infrastructure documentation rots faster than any other type. The Kubernetes cluster version documented last quarter is wrong. The network diagram shows a load balancer that was decommissioned. The runbook references a CLI tool that was replaced. Flag these aggressively — someone will use them during an incident.
report_knowledge_gap — When a service has no runbook, no architecture diagram, or no documented owner, that is an operational risk. Reporting the gap creates a trackable item for the platform team.
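Checking get_directives before a production change amounts to asking whether the current time falls inside any change freeze window. The sketch below illustrates that check. The record shape (kind, title, starts, ends) and the sample values are invented for illustration, since the actual structure returned by get_directives is not documented here.

```python
from datetime import datetime, timezone


def active_freezes(directives, now):
    """Return change-freeze directives whose window contains `now`.

    Assumption: each directive is a dict with hypothetical keys
    kind, title, starts, and ends (timezone-aware datetimes).
    """
    return [
        d for d in directives
        if d["kind"] == "change_freeze" and d["starts"] <= now < d["ends"]
    ]


# Illustrative data only; not real output from get_directives.
directives = [
    {
        "kind": "change_freeze",
        "title": "Quarter-end freeze",
        "starts": datetime(2025, 3, 28, tzinfo=timezone.utc),
        "ends": datetime(2025, 4, 2, tzinfo=timezone.utc),
    },
    {
        "kind": "migration_deadline",
        "title": "Kafka cutover",
        "starts": datetime(2025, 3, 1, tzinfo=timezone.utc),
        "ends": datetime(2025, 6, 30, tzinfo=timezone.utc),
    },
]

now = datetime(2025, 3, 30, tzinfo=timezone.utc)
blocking = active_freezes(directives, now)
if blocking:
    print("Hold the change:", [d["title"] for d in blocking])
else:
    print("No active change freeze.")
```

The end of the window is treated as exclusive, so a change scheduled for the exact moment a freeze lifts is allowed.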
Use exact service names (payments-api, auth-service-v2, order-processor) rather than casual descriptions.
During an incident, start with search_knowledge for the runbook. Only escalate to search_with_context if the runbook does not exist or the failure mode is novel. Speed matters during incidents.
Call log_conversation after every incident investigation, even false alarms. The pattern of false alarms is itself a signal that the monitoring team should investigate.
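The incident guidance above (try search_knowledge first, fall back to search_with_context only when no runbook turns up) can be sketched as a small function. This is an illustration under stated assumptions: call_tool is a hypothetical callable standing in for whatever client invokes the skill's tools, and it is assumed to return a list of results. The sketch covers only the "no runbook found" case, not the judgment call about a novel failure mode.

```python
def find_runbook(call_tool, query):
    """Try the fast lookup first; fall back to the contextual search.

    `call_tool(name, query=...)` is a hypothetical callable returning a list.
    Returns the name of the tool that was used last and its results.
    """
    results = call_tool("search_knowledge", query=query)
    if results:
        return "search_knowledge", results
    # Nothing found by the fast path: escalate to the slower, broader search.
    return "search_with_context", call_tool("search_with_context", query=query)


def fake_call_tool(name, query):
    # Stand-in for a real client: only the contextual search finds anything.
    canned = {"search_with_context": ["PIR-2024-087"]}
    return canned.get(name, [])


tool, results = find_runbook(fake_call_tool, "payments service 503 error runbook")
print(tool, results)
```

Passing the client in as an argument keeps the policy testable without a live connection.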