{"skill":{"slug":"skylv-error-monitoring-agent","displayName":"Skylv Error Monitoring Agent","summary":"Catch errors before users report them. Real-time monitoring, auto-grouping, and smart alerts that reduce noise by 90%.","description":"---\nname: error-monitoring-agent\nslug: error-monitoring-agent\nversion: 1.0.1\ndescription: Catch errors before users report them. Real-time monitoring, auto-grouping, and smart alerts that reduce noise by 90%.\nauthor: SKY-lv\nlicense: MIT-0\ntags: [monitoring, debugging, alerting]\nkeywords: [error monitoring, error tracking, alerting, exception handling, error analysis, bug tracking, incident detection, real-time monitoring]\ntriggers: error monitoring, error tracking, catch errors, alert on error, exception tracking\n---\n\n# error-monitoring-agent\n\n> Catch errors before users report them. Group similar issues, alert on spikes, and auto-resolve known problems — all with zero configuration.\n\n## What It Does\n\n- **Real-time detection** — Monitor logs, APIs, and workers for errors\n- **Smart grouping** — Merge similar stack traces to reduce noise by 90%\n- **Rate alerts** — Alert when the error rate spikes or new error types appear\n- **Root cause** — Correlate errors with deploys and config changes\n- **Auto-resolve** — Apply known fixes automatically (restart, retry, rollback)\n\n---\n\n## Quick Start\n\n```bash\n# 1. Start monitoring\nnode monitor.js watch --source logs,api\n\n# 2. Check current errors\nnode monitor.js status\n\n# 3. Set up an alert\nnode monitor.js alert --rule \"error_rate > 10/min\" --channel slack\n\n# 4. 
View top errors\nnode monitor.js aggregate --top 10\n```\n\n---\n\n## Common Use Cases\n\n### 🚨 Alert on Error Spikes\n```bash\n# Alert when error rate exceeds threshold\nnode monitor.js alert --rule \"error_rate > 10/min\" --channel slack\n\n# Alert on new error types\nnode monitor.js alert --rule \"new_error_type\" --channel pagerduty\n\n# Alert on spike vs baseline\nnode monitor.js alert --rule \"error_spike > 3x_baseline\" --channel email\n```\n\n### 🔍 Investigate Incident\n```bash\n# Find all errors in time window\nnode monitor.js aggregate --time-window 1h --top 20\n\n# Analyze specific error\nnode monitor.js analyze --error-id err_abc123 --depth 5\n\n# Correlate with recent changes\nnode monitor.js analyze --correlate deploy-log,config-change\n```\n\n### 🤖 Auto-Resolve Known Issues\n```bash\n# Enable auto-resolution\nnode monitor.js auto-resolve --strategy restart,retry,rollback\n\n# Apply approved fixes only\nnode monitor.js auto-resolve --known-fixes db --apply-approved\n```\n\n### 📊 Track Error Budget\n```bash\n# Check error rate vs SLO\nnode monitor.js budget --slo 99.9% --window 30d\n\n# View error budget remaining\nnode monitor.js budget --remaining\n```\n\n---\n\n## All Commands\n\n| Command | Purpose |\n|---------|---------|\n| `watch --source <src>` | Start monitoring |\n| `status` | Current error summary |\n| `aggregate --top <n>` | Group similar errors |\n| `alert --rule <rule>` | Create alert rule |\n| `analyze --error-id <id>` | Root cause analysis |\n| `auto-resolve --strategy <s>` | Enable auto-fix |\n| `budget --slo <target>` | Check error budget |\n\n---\n\n## Configuration\n\n```json\n{\n  \"monitoring\": {\n    \"sources\": [\"application\", \"infrastructure\", \"api\"],\n    \"sampling\": 1.0,\n    \"retention\": \"30d\",\n    \"alertRules\": [\n      { \"condition\": \"error_rate > 10/min\", \"action\": \"page-oncall\" },\n      { \"condition\": \"new_error_type\", \"action\": \"notify-channel\" }\n    ],\n    \"autoResolve\": {\n      
\"enabled\": true,\n      \"approvedStrategies\": [\"restart-service\", \"retry-request\"]\n    }\n  }\n}\n```\n","tags":{"latest":"1.0.1"},"stats":{"comments":0,"downloads":405,"installsAllTime":1,"installsCurrent":1,"stars":0,"versions":2},"createdAt":1777690156810,"updatedAt":1778492826217},"latestVersion":{"version":"1.0.1","createdAt":1777852268168,"changelog":"- Improved description for greater clarity and emphasis on real-time detection and noise reduction.\n- Expanded and reorganized documentation for easier onboarding, with new \"Quick Start\" and \"All Commands\" sections.\n- Updated triggers list for broader integration and usability.\n- Enhanced examples: added more command samples for alerting, investigation, auto-resolution, and error budget tracking.\n- Streamlined configuration and command usage instructions for a simpler user experience.","license":"MIT-0"},"metadata":{"setup":[],"os":null,"systems":null},"owner":{"handle":"sky-lv","userId":"s17fgkeb63szvtadtmm753m0gd84e4vz","displayName":"SKY-lv","image":"https://avatars.githubusercontent.com/u/259750852?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780090737995}}