{"skill":{"slug":"local-inference-context","displayName":"Local Inference Context","summary":"Context management for self-hosted LLM backends (llama.cpp, Ollama). Prevents mid-task 503 errors and context overflows caused by VRAM-limited KV caches. Use...","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":136,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1778024015485,"updatedAt":1778492854429},"latestVersion":{"version":"1.0.0","createdAt":1778024015485,"changelog":"- Initial release of local-inference-context skill.\n- Adds context management tailored for self-hosted LLM backends (llama.cpp, Ollama), including VRAM-aware thresholds.\n- Prevents mid-task 503 errors and context overflows caused by limited KV-cache on local hardware.\n- Provides detailed guidance on calibrating effective context budgets, error signals, and pre-task checklists.\n- Introduces tailored recommendations for amber, red, and critical fill states to minimize context-related failures.\n- Highlights the necessity of a dedicated compaction model for reliable recovery.","license":"MIT-0"},"metadata":{"os":["linux","darwin"],"systems":null},"owner":{"handle":"joekravelli","userId":"s171bj037tmg9ggmn1jgf1v65s84q0ek","displayName":"JoeKravelli","image":"https://avatars.githubusercontent.com/u/38085287?v=4"},"moderation":null}