{"skill":{"slug":"llm-perf-estimator","displayName":"LLM Inference Performance Estimator","summary":"Estimate LLM inference performance metrics including TTFT, decode speed, and VRAM requirements based on model architecture, GPU specs, and quantization format.","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":130,"installsAllTime":0,"installsCurrent":0,"stars":1,"versions":1},"createdAt":1774319213630,"updatedAt":1774319508533},"latestVersion":{"version":"1.0.0","createdAt":1774319213630,"changelog":"Initial public release.  \n- Estimate LLM inference performance metrics: TTFT (Time To First Token), decode speed, and VRAM requirements.  \n- Supports model selection by name, config file, or interactive input.  \n- Includes detailed preset tables for major LLMs and GPUs, with support for custom entries.  \n- Handles quantization effects and key architectural details (MoE, hybrid attention, embeddings).  \n- Guides the user step-by-step if information is missing.  \n- Provides clear calculation methods and caveats for each metric.","license":"MIT-0"},"metadata":null,"owner":{"handle":"zhangyu68","userId":"s17ebetxvynkc6f2jnptwrv7g583gmeb","displayName":"zhangyu68","image":"https://avatars.githubusercontent.com/u/36838019?v=4"},"moderation":null}