{"skill":{"slug":"llm-evaluator","displayName":"Llm Evaluator","summary":"LLM-as-a-Judge evaluation system using Langfuse. Score AI outputs on relevance, accuracy, hallucination, and helpfulness. Backfill scoring on historical trac...","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":381,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1772679152872,"updatedAt":1777525715977},"latestVersion":{"version":"1.0.0","createdAt":1772679152872,"changelog":"- Initial release of the llm-evaluator skill.\n- Provides an LLM-as-a-Judge system for evaluating AI outputs using relevance, accuracy, hallucination, and helpfulness scores.\n- Integrates with Langfuse and uses GPT-5-nano for efficient automated judging.\n- Enables batch backfill scoring for historical traces and real-time evaluation of outputs.\n- Command-line interface for testing, scoring specific traces, and running backfills.","license":null},"metadata":{"os":null,"systems":null},"owner":{"handle":"aiwithabidi","userId":"publishers:aiwithabidi","displayName":"aiwithabidi","image":"https://avatars.githubusercontent.com/u/208891229?v=4"},"moderation":{"isSuspicious":true,"isMalwareBlocked":false,"verdict":"suspicious","reasonCodes":["suspicious.llm_suspicious","suspicious.vt_suspicious"],"summary":"Detected: suspicious.llm_suspicious, suspicious.vt_suspicious","engineVersion":"v2.4.5","updatedAt":1777525715977}}