Phy Content Safety Guard

Pass. Audited by ClawScan on Mar 22, 2026.

Overview

The skill's requirements and instructions are coherent with its stated purpose (a Gemini-based judge for outbound message filtering), and it asks only for a single Google Generative AI API key. Review the prompts and safety settings before production use.

This skill is conceptually coherent for running a Gemini-based judge, but do these checks before using it in production:

- Inspect the SKILL.md's system prompt and any example code line-by-line. System prompts can strongly change model behavior; verify they don't instruct the judge (or your agent) to reveal or ignore secrets, to forward data to unknown endpoints, or to change platform policies.
- Limit the GOOGLE_GENAI_API_KEY you provide: use a dedicated API key or service account with minimal permissions and budget/cost limits, rotate it regularly, and monitor usage and billing.
- Pay attention to the 'relax Gemini's built-in safety filter' guidance. That is reasonable for a judge pattern, but it increases the chance that the judge will see or accept sensitive content; ensure this aligns with your compliance needs.
- Decide on a fail-open vs fail-closed policy before deployment. The example defaults to fail-open (it lets messages through on error); for high-risk products, consider fail-closed.
- Test the guard with your red-team cases in a sandboxed environment, and audit the logs for false negatives, false positives, and unexpected behavior.

If you are not comfortable auditing prompts yourself, ask a security reviewer to validate the system prompt and example handler code before granting the API key or enabling the skill in production.
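The fail-open vs fail-closed choice can be made explicit in the handler rather than left to an example's default. The following is a minimal sketch, not the skill's actual code: `guard_message`, `Verdict`, and the `judge` callable are hypothetical names, and `judge` stands in for whatever function calls the Gemini-based judge and returns `True` when a message may be sent. Unlike the skill's example, this sketch defaults to fail-closed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    reason: str


def guard_message(
    message: str,
    judge: Callable[[str], bool],  # hypothetical wrapper around the judge call
    fail_closed: bool = True,      # assumption: block on error unless told otherwise
) -> Verdict:
    """Run the judge and apply an explicit policy when the judge itself fails."""
    try:
        allowed = judge(message)
    except Exception as exc:  # e.g. timeout, quota exhaustion, unparsable output
        if fail_closed:
            return Verdict(False, f"judge error, blocked: {exc}")
        return Verdict(True, f"judge error, allowed: {exc}")
    return Verdict(allowed, "judge verdict")


def broken_judge(message: str) -> bool:
    """Simulates a judge outage for testing the error policy."""
    raise TimeoutError("judge unavailable")


if __name__ == "__main__":
    print(guard_message("hello", broken_judge, fail_closed=True))
    print(guard_message("hello", broken_judge, fail_closed=False))
```

Recording the `reason` alongside the decision makes it possible to tell, when auditing logs, which messages passed on a real verdict and which passed only because the judge failed.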