Post-Development Verification
Review
Audited by ClawScan on May 10, 2026.
Overview
This instruction-only testing skill is coherent and discloses what it does, but it can run real project tests and commands, start services, apply migrations, make network calls, and use test credentials, so it should only be used in a sandbox.
Use this skill only in a disposable test or sandbox environment. Before allowing execution phases, check the Phase 0 environment report, confirm all URLs, databases, services, and credentials are non-production, and inspect the project commands that will be run.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If pointed at the wrong environment, tests or migrations could change real services or data.
The skill clearly discloses broad operational actions that can affect local services and data stores. This is expected for full-stack verification, but users need to confirm the target environment before allowing execution.
This skill starts services, runs migrations, and makes network calls.
Run Phase 0 first, review the environment report, and only proceed when databases, services, URLs, and credentials are confirmed to be test or sandbox resources.
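One way to back up a manual read of the environment report is a small pre-flight check on the configured URLs. The sketch below is illustrative only and is not part of the skill: the allowlist, the function names, and the variable names in the usage note are assumptions, and a real project would substitute its own sandbox hosts.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts treated as safe for verification runs.
SAFE_HOSTS = {"localhost", "127.0.0.1", "::1"}
# Reserved top-level domains that never resolve to public production hosts.
SAFE_SUFFIXES = (".test", ".localhost", ".invalid")


def is_sandbox_url(url: str) -> bool:
    """Return True only if the URL's host is on the local/test allowlist."""
    host = urlparse(url).hostname
    if host is None:  # empty or unparseable value: treat as unsafe
        return False
    return host in SAFE_HOSTS or host.endswith(SAFE_SUFFIXES)


def unsafe_targets(env: dict[str, str], keys: list[str]) -> list[str]:
    """Return the names of listed variables that are missing or not sandbox URLs."""
    return [key for key in keys if not is_sandbox_url(env.get(key, ""))]
```

Calling `unsafe_targets(dict(os.environ), ["DATABASE_URL", "API_URL"])` (variable names assumed) before the execution phases gives a list that should be empty; anything it returns is worth resolving by hand before proceeding.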
Project scripts may install packages, start services, change files, or run other commands depending on the repository.
The templates direct the agent to run project-defined commands. Running project scripts is central to E2E verification, but those scripts can execute arbitrary local code.
start using: project_start_command (e.g., npm run dev, uvicorn app:app)
Inspect project scripts and generated test commands before execution, especially in unfamiliar repositories.
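For Node-based repositories, the inspection step could start from a listing of the scripts declared in package.json. This is a minimal sketch under that assumption; the helper name and the choice of which hooks to flag are illustrative, and it only reads the file, it does not execute anything.

```python
import json

# npm lifecycle hooks that run implicitly during `npm install`;
# these deserve a look before any dependency installation.
IMPLICIT_HOOKS = {"preinstall", "install", "postinstall", "prepare"}


def list_scripts(package_json_text: str) -> list[tuple[str, str, bool]]:
    """Return (name, command, runs_implicitly) for each declared script, sorted by name."""
    scripts = json.loads(package_json_text).get("scripts", {})
    return [(name, cmd, name in IMPLICIT_HOOKS) for name, cmd in sorted(scripts.items())]
```

Reading the returned commands before the agent runs `project_start_command` shows what a name such as `dev` actually expands to, which is the part that can execute arbitrary local code.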
Using production credentials by mistake could let verification tests mutate real accounts or data.
The skill may use credentials for authenticated tests, but it explicitly scopes them to test accounts and environment variables.
Uses only test accounts, test API keys, and environment-variable-sourced tokens. Never prompts for production secrets.
Provide only sandbox credentials and verify environment variables do not point to production accounts or APIs.
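A rough heuristic can help spot environment variables that look production-bound before they are handed to the agent. The markers below are assumptions, not a standard, and the check can miss real production credentials or flag harmless ones, so it supplements a manual review rather than replacing it.

```python
import re

# Heuristic markers (hypothetical, project-specific): "prod", "production", or "live"
# appearing as a separate token, e.g. PROD_DB_URL or key_live_123, but not "deliver".
PROD_MARKERS = re.compile(r"(?<![a-z])(prod|production|live)(?![a-z])", re.IGNORECASE)


def suspicious_vars(env: dict[str, str]) -> list[str]:
    """Return sorted names of variables whose name or value matches a marker.

    Only names are returned, so the result can be logged without exposing values.
    """
    return sorted(
        name
        for name, value in env.items()
        if PROD_MARKERS.search(name) or PROD_MARKERS.search(value)
    )
```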
A misconfigured endpoint, database, or service dependency could cause test actions to affect more than the intended local test stack.
The highest realism mode intentionally exercises all services rather than mocks. This is useful for validation but can propagate mistakes across several systems if sandbox boundaries are wrong.
L3 | All services real (sandbox/test accounts) | 0%
Keep the realism level at L0-L2 unless every dependency has been verified as disposable or sandboxed; avoid L3 near production resources.
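The recommendation above amounts to a simple gate: L3 is allowed only when every dependency has been explicitly verified. A sketch of that rule follows; the function name and the shape of the `dependencies` mapping are assumptions made for illustration, and the skill itself may track this differently.

```python
def allowed_realism(requested: int, dependencies: dict[str, bool]) -> int:
    """Cap the realism level at L2 unless every dependency is verified as sandboxed.

    `dependencies` maps a dependency name to True when it has been confirmed
    disposable or sandboxed. An empty mapping counts as unverified.
    """
    if not 0 <= requested <= 3:
        raise ValueError("realism level must be between 0 and 3")
    all_verified = bool(dependencies) and all(dependencies.values())
    if requested == 3 and not all_verified:
        return 2
    return requested
```

Treating an empty mapping as unverified is a deliberate choice here: the absence of a dependency inventory should not unlock the all-real mode.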
Unreviewed logs or failure output could steer the agent's next changes or expose local implementation details in reports.
Test feedback and failure reports become machine-consumed context for later fix rounds. This is purpose-aligned, but error text and logs can contain misleading or sensitive content.
The Agent parses this report programmatically to determine: What broke ... What to fix next ... Whether to keep fixing
Review generated feedback reports, keep them local, and redact secrets or misleading untrusted output before relying on them for automated fixes.
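Redaction can be partly automated before a report is fed back into a fix round. The patterns below are a minimal sketch and an assumption about what secrets look like in logs; they will not catch every format, so a human pass over the report is still needed.

```python
import re

# key=value or key: value pairs whose key mentions a secret-like word.
SECRET_ASSIGNMENT = re.compile(
    r"(?i)\b([\w-]*(?:password|secret|token|api[_-]?key)[\w-]*)(\s*[=:]\s*)(\S+)"
)
# HTTP-style bearer credentials.
BEARER = re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/=-]+")


def redact(text: str) -> str:
    """Replace secret-looking values with [REDACTED], keeping keys for context."""
    # Bearer tokens first, so "token: Bearer abc" does not leave "abc" behind.
    text = BEARER.sub(r"\1[REDACTED]", text)
    return SECRET_ASSIGNMENT.sub(r"\1\2[REDACTED]", text)
```

Keeping the key names in place preserves enough context for the agent to see which setting was involved in a failure without carrying the value forward.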
