NPD Product Validation Pipeline
You are orchestrating a multi-agent product validation. Read the methodology skill
at skills/npd-methodology/SKILL.md before proceeding.
Step 1: Determine Entry Mode
Analyze "$ARGUMENTS" to determine entry mode:
Mode A (Rough Idea): User provided <4 of the 6 core Concept Brief fields (the Step 2 fields, excluding Launch Horizon and the optional Brand Context).
- Ask targeted questions to fill the brief
- Run a Quick Scan (1-2 searches per dimension, traffic light output; see the sketch at the end of this step)
- Get user confirmation before deep validation
Mode B (Solid Idea): User provided ≥4 fields or said "just validate."
- Confirm understanding
- Populate the Concept Brief
- Proceed to deep validation
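For Mode A, the Quick Scan output might look like this (hypothetical concept; assumes the five evaluator domains as the scan dimensions):
- Market Demand: GREEN (healthy search volume and category growth)
- Competitive Intel: YELLOW (crowded field; differentiator may still hold)
- Brand Fit: GREEN
- Commercial Viability: YELLOW (premium price point needs margin confirmation)
- Consumer & Trends: RED (trend looks past peak for the stated launch horizon)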
Step 2: Populate Concept Brief
Create data/concept_brief.md with:
Product Name / Working Title: [...]
Category: [...]
One-Line Description: [≤15 words]
Target Consumer: [...]
Price Point Range: [...]
Key Differentiator: [...]
Launch Horizon: [now / 6 months / 12 months / 24+ months — specify target month if seasonal]
Brand Context (optional): [...]
If the user hasn't specified a launch horizon, ASK before proceeding. A trend that scores Timing 9/10 for launch "now" may score 3/10 for launch in 24 months.
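A filled-in brief might look like this (hypothetical product, for illustration only):
Product Name / Working Title: Night Ritual (working title)
Category: Sleep supplement gummy
One-Line Description: Melatonin-free sleep gummy built on magnesium and chamomile
Target Consumer: Wellness-focused adults 28-45 who want to avoid melatonin
Price Point Range: $28-34 per 60-count jar
Key Differentiator: Clinically dosed, melatonin-free formula
Launch Horizon: 12 months (target October)
Brand Context (optional): Would be the third SKU in an existing relaxation line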
Brief Coherence Check
Before launching the pipeline, scan the brief for internal contradictions:
- Positioning vs Price: Does the price match the positioning? ("ultra-premium" at $5 is contradictory)
- Target vs Positioning: Does the target consumer align with the positioning? ("budget-conscious teens" vs "luxury" is contradictory)
- Differentiator vs Category: Does the differentiator make sense for this category?
If contradictions are found: flag them to the user with a specific question, e.g., "You described this as ultra-premium but priced at $5 — which is the actual intent? The validation results would be very different." Do NOT reject the idea — the user may have a valid strategy. But get clarity before investing 25+ searches in a pipeline built on an incoherent brief.
Multi-SKU Launch Detection
Multi-SKU launches come in two distinct flavors — identify which one applies before proceeding:
Bundle / Collection / Set: Multiple products sold together as a unit at a combined price (e.g., "The Morning Routine — 3 products, $65 together"). Consumers buy them as ONE transaction.
Line Launch / Drop: Multiple distinct products launched simultaneously but sold separately (e.g., "Full skincare line: cleanser $24, toner $22, serum $38, moisturizer $32, SPF $28 — available individually"). Consumers pick what they want.
Each has different dynamics:
Bundles — see Bundle-specific evaluator guidance below. Attachment rate is fixed at 100% for bundle buyers. SKU complexity is lower (1 bundle SKU + 3 components).
Line launches — distinct considerations the bundle framework does NOT cover:
- Working capital multiplier: 5 SKUs = 5× inventory investment. A brand that can afford 1 SKU launch may not afford 5.
- Testing cost aggregation: each SKU requires its own stability/safety testing. Total cost isn't a clean 5× multiple, but it is far more than a single-SKU launch.
- Retailer listing economics: 5 listing fees but ONE buyer conversation — potentially better leverage, but also harder negotiation.
- Launch moment amplification: a line launch is a bigger PR/marketing moment than a single SKU — more editorial hooks, more creator content, larger retail displays.
- Complexity non-linearities: 5 SKUs mean 5 supply chains, 5 forecasts, and 5 return rates to manage. Operational load scales faster than revenue potential.
For a line launch, the validation should produce a recommendation on optimal launch size: hero-first (launch 1, expand later based on performance), full-line (launch all 5 simultaneously), or staggered (launch 2, then add 3 more in months 3 and 6). This is a launch strategy question, not just a go/no-go.
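For illustration, the recommendation might take this shape (using the hypothetical skincare line above; thresholds are placeholders):
Launch Size Recommendation: Staggered
- Wave 1 (month 0): cleanser + serum (strongest demand signal, lowest working-capital exposure)
- Wave 2 (months 3-6): toner, moisturizer, SPF, contingent on Wave 1 sell-through of [X]% in the first 8 weeks
- Rationale: the working-capital multiplier and testing costs favor a smaller first wave; the retailer pitch can still present the full line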
Collection / Bundle Detection
If the multi-SKU offering is a bundle, treat it as a bundle validation, not a single-product one. Bundles have distinct dynamics: attachment rate, bundle-vs-separate margin tradeoffs, SKU complexity, and consumers' willingness to buy a set vs individual items.
Update the Concept Brief with a Bundle Context field listing each component product, the bundle price, and the sum-of-parts price. Evaluators adapt:
- Market Demand: Do consumers in this category buy collections, or prefer to curate their own? What's the bundling precedent?
- Competitive Intel: Who else sells collections in this space? What's the typical bundle discount?
- Brand Fit: Does a collection make sense for this brand's portfolio? Is the brand known for standalone heroes or curated sets?
- Commercial Viability: Bundle economics specifically — attachment rate assumptions, bundle discount margin impact, SKU complexity cost, inventory implications
- Consumer & Trends: Are collections trending up or down in the category?
Do NOT produce 3 separate validations averaged together — the bundle IS the product.
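A minimal sketch of the Bundle Context field, using the Morning Routine example above (hypothetical component prices):
Bundle Context:
- Components: cleanser ($24), toner ($22), serum ($38)
- Bundle price: $65
- Sum-of-parts price: $84
- Implied bundle discount: ($84 - $65) / $84 ≈ 23%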
Geographic Scope Check
If the user mentions multiple markets (e.g., "launch in US, UK, Germany, Japan"), a single composite score is misleading — regulations, competitors, trends, and consumer behavior differ significantly across markets. A product can be GO in one market and NO-GO in another.
Before launching, ask the user: "You mentioned [X] markets. For the most accurate results, I can (a) validate for your PRIMARY market and flag considerations for the others, (b) run parallel validations per market and produce per-market verdicts, or (c) use a TEST MARKET / BEACHHEAD approach: fully validate the primary market with explicit expansion-readiness criteria, so performance there can de-risk later rollouts (e.g., US results informing EU/UK/Australia). Option (b) takes [N × time] but gives market-specific go/no-go decisions. Option (c) is right if you intend to use launch data from one market to inform expansion. Which do you prefer?"
If they choose (a), add the primary market to the Concept Brief and note the others as "expansion considerations." The Research Coordinator and Consumer & Trends evaluators will incorporate expansion notes without producing full per-market analyses.
If they choose (b), run the full pipeline once per market with each producing its own report, then produce a consolidated cross-market comparison as the final deliverable.
If they choose (c) — beachhead strategy — validate the primary market in depth, then produce an additional Expansion Readiness section in the final report. This section defines:
- Performance criteria that would trigger expansion (e.g., "$X revenue in 12 months," "Y% repeat rate," "Z retailer sell-through")
- Market-specific risks for each expansion target (regulatory, competitive, cultural)
- Investment required per additional market
- Recommended sequencing (UK before EU? Australia before Asia?)
Regulatory Divergence — Product-Level Check
When validating multiple markets, recognize that some categories require different formulations/products per market, not just different marketing. Examples:
- Retinol: EU caps at 0.3%, US has no cap — the formulation itself must differ
- SPF: UVA protection standards differ (US: broad-spectrum pass/fail only, no graded UVA rating; EU: UVA-PF; Japan: PA rating based on PPD)
- Supplements: health claims and ingredient allowances vary dramatically
- Food: additive allowances, labeling requirements, portion sizes
- Cosmetics: EU has a banned list (Annex II) far longer than US FDA's
If the Research Coordinator identifies product-level regulatory divergence across target markets, flag it as a Critical Alert. The "same product" in US and EU may require two separate formulation development tracks, two testing files, two manufacturing runs. Commercial Viability should reflect this operational reality — it's not one product with different labels, it's effectively two products to manage.
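For example, a Critical Alert for a retinol concept might read (hypothetical strength):
Critical Alert (regulatory divergence): The 0.5% retinol formulation exceeds the EU cap of 0.3%. An EU launch requires a separately developed and tested formulation; Commercial Viability should model two development tracks, two testing files, and two manufacturing runs.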
User-Provided Internal Data
Internal data (sales history, cost sheets, customer research, brand portfolio data) is MORE reliable than external web research. If the user provides files or data:
- Save to data/user_provided/ with a descriptive name (e.g., internal_sales_q1_q4_2025.csv)
- Add a note to data/concept_brief.md: "User-provided data available: [list files with 1-line description each]"
- Evaluators are instructed to prioritize user data over external estimates in their specific domain:
  - Sales/performance data → Market Demand, Commercial Viability
  - Customer research/demographics → Brand Fit, Consumer & Trends
  - Cost sheets/margin data → Commercial Viability
  - Portfolio data → Brand Fit
- Dimensions backed by user data should be marked HIGH confidence (internal data beats external estimates)
- The final report must note which scores were informed by user-provided data vs external research
If the user says they have data but hasn't provided it yet, ask them to share it before Phase 2 begins.
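The Concept Brief note might look like this (the second file is hypothetical):
User-provided data available:
- data/user_provided/internal_sales_q1_q4_2025.csv: 12 months of sales by SKU (prioritized by Market Demand, Commercial Viability)
- data/user_provided/cost_sheet_v3.xlsx: per-unit COGS and packaging costs (prioritized by Commercial Viability)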
Step 3: Run the Pipeline
Execute each phase by delegating to the specialized subagents:
Phase 1 — Research Coordinator
Delegate to the npd-research-coordinator subagent:
- Input: the concept brief
- It will conduct 8-12 web searches and produce data/context_brief.md
- Wait for completion before proceeding
Phase 2 — Independent Evaluations
Delegate to ALL 5 evaluator subagents. Run them in parallel if possible:
npd-market-demand → reads concept brief + context brief → writes data/eval_market_demand.md
npd-competitive-intel → reads concept brief + context brief → writes data/eval_competitive_intel.md
npd-brand-fit → reads concept brief + context brief → writes data/eval_brand_fit.md
npd-commercial-viability → reads concept brief + context brief → writes data/eval_commercial_viability.md
npd-consumer-trends → reads concept brief + context brief → writes data/eval_consumer_trends.md
CRITICAL: Each evaluator subagent runs in its own context. They CANNOT see each
other's output. This is the key architectural advantage — true independence.
Give each evaluator this instruction:
"Read data/concept_brief.md and data/context_brief.md. Follow your evaluation methodology.
Conduct at least 3-5 additional web searches. Score your 3 dimensions (1-10 each).
Write your full evaluation to data/eval_[your-name].md"
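Each evaluation file should be self-contained. A plausible skeleton (the authoritative format lives in each evaluator's own spec and the methodology skill):
Evaluator: [e.g., Market Demand]
Dimension 1: [name], score X/10, confidence [high/med/low], evidence and sources
Dimension 2: [name], score X/10, confidence [high/med/low], evidence and sources
Dimension 3: [name], score X/10, confidence [high/med/low], evidence and sources
Summary: key findings, biggest risk, data gaps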
Phase 3 — Devil's Advocate
After ALL 5 evaluators complete, delegate to npd-devils-advocate:
- It reads all 5 evaluation files + the concept brief
- It conducts adversarial research
- It writes data/devils_case.md
Phase 4 — Consensus Director
After the Devil's Advocate completes, delegate to npd-consensus-director:
- It reads everything: concept brief, context brief, all 5 evals, devil's case
- It runs 3 rounds: conflict detection → challenge/defend → final synthesis
- For Round 2 challenges: it writes challenge files to data/challenges/ and re-invokes the relevant evaluator subagent with the challenge
- It writes the final report to data/validation_report.md
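For illustration, a Round 2 challenge file might look like this (hypothetical contents; the consensus director's own spec governs the real format):
data/challenges/round2_market_demand.md:
- Challenged score: [dimension name], currently X/10
- Conflict: [which other evaluation or Devil's Advocate finding contradicts it]
- Evidence: [the specific claim and its source]
- Request: defend the score with sources or revise it, and update your eval file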
Step 4: Present Results
Read data/validation_report.md and present to the user:
- Lead with the VERDICT and COMPOSITE SCORE
- Summarize top 3 findings
- Note the Devil's Advocate kill shot rating
- Offer iteration options: pivot, deepen, variant test, challenge
Step 5: Support Iteration
If the user wants to iterate:
- Pivot: Update concept_brief.md → re-run full pipeline
- Deepen one area: Re-invoke that specific evaluator subagent with expanded scope
- Challenge a score: Pass user's counter-evidence to the evaluator subagent
- Change weights: Re-invoke consensus-director to recalculate
Variant Testing
When the user wants to compare two versions of a concept (e.g., "what if we did the same product but as mass market instead of premium?"):
- Preserve original: Save current data/concept_brief.md and data/validation_report.md as data/variant_a/
- Create variant brief: Write the modified concept to data/variant_b/concept_brief.md
- Run parallel pipeline: Execute the full 4-phase pipeline for variant B, writing all outputs to data/variant_b/
- Produce comparison report: Write data/comparison_report.md containing:
  - Side-by-side scorecard table (variant A scores | variant B scores | delta per dimension; see the example table below)
  - Verdict comparison (which variant got the better verdict and why)
  - Key differentiators (which dimensions drove the difference)
  - Recommendation (which variant to proceed with, or whether to pivot to a hybrid)
- Present comparison to the user, leading with the recommended variant and the score gap
The two pipelines must be run independently — do not let variant A's findings contaminate variant B's evaluators.
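For illustration, the scorecard might roll scores up to the five evaluator domains (hypothetical numbers; the full table can also list all 15 underlying dimensions):
| Domain | Variant A | Variant B | Delta |
|---|---|---|---|
| Market Demand | 7.3 | 6.1 | -1.2 |
| Competitive Intel | 5.8 | 7.0 | +1.2 |
| Brand Fit | 8.0 | 6.5 | -1.5 |
| Commercial Viability | 6.2 | 7.4 | +1.2 |
| Consumer & Trends | 7.5 | 7.5 | 0.0 |
The composite row and verdict comparison follow whatever weighting the consensus director applies.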
Scope Creep Detection
Before running each iteration, compare the new concept to V1 (the original):
- Category change: V1 was "body moisturizer," V6 is "face mist" → scope creep
- Target demographic shift: V1 targeted women 25–35, V6 targets teens → scope creep
- Differentiator fully replaced: V1 differentiator was "caffeine," V6 is "adaptogens" with no caffeine → scope creep
- Price tier jumped 2+ levels: V1 was mid-range $18–24, V6 is ultra-premium $85+ → scope creep
If 2+ scope creep signals are present, stop and ask the user: "This iteration has drifted substantially from V1. The concept now differs in [list changes]. Should we: (a) close this validation and start a fresh one for the new concept, or (b) continue as V[N] with the understanding that the original concept is effectively archived?"
This prevents a V7 verdict from being confused with a validation of the original concept. Both paths are valid — the user chooses. If they continue, note the drift explicitly in the iteration log and report header.
Track iteration history in data/iteration_log.md.
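One plausible entry format for the iteration log (hypothetical content):
- V1 (date): original concept; verdict and composite recorded
- V2 (date): deepened Commercial Viability with user-provided cost sheet; composite 6.4 → 6.9; verdict unchanged; scope creep signals: 0
- V3 (date): price tier moved from mid-range to premium; 1 scope creep signal; drift noted in report header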