Customer Research & Validation

v0.1.0

Conducts in-depth customer research by mining forums, generating surveys and interviews, scraping competitor reviews, and analyzing sentiment to validate mar...

⭐ 0· 140·0 current·0 all-time

byRunByDaVinci@clawdiri-ai

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for clawdiri-ai/customer-research-dv.

Previewing Install & Setup.

Prompt PreviewInstall & Setup

Install the skill "Customer Research & Validation" (clawdiri-ai/customer-research-dv) from ClawHub.
Skill page: https://clawhub.ai/clawdiri-ai/customer-research-dv
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install customer-research-dv

ClawHub CLI

Package manager switcher

npx clawhub@latest install customer-research-dv

Security Scan

VirusTotal

Benign

View report →

OpenClaw

Suspicious

medium confidence

ℹ

Purpose & Capability

The scripts, docs, and CLI wrapper align with the described functionality (Reddit/forum mining, survey/interview generation, persona validation, competitor scraping). However the SKILL.md and README refer to using 'OpenClaw LLM access for sentiment analysis' while the skill declares no required credentials or primaryEnv; similarly competitor scraping mentions future browser automation (Playwright) for sites like G2/Trustpilot but no explicit runtime requirements are declared. These are minor coherence gaps (expected behavior but not fully documented in manifest).

ℹ

Instruction Scope

Runtime instructions are explicit about what will be done: crawling Reddit/public JSON, scraping review sites, writing JSON/markdown to local paths (data/research/, projects/...). The SKILL.md and scripts direct the agent to gather community text and store results locally. This scope is consistent with the purpose, but the docs instruct running recurring jobs (cron) and updating project files, which broadens scope of file writes—users should review where outputs are stored and any automation hooks before use.

ℹ

Install Mechanism

The registry lists no formal install spec (instruction-only), which is low risk. The repository includes setup.sh and a Python requirements.txt; running setup.sh would write to disk and install dependencies. No remote downloads from obscure URLs were listed in the manifest excerpts, but any install should be inspected before executing setup.sh or pip installs.

Credentials

The skill references use of the OpenClaw LLM for sentiment analysis and (optionally) Reddit API access, browser automation, and future APIs, but the manifest declares no required environment variables or primary credentials. That is a discrepancy: if you enable authenticated Reddit API use, or if the LLM requires an API key or a webhook, those credentials will be needed but are not declared. Treat any prompts to supply API keys or tokens as significant—only provide them if you trust the source and inspect the code paths that use them.

✓

Persistence & Privilege

The skill is not force-enabled (always: false), and it does not request elevated platform privileges in the manifest. It writes outputs to local project directories per the documentation, which is expected for a research tool. Autonomous invocation is allowed by default (normal), so consider limiting autonomous runs if you worry about automated scraping or exfiltration.

What to consider before installing

This package appears coherent with its stated goal (mining communities, scraping reviews, generating surveys/interviews) but exercise caution before running it: 1) Inspect setup.sh and any scripts for network calls (look for curl, requests, sockets, or any hardcoded remote endpoints) and for any calls that post data externally. 2) Search scripts for invocations of LLM/CLI tools (e.g., openclaw/chat, external API endpoints). Because the manifest does not declare credentials, verify where you would need to provide API keys (Reddit, Playwright-driven sites, or an LLM) and assess whether those keys are necessary. 3) Run the scripts in a sandbox or isolated environment (container or VM) first, with no sensitive credentials mounted, until you confirm behavior. 4) If you plan scheduled/cron runs, ensure outputs are written to a controlled directory and that retention/archival behavior (90 days) is acceptable. 5) If you want a stronger assurance, share the contents of setup.sh and the top-level network-related sections of reddit-miner.py and competitor-scraper.py for focused review—if those files include obfuscated code, undocumented endpoints, or credential exfiltration, my assessment would escalate.

Like a lobster shell, security has layers — review code before you run it.

latestvk97dt3p0tje18nt65k3sm6qxks83cz43

140downloads

0stars

1versions

Updated 1mo ago

v0.1.0

MIT-0

Customer Research & Validation Skill

Trigger conditions:

User asks to validate a product idea, persona, or market assumption
User mentions "customer research", "validate assumption", "talk to users"
User requests Reddit/forum mining, competitor analysis, or sentiment analysis
User wants to generate surveys or interview scripts
User asks about customer pain points, needs, or jobs-to-be-done

Purpose

Pre-pipeline validation for DaVinci Enterprises products. Ensures marketing strategy is built on real customer signal, not assumptions. Prevents building features nobody wants.

What It Does

Reddit/Forum Mining — Extract threads, comments, sentiment from subreddits and forums
Survey Generation — Convert research questions into structured surveys
Interview Scripts — Generate customer interview guides with probing questions
Persona Validation — Test persona assumptions against real user behavior
Competitor Review Scraping — Aggregate reviews from G2, Trustpilot, Reddit
Sentiment Analysis — Aggregate and score customer sentiment across sources

Usage

Quick Start

# Validate a product hypothesis via Reddit mining
scripts/reddit-miner.sh --subreddit "personalfinance" --query "FIRE calculator" --limit 50

# Generate a customer interview script
scripts/interview-generator.sh --persona "FIRE enthusiast" --problem "retirement planning tools"

# Scrape competitor reviews
scripts/competitor-scraper.sh --product "Personal Capital" --sources "g2,trustpilot,reddit"

Integration with Marketing Pipeline

This skill feeds into the content strategy workflow:

Discovery → Run customer research to identify pain points
Validation → Test persona assumptions against real data
Strategy → Build content pillars around validated needs
Execution → Ogilvy creates content targeting real customer language

Output format: JSON reports to data/research/ for downstream consumption.

Scripts

reddit-miner.sh

Fetch Reddit threads matching keywords, extract sentiment, output structured JSON.

Usage:

./scripts/reddit-miner.sh --subreddit SUBREDDIT --query "search terms" [--limit N] [--sentiment]

Output: data/research/reddit-{subreddit}-{timestamp}.json

interview-generator.sh

Generate customer interview script from persona + problem statement.

Usage:

./scripts/interview-generator.sh --persona "description" --problem "pain point"

Output: Markdown interview guide to stdout

competitor-scraper.sh

Aggregate reviews from multiple sources, extract themes and sentiment.

Usage:

./scripts/competitor-scraper.sh --product "Product Name" --sources "g2,trustpilot,reddit"

Output: data/research/competitor-{product}-{timestamp}.json

Output Schema

All scripts output to data/research/ with consistent JSON schema:

{
  "meta": {
    "skill": "customer-research",
    "script": "reddit-miner",
    "timestamp": "2026-03-22T00:43:00Z",
    "query": {...}
  },
  "findings": [
    {
      "source": "reddit",
      "source_id": "thread_abc123",
      "text": "I wish there was a FIRE calculator that...",
      "sentiment": 0.65,
      "themes": ["pain point", "feature request"],
      "metadata": {...}
    }
  ],
  "summary": {
    "total_sources": 47,
    "avg_sentiment": 0.42,
    "top_themes": ["complexity", "cost", "trust"],
    "key_insights": ["Users want transparency", "Price sensitivity high"]
  }
}

Dependencies

jq — JSON processing
curl — HTTP requests
Reddit API access (optional: can scrape public threads without auth)
OpenClaw LLM access for sentiment analysis

Example Workflow

Scenario: Validate demand for FIRE Sim product

Mine Reddit pain points:

./scripts/reddit-miner.sh --subreddit "financialindependence" \
  --query "retirement calculator problems" --limit 100 --sentiment

Scrape Personal Capital reviews:

./scripts/competitor-scraper.sh --product "Personal Capital" \
  --sources "g2,trustpilot,reddit"

Generate interview script:

./scripts/interview-generator.sh \
  --persona "30-40 tech worker, $200K income, aiming FIRE by 45" \
  --problem "existing retirement tools too conservative or too complex"

Analyze findings:
- Review JSON outputs in data/research/
- Identify recurring themes, pain points, language patterns
- Validate/invalidate persona assumptions
- Feed insights into content strategy
Document learnings:
- Update projects/davinci-enterprises/customer-insights.md
- Flag validated needs for product roadmap
- Inform Ogilvy content pillars with real customer language

Quality Gates

Minimum sample size: 30+ sources per research question
Sentiment confidence: Only report sentiment scores with >50 samples
Theme validation: Themes must appear in ≥3 independent sources
Source diversity: Mix Reddit, review sites, forums (not just one platform)

Anti-Patterns

❌ Don't:

Build features based on one Reddit comment
Cherry-pick data to confirm existing beliefs
Skip competitor analysis (reinventing the wheel wastes time)
Ignore negative sentiment (it's the most valuable signal)

✅ Do:

Let data challenge your assumptions
Track quotes verbatim (real customer language = gold for content)
Cross-reference findings across sources
Document what you disproved, not just what you confirmed

Integration Points

Content Strategy: Feed validated pain points to Ogilvy for pillar creation
Product Roadmap: Link research findings to JIRA/task tickets
Persona Database: Update persona definitions based on validation results
Marketing Copy: Extract customer language for landing pages, ads

Maintenance

Research data retention: 90 days (then archive to cold storage)
Re-run validation quarterly for active products
Update scripts when Reddit/review site APIs change
Log failed scrapes to logs/customer-research-errors.log

Next Steps After Running Research:

Review findings in data/research/
Update persona docs with validated/invalidated assumptions
Create content strategy tasks based on identified pain points
Schedule customer interviews if online research raises questions
Document learnings in project-specific CONTEXT.md

Comments

Loading comments...