Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Family Soul Analyzer

v0.1.0

Distills digital personas from family group-chat records (WeChat/WhatsApp/others). Outputs soul.md (the collective persona) plus a persona file for each member, ready to use as the personality foundation for an AI agent. Keywords: group-chat analysis, family personas, soul, persona, digital persona, chat logs, WeChat export, persona extraction.

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan

- VirusTotal: Suspicious
- OpenClaw: Suspicious (high confidence)
Purpose & Capability
Overall, the code and SKILL.md align with the described purpose (WeChat/CSV parsing → denoise → LLM extraction → synthesis). However, the repo includes multiple alternative LLM backends (Anthropic/Claude, Kimi/Moonshot/OpenClaw) beyond the single provider named in SKILL.md; this is plausible but expands the skill's network footprint. The package also bundles an actual exported chat JSON in data/raw/, which is unexpected for a reusable skill and raises privacy concerns.
Instruction Scope
SKILL.md instructs the agent to run the included pipeline scripts and to confirm that ANTHROPIC_API_KEY is set; the scripts then read user-provided chat files and transmit chunked chat text to third-party LLM APIs. This is consistent with the skill's purpose, but it means the instructions (and code) will upload sensitive family chat content to external services, an explicit privacy/data-exfiltration risk the user must accept. The skill's runtime also writes cached API responses to disk (raw_cache.jsonl), increasing the persistence of derived sensitive data.
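As a minimal sketch of the "confirm the key" step, a helper that reads the key from the environment and fails fast rather than falling back to anything embedded in the source (the function name `require_env_key` is illustrative, not taken from the skill's code):

```python
import os

def require_env_key(name: str) -> str:
    """Return the named environment variable, refusing to fall back to any
    key baked into the source. Raises if the variable is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; refusing to call the LLM API.")
    return value

# Usage before any network call:
#   api_key = require_env_key("ANTHROPIC_API_KEY")
```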
Install Mechanism
There is no install spec beyond a requirements.txt and the SKILL.md instructions; the skill is instruction-plus-code. That is lower risk than arbitrary remote downloads, but the presence of runnable Python scripts (and a requirements file) means the agent will execute code from this bundle. No external install URL is used, which avoids direct supply-chain download risks, but users should still vet requirements.txt and run the scripts in an isolated environment.
Credentials
Registry metadata lists no required environment variables, but SKILL.md and multiple scripts expect ANTHROPIC_API_KEY (Claude) and optionally KIMI_API_KEY / MOONSHOT_API_KEY. Worse, several scripts contain a hard-coded API key string (e.g. 'sk-kimi-Sgsy7YYJPrk...') and hard-coded base URLs. Hard-coded credentials in published code are a major red flag (they may be leaked, stale, or unauthorized), and the mismatch between the declared requirements and the actual environment requirements is an incoherence that should be surfaced to users.
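To audit the bundle yourself, a short sketch that scans the Python sources for key-like literals; the `sk-` prefix pattern is an assumption based on the strings the scan reported, so adjust it if your audit turns up other shapes of secret:

```python
import re
from pathlib import Path

# Matches literals such as 'sk-kimi-...' of the kind the scan flagged.
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")

def find_hardcoded_keys(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, matched literal) for every key-like string
    found in *.py files under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for match in KEY_PATTERN.finditer(line):
                hits.append((str(path), lineno, match.group()))
    return hits
```

Running it over the skill directory would produce the exact file/line list mentioned in the findings above.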
Persistence & Privilege
The skill writes outputs and caches to disk (soul.md, persona_*.md, data/observations/raw_cache.jsonl, observations.jsonl). That behavior is expected for a pipeline but means sensitive inputs and raw LLM responses are stored locally in the skill directory by default. The skill does not request elevated system privileges or set always:true, but the combination of autonomous invocation (normal default) plus persistent caches increases the blast radius if misconfigured.
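If you do run the pipeline, one way to limit that blast radius is to keep the cache out of the skill directory and purge it after each run. A sketch, assuming a hypothetical `SOUL_CACHE_DIR` override that the skill itself does not read:

```python
import os
import tempfile
from pathlib import Path

def cache_file() -> Path:
    """Resolve where raw LLM responses should be cached. Prefers the
    hypothetical SOUL_CACHE_DIR override, else a throwaway temp directory,
    so derived chat data never lands inside the skill directory."""
    base = os.environ.get("SOUL_CACHE_DIR") or tempfile.mkdtemp(prefix="soul_cache_")
    return Path(base) / "raw_cache.jsonl"

def purge(path: Path) -> None:
    """Delete the cache file after a run if local persistence is unwanted."""
    path.unlink(missing_ok=True)
```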
Scan Findings in Context
[hardcoded_api_key] unexpected: Multiple files include a hard-coded API key string (pipeline/03_extract_kimi_openclaw.py, pipeline/03_extract_kimi_v2.py, pipeline/03_extract_simple.py contain 'sk-kimi-...'). Even if these are placeholders, embedding secrets in code is unsafe and not necessary for the stated purpose.
[undeclared_env_vars] unexpected: Registry metadata lists no required environment variables, but SKILL.md and code require ANTHROPIC_API_KEY and optionally KIMI_API_KEY/MOONSHOT_API_KEY. This mismatch is an incoherence between declared requirements and actual runtime needs.
[sensitive_sample_data] unexpected: The repo bundles a real chat export under data/raw/ (群聊_修身,齐家,尝小烹@深圳.json). Packaging identifiable family chat transcripts with the skill is unexpected and introduces a privacy liability.
[third_party_api_calls] expected: The skill legitimately calls remote LLM APIs (Anthropic/Claude, Kimi/Moonshot/OpenAI-compatible endpoints) to perform extraction and synthesis; this is expected for its function but has privacy/cost implications.
What to consider before installing
Key things to consider before installing or running this skill:

- Privacy: the skill uploads chat content to external LLM APIs. Only run it on data you own or have explicit consent to process. The package itself includes an example exported chat JSON (data/raw/...), which may contain real people's messages; remove or inspect it before use.
- Credentials: SKILL.md and the scripts expect ANTHROPIC_API_KEY and optionally KIMI/MOONSHOT keys, but the registry entry lists none. Do not use any hard-coded API key found in the code; replace or remove hard-coded keys and set your own keys as environment variables.
- Hard-coded secrets: the repo contains embedded API-key-like strings. Treat them as compromised or unauthorized; remove them and audit where keys are used. Do not rely on those keys for production.
- Data persistence: the pipeline caches raw API responses (raw_cache.jsonl) and writes outputs to the skill directory. If you run it, use an isolated directory or container and clean the caches afterward if you do not want local persistence.
- Run safely: review requirements.txt and the Python scripts before execution. If possible, run first on synthetic/dummy data to verify behavior and network calls. Consider running in a sandboxed environment (container) and monitor outbound network requests.
- Consent and legality: extracting "personas" from family chat may implicate privacy laws or consent obligations; ensure you have permission from the chat participants.
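The synthetic/dummy-data step above can be sketched as a small generator. The message schema here is invented for illustration and will almost certainly differ from a real WeChat export:

```python
import json
from pathlib import Path

def write_dummy_export(path: str, n_messages: int = 6) -> list[dict]:
    """Write a synthetic group-chat export for a smoke-test run, so the
    pipeline's behavior and network calls can be observed without exposing
    any real family messages."""
    messages = [
        {
            "sender": f"member_{i % 3}",          # fake participants
            "text": f"placeholder message {i}",   # no real content
            "timestamp": 1_700_000_000 + 60 * i,  # one minute apart
        }
        for i in range(n_messages)
    ]
    Path(path).write_text(json.dumps(messages, ensure_ascii=False, indent=2))
    return messages
```

Point the pipeline at the generated file first and watch what it reads, writes, and sends over the network before trusting it with a real transcript.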

Like a lobster shell, security has layers — review code before you run it.

latest: vk972g4fdjwc7c6kg7605j79xcd83g51c

