Chinese Bedtime Story Generator
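Generates multi-character Chinese bedtime stories and synthesizes them to speech. Intended for users who want a personalized bedtime story for their child: given the child's name, age, and interests, an LLM creates a complete world and cast of characters and generates the story as segments (each labeled with its speaker); TTS then synthesizes the narrator, protagonist, sidekick, and elder in distinct voices, and the segments are joined into a single MP3 audio file. Supports a serial mode (`--continue`) across multiple...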
MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name and description claim LLM + TTS multi-role story generation, and the code does call an LLM and a TTS service and save story_state and MP3 outputs, so purpose and capability are generally aligned. However, the registry metadata lists no required environment variables or primary credential, while the code requires STORY_LLM_API_KEY (or IME_MODEL_API_KEY) and STORY_TTS_API_KEY (or SENSEAUDIO_API_KEY). This mismatch between declared requirements and actual code is an inconsistency.
Instruction Scope
SKILL.md and scripts limit behavior to generating story text, calling an LLM, calling a TTS API, persisting story_state.json and output files. They do not request unrelated files or broad system reads. Two issues to note: (1) the instructions/code claim MP3 frames can be 'directly binary appended' to produce a single MP3 without ffmpeg — this is generally unreliable and likely to produce invalid/uncertain audio files, not a security risk but an implementation bug; (2) prompts/behavior rely on saving and loading story_state.json for continuation, which is expected but means persisted state contains child name/age/interests and story summaries.
Install Mechanism
No install spec; this is effectively instruction + Python script. A requirements.txt lists openai, requests, python-dotenv which is proportionate to contacting LLM/TTS and loading .env. No arbitrary remote downloads or install-time code execution were specified.
Credentials
Although TTS/LLM API keys are appropriate for the stated functionality, the skill metadata declares no required env vars while the code requires STORY_LLM_API_KEY/IME_MODEL_API_KEY and STORY_TTS_API_KEY/SENSEAUDIO_API_KEY at runtime. Defaults point to third-party endpoints (models.audiozen.cn and api.senseaudio.cn). This is problematic because (a) the registry did not disclose credential needs, and (b) the default endpoints are non-obvious third-party services — you should verify those providers and their privacy/policy before supplying keys. The code will raise and exit if keys are missing when those functions run.
Persistence & Privilege
The skill does not request elevated privileges or permanent installation. It writes its output files and a story_state.json into an 'outputs' directory under the skill. It is declared with always: false and makes no modifications to other skills or system-wide settings. This file-based persistence is expected for a continuation feature.
What to consider before installing
This skill mostly does what it says (use an LLM + TTS to build MP3 bedtime stories), but there are some red flags to check before installing or providing secrets:
1) The registry/metadata claims no environment variables, but the script requires STORY_LLM_API_KEY (or IME_MODEL_API_KEY) and STORY_TTS_API_KEY (or SENSEAUDIO_API_KEY). Do not provide real API keys until you verify the endpoints and provider reputation.
2) The code defaults to models.audiozen.cn and api.senseaudio.cn. Confirm these are legitimate services, read their privacy/policy, and consider using your own known LLM/TTS endpoints by setting the STORY_LLM_BASE_URL / STORY_TTS_URL env vars.
3) Test the skill in a sandbox or with dummy keys and use --no-tts to exercise only text-generation first.
4) Be aware the MP3 concatenation is implemented by naive binary append (the README asserts 'no ffmpeg needed'); that may produce broken files, so test output integrity before trusting it for playback.
5) Inspect the code yourself (or ask the author) if you plan to store real child data; story_state.json will persist child_name/age/interests and summaries locally.
If you need help verifying the remote providers or altering the script to use a different TTS/LLM endpoint, consider that before entering secrets.
Like a lobster shell, security has layers: review code before you run it.
Current version: v1.0.0
Tags: audio, children, chinese, latest, story, tts
SKILL.md
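Chinese Bedtime Story Generator
Scope
This Skill generates playable multi-character Chinese bedtime story audio.
Capabilities:
- Relies on an LLM to generate the story text and on TTS to synthesize multi-character speech
- Supports customizing the child's name, age, and interests
- Supports serial mode, keeping the world and plot coherent across sessions
- Outputs a complete MP3 audio file, the story text, and a state file
Out of scope:
- Real-time voice conversation or ASR
- Video or animation generation
- English stories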
Default configuration
- child_name: 小朋友 ("little one")
- age: 5
- interests: 冒险,动物 (adventure, animals)
- episodes: 1 (single episode)
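The defaults above, together with the flags used in the run commands, could be wired up roughly as follows. This is a minimal sketch and not the actual contents of scripts/run_story.py: the function names `build_parser` and `parse_interests`, the `continue_story` attribute name, and the handling of full-width commas are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser mirroring the documented flags and defaults (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Chinese bedtime story generator (sketch)")
    parser.add_argument("--child-name", default="小朋友", help="child's name")
    parser.add_argument("--age", type=int, default=5, help="child's age")
    parser.add_argument("--interests", default="冒险,动物", help="comma-separated interests")
    parser.add_argument("--episodes", type=int, default=1, help="number of episodes to generate")
    # "continue" is a Python keyword, so the value is stored under a different attribute name.
    parser.add_argument("--continue", dest="continue_story", action="store_true",
                        help="resume from story_state.json")
    parser.add_argument("--no-tts", action="store_true", help="generate text only")
    return parser


def parse_interests(raw: str) -> list[str]:
    """Split interests on ASCII or full-width commas and drop empty items."""
    return [item.strip() for item in raw.replace(",", ",").split(",") if item.strip()]


# Passing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--child-name", "小明", "--interests", "恐龙,太空"])
print(args.child_name, args.age, parse_interests(args.interests), args.episodes)
```

Accepting both comma styles is a convenience choice here, since the default value uses a full-width comma while the command-line examples use an ASCII one.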
Voice assignment
| Role | voice_id | speed | pitch |
|---|---|---|---|
| narrator | male_0004_a | 0.9 | 0 |
| protagonist | child_0001_a | 1.0 | 0 |
| sidekick | child_0001_b | 1.0 | 0 |
| elder | male_0018_a | 0.85 | -2 |
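One way to represent this table in code is a small lookup keyed by speaker role. The sketch below copies the values from the table; the `Voice` dataclass, the `VOICES` dict, the `voice_for` helper, and the fallback to the narrator voice for an unknown speaker are illustrative assumptions, not something the skill is known to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    voice_id: str
    speed: float
    pitch: int


# Values taken from the voice assignment table.
VOICES = {
    "narrator": Voice("male_0004_a", 0.9, 0),
    "protagonist": Voice("child_0001_a", 1.0, 0),
    "sidekick": Voice("child_0001_b", 1.0, 0),
    "elder": Voice("male_0018_a", 0.85, -2),
}


def voice_for(speaker: str) -> Voice:
    """Return the voice parameters for a segment's speaker.

    Falling back to the narrator for unrecognized speakers is an assumption,
    chosen so that an unexpected LLM label does not abort synthesis.
    """
    return VOICES.get(speaker, VOICES["narrator"])


print(voice_for("elder"))
```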
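Workflow
1. Initialization
   - Read the --child-name, --age, and --interests arguments
   - If --continue is set, load the existing world and characters from story_state.json
2. World and character creation (first run only)
   - The LLM generates the world name, background setting, and 4 characters (narrator / protagonist / sidekick / elder)
   - The protagonist's name defaults to the child's name
3. Story generation
   - The LLM generates 12-20 segments, each labeled with a speaker and text
   - In serial mode, the previous episode's summary is passed in to keep the plot coherent
4. Multi-character TTS synthesis
   - For each segment, select the voice parameters that match its speaker
   - Synthesize the audio for each segment
5. Audio concatenation
   - MP3 frames are independently decodable, so segments are joined by direct binary append
   - No ffmpeg required
6. Save outputs
   - story_state.json: world + characters + plot summary
   - story_ep{N}.txt: story text
   - story_ep{N}.mp3: complete audio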
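The concatenation and save steps can be sketched as below. This is an illustration under stated assumptions rather than the skill's actual code: the helper names `concat_mp3_segments` and `save_episode` are hypothetical, and the byte strings stand in for real TTS responses. Plain byte concatenation tends to work best when every segment shares the same sample rate, bitrate, and channel layout and carries no per-file tags; even then some players may report an incorrect duration, so it is worth checking the output before relying on it.

```python
from pathlib import Path


def concat_mp3_segments(segments: list[bytes]) -> bytes:
    """Join per-segment MP3 payloads by direct binary append, skipping empty ones."""
    return b"".join(segment for segment in segments if segment)


def save_episode(out_dir: Path, episode: int, text: str, audio: bytes) -> tuple[Path, Path]:
    """Write story_ep{N}.txt and story_ep{N}.mp3 into out_dir and return both paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    txt_path = out_dir / f"story_ep{episode}.txt"
    mp3_path = out_dir / f"story_ep{episode}.mp3"
    txt_path.write_text(text, encoding="utf-8")
    mp3_path.write_bytes(audio)
    return txt_path, mp3_path


# Placeholder bytes only; real input would be the MP3 data returned by the TTS service.
audio = concat_mp3_segments([b"\xff\xfbsegment-1", b"", b"\xff\xfbsegment-2"])
print(len(audio))
```

If playback problems show up, re-encoding the joined file with a dedicated audio tool is the usual alternative, at the cost of the extra dependency the skill avoids.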
Prompt modules
Data structures
See references/state_schema_cn.md for details.
Run directly
pip install -r requirements.txt
# First run
python scripts/run_story.py --child-name "小明" --age 5 --interests "恐龙,太空"
# Continue the serial
python scripts/run_story.py --continue
# Skip TTS and output text only
python scripts/run_story.py --child-name "小红" --age 7 --interests "魔法,精灵" --no-tts
# Generate multiple episodes
python scripts/run_story.py --child-name "小明" --age 5 --interests "恐龙,太空" --episodes 3
Environment variable reference: .env.example
Interface conventions:
- The LLM reads STORY_LLM_API_KEY, falling back to IME_MODEL_API_KEY
- TTS reads STORY_TTS_API_KEY, falling back to SENSEAUDIO_API_KEY
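The fallback order described above can be expressed as a small lookup helper. This is a sketch assuming the keys are read from the process environment; `resolve_key`, `llm_api_key`, and `tts_api_key` are hypothetical names, and the exact error raised by the real script is not known.

```python
import os
from typing import Mapping, Optional


def resolve_key(primary: str, fallback: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty value among the primary and fallback variables."""
    source = os.environ if env is None else env
    for name in (primary, fallback):
        value = source.get(name, "").strip()
        if value:
            return value
    # Raising on a missing key is an assumption about the desired behavior.
    raise RuntimeError(f"Missing API key: set {primary} (or {fallback})")


def llm_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    return resolve_key("STORY_LLM_API_KEY", "IME_MODEL_API_KEY", env)


def tts_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    return resolve_key("STORY_TTS_API_KEY", "SENSEAUDIO_API_KEY", env)


# Passing a dict instead of os.environ makes the lookup easy to try with dummy values.
print(llm_api_key({"IME_MODEL_API_KEY": "dummy-llm-key"}))
```

Taking the environment as an optional parameter keeps the helper testable with dummy keys, which fits the advice to avoid supplying real credentials until the endpoints have been verified.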
Files
7 total
