Install
openclaw skills install @yangjinghua0127/url2podcast
Input a link; output a playable two-host podcast audio file.
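TTS endpoint: https://openspeech.bytedance.com/api/v1/tts
Cluster: volcano_tts
Host voice: zh_female_qingxinnvsheng_uranus_bigtts (小何, Xiao He)
Guest voice: zh_male_taocheng_uranus_bigtts (小天, Xiao Tian)
Workflow:
1) Distill source_brief.json
2) Generate podcast_content.md
3) Generate podcast_script.json (emotion filled in from context)
4) Call TTS sentence by sentence over lines[], generating chunks/*.wav
5) Build chunks/list.txt and merge in order
6) Export podcast_final.mp3 and send it
Generate source_brief.json first, so the copy is not written directly as vague filler: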
{
  "topic": "topic",
  "core_claims": ["3-5 core claims"],
  "key_facts": ["3-8 facts that can be restated"],
  "debate_points": ["1-3 points of disagreement worth discussing"],
  "actions": ["2-4 actionable suggestions"]
}
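The item counts in the template above can be checked mechanically before any copy is written. The following is a minimal sketch, not part of the skill itself: the function name `check_source_brief` and the `LIST_RANGES` table are hypothetical, and the ranges are read directly from the placeholder text in the template.

```python
from __future__ import annotations

# Hypothetical helper: the ranges mirror the placeholder text in the template.
LIST_RANGES = {
    "core_claims": (3, 5),
    "key_facts": (3, 8),
    "debate_points": (1, 3),
    "actions": (2, 4),
}


def check_source_brief(brief: dict) -> list[str]:
    """Return a list of problems; an empty list means the brief passes."""
    problems = []
    topic = brief.get("topic")
    if not isinstance(topic, str) or not topic.strip():
        problems.append("topic must be a non-empty string")
    for field, (low, high) in LIST_RANGES.items():
        items = brief.get(field)
        if not isinstance(items, list):
            problems.append(f"{field} must be a list")
            continue
        if not low <= len(items) <= high:
            problems.append(f"{field} needs {low}-{high} items, got {len(items)}")
        if any(not isinstance(x, str) or not x.strip() for x in items):
            problems.append(f"{field} contains an empty or non-string item")
    return problems
```

A check like this only covers shape. Whether core_claims actually match the source text still needs a human or model review.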
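Requirements:
- core_claims must be consistent with the source text; do not invent claims
- key_facts should be as specific as possible, with few vague words
- podcast_content.md must cover most of the core_claims
Generate podcast_content.md first; its structure must include at least:
Requirements:
Score podcast_content.md on a scale of 1-10:
Rules:
- <8: must be rewritten
The script stage must output a JSON file, podcast_script.json: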
{
  "title": "Podcast title",
  "summary": "One-sentence summary",
  "lines": [
    {
      "idx": 1,
      "speaker": "host",
      "text": "Welcome to the show. Today we start with the core conclusion.",
      "voice_type": "zh_female_qingxinnvsheng_uranus_bigtts",
      "emotion": "neutral",
      "speed_ratio": 0.95
    },
    {
      "idx": 2,
      "speaker": "guest",
      "text": "Sure, let me add two key reasons.",
      "voice_type": "zh_male_taocheng_uranus_bigtts",
      "emotion": "neutral",
      "speed_ratio": 0.95
    }
  ]
}
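- idx must be increasing and consecutive
- speaker may only be host or guest
- speaker=host must map to zh_female_qingxinnvsheng_uranus_bigtts
- speaker=guest must map to zh_male_taocheng_uranus_bigtts
- text must not carry a role prefix such as "主持人:" (host) or "嘉宾:" (guest)
- Recommended emotion values: neutral / happy / sad / serious / excited
- An unrecognized emotion falls back to neutral automatically
- Recommended speed_ratio range: 0.90-1.05
Emotion values by context: happy; neutral; serious or sad; excited; neutral (default)

You are a podcast script generator. Generate JSON based on podcast_content.md and output no other text.
Requirements:
1) The output structure must follow the podcast_script.json contract (title/summary/lines)
2) Every entry in lines must have idx/speaker/text/voice_type/emotion/speed_ratio
3) Fixed host voice: zh_female_qingxinnvsheng_uranus_bigtts
4) Fixed guest voice: zh_male_taocheng_uranus_bigtts
5) text must not contain a role prefix (for example "主持人:")
6) emotion must follow the context: continuation, transition, emphasis, and risk warnings each get a different emotion
7) Choose emotion from `neutral/happy/sad/serious/excited` first; when unsure, use `neutral`
8) Default speed_ratio=0.95
9) Never output explanatory text outside the script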
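The contract rules above lend themselves to a validation pass on the model output before any TTS call is made. The sketch below is one possible shape for it and is not taken from the skill: `validate_script`, `ROLE_PREFIXES`, and the choice to treat an out-of-range speed_ratio as a warning rather than an error are all assumptions.

```python
from __future__ import annotations

VOICE_BY_SPEAKER = {
    "host": "zh_female_qingxinnvsheng_uranus_bigtts",
    "guest": "zh_male_taocheng_uranus_bigtts",
}
EMOTIONS = {"neutral", "happy", "sad", "serious", "excited"}
REQUIRED = {"idx", "speaker", "text", "voice_type", "emotion", "speed_ratio"}
# Assumption: both full-width and ASCII colons count as a role prefix.
ROLE_PREFIXES = ("主持人:", "主持人:", "嘉宾:", "嘉宾:")


def validate_script(script: dict) -> list[str]:
    """Raise ValueError on contract violations; return soft warnings.

    Unrecognized emotions are replaced with "neutral" in place.
    """
    for key in ("title", "summary", "lines"):
        if key not in script:
            raise ValueError(f"missing top-level field: {key}")
    warnings = []
    previous_idx = None
    for line in script["lines"]:
        missing = REQUIRED - set(line)
        if missing:
            raise ValueError(f"line is missing fields: {sorted(missing)}")
        idx = line["idx"]
        if previous_idx is not None and idx != previous_idx + 1:
            raise ValueError(f"idx must be consecutive: {previous_idx} then {idx}")
        previous_idx = idx
        speaker = line["speaker"]
        if speaker not in VOICE_BY_SPEAKER:
            raise ValueError(f"line {idx}: speaker must be host or guest")
        if line["voice_type"] != VOICE_BY_SPEAKER[speaker]:
            raise ValueError(f"line {idx}: wrong voice_type for {speaker}")
        if line["text"].startswith(ROLE_PREFIXES):
            raise ValueError(f"line {idx}: text carries a role prefix")
        if line["emotion"] not in EMOTIONS:
            warnings.append(f"line {idx}: emotion {line['emotion']!r} replaced with neutral")
            line["emotion"] = "neutral"
        if not 0.90 <= line["speed_ratio"] <= 1.05:
            warnings.append(f"line {idx}: speed_ratio outside 0.90-1.05")
    return warnings
```

Hard failures are raised so that a malformed script never reaches synthesis, while the two "recommended" rules only produce warnings.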
Environment variables (required):
export VOLC_APPID="your AppID"
export VOLC_TOKEN="your AccessToken"
Single-sentence request template (values taken from lines[i]):
curl -sS "https://openspeech.bytedance.com/api/v1/tts" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer;${VOLC_TOKEN}" \
-d "{
\"app\": {\"appid\": \"${VOLC_APPID}\", \"token\": \"${VOLC_TOKEN}\", \"cluster\": \"volcano_tts\"},
\"user\": {\"uid\": \"podcast-maker\"},
\"audio\": {\"voice_type\": \"${VOICE_TYPE}\", \"encoding\": \"wav\", \"speed_ratio\": ${SPEED_RATIO}, \"emotion\": \"${EMOTION}\"},
\"request\": {\"reqid\": \"${REQID}\", \"text\": \"${TEXT}\", \"text_type\": \"plain\", \"operation\": \"query\"}
}" > "${OUT_JSON}"
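Interpolating `${TEXT}` straight into a shell string produces invalid JSON as soon as a sentence contains a double quote or a backslash. One way around that, sketched here under assumptions, is to serialize the body with a JSON library and hand the result to curl from a file. `build_tts_payload` is a hypothetical helper, and using a UUID for reqid assumes the service accepts any unique string.

```python
import json
import uuid


def build_tts_payload(line: dict, appid: str, token: str, uid: str = "podcast-maker") -> str:
    """Serialize one script line into the request body used by the curl template."""
    body = {
        "app": {"appid": appid, "token": token, "cluster": "volcano_tts"},
        "user": {"uid": uid},
        "audio": {
            "voice_type": line["voice_type"],
            "encoding": "wav",
            "speed_ratio": line["speed_ratio"],
            "emotion": line["emotion"],
        },
        "request": {
            # Assumption: any unique string is acceptable as reqid.
            "reqid": str(uuid.uuid4()),
            "text": line["text"],
            "text_type": "plain",
            "operation": "query",
        },
    }
    # json.dumps escapes quotes and backslashes; ensure_ascii=False keeps Chinese text readable.
    return json.dumps(body, ensure_ascii=False)
```

The returned string can be written to a file such as payload.json and sent with `curl --data-binary @payload.json`, keeping the same headers as the template.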
Decode data (base64) into a wav file:
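python3 - "$OUT_JSON" "$OUT_WAV" <<'PY'
import base64, json, sys

# With "python3 -", sys.argv[0] is "-" and the script arguments follow it.
src, dst = sys.argv[1], sys.argv[2]
with open(src, "r", encoding="utf-8") as f:
    data = json.load(f)
b64 = data.get("data")
if not b64:
    # No audio payload: surface the full response (code/message) for debugging.
    raise SystemExit(f"TTS failed: {data}")
with open(dst, "wb") as f:
    f.write(base64.b64decode(b64))
PY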
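# ffmpeg's concat demuxer resolves relative entries against the directory of list.txt, so list bare file names.
(cd chunks && ls *.wav | sort | sed "s/^/file '/;s/$/'/" > list.txt)
ffmpeg -f concat -safe 0 -i chunks/list.txt -c copy podcast_final.wav
ffmpeg -i podcast_final.wav -codec:a libmp3lame -b:a 192k podcast_final.mp3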
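The merge order depends entirely on how the chunk files sort. A lexicographic sort puts `10.wav` before `2.wav`, so zero-padded names are the safer choice. The sketch below shows one way to name chunks and write the concat list; the `0001_host.wav` naming pattern and both function names are assumptions, since the skill does not state how chunks are named. It writes bare file names because the concat demuxer resolves relative paths against the location of the list file.

```python
from __future__ import annotations

from pathlib import Path


def chunk_name(idx: int, speaker: str) -> str:
    """Zero-pad idx so that lexicographic order equals playback order."""
    return f"{idx:04d}_{speaker}.wav"


def write_concat_list(chunks_dir: str) -> list[str]:
    """Write chunks_dir/list.txt for ffmpeg's concat demuxer and return the ordered names."""
    directory = Path(chunks_dir)
    names = sorted(p.name for p in directory.glob("*.wav"))
    lines = [f"file '{name}'" for name in names]
    (directory / "list.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return names
```

With four-digit padding the ordering holds up to 9999 lines, which is well beyond a typical episode.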
// 1) Distill source_brief.json first
const brief = await llm_generate_source_brief(content);
writeFile("source_brief.json", brief);
// 2) Generate the full dialogue draft from source_brief
let contentDraft = await llm_generate_podcast_content(content, brief);
writeFile("podcast_content.md", contentDraft);
// 3) Score the draft; rewrite if it falls short (at most 2 rounds)
for (let i = 0; i < 2; i++) {
  const score = await llm_score_content(contentDraft);
  if (score.min >= 8) break;
  contentDraft = await llm_rewrite_content(contentDraft, score.issues);
  writeFile("podcast_content.md", contentDraft);
}
// 4) Generate the structured script from the dialogue draft
const script = await llm_generate_script_json(contentDraft);
writeFile("podcast_script.json", script);
// 5) Read back and validate the script
const parsed = JSON.parse(readFile("podcast_script.json"));
validate(parsed);
// 6) Synthesize sentence by sentence
for (const line of parsed.lines) {
  const wav = await volcTts(line.text, line.voice_type, line.emotion, line.speed_ratio);
  saveToChunks(line.idx, line.speaker, wav);
}
// 7) Merge in idx order and export mp3
mergeChunksToMp3();
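- 401: check Authorization: Bearer;${VOLC_TOKEN}; the semicolon must not be omitted
- No data in the response: inspect the returned code/message; the usual cause is a token or voice_type mismatch
- Change emotion to neutral and retry
- Adjust speed_ratio to 0.92-0.98
- Confirm that chunks/*.wav are playable, then check chunks/list.txt
Output files:
- source_brief.json
- podcast_content.md
- podcast_script.json
- podcast_final.mp3 (or podcast_final.wav)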
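The advice to retry with emotion set to neutral can be automated per sentence. The following is a minimal sketch under assumptions: `synthesize_with_fallback` is a hypothetical wrapper, and `tts` stands for any callable that sends one line and returns the parsed JSON response with the `data`, `code`, and `message` fields mentioned in this document.

```python
from typing import Callable


def synthesize_with_fallback(line: dict, tts: Callable[[dict], dict]) -> dict:
    """Call tts(line); if the response has no data, retry once with emotion=neutral."""
    response = tts(line)
    if response.get("data"):
        return response
    if line.get("emotion") != "neutral":
        # Retry on a copy so the original script line is left untouched.
        retry = tts({**line, "emotion": "neutral"})
        if retry.get("data"):
            return retry
        response = retry
    raise RuntimeError(
        f"TTS failed: code={response.get('code')} message={response.get('message')}"
    )
```

Only one retry is attempted. A failure that survives the neutral retry is more likely a token or voice_type problem, which no emotion change will fix.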