tencent-tts-podcast
v1.0.0
Convert text to podcast audio using Tencent Cloud TTS. Supports both short and long text processing, generates up to 30-minute long audio with automatic chun...
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name/description (Tencent TTS → podcast WAV) aligns with the included code (tts_podcast.py + tts_tool.py) and the listed dependencies. However, the registry metadata declares no required environment variables, while SKILL.md and the code require Tencent Cloud credentials (secret_id and secret_key) as parameters or via environment variables. This mismatch should be resolved or clarified.
Instruction Scope
SKILL.md and the example scripts confine actions to: chunking text, calling Tencent TTS API, optionally uploading to COS, and returning/encoding the generated WAV. The code does exactly that. The concern: tts_tool.py attempts to import core.tts_config.get_tts_credentials if available — that means, in some agent runtimes, the skill will call into a platform 'core' module to retrieve credentials from the host environment without that behavior being described in SKILL.md. That is an unexpected expansion of scope and could access stored platform secrets.
Install Mechanism
No install spec is provided (instruction-only), and dependency requirements are listed in requirements.txt. No remote downloads or install scripts that fetch arbitrary code were found. Risk is limited to installing the listed Python packages when you choose to run it locally.
Credentials
The skill legitimately needs Tencent Cloud credentials (SecretId/SecretKey) and optionally COS credentials for uploads; those are proportionate to TTS and uploading outputs. But the registry metadata did not declare required env vars while SKILL.md and code expect credentials as input parameters or environment variables (TENCENT_TTS_SECRET_ID/TENCENT_TTS_SECRET_KEY). Also default COS bucket/app_id values are present (bucket_name: 'ti-aoi', app_id: 1257195185) — if upload is enabled and the user leaves defaults, outputs could be uploaded to an account you did not configure. These mismatches should be noted.
Persistence & Privilege
The skill does not request permanent 'always' inclusion and does not modify other skills or system-wide settings. It does attempt to import a 'core' config helper (core.tts_config) which, if present, may read platform-managed credentials — this is not a persistence privilege itself but is an elevated attempt to access platform-side secrets and should be considered when installing.
What to consider before installing
What to check before installing/running:
- Expect to provide Tencent Cloud SecretId and SecretKey (or set environment variables TENCENT_TTS_SECRET_ID/TENCENT_TTS_SECRET_KEY). The registry metadata did not list these — confirm where you will supply them.
- Review the code yourself (tts_podcast.py and tts_tool.py) before supplying credentials. tts_tool will try to import core.tts_config.get_tts_credentials if available; if your agent/runtime has a core module it may return platform-level secrets. If you don't want that, remove/override that import or ensure the core module won't expose secrets.
- If enabling COS upload, check the default bucket_name ('ti-aoi') and app_id (1257195185). By default upload is disabled, but if enabled and you leave defaults, files might be uploaded to someone else's COS account. Prefer supplying your own COS credentials and bucket.
- Run with least-privilege credentials: create an API key limited to TTS (and COS only if you enable upload) rather than using broad account keys.
- The skill produces WAV files only (no MP3), and dependencies listed are standard (tencentcloud-sdk-python, cos-python-sdk-v5, requests). Install dependencies in a virtualenv or sandbox.
- If you want higher assurance, ask the skill author to remove the implicit core.tts_config import or to document when/why it will access platform secrets and to update registry metadata to list required env vars.
Like a lobster shell, security has layers: review code before you run it.
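One way to act on the checklist above is to resolve credentials yourself and pass them in explicitly, so nothing depends on an implicit core.tts_config lookup. The sketch below is illustrative only: `resolve_credentials` and `check_cos_target` are hypothetical helpers that are not part of the skill, and the only names taken from the page are the environment variables, the parameter names, and the default bucket_name/app_id values.

```python
import os

# Defaults as listed on this page; uploading to them is almost certainly unintended.
DEFAULT_BUCKET = "ti-aoi"
DEFAULT_APP_ID = 1257195185


def resolve_credentials(params, env=None):
    """Return (secret_id, secret_key) from explicit params, then env vars.

    Hypothetical helper: it never imports core.tts_config, so
    platform-managed secrets are not consulted.
    """
    env = os.environ if env is None else env
    secret_id = params.get("secret_id") or env.get("TENCENT_TTS_SECRET_ID")
    secret_key = params.get("secret_key") or env.get("TENCENT_TTS_SECRET_KEY")
    if not secret_id or not secret_key:
        raise ValueError("Tencent Cloud SecretId/SecretKey not provided")
    return secret_id, secret_key


def check_cos_target(params):
    """Raise if upload is enabled but bucket_name/app_id are still the defaults."""
    if not params.get("upload_cos", False):
        return  # upload disabled: nothing to check
    bucket = params.get("bucket_name", DEFAULT_BUCKET)
    app_id = params.get("app_id", DEFAULT_APP_ID)
    if bucket == DEFAULT_BUCKET or app_id == DEFAULT_APP_ID:
        raise ValueError("upload_cos is enabled but bucket_name/app_id are defaults")
```

Calling both helpers before handing the parameters to the skill makes a missing key or an unconfigured COS target fail early instead of silently.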
Tencent TTS Podcast Generator
Convert text content to podcast audio files using Tencent Cloud TTS service.
Capabilities
What This Skill Can Do
- Short & Long Text Compatible: Intelligently detects text length, processes short text directly, auto-chunks long text
- Long Text to Speech: Supports generating podcasts up to 30 minutes long (~7200 characters)
- Concurrent Processing: Long texts are automatically split and processed in parallel for faster generation
- 26 Voices: Supports basic, featured, customer service, and Tencent featured voices
- Smart Chunking: Splits text at semantic boundaries (paragraph/sentence) for natural audio flow
- Duration Estimation: Automatically estimates generated audio duration
- Auto Retry: Automatically retries failed requests to improve success rate
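The duration estimate can be approximated from the ratio quoted above (30 minutes for roughly 7200 characters, or about 4 characters per second). The following is a minimal sketch under that assumption; `estimate_duration_seconds` is a hypothetical name, and the skill's actual estimator may use a different rate or account for the chosen voice.

```python
# Assumed speaking rate, derived only from "30 minutes ≈ 7200 characters".
CHARS_PER_SECOND = 7200 / (30 * 60)  # 4.0


def estimate_duration_seconds(text):
    """Estimate audio length in seconds, ignoring whitespace characters."""
    spoken_chars = len("".join(text.split()))
    return spoken_chars / CHARS_PER_SECOND
```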
Short & Long Text Processing Strategy
Note: Tencent Cloud TTS single request limit is ~150 characters. Texts exceeding this will be auto-chunked.
| Text Type | Length Range | Processing Method | Concurrency | Timeout |
|---|---|---|---|---|
| Ultra Short | ≤50 chars | Direct request | 1 | 30s |
| Short | 51-150 chars | Direct request | 1 | 30s |
| Medium | 151-500 chars | Auto-chunk (2-4 chunks) | 2-3 | 60s |
| Long | 501-2000 chars | Auto-chunk (4-14 chunks) | 3-5 | 60s |
| Extra Long | 2001-7200 chars | Auto-chunk (14-50 chunks) | 3-5 | 60s |
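The table can be read as a small decision function. The sketch below is one possible reading, not the skill's actual code: `plan_request` is a hypothetical helper, it picks the upper end of each concurrency range, and it assumes the default chunk_size of 140.

```python
def plan_request(text, chunk_size=140):
    """Pick chunk count, worker count, and timeout from the text length."""
    length = len(text)
    if length <= 150:
        # Ultra short / short: one direct request.
        return {"chunks": 1, "max_workers": 1, "timeout": 30}
    chunks = -(-length // chunk_size)  # ceiling division
    # Assumption: use the top of the table's concurrency range.
    workers = 3 if length <= 500 else 5
    return {"chunks": chunks, "max_workers": min(workers, chunks), "timeout": 60}
```

Capping the worker count at the number of chunks avoids starting threads that would have nothing to do.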
What This Skill Does NOT Do
- Does not generate MP3 output (WAV only)
- Does not support background music or sound effects
- Does not auto-generate podcast scripts (user must provide)
- Does not support dual-speaker dialogue mode (single voice only)
File Structure
This Skill consists of the following files:
- tts_podcast.py: Main entry script
  - Tencent Cloud TTS signature generation
  - Audio file generation
  - COS upload functionality
- tts_tool.py: AgentScope tool interface wrapper
- SKILL.md: This file, describing Skill capabilities, boundaries, and usage conventions
- requirements.txt: Python dependency configuration
Input & Output Specifications
Input Parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| Text | Text content to convert | Yes | - |
| VoiceType | Voice ID (see voice table below; provide either this or VoiceName) | No | 502006 |
| VoiceName | Voice name (see voice table below; provide either this or VoiceType) | No | - |
| secret_id | Tencent Cloud SecretId | Yes | - |
| secret_key | Tencent Cloud SecretKey | Yes | - |
| max_workers | Concurrent threads (3-5 for long text, 1 for short) | No | 3 |
| chunk_size | Chunk size in characters (long text optimization) | No | 140 |
| timeout | Request timeout in seconds | No | 30 (short text) / 60 (long text) |
| enable_retry | Enable automatic retry | No | true |
| max_retries | Max retry attempts | No | 2 |
| preserve_paragraphs | Preserve paragraph boundaries when chunking | No | true |
| cos_secret_id | Tencent Cloud COS SecretId (optional, defaults to TTS credentials) | No | - |
| cos_secret_key | Tencent Cloud COS SecretKey (optional, defaults to TTS credentials) | No | - |
| upload_cos | Whether to upload to COS (true/false); when false, output is local only | No | false |
| bucket_name | COS Bucket name | No | ti-aoi |
| app_id | COS App ID | No | 1257195185 |
| region | COS region | No | ap-guangzhou |
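To illustrate how enable_retry and max_retries might interact, here is a minimal sketch of a retry wrapper. `call_with_retry` and its `delay` argument are hypothetical; the page does not say which errors the skill retries on or whether it waits between attempts, so the linear back-off here is an assumption.

```python
import time


def call_with_retry(func, max_retries=2, enable_retry=True, delay=0.5):
    """Call func(); on failure retry up to max_retries times, then re-raise."""
    attempts = 1 + (max_retries if enable_retry else 0)
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay * (attempt + 1))  # assumed linear back-off
    raise last_error
```

With the defaults above, one request is attempted at most three times: the initial call plus two retries.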
Output
{
"Code": 0,
"Msg": "success",
"AudioUrl": "https://xxx.cos.ap-guangzhou.myqcloud.com/xxx.wav"
}
Usage
Environment Requirements
- Python 3.8+
- tencentcloud-sdk-python
- cos-python-sdk-v5
- requests
Install Dependencies
pip install -r requirements.txt
Basic Usage
from tts_podcast import main
result = main({
"Text": "Hello, welcome to today's podcast.",
"VoiceType": 502006,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY"
})
print(result)
# {'Code': 0, 'Msg': 'success', 'AudioUrl': 'https://...'}
Short Text Optimized Usage
# Short text (<150 chars) - Use single thread for fast response
result = main({
"Text": "Hello, this is a short message.",
"VoiceType": 502006,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY",
"max_workers": 1, # Single thread is sufficient
"timeout": 30, # 30 second timeout
"enable_retry": True # Enable retry
})
Long Text Optimized Usage
# Long text (>150 chars) - Use concurrency for speed
long_text = """Chapter 1: The Origin of AI
The concept of artificial intelligence can be traced back to ancient Greek mythology..."""
result = main({
"Text": long_text,
"VoiceType": 502007,
"secret_id": "YOUR_SECRET_ID",
"secret_key": "YOUR_SECRET_KEY",
"max_workers": 5, # Concurrent processing
"chunk_size": 140, # 140 chars per chunk
"timeout": 60, # 60 second timeout
"preserve_paragraphs": True # Preserve paragraph boundaries
})
Voice Reference
| VoiceType | Voice Name | Characteristics |
|---|---|---|
| 0 | 普通女声 | Standard female |
| 1 | 普通男声 | Standard male |
| 5 | 情感女声 | Emotional female |
| 6 | 情感男声 | Emotional male |
| 1000 | 智障少女 | Lively cute |
| 1001 | 阳光少年 | Bright youthful |
| 1002 | 温柔淑女 | Gentle female |
| 1003 | 成熟青年 | Mature male |
| 1004 | 严厉管事 | Stern female |
| 1005 | 亲和女声 | Friendly female |
| 1006 | 甜美女声 | Sweet female |
| 1007 | 磁性男声 | Magnetic male |
| 1008 | 播音主播 | Broadcast anchor |
| 101001 | 客服女声 | Customer service |
| 101005 | 售前客服 | Pre-sales service |
| 101007 | 售后客服 | After-sales service |
| 101008 | 亲和客服 | Friendly service |
| 502006 | 小旭 | Tencent voice |
| 502007 | 小巴 | Tencent voice |
| 502008 | 思驰 | Tencent voice |
| 502009 | 思佳 | Tencent voice |
| 502010 | 思悦 | Tencent voice |
| 502011 | 小宁 | Tencent voice |
| 502012 | 小杨 | Tencent voice |
| 502013 | 云扬 | Tencent voice |
| 502014 | 云飞 | Tencent voice |
Technical Architecture
tts_podcast.py
- TTS: Uses Tencent Cloud TTS API signature v3
- Upload: Uses Tencent Cloud COS SDK for audio file upload
- Auth: Supports credentials from parameters or environment variables
- Short & Long Text Compatible:
- Short text (≤150 chars): Direct single request, fast response
- Long text (>150 chars): Smart chunking + concurrent processing + auto-merge
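The "concurrent processing + auto-merge" step for long text could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in tts_podcast.py: `synthesize_all` and `merge_wavs` are hypothetical helpers, `synthesize` stands in for whatever function turns one chunk into WAV bytes, and all chunks are assumed to share the same sample rate, sample width, and channel count.

```python
import io
import wave
from concurrent.futures import ThreadPoolExecutor


def synthesize_all(chunks, synthesize, max_workers=3):
    """Run synthesize(chunk) -> WAV bytes in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, chunks))


def merge_wavs(wav_blobs):
    """Concatenate WAV byte strings that share the same audio parameters."""
    out = io.BytesIO()
    writer = None
    for blob in wav_blobs:
        with wave.open(io.BytesIO(blob), "rb") as reader:
            if writer is None:
                writer = wave.open(out, "wb")
                writer.setparams(reader.getparams())
            writer.writeframes(reader.readframes(reader.getnframes()))
    if writer is None:
        raise ValueError("no audio to merge")
    writer.close()  # finalizes the header; the BytesIO itself stays open
    return out.getvalue()
```

`pool.map` returns results in submission order, so the merged audio follows the original text even when chunks finish out of order.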
Text Chunking Strategy
- Paragraph Priority: Try to preserve paragraph integrity, split at paragraph boundaries
- Sentence Boundaries: When paragraphs are too long, split at sentence ending punctuation (。!?;)
- Semantic Protection: Avoid truncating in the middle of words, ensure semantic coherence
- Length Control: Each chunk does not exceed 150 characters (Tencent Cloud API limit)
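The chunking rules above can be sketched as follows. This is a simplified illustration, not the skill's implementation: `chunk_text` is a hypothetical function, it treats each non-empty line as a paragraph, it accepts both full-width and ASCII sentence-ending punctuation (an assumption), and it falls back to a hard cut only when a single sentence is longer than chunk_size.

```python
import re

# Split points: just after sentence-ending punctuation (full-width or ASCII).
SENTENCE_BOUNDARY = re.compile(r"(?<=[。!?;!?;.])")


def chunk_text(text, chunk_size=140, preserve_paragraphs=True):
    """Split text into chunks of at most chunk_size characters."""
    if preserve_paragraphs:
        paragraphs = [p.strip() for p in text.split("\n") if p.strip()]
    else:
        paragraphs = [" ".join(text.split())]
    chunks = []
    for paragraph in paragraphs:
        current = ""
        for sentence in SENTENCE_BOUNDARY.split(paragraph):
            if not sentence.strip():
                continue
            # Fallback: hard-cut a sentence that cannot fit in one chunk.
            while len(sentence) > chunk_size:
                if current.strip():
                    chunks.append(current.strip())
                current = ""
                chunks.append(sentence[:chunk_size])
                sentence = sentence[chunk_size:]
            if current and len(current) + len(sentence) > chunk_size:
                chunks.append(current.strip())
                current = ""
            current += sentence
        if current.strip():
            chunks.append(current.strip())  # never merge across paragraphs
    return chunks
```

Using a chunk_size below the 150-character limit, such as the default 140, leaves some headroom for each request.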
License
MIT
