Aliyun Cosyvoice Voice Clone
v1.0.0
Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from...
MIT-0
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name, description, SKILL.md, reference docs, and helper script all consistently implement preparing/enrolling CosyVoice cloned voices against Alibaba Cloud Model Studio. However, the package metadata lists no required environment variables while the SKILL.md explicitly requires a DASHSCOPE_API_KEY or ~/.alibabacloud/credentials entry — an inconsistency in declared requirements.
Instruction Scope
Runtime instructions are narrowly scoped to creating enrollment request JSON, validating Python files, and saving outputs. They instruct the user/agent to set an API key and to provide a public audio URL. The guidance to save evidence (including the sample URL and prefix) is expected for auditing but could expose sensitive sample URLs or identifiers if not handled carefully.
Install Mechanism
There is no install spec; this is an instruction-only skill with a small helper script. No downloads, package installs, or archive extraction are performed by the skill itself.
Credentials
Requesting an Alibaba Cloud API key (DASHSCOPE_API_KEY or ~/.alibabacloud/credentials) is proportionate for calling the CosyVoice enrollment endpoint. The concern is that the skill's declared metadata lists no required env vars or primary credential while the SKILL.md requires them — this mismatch can mislead users about what secrets the skill needs.
Persistence & Privilege
The skill does not request permanent/always-on inclusion and does not modify other skills or agent configuration. Autonomous invocation is allowed by default (platform behavior) but the skill metadata does not elevate privileges (always:false).
What to consider before installing
This skill is generally coherent for preparing Alibaba Cloud CosyVoice enrollment requests, but before installing you should: (1) note that the SKILL.md requires a DASHSCOPE_API_KEY or ~/.alibabacloud/credentials entry even though the metadata lists no required env vars, and treat that as a manual prerequisite; (2) ensure the API key you provide has only the permissions needed for voice enrollment, and rotate or remove it when it is no longer needed; (3) avoid passing private or sensitive audio via public URLs (the workflow expects a public URL and the script writes the URL to an output file); (4) inspect the small helper script (already included) to confirm it only writes local JSON and does not perform network calls or exfiltration; (5) verify the endpoint URLs (dashscope* hosts) match Alibaba Cloud documentation you trust; and (6) run the provided validation steps in a safe environment to see what files are written under output/aliyun-cosyvoice-voice-clone/ and verify they contain only expected artifacts.
Like a lobster shell, security has layers: review code before you run it.
latest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
Category: provider
Model Studio CosyVoice Voice Clone
Use the CosyVoice voice enrollment API to create cloned voices from public reference audio.
Critical model names
Use model="voice-enrollment" and one of these target_model values:
- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
- cosyvoice-v2
Recommended default in this repo:
target_model="cosyvoice-v3.5-plus"
Region and compatibility
- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice clone/design.
- The target_model used during enrollment must match the model used later in speech synthesis, otherwise synthesis fails.
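The constraints above can be checked locally before any enrollment call. The following is a minimal sketch, not part of the skill: the function names and the region labels "cn" and "intl" are hypothetical, and the check only rejects the combinations the notes above rule out. Passing the check does not confirm that a combination is supported.

```python
# Hypothetical pre-flight check; the region labels "cn" and "intl" are illustrative.
CN_ONLY_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
NO_CLONE_ON_INTL = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def check_region_support(target_model: str, region: str) -> None:
    """Raise ValueError for model/region combinations the notes above rule out."""
    if region not in ("cn", "intl"):
        raise ValueError(f"unknown region label: {region!r}")
    if region == "intl" and target_model in CN_ONLY_MODELS:
        raise ValueError(f"{target_model} is available only in China mainland mode")
    if region == "intl" and target_model in NO_CLONE_ON_INTL:
        raise ValueError(f"{target_model} does not support voice clone in international mode")


def check_synthesis_model(enrolled_target_model: str, synthesis_model: str) -> None:
    """The target_model used at enrollment must equal the model used for synthesis."""
    if enrolled_target_model != synthesis_model:
        raise ValueError(
            f"voice was enrolled for {enrolled_target_model}, "
            f"but synthesis requests {synthesis_model}"
        )
```

Calling `check_region_support("cosyvoice-v3.5-plus", "intl")` raises, while the same model with `"cn"` passes silently.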
Endpoint
- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
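One way to keep the two endpoints in a single place, and to guard against a mistyped or look-alike host, is sketched below. The URLs are the ones listed above; the region labels and both function names are hypothetical and do not come from the skill.

```python
from urllib.parse import urlsplit

# URLs copied from the Endpoint list above; the "cn"/"intl" keys are illustrative.
ENDPOINTS = {
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def customization_endpoint(region: str = "cn") -> str:
    """Return the customization endpoint for a hypothetical region label."""
    if region not in ENDPOINTS:
        raise ValueError(f"unknown region label: {region!r}")
    return ENDPOINTS[region]


def is_listed_endpoint(url: str) -> bool:
    """True only if url uses HTTPS and its host matches one of the listed endpoints."""
    allowed_hosts = {urlsplit(u).hostname for u in ENDPOINTS.values()}
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

The host comparison is exact, so a URL such as `https://dashscope.aliyuncs.com.example.org/...` is rejected.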
Prerequisites
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
- Provide a public audio URL for the enrollment sample.
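The key lookup order described above might be implemented roughly as follows. This is a sketch under a loud assumption: the layout of ~/.alibabacloud/credentials is not specified here, so the code assumes an INI-style file and accepts `dashscope_api_key` in any section. The function name `resolve_api_key` is hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> str:
    """Return the API key from the environment, else from the credentials file."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if path.is_file():
        # ASSUMPTION: the file is INI-formatted; the real layout may differ.
        parser = configparser.ConfigParser(interpolation=None)
        try:
            parser.read(path, encoding="utf-8")
        except configparser.Error:
            parser = configparser.ConfigParser(interpolation=None)  # treat as empty
        for section in [parser.default_section, *parser.sections()]:
            value = parser.get(section, "dashscope_api_key", fallback="").strip()
            if value:
                return value

    raise RuntimeError(
        "No API key found: set DASHSCOPE_API_KEY or add dashscope_api_key "
        "to ~/.alibabacloud/credentials"
    )
```

The environment variable wins over the file, so a key can be overridden per shell session without editing the credentials file.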
Normalized interface (cosyvoice.voice_clone)
Request
- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_sample_url (string, required): public audio URL
- language_hints (array[string], optional): only first item is used
- max_prompt_audio_length (float, optional): only for cosyvoice-v3.5-plus, cosyvoice-v3.5-flash, cosyvoice-v3-flash
- enable_preprocess (bool, optional): only for cosyvoice-v3.5-plus, cosyvoice-v3.5-flash, cosyvoice-v3-flash
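The field rules above can be enforced when the request is assembled. The sketch below is illustrative and is not the bundled helper script. The function name `build_clone_request` is hypothetical, "letters/digits" is assumed to mean ASCII, and an http(s) scheme is used as a rough stand-in for "public URL", since public reachability cannot be checked offline.

```python
import re
from typing import Any, Dict, Optional, Sequence

# Models that accept max_prompt_audio_length and enable_preprocess, per the list above.
TUNABLE_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash", "cosyvoice-v3-flash"}
# ASSUMPTION: "letters/digits only" is read as ASCII letters and digits.
PREFIX_RE = re.compile(r"[A-Za-z0-9]{1,10}")


def build_clone_request(
    prefix: str,
    voice_sample_url: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[Sequence[str]] = None,
    max_prompt_audio_length: Optional[float] = None,
    enable_preprocess: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a normalized cosyvoice.voice_clone request dict, validating each field."""
    if not PREFIX_RE.fullmatch(prefix):
        raise ValueError("prefix must be 1-10 letters or digits")
    if not voice_sample_url.startswith(("http://", "https://")):
        raise ValueError("voice_sample_url must be an http(s) URL")

    request: Dict[str, Any] = {
        "model": "voice-enrollment",
        "target_model": target_model,
        "prefix": prefix,
        "voice_sample_url": voice_sample_url,
    }
    if language_hints:
        # Only the first hint is used, so drop the rest explicitly.
        request["language_hints"] = [language_hints[0]]

    optional = {
        "max_prompt_audio_length": max_prompt_audio_length,
        "enable_preprocess": enable_preprocess,
    }
    for name, value in optional.items():
        if value is None:
            continue
        if target_model not in TUNABLE_MODELS:
            raise ValueError(f"{name} is not supported for {target_model}")
        request[name] = value
    return request
```

Truncating `language_hints` up front makes the effective request visible in saved artifacts, rather than leaving extra hints that would be silently ignored.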
Response
- voice_id (string): use this as the voice parameter in later TTS calls
- request_id (string)
- usage.count (number, optional)
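A small typed wrapper can make the normalized response easier to pass to later synthesis code. This is a sketch with hypothetical names (`CloneResult`, `parse_clone_response`, `tts_arguments`). It assumes the normalized response arrives as a plain dict with the fields listed above, and the `model` key in the synthesis arguments is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class CloneResult:
    voice_id: str
    request_id: str
    usage_count: Optional[float] = None


def parse_clone_response(payload: Mapping[str, Any]) -> CloneResult:
    """Convert a normalized response dict into a CloneResult."""
    usage = payload.get("usage") or {}  # usage.count is optional
    return CloneResult(
        voice_id=str(payload["voice_id"]),
        request_id=str(payload["request_id"]),
        usage_count=usage.get("count"),
    )


def tts_arguments(result: CloneResult, enrolled_target_model: str) -> Dict[str, str]:
    """Pair the cloned voice with the model it was enrolled for.

    ASSUMPTION: the later TTS call takes "model" and "voice" arguments;
    only "voice" is named in the field list above.
    """
    return {"model": enrolled_target_model, "voice": result.voice_id}
```

Keeping the enrollment model next to the `voice_id` helps avoid the mismatch between enrollment and synthesis models that causes synthesis to fail.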
Operational guidance
- For Chinese dialect reference audio, keep language_hints=["zh"]; control dialect style later in synthesis via text or instruct.
- For cosyvoice-v3.5-plus, supported language_hints include zh, en, fr, de, ja, ko, ru, pt, th, id, vi.
- Avoid frequent enrollment calls; each call creates a new custom voice and consumes quota.
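Since each enrollment call creates a new voice and consumes quota, one possible safeguard is a local cache that reuses a previously returned voice_id for the same inputs. The sketch below is not part of the skill: `get_or_enroll`, the cache file, and the `enroll` callable (whatever function actually performs the API call) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def get_or_enroll(
    cache_path: Path,
    request: Dict[str, str],
    enroll: Callable[[Dict[str, str]], str],
) -> str:
    """Return a cached voice_id for this request, enrolling only on a cache miss."""
    key = "|".join(request[k] for k in ("target_model", "prefix", "voice_sample_url"))
    cache: Dict[str, str] = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    if key not in cache:
        cache[key] = enroll(request)  # the only place quota is consumed
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return cache[key]
```

The cache key includes `target_model`, so a voice enrolled for one model is never reused for another.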
Local helper script
Prepare a normalized request JSON:
python skills/ai/audio/aliyun-cosyvoice-voice-clone/scripts/prepare_cosyvoice_clone_request.py \
--target-model cosyvoice-v3.5-plus \
--prefix myvoice \
--voice-sample-url https://example.com/voice.wav \
--language-hint zh
Validation
mkdir -p output/aliyun-cosyvoice-voice-clone
for f in skills/ai/audio/aliyun-cosyvoice-voice-clone/scripts/*.py; do
python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-cosyvoice-voice-clone/validate.txt
Pass criteria: command exits 0 and output/aliyun-cosyvoice-voice-clone/validate.txt is generated.
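Note that the shell loop above keeps going after a failed py_compile and still writes validate.txt unless the shell runs with `set -e`. A stricter variant in Python, which writes the marker file only when every script compiles, might look like this. The function name `validate_scripts` is hypothetical; the paths in the usage note are the ones from the commands above.

```python
import py_compile
import sys
from pathlib import Path


def validate_scripts(scripts_dir: Path, output_dir: Path) -> bool:
    """Byte-compile every *.py in scripts_dir; write validate.txt only if all pass."""
    ok = True
    for script in sorted(scripts_dir.glob("*.py")):
        try:
            py_compile.compile(str(script), doraise=True)
        except py_compile.PyCompileError as exc:
            print(exc.msg, file=sys.stderr)
            ok = False
    if ok:
        output_dir.mkdir(parents=True, exist_ok=True)
        (output_dir / "validate.txt").write_text("py_compile_ok\n", encoding="utf-8")
    return ok
```

A caller could run `validate_scripts(Path("skills/ai/audio/aliyun-cosyvoice-voice-clone/scripts"), Path("output/aliyun-cosyvoice-voice-clone"))` and exit non-zero when it returns False.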
Output And Evidence
- Save artifacts, command outputs, and API response summaries under output/aliyun-cosyvoice-voice-clone/.
- Include target_model, prefix, and sample URL in the evidence file.
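An evidence file with the fields named above could be written as follows. This is a minimal sketch: the file name evidence.json, the `recorded_at` field, and the function name `write_evidence` are assumptions, not conventions defined by the skill.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def write_evidence(
    output_dir: Path,
    target_model: str,
    prefix: str,
    voice_sample_url: str,
    voice_id: Optional[str] = None,
    request_id: Optional[str] = None,
) -> Path:
    """Write a JSON evidence record and return its path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "target_model": target_model,
        "prefix": prefix,
        "voice_sample_url": voice_sample_url,  # stored in plain text
        "voice_id": voice_id,
        "request_id": request_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    path = output_dir / "evidence.json"  # hypothetical file name
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
    return path
```

Because the sample URL is written in plain text, the evidence file should be treated as sensitive whenever the URL itself is.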
References
- references/api_reference.md
- references/sources.md
