Alibabacloud Avatar Video

Security checks across malware telemetry and agentic risk

Overview

This skill largely matches its Alibaba Cloud media-generation purpose, but it needs review because it handles cloud keys, uploads personal media, and can send DashScope credentials to an environment-selected API host.

Install only if you intend to use Alibaba Cloud/DashScope for these media tasks. Use a dedicated least-privilege RAM user, restrict OSS to a temporary bucket or prefix, keep signed URL lifetimes short, delete uploaded/generated personal media after use, and do not set DASHSCOPE_BASE_URL or LINGMOU_ENDPOINT to any host you do not fully trust.
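
One way to enforce the last recommendation is to validate the environment-selected host before any credentialed request is sent. The sketch below is illustrative only and is not taken from the skill: `resolve_base_url`, `TRUSTED_HOSTS`, and `DEFAULT_BASE_URL` are hypothetical names, and the host list is an assumption that should be replaced with the endpoints for your own account and region.

```python
import os
from urllib.parse import urlsplit

# Assumption: the only hosts the operator trusts. Adjust for your region.
TRUSTED_HOSTS = {"dashscope.aliyuncs.com", "dashscope-intl.aliyuncs.com"}
# Assumption: fallback used when DASHSCOPE_BASE_URL is unset.
DEFAULT_BASE_URL = "https://dashscope.aliyuncs.com"


def resolve_base_url(env=os.environ):
    """Return the base URL, refusing non-HTTPS or non-allowlisted hosts."""
    raw = env.get("DASHSCOPE_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    parts = urlsplit(raw)
    if parts.scheme != "https":
        raise ValueError(f"DASHSCOPE_BASE_URL must use https: {raw!r}")
    # parts.hostname ignores any userinfo, so "https://trusted@evil" is rejected.
    if parts.hostname not in TRUSTED_HOSTS:
        raise ValueError(f"DASHSCOPE_BASE_URL host is not allowlisted: {raw!r}")
    return raw
```

Resolving the URL once at startup and failing closed means a stray or malicious environment variable stops the run before the bearer token is attached to any request.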

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (18)

Tainted flow: 'BASE_URL' from os.getenv (line 44, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
Returns output dict with check_pass, bodystyle.
    """
    print(f"\n[step1] aa-detect …")
    r = requests.post(
        f"{BASE_URL}/api/v1/services/aigc/image2video/aa-detect",
        headers=_headers(async_mode=False),
        json={"model": "animate-anyone-detect-gen2", "input": {"image_url": image_url}},
Confidence
88% confidence
Finding
r = requests.post( f"{BASE_URL}/api/v1/services/aigc/image2video/aa-detect", headers=_headers(async_mode=False), json={"model": "animate-anyone-detect-gen2", "input": {"ima

Tainted flow: 'BASE_URL' from os.getenv (line 44, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
Returns template_id str.
    """
    print(f"\n[step2] aa-template-generation …")
    r = requests.post(
        f"{BASE_URL}/api/v1/services/aigc/image2video/aa-template-generation/",
        headers=_headers(async_mode=True),
        json={"model": "animate-anyone-template-gen2", "input": {"video_url": video_url}},
Confidence
88% confidence
Finding
r = requests.post( f"{BASE_URL}/api/v1/services/aigc/image2video/aa-template-generation/", headers=_headers(async_mode=True), json={"model": "animate-anyone-template-gen2",

Tainted flow: 'BASE_URL' from os.getenv (line 44, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
"input": {"image_url": image_url, "template_id": template_id},
        "parameters": {"use_ref_img_bg": use_ref_img_bg, "video_ratio": video_ratio},
    }
    r = requests.post(
        f"{BASE_URL}/api/v1/services/aigc/image2video/video-synthesis/",
        headers=_headers(async_mode=True),
        json=payload,
Confidence
89% confidence
Finding
r = requests.post( f"{BASE_URL}/api/v1/services/aigc/image2video/video-synthesis/", headers=_headers(async_mode=True), json=payload, timeout=60, )

Tainted flow: 'url' from os.getenv (line 162, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
url = f"{BASE_URL}/api/v1/tasks/{task_id}"
    start = time.time()
    while time.time() - start < max_wait:
        r = requests.get(url, headers=_headers(), timeout=60)
        r.raise_for_status()
        data = r.json()
        out = data.get("output", {})
Confidence
87% confidence
Finding
r = requests.get(url, headers=_headers(), timeout=60)

Tainted flow: 'endpoint' from os.getenv (line 141, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
payload = {"model": model, "input": inp, "parameters": params}

    print(f"\n[i2v] submit  model={model}  resolution={resolution}  duration={duration}s")
    r = requests.post(endpoint, headers=_headers(async_mode=True), json=payload, timeout=60)
    r.raise_for_status()
    data = r.json()
    task_id = (data.get("output") or {}).get("task_id")
Confidence
86% confidence
Finding
r = requests.post(endpoint, headers=_headers(async_mode=True), json=payload, timeout=60)

Tainted flow: 'url' from os.environ.get (line 100, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
url = f"{BASE_URL}/api/v1/tasks/{task_id}"
    start = time.time()
    while time.time() - start < max_wait:
        r = requests.get(url, headers=_headers(async_mode=False), timeout=30)
        r.raise_for_status()
        data = r.json()
        out = data.get("output", {})
Confidence
86% confidence
Finding
r = requests.get(url, headers=_headers(async_mode=False), timeout=30)

Tainted flow: 'BASE_URL' from os.getenv (line 47, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
Raises ValueError if check fails.
    """
    print(f"\n[step1] liveportrait-detect …")
    r = requests.post(
        f"{BASE_URL}/api/v1/services/aigc/image2video/face-detect",
        headers=_headers(async_mode=False),
        json={"model": "liveportrait-detect", "input": {"image_url": image_url}},
Confidence
93% confidence
Finding
r = requests.post( f"{BASE_URL}/api/v1/services/aigc/image2video/face-detect", headers=_headers(async_mode=False), json={"model": "liveportrait-detect", "input": {"image_ur

Tainted flow: 'BASE_URL' from os.getenv (line 47, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
"head_move_strength": head_move_strength,
        },
    }
    r = requests.post(
        f"{BASE_URL}/api/v1/services/aigc/image2video/video-synthesis/",
        headers=_headers(async_mode=True),
        json=payload,
Confidence
93% confidence
Finding
r = requests.post( f"{BASE_URL}/api/v1/services/aigc/image2video/video-synthesis/", headers=_headers(async_mode=True), json=payload, timeout=60, )

Tainted flow: 'url' from os.environ.get (line 115, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
start = time.time()
    while time.time() - start < max_wait:
        try:
            r = requests.get(url, headers=_headers(), timeout=30)
            r.raise_for_status()
            data = r.json()
            out = data.get("output", {})
Confidence
91% confidence
Finding
r = requests.get(url, headers=_headers(), timeout=30)

Tainted flow: 'safe_url' from requests.post (line 344, network input) → urllib.request.urlopen (network output)

Medium
Category
Data Flow
Content
out_path = resolve_under_cwd(args.output, field="--output")
            safe_url = validate_http_https_url(video_url, field="result video URL")
            print(f"Downloading → {out_path} …")
            with urllib.request.urlopen(safe_url, timeout=300) as response:
                with open(out_path, 'wb') as f:
                    f.write(response.read())
            size_kb = out_path.stat().st_size // 1024
Confidence
83% confidence
Finding
with urllib.request.urlopen(safe_url, timeout=300) as response:

Tainted flow: 'safe_url' from requests.post (line 147, network input) → urllib.request.urlopen (network output)

Medium
Category
Data Flow
Content
if args.download and video_url:
        out_path = resolve_under_cwd(args.output, field="--output")
        safe_url = validate_http_https_url(video_url, field="result video URL")
        with urllib.request.urlopen(safe_url, timeout=300) as response:
            with open(out_path, 'wb') as f:
                f.write(response.read())
        print(f"saved={out_path}")
Confidence
91% confidence
Finding
with urllib.request.urlopen(safe_url, timeout=300) as response:

Tainted flow: 'safe_url' from os.getenv (line 173, credential/environment) → urllib.request.urlopen (network output)

Critical
Category
Data Flow
Content
if args.download:
            safe_url = validate_http_https_url(url, field="image URL")
            filename = out_dir / f"t2i_{int(time.time())}_{i+1}.png"
            with urllib.request.urlopen(safe_url, timeout=300) as response:
                with open(filename, 'wb') as f:
                    f.write(response.read())
            size_kb = filename.stat().st_size // 1024
Confidence
88% confidence
Finding
with urllib.request.urlopen(safe_url, timeout=300) as response:
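
The three download excerpts above read the whole response body with a single `response.read()` call, so the remote server decides how much data is buffered and written. A minimal sketch of a bounded copy follows. It is an illustration rather than the skill's code: `copy_capped` and `MAX_BYTES` are hypothetical names, and the 512 MiB cap is an arbitrary assumption.

```python
# Assumption: an arbitrary upper bound for a generated media file.
MAX_BYTES = 512 * 1024 * 1024


def copy_capped(response, dest, max_bytes=MAX_BYTES, chunk_size=64 * 1024):
    """Stream response into dest in chunks, aborting once max_bytes is exceeded."""
    written = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        written += len(chunk)
        if written > max_bytes:
            raise ValueError(f"download exceeded {max_bytes} bytes")
        dest.write(chunk)
    return written
```

With this helper, the body of the existing `with urllib.request.urlopen(...)` block would call `copy_capped(response, f)` in place of `f.write(response.read())`. A caller that hits the cap should also delete the partially written file.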

Lp3

Medium
Category
MCP Least Privilege
Confidence
91% confidence
Finding
The skill states powerful operational requirements in its metadata and documentation, including environment secrets, shell tools, filesystem interaction, and network access, but it does not declare these capabilities in an explicit permission model. That creates a transparency and governance gap: a user or platform may invoke a skill that can exfiltrate credentials, upload local files to OSS, or make arbitrary external API calls without clear permission scoping.

Vague Triggers

Medium
Confidence
82% confidence
Finding
The trigger text is broad enough to match generic requests like speech synthesis, text-to-image, text-to-video, or animation, which are common user intents across many contexts. Overbroad triggering can cause the agent to invoke a skill that performs networked media generation, consumes paid APIs, and processes user-provided files or prompts in situations where a narrower, more appropriate tool should be selected.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
This reference explicitly requires public image/video URLs and discusses downloading generated outputs, but it does not warn that user-provided media will be exposed to external Alibaba Cloud endpoints and may remain accessible via public links. In a skill that handles portraits, audio, and video of people, omission of privacy guidance materially increases the risk of accidental disclosure of sensitive biometric and personal content.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The documentation explicitly requires users to provide public image and audio URLs and use a bearer token, but it does not warn that user media will be exposed to third-party infrastructure and may need to be publicly accessible on the internet. In a skill handling portraits, speech, and avatar generation, this can lead to unintended disclosure of sensitive personal data, misuse of exposed media URLs, or insecure operational practices by downstream implementers.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The documentation explicitly instructs users to upload local media to a publicly reachable URL so DashScope can fetch it, but it does not prominently warn that uploaded face images, voice audio, and generated assets may become publicly accessible or broadly retrievable via signed links. In this skill context, the files are likely to contain sensitive biometric and personal content, so omission of clear privacy guidance materially increases the chance of accidental exposure.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
When local image/audio files are provided, the script uploads them to Alibaba OSS and then submits them to external AI services, but there is no explicit user-facing notice or consent step about this data transfer. In an agent setting, users may reasonably assume files stay local; silent transmission of potentially sensitive media increases privacy and compliance risk.
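
A consent step of the kind this finding calls for could be as small as the sketch below. It is hypothetical and not part of the skill: `confirm_upload` is an invented name, and the `ask` parameter exists only so the prompt can be replaced in tests or by an agent-side approval mechanism.

```python
from pathlib import Path


def confirm_upload(paths, ask=input):
    """Return True only if the user explicitly approves sending files off-machine."""
    listing = "\n".join(f"  - {Path(p).name}" for p in paths)
    answer = ask(
        "The following local files will be uploaded to Alibaba Cloud OSS "
        "and submitted to DashScope:\n" + listing + "\nProceed? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on an empty or unrecognized answer keeps files local unless the user actively opts in.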

VirusTotal

67/67 vendors flagged this skill as clean.
