Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

ifly-image-understanding

iFlytek Image Understanding (图片理解) — analyze and answer questions about images using Spark Vision model. WebSocket API, pure Python stdlib, no pip dependencies.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 78 · 0 current installs · 0 all-time installs
byIflytek AIcloud@qingzhe2020
MIT-0
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Suspicious
medium confidence
Purpose & Capability
The name/description (iFlytek Image Understanding) match the code: the script reads a local image, HMAC-signs requests, and connects to iFlytek's Spark Vision WebSocket endpoint. However, the skill registry metadata declares no required environment variables while both SKILL.md and the script require IFLY_APP_ID, IFLY_API_KEY, and IFLY_API_SECRET — this metadata mismatch is unexpected and should be corrected or explained.
Instruction Scope
The runtime instructions are narrowly scoped: set iFlytek credentials, supply a local image, and run the Python script which transmits the image and question to the documented iFlytek wss endpoint. There are no instructions to read arbitrary files or exfiltrate data to other endpoints. Note: transmitted data includes the raw image and any question text (so avoid sending sensitive images/queries).
Install Mechanism
There is no install spec and the code uses only Python stdlib; no external downloads or package installs are requested. This is a low-risk install mechanism in itself.
!
Credentials
Requesting IFLY_APP_ID / IFLY_API_KEY / IFLY_API_SECRET is proportionate to authenticating to iFlytek and is expected. The concern is twofold: (1) the registry metadata does not declare these required env vars (inconsistency that could mislead users), and (2) the repository includes a .claude/settings.local.json file granting Read(...) permission to a specific user's Desktop path and a zip command — that config is unrelated to the skill's runtime needs and suggests accidental inclusion of local packaging settings that could disclose or request broader filesystem access.
Persistence & Privilege
The skill does not request always:true, does not declare persistent installation steps, and the code does not modify other skills or system-wide settings. It performs one-off WebSocket calls during invocation.
What to consider before installing
Before installing or using this skill: 1) Recognize that the script will send the full image bytes and your question to iFlytek's cloud (wss://spark-api.cn-huabei-1.xf-yun.com/v2.1/image). Do not use it with sensitive images or questions unless you trust the service and your agreement with it. 2) The SKILL.md and script require IFLY_APP_ID, IFLY_API_KEY, and IFLY_API_SECRET — verify the registry listing or installer prompts these; do not paste secrets into unexpected places. 3) Remove or inspect the included .claude/settings.local.json before use: it contains a Read(...) permission pointing at a user's Desktop path and a zip command, which is unrelated to normal runtime and may be an accidental leak of local packaging settings. 4) Run the script in a controlled environment (isolated user account or container) if you have any doubt. 5) If you need higher assurance, ask the author to (a) update the registry metadata to declare required env vars, (b) remove any local .claude files from the published package, and (c) confirm the only network destination is the documented iFlytek WebSocket endpoint.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.0.0
Download zip
latestvk977ydk3e1kewezrhq1tdxjhdn8348ra

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

ifly-image-understanding

Analyze images and answer questions about their content using iFlytek's Spark Vision model (图片理解).

Setup

  1. Create an app at 讯飞控制台 with 图片理解 service enabled
  2. Set environment variables:
    export IFLY_APP_ID="your_app_id"
    export IFLY_API_KEY="your_api_key"
    export IFLY_API_SECRET="your_api_secret"
    

Usage

Describe an image

python3 scripts/image_understanding.py photo.jpg

Ask a question about an image

python3 scripts/image_understanding.py photo.jpg -q "图片里有什么动物?"

Use basic model (lower token cost)

python3 scripts/image_understanding.py photo.jpg --domain general

Options

FlagShortDescription
imageImage file path (.jpg, .jpeg, .png)
--question-qQuestion about the image (default: describe)
--domain-dimagev3 (advanced, default) or general (basic, fixed 273 tokens/image)
--temperature-tSampling temperature (0,1], default 0.5
--max-tokensMax response tokens 1-8192, default 2048
--rawOutput raw WebSocket JSON frames

Examples

# OCR a receipt
python3 scripts/image_understanding.py receipt.png -q "总金额是多少?"

# Identify objects
python3 scripts/image_understanding.py scene.jpg -q "图片中有哪些物体?"

# Low-cost basic model
python3 scripts/image_understanding.py chart.png -q "图表的趋势是什么?" -d general

Notes

  • Image formats: .jpg, .jpeg, .png
  • Max image size: 4MB
  • Max tokens: 8192 (input + output combined)
  • Auth: HMAC-SHA256 signed WebSocket URL
  • Endpoint: wss://spark-api.cn-huabei-1.xf-yun.com/v2.1/image
  • Pure stdlib: No pip dependencies — uses built-in socket + ssl for WebSocket
  • Model versions: imagev3 (advanced, dynamic token cost) vs general (basic, fixed 273 tokens/image)

错误码说明 😢

遇到错误先别慌~看看下面找到对应的解决方法吧!✨

错误码错误信息解决办法
0🎉 成功恭喜你!请求正常完成啦~
10003用户的消息格式有错误检查一下你的请求格式是否正确哦~确保发送的是合法的JSON格式呢!
10004用户数据的schema错误看起来数据结构有点问题~请检查一下字段名称和类型是否正确呀!
10005用户参数值有错误参数值可能不太对呢~仔细核对一下每个参数的有效范围吧!
10006用户并发错误:同一用户不能多处同时连接检测到重复连接啦!请确保只有一个客户端在连接同一个用户ID哦~
10013用户问题涉及敏感信息,审核不通过哎呀,你的问题可能包含了一些不太合适的内容~换个问题试试看吧!
10022模型生产的图片涉及敏感信息,审核不通过生成的图片没有通过审核呢...很抱歉,换张图片再试一下吧!
10029图片任何一边的长度超过12800图片尺寸太大啦!请确保图片宽高都不超过12800像素哦~
10041图片分辨率不符合要求图片尺寸不合适的呢~要求是:50×50 < 图片总像素值 < 6000×6000 哦!
10907Token数量超过上限内容太丰富啦!对话历史+问题的字数太多,需要精简一下输入哦~

💡 小贴士:如果还有其他问题,可以查看官方文档或者联系技术支持哦!


常见问题 🤔

图片理解的主要功能是什么呀?🐱

答:用户输入一张图片和问题,从而识别出图片中的对象、场景等信息,然后回答你的问题~是不是很方便呢!✨

图片理解支持什么应用平台呢?📱

答:目前支持 Web API 应用平台哦!直接在代码里调用就可以啦~

图片理解的文本大小限制多少呀?📝

答:有效内容不能超过 8192 Token 呢~如果超过了就要精简一下输入啦!


更多资源 📚

有更多问题随时来问我哦~祝你使用愉快!🌸

Files

3 total
Select a file
Select a file to preview.

Comments

Loading comments…