large-file-handler
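v1.0.0
Large-file asynchronous handler. When a user sends a large file (over 10MB), it is enabled automatically and uses streaming save plus background asynchronous processing to keep the Gateway from hanging. Supports PDF, video, images, logs, Office documents, and more. Pushes the result proactively once processing is complete.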
MIT-0
Security Scan
OpenClaw
Benign
medium confidence
Purpose & Capability
Name/description match what the code implements: stream-save, async subprocess processing, per-type processors and result push placeholders. The number and type of files are proportionate to a file-processing skill.
Instruction Scope
SKILL.md and integration docs instruct saving files, starting subprocesses, and pushing results — the code follows that. Minor inconsistencies exist between docs and code (SKILL.md shows 'handlers/{handler_type}.py' and a '--notify' flag, while the code uses a 'processors/' directory and '--user-id/--channel' flags). The instructions and code do not read secrets or unrelated system state, but they do modify and move files in the workspace.
Install Mechanism
No install spec or remote downloads; this is an instruction+code skill that runs local Python scripts. No external packages are auto-installed by the skill bundle.
Credentials
The skill requests no environment variables or credentials. It does use a hardcoded absolute WORKSPACE path (E:\ai\openclaw\.openclaw\workspace) and inserts it into sys.path — reasonable for a local skill but worth noting because it will read/write files under that path and may import modules from it.
Persistence & Privilege
Skill is not always-enabled and is user-invocable. It does not modify other skills or agent-wide settings. It does create files and lock files within its workspace subdirectories, which is expected for this functionality.
Assessment
This skill appears to do what it says: it saves received files to disk and launches local subprocesses to process them, then prints (placeholder) results. Before installing, consider:
1) it writes and moves files under a hardcoded workspace path (E:\ai\openclaw\.openclaw\workspace); ensure that path exists and you trust where files will be stored and retained;
2) it spawns subprocesses to run local processor scripts; review those processor scripts if you run untrusted files, or run the skill in a sandbox;
3) notification/push to external services (Feishu) is currently a TODO; integrating real push will require adding credentials (e.g., Feishu tokens) later, so only add those when you trust the code;
4) the docs and code have small mismatches (handler paths and CLI flags); test in a safe environment before production use;
5) set appropriate permissions and retention/cleanup (completed/ directory) to avoid accumulating large files.
Like a lobster shell, security has layers: review code before you run it.
latest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
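large-file-handler - Large-File Asynchronous Processing Skill
🎯 Features
Receives and processes large files (PDF, video, images, logs, Office documents, and so on) using asynchronous processing plus result push, so the Gateway does not hang.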
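📁 File Storage Structure
tmp/files/
├── pending/ # files waiting to be processed
├── processing/ # files being processed (with a lock marker)
└── completed/ # finished files (kept for 24 hours, then cleaned up)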
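A minimal sketch of how the three directories above might be created at startup. The function name `ensure_storage_dirs` and the `base` argument are hypothetical; the source does not say how or when the directories are created.

```python
from pathlib import Path

# Subdirectory names taken from the storage layout above.
STAGES = ("pending", "processing", "completed")


def ensure_storage_dirs(base):
    """Create tmp/files/{pending,processing,completed} under `base`.

    `base` is an assumed workspace root; the function is safe to call repeatedly.
    Returns a dict mapping stage name to its Path.
    """
    root = Path(base) / "tmp" / "files"
    dirs = {name: root / name for name in STAGES}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

Returning the paths as a dict lets later steps refer to `dirs["pending"]` rather than rebuilding the path strings.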
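⚙️ Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| LARGE_FILE_THRESHOLD | 10MB | Files larger than this switch to asynchronous mode automatically |
| MAX_FILE_SIZE | 500MB | Maximum accepted file size |
| CHUNK_SIZE | 5MB | Chunk size for chunked processing |
| RETENTION_HOURS | 24 | How long completed files are kept, in hours |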
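🔄 Processing Flow
1. File reception
User sends a file
↓
Check the file size
↓
Below threshold: process directly
Above threshold: asynchronous mode
↓
Stream-save to tmp/files/pending/
↓
Reply immediately: "Received, processing"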
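The size check at the top of this flow could be sketched as below, using the default values from the configuration table. The `HandlerConfig` class, the `choose_mode` function, and the `"reject"` outcome for files above MAX_FILE_SIZE are assumptions for illustration; the source does not say what happens to oversized files, nor how a file of exactly 10MB is treated (here it is handled synchronously).

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class HandlerConfig:
    # Defaults mirror the configuration table; field names are hypothetical.
    large_file_threshold: int = 10 * MB
    max_file_size: int = 500 * MB
    chunk_size: int = 5 * MB
    retention_hours: int = 24


def choose_mode(size_bytes, config=HandlerConfig()):
    """Return "sync", "async", or "reject" for a file of the given size."""
    if size_bytes > config.max_file_size:
        return "reject"  # assumption: oversized files are refused
    if size_bytes > config.large_file_threshold:
        return "async"   # strictly larger than the threshold goes to the background
    return "sync"
```

For example, `choose_mode(50 * MB)` selects the asynchronous path, matching the 50MB PDF case in the interaction examples.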
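2. Asynchronous processing
Background task starts
↓
Move the file to processing/ (and lock it)
↓
Call the processor that matches the file type
↓
Processing complete → move to completed/
↓
Push the result to the user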
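The background steps above might look roughly like the following. This is a sketch under stated assumptions, not the skill's actual worker: `process_file` and its parameters are hypothetical, `processor` stands in for whichever per-type processor is selected, and result push is left to the caller. If the processor raises, the file stays in processing/ and only the lock is released.

```python
import os
import shutil
from pathlib import Path


def process_file(pending_path, processing_dir, completed_dir, processor):
    """Lock, move to processing/, run `processor`, then move to completed/.

    Returns the processor's result, or None if another worker holds the lock.
    """
    pending_path = Path(pending_path)
    work_path = Path(processing_dir) / pending_path.name
    lock_path = Path(str(work_path) + ".lock")
    try:
        # Exclusive create: fails if the lock file already exists.
        with open(lock_path, "x") as lock:
            lock.write(str(os.getpid()))
    except FileExistsError:
        return None
    try:
        shutil.move(str(pending_path), str(work_path))
        result = processor(work_path)
        shutil.move(str(work_path), str(Path(completed_dir) / work_path.name))
        return result
    finally:
        # Release the lock whether processing succeeded or failed.
        lock_path.unlink(missing_ok=True)
```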
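📦 File Type Processors
| Type | Processor | Description |
|---|---|---|
| .pdf | pdf-processor | PDF extraction / OCR / summary |
| .mp4, .mov, .avi | video-processor | Keyframe extraction + audio transcription |
| .jpg, .png, .webp | image-processor | OCR + content description |
| .log, .txt | log-processor | Keyword extraction + error analysis |
| .docx, .pptx, .xlsx | office-processor | Conversion to Markdown |
| .zip, .rar, .7z | archive-processor | Decompression + selective extraction |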
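💬 User Interaction Examples
Small file (under 10MB)
User: [sends a file]
Me: [processes it directly and replies with the result]
Large file (over 10MB)
User: [sends a 50MB PDF]
Me: Received the file "xxx.pdf" (50MB). Processing it in the background; I'll send you the result when it's done 🦁
[3 minutes later]
Me: [PDF processing complete]
📄 Document title: xxx
📊 xx pages in total
📝 Summary: ...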
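🔧 Technical Implementation
File saving (streaming)
def save_file_stream(file_stream, dest_path, chunk_size=5*1024*1024):
    # Copy the upload to disk one chunk at a time so the whole file is never held in memory
    with open(dest_path, 'wb') as f:
        while chunk := file_stream.read(chunk_size):
            f.write(chunk)
    return True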
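The function above does not enforce MAX_FILE_SIZE. One possible variant, sketched here under assumptions, counts bytes while streaming and discards the partial file once the limit is exceeded. `save_stream_with_limit` and `FileTooLargeError` are hypothetical names, not part of the skill.

```python
import os


class FileTooLargeError(Exception):
    """Raised when the incoming stream exceeds the allowed size."""


def save_stream_with_limit(file_stream, dest_path, max_bytes,
                           chunk_size=5 * 1024 * 1024):
    """Stream `file_stream` to `dest_path`, aborting past `max_bytes`.

    Returns the number of bytes written. On overflow the partial file
    is deleted and FileTooLargeError is raised.
    """
    written = 0
    try:
        with open(dest_path, "wb") as f:
            while chunk := file_stream.read(chunk_size):
                written += len(chunk)
                if written > max_bytes:
                    raise FileTooLargeError(f"stream exceeds {max_bytes} bytes")
                f.write(chunk)
    except FileTooLargeError:
        # The file is closed by now, so it can be removed safely.
        os.remove(dest_path)
        raise
    return written
```

Checking the running total inside the loop means an oversized upload is rejected after at most one extra chunk is read, rather than after the whole file has landed on disk.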
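Starting the asynchronous task
import subprocess
def start_async_task(file_path, handler_type):
    # Launch a separate child process, isolated from the Gateway.
    # WORKSPACE is expected to be defined elsewhere in the module.
    subprocess.Popen([
        'python', f'handlers/{handler_type}.py',
        '--file', file_path,
        '--notify', 'feishu'
    ], cwd=WORKSPACE)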
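File lock mechanism
import os
def acquire_lock(file_path):
    lock_file = file_path + '.lock'
    try:
        # 'x' mode creates the file exclusively and fails if it already exists
        with open(lock_file, 'x') as f:
            f.write(str(os.getpid()))
        return True
    except FileExistsError:
        return False  # already being handled by another process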
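The source shows only lock acquisition. A matching release step could be wrapped in a context manager, as in the sketch below; `file_lock` is a hypothetical helper, and it does not deal with stale locks left behind by a crashed process.

```python
import os
from contextlib import contextmanager


@contextmanager
def file_lock(file_path):
    """Yield True if the lock was acquired, False if someone else holds it.

    The lock file is removed on exit only when this call created it.
    """
    lock_file = file_path + ".lock"
    try:
        with open(lock_file, "x") as f:
            f.write(str(os.getpid()))
    except FileExistsError:
        yield False
        return
    try:
        yield True
    finally:
        os.remove(lock_file)
```

A caller would write `with file_lock(path) as acquired:` and skip the file when `acquired` is false.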
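📝 To-Do
- Implement file reception and streaming save
- Implement the size threshold check
- Create processors for each file type
- Implement the result push mechanism
- Add a scheduled cleanup task (remove files in completed/ older than 24 hours)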
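The last to-do item, the cleanup task, could be approached along these lines. This is only a sketch: `cleanup_completed` is a hypothetical name, file age is judged by modification time (an assumption), and scheduling the call is left to whatever timer or cron mechanism the deployment uses.

```python
import time
from pathlib import Path


def cleanup_completed(completed_dir, retention_hours=24, now=None):
    """Delete files in `completed_dir` older than `retention_hours`.

    Age is measured from the file's modification time. `now` can be
    injected for testing. Returns the sorted names of removed files.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    removed = []
    for path in Path(completed_dir).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```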
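🚀 Usage
# Called from within the Gateway (inside a request handler)
from large_file_handler import handle_file
result = handle_file(
    file_stream=request.files['file'],
    file_name='example.pdf',
    user_id='ou_xxx',
    channel='feishu'
)
if result['async']:
    # Asynchronous mode: background processing has already started
    return "File received; you will be notified when processing is complete"
else:
    # Synchronous mode: return the result directly
    return result['content']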
Version: v0.1
Created: 2026-04-03
Author: Leo 🦁
Files
9 total
