Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Local Ai Search

v1.4.0

Natural language search for local files (100G-1T). Supports xlsx, pptx, pdf, docx formats with location info. Triggered when user asks to search local/comput...

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal
Suspicious
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name/description (local natural-language search over Office/PDF files) matches what the code and scripts implement (conversion, indexing, querying via a local Khoj server). The included tools (markitdown, khoj, Tesseract) and file-processing scripts are appropriate for the stated purpose. However, the registry metadata says 'Required env vars: none' and 'Primary credential: none' while the SKILL.md and code reference multiple environment variables (OPENAI_API_KEY, KHOJ_API_KEY, KHOJ_URL, ANTHROPIC_API_KEY, OPENAI_BASE_URL, OCR_LANGUAGES, USE_EMBEDDED_DB, etc.), which is an incoherence between declared requirements and actual needs.
Instruction Scope
Runtime instructions and scripts scan arbitrary directories, convert many file types to Markdown, and upload file contents to a Khoj API endpoint (default http://localhost:42110). The SKILL.md explicitly tells users to configure cloud LLM API keys (OpenAI/DeepSeek/Anthropic) and to use cloud LLMs for final responses. If misconfigured (KHOJ_URL changed to a remote host or cloud provider settings applied), the scripts will transmit local file contents and converted text to external services. The skill also offers a cron-based scheduler to run regular syncs, increasing persistence of data exfiltration risk if endpoints are untrusted.
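The endpoint check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the helper name `is_loopback_url` is hypothetical, and the only values taken from the scan are the `KHOJ_URL` variable and the default `http://localhost:42110`. It treats any hostname other than `localhost` or a loopback IP address as remote.

```python
import ipaddress
import os
from urllib.parse import urlparse

# Default endpoint reported by the scan; used when KHOJ_URL is unset.
DEFAULT_KHOJ_URL = "http://localhost:42110"


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP.

    Hypothetical helper for illustration; the skill does not ship it.
    """
    host = urlparse(url).hostname
    if host is None:
        # No parseable host (e.g. missing scheme): do not trust it.
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A DNS name other than localhost is treated as remote.
        return False


if __name__ == "__main__":
    url = os.environ.get("KHOJ_URL", DEFAULT_KHOJ_URL)
    if is_loopback_url(url):
        print(f"KHOJ_URL is local: {url}")
    else:
        print(f"WARNING: KHOJ_URL points off this machine: {url}")
```

Running this before any sync gives a quick signal that uploads would stay on the local machine. It does not cover the separate LLM provider settings, which would need their own check.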
Install Mechanism
There is no formal install spec in the registry (instruction-only), which limits the risk of automatic code installation, but the bundle includes Python scripts and a requirements.txt that call for manual pip installs (khoj, markitdown, etc.). The code uses standard OS utilities (curl, crontab, pkill) and invokes subprocesses; no remote binary download URLs or archive extractions are present. This is a moderate, expected footprint for a local indexing tool, but it still requires manual dependency installation and review.
Credentials
The registry lists no required environment variables, but the SKILL.md and code read and rely on many env vars: OPENAI_API_KEY, OPENAI_BASE_URL, ANTHROPIC_API_KEY, KHOJ_API_KEY, KHOJ_URL, USE_EMBEDDED_DB, OCR_LANGUAGES, OCR_DPI, and others. Those variables include secrets (API keys) that the skill will use and which, depending on configuration, enable networked sending of local file content to cloud LLM providers. The absence of declared env var requirements in metadata is an incoherence that hides the need for sensitive credentials and their implications.
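Because the metadata does not declare these variables, it can help to see which of them are already present in a shell before running the skill. The sketch below is an illustration under stated assumptions: the function name `audit_env` and the split into secret and non-secret variables are hypothetical, and only the variable names come from the scan. Secret values are never printed.

```python
import os

# Variable names listed in the scan; the grouping below is an assumption.
SECRET_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "KHOJ_API_KEY")
CONFIG_VARS = ("OPENAI_BASE_URL", "KHOJ_URL", "USE_EMBEDDED_DB",
               "OCR_LANGUAGES", "OCR_DPI")


def audit_env(environ):
    """Map each variable name to its status without revealing secrets."""
    report = {}
    for name in SECRET_VARS:
        # Report presence only; the key itself stays hidden.
        report[name] = "set (value hidden)" if environ.get(name) else "unset"
    for name in CONFIG_VARS:
        value = environ.get(name)
        report[name] = value if value else "unset"
    return report


if __name__ == "__main__":
    for name, status in audit_env(os.environ).items():
        print(f"{name}: {status}")
```

Any API key reported as set is available to every process started from that shell, including the skill's scripts.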
Persistence & Privilege
always:false (not force-included) and the skill is user-invocable. However, included scripts (schedule_sync.sh, sync.py) create cron jobs and write to ~/.khoj (sync state and logs), so the skill can establish persistent periodic syncing of local directories. This behavior is explainable for a sync/indexing tool but increases the risk profile because it regularly accesses user files and may upload them if endpoints are configured to remote hosts.
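One way to verify whether a periodic sync has been installed is to scan the current user's crontab for the script names mentioned above. This is a rough sketch: the helper names are hypothetical, and matching on `schedule_sync.sh` and `sync.py` is an assumption, since the scan does not show the exact cron line the skill writes.

```python
import subprocess

# Script names reported by the scan; the real cron entry may differ.
SYNC_MARKERS = ("schedule_sync.sh", "sync.py")


def find_sync_entries(crontab_text, markers=SYNC_MARKERS):
    """Return active (non-comment) crontab lines that mention any marker."""
    hits = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if any(marker in stripped for marker in markers):
            hits.append(stripped)
    return hits


def read_user_crontab():
    """Return the current user's crontab text, or "" if there is none."""
    try:
        result = subprocess.run(["crontab", "-l"], capture_output=True,
                                text=True, check=False)
    except FileNotFoundError:
        # The crontab binary is not installed on this system.
        return ""
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    entries = find_sync_entries(read_user_crontab())
    if entries:
        for entry in entries:
            print("sync entry:", entry)
    else:
        print("no matching cron entries found")
```

An empty result only shows that no entry matches these names in the current user's crontab; system-wide cron files and other schedulers are not inspected.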
What to consider before installing
- Incoherent metadata: the registry says no env vars/credentials are required, but the skill and its config files expect and use API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, KHOJ_API_KEY) and a KHOJ_URL. Treat that as a red flag: the skill will ask you to provide secrets even though metadata does not advertise that need.
- Data flow and exfiltration risk: by default the scripts upload converted file contents to a Khoj API (http://localhost:42110). That is fine if Khoj runs locally. But if you set KHOJ_URL or LLM provider base URL to a remote server (or misconfigure the base_url), the skill will send your converted documents and query context to that remote service and to cloud LLM providers. Only provide API keys or point KHOJ_URL to hosts you trust.
- Persistence: the skill can install cron jobs (schedule_sync.sh) and writes state/logs under ~/.khoj. If enabled, syncs will run periodically and continue uploading new/changed files. Review the schedule script before enabling and verify the cron entry before trusting it.
- Review and sandbox: because the package contains executable scripts that read arbitrary directories and make HTTP requests, inspect the code yourself (or run it in an isolated VM/container) before pointing it at sensitive files. Confirm KHOJ_URL remains localhost and do not export API keys to the environment unless you intend cloud processing.
- Least privilege: if you do use this skill, create a dedicated, minimal directory for indexing first (not your entire home), test conversion and indexing there, and verify network traffic (e.g., via netstat/tcpdump) to ensure uploads go only where you expect.
- What would change this assessment: if the registry metadata were updated to explicitly declare all required env vars and their purposes (and require user consent before enabling scheduled sync), or if the package came from a verifiable homepage/repo with a release signature, I would consider the incoherence resolved and raise confidence toward benign. Conversely, evidence that the default KHOJ_URL points to a remote, unknown host would increase severity.

Recommended immediate actions:
1) Inspect config.yaml and make sure llm.provider and base_url are what you expect (prefer local/localhost).
2) Do not export API keys globally; use them only for trusted providers and ephemeral test runs.
3) Run the scripts in a sandbox and monitor network activity before pointing the tool at sensitive directories.
4) If you do not understand or cannot audit the code, avoid granting it access to your entire home directory or enabling cron syncing.

Like a lobster shell, security has layers — review code before you run it.

latestvk972hvp10zh7najn17csq23bpx83bkfv
