Search Cluster
Multi-provider search aggregator using Google CSE, GNews RSS, Wikipedia, Reddit, and Scrapling.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 957 · 4 current installs · 4 all-time installs
by azzar budiyanto (@1999AZZAR)
Security Scan
OpenClaw
Benign · high confidence
Purpose & Capability
Name/description match the implemented behavior: the code queries Google CSE (optional), Wikipedia, Reddit, GNews RSS, and a local scrapling-based scraper. Optional env vars (GOOGLE_*, SCRAPLING_PYTHON_PATH, REDIS_*, SEARCH_USER_AGENT) are appropriate for these providers. Minor inconsistency: registry metadata listed no homepage while skill.json contains a GitHub homepage; SKILL.md refers to scripts/ subpaths (scripts/search-cluster.py, scripts/stealth_fetch.py) but the actual files live at the repository root (search-cluster.py, stealth_fetch.py). This appears to be sloppy documentation rather than functional mismatch.
Instruction Scope
SKILL.md instructs creating a dedicated venv for scrapling and setting SCRAPLING_PYTHON_PATH; the runtime instructions and code keep network activity limited to provider endpoints (Google APIs, Wikipedia, Reddit, Google News RSS, DuckDuckGo via scrapling). The code uses subprocess.run to execute stealth_fetch.py with the query as an argument (explicit, not reading arbitrary files). There are no instructions to read unrelated system files or exfiltrate environment variables.
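A minimal sketch of the argv pattern the scan describes: the query travels to stealth_fetch.py as a single list element, with no shell involved, so shell metacharacters in the query are never interpreted. The function names and the bare `stealth_fetch.py` path here are illustrative assumptions, not the skill's actual internals.

```python
# Sketch (assumed names/paths): pass the query as one argv element, no shell.
import os
import subprocess
import sys

def build_fetch_command(query: str) -> list:
    """Assemble the subprocess argv; SCRAPLING_PYTHON_PATH overrides the interpreter."""
    python_bin = os.environ.get("SCRAPLING_PYTHON_PATH", sys.executable)
    return [python_bin, "stealth_fetch.py", query]

def run_stealth_fetch(query: str, timeout: int = 30) -> str:
    """Run the packaged fetcher; list-form args mean shell=False by default."""
    result = subprocess.run(
        build_fetch_command(query),
        capture_output=True, text=True, timeout=timeout,
    )
    result.check_returncode()
    return result.stdout
```

Because the argv is a list and `shell=True` is never used, a query such as `a; rm -rf /` arrives in the child process as inert data.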
Install Mechanism
There is no install spec (instruction-only for the platform), which is low risk. SKILL.md requires creating a venv and pip-installing 'scrapling' there; skill.json lists python dependencies ('redis', 'scrapling') and binary 'python3' — this is consistent with the code (redis is optional and only imported when REDIS_HOST is set). No remote arbitrary downloads or extract steps are present.
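A hedged sketch of the lazy-import pattern the scan notes: redis is only imported when REDIS_HOST is set, so installs that skip caching never need the dependency. The function name is illustrative, not taken from the skill.

```python
# Sketch (assumed name): import redis only when caching is configured.
import os

def get_cache():
    host = os.environ.get("REDIS_HOST")
    if not host:
        return None  # caching disabled; redis need not be installed
    import redis  # imported only on this branch
    port = int(os.environ.get("REDIS_PORT", "6379"))
    return redis.Redis(host=host, port=port)
```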
Credentials
All requested/declared environment variables are proportional and directly tied to functionality: optional Google API credentials for CSE, SCRAPLING_PYTHON_PATH for the scraper venv, REDIS_HOST/PORT for caching, and SEARCH_USER_AGENT for HTTP requests. No unrelated secrets or broad credential requests are present.
Persistence & Privilege
The skill does not request always:true, does not modify other skills, and asks for no system-wide configuration or persistent privileges. It executes a local helper script via subprocess but that helper is packaged with the skill; this is expected behavior for the scrapling provider and is limited to the skill's scope.
Assessment
This skill appears to do what it claims: aggregate searches across Google CSE, Wikipedia, Reddit, Google News RSS, and a scrapling-based DuckDuckGo scraper. Before installing, consider the following:
- Run the scrapling provider in a dedicated, isolated virtual environment as instructed, and set SCRAPLING_PYTHON_PATH to that venv's python so unreviewed code never runs under your system python.
- SKILL.md references a scripts/ path while the files live at the repository root; verify the file paths when invoking the tool.
- The scrapling package executes scraping logic (stealth_fetch.py runs as a subprocess); review that package's source or use network isolation if you don't trust it.
- Scope and protect Google API keys (if used) and any Redis host you configure; REDIS_HOST is optional and used only for caching.
- For higher assurance, inspect the GitHub homepage referenced in skill.json and/or run the code in a sandbox before granting access to any credentials or production networks.

Like a lobster shell, security has layers: review code before you run it.
Current version: v3.5.1
Tags: google, latest, news, api, reddit, rss, search, wiki
SKILL.md
Search Cluster (Industrial Standard v3.1)
A multi-provider search aggregator designed for high-availability and security.
Installation
The scrapling provider requires a dedicated virtual environment.
- Create a venv: python3 -m venv venv/scrapling
- Install scrapling: venv/scrapling/bin/pip install scrapling
- Provide the path to the venv binary in SCRAPLING_PYTHON_PATH.
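A quick post-install sanity check (illustrative, not part of the skill): verify that the path you plan to export as SCRAPLING_PYTHON_PATH actually points at an executable interpreter before wiring it into your environment.

```python
# Sketch (assumed helper): confirm the venv interpreter exists and is executable.
import os

def is_usable_interpreter(path: str) -> bool:
    """True if path exists and is executable, e.g. venv/scrapling/bin/python."""
    return bool(path) and os.path.isfile(path) and os.access(path, os.X_OK)
```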
Security Posture
- Subprocess Isolation: Query inputs are passed as arguments to stealth_fetch.py.
- Strict TLS: Mandatory SSL verification on all providers.
- Sanitization: Integrated native internal scrubber (Path Neutral).
Requirements and Environment
Declare these variables in your environment or vault:
| Variable | Requirement | Description |
|---|---|---|
| GOOGLE_API_KEY | Optional | API Key for Google Custom Search. |
| GOOGLE_CSE_ID | Optional | Search Engine ID for Google CSE. |
| SCRAPLING_PYTHON_PATH | Optional | Path to the scrapling venv python binary. |
| REDIS_HOST | Optional | Host for result caching. |
| REDIS_PORT | Optional | Port for result caching (Default: 6379). |
| SEARCH_USER_AGENT | Optional | Custom User-Agent string. |
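An illustrative loader for the table above: every variable is optional, and the REDIS_PORT default matches the table (6379). The fallback User-Agent string is an assumption, not the skill's actual default.

```python
# Sketch: read the documented variables with the table's defaults.
import os

def load_config() -> dict:
    return {
        "google_api_key": os.environ.get("GOOGLE_API_KEY"),
        "google_cse_id": os.environ.get("GOOGLE_CSE_ID"),
        "scrapling_python": os.environ.get("SCRAPLING_PYTHON_PATH"),
        "redis_host": os.environ.get("REDIS_HOST"),
        "redis_port": int(os.environ.get("REDIS_PORT", "6379")),  # table default
        "user_agent": os.environ.get("SEARCH_USER_AGENT", "search-cluster"),  # fallback assumed
    }
```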
Providers
- google: Official Google Custom Search.
- wiki: Wikipedia OpenSearch API.
- reddit: Reddit JSON search API.
- gnews: Google News RSS aggregator.
- scrapling: Headless stealth scraping (via DuckDuckGo).
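One plausible wiring for the provider list above: a name-to-function dispatch table, with "all" fanning out across every provider. The stub fetchers are placeholders; the skill's real internals may differ.

```python
# Sketch (stub fetchers): dispatch a query to one provider or all of them.
PROVIDERS = {
    "google": lambda q: [],     # Google Custom Search (stub)
    "wiki": lambda q: [],       # Wikipedia OpenSearch (stub)
    "reddit": lambda q: [],     # Reddit JSON search (stub)
    "gnews": lambda q: [],      # Google News RSS (stub)
    "scrapling": lambda q: [],  # DuckDuckGo via scrapling (stub)
}

def dispatch(provider: str, query: str) -> dict:
    """Return {provider: results}; 'all' queries every provider."""
    if provider == "all":
        return {name: fetch(query) for name, fetch in PROVIDERS.items()}
    return {provider: PROVIDERS[provider](query)}
```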
Included Scripts
- scripts/search-cluster.py: Main entry point.
- scripts/stealth_fetch.py: Scrapling fetcher (REQUIRED for scrapling provider).
Workflow
- Execute: scripts/search-cluster.py all "<query>"
- Output is structured JSON with source, title, link, and sanitized snippet.
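The Workflow section says the output is structured JSON with source, title, link, and a sanitized snippet. A consumer could validate records like this (field names come from the section above; the sample record in the test is invented):

```python
# Sketch: parse the tool's JSON output and require the documented fields.
import json

REQUIRED_FIELDS = {"source", "title", "link", "snippet"}

def parse_results(raw: str) -> list:
    """Parse JSON output; raise if a record lacks a documented field."""
    records = json.loads(raw)
    for rec in records:
        missing = REQUIRED_FIELDS - rec.keys()
        if missing:
            raise ValueError("result missing fields: %s" % sorted(missing))
    return records
```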
Files
5 total
