SearXNG-lite

v1.0.0


Lightweight multi-engine aggregated web search. No Docker, no server, no SearXNG instance required — just a single Python script that queries search engines directly.

26 engines across 9 categories, concurrent execution, JSON + compact text output.

What makes this different

Unlike other SearXNG skills that need a running SearXNG server, this skill:

  • Zero infrastructure — no Docker, no SearXNG instance, no HTTP server
  • Single file — one Python script (~850 lines) does everything
  • Direct queries — sends requests to search engines in-process via httpx
  • Hot-reload config — edit config.yml to toggle categories, changes apply instantly
  • Concurrent — queries multiple engines in parallel (up to 5 threads)
  • Deduplication — merges results from multiple engines, scores by cross-engine overlap
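The merge-and-score step can be sketched roughly as follows. This is an illustrative reconstruction, not the script's actual code; `merge_results` and its URL normalization are hypothetical:

```python
from urllib.parse import urlsplit

def merge_results(per_engine):
    """Merge result lists from several engines; score = engine overlap.

    per_engine maps engine name -> list of {"title", "url", "content"}.
    Results sharing a normalized URL collapse into one entry whose
    score counts how many engines returned it.
    """
    merged = {}
    for engine, results in per_engine.items():
        for r in results:
            # Normalize: ignore scheme, case of host, and trailing slash.
            parts = urlsplit(r["url"])
            key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
            entry = merged.setdefault(key, {**r, "engines": [], "score": 0})
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
                entry["score"] += 1
    # Highest cross-engine overlap first.
    return sorted(merged.values(), key=lambda e: -e["score"])
```

With this scheme, `https://example.com/a` from bing and `http://example.com/a/` from brave count as the same result with score 2.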

Requirements

  • Python 3.10+
  • httpx — HTTP client (pip3 install httpx)
  • lxml — HTML parser (pip3 install lxml, pre-installed on macOS)
  • (Optional) socksio — for SOCKS proxy support (pip3 install socksio)
  • (Optional) pyyaml — for config parsing (pip3 install pyyaml; falls back to built-in parser)
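Optional dependencies like pyyaml are typically handled with a guarded import. A sketch of how the fallback might look (the fallback shown handles only flat `key: value` lines and is far simpler than the script's real parser):

```python
# Optional: full YAML support when pyyaml is installed, else a fallback.
try:
    import yaml  # pip3 install pyyaml
except ImportError:
    yaml = None

def parse_config(text):
    if yaml is not None:
        return yaml.safe_load(text)
    # Minimal fallback: flat "key: value" lines only, no nesting.
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if ":" in line:
            key, _, value = line.partition(":")
            config[key.strip()] = value.strip().strip('"')
    return config
```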

Quick start

# Install dependencies
pip3 install httpx lxml

# Search
python3 scripts/search.py "your query"

# List all engines
python3 scripts/search.py --list

How to search

python3 scripts/search.py "query"                          # default (general + knowledge engines)
python3 scripts/search.py "query" -c dev                   # by category
python3 scripts/search.py "query" -c dev,academic          # multiple categories
python3 scripts/search.py "query" -e github,arxiv          # specific engines
python3 scripts/search.py "query" --all                    # all enabled engines
python3 scripts/search.py "query" -l zh-CN                 # Chinese results
python3 scripts/search.py "query" -n 5                     # limit results
python3 scripts/search.py "query" --compact                # title + url + snippet text output
python3 scripts/search.py --list                           # show all engines & categories

All paths are relative to this skill's directory.

Arguments

| Arg | Short | Default | Description |
| --- | --- | --- | --- |
| query | | (required) | Search query |
| --engines | -e | | Comma-separated engine names |
| --categories | -c | | Comma-separated categories |
| --max-results | -n | 10 | Max results |
| --lang | -l | en | Language code (e.g. en, zh-CN, de) |
| --page | -p | 1 | Page number |
| --proxy | | from config | Proxy URL (overrides config/env) |
| --timeout | | 12 | Timeout in seconds |
| --all | | | Use all enabled engines |
| --compact | | | Human-readable text output |
| --list | | | List engines and exit |
| --debug | | | Enable debug logging |

Without -e or -c, searches general + knowledge categories.
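That precedence (explicit engines beat categories, which beat the default set) can be sketched as below. This is a hypothetical illustration; `CATALOG` is a toy subset standing in for the real 26-engine table:

```python
# Toy catalog: engine name -> category (illustrative subset only).
CATALOG = {
    "bing": "general", "brave": "general", "duckduckgo": "general",
    "wikipedia": "knowledge", "github": "dev", "arxiv": "academic",
}

def select_engines(engines=None, categories=None, use_all=False):
    """Mirror the CLI precedence: --all, then -e, then -c, then defaults."""
    if use_all:
        return sorted(CATALOG)
    if engines:
        return [e for e in engines if e in CATALOG]
    wanted = set(categories) if categories else {"general", "knowledge"}
    return sorted(e for e, cat in CATALOG.items() if cat in wanted)
```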

Configuration

Edit config.yml in the skill directory to customize behavior:

# Proxy for engines that need it (Google, YouTube, Reddit, etc.)
# Supports: http, https, socks5, socks5h
# Leave empty to disable — proxy-required engines will fail silently.
proxy: "socks5h://127.0.0.1:1080"

# Category toggles
categories:
  general: true       # bing, brave, duckduckgo, google🌐, startpage, yahoo
  knowledge: true     # wikipedia, wikidata, wolframalpha🌐
  dev: true           # github, gitlab, stackoverflow, hackernews, reddit🌐, huggingface🌐, mdn
  academic: true      # arxiv, semantic_scholar, google_scholar🌐, crossref
  news: false         # bing_news, reuters
  video: false        # youtube🌐
  images: false       # unsplash
  social: false       # lemmy🌐
  translate: false    # lingva🌐

Proxy setup

Some engines (marked with 🌐) require a proxy to access. Three ways to configure:

  1. config.yml (recommended): set proxy: "your-proxy-url"
  2. Environment variable: set HTTPS_PROXY=your-proxy-url
  3. CLI flag: pass --proxy your-proxy-url per search

Priority: CLI flag > config.yml > environment variable.

If no proxy is configured, 🌐-marked engines will fail silently and results from other engines are still returned.
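That precedence is essentially a chain of fallbacks. A sketch (the helper name is mine):

```python
import os

def resolve_proxy(cli_proxy=None, config_proxy=None, env=None):
    """Pick the proxy by priority: CLI flag > config.yml > HTTPS_PROXY."""
    env = os.environ if env is None else env
    return cli_proxy or config_proxy or env.get("HTTPS_PROXY") or None
```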

Config hot-reload

The config file is read on every search call. No restart needed — just edit and save.

If config.yml is missing, the script falls back to a default set of engines: bing, brave, duckduckgo, wikipedia.
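A minimal sketch of that read-on-every-call behavior, assuming the documented defaults (the function name is mine and the actual parsing is elided):

```python
import pathlib

# Defaults used when config.yml is absent, per the docs above.
DEFAULT_ENGINES = ["bing", "brave", "duckduckgo", "wikipedia"]

def read_config(path="config.yml"):
    """Read the config fresh on every search call (hence hot-reload).

    Returns the raw text, or None when the file is missing so the
    caller can fall back to DEFAULT_ENGINES.
    """
    try:
        return pathlib.Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
```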

Categories and engines

| Category | Engines | Use for |
| --- | --- | --- |
| general | bing, brave, duckduckgo, google🌐, startpage, yahoo | General web search |
| knowledge | wikipedia, wikidata, wolframalpha🌐 | Facts, definitions, calculations |
| dev | github, gitlab, stackoverflow, hackernews, reddit🌐, huggingface🌐, mdn | Code, repos, dev Q&A, AI models |
| academic | arxiv, semantic_scholar, google_scholar🌐, crossref | Papers, citations |
| news | bing_news, reuters | Current events |
| video | youtube🌐 | Video search |
| images | unsplash | Free stock photos |
| social | lemmy🌐 | Community discussions |
| translate | lingva🌐 | Translation |

🌐 = requires proxy. Without proxy, these engines are skipped.

Output format

Default JSON:

{
  "query": "search term",
  "results": [
    {"title": "...", "url": "...", "content": "...", "engines": ["bing","brave"], "score": 2}
  ],
  "result_count": 15,
  "elapsed": 2.1,
  "engines_used": ["bing", "brave", "wikipedia"],
  "errors": []
}

--compact prints human-readable text instead: a numbered list with title, URL, snippet, and engine tags.

score counts how many engines returned the same result; a higher score means stronger cross-engine agreement, which usually correlates with relevance.
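Since score is plain cross-engine overlap, downstream code can re-rank the JSON however it likes. A small sketch using the documented payload shape (the values are made up):

```python
import json

# Example payload in the documented output shape.
payload = json.loads("""
{
  "query": "search term",
  "results": [
    {"title": "B", "url": "https://b.example", "content": "", "engines": ["bing"], "score": 1},
    {"title": "A", "url": "https://a.example", "content": "", "engines": ["bing", "brave"], "score": 2}
  ],
  "result_count": 2,
  "elapsed": 0.1,
  "engines_used": ["bing", "brave"],
  "errors": []
}
""")

# Highest cross-engine agreement first; title as a stable tie-breaker.
ranked = sorted(payload["results"], key=lambda r: (-r["score"], r["title"]))
```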

Typical agent workflows

  • General research: python3 scripts/search.py "topic" -n 5
  • Find a library: python3 scripts/search.py "image processing python" -e github
  • Academic papers: python3 scripts/search.py "attention mechanism" -c academic -n 5
  • Tech discussions: python3 scripts/search.py "topic" -e hackernews,reddit
  • AI models: python3 scripts/search.py "text-to-speech" -e huggingface
  • Dev docs: python3 scripts/search.py "fetch API" -e mdn,stackoverflow
  • Chinese search: python3 scripts/search.py "大语言模型" -l zh-CN -n 5
  • News: python3 scripts/search.py "AI regulation" -c news

Version tags

latest: vk973243d8xr5c4h26g089pqqcs83jktc