Chinese Ebook Downloader

v2.0.0

Download Chinese-language ebooks from multiple sources with automatic A→B→C fallback. Primary source: online book library with ~100% coverage, no daily limit...

⭐ 0· 139·0 current·0 all-time

by @lb1121

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for lb1121/chinese-ebook-downloader.


Install the skill "Chinese Ebook Downloader" (lb1121/chinese-ebook-downloader) from ClawHub.
Skill page: https://clawhub.ai/lb1121/chinese-ebook-downloader
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install chinese-ebook-downloader

ClawHub CLI

npx clawhub@latest install chinese-ebook-downloader

Security Scan

VirusTotal

Suspicious

OpenClaw

Suspicious

medium confidence

Purpose & Capability

Name/description align with the included scripts: the code implements multi‑source search, browser automation (Playwright), decrypting file hosts, curl downloads, ZIP extraction and EPUB→PDF conversion. However the package metadata claims 'no required binaries' and 'no required env vars' while the code clearly depends on external tooling and libraries (Playwright, curl, unzip, file, weasyprint/ebooklib, CJK fonts) and even references a hard‑coded Python interpreter path. This mismatch between declared requirements and what the code needs is an incoherence the user must resolve.

Instruction Scope

SKILL.md and the scripts instruct the agent (and the user) to automate browser interactions: enter passwords, wait countdowns, extract JS variables from pages and run JS via page.evaluate to call file-host APIs, then download with curl and extract files. Those steps are within the downloader's purpose, but the automation intentionally executes extracted/constructed JS in a browser context and runs arbitrary downloads and subprocesses. That increases the attack surface (malicious remote pages could cause unexpected network activity). The instructions do not ask the agent to read unrelated system files or credentials.

Install Mechanism

There is no install spec (instruction-only install), but the bundle includes many Python scripts which require installing dependencies manually. README mentions Playwright and pip packages, but the registry metadata declared no required binaries. Several scripts assume system binaries exist (curl, unzip, file) and one shell script and multiple Python scripts hard-code an absolute PYTHON path (/opt/homebrew/.../env9/bin/python) and paths under ~/.openclaw/workspace — these are brittle and incoherent with a cross‑platform skill. Lack of an install step means users may run these scripts with missing dependencies or unexpected interpreter versions.

Credentials

The skill does not declare required environment variables in the registry manifest, but the README and code reference optional env vars (SOURCE_A_BASE_URL, SOURCE_B_BASE_URL, FILE_HOST_BASE_URL, EBOOK_DEFAULT_PASSWORD). These are reasonable for configuring source hosts and a default extraction password. The skill does not request unrelated secrets (AWS keys, tokens). Still, default passwords and host base URLs can be changed via env; ensure you don't accidentally set sensitive values there.

Persistence & Privilege

The skill is not always-enabled (always: false), so it is not force-included in every agent run. It does not modify other skills or global agent settings. It does read and write files in /tmp and under the user's home directory, which is expected for a downloader.

What to consider before installing

Key points before you install/use:

- Functional fit: The skill appears to do what it claims (automated ebook search/download + conversion). The included scripts perform browser automation, decrypt file-host pages, call APIs, and download files.
- Missing/declarative mismatches: The registry says 'no required binaries' but the code needs: Playwright (and a browser runtime), Python packages (playwright, ebooklib, weasyprint), system tools (curl, unzip, file), and CJK fonts. Several scripts contain a hard-coded Python interpreter path (/opt/homebrew/...), which will likely fail on other systems. Expect to manually install dependencies and edit paths.
- Security surface: The automation executes JavaScript extracted from third-party pages (page.evaluate), launches headless browsers, and executes shell commands (curl, unzip, subprocess.run). That is expected for this downloader but increases risk: a malicious or compromised download page could trigger unexpected network requests or server‑side interactions. To reduce risk, run this skill only in an isolated environment (container, VM, or dedicated machine), review the code paths that call page.evaluate and subprocess.run, and avoid running with elevated privileges.
- Legal and policy: The skill is designed to retrieve ebooks from sites and file hosts (including Anna's Archive/libgen mirrors). That may conflict with copyright law or your organization's acceptable-use policy. Confirm legality and policy compliance before using.
- Practical recommendations:
  - Install and test dependencies in a sandbox (virtualenv/conda, container). Follow README for Playwright setup.
  - Replace or remove hard-coded PYTHON paths and verify environment values (SOURCE_* variables) point to expected hosts.
  - Inspect and, if desired, restrict network access for the process (e.g., block outbound except to known sources) when testing.
  - If you want to use it as an OpenClaw skill, add an explicit install step and declare required binaries and env vars so the runtime can validate prerequisites.

If you want, I can: list the exact files/lines that reference hard-coded paths and subprocess calls, extract all external hostnames the code references, or generate a minimal checklist of the packages/commands to install to run this safely in a container.

Like a lobster shell, security has layers — review code before you run it.

latestvk9770158x4mx4n31sjc79bp3hh83kyrn

139 downloads

0 stars

7 versions

Updated 1mo ago

MIT-0

Chinese Ebook Downloader

Download Chinese ebooks from multiple sources with automatic fallback and format conversion.

Quick Start

# Single book download (multi-source fallback)
python scripts/download_book.py --title "超越百岁" --author "彼得·阿提亚"

# Multi-source batch download (A→B→C fallback + EPUB→PDF conversion)
python scripts/multi_source_download.py ~/Books/

# Search Anna's Archive directly
python scripts/search_source_c.py "书名" "作者"

# Convert EPUB to PDF
python scripts/epub_to_pdf.py book.epub book.pdf

Download Sources (Priority Order)

Source	Coverage	Limit	Notes
Source A (online book library)	~100%	None	Primary — high coverage for popular Chinese books
Source B (secondary library)	~8%	None	Fallback for missing titles
Source C (Anna's Archive)	Wide	Rate-limited	Last resort — uses libgen.li mirrors

Note: Z-Library is no longer used as a source because of its limit of 10 downloads per day.

Multi-Source Fallback

The multi_source_download.py script automatically tries sources in order:

Source A → Source B → Source C → EPUB→PDF Conversion

Workflow per book:

1. Try Source A (download the ZIP, then extract the PDF/EPUB).
2. If that fails, try Source B (file host download).
3. If that fails, try Source C (Anna's Archive via libgen.li).
4. If only an EPUB is found, auto-convert it to PDF using weasyprint.
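
The fallback order above can be sketched as a loop over source functions. This is a minimal illustration, not the code in multi_source_download.py: `download_with_fallback` and the per-source callables are hypothetical names, and each callable is assumed to return the saved file path or None when the book is not found.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each source is a (label, function) pair. The function is assumed to
# return the path of the saved file, or None when the book was not found.
Source = Tuple[str, Callable[[str, str], Optional[str]]]


def download_with_fallback(title: str, author: str,
                           sources: Sequence[Source]) -> Optional[Tuple[str, str]]:
    """Try each source in priority order and stop at the first success."""
    for label, fetch in sources:
        try:
            path = fetch(title, author)
        except Exception as exc:
            # A failing source must not abort the chain; report and move on.
            print(f"{label} failed for {title!r}: {exc}")
            continue
        if path:
            return label, path
    return None
```

Passing the sources as an ordered list keeps the A→B→C priority in one place, so adding or dropping a source does not touch the loop.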

Usage:

# Edit BOOKS list in script, then run:
python scripts/multi_source_download.py ~/Books/

EPUB → PDF Conversion

When only EPUB format is available, auto-convert using weasyprint:

# Single file
python scripts/epub_to_pdf.py input.epub output.pdf

# Batch convert directory
python scripts/epub_to_pdf.py --batch ~/Books/

Requirements: ebooklib, weasyprint, CJK fonts installed.
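
Because the skill ships without an install step, a quick preflight check can show what is missing before a conversion is attempted. The sketch below is an assumption-laden helper, not part of the skill: `missing_modules` and `missing_binaries` are hypothetical names, the binary list (curl, unzip, file) comes from the security scan on this page, and CJK font availability is not checked.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_binaries(names):
    """Return the command names that are not on PATH."""
    return [n for n in names if shutil.which(n) is None]


if __name__ == "__main__":
    # Import names are assumed to match the package names listed above.
    print("Missing Python packages:", missing_modules(["ebooklib", "weasyprint"]))
    print("Missing binaries:", missing_binaries(["curl", "unzip", "file"]))
```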

Scripts Reference

Script	Purpose
`download_book.py`	Primary download from Source A
`search_secondary_source.py`	Source B search & download
`search_source_c.py`	Anna's Archive search & download
`batch_download.py`	Batch download from JSON list
`multi_source_download.py`	Multi-source A→B→C fallback
`epub_to_pdf.py`	EPUB/MOBI to PDF conversion
`anna_iso_batch.sh`	Anna's Archive isolated batch (one process per book)

Source A Workflow (Primary)

Search → Get file host link → Decrypt → Wait countdown → API fetch → curl download → Extract ZIP

Step 1: Search

Search the primary library for the book title, navigate to the download page, and extract the file host URL and password.

Step 2: Decrypt

Navigate to the file host URL, enter the password, and click decrypt.

Step 3: Wait for countdown

The file hosting service requires a countdown before the download is released. Do not skip it.

Step 4: Fetch real download URL

Get page variables:

JSON.stringify({api_server, userid, file_id, share_id, file_chk, start_time, wait_seconds, verifycode})

Call API:

(async () => {
  // Build the request from the variables exposed by the decrypted page.
  const url = api_server + '/get_file_url.php?uid=' + userid
    + '&fid=' + file_id + '&folder_id=0&share_id=' + share_id
    + '&file_chk=' + file_chk + '&start_time=' + start_time
    + '&wait_seconds=' + wait_seconds + '&mb=0&app=0&acheck=0'
    + '&verifycode=' + verifycode + '&rd=' + Math.random();
  // Reuse the page's own AJAX headers when that helper is defined.
  const headers = typeof getAjaxHeaders === 'function' ? getAjaxHeaders() : {};
  const resp = await fetch(url, {headers: headers});
  // Return a string so the result survives the trip out of the browser context.
  return JSON.stringify(await resp.json());
})()

If the response's code field is 200, its downurl field holds the real download URL.
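
On the Python side, the JSON string returned by the browser call can be checked before it is handed to curl. This is a minimal sketch: `extract_download_url` is a hypothetical helper, and only the `code` and `downurl` fields named above are assumed to exist. Whether `code` arrives as a number or a string is not documented, so both are accepted.

```python
import json


def extract_download_url(raw: str) -> str:
    """Return `downurl` from the API response, or raise if the call failed."""
    data = json.loads(raw)
    # Compare as strings so both 200 and "200" count as success.
    if str(data.get("code")) != "200" or not data.get("downurl"):
        raise ValueError(f"file host API returned code {data.get('code')!r}")
    return data["downurl"]
```

A ValueError here corresponds to the "API non-200" case in Troubleshooting: re-navigate and re-decrypt rather than retrying the same request.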

Step 5: Download

curl -L -o "book.zip" "DOWNURL" \
  -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
  --max-time 1200

Step 6: Extract ZIP (GBK encoding)

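import os
import zipfile

BOOK_EXTS = ('.epub', '.azw3', '.mobi', '.pdf', '.txt')

with zipfile.ZipFile('book.zip', 'r') as z:
    for info in z.infolist():
        # zipfile decodes names without the UTF-8 flag as cp437;
        # undo that step and decode the raw bytes as GBK instead.
        try:
            name = info.filename.encode('cp437').decode('gbk')
        except (UnicodeEncodeError, UnicodeDecodeError):
            name = info.filename  # already a proper name (e.g. UTF-8 flagged)
        ext = os.path.splitext(name)[1].lower()
        if ext in BOOK_EXTS:
            # basename() drops directory parts, so files land in the current directory.
            with open(os.path.basename(name), 'wb') as f:
                f.write(z.read(info))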

Book Name Matching Strategy

When a book title is long or contains multiple names (e.g. box sets):

Removes subtitles (after "：" or ":")
Removes parenthetical content ("（...）", "(...)")
Removes "套装共X册" bundle descriptions
Splits "+"-connected titles into individual books
Tries each keyword until match found
Falls back to full title + author
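
The rules above can be sketched as a function that returns the search keywords in the order they would be tried. This is an interpretation, not the skill's code: `candidate_keywords` is a hypothetical name, and two behaviours are assumptions chosen to agree with the examples in this section. For "+"-joined box sets the text before the colon is treated as a series name and dropped, and for single titles the author fallback is appended to the shortened title.

```python
import re


def candidate_keywords(title, author=""):
    """Return search keywords for a title, in the order they should be tried."""
    # Drop parenthetical content in full-width or ASCII brackets.
    cleaned = re.sub(r"（[^）]*）|\([^)]*\)", "", title)
    # Drop bundle descriptions such as "套装共3册".
    cleaned = re.sub(r"套装共?\d+册", "", cleaned).strip()

    # Split once on a full-width or ASCII colon: head is the main title.
    parts = re.split(r"[：:]", cleaned, maxsplit=1)
    head = parts[0].strip()
    tail = parts[1].strip() if len(parts) > 1 else ""

    if "+" in cleaned:
        # Box set: each "+"-joined piece is its own book (assumption: when the
        # pieces follow a colon, the text before it is only a series name).
        source = tail if "+" in tail else head
        keywords = [p.strip() for p in source.split("+") if p.strip()]
        base = cleaned
    else:
        keywords = [head]
        base = head

    if author:
        keywords.append(base + author)  # last resort: title plus author
    return keywords
```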

Examples:

"杨定一全部生命系列：真原医+静坐+好睡（套装3册）" → tries "真原医", "静坐", "好睡"
"超越百岁：长寿的科学与艺术" → tries "超越百岁", then "超越百岁彼得·阿提亚"

Format Selection

Flag	Description
`--format pdf`	PDF only (default, preferred for NotebookLM)
`--format epub`	EPUB only
`--format mobi`	MOBI only
`--format azw3`	AZW3 only
`--format any`	Accept any available format
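
How a `--format` value might map onto the files extracted from an archive can be sketched as follows. `pick_file` is a hypothetical helper, and the preference order used for `any` is an assumption; the script's actual order is not documented here.

```python
import os

# Assumed preference order for "--format any"; not confirmed by the source.
ANY_ORDER = ("pdf", "epub", "azw3", "mobi")


def pick_file(filenames, fmt="pdf"):
    """Return the first filename matching the requested format, or None."""
    wanted = ANY_ORDER if fmt == "any" else (fmt,)
    for ext in wanted:
        for name in filenames:
            # Compare extensions case-insensitively (".PDF" matches "pdf").
            if os.path.splitext(name)[1].lower() == "." + ext:
                return name
    return None
```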

Batch Download

python scripts/batch_download.py --book-list books.json --output-dir ~/Books/

JSON format:

[
  {"title": "超越百岁", "file_url": "<file_host_url>", "password": "<password>"}
]

Features: resume via _progress.json, skip existing, rate limiting.
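
Resuming from a progress file can be sketched as below. The real layout of _progress.json is not documented, so the `{"done": [...]}` schema and the helper names `load_progress` and `mark_done` are assumptions made for illustration.

```python
import json
import os


def load_progress(path):
    """Return the set of finished titles, or an empty set on the first run."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return set(json.load(f).get("done", []))


def mark_done(path, title):
    """Record a finished title, writing through a temp file to avoid a torn file."""
    done = load_progress(path)
    done.add(title)
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese titles readable in the file.
        json.dump({"done": sorted(done)}, f, ensure_ascii=False)
    os.replace(tmp, path)  # atomic rename on the same filesystem
```

A batch loop would call `load_progress` once, skip any title already in the set, and call `mark_done` after each successful download.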

Troubleshooting

Problem	Solution
IP blocking	Use browser tool, not web_fetch
Link 404	Link expired, re-search
API non-200	Re-navigate and re-decrypt
Download is HTML	URL expired, fresh API call needed
ZIP filenames garbled	Use Python cp437→gbk, not unzip
Timeout on large files	Increase `--max-time` to 1200
Anna's Archive blocked	Try different mirror, use `anna_iso_batch.sh`
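
The "Download is HTML" case in the table above can be detected by looking at the first bytes of the saved file before trying to extract it. This is a minimal sketch with a hypothetical helper name, `sniff_download`. It recognises only the standard ZIP local-file signature, so an empty archive would be reported as unknown.

```python
def sniff_download(path):
    """Classify a downloaded file as 'zip', 'html', or 'unknown' by its first bytes."""
    with open(path, "rb") as f:
        head = f.read(512)
    # Non-empty ZIP archives start with the local file header signature.
    if head.startswith(b"PK\x03\x04"):
        return "zip"
    # An expired URL typically yields an HTML error page instead of the archive.
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html"
    return "unknown"
```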
