Web Scraper Trae

v1.0.1

Opens browser and scrapes webpage content using Playwright. Invoke when user wants to crawl/scrape a webpage, extract data from a website, or get content fro...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for zhengjia626/web-scraper-trae.

Prompt Preview: Install & Setup
Install the skill "Web Scraper Trae" (zhengjia626/web-scraper-trae) from ClawHub.
Skill page: https://clawhub.ai/zhengjia626/web-scraper-trae
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install web-scraper-trae

ClawHub CLI

Package manager switcher

npx clawhub@latest install web-scraper-trae
Security Scan
VirusTotal
Benign
OpenClaw
Benign
medium confidence
Purpose & Capability
The SKILL.md clearly describes a Playwright-based scraper, which matches the name and description. However, the registry metadata lists no required binaries, while the instructions explicitly require Node.js, npm/npx, and a Playwright/Chromium download; this is a minor inconsistency between declared requirements and actual prerequisites.
Instruction Scope
Instructions stay within the stated purpose: create and run a Node script that navigates to a provided URL and returns title/text/html. The instructions do not ask to read unrelated files, environment variables, or transmit data to third-party endpoints beyond fetching the target URL. Note: the output includes full HTML content from the target page, which may contain sensitive data if the URL requires authentication or points to internal resources.
Install Mechanism
There is no formal install spec in the metadata (instruction-only). The SKILL.md expects use of npm/npx to install Playwright, which in turn downloads Chromium from Playwright's release hosts — this is a standard but non-trivial network download and writes large binaries to disk. This is expected for a Playwright scraper but worth noting for environments with restricted network access or storage quotas.
Credentials
The skill requests no environment variables or credentials. That is proportional to its stated purpose. Make sure you don't pass URLs that require credentials, since the skill does not include built-in auth handling.
Persistence & Privilege
The skill does not request always:true and makes no claims to modify agent or system configuration. Autonomous invocation is allowed (the platform default) but is not combined with other concerning privileges.
Assessment
This skill is instruction-only and implements a straightforward Playwright scraper. Before installing or running it:

  1. Ensure Node.js/npm/npx are available; the metadata does not declare them.
  2. Expect Playwright to download a Chromium binary (a large download with network activity and disk usage).
  3. Run the scraper in an isolated or sandboxed environment (the script uses --no-sandbox) to reduce risk when untrusted pages are loaded.
  4. Avoid scraping URLs that require authentication or point to internal/private systems unless you understand the implications, since the script prints full HTML and text.
  5. If you need tighter controls, consider reviewing or copying the provided script into your own vetted environment rather than executing arbitrary installs.

If you want higher assurance, ask the publisher for an explicit install spec and declared binary requirements (node/npm).

Like a lobster shell, security has layers — review code before you run it.

latest: vk977wnsy440cpeb0151wcnt4m984jt5j
114 downloads
0 stars
2 versions
Updated 2w ago
v1.0.1
MIT-0

Web Scraper Trae

Opens a browser using Playwright and scrapes webpage content.

Prerequisites

npm install playwright
npx playwright install chromium

Usage

When user provides a URL, create a Node.js script to scrape the page:

const { chromium } = require('playwright');

async function scrape(url) {
  const browser = await chromium.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle', timeout: 60000 });

  const title = await page.title();
  const text = await page.textContent('body');
  const html = await page.content();

  await browser.close();

  return { title, text, html, url };
}

const url = process.argv[2];
if (!url) {
  console.error('Please provide a URL argument');
  process.exit(1);
}

scrape(url).then(result => {
  console.log('=== SCRAPE_RESULT ===');
  console.log(JSON.stringify(result, null, 2));
}).catch(err => {
  console.error('Scrape failed:', err.message);
  process.exit(1);
});

Execution

Run the script with:

node scrape.js "https://example.com"

Output Format

Return JSON with:

  • title: Page title
  • text: Visible text content (HTML stripped)
  • html: Full HTML source
  • url: Original URL
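Downstream tooling can recover the JSON by splitting the script's stdout on the === SCRAPE_RESULT === marker. A minimal sketch in plain Node; the parseScrapeOutput helper name is our own and not part of the skill:

```javascript
// Extract the JSON payload that the scraper prints after its marker line.
// parseScrapeOutput is a hypothetical helper, not shipped with the skill.
function parseScrapeOutput(stdout) {
  const marker = '=== SCRAPE_RESULT ===';
  const idx = stdout.indexOf(marker);
  if (idx === -1) {
    throw new Error('No SCRAPE_RESULT marker found in output');
  }
  return JSON.parse(stdout.slice(idx + marker.length));
}

// Example with output shaped like the script's:
const sample =
  '=== SCRAPE_RESULT ===\n' +
  JSON.stringify(
    { title: 'Example', text: 'Hi', html: '<html></html>', url: 'https://example.com' },
    null,
    2
  );
const result = parseScrapeOutput(sample);
console.log(result.title);
```

Parsing only the text after the marker keeps the result robust even if Playwright or Node emit warnings earlier in stdout.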

Notes

  • Use headless: true for server environments
  • Use waitUntil: 'networkidle' to ensure full page load
  • Set timeout to 60 seconds for slow pages
  • Handle SPAs (single-page applications) that load content dynamically
  • For pages requiring interaction, use playwright-cli skill instead
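For slow or flaky pages, the 60-second timeout alone may not be enough; wrapping the scrape call in a small retry helper can absorb transient failures. A sketch in plain Node; withRetry is our own helper, not part of the skill:

```javascript
// Retry an async operation a few times with a delay between attempts.
// withRetry is a hypothetical helper; you could wrap scrape(url) with it.
async function withRetry(fn, attempts = 3, delayMs = 1000) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        await new Promise(res => setTimeout(res, delayMs));
      }
    }
  }
  throw lastErr;
}

// Example: fails once, then succeeds on the second attempt.
let calls = 0;
withRetry(async () => {
  calls++;
  if (calls < 2) throw new Error('transient');
  return 'ok';
}, 3, 10).then(value => console.log(value));
```

In the scraper this would look like withRetry(() => scrape(url)), leaving the script's existing error handling unchanged for failures that persist across all attempts.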
