Web Scraper Trae

v1.0.1

Opens browser and scrapes webpage content using Playwright. Invoke when user wants to crawl/scrape a webpage, extract data from a website, or get content fro...

⭐ 0 · 114 · 1 current · 1 all-time

by @zhengjia626

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for zhengjia626/web-scraper-trae.


Install the skill "Web Scraper Trae" (zhengjia626/web-scraper-trae) from ClawHub.
Skill page: https://clawhub.ai/zhengjia626/web-scraper-trae
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install web-scraper-trae

ClawHub CLI

npx clawhub@latest install web-scraper-trae

Security Scan

VirusTotal

Benign

OpenClaw

Benign

medium confidence

ℹ

Purpose & Capability

The SKILL.md clearly implements a Playwright-based scraper and this matches the name/description. However, the registry metadata lists no required binaries while the instructions explicitly require Node.js, npm/npx, and downloading Playwright/Chromium — a minor inconsistency between declared requirements and actual prerequisites.

✓

Instruction Scope

Instructions stay within the stated purpose: create and run a Node script that navigates to a provided URL and returns title/text/html. The instructions do not ask to read unrelated files, environment variables, or transmit data to third-party endpoints beyond fetching the target URL. Note: the output includes full HTML content from the target page, which may contain sensitive data if the URL requires authentication or points to internal resources.

ℹ

Install Mechanism

There is no formal install spec in the metadata (instruction-only). The SKILL.md expects use of npm/npx to install Playwright, which in turn downloads Chromium from Playwright's release hosts — this is a standard but non-trivial network download and writes large binaries to disk. This is expected for a Playwright scraper but worth noting for environments with restricted network access or storage quotas.

✓

Credentials

The skill requests no environment variables or credentials. That is proportional to its stated purpose. Make sure you don't pass URLs that require credentials, since the skill does not include built-in auth handling.

✓

Persistence & Privilege

Skill does not request always:true and makes no claims to modify agent/system configuration. Autonomous invocation is allowed (platform default) but not combined with other concerning privileges.

Assessment

This skill is instruction-only and implements a straightforward Playwright scraper. Before installing or running it: (1) ensure Node.js/npm/npx are available — the metadata does not declare them; (2) expect Playwright to download a Chromium binary (large, network activity, disk usage); (3) run the scraper in an isolated/sandboxed environment (the script uses --no-sandbox) to reduce risk if untrusted pages are loaded; (4) avoid scraping URLs that require authentication or point to internal/private systems unless you understand the implications, since the script will print full HTML/text; (5) if you need tighter controls, consider reviewing or copying the provided script into your own vetted environment rather than executing arbitrary installs. If you want higher assurance, ask the publisher for an explicit install spec and declared binary requirements (node/npm).
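Point (4) above can be partly enforced before the scraper runs. The following is a minimal sketch, not part of the skill: `isPublicHttpUrl` is a hypothetical helper that accepts only http/https URLs and rejects localhost, IPv6 literals, and private or loopback IPv4 literals. It does not resolve DNS, so a public hostname that points at an internal address would still pass.

```javascript
// Hypothetical pre-check (not part of the skill): reject URLs that obviously
// target local or private hosts. Assumption: literal-address checks only,
// no DNS resolution is performed.
function isPublicHttpUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') return false;

  const host = parsed.hostname;
  if (host === 'localhost' || host.endsWith('.localhost')) return false;
  if (host.startsWith('[')) return false; // IPv6 literals rejected wholesale in this sketch

  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 0 || a === 10 || a === 127) return false;      // this-network, private, loopback
    if (a === 172 && b >= 16 && b <= 31) return false;       // private 172.16.0.0/12
    if (a === 192 && b === 168) return false;                // private 192.168.0.0/16
    if (a === 169 && b === 254) return false;                // link-local
  }
  return true;
}
```

A caller could run this check on the URL argument and refuse to launch the browser when it returns false.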

Like a lobster shell, security has layers — review code before you run it.

latestvk977wnsy440cpeb0151wcnt4m984jt5j

114 downloads

0 stars

2 versions

Updated 2w ago

MIT-0

Web Scraper Trae

Opens a browser using Playwright and scrapes webpage content.

Prerequisites

npm install playwright
npx playwright install chromium
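Because the registry metadata does not declare these prerequisites, a quick preflight check can help. This is a minimal sketch under stated assumptions: `isInstalled` is a hypothetical helper that only tests whether a module resolves from the current directory. It does not verify that the Chromium binary has been downloaded.

```javascript
// Hypothetical preflight helper (not part of the skill): report whether a
// module can be resolved from the current working directory.
function isInstalled(moduleName) {
  try {
    require.resolve(moduleName);
    return true;
  } catch (err) {
    return false;
  }
}

if (!isInstalled('playwright')) {
  console.error('playwright not found; run: npm install playwright');
}
```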

Usage

When the user provides a URL, create a Node.js script (for example, scrape.js) to scrape the page:

const { chromium } = require('playwright');

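// Launch headless Chromium, load the URL, and return the page title, text, and HTML.
async function scrape(url) {
  const browser = await chromium.launch({
    headless: true,
    // Chromium's sandbox is disabled here; run only in an isolated environment.
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle', timeout: 60000 });

    const title = await page.title();
    const text = await page.textContent('body');
    const html = await page.content();

    return { title, text, html, url };
  } finally {
    // Close the browser even if navigation or extraction throws.
    await browser.close();
  }
}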

const url = process.argv[2];
if (!url) {
  console.error('Please provide a URL argument');
  process.exit(1);
}

scrape(url).then(result => {
  console.log('=== SCRAPE_RESULT ===');
  console.log(JSON.stringify(result, null, 2));
}).catch(err => {
  console.error('Scrape failed:', err.message);
  process.exit(1);
});

Execution

Run the script with:

node scrape.js "https://example.com"
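The script prints a `=== SCRAPE_RESULT ===` marker line followed by the JSON, so a caller has to strip the marker before parsing. The sketch below shows one way to do that; `parseScrapeOutput` and `runScraper` are hypothetical names, and the 64 MiB `maxBuffer` is an assumed limit chosen because full-page HTML can be large.

```javascript
const { execFileSync } = require('node:child_process');

const MARKER = '=== SCRAPE_RESULT ===';

// Extract and parse the JSON that follows the marker line in the script's stdout.
function parseScrapeOutput(stdout) {
  const idx = stdout.indexOf(MARKER);
  if (idx === -1) {
    throw new Error('SCRAPE_RESULT marker not found in output');
  }
  // JSON.parse tolerates the surrounding newlines left after slicing.
  return JSON.parse(stdout.slice(idx + MARKER.length));
}

// Hypothetical wrapper: assumes scrape.js is in the current directory.
function runScraper(url) {
  const stdout = execFileSync('node', ['scrape.js', url], {
    encoding: 'utf8',
    maxBuffer: 64 * 1024 * 1024 // assumed limit; the html field can be large
  });
  return parseScrapeOutput(stdout);
}
```

Anything printed before the marker is ignored, which keeps the parser working if log lines appear ahead of the result.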

Output Format

Return JSON with:

title: Page title
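text: Text content of the page body with HTML tags stripped (taken from textContent, so text in hidden elements and inline script/style blocks is included)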
html: Full HTML source
url: Original URL
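A consumer may want to confirm this shape before using the result. The following is a minimal sketch, with `isScrapeResult` as a hypothetical helper. It assumes `text` may be `null`, since Playwright's `textContent` is typed as returning a string or `null`.

```javascript
// Hypothetical shape check for the scraper's JSON result.
// Assumption: title, html, and url are strings; text is a string or null.
function isScrapeResult(value) {
  if (value === null || typeof value !== 'object') return false;
  const stringsOk = ['title', 'html', 'url'].every(
    (key) => typeof value[key] === 'string'
  );
  const textOk = typeof value.text === 'string' || value.text === null;
  return stringsOk && textOk;
}
```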

Notes

Use headless: true for server environments
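Use waitUntil: 'networkidle' to wait until there have been no network connections for at least 500 ms; pages that poll or stream continuously may never go idle and will hit the timeout
Set timeout to 60 seconds (timeout: 60000, in milliseconds) for slow pages
For SPAs (single-page applications) that load content dynamically, the networkidle wait gives client-side rendering time to finish before the content is read
For pages requiring interaction, use the playwright-cli skill instead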
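
Pages that keep connections open may never reach network idle, in which case the navigation times out. One possible mitigation, sketched here and not part of the skill, is to retry with a weaker wait condition. `gotoWithFallback` is a hypothetical helper, and the check on `err.name === 'TimeoutError'` is an assumption about how Playwright names its timeout errors.

```javascript
// Hypothetical helper: try 'networkidle' first, then fall back to
// 'domcontentloaded' if the first navigation times out.
// Returns the wait condition that succeeded.
async function gotoWithFallback(page, url, timeoutMs = 60000) {
  try {
    await page.goto(url, { waitUntil: 'networkidle', timeout: timeoutMs });
    return 'networkidle';
  } catch (err) {
    // Assumption: Playwright timeout errors carry the name 'TimeoutError'.
    if (err.name !== 'TimeoutError') throw err;
    await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
    return 'domcontentloaded';
  }
}
```

The fallback navigates a second time, so the worst case takes roughly twice the timeout. Content that renders after DOMContentLoaded may be missing from the fallback result.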
