playwright-browser

Data & APIs

Use Playwright to browse websites with a real (non-headless) browser and extract data by hooking network responses.

Use when the user wants to:

- View a website's content, especially SPAs (single-page applications)
- Navigate websites and click links
- Search for keywords within web pages
- Extract data from JavaScript-heavy sites
- Capture API responses directly

Triggers on: "查看网页" (view web page), "browse site", "get content from URL", "extract data from website", "hook API responses", "抓取网页内容" (scrape web page content), "find link", "click link", "search page"

Install

openclaw skills install playwright-browser

Playwright Browser Skill

Purpose

This skill enables OpenClaw to launch a real Chrome/Chromium browser (non-headless), navigate to websites, and extract content by:

Scraping rendered DOM content
Hooking and capturing network responses (XHR/Fetch API calls)
Finding and clicking links
Searching for keywords in page content
Focusing on specific elements

Prerequisites

Ensure Playwright is installed:

pip install playwright
playwright install chromium

Usage

1. Basic Navigation and Content Extraction

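import importlib

# The hyphen in "playwright-browser" makes a plain `from ... import` a syntax
# error, so load the module by its dotted path string instead.
# Assumes the directory that contains `skills/` is on sys.path.
browser_agent = importlib.import_module("skills.playwright-browser.scripts.browser_agent")
SyncBrowserAgent = browser_agent.SyncBrowserAgent

agent = SyncBrowserAgent(headless=False)  # headless=False opens a visible window
content = agent.get_page_content("https://example.com")
print(f"Title: {content['title']}")
agent.close()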
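
If `get_page_content` raises, the `agent.close()` call above is never reached and the browser window stays open. A minimal sketch of one way to guard against that, using the standard library's `contextlib.closing`; the helper name `fetch_title` and the `agent_factory` parameter are illustrative and not part of the skill:

```python
from contextlib import closing


def fetch_title(agent_factory, url: str) -> str:
    """Create an agent, read the page title, and always close the browser.

    `agent_factory` is any zero-argument callable returning an object with
    `get_page_content(url)` and `close()`, e.g.
    `lambda: SyncBrowserAgent(headless=False)`.
    """
    # closing() calls agent.close() on exit, even if an exception is raised.
    with closing(agent_factory()) as agent:
        content = agent.get_page_content(url)
        return content["title"]
```

With the skill loaded, this would be called as `fetch_title(lambda: SyncBrowserAgent(headless=False), "https://example.com")`.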

2. Find and Click Links

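import importlib

# A hyphenated package name cannot appear in a `from ... import` statement,
# so the module is loaded by its dotted path string.
browser_agent = importlib.import_module("skills.playwright-browser.scripts.browser_agent")
SyncBrowserAgent = browser_agent.SyncBrowserAgent

agent = SyncBrowserAgent(headless=False)
agent.navigate("https://sina.com")

# Find links whose text contains "军事" ("military")
links = agent.find_links_by_text("军事")
for link in links:
    print(f"Found: {link['text']} -> {link['href']}")

# Click the first matching link
agent.find_link_and_click("军事")

agent.close()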
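
The example above works with results shaped like `{'text': ..., 'href': ...}`. To illustrate that shape without launching a browser, here is a rough sketch that performs the same kind of substring match on static HTML with the standard library's `html.parser`. It is not the skill's implementation, it does not see JavaScript-rendered links, and the class name `LinkCollector` is hypothetical:

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class LinkCollector(HTMLParser):
    """Collect {'text', 'href'} dicts for every <a href="..."> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Dict[str, str]] = []
        self._href: Optional[str] = None  # href of the <a> currently open
        self._text: List[str] = []        # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append({"text": text, "href": self._href})
            self._href = None


def find_links_by_text(html: str, needle: str) -> List[Dict[str, str]]:
    """Return links whose visible text contains `needle`."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [link for link in parser.links if needle in link["text"]]
```

A static parser like this can be handy for unit-testing downstream logic; for SPAs the rendered DOM from the real browser is still needed.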

3. Search Page Content

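import importlib

# Load the hyphenated package by its dotted path string.
browser_agent = importlib.import_module("skills.playwright-browser.scripts.browser_agent")
SyncBrowserAgent = browser_agent.SyncBrowserAgent

agent = SyncBrowserAgent(headless=False)
agent.navigate("https://mil.news.sina.com.cn/")

# Search the page for the keyword "伊朗" ("Iran")
results = agent.search_page_content("伊朗")
for r in results:
    print(f"Found: {r['text']}")

# Focus on the first result, if there is one
if results:
    agent.focus_on_element(results[0])

agent.close()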

4. Hook Network Responses

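import importlib

# Load the hyphenated package by its dotted path string.
browser_agent = importlib.import_module("skills.playwright-browser.scripts.browser_agent")
SyncBrowserAgent = browser_agent.SyncBrowserAgent

agent = SyncBrowserAgent(headless=False)

def on_api_response(response):
    # Assumes the callback receives a Playwright Response object. Playwright
    # exposes the resource type on the originating request, not the response.
    if response.request.resource_type in ['xhr', 'fetch']:
        print(f"API: {response.url}")

# Register the hook before navigating so early requests are not missed
agent.hook_network_responses(on_api_response)
agent.navigate("https://spa-app.example.com")

# Get all captured API calls
api_calls = agent.get_captured_api_calls()
agent.close()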
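
For readers curious what response hooking looks like in plain Playwright, here is a minimal sketch under stated assumptions. It uses Playwright's own `page.on("response", ...)` event rather than the skill's `hook_network_responses`, and it is not a description of how the skill is implemented. The names `ApiCollector` and `capture_api_calls` are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

API_RESOURCE_TYPES = {"xhr", "fetch"}


@dataclass
class ApiCollector:
    """Callable that records metadata for XHR/Fetch responses only."""

    calls: List[Dict[str, Any]] = field(default_factory=list)

    def __call__(self, response) -> None:
        request = response.request
        if request.resource_type not in API_RESOURCE_TYPES:
            return  # skip documents, images, stylesheets, scripts, ...
        self.calls.append({
            "url": response.url,
            "method": request.method,
            "status": response.status,
            "content_type": response.headers.get("content-type", ""),
        })


def capture_api_calls(url: str) -> List[Dict[str, Any]]:
    """Open `url` in a visible Chromium window and return the recorded calls."""
    # Imported here so ApiCollector can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    collector = ApiCollector()
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        try:
            page = browser.new_page()
            page.on("response", collector)  # attach before navigating
            page.goto(url, wait_until="networkidle")
        finally:
            browser.close()
    return collector.calls
```

The collector only stores metadata. Reading response bodies is left out of the sketch because bodies can become unavailable once the page navigates away.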

Safety Guidelines

NEVER navigate to suspicious or potentially malicious URLs
Always validate URLs before navigation
Respect robots.txt and website terms of service
Warn the user when accessing sites that may contain sensitive content
Close the browser when done to free resources
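
The guidelines above do not say how to validate a URL or consult robots.txt. One possible approach, sketched with the standard library only: accept just `http`/`https` URLs that have a host, and check a path against robots.txt rules that have already been downloaded. The function names and the scheme allowlist are assumptions for illustration, not part of the skill:

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

ALLOWED_SCHEMES = {"http", "https"}  # assumed policy: web URLs only


def is_safe_url(url: str) -> bool:
    """Return True for http(s) URLs with a hostname; reject everything else."""
    try:
        parts = urlsplit(url.strip())
        return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)
    except ValueError:  # raised for malformed netlocs such as bad IPv6 literals
        return False


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check `url` against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A caller might run both checks before `agent.navigate(url)` and skip or warn on failure. Fetching the robots.txt text is left to the caller here; `RobotFileParser` can also download it itself via `set_url()` and `read()`.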