Install
openclaw skills install tiktok-scraper-2
Discover and scrape TikTok profiles by location and category with browser simulation, stealth, proxy support, and exportable JSON/CSV data including thumbnails.
A browser-based TikTok profile discovery and scraping tool.
Part of ScrapeClaw — a suite of production-ready, agentic social media scrapers for Instagram, YouTube, X/Twitter, TikTok, and Facebook built with Python & Playwright, no API keys required.
---
name: tiktok-scraper
description: Discover and scrape TikTok profiles from your browser.
emoji: 🎵
version: 1.0.0
author: influenza
tags:
- tiktok
- scraping
- social-media
- influencer-discovery
metadata:
  clawdbot:
    requires:
      bins:
        - python3
        - chromium
    config:
      stateDirs:
        - data/output
        - data/queue
        - thumbnails
      outputFormats:
        - json
        - csv
---
This skill provides a two-phase TikTok scraping system:
tiktok.com as the site to search
For OpenClaw agent integration, the skill provides JSON output:
# Discover profiles (returns JSON)
discover --location "Miami" --category "dance" --output json
# Scrape single profile (returns JSON)
scrape --username charlidamelio --output json
{
  "username": "example_creator",
  "full_name": "Example Creator",
  "nickname": "Example",
  "bio": "Dance creator | NYC 💃",
  "bio_link": "https://example.com",
  "followers": 250000,
  "following": 800,
  "likes": 5000000,
  "videos_count": 120,
  "is_verified": false,
  "is_private": false,
  "influencer_tier": "macro",
  "category": "dance",
  "location": "New York",
  "profile_url": "https://www.tiktok.com/@example_creator",
  "profile_pic_local": "thumbnails/example_creator/profile_abc123.jpg",
  "content_thumbnails": [
    "thumbnails/example_creator/content_1_def456.jpg",
    "thumbnails/example_creator/content_2_ghi789.jpg"
  ],
  "video_views": [
    {"display": "1.2M", "count": 1200000},
    {"display": "500K", "count": 500000}
  ],
  "scrape_timestamp": "2026-03-02T14:30:00"
}
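Each `video_views` entry pairs a display string with a numeric count. The sketch below shows one way such a string could be converted, assuming only the `K`, `M`, and `B` suffixes appear. `parse_count` is a hypothetical helper for illustration and is not taken from the skill's code.

```python
from decimal import Decimal

# Assumed suffix multipliers; the skill's own parser may handle more cases.
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(display: str) -> int:
    """Convert a display string such as "1.2M" or "500K" to an integer."""
    text = display.strip().upper().replace(",", "")
    if text and text[-1] in _SUFFIXES:
        # Decimal avoids float rounding surprises on values like "1.2M".
        return int(Decimal(text[:-1]) * _SUFFIXES[text[-1]])
    return int(Decimal(text))


print(parse_count("1.2M"))  # 1200000
print(parse_count("500K"))  # 500000
```

The two printed values match the `count` fields in the sample record above.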
| Tier | Follower Range |
|---|---|
| nano | < 1,000 |
| micro | 1,000 - 10,000 |
| mid | 10,000 - 100,000 |
| macro | 100,000 - 1M |
| mega | > 1,000,000 |
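The tier table can be expressed as a small lookup function. This is a sketch, not the skill's implementation: the table lists 1,000 and 10,000 in two adjacent ranges, so the code assumes a boundary value belongs to the higher tier, and that exactly 1,000,000 stays in `macro` because `mega` is defined as strictly greater. `influencer_tier` is a hypothetical name.

```python
def influencer_tier(followers: int) -> str:
    """Map a follower count to a tier name from the table above."""
    # Assumption: boundary values (1,000 / 10,000 / 100,000) go to the
    # higher tier; 1,000,000 stays "macro" since "mega" is "> 1,000,000".
    if followers < 1_000:
        return "nano"
    if followers < 10_000:
        return "micro"
    if followers < 100_000:
        return "mid"
    if followers <= 1_000_000:
        return "macro"
    return "mega"


print(influencer_tier(250_000))  # macro
```

For the sample record shown earlier (250,000 followers), this returns `macro`, which agrees with its `influencer_tier` field.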
- data/queue/{location}_{category}_{timestamp}.json
- data/output/{username}.json
- thumbnails/{username}/profile_*.jpg, thumbnails/{username}/content_*.jpg
- data/export_{timestamp}.json, data/export_{timestamp}.csv

Edit config/scraper_config.json:
{
  "proxy": {
    "enabled": false,
    "provider": "brightdata",
    "country": "",
    "sticky": true,
    "sticky_ttl_minutes": 10
  },
  "google_search": {
    "enabled": true,
    "api_key": "",
    "search_engine_id": "",
    "queries_per_location": 3
  },
  "scraper": {
    "headless": false,
    "min_followers": 1000,
    "download_thumbnails": true,
    "max_thumbnails": 6
  },
  "cities": ["New York", "Los Angeles", "Miami", "Chicago"],
  "categories": ["fashion", "beauty", "fitness", "food", "travel", "tech", "comedy", "dance", "music", "gaming"]
}
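To illustrate how the `scraper` section might be consumed, the sketch below loads the file and falls back to the example values when a key or the file itself is missing. The function names are hypothetical, and treating `min_followers` as an inclusive lower bound is an assumption; the skill's own loader may behave differently.

```python
import json
from pathlib import Path

# Fallback values copied from the example config above.
DEFAULT_SCRAPER = {
    "headless": False,
    "min_followers": 1000,
    "download_thumbnails": True,
    "max_thumbnails": 6,
}


def load_scraper_settings(path: str = "config/scraper_config.json") -> dict:
    """Return the "scraper" section, filling gaps with DEFAULT_SCRAPER."""
    config_path = Path(path)
    user_config = {}
    if config_path.exists():
        user_config = json.loads(config_path.read_text(encoding="utf-8"))
    return {**DEFAULT_SCRAPER, **user_config.get("scraper", {})}


def passes_follower_filter(profile: dict, settings: dict) -> bool:
    """Assumed rule: keep profiles at or above the configured follower floor."""
    return profile.get("followers", 0) >= settings["min_followers"]
```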
The scraper automatically filters out:
Running a scraper at scale without a residential proxy will get your IP blocked fast. Here's why proxies are essential for long-running scrapes:
| Advantage | Description |
|---|---|
| Avoid IP Bans | Residential IPs look like real household users, not data-center bots. TikTok is far less likely to flag them. |
| Automatic IP Rotation | Each request (or session) gets a fresh IP, so rate-limits never stack up on one address. |
| Geo-Targeting | Route traffic through a specific country/city so scraped content matches the target audience's locale. |
| Sticky Sessions | Keep the same IP for a configurable window (e.g. 10 min) — critical for maintaining a consistent browsing session. |
| Higher Success Rate | Rotating residential IPs deliver 95%+ success rates compared to ~30% with data-center proxies on TikTok. |
| Long-Running Scrapes | Scrape thousands of profiles over hours or days without interruption. |
| Concurrent Scraping | Run multiple browser instances across different IPs simultaneously. |
We have affiliate partnerships with top residential proxy providers. Using these links supports continued development of this skill:
| Provider | Best For | Sign Up |
|---|---|---|
| Bright Data | World's largest network, 72M+ IPs, enterprise-grade | 👉 Get Bright Data |
| IProyal | Pay-as-you-go, 195+ countries, no traffic expiry | 👉 Get IProyal |
| Storm Proxies | Fast & reliable, developer-friendly API, competitive pricing | 👉 Get Storm Proxies |
| NetNut | ISP-grade network, 52M+ IPs, direct connectivity | 👉 Get NetNut |
Sign up with any provider above, then grab:
export PROXY_ENABLED=true
export PROXY_PROVIDER=brightdata # brightdata | iproyal | stormproxies | netnut | custom
export PROXY_USERNAME=your_user
export PROXY_PASSWORD=your_pass
export PROXY_COUNTRY=us # optional: two-letter country code
export PROXY_STICKY=true # optional: keep same IP per session
These are auto-configured when you set the provider name:
| Provider | Host | Port |
|---|---|---|
| Bright Data | brd.superproxy.io | 22225 |
| IProyal | proxy.iproyal.com | 12321 |
| Storm Proxies | rotating.stormproxies.com | 9999 |
| NetNut | gw-resi.netnut.io | 5959 |
Override with PROXY_HOST / PROXY_PORT env vars if your plan uses a different gateway.
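The resolution order described here (provider default first, then the `PROXY_HOST` / `PROXY_PORT` override) could look roughly like the following. This is a sketch under that reading of the text; `resolve_endpoint` and `PROVIDER_DEFAULTS` are hypothetical names, not part of `proxy_manager`.

```python
import os

# Default gateways copied from the table above.
PROVIDER_DEFAULTS = {
    "brightdata": ("brd.superproxy.io", 22225),
    "iproyal": ("proxy.iproyal.com", 12321),
    "stormproxies": ("rotating.stormproxies.com", 9999),
    "netnut": ("gw-resi.netnut.io", 5959),
}


def resolve_endpoint(env=None):
    """Return (host, port), letting PROXY_HOST / PROXY_PORT override defaults."""
    env = os.environ if env is None else env
    provider = env.get("PROXY_PROVIDER", "custom").lower()
    default_host, default_port = PROVIDER_DEFAULTS.get(provider, (None, None))
    host = env.get("PROXY_HOST") or default_host
    port = env.get("PROXY_PORT") or default_port
    if host is None or port is None:
        # A provider without known defaults must supply both values.
        raise ValueError("Set PROXY_HOST and PROXY_PORT for a custom provider")
    return host, int(port)


print(resolve_endpoint({"PROXY_PROVIDER": "netnut"}))
# ('gw-resi.netnut.io', 5959)
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.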
For any other proxy service, set provider to custom and supply host/port manually:
{
  "proxy": {
    "enabled": true,
    "provider": "custom",
    "host": "your.proxy.host",
    "port": 8080,
    "username": "user",
    "password": "pass"
  }
}
Once configured, the scraper picks up the proxy automatically — no extra flags needed:
# Discover and scrape as usual — proxy is applied automatically
python main.py discover --location "Miami" --category "dance"
python main.py scrape --username charlidamelio
# The log will confirm proxy is active:
# INFO - Proxy enabled: <ProxyManager provider=brightdata enabled host=brd.superproxy.io:22225>
from proxy_manager import ProxyManager
# From config (auto-reads config/scraper_config.json)
pm = ProxyManager.from_config()
# From environment variables
pm = ProxyManager.from_env()
# Manual construction
pm = ProxyManager(
    provider="brightdata",
    username="your_user",
    password="your_pass",
    country="us",
    sticky=True
)
# For Playwright browser context
proxy = pm.get_playwright_proxy()
# → {"server": "http://brd.superproxy.io:22225", "username": "user-country-us-session-abc123", "password": "pass"}
# For requests / aiohttp
proxies = pm.get_requests_proxy()
# → {"http": "http://user:pass@host:port", "https": "http://user:pass@host:port"}
# Force new IP (rotates session ID)
pm.rotate_session()
# Debug info
print(pm.info())
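The comment on `get_playwright_proxy()` shows the country and a session ID encoded into the proxy username. The sketch below assembles a dict of that shape, assuming the `user-country-XX-session-ID` pattern from the comment holds. `build_playwright_proxy` and `new_session_id` are hypothetical functions, and each provider documents its own username syntax, so check it before relying on this format.

```python
import secrets


def build_playwright_proxy(host, port, username, password,
                           country="", session_id=None):
    """Assemble a proxy dict in the shape Playwright's `proxy` option accepts."""
    parts = [username]
    if country:
        parts.append(f"country-{country.lower()}")
    if session_id:
        parts.append(f"session-{session_id}")
    return {
        "server": f"http://{host}:{port}",
        "username": "-".join(parts),
        "password": password,
    }


def new_session_id() -> str:
    """Random token; changing it is assumed to request a different exit IP."""
    return secrets.token_hex(4)


proxy = build_playwright_proxy(
    "brd.superproxy.io", 22225, "user", "pass",
    country="us", session_id="abc123",
)
print(proxy["username"])  # user-country-us-session-abc123
```

The resulting dict can be passed as the `proxy` argument when launching a browser or creating a context in Playwright.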
- Enable sticky sessions with "sticky": true.
- Set "country": "us" (or your target region) so TikTok serves content in the expected locale.
- Call pm.rotate_session() between large batches of profiles to get a fresh IP.
- Set delay_between_profiles in config to avoid aggressive patterns.
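Rotating the session between batches and spacing out profile requests can be combined in one loop. The sketch below assumes the caller supplies the scrape and rotate callables (for example `pm.rotate_session`). `scrape_in_batches`, `batch_size`, and `delay_seconds` are illustrative names and values, not settings defined by the skill.

```python
import time


def scrape_in_batches(usernames, scrape_one, rotate, batch_size=50,
                      delay_seconds=5.0, sleep=time.sleep):
    """Scrape profiles in order, pausing between them and rotating per batch."""
    results = []
    for index, username in enumerate(usernames):
        if index > 0 and index % batch_size == 0:
            rotate()  # start each new batch on a fresh IP
        if index > 0:
            sleep(delay_seconds)  # space out consecutive profile requests
        results.append(scrape_one(username))
    return results
```

Injecting `sleep` as a parameter lets a test run the loop instantly while production code keeps the real delay.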