CRE Scraper
Scrapes commercial real estate listings from Crexi and LoopNet using Claude in Chrome on a Mac Mini with a residential IP. Bypasses Cloudflare bot protection.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 4 · 0 current installs · 0 all-time installs
Security Scan (OpenClaw): Suspicious (high confidence)
Purpose & Capability
Name and description (CRE scraping) match the code: Playwright-based scrapers for Crexi and LoopNet, local SQLite storage, and syncing to a dashboard. However, the SKILL.md and scripts expect an external VPS (rsync/ssh) for the Command Center, and the registry metadata declares no config paths or env vars even though the instructions require ~/.claude/settings.json, a session.json, and an SSH-authorized key on a remote host. That mismatch between metadata and declared requirements is a red flag.
Instruction Scope
SKILL.md and the scripts instruct the agent to use a saved browser session (session.json with cookies and cf_clearance), click reveal-phone buttons, intercept API responses, and then rsync the DB to root@187.77.140.113 and run a remote sync script. scrape.py also reads LOOPNET_EMAIL / LOOPNET_PASS from the environment even though those env vars are not declared. The skill therefore reads and uses browser session data and credentials, and transmits scraped data (including click-revealed phone numbers) to an external host; its scope exceeds that of a simple local scraping helper.
Install Mechanism
No external install spec (instruction-only), so nothing is downloaded at install time, which lowers risk in that respect. But the package ships an included session.json (cookies and Cloudflare tokens), Python scripts, and shell scripts that will be present on disk and executed. The inclusion of a pre-populated session.json (with cf_clearance and many domain cookies) is unusual and risky because it embeds session state that can be used to bypass protections.
Credentials
The declared registry metadata lists no env or config requirements, but SKILL.md and the code require or expect: ~/.claude/settings.json, a session.json, Chrome with the Claude extension, an SSH key authorized on an external VPS, and LoopNet credentials via LOOPNET_EMAIL/LOOPNET_PASS. The skill also unsets ANTHROPIC_API_KEY in run-scrape.sh. Requesting an SSH key and syncing sensitive scraped data to a remote host identified only by a bare IP is disproportionate for a local scraper unless you explicitly control that host.
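Before trusting a bundled session file, it is worth summarizing what it actually contains. A minimal sketch, assuming the Playwright storage_state layout (a top-level "cookies" list of name/domain dicts); the function name and the cookie list are mine, so adjust the keys if the bundled session.json differs:

```python
import json

# Cookie names that grant session access or bypass bot checks
# (an illustrative, non-exhaustive list).
SUSPICIOUS = {"cf_clearance", "__cf_bm", "sessionid", "auth_token"}

def audit_session_file(path):
    """Summarize cookies in a Playwright-style storage_state file."""
    with open(path) as f:
        state = json.load(f)
    cookies = state.get("cookies", [])
    domains = sorted({c.get("domain", "?") for c in cookies})
    flagged = sorted({c["name"] for c in cookies if c.get("name") in SUSPICIOUS})
    return {"cookie_count": len(cookies), "domains": domains, "flagged": flagged}
```

A session file that carries cf_clearance or login cookies for sites you never signed into yourself should be deleted, not reused.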
Persistence & Privilege
always:false (good), but the skill is designed to be run regularly (launchd cron entries are suggested) and will autonomously push data to an external VPS when invoked. Autonomous invocation combined with automatic rsync/ssh to an unknown third party increases the blast radius: the skill would repeatedly transmit scraped leads (including phone numbers and any intercepted session data) off-host.
Scan Findings in Context
[base64-block] unexpected: A prompt-injection pattern (base64-block) was detected in SKILL.md. There is no legitimate need for embedded base64 payloads in a scraping helper; this could be an attempt to hide or inject instructions. Treat this as suspicious and review the SKILL.md and all files for hidden/encoded payloads.
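A crude but effective way to surface such payloads is to decode every long base64-looking run and read what comes out. A sketch (the regex and length threshold are my assumptions, not part of the scanner that produced this finding):

```python
import base64
import re

# Long unbroken base64-ish runs; 40+ chars filters out ordinary words.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

def find_base64_payloads(text):
    """Return (blob, decoded) pairs for every long base64-looking run."""
    hits = []
    for m in B64_RUN.finditer(text):
        blob = m.group(0)
        # Pad to a multiple of 4 so b64decode accepts unpadded runs.
        padded = blob + "=" * (-len(blob) % 4)
        try:
            decoded = base64.b64decode(padded).decode("utf-8", errors="replace")
        except Exception:
            decoded = "<not decodable>"
        hits.append((blob, decoded))
    return hits
```

Run it over SKILL.md and every bundled script; anything that decodes to readable instructions deserves close scrutiny.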
What to consider before installing
Do not install or run this skill until you have verified the following:
1. The package includes a session.json with Cloudflare clearance and many cookies. That lets the scraper bypass bot protections and may embed someone else's session or sensitive tokens; don't use a session file you don't fully trust.
2. The scripts rsync the local DB to root@187.77.140.113 and SSH in to run a remote script, which sends scraped data (and could send session-derived data) to an unknown server. If you intend to use this, replace the remote host with a server you control, remove or regenerate the included session.json, and grant SSH access only via a dedicated key and account.
3. The code expects LOOPNET_EMAIL / LOOPNET_PASS env vars that are not declared in the registry metadata; supplying credentials to undeclared code is risky.
4. Check the legal and ToS implications of scraping these sites.
If you want help making the skill safer: remove the bundled session.json, add clear declarations for the required env vars and remote hosts, and point the rsync/ssh endpoints at your own infrastructure. If you cannot verify the owner or purpose of the remote VPS, treat this skill as unsafe.
Like a lobster shell, security has layers: review code before you run it.
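The hardcoded remote endpoint is exactly the kind of thing a quick static check can surface before anything runs. A minimal sketch (the function name and regexes are mine, not part of the skill; it will miss hostnames and obfuscated endpoints):

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
NET_TOOL = re.compile(r"\b(?:ssh|rsync|scp|curl|wget)\b")

def flag_remote_endpoints(text):
    """Flag lines that both invoke a network tool and name a literal IP."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        ips = IPV4.findall(line)
        if ips and NET_TOOL.search(line):
            findings.append((lineno, ips, line.strip()))
    return findings
```

Running this over run-scrape.sh and the Python scripts should surface the 187.77.140.113 transfers immediately.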
Current version: v2.0.0
SKILL.md
CRE Scraper v2.0
Scrape commercial real estate listings from Crexi and LoopNet using Claude in Chrome.
Architecture
Mac Mini (residential IP + Chrome)
→ /scrape-crexi or /scrape-loopnet slash commands
→ ~/.openclaw/workspace/data/properties.db
→ rsync to VPS staging
→ sync-properties.py → Command Center dashboard
Requirements
- macOS with Claude Code installed
- Claude in Chrome browser extension active
- Logged into Crexi (crexi.com) and LoopNet (loopnet.com) in Chrome
- SSH key authorized on VPS
- chromeEnabled: true in ~/.claude/settings.json
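The last requirement refers to a JSON settings file. A minimal sketch showing only the key named here (your existing ~/.claude/settings.json likely has other keys; merge, don't overwrite):

```json
{
  "chromeEnabled": true
}
```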
Usage
Run Crexi scrape (all 21 combinations):
~/.openclaw/skills/cre-scraper/run-scrape.sh
Run enrichment on unenriched properties:
~/.openclaw/skills/cre-scraper/enrich-batch.sh [batch_size]
Or inside Claude Code:
/scrape-crexi
/scrape-loopnet
Configuration
- States: FL, GA, NC, TN, AL, LA, ID
- Asset types: rv_park, self_storage, marina
- Price range: $800K–$3M
- Min units: 50+ (when known)
- Value-add threshold: VAS ≥ 40
What gets scraped
Per listing:
- Address, city, state, zip
- Asking price, cap rate, NOI, occupancy
- Units/pads/slips, SF, year built, acreage
- Pro-forma cap rate and NOI
- Broker name, firm, full phone (click-reveal)
- Description and investment highlights
- AI analysis: IRR, DSCR, Cash-on-Cash, Value-Add Score, AI Confidence
Cron schedule (launchd)
- 7:00am — Crexi scrape (ai.crexi.scraper)
- 8:00am — LoopNet scrape (ai.loopnet.scraper)
- Midnight — Enrichment batch (ai.crexi.enricher)
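Each schedule entry corresponds to a launchd property list in ~/Library/LaunchAgents. A sketch for the 7:00am Crexi job, assuming run-scrape.sh is the entry point (the /Users/you path is a placeholder; launchd does not expand ~):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>ai.crexi.scraper</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/you/.openclaw/skills/cre-scraper/run-scrape.sh</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key><integer>7</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
</dict>
</plist>
```

Load it with launchctl load ~/Library/LaunchAgents/ai.crexi.scraper.plist; the LoopNet and enrichment jobs follow the same pattern with their own labels and hours.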
Trigger phrases
- "scrape new deals"
- "run the Crexi scraper"
- "find new RV parks in Florida"
- "check LoopNet for self storage in Tennessee"
- "enrich unenriched properties"
- "sync deals to dashboard"
Output
Properties saved to ~/.openclaw/workspace/data/properties.db and synced to OpenClaw Command Center dashboard via sync-properties.py.
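Before syncing anything off-host, you can inspect what the database actually holds without assuming its schema. A sketch (summarize_db is a hypothetical helper, not part of the skill; the table names in properties.db are not documented here):

```python
import sqlite3

def summarize_db(path):
    """Return {table_name: row_count} without assuming any schema."""
    con = sqlite3.connect(path)
    try:
        tables = [r[0] for r in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        return {t: con.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
                for t in tables}
    finally:
        con.close()
```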
Files
7 total
