Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

PropertyGuru SG Sale Browser Crawl

Extract around 50 Singapore for-sale listings from a PropertyGuru search results URL using a real browser session after Cloudflare verification. Use when the...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 current installs · 0 all-time installs
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The skill's stated purpose (collect ~50 PropertyGuru SG sale listings) matches the runtime instructions (use a real browser and extract Next.js hydration data). However, the SKILL.md explicitly depends on Playwright while the registry metadata lists no required binaries or install steps — a capability mismatch that means the environment must already provide Playwright or the skill won't run as described.
Instruction Scope
Instructions are narrowly focused: open the provided search URL in a real browser, wait for Cloudflare verification to pass, read window.__NEXT_DATA__.props.pageProps.pageData.listingsData, dedupe by listing id, and stop when ~50 unique listings are collected. The skill reads its own included reference file (references/source-notes.md) for defaults — that is expected. There are no instructions to read unrelated system files or to transmit data to unexpected external endpoints.
Install Mechanism
This is an instruction-only skill with no install spec, yet SKILL.md says it depends on Playwright and a real browser session. The omission means there's no declared way the platform will install or provide Playwright (a non-trivial dependency that often requires native browser binaries). That gap is operationally risky and could lead to silent failures or ad-hoc installation by an operator.
Credentials
The skill requests no environment variables, credentials, or config paths. That is proportional to its stated purpose. It will, however, require network access and the ability to launch a browser process — capabilities not declared in the registry metadata.
Persistence & Privilege
The skill is not always-enabled and does not request elevated persistence. It does not attempt to modify other skills or system-wide settings. Autonomous invocation is allowed (platform default), which increases blast radius if misused, but that by itself is not a mismatch.
What to consider before installing
This skill appears to do what it says (browser-backed scraping of PropertyGuru's Next.js payload), but note two practical concerns before installing:

1. Missing runtime/install information: SKILL.md depends on Playwright and a real browser, but there is no install spec or declared binary requirement. Ensure your agent environment already provides Playwright and a compatible browser (or add an explicit install step). Without that, the skill may fail, or someone may try to install Playwright ad hoc.
2. Operational and legal considerations: the skill will launch real browser sessions and access the public site repeatedly. That requires network access, may trigger anti-bot defenses (Cloudflare), and could violate PropertyGuru's terms of service. Confirm you have permission to crawl, and respect rate limits.

If you plan to use this skill, verify the runtime will supply Playwright (and browser binaries), restrict how often the skill runs, and monitor its network and browser activity. If you want to be stricter, ask the maintainer to add an explicit install spec (or remove the Playwright dependency) and document the required runtime permissions.

Like a lobster shell, security has layers — review code before you run it.

Current versionv0.1.0
Download zip
crawl · latest · propertyguru

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

🏘️ Clawdis

SKILL.md

PropertyGuru SG Sale Browser Crawl

Use this skill for PropertyGuru Singapore search results pages when the job is to collect roughly 50 listing cards from one filtered search URL.

This target is browser-backed.

  • A direct fetch may return a Cloudflare verification page or 403.
  • Prefer a real browser session and extract from the page's hydrated Next.js data.
  • Do not treat DOM card scraping as the primary source when __NEXT_DATA__ is available.

Required skill

This skill depends on playwright.

  • Use a real browser page.
  • Let the browser complete PropertyGuru's Cloudflare verification first.
  • Only extract after the page title and result page content have loaded.
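A minimal sketch of the extraction step, assuming a Playwright-style page object. Only `evaluate()` is used from the page; `extract_next_data` is an illustrative helper name, not part of this skill or of Playwright:

```python
def extract_next_data(page):
    """Read the hydrated Next.js payload from an already-loaded page.

    `page` is expected to behave like a Playwright Page: evaluate() runs a
    JS expression in the page and returns the JSON-serializable result.
    Returns None when __NEXT_DATA__ is absent (e.g. a Cloudflare
    verification page), so the caller can wait and retry.
    """
    props = page.evaluate(
        "() => window.__NEXT_DATA__ ? window.__NEXT_DATA__.props : null"
    )
    if props is None:
        return None
    # pageData holds listingsData / paginationData / searchParams.
    return props.get("pageProps", {}).get("pageData")
```

Because the function takes any object with an `evaluate()` method, it can be exercised without launching a browser.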

Workflow

  1. Read {baseDir}/references/source-notes.md.
  2. Start from the user-provided search URL. If no URL is supplied, use the default URL from the source notes.
  3. Open the page in a real browser.
  4. Wait until the search results page is actually loaded, not the initial verification screen.
  5. Read window.__NEXT_DATA__.props.pageProps.pageData.
  6. Use pageData.data.listingsData as the canonical listing collection for that page.
  7. For each item in listingsData, use listingData.id as the stable dedupe key.
  8. Preserve the raw listingData object whenever possible.
  9. Optionally add lightweight wrapper fields such as:
    • source_url
    • page
    • collected_at
    • listing_id
  10. Continue page-by-page until one of these conditions is met:
  • 50 unique listings have been collected
  • paginationData.currentPage >= paginationData.totalPages
  • listingsData is empty
  • the next page repeats only ids already seen
  11. For the default PropertyGuru URL observed on March 18, 2026, the page payload contained 25 listings per page, so pages 1 and 2 were enough to reach 50 listings.
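The page loop and stop conditions above can be sketched as a pure function over a caller-supplied fetch step. `fetch_page` here is a stand-in for the browser extraction of `pageData.data` for one page, not a real API:

```python
def crawl_listings(fetch_page, target=50, max_pages=20):
    """Collect up to `target` unique listings, deduped on the listing id.

    fetch_page(page_number) must return the pageData.data dict for that
    page, i.e. a mapping with "listingsData" and "paginationData" keys.
    Stops on: target reached, empty page, last page, or a page that
    repeats only ids already seen.
    """
    seen, collected = set(), []
    for page_num in range(1, max_pages + 1):
        data = fetch_page(page_num)
        listings = data.get("listingsData") or []
        if not listings:
            break  # listingsData is empty
        new = [item for item in listings if item["id"] not in seen]
        if not new:
            break  # next page repeats only ids already seen
        for listing in new:
            seen.add(listing["id"])
            collected.append(listing)
            if len(collected) >= target:
                return collected  # 50 unique listings collected
        pagination = data.get("paginationData", {})
        if pagination.get("currentPage", page_num) >= pagination.get("totalPages", max_pages):
            break  # currentPage >= totalPages
    return collected
```

With 25 listings per page, as observed on the default URL, this returns after two pages.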

Canonical data location

Prefer:

window.__NEXT_DATA__.props.pageProps.pageData.data.listingsData

Related pagination data:

window.__NEXT_DATA__.props.pageProps.pageData.data.paginationData

Useful search context:

window.__NEXT_DATA__.props.pageProps.pageData.searchParams
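These nested paths can be read defensively with a small helper (the `dig` name is illustrative), so a missing key yields None instead of raising when PropertyGuru reshapes the payload:

```python
def dig(obj, *keys):
    """Walk nested dicts; return None as soon as a key is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


# Against the object returned for window.__NEXT_DATA__:
# listings   = dig(next_data, "props", "pageProps", "pageData", "data", "listingsData")
# pagination = dig(next_data, "props", "pageProps", "pageData", "data", "paginationData")
```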

Recommended extraction shape

Prefer keeping the raw listing payload plus a few convenience fields:

{
  "source_url": "https://www.propertyguru.com.sg/property-for-sale?listingType=sale&page=1&isCommercial=false&maxPrice=1400000",
  "page": 1,
  "collected_at": "2026-03-18T04:40:00Z",
  "listing_id": 500044843,
  "raw": {
    "id": 500044843,
    "localizedTitle": "780B Woodlands Crescent",
    "url": "https://www.propertyguru.com.sg/listing/hdb-for-sale-780b-woodlands-crescent-500044843",
    "price": {
      "value": 500000,
      "pretty": "S$ 500,000"
    },
    "bedrooms": 2,
    "bathrooms": 2
  }
}
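Building that shape is a thin wrapper around the raw payload. Field names follow the example above; the timestamp format is an assumption chosen to match the `collected_at` sample:

```python
from datetime import datetime, timezone


def wrap_listing(raw, source_url, page):
    """Wrap a raw listingData object with the convenience fields."""
    return {
        "source_url": source_url,
        "page": page,
        "collected_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "listing_id": raw["id"],
        "raw": raw,  # preserve the full payload untouched
    }
```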

If the caller wants a flatter convenience export, these fields are usually available:

  • listingData.id
  • listingData.localizedTitle
  • listingData.url
  • listingData.price.value
  • listingData.price.pretty
  • listingData.area.localeStringValue
  • listingData.bedrooms
  • listingData.bathrooms
  • listingData.fullAddress
  • listingData.property
  • listingData.psfText
  • listingData.postedOn
  • listingData.agent
  • listingData.agency
  • listingData.mrt
  • listingData.isVerified

Operating rules

  • Use browser extraction as the default path.
  • Do not rely on curl, plain HTTP, or static HTML parsing as the primary strategy.
  • Do not scrape promo widgets, "Explore around" cards, or other injected recommendation blocks from the visible DOM.
  • Use only listingsData for the main dataset.
  • Crawl one page at a time.
  • Deduplicate strictly on listingData.id.
  • Stop as soon as the requested target count is reached.
  • Preserve the search URL and page number with every saved record.
  • If the page falls back to a Cloudflare challenge and does not recover, report the block explicitly instead of pretending the page is empty.
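One way to make the last rule concrete: treat a page as blocked when `__NEXT_DATA__` never appears and the title looks like a challenge page. The "Just a moment" string is Cloudflare's usual interstitial title, but it is an assumption here and may change:

```python
def classify_page(title, has_next_data):
    """Return 'ok', 'blocked', or 'empty' for a loaded page state.

    'blocked' should be reported explicitly to the caller; it must not
    be silently treated as an empty results page.
    """
    if has_next_data:
        return "ok"
    lowered = title.lower()
    if "just a moment" in lowered or "verify" in lowered:
        return "blocked"
    return "empty"
```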

Output target

Default target: about 50 unique listings.

  • Prefer pages 1 and 2 first.
  • If one page returns fewer rows than expected, continue to page 3 and beyond until the target count is reached.

Notes

  • PropertyGuru may change the page structure, build id, or anti-bot behavior at any time.
  • When the page changes, re-check __NEXT_DATA__ before changing extraction logic.
  • For this skill, the in-page Next.js payload is more stable than card-by-card DOM parsing.

Files

3 total
