Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Lead Scraper AI

v1.0.0

Scrapes and qualifies B2B leads from multiple public directories, scores them by fit, extracts emails, and generates personalized AI outreach sequences autom...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for nicemaths123/lead-scraper-ai.

Prompt Preview: Install & Setup
Install the skill "Lead Scraper AI" (nicemaths123/lead-scraper-ai) from ClawHub.
Skill page: https://clawhub.ai/nicemaths123/lead-scraper-ai
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install lead-scraper-ai

ClawHub CLI


npx clawhub@latest install lead-scraper-ai
Security Scan
Capability signals
Crypto: Can make purchases
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal
Suspicious
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The declared purpose (discovering, qualifying, email-extracting, and generating outreach) matches the SKILL.md content: it describes Apify actors, normalization, deduplication, email extraction, and Claude-based outreach. Affiliate links to Apify are present but don't contradict the purpose.
Instruction Scope
The runtime instructions explicitly instruct creating an Apify account, exporting APIFY_TOKEN, installing npm packages, and calling Apify actors to scrape Google Maps, Yellow Pages, Yelp, LinkedIn public pages and crawl sites for emails. Those actions are consistent with the stated purpose, but the instructions also reference Claude AI for message generation without providing any guidance or env var for Claude credentials or how to call that service. The SKILL.md therefore asks the agent/user to access secrets and external services that are not declared in the skill metadata, and it gives broad discretion to deep-crawl websites and extract emails which raises scope and compliance concerns.
Install Mechanism
This is an instruction-only skill (no install spec). The SKILL.md recommends running `npm install apify-client axios` locally. That is a low-risk, expected developer dependency pattern. There is no remote archive download, no automated install written to disk by the registry, and no provided code files to run automatically.
Credentials
Registry metadata lists no required env vars or primary credential, yet the instructions tell the user to export APIFY_TOKEN and to use Claude AI. The missing declaration of APIFY_TOKEN (and the absent guidance for any Claude API key) is an incoherence: the skill expects secrets but does not declare them. Additionally, scraping multiple sources (Google Maps, LinkedIn) may implicitly require additional credentials or expose rate-limiting/captcha workarounds handled by Apify actors — the skill does not document these dependencies.
Persistence & Privilege
The skill is not always-enabled and is user-invocable. It does not request persistent platform privileges in the provided metadata. Autonomous invocation is allowed by default but is not combined with 'always: true' or declared broad credential access in the metadata; however, the instructions would let an agent perform network scraping if invoked.
What to consider before installing
Before installing or running this skill:

1. Ask the publisher to update the registry metadata to declare required environment variables (APIFY_TOKEN and any LLM/API keys) and to document exactly how the Claude integration is authenticated and used.
2. Verify the exact Apify actor IDs and their billing/permission model; Apify actors may require additional configuration or paid usage.
3. Consider the legal and compliance implications of scraping and emailing (GDPR, CAN-SPAM, CASL) and confirm you have a lawful basis for the data you will collect.
4. Run any code in an isolated environment first, and inspect the actual actor calls and any third-party endpoints to which data is sent.
5. If you plan to automate outreach, require explicit review steps and rate limits to avoid mass unsolicited messaging.

If the publisher cannot clarify the credential/Claude integration gaps and the actor details, treat the skill as unsafe to run.

Like a lobster shell, security has layers — review code before you run it.

latest: vk977ajqdp899ztepf13kz8c3xn84apqx
75 downloads · 0 stars · 1 version
Updated 3w ago
v1.0.0 · MIT-0

Ultimate Lead Scraper and AI Outreach Engine: Discover, Qualify and Close B2B Prospects on Autopilot

Display Name: Ultimate Lead Scraper and AI Outreach Engine
Version: 2.0.0
Author: @g4dr

Overview

Stop buying overpriced lead lists. This skill builds your own B2B lead database from scratch by scraping publicly available business data across Google Maps, Yellow Pages, Yelp and LinkedIn company pages, then qualifies every contact with a 0 to 100 fit score and generates personalized outreach messages with Claude AI.

One run replaces what most agencies charge $500 to $2,000 per month for.

Powered by: Apify + Claude AI


What This Skill Does

  • Discover publicly listed business contacts from 6 directory sources simultaneously
  • Qualify leads by industry, location, company size, online presence and engagement signals
  • Score every lead 0 to 100 with a weighted ICP matching algorithm
  • Deduplicate and normalize all contacts into a single CRM-ready schema
  • Deep-crawl business websites to extract emails from contact and about pages
  • Generate 4-step personalized outreach sequences (not just one email) using Claude AI
  • Export clean CSV or JSON files ready for HubSpot, Airtable, Instantly, Lemlist or any CRM
  • Run multi-source searches in parallel to maximize coverage and minimize cost

Legal and Compliance

This skill only targets publicly listed business information. Before using:

  • GDPR (EU/UK): Business emails may qualify under legitimate interest. Always include opt-out.
  • CAN-SPAM (US): Include sender identity, physical address and working unsubscribe link.
  • CCPA (California): Do not sell scraped contact lists. Include unsubscribe links.
  • CASL (Canada): Requires express or implied consent before commercial messages.
  • Always check robots.txt before scraping any website
  • Never scrape personal profiles, private accounts or login-gated content
  • Delete data you no longer need

This skill provides technical guidance only. Consult a qualified attorney for legal advice.
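
The sender-identity, physical-address, and unsubscribe items above can be enforced mechanically. A minimal sketch (not part of the published skill; all option values are placeholders you would supply) appends the required footer to any generated email body:

```javascript
// Appends a CAN-SPAM-style footer (sender identity, physical address,
// unsubscribe link) to an email body. All option values are placeholders.
function withComplianceFooter(body, opts) {
  const { senderName, senderCompany, physicalAddress, unsubscribeUrl } = opts;
  return [
    body.trim(),
    '',
    '--',
    `${senderName}, ${senderCompany}`,
    physicalAddress,
    `Unsubscribe: ${unsubscribeUrl}`
  ].join('\n');
}
```

Run every Claude-generated draft through a helper like this before it reaches your sending tool, so no sequence can go out without the mandatory footer.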


Step 1: Set Up Your Scraping Engine

  1. Create your free account at Apify
  2. Go to Settings > Integrations and copy your Personal API Token
  3. Store it securely:
    export APIFY_TOKEN=apify_api_xxxxxxxxxxxxxxxx
    

Free tier includes $5/month of compute. Enough for 500+ qualified leads per month.


Step 2: Install Dependencies

npm install apify-client axios

Apify Actors for Lead Discovery

Only actors targeting publicly listed business directories:

| Actor | Source | Data Available | Best For |
|---|---|---|---|
| Apify Google Maps Scraper | Google Maps | Name, phone, website, email, rating, reviews, hours | Local business prospecting |
| Apify Yellow Pages Scraper | Yellow Pages | Business name, phone, address, category | US/Canada B2B lists |
| Apify Yelp Scraper | Yelp | Business listings, contact info, reviews | Service businesses |
| Apify LinkedIn Companies Scraper | LinkedIn (public pages) | Company info, website, industry, size | B2B company research |
| Apify Website Content Crawler | Any website | Emails, social links, tech stack | Email enrichment |
| Apify Google Search Scraper | Google Search | Business info, news, ads status | Ad spend qualification |

Examples

Multi-Source Lead Discovery (Parallel)

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

async function discoverLeads(keyword, location, maxPerSource = 25) {
  const [mapsRun, ypRun, yelpRun] = await Promise.all([
    client.actor("compass~crawler-google-places").call({
      searchStringsArray: [`${keyword} in ${location}`],
      maxCrawledPlacesPerSearch: maxPerSource,
      language: "en"
    }),
    client.actor("apify/yellowpages-scraper").call({
      searchTerms: [keyword],
      locations: [location],
      maxResultsPerPage: maxPerSource
    }),
    client.actor("apify/yelp-scraper").call({
      searchTerms: [keyword],
      locations: [location],
      maxResults: maxPerSource
    })
  ]);

  const [mapsData, ypData, yelpData] = await Promise.all([
    client.dataset(mapsRun.defaultDatasetId).listItems(),
    client.dataset(ypRun.defaultDatasetId).listItems(),
    client.dataset(yelpRun.defaultDatasetId).listItems()
  ]);

  return {
    googleMaps: mapsData.items,
    yellowPages: ypData.items,
    yelp: yelpData.items,
    totalRaw: mapsData.items.length + ypData.items.length + yelpData.items.length
  };
}

const raw = await discoverLeads("digital marketing agency", "New York, NY");
console.log(`Found ${raw.totalRaw} raw leads across 3 sources`);

Normalize All Sources into One Schema

function normalizeLeads(raw) {
  const normalize = (items, source) => items.map(item => ({
    companyName: item.title || item.businessName || item.name || '',
    industry: item.categoryName || item.category || '',
    phone: item.phone || '',
    email: item.email || '',
    website: item.website || item.url || '',
    address: item.address || [item.street, item.city, item.state].filter(Boolean).join(', '),
    rating: item.totalScore || item.rating || null,
    reviewCount: item.reviewsCount || item.reviewCount || 0,
    source: source,
    collectedAt: new Date().toISOString(),
    gdprBasis: "legitimate_interest",
    optedOut: false
  }));

  return [
    ...normalize(raw.googleMaps, 'google_maps'),
    ...normalize(raw.yellowPages, 'yellow_pages'),
    ...normalize(raw.yelp, 'yelp')
  ];
}

const normalized = normalizeLeads(raw);

Deduplicate by Domain and Phone

function deduplicateLeads(leads) {
  const seen = new Set();

  return leads.filter(lead => {
    const domain = (lead.website || '').replace(/https?:\/\/(www\.)?/, '').split('/')[0].toLowerCase();
    const phone = (lead.phone || '').replace(/\D/g, '');
    const key = domain || phone || lead.companyName.toLowerCase();

    if (!key || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const unique = deduplicateLeads(normalized);
console.log(`${unique.length} unique leads after dedup (from ${normalized.length} raw)`);
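
The fit-scoring example below awards a bonus for `lead.sourceCount`, but the dedup helper drops duplicates without counting them. One way to populate that field (a sketch of my own, not part of the published skill) is to merge duplicates instead of discarding them:

```javascript
// Variant of deduplicateLeads that merges duplicates rather than dropping
// them, counting how many times each business appeared (lead.sourceCount)
// and backfilling email/phone from whichever source had them.
function mergeDuplicateLeads(leads) {
  const byKey = new Map();

  for (const lead of leads) {
    const domain = (lead.website || '').replace(/https?:\/\/(www\.)?/, '').split('/')[0].toLowerCase();
    const phone = (lead.phone || '').replace(/\D/g, '');
    const key = domain || phone || lead.companyName.toLowerCase();
    if (!key) continue;

    const existing = byKey.get(key);
    if (!existing) {
      byKey.set(key, { ...lead, sourceCount: 1, sources: [lead.source] });
    } else {
      existing.sourceCount += 1;
      if (!existing.sources.includes(lead.source)) existing.sources.push(lead.source);
      // Prefer filled-in fields from whichever source provided them.
      if (!existing.email && lead.email) existing.email = lead.email;
      if (!existing.phone && lead.phone) existing.phone = lead.phone;
    }
  }

  return [...byKey.values()];
}
```

Using this in place of `deduplicateLeads` keeps the multi-source validation bonus in the scorer from always being a no-op.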

ICP Fit Scoring (0 to 100)

function scoreLeadFit(lead, icp = {}) {
  let score = 40;

  // Has website = established business
  if (lead.website) score += 10;
  // No website = needs help (opportunity)
  if (!lead.website) score += 15;

  // Has email = easy to contact
  if (lead.email) score += 10;

  // Has phone = contactable
  if (lead.phone) score += 5;

  // Low review count = needs marketing
  if (lead.reviewCount < 10) score += 15;
  else if (lead.reviewCount < 30) score += 8;

  // Low rating = needs reputation help
  if (lead.rating && lead.rating < 4.0) score += 12;
  else if (lead.rating && lead.rating < 4.5) score += 5;

  // Multi-source validation bonus
  // (if same business appeared in multiple sources, higher confidence)
  if (lead.sourceCount && lead.sourceCount > 1) score += 10;

  // Industry match bonus
  if (icp.industries) {
    const match = icp.industries.some(ind =>
      (lead.industry || '').toLowerCase().includes(ind.toLowerCase())
    );
    if (match) score += 10;
  }

  return Math.min(100, Math.max(0, score));
}

const scored = unique.map(l => ({
  ...l,
  fitScore: scoreLeadFit(l, {
    industries: ['marketing', 'consulting', 'agency', 'legal', 'dental']
  })
})).sort((a, b) => b.fitScore - a.fitScore);

Deep Email Extraction from Websites

async function enrichWithEmails(leads, maxLeads = 30) {
  const withSites = leads.filter(l => l.website && !l.email).slice(0, maxLeads);

  if (withSites.length === 0) return leads;

  const run = await client.actor("apify/website-content-crawler").call({
    startUrls: withSites.map(l => ({ url: l.website })),
    maxCrawlPages: 3,
    crawlerType: "cheerio"
  });

  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  const emailRegex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

  const emailMap = {};
  items.forEach(page => {
    const domain = (page.url || '').replace(/https?:\/\/(www\.)?/, '').split('/')[0];
    const found = [...new Set((page.text || '').match(emailRegex) || [])];
    if (found.length > 0 && !emailMap[domain]) {
      emailMap[domain] = found[0];
    }
  });

  return leads.map(lead => {
    if (lead.email) return lead;
    const domain = (lead.website || '').replace(/https?:\/\/(www\.)?/, '').split('/')[0];
    return { ...lead, email: emailMap[domain] || '' };
  });
}

const enriched = await enrichWithEmails(scored);

Generate 4-Step Outreach Sequence with Claude AI

import axios from 'axios';

async function generateSequence(lead) {
  const prompt = `Create a 4-email cold outreach sequence for this B2B prospect.

LEAD:
- Company: ${lead.companyName}
- Industry: ${lead.industry}
- Location: ${lead.address}
- Website: ${lead.website || 'None'}
- Rating: ${lead.rating || 'N/A'}/5 (${lead.reviewCount} reviews)
- Fit Score: ${lead.fitScore}/100

SEQUENCE RULES:
- Email 1 (Day 0): Warm intro, reference one specific thing about their business, soft question
- Email 2 (Day 3): Quick follow-up, share a relevant insight or stat about their industry
- Email 3 (Day 7): Case study angle, mention a result you achieved for a similar business
- Email 4 (Day 14): Breakup email, friendly close, leave door open
- Each email under 80 words
- No hype, no pressure, conversational tone
- Include [YOUR_NAME] and [YOUR_COMPANY] placeholders
- Include unsubscribe placeholder at bottom of each email

Return all 4 emails with subject lines.`;

  const { data } = await axios.post('https://api.anthropic.com/v1/messages', {
    model: "claude-sonnet-4-20250514",
    max_tokens: 800,
    messages: [{ role: "user", content: prompt }]
  }, {
    headers: {
      'x-api-key': process.env.CLAUDE_API_KEY,
      'anthropic-version': '2023-06-01'
    }
  });

  return data.content[0].text;
}

// Generate sequences for top 10 leads
for (const lead of enriched.filter(l => l.fitScore >= 70).slice(0, 10)) {
  lead.outreachSequence = await generateSequence(lead);
  await new Promise(r => setTimeout(r, 600));
}

Full Pipeline: Discover, Normalize, Score, Enrich, Outreach, Export

import { writeFileSync } from 'fs';

async function runFullPipeline(keyword, location) {
  console.log(`Pipeline started: ${keyword} in ${location}`);

  // 1. Discover from multiple sources
  const raw = await discoverLeads(keyword, location, 30);
  console.log(`Step 1: ${raw.totalRaw} raw leads found`);

  // 2. Normalize
  const normalized = normalizeLeads(raw);

  // 3. Deduplicate
  const unique = deduplicateLeads(normalized);
  console.log(`Step 3: ${unique.length} unique leads`);

  // 4. Score
  const scored = unique.map(l => ({
    ...l,
    fitScore: scoreLeadFit(l)
  })).sort((a, b) => b.fitScore - a.fitScore);

  // 5. Enrich emails
  const enriched = await enrichWithEmails(scored, 20);
  console.log(`Step 5: Emails enriched`);

  // 6. Generate outreach for top leads
  const hot = enriched.filter(l => l.fitScore >= 60).slice(0, 10);
  for (const lead of hot) {
    lead.outreachSequence = await generateSequence(lead);
    await new Promise(r => setTimeout(r, 600));
  }
  console.log(`Step 6: ${hot.length} outreach sequences generated`);

  // 7. Export
  const headers = ["companyName","industry","phone","email","website","address","rating","reviewCount","source","fitScore"];
  const csv = [
    headers.join(","),
    ...enriched.map(l => headers.map(h => `"${(l[h] || '').toString().replace(/"/g, '""')}"`).join(","))
  ].join("\n");

  const filename = `leads-${keyword.replace(/\s+/g, '_')}-${Date.now()}.csv`;
  writeFileSync(filename, csv);
  console.log(`Exported ${enriched.length} leads to ${filename}`);

  return enriched;
}

await runFullPipeline("IT consulting firms", "Chicago, IL");

Normalized Lead Schema

{
  "companyName": "Bright Digital Agency",
  "industry": "Marketing & Advertising",
  "phone": "+1 (415) 555-0192",
  "email": "hello@brightdigital.com",
  "website": "https://brightdigital.com",
  "address": "123 Market St, San Francisco, CA 94105",
  "rating": 4.2,
  "reviewCount": 18,
  "source": "google_maps",
  "fitScore": 82,
  "collectedAt": "2025-02-25T10:00:00Z",
  "gdprBasis": "legitimate_interest",
  "optedOut": false
}

What Makes This Different

| Feature | Basic Lead Scraper | This Skill |
|---|---|---|
| Data sources | 1 source | 3+ sources in parallel |
| Deduplication | None | Domain + phone dedup |
| Scoring | None | 0 to 100 ICP fit scoring |
| Email enrichment | None | Website crawl for hidden emails |
| Outreach | Single template | 4-step personalized sequences |
| Compliance | None | GDPR/CAN-SPAM built in |
| Export | Raw JSON | CRM-ready CSV with all fields |

Compliance Checklist

Before running any campaign, verify:

  • Reviewed robots.txt of every target website
  • Confirmed all data is publicly listed business information
  • Outreach emails include sender identity and physical address
  • Outreach emails include a working unsubscribe link
  • Suppression list in place for previous opt-outs
  • Data will be deleted when no longer needed
  • For EU/UK contacts: legitimate interest assessment completed
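
The suppression-list item above can be enforced in code before any sequence is generated. A minimal sketch, assuming opt-outs are stored as a set of lowercased email addresses:

```javascript
// Drops any lead whose email appears in a suppression set of prior opt-outs.
// Leads without an email pass through; they cannot be emailed anyway.
function applySuppressionList(leads, suppressed) {
  return leads.filter(lead => {
    const email = (lead.email || '').toLowerCase();
    return !email || !suppressed.has(email);
  });
}
```

Call this on the enriched list before the outreach-generation loop so suppressed contacts never consume Claude tokens or enter a sequence.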

Cost Estimate

| Action | Apify CU | Cost |
|---|---|---|
| 75 leads from 3 sources (1 city) | ~0.15 CU | ~$0.06 |
| 375 leads from 3 sources (5 cities) | ~0.75 CU | ~$0.30 |
| Email enrichment (30 websites) | ~0.15 CU | ~$0.06 |
| Full pipeline (discovery + enrichment) | ~0.90 CU | ~$0.36 |
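
The arithmetic behind the table works out to roughly $0.40 per CU, ~0.002 CU per discovered lead, and ~0.005 CU per website crawled. A rough estimator using those figures (they are this table's approximations, not Apify pricing guarantees):

```javascript
// Rough cost estimate using the approximate rates implied by the table:
// ~0.002 CU per discovered lead, ~0.005 CU per crawled site, ~$0.40 per CU.
function estimateCost(leadsDiscovered, sitesEnriched) {
  const cu = leadsDiscovered * 0.002 + sitesEnriched * 0.005;
  return { computeUnits: cu, usd: cu * 0.40 };
}
```

For example, `estimateCost(375, 30)` reproduces the 5-city discovery plus enrichment rows combined. Check Apify's current pricing before budgeting real runs.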

Scale with Apify as your pipeline grows. Free tier handles hundreds of leads monthly.


Pro Tips

  1. Small targeted batches (25 to 50 per source) outperform mass scraping every time
  2. Validate emails before sending with Hunter.io or NeverBounce
  3. Review outreach drafts before sending. Never auto-send without human review
  4. Warm up new email domains before sending at scale (use Instantly or Lemlist)
  5. Target decision makers by title rather than generic company emails
  6. Run weekly to catch new businesses and refresh stale data
  7. Cross-reference leads that appear in multiple sources. Multi-source leads convert 3x better
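
Pro tip 2 recommends a paid verification service; as a cheap first pass before spending credits there, a syntax-plus-role-address screen helps. This is a sketch of my own (the role-prefix list is an assumption, not part of the skill):

```javascript
// First-pass email screen: syntax check, plus flagging of generic role
// addresses (info@, admin@, ...) that rarely reach a decision maker.
const ROLE_PREFIXES = ['info', 'admin', 'support', 'contact', 'sales', 'hello'];

function screenEmail(email) {
  const syntaxOk = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(email);
  if (!syntaxOk) return { ok: false, reason: 'invalid_syntax' };
  const local = email.split('@')[0].toLowerCase();
  if (ROLE_PREFIXES.includes(local)) return { ok: true, reason: 'role_address' };
  return { ok: true, reason: 'personal' };
}
```

Screening out `invalid_syntax` and deprioritizing `role_address` entries before running Hunter.io or NeverBounce keeps verification costs down and supports tip 5 (targeting decision makers over generic inboxes).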

Error Handling

try {
  const run = await client.actor("apify/yellowpages-scraper").call(input);
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  return items;
} catch (error) {
  if (error.statusCode === 401) throw new Error("Invalid Apify token. Get yours at https://www.apify.com?fpr=dx06p");
  if (error.statusCode === 429) throw new Error("Rate limit. Reduce batch size or wait.");
  if (error.statusCode === 404) throw new Error("Actor not found. Verify actor ID.");
  throw error;
}

Requirements

  • An Apify account with API token
  • Claude API key for outreach generation
  • Node.js 18+ with apify-client and axios
  • A CRM or spreadsheet (HubSpot, Airtable, Google Sheets)
  • An outreach tool with unsubscribe management (Instantly, Lemlist, Apollo)
