Find AI Consultancy

Use whenever the user wants to find, shortlist, vet, or enrich US AI/ML/data consulting firms (consultancies) — AI/ML development, MLOps, generative AI / LLM apps (RAG, chatbots, agents), computer vision, NLP, recommendation systems, data engineering, BI/analytics. Triggers on "find an AI/ML consulting firm to build our recommendation engine", "shortlist three RAG/LLM consultancies for an enterprise chatbot", "compare three AI/ML consulting firms with strong ratings", or "pull contact info for these 8 AI consultancy domains", even when described indirectly (we want to use AI for X, deploy ML to production). Drives the ServiceGraph API (api.servicegraph.co) — a 100k+ US firm catalog filterable by industry, services, location, size, ratings. Defer to find-software-developer for general app/backend work where AI is just a feature. Skip in-house ML/data hires, LLM/AI-tool comparisons (ChatGPT vs Claude), "how do I fine-tune X" DIY questions, AI courses for individuals, non-US firms, individual freelancers.

Install

openclaw skills install find-ai-consultancy

find-ai-consultancy

Drive the ServiceGraph API (https://api.servicegraph.co) to find, shortlist, and enrich US AI/ML and data consultancies via the pro_services dataset. The catalog tags firms with industry:data_ai_consulting and a 4-tag service sub-taxonomy: ai-ml-development (the largest at ~12k firms), data-analytics, cloud-services, and api-integration. There is no data-engineering or business-intelligence sub-tag; data-analytics covers both. Confirm exact tag names via /v1/datasets/pro_services/fields?include_values=1.

Always pin industry:data_ai_consulting. This skill exists to do that automatically — the user shouldn't have to think about catalog taxonomy.

Any HTTP client works (curl, fetch, requests). Examples below use curl.

Sibling skills — defer when scope is different

  • General application or backend dev that uses AI as a feature (e.g. "build us a SaaS with an AI chatbot tab") → find-software-developer.
  • Web/site projects that include some AI → find-web-developer.
  • AI-related marketing or content → find-marketing-agency.

This skill is for engagements where the AI/ML/data work IS the deliverable.

When NOT to use this skill

  • Consumer AI courses or learning ("find me an online course to learn ML").
  • AI/LLM product comparisons ("ChatGPT vs Claude vs Gemini", "Cursor vs Copilot").
  • DIY/code tasks ("how do I fine-tune Llama", "review this PyTorch loop").
  • In-house ML/data hires (Machine Learning Engineer, Data Scientist).
  • Generic AI knowledge questions.
  • Non-US firms / individual freelance ML engineers.

MCP server (preferred for authed calls)

If your harness has the ServiceGraph MCP server loaded (tools containing servicegraph), prefer those — OAuth 2.1 + PKCE keeps the token in the harness sandbox. Otherwise use the REST flow below.

API surface (dataset id: pro_services)

Every endpoint requires a bearer token (Authorization: Bearer vk_…). There is no anonymous tier.

| Endpoint | Cost | Use it for |
|---|---|---|
| GET /v1/datasets/pro_services/fields[?include_values=1] | free | Confirm data_ai_consulting industry value and sub-tag names. |
| GET /v1/datasets/pro_services/check?filter=… | free | Validate filter. |
| POST /v1/datasets/pro_services/translate-intent | free | {intent} → DSL filter + sanity count. |
| GET /v1/datasets/pro_services/search?filter=…&limit= | free | Brief firm cards + per-row unlock hint + total. |
| GET /v1/datasets/pro_services/:apex | free | One row brief; detail only if unlocked. |
| POST /v1/datasets/pro_services/unlocks | 10 credits / firm | {apexes:[...]} ≤100; atomic; 30-day TTL on detail. |
| GET /v1/me/credits | free | Balance. |

Cost model. Discovery / validation / search / brief reads are free. Detail (url, phone, email, social, address, full platforms map) costs 10 credits per firm and lasts 30 days.
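
The cost model above lends itself to a quick pre-flight check before any unlock. The following is a minimal sketch, not part of the ServiceGraph API: plan_unlocks and its return keys are hypothetical names, and the only facts taken from this document are the 10-credit price and the 100-apex cap per unlock request.

```python
from typing import Dict, Iterable, List, Union

CREDITS_PER_FIRM = 10        # from the cost model above
MAX_APEXES_PER_UNLOCK = 100  # from the /unlocks row in the endpoint table


def plan_unlocks(apexes: Iterable[str], balance: int) -> Dict[str, Union[List, int, bool]]:
    """Estimate the worst-case credit cost of unlocking a list of apexes.

    Hypothetical helper: dedupes the list, splits it into request-sized
    batches, and compares the worst-case cost with the current balance.
    """
    # Normalize case/whitespace and drop duplicates while keeping order.
    unique = list(dict.fromkeys(a.strip().lower() for a in apexes if a.strip()))
    cost = CREDITS_PER_FIRM * len(unique)
    batches = [
        unique[i:i + MAX_APEXES_PER_UNLOCK]
        for i in range(0, len(unique), MAX_APEXES_PER_UNLOCK)
    ]
    return {
        "apexes": unique,
        "batches": batches,
        "worst_case_credits": cost,  # firms still inside their 30-day TTL cost nothing
        "affordable": cost <= balance,
    }
```

The figure is a worst case because re-views inside the 30-day TTL are free. Note also that atomicity applies per request, so a list split into several batches is no longer all-or-nothing as a whole.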

Auth

vk_* API keys are minted in the dashboard. Keep the token out of the LLM context: never read .env* into your context; dispatch via shell.

  1. Try the call first through a shell wrapper that sources .env.local:

    ( set -a; [ -f .env.local ] && . ./.env.local; set +a;
      curl -sS -H "Authorization: Bearer $SERVICEGRAPH_API_KEY" \
           'https://api.servicegraph.co/v1/datasets/pro_services/fields' )
    
  2. On 401, prompt the user (don't accept the key in chat):

    "Open https://servicegraph.co/profile/api-keys, create a key, and add SERVICEGRAPH_API_KEY=vk_… to .env.local here (or export it). Tell me when done. Please don't paste the key into chat."

  3. Retry after the user signals ready.
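
For harnesses that dispatch through a Python script rather than curl, the same rule (the key lives in the environment, never in chat or logs) might look like the sketch below. auth_headers is a hypothetical helper; the only things taken from this document are the SERVICEGRAPH_API_KEY variable name, the vk_ prefix, and the bearer header format.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the Authorization header from the environment.

    Hypothetical helper: the key is read inside the script and never
    printed or echoed in error messages.
    """
    env = os.environ if env is None else env
    key = env.get("SERVICEGRAPH_API_KEY", "").strip()
    if not key.startswith("vk_"):
        # Deliberately does not include the key value in the message.
        raise RuntimeError(
            "SERVICEGRAPH_API_KEY is missing or is not a vk_ key; "
            "ask the user to add it to .env.local"
        )
    return {"Authorization": f"Bearer {key}"}
```

Catching the RuntimeError is the natural place to issue the prompt from step 2.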

Filter DSL

GitHub-search-style.

filter   := orExpr
orExpr   := andExpr ("OR" andExpr)*
andExpr  := notExpr (("AND")? notExpr)*    # whitespace = implicit AND
notExpr  := ("NOT" | "-") notExpr | atom
atom     := "(" filter ")" | predicate
predicate:= IDENT op valueOrList | bareword
op       := ":" | "=" | ">=" | "<=" | ">" | "<"
valueOrList := value ("," value)*
value    := IDENT | NUMBER | tagAtEvidence
tagAtEvidence := IDENT "@" ("low"|"medium"|"high")
bareword := IDENT | NUMBER          # → keyword:<bareword>

Four rules that bite: AND binds tighter than OR (use parens); comma list = OR within one predicate; negation is -x or NOT x; bareword = keyword search (quote multi-word phrases).

AI-flavored examples (validate yours with /check):

industry:data_ai_consulting service_provided:ai-ml-development
industry:data_ai_consulting service_provided:ai-ml-development@high state:CA
industry:data_ai_consulting service_provided:data-analytics pipelines
industry:data_ai_consulting llm rag
industry:data_ai_consulting "computer vision" healthcare
industry:data_ai_consulting mlops
industry:data_ai_consulting (service_provided:ai-ml-development OR service_provided:data-analytics)
industry:data_ai_consulting service_provided:ai-ml-development@high rating>=4 has:clutch
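
Filters like the ones above can be assembled programmatically so that the industry pin and phrase quoting are never forgotten. This is a minimal sketch under assumptions: build_filter and quote_term are hypothetical helpers, and the field names (service_provided, state, rating) are the ones used in the examples above, not a verified schema. Validate the result with /check.

```python
from typing import Optional, Sequence

EVIDENCE_LEVELS = ("low", "medium", "high")


def quote_term(term: str) -> str:
    """Quote multi-word phrases so they stay one keyword, not ANDed barewords."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_filter(
    keywords: Sequence[str] = (),
    any_of: Sequence[str] = (),
    service: Optional[str] = None,
    evidence: Optional[str] = None,
    state: Optional[str] = None,
    min_rating: Optional[float] = None,
) -> str:
    """Assemble a DSL filter string that always pins the industry."""
    parts = ["industry:data_ai_consulting"]  # always pinned
    if service:
        if evidence is not None and evidence not in EVIDENCE_LEVELS:
            raise ValueError(f"evidence must be one of {EVIDENCE_LEVELS}")
        tag = f"{service}@{evidence}" if evidence else service
        parts.append(f"service_provided:{tag}")
    if state:
        parts.append(f"state:{state}")
    if min_rating is not None:
        parts.append(f"rating>={min_rating}")
    parts.extend(quote_term(k) for k in keywords)  # whitespace = implicit AND
    if any_of:
        group = " OR ".join(quote_term(k) for k in any_of)
        # Parenthesize because AND binds tighter than OR.
        parts.append(f"({group})" if len(any_of) > 1 else group)
    return " ".join(parts)
```

For instance, build_filter(keywords=["computer vision", "healthcare"]) reproduces the computer-vision example in the list above, with the phrase kept in quotes.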

Sub-niche → keyword/tag mapping:

| User asks for | Use |
|---|---|
| AI/ML model building | service_provided:ai-ml-development |
| Data engineering / pipelines | service_provided:data-analytics + keywords pipelines / engineering (no data-engineering tag) |
| BI / analytics | service_provided:data-analytics (covers BI too; no separate business-intelligence tag) |
| Cloud architecture for data/ML | service_provided:cloud-services |
| API / data integration | service_provided:api-integration |
| LLM apps / RAG / agents | llm, rag, agent (keywords) |
| Generative AI | "generative ai", genai |
| Computer vision | "computer vision", cv |
| NLP / IDP / document understanding | nlp, idp, "document understanding" |
| MLOps / model deployment | mlops, deployment |
| Recommendation systems | recommendation, recsys |
| Predictive analytics / churn / forecasting | predictive, forecasting, churn |

Identifying firms — apex

Firms are identified by their apex domain (scaleai.com, not www.scaleai.com/about).
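
User-pasted URLs usually need cleaning before they can serve as an apex. The sketch below is a deliberately naive, hypothetical helper (to_apex): it strips the scheme, path, port, and a leading www., but it does not consult the public suffix list, so other subdomains (for example blog.example.com) are returned unchanged and need separate handling.

```python
from urllib.parse import urlsplit


def to_apex(raw: str) -> str:
    """Best-effort reduction of a pasted URL or hostname to a bare domain.

    Assumption: only a leading "www." is removed; true apex detection
    would require the public suffix list.
    """
    text = raw.strip().lower()
    if "://" not in text:
        # urlsplit only recognizes the host when a netloc marker is present.
        text = "//" + text
    host = urlsplit(text).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host
```

An empty return value signals input that had no recognizable host and should be sent back to the user rather than to the API.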

Recipes

A. AI/ML consultancy for a recommendation engine

User: "AI/ML consultancy to build our recommendation engine for an ecommerce site."

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:ai-ml-development+recommendation+ecommerce&limit=10

# Present the briefs and let the user pick 3. "Unlocking 3 = 30 credits, 30-day TTL."
POST /v1/datasets/pro_services/unlocks
  { "apexes": ["firm-a.com", "firm-b.com", "firm-c.com"] }

B. RAG / LLM consultancies for a chatbot

User: "Three RAG/LLM consultancies for an enterprise chatbot."

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+(rag OR llm)+chatbot+enterprise&limit=10

If thin, drop enterprise and surface client-tier signals from the unlocked detail later.
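
The recipes write filters in a readable shorthand; an actual request needs the spaces, parentheses, quotes, and comparison operators URL-encoded. A minimal sketch using only the Python standard library follows. search_url and filter_from_url are hypothetical helper names, and the endpoint path is the one from the API surface table.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://api.servicegraph.co"


def search_url(filter_expr: str, limit: int = 10) -> str:
    """Return a fully encoded search URL for a DSL filter string."""
    # urlencode uses quote_plus by default, so spaces become "+",
    # matching the "+" notation used in the recipes.
    query = urlencode({"filter": filter_expr, "limit": limit})
    return f"{BASE}/v1/datasets/pro_services/search?{query}"


def filter_from_url(url: str) -> str:
    """Recover the original filter from an encoded URL (handy for logging)."""
    return parse_qs(urlsplit(url).query)["filter"][0]
```

Usage: search_url("industry:data_ai_consulting (rag OR llm) chatbot enterprise") yields a URL that can be passed to curl or any HTTP client together with the bearer header.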

C. Data engineering partner

User: "Data-engineering partner to build our analytics pipelines."

No data-engineering tag — data-analytics is the closest and covers both BI and engineering. Pin the tag plus keyword:

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:data-analytics+(pipelines OR engineering)&limit=10

D. MLOps for model deployment

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+mlops&limit=10

E. Indirect intent — "use AI to predict customer churn"

User: "We want to use AI to predict customer churn — who can help us build that?"

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:ai-ml-development+(churn OR predictive)&limit=10

Or let the translator do the mapping:

POST /v1/datasets/pro_services/translate-intent
  { "intent": "AI consultancy to build customer churn prediction" }

F. Computer vision + healthcare vertical

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+"computer vision"+healthcare&limit=10

G. Quality threshold + Fortune 500 clients

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+service_provided:ai-ml-development@high+rating>=4+fortune&limit=10

"Fortune 500" as a structured filter isn't a thing — surface from briefs or treat it as a keyword.

H. Custom LLM agent for customer service

GET /v1/datasets/pro_services/search?filter=industry:data_ai_consulting+(llm OR agent)+("customer service" OR support)&limit=10

I. BYO apex list — enrich domains

User pastes 8–20 AI consultancy domains:

  1. GET /v1/datasets/pro_services/:apex per domain — free brief (404 = not in catalog, no charge).
  2. User picks N to fully enrich. POST /unlocks = 10×N credits, atomic, detail returned.
  3. Re-runs within 30-day TTL are free.

A 404 here often means the firm is actually a SaaS product company (many AI vendors brand as "AI services" but operate as a product) — filtered out of the catalog.
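
Steps 1 and 2 of this recipe can be sketched as a small triage loop. Everything here is illustrative: triage_domains and unlock_quote are hypothetical names, and fetch_brief stands in for whatever HTTP call the harness makes to GET /v1/datasets/pro_services/:apex, returning a (status, body) pair. The body is passed through untouched because this document does not specify the brief's schema.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

CREDITS_PER_FIRM = 10  # from the cost model


def triage_domains(
    apexes: Iterable[str],
    fetch_brief: Callable[[str], Tuple[int, Any]],
) -> Tuple[Dict[str, Any], List[str]]:
    """Split pasted domains into catalog hits and misses using free brief lookups."""
    found: Dict[str, Any] = {}
    missing: List[str] = []
    for apex in dict.fromkeys(apexes):  # dedupe, keep the user's order
        status, body = fetch_brief(apex)
        if status == 200:
            found[apex] = body
        elif status == 404:
            missing.append(apex)  # not in pro_services; not charged
        else:
            raise RuntimeError(f"unexpected status {status} for {apex}")
    return found, missing


def unlock_quote(picked: Iterable[str]) -> int:
    """Worst-case credits to quote to the user before POST /unlocks."""
    return CREDITS_PER_FIRM * len(set(picked))
```

Reporting the missing list back to the user, with the SaaS-product caveat above, avoids silent gaps in the shortlist.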

Gotchas

  • Always pin industry:data_ai_consulting. Without it, ai-ml-development as a service tag surfaces IT firms that list AI as a sub-service.
  • Defer to find-software-developer for general dev that uses AI as a feature. When the deliverable is a SaaS product or app and AI is one of several features, that's software-dev work; this skill is for engagements where AI/ML/data work IS the deliverable.
  • Catalog audit notes: AI/ML-tagged firms have a higher historical rate of misclassification (some are SaaS products, some are B2C ed-tech). If an unlock returns a SaaS product, flag and skip rather than recommend.
  • Many sub-niches are keyword-only. Multi-word sub-niches split into ANDed barewords unless quoted (computer vision → computer AND vision; "computer vision" → one phrase).
  • LLM-product comparisons (ChatGPT vs Claude vs Gemini) are NOT procurement — refuse.
  • AI courses for individuals (Coursera, fast.ai) are NOT in the catalog — refuse.
  • Briefs DO include apex, name, industry, service_provided, location, ratings. They DON'T include url, phone_primary, email_primary, legal_name, address_full, full platforms — those require an unlock.
  • not_found / not_in_dataset 404 = not in pro_services. Not charged. Skip.
  • Unlock is atomic. N apexes either all charge (up to 10×N credits) or none on 402.
  • Within-TTL re-views are free (was_cached:true).

Errors

JSON envelope: {"error": {"code": "...", "message": "..."}}.

| Status | Code | What to do |
|---|---|---|
| 400 | filter_parse_error | position included; fix and re-validate with /check. |
| 400 | kind_in_filter | Strip any kind: from filter; URL is authoritative. |
| 400 | field_not_in_dataset | Drop the disallowed field. |
| 400 | invalid_apex | Re-normalize to apex domain. |
| 401 | unauthorized / invalid_audience | Re-prompt for fresh vk_…. |
| 402 | insufficient_credits | needed and balance in payload; nothing charged. |
| 404 | not_found / not_in_dataset | Skip; not charged. |
| 429 | rate_limited | Honor Retry-After. |
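
The error codes above map naturally onto a small dispatch table. The sketch below is one possible shape, not an official client: next_step and the action labels are hypothetical, the envelope format is the one shown above, and the 5-second fallback when Retry-After is absent or not a plain number of seconds is an arbitrary assumption.

```python
import json
from typing import Any, Dict, Optional

# Error code -> hypothetical action label for the calling agent.
ACTIONS = {
    "filter_parse_error": "fix_filter_and_recheck",
    "kind_in_filter": "strip_kind_from_filter",
    "field_not_in_dataset": "drop_field",
    "invalid_apex": "renormalize_apex",
    "unauthorized": "prompt_for_new_key",
    "invalid_audience": "prompt_for_new_key",
    "insufficient_credits": "report_balance",
    "not_found": "skip",
    "not_in_dataset": "skip",
    "rate_limited": "wait_and_retry",
}

DEFAULT_WAIT_SECONDS = 5  # assumption, used only when Retry-After is unusable


def next_step(body: str, retry_after: Optional[str] = None) -> Dict[str, Any]:
    """Parse the JSON error envelope and pick the follow-up action."""
    error = json.loads(body).get("error", {})
    code = error.get("code", "")
    step: Dict[str, Any] = {
        "code": code,
        "action": ACTIONS.get(code, "surface_to_user"),
        "message": error.get("message", ""),
    }
    if code == "rate_limited":
        # Retry-After may also be an HTTP date; only the seconds form is handled.
        numeric = retry_after is not None and retry_after.strip().isdigit()
        step["wait_seconds"] = int(retry_after) if numeric else DEFAULT_WAIT_SECONDS
    return step
```

Unknown codes fall through to surface_to_user so that new error types are shown to the user rather than silently retried.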

End-to-end example

User: "Three AI/ML consultancies to build a recommendation engine for an ecommerce site, ideally with 4-star ratings and Fortune 500 clients."

GET /v1/datasets/pro_services/fields?include_values=1
GET /v1/datasets/pro_services/check?filter=industry:data_ai_consulting+service_provided:ai-ml-development@high+recommendation+ecommerce+rating>=4
GET /v1/datasets/pro_services/search?filter=...&limit=10
# Present briefs. "Unlocking 3 = 30 credits, 30-day TTL."
POST /v1/datasets/pro_services/unlocks
  { "apexes": ["firm-a.com", "firm-b.com", "firm-c.com"] }
GET /v1/me/credits