Clay + Claude scoring pipeline

Industry-specific lead scoring at scale — judgment, not keyword matching.

Year: 2025–present
Stack: Clay · Claude API · Playwright · HubSpot
Outcome: Live inside a week. Filters raw lists to a high-confidence subset before any send.

Problem

Raw lead lists pulled from scrapers, public datasets, and broker sites contained thousands of records of wildly varying quality. Manually qualifying them against acquisition criteria (industry fit, structural quality, buy-box alignment) wasn’t tractable at the volume we needed. Off-the-shelf scoring (“good lead / bad lead”) breaks the moment you’re targeting more than one thesis at once.

What I built

A data enrichment and lead scoring system in Clay that aggregates raw leads from multiple sources, enriches them with firmographic and web-scraped data, and runs them through an LLM-based scoring framework. The 1–6 scoring scale evaluates industry fit, structural quality, and overall attractiveness using AI prompts grounded in website-derived inputs. I iterated on the scoring logic to balance strict industry alignment against broader structural fit, so the pipeline surfaces both bullseye targets and high-quality adjacencies.

  1. Data aggregation (Clay) — website flatten, NAICS, reviews, firmographics
  2. Industry-specific evaluation (Claude) — thesis-comparison prompt, on-site evidence only
  3. Structured scoring output — 1–6 scale with banded meanings
  4. Override logic — caps for edge cases, softer penalties on missing data
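
Steps 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the production logic: it assumes the model replies with JSON of the form {"score": ..., "rationale": ...}, and the names parse_score, apply_overrides, EDGE_CASE_CAP, and MISSING_DATA_PENALTY, along with their values, are hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
EDGE_CASE_CAP = 4          # edge cases can never score above the review band
MISSING_DATA_PENALTY = 1   # flat one-point penalty, however many fields are missing


@dataclass
class ScoredLead:
    score: int       # 1-6, matching the scale described above
    rationale: str


def parse_score(raw: str) -> ScoredLead:
    """Parse a model reply assumed to look like {"score": 5, "rationale": "..."}."""
    data = json.loads(raw)
    score = int(data["score"])
    if not 1 <= score <= 6:
        raise ValueError(f"score out of range: {score}")
    return ScoredLead(score=score, rationale=str(data.get("rationale", "")))


def apply_overrides(lead: ScoredLead, *, is_edge_case: bool, missing_fields: int) -> ScoredLead:
    """Cap edge cases, then apply a soft penalty for missing data."""
    score = lead.score
    if is_edge_case:
        score = min(score, EDGE_CASE_CAP)
    if missing_fields > 0:
        # Soft penalty: subtract a single point and never drop below 1,
        # rather than rejecting the lead outright.
        score = max(1, score - MISSING_DATA_PENALTY)
    return ScoredLead(score=score, rationale=lead.rationale)
```

Keeping the overrides in plain code rather than in the prompt means a cap or penalty can be changed without re-tuning the model's instructions.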

Architecture principle

Separation of concerns. Clay handles data + orchestration. Claude handles reasoning + judgment. Neither tool is stretched to do the other’s job, which is what makes this cheap to operate and easy to change.
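
A minimal sketch of that boundary, assuming each enriched lead arrives from Clay as a flat mapping of field names to text and that the model call sits behind a plain function. The names Reasoner, build_prompt, and score_row are hypothetical, and the prompt wording is invented for illustration; the actual thesis-comparison prompt is not reproduced here.

```python
from typing import Callable, Mapping

# The reasoning layer is just "prompt in, text out". In production this
# would wrap a Claude API call; in tests it can be any stub function.
Reasoner = Callable[[str], str]


def build_prompt(thesis: str, row: Mapping[str, str]) -> str:
    """Turn one enriched row into a thesis-comparison prompt."""
    # Skip empty fields so the model only sees evidence that exists.
    evidence = "\n".join(f"{key}: {value}" for key, value in sorted(row.items()) if value)
    return (
        "Compare this company against the acquisition thesis.\n"
        f"Thesis: {thesis}\n"
        "Use only the on-site evidence below.\n"
        f"{evidence}\n"
        "Reply with a score from 1 to 6."
    )


def score_row(row: Mapping[str, str], thesis: str, reason: Reasoner) -> str:
    """Data layer builds the prompt; reasoning layer produces the judgment."""
    return reason(build_prompt(thesis, row))
```

Because the thesis is an argument rather than part of the engine, pointing the same code at a different rubric is a matter of passing a different string.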

The scoring bands

  • 1–2 — not a fit — hard pass
  • 3–4 — edge case — manual review before send
  • 5–6 — strong / near-ideal — send with primary campaign
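
The bands above translate directly into a routing rule. The sketch below assumes integer scores; the function name route and the returned labels are hypothetical stand-ins for whatever the downstream campaign tooling expects.

```python
def route(score: int) -> str:
    """Map a 1-6 score to the action for its band."""
    if not 1 <= score <= 6:
        raise ValueError(f"score out of range: {score}")
    if score <= 2:
        return "hard_pass"         # 1-2: not a fit
    if score <= 4:
        return "manual_review"     # 3-4: edge case, review before send
    return "primary_campaign"      # 5-6: strong / near-ideal
```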

Outcomes

Time to live: first working version within a week
Filter rate: raw lists → high-confidence subset before any send
Cost: operational cost in the low hundreds per month
Coverage: underpins both outbound programs (same engine, different rubrics)
Iteration: continuously modified as buy-box criteria shift
