Clay + Claude scoring pipeline

Industry-specific lead scoring at scale — judgment, not keyword matching.

Year: 2025–present
Stack: Clay · Claude API · Playwright · HubSpot
Outcome: Live inside a week. Filters raw lists to a high-confidence subset before any send.

Problem

Raw lead lists pulled from scrapers, public datasets, and broker sites contained thousands of records of wildly varying quality. Manually qualifying them against acquisition criteria (industry fit, structural quality, buy-box alignment) wasn’t tractable at the volume we needed. Off-the-shelf scoring (“good lead / bad lead”) breaks the moment you’re targeting more than one thesis at once.

What I built

A data enrichment and lead scoring system in Clay that aggregates raw leads from multiple sources, enriches them with firmographic and web-scraped data, and runs them through an LLM-based scoring framework. Each lead gets a score from 1 to 6 that reflects industry fit, structural quality, and overall attractiveness, produced by AI prompts grounded in website-derived inputs. I iterated on the scoring logic to balance strict industry alignment against broader structural fit, so the pipeline surfaces both bullseye targets and high-quality adjacencies.

Data aggregation (Clay) — website flatten, NAICS, reviews, firmographics
Industry-specific evaluation (Claude) — thesis-comparison prompt, on-site evidence only
Structured scoring output — 1–6 scale with banded meanings
Override logic — caps for edge cases, softer penalties on missing data
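
As an illustration of the override step, here is a minimal Python sketch of how a cap and a soft missing-data penalty could be applied to a model score. The field names (thin_evidence, missing_fields), the cap of 4, and the threshold of 3 are assumptions made up for this example; the actual rules live in the Clay table and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class LeadScore:
    raw: int              # 1-6 score returned by the model
    thin_evidence: bool   # hypothetical flag: the website yielded too little text to judge
    missing_fields: int   # hypothetical count of enrichment fields that came back empty


def apply_overrides(lead: LeadScore, cap: int = 4, missing_threshold: int = 3) -> int:
    """Apply assumed override rules and return a final score in the 1-6 range."""
    score = lead.raw
    # Cap: a lead judged on thin evidence cannot rise above the review band.
    if lead.thin_evidence:
        score = min(score, cap)
    # Soft penalty: sparse enrichment costs one point instead of forcing a hard pass.
    if lead.missing_fields >= missing_threshold:
        score -= 1
    # Clamp so the result always stays on the 1-6 scale.
    return max(1, min(6, score))


final = apply_overrides(LeadScore(raw=6, thin_evidence=True, missing_fields=0))
print(final)  # 4
```

Keeping the overrides as plain rules outside the prompt means a cap can be tightened or loosened without re-scoring every lead.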

Architecture principle

Separation of concerns. Clay handles data + orchestration. Claude handles reasoning + judgment. Neither tool is stretched to do the other’s job, which is what makes this cheap to operate and easy to change.
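
The split can be sketched in Python roughly as follows. This is not the production setup (orchestration runs in Clay, not in a script); it only shows the boundary. Rubric, build_prompt, score_lead, and the JSON reply shape are hypothetical names and formats chosen for the example, and judge stands in for whatever function actually calls the Claude API.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rubric:
    name: str     # hypothetical label for an outbound program
    thesis: str   # plain-language description of the buy-box


def build_prompt(rubric: Rubric, site_text: str) -> str:
    """Data side: assemble the inputs. No judgment happens here."""
    return (
        f"Acquisition thesis ({rubric.name}): {rubric.thesis}\n"
        "Using only the website text below as evidence, score the company "
        "from 1 to 6 against the thesis.\n"
        'Reply with JSON only: {"score": <int>, "reason": "<one sentence>"}\n\n'
        f"Website text:\n{site_text}"
    )


def score_lead(site_text: str, rubric: Rubric, judge: Callable[[str], str]) -> dict:
    """Reasoning side: `judge` is any callable that sends a prompt to the model."""
    reply = judge(build_prompt(rubric, site_text))
    result = json.loads(reply)
    score = int(result["score"])
    if not 1 <= score <= 6:
        raise ValueError(f"score out of range: {score}")
    return {"rubric": rubric.name, "score": score, "reason": result.get("reason", "")}


# Usage with a stub in place of the real model call.
def stub_judge(prompt: str) -> str:
    return '{"score": 5, "reason": "Services match the thesis."}'


rubric = Rubric(name="example-thesis", thesis="Owner-operated service businesses")
print(score_lead("We repair commercial HVAC systems.", rubric, stub_judge))
```

Because the rubric is just data passed into the same function, a second outbound program needs a new Rubric rather than a new engine.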

The scoring band

1–2 — not a fit — hard pass
3–4 — edge case — manual review before send
5–6 — strong / near-ideal — send with primary campaign
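
Expressed as code, the bands above reduce to a small routing function. This is a sketch in Python; the function name and the returned labels are hypothetical, and only the band boundaries come from the list above.

```python
def route(score: int) -> str:
    """Map a final 1-6 score to the action for its band."""
    if not 1 <= score <= 6:
        raise ValueError(f"score must be between 1 and 6, got {score}")
    if score <= 2:
        return "hard_pass"          # 1-2: not a fit
    if score <= 4:
        return "manual_review"      # 3-4: edge case, review before send
    return "primary_campaign"       # 5-6: strong or near-ideal


for s in range(1, 7):
    print(s, route(s))
```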

Outcomes

time to live: First working version within a week
filter rate: Raw lists → high-confidence subset before any send
cost: Operational cost in the low hundreds per month
coverage: Underpins both outbound programs — same engine, different rubrics
iteration: Scoring logic revised as buy-box criteria shift