Every AI-driven sales workflow has the same hidden bottleneck. The model needs clean text. The open web hands you raw HTML. Until something sits in between and converts the second into the first, your slick agentic pipeline is going to burn tokens parsing nav bars and cookie banners.
That something, for us, is Firecrawl.
Firecrawl sits underneath three workflows we run every week: niche verification on both the Apollo lead pull and the AMF enrichment pass, the personalisation engine that writes 3,000 cold emails a day at SDR-level depth, and the live signal monitoring the Sales OS layer wires in on top.
This piece pulls Firecrawl into focus: the mechanics underneath it, the three places it sits in the pipeline, and the technical decisions that mean nothing else in the same category quite reaches the same bar in 2026.
What Firecrawl actually does
Stripped to the essentials, Firecrawl is a hosted scraping API that turns a URL into clean Markdown. You hit a single endpoint with a URL, and it returns a Markdown document already shaped for an LLM to read.
Underneath, it runs a managed cluster of headless browsers with full JavaScript rendering. So it handles modern SPAs (React, Vue, Next.js) the same way it handles a static brochureware site. The response strips boilerplate by default with onlyMainContent: true. The navigation, footer, sidebar, and cookie banners come off cleanly via a deterministic HTML filter. No LLM is involved in the cleanup, which is what keeps the request fast and predictable.
Then it converts what's left to Markdown. The result is a clean document with proper headings, paragraphs, and lists, ready to drop into any model context.
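As a sketch, the whole round trip is one HTTP call. This assumes the v1 /scrape endpoint shape from the Firecrawl docs; the URL and key are placeholders.

```python
# Minimal sketch of a /scrape call, assuming the v1 endpoint shape.
import requests

FIRECRAWL_KEY = "fc-..."  # placeholder API key

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
    json={
        "url": "https://example.com",
        "formats": ["markdown"],
        "onlyMainContent": True,  # strip nav, footer, sidebar, cookie banners
    },
    timeout=60,
)
resp.raise_for_status()
markdown = resp.json()["data"]["markdown"]  # LLM-ready document
```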
Compare what you'd otherwise have to handle. A typical B2B SaaS homepage returns 80 to 200 KB of raw HTML, with five or six layers of <div class="..." data-...="..."> wrapping a couple of paragraphs of actual text. Strip all that yourself and you've written a brittle parsing pipeline, then a markdown converter, then a boilerplate stripper, then probably a JS rendering layer for the SPA cases. Or you ship the raw HTML to your model and pay for the extra 50 KB of context per page across thousands of pages, hoping the model picks the signal out cleanly. The Firecrawl response is typically 2 to 6 KB of structured markdown for the same page. Roughly a 20x reduction in what your model has to read.
Pricing is credit-based: 1 credit per /scrape call on basic mode, with optional add-ons for JSON output, schema-conformant extraction, and stealth-proxy retries. The dollar-per-credit figure depends on the tier (Hobby, Standard, Growth) and lives at firecrawl.dev/pricing. The credit model matters because it's what makes the per-company economics on a 2,000-row run actually work out.
Use case 1: niche verification
Apollo's industry filter is approximate. Their keyword tags are AI-curated from company website copy, which sounds clever in the brochure and turns out to be roughly 50% noise on most niches. A search for "UK recruitment agencies" returns recruitment software vendors, training providers, executive search firms, IT consultancies that happen to mention "talent" once on their homepage, and a handful of parked domains. I've been through this enough times that we now treat the raw export as candidates, not leads.
The fix is a verification layer. Firecrawl does the scraping. Claude Sonnet does the reading. Together they decide whether each company is actually in-niche before we spend any money enriching them.
The flow on a typical run looks like this. We pull 1,946 candidate companies from Apollo's free Company Search endpoint. We hand the list of homepages to Firecrawl with a concurrency cap of 50 (matches the paid plan's 50-browser ceiling) and proxy: auto. About 90 seconds later, we have 1,946 markdown payloads sitting in a JSONL file, capped at 6,000 characters each so the downstream batches stay sensible.
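A minimal sketch of that fan-out, under the same v1 endpoint assumption as before; candidates.txt, markdown.jsonl, and the helper name are illustrative.

```python
# Scrape fan-out: 50-way concurrency to match the plan's browser ceiling,
# proxy auto for the Cloudflare tail, payloads capped at 6,000 characters
# before they land in the JSONL file.
import json
import requests
from concurrent.futures import ThreadPoolExecutor

FIRECRAWL_KEY = "fc-..."  # placeholder
MAX_CHARS = 6_000

def scrape_markdown(url: str) -> str | None:
    resp = requests.post(
        "https://api.firecrawl.dev/v1/scrape",
        headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
        json={"url": url, "formats": ["markdown"],
              "onlyMainContent": True,
              "proxy": "auto"},  # basic first, stealth retry on a block
        timeout=120,
    )
    if not resp.ok:
        return None  # lands in the unverified bucket downstream
    return resp.json().get("data", {}).get("markdown")

urls = [line.strip() for line in open("candidates.txt") if line.strip()]

with ThreadPoolExecutor(max_workers=50) as pool, \
        open("markdown.jsonl", "w") as out:
    for url, md in zip(urls, pool.map(scrape_markdown, urls)):
        out.write(json.dumps({"url": url,
                              "markdown": (md or "")[:MAX_CHARS]}) + "\n")
```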
Then Sonnet takes over. We batch the markdown into chunks of 30 companies and spawn parallel subagents, each carrying a system prompt that defines the niche, lists the include and exclude signals, and asks for a binary verdict per company: yes, no, or unverified. Anti-hallucination rule baked into the prompt: every yes verdict must quote a specific phrase from the markdown that proves the fit. No phrase, no yes.
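A sketch of one of those batches using the Anthropic SDK. The model string, the niche prompt, and the output shape are placeholders for the per-run assets; the quote-or-no-yes rule lives in the system prompt.

```python
# One verification batch: 30 companies' markdown in, one verdict each out.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = """You verify whether each company fits the niche: UK recruitment agencies.
Include signals: permanent or contract placement, candidate sourcing for clients.
Exclude signals: recruitment software vendors, training providers, parked domains.
For each company return one JSON line:
{"url": ..., "verdict": "yes" | "no" | "unverified", "evidence": "<exact quote>"}
A "yes" without a verbatim quote from the markdown is invalid."""

def verify_batch(batch: list[dict]) -> str:
    payload = "\n\n".join(f"## {c['url']}\n{c['markdown']}" for c in batch)
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pin your own version
        max_tokens=4000,
        system=SYSTEM,
        messages=[{"role": "user", "content": payload}],
    )
    return msg.content[0].text  # one JSON verdict per company
```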
Why Sonnet, not Haiku. We tested Haiku first, on the 2026-04-30 UK accountancy run, and it returned 0 yes, 0 no, 61 unverified across the 61 companies that had markdown. I assumed the prompt was off. It wasn't. Haiku defaults to unverified the moment the markdown isn't trivially obvious. The same 61 companies, same prompt, given to Sonnet, came back as 4 yes, 50 no, 7 unverified. Sonnet reads sparse markdown and commits. It's the difference between a verification layer that works and one that quietly hands every decision back to you to make manually.
The numbers from that 1,946-company run: 1,362 verified yes, 440 confirmed no, 144 unverified. The unverified bucket holds companies whose homepages were a Cloudflare challenge that even proxy: auto couldn't get through, or domains Firecrawl couldn't resolve at all. Those go back into a separate pipeline; they never get enriched in the same run.
End to end, that verification layer costs about 1.75 Firecrawl credits per company on average. Basic mode usually wins; the harder Cloudflare niches trigger an automatic stealth-proxy retry that adds 4 extra credits per call. On a 1,946-company pool that's somewhere around 3,400 credits. The Apollo enrichment that comes after only fires on the 1,362 verified-yes companies, which means we're not paying Apollo to enrich a parked domain or a software vendor that never had any business being on the list. Unit economics finally line up.
Use case 2: AI personalisation at scale
This is the flagship use, and it's the one the rest of the cold-email industry is years behind on.
The standard agency playbook is to scrape one line off LinkedIn (a recent post, a job title, a company name) and merge-tag it into a template. The recipient clocks the pattern by the second sentence. Reply rates on that approach have been falling year on year because every cold inbox now sees fifty of them a week, all built from the same playbook.
Our personalisation engine is built on a different premise. For every prospect on a list, we run two Firecrawl calls. One on the company homepage, one on the prospect's LinkedIn URL (which Apollo gives us on roughly 70% of records, and Firecrawl handles cleanly because it renders the JavaScript LinkedIn ships its content through). The output is two markdown documents per prospect, somewhere around 8 to 12 KB of usable signal per person.
We hand those two documents to Claude with the brand voice (codified per client during the discovery week), the offer (one paragraph), and an instruction set that tells the model to write a cold email that reads as if a peer who actually understands the prospect's space spent twenty minutes researching them. The output is a complete email, not a templated stub with a personalised first line. The opening sentence references something specific from the company website. The body ties that to a problem the offer addresses, and the CTA reads as natural conversation.
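A sketch of that generation step, reusing the Anthropic client from the verification example. brand_voice, offer, and the prompt wording stand in for the per-client assets described above.

```python
# Two Firecrawl documents plus the codified voice and offer in,
# a complete email out. Prompt wording is illustrative.
import anthropic

client = anthropic.Anthropic()

def draft_email(company_md: str, linkedin_md: str,
                brand_voice: str, offer: str) -> str:
    prompt = (
        f"Brand voice:\n{brand_voice}\n\nOffer:\n{offer}\n\n"
        f"Company website:\n{company_md}\n\n"
        f"Prospect LinkedIn:\n{linkedin_md}\n\n"
        "Write a cold email that reads as if a peer who understands this "
        "space spent twenty minutes researching the prospect. Open with a "
        "specific detail from the company website, tie it to the offer, "
        "close with a conversational CTA."
    )
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder version
        max_tokens=800,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text
```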
A humaniser layer runs on top of every output. It strips the AI tells (em-dashes, antithesis patterns, throat-clearing openers, sycophancy), checks the British vs American register, and confirms there's a specific proper noun and a specific number in the body. Anything that fails the audit goes back through one rewrite pass before it leaves the engine.
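The deterministic half of that audit is cheap to sketch. The tell list, register list, and check logic below are illustrative, not the production rule set.

```python
import re

AI_TELLS = ["\u2014", "I hope this email finds you well"]  # em-dash, throat-clearing
AMERICANISMS = ["organize", "optimize", "color"]  # flagged for British clients

def audit(email: str, company_name: str) -> list[str]:
    failures = []
    if any(tell in email for tell in AI_TELLS):
        failures.append("ai_tell")
    if any(word in email.lower() for word in AMERICANISMS):
        failures.append("register")
    if company_name not in email:
        failures.append("no_proper_noun")  # must cite the actual company
    if not re.search(r"\d", email):
        failures.append("no_number")       # must cite a specific figure
    return failures  # non-empty -> one rewrite pass, then re-audit
```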
The volume looks like this. We send roughly 3,000 of these a day per active campaign. A manual SDR doing genuine research caps at about 80 a day, and 80 is being generous. So Firecrawl plus Claude plus the humaniser layer compresses what was a research function (one full-time human, £55k to £75k fully loaded, 12-month onboarding) into about 0.04 seconds of compute per email at a fraction of the marginal cost.
Lavender's analysis of 100 million cold emails, published in 2025, found AI-assisted-and-human-edited messages reply at 5.1%, against 3.8% for fully human-written and 2.4% for fully AI-generated. The same dataset shows messages with genuine personalisation (a real reference to a specific company initiative or recent hire) replying at 4.7% versus 2.3% for surface-level merge tags. That's a 104% lift purely from depth of personalisation. The bar we hold ourselves to is unambiguous: no recipient should be able to tell AI was anywhere near it. That's the whole brand promise of the cold outbound engagements we run for clients. Without Firecrawl as the input layer, that bar isn't reachable. You can't write convincing personalisation off a merge-tag slot. You need actual context, in clean form, in real time.
This is also the bit nobody else in the cold-email category is willing to talk about publicly. Most agencies don't run this stack, and the ones that do treat the recipe as a moat.
Use case 3: live signal monitoring
The two pipeline use cases above run once per prospect, at the start of a campaign. The Sales OS use case runs every day, on a much smaller list: the customers and prospects already in flight.
Sales OS, our custom AI platform, watches the customer base for things that change. New senior hire posted to the careers page. New product launch on the homepage. Leadership move announced on LinkedIn. Funding round mentioned in a press release. Regulatory filing dropped on the company news feed. Each of these is a buying signal, and the window in which acting on it lifts reply rates is short. Maybe a week.
The mechanics are simple. Firecrawl scrapes the homepage, careers page, news feed, and any other URL we've told the platform to watch, on a daily cron. The output is markdown, which we diff against yesterday's snapshot. When the diff lands above a meaningful threshold (a new heading, a new paragraph, a new careers listing), the platform fires a signal into the customer's campaign view.
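A sketch of the diff step, with a crude added-line count standing in for the platform's signal scoring; the snapshots/ layout and threshold value are illustrative.

```python
import difflib
from pathlib import Path

THRESHOLD = 3  # added lines before we call it a signal

def diff_snapshot(domain: str, today_md: str) -> list[str]:
    snap = Path("snapshots") / f"{domain}.md"
    yesterday = snap.read_text() if snap.exists() else ""
    added = [
        line[1:]
        for line in difflib.unified_diff(
            yesterday.splitlines(), today_md.splitlines(), lineterm=""
        )
        if line.startswith("+") and not line.startswith("+++")
    ]
    snap.write_text(today_md)  # roll the snapshot forward
    return added if len(added) >= THRESHOLD else []
```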
A draft reply or a fresh outbound is generated automatically off that signal. The opening sentence references the change directly: "I saw you've just opened a regional office in Manchester," or "I noticed you brought on a new VP of Sales last week." Then the offer. Then a CTA. The whole thing lands in the operator's inbox at 9am, ready for one-click approval.
This is the bit that makes Sales OS feel like a sales engineering team you didn't have to hire. It's running while your team sleeps. It checks 80 customer URLs every night. It surfaces the three or four that moved, with the email pre-drafted. The operator scans them in five minutes over coffee and approves the ones worth sending.
You can build this without Firecrawl. I tried, briefly, with a homegrown headless-browser cluster on Lambda. It worked. It also broke every other day on a Cloudflare challenge or a JavaScript rendering edge case, and I ended up spending more engineering time keeping the scraper alive than building product on top. Pulling Firecrawl underneath the same logic dropped the maintenance load to roughly zero, and I haven't thought about a Cloudflare bypass since.
The technical why
Three Firecrawl features do most of the load-bearing work in our stack.
proxy: auto is the first. Roughly half of the top 1 million traffic sites now sit behind Cloudflare in some form (W3Techs and SQ Magazine numbers, January 2026). On B2B homepages the rate is broadly comparable. Single-shot scrapers without a bot-bypass layer fail silently on those domains: they return a "verifying you're human" challenge page that looks like a real response until you look closely. Firecrawl's proxy: auto mode runs basic proxies first; if the basic attempt comes back blocked, it automatically retries through stealth proxies. That retry costs 5 credits, vs 1 on a clean basic call, and it works on the niches that matter. We see roughly 1.75 credits per company on average across a typical B2B run, which is the basic-mode price plus a tail of stealth retries on the harder Cloudflare-protected sites.
The structured output endpoint is the second. Pass a JSON schema in the /scrape call and Firecrawl returns a JSON document conforming to it instead of free-form markdown. We use this when we want specific fields extracted in one call: founding year, team size, target client description, technology partner badges. JSON mode adds 4 credits per page, which is worth paying when the alternative is a second model call to extract the same fields from markdown. For verification, we stay on markdown because the binary yes/no fits the markdown-first prompt better. For the personalisation engine, we sometimes mix: markdown for the human-feeling brief, structured JSON for the specific facts the model is about to cite.
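A sketch of that call for the fields mentioned above. The structured-output keys have shifted between Firecrawl API versions, so treat this v1 extract shape as illustrative and check the current docs.

```python
import requests

FIRECRAWL_KEY = "fc-..."  # placeholder

schema = {
    "type": "object",
    "properties": {
        "founding_year": {"type": "integer"},
        "team_size": {"type": "string"},
        "target_client": {"type": "string"},
        "tech_partners": {"type": "array", "items": {"type": "string"}},
    },
}

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
    json={
        "url": "https://example.com",
        "formats": ["extract"],        # structured output instead of markdown
        "extract": {"schema": schema},
    },
    timeout=120,
)
facts = resp.json()["data"]["extract"]  # schema-conformant JSON
```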
The maxAge cache window is the third. The default is 2 days; you can set it longer. If you re-scrape the same URL within the window, you still pay 1 credit (caching is about speed, not cost), and the response comes back instantly. A fresh scrape would take 5 to 8 seconds. Useful when a campaign re-runs the same domain across multiple wave passes, and useful when a Sales OS daily diff happens to hit a domain another customer's campaign also touched the same morning.
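maxAge is specified in milliseconds per the docs, so an explicit two-day window looks like this (same hedge as above on the exact parameter shape).

```python
import requests

FIRECRAWL_KEY = "fc-..."  # placeholder
TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000  # 172,800,000 ms

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
    json={
        "url": "https://example.com",
        "formats": ["markdown"],
        "maxAge": TWO_DAYS_MS,  # serve the cached result if scraped within 2 days
    },
    timeout=60,
)
```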
Why Firecrawl wins this category in 2026
The scraping API category has serious incumbents. Bright Data's Web Unlocker handles Cloudflare cleanly and returns raw HTML. ScrapFly does similar with a JSON envelope wrapping the HTML. Crawlbase prices aggressively at scale and returns rendered HTML. Apify ships a marketplace of community-built actors with output in JSON or CSV.
What separates Firecrawl from each of those is the same thing: markdown-first output, by default, generated cleanly without you having to write the conversion layer.
That sounds small. It isn't. Every other category leader returns HTML or JSON wrapping HTML, which means you write a markdown converter (or an HTML parser) before the data is in a shape your LLM can read. That's another preprocessing step in your pipeline, another layer of bugs, another cost in latency. It's a hidden 15% of engineering time on every workflow that runs through it.
Firecrawl removes that step. The output is already in the form the model needs to read. The credit you pay covers rendering, anti-bot, boilerplate stripping, and conversion to LLM-ready text in one call. Bright Data and ScrapFly are better at pure stealth scraping for sites that actively fight back. Crawlbase wins on raw cost at high volume, and Apify gives you more control over the scraping logic if that's what you need.
For an AI-driven sales workflow, where the next step in the pipeline is always a model reading the response, the question is which API ships output a model can read. Firecrawl is the only one that does, by default, in 2026.
Closing
Every AI-driven sales workflow has a hidden middle step. The pretty bits at the front (lead sourcing) and the pretty bits at the back (copywriting and reply handling) get all the airtime. The bit in the middle, which turns the open web into structured input the model can read, is the part nobody puts on a slide because nobody enjoys explaining it.
It's also the part that makes everything else work.
For us, Firecrawl is the answer to that middle step. It runs underneath the verification layer that filters Apollo lists from noisy candidates down to qualified yeses. It sits inside the personalisation engine that writes 3,000 emails a day at SDR-level quality, and every morning the daily Sales OS cron uses it to spot what changed on customer homepages overnight. Every pound of revenue we book for a client passes through it once or twice along the way.
If you want the next layer up the stack, the copy optimisation loop covers how we feed the personalisation output into a continuous A/B test that auto-reverts when a rewrite drops performance. That's the system that makes the whole campaign get better with every cycle, instead of decaying.