most link prospecting is spreadsheet labour: export a backlink list, export a competitor's, vlookup the difference, sort by some authority column, then eyeball a few hundred rows for anything worth pitching. an ai assistant wired to a live backlink data source can do the mechanical parts in one conversation. this is a practical walkthrough of how that wiring works and where it still needs a human.
what mcp is, in one paragraph
mcp stands for model context protocol - a standard, open way to give an ai assistant live tools and data instead of relying on whatever it memorised during training. an mcp server exposes a small set of functions (lookups, queries, jobs); the assistant decides when to call them based on what you ask in plain english. for link building, that means the model can pull a current backlink graph at the moment you ask, rather than guessing from stale knowledge. assistants like claude, cursor, and cline all speak mcp.
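to make "exposes a small set of functions" concrete: under the hood, a tool call is a json-rpc 2.0 message with the method tools/call. the sketch below builds one. the tool name backlinks and the domain argument are hypothetical placeholders for illustration, not crawlgraph's documented schema.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and argument, used only to show the message shape.
request = build_tool_call(1, "backlinks", {"domain": "yoursite.com"})
print(request)
```

you never write this message yourself. the assistant emits it when your plain-english request calls for that tool.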
the important shift is that you stop being the integration layer. you are no longer copying numbers out of one tool and into a sheet. you describe the outcome you want and the assistant runs the tool calls, holds the intermediate results in context, and hands you the part that needs judgment.
mcp does not make the model smarter about seo. it gives the model accurate, current data to reason over. the quality of your prospecting still depends on the quality of the underlying index and on you asking the right question. a confident answer built on no data is still wrong.
step 1: connect a backlink tool over mcp
you need a data source that ships an mcp server. crawlgraph exposes one with four tools - backlinks, gap analysis, ranked outreach targets, and release info - over the public webgraph it builds from common crawl. there are two ways to connect it.
the local option runs the server on your machine with npx, which is the simplest path for a single user. drop this into your assistant's mcp config (for claude desktop, that is the mcpServers block in its config file):
{
  "mcpServers": {
    "crawlgraph": {
      "command": "npx",
      "args": ["-y", "crawlgraph-mcp"],
      "env": { "CRAWLGRAPH_API_KEY": "cg_live_…" }
    }
  }
}
the hosted option points at the streamable-http endpoint instead, which avoids running anything locally and works well for cursor or cline. same key, different transport:
{
  "mcpServers": {
    "crawlgraph": {
      "url": "https://crawlgraph.com/mcp",
      "headers": { "Authorization": "Bearer cg_live_…" }
    }
  }
}
the cg_live_… value is your api key, available once you are on the lifetime tier. backlink lookups and gap jobs draw against your monthly quota (1,000 lookups and 50 gap jobs on lifetime), so the assistant is spending real budget when it calls these tools - worth knowing before you tell it to analyse fifty competitors in a loop.
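a quick way to see how fast that budget goes is to do the arithmetic before you prompt. this is a minimal sketch that assumes one gap job per competitor and one backlink lookup per target you inspect. both are assumptions about how the assistant behaves, not documented billing rules.

```python
from dataclasses import dataclass

MONTHLY_LOOKUPS = 1_000   # lifetime-tier quota quoted in the text
MONTHLY_GAP_JOBS = 50     # lifetime-tier quota quoted in the text


@dataclass
class PlannedRun:
    competitors: int         # assumed cost: one gap job each
    targets_to_inspect: int  # assumed cost: one backlink lookup each


def remaining_after(run: PlannedRun, runs_per_month: int) -> tuple[int, int]:
    """Return (lookups_left, gap_jobs_left) after repeating the run all month."""
    lookups_left = MONTHLY_LOOKUPS - run.targets_to_inspect * runs_per_month
    gap_jobs_left = MONTHLY_GAP_JOBS - run.competitors * runs_per_month
    return lookups_left, gap_jobs_left


# Three competitors and ten inspected targets, repeated weekly.
print(remaining_after(PlannedRun(competitors=3, targets_to_inspect=10), 4))
# Fifty competitors in a single run uses the whole gap-job quota.
print(remaining_after(PlannedRun(competitors=50, targets_to_inspect=0), 1))
```

under those assumptions a weekly three-competitor run leaves plenty of headroom, while the fifty-competitor loop spends the month's gap jobs in one go.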
step 2: ask for a competitor gap analysis
with the tool connected, you prompt in plain english. the assistant figures out which tool calls to make. a gap analysis - domains that link to your competitors but not to you - is the highest-leverage prospecting query, and it maps directly onto the gap tool (which you can also run by hand in the browser). here is a prompt that works well:
my site is yoursite.com. my three closest competitors are rival-a.com, rival-b.com, and rival-c.com. run a gap analysis: find the domains that link to at least two of those competitors but not to me. rank them by authority, drop anything that looks like a directory, forum, or social platform, and give me the top 25 with the competitor(s) each one already links to.
behind that one message the assistant typically runs a gap job for each competitor, intersects the results to find domains linking to two or more of them, then calls the ranked outreach-targets tool, which already filters out platform noise (directories, forums, social sites) and sorts by cg_authority, the 0-100 authority metric. you get back a ranked shortlist instead of a raw dump. the filtering and ranking that used to be a manual afternoon happens inside the tool.
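the intersect-and-rank step is easy to picture as set arithmetic. the sketch below reproduces the logic locally on made-up data so you can see what "links to at least two competitors but not to me" means. it is not the tool's implementation, and the domains and authority scores are invented for illustration.

```python
from collections import Counter


def gap_shortlist(
    competitor_linkers: dict[str, set[str]],
    own_linkers: set[str],
    authority: dict[str, int],
    min_competitors: int = 2,
    top_n: int = 25,
) -> list[tuple[str, int, list[str]]]:
    """Domains linking to >= min_competitors rivals but not to you, by authority."""
    counts: Counter[str] = Counter()
    linked_to: dict[str, list[str]] = {}
    for competitor, domains in competitor_linkers.items():
        for domain in domains:
            counts[domain] += 1
            linked_to.setdefault(domain, []).append(competitor)

    candidates = [
        d for d, n in counts.items()
        if n >= min_competitors and d not in own_linkers
    ]
    # Highest authority first; the domain name breaks ties deterministically.
    candidates.sort(key=lambda d: (-authority.get(d, 0), d))
    return [(d, authority.get(d, 0), sorted(linked_to[d])) for d in candidates[:top_n]]


# Invented sample data.
competitor_linkers = {
    "rival-a.com": {"blog-one.example", "news-two.example", "wiki-three.example"},
    "rival-b.com": {"blog-one.example", "news-two.example"},
    "rival-c.com": {"news-two.example", "wiki-three.example", "solo-four.example"},
}
own_linkers = {"wiki-three.example"}
authority = {
    "blog-one.example": 62,
    "news-two.example": 71,
    "wiki-three.example": 90,
    "solo-four.example": 40,
}

for row in gap_shortlist(competitor_linkers, own_linkers, authority):
    print(row)
```

here wiki-three.example drops out because it already links to you, and solo-four.example drops out because only one competitor has it.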
add “tell me which tool calls you made and the raw counts at each step” to your prompt. you want to see the gap totals before filtering so you can sanity-check the numbers. if a competitor returns zero gaps, that competitor's domain may be too new or too small to be well represented in the current index.
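once the assistant reports its counts, the plausibility check itself is mechanical. a minimal sketch, assuming you have copied the per-competitor gap totals, the intersection size, and the shortlist size out of the assistant's reply (the function and argument names are made up for this example):

```python
def check_counts(
    gap_totals: dict[str, int], intersected: int, shortlisted: int
) -> list[str]:
    """Return human-readable warnings for counts that do not add up."""
    warnings = []
    for competitor, total in gap_totals.items():
        if total == 0:
            warnings.append(
                f"{competitor}: zero gaps, may be too new or too small for the index"
            )
    # Every domain in a two-or-more intersection sits in at least two gap
    # lists, so twice the intersection cannot exceed the summed totals.
    if 2 * intersected > sum(gap_totals.values()):
        warnings.append("intersection is too large for the raw gap totals")
    if shortlisted > intersected:
        warnings.append("shortlist is larger than the intersection it came from")
    return warnings


print(check_counts({"rival-a.com": 120, "rival-b.com": 0, "rival-c.com": 80}, 30, 25))
```

an empty list means nothing looks off. any warning is a cue to ask the assistant to re-run or explain that step.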
step 3: qualify and draft angles
a ranked list is a starting point, not an outreach plan. the next move is to turn the top entries into something you would actually send. because the assistant still holds the gap results in context, you can keep going in the same conversation:
take the top 10 targets from that list. for each one, look up what kind of site it is from its strongest pages, then suggest a one-line outreach angle: why it would make sense for them to link to yoursite.com specifically. keep it honest - no fake flattery.
here the assistant uses the backlinks tool to inspect each target's strongest pages, infers what the site is about, and proposes a reason a link to you would make editorial sense. treat these as drafts. the model is good at structure and speed and bad at knowing your real relationships, so every angle needs a human pass before it goes anywhere near a send button.
good things to have it do at this stage:
- group targets by type (resource pages, niche blogs, comparison roundups) so you can batch outreach
- flag any target that already links to you under a different domain or subdomain, to avoid a duplicate pitch
- note which competitor each prospect favours, so your angle can speak to a gap they have not filled
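the first two items in that list are simple enough to double-check yourself. a rough sketch, assuming each target arrives as a dict with a domain and a type label the assistant assigned. the base_domain helper is deliberately naive: it keeps the last two labels, so it gets suffixes like co.uk wrong, and a real check would use the public suffix list.

```python
from collections import defaultdict


def base_domain(host: str) -> str:
    """Naive registrable domain: the last two labels of the hostname."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def organise(
    targets: list[dict[str, str]], own_linkers: set[str]
) -> tuple[dict[str, list[str]], list[str]]:
    """Batch targets by type and pull out those already linking to you."""
    own_bases = {base_domain(host) for host in own_linkers}
    batches: dict[str, list[str]] = defaultdict(list)
    already_linking: list[str] = []
    for target in targets:
        if base_domain(target["domain"]) in own_bases:
            already_linking.append(target["domain"])
        else:
            batches[target["type"]].append(target["domain"])
    return dict(batches), already_linking


# Invented sample data.
targets = [
    {"domain": "blog-one.example", "type": "niche blog"},
    {"domain": "www.news-two.example", "type": "comparison roundup"},
    {"domain": "guides.example", "type": "resource page"},
    {"domain": "notes.example", "type": "niche blog"},
]
batches, duplicates = organise(targets, own_linkers={"shop.news-two.example"})
print(batches)
print(duplicates)
```

www.news-two.example is flagged because shop.news-two.example already links to you, so pitching it again would be a duplicate.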
where this still needs a human
the workflow removes the mechanical steps. it does not remove judgment, and being honest about the limits matters more than the demo.
- data freshness. a common-crawl-based index refreshes on a quarterly cadence - the latest snapshot covers jan to mar 2026. a link built last week will not appear yet. for fast-moving campaigns that need same-day data, this is the wrong source, and a continuously crawled tool like ahrefs is worth its recurring cost. the tradeoff: you accept a smaller, slower index in exchange for one that is free and fully scriptable.
- relevance is yours to call. the tool filters obvious noise, but it cannot tell whether a site is genuinely in your topical neighbourhood or just happens to share linkers. that judgment is the actual skill in link building.
- the model will sound certain when it is guessing. if you ask for something the tools cannot answer, a language model tends to fill the gap rather than say “no data”. make it cite the tool call behind every claim.
- outreach is still relationship work. a drafted angle is a head start on the first sentence, not a campaign. the reply rate comes from being relevant and human, which no tool call produces for you.
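the data-freshness limit is the one you can check before spending a lookup. a tiny sketch, assuming the jan to mar 2026 snapshot ends on 31 march 2026. the exact cut-off date is an assumption, not something stated in the docs.

```python
from datetime import date

SNAPSHOT_END = date(2026, 3, 31)  # assumed last day of the jan-mar 2026 snapshot


def coverage(link_built_on: date) -> str:
    """Say whether a link could possibly be in the current snapshot."""
    if link_built_on > SNAPSHOT_END:
        return "too recent: wait for the next snapshot"
    # Being old enough is necessary, not sufficient: the page must also
    # have been crawled.
    return "could appear, if the linking page was crawled"


print(coverage(date(2026, 4, 10)))
print(coverage(date(2026, 2, 1)))
```

a missing link inside the window is worth investigating. a missing link after it is just the quarterly cadence at work.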
putting it together
the end-to-end loop is short: connect a backlink source over mcp once, then in a single conversation ask for a gap analysis, get a ranked and noise-filtered shortlist, and have the assistant draft outreach angles you refine by hand. the parts a computer is good at - intersecting lists, ranking, filtering, summarising - move off your plate. the parts that need a person - relevance, relationships, judgment - stay where they belong.
if you want to wire this up, the connection snippets above are the whole setup, and the underlying index, metrics, and quotas are documented at /docs/api. start by running a single domain on the homepage to see what the index has on your competitors before you point an assistant at it.