same experiment, third niche: take a category's biggest players, pull every domain that links to them from common crawl, and keep the ones that link to the whole set. SEO tools gave 37 universal linkers, nearly all media. CRMs gave 2, both integration hubs. AI agent frameworks just gave us 5, and they are a different animal again: two developer publishing platforms, one data science publisher, and two product sites. the niche that everyone is writing about has almost no press in its link graph.
how we pulled this
eight frameworks: langchain, llamaindex, crewai, dify, flowise, langflow, autogpt, and superagi. for each we pulled its referring domains ranked by cg_authority from the common crawl webgraph, then intersected the eight lists. the per-domain numbers:
| framework | referring domains | cg authority |
|---|---|---|
| langchain.com | 5,949 | 64 |
| superagi.com | 2,040 | 46 |
| llamaindex.ai | 1,999 | 49 |
| crewai.com | 1,489 | 46 |
| dify.ai | 1,331 | 46 |
| flowiseai.com | 685 | 29 |
| langflow.org | 450 | 28 |
| agpt.co (autogpt) | 450 | 28 |
one number up front: langchain has roughly 3 to 13 times the referring domains of every other framework in the category. 77 percent of all linkers link to exactly one framework, and most of the time that one framework is langchain. in a niche this young, editorial attention is already concentrated on a single player.
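a quick arithmetic check of those two figures, as a minimal sketch. every number is copied from this post (the referring-domain table above and the overlap counts reported further down); only the variable names are ours.

```python
# referring-domain counts copied from the table above
referring_domains = {
    "langchain.com": 5949,
    "superagi.com": 2040,
    "llamaindex.ai": 1999,
    "crewai.com": 1489,
    "dify.ai": 1331,
    "flowiseai.com": 685,
    "langflow.org": 450,
    "agpt.co": 450,
}

leader = referring_domains["langchain.com"]

# how many times larger langchain's count is than each other framework's
ratios = {
    domain: leader / count
    for domain, count in referring_domains.items()
    if domain != "langchain.com"
}
smallest_gap = min(ratios.values())  # against superagi.com
largest_gap = max(ratios.values())   # against langflow.org and agpt.co

# share of linkers that link to exactly one framework,
# using the overlap counts reported later in this post
single_linkers = 5787
unique_linkers = 7508
single_share = single_linkers / unique_linkers

print(f"{smallest_gap:.1f}x to {largest_gap:.1f}x")  # 2.9x to 13.2x
print(f"{single_share:.0%}")                         # 77%
```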
the overlap pyramid
across the eight frameworks there were 7,508 unique linking domains. the distribution:
| overlap | linking domains | non-platform |
|---|---|---|
| link to all 8 | 5 | 5 |
| link to 7 | 18 | 17 |
| link to 6 | 39 | 37 |
| link to 5 | 70 | 68 |
| link to 4 | 144 | 141 |
| link to 3 | 384 | 383 |
| link to 2 | 1,061 | 1,059 |
| link to just 1 | 5,787 | 5,782 |
among domains linking to four or more frameworks, 97 percent are real editorial or product sites. the platform noise you would expect from a crawl-scale dataset is almost absent once you filter the obvious CDNs.
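the pyramid is just a count of how many frameworks each linking domain reaches. a minimal sketch of that step, assuming you already have one set of referring domains per framework: the framework names and `.example` domains below are made-up toy data, not our dataset, and the function names are ours.

```python
from collections import Counter

# hypothetical toy input: framework -> set of referring domains.
# the real lists come from the common crawl webgraph; these are made up.
referrers = {
    "framework-a": {"hub.example", "blog.example", "news.example"},
    "framework-b": {"hub.example", "blog.example"},
    "framework-c": {"hub.example", "solo.example"},
}


def overlap_counts(referrers):
    """map each linking domain to the number of frameworks it links to."""
    counts = Counter()
    for domains in referrers.values():
        counts.update(domains)  # each set contributes 1 per domain
    return counts


def pyramid(counts):
    """map overlap level -> number of domains at that level."""
    return Counter(counts.values())


counts = overlap_counts(referrers)
levels = pyramid(counts)

# universal linkers: domains that link to every framework in the set
universal = sorted(d for d, n in counts.items() if n == len(referrers))

for level in sorted(levels, reverse=True):
    print(f"link to {level}: {levels[level]}")
print(universal)
```

because each framework's referrers are stored as a set, a domain is counted once per framework no matter how many pages it links from, which is what the table rows measure.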
the five, and what they tell you
the five domains that link to all eight are github.com, substack.com, analyticsvidhya.com, justcall.io, and klavis.ai. github and substack are where this niche publishes: the frameworks are open source, and the people who cover them write newsletters, not magazine features. analyticsvidhya is the one classic publisher, a data science education site. widen to the seven-of-eight set and the picture sharpens:
domains linking to 7 of the 8 frameworks, by cg authority (github, substack, analyticsvidhya, justcall and klavis link to all 8):
| domain | frameworks linked | cg authority |
|---|---|---|
| dev.to | 7 | 62 |
| clickup.com | 7 | 58 |
| n8n.io | 7 | 57 |
| akka.io | 7 | 57 |
| luma.com | 7 | 57 |
| arize.com | 7 | 43 |
| aiagentsdirectory.com | 7 | 39 |
| zenml.io | 7 | 38 |
| thetoolnerd.com | 7 | 34 |
| productschool.com | 7 | 30 |
| latenode.com | 7 | 21 |
| agenthunter.io | 7 | 21 |
| everydev.ai | 7 | 20 |
| potpie.ai | 7 | 14 |
read the names. dev.to, along with qiita and zenn (both in the six-linker set), are developer publishing platforms. n8n and latenode are automation platforms. arize, zenml and langfuse are ML tooling that integrates with everything. and then there is the interesting layer: aiagentsdirectory.com, agenthunter.io, everydev.ai, findmyagentai.com, bestaiagents.ai, aiagentslive.com. a whole crop of AI agent directories that did not exist two years ago, most with low authority scores, some in single digits.
the shape of the niche is the strategy
three niches, three shapes, three different plays:
| niche | universal linkers | who they are |
|---|---|---|
| SEO tools | 37 | trade media + roundups |
| CRMs | 2 | integration hubs |
| AI agent frameworks | 5 | dev platforms + young directories |
for SEO tools the play is pitching writers, because the category shelf is media. for CRMs it is building integrations, because the shelf is automation hubs. for AI agents the shelf is developer platforms plus directories that are still forming. the play is to publish where developers already read (github, substack, dev.to, qiita) and to get listed in the agent directories while they are young. a directory with authority 20 today is not impressive. a directory that ends up being the geekflare of AI agents in three years is, and you got in when it took an email.
the newcomer proof
langflow and autogpt each have 450 referring domains, a fraction of langchain's 5,949. yet both are reached by the same directories and dev platforms that link to the whole category. you do not out-publish langchain at this point. you get onto the shelf that the category linkers are building, and the shelf treats you the same as the giant.
how to do this for your own niche
the method is category-agnostic. pick your two or three closest competitors, pull their referring domains, and keep the ones that link to several of them but not to you, sorted by overlap and then authority. that is exactly what a backlink gap analysis does, free, on the same common crawl data we used here. it tells you which of the three shapes your niche is, and hands you the list of who to approach.
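the filter and sort described above fit in a few lines. this is a minimal sketch, not the implementation of any particular tool: the function name `backlink_gap`, the `min_overlap` parameter, and all the domains and authority scores below are hypothetical.

```python
def backlink_gap(competitor_referrers, my_referrers, authority, min_overlap=2):
    """domains linking to at least min_overlap competitors but not to you.

    returns (domain, overlap, authority) rows sorted by overlap,
    then authority, both descending.
    """
    overlap = {}
    for domains in competitor_referrers.values():
        for domain in domains:
            overlap[domain] = overlap.get(domain, 0) + 1

    gaps = [
        (domain, n, authority.get(domain, 0))
        for domain, n in overlap.items()
        if n >= min_overlap and domain not in my_referrers
    ]
    # domain name is the final tie-breaker so the order is deterministic
    gaps.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return gaps


# hypothetical toy data: three competitors, your own referrers, made-up scores
competitor_referrers = {
    "rival-a.example": {"hub.example", "dir.example", "blog.example"},
    "rival-b.example": {"hub.example", "dir.example", "blog.example"},
    "rival-c.example": {"hub.example", "blog.example", "solo.example"},
}
my_referrers = {"blog.example"}
authority = {"hub.example": 60, "dir.example": 20, "blog.example": 45, "solo.example": 30}

gaps = backlink_gap(competitor_referrers, my_referrers, authority)
for domain, n, score in gaps:
    print(domain, n, score)
```

in the toy data, blog.example drops out because it already links to you, and solo.example drops out because it links to only one competitor. what remains is the outreach list, best overlap first.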
the 59 non-platform domains that link to six or more of these frameworks are free to download as CSV or JSON. see also the SEO-tool link leaderboard and the CRM study for the contrasting shapes.