The Crawly Index

Our web index powers every backlink, authority score, and spam signal in Crawly's tools and API. 124M domains. 4.75B backlinks. Processed, scored, and updated every month.

124M domains | 4.75B backlinks | Updated monthly | Available via tools, API and MCP

What is the Crawly Index?

The Crawly Index is our processed web dataset covering 124 million domains and 4.75 billion links between them. We build and maintain it from open web crawl data - the same kind of publicly available, openly licensed data that underpins many of the web's most widely used datasets.

From that raw crawl, we run our own processing pipeline to extract the link graph, calculate authority signals, rank every domain by harmonic centrality and PageRank, and produce the scores and metrics exposed across Crawly's free tools, REST API, and MCP server.

The index is refreshed monthly. Every tool, every API call, and every MCP tool response is backed by the same underlying dataset.

How authority scores are calculated

Our scoring pipeline runs in three stages after each monthly index refresh.

1. Link graph extraction

We extract all domain-level links from the crawl - who links to whom, with what anchor text, from how many pages. This produces the raw link graph.

2. Harmonic centrality ranking

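We run harmonic centrality over the full link graph to rank every domain by how central it is to the web's overall link structure. A domain's harmonic centrality is the sum of the reciprocals of the shortest link-path distances from every other domain that can reach it, so a domain that many others can reach in few hops ranks higher. In practice, the more high-authority sites that link to a domain - and the more those sites are themselves well-linked - the higher it ranks.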

3. Score normalisation

We map each domain's harmonic rank to a 0-100 authority score using a logarithmic formula. This ensures scores are spread across the full range - mid-tier domains score in the 20-50 band rather than clustering near zero.
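To make the three stages concrete, here is a minimal Python sketch that runs them end to end on a toy dataset. It is an illustration under stated assumptions, not Crawly's production pipeline: the input rows, the helper names (domain_of, extract_link_graph, harmonic_centrality, authority_score) and the naive domain parsing are all hypothetical, and the exact logarithmic formula is not published on this page, so the one below is only one plausible log-scaled mapping from rank to a 0-100 score.

```python
import math
from collections import defaultdict, deque
from urllib.parse import urlsplit

def domain_of(url):
    """Naive domain extraction: hostname minus a leading 'www.'.
    Assumption: a real pipeline would use the Public Suffix List."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def extract_link_graph(page_links):
    """Stage 1: collapse page-level (source_url, target_url, anchor) rows
    into domain-level edges with link, source-page and anchor-text data."""
    edges = defaultdict(lambda: {"links": 0, "pages": set(), "anchors": set()})
    for src_url, dst_url, anchor in page_links:
        src, dst = domain_of(src_url), domain_of(dst_url)
        if not src or not dst or src == dst:  # skip internal links
            continue
        edge = edges[(src, dst)]
        edge["links"] += 1
        edge["pages"].add(src_url)
        edge["anchors"].add(anchor)
    return edges

def harmonic_centrality(edges):
    """Stage 2: for each domain v, sum 1/d(u, v) over every other domain u
    that can reach v, using a breadth-first search over reversed edges."""
    incoming = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        incoming[dst].add(src)
        nodes.update((src, dst))
    scores = {}
    for v in nodes:
        dist = {v: 0}
        queue = deque([v])
        total = 0.0
        while queue:
            cur = queue.popleft()
            for prev in incoming[cur]:
                if prev not in dist:
                    dist[prev] = dist[cur] + 1
                    total += 1.0 / dist[prev]
                    queue.append(prev)
        scores[v] = total
    return scores

def authority_score(rank, total_domains):
    """Stage 3 (assumed formula): log-scale a 1-based rank onto 0-100."""
    if total_domains <= 1:
        return 100.0
    return round(100.0 * (1.0 - math.log(rank) / math.log(total_domains)), 1)

# Hypothetical page-level crawl rows: (source URL, target URL, anchor text).
page_links = [
    ("https://a.com/1", "https://www.b.com/", "b home"),
    ("https://a.com/2", "https://b.com/x", "b page"),
    ("https://a.com/1", "https://a.com/2", "internal"),
    ("https://b.com/", "https://c.com/", "c"),
    ("https://c.com/", "https://b.com/", "back to b"),
    ("https://d.com/", "https://b.com/", "b"),
]

edges = extract_link_graph(page_links)
harmonic = harmonic_centrality(edges)

# Rank by descending centrality; ties are broken by name for determinism.
ordered = sorted(harmonic, key=lambda d: (-harmonic[d], d))
for rank, domain in enumerate(ordered, start=1):
    print(rank, domain, harmonic[domain], authority_score(rank, len(ordered)))
```

With this assumed formula and an index of 124 million domains, the domain ranked 1,000,000th would score about 26, which is consistent with mid-tier domains landing in the 20-50 band rather than near zero. The per-domain breadth-first search is only practical on small graphs; at web scale an approximate method would be needed.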

Data points in the index

Every domain in the index has some or all of the following data points available.

Authority Score

A 0-100 score calculated from harmonic centrality across the full link graph. Higher means more authoritative.

Referring Domains

The number of unique domains linking to a given domain. The single most important signal for authority.

Total Backlinks

The total count of individual inbound links, including multiple links from the same domain.

Harmonic Rank

A domain's global rank by harmonic centrality - a measure of how central it is to the web's link structure.

PageRank Rank

A domain's global rank by PageRank - the original link-based ranking algorithm, still a strong authority signal.

Host Count

The number of unique IP hosts linking to a domain. Used as an IP diversity signal in spam scoring.

Spam Score

A 0-100 risk score derived from link density, IP diversity, and authority signals. Lower is better.

Link Quality Rating

Each linking domain is rated High, Medium or Low quality based on its own authority signals.
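As a rough illustration of how some of these data points relate, the Python sketch below derives Referring Domains and Total Backlinks from a list of domain-level edges and computes PageRank by power iteration. The edge list, the function names and the damping factor of 0.85 (the conventional textbook value) are assumptions made for the example, not details confirmed by Crawly.

```python
from collections import defaultdict

# Hypothetical domain-level edges: (source domain, target domain, link count).
edges = [
    ("a.com", "b.com", 3),
    ("c.com", "b.com", 1),
    ("d.com", "b.com", 2),
    ("b.com", "c.com", 2),
    ("a.com", "c.com", 1),
]

def link_counts(edges, target):
    """Return (referring domains, total backlinks) for one target domain."""
    referring = {src for src, dst, _ in edges if dst == target}
    backlinks = sum(count for _, dst, count in edges if dst == target)
    return len(referring), backlinks

def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over unweighted domain-level edges."""
    nodes = sorted({d for src, dst, _ in edges for d in (src, dst)})
    out = defaultdict(set)
    for src, dst, _ in edges:
        out[src].add(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by domains with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out[node])
        nxt = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out[src]:
                nxt[dst] += damping * rank[src] / len(out[src])
        rank = nxt
    return rank

referring_domains, total_backlinks = link_counts(edges, "b.com")
print("b.com:", referring_domains, "referring domains,", total_backlinks, "backlinks")

pr = pagerank(edges)
# PageRank Rank is the position of each domain when sorted by score.
for position, domain in enumerate(sorted(pr, key=lambda d: (-pr[d], d)), start=1):
    print(position, domain, round(pr[domain], 4))
```

In this toy graph, b.com has three referring domains but six backlinks, because several of those domains link to it more than once.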

How to access the data

Crawly Index data is available in three ways depending on how you work.

Free tools

Browser-based tools - no login, no key. Check any domain instantly.

Browse tools

REST API

Programmatic access to domain authority, backlinks, and spam scores. 100 requests/day free.

View API docs

MCP Server

Query the index in plain English from Claude Code, Cursor, Windsurf or Codex.

Connect MCP
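If you script against the free REST API tier, it can help to track the 100 requests/day allowance on the client side. The Python sketch below is a hypothetical helper, not part of any Crawly SDK: the class name DailyQuota is invented for the example, and it assumes the allowance resets at midnight UTC, which this page does not state. It makes no network calls and only decides whether another request should be sent.

```python
from datetime import date, datetime, timezone

class DailyQuota:
    """Client-side counter for a fixed daily request allowance."""

    def __init__(self, limit=100):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today=None):
        """Return True and count one request if the allowance permits it."""
        # Assumption: the allowance resets at midnight UTC.
        today = today or datetime.now(timezone.utc).date()
        if today != self.day:
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def remaining(self):
        return self.limit - self.used

# Usage with a small limit and fixed dates so the behaviour is easy to follow.
quota = DailyQuota(limit=2)
day_one = date(2024, 1, 1)
results = [quota.try_acquire(day_one) for _ in range(3)]
print(results)            # third call is refused on the same day
print(quota.try_acquire(date(2024, 1, 2)), quota.remaining())
```

Call try_acquire() before each API request and skip or queue the request when it returns False.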