
Using Crawl Comparison

Diff two crawls to see exactly what changed - added pages, removed pages, and field-level differences in titles, H1s, status codes, and indexability.

When to use crawl comparison

Crawl comparison is most valuable when something significant has changed on a site and you need to verify the impact. The most common scenarios:

  • Site migrations. Crawl before, migrate, crawl after. Compare to catch pages that became non-indexable, lost their title, or gained a redirect chain.
  • CMS or template updates. A template change can silently affect hundreds of pages. Compare before and after to see exactly what changed.
  • Content publishing at scale. After a batch of new pages goes live, compare to confirm they appear as expected in the crawl.
  • Ongoing monitoring. Run monthly crawls and compare each one to the previous. New broken links, newly non-indexable pages, and title changes all surface in the diff.

How to run a comparison

You need at least two completed crawls of the same site. Open the Crawl Comparison view in Crawly and select the two crawls you want to compare - typically the most recent crawl as "new" and an earlier crawl as "baseline".

Crawly matches URLs between the two crawls and produces three lists:

  • Added URLs - pages that exist in the new crawl but not in the baseline
  • Removed URLs - pages that existed in the baseline but are gone from the new crawl
  • Changed URLs - pages present in both crawls where one or more fields changed
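The three lists amount to set operations on the URLs of the two crawls. The following is a minimal sketch of that idea, not Crawly's actual implementation: it assumes each crawl has been loaded into a dictionary keyed by URL, and the `compare_crawls` function, the field names, and the example.com URLs are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: URL -> fields captured for that page.
Crawl = Dict[str, Dict[str, object]]


def compare_crawls(baseline: Crawl, new: Crawl) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, removed, changed) URL lists, each sorted."""
    baseline_urls = set(baseline)
    new_urls = set(new)

    added = sorted(new_urls - baseline_urls)      # only in the new crawl
    removed = sorted(baseline_urls - new_urls)    # only in the baseline
    changed = sorted(
        url for url in baseline_urls & new_urls   # present in both crawls
        if baseline[url] != new[url]              # at least one field differs
    )
    return added, removed, changed


baseline = {
    "https://example.com/": {"title": "Home", "status": 200},
    "https://example.com/about": {"title": "About", "status": 200},
    "https://example.com/old": {"title": "Old page", "status": 200},
}
new = {
    "https://example.com/": {"title": "Home", "status": 200},
    "https://example.com/about": {"title": "About us", "status": 200},
    "https://example.com/blog": {"title": "Blog", "status": 200},
}

added, removed, changed = compare_crawls(baseline, new)
print(added)    # ['https://example.com/blog']
print(removed)  # ['https://example.com/old']
print(changed)  # ['https://example.com/about']
```

Matching here is by exact URL string, so a trailing slash or a change of protocol would show up as one removed URL plus one added URL in this sketch.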

Reading the three lists

Added URLs

New pages that Crawly discovered in the latest crawl. These are either newly published pages, or pages that already existed but had no internal links pointing to them, so Crawly's spider could not reach them before.

Check that added URLs are intentional. After a migration, you might see new URLs created by the new platform that were not planned - for example, tag pages, author pages, or pagination variants.

Removed URLs

Pages from the baseline that Crawly did not find in the new crawl. This could mean:

  • The page was intentionally deleted or redirected
  • The page is now orphaned - no internal links lead to it, so the spider cannot find it
  • The URL structure changed and the page appears as a different (added) URL

Removed URLs that previously ranked in search, or that have external inbound links, are high-priority items to investigate.
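When the URL structure changed, the same page tends to show up once in the removed list and once in the added list. One rough way to spot such pairs is to match them on title. This is a heuristic sketch only, assuming titles stayed the same across the change; the `likely_moves` function and the sample paths are hypothetical and not a Crawly feature.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def likely_moves(removed: Dict[str, str], added: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair removed and added URLs that share a title.

    Both arguments map URL -> page title. A pair is reported only when
    the title occurs exactly once on each side, to avoid ambiguous matches.
    """
    added_by_title = defaultdict(list)
    for url, title in added.items():
        added_by_title[title].append(url)

    removed_title_counts = Counter(removed.values())

    pairs = []
    for old_url, title in sorted(removed.items()):
        candidates = added_by_title.get(title, [])
        if title and len(candidates) == 1 and removed_title_counts[title] == 1:
            pairs.append((old_url, candidates[0]))
    return pairs


removed_pages = {"/blog/post-1": "Post one", "/team": "Our team"}
added_pages = {"/articles/post-1": "Post one", "/tag/news": "News"}

print(likely_moves(removed_pages, added_pages))
# [('/blog/post-1', '/articles/post-1')]
```

Pairs found this way are candidates for a redirect check, not proof of a move. Anything left unpaired still needs manual review.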

Changed URLs

Pages where the content or behaviour changed between crawls. Crawly shows field-level diffs for:

  • Title - old title vs new title side by side
  • H1 - the page's main heading changed
  • Status code - for example, a page that was 200 is now 301 or 404
  • Indexability - a page that was indexable is now noindex, or vice versa

The most critical changes to act on: pages that went from indexable to non-indexable, and pages that changed status from 200 to an error code.
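A field-level diff and the triage rule above can be sketched in a few lines. This is an illustration under stated assumptions, not Crawly's internals: the record keys (`title`, `h1`, `status`, `indexable`) and both function names are hypothetical, and "error code" is taken to mean any status of 400 or higher.

```python
from typing import Dict, Tuple

# Hypothetical keys for the four fields listed above.
FIELDS = ("title", "h1", "status", "indexable")

Diff = Dict[str, Tuple[object, object]]


def field_diffs(old: Dict[str, object], new: Dict[str, object]) -> Diff:
    """Return {field: (old_value, new_value)} for every field that differs."""
    return {
        field: (old.get(field), new.get(field))
        for field in FIELDS
        if old.get(field) != new.get(field)
    }


def is_critical(diff: Diff) -> bool:
    """True if the page lost indexability or went from 200 to a 4xx/5xx status."""
    lost_indexability = diff.get("indexable") == (True, False)
    old_status, new_status = diff.get("status", (None, None))
    became_error = (
        old_status == 200
        and isinstance(new_status, int)
        and new_status >= 400
    )
    return lost_indexability or became_error


old_page = {"title": "Pricing", "h1": "Pricing", "status": 200, "indexable": True}
new_page = {"title": "Pricing", "h1": "Plans", "status": 200, "indexable": False}

diff = field_diffs(old_page, new_page)
print(diff)               # {'h1': ('Pricing', 'Plans'), 'indexable': (True, False)}
print(is_critical(diff))  # True
```

Under this rule a change from 200 to 301 is not flagged as critical, since a redirect is not an error, although it is still worth confirming that the redirect target is correct.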

Migration workflow

For site migrations, the recommended workflow is:

  1. Crawl the existing site before the migration begins. Save this as your baseline.
  2. After migration, crawl the new site. Use the same domain if it is a platform change, or the new domain if the migration includes a URL move.
  3. Compare the two crawls. Review all three lists - added, removed, and changed.
  4. For any removed URL that previously ranked, verify a 301 redirect is in place pointing to the correct new URL. Use the Redirect Checker to confirm each redirect is a direct single hop.
  5. For any changed URL where indexability changed to non-indexable, investigate why - it may be a noindex tag accidentally applied, a canonical pointing to the wrong URL, or a robots.txt rule that is too broad.
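Step 4 can be rehearsed offline before anything is checked against the live site. The sketch below assumes a redirect map is available as a simple old-URL to new-URL dictionary (for example, compiled from the migration plan). It does not use or reproduce the Redirect Checker, and the `count_hops` and `classify` names are hypothetical.

```python
from typing import Dict, Tuple


def count_hops(url: str, redirects: Dict[str, str], max_hops: int = 10) -> Tuple[str, int]:
    """Follow url through the redirect map and return (final_url, hops).

    Stops when a URL has no further redirect, when a loop is detected,
    or after max_hops.
    """
    seen = {url}
    hops = 0
    while url in redirects and hops < max_hops:
        url = redirects[url]
        hops += 1
        if url in seen:  # redirect loop
            break
        seen.add(url)
    return url, hops


def classify(url: str, redirects: Dict[str, str]) -> str:
    """Label a removed URL by how its redirect resolves in the map."""
    _, hops = count_hops(url, redirects)
    if hops == 0:
        return "no redirect"
    if hops == 1:
        return "single hop"
    return "chain or loop"


redirect_map = {"/old": "/new", "/a": "/b", "/b": "/c"}

for removed_url in ["/old", "/a", "/gone"]:
    print(removed_url, "->", classify(removed_url, redirect_map))
# /old -> single hop
# /a -> chain or loop
# /gone -> no redirect
```

A map-based check only shows what the plan says. The redirects actually served by the site still need to be verified.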

Start tracking changes

Run your first crawl, then crawl again after changes to see exactly what moved.
