What is llms.txt and Does Your Website Need One?
llms.txt is a plain-text file that tells AI crawlers what your site is about. Here is what it is, whether it actually helps, and how to generate one.
16 May 2026 · 5 min read
llms.txt is a plain-text markdown file placed at the root of a website (at /llms.txt) that tells AI language models what the site is about and which pages are most important. Think of it as a sitemap.xml for AI crawlers: a structured, human-readable summary that helps models understand your site without having to infer everything from the HTML.
The format was proposed by Jeremy Howard of Answer.AI in September 2024. It has since been adopted by a growing number of software documentation sites, SaaS products, and technical publishers.
What does an llms.txt file look like?
An llms.txt file is plain markdown. A typical structure:
# Site Name
> One-sentence description of what the site is and who it is for.
## Section name
- [Page title](https://example.com/page): Brief description of what the page covers.
- [Page title](https://example.com/page): Brief description.
## Another section
- [Page title](https://example.com/page): Brief description.
The file starts with the site name as an H1, a short description in a blockquote, then sections grouping the most important pages with links and one-line descriptions.
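The structure above is regular enough to generate from a list of pages. What follows is a minimal sketch, not a reference implementation: the `Page` class, the `render_llms_txt` function, and the example URLs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt document: H1, blockquote summary, then H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            # One markdown link per page, followed by a one-line description.
            lines.append(f"- [{page.title}]({page.url}): {page.description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, used only to show the output shape.
example = render_llms_txt(
    "Example Docs",
    "Documentation for a hypothetical product, aimed at developers.",
    {
        "Guides": [
            Page("Quick start", "https://example.com/quick-start", "Install and run the first example."),
        ],
        "Reference": [
            Page("API", "https://example.com/api", "Endpoint and parameter reference."),
        ],
    },
)
print(example)
```

Keeping the page list as data and rendering it in one place makes it easy to regenerate the file whenever pages are added or renamed.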
There is also an llms-full.txt convention: the same format but with the full content of each page included inline, rather than just links. This allows AI models to ingest the entire site in a single file without crawling individual pages.
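The exact layout of llms-full.txt is not spelled out here beyond "full content inline", so the following is only a sketch under assumptions: each page becomes its own H2 section, and the `Source:` line and the `render_llms_full_txt` name are illustrative choices rather than part of any specification.

```python
def render_llms_full_txt(site_name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Build one document with every page's content inline.

    pages is a list of (title, url, markdown_body) tuples.
    """
    parts = [f"# {site_name}", "", f"> {summary}", ""]
    for title, url, body in pages:
        # Assumed layout: one H2 per page, a source line, then the page body.
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical pages, used only to show the output shape.
full_text = render_llms_full_txt(
    "Example Docs",
    "Documentation for a hypothetical product.",
    [
        ("Quick start", "https://example.com/quick-start", "Install the package, then run the first example.\n"),
        ("API", "https://example.com/api", "All endpoints accept and return JSON."),
    ],
)
print(full_text)
```

Because the whole site ends up in one file, size grows quickly. For large sites it may be worth limiting the full file to the pages already listed in llms.txt.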
Does llms.txt actually help your SEO or AI visibility?
Honestly: the evidence is still thin.
llms.txt is not a Google ranking signal. Google has not said it uses or plans to use llms.txt files. The file will not move your organic rankings.
For AI citation models, the picture is more nuanced:
- Perplexity has confirmed it reads llms.txt files and uses them to improve content representation
- ChatGPT/OpenAI has not confirmed support
- Claude reads llms.txt when crawling with ClaudeBot, but Anthropic has not confirmed it influences citation decisions
- Gemini uses Google's standard search index, and llms.txt is not documented as a signal
The practical benefit is less about direct citation ranking and more about accuracy: if an AI model misrepresents your site, an llms.txt gives it a clear, authoritative summary to reference instead. For sites with complex navigation, lots of dynamic content, or thin crawlable text (single-page apps, documentation portals), the benefit is more tangible.
What llms.txt is not
llms.txt is not:
- A replacement for robots.txt (robots.txt controls crawler access; llms.txt does not)
- A way to block AI crawlers (use robots.txt User-agent directives for that)
- A guaranteed citation mechanism
- An official standard maintained by any major search engine or standards body
It is a community-adopted convention with growing, but not yet universal, support.
Should your website have an llms.txt file?
For most sites, implementing llms.txt is low-effort and low-risk. If you have a documentation site, a SaaS product, or a content-heavy site where accurate AI representation matters, it is worth doing.
If you are an e-commerce site with thousands of product pages and no particular need for AI citation accuracy, it is lower priority.
How to generate an llms.txt file
Crawly's free llms.txt generator crawls up to 20 pages of your site and builds a structured, valid llms.txt file automatically. Paste in your URL, let it crawl, and download the result. No account required.
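If you would rather build entries by hand, one plausible approach is to reuse each page's existing title and meta description. The sketch below is not how Crawly's generator works internally (that is not documented here); `PageSummaryParser` and `llms_txt_entry` are hypothetical names, and the HTML is a made-up sample rather than a fetched page.

```python
from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Collect the <title> text and the meta description from one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Attribute values can be None, so fall back to an empty string.
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def llms_txt_entry(url: str, html: str) -> str:
    """Turn one page into an llms.txt list item, falling back to the URL as title."""
    parser = PageSummaryParser()
    parser.feed(html)
    title = " ".join(parser.title.split()) or url
    entry = f"- [{title}]({url})"
    if parser.description:
        entry += f": {parser.description}"
    return entry


sample_html = """<html><head><title>Pricing</title>
<meta name="description" content="Plans and pricing for the example product.">
</head><body></body></html>"""
print(llms_txt_entry("https://example.com/pricing", sample_html))
```

Meta descriptions are written for search snippets, so it is worth reviewing the generated lines and rewriting any that do not actually say what the page covers.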
If you are running Magento 2, the Magento 2 llms.txt generator extension generates a dynamic llms.txt file from your store's structure automatically, updating as your catalogue changes.
llms.txt and the Crawly MCP integration
If you use Claude Code with Crawly's MCP integration, the llms.txt published on Crawly's own site helps Claude understand what Crawly is and describe it accurately. It is a small example of an AI-era feature paying off for the site that ships it.
What about AI crawlers in robots.txt?
llms.txt is separate from robots.txt. To control whether AI crawlers can access your site at all, you need to set rules in robots.txt:
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
Blocking these user agents in robots.txt tells the corresponding crawlers not to fetch your pages, which can keep your content out of those AI products whether or not you have an llms.txt file. See what is a robots.txt file for a full guide.
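To check what an existing robots.txt says about these crawlers, Python's standard `urllib.robotparser` module can evaluate the rules offline. This is a minimal sketch: the `ai_crawler_access` helper and the sample rules are hypothetical, and it only reports what the file states, not whether a given crawler honours it.

```python
from urllib.robotparser import RobotFileParser

# The user agents named in the robots.txt example above.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def ai_crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report, per AI user agent, whether the given robots.txt allows fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}


# Hypothetical rules: GPTBot is blocked, everyone else only loses /private/.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""

access = ai_crawler_access(sample_robots, "https://example.com/docs/")
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

In this sample only GPTBot is blocked from /docs/. The agents without their own group fall back to the `*` rules, which is easy to overlook when a wildcard `Disallow` is already in place.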
llms.txt is a small investment with a potentially meaningful return for sites where AI visibility matters. Generate one for free with Crawly's llms.txt generator.