llms.txt is a proposed convention for a single plain-text file — placed at yoursite.com/llms.txt — that hands AI assistants a curated, clean-text map of your most important pages. Think robots.txt's location and simplicity, but a completely different job: instead of telling crawlers what they can't touch, llms.txt points language models toward the content you want them to read, in a format that's easy for a model to ingest at answer time.
It's a genuinely useful idea, and it's spreading. It is also, as of 2026, a proposal — not a standard any major AI engine has committed to following. This guide explains what llms.txt is, how it's meant to work, exactly how to create one (with a block you can copy), how it differs from robots.txt, and the honest answer to the question everyone actually has: does it do anything yet?
What Is llms.txt?
llms.txt is a proposed standard for a Markdown file, located at the root of your site (/llms.txt), that provides information to help large language models use your website. It was proposed by Jeremy Howard — co-founder of Answer.AI and fast.ai — on September 3, 2024, and the spec lives at llmstxt.org.
The problem it targets is specific. When an AI assistant tries to use your website to answer a question, it runs into two walls: first, as the proposal notes, "context windows are too small to handle most websites in their entirety"; and second, "converting HTML pages into LLM-friendly content" — stripping out navigation, ads, scripts, and markup — is "both difficult and imprecise." llms.txt sidesteps both by offering, in the spec's words, "more concise, expert-level information gathered in a single, accessible location."
A few defining points to anchor the term:
- It lives at a fixed path: the spec is for a file "located in the root path /llms.txt of a website", the same convention that makes robots.txt easy to find.
- It's Markdown, not a new syntax: the file is ordinary Markdown, so it's readable by both humans and machines without a parser.
- It's for inference, not training: the proposal's "expectation is that llms.txt will mainly be useful for inference, i.e. at the time a user is seeking assistance", meaning when a model is answering a question, not when it's being trained.
So llms.txt is, in plain English, a hand-curated cheat sheet to your best content, written for the AI systems that increasingly answer questions on your behalf.
How Does llms.txt Work?
The intended flow is straightforward:
- An AI assistant needs information about your site or product.
- Instead of crawling and parsing your full HTML site, it fetches /llms.txt.
- It reads your short summary, then follows the curated links to clean, readable (ideally Markdown) versions of your key pages.
- It uses that focused content to answer the user — with far less noise than scraping raw HTML.
That's the design. The honest caveat — which we'll expand on below — is that "works" here describes intent, not guaranteed behavior. The file only matters if an engine chooses to fetch it and act on it. llms.txt doesn't push anything to anyone; it just sits at a predictable URL, the way robots.txt and sitemap.xml do, waiting to be read.
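To make the intended flow concrete, here is a minimal Python sketch of what a consuming assistant might do. It is an illustration of the design, not a description of how any real engine behaves: the function names (llms_txt_url, gather_context), the max_pages limit, and the in-memory pages dictionary standing in for HTTP requests are all assumptions made for this example.

```python
import re
from typing import Callable
from urllib.parse import urljoin

# Matches Markdown links of the form [name](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def llms_txt_url(site: str) -> str:
    """Return the root-path llms.txt URL for any URL on a site."""
    return urljoin(site, "/llms.txt")


def gather_context(site: str, fetch: Callable[[str], str], max_pages: int = 5) -> str:
    """Fetch the index, follow up to max_pages of its links, and join the text.

    `fetch` is injected so the sketch runs without network access; a real
    client would perform an HTTP GET here.
    """
    index = fetch(llms_txt_url(site))
    parts = [index]
    for _name, url in LINK_RE.findall(index)[:max_pages]:
        parts.append(fetch(urljoin(site, url)))
    return "\n\n".join(parts)


# Hypothetical site content, keyed by URL, standing in for real responses.
pages = {
    "https://acme.example.com/llms.txt": (
        "# Acme Analytics\n\n"
        "## Docs\n"
        "- [Quickstart](https://acme.example.com/docs/quickstart.md): First event\n"
    ),
    "https://acme.example.com/docs/quickstart.md": "Install the SDK, then call track().",
}

context = gather_context("https://acme.example.com", pages.__getitem__)
print(context)
```

Nothing in this loop is triggered by the publisher. The file is only useful if some client chooses to run a loop like this against it.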
There's also an expanded variant worth knowing. The slim llms.txt is an index of links. Many sites also publish an llms-full.txt (and tooling like the spec's own llms_txt2ctx generates llms-ctx.txt / llms-ctx-full.txt) that inlines the actual page text into one big Markdown file — handy for a model that wants the entire documentation set in a single fetch rather than chasing links. Anthropic's developer docs, for example, publish both a slim index and a full export.
llms.txt vs robots.txt: Not the Same Thing
This is the single most common confusion, so let's settle it. The two files share a naming style and a root-level location, which makes people assume they're variations on a theme. They aren't — they do opposite jobs.
robots.txt is about permission. It tells crawlers which URLs they may and may not fetch (User-agent, Allow, Disallow), and major search engines honor it as part of their crawl protocol. It's a gate.
llms.txt is about curation. It contains no directives, grants no permissions, and blocks nothing — it cannot stop a crawler or hide a page. It's a guide. As Search Engine Land put it, "Robots.txt is about exclusion. Sitemap.xml is about discovery. Llms.txt is about curation" — closer to "a curated sitemap.xml" than to an access-control file.
| Dimension | robots.txt | llms.txt |
|---|---|---|
| Job | Access control — what crawlers may fetch | Curation — which content to read, and in what order |
| Contains | Directives (Allow / Disallow, User-agent) | A summary + curated Markdown links, no directives |
| Can it block or hide a page? | Yes (by convention) | No — it grants and denies nothing |
| Format | Plain-text rules | Markdown (H1, blockquote, link lists) |
| Enforcement | Honored by major search crawlers | No engine is committed to acting on it |
| Analogy | A gate | A treasure map / hand-made sitemap |
The practical takeaway: these aren't either/or. Keep robots.txt doing its access-control job, and add llms.txt as a separate, optional curation layer. One is not a replacement for the other.
How to Create an llms.txt File (Format + Steps)
The format is deliberately minimal. Per the spec, a compliant file contains these sections, as Markdown, in order:
- An H1 with the name of the project or site. This is "the only required section."
- A blockquote (">") with a short summary "containing key information necessary for understanding the rest of the file."
- Optional free-text Markdown (paragraphs, lists, anything except more headings) with extra context.
- Zero or more H2 sections, each holding a "file list" of links. Each list item is "a required markdown hyperlink [name](url), then optionally a : and notes about the file."
- An optional ## Optional section: links there "can be skipped if a shorter context is needed". Use it for secondary material.
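Because the layout is this regular, a file can be read with a few lines of ordinary code. The sketch below is one possible reader, not the spec's reference parser: the LlmsTxt dataclass, the parse_llms_txt function, and the regular expression are assumptions for illustration. It also simplifies by keeping only the first blockquote line as the summary and ignoring free-text context.

```python
import re
from dataclasses import dataclass, field

# One file-list item: "- [name](url)" with an optional ": notes" suffix.
LINK_ITEM = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # H2 heading -> list of (name, url, notes) tuples
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, the first blockquote line, and the H2 link lists."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith(">") and current is None and not doc.summary:
            doc.summary = line[1:].strip()
        elif current is not None:
            match = LINK_ITEM.match(line)
            if match:
                doc.sections[current].append(
                    (match["name"], match["url"], match["notes"] or "")
                )
    if not doc.title:
        # The H1 is the only section the spec requires.
        raise ValueError("llms.txt needs an H1 title")
    return doc
```

A consumer that is short on context could read every section except doc.sections.get("Optional"), which is the behavior the ## Optional heading is meant to allow.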
Here are the numbered steps to ship one:
1. Create the file. Make a plain-text file named llms.txt and host it so it resolves at https://yoursite.com/llms.txt (root path).
2. Add the H1 and summary. Start with # Your Site Name, then a > blockquote that says, in one or two sentences, what you do and what a model most needs to know.
3. List your key pages under H2 sections. Group related links (e.g. ## Docs, ## Products, ## Guides) and write each as - [Page name](https://yoursite.com/page): short, descriptive note. Write descriptive notes: "Payments API: charges and refunds," not "API reference."
4. Point links at clean content. Where you can, link to readable or Markdown (.md) versions of pages, since the whole point is giving the model clean text instead of cluttered HTML.
5. (Optional) add ## Optional. Put nice-to-have links here so models can drop them when context is tight.
6. (Optional) generate llms-full.txt. If you want models to grab everything in one fetch, also publish an expanded file that inlines the page text.
7. Keep it current. Treat it like sitemap.xml: regenerate it when your important pages change.
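If your page list already lives in a CMS or a config file, the steps above can be scripted. The following is a small Python sketch under stated assumptions: build_llms_txt, its parameters, and the (name, url, note) tuple shape are invented for this example, and the URLs are placeholders. It renders only the H1, summary, H2 link lists, and an optional ## Optional section.

```python
from typing import Iterable, Mapping, Tuple

Link = Tuple[str, str, str]  # (name, url, note); note may be an empty string


def _link_line(name: str, url: str, note: str) -> str:
    """Render one file-list item, adding ': note' only when a note exists."""
    line = f"- [{name}]({url})"
    return f"{line}: {note}" if note else line


def build_llms_txt(
    title: str,
    summary: str,
    sections: Mapping[str, Iterable[Link]],
    optional: Iterable[Link] = (),
) -> str:
    """Render an llms.txt document as a string (hypothetical helper)."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    groups = [(heading, list(links)) for heading, links in sections.items()]
    optional = list(optional)
    if optional:
        # Secondary links go last, under the heading the spec reserves for them.
        groups.append(("Optional", optional))
    for heading, links in groups:
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(_link_line(*link) for link in links)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


text = build_llms_txt(
    "Acme Analytics",
    "Privacy-first product analytics for SaaS teams.",
    {"Docs": [("Quickstart", "https://acme.example.com/docs/quickstart.md", "Send your first event")]},
    optional=[("Changelog", "https://acme.example.com/changelog.md", "")],
)
print(text)
```

Running a script like this from the same job that rebuilds sitemap.xml is one way to keep the two files in step.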
You don't have to hand-write it. Several WordPress plugins generate and auto-update llms.txt from your existing content (respecting your noindex/nofollow rules), and many documentation platforms produce one automatically — Mintlify, for instance, generates llms.txt for the docs it hosts. For a small site, hand-writing it is often faster and gives you tighter control over what you surface.
A copyable example
Here's a minimal, spec-faithful llms.txt you can adapt — replace the placeholders with your own pages:
# Acme Analytics
> Acme Analytics is a privacy-first product analytics platform for SaaS teams. It tracks events, funnels, and retention without third-party cookies. This file points to the docs and pages most useful when answering questions about Acme.
Key things to know:
- Acme is self-serve; there is a free tier and usage-based paid plans.
- Acme is GDPR-compliant and does not sell user data.
## Docs
- [Quickstart](https://acme.example.com/docs/quickstart.md): Install the SDK and send your first event in 5 minutes
- [Event tracking API](https://acme.example.com/docs/events.md): Reference for the track(), identify(), and group() methods
- [Funnels & retention](https://acme.example.com/docs/funnels.md): How to build conversion funnels and cohort retention reports
## Product
- [Pricing](https://acme.example.com/pricing.md): Plan tiers, usage limits, and what's included on the free tier
- [Integrations](https://acme.example.com/integrations.md): Supported sources and destinations (warehouses, CDPs, webhooks)
## Optional
- [Changelog](https://acme.example.com/changelog.md): Recent releases and breaking changes
The structure mirrors the spec's own FastHTML example: one H1, a blockquote summary, some free-text context, then H2 link lists with descriptive notes — and an ## Optional section a model can skip.
Does llms.txt Actually Work? An Honest Look at Adoption
Here's the part that separates a useful guide from hype. As of 2026, llms.txt is a proposal that no major AI engine has publicly committed to using in production — and at least one, Google, has explicitly said it does not use it.
The evidence is fairly one-directional:
- Google has been blunt. Google's John Mueller compared llms.txt to the keywords meta tag — a publisher-controlled tag search engines abandoned long ago because it's too easy to game — and said "AFAIK none of the AI services have said they're using LLMs.TXT." He's also pointed out that, from server logs, "you can tell ... that they don't even check for it." At a Google Search Central event in July 2025, Gary Illyes stated that Google "doesn't support LLMs.txt and isn't planning to."
- Other major providers haven't committed either. A July 2025 analysis reported that "no major LLM provider currently supports llms.txt. Not OpenAI. Not Anthropic. Not Google," and that AI crawlers were not observed requesting the file during normal site visits.
- Publishing it ≠ consuming it. This is the nuance most coverage gets wrong. Many well-known companies publish an llms.txt — Anthropic, Vercel, Stripe, Cloudflare, and others all expose one for their docs. But publishing a file is the author's side of the convention. Notably, even Anthropic "publishes its own llms.txt, but doesn't state that its crawlers use" it. A long directory of sites that publish llms.txt tells you the idea is popular with publishers — not that engines read it.
- A fetch is not an endorsement. You may see an AI crawler occasionally request /llms.txt in your logs. That means a bot fetched a file at a known path; it does not establish that the file changed how that engine sourced, ranked, or cited your content.
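You can run the same log check on your own server. The sketch below counts requests for /llms.txt per user agent. It assumes your access log uses the common "combined" format (request line, status, size, referrer, user agent), so adjust the pattern if yours differs. The function name and the bot names in the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumes combined log format:
# ... "GET /path HTTP/1.1" 200 512 "referrer" "user agent"
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_hits(log_lines):
    """Count requests for /llms.txt, grouped by user-agent string."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        # Ignore any query string when comparing the path.
        if match and match["path"].split("?", 1)[0] == "/llms.txt":
            hits[match["agent"]] += 1
    return hits


sample_log = [
    '203.0.113.7 - - [10/Jul/2025:12:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [10/Jul/2025:12:00:05 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    '198.51.100.4 - - [10/Jul/2025:12:01:00 +0000] "GET /llms.txt?x=1 HTTP/1.1" 200 512 "-" "OtherBot/2.0"',
]

for agent, count in llms_txt_hits(sample_log).items():
    print(agent, count)
```

A non-zero count tells you the file was requested. It tells you nothing about whether the contents influenced an answer.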
None of this means llms.txt is wrong or that it won't be adopted later — conventions sometimes start exactly this way, with publishers ahead of platforms. It means you should calibrate expectations: today, llms.txt is low-cost, low-risk housekeeping, not a proven lever for AI visibility. If publishing one is cheap (a plugin, a docs platform, ten minutes by hand), there's little downside. Just don't expect it, on its own, to get you cited.
So What Actually Gets You Cited by AI?
If llms.txt isn't the switch, what is? The same fundamentals that make content findable and trustworthy in the first place — the work that AI answer engines genuinely rely on, because they retrieve from the same indexed, crawlable web:
- Be crawlable and allow the right bots. Make sure your pages are accessible and that the AI crawlers you want (for example, GPTBot for ChatGPT) aren't blocked in robots.txt. This is the access-control layer llms.txt deliberately doesn't touch.
- Write clean, extractable answers. Put a direct answer near the top, use clear headings and lists, and keep facts self-contained with units and sources — content a model can lift and quote confidently.
- Earn authority. Depth on a topic and references from across the web make you a more likely source for AI answers.
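The crawlability point is the easiest one to verify mechanically. This sketch uses Python's standard-library urllib.robotparser to test whether a given user agent may fetch a URL under a given robots.txt. The is_allowed helper and the sample rules are assumptions for illustration; substitute your own file and the crawler names you care about.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: GPTBot is kept out of /private/ only,
# and every other crawler is kept out of /admin/.
robots = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Disallow: /admin/
"""

print(is_allowed(robots, "GPTBot", "https://acme.example.com/docs/quickstart"))
print(is_allowed(robots, "GPTBot", "https://acme.example.com/private/report"))
```

If the pages you want cited come back as disallowed for the crawlers you care about, fix that first. No llms.txt entry overrides a robots.txt block.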
This is the territory of Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO) — optimizing not just to rank a link, but to be the source an AI quotes. If you're weighing how much of your effort belongs in classic search versus AI answers, our GEO vs SEO guide lays out the split. llms.txt fits inside that bigger picture as one small, optional, publisher-side tactic — useful to have, not a substitute for being genuinely the best, most retrievable answer.
If you want to see where you actually stand, our SEO & GEO solution audits both at once — classic ranking signals and AI-citation readiness — so you spend effort on what moves AI visibility rather than on files engines may never read.
Bottom Line
llms.txt is a proposed convention — a simple Markdown file at /llms.txt — that gives AI assistants a curated, clean-text map of your most important pages. It's easy to create: an H1, a summary, and a few link lists, hand-written or generated by a plugin. It's not robots.txt — it curates rather than controls, and blocks nothing.
The honest status, as of 2026: it's a publisher-led idea that major AI engines have not committed to using, with Google saying outright that it doesn't. Publish one if it's cheap — it's harmless housekeeping and may matter more later — but don't mistake it for a citation lever. The durable way to be found and quoted by AI is still crawlable, well-structured, authoritative content.
Audit your site's AI visibility — see how Google, ChatGPT, and AI Overviews currently represent your content, and get specific fixes for both ranking and citation.
