llms.txt best practices — how to write a file AI systems actually use

A practical DO/DON'T guide based on the spec requirements and what actually helps AI systems use your file effectively.

What makes a good llms.txt?

A well-written llms.txt file does one thing: gives an AI system a reliable, curated shortcut to the pages that matter most on your site. When an agent framework or RAG pipeline reads your file, it should immediately know what your site is about, which pages to load for context, and in what order they matter.

A poorly written file — full of marketing prose, relative URLs, or every page on the site — provides no advantage over a crawl of your sitemap.xml. The value of llms.txt comes from curation and clarity.

What to DO

Use absolute URLs

Every link in your llms.txt must use a full absolute URL. Relative paths are unreliable because the file's content may be passed to a model or pipeline without the URL it was fetched from, which leaves nothing to resolve the path against.

# Correct
- [API reference](https://example.com/docs/api/): complete endpoint documentation.

# Incorrect — relative URL
- [API reference](/docs/api/): complete endpoint documentation.
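
One way to catch relative links before publishing is a small check like the sketch below. It is an illustration only: the function name find_relative_links and the regular expression are assumptions made for this example, not part of the spec, and the pattern covers only simple inline Markdown links.

```python
import re
from urllib.parse import urlparse

# Matches simple Markdown inline links: [text](target)
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def find_relative_links(llms_txt: str) -> list[str]:
    """Return link targets that lack an http(s) scheme or a host."""
    bad = []
    for _text, target in LINK_RE.findall(llms_txt):
        parsed = urlparse(target)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(target)
    return bad


sample = """\
- [API reference](https://example.com/docs/api/): complete endpoint documentation.
- [API reference](/docs/api/): complete endpoint documentation.
"""
print(find_relative_links(sample))  # ['/docs/api/']
```

An empty result means every link the pattern found carries both a protocol and a domain.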

Write a factual blockquote

The optional (but highly recommended) blockquote immediately after your H1 title is the first descriptive text an AI system reads. Use it to define what your site or product is in one to three plain sentences. Think of it as a machine-readable description, not a pitch.

# Acme

> Acme is an open-source inventory management platform for small manufacturers.
> It provides real-time stock tracking, supplier integration, and demand forecasting.
> Documentation covers setup, configuration, and the REST API.

Include your 5–15 most important pages

Curation is the whole point. For most sites, the sweet spot is 5 to 15 links. If you have a small documentation site, fewer links are fine. If you have a large platform with many distinct product areas, you can go higher, but anything above 30 links starts to dilute the signal.

Prioritize: quickstart, core concepts, API reference, authentication, and any pages that answer the questions your users most commonly ask AI assistants about your product.
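
The link-count guidance above can be turned into a rough self-check. The sketch below is only an illustration: the function name link_count_verdict, the verdict strings, and the line pattern are assumptions for this example, and the thresholds simply restate the 5 to 15 and 30-link figures from this guide rather than anything the spec mandates.

```python
import re

# A list item that starts with a Markdown link: "- [text](url)"
LINK_LINE_RE = re.compile(r"^[ \t]*-\s+\[[^\]]+\]\([^)\s]+\)", re.MULTILINE)


def link_count_verdict(llms_txt: str) -> tuple[int, str]:
    """Count link lines and compare the total with this guide's rough thresholds."""
    count = len(LINK_LINE_RE.findall(llms_txt))
    if count > 30:
        verdict = "too many: the curation signal is likely diluted"
    elif 5 <= count <= 15:
        verdict = "within the typical range"
    else:
        verdict = "acceptable: check against the size of the site"
    return count, verdict
```

A result outside the typical range is a prompt to review the list, not an error; a small site with three links can be perfectly well curated.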

Keep the file under a reasonable size

The spec does not mandate a maximum file size, but practical limits matter. A file well under 100 KB can be fetched and parsed efficiently. If your llms.txt grows very large, consider whether some content belongs in llms-full.txt instead.
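
A size check is easy to automate. The following sketch measures the UTF-8 encoded size of the file content. The function name check_size is hypothetical, and reading 100 KB as 100 × 1024 bytes is an assumption; the figure is a practical guideline from this guide, not a limit set by the spec.

```python
def check_size(content: str, limit_bytes: int = 100 * 1024) -> tuple[int, bool]:
    """Return the UTF-8 size in bytes and whether it is under the limit."""
    size = len(content.encode("utf-8"))
    return size, size < limit_bytes


size, ok = check_size("# Acme\n\n> Acme is an inventory management platform.\n")
print(size, ok)
```

Measuring encoded bytes rather than characters matters when the file contains non-ASCII text, since one character can occupy several bytes on the wire.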

Include API docs if you have them

If your product has an API, your API reference is almost certainly the most valuable thing to link. Developers frequently ask AI assistants for help with API calls, and a hallucinated endpoint or parameter is the most frustrating kind of AI error. Pointing to your canonical API reference reduces that risk.

Keep it machine-readable

The spec uses standard Markdown link syntax: - [Link text](URL): description. Do not add HTML, front matter, or non-standard formatting. Keep the structure clean so any Markdown parser can process it correctly.
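
To show how little machinery a clean file needs, here is a minimal sketch of parsing one link line with a regular expression. The names Link, LINE_RE, and parse_link_line are assumptions for this example, and real clients may well use a full Markdown parser instead. Lines containing HTML or other non-standard formatting simply fail to match.

```python
import re
from typing import NamedTuple, Optional


class Link(NamedTuple):
    title: str
    url: str
    description: Optional[str]


# "- [Link text](URL): description" with the description part optional
LINE_RE = re.compile(
    r"^-\s+\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_link_line(line: str) -> Optional[Link]:
    """Parse a single llms.txt list item; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    desc = match.group("desc")
    return Link(match.group("title"), match.group("url"), desc.strip() if desc else None)


print(parse_link_line(
    "- [API reference](https://example.com/docs/api/): complete endpoint documentation."
))
```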

What NOT to do

Don't include auth-gated pages

Links to pages that require login will fail when an AI crawler or agent tries to fetch them. Include only publicly accessible URLs. If your key documentation is behind a login wall, that is worth fixing independently — gated docs are also bad for human discoverability.

Don't put marketing copy in the blockquote

The blockquote is read by AI systems for factual context, not by humans for persuasion. Marketing language ("the industry's leading solution for...") does not help an AI understand what your product is. Factual, specific language does.

Don't use relative URLs

Relative URLs like /docs/api/ are not valid in llms.txt. Always use full absolute URLs including the protocol and domain.

Don't link to every page on the site

llms.txt is not a sitemap. A file with hundreds of links provides almost no curation signal. If an AI client needs to discover all your pages, it has your sitemap.xml for that. llms.txt should answer the question: "if you could only read 10 pages to understand this site, which 10 would they be?"

Don't publish and forget

llms.txt goes stale. If you remove a page you linked to, rename a section, or publish important new documentation, update your llms.txt to reflect the change. A file full of broken links or outdated descriptions actively misleads AI systems about your content.
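
A periodic link check helps catch stale entries. The sketch below is one possible approach using only the Python standard library; the function names http_status and find_unreachable are hypothetical. It assumes a HEAD request is enough to judge reachability, which does not hold for servers that reject HEAD. Because urlopen follows redirects, a page that redirects to a login form can still report 200, so treat the result as a first pass rather than proof.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; 401/403 suggest auth gating.
        return err.code
    except (urllib.error.URLError, TimeoutError):
        return 0


def find_unreachable(
    urls: Iterable[str],
    fetch_status: Callable[[str], int] = http_status,
) -> dict[str, int]:
    """Map each URL that did not return 200 to the status that was observed."""
    problems = {}
    for url in urls:
        status = fetch_status(url)
        if status != 200:
            problems[url] = status
    return problems
```

Passing the fetch function as a parameter keeps the check testable without network access, for example find_unreachable(urls, fetch_status=lambda url: 404).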

Don't include staging or redirect URLs

Link to your canonical production URLs only. Staging URLs may be password-protected, redirect chains add latency, and non-canonical URLs confuse retrieval systems that track URLs as identifiers.

Section naming and structure

The spec defines sections as Markdown H2 headings (##) containing lists of links. Section names are not standardized — you choose them to reflect your content's organization. Common patterns:

  • Documentation or Docs — for documentation sites.
  • API Reference — for products with a public API.
  • Getting started — for onboarding-heavy products.
  • Guides — for tutorial-style content.
  • Blog — for editorial content (place it under Optional if secondary).
  • Optional — the spec-defined section for lower-priority links that a client may choose to skip.

Choose section names that reflect how your content is organized, not SEO keywords. The AI client reading the file understands natural language — use the names that make the structure clear.

Writing good link descriptions

Each link in llms.txt can have an optional description, placed after the link and introduced by a colon:

- [Page title](https://example.com/page/): what this page contains.

Good descriptions are short (under 20 words), factual, and specific about what the page covers. Treat them as micro-abstracts, not promotional blurbs.

  • Good: "Complete reference for all REST API endpoints, including authentication, rate limits, and error codes."
  • Bad: "Our world-class API documentation that will help you build amazing integrations fast."
  • Good: "Step-by-step quickstart for new users, from account creation to first API call."
  • Bad: "Get started today and experience the power of Acme."
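
The length guideline is the only part of this advice that is easy to check mechanically. A minimal sketch, assuming whitespace-separated tokens are a fair approximation of words (the function name description_length is hypothetical):

```python
def description_length(description: str, max_words: int = 20) -> tuple[int, bool]:
    """Return the word count and whether it is non-empty and under max_words."""
    count = len(description.split())
    return count, 0 < count < max_words


good = ("Complete reference for all REST API endpoints, including "
        "authentication, rate limits, and error codes.")
bad = "Get started today and experience the power of Acme."
print(description_length(good))  # (14, True)
print(description_length(bad))   # (9, True)
```

Both examples pass, which shows the limit of the check: word count cannot tell a factual description from a promotional one, so tone still needs a human review.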

The Optional section

The spec defines a special ## Optional section. Links in this section are ones that an AI client may choose to skip if it is operating under context length or latency constraints. Use it for:

  • Blog posts and editorial content that provides context but is not technically essential.
  • Changelog or release notes pages.
  • Secondary language versions of your core pages.
  • FAQs and glossary pages — useful, but not the primary reference material.

The Optional section signals: "this is good to have, but skip it if you're in a hurry." It helps AI clients prioritize without losing the links entirely.
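
From the client side, honoring the Optional section can be sketched as follows. This is an illustration of the idea rather than how any particular agent framework behaves; the function names split_sections and select_links and the sample file are assumptions for this example.

```python
def split_sections(llms_txt: str) -> dict[str, list[str]]:
    """Group list-item lines under the H2 heading that precedes them."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in llms_txt.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.lstrip().startswith("- "):
            sections[current].append(line.strip())
    return sections


def select_links(llms_txt: str, include_optional: bool) -> list[str]:
    """Return link lines in file order, dropping Optional when constrained."""
    selected = []
    for name, links in split_sections(llms_txt).items():
        if name == "Optional" and not include_optional:
            continue
        selected.extend(links)
    return selected


sample = """\
# Acme

> Acme is an open-source inventory management platform for small manufacturers.

## Docs

- [Quickstart](https://example.com/docs/quickstart/): setup from install to first API call.

## Optional

- [Changelog](https://example.com/changelog/): release notes by version.
"""
print(len(select_links(sample, include_optional=False)))  # 1
print(len(select_links(sample, include_optional=True)))   # 2
```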

Keeping it up to date

A simple maintenance habit: every time you publish or retire a major page, check whether your llms.txt should be updated. For larger sites, consider:

  • Auto-generating llms.txt from your navigation structure or CMS at build time.
  • Adding llms.txt review to your content publication checklist.
  • Periodically running the validator to catch broken links.

If your site changes rarely, reviewing llms.txt quarterly is usually sufficient. If you publish frequently, tie it to your deployment pipeline.
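
As a rough illustration of build-time generation, the sketch below renders a file from a simple in-memory structure. The function name render_llms_txt and the shape of the sections argument are assumptions for this example; a real build step would populate it from your own navigation data or CMS export.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body from a title, a summary, and sectioned links.

    Each link is a (text, absolute_url, description) tuple.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for name, links in sections.items():
        lines.append(f"## {name}")
        lines.append("")
        for text, url, description in links:
            lines.append(f"- [{text}]({url}): {description}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


print(render_llms_txt(
    "Acme",
    "Acme is an open-source inventory management platform for small manufacturers.",
    {"Docs": [("Quickstart", "https://example.com/docs/quickstart/",
               "setup from install to first API call.")]},
))
```

Generating the file from the same source as your navigation keeps the two in step, but the curated selection of which pages to include still has to be made by a person.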
