If you already know what llms.txt is, you know it sits at your domain root and gives large language models a curated map of your site’s most important content. (If you are still getting up to speed, the full breakdown is here.) The more pressing question, the one most brands are not yet asking, is whether the file they have published is actually doing anything useful.
Having an llms.txt is not the same as having a good one. A file that lists every page on your site, uses vague descriptions, or points to outdated content is not helping AI systems understand your brand. It is adding noise. The file only earns its keep when it is curated, accurate, and structured in a way that makes an AI agent’s job genuinely easier.
This guide is about that gap. Not what the file is, but how to make it work: what to include, how to write the descriptions that actually influence how LLMs interpret your brand, how to handle the technical side without overcomplicating it, and how to keep the file useful as your site evolves. Google’s AI optimisation guidance, Chrome’s agentic browsing documentation, and the practical patterns emerging from real implementation all point in the same direction. Getting this right is low-effort compared to most generative engine optimisation (GEO) work, and the compounding effect on AI visibility is real.
The Difference Between a Functional llms.txt and a Useful One
Most implementations treat llms.txt as a technical checkbox. You grab a list of URLs, paste them into a markdown file, and publish it. Done. The problem is that AI agents reading that file get almost nothing to work with beyond a link inventory. No context, no hierarchy, no signal about what your brand actually does or which pages represent your strongest thinking.
A useful llms.txt file does something the raw HTML of your site often cannot: it speaks directly to the model in the format models process best. Clean markdown, explicit descriptions, logical groupings. When an AI agent lands on your site with a specific task, whether it is answering a user’s comparison query, generating a product recommendation, or summarising your service offering, it now has an editorially curated starting point rather than a crawl problem to solve.
The distinction becomes clearest when you think about what happens in a live AI query. The model is not patiently indexing your site over multiple visits. It is trying to form an accurate picture of your brand in a single interaction, often with limited time and context. A well-structured llms.txt compresses that learning curve dramatically. A poorly structured one wastes the opportunity entirely.
Three things separate a functional file from a genuinely useful one: the quality of the descriptions you write for each page, the editorial discipline you apply in deciding what to include and exclude, and whether the file’s structure reflects how you want AI systems to understand your brand rather than just how your site navigation happens to be organised.
See Google’s full guidelines here.
Writing Descriptions That Actually Work
The descriptions you attach to each page link are where most implementations fall short. This is the single highest-leverage part of the file, and it gets treated as an afterthought.
Descriptions that do nothing: “Our pricing page.” “About us.” “Blog post on technical SEO.” These tell an AI model exactly what the URL already suggests. The model gains no information it could not infer from the slug.
Descriptions that work give the model something it cannot get from the URL alone. They describe the specific value of the page, the audience it serves, the question it answers, and the type of content it contains. A pricing page description that reads “Comparison of three service tiers including feature breakdowns, team-size guidance, and a calculator for estimating monthly investment” gives a model the context to cite that page accurately when a user asks about your costs. The difference in how that page gets used in AI-generated answers is not subtle.
A few principles to write by:
- Describe the page as you would brief a researcher who has never seen your site. Be specific about what the user will find there, not what the page is called. If the page contains original data, case study results, or a proprietary framework, say so explicitly. AI systems weight original evidence more heavily than generic content, and your description is one of the clearest signals you can give that this page contains something worth citing.
- Keep descriptions to two or three sentences. Long descriptions introduce ambiguity. The goal is to compress the essence of the page into a form the model can immediately categorise and retrieve. One sentence often works for straightforward pages. Use the second and third only when the page covers multiple distinct topics or when extra context makes its content significantly more valuable.
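The two principles above can be turned into a rough pre-publish check. The sketch below is only an illustration: the function names, the filler-word list, and the "fewer than three new words" threshold are all assumptions, not an established rule. It flags descriptions that run past three sentences or that say little more than the URL slug already does.

```python
import re
from urllib.parse import urlparse

MAX_SENTENCES = 3  # the guideline above: two or three sentences at most

# Hypothetical filler list: words that carry no information about the page.
FILLER = {"our", "the", "a", "an", "page", "post", "on", "of", "us", "blog"}


def slug_words(url: str) -> set[str]:
    """Lower-cased words taken from the last path segment of the URL."""
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    return set(re.findall(r"[a-z0-9]+", last_segment.lower()))


def lint_description(url: str, description: str) -> list[str]:
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []

    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", description.strip()) if s]
    if len(sentences) > MAX_SENTENCES:
        warnings.append(f"too long: {len(sentences)} sentences")

    # Assumed heuristic: drop filler words, then see what remains once the
    # words already present in the slug are removed.
    words = set(re.findall(r"[a-z0-9]+", description.lower())) - FILLER
    new_words = words - slug_words(url)
    if len(new_words) < 3:
        warnings.append("adds little beyond the URL slug")
    return warnings
```

Run against the examples from this section, "Our pricing page." on a `/pricing` URL is flagged, while the longer tier-comparison description passes. A check like this catches only the obvious failures; whether a description is actually specific still needs a human read.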
What to Include, What to Leave Out
The editorial discipline here matters as much as the descriptions. An llms.txt that lists everything is not a curated map; it is a dump. And the value of the file scales with how well it signals your priorities, not how comprehensively it documents your archive.
Your cornerstone content belongs here: the comprehensive guides that define your topical authority, the original research your audience cites, the comparison or product pages that drive your most valuable conversions, and the service pages that explain what you actually do. These are the pages you would choose if you could only give an AI model ten URLs to understand your brand.
Blog posts with meaningful original data or case study findings belong. Blog posts that are thin, dated, or covered more thoroughly by a newer piece do not. If you have consolidated topics across multiple articles, include the canonical version. Including both the older and the newer piece on the same topic tells the model you have duplication, not depth.
Pages to leave out: anything thin, anything outdated, anything that would not represent your brand well if an AI cited it directly in a response. Your careers page, your cookie policy, your old campaign landing pages from three years ago. The test is simple: if an AI cited this page to answer a user query, would you be pleased with that? If not, the page has no place in the file.
For sites with large content libraries, a common approach is to separate cornerstone content from broader topical coverage. The main llms.txt stays tightly curated. An llms-full.txt file provides expanded coverage for AI agents that want deeper context. This two-tier structure keeps the primary signal clean while giving well-resourced crawlers more to work with.
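As a minimal sketch of that two-tier split, the code below partitions a page inventory into a curated list for llms.txt and an expanded list for llms-full.txt. The `Page` fields (`cornerstone`, `outdated`, `superseded_by`) are hypothetical editorial flags you would maintain yourself, and the choice to repeat the curated pages at the top of the expanded list is an assumption rather than a requirement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    title: str
    cornerstone: bool = False            # hypothetical editorial flag
    outdated: bool = False               # hypothetical: thin, dated, or retired
    superseded_by: Optional[str] = None  # hypothetical: URL of the canonical piece


def split_tiers(pages: list[Page]) -> tuple[list[Page], list[Page]]:
    """Return (pages for llms.txt, pages for llms-full.txt)."""
    # Drop anything outdated or replaced by a newer canonical piece.
    eligible = [p for p in pages if not p.outdated and p.superseded_by is None]
    curated = [p for p in eligible if p.cornerstone]
    # Assumption: the expanded file lists the curated set first, then the rest.
    expanded = curated + [p for p in eligible if not p.cornerstone]
    return curated, expanded
```

The flags do the editorial work here; the code only enforces them consistently, so a superseded article cannot slip into either file alongside its canonical replacement.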
Structuring the File for How AI Systems Read It
The organisation of your llms.txt shapes how models interpret the relationship between your pages. Group by topic rather than by page type. A section called “Technical SEO Guides” communicates something different from a section called “Blog Posts.” The first tells a model about the subject matter expertise the pages represent. The second tells it almost nothing.
Open the file with a clear brand description. One short paragraph that answers three questions: what does this company do, who does it serve, and what makes its content authoritative. Write this as you would brief a journalist, not as you would write an About Us paragraph. Specific, factual, and free of adjectives that substitute for information.
From there, organise your sections to mirror the topical clusters your brand owns. A B2B SaaS company might organise around product capabilities, integration documentation, case studies by industry vertical, and pricing information. A digital marketing agency might separate strategy guides from platform-specific content from measurement and analytics resources. The structure should tell a model: this is the topical map of this brand’s expertise.
Ordering matters within sections. Put your highest-authority, most comprehensive pages first. AI agents that time out or truncate their context window mid-file should leave having encountered your best content, not your most recent posts.
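The structure described in this section (brand description first, topic-based sections, strongest pages at the top of each) could be generated from a small data structure rather than edited by hand. This is a sketch under assumptions: `render_llms_txt` and its argument shapes are illustrative names, and the output follows the common title, blockquote, and section layout.

```python
def render_llms_txt(
    brand: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render llms.txt markdown: H1 title, blockquote summary, H2 topic sections.

    `sections` maps a topic name to (link title, URL, description) tuples.
    """
    lines = [f"# {brand}", "", f"> {summary}", ""]
    for topic, links in sections.items():
        lines.append(f"## {topic}")
        lines.append("")
        # Links are emitted in the order given, so list the strongest page first.
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

Because Python dictionaries keep insertion order, both the section order and the link order in the output match the order you supply, which keeps the "best content first" decision explicit in the input data.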
A live example can be found here, in our own llms.txt file, and below:

    # Title

    > Optional description goes here

    Optional details go here

    ## Section name

    - [Link title](https://link_url): Optional link details

    ## Optional

    - [Link title](https://link_url)

Getting the File Live: Technical Basics
Place the file at your domain root at yourdomain.com/llms.txt. It should return a 200 status code. Serve it with a text/plain or text/markdown content type. No subdirectories, no date-versioning in the URL.
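A quick way to confirm those basics is to check the status code and content type of the response. The sketch below uses only the Python standard library; the function names are illustrative, and `check_live` needs network access and follows redirects, so it reports on the final response rather than the first hop.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def check_response(status: int, content_type: str) -> list[str]:
    """Return a list of problems; an empty list means the basics look right."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    return problems


def check_live(domain: str) -> list[str]:
    """Fetch https://<domain>/llms.txt and check it. Requires network access."""
    url = f"https://{domain}/llms.txt"
    try:
        with urlopen(url, timeout=10) as response:
            return check_response(response.status, response.headers.get("Content-Type", ""))
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; report the status instead of crashing.
        return check_response(err.code, err.headers.get("Content-Type", ""))
```

Calling `check_live("yourdomain.com")` after each deployment is enough to catch the common failure where the file is served as an HTML error page or from the wrong path.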
For WordPress sites, uploading directly to the root via FTP is the most reliable approach. Astro, Next.js, and most modern frameworks support static files placed in a public directory that maps to root on deployment. For headless setups, add llms.txt to your deployment pipeline so it updates when you make significant content changes rather than sitting as a static afterthought.
Set a cache-control header with a max-age of 24 hours or less. AI crawlers can revisit more frequently than traditional search bots during active content ingestion periods, and serving a stale file means your most recently published cornerstone content does not reach the model’s understanding of your site until the cache clears.
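For reference, a header such as `Cache-Control: public, max-age=86400` meets the 24-hour guideline. The following sketch checks a header value against that limit; the function names are illustrative, and treating a missing `max-age` as a failure is an assumption made so that such cases get a manual look.

```python
import re
from typing import Optional

ONE_DAY_SECONDS = 24 * 60 * 60


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if the directive is absent."""
    # Match "max-age" only at the start or after a comma or space,
    # so that the separate "s-maxage" directive is not picked up.
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control.lower())
    return int(match.group(1)) if match else None


def cache_is_fresh_enough(cache_control: str) -> bool:
    """True when max-age is present and no longer than 24 hours."""
    age = max_age_seconds(cache_control)
    return age is not None and age <= ONE_DAY_SECONDS
```

The header value itself can be read from the same response used to check the status code and content type, so both checks fit in one post-deployment step.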
If you want to generate a well-structured file without building it manually from scratch, Brainz Digital’s free llms.txt generator takes your site URL and produces a structured, AI-ready file automatically.
It is the fastest path from zero to a properly formatted file, and it eliminates the most common technical errors that make implementations invisible to crawlers.

Maintaining the File as Your Site Evolves
A file set up in January and ignored is worse than no file at all by December. If your most-cited pages change, if you retire a product line, if you publish original research that becomes your most authoritative piece, an outdated llms.txt is actively pointing AI systems toward a stale picture of your brand.
Build review into your content calendar rather than treating it as a separate technical task. Every time you publish a piece significant enough to rank for a primary keyword, ask whether it belongs in the file. Every time you consolidate or retire content, remove the affected entries. The maintenance load is genuinely low if you handle it incrementally rather than as a quarterly audit.
Some teams automate regeneration from a maintained list of priority URLs. Others keep a simple internal doc of cornerstone content and refresh the file when that list changes. Either approach works. The failure mode is neglect.
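Whichever approach you choose, neglect is easier to catch if drift is checked automatically. As a sketch, assuming you keep a set of priority URLs somewhere (the names below are hypothetical), this compares that set with the links currently in the published file and reports what is missing and what has gone stale.

```python
import re

# Matches markdown list items of the form "- [Title](https://...)".
LINK_PATTERN = re.compile(r"^\s*[-*]\s*\[[^\]]*\]\(([^)\s]+)\)", re.MULTILINE)


def listed_urls(llms_txt: str) -> set[str]:
    """Extract every URL that appears as a list-item link in the file."""
    return set(LINK_PATTERN.findall(llms_txt))


def diff_against_priorities(llms_txt: str, priority_urls: set[str]) -> dict[str, list[str]]:
    """Compare the published file with the maintained priority list."""
    listed = listed_urls(llms_txt)
    return {
        "missing_from_file": sorted(priority_urls - listed),
        "no_longer_priority": sorted(listed - priority_urls),
    }
```

Running this whenever the priority list changes turns the review into a short diff to act on rather than a full audit of the file.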
llms.txt Inside Your Broader GEO Strategy
The file does not work in isolation. An llms.txt pointing to weak, vague, or poorly structured content does not improve your AI search visibility. It helps AI systems find your weak content faster. The file amplifies whatever is already there, which means the prerequisite is producing content that is specific, entity-rich, and authoritative on your core topics before you worry about how it is declared.
Think of llms.txt as the last mile of your GEO infrastructure. You do the foundational work: building topical authority, structuring content clearly, creating original evidence that gives AI systems something worth citing. The llms.txt file is how you tell those systems where that work lives, in a format they can process in a single pass rather than having to reconstruct from a crawl.
Where it genuinely shifts outcomes is in the comparison with competitors at similar content quality levels. If you and a direct competitor have equally strong cornerstone content, but you have given AI systems a clean editorial map and they have not, your content gets retrieved more consistently and described more accurately. That margin is not dramatic in the short term. Over time, as AI-driven discovery becomes a larger share of how users find brands in your category, it compounds.
At Brainz Digital, the work of building AI-ready content and the work of making that content discoverable to AI systems are treated as the same problem. llms.txt is one part of that infrastructure, alongside schema markup, entity consistency, and the broader positioning work of making sure models understand your brand the way you want them to, not the way your oldest ranking content happens to describe it.
Measuring Whether It Is Working
Direct attribution is not yet clean. No major AI platform publishes citation data with the granularity of Search Console. What you can do is track directional change over time and run manual audits to see whether AI systems’ understanding of your brand has shifted.
Before publishing or significantly updating your file, prompt ChatGPT, Perplexity, and Bing Copilot to describe your company and its main services. Note the gaps, the inaccuracies, the omissions. After several weeks, run the same prompts. The delta between those two sessions is your signal. Improved accuracy in how AI tools describe your brand, fewer gaps, and more specific citations of your actual content tell you the file is doing its job.
On the analytics side, Perplexity shows up as an identifiable referrer in GA4. Bing’s AI products produce distinct traffic signatures. ChatGPT Browse is harder to isolate but produces measurable session patterns. None of this is a clean controlled test, but directional improvement across multiple platforms after a well-constructed file goes live is a meaningful indicator.
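If you export referrer URLs from your analytics, a small classifier makes the directional trend easier to track. The hostnames in this sketch are assumptions based on the platforms’ public domains, not a verified list, so check them against what actually appears in your own GA4 referrer report; sessions that arrive without a referrer will not be counted at all.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Assumed hostnames; verify against your own referrer data.
AI_REFERRER_HOSTS = {
    "perplexity.ai": "Perplexity",
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "copilot.microsoft.com": "Copilot",
}


def classify_referrer(referrer: str) -> Optional[str]:
    """Map a full referrer URL to an AI platform name, or None if unknown."""
    # urlparse only finds a hostname when the URL includes a scheme.
    host = (urlparse(referrer).hostname or "").lower()
    for known, platform in AI_REFERRER_HOSTS.items():
        if host == known or host.endswith("." + known):
            return platform
    return None


def count_ai_sessions(referrers: Iterable[str]) -> Counter:
    """Count sessions per AI platform, ignoring all other referrers."""
    return Counter(p for p in map(classify_referrer, referrers) if p)
```

Comparing these counts for the weeks before and after the file goes live gives a rough directional signal, not a controlled measurement.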
The brands building this infrastructure now are not waiting for perfect tooling. They are acting on directional evidence, iterating as the measurement landscape catches up, and accumulating visibility that compounds before their category becomes contested territory in AI-generated answers.
If you need help with AI search and discovery, or advice on optimising your llms.txt, let us know; we would be happy to help out.