We’re entering a new phase of the web, one where your audience isn’t just people clicking around a browser. It’s also language models crawling your content, distilling it, and regurgitating it in chats, summaries, snippets, and who knows what else.
And yet, most websites are still designed for people and Google's crawlers.
That’s why we built an open-source tool to generate llms.txt files: the first step to making your site legible, and useful, in a world where LLMs are the new search engine.
A brief history of llms.txt
The concept of llms.txt was introduced by Jeremy Howard, co-founder of Answer.AI, to address a specific technical challenge: AI systems have limited context windows, which makes it difficult for them to take in large documentation sites. Traditional SEO techniques are optimized for search crawlers rather than reasoning engines, so they don’t address this limitation. When AI systems process HTML pages directly, they get bogged down with navigation elements, JavaScript, CSS, and other non-essential markup that eats into the space available for actual content. llms.txt addresses that by giving the AI the information it needs in a format it can read easily.
In November 2024, Mintlify added llms.txt support to their docs platform. In one move, they made the docs of thousands of dev tools, including Anthropic and Cursor, LLM-friendly. Anthropic and others quickly posted on X about their llms.txt support. More Mintlify-hosted docs joined in, creating a wave of visibility for the proposed standard.
The momentum sparked new community sites and tools. @ifox created a directory to index LLM-friendly technical docs. @screenfluent followed shortly with another directory. Mot, who made dotenvx, built and shared an open-source generator tool for dotenvx’s docs site. Eric Ciarla of Firecrawl created a tool that scrapes your website and creates the file for you.
Why llms.txt matters
We’ve had robots.txt for decades. It tells search engines what they can and can’t crawl. But that’s where the guidance ends. Once a page is crawled, you have very little control over how it’s interpreted. Titles, descriptions, context: all of it gets inferred, often badly.
Enter llms.txt. It’s a simple idea: give LLMs a structured, curated list of your best content. Tell them what it’s about. Make it easier for them to answer questions with accurate, up-to-date information pulled directly from your site.
Just like a sitemap helps search engines navigate your site, an llms.txt file helps language models understand what each page is, why it exists, and when to use it.
And we wanted to make it as easy as possible to get started.
Introducing the llms.txt Generator
We built the llms.txt Generator to help marketers, developers, or product teams spin up a clean, structured file in seconds. No manual copy-pasting, no guessing.
Here’s what it does:
1. Input your URLs or sitemap
You can start with one or more URLs, or point the tool at your sitemap.xml file directly. This is ideal if you already have a sitemap configured, because it saves a ton of work.
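To make the sitemap route concrete, here is a minimal sketch of pulling page URLs out of a sitemap.xml document. It is not the generator’s actual code: the function name `urls_from_sitemap` and the example.com sitemap are assumptions made for illustration, and it only handles a plain `<urlset>` file, not a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


# Hypothetical sitemap content, used only for illustration.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/articles/hello-world</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE_SITEMAP):
        print(url)
```

A list of hand-picked URLs can skip this step entirely and go straight into the same pipeline.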
2. Parse and discover your pages
The tool crawls the sitemap and parses every URL. For transparency, it logs each page as it goes, so you can follow along and see what’s being picked up.
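As a rough illustration of the parse-and-log step, the sketch below reads a page’s title and meta description and logs each page as it is processed. Everything here is an assumption rather than the tool’s real implementation: the names `PageMetaParser` and `describe_page` are hypothetical, the choice of title plus meta description is a guess at what is useful to extract, and the HTML is passed in as a string so that fetching stays out of the picture.

```python
import logging
from html.parser import HTMLParser

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("llms_txt_sketch")  # hypothetical logger name


class PageMetaParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is kept; body text is ignored.
        if self._in_title:
            self.title += data


def describe_page(url: str, html: str) -> dict[str, str]:
    """Parse one page's HTML and log it so the run can be followed."""
    parser = PageMetaParser()
    parser.feed(html)
    page = {
        "url": url,
        "title": parser.title.strip(),
        "description": parser.description,
    }
    log.info("parsed %s (%s)", url, page["title"] or "untitled")
    return page
```

Logging one line per page keeps the run transparent: if a page is missing from the final file, the log shows whether it was ever picked up.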
3. Generate a structured file
We break the content into logical sections:
A ## Website section for pages at the root level (like /home, /about, /contact)
Subsections based on your URL structure. For example, everything under /articles/ goes into ## Articles
We avoid clutter by only creating sections for the base paths, not every subpage
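The sectioning rules above can be sketched in a few lines. This is one possible reading of them, not the generator’s source: `section_for` and `group_by_section` are hypothetical names, and turning a path segment into a heading by title-casing it (so `/case-studies/` becomes "Case Studies") is an assumption. Treating a bare index page such as `/articles` as a root-level page is also a choice made here.

```python
from collections import defaultdict
from urllib.parse import urlparse


def section_for(url: str) -> str:
    """Map a URL to a section heading based on its first path segment."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Root-level pages such as /, /about or /contact go under "Website".
    if len(segments) <= 1:
        return "Website"
    # Deeper pages are grouped by base path only: /articles/x/y -> "Articles".
    return segments[0].replace("-", " ").replace("_", " ").title()


def group_by_section(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs under their section heading, keeping first-seen order."""
    sections: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        sections[section_for(url)].append(url)
    return dict(sections)
```

Because only the first path segment is used, a deeply nested page still lands in its base section instead of spawning a heading of its own.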
Each line in the file follows a simple format:
This gives language models an easy way to understand not just where to find the content, but what it is and why it matters.
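The exact line format isn’t reproduced here, so the sketch below assumes the link-list convention from the llms.txt proposal: a markdown list item with a linked title, optionally followed by a colon and a short description. The helper name `format_entry` is hypothetical, and falling back to the URL when a page has no title is a choice made for this example.

```python
def format_entry(title: str, url: str, description: str = "") -> str:
    """Render one page as a markdown list item: - [Title](URL): description."""
    # Collapse stray whitespace; fall back to the URL if the page has no title.
    title = " ".join(title.split()) or url
    description = " ".join(description.split())
    line = f"- [{title}]({url})"
    return f"{line}: {description}" if description else line


if __name__ == "__main__":
    print(format_entry("About", "https://example.com/about", "Who we are."))
```

Keeping the description optional means a page with no usable summary still gets a valid entry, just a shorter one.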
4. Clean, readable output
Here’s a sample of what the file might look like:
Drop that file at the root of your website and boom. You’re now speaking the same language as the tools parsing your site.
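For a sense of how the pieces come together, here is a hedged sketch of assembling the sections into one file and writing it to the site root. The top-level title and one-line blockquote summary follow the llms.txt proposal rather than anything confirmed about this generator, and `render_llms_txt`, `publish`, and the `web_root` directory are all hypothetical.

```python
from pathlib import Path


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[str]]) -> str:
    """Join a title, a short summary and pre-formatted entries into one file."""
    parts = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, entries in sections.items():
        parts.append(f"## {heading}")
        parts.append("")
        parts.extend(entries)
        parts.append("")
    # Exactly one trailing newline, no matter how many sections there were.
    return "\n".join(parts).rstrip() + "\n"


def publish(text: str, web_root: Path) -> Path:
    """Write the file so it is served from the site root as /llms.txt."""
    target = web_root / "llms.txt"
    target.write_text(text, encoding="utf-8")
    return target
```

On a static site, `web_root` would typically be the public or build output folder, so the file ends up reachable at `/llms.txt`.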
Why we made it open source
To be honest, it wasn’t rocket science to build. It’s not the kind of thing that’ll differentiate Released or move the needle on growth. But if it saves even 100 people a few hours of tedious work, that’s enough to give us the warm and fuzzies.
So we’ve made the generator fully open source. Use it, fork it, improve it. We’ll be maintaining it and making it better over time, but the core idea is this: the more sites that publish llms.txt files, the better the ecosystem becomes.
LLMs are only as good as the data they’re trained on and the data they retrieve. By making that data clearer, we raise the bar for everyone.
A new layer of visibility
When someone searches for your site on Google, they get a few links and a snippet. When they ask ChatGPT or another assistant, they might get a summary with no attribution at all. Or worse, a hallucination.
llms.txt gives you a chance to fix that. It gives LLMs a clearer picture of what your content is, which parts matter most, and how they should describe it.
It won’t solve every problem. But it’s a low-effort, high-impact way to take back a bit of control.
How to get started
1. Clone the project from GitHub
2. Run it locally or on your server
3. Point it at your site or sitemap
4. Generate and publish the file
You’ll find setup instructions, usage examples, and contribution guidelines all in the repo.
→ Check it out on GitHub ←
We built this because we needed it for ourselves. But we made it public because everyone should be thinking about how LLMs see their site.