Feature deep-dive
llms.txt & AI ingestion surface
LLMs ingest your content through different surfaces than browsers do. Mentionwell automatically ships every LLM-ingestion surface in common use today on every site: llms.txt, llms-full.txt, per-page .md mirrors, RSS, JSON Feed, sitemaps, and an explicit AI crawler allowlist.
What gets generated per site
- /llms.txt — llmstxt.org-spec executive summary of the site.
- /llms-full.txt — exhaustive deep-reference version.
- /sitemap.xml — classic sitemap.
- /sitemap-llms.xml — curated AI-priority sitemap.
- /feed.xml — RSS.
- /feed.json — JSON Feed.
- /{slug}.md — markdown mirror per article (and per page on the marketing site).
- /robots.txt — explicit allowlist for GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, etc.
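As a rough illustration of the last item, here is a minimal sketch of how an explicit AI crawler allowlist could be rendered into robots.txt. The function name, the crawler list constant, and the example.com URLs are hypothetical and chosen for illustration; this is not Mentionwell's actual implementation, and the real file may contain additional rules.

```python
# Hypothetical sketch: render a robots.txt that explicitly allows named AI crawlers.
# The user-agent tokens come from the list above; everything else is an assumption.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def render_robots_txt(crawlers, sitemap_urls):
    """Return robots.txt text with one allow-all group per crawler, then sitemap lines."""
    lines = []
    for agent in crawlers:
        # Each crawler gets its own group so the allow rule is explicit per user agent.
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    # Sitemap lines are independent of any user-agent group.
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(
        AI_CRAWLERS,
        ["https://example.com/sitemap.xml", "https://example.com/sitemap-llms.xml"],
    ))
```

Giving each crawler its own group keeps the allowlist readable and makes it easy to add or remove a single user agent without touching the others.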
Why it matters
Generative engines read your site differently from search bots. RSS and JSON Feed give them a clean signal of fresh content. The .md mirrors let them ingest answer text without parsing HTML. llms.txt gives them a curated map of the site. The robots.txt allowlist states explicitly which AI crawlers are welcome. Without these, you’re relying on each engine to infer your site’s structure on its own, which it usually does badly.
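To make the "curated map" idea concrete, the sketch below renders a small llms.txt in the shape the llmstxt.org proposal describes: an H1 with the site name, a blockquote summary, and H2 sections containing markdown link lists. The function name, its parameters, and the sample URLs are hypothetical; treat this as an illustration of the format under those assumptions, not as the generator Mentionwell uses.

```python
# Hypothetical sketch: build an llms.txt document in the llmstxt.org layout.
def render_llms_txt(site_name, summary, sections):
    """sections maps a section heading to a list of (title, url, note) tuples."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                # The note after the colon is optional descriptive text for the link.
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example Site",
        "A demo site used to illustrate the llms.txt layout.",
        {"Articles": [("Hello", "https://example.com/hello.md", "First post")]},
    ))
```

Pointing the links at the per-page .md mirrors, as in the sample call, lets a model follow the map straight to clean markdown rather than HTML.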