How robots.txt differs from AI Crawler, LLMO, and llms.txt
robots.txt is crawl policy: it tells crawlers which paths they may fetch. llms.txt is ingest-friendly context: a Markdown overview of the site written for models to read. Both live at the site root, but they answer different questions.
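A minimal sketch of how the two files might sit side by side at a site root; the domain, paths, and contents below are placeholders, not a required template. First the crawl policy:

```
# https://example.com/robots.txt
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /drafts/
```

Then the Markdown overview that the llms.txt proposal describes (an H1 title, a blockquote summary, and link lists):

```
# Example Site
> One-paragraph summary of what the site covers, written for models to read.

## Key pages
- [Docs](https://example.com/docs): product documentation
- [Pricing](https://example.com/pricing): plans and limits
```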
How Mentionwell handles robots.txt
- Default robots.txt explicitly allows 15+ named AI crawlers, with no ambiguous wildcards (see the sketch after this list).
- Per-domain customization so site owners can opt out of any specific crawler.
- Allowlist published as part of the LLMO setup, alongside llms.txt and Markdown mirrors.
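An illustrative fragment of what such an allowlist can look like. The specific bots and rules here are examples for the sake of the sketch, not Mentionwell's actual default file:

```
# Named AI crawlers, each with its own explicit rule
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Opting a single crawler out stays a one-line change per domain
User-agent: CCBot
Disallow: /

# Everything else falls back to the general policy
User-agent: *
Allow: /
```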
Frequently asked questions about robots.txt
How do I allow or block AI crawlers in robots.txt?
Use named user-agent directives (User-agent: GPTBot, User-agent: ClaudeBot, and so on) with an explicit Allow or Disallow rule for each one. A bare wildcard (User-agent: *) applies to every bot and can't distinguish AI crawlers from ordinary search crawlers, so wildcards alone don't reliably control them.
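A short example of per-bot rules; which crawlers to allow or block is entirely the site owner's choice, and the paths below are placeholders:

```
# Allow OpenAI's crawler site-wide
User-agent: GPTBot
Allow: /

# Block Anthropic's crawler entirely
User-agent: ClaudeBot
Disallow: /

# Block a single directory for Common Crawl's bot
User-agent: CCBot
Disallow: /internal/
```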
Does robots.txt control AI training?
It controls crawling by AI bots that honor it, and most major ones do. It does not restrict the downstream use or redistribution of data that has already been trained on, and it carries no legal force; it is an industry convention.
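If the goal is specifically to stay out of model training while remaining visible in search, one common pattern is to name the training-related agents separately; this assumes the bots in question continue to honor robots.txt:

```
# Opt out of OpenAI model training
User-agent: GPTBot
Disallow: /

# Opt out of Google AI training via the Google-Extended token
User-agent: Google-Extended
Disallow: /

# Ordinary search indexing stays allowed
User-agent: Googlebot
Allow: /
```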
Ship robots.txt-optimized articles automatically
Mentionwell handles robots.txt on every published article — alongside the other six optimization targets in this glossary — so you don't have to think about it per post. Drop a domain, approve the first headline, watch the pipeline run.