§ 31 · Glossary · robots.txt

robots.txt for AI: crawl policy in the AI era.

robots.txt is the plain-text file served at /robots.txt that tells crawlers which paths they may or may not fetch. In the AI era, it is also the primary opt-in/opt-out signal for AI training and retrieval: each major AI vendor honors rules addressed to its own user-agent token (GPTBot, ClaudeBot, PerplexityBot, Google-Extended).

Updated 2026-05-01 · 2 FAQs

How robots.txt differs from AI Crawler, LLMO, llms.txt

robots.txt is crawl policy. llms.txt is ingest-friendly context (a Markdown overview of the site). Both live at the root; they answer different questions.

How Mentionwell handles robots.txt

  • Default robots.txt explicitly allows 15+ named AI crawlers — no ambiguous wildcards.
  • Per-domain customization so site owners can opt out of any specific crawler.
  • Allowlist published as part of the LLMO setup, alongside llms.txt and Markdown mirrors.
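
As a rough illustration of the allowlist-plus-opt-out idea in the list above, the sketch below renders a robots.txt with one explicit group per named crawler. It is not Mentionwell's actual implementation: the names DEFAULT_AI_CRAWLERS, render_robots_txt, and opted_out are hypothetical, and the four crawlers listed are only the ones named on this page, not the full default list.

```python
# Hypothetical allowlist for illustration; a real default would name more crawlers.
DEFAULT_AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def render_robots_txt(crawlers, opted_out=()):
    """Return robots.txt text with one explicit group per named crawler.

    Crawlers listed in `opted_out` get "Disallow: /"; all others get "Allow: /".
    """
    blocked = {name.lower() for name in opted_out}
    groups = []
    for name in crawlers:
        rule = "Disallow: /" if name.lower() in blocked else "Allow: /"
        groups.append(f"User-agent: {name}\n{rule}\n")
    # Each group ends with a newline, so joining with "\n" leaves a blank
    # line between groups, which is the conventional group separator.
    return "\n".join(groups)


# Example: a site owner opts out of GPTBot only.
robots_txt = render_robots_txt(DEFAULT_AI_CRAWLERS, opted_out=["GPTBot"])
print(robots_txt)
```

Writing one group per crawler keeps each decision visible in the file, so an opt-out changes a single rule and leaves every other crawler's group untouched.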

Frequently asked questions about robots.txt

How do I allow or block AI crawlers in robots.txt?

Use named user-agent groups, such as User-agent: GPTBot or User-agent: ClaudeBot, each with explicit Allow or Disallow rules. A User-agent: * group applies only to crawlers that have no named group of their own, so wildcards alone give you no per-crawler control.
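
To see how named groups and the wildcard interact, the following minimal sketch parses a sample file with Python's standard-library urllib.robotparser and asks what each crawler may fetch. The sample rules, the example.com URLs, and the helper name build_parser are assumptions made for illustration; real crawlers use their own parsers, which may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block GPTBot, allow ClaudeBot, and give every
# other crawler a wildcard group that only protects /private/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
URL = "https://example.com/blog/post"

# Named groups take precedence over the wildcard group.
print(parser.can_fetch("GPTBot", URL))         # False
print(parser.can_fetch("ClaudeBot", URL))      # True

# PerplexityBot has no named group here, so it falls back to "*".
print(parser.can_fetch("PerplexityBot", URL))  # True
print(parser.can_fetch("PerplexityBot", "https://example.com/private/x"))  # False
```

A check like this can run before deploying a new robots.txt to confirm that each named crawler gets the decision you intended.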

Does robots.txt control AI training?

It controls the AI crawlers that honor it, which most major ones do. It does not reach content that has already been collected or trained on, or how that data is redistributed downstream, and it has no legal force; it is an industry convention.

Ship robots.txt-optimized articles automatically

Mentionwell handles robots.txt on every published article — alongside the other six optimization targets in this glossary — so you don't have to think about it per post. Drop a domain, approve the first headline, watch the pipeline run.

§ · The rest of the alphabet

robots.txt is one of 34 terms in the AI search vocabulary. Mentionwell optimizes for all of them on every article. Browse the full glossary →

§ 06 · Launch

Drop a domain.
Get cited.

Mentionwell opens to early access on June 1. Headless on any framework. No CMS migration.

EARLY ACCESS · INVITE-ONLY AT LAUNCH