§ 30 · Glossary · AI Crawler

AI Crawler / Bot: GPTBot, ClaudeBot, PerplexityBot, and friends.

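An AI crawler is a bot operated by an AI company to ingest web content for training, retrieval, or live grounding. The major ones (GPTBot, OAI-SearchBot, ClaudeBot, anthropic-ai, PerplexityBot, Google-Extended, Applebot-Extended) are each addressed by their own user-agent token in robots.txt. Most also send that token in their HTTP User-Agent header; Google-Extended and Applebot-Extended are control tokens only, with the fetching done by Google's and Apple's existing crawlers.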

Updated 2026-05-01 · 2 FAQs

How AI Crawler differs from LLMO and robots.txt

Classic search crawlers (Googlebot, Bingbot) feed ranking systems. AI crawlers feed training corpora and retrieval indexes. A site can be visible to one and invisible to the other, because robots.txt rules are written per user-agent and each crawler follows the group that matches its own token.
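As a minimal sketch of that separation, the snippet below uses Python's standard `urllib.robotparser` to evaluate a small robots.txt that allows a classic search crawler and blocks one AI crawler. The robots.txt content, the `can_crawl` helper, and the example.com URL are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a classic search crawler is allowed,
# one AI crawler is blocked. Each bot gets its own User-agent group.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /
"""

# Placeholder URL used only for the check below.
URL = "https://example.com/articles/post"


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl(ROBOTS_TXT, "Googlebot", URL))  # True
print(can_crawl(ROBOTS_TXT, "GPTBot", URL))     # False
```

A bot that no group names, and for which no `User-agent: *` group exists, is treated as allowed by this parser.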

How Mentionwell handles AI Crawler

  • Default robots.txt explicitly names and allows 15+ AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, etc.).
  • Per-bot allow/disallow control so site owners can opt out of any specific crawler.
  • Sitemaps and llms.txt published at canonical paths so crawlers can discover content efficiently.
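The per-bot control in the list above could be sketched roughly as follows. This is not Mentionwell's implementation: `build_robots_txt`, the `policy` mapping, and the example.com sitemap URL are hypothetical names chosen for illustration. It only shows how an allow/disallow choice per crawler might be rendered into robots.txt groups.

```python
from typing import Dict, Optional


def build_robots_txt(policy: Dict[str, bool],
                     sitemap_url: Optional[str] = None) -> str:
    """Render one robots.txt group per crawler token.

    `policy` maps a user-agent token to True (allow everything)
    or False (disallow everything). Hypothetical helper.
    """
    blocks = []
    for token, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        blocks.append(f"User-agent: {token}\n{rule}")
    if sitemap_url:
        # Sitemap lines are independent of any User-agent group.
        blocks.append(f"Sitemap: {sitemap_url}")
    return "\n\n".join(blocks) + "\n"


# Example policy: allow most AI crawlers, opt out of one.
policy = {
    "GPTBot": True,
    "ClaudeBot": True,
    "PerplexityBot": True,
    "Bytespider": False,
}
robots = build_robots_txt(policy, "https://example.com/sitemap.xml")
print(robots)
```

Keeping the policy as plain data means an opt-out is a one-value change, and the rendered file can be checked with a robots.txt parser before it is published.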

Frequently asked questions about AI Crawler

What are the major AI crawlers I should know about?

GPTBot and OAI-SearchBot (OpenAI), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot (Perplexity), Google-Extended (Google generative uses), Applebot-Extended (Apple), Bytespider (ByteDance), Meta-ExternalAgent (Meta).
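To make that list usable when reading server logs, the sketch below maps each token to its operator and looks for a token inside a User-Agent header. The `identify_ai_crawler` helper and the sample header strings are illustrative assumptions. Real header formats vary by vendor and change over time, so each operator's documentation is the authority on exact strings.

```python
from typing import Optional, Tuple

# Tokens and operators as listed in the answer above.
# Google-Extended and Applebot-Extended are robots.txt control
# tokens, so they are not expected to show up in request headers.
AI_CRAWLER_OPERATORS = {
    "GPTBot": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "anthropic-ai": "Anthropic",
    "PerplexityBot": "Perplexity",
    "Google-Extended": "Google",
    "Applebot-Extended": "Apple",
    "Bytespider": "ByteDance",
    "Meta-ExternalAgent": "Meta",
}


def identify_ai_crawler(user_agent_header: str) -> Optional[Tuple[str, str]]:
    """Return (token, operator) if a known token appears in the header."""
    lowered = user_agent_header.lower()
    for token, operator in AI_CRAWLER_OPERATORS.items():
        if token.lower() in lowered:
            return token, operator
    return None


# Illustrative header strings, not copied from any vendor's documentation.
bot_header = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://example.com/bot)"
browser_header = "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"

print(identify_ai_crawler(bot_header))      # ('GPTBot', 'OpenAI')
print(identify_ai_crawler(browser_header))  # None
```

Substring matching on a header is only a heuristic, since any client can send any User-Agent value.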

Should I block AI crawlers?

Only with a specific reason. Most sites benefit from being crawlable — that's how content shows up in ChatGPT, Claude, Perplexity, and AI Overviews answers.

Ship AI Crawler-optimized articles automatically

Mentionwell handles AI Crawler on every published article — alongside the other six optimization targets in this glossary — so you don't have to think about it per post. Drop a domain, approve the first headline, watch the pipeline run.

§ · The rest of the alphabet

AI Crawler is one of 34 terms in the AI search vocabulary. Mentionwell optimizes for all of them on every article. Browse the full glossary →

§ 06 · Launch

Drop a domain.
Get cited.

Mentionwell opens to early access on June 1. Headless on any framework. No CMS migration.

EARLY ACCESS · INVITE-ONLY AT LAUNCH