In one sentence
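Google-Extended is a robots.txt product token that controls whether content Google crawls from your site may be used by Google's generative AI products (Gemini Apps, the Vertex AI API for Gemini, etc.). It is not a separate crawler with its own HTTP user-agent string, but it can be set independently of the rules for regular Googlebot.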
What does it look like in practice?
For example, suppose you want:
- To be indexed by ordinary Google Search
- But not used for training or grounding in Gemini (AI Overviews are part of Search and follow Googlebot rules and snippet controls instead)
In your robots.txt you would write:
# Allow Googlebot (regular search)
User-agent: Googlebot
Allow: /
# Disallow Google-Extended (refuse AI uses)
User-agent: Google-Extended
Disallow: /
With this, your pages stay eligible for Google Search results, while Google is told not to use them for Gemini training or grounding.
That said, the GEO recommendation is to allow both by default; otherwise you forfeit opportunities to be cited in AI-generated answers.
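The split rules in the robots.txt example above can be sanity-checked locally before deploying. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the `example.com` URL are assumptions made for illustration, and the standard-library parser is a generic one, so it only approximates how Google itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the example: Googlebot allowed, Google-Extended refused.
ROBOTS_TXT = """\
# Allow Googlebot (regular search)
User-agent: Googlebot
Allow: /
# Disallow Google-Extended (refuse AI uses)
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # hypothetical URL
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

Running it should report Googlebot as allowed and Google-Extended as not allowed, which matches the intent of the two groups.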
Why is it important?
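- Google AI Overviews appear at the top of search results and can have a large impact on click-through rate (CTR), so AI-generated answers now sit in front of ordinary listings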
- Gemini is seeing growing enterprise adoption
- It is a relatively new token, announced by Google in September 2023: many companies still have not configured it
Configuration (GEO recommendation)
User-agent: Google-Extended
Allow: /
Alternatively, leave Google-Extended out of robots.txt entirely. With no rule that applies to it, the default is Allow.
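To illustrate that the explicit rule and the empty file behave the same way, here is a small sketch using Python's standard `urllib.robotparser`. The function name `google_extended_allowed` and the `example.com` URL are assumptions for illustration, and the result reflects this generic parser rather than Google's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Option 1: the explicit GEO-recommended rule.
EXPLICIT_ALLOW = "User-agent: Google-Extended\nAllow: /\n"
# Option 2: a robots.txt with no rules at all.
NO_RULES = ""


def google_extended_allowed(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if robots_txt permits the Google-Extended token for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Google-Extended", url)


if __name__ == "__main__":
    print("explicit allow:", google_extended_allowed(EXPLICIT_ALLOW))
    print("no rules:", google_extended_allowed(NO_RULES))
```

Both calls return True, so the explicit group mainly serves as documentation of intent.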
Common misconceptions
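- "Rules for Googlebot also decide how Gemini may use my content" -> wrong. Google-Extended is a separate control, set independently of Googlebot
- "Disallowing Google-Extended has no impact on Google Search" -> correct. Google states that it affects neither inclusion nor ranking in Search
- "Disallow completely prevents AI training" -> half-correct. It applies only to Google; other AI providers use their own user-agent tokens and need their own rules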
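The "only for Google" point can be shown with a short sketch: a group addressed to Google-Extended leaves every other token untouched. The helper `blocked_agents` and the token `ExampleAIBot` are hypothetical names used purely for illustration (not a real crawler), and the check relies on Python's generic `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that only refuses Google-Extended.
ROBOTS_TXT = "User-agent: Google-Extended\nDisallow: /\n"

# "ExampleAIBot" is a made-up token standing in for some other AI crawler.
AGENTS = ["Googlebot", "Google-Extended", "ExampleAIBot"]


def blocked_agents(robots_txt: str, agents: list[str],
                   url: str = "https://example.com/") -> list[str]:
    """Return the agents that robots_txt does not allow to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(ROBOTS_TXT, AGENTS))
```

Only Google-Extended ends up in the blocked list; any other AI crawler would need its own User-agent group.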
See robots.txt and Gemini for details.