Robots.txt Generator
Generate your robots.txt file online — visual builder, WordPress/ecommerce/SaaS templates, validation, download
robots.txt: a complete guide for SEO
The robots.txt file is a plain-text file placed at the root of your site that tells crawlers (Googlebot, Bingbot, etc.) which paths they may or may not crawl. Configured well, it makes better use of your crawl budget and keeps compliant crawlers out of sections you do not want explored.
⚠️ robots.txt ≠ data protection! The robots.txt file is a convention, not a security measure. Malicious bots ignore its directives. To protect sensitive data, use authentication or server-side access rules.
Essential directives
| Directive | Role | Example |
|---|---|---|
| User-agent | Names the crawler the rules apply to. * = all crawlers | User-agent: Googlebot |
| Disallow | Forbids crawling of a path | Disallow: /admin/ |
| Allow | Explicitly allows a path. For Google, the most specific (longest) matching rule wins, so this overrides a broader Disallow | Allow: /admin/logo.png |
| Sitemap | Declares the URL of the XML sitemap | Sitemap: https://exemple.com/sitemap.xml |
| Crawl-delay | Delay in seconds between two requests (ignored by Googlebot) | Crawl-delay: 10 |
| Host | Preferred host for the site (non-standard, Yandex-specific) | Host: exemple.com |
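The directives in the table can be assembled into a file programmatically. The following is a minimal Python sketch of that idea, not the generator's actual implementation; the function name `build_robots_txt` and its argument shapes are assumptions made for illustration.

```python
def build_robots_txt(groups, sitemaps=()):
    """Render rule groups and sitemap URLs as robots.txt text.

    `groups` is a list of (user_agent, rules) pairs, where `rules` is a
    list of (directive, value) pairs such as ("Disallow", "/admin/").
    """
    lines = []
    for user_agent, rules in groups:
        lines.append(f"User-agent: {user_agent}")
        for directive, value in rules:
            # rstrip() avoids a trailing space when the value is empty
            lines.append(f"{directive}: {value}".rstrip())
        lines.append("")  # blank line separates groups
    for url in sitemaps:
        lines.append(f"Sitemap: {url}")
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    groups=[
        ("*", [("Disallow", "/admin/"), ("Allow", "/admin/logo.png")]),
        ("Googlebot", [("Disallow", "")]),
    ],
    sitemaps=["https://exemple.com/sitemap.xml"],
)
print(robots_txt)
```

Keeping each group as a user agent plus an ordered list of rules mirrors the file layout, so the rendered text stays easy to compare with the input.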
Most common crawlers
- Googlebot: Google, main crawler
- Googlebot-Image: Google Images
- Bingbot: Microsoft Bing
- GPTBot: ChatGPT / OpenAI
- anthropic-ai: Claude / Anthropic
- facebookexternalhit: Meta / Facebook
robots.txt best practices
- Always declare your Sitemap: in robots.txt to help search engines discover it.
- Use trailing slashes on directories: Disallow: /admin/ rather than /admin. Rules match by path prefix, so /admin would also block /administrator.
- An empty Disallow: means "allow everything". Disallow: / means "block everything".
- Block pages generated by session parameters (?session=) to save crawl budget.
- Check your robots.txt with the robots.txt report in Google Search Console, which replaced the older Crawl → robots.txt Tester.
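The trailing-slash and empty-Disallow rules above can be checked with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the helper `is_allowed` and the path `/administrator-guide` are hypothetical, and the standard-library parser does simple prefix matching in file order, which is enough here but does not reproduce Google's wildcard or longest-match handling.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


without_slash = "User-agent: *\nDisallow: /admin\n"
with_slash = "User-agent: *\nDisallow: /admin/\n"
allow_all = "User-agent: *\nDisallow:\n"
block_all = "User-agent: *\nDisallow: /\n"

# /admin is a prefix of /administrator-guide, so the rule without a slash catches it
print(is_allowed(without_slash, "/administrator-guide"))
print(is_allowed(with_slash, "/administrator-guide"))
print(is_allowed(with_slash, "/admin/users"))
print(is_allowed(allow_all, "/anything"))
print(is_allowed(block_all, "/anything"))
```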
Frequently asked questions
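The robots.txt file must be placed exactly at the root of your domain: https://yoursite.com/robots.txt. It cannot be in a subdirectory. Crawlers only look for it at this exact path. Google generally caches the fetched file for up to 24 hours rather than re-reading it on every request.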
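Because the location is fixed, the robots.txt URL for any page can be derived from the page URL alone. A small Python sketch, assuming a hypothetical helper named `robots_txt_url`; the `shop.yoursite.com` subdomain is an invented example to show that each host has its own file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs `page_url` (same scheme and host)."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/blog/post?page=2"))
print(robots_txt_url("https://shop.yoursite.com/cart"))
```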
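Disallow prevents the bot from crawling the page, but Google may still index the URL if it finds a link to it. The meta noindex tag tells Google not to index the page, but Google must be able to crawl the page to read the tag. To deindex a page, leave it crawlable and add noindex. Combining noindex with Disallow hides the tag from Google, so the URL can remain indexed.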
To block robots that collect data for AI training, add specific blocks: User-agent: GPTBot / Disallow: / for ChatGPT, and User-agent: anthropic-ai / Disallow: / for Claude. Other bots like CCBot (Common Crawl) also feed AI models.
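As an illustration, the blocks described above can be generated and sanity-checked in a few lines of Python. This is a sketch, not an exhaustive list of AI crawlers: `block_agents` is a hypothetical helper, and the agent list contains only the tokens named in this answer.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "anthropic-ai", "CCBot"]  # tokens named in the text above


def block_agents(agents):
    """Return robots.txt text with one 'Disallow: /' group per user agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


robots_txt = block_agents(AI_CRAWLERS)
print(robots_txt)

# Verify the result with the standard-library parser
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "/article"))     # blocked by its own group
print(parser.can_fetch("Googlebot", "/article"))  # no group applies, so allowed
```

Crawlers without a matching group and without a `User-agent: *` group are unaffected, which is why search engine bots keep full access here.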
All standards-compliant robots (Googlebot, Bingbot…) follow robots.txt. But malicious bots, scrapers and some tools deliberately ignore it. This file is in no way a security measure: it is a convention that well-behaved bots choose to follow.
Update your robots.txt whenever your site structure changes: adding a private section, migration, new technology. Also check regularly that no important pages are accidentally blocked, using the robots.txt report in Google Search Console.
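A check for accidentally blocked pages can also be scripted. The sketch below uses Python's standard `urllib.robotparser`; the helper `blocked_urls`, the sample rules, and the page URLs are all hypothetical, and the parser's simple prefix matching may differ from how a given search engine interprets wildcards.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs from `urls` that `agent` is not allowed to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical rules: "/products" was meant to hide a draft section
# but also matches every product page
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /products
"""

important = [
    "https://yoursite.com/",
    "https://yoursite.com/products/blue-widget",
    "https://yoursite.com/blog/launch",
]
print(blocked_urls(robots_txt, important))
```

To audit a live site instead of a string, `RobotFileParser.set_url()` followed by `read()` fetches the published file before calling `can_fetch()`.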