How to Create a robots.txt File — Free Online Generator (2026)

By Rui Barreira · Last updated: 18 June 2026

A robots.txt file tells search engine crawlers which parts of your website they are and are not allowed to access. The brevio Robots.txt Generator gives you a visual builder — add rules, choose crawlers, set a sitemap URL, and see the generated file update in real time. No typing the format by hand, no risk of syntax mistakes.

Every public website should have a robots.txt file at https://yourdomain.com/robots.txt. Without one, search engines still crawl everything they can reach, and you have no way to tell them which directories to skip, which bots each rule applies to, or where to find your sitemap.

How to build your robots.txt file

Start with a preset. The generator includes four common configurations: Allow all (open to all crawlers), Block all (useful during development), Block /admin & /api (keeps compliant crawlers out of non-public paths), and Block AI bots (asks GPTBot, ClaudeBot, and CCBot not to crawl your content for model training). Choose the one closest to your intent and then customise it.
Add or modify User-agent blocks. Each block targets one or more crawler bots. Use * (asterisk) to match all bots that do not have their own specific block. A bot that has its own named block, such as Googlebot or Bingbot, follows only that block and ignores the wildcard block, so repeat any shared rules in both.
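Add Allow and Disallow rules to each block. Under RFC 9309, which Google's crawlers follow, the most specific (longest) matching rule wins regardless of its position in the block, and Allow wins a tie. Some older crawlers apply the first matching rule instead, so listing narrow Allow rules before broad Disallow rules is the safest order. Disallow: /admin/ blocks the entire /admin/ path tree. Allow: / explicitly permits the root and everything under it. An empty Disallow (Disallow:) means allow everything.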
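Set a Crawl-delay if needed. This asks the crawler to wait the given number of seconds between requests. Use this for shared hosting or staging environments that cannot sustain aggressive crawling. Crawl-delay is a non-standard directive: Bingbot honours it, but Googlebot ignores it. Google sets its crawl rate automatically, and the crawl rate limiter in Google Search Console was retired in January 2024, so to slow Googlebot temporarily, return 503 or 429 responses.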
Add your sitemap URL. Listing the sitemap location in robots.txt lets every crawler discover it, including those of search engines where you have not submitted it through a tool such as Google Search Console.
Copy or download the file. Click Copy to paste it into your deployment, or Download to save robots.txt directly. Upload it to the root of your web server so it is accessible at /robots.txt.
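
The steps above can be sketched in a few lines of Python. This is a minimal illustration of what such a builder produces, not the brevio generator's actual code: the Group class, the render_robots_txt function, and the sitemap URL are hypothetical names chosen for the example. The last lines load the result into the standard library's urllib.robotparser to confirm that the rules behave as intended.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
from urllib.robotparser import RobotFileParser


@dataclass
class Group:
    """One User-agent block: the bots it targets plus its rules (hypothetical model)."""
    user_agents: List[str]
    # Each rule is ("Allow" | "Disallow", path).
    rules: List[Tuple[str, str]] = field(default_factory=list)
    crawl_delay: Optional[int] = None  # seconds; Googlebot ignores this directive


def render_robots_txt(groups: List[Group], sitemap_url: Optional[str] = None) -> str:
    """Serialise the groups into robots.txt text, with a blank line between blocks."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {ua}" for ua in group.user_agents]
        lines += [f"{directive}: {path}" for directive, path in group.rules]
        if group.crawl_delay is not None:
            lines.append(f"Crawl-delay: {group.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemap_url:
        blocks.append(f"Sitemap: {sitemap_url}")
    return "\n\n".join(blocks) + "\n"


robots_txt = render_robots_txt(
    [
        Group(["*"], [("Disallow", "/admin/"), ("Disallow", "/api/")], crawl_delay=5),
        Group(["GPTBot", "ClaudeBot", "CCBot"], [("Disallow", "/")]),
    ],
    sitemap_url="https://yourdomain.com/sitemap.xml",  # placeholder URL
)
print(robots_txt)

# Sanity-check the generated file with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://yourdomain.com/blog/"))         # named block applies
print(parser.can_fetch("Bingbot", "https://yourdomain.com/blog/"))        # wildcard block applies
print(parser.can_fetch("Bingbot", "https://yourdomain.com/admin/users"))  # blocked path
```

Note that urllib.robotparser matches plain path prefixes only, so it is suitable for checking simple rules like these but not patterns that use * or $.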

Common rules explained

Disallow: /: Blocks the entire site from being crawled. Use this for staging or password-protected environments you do not want crawled.
Allow: /$: Matches only the homepage (the $ anchors the pattern to the end of the path). Pair it with Disallow: / when you want the homepage crawled but nothing else; on its own it blocks nothing.
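Disallow: /*.pdf$: Blocks every URL that ends in .pdf. A URL with a query string after .pdf does not match, because $ anchors the pattern to the end. The * and $ wildcards are defined in RFC 9309 and supported by Googlebot, but some simpler crawlers match plain prefixes only.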
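
To see how these patterns resolve, here is a rough Python sketch of wildcard matching with longest-match precedence, the behaviour described in RFC 9309. The function names pattern_to_regex and is_allowed are made up for this example, and real crawlers also handle details such as percent-encoding that are left out here.

```python
import re
from typing import List, Tuple


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters and a trailing '$' anchors the end of
    the URL. Everything else is literal; without '$' the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """Longest matching rule wins, Allow wins a tie, no match means allowed."""
    best_length, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:  # an empty Disallow matches nothing
            continue
        if pattern_to_regex(pattern).match(path):  # match() anchors at the start
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_length or (len(pattern) == best_length and is_allow):
                best_length, allowed = len(pattern), is_allow
    return allowed


homepage_only = [("Disallow", "/"), ("Allow", "/$")]
print(is_allowed("/", homepage_only))       # homepage: Allow: /$ is the longer match
print(is_allowed("/about", homepage_only))  # everything else falls under Disallow: /

no_pdfs = [("Disallow", "/*.pdf$")]
print(is_allowed("/files/report.pdf", no_pdfs))
print(is_allowed("/files/report.pdf?download=1", no_pdfs))
```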

Limitations to know

Robots.txt is a convention, not a security mechanism. Malicious bots ignore it entirely. Disallowing a URL also does not prevent it from being indexed if other pages link to it: Google may still show the URL in search results, just without a snippet. To remove a page from search results, serve noindex in an X-Robots-Tag HTTP response header or a robots meta tag, and do not Disallow that URL, because a crawler that is blocked from fetching the page never sees the noindex.
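
As an illustration of the header approach, the following is a minimal WSGI sketch that adds X-Robots-Tag: noindex to selected responses. The path prefixes and the app itself are hypothetical; in practice the header is often set in the web server or framework configuration instead. The HTML equivalent is a <meta name="robots" content="noindex"> tag in the page head.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical path prefixes that should stay out of search results.
NOINDEX_PREFIXES = ("/drafts/", "/internal/")


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Minimal WSGI app that marks selected paths as noindex."""
    path = environ.get("PATH_INFO", "/")
    headers: List[Tuple[str, str]] = [("Content-Type", "text/html; charset=utf-8")]
    if path.startswith(NOINDEX_PREFIXES):
        # Crawlers only see this header if robots.txt lets them fetch the URL.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<!doctype html><title>Example</title>"]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever() will serve the app on port 8000.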

Frequently Asked Questions

Does robots.txt prevent pages from being indexed?: Not directly. Disallowing a URL does not prevent it from being indexed if other pages link to it: Google may still show the URL in search results, just without a snippet. To remove a page, serve noindex in an X-Robots-Tag HTTP response header or a robots meta tag, and leave the URL crawlable so the crawler can see it.
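Does Googlebot respect Crawl-delay?: No, Googlebot ignores the Crawl-delay directive. Google sets its crawl rate automatically, and the crawl rate limiter in Google Search Console was retired in January 2024. To slow Googlebot temporarily, return 503 or 429 responses.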
Can malicious bots be blocked with robots.txt?: No. Robots.txt is a convention, not a security mechanism. Malicious bots ignore it entirely. Use server-level rate limiting or a WAF for actual bot blocking.
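
For a sense of what server-level rate limiting involves, here is a small token-bucket sketch in Python. It is an illustration only: the TokenBucketLimiter class and its parameters are invented for this example, and production setups usually enforce limits in the web server, reverse proxy, or WAF rather than in application code.

```python
import time
from collections import defaultdict
from typing import Callable, Dict


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._tokens: Dict[str, float] = defaultdict(lambda: float(capacity))
        self._last_seen: Dict[str, float] = {}

    def allow(self, client_id: str) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self._last_seen.get(client_id, now)
        self._last_seen[client_id] = now
        # Refill for the time since the last request, capped at capacity.
        tokens = min(self.capacity, self._tokens[client_id] + elapsed * self.rate)
        if tokens >= 1:
            self._tokens[client_id] = tokens - 1
            return True
        self._tokens[client_id] = tokens
        return False
```

A request handler would call limiter.allow(client_ip) on each request and respond with HTTP 429 when it returns False. Unlike a robots.txt rule, this applies to every client whether or not it chooses to cooperate.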