User-Agent Token: FirecrawlAgent
Respects robots.txt: Yes
Impact Level: Niche (smaller players and developer tools)
Estimated Reach: Smaller players and developer tools
What is Firecrawl?
Firecrawl is an AI training crawler operated by Firecrawl. It collects data to train AI models.
How Your Data is Used
Firecrawl is a developer tool for AI data collection.
What Happens If You Block
Firecrawl users can't scrape your site for their AI apps.
Good to Know
Firecrawl is a popular developer tool for building AI applications with web data.
About Firecrawl
FirecrawlAgent robots.txt Configuration
Control FirecrawlAgent access to your website using robots.txt directives.
Block FirecrawlAgent
To completely block Firecrawl from crawling your site:
User-agent: FirecrawlAgent
Disallow: /
Allow FirecrawlAgent Full Access
To explicitly allow Firecrawl to crawl your entire site:
User-agent: FirecrawlAgent
Allow: /
Selective Access for FirecrawlAgent
To allow Firecrawl but restrict certain directories:
User-agent: FirecrawlAgent
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
Firecrawl respects robots.txt directives.
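Before deploying rules like the ones above, it can help to check how a standard parser reads them. The following is a minimal sketch using Python's built-in urllib.robotparser. The helper name is_allowed and the host example.com are placeholders introduced here, and the result reflects Python's parser only; Firecrawl's own interpretation of robots.txt is assumed, not verified, to agree for these simple rules.

```python
from urllib.robotparser import RobotFileParser

# The selective-access rules from this page, as robots.txt text.
SELECTIVE_RULES = """\
User-agent: FirecrawlAgent
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def is_allowed(robots_txt: str, url: str, agent: str = "FirecrawlAgent") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper: it shows how Python's standard parser reads the
    rules, which may differ from how a particular crawler reads them.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not a site from this page.
    for path in ("/", "/blog/post", "/private/report", "/api/v1/users", "/admin/"):
        print(path, is_allowed(SELECTIVE_RULES, "https://example.com" + path))
```

Passing the two-line "Block" or "Allow" configuration as robots_txt checks those variants the same way.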
FirecrawlAgent User-Agent String
The user-agent token for Firecrawl is:
FirecrawlAgent
Who Blocks Firecrawl?
Blocking (32 sites)
01net.com, 15min.lt, 4p.de, 4players.de, 588ku.com, abendblatt.de, allgaeuer-zeitung.de, all-in.de, arkansasonline.com, augsburger-allgemeine.de, autobild.de, babbel.com, barcelona.cat, bfmtv.com, bild.de, braunschweiger-zeitung.de, brigitte.de, bsc.es, businessinsider.de, bz-berlin.de, castro.fm, cidadeverde.com, citywire.com, closerweekly.com, computerbild.de, cults3d.com, dealabs.com, defcon.org, derwesten.de, dolphin-emu.org, ebird.org, expresso.pt
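A site can end up blocking FirecrawlAgent either by naming the token or through a catch-all "User-agent: *" group. The sketch below classifies a robots.txt body on that basis. The function name firecrawl_policy and the sample bodies are hypothetical, and "blocking" is assumed here to mean that the site root is disallowed, which may not be how the list above was compiled.

```python
from urllib.robotparser import RobotFileParser


def firecrawl_policy(robots_txt: str) -> str:
    """Classify a robots.txt body as 'blocking' or 'allowing' for FirecrawlAgent.

    Assumption: 'blocking' means the site root ("/") is disallowed for the
    FirecrawlAgent token, directly or via a "User-agent: *" group.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return "allowing" if parser.can_fetch("FirecrawlAgent", "/") else "blocking"


# Hypothetical robots.txt bodies, for illustration only.
SAMPLES = {
    "blocks-by-name": "User-agent: FirecrawlAgent\nDisallow: /\n",
    "blocks-everyone": "User-agent: *\nDisallow: /\n",
    "blocks-other-bot": "User-agent: SomeOtherBot\nDisallow: /\n",
}

if __name__ == "__main__":
    for name, body in SAMPLES.items():
        print(name, firecrawl_policy(body))
```

To check a live site, the robots.txt text would first need to be downloaded and passed in as a string; that step is left out so the sketch runs without network access.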