StackFox

Firecrawl

Tier 4
AI Training · by Firecrawl · Since 2024

AI scraper for LLM training data.

User-Agent Token: FirecrawlAgent
Respects robots.txt: Yes
Impact Level: Niche (smaller players and developer tools)
Estimated Reach: Smaller players and developer tools

What is Firecrawl?

Firecrawl is an AI training crawler operated by Firecrawl. It collects data to train AI models.

How Your Data is Used

Firecrawl is a developer tool for AI data collection.

What Happens If You Block

If you block FirecrawlAgent, Firecrawl users cannot scrape your site for their AI applications.

Good to Know

Firecrawl is a popular developer tool for building AI applications with web data.

About Firecrawl

Firecrawl operates one known bot for AI model training.

FirecrawlAgent robots.txt Configuration

Control FirecrawlAgent access to your website using robots.txt directives.

Block FirecrawlAgent

To completely block Firecrawl from crawling your site:

User-agent: FirecrawlAgent
Disallow: /

Allow FirecrawlAgent Full Access

To explicitly allow Firecrawl to crawl your entire site:

User-agent: FirecrawlAgent
Allow: /

Selective Access for FirecrawlAgent

To allow Firecrawl but restrict certain directories:

User-agent: FirecrawlAgent
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
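
Before deploying rules like these, it can help to check how a parser reads them. The following is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_allowed` and the sample paths are illustrative assumptions, not part of Firecrawl or this page. Python's parser applies rules in file order, while some crawlers use longest-match precedence. For the selective-access rules above, both approaches give the same answers.

```python
from urllib.robotparser import RobotFileParser

# The selective-access rules shown above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: FirecrawlAgent
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Hypothetical helper: True if `user_agent` may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    for path in ("/", "/blog/post", "/private/report", "/api/v1/users", "/admin/"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "FirecrawlAgent", path) else "blocked"
        print(f"{path}: {verdict}")
```

Under these rules, the three restricted directories are reported as blocked and everything else as allowed. A user agent with no matching group (and no `*` group) is treated as allowed by this parser.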

✓ Firecrawl respects robots.txt directives.

FirecrawlAgent User-Agent String

The user-agent token for Firecrawl is:

FirecrawlAgent
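
To find this token in server logs or request headers, a case-insensitive substring check is usually enough. The sketch below assumes the token appears somewhere inside the full `User-Agent` header. The function name `is_firecrawl_agent` and the sample header strings are hypothetical, since this page does not give the complete header Firecrawl sends. The header is client-supplied and can be spoofed, so treat a match as a hint and not as proof.

```python
from typing import Optional

# User-agent token taken from this page.
FIRECRAWL_TOKEN = "FirecrawlAgent"


def is_firecrawl_agent(user_agent_header: Optional[str]) -> bool:
    """Hypothetical helper: True if the header contains the Firecrawl token."""
    if not user_agent_header:
        return False
    # Compare case-insensitively, since header casing can vary.
    return FIRECRAWL_TOKEN.lower() in user_agent_header.lower()


if __name__ == "__main__":
    # Made-up header values for illustration only.
    samples = [
        "Mozilla/5.0 (compatible; FirecrawlAgent/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
        None,
    ]
    for header in samples:
        print(repr(header), is_firecrawl_agent(header))
```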
