Diffbot
What is Diffbot?
Diffbot is an AI training crawler operated by the company of the same name. It collects web data used to train AI models.
Diffbot maintains the world's largest commercial knowledge graph and sells entity data to enterprises for competitive intelligence, lead generation, and AI training.
If you block Diffbot, your content won't appear in the Diffbot Knowledge Graph (10B+ entities). This can affect DuckDuckGo search, Avast security ratings, and enterprise market intelligence.
Enterprise customers include Microsoft, Adobe, Cisco, eBay, and DuckDuckGo. Diffbot is also used by AI startups such as RelationalAI, and it powers anti-misinformation initiatives.
About Diffbot
Diffbot operates one known bot for AI model training. Its customers include Microsoft, Adobe, Cisco, eBay, DuckDuckGo, and Avast.
Diffbot robots.txt Configuration
Control Diffbot access to your website using robots.txt directives.
Block Diffbot
To completely block Diffbot from crawling your site:
User-agent: Diffbot
Disallow: /
Allow Diffbot Full Access
To explicitly allow Diffbot to crawl your entire site:
User-agent: Diffbot
Allow: /
Selective Access for Diffbot
To allow Diffbot but restrict certain directories:
User-agent: Diffbot
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
Diffbot respects robots.txt directives.
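Before deploying any of the rule sets above, it can help to check how a parser reads them. The following is a minimal sketch, assuming Python's standard-library urllib.robotparser is an acceptable stand-in for Diffbot's own robots.txt handling, which this page does not describe. The helper name is_allowed and the sample paths are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# The selective-access rules from above, as they would appear in robots.txt.
SELECTIVE_RULES = """\
User-agent: Diffbot
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""

# The full-block rules from above.
BLOCK_ALL_RULES = """\
User-agent: Diffbot
Disallow: /
"""


def is_allowed(robots_txt: str, path: str, agent: str = "Diffbot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    Hypothetical helper: it reflects how Python's parser reads the rules,
    which is an assumption about, not a guarantee of, Diffbot's behavior.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are illustrative only.
    for path in ("/blog/post", "/private/report.html", "/api/v1/users", "/admin/"):
        print(path, is_allowed(SELECTIVE_RULES, path))
```

Python's parser applies the first matching rule in file order, whereas RFC 9309 specifies that the most specific (longest) matching rule wins. The two approaches agree for the rule sets shown here, because the narrower Disallow lines come before the catch-all Allow line.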
Diffbot User-Agent String
The user-agent token for Diffbot is:
Diffbot
Who Blocks Diffbot?