AI2Bot
Tier 3General crawler for AI2's research projects and open datasets.
AI2Bot๐ฏWhat is AI2Bot?
AI2Bot is an AI training crawler operated by Allen Institute for AI. Collects data to train AI models.
Builds open datasets for AI research. AI2 is a non-profit focused on AI for good.
Content won't be included in AI2's open research datasets.
Created by Paul Allen. Produces open models like OLMo. Research-focused, not commercial.
๐ขAbout Allen Institute for AI
Allen Institute for AI operates 3 known bots for AI model training.
๐ก๏ธAI2Bot robots.txt Configuration
Control AI2Bot access to your website using robots.txt directives.
Block AI2Bot
To completely block AI2Bot from crawling your site:
User-agent: AI2Bot
Disallow: /Allow AI2Bot Full Access
To explicitly allow AI2Bot to crawl your entire site:
User-agent: AI2Bot
Allow: /Selective Access for AI2Bot
To allow AI2Bot but restrict certain directories:
User-agent: AI2Bot
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /โ AI2Bot respects robots.txt directives.
AI2Bot User-Agent String
The user-agent token for AI2Bot is:
AI2Bot๐Who Blocks AI2Bot?
๐Other Allen Institute for AI Bots
Check Your Site's AI Policy
See if you're blocking or allowing AI2Bot and other AI crawlers.