🦊 StackFox

AI2Bot

Tier 3
📚 AI Training · by Allen Institute for AI · Since 2014

General crawler for AI2's research projects and open datasets.

User-Agent Token: AI2Bot
Respects robots.txt: Yes
Impact Level: Notable
Estimated Reach: 10M+ users - Regional/specialized AI companies

🎯 What is AI2Bot?

AI2Bot is an AI training crawler operated by the Allen Institute for AI (AI2). It collects web content used to train AI models.

📊 How Your Data is Used

Crawled content goes into open datasets for AI research. AI2 is a non-profit focused on AI for good.

🚫 What Happens If You Block

Your content won't be included in AI2's open research datasets.

💡 Good to Know

AI2 was founded by Paul Allen in 2014. It produces open models such as OLMo and is research-focused rather than commercial.

🏢 About Allen Institute for AI

Allen Institute for AI operates 3 known bots for AI model training.

🛡️ AI2Bot robots.txt Configuration

Control AI2Bot access to your website using robots.txt directives.

Block AI2Bot

To completely block AI2Bot from crawling your site:

User-agent: AI2Bot
Disallow: /
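
Before deploying the rule, it can help to confirm that it blocks AI2Bot and leaves other crawlers alone. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the helper name `is_allowed`, the `example.com` URL, and the `SomeOtherBot` agent are illustrative assumptions, not part of any AI2 tooling.

```python
from urllib.robotparser import RobotFileParser

# The blocking rule from above, held as a string so no network access is needed.
BLOCK_RULES = """\
User-agent: AI2Bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and SomeOtherBot are placeholders for illustration only.
    print("AI2Bot:", is_allowed(BLOCK_RULES, "AI2Bot", "https://example.com/page"))
    print("SomeOtherBot:", is_allowed(BLOCK_RULES, "SomeOtherBot", "https://example.com/page"))
```

This only shows how Python's parser reads the file. How AI2Bot itself interprets the rules depends on its own implementation.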

Allow AI2Bot Full Access

To explicitly allow AI2Bot to crawl your entire site:

User-agent: AI2Bot
Allow: /

Selective Access for AI2Bot

To allow AI2Bot but restrict certain directories:

User-agent: AI2Bot
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
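
A selective rule set is easier to trust once a few representative paths have been checked against it. The sketch below again uses `urllib.robotparser`; the helper `check_paths` and the sample paths are hypothetical. Note that Python's parser applies the first matching rule in file order, whereas RFC 9309 parsers pick the longest matching path. Listing the `Disallow` lines before `Allow: /`, as above, gives the same result under both approaches.

```python
from urllib.robotparser import RobotFileParser

# The selective rule set from above.
SELECTIVE_RULES = """\
User-agent: AI2Bot
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def check_paths(robots_txt: str, user_agent: str, paths: list[str]) -> dict[str, bool]:
    """Map each path to True (crawlable) or False (blocked) for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    sample = ["/", "/blog/post", "/private/report.html", "/api/v1/items", "/admin/"]
    for path, allowed in check_paths(SELECTIVE_RULES, "AI2Bot", sample).items():
        print(f"{path}: {'allowed' if allowed else 'blocked'}")
```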

✓ AI2Bot respects robots.txt directives.

AI2Bot User-Agent String

The user-agent token for AI2Bot is:

AI2Bot
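
To see whether AI2Bot is visiting a site, the token can be matched against the `User-Agent` header in server logs or request handlers. The sketch below assumes a simple case-insensitive substring match. The function name `is_ai2bot` and the sample header strings are hypothetical, since the full user-agent string is not given here.

```python
from typing import Optional

AI2BOT_TOKEN = "AI2Bot"  # the user-agent token listed above


def is_ai2bot(user_agent_header: Optional[str]) -> bool:
    """Return True if the User-Agent header contains the AI2Bot token.

    The match is a case-insensitive substring test, so it also matches any
    longer token that contains "AI2Bot".
    """
    if not user_agent_header:
        return False
    return AI2BOT_TOKEN.lower() in user_agent_header.lower()


if __name__ == "__main__":
    # Both header strings are invented examples, not captured traffic.
    print(is_ai2bot("Mozilla/5.0 (compatible; AI2Bot/1.0)"))
    print(is_ai2bot("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))
```

A User-Agent header can be set by any client, so a match shows what a request claims to be rather than proving its origin.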
