StackFox

img2dataset (LAION)

Tier 4
AI Training · by LAION · Since 2021

Downloads images for LAION's open AI training datasets.

User-Agent Token: img2dataset
Respects robots.txt: Unknown
Impact Level: Niche
Estimated Reach: Smaller players and developer tools

What is img2dataset (LAION)?

img2dataset (LAION) is an AI training crawler operated by LAION. It downloads images to build datasets for training AI models.

How Your Data is Used

Downloaded images go into large open image-text datasets used to train models such as Stable Diffusion.

What Happens If You Block

Images from your site won't be included in LAION datasets such as LAION-5B.

Good to Know

LAION is a non-profit. Its LAION-5B dataset contains 5.85 billion image-text pairs and underpins many image AI models.

About LAION

LAION operates 3 known bots for AI model training.

img2dataset robots.txt Configuration

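Control img2dataset access to your website using robots.txt directives. Because its robots.txt compliance is listed as unknown, these directives only take effect if the crawler honors them.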

Block img2dataset

To completely block img2dataset (LAION) from crawling your site:

User-agent: img2dataset
Disallow: /
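One way to sanity-check the block rule before publishing it is to run it through a parser. The sketch below uses Python's standard `urllib.robotparser`. The helper name `is_allowed`, the `SomeOtherBot` agent, and the `example.com` URL are illustrative assumptions. It shows only how a compliant parser reads the rule, not how img2dataset itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The block rule from above, held as a string so nothing is fetched.
BLOCK_RULES = """\
User-agent: img2dataset
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-compliant parser would permit the fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders, not real targets or crawlers.
print(is_allowed(BLOCK_RULES, "img2dataset", "https://example.com/images/cat.jpg"))
print(is_allowed(BLOCK_RULES, "SomeOtherBot", "https://example.com/images/cat.jpg"))
```

The first call should report the fetch as disallowed and the second as allowed, since the rule names only the img2dataset token.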

Allow img2dataset Full Access

To explicitly allow img2dataset (LAION) to crawl your entire site:

User-agent: img2dataset
Allow: /

Selective Access for img2dataset

To allow img2dataset (LAION) but restrict certain directories:

User-agent: img2dataset
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
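To see which paths the selective rules would open or close, the same kind of check can be run per path. This is a minimal sketch using Python's standard `urllib.robotparser`. The helper `allowed_paths` and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The selective rules from above, held as a string so nothing is fetched.
SELECTIVE_RULES = """\
User-agent: img2dataset
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def allowed_paths(robots_txt: str, user_agent: str, paths: list[str]) -> dict[str, bool]:
    """Map each path to whether a compliant parser would permit fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


# Hypothetical paths chosen to hit each rule once.
sample_paths = ["/private/report.png", "/api/v1/items", "/admin/", "/images/cat.jpg"]
for path, ok in allowed_paths(SELECTIVE_RULES, "img2dataset", sample_paths).items():
    print(path, "allowed" if ok else "blocked")
```

Python's parser applies the first matching rule in file order, while RFC 9309 prefers the longest matching path. For this rule set both approaches give the same result, because each Disallow line is listed before the broader Allow line and is also the longer match.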

img2dataset User-Agent String

The user-agent token for img2dataset (LAION) is:

img2dataset
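If you want to spot this token in your own request logs, a simple substring check is one option. The sketch below assumes the token appears somewhere inside the User-Agent header. The function name `mentions_img2dataset` and both sample headers are hypothetical, and a client can send any header it likes, so a missing token proves nothing.

```python
TOKEN = "img2dataset"


def mentions_img2dataset(user_agent_header: str) -> bool:
    """Case-insensitive check for the img2dataset token in a User-Agent header."""
    return TOKEN in user_agent_header.lower()


# Hypothetical header values for illustration only.
samples = [
    "Mozilla/5.0 (compatible; img2dataset; +https://example.com/bot)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
]
for header in samples:
    print(header, "->", mentions_img2dataset(header))
```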
