img2dataset (LAION)
Tier 4
Downloads images for LAION's open AI training datasets.
What is img2dataset (LAION)?
img2dataset (LAION) is an AI training crawler associated with LAION that collects data to train AI models.
Creates massive open image-text datasets for training models like Stable Diffusion.
If you block it, your images won't be included in LAION datasets (such as LAION-5B).
About LAION
LAION is a non-profit. Its LAION-5B dataset contains 5.85 billion image-text pairs and powers many image AI models.
img2dataset robots.txt Configuration
Control img2dataset access to your website using robots.txt directives.
Block img2dataset
To completely block img2dataset (LAION) from crawling your site:
User-agent: img2dataset
Disallow: /

Allow img2dataset Full Access
To explicitly allow img2dataset (LAION) to crawl your entire site:
User-agent: img2dataset
Allow: /

Selective Access for img2dataset
To allow img2dataset (LAION) but restrict certain directories:
User-agent: img2dataset
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /

img2dataset User-Agent String
The user-agent token for img2dataset (LAION) is:
img2dataset
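To sanity-check the three rule sets on this page before deploying them, one option is to run them through Python's standard `urllib.robotparser`. The sketch below is illustrative only: the `can_fetch` helper, the sample paths, and the `example.com` host are assumptions made for the example, not part of img2dataset. It shows how one standard parser reads the rules; it does not show what the crawler itself does when it visits a site.

```python
from urllib.robotparser import RobotFileParser

# The three rule sets from this page, as robots.txt text.
BLOCK = """\
User-agent: img2dataset
Disallow: /
"""

ALLOW = """\
User-agent: img2dataset
Allow: /
"""

SELECTIVE = """\
User-agent: img2dataset
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def can_fetch(robots_txt: str, path: str, agent: str = "img2dataset") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    Hypothetical helper for this example; example.com is a placeholder host.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    rule_sets = [("block", BLOCK), ("allow", ALLOW), ("selective", SELECTIVE)]
    for name, rules in rule_sets:
        for path in ["/", "/images/cat.jpg", "/private/photo.jpg"]:
            print(name, path, can_fetch(rules, path))
```

Python's parser applies the first matching rule in file order, while some other parsers pick the most specific (longest) match. In the selective example both approaches agree, because each Disallow line is more specific than the trailing `Allow: /` and is also listed before it.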