🦊 StackFox

Omgili

Tier 3
📚 AI Training · by Webz.io · Since 2015

Sells crawl data via API for LLM training.

User-Agent Token: omgili
Respects robots.txt: Yes
Impact Level: Notable (10M+ users, regional/specialized AI companies)
Estimated Reach: 300+ customers, including Hootsuite, Sprinklr, NetBase, and cybersecurity firms

🎯 What is Omgili?

Omgili is an AI training crawler operated by Webz.io. It collects web data that is used to train AI models.

📊 How Your Data is Used

Webz.io is a company with $11.9M in revenue. It sells forum and discussion data to social listening platforms and cybersecurity companies, and for AI training.

🚫 What Happens If You Block

If you block Omgili, your content won't appear in Webz.io data feeds. This affects the downstream consumers of those feeds: social listening tools (Hootsuite, Sprinklr) and cybersecurity threat intelligence.

💡 Good to Know

Omgili crawls forums and discussions. The data is sold to Hootsuite, Sprinklr, and NetBase for social monitoring. Webz.io also provides darknet data to law enforcement.

๐ŸขAbout Webz.io

Webz.io

Webz.io operates 3 known bots for AI model training. Its service reaches 300+ customers, including Hootsuite, Sprinklr, NetBase, and cybersecurity firms.

๐Ÿ›ก๏ธomgili robots.txt Configuration

Control omgili access to your website using robots.txt directives.

Block omgili

To completely block Omgili from crawling your site:

User-agent: omgili
Disallow: /

Allow omgili Full Access

To explicitly allow Omgili to crawl your entire site:

User-agent: omgili
Allow: /

Selective Access for omgili

To allow Omgili but restrict certain directories:

User-agent: omgili
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /

✓ Omgili respects robots.txt directives.
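Before deploying any of the rule sets above, it can help to check how a parser reads them. The following is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allowed` and the sample paths are illustrative assumptions, not part of any Webz.io tooling. Note that Python's parser applies the first matching rule in file order, whereas RFC 9309 specifies longest-match precedence. For the rules shown here both approaches give the same answer.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "omgili") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    `allowed` is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The "Block omgili" rule set from this page.
BLOCK = """User-agent: omgili
Disallow: /
"""

# The "Selective Access" rule set from this page.
SELECTIVE = """User-agent: omgili
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""

if __name__ == "__main__":
    # Sample paths are made up for illustration.
    print(allowed(BLOCK, "/forum/thread-1"))      # False
    print(allowed(SELECTIVE, "/private/notes"))   # False
    print(allowed(SELECTIVE, "/forum/thread-1"))  # True
```

This only confirms how the rules are interpreted. Whether a crawler actually obeys them depends on the crawler; this page reports that Omgili does.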

omgili User-Agent String

The user-agent token for Omgili is:

omgili
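To see whether Omgili is visiting a site, the token can be looked for in the User-Agent header of incoming requests or in access logs. The sketch below assumes a case-insensitive substring match is sufficient. This page gives only the token, not the full header the crawler sends, so the sample strings are hypothetical and `is_omgili` is a name chosen for this example.

```python
def is_omgili(user_agent: str) -> bool:
    """Case-insensitive substring check for the `omgili` token.

    Assumption: the token appears somewhere in the User-Agent header.
    """
    return "omgili" in user_agent.lower()


if __name__ == "__main__":
    # Hypothetical User-Agent values, not captured from real traffic.
    samples = [
        "omgili/0.0 (+http://example.invalid/bot)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for ua in samples:
        print(ua, "->", is_omgili(ua))
```

A User-Agent header can be set to anything by the client, so a match is a hint rather than proof of the crawler's identity.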
