User-Agent Token: cohere-training-data-crawler
Respects robots.txt: Unknown
Impact Level: Notable
Estimated Reach: 10M+ users - Regional/specialized AI companies
What is Cohere Training Crawler?
Cohere Training Crawler is an AI training crawler operated by Cohere. It collects web content to train AI models.
How Your Data is Used
Collected content is used for pre-training Cohere's enterprise LLMs.
What Happens If You Block
Your content won't be used to train Cohere's Command and Embed models.
Good to Know
Cohere focuses on enterprise customers and competes with OpenAI and Anthropic in the B2B market.
cohere-training-data-crawler robots.txt Configuration
Control cohere-training-data-crawler access to your website using robots.txt directives.
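Before deploying any of the rule sets in this section, it can help to check locally how a standard parser reads them. The following is a minimal sketch using Python's standard-library urllib.robotparser; the helper name is_allowed and the sample paths are illustrative assumptions, not part of any Cohere tooling. It shows only how Python's parser interprets the rules. Whether the crawler itself honors robots.txt is listed as Unknown on this page.

```python
from urllib.robotparser import RobotFileParser

AGENT = "cohere-training-data-crawler"

# Rule sets copied from this page.
BLOCK_ALL = """\
User-agent: cohere-training-data-crawler
Disallow: /
"""

SELECTIVE_RULES = """\
User-agent: cohere-training-data-crawler
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
"""


def is_allowed(robots_txt: str, path: str, agent: str = AGENT) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether
    `agent` may fetch `path` according to Python's parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    for path in ("/blog/post", "/private/report.html", "/api/v1/items"):
        print(path, is_allowed(SELECTIVE_RULES, path))
```

Note that urllib.robotparser applies the first matching rule in file order, while some crawlers use longest-match precedence. For the rule sets on this page, with the Disallow lines listed before Allow: /, both approaches give the same answer.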
Block cohere-training-data-crawler
To completely block Cohere Training Crawler from crawling your site:
User-agent: cohere-training-data-crawler
Disallow: /
Allow cohere-training-data-crawler Full Access
To explicitly allow Cohere Training Crawler to crawl your entire site:
User-agent: cohere-training-data-crawler
Allow: /
Selective Access for cohere-training-data-crawler
To allow Cohere Training Crawler but restrict certain directories:
User-agent: cohere-training-data-crawler
Disallow: /private/
Disallow: /api/
Disallow: /admin/
Allow: /
cohere-training-data-crawler User-Agent String
The user-agent token for Cohere Training Crawler is:
cohere-training-data-crawler
Who Blocks Cohere Training Crawler?
Blocking (50 sites)
11freunde.de, 15min.lt, 1881.no, 20minutos.es, 47news.jp, 4p.de, 4players.de, 588ku.com, 750g.com, abendblatt.de, accountingtools.com, ad-italia.it, admagazine.fr, adnuntius.com, advrider.com, adweek.com, afp.ai, aftenbladet.no, aftenposten.no, aftonbladet.se, aftvnews.com, agrarheute.com, airfind.com, ajpmonline.org, alertustech.com, allabolag.se, allgaeuer-zeitung.de, all-in.de, allocine.fr, allure.com, amarillo.com, anglers.jp, anime-planet.com, app.com, archinect.com, architectsjournal.co.uk, architecturaldigest.com, architecturaldigest.in, ardaudiothek.de, ard.de, ardmediathek.de, ard-text.de, argusleader.com, arkansasonline.com, arstechnica.com, artexhibition.jp, askmen.com, aspetjournals.org, augsburger-allgemeine.de, augustachronicle.com