Webzio-Extended
Webzio-Extended is one of two crawlers operated by Webz.io, a company that sells web data to AI businesses. Its specific role is to act as an ethical filter: it reviews the content gathered by its companion bot (Webzio) and marks each piece as permitted or not permitted for AI and ML training. If your site allows this crawler, your content may end up in the training datasets used by commercial AI models, which can increase the chances of those models recognizing and mentioning your business when users ask about what you offer.
- User-agent
- webzio-extended
- webzio-extended (+https://webz.io/bot.html)
- Does it respect robots.txt?
- Yes
- Official documentation
- https://webz.io/blog/company/an-overview-of-the-webz-io-duo-of-crawlers/
How to allow it in your robots.txt
User-agent: webzio-extended
Allow: /
How to block it (not recommended)
User-agent: webzio-extended
Disallow: /
Frequently asked questions
Should I block Webzio-Extended?
If your goal is to be more visible in AI responses, probably not. This bot respects robots.txt, so you stay in control. Blocking it means your content does not reach the training data that Webz.io sells to AI companies, which reduces the chances of those models learning about your business.
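Because the bot follows robots.txt, you can check what a set of rules tells it before you publish them. The following is a minimal sketch in Python, assuming the standard library's urllib.robotparser interprets the rules the same way the crawler does (real crawlers may differ in edge cases). The helper name is_allowed and the example.com URL are illustrative and do not come from Webz.io.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "webzio-extended") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two rule sets shown earlier on this page.
ALLOW_RULES = """User-agent: webzio-extended
Allow: /
"""

BLOCK_RULES = """User-agent: webzio-extended
Disallow: /
"""

if __name__ == "__main__":
    # Hypothetical page on a placeholder domain.
    page = "https://example.com/pricing"
    print("allow rules:", is_allowed(ALLOW_RULES, page))
    print("block rules:", is_allowed(BLOCK_RULES, page))
```

To test your live file instead, paste its contents into a string and pass it to is_allowed along with one of your own URLs.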
How does this bot affect my visibility in AI tools?
Webzio-Extended decides what content gets labeled as suitable for training AI models. If your site is well-written and accessible, and this bot processes it, your content can become part of the datasets that future commercial AI models learn from, making it more likely they will mention your brand when someone asks about what you offer.
How can I tell if Webzio-Extended is visiting my site?
Check your server logs, or the dashboard of your hosting provider or CDN (Cloudflare, for example). Search for 'webzio-extended' in the user-agent field: every visit from this bot is recorded under that name.
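If you have raw access logs, the search can be scripted. The following is a rough sketch in Python, assuming your server writes the common "combined" log format, where the user agent is the last quoted field on each line. The function name find_bot_hits and the sample log lines are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumption: combined log format, where the user agent is the last quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def find_bot_hits(lines: Iterable[str], token: str = "webzio-extended") -> List[str]:
    """Return the log lines whose user-agent field contains `token` (case-insensitive)."""
    hits = []
    for line in lines:
        match = UA_FIELD.search(line)
        if match and token in match.group(1).lower():
            hits.append(line.rstrip("\n"))
    return hits


# Hypothetical log lines using documentation-only IP addresses.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"webzio-extended (+https://webz.io/bot.html)"',
    '198.51.100.4 - - [01/Jan/2025:10:00:05 +0000] "GET /about HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0"',
]

if __name__ == "__main__":
    for hit in find_bot_hits(SAMPLE_LOG):
        print(hit)
```

For a real log, open the file and pass the file object to find_bot_hits in place of SAMPLE_LOG. Keep in mind that a user-agent string can be sent by any client, so a match shows what the visitor claimed to be, not a verified identity.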