Open Source Content Drained By AI Bots While Human Traffic Collapses, Cloudflare Finds

Open Source Websites Give More Than They Get As AI Crawlers Surge, Cloudflare Report Shows

Cloudflare’s 2025 Internet Trends Report reveals how AI bots from major platforms are heavily crawling open-web content while sending minimal users back.

AI bots are extracting far more value from the open web than they return, according to Cloudflare’s 2025 Internet Trends Report, intensifying concerns about the sustainability of open-source and open-web ecosystems.

The report shows a severe crawl-to-refer imbalance, where automated AI crawlers consume vast volumes of content while directing minimal human traffic back to original sources. This trend signals a shift toward closed, platform-centric AI systems that weaken the open web’s long-standing attribution and traffic-sharing model.

Among major AI companies, Anthropic recorded the most extreme imbalance. Its crawl-to-refer ratio peaked near 500,000:1 before stabilising between 25,000:1 and 100,000:1, meaning its bots crawled some sites hundreds of thousands of times for every single visit sent back. The data suggests product designs that prioritise on-platform engagement over source discovery.
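To make the metric concrete, here is a minimal sketch of how a crawl-to-refer ratio could be computed from two counts. The function names and the sample numbers are illustrative assumptions, not Cloudflare's published methodology or data.

```python
def crawl_to_refer_ratio(crawl_requests: int, referrals: int) -> float:
    """Return how many crawler requests occur per referred human visit.

    Both inputs are hypothetical counts for one platform over one period.
    """
    if referrals <= 0:
        raise ValueError("referrals must be positive to form a ratio")
    return crawl_requests / referrals


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' style used in the article."""
    return f"{round(ratio):,}:1"


# Made-up counts chosen to reproduce the peak figure quoted above.
peak = crawl_to_refer_ratio(crawl_requests=5_000_000, referrals=10)
print(format_ratio(peak))
```

With these assumed inputs, every referred visit corresponds to half a million crawler requests, which is the scale of imbalance the report describes at its peak.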

Cloudflare found that AI model training dominated crawler activity in 2025, generating seven to eight times more traffic than search crawling and roughly 32 times more than user-triggered crawling. OpenAI’s GPTBot accounted for much of this training-related activity.

The report also highlights the rise of dual-purpose bots. Googlebot and Microsoft’s Bingbot now perform both search indexing and AI training, despite Cloudflare’s calls for bots to declare a single, transparent purpose. At the same time, user-triggered AI bots grew more than 15 times during the year, signalling a shift toward real-time web access.

Governance gaps remain. Cloudflare blocked Perplexity’s bots in August 2025 for robots.txt violations, while Reddit later sued Perplexity over alleged scraping of search results.
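For readers unfamiliar with robots.txt, the sketch below shows how a well-behaved crawler might check a site's rules before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt contents, the `is_allowed` helper and the example URL are assumptions for illustration; this is not how Cloudflare detects violations.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one AI training crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network request
    return parser.can_fetch(user_agent, url)


url = "https://example.org/docs/"
print(is_allowed(ROBOTS_TXT, "GPTBot", url))
print(is_allowed(ROBOTS_TXT, "ExampleBrowser", url))
```

Compliance with these rules is voluntary on the crawler's side, which is why enforcement falls to site operators and intermediaries.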

Overall, the findings underscore how open-source content and open websites are increasingly exploited without proportional credit, traffic or compensation, raising urgent questions about ethical AI training and the future viability of the open internet.
