A recent analysis by data journalist Ben Welsh shows how widely news websites block AI bots. Over a quarter of the news websites he surveyed block Applebot-Extended, 53 percent block OpenAI’s bot, and nearly 43 percent block Google-Extended, Google’s AI-specific bot. Blocking is widespread among news publishers, but Applebot-Extended is still flying somewhat under the radar.
News publishers are divided over AI bots: some block them, while others enter into partnerships with the bot owners. The New York Times reported that Apple was trying to strike AI deals with publishers, and competitors such as OpenAI and Perplexity have already announced partnerships with various news outlets and websites. Blocking can itself be a business strategy: Originality AI founder Jon Gillham suggests that some publishers may be withholding their data until a partnership agreement is reached.
Partnerships appear to play a significant role in whether a news website blocks a given AI bot. Condé Nast’s websites blocked OpenAI’s web crawlers until a recent partnership announcement, after which the bots were unblocked. Similarly, BuzzFeed blocks Applebot-Extended but allows bots whose owners it has a partnership with, an arrangement that often involves payment. In both cases, commercial agreements shape the blocking decision.
Keeping a block list of AI agents up to date is hard work for news publishers. New AI bots keep appearing, and robots.txt files have to be edited by hand, so it is easy to lose track of which bots to block. Dark Visitors founder Gavin King notes that many publishers, worried about copyright, struggle to work out which bots they should be blocking. Some publishers therefore use services that update their block lists automatically.
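To make the robots.txt mechanics concrete, here is a minimal sketch of how a publisher might check which AI agents a robots.txt file currently turns away. It uses Python's standard `urllib.robotparser` module. The sample file contents, the `blocked_agents` helper, and the example.com URL are illustrative assumptions, not any publisher's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks two AI agents, allows everything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/news/story"):
    """Return the agents from `agents` that may not fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    agents_to_check = ["GPTBot", "Applebot-Extended", "Google-Extended"]
    print(blocked_agents(SAMPLE_ROBOTS_TXT, agents_to_check))
```

In this sample, Google-Extended has no rule of its own, so it falls through to the wildcard entry and is allowed. That is the kind of gap a manually maintained list can leave when a new agent appears. Note also that robots.txt is a request to crawlers, not an enforcement mechanism.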
The decision about which AI bots to block often rests with media executives, including the CEOs of major media companies. Some outlets state explicitly that they block AI scraping tools because they have no commercial agreement with the bot owners. Vox Media, for example, blocks Applebot-Extended and other AI scraping tools across all its properties until a commercial agreement is in place. That executives are making these calls themselves reflects how important bot blocking has become in digital publishing.
The rise of AI bots has pushed news publishers to think carefully about which crawlers they allow, and partnerships with bot owners weigh heavily in that choice. As new AI bots continue to appear, media executives will need to keep their block lists current to protect their content and their copyright.