Best in Technology
The Race to Block OpenAI’s Scraping Bots Is Slowing Down

By News Room | 7 October 2024 | 3 Mins Read

It’s too soon to say how the spate of deals between AI companies and publishers will shake out. OpenAI has already scored one clear win, though: Its web crawlers aren’t getting blocked by top news outlets at the rate they once were.

The generative AI boom sparked a gold rush for data—and a subsequent data-protection rush (for most news websites, anyway) in which publishers sought to block AI crawlers and prevent their work from becoming training data without consent. When Apple debuted a new AI agent this summer, for example, a slew of top news outlets swiftly opted out of Apple’s web scraping using the Robots Exclusion Protocol, or robots.txt, the file that allows webmasters to control bots. There are so many new AI bots on the scene that it can feel like playing whack-a-mole to keep up.
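For readers unfamiliar with the mechanics, here is a minimal sketch of what such an opt-out looks like and how a well-behaved crawler would interpret it. The robots.txt content and the `is_blocked` helper are hypothetical illustrations, not taken from any real outlet; `GPTBot` is the user-agent token OpenAI's crawler identifies itself with. The parsing is done with Python's standard-library `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary news site: it turns away
# OpenAI's GPTBot entirely while leaving the site open to other crawlers.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Return True if robots_txt tells user_agent not to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("GPTBot blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "GPTBot"))
    print("SomeOtherBot blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot"))
```

Note that the file only expresses a request: nothing in it technically prevents a crawler from fetching the pages anyway.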

OpenAI’s GPTBot has the most name recognition and is also more frequently blocked than competitors like Google AI. The number of high-ranking media websites using robots.txt to “disallow” OpenAI’s GPTBot dramatically increased from its August 2023 launch until that fall, then steadily (but more gradually) rose from November 2023 to April 2024, according to an analysis of 1,000 popular news outlets by Ontario-based AI detection startup Originality AI. At its peak, just over a third of the websites blocked GPTBot; the share has since dropped closer to a quarter. Within a smaller pool of the most prominent news outlets, the block rate is still above 50 percent, but it’s down from a high of almost 90 percent earlier this year.
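A block rate of this kind is simply the share of surveyed sites whose robots.txt turns the crawler away. The sketch below shows one way such a figure could be computed; it is an assumption-laden illustration, not Originality AI's actual methodology. The site names, the robots.txt contents, and the helpers `blocks_agent` and `block_rate` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocks_agent(robots_txt: str, user_agent: str) -> bool:
    """Return True if robots_txt disallows user_agent from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "https://example.com/")


def block_rate(robots_by_site: dict[str, str], user_agent: str) -> float:
    """Fraction of sites (0.0 to 1.0) whose robots.txt blocks user_agent."""
    if not robots_by_site:
        return 0.0
    blocked = sum(blocks_agent(txt, user_agent)
                  for txt in robots_by_site.values())
    return blocked / len(robots_by_site)


# Hypothetical sample: four made-up sites, one of which blocks GPTBot.
SAMPLE_SITES = {
    "site-a.example": "User-agent: GPTBot\nDisallow: /\n",
    "site-b.example": "User-agent: *\nAllow: /\n",
    "site-c.example": "User-agent: *\nDisallow: /private/\n",
    "site-d.example": "",  # an empty robots.txt places no restrictions
}

if __name__ == "__main__":
    rate = block_rate(SAMPLE_SITES, "GPTBot")
    print(f"GPTBot block rate: {rate:.0%}")
```

Here only a block of the site root counts as "blocking"; a real survey would have to decide how to treat partial disallows, missing files, and sites that block at the server level rather than through robots.txt.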

But in May 2024, after Dotdash Meredith announced a licensing deal with OpenAI, that number dipped significantly. It dipped again at the end of May when Vox announced its own arrangement, and once more this August when WIRED’s parent company, Condé Nast, struck a deal. The trend toward increased blocking appears to be over, at least for now.

These dips make obvious sense. When companies enter into partnerships and give permission for their data to be used, they’re no longer incentivized to barricade it, so it would follow that they would update their robots.txt files to permit crawling; make enough deals and the overall percentage of sites blocking crawlers will almost certainly go down. Some outlets unblocked OpenAI’s crawlers on the very same day that they announced a deal, like The Atlantic. Others took a few days to a few weeks, like Vox, which announced its partnership at the end of May but which unblocked GPTBot on its properties toward the end of June.

Robots.txt is not legally binding, but it has long functioned as the standard that governs web crawler behavior. For most of the internet’s existence, people running webpages expected each other to abide by the file. When a WIRED investigation earlier this summer found that the AI startup Perplexity was likely choosing to ignore robots.txt commands, Amazon’s cloud division launched an investigation into whether Perplexity had violated its rules. It’s not a good look to ignore robots.txt, which likely explains why so many prominent AI companies—including OpenAI—explicitly state that they use it to determine what to crawl. Originality AI CEO Jon Gillham believes that this adds extra urgency to OpenAI’s push to make agreements. “It’s clear that OpenAI views being blocked as a threat to their future ambitions,” says Gillham.

© 2025 Best in Technology. All Rights Reserved.
