Anthropic aims to fix one of the biggest problems in AI right now

By News Room, 2 July 2024

Hot on the heels of the announcement that its Claude 3.5 Sonnet large language model beat out other leading models, including GPT-4o and Llama-400B, AI startup Anthropic announced Monday that it plans to launch a new program to fund the development of independent, third-party benchmark tests against which to evaluate its upcoming models.

Per a blog post, the company is willing to pay third-party developers to create benchmarks that can “effectively measure advanced capabilities in AI models.”

“Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem,” Anthropic wrote in a Monday blog post. “Developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply.”

The company wants submitted benchmarks to help measure the relative “safety level” of an AI based on a number of factors, including how well it resists attempts to coerce responses that pose risks in areas such as cybersecurity; chemical, biological, radiological, and nuclear (CBRN) threats; misalignment; social manipulation; and other national security concerns. Anthropic is also looking for benchmarks to help evaluate models’ advanced capabilities and is willing to fund the “development of tens of thousands of new evaluation questions and end-to-end tasks that would challenge even graduate students,” essentially testing a model’s ability to synthesize knowledge from a variety of sources, refuse cleverly worded malicious user requests, and respond in multiple languages.

Anthropic is looking for “sufficiently difficult,” high-volume tasks that can involve as many as “thousands” of testers across a diverse set of test formats, and that help inform the company’s “realistic and safety-relevant” threat modeling efforts. Interested developers can submit their proposals to the company, which plans to evaluate them on a rolling basis.










