Best in Technology

A New Benchmark for the Risks of AI

By News Room | 4 December 2024 | 2 Mins Read

MLCommons, a nonprofit that helps companies measure the performance of their artificial intelligence systems, is launching a new benchmark to gauge AI’s bad side too.

The new benchmark, called AILuminate, assesses the responses of large language models to more than 12,000 test prompts in 12 categories including inciting violent crime, child sexual exploitation, hate speech, promoting self-harm, and intellectual property infringement.

Models are given a score of “poor,” “fair,” “good,” “very good,” or “excellent,” depending on how they perform. The prompts used to test the models are kept secret to prevent them from ending up as training data that would allow a model to ace the test.

Peter Mattson, founder and president of MLCommons and a senior staff engineer at Google, says that measuring the potential harms of AI models is technically difficult, leading to inconsistencies across the industry. “AI is a really young technology, and AI testing is a really young discipline,” he says. “Improving safety benefits society; it also benefits the market.”

Reliable, independent ways of measuring AI risks may become more relevant under the next US administration. Donald Trump has promised to get rid of President Biden’s AI Executive Order, which introduced measures aimed at ensuring that companies use AI responsibly, along with a new AI Safety Institute to test powerful models.

The effort could also provide more of an international perspective on AI harms. MLCommons counts a number of international firms, including the Chinese companies Huawei and Alibaba, among its member organizations. If these companies all used the new benchmark, it would provide a way to compare AI safety in the US, China, and elsewhere.

Some large US AI providers have already used AILuminate to test their models. Anthropic’s Claude model, Google’s smaller model Gemma, and a model from Microsoft called Phi all scored “very good” in testing. OpenAI’s GPT-4o and Meta’s largest Llama model both scored “good.” The only model to score “poor” was OLMo from the Allen Institute for AI, although Mattson notes that this is a research offering not designed with safety in mind.

“Overall, it’s good to see scientific rigor in the AI evaluation processes,” says Rumman Chowdhury, CEO of Humane Intelligence, a nonprofit that specializes in testing or red-teaming AI models for misbehaviors. “We need best practices and inclusive methods of measurement to determine whether AI models are performing the way we expect them to.”

© 2025 Best in Technology. All Rights Reserved.