Best in Technology
News

OpenAI Touts New AI Safety Research. Critics Say It’s a Good Step, but Not Enough

By News Room · 17 July 2024 · 4 min read

OpenAI has faced opprobrium in recent months from those who suggest it may be rushing too quickly and recklessly to develop more powerful artificial intelligence. The company appears intent on showing it takes AI safety seriously. Today it showcased research that it says could help researchers scrutinize AI models even as they become more capable and useful.

The new technique is one of several ideas related to AI safety that the company has touted in recent weeks. It involves having two AI models engage in a conversation that forces the more powerful one to be more transparent, or “legible,” with its reasoning so that humans can understand what it’s up to.

“This is core to the mission of building an [artificial general intelligence] that is both safe and beneficial,” Yining Chen, a researcher at OpenAI involved with the work, tells WIRED.

So far, the work has been tested on an AI model designed to solve simple math problems. The OpenAI researchers asked the model to explain its reasoning as it answered questions or solved problems. A second model was trained to detect whether those answers were correct, and the researchers found that having the two models engage in a back-and-forth encouraged the math-solving one to be more forthright and transparent with its reasoning.
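The dynamic described above can be illustrated with a toy simulation. This is not OpenAI's actual method or code — the paper uses large language models trained with reinforcement learning — but a minimal sketch of the incentive structure: a "prover" that may or may not show its work, a "verifier" that only accepts solutions it can re-check, and a reward loop in which acceptance by the verifier nudges the prover toward legibility. All function names and parameters here are hypothetical.

```python
import random

def prover(problem, p_legible):
    # Toy prover: with probability p_legible it shows a checkable step;
    # otherwise it emits a bare answer with no visible reasoning.
    a, b = problem
    legible = random.random() < p_legible
    steps = [f"{a}+{b}={a + b}"] if legible else []
    return {"legible": legible, "steps": steps, "answer": a + b}

def verifier(solution):
    # Toy verifier: accepts only solutions whose steps it can re-check.
    # Bare answers are rejected as illegible, checkable ones re-verified.
    if not solution["steps"]:
        return False
    for step in solution["steps"]:
        lhs, rhs = step.split("=")
        x, y = lhs.split("+")
        if int(x) + int(y) != int(rhs):
            return False
    return True

def train(rounds=500, lr=0.05, seed=0):
    # The back-and-forth: the prover is rewarded only when the verifier
    # accepts, and a REINFORCE-style update shifts its policy toward
    # whichever behavior earned the reward — here, showing its steps.
    random.seed(seed)
    p = 0.1  # prover starts out mostly illegible
    for _ in range(rounds):
        problem = (random.randint(1, 9), random.randint(1, 9))
        sol = prover(problem, p)
        reward = 1.0 if verifier(sol) else 0.0
        action = 1.0 if sol["legible"] else 0.0
        p += lr * reward * (action - p)
        p = min(max(p, 0.01), 0.99)
    return p
```

Running `train()` shows the prover's legibility rate climbing from 0.1 toward the cap, since only legible solutions ever earn reward. The real research adds the complication that the verifier must also resist convincing-but-wrong solutions, which this sketch omits.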

OpenAI is publicly releasing a paper detailing the approach. “It’s part of the long-term safety research plan,” says Jan Hendrik Kirchner, another OpenAI researcher involved with the work. “We hope that other researchers can follow up, and maybe try other algorithms as well.”

Transparency and explainability are key concerns for AI researchers working to build more powerful systems. Large language models will sometimes offer up reasonable explanations for how they came to a conclusion, but a key concern is that future models may become more opaque or even deceptive in the explanations they provide—perhaps pursuing an undesirable goal while lying about it.

The research revealed today is part of a broader effort to understand how large language models that are at the core of programs like ChatGPT operate. It is one of a number of techniques that could help make more powerful AI models more transparent and therefore safer. OpenAI and other companies are exploring more mechanistic ways of peering inside the workings of large language models, too.

OpenAI has revealed more of its work on AI safety in recent weeks following criticism of its approach. In May, WIRED learned that a team of researchers dedicated to studying long-term AI risk had been disbanded. This came shortly after the departure of cofounder and key technical leader Ilya Sutskever, who was one of the board members who briefly ousted CEO Sam Altman last November.

OpenAI was founded on the promise that it would make AI both more transparent to scrutiny and safer. After the runaway success of ChatGPT and more intense competition from well-backed rivals, some people have accused the company of prioritizing splashy advances and market share over safety.

Daniel Kokotajlo, a researcher who left OpenAI and signed an open letter criticizing the company’s approach to AI safety, says the new work is important, but incremental, and that it does not change the fact that companies building the technology need more oversight. “The situation we are in remains unchanged,” he says. “Opaque, unaccountable, unregulated corporations racing each other to build artificial superintelligence, with basically no plan for how to control it.”

Another source with knowledge of OpenAI’s inner workings, who asked not to be named because they were not authorized to speak publicly, says that outside oversight of AI companies is also needed. “The question is whether they’re serious about the kinds of processes and governance mechanisms you need to prioritize societal benefit over profit,” the source says. “Not whether they let any of their researchers do some safety stuff.”

© 2025 Best in Technology. All Rights Reserved.