Best in Technology
OpenAI Wants AI to Help Humans Train AI

By News Room | 27 June 2024 | 4 Mins Read

One of the key ingredients that made ChatGPT a ripsnorting success was an army of human trainers who gave the artificial intelligence model behind the bot guidance on what constitutes good and bad outputs. OpenAI now says that adding even more AI into the mix, to assist those human trainers, could help make AI helpers smarter and more reliable.

In developing ChatGPT, OpenAI pioneered the use of reinforcement learning from human feedback, or RLHF. This technique uses input from human testers to fine-tune an AI model so that its output is judged to be more coherent, less objectionable, and more accurate. The ratings the trainers give feed into an algorithm that drives the model’s behavior. The technique has proven crucial both to making chatbots more reliable and useful and to preventing them from misbehaving.
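
To make the idea of ratings feeding into an algorithm concrete, here is a minimal sketch of one common way pairwise human preferences are turned into numeric scores, using a Bradley-Terry style model. It is an illustration only: the article does not describe OpenAI's actual pipeline, and the function name `fit_rewards`, the toy preference data, and the learning-rate and regularization values are all assumptions made for this example.

```python
import math


def fit_rewards(num_outputs, preferences, lr=0.1, steps=2000, l2=0.01):
    """Fit one scalar score per output from (winner, loser) index pairs.

    Assumes a Bradley-Terry model: P(winner preferred) = sigmoid(r_w - r_l).
    Plain gradient ascent on the log-likelihood, with a small L2 penalty
    so the scores stay bounded. All hyperparameters are illustrative.
    """
    rewards = [0.0] * num_outputs
    for _ in range(steps):
        grads = [-l2 * r for r in rewards]
        for winner, loser in preferences:
            diff = rewards[winner] - rewards[loser]
            p_win = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of log(p_win) is (1 - p_win) for the winner
            # and -(1 - p_win) for the loser.
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        rewards = [r + lr * g for r, g in zip(rewards, grads)]
    return rewards


# Hypothetical trainer judgments over three candidate outputs (0, 1, 2):
# each pair means "the first output was rated better than the second".
preferences = [(0, 1), (0, 2), (1, 2), (0, 1), (2, 1)]
rewards = fit_rewards(3, preferences)
best_output = max(range(3), key=lambda i: rewards[i])
print(rewards, best_output)
```

In a full RLHF setup, scores like these would come from a learned reward model rather than a per-output lookup table, and the chatbot would then be fine-tuned to produce outputs that score highly. The sketch only shows the first step, converting human comparisons into a signal a training algorithm can optimize.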

“RLHF does work very well, but it has some key limitations,” says Nat McAleese, a researcher at OpenAI involved with the new work. For one thing, human feedback can be inconsistent. For another, it can be difficult for even skilled humans to rate extremely complex outputs, such as sophisticated software code. The process can also optimize a model to produce output that seems convincing rather than output that is actually accurate.

OpenAI developed a new model by fine-tuning its most powerful offering, GPT-4, to assist human trainers tasked with assessing code. The company found that the new model, dubbed CriticGPT, could catch bugs that humans missed, and that human judges found its critiques of code to be better 63 percent of the time. OpenAI will look at extending the approach to areas beyond code in the future.

“We’re starting work to integrate this technique into our RLHF chat stack,” McAleese says. He notes that the approach is imperfect, since CriticGPT can also make mistakes by hallucinating, but he adds that the technique could help make OpenAI’s models as well as tools like ChatGPT more accurate by reducing errors in human training. He adds that it might also prove crucial in helping AI models become much smarter, because it may allow humans to help train an AI that exceeds their own abilities. “And as models continue to get better and better, we suspect that people will need more help,” McAleese says.

The new technique is one of many now being developed to improve large language models and squeeze more abilities out of them. It is also part of an effort to ensure that AI behaves in acceptable ways even as it becomes more capable.

Earlier this month, Anthropic, a rival to OpenAI founded by ex-OpenAI employees, announced a more capable version of its own chatbot, called Claude, thanks to improvements in the model’s training regimen and the data it is fed. Anthropic and OpenAI have both also recently touted new ways of inspecting AI models to understand how they arrive at their output in order to better prevent unwanted behavior such as deception.

The new technique might help OpenAI train increasingly powerful AI models while ensuring their output is more trustworthy and aligned with human values, especially if the company successfully deploys it in more areas than code. OpenAI has said that it is training its next major AI model, and the company is evidently keen to show that it is serious about ensuring that it behaves. This follows the dissolution of a prominent team dedicated to assessing the long-term risks posed by AI. The team was co-led by Ilya Sutskever, a cofounder of the company and former board member who briefly pushed CEO Sam Altman out of the company before recanting and helping him regain control. Several members of that team have since criticized the company for moving riskily as it rushes to develop and commercialize powerful AI algorithms.

Dylan Hadfield-Menell, a professor at MIT who researches ways to align AI, says the idea of having AI models help train more powerful ones has been kicking around for a while. “This is a pretty natural development,” he says.

Hadfield-Menell notes that the researchers who originally developed techniques used for RLHF discussed related ideas several years ago. He says it remains to be seen how generally applicable and powerful it is. “It might lead to big jumps in individual capabilities, and it might be a stepping stone towards sort of more effective feedback in the long run,” he says.
