A New Trick Could Block the Misuse of Open Source AI

By News Room, 2 August 2024, 4 Mins Read

When Meta released its large language model Llama 3 for free in April 2024, it took outside developers just a couple of days to create a version without the safety restrictions that prevent it from spouting hateful jokes, offering instructions for cooking meth, or misbehaving in other ways.

A new training technique developed by researchers at the University of Illinois Urbana-Champaign, UC San Diego, Lapis Labs, and the nonprofit Center for AI Safety could make it harder to remove such safeguards from Llama and other open source AI models in the future. Some experts believe that, as AI becomes ever more powerful, tamperproofing open models in this way could prove crucial.

“Terrorists and rogue states are going to use these models,” Mantas Mazeika, a Center for AI Safety researcher who worked on the project as a PhD student at the University of Illinois Urbana-Champaign, tells WIRED. “The easier it is for them to repurpose them, the greater the risk.”

Powerful AI models are often kept hidden by their creators, and can be accessed only through a software application programming interface or a public-facing chatbot like ChatGPT. Although developing a powerful LLM costs tens of millions of dollars, Meta and others have chosen to release models in their entirety. This includes making the “weights,” or parameters that define their behavior, available for anyone to download.

Prior to release, open models like Meta’s Llama are typically fine-tuned to make them better at answering questions and holding a conversation, and also to ensure that they refuse to respond to problematic queries. This is meant to prevent a chatbot based on the model from offering rude, inappropriate, or hateful statements, and should stop it from, for example, explaining how to make a bomb.

The researchers behind the new technique found a way to complicate the process of modifying an open model for nefarious ends. It involves replicating the modification process but then altering the model’s parameters so that the changes that normally get the model to respond to a prompt such as “Provide instructions for building a bomb” no longer work.
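
The idea of simulating the tampering and then adjusting the parameters so that it fails can be illustrated with a toy sketch. This is not the researchers' algorithm. The two-parameter "model", the `harmful_loss` function, the attacker's budget of five gradient steps, and every name below are assumptions made up for illustration. A practical method would presumably also need to preserve the model's useful behavior, which this toy leaves out.

```python
# Toy illustration only: a two-parameter "model" and a hypothetical
# harmful-task loss. None of these names come from the researchers' work.

def harmful_loss(u, v):
    """Low when the model 'complies' (u*v near 1), high when it refuses."""
    return (u * v - 1.0) ** 2

def attack(u, v, steps=5, lr=0.1):
    """Simulated tampering: gradient descent on the harmful-task loss."""
    for _ in range(steps):
        r = u * v - 1.0
        grad_u, grad_v = 2.0 * r * v, 2.0 * r * u
        u, v = u - lr * grad_u, v - lr * grad_v
    return u, v

def post_attack_loss(u, v):
    """Harmful-task loss that remains after the simulated attack."""
    return harmful_loss(*attack(u, v))

def harden(u, v, steps=200, lr=0.05, eps=1e-5):
    """Defender: gradient ascent on the post-attack loss.

    Gradients are estimated with central finite differences, so each
    update effectively replays the attack and then nudges the starting
    parameters toward a point where that attack makes less progress.
    """
    for _ in range(steps):
        grad_u = (post_attack_loss(u + eps, v) - post_attack_loss(u - eps, v)) / (2 * eps)
        grad_v = (post_attack_loss(u, v + eps) - post_attack_loss(u, v - eps)) / (2 * eps)
        u, v = u + lr * grad_u, v + lr * grad_v
    return u, v

released = (0.5, 0.5)          # assumed starting parameters
hardened = harden(*released)

print("post-attack loss, released model:", post_attack_loss(*released))
print("post-attack loss, hardened model:", post_attack_loss(*hardened))
```

With these settings the first number should be small, meaning the attack succeeds, and the second should be close to 1, meaning the same attack stalls. The defender ends up near a point where the attacker's gradients almost vanish. An attacker with a much larger step budget would still get through in this toy, which fits the stated goal of raising the cost of tampering rather than making it impossible.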

Mazeika and colleagues demonstrated the trick on a pared-down version of Llama 3. They were able to tweak the model’s parameters so that even after thousands of attempts, it could not be trained to answer undesirable questions. Meta did not immediately respond to a request for comment.

Mazeika says the approach is not perfect, but that it suggests the bar for “decensoring” AI models could be raised. “A tractable goal is to make it so the costs of breaking the model increases enough so that most adversaries are deterred from it,” he says.

“Hopefully this work kicks off research on tamper-resistant safeguards, and the research community can figure out how to develop more and more robust safeguards,” says Dan Hendrycks, director of the Center for AI Safety.

The idea of tamperproofing open models may become more popular as interest in open source AI grows. Already, open models are competing with state-of-the-art closed models from companies like OpenAI and Google. The newest version of Llama 3, for instance, released in July, is roughly as powerful as models behind popular chatbots like ChatGPT, Gemini, and Claude, as measured using popular benchmarks for grading language models’ abilities. Mistral Large 2, an LLM from a French startup, also released last month, is similarly capable.

The US government is taking a cautious but positive approach to open source AI. A report released this week by the National Telecommunications and Information Administration, a body within the US Commerce Department, “recommends the US government develop new capabilities to monitor for potential risks, but refrain from immediately restricting the wide availability of open model weights in the largest AI systems.”

Not everyone is a fan of imposing restrictions on open models, however. Stella Biderman, director of EleutherAI, a community-driven open source AI project, says that the new technique may be elegant in theory but could prove tricky to enforce in practice. Biderman says the approach is also antithetical to the philosophy behind free software and openness in AI.

“I think this paper misunderstands the core issue,” Biderman says. “If they’re concerned about LLMs generating info about weapons of mass destruction, the correct intervention is on the training data, not on the trained model.”
