Best in Technology
The Math on AI Agents Doesn’t Add Up

By News Room | 23 January 2026 | 4 Mins Read

The big AI companies promised us that 2025 would be “the year of the AI agents.” It turned out to be the year of talking about AI agents, and kicking the can for that transformational moment to 2026 or maybe later. But what if the answer to the question “When will our lives be fully automated by generative AI robots that perform our tasks for us and basically run the world?” is, like that New Yorker cartoon, “How about never?”

That was basically the message of a paper published without much fanfare some months ago, smack in the middle of the overhyped year of “agentic AI.” Entitled “Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models,” it purports to mathematically show that “LLMs are incapable of carrying out computational and agentic tasks beyond a certain complexity.” Though the science is beyond me, the authors—a former SAP CTO who studied AI under one of the field’s founding intellects, John McCarthy, and his teenage prodigy son—punctured the vision of agentic paradise with the certainty of mathematics. Even reasoning models that go beyond the pure word-prediction process of LLMs, they say, won’t fix the problem.

“There is no way they can be reliable,” Vishal Sikka, the dad, tells me. After a career that, in addition to SAP, included stints as Infosys CEO and as an Oracle board member, he currently heads an AI services startup called Vianai. “So we should forget about AI agents running nuclear power plants?” I ask. “Exactly,” he says. Maybe you can get it to file some papers or something to save time, but you might have to resign yourself to some mistakes.

The AI industry begs to differ. For one thing, a big success in agent AI has been coding, which took off last year. Just this week at Davos, Google’s Nobel-winning head of AI, Demis Hassabis, reported breakthroughs in minimizing hallucinations, and hyperscalers and startups alike are pushing the agent narrative. Now they have some backup. A startup called Harmonic is reporting a breakthrough in AI coding that also hinges on mathematics—and tops benchmarks on reliability.

Harmonic, which was cofounded by Robinhood CEO Vlad Tenev and Tudor Achim, a Stanford-trained mathematician, claims this recent improvement to its product called Aristotle (no hubris there!) is an indication that there are ways to guarantee the trustworthiness of AI systems. “Are we doomed to be in a world where AI just generates slop and humans can’t really check it? That would be a crazy world,” says Achim. Harmonic’s solution is to use formal methods of mathematical reasoning to verify an LLM’s output. Specifically, it encodes outputs in the Lean programming language, which is known for its ability to mechanically check that code and proofs are correct. To be sure, Harmonic’s focus to date has been narrow: its key mission is the pursuit of “mathematical superintelligence,” and coding is a somewhat organic extension. Things like history essays, which can’t be mathematically verified, are beyond its boundaries. For now.
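
For readers curious what "verifying output in Lean" can look like, here is a minimal sketch. It is not Harmonic's pipeline or Aristotle's actual output; the function `double` and the theorem `double_spec` are hypothetical names invented for illustration, and the snippet assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Hypothetical stand-in for a small function a model might generate.
def double (n : Nat) : Nat := n + n

-- A single test case: Lean checks this by evaluating both sides.
example : double 21 = 42 := rfl

-- The specification, stated for every natural number rather than a few samples.
-- Lean accepts the file only if the proof below actually establishes the claim.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double   -- replace `double n` with its definition `n + n`
  omega           -- close the remaining linear-arithmetic goal
```

If the definition were subtly wrong (say, `n + n + 1`), the proof would fail to check and Lean would reject the file. That is the sense in which a formally stated claim can be verified rather than merely trusted, and also why claims that cannot be stated formally fall outside this approach.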

Nonetheless, Achim doesn’t seem to think that reliable agentic behavior is as much an issue as some critics believe. “I would say that most models at this point have the level of pure intelligence required to reason through booking a travel itinerary,” he says.

Both sides are right—or maybe even on the same side. On one hand, everyone agrees that hallucinations will continue to be a vexing reality. In a paper published last September, OpenAI scientists wrote, “Despite significant progress, hallucinations continue to plague the field, and are still present in the latest models.” They proved that unhappy claim by asking three models, including ChatGPT, to provide the title of the lead author’s dissertation. All three made up fake titles and all misreported the year of publication. In a blog about the paper, OpenAI glumly stated that in AI models, “accuracy will never reach 100 percent.”
