For all the noise around AI conquering chess, Go, and now even coding, there is still a glaring weakness hiding underneath those wins: AI remains remarkably bad at handling a video game it has never seen before.

A new paper from NYU researchers argues that these headline-grabbing milestones have painted a misleading picture of how close machines are to real general intelligence.

That distinction really matters.

Chess and Go are impressive achievements, but they are games with fixed rules and structured environments, far simpler than complex modern video games. The NYU researchers argue that AI has yet to reach human-like intelligence precisely because it adapts so poorly to the unfamiliar.

Where AI remains lacking

According to researchers, many of AI’s biggest gaming successes are based on systems that are finely tuned to one specific game. In those defined boundaries, AI can basically become superhuman. But as soon as there are slight changes to the rules or environments, its impressive performance can collapse.

This is where video games come in as a real test of machine intelligence. Games aren't one-dimensional; they often demand a vast range of skills, including spatial reasoning, long-term planning, trial-and-error learning, and even social intuition. The paper claims that this variety makes gaming a far better measure of flexible intelligence than isolated benchmark tasks.

Reinforcement learning and LLMs both hit a wall

The paper adds that reinforcement learning can produce impressive results, but strong performance only emerges after millions or billions of simulated runs. The system becomes an expert in the exact situation it was trained on, and that expertise falls apart the moment anything changes. Even something as simple as shifted colors or repositioned objects on a screen can break it.
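The brittleness described here can be illustrated with a toy sketch (our own illustration, not code from the paper): a tabular Q-learning agent learns to reach a goal at one end of a tiny one-dimensional track, and the frozen policy then fails completely when the goal is moved to the other end.

```python
import random

random.seed(0)

TRACK = 5            # positions 0..4 on a one-dimensional track
ACTIONS = (-1, +1)   # step left or step right

def step(s, a):
    """Clamp movement so the agent stays on the track."""
    return min(max(s + a, 0), TRACK - 1)

def train(goal, start=2, episodes=2000, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning specialized to one fixed goal position."""
    q = {(s, a): 0.0 for s in range(TRACK) for a in ACTIONS}
    for _ in range(episodes):
        s = start
        for _ in range(20):
            # Epsilon-greedy action selection during training.
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            s2 = step(s, a)
            reward = 1.0 if s2 == goal else 0.0
            target = reward + gamma * max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
            if s == goal:
                break
    return q

def greedy_steps(q, goal, start=2, limit=20):
    """Follow the learned policy; return steps to the goal, or None if it never arrives."""
    s = start
    for t in range(limit):
        s = step(s, max(ACTIONS, key=lambda x: q[(s, x)]))
        if s == goal:
            return t + 1
    return None

q = train(goal=4)
print(greedy_steps(q, goal=4))  # original goal: the policy reaches it
print(greedy_steps(q, goal=0))  # goal moved: the same policy never arrives
```

The agent is "superhuman" at its one training task, yet a single changed detail (the goal's position) leaves it marching confidently in the wrong direction, with no mechanism to notice or adapt.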

LLMs (Large Language Models) do not solve this either. NYU says they perform surprisingly poorly on unfamiliar games. When an LLM does do well, it usually relies on custom, game-specific scaffolding to interpret game states, manage memory, and execute actions. Strip that extra support away, and performance drops fast.

The real benchmark

The researchers argue that a true game-playing AI would need to learn a new game from scratch in roughly the same amount of time as a skilled human player, perhaps tens of hours, without massive simulation or prior exposure. That is beyond the capabilities of current systems.

And that is why this matters beyond gaming. If AI cannot reliably adapt to a brand-new video game, it is even less likely to handle the unpredictability of the real world. Chess may still make for a good headline, but modern games are showing just how far AI still has to go.
