Anthropic Says New Claude 3.5 AI Model Outperforms GPT-4 Omni

Anthropic launched Claude 3.5 Sonnet on Thursday, which the AI startup says outperforms its previous AI models and OpenAI’s recently launched GPT-4 Omni on several metrics. The company also released Artifacts, a new dynamic workspace within Claude where users can edit and build upon Claude’s AI projects, such as a playable crab video game.

“Our goal with Claude isn’t to create an incrementally better LLM but to develop an AI system that can work alongside people and software in meaningful ways,” said Anthropic co-founder and CEO Dario Amodei in a press release. “Features like Artifacts are early experiments in this direction.”

This is the first release of Anthropic’s Claude 3.5 model family, which comes just three months after the launch of its Claude 3 models. Anthropic is fighting tooth and nail to keep up with product launches from OpenAI. Today, Anthropic is just launching the 3.5 version of its mid-tier model Sonnet, but the startup plans to release a 3.5 version of Haiku (its entry-level model) and Opus (its most capable version) later this year. The startup also says it’s exploring features such as web search and memory for future releases.

Anthropic’s focus is on releasing not just better models, but more features and capabilities to make their AI more useful. With Artifacts, Anthropic says users can create interactive projects within Claude. In a demo, the startup shows how you can create characters and icons within an AI-generated 8-bit crab video game, and edit the project along the way.

The AI startup says Claude 3.5 Sonnet is its strongest vision model yet, which allows Claude to perform better at visual reasoning, interpreting charts, and transcribing text from imperfect images. Anthropic claims Claude 3.5 Sonnet outperforms several vision capabilities of GPT-4 Omni, which debuted impressive vision capabilities at launch. Claude is supposedly better at visually understanding charts, documents, and math. Otherwise, Anthropic says its new Claude generally outperforms OpenAI’s ChatGPT at coding and reasoning.

Notably, Anthropic says Claude 3.5 Sonnet does a better job at parsing through what to, and what not to, answer. AI chatbots have become notorious for refusing to answer certain questions. Gizmodo did a deep dive into AI censorship a few months ago and found that Anthropic’s Claude refused to answer a lot of questions. With Claude 3.5, Anthropic feels confident in Claude’s judgment on answering inappropriate or potentially harmful questions.