Close Menu
Best in TechnologyBest in Technology
  • News
  • Phones
  • Laptops
  • Gadgets
  • Gaming
  • AI
  • Tips
  • More
    • Web Stories
    • Global
    • Press Release

Subscribe to Updates

Get the latest tech news and updates directly to your inbox.

What's On

Android 16’s stable release is right around the corner

14 May 2025

Google Announces Material 3 Expressive With Updated Dynamic Colour Themes for Android

14 May 2025

The powerful HP Omen Max 16 gaming laptop with RTX 5070 Ti is $350 off today

14 May 2025
Facebook X (Twitter) Instagram
Just In
  • Android 16’s stable release is right around the corner
  • Google Announces Material 3 Expressive With Updated Dynamic Colour Themes for Android
  • The powerful HP Omen Max 16 gaming laptop with RTX 5070 Ti is $350 off today
  • Top HP Coupon Codes for May
  • Nubia Z70S Ultra With Snapdragon 8 Elite SoC, 64-Megapixel Telephoto Camera Goes Global
  • Sandisk WD Black SN8100 NVMe SSD With Up to 14.9Gbps Read Speeds Launched in India
  • No, a lifetime VPN subscription doesn’t mean ‘your’ lifetime
  • Jessica Jones is back: Krysten Ritter to appear in Daredevil: Born Again season 2
Facebook X (Twitter) Instagram Pinterest Vimeo
Best in TechnologyBest in Technology
  • News
  • Phones
  • Laptops
  • Gadgets
  • Gaming
  • AI
  • Tips
  • More
    • Web Stories
    • Global
    • Press Release
Subscribe
Best in TechnologyBest in Technology
Home » Why ‘Multimodal AI’ Is the Hottest Thing in Tech Right Now
AI

Why ‘Multimodal AI’ Is the Hottest Thing in Tech Right Now

News RoomBy News Room15 May 20243 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
Share
Facebook Twitter LinkedIn Pinterest Email

OpenAI and Google showcased their latest and greatest AI technology this week. For the last two years, tech companies have raced to make AI models smarter, but now a new focus has emerged: make them multimodal. OpenAI and Google are zeroing in on AI that can seamlessly switch between its robotic mouth, eyes, and ears.

“Multimodal” is the biggest buzzword as tech companies place bets on the most enticing form of their AI models in your everyday life. AI chatbots have lost their luster since ChatGPT’s launch in 2022. So companies are hoping that talking to and visually sharing things with an AI assistant feels more natural than typing. When you see multimodal AI done well, it feels like science fiction come to life.

On Monday, OpenAI showed off GPT-4 Omni, which was oddly reminiscent of the dystopian movie about lost human connection Her. Omni stands for “omnichannel,” and OpenAI touted the model’s ability to process video alongside audio. The demo showed ChatGPT looking at a math problem through a phone camera, as an OpenAI staff member verbally asked the chatbot to walk them through it. OpenAI says it’s rolling out now to Premium users.

The next day, Google unveiled Project Astra, which promised to do roughly the same thing. Gizmodo’s Florence Ion used multimodal AI to identify what faux flowers she was looking at, which it correctly identified as tulips. However, Project Astra seemed a little slower than GPT-4o, and the voice was far more robotic. More Siri than Her, but I’ll let you decide whether that’s a good thing. Google says this is in the early stages, however, and even notes some current challenges that OpenAI has overcome.

“While we’ve made incredible progress developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge,” said Google in a blog post.

Now you might remember Google’s Gemini demo video from Dec. 2023 that turned out to be highly manipulated. Six months later, Google still isn’t ready to release what it showed in that video, but OpenAI is speeding ahead with GPT-4o. Multimodal AI represents the next big race in AI development, and OpenAI seems to be winning.

A key difference maker for GPT-4o is that the single AI model can natively process audio, video, and text. Previously, OpenAI needed separate AI models to translate speech and video into text so that the underlying GPT-4, which is language-based, could understand these different mediums. It seems like Google may still be using multiple AI models to perform these tasks, given the slower response times.

We’ve also seen a wider adoption of AI wearables as tech companies embrace multimodal AI. The Humane AI Pin, Rabbit R1, and Meta Ray-Bans are all examples of AI-enabled devices that utilize these various mediums. These devices promise to make us less dependent on smartphones, though it’s possible that Siri and Google Assistant will also be empowered with multimodal AI soon enough.

Multimodal AI is likely something you’ll hear a lot more about in the months and years to come. Its development and integration into products could make AI significantly more useful. The technology ultimately takes the weight off of you to transcribe the world to an LLM and allows the AI to “see” and “hear” the world for itself.

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleSuper Monkey Ball Banana Rumble Preview – Bringing Monkey Ball Back In 2024
Next Article Google Pixel 8a Gets AI Wallpaper Generator With First Software Update: Report

Related Articles

AI

Doom vs Boom: The Battle to Enshrine AI’s Future Into California Law

24 June 2024
AI

Perplexity Is Reportedly Letting Its AI Break a Basic Rule of the Internet

20 June 2024
AI

Anthropic Says New Claude 3.5 AI Model Outperforms GPT-4 Omni

20 June 2024
AI

Call Centers Introduce ‘Emotion Canceling’ AI as a ‘Mental Shield’ for Workers

18 June 2024
AI

AI Turns Classic Memes Into Hideously Animated Garbage

17 June 2024
AI

May ‘AI’ Take Your Order? McDonald’s Says Not Yet

17 June 2024
Demo
Top Articles

Costco partners with Electric Era to bring back EV charging in the U.S.

28 October 202493 Views

ChatGPT o1 vs. o1-mini vs. 4o: Which should you use?

15 December 202486 Views

5 laptops to buy instead of the M4 MacBook Pro

17 November 202457 Views

Subscribe to Updates

Get the latest tech news and updates directly to your inbox.

Latest News
Laptops

Sandisk WD Black SN8100 NVMe SSD With Up to 14.9Gbps Read Speeds Launched in India

News Room14 May 2025
News

No, a lifetime VPN subscription doesn’t mean ‘your’ lifetime

News Room14 May 2025
News

Jessica Jones is back: Krysten Ritter to appear in Daredevil: Born Again season 2

News Room14 May 2025
Most Popular

The Spectacular Burnout of a Solar Panel Salesman

13 January 2025120 Views

Costco partners with Electric Era to bring back EV charging in the U.S.

28 October 202493 Views

ChatGPT o1 vs. o1-mini vs. 4o: Which should you use?

15 December 202486 Views
Our Picks

Top HP Coupon Codes for May

14 May 2025

Nubia Z70S Ultra With Snapdragon 8 Elite SoC, 64-Megapixel Telephoto Camera Goes Global

14 May 2025

Sandisk WD Black SN8100 NVMe SSD With Up to 14.9Gbps Read Speeds Launched in India

14 May 2025

Subscribe to Updates

Get the latest tech news and updates directly to your inbox.

Facebook X (Twitter) Instagram Pinterest
  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact Us
© 2025 Best in Technology. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.