Best in Technology
Google’s AI just got ears

By News Room | 10 April 2024 | 3 Mins Read

AI chatbots can already “see” the world through images and video. Now Google has announced audio-to-text capabilities as part of its latest update to Gemini Pro. Gemini 1.5 Pro can “hear” audio files uploaded to it and extract information from them as text.

The company has made this LLM version available as a public preview on its Vertex AI development platform. This opens the feature to more enterprise-focused users after a private rollout in February, when the model was first announced and offered only to a limited group of developers and enterprise customers.

1. Breaking down + understanding a long video

I uploaded the entire NBA dunk contest from last night and asked which dunk had the highest score.

Gemini 1.5 was incredibly able to find the specific perfect 50 dunk and details from just its long context video understanding! pic.twitter.com/01iUfqfiAO

— Rowan Cheung (@rowancheung) February 18, 2024

Google shared details of the update at its Cloud Next conference, which is currently taking place in Las Vegas. After calling the Gemini Ultra LLM that powers its Gemini Advanced chatbot the most powerful model of its Gemini family, Google is now calling Gemini 1.5 Pro its most capable generative model. The company added that this version is better at learning new tasks from information in a prompt, without additional fine-tuning of the model.

Gemini 1.5 Pro is multimodal: it can turn different types of audio into text, including TV shows, movies, radio broadcasts, and conference call recordings. It is also multilingual and can process audio in several languages. The LLM may also be able to create transcripts from videos, though their quality may be unreliable, as TechCrunch noted.

When the model was first announced, Google explained that Gemini 1.5 Pro processes raw data as tokens. A million tokens equate to approximately 700,000 words or 30,000 lines of code. In media form, that is about an hour of video or around 11 hours of audio.
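
To make that arithmetic concrete, here is a minimal Python sketch that turns the approximate figures quoted above into a rough token estimate. It is only an illustration: the ratios are the article's round numbers, not the output of a real tokenizer, and the names `CAPACITY_PER_MILLION`, `estimate_tokens`, and `fits_in_budget` are hypothetical.

```python
# Approximate amount of content per one million tokens, taken from the
# round figures quoted in the article (assumptions, not tokenizer output).
TOKENS_PER_MILLION = 1_000_000
CAPACITY_PER_MILLION = {
    "words": 700_000,
    "lines_of_code": 30_000,
    "video_hours": 1,
    "audio_hours": 11,
}


def estimate_tokens(amount: float, unit: str) -> int:
    """Estimate how many tokens `amount` of content in `unit` would use."""
    per_million = CAPACITY_PER_MILLION[unit]
    return round(amount * TOKENS_PER_MILLION / per_million)


def fits_in_budget(amount: float, unit: str,
                   budget: int = TOKENS_PER_MILLION) -> bool:
    """Return True if the estimated token count is within `budget`."""
    return estimate_tokens(amount, unit) <= budget


if __name__ == "__main__":
    print(estimate_tokens(1, "audio_hours"))    # 90909
    print(estimate_tokens(350_000, "words"))    # 500000
    print(fits_in_budget(12, "audio_hours"))    # False
```

By these rough ratios, one hour of audio comes to about 91,000 tokens, so a 12-hour recording would slightly exceed a million tokens while an 11-hour one would just fit.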

There have been some private preview demos of Gemini 1.5 Pro that demonstrate how the LLM is able to find specific moments in a video transcript. For example, AI enthusiast Rowan Cheung got early access and detailed how his demo found an exact action shot in a sports contest and summarized the event, as seen in the tweet embedded above.

However, Google noted that other early adopters, including United Wholesale Mortgage, TBS, and Replit, are opting for more enterprise-focused use cases, such as mortgage underwriting, automating metadata tagging, and generating, explaining, and updating code.

© 2025 Best in Technology. All Rights Reserved.
