Gemini 3.5 Live Translate Brings Real-Time Voice Translation

Google has launched Gemini 3.5 Live Translate, a new audio model delivering near real-time speech-to-speech translation across more than 70 languages. Unlike turn-based tools that wait for a speaker to finish before responding, it runs continuously, staying just a few seconds behind as the conversation unfolds. It preserves the speaker's original pitch, pacing, and intonation throughout.

Continuous Streaming Sets It Apart From Turn-Based Tools

Most live translation tools introduce noticeable gaps by processing speech in discrete chunks. Gemini 3.5 Live Translate streams audio as it arrives, trading some context for lower latency to stay in sync with the speaker. The result is fluid audio without the pauses that make turn-based systems feel disjointed in real conversation.
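
To make the latency difference concrete, here is a toy timing model, not Google's implementation. The `Segment` class, both function names, the one-second processing time, and the three-second lag are hypothetical values chosen for illustration; only the "few seconds behind" figure comes from the announcement. It measures how long a listener waits between the speaker starting a phrase and its translation starting.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # second at which the speaker begins this phrase
    end: float    # second at which the speaker finishes it


def turn_based_delays(segments, processing=1.0):
    """Delay per phrase when translation starts only after the whole turn ends.

    Assumes translated phrases play back in order, each taking as long as
    the original phrase (a simplification made for this sketch).
    """
    playback = segments[-1].end + processing  # earliest moment output can begin
    delays = []
    for seg in segments:
        delays.append(playback - seg.start)
        playback += seg.end - seg.start
    return delays


def streaming_delays(segments, lag=3.0):
    """Delay per phrase when output trails the speaker by a fixed lag."""
    return [lag for _ in segments]


# A 12-second turn split into three 4-second phrases.
turn = [Segment(0.0, 4.0), Segment(4.0, 8.0), Segment(8.0, 12.0)]
print(turn_based_delays(turn))  # [13.0, 13.0, 13.0]
print(streaming_delays(turn))   # [3.0, 3.0, 3.0]
```

In this simplified model the turn-based delay grows with the length of the turn, while the streaming delay stays constant no matter how long the speaker talks.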

The model detects language automatically, with no manual setup required, and is built to handle noisy environments. That noise robustness matters for real-world deployments like live broadcasts, on-site meetings, or ride-hailing pickups.

Grab, the Southeast Asian super-app, is already testing it to bridge language gaps between drivers and riders at pickups. The company handles more than 10 million voice calls per month through its platform.

Rolling Out Across Meet, Translate, and the Gemini Live API

Developers can access Gemini 3.5 Live Translate in public preview today through the Gemini Live API and Google AI Studio. Platforms including Agora, LiveKit, and Pipecat have already integrated the API and manage the underlying streaming infrastructure for developers who build on them.

In Google Meet, the model is entering private preview for select enterprise Workspace customers this month, with a broader rollout planned for later this year. The upgrade lifts Meet's supported languages from five to more than 70 and unlocks over 2,000 language combinations per session. Previously, the feature only translated to and from English.
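
The "over 2,000" figure can be sanity-checked with a quick calculation. The announcement does not say how combinations are counted, so this sketch assumes each combination is an unordered pair of distinct languages and uses exactly 70 languages as a lower bound.

```python
import math

languages = 70  # lower bound; the announcement says "more than 70"

# Assumption: a "combination" is an unordered pair of distinct languages.
unordered_pairs = math.comb(languages, 2)

# For comparison: counting each translation direction separately.
directed_pairs = math.perm(languages, 2)

print(unordered_pairs)  # 2415
print(directed_pairs)   # 4830
```

Under that assumption, 70 languages yield 2,415 pairs, which is consistent with the stated figure. Counting each direction separately would give 4,830.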

For everyday users, the model is live now in the Google Translate app on both Android and iOS. Android users also get a new listening mode: hold the phone to your ear and translated audio streams directly through the earpiece, without disturbing anyone nearby.

All Output Watermarked With SynthID

All audio the model generates is watermarked using SynthID, Google DeepMind's imperceptible audio watermarking system. The watermark is woven directly into the audio output and stays detectable even after post-processing, making it harder to pass AI-generated speech off as authentic.

Broader Google Meet availability for all Workspace customers is expected before year-end. Developers ready to start building can access example code in the Gemini Cookbook on GitHub.