tl  tr
  Home | Tutorials | Articles | Videos | Products | Tools | Search
Interviews | Open Source | Tag Cloud | Follow Us | Bookmark | Contact   
 Generative AI > Google Gemini API > Gemini Live API - Voice Customer Support Agent

Gemini Live API - Voice Customer Support Agent

Author: Venkata Sudhakar

The Gemini Live API enables real-time, bidirectional audio conversations with Gemini models. Unlike text-based chat, Live API streams audio in and out simultaneously, enabling natural spoken dialogue with sub-second latency. This is ideal for voice IVR systems, call centre automation, and voice-enabled kiosks.

In this tutorial, we demonstrate the Gemini Live API setup for a ShopMax India voice support agent. The agent listens to customer audio, processes queries about orders and products, and responds with synthesised speech - all in a continuous streaming session.

The below example shows how to configure and run a Live API session with function calling for a voice support scenario.


Now run the async Live API session loop,


It gives the following output,

Voice agent ready. Sending simulated customer query...
Audio chunk received: 4096 bytes
Audio chunk received: 4096 bytes
Audio chunk received: 3200 bytes
Transcript: Your order ORD-1001 is currently out for delivery and is expected
to arrive today by 6 PM in Mumbai. Is there anything else I can help you with?
Turn complete.

In production, replace the text prompt with real-time PCM audio bytes from a microphone stream, and pipe the received audio chunks to a speaker. The Live API supports interruption handling - the model stops speaking when it detects the user talking. Use session.send_realtime_input() for continuous microphone streaming.


 
  


  
bl  br