
Gemini API Streaming with Server-Sent Events

Author: Venkata Sudhakar

By default, the Gemini API waits until the full response is generated before returning it. For chat interfaces this creates a poor user experience: the screen stays blank for several seconds before the full answer appears. Streaming mode returns tokens as they are generated, enabling a typewriter effect that feels responsive. ShopMax India uses streaming for its customer support chat interface.

Streaming is enabled by passing stream=True to generate_content. The call then returns an iterator of partial chunks instead of a single response, and each chunk's text attribute holds the newly generated text. For web delivery, wrap this iterator in a Flask endpoint that emits Server-Sent Events (SSE).

The following example shows how ShopMax India implements streaming Gemini responses for its customer chat interface.
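A minimal sketch of such a streaming loop is given here. It assumes the google-generativeai Python package, an API key stored in a GEMINI_API_KEY environment variable, and the model name gemini-1.5-flash. The helper names iter_text, print_stream and ask_gemini are illustrative and are not taken from ShopMax India's code.

```python
import os
import sys
from typing import Iterable, Iterator, TextIO


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text of each streamed chunk, skipping chunks without text."""
    for chunk in chunks:
        try:
            text = chunk.text
        except (AttributeError, ValueError):
            # Assumption: a chunk with no usable text (for example a blocked
            # or metadata-only chunk) is skipped rather than treated as fatal.
            continue
        if text:
            yield text


def print_stream(chunks: Iterable, out: TextIO = sys.stdout) -> str:
    """Write each piece as soon as it arrives and return the full answer."""
    pieces = []
    for text in iter_text(chunks):
        out.write(text)
        out.flush()  # flush so the text is shown immediately, not buffered
        pieces.append(text)
    out.write("\n")
    return "".join(pieces)


def ask_gemini(prompt: str, model_name: str = "gemini-1.5-flash") -> str:
    """Stream one Gemini answer to stdout (needs network access and an API key)."""
    # Imported lazily so the helpers above work without the third-party package.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel(model_name)  # assumed model name
    response = model.generate_content(prompt, stream=True)
    return print_stream(response)
```

Calling ask_gemini("What is the return policy for electronics?") would print the answer piece by piece. This sketch does not compute or print a total token count.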


It gives the following output (tokens appear progressively):

ShopMax India Return Policy for Electronics:

1. 7-Day Replacement: All electronics are eligible for replacement
   within 7 days of delivery if found defective...
[tokens stream in real-time]
Total tokens: 312

For web chat interfaces, expose streaming responses as Server-Sent Events from a Flask endpoint. The browser receives tokens in real time and appends them to the chat bubble as they arrive.
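One possible shape for such an endpoint is sketched here, assuming Flask is installed. The names sse_event, sse_stream and create_app are illustrative, and answer_stream is a hypothetical callable that takes the user's message and yields pieces of text (for example, a generator wrapped around the Gemini chunk iterator). The [DONE] sentinel follows the output shown in this article; it is a convention of this sketch, not part of the SSE format.

```python
from typing import Callable, Iterable, Iterator


def sse_event(text: str) -> str:
    """Format one piece of text as a single Server-Sent Event."""
    # Each line of the payload needs its own "data:" prefix, and a blank
    # line terminates the event.
    lines = text.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


def sse_stream(pieces: Iterable[str]) -> Iterator[str]:
    """Turn text pieces into SSE events, followed by a [DONE] sentinel."""
    for piece in pieces:
        yield sse_event(piece)
    yield sse_event("[DONE]")


def create_app(answer_stream: Callable[[str], Iterable[str]]):
    """Build a Flask app that streams answers from /chat/stream."""
    # Imported lazily so the formatting helpers work without Flask installed.
    from flask import Flask, Response, request, stream_with_context

    app = Flask(__name__)

    @app.route("/chat/stream")
    def chat_stream():
        message = request.args.get("message", "")
        events = sse_stream(answer_stream(message))
        return Response(
            stream_with_context(events),
            mimetype="text/event-stream",
            headers={"Cache-Control": "no-cache"},
        )

    return app
```

Passing a generator to Response lets Flask send each event as soon as it is produced. Any proxy or server in front of the app must also avoid buffering the response, or the tokens will arrive in one batch.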


It gives the following output (SSE stream from the endpoint):

GET /chat/stream?message=What+is+your+return+policy

data: ShopMax
data:  India
data:  offers
data:  a 7-day
data:  replacement
data:  policy...
data: [DONE]

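On the browser side, use the EventSource JavaScript API to consume the SSE stream and append tokens to the chat bubble in real time. When the [DONE] sentinel arrives, the handler should close the EventSource, because the browser otherwise reconnects automatically once the server ends the response. This gives ShopMax India customers a fast, responsive chat experience even for long answers about product specifications or warranty terms.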
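To check the endpoint outside a browser, the same consumption logic can be approximated in Python. The sketch below is a simplified reader for the data fields of an SSE stream, not a full implementation of the EventSource behaviour. The function name iter_sse_data is illustrative, and the [DONE] marker is the sentinel convention used in this article.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str], done_marker: str = "[DONE]") -> Iterator[str]:
    """Yield the data payload of each SSE event until the done marker is seen."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event: join its data lines.
            data = "\n".join(buffer)
            buffer = []
            if data == done_marker:
                return
            if data:
                yield data
            continue
        if line.startswith(":"):
            continue  # lines starting with a colon are comments
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # only one leading space is stripped
        if field == "data":
            buffer.append(value)
```

The function accepts any iterable of lines, so it can be fed from a file, a list in a test, or the line iterator of an HTTP client that supports streamed responses. Joining the yielded pieces reconstructs the full answer.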


 
  


  