In Browser
	StumbleUpon
	del.icio.us
	Google
	Google Buzz
	reddit
	LinkedIn

	Facebook
	Twitter
	Linkedin
	E-Mail

Generative AI > Google Gemini API > Gemini Multi-turn Chat with History Management

Gemini Multi-turn Chat with History Management

Author: Venkata Sudhakar

Multi-turn chat with Gemini requires you to manage conversation history explicitly. The Gemini API is stateless ï¿½ you send the full conversation history with every request. The model uses this history for context: follow-up questions, pronoun references, corrections, and topic continuity all depend on the model seeing previous turns. Proper history management means including enough context for coherent replies while keeping the total token count affordable and within the model context window limit.

The Gemini SDK provides a ChatSession via client.chats.create() that manages history automatically ï¿½ you call session.send_message() and it appends turns internally. For production systems needing trimming, injection, or cross-session persistence, you manage the contents list yourself. Use ChatSession for rapid prototyping; use manual history management for production where you need to control exactly what context the model sees and persist history between user visits.

The below example shows both approaches: the simple ChatSession pattern for development, and a production-ready manual history manager with turn trimming, context injection, and cross-session persistence for returning customers.

Production history manager with trimming and cross-session persistence,

# APPROACH 2: Manual history - production-ready with trimming and persistence
MAX_TURNS = 10  # keep last 10 turns to control token cost
SYSTEM    = "You are a ShopMax India support agent. Reference prior context when relevant."

history_store = {}  # production: use Redis or database

def get_history(cid: str) -> list:
    return history_store.get(cid, [])

def save_history(cid: str, history: list) -> None:
    history_store[cid] = history[-MAX_TURNS:]

def chat(cid: str, message: str) -> str:
    history = get_history(cid)
    new_turn = types.Content(role="user", parts=[types.Part(text=message)])
    contents = history + [new_turn]
    resp = client.models.generate_content(
        model="gemini-2.0-flash",
        config=types.GenerateContentConfig(
            system_instruction=SYSTEM, max_output_tokens=150
        ),
        contents=contents
    )
    reply = resp.text
    history.append(new_turn)
    history.append(types.Content(role="model", parts=[types.Part(text=reply)]))
    save_history(cid, history)
    return reply

CID = "customer-9821"
print("=== Session 1 ===")
print("Agent:", chat(CID, "I need to register my Samsung TV warranty."))
print("Agent:", chat(CID, "Model is QA65Q80C, serial SN123456789."))

print("\n=== Session 2 - returning customer ===")
print("Agent:", chat(CID, "Hi again. Was my warranty registration processed?"))
print("History turns saved:", len(get_history(CID)))

It gives the following output showing context carried across both approaches,

=== ChatSession approach ===
User:  Hi, I bought a Samsung TV last week.
Agent: Welcome! I can help with your Samsung TV. What do you need assistance with?

User:  The remote is not working. What should I do?
Agent: Try replacing the batteries first. If that does not help, make sure there
       is nothing blocking the IR sensor on the front of the TV.

User:  I tried new batteries already - still not working.
Agent: Since new batteries did not help, the remote may be faulty. I can arrange
       a replacement remote for your Samsung TV. May I have your order ID?

Total history turns: 6

=== Session 1 ===
Agent: I can help with Samsung TV warranty registration. Please share the
       model number and serial number.

Agent: Warranty registered for Samsung QA65Q80C, serial SN123456789.
       You will receive a confirmation email within 24 hours.

=== Session 2 - returning customer ===
Agent: Welcome back! Yes, the warranty for your Samsung QA65Q80C
       (serial SN123456789) was registered. Can I help with anything else?

History turns saved: 4

# Session 2: agent remembered model and serial from Session 1 via persisted history
# Manual approach gives full control over what context the model sees

History management production rules: always trim history to the last 10 to 20 turns before sending to control token cost ï¿½ older turns rarely affect current response quality. Persist history in Redis with a TTL of 24 hours for active support tickets and 90 days for account-level context. When injecting memories (Tutorial 324) alongside history, put the memory context in the system instruction rather than the history list ï¿½ it is always available without consuming history slots. For compliance, log full untruncated history to Cloud Storage or BigQuery before trimming so you have the complete audit trail even though the model only sees the recent window.

Send your comments, suggestions or queries regarding this site to [email protected].