
Conversation Memory in LangChain

Author: Venkata Sudhakar

By default, each call to an LLM through LangChain is stateless: the model has no memory of previous exchanges. Building a chatbot or conversational AI assistant requires maintaining the conversation history and passing it to the LLM on each turn so the model has the context of what was said before. LangChain provides several memory patterns, ranging from simple in-memory message buffers to persistent storage backends that survive application restarts. Choosing the right memory strategy depends on conversation length, cost constraints, and whether the history needs to persist across sessions.

The modern LangChain approach to memory uses the RunnableWithMessageHistory wrapper, which integrates cleanly with LCEL chains. You provide a function that retrieves (or creates) a message history object for a given session_id, and LangChain automatically loads the history before calling the LLM and saves the new messages after. For development and single-server deployments, InMemoryChatMessageHistory stores history in a Python dict. For production multi-server deployments, use a persistent backend like RedisChatMessageHistory or a custom database-backed store.

The example below shows three memory patterns: a simple in-memory history for development, a windowed history that limits token usage, and a file-based history that persists conversations across application restarts.


It gives the following output,

Turn 1: CDC (Change Data Capture) tracks and captures database changes in real time
by reading the transaction log. Debezium is an open-source CDC tool that reads
database logs and publishes change events to Apache Kafka topics.

Turn 2: Debezium supports MySQL, PostgreSQL, Oracle, SQL Server, MongoDB, Db2,
and several others through a Kafka Connect plugin architecture.

Turn 3: We discussed that CDC captures database changes via transaction logs,
and Debezium is the leading open-source CDC tool supporting multiple databases
including MySQL, PostgreSQL, Oracle, and SQL Server, publishing events to Kafka.

It gives the following output,

# Pattern 2 - windowed memory:
# After 4 turns (8 messages), only the last 6 messages are kept.
# The model naturally forgets the very first exchanges.
# This prevents exceeding the context window on long conversations.

# Pattern 3 - file-based persistence:
No history found for user-venkata-001 - starting fresh
Saved 6 messages for session user-venkata-001

# Next application restart:
Loaded 6 messages for session user-venkata-001
# Conversation resumes exactly where it left off

Memory strategy selection guide:

InMemoryChatMessageHistory - Best for: development, single-server apps, short-lived sessions. Limitation: lost on restart, not scalable across multiple server instances.

Windowed memory - Best for: long conversations where only recent context matters. Keep the last 6-10 messages (3-5 turns). Each message in history costs tokens on every subsequent call, so windowing is essential for cost control in production.

RedisChatMessageHistory - Best for: production multi-server deployments. Install langchain-community and use RedisChatMessageHistory(session_id, url="redis://..."). History persists across restarts and is shared across all server instances behind a load balancer.
