LangChain Output Parsers

Author: Venkata Sudhakar

By default, a LangChain chain returns whatever the model produced: a chat model returns a message object (an AIMessage) whose content is raw text, and a completion-style LLM returns a raw string. Output parsers transform that raw text into structured Python objects such as a list, a dictionary, a Pydantic model, or a boolean. This is essential when your downstream code needs to work with the LLM output programmatically rather than just display it. Instead of writing custom string parsing for each use case, you declare the structure you want and let the parser handle conversion and validation.

LangChain provides several built-in parsers. StrOutputParser simply returns the text as a string, which is useful at the end of most chains. PydanticOutputParser takes a Pydantic model class, generates formatting instructions that you insert into your prompt via get_format_instructions(), and validates the LLM response against the model. CommaSeparatedListOutputParser splits a comma-separated reply into a Python list. JsonOutputParser parses a JSON reply into a Python dictionary or list. A parser plugs into an LCEL chain with the pipe operator, after the LLM call.

The example below uses PydanticOutputParser to extract a structured migration assessment from LLM output, so the result is a validated Python object instead of raw text.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser, CommaSeparatedListOutputParser
from pydantic import BaseModel, Field
from typing import List

llm = ChatOpenAI(model="gpt-4o-mini", api_key="your-api-key")

# Define the structure you want the LLM to return
class MigrationRisk(BaseModel):
    risk_level: str = Field(description="LOW, MEDIUM, or HIGH")
    risks: List[str] = Field(description="List of identified migration risks")
    estimated_days: int = Field(description="Estimated migration duration in days")
    recommendation: str = Field(description="One sentence recommendation")

# PydanticOutputParser generates format instructions from the model;
# they are injected into the prompt below via .partial()
parser = PydanticOutputParser(pydantic_object=MigrationRisk)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a database migration expert. Respond only in the requested format."),
    ("human", "Assess migration risks for: {scenario}\n\n{format_instructions}")
]).partial(format_instructions=parser.get_format_instructions())

# Chain: prompt | llm | parser
chain = prompt | llm | parser

result = chain.invoke({"scenario":
    "Oracle 11g with 200 stored procedures migrating to Aurora PostgreSQL in 4 weeks"})

print(type(result))               # MigrationRisk object, not a string
print("Risk level:", result.risk_level)
print("Risks:", result.risks)
print("Days:", result.estimated_days)
print("Advice:", result.recommendation)

It gives output like the following (the exact wording varies from run to run):

<class '__main__.MigrationRisk'>
Risk level: HIGH
Risks: ["200 stored procedures need conversion to PL/pgSQL",
        "4-week timeline is aggressive for this complexity",
        "Oracle-specific SQL syntax may not translate automatically"]
Days: 60
Advice: Extend timeline to 60 days and use AWS SCT for stored procedure conversion.

# result is a validated Pydantic object - not a string to parse manually
# result.risks is already a Python list
# result.estimated_days is already an int

It gives the following output,


1. Verify row counts match between source and target
2. Confirm all foreign key constraints are intact
3. Test application queries against target database
4. Check CDC lag is zero before switching connections
5. Validate index performance on critical query paths

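If the LLM returns badly formatted output, PydanticOutputParser raises an OutputParserException. To add resilience without custom error handling code, wrap the parser (not the whole chain) with OutputFixingParser.from_llm(parser=parser, llm=llm), which is imported from langchain.output_parsers in LangChain 0.x releases, and use the wrapped parser as the last step of the chain. When parsing fails, it sends the malformed output and the error message back to the LLM, asks it to correct its own formatting, and parses the corrected reply.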
