LangChain is a powerful framework for building applications powered by Large Language Models (LLMs). It simplifies the process of connecting LLMs with various data sources and tools to create more sophisticated and intelligent applications. This tutorial will introduce you to the basics of LangChain, including how to maintain chat history for context, and walk you through a few simple chain examples that you can run directly in Google Colab.
Setting Up Your Environment in Colab:
First, open a new Google Colab notebook. You’ll need to install the langchain library and a language model integration. For this tutorial, we’ll use OpenAI’s gpt-3.5-turbo model. You’ll also need an OpenAI API key.
Install LangChain and OpenAI:
!pip install langchain langchain_community openai
Set Your OpenAI API Key:
You’ll need to set your OpenAI API key as an environment variable. Go to https://platform.openai.com/account/api-keys to create one. Replace "YOUR_OPENAI_API_KEY" with your actual key.
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
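If you’d rather not paste the key into the notebook itself, a Colab-friendly alternative is to prompt for it at runtime (a small sketch using Python’s built-in getpass):

import os
from getpass import getpass

# Prompt for the key so it is never stored in the saved notebook
os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")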
Basic Concepts:
- LLM: The core language model that generates text. We’ll be using ChatOpenAI.
- PromptTemplate: A way to structure and parameterize prompts for the LLM, making them reusable and dynamic (a small standalone example follows this list).
- Chains: Sequences of components linked together. They can include LLMs, prompt templates, utilities, and more.
- Memory: In LangChain, “Memory” refers to the ability of a chain to remember past interactions. This is crucial for building conversational agents that maintain context across multiple turns. We’ll use ConversationBufferMemory, which stores the entire conversation history.
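To make the PromptTemplate idea concrete before we build chains, here is that small standalone sketch (the topic value is just an example):

from langchain.prompts import PromptTemplate

demo_prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a short poem about: {topic}",
)
# .format() fills in the variables and returns the finished prompt string
print(demo_prompt.format(topic="autumn leaves"))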
Example 1: A Simple Sequential Chain
Let’s start with a basic chain that takes a topic as input, generates a short poem about it, and then suggests a title for the poem. This demonstrates a flow without explicit memory, but sets the stage for adding it.
from langchain.chat_models import ChatOpenAI # Using ChatOpenAI for conversational models
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
# Initialize the Chat LLM
llm = ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo") # Using gpt-3.5-turbo for chat
# Define the prompt template for generating a poem
poem_prompt = PromptTemplate(
input_variables=["topic"],
template="Write a short poem about: {topic}",
)
# Create the first chain: generate the poem
poem_chain = LLMChain(llm=llm, prompt=poem_prompt, output_key="poem")
# Define the prompt template for writing a title for the poem
title_prompt = PromptTemplate(
input_variables=["poem"],
template="Suggest a short and catchy title for the following poem:\n\n{poem}\n\nTitle:",
)
# Create the second chain: generate the title
title_chain = LLMChain(llm=llm, prompt=title_prompt, output_key="title")
# Create the overall sequential chain
overall_chain = SequentialChain(
chains=[poem_chain, title_chain],
input_variables=["topic"],
output_variables=["poem", "title"],
verbose=True # Set to True to see the intermediate steps
)
# Run the chain
topic = "autumn leaves"
result = overall_chain({"topic": topic})
# Print the results
print(f"\n--- Example 1 Results ---")
print(f"Topic: {topic}")
print(f"Poem:\n{result['poem']}")
print(f"Title: {result['title']}")
Example 2: Conversational Chain with Memory
Now, let’s build a simple chatbot that can remember previous interactions. We’ll use ConversationBufferMemory to store the chat history.
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import (
ChatPromptTemplate,
MessagesPlaceholder,
HumanMessagePromptTemplate,
AIMessagePromptTemplate,
)
from langchain.schema import SystemMessage, HumanMessage, AIMessage
# Initialize the Chat LLM
llm = ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo")
# Create a prompt template with a placeholder for chat history
# MessagesPlaceholder is crucial for memory to work with chat models
prompt = ChatPromptTemplate.from_messages([
SystemMessage(content="You are a friendly and helpful AI assistant."),
MessagesPlaceholder(variable_name="history"), # This is where chat history will be injected
HumanMessagePromptTemplate.from_template("{input}")
])
# Initialize the memory
memory = ConversationBufferMemory(memory_key="history", return_messages=True)
# Create the conversational chain
conversation = ConversationChain(
llm=llm,
memory=memory,
prompt=prompt,
verbose=True # See the prompt being sent with history
)
print(f"\n--- Example 2: Conversational Chatbot ---")
# First interaction
response1 = conversation.predict(input="Hi there! What's your name?")
print(f"AI: {response1}")
# Second interaction (should remember the first)
response2 = conversation.predict(input="That's a nice name. Can you tell me a joke?")
print(f"AI: {response2}")
# Third interaction (should remember both previous interactions)
response3 = conversation.predict(input="Haha, that was a good one! What else can you do?")
print(f"AI: {response3}")
# You can also inspect the memory directly
print(f"\n--- Current Chat History in Memory ---")
for message in memory.chat_memory.messages:
if isinstance(message, HumanMessage):
print(f"Human: {message.content}")
elif isinstance(message, AIMessage):
print(f"AI: {message.content}")
In this example:
- We use ChatOpenAI because it’s designed for conversational interactions.
- ConversationBufferMemory stores all previous HumanMessage and AIMessage objects.
- The MessagesPlaceholder(variable_name="history") in our ChatPromptTemplate is vital. LangChain automatically populates this placeholder with the contents of the memory_key (which is “history” in our case) from the ConversationBufferMemory. This allows the LLM to “see” the entire conversation so far; a quick way to inspect what gets injected is sketched just below.
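Calling load_memory_variables on the memory object shows the exact payload that fills the placeholder (run this after the conversation above):

# Returns a dict keyed by memory_key; with return_messages=True the value is the
# list of HumanMessage/AIMessage objects that fill MessagesPlaceholder.
history_snapshot = memory.load_memory_variables({})
print(history_snapshot["history"])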
Example 3: Branching Chain with Shared Context (Conceptual Parallel with Memory)
While a truly “parallel” chain that uses independent memory streams and then merges them is more complex (often involving RunnableParallel or custom graph structures in advanced LangChain), we can demonstrate a branching concept where different branches could conceptually draw from a shared, evolving chat history.
For simplicity, let’s adapt the branching example to include memory, showing how even a router can benefit from understanding the conversation context. The router will decide whether to generate a story or a poem, but now, the LLM guiding the decision will also have access to the conversation history.
from langchain.chat_models import ChatOpenAI
from langchain.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    MessagesPlaceholder,
    HumanMessagePromptTemplate,
)
from langchain.chains import ConversationChain
from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
from langchain.memory import ConversationBufferMemory
from langchain.schema import SystemMessage, get_buffer_string
# Initialize the LLM for chat
llm = ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo")
# --- Define the Memory for the overall interaction ---
overall_memory = ConversationBufferMemory(memory_key="history", return_messages=True)
# --- Define Prompt Templates for different tasks (Story/Poem) ---
# The router extracts a topic and hands it to the chosen chain under the "input" key,
# which is also the input key ConversationChain expects.
story_prompt_template = ChatPromptTemplate.from_messages([
    SystemMessage(content="You are a creative storyteller. Write a very short story."),
    MessagesPlaceholder(variable_name="history"),  # Include history in task-specific prompts
    HumanMessagePromptTemplate.from_template("Write a very short story about: {input}")
])
poem_prompt_template = ChatPromptTemplate.from_messages([
    SystemMessage(content="You are a poetic wordsmith. Write a short poem."),
    MessagesPlaceholder(variable_name="history"),  # Include history in task-specific prompts
    HumanMessagePromptTemplate.from_template("Write a short poem about: {input}")
])
# Create ConversationChains for each destination, sharing the same memory.
# The history therefore accumulates across all turns, regardless of the branch taken.
# output_key="text" matches the output key MultiPromptChain expects from its destination chains.
story_chain = ConversationChain(llm=llm, prompt=story_prompt_template, memory=overall_memory, output_key="text")
poem_chain = ConversationChain(llm=llm, prompt=poem_prompt_template, memory=overall_memory, output_key="text")
# --- Define the Router Prompt Template ---
# The router also needs the chat history to make context-aware routing decisions
# (e.g. to resolve "it" in a follow-up request). RouterOutputParser expects a JSON
# object whose "next_inputs" value is a plain string, which it wraps as {"input": ...}.
router_template = """Given the user's query and the chat history, decide whether the user is asking for a "story" or a "poem".
Return your decision as a JSON object with a "destination" key and a "next_inputs" key.
The "next_inputs" value should be the topic extracted from the user's request, as a plain string.
For example:
{{"destination": "story", "next_inputs": "a brave knight"}}
OR
{{"destination": "poem", "next_inputs": "the starry night"}}

Chat History:
{history}

User Query: {input}"""

router_prompt = PromptTemplate(
    template=router_template,
    input_variables=["history", "input"],
    output_parser=RouterOutputParser(),  # The router prompt must carry the output parser
)

# Create the router chain.
# LLMRouterChain wraps an LLMChain and parses the model's JSON decision into a route.
# Unlike ConversationChain, it doesn't manage memory itself, so we will pass the
# formatted history in explicitly when we run the overall chain.
router_chain = LLMRouterChain.from_llm(llm, router_prompt)
# Define the different destination chains
destination_chains = {
"story": story_chain,
"poem": poem_chain,
}
# Create the multi-prompt chain.
# MultiPromptChain runs the router, then passes the extracted topic to the chosen chain.
multi_prompt_chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=story_chain,  # Fallback if routing fails
    silent_errors=True,         # Fall back to the default chain on an unrecognized destination
    verbose=True,
)
print(f"\n--- Example 3: Branching Chain with Shared Memory ---")
# Run the chain with an initial query.
# The router prompt expects the history as a string, so we format the shared memory's messages.
query1 = "Tell me a short story about a magical tree."
result1 = multi_prompt_chain.run(input=query1, history=get_buffer_string(overall_memory.chat_memory.messages))
print(f"\nQuery: {query1}\nResult:\n{result1}\n---")
# Because the chosen destination is a ConversationChain sharing overall_memory,
# this turn (topic + response) has already been saved to the history.
# Run another query, which refers to the previous one implicitly
query2 = "Now, write a poem about it."  # "it" refers to the magical tree from the previous turn
result2 = multi_prompt_chain.run(input=query2, history=get_buffer_string(overall_memory.chat_memory.messages))
print(f"\nQuery: {query2}\nResult:\n{result2}\n---")
# Another turn, perhaps changing topic or asking a follow-up.
# If the router picks a destination we don't have (e.g. "joke"), silent_errors sends it to the default chain.
query3 = "What about a joke about a cat?"
result3 = multi_prompt_chain.run(input=query3, history=get_buffer_string(overall_memory.chat_memory.messages))
print(f"\nQuery: {query3}\nResult:\n{result3}\n---")
print(f"\n--- Final Chat History in Overall Memory ---")
for message in overall_memory.chat_memory.messages:
if isinstance(message, HumanMessage):
print(f"Human: {message.content}")
elif isinstance(message, AIMessage):
print(f"AI: {message.content}")
Important Notes for Example 3:
Memory Integration in Router: LLMRouterChain doesn’t take a memory object in its constructor the way ConversationChain does for its main prompt. Instead, we format the messages stored in overall_memory.chat_memory.messages into a string with get_buffer_string and pass it as the history input to multi_prompt_chain.run(). That string fills the {history} slot in the router prompt, which lets the router make context-aware routing decisions (for example, resolving “it” in the second query).
Shared Memory: Crucially, both story_chain and poem_chain are initialized with the same overall_memory object. This ensures that the conversation history is shared and updated across all interactions, regardless of which branch of the MultiPromptChain is executed.
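A quick way to convince yourself that the two branches really share state is to check that they reference the same memory object (a tiny sketch to run after the example above):

# Both destination chains hold a reference to the same ConversationBufferMemory,
# so a turn saved by one branch is immediately visible to the other.
print(story_chain.memory is poem_chain.memory)  # True
print(len(overall_memory.chat_memory.messages), "messages in the shared history")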
Automatic Memory Update: Because both destination chains are ConversationChains that share overall_memory, each turn (the routed topic and the generated response) is saved to the shared history automatically after the chain runs, so no manual overall_memory.save_context() call is needed here. For more flexible, end-to-end memory management you might consider RunnableWithMessageHistory in newer LangChain versions, which abstracts this more elegantly; a minimal sketch follows.
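Here is a minimal sketch of that newer pattern, assuming a recent LangChain release where RunnableWithMessageHistory and the | (pipe) composition are available; it reuses the prompt and llm from Example 2, and the session_id value is arbitrary:

from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.memory import ChatMessageHistory

session_store = {}  # session_id -> ChatMessageHistory

def get_session_history(session_id):
    # Create a fresh history the first time a session id is seen
    if session_id not in session_store:
        session_store[session_id] = ChatMessageHistory()
    return session_store[session_id]

chat_chain = RunnableWithMessageHistory(
    prompt | llm,                    # the chat prompt and model from Example 2
    get_session_history,
    input_messages_key="input",      # matches the {input} variable in the prompt
    history_messages_key="history",  # matches MessagesPlaceholder(variable_name="history")
)

reply = chat_chain.invoke(
    {"input": "Hi there! What's your name?"},
    config={"configurable": {"session_id": "demo"}},
)
print(reply.content)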
Adding memory is a fundamental step in building dynamic and truly conversational LLM applications. LangChain’s memory modules, especially ConversationBufferMemory, make it straightforward to maintain context across turns. By incorporating MessagesPlaceholder in your prompt templates and passing the conversation history, you can enable your LLMs to engage in more coherent and contextually relevant dialogues. As you explore further, you can investigate other memory types like ConversationBufferWindowMemory (to limit history to a certain number of turns) or ConversationSummaryMemory (to summarize long conversations to save tokens).
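For instance, swapping the full buffer for a sliding window is a one-line change (a minimal sketch; k=3 keeps only the last three exchanges and is just an example value):

from langchain.memory import ConversationBufferWindowMemory

# Drop-in replacement for ConversationBufferMemory that keeps only the last k turns
window_memory = ConversationBufferWindowMemory(memory_key="history", return_messages=True, k=3)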
Cheers!!
Amit Tomar