Chatbot Memory Mastery: A Practical Guide

Introduction

LangChain memory can be quite intricate, with various implementations designed to meet different system needs. 🌐 In our previous blog, we introduced the concepts of In-Prompt Memory 💬 and External Memory 🗂️ and even built a chatbot 🤖 using the older LangChain classes for In-Prompt memory. 🚀
Today, we’re taking our memory journey to the next level! We’ll dive into the newer LangChain classes (BaseChatMessageHistory, ChatMessageHistory, and RunnablePassthrough) and explore how they can help create chatbots with advanced memory capabilities. 🚀

So, grab your virtual toolkit, and let’s embark on this exciting adventure into the world of next-gen chatbot memory. Trust me, you won’t want to miss it! 😎
Here’s a sneak peek at what you’ll create by the end of this blog! 🌟

Few Components of LangChain Memory

LangChain provides a range of memory components to manage and store conversation history, enabling chatbots to recall and utilize past interactions. These components play a crucial role in building more sophisticated and context-aware systems. Below, we'll explore a few of the foundational components in LangChain memory that help structure how data is stored and retrieved during conversations:

BaseChatMessageHistory -

At its core, BaseChatMessageHistory is an Abstract Base Class (ABC) that serves as the blueprint for managing chat histories. It defines the structure and essential methods required to handle chat memory, whether stored in-memory, on disk, or in external databases.
Features of BaseChatMessageHistory:
- Abstract Nature: It provides the framework, not the implementation, making it highly customizable.
- Versatility: Supports integration with various storage solutions, from in-memory databases to cloud-based systems.
- Essential Methods: The class outlines key operations like adding messages, retrieving messages, and clearing history, ensuring consistency across implementations.
By extending this class, developers can tailor chat history storage to meet their application's specific requirements, whether for temporary sessions or persistent, long-term use.
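To make the "framework, not implementation" idea concrete, here is a minimal stdlib-only sketch of the pattern. HistoryBlueprint and ListHistory are hypothetical stand-ins invented for this illustration, not LangChain classes; the real BaseChatMessageHistory has more methods and works with BaseMessage objects rather than tuples.

```python
from abc import ABC, abstractmethod


class HistoryBlueprint(ABC):
    """Hypothetical stand-in for an abstract chat-history class (not LangChain's)."""

    @abstractmethod
    def add_message(self, role: str, content: str) -> None:
        """Store one message; each backend decides where it goes."""

    @abstractmethod
    def get_messages(self) -> list[tuple[str, str]]:
        """Return all stored messages in insertion order."""

    @abstractmethod
    def clear(self) -> None:
        """Remove every stored message."""

    # Convenience helpers live in the base class, so every backend inherits them.
    def add_user_message(self, content: str) -> None:
        self.add_message("human", content)

    def add_ai_message(self, content: str) -> None:
        self.add_message("ai", content)


class ListHistory(HistoryBlueprint):
    """In-memory backend: keeps messages in a plain Python list."""

    def __init__(self) -> None:
        self._messages: list[tuple[str, str]] = []

    def add_message(self, role: str, content: str) -> None:
        self._messages.append((role, content))

    def get_messages(self) -> list[tuple[str, str]]:
        return list(self._messages)  # return a copy so callers cannot mutate the store

    def clear(self) -> None:
        self._messages.clear()


history = ListHistory()
history.add_user_message("Hi, I'm Sam.")
history.add_ai_message("Hello Sam!")
print(history.get_messages())
```

Swapping ListHistory for a database-backed subclass would leave the calling code untouched, which is the point of the abstraction.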

ChatMessageHistory -

Building on the foundation laid by BaseChatMessageHistory, ChatMessageHistory offers an in-memory implementation that is simple, lightweight, and efficient. It is perfect for applications where chat history does not require persistence beyond the session's lifespan, such as short-lived chatbots or prototyping scenarios.
Features of ChatMessageHistory:
- In-Memory Storage: Stores messages in memory, making it fast and efficient.
- Ease of Use: Includes utility methods for adding and retrieving messages, such as add_user_message and add_ai_message.
- Short-Term Focus: Ideal for scenarios where chat context is transient and does not need to be saved for future sessions.

Methods of Managing Chat History -

Method	Description
`aadd_messages`	Asynchronously adds a list of messages.
`aclear`	Asynchronously removes all messages from the store.
`add_ai_message`	Convenience method for adding an AI message string to the store.
`add_message`	Adds a `BaseMessage` object to the store.
`add_messages`	Adds a list of messages to the store.
`add_user_message`	Convenience method for adding a human message string to the store.
`aget_messages`	Asynchronously retrieves messages from the store.
`clear`	Removes all messages from the store.

These methods and their consistent structure ensure seamless integration with different backends and workflows, making BaseChatMessageHistory a powerful abstraction in LangChain's memory management system.
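The `a`-prefixed methods in the table are asynchronous counterparts of the synchronous ones. The sketch below shows one common way such a pairing can be built, with the async versions delegating to the sync ones in a worker thread. TinyHistory is a hypothetical stdlib-only stand-in, and this illustrates the general pattern rather than LangChain's own implementation.

```python
import asyncio


class TinyHistory:
    """Hypothetical stand-in with paired sync and async methods."""

    def __init__(self) -> None:
        self.messages: list[str] = []

    def add_messages(self, new_messages: list[str]) -> None:
        self.messages.extend(new_messages)

    def clear(self) -> None:
        self.messages.clear()

    async def aadd_messages(self, new_messages: list[str]) -> None:
        # Run the blocking call in a thread so the event loop stays responsive.
        await asyncio.to_thread(self.add_messages, new_messages)

    async def aget_messages(self) -> list[str]:
        return await asyncio.to_thread(lambda: list(self.messages))

    async def aclear(self) -> None:
        await asyncio.to_thread(self.clear)


async def demo() -> list[str]:
    history = TinyHistory()
    await history.aadd_messages(["hello", "hi there"])
    return await history.aget_messages()


print(asyncio.run(demo()))
```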

Integration of `BaseChatMessageHistory` and `ChatMessageHistory`

The real power of these classes lies in their ability to integrate seamlessly with various storage systems. While ChatMessageHistory provides a ready-to-use solution for in-memory chat history management, developers can extend BaseChatMessageHistory to integrate external storage solutions, such as:

Dictionary-Based Stores: A lightweight in-memory database for prototyping or temporary use.
Database Solutions: Persistent storage using relational (e.g., PostgreSQL) or NoSQL (e.g., MongoDB) databases.
Cloud-Based Storage: Scalability and persistence through services like AWS DynamoDB or Google Cloud Firestore.

For instance, a simple implementation could use a dictionary where session IDs are mapped to corresponding message lists, ensuring efficient chat memory management for multi-session environments. Alternatively, by extending BaseChatMessageHistory, you can integrate advanced storage mechanisms, enabling scalable and persistent chat memory for production-grade systems.

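from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

# Global dictionary to store chat histories for all sessions
# Structure -> {session_id: ChatMessageHistory(), session_id2: ChatMessageHistory(), ...}
global_store = {}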

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    """
    Retrieves or initializes the chat history (i.e. an instance of the ChatMessageHistory class)
    for a given session ID.

    Args:
        session_id (str): Unique identifier for the chat session.

    Returns:
        BaseChatMessageHistory: The chat history object for the session.

    Additional Information:
        - Using this session ID mechanism, multiple chat sessions can be created and maintained
          for the same user by assigning a unique session ID for each session.
        - For example:
            * session_id = "user1_session1" (First session for User 1)
            * session_id = "user1_session2" (Second session for User 1)
        - Each session will have an independent ChatMessageHistory instance stored in the 
          global dictionary, enabling seamless handling of multiple sessions.
    """
    # Check if the session ID exists in the global store
    if session_id not in global_store:
        # Initialize a new ChatMessageHistory instance if session ID is not found
        global_store[session_id] = ChatMessageHistory()

    # Return the chat history object associated with the session ID
    return global_store[session_id]

The session ID allows the system to uniquely identify and manage multiple chat sessions for the same user. By assigning unique session IDs (e.g., "user1_session1", "user1_session2"), we can support multiple simultaneous conversations for one user.
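The get-or-create pattern behind get_session_history can be tried without installing LangChain. The sketch below swaps ChatMessageHistory for a plain list (an assumption made only to keep the example stdlib-only) and shows that two session IDs for the same user keep separate histories. The names demo_store and get_demo_history are hypothetical.

```python
# Stand-in store: session_id -> list of (role, text) tuples.
# A plain list replaces ChatMessageHistory here purely for illustration.
demo_store: dict[str, list[tuple[str, str]]] = {}


def get_demo_history(session_id: str) -> list[tuple[str, str]]:
    """Return the history for session_id, creating an empty one on first use."""
    return demo_store.setdefault(session_id, [])


# Two sessions for the same user
get_demo_history("user1_session1").append(("human", "My name is Sam."))
get_demo_history("user1_session2").append(("human", "What's the weather like?"))

# Asking again for an existing session returns the same object, not a fresh one
get_demo_history("user1_session1").append(("ai", "Nice to meet you, Sam!"))

for session_id, messages in demo_store.items():
    print(session_id, len(messages))
```

Keep in mind that a module-level dictionary like this lives only as long as the Python process, so it suits prototyping rather than production use.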

RunnablePassthrough -

RunnablePassthrough is a simple utility in LangChain that passes its input through unchanged. It's like a middleman that takes data and hands it along untouched.
Its assign method goes one step further: it passes the input dictionary through and adds (or overwrites) the keys you specify, so the next step in a LangChain pipeline, such as a prompt or a language model, receives the original fields plus the new ones.

# Chain to extract 'messages' from input and pass to model
chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages"))  # Extracts 'messages'
    | prompt  # Generates a prompt from extracted messages
    | model   # Passes the prompt to the model
)
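To see what the first step of this chain does to its input, here is a stdlib-only sketch of the behaviour. The assign function below is a hypothetical helper that imitates the idea behind RunnablePassthrough.assign: it returns the input dictionary unchanged except for the keys it is asked to add or overwrite. It is an approximation for illustration, not the library's code.

```python
from operator import itemgetter
from typing import Any, Callable


def assign(**extractors: Callable[[dict], Any]) -> Callable[[dict], dict]:
    """Hypothetical helper: copy the input dict and add one key per extractor."""
    def step(data: dict) -> dict:
        added = {key: fn(data) for key, fn in extractors.items()}
        return {**data, **added}  # original keys pass through untouched
    return step


extract = assign(
    messages=itemgetter("messages"),                # re-reads an existing key
    turn_count=lambda data: len(data["messages"]),  # computes a new key
)

payload = {"messages": ["Hi", "Hello!"], "language": "en"}
result = extract(payload)

print(result)
```

Because the input already has a messages key, assigning messages=itemgetter("messages") leaves that field as it was. The step becomes more useful when the assigned value is computed, as turn_count is here.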

Why Use It?

Simplifies Data Handling:
- Focuses on extracting the required part of a complex data structure (e.g., dictionaries).
Keeps Pipelines Modular:
- Cleanly separates data extraction from processing or transformation steps.
Works as a Connector:
- Bridges raw data (like a dictionary) and the next step (like prompt creation or AI model input).

Implementing a Chatbot with Memory

Prerequisite

This article builds on the concepts from the first blog, so make sure you’ve read it before proceeding. We’ll pick up where we left off and make adjustments to the app.py file to upgrade our chatbot’s memory. 🧠✨

Step 4: Creating the Main Working File → app.py

1.1 Importing Modules

import os                              # To interact with the operating system
import time                            # For handling time-related functions
import streamlit as st                 # For building the chatbot's UI
from dotenv import load_dotenv         # To load environment variables (e.g., GROQ_API_KEY)
from operator import itemgetter        # To efficiently access dictionary values by key
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage   # For defining message types
from langchain_core.chat_history import BaseChatMessageHistory               # For chat history management
from langchain_community.chat_message_histories import ChatMessageHistory     # For in-memory chat history
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder   # For handling chatbot prompts
from langchain_core.runnables import RunnablePassthrough                       # For passing data through the chain
from langchain_groq import ChatGroq    # For creating the chatbot using the Groq model

1.2 Loading the Environment Variables and Setting Up the Model

load_dotenv()  # Load .env variables
groq_api_key = os.getenv("GROQ_API_KEY")  # Retrieve API key

model = ChatGroq(model="Gemma2-9b-It", groq_api_key=groq_api_key)  # Initialize the model

This is the same setup as the previous chatbot. Ensure your .env file has the correct GROQ_API_KEY.

1.3 Define the Prompt Template and Create the Chain

# Define the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),  # Placeholder for dynamic messages
    ]
)

# Create the chain to process the messages through the model
chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages"))  # Extract messages using itemgetter
    | prompt                                             # Apply the prompt template
    | model                                              # Pass the processed data to the model
)

Prompt Template:
- The ChatPromptTemplate is created using a list of predefined messages.
- The "system" message sets up the context for the assistant, specifying that it should answer questions helpfully.
- The MessagesPlaceholder is used as a dynamic placeholder that will later be filled with the actual user and assistant messages.
Chain:
- RunnablePassthrough.assign passes the input dictionary through and sets its messages key, here to the value that itemgetter("messages") reads from that same input.
- The | operator in LangChain is used to link multiple stages together, forming a chain.
- The chain first extracts the messages, then applies the prompt to format the input, and finally, sends it to the model for processing.

This setup ensures that the user’s message history is passed to the language model in the correct format.
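As a rough mental model of how | builds a pipeline and how the placeholder gets filled, consider the stdlib-only sketch below. Step, fill_prompt, and fake_model are hypothetical stand-ins; real LangChain runnables offer far more (streaming, batching, async), and the fake model only reports what it received.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b -> a new step that runs a, then feeds its output to b
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


SYSTEM_TEXT = "You are a helpful assistant."


def fill_prompt(data: dict) -> list[tuple[str, str]]:
    # The system message comes first; the "messages" placeholder expands in place.
    return [("system", SYSTEM_TEXT), *data["messages"]]


def fake_model(prompt_messages: list[tuple[str, str]]) -> str:
    # Stand-in for an LLM call: summarises its input instead of generating text.
    last_role, last_text = prompt_messages[-1]
    return f"received {len(prompt_messages)} messages; last from {last_role}: {last_text}"


chain = Step(fill_prompt) | Step(fake_model)
reply = chain.invoke(
    {"messages": [("human", "Hi"), ("ai", "Hello!"), ("human", "Tell me a joke")]}
)
print(reply)
```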

1.4 Streamlit UI and Memory Initialization

# Streamlit UI Title
st.title("Synapse 🧠")

# Initialize memory in Streamlit session state
if "memory" not in st.session_state:
    st.session_state.memory = ChatMessageHistory()  # Initialize in-memory chat history

# Display past messages
for message in st.session_state.memory.messages:
    with st.chat_message("user" if isinstance(message, HumanMessage) else "assistant"):
        st.markdown(message.content)  # Render the message content in chat bubbles

Streamlit Title:
- st.title("Synapse 🧠") adds a title to the Streamlit UI. This sets the tone for the chatbot interface with a catchy name and emoji.
Initialize Memory:
- Checks if a key called "memory" exists in st.session_state.
- If not, initializes it with an instance of ChatMessageHistory, which stores chat messages for the current session.
Display Past Messages:
- Loops through all messages stored in st.session_state.memory.messages.
- Determines the type of message (user or assistant) by checking if it is an instance of HumanMessage.
- Uses st.chat_message to display the message in the chat UI.
- st.markdown renders the message content in a rich-text format.

This section ensures that past messages are redrawn every time Streamlit reruns the script, so the conversation stays visible for the whole browser session.
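Streamlit re-executes the whole script on every interaction, which is why the history lives in st.session_state rather than in an ordinary variable. The sketch below imitates that behaviour with a plain dictionary. Both run_script and the dictionary are hypothetical stand-ins (for a script run and for st.session_state), so treat it as a mental model, not as Streamlit code.

```python
def run_script(session_state: dict, user_input: str) -> list[str]:
    """Simulates one top-to-bottom run of app.py for a single user interaction."""
    local_history = []  # ordinary variable: recreated on every run

    # Same guard as in the app: initialise the memory only once per session
    if "memory" not in session_state:
        session_state["memory"] = []

    local_history.append(user_input)
    session_state["memory"].append(user_input)

    print(f"local: {local_history} | session: {session_state['memory']}")
    return local_history


session_state = {}  # survives between runs, like st.session_state
run_script(session_state, "Hi")
run_script(session_state, "What did I just say?")
```

The local list forgets the first message on the second run, while the session dictionary keeps both.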

1.5 Capturing User Input and Generating a Response

# Capture user input
if prompt_text := st.chat_input("Enter your text..."):
    # Display user input
    st.chat_message("user").markdown(prompt_text)

    # Add user input to memory as a HumanMessage
    user_message = HumanMessage(content=prompt_text)
    st.session_state.memory.add_message(user_message)

    # Generate a response using the chain
    messages = [
        SystemMessage(content="You are a helpful assistant. Answer all questions to the best of your ability."),
        *st.session_state.memory.messages  # Include all previous messages from memory
    ]

    response = chain.invoke({
        "messages": messages
    })

    response_text = response.content  # Extract the assistant's response

Capture User Input:
- st.chat_input("Enter your text..."): Provides a text input field for the user.
- If the user enters text, it is captured in the variable prompt_text.
Display User Input:
- st.chat_message("user").markdown(prompt_text): Displays the user input in the chat interface as a user message.
Add Input to Memory:
- Creates a HumanMessage object using the captured input.
- Adds this message to the chat memory stored in st.session_state.memory.
Generate Response:
- Prepares the conversation history:
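  - SystemMessage: Provides context for the assistant. The prompt template from section 1.3 already contains the same system message, so the model receives it twice; passing only st.session_state.memory.messages would avoid the duplicate.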
  - st.session_state.memory.messages: Includes all previous messages (user and assistant) in the conversation.
- Passes the prepared messages to the chain, which processes them through the defined pipeline (prompt and model).
Retrieve Response:
- response.content: Extracts the assistant’s reply, which can then be displayed or further processed.

This section handles user interaction, updates chat memory, and integrates the processing pipeline to generate responses seamlessly.

1.6 Rendering Chatbot’s Response

    # Note: this code continues inside the `if prompt_text := ...` block from 1.5,
    # because response_text only exists after the user has submitted input.

    # Generate a word-by-word response for better UX
    def response_generator(result_text):
        for word in result_text.split():
            yield word + " "
            time.sleep(0.05)  # Simulate typing effect

    # Display assistant's response
    with st.chat_message("assistant"):
        response_placeholder = st.empty()  # Create a placeholder for dynamic updates
        response = ""
        for partial_response in response_generator(response_text):
            response += partial_response
            response_placeholder.markdown(response)  # Update the placeholder with the growing response
        response_placeholder.markdown(response_text)  # Ensure the final response is displayed

    # Add the assistant's response to the chat memory in session state
    ai_message = AIMessage(content=response_text)
    st.session_state.memory.add_message(ai_message)

Simulated Typing Effect:
- The response_generator(result_text) function splits the response into words and yields them one at a time, pausing briefly (time.sleep(0.05)) after each word to simulate typing.
Dynamic Response Display:
- A placeholder is created with st.empty() to dynamically update the response. The placeholder is updated word-by-word as the response is generated and finalized at the end.
Memory Integration:
- The assistant's response is wrapped in an AIMessage and stored in session memory with st.session_state.memory.add_message(ai_message) for future use.
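The typing effect can be tried outside Streamlit. In this sketch, stream_words is a hypothetical variant of response_generator with the delay exposed as a parameter (an addition made here for easy experimentation), and print stands in for the placeholder update.

```python
import time
from typing import Iterator


def stream_words(text: str, delay: float = 0.05) -> Iterator[str]:
    """Yield the text one word at a time, pausing briefly after each word."""
    for word in text.split():
        yield word + " "
        time.sleep(delay)


shown = ""
for chunk in stream_words("Memory makes chatbots feel coherent.", delay=0.01):
    shown += chunk
    print(shown)  # in the app, this is where the placeholder is re-rendered
```

Note that split() collapses all whitespace, so newlines and Markdown structure in the reply are lost while streaming. That is a good reason to render the original response_text once the loop finishes, as the app does.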

Complete Code

Here's the final version of the chatbot code that we just built together:

import os
import time
import streamlit as st
from dotenv import load_dotenv
from operator import itemgetter
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain_groq import ChatGroq 

# Load environment variables
load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")  # Fetch API key for Groq model

# Initialize the model using the API key
model = ChatGroq(model="Gemma2-9b-It", groq_api_key=groq_api_key)

# Define the prompt template for the conversation
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",  # System message that sets the behavior of the assistant
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),  # Placeholder for dynamic messages
    ]
)

# Set up the processing chain for the assistant's responses
chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages"))  # Extract messages
    | prompt  # Apply the prompt template
    | model  # Pass the prompt to the model for generating a response
)

# Streamlit UI Title
st.title("Synapse 🧠")

# Initialize memory in Streamlit session state to keep track of past messages
if "memory" not in st.session_state:
    st.session_state.memory = ChatMessageHistory()  # Store chat history in session state

# Display past messages from memory
for message in st.session_state.memory.messages:
    with st.chat_message("user" if isinstance(message, HumanMessage) else "assistant"):
        st.markdown(message.content)

# Capture user input through Streamlit's chat input box
if prompt_text := st.chat_input("Enter your text..."):
    # Display user input in the chat
    st.chat_message("user").markdown(prompt_text)

    # Add user input to memory as a HumanMessage
    user_message = HumanMessage(content=prompt_text)
    st.session_state.memory.add_message(user_message)

    # Prepare all messages for the chain, including system message and past conversation
    messages = [
        SystemMessage(content="You are a helpful assistant. Answer all questions to the best of your ability."),
        *st.session_state.memory.messages  # Include memory messages for context
    ]

    # Generate the assistant's response using the defined chain
    response = chain.invoke({
        "messages": messages
    })

    response_text = response.content

    # Function to generate a word-by-word response for a smoother UX
    def response_generator(result_text):
        for word in result_text.split():
            yield word + " "
            time.sleep(0.05)  # Simulate typing delay for a natural feel

    # Display the assistant's response incrementally
    with st.chat_message("assistant"):
        response_placeholder = st.empty()  # Create a placeholder for dynamic updates
        response = ""
        for partial_response in response_generator(response_text):
            response += partial_response
            response_placeholder.markdown(response)  # Update the same placeholder with partial response
        response_placeholder.markdown(response_text)  # Ensure final response is rendered

    # Add the assistant's response to the chat memory in session state
    ai_message = AIMessage(content=response_text)
    st.session_state.memory.add_message(ai_message)

Step 5: Launch Your Chatbot 🚀

Congratulations! Your chatbot is ready to roll. 🎉 Let’s bring it to life:

Run the Application:
Fire up your terminal, activate your virtual environment, and type this magical command:

streamlit run app.py

(Replace app.py with your file’s name if you chose something fancier! 🧐)

Watch It in Action:
Your browser will open, and just like that, your chatbot will be live, ready to showcase its conversational skills. Test it out and enjoy the results of your hard work!

And That’s a Wrap! 🎉

Congratulations, you’ve just created a memory-powered chatbot using LangChain’s new memory classes! 🧠✨ You've added a layer of intelligence to your bot that allows it to remember past interactions—how cool is that? Time to take a moment and bask in the glory of your creation! 🏆

What’s Next? 🚀

Hold onto your hat because in the next blog, we’ll unlock your chatbot’s inner JARVIS by using a session_id to give each conversation its own ChatMessageHistory, creating multiple chat sessions for the same user. Imagine: each conversation can have its own memory lane. Stay tuned for some serious chatbot wizardry! 🧙‍♂️

Got Ideas? 💡

Have suggestions or feedback for us? Or maybe you’ve got a cool feature idea for the chatbot? Let us know in the comments or drop me a message—I’d love to hear from you! 😊

Mastering Chatbot Memory: A Practical Guide - 2

Step-by-Step Guide to LangChain's Updated Memory Features

