When I first started using Quarkus Langchain4j, I ran into some issues because I didn’t fully understand how Quarkus manages chat memory with Langchain4j. So, here’s what I’m going to discuss:
- How CDI bean scopes of AI Services affect chat memory
- How chat memory is managed when @MemoryId is used as a parameter in Quarkus
- How default chat memory id works in Quarkus
- How chat memory can be leaked in some scenarios
- How to write your own Default Memory Id Provider
- How chat memory can affect your application
Chat Memory in Quarkus Langchain4j
Chat memory is a history of the conversation you’ve had with an LLM. When using Quarkus Langchain4j (or plain Langchain4j), this chat history is automatically sent to the LLM each time you interact with it. Think of chat memory as a chat session: a list of messages sent to and from the LLM. Each chat history is referenced by a unique ID and kept in a chat memory store. How the store holds memory depends on how you’ve built your app; it could be kept in memory or in a database, for instance.
@MemoryId
With Quarkus Langchain4j you either use the @MemoryId annotation on a parameter to identify the chat history to use, or you let Quarkus provide this identifier by default. Let’s look at @MemoryId first:
import dev.langchain4j.service.MemoryId;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService
public interface MyChat {
    @SystemMessage("You are a nice assistant")
    String chat(@UserMessage String msg, @MemoryId String id);
}

@RegisterAiService
public interface AnotherChat {
    @SystemMessage("You are a mean assistant")
    String chat(@UserMessage String msg, @MemoryId String id);
}
With @MemoryId, the application developer provides the chat memory identifier to use. The chat history is shared with, and concatenated across, any other AI service that uses the same memory id. For example:
@Inject
MyChat myChat;

@Inject
AnotherChat anotherChat;

public void call() {
    String id = "1234";
    String first = myChat.chat("Hello!", id);
    String second = anotherChat.chat("GoodBye", id);
}
There are a couple of things to think about when sharing a @MemoryId between different AI services (prompts).
Shared Chat History
With the second call (AnotherChat.chat()), the chat history of the first call (MyChat.chat()) is also included, because the same memory id is passed to both calls.
Only 1 SystemMessage per history
Running this code also shows that the original SystemMessage from MyChat is removed from the chat history and the new SystemMessage from AnotherChat is added: only one SystemMessage is allowed per history.
Self Management of ID
The application developer is responsible for creating and managing the @MemoryId. You have to ensure that the id is unique (easily done with something like a UUID), otherwise different chat sessions could corrupt each other. If your chat is a sequence of REST calls, then you’ll have to make sure the client passes this memory id along between HTTP invocations.
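As a sketch of that self-management (the class and method names here are my own illustration, not part of any Quarkus API): reuse the client-supplied id when one is present, otherwise mint a fresh UUID and hand it back to the client so it can be echoed on the next request.

```java
import java.util.UUID;

public class MemoryIds {

    // Reuse the client-supplied id if present; otherwise mint a fresh one.
    // The caller must return this id to the client (e.g. in a response
    // header or cookie) so it can be sent back on the next HTTP request.
    public static String resolveMemoryId(String clientSupplied) {
        if (clientSupplied == null || clientSupplied.isBlank()) {
            return UUID.randomUUID().toString();
        }
        return clientSupplied;
    }
}
```

Because UUIDs are effectively collision-free, two concurrent users can never end up sharing (and corrupting) the same chat history by accident.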
Sometimes LLMs are sensitive to what is in chat history. In the case above, the chat history has a mix of chat messages from two different prompts. It also loses the context of MyChat, as the MyChat system message was removed. Usually this is not a big deal, but every once in a while you might see your LLM get confused.
Default Memory Id
If a @MemoryId is not specified, then Quarkus Langchain4j decides what the memory id is.
package com.acme;

@RegisterAiService
public interface MyChat {
    String chat(@UserMessage String msg);
}
In vanilla, standalone Langchain4j, the default memory id is "default". If you’re using Langchain4j on its own, you should not rely on default memory ids in multi-user/multi-session applications, as chat histories will corrupt each other.
Quarkus Langchain4j does something different. A unique id is generated per CDI request scope (the request scope being the HTTP invocation, Kafka invocation, etc.). The fully qualified interface name and method name of the AI service are then appended, separated from the per-request id by a “#” character. In other words, the format of the default memory id is:
<random-per-request-id>#<full qualified interface name>.<method-name>
So, for the above Java code, the default memory id for MyChat.chat would be:
@2342351#com.acme.MyChat.chat
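To make the shape concrete, here’s a tiny illustration that just assembles a string of that form. The real id is generated internally by Quarkus, so treat this purely as a demonstration of the format:

```java
public class DefaultIdFormat {

    // Assemble a string in the documented shape:
    // <random-per-request-id>#<fully qualified interface name>.<method-name>
    public static String format(String requestId, String interfaceName, String methodName) {
        return requestId + "#" + interfaceName + "." + methodName;
    }

    public static void main(String[] args) {
        System.out.println(format("@2342351", "com.acme.MyChat", "chat"));
        // prints: @2342351#com.acme.MyChat.chat
    }
}
```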
There are a couple of things to think about with this default Quarkus implementation.
Default Memory Id is tied to the request scope
Since the default id is a unique id tied to the request scope, the next time you invoke an AI service after your HTTP invocation finishes, a different default memory id will be used, and you’ll start with a completely new chat history.
Different chat history per AI Service method
Since the default id incorporates the AI service interface and method name, there is a different chat history per AI service method. Unlike the example in the @MemoryId section, chat history is not shared between prompts.
Using the Websocket extension gives you per session chat histories
If you use the WebSocket integration to implement your chat, the default id is instead unique per session rather than per request. This means the default memory id is retained and meaningful for the entire chat session, and you’ll keep chat history between remote chat requests. The AI service interface and method name are still appended to the default memory id, though!
Default memory ids vs. using @MemoryId
So what should you use, default memory ids or @MemoryId? If you have a remote chat app where user interactions span multiple remote requests (i.e. HTTP/REST), then only use default memory ids for prompts that don’t want or need a complete chat history. In other words, only use default ids if the prompt doesn’t need chat memory. If you need chat history between remote requests, then you’ll need to use @MemoryId and manage the ids yourself.
The WebSocket extension flips this. Since the default memory id is generated per WebSocket connection, you get a real session, and default memory ids are wonderful because you don’t have to manage memory ids in your application.
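Writing your own Default Memory Id Provider
The per-request and per-WebSocket-session behaviors above are plugged in through a provider SPI, and you can supply your own. The sketch below reproduces the SPI interface inline as an assumption so it is self-contained; recent Quarkus Langchain4j versions expose it as io.quarkiverse.langchain4j.spi.DefaultMemoryIdProvider, registered through the standard Java ServiceLoader, but verify the exact name and priority semantics against your version’s docs. The thread-local plumbing is my own illustration, not a Quarkus API.

```java
// Assumed shape of the Quarkus Langchain4j SPI, reproduced inline so this
// sketch compiles on its own; check your version for the real interface.
interface DefaultMemoryIdProvider {

    // Providers are consulted in priority order; check the docs for the
    // exact precedence rules of the built-in providers.
    default int priority() {
        return 0;
    }

    // Return null when this provider does not apply, so the next provider
    // (request scope, WebSocket session, ...) is consulted instead.
    Object getMemoryId();
}

// A provider that keys chat memory on a value stashed per thread, e.g.
// populated from an HTTP header by a server filter. In a real application
// this class would be registered via a
// META-INF/services/io.quarkiverse.langchain4j.spi.DefaultMemoryIdProvider file.
class ThreadLocalMemoryIdProvider implements DefaultMemoryIdProvider {

    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    @Override
    public Object getMemoryId() {
        return CURRENT.get(); // null => fall through to built-in defaults
    }
}
```

With a provider like this in place, any AI service method without a @MemoryId parameter would pick up your id instead of the per-request one, as long as something sets the thread-local before the call.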
Memory Lifecycle tied to CDI bean scope
AI services in Quarkus Langchain4j are CDI beans. If you do not specify a scope for the bean, it defaults to @RequestScoped. When a bean goes out of scope and is destroyed, an interesting thing happens: any memory id referenced by the bean is wiped from the chat memory store and is gone forever. ANY memory id: the default memory id or any id provided through @MemoryId parameters.
@RegisterAiService
@ApplicationScoped
public interface AppChat {
    String chat(@UserMessage String msg, @MemoryId String id);
}

@RegisterAiService
@SessionScoped
public interface SessionChat {
    String chat(@UserMessage String msg, @MemoryId String id);
}

@RegisterAiService
@RequestScoped
public interface RequestChat {
    String chat(@UserMessage String msg, @MemoryId String id);
}
So, for the above code, any memory referenced by the id parameter of RequestChat.chat() will be wiped at the end of the request scope (i.e. the HTTP request). For SessionChat, memory is wiped when the CDI session is destroyed, and for AppChat, when the application shuts down.
Memory tied to the smallest scope used
So, what if, within the same REST invocation, you use the same memory id with all three of the AI services above?
@Inject
AppChat app;

@Inject
RequestChat req;

@GET
public String restCall() {
    String memoryId = "1234";
    app.chat("hello", memoryId);
    return req.chat("goodbye", memoryId);
}
So, in the restCall() method, even though AppChat is application scoped, since RequestChat uses the same memory id, "1234", the chat history will be wiped from the chat memory store at the end of the REST request.
Default memory id can cause a leak
If you are relying on default memory ids and your AI service has a scope other than @RequestScoped, then you will leak chat memory, and it will grow until it hits the constraints of the memory store. For example:
@ApplicationScoped
@RegisterAiService
public interface AppChat {
    String chat(@UserMessage String msg);
}
Quarkus’s default memory id is generated per request scope, so each and every time AppChat.chat() is called within a different request scope, a new memory id (and a new chat history) is created. Because the application-scoped bean is not destroyed until shutdown, none of these histories is ever cleaned up: chat memory entries in the chat memory store will grow until the application shuts down.
Never use @ApplicationScoped with default ids
So, the moral of the story is: never use @ApplicationScoped with your AI services if you’re relying on default ids. If you are using the WebSocket extension, then you can use @SessionScoped, but otherwise make sure your AI services are @RequestScoped.
What bean scopes should you use?
For REST-based chat applications:
- Use the combination of @ApplicationScoped and @MemoryId parameters to provide chat history between requests
- Use @RequestScoped and default memory ids for prompts that don’t need a chat history
- Do not share the same memory ids between @ApplicationScoped and @RequestScoped AI services
- If using the WebSocket extension, use @SessionScoped on your AI services that require a chat history
Chat Memory and your LLM
So, hopefully you understand how chat memory works with Quarkus Langchain4j now. Just remember:
- Chat history is sent to your LLM with each request.
- Limiting chat history can speed up LLM interactions and cost you less money!
- Limiting chat history can focus your LLM.
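On that last point, one concrete knob is configuration. As an assumption to verify against your Quarkus Langchain4j version’s documentation, the extension supports a message-window chat memory whose size can be capped in application.properties:

```properties
# Assumed property names -- verify against your Quarkus Langchain4j version.
# Keep only the most recent 10 messages of each chat history.
quarkus.langchain4j.chat-memory.type=message-window
quarkus.langchain4j.chat-memory.memory-window.max-messages=10
```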
All discussions for another blog! Cheers.
