Chat memory gets fuzzy fast once the UI hides what LangChain4j is actually retaining.
I wrote a Quarkus tutorial that makes retained-memory pressure visible using `TokenWindowChatMemory`, Ollama request counts, a turn ledger, and OpenTelemetry attributes. The key distinction: your app-level eviction budget is not the model's context limit. https://www.the-main-thread.com/p/quarkus-langchain4j-chat-memory-budget #Java #Quarkus #LangChain4j #Ollama #OpenTelemetry