Memory enables chat engines to maintain conversation context across multiple turns. LlamaIndex provides a flexible memory system with short-term, long-term, and specialized memory blocks.
```typescript
import { Memory } from "@llamaindex/core/memory";
import { Settings } from "llamaindex";

const memory = new Memory([], {
  tokenLimit: 30000, // Default: 30k tokens
  llm: Settings.llm, // Use global LLM
});

// Add user and assistant messages
await memory.add({
  role: "user",
  content: "What is LlamaIndex?",
});
await memory.add({
  role: "assistant",
  content: "LlamaIndex is a data framework for LLM applications.",
});

// Get messages within token limit
const messages = await memory.getLLM();
console.log(messages.length); // 2
```
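To make the effect of a token limit concrete, here is a minimal standalone sketch of keeping only the most recent messages that fit a budget. It is not LlamaIndex's implementation: the `ChatMessage` shape, `estimateTokens` (a rough 4-characters-per-token guess), and `fitToTokenLimit` are hypothetical names introduced for illustration.

```typescript
// Hypothetical message shape for illustration; not the LlamaIndex type.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

// Crude token estimate (assumption: roughly 4 characters per token).
function estimateTokens(message: ChatMessage): number {
  return Math.ceil(message.content.length / 4);
}

// Keep the most recent messages whose combined estimate fits the limit.
function fitToTokenLimit(
  messages: ChatMessage[],
  tokenLimit: number,
): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so recent context is preserved first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i]);
    if (used + cost > tokenLimit) break; // stop at the first message that does not fit
    kept.unshift(messages[i]); // restore chronological order
    used += cost;
  }
  return kept;
}
```

With a generous limit such as 30000, a short two-message history is returned unchanged; with a tight limit, the oldest messages are the first to drop out.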
Stores conversations in a vector store for semantic retrieval:
```typescript
import { VectorMemoryBlock } from "@llamaindex/core/memory/block";
import { SimpleVectorStore } from "llamaindex/vector-store";
import { OpenAIEmbedding } from "@llamaindex/openai";

const vectorBlock = new VectorMemoryBlock({
  id: "user-123-memory",
  vectorStore: new SimpleVectorStore(),
  embedModel: new OpenAIEmbedding(),
  priority: 1, // Higher priority = included first
  isLongTerm: true, // Stores processed messages long-term
  retrievalContextWindow: 5, // Use last 5 messages for retrieval
  queryOptions: {
    similarityTopK: 2,
    sessionFilterKey: "session_id",
  },
});

const memory = new Memory([], {
  memoryBlocks: [vectorBlock],
});

// Messages are automatically stored in vector memory
await memory.add({ role: "user", content: "I like pizza" });
await memory.add({ role: "assistant", content: "Great choice!" });

// Later, relevant memories are retrieved
await memory.add({ role: "user", content: "What food do I like?" });
const messages = await memory.getLLM();
// Includes retrieved "I like pizza" from vector memory
```
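The retrieval step can be pictured with a small standalone sketch of top-K similarity search. This is an illustration only, under loud assumptions: a toy bag-of-words vector stands in for a real embedding model, and `embed`, `cosineSimilarity`, and `retrieveTopK` are hypothetical helpers, not part of LlamaIndex.

```typescript
// Toy stand-in for an embedding: term counts keyed by lowercase word.
type TermVector = { [term: string]: number };

function embed(text: string): TermVector {
  const vector: TermVector = Object.create(null);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (let i = 0; i < words.length; i++) {
    vector[words[i]] = (vector[words[i]] || 0) + 1;
  }
  return vector;
}

// Euclidean length of a term vector.
function norm(v: TermVector): number {
  let sum = 0;
  const terms = Object.keys(v);
  for (let i = 0; i < terms.length; i++) sum += v[terms[i]] * v[terms[i]];
  return Math.sqrt(sum);
}

// Cosine similarity; returns 0 when either vector is empty.
function cosineSimilarity(a: TermVector, b: TermVector): number {
  let dot = 0;
  const terms = Object.keys(a);
  for (let i = 0; i < terms.length; i++) dot += a[terms[i]] * (b[terms[i]] || 0);
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Return the topK stored texts most similar to the query.
function retrieveTopK(stored: string[], query: string, topK: number): string[] {
  const q = embed(query);
  return stored
    .map((text) => ({ text, score: cosineSimilarity(embed(text), q) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.text);
}
```

In this toy version, the query "What food do I like?" shares the words "i" and "like" with "I like pizza", so that message ranks above "Great choice!". A real embedding model matches on meaning rather than shared words.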
Extracts facts from the conversation and includes them in the context:

```typescript
import { FactExtractionMemoryBlock } from "@llamaindex/core/memory/block";
import { Settings } from "llamaindex";

const factBlock = new FactExtractionMemoryBlock({
  id: "facts",
  llm: Settings.llm,
  maxFacts: 10,
  priority: 2, // Higher priority than vector memory
  isLongTerm: true,
});

const memory = new Memory([], {
  memoryBlocks: [factBlock],
});

// Facts are automatically extracted
await memory.add({
  role: "user",
  content: "My name is Alice and I'm a software engineer in SF.",
});
await memory.add({
  role: "user",
  content: "I'm working on a RAG application.",
});

// Extracted facts are included in context
const messages = await memory.getLLM();
// Includes extracted facts as a memory message
```
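The `priority` values in the examples suggest that blocks are considered in descending priority order. The sketch below shows one way such ordering could interact with a size budget. It is an assumption-laden illustration, not the library's algorithm: `BlockContent`, `selectBlocks`, and the character-based budget are all hypothetical.

```typescript
// Hypothetical shape: the rendered text of a memory block plus its priority.
type BlockContent = { id: string; priority: number; text: string };

// Consider blocks from highest to lowest priority and include each one that
// still fits the remaining budget (assumption: budget counted in characters).
function selectBlocks(blocks: BlockContent[], budget: number): string[] {
  const ordered = blocks.slice().sort((a, b) => b.priority - a.priority);
  const included: string[] = [];
  let remaining = budget;
  for (let i = 0; i < ordered.length; i++) {
    if (ordered[i].text.length > remaining) continue; // skip blocks that do not fit
    included.push(ordered[i].id);
    remaining -= ordered[i].text.length;
  }
  return included;
}
```

Under these assumptions, a "facts" block with priority 2 would claim budget before a vector block with priority 1, so the lower-priority block is the one omitted when space runs short.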
Include temporary messages without adding them to history:
```typescript
const currentQuery = {
  role: "user" as const,
  content: "What did we discuss about pizza?",
};

// Include currentQuery without adding to memory
const messages = await memory.getLLM(
  undefined, // Use default LLM
  [currentQuery], // Transient messages
);
// currentQuery is included but not stored
```
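The difference between stored and transient messages can be shown with a tiny standalone model. `SketchMemory` and its methods are hypothetical and only mimic the behavior described above; they are not the LlamaIndex `Memory` class.

```typescript
// Hypothetical message shape for illustration.
type ChatMessage = { role: "user" | "assistant"; content: string };

class SketchMemory {
  private stored: ChatMessage[] = [];

  // Persist a message in the conversation history.
  add(message: ChatMessage): void {
    this.stored.push(message);
  }

  // Return stored history followed by the transient messages.
  // `concat` builds a new array, so `stored` is never modified here.
  getMessages(transient: ChatMessage[] = []): ChatMessage[] {
    return this.stored.concat(transient);
  }

  // Number of messages actually persisted.
  size(): number {
    return this.stored.length;
  }
}
```

Calling `getMessages([currentQuery])` on this sketch returns the query alongside the history, while `size()` stays the same afterwards, which is the property transient messages rely on.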