Overview
Chat engines enable conversational interactions with your data, maintaining chat history and context across multiple turns.
BaseChatEngine
Abstract base class for all chat engines.
import { BaseChatEngine } from "@llamaindex/core/chat-engine";
Properties
chatHistory
ChatMessage[] | Promise<ChatMessage[]>
The conversation history
Methods
chat
Send a message and get a response.
Non-streaming: chat(params: NonStreamingChatEngineParams): Promise<EngineResponse>
Streaming: chat(params: StreamingChatEngineParams): Promise<AsyncIterable<EngineResponse>>
Parameters
message
string | MessageContentDetail[]
required
The user message (text or multi-modal)
stream
Whether to stream the response
chatHistory
Optional custom chat history or memory
Provider-specific chat options
Returns
response
The assistant’s response text
sourceNodes
Retrieved source nodes (context chat engine only)
metadata
Additional response metadata
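The two signatures above are TypeScript overloads, with the `stream` flag selecting the return type. The following is a minimal sketch of that pattern in isolation. `Reply`, `EchoEngine`, and the two parameter types are hypothetical stand-ins written for this illustration, not the library's definitions.

```typescript
// Hypothetical stand-ins for the library types, reduced to the fields used here.
type Reply = { response: string };
type NonStreamingParams = { message: string; stream?: false };
type StreamingParams = { message: string; stream: true };

class EchoEngine {
  // Overload signatures: the return type depends on the `stream` flag.
  chat(params: NonStreamingParams): Promise<Reply>;
  chat(params: StreamingParams): Promise<AsyncIterable<Reply>>;
  async chat(
    params: NonStreamingParams | StreamingParams,
  ): Promise<Reply | AsyncIterable<Reply>> {
    const text = `echo: ${params.message}`;
    if (params.stream) {
      // Streaming: yield the reply one word at a time.
      return (async function* () {
        for (const word of text.split(" ")) {
          yield { response: word + " " };
        }
      })();
    }
    // Non-streaming: return the whole reply at once.
    return { response: text };
  }
}
```

With this shape, the compiler knows that `chat({ message })` resolves to a single reply and that `chat({ message, stream: true })` resolves to something usable in `for await`.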
SimpleChatEngine
A basic chat engine without retrieval: it sends the conversation directly to the LLM.
import { SimpleChatEngine } from "@llamaindex/core/chat-engine";
import { OpenAI } from "@llamaindex/openai";
Example
const llm = new OpenAI({ model: "gpt-4" });
const chatEngine = new SimpleChatEngine({ llm });
const response1 = await chatEngine.chat({
  message: "Hello! My name is Alice."
});
console.log(response1.response); // "Hello Alice! How can I help you?"
const response2 = await chatEngine.chat({
  message: "What's my name?"
});
console.log(response2.response); // "Your name is Alice."
ContextChatEngine
A chat engine with retrieval: it fetches relevant context for each message before calling the LLM.
import { ContextChatEngine } from "@llamaindex/core/chat-engine";
Constructor Options
retriever
Retriever for fetching relevant context
Language model (defaults to Settings.llm)
systemPrompt
System prompt for the chat
contextSystemPrompt
Template for injecting retrieved context
Example
import { VectorStoreIndex } from "llamaindex";
import { Document } from "@llamaindex/core/schema";
const documents = [
  new Document({ text: "The company was founded in 2020." }),
  new Document({ text: "Our main product is a data framework." })
];
const index = await VectorStoreIndex.fromDocuments(documents);
const chatEngine = index.asChatEngine();
const response = await chatEngine.chat({
  message: "When was the company founded?"
});
console.log(response.response); // "The company was founded in 2020."
console.log(response.sourceNodes); // Retrieved context nodes
Streaming Chat
const chatEngine = index.asChatEngine();
const stream = await chatEngine.chat({
  message: "Tell me about the company",
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}
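When the full text is needed after streaming finishes (for logging or storage, say), the chunks can be accumulated while they are displayed. The sketch below assumes each chunk's `response` holds only the newly generated text, as the loop above suggests. `Chunk`, `collectStream`, and `fakeStream` are hypothetical names for this illustration, and the fake stream stands in for the result of `chatEngine.chat({ ..., stream: true })`.

```typescript
// Minimal chunk shape assumed for this sketch; real chunks carry more fields.
type Chunk = { response: string };

// Consume a stream, forwarding each piece to `onDelta` and returning the full text.
async function collectStream(
  stream: AsyncIterable<Chunk>,
  onDelta: (delta: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    onDelta(chunk.response);
    full += chunk.response;
  }
  return full;
}

// Fake stream standing in for a streaming chat call.
async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { response: "The company " };
  yield { response: "was founded in 2020." };
}
```

A caller could pass a callback that writes each piece to the terminal and still receive the joined answer from the returned promise.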
Multi-modal Chat
Chat engines support images and other media:
const response = await chatEngine.chat({
  message: [
    { type: "text", text: "What's in this image?" },
    {
      type: "image_url",
      image_url: { url: "data:image/jpeg;base64,..." }
    }
  ]
});
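The image part above carries its data as a `data:` URL. As a small sketch of assembling such a message from already base64-encoded bytes, the helper below builds the same two-part array. `buildImageMessage` and the local part types are hypothetical, defined only to keep the example self-contained.

```typescript
// Local copies of the two content-part shapes used in the example above.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

// Build a multi-modal message from a question and base64-encoded image data.
function buildImageMessage(
  question: string,
  base64Data: string,
  mimeType: string = "image/jpeg",
): ContentPart[] {
  return [
    { type: "text", text: question },
    {
      type: "image_url",
      image_url: { url: `data:${mimeType};base64,${base64Data}` },
    },
  ];
}
```

The returned array has the shape passed as `message` in the call above.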
Custom Chat History
Using ChatMemoryBuffer
import { ChatMemoryBuffer } from "@llamaindex/core/memory";
const memory = new ChatMemoryBuffer({ tokenLimit: 3000 });
const response = await chatEngine.chat({
  message: "Hello",
  chatHistory: memory
});
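To give a feel for what a token limit does to a conversation, here is a rough sketch that keeps only the most recent messages fitting a budget. It is an illustration of the idea, not `ChatMemoryBuffer`'s actual algorithm: `trimToLimit` and `estimateTokens` are hypothetical helpers, and counting whitespace-separated words is a stand-in for a real tokenizer.

```typescript
type Message = { role: "user" | "assistant" | "system"; content: string };

// Rough token estimate: whitespace-separated words. Real tokenizers count differently.
const estimateTokens = (m: Message): number =>
  m.content.split(/\s+/).filter(Boolean).length;

// Keep the most recent messages whose combined estimate fits within `tokenLimit`.
function trimToLimit(history: Message[], tokenLimit: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const msg of [...history].reverse()) {
    const cost = estimateTokens(msg);
    if (used + cost > tokenLimit) break;
    kept.unshift(msg);
    used += cost;
  }
  return kept;
}
```

Older turns fall out first, so the model always sees the latest exchange even in a long conversation.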
Manual Chat History
import type { ChatMessage } from "@llamaindex/core/llms";
const customHistory: ChatMessage[] = [
  { role: "user", content: "Previous question" },
  { role: "assistant", content: "Previous answer" }
];
const response = await chatEngine.chat({
  message: "Follow-up question",
  chatHistory: customHistory
});
System Prompts
Setting System Prompt
const chatEngine = index.asChatEngine({
  systemPrompt: "You are a helpful assistant that always speaks in rhymes."
});
Custom Context Template
const chatEngine = index.asChatEngine({
  contextSystemPrompt: `
Use the following context to answer the question.
If you don't know, say so.
Context:
{context}
Question: {query}
`
});
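Placeholders such as `{context}` and `{query}` are replaced with real values before the prompt reaches the model. As a rough illustration of that substitution step (not the library's own template code; `fillTemplate` is a hypothetical helper), a minimal version might look like this:

```typescript
// Replace each `{name}` placeholder with the matching value.
// Placeholders without a supplied value are left untouched.
function fillTemplate(
  template: string,
  values: Record<string, string>,
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match,
  );
}

const prompt = fillTemplate("Context:\n{context}\nQuestion: {query}", {
  context: "The company was founded in 2020.",
  query: "When was the company founded?",
});
```

Leaving unknown placeholders in place makes a missing variable visible in the final prompt instead of silently producing an empty string.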
Chat History Management
Accessing Chat History
const response = await chatEngine.chat({
  message: "Hello"
});
const history = await chatEngine.chatHistory;
console.log(history);
// [
//   { role: "user", content: "Hello" },
//   { role: "assistant", content: "Hi! How can I help you?" }
// ]
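Because `chatHistory` is typed as `ChatMessage[] | Promise<ChatMessage[]>`, awaiting it covers both cases: `await` on a plain array simply returns the array. The sketch below shows a helper written against that union. `Message` and `lastAssistantReply` are hypothetical names used only for this illustration.

```typescript
// Simplified message shape for this sketch.
type Message = { role: string; content: string };

// Return the most recent assistant message, whether the history is
// available synchronously or behind a promise.
async function lastAssistantReply(
  history: Message[] | Promise<Message[]>,
): Promise<string | undefined> {
  const messages = await history; // works for both a plain array and a promise
  return [...messages].reverse().find((m) => m.role === "assistant")?.content;
}
```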
Resetting Chat History
import { SimpleChatEngine } from "@llamaindex/core/chat-engine";
const chatEngine = new SimpleChatEngine({ llm });
// Chat history is stored in the engine
await chatEngine.chat({ message: "Message 1" });
await chatEngine.chat({ message: "Message 2" });
// Reset by creating a new engine
const newChatEngine = new SimpleChatEngine({ llm });
Retrieval Configuration
const chatEngine = index.asChatEngine({
  retriever: index.asRetriever({
    similarityTopK: 5,
    mode: "default"
  })
});
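`similarityTopK` caps how many nodes are retrieved per message. The sketch below shows what "top K by similarity" means on toy data. It assumes cosine similarity over in-memory vectors, which is an assumption for illustration only; the actual scoring is delegated to the retriever and its vector store. `cosineSimilarity` and `topK` are hypothetical helpers.

```typescript
type Scored = { id: string; score: number };
type Embedded = { id: string; embedding: number[] };

// Dot product; missing components are treated as zero.
const dot = (a: number[], b: number[]): number =>
  a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);

// Cosine similarity, defined as 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Score every document against the query and keep the k best matches.
function topK(query: number[], docs: Embedded[], k: number): Scored[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A larger K gives the model more context to draw on at the cost of a longer prompt, so it trades recall against token usage.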
Custom Chat Engine
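import { BaseChatEngine } from "@llamaindex/core/chat-engine";
import type { NonStreamingChatEngineParams } from "@llamaindex/core/chat-engine";
import type { ChatMessage } from "@llamaindex/core/llms";
import { EngineResponse } from "@llamaindex/core/schema";
// Sketch: only the non-streaming signature is implemented here.
// A complete engine also needs the streaming overload of chat().
class CustomChatEngine extends BaseChatEngine {
  private history: ChatMessage[] = [];
  async chat(params: NonStreamingChatEngineParams): Promise<EngineResponse> {
    const { message, chatHistory } = params;
    // Use the provided history if it is a plain array, otherwise the internal one.
    // Copy it so the caller's array is not mutated (memory objects are ignored in this sketch).
    const messages: ChatMessage[] = Array.isArray(chatHistory)
      ? [...chatHistory]
      : [...this.history];
    // Add user message
    messages.push({ role: "user", content: message });
    // Generate response (custom logic)
    const response = await this.generateResponse(messages);
    // Add assistant message (annotated so that `role` is not widened to string)
    const assistantMessage: ChatMessage = { role: "assistant", content: response };
    messages.push(assistantMessage);
    // Update internal history
    this.history = messages;
    // Wrap the text in an EngineResponse with no source nodes.
    // Assumes EngineResponse.fromResponse(text, stream, sourceNodes); verify against your installed version.
    return EngineResponse.fromResponse(response, false, []);
  }
  get chatHistory() {
    return this.history;
  }
  private async generateResponse(messages: ChatMessage[]): Promise<string> {
    // Custom response generation
    return "Response";
  }
}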
Best Practices
Use context chat engine for RAG: Retrieves relevant information for each turn
Manage token limits: Use ChatMemoryBuffer to prevent context overflow
Provide clear system prompts: Guide the assistant’s behavior
Stream long responses: Better user experience for lengthy answers
Reset history periodically: Prevent context from becoming too large or stale
Include source nodes: Track which documents informed the response