The SummaryIndex (formerly ListIndex) stores nodes in sequential order and does not use embeddings. It is best suited to summarization tasks and small document collections.
When to Use SummaryIndex
Use SummaryIndex when:
- Summarizing documents: You want to generate summaries by processing all nodes
- Small datasets: You have a limited number of documents
- No embeddings needed: You want to avoid embedding costs
- Sequential processing: You need to process documents in order
- Complete context: You want to ensure all documents are considered
Don’t use SummaryIndex when:
- You have large document collections (use VectorStoreIndex instead)
- You need semantic similarity search
- You want selective retrieval based on relevance
Building Summary Indices
From Documents
import { Document, SummaryIndex } from "llamaindex";
const documents = [
new Document({ text: "Chapter 1: Introduction to AI" }),
new Document({ text: "Chapter 2: Machine Learning Basics" }),
new Document({ text: "Chapter 3: Deep Learning" }),
];
const index = await SummaryIndex.fromDocuments(documents);
From Nodes
import { Document, SummaryIndex, Settings } from "llamaindex";
const documents = [
new Document({ text: "Long document text..." }),
];
// Parse into nodes
const nodes = await Settings.nodeParser.getNodesFromDocuments(documents);
// Create index from nodes
const index = await SummaryIndex.init({ nodes });
With Storage Context
import { storageContextFromDefaults, SummaryIndex } from "llamaindex";
const storageContext = await storageContextFromDefaults({
persistDir: "./storage",
});
const index = await SummaryIndex.fromDocuments(documents, {
storageContext,
});
Querying Strategies
Default Retriever
The default retriever returns all nodes in the index:
import { SummaryIndex, SummaryRetrieverMode } from "llamaindex";
const index = await SummaryIndex.fromDocuments(documents);
const retriever = index.asRetriever({
mode: SummaryRetrieverMode.DEFAULT,
});
const nodes = await retriever.retrieve({
query: "Summarize the content"
});
console.log(`Retrieved ${nodes.length} nodes`);
// All nodes are returned with score = 1
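To make the default behaviour concrete, here is a minimal, library-free sketch of what "return every node with score 1" means. The `SketchNode` type and the `retrieveAll` function are hypothetical stand-ins for illustration, not LlamaIndexTS APIs.

```typescript
// Hypothetical stand-ins for illustration; not part of the llamaindex package.
type SketchNode = { id: string; text: string };
type ScoredNode = { node: SketchNode; score: number };

// Default-mode retrieval ignores the query and returns every node, in order.
function retrieveAll(nodes: SketchNode[], _query: string): ScoredNode[] {
  return nodes.map((node) => ({ node, score: 1 }));
}

const sketchNodes: SketchNode[] = [
  { id: "n1", text: "Chapter 1: Introduction to AI" },
  { id: "n2", text: "Chapter 2: Machine Learning Basics" },
  { id: "n3", text: "Chapter 3: Deep Learning" },
];

const retrieved = retrieveAll(sketchNodes, "Summarize the content");
console.log(retrieved.map((r) => `${r.node.id}:${r.score}`).join(", "));
// n1:1, n2:1, n3:1
```

Because the query plays no part in selection, the cost of this mode shows up later, when every node is passed to response synthesis.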
LLM Retriever
Use the LLM to select relevant nodes:
import { SummaryRetrieverMode } from "llamaindex";
const retriever = index.asRetriever({
mode: SummaryRetrieverMode.LLM,
});
const nodes = await retriever.retrieve({
query: "What are the key points about machine learning?"
});
// LLM selects most relevant nodes
nodes.forEach((node) => {
console.log(`Score: ${node.score}`);
console.log(`Text: ${node.node.getText()}`);
});
How LLM Mode Works:
- The retriever sends nodes to the LLM in batches
- The LLM evaluates each node's relevance to the query
- The retriever returns only the selected nodes, with their relevance scores
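The steps above can be sketched as a simple loop over batches. This is an illustrative model only: `selectWithLLM`, `SelectFn`, and the keyword-based `keywordSelector` (which stands in for a real LLM call) are hypothetical names, and the actual retriever's prompt formatting and answer parsing are not shown.

```typescript
// Hypothetical types and helpers for illustration; not LlamaIndexTS APIs.
type SketchNode = { id: string; text: string };
type Choice = { index: number; relevance: number }; // index is 0-based within a batch
type SelectFn = (query: string, batch: SketchNode[]) => Promise<Choice[]>;

// Walk the node list in fixed-size batches and keep only what the selector picks.
async function selectWithLLM(
  nodes: SketchNode[],
  query: string,
  select: SelectFn,
  batchSize = 10,
): Promise<{ node: SketchNode; score: number }[]> {
  const results: { node: SketchNode; score: number }[] = [];
  for (let start = 0; start < nodes.length; start += batchSize) {
    const batch = nodes.slice(start, start + batchSize);
    const choices = await select(query, batch); // one selector call per batch
    for (const { index, relevance } of choices) {
      const node = batch[index];
      if (node) results.push({ node, score: relevance }); // skip out-of-range picks
    }
  }
  return results;
}

// Stub selector standing in for the LLM: picks nodes sharing a word with the query.
const keywordSelector: SelectFn = async (query, batch) => {
  const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
  const choices: Choice[] = [];
  batch.forEach((node, index) => {
    const text = node.text.toLowerCase();
    if (words.some((w) => text.includes(w))) choices.push({ index, relevance: 1 });
  });
  return choices;
};

const chapters: SketchNode[] = [
  { id: "n1", text: "Chapter 1: Introduction to AI" },
  { id: "n2", text: "Chapter 2: Machine Learning Basics" },
  { id: "n3", text: "Chapter 3: Deep Learning" },
];

const selectionPromise = selectWithLLM(
  chapters,
  "key points about machine learning",
  keywordSelector,
  2,
);
selectionPromise.then((picked) => console.log(picked.map((p) => p.node.id).join(", ")));
// n2, n3
```

The point of the sketch is the shape of the work: retrieval cost grows with the number of batches, while the number of nodes passed on to synthesis shrinks to whatever the selector keeps.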
Query Engine
Basic Query Engine
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
query: "Summarize the main topics",
});
console.log(response.toString());
With Custom Response Synthesizer
import { getResponseSynthesizer } from "@llamaindex/core/response-synthesizers";
const responseSynthesizer = getResponseSynthesizer("tree_summarize");
const queryEngine = index.asQueryEngine({
responseSynthesizer,
});
const response = await queryEngine.query({
query: "Create a comprehensive summary",
});
Available Response Synthesizers:
- compact: Concatenate nodes until context limit
- tree_summarize: Build summary tree recursively
- simple_summarize: Truncate to fit context
- refine: Iteratively refine answer with each node
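To illustrate how two of these strategies differ in shape, the sketch below contrasts a sequential "refine" loop with a recursive "tree_summarize" pass. It is a simplified model, not the library's implementation: `fakeLLM`, `refine`, and `treeSummarize` are hypothetical names, the fake LLM only concatenates its inputs, and real synthesizers also pack text to fit the context window.

```typescript
// Hypothetical stand-in for an LLM call: joins its inputs and counts invocations.
let llmCalls = 0;
function fakeLLM(parts: string[]): string {
  llmCalls += 1;
  return `[${parts.join("+")}]`;
}

// "refine": answer from the first chunk, then revise once per remaining chunk.
function refine(chunks: string[]): string {
  let answer = fakeLLM([chunks[0] ?? ""]);
  for (const chunk of chunks.slice(1)) {
    answer = fakeLLM([answer, chunk]);
  }
  return answer;
}

// "tree_summarize": summarize groups of chunks, then summarize the summaries.
function treeSummarize(chunks: string[], fanOut = 2): string {
  if (fanOut < 2) throw new RangeError("fanOut must be at least 2");
  if (chunks.length <= fanOut) return fakeLLM(chunks);
  const summaries: string[] = [];
  for (let i = 0; i < chunks.length; i += fanOut) {
    summaries.push(fakeLLM(chunks.slice(i, i + fanOut)));
  }
  return treeSummarize(summaries, fanOut);
}

const chunks = ["A", "B", "C", "D"];

llmCalls = 0;
const refined = refine(chunks);
const refineCalls = llmCalls;

llmCalls = 0;
const tree = treeSummarize(chunks);
const treeCalls = llmCalls;

console.log(refined, refineCalls); // [[[[A]+B]+C]+D] 4
console.log(tree, treeCalls); // [[A+B]+[C+D]] 3
```

In this model the refine calls form a chain where each step depends on the previous answer, whereas the calls within one level of the tree are independent of each other.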
Chat Engine
Default Chat Mode
import { SummaryIndex } from "llamaindex";
const index = await SummaryIndex.fromDocuments(documents);
const chatEngine = index.asChatEngine();
const response = await chatEngine.chat({
message: "What is this document about?",
});
console.log(response.message.content);
// Follow-up with conversation memory
const followUp = await chatEngine.chat({
message: "Tell me more about that",
});
With LLM Retrieval Mode
import { SummaryRetrieverMode } from "llamaindex";
const chatEngine = index.asChatEngine({
mode: SummaryRetrieverMode.LLM,
});
const response = await chatEngine.chat({
message: "Explain the key concepts",
});
Examples
Summarization Example
import { Document, Settings, SummaryIndex, SummaryRetrieverMode } from "llamaindex";
import { openai } from "@llamaindex/openai";
Settings.llm = openai({ model: "gpt-4o" });
async function summarizeDocument() {
const essay = `
Long essay text about various topics...
Multiple paragraphs of content...
`;
const document = new Document({ text: essay });
const index = await SummaryIndex.fromDocuments([document]);
// Use LLM mode for selective retrieval
const chatEngine = index.asChatEngine({
mode: SummaryRetrieverMode.LLM,
});
const response = await chatEngine.chat({
message: "Provide a comprehensive summary of the main points",
});
console.log(response.message.content);
}
summarizeDocument().catch(console.error);
Shared Storage with VectorStoreIndex
SummaryIndex and VectorStoreIndex can share the same storage context:
import {
SummaryIndex,
VectorStoreIndex,
storageContextFromDefaults,
} from "llamaindex";
const storageContext = await storageContextFromDefaults({
persistDir: "./storage",
});
// Create both indices with same storage
const vectorIndex = await VectorStoreIndex.fromDocuments(documents, {
storageContext,
});
const summaryIndex = await SummaryIndex.fromDocuments(documents, {
storageContext,
});
// Use vector index for specific queries
const specificAnswer = await vectorIndex.asQueryEngine().query({
query: "What is the capital of France?",
});
// Use summary index for summarization
const summary = await summaryIndex.asQueryEngine().query({
query: "Summarize all the content",
});
Inserting and Deleting Nodes
import { Document, Settings, SummaryIndex } from "llamaindex";
const index = await SummaryIndex.fromDocuments([]);
// Insert new document
const doc1 = new Document({
text: "First document",
id_: "doc-1",
});
await index.insert(doc1);
// Insert multiple nodes
const nodes = await Settings.nodeParser.getNodesFromDocuments([
new Document({ text: "Document 2" }),
new Document({ text: "Document 3" }),
]);
await index.insertNodes(nodes);
// Delete document by reference ID
await index.deleteRefDoc("doc-1");
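As a mental model for these operations, the following sketch treats the index as an ordered list of nodes that each remember their source document. `SketchListIndex` and its fields are hypothetical and greatly simplified; the sketch assumes that deleting by reference ID removes every node derived from that document.

```typescript
// Hypothetical model of a list-style index; not the LlamaIndexTS implementation.
type SketchNode = { id: string; text: string; refDocId: string };

class SketchListIndex {
  private nodes: SketchNode[] = [];

  // Append nodes at the end so insertion order is preserved.
  insertNodes(newNodes: SketchNode[]): void {
    this.nodes.push(...newNodes);
  }

  // Remove every node that was derived from the given source document.
  deleteRefDoc(refDocId: string): void {
    this.nodes = this.nodes.filter((node) => node.refDocId !== refDocId);
  }

  // List the remaining node IDs in index order.
  ids(): string[] {
    return this.nodes.map((node) => node.id);
  }
}

const sketchIndex = new SketchListIndex();
sketchIndex.insertNodes([
  { id: "doc-1#0", text: "First document, part 1", refDocId: "doc-1" },
  { id: "doc-1#1", text: "First document, part 2", refDocId: "doc-1" },
  { id: "doc-2#0", text: "Document 2", refDocId: "doc-2" },
]);
sketchIndex.deleteRefDoc("doc-1");
console.log(sketchIndex.ids().join(", ")); // doc-2#0
```

Setting an explicit `id_` on a document, as in the snippet above, is what makes a later delete by reference ID practical.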
Custom LLM Retriever Configuration
import {
SummaryIndex,
SummaryIndexLLMRetriever,
defaultChoiceSelectPrompt,
} from "llamaindex";
const index = await SummaryIndex.fromDocuments(documents);
// Create custom LLM retriever
const retriever = new SummaryIndexLLMRetriever(
index,
defaultChoiceSelectPrompt, // Choice-select prompt; replace with your own to customize
10, // Choice batch size
);
const nodes = await retriever.retrieve({
query: "Find information about AI"
});
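The choice batch size trades prompt size against the number of retrieval calls. The sketch below estimates that call count under the assumption that the retriever issues one LLM call per batch; `estimateSelectionCalls` is a hypothetical helper, not a library function.

```typescript
// Assumption for this sketch: the retriever makes one LLM call per batch of nodes.
function estimateSelectionCalls(nodeCount: number, choiceBatchSize: number): number {
  if (!Number.isInteger(choiceBatchSize) || choiceBatchSize < 1) {
    throw new RangeError("choiceBatchSize must be a positive integer");
  }
  return Math.ceil(Math.max(0, nodeCount) / choiceBatchSize);
}

console.log(estimateSelectionCalls(45, 10)); // 5
console.log(estimateSelectionCalls(45, 20)); // 3
```

A larger batch means fewer calls but a longer prompt per call, so the batch size still has to fit the model's context window.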
Complete Working Example
import {
Document,
SummaryIndex,
SummaryRetrieverMode,
Settings,
} from "llamaindex";
import { openai } from "@llamaindex/openai";
Settings.llm = openai({
apiKey: process.env.OPENAI_API_KEY,
model: "gpt-4o",
});
async function main() {
// Create sample documents
const documents = [
new Document({
text: "LlamaIndex is a data framework for LLM applications. It provides tools for ingestion, indexing, and querying.",
metadata: { chapter: 1 },
}),
new Document({
text: "Vector stores enable efficient similarity search. They store embeddings and support fast retrieval.",
metadata: { chapter: 2 },
}),
new Document({
text: "RAG combines retrieval with generation. It retrieves relevant context then generates responses.",
metadata: { chapter: 3 },
}),
];
// Build summary index
console.log("Building SummaryIndex...");
const index = await SummaryIndex.fromDocuments(documents);
// Test default retrieval (returns all nodes)
console.log("\n=== Default Retrieval ===");
const defaultRetriever = index.asRetriever({
mode: SummaryRetrieverMode.DEFAULT,
});
const allNodes = await defaultRetriever.retrieve({ query: "test" });
console.log(`Retrieved ${allNodes.length} nodes`);
// Test LLM retrieval (selective)
console.log("\n=== LLM Retrieval ===");
const llmRetriever = index.asRetriever({
mode: SummaryRetrieverMode.LLM,
});
const selectedNodes = await llmRetriever.retrieve({
query: "What is RAG?"
});
console.log(`Retrieved ${selectedNodes.length} relevant nodes`);
selectedNodes.forEach((node, idx) => {
console.log(`${idx + 1}. Score: ${node.score} - ${node.node.getText().substring(0, 50)}...`);
});
// Create summary using chat engine
console.log("\n=== Summary Generation ===");
const chatEngine = index.asChatEngine({
mode: SummaryRetrieverMode.LLM,
});
const summary = await chatEngine.chat({
message: "Provide a brief summary of all the topics covered",
});
console.log(summary.message.content);
// Query engine for Q&A
console.log("\n=== Query Engine ===");
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
query: "How does RAG work?",
});
console.log(response.toString());
}
main().catch(console.error);
Default Mode:
- ✅ Fast - no LLM calls for retrieval
- ❌ Sends all nodes to response synthesis (expensive for large datasets)
- Best for: Small document sets (< 20 nodes)
LLM Mode:
- ✅ Sends only the selected nodes to response synthesis, which reduces synthesis cost on larger datasets
- ✅ Better relevance through LLM selection
- ❌ Requires additional LLM calls during retrieval
- Best for: Medium document sets where selective retrieval helps
When to Switch to VectorStoreIndex:
- You have more than 100 documents
- You need semantic similarity search
- You want faster retrieval at scale
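The rules of thumb above can be condensed into a small helper. This is only a sketch: `suggestStrategy` and the `Strategy` labels are hypothetical, and for simplicity it applies both thresholds (20 and 100) to a single node count, which is an assumption rather than something the guidance states.

```typescript
// Hypothetical helper that encodes the rule-of-thumb thresholds listed above.
type Strategy = "summary-default" | "summary-llm" | "vector-store";

function suggestStrategy(nodeCount: number, needsSemanticSearch = false): Strategy {
  // Semantic search or a large collection points to VectorStoreIndex.
  if (needsSemanticSearch || nodeCount > 100) return "vector-store";
  // Small collections can afford to send every node to synthesis.
  if (nodeCount < 20) return "summary-default";
  // In between, LLM selection trims what reaches synthesis.
  return "summary-llm";
}

console.log(suggestStrategy(5)); // summary-default
console.log(suggestStrategy(60)); // summary-llm
console.log(suggestStrategy(500)); // vector-store
console.log(suggestStrategy(5, true)); // vector-store
```

Treat the thresholds as starting points to tune against your own latency and cost measurements.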