What are Query Engines?

Query engines process questions over your data and return synthesized answers. Unlike chat engines, query engines:
  • Handle single-turn questions (no chat history)
  • Support complex query patterns (routing, sub-questions)
  • Provide structured responses with source citations
  • Enable advanced retrieval strategies

Core Query Engine Types

RetrieverQueryEngine

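The fundamental query engine: it retrieves relevant nodes with a retriever, then synthesizes an answer from them:
import { RetrieverQueryEngine } from "llamaindex";

// Assumes `index` is an existing index, such as the VectorStoreIndex built in the next section
const retriever = index.asRetriever({ similarityTopK: 5 });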

const queryEngine = new RetrieverQueryEngine(retriever);

const response = await queryEngine.query({
  query: "What is the main topic?"
});

console.log(response.toString());
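
Conceptually, each query goes through two steps: retrieve the most relevant chunks, then pass them to a synthesizer together with the question. The dependency-free sketch below illustrates that flow. `toyRetrieve`, `toySynthesize`, `stubLlm`, and the word-overlap score are hypothetical stand-ins chosen for illustration, not LlamaIndexTS APIs.

```typescript
// Hypothetical stand-ins for illustration only; none of these are LlamaIndexTS APIs.
interface ToyScoredChunk {
  text: string;
  score: number;
}

// Lowercased words of a string, used by the toy relevance score.
function toWords(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
}

// Step 1 (retrieve): score each chunk by the fraction of query words it contains,
// then keep the topK highest-scoring chunks. A real retriever compares embeddings.
function toyRetrieve(query: string, chunks: string[], topK: number): ToyScoredChunk[] {
  const queryWords = toWords(query);
  return chunks
    .map((text) => {
      const chunkWords = new Set(toWords(text));
      const hits = queryWords.filter((w) => chunkWords.has(w)).length;
      return { text, score: queryWords.length === 0 ? 0 : hits / queryWords.length };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Step 2 (synthesize): put the retrieved text and the question into one prompt
// and keep the chunks as sources alongside the answer.
function toySynthesize(
  query: string,
  retrieved: ToyScoredChunk[],
  llm: (prompt: string) => string,
): { answer: string; sources: string[] } {
  const context = retrieved.map((r) => r.text).join("\n");
  const answer = llm(`Context:\n${context}\n\nQuestion: ${query}`);
  return { answer, sources: retrieved.map((r) => r.text) };
}

// Stub model: reports how many context lines it was given instead of calling an API.
const stubLlm = (prompt: string): string =>
  `stub answer based on ${prompt.split("\n").length - 3} context line(s)`;

const corpus = [
  "The essay is about starting a company.",
  "Bread needs flour, water, and salt.",
  "The main topic of the essay is programming.",
];
const question = "What is the main topic?";
const toyResult = toySynthesize(question, toyRetrieve(question, corpus, 2), stubLlm);
console.log(toyResult.answer);
console.log(toyResult.sources);
```

The unrelated chunk about bread never reaches the prompt, which is the point of the retrieval step: the synthesizer only sees what the retriever ranked highest.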

Simple Query Engine from Index

The easiest way to create a query engine:
import { VectorStoreIndex, Document } from "llamaindex";

const document = new Document({ text: "Your document text" });
const index = await VectorStoreIndex.fromDocuments([document]);

// Creates a RetrieverQueryEngine internally
const queryEngine = index.asQueryEngine();

const response = await queryEngine.query({
  query: "What does this document discuss?"
});

SubQuestionQueryEngine

Breaks a complex question into sub-questions, answers each with a matching tool, and combines the results:
import {
  Document,
  QueryEngineTool,
  SubQuestionQueryEngine,
  VectorStoreIndex
} from "llamaindex";

const document = new Document({ text: "Your document text" });
const index = await VectorStoreIndex.fromDocuments([document]);

const queryEngineTools = [
  new QueryEngineTool({
    queryEngine: index.asQueryEngine(),
    metadata: {
      name: "documents",
      description: "Information about the documents"
    }
  })
];

const queryEngine = SubQuestionQueryEngine.fromDefaults({
  queryEngineTools
});

const response = await queryEngine.query({
  query: "How did the topic evolve over time?"
});

console.log(response.toString());

RouterQueryEngine

Routes each query to the engine whose description best matches it:
import {
  RouterQueryEngine,
  VectorStoreIndex,
  SummaryIndex,
  SentenceSplitter,
  Settings
} from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { SimpleDirectoryReader } from "@llamaindex/readers/directory";

// Configure settings
Settings.llm = new OpenAI();
Settings.nodeParser = new SentenceSplitter({ chunkSize: 1024 });

// Load documents
const documents = await new SimpleDirectoryReader().loadData({
  directoryPath: "./data"
});

// Create different indices
const vectorIndex = await VectorStoreIndex.fromDocuments(documents);
const summaryIndex = await SummaryIndex.fromDocuments(documents);

// Create router
const queryEngine = RouterQueryEngine.fromDefaults({
  queryEngineTools: [
    {
      queryEngine: vectorIndex.asQueryEngine(),
      description: "Useful for retrieving specific context"
    },
    {
      queryEngine: summaryIndex.asQueryEngine(),
      description: "Useful for summarization questions"
    }
  ]
});

const response = await queryEngine.query({
  query: "Give me a summary of the documents"
});

console.log(response.response);
console.log("Selected:", response.metadata.selectorResult);

Complete Working Example

Here’s a complete, runnable example: an interactive command-line loop that indexes one essay and answers questions about it:
import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";
import { createInterface } from "node:readline/promises";

async function main() {
  const rl = createInterface({ 
    input: process.stdin, 
    output: process.stdout 
  });

  // Check API key
  if (!process.env.OPENAI_API_KEY) {
    console.log("OpenAI API key not found in environment variables.");
    process.env.OPENAI_API_KEY = await rl.question(
      "Please enter your OpenAI API key: "
    );
  }

  // Load document
  const essay = await fs.readFile("./data/essay.txt", "utf-8");
  const document = new Document({ text: essay, id_: "essay" });

  // Create index and query engine
  const index = await VectorStoreIndex.fromDocuments([document]);
  const queryEngine = index.asQueryEngine();

  console.log("\nReady to answer questions!\n");

  // Query loop: an empty line exits
  while (true) {
    const query = (await rl.question("Query: ")).trim();
    if (!query) break;
    const response = await queryEngine.query({ query });
    console.log(response.toString());
    console.log();
  }
  rl.close();
}

main().catch(console.error);

Response Synthesis

Query engines use response synthesizers to combine retrieved chunks into a single answer.

Synthesis Modes

import { getResponseSynthesizer } from "llamaindex";

// Compact: packs as many chunks as fit into each prompt, which reduces the number of LLM calls
const compactSynthesizer = getResponseSynthesizer("compact");

// Refine: processes chunks one at a time, refining the previous answer with each new chunk
const refineSynthesizer = getResponseSynthesizer("refine");

// Tree Summarize: summarizes groups of chunks, then recursively summarizes those summaries
const treeSynthesizer = getResponseSynthesizer("tree_summarize");
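
The modes differ mainly in how many LLM calls they make and what each call sees. The sketch below mimics the three strategies with a stub model that counts its calls. `packChunks`, the character budget, and the stub are hypothetical simplifications and not the library's implementation, which budgets by tokens and uses prompt templates.

```typescript
// Hypothetical simplification of the three strategies; not the library's implementation.
type ToyLlm = (prompt: string) => string;

// Greedily pack chunks into groups whose joined length stays within a character budget.
function packChunks(chunks: string[], budget: number): string[] {
  const packed: string[] = [];
  let current = "";
  for (const chunk of chunks) {
    const candidate = current === "" ? chunk : `${current}\n${chunk}`;
    if (candidate.length <= budget || current === "") {
      current = candidate; // fits, or a single oversized chunk kept on its own
    } else {
      packed.push(current);
      current = chunk;
    }
  }
  if (current !== "") packed.push(current);
  return packed;
}

// Refine: one call per chunk; each call sees the previous answer plus one new chunk.
function refine(query: string, chunks: string[], llm: ToyLlm): string {
  let answer = "";
  for (const chunk of chunks) {
    answer = llm(`Q: ${query}\nExisting answer: ${answer}\nNew context: ${chunk}`);
  }
  return answer;
}

// Compact: pack chunks first, then refine over the fewer, larger groups.
function compact(query: string, chunks: string[], llm: ToyLlm, budget: number): string {
  return refine(query, packChunks(chunks, budget), llm);
}

// Tree summarize: answer per packed group, then recurse on the partial answers.
function treeSummarize(query: string, chunks: string[], llm: ToyLlm, budget: number): string {
  const packed = packChunks(chunks, budget);
  if (packed.length <= 1) return llm(`Q: ${query}\nContext: ${packed[0] ?? ""}`);
  const partials = packed.map((group) => llm(`Q: ${query}\nContext: ${group}`));
  // Guard: if packing did not shrink the input, finish with one call over all partials.
  if (partials.length >= chunks.length) return llm(`Q: ${query}\nContext: ${partials.join("\n")}`);
  return treeSummarize(query, partials, llm, budget);
}

// Stub model that returns a short answer and counts how often it is called.
function countingLlm(): { llm: ToyLlm; calls: () => number } {
  let calls = 0;
  return {
    llm: () => {
      calls++;
      return `a${calls}`;
    },
    calls: () => calls,
  };
}

const toyChunks = ["aaaa", "bbbb", "cccc", "dddd"]; // four 4-character chunks
const refineRun = countingLlm();
refine("q", toyChunks, refineRun.llm);
const compactRun = countingLlm();
compact("q", toyChunks, compactRun.llm, 9); // a budget of 9 fits two chunks plus a newline
const treeRun = countingLlm();
treeSummarize("q", toyChunks, treeRun.llm, 9);
console.log(refineRun.calls(), compactRun.calls(), treeRun.calls()); // 4 2 3
```

With four chunks and room for two per prompt, refining costs four calls, compacting costs two, and tree summarization costs three (two partial answers plus one combining call).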

Using Custom Synthesizers

import { RetrieverQueryEngine, getResponseSynthesizer } from "llamaindex";

// Assumes `index` is an existing index
const retriever = index.asRetriever();
const synthesizer = getResponseSynthesizer("tree_summarize");

// Arguments are positional, matching the RetrieverQueryEngine example above
const queryEngine = new RetrieverQueryEngine(retriever, synthesizer);

Streaming Synthesis

Stream responses as they’re generated:
import { getResponseSynthesizer, TextNode, type NodeWithScore } from "llamaindex";

const synthesizer = getResponseSynthesizer("compact");

const nodesWithScore: NodeWithScore[] = [
  {
    node: new TextNode({ text: "Relevant chunk 1" }),
    score: 1.0
  },
  {
    node: new TextNode({ text: "Relevant chunk 2" }),
    score: 0.8
  }
];

const stream = await synthesizer.synthesize(
  {
    query: "What is the answer?",
    nodes: nodesWithScore
  },
  true  // Enable streaming
);

for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}
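
Because the streaming call yields chunks through an async iterable, the consumer decides what happens to each piece as it arrives. The sketch below uses a hypothetical `fakeStream` generator in place of the synthesizer to show one way to forward chunks immediately while also accumulating the full text. The `{ response: string }` chunk shape mirrors the `chunk.response` access above; `collectStream` is an illustrative helper, not a library function.

```typescript
// Hypothetical stand-in for a streaming synthesizer: yields one word at a time.
async function* fakeStream(text: string): AsyncGenerator<{ response: string }> {
  for (const word of text.split(" ")) {
    yield { response: word + " " };
  }
}

// Forward each chunk to a sink as it arrives and resolve with the full text at the end.
async function collectStream(
  stream: AsyncIterable<{ response: string }>,
  onChunk: (text: string) => void,
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    onChunk(chunk.response);
    full += chunk.response;
  }
  return full.trimEnd();
}

const received: string[] = [];
const fullTextPromise = collectStream(fakeStream("streamed one word at a time"), (text) => {
  received.push(text); // a UI would render the partial text here
});
fullTextPromise.then((fullText) => console.log(fullText));
```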

Advanced Query Patterns

Sub-Question Decomposition

Break complex queries into simpler sub-questions:
import {
  SubQuestionQueryEngine,
  QueryEngineTool,
  VectorStoreIndex,
  Document
} from "llamaindex";

const doc1 = new Document({ 
  text: "Content about topic A",
  id_: "doc_a" 
});
const doc2 = new Document({ 
  text: "Content about topic B",
  id_: "doc_b" 
});

const indexA = await VectorStoreIndex.fromDocuments([doc1]);
const indexB = await VectorStoreIndex.fromDocuments([doc2]);

const queryEngineTools = [
  new QueryEngineTool({
    queryEngine: indexA.asQueryEngine(),
    metadata: {
      name: "topic_a",
      description: "Information about topic A"
    }
  }),
  new QueryEngineTool({
    queryEngine: indexB.asQueryEngine(),
    metadata: {
      name: "topic_b",
      description: "Information about topic B"
    }
  })
];

const queryEngine = SubQuestionQueryEngine.fromDefaults({
  queryEngineTools
});

// Complex query that requires both sources
const response = await queryEngine.query({
  query: "Compare and contrast topic A and topic B"
});

console.log(response.toString());
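
The pattern can be pictured as three steps: plan one sub-question per relevant tool, answer each sub-question with that tool, then combine the sub-answers. A minimal dependency-free sketch of that control flow follows. `ToyTopicTool`, `planSubQuestions`, and the stub answers are hypothetical; in the real engine the planning and the final synthesis are LLM calls rather than string matching and joining.

```typescript
// Hypothetical sketch of the control flow; not the library's implementation.
interface ToyTopicTool {
  name: string;
  topic: string;
  answer: (question: string) => Promise<string>;
}

interface ToySubQuestion {
  toolName: string;
  subQuestion: string;
}

// Planning step: one sub-question for every tool whose topic the query mentions.
function planSubQuestions(query: string, tools: ToyTopicTool[]): ToySubQuestion[] {
  const lowered = query.toLowerCase();
  return tools
    .filter((tool) => lowered.includes(tool.topic.toLowerCase()))
    .map((tool) => ({
      toolName: tool.name,
      subQuestion: `What are the key points about ${tool.topic}?`,
    }));
}

// Answer each sub-question with its tool, then combine the sub-answers.
async function answerWithSubQuestions(
  query: string,
  tools: ToyTopicTool[],
): Promise<{ answer: string; plan: ToySubQuestion[] }> {
  const plan = planSubQuestions(query, tools);
  // The sub-questions are independent, so this sketch runs them concurrently.
  const subAnswers = await Promise.all(
    plan.map(async (item) => {
      const tool = tools.find((t) => t.name === item.toolName);
      if (!tool) throw new Error(`Unknown tool: ${item.toolName}`);
      return `${item.subQuestion} ${await tool.answer(item.subQuestion)}`;
    }),
  );
  // Final synthesis step: joining stands in for a last LLM call over the sub-answers.
  return { answer: subAnswers.join("\n"), plan };
}

const topicTools: ToyTopicTool[] = [
  { name: "topic_a", topic: "topic A", answer: async () => "A is about apples." },
  { name: "topic_b", topic: "topic B", answer: async () => "B is about bananas." },
  { name: "topic_c", topic: "topic C", answer: async () => "C is about cherries." },
];

const subQuestionRun = answerWithSubQuestions(
  "Compare and contrast topic A and topic B",
  topicTools,
);
subQuestionRun.then((result) => console.log(result.answer));
```

Only the tools for topics A and B are consulted here. Each extra sub-question costs an additional engine call, which is the trade-off behind the extra LLM calls this pattern incurs.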

Router-Based Selection

Automatically choose the best query engine:
import { RouterQueryEngine, VectorStoreIndex, SummaryIndex } from "llamaindex";

// Assumes `documents` was loaded as in the RouterQueryEngine example above
const vectorIndex = await VectorStoreIndex.fromDocuments(documents);
const summaryIndex = await SummaryIndex.fromDocuments(documents);

const router = RouterQueryEngine.fromDefaults({
  queryEngineTools: [
    {
      queryEngine: vectorIndex.asQueryEngine(),
      description: "Best for specific fact retrieval and targeted questions"
    },
    {
      queryEngine: summaryIndex.asQueryEngine(),
      description: "Best for summarization and overview questions"
    }
  ]
});

// The router's selector picks an engine for each query based on the tool descriptions
const factResponse = await router.query({
  query: "What was the author's first job?"
});
// Expected to route to the vectorIndex engine

const summaryResponse = await router.query({
  query: "Summarize the author's career"
});
// Expected to route to the summaryIndex engine

console.log("Selected engine:", summaryResponse.metadata.selectorResult);
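
Routing amounts to comparing the query against each tool's description and delegating to the winner, which is why the descriptions matter so much. The sketch below substitutes a hypothetical word-overlap score (`pickTool`) for the library's selector purely to make the mechanism visible. A real selector reasons about meaning rather than shared word prefixes, and `ToyRoutedTool`, `stems`, and `route` are illustrative names only.

```typescript
// Hypothetical illustration: a crude word-overlap score stands in for the library's selector.
interface ToyRoutedTool {
  description: string;
  run: (query: string) => string;
}

// Words longer than three letters, cut to a five-letter prefix as a crude stem,
// so that "summarize" and "summarization" both become "summa".
function stems(text: string): string[] {
  return text
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 3)
    .map((w) => w.slice(0, 5));
}

// Index of the tool whose description shares the most stems with the query.
// Ties, including the case where nothing matches, go to the earliest tool.
function pickTool(query: string, tools: ToyRoutedTool[]): number {
  const queryStems = new Set(stems(query));
  let best = 0;
  let bestScore = -1;
  tools.forEach((tool, i) => {
    const score = stems(tool.description).filter((s) => queryStems.has(s)).length;
    if (score > bestScore) {
      best = i;
      bestScore = score;
    }
  });
  return best;
}

// Delegate to the selected tool and report which one was chosen.
function route(query: string, tools: ToyRoutedTool[]): { response: string; selected: number } {
  if (tools.length === 0) throw new Error("At least one tool is required");
  const selected = pickTool(query, tools);
  return { response: tools[selected].run(query), selected };
}

const toyTools: ToyRoutedTool[] = [
  {
    description: "Best for specific fact retrieval and targeted questions",
    run: (q) => `[vector] ${q}`,
  },
  {
    description: "Best for summarization and overview questions",
    run: (q) => `[summary] ${q}`,
  },
];

console.log(route("Summarize the author's career", toyTools).selected); // 1
console.log(route("What was the author's first job?", toyTools).selected); // 0
```

The second query matches neither description in this toy scorer and falls through to the first tool by default. Vague or overlapping descriptions cause the same kind of ambiguity for a real selector.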

Query Engine Configuration

Retrieval Parameters

const queryEngine = index.asQueryEngine({
  similarityTopK: 10,  // Retrieve top 10 chunks
  // Additional retriever options
});
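
`similarityTopK` caps how many of the highest-scoring chunks reach the synthesizer. As a rough illustration of what top-K selection means, the sketch below ranks hand-written 2-D vectors by cosine similarity. `cosineSimilarity`, `topK`, and the vectors are hypothetical; a real index compares embedding vectors produced by the configured embedding model.

```typescript
// Hypothetical illustration; a real index scores embedding vectors from the configured model.
interface ToyVectorItem {
  id: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item against the query, sort by descending score, keep the first k.
function topK(
  query: number[],
  items: ToyVectorItem[],
  k: number,
): { id: string; score: number }[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const toyItems: ToyVectorItem[] = [
  { id: "orthogonal", vector: [0, 1] },
  { id: "exact", vector: [1, 0] },
  { id: "opposite", vector: [-1, 0] },
  { id: "close", vector: [1, 1] },
];

console.log(topK([1, 0], toyItems, 2).map((hit) => hit.id)); // [ 'exact', 'close' ]
```

A larger K gives the synthesizer more context at the cost of longer prompts and a higher chance of including weakly related chunks.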

Custom Retrievers

import { RetrieverQueryEngine } from "llamaindex";

const retriever = index.asRetriever({
  similarityTopK: 5,
  // Add filters, custom params, etc.
});

const queryEngine = new RetrieverQueryEngine(retriever);

Post-Processing

Add node post-processors to filter retrieved nodes before they reach the synthesizer:
import { 
  RetrieverQueryEngine,
  SimilarityPostprocessor 
} from "llamaindex";

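// Assumes `index` is an existing index
const retriever = index.asRetriever();
const postprocessor = new SimilarityPostprocessor({
  similarityCutoff: 0.7  // Filter out chunks below threshold
});

// Arguments are positional: retriever, response synthesizer (undefined keeps the default),
// then node post-processors. This order is an assumption about recent releases;
// check the constructor signature of your installed version.
const queryEngine = new RetrieverQueryEngine(retriever, undefined, [postprocessor]);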
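
A similarity cutoff is, in essence, a filter over the retrieved (node, score) pairs. The sketch below shows the effect of a 0.7 threshold on hand-written scores. `applyCutoff` and the sample data are hypothetical, and the choice to keep scores exactly at the threshold and drop entries without a score is this sketch's own convention, not a statement about the library's behavior.

```typescript
// Hypothetical illustration of a score threshold; not the library's implementation.
interface ToyScoredText {
  text: string;
  score?: number;
}

// Keep entries whose score is present and at least the cutoff (this sketch's convention).
function applyCutoff(nodes: ToyScoredText[], cutoff: number): ToyScoredText[] {
  return nodes.filter((node) => node.score !== undefined && node.score >= cutoff);
}

const retrievedNodes: ToyScoredText[] = [
  { text: "Strong match", score: 0.9 },
  { text: "Borderline match", score: 0.7 },
  { text: "Weak match", score: 0.65 },
  { text: "Unscored chunk" },
];

console.log(applyCutoff(retrievedNodes, 0.7).map((node) => node.text));
// [ 'Strong match', 'Borderline match' ]
```

If the cutoff is set too high, every node can be filtered out and the synthesizer is left with no context, so it is worth inspecting typical scores for your data before choosing a threshold.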

Choosing the Right Query Engine

Engine                 | Use Case              | Pros                 | Cons
---------------------- | --------------------- | -------------------- | --------------------------
RetrieverQueryEngine   | General Q&A           | Simple, fast         | Single retrieval strategy
SubQuestionQueryEngine | Complex questions     | Breaks down problems | More LLM calls
RouterQueryEngine      | Multiple data sources | Intelligent routing  | Requires good descriptions
index.asQueryEngine()  | Quick prototyping     | Easiest setup        | Less control

Low-Level Query Pipeline

For maximum control, build the query pipeline manually:
import {
  Document,
  SentenceSplitter,
  TextNode,
  getResponseSynthesizer,
  type NodeWithScore
} from "llamaindex";

// 1. Parse documents
const nodeParser = new SentenceSplitter({ chunkSize: 512 });
const nodes = nodeParser.getNodesFromDocuments([
  new Document({ text: "Your document text" })
]);

// 2. Simulate retrieval (in practice, use embeddings + vector search)
const nodesWithScore: NodeWithScore[] = [
  {
    node: new TextNode({ text: "Relevant chunk 1" }),
    score: 0.9
  },
  {
    node: new TextNode({ text: "Relevant chunk 2" }),
    score: 0.75
  }
];

// 3. Synthesize response
const synthesizer = getResponseSynthesizer("compact");

const response = await synthesizer.synthesize({
  query: "What is the answer?",
  nodes: nodesWithScore
});

console.log(response.toString());
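
Step 1 above splits documents into chunks of bounded size before anything is retrieved. To make that step concrete, the sketch below packs whole sentences into chunks of at most `maxChars` characters. `toyChunk` is a hypothetical helper written for illustration; the library's splitter measures size in tokens rather than characters and has more options than this sketch covers.

```typescript
// Hypothetical sketch: packs whole sentences into chunks of at most maxChars characters.
function toyChunk(text: string, maxChars: number): string[] {
  // A sentence is a run of non-terminator characters plus its closing punctuation.
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current === "" ? sentence : `${current} ${sentence}`;
    if (candidate.length <= maxChars || current === "") {
      current = candidate; // still fits, or a single oversized sentence kept whole
    } else {
      chunks.push(current);
      current = sentence;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

const toyText = "One. Two two. Three three three. Four.";
console.log(toyChunk(toyText, 15));
// [ 'One. Two two.', 'Three three three.', 'Four.' ]
```

Smaller chunks make retrieval more precise but give each chunk less surrounding context, so the chunk size interacts with the number of chunks retrieved per query.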
