
Quickstart

Get up and running with LlamaIndex.TS in just a few minutes. This guide will walk you through creating your first Retrieval-Augmented Generation (RAG) application.

What You’ll Build

You’ll create a simple application that:
  1. Loads a text document
  2. Creates a searchable index from the document
  3. Answers questions using the indexed data

1. Install LlamaIndex.TS

First, install the main package and an LLM provider. We’ll use OpenAI for this example.
npm install llamaindex @llamaindex/openai
The core llamaindex package provides the framework, while @llamaindex/openai adds OpenAI LLM and embedding support.

2. Set Up Your API Key

You’ll need an OpenAI API key. Get one at platform.openai.com/api-keys. Set your API key as an environment variable:
export OPENAI_API_KEY="sk-..."
Or create a .env file:
.env
OPENAI_API_KEY=sk-...
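Whichever approach you use, it helps to fail fast when the key is missing. A minimal, dependency-free check you could drop at the top of your script (the `hasApiKey` helper is ours for illustration, not a LlamaIndex API):

```typescript
// check-key.ts — a tiny illustrative guard, not part of LlamaIndex.TS
export function hasApiKey(
  env: Record<string, string | undefined> = process.env
): boolean {
  return typeof env.OPENAI_API_KEY === "string" && env.OPENAI_API_KEY.length > 0;
}

if (!hasApiKey()) {
  console.warn("OPENAI_API_KEY is not set; the examples below will fail without it.");
}
```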

3. Create Your First RAG Application

Create a file called app.ts with the following code:
app.ts
import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";

async function main() {
  // Load your document
  const essay = await fs.readFile("./data.txt", "utf-8");
  const document = new Document({ text: essay });

  // Create an index from your document
  const index = await VectorStoreIndex.fromDocuments([document]);

  // Create a query engine
  const queryEngine = index.asQueryEngine();

  // Query your data
  const response = await queryEngine.query({
    query: "What is the main topic of this document?",
  });

  console.log(response.toString());
}

main().catch(console.error);
Replace "./data.txt" with the path to any text file you want to query.
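If you don’t have a text file handy, you can create a small sample to get started (the contents are just placeholder prose):

```shell
# Create a small sample document to query (any plain-text file works)
cat > data.txt <<'EOF'
LlamaIndex.TS is a data framework for building LLM applications in TypeScript.
It helps you load documents, index them, and query them in natural language.
EOF
```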

4. Run Your Application

Execute your application:
npx tsx app.ts
Or register tsx with Node.js directly (Node 20.6+; the older --loader flag is deprecated):
node --import tsx app.ts
You should see a response generated from your document!

Interactive Chat Example

Let’s enhance the example to create an interactive chat interface:
chat.ts
import {
  ContextChatEngine,
  Document,
  Settings,
  VectorStoreIndex,
} from "llamaindex";
import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";
import fs from "node:fs/promises";

// Configure chunk size for better results
Settings.chunkSize = 512;

async function main() {
  // Load your document
  const essay = await fs.readFile("./data.txt", "utf-8");
  const document = new Document({ text: essay });

  // Create index and retriever
  const index = await VectorStoreIndex.fromDocuments([document]);
  const retriever = index.asRetriever({
    similarityTopK: 5,
  });

  // Create a chat engine
  const chatEngine = new ContextChatEngine({ retriever });
  const rl = createInterface({ input, output });

  console.log("Chat with your document! (Ctrl+C to exit)\n");

  while (true) {
    const query = await rl.question("You: ");
    const stream = await chatEngine.chat({ message: query, stream: true });

    process.stdout.write("Assistant: ");
    for await (const chunk of stream) {
      process.stdout.write(chunk.response);
    }
    console.log("\n");
  }
}

main().catch(console.error);
This creates a conversational interface that maintains context across multiple questions!

Building an Agent with Tools

For more advanced use cases, create an agent that can use tools:
agent.ts
import { openai } from "@llamaindex/openai";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { z } from "zod";

// Define a tool
const weatherTool = tool({
  name: "get_weather",
  description: "Get the current weather for a location",
  parameters: z.object({
    address: z.string().describe("The address"),
  }),
  execute: ({ address }) => `${address} is sunny and 72°F!`,
});

async function main() {
  // Create an agent with tools
  const weatherAgent = agent({
    llm: openai({
      model: "gpt-4o",
    }),
    tools: [weatherTool],
  });

  // Run the agent
  const result = await weatherAgent.run(
    "What's the weather like in San Francisco?"
  );

  console.log(result.data.message);
}

main().catch(console.error);
Agents require the @llamaindex/workflow package for orchestration:
npm install @llamaindex/workflow
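Stripped of the framework, a tool is just a named, described function with typed parameters that the agent can decide to call. A dependency-free sketch of the idea (the `Tool` type here is illustrative, not the library’s actual interface, which also carries a zod parameter schema so the LLM knows how to call the function):

```typescript
// Illustrative sketch only — see the real tool() helper from llamaindex above.
type Tool<P> = {
  name: string;
  description: string;
  execute: (params: P) => string;
};

const weatherTool: Tool<{ address: string }> = {
  name: "get_weather",
  description: "Get the current weather for a location",
  execute: ({ address }) => `${address} is sunny and 72°F!`,
};

// The agent loop picks a tool by name and calls execute with parsed arguments.
console.log(weatherTool.execute({ address: "San Francisco" }));
// → "San Francisco is sunny and 72°F!"
```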

What’s Happening?

Here’s what happens under the hood:
  1. Document Processing: Your text is split into chunks (nodes) for efficient retrieval.
  2. Embedding Generation: Each chunk is converted into a vector embedding using OpenAI’s embedding model.
  3. Index Creation: Embeddings are stored in a vector index for semantic search.
  4. Query & Retrieval: When you ask a question, relevant chunks are retrieved based on semantic similarity.
  5. Response Generation: The LLM generates a response using the retrieved context.
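To make the retrieval step concrete: at its core, semantic search ranks chunks by how close their embedding vectors are to the query’s embedding, typically via cosine similarity. A toy sketch with hand-made 3-dimensional “embeddings” (real embeddings come from the embedding model and have on the order of a thousand dimensions):

```typescript
// Toy illustration of semantic retrieval: rank chunks by cosine similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hand-made vectors standing in for embedded chunks.
const chunks = [
  { text: "cats and dogs", vector: [1, 0, 0] },
  { text: "machine learning", vector: [0, 1, 0.2] },
];
const queryVector = [0, 0.9, 0.1]; // pretend embedding of "what is AI?"

// Sort chunks so the most semantically similar one comes first.
const ranked = [...chunks].sort(
  (x, y) =>
    cosineSimilarity(queryVector, y.vector) -
    cosineSimilarity(queryVector, x.vector)
);
console.log(ranked[0].text); // the most similar chunk is retrieved first
```

This is what `similarityTopK: 5` in the chat example controls: how many of the top-ranked chunks are handed to the LLM as context.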

Next Steps

Now that you’ve built your first RAG application, explore more features:

Installation Guide

Learn about runtime-specific setup and provider packages

Core Concepts

Deep dive into Documents, Nodes, Indices, and more

Vector Stores

Use production vector databases like Pinecone, Qdrant, or Chroma

Agents

Build sophisticated agents with reasoning and tool usage

Try It Online

Want to experiment without setup? Try our examples on StackBlitz.