Find answers to common questions about LlamaIndex.TS.

General Questions

LlamaIndex.TS is a TypeScript/JavaScript data framework for building LLM applications. It helps you:
  • Load and process data from various sources
  • Create vector embeddings and indices
  • Build RAG (Retrieval Augmented Generation) applications
  • Integrate with multiple LLM providers
  • Create chat engines and query engines
  • Build agentic workflows
It’s the TypeScript port of LlamaIndex Python, designed to work across multiple JavaScript runtimes including Node.js, Deno, Bun, and edge environments.
Both frameworks share the same core concepts but have some differences:
Similarities:
  • Same core abstractions (indices, query engines, chat engines)
  • Similar API design and patterns
  • Support for the same LLM providers
  • RAG and agent capabilities
Differences:
  • Language: TypeScript vs Python
  • Package structure: Modular npm packages vs Python namespace packages
  • Runtime support: Multi-runtime JS vs Python only
  • Feature parity: Python has more features currently
See the Migration from Python guide for detailed comparisons.
No! LlamaIndex.TS is a standalone TypeScript/JavaScript library. While the concepts are similar to the Python version, you don’t need any Python knowledge to use it.
If you’re coming from Python, check out the Migration from Python guide.
LlamaIndex.TS supports multiple JavaScript runtimes:
Fully Supported:
  • Node.js >= 18.0.0 ✅
  • Deno ✅
  • Bun ✅
  • Nitro ✅
Supported with Limitations:
  • Vercel Edge Runtime ✅ (limited file system access)
  • Cloudflare Workers ✅ (limited file system access)
Not Supported:
  • Browser ❌ (due to lack of AsyncLocalStorage-like APIs)
The framework uses conditional exports to provide runtime-specific entry points.
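For context, conditional exports are declared in a package’s package.json. A simplified sketch of what such a manifest can look like (illustrative only, not the actual llamaindex manifest; `edge-light` and `workerd` are the community condition names for Vercel Edge and Cloudflare Workers):

```json
{
  "exports": {
    ".": {
      "edge-light": "./dist/index.edge.js",
      "workerd": "./dist/index.workerd.js",
      "import": "./dist/index.js",
      "require": "./dist/index.cjs"
    }
  }
}
```

At install time nothing changes; at resolve time the runtime picks the entry matching its condition, which is how one package can ship different code paths for Node.js, edge runtimes, and bundlers.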
Browser support is currently limited due to the lack of support for AsyncLocalStorage-like APIs in browsers. However, work is ongoing to improve browser compatibility.
For now, we recommend using LlamaIndex.TS on the server side (Node.js, edge runtimes) and calling it from your browser via API routes.

Installation & Setup

Install the core package:
npm install llamaindex
You’ll also need provider packages:
# For OpenAI
npm install @llamaindex/openai

# For vector stores
npm install @llamaindex/pinecone
npm install @llamaindex/qdrant
See the Getting Started guide for details.
Only if you want to use OpenAI models. LlamaIndex.TS supports many LLM providers:
  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude)
  • Google (Gemini)
  • Ollama (Local models)
  • Groq, Mistral, Together AI, and more
You can also use local models with Ollama:
import { Ollama } from "@llamaindex/ollama";
import { Settings } from "llamaindex";

Settings.llm = new Ollama({ model: "llama3" });
Required:
  • Node.js >= 18.0.0 (or another supported runtime)
  • npm, pnpm, or yarn
Recommended:
  • TypeScript for type safety
  • An LLM provider API key (OpenAI, Anthropic, etc.) or local Ollama setup
Yes! LlamaIndex.TS works with all modern JavaScript frameworks:
  • Next.js (App Router and Pages Router)
  • Remix
  • SvelteKit
  • Nuxt
  • Astro
  • Express
  • And more!
For Next.js, use the withLlamaIndex helper:
// next.config.js
const { withLlamaIndex } = require("llamaindex/next");

module.exports = withLlamaIndex({
  // Your Next.js config
});

Usage Questions

LlamaIndex.TS provides several ways to load data:
1. SimpleDirectoryReader (for files):
import { SimpleDirectoryReader } from "llamaindex";

const reader = new SimpleDirectoryReader();
const documents = await reader.loadData({ directoryPath: "./data" });
2. Specialized readers:
import { PDFReader, DocxReader } from "@llamaindex/readers";

const pdfReader = new PDFReader();
const docs = await pdfReader.loadData("document.pdf");
3. Manual document creation:
import { Document } from "llamaindex";

const doc = new Document({ text: "Your content here" });
Create an index and query it:
import { VectorStoreIndex } from "llamaindex";

// Create index
const index = await VectorStoreIndex.fromDocuments(documents);

// Query
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "What is LlamaIndex?",
});

console.log(response.toString());
Use a chat engine:
import { VectorStoreIndex } from "llamaindex";

const index = await VectorStoreIndex.fromDocuments(documents);
const chatEngine = index.asChatEngine();

// Chat with context
const response = await chatEngine.chat({
  message: "What does the document say about X?",
});

console.log(response.toString());
Chat engines maintain conversation history automatically.
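Under the hood, "maintaining history" amounts to replaying prior messages on each turn. A minimal sketch of that bookkeeping in plain TypeScript (illustrative only, not the actual ChatEngine internals):

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Minimal conversation buffer: roughly what a chat engine keeps between turns.
class ChatHistory {
  private messages: ChatMessage[] = [];

  addUser(content: string) {
    this.messages.push({ role: "user", content });
  }

  addAssistant(content: string) {
    this.messages.push({ role: "assistant", content });
  }

  // Everything that gets sent to the LLM alongside the next message.
  all(): ChatMessage[] {
    return [...this.messages];
  }
}

const history = new ChatHistory();
history.addUser("What does the document say about X?");
history.addAssistant("It covers X in section 2.");
console.log(history.all().length); // 2
```

Because the full buffer is resent each turn, long conversations grow your token usage; that is the trade-off the chat engine manages for you.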
Configure Settings.llm:
import { Settings } from "llamaindex";

// Anthropic
import { Anthropic } from "@llamaindex/anthropic";
Settings.llm = new Anthropic({ model: "claude-3-sonnet" });

// Google Gemini
import { Gemini } from "@llamaindex/google";
Settings.llm = new Gemini({ model: "gemini-pro" });

// Ollama (local)
import { Ollama } from "@llamaindex/ollama";
Settings.llm = new Ollama({ model: "llama3" });
Install the vector store provider and use it:
import { VectorStoreIndex } from "llamaindex";
import { PineconeVectorStore } from "@llamaindex/pinecone";

const vectorStore = new PineconeVectorStore({
  indexName: "my-index",
  apiKey: process.env.PINECONE_API_KEY,
});

const index = await VectorStoreIndex.fromDocuments(documents, {
  vectorStore,
});
Supported vector stores:
  • Pinecone, Qdrant, Chroma, Weaviate
  • MongoDB, PostgreSQL (pgvector)
  • Supabase, Milvus, Astra
  • And more!
Set stream: true in your query:
const stream = await queryEngine.query({
  query: "What is LlamaIndex?",
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}

Troubleshooting

This usually happens due to:
  1. Missing installation:
npm install llamaindex @llamaindex/openai
  2. TypeScript configuration:
{
  "compilerOptions": {
    "moduleResolution": "bundler"
  }
}
  3. Build issues:
rm -rf node_modules package-lock.json
npm install
See the Troubleshooting guide for more.
Solutions:
  1. Add retry logic:
import { OpenAI } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = new OpenAI({
  maxRetries: 3,
  timeout: 60000,
});
  2. Use a smaller model:
Settings.llm = new OpenAI({ model: "gpt-3.5-turbo" });
  3. Process in batches:
for (let i = 0; i < documents.length; i += 10) {
  const batch = documents.slice(i, i + 10);
  for (const doc of batch) {
    await index.insert(doc);
  }
  await new Promise((resolve) => setTimeout(resolve, 1000)); // pause between batches
}
Options:
  1. Use batch embedding:
import { OpenAIEmbedding } from "@llamaindex/openai";

Settings.embedModel = new OpenAIEmbedding({
  batchSize: 100,
});
  2. Use a faster model:
Settings.embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small", // Faster than large
});
  3. Cache embeddings in a vector store.
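Caching can be as simple as memoizing embeddings by text so identical chunks are never embedded twice. A hedged sketch in plain TypeScript (the Map-based cache and `embedText` stand-in are illustrative, not LlamaIndex APIs):

```typescript
// Illustrative only: memoize embeddings so repeated text costs one API call.
const cache = new Map<string, number[]>();
let apiCalls = 0;

// Stand-in for a real embedding call (e.g. an OpenAIEmbedding request).
async function embedText(text: string): Promise<number[]> {
  apiCalls++;
  return [text.length, 0]; // fake vector for demonstration
}

async function cachedEmbed(text: string): Promise<number[]> {
  const hit = cache.get(text);
  if (hit) return hit;
  const vec = await embedText(text);
  cache.set(text, vec);
  return vec;
}

(async () => {
  await cachedEmbed("hello");
  await cachedEmbed("hello"); // cache hit: no second API call
  console.log(apiCalls); // 1
})();
```

A persistent vector store gives you the same effect across runs, since already-embedded nodes are stored with their vectors.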
Enable debug mode:
import { Settings } from "llamaindex";

Settings.debug = true;
This shows detailed logs about:
  • API calls
  • Document processing
  • Embedding generation
  • Query execution

Advanced Topics

Yes! Extend the BaseEmbedding class:
import { BaseEmbedding } from "@llamaindex/core/embeddings";
import { Settings } from "llamaindex";

class CustomEmbedding extends BaseEmbedding {
  async getTextEmbedding(text: string): Promise<number[]> {
    // Your embedding logic; return a fixed-length numeric vector
    return [0.1, 0.2, 0.3];
  }
}

Settings.embedModel = new CustomEmbedding();
Use the @llamaindex/workflow package:
import { AgentWorkflow } from "@llamaindex/workflow";

const agent1 = new AgentWorkflow({
  name: "researcher",
  tools: [searchTool],
});

const agent2 = new AgentWorkflow({
  name: "writer",
  tools: [writeTool],
});

// Orchestrate agents
Yes! Define tools and use them with agents:
import { FunctionTool } from "@llamaindex/core/tools";
import { AgentWorkflow } from "@llamaindex/workflow";

const weatherTool = FunctionTool.from(
  ({ city }: { city: string }) => {
    return `Weather in ${city}: Sunny, 72°F`;
  },
  {
    name: "get_weather",
    description: "Get weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string" },
      },
    },
  },
);

const agent = new AgentWorkflow({
  tools: [weatherTool],
});
Strategies:
  1. Chunk documents:
import { SentenceSplitter } from "llamaindex";

const splitter = new SentenceSplitter({
  chunkSize: 512,
  chunkOverlap: 50,
});
const nodes = splitter.getNodesFromDocuments(documents);
  2. Process in batches:
for (let i = 0; i < documents.length; i += 10) {
  for (const doc of documents.slice(i, i + 10)) {
    await index.insert(doc);
  }
}
  3. Increase Node.js memory:
node --max-old-space-size=4096 your-script.js
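The batch loops above all follow the same pattern; a small helper makes it reusable (a sketch in plain TypeScript, `batched` is not a LlamaIndex export):

```typescript
// Split an array into fixed-size batches (the last batch may be smaller).
function batched<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// e.g. for (const batch of batched(documents, 10)) { ...insert batch... }
const groups = batched([1, 2, 3, 4, 5], 2);
console.log(groups.length); // 3 batches: [1,2], [3,4], [5]
```

Keeping batch size small bounds peak memory, since only one batch of nodes is materialized and embedded at a time.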

Cost & Performance

LlamaIndex.TS itself is free and open-source. Costs come from:
LLM API calls:
  • OpenAI: ~$0.03 per 1K tokens (GPT-4)
  • Anthropic: ~$0.015 per 1K tokens (Claude)
  • Or use free local models with Ollama
Embedding API calls:
  • OpenAI: ~$0.0001 per 1K tokens
Vector database:
  • Varies by provider (some have free tiers)
Cost optimization tips:
  • Use smaller models (gpt-3.5-turbo vs gpt-4)
  • Reduce chunk size
  • Cache embeddings
  • Use local models
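To get a feel for these numbers, here is a back-of-the-envelope estimate (the rates are the approximate figures above, not live pricing):

```typescript
// Rough cost estimate using approximate per-1K-token rates.
function estimateCostUSD(tokens: number, ratePer1K: number): number {
  return (tokens / 1000) * ratePer1K;
}

// Embedding 500 chunks of ~512 tokens at ~$0.0001 per 1K tokens:
const embedCost = estimateCostUSD(500 * 512, 0.0001);
console.log(embedCost.toFixed(4)); // "0.0256"

// Answering one query with ~2K tokens of GPT-4 context at ~$0.03 per 1K tokens:
const queryCost = estimateCostUSD(2000, 0.03);
console.log(queryCost.toFixed(2)); // "0.06"
```

The takeaway: embedding an entire corpus often costs pennies, while per-query LLM calls dominate, which is why model choice and chunk size matter most.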
Yes! Use Ollama for completely free, local inference:
# Install Ollama
# https://ollama.ai

# Pull a model
ollama pull llama3
import { Ollama, OllamaEmbedding } from "@llamaindex/ollama";
import { Settings } from "llamaindex";

Settings.llm = new Ollama({ model: "llama3" });
Settings.embedModel = new OllamaEmbedding({ model: "nomic-embed-text" });

Contributing & Community

We welcome contributions! You can:
  • Fix bugs or add features
  • Improve documentation
  • Add examples
  • Help others in Discord
  • Report issues
See the Contributing Guide for details.
Community support: see the Community page for more.

Still Have Questions?

Discord

Join our Discord community

GitHub Discussions

Ask on GitHub Discussions

Documentation

Browse the full documentation

Examples

Check out code examples