## Overview

Chroma is an open-source embedding database that can run locally or as a server. It's designed for simplicity and ease of use.

## Installation

```bash
npm install @llamaindex/chroma chromadb
```
## Basic Usage

```typescript
import { ChromaVectorStore } from "@llamaindex/chroma";
import { VectorStoreIndex, Document, storageContextFromDefaults } from "llamaindex";

const vectorStore = new ChromaVectorStore({
  collectionName: "my-collection"
});

const documents = [
  new Document({ text: "LlamaIndex is a data framework." }),
  new Document({ text: "Chroma is a vector database." })
];

const storageContext = await storageContextFromDefaults({ vectorStore });
const index = await VectorStoreIndex.fromDocuments(documents, { storageContext });

const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "What is Chroma?"
});
```
## Constructor Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `collectionName` | `string` | `"llamaindex"` | Name of the Chroma collection |
| `chromaClient` | `ChromaClient` | | Custom Chroma client instance |
| `host` | `string` | `"http://localhost:8000"` | Chroma server URL |
| | `number` | | Batch size for operations |
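The batch-size option controls how many records are sent per write. The chunking idea behind it can be sketched in plain TypeScript (the `chunk` helper below is illustrative, not part of either library):

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Ten records with a batch size of 4 are written as batches of 4, 4, and 2.
const batches = chunk([...Array(10).keys()], 4);
console.log(batches.length); // 3
```

Larger batches mean fewer round trips to the server at the cost of bigger requests.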
## Setup Options

### Default Connection

```typescript
const vectorStore = new ChromaVectorStore({
  collectionName: "my-collection"
});
// Connects to a Chroma server at http://localhost:8000 by default;
// whether data persists depends on how that server is configured
```
### Persistent Local Storage

The JavaScript `ChromaClient` talks to a Chroma server over HTTP, so persistence is configured on the server rather than in the client. Start the server with a data directory (for example `chroma run --path ./chroma-data`), then connect to it:

```typescript
import { ChromaClient } from "chromadb";

const client = new ChromaClient({
  path: "http://localhost:8000" // Server started with a persistent data directory
});

const vectorStore = new ChromaVectorStore({
  collectionName: "my-collection",
  chromaClient: client
});
```
### Remote Server

```typescript
import { ChromaClient } from "chromadb";

const client = new ChromaClient({
  path: "http://chroma-server:8000"
});

const vectorStore = new ChromaVectorStore({
  collectionName: "my-collection",
  chromaClient: client
});
```
## Running Chroma Server

### Docker

```bash
docker pull chromadb/chroma
docker run -p 8000:8000 chromadb/chroma
```

### Python

```bash
pip install chromadb
chroma run --host localhost --port 8000
```
## Querying

### Basic Query

```typescript
const index = await VectorStoreIndex.fromVectorStore(vectorStore);

const retriever = index.asRetriever({
  similarityTopK: 5
});

const nodes = await retriever.retrieve("search query");
```
### Metadata Filtering

```typescript
import { storageContextFromDefaults } from "llamaindex";

const documents = [
  new Document({
    text: "Document 1",
    metadata: { category: "tech", year: 2023 }
  }),
  new Document({
    text: "Document 2",
    metadata: { category: "science", year: 2024 }
  })
];

const storageContext = await storageContextFromDefaults({ vectorStore });
const index = await VectorStoreIndex.fromDocuments(documents, { storageContext });

const retriever = index.asRetriever({
  filters: {
    filters: [{ key: "category", value: "tech", operator: "==" }]
  }
});
```
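To make the filter's behavior concrete, here is a plain TypeScript sketch of equality filtering over document metadata (illustrative only — Chroma evaluates the real filter server-side against the collection):

```typescript
type Metadata = Record<string, string | number>;

// True when the item's metadata matches every key/value pair in `filter`.
function matchesFilter(metadata: Metadata, filter: Metadata): boolean {
  return Object.entries(filter).every(([key, value]) => metadata[key] === value);
}

const docs: { text: string; metadata: Metadata }[] = [
  { text: "Document 1", metadata: { category: "tech", year: 2023 } },
  { text: "Document 2", metadata: { category: "science", year: 2024 } }
];

const hits = docs.filter((d) => matchesFilter(d.metadata, { category: "tech" }));
console.log(hits.map((d) => d.text)); // ["Document 1"]
```

Only documents whose metadata satisfies the filter are considered during similarity search, so filtering narrows the candidate set before ranking.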
## Collections

Manage multiple collections:

```typescript
const docsStore = new ChromaVectorStore({
  collectionName: "documents"
});

const codeStore = new ChromaVectorStore({
  collectionName: "code"
});

const chatStore = new ChromaVectorStore({
  collectionName: "chat-history"
});
```
## Managing Data

### Add Documents

```typescript
const newDoc = new Document({ text: "New content" });
await index.insert(newDoc);
```

### Delete Documents

```typescript
await index.deleteRefDoc(docId);
```

### Clear Collection

```typescript
const client = await vectorStore.client();
await client.deleteCollection({ name: "my-collection" });
```
## Loading Existing Collection

```typescript
import { VectorStoreIndex } from "llamaindex";
import { ChromaVectorStore } from "@llamaindex/chroma";

const vectorStore = new ChromaVectorStore({
  collectionName: "existing-collection"
});

const index = await VectorStoreIndex.fromVectorStore(vectorStore);
```
## Distance Metrics

Chroma supports different distance metrics:

```typescript
import { ChromaClient } from "chromadb";

const client = new ChromaClient();
const collection = await client.createCollection({
  name: "my-collection",
  metadata: {
    "hnsw:space": "cosine" // or "l2", "ip" (inner product)
  }
});
```
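The three metrics can be sketched in plain TypeScript to show how they differ (illustrative only — Chroma computes these inside its HNSW index):

```typescript
// Inner product of two equal-length vectors ("ip").
function innerProduct(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Squared Euclidean distance ("l2"): 0 for identical vectors,
// sensitive to vector magnitude.
function l2Squared(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Cosine distance ("cosine"): 1 - cos(angle); 0 for parallel vectors,
// ignores magnitude entirely.
function cosineDistance(a: number[], b: number[]): number {
  const norm = (v: number[]) => Math.sqrt(innerProduct(v, v));
  return 1 - innerProduct(a, b) / (norm(a) * norm(b));
}

// [2, 0] and [1, 0] point the same way: cosine distance is 0,
// but squared L2 distance is 1 because the magnitudes differ.
console.log(cosineDistance([2, 0], [1, 0])); // 0
console.log(l2Squared([2, 0], [1, 0])); // 1
```

This is why cosine is the usual choice for text embeddings: it compares direction (semantic content) and ignores vector length.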
## Embedding Functions

Use custom embedding functions:

```typescript
import { OpenAIEmbeddingFunction } from "chromadb";

const embedder = new OpenAIEmbeddingFunction({
  openai_api_key: process.env.OPENAI_API_KEY,
  openai_model: "text-embedding-3-small"
});

const collection = await client.createCollection({
  name: "my-collection",
  embeddingFunction: embedder
});
```
## Complete Example

```typescript
import { ChromaVectorStore } from "@llamaindex/chroma";
import { VectorStoreIndex, Document, Settings, storageContextFromDefaults } from "llamaindex";
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
import { ChromaClient } from "chromadb";

// Configure settings
Settings.llm = new OpenAI({ model: "gpt-4" });
Settings.embedModel = new OpenAIEmbedding();

// Connect to a Chroma server started with a persistent data directory,
// e.g. `chroma run --path ./chroma-data`
const client = new ChromaClient({
  path: "http://localhost:8000"
});

// Create vector store
const vectorStore = new ChromaVectorStore({
  collectionName: "my-docs",
  chromaClient: client
});

// Load documents
const documents = [
  new Document({
    text: "LlamaIndex documentation...",
    metadata: { source: "docs", page: 1 }
  })
];

// Build index
const storageContext = await storageContextFromDefaults({ vectorStore });
const index = await VectorStoreIndex.fromDocuments(documents, { storageContext });

// Query
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "What is LlamaIndex?"
});

console.log(response.response);
```
## Best Practices

- **Use persistent storage**: Enable data persistence for production
- **Choose an appropriate metric**: Cosine works well for most text use cases
- **Organize with collections**: Separate data by use case or environment
- **Run a server for production**: Use Chroma server for scalability
- **Monitor memory**: In-memory mode is limited by available RAM
## Troubleshooting

### Connection Error

```typescript
try {
  const vectorStore = new ChromaVectorStore({
    collectionName: "test",
    host: "http://localhost:8000"
  });
  await vectorStore.client();
} catch (error) {
  console.error("Cannot connect to Chroma:", error.message);
  console.log("Make sure the Chroma server is running on port 8000");
}
```
### Collection Already Exists

```typescript
const client = new ChromaClient();

// Get or create the collection instead of creating it unconditionally
const collection = await client.getOrCreateCollection({
  name: "my-collection"
});
```

## See Also