Overview

The Cohere provider integrates Cohere’s reranking API with LlamaIndex.TS to improve the relevance of search results. Reranking reorders retrieved documents by how relevant they are to a query, often surfacing better results than embedding similarity alone.

Installation

npm install @llamaindex/cohere

Basic Usage

import { CohereRerank } from "@llamaindex/cohere";

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-english-v2.0"
});

// Use as a node postprocessor
const rerankedNodes = await reranker.postprocessNodes(nodes, query);

Constructor Options

  • apiKey (string, required): Cohere API key. There is no environment-variable default; it must be provided explicitly.
  • topN (number, default: 2): Number of top results to return after reranking.
  • model (string, default: "rerank-english-v2.0"): Cohere rerank model to use.
  • baseUrl (string, optional): Custom API endpoint URL.
  • timeout (number, optional): Request timeout in seconds.
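The options above can be modeled as a plain TypeScript type. The sketch below is illustrative only (the `CohereRerankOptions` interface and `withDefaults` helper are not part of `@llamaindex/cohere`); it shows how the documented defaults would apply:

```typescript
// Illustrative model of the constructor options documented above.
// The type name and helper are hypothetical, not part of @llamaindex/cohere.
interface CohereRerankOptions {
  apiKey: string;   // required; no environment-variable fallback
  topN?: number;    // default: 2
  model?: string;   // default: "rerank-english-v2.0"
  baseUrl?: string; // optional custom API endpoint URL
  timeout?: number; // request timeout in seconds
}

// Fills in the documented defaults for the optional fields.
function withDefaults(opts: CohereRerankOptions) {
  return {
    topN: 2,
    model: "rerank-english-v2.0",
    ...opts, // caller-supplied values win over the defaults
  };
}

const resolved = withDefaults({ apiKey: "my-key" });
console.log(resolved.topN, resolved.model); // 2 "rerank-english-v2.0"
```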

Supported Models

Rerank Models

  • rerank-english-v2.0: General-purpose English reranking (default)
  • rerank-multilingual-v2.0: Multilingual reranking support
  • rerank-english-v3.0: Latest English model
  • rerank-multilingual-v3.0: Latest multilingual model
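The naming scheme is regular enough to encode in a small helper. `chooseRerankModel` below is a hypothetical convenience function, not part of the package:

```typescript
// Hypothetical helper mapping (language, generation) to the model names above.
function chooseRerankModel(opts: { multilingual?: boolean; latest?: boolean } = {}): string {
  const lang = opts.multilingual ? "multilingual" : "english";
  const version = opts.latest ? "v3.0" : "v2.0";
  return `rerank-${lang}-${version}`;
}

console.log(chooseRerankModel());                                     // "rerank-english-v2.0"
console.log(chooseRerankModel({ multilingual: true, latest: true })); // "rerank-multilingual-v3.0"
```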

With Query Engine

import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";

const index = await VectorStoreIndex.fromDocuments(documents);

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 3,
  model: "rerank-english-v3.0"
});

const queryEngine = index.asQueryEngine({
  nodePostprocessors: [reranker]
});

const response = await queryEngine.query({
  query: "What are the main features?"
});

With Retriever

import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";

const index = await VectorStoreIndex.fromDocuments(documents);
const retriever = index.asRetriever({ similarityTopK: 10 });

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5
});

// Retrieve initial results
const nodes = await retriever.retrieve({ query: "user query" });

// Rerank for better relevance
const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "user query"
);

Multilingual Reranking

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-multilingual-v3.0"
});

const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "Quelles sont les principales caractéristiques?" // French query
);

Custom Base URL

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  baseUrl: "https://custom-cohere-endpoint.com",
  topN: 3
});

Configuration

Environment Variables

COHERE_API_KEY=your-api-key-here

Note: Unlike other providers, the Cohere package does not automatically read from environment variables. You must explicitly pass the API key.

How Reranking Works

  1. Initial Retrieval: Your retriever fetches top-K documents (e.g., 10-20)
  2. Reranking: Cohere’s model re-scores each document for relevance
  3. Top-N Selection: Returns only the most relevant N documents
  4. Score Update: Updates each node’s score to the relevance score

// Before reranking: 10 documents with embedding similarity scores
const retriever = index.asRetriever({ similarityTopK: 10 });
const initialNodes = await retriever.retrieve({ query });

// After reranking: 3 most relevant documents with Cohere relevance scores
const reranker = new CohereRerank({ apiKey, topN: 3 });
const finalNodes = await reranker.postprocessNodes(initialNodes, query);
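The four steps can also be simulated end to end with plain data. In the sketch below, `mockRelevance` is a stand-in for the Cohere rerank API (which performs the actual re-scoring); the rest mirrors the re-score, sort, and truncate flow:

```typescript
interface ScoredNode {
  text: string;
  score: number;
}

// Stand-in for the Cohere rerank call: scores a document against a query.
// (A real call would hit the API; this mock counts shared words.)
function mockRelevance(query: string, text: string): number {
  const queryWords = new Set(query.toLowerCase().split(/\s+/));
  const docWords = text.toLowerCase().split(/\s+/);
  const hits = docWords.filter((w) => queryWords.has(w)).length;
  return hits / Math.max(docWords.length, 1);
}

// Steps 2-4: re-score every node, sort by the new score, keep the top N.
function rerank(nodes: ScoredNode[], query: string, topN: number): ScoredNode[] {
  return nodes
    .map((n) => ({ text: n.text, score: mockRelevance(query, n.text) })) // steps 2 and 4
    .sort((a, b) => b.score - a.score)                                   // order by relevance
    .slice(0, topN);                                                     // step 3
}

// Step 1 would be the retriever; here we start from three retrieved nodes.
const retrieved: ScoredNode[] = [
  { text: "cats purr and sleep", score: 0.9 },
  { text: "dogs bark loudly", score: 0.8 },
  { text: "dogs bark and dogs run", score: 0.7 },
];
const top = rerank(retrieved, "why do dogs bark", 2);
console.log(top.map((n) => n.text));
```

Note how the first node's high embedding score (0.9) does not survive reranking: the relevance scores assigned in step 4 replace the original similarity scores.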

Performance Tips

  1. Retrieve more, rerank to fewer: Retrieve 10-20 documents, then rerank down to the top 3-5
  2. Use for complex queries: Most beneficial when semantic search alone isn’t sufficient
  3. Choose the right model: v3.0 models offer better quality; v2.0 models are faster
  4. Set an appropriate timeout: Increase the timeout for large document sets
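Tip 1 can be expressed as a small heuristic. The factor of 3 and the floor of 10 below are illustrative choices for this sketch, not library defaults:

```typescript
// Illustrative heuristic: retrieve several times more candidates than you keep,
// with a floor so small topN values still get a reasonable candidate pool.
function suggestSimilarityTopK(topN: number): number {
  return Math.max(topN * 3, 10);
}

console.log(suggestSimilarityTopK(5)); // 15
console.log(suggestSimilarityTopK(2)); // 10
```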

Error Handling

try {
  const rerankedNodes = await reranker.postprocessNodes(nodes, query);
} catch (error) {
  // In TypeScript, a caught value is typed `unknown`; narrow before reading .message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("API key")) {
    console.error("Invalid or missing Cohere API key");
  } else {
    console.error("Reranking failed:", message);
  }
}

Use Cases

  • Improve RAG quality: Rerank retrieved documents before sending to LLM
  • Multi-stage retrieval: First pass with embeddings, second pass with reranking
  • Cross-lingual search: Use multilingual models for queries in different languages
  • Semantic search refinement: Improve relevance beyond vector similarity

Best Practices

  1. Always provide a query: Reranking requires a query string to work
  2. Retrieve enough candidates: Aim for 10-20 initial results for the best reranking
  3. Don’t over-rerank: The top 3-5 results are usually sufficient for most use cases
  4. Handle empty results: Check that initial retrieval returns documents before reranking
  5. Monitor costs: Reranking adds API costs; use it judiciously
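Practices 1 and 4 can be folded into one small guard. `safeRerank` below is a hypothetical wrapper around any reranking step (the real library call is async and billable; this sketch keeps it synchronous for clarity):

```typescript
interface TextNode {
  text: string;
  score: number;
}

// Hypothetical guard: validate the query and skip the (billable) rerank
// call entirely when retrieval returned nothing.
function safeRerank(
  rerankFn: (nodes: TextNode[], query: string) => TextNode[],
  nodes: TextNode[],
  query: string,
): TextNode[] {
  if (!query.trim()) {
    throw new Error("Reranking requires a non-empty query"); // practice 1
  }
  if (nodes.length === 0) {
    return nodes; // practice 4: nothing retrieved, skip the API call and its cost
  }
  return rerankFn(nodes, query);
}

// Trivial stand-in rerank function for demonstration:
const identity = (nodes: TextNode[], _query: string) => nodes;
console.log(safeRerank(identity, [], "any query").length); // 0
```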

See Also