Overview
The Cohere provider integrates Cohere’s reranking API with LlamaIndex.TS to improve the relevance of search results. Reranking reorders retrieved documents by their relevance to a query.
Installation
npm install @llamaindex/cohere
Basic Usage
import { CohereRerank } from "@llamaindex/cohere";

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-english-v2.0"
});

// Use as a node postprocessor
const rerankedNodes = await reranker.postprocessNodes(nodes, query);
Constructor Options
apiKey
string
Cohere API key (no environment variable default; must be provided)
topN
number
Number of top results to return after reranking
model
string
default:"rerank-english-v2.0"
Cohere rerank model to use
baseUrl
string
Optional custom API endpoint URL
timeout
number
Request timeout in seconds
Supported Models
Rerank Models
rerank-english-v2.0: General-purpose English reranking (default)
rerank-multilingual-v2.0: Multilingual reranking support
rerank-english-v3.0: Latest English model
rerank-multilingual-v3.0: Latest multilingual model
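As a rough illustration of choosing between the models listed above, the sketch below maps a language code to a model name. The helper `pickRerankModel` and the convention of passing an ISO 639-1 style code are assumptions made for this example; they are not part of `@llamaindex/cohere`.

```typescript
// Hypothetical helper: not exported by @llamaindex/cohere.
// The model names are the v3.0 entries from the list above.
type RerankModel = "rerank-english-v3.0" | "rerank-multilingual-v3.0";

function pickRerankModel(languageCode: string): RerankModel {
  // Treat "en" and regional variants such as "en-GB" as English.
  const primary = languageCode.trim().toLowerCase().split("-")[0];
  return primary === "en" ? "rerank-english-v3.0" : "rerank-multilingual-v3.0";
}
```

The returned string can then be passed as the `model` option when constructing the reranker.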
With Query Engine
import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";

const index = await VectorStoreIndex.fromDocuments(documents);

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 3,
  model: "rerank-english-v3.0"
});

const queryEngine = index.asQueryEngine({
  nodePostprocessors: [reranker]
});

const response = await queryEngine.query({
  query: "What are the main features?"
});
With Retriever
import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";

const index = await VectorStoreIndex.fromDocuments(documents);
const retriever = index.asRetriever({ similarityTopK: 10 });

const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5
});

// Retrieve initial results
const nodes = await retriever.retrieve({ query: "user query" });

// Rerank for better relevance
const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "user query"
);
Multilingual Reranking
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-multilingual-v3.0"
});

const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "Quelles sont les principales caractéristiques?" // French query
);
Custom Base URL
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  baseUrl: "https://custom-cohere-endpoint.com",
  topN: 3
});
Configuration
Environment Variables
COHERE_API_KEY=your-api-key-here
Note: Unlike other providers, the Cohere package does not automatically read from environment variables. You must explicitly pass the API key.
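Because the key must be passed explicitly, it can help to fail fast when the variable is unset rather than sending an undefined key to the API. The following is a minimal sketch; `requireEnv` is a hypothetical helper introduced here for illustration, not something the package provides.

```typescript
// Hypothetical helper: reads a required variable from an environment-like map
// and throws a descriptive error when it is missing or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage sketch (assumes a Node.js runtime where process.env is available):
// const reranker = new CohereRerank({
//   apiKey: requireEnv("COHERE_API_KEY", process.env),
// });
```

Taking the environment as a parameter keeps the helper easy to test and avoids a hard dependency on `process.env`.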
How Reranking Works
- Initial Retrieval: Your retriever fetches top-K documents (e.g., 10-20)
- Reranking: Cohere’s model re-scores each document for relevance
- Top-N Selection: Returns only the most relevant N documents
- Score Update: Updates each node’s score to the relevance score
// Before reranking: 10 documents with embedding similarity scores
// (similarityTopK is configured on the retriever, not on retrieve())
const retriever = index.asRetriever({ similarityTopK: 10 });
const initialNodes = await retriever.retrieve({ query });

// After reranking: 3 most relevant documents with Cohere relevance scores
const reranker = new CohereRerank({ apiKey, topN: 3 });
const finalNodes = await reranker.postprocessNodes(initialNodes, query);
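To make the re-score, select, and score-update stages concrete without calling any API, here is a self-contained sketch. The `ScoredNode` type, `toyRelevance`, and `rerankTopN` are all assumptions for illustration: the real library uses its own node classes, and Cohere's model computes relevance very differently from this term-overlap stand-in.

```typescript
// Minimal stand-in type (assumption); the library has its own node classes.
interface ScoredNode {
  text: string;
  score: number;
}

// Toy relevance function (assumption): fraction of query terms found in the text.
// It exists only to make the pipeline runnable offline.
function toyRelevance(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term)).length;
  return hits / terms.length;
}

function rerankTopN(nodes: ScoredNode[], query: string, topN: number): ScoredNode[] {
  return nodes
    // Reranking and score update: replace each retrieval score with a relevance score.
    .map((node) => ({ ...node, score: toyRelevance(query, node.text) }))
    // Top-N selection: sort by the new score and keep the best N.
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}

// Candidates as they might come back from an embedding-based retriever.
const candidates: ScoredNode[] = [
  { text: "Installation guide for the CLI", score: 0.82 },
  { text: "Main features of the reranking API", score: 0.79 },
  { text: "Release notes and main changes", score: 0.75 },
];
const top = rerankTopN(candidates, "main features", 2);
```

In this example the node ranked first by the retriever is dropped entirely, which is the point of the second stage: the final order depends on query relevance rather than on the original similarity score.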
- Retrieve more, rerank to fewer: Retrieve 10-20 documents, rerank to top 3-5
- Use for complex queries: Most beneficial when semantic search alone isn’t sufficient
- Choose the right model: v3.0 models offer better quality, while v2.0 models are faster
- Set an appropriate timeout: For large document sets, increase the timeout
Error Handling
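try {
  const rerankedNodes = await reranker.postprocessNodes(nodes, query);
} catch (error) {
  // A caught value is `unknown` in TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("API key")) {
    console.error("Invalid or missing Cohere API key");
  } else {
    console.error("Reranking failed:", message);
  }
}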
Use Cases
- Improve RAG quality: Rerank retrieved documents before sending to LLM
- Multi-stage retrieval: First pass with embeddings, second pass with reranking
- Cross-lingual search: Use multilingual models for queries in different languages
- Semantic search refinement: Improve relevance beyond vector similarity
Best Practices
- Always provide a query: Reranking requires a query string to work
- Retrieve enough candidates: Aim for 10-20 initial results for best reranking
- Don’t over-rerank: The top 3-5 results are usually sufficient for most use cases
- Handle empty results: Check whether the initial retrieval returned any documents before reranking
- Monitor costs: Reranking adds API costs, so use it judiciously
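The first and fourth practices above can be folded into a small pre-flight check that runs before any API call. This is only a sketch: `shouldRerank` is a hypothetical helper, not part of `@llamaindex/cohere`.

```typescript
// Hypothetical pre-flight check; not part of @llamaindex/cohere.
function shouldRerank<T>(nodes: readonly T[], query: string): boolean {
  // Reranking requires a query string, so reject blank input early.
  if (query.trim().length === 0) {
    throw new Error("Reranking requires a non-empty query string");
  }
  // With no candidates there is nothing to rerank: skip the API call and its cost.
  return nodes.length > 0;
}

// Usage sketch (assumes `reranker`, `nodes`, and `query` from the earlier examples):
// const finalNodes = shouldRerank(nodes, query)
//   ? await reranker.postprocessNodes(nodes, query)
//   : [];
```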