Reranking is the process of re-ordering search results to improve relevance, often using a different model than the one used for the initial search. LanceDB has built-in support for reranking with models from Cohere, Sentence-Transformers, and more.

Quickstart

To use a reranker, build a search query and chain the rerank() method onto it before materializing the results:
import lancedb
from lancedb.rerankers import CohereReranker

db = lancedb.connect("/tmp/lancedb")
table = db.open_table("my_table")

query = "what is the capital of france"

# Search with reranking
reranker = CohereReranker()
reranked_results = table.search(query).limit(10).rerank(reranker=reranker).to_pandas()

Supported Rerankers

LanceDB supports several rerankers out of the box. Here are a few examples:
Reranker              Default Model
CohereReranker        rerank-english-v2.0
CrossEncoderReranker  cross-encoder/ms-marco-MiniLM-L-6-v2
ColbertReranker       colbert-ir/colbertv2.0
You can find more details about these and other rerankers in the integrations section.
Note: SDK coverage differs across languages. The provider-specific rerankers in the table above (CohereReranker, CrossEncoderReranker, ColbertReranker, and others under lancedb.rerankers) are currently Python-only. The TypeScript and Rust SDKs currently expose the generic Reranker interface (rerankHybrid / rerank_hybrid) and the built-in RRFReranker. To use a model-based reranker from TypeScript or Rust, you must implement the Reranker interface yourself.

Multi-vector reranking

Most rerankers can rerank results drawn from multiple vector columns. To do so, run one search per vector column and pass the list of result sets to the reranker's rerank_multivector method. Here's an example using the CrossEncoderReranker:
from lancedb.rerankers import CrossEncoderReranker

reranker = CrossEncoderReranker()

query = "hello"

# `deduplicate=True` requires `_rowid` on every input result set,
# so call `.with_row_id(True)` on each search before passing it in.
res1 = table.search(query, vector_column_name="vector").limit(3).with_row_id(True)
res2 = table.search(query, vector_column_name="text_vector").limit(3).with_row_id(True)
res3 = table.search(query, vector_column_name="meta_vector").limit(3).with_row_id(True)

reranked = reranker.rerank_multivector([res1, res2, res3], deduplicate=True)
  • Passing deduplicate=True to rerank_multivector(...) raises a ValueError if any of the input result sets is missing the _rowid column. Therefore, it’s recommended to add .with_row_id(True) to every table.search(...) call before reranking, or omit deduplicate=True if you don’t need it.
  • RRFReranker.rerank_multivector(...) always requires _rowid on its inputs, regardless of the deduplicate flag.
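
To make the role of _rowid more concrete, here is a minimal sketch of reciprocal rank fusion (RRF) in plain Python. It is an illustration of the general technique, not LanceDB's implementation: the function name, the hypothetical row ids, and the constant k=60 (a commonly used default) are all assumptions. Each result set contributes 1 / (k + rank) to a row's score, so rows can only be merged across result sets if they share a stable identifier.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_sets, k=60):
    """Fuse several best-first lists of row ids into one ranking.

    result_sets: list of lists of row ids, each ordered best-first.
    k: smoothing constant (60 is an assumed, commonly used default).
    """
    scores = defaultdict(float)
    for results in result_sets:
        for rank, row_id in enumerate(results, start=1):
            # A row found by several searches accumulates score from each one.
            scores[row_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by row id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical row ids returned by three searches over different vector columns.
res1 = [7, 3, 9]
res2 = [3, 7, 1]
res3 = [3, 4, 7]

fused = reciprocal_rank_fusion([res1, res2, res3])
fused_ids = [row_id for row_id, _ in fused]
```

Rows 3 and 7 appear in all three lists and end up at the top of fused_ids, which is the deduplicating behavior that depends on every input carrying a row identifier.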

Creating Custom Rerankers

LanceDB also allows you to create custom rerankers by extending the base Reranker class. A custom reranker implements the reranking method of that interface (for example rerank_hybrid), which takes search results and returns them in reranked order. This is covered in more detail in the creating custom rerankers section.
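
As a rough idea of the scoring logic such a reranker might wrap, here is a sketch in plain Python that deliberately does not depend on LanceDB. The function name, the text_key parameter, and the list-of-dicts input are hypothetical; the _relevance_score key is assumed to mirror the score column that rerankers attach to results. Wiring this logic into the Reranker base class is left to the custom rerankers section.

```python
def rerank_by_keyword_overlap(query, results, text_key="text"):
    """Return `results` re-ordered by the fraction of query words each text contains."""
    query_words = set(query.lower().split())
    scored = []
    for row in results:
        text_words = set(row[text_key].lower().split())
        overlap = len(query_words & text_words) / max(len(query_words), 1)
        # Copy the row so the caller's input is left untouched.
        scored.append({**row, "_relevance_score": overlap})
    # sorted() is stable, so rows with equal scores keep their input order.
    return sorted(scored, key=lambda row: row["_relevance_score"], reverse=True)


# Hypothetical search results, already converted to plain dictionaries.
results = [
    {"text": "Berlin is the capital of Germany"},
    {"text": "Paris is the capital of France"},
    {"text": "France borders Spain"},
]

reranked = rerank_by_keyword_overlap("what is the capital of france", results)
```

A real reranker would typically replace the word-overlap score with a model call, but the shape stays the same: score each candidate against the query, attach the score, and sort.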