LanceDB will automatically vectorize the data both at ingestion and query time; all you need to do is specify which model to use. Popular embedding models such as OpenAI, Hugging Face, Sentence Transformers, and CLIP are supported.
Step 1: Import Required Libraries
First, import the necessary LanceDB components:
- lancedb: the main database connection and operations
- LanceModel: Pydantic model for defining table schemas
- Vector: field type for storing vector embeddings
- get_registry(): access to the embedding function registry, which contains all supported embedding functions as well as custom ones registered by the user
- TypeScript uses lancedb.embedding.getRegistry() and lancedb.embedding.LanceSchema() for the same registry/schema workflow
Step 2: Connect to LanceDB
Establish a connection to your LanceDB OSS directory or Enterprise cluster.

Step 3: Initialize the Embedding Function
Choose and configure your embedding model. This example uses the sentence-transformers provider with the BGE model; the TypeScript snippet uses the Transformers-backed huggingface provider. You can:
- Change "sentence-transformers" to other providers like "openai", "cohere", etc.
- Modify the model name for different embedding models
- Set device="cuda" for GPU acceleration if available
Step 4: Define Your Schema
Create a Pydantic model that defines your table structure:
- SourceField(): this field will be embedded
- VectorField(): this stores the embeddings
- model.ndims(): sets the vector dimensions for your model
- In TypeScript, use model.sourceField(...) and model.vectorField() inside LanceSchema(...)
Step 5: Create Table and Ingest Data
Create a table with your schema and add data. The table.add() call automatically:
- Takes the text from each document
- Generates embeddings using your chosen model
- Stores both the original text and the vector embeddings
Step 6: Query with Automatic Embedding
Note: On LanceDB Enterprise, the server does not generate embeddings from query text. In the Python remote client, table.search("greetings") can still work when the table schema includes embedding metadata, because the client computes the query embedding before sending the vector search. If there is no embedding metadata for that search, search("greetings") in auto mode is treated as full-text search (FTS) instead.
Search your data using natural language queries. The search() call:
- Automatically converts your query text to embeddings
- Finds the most similar vectors in your table
- Returns the matching documents