
When using LanceDB OSS, you can choose where to store your data. The tradeoffs between storage options are covered in the storage architecture guide. This page shows how to configure each backend.
LanceDB Enterprise storage configuration: In LanceDB Enterprise, you connect with db://... and the cluster owns the storage credentials, so storage_options are not passed at runtime. Cloud auth is set at deployment time. For federated databases, the namespace service vends per-request credentials automatically. See the Enterprise quickstart and the Azure deployment guide for the Enterprise flow.
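As a rough sketch of the Enterprise flow, the connection carries only an API key; the database name, key, and region below are placeholders, and exact parameters may vary by deployment:

```python
import lancedb

# Placeholders: database name, API key, and region are illustrative.
db = lancedb.connect(
    "db://my-database",
    api_key="sk_...",   # hypothetical key vended by your deployment
    region="us-east-1",
)
# No storage_options here: the cluster owns the storage credentials.
```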

Object stores

LanceDB supports AWS S3 (and compatible stores), Azure Blob Storage, and Google Cloud Storage. The URI scheme in your connect call selects the backend.

Configuration options

When running inside the target cloud with correct IAM bindings, LanceDB often needs no extra configuration. When running elsewhere, provide credentials via environment variables or storage_options.
Storage option casing: Keys are case-insensitive. Use lowercase in storage_options and uppercase in environment variables.
Table-level storage_options inherit every key from the connection and override on a per-key basis. Pass them to create_table or open_table for options that should apply to a single table:
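Per-key inheritance works like a dict merge in which table-level keys win. A minimal sketch of the effective options (the values are placeholders) that you would pass as storage_options to create_table or open_table:

```python
# Connection-level options, set once in lancedb.connect(...).
connection_options = {
    "timeout": "60s",
    "region": "us-east-1",
}

# Table-level options, passed to create_table(...) or open_table(...).
table_options = {
    "timeout": "120s",  # overrides the connection-level value for this table
}

# Effective options for the table: every connection key, overridden per key.
effective = {**connection_options, **table_options}
print(effective)  # {'timeout': '120s', 'region': 'us-east-1'}
```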
Inspect the effective options: On AsyncTable, await table.initial_storage_options() returns the options the table was opened with, and await table.latest_storage_options() returns the current options after any provider-driven refresh. The deprecated table.storage_options() method will be removed in a future release.

General object store options

| Key | Description |
| --- | --- |
| allow_http | Allow non-TLS connections. |
| allow_invalid_certificates | Skip certificate validation for TLS connections. |
| connect_timeout | Timeout for the connect phase. |
| timeout | Timeout for the full request. |
| user_agent | User agent string sent with requests. |
| proxy_url | Proxy URL to route requests through. |
| proxy_ca_certificate | PEM-formatted CA certificate for proxy connections. |
| proxy_excludes | Comma-separated hosts that bypass the proxy (domains or CIDR). |
| download_retry_count | Number of retries when downloading objects. |
| client_max_retries | Maximum retries for object-store client requests. |
| client_retry_timeout | Total retry timeout (seconds) for object-store client requests. |
Option support varies by backend: These are commonly used options. Cloud-specific keys (for example region, endpoint, service_account, and Azure credential keys) are backend-dependent and can be provided in storage_options as needed.

New table configuration

These options control the Lance file format and features used when creating new tables. Pass them via storage_options at connection or table level. They are evaluated only at table creation; setting them on an existing connection does not rewrite or alter tables that already exist.
| Key | Values | Default | Description |
| --- | --- | --- | --- |
| new_table_data_storage_version | legacy, stable | stable | Lance file format version for new tables. Use legacy for backward compatibility with older clients, or stable for the current format with better performance. |
| new_table_enable_v2_manifest_paths | true, false | false | Use v2 manifest path naming. Requires LanceDB >= 0.10.0 to read. |
| new_table_enable_stable_row_ids | true, false | false | Keep row IDs stable across compaction, delete, and merge operations. |
```python
import lancedb

# Set the Lance file format version at connection level
db = lancedb.connect(
    "s3://bucket/path",
    storage_options={
        "new_table_data_storage_version": "stable",
    },
)
```
Deprecated parameter: The data_storage_version parameter on create_table() is deprecated. Use new_table_data_storage_version in storage_options instead.

AWS S3

Set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN as environment variables or pass them in storage_options. Region is optional for AWS but required for most S3-compatible stores. Minimum permissions usually include s3:PutObject, s3:GetObject, s3:DeleteObject, s3:ListBucket, and s3:GetBucketLocation scoped to the relevant bucket/prefix.
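For example, a sketch of passing credentials explicitly through storage_options; the bucket name and credential values are placeholders:

```python
import lancedb

db = lancedb.connect(
    "s3://my-bucket/lancedb",  # placeholder bucket and prefix
    storage_options={
        "aws_access_key_id": "AKIA...",        # placeholder credentials
        "aws_secret_access_key": "secret",
        "aws_session_token": "token",          # optional, for temporary credentials
        "region": "us-east-1",                 # optional for AWS itself
    },
)
```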

S3-compatible stores

If the endpoint is http:// (common in local development), also set ALLOW_HTTP=true or pass allow_http=True in storage_options.
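A sketch for a local S3-compatible store such as MinIO; the endpoint, region, and credentials below are the common local-development defaults, not values LanceDB requires:

```python
import lancedb

db = lancedb.connect(
    "s3://bucket/path",
    storage_options={
        "endpoint": "http://localhost:9000",  # e.g. a local MinIO server
        "region": "us-east-1",                # required by most S3-compatible stores
        "allow_http": "true",                 # needed because the endpoint is http://
        "aws_access_key_id": "minioadmin",    # placeholder credentials
        "aws_secret_access_key": "minioadmin",
    },
)
```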

S3 Express

Consult AWS networking requirements for S3 Express before enabling.
Clean up failed multipart uploadsLanceDB aborts multipart uploads on graceful shutdown, but crashes can leave incomplete uploads. Add an S3 lifecycle rule to delete in-progress uploads after a few days.

Server-side encryption with KMS

To encrypt at rest with an AWS KMS key, set aws_server_side_encryption to aws:kms and aws_sse_kms_key_id to the key ID or ARN. The same options apply at connection or table level and combine with bucket-level default encryption. The IAM principal needs kms:Encrypt, kms:Decrypt, and kms:GenerateDataKey on the configured KMS key.
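A sketch of the two SSE-KMS options at connection level; the bucket and key ARN are placeholders:

```python
import lancedb

db = lancedb.connect(
    "s3://my-bucket/lancedb",  # placeholder bucket
    storage_options={
        "aws_server_side_encryption": "aws:kms",
        # Placeholder ARN; a key ID or alias also works.
        "aws_sse_kms_key_id": "arn:aws:kms:us-east-1:123456789012:key/your-key-id",
    },
)
```

The same two keys can instead be passed to create_table or open_table to encrypt a single table differently from the rest of the connection.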

Google Cloud Storage

Provide credentials via GOOGLE_SERVICE_ACCOUNT (path to JSON) or include the path in storage_options. GCS defaults to HTTP/1; set HTTP1_ONLY=false if you need HTTP/2.
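A sketch of the storage_options form; the bucket and credential path are placeholders, and the key name follows the general object-store conventions:

```python
import lancedb

db = lancedb.connect(
    "gs://my-bucket/lancedb",  # placeholder bucket
    storage_options={
        # Path to a service-account JSON file (placeholder).
        "service_account": "/path/to/service-account.json",
    },
)
```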

Azure Blob Storage

Set AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY as environment variables, or pass them via storage_options. For SAS-token auth, set azure_storage_account_name and azure_storage_sas_token. Other supported keys include service principal credentials (azure_client_id, azure_client_secret, azure_tenant_id), managed identities, and custom endpoints.
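A sketch of SAS-token auth via storage_options; the container, account name, and token are placeholders:

```python
import lancedb

db = lancedb.connect(
    "az://my-container/lancedb",  # placeholder container and prefix
    storage_options={
        "azure_storage_account_name": "myaccount",  # placeholder account
        "azure_storage_sas_token": "sv=...",        # placeholder SAS token
    },
)
```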

Tigris Object Storage

Tigris exposes an S3-compatible API. Configure the endpoint and region in storage_options, or set the environment variables AWS_ENDPOINT=https://t3.storage.dev and AWS_DEFAULT_REGION=auto to achieve the same configuration.
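A sketch of the storage_options form; the bucket name is a placeholder, while the endpoint and region match the environment variables above:

```python
import lancedb

db = lancedb.connect(
    "s3://my-tigris-bucket/lancedb",  # placeholder bucket
    storage_options={
        "endpoint": "https://t3.storage.dev",
        "region": "auto",
    },
)
```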