Qdrant Technical Research Report

Last Updated: 2025-12-18

Research Methodology: Gemini CLI internet search + Qdrant official documentation (chrome-devtools)

Overview

Qdrant is a high-performance, open-source vector database written in Rust. It specializes in storing, indexing, and searching high-dimensional vectors with associated metadata (payloads), making it ideal for AI/ML applications requiring similarity search.

Key Innovation: Filtrable HNSW index that extends the graph with payload-based edges, enabling efficient filtered vector search without sacrificing accuracy.

Funding: $37.8M total funding

Source: Qdrant GitHub | Qdrant Documentation

1. Core Architecture

Segment-Based Storage

Data within a collection is divided into segments, each with its own independent:
- Vector storage
- Payload storage
- Vector indexes
- Payload indexes
- ID mapper (internal ↔ external ID relationship)

Collection
├── Segment 1 (appendable)
│   ├── Vector Storage (In-memory or Memmap)
│   ├── Payload Storage (InMemory or OnDisk/RocksDB)
│   ├── HNSW Index
│   ├── Payload Indexes
│   └── ID Mapper
├── Segment 2 (non-appendable)
│   └── ...
└── Write-Ahead Log (WAL)

Segment Types:
- Appendable: Supports add, delete, and query operations
- Non-appendable: Read and delete only (optimized for search)
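The difference between the two segment types can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: the `Segment` class, its method names, and the brute-force dot-product search are hypothetical and do not reflect Qdrant's internal Rust implementation.

```python
class Segment:
    """Toy segment holding vector storage, payload storage, and an ID mapper."""

    def __init__(self, appendable=True):
        self.appendable = appendable
        self.vectors = []    # internal offset -> vector
        self.payloads = []   # internal offset -> payload
        self.id_map = {}     # external ID -> internal offset (live points only)

    def upsert(self, ext_id, vector, payload=None):
        if not self.appendable:
            raise PermissionError("non-appendable segment: read and delete only")
        self.vectors.append(list(vector))
        self.payloads.append(payload or {})
        # Re-pointing the ID mapper makes any older offset unreachable.
        self.id_map[ext_id] = len(self.vectors) - 1

    def delete(self, ext_id):
        # Deleting only touches the ID mapper, so it works on both segment types.
        return self.id_map.pop(ext_id, None) is not None

    def search(self, query, limit):
        # Brute-force dot-product scan over live points (no HNSW in this toy).
        scored = [
            (sum(q * x for q, x in zip(query, self.vectors[offset])), ext_id)
            for ext_id, offset in self.id_map.items()
        ]
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [ext_id for _, ext_id in scored[:limit]]
```

Flipping `appendable` to `False` in this model mimics a segment that has been optimized for search: queries and deletes still work, while new writes would have to go to another, appendable segment.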

Source: Storage Documentation

2. Vector Storage Options

Storage Type	Description	Use Case
In-memory	All vectors in RAM, disk for persistence only	Highest speed, sufficient RAM
Memmap	Virtual address space mapped to disk file	Large datasets, uses page cache

Memmap Configuration

PUT /collections/{collection_name}
{
    "vectors": {
      "size": 768,
      "distance": "Cosine",
      "on_disk": true
    },
    "hnsw_config": {
        "on_disk": true
    }
}

Key Parameter: memmap_threshold (set under optimizers_config) - segments are converted to memmap storage once their vector data exceeds this threshold, which is specified in kilobytes.

Source: Storage Documentation

3. Indexing System

3.1 Vector Index (HNSW)

Qdrant uses HNSW (Hierarchical Navigable Small World Graph) exclusively for dense vector indexing.

Architecture:
- Multi-layer navigation structure
- Upper layers: sparse, long-distance connections
- Lower layers: dense, short-distance connections
- Search starts from the top layer and descends iteratively

Key Parameters:

Parameter	Default	Description
`m`	16	Max edges per node (higher = more accurate, more memory)
`ef_construct`	100	Neighbors considered during build (higher = more accurate, slower build)
`ef`	= ef_construct	Search range (configurable per query via the `hnsw_ef` search parameter)
`full_scan_threshold`	10000	Size threshold (in KB of vector data) below which full scan is preferred over HNSW

storage:
  hnsw_index:
    m: 16
    ef_construct: 100
    full_scan_threshold: 10000
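The layered descent described above can be sketched as a toy search routine. This is a minimal illustration, not Qdrant's implementation: graph construction is omitted, Euclidean distance is assumed, and `search_layer`, `hnsw_search`, and the `layers` structure (a list of adjacency dicts, top layer first) are hypothetical names.

```python
import heapq
import math


def search_layer(query, entry, vectors, neighbors, ef):
    """Best-first search on one layer; returns up to `ef` (distance, id) pairs."""
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap (negated distance): worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # every remaining candidate is farther than the worst result
        for nb in neighbors.get(node, ()):
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(query, vectors[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    return sorted((-d, n) for d, n in best)


def hnsw_search(query, vectors, layers, entry, ef, k):
    """Greedy descent through the upper layers, then an ef-wide search at the bottom."""
    for layer in layers[:-1]:
        entry = search_layer(query, entry, vectors, layer, ef=1)[0][1]
    return [n for _, n in search_layer(query, entry, vectors, layers[-1], ef)[:k]]
```

In this sketch a larger `ef` widens the result heap on the bottom layer, which is why raising it tends to trade speed for recall.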

Source: Indexing Documentation

3.2 Payload Index

Payload indexes work like indexes in a document database: each one is built for a specific field and type to accelerate filtering:

Type	Use Case	Filter Support
`keyword`	String matching	Match
`integer`	Numeric values	Match, Range
`float`	Decimal values	Range
`bool`	Boolean values	Match
`geo`	Geographic coordinates	Geo Bounding Box, Geo Radius
`datetime`	Timestamps	Range
`text`	Full-text search	Full Text Match
`uuid`	UUID optimization	Match

Special Index Types:
- Tenant Index: Optimizes multitenancy with is_tenant: true
- Principal Index: Optimizes time-based filtering with is_principal: true
- On-disk Index: Reduces memory with on_disk: true

Source: Indexing Documentation

3.3 Sparse Vector Index

For vectors with a high proportion of zeros (such as BM25 outputs):
- Exact search (no approximation)
- Works similarly to an inverted index
- Supports IDF modifier for relevance ranking

PUT /collections/{collection_name}
{
    "sparse_vectors": {
        "text": {
            "modifier": "idf",
            "index": { "on_disk": false }
        }
    }
}
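To make the inverted-index idea concrete, here is a small sketch of exact sparse dot-product search with an optional IDF weight. It is an assumption-laden toy: the `SparseIndex` class and its methods are hypothetical, and the BM25-style IDF formula used here is one common choice rather than a confirmed description of Qdrant internals.

```python
import math
from collections import defaultdict


class SparseIndex:
    """Toy inverted index: dimension index -> list of (point_id, value)."""

    def __init__(self):
        self.postings = defaultdict(list)
        self.n_points = 0

    def add(self, point_id, indices, values):
        for i, v in zip(indices, values):
            self.postings[i].append((point_id, v))
        self.n_points += 1

    def idf(self, index):
        # BM25-style IDF (assumed formula): rarer dimensions get larger weights.
        n = len(self.postings.get(index, ()))
        return math.log((self.n_points - n + 0.5) / (n + 0.5) + 1.0)

    def search(self, indices, values, limit, use_idf=False):
        # Only postings of the query's non-zero dimensions are visited,
        # and the resulting dot product is exact (no approximation).
        scores = defaultdict(float)
        for i, qv in zip(indices, values):
            weight = qv * (self.idf(i) if use_idf else 1.0)
            for point_id, dv in self.postings.get(i, ()):
                scores[point_id] += weight * dv
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
```

Points that share no non-zero dimension with the query never appear in `scores`, which is what keeps exact search affordable for very sparse data.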

Source: Indexing Documentation

3.4 Full-Text Index

For keyword search within string payloads:

Tokenizers:
- word: Split by spaces and punctuation
- whitespace: Split by spaces only
- prefix: Creates prefix index (h, he, hel, hell, hello)
- multilingual: Language-aware (charabia + vaporetto)

Features:
- Lowercasing (default on)
- ASCII folding (é → e)
- Stemming (Snowball algorithm)
- Stopwords filtering
- Phrase matching

PUT /collections/{collection_name}/index
{
    "field_name": "content",
    "field_schema": {
        "type": "text",
        "tokenizer": "word",
        "lowercase": true,
        "phrase_matching": true,
        "stemmer": { "type": "snowball", "language": "english" }
    }
}
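The behaviour of the word, whitespace, and prefix tokenizers can be approximated in a few lines. This is a rough sketch only: the function names and the `min_len`/`max_len` parameters are hypothetical, and the regular expression is an assumption about what "spaces and punctuation" means rather than Qdrant's actual tokenization rules.

```python
import re


def tokenize_whitespace(text, lowercase=True):
    """Split on whitespace only; punctuation stays attached to tokens."""
    tokens = text.split()
    return [t.lower() for t in tokens] if lowercase else tokens


def tokenize_word(text, lowercase=True):
    """Split on whitespace and punctuation by keeping runs of word characters."""
    tokens = re.findall(r"\w+", text)
    return [t.lower() for t in tokens] if lowercase else tokens


def tokenize_prefix(text, min_len=1, max_len=20, lowercase=True):
    """Emit every prefix of every word, e.g. h, he, hel, hell, hello."""
    prefixes = []
    for word in tokenize_word(text, lowercase):
        for n in range(min_len, min(len(word), max_len) + 1):
            prefixes.append(word[:n])
    return prefixes
```

The prefix variant multiplies the number of indexed tokens, which is the price paid for search-as-you-type style matching.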

Source: Indexing Documentation

3.5 Filtrable HNSW Index

Key Innovation: Extends HNSW graph with payload-based edges for efficient filtered search.

Problem: Under strict filters most nodes are excluded, so the standard HNSW graph becomes disconnected and the search can no longer reach all matching points.

Solution:
1. Add extra edges based on payload values
2. Use ACORN Search Algorithm for complex filter combinations
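The disconnection problem and the effect of payload-based edges can be demonstrated on a tiny graph. This is a deliberately simplified sketch: `reachable` and `add_payload_edges` are hypothetical helpers, and linking consecutive points that share a payload value is an assumption standing in for whatever linking strategy Qdrant actually uses. ACORN is not modelled here.

```python
from collections import deque


def reachable(entry, edges, allowed):
    """Breadth-first traversal that may only step onto nodes passing the filter."""
    if entry not in allowed:
        return set()
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for nb in edges.get(node, ()):
            if nb in allowed and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen


def add_payload_edges(edges, payload, key):
    """Return a copy of the graph with extra links between points sharing a payload value."""
    groups = {}
    for point_id in sorted(payload):
        groups.setdefault(payload[point_id][key], []).append(point_id)
    extended = {node: set(nbs) for node, nbs in edges.items()}
    for members in groups.values():
        for a, b in zip(members, members[1:]):
            extended.setdefault(a, set()).add(b)
            extended.setdefault(b, set()).add(a)
    return extended
```

For example, on a chain 0-1-2-3-4 where only even points are "red", a filter on red leaves point 0 isolated in the plain graph, while the extended graph keeps all red points connected.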

Source: Filtrable HNSW Article

4. Hybrid Search Capabilities

4.1 Query API with Prefetch

Enables multi-stage and hybrid queries through nested prefetch operations:

POST /collections/{collection_name}/points/query
{
    "prefetch": [
        { "query": {"indices": [1, 42], "values": [0.22, 0.8]}, "using": "sparse", "limit": 20 },
        { "query": [0.01, 0.45, 0.67, ...], "using": "dense", "limit": 20 }
    ],
    "query": { "fusion": "rrf" },
    "limit": 10
}

4.2 Fusion Methods

Method	Description	Formula
RRF	Reciprocal Rank Fusion	score(d) = Σ_r 1/(k + rank_r(d)), summed over the result lists r that contain d
DBSF	Distribution-Based Score Fusion	Normalizes using mean ± 3σ
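Both fusion methods in the table can be written out in a few lines. The sketch below rests on stated assumptions: ranks are 1-based, `k=60` is a value commonly used in the literature and not necessarily Qdrant's default, DBSF uses the population standard deviation, and the function names `rrf` and `dbsf` are hypothetical.

```python
from statistics import mean, pstdev


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of document IDs."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):  # 1-based rank (assumption)
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))


def dbsf(score_lists):
    """Distribution-Based Score Fusion over several {doc: raw_score} dicts."""
    fused = {}
    for scores in score_lists:
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        lo, hi = mu - 3 * sigma, mu + 3 * sigma  # normalization limits: mean ± 3σ
        for doc, s in scores.items():
            if hi == lo:
                norm = 0.5  # all scores identical: no spread to normalize
            else:
                norm = min(1.0, max(0.0, (s - lo) / (hi - lo)))
            fused[doc] = fused.get(doc, 0.0) + norm
    return sorted(fused.items(), key=lambda kv: (-kv[1], str(kv[0])))
```

RRF ignores raw scores entirely and only looks at positions, whereas DBSF keeps score information but rescales each source first, so sources with very different score ranges contribute comparably.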

4.3 Multi-Stage Queries

Coarse-to-fine search strategies:
1. Quantized → Full precision: Use compressed vectors first, refine with full vectors
2. MRL (Matryoshka): Short vector candidates, long vector refinement
3. Dense → Multi-vector: Dense prefetch, ColBERT re-ranking

POST /collections/{collection_name}/points/query
{
    "prefetch": {
        "query": [1, 23, 45, 67],  // small byte vector
        "using": "mrl_byte",
        "limit": 1000
    },
    "query": [0.01, 0.299, 0.45, 0.67, ...],  // full vector
    "using": "full",
    "limit": 10
}
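The coarse-to-fine pattern behind this request can be sketched in plain Python. The example assumes the MRL case, where the short vector is simply a prefix of the full vector, and scores by dot product; `two_stage_search`, `top_k`, and their parameters are hypothetical names chosen to mirror the prefetch `limit` and the outer `limit`.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, candidates, vectors, k):
    """Rank candidate IDs by dot product with the query, best first."""
    ranked = sorted(candidates, key=lambda pid: (-dot(query, vectors[pid]), pid))
    return ranked[:k]


def two_stage_search(query, full_vectors, short_dim, prefetch_limit, limit):
    """Stage 1: cheap search on truncated vectors. Stage 2: re-score with full vectors."""
    short_vectors = {pid: v[:short_dim] for pid, v in full_vectors.items()}
    candidates = top_k(query[:short_dim], list(short_vectors), short_vectors, prefetch_limit)
    return top_k(query, candidates, full_vectors, limit)
```

The prefetch limit controls the trade-off: a point that the coarse stage drops can never be recovered by the refinement stage, so the first stage is usually asked for many more candidates than the final limit.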

4.4 Additional Query Features

Feature	Description	Version
MMR	Maximal Marginal Relevance for diversity	v1.15.0+
Score Boosting	Custom formulas with payload values	v1.14.0+
Time Decay	exp_decay, gauss_decay, lin_decay functions	v1.14.0+
Grouping	Group results by payload field	v1.11.0+
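As an illustration of the MMR row, the greedy selection behind Maximal Marginal Relevance can be sketched as follows. This is a generic textbook formulation under stated assumptions: cosine similarity is used for both relevance and redundancy, and the `mmr` function with its `diversity` weight (0 means pure relevance) is a hypothetical stand-in, not Qdrant's implementation.

```python
import math


def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm


def mmr(query, vectors, candidates, limit, diversity=0.5):
    """Greedily pick points that are relevant to the query but not redundant."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < limit:

        def score(pid):
            relevance = cosine(query, vectors[pid])
            # Redundancy: similarity to the closest already-selected point.
            redundancy = max(
                (cosine(vectors[pid], vectors[s]) for s in selected), default=0.0
            )
            return (1.0 - diversity) * relevance - diversity * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a near-duplicate pair in the candidate set, a higher `diversity` value causes the second pick to skip the duplicate in favour of a less similar point.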

Source: Hybrid Queries Documentation

5. Data Integrity

Write-Ahead Log (WAL)

All changes are written to the WAL before they are applied to segments:
1. Operations are assigned sequential numbers
2. Segments store the last applied version
3. Versions are tracked per point
4. Safe recovery from abnormal shutdown
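The recovery logic implied by these four points can be sketched as a replay loop that skips operations a segment has already applied. This is a toy model with hypothetical names (`WriteAheadLog`, `VersionedSegment`, `replay`); the real WAL is an on-disk structure and is not modelled here.

```python
class WriteAheadLog:
    """Append-only list of (sequence_number, operation) entries."""

    def __init__(self):
        self.entries = []

    def append(self, op):
        seq = len(self.entries) + 1  # sequential operation number
        self.entries.append((seq, op))
        return seq


class VersionedSegment:
    def __init__(self):
        self.points = {}
        self.point_versions = {}  # per-point version tracking
        self.version = 0          # last applied operation number

    def apply(self, seq, op):
        kind, point_id, value = op
        if self.point_versions.get(point_id, 0) >= seq:
            return False  # this point already reflects the operation
        if kind == "upsert":
            self.points[point_id] = value
        elif kind == "delete":
            self.points.pop(point_id, None)
        self.point_versions[point_id] = seq
        self.version = max(self.version, seq)
        return True


def replay(wal, segment):
    """Re-apply the log after a crash; returns how many operations were applied."""
    return sum(1 for seq, op in wal.entries if segment.apply(seq, op))
```

Because already-applied operations are skipped, replaying the whole log is idempotent, which is what makes recovery after an abnormal shutdown safe in this model.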

Source: Storage Documentation

6. Distributed Architecture

Raft Consensus

Used for cluster coordination and consistency:
- Leader election
- Log replication
- Fault tolerance

Sharding & Replication

Feature	Description
Sharding	Horizontal data partitioning
Replication	Multiple copies for fault tolerance
Consistent Hashing	Even data distribution
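The consistent hashing row can be illustrated with a minimal hash ring. This is a generic sketch, not a description of Qdrant's shard routing: the `HashRing` class, the use of SHA-256, and the virtual-node count are all assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(key):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, shards, vnodes=64):
        self.vnodes = vnodes  # virtual nodes per shard smooth out the distribution
        self.ring = []        # sorted list of (hash, shard)
        self.keys = []
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard):
        for v in range(self.vnodes):
            self.ring.append((stable_hash(f"{shard}#{v}"), shard))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    def shard_for(self, point_id):
        # Walk clockwise to the first virtual node after the point's hash.
        i = bisect.bisect(self.keys, stable_hash(point_id)) % len(self.ring)
        return self.ring[i][1]
```

The useful property is that adding a shard only moves points onto the new shard; points never shuffle between the existing shards, which keeps rebalancing traffic proportional to the new shard's share.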

Source: Distributed Deployment Guide

7. Vector Quantization

Reduces memory footprint and improves search speed:

Method	Description	Trade-off
Scalar Quantization	Float32 → Int8	4x compression, minimal accuracy loss
Product Quantization (PQ)	Vector → subspace codes	Higher compression, more accuracy loss
Binary Quantization	Float32 → 1-bit	Maximum compression (32x), largest accuracy loss
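The scalar and binary rows of the table can be worked through with a small example. This sketch makes simplifying assumptions: bounds are taken per vector from its own min and max (a production system would more likely derive them from the data distribution), codes are unsigned 8-bit values, and all function names are hypothetical.

```python
def scalar_quantize(vector):
    """Map each float to one of 256 levels (one byte instead of four)."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def scalar_dequantize(codes, lo, scale):
    """Approximate reconstruction; error per component is at most scale / 2."""
    return [lo + c * scale for c in codes]


def binary_quantize(vector):
    """Keep only the sign of each component (one bit instead of 32)."""
    return [1 if x > 0 else 0 for x in vector]


def hamming(a, b):
    """Number of differing bits, a cheap proxy for distance between binary codes."""
    return sum(x != y for x, y in zip(a, b))
```

The bounded reconstruction error is why scalar quantization usually costs little accuracy, while binary codes discard magnitudes entirely and are typically paired with a re-scoring step on the original vectors.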

Source: Vector Quantization Article

8. 2025 Features & Ecosystem

Recent Updates

Feature	Version	Description
Sparse Vectors	v1.7.0+	Native sparse vector support with IDF
Hybrid Queries	v1.10.0+	Prefetch-based multi-stage queries
Tenant Index	v1.11.0+	Multitenancy optimization
Score Boosting	v1.14.0+	Custom ranking formulas
MMR	v1.15.0+	Diversity in results
Parametrized RRF	v1.16.0+	Configurable fusion constant

Ecosystem

Component	Description
FastEmbed	Lightweight embedding library
Qdrant MCP Server	Model Context Protocol integration
Qdrant Cloud	Managed service with free tier
Web UI	Built-in data exploration interface

9. Comparison with Other Vector Databases

Feature	Qdrant	Chroma	Weaviate	Milvus
Language	Rust	Rust (v1.0)	Go	Go, C++
Index	HNSW only	HNSW	HNSW + Flat	IVF, HNSW, DiskANN
Sparse Vectors	Native	No	No	Yes
Hybrid Search	RRF, DBSF	No	BM25 + Vector	Yes
Filtering	Pre-filtering (Filtrable HNSW)	Post-filtering	Pre-filtering	Post-filtering
Quantization	Scalar, PQ, Binary	No	PQ, BQ	Various
Multitenancy	Native (Tenant Index)	No	Native	Partition-based

10. Key Technical Specifications

Metric	Value
Language	Rust
Vector Index	HNSW (single algorithm, highly optimized)
Distance Metrics	Cosine, Dot Product, Euclidean, Manhattan
Max Dimensions	Up to 65,535 for dense vectors (practical limit by memory)
Persistence	WAL + Snapshots
Storage Options	In-memory, Memmap, On-disk
SDKs	Python, TypeScript, Rust, Go, Java, C#

11. Key Takeaways

Filtrable HNSW: Unique approach to filtered vector search by extending HNSW with payload-based edges
Rust Performance: Written in Rust for memory safety and performance
Flexible Storage: In-memory for speed, Memmap for scale, configurable per collection
Hybrid Search: Native support for combining dense + sparse vectors with RRF/DBSF fusion
Multi-Stage Queries: Prefetch architecture enables coarse-to-fine search patterns
Production-Ready: WAL for durability, Raft for distributed consensus, quantization for scale
Multitenancy: Native tenant index optimization for SaaS applications
Full-Text Search: Built-in tokenization, stemming, and phrase matching
Score Boosting: Custom ranking formulas using payload values and decay functions
MCP Integration: Official MCP server for AI agent integration
