Deep Dive Guide to Building and Optimizing Qdrant Filtered HNSW Indexes: Sharding, Replication, and Scoring Strategies for High-Performance Search

Do you ever need to quickly find similar items filtered by specific attributes within a vast dataset? Qdrant is a powerful tool for this, but without proper setup and optimization, it cannot reach its full potential. In this guide, we will delve into how to build and optimize filtered HNSW (Hierarchical Navigable Small World) indexes using sharding, replication, and scoring strategies.

1. The Challenge / Context

The recent surge in applications built on Large Language Models (LLMs) has made it more important to store and retrieve vector embeddings efficiently. In particular, similarity search in high-dimensional vector spaces combined with user-defined metadata filters has become a common requirement. As a simple example, consider an e-commerce search engine looking for "products in a specific category listed within the last 3 months." Scanning every vector is inefficient and leads to significant latency. Traditional relational databases are not optimized for vector search, and a vector index that treats filtering as an afterthought struggles to stay fast once filters are applied. Both problems get worse as the data grows.

2. Deep Dive: Qdrant and Filtered HNSW Indexes

Qdrant is an open-source vector database built for similarity search. Its filterable HNSW index is a core feature that supports fast vector search under metadata filtering conditions. HNSW is a proximity-graph index that offers fast approximate nearest-neighbor search with high recall, at the cost of extra memory for the graph links. Qdrant layers metadata (payload) filtering on top of this index so that only vectors meeting the given conditions are returned. The filter is taken into account during the search itself rather than applied to a finished result list, which avoids unnecessary vector comparisons and keeps matching items from being crowded out by non-matching ones. Furthermore, Qdrant supports data distribution and high availability through sharding and replication, which helps keep performance stable as datasets grow.
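To see why the filter has to take part in candidate selection, the following minimal sketch compares two brute-force strategies on a toy dataset. It is an illustration only: it does not use HNSW or Qdrant, and the function names, the point layout, and the sample data are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def post_filter_search(points, query, predicate, k):
    """Pick the k most similar points first, then drop those failing the filter."""
    ranked = sorted(points, key=lambda p: cosine(p["vector"], query), reverse=True)
    return [p["id"] for p in ranked[:k] if predicate(p["payload"])]

def filtered_search(points, query, predicate, k):
    """Consider only points passing the filter, then pick the k most similar."""
    candidates = [p for p in points if predicate(p["payload"])]
    ranked = sorted(candidates, key=lambda p: cosine(p["vector"], query), reverse=True)
    return [p["id"] for p in ranked[:k]]

# Hypothetical toy data: the two points closest to the query are not books.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"category": "shoes"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"category": "shoes"}},
    {"id": 3, "vector": [0.8, 0.2], "payload": {"category": "books"}},
    {"id": 4, "vector": [0.0, 1.0], "payload": {"category": "books"}},
]
query = [1.0, 0.0]

def is_book(payload):
    return payload["category"] == "books"

print(post_filter_search(points, query, is_book, k=2))  # no books survive the cut
print(filtered_search(points, query, is_book, k=2))     # both books are returned
```

Filtering after the top-k cut can return fewer than k results, or none at all, even though matching items exist. A filter-aware index aims for the behavior of the second function without scanning every candidate.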

3. Step-by-Step Guide / Implementation

Now, let's look at how to build and optimize a filtered HNSW index in Qdrant step-by-step. In this example, we will use a simple product dataset and implement filtered search based on category and price range.

Step 1: Setting up Qdrant Client and Creating a Collection

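First, install the Qdrant Python client (pip install qdrant-client) and connect to a Qdrant instance. The code below uses an in-memory local instance for testing; to connect to a Qdrant cluster that is already running, use the commented-out host/port line instead.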


from qdrant_client import QdrantClient, models
from qdrant_client.models import (
    VectorParams,
    Distance,
    PointStruct,
    Filter,
    FieldCondition,
    Range,
    CreateCollection,
    HnswConfigDiff,
    OptimizersConfigDiff
)

client = QdrantClient(":memory:")  # Use an in-memory local Qdrant instance (for testing)
# Or:
# client = QdrantClient(host="localhost", port=6333)  # Connect to a running Qdrant instance

collection_name = "products"

client.recreate_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(size=128, distance=Distance.COSINE),  # Vector size and distance metric
    hnsw_config=HnswConfigDiff(m=16, ef_construct=100, full_scan_threshold=10000),  # Tune HNSW parameters (optional)
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),  # Tune optimizer parameters (optional)
    replication_factor=2,  # Replication factor (optional)
    shard_number=2  # Number of shards (optional)
)
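
As a rough sanity check on the numbers used in this configuration, the sketch below converts the two thresholds into approximate vector counts and computes the total number of shard replicas. It assumes that both thresholds are expressed in kilobytes of vector data (as Qdrant's configuration reference describes) and that vectors are stored as 4-byte floats; the helper names are hypothetical and not part of qdrant_client.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_KB = 1024

def vectors_for_threshold(threshold_kb: int, dim: int) -> int:
    """Approximate count of float32 vectors of size `dim` that fit in `threshold_kb`."""
    bytes_per_vector = dim * BYTES_PER_FLOAT32
    return (threshold_kb * BYTES_PER_KB) // bytes_per_vector

def total_shard_replicas(shard_number: int, replication_factor: int) -> int:
    """Each shard is stored `replication_factor` times across the cluster."""
    return shard_number * replication_factor

dim = 128  # matches VectorParams(size=128)

print(vectors_for_threshold(20000, dim))  # indexing_threshold -> 40000 vectors
print(vectors_for_threshold(10000, dim))  # full_scan_threshold -> 20000 vectors
print(total_shard_replicas(2, 2))         # shard_number=2, replication_factor=2 -> 4
```

Under these assumptions, a 128-dimensional vector occupies 512 bytes, so each kilobyte of threshold corresponds to two vectors, and the cluster has to hold four shard replicas in total.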
    

VectorParams defines the vector size and distance metric. HnswConfigDiff is used to adjust HNSW index parameters. m is the maximum number of neighbors connected in each layer, ef_construct is the search scope during index construction, and full_scan_threshold is the threshold for using a full