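Once the item embeddings have been computed offline, they must be stored in an Approximate Nearest Neighbor (ANN) index so that the items most similar to a query embedding can be retrieved quickly at serving time. The tools for ANN search come in two main flavors: vector index libraries and vector databases.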
Vector index libraries are highly optimized libraries designed for fast nearest-neighbor search:
- FAISS (Facebook AI Similarity Search): Built for large-scale search, with several index types and optional GPU acceleration.
- ScaNN (Scalable Nearest Neighbors, by Google): Uses vector quantization tuned for fast inner-product search.
👉 Use Case: When speed is the highest priority, such as real-time product recommendations.
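
To make the "approximate" part concrete, here is a minimal pure-Python sketch of an inverted-file (IVF) style index, one of the ideas that libraries such as FAISS build on. It is a toy illustration, not the API of FAISS or ScaNN: the `ToyIVFIndex` class, its `nprobe` parameter, and the random data are all assumptions made for this example, and the centroids are simply sampled from the items rather than trained with k-means.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


class ToyIVFIndex:
    """Toy IVF index: items are bucketed by their nearest centroid,
    and a query scans only the `nprobe` closest buckets."""

    def __init__(self, centroids):
        self.centroids = [normalize(c) for c in centroids]
        self.buckets = [[] for _ in self.centroids]  # lists of (item_id, vector)

    def _ranked_buckets(self, vector):
        # Bucket ids ordered from most to least similar centroid.
        scores = [dot(vector, c) for c in self.centroids]
        return sorted(range(len(scores)), key=lambda i: -scores[i])

    def add(self, item_id, vector):
        vector = normalize(vector)
        nearest = self._ranked_buckets(vector)[0]
        self.buckets[nearest].append((item_id, vector))

    def search(self, query, k, nprobe=1):
        query = normalize(query)
        candidates = []
        for b in self._ranked_buckets(query)[:nprobe]:
            candidates.extend(self.buckets[b])
        # Rank only the candidates by cosine similarity (vectors are unit length).
        candidates.sort(key=lambda pair: -dot(query, pair[1]))
        return [item_id for item_id, _ in candidates[:k]]


rng = random.Random(0)
dim = 8
items = {i: [rng.gauss(0, 1) for _ in range(dim)] for i in range(200)}

# Assumption: centroids are sampled from the items; real libraries train them.
index = ToyIVFIndex([items[i] for i in rng.sample(range(200), 8)])
for item_id, vec in items.items():
    index.add(item_id, vec)

query = items[42]
approx_top5 = index.search(query, k=5, nprobe=2)  # scans 2 of 8 buckets
exact_top5 = index.search(query, k=5, nprobe=8)   # scans every bucket (exact)
```

Scanning fewer buckets is faster but can miss true neighbors that landed in an unprobed bucket; raising `nprobe` trades speed back for recall. That speed/recall trade-off is the knob these libraries expose in one form or another.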
Vector databases are full-fledged databases that store embeddings along with metadata and provide advanced filtering and indexing:
- Qdrant: Open-source and production-ready.
- Hopsworks: A feature store platform with built-in vector similarity search.
- MongoDB (with vector search): Combines vector search with traditional database features.
👉 Use Case: When we need long-term storage, metadata filtering, and scalable querying (e.g., filtering by category or price).
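
The sketch below illustrates what metadata filtering adds on top of similarity search. It is a brute-force, in-memory stand-in written for this example, not the client API of Qdrant, Hopsworks, or MongoDB: the `Item` record, the `filtered_search` function, and the tiny catalog are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    embedding: list
    category: str
    price: float


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(items, query, k, category=None, max_price=None):
    """Apply the metadata filter first, then rank the survivors by similarity."""
    candidates = [
        it for it in items
        if (category is None or it.category == category)
        and (max_price is None or it.price <= max_price)
    ]
    candidates.sort(key=lambda it: cosine(query, it.embedding), reverse=True)
    return [it.item_id for it in candidates[:k]]


# Hypothetical catalog with 2-dimensional embeddings for readability.
catalog = [
    Item("shoe-1", [0.9, 0.1], "shoes", 120.0),
    Item("shoe-2", [0.8, 0.2], "shoes", 60.0),
    Item("shirt-1", [0.85, 0.15], "shirts", 30.0),
    Item("shoe-3", [0.1, 0.9], "shoes", 45.0),
]

# Shoes under 100, ranked by similarity to the query embedding.
results = filtered_search(catalog, query=[1.0, 0.0], k=2,
                          category="shoes", max_price=100.0)
print(results)  # ['shoe-2', 'shoe-3']
```

Note that `shoe-1` is the closest item overall but is excluded by the price filter. A real vector database would typically evaluate such filters together with an ANN index instead of scanning every item as this sketch does.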