Introduction
The rise of AI applications — from semantic search and recommender systems to Retrieval-Augmented Generation (RAG) for LLMs — has driven a surge of interest in vector databases. These systems store high-dimensional embedding vectors (numeric representations of data like text or images) and support fast similarity search, enabling queries for “nearest” or most semantically similar items. In response, two approaches have emerged: purpose-built vector databases designed from the ground up for this workload, and traditional databases augmented with vector search capabilities. This report surveys both categories, detailing key systems in each, their specialties, indexing methods for similarity search, performance and scalability, ecosystem integrations, pros/cons, and ideal use cases.
Modern Purpose-Built Vector Databases
Modern vector databases are specialized systems optimized for storing embeddings and performing k-nearest-neighbor (kNN) searches at scale. They typically implement advanced Approximate Nearest Neighbor (ANN) algorithms such as HNSW and IVF, support metadata filtering alongside vector queries, and often allow hybrid queries that combine vector similarity with keyword search. Below we list prominent vector databases and their characteristics.
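To ground the terminology: kNN search finds the k stored vectors most similar to a query vector, and ANN algorithms approximate this result while skipping the full linear scan. A minimal, exact brute-force baseline (the thing HNSW and IVF approximate) looks like this; the data and names are illustrative, not tied to any particular database:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def knn(query, vectors, k=3):
    """Exact k-nearest-neighbor search by brute force.

    ANN indexes like HNSW or IVF approximate this ranking
    while avoiding the full O(n) scan over every vector.
    """
    scored = [(vec_id, cosine_similarity(query, vec))
              for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# toy 3-dimensional "embeddings"
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.1],
    "doc_c": [0.8, 0.2, 0.1],
}
print(knn([1.0, 0.0, 0.0], docs, k=2))  # doc_a ranks first
```

Every system below trades a little recall against this exact answer in exchange for sub-linear query time.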
🧠 Pinecone → pinecone.io
Pinecone is a fully managed vector database built for ease, speed, and scale. You push vectors, query for similarity, and Pinecone takes care of the infrastructure behind the scenes. It’s a cloud-native, enterprise-grade service often chosen for its convenience and integration-ready design.
- Proprietary indexing layer based on HNSW + infrastructure enhancements
- Support for dot product, cosine, and Euclidean similarity
- Metadata filtering
- Two deployment modes: serverless (autoscaling) and dedicated pods (manual tuning)
- Vector and sparse vector hybrid search (dense + keyword)
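Conceptually, hybrid search blends a dense (semantic) similarity score with a sparse (keyword) score. The sketch below is an illustration of that idea, not Pinecone's actual scoring internals; the `alpha` weighting and dict-based sparse vectors are assumptions for the example:

```python
def dot(a, b):
    """Dot product of two dense vectors."""
    return sum(x * y for x, y in zip(a, b))

def sparse_dot(a, b):
    """Dot product of sparse vectors given as {term_id: weight} dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha=0.5):
    """Blend dense and sparse relevance with weight alpha.

    alpha=1.0 is pure dense (semantic) search,
    alpha=0.0 is pure sparse (keyword) search.
    """
    return alpha * dot(q_dense, d_dense) + (1 - alpha) * sparse_dot(q_sparse, d_sparse)

# a doc that matches both semantically and on the keyword "vector"
q_dense, q_sparse = [1.0, 0.0], {"vector": 1.0}
s1 = hybrid_score(q_dense, q_sparse, [0.9, 0.1], {"vector": 0.8}, alpha=0.5)
s2 = hybrid_score(q_dense, q_sparse, [0.2, 0.9], {}, alpha=0.5)
print(s1, s2)  # the first doc scores higher
```

The practical benefit: exact keyword matches (product codes, names) still surface even when embeddings alone would miss them.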
Pinecone performs well on high-volume workloads, especially with dedicated pods. It scales horizontally with replicas and can handle millions of vectors per index. Benchmarks show slightly lower recall than self-hosted options but strong QPS performance. Serverless mode may introduce latency or pricing trade-offs for some workloads.
- LangChain, Hugging Face, OpenAI embeddings
- Python/JS SDKs
- REST API
Perfect for teams that need to stand up a semantic search or memory backend for a chatbot quickly. Used widely in:
- Semantic document search
- RAG-based assistants
- Personalized content retrieval
- Production search systems at scale with minimal DevOps
| Advantages | Weaknesses |
|---|---|
| Fully managed and scalable | Proprietary and closed source |
| Zero infrastructure burden | Limited algorithm customization |
| Integrated metadata filters | Cost may scale fast with volume |
| Hybrid search (sparse + dense) | Less transparency on internals |
| Strong ecosystem integrations | Not suitable for on-prem deployments |
If you need to move fast, especially in a startup or product prototyping environment, Pinecone makes vector search seamless. But teams with strict data policies or seeking fine-grained tuning might feel boxed in. Still, it’s one of the strongest players in the managed space.
🧩 Weaviate → weaviate.io
Weaviate is a robust, open-source vector database written in Go, built with developer experience and hybrid search in mind. It supports both semantic and symbolic search natively, with GraphQL or REST APIs for interaction. It’s one of the most extensible solutions available.
- HNSW-based ANN with support for metadata filtering
- BM25 and keyword hybrid search
- Modular architecture with built-in vectorization via OpenAI, Hugging Face, etc.
- GraphQL query interface
- Aggregations and filtered vector search
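Filtered vector search means restricting candidates by metadata before (or while) ranking by similarity. A minimal pre-filtering sketch of the concept, with hypothetical field names, not Weaviate's implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def filtered_search(query, docs, where, k=3):
    """Keep only docs whose metadata matches `where`, then rank by similarity."""
    candidates = [d for d in docs
                  if all(d["meta"].get(key) == val for key, val in where.items())]
    candidates.sort(key=lambda d: cosine(query, d["vector"]), reverse=True)
    return [d["id"] for d in candidates[:k]]

docs = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"department": "legal"}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"department": "hr"}},
    {"id": "c", "vector": [0.5, 0.5], "meta": {"department": "legal"}},
]
print(filtered_search([1.0, 0.0], docs, {"department": "legal"}, k=2))
```

Real engines push the filter into the ANN index itself so that filtering stays fast on large corpora, which is harder than this pre-filter suggests.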
Weaviate delivers fast query latencies on medium-to-large corpora and scales horizontally via sharding and replication in Kubernetes. It can handle billions of vectors, though indexing time and RAM usage may spike at large scale. The index is memory-resident, with disk-based search planned for a future release.
- LangChain, Hugging Face
- Built-in text2vec modules
- REST & GraphQL APIs
Weaviate fits best when you need both semantic and keyword relevance:
- Enterprise document search
- Scientific research assistants
- QA bots with filterable content (e.g., by department)
- Search + filtering in multi-tenant SaaS products
| Advantages | Weaknesses |
|---|---|
| Hybrid search built-in | Requires Kubernetes to scale |
| Built-in vectorization modules | RAM-hungry for large datasets |
| Modular, open-source design | GraphQL may feel verbose |
| Strong community and docs | Sharded setups can be complex |
| Real-time updates and aggregation | Indexes are memory-resident only (for now) |
Weaviate is arguably the most feature-complete open-source vector DB right now. If you’re okay with running Kubernetes, it’s powerful and extensible. But you’ll want to budget for infrastructure and memory if your dataset is large.
🏗 Milvus → milvus.io
Milvus is a production-grade, highly scalable vector database developed by Zilliz. It’s known for supporting a wide range of indexing strategies and scaling to billions of vectors. Built in C++ and Go, it’s suitable for heavy-duty vector infrastructure.
- Support for IVF, PQ, HNSW, DiskANN
- Dynamic vector collections
- CRUD operations and filtering
- Disk-based and in-memory indexing
- Horizontal scalability via Kubernetes
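IVF (inverted file) indexing, one of the strategies Milvus supports, partitions vectors into clusters around centroids; at query time only the `nprobe` nearest clusters are scanned. A toy sketch of the idea with hand-picked centroids (real IVF learns them via k-means), not Milvus's implementation:

```python
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        cell = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        lists[cell].append(vec_id)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1, k=2):
    """Scan only the nprobe cells nearest to the query, not the whole set."""
    cells = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))[:nprobe]
    candidates = [vid for c in cells for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {"a": [1.0, 1.0], "b": [0.0, 2.0], "c": [9.0, 9.0], "d": [11.0, 10.0]}
lists = build_ivf(vectors, centroids)
print(ivf_search([0.2, 1.8], vectors, centroids, lists, nprobe=1))
```

Raising `nprobe` trades speed for recall, which is exactly the tuning knob Milvus exposes for its IVF index types.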
Milvus is made for scale: it ingests vectors at high throughput and scales out via sharded microservices to billion-vector collections. The cost is complexity: managing Milvus at scale demands orchestration, memory, and storage planning.
- LangChain, pymilvus, REST
- External embeddings: OpenAI, HF, Cohere
Best suited for:
- High-scale recommendation systems
- Vision similarity search (large image corpora)
- Streaming data indexing
- Platforms requiring vector + scalar search
| Advantages | Weaknesses |
|---|---|
| Handles billion-scale vector data | Complex to deploy and manage |
| Multiple index types available | High memory usage in some modes |
| Active open-source community | Microservice architecture requires tuning |
| Disk-based support for large sets | Needs Kubernetes and cluster knowledge |
| Strong filtering and CRUD ops | APIs are less ergonomic for beginners |
Milvus is the workhorse of vector DBs. If you’re building infrastructure with billions of vectors and demand flexibility, it’s a great choice. But know that you’ll need ops investment to run it at its best.
🚀 Qdrant → qdrant.tech
Qdrant is a fast, open-source Rust-based vector DB focused on performance and simplicity. It’s memory-efficient, filter-friendly, and can now perform hybrid search. One of the fastest-growing players with a rich feature roadmap.
- HNSW with memory mapping
- Payload filtering and geo support
- Scalar and binary quantization (RAM efficient)
- Hybrid search (sparse + dense)
- Raft-based clustering and durability
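Scalar quantization, which Qdrant uses for RAM efficiency, maps each float32 dimension to a small integer code, cutting memory roughly 4x (one byte per dimension instead of four) at the cost of some precision. A conceptual sketch of uniform 8-bit quantization, not Qdrant's exact scheme:

```python
def quantize(vec, bits=8):
    """Map floats onto integer codes in [0, 2**bits - 1].

    Stores one byte per dimension instead of four (float32),
    trading a little precision for ~4x less memory.
    """
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / (2 ** bits - 1) or 1.0  # avoid div-by-zero on constant vectors
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

original = [0.1, -0.5, 0.9]
codes, lo, scale = quantize(original)
restored = dequantize(codes, lo, scale)
print(codes, restored)  # restored values are close to, not equal to, the originals
```

Binary quantization pushes the same trade further (one bit per dimension), which is why Qdrant often pairs quantized search with a rescoring pass over the original vectors.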
Qdrant is a benchmark leader in QPS and latency for dense vector queries. Its Rust-based design keeps memory usage low, and it scales easily with horizontal partitioning. Recent updates add disk-based storage for massive collections.
- LangChain, Hugging Face
- Python/JS SDKs, REST/gRPC
- WASM compile target (experimental)
Perfect fit for:
- AI assistant document memory
- Similarity search on e-commerce platforms
- High-performance recommendation engines
- RAG pipelines on moderate to large corpora
| Advantages | Weaknesses |
|---|---|
| Top benchmark performance | Smaller ecosystem than Elastic or PG |
| Low memory and CPU footprint | Filtering & sparse search newer features |
| Easy deployment & config | All vector generation must be external |
| Hybrid search support | No built-in SQL-like query language |
| Active open-source roadmap | Some advanced features still maturing |
Qdrant is what you reach for when you need speed and resource efficiency without giving up on filtering or flexibility. It’s well-engineered, developer-friendly, and growing rapidly. Ideal for modern, performance-conscious AI applications.
📦 Chroma → trychroma.com
Chroma is an open-source, developer-friendly vector store focused on local-first, embedded use cases. It is designed to make integrating semantic memory into your applications as simple as possible.
- Embedded Python or Node.js library
- Powered by hnswlib and DuckDB/ClickHouse under the hood
- Automatic persistence and simple API
- Optional vector compression
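The embedded, local-first pattern Chroma provides: an in-process store that persists to local disk and answers similarity queries without any server. The toy class below illustrates that pattern only; it is not Chroma's API, and the class and file names are invented for the example (Chroma adds collections, metadata, and ANN indexing on top of this idea):

```python
import json
import math
import os
import tempfile

class TinyVectorStore:
    """A toy embedded vector store: add, persist to JSON, query by cosine.

    Illustrates the local-first model; a real embedded store like
    Chroma uses an ANN index rather than a full scan.
    """
    def __init__(self, path):
        self.path = path
        self.records = {}
        if os.path.exists(path):  # reload previously persisted vectors
            with open(path) as f:
                self.records = json.load(f)

    def add(self, doc_id, vector):
        self.records[doc_id] = vector
        with open(self.path, "w") as f:  # persist on every write
            json.dump(self.records, f)

    def query(self, vector, k=3):
        def cos(a, b):
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return sum(x * y for x, y in zip(a, b)) / (na * nb)
        ranked = sorted(self.records,
                        key=lambda d: cos(vector, self.records[d]), reverse=True)
        return ranked[:k]

# demo: write, query, then reopen from disk
path = os.path.join(tempfile.mkdtemp(), "store.json")
store = TinyVectorStore(path)
store.add("doc_a", [1.0, 0.0])
store.add("doc_b", [0.0, 1.0])
print(store.query([0.9, 0.1], k=1))  # doc_a ranks first
```

Everything lives in the application process and a single file, which is exactly why this model suits notebooks, demos, and edge devices but not horizontally scaled services.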
Chroma is optimized for ease and rapid prototyping. It is best suited for use cases that can run on a single node. Query speed is excellent for small to mid-sized datasets due to in-memory hnswlib usage.
- LangChain, LlamaIndex
- Hugging Face embeddings or OpenAI
- Python and JS SDKs
Best suited for:
- LLM memory store for chatbots
- Local semantic search in personal tools
- Offline or edge-device AI search
- Hackathons, demos, notebooks
| Advantages | Weaknesses |
|---|---|
| Extremely easy to use | No horizontal scaling support |
| Embedded and zero-setup | Limited production readiness for large scale |
| Fast local query latency | Not optimized for massive concurrency |
| Open source, permissive license | Few enterprise features (security, clustering) |
Chroma is your go-to for rapid development and low-friction experimentation. It’s not built to scale to billions of vectors, but for local AI applications, it’s a joy to work with.
See you in part 2, where we'll cover traditional databases with vector search support.