Vector Database Comparison: Pinecone, Weaviate, ChromaDB, pgvector
By Diesel
tools, vector-databases, comparison, infrastructure
Every RAG tutorial picks a vector database in the first paragraph and never explains why. "We'll use Pinecone for this tutorial." Cool. But why? And would ChromaDB have been fine? Would pgvector in the PostgreSQL you already run have been better?
I've used all four in production. The answer is always about your constraints, never about which database is "best."
## The Four Contenders
**Pinecone**: Fully managed cloud service. You don't run anything. They handle scaling, replication, indexing. You send vectors, you query vectors, you pay the bill.
**Weaviate**: Open-source with a managed cloud option. Runs as a standalone service. Does more than vector search, including built-in vectorization, hybrid search, and generative modules.
**ChromaDB**: Open-source, embeddable. Runs in-process or as a lightweight server. The SQLite of vector databases. Minimal configuration, gets you started in minutes.
**pgvector**: A PostgreSQL extension. Not a separate database. Your vectors live in the same PostgreSQL instance as your relational data. One database, one connection, one backup strategy. The related post on [building agent memory on top of them](/blog/building-agent-memory-vector-databases) goes further on this point.
## Pinecone: Pay for Simplicity
```python
from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("my-index")

# Upsert vectors ("embedding" is a list of floats from your embedding model)
index.upsert(vectors=[
    {"id": "doc1", "values": embedding, "metadata": {"source": "manual"}},
])

# Query with a pre-computed query embedding
results = index.query(vector=query_embedding, top_k=10, include_metadata=True)
```
Pinecone's value proposition is zero ops. No servers to manage, no indexes to tune, no capacity planning. You create an index, pick a metric (cosine, euclidean, dot product), and start querying. It scales horizontally. It handles billions of vectors. You never think about the infrastructure.
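The metric choice is worth understanding before you commit to it, because it's fixed at index creation. Here is a plain-Python sketch of what each metric computes (illustrative only; Pinecone does this server-side):

```python
import math

def cosine_similarity(a, b):
    # Angle-based: ignores vector magnitude, the usual choice for text embeddings
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line distance: smaller means more similar
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Magnitude-sensitive: equals cosine similarity when vectors are unit-normalized
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [1.0, 1.0]
print(round(cosine_similarity(a, b), 4))   # 0.7071
print(round(euclidean_distance(a, b), 4))  # 1.0
print(round(dot_product(a, b), 4))         # 1.0
```

If your embedding model outputs unit-normalized vectors (most text embedding APIs do), cosine and dot product rank results identically, so the choice mostly matters for unnormalized vectors.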
The pod-based and serverless architectures give you flexibility. Serverless is great for bursty workloads where you don't want to pay for idle capacity. Pods give you predictable performance when you need it.
**Where it falls down:** Cost. At scale, Pinecone gets expensive. You're paying for a managed service margin on top of the actual compute and storage. For startups burning through venture money, fine. For enterprises with large vector datasets, the bill matters.
Also, vendor lock-in. Your data lives in Pinecone's cloud. Moving to another solution means re-embedding and re-indexing everything. That's not just an API migration.
Metadata filtering is capable but has limits. Complex filters with multiple conditions and range queries can be slower than you'd expect. If your use case is heavily filter-dependent, benchmark it.
## Weaviate: The Swiss Army Knife
```python
import weaviate
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_local()

# Create collection with built-in vectorizer
collection = client.collections.create(
    name="Document",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
    ],
)

# Insert (auto-vectorized)
collection.data.insert({"content": "Some text", "source": "manual"})

# Hybrid search (vector + keyword)
results = collection.query.hybrid(query="search terms", limit=10, alpha=0.5)
```
Weaviate's killer feature is hybrid search. It combines dense vector similarity (what things mean) with sparse BM25 keyword matching (what words appear). The `alpha` parameter lets you blend them. For RAG, this consistently outperforms pure vector search because it catches both semantic matches and exact keyword matches.
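The intuition behind `alpha` is a weighted blend of the two normalized scores. This is a conceptual sketch, not Weaviate's actual fusion implementation (recent versions use rank- or relative-score fusion internally), but it shows why the blend catches matches pure vector search misses:

```python
def blend_scores(vector_score, bm25_score, alpha=0.5):
    # alpha=1.0 -> pure vector search, alpha=0.0 -> pure BM25 keyword search
    # Assumes both scores are already normalized to [0, 1]
    return alpha * vector_score + (1 - alpha) * bm25_score

# A doc with an exact keyword hit but a weaker semantic match...
keyword_doc = blend_scores(vector_score=0.55, bm25_score=0.95, alpha=0.5)
# ...can outrank a semantic-only match once keywords get weight
semantic_doc = blend_scores(vector_score=0.80, bm25_score=0.10, alpha=0.5)
print(keyword_doc > semantic_doc)  # True
```

In practice you tune `alpha` empirically against your own queries; 0.5 is a reasonable starting point, not a universal optimum.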
Built-in vectorization is the other big selling point. You configure a vectorizer module (OpenAI, Cohere, Hugging Face, local models) and Weaviate handles embedding automatically on insert and query. You send text, it returns results. No embedding pipeline to manage.
**Where it falls down:** Operational complexity. Weaviate runs as a separate service. In production, that means monitoring, scaling, backups, and version upgrades for another piece of infrastructure. If you're already running Kubernetes, adding Weaviate is straightforward. If you're a small team with a simple stack, it's another thing to keep alive.
The API has also gone through significant breaking changes between major versions. Migrating from v3 to v4 of the Python client was a real rewrite, not just a find-and-replace.
## ChromaDB: Good Enough, Fast Enough
```python
import chromadb

client = chromadb.Client()  # In-memory
# Or: chromadb.PersistentClient(path="./chroma_data")
# Or: chromadb.HttpClient(host="localhost", port=8000)

collection = client.create_collection("documents")
collection.add(
    documents=["Some text", "Another document"],
    ids=["doc1", "doc2"],
    metadatas=[{"source": "manual"}, {"source": "auto"}],
)

results = collection.query(query_texts=["search terms"], n_results=10)
```
ChromaDB is the fastest path from zero to working vector search. Three lines and you have an in-memory vector database. A few more and it's persisted to disk. For prototypes, development environments, and small datasets (under a million vectors), it's perfect.
The in-process mode is underrated. No server, no network hops, no connection management. Your application and your vector database are in the same process. For CLI tools, data processing scripts, and applications where adding a server is overkill, this is exactly what you want.
ChromaDB also handles embedding for you if you configure a default embedding function. Or you can bring your own embeddings. Both patterns work.
**Where it falls down:** Scale. ChromaDB is not designed for billions of vectors or high-concurrency production workloads. It uses HNSW indexing and works well up to low millions of vectors. Beyond that, you need a purpose-built solution.
No hybrid search. Vector similarity only. If you need keyword matching combined with semantic search, you're either building that yourself or choosing Weaviate.
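If you do bolt keyword matching onto ChromaDB yourself, reciprocal rank fusion (RRF) is a common way to merge the two result lists without worrying about incompatible score scales. A minimal sketch (the doc IDs and result lists are hypothetical):

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    # Score each doc as the sum of 1/(k + rank) over every list it appears in.
    # k=60 is the conventional damping constant from the original RRF paper.
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]   # e.g. IDs from collection.query(...)
keyword_hits = ["doc1", "doc9", "doc3"]  # e.g. from your own BM25/keyword index
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['doc1', 'doc3', 'doc9', 'doc7']
```

Docs appearing in both lists float to the top, which is exactly the behavior you want from hybrid retrieval.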
The server mode (Chroma as a service) exists but it's not battle-hardened the way Pinecone or Weaviate are. For production services with SLA requirements, I wouldn't rely on it as the primary backend.
## pgvector: The Boring Choice (Compliment)
```sql
CREATE EXTENSION vector;
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    source TEXT,
    embedding vector(1536)
);

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
-- Or for better recall: CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Insert
INSERT INTO documents (content, source, embedding)
VALUES ('Some text', 'manual', '[0.1, 0.2, ...]');

-- Query
SELECT content, source, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;
```

For a deeper look, see [hybrid search capabilities](/blog/hybrid-search-rag-production).
pgvector is a PostgreSQL extension. That sentence contains its entire value proposition. If you already run PostgreSQL (and you probably do), you can add vector search without adding another database to your stack.
Same backups. Same monitoring. Same connection pooling. Same ACID transactions. Your vectors and your relational data live together. You can join them. You can filter vectors using standard SQL WHERE clauses on relational columns, and PostgreSQL's query planner handles the optimization.
The HNSW index support is genuinely good. Recall is high, performance is competitive with dedicated vector databases for datasets up to tens of millions of vectors. IVFFlat is faster to build but lower recall.
**Where it falls down:** It's not a dedicated vector database. At very large scale (hundreds of millions to billions of vectors), purpose-built solutions like Pinecone will outperform pgvector on pure vector search speed. If vector search is your primary workload, not a feature of a larger application, you'll eventually outgrow pgvector.
No built-in hybrid search. You can combine pgvector with PostgreSQL's full-text search (`tsvector`), but you're writing the ranking logic yourself. It works, but it's not the one-line hybrid query Weaviate gives you. It is worth reading about [pgvector as a lightweight option](/blog/production-rag-langchain-pgvector) alongside this.
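The ranking logic you end up writing mostly has to reconcile two incompatible score scales: `ts_rank` is unbounded while cosine similarity lives in [-1, 1]. A minimal normalize-then-blend sketch (the function names, weights, and sample scores are my own, not part of pgvector):

```python
def min_max_normalize(scores):
    # Squash an arbitrary score scale into [0, 1] so the two signals are comparable
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(ts_scores, cosine_scores, text_weight=0.4):
    # ts_scores: doc_id -> ts_rank value from full-text search
    # cosine_scores: doc_id -> 1 - (embedding <=> query) from pgvector
    ts_n = min_max_normalize(ts_scores)
    cos_n = min_max_normalize(cosine_scores)
    docs = set(ts_n) | set(cos_n)
    combined = {
        d: text_weight * ts_n.get(d, 0.0) + (1 - text_weight) * cos_n.get(d, 0.0)
        for d in docs
    }
    return sorted(combined, key=combined.get, reverse=True)

print(hybrid_rank({"a": 0.9, "b": 0.1}, {"b": 0.82, "c": 0.75}))
# ['b', 'a', 'c']
```

It's maybe thirty lines of application code either way, but it is code you own and tune, where Weaviate gives you the blend as a query parameter.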
## My Decision Framework
**"I need something working in an hour"**: ChromaDB. Install, import, query. You're done.
**"I already run PostgreSQL"**: pgvector. One extension, zero new infrastructure. For most applications, this is the answer, and people over-engineer past it constantly.
**"I need hybrid search in production"**: Weaviate. The built-in BM25 plus vector search is worth the operational overhead.
**"I never want to think about infrastructure"**: Pinecone. You pay more, you worry less.
**"I have billions of vectors and a budget"**: Pinecone serverless or Weaviate Cloud. At that scale, managed services earn their margin.
The most common mistake I see is choosing the most powerful option when a simpler one would do. If your dataset is a million documents and you're running PostgreSQL, you don't need Pinecone. pgvector handles it fine, and you didn't add complexity to your stack.
Start simple. Measure. Migrate when you hit a real limitation, not a hypothetical one.