Which vector database should I use for production RAG?

Default to pgvector if you already run Postgres and your corpus is under 5 million vectors. Use Pinecone if you want a fully managed service and your team does not want to think about infrastructure. Use Qdrant if you want self-hosted with strong hybrid retrieval. Use Weaviate if you need built-in modular features like vectorization or multi-tenancy. Avoid running multiple vector databases in production.

Is pgvector good enough for production?

Yes, for most cases. pgvector handles up to 5 million vectors comfortably on a properly sized Postgres instance with HNSW indexes. Beyond that, keeping performance acceptable takes careful tuning. The main reasons to graduate off pgvector are corpus size over 10 million vectors, latency requirements under 50ms p95, or specific hybrid retrieval features that are easier in dedicated vector stores.

How much does Pinecone cost at scale?

Pinecone pricing in 2026 starts at $70/month for the standard pod-based tier and scales linearly with vector count. A 1 million vector corpus on Pinecone serverless is roughly $30-100/month depending on query volume. The same corpus on pgvector is the cost of the underlying Postgres host, which can be under $50/month on managed Postgres or basically free if Postgres is already running.

What is hybrid retrieval and which databases support it?

Hybrid retrieval combines semantic search (vector similarity) with keyword search (BM25 or similar) to handle queries that need both fuzzy meaning matching and exact term matching. Qdrant has built-in BM25 plus dense retrieval. Weaviate has hybrid search natively. pgvector requires you to combine it with Postgres full-text search manually, but the result is solid. Pinecone added sparse-dense hybrid in 2024; it works but feels less integrated than Qdrant or Weaviate.

How do I migrate from one vector database to another?

The vectors and metadata are portable; the index format and query syntax are not. Migration involves re-ingesting all vectors into the new system, rebuilding indexes (HNSW parameters differ), and rewriting all retrieval queries. Plan for 1-3 weeks of engineering time per migration depending on corpus size and query complexity. Plan to run both systems in parallel for a week to compare retrieval quality before cutting over.

Vector Database Buyer's Guide: pgvector, Pinecone, Weaviate, Qdrant

Pick the wrong vector database and your RAG system will work but cost three times what it should, take six weeks to migrate when you outgrow it, or fail under load at the worst possible moment. We have shipped production RAG on all four of these. Here is what we learned.

The four contenders

pgvector is the Postgres extension that turns any Postgres instance into a vector store. It has been around since 2021 and is the boring, reliable choice. Runs anywhere Postgres runs.

Pinecone is the original managed vector database. Pure SaaS. Pay per vector and per query. Best in class for “I do not want to think about infrastructure.”

Qdrant is open source, self-hostable or managed. Strong on hybrid retrieval and on running large corpora cheaply on your own hardware.

Weaviate is open source, self-hostable or managed. Includes a richer feature set: built-in vectorization, multi-tenancy, modular plugins.

All four work. The question is which one fits the rest of your stack and constraints.

The dimensions that matter

We score every vector database decision on five dimensions:

Cost at scale. What does it cost when the corpus is 1M, 10M, or 100M vectors?
Hybrid retrieval quality. Can it combine semantic and keyword search well?
Operational maturity. What happens when the on-call gets paged at 2 a.m.?
Lock-in. How hard is it to migrate out?
Stack fit. Does it match the rest of your infrastructure?

Stack fit is the dimension teams under-weight. Adding a new database to your stack is a real cost. If you already run Postgres, pgvector saves you a vendor contract, a separate failover plan, and a separate monitoring integration.

Cost at scale

Real numbers from systems we have shipped or audited.

Vectors	pgvector (managed PG)	Pinecone serverless	Qdrant Cloud	Weaviate Cloud
100k	$25/mo	$30/mo	$40/mo	$50/mo
1M	$50/mo	$80/mo	$100/mo	$120/mo
10M	$200/mo	$400/mo	$300/mo	$450/mo
100M	$1,500/mo	$3,500/mo	$1,800/mo	$4,000/mo

Caveats: prices are approximate, vary by query volume, and assume reasonable index parameters. The pgvector numbers assume the Postgres instance is doing nothing else; if Postgres is already running for your application, the marginal cost of vectors is much lower.

The summary: at 100k vectors, all four are within $25 a month of each other, and at 1M within $70. At 10M vectors, pgvector and Qdrant pull ahead. At 100M vectors, Pinecone and Weaviate cost $1,700 to $2,500 a month more than pgvector or Qdrant.
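To make the comparison concrete, here is a small sketch that recomputes the spread at each scale. The prices are copied from the table above; the dictionary layout and the `spread` function are our own illustration, not part of any vendor's tooling.

```python
# Approximate monthly cost in USD, copied from the table above.
COSTS = {
    100_000: {"pgvector": 25, "pinecone": 30, "qdrant": 40, "weaviate": 50},
    1_000_000: {"pgvector": 50, "pinecone": 80, "qdrant": 100, "weaviate": 120},
    10_000_000: {"pgvector": 200, "pinecone": 400, "qdrant": 300, "weaviate": 450},
    100_000_000: {"pgvector": 1500, "pinecone": 3500, "qdrant": 1800, "weaviate": 4000},
}


def spread(vectors: int) -> tuple[int, float]:
    """Return (gap in USD, most-expensive / cheapest ratio) at one corpus size."""
    prices = list(COSTS[vectors].values())
    return max(prices) - min(prices), max(prices) / min(prices)


for n in COSTS:
    gap, ratio = spread(n)
    print(f"{n:>11,} vectors: gap ${gap:,}/mo, ratio {ratio:.2f}x")
```

The ratio between the cheapest and most expensive option stays in a similar range at every size; what changes is the absolute gap, which grows from $25 a month at 100k vectors to $2,500 a month at 100M.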

Hybrid retrieval quality

Hybrid retrieval matters because pure semantic search misses queries that need exact term matching. “What did our 2023 contract say about indemnification” needs both semantic understanding (“what does indemnification mean here”) and exact term match (“2023 contract”).

Qdrant has built-in sparse-dense hybrid retrieval with BM25 plus dense vectors. Out of the box. The retrieval quality is competitive with anything we have tested.

Weaviate has native hybrid search. Configure once, works.

Pinecone added sparse-dense hybrid in 2024. It works. The integration feels less polished than Qdrant or Weaviate, but it gets the job done.

pgvector requires you to compose hybrid retrieval manually using Postgres full-text search plus pgvector’s similarity, then merge the two ranked lists in your application code. The result is good, but you are doing the merging logic yourself. For most production systems this is fine; for teams that want hybrid as a first-class feature, the dedicated vector stores save engineering time.
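One common way to do that merge is reciprocal rank fusion (RRF). The sketch below is a minimal version, assuming you already have two ranked lists of document IDs, one from Postgres full-text search and one from pgvector similarity. The function name, the example IDs, and the constant `k=60` (a widely used default) are illustrative assumptions, and other fusion methods would work too.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc_7", "doc_2", "doc_9"]  # hypothetical full-text results
vector_hits = ["doc_2", "doc_4", "doc_7"]   # hypothetical pgvector results
merged = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(merged)
```

Documents that appear in both lists rise to the top, which is the behavior you want for queries like the indemnification example. RRF only needs ranks, so it sidesteps the problem that full-text scores and vector distances are on different scales.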

Operational maturity

What happens when something breaks at 2 a.m. matters more than most teams assume when they pick a database.

pgvector: Postgres is a known quantity. Your existing Postgres playbook (replication, backups, point-in-time recovery, monitoring) covers pgvector. There is no separate learning curve for the on-call.

Pinecone: managed service. The on-call is Pinecone’s, not yours. The downside is when Pinecone has an incident, you are watching their status page and waiting. We have seen 3-hour outages a few times in 4 years of using Pinecone in production. Acceptable for most teams; not acceptable for systems where the AI feature is mission-critical.

Qdrant: self-hosted requires ops capability. Their Cloud offering is competent but newer than Pinecone’s. Documentation and community are solid.

Weaviate: similar to Qdrant. Self-hosted is doable but adds operational burden. Cloud is competent.

The summary: managed (Pinecone, Cloud variants of Qdrant and Weaviate) trades cost for operational simplicity. Self-hosted trades operational burden for cost control. pgvector splits the difference because Postgres is something you probably already operate.

Lock-in and migration

The data is portable. Vectors are just floats. Metadata is JSON. You can dump and re-ingest anywhere.

The query patterns are not portable. Every database has its own query interface. Pinecone is queried through its SDKs or REST API. pgvector uses SQL with custom operators. Qdrant exposes REST and gRPC APIs. Weaviate uses GraphQL or REST. Migrating means rewriting every retrieval query.
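As an illustration of what "SQL with custom operators" means in practice, here is roughly what the pgvector side looks like, kept as Python strings so it can be handed to a Postgres driver. The table name `items`, the column `embedding`, the index name, and the helper `build_knn_query` are hypothetical, and the `%s` placeholders assume a psycopg-style driver. `<=>` is pgvector's cosine-distance operator and `vector_cosine_ops` is the matching operator class for an HNSW index.

```python
# One-time setup statements (run them through your usual migration tooling).
SETUP_SQL = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE INDEX IF NOT EXISTS items_embedding_hnsw "
    "ON items USING hnsw (embedding vector_cosine_ops)",
]


def build_knn_query(embedding: list[float], limit: int = 5) -> tuple[str, tuple]:
    """Return a parameterized nearest-neighbor query and its parameters."""
    # pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
    literal = "[" + ",".join(str(x) for x in embedding) + "]"
    sql = "SELECT id FROM items ORDER BY embedding <=> %s::vector LIMIT %s"
    return sql, (literal, limit)


sql, params = build_knn_query([0.1, 0.2, 0.3])
print(sql, params)
```

None of this carries over to the other three systems. The same lookup has to be rewritten against each one's own client or API, which is where most of the migration effort goes.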

The index parameters are not portable either. HNSW configurations differ. The “optimal” parameters in one system are not optimal in another, and re-tuning requires re-ingesting.

Plan for 1 to 3 weeks of engineering work to migrate between any pair, depending on corpus size and query complexity. Run both in parallel for a week before cutting over so you can compare retrieval quality.
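During the parallel run, one simple way to compare retrieval quality is to replay logged queries against both systems and measure how much their top results overlap. The sketch below assumes you can collect the ranked IDs from each system per query; `overlap_at_k`, `mean_overlap`, and the sample data are hypothetical names and values for illustration.

```python
def overlap_at_k(old_ids: list[str], new_ids: list[str], k: int = 10) -> float:
    """Fraction of the old system's top-k that the new system also returns in its top-k."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top:
        return 1.0 if not new_top else 0.0
    return len(old_top & new_top) / len(old_top)


def mean_overlap(pairs: list[tuple[list[str], list[str]]], k: int = 10) -> float:
    """Average overlap@k across (old_results, new_results) pairs; pairs must be non-empty."""
    return sum(overlap_at_k(old, new, k) for old, new in pairs) / len(pairs)


# Hypothetical results for two logged queries.
pairs = [
    (["a", "b", "c", "d"], ["a", "c", "b", "e"]),
    (["x", "y"], ["x", "y"]),
]
score = mean_overlap(pairs, k=4)
print(f"mean overlap@4: {score:.3f}")
```

Low overlap is not automatically a regression, since the new system may be returning better results. It does tell you which queries to review by hand before cutting over.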

The migration cost is the strongest argument for picking right the first time.

When to pick each

A decision tree, in order. Stop at the first match.

Already running Postgres, with a corpus under 5M vectors and a p95 latency budget of 50ms or more? Pick pgvector. Add the extension, build an HNSW index, write SQL. Done.
No existing infrastructure preference, and the team does not want to operate a database? Pick Pinecone. The cost is real but the operational simplicity is worth it for many teams.
Self-hosting is on the table and hybrid retrieval is core to the use case? Pick Qdrant. Native hybrid plus reasonable cost at scale.
You need built-in vectorization, multi-tenancy, or other modular features Weaviate provides? Pick Weaviate. Otherwise its complexity is overkill.
Corpus over 100M vectors and query volume is high? This is past the easy answers. Run a benchmark on your actual data and queries against pgvector, Qdrant self-hosted, and one managed service. The right answer depends on your specific access pattern.
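The decision tree above can be sketched as a function that checks the rules in order and returns at the first match. The `Requirements` fields and the function name are our own. We read the latency condition as a p95 budget of 50ms or looser, and returning the benchmark advice when no rule matches is an assumption, since the list does not define a fallback.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    runs_postgres: bool
    vectors: int
    p95_budget_ms: float           # latency the product can tolerate
    wants_managed: bool            # team does not want to operate a database
    self_hosting_ok: bool
    hybrid_is_core: bool
    needs_weaviate_features: bool  # built-in vectorization, multi-tenancy, modules


def pick_vector_db(r: Requirements) -> str:
    """Walk the rules in order and stop at the first match."""
    if r.runs_postgres and r.vectors < 5_000_000 and r.p95_budget_ms >= 50:
        return "pgvector"
    if r.wants_managed:
        return "Pinecone"
    if r.self_hosting_ok and r.hybrid_is_core:
        return "Qdrant"
    if r.needs_weaviate_features:
        return "Weaviate"
    # Rule 5 (over 100M vectors, high query volume) and, as our assumption,
    # anything else that matched no earlier rule.
    return "benchmark on your own data"


choice = pick_vector_db(
    Requirements(
        runs_postgres=True,
        vectors=50_000,
        p95_budget_ms=200,
        wants_managed=False,
        self_hosting_ok=False,
        hybrid_is_core=False,
        needs_weaviate_features=False,
    )
)
print(choice)
```

Because the rules are ordered, a team that already runs Postgres and has a small corpus lands on pgvector even if hybrid retrieval also matters to them.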

What we always avoid

Running two vector databases in production. Pick one. Stick with it. Migrating is annoying but staying on two is worse.
Choosing based on vendor pitches alone. Every vector database vendor has a deck claiming they are the fastest, the cheapest, and the most accurate. They cannot all be right. Run your own benchmarks against your own data.
Picking the trendy option. Qdrant has been having a moment in 2025-2026. That does not make it the right answer for every project. Weaviate had a similar moment in 2023. The fashionable choice is not always the right one.
Premature scale planning. If you have 50,000 vectors today, pick pgvector and move on. By the time you hit 5 million vectors you will have learned things that change the decision.
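For the "run your own benchmarks" advice, a minimal harness needs two numbers per candidate: tail latency and retrieval quality on queries where you know the right answers. The sketch below assumes each candidate is wrapped in a function that takes a query string and returns ranked document IDs. `fake_search` and the labeled queries are placeholders, and a real benchmark would repeat queries, warm caches, and run under concurrent load.

```python
import math
import time
from typing import Callable

SearchFn = Callable[[str], list[str]]


def p95_latency_ms(search: SearchFn, queries: list[str]) -> float:
    """Time each query once; return the nearest-rank 95th percentile in milliseconds."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return timings[max(0, math.ceil(0.95 * len(timings)) - 1)]


def recall_at_k(search: SearchFn, labeled: dict[str, set[str]], k: int = 10) -> float:
    """Average fraction of known-relevant IDs that appear in the top k results."""
    total = 0.0
    for query, relevant in labeled.items():
        total += len(set(search(query)[:k]) & relevant) / len(relevant)
    return total / len(labeled)


def fake_search(query: str) -> list[str]:
    """Stand-in for a real client call; replace with your own retrieval function."""
    return ["doc_1", "doc_2", "doc_3"]


labeled = {"refund policy": {"doc_1", "doc_9"}, "sla terms": {"doc_2"}}
recall = recall_at_k(fake_search, labeled, k=3)
latency = p95_latency_ms(fake_search, list(labeled))
print(f"recall@3={recall:.2f}, p95={latency:.3f}ms")
```

Running the same labeled queries against every candidate gives you numbers you can compare directly, which a vendor deck cannot.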

What we will probably revisit in 2027

The vector database landscape moves fast. Some bets we are watching:

DiskANN-based approaches (Microsoft’s research, productized in some Azure offerings) for billion-scale corpora at lower cost than HNSW.
Postgres extensions beyond pgvector, like vchord and pgvecto.rs, that promise better performance for the Postgres-already-running pattern.
Embedded vector search in SQLite (sqlite-vec) for edge and serverless use cases where adding a database is overkill.
Multi-vector retrieval (ColBERT-style) becoming a first-class feature in the major vector databases.

Re-evaluate your choice every 12-18 months. Most teams will not need to migrate, but that cadence is frequent enough to notice when a move has become worthwhile.

Bottom line

For most production RAG systems we have shipped or audited:

Under 5M vectors and Postgres in the stack: pgvector, every time.
Managed-only requirement: Pinecone for the operational simplicity, accept the cost.
Self-hosted, hybrid retrieval matters: Qdrant.
Specific Weaviate feature you actually need: Weaviate.

The boring choice is usually the right choice. Vector databases are infrastructure, not differentiation. Pick the one that makes the rest of your team’s life easier, ship the RAG system, and move on to the work that actually creates value.
