Weaviate
Open-source vector database with built-in hybrid search, modules for embedding generation, and strong filtering.
Category
Vector Databases
Difficulty
Intermediate
When to use
You want an open-source, self-hostable vector store with hybrid (vector + keyword) search out of the box.
When not to use
You need the simplest possible setup and Chroma or pgvector would do the job.
Alternatives
Pinecone, Qdrant, Milvus, pgvector
At a glance
| Field | Value |
|---|---|
| Category | Open-source vector database |
| Difficulty | Intermediate |
| When to use | Self-hosted RAG with hybrid search |
| When not to use | Minimal setups where simpler stores work |
| Alternatives | Pinecone, Qdrant, Milvus, pgvector |
What it is
Weaviate stores objects with properties and (optionally) vector embeddings, and supports BM25, vector, and hybrid search that combines them. Its module system can generate embeddings at ingestion time via OpenAI, Cohere, HuggingFace, or local models, so you don’t have to run a separate embedding step. GraphQL and REST APIs, strong filtering, and multi-tenancy are included.
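To make the idea of combining keyword and vector results concrete, here is a minimal sketch of score-based fusion in plain Python. It is a simplified illustration, not Weaviate's internal implementation: the function names `min_max` and `hybrid_fuse`, the min-max normalization step, and the sample scores are all assumptions made for the example. The `alpha` weight follows the usual convention where 0 means keyword only and 1 means vector only.

```python
def min_max(scores):
    """Scale a {doc_id: score} map into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_fuse(keyword_scores, vector_scores, alpha=0.5):
    """Blend normalized keyword and vector scores.

    alpha=0 uses keyword scores only, alpha=1 uses vector scores only.
    Documents missing from one result set contribute 0 for that side.
    """
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    fused = {
        doc: (1 - alpha) * kw.get(doc, 0.0) + alpha * vec.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three documents (made-up numbers).
bm25 = {"a": 2.0, "b": 8.0, "c": 6.5}
similarity = {"a": 0.9, "b": 0.3, "c": 0.6}
ranking = hybrid_fuse(bm25, similarity, alpha=0.5)
```

In this toy data, document "c" is neither the best keyword match nor the best vector match, yet it ranks first under an even blend because it scores reasonably on both sides. That is the kind of result hybrid search is meant to surface.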
When we reach for it at Ephizen
- Self-hosted deployments where data can’t leave our infrastructure.
- Workloads where hybrid search outperforms pure vector search (legal, docs, code).
- Multi-tenant systems that need isolated namespaces.
Getting started
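import weaviate
client = weaviate.connect_to_local()  # v4 Python client; expects a local Weaviate instance
try:
    docs = client.collections.get("Docs")  # assumes the "Docs" collection already exists
    docs.data.insert({"content": "Refund policy...", "tags": ["faq"]})
    res = docs.query.hybrid(query="how do refunds work?", limit=5)
    for obj in res.objects:
        print(obj.properties["content"])
finally:
    client.close()  # always release the connection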
Gotchas
- The v3 → v4 Python client was a breaking change. Make sure docs and code match the version you installed.
- Module-based embedding ties your index to a specific provider; bring-your-own-vectors is cleaner for migrations.
- HNSW indexes are held in memory, so size each node's RAM for the number and dimensionality of your vectors.
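For the memory gotcha, a back-of-the-envelope estimate can help with node sizing. The sketch below assumes vectors are stored as 4-byte float32 components. The function name `estimate_vector_memory_gib` and the `overhead_factor` multiplier (a stand-in for graph links and runtime overhead) are assumptions for illustration, not figures taken from Weaviate, so validate them against measurements from a real deployment.

```python
def estimate_vector_memory_gib(num_vectors, dimensions, overhead_factor=2.0):
    """Rough RAM estimate for an in-memory index of float32 vectors.

    overhead_factor is an assumed multiplier covering graph links and
    runtime overhead; tune it against measurements from your own cluster.
    """
    bytes_per_float32 = 4
    raw_bytes = num_vectors * dimensions * bytes_per_float32
    return raw_bytes * overhead_factor / 1024**3


# Example: 1 million 768-dimensional embeddings.
raw = estimate_vector_memory_gib(1_000_000, 768, overhead_factor=1.0)
padded = estimate_vector_memory_gib(1_000_000, 768)
print(f"raw vectors: {raw:.2f} GiB, with assumed overhead: {padded:.2f} GiB")
```

The raw figure is a lower bound. The padded figure is only as good as the assumed multiplier, so treat it as a starting point for capacity planning.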
Related tools
- Chroma: An embedded, developer-friendly vector database. The fastest path from "I have some docs" to "I have a working RAG prototype".
- pgvector: A PostgreSQL extension that adds vector types, distance operators, and ANN indexes. Turns your existing database into a vector store.
- Pinecone: A managed vector database designed for production semantic search and RAG, with no servers to run and scale and latency handled for you.