Embedding
embeddings
In one line
A dense vector that encodes the meaning of an input so similar things land near each other in vector space.
What it actually means
An embedding is just a fixed-length list of floats (often 384, 768, or 1536 numbers) produced by a model that was trained so that semantically related inputs end up with similar vectors. You can embed words, sentences, full documents, images, or audio — whatever the encoder was trained on. Similarity is usually measured with cosine similarity or dot product. The space itself has no labels; meaning emerges from how the training objective pushed related examples together and unrelated ones apart.
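As a minimal sketch of the similarity measure itself (plain NumPy, no model; the `cosine` helper and the toy vectors are illustrative, not from any library):

```python
import numpy as np

def cosine(a, b):
    # cosine similarity = dot product of the vectors, divided by
    # the product of their lengths; ranges from -1 to 1
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine([1, 2, 3], [2, 4, 6]))  # 1.0 — same direction, "same meaning"
print(cosine([1, 0], [0, 1]))        # 0.0 — orthogonal, unrelated
```

Note that if the vectors are L2-normalized (many embedding models ship them that way), cosine similarity and dot product are the same number.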
Why it matters
Embeddings are how we make unstructured data searchable, clusterable, and comparable without writing rules. They’re the substrate of every retrieval system, recommendation feed, deduplication pipeline, and semantic search box. If you’re building RAG, your retrieval quality is bounded by how good your embeddings are at separating the things that actually matter for your task.
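A hedged sketch of how retrieval rides on this: embed the corpus once, embed the query, rank by similarity. The 2-D vectors below are toy stand-ins for real model output (a production system would use a real encoder and an approximate-nearest-neighbor index):

```python
import numpy as np

docs = ["cat on a mat", "feline on a rug", "stock market crash"]

# Pretend embeddings: in reality these come from an encoder model.
E = np.array([[0.90, 0.10],
              [0.85, 0.20],
              [0.10, 0.95]])
E = E / np.linalg.norm(E, axis=1, keepdims=True)  # L2-normalize rows

query = np.array([0.88, 0.15])                    # pretend query embedding
query = query / np.linalg.norm(query)

scores = E @ query                                # cosine similarity via dot product
for i in np.argsort(-scores):                     # best match first
    print(f"{scores[i]:.2f}  {docs[i]}")
```

The two cat sentences outrank the stock-market one because their vectors point the same way; this ranking step is all that "semantic search" adds over keyword search.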
Example
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode(["a cat on a mat", "a feline on a rug", "a stock market crash"])
print(util.cos_sim(vecs[0], vecs[1]))  # high — the two paraphrases land close together
print(util.cos_sim(vecs[0], vecs[2]))  # much lower — unrelated text lands far away
You’ll hear it when
- Setting up a vector database for RAG.
- Choosing between OpenAI, Cohere, or open-source embedding models.
- Debugging why retrieval misses obvious matches (“the embedding model can’t tell those apart”).
- Building a deduplication or clustering pipeline.
- Discussing recommendation systems or two-tower retrieval.