Embeddings
Learn what embeddings (vector embeddings) mean in AI and machine learning, with examples and related concepts.
Definition
Embeddings (or vector embeddings) are numerical representations of text, images, or other data as arrays of numbers (vectors) that capture semantic meaning.
The key insight: similar concepts end up near each other in vector space. “dog” and “puppy” have very similar embeddings, while “dog” and “airplane” are far apart. This makes it possible to search by meaning, not just keywords.
Embeddings are the foundation of RAG systems, semantic search, recommendation engines, and clustering.
How It Works
Text: "How to train a neural network"
↓ Embedding Model
Vector: [0.023, -0.451, 0.887, ..., 0.012] (1536 dimensions)
Text: "Deep learning model training guide"
↓ Embedding Model
Vector: [0.019, -0.448, 0.891, ..., 0.009] (similar! → close in space)
Text: "Best pizza recipes"
↓ Embedding Model
Vector: [0.756, 0.234, -0.112, ..., 0.445] (different → far apart)
The distance between vectors tells you how semantically similar two pieces of text are. Common distance metrics: cosine similarity, dot product, Euclidean distance.
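As a quick sanity check, the three metrics can be computed with NumPy on toy vectors (the values below are made up for illustration, not real embeddings):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 4.0, 4.0])   # same direction as a, twice the length
c = np.array([2.0, -1.0, 0.0])  # orthogonal to a

def cosine(u, v):
    # Cosine similarity: angle between vectors, ignores magnitude.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(a, b))            # → 1.0 (same direction, magnitude ignored)
print(cosine(a, c))            # → 0.0 (orthogonal)
print(np.dot(a, b))            # → 18.0 (dot product grows with magnitude)
print(np.linalg.norm(a - b))   # → 3.0 (Euclidean distance)
```

Note that cosine similarity ignores vector length while dot product and Euclidean distance do not, which is why cosine similarity is the usual default for comparing embeddings from different texts.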
Why It Matters
- Semantic search — Find documents by meaning, not keyword matching (“affordable housing” matches “low-cost apartments”)
- RAG — Retrieve relevant context for LLM by finding the most similar document chunks
- Recommendations — “Users who liked X also liked Y” via similarity in embedding space
- Classification — Cluster similar items without manual labeling
- Deduplication — Find near-duplicate content even with different wording
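The semantic-search use case above reduces to ranking documents by similarity to a query vector. A minimal sketch, using hand-made 4-d vectors as stand-ins for real embeddings (the values are hypothetical):

```python
import numpy as np

# Toy "embeddings" — in practice these come from an embedding model.
docs = {
    "low-cost apartments":   np.array([0.9, 0.1, 0.0, 0.1]),
    "affordable housing":    np.array([0.8, 0.2, 0.1, 0.0]),
    "chocolate cake recipe": np.array([0.0, 0.1, 0.9, 0.3]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def search(query_vec, k=2):
    # Rank all documents by cosine similarity to the query, return top k.
    scored = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in scored[:k]]

query = np.array([0.85, 0.15, 0.05, 0.05])  # pretend embedding of "cheap flats"
print(search(query))  # housing documents rank above the recipe
```

Both housing documents score far above the recipe even though none of them share keywords with "cheap flats" — that is search by meaning rather than keyword matching.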
Example
```python
from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Compare similarity
e1 = get_embedding("How to build a REST API")
e2 = get_embedding("Creating a web service endpoint")
e3 = get_embedding("Best chocolate cake recipe")

print(cosine_similarity(e1, e2))  # → 0.89 (very similar)
print(cosine_similarity(e1, e3))  # → 0.12 (not similar)
```
Popular Embedding Models
| Model | Dimensions | Provider | Use Case |
|---|---|---|---|
| text-embedding-3-small | 1536 | OpenAI | General purpose, cost-effective |
| text-embedding-3-large | 3072 | OpenAI | Higher accuracy |
| Cohere embed-v3 | 1024 | Cohere | Multilingual |
| all-MiniLM-L6-v2 | 384 | Hugging Face | Free, runs locally |
Key Takeaways
- Embeddings convert text into numbers that capture meaning — similar text = similar vectors
- They power semantic search, RAG, and recommendation systems
- At scale, you need a vector database (or an approximate nearest-neighbor index) to store and search embeddings efficiently
- Embedding quality directly affects your RAG system’s accuracy
- Smaller embedding models (384-d) are often good enough and much faster than large ones
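To see what a vector database optimizes: with normalized embeddings, brute-force nearest-neighbor search is a single matrix product, which vector databases replace with approximate indexes once collections grow large. A sketch with random vectors standing in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 stored "embeddings", 384-d (like all-MiniLM-L6-v2's output size).
index = rng.normal(size=(1000, 384))
index /= np.linalg.norm(index, axis=1, keepdims=True)  # normalize once at ingest

# Query: a slightly noisy copy of row 42, also normalized.
query = index[42] + rng.normal(scale=0.01, size=384)
query /= np.linalg.norm(query)

# For unit vectors, cosine similarity is just the dot product,
# so scoring every document is one matrix-vector product.
scores = index @ query
print(int(np.argmax(scores)))  # row 42 should come back as the nearest neighbor
```

This O(n) scan is fine for thousands of vectors; vector databases exist because it stops being fine at millions.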
Part of the DeepRaft Glossary — AI and ML terms explained for developers.