Embeddings
Learn what embeddings (vector embeddings) mean in AI and machine learning, with examples and related concepts.
Definition
Embeddings (or vector embeddings) are numerical representations of text, images, or other data as arrays of numbers (vectors) that capture semantic meaning.
The key insight: similar concepts end up near each other in vector space. “dog” and “puppy” have very similar embeddings, while “dog” and “airplane” are far apart. This makes it possible to search by meaning, not just keywords.
Embeddings are the foundation of RAG systems, semantic search, recommendation engines, and clustering.
How It Works
Text: "How to train a neural network"
↓ Embedding Model
Vector: [0.023, -0.451, 0.887, ..., 0.012] (1536 dimensions)
Text: "Deep learning model training guide"
↓ Embedding Model
Vector: [0.019, -0.448, 0.891, ..., 0.009] (similar! → close in space)
Text: "Best pizza recipes"
↓ Embedding Model
Vector: [0.756, 0.234, -0.112, ..., 0.445] (different → far apart)
The distance between vectors tells you how semantically similar two pieces of text are. Common distance metrics: cosine similarity, dot product, Euclidean distance.
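As a quick sanity check, the three metrics can be computed with NumPy on toy vectors (the values below are made up for illustration, not real embeddings):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 4.0, 4.0])   # same direction as a, twice the length
c = np.array([2.0, -1.0, 0.0])  # orthogonal to a

def cosine(u, v):
    # Cosine similarity: angle between vectors, ignores magnitude.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(a, b))            # → 1.0 (same direction, magnitude ignored)
print(cosine(a, c))            # → 0.0 (orthogonal)
print(np.dot(a, b))            # → 18.0 (dot product grows with magnitude)
print(np.linalg.norm(a - b))   # → 3.0 (Euclidean distance)
```

Note that cosine similarity ignores vector length while dot product and Euclidean distance do not, which is why cosine similarity is the usual default for comparing embeddings from different texts.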
Why It Matters
- Semantic search — Find documents by meaning, not keyword matching (“affordable housing” matches “low-cost apartments”)
- RAG — Retrieve relevant context for LLM by finding the most similar document chunks
- Recommendations — “Users who liked X also liked Y” via similarity in embedding space
- Classification — Cluster similar items without manual labeling
- Deduplication — Find near-duplicate content even with different wording
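The semantic-search use case above reduces to ranking documents by similarity to a query vector. A minimal sketch, using hand-made 4-d vectors as stand-ins for real embeddings (the values are hypothetical):

```python
import numpy as np

# Toy "embeddings" — in practice these come from an embedding model.
docs = {
    "low-cost apartments":   np.array([0.9, 0.1, 0.0, 0.1]),
    "affordable housing":    np.array([0.8, 0.2, 0.1, 0.0]),
    "chocolate cake recipe": np.array([0.0, 0.1, 0.9, 0.3]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def search(query_vec, k=2):
    # Rank all documents by cosine similarity to the query, return top k.
    scored = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in scored[:k]]

query = np.array([0.85, 0.15, 0.05, 0.05])  # pretend embedding of "cheap flats"
print(search(query))  # housing documents rank above the recipe
```

Both housing documents score far above the recipe even though none of them share keywords with "cheap flats" — that is search by meaning rather than keyword matching.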
Example
```python
from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Compare similarity
e1 = get_embedding("How to build a REST API")
e2 = get_embedding("Creating a web service endpoint")
e3 = get_embedding("Best chocolate cake recipe")

print(cosine_similarity(e1, e2))  # → 0.89 (very similar)
print(cosine_similarity(e1, e3))  # → 0.12 (not similar)
```
Popular Embedding Models
| Model | Dimensions | Provider | Use Case |
|---|---|---|---|
| text-embedding-3-small | 1536 | OpenAI | General purpose, cost-effective |
| text-embedding-3-large | 3072 | OpenAI | Higher accuracy |
| Cohere embed-v3 | 1024 | Cohere | Multilingual |
| all-MiniLM-L6-v2 | 384 | Hugging Face | Free, runs locally |
Key Takeaways
- Embeddings convert text into numbers that capture meaning — similar text = similar vectors
- They power semantic search, RAG, and recommendation systems
- At scale, you need a vector database (or an approximate nearest-neighbor index) to store and search embeddings efficiently
- Embedding quality directly affects your RAG system’s accuracy
- Smaller embedding models (384-d) are often good enough and much faster than large ones
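To see what a vector database optimizes: with normalized embeddings, brute-force nearest-neighbor search is a single matrix product, which vector databases replace with approximate indexes once collections grow large. A sketch with random vectors standing in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 stored "embeddings", 384-d (like all-MiniLM-L6-v2's output size).
index = rng.normal(size=(1000, 384))
index /= np.linalg.norm(index, axis=1, keepdims=True)  # normalize once at ingest

# Query: a slightly noisy copy of row 42, also normalized.
query = index[42] + rng.normal(scale=0.01, size=384)
query /= np.linalg.norm(query)

# For unit vectors, cosine similarity is just the dot product,
# so scoring every document is one matrix-vector product.
scores = index @ query
print(int(np.argmax(scores)))  # row 42 should come back as the nearest neighbor
```

This O(n) scan is fine for thousands of vectors; vector databases exist because it stops being fine at millions.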
Part of the DeepRaft Glossary — AI and ML terms explained for developers.