
Vector Database


Definition

A vector database is a specialized database designed to store, index, and search high-dimensional vectors — the numerical representations (called embeddings) that AI models use to capture meaning.

Traditional databases search by exact match: "find all users where name = 'John'". Vector databases search by similarity: "find documents most similar in meaning to this query." This is what powers semantic search, recommendation systems, and RAG pipelines.

When you ask Perplexity AI a question, it converts your query to a vector, searches a vector database for semantically similar documents, retrieves the most relevant ones, and feeds them to an LLM to generate an answer. The vector database is the engine that makes this fast — searching millions of documents in milliseconds.

How It Works

1. EMBED — Convert text to vectors using an embedding model
   "How to train a neural network"  →  [0.23, -0.87, 0.45, ..., 0.12]
                                        (768 or 1536 dimensions)

2. STORE — Save vectors in the database with metadata
   {
     vector: [0.23, -0.87, 0.45, ...],
     metadata: { source: "tutorial.md", section: "chapter 3" },
     text: "Neural network training involves..."
   }

3. SEARCH — Find similar vectors using distance metrics
   Query: "deep learning training process"
   →  [0.25, -0.82, 0.41, ...]  (similar vector!)
   →  Returns top-k nearest neighbors

   Distance metrics:
   • Cosine similarity — angle between vectors (most common for text)
   • Euclidean distance — straight-line distance
   • Dot product — equivalent to cosine for unit-normalized vectors, cheaper to compute
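The three metrics above can be sketched in a few lines of NumPy. The vector values here are made up for illustration; real embeddings have hundreds of dimensions:

```python
import numpy as np

# Two toy embedding vectors (real embeddings have 768+ dimensions)
query = np.array([0.25, -0.82, 0.41])
doc   = np.array([0.23, -0.87, 0.45])

# Cosine similarity: cosine of the angle between vectors (1.0 = same direction)
cosine = np.dot(query, doc) / (np.linalg.norm(query) * np.linalg.norm(doc))

# Euclidean distance: straight-line distance (smaller = more similar)
euclidean = np.linalg.norm(query - doc)

# Dot product: equals cosine similarity when both vectors are unit-normalized
dot = np.dot(query, doc)

print(f"cosine={cosine:.3f}  euclidean={euclidean:.3f}  dot={dot:.3f}")
```

Because these two vectors point in almost the same direction, cosine similarity comes out near 1.0, which is exactly the signal a semantic search uses to rank results.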

Approximate Nearest Neighbor (ANN)

Brute-force comparison against millions of vectors would be too slow. Vector databases use ANN algorithms to find approximate matches much faster:

Brute force: Compare query against ALL 10M vectors → 100% accurate, very slow
ANN (HNSW):  Navigate a graph structure → ~99% accurate, 1000x faster

Common ANN algorithms:
  HNSW (Hierarchical Navigable Small World) — best general-purpose
  IVF (Inverted File Index) — good for very large datasets
  PQ (Product Quantization) — compressed vectors for memory efficiency
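For contrast, here is the exact brute-force search that ANN algorithms approximate: one similarity score per stored vector, followed by a top-k sort. The data is random, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small "database" of 10,000 unit-normalized 64-dim vectors
db = rng.normal(size=(10_000, 64))
db /= np.linalg.norm(db, axis=1, keepdims=True)

query = rng.normal(size=64)
query /= np.linalg.norm(query)

# Brute force: cosine similarity against EVERY stored vector.
# This is O(n * d) per query; an ANN index like HNSW visits only a tiny
# fraction of the vectors and returns (almost always) the same neighbors.
scores = db @ query                    # one dot product per stored vector
k = 5
top_k = np.argsort(scores)[::-1][:k]   # indices of the k most similar

print(top_k, scores[top_k])
```

At 10,000 vectors the full scan is instant; at hundreds of millions it is not, which is why production vector databases build an ANN index instead of scanning.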

Why It Matters

Without retrieval, an LLM is limited to what it memorized during training. A vector database lets applications ground answers in their own data: semantic search, recommendation systems, and RAG pipelines all depend on finding the right context quickly, even across millions of documents.

Major Vector Databases (as of 2026)

Database  | Type                   | Best For                           | Pricing
Pinecone  | Managed cloud          | Production RAG, zero ops           | Pay per usage
Weaviate  | Open source / cloud    | Hybrid search (vector + keyword)   | Free / managed
ChromaDB  | Open source (embedded) | Prototyping, small projects        | Free
Qdrant    | Open source / cloud    | Performance-critical apps          | Free / managed
pgvector  | Postgres extension     | Existing Postgres users            | Free (extension)
Milvus    | Open source            | Large-scale (billions of vectors)  | Free / managed

Example

# ChromaDB — simplest way to start (no server needed)
import chromadb

# Create a client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="knowledge_base",
    metadata={"hnsw:space": "cosine"}  # cosine similarity
)

# Add documents — ChromaDB auto-generates embeddings
collection.add(
    documents=[
        "Python is a high-level programming language known for its simplicity.",
        "JavaScript runs in web browsers and is essential for frontend development.",
        "Rust provides memory safety without garbage collection.",
        "Go was designed at Google for building scalable server software.",
        "TypeScript adds static typing to JavaScript.",
    ],
    ids=["doc1", "doc2", "doc3", "doc4", "doc5"],
    metadatas=[
        {"language": "python"},
        {"language": "javascript"},
        {"language": "rust"},
        {"language": "go"},
        {"language": "typescript"},
    ]
)

# Semantic search
results = collection.query(
    query_texts=["What language is good for beginners?"],
    n_results=3
)

print(results["documents"][0])
# → ['Python is a high-level programming language known for its simplicity.',
#     'Go was designed at Google for building scalable server software.',
#     'JavaScript runs in web browsers...']
# Python ranks first because "simplicity" is semantically closest to "beginners"


# Full RAG pipeline: Vector DB + Claude
import chromadb
from anthropic import Anthropic

# Set up vector store with your documents
chroma = chromadb.Client()
docs = chroma.create_collection("product_docs")

# Index your documentation
docs.add(
    documents=[
        "The Pro plan costs $29/month and includes 50GB storage and priority support.",
        "To reset your password, go to Settings > Security > Change Password.",
        "Our API rate limit is 1000 requests per minute on the Pro plan.",
        "Billing is processed on the 1st of each month. You can cancel anytime.",
    ],
    ids=["pricing", "password", "api-limits", "billing"]
)

# Answer questions using retrieved context
anthropic = Anthropic()

def answer(question: str) -> str:
    # Step 1: Retrieve relevant documents
    results = docs.query(query_texts=[question], n_results=2)
    context = "\n".join(results["documents"][0])

    # Step 2: Generate grounded answer
    response = anthropic.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=300,
        system="Answer based only on the provided context. If the answer isn't in the context, say so.",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}"
        }]
    )
    return response.content[0].text

print(answer("How much does the Pro plan cost?"))
# → "The Pro plan costs $29/month and includes 50GB storage and priority support."

print(answer("What's the API rate limit?"))
# → "The API rate limit is 1000 requests per minute on the Pro plan."

Key Takeaways

• Vector databases store embeddings and search by semantic similarity, not exact match.
• ANN algorithms like HNSW trade a small amount of accuracy for orders-of-magnitude faster search.
• They are the retrieval engine behind semantic search, recommendations, and RAG pipelines.
• Start simple (ChromaDB, pgvector) and move to managed options (Pinecone, Qdrant, Milvus) as scale demands.

Part of the DeepRaft Glossary — AI and ML terms explained for developers.