Grounding

Learn what Grounding means in AI and machine learning, with examples and related concepts.

Definition

Grounding is the practice of connecting an LLM’s responses to verifiable, external sources of truth — so the model answers based on real data rather than what it “memorized” during training.

When an LLM answers a question from memory alone, it might hallucinate — generating plausible-sounding but wrong information. Grounding prevents this by anchoring responses to specific documents, databases, search results, or APIs. If the model says “Claude costs $3 per million input tokens,” grounding means there’s an actual source document backing that claim.

RAG (Retrieval-Augmented Generation) is the most common grounding technique, but grounding is the broader concept. It includes any method that ties model output to verifiable information: web search, database queries, API calls, or document retrieval.

How It Works

UNGROUNDED (risky):
  User: "What's the latest Claude pricing?"
  Model: [relies on training data from months ago]
  Model: "Claude Sonnet costs $3 per million input tokens" ← might be outdated

GROUNDED (reliable):
  User: "What's the latest Claude pricing?"
  System: [retrieves current pricing page]
  Model: [reads the retrieved document]
  Model: "According to Anthropic's pricing page, Claude Sonnet 4.6
          costs $3 per million input tokens." ← sourced from real data

Grounding Methods

1. RETRIEVAL GROUNDING (RAG)
   Query → Search vector database → Relevant docs → Model reads & answers
   Example: Company knowledge base Q&A

2. SEARCH GROUNDING
   Query → Web search API → Search results → Model synthesizes
   Example: Perplexity AI, Google Gemini with Search

3. TOOL GROUNDING
   Query → Model calls API/tool → Real-time data → Model responds
   Example: "What's the weather?" → calls weather API → real answer

4. DOCUMENT GROUNDING
   Upload document → Model reads it → Answers only from document
   Example: "Summarize this contract" with uploaded PDF
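
The retrieval step in method 1 can be sketched without any external services. The keyword-overlap scorer below is a toy stand-in for a real vector-database similarity search, and the `knowledge_base` contents are made-up examples for illustration:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the query.
    In production this would be an embeddings similarity search."""
    query_terms = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved passages into a prompt the model must answer from."""
    context = "\n\n".join(f"[Doc {i+1}] {d}" for i, d in enumerate(docs))
    return (
        "Answer using ONLY the documents below.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

knowledge_base = [
    "Refunds are processed within 14 days of a return request.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "All hardware ships with a two-year limited warranty.",
]

top_docs = retrieve("How long do refunds take?", knowledge_base)
prompt = build_grounded_prompt("How long do refunds take?", top_docs)
```

The prompt that comes out of `build_grounded_prompt` is what gets sent to the model, exactly as in the search-grounding example below: the retrieval method changes, the grounding pattern does not.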

Why It Matters

Ungrounded answers can be stale or fabricated; grounded answers can be checked. Grounding reduces hallucinations, keeps responses current (via search or API calls), and lets the model cite the source behind every factual claim, which makes its output auditable by users and downstream systems.

Example

# Grounding with web search (Perplexity-style approach)
from anthropic import Anthropic

client = Anthropic()

def search_web(query: str) -> list[dict]:
    """Simulate web search — in production, use a real search API."""
    # Use SerpAPI, Brave Search, or Tavily in production
    return [
        {
            "title": "Claude Pricing - Anthropic",
            "url": "https://anthropic.com/pricing",
            "snippet": "Claude Sonnet 4.6: $3/M input, $15/M output. Claude Opus 4.6: $15/M input, $75/M output."
        },
        {
            "title": "Claude API Documentation",
            "url": "https://docs.anthropic.com",
            "snippet": "Claude supports 200K token context windows across all models."
        }
    ]

# Step 1: Search for relevant information
query = "What are the current Claude API pricing tiers?"
search_results = search_web(query)

# Step 2: Ground the model's response in search results
context = "\n\n".join(
    f"Source: {r['title']} ({r['url']})\n{r['snippet']}"
    for r in search_results
)

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    temperature=0,
    system="""Answer questions using ONLY the provided search results.
Cite sources with [Source Title](URL) for every factual claim.
If the search results don't contain the answer, say so.""",
    messages=[{
        "role": "user",
        "content": f"Search results:\n{context}\n\nQuestion: {query}"
    }]
)

print(response.content[0].text)
# → "According to [Claude Pricing - Anthropic](https://anthropic.com/pricing),
#    Claude Sonnet 4.6 costs $3/M input tokens and $15/M output tokens..."

# Document grounding — answer only from uploaded content
def grounded_qa(document: str, question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""You are a document analysis assistant.
Answer the question based ONLY on the document below.
If the answer is not in the document, respond: "This information is not in the provided document."

--- DOCUMENT START ---
{document}
--- DOCUMENT END ---

Question: {question}"""
        }]
    )
    return response.content[0].text

# The model can only use information from the document — no hallucination
contract = open("service_agreement.pdf.txt").read()
answer = grounded_qa(contract, "What is the termination notice period?")

Grounding vs RAG

Aspect        | Grounding (concept)                | RAG (technique)
Scope         | Broad — any source of truth        | Specific — document retrieval
Sources       | Search, APIs, databases, documents | Vector database / document store
Real-time     | Can be (via APIs/search)           | Usually not (pre-indexed)
Relationship  | The goal                           | One way to achieve it

RAG is the most popular grounding technique, but grounding also includes web search, API calls, and tool use.
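
To illustrate the real-time row of the comparison, here is a minimal tool-grounding sketch. `get_weather` is a hypothetical stub standing in for a live API call (the returned values are made up), and the model call itself is omitted; the point is that the data is fetched at answer time, not pre-indexed:

```python
from datetime import datetime, timezone

def get_weather(city: str) -> dict:
    """Hypothetical stub: replace with a real weather API in production."""
    return {"city": city, "temp_c": 18, "conditions": "partly cloudy"}

def ground_with_tool(question: str, city: str) -> str:
    """Fetch live data at answer time, then hand it to the model as context."""
    data = get_weather(city)
    fetched_at = datetime.now(timezone.utc).isoformat()
    return (
        f"Tool result (fetched {fetched_at}): "
        f"{data['temp_c']}°C, {data['conditions']} in {data['city']}.\n\n"
        f"Question: {question}"
    )

prompt = ground_with_tool("What's the weather in Berlin?", "Berlin")
```

Because the tool result carries a fetch timestamp, the model can be instructed to report how fresh its grounding data is, something a pre-indexed RAG store cannot guarantee.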

Key Takeaways

- Grounding ties model output to verifiable external sources, so answers rest on real data rather than training memory.
- RAG is one grounding technique; web search, API/tool calls, and document uploads are others.
- Grounded prompts should instruct the model to answer only from the provided sources and to cite them.
- When the sources don't contain the answer, the model should say so instead of guessing.


Part of the DeepRaft Glossary — AI and ML terms explained for developers.