Grounding

Learn what Grounding means in AI and machine learning, with examples and related concepts.

Definition

Grounding is the practice of connecting an LLM’s responses to verifiable, external sources of truth — so the model answers based on real data rather than what it “memorized” during training.

When an LLM answers a question from memory alone, it might hallucinate — generating plausible-sounding but wrong information. Grounding prevents this by anchoring responses to specific documents, databases, search results, or APIs. If the model says “Claude costs $3 per million input tokens,” grounding means there’s an actual source document backing that claim.

RAG (Retrieval-Augmented Generation) is the most common grounding technique, but grounding is the broader concept. It includes any method that ties model output to verifiable information: web search, database queries, API calls, or document retrieval.

How It Works

UNGROUNDED (risky):
  User: "What's the latest Claude pricing?"
  Model: [relies on training data from months ago]
  Model: "Claude Sonnet costs $3 per million input tokens" ← might be outdated

GROUNDED (reliable):
  User: "What's the latest Claude pricing?"
  System: [retrieves current pricing page]
  Model: [reads the retrieved document]
  Model: "According to Anthropic's pricing page, Claude Sonnet 4.6
          costs $3 per million input tokens." ← sourced from real data

Grounding Methods

1. RETRIEVAL GROUNDING (RAG)
   Query → Search vector database → Relevant docs → Model reads & answers
   Example: Company knowledge base Q&A

2. SEARCH GROUNDING
   Query → Web search API → Search results → Model synthesizes
   Example: Perplexity AI, Google Gemini with Search

3. TOOL GROUNDING
   Query → Model calls API/tool → Real-time data → Model responds
   Example: "What's the weather?" → calls weather API → real answer

4. DOCUMENT GROUNDING
   Upload document → Model reads it → Answers only from document
   Example: "Summarize this contract" with uploaded PDF
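
The retrieval step in method 1 can be sketched without any external services. The keyword-overlap scorer below is a toy stand-in for a real vector-database similarity search, and the `knowledge_base` contents are made-up examples for illustration:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the query.
    In production this would be an embeddings similarity search."""
    query_terms = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved passages into a prompt the model must answer from."""
    context = "\n\n".join(f"[Doc {i+1}] {d}" for i, d in enumerate(docs))
    return (
        "Answer using ONLY the documents below.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

knowledge_base = [
    "Refunds are processed within 14 days of a return request.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "All hardware ships with a two-year limited warranty.",
]

top_docs = retrieve("How long do refunds take?", knowledge_base)
prompt = build_grounded_prompt("How long do refunds take?", top_docs)
```

The prompt that comes out of `build_grounded_prompt` is what gets sent to the model, exactly as in the search-grounding example below: the retrieval method changes, the grounding pattern does not.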

Why It Matters

Ungrounded answers can be stale or fabricated; grounded answers can be checked. Grounding reduces hallucinations, keeps responses current (via search or API calls), and lets the model cite the source behind every factual claim, which makes its output auditable by users and downstream systems.

Example

# Grounding with web search (Perplexity-style approach)
from anthropic import Anthropic

client = Anthropic()

def search_web(query: str) -> list[dict]:
    """Simulate web search — in production, use a real search API."""
    # Use SerpAPI, Brave Search, or Tavily in production
    return [
        {
            "title": "Claude Pricing - Anthropic",
            "url": "https://anthropic.com/pricing",
            "snippet": "Claude Sonnet 4.6: $3/M input, $15/M output. Claude Opus 4.6: $15/M input, $75/M output."
        },
        {
            "title": "Claude API Documentation",
            "url": "https://docs.anthropic.com",
            "snippet": "Claude supports 200K token context windows across all models."
        }
    ]

# Step 1: Search for relevant information
query = "What are the current Claude API pricing tiers?"
search_results = search_web(query)

# Step 2: Ground the model's response in search results
context = "\n\n".join(
    f"Source: {r['title']} ({r['url']})\n{r['snippet']}"
    for r in search_results
)

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    temperature=0,
    system="""Answer questions using ONLY the provided search results.
Cite sources with [Source Title](URL) for every factual claim.
If the search results don't contain the answer, say so.""",
    messages=[{
        "role": "user",
        "content": f"Search results:\n{context}\n\nQuestion: {query}"
    }]
)

print(response.content[0].text)
# → "According to [Claude Pricing - Anthropic](https://anthropic.com/pricing),
#    Claude Sonnet 4.6 costs $3/M input tokens and $15/M output tokens..."

# Document grounding — answer only from uploaded content
def grounded_qa(document: str, question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""You are a document analysis assistant.
Answer the question based ONLY on the document below.
If the answer is not in the document, respond: "This information is not in the provided document."

--- DOCUMENT START ---
{document}
--- DOCUMENT END ---

Question: {question}"""
        }]
    )
    return response.content[0].text

# The model can only use information from the document — no hallucination
contract = open("service_agreement.pdf.txt").read()
answer = grounded_qa(contract, "What is the termination notice period?")

Grounding vs RAG

Aspect        | Grounding (concept)                | RAG (technique)
Scope         | Broad — any source of truth        | Specific — document retrieval
Sources       | Search, APIs, databases, documents | Vector database / document store
Real-time     | Can be (via APIs/search)           | Usually not (pre-indexed)
Relationship  | The goal                           | One way to achieve it

RAG is the most popular grounding technique, but grounding also includes web search, API calls, and tool use.
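
To illustrate the real-time row of the comparison, here is a minimal tool-grounding sketch. `get_weather` is a hypothetical stub standing in for a live API call (the returned values are made up), and the model call itself is omitted; the point is that the data is fetched at answer time, not pre-indexed:

```python
from datetime import datetime, timezone

def get_weather(city: str) -> dict:
    """Hypothetical stub: replace with a real weather API in production."""
    return {"city": city, "temp_c": 18, "conditions": "partly cloudy"}

def ground_with_tool(question: str, city: str) -> str:
    """Fetch live data at answer time, then hand it to the model as context."""
    data = get_weather(city)
    fetched_at = datetime.now(timezone.utc).isoformat()
    return (
        f"Tool result (fetched {fetched_at}): "
        f"{data['temp_c']}°C, {data['conditions']} in {data['city']}.\n\n"
        f"Question: {question}"
    )

prompt = ground_with_tool("What's the weather in Berlin?", "Berlin")
```

Because the tool result carries a fetch timestamp, the model can be instructed to report how fresh its grounding data is, something a pre-indexed RAG store cannot guarantee.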

Key Takeaways

- Grounding ties model output to verifiable external sources, so answers rest on real data rather than training memory.
- RAG is one grounding technique; web search, API/tool calls, and document uploads are others.
- Grounded prompts should instruct the model to answer only from the provided sources and to cite them.
- When the sources don't contain the answer, the model should say so instead of guessing.


Part of the DeepRaft Glossary — AI and ML terms explained for developers.