
Hallucination

Learn what Hallucination means in AI and machine learning, with examples and related concepts.

Definition

Hallucination is when an AI model generates output that sounds confident and plausible but is factually wrong, fabricated, or unsupported by its training data.

This isn’t a bug in the traditional sense — it’s a fundamental consequence of how LLMs work. These models are trained to produce likely-sounding text, not to verify truth. When the model doesn’t “know” something, it doesn’t say “I don’t know” — it generates the most probable continuation, which can be entirely made up.

Common hallucination types include: inventing citations that don’t exist, fabricating statistics, confidently stating wrong facts, and creating plausible-looking code that calls non-existent APIs or functions.
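One of these failure modes — code that calls non-existent APIs — is the easiest to catch mechanically. The sketch below is an illustration, not part of any library (the `undefined_calls` helper is hypothetical): it parses generated Python and flags attribute calls on a module that the module doesn't actually define.

```python
import ast
import math

def undefined_calls(code: str, module, alias: str) -> list[str]:
    """Return names called as `alias.name(...)` that `module` doesn't define."""
    missing = []
    for node in ast.walk(ast.parse(code)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == alias
                and not hasattr(module, node.func.attr)):
            missing.append(node.func.attr)
    return missing

# "math.fast_inverse_sqrt" looks plausible but doesn't exist
snippet = "x = math.sqrt(2)\ny = math.fast_inverse_sqrt(2)"
print(undefined_calls(snippet, math, "math"))  # → ['fast_inverse_sqrt']
```

The same idea extends to checking generated code against your project's own API surface before running it.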

How It Works

Hallucinations happen because LLMs are pattern-completion machines, not knowledge databases:

Why LLMs hallucinate:

1. TRAINING OBJECTIVE
   Goal: predict the next likely token
   NOT: verify the truth of what's generated
   → The model optimizes for plausibility, not accuracy

2. KNOWLEDGE GAPS
   User:  "What was the GDP of Liechtenstein in Q3 2025?"
   Model: [hasn't seen this specific data]
   Model: "The GDP was $7.2 billion..."  ← plausible but fabricated
         (instead of saying "I don't have that data")

3. PATTERN OVER-GENERALIZATION
   Training data: many papers have DOIs like "10.1038/..."
   Model generates: "10.1038/s41586-024-07832-x" ← looks real, doesn't exist
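The training objective in point 1 can be reduced to a toy sketch: decoding selects high-probability tokens, and nothing in the process rewards truth. The probability distribution below is invented purely for illustration.

```python
def greedy_next_token(probs: dict[str, float]) -> str:
    """Pick the most probable continuation — plausibility, not truth."""
    return max(probs, key=probs.get)

# Hypothetical next-token distribution after
# "The GDP of Liechtenstein in Q3 2025 was..."
probs = {"$7.2 billion": 0.41, "$6.8 billion": 0.33, "[I don't know]": 0.02}
print(greedy_next_token(probs))  # → $7.2 billion
```

A confident-sounding fabrication wins because it is the likeliest-looking text — exactly the knowledge-gap failure in point 2.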

The risk is highest when:

- The prompt asks for specific, verifiable details (citations, DOIs, statistics, dates)
- The topic is niche, very recent, or sparsely represented in training data
- The prompt pressures the model to produce an answer rather than admit uncertainty

Why It Matters

Hallucination is consistently among the top concerns for enterprises adopting LLMs: a model that fabricates citations, figures, or legal precedents can cause real damage in production. Every production AI system needs a hallucination mitigation strategy.

Mitigation Strategies

| Strategy | How It Helps | Example |
|---|---|---|
| RAG | Grounds answers in real documents | Perplexity AI cites sources for every claim |
| Grounding | Connects model output to verifiable sources | Google Gemini with Google Search |
| Low temperature | Reduces randomness, sticks to likely tokens | temperature=0 for factual tasks |
| Prompt engineering | Instruct the model to say "I don't know" | "If unsure, say you're not certain" |
| Human review | Catches fabrications before they reach users | Required for high-stakes outputs |
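As a rough complement to human review, unsupported claims can be pre-flagged automatically. The sketch below is a crude lexical-overlap heuristic (the `unsupported_claims` name, the 0.5 threshold, and whitespace tokenization are all illustrative assumptions, not an established method); production systems typically use entailment models or citation verification instead.

```python
def unsupported_claims(answer: str, documents: list[str],
                       threshold: float = 0.5) -> list[str]:
    """Flag answer sentences whose words barely overlap the sources."""
    source_words = set(" ".join(documents).lower().split())
    flagged = []
    for sentence in answer.split("."):
        words = set(sentence.lower().split())
        if not words:
            continue
        # Fraction of the sentence's words that appear in the sources
        if len(words & source_words) / len(words) < threshold:
            flagged.append(sentence.strip())
    return flagged

# Toy data, invented for illustration
docs = ["the gdp of liechtenstein was 7 billion dollars in 2023"]
answer = ("The GDP of Liechtenstein was 7 billion dollars in 2023. "
          "The economy grew rapidly under new quantum policies.")
print(unsupported_claims(answer, docs))
# → ['The economy grew rapidly under new quantum policies']
```

The grounded sentence passes; the fabricated one shares almost no vocabulary with the sources and gets flagged for review.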

Example

from anthropic import Anthropic

client = Anthropic()

# BAD: Ask for specific citations — high hallucination risk
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": "List 5 peer-reviewed papers about transformer efficiency published in 2025 with DOIs."
    }]
)
# ⚠️ The model may generate plausible-looking but fake DOIs

# BETTER: Instruct the model to be honest about uncertainty
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=300,
    temperature=0,
    messages=[{
        "role": "user",
        "content": """List peer-reviewed papers about transformer efficiency.
        Rules:
        - Only include papers you are confident exist
        - If you're unsure about a detail (DOI, year, author), say so explicitly
        - It's better to list fewer papers than to fabricate any"""
    }]
)

# BEST: Use RAG to ground responses in real documents
# (See the RAG glossary entry for a full implementation)
def grounded_answer(query: str, documents: list[str]) -> str:
    context = "\n---\n".join(documents)
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""Answer based ONLY on the provided documents.
If the documents don't contain the answer, say "Not found in provided sources."

Documents:
{context}

Question: {query}"""
        }]
    )
    return response.content[0].text

Key Takeaways

- Hallucination is a structural property of how LLMs generate text, not an occasional bug
- Risk is highest when prompts demand specific, verifiable details the model may not have seen
- Mitigate with RAG and grounding, low temperature, prompts that permit uncertainty, and human review
- No single technique eliminates hallucinations — layer several, and verify high-stakes outputs

Part of the DeepRaft Glossary — AI and ML terms explained for developers.