Why AI Confidently Makes Things Up and What You Can Do About It

You asked ChatGPT for the CEO of a company and it gave you a name with complete confidence. The name was wrong. You asked it to summarise a legal document and one of the key details it reported wasn’t in the document at all. You asked Claude to recommend a recent report on a topic and it cited a study that doesn’t exist.

This isn’t a bug or a malfunction. It’s a structural feature of how large language models work — and understanding it is the difference between using AI tools intelligently and being embarrassed by them in front of a client.

What Hallucination Actually Means

In AI, “hallucination” refers to a model generating text that is factually incorrect, fabricated, or unsupported, while presenting it in the same confident tone as accurate information. The model doesn’t know it’s wrong. It has no concept of “I made this up.” It’s doing exactly what it was trained to do: predict the most plausible next token given everything that came before.

That’s the key insight. These models aren’t retrieving facts from a database. They’re generating text that looks like the kind of text that would answer your question correctly. Most of the time, that works. But the mechanism has no built-in truth verification, which means it fails in ways that can be hard to detect without external checking.

Why Models Hallucinate: The Technical Picture in Plain English

Large language models are trained on enormous amounts of text — web pages, books, articles, code. Through that training, they develop an extremely sophisticated model of language: what words follow other words, what structures appear in certain contexts, what kinds of answers appear after certain kinds of questions.

When you ask a question, the model isn’t “thinking” in the way a human does. It’s pattern-matching at massive scale: given this input, what is the most statistically likely output? For most common questions, the training data contains enough examples that the most likely output is correct. For uncommon questions — niche topics, recent events, very specific facts — the training data is sparse, and the model fills in the gaps with plausible-sounding text that may not be grounded in reality.
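To make the pattern-matching point concrete, here is a toy sketch. It is not how a production model works: it assumes a four-sentence made-up corpus and a "model" that only counts which word follows which. The names `CORPUS`, `train_bigrams`, and `most_likely_next` are invented for this illustration. The point it shows is that the most statistically likely continuation comes out whether or not it is true.

```python
from collections import Counter, defaultdict

# Tiny made-up "training corpus". Real models train on vastly more text.
CORPUS = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of spain is madrid . "
)

def train_bigrams(text):
    """Count which word follows which word in the text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = train_bigrams(CORPUS)

# This toy model looks only at the previous word, so any sentence ending
# in "is" gets the same continuation, including questions about places
# that never appeared in the corpus.
print(most_likely_next(counts, "is"))  # prints "paris"
```

Ask this toy model to complete "the capital of portugal is" and it still says "paris", because that is the most common pattern it has seen. Real models condition on far more context, but the failure has the same shape: sparse data gets filled with the nearest familiar pattern.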

Three situations reliably produce more hallucinations than others:

Questions about specific facts the model is uncertain about. Dates, statistics, names, URLs, citations: anywhere a precise value matters, models are more likely to generate something plausible but wrong than to admit they don’t know.

Questions about recent events past the training cutoff. Models have knowledge cutoffs. Events after that date simply aren’t in the training data. Rather than saying “I don’t know,” the model often generates a plausible-sounding answer based on patterns from before the cutoff.

Questions where the correct answer is counterintuitive or rare. If the right answer looks unusual compared to most training examples, the model tends to drift toward the more common (but wrong) pattern.

How Common Is This, Really?

More common than most business users realise, less common than the sceptics claim. Measured hallucination rates vary widely with the task and the model: reported figures on factual question-answering tasks range from roughly 3% to 27%, with the higher rates on more obscure or specific questions.

For business-critical content (anything involving specific facts, figures, dates, legal details, or citations), even a 5% error rate is not acceptable without human review. For general writing, brainstorming, summarising content you already have, or drafting first versions of things you’ll edit, the hallucination risk is lower and often tolerable.
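A quick back-of-the-envelope calculation shows why a small per-claim error rate still matters. The sketch below assumes each factual claim is wrong independently with the same probability. That is a simplification (real errors cluster on obscure topics), and the function name is invented for this illustration.

```python
def chance_of_at_least_one_error(per_claim_error_rate, number_of_claims):
    """Probability that at least one of n independent claims is wrong."""
    return 1 - (1 - per_claim_error_rate) ** number_of_claims

# Assumed 5% error rate per claim, for documents of different sizes.
for n in (1, 5, 20):
    p = chance_of_at_least_one_error(0.05, n)
    print(f"{n:>2} claims: {p:.0%} chance of at least one error")

# Output:
#  1 claims: 5% chance of at least one error
#  5 claims: 23% chance of at least one error
# 20 claims: 64% chance of at least one error
```

Under those assumptions, a report containing twenty specific facts is more likely than not to contain at least one error, which is the case for checking every claim rather than spot-checking a few.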

The most dangerous hallucinations aren’t the obvious ones (a clearly wrong date or a completely made-up company). They’re the plausible ones — a statistic that’s slightly off, a person’s title that’s almost right, a legal clause that sounds reasonable but doesn’t reflect the actual contract.

High vs Low Hallucination Risk by Task Type

Task | Risk Level | Why
Citing sources, statistics, studies | 🔴 High | Models fabricate plausible-sounding references
Specific dates, names, titles | 🔴 High | Precision required; model fills gaps with patterns
Legal and regulatory details | 🔴 High | Nuanced rules; confident-sounding errors are dangerous
Summarising a document you provide | 🟡 Medium | Grounded in source material but can still misread
Drafting emails, marketing copy | 🟢 Low | No specific facts required; quality of reasoning matters more
Brainstorming, ideation, frameworks | 🟢 Low | No factual claims to verify; plausibility is the goal

Five Practical Techniques to Reduce Hallucination Risk

1. Provide the source material yourself

The single most effective way to reduce hallucinations is to give the model the information it needs rather than asking it to recall it. If you want a summary of a report, paste the report in. If you want analysis of a contract, include the contract text. When the model is working from content you’ve provided rather than its training data, the hallucination risk drops dramatically — it’s constrained to what’s actually in front of it.
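One way to apply this in practice is to wrap every question in a template that carries the source text and tells the model what to do when the answer is missing. The following is a minimal sketch, not a tested prompt: the function name, the marker lines, and the exact wording are assumptions chosen for illustration, and the resulting string can be pasted into whichever tool you use.

```python
def build_grounded_prompt(source_text, question):
    """Wrap a question with the source material it must be answered from."""
    return (
        "Answer the question using only the document between the markers.\n"
        "If the document does not contain the answer, reply exactly: "
        "Not stated in the document.\n\n"
        "=== DOCUMENT START ===\n"
        f"{source_text.strip()}\n"
        "=== DOCUMENT END ===\n\n"
        f"Question: {question.strip()}"
    )

# Example with a one-line stand-in for a pasted contract.
prompt = build_grounded_prompt(
    source_text="Payment is due within 30 days of the invoice date.",
    question="What is the payment deadline?",
)
print(prompt)
```

Giving the model an explicit way to decline matters as much as supplying the text: without one, a question the document doesn't answer invites a guess.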

2. Ask the model to show its reasoning

Adding “explain your reasoning step by step” or “indicate if you’re uncertain about any of these claims” to your prompt won’t eliminate hallucinations, but it helps in two ways. It forces the model to surface its logic, which makes errors easier to spot. And models tend to be more accurate when prompted to reason explicitly rather than just produce an answer.

3. Use web-search-enabled models for factual queries

ChatGPT with search enabled, Claude with web search, and Perplexity AI can all retrieve current information from the web before answering. For queries involving specific facts, recent events, statistics, or citations, this is significantly more reliable than asking a model to answer from training data alone. Make it a habit: any time you need a specific, verifiable fact, use a model with live search rather than one working from training data alone.

4. Never ask for citations unless you’re going to verify them

One of the most reliable hallucination triggers is asking an AI to “cite sources” or “provide references.” Models are very good at generating text that looks like academic citations — author names, journal titles, publication years — and very prone to making them up. If you need real citations, use Perplexity AI (which links to actual sources) or verify any AI-generated citation independently before using it.
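If AI-generated text does contain references, it helps to pull them out into an explicit checklist so none slips through unverified. The sketch below is one rough way to do that. The regular expressions are simplified assumptions that will miss some formats, the function name is invented here, and the URL and DOI in the sample text are made-up placeholders. It only lists what needs checking; confirming that each source exists and says what is claimed is still a manual step.

```python
import re

# Simplified patterns: good enough for a checklist, not a full parser.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s<>]+")

def references_to_verify(ai_text):
    """Return URLs and DOI-like strings found in the text, deduplicated."""
    found = URL_PATTERN.findall(ai_text) + DOI_PATTERN.findall(ai_text)
    # Drop punctuation that belongs to the sentence, not the reference.
    cleaned = [item.rstrip(".,;") for item in found]
    return list(dict.fromkeys(cleaned))

# Placeholder text standing in for a model's answer.
sample = (
    "See https://example.com/report. "
    "A related paper is doi:10.1234/example.5678, which supports this."
)
for reference in references_to_verify(sample):
    print("[ ] verify:", reference)
```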

5. Build human review into high-stakes workflows

For any AI-generated content that will be seen by clients, published publicly, used in legal or financial contexts, or relied upon for decisions: build in a human review step. Not a quick skim — a deliberate check of any factual claims against a primary source. This isn’t a criticism of AI tools; it’s just appropriate use. The model is a first drafter, not a fact-checker.

What to Do When Your AI Gives a Customer Wrong Information

If you’re running a customer-facing chatbot and a user receives incorrect information from it, the right response is the same as any other customer service error: acknowledge it promptly, correct it, and improve the underlying system.

The improvement side is where the real work is. Most chatbot hallucinations come from one of three sources: the model being asked questions outside the scope of the information you’ve given it, the provided information being outdated, or the prompt not constraining the model tightly enough. Address the root cause — add relevant information to the knowledge base, update stale content, or tighten the system prompt to say something like “only answer based on the information provided; if you don’t know, say so.”
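Two of those root causes, stale content and a loose prompt, can be sketched in a few lines. Everything here is an illustrative assumption rather than a recommended design: the `KnowledgeEntry` structure, the 180-day review window, the fallback wording, and the sample entries are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class KnowledgeEntry:
    text: str
    last_reviewed: date

FALLBACK = "I don't have that information, so I'll pass this to our team."

def stale_entries(entries, today, max_age_days=180):
    """Return entries not reviewed within max_age_days."""
    return [e for e in entries if (today - e.last_reviewed).days > max_age_days]

def build_system_prompt(entries):
    """Restrict the model to the supplied facts and give it a way to decline."""
    facts = "\n".join(f"- {e.text}" for e in entries)
    return (
        "Only answer based on the information provided below. "
        f"If the answer is not there, reply exactly: {FALLBACK}\n\n"
        f"Information:\n{facts}"
    )

# Made-up knowledge base for a hypothetical support chatbot.
entries = [
    KnowledgeEntry("Returns are accepted within 30 days.", date(2025, 1, 10)),
    KnowledgeEntry("Support hours are 9am to 5pm on weekdays.", date(2023, 3, 1)),
]
today = date(2025, 3, 1)

for entry in stale_entries(entries, today):
    print("Needs review:", entry.text)
print(build_system_prompt(entries))
```

Running the staleness check on a schedule turns "update stale content" from a reaction to a customer complaint into routine maintenance.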

The Right Mental Model for Using AI Accurately

Think of a large language model as an extremely well-read colleague who has absorbed an enormous amount of information but has an imperfect memory for specific details and no access to anything published recently. You’d happily ask this colleague to draft a proposal, brainstorm ideas, review your writing, or explain a concept. You’d think twice before relying on them for a precise legal citation or a current statistic without double-checking.

Building Verification Habits Into AI Workflows

The most durable solution to AI hallucination is not better prompting but better workflow design that incorporates verification as a standard step rather than an optional check. For any workflow where AI outputs will be published, shared with clients, or used in decisions, build verification checkpoints that match the stakes. Factual claims in client deliverables should be traced to primary sources before delivery. Statistics cited in published content should be verified against the original data source. AI-generated code should pass a test suite before deployment. Medical, legal, or financial information should receive domain expert review. The verification step is not an admission that AI is unreliable — it is the same professional standard that applies to any research or analysis process, AI-assisted or not. Frame it as quality assurance, not AI distrust, and it integrates naturally into professional workflows.
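The checkpoints described above can be written down as a simple lookup so the required check is explicit rather than left to memory. This is only a sketch of the idea: the content-type labels and the function name are hypothetical, and a real workflow would likely live in a project tracker or review tool rather than a script.

```python
# Required check per content type, taken from the examples in the text.
REQUIRED_CHECKS = {
    "client_deliverable": "Trace each factual claim to a primary source",
    "published_statistic": "Verify against the original data source",
    "code": "Pass the test suite before deployment",
    "medical_legal_financial": "Review by a domain expert",
}

def release_blockers(content_type, completed_checks):
    """Return the checks still outstanding before this content can go out."""
    required = REQUIRED_CHECKS.get(content_type)
    if required is None:
        raise ValueError(f"Unknown content type: {content_type!r}")
    return [] if required in completed_checks else [required]

# Nothing has been checked yet, so the code cannot ship.
print(release_blockers("code", set()))
# Once the check is recorded, there are no blockers left.
print(release_blockers("code", {"Pass the test suite before deployment"}))
```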

The well-read-colleague mental model maps almost perfectly onto how these tools actually behave. Use them for what they’re genuinely good at (generating, structuring, editing, and reasoning about content) and verify anything specific and factual before it goes anywhere that matters.
