Why ChatGPT and Gemini say different things about your brand (and how to explain it to the client)
The email arrives on a Tuesday: "I asked ChatGPT about accountancy firms in Zaragoza and we came up third. My partner tried the same thing in Gemini and we don't appear. Which of the two is lying?"
Neither. And if you can't explain why, the conversation gets complicated: the client concludes that this AI thing is a lottery and that there's no point paying you to work on it. The reality is exactly the opposite, and this article gives you the explanation — in terms a client understands — and the way to turn it into an argument in your favour.
The AIs don't read the same newspaper
The first cause, and the most important: each AI searches different places when it needs fresh information.
When ChatGPT switches on its web search, it draws mainly on the Bing index. Gemini is Google's and consults the Google index. Perplexity keeps its own indexing system and crawls the web itself. And there are models that barely search live and answer almost entirely from what they learned during training.
Translation for the client: it's like asking three friends about a restaurant, each of whom reads different guides. One reads the guide where your client has a full listing and reviews; another reads one where they don't even appear. It's not that a friend is lying: it's that their sources don't match.
This explains most discrepancies. If the client's website indexes well in Google but has problems in Bing, or if their reviews live in a directory that Perplexity crawls but the others ignore, the answers are bound to diverge.
The base brain isn't the same either
Second cause: even if every AI read the same sources today, each comes to the conversation with a "brain" trained at a different moment and on different materials.
Each model has a training cut-off date: nothing that happened after it is in the model's memory, so the model can only find it by searching. If your client opened their second clinic four months ago, an AI with recent training may know it "from memory", while another will only know it if its search engine stumbles on the news. And the training materials don't match either: some AIs have digested more forums, others more press, others more company listings.
Result: faced with the same question, one AI pulls from outdated memory, another from a fresh search, and the answers don't line up.
And even if everything were the same, the same AI changes from one day to the next
Third cause, the one that throws people most: ask the same AI the same thing twice and it can give you two different answers. These systems generate text with a degree of deliberate variability; they don't recite a database, they draft each answer from scratch. Add that the indexes update daily and that the companies themselves tune their models constantly, and you've got ground that shifts every week.
That's why a one-off screenshot proves almost nothing, whether it flatters the client or not. It proves only what that AI said, that day, to that question phrased exactly that way.
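One way to make this concrete is to treat each answer as a single sample and count how often the brand shows up across repeated runs. The sketch below assumes the answers have already been collected by whatever means you use; the `appearance_rate` helper, the firm names and the sample answers are all hypothetical, and plain substring matching is a deliberate simplification.

```python
def appearance_rate(answers: list[str], brand: str) -> float:
    """Share of answers that mention the brand (case-insensitive substring match)."""
    if not answers:
        raise ValueError("need at least one answer")
    hits = sum(brand.lower() in answer.lower() for answer in answers)
    return hits / len(answers)


# Hypothetical answers from the same AI to the same question on different days.
sample_answers = [
    "Top firms in Zaragoza: Alfa Asesores, Gestoria Ebro, Despacho Pilar.",
    "You could try Gestoria Ebro or Alfa Asesores.",
    "Recommended: Despacho Pilar, Gestoria Ebro, Alfa Asesores.",
    "Well-known options include Gestoria Ebro and Asesoria Aragon.",
]

rate = appearance_rate(sample_answers, "Despacho Pilar")
print(f"Mentioned in {rate:.0%} of runs")
```

A single screenshot is one element of that list. The rate over many runs is the figure worth reporting.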
The script to explain it to the client
When the email about the partner and the disagreeing Gemini answer arrives, this is the reply that shows you have a handle on it:
Key data
"It's normal and it happens to everyone: each AI consults different sources — ChatGPT looks mainly at Bing, Gemini at Google, Perplexity has its own search engine — and on top of that their answers vary from one day to the next. Appearing in one and not in another isn't an error: it's the real picture of where you're strong and where you're not. And that's exactly why you can't watch this with a single query on a Tuesday."
Three ideas in one: it's normal, there's an explanation, and the logical conclusion is to measure properly. If the client wants to go deeper without jargon, there are analogies that work very well for explaining AI visibility.
From problem to argument: why this sells multi-AI monitoring
Here's the twist that matters to your agency: the discrepancy between AIs isn't a flaw in the channel, it's the reason the service exists.
If all four AIs said the same thing, you could just check one and extrapolate. Since they don't, whoever looks only at ChatGPT is blind to three of the four shop windows. And it's not a rare case: in the study we ran on 9,865 Spanish SMEs across 30 sectors and 10 cities, 91% appeared in only 1 of the 4 main AIs (the data is broken down by sector and city). The normal situation for a Spanish SME is exactly that of Tuesday's email: visible in one, invisible in the rest.
The sales argument builds itself:
- Coverage, not anecdote. "We measure you across the 4 AIs with your sector's real questions, not with a one-off query."
- Trend, not snapshot. Since the answers vary, what counts is the series: are you appearing more or less than two months ago? A one-off measurement doesn't answer that; periodic tracking does. We cover the right cadence in our piece on how often to monitor without losing your mind.
- Actionable diagnosis. If you appear in Gemini but not in ChatGPT, the problem usually lies in specific sources (indexing in Bing, directories, reviews) and can be worked on. The discrepancy tells you where to dig.
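The trend and diagnosis points above can be sketched with a few lines of arithmetic. This is a minimal illustration, not how any particular tool computes it: the weekly appearance rates are invented, and the `trend` and `gaps` helpers, the four-week window and the 0.2 threshold are assumptions chosen for the example.

```python
from statistics import mean

# Hypothetical weekly appearance rates (0.0 to 1.0) per AI, oldest week first.
weekly_rates = {
    "ChatGPT": [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4],
    "Gemini": [0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6],
    "Perplexity": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "Claude": [0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.1],
}


def trend(series: list[float], window: int = 4) -> float:
    """Mean of the last `window` weeks minus the mean of the `window` weeks before."""
    if len(series) < 2 * window:
        raise ValueError("series too short for the chosen window")
    recent = series[-window:]
    previous = series[-2 * window:-window]
    return mean(recent) - mean(previous)


def gaps(rates: dict[str, list[float]], threshold: float = 0.2) -> list[str]:
    """AIs whose latest weekly rate is below the threshold: where to dig first."""
    return [engine for engine, series in rates.items() if series[-1] < threshold]


for engine, series in weekly_rates.items():
    print(f"{engine}: change vs previous 4 weeks = {trend(series):+.2f}")
print("Needs work:", ", ".join(gaps(weekly_rates)))
```

Comparing two windows rather than two single weeks smooths out the day-to-day variability, which is the reason to keep a series in the first place.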
Doing this by hand (4 AIs × dozens of prompts × every client × every week) doesn't scale, which is why we automate it in Surfeo for agencies: each client's prompts are monitored weekly in ChatGPT, Gemini, Perplexity and Claude, and the history is saved so you can show the evolution instead of one-off screenshots.
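To put a rough number on that multiplication, here is a small sketch. The figures of 24 prompts per client and 15 clients are assumptions picked purely for illustration, and `weekly_checks` is a hypothetical helper.

```python
def weekly_checks(n_ais: int, prompts_per_client: int, n_clients: int) -> int:
    """Number of individual AI answers to collect and read each week."""
    return n_ais * prompts_per_client * n_clients


# Assumed figures for illustration only: 4 AIs, 24 prompts per client, 15 clients.
per_week = weekly_checks(4, 24, 15)
per_year = per_week * 52
print(f"{per_week} checks per week, {per_year} per year")
```

Under those assumptions a small agency would be reading well over a thousand answers every week.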
What you shouldn't tell the client
Two temptations worth avoiding:
"We'll get you appearing in all of them." Nobody controls what an AI answers. You can work on the sources the AIs draw on (content, structured data, reviews, directories), and that moves results, but promising guaranteed presence in all four means committing to something you can't deliver. What you can commit to and what you can only report is covered in our piece on what KPIs to put in a GEO proposal.
"Gemini is wrong, listen to ChatGPT." Each AI has its audience and its volume. Writing one off because it doesn't favour you today is looking the other way: your client's customer doesn't pick the AI that speaks best about you.
Frequently asked questions
Which of the AIs is "the important one" for my client?
It depends on their audience, which is why you measure them all. ChatGPT has the largest user base in Spain — frequent use went from 4% to 28% in two years (Funcas, III Survey on AI, 2026) — but Gemini comes built into Google's ecosystem and Perplexity carries weight with profiles that research before buying. Treating them as a single channel with four shop windows is the most honest approach.
If the answers change every week, what's the point of measuring them?
The same as measuring positions in Google even though they dance around: what matters is the trend, not the snapshot. A brand that's been worked on appears more and more often, in more AIs and with more correct data. That can only be seen with periodic, comparable measurements.
Can one AI say false things about my client while another gets it right?
Yes, and it's one of the most urgent cases: old opening hours, outdated addresses, wrong descriptions. It happens because that AI draws on an outdated source the others don't use. The solution isn't to "warn the AI", which you can't do, but to correct the source: website, listings, directories, structured data.
How do I explain this in a slide without launching into the technical spiel?
With a number and a screenshot: "You appear in 2 of the 4 AIs for your 10 key questions." The discrepancy between AIs is captured in the figure itself, no theory needed. We have spelled out exactly what to put on that slide.
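As a rough sketch of how that figure could be derived from raw checks, the snippet below counts an AI as "visible" if the brand is mentioned in at least one of the key questions. That definition, the `slide_headline` helper and the sample results are all assumptions for illustration; a stricter rule (for example, a minimum share of questions) would work the same way.

```python
def slide_headline(results: dict[str, list[bool]]) -> str:
    """results maps each AI to one boolean per key question (True = brand mentioned)."""
    question_counts = {len(flags) for flags in results.values()}
    if len(question_counts) != 1:
        raise ValueError("every AI must be checked with the same questions")
    n_questions = question_counts.pop()
    # Assumption: an AI counts as visible if the brand appears for any question.
    visible = [engine for engine, flags in results.items() if any(flags)]
    return (
        f"You appear in {len(visible)} of the {len(results)} AIs "
        f"for your {n_questions} key questions."
    )


# Hypothetical results for 10 key questions per AI.
sample_results = {
    "ChatGPT": [True, False, True, False, False, True, False, False, False, False],
    "Gemini": [False, False, True, False, False, False, False, False, False, False],
    "Perplexity": [False] * 10,
    "Claude": [False] * 10,
}

print(slide_headline(sample_results))
```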
Next time a client shows you two AIs contradicting each other, don't make excuses: explain why it happens and show them the full data. Take the free visibility test with their website and you'll see in minutes which of the 4 AIs they appear in and which they don't.