How to Check What AI Systems Say About Your Brand Without Fooling Yourself

A brand audit in AI search becomes useful only after you stop treating one answer as a verdict.

A VP of marketing opens ChatGPT after a board meeting and types the company name. The answer is not terrible. It gets the category roughly right, mentions two real products, and links to the website. It also describes the company as “early-stage,” although the business now has 140 employees and enterprise customers. The VP takes a screenshot, drops it into Slack, and asks whether this is “an AI visibility issue.”

Within ten minutes, three more screenshots appear. Someone asks Perplexity the same question and gets a cleaner answer with citations. Someone asks Google’s AI Mode for alternatives to a competitor and the company is missing entirely. Someone asks Gemini a broader category question and sees two smaller competitors listed above it. The thread becomes animated in the way internal Slack threads become animated when nobody is sure whether the evidence is serious.

There is a small embarrassment inside this scene. Everyone knows the screenshots matter, but nobody knows how much.

That is where most AI brand checks begin: a few prompts, a few screenshots, a little panic, and a large temptation to overinterpret. The answer may be wrong, but it may also be unstable. The brand may be absent, but perhaps the prompt was poorly chosen. A competitor may appear, but perhaps the question was closer to their segment than yours. A citation may look authoritative while pointing to an old directory page nobody has updated in years.

Checking what AI systems say about your brand is worth doing. It is becoming part of ordinary brand hygiene. But done casually, it produces more anxiety than knowledge.

The first answer is usually the least interesting one

Most teams start with a branded prompt: “What is [Company]?” That is a natural place to start, but it is also a narrow one. A branded prompt tests whether the system can describe a company after being directly named. It does not tell you much about whether the company appears during discovery.

Buyers rarely begin with the exact question marketing teams wish they would ask. They ask for a vendor category. They ask for alternatives. They ask whether a company is trustworthy. They ask how one provider compares with another. They ask which solution fits a specific problem, budget, market, or team size. They ask a question that sounds slightly wrong because they do not yet know the category language.

The branded answer is still useful. It shows the basic public biography of the company: what the system thinks the company does, which products or services it names, whether the facts are current, and whether it uses language the company recognizes. But the more revealing answers often come from prompts where the brand is not named at all.

A company may be described accurately when named and still be absent when the buyer asks for providers in the category. Another may appear in “alternatives to [competitor]” prompts but not in “best providers for [use case]” prompts. A third may appear only when the prompt uses old category language. These patterns tell different stories.

The useful audit begins when the team stops asking, “What does AI say about us?” and starts asking, “In which buyer situations does AI remember us, and in which ones does it not?”

Different systems expose different parts of the public record

It is also a mistake to treat “AI” as one surface.

OpenAI says ChatGPT Search can provide timely answers with links to relevant web sources, and that ChatGPT may decide to search the web depending on the question or the user’s choice. Its help page frames search as something that can bring web sources into the answer, not as a guarantee that every answer is produced in the same way. Perplexity describes itself as an AI-powered search engine that searches the web and returns conversational answers backed by verifiable sources and citations, and its help center makes citations central to the product experience. According to Google Search Central, Google’s AI features may use query fan-out, issuing multiple related searches across subtopics and data sources to build a response.

Those mechanisms are not identical. They do not retrieve the same sources, present the same citations, or respond to prompts with the same level of stability. Even within one system, answers can change with phrasing, freshness, location, session context, and whether retrieval is triggered.

This is irritating if you want a single score. It is useful if you want diagnosis.

Perplexity may expose which public pages are being treated as sources because citations are part of the interface. Google’s AI features may reveal how a broad comparison query gets decomposed into related subtopics. ChatGPT may show that a named brand is understood well while category-level discovery is weak. A system that leaves the company out may not be “wrong” in isolation; it may be showing that the brand’s public evidence is too thin for that particular question.

No single platform deserves to be crowned as the truth. The useful work is reading across platforms for recurring patterns.

If one system misses the company, it is a clue. If every system misses it for the same buyer question, it is a problem.
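
The clue-versus-problem distinction can be applied mechanically once results are written down per prompt and per system. The following is a minimal sketch, assuming a hand-recorded table of whether the brand appeared; the platform labels, the sample rows, and the `classify_misses` helper are illustrative and not part of any tool's API.

```python
from collections import defaultdict

# Hypothetical hand-recorded observations: (prompt, platform, brand_appeared).
observations = [
    ("best providers for [use case]", "chatgpt", True),
    ("best providers for [use case]", "perplexity", False),
    ("best providers for [use case]", "gemini", True),
    ("alternatives to [competitor]", "chatgpt", False),
    ("alternatives to [competitor]", "perplexity", False),
    ("alternatives to [competitor]", "gemini", False),
]


def classify_misses(rows):
    """Label each prompt by how many platforms missed the brand."""
    by_prompt = defaultdict(list)
    for prompt, _platform, appeared in rows:
        by_prompt[prompt].append(appeared)

    labels = {}
    for prompt, results in by_prompt.items():
        misses = results.count(False)
        if misses == 0:
            labels[prompt] = "present"
        elif misses == len(results):
            labels[prompt] = "problem"  # every system missed it
        else:
            labels[prompt] = "clue"  # only some systems missed it
    return labels


labels = classify_misses(observations)
for prompt, label in labels.items():
    print(f"{label:8} {prompt}")
```

The labels are only as good as the prompts behind them. A "problem" on a prompt that no real buyer would type is still noise.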

A good prompt set sounds like a buyer, not a vendor

A common failure mode in AI visibility checks is vendor-centered prompting. The team writes prompts using internal language: exact product names, preferred category labels, polished phrasing from the homepage. The system responds reasonably well, and everyone feels better than they should. A real buyer is usually less precise.

A procurement lead may ask for “tools that monitor what ChatGPT says about a brand” even if the company calls the category “AI visibility intelligence.” A founder may ask “how to know if AI search is hurting our company” rather than request a “generative engine optimization audit.” A marketer may ask “why Perplexity recommends competitors” or “how to track brand mentions in AI answers.” A board member may ask for “companies helping B2B brands show up in AI search,” which is broad enough to pull in SEO agencies, analytics tools, reputation firms, and newer GEO providers.

Those imperfect prompts are valuable because they resemble the market’s language before the market has adopted yours.

The audit should include branded questions, but also category questions, alternative questions, comparison questions, trust questions, and use-case questions. It should include prompts that use your preferred language and prompts that use the messy language buyers actually use. It should include direct competitors and the wrong competitors you keep getting compared with, because AI systems may reproduce the same confusion.
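
One way to keep that coverage honest is to write the prompt set down as data and check it for gaps. This is a rough sketch, assuming one prompt per question type plus a buyer-worded category prompt; the `Prompt` record, the ID scheme, and every prompt template are hypothetical examples, and a real set would be larger and written in the team's own market language.

```python
from dataclasses import dataclass

# The six question types named above.
PROMPT_TYPES = ("branded", "category", "alternative", "comparison", "trust", "use_case")


@dataclass(frozen=True)
class Prompt:
    prompt_id: str    # stable ID so later runs can be compared
    prompt_type: str  # one of PROMPT_TYPES
    wording: str      # "vendor" for your preferred language, "buyer" for theirs
    text: str


def build_prompt_set(company, competitor, vendor_category, buyer_phrase, use_case):
    """Build one illustrative prompt per type, plus a buyer-worded category prompt."""
    rows = [
        ("branded", "vendor", f"What is {company}?"),
        ("category", "vendor", f"Best {vendor_category} providers"),
        ("category", "buyer", buyer_phrase),
        ("alternative", "buyer", f"Alternatives to {competitor}"),
        ("comparison", "buyer", f"How does {company} compare with {competitor}?"),
        ("trust", "buyer", f"Is {company} trustworthy?"),
        ("use_case", "buyer", f"Which provider fits {use_case}?"),
    ]
    return [Prompt(f"{ptype}-{wording}", ptype, wording, text) for ptype, wording, text in rows]


def missing_types(prompts):
    """Return the question types the prompt set does not cover yet."""
    covered = {p.prompt_type for p in prompts}
    return [t for t in PROMPT_TYPES if t not in covered]


prompts = build_prompt_set(
    company="[Company]",
    competitor="[competitor]",
    vendor_category="AI visibility intelligence",
    buyer_phrase="tools that monitor what ChatGPT says about a brand",
    use_case="a B2B brand that wants to show up in AI search",
)
for p in prompts:
    print(f"{p.prompt_id:20} {p.text}")
print("missing types:", missing_types(prompts))
```

Tagging each prompt as vendor-worded or buyer-worded makes it visible when a set leans too heavily on the company's own vocabulary.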

The uncomfortable prompts are often the best ones. They show whether the company is discoverable only inside its own vocabulary.

The source is often more important than the answer

Screenshots are seductive because they make a variable answer look fixed. A screenshot has edges. It feels like evidence.

But the more useful object is usually the source behind the answer.

When an AI system cites a directory, a review platform, an industry article, a partner page, or a service page, open it. Read it like a buyer who is trying to decide whether the company belongs in the category. Does the page describe the current business? Does it use old language? Does it include competitors but not you? Does it frame the category in a way that favors one type of vendor? Does it use a concrete description your own site avoids?

Often the AI answer is less mysterious after the cited pages are inspected. A competitor is recommended because it appears in three third-party sources for the category. Your company is missing because it is absent from the source the answer used. The answer gets your offer wrong because the only page with a plain description is outdated. The AI cites your homepage but still summarizes the company vaguely because the homepage itself never says what the buyer receives.
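
The same inspection can be tallied across many answers to see which sources keep recurring. What follows is a small sketch, assuming citations are copied by hand into a list after someone has opened each page; the URLs are placeholders on reserved example domains, and the `names_brand` field and `source_trail` helper are hypothetical names introduced for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical citations copied by hand from answers to category prompts.
# "names_brand" records whether the cited page mentions your company at all.
citations = [
    {"prompt_id": "category-buyer", "url": "https://directory.example.com/category-tools", "names_brand": False},
    {"prompt_id": "category-vendor", "url": "https://directory.example.com/top-vendors", "names_brand": False},
    {"prompt_id": "alternative-buyer", "url": "https://reviews.example.org/compare", "names_brand": True},
    {"prompt_id": "category-buyer", "url": "https://blog.example.net/roundup", "names_brand": False},
]


def source_trail(rows):
    """Count how often each domain is cited and how often its page omits the brand."""
    cited = Counter()
    omits_brand = Counter()
    for row in rows:
        domain = urlparse(row["url"]).netloc.lower()
        cited[domain] += 1
        if not row["names_brand"]:
            omits_brand[domain] += 1
    return cited, omits_brand


cited, omits_brand = source_trail(citations)
for domain, count in cited.most_common():
    print(f"{domain:25} cited {count}x, omits brand on {omits_brand[domain]}")
```

A domain that is cited often and never names the company is usually a better place to start than the answer text itself.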

A citation is not a trophy. It is a doorway into the source trail.

This is where an AI visibility check becomes more than prompt play. It tells the company which public materials are shaping its description and which materials are failing to do so.

Do not measure only presence

The simplest AI visibility metric is whether the brand appears. It is also the easiest metric to misuse.

Presence can be good, bad, or irrelevant. A brand appearing in the wrong category may be worse than absence. A brand mentioned in a neutral list without proof may not influence a buyer. A brand cited as a source may matter more than a brand merely named in passing. A competitor appearing above you may matter in one prompt and not matter in another if the prompt is poorly matched to your actual buyer.

The audit should read the answer as a piece of market evidence. How is the brand described? Is the category right? Are the facts current? Which competitors appear? What proof is named? Are sources shown? Does the answer sound confident, cautious, or confused? Does the system repeat old language? Does it omit the company’s strongest use case?

A company can have high mention frequency and low description quality. Another can have low mention frequency but strong citations when it does appear. Another may be invisible for broad category prompts but strong for specific use cases. These are different conditions, and they require different work.
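
Keeping those conditions apart is easier when mention frequency and description quality are recorded as separate numbers. Here is a minimal sketch, assuming a human reader has judged each answer; the `AnswerReview` fields and the sample judgments are illustrative assumptions, not a standard scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class AnswerReview:
    prompt_id: str
    mentioned: bool
    category_correct: bool = False  # judged by a human reader; only counted when mentioned
    facts_current: bool = False
    cited_as_source: bool = False


def share(flags):
    """Fraction of True values, or None when there is nothing to judge."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else None


def summarize(reviews):
    """Report presence and description quality as separate figures."""
    mentioned = [r for r in reviews if r.mentioned]
    return {
        "mention_rate": share(r.mentioned for r in reviews),
        "category_accuracy": share(r.category_correct for r in mentioned),
        "facts_current_rate": share(r.facts_current for r in mentioned),
        "cited_rate": share(r.cited_as_source for r in mentioned),
    }


# Hypothetical reviews: the brand shows up often but is described poorly.
reviews = [
    AnswerReview("branded-vendor", True, category_correct=True, facts_current=False),
    AnswerReview("category-buyer", True, category_correct=False, facts_current=False),
    AnswerReview("alternative-buyer", True, category_correct=False, facts_current=True),
    AnswerReview("use_case-buyer", False),
]
summary = summarize(reviews)
for name, value in summary.items():
    print(name, value)
```

Quality figures are computed only over answers that mention the brand, so a low mention rate cannot drag down a description score it has nothing to do with.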

The dashboard temptation is to compress them into one score. The editorial task is to keep enough texture to know what the score means.

Stable prompts beat dramatic screenshots

The most useful AI visibility checks are boring in one specific way: they repeat.

A stable prompt set, run on a schedule, reveals movement. A random set of screenshots reveals mood. If the team changes every prompt every month, it cannot tell whether the brand changed, the market changed, or the prompt changed. If the team keeps the core prompts stable, patterns begin to show.
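
Comparing two runs only makes sense for prompts that appear in both. The sketch below assumes each run is reduced to a mapping from a stable prompt ID to whether the brand appeared; the IDs, the month labels, and the `compare_runs` helper are hypothetical.

```python
def compare_runs(previous, current):
    """Compare brand presence for prompts that appear in both runs.

    Each run maps a stable prompt ID to True (brand appeared) or False.
    """
    shared = sorted(set(previous) & set(current))
    changes = {"gained": [], "lost": [], "unchanged": []}
    for prompt_id in shared:
        if previous[prompt_id] == current[prompt_id]:
            changes["unchanged"].append(prompt_id)
        elif current[prompt_id]:
            changes["gained"].append(prompt_id)
        else:
            changes["lost"].append(prompt_id)
    # Prompts in only one run cannot show movement, so report them separately.
    not_comparable = sorted(set(previous) ^ set(current))
    return changes, not_comparable


# Hypothetical results for one platform in two consecutive months.
march = {"branded-vendor": True, "category-buyer": False, "alternative-buyer": True}
april = {
    "branded-vendor": True,
    "category-buyer": True,
    "alternative-buyer": False,
    "trust-buyer": True,
}

changes, not_comparable = compare_runs(march, april)
print(changes)
print("not comparable:", not_comparable)
```

Because AI answers vary from run to run, a single gained or lost prompt is weak evidence. Movement that persists across several runs is the part worth reading.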

The prompt set will still evolve. New services launch, competitors appear, buyers adopt new language, and platforms change. The core set, however, should remain steady enough to compare one month with the next.

It also helps to preserve the full answer, not only the conclusion. AI answers are variable and sometimes slippery. A result that looks like a win in a screenshot may contain a factual error in the second paragraph. A result that looks like a loss may cite a source that reveals exactly where to improve coverage. The archive of answers becomes useful only when the team can inspect it later.
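
Preserving the full answer can be as simple as appending one record per answer to a file. This is a sketch under stated assumptions: the JSON Lines layout, the field names, and the `archive_answer` and `load_archive` helpers are all illustrative choices, and the answer text is assumed to be pasted in by hand or supplied by whatever collection process the team already uses.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_answer(path, prompt_id, platform, prompt_text, answer_text, cited_urls):
    """Append one full answer to a JSON Lines file, one record per line."""
    record = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "prompt_id": prompt_id,
        "platform": platform,
        "prompt_text": prompt_text,
        "answer_text": answer_text,  # the whole answer, not a summary of it
        "cited_urls": list(cited_urls),
    }
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def load_archive(path):
    """Read every archived record back for later inspection."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Demo in a temporary directory; a real archive would live somewhere durable.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "answers.jsonl"
    archive_answer(
        archive_path,
        prompt_id="branded-vendor",
        platform="chatgpt",
        prompt_text="What is [Company]?",
        answer_text="[Company] is an early-stage provider of ...",
        cited_urls=["https://www.example.com/"],
    )
    records = load_archive(archive_path)

print(len(records), records[0]["prompt_id"])
```

An append-only file keeps earlier answers untouched, which matters when the point is to reread them months later.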

The best rhythm is modest. Run the prompts. Read the answers. Inspect the sources. Note the patterns. Decide which public materials need to change. Then run the same prompts again after enough time has passed for the public record to shift.

This is slower than screenshot theater. It is also the only version that compounds.

The audit should end with edits, not awe

An AI visibility check is not complete when the team knows what the systems said. It is complete when the team knows what to change.

Sometimes the change is factual. A profile needs updating. An old page needs a redirect. A directory category is wrong. A service name is outdated.

Sometimes the change is explanatory. The service page needs to describe the deliverable. The homepage needs a plainer first sentence. The comparison article needs to explain why the company is not in the category buyers assume.

Sometimes the change is evidential. The company needs proof in public, not only in sales decks. It needs a case study, a sample output, a more current partner page, a review profile that reflects the current business, or third-party material that supports the category.

And sometimes the finding is that the prompt itself is not yours to win. That is useful too. AI visibility should not become a project to appear everywhere. It should help the company appear where it belongs and understand where it is being misplaced.

The practical value of an AI brand check is not the screenshot. It is the correction to the public record that the screenshot made visible.