Methodology

How the AVI score works.

A score you can't interrogate is a score you can't trust. This page is the full methodology behind the AI Visibility Index — what we ask, how we measure, and why results behave the way they do.

What we measure

Every scan asks the current production models of ChatGPT, Claude, Perplexity, and Gemini the questions your buyers actually ask: category questions (“best tools for…”), comparisons, and problem-based queries. The prompt set is generated once from your brand's keywords and competitors, then kept identical on every scan so movement in your score reflects movement in the answers, not movement in the questions.
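One way to picture "generated once, then kept identical" is a prompt set built from templates and then fingerprinted, so any later change to the questions is detectable. The sketch below is illustrative only: the template wording, `build_prompt_set`, and `fingerprint` are hypothetical names and are not taken from the AVI implementation.

```python
import hashlib
import json


def build_prompt_set(brand: str, keywords: list[str], competitors: list[str]) -> list[str]:
    """Build category, problem-based, and comparison prompts (hypothetical templates)."""
    prompts = []
    for keyword in keywords:
        prompts.append(f"What are the best tools for {keyword}?")          # category
        prompts.append(f"How do I fix common problems with {keyword}?")    # problem-based
    for rival in competitors:
        prompts.append(f"How does {brand} compare to {rival}?")            # comparison
    return prompts


def fingerprint(prompts: list[str]) -> str:
    """Hash the ordered prompt list so two scans can prove they asked the same questions."""
    payload = json.dumps(prompts, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical brand, keyword, and competitors, used only for illustration.
prompt_set = build_prompt_set("Acme", ["invoice automation"], ["Globex", "Initech"])
baseline = fingerprint(prompt_set)

# On a later scan, the stored set should hash to the same value.
same_questions = fingerprint(list(prompt_set)) == baseline
```

Comparing fingerprints before each scan is a cheap guard: if the hash differs, the score movement can no longer be attributed to the answers alone.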

Every response is stored verbatim. Anything you see in your dashboard — a mention, a sentiment label, a citation — traces back to a raw answer you can read yourself.

The formula

The AVI is a weighted composite of five components, scored 0–100:

Mention rate (40%)

Of all the questions we ask, in how many answers does your brand appear? This is the primary visibility signal: you can't win an answer you're not in.

Sentiment (30%)

When you are mentioned, how favorably? Each mention is classified positive, neutral, or negative based on how the engine describes you.

Context quality (15%)

How you're framed matters: a direct recommendation outranks a comparison, which outranks a passing mention, which outranks criticism.

Position (10%)

Where in the answer you appear. Brands named first carry far more recommendation weight than brands named fourth.

Citation rate (5%)

Whether engines cite your domain as a source. Weighted lowest because engines differ wildly here (Perplexity cites almost always, Claude rarely), so it can't carry the score.
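As a rough sketch of how the five weights combine: the weights below come from the list above, but the page does not spell out how each component is scored, so this assumes every component has already been normalized to a 0-100 scale and that the composite is a plain weighted sum. The function name `avi_score` and the sample values are illustrative.

```python
# Weights as published in the component list; they sum to 1.0.
WEIGHTS = {
    "mention_rate": 0.40,
    "sentiment": 0.30,
    "context_quality": 0.15,
    "position": 0.10,
    "citation_rate": 0.05,
}


def avi_score(components: dict[str, float]) -> float:
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical component scores for one scan.
example = {
    "mention_rate": 60,
    "sentiment": 70,
    "context_quality": 50,
    "position": 40,
    "citation_rate": 20,
}
score = avi_score(example)
```

With these sample inputs the arithmetic is 24 + 21 + 7.5 + 4 + 1 = 57.5. Note how little the citation component moves the total: even a swing from 20 to 100 would add only 4 points.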

Why scores vary between scans

AI engines are probabilistic. The same question can produce slightly different answers on different days — that's the medium, not a measurement error. We do three things about it:

  • Deterministic settings — we pin generation parameters (temperature 0) so each engine is as repeatable as it allows.
  • Smoothing — your headline AVI is smoothed across recent scans, so one unusual response doesn't whipsaw the score.
  • Identical prompts — the question set never changes between scans, so variance comes only from the answers.

Even so, expect your score to move a few points between consecutive scans without any action on your part. Read trends, not ticks: sustained movement across several scans is signal; a single small jump is weather.
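The page does not say which smoothing method is used, so the following is only a sketch under one common assumption: a trailing mean over the most recent scans. The window size of 3 and the name `smoothed_avi` are hypothetical.

```python
from statistics import fmean


def smoothed_avi(raw_scores: list[float], window: int = 3) -> float:
    """Average the most recent `window` raw scan scores (oldest first in the list)."""
    if not raw_scores:
        raise ValueError("need at least one scan")
    if window < 1:
        raise ValueError("window must be at least 1")
    return fmean(raw_scores[-window:])


# Hypothetical raw scores from four consecutive scans; the last one is unusually high.
history = [61.0, 62.0, 60.0, 71.0]

before = smoothed_avi(history[:-1])  # headline score before the unusual scan
after = smoothed_avi(history)        # headline score after it
raw_jump = history[-1] - history[-2]
smoothed_jump = after - before
```

In this example the raw score jumps 11 points between the last two scans, while the smoothed headline moves by a little over 3. A wider window damps noise further but also makes the headline slower to reflect a real change.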

Honest answers

Why did my score change when nothing about my brand changed?

AI engines are probabilistic systems: ask the same question twice and the answers can differ slightly — different competitor orderings, a mention that appears in one run and not the next. We pin generation settings for maximum determinism and smooth scores across scans, but run-to-run movement of a few points is inherent to measuring AI answers. A genuine trend shows up as sustained movement across multiple scans, not a single jump.

Why do engines disagree with each other?

Each engine has different training data, different citation behavior, and different update schedules. A spread between your per-engine scores is expected and is itself useful signal — it tells you which engine to focus on.
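To make the per-engine spread concrete, here is a small sketch that finds the weakest and strongest engine and the gap between them. The scores are made-up sample values and `engine_spread` is a hypothetical helper, not part of the product.

```python
def engine_spread(per_engine: dict[str, float]) -> tuple[str, str, float]:
    """Return (weakest engine, strongest engine, point gap between them)."""
    if not per_engine:
        raise ValueError("need at least one engine score")
    weakest = min(per_engine, key=per_engine.get)
    strongest = max(per_engine, key=per_engine.get)
    return weakest, strongest, per_engine[strongest] - per_engine[weakest]


# Hypothetical per-engine scores for one brand.
scores = {"ChatGPT": 62.0, "Claude": 48.0, "Perplexity": 71.0, "Gemini": 55.0}
weakest, strongest, gap = engine_spread(scores)
```

Here the weakest engine is the natural place to focus first, and the size of the gap hints at how much headroom there is.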

Are the queries cherry-picked?

No. Every brand gets a canonical prompt set — category, comparison, and problem-based questions generated from your keywords and competitors — and the same set runs on every scan, so results are comparable over time. You can read every prompt and every verbatim response in the dashboard.

What don't you do?

We don't fabricate or extrapolate data — every number traces back to a stored, verbatim engine response you can read. We don't pay engines for placement (nobody can — that's the point of AEO). And we don't compare your score against brands we haven't actually scanned.
