
Is Schema Useful for LLMs?

Sara Taher

“Why Sara, why?”
This article may raise the question: why do I throw myself into hot water by bringing up controversial SEO topics?

I know the easier path: just go with the flow, say what everyone else is saying, and collect the applause. But that’s boring. Why talk about schema when so many “SEO influencers” insist it’s useless for “AI search” or “GEO”?

Did I just say "GEO"? Oh boy, that’s another fish to fry...

Why do I even bring up GEO? In a previous newsletter, I explained how GEO differs from traditional SEO tactics, and how it’s not just for Google AI Overviews; it also applies to LLMs like ChatGPT. Two different systems, right?

My point is simple: we need more open discussion, not gatekeeping. SEO shouldn’t be a “my way or the highway” industry. Whether you’re for or against schema, GEO, or any other tactic—you shouldn’t be slapped with negative labels just for having a different opinion.

And let’s talk about fairness. Why go after small creators who mention GEO, but stay quiet when big players do the same? Because you don’t want to upset your friends? 😄 That double standard is the real problem.

Or as Gagan Ghotra puts it:

And before anyone rushes in...

I’ve NEVER pitched GEO to a client, nor offered it as a standalone service. Any AI search or GEO tactics I use are just part of my broader SEO offering.

Also… while we're talking about AI, can we address the trend of people asking ChatGPT how it works and then treating that as fact? 😆 It hallucinates, right?

Anyway, back to SEO’s most debated legit tactic. I say legit because it’s not harmful to your site; the real debate is about its value for SEO. Yep, we’re talking schema!

TL;DR

  • Schema is important for search engines, specifically in cases where it generates rich results in search that can improve your CTR.
  • Schema is also important for ecommerce products and product groups.
  • Microsoft has confirmed that schema markup helps its LLMs understand your content.
  • Some observations showed that OpenAI bots are crawling JSON data more than HTML.
  • The opposing arguments are based on asking LLMs themselves how they work and whether they use schema, and on the tokenization process, which changes the input format. In my opinion these are still not strong arguments, and I may be wrong.
  • Andrea Volpini shared an interesting article that clearly states: "Testing confirms the divide: sites with comprehensive structured data appear accurately in AI responses; those without risk being misunderstood or ignored entirely." There are so many gold nuggets in this article that you'll have to scroll to the bottom to read about it, but one tip that emerged is that maybe we should use different structured data formats (JSON-LD, microdata, and RDFa). Have a read!

Here’s the thing: schema generally is still important.

I've chatted in the past with SEOs who dismissed the value of schema altogether, but when you bring up e-commerce schema types like Product schema, they acknowledge its value.

Google is still investing in schema. Just last year, Google rolled out support for Product Variants. If schema were pointless, why would Google continue expanding its structured data support?

There are also schema types that make you eligible for rich results in search, which can improve your CTR. For example:

  • VideoObject schema: helps you get rich video results.
  • Organization schema: provides information that may show up in a knowledge panel.
  • Review snippet: shows star ratings next to your product/service in search results.

There are more. Here's a guide to the structured data markup that Google Search supports.
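To make the list above concrete, here is a minimal sketch of what VideoObject markup can look like when generated as JSON-LD. The helper name `to_jsonld_script` and every value in the `video` dictionary are placeholders I made up for illustration; check Google's guide for the properties it actually requires before using anything like this.

```python
import json


def to_jsonld_script(data: dict) -> str:
    """Wrap a schema.org dictionary in a JSON-LD <script> tag."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical video: every value below is a placeholder.
video = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "How to add schema markup",
    "description": "A short walkthrough of adding JSON-LD to a page.",
    "thumbnailUrl": "https://example.com/thumb.jpg",
    "uploadDate": "2025-01-15",
    "contentUrl": "https://example.com/video.mp4",
}

snippet = to_jsonld_script(video)
print(snippet)
```

The printed snippet is what you would paste into the page's HTML. Being eligible for a rich result is not a guarantee that one will be shown.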

So dismissing schema altogether is not accurate.

Now back to our main topic, is schema useful for LLMs?

Microsoft Confirms Schema Helps Its LLMs (Copilot) Understand Your Content

I don't know why this is ignored/dismissed. We have confirmation from Fabrice Canel, Principal Product Manager at Microsoft Bing, that "schema markup helps Microsoft’s LLMs understand your content" (source).

So the answer for Microsoft Copilot is YES!

What about other LLMs?

We don't have a clear statement at the moment whether ChatGPT, Gemini or Perplexity use schema. So the short answer is we don't know for sure.

However, we have a few tests and observations. For example, I shared this before: Nadeem Haddadeen on LinkedIn. Nadeem noticed that OpenAI bots are crawling JSON data more than HTML.

Why do some SEOs say that LLMs don't use schema?

Let's walk through the opposing standpoints.

  • One reason I found SEOs giving for why LLMs like ChatGPT don't use schema is that they asked ChatGPT and it said no 😄 We all know that ChatGPT may hallucinate or provide inaccurate information, right?
  • Another is that LLMs use tokenization, which breaks the input down into tokens, where each token is converted into a unique numerical ID from the model's vocabulary. For example, the sentence "Hello, world!" might become the tokens:
    • "Hello" --> ID: 123
    • "," --> ID: 456
    • "world" --> ID: 789
    • "!" --> ID: 111

Sometimes this tokenization doesn't split the input into whole words; a token can be just part of a word. Based on that, some people argue that LLMs cannot tell the difference between schema code and other words in the input. But LLMs do generate schema. Doesn't that mean they understand schema the same way they understand a language?
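The tokenization argument can be sketched with a toy example. This is not how any real LLM tokenizer works: real ones use learned subword vocabularies, while `toy_tokenize` and `to_ids` below are hypothetical helpers that just split on words and punctuation and hand out IDs in order of first appearance. The only point is that prose and JSON-LD both end up as the same kind of ID sequence.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    """Split text into word-like chunks and single punctuation marks.

    Real LLM tokenizers use learned subword vocabularies; this regex is
    only a stand-in to illustrate the idea.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def to_ids(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Give each distinct token an integer ID, growing the vocabulary as needed."""
    ids = []
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
        ids.append(vocab[tok])
    return ids


vocab: dict[str, int] = {}
prose = "Hello, world!"
schema = '{"@type": "Restaurant", "name": "Hello"}'

prose_tokens = toy_tokenize(prose)
schema_tokens = toy_tokenize(schema)

print(prose_tokens, to_ids(prose_tokens, vocab))
print(schema_tokens, to_ids(schema_tokens, vocab))
```

In this toy setup "Hello" gets the same ID whether it appears in the sentence or inside the JSON-LD. The braces, quotes, and keys like "@type" are simply more tokens around it, so the structure is still present in the sequence the model reads. Whether a model gives that structure any extra weight is a separate question that this sketch does not answer.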

After some discussions, the conclusion I came to is that:

LLMs understand schema, but we don't have anything to support that it carries a different weight/importance. At best, schema is the equivalent of stating the same information in a clear, structured manner on the page itself. So is there value? I think yes.

Some have tested it

Andrea Volpini shared an interesting article that clearly states:

"Testing confirms the divide: sites with comprehensive structured data appear accurately in AI responses; those without risk being misunderstood or ignored entirely. "

Here are a few very interesting insights from the article:

  • Structured data visibility varies dramatically between different LLM tool types.
  • The LLM doesn’t access structured data or raw HTML directly; it receives a sanitized snippet from the retrieval layer and, if it “opens” a page, a synthesized representation rather than the full source.
  • In a test on TripAdvisor, an AI model was able to spot a Restaurant schema with details like ratings and reviews. This shows that AI can use structured data when it’s available.
  • For e-commerce, sites with strong structured data become the main source for AI answers. Sites that only use normal product pages risk being ignored.
  • When AI searches the web, it can see JSON-LD, microdata, and RDFa because search engines already index them. But if it just opens your page, it often can’t see JSON-LD—only microdata built into the HTML.
    • Tip: Implement dual structured data strategies. Maintain JSON-LD for search engine indexing while supplementing it with microdata and semantic HTML for direct agent access.
  • Experiments with GPT-OSS-120B and GPT-5 confirm a fundamental shift: AI models are moving from processing text to interpreting structured data.
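The dual strategy from the tip above can be sketched as follows. Everything here is an assumption for illustration: the restaurant data is made up, `render_page` is a hypothetical helper, and `VisibleText` is a deliberately naive stand-in for whatever sanitizing step a real AI tool applies when it opens a page (we don't know how those actually work). It keeps visible text and throws away script contents.

```python
import json
from html.parser import HTMLParser

# Hypothetical restaurant data: all values are placeholders.
restaurant = {"name": "Example Bistro", "ratingValue": "4.5", "reviewCount": "120"}


def render_page(data: dict) -> str:
    """Render the same facts twice: as JSON-LD and as inline microdata."""
    jsonld = {
        "@context": "https://schema.org",
        "@type": "Restaurant",
        "name": data["name"],
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": data["ratingValue"],
            "reviewCount": data["reviewCount"],
        },
    }
    return (
        f'<script type="application/ld+json">{json.dumps(jsonld)}</script>\n'
        '<div itemscope itemtype="https://schema.org/Restaurant">\n'
        f'  <h1 itemprop="name">{data["name"]}</h1>\n'
        '  <div itemprop="aggregateRating" itemscope '
        'itemtype="https://schema.org/AggregateRating">\n'
        f'    Rated <span itemprop="ratingValue">{data["ratingValue"]}</span> from\n'
        f'    <span itemprop="reviewCount">{data["reviewCount"]}</span> reviews\n'
        '  </div>\n'
        '</div>'
    )


class VisibleText(HTMLParser):
    """Naive sanitizer: keeps visible text, drops <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())


page = render_page(restaurant)
parser = VisibleText()
parser.feed(page)
visible = " ".join(parser.parts)
print(visible)
```

With a sanitizer like this one, the JSON-LD block disappears completely, while the name, rating, and review count survive because they are also written into the visible HTML that the microdata annotates. Note that this naive version drops the itemprop attributes too, so what survives is the plainly stated information, not the markup itself.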

And That’s a Wrap (Almost 😄)

In my opinion, schema is useful for search and for LLMs.

But looking at the bigger picture, this is not about schema only.

We need to slow down before dismissing ideas in a field that's changing so fast. Right now, none of us is totally caught up.

Thanks for reading and see you next newsletter!


Like what you read and want to support me?

  • Sign up for my newsletter if you're not already.
  • Share the newsletter and invite your friends to sign up. Help me reach 2k signups by end of 2025 please 🙂
  • Provide feedback on how I can make this newsletter better!!!
  • Buy me coffee.
  • If you're an SEO tool or an SEO service provider, consider sponsoring my newsletter. I'm also open to other partnership ideas as well.

Disclaimer: LLMs were used to assist in wording and phrasing this blog.
