Google tracks how many domains use each Schema.org vocabulary term and publishes the data monthly. I analyzed the May 2026 snapshot — 5,545 terms across 6 adoption tiers — to understand what structured data the web actually implements versus what schema.org defines.
Itemtypes are types such as schema:Person or schema:Product. Predicates are properties such as schema:name or schema:author.
Data source: Schema.org GitHub, data/public_stats/google · Snapshot: May 2026
Key Findings
77% of all schema.org terms are used by fewer than 1,000 domains. The vocabulary is vast, but real-world adoption is concentrated in a small fraction of what's defined. Most of schema.org exists in theory only.
82% of all schema.org properties (Predicates) are in the <1K bucket, compared to 50% of types (Itemtypes). Properties are more numerous and far less adopted — most have almost no real-world usage.
Itemtypes have a more balanced adoption curve. While still heavily skewed toward rare terms, 49% of types are used by 1K or more domains, versus only 18% of predicates. Measured as a share of their class, types lead predicates at every tier above the floor.
Only 43 terms out of 5,545 reach the 10M+ tier. This is the structured data that Google sees consistently across the web. Of the 43, 12 are Itemtypes and 31 are Predicates.
The 12 most common Itemtypes are overwhelmingly infrastructure-oriented types — including WebSite, WebPage, BreadcrumbList, Organization, Person, and ImageObject. Their prevalence likely reflects the fact that many of these Schema types are automatically generated by popular CMS platforms, themes, and SEO plugins.
The 31 mainstream Predicates are equally predictable — name, url, description, image, author, datePublished. These are properties every CMS and site template already emits. Adoption is high because the barrier is near zero.
Adoption Distribution
| Domain Bucket | Count | % of Total | Itemtypes | Predicates |
|---|---|---|---|---|
| < 1K | 4,264 | 76.9% | 485 | 3,779 |
| 1K – 10K | 560 | 10.1% | 236 | 324 |
| 10K – 100K | 420 | 7.6% | 151 | 269 |
| 100K – 1M | 158 | 2.8% | 39 | 119 |
| 1M – 10M | 100 | 1.8% | 35 | 65 |
| 10M+ | 43 | 0.8% | 12 | 31 |
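The percentage column can be recomputed from the per-class counts in the table. The snippet below is a minimal sketch that does so; the names `BUCKETS` and `bucket_shares` are illustrative, and the only inputs are the Itemtype and Predicate counts copied from the table. Any small difference from a published figure would come down to rounding at one decimal place.

```python
# (itemtypes, predicates) per domain bucket, copied from the table above.
BUCKETS = {
    "< 1K": (485, 3779),
    "1K-10K": (236, 324),
    "10K-100K": (151, 269),
    "100K-1M": (39, 119),
    "1M-10M": (35, 65),
    "10M+": (12, 31),
}


def bucket_shares(buckets):
    """Return each bucket's share of all terms, in percent, to one decimal."""
    total = sum(types + preds for types, preds in buckets.values())
    return {
        label: round(100 * (types + preds) / total, 1)
        for label, (types, preds) in buckets.items()
    }


total_terms = sum(types + preds for types, preds in BUCKETS.values())
shares = bucket_shares(BUCKETS)

print(f"total terms: {total_terms}")
for label, pct in shares.items():
    print(f"{label:>9}: {pct}%")
```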
Interpretation
The distribution is steeply skewed, in a way that resembles a power law: the overwhelming majority of terms sit in the lowest tier, with a long, thin tail stretching up to the highest. Most of the Schema.org vocabulary remains theoretical in the sense that it exists within the standard but sees little real-world adoption. Moving from one tier to the next represents an order-of-magnitude increase in domain reach, and each successive tier contains fewer terms, with the sharpest drop between the first two tiers (4,264 terms to 560).
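One way to read the skew is cumulatively: how many terms reach at least a given tier. The following is a rough sketch using only the Count column from the Adoption Distribution table; `TIERS` and `terms_at_or_above` are names chosen here for illustration.

```python
from itertools import accumulate

# (tier label, number of terms), lowest tier first, from the Count column.
TIERS = [
    ("< 1K", 4264),
    ("1K-10K", 560),
    ("10K-100K", 420),
    ("100K-1M", 158),
    ("1M-10M", 100),
    ("10M+", 43),
]


def terms_at_or_above(tiers):
    """Return (label, number of terms in this tier or any higher tier)."""
    counts = [count for _, count in tiers]
    # Accumulate from the top tier downward, then flip back to table order.
    running = list(accumulate(reversed(counts)))[::-1]
    return [(label, n) for (label, _), n in zip(tiers, running)]


cumulative = terms_at_or_above(TIERS)
total = cumulative[0][1]  # the lowest tier's cumulative count covers every term

for label, n in cumulative:
    print(f"{label:>9} or higher: {n:>5} terms ({100 * n / total:.1f}%)")
```

Read this way, each row answers "how many terms are used by at least this many domains", which is often the more useful question when deciding what counts as common.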
Class Comparison
958 Itemtypes (types like schema:Product) vs. 4,587 Predicates (properties like schema:author). The adoption patterns are notably different.
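The per-class comparison can be reproduced from the Itemtypes and Predicates columns of the Adoption Distribution table. This is a small sketch, not the original analysis code; the variable and function names are assumptions made for the example.

```python
LABELS = ["< 1K", "1K-10K", "10K-100K", "100K-1M", "1M-10M", "10M+"]
ITEMTYPES = [485, 236, 151, 39, 35, 12]      # Itemtypes column
PREDICATES = [3779, 324, 269, 119, 65, 31]   # Predicates column


def tier_shares(counts):
    """Share of a class that falls in each tier, in percent, to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


def share_at_or_above_1k(counts):
    """Share of a class used by 1K or more domains (every tier but the first)."""
    return round(100 * sum(counts[1:]) / sum(counts), 1)


type_shares = tier_shares(ITEMTYPES)
pred_shares = tier_shares(PREDICATES)
size_ratio = round(sum(PREDICATES) / sum(ITEMTYPES), 1)

print(f"{'tier':>9}  {'types %':>8}  {'preds %':>8}")
for label, t, p in zip(LABELS, type_shares, pred_shares):
    print(f"{label:>9}  {t:>8}  {p:>8}")
print("1K+ share, types:     ", share_at_or_above_1k(ITEMTYPES))
print("1K+ share, predicates:", share_at_or_above_1k(PREDICATES))
print("predicates per type:  ", size_ratio)
```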
Interpretation
Itemtypes have significantly stronger mid-tier adoption. 24.6% of types land in the 1K–10K bucket vs. only 7.1% of predicates.
Predicates are 4.8× more numerous than types yet overwhelmingly rare. Schema.org defines properties for nearly every conceivable attribute, but most never get implemented. The vocabulary outpaced real-world need.
At the 10M+ tier, predicates (31) outnumber types (12) by roughly 2.6:1. This makes sense structurally: a single type declaration like WebSite brings multiple property declarations (name, url, description, publisher) along with it.
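To make the type-versus-property asymmetry concrete, here is a hypothetical WebSite snippet of the kind a CMS or SEO plugin might emit, written as a Python dict in JSON-LD shape. The site name, URL, and the `collect_terms` helper are invented for illustration; the sketch simply walks the markup and separates `@type` values from property names.

```python
import json

# Hypothetical markup; values are placeholders, not taken from any real site.
website = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "name": "Example Site",
    "url": "https://example.com/",
    "description": "An example site.",
    "publisher": {"@type": "Organization", "name": "Example Org"},
}


def collect_terms(node, types=None, predicates=None):
    """Walk JSON-LD-shaped data and collect distinct types and property names."""
    types = set() if types is None else types
    predicates = set() if predicates is None else predicates
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@type":
                # @type may be a single string or a list of strings.
                types.update([value] if isinstance(value, str) else value)
            elif not key.startswith("@"):
                predicates.add(key)
                collect_terms(value, types, predicates)
    elif isinstance(node, list):
        for item in node:
            collect_terms(item, types, predicates)
    return types, predicates


types, predicates = collect_terms(website)
print(json.dumps(website, indent=2))
print("types:     ", sorted(types))
print("predicates:", sorted(predicates))
```

In this toy case, two type declarations carry four distinct properties, the same direction of imbalance seen at the 10M+ tier.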
Itemtype Adoption
Focusing on Itemtypes only (not properties). Within each tier, exact ranking is unknown — Google publishes bucket ranges, not precise counts. Types are listed alphabetically within their tier.
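Because only bucket ranges are published, the most that can be done within a tier is an alphabetical listing. The sketch below shows that grouping for the handful of types this article names explicitly; it is not the full dataset, and `NAMED_TYPES` and `group_by_tier` are illustrative names.

```python
from collections import defaultdict

# Tiers in display order, highest reach first.
TIER_ORDER = ["10M+", "1M-10M", "100K-1M"]

# Only the types named in this article, paired with the tier it reports.
NAMED_TYPES = [
    ("WebSite", "10M+"),
    ("WebPage", "10M+"),
    ("BreadcrumbList", "10M+"),
    ("ListItem", "10M+"),
    ("Organization", "10M+"),
    ("Person", "10M+"),
    ("ImageObject", "10M+"),
    ("Product", "1M-10M"),
    ("Review", "1M-10M"),
    ("FAQPage", "1M-10M"),
    ("VideoObject", "1M-10M"),
    ("BlogPosting", "1M-10M"),
    ("Event", "100K-1M"),
]


def group_by_tier(pairs):
    """Group type names by tier; sort alphabetically since exact rank is unknown."""
    tiers = defaultdict(list)
    for name, tier in pairs:
        tiers[tier].append(name)
    return {tier: sorted(tiers[tier]) for tier in TIER_ORDER if tier in tiers}


grouped = group_by_tier(NAMED_TYPES)
for tier, names in grouped.items():
    print(f"{tier:>8}: {', '.join(names)}")
```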
Interpretation
The 12 types in the 10M+ tier are mostly structural, not content types. WebSite, WebPage, BreadcrumbList, and ListItem describe page architecture. The only entity types are Organization and Person. No Product, Article, or Review made it to this tier.
The types Google recommends for rich results — Product, Review, FAQPage, VideoObject, BlogPosting — are all in the 1M–10M tier, not 10M+.
Event sits in the 100K–1M bucket, despite being a prominent Google rich result type. Structured data adoption for events is notably lower than for products or articles — likely because event sites are a smaller slice of the web.
Mainstream Adoption
These are the schema.org terms that appear on more than 10 million domains. If you're implementing structured data for SEO, this is the list that counts.
Interpretation
The 12 mainstream Itemtypes are mostly structural, not semantic. WebSite, WebPage, BreadcrumbList, and ListItem describe the architecture of a page, not its topic. Organization and Person are the only entity types that made the cut.
The 31 mainstream Predicates look almost entirely auto-generated by CMSs and templates. Properties such as name, url, description, image, and datePublished most likely appear because WordPress, Yoast, and site builders emit them automatically, not because developers chose them deliberately.
Product, Event, Recipe, FAQPage, HowTo, Review — none reached 10M+. These high-intent rich result types that Google heavily promotes are still used by fewer than 10M domains, suggesting uptake of deliberate structured data is lower than commonly assumed.
The Long Tail
Interpretation
The long tail is not a failure — it is the nature of a general-purpose vocabulary. Schema.org is designed to describe every conceivable type of thing: medical conditions, academic courses, sports events, music recordings, legislative processes. Most of the vocabulary will always be specialized by design.
But it does have a practical implication: any term below the 10K-domain threshold is one that few sites implement, so using it where it genuinely fits your content can set you apart from competitors.
Schema.org, in practice, is shaped far more by implementation defaults than by the full scope of its specification. While the vocabulary contains thousands of terms, real-world usage is heavily concentrated in a small subset.
The data shows a clear long-tail distribution: 77% of all terms are used by fewer than 1,000 domains, and only 0.8% reach the 10M+ tier. This reinforces that Schema.org is not uniformly adopted, but instead follows a steep power law where a small core carries the majority of real-world usage.
At the top of this distribution, adoption is dominated by infrastructure-level constructs — WebSite, WebPage, Organization, Person, BreadcrumbList — along with a small set of universally emitted properties such as name, url, and description. Their prevalence is strongly influenced by CMS defaults, themes, and SEO tooling, which lower the barrier to near-zero implementation effort.
Because most Schema.org adoption is passive — driven by tooling rather than deliberate implementation — there is a real opportunity for those who choose to go further. Intentional, well-structured schema markup remains rare enough that implementing it thoughtfully is still a meaningful competitive differentiator.
Methodology