Schema and structured data
Last updated 17 June 2026 5 min
Structured data is information added to a webpage in a standardised, machine-readable format so search engines, LLMs, and other AI systems can understand what the page is about — not just the words on it, but what those words mean. "Schema" is the vocabulary used to describe that meaning.
Schema vs. structured data
The two terms are often used interchangeably, but they're distinct:
- Schema.org is a shared vocabulary: a dictionary of types (Product, Article, LocalBusiness, FAQPage) and properties (price, author, openingHours) developed jointly by Google, Microsoft, Yahoo, and Yandex.
- Structured data is the actual code on the page that uses that vocabulary to describe specific entities.
Schema is the language. Structured data is what you write in it.
Why it matters
Search engines parse HTML well enough to index a page, but they can't always tell whether a string of digits is a phone number or a product code, or whether a block of text is a product review rather than a description. Structured data removes the guesswork by explicitly labelling each piece of content.
The practical benefits of structured data:
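- Rich results. Eligible markup can unlock enhanced SERP features such as star ratings, FAQ accordions, recipe cards, sitelinks, product info, and event listings. These take up more space on the results page and can lift click-through rate (CTR). Google decides which features to show and has narrowed or retired some rich result types over time, so valid markup makes a page eligible rather than guaranteeing a feature.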
- Eligibility for verticals. Surfaces such as Google Shopping, Top Stories, and recipe carousels rely on specific schema types to understand and display content.
- AI and LLM citations. Generative search (Google AI Overviews, ChatGPT, Perplexity, Gemini) can use structured data to identify entities, attributes, and relationships when summarising or citing sources. Accurate schema makes information easier to extract and reduces the risk of misinterpretation.
Structured data is not a direct ranking factor in the traditional sense, but it materially affects how a page is presented and whether it qualifies for high-visibility surfaces.
Formats
Three formats are supported, but only one is recommended:
- JSON-LD (JavaScript Object Notation for Linked Data). A script block placed in the <head> or <body>. Doesn't touch visible HTML, easy to maintain, and Google's preferred format. Use this by default.
- Microdata. Inline attributes on existing HTML elements (itemscope, itemtype, itemprop). Tightly coupled to markup, harder to maintain.
- RDFa. Similar to Microdata, with broader use outside SEO.
Unless there's a specific reason otherwise, use JSON-LD.
A minimal example for an article:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema and Structured Data",
  "author": {
    "@type": "Person",
    "name": "John Doe"
  },
  "datePublished": "2026-05-07",
  "publisher": {
    "@type": "Organization",
    "name": "Company Name"
  }
}
</script>
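For comparison, here is a rough sketch of how the same article might be expressed in Microdata. The surrounding HTML structure (the article, h1, p, time, and div elements) is an assumption made for illustration; only the itemscope, itemtype, and itemprop attributes carry the structured data.

```html
<!-- Hypothetical page markup: the element layout is illustrative only -->
<article itemscope itemtype="https://schema.org/Article">
  <h1 itemprop="headline">Schema and Structured Data</h1>

  <!-- A nested itemscope describes the author as its own Person entity -->
  <p>
    By
    <span itemprop="author" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">John Doe</span>
    </span>
  </p>

  <!-- The datetime attribute supplies the machine-readable date -->
  <time itemprop="datePublished" datetime="2026-05-07">7 May 2026</time>

  <div itemprop="publisher" itemscope itemtype="https://schema.org/Organization">
    Published by <span itemprop="name">Company Name</span>
  </div>
</article>
```

Because the attributes are woven into the visible markup, any template change risks breaking the structured data, which is the main maintenance argument for JSON-LD.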
Common schema types
The right type depends on the page's purpose:
- Organization / LocalBusiness — homepage/contact page. Establishes the entity and feeds the knowledge panel.
- Product — product pages. Triggers price, availability, and review snippets.
- Article / NewsArticle / BlogPosting — editorial content.
- FAQPage — pages with genuine question-and-answer content.
- HowTo — step-by-step guides.
- Review / AggregateRating — testimonials and ratings, when tied to a specific product or service.
- BreadcrumbList — site hierarchy.
- Event, Recipe, VideoObject — content-specific types with their own rich result formats.
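As an illustration of one of the types listed above, a Product page might carry markup along these lines. Every value here (product name, SKU, URLs, price, rating figures) is a made-up placeholder, and the set of properties is a plausible minimum rather than a definitive list; check the current documentation for what each rich result requires.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Running Shoe",
  "image": "https://www.example.com/images/running-shoe.jpg",
  "description": "Lightweight running shoe with a cushioned sole.",
  "sku": "EX-12345",
  "brand": {
    "@type": "Brand",
    "name": "Example Brand"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://www.example.com/running-shoe",
    "price": "89.99",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.4",
    "reviewCount": "89"
  }
}
</script>
```

The price, availability, and rating values should mirror what a visitor sees on the page itself.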
A single page can carry multiple schema types where genuinely relevant — for example, an Article with a BreadcrumbList and an embedded VideoObject.
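One way to combine several types on a page is a single JSON-LD block with an @graph array, sketched below for the Article, BreadcrumbList, and VideoObject case. The URLs, @id values, and video details are hypothetical, and separate script blocks per type would work as well.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "@id": "https://www.example.com/guides/structured-data#article",
      "headline": "Schema and Structured Data",
      "datePublished": "2026-05-07",
      "author": { "@type": "Person", "name": "John Doe" },
      "video": { "@id": "https://www.example.com/guides/structured-data#video" }
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        {
          "@type": "ListItem",
          "position": 1,
          "name": "Guides",
          "item": "https://www.example.com/guides"
        },
        {
          "@type": "ListItem",
          "position": 2,
          "name": "Schema and Structured Data"
        }
      ]
    },
    {
      "@type": "VideoObject",
      "@id": "https://www.example.com/guides/structured-data#video",
      "name": "Structured data in five minutes",
      "description": "A short walkthrough of adding JSON-LD to a page.",
      "thumbnailUrl": "https://www.example.com/images/video-thumb.jpg",
      "uploadDate": "2026-05-07",
      "contentUrl": "https://www.example.com/media/structured-data.mp4"
    }
  ]
}
</script>
```

The @id references let the Article point at the VideoObject instead of repeating its details, which keeps the entities connected without duplication.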
Implementation principles
A few rules separate useful structured data from the kind that triggers manual actions:
- Mark up only what's visible on the page. Schema describing content the user can't see is a violation of Google's structured data guidelines and risks a manual penalty.
- Be accurate. Prices, ratings, availability, and dates must match what the page shows. Stale or fabricated values can cause more harm than missing markup altogether.
- Use the most specific type available. Dentist is better than LocalBusiness; TechArticle is better than Article where it fits.
- Connect entities. Use sameAs to link Organization markup to LinkedIn, Facebook, and other official social profiles. This strengthens entity recognition for both search and LLMs.
- Keep it valid. Required properties must be present; recommended properties should be included where possible.
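The "most specific type" and "connect entities" principles can be sketched together as follows. The business name, address, phone number, and profile URLs are all invented for illustration, and the chosen properties are an assumption about what a typical local business page would display.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dentist",
  "name": "Example Dental Practice",
  "url": "https://www.example.com",
  "telephone": "+1-555-0100",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "Springfield",
    "addressCountry": "US"
  },
  "openingHours": "Mo-Fr 09:00-17:00",
  "sameAs": [
    "https://www.linkedin.com/company/example-dental-practice",
    "https://www.facebook.com/exampledentalpractice"
  ]
}
</script>
```

Dentist is a more specific descendant of LocalBusiness, so it inherits properties such as openingHours while telling parsers more about the entity. Each sameAs URL should point to a profile the business actually controls.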
Testing and validation
Three tools cover almost everything:
- Google Rich Results Test (search.google.com/test/rich-results): confirms whether a page is eligible for specific rich result types and shows previews.
- Schema Markup Validator (validator.schema.org): validates against the full Schema.org vocabulary, including types Google doesn't render rich results for.
- Google Search Console reports on structured data at scale once the markup is live: enhancement reports flag errors, warnings, and valid items by type. This is where ongoing monitoring lives.
Common mistakes
- Marking up content that isn't on the page.
- Copying competitors' schema without adjusting values.
- Leaving placeholder values ("price": "0.00", "ratingValue": "5" with one fake review).
- Using FAQPage for content that isn't genuinely Q&A.
- Forgetting to update schema when page content changes.
In summary
Structured data won't make a poor page rank, but it makes a good page legible — to crawlers, to rich result systems, and increasingly to the AI models now sitting between users and the open web. Done well, it can pay off in visibility, click-through, and citation share.