Schema markup, defined
Schema markup is a small block of structured data you add to your web pages using a shared vocabulary called Schema.org. Its purpose is to translate the meaning of your content into a format machines can read with zero ambiguity. A human reading a page sees “$49.99” and instantly knows it's a price. A crawler sees a string of characters and has to guess. Schema markup removes the guessing.
Schema.org is jointly maintained by Google, Microsoft, Yahoo, and Yandex, with input from the broader web standards community. It defines hundreds of types (Person, Article, Product, Recipe, Event, Movie, Course, and many more), each with a specific set of properties. When you mark up a page as a Product, you can attach properties like name and aggregateRating directly, plus price and availability through a nested Offer in the offers property, and every machine that reads your page interprets them the same way.
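As a rough sketch of the Product case, the block below marks up a hypothetical item at the $49.99 price used in the example above. The product name, currency, and rating figures are invented for illustration. Note that in the Schema.org vocabulary, price and availability sit on a nested Offer (under offers) rather than on the Product itself.

```html
<!-- Illustrative only: product name, currency, and rating values are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": {
    "@type": "Offer",
    "price": "49.99",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  }
}
</script>
```

With priceCurrency stated alongside price, a parser no longer has to infer what "$49.99" means from the surrounding text.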
The vocabulary is the standard. The format you embed it in is up to you. The three options — JSON-LD, Microdata, and RDFa — produce equivalent results, but they have very different practical trade-offs. The next section unpacks why JSON-LD has won.
JSON-LD vs Microdata vs RDFa
All three formats encode the same Schema.org vocabulary, but they live in very different parts of your HTML. Pick one and stick with it — mixing formats on the same page produces unpredictable results.
JSON-LD (recommended)
JSON-LD lives in a single <script type="application/ld+json"> block, usually placed in the page <head>. Because it's decoupled from the visible HTML, you can add, remove, or update structured data without touching your templates. CMS plugins can inject it. Static-site generators can build it. AI-search systems prefer it because the data is unambiguous and self-contained. Google explicitly recommends it.
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What is Schema Markup?",
  "author": { "@type": "Person", "name": "Jane Smith" },
  "datePublished": "2026-05-03"
}
</script>
Microdata (legacy)
Microdata weaves Schema.org attributes directly into your visible HTML using itemscope, itemtype, and itemprop. It worked well in 2014, but today it's fragile: a designer cleaning up the DOM can quietly delete an entire schema; a templating engine can break the structure with a stray wrapper. Use it only if you're maintaining a legacy site that already relies on it.
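For comparison, here is roughly how the same Article from the JSON-LD example might look in Microdata. The surrounding HTML elements (article, h1, p, time) are an assumed page layout, chosen only for illustration.

```html
<!-- Illustrative only: the element structure is an assumed layout -->
<article itemscope itemtype="https://schema.org/Article">
  <h1 itemprop="headline">What is Schema Markup?</h1>
  <p>By
    <!-- Nested itemscope creates the Person entity for the author -->
    <span itemprop="author" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Jane Smith</span>
    </span>
  </p>
  <time itemprop="datePublished" datetime="2026-05-03">May 3, 2026</time>
</article>
```

Because the attributes ride on the visible elements, removing or rewrapping the inner span would silently drop the author entity, which is the fragility described above.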
RDFa (specialty)
RDFa is conceptually similar to Microdata but uses a different attribute set (vocab, typeof, property) inherited from the broader Linked Data world. It's mostly used inside knowledge-graph and academic publishing systems. For a normal commercial site in 2026, JSON-LD is the correct default — there is no situation where RDFa or Microdata will outperform a clean JSON-LD implementation.
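A comparable sketch in RDFa, again for the Article from the JSON-LD example. The element structure is an assumed layout; only the vocab, typeof, and property attributes carry the structured data.

```html
<!-- Illustrative only: the element structure is an assumed layout -->
<article vocab="https://schema.org/" typeof="Article">
  <h1 property="headline">What is Schema Markup?</h1>
  <p>By
    <!-- typeof on the nested element creates the Person entity -->
    <span property="author" typeof="Person">
      <span property="name">Jane Smith</span>
    </span>
  </p>
  <time property="datePublished" datetime="2026-05-03">May 3, 2026</time>
</article>
```

Like Microdata, this couples the data to the page template, which is the main practical argument for JSON-LD on a typical site.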
Why schema markup matters for AI search
For most of its history (Schema.org launched in 2011), schema markup was a Google-rich-results play. You added FAQ schema to get an FAQ accordion in the SERP. You added Recipe schema to qualify for the recipe carousel. The benefit was visible, narrow, and traffic-driven.
Then search products built on large language models arrived. ChatGPT's search, Perplexity, Google AI Overviews, Gemini, and Bing's Copilot all read the same structured data, but they use it for a fundamentally different purpose. Instead of rendering rich snippets, they extract entities, facts, and relationships and weave them into generated answers. Pages with clean schema become the cited sources behind those answers. Pages without it get summarized, paraphrased, or skipped entirely.
A 2025 industry study found that pages with complete Article and Organization schema were cited by ChatGPT and Perplexity at roughly 4–6× the rate of equivalent pages without schema. The pattern isn't magic; it's mechanical. When the model needs to cite a source for “Who founded Acme Corp?”, it reaches for the page that explicitly declares an Organization with a named founder, not the page that mentions the founder in the third paragraph.
For AI search, the schema fields that matter most are the ones that ground your content in the AI's knowledge graph:
- sameAs: links to your Wikipedia page, Wikidata entry, LinkedIn, GitHub, and social profiles. This is how AI systems verify you're the entity you claim to be.
- author as Person: not just a name string. A nested Person object with a sameAs profile and a description gives the model an entity to attribute the content to.
- datePublished and dateModified: AI engines weight freshness heavily. Stale dates are treated as a negative signal.
- description: at 150+ words rather than the typical 30-word meta description. AI systems use this as a content summary when deciding what to extract.
- mainEntity: explicitly declares the primary entity the page is about. Removes any ambiguity for the AI parser.
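The fields above might be combined along these lines. This is a sketch under assumptions: the profile URLs, the dateModified value, and the author bio are placeholders, and the description is abbreviated here rather than written out at the 150+ words suggested above.

```html
<!-- Illustrative only: profile URLs, dateModified, and bio text are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What is Schema Markup?",
  "description": "Placeholder summary. In practice this would be a full content summary of the article rather than a one-line meta description.",
  "datePublished": "2026-05-03",
  "dateModified": "2026-05-10",
  "author": {
    "@type": "Person",
    "name": "Jane Smith",
    "description": "Placeholder author bio.",
    "sameAs": [
      "https://www.linkedin.com/in/example-jane-smith",
      "https://github.com/example-jane-smith"
    ]
  },
  "mainEntity": {
    "@type": "Thing",
    "name": "Schema.org",
    "sameAs": "https://en.wikipedia.org/wiki/Schema.org"
  }
}
</script>
```

Here sameAs appears twice: once to tie the author to external profiles, and once to tie the page's main entity to a well-known reference page.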
For a deeper breakdown, see our companion guide on schema markup for AI search, which covers the GEO (Generative Engine Optimization) framework end to end.
How to add schema markup to your website
Implementation is the part most guides over-complicate. The actual workflow has four steps:
1. Pick the right schema type
Match the schema type to the page's primary purpose. A blog post is an Article. A product page is a Product. A help page with Q&A is a FAQPage. You can combine multiple types on a single page (Article + FAQPage + BreadcrumbList is a common, valuable combo), as long as the primary type is the right one.
2. Generate the JSON-LD
Use a generator that outputs valid Schema.org JSON-LD. Our ten free generators cover the most common types and produce AI-optimized output by default, meaning every recommended field is included, not just the bare minimum Google requires.
3. Embed it on the page
Paste the <script type="application/ld+json"> block into your page's <head>. WordPress users can use a plugin like Rank Math, WPCode, or Yoast. Shopify users paste into the relevant theme template. Webflow, Wix, Squarespace, and most page builders have a custom-code field.
4. Validate before shipping
Run the page through our schema validator (catches AI-readiness issues) and Google's Rich Results Test (confirms rich-result eligibility). Both checks take 30 seconds and catch ~95% of real-world mistakes.
That's it. The only step that takes meaningful effort is the first one — picking the right type — and even that becomes second nature after you've marked up a few pages. Everything downstream is mechanical.
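The Article + FAQPage + BreadcrumbList combination mentioned in the first step can be sketched as a single block using JSON-LD's @graph keyword, which holds several top-level entities under one @context. The example.com URLs and breadcrumb names are placeholders, and the FAQ answer reuses wording from this page, since FAQ markup should mirror visible content.

```html
<!-- Illustrative only: URLs and breadcrumb names are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "headline": "What is Schema Markup?",
      "author": { "@type": "Person", "name": "Jane Smith" },
      "datePublished": "2026-05-03"
    },
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is schema markup in simple terms?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Schema markup is a small block of structured data you add to your web pages using a shared vocabulary called Schema.org."
          }
        }
      ]
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
        { "@type": "ListItem", "position": 2, "name": "Guides", "item": "https://example.com/guides/" },
        { "@type": "ListItem", "position": 3, "name": "What is Schema Markup?" }
      ]
    }
  ]
}
</script>
```

Emitting each type in its own script block is an equally valid layout; a single @graph is simply one way to keep everything in one place.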
The 10 schema types most websites need
Schema.org defines hundreds of types, but a normal commercial website only ever uses a handful. Here are the ten that cover roughly 95% of real-world schema needs, each linked to a free generator that outputs AI-optimized JSON-LD.
FAQ Schema
Mark up question-and-answer content. The single highest-leverage schema for AI citation.
Article Schema
Identify long-form content — blog posts, news, tutorials — with author and publish dates.
Product Schema
Pricing, ratings, availability, and SKUs for ecommerce listings and rich product cards.
Local Business
Address, hours, and phone for brick-and-mortar stores, restaurants, and service businesses.
HowTo Schema
Step-by-step instructional content. Critical for voice search and AI step extraction.
Organization
Site-wide brand entity with sameAs links to social profiles and Wikipedia/Wikidata.
Review Schema
Editorial reviews and aggregated ratings for products, services, and local businesses.
Event Schema
Conferences, concerts, webinars, and classes with dates, venue, and ticketing info.
Breadcrumb
Signal site hierarchy on deep pages. Helps both users and AI map your information architecture.
Video Schema
Mark up videos with thumbnail, duration, upload date, and transcript URL for video AI surfaces.
Common schema mistakes (and how to avoid them)
After validating thousands of schema blocks through our tools, the same handful of mistakes account for the vast majority of failures. None are hard to fix once you know to look for them.
- Schema that doesn't match visible content. Adding FAQ schema for questions that don't actually appear on the page violates Google's guidelines and can trigger a manual action. Schema must mirror what a human reader sees.
- Author as a string, not a Person. "author": "Jane Smith" is technically valid but tells the AI nothing. Use a nested Person object with name, url, and sameAs at minimum.
- Missing or stale dateModified. AI engines treat freshness as a major weighting signal. A page with datePublished: "2019-03-12" and no dateModified reads as abandoned, even if the content is current.
- Relative URLs in image and url fields. Use absolute URLs (https://example.com/img.png), never relative paths. Crawlers and AI parsers don't always resolve relative URLs the way browsers do.
- Duplicate schema on the same page. If your homepage and your /about page both declare the same Organization schema, that's fine. But emitting two identical Article schemas on the same page (one from your CMS, one from a plugin) creates conflict. Audit your raw HTML before going live.
- Skipping the validator. JSON syntax errors silently break the entire schema block. A missing comma or unescaped quote means the schema isn't parsed at all. Always validate before publishing.
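As a sketch, a block that sidesteps the author, date, and URL mistakes listed above could look like this. The page URL, author profile links, and the dateModified value are hypothetical; the datePublished and image values reuse the examples from the list.

```html
<!-- Illustrative only: page URL, profile links, and dateModified are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What is Schema Markup?",
  "url": "https://example.com/blog/what-is-schema-markup",
  "image": "https://example.com/img.png",
  "datePublished": "2019-03-12",
  "dateModified": "2026-05-03",
  "author": {
    "@type": "Person",
    "name": "Jane Smith",
    "url": "https://example.com/authors/jane-smith",
    "sameAs": ["https://www.linkedin.com/in/example-jane-smith"]
  }
}
</script>
```

Every URL is absolute, the author is an entity rather than a string, and the original publish date is kept alongside an explicit modification date.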
Frequently asked questions
What is schema markup in simple terms?
Is schema markup the same as SEO?
Does schema markup help with Google rankings?
What's the difference between JSON-LD, Microdata, and RDFa?
Where do I put the schema markup code on my page?
How do I know if my schema markup is working?
Do I need schema markup if I have great content?
How long does schema markup take to start working?
Ready to add schema to your site?
Pick a schema type, fill a quick form, copy the JSON-LD. No sign-up, no paywall, no usage limits — every generator outputs AI-optimized markup by default.