Metadata and metadata directives

Last updated 17 June 2026 5 min

Metadata is information about a page rather than the page's visible content. Directives are specific instructions to crawlers and browsers about how to handle that page. Both live in the <head> of an HTML document (or in HTTP response headers from the web server) and shape how a page is indexed, ranked, and displayed in search results.

Note: Not all metadata tags function as directives. Most are purely informational, but the few that do act as directives are covered here as well.

Core SEO metadata

Title tag

<title>Page Title — Brand Name</title>

The title is a primary on-page metadata element, used as the clickable headline in search results and as the label on browser tabs. Google may rewrite titles in search results based on context, but a well-written title is still a strong signal.

Best practice: 50–60 characters, primary keyword near the start, brand at the end, unique per page. Avoid keyword "stuffing" in titles; it brings no benefit. Excessively long titles are truncated in search results, and words beyond the visible portion carry progressively less weight.

Meta description

<meta name="description" content="A concise summary of the page content.">

The description is displayed as the snippet beneath the title in search results. It is not a direct ranking factor, but it heavily influences click-through rate. Google uses the meta description as a starting point but frequently rewrites it based on the specific user query.

Best practice: 140–160 characters, stating a clear purpose or value proposition.
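The length guidance for titles and descriptions can be checked mechanically. The following is a minimal sketch in Python using only the standard library. The names HeadTextParser and check_lengths are hypothetical, and the default ranges simply mirror the best-practice figures given here rather than any limit published by a search engine.

```python
from html.parser import HTMLParser


class HeadTextParser(HTMLParser):
    """Collects the <title> text and the meta description from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def check_lengths(html, title_range=(50, 60), description_range=(140, 160)):
    """Return a list of warnings; an empty list means both lengths are in range."""
    parser = HeadTextParser()
    parser.feed(html)
    parser.close()
    warnings = []
    title = parser.title.strip()
    if not title_range[0] <= len(title) <= title_range[1]:
        warnings.append(f"title is {len(title)} characters")
    description = (parser.description or "").strip()
    if not description_range[0] <= len(description) <= description_range[1]:
        warnings.append(f"description is {len(description)} characters")
    return warnings
```

Character counts are only a proxy: search results truncate by rendered width, so a title that passes this check may still be cut off.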

Canonical tag

<link rel="canonical" href="https://example.com/preferred-url">

Tells search engines which URL is the master version when multiple URLs serve similar content. Covered in detail in Canonical Management Explained.
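As a rough illustration of how a canonical tag can be inspected, the sketch below extracts the canonical URL from a page and reports whether it points back to the page itself. It assumes Python's standard library only. CanonicalParser and canonical_status are hypothetical names, and the comparison is a plain string match after resolving relative URLs, which is stricter than how a search engine would normalise URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def canonical_status(page_url, html):
    """Classify the page as 'missing', 'multiple', 'self' or 'points elsewhere'."""
    parser = CanonicalParser()
    parser.feed(html)
    parser.close()
    if not parser.hrefs:
        return "missing"
    if len(parser.hrefs) > 1:
        return "multiple"
    # Resolve a relative href against the page URL before comparing.
    target = urljoin(page_url, parser.hrefs[0])
    return "self" if target == page_url else "points elsewhere"
```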

Language

<html lang="en-au">

Declares the page's primary language. Used by browsers, screen readers, and search engines.

Viewport

<meta name="viewport" content="width=device-width, initial-scale=1">

Required for mobile-responsive layouts. Without it, mobile browsers render the page at a desktop width and then scale it down to fit, forcing the user to zoom in and pan around to read the content.

Charset

<meta charset="UTF-8">

Tells the browser which character encoding to use when interpreting the file. UTF-8 is the de facto standard on the web and supports virtually all written languages.

Robots directives

Robots directives control crawler behaviour. They can be set page-by-page (in HTML) or via HTTP headers (X-Robots-Tag).
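For the HTTP header route, the sketch below shows one way a server could attach X-Robots-Tag to a response. It is a minimal WSGI application written for illustration only: robots_header_app is a hypothetical name, the rule of applying noindex to PDF paths is an assumed example, and the plain-text body is a placeholder.

```python
def robots_header_app(environ, start_response):
    """Minimal WSGI app that adds X-Robots-Tag: noindex to PDF responses only."""
    path = environ.get("PATH_INFO", "")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path.lower().endswith(".pdf"):
        # Non-HTML files cannot carry a meta tag, so the header is the only option.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"placeholder body\n"]
```

The app can be served locally with wsgiref.simple_server.make_server from the standard library. In practice the same header is more often set in the web server or CDN configuration.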

In HTML

<meta name="robots" content="noindex, nofollow">

Common directive values

  • index / noindex — whether the page can appear in search results.
  • follow / nofollow — whether links on the page pass authority signals.
  • noarchive — don't show a cached version of the page.

Less common / special purpose directive values

  • nosnippet — don't show any text snippet in results.
  • max-snippet:N — limit the snippet to N characters.
  • max-image-preview:[none|standard|large] — limit image previews in results.
  • max-video-preview:N — limit video preview duration to N seconds.
  • unavailable_after:[date] — drop the page from the index after a specified date.
  • notranslate — don't offer to translate the page in search results.
  • noimageindex — don't index images on the page.
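The directive values listed above are combined in a single comma-separated content value, some of them carrying a parameter after a colon. A minimal Python sketch of splitting such a value follows. parse_robots_content is a hypothetical helper, and it assumes no parameter contains a comma, which does not hold for an unavailable_after date written in a format such as "Wed, 25 Jun ...".

```python
def parse_robots_content(content):
    """Split a robots content value into a dict of directive -> parameter (or True)."""
    directives = {}
    for part in content.split(","):
        part = part.strip().lower()
        if not part:
            continue
        # Only the first colon separates name from parameter, so times survive.
        name, sep, value = part.partition(":")
        directives[name.strip()] = value.strip() if sep else True
    return directives


print(parse_robots_content("noindex, max-snippet:50"))
# {'noindex': True, 'max-snippet': '50'}
```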

Directives can be targeted to specific bots:

<meta name="googlebot" content="noindex">
<meta name="bingbot" content="noindex">

Where generic robots directives and googlebot directives conflict, Google applies the more restrictive one. Directives that do not conflict are combined.
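To see which directives each crawler is being given, the targeted meta tags can be grouped by name. The sketch below assumes Python's standard library. CrawlerDirectiveParser and directives_by_crawler are hypothetical names, and the list of crawler names is limited to the three that appear in this article.

```python
from html.parser import HTMLParser


class CrawlerDirectiveParser(HTMLParser):
    """Groups robots directives by the crawler name they are addressed to."""

    CRAWLERS = ("robots", "googlebot", "bingbot")

    def __init__(self):
        super().__init__()
        self.by_crawler = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in self.CRAWLERS:
            content = attrs.get("content") or ""
            values = [v.strip().lower() for v in content.split(",") if v.strip()]
            self.by_crawler.setdefault(name, []).extend(values)


def directives_by_crawler(html):
    """Return a dict such as {'robots': ['nofollow'], 'googlebot': ['noindex']}."""
    parser = CrawlerDirectiveParser()
    parser.feed(html)
    parser.close()
    return parser.by_crawler
```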

Directives vs. robots.txt

These two are often confused but do different things:

  • robots.txt controls crawling. It tells bots which URLs they're allowed to fetch.
  • Robots meta directives control indexing. They tell bots whether to include a fetched page in the index.

A page that is disallowed in robots.txt but carries a noindex meta tag can still be indexed (without its content). By honouring the robots.txt rule not to crawl the page, Google never fetches it and so never sees the noindex directive. To deindex a page reliably, allow crawling in robots.txt and use a noindex meta directive.
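The interaction between the two mechanisms can be tested before relying on a noindex tag. The sketch below uses urllib.robotparser from Python's standard library to ask whether a crawler is allowed to fetch a URL at all. noindex_is_visible is a hypothetical helper name, and the standard-library parser does not replicate every matching rule Google applies (wildcards, for example), so treat the result as indicative.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_visible(robots_txt, url, user_agent="Googlebot"):
    """True if robots.txt lets the crawler fetch the URL, so a noindex on it can be read."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
# A noindex tag on this page would never be seen, because the fetch is blocked.
print(noindex_is_visible(rules, "https://example.com/private/page"))
```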

Open Graph metadata

Used by social platforms when a page is shared. Covered in detail in Open Graph Protocol Explained.

<meta property="og:title" content="...">
<meta property="og:description" content="...">
<meta property="og:image" content="...">
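When these tags are generated from page data, attribute values need escaping so that a quote in a title cannot break the markup. A minimal sketch, assuming Python's standard library. open_graph_tags is a hypothetical helper and covers only the three properties shown above.

```python
from html import escape


def open_graph_tags(title, description, image_url):
    """Render the three basic Open Graph tags with attribute-safe values."""
    fields = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
    }
    return "\n".join(
        f'<meta property="{prop}" content="{escape(value, quote=True)}">'
        for prop, value in fields.items()
    )
```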

Structured data (JSON-LD)

Structured data is metadata about the page's content, used by search engines to power rich results.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "...",
  "author": { "@type": "Person", "name": "..." },
  "datePublished": "..."
}
</script>

JSON-LD is Google's preferred format. Microdata and RDFa are supported alternatives, but they are far less commonly used today.
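Building the JSON-LD block from data rather than by hand avoids quoting mistakes. The sketch below is one possible approach in Python. article_json_ld is a hypothetical helper limited to the Article fields shown in the example above, and any values passed to it are placeholders.

```python
import json


def article_json_ld(headline, author_name, date_published):
    """Return a JSON-LD <script> element for a schema.org Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
    }
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value such as "</script>" cannot end the script element early.
    # "\/" is a valid JSON escape for "/", so the data itself is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```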

Other useful metadata

  • <meta name="theme-color"> — sets the address bar colour on mobile browsers.
  • <meta name="referrer"> — controls what referrer information is sent when users follow links.
  • <link rel="alternate" hreflang="..."> — language and region targeting (covered in Hreflang Localization Explained).
  • <link rel="preload"> and <link rel="preconnect"> — performance hints to the browser.

Good metadata management

  • Every indexable page has a unique title and description.
  • Every page declares language, charset, and viewport.
  • Robots directives are explicit on pages that should not be indexed (search results, login pages, thank-you pages, faceted filters).
  • Canonical tags are self-referencing on indexable pages and point to the correct master version on duplicates.
  • Open Graph metadata is configured for any page likely to be shared.
  • Structured data is implemented for content types that support rich results (products, articles, FAQs, events, organisations).
  • No conflicting signals — for example, a page with noindex shouldn't appear in the sitemap or have hreflang annotations pointing to it.
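Parts of this checklist lend themselves to automation. As one example, the sketch below checks that a page declares language, charset and viewport. It assumes Python's standard library. BasicsParser and missing_basics are hypothetical names, and the check only looks at the HTML, so a charset declared solely in an HTTP header would be reported as missing.

```python
from html.parser import HTMLParser


class BasicsParser(HTMLParser):
    """Records whether lang, charset and viewport declarations are present."""

    def __init__(self):
        super().__init__()
        self.found = {"lang": False, "charset": False, "viewport": False}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.found["lang"] = True
        elif tag == "meta":
            if attrs.get("charset"):
                self.found["charset"] = True
            name = (attrs.get("name") or "").lower()
            if name == "viewport" and attrs.get("content"):
                self.found["viewport"] = True


def missing_basics(html):
    """Return the names of the basic declarations the page does not make."""
    parser = BasicsParser()
    parser.feed(html)
    parser.close()
    return [name for name, present in parser.found.items() if not present]
```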

Common issues

  • Repetitive titles and descriptions across multiple pages.
  • Missing or empty metadata on pages.
  • Noindex on pages that should rank — accidentally left over from staging or development.
  • Indexable pages with no canonical — a small issue, but worth fixing to avoid ambiguity.
  • Conflicting robots directives — for example, index in HTML but noindex in X-Robots-Tag headers.
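The last issue in this list, a meta tag and an X-Robots-Tag header that disagree, is easy to miss because the two live in different places. A minimal Python sketch of detecting it follows. robots_conflicts is a hypothetical helper, it only compares the index and follow pairs, and it assumes the header value has no crawler-name prefix such as "googlebot:".

```python
def robots_conflicts(meta_content, header_value):
    """List opposing directive pairs found across the meta tag and the header."""
    pairs = [("index", "noindex"), ("follow", "nofollow")]

    def tokens(value):
        return {part.strip().lower() for part in value.split(",") if part.strip()}

    combined = tokens(meta_content) | tokens(header_value)
    return [f"{a} vs {b}" for a, b in pairs if a in combined and b in combined]


print(robots_conflicts("index, follow", "noindex"))
# ['index vs noindex']
```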

Disclaimer: All information contained herein is for informational purposes only. It is not advice or instruction.