Sitemap Validator

Verify your XML sitemap against the sitemaps.org schema to catch format errors, missing fields, and best-practice violations.

What Does “Verify Sitemap” Actually Mean?

Verifying a sitemap means checking two things: that the file complies with the sitemaps.org schema, and that the URLs inside it accurately reflect the structure of your site. The schema check catches malformed XML: unclosed tags, missing namespaces, invalid date formats, or non-standard fields. The structural check catches mistakes a schema validator cannot see: URLs that return 404, 301 redirects pointing somewhere unexpected, orphaned URLs that no internal link points to, or duplicate entries.

This validator runs the schema-level check against the sitemaps.org spec for urlset and sitemapindex roots. For URL-level health checks (status codes, redirect chains), use Google Search Console or Screaming Frog after this validator passes.

How Do You Know If Your Sitemap Is Valid?

A valid sitemap meets all of these criteria:

  • Well-formed XML. Every tag opens and closes; the namespace declaration matches the sitemaps.org namespace.
  • Root element is one of two: <urlset> for a regular sitemap, or <sitemapindex> for a sitemap of sitemaps. Anything else is invalid.
  • Every URL has a fully-qualified location. <loc> must include the protocol and domain — relative URLs are rejected.
  • Optional fields use valid values. <lastmod> in W3C Datetime format (a profile of ISO 8601); <changefreq> in the allowed enum (always, hourly, daily, weekly, monthly, yearly, never); <priority> between 0.0 and 1.0 inclusive.
  • Within size limits: at most 50,000 URLs and 50 MB uncompressed per file. Past those limits, split into multiple sitemaps and link them from a sitemap index.
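
The structural criteria above can be sketched in a few lines of Python. This is a minimal illustration, not this validator's actual implementation: the function name check_structure and the wording of the messages are assumptions, and it covers only well-formedness, the root element and namespace, absolute <loc> values, and the size limits.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, measured on the uncompressed file


def check_structure(xml_bytes: bytes) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    if len(xml_bytes) > MAX_BYTES:
        return ["file exceeds 50 MB uncompressed"]
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    # ElementTree spells a namespaced tag as "{namespace}localname".
    # Map each allowed root to the child element it is expected to contain.
    roots = {
        f"{{{SITEMAP_NS}}}urlset": "url",
        f"{{{SITEMAP_NS}}}sitemapindex": "sitemap",
    }
    if root.tag not in roots:
        return [f"unexpected root element or namespace: {root.tag}"]

    problems = []
    entries = root.findall(f"{{{SITEMAP_NS}}}{roots[root.tag]}")
    if len(entries) > MAX_ENTRIES:
        problems.append(f"{len(entries)} entries exceeds the {MAX_ENTRIES} limit")

    for i, entry in enumerate(entries, start=1):
        loc = entry.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        parsed = urlparse(loc)
        # A usable <loc> needs both a protocol and a domain.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {i}: <loc> is missing or not absolute: {loc!r}")
    return problems
```

Passing the raw bytes of the file (rather than a decoded string) lets the parser honor the encoding named in the XML declaration.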

What Are the Most Common Sitemap Validation Errors?

  • Missing XML declaration. The file should begin with <?xml version="1.0" encoding="UTF-8"?>, and the sitemap itself must be UTF-8 encoded. Some CMS exports skip the declaration; add it back.
  • Wrong namespace. The root element (<urlset> or <sitemapindex>) must declare xmlns="http://www.sitemaps.org/schemas/sitemap/0.9".
  • Invalid lastmod format. Use W3C Datetime (ISO 8601) format: 2026-01-15 or 2026-01-15T10:30:00+00:00. US-style dates (1/15/26) are rejected.
  • Invalid changefreq value. Only the enum values (always, hourly, daily, weekly, monthly, yearly, never) are allowed. Values such as “biweekly” or “quarterly” are not valid.
  • Priority outside 0.0–1.0. Some tools emit priorities of 5 or 10; anything outside the spec range is invalid.
  • Mixed protocols (http and https) in the same sitemap. Pick one — usually https — and stick with it.
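
Several of the value-level errors above are easy to test for directly. The sketch below is illustrative only: check_fields and mixed_schemes are hypothetical helper names, and the lastmod pattern checks only the shape of the value (a full date, or a date-time with an explicit offset or Z). That is stricter than W3C Datetime, which also permits year-only and year-month values, and it does not reject impossible dates such as month 13.

```python
import re
from urllib.parse import urlparse

CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}

# Date only, or date plus time with an explicit UTC offset or "Z".
LASTMOD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)


def check_fields(lastmod=None, changefreq=None, priority=None):
    """Check the optional fields of one entry; values are the raw element text."""
    problems = []
    if lastmod is not None and not LASTMOD_RE.match(lastmod):
        problems.append(f"invalid lastmod: {lastmod!r}")
    if changefreq is not None and changefreq not in CHANGEFREQ:
        problems.append(f"invalid changefreq: {changefreq!r}")
    if priority is not None:
        try:
            value = float(priority)
        except ValueError:
            value = -1.0  # force the range check below to fail
        if not 0.0 <= value <= 1.0:
            problems.append(f"priority outside 0.0-1.0: {priority!r}")
    return problems


def mixed_schemes(locs):
    """True if the <loc> values use more than one protocol (http and https)."""
    return len({urlparse(loc).scheme for loc in locs}) > 1
```

For example, check_fields(lastmod="1/15/26", changefreq="biweekly", priority="5") reports three problems, one per field.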

Should Search Engines Be Able to Find Every URL on My Site?

Not necessarily. A sitemap should list URLs you want indexed — high-quality, canonical, public-facing pages. URLs you should keep out of the sitemap include:

  • Pages with noindex meta directives (you're telling search engines to skip them anyway).
  • Pages blocked by robots.txt.
  • URL parameter variants of canonical pages (search results, tracking-tagged URLs, sort orders).
  • Thin or low-quality pages that add little value for searchers and can dilute how search engines assess the site's overall quality.
  • Internal pages (admin, staging, legal-only access).
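
Some of these exclusions can be applied mechanically when generating the sitemap. The following is a rough sketch under stated assumptions: sitemap_candidates is a hypothetical helper, noindex_urls is assumed to be a set you have already collected (detecting noindex requires fetching each page), and treating any URL with a query string as a parameter variant is a simplification that may not suit every site. The robots.txt check uses Python's standard urllib.robotparser.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def sitemap_candidates(urls, robots_txt, noindex_urls=frozenset()):
    """Filter a list of URLs down to those worth listing in a sitemap."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse rules from text, no network call

    kept = []
    for url in urls:
        if urlparse(url).query:             # parameter variant (sort, tracking, search)
            continue
        if url in noindex_urls:             # page carries a noindex directive
            continue
        if not robots.can_fetch("*", url):  # blocked by robots.txt
            continue
        kept.append(url)
    return kept
```

Thin pages and internal pages that are not covered by robots.txt still need a manual or site-specific rule; nothing in this sketch detects them.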

When you train a Hyperleap AI agent on your website, the agent reads your sitemap to discover which pages to learn from. A clean, validated sitemap means a clean knowledge base: your AI agent answers visitor questions from the right pages, not from internal admin pages or thin parameter variants.
