Robots.txt

Validates the presence and accessibility of the robots.txt file.

What is this check?

A `robots.txt` file is a plain text file located at the root of your website (e.g., `yourdomain.com/robots.txt`) that tells search engine crawlers which pages or sections they are allowed or forbidden to crawl.

Why is it important?

It's the first file search engine bots request when they visit your site. It lets you prevent crawling of private areas, duplicate content, or unimportant pages, helping you direct crawl budget to your most important pages.

What is the impact?

An incorrectly configured `robots.txt` file is one of the most catastrophic SEO errors: a single stray `Disallow: /` line can block search engines from crawling your entire website.
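As a minimal sketch of how severe this failure mode is, the snippet below parses a hypothetical "block everything" robots.txt with Python's standard-library `urllib.robotparser` and confirms that every URL on the (placeholder) domain becomes off-limits to all crawlers:

```python
from urllib import robotparser

# Hypothetical example: a robots.txt that accidentally blocks the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(BLOCK_ALL.splitlines())

# Every URL is now disallowed for every crawler.
print(parser.can_fetch("*", "https://yourdomain.com/"))       # False
print(parser.can_fetch("*", "https://yourdomain.com/about"))  # False
```

The same parser can be pointed at a live site with `set_url()` and `read()`, which makes it a convenient way to audit a real robots.txt before deploying it.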

Example Implementation

# Example 1: allow all crawlers full access
User-agent: *
Allow: /

# Example 2: block all crawlers from a specific folder
User-agent: *
Disallow: /private/

Note that these are two alternative files. If both groups appeared in a single file, most crawlers would merge the rules for `User-agent: *` into one group.
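You can verify rules like these before publishing them. The sketch below (using the placeholder domain `yourdomain.com`) checks a combined rule set with `urllib.robotparser`; note that Python's parser applies rules in file order, first match wins, so the more specific `Disallow` is listed before the blanket `Allow`:

```python
from urllib import robotparser

# A combined rule set: block /private/, allow everything else.
# The Disallow line comes first because urllib.robotparser uses
# first-match-wins ordering (some crawlers instead prefer the
# longest matching rule).
EXAMPLE = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(EXAMPLE.splitlines())

print(parser.can_fetch("*", "https://yourdomain.com/index.html"))    # True
print(parser.can_fetch("*", "https://yourdomain.com/private/data"))  # False
```

Running a quick check like this against every edit to robots.txt is a cheap safeguard against the accidental site-wide block described above.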