A WordPress robots.txt check helps detect if your site is blocking search engines from crawling important pages. A misconfigured robots.txt file can prevent Google from indexing your site correctly.
Use this page as part of the broader WordPress pre-launch checklist when you want to validate crawl behavior before delivery.
A valid robots.txt file helps define crawl behavior at the site level. It can be useful for controlling access to specific paths, but it should be configured carefully. If the file is missing, inaccessible, malformed, or overly restrictive, search engines may crawl the wrong areas or be blocked from sections that should remain discoverable.
This matters especially before publishing, because robots.txt errors can shape how crawlers explore the site from the moment it goes live. It is also important not to confuse crawl blocking with index control: a robots.txt rule does not work the same way as a noindex directive.
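The distinction can be made concrete. A robots.txt rule tells crawlers not to fetch a path, while a noindex directive, delivered in the page's HTML or in an X-Robots-Tag response header, tells search engines not to list a page they have fetched. The paths below are illustrative:

```
# robots.txt — controls crawling only; a disallowed URL can still be indexed
User-agent: *
Disallow: /private/

# To keep a page out of the index, use noindex on the page itself:
#   <meta name="robots" content="noindex">
# or send it as an HTTP header:
#   X-Robots-Tag: noindex
```

Note that the two interact: if a URL is disallowed in robots.txt, crawlers never fetch it, so a noindex tag on that page will never be seen.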
Before marking this check as correct, review the following points:

- The file should be available at /robots.txt on the public site root.
- The response should be accessible without errors or broken redirects.
- The file should use valid plain-text rules.
- Important public sections of the site should not be blocked accidentally.
- The file should not be used as a substitute for page-level noindex directives.
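Parts of this review can be automated. The sketch below uses Python's standard urllib.robotparser to confirm that a set of important paths stays crawlable under a given rule set; the rules and the path list are hypothetical examples, not a fixed recommendation.

```python
from urllib.robotparser import RobotFileParser

# Example rules. The Allow line is listed first because Python's stdlib
# parser applies the first matching rule in file order (real crawlers
# such as Googlebot use most-specific-match instead).
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

# Hypothetical list of paths that must stay crawlable before launch.
IMPORTANT_PATHS = ["/", "/blog/", "/wp-admin/admin-ajax.php"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Map each path to whether a generic crawler may fetch it.
crawlable = {path: parser.can_fetch("*", path) for path in IMPORTANT_PATHS}
for path, ok in crawlable.items():
    print(f"{path}: {'crawlable' if ok else 'BLOCKED'}")
```

Running this against the live file (fetched separately) gives an early warning when a rule accidentally covers a public section.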
PreFlight requests the robots.txt file from the public root of the site and verifies whether it is reachable and behaves as expected from the outside. It also checks whether the file is present in a usable form instead of returning an error, an invalid response, or a setup that makes the rules unreliable.
This check helps detect whether robots.txt is available and technically coherent before delivery. It does not replace a full crawl policy review, but it is a strong early signal for launch readiness.
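As a rough illustration of that kind of outside-in check, a fetched /robots.txt response could be classified like this. The function and its rules are a simplified sketch, not PreFlight's actual logic:

```python
def assess_robots_response(status: int, content_type: str, body: str) -> str:
    """Classify a fetched /robots.txt response as pass / review / fail."""
    if status != 200:
        return "fail"      # missing, unreachable, or broken redirect target
    if "text/plain" not in content_type:
        return "review"    # served, but not declared as plain text
    stripped = body.strip()
    if stripped.lower().startswith("<!doctype") or stripped.startswith("<"):
        return "fail"      # an HTML page instead of robots rules
    return "pass"
```

For example, a 200 response with a text/plain body of rules passes, while a 404 or an HTML error page fails.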
Pass: The robots.txt file is present at the expected root location, can be fetched correctly, and does not show obvious issues that would weaken crawl control on the live site.

Warning: The file exists, but something may deserve review, such as unusual behavior, weak accessibility, redirect dependency, or rules that could create confusion before launch.

Fail: The robots.txt file is missing, unreachable, invalid, or behaves in a way that makes crawl control unreliable on the public site.
Common mistakes to watch for:

- Placing robots.txt outside the root location.
- Returning HTML, errors, or broken redirects instead of a valid text file.
- Blocking important public URLs by mistake.
- Using robots.txt to try to prevent indexing instead of using noindex where needed.
- Leaving old staging restrictions in place after migration.
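The last mistake, leftover staging restrictions, usually takes the form of a blanket `Disallow: /` under `User-agent: *`. A minimal detection sketch (simplified, ignoring the group-separation subtleties of the robots.txt spec):

```python
def has_blanket_block(robots_txt: str) -> bool:
    """Detect a 'Disallow: /' rule under 'User-agent: *' — a common
    leftover from staging that blocks crawling of the whole site."""
    current_is_wildcard = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            current_is_wildcard = (value == "*")
        elif key == "disallow" and current_is_wildcard and value == "/":
            return True
    return False
```

A check like this, run once against the production URL after migration, catches the staging leftover before crawlers see it.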
Can robots.txt be used to keep pages out of search results? Not reliably on its own. Robots.txt controls crawling, not indexing: a disallowed URL can still end up indexed if other pages link to it, which is why a page-level noindex directive is the right tool for index control.
Where should the robots.txt file live? At the root of the public site, for example /robots.txt, because that is the location crawlers expect for host-level rules.
Is a missing robots.txt worth fixing before launch? Yes. A site can technically work without it, but if crawl behavior is part of the expected launch setup, a missing or inaccessible robots.txt is still a technical gap worth fixing.