Bug (and workaround): 404 page returns HTTP 200

From the Redirect options documentation:

You can set up a custom 404 page for all paths that don’t resolve to a static file. This doesn’t require any redirect rules. If you add a 404.html page to your site, it will be picked up and displayed automatically for any failed paths.

This seems to work just fine for pages that don’t exist and for most special files, but fetching /404 or /404.html returns HTTP 200 instead of HTTP 404. As far as I’m aware, this is bad practice, and it may confuse Google Search about whether it should index such a soft 404 error page.
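For anyone who wants to reproduce this, here is a minimal sketch of how the status codes could be checked with Python’s standard library. The helper names `fetch_status` and `report` are my own for illustration, and the site URL in the usage note is a placeholder, not a real deploy.

```python
import urllib.error
import urllib.request


def fetch_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def report(site: str, paths: list[str], fetch=fetch_status) -> dict[str, int]:
    """Map each path to the status code the site returns for it."""
    base = site.rstrip("/")
    return {path: fetch(base + path) for path in paths}
```

Calling `report("https://example.netlify.app", ["/404", "/404.html", "/no-such-page"])` against a site with a custom 404.html and no workaround rules should, per the behavior described above, show 200 for the first two paths and 404 for the last.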

I did figure out a workaround by adding the following lines in my _redirects file, but it’s a bit silly:

/404.html /404 404!
/404 /404.html 404!

Note that self-redirects are ignored for some reason, which is why I had to write the rules so that each path points at the other.

A proper fix would be appreciated.

@SmashManiac Hmm, a 404 page is still a page, and if you request it and it loads, that would seem to be a valid response. The existence of the 404 page does not constitute an error. The 404 page is where visitors are sent when there is an error elsewhere. Therefore, it seems that the best practice would be that visiting the 404 page does not generate an (additional) error, right?

I understand where you’re coming from, but there are a few flaws with that logic.

First, it does make sense at first glance that when a file called 404.html exists on the server and is requested directly, its contents should be returned with no error, regardless of context. The issue is that 404.html has a special meaning to Netlify. As such, Netlify should treat this file only as a private configuration file, like _headers or _redirects, instead of also making it available as a fetchable resource.

Second, with a normal 404 error, no redirect occurs on the client side. Netlify may have documented that feature under “Redirect options” and made it customizable with _redirects rules, but that’s not what actually happens: the user is not redirected to a different URL during this process. In fact, sites that actually do redirect to a different URL to show a 404 error will start running into SEO issues, as the bad page may be indexed by search engines. As such, saying “The 404 page is where visitors are sent when there is an error elsewhere.” is simply incorrect.

Finally, and this was my original point, you end up with an HTTP status code (200) that contradicts the HTTP body (an error page). A human may not see the former, but it causes a lot of problems for bots. In particular, if you were to submit a sitemap containing a soft 404 page to Google Search, it would normally flag it as an error.
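To illustrate the kind of mismatch a bot might look for, here is a rough sketch of a soft 404 check. This is not Google’s actual heuristic, which is not public; the function name `looks_like_soft_404` and the phrase list are assumptions made up for this example.

```python
import re

# Hypothetical phrases that suggest an error page; real crawlers use
# their own undisclosed heuristics.
ERROR_PHRASES = re.compile(
    r"page not found|error 404|does not exist", re.IGNORECASE
)


def looks_like_soft_404(status: int, body: str) -> bool:
    """True when a success status accompanies a body that reads like an error page."""
    is_success = 200 <= status < 300
    return is_success and bool(ERROR_PHRASES.search(body))
```

With this check, a 200 response whose body says “Page Not Found” is flagged, while the same body served with a real 404 status is not, since the status and the body then agree.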

I hope it clarifies the situation.

@SmashManiac Not even a little. We’re going to have to agree to disagree on this one.

Uh… this is not a matter of opinion.

Even if you don’t personally agree with some of the principles I’ve raised in my previous post, the fact remains that Netlify returns a contradictory HTTP response in this case, which causes an SEO issue, and that should be fixed.

In case you had not noticed, I did include evidence of this problem in my original post; it’s the soft 404 errors link from the Google Search Console documentation, included here again for convenience. The only reason I spoke in the conditional is that such detection mechanisms by bots rely on heuristics.

Here’s another link from the same documentation, which explains how a soft 404 can trigger an error. You can see all possible errors and warnings Google Search Console can flag about indexing coverage on that page.

And here are more details from a third party about soft 404s, including an actual screenshot of said error in Google Search Console.

I hope these extra references clarify the validity of my bug report.

Thanks for all this, @SmashManiac! I’ll get some :eyes: on this and we can take a closer look. You raise some valid points. More soon.


Have to stand behind SmashManiac on this one, I think - a 404 is a 404, by definition hard-returns code 404, and then may also present a 404-orientated web page…


Hi, @narrationsd and @SmashManiac. I’ve filed a feature request to change this behavior and always return a 404 for the paths /404 and /404.html.

If/when this feature becomes a reality, we’ll post a follow-up here to let you know about it. Please reply anytime if there is more information to share about this feature request and/or if there are any questions.

If you are someone else visiting this page and you would also like to see this feature request become a reality, please add a post below adding your “+1 vote”. If you do this, we will keep the feature request updated so that our project managers have accurate information about the level of interest in this feature.
