Website owners often spend so much time on what they want visitors to see that they forget to optimize what they don’t expect visitors to see.
In recent weeks, I’ve seen several mishandled 404s, and one recurring theme is 404 pages that return “200 OK” codes to search engine crawlers. The sites doing this are fairly prominent ones that we’ve looked to engage from an SEO perspective.
In a couple of cases, through our monitoring of Webmaster Tools, we’ve even discovered client sites with this issue. The last thing you need in the Google index is an error page. Even worse are multiple mistyped URLs being returned as “OK.”
Another interesting trend I’ve seen is dynamically generated 301 or 302 redirects that send users to either the home page or a custom error page when someone mistypes a URL on the domain in the browser. For example, if I type www.searchenginewatch.com/doesnotexist, I’m led to a nice custom 404 error page that offers users a number of choices to find the right page. The key is to understand exactly what response code the server returned.
In order to check response codes, I like to use the SEOconsultants.com tool. If I place the mistyped URL above into the box at this page, I get a detailed list of how the page is seen from a crawler’s perspective. The call to the server generates a 301 redirect to the custom 404 page described above, but most important, it returns a “Server Response” of 404 not found.
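For readers who would rather script this check than use a web tool, here is a minimal Python sketch that follows redirects by hand and records the status code at every hop. It is not how the SEOconsultants.com tool works. The function names `fetch_once` and `trace_status_chain`, the user-agent string, and the hop limit are all assumptions made for illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url, timeout=10):
    """Send one GET request without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Hypothetical user-agent string, used only for this sketch.
        conn.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_status_chain(url, fetch=fetch_once, max_hops=5):
    """Return [(url, status), ...] for each hop, stopping at the first
    non-redirect response or after max_hops redirects."""
    chain = []
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        chain.append((url, status))
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
        else:
            break
    return chain
```

Calling `trace_status_chain("http://www.searchenginewatch.com/doesnotexist")` would return one `(url, status)` pair per hop. The last pair carries the status that matters most for indexing. The `fetch` parameter exists only so the redirect logic can be exercised without touching the network.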
Because the user is led to a page with more information, they will likely be happy and continue to navigate in order to find what they want. The key is that because the server response code is a 404, the URL won’t be indexed in the search engines and will simply “disappear.”
Now if the response code were “200 OK,” which is a mistake developers sometimes make, a new URL (/doesnotexist) could be indexed with possible duplicate content. The home page is often used as the redirect target for a dead or nonexistent page.
From a user experience standpoint, it’s arguable whether that’s the right choice or whether the user should be sent to an error page. I’m sure you can find many people on both sides of that fence. From an SEO perspective, you can end up with duplicate versions of your home page in the search engine indexes, but with bogus URLs.
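The cases discussed so far can be summed up in a small Python sketch that takes the status codes seen while requesting a URL known not to exist and labels the outcome. The function name `classify_missing_url` and its label strings are hypothetical. “Soft 404” is the term commonly used for a missing page that answers “200 OK.”

```python
def classify_missing_url(status_chain):
    """Label how a site answered a request for a URL that should not exist.

    status_chain is the list of HTTP status codes in hop order,
    for example [301, 404] for a redirect to a custom error page.
    """
    if not status_chain:
        raise ValueError("status_chain must contain at least one status code")

    final = status_chain[-1]
    redirected = len(status_chain) > 1

    if final in (404, 410):
        # The crawler is told the page is missing, so the URL should not be indexed.
        return "ok"
    if final == 200 and redirected:
        # For example, a redirect to the home page: risk of duplicate
        # home page content indexed under bogus URLs.
        return "soft-404-redirect"
    if final == 200:
        # The bogus URL itself answers 200 OK and could be indexed as-is.
        return "soft-404-direct"
    # Anything else (server errors, unfinished redirect chains) needs a manual look.
    return "other"
```

Under these labels, the redirect-then-404 behavior described above comes out as `"ok"`, while a redirect to a home page that answers “200 OK” comes out as `"soft-404-redirect"`.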
Search engines can reduce rankings due to server errors and broken pages. Simple errors such as “404 page not found” in large quantities can make the search engines believe that a site isn’t complete or is under construction and, as a result, they may determine that the site isn’t worthy of strong search engine rankings.
When a nonexistent page is requested from the server, the server should respond with an HTTP status of “404 Not Found,” which may also be followed by custom error-page body content. Incorrectly configured web servers that respond with a status of “200” (or any other erroneous value) are exposed to significant risk with respect to search engines’ “duplicate content penalties.” This is because the identical content (in this case, the error page content) would be available under a potentially infinite number of URLs.
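As a rough illustration of that behavior, here is a minimal sketch using Python’s standard `http.server` module. It sends a custom error-page body together with a real 404 status. The `PAGES` table, the error-page markup, and the names `build_response` and `Handler` are hypothetical stand-ins for a real site, and production servers such as Apache or IIS are configured differently.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: the only page that exists is the home page.
PAGES = {"/": b"<h1>Home</h1>"}

# Custom error-page body that gives the visitor somewhere to go next.
NOT_FOUND_BODY = (
    b"<h1>Page not found</h1>"
    b'<p>Try the <a href="/">home page</a>.</p>'
)


def build_response(path, pages=PAGES):
    """Return (status, body): 200 for known paths, 404 plus the custom
    error body for everything else."""
    if path in pages:
        return 200, pages[path]
    return 404, NOT_FOUND_BODY


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Ignore any query string when looking the page up.
        status, body = build_response(self.path.split("?", 1)[0])
        self.send_response(status)  # the status line carries the real 404
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, you could run `HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()` and request any unknown path. The point is that the friendly body and the 404 status travel together, so the visitor gets help and the crawler gets the correct code.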
Custom 404 pages serve several important purposes. First, they return the correct code to users and to search engine spiders, informing visitors that the page they were seeking wasn’t found. Second, custom 404 pages present visitors with options about what to do next. Without a custom 404 error page, the visitor (human or robot) is left with only two courses of action: to abandon their search or click the back button. Neither of these is a satisfactory response to an error.
Geis provides specific recommendations to several clients on this issue. So, if you have this problem with your sites, don’t feel like you’re alone. Take care of it because it could potentially be damaging in the long term from a user experience and SEO perspective.