If you come to work one morning and find that your company’s traffic from Google
has fallen to nothing, a competitor may be redirecting traffic from your site to his.
Amazingly, there may be little or nothing you can do to stop this blatant rip-off.
The cause is an obscure HTTP status code that is interpreted poorly by Google but
correctly by Yahoo and some other search engines. Knowing about the trick at least
gives you some hope of understanding what happened if it happens to you.
I described on Nov. 1 how duplicate copies of your content can result in the
copies ranking higher in Google than your own original material. And on Nov. 8,
I showed how you can prevent duplicates from hurting your site’s rankings.
Today, I’ll finish this series on search-engine problems with a look at the complete
hijacking of your Web site traffic.
The 302 Problem In A Nutshell
The ability of one site to hijack another site’s Google traffic arises because of
two different but related HTTP status codes:
• A 301 code is a permanent redirect.
A Webmaster configures the server to return this code when content has moved to a different
location. The 301 code is like a Webmaster saying to search engines, “Page A is now
at Page B.” The code is legitimately needed when a site has a new domain name or
has simply renamed its directory structure. Search engines usually give the new page
the same weight that the old page had. This means companies can shift content around
without losing the rankings that were previously earned in search-engine listings.
• A 302 code is a temporary redirect (named “Found” in the HTTP/1.1
specification).
The 302 code essentially says, “Page A is at Page B for now, but it plans to move back, so keep
the link to Page A and give it the same weight as Page B.” The code has legitimate
uses. For example, a company could be load-balancing by temporarily moving some
high-demand content to a beefier server.
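To make the two codes concrete, here is a minimal Python sketch of what a server sends back in each case. The helper name build_redirect and the example.com addresses are placeholders I made up for illustration; a real server adds more headers, but the status line and the Location header are the parts a search engine's crawler acts on.

```python
from http import HTTPStatus


def build_redirect(status: HTTPStatus, location: str) -> str:
    """Return the raw text of a minimal HTTP/1.1 redirect response.

    Hypothetical helper for illustration only; it is not part of any
    Web server's actual API.
    """
    return (
        f"HTTP/1.1 {status.value} {status.phrase}\r\n"  # status line
        f"Location: {location}\r\n"                     # where Page B lives
        "Content-Length: 0\r\n"
        "\r\n"
    )


# "Page A is now at Page B" (permanent)
permanent = build_redirect(HTTPStatus.MOVED_PERMANENTLY, "http://example.com/page-b")

# "Page A is at Page B for now" (temporary)
temporary = build_redirect(HTTPStatus.FOUND, "http://example.com/page-b")

print(permanent)
print(temporary)
```

The only difference between the two responses is the status line. Everything that follows in this article hinges on how a search engine chooses to treat that one number.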
The problem is that Google sometimes puts the Page A link in its results instead of Page B,
even when Page A is an attacker’s site that is using a 302 code to steal Page B’s
traffic. As amazing as it may seem, Google can shift your legitimate traffic to an imposter,
who can profit from any visitors who don’t immediately notice any difference.
Some Search Engines Solve The Problem
The problem is not entirely new, but it’s become a serious concern lately as
Webmasters have started using 302 codes more widely. Google’s own “AdSense”
listing was even taken over by an unrelated site, which had been using 302 codes
innocently as a way to make its links editable at a later date.
Complaints grew to the point
that Danny Sullivan, the editor of Search Engine Watch, held an “Indexing Summit”
to discuss it publicly with search-engine officials in August.
According to Eric Baldeschweiler, Yahoo’s director of software development, the
Yahoo search engine solved the problem many months ago. Yahoo uses the following
rules to settle things:
• All 301 permanent redirects are fine. Yahoo links to Page B. You
can safely move content around within your site or to an entirely new domain
with no problem. There’s only one exception — Yahoo shows the URL of a site’s
home page even if it immediately redirects to a “welcome” page.
• All 302 temporary redirects within a domain are fine. Yahoo links
to Page A. You can temporarily redirect traffic from your own Page A to your own
Page B without losing the search-engine ranking of Page A.
• All 302 redirects from one domain to another are considered
permanent. Page A derives no benefit. Yahoo links to Page B. This eliminates
any site’s ability to steal traffic from an unrelated site.
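The three rules above are simple enough to express in a few lines of Python. This is only a sketch of the policy as described here, not Yahoo's actual code. The function name listed_url is my own invention, "same domain" is approximated by comparing host names exactly (so www.example.com and example.com would count as different), and I am assuming the home-page exception applies only to redirects within one site.

```python
from urllib.parse import urlsplit


def listed_url(page_a: str, page_b: str, status: int) -> str:
    """Pick the URL to list when page_a redirects to page_b.

    Illustrative sketch of the rules described in the article.
    """
    a, b = urlsplit(page_a), urlsplit(page_b)
    same_host = a.hostname == b.hostname  # assumption: exact host match

    if status == 301:
        # Assumed scope of the exception: a home page that redirects to
        # a "welcome" page on the same site keeps its own URL.
        if same_host and a.path in ("", "/"):
            return page_a
        return page_b  # permanent move: list Page B

    if status == 302:
        # Temporary move inside one site: keep listing Page A.
        # Across sites: treat it as permanent and list Page B.
        return page_a if same_host else page_b

    raise ValueError(f"not a 301 or 302 redirect: {status}")


print(listed_url("http://example.com/old", "http://example.com/new", 301))
print(listed_url("http://example.com/a", "http://example.com/b", 302))
print(listed_url("http://attacker.example/a", "http://victim.example/b", 302))
```

The last call is the hijacking case. Because the redirect crosses from one site to another, the attacker's Page A is never listed, and the victim's Page B keeps its place.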
Sullivan confirms, “In fact, Yahoo has gone to that kind of solution.” However,
he adds, “The result that came down from Google” — which also participated in
the summit — “was, ‘We think we’ve solved that problem.’ ”
Sullivan believes MSN Search might also suffer from the 302 problem to some
extent, but MSN did not take part in the summit.
Others also feel Google is still affected to some extent. “As far as I know,
they haven’t solved it,” says Gideon Greenspan, a developer of
Google Publishes A Tech Support Portal
If you think your Web site’s traffic is flowing to some other party, Google has
a form you can fill out to report it. Matt Cutts, a Google search quality engineer,
links to the form in his blog and says complaints that are submitted there
“will get the same level of investigation” they would if he were personally
notified. You’ll have to try it to find out if that’s so.
Whether or not your company is affected today, you need to be knowledgeable
about potential redirection problems in case you eventually have to do something
about them (even if that mostly boils down to complaining your head off).
Sullivan has shown several examples of how the problem affects Google and MSN
Search in his own blog entries.
Claus Schmidt, an Internet consultant who’s been tracking the problem for ages,
has an extensive technical explanation. (This includes a note that Yahoo’s
solution is compatible with the original RFC that defines the 301 and 302 codes.)
Yahoo’s solution seems eminently reasonable and workable to me. Rather than
experimenting with complex rules to analyze URL hijacking, Google and other
search engines should simply adopt the rational 301/302 policy shown above.
This problem shouldn’t exist and need not exist. Finding out that your site has
suddenly lost most of its traffic because of an HTTP redirect trick is a lousy way to
start the day.
Update: English Version of IceSword
I wrote on June 14, 2005, about IceSword, an antihacker utility designed to defeat
“rootkits” that infect Windows. IceSword, at that time, was available only in a
Chinese-language version.
An English-language version of the program is now available for download from
the following Web page:
Xfocus.net is the home of a Chinese group of security researchers. The group’s download
page, as a result, is written entirely in Chinese. Non-Chinese speakers,
however, can easily download IceSword_en1.12.rar (a compressed file) by
clicking the blue characters in angle brackets shown at the bottom of the page.
I’d like to thank one of my readers who goes by the name of Illukka for his help
researching this topic. He’ll receive a gift certificate for a book, CD, or DVD
of his choice for sending me a tip I printed.