You might be pretty mad if you found out that your articles on the Web were
being copied and inserted into someone else’s site without permission or credit.
To add injury to insult, the “duplicate” site can show up in search-engine listings and make your own site’s listings disappear.
This is an example of “Googlewashing,” a term that combines Google and brainwashing,
and it’s becoming a serious problem for a growing number of content-based businesses.
Whatever You Publish Is Mine
An instance of an original site being pushed down in Google’s listings by
duplicates got some attention at Google recently because of the person
who was affected: one of Google’s own search quality engineers.
• The original post. The aforementioned Google employee, Matt
Cutts, maintains his own site, MattCutts.com/blog, on which he
publishes articles and comments about the search giant’s ranking methods. An
offhand remark of his about a new dish in the Google cafeteria called “bacon
polenta” happened to lead off one such article a few weeks ago.
• Excerpts of the post. A few days later, the Threadwatch marketing blog noticed a
strange thing. You could search for the entire first two sentences of Cutts’
article in Google (which should produce only one hit), but his article wasn’t
even No. 1. Instead, other bloggers, who’d innocently pasted an excerpt of
Cutts’ article into their sites and then legitimately linked to the full piece,
were ranking higher on the phrase. Threadwatch posted a screen shot
demonstrating the effect.
• Serious Googlewashing. To add a note of gravity to all this
hilarity, a French group known as Dark SEO Team (SEO stands for search engine
optimization) developed a deliberate hack of Google’s search-engine algorithm.
The team’s own copycat page was soon also ranking higher than Cutts’ original.
And the group made a chilling claim:
“Anyone can use Google’s duplicate content filters to ruin a competitor’s
website, and steal his ranking and traffic,” according to the team’s
Open Letter to Matt Cutts.
In addition, Dark SEO Team has made waves in the world of search engine
marketing by demonstrating how an ordinary site can fake “Page Rank 10.” PR10 is
Google’s highest score for Web pages. Such a high Page Rank gives a much-coveted
boost to a site’s content in Google’s results.
The problem with fake page rank has become so great that a site called
SEOLogs.com has even posted a Fake Page Rank Detector.
You enter a site’s Web address, and the detector tells you whether that site has
earned its Google Page Rank or has succeeded in faking it.
What’s going on here? The implications are worrisome for all kinds of legitimate
e-businesses, not just those that specialize in helping Web sites get better
rankings.
It’s Not Nice To Fool Mother Google
Matt Cutts’ original page wasn’t listed as high as other pages that merely
copied a piece of his content because of a problem that all search engines are
facing, not just Google. Since search engines have become a universal way to
find things on the Web, many shady promoters post thousands of Web pages hoping
that one or more will show up near the top of the listings.
These sites use a variety of tricks to “look good” to the search engines’ bots.
A human being who happened to find one of these sites, however, would
immediately see that the content was little more than links to other sites,
usually links that pay commissions to the site owner.
Search engine professionals call these “spammy sites.” Some of them have used
the techniques of search engine optimization so well that they rank higher than
well-written sites. When human searchers are led to sites like this, they blame
Google and other search engines for recommending a page that wasn’t worth
visiting.
Since many of these spammy sites use the same content over and over, Google and
the other indexes have added software routines that try to eliminate duplicate
content. If the same words and phrases are found on several sites, some of the
sites will be pushed lower down in the ranking or not appear at all.
The problem is that Google can’t easily tell which of several duplicate sites is
the genuine, original source of the content.
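To make the duplicate-content problem concrete, here is a minimal sketch of one
common textbook approach to spotting copied text: break each page into
overlapping three-word runs ("shingles") and compare the sets. This is not
Google’s actual filter, whose details are not public; the function names, the
0.5 threshold, and the sample pages are all assumptions made up for
illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(pages, threshold=0.5):
    """Return (page, page, score) for every pair at or above the threshold."""
    sigs = {name: shingles(text) for name, text in pages.items()}
    names = sorted(sigs)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(sigs[first], sigs[second])
            if score >= threshold:
                pairs.append((first, second, round(score, 2)))
    return pairs


# Hypothetical pages; the site names and text are invented for this example.
pages = {
    "original.example": "the cafeteria served a new dish today and it was surprisingly good",
    "copycat.example": "the cafeteria served a new dish today and it was surprisingly good",
    "unrelated.example": "search engines rank pages by many signals including inbound links",
}

for first, second, score in find_duplicates(pages):
    print(first, second, score)
```

The score is symmetric: it reports that the two pages match, but nothing in it
says which page came first. A filter built only on this kind of comparison has
to rely on other signals to decide which copy to keep, and that is where a
copycat can win.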
Imitation Is The Sincerest Form Of Invisibility
Google has by now corrected whatever was preventing its employee’s
blog from coming up as the No. 1 listing on the phrase that he used. Today, all
you have to do is enter the words bacon polenta at Google and Cutts’
original work appears at the top of the list.
Googlewashing, and the issue of duplicate content confusing search engines, is
much larger than this one example, though. The effects can hurt everyone from
blogs that distribute material via RSS (Really Simple Syndication) to corporations
that publish works that others may or may not be authorized to reproduce.
Next week, I’ll describe how you can prevent your Web content from being
duplicated — and how you can keep normal, authorized syndication from making
your site invisible to search engines.