Thursday, May 6, 2021

Gada.be Illustrates Search Engine Ups and Downs

An unusual metasearch service named Gada.be launched its public beta on the
Web in October. By November, its novel approach to search got it banned from Google’s
listings.

The ban has been lifted now, and Gada.be once again appears in Google’s index.

But the whole experience — and the goals Gada.be is trying to achieve —
can teach important lessons to any business that gets traffic from search-engine listings.

My Kingdom for a Subdomain

Gada.be is the brainchild of Chris Pirillo, the proprietor of Lockergnome.com, based
in Seattle, Wash., and Shayne Sweeney, a programmer in Chico, Calif. From the beginning,
the site was intended as much for tiny cell phones as it was
for PCs with full screens. The very name Gada.be (pronounced “gotta be”) was selected
because it requires few keypresses on a 10-key keypad. The number 4232.2233
spells out Gada.be on most handsets.
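
The article does not say how that number is derived. One reading that reproduces it is multi-tap text entry, where the nth letter printed on a key takes n presses of that key. The sketch below assumes the standard letter layout of a phone keypad; the `multitap` helper is a hypothetical name used only for illustration.

```python
# Standard letter layout on a phone keypad (assumed, not taken from the article).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def multitap(text):
    """Return the key presses needed to type text with multi-tap entry."""
    presses = []
    for ch in text.lower():
        for key, letters in KEYPAD.items():
            if ch in letters:
                # The nth letter on a key takes n presses of that key.
                presses.append(key * (letters.index(ch) + 1))
                break
        else:
            presses.append(ch)  # keep punctuation such as "." unchanged
    return "".join(presses)

print(multitap("Gada.be"))  # 4232.2233
```

Under that assumption, four of the six letters in the name need only a single press, which fits the stated goal of few keypresses.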

Gada.be is a metasearch engine for RSS feeds. You can search for feeds on a
topic dear to you and have them delivered to your desktop or mobile device.

This is exactly where Gada.be ran afoul of Google, however. Pirillo and Sweeney
decided to make searches easy on cell phones by allowing search terms to be
defined via subdomains. For example, you can search for feeds on
the term “pink elephant” by typing the following into a Web browser:


http://pink.elephant.gada.be

To search for the words pink elephant in any order, not necessarily as a
phrase, you use hyphens instead of periods between the search words:


http://pink-elephant.gada.be

This use of subdomains (loosely called canonical names here; strictly speaking,
a canonical name in DNS is the target of a CNAME alias record) isn’t new
to Gada.be. Many sites use subdomains for various functions. But it was an
interesting way for Gada.be to provide search results. Users could simply type a
URL instead of loading a search page, navigating to an input box, and then
waiting for an additional results page.
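
The two URL conventions described above can be sketched as a small parser that turns a hostname into search words. This is only an illustration of the idea, not Gada.be's actual code: the function name and the handling of hostnames that mix periods and hyphens are assumptions.

```python
from urllib.parse import urlsplit

BASE_DOMAIN = "gada.be"

def query_from_url(url):
    """Return (words, is_phrase) for a subdomain-style search URL."""
    host = urlsplit(url).hostname or ""
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return [], False              # bare domain or some other site
    label = host[: -len(suffix)]      # e.g. "pink.elephant" or "pink-elephant"
    if "." in label:
        # Periods between words: search for the exact phrase.
        # (Treating any period as "phrase" is an assumption of this sketch.)
        return label.split("."), True
    # Hyphens between words: match the words in any order.
    return label.split("-"), False

print(query_from_url("http://pink.elephant.gada.be"))
print(query_from_url("http://pink-elephant.gada.be"))
```

A server that answers for every possible hostname under one domain can run logic like this on each request, which is also why such a site looks to a crawler like an unlimited supply of subdomains.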

It also waved a red flag for Google, which dropped Gada.be pages from its index
(probably automatically) around Nov. 18.

Hell Hath No Wrath Like Google Scorn

Google gives more weight to sites that have a search term somewhere in the name.
Trying to take advantage of this, Web operators who are out to make a quick buck
have generated thousands of junk sites (called “search engine spam” by Google).
These sites contain little or no content but have various search terms sprinkled
throughout their canonical names, meta tags, visible text, and elsewhere — all
to grab some free traffic from Google and some revenue from Google ads.

As Matt Cutts, a Google search-quality engineer, commented (in response to a
David Naylor blog item about Gada.be’s disappearance), “I suspect it would do
better if it didn’t generate dynamic subdomains for any phrase at all.”

Gada.be seems to have fixed the problem behind the ban, and it reappeared in
Google’s index last week. The startup did this by converting searches into a
more normal URL structure. Today, a search for pink elephant (using
Gada.be’s default setting) results in your Web browser being redirected to the
following address:


http://gada.be/d/pink-elephant

Other Web sites can link to pages in this form, without making it appear to
Google that Gada.be is generating thousands of separate subdomains — a sign of
junk Web sites.
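
As a rough illustration of that conversion, the sketch below rewrites an old subdomain-style address into the path form shown above. The function name and its parameters are hypothetical, and only the hyphenated form is handled, because that is the only mapping the example address demonstrates.

```python
from urllib.parse import urlsplit

def path_style_url(old_url, base="gada.be", prefix="d"):
    """Map http://<terms>.<base> to http://<base>/<prefix>/<terms>."""
    host = urlsplit(old_url).hostname or ""
    suffix = "." + base
    if not host.endswith(suffix):
        return old_url  # not a subdomain search; leave unchanged
    terms = host[: -len(suffix)]
    if "." in terms:
        # How phrase searches are written in the new scheme is not shown,
        # so this sketch declines to guess.
        raise ValueError("phrase-style subdomains are not covered here")
    return f"http://{base}/{prefix}/{terms}"

print(path_style_url("http://pink-elephant.gada.be"))
# http://gada.be/d/pink-elephant
```

Every search now lives under a single hostname, so links to results no longer multiply the number of hostnames a crawler sees.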

“We were receiving 9,000 [unique users] a day before Google banned us,” says
Pirillo. “We’re at 6,000 a day now,” after returning to Google’s search results,
he says.

To Be Junk or Not To Be Junk

Aside from its suspicious-looking use of subdomains, Gada.be has the feel of a
content-free site that might be auto-generated by quick-buck operators. For
example, a search results page at Gada.be is spartan and contains no actual
information about the topic of the search. The only listings that appear are
links to other Web pages. (Of course, the same thing is true of Google results
pages.)

Pirillo emphasizes that Gada.be is far more than just auto-generated links
surrounded by revenue-producing Google ads. The point of indexing RSS feeds, he
says, is that users can integrate them into their own RSS aggregators. To do
this, users can add /opml to any search address, as follows:


http://gada.be/d/pink-elephant/opml

This produces a screenful of OPML (Outline Processor Markup Language). In brief,
OPML is an XML-based way of formatting lists so they can be used in different
operating environments.
In the case of Gada.be, the list could be a variety of feeds you want to
subscribe to regarding a particular topic.

To get updates on your chosen topic, you import the OPML text into your RSS
aggregator. Most aggregators already support the importing of this kind of data.
(The feature is bleeding-edge enough, however, that a Gada.be update page says, “FeedDemon is the only
one that does it properly.”)
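
To make the import step concrete, here is a minimal sketch of what an aggregator might do with OPML text: walk the outline and collect the feed addresses. The sample document is invented for illustration and is not real Gada.be output; the `xmlUrl` attribute is the conventional place for a feed address in OPML subscription lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML subscription list (not actual Gada.be output).
SAMPLE_OPML = """<opml version="1.1">
  <head><title>pink elephant</title></head>
  <body>
    <outline text="Example feed A" type="rss" xmlUrl="http://example.com/a.xml"/>
    <outline text="Example group">
      <outline text="Example feed B" type="rss" xmlUrl="http://example.org/b.xml"/>
    </outline>
  </body>
</opml>"""

def feed_urls(opml_text):
    """Return every feed address found in an OPML document, in document order."""
    root = ET.fromstring(opml_text)
    # Outlines can be nested, so iterate over all of them; group headings
    # carry no xmlUrl attribute and are skipped.
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]

print(feed_urls(SAMPLE_OPML))
```

An aggregator would then subscribe to each returned address in turn.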

An unrelated service, Wynia.org, is also running
an “experiment” using Gada.be to generate a custom RSS feed from any keywords
you enter. This seems to be just the beginning of the add-ons people can create
using this technique.

The Future of Gada.be

Pirillo and Sweeney are particularly proud of the scalability they say they’ve
built into Gada.be. The service’s “asynchronous fetcher” can currently parse as
many as 10,000 sources per request, they explain.
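
The developers give no details about how the fetcher works. As a generic illustration of the asynchronous pattern, the sketch below queries several sources concurrently while capping how many requests are in flight at once. Everything in it is an assumption: the source names are made up, and `fetch_source` sleeps instead of making a real network call.

```python
import asyncio

async def fetch_source(name, delay):
    """Stand-in for one network request; sleeps instead of doing real I/O."""
    await asyncio.sleep(delay)
    return name, [f"{name} result"]

async def fetch_all(sources, max_concurrent=100):
    """Fetch all sources concurrently, at most max_concurrent at a time."""
    limit = asyncio.Semaphore(max_concurrent)

    async def bounded(name, delay):
        async with limit:
            return await fetch_source(name, delay)

    # gather() runs the coroutines concurrently and keeps the input order.
    pairs = await asyncio.gather(*(bounded(n, d) for n, d in sources))
    return dict(pairs)

results = asyncio.run(fetch_all([("blogs", 0.02), ("news", 0.01)]))
print(results)
```

With this pattern the total wait is governed by the slowest source rather than the sum of all of them, which is what makes querying a large number of sources per request plausible.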

Gada.be already returns results from hundreds of sources, but not from the
“organic” search-engine results of Google, Yahoo, and some other sites. At
present, those sites output only their blog and news indexes and various other
services as RSS, the developers say.

They expect this to change as search engines adopt the OpenSearch specification, which is a
standard way to output search results as RSS. It’s currently being promoted by
Amazon.com’s A9 and other search engines. Compliance with OpenSearch will be a
requirement to integrate a search engine into the search box of Microsoft’s
forthcoming Internet Explorer 7.0, according to Pirillo.
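
To show why a common format matters to a metasearch service, here is a minimal sketch that reads search results delivered as RSS with OpenSearch elements. The sample response is invented, and the namespace used is the one from version 1.1 of the specification, which may differ from the version in circulation when this was written.

```python
import xml.etree.ElementTree as ET

# Namespace of OpenSearch 1.1 response elements (assumed version).
OS_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Hypothetical response from an OpenSearch-style engine.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example search: pink elephant</title>
    <opensearch:totalResults>2</opensearch:totalResults>
    <item><title>First hit</title><link>http://example.com/1</link></item>
    <item><title>Second hit</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def parse_results(rss_text):
    """Return (total_results, [(title, link), ...]) from an RSS search response."""
    channel = ET.fromstring(rss_text).find("channel")
    total = int(channel.findtext("os:totalResults", default="0", namespaces=OS_NS))
    items = [(i.findtext("title"), i.findtext("link")) for i in channel.findall("item")]
    return total, items

print(parse_results(SAMPLE))
```

Because every compliant engine would answer in the same shape, one parser like this could serve all of them.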

Attracting 6,000 users a day is respectable, but it represents a tiny fraction
of the users who are now served by Google and other Web giants. Despite this,
Gada.be’s developers envision the day when their site will need to be scaled
across numerous load-balanced servers.

“We’ve already been approached by four venture capital firms, who looked at it
and saw the promise,” Pirillo says.

RSS feeds, OPML, OpenSearch — all of these technologies do indeed show a lot of
promise. It’s anyone’s guess how this will all get put together and who the
ultimate winners will be. At this point, Gada.be is an intriguing example of the
possibilities, even though it’s still a diamond in the rough.
