An unusual metasearch service named Gada.be launched its public beta on the Web in October. By November, its novel approach to search got it banned from Google's listings.
The ban has been lifted now, and Gada.be once again appears in Google's index.
But the whole experience -- and the goals Gada.be is trying to achieve -- can teach important lessons to any business that gets traffic from search-engine listings.
My Kingdom for a Subdomain
Gada.be is a metasearch engine for RSS feeds. You can search for feeds on a topic dear to you and have them delivered to your desktop or mobile device.
This is exactly where Gada.be ran afoul of Google, however. The site's developers, Pirillo and Sweeney, decided to make searches easy on cell phones by allowing search terms to be defined via subdomains. For example, you can search for feeds on the term "pink elephant" by typing the following into a Web browser:
To search for the words pink elephant in any order, not necessarily as a phrase, you use hyphens instead of periods between the search words:
This use of subdomains (sometimes loosely called canonical names, after the DNS CNAME records that can be used to set them up) isn't unique to Gada.be. Many sites use them for various functions. But it was an interesting way for Gada.be to provide search results. Users could simply type a URL instead of loading a search page, navigating to an input box, and then waiting for an additional results page.
It also waved a red flag for Google, which dropped Gada.be pages from its index (probably automatically) around Nov. 18.
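The subdomain convention described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Gada.be's actual code: the function name `parse_search_host` and the stand-in domain `example.com` are assumptions, and the only rule taken from the description is that periods between words mean a phrase while hyphens mean the words may appear in any order.

```python
from typing import List, Tuple

# Hypothetical stand-in for the search service's own domain.
BASE_DOMAIN = "example.com"


def parse_search_host(host: str, base: str = BASE_DOMAIN) -> Tuple[List[str], bool]:
    """Return (terms, is_phrase) for a hostname that encodes a search.

    Periods between the words are treated as an exact phrase; hyphens
    are treated as "these words, in any order".
    """
    host = host.lower().rstrip(".")
    suffix = "." + base
    if not host.endswith(suffix):
        raise ValueError(f"{host!r} is not a subdomain of {base!r}")

    # Everything in front of the base domain carries the search words.
    label = host[: -len(suffix)]
    is_phrase = "." in label
    separator = "." if is_phrase else "-"
    terms = [word for word in label.split(separator) if word]
    if not terms:
        raise ValueError("hostname contains no search terms")
    return terms, is_phrase


if __name__ == "__main__":
    print(parse_search_host("pink.elephant.example.com"))
    print(parse_search_host("pink-elephant.example.com"))
```

Under these assumptions, both hostnames yield the terms `pink` and `elephant`; only the phrase flag differs. Every distinct search therefore lives on its own hostname, which is the pattern that drew Google's attention.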
Hell Hath No Wrath Like Google Scorn
Google gives more weight to sites that have a search term somewhere in the name. Trying to take advantage of this, Web operators who are out to make a quick buck have generated thousands of junk sites (called "search engine spam" by Google). These sites contain little or no content but have various search terms sprinkled throughout their canonical names, meta tags, visible text, and elsewhere -- all to grab some free traffic from Google and some revenue from Google ads.
As Matt Cutts, a Google search-quality engineer, commented (in response to a David Naylor blog item about Gada.be's disappearance), "I suspect it would do better if it didn't generate dynamic subdomains for any phrase at all."
Gada.be seems to have corrected the problem, and it re-appeared in Google's index last week. The startup did this by converting searches into a more conventional URL structure. Today, a search for pink elephant (using Gada.be's default setting) redirects your Web browser to the following address:
Other Web sites can link to pages in this form, without making it appear to Google that Gada.be is generating thousands of separate subdomains -- a sign of junk Web sites.
"We were receiving 9,000 [unique users] a day before Google banned us," says Pirillo. "We're at 6,000 a day now," after returning to Google's search results, he says.
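The move from per-search subdomains to a single-host address can be sketched as a small redirect helper. The exact URL format Gada.be adopted is not reproduced here, so the `/search?q=` path, the `example.com` host, and the function name `redirect_target` are all hypothetical. The sketch only shows the general technique of collapsing many hostnames into one.

```python
from typing import Optional
from urllib.parse import quote_plus

# Hypothetical single host that all searches are redirected to.
BASE_HOST = "example.com"


def redirect_target(request_host: str, base_host: str = BASE_HOST) -> Optional[str]:
    """Map a search-bearing subdomain to one URL on the base host.

    Returns None when the request is not for a subdomain of base_host,
    in which case no redirect is needed.
    """
    host = request_host.lower().rstrip(".")
    suffix = "." + base_host
    if not host.endswith(suffix):
        return None

    label = host[: -len(suffix)]
    if not label:
        return None

    if "." in label:
        # Periods: exact phrase, so wrap the words in double quotes.
        query = '"' + " ".join(label.split(".")) + '"'
    else:
        # Hyphens: the same words in any order.
        query = " ".join(label.split("-"))

    # The path and parameter name are assumptions for illustration only.
    return f"http://{base_host}/search?q={quote_plus(query)}"


if __name__ == "__main__":
    print(redirect_target("pink-elephant.example.com"))
```

A Web server would send this target in an HTTP redirect, so that links and crawlers end up on a single hostname rather than on thousands of generated ones.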
To Be Junk or Not To Be Junk
Aside from its suspicious-looking use of subdomains, Gada.be has the feel of a content-free site that might be auto-generated by quick-buck operators. For example, a search results page at Gada.be is spartan and contains no actual information about the topic of the search. The only listings that appear are links to other Web pages. (Of course, the same thing is true of Google results pages.)
Pirillo emphasizes that Gada.be is far more than just auto-generated links surrounded by revenue-producing Google ads. The point of indexing RSS feeds, he says, is that users can integrate them into their own RSS aggregators. To do this, users can add /opml to any search address, as follows:
This produces a screenful of OPML -- Outline Processor Markup Language. In brief, OPML is an XML-based way of formatting lists so they can be used in different operating environments. In the case of Gada.be, the list could be a variety of feeds you want to subscribe to regarding a particular topic.
To get updates on your chosen topic, you import the OPML text into your RSS aggregator. Most aggregators already support the importing of this kind of data. (The feature is bleeding-edge enough, however, that a Gada.be update page says, "FeedDemon is the only one that does it properly.")
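For readers who have not seen OPML, the following sketch builds a small subscription list of the kind an aggregator can import. It is an assumption-laden illustration, not Gada.be's output: the feed names and URLs are invented, and the structure follows the common OPML subscription-list convention of `outline` elements carrying `type`, `text`, and `xmlUrl` attributes.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_opml(title: str, feeds: Iterable[Tuple[str, str]]) -> str:
    """Return an OPML document listing (name, feed_url) pairs."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title

    body = ET.SubElement(opml, "body")
    for name, url in feeds:
        # One outline element per feed the user could subscribe to.
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=url)

    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical feeds for a "pink elephant" search.
    sample_feeds = [
        ("Elephant News", "http://example.com/news.xml"),
        ("Pink Things Blog", "http://example.com/blog.xml"),
    ]
    print(build_opml("pink elephant", sample_feeds))
```

An aggregator that supports OPML import reads each `xmlUrl` and subscribes to the corresponding feed.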
An unrelated service, Wynia.org, is also running an "experiment" using Gada.be to generate a custom RSS feed from any keywords you enter. This seems to be just the beginning of the add-ons people can create using this technique.
The Future of Gada.be
Pirillo and Sweeney are particularly proud of the scalability they say they've built into Gada.be. The service's "asynchronous fetcher" can currently parse as many as 10,000 sources per request, they explain.
Gada.be already returns results from hundreds of sources -- but not from the "organic" search-engine results of Google, Yahoo, and some other sites. At present, those search engines output only their blog and news indexes and various other services as RSS, not their main Web results, the developers say.
They expect this to change as search engines adopt the OpenSearch specification, which is a standard way to output search results as RSS. It's currently being promoted by Amazon.com's A9 and other search engines. Compliance with OpenSearch will be a requirement to integrate a search engine into the search box of Microsoft's forthcoming Internet Explorer 7.0, according to Pirillo.
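To make the idea of search results delivered as RSS concrete, here is a minimal sketch that reads an OpenSearch-style RSS response. The sample document is invented, and the sketch assumes the OpenSearch 1.1 response namespace; feeds published around the time of this article may use an earlier namespace, so treat the constant as something to adjust.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Assumption: the OpenSearch 1.1 response-element namespace.
OPENSEARCH_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Invented sample response for illustration only.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example search: pink elephant</title>
    <link>http://example.com/search?q=pink+elephant</link>
    <description>Hypothetical results</description>
    <opensearch:totalResults>2</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item><title>First</title><link>http://example.com/1</link></item>
    <item><title>Second</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def parse_opensearch_rss(xml_text: str) -> Dict[str, object]:
    """Extract the result count and (title, link) pairs from the feed."""
    channel = ET.fromstring(xml_text).find("channel")
    if channel is None:
        raise ValueError("not an RSS document: no <channel> element")

    total = channel.findtext("os:totalResults", namespaces=OPENSEARCH_NS)
    items = [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]
    return {"total": int(total) if total is not None else None, "items": items}


if __name__ == "__main__":
    print(parse_opensearch_rss(SAMPLE))
```

Because every compliant engine would return the same element names, a metasearch service could merge results from many engines with one parser like this instead of a custom scraper for each.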
Attracting 6,000 users a day is respectable, but it represents a tiny fraction of the users who are now served by Google and other Web giants. Despite this, Gada.be's developers envision the day when their site will need to be scaled across numerous load-balanced servers.
"We've already been approached by four venture capital firms, who looked at it and saw the promise," Pirillo says.
RSS feeds, OPML, OpenSearch -- all of these technologies do indeed show a lot of promise. It's anyone's guess how this will all get put together and who the ultimate winners will be. At this point, Gada.be is an intriguing example of the possibilities, even though it's still a diamond in the rough.