Friday, March 29, 2024

Spock: Web 2.0 Search Engine


Does the Web need another search engine?

Jaideep Singh and Jay Bhatti, the co-founders of Spock, are betting that it does — specifically, one for people search.

With a rumored $7 million in venture capital from Clearstone Venture Partners and Opus Capital Ventures, Spock came out of stealth mode in April at the Web 2.0 Expo in San Francisco, and is currently conducting an invitation-only beta with over 25 thousand users. Talking about how Spock will work and might be used, Bhatti says flatly, “Our goal is to index everyone everywhere.”

Spock originated with the founders’ frustration in managing their connections. With over a thousand people in his address book, Bhatti says, “Every time I started looking for someone, it was impossible to find them if I didn’t know their name exactly. And if I wanted to know, say, everyone who had got their MBA at Stanford and currently lives in Seattle, I could never do that with my Outlook address book or any other application that was out there.” The challenge, he explains, is to create an application that can keep track of all the information in people’s address books while making that information easily accessible.

Spock, he says, “almost is like a del.icio.us for people,” in that it will allow users access to connections from any web-connected computer. The difference, Bhatti says, “is the scale of what we’re trying to do. It’s not a web site. It’s a search engine.”

Bhatti adds that, contrary to appearances, the name is not an allusion to the famous Star Trek character. Rather, he says the name was chosen because it was easy to remember, say, and spell. Another consideration is the company’s hope that “to spock” will soon become a verb like “to google.”

How Spock Beams You Up

Like any search engine, Spock at its most basic is about entering search terms and reading the results. However, one of the things that makes Spock different is that you can also search by meta-tags that are created and voted on by registered users, as on Digg. This feature means that you can look up not only specific people, such as Linus Torvalds, but also specific people in certain circumstances, such as “Dick Cheney shooting scandal” or categories of people, such as “booksellers San Francisco.” All relevant information, ranging from web page links through images and videos to the tags associated with the person, is summarized in the results. Users can then click on the results to find out more information about the person for whom they searched, or on the tags assigned to the person to find other people in the same category.

Currently, the results given by the beta are reasonable, but have noticeable gaps in almost any topic you can think of. However, Bhatti insists that the service will improve as its index grows. He says that Spock already includes at least 100 million names, and is growing steadily. “We’re going to be adding two million unique people every day because we’re crawling the web,” he says. In addition, as people register for Spock, they can speed the process by uploading their address books from their mail readers or their connections on social networking sites. As people add and vote on tags, the site will be enriched even further.

Besides curiosity and managing personal contacts, Bhatti sees many different uses for Spock. The site could be used as a portal for breaking news stories, quickly providing information from both official media and blog sources. Headhunters could use it to find job applicants. Individuals might use it for dating, or for travel or consumer information. Researchers could use Spock to find all related information about a search subject. Since Spock will index dead people as readily as live ones, it could also be used for genealogy research if it grows as planned.

“Spock is going to become the central point for searching for people,” Bhatti says simply. “Everyone will go to us and nowhere else.”

Spock’s Challenges

The basic concept is simple enough, but implementing such a collection of ideas is complex. “One of the biggest things we care about is the accuracy and the relevance of information,” Bhatti says. “We make a very hard effort to see that every source that we crawl on the web has an authority ranking. For example, if ‘president of the United States’ is coming up on a particular person, and we got that information from www.whitehouse.gov, we’ll give that a lot more credibility than we will if that name came up from Facebook.”
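To make the idea of an authority ranking concrete, here is a minimal sketch in Python. It is not Spock's actual method, which the company has not described. The domain scores, the default score, and the function name `score_claims` are all assumptions made up for illustration. The sketch simply adds up the authority of every source that attaches a given tag to a given person.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical authority scores per host; the values are illustrative only.
AUTHORITY = {"www.whitehouse.gov": 0.95, "www.facebook.com": 0.30}
DEFAULT_AUTHORITY = 0.10  # assumed score for hosts not listed above


def score_claims(observations):
    """Sum source authority for each (person, tag) claim.

    observations is an iterable of (person, tag, source_url) tuples.
    """
    scores = defaultdict(float)
    for person, tag, url in observations:
        host = urlparse(url).netloc.lower()
        scores[(person, tag)] += AUTHORITY.get(host, DEFAULT_AUTHORITY)
    return dict(scores)


observations = [
    ("Person A", "president of the United States", "http://www.whitehouse.gov/bio"),
    ("Person B", "president of the United States", "http://www.facebook.com/profile"),
]
scores = score_claims(observations)
```

Under these assumed scores, the claim sourced from www.whitehouse.gov outranks the same claim sourced from a social network profile, which is the behavior Bhatti describes.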

Similarly, Spock’s developers hope to ensure that, as on Google, the results that are most likely to be relevant are near the top of the results.

Another major concern is what Spock calls “entity resolution”: distinguishing between different people who share the same name. This problem is so large that, as part of the pre-launch publicity, Spock is hoping to enlist non-employees in the search for a solution by sponsoring a contest, with results to be judged by Stanford computer science professors and a first prize of $50,000.

Yet another problem is ensuring that assigning tags doesn’t get out of hand. Only registered users are allowed to add tags or vote on them, and who supported each tag will be clearly visible to all users. Then, over time, users’ votes will become weighted depending on their behavior. “You can almost think of it like Wikipedia,” Bhatti says, “where no one can edit George Bush’s profile unless they’ve been registered on Wikipedia for six months and they’re doing good work. The same thing on Spock: certain users will have more power based on the authority they’ve shown on Spock in terms of tagging, voting, and adding other people.”
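The weighting Bhatti describes can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the reputation values, the default weight for unknown users, and the function name `tag_score` are hypothetical, and Spock has not published how it actually weights votes.

```python
def tag_score(votes, reputation, default_weight=1.0):
    """Return the reputation-weighted total for one tag.

    votes is a list of (user, direction) pairs, where direction is
    +1 for a vote in favor of the tag and -1 for a vote against it.
    reputation maps a user name to an assumed voting weight.
    """
    return sum(
        direction * reputation.get(user, default_weight)
        for user, direction in votes
    )


# Hypothetical weights: a long-standing user counts more than a new one.
reputation = {"trusted_user": 3.0, "new_user": 0.5}
votes = [("trusted_user", 1), ("new_user", -1), ("unknown_user", -1)]
total = tag_score(votes, reputation)
```

In this example one trusted user's vote in favor outweighs two votes against from less established users, so the tag keeps a positive score.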

As on Flickr, such trusted users will be able to earn extra privileges, such as the use of a power interface from which they can add bulk tags or edit information. According to Bhatti, Spock is also considering two or three tiers of privileges. “If people are doing really good work, we want to reward that,” Bhatti says.

Conversely, if a user starts adding malicious or slanderous tags, or becomes the object of other people’s complaints, their privileges could be reduced after a review by the company. In extreme cases, the user’s privileges could be removed altogether. Should an admonished user’s behavior improve, they might have their privileges reinstated.

What Bhatti calls “personal identifiers” (email and street addresses of famous people, birth dates, social insurance numbers and similar information) will not be used on Spock. “We’ve made a conscious decision that, even if we find such data, we will never display it on Spock,” Bhatti says. “We will always hide it. Because we don’t think that’s the most relevant thing.” Bhatti also promises that Spock’s bots will not search password-protected sites, or sites whose robots.txt files request otherwise.
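For readers unfamiliar with how a crawler honors such requests, the following Python sketch uses the standard library's `urllib.robotparser` module to check a URL against a site's crawl rules. The rules text, the user agent name "ExampleBot", and the helper name `is_allowed` are assumptions for illustration and say nothing about how Spock's own crawler is built.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules_text, user_agent, url):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the rules as an iterable of lines, so no network
    # request is needed for this example.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules asking all crawlers to stay out of /private/.
rules = "User-agent: *\nDisallow: /private/\n"

blocked = is_allowed(rules, "ExampleBot", "http://example.com/private/profile")
permitted = is_allowed(rules, "ExampleBot", "http://example.com/public/profile")
```

A well-behaved crawler would skip the first URL and fetch only the second.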

Spock vs. Google, Et Al.

As a third-generation search engine, Spock faces a crowd of competitors. However, Bhatti insists that Spock’s research and chosen specialization will be enough to differentiate it from the pack.

For instance, he characterizes fellow newcomer Wink as trying to compete in the same space, but as “more of a social network.” By contrast, Spock’s strategy is “pure search” and based on the development of intellectual property. “We’re making a really determined effort to go at this from a hard science perspective,” Bhatti says.

He considers existing social networks even less of a direct competitor. In fact, he points out that Spock’s growing links to sites like LinkedIn and MySpace can only be mutually beneficial. Spock has no current plans to implement similar services because, if it did, “all of a sudden we’d be competing with the people we’re collaborating with, and that’s not a good thing for Spock.”

Besides, Bhatti adds, social network sites are relatively unprofitable. However, he declined to talk about how Spock might be monetized, except to mention that it would sell contextual ads and that other sources of income would emerge from the company’s intellectual property.

As for Google, Bhatti sees little overlap. “Google does a really good job of organizing searches around web documents,” he explains. “However, at Spock, we organize information around people, and that requires a totally different skill set. When Google looks at a web page, it doesn’t care what the document is about. All it cares about is whether the document has the relevant keyword that you’ve searched for. When Spock builds up a search, it has to care about whether it’s about a person, and, if it is a person, what’s relevant, and what are all the keywords about that person. So the technology and the algorithms for what we do are totally different from Google’s. Our user interface is integrated to show results around people, and nothing else. So that’s going to give us a very unique perspective.”

In many ways, Spock is a mash-up in the best tradition of Web 2.0, taking ideas from other sites and recombining them in the hopes of making something new. Whether that something will become as popular as its founders hope will become obvious after Spock makes its official launch in mid-July.
