The basic concept is simple enough, but implementing such a collection of ideas is complex. "One of the biggest things we care about is the accuracy and the relevance of information," Bhatti says. "We make a very hard effort to see that every source that we crawl on the web has an authority ranking. For example, if 'president of the United States' is coming up on a particular person, and we got that information from www.whitehouse.gov, we'll give that a lot more credibility than we will if that name came up from Facebook."
Similarly, Spock's developers hope to ensure that, as on Google, the results that are most likely to be relevant are near the top of the results.
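To make the idea of an authority ranking concrete, here is a minimal sketch in Python. It is not Spock's actual method, which the company has not published; the scores, the default value, and the function name `rank_tags` are all assumptions chosen for illustration. Each time a tag is seen for a person, it is credited with the authority of the source it came from, and tags are then sorted so the best-supported ones come first.

```python
from collections import defaultdict

# Hypothetical authority scores per source (illustrative values only).
SOURCE_AUTHORITY = {
    "www.whitehouse.gov": 0.95,
    "www.facebook.com": 0.30,
}
# Assumed score for any source without an explicit ranking.
DEFAULT_AUTHORITY = 0.10


def rank_tags(observations):
    """Rank tags for one person by the summed authority of their sources.

    observations: iterable of (tag, source_domain) pairs.
    Returns a list of (tag, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for tag, domain in observations:
        scores[tag] += SOURCE_AUTHORITY.get(domain, DEFAULT_AUTHORITY)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_tags([
    ("president of the United States", "www.whitehouse.gov"),
    ("guitarist", "www.facebook.com"),
    ("guitarist", "example.org"),
])
```

In this sketch, a single mention on a high-authority site outranks two mentions on low-authority ones, which is the behavior Bhatti describes.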
Another major concern is what Spock calls "entity resolution" -- distinguishing one person from another who has the same name. This problem is so large that, as part of its pre-launch publicity, Spock is hoping to enlist outside help by sponsoring a contest for the best solution, with results to be judged by Stanford computer science professors and a first prize of $50,000.
Yet another problem is ensuring that assigning tags doesn't get out of hand. Only registered users are allowed to add tags or vote on them, and who supported each tag will be clearly visible to all users. Then, over time, users' votes will become weighted depending on their behavior. "You can almost think of it like Wikipedia," Bhatti says, "where no one can edit George Bush's profile unless they've been registered on Wikipedia for six months and they're doing good work. The same thing on Spock: certain users will have more power based on the authority they've shown on Spock in terms of tagging, voting, and adding other people."
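One way to picture weighted voting is the short Python sketch below. It is only an illustration under stated assumptions: the `Vote` class, the `tag_score` function, and the weight values are hypothetical, and Spock has not described how it actually computes weights. Each vote counts for or against a tag, scaled by the trust the voter has earned.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    user: str
    value: int  # +1 supports the tag, -1 opposes it


def tag_score(votes, user_weight, default_weight=1.0):
    """Sum the votes on one tag, scaling each by its voter's weight.

    user_weight maps a user name to a hypothetical trust weight;
    users not listed get default_weight.
    """
    return sum(v.value * user_weight.get(v.user, default_weight) for v in votes)


votes = [Vote("alice", 1), Vote("bob", -1), Vote("carol", 1)]
weights = {"alice": 3.0, "bob": 0.5}  # assumed values; carol gets the default
score = tag_score(votes, weights)
```

Because every vote carries a user name, the same record also supports showing who backed each tag.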
As on Flickr, such trusted users will be able to earn extra privileges, such as the use of a power interface from which they can add bulk tags or edit information. According to Bhatti, Spock is also considering two or three tiers of privileges. "If people are doing really good work, we want to reward that," Bhatti says.
Conversely, if a user starts adding malicious or slanderous tags, or becomes the object of other people's complaints, their privileges could be reduced after a review by the company. In extreme cases, the user's privileges could be removed altogether. Should an admonished user's behavior improve, they might have their privileges reinstated.
What Bhatti calls "personal identifiers" -- email and street addresses of famous people, birth dates, social insurance numbers and similar information -- will not be used on Spock. "We've made a conscious decision that, even if we find such data, we will never display it on Spock," Bhatti says. "We will always hide it. Because we don't think that's the most relevant thing." Bhatti also promises that Spock's bots will not search password-protected sites, or sites whose robots.txt files request otherwise.
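For readers unfamiliar with robots.txt, the following Python sketch shows how a well-behaved crawler can check a site's rules before fetching a page. It uses the standard library's `urllib.robotparser`; the helper name `allowed_by_robots`, the bot name "ExampleBot", and the sample rules are assumptions for illustration and say nothing about how Spock's own crawler is built.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text already in hand, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules: every crawler is asked to stay out of /private/.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"

blocked = allowed_by_robots(SAMPLE_RULES, "ExampleBot", "https://example.com/private/page")
permitted = allowed_by_robots(SAMPLE_RULES, "ExampleBot", "https://example.com/public")
```

A crawler that honors the file would skip the first URL and fetch only the second.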
Spock vs. Google, Et Al.
As a third-generation search engine, Spock faces a crowd of competitors. However, Bhatti insists that Spock's research and chosen specialization will be enough to differentiate it from the pack.
For instance, he characterizes fellow newcomer Wink as trying to compete in the same space, but as "more of a social network." By contrast, Spock's strategy is "pure search" and based on the development of intellectual property. "We're making a really determined effort to go at this from a hard science perspective," Bhatti says.