MOUNTAIN VIEW, Calif. — Microsoft brought its brainiacs to Silicon Valley for a road show highlighting the latest cool stuff.
Scientists from Microsoft Research labs in San Francisco and Redmond joined their colleagues at the company’s Mountain View, Calif., campus to showcase speculative projects that could someday find their way into products.
Researchers are working on everything from a Web services-based model of the universe to sneaky ways to foil spammers.
Dan Ling, vice president of Microsoft Research, told an audience of academics, entrepreneurs and business folk that while Research has only a small part of Microsoft’s hefty $7 billion R&D budget, most of the company’s products are influenced by what it does.
For example, the San Francisco lab’s statistical analysis of the Web could find its way into the new search technology Microsoft is readying to go up against Google. Jim Gray, a Microsoft Research Distinguished Engineer, said that a yearlong project to produce a statistical characterization of the Web turned up some interesting and useful trends. Microsoft Research tracked 1 billion Web pages for a year, analyzing what had changed and looking for anomalies.
By keeping track of how many Internet names mapped to the same IP address or how many other pages linked to a single Web page, the technology seems to be able to identify what Gray called “places you don’t want a search engine to go,” such as sites identified with pornography or spam. Microsoft researchers Marc Najork, Mark Manasse and Dennis Fetterly published the research and passed the information to the MSN Search team.
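To make the first of those signals concrete, here is a minimal Python sketch that groups hostnames by the IP address they resolve to and flags addresses serving an unusually large number of names. It is an illustration only, not the researchers' published method: the function names, the input format and the threshold are all assumptions.

```python
from collections import defaultdict

def hosts_per_ip(resolved):
    """Group hostnames by the IP address they resolve to.

    `resolved` is an iterable of (hostname, ip) pairs; this input
    format is an assumption made for the sketch.
    """
    groups = defaultdict(set)
    for hostname, ip in resolved:
        groups[ip].add(hostname)
    return groups

def suspicious_ips(resolved, threshold):
    """Return, in sorted order, the IPs that serve at least `threshold` hostnames.

    The threshold is a hypothetical tuning knob, not a published value.
    """
    groups = hosts_per_ip(resolved)
    return sorted(ip for ip, hosts in groups.items() if len(hosts) >= threshold)
```

A crawler could feed its DNS lookups into `suspicious_ips` and treat the result as one signal among several, since ordinary shared hosting also maps many names to a single address.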
A new algorithm for finding the shortest route could be used for Microsoft MapPoint.Net, Gray said. In tests, its author, Andrew Goldberg, found it delivered a 20-fold improvement in time and memory for the road network of a large state. This improvement could enable shortest-path routing on PDAs. It could be used to offer users real-time advice about traffic congestion or road outages, and it also could enable larger requests, such as driving directions for the shortest cross-country route.
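For readers unfamiliar with the problem, the classical baseline for shortest-route queries is Dijkstra's algorithm, sketched below in Python. This is not Goldberg's new algorithm, whose details were not described at the event; the graph format and the function name are assumptions made for illustration.

```python
import heapq

def shortest_path_length(graph, source, target):
    """Return the length of the shortest route from source to target.

    `graph` maps each node to a list of (neighbor, distance) pairs with
    non-negative distances. Returns None if the target is unreachable.
    """
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None
```

Dijkstra's search expands outward from the start in all directions, which is why time and memory become a concern on networks the size of a large state's roads and on small devices such as PDAs.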
A very long-term project, Ling said, is modular data center software, codenamed Boxwood, that could make large-capacity storage and computation systems cheaper by virtualizing storage, distributing the locking and global state to unify the system, and automating provisioning, error detection and reinitialization.
“We need to get rid of the idea that with our 1,500 CPUs we’re going to have 1,500 different file systems,” Ling told internetnews.com.
One area in which Microsoft Research is helping lead the company is the effort to combat spam. “It’s of great importance to the Hotmail group, which is here in Silicon Valley,” Ling said.
The stats are alarming: 23 percent of e-mail users say spam has reduced their e-mail use, while 76 percent are bothered by offensive or obscene content, and as much as 78 percent of all e-mails are spam.
“It’s something that needs to be undertaken by the community as a whole. Leading e-mail providers are starting to get together to look at common strategies,” he said.
Ling also outlined several approaches, including employing machine learning techniques to automatically identify e-mails that look like spam. With millions of Hotmail users participating in helping to train the software, Ling said, the filters can become very effective over time. Microsoft also is considering “black hole” lists and some form of “postage” that makes it more expensive to send spam, whether that’s charging money, making the computer perform a computation or giving senders a test to prove they’re human. All these could make spamming a little less economical.
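The computational form of postage can be sketched as a simple proof-of-work scheme: the sender must find a number that, hashed together with the message, yields a digest with a required prefix, while the recipient checks it with a single hash. The Python below is an illustrative sketch under that assumption, not a scheme Microsoft described; the function names, the use of SHA-256 and the difficulty setting are all hypothetical.

```python
import hashlib
from itertools import count

def _digest(message, nonce):
    """Hash the message together with a candidate nonce."""
    return hashlib.sha256(f"{message}:{nonce}".encode("utf-8")).hexdigest()

def mint_stamp(message, difficulty=3):
    """Search for the smallest nonce whose digest starts with `difficulty` zeros.

    Each extra hex zero multiplies the sender's expected work by 16.
    """
    prefix = "0" * difficulty
    for nonce in count():
        if _digest(message, nonce).startswith(prefix):
            return nonce

def verify_stamp(message, nonce, difficulty=3):
    """Check a stamp with a single hash, which is cheap for the recipient."""
    return _digest(message, nonce).startswith("0" * difficulty)
```

The asymmetry is the point: a stamp costs little for someone sending a handful of messages, but the work adds up quickly for a spammer sending millions.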
Another project, MindNet, is a semantic network. “Think of it as a bunch of senses of a particular word and relationships between those words,” Ling explained. For example, different words would link to the word bank when it denotes a financial institution than when it refers to the bank of a river.
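A toy version of such a network can be written as a graph whose nodes are word senses and whose edges are labeled relationships. The Python sketch below is only an illustration of the idea; the class, the sense labels such as `bank/finance` and the relation names are assumptions and do not reflect MindNet's actual design.

```python
from collections import defaultdict

class SemanticNetwork:
    """Word senses as nodes, labeled relationships as directed edges."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, sense, relation, other):
        """Record that `sense` stands in `relation` to `other`."""
        self._edges[sense].add((relation, other))

    def related(self, sense):
        """Return the sorted (relation, other) pairs recorded for a sense."""
        return sorted(self._edges.get(sense, set()))


# Hypothetical senses and relations for the two meanings of "bank".
net = SemanticNetwork()
net.relate("bank/finance", "is_a", "institution")
net.relate("bank/finance", "holds", "money")
net.relate("bank/river", "part_of", "river")
```

Because each sense is its own node, asking what relates to the financial sense returns money and institutions, while the river sense leads to an entirely different neighborhood of words.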