Machine learning and the Semantic Web go hand in hand: together, they make it possible to explore and exploit the continuum between structured and unstructured data, and to connect diverse sources of knowledge on a large scale.
Learn how the Semantic Web is changing the way we treat data at the LinkedData Planet Conference. Sir Tim Berners-Lee, inventor of the World Wide Web and director of the W3C, is among the event’s keynote speakers.
“Technically, people used to make strong distinctions between unstructured data in free text and structured data that was digested and put into a database that people could use,” says Dr. William Cohen, associate research professor in Carnegie Mellon University’s Machine Learning Department. Cohen will speak on using machine learning to discover and understand structured and unstructured data at the LinkedData Planet Conference, June 17-18 in New York.
“But there is a continuum between these. Web sites, for instance, have information with some structure — tables and lists, often derived from an underlying database but presented in a way people can understand. It’s intended for the human user, not the computer,” Cohen says.
For the Semantic Web’s capabilities to be realized, machine learning is needed to make connections among these pieces of information, whatever their format and whatever their source, on a large scale. Consider, for example, a large organization built through many acquisitions, where different sub-organizations have different relationships with the same customer, expressed in different formats. Understanding that customer in the context of the whole organization through traditional rules-engineering approaches is laborious and technically hard, and many knowledge-engineering approaches break down as data sources grow larger and more diverse.
“The way it’s done today, it’s labor intensive and costly. The goal is to do it better, faster, and cheaper, and on a broader scale,” Cohen says.
The job of machine learning is to figure out what the rules ought to be: for example, putting data from two different stores into a unified format, where one store may list the customer’s name first and the other the product sold to it. Usually the complexities aren’t so easily resolved, so writing a rule that makes two entries look exactly the same, so they can go into one database with a consistent set of keys and a consistent user experience, can be time-consuming and difficult. Most likely, though, there is some sort of tag, or metadata, that conclusively identifies an item, such as a SKU.
“So you can look at those IDs and say these objects are probably the same because they have this consistent ID, and from those you can figure the mapping out to be this,” Cohen says. “A person would do this, but the key thing is to get the machine to do the same thing… to automatically figure out what the rules ought to be for all 100 companies you deal with, and there’s no process that involves human labor. If you can do that, it’s a huge win.”
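The idea Cohen describes can be sketched in a few lines of Python: two stores export the same items with their columns in different orders, records that share a SKU are assumed to describe the same item, and the aligned pairs are used to vote on which column in one store corresponds to which column in the other. All field names and sample records below are invented for illustration, not drawn from any real system.

```python
from collections import Counter

# Hypothetical exports: same items, different column orders.
store_a = [  # schema: (sku, customer, product)
    ("SKU-001", "Acme Corp", "Widget"),
    ("SKU-002", "Globex", "Gadget"),
]
store_b = [  # schema: (product, customer, sku)
    ("Widget", "Acme Corp", "SKU-001"),
    ("Gadget", "Globex", "SKU-002"),
]

def infer_column_mapping(rows_a, rows_b, key_col_a, key_col_b):
    """Pair records by a shared key (the SKU), then vote on which
    column of B matches each column of A, based on identical values."""
    b_by_key = {row[key_col_b]: row for row in rows_b}
    votes = Counter()
    for row_a in rows_a:
        row_b = b_by_key.get(row_a[key_col_a])
        if row_b is None:
            continue  # no record in B shares this key
        for i, val_a in enumerate(row_a):
            for j, val_b in enumerate(row_b):
                if val_a == val_b:
                    votes[(i, j)] += 1
    # Greedily keep the best-supported (column of A -> column of B) pairs.
    mapping, used_b = {}, set()
    for (i, j), _count in votes.most_common():
        if i not in mapping and j not in used_b:
            mapping[i] = j
            used_b.add(j)
    return mapping

print(infer_column_mapping(store_a, store_b, key_col_a=0, key_col_b=2))
# {0: 2, 1: 1, 2: 0} -- sku, customer, product columns correctly aligned
```

Real data is messier, of course: values rarely match exactly across sources, so a production system would vote on approximate string similarity rather than equality. But the shape of the inference, from a trusted shared ID to a learned mapping, is the same.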
Not without complications, however, especially in the implications for infrastructure.
“If you use machine learning to construct these rules, that forces you to come to grips with the fact that some rules, because they are learned from data, will be inaccurate,” says Cohen. That could have consequences across the entire business cycle: ordering, billing, supplying. Scalability also has to become a much greater consideration. “There is a lot of work on things that work well for 10,000 data points, but we’re ten years off from having them work on 100 million data points,” he says, and ten years from now we’ll probably be closer to 10 billion data points anyway. “The amount of data is growing very quickly, so the technologies we work on, we have to really understand their scalability.”
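Cohen’s caveat about inaccuracy suggests a practical discipline: before a learned rule is allowed to drive ordering or billing, score it against held-out pairs that a human has already labeled. The sketch below shows the standard precision and recall calculation for a toy matching rule; the rule, the company names, and the labels are all invented for illustration.

```python
def learned_rule(rec_a, rec_b):
    # Hypothetical learned rule: match if names agree case-insensitively.
    return rec_a["name"].lower() == rec_b["name"].lower()

# Held-out pairs with human-verified labels (True = same customer).
held_out = [
    ({"name": "Acme Corp"}, {"name": "ACME CORP"},  True),
    ({"name": "Globex"},    {"name": "Globex Inc"}, True),   # rule misses this one
    ({"name": "Initech"},   {"name": "Initech"},    True),
    ({"name": "Hooli"},     {"name": "Hooli"},      False),  # distinct firms, same name
]

tp = sum(1 for a, b, y in held_out if learned_rule(a, b) and y)
fp = sum(1 for a, b, y in held_out if learned_rule(a, b) and not y)
fn = sum(1 for a, b, y in held_out if not learned_rule(a, b) and y)

precision = tp / (tp + fp)  # how often a predicted match is right
recall = tp / (tp + fn)     # how many true matches the rule finds
print(round(precision, 2), round(recall, 2))
# 0.67 0.67 -- one false match and one missed match on four test pairs
```

Numbers like these are what tell an organization whether a learned rule is safe to deploy, needs more training data, or should fall back to human review for low-confidence cases.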