But that only gives part of the picture since 85 percent of an organization's knowledge isn't in databases. To get at the rest, a new generation of text mining tools allows companies to discover relationships and summarize information from large stores of previously unanalyzed data.
Structured and Unstructured
Information breaks down into two broad categories: structured and unstructured. Structured data is what we find in databases, where every bit of information has an assigned format and significance.
Companies have been using data mining software for years to extract business intelligence from their structured data. Because the database fields are clearly defined, it is easy to run queries and formulas that extract meaningful information, not just raw data. Computers are great at handling massive quantities of structured information, something people have a hard time doing.
Unstructured data is what we find in emails, reports, PowerPoint presentations, voice mail, phone notes, agendas and photographs. Shaku Atre, president of the Santa Cruz, California, business intelligence consultancy Atre Group, points out that much of this type of information is better referred to as semi-structured, since it contains structured metadata such as email headers or the revision dates in Word documents. For simplicity, we will group the entire spectrum of data that is less structured than database entries under the term "unstructured."
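To make the "semi-structured" point concrete, here is a minimal sketch in Python that splits an email into its structured header fields and its free-text body, using the standard library's email parser. The sample message and the split_message helper are invented for illustration and do not come from Atre or any product mentioned here.

```python
from email import message_from_string

# Hypothetical message, invented for illustration only.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Subject: Q3 supplier review
Date: Mon, 02 Jun 2003 09:15:00 -0500

Bob, the new supplier looks promising but delivery times worry me.
"""

def split_message(raw):
    """Return (structured header fields, unstructured body text)."""
    msg = message_from_string(raw)
    # Headers behave like database fields: fixed names, predictable values.
    headers = {name: msg[name] for name in ("From", "To", "Subject", "Date")}
    # The body is free text; nothing tells a program what it means.
    body = msg.get_payload()
    return headers, body

headers, body = split_message(RAW_MESSAGE)
print(headers["Subject"])
print(body.strip())
```

The headers can be queried like any database column, while the body still has to be read and interpreted, which is the part text mining tools aim to help with.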
Unstructured data typically makes up about 85 percent of an organization's knowledge stores, but it is not always easy to find, access, analyze or put to use.
"We are drowning in information but are starving for knowledge," says Mani Shabrang, technical leader in research and development at The Dow Chemical Company's business intelligence center in Midland, Michigan. "That information is only useful when it can be located and then synthesized into knowledge."
Running full-text queries to find keywords is one way to locate text information, but it is severely limited. It still relies on a human to read that information, spot the relationships and convert it into useful knowledge. One problem lies in determining the true meaning and importance of language.
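The limits of keyword search are easy to see in a small sketch. The Python example below assumes three made-up memos and a hypothetical keyword_search helper; it illustrates plain keyword matching, not any particular vendor's search engine.

```python
import re

# Hypothetical documents, invented for illustration only.
DOCUMENTS = {
    "memo-1": "The plant in Midland will expand production next year.",
    "memo-2": "Water the plant in the lobby every Friday.",
    "memo-3": "Our factory output rose after the upgrade.",
}

def keyword_search(docs, keyword):
    """Return the ids of documents containing the keyword as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [doc_id for doc_id, text in docs.items() if pattern.search(text)]

hits = keyword_search(DOCUMENTS, "plant")
print(hits)  # ['memo-1', 'memo-2']
```

A search for "plant" returns the memo about the lobby greenery alongside the one about production, and misses the memo that says "factory" instead. A person still has to read the results to sort out which meaning was intended and what was left out.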