Why Your Big Data Needs Good Algorithms

Big Data is nothing without a superior algorithm, and that will be where you compete in the future.
Posted November 25, 2015
By Andy Patrizio


You've jumped head first into Big Data. You upped your storage, added all kinds of new data streams and now have petabytes more data than you did a year ago. Your Data Lake is fuller than any reservoir in California.

Now what are you going to do with it?

The fact is, data in and of itself is worthless, big or not-so-big, without processing to extract knowledge and insight. The value lies in the algorithms you run on it, says Gartner. At its recent CIO Symposium, some of its top analysts took the stage to say the era of the algorithm will make Big Data worth something.

"Big data is not where the value is," Peter Sondergaard, senior vice president of Gartner, told the assembled crowd. "Data is inherently dumb. It doesn't do anything unless you know how to use it and act with it. Algorithm is where the real value lies. Algorithms define action."

One could argue that has always been the case with business intelligence, but Frank Buytendijk, Gartner vice president and distinguished fellow, said the aim of BI has always been more intangible and indirect. "What is becoming more important now is 'prescriptive analytics,' in other words, the algorithms that interpret data and immediately take action," he said.

"Data will tell you everything, and data is absolutely growing. The problem is people can't deal with the volumes of data. Data scientists can do amazing things but it's only that top level data that's valuable," said Andi Mann, chief technology advocate for Splunk, a platform provider of operational intelligence.

He added that the problem with algorithms is they can be static, doing the same thing over and over with the data. "Which is fine, in some cases. But if you are trying to disrupt a market, build a business in a new region, you need flexibility. Can they reprocess data with a new lens? Algorithms won't do that. They are only as smart as they were when they were created," said Mann.

So in addition to understanding your data, algorithms also need to adapt. That's a tall order. But Gartner believes we are entering a post-app world, where algorithms (often in the form of smart agents like Siri and Cortana) are taking over. Sondergaard predicted that by 2020, smart agents will handle 40% of interactions and Microsoft will be building its strategy around Cortana, not Windows.

Sondergaard says algorithms spot business moments and meaningful connections, and predict ill behaviors and threats. Therefore, the CIO should be the strategic voice on the use of information. "Calculate the value of your algorithms," he said. "Be an algorithmic business."

Where Do They Come From?

The good thing about algorithms is they are not expensive pieces of code that cost a fortune. Algorithms are often taken from the public domain, such as university research, and then heavily adapted to the customer's needs.

The bad news? The public domain ones are usually generic and you will need to spend time modifying them for your business.

"Most of the algorithms we use are not home grown. The algorithm is written as a piece of code and probably published as a research paper. That's powerful and we share many of these," said Ihab Ilyas, co-founder of Tamr, a Big Data platform for unifying multiple data sources.

So holding on to algorithms isn't necessarily the right strategy. Buytendijk said it is a business where giving away your product makes sense, and that a framework for the new algorithm economy is "give, take, and multiply." He explained that it all starts with "value": how you can generate the most value with your capabilities, technologies and information.

"Every organization has information or capabilities that are worth more when you share it, compared to keeping it for yourself. So it starts with sharing. But it goes two ways, it’s give and take. Once you make a habit of sharing, there are whole other networks of information and computing power opening up," said Buytendijk.

"Algorithms can be bought and sold but there hasn't been significant commoditization around Big Data yet," said Mann. "There's nothing to stop someone from sourcing algorithms externally. The problem is that external providers need to know your business in ways you may not know. That's where consultants come in," he said.

One of the major challenges is that data sources often exist independently of each other and are unstructured but need to be combined, said Sven Junkergård, chief technology officer of Zephyr Health, an Insights-as-a-Service firm in healthcare. Building those connections is usually the hard part.

"Once you have assembled data and connected it all, we can run algorithms on top of that to figure out how important it is. For us, we have algorithms to build connections. Algorithms range from trivial to hard. Coming up with the first approach to bear fruit, that is by far the most difficult thing," he said.

And for Zephyr, there is a fair amount of taking, in the form of using existing algorithms. "Most things you do today are taking an existing algorithm and applying it to the new area. A lot of that takes place at universities. Most of the time you adapt [existing algorithms] to your system. The vast majority of things are evolutionary approaches," said Junkergård.

Buytendijk added that there needs to be clear governance around who owns the algorithms. "I don’t mind who owns them, as long as they are clearly owned and governed. We’ve seen too many algorithms go out of control already, harming the business," he said.

Flexibility a Must

One thing algorithms need is fluidity, for multiple reasons. The first and most obvious reason is that data changes. Incoming data may change, a new source may be added, or a source may be dropped. This is where machine learning comes in, something that has been around for a while but has not been broadly used.

For Big Data, machine learning is seeing a revival because companies can't wait for human intervention or for people to realize things have changed and they need to alter the algorithms. The algorithms need to change on the fly.
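As a rough illustration of what adapting "on the fly" can mean, here is a minimal Python sketch of an estimate that updates itself with every new observation instead of waiting for someone to retune it. The class name `OnlineMean` and the `alpha` weighting parameter are hypothetical, chosen only for this example; real machine learning systems are far more involved.

```python
class OnlineMean:
    """Exponentially weighted running estimate (illustrative only).

    Each new observation nudges the estimate toward itself, so the
    estimate follows the data as it drifts, with no human retuning.
    """

    def __init__(self, alpha=0.5):
        # alpha is a hypothetical tuning knob: the closer to 1.0,
        # the faster the estimate follows new data.
        self.alpha = alpha
        self.estimate = None

    def update(self, value):
        if self.estimate is None:
            # The first observation becomes the starting estimate.
            self.estimate = float(value)
        else:
            # Move the estimate a fraction alpha of the way to the new value.
            self.estimate += self.alpha * (value - self.estimate)
        return self.estimate


if __name__ == "__main__":
    model = OnlineMean(alpha=0.5)
    for observation in [10, 20, 20]:
        print(model.update(observation))
```

The point of the sketch is the shape of the loop: data arrives, the model adjusts, and nobody has to notice that conditions changed before the algorithm responds.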

Many machine learning algorithms come with a product, as in advertising networks. They have the ability to alter what is presented based on changes in users or other variables. Usually these adaptations are kept close to the vest because they become a competitive advantage, as opposed to algorithms in the public domain.

Then there's the human element. Algorithms can be pretty smart but they don't have human understanding. Last year, during a hostage situation in Sydney, Australia, everybody wanted to get out of downtown and they all called the startup taxi service Uber. Uber has an automated dynamic pricing algorithm to match supply and demand, which usually works in a market-driven scenario.

But in this case, the Uber algorithm didn't take into account there was a hostage crisis and when demand for drivers skyrocketed, prices went up 400%. "It made Uber look cold and uncaring, like they were taking advantage of the situation. A more sophisticated algorithm would have capped the increase and have spotted the unusual nature of the spike," said Buytendijk.

Uber later corrected the situation and offered people their money back but the damage was done and Uber got a black eye. You can't program an algorithm to have a heart. "Algorithms act on an organization's behalf. They'd better be smart and trustworthy," said Buytendijk.
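To make Buytendijk's suggestion concrete, here is a minimal Python sketch of a pricing rule that caps the increase and flags an unusual demand spike for human review. This is not Uber's algorithm; the function name `surge_multiplier`, the 2.0 cap, and the z-score threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def surge_multiplier(demand, supply, history, cap=2.0, z_threshold=3.0):
    """Return (price_multiplier, needs_review).

    demand, supply: current ride requests and available drivers.
    history: recent demand figures for comparable periods.
    cap and z_threshold are hypothetical values, not real settings.
    """
    # Basic market rule: price rises with the demand-to-supply ratio,
    # but never drops below the normal fare.
    if supply <= 0:
        raw = cap
    else:
        raw = max(1.0, demand / supply)

    # Cap the increase so a runaway spike cannot push prices without limit.
    multiplier = min(raw, cap)

    # Flag demand far outside the historical range so a person can
    # check whether something unusual is going on.
    needs_review = False
    if len(history) >= 2:
        mu = mean(history)
        sigma = pstdev(history)
        if sigma > 0 and (demand - mu) / sigma > z_threshold:
            needs_review = True

    return multiplier, needs_review


if __name__ == "__main__":
    usual_demand = [100, 110, 90, 105, 95]
    print(surge_multiplier(120, 100, usual_demand))  # ordinary busy period
    print(surge_multiplier(500, 100, usual_demand))  # extreme spike
```

The cap limits the damage automatically, while the review flag hands the judgment call back to a human, which is the part an algorithm cannot supply on its own.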

Conclusion

At the Symposium, Sondergaard said CIOs need to do three things: inventory their algorithms; assign ownership of them; and classify them. Buytendijk expanded on these steps.

"Taking inventory is needed because it is pretty bad if one algorithm runs away and you didn't even know it existed," he said, in reference to the Uber example. "With assigning ownership, it sounds logical but is hardly done. You create a go-to place. And classify which ones are really important to the business and protect them. Or give them away. Goldman Sachs gives customers access to its trading systems so they can trade themselves. It may cost commissions but creates deeper ties," he said.
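The three steps (inventory, ownership, classification) could be tracked in something as simple as the following Python sketch. All names here (`AlgorithmInventory`, `AlgorithmRecord`, the `PROTECT` and `SHARE` labels, and the sample entries) are hypothetical and only illustrate the idea; Gartner did not prescribe any particular tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    PROTECT = "protect"  # important to the business, keep in-house
    SHARE = "share"      # worth more when given away


@dataclass
class AlgorithmRecord:
    name: str
    owner: str
    classification: Classification


class AlgorithmInventory:
    """A go-to place listing every algorithm, its owner, and its class."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Refuse entries without an owner: ownership is the whole point.
        if not record.owner:
            raise ValueError(f"algorithm {record.name!r} has no owner")
        self._records[record.name] = record

    def owner_of(self, name):
        return self._records[name].owner

    def by_classification(self, classification):
        # Sorted names of all algorithms in the given class.
        return sorted(
            r.name for r in self._records.values()
            if r.classification is classification
        )


if __name__ == "__main__":
    inventory = AlgorithmInventory()
    inventory.register(
        AlgorithmRecord("dynamic-pricing", "pricing-team", Classification.PROTECT)
    )
    inventory.register(
        AlgorithmRecord("record-matching", "data-team", Classification.SHARE)
    )
    print(inventory.owner_of("dynamic-pricing"))
    print(inventory.by_classification(Classification.SHARE))
```

Rejecting unowned entries at registration time is one way to enforce the rule that no algorithm should be running without someone accountable for it.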






