Slammer, Blaster and SoBig may have shaken your confidence in IT security. Why do these threats continue to plague us, and how can your organization be better prepared to handle the next piece of malware that appears on the Internet?
The key to IT security confidence is effective security event management. And the secret to effective security event management is real-time correlation — integrating and analyzing data from all your security systems.
Architecturally, there are two principal approaches used in correlation today. One is query-based correlation, which relies on all the data being present in a database for later analysis. The other uses in-memory techniques. Additionally, there are at least two fundamentally distinct ways of examining data during the correlation process itself — rules-based and state-based.
Query-Based Data Warehousing
Because relational databases are a well-known technology with skills readily available, this approach has the advantage of being more comfortable for many end-user organizations. When used for correlation, sophisticated queries are executed against the event data warehouse to identify important relationships that indicate a potential threat against your organization.
However, the stream of inbound events generates massive, sustained insert rates. As data tables grow, they need to be aggressively managed in order to maintain performance. Additionally, it is difficult to optimize a database solution for both high sustained insert rates and querying, and it is the querying that performs the correlation.
You will also have a certain inevitable latency between event data arriving and it being correlated, depending on the database implementation, since the data must first be stored and then retrieved during the correlation process.
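As a rough illustration of the store-then-query cycle, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the event warehouse. The table layout, the event kinds (login_failed, login_ok), the 60-second window and the threshold of three are all assumptions made for the example, not features of any particular product.

```python
import sqlite3

def build_warehouse(events):
    """Store every event first; correlation only happens later, by query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (ts INTEGER, src TEXT, kind TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    conn.commit()
    return conn

def correlate_by_query(conn, window=60, threshold=3):
    """Find sources with `threshold` or more failed logins in the
    `window` seconds leading up to a successful login."""
    sql = """
        SELECT s.src
        FROM events AS s
        JOIN events AS f
          ON f.src = s.src
         AND f.kind = 'login_failed'
         AND f.ts BETWEEN s.ts - ? AND s.ts
        WHERE s.kind = 'login_ok'
        GROUP BY s.src, s.ts
        HAVING COUNT(*) >= ?
    """
    rows = conn.execute(sql, (window, threshold)).fetchall()
    return sorted({row[0] for row in rows})

# Hypothetical sample data: (timestamp, source address, event kind)
SAMPLE_EVENTS = [
    (100, "10.0.0.5", "login_failed"),
    (110, "10.0.0.5", "login_failed"),
    (120, "10.0.0.5", "login_failed"),
    (130, "10.0.0.5", "login_ok"),
    (140, "10.0.0.9", "login_failed"),
    (150, "10.0.0.9", "login_ok"),
]
```

Nothing is flagged until someone runs correlate_by_query, which is where the latency described above comes from: the query can only see what has already been inserted.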
From an application perspective, the correlation is only as good as the data contained in the warehouse. This may be good enough for many uses, but a query-based solution will be unable to easily acquire more data from additional sources to enrich the correlation process. This limitation becomes increasingly important if you are trying to tie your security correlation into downstream effects — a warehouse-based solution is unlikely to be able to take into account an event’s actual impact during correlation.
In-Memory Event-Driven Correlation
The alternate approach is “computational correlation,” where the correlation server processes the inbound event in memory as it arrives, without requiring a database to feed the correlation process.
These solutions are typically much faster than query-based solutions, and scale out more easily. This is because the latency involved with the database store/query cycle is no longer part of the correlation process. This can give the computational approach faster performance, greater architectural flexibility, and lower bandwidth needs.
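To make the contrast concrete, the following sketch evaluates each event in memory the moment it arrives, with no database in the path. The class name, event kinds, window and threshold are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

class InMemoryCorrelator:
    """Keeps a short sliding window of failed logins per source in memory."""

    def __init__(self, window=60, threshold=3):
        self.window = window
        self.threshold = threshold
        self.failures = defaultdict(deque)  # src -> timestamps of recent failures

    def on_event(self, ts, src, kind):
        """Process one event as it arrives; return an alert string or None."""
        recent = self.failures[src]
        # Expire failures that have fallen out of the time window.
        while recent and recent[0] < ts - self.window:
            recent.popleft()
        if kind == "login_failed":
            recent.append(ts)
            return None
        if kind == "login_ok" and len(recent) >= self.threshold:
            recent.clear()
            return f"possible brute force from {src}"
        return None

# Hypothetical event stream: (timestamp, source address, event kind)
STREAM = [
    (100, "10.0.0.5", "login_failed"),
    (110, "10.0.0.5", "login_failed"),
    (120, "10.0.0.5", "login_failed"),
    (130, "10.0.0.5", "login_ok"),
    (140, "10.0.0.9", "login_failed"),
    (150, "10.0.0.9", "login_ok"),
]

def run(stream):
    correlator = InMemoryCorrelator()
    alerts = []
    for ts, src, kind in stream:
        alert = correlator.on_event(ts, src, kind)
        if alert:
            alerts.append(alert)
    return alerts
```

Here the alert is raised inside on_event, at the moment the successful login is seen, rather than whenever a later query happens to run.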
Also, since these solutions aren’t constrained by the data in the database, it may be easier for a vendor to enable them to fetch contextual information from other sources, potentially delivering higher value in the eventual result.
However, all data from a memory-based solution eventually needs to get into a database for reporting and analysis purposes. Because that persistence step affects long-term performance, it is important to understand when and how a memory-based solution delivers it: whether the storage happens concurrently with correlation, or consecutively.
Correlation Definitions
Riding on top of these different architectures are various ways of defining the correlation itself (the linking of apparently independent events). Rules-based approaches are procedural and use programming-language-like syntax (perhaps aided by some kind of GUI wizard), so they are relatively easy to comprehend.
One way to think about rules is that they’re typically used to define “attack signatures.” As a result, rule-sets tend to mushroom over time, and can become difficult to manage. Rules tend to be modified reactively, as the result of an attack getting through — they don’t know what they don’t know.
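A rule set of this kind might look something like the sketch below, where each rule is an ordered sequence of event kinds that acts as a signature. The rule names and event kinds are invented for illustration. Real products use their own rule languages.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    sequence: tuple  # event kinds that must appear in this order

def match_rules(rules, kinds):
    """Return the names of rules whose sequence appears, in order, in `kinds`."""
    hits = []
    for rule in rules:
        remaining = iter(kinds)
        # `step in remaining` consumes the iterator up to the first match,
        # so the steps must be found one after another, in order.
        if all(step in remaining for step in rule.sequence):
            hits.append(rule.name)
    return hits

# Hypothetical signatures; each new attack pattern means another entry.
RULES = [
    Rule("scan-then-exploit", ("port_scan", "exploit_attempt")),
    Rule("worm-propagation", ("exploit_attempt", "outbound_scan")),
]
```

Given the events port_scan, login_ok, exploit_attempt, only the first rule fires. An attack that fits neither sequence passes unnoticed until someone writes a third rule for it, which is how rule sets grow.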
State-based approaches, however, tend not to look for specific event sequences (if A then B, then deduce C); rather they can be used to detect higher-level patterns of behavior that indicate threats.
The advantage of this approach is that state-based tracking doesn’t tend to expand at nearly the same rate as rules-based approaches. State-based solutions also may be more robust to sequence errors (say when B arrives before A, perhaps due to a network problem).
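One possible reading of state-based tracking is sketched below: instead of matching an ordered sequence, the tracker accumulates what it has observed about each host and raises an alert once the overall picture looks threatening. The behavior weights and the alert level are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical weights for suspicious behaviors; not taken from any product.
WEIGHTS = {"port_scan": 2, "exploit_attempt": 4, "outbound_scan": 4}

class HostStateTracker:
    """Tracks which suspicious behaviors each host has shown so far."""

    def __init__(self, alert_at=6):
        self.alert_at = alert_at
        self.seen = defaultdict(set)  # host -> set of behaviors observed

    def score(self, host):
        """Sum the weights of every distinct behavior seen for this host."""
        return sum(WEIGHTS[kind] for kind in self.seen[host])

    def on_event(self, host, kind):
        """Update the host's state; return True once it crosses the alert level."""
        if kind in WEIGHTS:
            self.seen[host].add(kind)
        return self.score(host) >= self.alert_at

def final_verdict(kinds, host="10.0.0.5"):
    """Feed a list of event kinds for one host and report the last result."""
    tracker = HostStateTracker()
    result = False
    for kind in kinds:
        result = tracker.on_event(host, kind)
    return result
```

Because the state is a set rather than a sequence, final_verdict gives the same answer whether port_scan arrives before exploit_attempt or after it. There is also no entry that names a specific worm, only behaviors.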
On the other hand, the very power of state-based solutions means that you cannot necessarily point to a rule that specifically identifies, say, Code Red, because state-based approaches typically run at a higher level of abstraction. They’ll catch a Code Red, but there’s no rule specifically for it. As an alternative approach, this can be powerful, but takes some getting used to.
Picking the Approach
When reviewing real-time correlation approaches, you need to weigh the different options against your own perceived needs. Key questions and issues to decide may include:
Ultimately, if you are considering a real-time threat identification solution, nothing beats a comprehensive evaluation, selection and testing process. But being an informed buyer makes it much more likely that the solution you pick will keep meeting your needs over the long term.
Phil Hollows is vice president of security products for OpenService, a vendor of network security event management products.