So what was the problem? The answer is simple: it turned out that, by design, wire transfers of over $100K were querying the mainframe nearly 100 times, while other transfers queried it only a couple of times. Same end-user, same application, same transaction, but a single parameter sent the transaction down an entirely different path and made the difference between a 3-second and a 2-minute response time.
Now the question is: why can't existing monitoring tools identify the problem? The reason is simple: traditional monitoring tools monitor the infrastructure, not the transactions.
In a complex heterogeneous infrastructure, there are many tools for monitoring each and every component, but no single spinal cord that can show how transactions behave across components. None of the tools can deterministically correlate a single request coming in to a server with all of the associated requests going out of that server, and keep doing so throughout the Transaction Path. It is just like the chopper that could not figure out which of the vehicles coming out of the tunnel contained the suspect who went into the tunnel in the first place.
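To make the idea of correlation concrete, here is a minimal sketch in Python. It assumes every hop between components has already been tagged with a correlation ID at the entry point; the `Hop` record, the `calls_per_target` helper, the transaction IDs, and the call counts are all hypothetical and chosen only to mirror the wire-transfer story, not taken from any particular product.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One request from one component to another (hypothetical record)."""
    txn_id: str   # assumed correlation ID, assigned when the request enters
    source: str   # component that sent the request
    target: str   # component that received it


def calls_per_target(hops):
    """Group hops by transaction and count the calls made to each target."""
    counts = defaultdict(Counter)
    for hop in hops:
        counts[hop.txn_id][hop.target] += 1
    return counts


# Illustrative data only: a small transfer touches the mainframe twice,
# a large one touches it 98 times ("nearly 100" in the story above).
hops = (
    [Hop("wire-small", "web", "app")]
    + [Hop("wire-small", "app", "mainframe")] * 2
    + [Hop("wire-large", "web", "app")]
    + [Hop("wire-large", "app", "mainframe")] * 98
)

summary = calls_per_target(hops)
for txn_id, targets in summary.items():
    print(txn_id, dict(targets))
```

With per-transaction counts like these, the outlier stands out immediately, even though every individual server involved would look healthy to a component-level monitor.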
This situation raises some strategic questions regarding your monitoring approach. How effective is a monitoring framework without that business context? Are you supposed to just make sure the servers are up and running and the applications are responding, or is your real goal to make sure that business transactions are being executed as intended and on time?
Applications are tricky, and transactions are tricky. They become even trickier in a complex heterogeneous infrastructure composed of multiple platforms, operating systems, application nodes, tiers, and databases, where components talk to each other back and forth in different protocols for every single click of a button by an end-user.
Only by being able to trace each and every single transaction activation throughout its entire path - 100% of the time, for all transactions, across all components - will you be able to systematically collect the granular information needed to get business-contextualized visibility into your datacenter. This kind of visibility is a key factor in being able to identify problems effectively when, or even before, they arise.
W. Edwards Deming said, "In God we trust; all others must bring data." I think he was absolutely right. IT Operations can use choppers, or CSI crime-lab detectives, or Jack Bauers. They all have their roles, but when it comes to fast and effective problem identification, as well as many other IT-related decision-making processes (that's a whole different article), real, accurate data is required: no partial data, no assumptions.
Business Transaction Management provides you with that data, and by doing so, it provides your IT organization with visibility and predictability. Wouldn't it be great if you could go to sleep at night knowing that your infrastructure is reliable? That is, unless you want to play the role of the CTU Director.
By the way, the 8th season of 24 premieres on January 17th, 2010.
Nir Livni is the Director of Product Management at Correlsense.