Tuning ERP systems for peak performance


Step 4:
Conduct unit tests to ensure all required functions work

Based on the performance requirements defined in Step 1, all identified functions, such as adding an asset or generating an accounts payable report, should first be tested in isolation to ensure each task can be accomplished end-to-end using the system. This is a precursor to the remaining tests. Unit tests verify that the product works the way it's supposed to. Although the unit tests are conducted in isolation from one another, all the components of the system are still expected to be in place and functioning correctly; they simply aren't under load at that point. Automated testing tools can be used to accomplish these tasks, along the lines of the sketch below.
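As a rough sketch of what such an automated check might look like, the following Python unit test exercises an "add asset" function end-to-end. The erp_client module and its add_asset/get_asset calls are hypothetical stand-ins for whatever API your ERP product exposes, not part of any real package:

    import unittest

    # Hypothetical wrapper around the ERP system's API; the module name
    # and function signatures are assumptions for illustration only.
    from erp_client import add_asset, get_asset

    class AssetFunctionTest(unittest.TestCase):
        """Exercise one business function end-to-end, in isolation."""

        def test_add_asset_round_trip(self):
            # Submit a new asset record through the system...
            asset_id = add_asset(name="Forklift 12", cost=18500.00)
            # ...and verify it can be read back intact.
            asset = get_asset(asset_id)
            self.assertEqual(asset["name"], "Forklift 12")
            self.assertAlmostEqual(asset["cost"], 18500.00)

    if __name__ == "__main__":
        unittest.main()

A suite of such tests, one per critical task from Step 1, can then be rerun automatically after every configuration change.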

Step 5:
Conduct an integration test to ensure compatibility and connectivity between all components

This step is critical: it's the first time the entire system is put to the test. All functions, subfunctions, and their interactions are tested concurrently while the computing environment and all its components are active. The main purpose of the test is to ensure that all components can "talk" to each other, and to reveal any compatibility and connectivity issues that exist. These issues must be fixed before studying and baselining the performance behavior of each component in the system.

Integration testing can flush out any number of problems, such as a connectivity time-out for the 1,001st user when five application servers are each configured to serve 200 users. Another example is a misconfiguration pointing to different versions of connectivity libraries. Completing this step successfully ensures that the system works in its entirety, even if it is not yet at an acceptable level of performance.
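A crude way to reproduce the time-out scenario above is to open more concurrent sessions than the servers are sized for and watch where the failures begin. The Python sketch below does this with plain TCP connections; the host name, port, and the session count of 1,001 are assumptions tied to the earlier example, not universal values:

    import socket
    from concurrent.futures import ThreadPoolExecutor

    APP_SERVER = ("appserver.example.com", 8000)  # hypothetical host/port

    def open_session(n):
        """Try to open one client connection; report success or failure."""
        try:
            s = socket.create_connection(APP_SERVER, timeout=10)
            s.close()
            return (n, "ok")
        except OSError as exc:
            return (n, f"failed: {exc}")

    # Drive 1,001 concurrent sessions against servers sized for 1,000 total.
    with ThreadPoolExecutor(max_workers=1001) as pool:
        for n, status in pool.map(open_session, range(1, 1002)):
            if status != "ok":
                print(f"session {n}: {status}")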

Step 6:
Launch all monitoring tools

In this phase we kick off all the monitoring tools put in place during Step 2. It's important to baseline and track the resource consumption of the monitoring tools themselves, since they can skew the test measurements.

Examples of monitoring tools that could be used on the sample platform identified in Step 2 include:

  • Sun Enterprise SyMON for Solaris 2.6 to measure memory consumption, CPU usage, and I/O activity on the database, application, and OLAP servers
  • Sybase Monitor Server for ASE 11.5.1 to measure database usage, lock contention, and cache hit ratio
  • Performance Monitor for Windows NT 4.0 to measure resource consumption on the file server and secondary application server
  • A LAN analysis tool to measure network bandwidth utilization

When starting these performance measurement tools, capture output both as a real-time display, to observe instantaneous changes in response time, and as a file, for later analysis of the data.
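A small sampling script can provide both forms of output at once: a live readout on the console and a CSV file for later analysis. This sketch uses the third-party psutil library as a stand-in for platform tools such as SyMON or Performance Monitor; the five-second interval, sample count, and file name are arbitrary choices:

    import csv
    import time

    import psutil  # third-party; stands in for platform monitoring tools

    with open("monitor.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "cpu_pct", "mem_pct"])
        for _ in range(60):                        # about 5 minutes of samples
            cpu = psutil.cpu_percent(interval=5)   # blocks 5s while sampling
            mem = psutil.virtual_memory().percent
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp}  cpu={cpu:5.1f}%  mem={mem:5.1f}%")  # live view
            writer.writerow([stamp, cpu, mem])     # file for later analysis
            f.flush()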

Step 7:
Create a baseline response time for all key tasks under no stress

Once all the monitoring tools and the entire infrastructure are up and running, we baseline the response time of every identified critical task without simulating the typical or peak workload. This shows how individual tasks perform and how quickly they respond when the back end is essentially idle. The response time for each task is computed by submitting it and measuring start time to end time with a stopwatch or, preferably, a timing script. The numbers generated in this step serve as the reference point for all later comparisons.
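A timing script is usually preferable to a stopwatch because it removes human reaction-time error and makes repeated runs cheap. Here is a minimal Python sketch; submit_requisition is a hypothetical stand-in for whatever call submits the task under test:

    import time

    def submit_requisition():
        """Hypothetical stand-in for submitting the task under test."""
        ...

    def time_task(task, runs=5):
        """Run a task several times; return per-run times and the average."""
        times = []
        for _ in range(runs):
            start = time.perf_counter()
            task()
            times.append(time.perf_counter() - start)
        return times, sum(times) / len(times)

    times, avg = time_task(submit_requisition)
    for i, t in enumerate(times, 1):
        print(f"run #{i}: {t:.1f} seconds")
    print(f"average: {avg:.1f} seconds")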

Step 8:
Baseline response time for all key tasks under different load conditions

The two key load conditions under which response times are measured are the typical workload and the peak load. Assume a task such as submitting a requisition is expected to complete in less than 40 seconds. Running the tests several times yields averages of 22 seconds under idle conditions, 32 seconds under the typical workload, and 52 seconds under peak load. We now know clearly that there are conditions under which the expected level of performance cannot be achieved with the given infrastructure and its configuration.

Testing under various conditions

Task #1 (Submit a requisition)

  Run       Start time   End time     Total time

  Under an idle workload
  #1        10:00:00     10:00:21     21 seconds
  #2        10:01:00     10:01:23     23 seconds
  #3        10:02:00     10:02:22     22 seconds
  #4        10:03:00     10:03:22     22 seconds
  #5        10:04:00     10:04:22     22 seconds
  Average                             22 seconds

  Under a typical workload
  #1        10:00:00     10:00:31     31 seconds
  #2        10:01:00     10:01:33     33 seconds
  #3        10:02:00     10:02:32     32 seconds
  #4        10:03:00     10:03:32     32 seconds
  #5        10:04:00     10:04:32     32 seconds
  Average                             32 seconds

  Under a peak workload
  #1        10:00:00     10:00:51     51 seconds
  #2        10:01:00     10:01:53     53 seconds
  #3        10:02:00     10:02:52     52 seconds
  #4        10:03:00     10:03:52     52 seconds
  #5        10:04:00     10:04:52     52 seconds
  Average                             52 seconds

Proper analysis of the data generated by the monitoring tools will reveal the bottleneck behind the slow response time. By alleviating bottlenecks one at a time, response times can be steadily improved. By running Steps 3 through 8 repeatedly, the bottleneck associated with each task can be identified and fixed. Even when the performance requirements for a given task are met on the first pass, it's important to continue baselining after every change to ensure that no performance degradation has crept in.
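To reproduce measurements like those in the table, the same timing harness from Step 7 can run while a pool of simulated users keeps the system busy. A minimal sketch, reusing the hypothetical submit_requisition and time_task from the Step 7 example; the user counts echo the earlier sizing of five servers at 200 users each, and the one-second think time is an assumption:

    import threading
    import time

    # Reuses submit_requisition() and time_task() from the Step 7 sketch.

    def background_user(stop_event):
        """Simulated user that keeps submitting work until told to stop."""
        while not stop_event.is_set():
            submit_requisition()       # hypothetical task from Step 7
            time.sleep(1)              # assumed think time between tasks

    def measure_under_load(users, runs=5):
        """Baseline the task while `users` simulated users generate load."""
        stop = threading.Event()
        workers = [threading.Thread(target=background_user, args=(stop,))
                   for _ in range(users)]
        for w in workers:
            w.start()
        try:
            _, avg = time_task(submit_requisition, runs)
        finally:
            stop.set()                 # shut the simulated users down
            for w in workers:
                w.join()
        return avg

    for label, users in [("idle", 0), ("typical", 200), ("peak", 1000)]:
        print(f"{label:8s} ({users:4d} users): {measure_under_load(users):.1f} s")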

Step 9:
If requirements are not met, make necessary changes in the form of tuning

Tweaking, tuning, and adding or removing the relevant resources after running this process several times is the most reliable way of meeting the performance requirements for all tasks. In one situation the bottleneck could be the credit reporting process: looking at the process flow reveals that it can't handle more than a limited number of concurrent requests. In a case like this, the process could be reconfigured to spawn more instances of itself to accommodate additional requests, along the lines of the sketch below.
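As an illustration of the "spawn more of the same" remedy, assuming the bottlenecked step can be expressed as a Python function, a worker pool makes the degree of parallelism a single tunable number; check_credit and the request queue here are hypothetical:

    from multiprocessing import Pool

    def check_credit(request_id):
        """Hypothetical stand-in for one credit-reporting request."""
        return request_id  # real work would call the credit service here

    if __name__ == "__main__":
        pending = range(100)               # stand-in for queued requests
        # The pool size is the tuning knob: raise it until this step stops
        # being the bottleneck, or until the host's CPU or I/O saturates.
        with Pool(processes=8) as pool:
            results = pool.map(check_credit, pending)
        print(f"processed {len(results)} requests")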

In another situation, the firewall and its authentication/authorization process could be identified as the bottleneck; a load-balanced cluster of firewall servers could resolve that. If a particular server's CPU utilization is the bottleneck, something as simple as adding CPUs may be the answer. Similarly, if the bottleneck is caused by I/O contention on a specific device, implementing striped mirrors for that device could be the fix.

By adopting this methodology, all relevant issues and bottlenecks can be identified, and the overall throughput of the system can be optimized. Added benefits of the process are verifying functional requirements and uncovering compatibility problems.

Finally, by embracing this process you can tell whether the applications will work under realistic workloads and whether service-level agreements will be met. In addition, these tests show how the system will affect the existing infrastructure and help gauge capacity planning efforts over the system's lifecycle.

Rakesh Radhakrishnan is a consulting manager with Noblestar Systems Inc., a Reston, Virginia, consulting company specializing in e-commerce and enterprise applications. He can be reached at
