Friday, May 28, 2010

Performance Testing - On LAN and over the Internet (WAN)

Typically, our performance test setup has the application servers and the load injectors (generators) on the same LAN. We run the tests, compare the response-time reports with the SLA (Service Level Agreement), and declare the test PASSED.
When the same application is deployed in the data center (production environment) and remote clients (real users) start accessing it, they notice response times that do not meet the SLA.

Why do clients see high response times from the same application, running on similar hardware, that the performance testing team TESTED and PASSED?

Why do applications so often fail to perform at the expected level despite load testing before roll-out?

We have been focusing on identifying software and hardware bottlenecks without considering real-world conditions such as network impact and end-user bandwidth.

There is a difference between testing the application on a LAN and over the internet.

The following factors affect application response time when requests travel over the internet.
1. Packet Loss - On the internet, every communication is sent as data packets.
A packet is a sequence of bytes consisting of a header followed by a body. The header describes the packet's destination and, optionally, the routers to use for forwarding until it arrives at its final destination.
Due to noise, some packets are lost or corrupted.
2. Packet Latency - Packets for the same destination do not all travel the same path; internet nodes select the shortest available route based on a number of factors. If some packets are delayed, assembling the response is delayed, which in turn increases response time.
Find out how many hops (routers) a packet must pass through to reach the destination server.
3. Packet Effects - Dynamic IP packet routing effects, including out-of-order packets, duplicated packets, fragmentation, and TTL effects (time-to-live: a field in the packet header that each router decrements; the packet is discarded when it reaches zero).
4. Link Faults - Possible damage to bit streams and possible disconnections.
5. Congestion - Sharp spikes in internet traffic, which may result in high latency, packet loss, or both.
6. Outgoing and Incoming Bandwidth - If the network cannot carry your application's throughput, congestion builds up and response time increases.
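
To get a feel for how these factors add up, here is a rough back-of-the-envelope sketch in Python. It is not a model of real TCP behaviour: the function name, the 1460-byte packet size, and the rule that each lost packet costs roughly one extra round trip are all assumptions made for illustration.

```python
import math

MSS_BYTES = 1460  # assumed payload bytes per packet (typical Ethernet path)


def estimate_response_time(server_time_s, payload_bytes, rtt_s,
                           bandwidth_bps, round_trips, loss_rate=0.0):
    """Rough response-time estimate in seconds (illustrative only)."""
    # Time spent waiting on the wire for each request/response exchange.
    latency_s = round_trips * rtt_s
    # Time needed to push the payload through the available bandwidth.
    transfer_s = payload_bytes * 8 / bandwidth_bps
    # Assumption: every lost packet costs about one extra round trip.
    packets = math.ceil(payload_bytes / MSS_BYTES)
    retransmit_s = packets * loss_rate * rtt_s
    return server_time_s + latency_s + transfer_s + retransmit_s


if __name__ == "__main__":
    # Same server time and page size, first on a LAN, then over a WAN link.
    lan = estimate_response_time(0.2, 146_000, 0.001, 1_000_000_000, 4)
    wan = estimate_response_time(0.2, 146_000, 0.150, 2_000_000, 4, 0.02)
    print(f"LAN estimate: {lan:.3f} s")
    print(f"WAN estimate: {wan:.3f} s")
```

The server time is identical in both calls; only the network parameters change, which is exactly the part a LAN-only test never exercises.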

Why does this affect server performance?
When data packets are lost, the request is incomplete, so the application server has to resend the lost packets, at extra CPU cost. The stress on the servers increases because more resources are needed to support remote end users: sessions stay open longer, OS resources are held for more time, and more concurrent threads are needed.

Without incorporating the above factors (WAN effects) into a load test, memory usage, thread usage, connection pool utilization, network stack load, and other critical server resources can be significantly understated.
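
One way to see why longer response times translate into more server resources is Little's Law (average number in the system = arrival rate x average time in the system). The sketch below applies it to requests in flight; the function name and the sample numbers are illustrative assumptions, not measurements.

```python
def concurrent_requests(arrival_rate_per_s, response_time_s):
    """Little's Law: average requests in flight = arrival rate * time in system."""
    return arrival_rate_per_s * response_time_s


if __name__ == "__main__":
    rate = 200  # assumed requests per second, identical in both scenarios
    print("LAN, 0.2 s per request:", concurrent_requests(rate, 0.2))
    print("WAN, 1.7 s per request:", concurrent_requests(rate, 1.7))
```

At the same request rate, the slower WAN responses keep several times more requests open at once, and each of them holds a thread, a connection, and memory for longer.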

Select this link to understand internet line quality. 

Now, how do we conduct a TRUE performance test that takes the above factors into consideration?

Approach 1: Install load generators at the client location, where real users access the application. This is a tedious process: the client's network may choke during the load test, and security-related issues may exist.

Approach 2: Perform the test on your LAN while emulating the above factors. To recreate them locally, we need a WAN emulator. One option is the SHUNRA Virtual Enterprise Suite, which can record the above factors from a remote (real) user and emulate them on the LAN. Shunra provides a plug-in for LoadRunner.
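
Shunra is a commercial product and its interface is not shown here. As a different, freely available illustration of the same idea, Linux ships a network emulation queueing discipline (netem) that is configured through the tc command. The sketch below only builds such a command; the helper name, the interface name, and the sample values are assumptions, and actually applying the command requires root on a Linux load generator.

```python
def build_netem_command(interface, delay_ms, jitter_ms, loss_pct, rate_kbit):
    """Build a `tc ... netem` command that adds delay, jitter, loss and a rate cap."""
    return [
        "tc", "qdisc", "add", "dev", interface, "root", "netem",
        "delay", f"{delay_ms}ms", f"{jitter_ms}ms",  # base delay plus jitter
        "loss", f"{loss_pct}%",                      # random packet loss
        "rate", f"{rate_kbit}kbit",                  # bandwidth limit
    ]


if __name__ == "__main__":
    # Hypothetical profile: 100 ms delay, 20 ms jitter, 1% loss, 2 Mbit/s link.
    cmd = build_netem_command("eth0", 100, 20, 1, 2000)
    print(" ".join(cmd))
    # To apply it (as root), pass cmd to subprocess.run(cmd, check=True).
```

Running the load test through an interface shaped this way approximates a remote user's line quality without leaving the LAN.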

Approach 3: Generate load from the cloud. What is cloud load testing?
There are companies that can simulate load for any number of users from any part of the globe using cloud services (such as Amazon EC2).

For more information, look into Cloud Load Testing.

Approach 4: Estimate the impact of the network - Separately estimate the impact of the network on the application being deployed and manually factor that into result reports and deployment readiness findings.
This provides partial insight into the impact of the network but ignores the interplay between network performance and application logic. Whenever the application logic or infrastructure configuration changes, there is an ongoing risk of an unanticipated adverse impact and unhappy users.

Approach 5: Generate load from your premises so that you hit the data center or production environment through the internet. However, purchasing or leasing high-capacity internet lines just for load testing is very costly. If your internet connection capacity is less than the application throughput, your results will be incorrect.

Approach 6: Use third-party load generation services such as Gomez and Keynote.

The Gomez network spans 168+ countries and 2,500+ ISPs. It comprises 500+ combinations of browsers and operating systems, 150+ commercial data centers, 5,000+ supported mobile devices, and 150,000+ commercial-grade desktops, through which it is possible to generate enormous real-user load. Internet giants such as Google, Yahoo, Facebook, and LinkedIn use these services. I feel it is the ultimate in load testing for predicting true results.

[Screenshot: Gomez network]

[Screenshot: Gomez recorder, which records user actions on the page (similar to the LoadRunner Click & Script protocol)]

[Screenshot: Gomez node (load generator) selection by Country | City | ISP]
[Screenshot: Gomez results by City | ISP]

Before analyzing results, always make sure that bandwidth is not the bottleneck. 

When testing over the internet, always compare your internet bandwidth with the combined application throughput of all the virtual users. For example, if you are running a test with 100 virtual users and the consolidated throughput is 20 Mbps, make sure your internet bandwidth is more than 20 Mbps. If the bandwidth is lower than the application throughput, packets will congest and high response times will be recorded; in that case the problem is not the application server but the network.
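
The 100-user, 20 Mbps example can be turned into a quick pre-test check. This is a minimal sketch: the function names and the 20% headroom margin are assumptions, not a rule from any tool.

```python
def required_bandwidth_mbps(virtual_users, per_user_throughput_mbps):
    """Consolidated throughput generated by all virtual users, in Mbps."""
    return virtual_users * per_user_throughput_mbps


def bandwidth_is_bottleneck(link_mbps, required_mbps, headroom=0.2):
    """True when the link cannot carry the load plus an assumed safety margin."""
    return link_mbps < required_mbps * (1 + headroom)


if __name__ == "__main__":
    # 100 virtual users at an assumed 0.2 Mbps each gives about 20 Mbps in total.
    needed = required_bandwidth_mbps(100, 0.2)
    for link in (20, 50):
        verdict = "bottleneck" if bandwidth_is_bottleneck(link, needed) else "ok"
        print(f"{link} Mbps link for {needed:.0f} Mbps of load: {verdict}")
```

With the assumed margin, a 20 Mbps line is flagged because it leaves no spare capacity for the 20 Mbps of load, while a 50 Mbps line passes.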

During a performance test, response time should increase as users are added only when the server CPU or other counters reach their limits. If response time rises without that, suspect an issue in the network or the load generators.
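
That rule of thumb can be written down as a small check to run against each load step. The thresholds (85% utilization, 1.5x response-time growth) and the function name are assumptions chosen for illustration; use whatever limits apply to your servers.

```python
def classify_load_step(baseline_rt_s, current_rt_s, busiest_counter_pct,
                       counter_limit_pct=85.0, rt_growth=1.5):
    """Label one load step from its response time and busiest server counter."""
    degraded = current_rt_s > baseline_rt_s * rt_growth
    if not degraded:
        return "ok"
    if busiest_counter_pct >= counter_limit_pct:
        # Response time grew and the server is at its limit: expected behaviour.
        return "server-saturated"
    # Response time grew while the server still has capacity to spare.
    return "check-network-or-load-generators"


if __name__ == "__main__":
    print(classify_load_step(0.5, 0.6, 40.0))
    print(classify_load_step(0.5, 1.2, 92.0))
    print(classify_load_step(0.5, 1.2, 35.0))
```

The third case is the one to investigate before blaming the application: response time has more than doubled while the server counters are nowhere near their limits.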

The following posts are also related to performance testing.

Performance Testing Configuration or Setup
Performance related issues between browser and server
Performance Testing - TTFB, TTLB
Browser wars & End user performance, content display impact
Analyze Browser - JavaScript, AJAX, Rendering Details

