The article explains the purposes and benefits of load testing. It also provides helpful tips for app owners and walks you through the main stages of the process.
Some reasons why your web service or mobile application needs load testing:
Losing your new business in a matter of seconds because of slow system response time is definitely not a desirable result. In the highly competitive space of web services, this outcome is, unfortunately, quite real.
Imagine a newly launched pizza ordering website. It has been working without failures during regular business hours. Then a pizza day comes, seemingly at just the right time and as a promotional opportunity for the new pizzeria. A new order arrives every few seconds, and suddenly everything stops. The website’s response time slows down, and some orders freeze midway in the cart. Logged-in users run into an error page and, puzzled, associate the new pizza brand with this denial of service. The loss is direct, and it is caused by not understanding why a web service like this needs a load test.
In short, the reasons why you need load tests as a business owner can be summarized as the following:
- You want to keep your online product up and running no matter what
- You plan to scale up your website or app
- You want to identify bottlenecks in your product (find out what limits its performance when the load increases)
- You want to make sure your online business withstands massive traffic volume
To prevent bad scenarios from happening, it’s recommended to conduct load tests at the right time.
Now, what exactly is a load test?
Load tests put software under load and measure its response to that load. It’s important that the load level is preset and known, which sets this kind of test apart from other testing strategies. Usually, the test involves a large number of users accessing the software simultaneously.
A load test not only allows you to fix the identified weaknesses of the tested software but also produces scripts that can be reused in subsequent load tests to keep the system under control.
In this way, the tested system’s behavior is understood and becomes more predictable in regular and irregular circumstances alike. Its bottlenecks are identified, making it possible to take the necessary steps to stabilize the overall performance of the app.
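To make the idea of a preset, known load concrete, here is a minimal Python sketch. It is only an illustration: `handle_request` is a hypothetical stand-in for the system under test (a real test would send network requests), and the numbers of users and requests are arbitrary assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(user_id: int) -> str:
    """Hypothetical stand-in for the system under test."""
    time.sleep(0.01)  # pretend the service needs about 10 ms to respond
    return f"ok:{user_id}"


def timed_call(user_id: int) -> float:
    """Send one request and return its response time in seconds."""
    started = time.perf_counter()
    handle_request(user_id)
    return time.perf_counter() - started


def run_load_test(concurrent_users: int, requests_per_user: int) -> list[float]:
    """Apply a preset load and collect the response time of every request."""
    user_ids = [
        user for user in range(concurrent_users) for _ in range(requests_per_user)
    ]
    # One worker thread per simulated user keeps the concurrency level known.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        return list(pool.map(timed_call, user_ids))


latencies = run_load_test(concurrent_users=20, requests_per_user=5)
print(f"requests: {len(latencies)}, slowest: {max(latencies):.3f}s")
```

Because the load level is fixed in advance (20 users, 5 requests each), the same run can be repeated after every change and the response times compared.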
What will the results of the load test tell you? You’ll learn the following information:
- How many users can access your software simultaneously;
- How scalable your software is, i.e., how many more users it will be able to serve;
- How your software responds to peak load;
- Whether your infrastructure is sufficient to maintain your software.
In other words, with load testing, you’ll have bottlenecks of your product clearly defined.
What does load testing give me as the product owner?
It helps you build a stable and reliable infrastructure. Nobody wants to keep repeating unsuccessful login attempts.
By all accounts, it’s much easier and cheaper to face a problem in your app early on than to fix it in production at the expense of your customer base.
When should I run the load test?
One popular timing strategy for load tests is to run them early, at the pre-release stage, on a simpler configuration. This approach lets you discover important information about the behavior of your system even though you won’t obtain a precise figure for the number of users it supports. When the test is over, you’ll know whether your app falls short of your targets and which bottlenecks it has.
In this case, a series of simple tests at various stages of deployment is appropriate. The advantage of this approach is that errors can be fixed earlier, which saves costs.
Another approach is to emulate real-life conditions. It requires a more careful configuration in order to simulate reality as closely as possible, and deploying such an environment takes longer. Since this configuration takes more time and effort than the previous strategy, the testing timeline may be delayed.
Stages of load tests
Let’s move on to the stages of the load test:
1. Analysis of non-functional requirements
Unlike functional testing, which verifies specific functions (like “place an order” in a delivery app or “call a driver” in a taxi app), non-functional testing evaluates how a system performs in terms of capacity, load, volume, and stress.
2. Collecting information about the business processes that should be tested under load.
3. Determining success criteria for the load test.
4. Preparing the software test stand so that it can run the load script corresponding to the production load.
The cost of this test stand is approximately equal to the cost of one copy of the app’s infrastructure plus the load-testing infrastructure, which depends on the traffic.
5. Preparing the infrastructure for the tool that launches load scripts.
6. Preparing the infrastructure to collect both hardware and system metrics.
7. Preparing the load scripts.
8. Running the load test itself. It may last from 1–2 hours to 6–9 hours. In critical cases, for example when a release is at risk of failure, it may take longer, with testing running around the clock.
9. Obtaining and interpreting test results.
What other aspects of load tests should I pay attention to?
➜ Load testing is not the same as performance testing, although it is a part of it. Performance testing examines the overall performance of software and covers a broad scope of tests. As discussed above, load testing focuses on how software behaves when there is a user load on the system.
Other types of performance tests include tests for endurance, stress, scalability, volume, capacity, and spike.
➜ Load testing provides both qualitative and quantitative information. With the performance metrics in front of you, you can compare them with the non-functional requirements set in your plan. If you aren’t happy with the results, the team can fix the critical indicators promptly and run similar tests later using the same test script. A series of tests is a way to control app development dynamically.
➜ Through tests, we identify the limits of app scaling or the components that block scaling. Such blockers include, for example, memory leaks or limitations on connection queues. As a result, the tested app can return technical errors, such as a lack of available connections, timeout errors reflecting lost connections between services, and processing errors from intermediate services that work without issues when there is no load. In the worst-case scenario, this can lead to brief periods of downtime.
➜ When the software experiences peak traffic (holidays or marketing campaigns), it’s advisable to stay in touch with the engineers. This recommendation is especially relevant for newly launched services that are starting their commercial journey.
➜ Tests with higher loads are recommended for established products that demonstrate stable performance under regular loads.
➜ Once the first load test has been set up and run, the remaining procedures can be automated and reused to examine subsequent releases.
➜ When the system is under high load, it’s advisable to keep track of the hardware and network metrics of the test infrastructure itself. Connections may time out, and CPU and memory capacity may be affected by the traffic volume and user actions.
➜ When preparing load scripts, make sure the test data is selected from a wide range. The script can retrieve data from a prepared data file or generate random values within the range available on the test stand. This matters because repeated identical requests during the load test may be served entirely from the cache, whereas a wide range of data ensures that most requests actually reach the database.
➜ The data volume on the test stand must correspond to the production environment. The engineer should take this into account, among other things. With a smaller data volume, requests are likely to complete faster than under real conditions.
➜ Charts may show that the system is stable while in practice it is not. In such cases, an additional analysis of the software is required.
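The earlier point about selecting test data from a wide range can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the cache is a plain in-memory set that never evicts entries, and the ID ranges and request count are invented for the example.

```python
import random


def cache_hit_rate(id_range: int, requests: int, seed: int = 42) -> float:
    """Return the share of requests answered from a simple in-memory cache."""
    rng = random.Random(seed)
    cache: set[int] = set()
    hits = 0
    for _ in range(requests):
        record_id = rng.randint(1, id_range)
        if record_id in cache:
            hits += 1  # served from the cache, the database is not touched
        else:
            cache.add(record_id)  # first access has to go to the database
    return hits / requests


# Narrow data range: the same few records are requested again and again.
narrow = cache_hit_rate(id_range=10, requests=1_000)
# Wide data range: almost every request asks for a record not seen before.
wide = cache_hit_rate(id_range=1_000_000, requests=1_000)

print(f"narrow range hit rate: {narrow:.1%}")
print(f"wide range hit rate: {wide:.1%}")
```

With the narrow range, nearly every request is a cache hit, so the database is barely exercised. With the wide range, almost all requests miss the cache and reach the database, which is closer to what the load test is meant to measure.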
How do we interpret test results and act on this interpretation?
Interpretation of test results takes place in two stages:
1. Analysis and matching of received metrics against the requirements or respective metrics of the previous tests.
At this stage, we pay attention to throughput, response time, traffic volume, and error rate. A long response time or the presence of errors doesn’t necessarily mean that the test has failed.
When analyzing load metrics, percentiles are used. What makes this measurement important in test analysis? As an app owner, you’ll be able to see what share of users will get a given response time and whether this matches your goal.
A percentile is the value below which a given percentage of measurements fall. For example, a 50th percentile of 5 seconds for response time means that half of the transactions (50%) complete in 5 seconds or less.
When these metrics are ready, engineers set a threshold for the next test(s), and a test whose values exceed this threshold won’t pass.
2. Analysis of system metrics of the services and hardware integrated into the app.
This stage mainly focuses on metrics that have already been stored in Elasticsearch or InfluxDB for further visualization. Load tests highlight the key metrics and show which values signal an approaching bottleneck or a full degradation of the system. This allows engineers to set up alerts in the production environment based on the relevant ranges and, therefore, react promptly to emerging problems.
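The percentile analysis from the first stage can be sketched in a few lines of Python. The response times and the 6-second limit below are made-up values, and the nearest-rank method used here is just one of several common ways to compute a percentile.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds collected during one test run.
response_times = [0.8, 1.2, 1.5, 2.0, 2.4, 3.1, 3.9, 4.6, 5.0, 9.5]

p50 = percentile(response_times, 50)
p90 = percentile(response_times, 90)

# Hypothetical threshold for the next test: 90% of requests within 6 seconds.
P90_LIMIT_SECONDS = 6.0
passed = p90 <= P90_LIMIT_SECONDS

print(f"p50={p50}s, p90={p90}s, passed={passed}")
```

Note that the single slow request of 9.5 seconds does not fail the run, because the threshold is applied to the 90th percentile rather than to the maximum.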
How bottlenecks affect the productivity of your project
Just like the neck of a bottle slows down the water flow, a single component can be a significant hindrance to the whole workflow of your system. That’s why it’s common to refer to these elements as bottlenecks.
If you anticipate constant growth in visitors and users, an unexpected failure to handle app traffic during one or several releases can cause financial and reputational loss. Bottlenecks thus restrain the scaling of an app and prevent its evolution into a more prosperous business or a popular online tool.
Which components can impede further growth?
Any software system consists of three levels: remote hardware systems, software frameworks, and network systems. Cloud-based hardware includes databases and storage for web pages; software frameworks are responsible for rendering web pages; and, finally, network systems maintain the transfer of data. Working together, these three levels ensure that a web application runs smoothly.
The component with the lowest capacity, speed, or throughput acts as a bottleneck. Without it, the tested system would run smoothly and without delays.
Bottlenecks can show up in Central Processing Unit (CPU) utilization, memory usage, disk usage, network delays, etc. They are identified during testing.
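The idea that the weakest component limits the whole system can be shown with a short sketch. The capacity figures are hypothetical, and the model is deliberately simplified: it assumes every request passes through all three levels in sequence.

```python
# Hypothetical capacities, in requests per second, for the three levels.
capacity_rps = {
    "hardware (database, storage)": 400,
    "software framework": 250,
    "network": 900,
}

# In a chain of components, the slowest one caps the whole system.
bottleneck = min(capacity_rps, key=capacity_rps.get)
system_capacity_rps = capacity_rps[bottleneck]

print(f"bottleneck: {bottleneck}, system capacity: {system_capacity_rps} rps")
```

In this simplified model, raising the network capacity changes nothing; only improving the software framework level raises the overall limit.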
How our company conducted the load test for an educational app
The client’s intention was to learn whether the service could handle 80,000 users at once. The next question was how many users could actually use the system at the same time.
The tested system is an educational web application. Using this app, you can look up a name and hear its correct pronunciation, which is crucial in a multicultural university environment. Therefore, returning search results and rendering the name are of critical importance for this app.
Our goal was to examine the authentication system and its potential limits. Users log in to the app with their credentials, which triggers requests to an API.
We also wanted precise information on the load capacity rather than an approximate evaluation of the system’s ability to endure the load. Further tests and monitoring would be based on these findings.
Data was stored in InfluxDB, an open-source database, and we used Grafana for visualization. Metrics for the app’s infrastructure were processed with the help of Kubernetes, New Relic, and Sumo Logic (the latter is used for both collecting and storing data).
Stages of tests
1. The technical aspects were agreed upon with the customer:
- The machine used for the test. It was deployed on the customer’s end as an EC2 instance.
- The test time. Nighttime was chosen so that the production environment could be tested.
2. Defining the script and goals:
- Preparing the load test scenarios
- Preparing tools and test environment
- Sorting out goals and milestones with the customer
- Identifying non-functional requirements
- Matching the retrieved data with the application metrics
3. Testing stage
We could see how well the system handled the load by checking the visualization charts. If indicators were close to the “ceiling”, it was a sign of a potential bottleneck. If hardware capacity did not approach its limits while the load was growing, the system proved stable and able to handle significant throughput.
Pods are the smallest units of execution in Kubernetes. When we added more pods, we observed that the response time increased.
Finally, we obtained the parameters of the system: the maximum capacity was 50 responses per second.
Before the test, we had a different understanding of how the system reacts to integration with the third-party system, so the test supplied us with new information to act on.
When we increased the number of copies of the tested app serving the browser extension, we could identify the bottleneck: it was the browser extension.
The answers to the main questions were thereafter documented:
1. The service won’t be able to handle 80,000 users at once.
2. The system can handle almost 200,000 users, but it will take an hour to process them.
3. It also became possible to identify at which stage and under which conditions the system starts to degrade.
4. Working out recommendations for the tested service
The test we conducted served as the basis for subsequent examinations. Thanks to the metrics obtained, we could work through the SLA and agree on what can be considered good or bad for the system’s capacity.
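As a rough cross-check of the documented answers, the measured maximum of 50 responses per second can be converted into users per hour. The calculation assumes, purely for illustration, that each user needs exactly one response; the real ratio of requests to users in the tested app may differ.

```python
MAX_RESPONSES_PER_SECOND = 50  # maximum capacity measured in the test
SECONDS_PER_HOUR = 60 * 60
TARGET_USERS = 80_000  # the client's original question

# Assumption: one response per user.
users_per_hour = MAX_RESPONSES_PER_SECOND * SECONDS_PER_HOUR
minutes_for_target = TARGET_USERS / MAX_RESPONSES_PER_SECOND / 60

print(f"users served in one hour: {users_per_hour}")
print(f"minutes needed for {TARGET_USERS} users: {minutes_for_target:.1f}")
```

Under this assumption the system serves 180,000 users in an hour, which is in line with the "almost 200,000" figure, while 80,000 users arriving at once would need close to half an hour to be processed.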
A load test does not give the full picture by itself; it is one of the analytical stages that lets you better grasp the internal capacity of your system. It supplies you with critical data and lays the foundation for constant awareness of your system’s capacity and its potential weak spots.
There is no perfect, universal blueprint for a load test, which is why starting this process with seasoned developers is what determines a successful outcome!