The article explains the purposes and benefits of load testing. It also provides helpful tips for app owners and walks you through the main stages of the process.
Some reasons why your web service or mobile application needs load testing:
Losing a new business in seconds because of slow system response is a real risk. Unfortunately, this outcome is common in the highly competitive web services space.
Imagine a newly launched pizza-ordering website. It has been working without failures during regular business hours. Then a pizza day comes, seemingly at just the right time and as a promo opportunity for the new pizzeria. A new order arrives every few seconds, and suddenly everything stops. The website’s response time grows, and some orders freeze midway in the cart. Users who logged in to the website run into an error page, are left puzzled, and associate the new pizza brand with this denial of service. The loss is direct, and it stems from not understanding why a web service like this needs a load test.
In short, the reasons why you need load tests as a business owner can be summarized as the following:
- You want to keep your online product up and running no matter what
- You plan to scale up your website or app
- You want to identify bottlenecks in your product (find out what limits its performance when the load increases)
- You want to make sure your online business withstands massive traffic volume
To prevent bad scenarios from happening, it’s recommended to conduct load tests at the right time.
Now, what exactly is a load test?
Load tests put software under load and measure its response to this load. The load level must be preset and known, which distinguishes this test from other testing strategies. Usually, the test involves a significant number of users accessing the software simultaneously.
A load test not only allows fixing the identified weaknesses of the tested software but also prompts the creation of scripts for subsequent load tests and helps keep the system under control.
This way, the tested system’s behavior is understood and will be more predictable in regular and irregular circumstances. Its bottlenecks will be identified, allowing for taking the necessary steps to stabilize the app’s overall performance.
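As a minimal sketch of what "a preset, known load" means in practice, the Python snippet below starts a fixed number of simulated users at roughly the same time and records each response time. The function `handle_request` is a hypothetical stand-in that only sleeps; a real load script would call the system under test instead, and the user count of 50 is an arbitrary illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def handle_request(user_id: int) -> float:
    """Hypothetical stand-in for one user action; returns the response time in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated processing; a real script would call the system under test
    return time.perf_counter() - start


def run_load(concurrent_users: int) -> List[float]:
    """Apply a preset load: each simulated user sends one request, all at roughly the same time."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        return list(pool.map(handle_request, range(concurrent_users)))


if __name__ == "__main__":
    timings = run_load(concurrent_users=50)
    print(f"requests: {len(timings)}, slowest: {max(timings):.3f}s")
```

Dedicated load testing tools do the same thing at a much larger scale and add ramp-up profiles, reporting, and distributed execution.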
What will the results of the load test tell you? You’ll learn the following information:
– How many users can access your software simultaneously;
– How scalable your software is, i.e., how many more users it can accommodate;
– How your software responds to peak load;
– Whether your infrastructure is sufficient to maintain your software.
In other words, with load testing, you’ll have the bottlenecks of your product clearly defined.
What does load testing give me as the product owner?
- Building a stable and reliable infrastructure. Nobody wants to keep repeating unsuccessful login attempts.
- Understanding the limits of your products before they go live and the reasons behind these constraints. You’ll see which components stand in the way of the speedy and sustainable performance of your app or web service.
- Understanding the product’s Capacity, Load, Volume, Stress, and other parameters.
- Receiving the information on the system’s response time and transmitted data volume.
- Understanding if your system can serve end-users during promo campaigns on traffic volume peaks.
- Identifying non-functional requirements for your project if you don’t have them now.
By all accounts, it’s much easier and cheaper to face a problem in your app early on than to fix it in production at the expense of your customer base.
When should I run the load test?
One common timing strategy for load tests is to run them early, at the pre-release stage, on a simpler configuration. This approach lets you discover important information about the behavior of your system even though the target number of users won’t be reached. When the test is over, you’ll know whether your app falls short of your targets and which bottlenecks it has.
In this case, a series of lightweight tests at various stages of deployment is appropriate. The advantage of this approach is the possibility of fixing errors earlier and saving on costs.
Another approach is emulating real-life conditions. It requires a more careful configuration to reproduce production as closely as possible, and deploying such an environment takes longer. Since this configuration takes more time and effort than the previous strategy, the testing timeline may be delayed.
Stages of load tests
Let’s move on to the stages of the load test:
1. Analysis of non-functional requirements
Unlike functional testing, which verifies that specific functions work (like “place an order” in a delivery app or “call a driver” in a taxi app), non-functional testing evaluates the performance of a system in terms of its capacity, load, volume, and stress.
2. Collecting information about the business processes that should be tested under load.
3. Determining success criteria for the load test.
4. Preparing the software test stand so that it can run a load script corresponding to the product load.
The cost of this test stand is approximately equal to one copy of the app’s infrastructure plus the load test infrastructure, which depends on the traffic.
5. Preparing the infrastructure for the tool that launches load scripts.
6. Preparing the infrastructure to collect both hardware and system metrics.
7. Preparing the load scripts.
8. Load testing itself. It may last from 1-2 hours to 6-9 hours. In critical cases, for example, at risk of release failure, it may take longer and run around the clock.
9. Obtaining and interpreting test results.
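Stages 3 and 9 can be connected in a simple way: write the success criteria down as explicit thresholds before the test, then compare the results against them afterwards. The sketch below is illustrative only; the class names, field names, and all numbers are assumptions and not taken from a real project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SuccessCriteria:
    max_p95_response_s: float   # 95% of requests must finish within this time
    max_error_rate: float       # allowed share of failed requests (0.0 to 1.0)
    min_throughput_rps: float   # required requests per second


@dataclass
class RunResult:
    p95_response_s: float
    error_rate: float
    throughput_rps: float


def failed_criteria(result: RunResult, criteria: SuccessCriteria) -> List[str]:
    """Return the names of the criteria the test run did not meet."""
    failures = []
    if result.p95_response_s > criteria.max_p95_response_s:
        failures.append("response time")
    if result.error_rate > criteria.max_error_rate:
        failures.append("error rate")
    if result.throughput_rps < criteria.min_throughput_rps:
        failures.append("throughput")
    return failures


# Hypothetical thresholds and measurements, for illustration only.
criteria = SuccessCriteria(max_p95_response_s=2.0, max_error_rate=0.01, min_throughput_rps=40.0)
result = RunResult(p95_response_s=2.4, error_rate=0.004, throughput_rps=52.0)
print(failed_criteria(result, criteria))  # ['response time']
```

An empty list means the run met every criterion; anything else points the team to the indicator that needs attention first.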
What other aspects of load tests should I pay attention to?
➜ Load testing is not the same as performance testing, although it is a part of it. Performance testing examines software for overall performance and covers a broad scope of tests. As discussed above, load testing focuses on the behavior of software under user load.
Other performance test types include endurance, stress, scalability, volume, capacity, and spike tests.
➜ Load testing yields both qualitative and quantitative information. With the performance metrics on your table, you can match them against the non-functional requirements set in your plan. If you aren’t happy with the results, the team can address the critical indicators quickly and rerun similar tests in the future using the same test script. A series of tests is a way to control app development dynamically.
➜ Through tests, we identify the limits of app scaling or the components that block this scaling. These blocks are, for example, memory leaks or limits on connection queues. As a result, the tested app can return technical errors under load, such as a lack of available connections, timeout errors caused by lost connections between services, and processing errors from intermediate services that work without issues when there is no load. In the worst-case scenario, this leads to brief periods of downtime.
➜ When the tested software experiences peak traffic (holidays or marketing campaigns), staying in touch with the engineers is advisable. This recommendation is especially relevant for newly launched services that are starting their commercial journey.
➜ Tests with higher loads are recommended for the established products demonstrating sustainable performance under regular loads.
➜ Once the first load test has been set up and run, the remaining procedures can be automated and reused for repeated testing of subsequent releases.
➜ When the system is under high load, it’s advisable to keep track of the hardware and network metrics of the test infrastructure as well. Connections may expire, and the traffic volume and user actions may affect processor and memory capacity.
➜ When preparing load scripts, it is necessary to select data from a wide range. The script can retrieve data from a known data file or generate random values within the range available on the test stand. This is essential because repeated identical requests during a load test may be served from the cache alone, whereas a wide range of data ensures that most requests actually reach the database.
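The two data-selection options can be sketched as follows. The record IDs, the range of 1 to 10,000, and the function names are hypothetical; the point is only to show how the spread of input data changes the number of distinct requests a test sends.

```python
import random


def pick_from_known_data(known_ids, rng):
    """Option 1: choose a value from a prepared data set (e.g. loaded from a file)."""
    return rng.choice(known_ids)


def pick_from_range(low, high, rng):
    """Option 2: generate a random value within the range available on the test stand."""
    return rng.randint(low, high)


rng = random.Random(42)  # fixed seed so a test run can be reproduced
known_ids = list(range(1, 10_001))  # hypothetical: 10,000 record IDs present on the test stand

# A narrow selection (only 5 IDs) versus a wide one (the whole range).
narrow = [pick_from_known_data(known_ids[:5], rng) for _ in range(1000)]
wide = [pick_from_range(1, 10_000, rng) for _ in range(1000)]

# Few distinct keys mean most requests repeat and can be answered from cache;
# many distinct keys force the database to do real work.
print(len(set(narrow)), len(set(wide)))
```

With the narrow selection, at most five distinct requests are ever sent, so the measured response times mostly reflect the cache rather than the database.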
➜ The data volume on a test stand must correspond to the production environment. This aspect should be considered by an engineer, among others. With less data volume, there is a probability that requests will be completed quicker than under natural conditions.
➜ Charts can show that the system is sustainable, but practice may not confirm that. In this case, additional software analysis is required.
How do we interpret test results and act on this interpretation?
Interpretation of test results takes place in two stages:
1. Analyzing the received metrics and matching them against the requirements or the respective metrics of previous tests.
At this stage, we pay attention to throughput, response time, traffic volume, and error rate. A long response time or revealed errors don’t necessarily mean the test has failed.
When analyzing load metrics, the percentile score is applied. What makes this measurement important in test analysis? As an app owner, you’ll be able to grasp how many users will enjoy the given response time and if this matches your goal.
A percentile is the value below which a given percentage of measurements fall. For example, a 50th percentile response time of 5 seconds means that half of the transactions (50%) take 5 seconds or less.
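To make the percentile idea concrete, here is a small sketch using the nearest-rank method, which is one of several common ways to compute percentiles (monitoring tools may interpolate and give slightly different numbers). The response times are made-up values for illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least p% of values are <= it."""
    if not values or not 0 < p <= 100:
        raise ValueError("need a non-empty list and 0 < p <= 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank in the sorted list
    return ordered[rank - 1]


response_times_s = [1.2, 0.8, 5.0, 3.1, 9.4, 2.2, 4.7, 0.9, 6.3, 7.5]  # illustrative values
print(percentile(response_times_s, 50))  # 3.1
print(percentile(response_times_s, 90))  # 7.5
```

Here half of the requests finished in 3.1 seconds or less, while 90% finished in 7.5 seconds or less. Higher percentiles are useful because they show what the slowest users experience, which an average would hide.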
When these metrics are ready, engineers set a threshold for the next test(s), and any run whose values fall outside this range is treated as not passing.
2. Analysis of system metrics of the services and hardware integrated into the app.
This stage mainly focuses on metrics already stored in Elasticsearch or InfluxDB for further visualization. Load tests can highlight which metrics signal an approaching bottleneck or complete system degradation. This allows engineers to set up alerts in the production environment based on relevant ranges and react promptly to emerging problems.
How bottlenecks affect the productivity of your project
Just like the neck of a bottle slows down the water flow, a single component can be a notable hindrance in the whole workflow of your system. That’s why it’s common to refer to these elements as bottlenecks.
If you anticipate constant growth of visitors and users, the unexpected failure to handle app traffic during one or a few releases can cause financial and reputational loss. So bottlenecks restrain the scaling of an app and prevent its evolution into a more prosperous business enterprise or a popular online tool.
Which components can impede further growth?
Any software system consists of three levels: remote hardware systems, software frameworks, and network systems. Cloud-based hardware includes databases and storage of web pages; software frameworks are responsible for rendering web pages; and network systems maintain data transfer. Working together, these three levels ensure smooth web application work.
If one of the abovementioned components demonstrates the lowest capacity, speed, or throughput, it will be a bottleneck. If this were not the case, the tested system would function smoothly and perform without delays.
Bottlenecks can show up as high Central Processing Unit (CPU) utilization, memory usage, disk usage, network delays, etc. They are identified during testing.
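The "lowest capacity" idea can be illustrated with a few lines of code. The component names and capacity numbers below are invented, and the sketch assumes a simple chain in which every request passes through every component once.

```python
# Hypothetical measured capacities, in requests per second, for each component.
capacities_rps = {
    "web server": 400,
    "application service": 250,
    "database": 120,
    "network": 900,
}

# Under the chain assumption, the system as a whole cannot serve more
# than its slowest component allows.
bottleneck = min(capacities_rps, key=capacities_rps.get)
system_capacity_rps = capacities_rps[bottleneck]

print(bottleneck, system_capacity_rps)  # database 120
```

In this example, upgrading the web server or the network would change nothing: the overall capacity stays at 120 requests per second until the database is addressed.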
How our company conducted the load test for an educational app
The client intended to learn if the service could handle 80,000 users concurrently. The next question was how many users could use the system simultaneously.
The tested system is an educational web application. Using this app, you can look up a name and hear its correct pronunciation, which is crucial in a multicultural university environment. Therefore, returning search results and rendering the name are critical for this app.
Our goal was to examine the authentication system and its potential limits. Users log in to the app with their credentials, which triggers requests to an API.
Our goal was to have precise information on the load capacity rather than doing an approximate evaluation of the system’s capability to endure the load. Further tests and monitoring will be based on these findings.
Data storage was powered by the open-source database InfluxDB. We used Grafana for visualization. Metrics for the app’s infrastructure were processed with the help of Kubernetes, New Relic, and Sumo Logic (the latter is used for collecting and storing data).
Stages of tests
1. The technical aspects were agreed upon with the customer:
- The machine used for the test: an EC2 instance deployed on the customer’s side.
- The test time: nighttime was chosen because the production environment was being tested.
2. Defining the scripts and goals:
- Preparing the load test scenarios
- Preparing tools and test environment
- Sorting out goals and milestones with the customer
- Identifying non-functional requirements
- Matching the retrieved data with the application metrics
3. Testing stage
We could see how well the system handles the load by checking visualization charts. If indicators are close to the “ceiling,” it signals a potential bottleneck. If hardware capacity doesn’t approach its limits while the load grows, the system proves sustainable and can handle significant throughput.
Pods are the smallest units of execution in Kubernetes. When adding more pods, we observed that the response time increased.
Finally, we obtained the system’s parameters: the maximum capacity was 50 responses per second.
Before the testing, we had a different understanding of how the system reacts to the integration of the third-party system. So the test supplied us with new information to act on.
When we increased the number of copies of the tested app for a browser extension, we could identify the bottleneck: it was the browser extension.
The answers to the main questions were then documented:
1. The service won’t be able to handle 80,000 users at once.
2. The system can handle almost 200,000 users, but it will take an hour to process them.
3. It also became possible to identify at which stage and under which conditions the system starts to degrade.
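The figures above can be related with simple arithmetic. The sketch below assumes that each user needs exactly one response, which is a simplification and not something the test report states; under that assumption, a maximum of 50 responses per second corresponds to 180,000 responses per hour, in line with the "almost 200,000 users in an hour" finding.

```python
max_responses_per_second = 50   # measured maximum capacity from the test
seconds_per_hour = 3600

# How many responses the system can produce in one hour at maximum capacity.
responses_per_hour = max_responses_per_second * seconds_per_hour
print(responses_per_hour)  # 180000

# Time needed to serve 80,000 users if each needs exactly one response (an assumption).
seconds_for_80k = 80_000 / max_responses_per_second
print(seconds_for_80k / 60)  # about 26.7 minutes
```

Under the same assumption, 80,000 users arriving at once would take roughly 27 minutes to be served, which is why they cannot all be handled simultaneously.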
4. Working out recommendations for the tested service
The conducted test served as the basis for subsequent examinations. Thanks to the received metrics, we could work through the SLA and agree on what can be considered good or bad for the system’s capacity.
A load test does not give the complete picture on its own. It works as one of the analytical stages, allowing you to better grasp your system’s internal capacity. It supplies you with critical data and lays the foundation for your constant awareness of the capacity of your system and its potential weak spots.
There is no perfect and universal blueprint for a load test. That’s why starting this process with seasoned developers goes a long way toward a successful outcome!