12 Ways to Improve App Performance
Agile development, DevOps, continuous integration/continuous delivery (CI/CD), continuous testing and more: All these practices are designed to improve software quality while reducing time to market. And, with all the tools available to developers, testers, architects, infrastructure engineers and others, it may seem as though uptime should be easier to achieve than it once was. However, the increasing complexity of software and our experiences as consumers and business users often tell a different story.
“The art of programming has changed in the last 10 to 20 years. [Back then,] it was more like engineering because people learned a lot of programming and approaches, but nowadays, the instruments and approaches tend to lower the bar of specialists who can contribute to software development,” says Max Belov, chief technology officer at Coherent Solutions, a digital product engineering company. “These people do not always understand the complexity of the systems, so software has lower quality as a result.”
There is also a cost to achieving higher software quality. For example, in highly regulated industries, organizations must meet certain quality and safety requirements to remain in compliance. Failure to do so can result in fines and lawsuits, not to mention reputational damage if something like a water level controller veers out of control.
On the other side of the spectrum are consumer apps that vary greatly in product quality. For example, eBay’s website is often down. There are far fewer outage reports about the mobile app, though it’s not perfect, either. When the website is down, web users can’t buy or sell anything and see only an error message, which is an unacceptable user experience for both buyers and sellers.
“For the longest time, I didn’t use Instagram, but I heard a lot about it, that it’s like wonderful, magical app,” says Belov. “Then, when I installed it, I was very disappointed. [The] design and usability are very poor, and performance is very bad.”
There are many other factors that cause apps to hang, underperform and crash. Following are some common examples:
1. Understand the level of reliability required for an application or service.
Are there specific SLAs or business outcomes that should be met? If so, it’s all about working backwards from measuring the app reliability at present against those targets or goals, and then minimizing incidents that risk or impact that reliability. According to Rob Skillington, co-founder and CTO of Chronosphere, a cloud-native observability platform, the best way engineering teams achieve this today is through the practice of monitoring their software stack both at the application level and at the infrastructure service level.
“One important factor is whether this service and application is hosted in the cloud or on-premises within your own infrastructure. However, either way, teams must measure the availability of the infrastructure and monitor how the application is operating within that environment,” says Skillington in an email interview. “Once you have monitoring in place, engineers need to take responsibility for being on call when there is an impact to the service and its reliability. Additionally, engineers should monitor for leading indicators that could degrade and lead to an incident if not properly investigated and remediated.”
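Working backwards from a target can be reduced to simple arithmetic. The following is a minimal sketch, assuming availability is measured as the share of successful requests and that the target is expressed as a fraction such as 0.999. The function name, field names and sample numbers are illustrative and do not come from Skillington or Chronosphere.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityReport:
    availability: float      # fraction of successful requests, 0.0 to 1.0
    allowed_failures: float  # failures the target permits at this request volume
    budget_remaining: float  # fraction of the error budget still unspent


def reliability_report(total_requests: int, failed_requests: int,
                       target: float) -> ReliabilityReport:
    """Compare measured availability against a target such as 0.999."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    availability = (total_requests - failed_requests) / total_requests
    allowed_failures = total_requests * (1.0 - target)
    # A negative value means the target has already been missed.
    budget_remaining = 1.0 - failed_requests / allowed_failures
    return ReliabilityReport(availability, allowed_failures, budget_remaining)


# Illustrative numbers only: one million requests, 400 failures, 99.9% target.
report = reliability_report(total_requests=1_000_000, failed_requests=400, target=0.999)
print(f"availability={report.availability:.4%}, "
      f"budget remaining={report.budget_remaining:.0%}")
```

A shrinking budget is one example of a leading indicator: it can prompt investigation before the target is actually breached.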
2. Avoid quality as an afterthought.
One of the biggest app performance pitfalls is failing to prepare the software in advance for monitoring and observability. Typically, the telemetry that software emits takes the form of basic event logs that are hard to analyze meaningfully at a high level or in aggregate, according to Skillington.
“Without the proper categorization and tagging mechanisms, you are faced with too much data to figure out which events in the stream of event log data actually relate to the issue you’re trying to solve,” says Skillington. “A lot of the time there’s not enough metadata on the stream of data to categorize events relating to an issue. When building an application, developers need to be thinking about structured logs and metrics that the application can expose to support diagnosis of problems and performance.”
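As a rough illustration of structured, tagged logs, the sketch below uses Python's standard logging module to emit one JSON object per event. The "tags" field, the logger name and the sample values are assumptions made for this example, not a convention Skillington describes.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed as extra={"tags": {...}} become a `tags` attribute
        # on the record; merge them in so events can be filtered and grouped.
        payload.update(getattr(record, "tags", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


# Hypothetical service name and request ID, for illustration only.
log = build_logger("checkout")
log.warning("payment retry",
            extra={"tags": {"service": "checkout", "request_id": "r-123", "attempt": 2}})
```

Because every event carries the same machine-readable keys, a log pipeline can select the events for one request or one service instead of searching free text.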
3. Do not underestimate the complexity of real-world environments.
Developers sometimes make faulty assumptions about production environments, so without proper testing, the app can fail in the wild, causing customer experience and loyalty issues.
“Throughout my career, including leading groundbreaking projects at General Motors and advising Fortune 200 companies, I’ve learned that one of the biggest pitfalls developers face is underestimating the complexity of real-world user environments,” says Timothy Bates, professor of practice, University of Michigan and former Lenovo CTO, in an email interview. “Comprehensive load testing and considering edge cases are crucial. In my experience, the weakest links often lie in inadequate error handling and lack of scalability planning. My advice: always design with scalability and resilience in mind to future-proof your applications.”
4. Remember to have failover before it’s too late.
Many enterprises have adopted multi-cloud strategies, in part, to ensure cloud application uptime and failover. Even if the app is running on-prem, having failover in place is wise.
“The best way to prevent performance issues is to right-size the environment to begin with. That means understanding usage patterns, understanding your business’ seasonality, and spending the extra time and money to over-provision and give yourself a buffer,” says Marcus Merrell, principal test strategist at Sauce Labs, a testing platform company. “The next best way to ensure consistent performance is to understand your backup plan: you need a failover environment (or cloud region). Your [application performance monitoring] should be set up so that it picks up signs of failure early and can switch to an alternate environment with little drama.”
Once those things are in place, it’s important to load-test the primary environment to the point of failure and make sure the failover works. This extra expense in time and operation is dwarfed by what a major outage would cost, Merrell says.
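The switching logic itself can be small. Here is a minimal sketch, assuming each environment exposes some health probe; the environment names and the dictionary of health results are placeholders, and a real setup would rely on the monitoring and traffic-routing tools already in use.

```python
from typing import Callable, Sequence


def pick_environment(environments: Sequence[str],
                     is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy environment, checked in priority order."""
    for env in environments:
        try:
            if is_healthy(env):
                return env
        except Exception:
            # A health check that raises is treated as a failed check.
            continue
    raise RuntimeError("no healthy environment available")


# Hypothetical health results standing in for real probes.
status = {"primary-region": False, "failover-region": True}
active = pick_environment(["primary-region", "failover-region"],
                          lambda env: status[env])
print(active)  # failover-region
```

Exercising this path during a load test, rather than for the first time during an outage, is the point Merrell makes about verifying that failover works.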
5. Do not over-rely on automation.
Modern software delivery involves considerable automation, which is necessary given how complex software is these days. According to Coherent Solutions’ Belov, the most important advantages of automating routine tasks, such as testing, deployment, security checks, monitoring, and disaster recovery, are the avoidance of human error and the necessity to develop thorough scenarios.
“We are all humans, and when doing things manually, we are prone to mistakes just because we may get tired, get distracted, miss error messages, misinterpret error codes, or just have no time to complete a task. Automated scenarios allow us to avoid these,” says Belov. “Automated testing is indeed required for checking what we know may fail. However, there often are more unknown weak spots in your product than those you are aware of. Having proper exploratory testing is the key to finding and eliminating them, so, do not forget that you need a team of experienced QA engineers to do that, do not rely solely on automated testing.”
6. Do not treat security as an afterthought.
Oftentimes, the base configuration for code, build, and cloud infrastructure isn’t inherently secure. Unless developers verify that these tools are configured properly, they can inadvertently introduce risk into an organization by creating infrastructure that the security team cannot see and therefore cannot protect with proper security controls, according to Joe Nicastro, field CTO at Legit Security, an application security posture management platform.
“Most of the time, developers have very expansive permissions into the environments that they work in and continue to collect these permissions as they move from project to project. This can lead to developers that are compromised giving attackers excessive access to critical infrastructure within an organization,” says Nicastro. “More often than not, build systems are [created] with default configurations and excessive permissions within an environment. This can lead to leakage of IP, PII or other types of data should an attacker get into these systems. Additionally, this can also lead to malicious attackers using these compromised systems to poison an organization’s artifacts before they are sent downstream to their clients or customers.”
7. Beware of poisoned open source.
Organizations are using open-source software much more often than in the past. While security concerns originally kept enterprises from using it, the evolution of apps and user expectations means that developers need fast access to pre-built code. Now, however, bad actors are intentionally poisoning open-source projects, according to Nicastro.
“Open-source software is becoming a major attack vector for applications and development environments,” says Nicastro. “Things like namespace confusion and dependency attacks [such as] the XZ backdoor can cause a developer to put organizations at risk when accidentally bringing in the wrong package. Understanding open-source package hygiene and validating all packages before bringing them into a product [are] a must.”
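One basic form of package validation is checking a downloaded artifact against a digest recorded when the package was reviewed. The following is a minimal sketch of that check using Python's standard library; the function names and sample bytes are illustrative, and it addresses only tampering with a known artifact, not the wider set of attacks Nicastro mentions.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """True only if the artifact's digest matches the pinned value."""
    return hmac.compare_digest(sha256_hex(data), pinned_sha256.lower())


# Illustrative bytes; in practice the pinned digest is recorded at review time
# and stored separately from the download source.
reviewed = b"example package contents"
pinned = sha256_hex(reviewed)

print(verify_artifact(reviewed, pinned))                # True
print(verify_artifact(reviewed + b" tampered", pinned)) # False
```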
8. Prioritize software defects.
Security, reliability, and performance bugs are all too common. When attempting to achieve sustainable app performance, it’s not about eliminating bugs, but instead fixing the ones that often occur and have the most significant impact, according to David Brumley, cybersecurity professor at Carnegie Mellon University.
“Prioritization is critical, not just within development teams but across different functions. This means ensuring that development teams build with the customer in mind and meet customer needs,” says Brumley in an email interview. “Test-driven development and practices like secure by design can help achieve this by setting acceptance criteria at the beginning of development. From a technical standpoint, developers should embrace adversarial testing techniques: writing unit tests for the ‘happy path’ and using approaches like fuzz testing or chaos engineering to test the ‘what ifs’ and edge cases.”
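To make the fuzz-testing idea concrete, here is a deliberately simple sketch that feeds random strings to a function and checks that it either returns a valid result or fails in the one expected way. The function under test, parse_quantity, is hypothetical, and dedicated fuzzing tools are far more thorough than this random loop.

```python
import random
import string


def parse_quantity(text: str) -> int:
    """Hypothetical function under test: parse an order quantity from 1 to 999."""
    cleaned = text.strip()
    if not cleaned.isascii() or not cleaned.isdigit():
        raise ValueError(f"not a whole number: {text!r}")
    value = int(cleaned)
    if not 1 <= value <= 999:
        raise ValueError(f"out of range: {value}")
    return value


def fuzz_parse_quantity(iterations: int = 5_000, seed: int = 0) -> int:
    """Feed random strings to parse_quantity; return how many were rejected.

    Any exception other than ValueError propagates and fails the run.
    """
    rng = random.Random(seed)
    # Printable ASCII plus a few non-ASCII digit-like characters.
    alphabet = string.printable + "٣½²"
    rejected = 0
    for _ in range(iterations):
        length = rng.randint(0, 6)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            result = parse_quantity(candidate)
        except ValueError:
            rejected += 1
        else:
            assert 1 <= result <= 999, candidate
    return rejected


# Happy-path unit test alongside the fuzz run.
assert parse_quantity(" 42 ") == 42
print("rejected inputs:", fuzz_parse_quantity())
```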
9. Don’t limit thinking to obvious errors.
To ensure a service has high availability and high user satisfaction, it’s critical for teams and businesses to continuously ask, “What if?” For instance, what if a first- or third-party service goes down? Rather than showing a default error page and causing users to lose their work, consider whether they’d prefer an automated retry on their behalf. If that still fails, consider how to preserve their state and let them know that something is wrong and that you’ll try again later, according to Matt Machuga, senior director of engineering at global employment solution provider Oyster.
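The retry-then-preserve pattern described above can be sketched as follows. This is a minimal example under stated assumptions: failures surface as ConnectionError, and the names call_with_retry, save_draft and the pending list are invented for illustration rather than taken from Oyster's systems.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T], attempts: int = 3,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; the caller decides what to do next
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def save_draft(submit: Callable[[str], None], draft: str, pending: List[str],
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Submit the draft; if retries are exhausted, keep it for a later attempt."""
    try:
        call_with_retry(lambda: submit(draft), sleep=sleep)
        return "saved"
    except ConnectionError:
        pending.append(draft)  # preserve the user's work instead of discarding it
        return "queued for retry"
```

The returned status gives the interface something honest to show the user, while the pending list holds the work until the service recovers.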
“You can go one step further and recognize something is wrong with a service and gracefully degrade the service, or disable that functionality, until the downstream service comes back to a healthy state,” says Machuga in an email interview. “These little wins can be taken at a micro-level, or expanded to a macro-level where your company has an always-warm failover environment ready to serve traffic if anything happens to your primary. The needs for each business are different, so spend time considering a healthy balance of cost and effectiveness for your user-base.”
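Disabling functionality until a downstream service recovers is often implemented with a circuit breaker. The sketch below is one simplified version of that general pattern, not Machuga's implementation; the threshold, cool-down period and fallback are assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                return fallback()  # circuit open: skip the unhealthy dependency
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A typical fallback returns cached or reduced content, so the rest of the page keeps working while one feature is temporarily switched off.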
Also ensure the application has thorough telemetry and tracing baked in because it’s easy to get tripped up by red herrings during incidents. Being able to click a button in a telemetry platform to correlate multiple anomalous readings that match up with something identified can help teams rapidly determine if something is a fluke or a trend before customers do.
10. The happy path can cause angst.
To ensure consistent and sustainable app performance, it is important to measure the entire stack under real-world conditions rather than focusing on just the back-end services under optimal conditions. Matt Dolan, mobile development lead at IT services and consulting firm AND Digital, says the importance of strong communication between the front-end and back-end teams should not be underestimated, as it can help minimize data usage, which can improve download reliability and performance on sub-optimal network connections. If all else fails, consider techniques like loading shims to enhance perceived performance.
“Development teams often fall into the trap of focusing on the happy path, overlooking critical edge cases and failure points in requirements and testing,” says Dolan in an email interview. “Automated tests frequently neglect to simulate real-world conditions like network failures, which are essential for understanding app responsiveness under adverse situations.”
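Simulating a network failure in an automated test does not require a real outage. Below is a minimal sketch using Python's unittest.mock to make a fetch call time out; load_profile and its return shape are hypothetical application code written only for this example.

```python
from typing import Any, Callable, Dict
from unittest import mock


def load_profile(fetch: Callable[[str], Dict[str, Any]], user_id: str) -> Dict[str, Any]:
    """Hypothetical app code: return profile data, or a placeholder when offline."""
    try:
        return {"status": "ok", "profile": fetch(user_id)}
    except (ConnectionError, TimeoutError):
        return {"status": "offline", "profile": None}


def test_load_profile_happy_path() -> None:
    fetch = mock.Mock(return_value={"name": "Ada"})
    assert load_profile(fetch, "u-1") == {"status": "ok", "profile": {"name": "Ada"}}


def test_load_profile_survives_timeout() -> None:
    # side_effect makes the mock raise, standing in for a dropped connection.
    fetch = mock.Mock(side_effect=TimeoutError("simulated timeout"))
    assert load_profile(fetch, "u-1") == {"status": "offline", "profile": None}
    fetch.assert_called_once_with("u-1")


test_load_profile_happy_path()
test_load_profile_survives_timeout()
print("both paths covered")
```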
Many organizations neglect the maintenance phase, focusing more on feature development and ignoring the long-term lifecycle of apps, which includes updating dependencies and addressing technical debt. Regularly updating dependencies and staying current with the latest tooling can enhance security, performance, and overall application efficiency.
11. Design for uptime.
Maximizing availability is crucial for any software application to deliver uninterrupted operations, maintain customer satisfaction and maximize revenue opportunities. Ninety-eight percent of companies experienced costs exceeding $100K per hour due to application downtime. Additionally, one out of every six outages results in costs exceeding $1 million. These expenses are associated solely with productivity and data loss. Even more losses can be attributed to reputation damage and churn. Therefore, maximizing uptime plays a key role in determining the success of a business, according to Luqman Saeed, Jakarta EE expert at Payara Services, an open-source application server provider.
“High application availability should be considered from the early design phase, in line with Quality by Design principles,” says Saeed in an email interview. “Subject matter experts should define the key features and capabilities that should be included as well as suitable downtime mitigation strategies. During this stage, it is extremely important to consider the business requirements that the application needs to address as well as the operational conditions.”
Developers should work closely with architecture specialists and SMEs with extensive know-how on the topic to identify the most effective solution. It is also important to consider what average and peak demand will look like to determine the resources needed. Strategies for horizontal and vertical scaling should be considered to handle varying loads efficiently.
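A first-pass estimate of the resources needed for horizontal scaling can start from simple arithmetic. The following is a rough sketch, assuming demand is measured in requests per second and each instance has a known sustainable throughput; the 25% headroom, the two-instance minimum and the sample figures are illustrative assumptions, not recommendations from the experts quoted here.

```python
import math


def instances_needed(peak_rps: float, per_instance_rps: float,
                     headroom: float = 0.25, minimum: int = 2) -> int:
    """Estimate the instance count for a given load plus a safety buffer."""
    if peak_rps < 0 or per_instance_rps <= 0 or headroom < 0:
        raise ValueError("loads must be non-negative and capacity positive")
    required = math.ceil(peak_rps * (1.0 + headroom) / per_instance_rps)
    # Keep at least `minimum` instances so one failure does not take the app down.
    return max(minimum, required)


# Illustrative figures: 300 requests/second on average, 1,200 at peak,
# and 150 requests/second per instance.
print("average:", instances_needed(300, 150))   # 3
print("peak:", instances_needed(1200, 150))     # 10
```

The gap between the average and peak figures is what an autoscaling policy has to cover; load testing is still needed to confirm the per-instance number.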
12. Have a proper DevOps practice.
Many of the pitfalls developers face when addressing app performance concerns can be handled by adopting proper DevOps practices, according to Chris Chapman, CTO at MacStadium, a private Mac cloud provider. This means structuring the development practice so it includes planning, integrated testing and deployment, consistent and rapid updates, integrated security, reviewed and maintained code, and high levels of collaboration between developers, product teams, operations groups, and security.
“Without these processes and practices in place the problems of consistency and performance, amongst a litany of others, can become overwhelming,” says Chapman in an email interview. “Disorganized teams regularly fumble with scope creep trying to attack large problems, with slow or inconsistent improvements without predictable updates and maintained code, with poor user experiences because of a lack of feedback loops, and with errors and bugs because of a lack of testing and validation.”