It seems like the industry is leaving application performance management (APM) behind and moving towards a new observability world. But don’t be fooled. While vendors are rebranding themselves as observability tools, APM is still an important piece of the puzzle.
“Observability is becoming a bigger focus today, but APM just by design will continue to have a critical role to play in that. Think about observability holistically, but also understand that your applications, your user-facing applications and your back-end applications are driving revenue,” said Mohan Kompella, vice president of product marketing at the IT ops event correlation and automation platform provider BigPanda.
Because of the complexity of modern applications that rely on outside services through APIs and comprise microservices running in cloud-native environments, simply monitoring applications in the traditional way doesn’t cover all the possible problems users of those applications might experience.
“What’s important,” explained Amy Feldman, head of AIOps product marketing at Broadcom, “is to be able to take a look at data from various different aspects, to be able to look at it from the traditional bytecode instrumentation, which is going to give you that deep-level transaction visibility back into even legacy systems like mainframe, or TIBCO, or even an MQ message bus that a lot of enterprises still rely on.”
Further, as more applications are running in the cloud, Feldman said she’s seeing developers “starting to change the landscape” of what monitoring looks like, and they want to be able to have more control over what the output looks like. “So they’re relying more on logs and relying more on configuring it through APIs,” she said. “We want to be able to move from this [mindset of] ‘I’m just telling you what to collect from an industry and vendor perspective,’ to having the business be more in charge about what to collect. ‘This is the output, I want you to measure it, look at all the data and be able to assimilate that into that entire topological view.’”
APM, observability or AIOps?
Kompella explained there’s a lot of confusion in the market today because as vendors add more and more monitoring capabilities into their solutions, APM is being blended into observability suites. Vendors are now offering “all-in-one” solutions that provide everything from APM to infrastructure, logging, browser and mobile capabilities. This is making it even harder for businesses to find a solution that works best for them because although vendors claim to provide everything you need to get a deep level of visibility, each tool addresses specific concerns.
“Every vendor has certain areas within observability they do exceedingly well and you have to be really clear about the problem you’re trying to solve before making a vendor selection. You don’t want to end up with a suite that claims to do everything, but only gives you mediocre results in the one area you really care about,” Kompella said.
When looking to invest in a new observability tool, businesses and development teams need to ask themselves which specific areas or technologies they want to monitor, and where those are located. Are they on-premises or are they in the cloud? “That is a good starting point because it helps you understand if you need an application monitoring tool that’s built for microservices monitoring and therefore in the cloud, or if you still have a large number of on-premise Java-based applications,” Kompella explained.
Much of monitoring applications in the cloud is reliant upon the providers giving you the data you need. Feldman said cloud providers could give you information through an API, or deliver it through their monitoring tool. The APM solution has to be able to assimilate that information too.
While Feldman said the cloud providers haven’t always provided all the data needed for monitoring, she believes they’re getting better at it. “There’s definitely an opportunity for improvement. And in a lot of areas, you do see APM vendors also provide their own way to instrument the cloud… being able to install an agent inside of the cloud service, to be able to give you additional metrics,” she said. “But we’re seeing, I think, a little bit more transparency than we had in the past. And that’s because they have to be able to provide that level of service. And that trend toward a little more transparency helps to increase communication between the service and the provider.”
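Assimilating provider-supplied data, as Feldman describes, typically means mapping each cloud’s metric payload into one common record the APM tool can work with. The sketch below illustrates that normalization step; the payload shapes and field names are hypothetical stand-ins, not any provider’s actual API.

```python
# Hypothetical sketch: normalize metric payloads from different cloud
# providers into one common shape an APM tool can assimilate.
# The payload field names below are illustrative, not real provider APIs.

def normalize_metric(provider: str, payload: dict) -> dict:
    """Map a provider-specific metric payload to a common record."""
    if provider == "aws-style":
        # e.g. a CloudWatch-like datapoint (field names assumed)
        return {
            "name": payload["MetricName"],
            "value": payload["Average"],
            "unit": payload["Unit"],
            "ts": payload["Timestamp"],
        }
    if provider == "gcp-style":
        # e.g. a Cloud Monitoring-like point (field names assumed)
        return {
            "name": payload["metric"]["type"],
            "value": payload["points"][0]["value"],
            "unit": payload.get("unit", "1"),
            "ts": payload["points"][0]["interval"]["endTime"],
        }
    raise ValueError(f"unknown provider: {provider}")

record = normalize_metric("aws-style", {
    "MetricName": "CPUUtilization",
    "Average": 73.2,
    "Unit": "Percent",
    "Timestamp": "2021-03-01T12:00:00Z",
})
print(record["name"], record["value"])  # → CPUUtilization 73.2
```

Once every provider’s data lands in the same shape, it can feed the same topological view regardless of where a workload runs.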
BigPanda’s Kompella said the overarching driver of monitoring is to not just “stick your finger in the wind” and decide to measure whichever way the wind blows. You really have to understand your systems to figure out what metrics are going to matter to you. One way to do that is by analyzing what is generating revenue. Kompella went on to explain that you have to look at where you’ve had outages or incidents in the last couple of months, how they’ve impacted your revenue and rating, and then that will lead you to the right type of APM or observability tools that can help you solve those problems.
Additionally, businesses need to look at their services from the evolution of their technology stack. For instance, a majority of their applications may be on-premises today, but the company might have a vision to migrate everything to the cloud over the next three years. “You want to make sure that whatever investments you make in APM tools are able to provide you the deep visibility your team needs. You don’t want to end up with a legacy tool that solves your existing problems, but then starts to break down over the next few years,” said Kompella. “Technology leaders should judiciously analyze both what’s in the bag today versus what’s going to happen in the next few years, and then make a choice.”
Getting the big picture
Broadcom’s Feldman explained that a monitoring solution should give you perspective and context around what is happening, so having the traditional inside-out view of APM coupled with an outside-in perspective can aid in resolving issues when they arise. Such things as synthetic monitoring of network traffic, and real user monitoring of how applications are used, can provide invaluable insight into an application’s performance. She also noted that if the application is running in the cloud, you could use OpenTracing techniques to get things like service mesh information to understand what the user experience is for a particular cloud service.
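The core idea behind OpenTracing-style distributed tracing is that a trace ID generated at the edge is propagated across every service call, so spans recorded by different services can be stitched into one end-to-end view. The toy sketch below shows just the propagation step (here via a plain headers dict); the function and header names are illustrative, not a real tracing library’s API.

```python
# Toy illustration of trace-context propagation, the mechanism underlying
# OpenTracing-style distributed tracing. Names are illustrative only.
import uuid

def start_trace(headers: dict) -> dict:
    """Reuse an incoming trace ID, or start a new trace at the edge."""
    trace_id = headers.get("x-trace-id") or uuid.uuid4().hex
    return {"x-trace-id": trace_id}

def call_downstream(headers: dict, service: str) -> dict:
    """Simulate calling a downstream service, forwarding trace context."""
    ctx = start_trace(headers)
    # A real tracer would also record a span (timings, tags) here.
    return {"service": service, "trace_id": ctx["x-trace-id"]}

edge = start_trace({})                 # user request hits the edge
a = call_downstream(edge, "checkout")
b = call_downstream(edge, "inventory")
assert a["trace_id"] == b["trace_id"]  # both spans share one trace
```

In a service mesh, the mesh’s sidecar proxies can do this propagation and span recording automatically, which is how cloud services can expose user-experience data without code changes.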
Kompella added that log management and network performance monitoring (NPM) can help extend your monitoring capabilities. While APM tools are good at providing a deep dive of forensics or metrics, log traces help you go even deeper into what’s going on with your applications and services and help improve performance, he said.
Network performance monitoring is also extremely important because most large enterprises are working in very hybrid environments, where parts of their technology stacks live on-premises and in the private or public cloud. Additionally, many organizations have a multi-cloud strategy, with applications distributed across multiple cloud providers.
“Your technology stack is extremely fragmented and distributed across all these on-prem and cloud environments, which also means that understanding the performance of your network becomes super critical,” said Kompella. “You might have the most resilient applications or the best APM tools, but if you’re not closely understanding network traffic trends or understanding the potential security issues impacting your network, that will end up impacting your customer experience or revenue generating services.”
What is to come?
The reason monitoring strategies are becoming so important is that the pressure for digital transformation is that much greater today. A recent report from management consulting company McKinsey & Company found the COVID-19 crisis has accelerated digital transformation efforts by seven years.
“During the pandemic, consumers have moved dramatically toward online channels, and companies and industries have responded in turn. The survey results confirm the rapid shift toward interacting with customers through digital channels. They also show that rates of adoption are years ahead of where they were when previous surveys were conducted,” the report stated.
This means that the pressure to move or migrate to the cloud quickly is that much greater, according to Kompella, and as a result APM solutions have to be built for the cloud.
“Enterprises can no longer afford to look for APM tools or observability tools that just don’t work in a cloud-native environment,” he said.
Kompella also sees more intelligent APM capabilities coming out to meet today’s needs to move to the cloud or digitally transform. He went on to explain that APM capabilities are becoming very commoditized, so the differences between vendors are getting smaller and smaller. “Getting deep visibility into your applications has been largely solved by now. Companies need something to make sense of this tsunami of APM and observability data,” he said.
The focus is now shifting to bringing artificial intelligence and machine learning into these tools to make sense of all the data. “The better the AI or the machine learning is at generating these insights, the better it is at helping users understand how they’re generating these insights,” said Kompella.
“Every large company has similar problems, but when you start to dive in deeper, you realize that every company’s IT stack is set up a little bit differently. You absolutely need to be able to factor in that understanding of your unique topology in your unique IT stack into these machine learning models,” said Kompella.
The trouble with alerts
Alarms are a critical way to inform organizations of performance breakdowns. But alarm overload, and the number of false positives these systems kick off, has been a big pain point for those responsible for monitoring their application systems.
Feldman said this problem has existed since the beginning of monitoring. “This is a problem we’ve been trying to solve for at least 20 years, 20 plus years … we’ve always had a sea of alarms,” she said. “There have always been tickets where you’re not sure where the root cause is coming from. There’s been lengthy war rooms, where customers and IT shops spend hours trying to figure out where the problem is coming from.”
Feldman believes the industry is at a point now where sophisticated solutions, using new algorithmic approaches to datasets, give organizations the capability to understand dependencies across an infrastructure network. Then, using causal pattern analysis, you can understand the cause and effect of certain patterns to determine where your root cause is coming from.
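A minimal sketch of that dependency-aware idea: given a service dependency graph and the set of services currently alerting, prefer the alerting service that all the others (transitively) depend on. The graph and the selection rule below are illustrative only, not any vendor’s actual algorithm.

```python
# Illustrative sketch of dependency-aware root-cause analysis:
# alerts on "web" and "api" are likely symptoms if both depend on an
# alerting "db". Graph and rule are simplified for illustration.

# service -> services it depends on
deps = {
    "web": ["api"],
    "api": ["db", "cache"],
    "db": [],
    "cache": [],
}

def upstream(service: str) -> set:
    """All transitive dependencies of a service."""
    seen, stack = set(), list(deps.get(service, []))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(deps.get(s, []))
    return seen

def likely_root_cause(alerting: set) -> str:
    """Pick the alerting service every other alerting service depends on."""
    for candidate in alerting:
        others = alerting - {candidate}
        if others and all(candidate in upstream(o) for o in others):
            return candidate
    return "unknown"

print(likely_root_cause({"web", "api", "db"}))  # → db
```

Real products enrich this with timing, change records, and learned patterns, but the structural intuition is the same: alerts downstream of a failing dependency are symptoms, not causes.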
“I think we’re at a really exciting point now, in our industry, where those challenges that we’ve always seen for the last 20 years are something that we truly can accomplish today,” she said. “We can reduce the noise inside of the event stream to be able to show what really has the biggest impact on your business and your end users. We’re able to correlate the data to be able to recognize and understand patterns. ‘I’ve seen this before, therefore, this problem is a recurring problem, this is how you fix the problem.’”
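The simplest form of the noise reduction Feldman describes is deduplication: collapsing repeated alerts that share a fingerprint within a time window into a single incident with a count. The sketch below shows that step; the fields and window are illustrative assumptions, and production tools layer correlation and pattern recognition on top.

```python
# Illustrative sketch of event-stream noise reduction: collapse alerts
# sharing a fingerprint (host + check) within a time window into one
# incident. Field names and the 300-second window are assumptions.

def compress(alerts: list, window: int = 300) -> list:
    """Collapse alerts with the same (host, check) within `window` seconds."""
    incidents = []
    latest = {}  # fingerprint -> index of its most recent incident
    for a in sorted(alerts, key=lambda a: a["ts"]):
        key = (a["host"], a["check"])
        i = latest.get(key)
        if i is not None and a["ts"] - incidents[i]["last_ts"] <= window:
            incidents[i]["count"] += 1      # same incident, still in window
            incidents[i]["last_ts"] = a["ts"]
        else:
            latest[key] = len(incidents)    # new incident for this fingerprint
            incidents.append({"host": a["host"], "check": a["check"],
                              "first_ts": a["ts"], "last_ts": a["ts"],
                              "count": 1})
    return incidents

raw = [
    {"host": "db1", "check": "cpu", "ts": 0},
    {"host": "db1", "check": "cpu", "ts": 60},
    {"host": "db1", "check": "cpu", "ts": 120},
    {"host": "web1", "check": "latency", "ts": 30},
]
print(len(compress(raw)))  # → 2 (four raw alerts, two incidents)
```

Even this naive pass shrinks the “sea of alarms” before any correlation or machine learning is applied.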
AI and ML are key, Feldman said. “I think APM was probably one of the first industries to kind of adopt that. But now we’re seeing that evolution of where it’s taking off across multiple data sets, whether that’s cloud observability data sets, networking data sets, APM data sets, even mainframe and queuing-type information. All of that now is getting normalized and then used for your experience too. So all the information coming together is giving us a great opportunity.”