Insight by Splunk

Network monitoring gives way to observability

Monitoring of enterprise networks and the assets on them has long been a foundational activity for understanding cyber threats. Now, the concept of observability is supplanting monitoring. That’s according to Mala Pillutla, global vice president of observability at Splunk.

“What we believe at Splunk is, observability is an evolution of monitoring,” Pillutla said. More than a substitution of terms, observability implies deeper knowledge of what’s happening on a network coupled with the capability of taking action. Observability becomes more important as agencies move from traditional transactions to digital services, Pillutla said.

“When you’re dealing with issues of digital systems that have become mission critical,” she added, “those systems don’t just support your business, but are actually a direct line of engagement with your customers.”

Observability Overview

“Observability matters to what I call the application development teams, and the traditional IT ops teams, and also the security teams.”

When an interruption strikes, having observability measures in place will help the organization quickly pinpoint the cause – for example, a denial of service attack, a poorly configured application programming interface, or a demand spike.

In short, observability is key to delivering highly resilient applications.

Changes in how applications are fielded, driven by cloud adoption, also fuel the need for observability in places where traditional monitoring can’t “see.” At one time, applications ran as run-time code on agency servers. Then came virtualization. Now the trend, Pillutla noted, is toward containerization of microservices.

Given all of the functionality within a container and the fact that containers interact via APIs, visibility into applications built this way can be a challenge.
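
How that visibility is achieved is beyond the scope of the interview, but a common industry pattern is to propagate a trace context along with every API call so that spans emitted by different containers can be stitched into a single picture of the request. The sketch below is a hypothetical illustration using the open-source OpenTelemetry Python API, which Pillutla did not reference; the service names and the in-process stand-in for an HTTP call are invented.

# Hypothetical sketch: distributed tracing across two "containerized" services.
# The orders/inventory split and service names are illustrative only.
from opentelemetry import trace
from opentelemetry.propagate import inject, extract
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Print spans to stdout; a real deployment would export to a collector
# or an observability backend instead.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo")

def orders_service() -> dict:
    """'Container A': starts a trace and calls the inventory service over an API."""
    with tracer.start_as_current_span("POST /orders") as span:
        span.set_attribute("http.route", "/orders")
        headers = {}
        inject(headers)  # adds the W3C traceparent header to the outgoing call
        return inventory_service(headers)  # stands in for a real HTTP request

def inventory_service(headers: dict) -> dict:
    """'Container B': continues the same trace using the propagated context."""
    ctx = extract(headers)
    with tracer.start_as_current_span("GET /inventory", context=ctx) as span:
        span.set_attribute("http.route", "/inventory")
        return {"in_stock": True}

if __name__ == "__main__":
    orders_service()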

The Industry Perspective on Observability Adoption

“As the digital transformation accelerates, so too does the observability competency of the customers we’ve come across.”

“So you need a technology that can actually span through all of these containers in a highly distributed environment, and help identify what your application is experiencing,” Pillutla said. She cited one commercial customer that generated 20 million web requests per day in a highly complex microservices application hosted both in a cloud and in the company’s data center.

“What we provided is that full fidelity visibility that allowed them to find that needle in that haystack,” Pillutla said.

The implication is that generating logs alone is insufficient to stay on top of cybersecurity and application performance. Pillutla said logs must be accompanied by metrics and traces of application activity.

“I don’t think logs themselves are enough,” she said. “We need to add metrics and traces to get to that granularity, then correlate those metrics with traces and logs.” The result is what she called a troubleshooting workflow connecting monitoring, trouble prevention and fast mitigation of issues that do come through. Add a dash of artificial intelligence, and the observability elements enable a predictive capability, so the organization can prevent performance degradations or outages.
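
To make the combination of signals concrete, the sketch below shows one way a single request might emit all three, again using the OpenTelemetry Python API and the standard logging module rather than anything Pillutla named. The span is the trace, the counter is the metric, and the log line carries the span’s trace ID so the three can be correlated later; the “checkout” service and metric names are invented, and a production service would configure full SDK exporters rather than the console output shown here.

# Hypothetical sketch: one request emitting a trace, a metric and a log
# that share a trace ID. Service and metric names are invented.
import logging
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Print spans to stdout; a real service would export to a collector or backend.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("checkout")

tracer = trace.get_tracer("checkout")
meter = metrics.get_meter("checkout")  # no-op unless a MeterProvider is configured
request_counter = meter.create_counter(
    "http.server.requests", unit="1", description="Requests handled"
)

def handle_request(route: str) -> None:
    # Trace: one span per request gives the granular, per-call view.
    with tracer.start_as_current_span("handle_request") as span:
        span.set_attribute("http.route", route)
        trace_id = format(span.get_span_context().trace_id, "032x")

        # Metric: aggregated counts reveal trends no single log line shows.
        request_counter.add(1, {"http.route": route})

        # Log: carrying the trace ID lets this line be joined back to the span.
        log.info("request handled route=%s trace_id=%s", route, trace_id)

handle_request("/orders")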

Security and observability are also converging, Pillutla said.

“Our vision is to have that unified security and observability platform to drive resiliency,” she said.

With observability coupled to security and performance, agencies can prepare for contingencies as effectively as, say, Amazon prepares for Black Friday sales.

Pillutla said that observability is an evolving class of enterprise IT product. She cautioned agencies to avoid acquiring an unwieldy library of tools. Agencies often operate separate tools for logging, network monitoring and application performance monitoring. She cited one company with at least four monitoring tools producing multiple dashboards but no correlation among the data sets they produced.

“So there was a lot of time lost during reconciling these dashboards and data,” she said, “and slowing down ultimately what they were trying to do, trying to prevent an outage.”

Listen to the full show:
