This past fall, Juliana Vida from Splunk wrote a piece exploring the value that full observability can bring to government IT organizations as they strive to achieve end-to-end visibility across their entire hybrid technology landscape, including both on-premises and fully cloud-based environments. Indeed, observability, or the ability to measure and assess systems’ internal health based on their external data outputs, can give government IT teams the clarity and context they need to manage performance and troubleshoot issues (often in advance) across their vast and growing IT empires.
We agree wholeheartedly with Ms. Vida’s assessment, with the exception of one key finding from a recent Splunk report, which she describes as “unfortunate”: “Nearly 80% of public sector organizations are only at the beginning stages of understanding and deploying observability technologies.” On the contrary, we think this is a good thing: the observability landscape is changing so rapidly that government IT teams now have an opportunity to deploy a better way of doing things.
The observability paradigm is being flipped on its head
As IT environments become more expansive – comprising both on-premises and third-party cloud environments – the volume of external data being generated is becoming virtually impossible for IT teams to keep up with. We see this most prominently with resource-constrained teams in the government sector. However, IT teams across industries often find themselves “drowning in data,” lacking the ability to quickly and cost-effectively ingest, harness and make sense of all their data. Data is supposed to be the key to enabling IT teams to detect and fix problems much faster, thus reducing the all-important mean time to repair (MTTR). But ironically, as more data is being created, recent DevOps practitioner surveys reveal that the average MTTR is actually increasing.
In this sense, observability as we have known it to date no longer works, and a new approach is required. As noted, the silver lining for government IT teams is that they have an opportunity to address observability in a fresh way, with a method that will actually serve them well rather than burying them in an overwhelming deluge of data. Here’s what this new approach looks like:
Moving from “big data” to “small data” – Traditional observability approaches have entailed pooling all data in a central repository for analysis, on the understanding that data becomes contextually richer the more of it is kept and analyzed together. The problem with this approach is that all data is relegated to hot storage tiers, which are exceedingly expensive. It is far more common than one would expect for an IT team to unknowingly exceed a data limit and get hit with a huge, unexpected bill as a result. Even though an organization may never use the vast majority of its observability data, the solution is not to indiscriminately discard certain data sets, as this often introduces significant blind spots. Rather, the solution is to analyze all data in smaller, bite-sized chunks.
Analyzing data at its source, in real time – Besides cost, another drawback of the “centralize and analyze” approach described above is that it takes time to ingest and index all this data – time that an organization may not have if a mission-critical system is down, with downtime costing anywhere from $250,000 per hour to $25,000 per minute. A much better approach entails analyzing data for anomalies at the source, as it’s being created in real time. Not only does this allow IT teams to detect issues faster, but it also helps them immediately identify the root cause, based on which systems and applications are throwing the errors. When data is analyzed in real time at the source, the issue of data limits becomes moot and, in many cases, so too does the case for central repositories.
Being able to quickly and reliably access any and all data needed for troubleshooting – Analyzing data at its source, as it’s being created, means an organization can keep an eye on all data, even if not all of it is ultimately sent to a high-cost storage tier. This is important because there are going to be times when access to all of this data is needed. Developers should have access to all their datasets regardless of the storage tier they’re in, and they should be able to get hold of them easily, without having to ask the SREs and operations team members who often serve as gatekeepers in the central repository model.
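To make the idea of analyzing data at its source more concrete, here is a minimal sketch in Python. It is purely illustrative and not taken from any particular product: the class and function names, the log format (a line containing " ERROR ") and the thresholds are all assumptions. It counts errors in small batches of log lines as they are produced, compares each count with a short rolling history, and flags only the batches that spike, so the raw data never has to be shipped to a central repository just to be checked.

```python
from collections import deque
from statistics import mean, pstdev


class SourceAnomalyDetector:
    """Hypothetical detector that runs next to the system producing the logs."""

    def __init__(self, window=30, threshold=3.0, min_history=5):
        # Only a small rolling window of recent error counts is kept in memory.
        self.history = deque(maxlen=window)
        self.threshold = threshold
        self.min_history = min_history

    def observe(self, error_count):
        """Return True if error_count spikes well above the recent history."""
        anomalous = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            # Floor the spread at 1.0 so a perfectly flat history
            # does not turn every tiny change into an alert.
            sigma = max(pstdev(self.history), 1.0)
            anomalous = error_count > mu + self.threshold * sigma
        if not anomalous:
            # Spikes are kept out of the baseline so they do not mask later ones.
            self.history.append(error_count)
        return anomalous


def count_errors(lines):
    """Count lines that look like errors (assumed format: contains ' ERROR ')."""
    return sum(1 for line in lines if " ERROR " in line)


def process_batches(batches, detector):
    """Return (batch_index, error_count) for each batch flagged as anomalous."""
    alerts = []
    for index, lines in enumerate(batches):
        errors = count_errors(lines)
        if detector.observe(errors):
            alerts.append((index, errors))
    return alerts
```

In a setup like this, only the flagged batches (or compact summaries of them) would need to go to an expensive hot tier, while the full raw data could be kept in cheaper storage for later troubleshooting.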
In summary, it may be true that government organizations are nascent observability adopters. But at this juncture in the evolution of observability, that may be a positive, because observability as we know it is changing dramatically and newcomers have a unique opportunity to immerse themselves immediately in a fresh, modernized way of collecting, analyzing and correlating data. Forgoing central repositories, scrutinizing all data at the source in real time, and cost-effectively keeping all data so it’s there if and when it’s needed: together, these represent the future of observability and a great starting point for government IT teams setting their observability strategies in motion.