Enhancing federal security: The vital role of observability in cyber defense

By implementing an infrastructure for observability, federal security teams can make better decisions about access and identity-based threats.

In an era marked by escalating cyber threats and sophisticated adversaries, the federal security domain faces unprecedented challenges in safeguarding national assets and critical infrastructure. Observability has emerged as a critical component in this landscape, providing unparalleled visibility into system operations and enabling timely detection and response to security incidents.

Observability in federal cybersecurity

When it comes to security, you can’t protect what you can’t see. That’s why organizations that can visualize and understand their data are in a much better position to thwart cyberattacks and breaches. Observability is the best way for businesses to change how they detect and remediate cyberattacks — so much so that the observability market is expected to reach $2 billion by 2026.

While observability isn’t a mainline discussion in the identity security space, it’s an essential piece of the puzzle, shining a light on the attack surface that allows teams to identify and prevent breaches. By layering observability with identity management, security teams have access to more data on identity-based threats, and fewer silos to break down as they race to identify and prevent attacks.

The identity attack surface

Observability is especially crucial in managing the identity attack surface. The sprawl of applications and systems employees are connected to is increasing exponentially, and security teams need specific information to determine which kinds of access are legitimate and which are risky.

Establishing a baseline of normal

Observability helps to establish a baseline of “normal behavior,” enabling identity and access management (IAM) systems to use data to make valuable decisions that protect business operations. This strategy, known as behavior-driven governance, draws on granular data about how people actually use their identities and access privileges.
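As a rough illustration of what using that granular data can look like, the Python sketch below compares the access each user has been granted with the access they were actually observed using, and lists the difference as candidates for review. The event tuples, the `granted` mapping and the `unused_entitlements` function are hypothetical names invented for this example; a real IAM system would expose this data in its own way.

```python
# Hypothetical access events pulled from logs: (user, entitlement) pairs.
events = [
    ("alice", "payroll-db"),
    ("alice", "payroll-db"),
    ("alice", "hr-portal"),
    ("bob", "hr-portal"),
]

# Hypothetical entitlements currently granted to each user.
granted = {
    "alice": {"payroll-db", "hr-portal", "admin-console"},
    "bob": {"hr-portal", "payroll-db"},
}


def unused_entitlements(events, granted):
    """Return, per user, the granted entitlements never seen in the events."""
    used = {}
    for user, entitlement in events:
        used.setdefault(user, set()).add(entitlement)
    return {
        user: sorted(rights - used.get(user, set()))
        for user, rights in granted.items()
    }


# Access that was granted but never exercised is a candidate for review.
review_candidates = unused_entitlements(events, granted)
print(review_candidates)
```

In practice the observation window matters: access that goes unused for a week may still be legitimate, so the events would normally cover a period long enough to reflect real working patterns.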

Key components of observability

Three types of data matter the most in setting a baseline:

  • Metrics: Quantifying performance, including key performance indicators (KPIs) such as response time, error rates and alerts.
  • Traces: Allowing IT teams to locate the source of an alert (e.g., which part of a login process is vulnerable to bugs).
  • Logs: Answering the who, what, where, when and how of access activities with contextual event information.

For example, if the company only has U.S. employees and North American suppliers, and there’s a login attempt from Singapore, it’s easier to log that as a red flag and investigate. Better observability into data and the patterns associated with it can help businesses detect potential breaches quickly and efficiently.

To get the most out of observability, these three types of data should be used together to gain an overall understanding of the identities a business manages.
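The Singapore example above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the `LoginEvent` record, its field names and the two helper functions are assumptions made for this example, and it presumes that each login has already been resolved to a country code upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    """Hypothetical log record; field names are illustrative only."""
    user: str
    country: str  # two-letter country code, assumed resolved from the source IP
    success: bool


def build_country_baseline(history):
    """Baseline of 'normal': every country with at least one successful login."""
    return {event.country for event in history if event.success}


def is_out_of_baseline(event, baseline):
    """Flag any login attempt from a country outside the baseline."""
    return event.country not in baseline


# U.S. employees and North American suppliers, as in the example above.
history = [
    LoginEvent("alice", "US", True),
    LoginEvent("supplier-7", "CA", True),
    LoginEvent("bob", "MX", True),
]
baseline = build_country_baseline(history)

attempt = LoginEvent("alice", "SG", False)
flagged = is_out_of_baseline(attempt, baseline)
print(f"{attempt.user} from {attempt.country}: flagged={flagged}")
```

A set of countries is the simplest possible baseline. Combining it with metrics (such as failed-login rates) and traces (which step of the login flow was reached) would give an investigator more context than the country alone.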

Insights from the 2024 Report on the Cybersecurity Posture of the United States

The 2024 Report on the Cybersecurity Posture of the United States references concepts related to observability and monitoring within IT systems, particularly in the context of cybersecurity. Here are some extracts and analyses that highlight this:

  1. Logging and monitoring:
  • The Defending Federal Networks section mentions:
    • “[The Cybersecurity and Infrastructure Security Agency, Office of Management and Budget and Office of the National Cyber Director (ONCD)] have engaged with industry to determine the feasibility of providing enhanced logging capabilities to Federal civilian executive branch agencies.”
    • “CISA’s Persistent Access Capability, made possible through the widely adopted [endpoint detection and response (EDR)] initiative born out of Executive Order 14028, facilitates real-time threat intelligence sharing.”

These excerpts indicate a focus on improving logging capabilities and real-time monitoring, which are critical components of observability in IT systems.

  2. Incident response:
  • The Improving Incident Preparedness and Response section mentions:
    • “In response to significant incidents and malicious cyber activity, Cybersecurity Advisories (CSA) provide critical infrastructure owners and operators and other entities with timely guidance to detect and respond to threats.”

This highlights the role of monitoring in detecting threats and guiding responses, which aligns with observability practices.

  3. Cyber analytics and data system:
  • The Future Outlook section mentions:
    • “To create a common operating landscape for cybersecurity, CISA is developing a new Cyber Analytics and Data System. This infrastructure will be used to integrate cybersecurity data sets; provide internal tools and capabilities to facilitate the ingestion and integration of data; and orchestrate and automate the analysis of data to support the rapid identification, detection, mitigation, and prevention of malicious cyber activity.”

This is a direct reference to building an integrated system for observability, where data ingestion, integration and analysis are automated to enhance cybersecurity.

  4. Zero trust architecture:
  • The Defending Federal Networks section mentions:
    • “The Administration crafted a cybersecurity modernization agenda to invest in systems that support collective defense and created the first strategy to adopt Zero Trust Architecture (ZTA) across the Federal civilian enterprise.”

Zero trust architecture relies heavily on continuous monitoring and validation of user behavior, which is a core principle of observability.

  5. Supply chain exploitation:
  • The Supply Chain Exploitation section mentions:
    • “Hybrid deployments, in which organizations use both locally hosted systems and cloud assets, can introduce complex centralized logging and authentication regimes, creating opportunities for malicious actors to evade detection and abuse identity management systems.”

This points to the importance of centralized logging and monitoring in detecting and mitigating supply chain threats.

Overall, the report emphasizes various aspects of observability and monitoring in IT systems as part of a broader cybersecurity strategy. Its focus on logging, real-time monitoring, incident response and zero trust architecture is indicative of an observability-centric approach to managing and securing IT environments.

Case study: Netflix

Take Netflix as an example. The company embarked on a plan to crack down on password sharing to stop users from accessing the app from devices not associated with their home network. While the initial announcement generated significant user backlash, Netflix has continued to explore and refine its approach rather than completely walking back the plan. This provides an interesting case study for how to use observability for identity security.

Netflix’s primary challenge was to accurately identify and manage unauthorized access without disrupting legitimate user activity. To achieve this, Netflix would have needed to process and visualize a vast amount of user data to gain full observability into user behaviors and patterns. This would involve:

  • Data collection: Gathering detailed metrics, logs and traces related to user activities. This includes login times, IP addresses, device types and usage patterns.
  • Behavioral analytics: Using machine learning algorithms to analyze the collected data and establish a baseline of normal user behavior. This allows Netflix to detect anomalies indicative of password sharing, such as multiple logins from geographically distant locations within a short time frame.
  • Real-time monitoring: Implementing real-time monitoring systems to continuously track user activities and detect suspicious behavior as it happens. This enables prompt action to mitigate potential security threats.
  • User notifications: Developing mechanisms to notify users of unusual activity and verify their identity through multi-factor authentication (MFA). This step is crucial to ensure that legitimate users are not unfairly penalized.
  • Policy adjustments: Using insights gained from observability to refine and adjust access policies dynamically. This includes setting limits on the number of devices that can access an account simultaneously and restricting logins from specific regions.
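The anomaly mentioned in the behavioral analytics step, logins from geographically distant locations within a short time frame, is often called an "impossible travel" check. The Python sketch below shows one simple way to express it, using the haversine great-circle distance and an assumed speed ceiling. It is an illustration of the general technique only, not a description of Netflix's actual systems; the function names, the login tuple layout and the 900 km/h threshold are all assumptions for this example.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
# Assumed ceiling: roughly the cruising speed of a commercial airliner.
MAX_PLAUSIBLE_SPEED_KMH = 900.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_impossible_travel(first, second, max_speed_kmh=MAX_PLAUSIBLE_SPEED_KMH):
    """Each login is a (timestamp, latitude, longitude) tuple."""
    (t1, lat1, lon1), (t2, lat2, lon2) = first, second
    hours = abs((t2 - t1).total_seconds()) / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        # Simultaneous logins: any separation at all is implausible.
        return distance > 0
    return distance / hours > max_speed_kmh


new_york = (datetime(2024, 5, 1, 12, 0), 40.7128, -74.0060)
singapore = (datetime(2024, 5, 1, 13, 0), 1.3521, 103.8198)
newark = (datetime(2024, 5, 1, 13, 0), 40.7357, -74.1724)

print(is_impossible_travel(new_york, singapore))  # an hour apart, half a world away
print(is_impossible_travel(new_york, newark))     # an hour apart, neighboring cities
```

IP-based geolocation is imprecise, and VPNs or mobile carriers can make a legitimate user appear to jump between locations, so a result like this is better treated as one signal that triggers verification (the user notification step above) than as proof of misuse.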

Netflix has invested heavily in building its own in-house observability tools to meet these needs, such as Atlas for time-series data storage and visualization. The ability to visualize and understand user data in real time allows for proactive threat detection and mitigation, ultimately enhancing identity security without compromising user experience.

Best practices for a data-first observability framework

To set up a data-first observability framework, put the following in place:

  • Key observability metrics based on organizational business priorities.
  • Executive buy-in to, and organization-wide education on, a culture of observability, data access and governance.
  • A pipeline to centralize and standardize data sources that can be used to identify baseline and “abnormal” behavior.
  • Analytics tools and automated processes to sort through the noise of alerts.
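The last two items, a pipeline that standardizes data sources and automation that cuts through alert noise, can be sketched together. In the Python example below, two hypothetical log formats (a VPN gateway and a single sign-on service, both invented for illustration) are mapped onto one common schema, after which a simple count surfaces only the users with repeated failures. The field names, the `from_vpn_log` and `from_sso_log` adapters and the threshold of three are assumptions, not any particular product's format.

```python
from collections import Counter
from datetime import datetime, timezone


def from_vpn_log(record):
    """Adapter for a hypothetical VPN format: epoch seconds, 'username', 'ok'."""
    return {
        "timestamp": datetime.fromtimestamp(record["ts"], tz=timezone.utc),
        "user": record["username"].lower(),
        "action": "login",
        "outcome": "success" if record["ok"] else "failure",
        "source": "vpn",
    }


def from_sso_log(record):
    """Adapter for a hypothetical SSO format: ISO 8601 'time', 'subject'."""
    return {
        "timestamp": datetime.fromisoformat(record["time"]),
        "user": record["subject"].lower(),
        "action": record["event"],
        "outcome": record["result"],
        "source": "sso",
    }


def repeated_failures(events, threshold=3):
    """Collapse individual failure events into one entry per affected user."""
    failures = Counter(e["user"] for e in events if e["outcome"] == "failure")
    return {user: count for user, count in failures.items() if count >= threshold}


vpn_records = [
    {"ts": 1714564800, "username": "Alice", "ok": False},
    {"ts": 1714564860, "username": "alice", "ok": False},
]
sso_records = [
    {"time": "2024-05-01T12:02:00+00:00", "subject": "ALICE",
     "event": "login", "result": "failure"},
    {"time": "2024-05-01T12:03:00+00:00", "subject": "bob",
     "event": "login", "result": "failure"},
]

# Centralize: every source ends up in the same shape, with normalized user names.
events = [from_vpn_log(r) for r in vpn_records] + [from_sso_log(r) for r in sso_records]

# Reduce noise: four raw failure events become a single item worth attention.
alerts = repeated_failures(events)
print(alerts)
```

The point of the common schema is that the failure pattern for one user only becomes visible once records from both sources, with differently spelled user names, land in the same place.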

By implementing an infrastructure for observability, federal security teams can break through the noise and make better decisions about access and identity-based threats. This not only enhances system reliability and performance but also fortifies the overall security posture of federal agencies.

Mohit Palriwal is a senior software engineer at Netflix and a former principal software engineer at Salesforce.

Copyright © 2024 Federal News Network. All rights reserved.
