Data silos get a bad rap, especially in the federal government. Technology experts have been talking about tearing down and smashing silos for as long as silos have existed. The reasoning was sound: break down the walls between datasets, bring everything together in a centralized location, and perform cross-agency analysis to derive the most impactful mission insights. The goal still makes sense, but technology has progressed to the point that we don’t always have to take the most disruptive path to get there.
Despite decades of silo shaming, the reality is that data silos typically exist for good reasons, ranging from security and privacy to access and compliance. Dumping all agency data into a cloud or on-premises data lake can actually complicate efforts to draw rapid, relevant insights. These environments often require more resources (labor and time especially) to process and prepare structured and unstructured data for analysis, making the data lake seem more like a data swamp to wade through.
Respect the Silos
While consolidating redundant data centers or datasets may cut costs, that doesn’t mean data centralization is always the answer. Today’s analytics strategies and technologies should respect federal data silos for two primary reasons:
Security: Centralizing data can make it an easier target for cyberattacks. Separate departments often have their own security protocols, which can be difficult to implement once datasets are consolidated. Additionally, trying to have one security policy to rule them all reduces security controls to the lowest common denominator across missions, either exposing or constraining data inappropriately.
Ownership: Integrating data just for the sake of eliminating silos can introduce new challenges in governance and ownership, especially for departments and sub-agencies that set and manage controls for their own datasets. Additionally, many agencies, like the Census Bureau, are legally prohibited from commingling their most sensitive data, including census responses.
In short, integrating data or tearing down silos simply because it sounds innovative is unrealistic and can have serious security and mission ramifications.
For data to be truly useful, it must be easily accessible to a wide variety of stakeholders, ranging from the policymaker to the data analyst to the average federal employee. Technology can now bring agency-wide data together for fast, accurate analysis while maintaining the integrity of silos that actually make sense. Instead of smashing silos simply for the sake of smashing them, agency strategies should focus on discoverability, using analytics platforms that can query many different data types across distributed environments within seconds while respecting the data silos that are there for a reason.
Today’s emerging technologies can index data into search engines, regardless of format, for faster discovery and more seamless analysis. The master copy of the data stays in its original system, which avoids disrupting multiple teams. No matter where the data resides (on servers, in containers, or in deployed functions), users can still run fast queries, whether on log files or server metrics.
Instead of demolishing silos, agencies can focus on democratizing data by enabling relevant users to analyze it. Index-based search engines are easier for all stakeholders to use and can support a number of data projects while providing holistic insights across multiple data stores. A search index handles both structured and unstructured data by organizing it into structures optimized for fast answers to ad hoc questions, without intervention from developers.
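To make the idea of an index "optimized for fast answers" concrete, here is a minimal sketch of an inverted index in Python. It is an illustration only, not how any particular search product works: the class name, silo names, and sample records are all hypothetical. The index stores only pointers (silo name plus record ID) for each term, so the master copy of every record stays in its original system.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def flatten(record):
    """Turn a structured (dict) or unstructured (str) record into one string."""
    if isinstance(record, dict):
        return " ".join(f"{key} {value}" for key, value in record.items())
    return str(record)


class SearchIndex:
    """Hypothetical inverted index: term -> set of (silo, record_id) pointers."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, silo, record_id, record):
        # Only a pointer to the record is stored; the data itself stays in its silo.
        for term in tokenize(flatten(record)):
            self.postings[term].add((silo, record_id))

    def search(self, query):
        """Return pointers to records that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        # .get avoids creating empty entries for terms that were never indexed.
        results = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            results &= self.postings.get(term, set())
        return results


# Sample data (invented for illustration): one structured record, two log lines.
index = SearchIndex()
index.add("hr_db", 17, {"name": "Ada Park", "office": "Denver"})
index.add("web_logs", "line-204", "ERROR timeout contacting Denver gateway")
index.add("web_logs", "line-205", "INFO request served")
```

With this sample data, index.search("denver") returns pointers into both the hypothetical HR database and the web logs, while index.search("denver timeout") narrows the result to the single matching log line. Answering either question is a dictionary lookup rather than a scan of every source system.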
Modern search solutions built on search indices let users find the most important content using the queries they already know, without having to submit change requests or issue new task orders for silo-busting projects.
Now agencies can empower all stakeholders, from the data analyst to the policymaker, to store, search, and analyze large volumes of data across silos in near real time. No matter the size of the agency or the number of silos, today’s analytics tools can scale from the smallest IT shop with a few dozen firewalls to massive agencies with more than 10,000 servers, all while respecting the security protocols of each independent data silo.
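One way to picture searching across silos while each silo keeps its own security protocol is a simple fan-out query. The sketch below is a rough illustration under stated assumptions: the Silo class, the role names, and the sample records are all invented, and a real deployment would rely on each system's actual authentication and authorization rather than a set of role strings.

```python
from dataclasses import dataclass, field


@dataclass
class Silo:
    """Hypothetical data silo that enforces its own access rule."""
    name: str
    allowed_roles: set
    records: dict = field(default_factory=dict)

    def search(self, term, role):
        """Apply this silo's own access rule, then match records locally."""
        if role not in self.allowed_roles:
            return []  # this silo stays closed to the caller
        needle = term.lower()
        return [
            (self.name, record_id)
            for record_id, text in self.records.items()
            if needle in text.lower()
        ]


def federated_search(silos, term, role):
    """Send one query to every silo and combine whatever each silo releases."""
    hits = []
    for silo in silos:
        hits.extend(silo.search(term, role))
    return hits


# Sample silos (invented for illustration).
silos = [
    Silo("firewall_logs", {"analyst", "security"},
         {1: "Denied inbound port 22", 2: "Allowed outbound port 443"}),
    Silo("hr_cases", {"hr"}, {10: "Badge denied at front door"}),
]

print(federated_search(silos, "denied", "analyst"))  # [('firewall_logs', 1)]
print(federated_search(silos, "denied", "hr"))       # [('hr_cases', 10)]
```

The design choice worth noticing is that the access decision lives inside each silo, not in the shared search function. The same query returns different results for different roles, and no single policy has to be negotiated across missions.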
Don’t Be Disruptive
When data silos make sense, it’s better to augment existing technologies to unlock the power of data while minimizing disruption. As more government data moves to the cloud, analytics tools will need to become vendor-agnostic and able to work with a variety of cloud and SaaS platforms, regardless of the provider.
Federal data silos aren’t going away anytime soon. And there’s no shame in that, especially when keeping certain silos is integral to agency missions. By focusing less on tearing down walls and more on data discoverability, you can harness better insights from the various data at your agency’s disposal, silos and all.