Moving to the cloud has allowed agencies to store and use massive amounts of data more efficiently than when they maintained their own data centers. Now the next big step for many of these agencies, especially if their goal is to implement emerging technologies like artificial intelligence, is figuring out how to get the most out of that data. For a number of federal chief data officers and data experts, that means improving data quality, transparency and access more widely across the enterprise.
Data quality is central to these efforts, and that doesn’t just mean ensuring the numbers in a table are accurate. The utility and trustworthiness of data also depend on documenting where the data comes from, how often it’s updated, and who has access to it. That’s why some CDOs are working to establish a stronger foundation or framework that ensures processes and data management meet specific standards, which in turn makes data sharing easier.
“The dimensions that we’re focusing on nowadays are primarily accessibility and transparency, and how do we raise all the boats within the agency that allows everybody to use data? If people are interested in learning more, I’ll just give a quick shout out to the Federal Committee on Statistical Methodology (FCSM). They have a Data Quality Framework that we found very productive,” said Avital Percher, assistant to the chief data officer for analytics and strategy at the National Science Foundation, during an August 25 FedInsider webinar. “We’re making sure that everybody has the opportunity and access to get the information they need to make an informed decision about the data that they’re using.”
Some CDOs go so far as to say that the only way to drive value with data is to ensure unfettered access to whatever data is available. For CDOs pursuing that goal, data silos are the chief nemesis. Data in a silo is understood only by the people in the unit that has access to it. The rest of the agency may not even know it exists, much less what it contains. That’s why, when an agency begins modernizing processes and examining its business architecture, the first step needs to be determining what assets it has available, how widely they’re understood, and who can access them.
“I want as many of our staff to have access to as much data as they possibly can. Even if you’re not allowed access to the data, I want you to be able to see the metadata to know that it exists. And that at a future point when you can have access to it or when it is appropriate that you can move forward and do something from there,” said Jacques Vilar, chief data officer at the Federal Deposit Insurance Corporation, during an August 19 webinar. “Understanding that data lineage, the purpose of that data, smart tagging, providing information to staff, to use that data set in a different way, all of that will drive insight to our leaders and help them make better decisions or quicker decisions. And that just does nothing but help the enterprise.”
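The idea Vilar describes, that metadata stays visible to everyone even when the underlying data does not, can be illustrated with a minimal Python sketch. This is not the FDIC's system; the class names, field names, and the sample dataset are all hypothetical and exist only to show the separation between describing a data set and reading it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DatasetEntry:
    """Hypothetical catalog record; every field name is an assumption."""
    name: str
    source: str          # lineage: where the data comes from
    purpose: str         # why the data set exists
    tags: List[str]      # tags that make the data set discoverable
    authorized_units: Set[str] = field(default_factory=set)


class Catalog:
    """Illustrative catalog: metadata is open, the data itself is gated."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def describe(self, name: str) -> dict:
        # Any staff member may see that the data set exists and what it is for.
        e = self._entries[name]
        return {"name": e.name, "source": e.source,
                "purpose": e.purpose, "tags": list(e.tags)}

    def search(self, tag: str) -> List[str]:
        # Discovery by tag works regardless of access rights.
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def can_read(self, name: str, unit: str) -> bool:
        # Reading the data requires the unit to be explicitly authorized.
        return unit in self._entries[name].authorized_units


# Made-up example data, for illustration only.
catalog = Catalog()
catalog.register(DatasetEntry(
    name="branch_deposits",
    source="quarterly filings",
    purpose="deposit trend analysis",
    tags=["deposits", "quarterly"],
    authorized_units={"research"},
))

print(catalog.search("deposits"))
print(catalog.describe("branch_deposits")["purpose"])
print(catalog.can_read("branch_deposits", "communications"))
```

In this sketch a staff member in an unauthorized unit can still find the data set by tag and read its lineage and purpose, which is the point of the quote: knowing the data exists comes before being allowed to use it.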
Another major concern is privacy; identity, credential and access management is a huge consideration in federal agencies right now, especially with the recent pushes by the administration to adopt a widespread zero trust posture. And that’s where that data framework can come into play again. Agencies can use it to set restrictions around who can access what data, when, and for how long.
“We have several of our frameworks that we have established successfully within the context of DoD and IC, and we’re going to actually share data across,” said Deepak Kundal, chief data officer at the National Geospatial-Intelligence Agency, during an August 18 webinar. “So then it really becomes the responsibility … of the agency sharing the data — who are going to be termed as the data custodians — to actually apply the data security controls and data security rules onto the data payload, telling the agencies with whom we are sharing the data that ‘as long as you are adhering to our data security standards, that may be specific to an environment out there, you are okay to use this data for a limited time period.’ And when we say it’s a limited time period, that could be anywhere from 24 hours to about 30 to 60 to about 90 days. Those frameworks are in place today, and that is enabling the sharing of the data across agencies.”
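A time-limited sharing rule of the kind Kundal describes could be modeled roughly as follows. This Python sketch is an assumption-laden illustration, not NGA's framework: the `SharingGrant` class, its fields, and the sample agency and dataset names are hypothetical. It only shows the two conditions in the quote, adherence to the custodian's security standards and a limited time window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SharingGrant:
    """Hypothetical grant issued by the data custodian to a recipient agency."""
    dataset: str
    recipient: str
    issued_at: datetime
    duration: timedelta              # e.g. 24 hours, or 30, 60, or 90 days
    meets_security_standards: bool   # recipient adheres to custodian's rules

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + self.duration

    def allows_use(self, now: datetime) -> bool:
        # Use is allowed only inside the window and only while the
        # recipient adheres to the custodian's security standards.
        in_window = self.issued_at <= now < self.expires_at
        return self.meets_security_standards and in_window


# Made-up example values, for illustration only.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
grant = SharingGrant(
    dataset="example_imagery_index",
    recipient="Example Agency",
    issued_at=issued,
    duration=timedelta(hours=24),
    meets_security_standards=True,
)

print(grant.allows_use(issued + timedelta(hours=23)))
print(grant.allows_use(issued + timedelta(hours=25)))
```

Attaching the expiry and the compliance flag to the grant itself, rather than to the recipient, mirrors the quote's point that the controls travel with the data payload.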
But the culture of an agency can be just as big a consideration when establishing this kind of data framework. Not everyone is used to thinking in terms of building centralized, shared tools for everyone’s benefit. Agencies have to foster that culture, and the way to do it is to start small and prove the use cases.
“Folks should really be looking at thinking about delivering regular snacks, or ‘snackable’ forms of emerging technology and data, not full meals,” said Scott Beliveau, branch chief of advanced analytics at the U.S. Patent and Trademark Office, during the August 25 webinar. “Because if you think about it, you deliver a full meal, everybody falls asleep after a full meal. But if you deliver those small snacks, those bite-sized pieces of wins, everybody keeps hungry and keeps coming back for more; it’s like the Lay’s potato chips, you can’t eat just one. So those wins build upon those wins.”