When big data gets too big


Data is one of the government’s strategic assets. Now that sounds nice, but it’s also enshrined in statute and regulation. But Big Data can get too big, and now data leaders, at least in the military, have been talking about data downsizing. For more insight, Nick Hart, CEO of the Data Foundation, and Jesse Rauch, vice president for Federal at Active Navigation, joined Federal Drive with Tom Temin.

Interview transcript:

Tom Temin: Nick, good to have you on.

Nick Hart: Great to be here.

Tom Temin: And the vice president of Federal at Active Navigation, Jesse Rauch. Mr. Rauch, good to have you on.

Jesse Rauch: Thank you very much, Tom, good to be here.

Tom Temin: And I guess you had a recent conference and several military data types of people spoke about the idea of data downsizing. What is going on, Nick?

Nick Hart: Well, there’s been this idea for quite a long time of data minimization, which is to say that we should be strategic about the information that we’re collecting to make sure that it’s valuable and that there’s a plan for use, but also not overdoing it. In the federal government we have a law called the Paperwork Reduction Act, which is actually a really strategic way to review data collections before they happen to maximize the value of the information that we’re getting from the American public. And increasingly, we know, we’re collecting a lot of data for government, and it’s being stored in data lakes and data warehouses, but we’re not maximizing the value of that information. So I mean, we’ve had a lot of discussion in the last couple of years about creating the strategies to make sure we’re actually using it. And so this is going to be a really important concept as we move forward in maximizing the value of data as a strategic asset.

Tom Temin: Yeah, Jesse, I’ve heard that from several agencies that yes, you can amass more and more data, but the lake, the data lake, the data warehouse is something of a suspect idea a little bit because the more you amass, maybe the dirtier it gets and the harder it is to analyze it, because you have to run your tools against endless and ever-growing datasets.

Jesse Rauch: That’s very true. And you really start to multiply these issues, especially if you start getting redundant data in there or obsolete data, where suddenly your infrastructure costs have gone up, your cost of analysis has gone up, and the quality of your analysis goes down. So yeah, really focusing on getting rid of that redundant data, making sure your data is clean and well managed, makes the resulting data lakes and analysis just that much better.

Tom Temin: Which gets to the question of who’s in charge here? Because one person may say, for purposes of bias reduction, or whatever – I’m making that one up – you need to throw all this data in. And someone else may say, but you know, technically, it’s gonna make it too hard, and the storage gets expensive, etc., etc. So how do you arrive, as a group, as an agency, at what the right data strategy ought to be, Nick?

Nick Hart: Well, Tom, you used the magic word “group.” I mean, this is a team sport. This is not any one single person’s job. And that’s by design. So in government, we have chief information officers, chief data officers, analytics officials, evaluation officials, and they really all need to work together to ensure that we are doing this in a way that makes sense. Jesse just alluded to the fact that if you’re collecting too much data and not being mindful of the quality, then that poses some challenges for analysis. And really, we’re trying to generate insights that are useful here using this data. And so you have to be able to monitor the quality over time. And that’s a very difficult thing. We know that the more we use data, we also begin to identify the quality gaps. And that’s why I like to say this is a team sport, because it’s not even just about the people who in their job descriptions have data today. Every program manager, every employee in the federal government should be part of this enterprise. And I think we’re increasingly seeing that that’s playing out as agencies are institutionalizing the function of the chief data officers, for example, and better governing their data.

Tom Temin: Yeah, that whole question of governance, I guess, comes up in so many contexts, and data governance, we’re hearing it now. And what does data governance mean and how do you implement governance for data?

Jesse Rauch: Data governance is a fairly broad topic, right? Because you have records, you have data management, you have personal data. So there’s a number of regulations that come into play. So it’s that overarching understanding of your data and how it best needs to be managed to meet your mission goals and regulatory requirements. So that’s a whole conversation, or two or three, right there by themselves. So with that, you know, it is looking at your data, understanding what it is being used for, and how it needs to be used, and then managing it to meet those goals in a safe and effective way.

Tom Temin: We’re speaking with Jesse Rauch, he’s vice president of Federal at Active Navigation, and with Dr. Nick Hart, the CEO of the Data Foundation. And the Federal Data Strategy for 2020 has been out for a little bit. And I’m not sure everyone totally understands what it is, where it fits into the general IT and program management panoply. So maybe Nick, you could just review for us what people need to do about the Federal Data Strategy, now that it’s been out there and fermenting for a while?

Nick Hart: This was intended as this 10-year, long-term strategy to help us better solve some of the problems that we’ve been dealing with for years. And the way that this is implemented is that every year, the federal government will produce an action plan. So you’re referring to the 2020 Action Plan, which outlines a whole bunch of actions for agencies that include standing up their data governance processes with their new chief data officers, for example, which we were just talking about. In reality, every agency should also have its own data strategy and should be thinking about how they make the practices and principles for these activities more useful and relevant for the context of fulfilling their own agency missions. So really, the federal data strategy as this governmentwide plan is really just a catalyst for ensuring that every agency and every program across the federal government, we’re making real progress, and that we’re going to do it as fast as possible. So it’s just a point here to be made that you know, we’re not going to solve all of our data quality and data sharing challenges overnight or even by tomorrow. But we can make some incremental progress that will really go a long ways to achieving the goal, this long-term vision of recognizing data as a strategic asset.

Tom Temin: And with the advent of the chief data officers, more and more agencies are getting on board because they have to, and they’re getting them, and even some of the nonstatutory agencies are getting chief data officers, so the idea seems to be taking hold. What’s your view of the chief data officer? What are they, a technical person? Are they a computer guy? Are they a strategy person? What are they in the best sense?

Jesse Rauch: They are definitely a strategy person. So the strategy around managing the data, the strategy around effectively using it to drive the mission is very much a strategy mindset that the chief data officers go in with. And you’ll find that across the speakers that we had last week at our conference, you’ll really find that they are trying to look at the big picture, bringing the stakeholders together, and identifying the long term roadmap for how to effectively manage that data for years to come. So that’s a huge improvement and a great sense of vision for that chief data officer position.

Tom Temin: Because you mentioned that the Army, the Navy, the Army Corps of Engineers, and the Air Force at the conference, all were into this idea of data, let’s say maybe not so much downsizing, maybe right-sizing, to use another popular term, is what they are after. And so it seems like DoD is really a leader in this – fair assessment?

Jesse Rauch: Absolutely. The joys of being in the military mindset is that when you get a mission, you can execute against it. And so when they have started to stand up these chief data officer positions, they’ve really, you know, looked at that with an operational mission-based focus. So the clarity of DoD is very nice in terms of the clarity of the chief data officer position and really bringing that data to bear against solving those mission priorities.

Tom Temin: And Nick, how do you bring the data strategy and the data governance together with the technical requirements that agencies have with storing data all over the place and cloud and hybrid technical infrastructure? Because they’re really not the same thing but they depend on one another.

Nick Hart: Yeah, I mean, this is all really integrated, or it should be. And DoD is a really interesting model that other agencies can learn a lot from, in part because they figured out how to take the role of the CDO. And it’s not just a single CDO at the top of DoD, it’s really a position that exists in all of the different parts of the Department of Defense. And they’ve ensured that there are people and resources to support implementation of these activities. And that’s a really essential element to getting this right. A CDO that’s operating on a shoestring budget as a single person in an office, as some agencies have chosen to do, is not going to be able to realize the full benefit of integrating all these activities and coordinating across all of the different silos that exist in agencies. So you know, I hate to overplay the point about resources, but like this is really a role for Congress, and OMB and the administration to prioritize the resources for these positions and ensure they have what they need to get the job done.

Tom Temin: I guess if you want to save billions on one end, you can invest a few hundred thousand in the other.

Nick Hart: Yeah, and this isn’t billions of dollars. I mean, this is literally millions of dollars, thousands of dollars in some cases, that would go a long ways to improving this infrastructure that has been long under-resourced.

Tom Temin: Dr. Nick Hart is CEO of the Data Foundation. Jesse Rauch is vice president of Federal at Active Navigation. Thank you both so much.

Nick Hart: Thank you.

Jesse Rauch: Thank you, Tom. It’s great to be here.

Tom Temin: We’ll post this interview at FederalNewsNetwork.com/FederalDrive. Hear the Federal Drive on demand and on your device. Subscribe at Apple Podcasts or PodcastOne.