Nearly a year into the implementation of the Foundations for Evidence-Based Policymaking Act, the leaders of some of the agency data offices the law helped elevate have more than new data pilots and dashboards to show for their efforts.
More importantly, officials say they have buy-in from their agencies’ workforce to see these projects succeed, and enough curious employees to brainstorm ways to take these ideas to the next level.
With a final one-year action plan of the Trump administration’s Federal Data Strategy expected later this month, the departments of Transportation and Energy have turned to artificial intelligence tools to get a better handle on their data. But data experts from both agencies say these tools will only get them so far without the right workforce and infrastructure in place.
“I’ve had many people in industry be like, ‘You need to use more machine learning and more artificial intelligence and all that good stuff.’ I probably do, but I want to do that with a purpose, grounded in engineering practice,” Dan Morgan, Transportation’s chief data officer, said Tuesday at an AFCEA Bethesda conference.
Case in point: the agency has started to use natural language processing to parse its vast inventory of regulations and pull statistics on the number of complex regulations or those with the most exceptions.
The project, which has freed up time for the agency’s Office of Legal Counsel, is off to a promising start and cost very little to launch. Morgan said he and a GS-7 colleague created a dashboard with these metrics over the course of a week, using only “about $6 of AWS and a little Tableau and a little elbow grease.”
The dashboard isn’t perfect — the Federal Register has changed the way it structures some regulations over time — but Morgan said it’s “directionally correct,” and a promising template to fine-tune through similar projects.
“We were able to start to help our lawyers see some new things, and we’re just trying to lather, rinse and repeat this process and this project over, and over, and over again,” Morgan said.
Similarly, the Energy Department has begun using machine learning tools to automatically tag incoming field data about various types of oil and gas wells it oversees across the United States.
“When there is a problem in the field, you really need the right information to come to the surface, not piecemeal information, not parts of information. That is why we are leaning more on the insights of machine learning and things like that, so we can get the information fast, and it’s reliable,” said Pam Isom, the Energy Department’s deputy chief information officer and adviser to the chief data officer.
But Isom said these tools can only go so far in addressing the agency’s data-centric challenges.
“I’m not so interested in the technology, but if you can tell me how this will solve a business problem, well, I know that’s going to have an impact,” she added.
While some agency CDOs have identified culture change as a significant barrier to overcome, Isom, who previously served as Energy’s CDO, said the agency’s employees have embraced some of these data-centric ideas.
“We introduced some of the fundamentals of things that we need to be doing, and because there is so much pain within the organization, I thought I was going to have to really sell it, [but] I didn’t have to sell it,” she said. “It was like, ‘Yeah, I agree with what she’s saying. We need this and I think we should start doing what she’s saying.’”
Isom said the agency has recently stood up a data governance board and launched a department-wide innovation exchange.
“Now you have a place, kind of like a map of what are some core innovations that are happening within the department, and this is how you navigate to find out to understand where you can find certain data points around those innovations,” she said.
Over the course of this year, agencies have made a number of data-centric hires under the Evidence Act. Meanwhile, the Office of Personnel Management is working on a governmentwide job series for data scientists.
“I believe we need data managers in each of the different departmental elements, so that they take responsibility for ensuring the data within their specific area,” Isom said.
Transportation is also ramping up its data-centric recruitment. The National Highway Traffic Safety Administration, which provides vehicle safety ratings, is hiring a new head of data management, while the Federal Motor Carrier Safety Administration, which ensures more than 5 million truck drivers are medically cleared to drive, is seeking a chief technology officer.
“The data has never been as good as it is right now, and that’s because of good management and leadership across my agency,” Morgan said. “And the way I sort of build on that ethos: no one is satisfied with the quality of data anywhere in my agency. That’s awesome. It’s not bad, right? You have to harness that energy — people want to make the data better.”
But the challenge of providing reliable data doesn’t just end with Transportation. The agency’s census of fatal crashes, for example, relies on input from police officers on the scene and several layers of local and state government before it makes it to a federal database.
“The best data is local, and yet I still want to be able to respond and understand what’s happening in the transportation system around the country. So I need different and surrogate data sources,” Morgan said.
In addition to pulling data from the navigation app Waze, Morgan said Transportation also relies on data from other federal agencies, such as the departments of Agriculture and Interior.
“I can’t understand where to build transportation if I don’t know what habitats I’m going to impact … I can’t manage the transportation system well unless I understand its impacts on human health,” Morgan said, adding that health data comes from the EPA and the Centers for Disease Control and Prevention.
“The point here is that you don’t live transportation one piece of infrastructure at a time. You live in a place, and understanding a place necessarily involves integrating data from different disciplines, so that we can make good decisions about how to build and manage that place together as a community,” he added.