Knowing when to dig, when to draw the line in federal data investigations

Federal agencies are expanding the data sources and analytics tools they use to conduct investigations. Data is both a blessing and a curse, but some agencies are using it successfully, adapting their approach to suit both the trove of incoming information and the varying needs for it.

That is according to Timothy Persons, chief scientist and managing director of the Science, Technology Assessment, and Analytics team at the Government Accountability Office. He said data is nothing new to investigations, but what has changed is the computing infrastructure around it.

“When you look at the key data challenge, for both the GAO and the federal government, it really is a sufficiency question and that’s both in the quality and the quantity of the – not just the data, but the information that’s yielded from the data that can be used to support an investigative outcome that’s a good one,” he said on Federal Monthly Insights – Special Bulletin: Digital Investigations.

GAO is an investigatory agency whose auditors analyze not just agency finances but also agency performance. They need reliable data, and they need it quickly, to complete investigations in a timely manner.

One example is the Supplemental Nutrition Assistance Program (SNAP), which Persons said has been susceptible to improper payments, documentation errors and outright fraud. Much of SNAP activity results in monetary or benefit transactions, which can be tracked.

Another example he gave was from the Office of the Inspector General at the Department of Health and Human Services, which has been monitoring the opioid crisis, another public health emergency.

“And it was just tracking down where were those script doctors that would just automatically write a script for the opioids and things and they were just – they were easily identifiable in the data, investigators were able to see that and implement law enforcement measures and help mitigate that challenge,” he said on Federal Drive with Tom Temin.

One of the primary obstacles to expanding the use of data is curation: knowing which data is useful and comes from a quality source. Persons said current fears about misinformation on the internet mean it is important to triage the data that agencies access. Agencies need a governance strategy that allows maximum access for those who need data and limits access for those who should not have it. That curation also requires a data-centric strategy, an evolution that Persons said mirrors the rise of chief data officers alongside traditional chief information officers.

In this regard, he said GAO is on its own journey.

“Classically, from a chief information officer perspective, all data is like – it’s zeros and ones running across my computing infrastructure, whether it’s networks or switches, or boxes, and so on,” Persons said. “But the chief data officer perspective is different. It’s saying, well that’s potential or perhaps real asset, it’s a renewable asset, I could go back to that data source later on, I can reuse it, add to it, etc.”

To determine when the data gathered for an investigation is complete, he said, keep the principle of "sufficiency" top of mind. Set a threshold for when the data is satisfactory, but also keep the end goal of the investigation itself in view at all times. Think of a federal prosecutor whose job is to use the power of the state to incarcerate someone, he suggested; it is paramount that they stake their case on good data that adheres to constitutional law and was investigated properly.

“If you lack that data-centric thinking, that strategy and that governance model, and if you don’t have that ‘What am I really trying to answer right now?’ that discrete key question, then you could be running all over randomly and not end up where you need to be,” he said.

Deciding what data makes the cut can also be a matter of time constraints. Many federal investigators come from science and engineering backgrounds, Persons among them, so he empathized with their desire to keep digging because "there's never enough data."

“[Investigators] would want to collect and do things forever to be extra, extra certain on something, let’s say,” he said. “But in this case, you don’t have that, you have a finite amount of time, you have other cases you have to deal with. It really is a workflow processing-type conversation, in one sense.”
