Insight by Nuix

How to modernize investigative data analysis

Among the most ubiquitous of federal agency activities are investigations. The term brings to mind law enforcement agencies, of course, such as the FBI or the IRS criminal division. Thought of more broadly, investigations also include oversight activities by inspectors general or the Government Accountability Office, program examinations by analysts or congressional staffers, and casework by agencies as diverse as the Merit Systems Protection Board or the Equal Employment Opportunity Commission.

The processes of modern investigative work have many elements in common regardless of the nature of the specific investigation, criminal or otherwise. Increasingly, the process includes the curation, integration and analysis of data from diverse sources. Investigators don’t want to overlook anything that might be relevant. But they also don’t want to be burdened with extraneous data and the inefficiencies it brings. And they want everyone connected to a given investigation to have access to an inclusive, authoritative source of data and analytics tools.

Robert O’Leary is the head of investigations for U.S. Government and Corporate at Nuix, a vendor of software that aids in determining patterns and actionable information from data. A longtime state police investigator with a specialty in data analysis, O’Leary said, “I think the biggest challenge is processing all of the available data into a single platform and being able to see it in one view.” Nowadays that often includes data from cell phones, internet of things sensors, video from fixed or drone-mounted cameras, and computers.

Integrating data from disparate sources presents a technical challenge, one on which he said the Nuix platform has made progress. In particular, he pointed to blending structured and unstructured data in an automated way, and to simply making sense of purely unstructured data.

One important feature, for example, “is pulling named entities out of source data. We identify people’s names, URLs or web addresses, email addresses, IP addresses, monetary values,” O’Leary said. “And we’ve built some that even have MAC addresses, if we have a suspicion in a cyber case that one machine is connected to another.”

It all adds up, he said, to “the ability to pull those pieces of information that have investigative value out of the data set very, very quickly.”
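As a rough illustration of the kind of named-entity extraction O’Leary describes, the sketch below pulls a few entity types out of free text with simple patterns. It is not Nuix's implementation; the pattern set, the `PATTERNS` table and the `extract_entities` function are assumptions made for this example, and production tools rely on far more robust rules.

```python
import re

# Illustrative patterns only (assumed for this sketch, not taken from Nuix).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "url": re.compile(r"https?://[^\s<>\"']+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "mac": re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b"),
    "money": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_entities(text):
    """Return a dict mapping each entity type to the list of matches found in text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


sample = (
    "Contact jdoe@example.com from 192.168.1.10 "
    "(MAC 00:1A:2B:3C:4D:5E) about $1,250.00 at https://example.org/pay"
)
entities = extract_entities(sample)
for kind, values in entities.items():
    print(kind, values)
```

People's names are left out on purpose: recognizing them reliably generally takes a trained language model or curated name lists rather than a pattern.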

Equally important in deriving meaningful information out of unstructured data, O’Leary said, is context, which may be gained from the few words or lines surrounding a particular term. Converting audio to text, or vice versa, can often help. He recalled one case in which “we were able to find the code word in transcribed recordings, and be able to understand where they were using the code words – essentially drug terms that we were able to look at and understand the context in which they were used.”
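The idea of reading a term in its surrounding words can be sketched as a simple keyword-in-context search over a transcript. This is a minimal sketch, not the vendor's method; the `keyword_in_context` function, its `window` parameter and the sample transcript are hypothetical.

```python
def keyword_in_context(text, term, window=5):
    """Return each occurrence of term with up to `window` words on either side."""
    words = text.split()
    target = term.lower()
    hits = []
    for i, word in enumerate(words):
        # Strip surrounding punctuation so "package," still matches "package".
        if word.strip(".,;:!?\"'()").lower() == target:
            start = max(0, i - window)
            hits.append(" ".join(words[start:i + window + 1]))
    return hits


transcript = "He said the package arrives Friday. Bring cash for the package, okay?"
snippets = keyword_in_context(transcript, "package", window=2)
for snippet in snippets:
    print(snippet)
```

A small window keeps the snippets short enough to scan quickly, while a larger one gives an analyst more of the conversation around each hit.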

Another important analytic capability is what O’Leary called regex, short for regular expressions. Regular expressions match numeric or alphanumeric strings whose format points to a definite origin, such as credit card, Social Security or Medicare account numbers.
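A hedged example of this kind of pattern matching: the sketch below looks for strings shaped like Social Security numbers and payment card numbers, and filters card candidates with the standard Luhn checksum to cut false positives. The patterns and function names are assumptions for illustration; Medicare identifiers are omitted because their format is not described here.

```python
import re

# Assumed formats for this sketch: SSNs written as ddd-dd-dddd,
# card numbers as 13 to 19 digits with optional spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for pos, d in enumerate(reversed(digits)):
        if pos % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_ssns(text):
    """Return all substrings shaped like Social Security numbers."""
    return SSN_RE.findall(text)


def find_card_numbers(text):
    """Return card-shaped substrings that also pass the Luhn check."""
    return [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]


sample = "Card 4111 1111 1111 1111 and SSN 123-45-6789; bogus 4111 1111 1111 1112"
print(find_ssns(sample))
print(find_card_numbers(sample))
```

A format match alone only shows that a string looks like an account number; the checksum step is one inexpensive way to discard strings that merely resemble one.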

O’Leary said that from an IT architecture standpoint, data analytics on shared cloud platforms is becoming the preferred model. In practice, most investigations are conducted by teams. Teams may span multiple agencies and levels of government.

“The ability to pull all that data in, and to process it all into a single case, and then make that case available to everyone,” he said, means individual team members “can not only see their data, but also how it corresponds and connects and correlates with the data from their colleagues. It gives them a much better understanding of what they have.”
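One simple way to picture how a shared case exposes connections between team members' data is to look for entities that appear in more than one evidence source. The sketch below is an illustration under assumed names (`shared_entities`, the source labels and the sample values are all hypothetical), not a description of how any particular platform performs link analysis.

```python
from collections import defaultdict


def shared_entities(sources):
    """Map each entity to the set of sources containing it, keeping entities seen in 2+ sources."""
    seen = defaultdict(set)
    for source_name, entities in sources.items():
        for entity in entities:
            seen[entity].add(source_name)
    return {entity: names for entity, names in seen.items() if len(names) > 1}


# Hypothetical entities already extracted from three evidence sources.
case_sources = {
    "phone_a": {"555-0100", "jdoe@example.com"},
    "laptop_b": {"jdoe@example.com", "10.0.0.5"},
    "drone_c": {"10.0.0.5"},
}
links = shared_entities(case_sources)
for entity, names in links.items():
    print(entity, sorted(names))
```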

The shared-platform approach also makes it easier for investigators to work from anywhere, and on portable devices.

O’Leary said Nuix’s scripting capabilities in a variety of languages enable investigative organizations to customize how they apply analytics and design their workflows.

Data Challenges for Investigations

Having all that data in a single place, we do all the link analysis, all the connections, so we only have one case that then goes to the prosecuting attorney. Or to whoever is the next authority to get it. We don't have to go through five or six different versions of the case, to determine which is the most comprehensive.

Strategies for Data Integration

I think the biggest challenge is processing all of the available data into a single platform and being able to see it in one view. We can process data in any number of tools. But there's only one that brings it in from all the sources, whether it's computers, cell phones, Internet of Things, such as drone dumps, cameras. We can bring it all into one platform.


Featured speakers

  • Robert O’Leary

    Head of Investigations, Nuix USG

  • Tom Temin

    Host, The Federal Drive, Federal News Network
