Kull says the solution will make it possible to analyze fiscal data collected under the Office of Management and Budget’s A-133 single audit process, designed to account for over a half trillion dollars in assorted federal grants handed out every year. In this case, he says the challenge came in dealing with A-133 audit information from the Department of Health and Human Services.
“HHS folks wanted to go from the collection form, which is the summary of the A-133 audit results that is submitted to the Federal Audit Clearinghouse. They wanted to link a finding in that collection form to the actual place in the audit report that described the nature of the problem,” Kull says.
Kull says the challenge is that many of the A-133 audit reports over the years had been saved as Adobe PDF documents. Agency financial auditors and employees cannot easily parse, manipulate or integrate the data without proper software.
Kull says that, at first, the research team from HHS, the AGA and PwC tried an approach using XBRL – Extensible Business Reporting Language. That’s a markup language used to make complicated financial data usable on websites and in large database applications. It’s currently used by the Federal Deposit Insurance Corporation and the Securities and Exchange Commission to handle and display corporate financial reports.
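Because XBRL is built on XML, data tagged this way can be read by ordinary software tools. The sketch below illustrates the basic idea with a simplified, XBRL-style instance document; the element names, namespace and values are invented for illustration and do not come from any real XBRL taxonomy.

```python
# Parse a simplified, XBRL-style XML instance and pull out tagged facts.
# The element names, namespace and figures are illustrative only.
import xml.etree.ElementTree as ET

instance = """<xbrl xmlns:audit="http://example.org/audit">
  <audit:TotalFederalExpenditures contextRef="FY2009" unitRef="USD">
    512000000
  </audit:TotalFederalExpenditures>
  <audit:FindingCount contextRef="FY2009">3</audit:FindingCount>
</xbrl>"""

root = ET.fromstring(instance)
ns = {"audit": "http://example.org/audit"}

# Every tagged fact becomes a machine-readable name/value pair.
facts = {
    elem.tag.split("}")[1]: elem.text.strip()
    for elem in root.findall("audit:*", ns)
}
print(facts["FindingCount"])
```

Once figures are tagged like this, a database or website can consume them directly, which is exactly what a PDF scan cannot offer.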
“We’d take a couple of programs in a couple of states. We’d tag a couple of sections of the report and we’d find that the reports were in PDF, the collection forms were digital. We also had a hard time accessing the reports, because, according to the Federal Audit Clearinghouse, they had to go through redaction, and it became a mess!” Kull says.
He says they had reached an impasse until the AGA research team got a helping hand from another of its technical partners.
“We were working with a company called Mark Logic, they had volunteered to give us server time, and they said they’ve got software that can consume these PDF reports, and convert them into digital formats, and they can use intuitive software to create page breaks,” Kull says.
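The payoff of that conversion step is that a report's pages become addressable. The sketch below simulates the downstream idea only: it assumes the PDF has already been converted to plain text (real extraction would require a PDF library, and Mark Logic's actual software is not shown), then splits on form-feed characters, a common convention for page breaks, so a finding can be located by page.

```python
# Simulated output of a PDF-to-text conversion. Form-feed characters (\f)
# commonly mark page breaks in extracted text; splitting on them yields
# addressable pages. The report content here is invented for illustration.
raw_text = (
    "Independent Auditor's Report\f"
    "Schedule of Findings\nFinding 2009-01: unallowable costs\f"
    "Corrective Action Plan"
)

pages = raw_text.split("\f")

def find_page(finding_id):
    """Return the 1-based page number that mentions a finding, or None."""
    for page_num, text in enumerate(pages, start=1):
        if finding_id in text:
            return page_num
    return None

print(find_page("2009-01"))  # the finding sits on page 2
```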
He says that within a few weeks, they were able to process not only the data from the two sample grant programs, but entire datasets from whole states, covering all the HHS programs involved.
By using this new approach, Kull says the project team could link directly from the Federal Audit Clearinghouse collection form right to the place in the audit report that described and showed the finding.
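A hedged sketch of what such a link could look like in practice: each finding reference on the collection form maps to the location in the converted audit report where that finding is described. The field names, file names and page numbers are hypothetical, not the Clearinghouse's actual schema.

```python
# Hypothetical link between collection-form findings and report locations.
# All field names and values are illustrative.
collection_form = [
    {"finding_ref": "2009-01", "program": "93.778 Medicaid"},
    {"finding_ref": "2009-02", "program": "93.558 TANF"},
]

report_locations = {
    "2009-01": {"report": "state_az_fy2009.xml", "page": 41},
    "2009-02": {"report": "state_az_fy2009.xml", "page": 44},
}

def link_finding(finding_ref):
    """Return a URL-style anchor that jumps into the audit report."""
    loc = report_locations[finding_ref]
    return f"{loc['report']}#page={loc['page']}"

print(link_finding("2009-01"))  # state_az_fy2009.xml#page=41
```

The design choice is simply a join key: once both the form and the report are digital, a shared finding reference is enough to connect them.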
Kull says the newly developed technique is a significant breakthrough for the federal audit and accounting community because, for the first time, it enables analytic use of audit information in the Federal Audit Clearinghouse that had previously been untouchable. He adds that it opens a whole new world for other databases similar to the HHS collection whose data had been trapped in proprietary file formats.
“Having worked in the federal government for 32 years, there are numerous databases spread out through government, where agencies collect information about a particular activity…that’s locked up in proprietary taxonomies,” he says. “Second, if they’re trying to be accessed by the public, they go through redaction, most of this is manual redaction, which costs time and money. And this is a technology which says if you’re able to digitally consume these documents, even if they’re in PDF, there is software that is out there that will allow you to mine the data.”
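The manual-redaction cost Kull describes is one place automation helps once documents are digitally consumable. A minimal sketch, assuming pattern-based rules: the two regular expressions below (SSN-style and EIN-style numbers) are illustrative only; a real redaction policy would be far broader and reviewed by humans.

```python
import re

# Once a report is machine-readable, redaction rules can run automatically
# instead of by hand. These two patterns are illustrative, not a real policy.
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-style numbers
    re.compile(r"\b\d{2}-\d{7}\b"),        # EIN-style numbers
]

def redact(text):
    """Replace any matching sensitive pattern with a [REDACTED] marker."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

print(redact("Grantee EIN 86-1234567, contact SSN 123-45-6789."))
# Grantee EIN [REDACTED], contact SSN [REDACTED].
```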
Kull says that the new technique will make it possible to eventually create a fully interactive Single Audit Database that is searchable and XBRL-compatible. He adds that linking the data collection form with the final auditors’ report will also enable improved analysis of the federal grants awarded since the single audit process was enacted into law in 1984.
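What a searchable store of linked audit data could look like can be sketched in a few lines. The schema, table and sample rows below are hypothetical, not the planned database design; an in-memory SQLite table stands in for the envisioned Single Audit Database.

```python
import sqlite3

# Hypothetical schema: each collection-form finding carries the text of the
# report section it links to, so both can be queried together.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE findings (
        finding_ref TEXT PRIMARY KEY,
        program     TEXT,
        report_text TEXT
    )
""")
conn.executemany(
    "INSERT INTO findings VALUES (?, ?, ?)",
    [
        ("2009-01", "93.778 Medicaid", "Questioned costs of $12,400 ..."),
        ("2009-02", "93.558 TANF", "Internal control weakness in payroll ..."),
    ],
)

# Full-text-style search across the linked report narrative.
rows = conn.execute(
    "SELECT finding_ref FROM findings WHERE report_text LIKE ?",
    ("%payroll%",),
).fetchall()
print(rows)  # [('2009-02',)]
```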