Data formats stall transparency in government

When it comes to government transparency, the Freedom of Information Act officers at federal agencies have a lot of catching up to do.

Emily Shaw, deputy policy director at the Sunlight Foundation, told Federal Drive with Tom Temin that FOIA officers face an uphill battle in trying to change an office culture in government where the document format of choice (PDFs or, worse, printouts) makes it harder to redact sensitive information in documents that could otherwise be released to the public.

Emily Shaw, deputy policy director at the Sunlight Foundation

“What they are trying to do is walk a very careful line between opening the government appropriately and following the laws to make information available, at the same time protecting information they are legally required to protect,” she said.

Improving the quality of data, Shaw said, makes it much easier to redact information for public release. Keeping records in a digital format, she said, allows software programs to automatically redact some information, like email addresses and Social Security numbers, at a volume that human operators can’t handle.
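To illustrate the kind of automated redaction Shaw describes, here is a minimal Python sketch. It is not drawn from any agency's actual tooling: the patterns, the `redact` function name and the sample text are all assumptions made for illustration, and a real system would need broader rules plus human review.

```python
import re

# Illustrative patterns only (assumptions, not an agency standard).
# They catch common email addresses and SSNs written as 123-45-6789.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace matched email addresses and SSNs with placeholder labels."""
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    text = SSN_RE.sub("[REDACTED SSN]", text)
    return text


# Hypothetical sample record in machine-readable text form.
sample = "Contact jane.doe@example.gov, SSN 123-45-6789, about the permit."
print(redact(sample))
```

A sketch like this only works on text that software can read, which is Shaw's point: a printout or a scanned image gives the patterns nothing to match.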

“In a lot of cases, government agencies are used to working with PDFs or with printouts, even,” she said. “And so, they will turn it into a form that can only be read by humans. If it’s a printout, then it will never again have the quality that can be used to automatically redact it, and it needs to be done by a person.”

Agencies have been wrestling with this issue for more than a decade. Shaw said she hopes government turns the corner soon.

“This might be a tech lag that hopefully we’ll see rectified soon,” she said. “We are seeing more adoption of newer techniques that are speedier and can be automated; obviously with human oversight, but it’s not as onerous as being the first line. We hope that does transform the availability of public information.”

Not all data should be treated equally, however. Shaw said agencies should have different standards in mind when redacting emails, for example, as opposed to databases containing the names of registered lobbyists.

“In many of those cases, it’s important to have personally identifiable information, because that’s the point of the database,” she said. “So in those cases, that’s where the challenge is the greatest.… You just have to be careful that certain categories of information are not included, as opposed to individual-level information altogether.”
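The distinction Shaw draws, withholding certain categories of information rather than every individual-level detail, can be sketched roughly as follows in Python. The field names, the `PROTECTED_FIELDS` set and the `release_record` helper are hypothetical; which categories an agency must actually protect depends on the law that applies to the dataset.

```python
# Hypothetical categories an agency might be required to withhold.
PROTECTED_FIELDS = {"ssn", "home_address", "date_of_birth"}


def release_record(record: dict) -> dict:
    """Return a copy of the record with protected categories left out."""
    return {key: value for key, value in record.items()
            if key not in PROTECTED_FIELDS}


# Made-up lobbyist entry: the name stays, because identifying the
# registrant is the purpose of the database.
lobbyist = {
    "name": "Pat Example",
    "client": "Example Industries",
    "ssn": "123-45-6789",
    "home_address": "1 Sample St.",
}
public = release_record(lobbyist)
print(public)
```

Here the person's name and client remain public while the protected fields are dropped, instead of the whole record being withheld.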
