Agencies are making the switch from paper to digital records. That’s because the National Archives and Records Administration will only accept permanent records in an electronic format after December 2022.
But once archivists receive those digital records, they could face a jumble of legacy file formats. To make sure NARA can still open these records years from now, the agency has released a Digital Preservation Framework with digital recordkeeping best practices for hundreds of file formats.
NARA has received electronic files from agencies since the 1970s, mostly data sets from mainframe computers.
Leslie Johnston, NARA’s director of digital preservation, told Federal News Network that the records the agency now receives include geographic information system and statistical data, as well as audio and video files.
Preservation work for electronic records still comes down to assessing risk and deciding how best to keep these records available decades from now. But electronic preservation also comes with its own challenges.
“The main issue with electronic records that’s different is that there is over a 50-year time period such a wide range of applications and operating systems and types of computers that have been used to create those records. So obsolescence is actually the big issue for us,” Johnston said in an interview.
The framework builds on NARA’s transfer guidance, which outlines the most sustainable file formats agencies can use to save their permanent records. The framework provides an in-depth analysis of the sustainability of more than 500 file formats.
Under the way records schedules work in the federal government, agencies tell NARA what types of records they have, including file formats, and how long they will need those records for ongoing business.
Based on the business needs, agencies can hold onto records for five years, 15 years or even longer before NARA ever receives them.
“In terms of software, five years isn’t a long time. But 10 years, 15 years, 20 years, 25 years can be a long time,” Johnston said. “Releasing something like this digital preservation framework that can help an agency look at what it’s maintaining in its records management program to say, ‘This is a format that we regularly use, that could potentially be at risk, let’s make sure that we’re looking at not just the media it’s stored on but what format we’re storing it in.’”
NARA received dozens of comments on the framework during the public comment period that ended last November. Johnston said those comments helped NARA build guidance for statistical data and e-book file types.
The variety of file formats archivists now encounter also requires a greater familiarity with technology. Library and information science schools, for example, now teach the concept of digital curation, including how archivists should approach geospatial data or email records that have moved to the cloud.
“Records managers are having to become not just archivists and records managers, but technology specialists as well, and that’s a transition that can’t happen and doesn’t happen overnight,” Johnston said.
In addition to handling more complex records, Johnston said federal archivists who were used to pulling records from on-premises computers and servers are now growing more accustomed to working with records in content management systems or on vendor-hosted systems.
Emerging technology has also made some preservation work easier. In moving to the cloud, NARA has its digital catalog replicated in data centers across the country. Johnston said having digital backups is key.
“Someone recently asked me about risks, and they said, ‘The risks to archival records used to be water, fire and insects. And what are the risks now?’ And I said, ‘Well, water, fire and insects,’ because it may be the cloud, but it’s still a physical computer sitting somewhere in a data center, and the same physical risks are true for computing equipment as they are for anything else,” she said.
NARA also moved to the cloud in part because other agencies had already begun the migration. As for the next steps, Johnston said NARA is developing ways to transfer records from an agency’s cloud to its own.
“Being able to work with the records where they are, rather than constantly copying and moving the files, reduces risk, because every time you pick up a file, make a copy and move it somewhere, there’s always more risk associated with it,” she said.