NARA takes on digitizing modern textual records and other formats

The deadline for agencies to submit digitized records for archiving and preservation by the National Archives and Records Administration came and went.

The deadline for agencies to submit digitized records for archiving and preservation by the National Archives and Records Administration (NARA) came and went earlier this summer. For an update on dealing with modern textual records and other types, the Federal Drive with Tom Temin spoke with two people from NARA: Lisa Haralampus, the director of Records Management Policy and Outreach, and Denise Henderson, the director of digitization for the Office of Research Services.

Interview Transcript: 

Denise Henderson  NARA has been digitizing since the early 2000s. We had still imaging labs where we were doing small-scale digitization. But as time has gone on and things have changed and evolved, we realized that we needed the ability to expand our capacity. So, for the last year and a half or so, there was a renovation effort in our Archives II building at College Park, and we’ve renovated 18,000 square feet and established a modern, state-of-the-art digitization center. It’s a mixed-use space that co-locates our work processes: archival prep, which is the preparation of records before digitization, metadata capture, and then ultimately scanning. So, we brought the different functions together in one location. We have a fleet of top-of-the-line imaging equipment that ranges from overhead camera setups to flatbed scanners to microfilm and microfiche readers, and we purchased three new IBML FUSiON HD 8300 high-speed scanners that will exponentially make more records available online.

Tom Temin  Digitization is really multiple processes. If you’re taking a picture of an ancient manuscript, you know, handwritten by Benjamin Franklin in ink on parchment or something, that’s one thing. And if it’s an email from last week that needs to be preserved because it got some politician in trouble, then that’s a whole different type of scanning and digitization.

Denise Henderson  Absolutely, I’m so glad that you said it is a multi-part process. And different records require different techniques. At NARA, what we’re digitizing in my unit are the records created by federal agencies that have been accessioned into the National Archives’ holdings. So, they are archival records that are no longer in active use by the agencies, but that are being accessed every day by researchers. And it ranges, as you said, from something from the Revolutionary War period all the way up to modern paper. So, we have equipment to handle anything that comes our way.

Tom Temin  All right. And Lisa, in the larger sense, agencies are facing deadlines they’ve known about for years for how they have to send records into NARA. Update us on what the requirements are, what the legal requirements are, what the deadlines are, and then we’ll get into some of the specific types of records.

Lisa Haralampus  Thank you, Tom. I’d like to frame the conversation with the title of our agency. We both work at the National Archives and Records Administration. Denise works in the National Archives part and, as she said, there’s so much fascinating work happening within our stacks, within our walls, as we digitize the archival records that we have. I get to work in the “and Records Administration” part. It’s my job to create the policy and guidance framework that helps agencies that are creating records today that will become part of the archives tomorrow. So, one of the strategies we are in right now: NARA has said in our strategic plan, and in memos that we’ve jointly written with OMB, that NARA is going to stop taking analog records. Sometimes I say analog, sometimes I say paper. Paper is the shorthand, but what we’re saying is we want all of the records that agencies transfer to us from now on to be in digital format. So, the deadline for that, and your timing of this interview is wonderful, is June 30. So, in little more than six weeks, there’s going to be a major shift in how NARA accessions records. The records from agencies are archivally valuable permanent records that are part of our nation’s treasures. So, how’s that going to work? Well, in 2019, we issued digitization standards for temporary records, because we know that agencies are not just digitizing for archival needs, they’re digitizing records for their business needs. Most records fall under the category of temporary, and we issued those standards at a very high level five years ago. But last year was a big moment for us, when we issued regulations that covered paper records and photographic prints. We figured paper and photographic prints cover about 85% of what is coming to the National Archives, so we got the biggest set of records covered first. Those standards are very detailed. They are almost like a checklist.
And they explain to agencies and the vendors supporting agencies what needs to happen to create that digital image, that version of that permanent record that is coming to NARA. We are not accessioning the paper and the digital image; we are only going to be bringing in the digital image. So, when you have that as your goal, that we’re not going to have the source record, the regulations have to be tight. They have to be specific, and they have to be very comprehensive, sort of step by step. So that is the start. Tom, where do you want me to take the story next?

Tom Temin  Well, yes, the term digital image is what I’m a little hung up on. And you can explain this. I don’t think any photograph created by the government is other than created digitally anymore. Maybe somebody has film, you know, in the back of a Hasselblad, but I think 99.99% are digital already. But digital and paper, even paper records are born digitally. When the OMB issues a memo on, I don’t know, how to do artificial intelligence, people may print it out. But that wasn’t written on a typewriter.

Lisa Haralampus  Yes, exactly, Tom. So, for us at the National Archives, think of us as living in the past. We’re giving guidance on how agencies are transferring records to us today. But when were those records created? Generally 20 to 30 years ago. So here we are in 2024, and we’re accessioning records that were created maybe in 1994, maybe 2004. That’s why we have so much paper that’s coming in, because at that time you created them by using your WordPerfect, right? Go back in time to where you were in 1994. And I’m sorry to say, when you think of it now, the strategy for preserving emails that were written in ’94 was to print and file, and to remember to print and file all the metadata related to that. So, what would we do now with boxes of emails created in ’94? We would say, please digitize those and send them to us following our digitization standards. So, it’s this flow. It is crazy to say, but I hope that at some point these digitization regulations, these digitization standards, are not needed in another 10, 15, 20 years, because we will have born-digital records to accompany our digitized records.

Tom Temin  That’s really a key point. Yes. And you’re talking about something that was created possibly, but not necessarily, in an analog setting. They didn’t have typing pools even as late as ’94. But people were still using Wang word processing. I mean, I remember agencies had that. So, you have a format issue. Perhaps, if you tried to open a Wang-created document in Microsoft Word, you probably couldn’t open it. So, there’s that issue.

Lisa Haralampus  Right. So, we have format standards that we use at the National Archives to say, all right, records have been created in so many different formats over time by so many agencies, depending on what they’re doing. Here’s a list of formats that we, the National Archives, will take, because we think we can sustain them. We call it our preferred format list and our accepted format list. So, the good example is email. We will take PST files, we will take EML files, we will take XML files, but you won’t see, like, Lotus Notes on that email list. We’re like, no, we can’t handle a flat image; we need the email to be sent to us in a format that we can maintain. And we do that for many types of federal records.

Tom Temin  Sure. And that gets to the issue of where there is a printout, and you don’t want that paper. Suppose it was a Lotus Notes printout, and so all they have that’s archivable, in effect, is the paper. Then you get into standards for that paper. And tell us how the regulations also relate to the so-called FADGI guidelines. The Federal Agencies Digital Guidelines Initiative guidelines were not rules, but maybe they are the basis for rules.

Lisa Haralampus  Exactly. So, when the federal government issues regulations, one of our directions from OMB and various legislation is that you always want to start by using consensus standards or voluntary standards. Unless your federal mission is really unique and you are the standards authority, we try to base our standards on what’s common practice. So, when we were developing the digitization standards for permanent records, we went and looked: well, what would we base them on? At the National Archives, our job is to preserve our nation’s history. We are a cultural heritage institution. So, we are using the FADGI guidelines, because those were standards that were created to handle cultural heritage materials. NARA is a partner in FADGI; we actually have representation on that group. It’s led by the Library of Congress. The Smithsonian is involved in those standards, as well as other groups. So, they are a voluntary consensus digitization standard for cultural heritage materials. What they capture, very specifically, is how digital imaging science has changed over the past 20 years. When I think of digitization standards that were issued in the past, Tom, you’d hear people say things like, “Okay, 300 dpi. Got it. 300 dpi is NARA’s digitization standard. What’s your new standard, Lisa?” And I’m like, “Well, I’m so glad you asked the question that way. Because if you’re asking about dpi, you haven’t learned yet how imaging science has changed.” So, the FADGI standards were our basis for the technical component of scanning, which includes things like: what is the allowable error for noise? How do you test to make sure that you’ve got a calibrated workstation, so you know your image is what you produce?
And although Denise does not follow our digitization standards, because those are for agencies, I’m sure she could also explain a little bit more on the technicality of how sophisticated digital imaging standards are now, so that we know that the image we get is a faithful representation, detailed enough for us to be able to serve the public in the future. It’s really about that level of detail that we capture.

Copyright © 2024 Federal News Network. All rights reserved.