A June 30 deadline is approaching, after which NARA will accept only digitized documents. Agencies must deal with the largest category by volume, known as modern textual records.
Ever since people applied ink to parchment, preserving records has posed challenges. Now federal agencies face a June 30 deadline to digitize certain federal records. The National Archives and Records Administration will require agencies to submit the digitized versions, including metadata for future accessibility. Agencies are moreover obligated to conform to NARA standards in carrying out digitization.
Long in the making and several times delayed, the digital requirement stems ultimately from the never-ending growth in annual production of paper records and resulting volume.
“There’s hundreds of millions of dollars being spent every year by federal agencies to create, manage and store these hardcopy records,” said Anthony Massey, strategic business developer at Canon on Federal Insights — Records Management. The digitization directive, Massey said, is designed to make archiving easier and less costly while making records themselves more accessible.
The various types of documents – maps, photographs, items deemed culturally significant, and 8.5 x 11-inch bureaucratic output – each have their own associated standards and require different technologies to digitize, Massey noted. Guidelines from the Federal Agencies Digital Guidelines Initiative, or FADGI, have helped inform NARA’s standards making.
The initiative got underway about 10 years ago “as a concept of how to begin to guide agencies into what kind of a digitization format they could then roadmap their policy and procedure to,” Massey said.
Many digitizing procedures incorporate scanning. Scanning itself has continually advanced, said Tae Chong, Canon’s manager of new business development. One development especially relates to a type of document known as a modern textual record (MTR).
An MTR typically was created electronically, perhaps with a modern word processing program or – as is often the case with older records about to leave agency possession and move to NARA – in a program whose technical format is no longer extant.
That means digitizing a paper printout using scanning. Now, Chong said, scanning technology includes “software engineering techniques to tell the text from the background and … special software image processing to essentially enhance the visibility of the text element, while erasing unwanted graphics on the background.”
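To make the idea of telling text from background concrete, here is a minimal sketch in Python. It is not Canon's method, which the article does not detail; it assumes a simple global threshold, with the page's mean brightness as a crude cutoff. The function name `binarize` and the sample pixel values are made up for illustration.

```python
def binarize(gray, threshold=None):
    """Return a 0/1 grid: 1 marks dark (text) pixels, 0 marks background.

    gray is a list of rows of 0-255 intensities. If no threshold is given,
    the mean intensity of the page is used as a crude cutoff (an assumption,
    not a description of any vendor's actual processing).
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        return []
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# Hypothetical 3x4 grayscale patch: dark strokes (30-45) on a light page,
# plus one faint background mark (200) that should be dropped.
page = [
    [250, 248, 40, 251],
    [249, 35, 30, 247],
    [200, 252, 45, 250],
]
mask = binarize(page)
for row in mask:
    print(row)
```

In this toy patch the faint mark at intensity 200 falls above the cutoff and is treated as background, which mirrors the "erasing unwanted graphics" step Chong describes. Real scanners would likely use adaptive, region-by-region methods rather than a single page-wide threshold.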
A second element in state-of-the-art scanning, Chong said, encompasses optical character recognition that “can kick in to pick up the text information and pass it to a software application which will then index the document for later search and retrieval.”
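The indexing step Chong describes can be illustrated with a small sketch. It assumes the OCR text has already been extracted and uses a basic inverted index, one common way to support search and retrieval; the article does not say which indexing approach any particular product uses. The record IDs and sample text are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document IDs whose OCR text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical OCR output keyed by made-up record IDs.
ocr_text = {
    "rec-001": "Memorandum on highway funding for fiscal year 1998",
    "rec-002": "Highway safety inspection report",
    "rec-003": "Funding request for records storage",
}
index = build_index(ocr_text)
print(search(index, "highway funding"))  # prints ['rec-001']
```

Because the lookup runs against the extracted text, a researcher can find a record without anyone pulling the paper original.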
He noted that agencies must also by law preserve a paper copy. But once the information is extracted and indexed, public retrieval and viewing will no longer require handling the paper itself.
“The new regulatory requirement is focusing on creating a digital replica of the paper originals,” Chong said.
MTRs differ from cultural heritage documents. In the latter type, the entire area of the document encompasses information to preserve; for example, pieces of artwork or hand-lettered manuscripts. OCR technology won’t yield much information, and the background requires preservation along with whatever else the document exhibits.
“When NARA and the working group of FADGI began to establish classifications of imaging for digitizing these various types of records,” Massey said, “they discovered in that particular context of the printed record, there was a need to get a special type of digitization process called MTR that was simpler, less involved with much less expensive equipment that could do a very high quality image and make it transportable into an archive.”
Because MTRs are almost universally printed on standard office paper, agencies can apply high-speed scanning techniques to them. Massey said agencies have produced billions of MTRs, printing them out as either temporary or permanent records.
For such documents, Massey said, NARA wants an online catalog. A researcher with a particular topic “can go to a Library of Congress online catalog and look up that document, instead of having to go in person to a particular storage site or physically go and handle that document.”
While MTR is a process or image standard and not a hardware standard, Massey said Canon has developed scanners specifically for MTR.
“The hardware must then be aligned to those scanning requirements,” he said.
For practical purposes, speed is an important requirement for MTR scanners. Massey said the faster the process occurs, the faster agencies can clear backfile projects for older records. For new records, he said agencies should consider establishing in-house capability to scan and index records as they create them.
“When records management officers look at day-forward scanning,” Massey said, “knowing that from that day forward they also have to digitize these records, they want access to equipment that can do that at a setting that is confidently MTR capable.”
Copyright © 2024 Federal News Network. All rights reserved.
Tom Temin is host of the Federal Drive and has been providing insight on federal technology and management issues for more than 30 years.