How data is driving the 2020 census to be more accurate, efficient

The Census Bureau opened up its online response form for the 2020 count Thursday.

Getting to that day, however, wasn’t easy.

Atri Kalluri, the senior advocate for Decennial Census Response Security and Data Integrity at the Census Bureau, said it took a lot of time and effort to bring innovation to the decennial count.

“We have worked very hard to basically ensure we have systems that are easy to use, that are safe and it’s an important undertaking for the government and for the entire country,” Kalluri said in an interview after speaking at the Splunk Government Summit in Washington, D.C. “We worked with industry experts as well as federal agencies to protect the confidentiality and privacy of the information the public gives us. We depend on that trust to maximize that self-response and we take that very seriously. We have a good track record of maintaining the privacy and the confidentiality of the data that we get through our various surveys and censuses. With that experience and with the modern technologies, we have fielded systems and solutions that we feel are ready to go and will work securely for us.”

The move to the online response option, which Kalluri emphasized is one of three the Census Bureau is offering this year along with phone and postal mail, is both the biggest difference from past nationwide counts and key to reducing costs and increasing accuracy.

Kalluri said in 2010 the bureau relied almost entirely on paper records and hundreds of thousands of people walking door-to-door collecting data.

Through the online response form and other innovations, the Census is expecting more accuracy at a lower cost this year. Census Day is April 1 and the entire effort must be completed by Dec. 31.

Kalluri said Census has tested the online response platform every year since 2013 to build trust and continually improve the security architecture.

“We worked for a long time with industry leaders to make sure we had the best solution possible,” he said. “We have confidence that our solution is the best we can put out there. It can scale to 600,000 concurrent users.”

As for the security effort, Kalluri said the Census Bureau keeps the data in a “secure vault” to protect its privacy and confidentiality.

“This is the most advanced census ever so we are encrypting our data at rest, all of our databases and all of our iPhones used by field workers,” he said. “We also are encrypting data in motion so when data is flowing from one system to another, we ensure it’s encrypted. The idea there is to ensure our field workers who are using these devices, even if they lose their device, that we don’t lose the information. It’s not only encryption, we have ways of wiping the device.”

Kalluri said that over the next six to nine months, he will work to ensure the systems stay secure and that information can be shared securely with bureau and other administration officials.

Data driving better decisions

Along with the online response form, Census rolled out several other time- and money-saving innovations.

Kalluri said the combination of Postal Service, state and local government and other data sources reduced the number of field workers who had to confirm addresses.

He said in 2010 field workers had to “walk every street” to verify addresses. For 2020, Census could use the data to verify about 65% of all addresses, meaning field workers had to verify only the remaining 35%.

Kalluri said that meant the bureau hired only about 32,000 people instead of 135,000 to do this work.

“We have looked at the technologies and used imagery to compare different vantages and detect change so that we could concentrate on those areas where there is change and try to do the canvassing in office,” he said. “We still identified areas through that analysis that required field verification.”

The third area of innovation is using geospatial data to optimize the routes of more than 500,000 field workers, who are using iPhones.

Kalluri said the data also is helping the bureau ensure field workers who speak specific languages, whether Spanish or Farsi or Chinese, go to the areas where they could use those skills.

“It’s an intelligent way of using the data that we already have. We are seeing huge improvements. Just through address canvassing, we have seen that there are efficiencies through the optimization of the routes and the address listers could complete the verification of address at a much faster pace than what they have done in previous years,” he said. “The fourth innovation is the use of administrative records. If there are housing units that are vacant or non-existent, is it necessary to verify by knocking on the doors of those housing units? Are there better ways to do that? If the information about unoccupied or non-existing housing units are made available to us, we could use that and eliminate the unnecessary knocks on the door.”

Administrative records are information the public provides to the Census Bureau or to state and local governments.

“The idea of doing the data analysis and making decisions based on what we see as evidence is in the core of what we do,” Kalluri said. “We did a lot of research as part of various census tests we conducted. We fielded new ways of conducting the operations as part of the tests and learn from them, and help improve on the solutions we put out for the decennial census.”

Kalluri added that over the next three to six months, the bureau will use the data it’s getting back to make improvements to its processes to ensure operations are heading in the right direction. He said the data also will help Census decide how much more or less capacity it will need in the cloud so as not to cause slowdowns in the collection and processing of data.