SSA collected petabytes of data. Now what?

Having petabytes of data but no way to organize or query it is like having billions of dollars in the bank and no PIN. That’s why the Social Security Administration is updating its IT strategy, so it can start taking advantage of that data and use it to make better-informed decisions.

Robert Klopp, chief information officer and deputy commissioner of systems at SSA, told the Federal Drive with Tom Temin that SSA is building an enterprise-wide data warehouse, integrating terabytes (maybe even petabytes) of data so it can be collectively analyzed and searched, and implementing visualization and analytic tools in order to “create a culture shift towards data-driven decision making.”

Klopp described three different types of data, all compiled and stored separately:

  • Enumeration data — “all the data where we capture the fact that people are born and we give them SSNs”
  • Benefit data — “the information that we need in order to determine whether you’re eligible for certain benefits”
  • Income data — “helps us understand how much money you have put away for your retirement, and how we’re going to pay that out”

“And as you can imagine, within those topics, there’s terabytes and terabytes, and even petabytes of data associated with all that,” Klopp said. “All of that data is personal information. And so we’re very careful about how we store and how we use it. We have very strict policy guidelines for almost all of this data and how it can be used.”

That presents the first challenge to SSA’s new data strategy: in addition to migrating all of the data to central storage (the aforementioned warehouse), Klopp’s team must ensure that it is secure and that the policy governing the use of private data cannot be violated.

The second challenge is integrating the vast amounts of data from disparate silos into one central database, and then rendering that data useful.

To do this, Klopp said that the data warehouse has to be “significantly larger” than the combined capacity of all the silos. That is because the warehouse will have to store not only the current personal data but also the history of that information, which allows SSA to track trends over time. He said the warehouse will filter and aggregate records to respond accurately to both general and sophisticated queries.
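Why keeping history makes a warehouse larger than its sources can be shown with a minimal sketch. It is an illustration under stated assumptions, not SSA’s design: the `RecordVersion` fields, the `HistoryTable` class and all sample values are hypothetical. The idea is that an update appends a new version instead of overwriting the old one, so the store can answer “as of” questions and aggregate them into a trend.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RecordVersion:
    """One version of a record. All fields are hypothetical."""
    person_id: str       # illustrative key, not a real identifier
    benefit_type: str    # illustrative attribute
    monthly_amount: int  # whole dollars
    valid_from: date     # date this version took effect


class HistoryTable:
    """Append-only store: updates add versions, nothing is overwritten."""

    def __init__(self):
        self._versions = defaultdict(list)

    def upsert(self, version):
        # Keep every version, ordered by the date it took effect.
        versions = self._versions[version.person_id]
        versions.append(version)
        versions.sort(key=lambda v: v.valid_from)

    def version_count(self):
        # Total stored rows; grows with every change, not just every person.
        return sum(len(v) for v in self._versions.values())

    def as_of(self, when):
        # Latest version per person that was in effect on the given date.
        snapshot = {}
        for person_id, versions in self._versions.items():
            in_effect = [v for v in versions if v.valid_from <= when]
            if in_effect:
                snapshot[person_id] = in_effect[-1]
        return snapshot

    def total_by_type(self, when):
        # Filter to the snapshot for a date, then aggregate by benefit type.
        totals = defaultdict(int)
        for version in self.as_of(when).values():
            totals[version.benefit_type] += version.monthly_amount
        return dict(totals)


# Made-up sample data: two people, one of whom has a later update.
table = HistoryTable()
table.upsert(RecordVersion("p1", "retirement", 1200, date(2014, 1, 1)))
table.upsert(RecordVersion("p1", "retirement", 1250, date(2015, 1, 1)))
table.upsert(RecordVersion("p2", "disability", 900, date(2014, 6, 1)))

print(table.version_count())
print(table.total_by_type(date(2014, 12, 31)))
print(table.total_by_type(date(2015, 6, 1)))
```

In this sketch, two people produce three stored rows, which is the sense in which a history-keeping warehouse outgrows the silos that feed it. Comparing the totals for the two dates is the simplest form of a trend query.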

Klopp said they’re building this warehouse in the cloud, which means they have to get federal authorization. But the first feedback just came in from the cybersecurity personnel, he said, and his team just needs to tweak a few things in response to that.

“Being [Federal Risk and Authorization Management Program] certified doesn’t mean it’s perfectly secure, so on top of FedRAMP, we ask cyber-guys to take a look at what’s going on, to provide us this [authorization to operate], and it’s this very specific detailed cyber analysis that we’re going through now,” Klopp said.

Klopp said he expects the warehouse to be ready and populated with 40 terabytes of data by the end of the calendar year, and that the new program will be rolling out to beta testers around July.

“I’m an engineer, so I’m not eternally optimistic, so I will tell you at this stage of the game, it’s more than ‘hopefully,’” he said. “We’re tracking very solidly towards having it delivered at the end of the year. Right now I don’t see technical risk in it or anything like that; I think we’re going to make those dates.”

Klopp said that once the system is in place, SSA will roll out visualization and predictive analysis tools to help employees work with the data. He said he then has to begin training people to use these tools, creating a “cultural shift” towards data-driven decision-making within the agency.

“We’re considering a different model where we view mySSA as a product where we continuously work and improve that thing as we go, much the same way as a commercial software company would continuously work to improve their product. This changes the investment paradigm from having to look at every feature as if it was a separate investment,” Klopp said.

He said that gives SSA more flexibility when it comes to planning the budget each year. If SSA is flush, it can funnel more money toward the project on the fly, or reduce the flow when funding is tighter.

The way to implement this scenario, Klopp said, is to either increase or decrease the number of contractors working on a product. The core IT work at SSA will still get accomplished, because the agency uses employees for about 70 percent of the work in the IT field, and Klopp said he would like to increase that number to 80 percent.

Copyright © 2024 Federal News Network. All rights reserved.
