While some agencies have jumped into the big data game without a clear plan for using it, others have been using data to improve their services for years. One agency has even reached the point where it’s discovered that too much data can be just as much a hindrance as a help.
The Veterans Health Administration is working with devices already available to the public, like the Fitbit or the Apple Watch. Those devices are providing new data to health care providers and helping to make VHA’s telehealth programs, the largest and oldest in the nation, even more effective. But the agency is just starting to learn how big the implications of that data can be.
“As we’ve been working in small groups to figure out how biometrics work and how things are doing, I started to get a couple of new devices that are not FDA-approved yet, and I started figuring out quite quickly that if I could baseline my own data, or that of my wife’s, or that of my boss’, we could actually figure out very quickly 24-to-72 hours before we actually had any onset of illness,” Joseph Ronzio, chief health technology officer of the Department of Veterans Affairs, said at the Nov. 17 Immixgroup Government IT Sales Summit. “And we’re able to actually do that repetitively lately so that typically I’m getting sensors that are dinging off and telling me that I’m getting sick well before I am.”
But he also said that while this information is extremely helpful, there can come a point where the influx of data is too much to handle.
“I’ll tell you now, working on the data side, I wouldn’t want your sensor feeds at 100 percent level. Every piece of data I get, I’ve got to maintain for 125 years or more,” Ronzio said. “If I start getting your Fitbit steps every second from now to the next 50 years, that’s going to clog up our systems.”
Instead, he said, checks and balances need to be built into these wearables and biometric sensors. Rather than reporting every single piece of information, the sensors should be programmed to establish a baseline and report only deviations from that baseline.
“That’s why we’re very quickly moving to the point of moving the applications to a level that we can actually give the knowledge and the intelligence to the application on your device, and then have it understand your role more, and understand your variation, so if you start going towards a negative trend, it can ask you, ‘Hey, would you like to package this up and send it to your doctor, because we’re seeing what might be a problem,’” Ronzio said. “That will actually self-regulate, and allow us to actually address medical concerns instead of, ‘Hey, they didn’t walk 10,000 steps today.’”
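The baseline-and-deviation idea Ronzio describes can be illustrated with a short sketch. This is not VA code. The class name `BaselineMonitor`, the 14-reading baseline window, the two-standard-deviation threshold and the sample heart-rate values are all assumptions made up for illustration. The point is only that the device keeps the routine readings to itself and surfaces a reading when it departs from the wearer's own norm.

```python
from statistics import mean, stdev


class BaselineMonitor:
    """Hypothetical on-device filter: learn a baseline, report only deviations."""

    def __init__(self, baseline_size=14, threshold=2.0):
        if baseline_size < 2:
            raise ValueError("baseline_size must be at least 2")
        self.baseline_size = baseline_size  # readings used to form the baseline (assumed)
        self.threshold = threshold          # allowed spread, in standard deviations (assumed)
        self.baseline = []

    def observe(self, value):
        """Return a report dict if the reading deviates from baseline, else None."""
        # Still learning what "normal" looks like for this person: report nothing.
        if len(self.baseline) < self.baseline_size:
            self.baseline.append(value)
            return None
        mu = mean(self.baseline)
        sigma = stdev(self.baseline)
        # Only readings outside the allowed band are surfaced for sending.
        if abs(value - mu) > self.threshold * sigma:
            return {"value": value, "baseline_mean": mu, "baseline_stdev": sigma}
        return None


# Made-up resting heart-rate readings: 14 ordinary days, then two more readings.
monitor = BaselineMonitor()
history = [60, 62, 61, 59, 60, 61, 62, 60, 59, 61, 60, 62, 61, 60]
learning_reports = [monitor.observe(v) for v in history]

normal_report = monitor.observe(61)    # within the wearer's usual range
elevated_report = monitor.observe(72)  # well above the wearer's usual range
```

In this sketch, 16 readings are taken but only the last one produces anything to transmit. A real application would presumably also ask the wearer before sending, as Ronzio suggests, and would refresh the baseline over time. Neither is shown here.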
Meanwhile, the Treasury Inspector General for Tax Administration oversees the Internal Revenue Service, which collects a staggering amount of data on every American citizen, company and organization that pays taxes.
“We collect and process about 200 million tax returns a year, for 145-150 million taxpaying entities,” said George Jakabcin, TIGTA’s chief information officer. “So when you start to talk about what is data, just think about your own tax return, how many lines — depending on how complex, whether you use the EZ form or the full-on 1040 — you start to do the math and begin to realize, ‘Wow, there’s a lot of data being collected on the American taxpayer.’ That’s what we’re talking about. … We start talking about terabytes of data very quickly.”
But TIGTA is not new to the big data game. It’s had data analytics capabilities for 10 years.
In fact, after the 2015 IRS cyber breach, TIGTA put those data analytics skills to use in trying to mitigate the damage. It looked closely at the known group of victims and applied the characteristics it found in that group to a much larger dataset. This allowed TIGTA to identify more than 600,000 new potential victims. The IRS immediately flagged those accounts, which allowed it to prevent fraudulent returns and saved the agency about $4 million.
“That’s one very specific example of looking for those characteristics, that unknown, and applying it to a larger data set and being able to say, ‘Aha, here is a place where you might want to take a closer look and see if there’s something that could be prevented,’” Jakabcin said.
He said the incident was less of a breach and more a case of criminals exploiting a vulnerability in the system. They attempted to download the tax returns of around 200,000 taxpayers, and actually got around 104,000. But they didn’t get any data directly from the IRS. Instead, they used data collected from other places online to access people’s IRS accounts.
“And never say never, but fortunately thus far … whenever the perps come forward, they come armed with the information that they have bought, stolen or otherwise acquired from other sources, so you can be reasonably assured that neither TIGTA nor the IRS is the organization compromising your tax records. It’s a consequence of some nefarious activity conducted before they even get to the IRS,” Jakabcin said.