HealthData.gov has reached old age in technology terms. Just five years after its launch to make health data more accessible, the technology running the portal is well behind the times.
Damon Davis, the director for the Health Data Initiative in the Department of Health and Human Services, said over the next few months the agency’s IdeaLab will usher in a new set of technologies to update the platform and add more capabilities to the data.
“What we are going to try to do is advance the platform to be something that is going to be more usable, better search, easier access to data sets that also will support more ‘liquid’ forms of data,” Davis said recently after a panel discussion at the Symantec Government Symposium in Washington. “We really want to support a community of data users. We want to really attract people to a more sticky, more usable platform that’s going to allow them to understand how the data can be used from its original way it was collected and curated, and also be supportive of each other.”
Davis said the hope is that by creating a community around the health data sets, HHS components can share and collaborate more easily and let outside experts help solve problems.
HHS launched HealthData.gov as part of President Barack Obama’s open government initiative in 2011 with 197 datasets. The portal now has more than 1,900 datasets and hosts several apps, including Hospital Compare and TXT4Tots.
But as digital services have evolved, the portal hasn’t kept up.
Davis said one major area of focus is how HHS presents the data. He said the move to a more “liquid” form is all about usability.
“The idea is really to get away from a lot of the unstructured data that we currently produce and get to more structured data that is available in machine-readable formats. We have to move away from all of these PDFs that are beautiful on paper, but are not necessarily something you can readily utilize when you are trying to mash up data,” he said. “When you are trying to do analytics and reach some sort of hardcore knowledge generation, it’s just not going to happen with a stack of PDFs be they in electronic format or whatever. We really want to drive a lot of the agencies across the department to produce more machine-readable formats, the CSVs, the JSON files, a lot of structured output that will allow folks with real hardcore computing capabilities to take in multiple, disparate datasets and really start to advance knowledge generation across multiple different domains of health and social services.”
Connect data owners, users
When HHS launched the portal in 2011, it relied on the open source platforms Drupal and CKAN. But as Web services technologies advanced, HHS had to do a lot of patchwork to make these separate platforms integrate.
To increase capabilities and usability of HealthData.gov, health data officials turned to the HHS IdeaLab for help.
Davis said the IdeaLab is leading the migration effort to a more Web services-friendly platform known as DKAN, which is based on Drupal but includes more capabilities.
Davis said he expects HHS to launch the new platform this summer.
The IdeaLab plans to use an agile approach to the new portal so it can improve over time at a relatively low cost.
He said the new technology will help with administering the back end of the platform as well as improve communication between the community of users and the data owners.
And the new portal will show the user community that it’s bigger than most people think.
Davis said one HHS bureau takes advantage of another’s data quite often, and the new HealthData.gov will provide connections between data owners so they can better share and complement each other’s data.
“From an external perspective, we are really trying to focus a lot more on supporting things like the entire research enterprise. How is it that we can have dataset provenance?” he said. “We want to understand where the data is collected and curated, and then how is it used way down the line in some research project or some publication or what have you and be able to track back through all of the various folks who have touched the data to understand how it has evolved over time, the level of quality or what have you.”
In many ways, HHS is trying to ensure the data is secure, valuable and accessible, and that it drives decision-making.
Davis said HHS must understand what data exists, where it lives, how users are gaining access to it and who some of its biggest customers are.
“It’s an extremely long tail of better decision making based on a better awareness of where the data lives, what kind of systems are needed to maintain it, what should the lifecycle of the data be so we need to think through what data retention needs to look like so we are not unnecessarily expending costs to manage data that nobody is using,” he said. “We want to be a little more intelligent about the data management as well so we are looking for cost reductions, looking for efficiencies in how the data can be utilized, all sorts of management decisions that will be implemented at all the operating divisions across the department.”