Open source: An efficient way to handle data


This week on Federal Tech Talk, host John Gilroy spoke with Howard Levenson, regional vice president for Federal at Databricks.

Most federal listeners understand that big data is a constant today, but Levenson raises an uncomfortable point: this data is not static. It is constantly changing. Given that fact, how does a federal professional ensure that key decision-makers receive real-time information?

Levenson offered the example of a military jet aircraft. Each flight produces one terabyte of data. Multiply that across a fleet in constant movement, and the result is a data set that is continually amended and altered.

The federal community appears to have gotten a handle on harnessing the data. The real question is how to keep that ever-changing data in a state of readiness for analysis. Even if you start with a normalized data set, that does not mean incoming information is ready for analytics.

The analytics Levenson referred to are, of course, machine learning and artificial intelligence. Let us not forget that even the top computer science genius at MIT cannot create a useful algorithm without clean data.

Levenson suggested that an open-source approach on a platform is the most efficient way to keep data current in today's world.


Federal Tech Talk

TUESDAYS at 1:00 P.M.

Host John Gilroy of The Oakmont Group speaks the language of federal CISOs, CIOs and CTOs, and gets into the specifics for government IT systems integrators. Follow John on Twitter. Subscribe on Apple Podcasts or Podcast One.