Federal agencies are among the users of and contributors to a data consortium tracking the coronavirus, organized by retail traffic company SafeGraph. It’s making data available about foot traffic to researchers, all for free. SafeGraph’s marketing director Nick Singh joined Federal Drive with Tom Temin for more on what’s going on.
Tom Temin: Mr. Singh, good to have you on.
Nick Singh: Thanks for having me, Tom.
Tom Temin: Tell us about this consortium and then we’ll get into what SafeGraph does. But you have a consortium all contributing data to this effort?
Nick Singh: Yes, we have a consortium of about 1,000-plus organizations, including governments at the city, state and federal levels, and academics from about 80 different universities, who are all using different data sets, including free data sets from SafeGraph, to help fight coronavirus.
Tom Temin: And SafeGraph’s own data sets sound interesting. Tell us about those.
Nick Singh: Sure. SafeGraph is a geospatial data company that uses anonymized and aggregated location data to provide insights. So the two data sets that are getting the most play through the consortium are, one, a social distancing data set, which helps government agencies, as well as academics, understand how well each neighborhood is social distancing. And the second most interesting data set is foot traffic to businesses, derived from this anonymized mobile location data, which helps organizations understand which sectors, which industries and which types of businesses are the hardest hit by coronavirus from an economic perspective.
Tom Temin: And for both of these, this is data derived from the locations over time of cell phones, then, correct?
Nick Singh: Yes, we work with an underlying mobile location data panel that’s derived from 45 million devices. And we take this at an anonymized and aggregated level to create all our different insights. So it’s never so much about an individual device, but more about looking at a whole bunch of devices to create anonymized insights at a place level or at a neighborhood level.
Tom Temin: And how closely together can this data determine people are?
Nick Singh: So it isn’t so much used for understanding population density, but more so used to understand, for example, in social distancing, what proportion of households are staying in per day in each neighborhood where neighborhood is a census block group – about 1,500 households. So that’s kind of a proxy for how well people are doing social distancing and staying indoors.
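The neighborhood-level measure described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the block group IDs and the counts below are invented for the example and are not SafeGraph's actual schema or data. It shows that the metric is a ratio computed per census block group from aggregated counts, never from an individual device's path.

```python
# Hypothetical daily records, one per census block group (CBG):
# (cbg_id, devices_seen_that_day, devices_that_stayed_home_all_day)
# All IDs and counts are made up for illustration.
records = [
    ("060750101001", 120, 54),
    ("060750101002", 80, 20),
    ("060750102001", 0, 0),   # no devices observed that day
]

def stay_home_share(rows):
    """Return {cbg_id: share of observed devices that stayed home}.

    A CBG with no observed devices gets None instead of a ratio,
    because the share is undefined without a denominator.
    """
    shares = {}
    for cbg_id, device_count, home_count in rows:
        if device_count == 0:
            shares[cbg_id] = None
        else:
            shares[cbg_id] = home_count / device_count
    return shares

shares = stay_home_share(records)
```

A higher share for a block group on a given day would be read as a proxy for more households staying in, which is the use described in the interview.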
Tom Temin: I see. So that if you wanted to, say, find out how many individuals are visiting a particular bar or they’re just serving drinks on the street, for example, in the city, could you find that out?
Nick Singh: Yes, we’d be able to do it. But overall, we’ve been able to see that most business foot traffic is down. But this helps organizations understand what neighborhoods within their own locales might need extra resources or messaging to help spread the word to stay at home.
Tom Temin: Got it. In normal times then this type of data would be used by a business to understand the patterns of its customers and how it might adapt itself to suit those.
Nick Singh: Absolutely, Tom. Normally, it’s used by retailers to figure out where to open up their next store, for site selection. It’s used by finance companies to understand how foot traffic is going to a store like Starbucks, and whether to long or short the stock. It’s used by advertisers to help with location-based advertising. In normal times, there are a lot of different use cases. But right now we’re all in on fighting coronavirus with data.
Tom Temin: Sure, and for purposes of coronavirus, or any other purpose, is it possible to know anything about the populations like any other demographics about them?
Nick Singh: Absolutely. So we’re reporting data only at the census block group level, and the U.S. Census itself has given data about each of these census block groups – about 1,500 households. So you’re able to kind of do analysis on what neighborhoods are doing the best social distancing, and then what demographic attributes about those neighborhoods might be correlated to how well they’re social distancing. And that is definitely an area of research for these academics who are trying to understand how response to coronavirus differs according to socioeconomic factors.
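The kind of analysis described here, joining a block-group-level distancing measure to census attributes and checking for correlation, might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the block group keys, the stay-at-home shares and the income figures are invented, and median income is just one example of a demographic attribute a researcher might pick.

```python
from math import sqrt

# Hypothetical inputs keyed by census block group ID (all values invented).
stay_home = {"A": 0.45, "B": 0.25, "C": 0.35}
median_income = {"A": 90000, "B": 40000, "C": 65000, "D": 50000}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def join_and_correlate(shares, attribute):
    """Inner-join two per-CBG tables on ID, then correlate the columns."""
    common = sorted(set(shares) & set(attribute))
    xs = [shares[k] for k in common]
    ys = [attribute[k] for k in common]
    return common, pearson(xs, ys)

cbgs, r = join_and_correlate(stay_home, median_income)
```

Because both tables are already aggregated to the block group, the join needs nothing more than the shared block group ID. A correlation found this way describes neighborhoods rather than individuals, and it does not by itself show that one factor causes the other.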
Tom Temin: We’re speaking with Nick Singh. He’s the marketing head at SafeGraph. And what other types of data sets then are going into the consortium to go along with the two data sets that SafeGraph is offering?
Nick Singh: Yes, we have a transaction data set created by our partners at Facteus, formerly known as ARM Insight. It has anonymized credit card transaction data, which is another way to understand how commerce is happening at these individual locations. That is pretty interesting to a lot of economics professors, as well as the Federal Reserve and other government organizations, who are trying to understand not only the humanitarian crisis but also the economic crisis and recovery.
Tom Temin: Sure, and you’ve got some federal agencies also putting in data or are they consumers of it?
Nick Singh: They’re consumers of it. Due to privacy reasons, they are mostly just consumers, but they will join SafeGraph data with their own internal data sets to come up with some really cool analyses.
Tom Temin: Sure. Do you know any agencies doing just that?
Nick Singh: Yes, the CDC actually just published a big report yesterday, which zoomed in on San Francisco, New York, New Orleans and Seattle, and looked at how social distancing was occurring, using SafeGraph data. They combined it with data from local health departments around COVID cases, as well as data on when different orders were given to stay at home. And what they were able to show was some of the first large-scale evidence of people actually listening to the social distancing orders, and then following suit and actually staying at home. We may intuitively know this to be true, that there are definitely a lot fewer people on the streets, but it’s still pretty cool for these researchers and academics to really have the proof from the data to show, wow, these stay-at-home orders, people are actually listening to them. And that’s from the CDC, as of yesterday.
Tom Temin: I guess you could use the same data mashups to maybe disprove some popular shibboleths, too, and maybe fine tune the whole response?
Nick Singh: Absolutely. People are using this data to understand if there are already neighborhoods that are doing great social distancing. Maybe you don’t need as much advertising or community enforcement or outreach in those neighborhoods. And you can better use the resources for neighborhoods that really do need that help.
Tom Temin: The way I hear this is that federal agencies or any law enforcement agencies at any governmental level or health agencies could take a risk management approach to how they deploy their resources.
Nick Singh: Absolutely. That is how this data is being used. Because, you know, we have finite resources and it’s all hands on deck. So if anything is able to help them adjust their response and use their, you know, finite resources more intelligently through data, that’s exactly what they’re doing.
Tom Temin: And tell us about how the consortium works itself. Is there a central location where data sets can be downloaded? And how does that all work?
Nick Singh: Yeah, it’s all thanks to our partners at Slack. They’ve helped us create a free Slack community for these thousand-plus different organizations. So in that Slack community, we’re all chatting about the data, and different government organizations are asking different academics for help with understanding the data and massaging the data. And that’s also where we’ve given instructions on how these consortium members are able to access our data for free.
Tom Temin: And just to back up, with the 45 million devices that you can have a fabric of data for to track, are these people that have opted into something or is it from the carriers or how does that work?
Nick Singh: Yeah, this is all from opted-in devices. So we get all this data from a different data supplier, and how it works is that the data supplier is getting the underlying data from mobile phones via apps. So this is not based on cell carrier data, but more so GPS data collected through SDKs from these mobile apps that are in the health, safety and navigation space, and it’s 100% opted in. It’s also California Privacy Act compliant. So in signing up for these apps, in the terms it’s explicitly written that data is being used in this way.
Tom Temin: So if you allow an app to access your location data, you might be doing some public good, in other words?
Nick Singh: In a way, yes. And it’s very important for us to emphasize that all of this analysis is not based on any one individual device. It’s all about aggregated and anonymized insights at a neighborhood level or at a point-of-interest level. So it’s never so much about tracking an individual and seeing where they go, or trying to do contact tracing for just one device, or anything like that. It’s all about the neighborhood level: when you’re looking at groups of 1,500 households, what can we kind of approximate as the group behavior?
Tom Temin: Nick Singh is marketing head at SafeGraph. Thanks so much for joining me.
Nick Singh: Thank you for having me.
Tom Temin: We’ll post this interview along with a link to more information at www.FederalNewsNetwork.com/FederalDrive. Subscribe to the Federal Drive at Apple Podcasts or PodcastOne. Stay up to date on your agency’s latest responses to coronavirus. Visit our special resource page at www.FederalNewsNetwork.com.