Using AI to analyze coronavirus data

Ultimately, fully understanding and solving the coronavirus pandemic will be about the data. There’s no shortage of data sources that are growing hourly. Now nine organizations, business and academic, have formed a coalition to bring coronavirus data sources together, and added incentives for researchers who can apply modern data analysis and artificial intelligence to it. Leading this effort is the Silicon Valley company C3.ai, whose founder, Thomas Siebel, joined Federal Drive with Tom Temin.

Interview transcript:

Tom Temin: Tom, good to have you back.

Thomas Siebel: Good morning.

Tom Temin: So when Anthony Fauci says we’re going to be guided by the data, you know the age of data and data analysis has really arrived. Tell us about the effort you’re leading.

Thomas Siebel: Yeah. This is a very interesting initiative that we began called the C3.ai Digital Transformation Institute. And this is a joint effort with Microsoft, the University of California-Berkeley, the University of Illinois in Urbana, Carnegie Mellon, the University of Chicago, MIT and Princeton. And we have – this is an effort to aggregate some of the best minds and resources on the planet and initially focus them on the COVID-19 pandemic. All in, in cash and in kind, we’ve funded this to the tune of about $400 million. The first call for papers just went out, and it’s on the development of new AI techniques to mitigate the COVID-19 pandemic. We have massive computing resources being provided by Microsoft and by the supercomputers at Lawrence Berkeley Labs and at the National Center for Supercomputing Applications. These are two 26-petaflop machines that are bridged by the Microsoft Azure Cloud. On top of that, we put the C3.ai platform, and these universities, they’re in the process right now of allocating an initial, I think, $6 million in research awards to, you know, apply AI to mitigate the COVID-19 pandemic. These are genome-specific medical protocols. You know, predicting the course of the disease. You know, new clinical trials, programs, modeling, simulation and prediction of COVID-19 propagation – this sort of thing. So this is a massive amount of research being kicked off in very short order to help make a contribution to this dialogue.

Tom Temin: And just a quick question on that $400 million. That is a real sum. Where did that all come from?

Thomas Siebel: No, this is a serious contribution. And it was $60 million in cash from C3.ai, and then we have in-kind contributions from C3.ai in the form of software, in-kind contributions from Microsoft in the form of cloud, and then supercomputer resources. So these researchers will have very large data sets to work with, very large computers to work on, and we expect them. And this is a – think of it as kind of an emergency Manhattan Project-type of, you know, exigency where we need, we’re looking for results now. Interesting – all of the learning that comes out of this, all of the data science, all the new techniques, all of the learning goes into the public domain, so it’s available for the world to use. Secondly, associated with this, we’re publishing a COVID-19 data set, which I think will be the world’s largest aggregation of COVID-19 data, that we’re publishing next Monday. It basically – these are data that we’ve aggregated from CDC, from NIH, from the World Health Organization, from Kaiser, from Johns Hopkins. And it’s epidemiological data, it’s X-rays, it’s CAT scans, it’s patients’ history. It’s basically all of the data that we could find in the world. We’re aggregating this data into a unified, federated image that will be published next Monday. And we’re making this resource available to the world at no cost, for COVID-19 research. This will support virtually every researcher who wants to use it, and this effort is being supported also by our partners at AWS, and it will also be made available to the researchers at the Digital Transformation Institute.

Tom Temin: We’re speaking with Tom Siebel, chairman and CEO of C3.ai. And it looks like there’s some federal interest in this as well as federal in-kind contributions.

Thomas Siebel: We’re in very active discussions with the professionals at NIH – the National Institutes of Health – and the CDC. And both organizations are interested in cooperating with us and providing us data sets to make this COVID-19 data lake that we’re publishing as rich and as robust as possible. So we’re getting active cooperation from the United States federal government. And I think that effort is going to dramatically expand the size of this data set available for research. The other area that we’re seeing is these research projects that are being funded by the Digital Transformation Institute. I’ll be amazed if they don’t receive additional funds from NIH and CDC and pharmaceutical companies and others to accelerate what they’re doing. I mean, I cannot imagine, you know, here we are in April 2020 and I cannot imagine a more important use for AI than to address this COVID crisis. And that’s what we’re going to do.

Tom Temin: Sure, and with everything locked down and people keeping their distance, it’s not clear to me, and I don’t think to anyone at this point, what the precise transmission mechanism really is for this thing. Do you think that’s the kind of question that might be answered with some more precision? Such that maybe we could get the economy going, at least to a degree, if people understood how the contact actually happens and the infections occur?

Thomas Siebel: Absolutely. But you think of what’s going on at the C3.ai Digital Transformation Institute. We are aggregating resources from really the finest resources on the planet, the most accomplished data scientists, bioengineers, biologists, including those at CDC and NIH, to really focus their efforts on logistic and optimization analyses for public health strategies and interventions … supposed to design sampling and testing strategies, improving societal resistance in response. Okay, I’m just kind of broad – you know, genome-specific medical protocols, you name it. So I think the probability of something good not coming out of this is zero.

Tom Temin: Well, that’s good odds. And this data set that you will be publishing is available to any researcher. But you also are looking for proposals that come with grants. Tell us more about those.

Thomas Siebel: There’s two things. So the Digital Transformation Institute is the, the principals there that include basically the people who run engineering and bioengineering at Princeton and Carnegie Mellon and Illinois and what have you – they’ve done a call for papers. I suspect there will be hundreds of proposals. They’ll pick some subset of those proposals and they’ll fund them immediately with, you know, $100,000 to $500,000 in cash initially, and then massive computing resources and massive data sets to do their research. But they’re gonna do an extensive peer evaluation of these proposals very kind of, very quickly and get the money out into the research community. You know, this will happen in June, but much of it will go before June. So this is going to happen very quickly. And I think we’re going to see some exciting new approaches. And at the edge of this I’m certain we’ll see some positive contributions.

Tom Temin: One organization I didn’t hear you mention I wanted to ask about is the Food and Drug Administration.

Thomas Siebel: We haven’t heard from the FDA yet. You know, we’ve had a lot of inquiries from NIH and lots of inquiries from CDC. We’ve had some inquiries from the White House, so we’re in discussion with all of those organizations. We haven’t heard from the FDA yet, or the Department of Agriculture yet. I expect we will. But to understand, we just announced this last week, and it’s, you know, kind of kicking off this week and next week. So I know that a lot of people are busy dealing with – everybody’s busy dealing with the crisis they have on their plate. And we haven’t heard from the FDA yet but we certainly invite that dialogue.

Tom Temin: All right, Steve Hahn, if you’re listening, call in. Tom Siebel is chairman and CEO of C3.ai, thanks so much for joining me and good luck on this project.

Thomas Siebel: Thank you, Tom. Have a great day. It was nice to talk with you.

Tom Temin: We’ll post this interview and a link to more information at www.federalnewsnetwork.com/FederalDrive. Hear the Federal Drive on demand. Subscribe at Apple Podcasts or PodcastOne. Stay up to date on your agency’s latest responses to coronavirus. Visit our special resources page at www.federalnewsnetwork.com.