Federal administrative agencies, and that’s nearly all of them, get their name because they administer regulations spawned by laws. It takes people to make the countless decisions made in the administrative state. Now some academics have looked into whether artificial intelligence might help agencies do their work. Stanford Law School professor David Freeman Engstrom joined Federal Drive with Tom Temin to discuss what they found.
Tom Temin: Professor Engstrom, good to have you on.
David Freeman Engstrom: Thanks for having me.
Tom Temin: First of all, what was the scope of the study and what caused you to look at this particular topic — artificial intelligence in administrative decision making?
David Freeman Engstrom: The impetus for the study was that there’s this emerging debate about public sector use of AI, so the use of AI by government. But much of that debate has focused on the criminal side of government, in the criminal justice system. So think risk assessment tools that are used to make bail and parole and sentencing decisions or predictive policing. These are hot button questions, and we’re so glad that there’s lots of debate about it. But there hadn’t been much attention to the civil regulatory side of the system, and you don’t have to squint too hard to see that there’s actually a quiet revolution going on in how government performs those civil regulatory functions. So we built a team you can only build at Stanford. It had a bunch of lawyers, some business school types. It also had 10 computer science PhDs, and we turned them loose on the federal administrative state to really study how government has begun to use these powerful analytic tools.
Tom Temin: And you found that it has, in fact, begun to use them in a lot of instances. Tell us some of the top line findings.
David Freeman Engstrom: Sure, so we had our teams look across 120 agencies, subagencies and also, you know, full-scale government departments and try to surface every possible use case they could. And they found about 160 of them. They found that 45% of those agencies, subagencies and departments were either experimenting with or had fully deployed machine learning tools of one kind or another. So you can really say that these tools now span the government. They span policy areas from law enforcement to education and everything in between. They span governance tasks, so lots of analysis that agencies have to do, lots of engagement with the public that agencies have to do is being powered by machine learning tools. Lots of delivery of public services. And then most important in our mind, these tools are being used to adjudicate. So think disability benefits at the Social Security Administration or intellectual property rights at the U.S. Patent and Trademark Office. So that’s the adjudication side. And also enforcement: a lot of the major enforcement agencies have begun to use machine learning tools to try to predict who might be violating federal law so that the agency can allocate its scarce resources towards further investigation and perhaps even launching an enforcement action against those targets.
Tom Temin: And when we talk about adjudication, that is typically a decision made by a human based on facts and that human has knowledge of the existing regulations. And, yes, you deserve disability or don’t. And so are the AI tools used to replace that human decision making, or do they somehow inform in such a way that you still have an auditable decision made by a person that is defensible by that person?
David Freeman Engstrom: So I think another top line finding, then, that feeds into your question or is responsive to your question is that we don’t find very many, if any, fully automated decisions within the federal government right now. So these tools assist human decision makers for the moment. In some cases, that’s because of the technical limits of the tools, and in other cases it’s because I think we have very real and legitimate anxieties about fully displacing human discretion. At the Social Security Administration, some of the tools are involved in triage, clustering together cases that are similar so that an administrative judge can proceed more quickly and equitably through those cases. So one of the problems that big, mass adjudicatory agencies like SSA have is backlogs. Sometimes a veteran at the Board of Veterans’ Appeals waits years for a decision. Another problem at these agencies is inter-judge disparities in decision-making. So though cases are randomly allocated to administrative judges, at SSA there are some judges that grant disability benefits 5% of the time, and some judges that grant them 95% of the time. And so something other than the merits of those cases must be driving those outcomes, so you can think of a triage tool that clusters together cases as a way to try to mitigate both of those problems. The most interesting tool at SSA is something called the Insight System, and it is a natural language processing tool that can help spot errors in draft decisions. An administrative judge can essentially input a draft decision and get back an output from the machine that flags any of 30 types of errors within that decision, some sort of basic inconsistency in how the decision is structured, for instance. And so that’s something that very much assists the human decision-maker.
And it’s a powerful tool and it’s an important tool because of the crying needs within the system, because it can help mitigate those backlog problems and because it can try to narrow some of the disparities that might exist among judges.
Tom Temin: We’re speaking with David Freeman Engstrom, professor and associate dean at Stanford Law School. And one of the other findings you have is that agencies are using artificial intelligence for monitoring and analyzing risks to public health and safety. And with the coronavirus issue so much in the public mind right now, give us an example of how you found them using it in this whole public health and safety area.
David Freeman Engstrom: The example that we highlight is in the report and, by the way, the report includes a canvass of the entire administrative state. But then the middle portion of the report actually goes into a lot of detail and isolates seven uses at seven agencies and so really gets under the hood of these tools. And we think that’s really important, because in law you really have to understand the details of the system in order to understand its legal implications. So that middle part of the report has an entire chapter on a really interesting use at the Food and Drug Administration, at the FDA. And one of the things the FDA has to do is monitor adverse drug event reports. So when someone has a reaction to a drug, it often gets reported into a system. Sometimes that’s voluntary on the part of the person who has experienced the problem. But there are also mandatory reporters out there, and the result is that there’s a data set at FDA with just millions of these adverse event reports, and the agency has to parse those reports and make some sense of them. And so there is a natural language processing tool that FDA has piloted that tries to predict in which of those reports there might be an actual causal relationship between the drug and the adverse event. And so here’s an example of predictive analytics that can really powerfully help the agency distill down what it needs to do and decide where it needs to investigate further and maybe decide whether some sort of action needs to be taken, perhaps even whether a drug approval needs to be altered in some relevant way.
Tom Temin: And what about deployment of AI in the improvement process for service to citizens applying for benefits or applying for something they’re automatically entitled to, like Social Security benefits or Medicare, that type of thing? Because that user experience and efficiency of that type of process has always been a tough one, tough nut to crack.
David Freeman Engstrom: The most obvious applications here would include chat bots, and a number of agencies are starting to pilot and even deploy chat bots, and so that’s that very direct point of intersection between state and citizen. And so, HUD has a chat bot that they have experimented with that would help people understand whether they might be entitled to housing benefits of some sort. I know the IRS is very interested in this. The IRS of course is in the business of trying to fight tax avoidance. Now, sometimes that tax avoidance is deliberate, but sometimes it’s simply lack of good information on the part of a taxpayer. And so there are obviously just huge benefits to be had from an automated system that can reliably and accurately get information into the hands of citizens. We also profile in our report an example of use of AI for public service delivery, and that includes the United States Postal Service, which has a couple of pilot projects around autonomous vehicles. And that would include autonomous vehicles that could help with that last-mile delivery, the getting of mail directly to our mailboxes. But it also includes long-haul trucking, which of course, is another aspect of getting that mail to us. So very interesting pilots in place to ensure that USPS is in a position to leverage that technology when it, you know, when it finally goes to scale. And I know that there’s been a lot of debate about how soon exactly that will be. But it does seem like it’s gonna happen eventually.
Tom Temin: Did you find any cases that might offer cautionary tales or some of the dangers or guard rails that government needs to keep in mind when deploying this type of technology?
David Freeman Engstrom: Yes, I think I’d like to answer that question in very general terms, which is that a theme that ran throughout our work and that runs throughout our report is that there’s this very important accountability concern when government uses these tools. And the reality is that these tools actually trigger a very basic collision. So on the one hand there is the law. It’s called administrative law, and it’s the body of law that governs how agencies do their work. It’s built around accountability, transparency and reason-giving, the idea that when government takes action that affects our rights, it’s supposed to explain why. So on the one hand you have that, you have that legal requirement of reason-giving. On the other hand, you have this whole suite of tools built around machine learning that, at least in their more sophisticated forms, are by their very structure not explainable. And so you have this very basic collision. And so the concern is that government makes decisions that are important. They’re important to individual citizens, they’re important to the collective decisions that we make as a society, and yet the decisions might not be fully explainable. And I think in the next 10 or 15 years, as more and more of these tools come online and are used in salient ways by government, I think it’s an issue that first judges, I think, are gonna have to work out and try and understand better. And agency administrators as well. And then ultimately, I wouldn’t be surprised if Congress gets involved and has to think hard about how we can build some meaningful accountability structures around the use of these tools.
Tom Temin: And in choosing their data sets to train tools, did you find that agencies have the skill that they need aboard to make sure that the data sets they do choose give the kind of results that are supportable in the context of what it is they’re trying to do by statute and regulation?
David Freeman Engstrom: That’s a great question and it’s another theme that runs throughout the report, which is that in order to really leverage these tools, both the data but also the analytic techniques you need to analyze the data, we believe that agencies need to really focus on internal capacity-building. Agencies can get these tools through the procurement process. They can always go to a private contractor. They can always go to Raytheon or any number of highly sophisticated private contractors that can provide these tools. But our belief is that in order to create tools that are well tailored to these very subtle governance tasks and that are fairly implemented, it’s gonna take internal capacity to do that. And you don’t simply want to rely on the procurement process to get that next tool. You want to ensure that the government has the capacity to build tools itself, to leverage new opportunities as they come along, and then to oversee the implementation of the tools that it has. And so one of our really big concerns is that overreliance on procurement, overreliance on those private contractors will hollow out the technical expertise of the bureaucratic state. And we think that has really significant implications. Now, you asked, is there enough capacity right now to really extract the data that government needs to use these tools? And I think the answer is no. And I think that there is a very important set of decisions to be made at the agency level and at the government level about just how much we want to invest in this and how important it is to us. And we believe it’s very important. But it’s not going to be cheap.
Tom Temin: And of course you did this report, this whole study for the Administrative Conference of the United States, probably one of the most important and least known bodies operating in the government. What will they do with it? These are the people that decide how the administrative government will behave.
David Freeman Engstrom: So ACUS for short, the Administrative Conference of the United States, has a very interesting statutory mandate. They’re supposed to provide guidance to the rest of the administrative state. They don’t have lots of hard-edged authority in that regard, but they work with agencies. They have a representative from most of the major agencies that’s part of the conference. And so what will they do with the report? Well, they make sure that it gets into the hands of those representatives and they try as hard as they can to make sure it crosses the desks of agency heads and general counsels. And so our hope is that many people in decision-making positions throughout the government will have access to the report and will benefit from it. You know, we held an event up at NYU where we invited technologists and officials from various agencies to get together and talk about these tools. That was midstream in our process of working on the report. So we were still trying to understand better how these tools were being used. But that was an amazing exchange. And what you find is that there are all kinds of left-hand, right-hand problems within the vast federal administrative apparatus. And so you had agency technologists who in theory were just down the street from another agency’s technologists, and they didn’t necessarily know what was going on. Now this is another aspect of that capacity-building challenge I think the government faces. I think the government needs to start thinking more in entity-level terms. And some of that has happened. At the General Services Administration, there’s certainly been some good work. And the White House has been fairly active in the past month, or month and a half, in trying to put into place the pieces of a national AI policy. But hopefully there’s gonna be more and more entity-level thinking that can help solve some of those left-hand, right-hand problems.
Because, of course, in part, you know, the way forward might be more interagency work to try to develop these tools where we can leverage the kinds of knowledge and the expertise that resides across agencies.
Tom Temin: And the administrative government seems to run mostly in one direction with ever more regulations, ever more rules, ever more procedures. I think the code of regulations would fill 25 bookcases at this point. Did you find that people were at all interested or any examples of people using artificial intelligence to, maybe, say this could be done away with or we don’t need this anymore? This is an extraneous type of rule. Administrations try to do that by fiat but it never really shrinks.
David Freeman Engstrom: I can’t think of an example of a tool that’s being used to try to winnow down the amount of regulation. Now it could exist. Agencies are trying to develop tools that can help them analyze rule-making comments, so comments submitted to them as part of the notice and comment rule-making process. And so once you have that tool, I suppose you could imagine a similar analysis to try to go through existing regulations. And you can topically model them, which is an unsupervised way that a machine learning tool can try to understand what’s important within a large body of text. And so I suppose you could imagine a tool that’s gonna help with the winnowing process. But I can’t think of anything in particular that, you know, that we came across in our studies that would be doing that kind of work.
Tom Temin: David Freeman Engstrom is professor and associate dean at Stanford Law School. Thanks so much for joining me.
David Freeman Engstrom: Thank you for having me. This was a real treat.