The baby boomers used LSD to hallucinate. Nowadays people use generative artificial intelligence to hallucinate, to create misinformation that seems credible. Now the Government Accountability Office has published a detailed study of generative AI and its implications. The Federal Drive with Tom Temin got more now from the GAO’s director of science and technology assessment, Brian Bothwell.
Interview Transcript:
Tom Temin And this is a fairly brief report, but it’s kind of packed with information. Maybe just the one-sentence description for people who may have had their heads under a pillow for the past six months. Generative AI is one branch of AI. And what characterizes it?
Brian Bothwell Well, generative AI is a content creator. It’s a system that, with prompts, maybe just minimal prompts, will answer your questions, will create pictures for you or create video for you. It can actually aid in some complex design processes like designing molecules, new drugs or generating code for programming.
Tom Temin And when you type in your prompt, is it going out to the internet and finding everything it thinks is relevant? Or do the different platforms have their own, I don’t know, body of knowledge built in?
Brian Bothwell The large language models, for example, like ChatGPT and Bard, are trained on a large, large amount of data. What exactly is in that data? I think that depends on who’s creating the model. So when you type in your prompt, the model is taking that prompt and then looking at the data it’s been trained on to give you a result.
Tom Temin So then it shares the same possible weakness with every kind of AI. And that is so much depends on what you use to train it.
Brian Bothwell Exactly. And I’ll just give you a for instance. Personally, I went to one of the large language models and I asked it, just for kicks, to write a short bio about me. And I had to give it a second prompt to tell the model that I was with GAO, because there are other people out there with my name. And it came back with a bio really quickly, but there were some things wrong in it. You talked about hallucinating at the top of the show. It said that I graduated from college from a place I never went to. It said that I had a master’s degree in a topic that I did not. So it’s going out and scraping data, or it’s been trained on this data that’s out there, but it doesn’t necessarily give you an accurate response.
Tom Temin So it can go from pretty close to complete nonsense, basically.
Brian Bothwell Yes. Or a lot of times I think in between, where it gives you a lot of accurate information, but it’s interspersed with things that are not true.
Tom Temin Because one of the points in your report is how mature is it? So that’s my question. How mature is it? It might be very mature in terms of the capabilities of the algorithm, but if the data is all flawed, then it doesn’t matter how good the algorithm is.
Brian Bothwell Yeah, that’s a good point. I mean, these things continue to advance and they keep training them on more and more data. Newer models keep coming out with more data, more parameters. So they’re improving, but you still need to be able to check and see, okay, is the output actually accurate? And that’s where some of the issues are.
Tom Temin And besides false information, you list some of the other possibilities, some of the challenges of it. What are the chief ones that people need to worry about besides misinformation in the first place?
Brian Bothwell Yeah, we list several of those. I think one is, can you trust these models? How do you perform oversight on these, for example? These are generally called big black boxes. You put something in, you get something out. But what’s going on in the model? And since these are so large and have so many parameters in them, it’s really hard to be transparent. The other thing we already talked about was false information, hallucinations. There are also some economic issues, like what data are these models trained on? If you go scrape the Internet for a bunch of data, you might be collecting copyrighted data, other sensitive data. Privacy risks also can be involved when you put in a prompt. What kind of information do you put in? Where does that information go? Is that stored in the model? Is that used as more data for the model to be trained on? Those are the kind of issues that people need to think about.
Tom Temin For that matter, it could be scraping up other documents generated by generative AI, and therefore you get increasingly less accurate the more you go.
Brian Bothwell Yeah, that’s another point. It’s like multiplying fractions, right? I mean, theoretically, if you have your model trained on inaccurate data and then you are producing more inaccurate data based on that, if you feed that into your database, yeah, I think what you were talking about might happen.
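The "multiplying fractions" analogy can be made concrete with a little arithmetic. The sketch below is only an illustration, not something from the GAO report: it assumes, hypothetically, that each round of training on the previous round's output preserves a fixed fraction of accuracy (0.9 here, an invented figure), so the fractions multiply from one generation to the next. The function name `compounded_accuracy` is likewise made up for this example.

```python
def compounded_accuracy(per_generation_accuracy: float, generations: int) -> float:
    """Accuracy remaining after several generations of retraining.

    Simplifying assumption (hypothetical, not from the GAO report): every
    generation keeps the same fixed fraction of the previous generation's
    accuracy, so the fractions multiply together.
    """
    return per_generation_accuracy ** generations


# Print how a hypothetical 90% per-generation accuracy shrinks over five rounds.
for generation in range(1, 6):
    remaining = compounded_accuracy(0.9, generation)
    print(f"generation {generation}: {remaining:.3f}")
```

Under that assumption the share of accurate content can only shrink with each generation, which is the effect described above; real systems would not follow such a clean rule.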
Tom Temin We’re speaking with Brian Bothwell. He’s director of Science and Technology Assessment at the Government Accountability Office. On the other hand, there are opportunities possibly for it. What do you think are the chief ones there that you found?
Brian Bothwell Yeah, there are some fantastic opportunities. It has great potential applications across education, government, medicine and law. These opportunities here are really great at summarizing information. You can go ask the model a question about a topic and it’s going to give you maybe a slightly flawed answer, but it’s going to give you a lot of good information about the topic you’re looking for. These models can actually enable automation, make it much easier to produce things and take less person power to do that kind of thing. It can improve productivity. I can think of an example where, if you have somebody who has to do copywriting or they have to do advertising for their products, you could ask one of these models to write something up and you’ve got something that’s not a blank sheet of paper anymore. You’ve got something you could start with and refine from there to use for that kind of purpose.
Tom Temin And it strikes me that going back to the challenges, what would happen if someone tried to copyright, say, something that was created by generative AI when presumably whoever typed in the same prompt would get the same result? So how could you purport to claim that’s copyrightable?
Brian Bothwell Yeah. The U.S. Copyright Office has already got some guidance on copyrighting works using generative AI. They’ve already said if you generate something solely from a prompt to a generative AI system, you can’t have a copyright for that.
Tom Temin And what did you find when you tried it out in this document that GAO has published? The prompt was to draw orange cats in an abstract style, and you got some pretty good cartoons out of it. What does that tell us about the process to use these things in a responsible way?
Brian Bothwell Well, we didn’t actually use the process to generate that graphic, but we used our knowledge of what we learned about how they work to create the graphic. But I’ve been working with plenty of people who have tried this out on the side. We’re not using it as an agency. We also look at the tool and see what steps we need to ensure proper use. But we’re not using it for our work. We’ve got plenty of people really interested in it. They’re doing this on the side with their personal accounts. There’s some pretty interesting stuff that they’ve developed.
Tom Temin Yeah. So the pseudo reality of it, I guess, ultimately is what people need to worry about. That plus whatever privacy might have been violated, or whatever data that is maybe sensitive but unclassified would end up in there.
Brian Bothwell Yeah, that is one problem. I think that’s what, for example, some agencies have worried about: you really can’t expose nonpublic data to these systems because you don’t know where that information is going. And that’s a concern.
Copyright © 2024 Federal News Network. All rights reserved. This website is not intended for users located within the European Economic Area.
Tom Temin is host of the Federal Drive and has been providing insight on federal technology and management issues for more than 30 years.