Award-winning way to safely use generative AI


Tom Temin@tteminWFED

April 11, 2024 2:02 pm

Large language model artificial intelligence. It’s like some exotic new oil. Everyone thinks it can make great recipes, but not many know how to cook with it. Well, some people do. Like a group at the MITRE Corporation, which recently won an award for a product they developed called m-chat. For details, Federal Drive host Tom Temin was joined in studio by MITRE’s senior vice president, Cedric Sims.

Interview Transcript:

Tom Temin Well, let’s begin at the beginning. What is M-chat? What have you done here?

Cedric Sims Well, thank you for the opportunity. Hopefully you know a little bit about MITRE Corporation. I’ll share, if I can, just real quickly.

Tom Temin We know it’s the nonprofit that does so much research and development on behalf of the government. And then a cooperative research and development operation agreements. Fair to say?

Cedric Sims Very fair to say. And we also do all that only in the public interest. So it’s a great opportunity for us to serve our nation in unique ways, bringing some of the best talent that’s available to some of the most complex problems in the world.

Tom Temin And one of them is how to use generative AI at the moment.

Cedric Sims Exactly. There’s so many technologies that approach us and they approach us quickly. And today’s innovation systems, technology comes to us at a pace that we’ve never seen before. In generative AI, imagine just 24 months ago, only the deepest researchers were talking about it. Now it’s everywhere.

Tom Temin They’re talking about it. And we know that the publicly deployed artificial intelligence is producing about 50% useful material and 50% nonsense. And of the 50% nonsense, half of that is utter nonsense. Yet people still think there’s value. So tell us about what you’ve done that might help federal agencies use this safely.

Cedric Sims You mentioned earlier M-chat. So that M stands for MITRE, and it’s our MITRE ChatGPT environment. The commercial product most people know, called ChatGPT, was created by OpenAI. We saw an early need not only to understand what the capabilities were, but also where some of the risks were. So one of the best ways to do that is to experiment and understand the technology more holistically. So we worked very closely with OpenAI, and also with Microsoft, to deploy a special version of ChatGPT, specifically for the MITRE Corporation.

Tom Temin So it’s in an enclave where you can totally control the data that feeds to train it.

Cedric Sims Exactly. And that enclave is a special and secure environment where we can take unique and bespoke data that is unique to our sponsors, as well as some of our industry partners. We can do special analysis in that arena. And we can do it in such a way that it is secured, and we make sure that only the data and its outputs are available to MITRE and our sponsors.

Tom Temin And the big word nowadays, especially from just a few weeks ago, the administration put out a follow up to its executive order on AI, and one of the strictures that it specified was human in the loop. And so talk about how the human in the loop remains important even in an enclave, even with, as you say, unique and bespoke data. So that’s very carefully handled. You still have to make sure that what comes out, yeah, that looks right.

Cedric Sims That’s a great point. One of the interesting things about generative AI is that the technology itself does some unique things, and we’re still learning how it works and why it works the way that it does. There’s a lot of discourse right now, too, about human in the loop. Some people are talking about, will AI replace jobs? No, AI will not replace jobs. But what we will see is AI being a significant coworker, if you will, to the work that’s already being done. So that human in the loop is absolutely critical. The human in the loop will generate the prompts that make the AI function. The human in the loop will also validate the outcomes of those prompts, and then make sure that it ultimately provides the value that’s expected from the intent where we started.

Tom Temin And getting back to that point, that the publicly available ChatGPTs do go way off the rails, and this is a well-known phenomenon, and the companies themselves don’t hide that fact. Are you able, in an enclave like at MITRE, to understand the mechanisms by which these algorithms do start to take things off track?

Cedric Sims There’s some unique aspects to that. There’s a special way in which research needs to be done to understand how the models were trained, because how they’re trained actually affects what they produce in terms of output. And also understanding how specifically to leverage the models through this concept I mentioned earlier, called a prompt to make sure that they’re actually doing the things that are intended. So part of our work is not just to utilize generative AI, but also to understand the threats that are within generative AI, whether it be introduced because of training errors or, quite frankly, misuse, producing the kind of off the rail responses that you were mentioning.

Tom Temin We’re speaking with Dr. Cedric Sims. He’s senior vice president for enterprise innovation and integration at the MITRE Corporation. And you mentioned prompts, and that’s kind of the country cousin that sometimes gets underappreciated: how you prompt these really has a huge influence on what comes out. And so maybe the training for a lot of would-be federal users is prompt training, fair to say?

Cedric Sims Absolutely. So, Tom, one of the things we’re doing starting with AI in our own workforce is to do training for our workforce about how to do prompting. We have a set of intermediate courses, we have a set of beginner courses, and we have some advanced research related work that’s helping to understand how to get the best out of these various systems. We look at ChatGPT as more of a general kind of generative AI, but there are some very specialized generative AI capabilities that are looking at cyber threats to be able to actively defend networks. There are special versions of AI that are looking at health care related outcomes to help us understand new ways to approach cancer treatments and other matters like that. So we’re looking at generative AI, we talk about ChatGPT, but there’s so many versions of this that will be critical to different sectors as we move forward.

Tom Temin And just a detailed question in the operational sense. If you train people to do the right kinds of prompts, say, an organization that has users and you want to help them with HR or something, that’s the common example. Can you design a system such that bad or out-of-band or purposely malicious prompts don’t get entered?

Cedric Sims So yes, absolutely. There are ways to build a front end to actually analyze the prompts. Interestingly enough, you can use generative AI to do that too.

Tom Temin I was going to say you have AI controlling AI.

Cedric Sims Well, not quite that loose, but yes, you can certainly preview prompts to look at whether or not the prompt itself might have a biased input that might create a biased output. There are ways to look at prompts to see if the prompt is appropriate for the specific AI model, which might require, again, maybe a different use or application of the prompt. So there are some different ways to control the front end. We do front end control of our m-chat environment within MITRE Corporation, and it’s very critical so that we can make sure that the use is appropriate. So we want to have our employees leverage the capability, but we definitely want to make sure that we get the appropriate use of the capabilities.
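
As a rough illustration of the front-end prompt preview described above, the sketch below checks a prompt against a few simple rules before it would be passed to a model. It is not MITRE’s implementation. The block list, the length limit, and the names `screen_prompt` and `ScreenResult` are all hypothetical, and plain keyword rules stand in for the generative-AI-based review mentioned in the interview.

```python
from dataclasses import dataclass

# Hypothetical policy values for illustration only; a real deployment
# would define its own rules and would likely go well beyond keywords.
BLOCKED_TERMS = (
    "ignore previous instructions",
    "password",
    "social security number",
)
MAX_PROMPT_CHARS = 2000


@dataclass
class ScreenResult:
    allowed: bool  # True if the prompt may be forwarded to a model
    reason: str    # short explanation of the decision


def screen_prompt(prompt: str) -> ScreenResult:
    """Preview a prompt and decide whether to forward it."""
    text = prompt.strip().lower()
    if not text:
        return ScreenResult(False, "empty prompt")
    if len(text) > MAX_PROMPT_CHARS:
        return ScreenResult(False, "prompt too long")
    for term in BLOCKED_TERMS:
        if term in text:
            return ScreenResult(False, f"blocked term: {term}")
    return ScreenResult(True, "ok")
```

A caller would forward the prompt only when `screen_prompt(prompt).allowed` is true, and could log the `reason` for rejected prompts so a human reviewer can audit the decisions.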

Tom Temin So an agency then or an organization of any type could deploy, say, several models. And the front end could be almost like a traffic cop or an engine roundhouse, sending each prompt to the right track so that it gets to the appropriate application.

Cedric Sims Tom, sounds like you need a job at MITRE Corporation. You’re right on track, exactly. There are lots of opportunities to control the front end activity and then route those accordingly. But most of our research that we’re doing now, again, we mentioned the generative AI with respect to general prompting, getting text replies back. But we’re also looking at AI in its application in areas such as autonomous vehicles, where data regarding how vehicles are actually helping to augment driver experiences, and maybe also augment safety systems that exist, on our existing roadways and vehicle to vehicle communication. Those are unique ways. Those themselves are not your traditional prompts. The prompts are data that comes from the systems, like the input from the steering wheel or the accelerator, or maybe data about a vehicle entering a roadway. Those are sorts of things we’re also looking at to understand how we can make future systems safer and more reliable.
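
The "traffic cop" front end discussed in this exchange can be sketched as a small router that sends each prompt to one of several models. This is a minimal sketch under stated assumptions, not a description of any deployed system. The model names, the keyword lists, and the function `route_prompt` are hypothetical, and simple keyword matching stands in for whatever classifier a real front end would use.

```python
# Hypothetical model names mapped to keywords that suggest each domain.
ROUTES = {
    "cyber": ("malware", "phishing", "intrusion"),
    "health": ("patient", "treatment", "diagnosis"),
}
DEFAULT_MODEL = "general"  # fallback when no specialized model matches


def route_prompt(prompt: str) -> str:
    """Return the name of the model that should handle the prompt."""
    text = prompt.lower()
    # Check specialized models in the order they are listed in ROUTES.
    for model, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return model
    return DEFAULT_MODEL
```

With these assumed routes, a prompt mentioning phishing would go to the "cyber" model, while a request to draft a memo would fall through to the general-purpose model.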

Tom Temin So you’re saying we now have convergence of the Internet of Things and artificial intelligence?

Cedric Sims Exactly.

Tom Temin All in the cloud, of course, while we’re getting the modern trends.

Cedric Sims Well, the great thing about generative AI and the promise that’s there, you mentioned the cloud. One of the traditional ways that data science was utilized in the past was to put all the data in the cloud and then have the cloud do the processing and then give you responses back. The power of generative AI is that a lot of that computation and maybe that knowledge can actually be put at the edge. So it can be deployed within the vehicle or deployed within the medical device or deployed specifically within the network security system. And then from there, local compute can actually help drive smart decisions and outputs as a result of the capability.

Tom Temin Yes, because to get to the very practical level, if it is, say, a vehicular type of operation, you can’t have network latency.

Cedric Sims Exactly. You can’t have the car stop functioning because you’re driving in the middle of the desert in Nevada.

Tom Temin is host of the Federal Drive and has been providing insight on federal technology and management issues for more than 30 years.
