Deploying large language models will let agencies tap the potential of generative artificial intelligence, says Guidehouse’s Bassel Haidar. He shares a primer...
As more technologists and general users understand how large language models work, generative artificial intelligence will seem less like a risky black box and more like a powerful beam of light that government agencies can carefully focus, says Bassel Haidar.
An AI program built on an LLM algorithm such as GPT-4 “allows you to do many things. It allows you to act in a specific way. When you focus it on what you want to do, the LLM becomes an expert in that field,” said Haidar, director of artificial intelligence and machine learning at Guidehouse.
The focus comes both from the data an organization uses to train the model and from how it queries the model for results, Haidar said during Federal News Network's AI and Data Exchange. The publicly available versions of generative AI, including ChatGPT, have been trained on data from a wide variety of domains, to mimic understanding of any question put to them. But, he said, federal agencies and other organizations can shape an LLM by using narrower, domain-focused data.
Haidar said GPT-4 and other LLMs, which derive from a computational architecture that dates to 2017, contain a function called an attention mechanism. He likened it to a person trying to listen to 20 conversations at once at a dinner party, then switching to focus on a single conversation.
The attention mechanism “allows you to home in on the different parts of the conversation, as many as you need. It gives you contextual understanding of things that are relative in the sentences and where things are placed,” Haidar said.
What allows AI technology to speed up the process is that it sorts through and delivers findings in a parallel fashion, continually building on its understanding as it acquires more information. Products like ChatGPT, whose attention mechanisms operate in parallel, were trained on diverse data.
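The mechanism Haidar describes is, in the 2017 transformer architecture, scaled dot-product attention: every token scores its relevance to every other token, and those scores become weights applied in parallel. A minimal NumPy sketch, using made-up toy vectors purely for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weigh every value vector by how relevant its key is to each query,
    giving the 'contextual understanding' of where things are placed."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Softmax turns the scores into attention weights that sum to 1 per row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights      # context vectors, plus the weights

# Toy example: 3 tokens ("conversations at the dinner party"), 4-dim embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
context, weights = scaled_dot_product_attention(tokens, tokens, tokens)
# Each row of `weights` shows how much one token "listens" to the others.
```

Because every row of weights is computed at once with matrix operations, the model attends to all positions in parallel rather than one at a time, which is the speedup described above.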
Haidar said no one can eliminate bias from AI algorithmic outputs, but it’s possible to minimize it with a framework designed around what Guidehouse calls RIISE values: respect, innovation, integrity, stewardship and excellence. Government organizations can tailor their AI and data governance by defining what the RIISE values mean in context of their specific missions.
“For instance, at Guidehouse, under ‘respect,’ we said we want to look under ethical consideration and social impact,” Haidar said. “What does that mean? That means our AI should adhere to societal norms, respect human rights and avoid causing harm.”
He said Guidehouse mapped the five values over a list of considerations when developing its AI governance and ethics.
“We talk about accountability, legal compliance, transparency, robustness, reliability, privacy and data protection, safety and security, model bias, and sustainability,” Haidar said. An organization can include more considerations, but “if the process becomes too heavy, people tend to not follow it.”
For an AI system to produce a desired output, people need education in crafting the right input, the right query, Haidar said. That’s a branch of AI work known as prompt engineering.
“Really, prompt engineering is, how do I ask my large language model … to deliver the best result possible?” he said.
It might sound simple, but prompt engineering comes with certain tenets, Haidar said.
“You have to walk the LLM through what you are trying to do,” he said. Often a phrase or simple sentence won’t do it. “If you really want to create great results, you have to break it down and maybe have to have role playing back and forth” using the LLM iteratively.
Haidar used the example of writing a blog. Before just entering the topic into the prompt window, he said, a better approach would entail “talking” with the LLM, which would first ask about the topic and then continue asking the user about additional parameters in a dialogue fashion.
“Now, the LLM itself created the prompt,” Haidar said, “and when I say, ‘OK,’ it goes ahead and executes the prompt.”
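The dialogue-style workflow Haidar describes, where the model gathers parameters before a final prompt is executed, might be sketched as follows. This is a hypothetical helper with no real LLM API; the point is only that the detailed prompt is assembled from the back-and-forth answers rather than typed in one shot:

```python
def build_blog_prompt(answers: dict) -> str:
    """Assemble a detailed prompt from a question-and-answer dialogue.
    Each field would have been collected one question at a time."""
    return (
        "You are an expert writer. Draft a blog post.\n"
        f"Topic: {answers['topic']}\n"
        f"Audience: {answers['audience']}\n"
        f"Tone: {answers['tone']}\n"
        f"Length: about {answers['words']} words.\n"
        "Break the post into an intro, three sections and a conclusion."
    )

# Parameters the LLM's questions would elicit from the user, in order:
answers = {
    "topic": "AI governance",
    "audience": "agency CIOs",
    "tone": "practical",
    "words": 600,
}
prompt = build_blog_prompt(answers)  # executed only after the user says "OK"
```

The design choice mirrors Haidar's tenet: breaking the request down into explicit parameters tends to produce far better results than a single vague sentence.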
He also shared how generative AI could help government organizations provide speedier service to the public using a technique called retrieval-augmented generation. Essentially, RAG strengthens the accuracy of what LLMs put out by pulling, or retrieving, data from outside sources, Haidar said.
As an example, he talked through the case of someone wanting to find a specific form online from, say, the Small Business Administration. (Read about some other ways that Guidehouse’s Haidar expects organizations will tap into generative AI, “Quantifying the Potential of Generative AI.”)
“I don’t want to dig in for three hours to find the form that I need,” he said.
“It’ll be much easier if I have a chat-like interface.” The LLM will retrieve “all the websites, all the content, all the PDFs, and now I can ask it questions, and then it can generate the content and tell me where to find it.”
He cautioned that when using RAG, it’s important to ensure the model doesn’t pull in nonfederal data or data from sources outside of the context of, in this case, small business loans or assistance. Haidar said it’s wise to get confidence measures on LLM-generated content, so that if a response comes back with less than 90% confidence, a subject matter expert can check it.
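The retrieval-plus-confidence-check pattern can be sketched in a few lines. Everything here is a stand-in: the corpus is a hypothetical in-memory dictionary, word overlap substitutes for a real vector search, and the confidence score is deliberately crude — the point is the 90% threshold routing low-confidence answers to a human reviewer:

```python
def retrieve(query: str, corpus: dict, top_k: int = 1):
    """Score each document by word overlap with the query (a toy stand-in
    for vector search) and return the best matches with a rough confidence."""
    q_words = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(q_words & set(text.lower().split()))
        scored.append((overlap / max(len(q_words), 1), doc_id))
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical federal-only corpus, restricted to the small business context.
corpus = {
    "form-1919": "sba form 1919 borrower information for small business loans",
    "form-912": "sba form 912 statement of personal history",
}
confidence, doc_id = retrieve("small business loan borrower form", corpus)[0]
needs_review = confidence < 0.9  # below threshold: route to a subject matter expert
```

Restricting the corpus at retrieval time is what keeps the model from pulling in nonfederal or out-of-context sources, and the threshold check operationalizes the human-review safeguard.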
Haidar said generative AI success depends on a workforce with the skills to apply it wisely and realize it can’t replace people.
“I don’t believe that jobs are going to go away,” he said. “I think we’re going to have different types of jobs for sure. But that’s why upskilling is critical.”
In fact, AI can bring new capabilities to large numbers of people, Haidar said. He noted that during his career he has spent thousands of hours learning different aspects of computer programming. Now, thanks to generative AI, “English is the new programming language. Or whatever your natural language is. There’s nothing to fear. You try, you get a result, you refine it, you add to it.”
Success, he added, “is just getting away from that fear and allowing people to experiment.”
Copyright © 2024 Federal News Network. All rights reserved. This website is not intended for users located within the European Economic Area.
Tom Temin is host of the Federal Drive and has been providing insight on federal technology and management issues for more than 30 years.