Why red-teaming is crucial to the success of Biden’s executive order on AI

With the pace in which AI and generative AI has been rolled out to the public, red-teamers are playing catch-up, with their team facing a significant deficit. T...

On October 30, President Biden unveiled his highly anticipated executive order on artificial intelligence. AI has been one of the hottest topics across industries for the last year because of the far-reaching impacts of the technology – some ways that we have still yet to discover. Given the implications of AI on our society, the executive order is comprehensive and aims to ensure we’re maximizing the technology, while remaining safe and secure.   

One of the most critical components mentioned in the executive order related to our safety and security is AI red-teaming. In cybersecurity circles, “red-teaming” is the process whereby a team of professionals seeks to find vulnerabilities in a particular system or group of systems. They’re hired to find flaws in networks and applications before threat actors do, so issues can be resolved before damage is done. This is particularly important with AI because numerous organizations have rushed to implement it into their systems, and they may have unintentionally exposed themselves to new attack paths. These systems require testing, especially if they’re being utilized by government organizations or in critical infrastructure.  

The concept of red-teaming has been around for decades, first embraced by the military to test their defenses and uncover any weaknesses before adversaries did. In the world of AI and generative AI, red-teaming is cutting-edge. While there are some red-teaming techniques that will carry over from existing cybersecurity efforts, there are other types of testing that need to be implemented to comply with this executive order. The key to successful red-teaming is to simulate the kinds of attacks that you’ll see in real-world situations. Two prominent examples related to generative AI are prompt attacks and data poisoning.   

Prompt attacks

Prompt attacks involve injecting malicious instructions into prompts that control a large language model (LLM), which can lead to the LLM performing unintended actions. For example, earlier this year a university student used certain prompts to elicit confidential information, including the code name for an AI project a big technology company was working on, as well as some of the metadata that should’ve never been exposed. Since there are already several LLMs that are publicly available, red-teamers will need to make sure these types of scenarios are not happening on these platforms. Furthermore, they’ll need to do this with any LLMs that are in development, so they’re being released only after rigorous testing has occurred. The big challenge here is finding a new prompt or set of prompts that threat actors haven’t yet discovered and exploited. It’s a constant race against the clock.  

In addition to the prompt itself, red-teamers will need to examine backend data to ensure it can’t be compromised. Equally, an attack can happen via APIs, so those need to be tested and airtight as well.  

Data poisoning

The other major attack that red-teamers will need to test against is data poisoning. With data poisoning, threat actors try to manipulate the data that LLMs are trained on, thereby creating new biases, vulnerabilities for others to attack, and backdoors to compromise the data. Poisoned data can have severe impacts on the results that LLMs deliver because when they’re trained on poisoned data, they learn to associate patterns based on that information. For example, misleading or inaccurate information about a brand or a political figure can unfairly sway people’s decision making. In another scenario, someone may receive inaccurate medical information about how to treat a routine sickness or ailment, which can then lead to something more serious.  

As LLMs are relied on more frequently and are embraced as the modern-day search engine, it’s imperative that the data remains pure. To test against data poisoning, red-teamers will need to simulate a whole host of data poisoning attacks to uncover any vulnerabilities in the LLM’s training and deployment processes. If they’re able to manipulate the data in any way through those simulations, those are areas that will need to be fortified.   

Moving Forward

With the pace in which AI and generative AI has been rolled out to the public, red-teamers are playing catch-up, with their team facing a significant deficit. There’s a lot for them to address in the near term, and in many cases, they’ll need to learn on the fly.  

To be effective, red-teamers need specialized expertise, so you can expect to see a series of government trainings and briefings over the next several months. As corporate teams get up to speed and do the testing themselves, there’s going to be a huge onus on contractors to do this testing as well. This will require a lot of validation work to ensure these contractors have the proper skills and expertise to get the job done, so they’ll be asked to show their testing methodologies more than they’re used to.   

To accelerate things, testing methodologies should incorporate learnings from some of the leading firms, like Microsoft and Google, which are beginning to share more of their testing methods. The faster red-teams can shore up any holes and meet the new standards laid out in the executive order, the more we can maximize the benefits of AI on a national scale.   

Jonathan Trull is chief information security officer and head of solutions architecture at Qualys.

Copyright © 2024 Federal News Network. All rights reserved. This website is not intended for users located within the European Economic Area.

Related Stories

    (Getty Images/iStockphoto/metamorworks)AI EO

    Red Teams, watermarks and GPUs: The advantages and limitations of Biden’s AI executive order

    Read more