Army faces data overload but LLMs are not the answer

"Everybody who's acquiring AI from the commercial world — demand to see where the data came from. And don't stop until they tell you," said Stephen Riley.

Army leaders and soldiers are inundated with data; the sheer volume of information hinders their decision-making and causes analysis paralysis. But turning to ChatGPT-like tools to help commanders tackle the problem might not be the answer.

“Ninety percent of the time, don’t do it. It’s the easy button. But using [large language models] like ChatGPT or Gemini — that is boiling the ocean to make yourself a cup of coffee. You don’t have the compute resources to run effective LLMs down at the tactical edge,” Stephen Riley, who is part of the Army engineering team at Google, said during an Association of the U.S. Army event Tuesday.

The Army generates a vast amount of data due to its large number of personnel and extensive range of operations, making the service one of the largest AI users among the military branches. But having a lot of data does not mean Army leaders can get actionable insights from it.

“I say there’s too much damn data out there. We can’t overload our warfighters and our leaders with too much data,” said Young Bang, the principal deputy assistant secretary of the Army for acquisition, logistics and technology.

Google, for example, improved the quality of search results long before the advent of large language models, and the Army could apply similar methods to its own vast stores of data, said Riley.

One way the tech giant improved search results was by analyzing which results users clicked most often, a signal of which results were most useful to the most people.
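Google has not published its exact ranking formula, but the idea described here, ordering results by aggregate usefulness, can be sketched in a few lines of Python. All names and log data below are invented for illustration:

```python
from collections import Counter

# Illustrative click log: (query, result_id) pairs recorded whenever a user
# clicks a search result. The entries are made up for this sketch.
click_log = [
    ("resupply convoy", "fm-4-0"),
    ("resupply convoy", "fm-4-0"),
    ("resupply convoy", "atp-4-11"),
    ("river crossing", "atp-3-90.4"),
]

def rank_results(query, candidates, log):
    """Order candidate results by how often users clicked them for this query."""
    clicks = Counter(result for q, result in log if q == query)
    return sorted(candidates, key=lambda r: clicks[r], reverse=True)

# fm-4-0 has two recorded clicks for this query, atp-4-11 has one,
# so fm-4-0 ranks first.
print(rank_results("resupply convoy", ["atp-4-11", "fm-4-0"], click_log))
# → ['fm-4-0', 'atp-4-11']
```

The point of the sketch is that this signal is just counting: it needs no model training and negligible compute.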

Additionally, the company developed a knowledge graph that “represents widely accepted truths and relationships.” Grounding search results in that established knowledge requires far less computational power than an LLM.

“Now we’ve got two things working in tandem. We’ve got what’s been most useful to the most people and we’ve got what is actually a good result because it conforms with generally accepted truth. All of this doesn’t require LLMs. So how do we do this with the Army? Let’s start building a knowledge graph of things that are true for the Army,” said Riley.

“We don’t need to train a gigantic LLM with all of the ADPs and FMs and say, ‘All right, we’ve got a model.’ You could actually encode all of those ADPs, all the operations stuff, all the intel stuff — we could encode that into a knowledge graph, which requires infinitely less compute power. That’s something you could deploy forward on a pretty small box. I encourage everybody to look first at the old ways of doing things. They tend to be more efficient. You’ve got to think a little harder about how to implement them. But it’s a lot more efficient and it’s very doable.”
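Riley doesn’t specify an implementation, but at its simplest a knowledge graph of the kind he describes is a set of subject–relation–object triples that can be queried with plain lookups, no GPU required. A minimal sketch, with invented facts standing in for real doctrine:

```python
# Minimal triple-store knowledge graph. The facts are illustrative
# placeholders, not actual doctrine content.
triples = {
    ("ADP 3-0", "covers", "operations"),
    ("ADP 2-0", "covers", "intelligence"),
    ("operations", "includes_task", "movement and maneuver"),
}

def query(subject=None, relation=None, obj=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        (s, r, o) for (s, r, o) in triples
        if (subject is None or s == subject)
        and (relation is None or r == relation)
        and (obj is None or o == obj)
    ]

# What does ADP 3-0 cover?
print(query(subject="ADP 3-0", relation="covers"))
# → [('ADP 3-0', 'covers', 'operations')]
```

Because answering a query is set filtering rather than neural inference, a graph like this can run on modest hardware at the tactical edge.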

Bang said that while LLMs are useful for general purposes, pairing them with small language models trained on military terminology, jargon, cyber terms and other specialized language would provide better results for soldiers.

“Do you really need LLMs and SLMs at the edge? No. If you use that and overlay a knowledge graph, I think that’s a much better practical implementation of things. Because we can’t afford all the computing resources that we’re going to need to process all that or do the training on it or even the retraining or the inference at the edge,” said Bang.

But the concern is that malicious actors could overload existing datasets with misinformation, shifting what’s considered commonly accepted truth. Riley said that’s why it’s important to have humans in the loop: “We cannot abdicate human reasoning to the machines.”

“You could theoretically overload it and start shifting truth on a given axis to some degree. But as we index stuff, the data that we index is also run through the current knowledge graph. But we also have humans in the loop; we are watching what’s going on with the trends, with the shifting of the Overton window there,” said Riley.
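The safeguard Riley sketches, running newly indexed data through the current knowledge graph and flagging contradictions for a person to review, might look like this. The graph contents and incoming assertions are purely illustrative:

```python
# Existing "accepted truth": each (subject, relation) pair maps to one object.
accepted = {("paris", "capital_of"): "france"}

incoming = [
    ("paris", "capital_of", "france"),   # consistent with the graph
    ("paris", "capital_of", "germany"),  # contradicts the graph
]

def screen(assertions, graph):
    """Split incoming assertions into ones consistent with the graph and
    ones that contradict it and therefore need human review."""
    ok, review = [], []
    for s, r, o in assertions:
        # Unknown facts pass through; known facts must match the stored value.
        if graph.get((s, r), o) == o:
            ok.append((s, r, o))
        else:
            review.append((s, r, o))
    return ok, review

ok, review = screen(incoming, accepted)
print(f"{len(ok)} accepted, {len(review)} flagged for human review")
# → 1 accepted, 1 flagged for human review
```

The machine does the cheap filtering; the contested assertions go to a human, which is the division of labor Riley argues for.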

Poisoned datasets

When using AI datasets, particularly for training large language models, malicious actors don’t have to poison the whole dataset. Compromising even a small piece of a data source can introduce bad data that contaminates the overall training set. That’s why military services acquiring AI models and datasets from the commercial world should “demand to see where the data came from.”

“Google ain’t going to tell you. Demand it of us anyway. Microsoft ain’t going to tell you. Demand it anyway. We have already seen cases where companies building large LLMs have sourced data from other companies that say they have a bunch of data. And it turns out they sourced from other companies that were given some pretty bad stuff. Maybe not deliberate misinformation, but stuff that absolutely would not comply with our nation or Army values. In all cases, demand to see where that data came from. And don’t stop until they tell you,” said Riley.

“We’ve talked about this data bill of materials. Famously, after SolarWinds, people are asking for a software bill of materials. We must develop some kind of data bill of materials and make it a standard part of the acquisition of these AI systems. We’ve got to do it because we’re already seeing this problem whether you know it or not.”
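No standard data-bill-of-materials format exists yet. By analogy with a software bill of materials, a DBOM record might capture, per dataset, its supplier, its license and a content hash so a later consumer can verify the data was not swapped or tampered with. A hypothetical sketch, with every field name and value invented for illustration:

```python
import hashlib
import json

def dbom_entry(name, source, license_, content: bytes):
    """One hypothetical data-bill-of-materials record: who supplied the
    dataset, under what license, and a SHA-256 hash of the content so
    downstream consumers can verify it later."""
    return {
        "dataset": name,
        "source": source,
        "license": license_,
        "sha256": hashlib.sha256(content).hexdigest(),
    }

# Illustrative record for a made-up dataset.
entry = dbom_entry(
    "doctrine-corpus-v1",
    "vendor.example.com",
    "internal-use",
    b"...dataset bytes...",
)
print(json.dumps(entry, indent=2))
```

Verification is then a matter of rehashing the delivered data and comparing it against the recorded digest, the same check an SBOM enables for software components.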

Copyright © 2024 Federal News Network. All rights reserved. This website is not intended for users located within the European Economic Area.
