Securing AI in federal and defense missions: A multi-level approach

As the federal government accelerates artificial intelligence adoption under the national AI Action Plan, agencies are racing to bring AI into mission systems. The Defense Department, in particular, sees the potential of AI to help analysts manage overwhelming data volumes and maintain an advantage over adversaries.

Yet most AI projects never make it out of the lab — not because models are inadequate, but because the data foundations, traceability and governance around them are too weak. In mission environments, especially on-premises and air-gapped cloud regions, trustworthy AI is impossible without secure, transparent and well-governed data.

To deploy AI that reaches production and operates within classification, compliance and policy constraints, federal leaders must view AI security in layers.

Levels of security and governance

AI covers a wide variety of fields such as machine learning, robotics and computer vision. For this discussion, let’s focus on one of AI’s fastest-growing areas: natural language processing and generative AI used as decision-support tools.

Under the hood, these systems, based on large language models (LLMs), are complex “black boxes” trained on vast amounts of public data. On their own, they have no understanding of a specific mission, agency or theater of operations. To make them useful in government, teams typically combine a base model with proprietary mission data, often using retrieval-augmented generation (RAG), where relevant documents are retrieved and used as context for each answer.
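
For readers who want a concrete picture of the pattern, the sketch below shows RAG in plain Python: rank mission documents against the query, keep the top few, and fold them into the prompt as cited context. The keyword scoring, the Document class and the call_llm stub are illustrative assumptions standing in for whatever search index and in-enclave model an agency actually runs, not any specific product's API.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# The keyword scoring and the call_llm() stub stand in for a real
# search index and an agency's in-enclave model endpoint.

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Document], k: int = 3) -> list[Document]:
    """Rank documents by naive keyword overlap and keep the top k."""
    q_terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda d: len(q_terms & set(d.text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context_docs: list[Document]) -> str:
    """Fold the retrieved documents into the prompt as cited context."""
    context = "\n\n".join(f"[{d.doc_id}] {d.text}" for d in context_docs)
    return (
        "Answer using only the context below and cite document IDs.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def call_llm(prompt: str) -> str:
    """Placeholder for the hosted model call inside the enclave."""
    return "[model answer grounded in the retrieved context]"


def answer(query: str, corpus: list[Document]) -> str:
    return call_llm(build_prompt(query, retrieve(query, corpus)))
```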

That’s where the security and governance challenges begin.

Layer 1: Infrastructure — a familiar foundation

The good news is that the infrastructure layer for AI looks a lot like that of any other high-value system. Whether an agency is deploying a database, a web app or an AI service, the same authority to operate (ATO) processes, network isolation, security controls and continuous monitoring apply.

Layer 2: The challenge of securing the data that augments AI

The data layer is where AI security diverges most sharply from commercial use. In RAG systems, mission documents are retrieved as context for model queries. If retrieval doesn’t enforce classification and access controls, the system can generate results that cause security incidents.

Imagine a single AI system indexing documents at multiple classification levels. Deep in the retrieval layer, the system pulls a highly relevant document to augment the query, but one that is beyond the analyst’s access level. The analyst never sees the original document, only a neat, summarized answer that is also a data spill.

The next frontier for federal AI depends on granular, attribute-based access control.

Every document — and every vectorized chunk — must be tagged with classification, caveats, source system, compartments and existing access control lists. This is often addressed by building separate “bins” of classified data, but that approach leads to duplicated data, lost context and operational complexity. A safer and more scalable approach is a single semantic index with strong, attribute-based filtering.
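
As a rough illustration of what that single index looks like in practice, the sketch below attaches classification, compartment and access-control attributes to every chunk and filters on the requesting analyst’s attributes before anything reaches the model. The field names, the ordering of classification levels and the release checks are simplified assumptions, not a real marking scheme or a specific product’s access model.

```python
# Attribute-based filtering over a single semantic index.
# Every chunk carries its markings; retrieval drops anything the requesting
# analyst cannot see BEFORE it is passed to the model as context.
# Field names and level ordering are simplified assumptions.

from dataclasses import dataclass, field

CLASSIFICATION_ORDER = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}


@dataclass
class Chunk:
    chunk_id: str
    text: str
    classification: str
    compartments: set[str] = field(default_factory=set)
    acl: set[str] = field(default_factory=set)      # users or groups explicitly allowed
    source_system: str = ""


@dataclass
class Analyst:
    user_id: str
    clearance: str
    compartments: set[str] = field(default_factory=set)
    groups: set[str] = field(default_factory=set)


def is_releasable(chunk: Chunk, analyst: Analyst) -> bool:
    """Return True only if every attribute check passes; fail closed otherwise."""
    level = CLASSIFICATION_ORDER.get(chunk.classification)
    if level is None or level > CLASSIFICATION_ORDER.get(analyst.clearance, -1):
        return False                                # unknown marking or above clearance
    if not chunk.compartments <= analyst.compartments:
        return False                                # missing a required compartment
    if chunk.acl and not (chunk.acl & (analyst.groups | {analyst.user_id})):
        return False                                # not on the access control list
    return True


def filtered_retrieve(candidates: list[Chunk], analyst: Analyst, k: int = 5) -> list[Chunk]:
    """Keep only releasable chunks from the ranked candidates, then the top k."""
    return [c for c in candidates if is_releasable(c, analyst)][:k]
```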

Layer 3: Models and the AI supply chain

Agencies may use managed models, fine-tune their own, or import third-party or open-source models into air-gapped environments. In all cases, models should be treated as part of a software supply chain:

  • Keep models inside the enclave so prompts and outputs never cross uncontrolled boundaries.
  • Protect training pipelines from data poisoning, which can skew outputs or introduce hidden security risks.
  • Rigorously scan and test third-party models before use.

Without clear policy around how models are acquired, hosted, updated and retired, it’s easy for “one-off experiments” to become long-term risks.

The challenge at this level lies in the “parity gap” between commercial and government cloud regions. Commercial environments receive the latest AI services and their security enhancements much earlier. Until those capabilities are authorized and available in air-gapped regions, agencies may be forced to rely on older tools or build ad hoc workarounds.

Governance, logging and responsible AI

AI governance has to extend beyond the technical team. Policy, legal, compliance and mission leadership all have a stake in how AI is deployed.

Three themes matter most:

  1. Traceability and transparency. Analysts must be able to see which sources informed a result and verify the underlying documents.
  2. Deep logging and auditing. Each query should record who asked what, which model ran, what data was retrieved, and which filters were applied (a minimal record sketch follows this list).
  3. Alignment with emerging frameworks. DoD’s responsible AI principles and the National Institute of Standards and Technology’s AI risk guidance offer structure, but only if policy owners understand AI well enough to apply them — making education as critical as technology.
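
A per-query audit record does not need to be elaborate to be useful. The sketch below shows one plausible shape for such a record, written as an append-only JSON line; the field names and the log format are illustrative assumptions rather than a mandated schema.

```python
# One plausible shape for a per-query audit record, written as an
# append-only JSON line. Field names are illustrative, not a mandated schema.

import json
from datetime import datetime, timezone


def write_audit_record(log_path: str, *, user_id: str, query: str, model_id: str,
                       retrieved_chunk_ids: list[str], filters_applied: dict) -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,                            # who asked
        "query": query,                                # what was asked
        "model_id": model_id,                          # which model ran
        "retrieved_chunk_ids": retrieved_chunk_ids,    # what data was retrieved
        "filters_applied": filters_applied,            # which access filters were enforced
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


write_audit_record(
    "audit.log",
    user_id="analyst_042",
    query="recent logistics reporting",
    model_id="enclave-llm-v1",
    retrieved_chunk_ids=["doc-17#3", "doc-90#1"],
    filters_applied={"clearance": "SECRET", "compartments": ["EXAMPLE"]},
)
```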

Why so many pilots stall — and how to break through

Industry estimates suggest that up to 95% of AI projects never make it to full production. In federal environments, the stakes are higher, and the barriers are steeper. Common reasons include vague use cases, poor data curation, lack of evaluation to detect output drift, and assumptions that AI can simply be “dropped in.”

Data quality in air-gapped projects is also a factor. If a query is about “missiles” but the system is mostly indexed with documents about “tanks,” analysts can expect poor and often fabricated results, the so-called “AI hallucinations.” They won’t trust the tool, and the project will quietly die. AI cannot invent high-quality mission data where none exists.

There are no “quick wins” for AI in classified missions, but there are smart starting points:

  • Start with a focused decision-support problem.
  • Inventory and tag mission data.
  • Bring security and policy teams in early.
  • Establish an evaluation loop to test outputs (see the sketch after this list).
  • Design for traceability and explainability from day one.
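
The evaluation loop in particular can start small: a fixed “golden set” of mission questions with known source documents, re-run whenever the index or model changes. The sketch below assumes such a set exists and simply checks whether retrieval still surfaces the expected documents; the function and its inputs are illustrative, not a prescribed test harness.

```python
# Minimal golden-set evaluation loop: re-run fixed mission questions and
# check that retrieval still surfaces the expected source documents.
# `search` is any callable returning ranked documents with a doc_id field,
# for example the retrieve() sketch shown earlier with the corpus bound in.

def evaluate_retrieval(golden_set, search, k=5):
    """golden_set: list of (query, set_of_expected_doc_ids) pairs."""
    hits = 0
    for query, expected_ids in golden_set:
        retrieved_ids = {doc.doc_id for doc in search(query)[:k]}
        if expected_ids & retrieved_ids:
            hits += 1
        else:
            print(f"MISS: {query!r} surfaced none of {sorted(expected_ids)}")
    hit_rate = hits / len(golden_set) if golden_set else 0.0
    print(f"Golden-set hit rate at k={k}: {hit_rate:.0%}")
    return hit_rate
```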

Looking ahead

In the next three to five years, we can expect AI platforms, both commercial and government, to ship with stronger built-in security, richer monitoring and more robust audit features. Agent-based AI pipelines that can autonomously pre-filter queries and post-process answers, for example to enforce sentiment policies or redact personally identifiable information (PII), will become more common. Yet even as these security requirements and improvements accelerate, national security environments face a unique challenge: the consequences of failure are too high to rely on blind automation.
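
The pre-filter and post-process pattern itself requires no exotic tooling. The sketch below wraps a model call with a query screen and a simple redaction pass using standard-library regular expressions; the blocked-terms list, the patterns and the call_model hook are placeholder assumptions, not a production policy engine.

```python
# Simplified pre-filter / post-process wrapper around a model call.
# The blocked-terms list and regex patterns are placeholder assumptions,
# not a production redaction or policy engine.

import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
BLOCKED_QUERY_TERMS = {"exfiltrate"}                 # illustrative policy term only


def pre_filter(query: str) -> str | None:
    """Reject queries containing blocked terms before they reach the model."""
    if any(term in query.lower() for term in BLOCKED_QUERY_TERMS):
        return None
    return query


def post_process(answer: str) -> str:
    """Redact common PII patterns from the model's answer."""
    answer = SSN_PATTERN.sub("[REDACTED SSN]", answer)
    answer = EMAIL_PATTERN.sub("[REDACTED EMAIL]", answer)
    return answer


def guarded_answer(query: str, call_model) -> str:
    """Pre-screen the query, call the model, then scrub the answer."""
    screened = pre_filter(query)
    if screened is None:
        return "Query blocked by policy."
    return post_process(call_model(screened))
```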

Agencies that treat AI as a secure system — grounded in strong data governance, layered protections and educated leadership — will be the ones that move beyond pilots to real mission capability.

Ron Wilcom is the director of innovation for Clarity Business Solutions.
