AI-RAN and the trust gap: Why autonomous networks need more than intelligence

The telecommunications industry is at a critical turning point. For nearly a decade, the industry’s primary hope for disruption was Open Radio Access Network (Open RAN) architecture. The promise was simple: break down the monolithic, tightly integrated hardware-software stacks of legacy vendors and replace them with a democratic ecosystem of interoperable, software-defined components.

But the real-world execution of Open RAN has been a masterclass in complexity, plagued by system integration bottlenecks and what many in the industry call “vendor lock-in 2.0”. Major legacy players successfully co-opted the movement, embedding proprietary automation and orchestration layers that often left operators managing the same old dependencies under a new, more complicated name.

From the ashes of this stagnation, a new paradigm is rising: Artificial Intelligence-Radio Access Network (AI-RAN). Rather than treating intelligence as an optional software layer bolted onto static cellular protocols, AI-RAN natively embeds machine learning and accelerated computing into the very physical and architectural design of the network.

The momentum is real. Founded in early 2024, the AI-RAN Alliance grew to 132 global members by early 2026, welcoming tier-one operators and hardware giants like Qualcomm, SK Telecom and Vodafone to its board.

Market forecasts reflect this massive shift. According to data from Dell’Oro Group, AI-RAN is projected to account for roughly a third of the total RAN market, reaching a valuation of US$10 billion by 2029.

However, as Stefan Pongratz, vice president at Dell’Oro Group, cautions, the near-term priorities are focused on efficiency gains – such as improving user experience, cutting power consumption and driving automation – rather than immediate, massive new revenue streams. In other words, in the modern network economy, intelligence beats openness, but only if operators can deploy it without losing control of their systems.

The three dimensions of the AI-RAN architectural blueprint

At its second anniversary, coinciding with Mobile World Congress (MWC) 2026, the AI-RAN Alliance published four foundational blueprints and technical guidelines to establish the architectural roadmap for this integration. The alliance structures AI-RAN across three core dimensions:

AI-for-RAN: Using machine learning to optimise the physical layer (Layer 1) and radio resource management functions. This replaces traditional, hand-coded algorithms with neural networks that can handle channel estimation, beamforming and dynamic spectrum sharing in microsecond loops.
AI-on-RAN: Reimagining the cellular base station as a distributed edge computing node. By deploying graphics processing units (GPUs) at the cell site, operators can host low-latency applications – like physical AI, computer vision and local generative AI – directly at the network edge.
AI-and-RAN: Achieving a shared infrastructure model where telecoms and external AI workloads run concurrently on a unified, accelerated cloud platform, maximising hardware utilisation and return on investment (ROI).

Blueprint publication	Primary technical scope	Operational objectives	Core architectural alignment
AI-RAN reference architecture	Defines high-level AI-native RAN design, mapping out functional domains and primary physical and logical components.	Establishes standardised interfaces and functional boundaries to ensure multi-vendor co-existence.	Links physical Layer 1 processing with virtualised container execution layers.
AI/ML techniques for RAN performance	Summarises the state-of-the-art neural networks, reinforcement learning models, and training methodologies for radio frequency (RF) optimisation.	Improves spectral, operational and energy efficiency across both link-level and network-wide domains.	Aligns with 3GPP Releases 18–20 and emerging 6G air interface specifications.
Platform and infrastructure orchestration	Outlines management and orchestration (MANO) strategies for balancing concurrent, heterogeneous AI and RAN workloads.	Dynamically optimises resource sharing and GPU utilisation without compromising carrier-grade RAN performance.	Integrates with Kubernetes, Red Hat OpenShift and O-RAN Service Management and Orchestration (SMO).
AI-on-RAN monetisation framework	Details business models and technical interfaces required to offer “AI-as-a-Service” (AIaaS) at the cellular edge.	Enables operators to monetise spare base station processing capacity by processing local enterprise AI queries.	Leverages standardised open APIs, including TM Forum Open APIs and CAMARA interfaces.

The autonomous trust gap

As cellular networks transition toward Level 4 (“High Autonomy”) and Level 5 (“Full Autonomy”) under the TM Forum framework, the operational model shifts from human-assisted automation to fully closed-loop, self-evolving management. At Level 4, the network makes real-time decisions based on high-level goals (intents) set by human operators. At Level 5, the network achieves complete self-governance, adapting dynamically to unforeseen conditions without manual intervention.

The financial payoff for communications service providers (CSPs) is massive. TM Forum research suggests that successfully transitioning to Level 4/5 autonomy can result in up to a 55% reduction in operations and maintenance costs, a 71% rise in customer satisfaction, and a 21% increase in energy savings. Reaching this maturity unlocks the agility to transform the customer experience with real-time personalisation, zero-touch provisioning and self-healing systems.

But this “humans-further-from-the-loop” paradigm exposes a critical, unowned security vulnerability. In a traditional network, automation runs on deterministic, human-auditable scripts. In an autonomous network powered by agentic AI, trust has historically been transitive and informal.

According to risk analysts at EY, the single greatest threat facing telecoms operators today is “misjudging evolving needs in privacy, security and trust,” a challenge accelerated by the rapid integration of AI. In multi-agent autonomous chains, this trust gap manifests in three specific failure modes:

1. Transitive trust and the chain of indirection

In current Agent-to-Agent (A2A) coordination protocols, downstream agents blindly trust that upstream agents have accurately translated the original operator’s intent. As a task is delegated across multiple hops – from a human operator to an orchestrator, to a planning agent, and finally to a tool server executing API calls – the chain of custody degrades. If a single intermediate agent is compromised or fed manipulated data, it can inject malicious instructions. Because trust is transitive, downstream systems will execute these commands under structurally valid but contextually unauthorised security credentials.

2. Intent drift and cryptographic intent binding

Without cryptographic mechanisms linking a human operator’s original signed policy to the micro-transactions executed by downstream agents, autonomous networks are vulnerable to intent drift. This occurs when the actions of autonomous AI agents slowly veer away from what the operator originally intended.

To prevent this, operators require cryptographic intent-binding protocols. When a human operator sets a network goal (such as “optimise coverage”), the system attaches a secure, unchangeable digital signature to that instruction. Every AI agent downstream must verify this signature before executing any action, ensuring they never drift from the human’s original plan.

Without this binding, an operator might command the system to “optimise cell coverage in Sector A.” If a downstream routing agent encounters manipulated telemetry, it may reformulate its strategy to shut down neighboring cells entirely or exfiltrate private diagnostic logs to an external server. The downstream system validates the API token, but the action itself represents a severe breach of the operator’s original intent.
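
The binding check described above can be sketched in a few lines of Python. This is a minimal illustration, not a production protocol: it uses a shared-key HMAC from the standard library as a stand-in for the digital signature, and the key, the intent fields (goal, sector, allowed_actions) and the execute function are all hypothetical names introduced for the example. A real deployment would more plausibly use asymmetric signatures so that downstream agents can verify an intent without being able to forge one.

```python
import hashlib
import hmac
import json

# Hypothetical shared key, for illustration only.
OPERATOR_KEY = b"demo-operator-key"


def sign_intent(intent: dict, key: bytes = OPERATOR_KEY) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the intent."""
    payload = json.dumps(intent, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_intent(intent: dict, tag: str, key: bytes = OPERATOR_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_intent(intent, key), tag)


def execute(action: str, intent: dict, tag: str) -> str:
    """Hypothetical downstream agent: act only inside a verified intent."""
    if not verify_intent(intent, tag):
        return "rejected: intent signature invalid"
    if action not in intent["allowed_actions"]:
        return "rejected: action outside signed intent"
    return f"executed: {action}"


intent = {
    "goal": "optimise cell coverage",
    "sector": "A",
    "allowed_actions": ["adjust_tilt", "adjust_power"],
}
tag = sign_intent(intent)

print(execute("adjust_tilt", intent, tag))             # executed: adjust_tilt
print(execute("export_diagnostic_logs", intent, tag))  # rejected: action outside signed intent

# An intermediate agent rewrites the intent but cannot produce a matching tag.
tampered = dict(intent, allowed_actions=["export_diagnostic_logs"])
print(execute("export_diagnostic_logs", tampered, tag))  # rejected: intent signature invalid
```

In this sketch a valid API token alone is never enough: the log-exfiltration request fails whether the agent leaves the signed intent untouched or tries to rewrite it.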

3. Scope expansion

Scope monotonicity is a key safety property: when a complex task is broken down and delegated, the operational permissions of any child task must be a subset of its parent’s scope, never broader. In other words, authority can shrink as tasks are delegated, but it can never grow.

Because current protocols lack native mechanisms to enforce this restriction, subordinate agents can generate child tasks with broader permissions than their parents.

For instance, a parent task authorised only to “analyse packet drop statistics” could spawn a child task that requests administrative write-access to physical routing tables under the guise of running a diagnostic probe. This unauthorised privilege escalation happens because the protocol only evaluates the identity of the invoking agent, rather than enforcing strict boundary limits across the delegation chain.
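
A minimal sketch of how a delegation layer might enforce scope monotonicity is shown below. The Task type, the delegate function and the permission strings are hypothetical; the only assumption carried over from the text is that a child’s permissions must be a subset of its parent’s.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    scope: frozenset  # permissions this task may exercise


def delegate(parent: Task, name: str, requested: set) -> Task:
    """Create a child task, refusing any permission the parent does not hold."""
    extra = frozenset(requested) - parent.scope
    if extra:
        raise PermissionError(
            f"{name!r} requests permissions outside parent scope: {sorted(extra)}"
        )
    return Task(name, frozenset(requested))


parent = Task(
    "analyse packet drop statistics",
    frozenset({"read:packet_stats", "read:cell_kpis"}),
)

# Narrowing the scope is allowed.
child = delegate(parent, "summarise drops", {"read:packet_stats"})
print(sorted(child.scope))  # ['read:packet_stats']

# Widening the scope is refused, whatever the stated purpose of the child task.
try:
    delegate(parent, "diagnostic probe", {"read:packet_stats", "write:routing_tables"})
except PermissionError as err:
    print(err)
```

Because every child is checked against its immediate parent, the subset relation holds transitively along the whole delegation chain.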

Bridging the silos: TM Forum meets O-RAN architecture

In simple terms, the RAN Intelligent Controller (RIC) acts as the AI control layer for modern radio networks. To bridge the trust gap, operators need a standardised approach that unifies operations across silos, aligns intelligence across domains, and builds automation that respects the complexity of existing networks. This is achieved by combining the TM Forum Autonomous Networks (AN) reference architecture with the O-RAN Alliance’s RIC framework.

To make this operational, forward-thinking operators are deploying a centralised AI control tower. Functioning as an enterprise control plane, the AI control tower coordinates all active autonomous processes. It keeps a single, anchored inventory of all active AI assets (models, weights, training data and execution logs) linked directly to business services and network domains within a configuration management database (CMDB).

This allows operators to bind business intents to real-time execution parameters, enforcing strict compliance guardrails aligned with global risk frameworks like the NIST AI Risk Management Framework (RMF) and the EU AI Act.

At the radio network layer, this intelligence is executed via the RIC, which is split into two operational brains: the Non-Real-Time (Non-RT) RIC (managing long-term, big-picture policies taking one second or more using applications called rApps) and the Near-Real-Time (Near-RT) RIC (making rapid, real-time adjustments under one second using applications called xApps).

Because these applications are often developed by a heterogeneous mix of third-party software providers, they operate as distinct black boxes competing for the same physical network resources. This competition introduces severe operational conflicts, which the O-RAN specifications categorise into three primary vectors:

Direct conflicts: Occur when multiple apps attempt to change the exact same parameter at the same time. For example, a traffic steering xApp tries to increase a cell’s transmission power to boost coverage, while an energy-saving xApp simultaneously requests a power reduction. Without mediation, the base station undergoes rapid, unstable parameter oscillations.
Indirect conflicts: Occur when apps modify different parameters that have coupled dependencies on a shared target metric. For instance, one xApp adjusts antenna tilts to optimise load balancing, while another alters handover parameters. Both actions alter the physical boundary of the cell, potentially causing dropped sessions.
Implicit conflicts: Occur when apps optimise entirely different, unaligned key performance indicators (KPIs) while sharing a common resource pool. For example, a physical layer optimisation xApp may attempt to maximise spectral efficiency by allocating all available physical resource blocks (PRBs) – which are the basic blocks of wireless spectrum and bandwidth allocated to mobile users – to high-throughput video streaming users. Simultaneously, a security compliance xApp may require a slice of those same PRBs to execute dynamic threat monitoring. Because the physical medium has a finite capacity, the independent actions of these models inadvertently starve critical systems, degrading overall quality of service (QoS).
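
Of the three categories, direct conflicts are the simplest to detect mechanically. The sketch below groups pending change requests by cell and parameter and flags any target touched by more than one app in the same control window. The request format, app names and parameter names are hypothetical; indirect and implicit conflicts would need a model of parameter and KPI dependencies, which this example does not attempt.

```python
from collections import defaultdict


def find_direct_conflicts(requests):
    """Return {(cell, parameter): [apps]} for targets changed by more than one app.

    `requests` is a list of (app, cell, parameter, delta) tuples collected
    within one control window (a hypothetical format for this sketch).
    """
    by_target = defaultdict(list)
    for app, cell, parameter, _delta in requests:
        by_target[(cell, parameter)].append(app)
    return {
        target: apps
        for target, apps in by_target.items()
        if len(set(apps)) > 1
    }


requests = [
    ("traffic_steering", "cell-17", "tx_power_dbm", +3.0),
    ("energy_saving", "cell-17", "tx_power_dbm", -2.0),
    ("load_balancer", "cell-17", "antenna_tilt_deg", +1.0),
]

print(find_direct_conflicts(requests))
# {('cell-17', 'tx_power_dbm'): ['traffic_steering', 'energy_saving']}
```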

To resolve these conflicts, several advanced mitigation architectures have been proposed:

GRACE (Graph Convolutional Networks for Conflict Detection): GRACE maps the complex, non-linear dependencies between active apps, controlled physical parameters and network metrics. By modeling the network as a graph, GRACE captures hidden dependencies to execute real-time root-cause analysis. It detects these hidden conflicts with an impressive success rate exceeding 98%.
ORCA (O-RAN Resource Conflict Analysis): ORCA leverages specialised machine learning models to predict the downstream performance impacts of parameter modifications. In complex resource conflict scenarios, ORCA reduces calculation and estimation errors by over 60% compared to traditional, ungrounded LLM predictions.
PACIFISTA: A predictive, multi-objective optimisation framework designed to evaluate “what-if” scenarios. PACIFISTA establishes a dynamic negotiation interface between competing applications, finding an optimal configuration that minimises degradation across all active services.

Furthermore, to ensure real-time safety, operators are deploying lightweight trust-verification gates within the Near-RT RIC. These gates continuously calculate a unified, real-time trust score that dynamically fuses observed network deviations, model telemetry drift (where machine learning models slowly degrade and lose accuracy over time), and explainability attributions. If this trust rating falls below a predefined threshold, an automated rollback controller immediately intervenes. It disables the untrusted app’s control interface, falls back to a deterministic, known-good baseline policy, and generates an explainable incident report for human review.
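
The gate logic can be illustrated with a short sketch. Everything numeric here is an assumption made for the example: the three signals are taken to be normalised to the range 0 to 1 (higher meaning worse), the weights and the 0.7 threshold are arbitrary, and a simple weighted sum stands in for whatever fusion method a real gate would use.

```python
from dataclasses import dataclass


@dataclass
class TrustSignals:
    network_deviation: float  # 0 = behaving as expected, 1 = maximally deviant
    model_drift: float        # 0 = no telemetry drift, 1 = severe drift
    attribution_shift: float  # 0 = explanations stable, 1 = explanations changed sharply


# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"network_deviation": 0.5, "model_drift": 0.3, "attribution_shift": 0.2}
THRESHOLD = 0.7


def trust_score(s: TrustSignals) -> float:
    """Fuse the three signals into one score in [0, 1]; higher means more trusted."""
    penalty = (
        WEIGHTS["network_deviation"] * s.network_deviation
        + WEIGHTS["model_drift"] * s.model_drift
        + WEIGHTS["attribution_shift"] * s.attribution_shift
    )
    return 1.0 - penalty


def gate(app: str, s: TrustSignals) -> dict:
    """Decide whether the app keeps control or the baseline policy takes over."""
    score = trust_score(s)
    trusted = score >= THRESHOLD
    return {
        "app": app,
        "score": round(score, 3),
        "control_enabled": trusted,
        "active_policy": app if trusted else "known_good_baseline",
        "incident_report": None if trusted else f"{app}: trust score below {THRESHOLD}",
    }


print(gate("traffic_steering_xapp", TrustSignals(0.1, 0.1, 0.1)))
print(gate("traffic_steering_xapp", TrustSignals(0.8, 0.5, 0.2)))
```

The first call leaves the app in control; the second falls below the threshold, so the sketch disables the app’s control interface, switches to the baseline policy and records a reason for human review.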

When configured properly, these intelligent apps can drive a spectral efficiency gain of up to 10% and boost user throughput by 10% to 40% depending on the specific optimisation model deployed.

The geopolitics of silicon: The “CUDA Moat” vs. silicon agnosticism

The silicon tug-of-war in 6G

As the industry prepares for 6G, the technical frontier is shifting from open interfaces to the underlying silicon and software compilation stacks. This transition has triggered an intense debate between open-source initiatives and proprietary computing ecosystems.

At the center of this debate is the partnership between the Linux Foundation and the US Department of Defense’s FutureG project on the “OCUDU” initiative. The goal of OCUDU is to build an open-source reference implementation of the 6G baseband stack, allowing academic institutions, startups and smaller vendors to inject custom machine-learning code directly into the physical layer of the network. Proponents argue that open-sourcing the baseband stack is necessary to bypass the proprietary licensing and fair, reasonable and non-discriminatory (FRAND) patent moats maintained by traditional infrastructure vendors, which have historically limited market entry.

Nvidia’s proprietary “CUDA Moat”

Nvidia has aligned with this open-source push by offering “Aerial,” its software-defined RAN reference platform. Aerial allows developers to run software-defined processing on GPUs, enabling the integration of neural network-based waveforms directly into the baseband. However, this strategy highlights a critical architectural lock-in risk known as the “CUDA Moat”.

While Aerial is open-source at the application layer, its execution remains highly dependent on Nvidia’s proprietary CUDA software stack – the specialised software environment that translates AI code into instructions that Nvidia’s graphics chips can execute. L1 baseband code optimised for parallel execution on CUDA-accelerated GPUs cannot be easily compiled or run on alternative hardware platforms such as AMD graphics processors, Intel x86 CPUs, or energy-efficient Arm processors.

This software dependency creates a key point of strategic differentiation between the major equipment manufacturers:

Nokia’s GPU-first bet

Nokia has pursued a native GPU implementation, developing baseband software that offloads all computationally intensive functions – including forward error correction (FEC), the heavy-duty math used to detect and fix data transmission errors – directly to Nvidia GPUs. Higher-layer software is executed on paired Arm-based Grace CPUs or standard x86 processors.

While this parallel processing model provides high spectral efficiency and throughput, it creates an architectural dependency on Nvidia’s hardware and software release cycles, potentially limiting Nokia’s ability to easily migrate its baseband stack to alternative silicon.

Ericsson’s silicon-agnostic defense

Ericsson has adopted a hardware-agnostic strategy, keeping its software stack decoupled from proprietary GPU acceleration where possible. Ericsson’s software, originally written for Intel architecture, was adapted to run on Nvidia’s Grace CPUs. This migration was made possible by Arm’s integration of vector processing (via the SVE2 instruction set), which provides the mathematical processing required for standard operations.

Under Ericsson’s model, standard code is processed on the CPU, and only highly resource-intensive functions (like FEC) are offloaded to an external GPU accelerator. This approach preserves long-term sourcing flexibility and silicon independence, but it can introduce integration and compilation overheads compared to a fully native, GPU-optimised stack.

Crucially, Ericsson points out that standard applications running on general-purpose CPUs can already deliver spectral efficiency gains of up to 10% and user throughput increases of 10% to 40%, all without necessitating expensive, power-hungry edge GPUs that risk inflating the total cost of ownership (TCO).

These divergent philosophies are forcing operators to evaluate the long-term TCO and operational risks of GPU-accelerated infrastructure.

Operational dimension	GPU-accelerated L1 architecture (Nokia native)	Silicon-agnostic CPU-first architecture (Ericsson)	Core strategic implications for operators
Silicon and vendor lock-in risk	High. Baseband software is tied to Nvidia’s proprietary CUDA compilation ecosystem.	Low. Software remains compiled for general-purpose processors (Arm/x86).	Operators must balance immediate peak performance gains against the risk of hardware supplier lock-in.
Capital and operational expenditure (CAPEX/OPEX)	High initial CAPEX for accelerators. Potential OPEX exposure to software licensing models.	Moderate CAPEX. Standardised CPU upgrades align with traditional IT procurement cycles.	Orange CTO Bruno Zerbib warns that GPU upgrades could lead to “subscription-style” pricing for spectral efficiency updates.
Energy consumption and power efficiency	High peak power demand, though mitigated by dynamic co-scheduling of AI/RAN workloads.	Highly predictable, low baseline power demand.	Co-scheduling AI and RAN workloads is essential to prevent GPU idle-state power waste at the cell edge.
Physical security and asset exposure	High. Placing valuable GPU nodes at distributed cell sites increases theft risk.	Low. Standard baseband units are less valuable to commodity hardware thieves.	Operators must invest in physical site hardening and active physical security for accelerated edge nodes.

Laurent Leboucher, the CTO of Orange, has noted the strategic challenge of vendor lock-in: “The risk of lock-in exists, definitely. We really want to stay open. At the same time, we know that benefiting from a very, very large-scale general-purpose architecture should improve the TCO [total cost of ownership]. At the end of the day, it will be a trade-off. But we would welcome an architecture where we have the capacity at some point to decide to swap if we need to swap.”

Similarly, Orange CTO Bruno Zerbib has raised questions regarding the TCO and physical security of GPU-based architectures. Zerbib has warned that placing highly valuable, enterprise-grade GPUs in distributed, unmonitored base station enclosures – which have historically been targets for copper theft – creates a significant physical security risk: “Do I want to make my baseband units too valuable and then have people stealing them because we have GPUs inside? If you are now building an architecture where you have a bunch of very expensive small data centers in some areas that have not traditionally been super-protected, I think that’s actually an issue.”

Furthermore, he has questioned whether the promise of software-defined spectral updates will trap operators in a rent-seeking software model: “What is the cost of the GPU? What is the TCO? Are the GPUs going to potentially become a subscription game, where I’m now going to have to pay for that software update to get more spectral efficiency?”

Empirical proof points: Case studies and technical validation

Despite the industry discussion surrounding AI-RAN, as of mid-2026, no mobile operator has launched a wide-scale commercial deployment. However, several controlled outdoor trials and consortium projects have validated the theoretical performance, energy and monetisation models.

The Fujisawa City field trial (SoftBank and Nvidia)

The primary empirical reference for AI-RAN is the outdoor 5G field trial conducted by SoftBank and Nvidia in Fujisawa City, Kanagawa, Japan. The trial utilised an end-to-end software-defined virtualised 5G RAN stack integrated with a 5G core, running on a single rack of Nvidia GH200 Grace Hopper Superchips, coupled with BlueField-3 DPUs and Spectrum-X network interfaces.

The virtualisation layer was managed by Red Hat’s OpenShift Container Platform, running SoftBank-optimised software built with Nvidia Aerial libraries, alongside Layer 2 software developed by Fujitsu.

Operating 20 outdoor 5G cells, the system achieved a peak downlink throughput of 1.3 Gbps under ideal conditions and 816 Mbps with carrier-grade outdoor availability.

The trial successfully validated multi-tenancy. While traditional, single-purpose base stations average a capacity utilisation of approximately 33% (as they are sized for peak traffic loads), the AI-RAN stack achieved up to 100% capacity utilisation – a threefold increase.

To achieve this, SoftBank’s orchestrator leveraged Multi-Instance GPU (MIG) technology to dynamically partition the physical GPU cores. During low-traffic periods, compute slices were allocated to low-priority edge AI workloads, such as local visual analysis or factory robotics processing. During traffic surges, the orchestrator immediately reclaimed those GPU slices, re-allocating them to high-priority virtual baseband functions to maintain strict communication KPIs.
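
The reclaim behaviour can be sketched as a simple priority rule: RAN demand is served first, and edge AI receives only what is left. This is an illustration of the scheduling idea, not SoftBank’s orchestrator; it does not call any MIG API, and the slice count of seven and the demand figures are assumptions for the example.

```python
def partition(total_slices: int, ran_demand: int) -> dict:
    """Split GPU slices between RAN (high priority) and edge AI (low priority)."""
    if total_slices < 0 or ran_demand < 0:
        raise ValueError("slice counts must be non-negative")
    ran = min(ran_demand, total_slices)  # RAN is always served first
    return {"ran": ran, "ai": total_slices - ran}


TOTAL_SLICES = 7  # assumed number of GPU partitions on one device

# Hypothetical RAN demand over a day, in slices: quiet night, busy evening.
for period, demand in [("night", 1), ("morning", 3), ("evening surge", 7)]:
    split = partition(TOTAL_SLICES, demand)
    print(f"{period}: RAN={split['ran']} AI={split['ai']}")
# night: RAN=1 AI=6
# morning: RAN=3 AI=4
# evening surge: RAN=7 AI=0
```

Under this rule every slice is assigned in every period, which is the mechanism behind utilisation rising from roughly one third to the whole device.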

To manage these workloads across geographically distributed locations, SoftBank developed and open-sourced its AITRAS Orchestrator and core Dynamic Scoring Framework, contributing them to the Linux Foundation’s Open Cluster Management (OCM) project. The Dynamic Scoring Framework centrally aggregates and analyses cluster-level metrics – such as real-time power consumption, physical resource strain, and application performance – to determine the optimal placement of both network functions and third-party AI workloads.
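
As a rough illustration of score-based placement, the sketch below ranks clusters by a weighted combination of the three metric families named above. It is not the AITRAS implementation: the metric names, the normalisation to the range 0 to 1, the weights and the cluster names are all hypothetical.

```python
# Hypothetical weights: reward application performance, penalise power and strain.
WEIGHTS = {"performance": 0.4, "power": 0.3, "strain": 0.3}


def score(metrics: dict) -> float:
    """Higher is better. All metrics are assumed to be normalised to [0, 1]."""
    return (
        WEIGHTS["performance"] * metrics["performance"]
        - WEIGHTS["power"] * metrics["power"]
        - WEIGHTS["strain"] * metrics["strain"]
    )


def best_cluster(clusters: dict) -> str:
    """Return the name of the highest-scoring cluster."""
    return max(clusters, key=lambda name: score(clusters[name]))


clusters = {
    "cluster-a": {"performance": 0.9, "power": 0.8, "strain": 0.7},
    "cluster-b": {"performance": 0.8, "power": 0.3, "strain": 0.4},
}

print(best_cluster(clusters))  # cluster-b
```

Here the slightly slower but far less loaded cluster wins, which is the kind of trade-off a central scoring step is meant to make visible.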

Importantly, AITRAS integrates with a GitOps pipeline to maintain human oversight within this autonomous environment. This is a software management practice where any proposed changes to the network are written as code updates and pushed to a secure repository.

When an AI agent generates an optimal cluster configuration change based on the scoring results, the change is not immediately applied directly to the production clusters. Instead, the agent pushes the proposed policy changes to a Git repository, creating a “pull request”. This forces a human network operator to review, validate and sign off on the proposed configuration changes before they are deployed. The outcome of this review is logged and used to retrain the agent’s model over time.
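
The review gate can be modelled in a few lines. This is an in-memory sketch of the workflow only: it does not talk to Git or to any hosting service, and the Pipeline and PullRequest types, the agent and reviewer names, and the configuration keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    author: str
    change: dict
    status: str = "open"


@dataclass
class Pipeline:
    production: dict = field(default_factory=dict)  # currently deployed configuration
    feedback: list = field(default_factory=list)    # review outcomes kept for retraining

    def propose(self, agent: str, change: dict) -> PullRequest:
        """The agent may only open a pull request; production is untouched."""
        return PullRequest(agent, change)

    def review(self, pr: PullRequest, reviewer: str, approve: bool) -> str:
        """A human decision is the only path by which a change reaches production."""
        if pr.status != "open":
            raise ValueError("pull request has already been reviewed")
        pr.status = "approved" if approve else "rejected"
        self.feedback.append((pr.author, pr.change, reviewer, pr.status))
        if approve:
            self.production.update(pr.change)
        return pr.status


pipeline = Pipeline()
pr = pipeline.propose("placement-agent", {"edge-ai-workload": "cluster-b"})
print(pipeline.production)  # {}  (nothing is deployed before review)

pipeline.review(pr, reviewer="network-engineer", approve=True)
print(pipeline.production)  # {'edge-ai-workload': 'cluster-b'}
```

Rejected proposals are logged in the same feedback list, so both outcomes remain available as an audit trail and as training signal.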

Additional live field trials and validation initiatives

T-Mobile US (Seattle Innovation Center): Nokia, T-Mobile and Nvidia conducted over-the-air (OTA) testing of GPU-accelerated AI-RAN at T-Mobile’s Seattle headquarters. The trial successfully demonstrated that latency-sensitive L1 processing could coexist with heavy generative AI inference on the same physical chip without degrading wireless QoS.
Indosat Ooredoo Hutchison (IOH): In collaboration with Nokia, IOH executed Southeast Asia’s first AI-RAN-powered Layer 3 5G call in a live operator environment in Surabaya. The trial validated the feasibility of concurrent AI-and-RAN processing in an active, real-world network.
TM Forum Catalyst: Project Aura (AI + RAN): This collaborative project – bringing together StarHub, Celcom Digi, Etisalat UAE, Red Hat, SynaXG and Orex Sai – built a standardised blueprint for edge-based monetisation of network intelligence. Project Aura demonstrated the execution of co-scheduled virtualised RAN functions alongside high-value enterprise applications.
TM Forum Catalyst: Trusted AI for autonomous management (C26.0.969): This project addressed the operational gap between physical asset records and digital orchestration systems. By deploying visual inspection AI and standardised “digital employees” within an ODA-aligned architecture, the Catalyst achieved physical-to-digital asset record consistency exceeding 99% and resource inventory accuracy above 95%, establishing the foundational data integrity required to safely scale closed-loop automation.

Strategic imperatives: Trust as the agile telco’s ultimate moat

As telecommunications operators transition toward highly autonomous AI-native networks, the primary competitive differentiator will not be raw computational capacity or proprietary algorithm optimisation.

McKinsey research into agile telcos reveals that operators who commit to agile transformations are three times more likely to be top-quartile performers – but only if they commit to structural, cultural and governance changes. In an era of rapid technological change and complex multi-vendor integration, the ultimate competitive advantage for agile telcos will belong to those who establish robust, open-standard trust frameworks.

To build a sustainable operational moat, telcos must execute on four strategic areas:

1. Cryptographic intent binding and scope monotonicity

Operators must move away from transitive, informal trust models in their automation layers. Transitioning to Level 4 and Level 5 autonomy should be contingent on the implementation of cryptographic intent-binding protocols. No configuration change, resource slice, or routing modification should be executed by downstream apps unless the command contains a cryptographically signed, structured intent token originating from an authorised operator. This mathematically enforces scope boundaries, preventing unauthorised privilege escalation and intent drift.

2. Multi-vendor conflict arbitration

Operators should reject proprietary, vendor-specific orchestration frameworks that restrict them to single-vendor software ecosystems. Telcos should mandate the integration of open, graph-based conflict detection models (such as GRACE or ORCA) within their Service Management and Orchestration (SMO) platforms. By deploying independent, mathematically verifiable conflict arbitration engines, operators can safely run a diverse mix of third-party apps, achieving a verified multi-vendor integration success rate of over 92% and fostering a competitive and innovative application ecosystem without risking network stability.

3. Open-standard compilation layers to prevent silicon lock-in

To safeguard against hardware-based vendor lock-in, operators should mandate that all virtualised software be compiled via open-source, hardware-agnostic runtime standards. Telcos should actively support initiatives such as the Linux Foundation’s OCUDU project and include strict clauses in vendor contracts requiring compatibility with alternative silicon accelerators. By decoupling the software optimisation layer from proprietary parallel computing architectures like CUDA, operators preserve the long-term flexibility to transition between silicon providers.

4. GitOps-driven human-in-the-loop governance

Operators must balance the speed of closed-loop automation with the security of human oversight. Rather than permitting autonomous agents to modify physical cluster configurations directly in real time, operators should enforce a GitOps-based review pipeline. Under this model, AI-driven optimisation proposals are pushed as pull requests to a secure repository, requiring validation by a human engineer before deployment. This pipeline maintains an immutable audit trail, prevents uncoordinated systemic failures, and uses human feedback to continually retrain and refine the network’s underlying models, establishing a secure path to trusted autonomy. This “governance-as-code” approach transforms compliance from a manual, stress-inducing bottleneck into a continuous, automated evidence pipeline.