How we designed Level 4 autonomy we could trust

March 19, 2026

At MWC Barcelona 2026, Rakuten Mobile announced that TM Forum had validated its achievement of Level 4 autonomy for energy efficiency, delivering 20% RAN energy conservation. In this article, Rakuten Symphony VP of Automation and Transformation Ahmed Gamal Abdelaziz unpacks what the certification represents at a system level, the five foundational layers that support L4 deployments and how operators can begin approaching their own strategies.

As telecom’s first Level 4 autonomous deployments go live, a proven operational path to support these rollouts is coming into focus.

This development comes as more operators target the high-value L4 use cases that TM Forum has identified as having the greatest impact on operations and business, such as autonomous fault resolution, predictive maintenance, energy optimization and automated service provisioning.

Level 4 defines a state in which the network operates within operator-defined intents and guardrails without requiring human approval for every action. Outcomes are continuously validated, and the system self-corrects based on what it observes.

It is important to note that Level 4 is not simply a new system capability. It becomes the anchor of operations, gradually transforming and influencing what happens around it.

Building a system that supports L4 operations

Specific requirements and definitions have helped establish a shared understanding of what Level 4 deployments achieve. In discussions with operators eager to move to L4 operations, what is less well understood is what it takes to build a system capable of achieving it from a systems, data, governance and operations perspective.

An easy test: if the loop cannot explain why it acted, demonstrate that it operated within defined guardrails and roll back safely when an action doesn’t produce the expected outcome, it is not truly Level 4. These consequential gaps often define the difference between successful L4 operations and what merely meets L4 requirements on paper.

At Rakuten, we identified five operational layers that close common gaps, based on our experience building and certifying systems that have been validated by TM Forum. Each layer has to be in place and functioning well before the next one can do its job effectively.

Layer 1: Intent and policy management

In Layer 1, the operator sets objectives in alignment with key priorities (e.g., maintain network availability above 99% or reduce QoE impact by 10% during specific times). Here, the operator defines intent so the network knows what it is supposed to do and not do. These intents are translated into policies that govern network behavior, comprising hard guardrails the system must never violate and soft guardrails it should avoid violating unless necessary. On this front, LLM training and tuning techniques are simplifying how natural language is translated into precise intent definitions that systems can confidently act on.
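To make the distinction between hard and soft guardrails concrete, here is a minimal Python sketch. It is not Rakuten's implementation: the `Guardrail` and `Intent` classes, the metric names and the 2% QoE figure are hypothetical and chosen only for illustration. Only the "availability above 99%" objective comes from the example above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass(frozen=True)
class Guardrail:
    """A named condition on predicted metrics (hypothetical structure)."""
    name: str
    check: Callable[[Metrics], bool]  # True when the metrics are acceptable
    hard: bool                        # hard = never violate, soft = avoid


@dataclass
class Intent:
    """An operator objective plus the guardrails that bound it."""
    objective: str
    guardrails: List[Guardrail] = field(default_factory=list)

    def evaluate(self, predicted: Metrics) -> dict:
        # Collect violations separately so soft ones can be weighed, not blocked.
        hard = [g.name for g in self.guardrails if g.hard and not g.check(predicted)]
        soft = [g.name for g in self.guardrails if not g.hard and not g.check(predicted)]
        return {"allowed": not hard, "hard_violations": hard, "soft_violations": soft}


energy_intent = Intent(
    objective="reduce RAN energy use",
    guardrails=[
        # From the text: keep availability above 99%.
        Guardrail("availability_above_99pct", lambda m: m["availability_pct"] > 99.0, hard=True),
        # Assumed soft limit, for illustration only.
        Guardrail("qoe_drop_under_2pct", lambda m: m["qoe_drop_pct"] < 2.0, hard=False),
    ],
)

# Predicted effect of a candidate action: availability holds, QoE dips slightly.
result = energy_intent.evaluate({"availability_pct": 99.6, "qoe_drop_pct": 2.5})
print(result)
```

In this sketch a soft violation is reported but does not block the action, which leaves it to a later stage to decide whether the trade-off is justified. A hard violation always sets `allowed` to `False`.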

Layer 2: Data readiness and observability

Most L4 failures stem from data issues, not machine learning failures. If Layer 2 is weak, nothing built above it will be structurally sound. Data readiness means addressing table-stakes challenges like silos while setting strategies for handling missing data, identifying and managing outliers, properly baselining data and segmenting data in meaningful ways. The real power comes from correlating across all sources to recognize patterns no single source can reveal, giving the decision engine the context it needs to act confidently instead of guessing.
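As a rough illustration of three of these readiness steps (missing data, outliers and baselining), the sketch below checks a single KPI series using only the Python standard library. The function name `prepare_series`, the 20% missing-data limit and the use of a median-absolute-deviation outlier rule are assumptions made for this example, not a description of any production pipeline.

```python
import statistics
from typing import List, Optional


def prepare_series(
    samples: List[Optional[float]],
    max_missing_ratio: float = 0.2,   # assumed limit for this sketch
    outlier_cutoff: float = 3.5,      # common cutoff for the modified z-score
) -> dict:
    """Report whether a KPI series is usable, and compute a robust baseline."""
    present = [s for s in samples if s is not None]
    missing_ratio = 1 - len(present) / len(samples) if samples else 1.0
    if not present or missing_ratio > max_missing_ratio:
        # Too sparse to trust: signal that upstream collection needs fixing.
        return {"ready": False, "missing_ratio": missing_ratio,
                "baseline": None, "outliers": []}

    median = statistics.median(present)
    mad = statistics.median([abs(s - median) for s in present])

    def is_outlier(value: float) -> bool:
        if mad == 0:
            return False  # no spread to measure against
        return 0.6745 * abs(value - median) / mad > outlier_cutoff

    outliers = [s for s in present if is_outlier(s)]
    clean = [s for s in present if not is_outlier(s)]
    return {"ready": True, "missing_ratio": missing_ratio,
            "baseline": statistics.median(clean), "outliers": outliers}


# Hypothetical hourly load readings with one gap and one spike.
readings = [40.0, 42.0, None, 41.0, 39.0, 400.0, 43.0, 41.0, 40.0, 42.0]
report = prepare_series(readings)
print(report)
```

The median-based rule is used here because a single spike barely moves the median, whereas it would distort a mean-based baseline. Segmentation (per cell, per time of day and so on) would sit around this function and is left out for brevity.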

Layer 3: The decision engine

Layer 3 is where decisions are made about whether an action should be taken based on gain estimation and risk estimation. What is to be gained from this action and what is the risk of taking it? If both calculations fall within acceptable thresholds as defined by the operator (not the machine), the action is eligible to run. The actions available to the decision engine come from a library built on existing network policies, operational activities and configurations previously carried out by human engineers. Machine learning samples this history, assigning risk versus reward weights to potential actions and continuously self-correcting based on new knowledge. Rather than an engineer choosing the policy, intent management selects from that pool based on what best achieves the defined goal. Traceability is important here: every decision must be explainable. We don't just want to know what the system did, but why, which data and inputs were considered, what it is going to do next, and what safety or fallback measures are in place. A confidence or trust score accompanies each decision, informed by the model's history and interactions with engineers over time. Simply put, if you can't trace a decision, you can't trust it or safely learn from it.
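The gating logic and the traceability requirement can be sketched together. In the hypothetical Python example below, the gain and risk estimates are passed in as plain numbers (in practice they would come from models), the thresholds are supplied by the operator, and every call returns a record stating what was decided, why, on which inputs and with what fallback. All names and values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Thresholds:
    """Operator-defined limits; the model never sets these itself."""
    min_gain: float  # smallest estimated benefit worth acting on
    max_risk: float  # largest estimated risk the operator accepts


@dataclass(frozen=True)
class DecisionRecord:
    """Everything needed to explain a decision after the fact."""
    action: str
    eligible: bool
    reasons: List[str]
    inputs: Dict[str, float]
    confidence: float
    fallback: str


def decide(action: str, gain: float, risk: float, confidence: float,
           thresholds: Thresholds, inputs: Dict[str, float],
           fallback: str) -> DecisionRecord:
    reasons = []
    if gain < thresholds.min_gain:
        reasons.append(f"estimated gain {gain:.2f} is below minimum {thresholds.min_gain:.2f}")
    if risk > thresholds.max_risk:
        reasons.append(f"estimated risk {risk:.2f} is above maximum {thresholds.max_risk:.2f}")
    eligible = not reasons  # both estimates must pass
    if eligible:
        reasons.append("gain and risk are within operator thresholds")
    return DecisionRecord(action, eligible, reasons, dict(inputs), confidence, fallback)


limits = Thresholds(min_gain=0.05, max_risk=0.10)
record = decide("sleep_capacity_carrier", gain=0.12, risk=0.03, confidence=0.9,
                thresholds=limits, inputs={"cell_load_pct": 8.0},
                fallback="reactivate_carrier")
rejected = decide("sleep_capacity_carrier", gain=0.12, risk=0.25, confidence=0.9,
                  thresholds=limits, inputs={"cell_load_pct": 55.0},
                  fallback="reactivate_carrier")
print(record)
print(rejected)
```

Returning a record for rejected actions as well as eligible ones is a deliberate choice in this sketch: the decisions the system chose not to take are part of the audit trail too.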

Layer 4: Safe execution

After a decision is made, Layer 4 is where the system acts, in a way that depends on the environment. In open RAN environments this happens through defined interfaces connecting the SMO and RAN equipment. Safe execution runs pre-checks in advance of action and post-checks upon completion, with a structured rollout approach. You may see canary deployments, where changes are applied first to a limited scope, or geofencing that constrains where an action can take effect. Compensation algorithms running in parallel help ensure network quality won’t degrade while changes are being made. Importantly, Layer 4 never assumes the decision was perfect, only that it was good enough to try.
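The pre-check, canary, post-check and rollback sequence can be sketched in a few lines of Python. This is an illustrative outline under simple assumptions: the network is an in-memory dictionary, and `apply`, `rollback`, `pre_check` and `post_check` are hypothetical callbacks standing in for whatever interface actually reaches the equipment.

```python
from typing import Callable, Dict, List


def safe_execute(cells: List[str],
                 apply: Callable[[str], None],
                 rollback: Callable[[str], None],
                 pre_check: Callable[[str], bool],
                 post_check: Callable[[str], bool],
                 canary_size: int = 1) -> dict:
    """Apply a change to a canary subset first, then to the remaining cells."""
    if not all(pre_check(c) for c in cells):
        return {"status": "aborted_precheck", "applied": []}

    applied: List[str] = []
    # Stage 1 is the canary; stage 2 is everything else in scope.
    for stage in (cells[:canary_size], cells[canary_size:]):
        for cell in stage:
            apply(cell)
            applied.append(cell)
        if not all(post_check(c) for c in stage):
            # Undo everything applied so far, most recent first.
            for cell in reversed(applied):
                rollback(cell)
            return {"status": "rolled_back", "applied": []}
    return {"status": "completed", "applied": applied}


# Toy network state and callbacks (all hypothetical).
state: Dict[str, str] = {"cell-a": "active", "cell-b": "active", "cell-c": "active"}
touched: List[str] = []


def apply(cell: str) -> None:
    state[cell] = "sleep"
    touched.append(cell)


def rollback(cell: str) -> None:
    state[cell] = "active"


# Simulate a post-check failure on the canary cell.
outcome = safe_execute(list(state), apply, rollback,
                       pre_check=lambda c: state[c] == "active",
                       post_check=lambda c: c != "cell-a")
print(outcome, touched, state)
```

Because the canary fails its post-check here, the remaining cells are never touched and the canary is restored. Geofencing would correspond to how the `cells` list is built before this function is called.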

Layer 5: Assurance and continuous learning

Layer 5 is where the system asks what worked, what didn’t and whether it is still capable of making good decisions going forward. Drift detection is a core function of this layer, determining whether the data has become stale or the model has drifted to the point where decisions no longer reflect network conditions or operator priorities. At this stage, the system can determine whether it needs to be retrained or whether policies need to be retuned because they conflict or overlap with other policies in ways that make decisions ambiguous. This is natural. At some point, as more policies are created, history is established and decisions are made, it is necessary to step back and consolidate into a coherent structure. Layer 5 is also where actions are validated against expected outcomes, informing future decisions. This continuous learning loop is what turns L4 operations into living systems versus sophisticated rule sets.
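One simple way to picture drift detection is to compare recent observations against the reference data a model was built on. The Python sketch below measures how far the recent mean has moved, in units of the reference standard deviation. Both the method and the threshold of 2.0 are assumptions chosen for illustration; real deployments may use richer statistical tests.

```python
import statistics
from typing import List


def drift_score(reference: List[float], recent: List[float]) -> float:
    """Shift of the recent mean, in reference standard deviations."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.pstdev(reference)
    shift = abs(statistics.fmean(recent) - ref_mean)
    if ref_std == 0:
        # A constant reference: any movement at all counts as drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / ref_std


def needs_retraining(reference: List[float], recent: List[float],
                     threshold: float = 2.0) -> bool:
    # threshold is an assumed value; operators would tune it per KPI.
    return drift_score(reference, recent) > threshold


# Hypothetical KPI samples: training-time reference versus two recent windows.
reference = [10.0, 12.0, 11.0, 9.0, 10.0, 12.0, 11.0, 10.0]
stable_window = [11.0, 10.0, 11.0, 10.0]
shifted_window = [15.0, 16.0, 14.0, 15.0]

print(needs_retraining(reference, stable_window))
print(needs_retraining(reference, shifted_window))
```

A check like this only flags that conditions have changed. Deciding whether the right response is retraining the model or retuning policies still depends on the outcome validation described above.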

Where to start?

L4 operations are in reach for every operator, regardless of existing systems or network design. It is the organization’s priorities and data strength that will most inform the starting point.

Maybe carbon footprint reduction or operational efficiency is the priority. Or it could be network deployment velocity and monetization.

Operators start by mapping to the high-value scenarios they want to pursue first.

While a greenfield operator launching a new network might prioritize network planning-focused use cases, an incumbent with generations of accumulated legacy systems may begin with service assurance and fault management to relieve specific pain points.

The first use case will always be the hardest because the supporting system is being built in tandem. However, once the layers are in place, moving to the next use case becomes more about transferring learning to a validated framework that is now just accommodating a new problem.

The higher the stakes, the more evidence and confidence operators will need before deploying additional use cases. So energy efficiency, which carries relatively low decision risk because any resulting performance degradation can be recovered nearly instantly, may make a great first goal. Proactive fault management, which takes action on a running network (and therefore has a larger blast radius), may be saved for later, when the system has been proven and is trusted.

Of course, the right architecture alone doesn’t guarantee a successful L4 deployment. Operational realities also come into play, testing governance, organizational readiness, integration realities and the role people play alongside the system.

Next week, I will be a guest on Zero-Touch Live where we will pick up this conversation and discuss what we actually learned at each stage of system implementation and how that informed our approach, the next use cases we are pursuing and the work we are doing with other operators eager to deploy Level 4 autonomous operations.

Stay tuned for more information about how to register to join our next Zero-Touch Live discussion. Tag Ahmed Gamal Abdelaziz in the comments to start a conversation.
