To secure 5G networks, your most important preparation revolves around design and implementation choices for new technologies, including virtualization, containerization and orchestration. These technologies, while new to telecom networks, are well-proven in other industries. It is our mission to learn from these industries and secure the future of telecom. The design and configuration choices presented by modern network and cloud technologies open a wide spectrum of possible outcomes. This can increase security risk compared to previous network generations. However, whenconfigured according to best practices, cloud native 5G and Open RAN networks can become the most secure mobile network generation deployed to date.
In this guide, we detail a five-step path to making strong choices around decoupling access, enforcing segregation, patching and hardening, verifying deployments, and hacking monitoring. We also share Rakuten’s experience implementing the five-step 5G security path in its work with Rakuten Mobile in Japan.
5G and Open RAN introduce new technologies that change potential hacking risk exposure. Specifically, the cloud technologies introduced in modern 5G networks provide design and configuration choices that offer a spectrum of security challenges if not managed correctly.
When configured weakly, these technologies can increase hacking exposure and hacking impact compared to previous network generations. However, when configured according to best practices, 5G and Open RAN networks can become the most secure mobile network generation deployed to date.
The good news is that we already know the critical security configuration settings required to make cloud-native 5G and Open RAN networks secure. This guide summarizes proven best practices from other industries that have successfully adopted open cloud technologies. It introduces a five-step path for securing the IT infrastructure of a 5G or Open RAN network and shares Rakuten Symphony’s experience implementing these steps.
This (new network architectures) can increase security risk compared to previous network generations. However, when configured according to best practices, cloud native 5G and Open RAN network scan become the most secure mobile network generation deployed to date.
We will start by contrasting a well-secured network with one that does not pay sufficient attention to security design and hardening.
We already know the critical security configuration settings required to make cloud-native 5G and Open RAN networks secure.
The evolution to 5G enables faster and more flexible mobile networks. Some of these networks will excel in information security, while others will introduce additional risks compared to earlier mobile network generations. The difference between weak and strong 5G networks mostly comes down to design and configuration choices, as the remainder of this study illustrates.
Let us compare weak vs. strong choices when implementing four design principles found in most modern 5G networks, specifically in Open RAN networks.
Modern 5G networks are:
Each of the four design principles brings about new technology design choices and configuration settings. In each, weak choices can diminish the hacking resistance of a mobile network, while strong choices can strengthen a network compared to earlier technology generations:
Security down-side created by bad choices
5G network design principles
Up-side created by strong choices (five-step path)
Each 5G deployment chooses between weak and strong design and configuration options. This chapter details a five-step path to making strong choices, while the next chapter shares Rakuten Symphony’s experience and implementation challenges in following the five-step path
The human factor has been a security sore point in all technology generations. With cloud-based 5G networks, we have an opportunity to decouple human access from technical infrastructure.
Cloud infrastructure is configured through scripts (“infrastructure as code”) rather than through human operators. Humans ultimately maintain these scripts, but security checks can happen between the creation and execution of each script. When done right, decoupling limits both the impact of human errors and the effects of social engineering.
In addition, network built from code can recover much faster after an incident.
Network segregation, especially of critical components, limits the possible reach of a hacking incident. The technology underpinning 5G networks makes segregation both easier and harder.
Segregation in cloud-native networks is easier because software-defined networks allow any granularity of segregation simply through configuration, and can flexibly be adopted to evolving functional requirements. No firewalls are required.
It is also harder to implement. New connections between network elements arise from virtualization/containerization whereas in earlier network generations, the only connection between elements was a network cable. Today, different elements typically reside as containers in the same cluster.
For segregation to be effective, containers need to be configured strongly to prevent hackers from being able to escape. Strong container configuration can be enforced centrally through automated tools in the deployment pipeline.
For cloud-native networks, it is best practice to maintain “golden” master images with the latest patches and configuration settings, and to base all containers on these images.
Regular patching and strong configuration settings (known as ‘hardening’) remain critical in shortening the time window during which network elements are exploitable through newly discovered security vulnerabilities.
Once again, cloud-native 5G makes patching both easier and harder compared to earlier network generations. Easier, because network elements can be rebuilt on the fly through an automated deployment pipeline, each time including the latest patches and configuration settings. Harder, because virtually all patching must be done through these automated processes due to the larger scale and complexity of 5G compared to earlier generations.
For cloud-native networks, it is best practice to maintain “golden” master images with the latest patches and configuration settings, and to base all containers on these images. A packer automatically builds network elements by combining a golden image with element-specific software, and stores the images in a repository. Automated deployment processes use these artifacts to deploy any number of required network elements, each including the latest security baseline.
Security testing happens as part of the development and deployment pipelines. Yet again, cloud native networks require much faster assurance processes to match their continuous deployment practices.
Tests must be fully automated as part of deployment pipelines and should uncover programming issues, patching gaps, and configuration issues through vulnerability assessment tools and source code scanners.
Feedback to developers and DevOps engineers should be immediate and automated by blocking issues from being committed into the pipeline.
Modern technology produces a lot of telemetry, which — when analyzed correctly — provides plenty of opportunity to detect and alert on hacking activity.
Not all hacking attacks can be fully prevented, creating a need for security monitoring to detect hackers as they move through the network.
Modern technology produces a lot of telemetry, which—when analyzed correctly—provides plenty of opportunity to detect and alert on hacking activity
To respond at the speed of hackers, some evasion actions need to be automated. For example, suspicious instances should be saved for forensics purposes while replacements are immediately rebuilt with the latest security baseline. While the suspicious instance is being analyzed, other instances automatically take over the load. Speeding up recovery while at the same time resolving the inherent conflict of objectives between forensics and business continuity is another strength of containerized deployments.
The five-step path leads to more secure networks but relies on additional baseline security controls that are commonly found across IT infrastructures. As these standards controls are widely documented elsewhere, including in common security certifications, we mention them only briefly. The following controls are required for securely operating any complex IT infrastructure:
“The Rakuten telco environment, typical for a cloud-native 5G network, comes at unprecedented scale: We run thousands of containers in hundreds of clusters spread across the country.”
At Rakuten, we implemented the five-step 5G security path. This chapter shares our experience, provides concrete advice for similar deployments, and details non-trivial complications with some controls.
The Rakuten telco environment, typical for a cloud-native 5G network, comes at unprecedented scale: We run thousands of containers in hundreds of clusters spread across the country.
At this scale, deployment, management, and monitoring are only possible with complete automation.
In our orchestrated environment, users leverage automated workflows for deployment into hundreds of clusters without any manual intervention. Post-deployment workflows such as monitoring, auto healing, upgrades are also automated so that the environment can be managed with a lean operations team.
Let us look at the example of deploying a DU. DUs – or Distributed Units – are controllers for one or more antennas. We have thousands of these.
Deploying a DU involves BIOS configuration, OS installation, OS configuration, Kubernetes installation and configuration, Rakuten SymworldTM installation and configuration, and application deployment and configuration.
Before heavy automation, we had to give access to the hardware, OS, Kubernetes, and Symworld Platform to different human operators. Now with end-to-end orchestration in place, no human operator needs to have access to any of these components. This automation scales the effectiveness of operations exponentially while at the same time reducing the deployment time and improving the security of the platform by keeping access credentials tightly managed.
In Rakuten, as in any other telco environment, a lot of vendor applications need to be deployed in various stages of the life cycle (sandbox, staging, production). The applications need to run efficiently without causing noisy neighbor issues, and without enabling hackers to laterally move between them.
We implemented four levels of isolation to allow applications to co-exist on the same cluster.
Between these levels of isolation, we limit the reach of any successful hack. Next, let us discuss how to reduce these hacking threats before they happen.
In complex IT environments, new security issues are continuously discovered as hackers continue to learn new vulnerabilities. A modern vulnerability management process can quickly identify and address upcoming issues.
To handle the challenges of country-wide 5G deployments, we implemented a streamlined vulnerability management process with five phases:
In evolving IT infrastructures, keeping track of every single security relevant technology component – network appliance, servers, operating system packages, middleware, software libraries, and applications – is a large challenge. Instead of trying to solve this challenge through a central asset directory, we instead combine readily available data sources to create an approximate inventory. This ad-hoc inventory aggregates information from:
By combining these data sets, we reach a close approximation of the IT infrastructure and software components running in our network, which provides the foundation for the next step in vulnerability management: Finding vulnerabilities.
The overview also allows for quick queries around newly-discovered issue, along the lines of “where does software X run in version Y”. These queries enable ad-hoc emergency patching for the most severe issues.
No single scanning tool covers all technologies we are using, hence we employ a range of scanners including standard vulnerability assessment tools, application-level scanners, and cloud-specific tools, among others. In addition to automated tools, periodic hacking simulations (“red team” exercises) and a private bug bounty program allow us to spot additional security issues.
The flip side of running a large range of scanners is an overwhelming and partly-redundant amount of vulnerability information, which prompts for strict prioritization.
Our focus is on increasing hacking resilience, that is: Closing those vulnerabilities that hackers like to exploit.
To choose which vulnerabilities to address first, we use prioritization tools that rank issues by their attractiveness to hackers. The most attractive vulnerabilities typically include unpatched bugs, for which exploits are readily available, and severe configuration mistakes including weak or default passwords.
To decide which vulnerabilities matter, our CSIRT team exchange information with equivalent teams at other companies and security vendors regarding incidents worldwide. We find that over 95% of the issues reported by standard tools have no relevance for hacking resilience and can be de-prioritized. The remaining issues are packaged into remediation actions.
A number of teams contribute to the remediation of security issues –including sysadmins, DevOps engineers, and developers.
In our experience, most contributors are experts in IT, but not necessarily in security. Presenting them with a list of vulnerabilities often does not provide enough information to address the issues.
We map vulnerabilities into remediation packages, which are described in the domain-language of the remediation contributor. Instead of vulnerabilities, we talk about patching processes; instead of configuration weaknesses, we detail hardening steps; and instead of weak credentials, we explain strong credential management practices.
We support the remediation contributors in two ways, until the relevant issues are closed:
First, we track their progress, both in absolute terms and as a relative ranking between the teams. This “gamification” of vulnerability remediation assures that the most effective contributors encourage others to follow in their footsteps.
Second, we remain available for questions throughout the process, to ensure every contributor feels enabled to execute remediation actions even when they encounter a particular remediation package for the first time.
Based on tracking and supporting, we find that decentral remediation of issues continuously reduces the number of vulnerabilities in our network, starting with those the hackers are most interested in.
By implementing several checks in our deployment pipeline, we are able to detect security defects six to eight weeks earlier than before, thereby mitigating risks and reducing the mitigation overhead. This was achieved with three tool-based capabilities and one overarching process.
We start with the CI/CD workflow in which developers create code, commit it to central repositories, then test and deploy mostly automatically. Initially, many security issues were only detected after deployment, causing reactive remediation rather than proactive remediation.
We introduced three capabilities to detect security defects earlier. Static Application Security Testing (SAST) and Software Composition Analysis (SCA) assure that known security issues are detected before deployment. Cloud Security Posture Management (CSPM) ensures that new security issues are quickly addressed after deployment.
A BOM is automatically generated when creating container images and isused to identify such newly-discovered vulnerabilities in already-deployedcontainers. A CSPM analyses the BOM and other data and flags securityissues to the DevOps engineer, and optionally to developers, usuallytriggering a rebuild of the image with updated components.
To be most effective, the three tool-based capabilities need to be integratedinto an overall pipeline governance process, which ensures that securitychecks are enforced and that feedback is well-understood and acted uponby developers and engineers.
Ramping up security monitoring can take a long time to adapt to new environments. Specifically for cloud-native 5G deployments, the monitoring scope also continuously evolves, further increasing the challenge of creating holistic security monitoring.
At Rakuten, we needed monitoring from day 1 and hence started implementing a continuous improvement process that allowed adaptive learning even before our first mobile network was launched in Japan. The continuous learning approach is built around improving hacking threat detection on two fronts in parallel through simulated hacking activity, expressed in two scores:
We are simulating over 30 hacking techniques so far, with a focus on the technologies hackers are most interested in. For example, APIs are a likely hacking target in 5G environments. API hacks can lead to information leakage, identity theft, and service unavailability; hence we prioritize them when implementing analytics tests.
The Analytics Score tests drive continuous improvement of our Security Operations Center (SOC) performance. During the first three months of running the tests, our analytics coverage doubled.
Signal and Analytics Scores are each measured on a scale of 0-100%. The overall Security Monitoring Performance Score is simply:
Security Monitoring Performance Score = Signal Score x Analytics Score
Additional end-to-end assurance. The signal and analytics measurements continuously increase the detection baseline bottom-up. They do not however capture Response capabilities. To quality assure the Security Monitoring process end-to-end we periodically conduct Red Team simulations in which ethical hackers break into our network to provide a real world opportunity to respond to their activities.
Threat Response. Our analysts work 24/7 and are supported through modern technology to increase the effectiveness of their response. For instance, our security orchestration, automation, and response (SOAR) tools invoke automation scripts, for example to automatically collect forensics evidence and rebuild suspicious network elements. These processes aim to continuously increase the level of automation in threat detection and response over time.
Venturing into new technologies creates opportunity and risks. 5G and Open RAN are no exception: They provide faster, lower-latency, and smarter networks, but introduce an unprecedented level of technology scale and complexity.
Fortunately, other industries have paved the way towards securely deploying and operating cloud-scale infrastructures. At Rakuten Mobile and Rakuten Symphony, we leverage best practice knowledge in five ways:
We train every day because the race is never over and the hackers never slow down. The first five steps you take in this race will be your most important. We hope this guide brings clarity to the track ahead.
“The Rakuten telco environment, typical for a cloud-native 5G network, comes at unprecedented scale: We run thousands of containers in hundreds of clusters spread across the country.”