There are many benefits to implementing a hybrid cloud strategy, but it comes with growing pains, and businesses need to understand how to combat them.
The appeal of cloud computing is financial: rather than buy infrastructure, rent it – replace capital expense with operational expense. Unfortunately, this financial solution distracts the cloud user from the responsibilities that remain after moving applications out of the enterprise. Businesses should focus on two critical areas: 5G “garbage collection” and shadow cloud.
5G Metadata Corruption – the 21st Century Garbage Collection Problem
Garbage collection is a classic computing problem. When a well-behaved program acquires storage, it releases it before terminating or returning control to its parent process. A poorly behaved program does not, and the storage it held stays marked as in use. As a result, the storage becomes fragmented, and eventually a requesting process will fail. The original work on the problem dates to a paper written sixty years ago by John McCarthy, “Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I.”
When implementing a cloud strategy, it’s easy to think that the problems that come with on-premise storage don’t exist. After all, programs run in virtual machines or under the control of a container or transiently as a lambda function. The environment is torn down after the function or program finishes. Any pending I/O or synchronization signals will fail when they attempt to address a target environment that no longer exists. But the garbage collection problem remains. In the cloud, it isn’t the run-time environment that might become polluted with un-freed storage – the metadata gets corrupted. The control structure breaks down when it exhausts the various finite namespaces that track and account for the storage, virtual images, containers and functions as they come and go.
Virtual images come into being in the 5G world with each new application request from any 5G device. As these devices move, they connect with different cells and edge servers, and the image in the old environment must be torn down as a new instance stands up in the new environment. Furthermore, each component of that image consumes metadata – the processes must signal their state and dependencies to the new instance, or data will be lost in the handoff. Think back to the tense moment in Apollo 13 when the team has to start up the LEM – “I’m going to need your gimbal angles, Jack, before you shut down the computer!”
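To make the namespace exhaustion argument concrete, here is a minimal sketch in Python. It is a toy model, not any vendor's control plane: the IdNamespace class, the handoffs_until_failure function and the leak_every parameter are all hypothetical names invented for illustration. It assumes each image handoff takes one identifier from a fixed-size pool and that an occasional teardown forgets to give its identifier back.

```python
class IdNamespace:
    """Hypothetical fixed-size pool of identifiers (a finite namespace)."""

    def __init__(self, bits):
        self.capacity = 2 ** bits
        self.free = list(range(self.capacity))

    def allocate(self):
        # Fail once every identifier is held by a live or leaked image.
        if not self.free:
            raise RuntimeError("namespace exhausted")
        return self.free.pop()

    def release(self, ident):
        self.free.append(ident)


def handoffs_until_failure(bits, leak_every):
    """Count the handoffs that succeed before allocation fails.

    Assumption: every `leak_every`-th teardown never releases its ID.
    """
    namespace = IdNamespace(bits)
    handoffs = 0
    try:
        while True:
            ident = namespace.allocate()   # new instance stands up
            handoffs += 1
            if handoffs % leak_every != 0:
                namespace.release(ident)   # clean teardown
            # otherwise the ID is leaked and stays allocated forever
    except RuntimeError:
        return handoffs


print(handoffs_until_failure(bits=8, leak_every=10))  # 2560
```

In this model the pool survives exactly capacity × leak_every handoffs, so an 8-bit namespace with a one-in-ten leak fails after 2,560 of them. A larger namespace or a cleaner teardown only postpones the failure; at a high enough handoff rate, the limit still arrives.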
When we get full 5G, including densities of one device per square meter, the rate of image construction and teardown will accelerate to orders of magnitude beyond the human-scale creation and destruction of virtual images in some development environments. Consider the many problems we have experienced over the past year when a counter rolled over or a table filled up. Boeing 767s and 777s reboot every 160 days to clear counters. The NYC public internet went down for ten days after a date counter in a GPS satellite rolled over in April 2019. Some HP storage devices failed after 32,768 hours of operation, exhausting 16-bit use counters. Some Cisco routers failed after exhausting 64K NAT translations. MITRE maintains a list of CVEs relating to resource exhaustion. Similar problems will happen in 5G, without warning, unless vendors find and fix counter capacity defects.
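The arithmetic behind a counter rollover is easy to check. The sketch below is illustrative only; the helper names wrap_signed and hours_until_rollover are made up for this example, and treating the hour counter as a signed 16-bit value is an assumption, chosen because 32,768 is 2 to the 15th power, the point at which a signed 16-bit integer overflows.

```python
HOURS_PER_YEAR = 24 * 365


def wrap_signed(value, bits):
    """Return `value` as a signed integer of width `bits` would store it."""
    value &= (1 << bits) - 1            # keep only the low `bits` bits
    if value >= 1 << (bits - 1):        # top bit set means negative
        value -= 1 << bits
    return value


def hours_until_rollover(bits):
    """Hours a signed counter can count, one tick per hour, before wrapping."""
    return 2 ** (bits - 1)


limit = hours_until_rollover(16)
print(limit)                                # 32768
print(round(limit / HOURS_PER_YEAR, 2))     # 3.74 (years)
print(wrap_signed(32767 + 1, 16))           # -32768
```

Under that assumption, a device that ticks once per hour runs for about 3.7 years and then sees its counter jump from 32,767 to -32,768. Nothing in normal operation warns of the approach; the only defense is knowing the width of the counter in advance.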
Shadow Cloud
With DevSecOps, organizations have picked up the pace of application deployment. However, there remain some capabilities that do not make the cut. Some users choose not to wait for an officially approved vehicle for new functionality and instead get the technology they need informally. Sometimes, this includes cloud services – storage, network, SaaS, IaaS, PaaS or other *aaS capabilities. A user may acquire a service for a specific task without knowing it comes with a cloud component. These unofficial cloud capabilities make up the “shadow cloud” world. Cloud is inexpensive and is very easy to buy. Sometimes you don’t even know you’ve bought it.
Shadow cloud can be a vehicle for malware propagation, cryptomining and data exfiltration. Cloud access security brokers (CASB) offer a solution to this problem by watching activity that invokes cloud services and blocking unsanctioned, inappropriate or risky actions. As such, a CASB can be a component of a zero trust architecture.
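The core decision a broker makes can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any commercial CASB works: the SANCTIONED set, the decide function and the example.com hostnames are all hypothetical, and a real product would also weigh user identity, data content and risk scores.

```python
from urllib.parse import urlsplit

# Hypothetical list of cloud services the organization has approved.
SANCTIONED = {"storage.example.com", "crm.example.net"}


def decide(url, sanctioned=SANCTIONED):
    """Return "allow" for sanctioned cloud hosts and "block" otherwise."""
    host = (urlsplit(url).hostname or "").lower()
    for domain in sanctioned:
        # Match the approved host itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return "allow"
    return "block"


print(decide("https://storage.example.com/reports/q3.pdf"))  # allow
print(decide("https://files.example.org/upload"))            # block
```

The default here is to block anything not on the list, which matches the zero trust posture: a service is untrusted until someone has explicitly sanctioned it.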
Some organizations develop policies regulating the use of unapproved cloud resources. This can help. Training and more involvement by procurement can encourage safe measures.
Cloud is often implicated as a source for malware, usually through improperly configured or protected applications, as in supply chain attacks. Therefore, your organization should monitor unsanctioned cloud use to minimize the chance that an unknown service brings malware into the organization, or worse, passes it through your organization to your partners, suppliers or customers.
The metadata corruption problem is trickier. Most vendors are not conscious of the inherent limits in their code – the size of counters. Some vendors still act as if cloud resources were infinite. They aren’t. As part of the procurement process, require that cloud and SaaS vendors guarantee the capacity they can manage.
William Malik, VP of Infrastructure Strategies, Trend Micro