A Pragmatic Look at Cloud Infrastructure

Gopinath Balakrishnan

I work for Google but these viewpoints are my own and do not represent any company.

Public cloud offers rapid setup, access to tools, the latest tech, and pay-as-you-go pricing. These attributes appeal to large enterprises looking for flexibility, elastic infrastructure, and the ability to build and run applications that serve customers in multiple geographical regions.

However, enterprises considering the public cloud have to reckon with trade-offs around costs, application modernization efforts, the networking challenges that come with globally distributed applications, and ongoing reviews of architectural choices.

If you’re already cloud-native with a modern application stack, some of these trade-offs may not apply. However, understanding cloud performance metrics is still key to setting the right alerts and meeting your application’s service level objectives (SLOs).

Lifting And Shifting Gets Expensive

Many enterprise apps are not designed for the cloud. Simply lifting and shifting an application to the public cloud without refactoring it will likely increase the cost of compute, storage, and networking over time. Usage has to be monitored and controlled continuously. Cloud cost optimization and management are outside the scope of this article, but without these efforts, the cloud can become very expensive infrastructure. Yes, if you intend to lift and shift and then immediately modernize your apps, your infrastructure spending might improve, but you’ll still incur the modernization costs.

The Challenges of Highly Distributed Applications

Another aspect to consider is whether your application can be deployed in a highly distributed fashion across global cloud regions. The challenge with highly distributed deployment is that while it’s great for resiliency and availability, it introduces cost and some additional networking challenges.

Most public clouds offer a few large regions with lower prices for many products, so many enterprises choose those regions both for cost and for the greater availability of cloud resources to meet peak demand with automated scaling enabled. However, prices for cloud products and services can vary across regions and continents.

Thus, you must review your edge/global deployment strategy, how your infrastructure is deployed across cloud availability zones, and how your application can be self-sufficient within a cloud region or availability zone.

Can You Get Cloud Agility With Data Center Control?

Large enterprises want to come to the cloud to reap the benefits of building, operating, and managing infrastructure in a modern way at scale. However, they don’t always have the experience or staff to re-architect or modernize their apps. Usage-based charges can significantly increase your networking spend in the cloud if applications are not optimized. Yes, you may be able to negotiate discounts on network spend with cloud providers, but be aware that your network costs can still become extremely high or unpredictable.
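To make the networking spend concrete, here is a minimal Python sketch that totals monthly data transfer charges per traffic flow. The per-GB rates, path categories, and flow volumes are all hypothetical placeholders chosen for illustration, not any provider's actual pricing; substitute the figures from your own billing data.

```python
from dataclasses import dataclass

# Hypothetical per-GB rates, for illustration only (not real provider pricing).
RATE_PER_GB = {
    "same_zone": 0.00,
    "cross_zone": 0.01,
    "cross_region": 0.05,
    "internet_egress": 0.10,
}


@dataclass
class Flow:
    name: str
    path: str            # one of the RATE_PER_GB keys
    gb_per_month: float  # measured or estimated monthly volume


def monthly_network_cost(flows):
    """Return (total, per-flow breakdown) for a list of Flow records."""
    breakdown = {f.name: f.gb_per_month * RATE_PER_GB[f.path] for f in flows}
    return sum(breakdown.values()), breakdown


# Hypothetical traffic profile of a lifted-and-shifted application.
flows = [
    Flow("app-to-db replication", "cross_region", 20_000),
    Flow("service-to-service calls", "cross_zone", 50_000),
    Flow("customer downloads", "internet_egress", 8_000),
]

total, breakdown = monthly_network_cost(flows)
# Print the most expensive flows first so optimization targets stand out.
for name, cost in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{name:<26} ${cost:>10,.2f}")
print(f"{'total':<26} ${total:>10,.2f}")
```

Even a rough model like this shows which flows dominate the bill, which is usually where keeping traffic within a zone or region pays off first.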

While the cloud offers great benefits including availability and global connectivity, keep in mind it’s a network you share with other users. Every resource is finite in a given cloud location at a given time. There’s also the possibility of a noisy neighbor situation that can affect your application performance, although cloud providers constantly work on these issues by implementing new solutions, monitoring usage, and adding capacity.

Cloud networks are built with massive connectivity and SDN-based infrastructure, operated and managed by SREs who constantly update and upgrade them. These SREs and their automated processes largely assume that cloud customers have built their applications with resiliency, so that the applications can detect and mitigate failures based on higher-layer signals. Generally speaking, that is great, and it is how apps should be built rather than relying on L4 network layers. However, many enterprise apps are not built this way today, and their owners expect cloud networks to offer the same reliability as a privately managed, dedicated DC network or WAN.
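As an illustration of resiliency at a higher layer, the following Python sketch retries a request with exponential backoff and then fails over to the next endpoint. It is a simplified sketch under stated assumptions: `send` is a hypothetical callable that raises an exception on an application-level failure (a timeout, an error response, a failed health check), and the endpoint names are made up.

```python
import time


class AllEndpointsFailed(Exception):
    """Raised when every endpoint has exhausted its retry attempts."""


def call_with_failover(endpoints, send, attempts_per_endpoint=2,
                       base_delay=0.1, sleep=time.sleep):
    """Try each endpoint in order, retrying with exponential backoff.

    `send(endpoint)` is a hypothetical callable that performs the request
    and raises an exception when the application sees a failure.
    """
    errors = []
    for endpoint in endpoints:
        for attempt in range(attempts_per_endpoint):
            try:
                return send(endpoint)
            except Exception as exc:  # application-level failure signal
                errors.append((endpoint, repr(exc)))
                # Back off before retrying the same endpoint.
                if attempt < attempts_per_endpoint - 1:
                    sleep(base_delay * (2 ** attempt))
    raise AllEndpointsFailed(errors)


def fake_send(endpoint):
    """Stand-in for a real client call; the first region always times out."""
    if endpoint == "region-a.example.internal":
        raise TimeoutError("no response")
    return f"ok from {endpoint}"


result = call_with_failover(
    ["region-a.example.internal", "region-b.example.internal"],
    fake_send,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(result)
```

A production client would also need per-request timeouts, jitter on the backoff, and limits on retry volume so that retries do not amplify an outage.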

Regularly Review Your Architectural Choices

The need for regular review should not surprise enterprises building applications and delivering services on public cloud infrastructure. Given the pace of learning, challenges, and new solutions, a year in cloud technology is roughly equivalent to seven years in the typical enterprise infrastructure business. So it is important to revisit the infrastructure choices you made two to three years ago, with a laser focus on your business priorities and on how new solutions and new ways of doing things can further optimize and simplify your environment.

From a networking point of view, any large deployment of apps with multiple regions, hybrid, and multi-cloud infrastructure requires significant architecture planning. This includes going deep into each cloud service provider’s network, including path diversity, latency between locations, connectivity, and service SLAs. Based on such architecture reviews, it’s critical to plan your availability and reliability for large, multi-region, hybrid and multi-cloud deployments. Not surprisingly, with every cloud provider constantly introducing new features, an architecture you decided on two years ago needs to be revisited and possibly updated based on your business priorities and problems you want to solve.
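When planning availability from service SLAs, a useful first approximation is to multiply availabilities for components that must all be up, and to combine redundant paths as one minus the product of their failure probabilities. The Python sketch below applies that arithmetic. The availability figures are hypothetical, and the parallel formula assumes failures are independent, which shared dependencies can easily violate in practice.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def serial(*availabilities):
    """All components must be up, so availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities):
    """At least one redundant component must be up.

    Assumes independent failures, which is optimistic if the
    components share a dependency.
    """
    return 1 - prod(1 - a for a in availabilities)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per year at a given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


# Hypothetical availabilities, for illustration only.
interconnect = 0.999
region = 0.9995

# One interconnect into one region: both must be up.
single_path = serial(interconnect, region)

# Two interconnects and two regions, each pair redundant.
dual_path = serial(parallel(interconnect, interconnect),
                   parallel(region, region))

for label, value in [("single path", single_path), ("dual path", dual_path)]:
    print(f"{label}: {value:.6%} available, "
          f"about {downtime_minutes_per_year(value):.1f} minutes down per year")
```

The gap between the two results is the kind of number to bring to an architecture review, alongside path diversity and latency measurements.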

Revisit End-to-End App Latency SLOs/Alert Thresholds Critically

Running and managing a DC network or private backbone WAN is very different from managing cloud connectivity, because the cloud provider manages the infrastructure and gives the business agility through a layer of abstractions. Given that, it’s important to revisit your app latency and throughput needs across major regions, for both east-west and north-south traffic, and to establish internal SLOs with your application owners. You may often have to challenge stringent latency and availability requirements. Build your alerting and monitoring on these internal SLOs, and involve cloud provider experts and product specialists deeply in the architecture planning.
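One way to turn an internal latency SLO into an alert condition is sketched below in Python. It computes a nearest-rank percentile and the fraction of requests that finished within a threshold, then reports how much of the error budget was consumed. The threshold, target, and sample values are hypothetical; real inputs would come from your monitoring system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_slo(samples_ms, threshold_ms, target_fraction):
    """Compare the share of requests within threshold_ms to the SLO target."""
    total = len(samples_ms)
    good = sum(1 for s in samples_ms if s <= threshold_ms)
    bad = total - good
    allowed_bad = (1 - target_fraction) * total
    if allowed_bad > 0:
        budget_consumed = bad / allowed_bad
    else:
        budget_consumed = 0.0 if bad == 0 else float("inf")
    return {
        "achieved": good / total,
        "meets_slo": good / total >= target_fraction,
        "budget_consumed": budget_consumed,
    }


# Hypothetical latency samples in milliseconds for one measurement window.
samples = [80] * 95 + [180] * 2 + [450] * 3

# Hypothetical SLO: 95% of requests complete within 200 ms.
report = evaluate_slo(samples, threshold_ms=200, target_fraction=0.95)

print("p95 latency (ms):", percentile(samples, 95))
print("p99 latency (ms):", percentile(samples, 99))
print("meets SLO:", report["meets_slo"])
print(f"error budget consumed: {report['budget_consumed']:.0%}")
```

Alerting on error budget consumption rather than on individual slow requests tends to produce fewer, more meaningful pages, and it gives application owners a shared number to negotiate around.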

Lastly, I have noticed many enterprises think of cloud providers as just another infrastructure vendor. The cloud is not simply another rented data center rack that someone else manages. Every cloud provider continuously improves its platform and introduces new services and new options to solve problems. Building open and transparent partnerships with cloud providers around your business goals and the problems you are trying to solve, and building things together, will go a long way toward successful adoption.

About Gopinath Balakrishnan: Gopinath is a cloud architect with deep expertise in developing networking software/hardware solutions. He works with Google’s strategic accounts, architecting scalable, performant, and secure connectivity solutions that ensure high availability and reliability of their application infrastructure. He specializes in hybrid/multi-cloud connectivity, DDoS mitigation, network security, data center switching and routing, PaaS, and IaaS technologies.