When Walmart Inc. commits to something, it tends to think big. Building a supercloud is no exception.
For the past two years the Bentonville, Arkansas-based retail giant has been putting in place an abstraction layer that masks the distinctions between its preferred public cloud providers, its internal hybrid cloud and edge computing nodes in 10,000 locations around the U.S. The principal goal: build a multicloud environment that fits the business’s size and agility needs, with the freedom to choose the best services from each cloud provider rather than being locked into one.
The multicloud gives Walmart the ability to optimize resource consumption based on size, scale, cost and location. It drives developer productivity by eliminating platform complexity, improves network resilience and makes data analysis available to a much larger audience of business users.
Walmart doesn’t call its project — dubbed the Walmart Cloud Native Platform — a supercloud, but the principles are in line with an evolution that has organizations seeking to erase distinctions between major public cloud providers. The aim is to pick and choose the services they want without being locked in.
“We can bring the best capabilities from the cloud providers together in the same stack because we’ve created seamless integrations between providers,” said Kevin Evans, vice president of infrastructure services at Walmart. “We abstract away the provisioning and policy management and built that into our pipelines. We want people to take advantage of best-of-breed services.”
Developers can now provision most cloud services from an internal DXchange Integration Cloud console without having to fret about details of setup and configuration. Walmart’s platform engineering team is also building a set of preconfigured common services aimed at specific technology problems.
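Walmart hasn’t published the internals of its DXchange console, but the pattern described here — developers request a capability, and the platform resolves the provider and configuration details — can be sketched as a thin abstraction layer. Everything below (class names, catalog entries, provider choices) is hypothetical illustration, not Walmart’s actual API:

```python
# Hypothetical sketch of a provisioning abstraction: developers ask for a
# service by capability, and the layer picks a provider and fills in the
# setup details they would otherwise configure by hand.

from dataclasses import dataclass, field

@dataclass
class ProvisionRequest:
    service: str                      # capability, e.g. "cache"
    region: str
    options: dict = field(default_factory=dict)

# Preconfigured defaults per service (illustrative values only).
CATALOG = {
    "cache": {"provider": "gcp", "defaults": {"tier": "standard", "size_gb": 4}},
    "object-storage": {"provider": "azure", "defaults": {"redundancy": "zone"}},
}

def provision(req: ProvisionRequest) -> dict:
    """Resolve a capability request into a concrete, fully configured service."""
    entry = CATALOG[req.service]
    config = {**entry["defaults"], **req.options, "region": req.region}
    return {"provider": entry["provider"], "service": req.service, "config": config}

resource = provision(ProvisionRequest("cache", region="us-south", options={"size_gb": 8}))
print(resource["provider"], resource["config"]["size_gb"])
```

The point of the pattern is that the developer never names a provider: the catalog of preconfigured services, maintained by the platform team, makes that choice and supplies sane defaults, which is how less choice translates into faster time to resolution.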
That will give developers less choice but faster time to resolution. “The ultimate goal is flexibility but we don’t always want to give up on functionality,” Evans said.
The retail giant is also in the process of overhauling much of its legacy software infrastructure, moving applications into containers or migrating existing virtual machines to containers using the open-source KubeVirt technology.
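KubeVirt works by representing a virtual machine as a Kubernetes custom resource, so a migrated VM is scheduled and managed by the same orchestrator as the surrounding containers. A minimal manifest looks roughly like the following (the names, sizes and disk source are illustrative, not Walmart’s actual configuration):

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: legacy-app-vm            # illustrative name
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 2Gi
      volumes:
        - name: rootdisk
          persistentVolumeClaim:
            claimName: legacy-app-disk   # disk imported from the original VM
```

Once applied, the VM shows up as a first-class Kubernetes object, gaining the same declarative deployment and portability properties as containerized workloads.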
The company has already seen cost savings of up to 18%, but reducing cloud expenditures isn’t the primary objective. “Portability is more important than cost; it’s how we build resiliency,” Evans said. Using an approach it calls the “triplet model,” the company is positioning combinations of public cloud, private cloud and edge nodes strategically around the U.S. to give it the utmost flexibility in how it deploys workloads.
“Each triplet is located milliseconds from each other so we can use compute in one place and data in another,” Evans said. That makes failover fast and automatic in the event of an outage while letting Walmart take advantage of cloud bursting capabilities during times of peak traffic.
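The mechanics Evans describes — compute in one member of the triplet, data in another, and automatic failover when a site drops out — amount to health-checked routing across three low-latency sites. A simplified sketch of that routing decision, with site names and latencies invented for illustration:

```python
# Hypothetical sketch of triplet routing: pick the healthy site with the
# lowest latency; when a site goes down, the next call fails over.

from dataclasses import dataclass

@dataclass
class Site:
    name: str
    kind: str          # "public-cloud", "private-cloud" or "edge"
    latency_ms: float
    healthy: bool = True

def route(triplet: list[Site]) -> Site:
    """Return the healthy site with the lowest latency."""
    candidates = [s for s in triplet if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy site in triplet")
    return min(candidates, key=lambda s: s.latency_ms)

triplet = [
    Site("edge-store-4501", "edge", 2.0),
    Site("private-dc-east", "private-cloud", 5.0),
    Site("public-region-a", "public-cloud", 8.0),
]

print(route(triplet).name)          # lowest-latency healthy site
triplet[0].healthy = False          # simulate an outage at the edge node
print(route(triplet).name)          # traffic fails over within the triplet
```

Because all three sites sit within milliseconds of one another, the failover target is close enough that re-routing is transparent to the workload — the same property that makes cloud bursting during peak traffic practical.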
“It’s much easier to fail out of a provider in the same region to another provider,” Evans said. “We started out looking to leverage around cost, but it’s transitioned into leverage around resiliency.”
Walmart developed the abstraction layer internally and uses Red Hat Inc.’s OpenShift version of Kubernetes to orchestrate containers running on top of a 93,000-node compute fabric.
170 times more site updates
The new platform has already enabled the company to make 170,000 adjustments to its website back end every month, compared with just 1,000 previously. “This means we can launch new experiences such as the ability to schedule a vaccine appointment at thousands of Walmart pharmacies faster and more seamlessly than ever,” Suresh Kumar, global chief technology officer and chief development officer, wrote in a LinkedIn post.
WCNP also leverages services from Microsoft Corp.’s Azure cloud and the Google Cloud Platform, which are the retail giant’s preferred public cloud providers. Like some in its industry, Walmart doesn’t use cloud services from archrival Amazon.com Inc.
Although the company’s market clout might enable it to extract custom work from public cloud vendors, it has chosen the carrot over the stick. “We’ve asked them to lean in to make best-of-breed services portable so we can run them in other places,” Evans said. That has allowed Walmart to use data persistence from one provider and caching from another, for example, without having to install the full cloud stack onsite.
However, “we want anything we leverage to be a product so it isn’t incumbent on us to support it,” he said. “If they make a profit, they are motivated to keep it current.”
The company avoids costly egress fees by leveraging direct connections to enable data to flow seamlessly. “We do a fair amount of big-data analytics using data lakes with data extracted from transactional systems,” Evans said. “We lean into what the cloud providers can offer.”
Revisiting the edge
Despite the current industrywide fascination with edge computing, which moves intelligence to the far reaches of the network, Evans said the triplet model has paradoxically rekindled interest in centralization.
“A lot of store operations are moving back to the cloud,” he said. “We can put most of the components of our platform into our stores but we don’t want to do that because it requires compute power and maintenance. It’s not super-efficient.”
Under the evolving model, he said, “If we can create resilient connectivity, it’s better to put more functionality into the cloud. The decision is based on latency requirements.”
The Walmart supercloud is still young, and the company is grappling with the same observability and performance management challenges that confront any company taking the cloud-native route.
“The more components and complexity the greater the challenge,” Evans said. “A given transaction can traverse everything from the consumer to the store back to the cloud and our data centers. We need to have that visibility.”
But if there’s a ground zero for the enterprise supercloud, Bentonville may be as good as it gets.