Empowering famous brands to create digital experiences that build customer loyalty requires a robust, innovative and scalable infrastructure, and Acquia Inc., the largest provider of Drupal hosting capabilities, knows this well.
Ranking second in the digital experience platform space, right behind Adobe Systems Inc., Acquia enables marketers, developers and IT operations teams to rapidly compose and deploy digital products and services with the goal of engaging customers, enhancing conversions, and helping businesses grasp their competitive advantage.
To support these evolving, high-demand services, Acquia realized that its legacy infrastructure would no longer be sufficient and decided to move to the open-source container orchestration system Kubernetes. But while this accelerated software development, it also increased the complexity of the platform, creating the need for automatic, real-time optimization and observability.
“We leverage StormForge to enable us to ‘right-size’ applications for performance, provide us cost benefits, allocate what you need when you need it for our customers,” said Charley Dublin, vice president of product management at Acquia, in an interview with Dave Vellante, host of theCUBE, SiliconANGLE Media’s live streaming studio, during StormForge’s recent “Solving the Kubernetes Complexity Gap by Optimizing With Machine Learning” event. “And that at-scale piece is a critical part.”
The need for a world-class platform
Acquia’s main open DXP claims to draw on the largest independent developer community for the Drupal content management system, allowing the company to fuse disparate data and marketing technologies while automating digital experiences. Acquia has more than 4,000 customers, including a range of well-known enterprises, such as IBM, Vodafone Group, Pinterest, NBC Publishing, Whole Foods Market, Panasonic, BBC Worldwide and Johnson & Johnson.
The company’s first objective is to provide customers with a better Drupal experience, which entails capabilities that make the system itself easier to use, according to Dublin.
“That has to run on a world-class platform, it has to be the most performing, it has to be the most secure, it needs to be flexible to enable customers to run Drupal however they want to. And so that involves the ability to support thousands of different kinds of modules that come out of the community,” he explained.
Acquia’s choice of Kubernetes for its agility mirrors that of many modern companies. Use of the container orchestration tool has increased sharply in the past few years. One study projects that the global application container market will reach USD 8.2 billion by 2025, a 26.5% compound annual growth rate from 2019.
Another study estimates 89% of business organizations now use container orchestration in production or pre-production environments, with 77% naming it as central to their digital transformation strategy. However, the research also found that almost all (94%) of the organizations using Kubernetes are experiencing challenges. High on the list of problems is that Kubernetes is notoriously hard to fine-tune. With mission-critical workloads at stake, engineers err on the side of caution and run up large bills provisioning cloud resources they don’t need.
Enabling intelligent business trade-offs
The key technology Acquia chose to solve its Kubernetes complexities was machine learning, as provided by StormForge, which uses the technology to automatically manage Kubernetes resources at scale to improve efficiency and performance.
In an interview with theCUBE last February, Matt Provo, founder and chief executive officer of StormForge, explained that the company started out as a lab focused on building real artificial intelligence that connected to solving real business problems, touting several staff members with applied math doctorates who work on machine learning.
The actual use-case application to solving Kubernetes issues came after the team had been working together for nearly four years.
“DevOps startup StormForge’s newest offering merges advanced machine learning with observability tools to provide IT Ops and developer teams with real-time configuration recommendations to ensure operational optimization,” said Charlotte Dunlap, principal analyst of application platforms, enterprise technology and services at GlobalData PLC, in a report released in February.
The core of StormForge’s solution is Optimize Pro, which provides an experimentation-based approach to optimization. It gives in-depth application insights through a process of rapid experimentation and scenario analysis using machine learning in non-production environments.
The process follows five steps. First, the solution scans the business’s Kubernetes cluster to automatically find all the configurable parameters and record their current values as a baseline for comparison. Second, it optimizes for multiple objectives, such as application response time or cost, enabling real business trade-offs between competing goals.
After this, load testing is used to place a realistic production load on the application during the optimization process. Notably, load tests can be performed using StormForge performance testing or with a third-party load testing tool. The fourth step is where machine learning comes in. StormForge uses the technology to analyze all the results and deliver a new set of parameters for the next trial. Lastly, the StormForge controller repeats the cycle for the number of trials specified.
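The five steps above can be sketched as a simple trial loop. This is an illustration only, not StormForge’s implementation or API: every function name, parameter and number below is hypothetical, the load test is replaced by a toy formula, and the machine learning step is replaced by a plain random search around the best configuration found so far.

```python
import random

def discover_baseline():
    # Step 1 (hypothetical values): configurable parameters found by
    # scanning the cluster, with current values kept as the baseline.
    return {"cpu_millicores": 1000, "memory_mib": 2048}

def run_load_test(params):
    # Step 3 stand-in: a toy formula instead of a real load test.
    # More CPU and memory lower latency but raise cost.
    latency_ms = 50.0 + 40000.0 / params["cpu_millicores"] + 20000.0 / params["memory_mib"]
    cost = params["cpu_millicores"] * 0.03 + params["memory_mib"] * 0.005
    return latency_ms, cost

def score(latency_ms, cost, latency_weight=0.5):
    # Step 2: fold two competing objectives into one number (lower is better).
    # The weight expresses the business trade-off between speed and spend.
    return latency_weight * latency_ms + (1.0 - latency_weight) * cost

def propose_next(best_params, rng):
    # Step 4 stand-in: random perturbation, NOT a real ML model.
    return {
        "cpu_millicores": max(100, best_params["cpu_millicores"] + rng.randint(-200, 200)),
        "memory_mib": max(128, best_params["memory_mib"] + rng.randint(-256, 256)),
    }

def optimize(num_trials, seed=0):
    # Step 5: repeat the cycle for the number of trials specified.
    rng = random.Random(seed)
    baseline = discover_baseline()
    best_params = baseline
    best_score = score(*run_load_test(baseline))
    history = [(baseline, best_score)]
    for _ in range(num_trials):
        candidate = propose_next(best_params, rng)
        candidate_score = score(*run_load_test(candidate))
        history.append((candidate, candidate_score))
        if candidate_score < best_score:
            best_params, best_score = candidate, candidate_score
    return best_params, best_score, history

if __name__ == "__main__":
    params, best, trials = optimize(num_trials=20)
    print(f"baseline score: {trials[0][1]:.2f}")
    print(f"best score after {len(trials) - 1} trials: {best:.2f} with {params}")
```

Changing `latency_weight` shifts the balance between response time and cost, which is the kind of trade-off the article describes. A real system would replace the random proposal step with a model that learns from all previous trial results.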
Acquia benefits from this pre-production optimization by fully understanding system behavior in order to boost efficiency. With ML analysis, its team can easily identify trade-offs and make the right business decisions, providing a sustainable, scalable and operationally efficient way to manage performance and cost, according to Dublin.
“We support customers leveraging Drupal in every industry; globally, we do business in 30 different countries, and so what we have is a very wide range of applications and consumer and consumption models,” he explained. “And we felt that leveraging StormForge would put us in a position where we’d be able to ‘right-size’ resources to those different kinds of applications, essentially letting the platform align to how customers want to operate their applications.”
Acquia initially tried to handle the new structure internally but soon gave up as the number of applications grew immensely.
“As you start getting to scale from a few apps to hundreds of apps, to certainly across our fleet of tens of thousands of applications, you really need something that leverages machine learning. You really need a technology that’s integrated well within AWS, and StormForge provides that solution,” Dublin stated.
Observability informs decision-making
In addition to Optimize Pro, which is aimed at non-production environments, Acquia has participated in beta trials for StormForge’s newly released Optimize Live observability technology. It draws on data that companies are already collecting to enable intelligent optimization within production environments. The product continuously observes the data flowing through Kubernetes tools and serves recommendations back to the user, who can then either let the changes be patched and deployed automatically or deploy them manually.
Optimize Live also allows engineers to run checks and balances between pre-production and production environments to make sure that actual application performance and deployment levels are in line with service-level objectives and agreements, as well as business goals, according to Provo.
“My vision has been for us to be able to close the loop between data coming out of pre-production and the associated optimizations and data coming out of a production environment and our ability to optimize that,” Provo told theCUBE when the release of Optimize Live was announced.
For Acquia, the observability tool was an important improvement.
“Optimize Live allows us to do that [observability] in real time, to make policy decisions across our fleet on what’s the right trade-off between performance, cost [and] other parameters,” Dublin said. “Again, it informs our decision-making and our management of our platform. That would be very, very difficult otherwise.”
Resource optimization — sometimes referred to as “cloud rightsizing” — is a premise that has been around for quite some time, with varying levels of traction and adoption, according to James Sanders, research analyst at 451 Research, part of S&P Global Market Intelligence.
“From a product side, doing this for VMs can be achieved with some light automation, monitoring and simple heuristics. Rightsizing for Kubernetes clusters is quite a bit more involved, as is often the case with Kubernetes, so an ML-informed approach could aid in addressing that complexity,” he said.
Rather than replacing experts with technology, Acquia adopts a human-in-the-loop strategy. This means that the company continues to have specialists taking care of the applications but not each app individually. The team can use ML insights to make decisions about multiple applications simultaneously.
“Without StormForge we’d have to do massive data aggregation. We’d have to have machine learning and additional infrastructure to manage to derive this information, and that is not our core business,” Dublin said.
Looking forward, Acquia plans to strengthen its involvement with StormForge products to drive even more automation and decision-making as it expands and moves more and more customers to its new platform.
“We’re going to uncover use cases, different challenges as we go,” Dublin stated. “I think it’s a learning process for both sides, but I think it’s been successful so far and has a lot of potential.”