Balancing application capacity and performance without either degrading performance or over-provisioning (and overspending on) infrastructure is hard. Engineers must monitor and control too many moving pieces to strike that balance between performance and needed infrastructure capacity. The current model of manually configuring autoscaling across different layers is no longer sufficient or scalable, and it is ripe for a smarter approach.
The Magalix solution scales Kubernetes pods, containers, and nodes based on multiple factors, such as predicted workloads, dependencies between microservices, and available capacity, to keep your applications within their performance goals with the least amount of resources. For example, if a web application's API latency is set as the key performance indicator (KPI), with a target of no more than 800 milliseconds, Magalix AI optimizes pods, containers, and nodes to make sure the application always meets that goal with the least amount of resources (CPU, memory, and I/O).
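To make the KPI-driven idea concrete, here is a minimal sketch of the decision logic described above. It is a hypothetical illustration, not the Magalix implementation: the function name, thresholds, and replica bounds are all assumptions; only the 800 ms latency KPI comes from the example.

```python
# Hypothetical sketch of KPI-driven scaling: add capacity when predicted
# latency threatens the KPI, reclaim it when there is comfortable headroom,
# otherwise leave the deployment alone. Not the Magalix API.

KPI_LATENCY_MS = 800  # target from the example: API latency <= 800 ms

def scaling_action(predicted_latency_ms: float, replicas: int,
                   min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return the new replica count for a predicted latency."""
    if predicted_latency_ms > KPI_LATENCY_MS and replicas < max_replicas:
        return replicas + 1   # KPI at risk: add capacity ahead of time
    if predicted_latency_ms < 0.5 * KPI_LATENCY_MS and replicas > min_replicas:
        return replicas - 1   # plenty of headroom: reclaim resources
    return replicas           # within goal: no change

print(scaling_action(950.0, replicas=3))  # over the KPI   -> 4
print(scaling_action(300.0, replicas=3))  # well under it  -> 2
print(scaling_action(700.0, replicas=3))  # within goal    -> 3
```

The point of the sketch is that the operator states a performance goal once, and the replica count becomes a derived quantity rather than something tuned by hand.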
Magalix AI understands the application's run-time architecture, analyzes the impact of workloads on resource needs, and proactively scales the pods, containers, and nodes of Kubernetes clusters up or down.
The current practice of setting autoscaling rules and manually updating them every few days or weeks is no longer scalable and is overdue for disruption.
Many moving parts and factors can impact the scale of applications and infrastructure: users' workloads, evolving application architecture, and the plethora of VM sizes that engineers must select from. These factors move at different velocities and are touched by different teams. If these resources are not managed at the proper frequency and with the proper precision, the consequences hit both your customers and your business: angry customers due to sub-optimal performance, angry management due to the rising cost of running the business, and contentious team interactions over resource control and management.
Even though the current autoscaling mechanisms and policies have been in use almost since the inception of cloud computing, many teams still struggle to keep up with proper performance management. Additionally, many companies are now speaking openly about cloud infrastructure spending that has, in many cases, grown beyond control.
Finding that balance between performance and resources is a continuous process that consists of: (1) evaluating users' workload patterns, (2) understanding the impact of these workloads on different components, (3) identifying potential resource bottlenecks for each component (or microservice), and (4) updating the scale of these components and the underlying infrastructure. It is not possible to do this with current tools at a proper frequency, not even once a week.
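The four steps above can be sketched as a loop. Everything below is an assumed, toy structure for illustration (the stub functions, the linear load-to-CPU model, and the 0.8 utilization threshold are all invented), but it shows why the process is continuous: each pass re-derives the scale from fresh metrics.

```python
# Toy sketch of the four-step balancing loop; stubs stand in for real analysis.

def evaluate_workload_patterns(metrics):
    # Step 1: summarize each component's workload, e.g. mean requests/sec
    return {c: sum(v) / len(v) for c, v in metrics.items()}

def impact_on_components(patterns):
    # Step 2: map workload to estimated CPU utilization (toy linear model)
    return {c: load * 0.01 for c, load in patterns.items()}

def find_bottlenecks(utilization, threshold=0.8):
    # Step 3: components predicted to exceed the utilization threshold
    return [c for c, u in utilization.items() if u > threshold]

def update_scale(replicas, bottlenecks):
    # Step 4: add one replica to each bottlenecked component
    return {c: n + 1 if c in bottlenecks else n for c, n in replicas.items()}

metrics = {"api": [90, 110, 100], "worker": [20, 30, 25]}
replicas = {"api": 2, "worker": 1}
patterns = evaluate_workload_patterns(metrics)
bottlenecks = find_bottlenecks(impact_on_components(patterns))
print(update_scale(replicas, bottlenecks))  # {'api': 3, 'worker': 1}
```

Running this once is easy; the hard part the article points at is running it continuously, per component, as workloads and architecture change.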
At Magalix we are on a mission to reinvent how application scalability is managed. Because scalability must be managed across the whole run-time stack in a much faster and more precise way, we decided to decompose the problem into its basic components and make the above-mentioned process AI-driven, augmenting developers' brain power without limiting their control over the stack.
The current approaches to managing application performance and scalability mostly depend on setting autoscaling rules. The Kubernetes community has done a good job of providing many options to scale pods and nodes via the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler (CA). But these require a lot of legwork and continuous monitoring of performance and resource-utilization metrics.
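For reference, the core of the HPA's documented behavior is a single target-tracking formula: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below implements that formula; the "legwork" the article mentions is choosing and continually re-tuning the target value by hand.

```python
# The Kubernetes HPA's documented scaling rule:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
from math import ceil

def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    return ceil(current_replicas * current_metric / target_metric)

# 4 replicas averaging 90% CPU against a 60% target -> scale out to 6
print(hpa_desired_replicas(4, 90.0, 60.0))  # 6
# 4 replicas averaging 30% CPU against a 60% target -> scale in to 2
print(hpa_desired_replicas(4, 30.0, 60.0))  # 2
```

Note that the formula is purely reactive: it responds to the metric's current value, which is exactly the gap a predictive approach tries to close.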
Magalix uses an approach that is much closer to what actually takes place: the application's usage, i.e. its workloads, and its consumption of different resources, such as memory, CPU, and I/O.
Magalix autoscaling cycle
In a nutshell:
- After connecting a cluster to the Magalix backend, different container- and infrastructure-level metrics are observed to analyze their significance.
- Magalix AI starts to build predictive models of significant metrics to discover any repeatable patterns that may impact performance and resource utilization. For example, if an application has resource spikes twice a day, Magalix AI models start to make sense of how these spikes impact the utilization of CPU and memory inside the components observed to change significantly at peak times.
- Magalix applies this impact analysis across the whole application to identify the areas requiring resource-allocation adjustments.
- Next, a scalability decision is generated, stamped with a future execution time along with the required resource adjustments for each pod and container. These decisions are further analyzed and synthesized to identify the impact on nodes. More decisions to scale nodes might be generated as well.
- If the auto-pilot is turned on, the Magalix agent executes these decisions. And the process repeats!
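A distinctive detail in the cycle above is that decisions are stamped with a future execution time. The record below is a hypothetical sketch of what such a decision might carry; the field names, units, and two-hour lead time are invented for illustration.

```python
# Hypothetical shape of a scalability decision "stamped with a future
# execution time"; not the actual Magalix data model.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ScalingDecision:
    target: str            # pod/container/node the decision applies to
    cpu_millicores: int    # adjusted CPU request
    memory_mib: int        # adjusted memory request
    execute_at: datetime   # future time when the agent should apply it

def decide(now: datetime, target: str, cpu: int, mem: int,
           lead_time_hours: int = 2) -> ScalingDecision:
    # The decision is scheduled ahead of the predicted spike,
    # not issued in reaction to it.
    return ScalingDecision(target, cpu, mem,
                           now + timedelta(hours=lead_time_hours))

now = datetime(2019, 1, 1, 6, 0)
d = decide(now, "checkout-pod", cpu=500, mem=512)
print(d.execute_at)  # 2019-01-01 08:00:00
```

Separating the decision from its execution is what lets the agent apply node-level changes (which are slow) before the workload actually arrives.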
The main idea behind Magalix is to automate some of the repeatable tasks with the help of AI to make scalability decisions for pods, containers, and nodes. Below is a high-level diagram explaining how Magalix automates scalability inside a Kubernetes cluster. To kick-start this workflow, the Magalix agent pod must be installed in the target cluster. The agent performs three main tasks: (1) sending resource metrics to the backend, (2) executing scalability decisions, and (3) providing relevant feedback data to the Magalix backend. For more detail about the Magalix end-to-end autoscaling workflow, please read the detailed description here.
Magalix high-level workflow
- The Magalix system and components do not have access to any of your application's data or code. Our system works at the operating-system level to read the consumption of CPU, memory, and I/O.
- Scalability decisions are proactive, i.e. future decisions, usually a few hours out. This is by design, to avoid being reactive in our decisions.