How Rightsizing Your Cloud Infrastructure Can Unveil Hidden Savings

All organizations are re-planning their budgets to meet the challenges of the current economic crisis. IT leadership is under the same pressure as every other line of business—rethinking spend commitments and reevaluating resources. This leads many IT and finance teams to use FinOps practices to self-fund cost-cutting.

One place where teams might find these savings is within their cloud infrastructure. While cloud services are great in that they can be provisioned immediately to develop new projects or test that next amazing app, they’re often purchased at incorrect sizes.

Think of it in terms of how consumers purchase the latest technology. We see the marketing and labels and think “Oohh, I do want the latest version!” However, do we think about whether or not we actually utilize all of that computing horsepower?

Okay, so judging how much you use your phone might be different from judging the efficiency of a computing cluster. But there are savings to be found by running a cloud cost management exercise called rightsizing.

Here are three steps to take to identify immediate cloud cost savings.

Step 1: Make sure your teams use correct resource types within each cloud service

Every cloud service is different (think compute, containers, storage, databases, machine learning). For instance (pun intended), on AWS EC2, instances of various types and sizes have different per-second costs. The general-purpose M-class caters to typical workloads at an affordable rate. However, users who run compute-intensive, memory-heavy, or GPU-focused workloads will pay different rates for the instance types built for them.

Using a cloud cost management platform, look back at tags and allocation to see which teams, projects, or departments are generating high VM or compute costs. Dig deeper and see which services they’re using. Ask these questions:

  • Are these services the correct ones for the workloads?
  • How much is each machine being utilized?
  • Can workloads be combined onto the same instance?
  • Are these teams getting the best rate for the instances they’ve chosen?
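
The questions above can be sketched in a few lines of Python. This is only an illustration: the rows, team tags, instance type names, costs, and the 20% utilization cutoff are all invented, and a real cost management platform would supply this data from billing and monitoring exports.

```python
from collections import defaultdict

# Hypothetical billing rows: (team tag, instance type, monthly cost in USD,
# average CPU utilization from 0 to 1). All values are made up for illustration.
ROWS = [
    ("checkout", "m-class-large", 140.0, 0.62),
    ("checkout", "gpu-xlarge", 900.0, 0.08),
    ("search", "m-class-large", 140.0, 0.55),
    ("untagged", "compute-xlarge", 310.0, 0.12),
]


def cost_by_team(rows):
    """Total monthly cost per team tag, highest spender first."""
    totals = defaultdict(float)
    for team, _instance_type, cost, _utilization in rows:
        totals[team] += cost
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def low_utilization(rows, threshold=0.2):
    """Instances whose average utilization is below the (assumed) threshold."""
    return [
        (team, instance_type)
        for team, instance_type, _cost, utilization in rows
        if utilization < threshold
    ]


if __name__ == "__main__":
    for team, total in cost_by_team(ROWS):
        print(f"{team}: ${total:.2f}")
    print("Worth a closer look:", low_utilization(ROWS))
```

Here the GPU instance is both the largest cost and the least utilized, which makes it the first candidate for the "is this the correct service for the workload?" conversation.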

Pro tip: If you’re unsure about how well your teams are utilizing their existing services, a cloud cost management platform uses data analytics to make sense of it all for you. See how Cloudability combines multiple cloud cost data points from multiple cloud providers to surface your actual cloud costs in detail.

Storage services make a difference as well

Storing data might seem simple on paper: write this to that disk, read that part of this disk, repeat. However, the choices of cloud storage, along with their speeds and media, add a bit of complexity. Computing solutions, like EC2, have built-in storage, but what if users want more volume or faster types of disk drives? Well, they’ll have to pay more. It might be worth looking at what kind of data your teams are storing to see if there are any opportunities to save by switching to cheaper storage types.

While faster SSD or NVMe storage is the go-to for workloads, users can opt for cheaper “cold” storage solutions for data they hardly touch or use. These solutions are often cheaper to run over time, but might incur costs to transfer that data from cold storage back into everyday use. While finding savings is important here, don’t cut costs in ways that threaten critical data stores that production applications or workloads need to access.
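
The trade-off between cheaper cold storage and its retrieval costs comes down to simple arithmetic. The sketch below assumes a flat per-GB monthly storage price and a flat per-GB retrieval fee; the function names and every price in the usage example are hypothetical placeholders, not any provider's actual rates.

```python
def monthly_storage_cost(size_gb, price_per_gb, retrieved_gb=0.0, retrieval_price_per_gb=0.0):
    """Storage cost for one month, plus any per-GB retrieval charges."""
    return size_gb * price_per_gb + retrieved_gb * retrieval_price_per_gb


def cheaper_tier(size_gb, retrieved_gb, hot_price, cold_price, cold_retrieval_price):
    """Return ("hot" or "cold", monthly cost) for whichever tier costs less.

    Assumes the hot tier has no retrieval fee. Ties go to the hot tier.
    """
    hot = monthly_storage_cost(size_gb, hot_price)
    cold = monthly_storage_cost(size_gb, cold_price, retrieved_gb, cold_retrieval_price)
    return ("cold", cold) if cold < hot else ("hot", hot)


if __name__ == "__main__":
    # Placeholder prices in USD per GB; substitute your provider's real rates.
    prices = dict(hot_price=0.10, cold_price=0.02, cold_retrieval_price=0.05)
    print(cheaper_tier(size_gb=1000, retrieved_gb=100, **prices))   # rarely read
    print(cheaper_tier(size_gb=1000, retrieved_gb=2000, **prices))  # read often
```

With these placeholder numbers, data that is rarely read is cheaper in the cold tier, while data that is read back frequently ends up costing more there than it would in hot storage.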

Use a cloud cost management platform to monitor read and write activity and get a sense of how much storage is actually costing your teams across your cloud infrastructure.

Every business uses its cloud services differently

Differing workloads and services will create various shapes of “efficiency”—no two businesses are alike. Without a visualization of this utilization data, it can be difficult to see if your teams are fully utilizing all of what a service might offer.

An example could be looking at a compute cluster and seeing if each individual instance is using all of its capability. Here are some key questions to answer:

  • Were these instances set up during a time of intense utilization, and are they now running idle at times?
  • Are your teams tracking usage over time to determine how much work they actually produce?

This is an opportunity for change and for savings. Using a cloud cost management platform to help determine where this waste is happening can help your technology teams determine what to shut off and what to rightsize.

Step 2: Rightsize existing services into ones that fit best

Each type of service also comes in many sizes within each family. Teams with large budgets, or teams that are new to cloud services, commonly start up services to get projects and workloads up and running without checking whether those instances are the correct size.

Your teams might find savings by ensuring that services on your infrastructure are the correct size. This means looking at various cloud services and seeing what percentage of them are being utilized. Some businesses prefer to have a goal of utilization, like 75%, which tells engineers that they’re using enough while leaving room for workload spikes.
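
A utilization goal like 75% can be turned into a sizing rule. The following is a minimal sketch, assuming a hypothetical instance family whose sizes differ only in vCPU count and that the workload's vCPU demand stays the same after resizing; the size ladder and function name are illustrative, not part of any provider's API.

```python
# Hypothetical vCPU counts for the sizes in one instance family.
SIZES = [2, 4, 8, 16, 32]


def recommend_size(current_vcpus, observed_utilization, target=0.75, sizes=SIZES):
    """Smallest size whose projected utilization stays at or below the target.

    observed_utilization is the measured fraction (0 to 1) of current_vcpus in use.
    Falls back to the largest available size if nothing satisfies the target.
    """
    used_vcpus = current_vcpus * observed_utilization
    for vcpus in sorted(sizes):
        if used_vcpus / vcpus <= target:
            return vcpus
    return max(sizes)


if __name__ == "__main__":
    print(recommend_size(16, 0.15))  # lightly used 16-vCPU instance
    print(recommend_size(8, 0.90))   # busy 8-vCPU instance
```

The same rule recommends downsizing a lightly used instance and upsizing one that runs hotter than the target, so it leaves room for workload spikes in both directions.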

When a cloud service is consistently underutilized, it doesn’t mean that the service is poor—it might just mean your teams are using the wrong size and you’re being billed a higher rate per second! Using a data platform to help your teams determine what percentage of utilization is efficient can help with rightsizing discussions.

NOTE: If you change the size of a cloud service that is covered by a committed-rate discount (for example, an AWS EC2 instance and its matching Reserved Instances), you’ll want to be ready to convert or sell those RIs!

Avoid error-prone human decision-making

While smaller businesses with smaller cloud footprints might be able to rightsize their cloud services by hand, it gets complicated at enterprise scale. Having hundreds of thousands of services to manage can make it impossible to track cloud costs and utilization every month.

Instead, use a cloud cost management platform to not only ingest and manage all cloud services, but to also surface reports that help identify which services require rightsizing. Not only does this avoid guesswork, but it utilizes actual data over time to help your technology teams determine different baselines for service utilization.

Establishing these baselines shows which services are operating efficiently and which are underperforming. This creates data-driven talking points when considering where to find savings from the IT budget. However, you might find that some services are overburdened, creating opportunities to reduce service downtime (a different kind of opportunity cost) by provisioning enough computing power to sustain those workloads.
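
As a rough illustration of a baseline, the sketch below averages utilization samples for a service and sorts it into one of three buckets. The 30% and 85% cutoffs and the function name are assumptions chosen for the example; each organization would set its own.

```python
from statistics import mean


def classify(samples, low=0.30, high=0.85):
    """Label a service by its average utilization (samples are fractions from 0 to 1)."""
    baseline = mean(samples)
    if baseline < low:
        return "underutilized"   # candidate for a smaller size or shutdown
    if baseline > high:
        return "overburdened"    # candidate for more capacity
    return "efficient"


if __name__ == "__main__":
    # Made-up utilization histories for three hypothetical services.
    history = {
        "reporting-db": [0.10, 0.15, 0.20],
        "api-cluster": [0.90, 0.95, 0.88],
        "web-frontend": [0.50, 0.60, 0.70],
    }
    for service, samples in history.items():
        print(service, classify(samples))
```

A plain average is the simplest possible baseline. Workloads with sharp spikes may be better judged on peaks or percentiles than on the mean.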

Leading cloud cost management platforms will have API functionality to connect to infrastructure and automate rightsizing, taking away risky, error-prone manual processes. This can be a great long-term initiative for engineers to take part in building programmatic solutions to rightsizing while building strong FinOps practices for your organization.

Step 3: Take a data-driven approach to cloud elasticity

The benefit of cloud services is that you can always spin up new resources to help with increased workloads. Overages and inefficiency occur when teams forget to shut those resources off. Remember when your parents would yell at you for keeping the lights on, or the front door or fridge open? No one likes waste.

Same goes for cloud services. There’s no need to keep instances on when they aren’t in use. This usage with low utilization is actually waste. Hunting for this waste manually can be quite the chore, especially within a massive, scaled cloud infrastructure. Instead, rely on a data analytics platform to help your teams identify and rightsize services to improve utilization.

The places where insights can come up can be surprising as well. In a well-tagged infrastructure running a cloud cost management platform, sometimes “zombie costs” from services left on or unattended can be detected by users operating within different teams or even locations. Democratizing how teams access cloud cost and utilization data opens up more opportunities for FinOps-minded members to seek out ways to save, react to anomalies, and generally be more vocal about cloud cost optimization.

FinOps experts let the machines do the work of analyzing and automating rightsizing by setting utilization thresholds that services should not fall below or exceed. This guidance helps teams be proactive about whether or not their infrastructures require more (or fewer) services. This is far better than over-purchasing and forgetting about things (and having them incur costs by the second!).
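
A lower utilization threshold can be checked automatically. The sketch below flags instances that have stayed under an idle floor for a full day of hourly samples; the 5% floor, the 24-hour window, the instance IDs, and the function name are all hypothetical, and the data would normally come from a monitoring export.

```python
def find_zombies(hourly_cpu, idle_below=0.05, min_idle_hours=24):
    """Return IDs of instances idle for the whole most recent window.

    hourly_cpu maps an instance ID to its hourly CPU utilization (0 to 1),
    oldest sample first. Instances with too little history are skipped.
    """
    zombies = []
    for instance_id, samples in hourly_cpu.items():
        recent = samples[-min_idle_hours:]
        if len(recent) == min_idle_hours and all(s < idle_below for s in recent):
            zombies.append(instance_id)
    return sorted(zombies)


if __name__ == "__main__":
    # Made-up monitoring data for three hypothetical instances.
    data = {
        "i-a": [0.01] * 24,           # idle all day
        "i-b": [0.50] * 23 + [0.01],  # busy until the last hour
        "i-c": [0.01] * 10,           # not enough history yet
    }
    print(find_zombies(data))
```

Requiring a full window of idle samples avoids flagging instances that are merely quiet for an hour or that were launched recently.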

There might be more opportunities for savings than you think

Workloads and applications on the cloud that serve thousands of users per day will likely see ebbs and flows of utilization. If you have the means of tracking this over time, you’ll see baselines and averages of how much individual services actually use. If these workloads are barely tapping the potential of your cloud services, you might have waste.

If you’re running a lean operation and every service is well-utilized, then great. Maybe you’ll find savings opportunities elsewhere (when was the last time you checked on how well your infrastructure is tagged?).

Before any kind of at-scale infrastructure rightsizing begins, be sure that your technology and finance teams have a cloud cost management platform or tool in place to assist in making sense of this data. Services like Apptio Cloudability help ingest cloud cost and utilization data, processing it within an analytics engine, while producing easy-to-read dashboards and reports. This creates a common language for all technologists, finance personnel, or business leaders to help make sense of cloud costs.

Get started with an Apptio Cloudability free trial.
