V2 Digital: Accelerating the Digital Next

Cloud Cost Optimisation - An Architect’s Guide to Operational Efficiency (Part 1)

Shane Baldacchino
May 16, 2024

Cloud and cost can be quite a polarising topic. Do it right, and you can run super lean, drive down the cost to serve, and ride the cloud innovation train. Do it wrong by treating the public cloud like a data centre, and your costs could be significantly higher than on-premises.

I have been fortunate to work for some of Australia's largest websites and two of the major public cloud vendors. When it comes to architecture, I have seen the exceptional and the questionable.

Just as a car is valued for its economy, there are often trade-offs. Low consumption (litres or kilowatt-hours per 100 km) often goes hand in hand with low performance.

Our goal should be:

How can we increase the efficiency of our architecture without compromising other facets such as reliability, performance and operational overhead?

Broadly speaking, yes, you should be spending less.

In this multi-part series, we’re going to cover three main domains:

  1. Operational optimisation

  2. Infrastructure optimisation

  3. Architectural optimisations

With such a broad range of optimisations, hopefully, something you read here will resonate with you and provide some meaningful cost-saving initiatives that you can execute in your environment. I want to show you where the opportunities for savings exist.

Public cloud is full of new architectural levers for us builders, which is amazing but can be daunting. With the hyperscale providers releasing north of 2,000 updates per year (5.x per day), we all need to pay attention and constantly climb this cloud maturity curve.

Maths related to the public cloud can be complex at times. Services may have multiple pricing dimensions. The key is to find those cost savings and invest in them.


Cloud – A new dimension

The public cloud has brought us all a new dimension of flexibility and so many more building blocks. The question is, how are you putting them together?


Image 1: Are you a Lego master?

When we, as builders, talk about architecture, we will often architect around a few dimensions, some more important than others, depending on your requirements.

Commonly we will architect for availability, performance, security and function, but I would like to propose a new domain for architecture, and that is economy.

When you’re building your systems, you need to look at the economy of your architecture because today, in 2024, you have a great deal of control over it. New frameworks, tools, technologies, hosting platforms… all new, new, new.

Lifecycle Management

Your goal should be to trial and change the way a system is built during its own lifetime. As architects and developers, we must move away from this model of heavy upfront design or some finger-in-the-air predictions of what capacities a solution needs.

Instead, embrace the idea of radical change during an application lifecycle funded by cost savings.

Yes, there are degrees to which you can do this depending on whether you built the system yourself or you’re using COTS (Commercial Off-The-Shelf) software, but I will walk through options that you can apply to your existing stacks.

How Are You Keeping Score?

Even with COTS, there are options. Have you noticed the appearance of new levers in the form of updates? Do you have a mechanism in place to be kept aware of updates? If you do, then that's great, but if you don’t, let me share with you two patterns we use at V2 Digital.

Both feed updates into Slack or Teams: either via RSS, or via serverless compute with a webhook into your messaging platform of choice.

Exposing your teams to the latest updates can often be a cue to alter your architecture whilst upskilling your internal team and building their capability.
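To make the second pattern a little more concrete, here is a minimal sketch of what such a serverless function could look like, using only the Python standard library. It is not the code we run at V2 Digital. The `handler` signature, the `feed_url` and `webhook_url` event keys and the message layout are all assumptions made for illustration. The `{"text": ...}` payload is the simple form accepted by Slack incoming webhooks; Teams expects a different payload shape, so you would need to adapt `build_message` for it.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss_items(rss_xml: str, limit: int = 5) -> list[dict]:
    """Return the first `limit` items (title and link) from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.findall("./channel/item")[:limit]:
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items


def build_message(items: list[dict]) -> dict:
    """Build a plain-text payload in the {"text": ...} form Slack webhooks accept."""
    lines = [f"- {item['title']}: {item['link']}" for item in items]
    return {"text": "Latest cloud updates:\n" + "\n".join(lines)}


def fetch_feed(feed_url: str) -> str:
    """Download the raw RSS document."""
    with urllib.request.urlopen(feed_url, timeout=10) as response:
        return response.read().decode("utf-8")


def post_to_webhook(webhook_url: str, payload: dict) -> int:
    """POST the JSON payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handler(event, context):
    """Hypothetical entry point, run on a schedule by your serverless platform.

    The event keys "feed_url" and "webhook_url" are assumptions for this sketch.
    """
    items = parse_rss_items(fetch_feed(event["feed_url"]))
    return post_to_webhook(event["webhook_url"], build_message(items))
```

This sketch reposts the same items on every run. A real deployment would need to remember which items it has already sent, for example by storing the last seen link.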


Image 2: Slack has a built-in RSS feed parser, making life easy for the technologists at V2

The Basics

With the right approach, some sizeable cost savings can be made to reduce your cloud bill. 


Image 3: To pass "GO" you must follow strategic moves for Cloud Cost Optimisation.

The first step is to go back to basics, to get the fundamentals right from the start. These are fundamentals in cloud and, to a degree, software development. 

Understand your baseline.

You can’t improve what you can’t measure. Do you know what your per-transaction cost is? Do you know what your cost to serve is?
If you do, well done, but if you don’t, then how can you improve?

Measuring The Cost To Serve

There are three different approaches to determining this baseline:

When you have this information, you can ask the question:

What’s my average transaction flow versus my average infrastructure cost? Then you can put it up in the corner and say, “Development team, we need to optimise”.

This becomes your measure, and you need to make this relevant and tangible to your business stakeholders for organisational buy-in.
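As a rough illustration of this measure, the sketch below divides infrastructure cost by transaction count for two periods and reports the change. The `Period` type, the function names and all of the figures are invented for the example. What counts as a "transaction" is something you need to define for your own business.

```python
from dataclasses import dataclass


@dataclass
class Period:
    label: str
    infrastructure_cost: float  # total cloud spend for the period, in dollars
    transactions: int           # business transactions served in the period


def cost_per_transaction(period: Period) -> float:
    """Cost to serve: infrastructure spend divided by transactions served."""
    if period.transactions <= 0:
        raise ValueError("transactions must be positive")
    return period.infrastructure_cost / period.transactions


def change_percent(before: Period, after: Period) -> float:
    """Percentage change in cost per transaction (negative means cheaper)."""
    old = cost_per_transaction(before)
    new = cost_per_transaction(after)
    return (new - old) / old * 100


if __name__ == "__main__":
    # Invented figures, for illustration only.
    march = Period("March", infrastructure_cost=42_000.0, transactions=12_000_000)
    april = Period("April", infrastructure_cost=45_000.0, transactions=15_000_000)
    for period in (march, april):
        print(f"{period.label}: ${cost_per_transaction(period):.4f} per transaction")
    print(f"Change: {change_percent(march, april):+.1f}%")
```

In this made-up example the total bill goes up while the cost per transaction goes down, which is why the per-transaction figure is a better thing to put in front of stakeholders than the raw bill.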


Image 4: Do you have a cost dashboard?

Operational Optimisation

Another consideration is how you are paying for public cloud. Using a credit card in a PAYG (Pay As You Go) model might be a great way to get started, but it can be an expensive way to consume both Microsoft Azure and Amazon Web Services.

Here are some approaches to investigate:

In my experience, you need to move away from paying on demand because this is the most expensive way to leverage public cloud. Savings compared with on-demand pricing can range from 15% to 90%. Typically, discounts apply either for commitment, giving cloud providers certainty, or, in the case of Spot, for your ability to leverage idle, unused resources.

While not groundbreaking, ‘Reserved Instances’ and ‘Savings Plans’ allow you to minimise the cost of traditional architectures. My next piece of wisdom is to have a ‘Reserved Instance / Savings Plan’ percentage target.

Some of the best organisations I have seen in the past have had up to 80% of their IaaS resources covered by ‘Reserved Instances / Savings Plans’. If you don’t have a target, I recommend you look into this.
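A coverage target is easy to track once you can count covered and total instance-hours. The sketch below is one simple way to express it. The function names are hypothetical, and the 80% default simply reuses the figure mentioned above rather than being a recommendation for your workload. Your provider's billing tools report coverage directly; this only shows the arithmetic.

```python
def coverage_percent(covered_hours: float, total_hours: float) -> float:
    """Share of instance-hours covered by Reserved Instances / Savings Plans."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return covered_hours / total_hours * 100


def coverage_gap(covered_hours: float, total_hours: float,
                 target_percent: float = 80.0) -> float:
    """Extra instance-hours that would need cover to reach the target (0 if met)."""
    target_hours = total_hours * target_percent / 100
    return max(0.0, target_hours - covered_hours)


if __name__ == "__main__":
    # Invented figures: 1,000 instance-hours in the period, 600 of them covered.
    print(f"Coverage: {coverage_percent(600, 1000):.0f}%")
    print(f"Hours short of target: {coverage_gap(600, 1000):.0f}")
```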

But before you make a purchase, understand your workload. Understand the ebbs and flows of your baseline load.

The rule of thumb is to assess a workload for three months and right-size accordingly during that time.

Combine Azure Monitor or Amazon CloudWatch with Azure Advisor or AWS Trusted Advisor to fine-tune your application.
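One way to turn those three months of observations into a number is to look for the usage level that is almost always present, and treat only that as a candidate for commitment. The sketch below is a heuristic of my own for illustration, not a provider formula. The `baseline_level` name, the 90% default and the sample data are all assumptions, and in practice the hourly figures would come from an export of your monitoring data.

```python
def baseline_level(hourly_usage: list[int], always_on_percent: int = 90) -> int:
    """Largest usage level met or exceeded in at least `always_on_percent` of hours.

    `hourly_usage` holds one observation per hour, e.g. running instance counts.
    """
    if not hourly_usage:
        raise ValueError("hourly_usage must not be empty")
    if not 0 < always_on_percent <= 100:
        raise ValueError("always_on_percent must be in (0, 100]")
    ordered = sorted(hourly_usage, reverse=True)
    # Integer ceiling of (percent * hours / 100), avoiding float rounding.
    required_hours = (always_on_percent * len(ordered) + 99) // 100
    # ordered[k - 1] is met or exceeded in at least k of the observed hours.
    return ordered[required_hours - 1]


if __name__ == "__main__":
    # Invented sample: ten hourly instance counts with a short peak and a dip.
    usage = [4, 4, 4, 4, 6, 6, 8, 10, 4, 2]
    print("Baseline to consider committing to:", baseline_level(usage))
    print("Level present in every hour:", baseline_level(usage, 100))
```

Anything above the baseline is the ebb and flow, which is better served by on-demand, Spot or autoscaling than by a commitment.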

Optimise The Humans – High Value vs. Low Value

How much time do you spend thinking about labour costs, and do you include them in your cost to serve? You hire people, they do ‘stuff’. The thing is, cloud practitioners can be an expensive resource.

To prove my point, according to SEEK, the average Database Administrator (DBA) in Australia earns $105,000 AUD annually.

This is just the average DBA, and none of us here would ever work with just an average DBA, so we have established that people have a cost. But let’s think about what this cost actually means.

Let’s look through the lens of something DBAs do often: a minor database engine upgrade. This is important as we should be upgrading our databases on a regular basis (security, features, performance).

Let’s compare Amazon RDS, which is a managed service for running relational databases in the cloud, with running a database engine on IaaS.

Self-Managed (IaaS): 8 hours minimum

  1. Backup primary

  2. Backup secondary

  3. Backup server OS

  4. Assemble upgrade binaries

  5. Create a change record

  6. Create rollback plan

  7. Rehearse in development

  8. Run against staging

  9. Run against production standby

  10. Verify

  11. Failover

  12. Run in production

  13. Verify

Amazon RDS: 1 hour

  1. Verify update window

  2. Create a change record

  3. Verify success in staging

  4. Verify success in production

What’s the administrative effort of a minor database engine upgrade?

While managed services may appear more expensive on paper, the administrative cost of performing undifferentiated heavy lifting is far greater. I am saving time, and I will receive logs and an audit trail that I can attach to my change record for auditability.
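To put a rough dollar figure on that difference, the sketch below converts the salary quoted earlier into an hourly rate and prices the 8-hour and 1-hour efforts. The 1,824 working hours per year (48 weeks of 38 hours), the number of upgrades per year and the function names are my assumptions for illustration. The salary figure also excludes on-costs, and this does not account for any price premium of the managed service itself.

```python
ANNUAL_SALARY_AUD = 105_000   # average DBA salary quoted in the text
WORK_HOURS_PER_YEAR = 1_824   # assumption: 48 working weeks x 38 hours


def hourly_rate(annual_salary: float = ANNUAL_SALARY_AUD,
                work_hours_per_year: float = WORK_HOURS_PER_YEAR) -> float:
    """Approximate labour cost per hour, excluding on-costs."""
    return annual_salary / work_hours_per_year


def upgrade_labour_cost(hours: float) -> float:
    """Labour cost of one upgrade that takes `hours` of DBA time."""
    return hours * hourly_rate()


def annual_saving(upgrades_per_year: int,
                  self_managed_hours: float = 8,
                  managed_hours: float = 1) -> float:
    """Labour freed up per year by the managed approach, in AUD."""
    per_upgrade = (upgrade_labour_cost(self_managed_hours)
                   - upgrade_labour_cost(managed_hours))
    return upgrades_per_year * per_upgrade


if __name__ == "__main__":
    print(f"Hourly rate: ${hourly_rate():.2f}")
    print(f"Self-managed upgrade: ${upgrade_labour_cost(8):.2f}")
    print(f"Amazon RDS upgrade: ${upgrade_labour_cost(1):.2f}")
    # Hypothetical estate: 40 upgrades a year across all databases.
    print(f"Labour freed up per year: ${annual_saving(40):.2f}")
```

Since 8 hours is a minimum for the self-managed path, treat the result as a lower bound on the labour involved.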

You may say to me, well, we’re going to spend that money anyway, these people are not going away.

I would say that’s great, but you could invest that chunk of time into something of greater business value, like tuning your database (query plans, index optimisation). This is a better use of a DBA’s time with a higher value return.

Summary

Public Cloud brings a multitude of opportunities to builders and architects. It provides you with a raft of new levers that you can pull and twist to architect for the new world.

Contact us at V2 Digital and let us help you and your team climb the cloud maturity curve and achieve the same or better outcome at a lower cost.
Architectures can and should evolve, but they need to make sense. What is the cost of change?

Join me in the next part of this multi-part series as we explore Infrastructure and Architectural optimisations you can make.


In the spirit of reconciliation V2 Digital acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community.
We pay our respect to their Elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.