Is your software choice responsible for your high cloud costs?

Here at Outer Loop, we often develop new models to answer specific transport-related questions, and we always try to balance the sophistication of our models against the reality that they will be run on our clients’ own hardware, consuming both time and physical resources. We also know that modelling projects often come with a time crunch, and modellers appreciate being able to run as many ‘what-if?’ scenarios as possible within those constraints.

Even the most efficient code still takes time to run. This is where cloud computing comes into the picture, offering a solution that modelling teams increasingly turn to. Cloud-based deployment of model simulations can bring significant relief in these situations, helping to alleviate the bottlenecks we are all too familiar with. The obvious benefit is that resources can be scaled up and down quickly to meet atypical demand for simulation runs, all at a much lower cost than purchasing and maintaining in-house hardware for those rare occasions.

However (and there’s always a however), it’s important to note that not all cloud options are equal in terms of pricing. The internet is replete with cautionary tales of poorly chosen cloud architectures leading to exorbitant costs. Competent cloud computing architects and/or inquisitive technical staff are needed to navigate these challenges and identify the optimal cloud instances for their models, avoiding unnecessary expenses on unused RAM, storage, or CPU power.

Let us illustrate this with a fairly simple example - using a run-of-the-mill cloud instance to run a stock-standard trip-based traffic assignment model (which needs around 16 cores and 32 GB of RAM). Prices can vary tremendously even with those requirements fixed, as cloud providers offer a virtually unlimited mix of CPU generations and configurations; AWS alone has 19 instance classes with this exact configuration, ranging in price from US$0.40 to US$1.00 per hour. We are going to focus on just four of the more recent instance types that should provide excellent performance (on-demand prices in US$/h; a sketch for pulling these prices programmatically follows the table).

Instance name   Architecture   CPU generation                Linux ($/h)   Windows ($/h)
c6g.4xlarge     ARM            Graviton 2                    0.544         N/A
c7g.4xlarge     ARM            Graviton 3                    0.580         N/A
c6i.4xlarge     x86            3rd Gen Intel Xeon Scalable   0.680         1.416
c7i.4xlarge     x86            4th Gen Intel Xeon Scalable   0.714         1.504
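
For teams comparing more than a handful of instances, these on-demand prices can also be pulled programmatically. Below is a minimal sketch using the AWS Price List API via boto3; the filter values shown (region, tenancy, and so on) are illustrative and should be adjusted to your own setup.

```python
import json
import boto3

# The Price List API endpoint only exists in a few regions; us-east-1
# works regardless of where your instances will actually run.
pricing = boto3.client("pricing", region_name="us-east-1")

resp = pricing.get_products(
    ServiceCode="AmazonEC2",
    Filters=[  # illustrative filter values - adjust to your own setup
        {"Type": "TERM_MATCH", "Field": "instanceType", "Value": "c7g.4xlarge"},
        {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
        {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
        {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
        {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
        {"Type": "TERM_MATCH", "Field": "location", "Value": "US East (N. Virginia)"},
    ],
)

for item in resp["PriceList"]:  # each entry is a JSON string
    product = json.loads(item)
    for term in product["terms"]["OnDemand"].values():
        for dim in term["priceDimensions"].values():
            print(product["product"]["attributes"]["instanceType"],
                  dim["pricePerUnit"]["USD"], "per", dim["unit"])
```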

Two clear patterns jump out of this table: x86 machines are at least 23% more expensive than their ARM counterparts, and Windows instances are more than twice as expensive as their Linux counterparts. Those numbers by themselves don’t actually help us decide - if an x86 machine is 30% faster, it’s absolutely worth the extra 23% cost. To answer the fundamental question of which would be the most cost-effective way of running our model, we need to actually benchmark each instance running the real-world workload that we want to scale up.

To do this, we used the Arkansas Statewide model, which we converted to AequilibraE a few years ago, running assignment to convergence (relative gap of 1e-5) for all four of the model’s time periods. This model has over 6,400 zones, 245,000 nodes, and over 300,000 links (most of them bidirectional), with three different vehicle classes to be assigned. The model is not extremely congested, so the number of iterations per assignment stays under 100, making it a manageable task for an experiment of this nature.
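
For the curious, the sketch below shows roughly what one such assignment run looks like with AequilibraE’s Python API, simplified to a single demand matrix and vehicle class. The project path, matrix and core names, and network field names are placeholders, not the actual Arkansas model setup, which assigned three vehicle classes across four time periods.

```python
import time
from aequilibrae import Project
from aequilibrae.paths import TrafficAssignment, TrafficClass

# Placeholder project path and matrix/field names
project = Project()
project.open("/path/to/arkansas_model")
project.network.build_graphs()

graph = project.network.graphs["c"]  # "c" = the car mode graph

demand = project.matrices.get_matrix("demand_am")
demand.computational_view(["matrix"])  # core name is a placeholder

assig = TrafficAssignment()
assig.set_classes([TrafficClass("car", graph, demand)])
assig.set_vdf("BPR")
assig.set_vdf_parameters({"alpha": 0.15, "beta": 4.0})  # illustrative values
assig.set_capacity_field("capacity")
assig.set_time_field("free_flow_time")
assig.set_algorithm("bfw")
assig.max_iter = 100       # iteration cap mentioned above
assig.rgap_target = 1e-5   # convergence criterion used in the benchmark

t0 = time.perf_counter()
assig.execute()
print(f"Run time: {time.perf_counter() - t0:.0f} s")
```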

We should preface these results by saying that we were so surprised by them that we re-ran the experiment a few days apart to ensure it was not a fluke. We also repeated the same experiment on instances with 16 vCPUs and 64 GB of memory, and the pattern presented below was nearly identical. As the results below clearly show, the latest generation of x86 CPUs is 41% more expensive per run than its ARM counterpart, up from a 12% difference in the previous generation. It is also impressive to note how quickly the ARM-based instances are improving: from 10% slower in the previous (6g) generation to nearly 15% FASTER than the equivalent generation of Intel CPUs.

Instance name   Run time (mm:ss)   Cost per run (US$)   Cost index (Linux, c7g = 1.00)
c6g.4xlarge     31:13              0.283                1.47
c7g.4xlarge     19:55              0.193                1.00
c6i.4xlarge     28:07              0.319                1.65
c7i.4xlarge     22:53              0.272                1.41
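
For transparency, the cost-per-run and cost-index columns follow directly from the hourly prices and wall-clock times in the two tables above (tiny rounding differences aside):

```python
# Reproducing the cost-per-run and cost-index figures from the tables above
prices = {  # US$/h, Linux on-demand
    "c6g.4xlarge": 0.544, "c7g.4xlarge": 0.580,
    "c6i.4xlarge": 0.680, "c7i.4xlarge": 0.714,
}
runtimes = {  # mm:ss from the benchmark
    "c6g.4xlarge": "31:13", "c7g.4xlarge": "19:55",
    "c6i.4xlarge": "28:07", "c7i.4xlarge": "22:53",
}

def cost_per_run(instance: str) -> float:
    minutes, seconds = map(int, runtimes[instance].split(":"))
    return prices[instance] * (minutes * 60 + seconds) / 3600

cheapest = min(cost_per_run(i) for i in prices)
for instance in prices:
    cost = cost_per_run(instance)
    print(f"{instance}: ${cost:.3f}/run, index {cost / cheapest:.2f}")
```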

This is nothing new for those of us who work with high-performance software, cloud deployments, and large-scale experiments/studies. In a comprehensive test back in 2020, AnandTech had already shown that things don’t look good for our old x86 friends. Still, the gap has widened substantially with the latest CPUs on the market.

There is still one elephant in the room, though: Microsoft Windows.

One of the benefits of running models in AequilibraE is that we can take advantage of the amazing work the Python community has put into porting Python and its ecosystem to these newer machine architectures, and get significantly cheaper per-run costs as a result. Many modelling teams, however, are not so flexible and are locked into the fairly narrow range of instances that fit the requirements of their commercial modelling package.

While the Intel-based solution is only 41% more expensive than the ARM one, this difference jumps to 198% when considering the cost of running this model on Windows. That means that if Windows on x86 is the only option your software supports, you will be paying about three times as much to run the same model! I can’t imagine this won’t soon be discussed in transportation modelling circles as more companies move their computational workloads to the cloud.
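
As a quick sanity check on that 198% figure, using the Windows price and run time of the c7i.4xlarge against the cheapest Linux option (the c7g.4xlarge) from the tables above:

```python
# Cost per run = hourly price x run time in hours (figures from the tables above)
windows_x86 = 1.504 * (22 * 60 + 53) / 3600   # c7i.4xlarge on Windows: ~$0.574/run
cheapest_arm = 0.580 * (19 * 60 + 55) / 3600  # c7g.4xlarge on Linux:   ~$0.193/run
print(f"{windows_x86 / cheapest_arm:.2f}x")   # ~2.98x, i.e. ~198% more expensive
```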

With that, only one question remains: Do you love your muscle memory and comfort with the software you have been using more than your wallet?

Kudos to Jake Moss and Jamie Cook for their contributions to this post.
