AWS Graviton 2 ARM C6g is the Magento Cloud Killer instance

Yegor Shytikov
Oct 14, 2020

According to my latest test, ARM Graviton delivers 70% better Magento 2 (M2) throughput, but about 10% slower TTFB (time to first byte) because of its lower clock speed: 2.5 GHz vs. 3 GHz. ARM also has lower energy consumption and a smaller carbon footprint, so it doesn’t heat our planet as much.

AWS Graviton is the Apple M1 of the cloud.

Apple’s new M1 CPU uses the same ARM64 architecture as AWS Graviton 2, the chip behind the Magento Cloud killer EC2 instance. Graviton 2 is a custom AWS 64-core monolithic server chip built on a 7-nanometer manufacturing process.

The main issue with Intel CPUs is Hyper-Threading technology. Magento hosting providers usually cheat by selling one thread as one physical CPU; performance-wise, however, 1 vCPU is only about 0.5 of a physical CPU.

Let's test the throughput of an 8 vCPU C5 instance against a C6g that also has 8 vCPUs, each of which is a physical core without multithreading.

We need to keep all 8 CPUs busy for a short period of time.
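The results below come from ApacheBench, but the idea of keeping N CPUs busy can be sketched in a few lines of Python. This is a minimal illustration, not the tool used for the measurements: the `fetch` and `run_load` names and the URL in the usage comment are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, median
from urllib.request import urlopen


def fetch(url):
    """Fetch one page and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    with urlopen(url) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0


def run_load(task, total_requests, concurrency):
    """Call `task` total_requests times, at most `concurrency` at once.

    `task` must return the duration of one request in milliseconds.
    """
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(lambda _: task(), range(total_requests)))
    elapsed = time.perf_counter() - started
    return {
        "requests": total_requests,
        "concurrency": concurrency,
        "requests_per_second": total_requests / elapsed,
        "mean_ms": mean(timings),
        "median_ms": median(timings),
    }


# Usage (hypothetical URL, replace with the page under test):
#   for level in (1, 8):
#       print(run_load(lambda: fetch("http://localhost/"), 20, level))
```

Running it once with a concurrency of 1 and once with a concurrency equal to the number of vCPUs mirrors the two scenarios compared below.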

C5, 1 concurrent request:

Concurrency Level:      1
Time taken for tests: 3.183 seconds
Complete requests: 20
Failed requests: 0
Total transferred: 647480 bytes
HTML transferred: 553780 bytes
Requests per second: 6.28 [#/sec] (mean)
Time per request: 159.137 [ms] (mean)
Time per request: 159.137 [ms] (mean, across all concurrent requests)
Transfer rate: 198.67 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 155 159 4.6 158 176
Waiting: 155 158 4.6 157 175
Total: 155 159 4.6 158 176

Result: 157ms

C6g, 1 concurrent request:

Concurrency Level:      1
Time taken for tests: 2.734 seconds
Complete requests: 16
Failed requests: 0
Requests per second: 5.85 [#/sec] (mean)
Time per request: 170.875 [ms] (mean)
Time per request: 170.875 [ms] (mean, across all concurrent requests)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 1
Processing: 169 171 1.3 170 175
Waiting: 168 169 0.4 169 170
Total: 169 171 1.4 171 175

Result: 169ms

C5, 8 concurrent requests, all CPUs busy:

Complete requests:      20
Failed requests: 0
Requests per second: 6.83 [#/sec] (mean)
Time per request: 292.912 [ms] (mean)
Time per request: 146.456 [ms] (mean, across all concurrent requests)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 162 277 37.1 285 306
Waiting: 162 275 36.9 284 305
Total: 163 277 37.1 285 307

Result: 286ms

The Intel processor degrades sharply under load: the response time rises from 157 ms to 286 ms, roughly 80% slower.

C6g, 8 concurrent requests, all CPUs busy:

Complete requests:      20
Failed requests: 0
Total transferred: 2412840 bytes
HTML transferred: 2401960 bytes
Requests per second: 29.19 [#/sec] (mean)
Time per request: 274.096 [ms] (mean)
Time per request: 34.262 [ms] (mean, across all concurrent requests)
Transfer rate: 3438.63 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 1
Processing: 169 172 1.3 172 175
Waiting: 168 171 1.4 171 174
Total: 170 172 1.3 172 175

Result: 171ms

As we can see, C6g shows no performance degradation when the number of running PHP-FPM processes equals the number of vCPUs. Intel has this issue because concurrent processes compete for the same physical cores.
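To make the comparison concrete, the latency increase under load can be computed directly from the "Result" figures quoted above. The snippet is only a worked calculation; the dictionary layout and function name are choices made for this example.

```python
# Response times in milliseconds, copied from the "Result" lines above.
results_ms = {
    "C5": {"idle": 157, "loaded": 286},
    "C6g": {"idle": 169, "loaded": 171},
}


def latency_increase_percent(idle_ms, loaded_ms):
    """How much slower a request gets when all CPUs are busy, in percent."""
    return (loaded_ms - idle_ms) / idle_ms * 100.0


for instance, times in results_ms.items():
    increase = latency_increase_percent(times["idle"], times["loaded"])
    print(f"{instance}: {increase:.1f}% slower with all CPUs busy")
```

With these inputs the C5 response time grows by about 82%, while the C6g response time grows by about 1%.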

Magento Cloud vs. AWS Graviton2 multi-request performance

With hyper-threading, each physical core can accept two threads from the operating system (or hypervisor). These two threads still share the core's physical execution units.

Each Graviton2 has 64 physical cores, and an AWS virtual CPU (vCPU) uses an entire core rather than the simultaneous multithreading (SMT) core sharing that’s applied to x86 vCPUs.

Simultaneous multithreading can decrease performance if any of the shared resources is a bottleneck. The Magento 2 core itself is a performance disaster for processors. Certified Magento developers should test whether simultaneous multithreading helps or hurts their Magento application in various situations, and add logic to turn it off if it decreases performance. In my opinion, a physical core is always better than a virtual/logical one.
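A first step in such a test is finding out whether SMT is active on the host at all. The sketch below assumes a Linux kernel that exposes the SMT state under `/sys/devices/system/cpu/smt/active`; on systems without that file it reports the state as unknown. The worker sizing rule (one PHP-FPM worker per physical core) is a hypothetical heuristic for illustration, not a measured recommendation.

```python
from pathlib import Path

# Linux sysfs file that reports whether SMT is currently active ("1" or "0").
SMT_ACTIVE_PATH = Path("/sys/devices/system/cpu/smt/active")


def smt_active(path=SMT_ACTIVE_PATH):
    """Return True or False if the kernel reports SMT state, None if unknown."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None


def suggested_fpm_workers(logical_cpus, smt):
    """Hypothetical rule: one worker per physical core.

    Assumes two threads per core when SMT is active; otherwise every
    logical CPU is treated as a full core.
    """
    if smt:
        return max(1, logical_cpus // 2)
    return logical_cpus
```

For example, `suggested_fpm_workers(8, smt_active())` yields 4 on an 8 vCPU host with SMT active and 8 on a host where every vCPU is a full core.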

Andrei Frumusanu states:

If you’re an EC2 customer today, and unless you’re tied to x86 for whatever reason, you’d be stupid not to switch over to Graviton2 instances once they become available, as the cost savings will be significant.

So a C6g.16xlarge instance (2.5 GHz AWS Graviton2 ARM processor, 64 vCPUs on 64 physical cores, equivalent to a 128 vCPU Magento Cloud plan, 128 GiB memory, $1,588/month) will outperform in throughput the 3 GHz Intel Xeon Platinum 8275L instance used in Magento Cloud (48 vCPUs on 24 physical cores, 96 GiB, $1,489/month). It may also outperform the next instance size, C5.18xlarge (3 GHz Intel Xeon Platinum 8124M, 72 vCPUs on 36 physical cores, 144 GiB, $2,233/month), and possibly even C5.24xlarge (3 GHz Intel Xeon Platinum 8275L, 96 vCPUs on 48 physical cores, 192 GiB memory).
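The comparison is easier to see as a price per physical core. The snippet below uses only the vCPU counts and monthly prices quoted above, and assumes two hardware threads per core on the Intel instances and one on Graviton2; the `Instance` class is an illustrative helper, not part of any AWS tooling.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    vcpus: int
    threads_per_core: int  # 2 with hyper-threading, 1 on Graviton2
    monthly_usd: float

    @property
    def physical_cores(self):
        return self.vcpus // self.threads_per_core

    @property
    def usd_per_core(self):
        return self.monthly_usd / self.physical_cores


instances = [
    Instance("C6g.16xlarge (Graviton2)", 64, 1, 1588),
    Instance("C5, 48 vCPU (Xeon 8275L)", 48, 2, 1489),
    Instance("C5.18xlarge (Xeon 8124M)", 72, 2, 2233),
]

for inst in instances:
    print(f"{inst.name}: {inst.physical_cores} cores, "
          f"${inst.usd_per_core:.2f} per core per month")
```

On these figures a Graviton2 core costs roughly $25 per month, against roughly $62 for an Intel physical core.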

So a single C6g.16xlarge instance, at $690/month with a 3-year reservation and no upfront payment, can outperform the biggest Magento Cloud 120 CPU plan, which costs about $35K per month. Vertical scaling is also available if you don’t need all the performance all the time. Infrastructure management overhead is close to zero because you are not auto-scaling: you keep maximum server capacity at a fraction of the Magento Cloud cost.

You can also create Magento Cloud infrastructure on Graviton 2 instances with auto-scaling using Terraform:

Terraform Magento ARM Cloud Architecture diagram:

Terraform Magento Cloud

Magento Commerce Cloud uses the X1 instance [2.3 GHz Intel Xeon E7-8880 v3 (Haswell), 128 vCPUs, 1,952 GiB memory] to scale the cloud for its most expensive enterprise customer plan (120 CPUs). However, this instance type performs much worse: it is about 60% slower than R5 [3.1 GHz Intel Xeon Platinum 8175 (Skylake)], and Graviton 2 is about 10% faster than it.

X1 performance is:

The code took 0.16240096092224 seconds to complete.

Redis performance (redis-benchmark -c 100 -p 6370):

====== SET ======
100000 requests completed in 1.74 seconds
100 parallel clients
3 bytes payload
keep alive: 1
0.00% <= 1 milliseconds
99.64% <= 2 milliseconds
99.80% <= 4 milliseconds
99.82% <= 5 milliseconds
99.89% <= 6 milliseconds
99.96% <= 7 milliseconds
100.00% <= 7 milliseconds
57603.69 requests per second
====== GET ======
100000 requests completed in 1.80 seconds
100 parallel clients
3 bytes payload
keep alive: 1
0.01% <= 1 milliseconds
99.12% <= 2 milliseconds
99.79% <= 3 milliseconds
99.92% <= 4 milliseconds
100.00% <= 4 milliseconds
55555.56 requests per second

R5 performance:

The code took 0.10573697090149 seconds to complete.

Redis performance: 143,061.52 requests per second

C6g Graviton 2 CPU performance:

The code took 0.15448880195618 seconds to complete

Redis performance: 168,067.22 requests per second
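The three sets of numbers above can be put side by side by normalizing everything to X1. This is a worked calculation on the quoted figures only; it assumes the single Redis figures given for R5 and C6g are comparable to the X1 SET result, which the benchmark output above does not state explicitly.

```python
# Figures copied from the benchmark results above.
# For X1 the Redis SET throughput is used (assumption, see lead-in).
benchmarks = {
    "X1": {"script_seconds": 0.16240096092224, "redis_rps": 57603.69},
    "R5": {"script_seconds": 0.10573697090149, "redis_rps": 143061.52},
    "C6g": {"script_seconds": 0.15448880195618, "redis_rps": 168067.22},
}


def relative_to(baseline, name):
    """Return (script speedup, Redis throughput ratio) of `name` vs `baseline`."""
    base, other = benchmarks[baseline], benchmarks[name]
    script_speedup = base["script_seconds"] / other["script_seconds"]
    redis_ratio = other["redis_rps"] / base["redis_rps"]
    return script_speedup, redis_ratio


for name in ("R5", "C6g"):
    script_speedup, redis_ratio = relative_to("X1", name)
    print(f"{name} vs X1: script {script_speedup:.2f}x, Redis {redis_ratio:.2f}x")
```

Values above 1.0 mean the instance is faster than X1 on that measurement.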

Magento simply uses the instance with the largest number of vCPUs available (128). But more virtual cores are not necessarily faster, especially when they are virtual CPUs rather than physical cores.

This open source project sets up Magento 2 infrastructure on a Graviton instance (prices start from $56/month) in several minutes, ready to use:
