Disk speed is an important part of measuring a server’s performance. AWS has many different types of EBS volumes, and it uses a burst-bucket model—similar to T2 instances—to determine your disk’s overall speed.
Most server workloads probably include some sort of memory-caching, so if you have plenty of RAM, your disk speed might not matter that much; once a file is read, it can stay in memory for a while. But for write-heavy workloads, disk speed starts to become the limiting factor, and can make or break your server’s performance.
IOPS and SSD Performance Explained
AWS lists and measures SSD speed using Input-Output Operations Per Second (IOPS). This is largely just a measure of the device’s 4K Random read and write speed.
SSDs perform differently under varying workloads, so there are a few ways to measure how fast they are. The first is sequential read and write speed, which measures how quickly the drive can read or write one large contiguous file. This does matter, especially when working with large data, but it's the ideal scenario; in the real world, SSDs usually have to pull data from multiple locations at once.
A better metric is random performance. This benchmark reads and writes data in 4,096-byte chunks at random locations, hence the name “4K Random.” It more accurately mimics the real-world load an SSD faces.
Random benchmarks can vary depending on the queue depth, a measure of how many operations the SSD has waiting to process. When the SSD is being queried for a bunch of files at once, the queue depth is high, and the controller can work on many operations in parallel, which raises throughput. Baseline performance is measured at queue depth 1, which seems to be what AWS measures their SSDs at.
IOPS are a measure of how many actual operations are taking place. The formula for finding the IOPS from MB/s is:
IOPS = (MBps / KB Per Operation) * 1024
And because we’re reading 4 KB at a time, the formula becomes:
IOPS = MBps * 256
The desktop SSD in the above screenshot would work out to over 13,000 IOPS, which is fairly good for a 2 TB NVMe SSD.
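The conversion above can be sketched as a quick helper function. The 52 MB/s figure used here is purely illustrative, not a number taken from the screenshot:

```python
def mbps_to_iops(mbps, kb_per_op=4):
    """Convert throughput in MB/s to IOPS for a given operation size in KB."""
    return int(mbps / kb_per_op * 1024)

# At the 4K operation size used for random benchmarks,
# ~52 MB/s of 4K random IO works out to just over 13,000 IOPS:
print(mbps_to_iops(52))  # 13312
```

Note that at 4 KB per operation, the divide-and-multiply collapses to the `MBps * 256` shortcut from the text.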
The Burst-Bucket Model
The main thing that makes AWS EBS volumes complicated is burst performance. This works very similarly to how T2/T3 instances work—when the disk sits idle, it accumulates IO credits at a rate determined by the volume size.
These credits go into a “bucket” that holds up to 5.4 million of them, enough to sustain a full 3,000 IOPS burst for 30 minutes. The bucket starts full so that instances can boot and bootstrap applications quickly.
Credits are drained from the bucket whenever you perform IO. gp2 has a maximum burst performance of 3,000 IOPS, so you can drain at most 3,000 credits per second.
Volumes earn IO credits at a rate of 3 per GB per second, which means that if you have a volume larger than 1 TB, your bucket will always be full, and you won't have to worry about burst performance. Anything smaller than that, and sustained workloads are limited to the baseline performance determined by how many credits you earn.
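The bucket arithmetic described above can be modeled in a few lines. One detail added here as an assumption from the EBS documentation: gp2 also enforces a 100 IOPS floor on small volumes.

```python
BUCKET_MAX = 5_400_000   # maximum IO credits the bucket holds
BURST_IOPS = 3_000       # gp2 maximum burst rate

def gp2_baseline_iops(size_gb):
    # Credits are earned at 3 per GB per second; the 100 IOPS floor
    # is an assumption taken from the EBS documentation.
    return max(100, min(3 * size_gb, BURST_IOPS))

def seconds_until_empty(size_gb):
    """How long a full bucket lasts while bursting at the 3,000 IOPS cap."""
    drain_rate = BURST_IOPS - gp2_baseline_iops(size_gb)
    if drain_rate <= 0:
        return float("inf")  # volume of 1 TB or more: the bucket never drains
    return BUCKET_MAX / drain_rate

# A 100 GB volume earns 300 credits/s, so bursting drains 2,700/s:
print(seconds_until_empty(100) / 60)  # about 33 minutes
```

This also confirms the 30-minute figure: a near-empty baseline drains the 5.4 million credits at close to 3,000 per second, or roughly 1,800 seconds.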
If you need more sustained performance, you can rent a bigger volume, or use a Provisioned IOPS (io1) volume. While these are more expensive per GB, they let you buy IOPS directly. You can purchase anywhere from 100 to 64,000 IOPS, at a rate of $0.065 per provisioned IOPS per month. This is only really cost effective if you need more than 3,000 IOPS; for anything under that, you'll effectively be paying double the price for the volume. For example, if you needed a 3,000 IOPS 64 GB volume, you could simply provision a 1 TB gp2 volume for roughly half the price. But if you want the extra speed, you can pay for it.
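To make the cost comparison concrete, here is a small sketch. The per-GB rates are assumed us-east-1 figures at the time of writing; check current AWS pricing before relying on them:

```python
# Assumed us-east-1 monthly rates (verify against current AWS pricing):
GP2_PER_GB = 0.10      # $/GB-month
IO1_PER_GB = 0.125     # $/GB-month
IO1_PER_IOPS = 0.065   # $/provisioned-IOPS-month

def io1_monthly_cost(size_gb, iops):
    return size_gb * IO1_PER_GB + iops * IO1_PER_IOPS

def gp2_monthly_cost(size_gb):
    return size_gb * GP2_PER_GB

# The example from the text: 3,000 provisioned IOPS on a 64 GB io1 volume,
# versus over-provisioning a 1 TB gp2 volume for its 3,072 IOPS baseline.
print(io1_monthly_cost(64, 3000))  # roughly $203/month
print(gp2_monthly_cost(1024))      # roughly $102/month
```

The over-provisioned gp2 volume comes in at about half the price, which is the break-even logic behind the "only worth it above 3,000 IOPS" rule of thumb.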
Hard Drive (st1 and sc1) Performance
AWS's hard drive-based EBS volumes also use a burst-bucket model, but hard drives work a bit differently than SSDs, so performance isn't measured in IOPS. Because a hard drive relies on spinning platters and a moving read/write head, sequential read and write speeds are relatively fixed, while random reads and writes slow it down significantly (one of the main downsides of hard drives). AWS measures these volumes by sequential throughput instead.
For st1, the base speed grows by 40 MiB/s per TB, starting at 20 MiB/s for the minimum volume size of 500 GB.
Burst speed grows by 250 MiB/s per TB, up to a maximum of 500 MiB/s. For volumes larger than 12 TB, you’re able to burst to maximum speed 100% of the time. Anything less, and you are limited by your burst credit balance.
For sc1, the base speed grows by 12 MiB/s per TB, starting at 6 MiB/s for the minimum volume size of 500 GB. This makes it much slower, and it will never reach 100% burst capacity (but it is cheaper).
Burst speed is also limited, growing by 80 MiB/s per TB up to a maximum of 250 MiB/s. That works out to about 8,000 IOPS, but again, this is likely the sequential speed; you won't see random speeds anywhere near that high out of any hard drive.
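The per-TB scaling for both hard drive volume types can be summarized in one function. The burst caps come from the text above; the base-speed caps (500 MiB/s for st1, 192 MiB/s for sc1) are assumptions taken from the EBS documentation:

```python
def hdd_throughput(size_tb, volume_type):
    """Base and burst throughput in MiB/s, using the per-TB rates above.

    Burst caps are from the text; base caps are assumed from AWS docs.
    """
    rates = {
        "st1": {"base": 40, "base_cap": 500, "burst": 250, "burst_cap": 500},
        "sc1": {"base": 12, "base_cap": 192, "burst": 80, "burst_cap": 250},
    }
    r = rates[volume_type]
    base = min(r["base"] * size_tb, r["base_cap"])
    burst = min(r["burst"] * size_tb, r["burst_cap"])
    return base, burst

print(hdd_throughput(0.5, "st1"))   # minimum 500 GB volume: (20.0, 125.0)
print(hdd_throughput(16, "st1"))    # past 12 TB, base and burst converge
```

Once base and burst converge for a large st1 volume, bursting no longer drains credits, which is why volumes larger than 12 TB can run at maximum speed 100% of the time.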
How to Find Your Real-World Disk Speed
You could use a tool like dd to measure sequential write speed, but this doesn't stress the disk nearly enough to be useful and isn't indicative of any real use case.
To get something better, install the disk benchmarking tool fio from your distro's package manager:
sudo apt-get install fio
Then, run it with the following command:
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=250M --readwrite=randrw --rwmixread=80
It will create a 250 MB file, and perform random read and write tests at a ratio of 80% reads, 20% writes, giving you a much more accurate view of how your disk really performs.
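If you want to post-process the results, fio can also emit machine-readable output with --output-format=json. As a sketch, here is how you might pull the IOPS numbers out of that output in Python; the sample fragment below is heavily trimmed and purely illustrative, since real fio output contains many more fields:

```python
import json

# Trimmed, illustrative fragment of fio's --output-format=json output.
sample = """
{"jobs": [{"jobname": "test",
           "read":  {"iops": 2612.4, "bw": 10449},
           "write": {"iops": 653.1,  "bw": 2612}}]}
"""

def summarize(fio_json):
    """Extract rounded read/write IOPS from a fio JSON report."""
    job = json.loads(fio_json)["jobs"][0]
    return {
        "read_iops": round(job["read"]["iops"]),
        "write_iops": round(job["write"]["iops"]),
    }

print(summarize(sample))
```

Note the read/write split in the output roughly mirrors the 80/20 mix requested with --rwmixread=80.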
A quick test with a 25 MB file shows the benefit of AWS’s burst bucket model. The gp2 volume is able to burst to a fast speed for a bit to handle the transfer smoothly. With such a small size, the SSD is able to effectively burst past the 3,000 IOPS limit but only for a second.
A longer test with a 250 MB file gives a better look at how the SSD will perform under larger loads. In this case, the test takes longer than a second, so the speed is limited by the burst IOPS speed, coming in at 2,600 IOPS.
Of course, if we were to let this test run for more than 30 minutes, the gp2 volume would run out of credits and slow down to its baseline speed. For an 8 GB volume, the 3-IOPS-per-GB rate works out to just 24 IOPS, though AWS enforces a 100 IOPS minimum on gp2 baselines. Either way, you're likely not going to encounter loads that use 100% of your disk, and if you do, you can always use a bigger disk with guaranteed performance, or provision IOPS directly.