How to Get Started with Auto-Scaling Container Deployments on AWS ECS

ecs hero image

AWS’s Elastic Container Service (ECS) is a compute engine specifically designed for Docker containers. You can use it to deploy containers using underlying EC2 instances, or run a server-agnostic deployment on Fargate.

What Is ECS?

The basic usage of ECS is pretty simple. Rather than having to administer Linux servers, you simply give it a Docker container, choose how much compute power you want to give it, and set it to work. ECS handles the dirty work of finding the metal to run it on. By default, it runs serverless using Fargate, though you can optionally choose to run your containers on EC2 instances and maintain full control over them.

Whenever you want to make updates, you simply update the container in the container registry and trigger an update for ECS. You can automate this entire process using CodePipeline, which can build your container from source and roll out a blue/green deployment to ECS.
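If you'd rather not set up CodePipeline, you can also trigger the rollout by hand from the AWS CLI. The sketch below assumes hypothetical names ("my-cluster" and "my-service") that stand in for your own:

```shell
# Force ECS to pull the latest image and roll out new tasks.
# "my-cluster" and "my-service" are placeholders for your own names.
aws ecs update-service \
    --cluster my-cluster \
    --service my-service \
    --force-new-deployment
```

With `--force-new-deployment`, ECS starts replacement tasks from the current task definition, which pulls the newest image for any `:latest` tag.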

ECS can also be easily auto-scaled with a single toggle, compared to EC2, which takes some additional setup. If you enable this feature, ECS will automatically deploy new containers to match demand whenever CPU usage, memory usage, or other CloudWatch alarms exceed their thresholds. With auto-scaling, you'll never have to worry about upgrading your server to a higher class, or manually deploying multiple servers. The opposite is also true—your application can scale down during off hours, saving you money in the process.

As far as pricing goes, there are no additional fees for the EC2 launch type—you simply pay for the underlying instances. For Fargate, the fee is calculated from the number of vCPUs and the amount of memory requested. If you run the numbers, Fargate comes out roughly 20% more expensive on paper. However, this is offset by the fact that Fargate deployments (when properly configured) only use as many resources as they need, which trims costs considerably. Fargate also supports Spot pricing, which saves a lot of money over on-demand EC2 and makes Fargate the preferred launch type in any case that doesn't require direct access to the underlying server.

Setting Up Docker and Pushing to ECR

To get the container to ECS, you'll need to push it to a repository. You can use Docker Hub or your own registry server, but AWS offers its own solution with Elastic Container Registry (ECR). Containers pushed here are private to your AWS account and easily accessible from other services like ECS and CodePipeline.

Head over to the ECR Management Console and create a new repository. The repository URI depends on your AWS account ID—you can copy it under the “URI” column.

Save the following script as updateECR.sh next to your Dockerfile. Replace the TAG and REPO variables with the proper values.


#!/bin/bash
TAG=my-app   # replace with your image tag
REPO=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app   # replace with your repository URI

aws ecr get-login-password | docker login --username AWS --password-stdin $REPO
docker build -t $TAG .
docker tag $TAG:latest $REPO:latest
docker push $REPO:latest

Running this script will log in to ECR, build your container, tag it, and push it to your repository. If you refresh the list, you should see your container:

ecr repository

Deploying on ECS

To launch a container on ECS, you’ll need two things:

  • A “Task Definition,” which contains metadata about the containers themselves—which ports are exposed, how much memory to allocate to containers, etc. You can run multiple containers in a single task definition, if your application makes use of more than one.
  • A “Service,” which represents a deployment of a task definition, the networking associated with it, and the auto-scaling settings. You can group multiple services together into a single “Cluster.”

You’ll need to create the task definition first, so create one from “Task Definitions” in the sidebar.

create new task definition

Give it a name, and specify the task memory and vCPU size. Keep in mind that you can always create multiple containers and link them together with a Load Balancer. If you're planning on using auto-scaling, you'll want to think of this value as the unit size of each container—rather than deploying a single 16 vCPU container, you might choose to deploy eight 2 vCPU containers.

set vCPU and Mem

Next, click “Add Container” to define the containers this task definition will access. Paste in the URI of your Docker container in ECR. You can also set soft- and hard-memory limits here, and open ports (only applicable to EC2 launch types).
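The same task definition can be registered from the CLI. This is a minimal sketch, assuming a Fargate launch type—the family name, account ID, image URI, and role ARN are all placeholders for your own values:

```shell
# Sketch: register a minimal Fargate task definition.
# "my-app", the account ID, and the role ARN are placeholders.
aws ecs register-task-definition \
    --family my-app \
    --requires-compatibilities FARGATE \
    --network-mode awsvpc \
    --cpu 256 --memory 512 \
    --execution-role-arn arn:aws:iam::123456789012:role/ecsTaskExecutionRole \
    --container-definitions '[{
        "name": "my-app",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
        "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        "essential": true
    }]'
```

Fargate requires the `awsvpc` network mode, and the CPU/memory pair must be one of the combinations Fargate supports (256 CPU units with 512 MB is the smallest).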

add container

Once that’s created, you can deploy Services using this definition. Head over to “Clusters” and create a new cluster, choosing either Fargate or EC2 depending on your preference. You can optionally choose to create a new VPC for this cluster, or you can deploy it in your default VPC.

You can run tasks manually, but it’s better to create a service to handle it. Create a new service from the Cluster view:

create new service

Give it a name, select the launch type you’re using, and select the task definition you just created. Here, you can set the number of tasks you’d like to launch, and how many of them should be healthy at any given time. If one of the tasks fails or is terminated, a new one will be spun up to replace it. This is independent of auto-scaling.
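The equivalent CLI call looks roughly like this—the cluster, service, subnet, and security group identifiers are placeholders, not real values:

```shell
# Sketch: create a Fargate service running 2 copies of the task definition.
# The subnet and security group IDs are placeholders.
aws ecs create-service \
    --cluster my-cluster \
    --service-name my-service \
    --task-definition my-app \
    --launch-type FARGATE \
    --desired-count 2 \
    --network-configuration \
      'awsvpcConfiguration={subnets=[subnet-0abc],securityGroups=[sg-0abc],assignPublicIp=ENABLED}'
```

The `--desired-count` flag corresponds to the "number of tasks" field in the console; ECS replaces any task that fails or is terminated to maintain this count.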

service config

On the next screen, select your VPC, and select a subnet to deploy into. You'll also want to edit the Security Group your service uses to open the ports your application needs.
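Opening a port on the security group can also be done from the CLI. The group ID below is a placeholder, and this example assumes you want port 80 open to the world:

```shell
# Allow inbound HTTP (port 80) from anywhere on the service's
# security group; "sg-0abc" is a placeholder ID.
aws ec2 authorize-security-group-ingress \
    --group-id sg-0abc \
    --protocol tcp \
    --port 80 \
    --cidr 0.0.0.0/0
```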

You can also add a Load Balancer or set custom DNS settings from this page. If you're using a Load Balancer, you'll want to make sure to set a reasonable "Health Check Grace Period," which prevents tasks from being marked as unhealthy while they're still launching.

service config

On the next screen, you can configure auto-scaling, which is as simple as turning it on and specifying the desired number of tasks, the maximum number you can afford, and the minimum that ECS should never drop below.

service config

Choose the target-tracking scaling policy, and set the ECSServiceAverageCPUUtilization target value to 75–80% or so. If you don't want to use target tracking, you can instead scale up and down manually based on CloudWatch alarms.
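The same policy can be configured through the Application Auto Scaling API. This is a sketch with placeholder cluster, service, and policy names:

```shell
# Register the service as a scalable target (min 1, max 8 tasks),
# then attach a target-tracking policy on average CPU utilization.
# "my-cluster", "my-service", and "cpu-75" are placeholder names.
aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --resource-id service/my-cluster/my-service \
    --scalable-dimension ecs:service:DesiredCount \
    --min-capacity 1 --max-capacity 8

aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --resource-id service/my-cluster/my-service \
    --scalable-dimension ecs:service:DesiredCount \
    --policy-name cpu-75 \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration '{
        "TargetValue": 75.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        }
    }'
```

With this in place, ECS adds tasks when average CPU rises above the target and removes them (never below the minimum) when it falls back down.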

Once deployed, your service will take a minute or so to fire up the first containers, and will be available on the container or Load Balancer’s ENI endpoint. If you’d like, you can assign an Elastic IP to this ENI, which you can configure with your DNS for a permanent link to the cluster.
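If you go the Elastic IP route, the association is a single CLI call. Both IDs below are placeholders for your own allocation and network interface:

```shell
# Attach an Elastic IP to the task's network interface.
# "eipalloc-0abc" and "eni-0abc" are placeholder IDs.
aws ec2 associate-address \
    --allocation-id eipalloc-0abc \
    --network-interface-id eni-0abc
```

Note that a task's ENI is released when the task stops, so for a stable endpoint a Load Balancer is usually the more durable choice.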

Anthony Heddings
Anthony Heddings is the resident cloud engineer for LifeSavvy Media, a technical writer, programmer, and an expert at Amazon's AWS platform. He's written hundreds of articles for How-To Geek and CloudSavvy IT that have been read millions of times.

The above article may contain affiliate links, which help support CloudSavvy IT.