How To Get The Largest Items In An S3 Bucket

If you’re curious about the largest items in an AWS S3 bucket, you can use the CLI to print out a list sorted by size. This can help you locate unusually large objects in the bucket which may be taking up space.

Listing and Sorting Items with the S3 CLI

The S3 console has built-in sorting options, so if you’re just looking for the largest item in a folder, you can simply sort that folder by size. However, if you want to search across all items regardless of key prefix, you’ll need to do so from the AWS CLI. If you don’t have that installed, you can refer to our guide on configuring it to set it up.

The command for listing every object version in a bucket, including noncurrent versions, is pretty simple:

aws s3api list-object-versions --bucket example-bucket

This query can take a while to run, since it downloads the metadata for every object version in the bucket, but you’ll get a JSON object whose Versions array contains an entry for each one, like the following:

        {
            "ETag": "\"04e28fbee1ef2721123bb4e9a78183a895\"",
            "Size": 320,
            "StorageClass": "STANDARD",
            "Key": "folder/file.json",
            "VersionId": "fNdwjJRaEjBYUSBgZe51oj_s4ONo5GsL",
            "IsLatest": false,
            "LastModified": "2020-11-05T18:59:18+00:00",
            "Owner": {
                "DisplayName": "username",
                "ID": "501092a155f88f4d174d7as3d2a347f33b9495f0261434682ab9a"
            }
        }

To parse and sort this, you can use jq, a fantastic utility for working with JSON on the command line. On Debian or Ubuntu you can install it with apt, though binaries are also available:

sudo apt-get install jq
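
With jq installed, a filter along these lines pulls the key and size out of each entry. This is a minimal sketch, not the only way to do it: the function name current_versions_tsv is hypothetical, it assumes the JSON from list-object-versions arrives on standard input, and it keeps only entries whose IsLatest field is true so that old versions are left out.

```shell
# Hypothetical helper: reads list-object-versions JSON on stdin and prints
# one "key<TAB>size" line per current object version.
current_versions_tsv() {
  # .Versions[]?    iterate the Versions array; print nothing if it is missing
  # select(...)     keep only the current version of each key
  # @tsv            join key and size with a tab, escaping tabs inside keys
  jq -r '.Versions[]? | select(.IsLatest) | [.Key, .Size] | @tsv'
}
```

It could be used as `aws s3api list-object-versions --bucket example-bucket | current_versions_tsv`. Dropping the select(.IsLatest) stage would include noncurrent versions as well, which also take up space in a versioned bucket.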

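Putting it together, the following command prints the 100 largest object versions in the bucket, one per line, as the key followed by the size in bytes. It will still take a while to run on a large bucket. The sort splits fields on tabs so that keys containing spaces don’t throw off the size column:

aws s3api list-object-versions --bucket example-bucket | jq -r '.Versions[] | [.Key, .Size] | @tsv' | sort -t "$(printf '\t')" -k2,2nr | head -n 100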

If you want more or fewer items, you can change the number passed to the head command, which trims all but the first N lines.
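
If you run this often, the sort and trim stages can be wrapped in small shell functions so the count becomes an argument. The sketch below assumes its input is one line per item, with the key and the size in bytes separated by a tab, like the lines the jq stage prints. The names largest_n and human_sizes are hypothetical, and the size formatting is an optional extra.

```shell
# Hypothetical helper: keep the N largest entries (default 100) from
# "key<TAB>size" lines on stdin, largest first.
largest_n() {
  tab=$(printf '\t')
  # Split on tabs only, so keys containing spaces stay in field 1,
  # then sort numerically on field 2 in descending order.
  sort -t "$tab" -k2,2nr | head -n "${1:-100}"
}

# Hypothetical helper: rewrite the size column in binary units.
human_sizes() {
  awk -F'\t' 'BEGIN { split("B KiB MiB GiB TiB", unit, " ") }
  {
    size = $2; i = 1
    # Divide by 1024 until the value fits the current unit.
    while (size >= 1024 && i < 5) { size /= 1024; i++ }
    printf "%s\t%.1f %s\n", $1, size, unit[i]
  }'
}
```

For example, appending `| largest_n 20 | human_sizes` after the jq stage, in place of the sort and head stages, would show the 20 largest items with readable sizes.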

Anthony Heddings
Anthony Heddings is the resident cloud engineer for LifeSavvy Media, a technical writer, programmer, and an expert at Amazon's AWS platform. He's written hundreds of articles for How-To Geek and CloudSavvy IT that have been read millions of times.
