In many ways, S3 buckets act like cloud hard drives, but they're only "object level storage," not block level storage like EBS or EFS. However, it is possible to mount a bucket as a filesystem and access it directly by reading and writing files.
The Benefits and Limitations of S3 as a Filesystem
The magic that makes this whole setup work is a utility called
s3fs-fuse. FUSE stands for Filesystem in Userspace, a framework that lets a userspace program provide a mounted virtual filesystem.
s3fs interfaces with S3, and supports a large subset of POSIX, including reading, writing, creating directories, and setting file metadata.
One of the great benefits of using S3 over traditional storage is that it’s very effective at storing individual objects long term, with no limit at all on total bucket size. You can store 10 photos or 10 million photos in S3, and it’ll work largely the same. In applications where you need a large (and cheap) disk, S3 makes sense, and if the application you’re integrating wants file access, this is a good way to bridge the two.
Of course, it's not without limitations. While performance when storing and retrieving whole files is roughly comparable to using the S3 API directly, it's no replacement for much faster network-attached block storage. There's a reason this configuration isn't officially supported by AWS: you'll run into concurrency issues with multiple clients using files, especially if you have clients in different regions accessing the same bucket. The S3 API shares this limitation, and it doesn't prevent you from attaching multiple clients, but the problem is more apparent when FUSE seems to give you "direct" access. It doesn't, and you'll have to keep these limitations in mind.
AWS does have a similar service: Storage Gateway, which can act as a local NAS and provides local block storage backed by S3. However, it's more of an enterprise solution, and it requires a dedicated physical server to deploy a VMware image to.
s3fs, on the other hand, is a simple single server solution, although it doesn’t do much caching.
So, if you can convert applications to use the S3 API rather than a FUSE mount, you should do that instead. But, if you're okay with a bit of a hacky solution,
s3fs can be useful.
Setting Up s3fs-fuse
Compared to how hacky it is, it’s surprisingly easy to set up.
s3fs-fuse is available from most package managers, though it may just be called
s3fs on some systems. For Debian-based systems like Ubuntu, that would be:
sudo apt install s3fs
You'll need to create an IAM user, and give it permission to access the bucket you wish to mount. At the end of the process, you'll receive an access key ID and secret access key.
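The exact permissions depend on your use case, but a minimal policy along these lines is typically enough for s3fs to list the bucket and read, write, and delete objects (bucket-name here is a placeholder for your own bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::bucket-name/*"
    }
  ]
}
```

Note that bucket-level actions like ListBucket apply to the bucket ARN itself, while object-level actions apply to the bucket ARN with a /* suffix.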
You can paste these in the standard AWS credentials file,
~/.aws/credentials, but if you want to use a different key,
s3fs supports a custom password file. Paste both the access key ID and secret into
/etc/passwd-s3fs, in the following format:
echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > /etc/passwd-s3fs
And make sure the permissions on this keyfile are set properly, or it’ll complain:
chmod 600 /etc/passwd-s3fs
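Putting those two steps together, the keyfile setup looks like this (shown here with AWS's documented example credentials and a local path, so the demonstration doesn't need root; in practice the file would be /etc/passwd-s3fs):

```shell
# Write a placeholder key pair in s3fs's expected KEY_ID:SECRET format.
echo "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" > passwd-s3fs

# s3fs refuses to use a keyfile that other users can read.
chmod 600 passwd-s3fs

# Confirm the permissions are owner read/write only.
stat -c '%a' passwd-s3fs   # → 600
```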
Then, you can mount the bucket with the following command:
s3fs bucket-name /mnt/bucket-name
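If your keyfile lives somewhere non-standard, or the bucket is in a specific region, you can pass those options explicitly. A sketch, where the endpoint URL is an assumption you'd swap for your bucket's region:

```
s3fs bucket-name /mnt/bucket-name \
    -o passwd_file=/etc/passwd-s3fs \
    -o url=https://s3.us-east-1.amazonaws.com \
    -o allow_other
```

The allow_other option lets users other than the one performing the mount access the files, which is usually what you want for a shared mount point.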
If that doesn't work, you can run s3fs in the foreground with debug output enabled by adding a few extra flags:
s3fs bucket-name /mnt/bucket-name -o dbglevel=info -f -o curldbg
If you want the bucket to mount automatically at boot, you'll need to add the following line to /etc/fstab:
s3fs#bucket-name /mnt/bucket-name fuse _netdev,allow_other,umask=227,uid=33,gid=33,use_cache=/root/cache 0 0
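As a rough breakdown of that line (the bucket name, mount point, and cache directory are examples you'd adjust for your own setup):

```
# device           mountpoint        type  options  dump  pass
s3fs#bucket-name   /mnt/bucket-name  fuse  ...      0     0
#
# _netdev      - wait for the network to be up before mounting
# allow_other  - let users other than the mounting user access the files
# umask=227    - mask out write bits for owner/group and all access for others
# uid=33,gid=33 - own files as user/group 33 (www-data on Debian-based systems)
# use_cache=/root/cache - cache downloaded files locally in this directory
```

The uid/gid of 33 makes sense if a web server will be reading the files; for other applications you'd use the uid and gid of whichever user needs access.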