How to Index Your Docker Image’s Dependencies With Syft

Syft is a CLI utility that generates a Software Bill of Materials (SBOM) for container images. An SBOM is a catalogue of dependencies used by your image. It gives you visibility into the “materials” that form your image’s filesystem.

Producing an SBOM can help you identify overly complex package supply chains that put you at risk of dependency confusion attacks. Distributing an SBOM alongside your image informs users of what lies below the surface. This provides a useful starting point when tightening supply chain security.

Syft is developed by Anchore, which also offers a complete container scanning engine. The Syft CLI can extract package lists from images built on popular operating systems and programming languages. Both Docker and OCI images are supported.

Installing Syft

An installation script is available to download the latest Syft binary and add it to your path:

curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

Mac users can also get Syft from Homebrew by adding the anchore/syft repository and installing the syft package.
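
The Homebrew route can be sketched as a small shell function. This assumes Homebrew is already installed; the tap and package names are the ones mentioned above, and the function name install_syft_with_homebrew is only for illustration.

```shell
# Sketch: install Syft through Homebrew (assumes the brew command is available).
# The function name is hypothetical; the tap and package names come from the text above.
install_syft_with_homebrew() {
    # Register the anchore/syft repository as a tap, then install the package from it.
    brew tap anchore/syft && brew install syft
}
```

Call install_syft_with_homebrew once, then run syft version to confirm the binary is on your path.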

Once you’ve got Syft on your system, run syft in your terminal to display the available commands. You can generate completions for your shell by running syft completion.

Use syft version to find your installation’s version. Check the GitHub tags page periodically to find new releases, then reuse the installation script to download each update.

Scanning an Image

Syft’s functionality is currently exposed by a single sub-command, syft packages. Pass it an image tag to generate an SBOM for that image:

syft packages alpine:latest

Syft will download the image, scan its contents, and produce a catalogue of discovered packages. The output will be shown as a table in your terminal. Each result includes the detected package name, version, and type.

The package list for this image is short. As it’s an Alpine base image, the installed packages are intentionally streamlined to provide the smallest possible surface. Larger images could contain hundreds or thousands of packages across several different formats. It can be helpful to combine Syft with existing Unix terminal tools like grep and awk to extract the data you’re looking for.

syft packages example-image:latest | grep example-package-to-find
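
Where grep matches anywhere in a row, awk can filter on a single column. The helper below is a rough sketch, not part of Syft: it assumes the table output has a header row followed by whitespace-separated name, version, and type columns, as described above. The function name list_packages_by_type is hypothetical.

```shell
# Sketch: list only the packages of one type from Syft's table output.
# Assumes a header row, then NAME, VERSION, and TYPE columns with TYPE last.
# list_packages_by_type is a hypothetical helper name.
list_packages_by_type() {
    image="$1"
    wanted="$2"
    # Skip the header row (NR > 1) and keep rows whose last column matches.
    syft packages "$image" | awk -v wanted="$wanted" 'NR > 1 && $NF == wanted { print $1, $2 }'
}
```

For example, list_packages_by_type example-image:latest apk would print the name and version of each package that Syft reports with the apk type.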

Supported Package Types

Syft supports many popular package formats across the leading operating systems and programming languages. The list includes:

  • APK (Alpine), DEB (Debian), and RPM (Fedora) OS packages.
  • Identification of Linux distributions across Alpine, CentOS, Debian, and RHEL flavors.
  • Go modules
  • Java in JAR, EAR, and WAR variations
  • NPM and Yarn packages
  • Python Wheels and Eggs
  • Ruby bundles

Although not every language is covered, you’ll still benefit from the OS-level scanning irrespective of your application’s chosen stack.

Changing the Output Format

The default output format is called table. It renders a table of results in your terminal, with one row for each detected package. An alternative human-readable format is text, which presents a list of packages with Version and Type fields nested under each entry.

Syft supports several programmatic formats too:

  • json – Save package data to a JSON structure.
  • cyclonedx – A CycloneDX report in XML format.
  • spdx and spdx-json – SPDX-compatible reports in either tag-value or JSON format.

Redirecting one of these formats to a file lets you archive the findings for later reference:

syft packages alpine:latest -o json > alpine-packages.json

The standardized CycloneDX and SPDX formats can help integrate Syft scans into your CI/CD pipelines. The data is accessible to other ecosystem tools that work with package lists and SBOM results.
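
In a pipeline you may want every machine-readable format at once. The following is a minimal sketch that loops over the format names listed above and writes one file per format. The function name generate_sboms and the sbom.<format> file naming are illustrative assumptions, not Syft conventions.

```shell
# Sketch: write one SBOM file per output format for a single image.
# generate_sboms and the sbom.<format> naming are illustrative choices.
generate_sboms() {
    image="$1"
    out_dir="$2"
    mkdir -p "$out_dir" || return 1
    # These format names are the programmatic formats listed above.
    for format in json cyclonedx spdx spdx-json; do
        syft packages "$image" -o "$format" > "$out_dir/sbom.$format" || return 1
    done
}
```

Running generate_sboms alpine:latest ./sboms would leave four files in ./sboms that a later pipeline step could upload as build artifacts.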

Syft also integrates with Grype, Anchore’s standalone vulnerability scanner for container images and filesystems. Data from Syft can be fed straight into Grype if you use the JSON output format.

syft packages example-image:latest -o json > sbom.json
grype sbom:./sbom.json

Grype will compare the package list to its index of known vulnerabilities. It’ll highlight the packages which contain problems, giving you an immediate starting point to improve your security posture.
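
The two commands can be wrapped into one step. This is only a sketch built from the commands shown above: the function name scan_image_with_grype is hypothetical, and the temporary file stands in for sbom.json so nothing is left behind in the working directory.

```shell
# Sketch: generate a JSON SBOM with Syft and scan it with Grype in one step.
# scan_image_with_grype is a hypothetical helper built from the commands above.
scan_image_with_grype() {
    image="$1"
    sbom_file="$(mktemp)" || return 1
    # Produce the JSON SBOM, then hand it to Grype via the sbom: scheme.
    syft packages "$image" -o json > "$sbom_file" && grype "sbom:$sbom_file"
    status=$?
    # Remove the temporary SBOM whether or not the scan succeeded.
    rm -f "$sbom_file"
    return "$status"
}
```

The function returns the exit status of whichever command ran last, so a failed Syft run is reported without invoking Grype.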

Using Other Image Sources

Syft can use images from other sources besides public Docker registries. You can reference any OCI-compliant image, either via a registry tag or as a saved image tar. Paths to image archives can be handed straight to Syft:

docker image save my-image:latest > my-image.tar
syft packages ./my-image.tar

Syft works with private Docker registries too. It uses your existing credentials in your ~/.docker/config.json file:

{
    "auths": {
        "registry.example.com": {
            "username": "",
            "password": ""
        }
    }
}
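
A typical flow is to log in with Docker first, so the credentials are recorded in your Docker configuration, and then scan by full image reference. The sketch below assumes a default Docker setup; registry.example.com matches the config above, while my-image:latest and the function name scan_private_image are hypothetical.

```shell
# Sketch: authenticate to a private registry, then scan an image stored there.
# scan_private_image and the image name used with it are hypothetical.
scan_private_image() {
    registry="$1"
    image="$2"
    # docker login records the credentials in your Docker config for later use.
    docker login "$registry" || return 1
    # Reference the image by its full registry path.
    syft packages "$registry/$image"
}
```

For example: scan_private_image registry.example.com my-image:latest.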

Although Syft focuses on container image scans, it can also create an SBOM for arbitrary filesystem paths. You can use Syft to index your host’s packages by scanning directories that commonly contain software binaries and libraries:

syft packages dir:/usr/bin

You must explicitly add the dir: scheme if you’re referencing a path outside your working directory. Otherwise Syft will try to interpret it as an image tag reference.
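
To index more than one directory, you could loop over them and save a report for each. This is a rough sketch using only the dir: scheme and JSON output described above; the function name scan_host_dirs and the report naming scheme are assumptions made for illustration.

```shell
# Sketch: scan several host directories and save one JSON report per directory.
# scan_host_dirs and the report naming scheme are illustrative assumptions.
scan_host_dirs() {
    out_dir="$1"
    shift
    mkdir -p "$out_dir" || return 1
    for dir in "$@"; do
        # Turn /usr/bin into usr_bin so it can be used as a file name.
        name="$(printf '%s' "${dir#/}" | tr '/' '_')"
        # The dir: scheme tells Syft this is a filesystem path, not an image tag.
        syft packages "dir:$dir" -o json > "$out_dir/$name.json" || return 1
    done
}
```

For example, scan_host_dirs ./reports /usr/bin /usr/local/bin would write usr_bin.json and usr_local_bin.json into ./reports.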

Conclusion

Syft extracts package lists from your container images. The generated data acts as an SBOM for your image, giving you a clearer picture of how large your supply chain is.

Syft is distributed as a single binary that produces reports in several different formats. It can be readily integrated into CI/CD systems to upload an SBOM artifact as part of your image build pipeline. This increases accountability and aids audit trails by recording each image’s full software list at the time it’s produced.

Adding Syft scans to your workflow keeps you informed of the packages you’re using. Once you’ve got this information, you can begin assessing each package to determine whether it’s really needed. If you find a lot of packages that aren’t used by your workload, consider switching to a minimal base image and layering only essential software on top.

James Walker
James Walker is a CloudSavvy IT contributor. He is the founder of Heron Web, a UK-based digital agency providing bespoke software development services to SMEs. He has experience managing complete end-to-end web development workflows with DevOps, CI/CD, Docker, and Kubernetes.