This article continues my series on GitHub Actions. In this article, I will show you how to improve the execution time of your GitHub Actions workflow by using caching.
Introduction
GitHub Actions is a powerful tool for CI/CD. It is free for public repositories and has a generous free tier for private repositories. However, the free tier has limits, and one of them is execution time: private repositories on the free plan get 2,000 minutes per month. That is more than enough for most projects, but a large project with a long test suite can easily hit the limit. Shortening each workflow run is the most direct way to stay under it, and caching is one of the easiest ways to do that.
Caching in GitHub Actions allows you to store and reuse certain files or dependencies between workflow runs. By caching these artifacts, you can avoid redundant computations and reduce the time required for tasks such as installing dependencies, building packages, or compiling code.
Benefits of Caching in GitHub Actions
Here are some key benefits of using caching in GitHub Actions:
Faster Workflow Execution: Caching allows you to avoid repeating time-consuming tasks, such as downloading and installing dependencies. By storing these files in the cache, subsequent workflow runs can retrieve them quickly, reducing overall execution time.
Cost and Resource Efficiency: With caching, you can reduce resource consumption and associated costs. Instead of performing repetitive operations, you can reuse cached artifacts, optimizing the utilization of available computing resources.
Improved Developer Productivity: Faster feedback loops enable developers to iterate and test their code more frequently. By reducing the time spent waiting for workflows to complete, developers can focus on writing code and delivering features faster.
Best Practices for Caching in GitHub Actions
To leverage caching effectively in GitHub Actions, consider the following best practices:
Identify Cacheable Artifacts: Determine which files or dependencies can be cached. For example, you can cache package managers' dependencies like node_modules or pip packages. Identifying the right artifacts to cache is crucial to achieving maximum performance gains.
Define Cache Keys: Cache keys determine when the cache should be used or invalidated. GitHub Actions allows you to define custom cache keys based on specific criteria, such as the content of a file or the version of a dependency. Choosing appropriate cache keys ensures that the cache is invalidated only when necessary, preventing outdated artifacts from being reused.
Use Cache Actions: GitHub provides the actions/cache action, which is the usual way to add caching to a workflow. (The @actions/cache JavaScript package is the toolkit library behind it and is aimed at authors of custom actions.) The action stores and restores files based on a key, optional fallback keys, and one or more paths.
Balance Cache Size and Freshness: While larger caches may provide more performance benefits, it's essential to strike a balance between cache size and freshness. Storing too much in the cache leads to longer save and restore times and uses up the repository's cache quota (10 GB by default), after which GitHub starts evicting the oldest entries. Entries that have not been accessed for 7 days are removed as well. Consider periodically purging and rebuilding the cache to avoid accumulating unnecessary artifacts.
Leverage Workflow Matrix: If your workflows involve multiple platforms, versions, or configurations, consider utilizing the workflow matrix feature. By defining different matrix combinations, you can cache artifacts specific to each configuration, further improving execution times.
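As a rough sketch of how cache keys and a matrix fit together, the job below caches pip's download directory separately for each Python version. It assumes a requirements.txt file in the repository and the default pip cache location on Linux runners (~/.cache/pip). The workflow name, job name, and Python versions are placeholders and are not taken from the project used later in this article.

```yaml
name: Test with cached pip downloads  # hypothetical workflow name
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11"]  # placeholder versions
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Cache pip downloads
        uses: actions/cache@v3
        with:
          # Default pip cache directory on Linux runners
          path: ~/.cache/pip
          # One cache per OS and Python version; editing
          # requirements.txt changes the hash and therefore the key
          key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('**/requirements.txt') }}
          # Fall back to the newest cache for the same OS and Python version
          restore-keys: |
            ${{ runner.os }}-pip-${{ matrix.python-version }}-
      - name: Install dependencies
        run: pip install -r requirements.txt  # assumes this file exists
```

Putting the matrix value into the key keeps the configurations from overwriting each other's caches, while the hash of requirements.txt invalidates the cache only when the dependencies actually change.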
Enough Talk, Show Me the Code
Workflow without Caching
We'll go through two versions of the same workflow: the first without caching and the second with it. We'll then compare their execution times to see the difference. We'll use an existing FastAPI project that I created in a previous article. You can find the project here. The project uses Docker and Docker Compose to run the application. The workflow builds a Docker image and pushes it to Docker Hub. It is triggered on every push to the main branch and on every pull request that targets it, although the login and build steps are skipped for pull requests. Here is the workflow file:
name: Docker Compose Actions Workflow

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

env:
  # Use docker.io for Docker Hub if empty
  REGISTRY: docker.io
  # github.repository as <account>/<repo>
  IMAGE_NAME: ${{ github.repository }}

jobs:
  push_to_registry:
    name: Push Docker image to Docker Hub
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Log in to Docker Hub
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@818d4b7b91585d195f67373fd9cb0332e31a7175
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push Docker image
        if: github.event_name != 'pull_request'
        uses: docker/build-push-action@v4
        with:
          context: "{{defaultContext}}:src"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
Let's go through the workflow step by step:
1. Name: The name of the workflow. This is optional.
2. On: The events that trigger the workflow. In this case, the workflow is triggered on every push to the main branch and on every pull request that targets it.
3. Env: Environment variables used in the workflow. In this case, we have two environment variables: REGISTRY and IMAGE_NAME. The REGISTRY variable specifies the Docker registry to push the image to. The IMAGE_NAME variable specifies the name of the image.
4. Jobs: The workflow consists of one job called push_to_registry. The job runs on the latest version of Ubuntu.
5. Inside the push_to_registry job we specify the steps to be executed: check out the repository, set up Docker Buildx, log in to Docker Hub, extract metadata for Docker, and build and push the Docker image.
5a. Check out the repo: This step checks out the repository so that later steps can read its files. Most workflows need it, although it is not strictly required for every workflow.
5b. Set up Docker Buildx: This step sets up Docker Buildx. Docker Buildx is a CLI plugin that extends the Docker command with full support for the features provided by the Moby BuildKit builder toolkit. It provides the same user experience as docker build, with many new features like creating scoped builder instances and building against multiple nodes concurrently. You can read more about Docker Buildx here.
5c. Log in to Docker Hub: This step logs in to Docker Hub. It is only executed if the event that triggered the workflow is not a pull request. The step uses the DOCKER_USERNAME and DOCKER_PASSWORD secrets to log in, so make sure both secrets are set in your repository settings. You can read more about secrets here.
5d. Extract metadata (tags, labels) for Docker: This step uses the docker/metadata-action action, which derives image metadata from the Git reference and the GitHub event that triggered the run. The action outputs two variables: tags and labels. The tags variable contains the tags for the Docker image. The labels variable contains the labels for the Docker image. You can read more about the docker/metadata-action action here.
5e. Build and push Docker image: This step uses the docker/build-push-action action to build and push the Docker image. It is also skipped for pull requests. The action takes the following parameters:
    context: The build context, that is, the directory containing the Dockerfile. In this case, "{{defaultContext}}:src" points at the src subdirectory of the repository.
    push: Whether to push the image or not. In this case, we set it to true to push the image.
    tags: The tags for the Docker image. In this case, we use the tags output of the metadata step.
    labels: The labels for the Docker image. In this case, we use the labels output of the metadata step.
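The workflow above skips the build entirely on pull requests. One common variation, sketched here as an assumption rather than as what the project actually does, is to keep building on pull requests but push only on other events, so that a broken Dockerfile is caught before merging. The fragment reuses the step name and the meta step id from the workflow above.

```yaml
      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: "{{defaultContext}}:src"
          # Build on every event, but push only when the run
          # was not triggered by a pull request
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
```

Because the step-level if condition is gone, the step always runs; the expression passed to push decides whether the result is published.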
Now let's see the execution time on the first run:
As the image shows, the workflow took 3 minutes 25 seconds to complete. Now let's implement caching and see if we can improve the execution time.
Workflow with Caching
Using caching in GitHub Actions is pretty straightforward. You just need to add the actions/cache action to your workflow. The action takes the following parameters:
path: The files or directories to be cached. In this case, we want to cache /tmp/.buildx-cache, the directory where Buildx will keep its layer cache.
key: The key to use for restoring and saving the cache.
restore-keys: An ordered list of key prefixes to fall back on if no cache hit occurred for the key. This is optional.
The action also sets a cache-hit output, which is 'true' only when an exact match for the key was restored. Later steps can check it to skip work that the cache already covers.
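To make the parameters concrete, here is a minimal sketch of a job fragment that caches node_modules and skips installation when an exact cache match is restored. It assumes a Node.js project with a package-lock.json file, which is not part of this article's project, and the npm-cache step id is a name chosen for illustration. It relies on the cache-hit output of actions/cache, which is 'true' only when the full key matched.

```yaml
    steps:
      - uses: actions/checkout@v3
      - name: Cache node_modules
        id: npm-cache  # hypothetical step id, referenced below
        uses: actions/cache@v3
        with:
          path: node_modules
          # The key changes whenever the lock file changes
          key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}
      - name: Install dependencies
        # Run only when no exact cache match was restored
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm ci
```

There is deliberately no restore-keys entry here: npm ci removes node_modules before installing, so a partially matching cache would not save any time.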
Now let's add the actions/cache action to our workflow:
name: Docker Compose Actions Workflow

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

env:
  # Use docker.io for Docker Hub if empty
  REGISTRY: docker.io
  # github.repository as <account>/<repo>
  IMAGE_NAME: ${{ github.repository }}

jobs:
  push_to_registry:
    name: Push Docker image to Docker Hub
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Log in to Docker Hub
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Cache Docker layers
        id: cache
        uses: actions/cache@v3
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-

      - name: Build and push Docker image
        if: github.event_name != 'pull_request'
        uses: docker/build-push-action@v4
        with:
          context: "{{defaultContext}}:src"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache
Note the following changes:
The Cache Docker layers step is added before the Build and push Docker image step.
The cache-from and cache-to parameters are added to the Build and push Docker image step.
Let's break down the changes in detail:
Cache Docker layers: This step uses the actions/cache action to persist the Docker layer cache between workflow runs. The action takes the following parameters:
    path: The directory to be cached. In this case, it is /tmp/.buildx-cache, the directory that Buildx reads its layer cache from and writes it to.
    key: The key to use for restoring and saving the cache. Here we combine the runner.os and github.sha variables, so every commit saves the cache under its own unique key.
    restore-keys: An ordered list of key prefixes to use for restoring the cache if no cache hit occurred for key. Here we use the prefix built from runner.os, so a new commit, which has no exact match yet, restores the most recently saved cache for the same operating system.
Build and push Docker image: This step uses the docker/build-push-action action to build and push the Docker image. Here we added the cache-from and cache-to parameters to the action. The cache-from parameter specifies where the build imports its cache from. The cache-to parameter specifies where the build exports its cache to when it finishes. In this case, we use the type=local cache, which stores the Docker layers in a directory on the runner. src=/tmp/.buildx-cache specifies the source of the cache, and dest=/tmp/.buildx-cache specifies its destination.
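One caveat with type=local: Buildx does not remove old entries from the destination directory, so the cached directory can keep growing from run to run. A workaround described in Docker's documentation is to export to a fresh directory and swap it in afterwards. The sketch below shows the end of the job rewritten that way. The /tmp/.buildx-cache-new path is a name chosen for illustration, and mode=max, which also caches intermediate layers, is an optional addition that the workflow above does not use.

```yaml
      - name: Build and push Docker image
        if: github.event_name != 'pull_request'
        uses: docker/build-push-action@v4
        with:
          context: "{{defaultContext}}:src"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=local,src=/tmp/.buildx-cache
          # Write the new cache to a separate directory
          cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max

      - name: Move cache
        if: github.event_name != 'pull_request'
        # Replace the old cache so only the latest export is saved
        run: |
          rm -rf /tmp/.buildx-cache
          mv /tmp/.buildx-cache-new /tmp/.buildx-cache
```

The actions/cache step saves /tmp/.buildx-cache when the job finishes, so after the swap it uploads only what the latest build exported.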
Now let's see the execution time with caching enabled:
As the image shows, the workflow was executed in 15 seconds! Compared with the 3 minutes 25 seconds (205 seconds) of the uncached run, that is an improvement of (205 - 15) / 205, or approximately 92.68%. This isn't by any means a conclusive test, but it does show the potential of caching in GitHub Actions. Also note that the execution time will vary on subsequent runs, depending on whether a cache is restored and how many layers have changed.
Below is a screenshot of the cache step on GitHub Actions:
Conclusion
In this article, we saw how to use caching in GitHub Actions. We added the actions/cache action to a workflow to cache the Docker layers, and we compared the execution times with and without caching to see the performance improvement.
Feel free to reach out to me on Twitter for any questions or feedback. You can also leave a comment below.