GitHub Actions: Data Flow & Data Persistence
In GitHub Actions, data is not persistent by default, nor is it available to the whole pipeline. Every step runs in its own process, and every job runs on its own runner. By default, whatever data is produced in a job is lost when the job ends.
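
To make that isolation concrete, here is a minimal sketch of a workflow that runs into both limits. The workflow name, job names, variable, and file are hypothetical, and it assumes GitHub-hosted runners, where each job starts on a fresh machine.

```yaml
# Hypothetical workflow: shows what does NOT carry over by default.
name: isolation-demo
on: workflow_dispatch

jobs:
  first:
    runs-on: ubuntu-latest
    steps:
      - name: Set a shell variable
        run: |
          GREETING="hello"
          echo "inside the step: $GREETING"

      - name: Read it in the next step
        # Each run step starts a new shell process, so GREETING is unset here.
        run: echo "next step: '${GREETING:-}'"

      - name: Write a file to the workspace
        run: echo "hello" > greeting.txt

  second:
    runs-on: ubuntu-latest
    needs: first
    steps:
      - name: Look for the file from the first job
        # Assumes a GitHub-hosted runner: this job gets a fresh machine,
        # so the file written by the first job is not present.
        run: test -f greeting.txt || echo "greeting.txt not found"
```

The shell variable is lost between steps because it lives only in the process that set it. The file is lost between jobs because it lives only on the runner that wrote it.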
How do we pass data from one process to the other, or save it for the next process?
The short and sweet answer:
Strategy | Data | Scope | Persistence | Explanation | Example |
---|---|---|---|---|---|
env | Values | Job (internal) | Ephemeral | Propagates data between steps in the same job | Pass a boolean to control whether the next step should run |
outputs | Values | Workflow (internal) | Ephemeral | Propagates data between jobs/steps in the same workflow | Pass a deployment id to the next job |
artefacts | Files | Workflow (internal & external) | Persistent | Propagates files between jobs/workflows. Intended for frequently changing data. Files are available for download after the workflow finishes. | Pass the project build to different test jobs running in parallel |
cache | Files | Workflow (internal & external) | Persistent | Propagates files inside and between workflows in the same repository. Intended for files that don't change much. | Cache npm packages for use in different workflow runs |
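
As a rough illustration of the table, the sketch below uses all four strategies in one workflow. It is not a definitive setup: the workflow, job, and step names, the `RUN_NEXT` variable, the `demo-123` value, and the `dist/` path are hypothetical, and the pinned action versions (`@v4`) are an assumption.

```yaml
# Hypothetical workflow: one small example per strategy in the table.
name: data-flow-sketch
on: workflow_dispatch

jobs:
  build:
    runs-on: ubuntu-latest
    # outputs: expose a step output so that downstream jobs can read it
    outputs:
      deployment_id: ${{ steps.deploy.outputs.deployment_id }}
    steps:
      - uses: actions/checkout@v4

      # cache: reuse files across workflow runs in the same repository
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}

      # env: pass a value to later steps of the same job
      - name: Decide whether the next step should run
        run: echo "RUN_NEXT=true" >> "$GITHUB_ENV"

      - name: Conditional step
        if: env.RUN_NEXT == 'true'
        run: echo "running because RUN_NEXT is $RUN_NEXT"

      # outputs: write a key=value pair for this step (placeholder value)
      - name: Produce an output
        id: deploy
        run: echo "deployment_id=demo-123" >> "$GITHUB_OUTPUT"

      - name: Build
        run: |
          mkdir -p dist
          echo "build output" > dist/app.txt

      # artefacts: upload files so that other jobs can download them
      - uses: actions/upload-artifact@v4
        with:
          name: build
          path: dist/

  test:
    runs-on: ubuntu-latest
    needs: build
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: build
          path: dist/

      - name: Use the output and the artefact from the build job
        run: |
          echo "deployment id: ${{ needs.build.outputs.deployment_id }}"
          cat dist/app.txt
```

Values travel through `$GITHUB_ENV` and `$GITHUB_OUTPUT`, while files travel through the cache and artefact actions, which matches the Data column of the table.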
For a more complete answer, read on.
All the workflow examples in this article can be found as files here, along with a copy of the respective redacted logs.