Jenkins X Pipelines Internals Part 3 — Stages
This is the third part of a series of blog posts on the internals of the Jenkins X Pipelines. We’ll focus on the “stages” that compose a pipeline, and we’ll see how they are implemented using Tekton.
In the first part of this series, we’ve walked through everything that happened in the cluster, from the incoming GitHub WebHook event to a running Tekton pipeline. In the second part, we’ve talked about the “Meta Pipeline”, which is responsible for converting your Jenkins X Pipeline into a Tekton Pipeline. Now it’s time to dive into the internals of the Tekton pipelines, and we’ll start with an investigation of how the Jenkins X Stages — that compose a pipeline — are implemented.
So a pipeline has stages, and each stage has steps. But why do we need stages? Well, for basic pipelines you might not really need them if you only have a couple of sequential steps to run. But if you want to run more complex pipelines, with parallelism for example, then you’ll need the power of the stages.
First, a few pointers:
As you can see, a stage has a name and a list of steps, but it can also have embedded stages to run sequentially or in parallel. We’ll start simple, with a basic pipeline made of a single stage and a single step:
buildPack: none
pipelineConfig:
  pipelines:
    pullRequest:
      pipeline:
        stages:
        - name: unit-tests
          steps:
          - name: unit-tests
            command: make test
            image: golang:1.13
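To make the shape of a stage more concrete, here is a minimal sketch of it as a small data model. It is only an illustration: the class names, the field names and the leaf_stages helper are assumptions of this sketch, not the Jenkins X types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    command: str
    image: str


@dataclass
class Stage:
    name: str
    # A stage either carries its own steps...
    steps: List[Step] = field(default_factory=list)
    # ...or embeds child stages (field names are illustrative).
    stages: List["Stage"] = field(default_factory=list)    # run one after the other
    parallel: List["Stage"] = field(default_factory=list)  # run at the same time


def leaf_stages(stage: Stage) -> List[str]:
    """Return the names of the stages that actually carry steps."""
    if stage.steps:
        return [stage.name]
    names: List[str] = []
    for child in stage.stages + stage.parallel:
        names.extend(leaf_stages(child))
    return names


# The single-stage pipeline from the YAML above.
unit_tests = Stage(
    name="unit-tests",
    steps=[Step(name="unit-tests", command="make test", image="golang:1.13")],
)
print(leaf_stages(unit_tests))
```

Only the stages that carry steps matter for what follows, because those are the ones that end up as Tekton Tasks.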
As we saw in the previous blog post, stages are converted into Tekton Tasks, which are then converted into Pods. We can confirm that by running the jx get build pods -r yourGithubRepo command, which lists the pods for the given GitHub repository.
In the output, we can see 2 pods: one for the meta-pipeline task (with the “meta-” prefix) and one for our task. If we inspect our task’s pod, using the kubectl get pod xxx -o yaml command for example, we can see that it has a few containers:
- step-git-source-githubOrg-repoName-pr-1-tm8bh, which uses the gcr.io/abayer-pipeline-crd/tekton-for-jx/git-init:0.8.0-for-jx container image and the git-init command.
- step-git-merge, which uses the gcr.io/jenkinsxio/builder-jx container image and the jx step git merge command.
- step-unit-tests, which uses the golang:1.13 container image.
There are a few init containers as well:
- step-credential-initializer-sdrgc, which uses the gcr.io/abayer-pipeline-crd/tekton-for-jx/creds-init:0.8.0-for-jx container image and the creds-init command.
- step-working-dir-initializer-vzsgw, which uses the gcr.io/abayer-pipeline-crd/tekton-for-jx/bash:0.8.0-for-jx container image.
- step-place-tools, which uses the gcr.io/abayer-pipeline-crd/tekton-for-jx/entrypoint:0.8.0-for-jx container image and the entrypoint command.
And some volumes, including:
- workspace: an “emptyDir” volume, which is mounted on every container at /workspace
- home: another “emptyDir” volume, which is mounted on every container at /builder/home
- tools: another “emptyDir” volume, which is mounted on every container at /builder/tools
The init containers are inserted in the pod by Tekton’s taskrun Controller, more specifically by the MakePod function. They are used to initialize resources used by the Task’s steps:
- The credentials initializer is used to write git and/or docker credentials files in the home shared volume, where they will be available for all further steps. It is called with the -basic-git=knative-git-user-pass=https://github.com argument, which means it will retrieve the Git credentials for GitHub from the knative-git-user-pass Kubernetes Secret. This Secret comes from the Tekton Helm chart for Jenkins X and is configured in your dev environment Git repository.
- The working dir initializer is used to create the directory where our Git repository will be cloned, by running a simple mkdir -p /workspace/source in our case.
- The tools initializer is used to copy the entrypoint binary from its container image into the tools shared volume, to make it available for all further steps. It will then be used as the container entry-point for all the steps (more on that in a later blog post).
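The credentials initializer argument packs two pieces of information into one flag: the name of the Secret and the Git server it applies to. As a rough illustration of how such an argument can be split (a sketch only; parse_basic_git_flag is a hypothetical helper, not the creds-init implementation):

```python
from typing import Tuple


def parse_basic_git_flag(argument: str) -> Tuple[str, str]:
    """Split '-basic-git=<secret>=<url>' into (secret name, URL)."""
    flag, _, value = argument.partition("=")
    if flag != "-basic-git" or not value:
        raise ValueError(f"not a -basic-git argument: {argument!r}")
    # Split on the first '=' only, so a URL containing '=' stays intact.
    secret, separator, url = value.partition("=")
    if not separator or not secret or not url:
        raise ValueError(f"expected <secret>=<url>, got {value!r}")
    return secret, url


secret, url = parse_basic_git_flag(
    "-basic-git=knative-git-user-pass=https://github.com"
)
print(secret)
print(url)
```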
The non-init containers are defined by Jenkins X. We can see that by looking at the task definition, using the kubectl get taskruns.tekton.dev xxx -o yaml command for example, which has the following steps:
- git-merge, which uses the gcr.io/jenkinsxio/builder-jx container image
- unit-tests, which uses the golang:1.13 container image
So that explains 2 of the 3 containers:
- unit-tests, which is our own step
- and git-merge, which has been inserted by the stageToTask conversion function, as we saw in the previous blog post related to the meta pipeline. This step ensures that the workspace is set up with the right content.
But what about the step-git-source-... step/container? In fact, it comes from Tekton, which interprets the “input resource” defined by Jenkins X in the Task:
inputs:
  resources:
  - name: workspace
    targetPath: source
    type: git
The git resource type is converted to a Tekton GitResource. Tekton resources can implement a GetInputTaskModifier function to modify the task on which they are defined. In our case, the GitResource prepends a step that runs the git-init command.
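Here is a simplified sketch of that "prepend a step" idea. It models tasks and steps as plain dictionaries; the function names and the generated step name are assumptions made for illustration, and the real Tekton code works on typed Go structs and names the step after the bound resource, with a random suffix.

```python
from typing import Dict, List

Step = Dict[str, str]


def git_input_modifier(resource_name: str) -> List[Step]:
    """Steps a git input resource wants to run before the task's own steps."""
    return [{"name": f"git-source-{resource_name}", "command": "git-init"}]


def apply_input_resources(task: dict) -> List[Step]:
    """Return the task's steps with the resource steps prepended."""
    prepended: List[Step] = []
    for resource in task.get("inputs", {}).get("resources", []):
        if resource["type"] == "git":
            prepended.extend(git_input_modifier(resource["name"]))
    return prepended + task["steps"]


# The task as Jenkins X defines it: two steps and one git input resource.
task = {
    "inputs": {
        "resources": [{"name": "workspace", "targetPath": "source", "type": "git"}]
    },
    "steps": [
        {"name": "git-merge", "command": "jx step git merge"},
        {"name": "unit-tests", "command": "make test"},
    ],
}
print([step["name"] for step in apply_input_resources(task)])
```

The result has three steps, which matches the three non-init containers we saw in the pod.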
It is interesting to have a look at the implementation of the git-init command, because it doesn’t perform a basic git clone operation, but instead uses an optimized git fetch with the --depth=1 flag to retrieve only a single commit. It won’t retrieve the other branches or tags either: it’s up to you to retrieve them if you need them.
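To give an idea of what such a shallow fetch looks like, here is a hand-written approximation of the sequence of Git commands. It is not copied from Tekton, so the exact commands and flags used by git-init may differ; the function names are hypothetical.

```python
import subprocess
from typing import List


def shallow_fetch_commands(url: str, revision: str) -> List[List[str]]:
    """Git commands that fetch exactly one commit instead of cloning."""
    return [
        ["git", "init"],
        ["git", "remote", "add", "origin", url],
        # --depth=1 truncates the history to the requested commit only.
        ["git", "fetch", "--depth=1", "origin", revision],
        # Check out what was just fetched.
        ["git", "reset", "--hard", "FETCH_HEAD"],
    ]


def shallow_fetch(url: str, revision: str, directory: str) -> None:
    """Run the commands in an existing, empty directory (needs git installed)."""
    for command in shallow_fetch_commands(url, revision):
        subprocess.run(command, cwd=directory, check=True)


for command in shallow_fetch_commands("https://github.com/org/repo.git", "master"):
    print(" ".join(command))
```

Because only one commit is fetched, a later git log or git describe in your own steps will see a truncated history.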
This means that, as a user of Tekton, Jenkins X only requires a git workspace, and lets Tekton handle the “git clone” operation. Jenkins X is then responsible for performing the right “checkout”, because it requires a specific merge logic to merge the Pull Request branch commits on top of the master branch. See the previous blog post related to the meta pipeline for more details.
Multiple stages
What happens if we use multiple stages instead of a single one?
buildPack: none
pipelineConfig:
  pipelines:
    pullRequest:
      pipeline:
        stages:
        - name: unit-tests-1
          steps:
          - name: unit-tests
            command: make test
            image: golang:1.13
        - name: unit-tests-2
          steps:
          - name: unit-tests
            command: make test
            image: golang:1.13
This Jenkins X Pipeline will result in 2 tasks — 1 per stage — and so 2 pods.
The pod for the first stage/task has 6 containers:
- step-create-dir-workspace-b9c69, which runs the mkdir -p source command.
- step-git-source-githubOrg-repoName-pr-1-vzkpx, which runs the git-init command.
- step-git-merge, which runs the jx step git merge command.
- step-unit-tests, our own step.
- step-source-mkdir-githubOrg-repoName-pr-1-c2lhw, which runs the mkdir -p /pvc/unit-tests-1/workspace command.
- step-source-copy-githubOrg-repoName-pr-1-chjlp, which runs the cp -r source/. /pvc/unit-tests-1/workspace command.
The pod for the second stage/task has only 3 containers:
- step-create-dir-workspace-p76f7, which runs the mkdir -p /workspace/source command.
- step-source-copy-workspace-5gc5n, which runs the cp -r /pvc/unit-tests-1/workspace/. /workspace/source command.
- step-unit-tests, our own step.
Both pods also have a Kubernetes Persistent Volume Claim (“PVC”) mounted at /pvc.
Why are our 2 pods so different, when the stage from which they are built is the same? If we inspect our 2 tasks, we can see that they are almost the same, except that the first one has the following output resource declared:
outputs:
  resources:
  - name: workspace
    targetPath: source
    type: git
So Jenkins X will just ask Tekton to bind the output of the first task to the input of the second task. This is done in the stageToTask transformation function. On the Tekton side, this is handled by the AddOutputResources function, which internally uses a PVC to store the workspace content.
When you have 2 consecutive stages, they are converted into 2 consecutive tasks, which are scheduled by the Tekton pipelinerun Controller one after the other. The logic for it is in the PipelineRunState's GetNextTasks function. This implies that the second pod will only be created after the first pod has completed, and the first pod's “emptyDir” workspace does not outlive it. This is why the only way to keep data for the duration of the pipeline is to use a persistent volume.
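The scheduling rule can be summarized as "start every task whose predecessors have all completed". Here is a minimal sketch of that rule, assuming the ordering is available as a simple mapping from each task to its predecessors. The next_tasks function and its arguments are hypothetical, and Tekton's GetNextTasks works on richer state than this.

```python
from typing import Dict, List, Set


def next_tasks(
    predecessors: Dict[str, List[str]], done: Set[str], started: Set[str]
) -> List[str]:
    """Tasks not yet started whose predecessors have all completed."""
    return [
        name
        for name, deps in predecessors.items()
        if name not in started and all(dep in done for dep in deps)
    ]


# Our two consecutive stages: the second one waits for the first one.
pipeline = {"unit-tests-1": [], "unit-tests-2": ["unit-tests-1"]}

print(next_tasks(pipeline, done=set(), started=set()))
print(next_tasks(pipeline, done=set(), started={"unit-tests-1"}))
print(next_tasks(pipeline, done={"unit-tests-1"}, started={"unit-tests-1"}))
```

While the first task is running, nothing else is eligible, so the pod for the second task does not exist yet.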
Tekton will take care of inserting extra steps in your tasks to copy everything from your workspace to the persistent volume, and then from the persistent volume into the workspace. These are the step-source-copy-xxx steps, running cp commands. The PVC itself is managed by the pipelinerun Controller using the ArtifactStorage, and it has the same lifecycle as the pipeline. The PVC settings are retrieved from the config-artifact-pvc ConfigMap (see the createPVC function). This ConfigMap is created by the Tekton Helm chart for Jenkins X, and it defaults to using a 5Gi volume. You can change the volume size in the env/tekton/values.tmpl.yaml file of your dev environment Git repository.
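The round trip through the persistent volume can be reproduced locally to see what the copy steps do. The sketch below uses temporary directories to stand in for the two pods' workspaces and for the PVC; the file name coverage.txt and the helper functions are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree(src: Path, dst: Path) -> None:
    """Rough equivalent of 'mkdir -p dst && cp -r src/. dst'."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


def run_two_stages(root: Path) -> str:
    pvc = root / "pvc"  # stands in for the volume mounted at /pvc
    pvc.mkdir()

    # Stage 1: its own emptyDir workspace, gone once the pod has finished.
    workspace_1 = root / "pod-1" / "workspace" / "source"
    workspace_1.mkdir(parents=True)
    (workspace_1 / "coverage.txt").write_text("ok\n")
    copy_tree(workspace_1, pvc / "unit-tests-1" / "workspace")
    shutil.rmtree(root / "pod-1")

    # Stage 2: a fresh workspace, restored from the persistent volume.
    workspace_2 = root / "pod-2" / "workspace" / "source"
    workspace_2.mkdir(parents=True)
    copy_tree(pvc / "unit-tests-1" / "workspace", workspace_2)
    return (workspace_2 / "coverage.txt").read_text()


with tempfile.TemporaryDirectory() as tmp:
    print(run_two_stages(Path(tmp)), end="")
```

Every file in the workspace is copied twice per stage boundary, which is where the overhead comes from on large repositories.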
So if you want to split your pipeline’s steps into multiple stages, remember that it will introduce overhead for persisting the workspace between each stage.
In the next blog post, we’ll explore how the steps are implemented.