On GitHub pipelines and diverging branches

Just before Christmas I started a new job.  I won't get into the details, but my new team has a different workflow than I'm used to, and the other day I noticed a problem with it.  My colleague suggested that the experience might make for good blog-fodder, so here's the break-down.

First, let me start by describing the workflow I was used to at my last job.  We had a private GitLab instance and used a forking workflow, so getting a change into production went like this:

  1. Developer forks the main repo to their GitLab account.
  2. Developer does their thing and makes a bunch of commits.
  3. Developer opens a merge request from their fork's branch to the main repo's main branch.  Code review ensues.
  4. When review is complete, the QA team pulls down the code from the developer's fork and tests in a QA environment.  Obviously testing needs differ, but a "QA environment" is generally exactly the same thing as a developer environment (in this case, a set of disposable OpenStack VMs).
  5. When testing is complete, the merge request gets merged and the code will go out in the next release (whenever that is - we didn't do continuous deployment).
  6. Every night, a set of system-level tests runs against a production-like setup that uses the main branches of all the relevant repos.  Any failures get investigated by developers and QA the next morning.

I'm sure many people would quibble with various parts of this process, and I'm not going to claim there weren't problems, but it worked well enough.  But the key feature to note here is the simplicity of the branching setup.  It's basically a two-step process: you fork from X and then merge back to X.  You might have to pull in new changes along the way, but everything gets reconciled sooner or later.

The new team's process is not like that.  We use GitHub, and instead of one main branch, there are three branches to deal with: dev, test, and master, with deployment jobs linked to dev and test.  And in this case, the merging only goes in one direction.  So a typical workflow would go like this:

  1. Developer creates a branch off of master, call if "feature-X"
  2. Developer does their thing and makes a bunch of commits.
  3. Developer opens a pull request from feature-X to dev and code review ensues.
  4. When the pull request is approved, the developer merges it and the dev branch code is automatically deployed to a shared development environment where the developer can test it.  (This might not be necessary in all cases, e.g. if local testing is sufficient.)
  5. When the developer is ready to hand the code off to QA, they open a pull request from feature-X to test.  Again, review ensues.
  6. When review is done, the pull request gets merged and the test branch code is automatically deployed to test, where QA pokes at it.
  7. When QA is done, the developer opens a pull request from feature-X to master and (drum roll) review ensues.
  8. When the master pull request is approved, the code is merged and is ready to be deployed in the next release, which is a manual (but pretty frequent) process.

You might notice something odd here - we're only ever merging to dev and test, never from them.  There are occasionally merges from master to those branches, but never the other way around.  Now, in theory this should be fine, right?  As long as everything gets merged to all three branches in the same order, they'll end up with the same code.  Granted, it's three times as many pull requests to review as you really need, but other than that it should work.

Unfortunately, theory rarely matches practice.  In fact, the three branches end up diverging - sometimes wildly.  On a large team, this is easy to do by accident - Bob and Joe are both working on features, Bob gets his code merged to test first, but testing takes a long time, so Joe's code gets out of QA and into master first.  So if there are any conflicts, you have the potential for things like inconsistent resolutions.  But in our case, I found a bunch of code that was committed to the dev branch and just never made it out to test or master.  In some cases, it even looks like this was intentional.

So this creates an obvious procedural issue: the code you test in QA is not necessarily the same as what ends up in production.  This may be fine, or it may not - it depends on how the code diverges.  But it still creates an obvious risk because you don't know if the code your releasing is actually the same as what you validated.

But it gets worse.  This also creates issues with the GitHub pipeline, which is where we get to the next part of the story.

Our GitHub pipelines are set up to run on both "push" and "pull_request" actions.  We ended up having to do both in order to avoid spurious error reporting from CodeQL, but that's a different story.  The key thing to notice here is that, by default, GitHub "pull_request" actions don't run against the source branch of your pull request, they run against a merge of the source and target branches.  Which, when you think about it, is probably what you want.  That way you can be confident that the merged code will pass your checks.

If you're following closely, the problem may be evident at this point - the original code is based on master, but it needs to be merged to dev and test, which diverge from master.  That means that you can get into a situation where a change introduces breakage in code from the target branch that isn't even present in the source branch.  This makes it very hard to fix the pipeline.  Your only real choice at that point is to make another branch of the target branch, merge your code into that, and then re-create the pull request with the new merged branch.  This is annoying and awkward at best.

But it gets worse than that, because it turns out that your pipeline might report success, even if the merge result would be broken!  This appears to be a GitHub issue and it can be triggered simply by creating pull requests.  

The easiest way to explain is probably by describing the situation I actually ran into.  I had a change in my feature-X branch and wanted to go through our normal process, which involves creating three pull requests.  But in my case, this was just a pipeline change (specifically, adding PHPStan analysis), so it didn't require any testing in dev or test.  Once it was approved, it could be merged immediately.  So here's what I did:

  1. First, I created a pull request against dev.  The "pull_request" pipeline here actually failed, because there was a bunch of code in the dev branch that violated the PHPStan rules and wasn't in master, so I couldn't even fix it.  Crud.
  2. After messing around with dev for a while, I decided to come back to that and just create the pull requests for test and master.
  3. So I created the pull request for test.  That failed due to drift from master as well.  Double crud.
  4. Then I created the pull request for master.  That succeeded, as expected, since it was branched from master.  So at least one of them was reviewable.
  5. Then I went back and looked at the dev pull request and discovered that the "pull_request" pipeline job now reported as passing!

Let me say that more explicitly: the "pull_request" job on my pipeline went from "fail" to "pass" because I created a different pull request for the same branch.  There was no code change or additional commits involved.

Needless to say, this is very bad.  The point of running the pipeline on a pull request is to verify that it's safe to merge.  But if just doing things in the wrong order can change a "fail" to a "pass", that means that I can't trust the results of my GitHub pipeline - which defeats the entire purpose of having it!

As for why this happens, I'm not really certain.  But from my testing, it looks like GitHub ties the results of the "pull_request" job to the last commit on the source branch.  So when I created the pull request to dev, GitHub checked out a merge of my code and dev, ran the pipeline, and it failed.  It then stores that as part of the results for the last commit on the branch.  Then I created the master pull request.  This time GitHub runs the pipeline jobs against a merge of my code with master and the jobs pass.  But it still associates that result with the last commit on the branch.  Since it's the same commit and branch are for both pull requests, this success clobbers the failure on the dev pull request and they both report a "pass".  (And in case you're wondering, re-running the failed job doesn't help - it just runs whatever the last branch it tested was, so the result doesn't change.)

The good news is that this only seems to affect pull requests with the same source branch.  If you create a new branch with the same commits and use that for one pull request and the original for the other, they don't seem to step on each other.  In my case, I actually had to do that anyway to resolve the pipeline failures.

So what's the bottom line?  Don't manage your Git branches like this!  There are any number of valid approaches to branch management, but this one just doesn't work well.  It introduces extra work in the form of extra pull requests and merge issues; it actually creates risk by allowing divergence between what's tested and what's released; and it just really doesn't work properly with GitHub.  So find a different approach that works for you - the simpler, the better.  And remember that your workflow tools are supposed to make things easier.  If you find yourself fighting with them, then you're probably doing something wrong.

You can reply to this entry by leaving a comment below. This entry accepts Pingbacks from other blogs. You can follow comments on this entry by subscribing to the RSS feed.

Add your comments #

A comment body is required. No HTML code allowed. URLs starting with http:// or ftp:// will be automatically converted to hyperlinks.