Tracking Deploys in Git Log
Knowing what is going on with git and many environments can be hard. In particular, it can be hard to easily know where the server environments are on the git history, and how the rest of the world relates to that. I've set up a couple of interlocking gears of tooling that help me know what's going on.
Network
One thing that I love about GitHub is its [network view](corsica net), which gives a nice high-level overview of branches and forks in a project. One thing I don't like about it is that it only shows what is on GitHub, and it is a bit light on details. So I did some hunting, and I found a set of git commands that does a pretty good job of replicating GitHub's network view.
```
$ git log --graph --all --decorate
```
I have this aliased to `git net`. Let's break it down:
- `git log` - This shows the history of commits.
- `--graph` - This adds lines between commits showing merging, branching, and all the rest of the non-linearity git allows in history.
- `--all` - This shows all refs in your repo, instead of only your current branch.
- `--decorate` - This shows the name of each ref next to each commit, like "origin/master" or "upstream/master".
This isn't that novel, but it is really nice. I often get asked what tool I'm using for this when I pull this up where other people can see it.
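If you want the same shortcut, a git alias is one way to wire it up (the post only says the command is aliased to `git net`; this is just the obvious way to do it):

```
# Make "git net" run the log invocation above.
git config --global alias.net 'log --graph --all --decorate'
```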
Cron Jobs
Having all the extra detail in my view of git's history is nice, but it doesn't help if I can only see what is on my laptop. I generally know what I've committed (on a good day), so the real goal here is to see what is in all of my remotes.
In practice, I've only set this up for my main day-job project, so the update script is specific to that project. It could be expanded to all of my git repos, but I haven't done that. To pull this off, I have this line in my crontab:
```
*/10 * * * * python2 /home/mythmon/src/kitsune/scripts/update-git.py
```
I'll get to the details of this script in the next section, but the important part is that it runs `git fetch --all` for the repo in question. To run this from a cron job, I had to switch all my remotes from the ssh protocol to https, since my SSH keys aren't unlocked when cron runs. Git knows the passwords for my https remotes thanks to its gnome-keyring integration, so this all works without user interaction.
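For reference, the switch looks something like this; the URL is illustrative, and the last line assumes the gnome-keyring credential helper from git's contrib directory is installed:

```
# Point an existing remote at its https URL instead of the ssh one.
git remote set-url origin https://github.com/mozilla/kitsune.git

# Have git store and fetch https passwords through gnome-keyring.
git config --global credential.helper gnome-keyring
```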
This has the result of keeping git up to date on what refs exist in the world. I have my teammates' repos as remotes, as well as our central master. This makes it easier for me to see what is going on.
Deployment Refs
The last bit of information I wanted to see in my local network is the state of deployment on our servers. We have three environments that run our code, and knowing what I'm about to deploy is really useful. If you look in the screenshot above, you'll notice a couple of refs that are likely unfamiliar: `deployed/stage` and `deployed/prod`, in green. This is the second part of the `update-git.py` script I mentioned above.
As a part of the SUMO deploy process, we put a file on each server that contains the current git sha. This script reads those files and makes local references in my git repo that correspond to them.
Wait, creates git refs from thin air? Yeah. This is a cool trick my friend Jordan Evans taught me about git. Since git's references are just files on the file system, you can make new ones easily. For example, in any git repo, the file `.git/refs/heads/master` contains a commit sha, which is how git knows where your master branch is. You could make new refs by editing these files manually, creating files and overwriting them to manipulate git. That's a little messy though. Instead we should use git's tools to do this.
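You can check this in any repo you have handy (the sha below is made up; note that already-packed refs live in `.git/packed-refs` instead of their own files):

```
$ cat .git/refs/heads/master
895e1e5ae31dc16dd0dceb73373e41cfcea6a897
```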
Git provides `git update-ref` to manipulate refs. For example, to make my deployment refs, I run something like `git update-ref refs/heads/deployed/prod 895e1e5ae`. The last argument can be any sort of commit reference, including `HEAD` or branch names. If the ref doesn't exist, it will be created, and if you want to delete a ref, you can add `-d`. Cool stuff.
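Combining `git update-ref` with the revision file the deploy process publishes, the by-hand version of what my script does for one environment is roughly this (using `curl` in place of the script's requests call):

```
# Point a local ref at whatever sha production is currently running.
git update-ref refs/heads/deployed/prod "$(curl -s http://support.mozilla.org/media/revision.txt)"

# Remove it again if you ever want to.
git update-ref -d refs/heads/deployed/prod
```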
All Together Now
Now, finally, the entire script. Here I am using a git helper that I wrote, which I have omitted for space. It works how you would expect, translating `git.log('some-branch', all=True)` into `git log --all some-branch`. I made a gist of it for the curious.
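The real helper lives in that gist; purely for illustration, a minimal sketch of the idea (my reconstruction, not the gist's actual code) could look like this:

```python
import subprocess


class Git(object):
    """Turn git.some_command('arg', flag=True) into `git some-command --flag arg`."""

    def __getattr__(self, name):
        subcommand = name.replace('_', '-')

        def run(*args, **kwargs):
            cmd = ['git', subcommand]
            for key, value in kwargs.items():
                flag = '--' + key.replace('_', '-')
                if value is True:
                    cmd.append(flag)  # booleans become bare flags
                elif value is not False:
                    cmd.append('{0}={1}'.format(flag, value))
            cmd.extend(args)
            # Raises CalledProcessError on failure, which the script below catches.
            return subprocess.check_output(cmd).decode('utf-8')

        return run
```

With something like that in place, `git.fetch(all=True)` shells out to `git fetch --all` and returns whatever it printed.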
The basic strategy is to fetch all remotes, then add or update the refs for the various server environments using `git update-ref`. This is run on a cron every few minutes, and it makes knowing what is going on a little easier, and git in a distributed team a little nicer.
```python
#!/usr/bin/env python
import os
import subprocess

import requests

# Each environment publishes the sha it is currently running.
environments = {
    'dev': 'http://support-dev.allizom.org/media/revision.txt',
    'stage': 'http://support.allizom.org/media/revision.txt',
    'prod': 'http://support.mozilla.org/media/revision.txt',
}


def main():
    # Work from the repo root, one directory up from this script.
    cdpath = os.path.join(os.path.dirname(os.path.realpath(__file__)), '..')
    os.chdir(cdpath)

    # Git is a thin wrapper around the git CLI (see the gist/sketch above).
    git = Git()
    print(git.fetch(all=True))

    for env_name, revision_url in environments.items():
        # The deployed/* ref won't exist yet on the first run.
        try:
            cur_rev = git.rev_parse('deployed/' + env_name).strip()
        except subprocess.CalledProcessError:
            cur_rev = None

        new_rev = requests.get(revision_url).text.strip()

        if cur_rev != new_rev:
            print('updating {0} {1} {2}'.format(
                env_name, (cur_rev or 'none')[:8], new_rev[:8]))
            git.update_ref('refs/heads/deployed/' + env_name, new_rev)


if __name__ == '__main__':
    main()
```
That's It
The general idea is really easy:
- Fetch remotes often.
- Write down deployment shas.
- Actually look at it all.
The fact that it requires a little bit of cleverness and a bit of git magic along the way means it took some time to figure out. I think it was well worth it, though.