From time to time, a change comes up that needs to happen everywhere. Perhaps you got a new email address and need to update it in many maintainers files. Maybe a new legal policy requires you to add a file with a disclaimer to all of the 200 repositories your company maintains. Or you just want to clear all the trailing whitespace people accidentally committed.
Whatever the reason, sometimes we need a custom automated script to spare ourselves hours of mindless manual work. The trick is, of course, being able to write and debug it faster than it would take to do the actual task manually.
This post explains how to use gitwalk to quickly make changes to many repositories in parallel.
Gitwalk is a tool to manipulate multiple git repositories at the same time. It abstracts away all the logistics of cloning, updating and iterating over them so you can focus on which repositories will be processed and what needs to be done. Gitwalk uses simple expressions to select groups of repositories and processors to say what needs to happen for each one.
It can do a range of things from searching through the code to making and committing changes, which is what we’ll take a look at today.
For the purpose of this post, let’s say we need to add a file called DISCLAIMER to all my repositories on GitHub. The first step is to figure out the right expression that will select all of my repos. Gitwalk supports wildcards, so it will look something like this:
We’ll use gitwalk with the --dry-run option to try it out:
The glob also matched quite a few forks that don’t really need a DISCLAIMER. We can use an exclude expression to remove those matches from the list. Gitwalk will merge the results of the two expressions into a single list of repositories. Read more about the expression syntax in the docs.
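To make the include-then-exclude idea concrete, here is a conceptual sketch in plain bash. It is not gitwalk's expression syntax or its implementation: the repository names, the patterns, and the select_repos helper are all hypothetical, and ordinary shell globs stand in for gitwalk expressions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical repository names, for illustration only.
repos=("alice/tool" "alice/site" "alice/forked-lib" "alice/forked-app")

include='alice/*'             # select everything owned by alice
excludes=('alice/forked-*')   # then drop the forks

# Print every repository that matches the include pattern
# and none of the exclude patterns.
select_repos() {
    local repo pattern skip
    for repo in "${repos[@]}"; do
        # The right-hand side is left unquoted on purpose so that
        # bash treats it as a glob pattern rather than a literal string.
        [[ $repo == $include ]] || continue
        skip=0
        for pattern in "${excludes[@]}"; do
            if [[ $repo == $pattern ]]; then
                skip=1
                break
            fi
        done
        (( skip )) || printf '%s\n' "$repo"
    done
}

select_repos
```

With these made-up names, only alice/tool and alice/site are left in the final list.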
Now that our expressions match only repositories owned by me, let’s see how to commit the file to each of them.
We’ll need a short script that will create the file, commit and push it upstream. Gitwalk will simply call that for each repo as it iterates over them. You can use whatever language you like to write it. For this example, I picked bash. Here’s how it looks:
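As a minimal sketch, such a script could be written as follows. The add_disclaimer function name, the placeholder disclaimer text, the commit message, and the remote name origin are illustrative assumptions, not details taken from gitwalk. The script relies only on the working directory being inside the repository when it runs.

```shell
#!/usr/bin/env bash
# Sketch: add a DISCLAIMER file to the repository in the current
# directory, then commit it and push it upstream.
set -euo pipefail

add_disclaimer() {
    # Skip repositories that already have the file, so reruns are harmless.
    if [[ -e DISCLAIMER ]]; then
        echo "DISCLAIMER already exists, skipping" >&2
        return 0
    fi

    # Placeholder text (an assumption); replace it with the real disclaimer.
    printf '%s\n' "This software is provided as is, without warranty of any kind." > DISCLAIMER

    git add DISCLAIMER
    git commit --quiet -m "Add DISCLAIMER"
    # Push the current branch to the remote named origin (assumed name).
    git push --quiet origin HEAD
}

# Only act when the current directory is inside a git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    add_disclaimer
else
    echo "not inside a git repository, nothing to do" >&2
fi
```

The existence check is a design choice of this sketch: it lets you rerun the whole batch after a partial failure without creating duplicate commits.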
It’s almost the same set of commands you’d run when adding a new file by hand. We’ll make it executable and pass it to gitwalk as follows:
Note that the path to your script needs to be absolute (hence the variable). This is because the current working directory is changed to be inside the repository when the script is executed.
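The effect of the directory change can be reproduced without gitwalk. The sketch below is only an illustration: the abs_path helper, the add-disclaimer.sh file name, and the some-repo directory are hypothetical. It resolves a script's absolute path first, then changes directory the way a tool would before running the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the absolute path of an existing file.
abs_path() {
    local dir
    dir="$(cd "$(dirname "$1")" && pwd)"
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Set up a throwaway directory containing a tiny executable script.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
printf '#!/usr/bin/env bash\necho hello\n' > add-disclaimer.sh
chmod +x add-disclaimer.sh

# Resolve the absolute path before the working directory changes.
script="$(abs_path add-disclaimer.sh)"

# Stand-in for a tool changing into a repository before calling the script.
mkdir some-repo
cd some-repo

[[ -x add-disclaimer.sh ]] || echo "relative path no longer resolves"
"$script"   # the absolute path still works
```

Run as written, this prints "relative path no longer resolves" followed by "hello".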
The repositories will be cloned automatically when needed, so you don’t need to worry about anything else. Here’s what the above command does:
And one of the commits on GitHub:
I implemented the same example in JS as well to compare. It is, however, a fair bit more complicated, as Nodegit works at a slightly lower level than the git tool does.
Being able to automate various tasks efficiently is one of the perks of software engineering not many other professions enjoy. Clever scripts save us from hours of soul-draining, repetitive work. Occasionally, however, we underestimate the real complexity of our small scripts and it can take hours of painful debugging to get them right.
Gitwalk is here to help you with that when you need to handle several repositories at the same time. Check it out here.