Git hooks: How to automate actions in your Git repo
Protect your Git repository from mistakes, automate manual processes, gather data about Git activity, and much more with Git hooks.
If you administer a Git server, you know that lots of unexpected tasks come up over the lifecycle of a repository. Contributors commit to the wrong branch, a project manager wants to implement an approval process, developers need a specific review process, or you want certain actions to run after a successful push. There are lots of little convenience features that Git can provide, but to take advantage of them, you need to learn about Git hooks.
Git hooks are shell scripts found in the hidden .git/hooks directory of a Git repository. These scripts trigger actions in response to specific events, so they can help you automate your development lifecycle.
Although you may never have noticed them, every new Git repository includes sample scripts (12 in the listing below; the exact set depends on your Git version). Because they're shell scripts, they're extremely flexible, and there is even some Git-specific data you have access to within a Git repository.
Create a Git repository
To get started, create a sample Git repository:
$ mkdir example
$ cd !$
$ git init
Take a look at the .git/hooks directory to see some default scripts:
$ ls -1 .git/hooks/
applypatch-msg.sample
commit-msg.sample
fsmonitor-watchman.sample
post-update.sample
pre-applypatch.sample
pre-commit.sample
pre-merge-commit.sample
prepare-commit-msg.sample
pre-push.sample
pre-rebase.sample
pre-receive.sample
update.sample
The sample Git hooks included in your new repo indicate common triggers available to you. A sample becomes active once you remove the .sample suffix from its filename. For instance, when enabled, the pre-commit.sample script executes after you run git commit but before the commit is recorded, and the commit-msg.sample script executes after a commit message has been submitted.
Write a simple Git hook
Do you want to put in guardrails to prevent mistakes when making commits to your Git repository? A simple Git hook trick is to prompt the user for confirmation before they commit something to a branch.
Create a new file named .git/hooks/pre-commit and open it in a text editor. Add the following text, which queries Git for a list of the files about to be committed and for the current branch name, and then enters a while loop until it gets a response from the user:
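#!/bin/sh
# show the staged files and the target branch
echo "You are about to commit" $(git diff --cached --name-only --diff-filter=ACM)
echo "to" $(git branch --show-current)
while : ; do
    # printf plus read is portable; read -p is a Bash extension
    printf "Do you really want to do this? [y/n] "
    # read from the terminal; give up if no terminal is available
    read -r RESPONSE < /dev/tty || exit 1
    case "${RESPONSE}" in
        [Yy]* ) exit 0;;
        [Nn]* ) exit 1;;
    esac
done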
Mark the file executable:
$ chmod +x .git/hooks/pre-commit
And then try it out by creating, adding, and committing a file:
$ echo "hello git hooks" > hello.txt
$ git add hello.txt
$ git commit -m 'but warn me first'
You are about to commit hello.txt
to main
Do you really want to do this? [y/n] y
[main 125993f] but warn me first
1 file changed, 1 insertion(+)
create mode 100644 hello.txt
You can test it a second time to ensure that it lets you decline a commit.
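The same pre-commit trigger can also refuse a commit outright instead of asking. The following is only a sketch: the helper name is_protected and the list of protected branches (main and master) are assumptions made for illustration, not anything Git provides.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the branch name is on the protected list.
# The list (main, master) is an assumption for illustration.
is_protected() {
    case "$1" in
        main|master) return 0 ;;
        *) return 1 ;;
    esac
}

# Only act when this file is being run under the name pre-commit,
# which is how Git invokes it from .git/hooks.
if [ "${0##*/}" = "pre-commit" ]; then
    BRANCH=$(git branch --show-current)
    if is_protected "${BRANCH}"; then
        echo "Direct commits to ${BRANCH} are blocked by the pre-commit hook." >&2
        exit 1
    fi
fi
```

Because a non-zero exit status from pre-commit aborts the commit, no prompt is needed. Contributors can still commit on any branch that is not on the list.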
Check commits for binary data
Some binary data in a repository is generally acceptable, but there's so much binary data on some projects that it would weigh a repository's actions down if it were all committed. I use Git-portal to help manage this, but I also have a pre-commit hook to double-check that no binary blobs are making it into a commit:
#!/usr/bin/env bash
declare -a HIT
echo "Git hook executing: pre-commit..."
# read staged filenames one per line so names containing spaces survive
while IFS= read -r i; do
    # file --mime reports charset=binary for non-text data
    if file -b --mime "${i}" | grep -qi 'charset=binary'; then
        HIT+=("${i}")
        echo "${i} appears to be a binary blob."
    elif [[ "${i}" == *"blah" ]]; then
        true
        # do some stuff here
    else
        true
        # do some other stuff here
    fi
done < <(git diff --cached --name-only --diff-filter=ACM)
# reject the commit only after every staged file has been checked
if [ ${#HIT[@]} -gt 0 ]; then
    echo " WARNING: Binary data found"
    exit 1
fi
The script I use is longer than this and has allowances for Git-portal integration, but this small sample is fully functional and gives you a good idea of the logic. The script reviews each staged file (obtained with a simple git diff command), uses the file command to determine whether it's binary or not, and then takes action accordingly.
An interesting quirk of empty files is that they are treated as binary objects:
$ touch quux
$ file --mime quux
quux: inode/x-empty; charset=binary
If you're testing this script, make sure you know the MIME type of what you're testing.
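If empty files should not count as binary blobs, one option is to check the file size before asking the file command. This is a sketch under that assumption, and the helper name is_binary_blob is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for non-empty files that file(1)
# reports as charset=binary. Empty or missing files are treated as harmless.
is_binary_blob() {
    # -s is true only when the file exists and is larger than zero bytes
    [ -s "$1" ] || return 1
    file -b --mime "$1" | grep -qi 'charset=binary'
}
```

A hook could then call is_binary_blob on each staged filename and record a hit only when the function succeeds, so a placeholder created with touch no longer blocks the commit.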
Push hooks
Not everything has to happen at commit time. The act of pushing to a repository is also a valid trigger. A pre-push hook is called with parameters, so for some values, you don't have to query Git the way you might for a pre-commit hook. The parameters are:
- $1 = The name of the remote being pushed to
- $2 = The URL of the remote
If you push without using a named remote, those arguments are equal.
Information about the commits being pushed is provided as lines to the standard input in this format:
<local ref> <local sha1> <remote ref> <remote sha1>
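As a rough illustration of reading that input, the sketch below classifies each line as a create, update, or delete. It relies on Git's convention of using an all-zero object name to mean "no object"; the function names is_zero and describe_push are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the object name consists only of zeros,
# which is how Git marks a ref that does not exist on one side of the push.
is_zero() {
    case "$1" in
        *[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: read pre-push lines from standard input and
# print one summary line per ref.
describe_push() {
    while read -r local_ref local_sha remote_ref remote_sha; do
        if is_zero "${local_sha}"; then
            echo "delete ${remote_ref}"
        elif is_zero "${remote_sha}"; then
            echo "create ${remote_ref}"
        else
            echo "update ${remote_ref}"
        fi
    done
}
```

Inside a pre-push hook, calling describe_push with no redirection would consume the lines Git supplies, so a hook that needs them for anything else has to do that work in the same loop.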
I use a pre-push hook to synchronize offline storage with Git pushes. When a user pushes a commit, their local stash of binary blobs (3D models, 4K images, and other artifacts too large for the Git repo) is copied to the remote storage mirror. This script does that:
#!/usr/bin/env bash
declare -a HIT
declare -a STORAGE

# resolve each matching remote name to its URL
function populate() {
    for i in "${HIT[@]}"; do
        STORAGE+=("$(git ls-remote --get-url "${i}")")
    done
}

# copy the portal directory to each storage URL
function portalsync() {
    for i in "${STORAGE[@]}"; do
        echo "Syncing _portal content to ${i}"
        rsync --rsh=ssh -av --exclude-from="${TOP}/.portalexclude" --progress "${TOP}/${PORTAL}" "${i}" || echo "rsync failed."
    done
}

# Git lists every ref being pushed on standard input, one per line
REF=""
while read -r local_ref local_sha remote_ref remote_sha
do
    # remember a push to master even when other refs follow it
    if [ "${remote_ref}" = "refs/heads/master" ]; then
        REF=$remote_ref
    fi
done

if [ "${REF}" = "refs/heads/master" ]; then
    echo "master destination detected"
    TOP=`git rev-parse --show-toplevel || false`
    PORTAL=_img
    # dump portal remote names into HIT, then their URLs into STORAGE
    HIT=(`git remote | grep "_portal"`)
    if [ ${#HIT[@]} -gt 0 ]; then
        populate
        echo "Syncing _portal content..."
        portalsync
    fi
fi
Despite the apparent verbosity of the script, it's actually pretty simple because it reads the ref information that Git supplies on standard input when it calls pre-push. The one thing that Git doesn't provide automatically is the top level of the repo, which can make referring to file paths tricky. My hack for this is simple:
TOP=`git rev-parse --show-toplevel || false`
This creates the variable $TOP to represent the outermost bounds of the Git repository directory.
Functionally, you now can set absolute paths starting at $TOP.
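For instance, a small wrapper can turn any repository-relative name into an absolute path, no matter which subdirectory the command is run from. This is a sketch only, and the function name repo_path is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print an absolute path for a file named relative to
# the top level of the current Git repository.
repo_path() {
    # the assignment fails, and the function returns 1, outside a repository
    TOP=$(git rev-parse --show-toplevel) || return 1
    printf '%s/%s\n' "${TOP}" "$1"
}
```

Calling repo_path .portalexclude from anywhere inside the working tree prints the full path to that file at the top of the repository.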
Git hooks
It's important to note that Git hooks aren't committed to a Git repository themselves. They're local, untracked files. When you write an important hook that you want to keep around, copy it into a directory managed by Git!
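One way to do that, offered here as an assumption rather than as the setup used in this article, is to keep hooks in a tracked directory and point Git at it with the core.hooksPath setting (available in Git 2.9 and later). The directory name hooks and the function name use_tracked_hooks are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: tell the repository at $1 to run hooks from a tracked
# directory ($2, default "hooks") instead of .git/hooks.
use_tracked_hooks() {
    repo=$1
    dir=${2:-hooks}
    mkdir -p "${repo}/${dir}" || return 1
    # A relative core.hooksPath is resolved from the directory where hooks
    # run, which is the top of the working tree in a normal repository.
    git -C "${repo}" config core.hooksPath "${dir}"
}
```

The setting lives in the repository's local configuration, which is not cloned either, so each contributor has to run the command once after cloning.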
Git hooks are an important aspect of Git that is too often overlooked because they are hidden away. Although only 12 are bundled as samples in a repository, there are many more kinds of hooks you can use, so use the man githooks command for details on the kinds of triggers available. Once you feel comfortable using Git hooks, they can protect your Git repository from silly mistakes, automate manual processes, gather data about Git activity, and do much more.
Seth Kenlon
Seth Kenlon is a UNIX geek and free software enthusiast.