
At my current job, while reorganizing our company's root git monorepo, we decided to adopt a .gitignore allowlist pattern.
I had previously experimented with this approach on raconn, but this monorepo is significantly larger. The complexity required a dedicated script to generate the .gitignore file effectively.
Why use an allowlist?
Pros:
- Pure state: You know exactly what is being tracked and must explicitly write tracking patterns.
- No unintended files: Prevents accidental commits of secrets, large binaries, or build artifacts.
Cons:
- Noise: Spurious files aren't listed as untracked, making them harder to spot.
- Maintenance: The .gitignore must be regenerated.
Git gotchas
By default, every line in a .gitignore file is a pattern to exclude. Files matching these patterns cannot be added to version control easily and do not appear as untracked in git status.
The Git documentation states the following regarding negation:
An optional prefix "!" which negates the pattern; any matching file excluded by a previous pattern will become included again. It is not possible to re-include a file if a parent directory of that file is excluded. Git doesn't list excluded directories for performance reasons, so any patterns on contained files have no effect, no matter where they are defined.
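The parent-directory rule is easy to check in a scratch repository. The following is a minimal sketch, assuming a throwaway repo created with mktemp and a hypothetical build/keep.txt file; neither name comes from the monorepo discussed here.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository with a single file inside a directory.
repo=$(mktemp -d)
cd "$repo"
git init --quiet .
mkdir build
touch build/keep.txt

# Case 1: the directory itself is excluded, so git never looks inside it
# and the negated pattern has no effect.
printf '%s\n' 'build/' '!build/keep.txt' > .gitignore
case1=$(git status --porcelain --untracked-files=all)

# Case 2: only the directory's contents are excluded, so the negation applies.
printf '%s\n' 'build/*' '!build/keep.txt' > .gitignore
case2=$(git status --porcelain --untracked-files=all)

printf 'case 1:\n%s\ncase 2:\n%s\n' "$case1" "$case2"
```

In the first case only .gitignore is reported as untracked; in the second, build/keep.txt is listed as well.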
Starting point
To build an allowlist, we start by ignoring everything with /**/* and then selectively adding file extensions. For a Rust project, a naive .gitignore might look like this:
# ignore everything
/**/*
!/.gitignore
!**/*.rs
!**/Cargo.toml
If your repository looks like this:
$ tree
Running git status will unexpectedly show only the .gitignore file:
$ git status
This happens because git does not list excluded directories for performance reasons and we ignored everything. To fix this, we must "unignore" all directories so git can traverse them to find allowed files:
# ignore everything
/**/*
# allow all directories
!**/
# allow specific files
!**/*.rs
...
It is best to explicitly tell git not to look into commonly ignored directories (like target/ or node_modules/), as they might contain allowed file extensions that we don't want tracked.
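The three stages (naive allowlist, directories allowed, target/ denied again) can be compared side by side. This is only a sketch under assumptions: the layout is hypothetical, every file lives under a crate-a directory, and visible is a helper made up for this example that rewrites .gitignore and lists what git would offer to track.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init --quiet .
mkdir -p crate-a/src crate-a/target/debug
touch crate-a/Cargo.toml crate-a/src/main.rs crate-a/notes.txt
touch crate-a/target/debug/generated.rs

# Hypothetical helper: write the given patterns to .gitignore, then print
# the untracked files that git still sees.
visible() {
    printf '%s\n' "$@" > .gitignore
    git status --porcelain --untracked-files=all | sed 's/^?? //'
}

# Directories are ignored, so the negated patterns are never consulted.
naive=$(visible '/**/*' '!/.gitignore' '!**/*.rs' '!**/Cargo.toml')

# Directories are traversed again, including target/.
with_dirs=$(visible '/**/*' '!**/' '!/.gitignore' '!**/*.rs' '!**/Cargo.toml')

# target/ is denied after the directory allow, so the last match excludes it.
with_deny=$(visible '/**/*' '!**/' '!/.gitignore' '/crate-a/target/' \
    '!**/*.rs' '!**/Cargo.toml')

printf -- '--- naive\n%s\n--- directories allowed\n%s\n--- target/ denied\n%s\n' \
    "$naive" "$with_dirs" "$with_deny"
```

With directories allowed but target/ not denied, crate-a/target/debug/generated.rs is picked up because it matches the .rs pattern, which is exactly the case the explicit deny is there to prevent.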
A script is born
Managing this manually is complex, especially when mixing multiple projects and programming languages in a single monorepo. To alleviate some of the pain, I created a generation script.
I used some fun bash idioms to keep it concise.
println() { printf "%s\n" "$@"; } prints each of its parameters on its own line.
"${@/#/param}" inside a function expands to "paramA" "paramB" "paramC" when the function is called as func A B C.
Some git commands were a great help while searching for missing patterns that should be allowed.
git ls-files --others --ignored --exclude-standard lists all untracked files that are currently ignored.
Together with a fresh clone of the repo, this makes it easier to spot files that are tracked but ignored.
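As a small illustration of that command, the sketch below builds a scratch repository with a reduced allowlist and asks git which untracked files the patterns currently hide. The file names (src/schema.sql, docs/notes.txt) are made up for the example.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
cd "$repo"
git init --quiet .
mkdir src docs
touch src/main.rs src/schema.sql docs/notes.txt

# A reduced allowlist: only .gitignore and Rust sources are allowed.
printf '%s\n' '/**/*' '!**/' '!/.gitignore' '!**/*.rs' > .gitignore

# Untracked files that the current patterns ignore.
ignored=$(git ls-files --others --ignored --exclude-standard)
printf '%s\n' "$ignored"
```

Here src/schema.sql shows up in the list, which is the cue to decide whether an allow pattern for .sql files is missing.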
With these helpers, we can go up the abstraction ladder: build allow and deny functions, then build files and exts on top of them to create language-specific allowlists.
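The two idioms and the first rung of that ladder can be tried in isolation. In this sketch println, allow and deny are the same one-liners as in the script, while prefix_demo is a name invented for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print every argument on its own line.
println() { printf "%s\n" "$@"; }

# "${@/#/param}" anchors an empty match at the start of each argument (#)
# and replaces it with "param", so every argument gets that prefix.
prefix_demo() { println "${@/#/param}"; }

# The same substitution with "!/" or "/" as the prefix gives allow and deny.
allow() { println "${@/#/!/}"; }
deny() { println "${@/#//}"; }

lines=$(println one two)
prefixed=$(prefix_demo A B C)
rules=$(deny crate-a/target; allow crate-a/Cargo.toml README.md)

printf '%s\n' "$lines" "$prefixed" "$rules"
```

Because the substitution is applied to "$@" as a whole, each helper handles any number of paths without an explicit loop.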
Here is the script:
#!/usr/bin/env -S bash -o nounset -o pipefail -o errexit
# https://blog.izissise.net/posts/gitignoreallowlist/
# helpers
println() { printf "%s\n" "$@"; }
header() {
    println \
        "# This file is generated by '$0' do NOT modify ($(date "+%Y-%m-%d"))" \
        "# Commit '$(git rev-list --max-count=1 HEAD)'"
}
denyallfiles() { println "/**/*"; }
allowalldirs() { println "!**/"; }
allow() { println "${@/#/!/}"; }
deny() { println "${@/#//}"; }
files() { # allow files, usage: files BASE_PATH [FILES]...
    local p=$1; shift 1;
    allow "${@/#/${p}/}"
}
exts() { # allow extensions, usage: exts BASE_PATH [EXTS]...
    local p=$1; shift 1;
    allow "${@/#/${p}/*.}"
}
css() { for p in "${@}"; do exts "$p" css scss coffee less; done; }
fonts() { for p in "${@}"; do exts "$p" ttf woff woff2 otf eot; done; }
shell() { for p in "${@}"; do exts "$p" sh bash; done; }
python() {
    for p in "${@}"; do
        deny "${p}/**/.venv/" "${p}/**/venv/"
        files "$p" requirements.txt
        exts "${p}/**" py
    done
}
golang() {
    for p in "${@}"; do
        files "$p" go.mod go.sum
        exts "${p}/**" go
    done
}
rust() {
    for p in "${@}"; do
        deny "${p}/target"
        exts "${p}/src/**" rs
        exts "${p}/examples/**" rs
        exts "${p}/tests/**" rs
        files "$p" \
            build.rs Cargo.toml Cargo.lock \
            rustfmt.toml clippy.toml .cargo/config.toml
    done
}
##############################
##############################
header
denyallfiles
allowalldirs
# root
allow \
    .editorconfig .gitignore .mailmap \
    "**/README.md" "**/.keep" "**/LICENSE"
rust crate-a
shell "**"
To update the allowlist, simply run:
./gengitignore.sh > .gitignore
This example generates the following:
# This should work well for crate-a
/**/*
!**/
!/.editorconfig
!/.gitignore
!/.mailmap
!/**/README.md
!/**/.keep
!/**/LICENSE
/crate-a/target
!/crate-a/src/**/*.rs
!/crate-a/examples/**/*.rs
!/crate-a/tests/**/*.rs
!/crate-a/build.rs
!/crate-a/Cargo.toml
!/crate-a/Cargo.lock
!/crate-a/rustfmt.toml
!/crate-a/clippy.toml
!/crate-a/.cargo/config.toml
!/**/*.sh
!/**/*.bash
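Adding another language follows the same shape as the rust and python helpers. The sketch below is purely illustrative: the node function, the web-app path and the choice of files and extensions are assumptions made for the example and are not part of the script above. The helper definitions are repeated so the snippet runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Helpers copied from the generation script.
println() { printf "%s\n" "$@"; }
allow() { println "${@/#/!/}"; }
deny() { println "${@/#//}"; }
files() { local p=$1; shift 1; allow "${@/#/${p}/}"; }
exts() { local p=$1; shift 1; allow "${@/#/${p}/*.}"; }

# Hypothetical helper for a Node.js project, not in the original script.
node() {
    for p in "${@}"; do
        # Keep dependency directories out even though they contain .js files.
        deny "${p}/**/node_modules/"
        files "$p" package.json package-lock.json
        exts "${p}/src/**" js ts
    done
}

generated=$(node web-app)
printf '%s\n' "$generated"
```

As in the rust helper, the deny comes first so that the directory stays excluded regardless of which extensions are allowed afterwards.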
Lints
Since keeping .gitignore up to date now involves manual human interaction, mistakes are bound to happen. To catch these, I've set up two linting steps that run in our CI pipeline.
1. Check if .gitignore is stale
This ensures that the committed .gitignore matches the current output of the generation script. If someone updates the script but forgets to redirect the output to the file, the CI will fail.
diff <(sed '1d;2d' .gitignore) <(./gengitignore.sh | sed '1d;2d') \
|| {
echo "Error: .gitignore is stale. Please run './gengitignore.sh > .gitignore' and commit the changes." >&2;
exit 1;
}
2. Check for tracked but ignored files
Because we use an allowlist, it's possible for a file to be tracked in Git history while technically being ignored by the current patterns. This usually happens when a file was added before the allowlist was implemented or if a pattern was removed.
We use the following command to ensure no files are currently being tracked (--cached) while also being matched by an ignore pattern (--ignored):
! git ls-files --ignored --cached --exclude-standard | grep -q . || {
echo "Error: The following tracked files are currently ignored by .gitignore:" >&2;
git ls-files --ignored --cached --exclude-standard
exit 1;
}
That's all for now!