
At my current job, while reorganizing our company's root git monorepo, we decided to adopt a .gitignore allowlist pattern.
I had previously experimented with this approach on raconn, but this monorepo is significantly larger. The complexity required a dedicated script to generate the .gitignore file effectively.
Why use an allowlist?
Pros:
- Pure state: You know exactly what is being tracked and must explicitly write tracking patterns.
- No unintended files: Prevents accidental commits of secrets, large binaries, or build artifacts.
Cons:
- Noise: Spurious files aren't listed as untracked, making them harder to spot.
- Maintenance: The .gitignore must be regenerated.
Git gotchas
By default, every line in a .gitignore file is a pattern to exclude. Files matching these patterns cannot be added to version control without git add --force and do not appear as untracked in git status.
The Git documentation states the following regarding negation:
An optional prefix "!" which negates the pattern; any matching file excluded by a previous pattern will become included again. It is not possible to re-include a file if a parent directory of that file is excluded. Git doesn't list excluded directories for performance reasons, so any patterns on contained files have no effect, no matter where they are defined.
Starting point
To build an allowlist, we start by ignoring everything with /**/* and then selectively adding file extensions. For a Rust project, a naive .gitignore might look like this:
# ignore everything
/**/*
!/.gitignore
!**/*.rs
!**/Cargo.toml
If your repository looks like this:
$ tree
Running git status will unexpectedly show only the .gitignore file:
$ git status
This happens because git does not list excluded directories for performance reasons and we ignored everything. To fix this, we must "unignore" all directories so git can traverse them to find allowed files:
# ignore everything
/**/*
# allow all directories
!**/
# allow specific files
!**/*.rs
...
It is best to explicitly tell git not to look into commonly ignored directories (like target/ or node_modules/), as they might contain allowed file extensions that we don't want tracked.
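To see the three situations side by side, here is a minimal sketch that builds a throwaway repository and asks git which files it considers untracked under each variant. The layout (crate-a, notes.txt, target/generated.rs) and the helper name untracked_with are assumptions made up for this illustration.

```shell
#!/usr/bin/env bash
set -o nounset -o pipefail -o errexit

# Throwaway repository; the layout below is a hypothetical example.
repo=$(mktemp -d)
git init --quiet "$repo"
mkdir -p "$repo/crate-a/src" "$repo/crate-a/target"
touch "$repo/notes.txt" "$repo/crate-a/Cargo.toml" \
    "$repo/crate-a/src/main.rs" "$repo/crate-a/target/generated.rs"

# Write the given patterns to .gitignore, then list every untracked file.
untracked_with() {
    printf '%s\n' "$@" > "$repo/.gitignore"
    git -C "$repo" status --porcelain --untracked-files=all
}

allowed=('!/.gitignore' '!**/*.rs' '!**/Cargo.toml')

# 1. Naive: directories stay ignored, so git never looks below the root.
naive=$(untracked_with '/**/*' "${allowed[@]}")
# 2. Directories re-included: allowed files show up, target/ content too.
fixed=$(untracked_with '/**/*' '!**/' "${allowed[@]}")
# 3. target/ denied after the directory rule: git stops descending into it.
denied=$(untracked_with '/**/*' '!**/' '/crate-a/target' "${allowed[@]}")

printf '%s\n' "naive:" "$naive" "fixed:" "$fixed" "denied:" "$denied"
rm -rf "$repo"
```

The first variant should report only .gitignore, the second adds the crate's files but also picks up target/generated.rs, and the third drops that entry again while notes.txt stays ignored throughout.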
A script is born
Managing this manually is complex, especially when mixing multiple projects and programming languages in a single monorepo. To alleviate some of the pain, I created a generation script.
I used some fun bash idioms to keep it concise.
println() { printf "%s\n" "$@"; } prints each of its parameters on its own line.
"${@/#/param}" inside a function expands to "paramA" "paramB" "paramC" when the function is called as func A B C.
Some git commands were of great help while searching for missing patterns that should be allowed.
git ls-files --others --ignored --exclude-standard lists all files that are currently ignored.
Combined with a fresh clone of the repo, it makes it easier to spot files that are tracked but ignored.
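As a complement, here is a small sketch wrapping that command next to a variant that swaps --others for --cached, which asks git for tracked files that the current patterns would ignore. The function names are hypothetical, and the --cached variant is my own addition, not necessarily the workflow used for the monorepo.

```shell
#!/usr/bin/env bash

# Ignored files that are not tracked (the command quoted above).
# Optional argument: path to the repository, defaults to the current directory.
ignored_untracked() {
    git -C "${1:-.}" ls-files --others --ignored --exclude-standard
}

# Tracked files that match an ignore pattern: each one is a candidate for a
# missing allow rule in the generator.
tracked_but_ignored() {
    git -C "${1:-.}" ls-files --cached --ignored --exclude-standard
}
```

Running tracked_but_ignored from the root of a clone after regenerating the .gitignore should ideally print nothing. Anything it lists is already in the index but would not be picked up if it were added today.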
With these helpers, we can climb the abstraction ladder: allow and deny build on println, files and exts build on allow, and the language-specific allowlists build on those.
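The two idioms are easier to trust once tried in isolation. The sketch below is a cut-down illustration: prefix_all is a hypothetical name, and allow and exts are reduced copies of the real helpers.

```shell
#!/usr/bin/env bash

# Print each argument on its own line.
println() { printf "%s\n" "$@"; }

# "${@/#/X}" matches the empty string at the start of every argument and
# replaces it with X, so each argument gains the prefix X.
prefix_all() { println "${@/#/param}"; }

# Layering the idiom: allow prefixes "!/", exts prefixes "BASE_PATH/*.".
allow() { println "${@/#/!/}"; }
exts() { local p=$1; shift; allow "${@/#/${p}/*.}"; }

prefix_all A B C
exts "crate-a/src/**" rs toml
```

Because the expansion sits inside double quotes, the * characters are never globbed by the shell and reach the .gitignore untouched.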
Here is the script:
#!/usr/bin/env -S bash -o nounset -o pipefail -o errexit
# https://blog.izissise.net/posts/gitignoreallowlist/
# helpers
println() { printf "%s\n" "$@"; }
header() {
    println \
        "# This file is generated by '$0' do NOT modify ($(date "+%Y-%m-%d"))" \
        "# Commit '$(git rev-list --max-count=1 HEAD)'"
}
denyallfiles() { println "/**/*"; }
allowalldirs() { println "!**/"; }
allow() { println "${@/#/!/}"; }
deny() { println "${@/#//}"; }
files() { # allow files, usage: files BASE_PATH [FILES]...
    local p=$1; shift 1;
    allow "${@/#/${p}/}"
}
exts() { # allow extensions, usage: exts BASE_PATH [EXTS]...
    local p=$1; shift 1;
    allow "${@/#/${p}/*.}"
}
css() { for p in "${@}"; do exts "$p" css scss coffee less; done; }
fonts() { for p in "${@}"; do exts "$p" ttf woff woff2 otf eot; done; }
shell() { for p in "${@}"; do exts "$p" sh bash; done; }
python() {
    for p in "${@}"; do
        deny "${p}/**/.venv/" "${p}/**/venv/"
        files "$p" requirements.txt
        exts "${p}/**" py
    done
}
golang() {
    for p in "${@}"; do
        files "$p" go.mod go.sum
        exts "${p}/**" go
    done
}
rust() {
    for p in "${@}"; do
        deny "${p}/target"
        exts "${p}/src/**" rs
        exts "${p}/examples/**" rs
        exts "${p}/tests/**" rs
        files "$p" \
            build.rs Cargo.toml Cargo.lock \
            rustfmt.toml clippy.toml .cargo/config.toml
    done
}
##############################
##############################
header
denyallfiles
allowalldirs
# root
allow \
    .editorconfig .gitignore .mailmap \
    "**/README.md" "**/.keep" "**/LICENSE"
rust crate-a
shell "**"
To update the allowlist, simply run:
./gengitignore.sh > .gitignore
This example generates the following:
# This should work well for crate-a
/**/*
!**/
!/.editorconfig
!/.gitignore
!/.mailmap
!/**/README.md
!/**/.keep
!/**/LICENSE
/crate-a/target
!/crate-a/src/**/*.rs
!/crate-a/examples/**/*.rs
!/crate-a/tests/**/*.rs
!/crate-a/build.rs
!/crate-a/Cargo.toml
!/crate-a/Cargo.lock
!/crate-a/rustfmt.toml
!/crate-a/clippy.toml
!/crate-a/.cargo/config.toml
!/**/*.sh
!/**/*.bash
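Adding another ecosystem follows the same shape as the rust and python functions. Here is a sketch of what a Node.js entry could look like. The node function, the web-app path, and the choice of files and extensions are all assumptions for illustration, and the helpers are repeated so the snippet runs on its own.

```shell
#!/usr/bin/env bash

# Helpers repeated from the script so this snippet is self-contained.
println() { printf "%s\n" "$@"; }
allow() { println "${@/#/!/}"; }
deny() { println "${@/#//}"; }
files() { local p=$1; shift 1; allow "${@/#/${p}/}"; }
exts() { local p=$1; shift 1; allow "${@/#/${p}/*.}"; }

# Hypothetical language function: allowlist for a Node.js package.
node() {
    local p
    for p in "$@"; do
        # Deny first so git never descends into node_modules/.
        deny "${p}/**/node_modules/"
        files "$p" package.json package-lock.json
        exts "${p}/src/**" js ts
    done
}

node web-app
```

As in the rust function, the deny line comes after the global "allow all directories" rule in the generated file, so it wins for that directory and the later extension rules cannot re-include anything inside it.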
That's all for now!