The Many Flavors of Ignore Files

A bug report in git-pkgs led me down a rabbit hole: files that git ignored were showing up as phantom diffs, and the cause turned out to be go-git’s gitignore implementation, which doesn’t match git’s actual behavior for unanchored patterns in nested directories. I went looking for a Go library that fully matched git’s pattern semantics and couldn’t find one, so I wrote git-pkgs/gitignore with a wildmatch engine modeled on git’s own wildmatch.c.

Building that made me appreciate how much complexity hides behind .gitignore, and got me thinking about all the other tools with their own ignore files. Most claim to use “gitignore syntax” without specifying which parts, and that phrase turns out to be doing a lot of work. Every tool wants to be git until it has to implement git’s edge cases.

gitignore

Most people know that *.log ignores log files and node_modules/ ignores the node_modules directory. But gitignore does far more than simple glob matching. I covered the basics in Git’s Magic Files, but getting a correct implementation working forced me to deal with all of it. The gitignore docs describe the behavior in prose; the real authority is the implementation in dir.c and wildmatch.c, with tests in t0008-ignores.sh and t3070-wildmatch.sh.

Four layers of patterns. Git doesn’t just read one .gitignore file. It checks patterns from four sources in order of increasing priority: the global excludes file (core.excludesFile, defaulting to ~/.config/git/ignore), then .git/info/exclude for repo-local patterns that aren’t committed, then the root .gitignore, then .gitignore files in each subdirectory. A pattern in src/.gitignore only applies to files under src/. Patterns in deeper directories override patterns in parent directories, and the last matching pattern wins. If you’re debugging why a file isn’t being ignored (or why it is), git check-ignore -v <path> will tell you exactly which pattern in which file is responsible.

Anchored vs. unanchored patterns. A pattern with no slash in it, like *.log, is unanchored and matches at any depth because git effectively prepends **/ to it. But the moment a pattern contains a slash, including a leading /, it becomes anchored to its .gitignore’s directory. This distinction is where go-git’s implementation broke down for us.

Pattern	Matches	Doesn’t match	Why
`debug.log`	`debug.log`, `logs/debug.log`		Unanchored, matches at any depth
`/debug.log`	`debug.log` at root only	`logs/debug.log`	Leading `/` anchors to root
`doc/frotz`	`doc/frotz`	`a/doc/frotz`	Contains `/`, so anchored
`build/`	`build/` (dir), `src/build/` (dir)	`build` (file)	Trailing `/` restricts to directories

Wildcards. * matches any string within a single path segment but does not cross / boundaries. ? matches exactly one character, also not /. These follow the rules of git’s wildmatch.c, which is subtly different from shell globbing or Go’s filepath.Match.

Doublestar **. Only special when it appears as a complete path segment between slashes: **/logs matches logs at any depth, logs/** matches everything under logs/, and foo/**/bar matches foo/bar, foo/a/bar, foo/a/b/c/bar with zero or more intermediate directories. But foo**bar is not special because the stars aren’t a standalone segment; they’re just two regular * wildcards that won’t cross a /.

Bracket expressions. [abc] matches one character from the set, ranges like [a-z] and [0-9] work as expected, and both [!a-z] and [^a-z] negate the match. All 12 POSIX character classes are supported: [:alnum:], [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], [:xdigit:]. You can mix classes with ranges in a single expression: [a-c[:digit:]x-z]. The edge cases are where it gets interesting: ] as the first character after [ is a literal member of the class, not the closing bracket. Ranges are byte-value ordered, so [B-a] matches bytes 66 through 97, which includes uppercase B through Z, several symbols, and lowercase a.

Directory-only patterns. A trailing / means the pattern only matches directories, so build/ matches the directory build but not a file named build, and it also matches everything inside that directory because once a directory is ignored git skips it entirely and never looks at its contents.

Negation. A leading ! re-includes something a previous pattern excluded. The subtlety is that you can’t re-include a file if its parent directory was already excluded, because git never descends into the excluded directory to check. To ignore everything except one nested path, you need to re-include each intermediate directory:

/*
!/foo
/foo/*
!/foo/bar

This ignores everything except foo/bar. You have to re-include foo/, then re-exclude foo/*, then re-include foo/bar. Skipping the middle step means foo/bar stays excluded.

Escaping. A backslash makes the next character literal, so \!important matches a file literally named !important rather than being a negation pattern, and \#comment matches a file named #comment rather than being treated as a comment line.

Trailing spaces. Unescaped trailing spaces on a pattern line are stripped, but trailing tabs are not. A backslash before a trailing space preserves it. Leading spaces are always significant: ` hello is a valid pattern matching a file named hello`.

Tracked files are immune. If a file is already tracked by git, adding it to .gitignore does nothing. You need git rm --cached first. This is probably the single most common source of confusion with gitignore. There’s also git update-index --assume-unchanged which tells git to pretend a tracked file hasn’t changed, useful for local config tweaks you don’t want showing up in git status.

Everything else

.gitignore is the original. Then the copies, roughly in order of how likely you are to encounter them:

.dockerignore for Docker build context
.npmignore for npm package publishing
.prettierignore, .eslintignore, .stylelintignore for JavaScript linters and formatters
.hgignore for Mercurial
.containerignore for Podman and Buildah (the OCI alternative to .dockerignore)
.gcloudignore for Google Cloud
.vercelignore for Vercel (.nowignore was the legacy name)
.slugignore for Heroku
.ebignore for AWS Elastic Beanstalk
.cfignore for Cloud Foundry
.helmignore for Helm charts
.artifactignore for Azure DevOps
.funcignore for Azure Functions
.vscodeignore for VS Code extension packaging
.chefignore for Chef
.bzrignore for Bazaar
.cvsignore for CVS
.ignore, .rgignore, .agignore for ripgrep and the silver searcher

How others differ

Docker’s is probably the most consequential ignore file after git’s, because it affects build context size and therefore build speed and layer caching. But it’s still just one flat file with no cascading, no per-directory overrides, and no global config. The pattern matching differs in subtle ways too: gitignore automatically prepends **/ to unanchored patterns so they match at any depth, while Docker’s implementation (using Go’s filepath.Match under the hood) doesn’t do the same implicit anchoring. The @balena/dockerignore npm package has good documentation on these differences.

npm’s is interesting because of its inverted relationship with package.json. You can use a files array in package.json to allowlist instead of blocklist, and if you do, .npmignore is ignored. If there’s no .npmignore at all, npm falls back to .gitignore, which catches people out when they publish packages and find that their dist/ directory was excluded because gitignore told npm to skip it. Running npm pack --dry-run before publishing shows you exactly which files would be included, which would have saved me hours the first time I hit this.

Mercurial’s .hgignore is more powerful than gitignore. It lets you choose your syntax per section with syntax: glob or syntax: regexp, and you can combine both in the same file, switching between them as needed. Glob patterns for the simple stuff, a regex for that one weird build artifact naming scheme, all in one file. It’s the only ignore file I know of that gives you regex, and the ability to mix syntaxes is something git never adopted.

“Uses gitignore syntax”

Most tools say “uses gitignore syntax” in their docs. What they usually mean is: glob patterns, one per line, # for comments, maybe ! for negation. That’s a reasonable subset, but the differences bite you when you assume full compatibility.

Some don’t support negation at all, some don’t support comments, and some treat * as matching directory separators while others don’t. Doublestar ** is supported by most but not all, and trailing / for directory-only matching varies enough between tools that you can’t assume it works the same way everywhere.

The underlying cause is implementation diversity. Tools using Go’s filepath.Match get different behavior from tools using the ignore npm package, which get different behavior from tools using Python’s pathspec library, which get different behavior from tools calling out to git’s own matching code. Each reimplementation makes slightly different choices about edge cases, and the gitignore spec is informal enough that these choices are all defensible. This is exactly what I ran into with go-git: it’s a mature, widely-used library, and its gitignore implementation still doesn’t handle unanchored patterns correctly in nested directories.

A proper compatibility matrix across all these tools (supports negation? comments? doublestar? directory-only matching? cascading?) would be useful reference material. I haven’t found one, and writing it would mean empirically testing each tool rather than trusting their docs. Create a test fixture directory with files designed to probe each feature, write the ignore file, run the operation, and see what actually gets included. The tricky part is that each tool’s operation is different: npm pack --dry-run, docker build, git status, eslint .. You’d need per-tool test harnesses.

CommonIgnore

One corner of the ecosystem actually tried to consolidate rather than adding yet another format. ripgrep and the silver searcher (ag) both deprecated their tool-specific ignore files (.rgignore and .agignore) in favor of a shared .ignore file. ripgrep’s precedence chain is .gitignore then .ignore then .rgignore, with each layer overriding the previous. BurntSushi extracted the matching logic into the ignore crate (part of the ripgrep monorepo, 91M+ downloads), and other tools like fd picked it up too. It’s tool-agnostic by convention rather than by any formal standard, but it’s the closest anyone has come to sharing an ignore format across tools.

Markdown had a similar problem for years. Every tool claimed to support “Markdown” but each implemented a slightly different dialect, with different rules for edge cases around nesting, link parsing, and emphasis. CommonMark fixed this by writing an unambiguous formal spec with hundreds of examples that serve as a test suite. Now tools can test their parser against the spec rather than guessing at intent, and users can rely on consistent behavior across implementations.

It’s not hard to imagine something similar for ignore files. Git’s documentation describes the behavior in prose, which leaves room for interpretation on things like how * interacts with /, whether ** must be surrounded by separators, and what happens when bracket ranges span from uppercase to lowercase. A formal spec with a shared test suite could let tool authors say “we implement level 1” (basic globs and comments) or “level 2” (add negation and doublestar) rather than the current vague gesture at gitignore compatibility. The wildmatch test cases in git’s own test suite are a starting point, but they only cover pattern matching, not the layering, anchoring, and directory semantics that trip up most implementations.