Building tools that talk to package registries means making thousands of requests to npm, RubyGems, PyPI, and crates.io while you iterate, and at some point you start to feel bad about it. Recorded HTTP fixtures go stale, mocking sixteen different registry protocols by hand is its own project, and pointing the test suite at the real thing means every red-green cycle is a few hundred more requests to infrastructure other people are paying for. I wanted something I could point git-pkgs and brief at locally that would answer like the real registries do but only fetch from upstream once.
proxy is a single Go binary that speaks the wire protocols of npm, PyPI, RubyGems, Cargo, Go modules, Maven, NuGet, Composer, Hex, pub.dev, Conan, Conda, CRAN, Debian, RPM, and the OCI container registry. Start it, point a package manager at localhost:8080, and the first install fetches from upstream and writes the artifact to local storage; every install after that is served from the cache. Metadata responses are rewritten on the way through so tarball URLs point back at the proxy rather than the origin, which is the part most simple HTTP caches get wrong.
proxy &
npm_config_registry=http://localhost:8080/npm/ npm install
GOPROXY=http://localhost:8080/go,direct go build
pip install --index-url http://localhost:8080/pypi/simple/ requests
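To make the metadata rewriting concrete, here is a minimal sketch of what it could look like for an npm packument. It assumes the proxy serves npm under http://localhost:8080/npm and keeps the upstream path unchanged; the function name and that URL layout are illustrative assumptions, not the proxy's actual code.

```python
import copy
from urllib.parse import urlsplit

# Assumed local prefix for the npm handler; the real layout may differ.
PROXY_BASE = "http://localhost:8080/npm"


def rewrite_packument(packument: dict, proxy_base: str = PROXY_BASE) -> dict:
    """Return a copy of an npm packument whose tarball URLs point at the proxy."""
    rewritten = copy.deepcopy(packument)
    for meta in rewritten.get("versions", {}).values():
        dist = meta.get("dist", {})
        tarball = dist.get("tarball")
        if tarball:
            # Keep the upstream path, swap scheme and host for the proxy's.
            path = urlsplit(tarball).path
            dist["tarball"] = proxy_base.rstrip("/") + path
    return rewritten
```

Without this step the client would read the metadata from the proxy and then download the tarball straight from the origin, so the cache would never see the artifact.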
That’s all I originally wanted from it: a local set of registry endpoints I could hammer from tests without bothering anyone. It’s still a side project for me, mostly a test bed where I can try package-registry experiments against real protocol handlers without first convincing sixteen upstream teams to ship them.
But sitting between a package manager and its registry turns out to be a useful place to stand, and once those handlers existed it was also most of the way to being a free, single-binary alternative to Artifactory or Nexus for people who just want a caching mirror without the rest of the platform attached.
CI
The dependency caching built into GitHub Actions and its equivalents works at the filesystem level: tar up ~/.npm or ~/.cargo, key it on a hash of the lockfile, restore it next time. That’s fine until the lockfile changes by one package and the whole cache key misses, or a matrix build spreads the same dependency set across six OS and runtime combinations that each get their own tarball.
A registry-protocol cache sits one level up, keyed on package coordinates rather than filesystem layout, so lodash-4.17.21.tgz is stored once and served to every job that wants it regardless of what else changed in the lockfile or which runner is asking. Point it at S3 or Postgres instead of the default SQLite-and-local-disk and it can be shared across runners. Longer term I’d like to see this wired directly into something like Forgejo’s CI runner, so every job on an instance gets a shared package cache and a cooldown policy by default rather than every repo having to configure it.
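The difference between the two keying schemes is easy to show in miniature. The sketch below is an in-memory illustration only: the class, the tuple key, and the fake upstream function are all hypothetical, and the real proxy stores artifacts on disk or in S3 rather than in a dict.

```python
from typing import Callable, Dict, Tuple

Coordinate = Tuple[str, str, str]  # (ecosystem, name, version)


class CoordinateCache:
    """Stores each artifact once, keyed on what it is rather than who asked."""

    def __init__(self, fetch_upstream: Callable[[Coordinate], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._artifacts: Dict[Coordinate, bytes] = {}
        self.upstream_fetches = 0

    def get(self, ecosystem: str, name: str, version: str) -> bytes:
        key = (ecosystem, name, version)
        if key not in self._artifacts:
            # First request for this coordinate: go upstream exactly once.
            self._artifacts[key] = self._fetch_upstream(key)
            self.upstream_fetches += 1
        return self._artifacts[key]


def fake_upstream(coordinate: Coordinate) -> bytes:
    """Hypothetical stand-in for a real registry download."""
    ecosystem, name, version = coordinate
    return f"{ecosystem}/{name}-{version}".encode()
```

Two jobs with different lockfiles that both need lodash 4.17.21 call get with the same coordinate, so the second one is a hit no matter what else differs between them.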
Mirroring
The proxy can be told to fetch packages before anything asks for them. proxy mirror takes PURLs, or a CycloneDX or SPDX SBOM, and pulls every listed artifact into the cache:
proxy mirror pkg:npm/[email protected] pkg:cargo/[email protected]
proxy mirror --sbom sbom.cdx.json
Feed it the SBOM for a repository and you have an offline mirror of exactly that project’s dependency tree, which is the shape you want for air-gapped builds or for warming a cache before a CI fleet starts pulling. The same operation is exposed as POST /api/mirror on the running server for driving it from a pipeline.
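For a sense of what the SBOM input boils down to, here is a rough sketch of pulling PURLs out of a CycloneDX JSON document. It relies only on the components array and per-component purl field from the CycloneDX format; the function name is made up for illustration, and how proxy mirror actually reads SBOMs may differ.

```python
import json
from typing import List, Optional


def purls_from_cyclonedx(sbom_text: str) -> List[str]:
    """Collect unique PURLs from a CycloneDX JSON SBOM, in document order."""
    sbom = json.loads(sbom_text)
    found: List[str] = []

    def walk(components: Optional[list]) -> None:
        for component in components or []:
            purl = component.get("purl")
            if purl and purl not in found:
                found.append(purl)
            # Components may nest further components.
            walk(component.get("components"))

    walk(sbom.get("components"))
    return found
```

The resulting list is the same shape as the PURL arguments the command accepts directly, so an SBOM is effectively a long argument list with provenance attached.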
Cooldowns
Because the proxy is rewriting metadata responses anyway, it can also edit them. The cooldown setting strips any version younger than a configured age from the version lists it returns:
cooldown:
  default: "3d"
  ecosystems:
    npm: "7d"
  packages:
    "pkg:npm/lodash": "0"
With that config, a version published to npm an hour ago doesn’t exist as far as anything behind the proxy is concerned, and won’t for a week. I wrote about why I think cooldowns are the single most effective supply-chain control most projects aren’t using; the short version is that almost every malicious-package incident is caught within a day or two of publish, so a build that can’t see anything younger than three days was never exposed in the first place.
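The filtering itself is small enough to sketch. The code below assumes the most specific setting wins (package, then ecosystem, then default) and that durations are an integer followed by d, h, or m, with a bare "0" meaning no cooldown; both are my reading of the config above rather than a statement of how the proxy parses it, and all function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict

CONFIG = {
    "default": "3d",
    "ecosystems": {"npm": "7d"},
    "packages": {"pkg:npm/lodash": "0"},
}


def parse_age(text: str) -> timedelta:
    """Parse "3d", "12h", "30m", or "0" (assumed duration syntax)."""
    if text == "0":
        return timedelta(0)
    units = {"d": "days", "h": "hours", "m": "minutes"}
    return timedelta(**{units[text[-1]]: int(text[:-1])})


def cooldown_for(config: dict, ecosystem: str, name: str) -> timedelta:
    """Pick the most specific cooldown: package, then ecosystem, then default."""
    purl = f"pkg:{ecosystem}/{name}"
    if purl in config.get("packages", {}):
        return parse_age(config["packages"][purl])
    if ecosystem in config.get("ecosystems", {}):
        return parse_age(config["ecosystems"][ecosystem])
    return parse_age(config.get("default", "0"))


def visible_versions(
    published: Dict[str, datetime], cooldown: timedelta, now: datetime
) -> Dict[str, datetime]:
    """Drop every version that has not yet aged past the cooldown."""
    return {v: t for v, t in published.items() if now - t >= cooldown}
```

Because the filter runs on the version list rather than on the download, a resolver behind the proxy simply picks the newest version it can see and never learns that a younger one exists.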
A few package managers have grown a native setting for this since I wrote that post, but doing it at the proxy means one config covers every ecosystem at once, including the ones that haven’t.
Web UI
There’s a web UI on / for browsing what’s in the cache: packages by ecosystem, hit counts, size, a source browser for reading files inside cached tarballs without extracting them, and a diff view for comparing two cached versions of the same package file by file. The enrichment API behind it (/api/package/{ecosystem}/{name}/{version}) returns licence, publish date, latest version, and any OSV advisories for a given coordinate, which is the same lookup git-pkgs needs, so the two share that code.
There’s a lot more I want the UI to do, most of it on the list of things worth stealing from npmx: bundle composition, install-size sunbursts, typosquat warnings. A registry frontend that only shows you packages you’ve actually installed is a slightly different design problem from one that fronts all four million packages on npm, and I haven’t fully worked out what that should look like yet.
Next
Upstream merging would put an internal index and the public registry behind one URL, with the internal names shadowing the public ones. That is the dependency-confusion defence that pip in particular has no native answer for. I’d also like the proxy to enforce dependency policy more broadly than just cooldowns (licence allowlists, blocked package names, version floors). There’s no shared format for writing those policies down, though, so whatever config the proxy grows will be yet another one, and I’d rather that problem got solved upstream of any individual tool.
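Since upstream merging doesn't exist yet, the following is only a sketch of the shadowing rule as I'd expect it to behave; the index type, the function name, and the package names in the usage note are invented for illustration.

```python
from typing import Dict, List, Tuple

Index = Dict[str, List[str]]  # package name -> available versions


def resolve(name: str, internal: Index, public: Index) -> Tuple[str, List[str]]:
    """Return which index answers for a name, and the versions it offers."""
    if name in internal:
        # Never merge version lists: a higher public version must not win.
        return "internal", list(internal[name])
    if name in public:
        return "public", list(public[name])
    raise KeyError(name)
```

If an internal package called acme-billing sits at 1.2.0 and someone publishes acme-billing 99.0.0 publicly, the resolver only ever offers 1.2.0. Merging the two version lists instead is exactly what makes dependency confusion work.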
The experiment I’m most looking forward to is a wrapper mode, proxy npm install express, where the binary finds or starts a server, sets NPM_CONFIG_REGISTRY or GOPROXY or PIP_INDEX_URL as appropriate, and execs the underlying command. Add alias npm="proxy npm" to your shell and every install on your machine is cached and cooled down without ever touching an .npmrc.
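A rough sketch of the environment half of that wrapper follows. It reuses the localhost:8080 URLs from the examples earlier in the post, skips the "find or start a server" step entirely, and uses hypothetical names throughout; the eventual implementation would live in the Go binary rather than in Python.

```python
import os
from typing import Dict, List, Mapping

PROXY = "http://localhost:8080"  # assumed address of an already running proxy

# Command name -> (environment variable, value pointing at the proxy).
REGISTRY_ENV = {
    "npm": ("NPM_CONFIG_REGISTRY", f"{PROXY}/npm/"),
    "go": ("GOPROXY", f"{PROXY}/go,direct"),
    "pip": ("PIP_INDEX_URL", f"{PROXY}/pypi/simple/"),
}


def wrapped_env(command: str, base_env: Mapping[str, str]) -> Dict[str, str]:
    """Copy the environment, adding the registry override for known commands."""
    env = dict(base_env)
    if command in REGISTRY_ENV:
        variable, value = REGISTRY_ENV[command]
        env[variable] = value
    return env


def run(argv: List[str]) -> None:
    """Replace the current process with the wrapped command (does not return)."""
    os.execvpe(argv[0], argv, wrapped_env(argv[0], os.environ))
```

Setting the override in the child's environment rather than in a config file is what keeps the wrapper from ever touching an .npmrc: nothing persists once the command exits.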
Most of the heavy lifting is in modules shared with the rest of the git-pkgs tooling: the manifest and lockfile parsers, the PURL handling, the SBOM readers, the OSV client. Each new registry backend is a few hundred lines of protocol handler on top of that, plus a config snippet for the install page.
Contributions are very welcome on this one, particularly protocol handlers for the registries still unticked in the README (Helm, Swift, Alpine, Arch) and anything that makes the browse UI more useful. Tell me what you’d want it to do on Mastodon or the issue tracker.
brew install git-pkgs/git-pkgs/proxy / github.com/git-pkgs/proxy