Using Renovate's local platform to make renovate-graph more efficient
Last year I built renovate-graph, a tool to extract the dependency trees for a given repository, which uses Renovate under the hood. I've been getting tonnes of value from it as part of the wider dependency-management-data ecosystem, where it provides more actionable data for understanding how you use internal and external dependencies in your projects.
However, I've found that when running this against several larger repositories, performance starts to suffer, largely because renovate-graph is a rather hacky wrapper around Renovate.
Renovate expects to run against a fully cloned repository, so it can create branches with the expected package changes, whereas renovate-graph just needs to run against (generally) the latest commit on a repository's default branch.
One option I'd been investigating to improve performance was to expose the ability to tune the arguments Renovate passes to git clone, so we could perform a shallow clone. But then I stumbled upon Renovate's local platform, which allows us to run against a local directory (that doesn't even need to have a .git folder). That's perfect for renovate-graph, as it's a read-only operation that purely extracts the dependencies for a given repo.
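To illustrate what a shallow clone buys you, here is a minimal sketch using plain git against a throwaway local repository. This is not something renovate-graph exposes; the three-commit repository and the directory names are made up purely for the demonstration.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Build a throwaway repository with three commits to stand in for a real project.
git init --quiet "$work/origin"
for i in 1 2 3; do
  echo "change $i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" -c user.name=example -c user.email=example@example.com \
    commit --quiet -m "commit $i"
done

# Full clone: the whole history comes along.
git clone --quiet "file://$work/origin" "$work/full"
# Shallow clone: only the most recent commit is fetched.
# (A file:// URL is used because --depth is ignored for plain local paths.)
git clone --quiet --depth 1 "file://$work/origin" "$work/shallow"

full_commits="$(git -C "$work/full" rev-list --count HEAD)"
shallow_commits="$(git -C "$work/shallow" rev-list --count HEAD)"
echo "full=$full_commits shallow=$shallow_commits"
```

On a repository with years of history, skipping everything but the latest commit is where the saving would come from. The local platform goes a step further by not needing any git metadata at all.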
So what are the performance gains? Note that we're using renovate-graph v0.15.1 for these comparisons.
For a somewhat unscientific comparison, we'll use Kibana, which is a significant size: the repository checks out at 6.7GB, and the source-only archive comes to 670MB.
First, we'll run renovate-graph against GitHub, so that it clones and then processes the repo, without looking up dependency updates:
time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false npx @jamietanna/renovate-graph@0.15.1 --token $GITHUB_TOKEN elastic/kibana
# 78.37s user 29.07s system 13% cpu 13:01.17 total
Next, we'll do the same thing, but pull a source-only archive from GitHub and then process the repo, again without looking up dependency updates:
time gh api /repos/elastic/kibana/zipball/HEAD > kibana.zip
# 8.38s user 13.89s system 4% cpu 7:27.44 total
unzip kibana.zip
# 5.14s user 1.65s system 97% cpu 6.940 total
cd elastic-kibana-*
# HACK to avoid https://github.com/renovatebot/renovate/discussions/25202
rm renovate.json
time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false RG_LOCAL_PLATFORM=github RG_LOCAL_ORGANISATION=elastic RG_LOCAL_REPO=kibana npx @jamietanna/renovate-graph@0.15.1 --platform local
# 17.66s user 3.08s system 126% cpu 16.433 total
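To compare these, using the wall-clock (total) times above: cloning from GitHub and then processing took 13:01.17 (about 781s), whereas downloading the source-only archive, unzipping it and processing it with the local platform took 7:27.44 + 6.94s + 16.43s (about 471s).
So we can see that the local platform route takes roughly 40% less time end to end, or is about 1.66x faster 🎉 (if I've done that maths correctly 😅). The processing step itself drops to around 16 seconds, with most of the remaining time spent downloading the archive.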
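One way to check the maths is to convert the wall-clock totals quoted in the timing comments into seconds and compare them. This is a rough sketch: the to_seconds helper is hypothetical, and it assumes the download, unzip and processing steps all count towards the local platform total.

```shell
#!/bin/sh
set -eu

# Convert a wall-clock value such as "13:01.17" or "6.940" into seconds.
to_seconds() {
  echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%.2f\n", s }'
}

clone_total="$(to_seconds 13:01.17)"   # clone + process via GitHub
download="$(to_seconds 7:27.44)"       # gh api zipball download
unzip_time="$(to_seconds 6.940)"       # unzip
process="$(to_seconds 16.433)"         # renovate-graph --platform local

summary="$(awk -v c="$clone_total" -v d="$download" -v u="$unzip_time" -v p="$process" 'BEGIN {
  local_total = d + u + p
  printf "clone=%.2fs local=%.2fs saved=%.1f%% speedup=%.2fx", c, local_total, (c - local_total) / c * 100, c / local_total
}')"
echo "$summary"
# → clone=781.17s local=470.81s saved=39.7% speedup=1.66x
```

Bear in mind this is a single run of each approach, and the download time is dominated by the network, so the numbers are only indicative.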