Utilising Renovate's local platform to make renovate-graph more efficient


Last year I built renovate-graph, a tool to extract the dependency trees for a given repository, which under the hood uses Renovate. I've been getting tonnes of value from it as part of the wider dependency-management-data ecosystem, where it provides more actionable data for understanding how you use internal and external dependencies in your projects.

However, I've found that when running this against several larger repositories, the performance starts to suffer, largely due to the way that renovate-graph is a rather hacky wrapper around Renovate.

Whereas Renovate is expected to run against a fully cloned repository, so it can create branches with expected package changes, renovate-graph just needs to run against (generally) the latest branch of a repository.

One option I'd been investigating to improve performance was to expose the ability to tune the arguments passed to git clone by Renovate, so we could perform a shallow clone, but then I stumbled upon Renovate's local platform.
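For reference, a shallow clone done by hand looks roughly like the sketch below. This is plain git rather than anything Renovate exposes, the `shallow_clone` helper name is made up for illustration, and it assumes git is installed.

```shell
# Hypothetical helper: fetch only the most recent commit of the default branch.
# $1: repository URL, $2: destination directory
shallow_clone() {
  # --depth 1 truncates history to a single commit (and implies --single-branch)
  git clone --depth 1 "$1" "$2"
}

# Example usage (needs network access, so left commented out):
# shallow_clone https://github.com/elastic/kibana kibana
```

This skips most of the history download, but it still produces a git checkout, which renovate-graph doesn't strictly need.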

The local platform allows us to run against a local directory (that doesn't even need to have a .git folder), which is perfect for renovate-graph as it's a read-only operation to purely extract the dependencies for a given repo.

So what are the performance gains? Note that we're using renovate-graph v0.15.1 for these comparisons.

For a somewhat unscientific comparison, we'll use Kibana, which is a significant size: the repository checks out at 6.7GB, and the source-only archive is 670MB.

First, we'll run renovate-graph against GitHub, which clones and then processes the repo, with dependency update lookups disabled:

time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false npx @jamietanna/renovate-graph@v0.15.1 --token $GITHUB_TOKEN elastic/kibana
# 78.37s user 29.07s system 13% cpu 13:01.17 total

Next, we'll do the same processing, but this time pull a source-only archive from GitHub and run against it with the local platform, again with dependency update lookups disabled:

time gh api /repos/elastic/kibana/zipball/HEAD > kibana.zip
# 8.38s user 13.89s system 4% cpu 7:27.44 total
unzip kibana.zip
# 5.14s user 1.65s system 97% cpu 6.940 total
cd elastic-kibana-*
# HACK to avoid https://github.com/renovatebot/renovate/discussions/25202
rm renovate.json

time env LOG_LEVEL=warn RG_DELETE_CLONED_REPOS=true RG_INCLUDE_UPDATES=false RG_LOCAL_PLATFORM=github RG_LOCAL_ORGANISATION=elastic RG_LOCAL_REPO=kibana npx @jamietanna/renovate-graph@v0.15.1 --platform local
# 17.66s user 3.08s system 126% cpu 16.433 total

To compare these:

platform=github    platform=local
781s               454s
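As a rough sketch of where those figures come from, the wall-clock totals reported by `time` can be converted to seconds like so. The `to_seconds` helper and the variable names are made up for illustration. Note that the download and unzip steps alone come to about 454s, and adding the 16.4s renovate-graph run would bring the local total to roughly 471s.

```shell
# Hypothetical helper: convert a `time` total ("M:SS.ss" or plain "SS.sss") to seconds.
to_seconds() {
  echo "$1" | awk -F: '{ if (NF == 2) print $1 * 60 + $2; else print $1 }'
}

github_total=$(to_seconds 13:01.17)  # clone + process via platform=github
download=$(to_seconds 7:27.44)       # gh api zipball download
unzip_time=$(to_seconds 6.940)       # unzip kibana.zip
local_run=$(to_seconds 16.433)       # renovate-graph --platform local

# Download + unzip only
fetch_total=$(awk -v d="$download" -v u="$unzip_time" 'BEGIN { printf "%.2f", d + u }')
# Download + unzip + renovate-graph run
local_total_all=$(awk -v f="$fetch_total" -v r="$local_run" 'BEGIN { printf "%.2f", f + r }')

echo "platform=github: ${github_total}s"
echo "download + unzip: ${fetch_total}s"
echo "download + unzip + run: ${local_total_all}s"
```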

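So we can see that the platform=github run takes roughly 72% longer than the platform=local run (781s vs 454s), which works out as the local platform cutting total time by about 42% 🎉 (if I've done that maths correctly 😅)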
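To double-check that maths, here's a minimal sketch using the two totals from the comparison above. The variable names are my own, and it assumes both figures are wall-clock seconds.

```shell
# Totals (in seconds) taken from the comparison above
github_total=781
local_total=454

# How much longer platform=github takes, relative to platform=local
slower_pct=$(awk -v g="$github_total" -v l="$local_total" 'BEGIN { printf "%.0f", (g / l - 1) * 100 }')

# How much wall-clock time platform=local saves, relative to platform=github
saved_pct=$(awk -v g="$github_total" -v l="$local_total" 'BEGIN { printf "%.0f", (1 - l / g) * 100 }')

echo "platform=github takes ${slower_pct}% longer than platform=local"
echo "platform=local saves ${saved_pct}% of the platform=github time"
```

The two percentages describe the same gap from different baselines, so it's worth being explicit about which one you're quoting.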

Written by Jamie Tanna.

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.

#blogumentation #renovate #dependency-management-data.

This post was filed under articles.
