Additional lessons learned running Renovate at scale

I've written before about why I love Renovate for dependency management and some lessons learned self hosting Renovate.
What I've not yet shared are some top tips for running Renovate "at scale" that I've found useful, in my own personal capacity.
This post is primarily aimed at folks who operate Renovate as-a-service for other parts of their organisation, whether that's Mend Renovate CLI (the Open Source project), Mend Renovate Community Edition, or Mend Renovate Enterprise Edition.
This draws on ~12 months of running Renovate against 100s of repositories (some of which are large monorepos, such as elastic/kibana), as well as my time at Deliveroo, running against ~1000 repositories.
My definition of "at scale" is likely at odds with that of some of the folks in the community running Renovate, but these are things I've found to be worth sharing, which I feel will still work at that more significant scale, as well as possibly at smaller scales.
Note that examples below are based on Renovate v39.x. There will have been some changes since then that may tweak log message names, or add additional logging to understand what's going on.
Understand teams' usage
As I've written about before, understanding how teams use your software, especially if you're providing internal tools or services, is really important.
Renovate is an incredibly powerful tool, with a tonne of different configuration options that provide great flexibility, but some of that configurability can lead to performance trade-offs, or supply chain security risks. If you're providing Renovate-as-a-service or -as-a-platform, it helps to understand how it's being used.
In my case, I wrote a tool to export teams' Renovate configuration to SQLite (my favourite database) so I could introspect it more easily.
This allows me and my team to have a single database which includes all our users' Renovate configuration, which we can query with SQLite's JSON operators, e.g.:
-- how many repositories are onboarded to Renovate?
select
count(*)
from
renovate_configs
where
json_extract(config, '$.enabled') is null
or json_extract(config, '$.enabled') != false;
Or:
-- which repositories do we have which are using non-Elastic presets?
select
distinct organisation,
repo,
json_extract(renovate_configs.config, '$.extends') as extends
from
renovate_configs
where
-- NOTE that this is a bit of an odd way to do this, suggestions to improve are welcome!
not exists (
select
1
from
json_each(
json_extract(renovate_configs.config, '$.extends')
)
where
json_each.value like '%>elastic/renovate-config%'
)
and exists (
select
1
from
json_each(
json_extract(renovate_configs.config, '$.extends')
)
where
json_each.value like '%>%'
)
For instance, if you find that dozens of repositories have the same customManagers or customDatasources available, maybe you should provide that in a shared preset (with relevant tests), instead of it being copy-pasted.
Additionally, it can be helpful to know if teams are tuning prConcurrentLimit or branchConcurrentLimit, as that then has performance implications for your platform.
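As a rough sketch of that last check, the snippet below uses Python's built-in sqlite3 module to list repositories that set either limit. It assumes the same renovate_configs table as the queries above (organisation, repo, and a JSON config column), only looks at top-level settings (not values set inside packageRules or presets), and relies on a SQLite build with the JSON functions available. The function name and sample rows are made up for illustration.

```python
import json
import sqlite3

# Assumed schema, mirroring the queries above: one row per repository, with
# the repository's Renovate configuration stored as a JSON string in `config`.
QUERY = """
select
  organisation,
  repo,
  json_extract(config, '$.prConcurrentLimit') as pr_limit,
  json_extract(config, '$.branchConcurrentLimit') as branch_limit
from
  renovate_configs
where
  json_extract(config, '$.prConcurrentLimit') is not null
  or json_extract(config, '$.branchConcurrentLimit') is not null
order by
  organisation,
  repo
"""


def repos_tuning_concurrency(conn):
    """Return (organisation, repo, pr_limit, branch_limit) for repos that set either limit."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    # In-memory stand-in for the real export, with made-up rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table renovate_configs (organisation text, repo text, config text)"
    )
    conn.executemany(
        "insert into renovate_configs values (?, ?, ?)",
        [
            ("example-org", "api", json.dumps({"prConcurrentLimit": 50})),
            ("example-org", "docs", json.dumps({"extends": ["config:recommended"]})),
        ],
    )
    for row in repos_tuning_concurrency(conn):
        print(row)
```

The same shape of query should work for other options you care about, by swapping the JSON paths.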
We've found it's been very useful being able to understand teams' usage of Renovate, in addition to asking teams how they're using it.
Although we've not got there yet, this is the sort of thing that could definitely be integrated with data visualisation tools like Evidence (or internally at Elastic, we may look at exporting it into the Elastic stack to then visualise with Kibana) to provide more clear insights on key usage patterns.
Ingest your logs
If you're using Renovate, but not ingesting your (debug) logs anywhere for later querying, you're really missing out.
Make sure that you're ingesting the full DEBUG log level, as there's a wealth of knowledge you can gain about the operation of the platform. It lets you detect issues with repositories (either repo-level configuration issues, or platform issues like "we forgot to add authentication to our package registry"), and generally understand how Renovate works a lot better.
It's possible that you may also want to create metrics off the back of some of these events, to understand how your platform is operating at a high level.
Note that - as far as I'm aware - there's no official stability guarantee for the format of the logs, or the naming of the JSON keys, so there could be changes over time to be aware of.
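To give a feel for turning these events into metrics, here is a minimal Python sketch that counts log lines by level and message. It assumes the logs are available as one JSON object per line, and that the numeric levels follow the Bunyan convention (the samples in this post use 20 for debug and 40 for warnings). The function names are illustrative only.

```python
import json
from collections import Counter

# Bunyan-style numeric levels (assumption: your Renovate logs use this scheme).
LEVEL_NAMES = {10: "trace", 20: "debug", 30: "info", 40: "warn", 50: "error", 60: "fatal"}


def parse_log_lines(lines):
    """Yield one dict per valid JSON log line, skipping anything that does not parse."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict):
            yield entry


def count_by_level_and_message(lines):
    """Count log entries keyed by (level name, msg)."""
    counts = Counter()
    for entry in parse_log_lines(lines):
        level = entry.get("level")
        name = LEVEL_NAMES.get(level, "unknown") if isinstance(level, int) else "unknown"
        counts[(name, str(entry.get("msg", "")))] += 1
    return counts
```

Some messages embed URLs or package names, so in practice you would likely want to restrict this to a known set of messages before exporting it as a metric.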
Make the logs downloadable
As much as the log visualisation tooling you're using will be nice, it's also nice to be able to work with the plain logs.
For instance, if you're running in a Cloud environment, you may end up getting a lot of extra Cloud-y metadata in your logging platform, which makes it hard to work out what's actually from Renovate.
I've also found that when you have the raw debug logs, you can do things like write tools to better visualise them, or perform your own additional queries or transformation on top of it.
Monitoring
There are a number of key things to keep an eye on, as a platform owner for Renovate infrastructure. These are more Renovate-specific insights to understand the health of the platform, over more standard checks like "how much CPU am I using" or "are any pods crashing".
This is a high level view of some of the areas I recommend keeping an eye on, but you may have organisation-specific views you're interested in.
How long are repos taking to process?
It's quite important to understand how long repositories are taking to process - both in terms of the user experience and in terms of your operational and capacity planning.
There are a number of factors that feed into the time Renovate takes to process a repository such as (non-exhaustive):
- CPU/memory limits
- network connectivity
- external factors (e.g. third party package registry response time)
- number of branches being processed
- size of repository
- external commands being processed
- whether file-/Redis-based caching is in place, and cache hits vs misses
Additionally, depending on your usage model, it may be more or less impactful when there are repositories that take longer.
In an individual repository's case, if your repository takes ~30m to process, that means it'll be at best ~30m from "rebase this PR" to completing the Renovate run, but at worst, you may have a Renovate run already processing against the repo which you need to wait for.
On the platform side, if you have a central GitLab CI deployment or Mend Renovate Community Edition, you'll be processing one repository at a time, which means that one slow repository delays the rest of the repositories from being processed.
If you run Renovate as a GitHub App, the Installation Access Token that is used to authenticate as the GitHub App has a maximum lifetime of 60 minutes. If you have repositories that are regularly getting close to this, you may need to leverage other mechanisms to speed up the repository processing, otherwise you'll start having failures.
This metadata can be discovered through the Repository timing splits (milliseconds) log line:
{
"name": "renovate",
"hostname": "...",
"pid": 1583,
"level": 20,
"logContext": "51324681-7e46-4f37-84fe-e29d8f774f9d",
"repository": "...",
"splits": {
"init": 103569,
"extract": 225472,
"lookup": 181992,
"onboarding": 1,
"update": 1461590
},
"total": 1996736,
"msg": "Repository timing splits (milliseconds)",
"time": "2025-08-23T08:45:15.447Z",
"v": 0
}
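A small Python sketch for surfacing slow repositories from these log lines is below. It assumes one JSON object per line with the fields shown above; the 45 minute default threshold is an arbitrary choice to leave headroom before the 60 minute token lifetime, and the function name is hypothetical.

```python
import json

SPLITS_MSG = "Repository timing splits (milliseconds)"


def slow_repositories(lines, threshold_ms=45 * 60 * 1000):
    """Return (repository, total_ms, slowest_split) for runs at or above threshold_ms, slowest first."""
    slow = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or entry.get("msg") != SPLITS_MSG:
            continue
        total = entry.get("total")
        if not isinstance(total, (int, float)) or total < threshold_ms:
            continue
        splits = entry.get("splits")
        if not isinstance(splits, dict):
            splits = {}
        # Name of the phase (init, extract, lookup, ...) that took the longest.
        slowest = max(splits, key=splits.get) if splits else None
        slow.append((entry.get("repository"), total, slowest))
    return sorted(slow, key=lambda item: item[1], reverse=True)
```

Knowing which split dominates gives a hint about where to look first, for instance a long lookup phase may point at registry response times or cache misses.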
How often do repos process?
Related to the above, it's also useful to understand how often your repositories are actually processing.
For instance, if you say that all repositories will process twice a day (i.e. at midnight and noon), does that happen?
A useful option here is to use a Service Level Objective (SLO) metric in your observability tooling to surface this, and give you an indication of whether you're meeting expectations for your platform.
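If your observability tooling doesn't make this easy, a rough version can be computed offline. The Python sketch below assumes you have already extracted (repository, finish time) pairs, for example from the time field of each run's final log line, and treats a repository as breaching when the gap between two consecutive runs exceeds a chosen maximum. The 13 hour default (twice a day, plus some slack) and the function names are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_time(value):
    """Parse timestamps like 2025-08-24T14:46:41.037Z into timezone-aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def slo_compliance(runs, max_gap=timedelta(hours=13)):
    """Return (fraction of repositories within max_gap, sorted list of breaching repositories).

    runs is an iterable of (repository, ISO 8601 finish time) pairs.
    """
    by_repo = defaultdict(list)
    for repo, finished in runs:
        by_repo[repo].append(parse_time(finished))

    breaching = []
    for repo, times in by_repo.items():
        times.sort()
        gaps = (later - earlier for earlier, later in zip(times, times[1:]))
        if any(gap > max_gap for gap in gaps):
            breaching.append(repo)

    total = len(by_repo)
    fraction = (total - len(breaching)) / total if total else 1.0
    return fraction, sorted(breaching)
```

Note that this only sees repositories that ran at least twice in the window, so a repository that has stopped processing entirely needs a separate check against your list of onboarded repositories.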
Repository result
When Renovate finishes processing a repository, it will emit a Repository result log line:
{
"name": "renovate",
"hostname": "...",
"pid": 340129,
"level": 20,
"logContext": "42c51bd1-68fd-4221-a96b-98c3ed6323a5",
"repository": "...",
"msg": "Repository result: done, status: activated, enabled: true, onboarded: true",
"time": "2025-08-24T14:46:41.037Z",
"v": 0
}
As part of this, we can see the state of the repository, and whether action(s) have been performed.
There are a few other values for msg that may appear, such as (non-exhaustive):
Repository result: disabled-closed-onboarding, status: disabled, enabled: false, onboarded: undefined
Repository result: config-validation, status: onboarded, enabled: true, onboarded: true
Each of these can be insightful for an at-a-glance view of the state of repositories in your organisation, and for instance allow flagging up cases where there are repositories in different states.
Note that I'm working upstream to expose this as first-class metadata to avoid parsing the log message itself.
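Until that lands, the message can be parsed with a regular expression. The sketch below is in Python and is based purely on the message shapes shown above, so it may need adjusting for other Renovate versions; the function name is hypothetical.

```python
import re

# Pattern derived from the sample messages in this post (an assumption, not a
# documented format).
RESULT_PATTERN = re.compile(
    r"^Repository result: (?P<result>[^,]+), status: (?P<status>[^,]+), "
    r"enabled: (?P<enabled>[^,]+), onboarded: (?P<onboarded>.+)$"
)

# Anything other than "true" or "false" (such as "undefined") maps to None.
_BOOLEANS = {"true": True, "false": False}


def parse_repository_result(msg):
    """Return a dict with result, status, enabled and onboarded, or None if msg does not match."""
    match = RESULT_PATTERN.match(msg)
    if match is None:
        return None
    fields = match.groupdict()
    fields["enabled"] = _BOOLEANS.get(fields["enabled"])
    fields["onboarded"] = _BOOLEANS.get(fields["onboarded"])
    return fields
```

For example, calling parse_repository_result on the first sample message yields a result of "done" and a status of "activated", with enabled and onboarded both True.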
repository problems and repoProblems
Before Renovate logs Repository result, it will log if there are any "repository problems".
This provides a single view of any important issues with the repository, and this summary also appears on your Dependency Dashboard.
For instance:
{
"hostname": "...",
"level": 20,
"logContext": "2e96d778-dd03-4280-872b-f1f1ba6decab",
"msg": "repository problems",
"name": "renovate",
"pid": 457962,
"repoProblems": [
"WARN: Failed to assign reviewer"
],
"repository": "...",
"time": "2025-08-25T08:11:10.313Z",
"v": 0
}
A lot of the time, these are issues that you - as platform owner - need to investigate.
However, sometimes they will be down to user configuration issues so It Depends™️
You can dig into them more with a little bit of effort.
For instance, given the above, we can now look up warning log messages (level=40) with the message Failed to assign reviewer (msg="Failed to assign reviewer"), and we'll find:
{
"branch": "...",
"err": {
"err": {
"code": "ERR_NON_2XX_3XX_RESPONSE",
"message": "Response code 422 (Unprocessable Entity)",
"name": "HTTPError",
"options": {
"headers": {
"accept": "application/json, application/vnd.github.machine-man-preview+json",
"accept-encoding": "gzip, deflate, br",
"authorization": "***********",
"content-length": "55",
"content-type": "application/json",
"user-agent": "..."
},
"hostType": "github",
"http2": false,
"method": "POST",
"password": "",
"url": "...",
"username": ""
},
"response": {
"body": {
"documentation_url": "https://docs.github.com/rest/pulls/review-requests#request-reviewers-for-a-pull-request",
"errors": [
"Could not add requested reviewers to pull request."
],
"message": "Validation Failed",
"status": "422"
},
"headers": {
},
"httpVersion": "1.1",
"retryCount": 0,
"statusCode": 422,
"statusMessage": "Unprocessable Entity"
},
"stack": "HTTPError: Response code 422 (Unprocessable Entity) ...",
"timings": {
}
},
"hostType": "github",
"message": "external-host-error",
"stack": "Error: external-host-error ..."
},
"hostname": "...",
"level": 40,
"logContext": "dd8cdf98-55cb-4db6-b2ef-45141acd74f5",
"msg": "Failed to assign reviewer",
"name": "renovate",
"pid": 352056,
"repository": "...",
"time": "2025-08-24T16:40:08.214Z",
"v": 0
}
In this specific case, it's likely user error - trying to assign a team that doesn't have a minimum of Write access to the repository.
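To see which problems are most widespread before digging into individual repositories, something like the following Python sketch can group the repoProblems entries by message. It assumes one JSON object per log line, shaped like the "repository problems" sample above; the function name is made up.

```python
import json
from collections import defaultdict


def problems_by_message(lines):
    """Map each repoProblems entry to the sorted list of repositories reporting it."""
    repos = defaultdict(set)
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or entry.get("msg") != "repository problems":
            continue
        problems = entry.get("repoProblems")
        if not isinstance(problems, list):
            continue
        for problem in problems:
            if isinstance(problem, str):
                repos[problem].add(str(entry.get("repository", "unknown")))
    return {problem: sorted(names) for problem, names in repos.items()}
```

A problem that shows up across many repositories is more likely to be a platform issue, whereas one confined to a single repository is more likely to be down to that repository's configuration.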
Package lookup failures (+ warnings)
Another common error that may appear in repository problems is the failure to look up packages.
Again, this could be user error, e.g. a reference to non-existent.docker.registry.com, or it could be that your platform is missing configuration.
You'll see more information in the log messages around what packages failed to be looked up:
{
"files": [
"package.json",
"freeze/requirements.txt",
"requirements.txt"
],
"hostname": "...",
"level": 40,
"logContext": "420dc38d-07e3-4ad7-9ee1-bad0a881d994",
"msg": "Package lookup failures",
"name": "renovate",
"pid": 45351,
"repository": "...",
"time": "2025-08-23T11:08:39.296Z",
"v": 0,
"warnings": [
"Failed to look up npm package ...",
"Failed to look up pypi package ..."
]
}
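A quick way to see where lookups are failing is to count the warnings by package type. The Python sketch below assumes one JSON object per log line and that the warnings follow the "Failed to look up <type> package <name>" wording from the sample above; anything else is counted as "unknown". The function name is illustrative.

```python
import json
import re
from collections import Counter

# Based on the warning wording in the sample above (an assumption about the format).
LOOKUP_PATTERN = re.compile(r"^Failed to look up (?P<datasource>\S+) package (?P<package>.+)$")


def lookup_failures_by_datasource(lines):
    """Count lookup-failure warnings per package type (npm, pypi, ...)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or entry.get("msg") != "Package lookup failures":
            continue
        warnings = entry.get("warnings")
        if not isinstance(warnings, list):
            continue
        for warning in warnings:
            match = LOOKUP_PATTERN.match(str(warning))
            counts[match.group("datasource") if match else "unknown"] += 1
    return counts
```

A sudden spike for a single package type across many repositories is a reasonable hint that the problem is on the platform side rather than in one team's configuration.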
Missing authentication
Similarly, some of the managers may also provide more insight into cases where you're missing authentication for package registries (when failing to look up packages), noted by the following lines:
{
"name": "renovate",
"hostname": "...",
"pid": 583945,
"level": 20,
"logContext": "ef19d313-cd49-4773-a7d6-13fe0fbe940f",
"repository": "...",
"msg": "Looking up com.charleskorn.kaml:kaml in repository ...",
"time": "2025-08-25T23:37:24.517Z",
"v": 0
},
{
"name": "renovate",
"hostname": "...",
"pid": 583945,
"level": 20,
"logContext": "ef19d313-cd49-4773-a7d6-13fe0fbe940f",
"repository": "...",
"msg": "GET .../maven-metadata.xml = (code=ERR_NON_2XX_3XX_RESPONSE, statusCode=403 retryCount=0, duration=183)",
"time": "2025-08-25T23:37:24.730Z",
"v": 0
},
{
"hostname": "...",
"level": 20,
"logContext": "ef19d313-cd49-4773-a7d6-13fe0fbe940f",
"msg": "Dependency lookup unauthorized. Please add authentication with a hostRule for ...",
"name": "renovate",
"pid": 199860,
"repository": "...",
"time": "2025-08-24T02:34:26.128Z",
"v": 0
}
Or for a Docker container:
{
"name": "renovate",
"hostname": "...",
"pid": 128531,
"level": 20,
"logContext": "5107696f-08e2-440b-8457-c1924e0df296",
"repository": "...",
"msg": "GET https://ghcr.io/token?scope=repository%3Adevcontainers-contrib%2Ffeatures%2Fpipenv%3Apull&service=ghcr.io = (code=ERR_NON_2XX_3XX_RESPONSE, statusCode=403 retryCount=0, duration=53)",
"time": "2025-08-23T17:34:18.317Z",
"v": 0
}
{
"name": "renovate",
"hostname": "renovate-runner-vanilla-58c9b9b48-59cdh",
"pid": 128531,
"level": 20,
"logContext": "5107696f-08e2-440b-8457-c1924e0df296",
"repository": "elastic/okta-teams",
"registryHost": "https://ghcr.io",
"dockerRepository": "devcontainers-contrib/features/pipenv",
"msg": "Not allowed to access docker registry",
"time": "2025-08-23T17:34:18.317Z",
"v": 0
}
This may also present as an HTTP 404 when looking up a package, if it's a private package:
{
"name": "renovate",
"hostname": "...",
"pid": 340129,
"level": 20,
"logContext": "42c51bd1-68fd-4221-a96b-98c3ed6323a5",
"repository": "...",
"msg": "GET https://pypi.org/pypi/.../json = (code=ERR_NON_2XX_3XX_RESPONSE, statusCode=404 retryCount=0, duration=209)",
"time": "2025-08-24T14:46:19.539Z",
"v": 0
}
{
"name": "renovate",
"hostname": "...",
"pid": 340129,
"level": 20,
"logContext": "42c51bd1-68fd-4221-a96b-98c3ed6323a5",
"repository": "...",
"dependency": "...",
"packageFile": "requirements.txt",
"msg": "Failed to look up pypi package ...",
"time": "2025-08-24T14:46:19.809Z",
"v": 0
}
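Since the HTTP request log messages above share a common shape, a rough Python sketch like the one below can count suspicious responses per registry host. It assumes one JSON object per log line and the "GET <url> = (... statusCode=NNN ...)" wording from the samples. Treating 401, 403 and 404 as "possibly missing authentication" is my own simplification, and the function name is hypothetical.

```python
import json
import re
from collections import Counter
from urllib.parse import urlsplit

# Based on the request log messages in the samples above (an assumption about the format).
HTTP_PATTERN = re.compile(
    r"^(?P<method>[A-Z]+) (?P<url>\S+) = \(.*\bstatusCode=(?P<status>\d{3})\b"
)

# Status codes that may indicate missing authentication (a simplification).
SUSPECT_STATUSES = {401, 403, 404}


def auth_failures_by_host(lines):
    """Count (host, status code) pairs for requests that may be missing authentication."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        match = HTTP_PATTERN.match(str(entry.get("msg", "")))
        if match is None:
            continue
        status = int(match.group("status"))
        if status not in SUSPECT_STATUSES:
            continue
        try:
            host = urlsplit(match.group("url")).hostname
        except ValueError:
            host = None
        counts[(host or "unknown", status)] += 1
    return counts
```

A 404 from a public registry will often just be a typo or a removed package, so hosts with many 401 or 403 responses are the ones I'd look at first.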
Failed command execution (rawExec err)
It's also worthwhile keeping an eye on whether any commands have failed to run.
This could be something from allowedCommands (previously known as allowedPostUpgradeCommands), or an inbuilt command, such as the below failure to run the Python poetry package manager:
{
"branch": "renovate/requests-2.x-lockfile",
"durationMs": 1593,
"err": {
"cmd": "/bin/sh -c poetry update --lock --no-interaction requests",
"exitCode": 1,
"message": "Command failed: poetry update --lock --no-interaction ...",
"name": "ExecError",
"options": {
"cwd": "...",
"encoding": "utf-8",
"env": {
"CONTAINERBASE_CACHE_DIR": "/tmp/renovate/cache/containerbase",
"GIT_CONFIG_COUNT": "3",
"HOME": "/home/ubuntu",
"LANG": "C.UTF-8",
"LC_ALL": "C.UTF-8",
"PATH": "/home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"PIP_CACHE_DIR": "/tmp/renovate/cache/others/pip"
},
"maxBuffer": 10485760,
"timeout": 900000
},
"stack": "ExecError: Command failed: poetry update ...",
"stderr": "Source (...): ...",
"stdout": "Updating dependencies\nResolving dependencies...\n"
},
"hostname": "...",
"level": 20,
"logContext": "4fd912d3-054b-4709-b262-e7bcc048a0e4",
"msg": "rawExec err",
"name": "renovate",
"pid": 476426,
"repository": "...",
"time": "2025-08-25T09:29:35.029Z",
"v": 0
}
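To get an overview of which tools are failing, the Python sketch below counts rawExec err entries by tool and exit code. It assumes one JSON object per log line with the err.cmd and err.exitCode fields shown above, and strips the "/bin/sh -c " wrapper seen in the sample; the function name is illustrative.

```python
import json
from collections import Counter

SHELL_PREFIX = "/bin/sh -c "  # wrapper seen in the sample above


def failed_commands(lines):
    """Count failed command executions keyed by (tool, exit code)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or entry.get("msg") != "rawExec err":
            continue
        err = entry.get("err")
        if not isinstance(err, dict):
            continue
        cmd = str(err.get("cmd", ""))
        if cmd.startswith(SHELL_PREFIX):
            cmd = cmd[len(SHELL_PREFIX):]
        parts = cmd.split()
        # The first word of the command is used as the tool name (e.g. poetry).
        tool = parts[0] if parts else "unknown"
        counts[(tool, err.get("exitCode"))] += 1
    return counts
```

From there, the stderr field of the individual entries is usually the next place to look.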
Invalid config
Teams using Renovate are responsible for their configuration, and ensuring it's valid.
Although Renovate works hard to retain backwards compatibility, and perform in-place "config migration", sometimes the configuration is plain invalid - for instance when trusting an LLM too much to generate you the right config 🫣
Renovate will raise an issue on repositories to let repository owners know that their configuration is invalid, but sometimes it can help having an out-of-band means to let teams know.
Working with monorepos
As I've written about separately, there are some improvements I recommend when working with multi-team monorepos.
I would recommend potentially even having this as a :monorepo preset in your organisation, for a "batteries included" means to re-use this configuration.
In a different vein, remember that large monorepos will generally take longer to process with Renovate, due to their size.
Allowlisting commands
One thing to be cautious of when self-hosting is the usage of allowedCommands (previously known as allowedPostUpgradeCommands). It provides a lot of value, but comes with a few risks.
Especially if you're running Renovate in a long-running environment (such as with Mend Renovate Community Edition, or on a Virtual Private Server (VPS)), there are risks of arbitrary code being executed and persisting.
Separately, as a platform owner you also need to be aware of the performance impact - if allowlisting make licenses, what happens if someone modifies their repo's Makefile to require make build and make test to run before make licenses? Now you have a much longer execution you weren't actually allowing.
Be very careful considering the risks of allowing the arbitrary code execution this affords - it's hugely powerful, but also comes with risks to consider, and re-review periodically.
Should Renovate be opt-in or opt-out?
I very much feel that Renovate should be available for teams to use, but only if teams opt-in to using it by merging an onboarding PR / explicitly creating a renovate.json.
Although you may want to provide a "golden path" with Renovate pre-enabled (with a set of best practices), it doesn't mean that everyone must use it.
Often, convincing folks that Renovate is better is part of the process, and that shouldn't be done by forcing everyone to use it.
That being said, while at Deliveroo, we had all repositories being processed by Renovate, where it ran in a "quiet" mode, only updating the Dependency Dashboard for Docker and CI package updates. Then, teams could explicitly opt-in to Renovate (via), which would then allow them to take advantage of Renovate on a large scale.
This worked nicely because teams who started to use Renovate's features would then start realising they could update more than Docker and CI package updates, and would onboard fully.
This worked, operationally, due to the scale of Deliveroo's engineering efforts not being quite as active as Elastic's, and that there was a bit more comfort waiting a bit longer for a Renovate PR to get raised.
However, one thing to note is that running Renovate against all repositories (even if it's for a subset of package updates) still has a performance impact, so you'll need to take into account that each of those repositories is still taking up execution time, and potentially blocking other repositories from their updates.
Onboarding
Regardless, I still feel very strongly about how great Renovate's onboarding functionality is, and that this is an excellent way to nudge folks to onboard to Renovate, as well as understanding what enabling Renovate actually means.
I've seen teams who are currently using Dependabot suddenly realise "wait, why are we behind on dependency updates"?
As mentioned above, Renovate should be opt-in, ideally through the onboarding process - we've had significant success at Elastic with repositories (created before we rolled out Renovate) that have since accepted the onboarding PR.
Closing
This should give you a few actionable steps that you can take to improve the operational maturity of your Renovate-as-a-service offering for your organisation, and help understand whether there are any monitoring gaps.
As I mentioned above, I'd recommend periodically looking through Renovate logs (especially after version upgrades of Renovate, to see what's new) to discover new insights.
Got any other lessons you've learned as part of running Renovate? I'd love to hear them.