Getting started with Dependency Management Data

This is a companion post to my writeup of the talk I gave at DevOpsNotts in July 2023 about the dependency-management-data (DMD) project.

This is intended as a quick setup guide, rather than an exhaustive deep dive into what DMD is and how it works. If you'd like that, check out the talk writeup, which digs into it a little further.

TL;DR extraordinaire

At a minimum, you need to:

  • retrieve some data, for instance via renovate-graph
    • note that you do not need to be already using Renovate to use this!
  • create the SQLite database for dependency-management-data
  • import the data

We can do this by running:

go install dmd.tanna.dev/cmd/dmd@latest

# produce some data that DMD can import, for instance via renovate-graph
npx @jamietanna/renovate-graph@latest --token $GITHUB_TOKEN your-org/repo another-org/repo
# or for GitLab
env RENOVATE_PLATFORM=gitlab npx @jamietanna/renovate-graph@latest --token $GITLAB_TOKEN your-org/repo another-org/nested/repo

# set up the database
dmd db init --db dmd.db
# import renovate-graph data
dmd import renovate --db dmd.db 'out/*.json'
# then you can start querying it
sqlite3 dmd.db 'select count(*) from renovate'

Retrieving the data

As noted above, we need to retrieve data to be imported into DMD. For dependencies, I'd recommend using renovate-graph, which uses Renovate as the engine for retrieving package data.

We can run the following:

# optional: has renovate-graph also collect the data that populates the `current_version` column and the `renovate_updates` table
export RG_INCLUDE_UPDATES='true'

# produce some data that DMD can import, for instance via renovate-graph
npx @jamietanna/renovate-graph@latest --token $GITHUB_TOKEN jamietanna/jamietanna deepmap/oapi-codegen
# or for GitLab
env RENOVATE_PLATFORM=gitlab npx @jamietanna/renovate-graph@latest --token $GITLAB_TOKEN tanna.dev/serve jamietanna/tidied

If you are looking at AWS infrastructure, check out the README for endoflife-checker, which explains in more detail how to pull AWS data.

Creating the database and importing the data

Once renovate-graph has executed, you'll see an out directory with one file per repo.
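
Before importing, it can be worth confirming that the directory really does contain files. Below is a minimal sketch of that check; `count_json_files` is a hypothetical helper of my own for illustration and isn't part of dmd or renovate-graph.

```shell
# Hypothetical helper (not part of dmd or renovate-graph): count the .json
# files in a directory, defaulting to the `out` directory mentioned above.
count_json_files() {
  dir="${1:-out}"
  count=0
  for f in "$dir"/*.json; do
    # When nothing matches, the shell leaves the pattern as-is,
    # so only count paths that actually exist.
    if [ -e "$f" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

echo "files ready to import: $(count_json_files out)"
```

If this reports 0, it's likely worth re-checking the renovate-graph run before going any further.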

First, we'll create the database:

# or any name, really
dmd db init --db dmd.db

Then, we need to import the data. Note the quotes around the argument, which stop the shell from expanding the glob itself:

dmd import renovate --db dmd.db 'out/*.json'
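
To see what the quotes change, here's a small sketch that doesn't involve dmd at all. `count_args` is a hypothetical stand-in for any command, and simply reports how many arguments it was given.

```shell
# Hypothetical stand-in for any command: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Throwaway directory with three empty files, purely for the demonstration.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.json" "$demo_dir/b.json" "$demo_dir/c.json"

# Unquoted glob: the shell expands it before the command runs.
unquoted="$(count_args "$demo_dir"/*.json)"
# Quoted glob: the command receives the pattern as one literal argument.
quoted="$(count_args "$demo_dir/*.json")"

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
# prints "unquoted: 3 argument(s), quoted: 1 argument(s)"

rm -r "$demo_dir"
```

With the quotes in place, the pattern reaches the command unexpanded, which is the form the import command above passes to dmd.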

Now our database is ready to go 👏

Generating advisories (optional)

This is an optional step, but allows us to get some more meaningful information about our dependencies.

We can run the following to set up our advisories:

# optionally fetch community-sourced custom advisories
dmd contrib download

# then generate advisories for all our packages
# note that this can take several minutes depending on how many dependencies you have!
dmd db generate advisories --db dmd.db

Running some queries

Now we've got the data available, we can start querying it.

It's recommended you find your SQLite browser of choice and try the following queries:

-- how many packages have been ingested via renovate-graph
select count(*) from renovate

-- how many pending package updates have been ingested via renovate-graph
select count(*) from renovate_updates

-- how many packages have been ingested via dependabot-graph
select count(*) from dependabot

-- what are your 10 most popular transitive Go dependencies?
select
  distinct package_name,
  count(*)
from
  renovate,
  json_each(dep_types) as dep_type
where
  package_manager = 'gomod'
  and dep_type.value = 'indirect'
group by
  package_name
order by
  count(*) DESC
limit 10;
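
If the `json_each` part of that last query is unfamiliar, the sketch below runs the same shape of query against a toy in-memory table. The table only has the three columns the query uses, the rows are made up, and I'm assuming `dep_types` holds a JSON array of strings. None of this is real dmd output, and the real table has more columns. `top_indirect` is a hypothetical name.

```shell
# Toy data only: a three-column stand-in for the `renovate` table, with
# invented rows, assuming `dep_types` is stored as a JSON array of strings.
top_indirect() {
  sqlite3 :memory: <<'SQL'
create table renovate (package_name text, package_manager text, dep_types text);
insert into renovate values
  ('github.com/example/a', 'gomod', '["indirect"]'),
  ('github.com/example/a', 'gomod', '["indirect"]'),
  ('github.com/example/b', 'gomod', '["indirect"]'),
  ('github.com/example/b', 'gomod', '["require"]'),
  ('left-pad', 'npm', '["dependencies"]');
-- json_each() yields one row per array element, so each dependency row is
-- matched once for every dep type it lists
select package_name, count(*)
from renovate, json_each(dep_types) as dep_type
where package_manager = 'gomod' and dep_type.value = 'indirect'
group by package_name
order by count(*) desc
limit 10;
SQL
}

# Only run the demo when the sqlite3 CLI is available.
if command -v sqlite3 >/dev/null 2>&1; then
  top_indirect
else
  echo "sqlite3 not found on PATH" >&2
fi
```

Only the rows whose array contains `indirect` are counted, so the `require` row for `github.com/example/b` and the npm row drop out.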

And from the dmd CLI, we can also run the following:

# if you've generated the advisories data
dmd report advisories --db dmd.db

dmd report mostPopularDockerImages --db dmd.db
dmd report mostPopularPackageManagers --db dmd.db

Example

Interested in seeing what it's like with some pre-baked data? The example project has a web app hosted on Fly.io, pre-seeded with data from a lot of public repositories on GitHub and GitLab, which can give you an idea of what's possible.

Written by Jamie Tanna.

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.
