Recently I started playing with Metabase to create interesting visualizations and have the teams quickly check how we are working.
One of the main tools we use is GitLab, so I needed the ability to query it. To be on the safe side, I configured the backup job to create a read-only mirror of the DB, and have Metabase access it.
However, as soon as I started trying to query it, I found that the table structure used in GitLab was difficult to use for quick visualizations: while the unit our users know is their project, projects reside in a namespace, and namespaces can be nested. Also, since GitLab supports forks, you may end up with several projects that you want to aggregate stats of in order to reflect accurate numbers.
Here’s a dump of the queries I found useful, and some brief explanation of what they do.
Let’s start from the top:
flat_namespaces will return a table containing the concatenated names, the parent namespace id, and the leaf namespace id.
So, if you have a namespace at
/a/b/c, where the IDs are 1, 2, and 3 respectively, and you query it with a
WHERE clause of
leaf_id = 3, you’ll get
a/b/c as the
root_projects will return a pair of
fork_project_id. This takes into account the three different possibilities that might exists:
- Projects with no forks.
- Projects with forks.
- Forks of other projects.
Basically, with a
fork_project_id you can get the corresponding
root_project_id. If you couple this with
flat_namespaces, for any given project ID on your GitLab installation, you can grab the
root_project_id with the “full path” of namespaces as a string.
project_with_full_namespace uses the previously defined CTEs to create the “full name” of the project and all its parent namespaces, similar to what you see in GitLab’s URLs. It’s important to note that this query will ignore forks, as it uses
root_projects to navigate the fork tree, but this is what we needed for our dashboards.
Common Table Structure ↩