Published our first externally collaborated R package, osdc

packaging
publishing
programming

Over the last two years, we’ve collaborated with a researcher at Steno Aarhus on building an R package called osdc. This package, titled Open Source Diabetes Classifier, classifies diabetes status in the Danish registers. And finally, we’ve published it to CRAN!

Author

Luke W. Johnston

Published

December 18, 2025

On December 10th, 2025, we finally published our first R package to CRAN! 🎉

The package is called osdc, or “Open Source Diabetes Classifier”, and it is the first package that we’ve built in collaboration with an external researcher, Anders Aasted Isaksen. He developed an algorithm to classify type 1 and type 2 diabetes using Danish registers as data sources, and we worked together to turn this algorithm into an R package that others can use. We started the collaboration back in 2023, and after a lot of work, we finally got it to a stage where we could publish a first version to CRAN.

The package has two aims (as described in the package documentation):

  1. To provide an open-source, code-based algorithm to classify type 1 and type 2 diabetes using Danish registers as data sources. There are other diabetes algorithms developed in Denmark for the registers, but they are neither open source nor packaged into a reusable format.
  2. To inspire discussions within the Danish register-based research space about the openness and ease of use of the existing tooling and registers, and about the need for an official process for updating or contributing to existing data sources.

Who is it for and why use it?

The main reason for building the osdc package was to give researchers doing diabetes research with Danish register data a tool to classify diabetes. No Danish register fully captures the different ways that a person could be classified with diabetes, as administrative diagnosis data is not always complete or accurate. So, researchers have had to develop different algorithms to get a better idea of who has diabetes in the Danish registers.

However, these algorithms have not been open source, and they have not been packaged into reusable tools. This has led to many researchers having different “in-house” solutions for their group or organisation that other groups can’t really use effectively. We wanted to change that.

So, we built the osdc package with all the necessary details for researchers to classify diabetes status in their own Danish register data. For example, the registers() function lists which registers and variables are needed. Apart from a few helper functions, the main function of the package is classify_diabetes(), which takes all the required registers and outputs a data frame with a list of individuals, their diabetes status, and the date when the classification was made.

Aside from those functions, the package provides an algorithm() function that lists all the specific criteria used in the algorithm. This makes it easier for others to assess how exactly the algorithm classifies diabetes.
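To give a feel for what getting started might look like, here is a minimal sketch in R. It installs osdc from CRAN if it is missing and then inspects the output of the two descriptive helpers mentioned above. Calling registers() and algorithm() without arguments is an assumption on our part for this sketch, as is the choice of CRAN mirror, so check the package documentation for the exact signatures and return values. We leave classify_diabetes() out on purpose, since it needs the actual register data as input.

```r
# Install osdc from CRAN if it isn't already available.
# The mirror URL is just one possible choice.
if (!requireNamespace("osdc", quietly = TRUE)) {
  install.packages("osdc", repos = "https://cloud.r-project.org")
}

library(osdc)

# Assumption: both helpers can be called without arguments.
required_registers <- registers()
criteria <- algorithm()

# Use str() to look at the top-level structure, since it works
# without assuming a particular return type.
str(required_registers, max.level = 1)
str(criteria, max.level = 1)
```

Looking at these two objects first is a quick way to check which registers and variables you would need to request before trying to run the classification itself.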

The next step is to start using the osdc package in collaborative projects that use Statistics Denmark and register data 🎉