Guide to using the DrugUtilisation package to compute drug use related information

Adding Routes with `addRoute` Function

To enrich your drug data, the DrugUtilisation package provides the addRoute function. This function utilizes an internal CSV file containing all possible routes for various drug dose forms supported by the package.

The addRoute function is designed to seamlessly incorporate route information into your drug table for the supported dose forms. In the example below, a mock database is generated using the mockDrugUtilisation function, and the addRoute function is applied to demonstrate the process:

library(DrugUtilisation)
con <- DBI::dbConnect(duckdb::duckdb(), ":memory:")
connectionDetails <- list(
  con = con,
  writeSchema = "main",
  cdmPrefix = NULL,
  writePrefix = NULL
)
cdm <- mockDrugUtilisation(
  connectionDetails = connectionDetails,
  numberIndividual = 100
)
# Add route information to the drug table
addRoute(cdm$drug_exposure)

## # Source:   SQL [?? x 8]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1023-azure:R 4.4.1/:memory:]
##    drug_exposure_id person_id drug_concept_id drug_exposure_start_date
##               <int>     <int>           <dbl> <date>                  
##  1                2         1         1516980 2001-02-27              
##  2                3         2         1516980 2018-03-08              
##  3                4         2         1516978 2018-05-28              
##  4                5         2         1516978 2018-06-14              
##  5                6         2         1539463 2017-07-03              
##  6                8         3        43135274 1996-09-13              
##  7               11         4         1503328 2007-05-14              
##  8               12         4         1125360 2004-12-26              
##  9               13         5         1539463 1986-08-21              
## 10               14         5         1503328 1979-06-27              
## # ℹ more rows
## # ℹ 4 more variables: drug_exposure_end_date <date>,
## #   drug_type_concept_id <dbl>, quantity <dbl>, route <chr>

Generating Patterns with patternTable Function

The patternTable function in the DrugUtilisation package derives the patterns present in a drug strength table. It extracts the distinct patterns and associates each one with a pattern_id and a formula (shown as formula_name in the output below). The resulting tibble also reports three counts per pattern:

number_concepts: the count of distinct concepts in the patterns.
number_ingredients: the count of distinct ingredients involved.
number_records: the overall count of records in the patterns.

The tibble also includes a validity column that flags which combinations are potentially valid and which are not.

patternTable(cdm)

## # A tibble: 5 × 12
##   pattern_id formula_name            validity number_concepts number_ingredients
##        <dbl> <chr>                   <chr>              <dbl>              <dbl>
## 1          9 fixed amount formulati… pattern…               7                  4
## 2         18 concentration formulat… pattern…               1                  1
## 3         24 concentration formulat… pattern…               1                  1
## 4         40 concentration formulat… pattern…               1                  1
## 5         NA NA                      no patt…               4                  4
## # ℹ 7 more variables: number_records <dbl>, amount_numeric <dbl>,
## #   amount_unit_concept_id <dbl>, numerator_numeric <dbl>,
## #   numerator_unit_concept_id <dbl>, denominator_numeric <dbl>,
## #   denominator_unit_concept_id <dbl>
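
To make the three count columns concrete, here is a minimal base R sketch on a toy table. The data frame `drugStrength` and the helper `countDistinct` are hypothetical and only illustrate the definitions given above; they are not how the package computes the counts internally.

```r
# Toy drug strength records (hypothetical values, for illustration only)
drugStrength <- data.frame(
  pattern_id = c(9, 9, 9, 18),
  drug_concept_id = c(101, 101, 102, 201),
  ingredient_concept_id = c(1, 2, 1, 3)
)

# Hypothetical helper: number of distinct values in a vector
countDistinct <- function(x) length(unique(x))

patternId <- drugStrength$pattern_id
patternCounts <- data.frame(
  pattern_id = sort(unique(patternId)),
  # distinct drug concepts per pattern
  number_concepts = as.integer(tapply(drugStrength$drug_concept_id, patternId, countDistinct)),
  # distinct ingredients per pattern
  number_ingredients = as.integer(tapply(drugStrength$ingredient_concept_id, patternId, countDistinct)),
  # total records per pattern
  number_records = as.integer(table(patternId))
)
patternCounts
```

Pattern 9 has three records but only two distinct concepts, which is why number_records and number_concepts can differ.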

For detailed information about the patterns, their associated formula, and combinations of amount_unit, numerator_unit, and denominator_unit, you can refer to the data:

patternsWithFormula

Get daily dose

Now that we have all the supported patterns and formulas, daily doses can be computed with the addDailyDose function. This function appends dose-related columns to the table; in the output below these are daily_dose and unit.

addDailyDose(
  cdm$drug_exposure,
  cdm = cdm,
  ingredientConceptId = 1125315
)

## # Source:   table<og_006_1721204430> [?? x 9]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1023-azure:R 4.4.1/:memory:]
##    drug_exposure_id person_id drug_concept_id drug_exposure_start_date
##               <int>     <int>           <dbl> <date>                  
##  1                8         3        43135274 1996-09-13              
##  2               12         4         1125360 2004-12-26              
##  3               17         7         1125360 1969-09-09              
##  4               23         8         2905077 2019-09-14              
##  5               24         9         1125360 2019-05-14              
##  6               27         9        43135274 2016-03-17              
##  7               34        12        43135274 2018-10-17              
##  8               35        13        43135274 1981-07-11              
##  9               40        14         1125360 1990-04-27              
## 10               49        19         2905077 2022-02-12              
## # ℹ more rows
## # ℹ 5 more variables: drug_exposure_end_date <date>,
## #   drug_type_concept_id <dbl>, quantity <dbl>, daily_dose <dbl>, unit <chr>

There is also a function, summariseDoseCoverage, to check the coverage of daily dose computation for chosen concept sets and ingredients.

suppressWarnings(summariseDoseCoverage(cdm, 1125315))

## ℹ The following estimates will be computed:
## • daily_dose: count_missing, percentage_missing, mean, sd, q25, median, q75
## ! Table is collected to memory as not all requested estimates are supported on
##   the database side
## → Start summary of data, at 2024-07-17 08:20:30.622824
## 
## ✔ Summary finished, at 2024-07-17 08:20:30.892715

## # A tibble: 56 × 13
##    result_id cdm_name group_name      group_level   strata_name strata_level
##        <int> <chr>    <chr>           <chr>         <chr>       <chr>       
##  1         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  2         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  3         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  4         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  5         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  6         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  7         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  8         1 DUS MOCK ingredient_name acetaminophen overall     overall     
##  9         1 DUS MOCK ingredient_name acetaminophen unit        milligram   
## 10         1 DUS MOCK ingredient_name acetaminophen unit        milligram   
## # ℹ 46 more rows
## # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
## #   estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
## #   additional_name <chr>, additional_level <chr>

Adding Drug Usage Details to a Cohort with `addDrugUse`

Additional drug usage details, including duration, initial dose, cumulative dose, etc., can be incorporated into a cohort using the addDrugUse function.

Parameters in `addDrugUse` Function

`duration` Parameter

The duration parameter is a boolean variable (TRUE/FALSE) determining whether to include the duration column. When set to TRUE, the duration is calculated as cohort_end_date - cohort_start_date + 1. Additionally, a column named impute_duration_percentage is added, reporting the percentage of imputed duration.
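
As a quick check of the formula above, the following base R sketch recomputes the duration for three pairs of dates copied from the example output further down this page. The vector names are illustrative only.

```r
# Cohort dates copied from the example output in this vignette
cohort_start_date <- as.Date(c("2012-02-15", "2021-02-25", "2022-06-06"))
cohort_end_date   <- as.Date(c("2012-02-17", "2021-03-11", "2022-06-07"))

# Both the start day and the end day count as exposed, hence the + 1
duration <- as.integer(cohort_end_date - cohort_start_date) + 1L
duration
# 3 15 2
```

These values match the duration column reported for the same rows in the example output.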

To set the imputation method for duration, use the imputeDuration parameter, which can take values such as “none”, “median”, “mean”, or “mode”. Define the imputation range with the durationRange parameter, a numeric vector of length two whose first value must be less than or equal to the second.
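
The idea of range-based imputation can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the helper `imputeOutOfRange` is hypothetical, and it assumes that missing values and values outside the range are the ones replaced, using a statistic computed from the in-range values.

```r
# Hypothetical helper, not part of DrugUtilisation
imputeOutOfRange <- function(x, range, method = c("none", "median", "mean", "mode")) {
  method <- match.arg(method)
  # Assumption: missing values and values outside the range get imputed
  toImpute <- is.na(x) | x < range[1] | x > range[2]
  if (method == "none") {
    return(list(values = x, impute_percentage = 0))
  }
  valid <- x[!toImpute]
  replacement <- switch(
    method,
    median = stats::median(valid),
    mean = mean(valid),
    mode = as.numeric(names(which.max(table(valid))))
  )
  x[toImpute] <- replacement
  list(values = x, impute_percentage = 100 * mean(toImpute))
}

duration <- c(3, 10, NA, 400, 10)
result <- imputeOutOfRange(duration, range = c(1, 365), method = "median")
result$values            # 3 10 10 10 10
result$impute_percentage # 40
```

Here two of the five values (the missing one and 400, which exceeds the upper bound) are replaced by the median of the remaining values.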

`quantity` Parameter

The quantity parameter, another boolean variable (TRUE/FALSE), controls the inclusion of quantity-related columns. If set to TRUE, columns for initial quantity and cumulative quantity are added.

`dose` Parameter

The dose parameter, also a boolean variable (TRUE/FALSE), governs the addition of daily dose-related columns. When set to TRUE, columns for initial daily dose and cumulative daily dose are incorporated. Moreover, a column named impute_daily_dose_percentage is added, reporting the percentage of imputed daily dose.

As with duration imputation, use the imputeDailyDose parameter to set the method for imputing daily dose, with options like “none”, “median”, “mean”, or “mode”. Define the imputation range with the dailyDoseRange parameter, a numeric vector of length two.

These parameters offer flexibility in customizing the drug usage details added to the cohort.

The example below sets all three parameters to TRUE for the drug ingredient acetaminophen.

library(CodelistGenerator)
cdm <- mockDrugUtilisation()
cdm <- generateDrugUtilisationCohortSet(
  cdm, "dus_cohort", getDrugIngredientCodes(cdm, "acetaminophen")
)
cdm[["dus_cohort"]] %>%
  addDrugUse(cdm,
             # acetaminophen ingredient concept, passed by name for clarity
             ingredientConceptId = 1125315,
             duration = TRUE,
             quantity = TRUE,
             dose = TRUE)

## Warning: `addDrugUse()` was deprecated in DrugUtilisation 0.7.0.
## ℹ Please use `addDrugUtilisation()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## Warning: The `cdm` argument of `addDrugUse()` is deprecated as of DrugUtilisation 0.5.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## # Source:   table<og_016_1721204452> [?? x 13]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1023-azure:R 4.4.1/:memory:]
##    cohort_definition_id subject_id cohort_start_date cohort_end_date duration
##                   <int>      <int> <date>            <date>             <dbl>
##  1                    1          8 2012-02-15        2012-02-17             3
##  2                    1          7 2016-02-01        2017-10-04           612
##  3                    1          5 2014-07-28        2019-11-20          1942
##  4                    1          1 2021-02-25        2021-03-11            15
##  5                    1          2 2022-05-24        2022-06-02            10
##  6                    1          9 2022-09-11        2022-11-09            60
##  7                    1          4 2021-05-05        2021-12-15           225
##  8                    1          2 2022-06-06        2022-06-07             2
##  9                    1          7 2018-05-21        2018-07-20            61
## 10                    1          3 2009-04-18        2009-11-07           204
## # ℹ more rows
## # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
## #   initial_quantity <dbl>, impute_duration_percentage <dbl>,
## #   number_eras <dbl>, impute_daily_dose_percentage <dbl>,
## #   initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>

Parameters for Joining Exposures

Finally, the way continuous exposures are joined can be configured using the following parameters:

`gapEra`:

This parameter sets the maximum number of days allowed between two consecutive exposures for them to be considered part of the same era. If the next exposure’s start date minus the previous exposure’s end date is less than or equal to the specified gapEra, the two exposures are joined.
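
The gap rule can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the `exposures` data frame is made up, and the gap is measured as the next start date minus the latest end date seen so far.

```r
# Made-up exposures for one person (illustration only)
exposures <- data.frame(
  start = as.Date(c("2020-01-01", "2020-02-10", "2020-06-01")),
  end   = as.Date(c("2020-01-31", "2020-03-10", "2020-06-30"))
)
gapEra <- 30

exposures <- exposures[order(exposures$start), ]
n <- nrow(exposures)

# Latest end date seen so far, so that nested exposures do not shorten an era
runningEnd <- cummax(as.numeric(exposures$end))

# Days between each exposure's start and the running end of the earlier ones
gap <- as.numeric(exposures$start[-1]) - runningEnd[-n]

# A new era starts whenever the gap exceeds gapEra
era_id <- cumsum(c(TRUE, gap > gapEra))

eras <- data.frame(
  era_id = unique(era_id),
  start = exposures$start[!duplicated(era_id)],
  end = as.Date(as.numeric(tapply(as.numeric(exposures$end), era_id, max)),
                origin = "1970-01-01")
)
eras
```

With gapEra = 30, the first two exposures (10 days apart) fall into one era running from 2020-01-01 to 2020-03-10, while the third (83 days later) starts a new era.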

`eraJoinMode`:

This parameter defines how two different continuous exposures are joined in an era. There are four options:

- "zero": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures contributes to the total exposed time.
- "join": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures does not contribute to the total exposed time.
- "previous": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the previous subexposure. The time between both exposures contributes to the total exposed time.
- "subsequent": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the subsequent subexposure. The time between both exposures contributes to the total exposed time.
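
The practical difference between the four options is easiest to see with numbers. The sketch below is plain arithmetic based on the descriptions above, using made-up values: a 10-day exposure at 500 mg/day, a 5-day gap, then a 10-day exposure at 1000 mg/day. It does not call the package.

```r
# Made-up scenario (illustration only)
days <- c(first = 10, gap = 5, second = 10)
dose <- c(first = 500, second = 1000)

# Daily dose assumed during the gap under each option
gapDose <- c(zero = 0, join = 0,
             previous = dose[["first"]], subsequent = dose[["second"]])

# Whether the gap counts towards the total exposed time
gapCounts <- c(zero = TRUE, join = FALSE, previous = TRUE, subsequent = TRUE)

baseDays <- days[["first"]] + days[["second"]]
baseDose <- days[["first"]] * dose[["first"]] + days[["second"]] * dose[["second"]]

comparison <- data.frame(
  eraJoinMode = names(gapDose),
  exposed_days = unname(baseDays + gapCounts * days[["gap"]]),
  cumulative_dose = unname(baseDose + gapDose * days[["gap"]])
)
comparison
```

In this scenario "zero" and "join" give the same cumulative dose (15000 mg) but differ in exposed time (25 versus 20 days), while "previous" and "subsequent" add 2500 mg and 5000 mg respectively for the gap.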

`overlapMode`: This parameter defines how an overlap between two exposures that do not start on the same day is resolved inside a subexposure. There are five possible options:

- "previous": The considered daily dose is that of the earliest exposure.
- "subsequent": The considered daily dose is that of the new exposure that starts in that subexposure.
- "minimum": The considered daily dose is the minimum of all the exposures in the subexposure.
- "maximum": The considered daily dose is the maximum of all the exposures in the subexposure.
- "sum": The considered daily dose is the sum of all the exposures present in the subexposure.
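
The five options above can be illustrated with a small base R sketch. The helper `resolveOverlap` is hypothetical and only mirrors the descriptions in the list; it assumes the doses are ordered by exposure start date, earliest first.

```r
# Hypothetical helper, not part of DrugUtilisation
resolveOverlap <- function(doses, overlapMode) {
  # `doses` must be ordered by exposure start date (earliest first)
  switch(
    overlapMode,
    previous   = doses[1],
    subsequent = doses[length(doses)],
    minimum    = min(doses),
    maximum    = max(doses),
    sum        = sum(doses),
    stop("Unknown overlapMode: ", overlapMode)
  )
}

# Two overlapping exposures: the earlier at 500 mg/day, the later at 1000 mg/day
doses <- c(500, 1000)
modes <- c("previous", "subsequent", "minimum", "maximum", "sum")
resolved <- sapply(modes, resolveOverlap, doses = doses)
resolved
```

For these two exposures the considered daily dose is 500 for "previous" and "minimum", 1000 for "subsequent" and "maximum", and 1500 for "sum".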

`sameIndexMode`: This parameter defines how an overlap between two exposures that start on the same day is resolved inside a subexposure. There are three possible options:

- "minimum": The considered daily dose is the minimum of all the exposures in the subexposure.
- "maximum": The considered daily dose is the maximum of all the exposures in the subexposure.
- "sum": The considered daily dose is the sum of all the exposures present in the subexposure.

For example, the following call sets a maximum gap of 30 days for exposures to be joined. It uses the daily dose of the previous subexposure when joining exposures, the minimum daily dose for exposures that start on the same day, and the minimum daily dose for exposures that overlap.

cdm[["dus_cohort"]] %>%
  addDrugUse(cdm,
             ingredientConceptId = 1125315,
             gapEra = 30,
             eraJoinMode = "previous",
             overlapMode = "minimum",
             sameIndexMode = "minimum")

## # Source:   table<og_021_1721204465> [?? x 13]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1023-azure:R 4.4.1/:memory:]
##    cohort_definition_id subject_id cohort_start_date cohort_end_date duration
##                   <int>      <int> <date>            <date>             <dbl>
##  1                    1          5 2014-07-28        2019-11-20          1942
##  2                    1          8 2012-02-15        2012-02-17             3
##  3                    1          7 2016-02-01        2017-10-04           612
##  4                    1          3 2009-04-18        2009-11-07           204
##  5                    1          6 1991-01-28        1991-09-17           233
##  6                    1          7 2018-05-21        2018-07-20            61
##  7                    1          9 2022-09-11        2022-11-09            60
##  8                    1          2 2022-05-24        2022-06-02            10
##  9                    1          1 2021-02-25        2021-03-11            15
## 10                    1          2 2022-06-06        2022-06-07             2
## # ℹ more rows
## # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
## #   initial_quantity <dbl>, impute_duration_percentage <dbl>,
## #   number_eras <dbl>, impute_daily_dose_percentage <dbl>,
## #   initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>

DBI::dbDisconnect(con, shutdown = TRUE)

Marti Catala, Mike Du, Yuchen Guo, Kim Lopez-Guell, Edward Burn, Xintong Li
