Guide on using the DrugUtilisation package to compute drug use related information
Marti Catala, Mike Du, Yuchen Guo, Kim Lopez-Guell, Edward Burn, Xintong Li
TBR_route_pattern_dose.Rmd
Adding Routes with addRoute Function
To enrich your drug data, the DrugUtilisation package provides the addRoute function. This function uses an internal CSV file containing all possible routes for the drug dose forms supported by the package.
The addRoute function adds route information to your drug table for the supported dose forms. In the example below, a mock database is generated using the mockDrugUtilisation function, and addRoute is applied to its drug exposure table:
library(DrugUtilisation)
cdm <- mockDrugUtilisation(numberIndividual = 100)
# Add route information to the drug table
addRoute(cdm$drug_exposure)
## Warning: `addRoute()` was deprecated in DrugUtilisation 0.7.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## # Source: SQL [?? x 8]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1024-azure:R 4.4.1/:memory:]
## drug_exposure_id person_id drug_concept_id drug_exposure_start_date
## <int> <int> <dbl> <date>
## 1 1 1 1516980 2017-08-01
## 2 3 1 1503328 2016-07-31
## 3 4 2 1516978 2000-12-20
## 4 5 2 1539463 2016-12-10
## 5 6 2 1503328 2011-05-29
## 6 7 2 1125360 1998-05-20
## 7 9 3 1539463 2011-11-23
## 8 11 3 1516980 2014-01-28
## 9 12 3 2905077 2014-03-17
## 10 14 5 1516978 1989-01-05
## # ℹ more rows
## # ℹ 4 more variables: drug_exposure_end_date <date>,
## # drug_type_concept_id <dbl>, quantity <dbl>, route <chr>
Generating Patterns with patternTable Function
The patternTable function in the DrugUtilisation package derives patterns from the drug strength table. It extracts the distinct patterns, associating each with a pattern_id and a formula_name. The resulting tibble includes the following counts:
- number_concepts: the count of distinct concepts in the patterns.
- number_ingredients: the count of distinct ingredients involved.
- number_records: the overall count of records in the patterns.
Moreover, the tibble includes a validity column indicating potentially valid and invalid combinations.
patternTable(cdm)
## # A tibble: 5 × 12
## pattern_id formula_name validity number_concepts number_ingredients
## <dbl> <chr> <chr> <dbl> <dbl>
## 1 9 fixed amount formulati… pattern… 7 4
## 2 18 concentration formulat… pattern… 1 1
## 3 24 concentration formulat… pattern… 1 1
## 4 40 concentration formulat… pattern… 1 1
## 5 NA NA no patt… 4 4
## # ℹ 7 more variables: number_records <dbl>, amount_numeric <dbl>,
## # amount_unit_concept_id <dbl>, numerator_numeric <dbl>,
## # numerator_unit_concept_id <dbl>, denominator_numeric <dbl>,
## # denominator_unit_concept_id <dbl>
For detailed information about the patterns, their associated formulas, and the combinations of amount_unit, numerator_unit, and denominator_unit, you can refer to the patternsWithFormula data:
patternsWithFormula
Get daily dose
Now that we have all the supported patterns and formulas, daily doses can be computed using the addDailyDose function. This function adds dose-related columns to the table for the chosen ingredient; in the output below, these are daily_dose and unit.
addDailyDose(
cdm$drug_exposure,
ingredientConceptId = 1125315
)
## Warning: `addDailyDose()` was deprecated in DrugUtilisation 0.7.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## # Source: table<og_006_1722100892> [?? x 9]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1024-azure:R 4.4.1/:memory:]
## drug_exposure_id person_id drug_concept_id drug_exposure_start_date
## <int> <int> <dbl> <date>
## 1 7 2 1125360 1998-05-20
## 2 12 3 2905077 2014-03-17
## 3 16 6 2905077 2019-12-21
## 4 22 7 1125360 1976-04-02
## 5 24 8 1125360 2006-11-15
## 6 25 8 2905077 2009-07-03
## 7 26 8 1125360 2006-04-08
## 8 27 9 2905077 2011-06-29
## 9 28 9 43135274 2007-07-03
## 10 34 11 2905077 2014-12-17
## # ℹ more rows
## # ℹ 5 more variables: drug_exposure_end_date <date>,
## # drug_type_concept_id <dbl>, quantity <dbl>, daily_dose <dbl>, unit <chr>
There is also a function, summariseDoseCoverage, to check the coverage of the daily dose computation for chosen concept sets and ingredients.
suppressWarnings(summariseDoseCoverage(cdm, 1125315))
## ℹ The following estimates will be computed:
## • daily_dose: count_missing, percentage_missing, mean, sd, q25, median, q75
## ! Table is collected to memory as not all requested estimates are supported on
## the database side
## → Start summary of data, at 2024-07-27 17:21:33.321075
##
## ✔ Summary finished, at 2024-07-27 17:21:33.593559
## # A tibble: 56 × 13
## result_id cdm_name group_name group_level strata_name strata_level
## <int> <chr> <chr> <chr> <chr> <chr>
## 1 1 DUS MOCK ingredient_name acetaminophen overall overall
## 2 1 DUS MOCK ingredient_name acetaminophen overall overall
## 3 1 DUS MOCK ingredient_name acetaminophen overall overall
## 4 1 DUS MOCK ingredient_name acetaminophen overall overall
## 5 1 DUS MOCK ingredient_name acetaminophen overall overall
## 6 1 DUS MOCK ingredient_name acetaminophen overall overall
## 7 1 DUS MOCK ingredient_name acetaminophen overall overall
## 8 1 DUS MOCK ingredient_name acetaminophen overall overall
## 9 1 DUS MOCK ingredient_name acetaminophen unit milligram
## 10 1 DUS MOCK ingredient_name acetaminophen unit milligram
## # ℹ 46 more rows
## # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
## # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
## # additional_name <chr>, additional_level <chr>
Adding Drug Usage Details to a Cohort with addDrugUse
Additional drug usage details, including duration, initial dose, cumulative dose, etc., can be incorporated into a cohort using the addDrugUse function.
Parameters in addDrugUse Function
duration Parameter
The duration parameter is a boolean (TRUE/FALSE) that determines whether to include the duration column. When set to TRUE, the duration is calculated as cohort_end_date - cohort_start_date + 1. Additionally, a column named impute_duration_percentage is added, reporting the percentage of imputed durations.
To set the imputation method for duration, use the imputeDuration parameter, which can take values such as "none", "median", "mean", or "mode". Define the imputation range with the durationRange parameter, a numeric vector of length two whose first value must be less than or equal to the second.
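As a quick check of the duration formula, the following base R sketch recomputes cohort_end_date - cohort_start_date + 1 for three cohort entries that appear in the addDrugUse output later in this guide. It does not call the package; the data frame is typed in by hand purely for illustration.

```r
# Hand-typed copy of three cohort entries from the addDrugUse() output in this guide
cohort <- data.frame(
  subject_id = c(10L, 2L, 4L),
  cohort_start_date = as.Date(c("2001-09-14", "2022-02-01", "2020-06-13")),
  cohort_end_date = as.Date(c("2003-10-24", "2022-03-03", "2020-09-21"))
)

# Both the start day and the end day count as exposed, hence the + 1
cohort$duration <- as.integer(cohort$cohort_end_date - cohort$cohort_start_date) + 1L

print(cohort$duration)
```

The three values match the duration column reported for the same subjects (771, 31 and 101 days).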
quantity Parameter
The quantity parameter, another boolean (TRUE/FALSE), controls the inclusion of quantity-related columns. If set to TRUE, columns for initial quantity and cumulative quantity are added.
dose Parameter
The dose parameter, also a boolean (TRUE/FALSE), governs the addition of daily dose-related columns. When set to TRUE, columns for initial daily dose and cumulative dose are added. Moreover, a column named impute_daily_dose_percentage is added, reporting the percentage of imputed daily doses.
Similar to the duration imputation, use the imputeDailyDose parameter to set the method for imputing daily dose, with options like "none", "median", "mean", or "mode". Define the imputation range with the dailyDoseRange parameter, a numeric vector of length two.
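To make the idea of range-based imputation concrete, here is a minimal base R sketch. It rests on an assumption that is not spelled out in this guide: that values which are missing or fall outside the range are the ones replaced, using the chosen summary of the in-range values. The helper impute_values is hypothetical and is not part of DrugUtilisation; the package's internal behaviour may differ.

```r
# Hypothetical helper, NOT a DrugUtilisation function.
# Assumption: missing or out-of-range values are replaced by a summary
# (median, mean or mode) of the values that fall inside the range.
impute_values <- function(x, range, method = c("none", "median", "mean", "mode")) {
  method <- match.arg(method)
  stopifnot(length(range) == 2, range[1] <= range[2])

  to_impute <- is.na(x) | x < range[1] | x > range[2]
  if (method == "none" || !any(to_impute)) {
    return(list(values = x, impute_percentage = 0))
  }

  valid <- x[!to_impute]
  replacement <- switch(method,
    median = median(valid),
    mean = mean(valid),
    # Most frequent in-range value; ties resolve to the smallest value
    mode = as.numeric(names(which.max(table(valid))))
  )

  x[to_impute] <- replacement
  list(values = x, impute_percentage = 100 * mean(to_impute))
}

# Made-up daily doses in milligrams: one missing, one implausibly large
daily_dose <- c(500, 1000, NA, 1000, 25000)
res <- impute_values(daily_dose, range = c(1, 5000), method = "median")
print(res)
```

In this toy input two of the five values are replaced by the in-range median, so the sketch reports 40 percent imputed. Whether the package reports this figure on a 0 to 100 scale is also an assumption here.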
These parameters offer flexibility in customizing the drug usage details added to the cohort.
In the example below, all three parameters are set to TRUE for the drug ingredient acetaminophen.
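library(CodelistGenerator)
cdm <- mockDrugUtilisation()
# Create a cohort of acetaminophen users
cdm <- generateDrugUtilisationCohortSet(
  cdm, "dus_cohort", getDrugIngredientCodes(cdm, "acetaminophen")
)
# Add duration, quantity and dose columns for acetaminophen (concept 1125315)
cdm[["dus_cohort"]] %>%
  addDrugUse(
    cdm,
    ingredientConceptId = 1125315,
    duration = TRUE,
    quantity = TRUE,
    dose = TRUE
  )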
## Warning: `addDrugUse()` was deprecated in DrugUtilisation 0.7.0.
## ℹ Please use `addDrugUtilisation()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The `cdm` argument of `addDrugUse()` is deprecated as of DrugUtilisation 0.5.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## # Source: table<og_016_1722100914> [5 x 13]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1024-azure:R 4.4.1/:memory:]
## cohort_definition_id subject_id cohort_start_date cohort_end_date duration
## <int> <int> <date> <date> <dbl>
## 1 1 10 2001-09-14 2003-10-24 771
## 2 1 8 2010-02-08 2011-06-06 484
## 3 1 2 2022-02-01 2022-03-03 31
## 4 1 4 2020-06-13 2020-09-21 101
## 5 1 3 2004-12-11 2012-03-06 2643
## # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
## # initial_quantity <dbl>, impute_duration_percentage <dbl>,
## # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
## # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
Parameters for Joining Exposures
Finally, the way continuous exposures are joined can be configured using the following parameters:
gapEra: This parameter sets the maximum number of days between two consecutive exposures for them to be considered part of the same era. If the next exposure's start date minus the previous exposure's end date is less than or equal to the specified gapEra, the two exposures are joined.
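The gapEra rule can be sketched in base R as follows. This is only an illustration of the rule as worded above, for a single subject with made-up, non-overlapping exposures sorted by start date; it is not the package's implementation, and the variable names are hypothetical.

```r
# Toy exposures for one subject, sorted by start date (made-up dates)
exposures <- data.frame(
  start = as.Date(c("2020-01-01", "2020-02-10", "2020-06-01")),
  end = as.Date(c("2020-01-31", "2020-03-10", "2020-06-30"))
)
gap_era <- 30

# Days from the previous exposure's end to this exposure's start
previous_end <- c(as.Date(NA), head(exposures$end, -1))
gap <- as.integer(exposures$start - previous_end)

# A new era starts at the first record or whenever the gap exceeds gap_era
exposures$era_id <- cumsum(is.na(gap) | gap > gap_era)

# Collapse each era to its first start date and last end date
eras <- do.call(rbind, lapply(split(exposures, exposures$era_id), function(e) {
  data.frame(era_id = e$era_id[1], start = min(e$start), end = max(e$end))
}))
print(eras)
```

Here the first two exposures are 10 days apart and fall into one era, while the third starts 83 days after the second ends and opens a new era.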
eraJoinMode: This parameter defines how two different continuous exposures are joined in an era. There are four options:
- "zero": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures contributes to the total exposed time.
- "join": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures does not contribute to the total exposed time.
- "previous": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the previous subexposure. The time between both exposures contributes to the total exposed time.
- "subsequent": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the subsequent subexposure. The time between both exposures contributes to the total exposed time.
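The practical difference between the four eraJoinMode options is easiest to see with numbers. The sketch below is one reading of the descriptions above, applied to two made-up exposures separated by a gap; join_two_exposures is a hypothetical helper and not a DrugUtilisation function, so the package's actual computation may differ in detail.

```r
# Hypothetical helper for illustration only; NOT a DrugUtilisation function.
# Two exposures separated by a gap are joined, and the gap is treated
# according to `mode`.
join_two_exposures <- function(days_prev, dose_prev, gap_days, days_next, dose_next,
                               mode = c("zero", "join", "previous", "subsequent")) {
  mode <- match.arg(mode)

  # Daily dose assumed during the gap
  gap_dose <- switch(mode,
    zero = 0,
    join = 0,
    previous = dose_prev,
    subsequent = dose_next
  )

  # Under "join" the gap does not count towards exposed time
  gap_exposed_days <- if (mode == "join") 0 else gap_days

  c(
    exposed_days = days_prev + days_next + gap_exposed_days,
    cumulative_dose = days_prev * dose_prev + days_next * dose_next + gap_days * gap_dose
  )
}

# Made-up example: 30 days at 500 mg, a 10-day gap, then 20 days at 1000 mg
modes <- c("zero", "join", "previous", "subsequent")
results <- sapply(modes, function(m) join_two_exposures(30, 500, 10, 20, 1000, m))
print(results)
```

With these inputs, "zero" and "join" give the same cumulative dose but differ in exposed days, while "previous" and "subsequent" add the gap days at 500 mg and 1000 mg respectively.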
overlapMode: This parameter defines how the overlap between two exposures that do not start on the same day is resolved inside a subexposure. There are five possible options:
- "previous": The considered daily dose is that of the earliest exposure.
- "subsequent": The considered daily dose is that of the new exposure that starts in that subexposure.
- "minimum": The considered daily dose is the minimum of all the exposures in the subexposure.
- "maximum": The considered daily dose is the maximum of all the exposures in the subexposure.
- "sum": The considered daily dose is the sum of all the exposures present in the subexposure.
sameIndexMode: This parameter defines how the overlap between two exposures that start on the same day is resolved inside a subexposure. There are three possible options:
- "minimum": The considered daily dose is the minimum of all the exposures in the subexposure.
- "maximum": The considered daily dose is the maximum of all the exposures in the subexposure.
- "sum": The considered daily dose is the sum of all the exposures present in the subexposure.
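The overlapMode and sameIndexMode options listed above can be illustrated with a small base R sketch that picks the daily dose for one subexposure in which several exposures are active. The helper resolve_daily_dose is hypothetical (not a package function), and the dates and doses are made up.

```r
# Hypothetical helper for illustration only; NOT a DrugUtilisation function.
# Chooses the daily dose for a subexposure covered by several exposures.
resolve_daily_dose <- function(start, daily_dose,
                               mode = c("previous", "subsequent", "minimum", "maximum", "sum")) {
  mode <- match.arg(mode)
  start_num <- as.numeric(start)
  switch(mode,
    previous = daily_dose[which.min(start_num)],   # exposure that started first
    subsequent = daily_dose[which.max(start_num)], # exposure that started last
    minimum = min(daily_dose),
    maximum = max(daily_dose),
    sum = sum(daily_dose)
  )
}

# Two overlapping exposures with different start dates (made-up values)
starts <- as.Date(c("2020-01-01", "2020-01-15"))
doses <- c(500, 1000)

modes <- c("previous", "subsequent", "minimum", "maximum", "sum")
overlap_result <- sapply(modes, function(m) resolve_daily_dose(starts, doses, m))
print(overlap_result)
```

When the exposures start on the same day there is no earlier or later exposure to prefer, which is consistent with sameIndexMode offering only "minimum", "maximum" and "sum".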
For example, the following call sets a maximum gap of 30 days for exposures to be joined. It uses the daily dose of the previous subexposure when joining exposures, and the minimum daily dose both for exposures that start on the same day and for exposures that overlap.
cdm[["dus_cohort"]] %>%
  addDrugUse(
    cdm,
    ingredientConceptId = 1125315,
    gapEra = 30,
    eraJoinMode = "previous",
    overlapMode = "minimum",
    sameIndexMode = "minimum"
  )
## # Source: table<og_021_1722100925> [5 x 13]
## # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1024-azure:R 4.4.1/:memory:]
## cohort_definition_id subject_id cohort_start_date cohort_end_date duration
## <int> <int> <date> <date> <dbl>
## 1 1 10 2001-09-14 2003-10-24 771
## 2 1 3 2004-12-11 2012-03-06 2643
## 3 1 8 2010-02-08 2011-06-06 484
## 4 1 2 2022-02-01 2022-03-03 31
## 5 1 4 2020-06-13 2020-09-21 101
## # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
## # initial_quantity <dbl>, impute_duration_percentage <dbl>,
## # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
## # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>