Introduction

In this vignette, we demonstrate the functionality provided by the DrugUtilisation package to help understand the indications of patients in a drug cohort.

The DrugUtilisation package is designed to work with data in the OMOP CDM format, so our first step is to create a reference to the data using the DBI and CDMConnector packages.

library(DrugUtilisation)

con <- DBI::dbConnect(duckdb::duckdb(), CDMConnector::eunomiaDir())

cdm <- CDMConnector::cdm_from_con(
  con = con,
  cdm_schema = "main",
  write_schema = "main"
)

Create a drug utilisation cohort

We will use acetaminophen as our example drug. We’ll start by creating a cohort of acetaminophen users. Here we’ll include all acetaminophen records using a gap era of 7 days, but as we’ve seen in the previous vignette we could have also applied various other inclusion criteria.

cdm <- generateIngredientCohortSet(
  cdm = cdm,
  name = "acetaminophen_users",
  ingredient = "acetaminophen",
  gapEra = 7
)

Note that addIndication() takes a cohort as input. In this example we use a drug cohort created with generateIngredientCohortSet(), but the input cohort can be generated in many other ways.

Create an indication cohort

Next we will create a set of indication cohorts. In this case we will create cohorts for sinusitis and bronchitis using CDMConnector::generateConceptCohortSet().

indications <- list(
  sinusitis = c(257012, 4294548, 40481087),
  bronchitis = c(260139, 258780)
)

cdm <- CDMConnector::generateConceptCohortSet(
  cdm = cdm,
  name = "indications_cohort",
  conceptSet = indications,
  end = 0
)
cdm

Add indications with the addIndication() function

Now that we have these two cohort tables, one with our drug cohort and another with our indications cohort, we can assess patient indications. To do this we specify a time window around the drug cohort start date and identify any intersection with the indication cohort within that window. We can add this information as a new variable on our cohort table: addIndication() adds one new column per window provided, holding the label of the indication(s) observed in that window.

cdm[["acetaminophen_users"]] <- cdm[["acetaminophen_users"]] |>
  addIndication(
    indicationCohortName = "indications_cohort",
    indicationWindow = list(c(-30, 0)),
    indexDate = "cohort_start_date"
  )
cdm[["acetaminophen_users"]] |>
  dplyr::glimpse()
#> Rows: ??
#> Columns: 5
#> Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ subject_id           <int> 1310, 2992, 4753, 1966, 3966, 2686, 5152, 736, 24…
#> $ cohort_start_date    <date> 1944-02-11, 1966-08-29, 1965-06-17, 1978-10-24, …
#> $ cohort_end_date      <date> 1944-02-25, 1966-09-12, 1965-07-01, 1978-11-07, …
#> $ indication_m30_to_0  <chr> "bronchitis", "bronchitis", "bronchitis", "bronch…

We can see that individuals are classified as having sinusitis (without bronchitis), bronchitis (without sinusitis), sinusitis and bronchitis, or no observed indication.

cdm[["acetaminophen_users"]] |>
  dplyr::group_by(indication_m30_to_0) |>
  dplyr::tally()
#> # Source:   SQL [4 x 2]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#>   indication_m30_to_0          n
#>   <chr>                    <dbl>
#> 1 bronchitis and sinusitis     3
#> 2 sinusitis                   18
#> 3 none                     11351
#> 4 bronchitis                2527
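
To make the window logic concrete, here is a minimal base R sketch on made-up data. It is not the package's implementation: labelIndication() is a hypothetical helper, and it assumes one-day indication cohorts (as produced with end = 0), so only start dates are compared.

```r
# Toy data, invented for illustration (not taken from the Eunomia dataset)
drug <- data.frame(
  subject_id = c(1, 2, 3),
  cohort_start_date = as.Date(c("2020-03-01", "2020-03-01", "2020-03-01"))
)
indication <- data.frame(
  subject_id = c(1, 1, 2, 3),
  cohort_name = c("bronchitis", "sinusitis", "sinusitis", "bronchitis"),
  cohort_start_date = as.Date(
    c("2020-02-20", "2020-03-01", "2020-02-15", "2020-03-05")
  )
)

# Hypothetical helper: for each drug record, collect the indications whose
# start date lies between window[1] and window[2] days from the index date
labelIndication <- function(drug, indication, window = c(-30, 0)) {
  vapply(seq_len(nrow(drug)), function(i) {
    sameSubject <- indication$subject_id == drug$subject_id[i]
    daysFromIndex <- as.numeric(
      indication$cohort_start_date - drug$cohort_start_date[i]
    )
    inWindow <- sameSubject &
      daysFromIndex >= window[1] &
      daysFromIndex <= window[2]
    found <- sort(unique(indication$cohort_name[inWindow]))
    if (length(found) == 0) "none" else paste(found, collapse = " and ")
  }, character(1))
}

drug$indication_m30_to_0 <- labelIndication(drug, indication)
print(drug)
```

In this toy example subject 1 has both indications inside the window, subject 2 has only sinusitis, and subject 3's bronchitis record falls after the index date, so it is not counted.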

As well as the indication cohort table, we can also use the clinical tables in the OMOP CDM to identify other, unknown, indications. Here we consider anyone who is not in an indication cohort but has a record in the condition occurrence table to have an “unknown” indication. We can see that many of the people previously considered to have no indication are now considered as having an unknown indication as they have a condition occurrence record in the 30 days up to their drug initiation.

cdm[["acetaminophen_users"]] |>
  dplyr::select(!"indication_m30_to_0") |>
  addIndication(
    indicationCohortName = "indications_cohort",
    indicationWindow = list(c(-30, 0)),
    unknownIndicationTable = "condition_occurrence"
  ) |>
  dplyr::group_by(indication_m30_to_0) |>
  dplyr::tally()
#> # Source:   SQL [5 x 2]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#>   indication_m30_to_0          n
#>   <chr>                    <dbl>
#> 1 bronchitis and sinusitis     3
#> 2 unknown                  11344
#> 3 sinusitis                   18
#> 4 none                         7
#> 5 bronchitis                2527
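
The precedence described above (indication cohort first, then "unknown", then "none") can be sketched in base R as follows. resolveIndication() and its inputs are hypothetical names used only for illustration. They are not part of DrugUtilisation and the package may implement this differently.

```r
# Hypothetical helper: keep a known indication if there is one; otherwise use
# "unknown" when a record exists in the fallback table, and "none" if not
resolveIndication <- function(cohortLabel, hasConditionRecord) {
  ifelse(
    cohortLabel != "none",
    cohortLabel,
    ifelse(hasConditionRecord, "unknown", "none")
  )
}

# Made-up inputs: label from the indication cohorts, and whether the person
# has any condition occurrence record inside the same window
cohortLabel <- c("bronchitis", "none", "none")
hasConditionRecord <- c(TRUE, TRUE, FALSE)

resolved <- resolveIndication(cohortLabel, hasConditionRecord)
print(resolved)
```

Only records that previously had no indication can move to "unknown", which matches the output above: the counts for bronchitis and sinusitis are unchanged while most of the "none" group is reclassified.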

We can add indications for multiple time windows. Unsurprisingly we find more potential indications for wider windows (although this will likely increase our risk of false positives).

cdm[["acetaminophen_users"]] <- cdm[["acetaminophen_users"]] |>
  dplyr::select(!"indication_m30_to_0") |>
  addIndication(
    indicationCohortName = "indications_cohort",
    indicationWindow = list(c(0, 0), c(-30, 0), c(-365, 0)),
    unknownIndicationTable = "condition_occurrence"
  )
cdm[["acetaminophen_users"]] |>
  dplyr::group_by(indication_0_to_0) |>
  dplyr::tally()
#> # Source:   SQL [4 x 2]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#>   indication_0_to_0     n
#>   <chr>             <dbl>
#> 1 bronchitis         2524
#> 2 unknown           11211
#> 3 none                163
#> 4 sinusitis             1
cdm[["acetaminophen_users"]] |>
  dplyr::group_by(indication_m30_to_0) |>
  dplyr::tally()
#> # Source:   SQL [5 x 2]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#>   indication_m30_to_0          n
#>   <chr>                    <dbl>
#> 1 bronchitis                2527
#> 2 bronchitis and sinusitis     3
#> 3 unknown                  11344
#> 4 sinusitis                   18
#> 5 none                         7
cdm[["acetaminophen_users"]] |>
  dplyr::group_by(indication_m365_to_0) |>
  dplyr::tally()
#> # Source:   SQL [5 x 2]
#> # Database: DuckDB v1.0.0 [unknown@Linux 6.5.0-1025-azure:R 4.4.1//tmp/RtmpYJ5NOc/file1c86210d4c1c.duckdb]
#>   indication_m365_to_0         n
#>   <chr>                    <dbl>
#> 1 bronchitis and sinusitis   101
#> 2 unknown                  10968
#> 3 bronchitis                2615
#> 4 sinusitis                  211
#> 5 none                         4
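
The new columns follow a naming pattern that can be read off the output above: negative bounds are written with an "m" prefix. When looping over windows it can be handy to rebuild those names. The helper below is a hypothetical sketch inferred only from the three column names shown here. It is not a package function and it only handles finite bounds.

```r
# Hypothetical helper: rebuild the column name seen in the output above
# from a window such as c(-30, 0)
windowColumnName <- function(window) {
  fmt <- function(x) if (x < 0) paste0("m", abs(x)) else as.character(x)
  paste0("indication_", fmt(window[1]), "_to_", fmt(window[2]))
}

windows <- list(c(0, 0), c(-30, 0), c(-365, 0))
columnNames <- vapply(windows, windowColumnName, character(1))
print(columnNames)
```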

Summarise indications with summariseIndication()

Rather than adding indication variables as above, we can obtain a general summary of the observed indications. summariseIndication() has similar arguments to addIndication(), but returns a summarised result of the indications.

indicationSummary <- cdm[["acetaminophen_users"]] |>
  dplyr::select(!dplyr::starts_with("indication")) |>
  summariseIndication(
    indicationCohortName = "indications_cohort",
    indicationWindow = list(c(0, 0), c(-30, 0), c(-365, 0)),
    unknownIndicationTable = c("condition_occurrence")
  )

We can then easily create a table or a plot of the results.

tableIndication(indicationSummary)
Database name                      Indication                 Cohort name: 161 acetaminophen
Indication from 30 days before to the index date
Synthea synthetic health database  Bronchitis                 2,527 (18.18 %)
                                   Sinusitis                  18 (0.13 %)
                                   Bronchitis and sinusitis   3 (0.02 %)
                                   Unknown                    11,344 (81.62 %)
                                   None                       7 (0.05 %)
Indication from 365 days before to the index date
Synthea synthetic health database  Bronchitis                 2,615 (18.81 %)
                                   Sinusitis                  211 (1.52 %)
                                   Bronchitis and sinusitis   101 (0.73 %)
                                   Unknown                    10,968 (78.91 %)
                                   None                       4 (0.03 %)
Indication on index date
Synthea synthetic health database  Bronchitis                 2,524 (18.16 %)
                                   Sinusitis                  1 (0.01 %)
                                   Bronchitis and sinusitis   0 (0.00 %)
                                   Unknown                    11,211 (80.66 %)
                                   None                       163 (1.17 %)
plotIndication(indicationSummary)
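
The percentages reported by tableIndication() appear to be each category's count divided by the total number of cohort records. As a quick check, the sketch below recomputes them in base R for the 30-day window, using the counts from the earlier tally. The variable names are illustrative only.

```r
# Counts for indication_m30_to_0, copied from the tally shown earlier
counts <- c(
  "bronchitis" = 2527,
  "sinusitis" = 18,
  "bronchitis and sinusitis" = 3,
  "unknown" = 11344,
  "none" = 7
)

# Denominator: all records in the acetaminophen cohort
total <- sum(counts)

# Percentage per category, rounded to two decimals as in the table
percent <- round(100 * counts / total, 2)
print(data.frame(n = counts, percent = percent))
```

The rounded values agree with the table: 18.18, 0.13, 0.02, 81.62 and 0.05.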

As well as getting these overall results, we can also stratify the results by some variables of interest. For example, here we stratify our results by age groups and sex.

indicationSummaryStratified <- cdm[["acetaminophen_users"]] |>
  dplyr::select(!dplyr::starts_with("indication")) |>
  PatientProfiles::addDemographics(ageGroup = list(c(0, 19), c(20, 150))) |>
  summariseIndication(
    strata = list("age_group", "sex"),
    indicationCohortName = "indications_cohort",
    indicationWindow = list(c(0, 0), c(-30, 0), c(-365, 0)),
    unknownIndicationTable = c("condition_occurrence")
  )
tableIndication(indicationSummaryStratified)
Cohort name: 161 acetaminophen
Age group                                                     Overall           0 to 19           20 to 150         Overall           Overall
Sex                                                           Overall           Overall           Overall           Female            Male
Database name                      Indication
Indication from 30 days before to the index date
Synthea synthetic health database  Bronchitis                 2,527 (18.18 %)   1,826 (29.95 %)   701 (8.98 %)      1,291 (18.43 %)   1,236 (17.93 %)
                                   Sinusitis                  18 (0.13 %)       15 (0.25 %)       3 (0.04 %)        11 (0.16 %)       7 (0.10 %)
                                   Bronchitis and sinusitis   3 (0.02 %)        2 (0.03 %)        1 (0.01 %)        1 (0.01 %)        2 (0.03 %)
                                   Unknown                    11,344 (81.62 %)  4,253 (69.77 %)   7,091 (90.88 %)   5,701 (81.40 %)   5,643 (81.84 %)
                                   None                       7 (0.05 %)        0 (0.00 %)        7 (0.09 %)        0 (0.00 %)        7 (0.10 %)
Indication from 365 days before to the index date
Synthea synthetic health database  Bronchitis                 2,615 (18.81 %)   1,883 (30.89 %)   732 (9.38 %)      1,353 (19.32 %)   1,262 (18.30 %)
                                   Sinusitis                  211 (1.52 %)      191 (3.13 %)      20 (0.26 %)       108 (1.54 %)      103 (1.49 %)
                                   Bronchitis and sinusitis   101 (0.73 %)      96 (1.57 %)       5 (0.06 %)        39 (0.56 %)       62 (0.90 %)
                                   Unknown                    10,968 (78.91 %)  3,926 (64.40 %)   7,042 (90.25 %)   5,504 (78.58 %)   5,464 (79.25 %)
                                   None                       4 (0.03 %)        0 (0.00 %)        4 (0.05 %)        0 (0.00 %)        4 (0.06 %)
Indication on index date
Synthea synthetic health database  Bronchitis                 2,524 (18.16 %)   1,823 (29.90 %)   701 (8.98 %)      1,290 (18.42 %)   1,234 (17.90 %)
                                   Sinusitis                  1 (0.01 %)        1 (0.02 %)        0 (0.00 %)        0 (0.00 %)        1 (0.01 %)
                                   Bronchitis and sinusitis   0 (0.00 %)        0 (0.00 %)        0 (0.00 %)        0 (0.00 %)        0 (0.00 %)
                                   Unknown                    11,211 (80.66 %)  4,242 (69.59 %)   6,969 (89.31 %)   5,619 (80.23 %)   5,592 (81.10 %)
                                   None                       163 (1.17 %)      30 (0.49 %)       133 (1.70 %)      95 (1.36 %)       68 (0.99 %)
indicationSummaryStratified |>
  dplyr::filter(variable_name == "Indication on index date") |>
  plotIndication(
    x = c("strata"),
    facet = c("cdm_name", "cohort_name"),
    color = "indication",
    splitStrata = FALSE
  )