
TestGenerator takes as input an Excel file in which each sheet represents a table in the OMOP CDM. The following example (testPatientsRSV.xlsx) describes a population of 10 patients, some of whom have RSV.
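
A minimal sketch of how the first sheet of the example file could be inspected. It assumes the file ships with the package under `extdata` (an assumption about the install layout, not confirmed by the text) and uses the readxl package to read the sheet.

```r
# Assumption: testPatientsRSV.xlsx is installed with TestGenerator under extdata.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")

# Read the first sheet (the person table in this example) as a tibble.
patients <- readxl::read_xlsx(filePath, sheet = 1)
patients
```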

#> # A tibble: 10 × 5
#>    person_id gender_concept_id year_of_birth race_concept_id
#>        <dbl>             <dbl>         <dbl>           <dbl>
#>  1         1              8532          1980               0
#>  2         2              8507          1980               0
#>  3         3              8532          1965               0
#>  4         4              8532          2010               0
#>  5         5              8532          1936               0
#>  6         6              8532          1970               0
#>  7         7              8532          1988               0
#>  8         8              8507          1998               0
#>  9         9              8507          1990               0
#> 10        10              8532          1945               0
#> # ℹ 1 more variable: ethnicity_concept_id <dbl>

The user can include only the tables that are relevant to the analysis.
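
The sheets present in the file can be listed with `readxl::excel_sheets()`. This is a sketch under the same assumption as before, namely that the example file is installed under the package's `extdata` folder.

```r
# Assumption: testPatientsRSV.xlsx is installed with TestGenerator under extdata.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")

# Each sheet name corresponds to one OMOP CDM table.
sheets <- readxl::excel_sheets(filePath)
sheets
```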

#> [1] "person"               "observation_period"   "condition_occurrence"
#> [4] "visit_occurrence"     "visit_detail"         "death"

TestGenerator::readPatients() converts the Excel file to JSON and saves it in the project. patientsCDM() then pushes the sample data to a blank CDM.
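
The two steps might look like the sketch below. The argument names (`filePath`, `outputPath`, `pathJson`, `testName`) and the temporary output folder are assumptions for illustration and may differ between package versions, so check them against the installed documentation.

```r
# Assumption: testPatientsRSV.xlsx is installed with TestGenerator under extdata.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")

# Hypothetical output folder for the generated JSON test definition.
outputPath <- file.path(tempdir(), "testcases")
dir.create(outputPath, showWarnings = FALSE)

# Convert the Excel file to JSON (argument names assumed).
TestGenerator::readPatients(filePath = filePath,
                            testName = "test",
                            outputPath = outputPath)

# Push the JSON patients to a blank CDM and keep the reference.
cdm <- TestGenerator::patientsCDM(pathJson = outputPath, testName = "test")

cdm
cdm$person
```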

#>  Unit Test Definition Created Successfully: 'test'
#> ! cdm name not specified and could not be inferred from the cdm source table
#>  Patients pushed to blank CDM successfully
#> 
#> ── # OMOP CDM reference (duckdb) of An OMOP CDM database ───────────────────────
#> • omop tables: person, observation_period, visit_occurrence, visit_detail,
#> condition_occurrence, drug_exposure, procedure_occurrence, device_exposure,
#> measurement, observation, death, note, note_nlp, specimen, fact_relationship,
#> location, care_site, provider, payer_plan_period, cost, drug_era, dose_era,
#> condition_era, metadata, cdm_source, concept, vocabulary, domain,
#> concept_class, concept_relationship, relationship, concept_synonym,
#> concept_ancestor, source_to_concept_map, drug_strength, cohort_definition,
#> attribute_definition
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -
#> # Source:   table<main.person> [?? x 18]
#> # Database: DuckDB v0.10.1 [unknown@Linux 6.5.0-1018-azure:R 4.3.3//tmp/RtmpnIsL9O/file1b3c67b8cae9.duckdb]
#>    person_id gender_concept_id year_of_birth month_of_birth day_of_birth
#>        <int>             <int>         <int>          <int>        <int>
#>  1         1              8532          1980             NA           NA
#>  2         2              8507          1980             NA           NA
#>  3         3              8532          1965             NA           NA
#>  4         4              8532          2010             NA           NA
#>  5         5              8532          1936             NA           NA
#>  6         6              8532          1970             NA           NA
#>  7         7              8532          1988             NA           NA
#>  8         8              8507          1998             NA           NA
#>  9         9              8507          1990             NA           NA
#> 10        10              8532          1945             NA           NA
#> # ℹ more rows
#> # ℹ 13 more variables: birth_datetime <dttm>, race_concept_id <int>,
#> #   ethnicity_concept_id <int>, location_id <int>, provider_id <int>,
#> #   care_site_id <int>, person_source_value <chr>, gender_source_value <chr>,
#> #   gender_source_concept_id <int>, race_source_value <chr>,
#> #   race_source_concept_id <int>, ethnicity_source_value <chr>,
#> #   ethnicity_source_concept_id <int>

patientsCDM() returns a CDM reference object that can now be used to test a study, for example:

runStudy(conn = attr(cdm, "dbcon"),
         cdmDatabaseSchema = "main",
         resultsDatabaseSchema = "main",
         dbName = "myTestDB",
         minCellCount = 5)