TestGenerator takes as input an Excel file in which each sheet represents a table in the OMOP-CDM. The following example (testPatientsRSV.xlsx) represents a population of 10 patients, some of them with RSV.
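The table printed next is the person sheet of that workbook. A minimal sketch of how such a sheet can be read, assuming the readxl package is installed and that the example file ships in the package's extdata folder (the file location and the `filePath` and `person` variable names are assumptions, not something this text confirms):

```r
# Assumption: testPatientsRSV.xlsx is installed with TestGenerator under
# extdata; point filePath at your own copy if it lives elsewhere.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")
stopifnot(file.exists(filePath))  # system.file() returns "" when nothing is found

# Read a single sheet; the sheet name is the OMOP-CDM table name.
person <- readxl::read_excel(filePath, sheet = "person")
person
```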
#> # A tibble: 10 × 5
#>    person_id gender_concept_id year_of_birth race_concept_id
#>        <dbl>             <dbl>         <dbl>           <dbl>
#>  1         1              8532          1980               0
#>  2         2              8507          1980               0
#>  3         3              8532          1965               0
#>  4         4              8532          2010               0
#>  5         5              8532          1936               0
#>  6         6              8532          1970               0
#>  7         7              8532          1988               0
#>  8         8              8507          1998               0
#>  9         9              8507          1990               0
#> 10        10              8532          1945               0
#> # ℹ 1 more variable: ethnicity_concept_id <dbl>
The workbook only needs to include the tables that are relevant to the analysis. In this example it contains the following sheets:
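One way to check which tables a workbook provides is to list its sheet names. This is a sketch using readxl::excel_sheets(), again assuming the example file is installed under the package's extdata folder (the `filePath` and `sheets` names are illustrative):

```r
# Assumption: the example workbook is installed with TestGenerator.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")
stopifnot(file.exists(filePath))

# Each sheet name corresponds to one OMOP-CDM table.
sheets <- readxl::excel_sheets(filePath)
sheets
```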
#> [1] "person" "observation_period" "condition_occurrence"
#> [4] "visit_occurrence" "visit_detail" "death"
TestGenerator::readPatients() converts the file into JSON format and saves it in the project. The sample data is then pushed to a blank CDM with patientsCDM().
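The two steps might look like the sketch below. The argument names (`filePath`, `testName`, `outputPath`, `pathJson`) and the use of a temporary directory are assumptions for illustration; check the package documentation for the exact signatures. The test name "test" matches the messages printed next.

```r
# Assumption: the example workbook is installed with TestGenerator.
filePath <- system.file("extdata", "testPatientsRSV.xlsx",
                        package = "TestGenerator")
stopifnot(file.exists(filePath))

# Hypothetical choice: write the JSON test definition to a temporary folder.
jsonDir <- tempdir()

# Step 1: convert the Excel sheets into a JSON test definition named "test".
TestGenerator::readPatients(filePath = filePath,
                            testName = "test",
                            outputPath = jsonDir)

# Step 2: load that definition into a blank CDM and get a CDM reference back.
cdm <- TestGenerator::patientsCDM(pathJson = jsonDir, testName = "test")

# Inspect one of the populated tables.
cdm$person
```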
#> ✔ Unit Test Definition Created Successfully: 'test'
#> ! cdm name not specified and could not be inferred from the cdm source table
#> ✔ Patients pushed to blank CDM successfully
#>
#> ── # OMOP CDM reference (duckdb) of An OMOP CDM database ───────────────────────
#> • omop tables: person, observation_period, visit_occurrence, visit_detail,
#> condition_occurrence, drug_exposure, procedure_occurrence, device_exposure,
#> measurement, observation, death, note, note_nlp, specimen, fact_relationship,
#> location, care_site, provider, payer_plan_period, cost, drug_era, dose_era,
#> condition_era, metadata, cdm_source, concept, vocabulary, domain,
#> concept_class, concept_relationship, relationship, concept_synonym,
#> concept_ancestor, source_to_concept_map, drug_strength, cohort_definition,
#> attribute_definition
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -
#> # Source: table<main.person> [?? x 18]
#> # Database: DuckDB v0.10.1 [unknown@Linux 6.5.0-1018-azure:R 4.3.3//tmp/RtmpnIsL9O/file1b3c67b8cae9.duckdb]
#>    person_id gender_concept_id year_of_birth month_of_birth day_of_birth
#>        <int>             <int>         <int>          <int>        <int>
#>  1         1              8532          1980             NA           NA
#>  2         2              8507          1980             NA           NA
#>  3         3              8532          1965             NA           NA
#>  4         4              8532          2010             NA           NA
#>  5         5              8532          1936             NA           NA
#>  6         6              8532          1970             NA           NA
#>  7         7              8532          1988             NA           NA
#>  8         8              8507          1998             NA           NA
#>  9         9              8507          1990             NA           NA
#> 10        10              8532          1945             NA           NA
#> # ℹ more rows
#> # ℹ 13 more variables: birth_datetime <dttm>, race_concept_id <int>,
#> #   ethnicity_concept_id <int>, location_id <int>, provider_id <int>,
#> #   care_site_id <int>, person_source_value <chr>, gender_source_value <chr>,
#> #   gender_source_concept_id <int>, race_source_value <chr>,
#> #   race_source_concept_id <int>, ethnicity_source_value <chr>,
#> #   ethnicity_source_concept_id <int>
patientsCDM() returns a CDM reference object that can now be used to test a study, for example:
runStudy(conn = attr(cdm, "dbcon"),
         cdmDatabaseSchema = "main",
         resultsDatabaseSchema = "main",
         dbName = "myTestDB",
         minCellCount = 5)