Introduction

omopgenerics defines an ecosystem of methods and classes, in particular the <cdm_source> class, that can be expanded. Currently there are two packages that define cdm sources:

  • omopgenerics defines a source that implements a ‘local’ (in memory) cdm source.
  • CDMConnector defines a source that provides a general implementation for DBI connections.

In this vignette we explain how to expand the omopgenerics ecosystem by defining more sources.

The source object

First we need to define a function to create our source object. The source object must be an object (usually a list) that contains the attributes that the methods will use to fulfill their purpose. Finally, we have to assign a class to our source and validate it with omopgenerics::newCdmSource(). This function has a sourceType argument, a character vector that identifies the name of the source. This is the value retrieved by the omopgenerics::sourceType() function, and it is useful to identify how the source of the cdm_reference was created.

An example of how the creation of a new source could look:

myCustomSource <- function(argument1, argument2, ...) {
  # pre calculation and validation of arguments
  ...
  
  # create the source object
  obj <- list(x = x, y = y, ...) # attributes are then accessed like: obj$x
  # or
  obj <- structure(.Data = list(), x = x, y = y, ...) # attributes are then accessed like: attr(obj, "x")
    
  # assign class
  class(obj) <- "my_custom_source"
  
  # validation
  omopgenerics::newCdmSource(src = obj, sourceType = "my_custom_type")
}

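If myCustomSource() is the first function that we create, the validation with omopgenerics::newCdmSource() will fail, because the validation checks that the methods are defined and work properly. The methods described below therefore need to exist before the constructor can run successfully.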
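The two ways of storing attributes mentioned in the sketch above can be compared with plain base R. This is a minimal illustration only: the `path` and `format` values are hypothetical, and the object is not validated with omopgenerics::newCdmSource() here.

```r
# Hypothetical connection details, used only for illustration.
path <- "path/to/data"
format <- "csv"

# Option 1: store the details as list elements.
objList <- list(path = path, format = format)
class(objList) <- "my_custom_source"
objList$path

# Option 2: store the details as attributes of an empty list.
objAttr <- structure(.Data = list(), path = path, format = format)
class(objAttr) <- "my_custom_source"
attr(objAttr, "path")

# Both objects carry the class that the methods will dispatch on.
inherits(objList, "my_custom_source")
inherits(objAttr, "my_custom_source")
```

Either layout works as long as every method reads the details the same way the constructor stored them.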

Methods

You will need to write 4 to 7 methods for your new <my_custom_source>:

  • insertTable (required): To insert local data into your source.
  • compute (required): To compute a ‘query’ into a table in your source.
  • listSourceTables (required): To list the data present in your source.
  • dropSourceTable (required): To drop a table from your source.
  • readSourceTable (recommended): To read a table from your source.
  • insertCdmTo (recommended): To insert a cdm into your source.
  • summary (recommended): To summarise and report the properties of your source.

insertTable

Purpose: To insert a local table into your source object.

Arguments:

  • cdm The cdm argument will be your source object created with the myCustomSource() function.
  • name The name to identify that table in your source.
  • table A local <tibble> to insert in your source.
  • overwrite (by default TRUE), whether to overwrite the table if it already exists in the source.
  • temporary (by default FALSE), whether the table must be temporary.

Output: The output of insertTable must be a cdm_table, so at the end your function must validate it with omopgenerics::newCdmTable().

Sketch of how the function should look:

#' @export
#' @importFrom omopgenerics insertTable
insertTable.my_custom_source <- function(cdm, name, table, overwrite = TRUE, temporary = FALSE) {
  # code to insert the table into your source
  x <- "...." # it must be a reference to your table
  
  # validate output
  omopgenerics::newCdmTable(table = x, src = cdm, name = name)
}

listSourceTables

Purpose: To list tables that are present in the source.

Arguments:

  • cdm The cdm argument will be your source object created with the myCustomSource() function.

Output: A character vector with the names of the tables present in the source. Empty identifiers ("") will be eliminated by omopgenerics.

Sketch of how the function should look:

#' @export
#' @importFrom omopgenerics listSourceTables
listSourceTables.my_custom_source <- function(cdm) {
  # code to list the tables present in source (cdm)
  x <- "...."
  
  return(x)
}
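As an illustration of what the body of such a method could contain, here is a minimal base R sketch. It assumes a hypothetical source that is a list with a `path` element pointing to a folder with one csv file per table; `listCsvTables()` is an illustrative helper, not part of omopgenerics.

```r
# Hypothetical source: a folder with one csv file per table.
folder <- file.path(tempdir(), "my_custom_source_list_demo")
unlink(folder, recursive = TRUE)
dir.create(folder)
write.csv(data.frame(person_id = 1:3), file.path(folder, "person.csv"), row.names = FALSE)
write.csv(
  data.frame(observation_period_id = 1L, person_id = 1L),
  file.path(folder, "observation_period.csv"),
  row.names = FALSE
)
src <- list(path = folder)

# Possible body for a listSourceTables method: csv file names without extension.
listCsvTables <- function(src) {
  files <- list.files(path = src$path, pattern = "\\.csv$")
  tools::file_path_sans_ext(files)
}

listCsvTables(src)
```

A real method would wrap this logic in listSourceTables.my_custom_source() and return the character vector.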

readSourceTable

Purpose: To read tables that are present in the source.

Arguments:

  • cdm The cdm argument will be your source object created with the myCustomSource() function.
  • name Name to identify the table in your source.

Output: The output of readSourceTable must be a cdm_table, so at the end your function must validate it with omopgenerics::newCdmTable().

Sketch of how the function should look:

#' @export
#' @importFrom omopgenerics readSourceTable
readSourceTable.my_custom_source <- function(cdm, name) {
  # code to read the table 'name' from source.
  x <- "...."
  
  # validate as cdm_table
  omopgenerics::newCdmTable(table = x, src = cdm, name = name)
}

dropSourceTable

Purpose: To drop a table from your source.

Arguments:

  • cdm The cdm argument will be your source object created with the myCustomSource() function.
  • name Name identifier for the table that you want to drop.

Output: The output is ignored; we recommend returning the source.

Sketch of how the function should look:

#' @export
#' @importFrom omopgenerics dropSourceTable
dropSourceTable.my_custom_source <- function(cdm, name) {
  # code to drop the table `name` present in source (cdm)
  
  return(invisible(cdm))
}
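A minimal base R sketch of the dropping logic, under the same kind of assumption as before: a hypothetical source that is a list with a `path` element pointing to a folder with one csv file per table. `dropCsvTable()` is an illustrative helper, not an omopgenerics function.

```r
# Hypothetical source: a folder with one csv file per table.
folder <- file.path(tempdir(), "my_custom_source_drop_demo")
unlink(folder, recursive = TRUE)
dir.create(folder)
write.csv(
  data.frame(cohort_definition_id = 1L),
  file.path(folder, "my_cohort.csv"),
  row.names = FALSE
)
src <- list(path = folder)

# Possible body for a dropSourceTable method: delete the file if it exists
# and return the source invisibly, as recommended above.
dropCsvTable <- function(src, name) {
  file <- file.path(src$path, paste0(name, ".csv"))
  if (file.exists(file)) {
    file.remove(file)
  }
  invisible(src)
}

out <- dropCsvTable(src, "my_cohort")
file.exists(file.path(folder, "my_cohort.csv"))
```

Checking that the table exists before removing it means that dropping a table that is not present does not raise an error or a warning.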

insertCdmTo

Purpose: To insert a cdm to your source.

Arguments:

  • cdm A cdm reference from a different source. It is recommended to collect each table before inserting it.
  • to The ‘to’ argument will be your source object created with the myCustomSource() function.

Output: The output should be the cdm reference object inserted in your to source.

Sketch of how the function should look:

#' @export
#' @importFrom omopgenerics insertCdmTo
insertCdmTo.my_custom_source <- function(cdm, to) {
  # example of how it can look like:
  tables <- names(cdm) |>
    rlang::set_names() |>
    purrr::map(\(x) omopgenerics::insertTable(cdm = to, name = x, table = dplyr::as_tibble(cdm[[x]])))
  
  omopgenerics::newCdmReference(
    tables = tables, 
    cdmName = omopgenerics::cdmName(x = cdm), 
    cdmVersion = omopgenerics::cdmVersion(x = cdm)
  )
}

The content of the function can vary depending on your source.

summary

Purpose: To summarise the metadata of your source object.

Arguments:

  • object The ‘object’ argument will be your source object created with the myCustomSource() function.
  • ... For consistency.

Output: A named list with the metadata of the source. Each element must be a string of length 1.

Sketch of how the function should look:

#' @export
summary.my_custom_source <- function(object, ...) {
  # extract metadata
  metadata1 <- "..."
  metadata2 <- "..."
  metadata3 <- "..."
  
  list(metadata1 = metadata1, metadata2 = metadata2, metadata3 = metadata3)
}
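The output requirement (a named list where each element is a string of length 1) can be checked with a small helper before returning. The sketch below uses only base R; `isValidSummary()` and the metadata values are hypothetical and do not reproduce any check that omopgenerics itself performs.

```r
# Hypothetical metadata that a summary method could return.
metadata <- list(
  type = "my_custom_type",
  path = "path/to/data",
  version = "0.1.0"
)

# Hypothetical helper: TRUE if x is a fully named list of length-1 strings.
isValidSummary <- function(x) {
  is.list(x) &&
    !is.null(names(x)) &&
    all(names(x) != "") &&
    all(vapply(x, function(el) is.character(el) && length(el) == 1, logical(1)))
}

isValidSummary(metadata)

# An integer element does not meet the requirement; convert it first.
isValidSummary(list(type = "my_custom_type", nTables = 3L))
isValidSummary(list(type = "my_custom_type", nTables = as.character(3L)))
```

Converting numeric metadata with as.character() before building the list is one simple way to satisfy the requirement.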

compute

This function works slightly differently from the rest: its input is a query instead of the source object.

Purpose: To compute a table into a permanent placeholder in your source.

Arguments:

  • x Query to compute.
  • name The name to identify the resultant table in your source.
  • temporary (by default FALSE), whether the table must be temporary.
  • overwrite (by default TRUE), whether to overwrite the table if it already exists in the source.
  • ... For consistency.

Output: The output of compute must be a reference to your table in your source data. It will be converted later to a cdm_table (but you do not have to worry about that).

Sketch of how the function should look:

#' @export
#' @importFrom dplyr compute
compute.my_custom_source <- function(x, name, temporary = FALSE, overwrite = TRUE, ...) {
  # code to compute the table into your source
  x <- "...." # it must be a reference to your table
  return(x)
}

The cdm reference object

Finally, every <cdm_source> class also needs a function to create a cdm reference. To do that, you just have to read all the tables that you want to include in your cdm object. tables must be a list of cdm_table objects with the same source.

cdmFromMyCustomSource <- function(argument1, argument2, ...) {
  # read and prepare the cdm tables
  ...
  
  # return the cdm object
  omopgenerics::newCdmReference(
    tables = tables, # list of cdm and achilles standard tables
    cdmName = "...", # usually provided as input, but also you might want to search in the cdm_source
    cdmVersion = "..." # either "5.3" or "5.4"
  )
}

If you want to add cohort tables to your object, do it after the initial cdm creation, like:

# read from source 
cdm <- readSourceTable(cdm = cdm, name = "my_cohort")

# or insert from local
cdm <- insertTable(cdm = cdm, name = "my_cohort", table = localCohort)
cdm$my_cohort <- cdm$my_cohort |>
  newCohortTable(
    cohortSetRef = cohort_set, # table with the settings of the cohort_table
    cohortAttritionRef = cohort_attrition, # table with the attrition of the cohort_table
    cohortCodelistRef = cohort_codelist # table with the codelists of the cohort_table
  )

This step can be included in your cdm object creation if you wish:

cdmFromMyCustomSource <- function(argument1, argument2, ..., cohortTables) {
  # read and prepare the cdm tables
  ...
  
  # return the cdm object
  cdm <- omopgenerics::newCdmReference(
    tables = tables, # list of cdm and achilles standard tables
    cdmName = "...", # usually provided as input, but also you might want to search in the cdm_source
    cdmVersion = "..." # either "5.3" or "5.4"
  )
  
  # read cohort tables
  readSourceTable(cdm = cdm, name = cohortTables)
}