Logging
Logging is a common practice in studies, specially when sharing code.
Logging can be useful to check timings or record error messages. There
exist multiple packages in R that allow you to record these log
messages. For example the logger
package is quite
useful.
Logging with omopgenerics
omopgenerics
does not want to replace any of these
packages, we just provide simple functionality to log messages. In the
future we might consider building this on top of one of the existing log
packages, but for the moment we have these three simple functions:
-
createLogFile()
It is used to create the log file. -
logMessage()
It is used to record the messages that we want in the log file, note those messages will also be displayed in the console. IflogFile
does not exist the message is only displayed in the console. -
summariseLogFile()
It is used to read the log file and format it into asummarised_result
object.
Example
Let’s see a simple example of logging with omopgenerics:
library(omopgenerics, warn.conflicts = FALSE)
# create the log file
createLogFile(logFile = tempfile(pattern = "log_{date}_{time}"))
#> ℹ Creating log file: /tmp/RtmpbaTZ0S/log_2025_05_06_23_24_4324f06568ed94.txt.
#> [2025-05-06 23:24:43] - Log file created
# study
logMessage("Generating random numbers")
#> [2025-05-06 23:24:43] - Generating random numbers
x <- runif(1e6)
logMessage("Calculating the sum")
#> [2025-05-06 23:24:43] - Calculating the sum
result <- sum(x)
# export logger to a `summarised_result`
log <- summariseLogFile()
#> [2025-05-06 23:24:43] - Exporting log file
# content of the log file
readLines(getOption("omopgenerics.logFile")) |>
cat(sep = "\n")
#> [2025-05-06 23:24:43] - Log file created
#> [2025-05-06 23:24:43] - Generating random numbers
#> [2025-05-06 23:24:43] - Calculating the sum
#> [2025-05-06 23:24:43] - Exporting log file
# `summarised_result` object
log
#> # A tibble: 4 × 13
#> result_id cdm_name group_name group_level strata_name strata_level
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 unknown overall overall log_id 1
#> 2 1 unknown overall overall log_id 2
#> 3 1 unknown overall overall log_id 3
#> 4 1 unknown overall overall log_id 4
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> # additional_name <chr>, additional_level <chr>
# `summarised_result` object settings
settings(log)
#> # A tibble: 1 × 8
#> result_id result_type package_name package_version group strata additional
#> <int> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 summarise_log_… omopgenerics 1.1.1.900 "" log_id ""
#> # ℹ 1 more variable: min_cell_count <chr>
# tidy version of the `summarised_result`
tidy(log)
#> # A tibble: 4 × 5
#> cdm_name log_id variable_name variable_level date_time
#> <chr> <chr> <chr> <chr> <chr>
#> 1 unknown 1 Log file created NA 2025-05-06 23:24:43
#> 2 unknown 2 Generating random numbers NA 2025-05-06 23:24:43
#> 3 unknown 3 Calculating the sum NA 2025-05-06 23:24:43
#> 4 unknown 4 Exporting log file NA 2025-05-06 23:24:43
Note that if the logFile is not created the logMessage()
function only displays the message in the console.
exportSummarisedResult
The exportSummarisedResult()
exports by default the
logger if there is one. See example code:
library(dplyr, warn.conflicts = FALSE)
library(tidyr, warn.conflicts = FALSE)
# create the log file
createLogFile(logFile = tempfile(pattern = "log_{date}_{time}"))
#> ℹ Creating log file: /tmp/RtmpbaTZ0S/log_2025_05_06_23_24_4424f0151bd276.txt.
#> [2025-05-06 23:24:44] - Log file created
# start analysis
logMessage("Deffining toy data")
#> [2025-05-06 23:24:44] - Deffining toy data
n <- 1e5
x <- tibble(person_id = seq_len(n), age = rnorm(n = n, mean = 55, sd = 20))
logMessage("Summarise toy data")
#> [2025-05-06 23:24:44] - Summarise toy data
res <- x |>
summarise(
`number subjects_count` = n(),
`age_mean` = mean(age),
`age_sd` = sd(age),
`age_median` = median(age),
`age_q25` = quantile(age, 0.25),
`age_q75` = quantile(age, 0.75)
) |>
pivot_longer(
cols = everything(),
names_to = c("variable_name", "estimate_name"),
names_sep = "_",
values_to = "estimate_value"
) |>
mutate(
result_id = 1L,
cdm_name = "mock data",
variable_level = NA_character_,
estimate_type = if_else(estimate_name == "count", "integer", "numeric"),
estimate_value = as.character(estimate_value)
) |>
uniteGroup() |>
uniteStrata() |>
uniteAdditional() |>
newSummarisedResult()
#> `result_type`, `package_name`, and `package_version` added to
#> settings.
# res is a summarised_result object that we can export using the `exportSummarisedResult`
tempDir <- tempdir()
exportSummarisedResult(res, path = tempDir)
#> [2025-05-06 23:24:44] - Exporting log file
exportSummarisedResult()
also exported the log file,
let’s see it. Let’s start importing the exported
summarised_result
object:
result <- importSummarisedResult(tempDir)
#> Reading file: /tmp/RtmpbaTZ0S/results_mock data_2025_05_06.csv.
#> Converting to summarised_result:
#> /tmp/RtmpbaTZ0S/results_mock data_2025_05_06.csv.
We can see that the log file is exported see
result_type = "summarise_log_file"
:
result |>
settings() |>
glimpse()
#> Rows: 2
#> Columns: 8
#> $ result_id <int> 1, 2
#> $ result_type <chr> "", "summarise_log_file"
#> $ package_name <chr> "", "omopgenerics"
#> $ package_version <chr> "", "1.1.1.900"
#> $ group <chr> "", ""
#> $ strata <chr> "", "log_id"
#> $ additional <chr> "", ""
#> $ min_cell_count <chr> "5", "5"
The easiest way to explore the log is using the tidy()
version:
result |>
filterSettings(result_type == "summarise_log_file") |>
tidy()
#> # A tibble: 4 × 5
#> cdm_name log_id variable_name variable_level date_time
#> <chr> <chr> <chr> <chr> <chr>
#> 1 mock data 1 Log file created NA 2025-05-06 23:24:44
#> 2 mock data 2 Deffining toy data NA 2025-05-06 23:24:44
#> 3 mock data 3 Summarise toy data NA 2025-05-06 23:24:44
#> 4 mock data 4 Exporting log file NA 2025-05-06 23:24:44