Identify pregnancy episodes in OMOP CDM data using the HIPPS algorithm (Smith et al. 2024, doi:10.1093/jamia/ocae195).
Observational health data rarely has explicit pregnancy_start or pregnancy_end variables. More often we get scattered pregnancy-related events such as a live birth, a gestational week 12 record, a delivery procedure, or a miscarriage. PregnancyIdentifier turns these pregnancy-related codes into:
- One row per pregnancy episode.
- Inferred start and end dates (and precision) from gestational timing evidence.
- Standard outcome categories (LB, SB, AB, SA, ECT, DELIV, PREG) you can use in analyses or exports.
The pipeline builds outcome-anchored episodes (HIP) and timing-anchored episodes (PPS), merges them (HIPPS), then refines start dates (ESD), so you get a consistent definition of a pregnancy across sites and data sources.
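To make "inferred start dates from gestational timing evidence" concrete, here is a toy sketch in base R. It is not the HIP, PPS, or ESD logic used by the package. It assumes each timing record means "gestational week W on date D", and the data frame and column names (timing, event_date, gest_week) are hypothetical.

```r
# Toy illustration only: NOT the algorithm implemented by PregnancyIdentifier.
# Assumption: each record says "the person was at gestational week W on date D".
timing <- data.frame(
  person_id  = c(1, 1, 1),
  event_date = as.Date(c("2020-03-01", "2020-04-12", "2020-06-10")),
  gest_week  = c(8, 14, 22)
)

# Each record implies a start date: event date minus elapsed gestational days.
timing$implied_start <- timing$event_date - timing$gest_week * 7

# Combine the records: take the median as the estimate and the range between
# records as a crude stand-in for how precisely the start date is known.
starts <- timing$implied_start
inferred_start <- median(starts)
spread_days <- as.integer(diff(range(starts)))
```

When the records agree closely, the spread is small and the start date is well constrained. Conflicting records widen it, which is the kind of uncertainty a precision column is meant to capture.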
How to use it
Install (requires R ≥ 4.1 and CDMConnector):
# From GitHub (DARWIN EU)
remotes::install_github("darwin-eu/PregnancyIdentifier")
Run the full pipeline (initializes concepts, runs HIP → PPS → merge → ESD, writes outputs):
library(PregnancyIdentifier)
library(CDMConnector)
cdm <- mockPregnancyCdm() # or your real cdm_reference
runPregnancyIdentifier(
  cdm = cdm,
  outputDir = "pregnancy_output",
  startDate = as.Date("2000-01-01"),
  endDate = Sys.Date(),
  runExport = FALSE
)
Use the result: pregnancy_output/final_pregnancy_episodes.rds is a data frame with one row per pregnancy episode: person_id, final_episode_start_date, final_episode_end_date, final_outcome_category, esd_precision_days, and other esd_* QA/concordance columns. Load it for cohort definition, export, or further analysis.
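As a minimal sketch of working with that file, the example below builds a toy data frame with the columns listed above, saves it to a temporary .rds file, and reads it back. The values are made up. The 14-day cut-off and the reading that a smaller esd_precision_days means a tighter start-date estimate are assumptions for illustration only.

```r
# Toy stand-in for final_pregnancy_episodes.rds; all values are invented.
episodes <- data.frame(
  person_id                = c(101, 101, 102),
  final_episode_start_date = as.Date(c("2015-02-01", "2017-06-15", "2019-09-10")),
  final_episode_end_date   = as.Date(c("2015-11-05", "2017-09-01", "2020-06-12")),
  final_outcome_category   = c("LB", "SA", "LB"),
  esd_precision_days       = c(7, 28, 14)
)
path <- file.path(tempdir(), "final_pregnancy_episodes.rds")
saveRDS(episodes, path)

# In real use, point readRDS() at "pregnancy_output/final_pregnancy_episodes.rds".
episodes <- readRDS(path)

# Episode length in days, from the inferred start and end dates.
episodes$episode_days <- as.integer(
  episodes$final_episode_end_date - episodes$final_episode_start_date
)

# Number of episodes per outcome category.
outcome_counts <- table(episodes$final_outcome_category)

# Live births with esd_precision_days of at most 14
# (assumed here to mean a more precise start date; the threshold is arbitrary).
lb_precise <- subset(
  episodes,
  final_outcome_category == "LB" & esd_precision_days <= 14
)
```

The same pattern extends to cohort definition: filter on final_outcome_category and the date columns, then join back to the CDM on person_id.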
Optional: export de-identified summary CSVs and a ZIP, either automatically after ESD by setting runExport = TRUE or by calling exportPregnancies() yourself:
runPregnancyIdentifier(cdm, outputDir = "pregnancy_output", runExport = TRUE)
# or:
exportPregnancies(cdm, outputDir = "pregnancy_output", exportDir = "pregnancy_output/export")
Documentation
- Vignettes: Pipeline overview, HIP, PPS, Merge, ESD, Export.
- Reference: pkgdown site.
- Issues: GitHub issues.