Facilitating Harmonization of Variables in Framingham, MESA, ARIC, and REGARDS Studies Through a Metadata Repository

Pratheek Mallya, Laura M. Stevens, Juan Zhao, Chuan Hong, Ricardo Henao, Nicoleta Economou-Zavlanos, Daniel M. Wojdyla, Tony Schibler, Vihaan Manchanda, Michael J. Pencina, Jennifer L. Hall*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


BACKGROUND: High-quality research in cardiovascular prevention, as in other fields, requires inclusion of a broad range of data sets from different sources. Integrating and harmonizing different data sources are essential to increase generalizability, sample size, and representation of understudied populations, strengthening the evidence for the scientific questions being addressed.

METHODS: Here, we describe an effort to build an open-access repository and interactive online portal for researchers to access the metadata and code harmonizing data from 4 well-known cohort studies: the REGARDS (Reasons for Geographic and Racial Differences in Stroke) study, FHS (Framingham Heart Study), MESA (Multi-Ethnic Study of Atherosclerosis), and ARIC (Atherosclerosis Risk in Communities) study. We introduce a methodology and a framework used for preprocessing and harmonizing variables from multiple studies.

RESULTS: We provide a real-case study and step-by-step guidance to demonstrate the practical utility of our repository and interactive web page. In addition to our successful development of such an open-access repository and interactive web page, this exercise in harmonizing data from multiple cohort studies has revealed several key themes. These themes include the importance of careful preprocessing and harmonization of variables, the value of creating an open-access repository to facilitate collaboration and reproducibility, and the potential for using harmonized data to address important scientific questions and disparities in cardiovascular disease research.

CONCLUSIONS: By integrating and harmonizing these large-scale cohort studies, such a repository may improve the statistical power and representation of understudied cohorts, enabling development and validation of risk prediction models, identification and investigation of risk factors, and creation of a platform for racial disparities research.

REGISTRATION: URL: https://precision.heart.org/duke-ninds.

Original language: English (US)
Pages (from-to): E009938
Journal: Circulation: Cardiovascular Quality and Outcomes
Issue number: 11
State: Published - Nov 1 2023


Keywords

  • atherosclerosis
  • cardiovascular disease
  • metadata
  • sample size
  • stroke

ASJC Scopus subject areas

  • Cardiology and Cardiovascular Medicine

