MVP Data Available for Research
Through centralized data collection, cleaning and curation, MVP has a wealth of health records, self-reported surveys, and genetic data available for research, with generation of other omics data underway. Researchers contribute to the data cleaning and curation of phenotypes through the VA’s centralized phenomics library, also known as CIPHER. Note: Access is only available to VA system users.
Applying for Access to MVP Data
MVP genomic and phenotypic data is available to VA researchers through a research merit review process with VA’s Office of Research and Development (ORD). While opportunities for accessing MVP data are evolving, access is currently limited to VA-affiliated researchers, who can submit proposals in response to Requests for Applications (RFAs) from the ORD services: Biomedical Laboratory (BLRD), Clinical (CSRD), Health Services (HSRD), and Rehabilitation (RRD). See RFAs and Program Announcements (va.gov).
When joining MVP, Veterans contribute the following information for research:
VA Health Records
The VA electronic health record (EHR) contains records for millions of Veterans, including the roughly 9 million Veterans currently using the VA and millions more who used care in the past. It contains patient data from inpatient and outpatient visits, including diagnoses, procedures, laboratory tests, prescriptions, clinical notes, reports, and imaging. VA was one of the first hospital systems to adopt an EHR system in the 1980s, and the current system has been in use for over 20 years.
Access to MVP data is available to VA researchers on approved VA-funded projects. While the program is working to broaden MVP data access by increasing computational capacity and assessing the regulatory landscape, there is currently no mechanism for non-VA research studies to access MVP data.
VA is establishing a Data Commons where MVP data will be available to the broader research community in the coming years.
Self-reported surveys
- The MVP Baseline and Lifestyle Surveys collect information on Veterans’ health and wellbeing, including military experiences and exposures, family medical history, dietary habits, and much more. These surveys are requested from every participant in MVP and have been in use since the program launched in 2011.
- In 2016, a Gulf War Era Survey was launched to collect information from a subset of participants serving during that era.
- In response to the COVID-19 pandemic, the MVP COVID-19 Survey was developed and collected from participants between May 2020 and September 2021 to understand how the pandemic affected Veterans in particular.
Genetic and omic data
Veterans provide a blood sample, which is processed into DNA and plasma aliquots for genotyping and other analyses, including whole genome sequencing (WGS), methylation, proteomics, and metabolomics. The remaining sample is stored in a VA Central Biorepository for future use.
Other data sources
MVP requests additional data from sources both internal and external to VA, based on the needs of research projects. This data is integrated into the MVP repository for active MVP enrollees. Other data sources include:
- National Death Index (NDI): NDI contains date and cause of death obtained from state vital statistics offices. The data also includes ICD descriptions for the underlying cause of death and for additional conditions. It supplements the death records in the VA and is provisioned by request to approved MVP projects.
- Centers for Medicare and Medicaid Services (CMS): CMS data is provisioned by request to approved MVP projects. It covers healthcare information captured by Medicare or Medicaid for active MVP enrollees, such as demographics, beneficiary summaries, inpatient and outpatient visits, vital status, facility and long-term care information, and prescription drugs.
Data snapshot: Data details
Surveys completed *
585,000+ Baseline Survey
460,000+ Lifestyle Survey
45,000+ Gulf War Survey
255,000+ COVID-19 Survey
*Reflects 1,000,000+ Veteran enrollees
- Genotype array data is available to approved researchers, and additional omics data types are being made available on an ongoing basis.
- ~650,000 individuals genotyped using a custom Affymetrix genotype array
- Imputed to hybrid 1000Genomes/African Genome Resource reference panel
- Imputed to TOPMed reference panel
- Minority-specific genotype array with over 750,000 genetic variants, including over 300,000 that are more common in minority populations and relevant to their health and well-being (processing underway)
- ~140,000 whole genome sequences (processing underway)
- 90,000 methylation arrays (coming soon)
- Metabolomics and proteomics pilots underway