Researchers have discovered more than 275 million previously unreported genetic variants, identified from data shared by nearly 250,000 participants of the National Institutes of Health’s All of Us Research Program. Half of the genomic data are from participants of non-European genetic ancestry. The unexplored cache of variants provides researchers new pathways to better understand the genetic influences on health and disease, especially in communities who have been left out of research in the past.
Nearly 4 million of the newly identified variants are in areas that may be tied to disease risk. The genomic data detailed in the study are available to registered researchers in the Researcher Workbench, the program’s platform for data analysis.
“As a physician, I’ve seen the impact the lack of diversity in genomic research has had in deepening health disparities and limiting care for patients,” said Josh Denny, M.D., M.S., chief executive officer of the All of Us Research Program and an author of the study. “The All of Us dataset has already led researchers to findings that expand what we know about health – many that may not have been possible without our participants' contributions of DNA and other health information. Their participation is setting a course for a future where scientific discovery is more inclusive, with broader benefits for all.”
To date, more than 90% of participants in large genomics studies have been of European genetic ancestry. NIH Institute and Center directors noted in an accompanying commentary article in Nature Medicine that this has led to a narrow understanding of the biology of diseases and impeded the development of new treatments and prevention strategies for all populations. They emphasized that many researchers are now using the All of Us dataset to advance precision medicine for all.
For example, in a companion study published in Communications Biology, a research team led by Baylor College of Medicine, Houston, reviewed the frequency of genes and variants recommended by the American College of Medical Genetics and Genomics across different genetic ancestry groups in the All of Us dataset. These genes and variants mirror those in the program’s Hereditary Disease Risk research results offered to participants. The authors found significant variability in the frequency of variants associated with disease risk, both between genetic ancestry groups and in comparison with other large genomic datasets.
While more research is needed before these findings can be used to tailor genetic testing recommendations for specific populations, researchers believe the difference in the number of these variants may be influenced by past studies’ limited diversity and their disease-focused approach to participant enrollment, rather than a difference in the prevalence of the variants.
In a separate study, investigators with the eMERGE program tapped the All of Us dataset to calibrate and implement 10 polygenic risk scores for common diseases across diverse genetic ancestry groups. These scores calculate an individual’s risk of disease by taking into account genetic and family history factors. Without accounting for diversity, polygenic risk scores could produce inaccurate results that misrepresent a person’s risk for disease and create inequitable genetic tools. Without the diversity of the All of Us data, these polygenic risk scores would have been applicable to only part of the population.
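As a rough illustration of why calibration matters, the sketch below treats a polygenic risk score as a weighted sum of risk-allele counts and then expresses it relative to a reference group. It is a simplified teaching example, not the eMERGE method: the function names, the three variant weights, and the reference genotypes are all hypothetical, and family history is left out.

```python
from statistics import mean, pstdev


def raw_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def calibrated_score(score, reference_scores):
    """Express a raw score in standard deviations from a reference group's mean.

    Assumption for this sketch: the reference group is chosen to match the
    individual's genetic ancestry, so the same raw score is not misread
    against a population with different variant frequencies.
    """
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have no spread")
    return (score - mu) / sigma


# Hypothetical effect weights for three variants (illustrative values only).
weights = [0.5, 0.25, 1.0]

# Hypothetical individual: risk-allele counts at the same three variants.
person = [2, 0, 1]

# Hypothetical ancestry-matched reference genotypes.
reference_group = [[0, 0, 0], [2, 2, 2], [1, 1, 1]]
reference_scores = [raw_score(g, weights) for g in reference_group]

person_raw = raw_score(person, weights)
person_z = calibrated_score(person_raw, reference_scores)
print(f"raw score: {person_raw:.2f}, calibrated score: {person_z:.2f}")
```

The same raw score can sit at a very different position in the distribution depending on which reference group it is compared against, which is one way scores built from a single population can misstate risk for others.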
“All of Us values intentional community engagement to ensure that populations historically underrepresented in biomedical research can also benefit from future scientific discoveries,” said Karriem Watson, D.H.Sc., M.S., M.P.H., chief engagement officer of the All of Us Research Program. “This starts with building awareness and improving access to medical research so that everyone has the opportunity to participate.”
More than 750,000 people have enrolled in All of Us to date. Ultimately, the program plans to engage at least one million people who reflect the diversity of the United States and contribute data from DNA, electronic health records, wearable devices, surveys, and more over time. The program regularly expands and refreshes the dataset as more participants share information.