In October 1990, the U.S. Department of Energy and the National Institutes of Health announced the launch of the Human Genome Project (HGP), a 15-year international effort to sequence the entire human genome in order to better understand inherited traits, disease characteristics, drug efficacy, and more. Since the project's completion in 2003, thousands of subsequent findings have been published, with much of the research focusing on oncology, given that many cancers are driven by specific genetic markers. With the data produced by the HGP, researchers have discovered new types of cancers, new resistance mechanisms, and much more, thanks to the democratized nature of the data and the rapid alignment of incentives across oncology to advance care and research (see Figure 1).
Given the success of the HGP, it is easy to understand how rapid progress was achieved with diseases like cancer, which have a strong genetic basis. But what about other diseases? How can we better understand how diseases interact with each other and with different therapies? How can we see what commonalities exist between etiologies involving genetic and environmental factors? And now, importantly, how can we study the long-term impact of diseases such as COVID-19 on the human body and on clinical care? As we have done with cancer, how can we get a 20,000-foot view of all diseases?
Healthcare Data from the Real World Can Provide Insights
We can answer many of these questions with real-world data: the everyday healthcare information generated by patient visits, lab testing, medical imaging, and more. As an industry, we have seen the impact that data has had on advancing cancer care. Now, we should turn our attention toward advancing all therapeutic areas through data with the same focus and commitment.
Where to Start?
Making this shift starts with the foundation.
Aligning technology, infrastructure, and data processing across the healthcare ecosystem is the critical first step. We are not far off: most electronic health record (EHR) systems share common structured data collection fields. But we need to ensure that privacy and security are actively addressed to enable trusted data integration across siloed systems. As a data scientist who has been steeped in healthcare data for nearly two decades, I am confident these technological challenges are solvable with creativity and dedication.
Once the foundational elements are addressed, the door opens to incentive alignment across healthcare stakeholders, including patients, providers, payers, researchers, and more. I envision a world where stakeholders are rewarded for the insights they extract from data rather than for holding the data itself. This could discourage keeping real-world data siloed and encourage sharing de-identified data to accelerate clinical research and advance the practice of medicine.
Starting Small with Incentives Is Key
While drastically changing the incentives paradigm may seem inconceivable right now, there is an opportunity to start small. A few incentive structures at the right places in the healthcare ecosystem can go a long way toward making real-world data fit for observational research. We saw what aligned incentives can do with the rapid design, development, and deployment of the COVID-19 vaccines. Where there is a will, there is a way.
Just as the Human Genome Project spurred a whole host of advancements in oncology, real-world data could act as a similar catalyst across other diseases. As a technologist and data scientist, I am ready, willing, and able to do my part, and I know I am far from alone in wanting to apply my expertise to help change and save lives.