July 10, 2023: EHR Core’s Keith Marsolo Shares Scientific Goals and Resources Needed for Data Sharing

In an interview at the NIH Pragmatic Trials Collaboratory Steering Committee Meeting in May, Keith Marsolo, PhD, Co-Chair of the Electronic Health Records Core, reflected on data sharing experiences from the program and factors affecting the meaningful re-use of data.

“The data sharing experiences from the NIH Pragmatic Trials Collaboratory Trials have been fairly limited. Most studies have only had one or two requests for data. I think part of the impetus for the new NIH Data Management and Sharing Policy is to spur additional re-use of research data,” Marsolo said. He elaborated that the scientific goals for data sharing include transparency, reproducibility, validation, new generative science, and respecting the contribution of participants.

Data that can be shared include scientific data (raw data, analytic datasets, etc.) and metadata (protocol, analytic code, statistical analysis plan, etc.). “If we are going to share data, we want it to be useful. For new science, you likely need the raw data. For reproducibility, you need a lot of types of data. Study teams can be more upfront about the data and metadata that can be shared and the use cases they can support,” Marsolo said.

Because data from pragmatic trials are collected as part of routine care, they contain not only personal health data but also data about the healthcare system. The resulting restrictions may limit what can actually be shared. Overall, Marsolo suggested that more information is needed about what data can be shared and how this translates into the goals for re-use.

“I think if we’re not clear about what limitations of the data are, there could be a mismatch between expectations of what we can get from these data and what’s actually achievable,” Marsolo said.

With regard to resources needed for data sharing, Marsolo stated, “There might be funding that needs to be allocated to promote data sharing. For example, PCORI [The Patient-Centered Outcomes Research Institute] has a model where they allocate funding specifically for data sharing and dissemination.”

The EHR Core can help promote data sharing by providing examples of how to navigate data sharing and by outlining different approaches for pragmatic trials. NIH can provide additional guidance on how to handle datasets with restrictions: what to share, for what purpose, and at what cost.

For more information, see the article on data sharing by Marsolo and colleagues, in which the authors suggest that data sharing is not rising to its potential and that more guidance is needed to prevent data sharing from becoming a “box-checking exercise.” There is also a Living Textbook chapter on Data Sharing and Embedded Research.

All of the materials from the 2023 Steering Committee meeting are now available.