Funding agencies and academic journals have recently shifted policies surrounding research data sharing. This push toward open science is intended to advance discoveries along with the evaluation, replication, and verification of results. While these policies encourage data sharing and suggest archives for data deposit, they do not provide guidance for the curation of data. What are the values and costs associated with sharing research data? And how does curation impact data discovery and reuse? Questions about how to efficiently allocate data archiving resources to meet increased demand for data sharing remain unanswered.
This webinar introduces “Measuring and Improving the Efficacy of Curation Activities in Data Archives” or MICA, a three-year National Digital Infrastructures and Initiatives project led by investigators at the University of Michigan School of Information in partnership with the Inter-university Consortium for Political and Social Research (ICPSR). The goal of this project is to understand how curatorial actions impact the use of digital collections.
This project is assessing stakeholder needs, priorities, and values for data reuse. It is also experimenting with machine learning methods to extract and classify curation activities from logs as well as discover incomplete data citations from literature. These methods will allow us to associate curation efforts with data reuse patterns and develop curatorial metrics for measuring the impact of curation activities. Early findings from an analysis of data downloads indicate relationships among dataset features, the intensity of curation activities, and data reuse.
This presentation is open to anyone interested in digital curation and archiving, as well as in the potential for machine learning to support these activities.
ABOUT THE SPEAKER
Sara Lafia is a Postdoctoral Research Fellow in the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan. Her research is currently supported by an NSF project, Developing Evidence-based Data Sharing and Archiving Policies, where she is analyzing curation activities, automatically detecting data citations, and contributing to the development of metrics for tracking the impact of data reuse. She holds a Ph.D. in Geography from UC Santa Barbara and is also interested in geospatial applications, designing linked data models, and developing data visualizations.
DATE/TIME: 3 December 2020 at 14:00 CET.
Sign up for this webinar here.