
Research Data Management and Curation

This guide covers the data life-cycle and how it relates to your research.

Research Data Lifecycle

The research data lifecycle describes the process of scientific discovery. The cycle is usually represented as a wheel to show how new research builds on prior research. Although the terminology and number of steps vary (for example, the model below is from Harvard Medical School), the idea remains the same.

The core process steps are: Plan, Acquire, Process, Analyze, Preserve, Publish/Share, and Reuse.

Other elements that are sometimes included as interconnected to the main steps are data backup and security, quality assessment and assurance, documentation, and metadata.

[Figure: Harvard Medical School model for research data management, shown as a wheel with six sections: Plan & Design, Collect & Create, Analyze & Collaborate, Evaluate & Archive, Share & Disseminate, and Access & Reuse.]

Planning involves learning and thinking about how you will incorporate data management into your research. Formally, this is documented in a Data Management Plan (DMP), which outlines the major decisions for each step of the research data lifecycle.

The next step is to acquire the data for your project. This may include creating, collecting, or generating new data or obtaining existing data. 

Once you have collected your data, it usually needs some processing before it is analysis-ready. This may involve validating, summarizing, integrating, subsetting, or applying other transformations to the data. It is important to document the steps you take to change the data from its original format, so that your processing can be reproduced later.
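As a sketch of what such a processing step can look like in practice, the snippet below validates raw records, subsets them to the columns of interest, applies a unit conversion, and keeps a running log of each transformation so the processing can be reproduced. The column names, data values, and validation rules here are all hypothetical:

```python
import csv
import io

# Hypothetical raw data: the columns and values below are illustrative only.
RAW = """site,date,temp_c,notes
A,2023-06-01,21.5,clear
A,2023-06-02,,sensor fault
B,2023-06-01,19.8,clear
"""

def process(raw_text):
    """Validate, subset, and transform raw records, logging each step."""
    steps = []  # provenance log: what was done, in order
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    steps.append(f"read {len(rows)} raw records")

    # Validation: drop records with a missing measurement.
    valid = [r for r in rows if r["temp_c"]]
    steps.append(f"dropped {len(rows) - len(valid)} records with missing temp_c")

    # Subsetting and transformation: keep analysis columns, convert units.
    processed = [
        {"site": r["site"], "date": r["date"],
         "temp_f": round(float(r["temp_c"]) * 9 / 5 + 32, 1)}
        for r in valid
    ]
    steps.append("converted temp_c to temp_f; kept site, date, temp_f")
    return processed, steps

data, log = process(RAW)
print(data[0])  # {'site': 'A', 'date': '2023-06-01', 'temp_f': 70.7}
print(log)
```

Saving the log alongside the processed file (for example, in a README) is one simple way to record how the data diverged from its original format.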

Analysis is when you begin interpreting and drawing conclusions from the processed data. This may involve statistics, visualization, spatial analysis, image analysis, or modeling. Whatever the method, documenting the process is just as important as saving your data.

Once your analysis is complete, the next step is to preserve your data. This could mean submitting your data to a repository, planning for long-term archiving, or any other process aimed at ensuring long-term access to your data.

Finally, it is time to Publish and Share your data. In addition to publishing an article, this step includes other activities to increase the findability of your data, such as submitting records to data catalogues, assigning DOIs, and creating searchable metadata.

Reuse is the connection between sharing and planning, starting the cycle over again for the next iteration of the research lifecycle. 


FAIR Data and the FAIR Principles are a related conceptual model for research data management. FAIR mostly focuses on the sharing, documentation, and reuse aspects of the research data lifecycle, but it can also serve as a framework for implementing research data management more broadly.

First introduced in a 2016 paper (Wilkinson et al., Scientific Data), the FAIR framework presents a methodology for data management focused on creating Findable, Accessible, Interoperable, and Reusable data. The goal is to move toward a data culture that supports machine actionability and automation, given the increasing volume, complexity, and creation speed of data.

Findable - Data should be easy to find for both humans and computers. Machine-readable metadata files are important for the discovery of datasets because they can be searched algorithmically.

Accessible - Users need to be able to retrieve data, and it should be clear how to access datasets both manually and programmatically.

Interoperable - Data usually needs to be integrated with other data and to work across different applications and software packages, which calls for standard formats and vocabularies.

Reusable - FAIR's ultimate goal is to optimize the reuse of data, which relies on researchers creating well-described metadata so that data can be replicated and combined in different settings.
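To make the "machine-readable metadata" idea concrete, here is a minimal sketch of a dataset description using schema.org's Dataset vocabulary serialized as JSON-LD, one common way to make datasets discoverable by search engines. Every specific value below (the title, DOI, creator, keywords) is a placeholder, not a real record:

```python
import json

# A minimal, illustrative metadata record; every value below is a placeholder.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example stream temperature measurements",
    "description": "Hypothetical daily stream temperatures, 2023.",
    "identifier": "https://doi.org/10.0000/example",  # placeholder DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["hydrology", "temperature", "example"],
    "creator": {"@type": "Person", "name": "A. Researcher"},
}

metadata_json = json.dumps(record, indent=2)
print(metadata_json)
```

Because the record is structured JSON rather than free text, a catalogue or search engine can index its fields algorithmically, which is exactly the property the Findable principle asks for.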

Open Science Framework

Open Science Framework (OSF) is a free, open-source online tool that helps researchers plan and track collaborative projects and conduct transparent, accurate, and meaningful research. It was created by the non-profit Center for Open Science, which believes transparency is important at every step of the research process.

Specific uses of the platform include documenting and tracking contributions across institutions, managing and storing files, writing and sharing documentation, and providing a centralized location for following other research projects.
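OSF also exposes a public REST API (version 2, at api.osf.io) that returns records in JSON:API format. As a rough sketch, the snippet below builds the request URL for a public project ("node" in OSF terms) and pulls the title out of a response body. The node id and the sample response are made up, and the exact response shape is an assumption worth checking against the OSF API documentation:

```python
import json

API_BASE = "https://api.osf.io/v2"

def node_url(node_id):
    """Build the URL for a public OSF project ("node")."""
    return f"{API_BASE}/nodes/{node_id}/"

def project_title(response_text):
    """Pull the project title out of a JSON:API response body
    (assumed shape: data.attributes.title)."""
    return json.loads(response_text)["data"]["attributes"]["title"]

# A made-up node id and a made-up (but JSON:API-shaped) response body:
sample = json.dumps({"data": {"attributes": {"title": "Example Project"}}})
print(node_url("abc12"))      # https://api.osf.io/v2/nodes/abc12/
print(project_title(sample))  # Example Project
```

A real request would fetch `node_url(...)` over HTTPS (public projects need no authentication); the functions above only illustrate how the pieces fit together.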

The University of Alabama does not have institutional access to the OSF platform, but you can create an account using your ORCID or email address for free.