What are Open and F.A.I.R. Data?
Open Data
The value of data to the research process cannot be overstated, and existing data, once shared, hold great potential to accelerate scientific discovery through reuse.
The concept of open data — data that are publicly and freely discoverable, accessible, and reusable (opendefinition.org) — has been steadily gaining momentum in the scientific community: “Open data is like a renewable energy source: it can be reused without diminishing its original value, and reuse creates new value.” (Science et al., 2017).
Recognizing the value of open data within the scientific research landscape, stakeholders ranging from funders to publishers are driving change in the culture of data sharing (Mayernik, 2012; Costas et al., 2013; NSF24-124, 2024; Holdren, 2013). With this shift, many digital data repositories have come online to provide data management services that meet stakeholder needs for data discovery and access.
However, for data to be reused efficiently and effectively, they must also be well-managed and stewarded. Here, domain-specific repositories add great value to their community’s research data by bringing subject matter expertise to the curation process (Lenhert, 2015; ICPSR, 2013).
F.A.I.R. Principles
The four F.A.I.R. Principles (Wilkinson et al., 2016) are a set of values intended to guide data producers and publishers in establishing good data management and stewardship practices. These principles are at the core of BCO-DMO's mission:
Principle | Description | How BCO-DMO upholds this principle |
---|---|---|
Findable | Data are findable for future use by submitters, other researchers, and publishers | Data are published with detailed descriptive metadata and assigned DOIs |
Accessible | Data and metadata are open and understandable/readable to both human users and machines | Data and metadata are made publicly available through the BCO-DMO website |
Interoperable | Data are compatible with various applications or workflows for analysis, storage, and processing | BCO-DMO ensures data are compliant with our systems and downstream data systems, and are suitable for future analysis |
Reusable | Optimize the ability for data to be reused | Data in BCO-DMO are licensed under Creative Commons Attribution 4.0 by default; data processing details are captured in related metadata, and we help ensure data meet domain-relevant community standards |
Must-reads on the F.A.I.R. principles
Wilkinson et al. (2016) is the foundational paper on this topic, while Stall et al. (2018) highlights activities aimed at promoting the F.A.I.R. principles within the Earth and space science communities.
- Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018. https://doi.org/10.1038/sdata.2016.18
- Stall, S., et al. (2018), Advancing FAIR data in Earth, space, and environmental science, Eos, 99, https://doi.org/10.1029/2018EO109301. Published 5 November 2018.
- FAIR Principles. GO FAIR. (2022, January 21). https://www.go-fair.org/fair-principles/. Published on 02 January 2022
BCO-DMO's Data Stewardship
BCO-DMO aligns its data stewardship philosophy with the F.A.I.R. Data Principles of Wilkinson et al. (2016). These principles outline tangible practices that data providers can employ to promote easier sharing and ultimate reuse of data by both machines and humans, thereby making the data “Findable, Accessible, Interoperable, and Reusable”. In addition, the project treats the components of the Data Stewardship Maturity Matrix (Peng et al., 2015), namely preservability, accessibility/usability, data quality transparency/traceability, and data integrity, as trustworthy guidelines for assessing BCO-DMO’s data stewardship practices.
An important tenet of the project is the value of people: both the highly trained data managers and the investigators with whom they work closely to ensure data are complete, quality-checked, well-organized and described, and publicly accessible. Staff use their domain knowledge to assist investigators with data management plans, assess submitted data for completeness, perform gross quality control and reformatting, display data in the most appropriate manner for each data type, and assemble the robust metadata necessary to discover, understand, and reuse the data.