Misplaced Pages

FAIR data

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

FAIR data is data which meets the FAIR principles of findability, accessibility, interoperability, and reusability (FAIR). The acronym and principles were defined in a March 2016 paper in the journal Scientific Data by a consortium of scientists and organizations.
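As a rough illustration of what these principles can ask of a metadata record, the sketch below checks a record for a few properties named in this article (F1, F3, R1.1, R1.2). The field names (`identifier`, `data_identifier`, `license`, `provenance`), the function name, and the example identifiers are all hypothetical; the FAIR principles do not prescribe a schema, and real assessments are considerably more involved.

```python
# Minimal sketch, not an official FAIR validator.
# All field names below are assumptions made for illustration only.
REQUIRED_FIELDS = {
    "identifier": "F1: globally unique and persistent identifier",
    "data_identifier": "F3: metadata include the identifier of the data they describe",
    "license": "R1.1: clear and accessible data usage license",
    "provenance": "R1.2: detailed provenance",
}


def missing_fair_fields(metadata: dict) -> list[str]:
    """Return the hypothetical fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not metadata.get(name)]


record = {
    "identifier": "doi:10.1234/example-metadata",   # made-up identifier
    "data_identifier": "doi:10.1234/example-data",  # made-up identifier
    "license": "CC-BY-4.0",
    "provenance": "",  # left empty on purpose to show a failing check
}

for field in missing_fair_fields(record):
    # Report which illustrative check failed and the principle it maps to.
    print(f"missing {field} ({REQUIRED_FIELDS[field]})")
```

A check like this only tests for the presence of fields; whether an identifier is actually persistent, or a license actually clear, still needs human or registry-level verification.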


The FAIR principles emphasize machine-actionability (i.e., the capacity of computational systems to find, access, interoperate, and reuse data with none or minimal human intervention) because humans increasingly rely on computational support to deal with data as a result of the increase in the volume, complexity, and rate of production of data.

The abbreviation FAIR/O data is sometimes used to indicate that the dataset or database in question complies with the FAIR principles and also carries an explicit data-capable open license.

Findable

The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.

F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource

Accessible

Once the user finds the required data, they need to know how they can be accessed, possibly including authentication and authorisation.

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1. The protocol is open, free, and universally implementable
A1.2. The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available

Interoperable

The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data

Reusable

The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards

The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).

Before FAIR, a 2007 paper was the earliest to discuss similar ideas related to data accessibility.

At the 2016 G20 Hangzhou summit, the G20 leaders issued a statement endorsing the application of FAIR principles to research. Also in 2016, a group of Australian organisations developed a Statement on FAIR Access to Australia's Research Outputs, which aimed to extend the principles to research outputs more generally. In 2017, Germany, the Netherlands and France agreed to establish an international office to support the FAIR initiative, the GO FAIR International Support and Coordination Office.

Other international organisations active in the research data ecosystem, such as CODATA or the Research Data Alliance (RDA), also support FAIR implementations by their communities. FAIR principles implementation assessment is being explored by the FAIR Data Maturity Model Working Group of RDA. CODATA's strategic Decadal Programme "Data for Planet: Making data work for cross-domain challenges" mentions FAIR data principles as a fundamental enabler of data-driven science. The Association of European Research Libraries recommends the use of FAIR principles.

A 2017 paper by advocates of FAIR data reported that awareness of the FAIR concept was increasing among various researchers and institutes, but also that understanding of the concept was becoming confused as different people applied their own differing perspectives to it.

Guides on implementing FAIR data practices state that the cost of a data management plan in compliance with FAIR data practices should be 5% of the total research budget.

In 2019 the Global Indigenous Data Alliance (GIDA) released the CARE Principles for Indigenous Data Governance as a complementary guide. The CARE principles extend principles outlined in FAIR data to include Collective benefit, Authority to control, Responsibility, and Ethics to ensure data guidelines address historical contexts and power differentials. The CARE Principles for Indigenous Data Governance were drafted at the International Data Week and Research Data Alliance Plenary co-hosted event, "Indigenous Data Sovereignty Principles for the Governance of Indigenous Data Workshop", held 8 November 2018, in Gaborone, Botswana.

The lack of information on how to implement the guidelines has led to inconsistent interpretations of them.

In January 2020, representatives of nine groups of universities around the world produced the Sorbonne declaration on research data rights, which included a commitment to FAIR data, and called on governments to provide support to enable it.

In 2021, researchers identified the FAIR principles as a conceptual component of data catalog software tools, with the other components being metadata management, business context and data responsibility roles. In April 2022, Matthias Scheffler and colleagues argued in Nature that FAIR principles are "a must" so that data mining and artificial intelligence can extract useful scientific information from the data. However, making data (and research outcomes) FAIR is a challenging task, and it is challenging to assess the FAIRness.

Open license

A free license or open license is a license that allows copyrighted work to be reused, modified, and redistributed. These uses are normally prohibited by copyright, patent or other intellectual property (IP) laws. The term broadly covers free content licenses and open-source licenses, also known as free software licenses.

The invention of the term "free license" and the focus on the rights of users were connected to the sharing traditions of the hacker culture of the 1970s public domain software ecosystem, the social and political free software movement (since 1980) and the open source movement (since the 1990s). These rights were codified by different groups and organizations for different domains in the Free Software Definition, Open Source Definition, Debian Free Software Guidelines, Definition of Free Cultural Works and The Open Definition. These definitions were then transformed into licenses, using copyright as the legal mechanism. Ideas of free/open licenses have since spread into different spheres of society. Open source, free culture (unified as the free and open-source movement), anticopyright, Wikimedia Foundation projects, public domain advocacy groups and pirate parties are connected with free and open licenses.

Free software licenses, also known as open-source licenses, are software licenses that allow content to be used, modified, and shared. They facilitate free and open-source software (FOSS) development. Intellectual property (IP) laws restrict the modification and sharing of creative works. Free and open-source licenses use these existing legal structures for an inverse purpose. They grant the recipient the rights to use the software, examine the source code, modify it, and distribute the modifications. These criteria are outlined in the Open Source Definition and The Free Software Definition.

After 1980, the United States began to treat software as a literary work covered by copyright law. Richard Stallman founded the free software movement in response to the rise of proprietary software. The term "open source" was used by the Open Source Initiative (OSI), founded by free software developers Bruce Perens and Eric S. Raymond. "Open source" is an alternative label that emphasizes the strengths of the open development model rather than software freedoms. While the goals behind the terms are different, open-source licenses and free software licenses describe the same type of licenses.

The two main categories of free and open-source licenses are permissive and copyleft. Both grant permission to change and distribute software. Typically, they require attribution and disclaim liability. Permissive licenses come from academia. Copyleft licenses come from the free software movement. Copyleft licenses require derivative works to be distributed with the source code and under a similar license. Since the mid-2000s, courts in multiple countries have upheld the terms of both types of license. Software developers have filed cases as copyright infringement and as breaches of contract.

According to the current definition of open content on the OpenContent website, any general, royalty-free copyright license would qualify as an open license because it 'provides users with the right to make more kinds of uses than those normally permitted under the law. These permissions are granted to users free of charge.' However, the narrower definition used in the Open Definition effectively limits open content to libre content. Any free content license, defined by the Definition of Free Cultural Works, would qualify as an open content license.

Data management plan

A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this may lead to data being well-managed in the present, and prepared for preservation in the future.

DMPs were originally used in 1966 to manage aeronautical and engineering projects' data collection and analysis, and expanded across engineering and scientific disciplines in the 1970s and 1980s. Up until the early 2000s, DMPs were used "for projects of great technical complexity, and for limited mid-study data collection and processing purposes". In the 2000s and later, E-research and economic policies drove the development and uptake of DMPs.

Preparing a data management plan before data are collected is claimed to ensure that data are in the correct format, organized well, and better annotated. This could arguably save time in the long term because there is no need to re-organize, re-format, or try to remember details about data. It is also claimed to increase research efficiency since both the data collector and other researchers might be able to understand and use well-annotated data in the future.

One component of a data management plan is data archiving and preservation. By deciding on an archive ahead of time, the data collector can format data during collection to make its future submission to a database easier. If data are preserved, they are more relevant since they can be re-used by other researchers. It also allows the data collector to direct requests for data to the database, rather than address requests individually. A frequent argument in favor of preservation is that data that are preserved have the potential to lead to new, unanticipated discoveries, and they prevent duplication of scientific studies that have already been conducted. Data archiving also provides insurance against loss by the data collector.

In the 2010s, funding agencies increasingly required data management plans as part of the proposal and evaluation process, despite little or no evidence of their efficacy. "There is no general and definitive list of topics that should be covered in a DMP for a research project", and researchers are often left to their own devices as to how to fill out a DMP.

Metadata are the contextual details, including any information important for using data. This may include descriptions of temporal and spatial details, instruments, parameters, units, files, etc. Metadata is commonly referred to as "data about data". Issues to be considered include:

Data management and preservation costs may be considerable, depending on the nature of the project. By anticipating costs ahead of time, researchers ensure that the data will be properly managed and archived. Potential expenses that should be considered are

The data management plan should include how these costs will be paid.

All grant proposals submitted to the National Science Foundation (NSF) must include a Data Management Plan that is no more than two pages. This is a supplement (not part of the 15-page proposal) and should describe how the proposal will conform to the Award and Administration Guide policy (see below). It may include the following:

Policy summarized from the NSF Award and Administration Guide, Section 4 (Dissemination and Sharing of Research Results):

Since 1995, the UK's Economic and Social Research Council (ESRC) has had a research data policy in place. The current ESRC Research Data Policy states that research data created as a result of ESRC-funded research should be openly available to the scientific community to the maximum extent possible, through long-term preservation and high-quality data management. ESRC requires a data management plan for all research award applications where new data are being created. Such plans are designed to promote a structured approach to data management throughout the data lifecycle, resulting in better quality data that is ready to archive for sharing and re-use.

The UK Data Service, the ESRC's flagship data service, provides practical guidance on research data management planning suitable for social science researchers in the UK and around the world. ESRC has a longstanding arrangement with the UK Data Archive, based at the University of Essex, as a place of deposit for research data, with award holders required to offer data resulting from their research grants via the UK Data Service. The Archive enables data re-use by preserving data and making them available to the research and teaching communities.

There are three major themes identified in the literature in terms of benefits of DMPs: professional benefits, economic benefits and institutional benefits. It has been argued that DMPs can form a catalyst for researchers to improve their data literacy and data management practices, often aided by the library.

In practice, however, DMPs often fall short of their stated goals. A 2012 review of DMP policies by research funders found that policies were missing several elements from the Digital Curation Centre's list of criteria for a DMP. Researchers shared DMP text. DMPs are often regarded as an "administrative exercise rather than an integral part" of the research process, and it has been acknowledged that DMPs do not guarantee good data management practices. Most funders do not require
