Data Infrastructure Building Blocks (DIBBs) is a U.S. National Science Foundation program.
20-568: On April 27, 2012, the U.S. National Science Foundation Office of Cyberinfrastructure announced a request for proposals titled "Data Infrastructure Building Blocks (DIBBs)". The solicitation (NSF 12-557) "incorporated some but not all of the goals of the former DataNet and InterOp programs." DIBBs is part of NSF's vision for a Cyberinfrastructure Framework for 21st Century Science (CIF21). The introduction in this solicitation states: NSF's Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21) investment focuses on
40-594: A $100 million initiative: five awards of $20 million each over five years, with the possibility of continuing funding. Awards were given in two rounds. In the first round, for which full proposals were due on March 21, 2008, two DataNet proposals were awarded. DataONE, led by William Michener at the University of New Mexico, covers ecology, evolution, and earth science. The Data Conservancy, led by Sayeed Choudhury of Johns Hopkins University, focuses on astronomy, earth science, life sciences, and social science. For
60-413: A set of exemplar national and global data research infrastructure organizations (dubbed DataNet Partners) that provide unique opportunities to communities of researchers to advance science and/or engineering research and learning. The introduction in the solicitation goes on to say: Chapter 3 (Data, Data Analysis, and Visualization) of NSF’s Cyberinfrastructure Vision for 21st Century Discovery presents
80-457: A vision in which “science and engineering digital data are routinely deposited in well-documented form, are regularly and easily consulted and analyzed by specialists and non-specialists alike, are openly accessible while suitably protected, and are reliably preserved.” The goal of this solicitation is to catalyze the development of a system of science and engineering data collections that is open, extensible and evolvable. The initial plan called for
100-506: Is composed of Coordinating Nodes located at the Oak Ridge Campus in Tennessee, the University of California, Santa Barbara, and the University of New Mexico, and member nodes. DataONE also provides resources, including tools for accessing and using it. The three coordinating nodes provide network-wide services to member nodes. They are geographically replicated, with mirrored content and full copies of science metadata. William Michener of
120-865: Is the DMPTool for data management planning. The DMPTool is used and referenced by many research data management plans and institutions in the US and around the world. Another recent collaboration in this area is the shared construction of a Data Management Training Clearinghouse for Earth sciences, in partnership with USGS and the Community for Data Integration (CDI). The DataONE community includes research networks, professional societies, libraries, academic institutions, data centers, data repositories, environmental observatory networks, educators, scientists, policy makers, administrators, citizen scientists, international organizations, NGOs, ecosystem managers, students, private companies and
140-457: Is to preserve and provide access to multi-scale, multi-discipline, and multi-national data. Users include scientists, ecosystem managers, policy makers, students, educators, librarians, and the public. DataONE links together existing cyberinfrastructure to provide a distributed framework, management, and technologies that enable long-term preservation of multi-scale, multi-discipline, and multi-national observational data. The distributed framework
160-902: The University of New Mexico (UNM) directed the project, and UNM is one of the coordinating nodes. Coordinating nodes are UNM, Oak Ridge Campus (a partnership of Oak Ridge National Laboratory (ORNL) and the University of Tennessee), and the University of California, Santa Barbara. Member nodes consist of Earth observing institutions, projects, and networks. They provide resources for their own data and replicated data, and focus on serving their specific constituencies. These member nodes are geographically distributed and include: The Tool Kit provides tools for researchers to access DataONE. These are both general-purpose and discipline-specific tools, and developers adapt existing tools where possible. The tool kit includes Java and Python libraries, an R programming language plug-in for analysis, extensions for Excel,
180-833: The VisTrails scientific workflow, and the Kepler scientific workflow system. DataONE provides a place for scientists to store data and its associated metadata. The metadata makes this data searchable and accessible to other scientists. Data management practices include Some of the additional data management planning resources include: a primer for best practices, a database for best practices in data management, educational modules and tutorials, webinars, and an investigator toolkit. These have been used or adapted for use under Creative Commons license by organizations and institutions that seek to educate other communities about data and research management. Understanding different audiences of users led to
200-434: The "long tail" of small- and medium-scale data producers in the domain of sustainability science. The DataNet Federation Consortium, led by Reagan Moore of the University of North Carolina, uses the integrated Rule-Oriented Data System (iRODS) to provide data grid infrastructure for science and engineering. Terra Populus, led by Steven Ruggles of the University of Minnesota, focuses on tools for data integration across
220-697: The DIBBs awards into three areas: Conceptualization, Implementation, and Interoperability. These three tracks were distinguished as follows: . . . planning awards aimed at further defining disciplinary and interdisciplinary communities' data storage and management requirements. . . . will support development and implementation of technologies related to the data preservation and access lifecycle, including acquisition; documentation; security and integrity; storage; access, analysis and dissemination; migration; and deaccession. Implementation awards must also address how they will relate to and support other CIF21 components essential to
240-535: The US National Science Foundation as one of the initial DataNet programs in 2009, funding was renewed in 2014 through 2020 with an additional $15 million. DataONE supports the preservation, access, use, and reuse of multi-discipline scientific data through the construction of primary cyberinfrastructure and an education and outreach program. DataONE provides scientific data archiving for ecological and environmental data produced by scientists. DataONE's goal
260-466: The development of possible user personas as models for users such as early-career researchers, science data librarians, citizen scientists or K-12 educators. DataONE collaborates with other institutions to bring together tools that help with data management practices. One of those tools, developed in collaboration with other organizations and hosted by the University of California Digital Curation Center,
280-549: The domains of social science and environmental data, allowing interoperability of the three major data formats used in these domains: microdata, areal data, and raster data. Some of its goals were later incorporated in the Data Infrastructure Building Blocks program.

DataONE

DataONE is a network of interoperable data repositories facilitating data sharing, data discovery, and open science. Originally supported by $21.2 million in funding from
300-445: The given community . . . support community efforts to provide broad interoperability of datasets, enhancing interaction and information sharing to benefit all areas of NSF-funded science, engineering and education. The anticipated funding amount for this solicitation was listed at $41,500,000, pending availability of funds. The anticipated average award size for conceptualization awards was $100,000 for one year; for implementation awards
320-936: The interconnected cyberinfrastructure components necessary to realize the research potential of theoretical, experimental, observational and simulation-based research efforts. The [DIBBs] Program Description describes the goals of the program as follows: . . . to support the development or expansion of new types of digital data storage, preservation, and access that: (1) enable engagement at the frontiers of science and engineering research and education; (2) work cooperatively and in coordination to overcome conventional barriers due to data type and format, discipline or subject area, and time and place to facilitate sharing of data; (3) combine expertise in cyberinfrastructure; library and archival sciences; computer, computational, and information sciences; and various domain sciences; (4) lead to long-term governance models for economic and technological sustainability over multiple decades. The solicitation divided
340-411: The output of research but provide input to new hypotheses, enabling new scientific insights and driving innovation. Therein lies one of the major challenges of this scientific generation: how to develop the new methods, management structures and technologies to manage the diversity, size, and complexity of current and future data sets and data streams. This solicitation addresses that challenge by creating
360-523: The second round, preliminary proposals were due on October 6, 2008, and full proposals on February 16, 2009. Awards from the second round were greatly delayed, and funding was reduced substantially from $20 million per project to $8 million. Funding for three second-round projects began in fall 2011. SEAD: Sustainable Environment through Actionable Data, led by Margaret Hedstrom of the University of Michigan, seeks to provide data curation software and services for
380-647: Was approximately $8 million total over 5 years; and for interoperability awards was estimated to be up to $1.5 million total over 3 years. Awards were given in two rounds. In the first round, which dealt only with the Conceptualization track and for which full proposals were due on July 26, 2012, three DIBBs proposals were awarded: The second round of awards covered the Implementation and Interoperability tracks, for which full proposals were due on August 30, 2012. Four more proposals were awarded: A total of about $26.8 million
400-522: Was distributed among these seven awards.

DataNet

DataNet, or Sustainable Digital Data Preservation and Access Network Partner, was a research program of the U.S. National Science Foundation Office of Cyberinfrastructure. The office announced a request for proposals with this title on September 28, 2007. The lead paragraph of its synopsis describes the program as: Science and engineering research and education are increasingly digital and increasingly data-intensive. Digital data are not only