Before data.europa.eu, the EU Open Data Portal was the point of access to public data published by the EU institutions, agencies and other bodies. On April 21, 2021 it was consolidated into the data.europa.eu portal, together with the European Data Portal, a similar initiative aimed at the EU Member States.
Public data can be used and reused for commercial or non‑commercial purposes. The portal was a key instrument of the EU open data strategy. Easy and free access to data enhances their innovative use and economic potential. The goal of the portal was also to make the institutions and other EU bodies more transparent and accountable. Launched in December 2012, the portal was formally established by Commission Decision of 12 December 2011 (2011/833/EU) on
148-488: A mass noun in singular form. This usage is common in everyday language and in technical and scientific fields such as software development and computer science . One example of this usage is the term " big data ". When used more specifically to refer to the processing and analysis of sets of data, the term retains its plural form. This usage is common in the natural sciences, life sciences, social sciences, software development and computer science, and grew in popularity in
A SPARQL endpoint. Its metadata catalogue applies international standards such as Dublin Core, the data catalogue vocabulary DCAT-AP and the Asset Description Metadata Schema (ADMS). To promote linked open data, the portal makes extensive use of controlled vocabularies, such as EuroVoc.

Open data

Open data is data that is openly accessible, exploitable, editable and shareable by anyone for any purpose. Open data
296-436: A basis for calculation, reasoning, or discussion. Data can range from abstract ideas to concrete measurements, including, but not limited to, statistics . Thematically connected data presented in some relevant context can be viewed as information . Contextually connected pieces of information can then be described as data insights or intelligence . The stock of insights and intelligence that accumulate over time resulting from
370-584: A climber's guidebook containing practical information on the best way to reach Mount Everest's peak may be considered "knowledge". "Information" bears a diversity of meanings that range from everyday usage to technical use. This view, however, has also been argued to reverse how data emerges from information, and information from knowledge. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern , perception, and representation. Beynon-Davies uses
444-468: A collaborative project in the municipal Government to create and organize culture for Open Data or Open government data. Additionally, other levels of government have established open data websites. There are many government entities pursuing Open Data in Canada . Data.gov lists the sites of a total of 40 US states and 46 US cities and counties with websites to provide open data, e.g., the state of Maryland ,
518-404: A common view, data is collected and analyzed; data only becomes information suitable for making decisions once it has been analyzed in some fashion. One can say that the extent to which a set of data is informative to someone depends on the extent to which it is unexpected by that person. The amount of information contained in a data stream may be characterized by its Shannon entropy . Knowledge
592-411: A data commons strategy that better enables open data in businesses and research organizations. Such a strategy should address the need for: Beyond individual businesses and research centers, and at a more macro level, countries like Germany have launched their own official nationwide open data strategies, detailing how data management systems and data commons should be developed, used, and maintained for
666-659: A description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books. Whenever data needs to be registered, data exists in the form of a data document . Kinds of data documents include: Some of these data documents (data repositories, data studies, data sets, and software) are indexed in Data Citation Indexes , while data papers are indexed in traditional bibliographic databases, e.g., Science Citation Index . Gathering data can be accomplished through
740-553: A few decades. Scientific publishers and libraries have been struggling with this problem for a few decades, and there is still no satisfactory solution for the long-term storage of data over centuries or even for eternity. Data accessibility . Another problem is that much scientific data is never published or deposited in data repositories such as databases . In a recent survey, data was requested from 516 studies that were published between 2 and 22 years earlier, but less than one out of five of these studies were able or willing to provide
A large variety of actors. Both commons and Open Data can be defined by the features of the resources that fit under these concepts, but they can also be defined by the characteristics of the systems their advocates push for. Governance is a focus for both Open Data and commons scholars. The key elements that outline commons and Open Data peculiarities are the differences (and maybe opposition) to
A minimal chain of events necessary for open data to lead to accountability: Some make the case that opening up official information can support technological innovation and economic growth by enabling third parties to develop new kinds of digital applications and services. Several national governments have created websites to distribute a portion of the data they collect. It is a concept for
962-697: A new level of public scrutiny." Governments that enable public viewing of data can help citizens engage within the governmental sectors and "add value to that data." Open data experts have nuanced the impact that opening government data may have on government transparency and accountability. In a widely cited paper, scholars David Robinson and Harlan Yu contend that governments may project a veneer of transparency by publishing machine-readable data that does not actually make government more transparent or accountable. Drawing from earlier studies on transparency and anticorruption, World Bank political scientist Tiago C. Peixoto extended Yu and Robinson's argument by highlighting
1036-470: A primary source (the researcher is the first person to obtain the data) or a secondary source (the researcher obtains the data that has already been collected by other sources, such as data disseminated in a scientific journal). Data analysis methodologies vary and include data triangulation and data percolation. The latter offers an articulate method of collecting, classifying, and analyzing data using five possible angles of analysis (at least three) to maximize
1110-408: A range of different arguments for government open data. Some advocates say that making government information available to the public as machine readable open data can facilitate government transparency, accountability and public participation. "Open data can be a powerful force for public accountability—it can make existing information easier to analyze, process, and combine than ever before, allowing
1184-561: A rise in intellectual property rights. The philosophy behind open data has been long established (for example in the Mertonian tradition of science ), but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives Data.gov , Data.gov.uk and Data.gov.in . Open data can be linked data - referred to as linked open data . One of
1258-619: A small level, a business or research organization's policies and strategies towards open data will vary, sometimes greatly. One common strategy employed is the use of a data commons. A data commons is an interoperable software and hardware platform that aggregates (or collocates) data, data infrastructure, and data-producing and data-managing applications in order to better allow a community of users to manage, analyze, and share their data with others over both short- and long-term timelines. Ideally, this interoperable cyberinfrastructure should be robust enough "to facilitate transitions between stages in
1332-463: A total of over 13,000. The portal also contained a gallery of applications and a visualisations catalogue (launched in March 2018). In the apps gallery users could find applications using EU data and developed by the EU institutions, agencies or other bodies or by third parties. The applications were displayed as much for their information value as for giving examples of what applications can be made using
1406-421: A way that is accessible to everyone, regardless of age, disability, or gender. The paper also discusses the challenges of using open data for soft mobility optimization. One challenge is that open data is often incomplete or inaccurate. Another challenge is that it can be difficult to integrate open data from different sources. Despite these challenges, the paper argues that open data is a valuable tool for improving
1480-579: A website offering open data of elections. CIAT offers open data to anybody who is willing to conduct big data analytics in order to enhance the benefit of international agricultural research. DBLP , which is owned by a non-profit organization Dagstuhl , offers its database of scientific publications from computer science as open data. Hospitality exchange services , including Bewelcome, Warm Showers , and CouchSurfing (before it became for-profit) have offered scientists access to their anonymized data for analysis, public research, and publication. At
1554-501: Is a valuable tool for improving the sustainability and equity of soft mobility in cities. The author argues that open data can be used to identify the needs of different areas of a city, develop algorithms that are fair and equitable, and justify the installation of soft mobility resources. The goals of the Open Data movement are similar to those of other "Open" movements. Formally both the definition of Open Data and commons revolve around
Is an individual value in a collection of data. Data are usually organized into structures such as tables that provide additional context and meaning, and may themselves be used as data in larger structures. Data may be used as variables in a computational process. Data may represent abstract ideas or concrete measurements. Data are commonly used in scientific research, economics, and virtually every other form of human organizational activity. Examples of data sets include price indices (such as
1702-663: Is called the Open Data Management Cycle and was adopted in several regions such as Veneto and Umbria . Main cities like Reggio Calabria and Genova have also adopted this model. In October 2015, the Open Government Partnership launched the International Open Data Charter , a set of principles and best practices for the release of governmental open data formally adopted by seventeen governments of countries, states and cities during
1776-414: Is licensed under an open license . The goals of the open data movement are similar to those of other "open(-source)" movements such as open-source software, open-source hardware , open content , open specifications , open education , open educational resources , open government , open knowledge , open access , open science , and the open web. The growth of the open data movement is paralleled by
1850-626: Is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and/or share-alike." Other definitions, including the Open Data Institute 's "open data is data that anyone can access, use or share," have an accessible short version of the definition but refer to the formal definition. Open data may include non-textual material such as maps , genomes , connectomes , chemical compounds , mathematical and scientific formulae, medical data, and practice, bioscience and biodiversity. A major barrier to
1924-405: Is the awareness of its environment that some entity possesses, whereas data merely communicates that knowledge. For example, the entry in a database specifying the height of Mount Everest is a datum that communicates a precisely-measured value. This measurement may be included in a book along with other data on Mount Everest to describe the mountain in a manner useful for those who wish to decide on
Is the lack of barriers to the re-use of data(sets). Regardless of their origin, principles across types of Open Data hint at the key elements of the definition of commons. These are, for instance, accessibility, re-use, findability, and non-proprietary status. Additionally, although to a lesser extent, threats and opportunities associated with both Open Data and commons are similar. In short, they revolve around (risks and) benefits associated with (uncontrolled) use of common resources by
2072-443: Is the longevity of data. Scientific research generates huge amounts of data, especially in genomics and astronomy , but also in the medical sciences , e.g. in medical imaging . In the past, scientific data has been published in papers and books, stored in libraries, but more recently practically all data is stored on hard drives or optical discs . However, in contrast to paper, these storage devices may become unreadable after
2146-404: Is the plural of datum , "(thing) given," and the neuter past participle of dare , "to give". The first English use of the word "data" is from the 1640s. The word "data" was first used to mean "transmissible and storable computer information" in 1946. The expression "data processing" was first used in 1954. When "data" is used more generally as a synonym for "information", it is treated as
2220-526: The consumer price index ), unemployment rates , literacy rates, and census data. In this context, data represent the raw facts and figures from which useful information can be extracted. Data are collected using techniques such as measurement , observation , query , or analysis , and are typically represented as numbers or characters that may be further processed . Field data are data that are collected in an uncontrolled, in-situ environment. Experimental data are data that are generated in
2294-443: The 20th and 21st centuries. Some style guides do not recognize the different meanings of the term and simply recommend the form that best suits the target audience of the guide. For example, APA style as of the 7th edition requires "data" to be treated as a plural form. Data, information , knowledge , and wisdom are closely related concepts, but each has its role concerning the other, and each term has its meaning. According to
EU Open Data Portal - Misplaced Pages Continue
The EU institutions, agencies and other bodies and the European Data Portal that provides datasets from local, regional and national public bodies across Europe. The two portals were consolidated into data.europa.eu on April 21, 2021. Italy is the first country to release standard processes and guidelines under a Creative Commons license for widespread use in the Public Administration. The open model
The EU) should mandate that funded projects hand in their databases as "deliverables" at the end of the project so that they can be checked for third-party usability and then shared.

Data

In common usage and statistics, data (/ˈdeɪtə/, also US: /ˈdætə/) is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum
2516-455: The Internet, the availability of fast, readily available networking has significantly changed the context of Open science data , as publishing or obtaining data has become much less expensive and time-consuming. The Human Genome Project was a major initiative that exemplified the power of open data. It was built upon the so-called Bermuda Principles , stipulating that: "All human genomic sequence information … should be freely available and in
The OGP Global Summit in Mexico. In July 2024, the OECD adopted Creative Commons CC-BY-4.0 licensing for its published data and reports. Many non-profit organizations offer open access to their data, as long as it does not undermine their users', members' or third parties' privacy rights. In comparison to for-profit corporations, they do not seek to monetize their data. OpenNWT launched
2664-433: The best method to climb it. Awareness of the characteristics represented by this data is knowledge. Data are often assumed to be the least abstract concept, information the next least, and knowledge the most abstract. In this view, data becomes information by interpretation; e.g., the height of Mount Everest is generally considered "data", a book on Mount Everest geological characteristics may be considered "information", and
2738-434: The binary alphabet. Some special forms of data are distinguished. A computer program is a collection of data, that can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata , that is,
2812-617: The concept of a sign to differentiate between data and information; data is a series of symbols, while information occurs when the symbols are used to refer to something. Before the development of computing devices and machines, people had to manually collect data and impose patterns on it. With the development of computing devices and machines, these devices can also collect data. In the 2010s, computers were widely used in many fields to collect data and sort or process it, in disciplines ranging from marketing , analysis of social service usage by citizens to scientific research. These patterns in
2886-471: The concept of shared resources with a low barrier to access. Substantially, digital commons include Open Data in that it includes resources maintained online, such as data. Overall, looking at operational principles of Open Data one could see the overlap between Open Data and (digital) commons in practice. Principles of Open Data are sometimes distinct depending on the type of data under scrutiny. Nonetheless, they are somewhat overlapping and their key rationale
2960-444: The course of a controlled scientific experiment. Data are analyzed using techniques such as calculation , reasoning , discussion, presentation , visualization , or other forms of post-analysis. Prior to analysis, raw data (or unprocessed data) is typically cleaned: Outliers are removed, and obvious instrument or data entry errors are corrected. Data can be seen as the smallest units of factual information that can be used as
3034-408: The data are seen as information that can be used to enhance knowledge. These patterns may be interpreted as " truth " (though "truth" can be a subjective concept) and may be authorized as aesthetic and ethical criteria in some disciplines or cultures. Events that leave behind perceivable physical or virtual remains can be traced back through data. Marks are no longer considered data once the link between
3108-538: The data. The visualisations catalogue offered a collection of visualisation tools, training and re-usable visualisations for all levels of data visualisation expertise, from beginner to expert. The portal was built using open source solutions such as the Drupal content management system and CKAN , the data catalogue software developed by the Open Knowledge Foundation. It used Virtuoso as an RDF database and has
3182-515: The dominant market logics as shaped by capitalism. Perhaps it is this feature that emerges in the recent surge of the concept of commons as related to a more social look at digital technologies in the specific forms of digital and, especially, data commons. Application of open data for societal good has been demonstrated in academic research works. The paper "Optimization of Soft Mobility Localization with Sustainable Policies and Open Data" uses open data in two ways. First, it uses open data to identify
3256-601: The economy, employment, science, environment and education. The importance of these was confirmed by the G8 Open Data Charter. At the time it was merged into data.europa.eu, around 70 EU institutions, bodies or departments (e.g. Eurostat, the European Environment Agency, the Joint Research Centre and other European Commission Directorates General and EU Agencies) had made datasets available, making
3330-571: The ethos of data as "given". Peter Checkland introduced the term capta (from the Latin capere , "to take") to distinguish between an immense number of possible data and a sub-set of them, to which attention is oriented. Johanna Drucker has argued that since the humanities affirm knowledge production as "situated, partial, and constitutive," using data may introduce assumptions that are counterproductive, for example that phenomena are discrete or are observer-independent. The term capta , which emphasizes
3404-424: The following: It is generally held that factual data cannot be copyrighted. Publishers frequently add copyright statements (often forbidding re-use) to scientific data accompanying publications. It may be unclear whether the factual data embedded in full text are part of the copyright. While the human abstraction of facts from paper publications is normally accepted as legal there is often an implied restriction on
3478-442: The greater public good. Opening government data is only a waypoint on the road to improving education, improving government, and building tools to solve other real-world problems. While many arguments have been made categorically , the following discussion of arguments for and against open data highlights that these arguments often depend highly on the type of data and its potential uses. Arguments made on behalf of open data include
The idea of making data into a commons. This project exemplifies the relationship between Open Data and commons, and how they can disrupt the market logic driving big data use in two ways. First, it shows how such projects, by broadly following the rationale of Open Data, can trigger the creation of effective data commons. The project itself offered different types of support to social network platform users to have content removed. Second, opening data regarding online social network interactions has
The life cycle of a collection" of data and information resources while still being driven by common data models and workspace tools enabling and supporting robust data analysis. The policies and strategies underlying a data commons will ideally involve numerous stakeholders, including the data commons service provider, data contributors, and data users. Grossman et al. suggest six major considerations for
3700-499: The machine extraction by robots. Unlike open access , where groups of publishers have stated their concerns, open data is normally challenged by individual institutions. Their arguments have been discussed less in public discourse and there are fewer quotes to rely on at this time. Arguments against making all data available as open data include the following: The paper entitled "Optimization of Soft Mobility Localization with Sustainable Policies and Open Data" argues that open data
3774-538: The mark and observation is broken. Mechanical computing devices are classified according to how they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a piece of data as a sequence of symbols drawn from a fixed alphabet . The most common digital computers use a binary alphabet, that is, an alphabet of two characters typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from
The most important forms of open data is open government data (OGD), which is a form of open data created by ruling government institutions. Open government data's importance is born from it being a part of citizens' everyday lives, down to the most routine/mundane tasks that are seemingly far removed from government. The abbreviation FAIR/O data is sometimes used to indicate that the dataset or database in question complies with
3922-418: The need to state the conditions of ownership, licensing and re-use; instead presuming that not asserting copyright enters the data into the public domain . For example, many scientists do not consider the data published with their work to be theirs to control and consider the act of publication in a journal to be an implicit release of data into the commons . The lack of a license makes it difficult to determine
3996-450: The needs of different areas of a city. For example, it might use data on population density, traffic congestion, and air quality to determine where soft mobility resources, such as bike racks and charging stations for electric vehicles, are most needed. Second, it uses open data to develop algorithms that are fair and equitable. For example, it might use data on the demographics of a city to ensure that soft mobility resources are distributed in
4070-435: The open data movement is the commercial value of data. Access to, or re-use of, data is often controlled by public or private organizations. Control may be through access restrictions, licenses , copyright , patents and charges for access or re-use. Advocates of open data argue that these restrictions detract from the common good and that data should be available without restrictions or fees. Creators of data do not consider
4144-522: The petabyte scale. Using traditional data analysis methods and computing, working with such large (and growing) datasets is difficult, even impossible. (Theoretically speaking, infinite data would yield infinite information, which would render extracting insights or intelligence impossible.) In response, the relatively new field of data science uses machine learning (and other artificial intelligence (AI)) methods that allow for efficient applications of analytic methods to big data. The Latin word data
4218-608: The potential to significantly reduce the monopolistic power of social network platforms on those data. Several funding bodies that mandate Open Access also mandate Open Data. A good expression of requirements (truncated in places) is given by the Canadian Institutes of Health Research (CIHR): Other bodies promoting the deposition of data and full text include the Wellcome Trust . An academic paper published in 2013 advocated that Horizon 2020 (the science funding mechanism of
4292-485: The principles of FAIR data and carries an explicit data‑capable open license . The concept of open data is not new, but a formalized definition is relatively new. Open data as a phenomenon denotes that governmental data should be available to anyone with a possibility of redistribution in any form without any copyright restriction. One more definition is the Open Definition which can be summarized as "a piece of data
The problem of reproducibility is the attempt to require FAIR data, that is, data that is Findable, Accessible, Interoperable, and Reusable. Data that fulfills these requirements can be used in subsequent research and thus advances science and technology. Although data is also increasingly used in other fields, it has been suggested that their highly interpretive nature might be at odds with
4440-598: The protection of data privacy and intellectual property, applied to a small amount of data. A link to these conditions could be found for each dataset. The terms of use could be found on the site. As of November 2020, most data was covered by the Creative Commons CC‑BY‑4.0 license and the site metadata by the Creative Commons CC0‑1.0 public domain waiver. The portal contained a very wide variety of high-value open data across EU policy domains, including
4514-636: The public domain in order to encourage research and development and to maximize its benefit to society". More recent initiatives such as the Structural Genomics Consortium have illustrated that the open data approach can be used productively within the context of industrial R&D. In 2004, the Science Ministers of all nations of the Organisation for Economic Co-operation and Development (OECD), which includes most developed countries of
The requested data. Overall, the likelihood of retrieving data dropped by 17% each year after publication. Similarly, a survey of 100 datasets in Dryad found that more than half lacked the details to reproduce the research results from these studies. This shows the dire situation of access to scientific data that is not published or does not have enough details to be reproduced. A solution to
4662-457: The research's objectivity and permit an understanding of the phenomena under investigation as complete as possible: qualitative and quantitative methods, literature reviews (including scholarly articles), interviews with experts, and computer simulation. The data is thereafter "percolated" using a series of pre-determined steps so as to extract the most relevant information. An important field in computer science , technology , and library science
The reuse of Commission documents to promote accessibility and reuse. Based on this decision, all the EU institutions were invited, and still are today, to publish information such as open data and to make it accessible to the public whenever possible. The operational management of the portal was the task of the Publications Office of the European Union. Implementation of EU open data policy
4810-654: The state of California, US and New York City . At the international level, the United Nations has an open data website that publishes statistical data from member states and UN agencies, and the World Bank published a range of statistical data relating to developing countries. The European Commission has created two portals for the European Union : the EU Open Data Portal which gives access to open data from
4884-455: The status of a data set and may restrict the use of data offered in an "Open" spirit. Because of this uncertainty it is possible for public or private organizations to aggregate said data, claim that it is protected by copyright, and then resell it. Open data can come from any source. This section lists some of the fields that publish (or at least discuss publishing) a large amount of open data. The concept of open access to scientific data
The sustainability and equity of soft mobility in cities. An example of how the relationship between Open Data and commons, and their governance, can potentially disrupt the market logic otherwise dominating big data is a project conducted by Human Ecosystem Relazioni in Bologna (Italy). See: https://www.he-r.it/wp-content/uploads/2017/01/HUB-report-impaginato_v1_small.pdf . This project aimed at extrapolating and identifying online social relations surrounding “collaboration” in Bologna. Data
The synthesis of data into information, can then be described as knowledge. Data has been described as "the new oil of the digital economy". Data, as a general concept, refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing. Advances in computing technologies have led to the advent of big data, which usually refers to very large quantities of data, typically at
5106-425: The websites of the various institutions, agencies and other bodies of the EU. Semantic technologies offered additional functionalities. The metadata catalogue could be searched via an interactive search engine and through SPARQL queries. Users could suggest data they think is missing on the portal and give feedback on the quality of data obtainable. The interface was in 24 EU official languages, but most metadata
5180-512: The world, signed a declaration which states that all publicly funded archive data should be made publicly available. Following a request and an intense discussion with data-producing institutions in member states, the OECD published in 2007 the OECD Principles and Guidelines for Access to Research Data from Public Funding as a soft-law recommendation. Examples of open data in science: There are
5254-573: Was available in a limited number of languages (English, French and German). Some of the metadata (e.g. names of the data providers and geographical coverage) was in 24 languages. Most of the data accessible via the EU Open Data Portal was covered by the legal notice of the Europa website. Generally, data could be used for free for commercial and non-commercial purposes, provided the source is acknowledged. Specific conditions for reuse, relating mostly to
Was collected from social networks and online platforms for citizen collaboration. The data was then analyzed for content, meaning, location, timeframe, and other variables. Overall, online social relations for collaboration were analyzed based on network theory. The resulting dataset has been made available online as Open Data (aggregated and anonymized); nonetheless, individuals can reclaim all their data. This has been done with
5402-604: Was established with the formation of the World Data Center system, in preparation for the International Geophysical Year of 1957–1958. The International Council of Scientific Unions (now the International Council for Science ) oversees several World Data Centres with the mission to minimize the risk of data loss and to maximize data accessibility. While the open-science-data movement long predates
5476-531: Was the responsibility of the Directorate General for Communications Networks, Content and Technology (DG CONNECT) of the European Commission. This is still true today with data.europa.eu. The portal enabled users to search, explore, link, download and easily re-use data for commercial or non-commercial purposes, through a common metadata catalogue. From the portal, users could access data published on