Misplaced Pages

Hierarchical Data Format

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

Hierarchical Data Format ( HDF ) is a set of file formats ( HDF4 , HDF5 ) designed to store and organize large amounts of data. Originally developed at the U.S. National Center for Supercomputing Applications , it is supported by The HDF Group, a non-profit corporation whose mission is to ensure continued development of HDF5 technologies and the continued accessibility of data stored in HDF.

#50949

47-582: In keeping with this goal, the HDF libraries and associated tools are available under a liberal, BSD-like license for general use. HDF is supported by many commercial and non-commercial software platforms and programming languages. The freely available HDF distribution consists of the library, command-line utilities, test suite source, Java interface, and the Java-based HDF Viewer (HDFView). The current version, HDF5, differs significantly in design and API from

94-543: A public-domain-equivalent license , the same way as MIT No Attribution License . It is known as "0BSD", "Zero-Clause BSD", or "Free Public License 1.0.0". It was created by Rob Landley and first used in Toybox when he was disappointed after using GPL license in BusyBox . Copyright (C) [year] by [copyright holder] <[email]> Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee

141-469: A POSIX compatibility layer and are not otherwise inherently Unix systems. Many ancient UNIX systems no longer meet this definition. Broadly, any Unix-like system that behaves in a manner roughly consistent with the UNIX specification, including having a " program which manages your login and command line sessions "; more specifically, this can refer to systems such as Linux or Minix that behave similarly to

188-466: A UNIX system but have no genetic or trademark connection to the AT&;T code base. Most free/open-source implementations of the UNIX design, whether genetic UNIX or not, fall into the restricted definition of this third category due to the expense of obtaining Open Group certification, which costs thousands of dollars. Around 2001 Linux was given the opportunity to get a certification including free help from

235-513: A clause not found in later licenses, known as the "advertising clause". This clause eventually became controversial, as it required authors of all works deriving from a BSD-licensed work to include an acknowledgment of the original source in all advertising material. This was clause number 3 in the original license text: Copyright (c) <year>, <copyright holder> All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that

282-1220: A clause restricting use of the names of contributors for endorsement of a derived work without specific permission. Copyright <year> <copyright holder> Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. An even more simplified version has come into use, primarily known for its usage in FreeBSD . It

329-447: A clear object model, which makes continued support and improvement difficult. Supporting many different interface styles (images, tables, arrays) leads to a complex API. Support for metadata depends on which interface is in use; SD (Scientific Dataset) objects support arbitrary named attributes, while other types only support predefined metadata. Perhaps most importantly, the use of 32-bit signed integers for addressing limits HDF4 files to

376-583: A historical connection to the AT&;T codebase. Most commercial UNIX systems fall into this category. So do the BSD systems, which are descendants of work done at the University of California, Berkeley in the late 1970s and early 1980s. Some of these systems have no original AT&T code but can still trace their ancestry to AT&T designs. These systems‍—‌largely commercial in nature‍—‌have been determined by

423-546: A maximum of 2 GB, which is unacceptable in many modern scientific applications. The HDF5 format is designed to address some of the limitations of the HDF4 library, and to address current and anticipated requirements of modern systems and applications. In 2002 it won an R&D 100 Award . HDF5 simplifies the file structure to include only two major types of object: This results in a truly hierarchical, filesystem-like data format. In fact, resources in an HDF5 file can be accessed using

470-873: A misuse of their trademark. Their guidelines require "UNIX" to be presented in uppercase or otherwise distinguished from the surrounding text, strongly encourage using it as a branding adjective for a generic word such as "system", and discourage its use in hyphenated phrases. Other parties frequently treat "Unix" as a genericized trademark . Some add a wildcard character to the name to make an abbreviation like "Un*x" or "*nix", since Unix-like systems often have Unix-like names such as AIX , A/UX , HP-UX , IRIX , Linux , Minix , Ultrix , Xenix , and XNU . These patterns do not literally match many system names, but are still generally recognized to refer to any UNIX system, descendant, or work-alike, even those with completely dissimilar names such as Darwin / macOS , illumos / Solaris or FreeBSD . In 2007, Wayne R. Gray sued to dispute

517-589: A similar 2-clause license. This version has been vetted as an Open source license by the OSI as the "Simplified BSD License." The ISC license without the 'and/or' wording is functionally equivalent, and endorsed by the OpenBSD project as a license template for new contributions. The BSD 0-clause license goes further than the 2-clause license by dropping the requirements to include the copyright notice, license text, or disclaimer in either source or binary forms. Doing so forms

SECTION 10

#1732851868051

564-475: A variety of proprietary systems were developed based on it, including AIX , HP-UX , IRIX , SunOS , Tru64 , Ultrix , and Xenix . These largely displaced the proprietary clones. Growing incompatibility among these systems led to the creation of interoperability standards, including POSIX and the Single UNIX Specification . Various free, low-cost, and unrestricted substitutes for UNIX emerged in

611-460: Is also object-oriented with respect to datasets, groups, attributes, types, dataspaces and property lists. The latest version of NetCDF , version 4, is based on HDF5. Because it uses B-trees to index table objects, HDF5 works well for time series data such as stock price series, network monitoring data, and 3D meteorological data. The bulk of the data goes into straightforward arrays (the table objects) that can be accessed much more quickly than

658-458: Is compatible with the GNU GPL. The FSF encourages users to be specific when referring to the license by name (i.e. not simply referring to it as "a BSD license" or "BSD-style") to avoid confusion with the original BSD license. This version allows unlimited redistribution for any purpose as long as its copyright notices and the license's disclaimers of warranty are maintained. The license also contains

705-623: Is hereby granted. THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. The SPDX License List contains extra BSD license variations. Examples include: The FreeBSD project argues on

752-515: Is in contrast to copyleft licenses, which have share-alike requirements. The original BSD license was used for its namesake, the Berkeley Software Distribution (BSD), a Unix-like operating system . The original version has since been revised, and its descendants are referred to as modified BSD licenses. BSD is both a license and a class of license (generally referred to as BSD-like). The modified BSD license (in wide use today)

799-535: Is made up of many small, interchangeable components that can be added or removed as needed. This makes it easy to customize the operating system to suit the needs of different users or environments. The Open Group owns the UNIX trademark and administers the Single UNIX Specification, with the "UNIX" name being used as a certification mark . They do not approve of the construction "Unix-like", and consider it

846-484: Is no reason not to use software already using it. The advertising clause was removed from the license text in the official BSD license on July 22, 1999, by William Hoskins, Director of the Office of Technology Licensing for UC Berkeley. On January 31, 2012, UC Berkeley Executive Director of the Office of Intellectual Property and Industry Alliances established that licensees and distributors are no longer required to include

893-405: Is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification . A Unix-like application is one that behaves like the corresponding Unix command or shell . Although there are general philosophies for Unix design, there is no technical standard defining the term, and opinions can differ about

940-418: Is their ability to support multiple users and processes simultaneously. This allows users to run multiple programs at the same time and to share resources such as memory and disk space. This is in contrast to many older operating systems, which were designed to only support a single user or process at a time. Another important feature of Unix-like systems is their modularity . This means that the operating system

987-453: Is very similar to the license originally used for the BSD version of Unix . The BSD license is a simple license that merely requires that all code retain the BSD license notice if redistributed in source code format, or reproduce the notice if redistributed in binary format. The BSD license (unlike some other licenses e.g. GPL ) does not require that source code be distributed at all. In addition to

SECTION 20

#1732851868051

1034-518: The Earth Observing System (EOS) project. After a two-year review process, HDF was selected as the standard data and information system. HDF4 is the older version of the format, although still actively supported by The HDF Group. It supports a proliferation of different data models, including multidimensional arrays, raster images , and tables. Each defines a specific aggregate data type and provides an API for reading, writing, and organizing

1081-520: The Open Group to meet the Single UNIX Specification and are allowed to carry the UNIX name. Most such systems are commercial derivatives of the System V code base in one form or another, although Apple macOS 10.5 and later is a BSD variant that has been certified, and EulerOS and Inspur K-UX are Linux distributions that have been certified. A few other systems (such as IBM z/OS) earned the trademark through

1128-476: The POSIX -like syntax /path/to/resource . Metadata is stored in the form of user-defined, named attributes attached to groups and datasets. More complex storage APIs representing images and tables can then be built up using datasets, groups and attributes. In addition to these advances in the file format, HDF5 includes an improved type system, and dataspace objects which represent selections over dataset regions. The API

1175-508: The 1980s and 1990s, including 4.4BSD , Linux , and Minix . Some of these have in turn been the basis for commercial "Unix-like" systems, such as BSD/OS and macOS . Several versions of (Mac) OS X/macOS running on Intel-based Mac computers have been certified under the Single UNIX Specification . The BSD variants are descendants of UNIX developed by the University of California at Berkeley, with UNIX source code from Bell Labs . However,

1222-493: The BSD code base has evolved since then, replacing all the AT&T code. Since the BSD variants are not certified as compliant with the Single UNIX Specification, they are referred to as "UNIX-like" rather than "UNIX". Dennis Ritchie , one of the original creators of Unix, expressed his opinion that Unix-like systems such as Linux are de facto Unix systems. Eric S. Raymond and Rob Landley have suggested that there are three kinds of Unix-like systems: Those systems with

1269-423: The BSD license is great for code you don't care about. I'll use it myself. If there’s a library routine that I just want to say 'hey, this is useful to anybody and I’m not going to maintain this,' I’ll put it under the BSD license. -- Linus Torvalds at LinuxCon 2016 The BSD license family is one of the oldest and most broadly used license families in the free and open-source software ecosystem, and has been

1316-637: The Free Software Foundation, and have been vetted as open source licenses by the Open Source Initiative . The original, 4-clause BSD license has not been accepted as an open source license and, although the original is considered to be a free software license by the FSF, the FSF does not consider it to be compatible with the GPL due to the advertising clause. Over the years I've become convinced that

1363-658: The above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the <copyright holder>. The name of the <copyright holder> may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED `'AS IS″ AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The original BSD license contained

1410-469: The acknowledgement within advertising materials. Accordingly, the advertising clause 3 of the original 4-clause BSD license for any and all software officially licensed under a UC Berkeley version of the BSD license, was deleted in its entirety. Other BSD distributions removed the clause, but many similar clauses remain in BSD-derived code from other sources, and unrelated code using a derived license. While

1457-479: The adoption of the 4-clause BSD license used a license that is clearly ancestral to the 4-clause BSD license. These releases include some parts of 4.3BSD-Tahoe (1988), about 1000 files, and Net/1 (1989). Although largely replaced by the 4-clause license, this license can be found in 4.3BSD-Reno, Net/2, and 4.4BSD-Alpha. Copyright (c) <year> <copyright holder>. All rights reserved. Redistribution and use in source and binary forms are permitted provided that

Hierarchical Data Format - Misplaced Pages Continue

1504-516: The advantages of BSD-style licenses for companies and commercial use-cases due to their license compatibility with proprietary licenses and general flexibility, stating that the BSD-style licenses place only "minimal restrictions on future behavior" and are not "legal time-bombs", unlike copyleft licenses . The BSD License allows proprietary use and allows the software released under the license to be incorporated into proprietary products. Works based on

1551-533: The clause presented a legal problem for those wishing to publish BSD-licensed software which relies upon separate programs using the GNU GPL : the advertising clause is incompatible with the GPL, which does not allow the addition of restrictions beyond those it already imposes; because of this, the GPL's publisher, the Free Software Foundation , recommends developers not use the license, though it states there

1598-432: The data and metadata. New data models can be added by the HDF developers or users. HDF is self-describing, allowing an application to interpret the structure and contents of a file with no outside information. One HDF file can hold a mix of related objects which can be accessed as a group or as individual objects. Users can create their own grouping structures called "vgroups." The HDF4 format has many limitations. It lacks

1645-528: The degree to which a particular operating system or application is Unix-like. Some well-known examples of Unix-like operating systems include Linux and BSD . These systems are often used on servers as well as on personal computers and other devices. Many popular applications, such as the Apache web server and the Bash shell, are also designed to be used on Unix-like systems. One of the key features of Unix-like systems

1692-830: The following conditions are met: THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Other projects, such as NetBSD, use

1739-782: The following conditions are met: THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. This clause

1786-464: The inspiration for a number of other licenses. Many FOSS software projects use a BSD license, for instance the BSD OS family (FreeBSD etc.), Google 's Bionic or Toybox. As of 2015 the BSD 3-clause license ranked in popularity number five according to Black Duck Software and sixth according to GitHub data. Unix-like A Unix-like (sometimes referred to as UN*X or *nix ) operating system

1833-563: The license as the FreeBSD License, states that it is compatible with the GNU GPL. In addition, the FSF encourages users to be specific when referring to the license by name (i.e. not simply referring to it as "a BSD license" or "BSD-style"), as it does with the modified/new BSD license, to avoid confusion with the original BSD license. Copyright (c) <year>, <copyright holder> Redistribution and use in source and binary forms, with or without modification, are permitted provided that

1880-557: The major legacy version HDF4. The quest for a portable scientific data format, originally dubbed AEHOO (All Encompassing Hierarchical Object Oriented format) began in 1987 by the Graphics Foundations Task Force (GFTF) at the National Center for Supercomputing Applications (NCSA). NSF grants received in 1990 and 1992 were important to the project. Around this time NASA investigated 15 different file formats for use in

1927-646: The material may be released under a proprietary license as closed source software, allowing usual commercial usages under them. The 3-clause BSD license, like most permissive licenses , is compatible with almost all FOSS licenses (and as well proprietary licenses). Two variants of the license, the New BSD License/Modified BSD License (3-clause), and the Simplified BSD License/FreeBSD License (2-clause) have been verified as GPL - compatible free software licenses by

Hierarchical Data Format - Misplaced Pages Continue

1974-461: The original (4-clause) license used for BSD, several derivative licenses have emerged that are also commonly referred to as a "BSD license". Today, the typical BSD license is the 3-clause version, which is revised from the original 4-clause version. In all BSD licenses as following, <year> is the year of the copyright. As published in BSD, <copyright holder> is "Regents of the University of California". Some releases of BSD prior to

2021-402: The original license is sometimes referred to as the "BSD-old", the resulting 3-clause version is sometimes referred to by "BSD-new." Other names include new BSD , "revised BSD", "BSD-3", or "3-clause BSD". This version has been vetted as an Open source license by the OSI as "The BSD License". The Free Software Foundation, which refers to the license as the "Modified BSD License", states that it

2068-428: The rows of an SQL database, but B-tree access is available for non-array data. The HDF5 data storage mechanism can be simpler and faster than an SQL star schema . Criticism of HDF5 follows from its monolithic design and lengthy specification. BSD licenses BSD licenses are a family of permissive free software licenses , imposing minimal restrictions on the use and distribution of covered software. This

2115-508: The status of UNIX as a trademark, but lost his case, and lost again on appeal, with the court upholding the trademark and its ownership. "Unix-like" systems started to appear in the late 1970s and early 1980s. Many proprietary versions, such as Idris (1978), UNOS (1982), Coherent (1983), and UniFlex (1985), aimed to provide businesses with the functionality available to academic users of UNIX. When AT&T allowed relatively inexpensive commercial binary sublicensing of UNIX in 1979,

2162-454: Was in use there as early as 29 April 1999 and likely well before. The primary difference between it and the New BSD (3-clause) License is that it omits the non-endorsement clause. The FreeBSD version of the license also adds a further disclaimer about views and opinions expressed in the software, though this is not commonly included by other projects. The Free Software Foundation, which refers to

2209-433: Was objected to on the grounds that as people changed the license to reflect their name or organization it led to escalating advertising requirements when programs were combined in a software distribution: every occurrence of the license with a different name required a separate acknowledgment. In arguing against it, Richard Stallman has stated that he counted 75 such acknowledgments in a 1997 version of NetBSD . In addition,

#50949