Misplaced Pages

Python Package Index

Article snapshot taken from Wikipedia with Creative Commons Attribution-ShareAlike license.

The Python Package Index, abbreviated as PyPI (/ˌpaɪpiˈaɪ/) and also known as the Cheese Shop (a reference to the Monty Python's Flying Circus sketch "Cheese Shop"), is the official third-party software repository for Python. It is analogous to the CPAN repository for Perl and to the CRAN repository for R. PyPI is run by the Python Software Foundation, a charity. Some package managers, including pip, use PyPI as the default source for packages and their dependencies.
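As a rough illustration of how a package manager might locate a project on the index, the sketch below normalizes a project name and builds the corresponding index page URL. It assumes the name normalization rule from PEP 503 and the public index root `https://pypi.org/simple/`; the helper names `normalize_name` and `simple_index_url` are hypothetical and are not part of pip.

```python
import re

# Assumption: the public "simple" index root that installers commonly default to.
SIMPLE_INDEX = "https://pypi.org/simple/"


def normalize_name(name: str) -> str:
    """Normalize a project name following the rule given in PEP 503:
    runs of '-', '_' and '.' collapse to a single '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_index_url(name: str) -> str:
    """Build the index page URL for a project (hypothetical helper)."""
    return f"{SIMPLE_INDEX}{normalize_name(name)}/"


if __name__ == "__main__":
    for project in ("Flask_SQLAlchemy", "zope.interface"):
        print(project, "->", simple_index_url(project))
```

Only the URL is constructed here; nothing is downloaded, so the sketch runs without network access.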


40-434: As of 6 May 2024, more than 530,000 Python packages are available. PyPI primarily hosts Python packages in the form of source archives, called "sdists", or of "wheels" that may contain binary modules from a compiled language. PyPI as an index allows users to search for packages by keywords or by filters against their metadata, such as free software license or compatibility with POSIX. A single entry on PyPI

80-410: A labeled, directed multigraph. This makes an RDF data model better suited to certain kinds of knowledge representation than other relational or ontological models. As RDFS, OWL and SHACL demonstrate, one can build additional ontology languages upon RDF. The initial RDF design, intended to "build a vendor-neutral and operating system-independent system of metadata", derived from

120-543: A search index. Common words like articles (a, an, the) and conjunctions (and, or, but) are not treated as keywords because indexing them is inefficient. Almost every English-language site on the Internet contains the article "the", so it makes no sense to search for it. Google, the most popular search engine, removed stop words such as "the" and "a" from its indexes for several years, but then re-introduced them, making certain types of precise search possible again. The term "descriptor"

160-579: A URI could represent absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a 300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it succeeds in accessing. Therefore, producers and consumers of RDF statements must agree on the semantics of resource identifiers. Such agreement is not inherent to RDF itself, although there are some controlled vocabularies in common use, such as Dublin Core Metadata, which

200-450: A single scope identifier to be associated with a statement that has not been assigned a URI, itself. Likewise, named graphs in which a set of triples is named by a URI can represent context without the need to reify the triples. The predominant query language for RDF graphs is SPARQL. SPARQL is an SQL-like language, and a recommendation of the W3C as of January 15, 2008. The following

240-539: A statement can be associated with a context, named by a URI, in order to assert an "is true in" relationship. As another example, it is sometimes convenient to group statements by their source, which can be identified by a URI, such as the URI of a particular RDF/XML document. Then, when updates are made to the source, corresponding statements can be changed in the model, as well. Implementation of scopes does not necessarily require fully reified statements. Some implementations allow

280-448: A type of database called a triplestore. The subject of an RDF statement is either a uniform resource identifier (URI) or a blank node, both of which denote resources. Resources indicated by blank nodes are called anonymous resources. They are not directly identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a relationship. The object

320-444: A word or phrase from the search, getting rid of any results that include it. Multiple words can also be enclosed in quotation marks to turn the individual index terms into a specific index phrase. These modifiers and methods all help to refine search terms and improve the accuracy of search results. Author keywords are an integral part of literature. Many journals and databases provide access to index terms made by authors of

360-433: Is SHACL (Shapes Constraint Language). The SHACL specification is divided into two parts: SHACL Core and SHACL-SPARQL. SHACL Core consists of a list of built-in constraints such as cardinality, range of values and many others. SHACL-SPARQL describes SPARQL-based constraints and an extension mechanism to declare new constraint components. Other non-standard ways to describe and validate RDF graphs include: The following example

400-523: Is a directed graph composed of triple statements. An RDF graph statement is represented by: (1) a node for the subject, (2) an arc from subject to object, representing a predicate, and (3) a node for the object. Each of these parts can be identified by a Uniform Resource Identifier (URI). An object can also be a literal value. This simple, flexible data model has a lot of expressive power to represent complex situations, relationships, and other things of interest, while also being appropriately abstract. RDF

440-411: Is a URI, blank node or a Unicode string literal. As of RDF 1.1, resources are identified by Internationalized Resource Identifiers (IRIs); IRIs are a generalization of URIs. In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF (Friend of a Friend), resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on


480-436: Is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records. They are an integral part of bibliographic control, which is the function by which libraries collect, organize and disseminate documents. They are used as keywords to retrieve documents in an information system, for instance, a catalog or a search engine. A popular form of keywords on

520-598: Is able to store, aside from just a package and its metadata, previous releases of the package, precompiled wheels (e.g. containing DLLs on Windows), as well as different forms for different operating systems and Python versions. The Python Distribution Utilities (distutils) Python module was first added to the Python standard library in the 1.6.1 release, in September 2000, and in the 2.0 release, in October 2000, nine years after

560-595: Is an example of a SPARQL query to show country capitals in Africa, using a fictional ontology: Other non-standard ways to query RDF graphs include: The SHACL Advanced Features specification (W3C Working Group Note), the most recent version of which is maintained by the SHACL Community Group, defines support for SHACL Rules, used for data transformations, inferences and mappings of RDF based on SHACL shapes. The predominant language for describing and validating RDF graphs

600-410: Is based on the idea of making statements about resources (in particular web resources) in expressions of the form subject–predicate–object, known as triples. The subject denotes the resource; the predicate denotes traits or aspects of the resource, and expresses a relationship between the subject and the object. For example, one way to represent the notion "The sky has

640-515: Is partially mapped to a URI space for use in RDF. The intent of publishing RDF-based ontologies on the Web is often to establish, or circumscribe, the intended meanings of the resource identifiers used to express data in RDF. For example, the URI: http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#Merlot is intended by its owners to refer to the class of all Merlot red wines by vintner (i.e., instances of

680-401: Is rather a URI reference, containing the '#' character and ending with a fragment identifier. The body of knowledge modeled by a collection of statements may be subjected to reification, in which each statement (that is, each subject-predicate-object triple as a whole) is assigned a URI and treated as a resource about which additional statements can be made, as in "Jane says that John

720-521: Is some resource or literal. More statements about the original statement may also exist, depending on the application's needs. Borrowing from concepts available in logic (and as illustrated in graphical notations such as conceptual graphs and topic maps), some RDF model implementations acknowledge that it is sometimes useful to group statements according to different criteria, called situations, contexts, or scopes, as discussed in articles by RDF specification co-editor Graham Klyne. For example,

760-478: Is taken from the W3C website describing a resource with statements "there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is e.miller123(at)example (changed for security purposes), and whose title is Dr." The resource "http://www.w3.org/People/EM/contact#me" is the subject. The objects are: The subject is a URI. The predicates also have URIs. For example,

800-431: Is the author of document X". Reification is sometimes important in order to deduce a level of confidence or degree of usefulness for each statement. In a reified RDF database, each original statement, being a resource, itself, most likely has at least three additional statements made about it: one to assert that its subject is some resource, one to assert that its predicate is some resource, and one to assert that its object
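The reification idea described here can be sketched in a few lines of Python. This is an illustrative sketch only: statements are plain tuples, the `http://example.org/` namespace and the names `reify`, `stmt1`, `says` and `authorOf` are hypothetical, and the only real identifiers are the `rdf:` vocabulary terms (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`, `rdf:type`). Besides the three statements about subject, predicate and object, the sketch also adds a type statement marking the resource as an `rdf:Statement`.

```python
# Real RDF namespace; the EX namespace below is a made-up example namespace.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical


def reify(statement_uri, subject, predicate, obj):
    """Describe one triple as a resource using the RDF reification vocabulary."""
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", subject),
        (statement_uri, RDF + "predicate", predicate),
        (statement_uri, RDF + "object", obj),
    ]


# "John is the author of document X", reified as the resource stmt1 ...
triples = reify(EX + "stmt1", EX + "John", EX + "authorOf", EX + "documentX")
# ... so that a further statement can be made about it: "Jane says <stmt1>".
triples.append((EX + "Jane", EX + "says", EX + "stmt1"))

if __name__ == "__main__":
    for triple in triples:
        print(triple)
```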

840-482: Is used as a foundation for RDF Schema, where it is extended. Several common serialization formats are in use, including: RDF/XML is sometimes misleadingly called simply RDF because it was introduced among the other W3C specifications defining RDF and it was historically the first W3C standard RDF serialization format. However, it is important to distinguish the RDF/XML format from the abstract RDF model itself. Although


880-598: The University of Michigan. In 1999, the W3C published the first recommended RDF specification, the Model and Syntax Specification ("RDF M&S"). This described RDF's data model and an XML serialization. Two persistent misunderstandings about RDF developed at this time: firstly, due to the MCF influence and the RDF "Resource Description" initialism, the idea that RDF was specifically for use in representing metadata; secondly that RDF

920-647: The Python Software Foundation reported that the United States Department of Justice had subpoenaed the user data of five PyPI contributors. A representative of the organization further explained that they expect privacy for contributors, but they also comply with the law and court orders, and for this reason turned over the data which the government requested. Index term In information retrieval, an index term (also known as subject term, subject heading, descriptor, or keyword)

960-467: The RDF/XML format is still in use, other RDF serializations are now preferred by many RDF users, both because they are more human-friendly, and because some RDF graphs are not representable in RDF/XML due to restrictions on the syntax of XML QNames. With a little effort, virtually any arbitrary XML may also be interpreted as RDF using GRDDL (pronounced 'griddle'), Gleaning Resource Descriptions from Dialects of Languages. RDF triples may be stored in

1000-645: The URI for each predicate: In addition, the subject has a type (with URI http://www.w3.org/1999/02/22-rdf-syntax-ns#type), which is person (with URI http://www.w3.org/2000/10/swap/pim/contact#Person). Therefore, the following "subject, predicate, object" RDF triples can be expressed: In standard N-Triples format, this RDF can be written as: Equivalently, it can be written in standard Turtle (syntax) format as: Or, it can be written in RDF/XML format as: Certain concepts in RDF are taken from logic and linguistics, where subject-predicate and subject-predicate-object structures have meanings similar to, yet distinct from,

1040-620: The W3C's Platform for Internet Content Selection (PICS), an early web content labelling system, but the project was also shaped by ideas from Dublin Core, and from the Meta Content Framework (MCF), which had been developed during 1995 to 1997 by Ramanathan V. Guha at Apple and Tim Bray at Netscape. A first public draft of RDF appeared in October 1997, issued by a W3C working group that included representatives from IBM, Microsoft, Netscape, Nokia, Reuters, SoftQuad, and

1080-492: The World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:" and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via HTTP, nor does it need to represent a tangible, network-accessible resource; such

1120-445: The above URI each represent the class of all wine produced by a single vintner), a definition which is expressed by the OWL ontology — itself an RDF document — in which it occurs. Without careful analysis of the definition, one might erroneously conclude that an instance of the above URI was something physical, instead of a type of wine. Note that this is not a 'bare' resource identifier, but

1160-468: The case, a keyword can be any term that exists within the document. However, priority is given to words that occur in the title, words that recur numerous times, and words that are explicitly assigned as keywords within the coding. Index terms can be further refined using Boolean operators such as "AND, OR, NOT." "AND" is normally unnecessary as most search engines infer it. "OR" will search for results with one search term or another or both. "NOT" eliminates
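The Boolean refinements named here map naturally onto set operations over an inverted index. The sketch below is a minimal illustration under stated assumptions: the three sample documents, the stop-word list and the helper names `build_index` and `lookup` are all made up for this example, and real search engines use far more elaborate tokenization and ranking.

```python
# Assumption: a tiny stop-word list; real systems use longer, tuned lists.
STOP_WORDS = {"a", "an", "the", "and", "or", "but"}


def build_index(docs):
    """Map each non-stop word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(doc_id)
    return index


# Hypothetical sample documents.
docs = {
    1: "the cheese shop sketch",
    2: "a python package index",
    3: "the python cheese index",
}
index = build_index(docs)


def lookup(term):
    """Return the ids of documents indexed under a term (empty set if none)."""
    return index.get(term.lower(), set())


both = lookup("python") & lookup("cheese")     # python AND cheese
either = lookup("python") | lookup("cheese")   # python OR cheese
without = lookup("python") - lookup("cheese")  # python NOT cheese

if __name__ == "__main__":
    print(both, either, without)
```

Because stop words are skipped while indexing, a lookup for "the" returns an empty set in this sketch.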

1200-492: The color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". Therefore, RDF uses subject instead of object (or entity) in contrast to the typical approach of an entity–attribute–value model in object-oriented design: entity (sky), attribute (color), and value (blue). RDF is an abstract model with several serialization formats (being essentially specialized file formats). In addition
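The subject, predicate, object structure of the sky example can be sketched with plain Python tuples. This is only an illustration of the data model, not an RDF library: the `http://example.org/` namespace, the `Triple` type and the `match` helper are hypothetical, and the literal "blue" is stored as a bare string for simplicity.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # a URI or, as here, a literal value


EX = "http://example.org/"  # hypothetical namespace

# A tiny graph: a set of triples.
graph = {
    Triple(EX + "sky", EX + "hasColor", "blue"),
    Triple(EX + "grass", EX + "hasColor", "green"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


if __name__ == "__main__":
    # Which statements are made about the sky?
    for triple in match(graph, subject=EX + "sky"):
        print(triple)
```

Pattern matching with wildcards is, in spirit, what a triple pattern in a query language does, though real query engines also handle variables shared across patterns.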

1240-471: The database. Resource Description Framework The Resource Description Framework (RDF) is a method to describe and exchange graph data. It was originally designed as a data model for metadata by the World Wide Web Consortium (W3C). It provides a variety of syntax notations and data serialization formats, of which the most widely used is Turtle (Terse RDF Triple Language). RDF


1280-485: The first Python release in February 1991, with the goal of simplifying the process of installing third-party Python packages. However, distutils only provided the tools for packaging Python code, and no more. It was able to collect and distribute metadata but did not use it for other purposes. Python still lacked a centralised catalog for packages on the internet. PEP 241, a proposal to standardize metadata for indexes,

1320-740: The particular encoding for resources or triples can vary from format to format. This mechanism for describing resources is a major component in the W3C's Semantic Web activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use machine-readable information distributed throughout the Web, in turn enabling users to deal with the information with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts have also led to its increasing use in knowledge management applications unrelated to Semantic Web activity. A collection of RDF statements intrinsically represents

1360-407: The respective articles. How qualified the provider is decides the quality of both indexer-provided index terms and author-provided index terms. The quality of these two types of index terms is of research interest, particularly in relation to information retrieval. In general, an author will have difficulty providing indexing terms that characterize his or her document relative to other documents in

1400-433: The web are tags, which are directly visible and can be assigned by non-experts. Index terms can consist of a word, phrase, or alphanumerical term. They are created by analyzing the document either manually with subject indexing or automatically with automatic indexing or more sophisticated methods of keyword extraction. Index terms can either come from a controlled vocabulary or be freely assigned. Keywords are stored in

1440-514: Was by Calvin Mooers in 1948. It is in particular used about a preferred term from a thesaurus. The Simple Knowledge Organization System language (SKOS) provides a way to express index terms with Resource Description Framework for use in the context of the Semantic Web. Most web search engines are designed to search for words anywhere in a document: the title, the body, and so on. This being

1480-470: Was adopted as a W3C recommendation in 1999. The RDF 1.0 specification was published in 2004, and the RDF 1.1 specification in 2014. SPARQL is a standard query language for RDF graphs. RDF Schema (RDFS), Web Ontology Language (OWL) and SHACL (Shapes Constraint Language) are ontology languages that are used to describe RDF data. The RDF data model is similar to classical conceptual modeling approaches (such as entity–relationship or class diagrams). It

1520-709: Was an XML format rather than a data model, when in fact only the RDF/XML serialisation is XML-based. RDF saw little take-up in this period, but there was significant work done in Bristol, around ILRT at Bristol University and HP Labs, and in Boston at MIT. RSS 1.0 and FOAF became exemplar applications for RDF in this period. The recommendation of 1999 was replaced in 2004 by a set of six specifications: "The RDF Primer", "RDF Concepts and Abstract Syntax", "RDF/XML Syntax Specification (revised)", "RDF Semantics", "RDF Vocabulary Description Language 1.0", and "The RDF Test Cases". This series

1560-520: Was finalized in March 2001. A proposal to create a comprehensive centralised catalog, hosted at the python.org domain, was later finalized in November 2002. On 16 April 2018, all PyPI traffic began being served by a more modern website platform: Warehouse. The legacy website was turned off at the end of that month. All existing packages were migrated to the new platform with their histories preserved. In May 2023

1600-413: Was superseded in 2014 by the following six "RDF 1.1" documents: "RDF 1.1 Primer", "RDF 1.1 Concepts and Abstract Syntax", "RDF 1.1 XML Syntax", "RDF 1.1 Semantics", "RDF Schema 1.1", and "RDF 1.1 Test Cases". The vocabulary defined by the RDF specification is as follows: rdf:Statement, rdf:subject, rdf:predicate, rdf:object are used for reification (see below). This vocabulary
