ISO 639-3 - Misplaced Pages

ISO 639-3:2007, Codes for the representation of names of languages – Part 3: Alpha-3 code for comprehensive coverage of languages, is an international standard for language codes in the ISO 639 series. It defines three-letter codes for identifying languages. The standard was published by the International Organization for Standardization (ISO) on 1 February 2007.

ISO 639-3 extends the ISO 639-2 alpha-3 codes with an aim to cover all known natural languages. The extended language coverage was based primarily on the language codes used in the Ethnologue (volumes 10–14) published by SIL International, which is now the registration authority for ISO 639-3. It provides an enumeration of languages as complete as possible, including living and extinct, ancient and constructed, major and minor, written and unwritten. However, it does not include reconstructed languages such as Proto-Indo-European. ISO 639-3

A "remainder group" is a group of several related languages from which some specific languages have been excluded. However, in ISO 639-5, the "remainder groups" do not exclude any languages. Because ISO 639-2 and ISO 639-5 use the same Alpha-3 codes, but do not always refer to the same list of languages for any given code, the languages an Alpha-3 code refers to cannot be determined unless it is known whether

A change request may be withdrawn or promoted to "candidate status". Three months prior to the end of an annual review cycle (typically in September), an announcement is sent to the LINGUIST discussion list and other lists regarding Candidate Status Change Requests. All requests remain open for review and comment through the end of the annual review cycle. Decisions are announced at the end of

A fully documented request is received, it is added to a published Change Request Index. Also, announcements are sent to the general LINGUIST discussion list at Linguist List and other lists the registration authority may consider relevant, inviting public review and input on the requested change. Any list owner or individual is able to request notifications of change requests for particular regions or language families. Comments that are received are published for other parties to review. Based on consensus in comments received,

A generic value: qnp, unnamed proto-language. This is used for proposed intermediate nodes in a family tree that have no name. The code table for ISO 639-3 is open to changes. In order to protect stability of existing usage, the changes permitted are limited to: The code assigned to a language is not changed unless there is also a change in denotation. Changes are made on an annual cycle. Every request

A particular language or macrolanguage. While ISO 639-2 includes three-letter identifiers for collective languages, these codes are excluded from ISO 639-3. Hence ISO 639-3 is not a superset of ISO 639-2. ISO 639-5 defines 3-letter collective codes for language families and groups, including the collective language codes from ISO 639-2. Four codes are set aside in ISO 639-2 and ISO 639-3 for cases where none of

A variety of "scopes of denotation", or types of meaning and use, some of which are described in more detail below. For a definition of macrolanguages and collective languages, see ISO 639-3/RA: Scope of denotation for language identifiers. Individual languages are further classified as to type: Some ISO 639-2 codes that are commonly used for languages do not precisely represent a particular language or some related languages (as

Is an attempt to deal with varieties that may be linguistically distinct from each other, but are treated by their speakers as two forms of the same language, e.g. in cases of diglossia. For example: A complete list is available on the ISO 639-3 registrar's website. "A collective language code element is an identifier that represents a group of individual languages that are not deemed to be one language in any usage context." These codes do not precisely represent

Is appropriate since ISO is an industrial organization, while he views language documentation and nomenclature as a scientific endeavor. He cites the original need for standardized language identifiers as having been "the economic significance of translation and software localization", for which purposes the ISO 639-1 and 639-2 standards were established. But he raises doubts about industry need for

Is derived from the English name for the language and was a necessary legacy feature, and a "terminological" code (ISO 639-2/T), which is derived from the native name for the language and resembles the language's two-letter code in ISO 639-1. There were originally 22 B codes; scc and scr are now deprecated. In general the T codes are favored; ISO 639-3 uses ISO 639-2/T. The codes in ISO 639-2 have

Is given a minimum period of three months for public review. The ISO 639-3 Web site has pages that describe "scopes of denotation" (languoid types) and types of languages, which explain what concepts are in scope for encoding and certain criteria that need to be met. For example, constructed languages can be encoded, but only if they are designed for human communication and have a body of literature, preventing requests for idiosyncratic inventions. The registration authority documents on its Web site instructions made in

Is identified as a collective code in ISO 639-2 but is (at present) missing from ISO 639-5: Codes registered for 639-2 that are listed as collective codes in ISO 639-5 (and collective codes by name in ISO 639-2): The interval from qaa to qtz is "reserved for local use" and is used in neither ISO 639-2 nor ISO 639-3. These codes are typically used privately for languages not (yet) in either standard. Microsoft Windows uses

Is intended for use as metadata codes in a wide range of applications. It is widely used in computer and information systems, such as the Internet, in which many languages need to be supported. In archives and other information storage, it is used in cataloging systems, indicating what language a resource is in or about. The codes are also frequently used in the linguistic literature and elsewhere to compensate for

Is not a complete genetic hierarchy; some of the collection codes are based on geography (like nai) or category (like crp) instead. ISO 639-5 defines alpha-3 (3-letter) codes, called "collective codes", that identify language families and groups. As of the February 11, 2013 update to ISO 639-5, the standard defines 115 collective codes. The United States Library of Congress maintains

The qps language code for pseudo-locales generated automatically from English strings, designed for testing software localization. There are four generic codes for special situations: These four codes are also used in ISO 639-3.

ISO 639-5

ISO 639-5:2008 "Codes for the representation of names of languages—Part 5: Alpha-3 code for language families and groups" is an international standard published by

The International Organization for Standardization (ISO). It was developed by ISO Technical Committee 37, Subcommittee 2, and first published on May 15, 2008. It is part of the ISO 639 series of standards. This is a list of ISO 639-5 codes, including the code hierarchy as given in the ISO 639-5 registry. The code und (undetermined) from ISO 639-2 can be seen as top of the hierarchy (for example, und:aav, und:euq:eu). The hierarchy

The ISO 639-RA Joint Advisory Committee responsible for maintaining the ISO 639 code tables. Work was begun on the ISO 639-2 standard in 1989, because the ISO 639-1 standard, which uses only two-letter codes for languages, is not able to accommodate a sufficient number of languages. The ISO 639-2 standard was first released in 1998. In practice, ISO 639-2 has largely been superseded by ISO 639-3 (2007), which includes codes for all

The T-codes. As of 23 January 2023, the standard contains 7,916 entries. The inventory of languages is based on a number of sources including: the individual languages contained in 639-2, modern languages from the Ethnologue, historic varieties, ancient languages and artificial languages from the Linguist List, as well as languages recommended within the annual public commenting period. Machine-readable data files are provided by

The above macrolanguages). They are regarded as collective language codes and are excluded from ISO 639-3. The collective language codes in ISO 639-2 are listed below. Some language groups are noted to be remainder groups, that is, excluding languages with their own codes, while others are not. Remainder groups are afa, alg, art, ath, bat, ber, bnt, cai, cau, cel, crp, cus, dra, fiu, gem, inc, ine, ira, khi, kro, map, mis, mkh, mun, nai, nic, paa, roa, sai, sem, sio, sit, sla, ssa, tai and tut, while inclusive groups are apa, arn, arw, aus, bad, bai, bih, cad, car, chb, cmc, cpe, cpf, cpp, dua, hmn, iro, mno, mul, myn, nub, oto, phi, sgn, wak, wen, ypk and znd. The following code

The annual review cycle (typically in January). At that time, requests may be adopted in whole or in part, amended and carried forward into the next review cycle, or rejected. Rejections often include suggestions on how to modify proposals for resubmission. A public archive of every change request is maintained along with the decisions taken and the rationale for the decisions. Linguists Morey, Post and Friedman raise various criticisms of ISO 639, and in particular ISO 639-3: Martin Haspelmath agrees with four of these points, but not

The case of language varieties without established literary traditions, usage in education or media, or other factors that contribute to language conventionalization. Therefore, the standard should not be regarded as an authoritative statement of what distinct languages exist in the world (about which there may be substantial disagreement in some cases), but rather simply one useful way for identifying different language varieties precisely. Since

The code is three-letter alphabetic, one upper bound for the number of languages that can be represented is 26 × 26 × 26 = 17,576. Since ISO 639-2 defines special codes (4), a reserved range (520) and B-only codes (22), 546 codes cannot be used in part 3. Therefore, a stricter upper bound is 17,576 − 546 = 17,030. The upper bound gets even stricter if one subtracts the language collections defined in 639-2 and
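The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are invented for this example, and the figures 4, 520 and 22 are taken directly from the text.

```python
# Upper bounds on the number of ISO 639-3 identifiers, following the
# counting argument in the text. All names here are illustrative only.
ALPHABET_SIZE = 26

# Every three-letter lowercase string: 26 * 26 * 26.
total_codes = ALPHABET_SIZE ** 3

special_codes = 4                    # special-situation codes (count from the text)
reserved_codes = 20 * ALPHABET_SIZE  # qaa..qtz: second letter a..t, third letter a..z
b_only_codes = 22                    # bibliographic-only codes (count from the text)

unusable = special_codes + reserved_codes + b_only_codes
stricter_bound = total_codes - unusable

print(total_codes, unusable, stricter_bound)  # prints: 17576 546 17030
```

The reserved range is written as 20 × 26 to show where the figure 520 comes from: the second letter runs from a to t (20 values) and the third from a to z.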

The code is used in the context of ISO 639-2 or ISO 639-5. The committee draft of ISO 639-5 was issued on February 23, 2005. Voting on the draft terminated on July 5, 2005; the draft was approved. In 2006, the target publication date for the final standard was set at October 30, 2007. During the approval stage for the standard, the ISO final draft international standard ballot was not initiated until February 8, 2008. Voting ended on April 10, 2008 ("stage 50.60"). The standard

The comprehensive coverage provided by ISO 639-3, including as it does "little-known languages of small communities that are never or hardly used in writing and that are often in danger of extinction".

ISO 639-2

ISO 639-2:1998, Codes for the representation of names of languages — Part 2: Alpha-3 code, is the second part of the ISO 639 standard, which lists codes for the representation of

The fact that language names may be obscure or ambiguous. ISO 639-3 includes all languages in ISO 639-1 and all individual languages in ISO 639-2. ISO 639-1 and ISO 639-2 focused on major languages, most frequently represented in the total body of the world's literature. Since ISO 639-2 also includes language collections and Part 3 does not, ISO 639-3 is not a superset of ISO 639-2. Where B and T codes exist in ISO 639-2, ISO 639-3 uses

The individual languages in ISO 639-2 plus many more. It also includes the special and reserved codes, and is designed not to conflict with ISO 639-2. ISO 639-3, however, does not include any of the collective languages in ISO 639-2; most of these are included in ISO 639-5. While most languages are given one code by the standard, twenty of the languages described have two three-letter codes, a "bibliographic" code (ISO 639-2/B), which

The list of Alpha-3 codes that comprise ISO 639-5. The standard does not cover all language families used by linguists. The languages covered by a group code need not be linguistically related, but may have a geographic relation, or category relation (such as Creoles). SIL International treats ISO 639-2 code him (Himachali languages / Western Pahari languages) as an ISO 639-5 code, although it does not appear in

The names of languages. The three-letter codes given for each language in this part of the standard are referred to as "Alpha-3" codes. There are 487 entries in the list of ISO 639-2 codes. The US Library of Congress is the registration authority for ISO 639-2 (referred to as ISO 639-2/RA). As registration authority, the LOC receives and reviews proposed changes; it also has representation on

The official list of ISO 639-5 codes maintained by the Library of Congress (the registration authority for ISO 639-5). Some of the codes in ISO 639-5 are also found in the ISO 639-2 "Alpha-3 code" standard. ISO 639-2 contains codes for some individual languages, some ISO 639 macrolanguage codes, and some collective codes; any code found in ISO 639-2 is also found in either ISO 639-3 or ISO 639-5. Languages, families, or group codes in ISO 639-2 can be of type "group" (g) or "remainder group" (r). A "group" consists of several related languages;

The ones yet to be defined in ISO 639-5. There are 58 languages in ISO 639-2 which are considered, for the purposes of the standard, to be "macrolanguages" in ISO 639-3. Some of these macrolanguages had no individual language as defined by ISO 639-3 in the code set of ISO 639-2, e.g. ara (Generic Arabic). Others like nor (Norwegian) had their two individual parts (nno (Nynorsk), nob (Bokmål)) already in ISO 639-2. That means some languages (e.g. arb, Standard Arabic) that were considered by ISO 639-2 to be dialects of one language (ara) are now in ISO 639-3 in certain contexts considered to be individual languages themselves. This

The point about language change. He disagrees because any account of a language requires identifying it, and we can easily identify different stages of a language. He suggests that linguists may prefer to use a codification that is made at the languoid level since "it rarely matters to linguists whether what they are talking about is a language, a dialect or a close-knit family of languages." He also questions whether an ISO standard for language identification

The registration authority. Mappings from ISO 639-1 or ISO 639-2 to ISO 639-3 can be done using these data files. ISO 639-3 is intended to assume distinctions based on criteria that are not entirely objective. It is not intended to document or provide identifiers for dialects or other sub-language variations. Nevertheless, judgments regarding distinctions between languages may be subjective, particularly in
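As a rough illustration of how such a mapping could be derived from a tab-separated data file, the sketch below builds lookup tables from ISO 639-1 and ISO 639-2 codes to ISO 639-3 identifiers. The column names (Id, Part2B, Part2T, Part1, Ref_Name) and the inline sample rows are assumptions made for this example; the header of the actual file should be checked before relying on them.

```python
import csv
import io

# Hypothetical excerpt in the style of a tab-separated code table.
# Column names are assumed for this sketch, not taken from the text.
SAMPLE = (
    "Id\tPart2B\tPart2T\tPart1\tRef_Name\n"
    "eng\teng\teng\ten\tEnglish\n"
    "deu\tger\tdeu\tde\tGerman\n"
    "fra\tfre\tfra\tfr\tFrench\n"
    "arb\t\t\t\tStandard Arabic\n"
)

def build_mappings(tsv_text):
    """Return (part1_to_part3, part2_to_part3) dictionaries."""
    part1_to_part3 = {}
    part2_to_part3 = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row["Part1"]:
            part1_to_part3[row["Part1"]] = row["Id"]
        # Both the bibliographic (B) and terminological (T) codes
        # resolve to the same ISO 639-3 identifier.
        for column in ("Part2B", "Part2T"):
            if row[column]:
                part2_to_part3[row[column]] = row["Id"]
    return part1_to_part3, part2_to_part3

part1_map, part2_map = build_mappings(SAMPLE)
print(part1_map["de"], part2_map["ger"], part2_map["fre"])
```

Rows with empty ISO 639-1 or ISO 639-2 columns, such as the sample entry for Standard Arabic, simply contribute nothing to the lookup tables.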

The specific codes are appropriate. These are intended primarily for applications like databases where an ISO code is required regardless of whether one exists. In addition, 520 codes in the range qaa–qtz are 'reserved for local use'. For example, Rebecca Bettencourt assigns codes to constructed languages, and new assignments are made upon request. The Linguist List uses them for extinct languages. Linguist List has assigned one of them
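A membership test for the local-use range qaa–qtz can be sketched as follows. The function name is hypothetical, and treating only lowercase input as valid is an assumption of this example, not a rule stated in the text.

```python
import re
import string

# qaa..qtz: "q", then a letter from a to t, then any letter from a to z.
_LOCAL_USE = re.compile(r"q[a-t][a-z]")

def is_local_use(code):
    """Return True if code lies in the reserved local-use range qaa-qtz."""
    return _LOCAL_USE.fullmatch(code) is not None

# Enumerate the whole range to confirm the count of 520 given in the text.
local_use_codes = [
    "q" + second + third
    for second in string.ascii_lowercase[:20]   # a..t
    for third in string.ascii_lowercase         # a..z
]

print(len(local_use_codes))                      # prints: 520
print(is_local_use("qaa"), is_local_use("qua"))  # prints: True False
```

A pattern match keeps the check independent of any code table, which suits a range that is deliberately left unassigned by the standard.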

The text of the ISO 639-3 standard regarding how the code tables are to be maintained. It also documents the processes used for receiving and processing change requests. A change request form is provided, and there is a second form for collecting information about proposed additions. Any party can submit change requests. When submitted, requests are initially reviewed by the registration authority for completeness. When
