User:Lea Lacroix (WMDE)/List of lists of languages

prototype (196 BC)

Here's an attempt to build a list of all the lists of languages that are used by Wikidata. This is a work in progress; feel free to edit and improve it if you have information or useful links!

Uses of lists


Location of the language list/language use | How is it currently displayed? | Where does it come from? | Where is the list stored? | What’s the process to update the list? (technical) | What’s the process to update the list? (community) | Image
Desktop termbox

List of languages for which one can read or edit labels, descriptions, aliases

Reading & editing: name of the language in the UI’s language

The order, and which languages are shown first, is influenced by a mechanism (unclear which one) that is in turn influenced by the BabelBox

WikibaseContentLanguages::getDefaultTermsLanguages() (unless overridden by the WikibaseContentLanguages hook). It returns the MediaWiki languages, i.e. the ones supported by MediaWiki directly (not sure where those are defined – Names.php?), plus $wgExtraLanguageNames, plus some languages defined in getDefaultTermsLanguages itself. To add a language: add it to the MediaWiki languages if it should become a general interface language, or to wmgExtraLanguageNames in InitialiseSettings.php otherwise.
 
termbox (Q87071065)
Mobile termbox

List of languages for which one can read or edit labels, descriptions, aliases

Reading & editing: name of the language in the UI’s language

The order, and which languages are shown first, is influenced by a mechanism (unclear which one) that is in turn influenced by the BabelBox

Same as Desktop termbox? Same as Desktop termbox? Same as Desktop termbox?
 
Language field in Special:NewItem Language code (‘en’)

In the selector list: name + code

By default, the suggested language is the user’s current interface language

Same as Desktop termbox? Same as Desktop termbox? Same as Desktop termbox?
 
Language field in Special:NewLexeme (also: Language field when editing a Lexeme) Label & description in UI’s language Freely picked from all Items, no restriction. The relevant item (a language or subclass of language?) is not necessarily shown first in the suggestor. - Create or update Items
 
 
“Spelling variant of the Lemma” field in Special:NewLexeme (also: Spelling variant field when editing a Lexeme) Reading: code only, or “mis-X-Q1234”

Editing: Name and language code. Only the code makes the selector find the relevant language

Also: when the language doesn’t exist, people are supposed to type “mis-X-Q1234”, without any help from the interface

LexemeTermLanguages, which is the MediaWiki languages (see Desktop termbox, above) plus a hard-coded list of additional language codes. The -x-Q### part is validated in LexemeTermLanguageValidator.

Note: Currently there is no validation that the item exists (phab:T201084), as long as there are no more than ten digits after the Q. Also, the full code may not be a valid IETF language tag, since no subtag of an IETF language tag may be longer than 8 characters (phab:T167166). ... Also: under which conditions exactly does the field appear when creating a new Lexeme? When the language item has no ISO 639-1 code statement.

Part of the answer here For the additional languages a Phabricator ticket needs to be created. LangCom input is generally sought.
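The two checks mentioned in the note above can be sketched in Python. This is only an illustration of the rules as described on this page, not the code of LexemeTermLanguageValidator; the function names and the regular expression (including the lowercase "x") are assumptions made for the example.

```python
import re

# Assumption: pattern modelled on the description above (base code,
# "-x-", then Q plus at most ten digits). Not copied from Wikibase.
PRIVATE_USE = re.compile(
    r"^(?P<base>[a-z]{2,3}(?:-[a-z0-9]+)*)-x-(?P<qid>Q[0-9]{1,10})$"
)


def split_private_use(code):
    """Return (base code, QID) for a 'xx-x-Q123' code, or None."""
    match = PRIVATE_USE.match(code)
    if match is None:
        return None
    return match.group("base"), match.group("qid")


def overlong_subtags(code):
    """Return the subtags longer than 8 characters (invalid in IETF tags)."""
    return [part for part in code.split("-") if len(part) > 8]


print(split_private_use("mis-x-Q1234"))     # ('mis', 'Q1234')
print(overlong_subtags("mis-x-Q12345678"))  # ['Q12345678']
```

Note that neither check looks up whether the item exists, which matches the gap described in phab:T201084.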
 
 
Language of monolingual text: appears when entering or editing a value monolingual text (eg for P1705) Reading: name of the language in UI’s language.

Editing: name and language code.

Problem in editing mode: some languages don’t appear in the selector, but still work once the edit is saved.

WikibaseContentLanguages::getDefaultMonolingualTextLanguages() (unless overridden by WikibaseContentLanguages hook), which returns the MediaWiki languages (see Desktop termbox, above) plus a hard-coded set of additional language codes minus a hard-coded set of undesired language codes. Existing documentation on Help:monolingual text languages. Current process:
  • To get a language added, people need to create a Phab ticket with the Language codes tag.
  • According to WMDE, it needs to be approved by the Wikimedia language committee. Wikidata community consensus is that this is not required.
  • After the new language is created by the devs, if the name of the language doesn’t exist in the user’s interface, the new language is not displayed, creating some confusion (see T124758)
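The way the monolingual text list is assembled (MediaWiki languages, plus additional codes, minus undesired codes) can be sketched as plain set arithmetic. The function name and the sample codes below are invented for illustration; the real lists live in Wikibase and are much longer.

```python
def monolingual_text_languages(mediawiki_codes, additional_codes, excluded_codes):
    """Combine the three lists as described above: (base + extra) - excluded."""
    return (set(mediawiki_codes) | set(additional_codes)) - set(excluded_codes)


# Placeholder data, chosen only to show the effect of each list.
mediawiki = {"en", "de", "fr", "simple"}
additional = {"mis", "und"}
excluded = {"simple"}

print(sorted(monolingual_text_languages(mediawiki, additional, excluded)))
# ['de', 'en', 'fr', 'mis', 'und']
```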
 
language list (Q87113143)
Language field for a Gloss on Senses Reading: name of the language.

Editing: name and code.

Same as Spelling variant of the Lemma?
Note the -x-Q#### tag is not allowed in gloss languages; however, this is only verified in frontend. It is still possible to add such glosses via API.
Same as Spelling variant of the Lemma? Same as Spelling variant of the Lemma?
 
Spelling variant field for Forms Reading: language code only.

Editing: language code, no selector available.

Same as Spelling variant of the Lemma? Same as Spelling variant of the Lemma? Same as Spelling variant of the Lemma?
 
Language of the interface for logged-in users To change temporarily on one page: add “?uselang=ar” in the URL (language code)

On the interface: switch command with the symbol, name of the language in its own language

Suggestions are influenced by: the thing that is influenced by BabelBox? Most used languages?

Note: for users who are not logged in, the interface stays in English.

Same as “the MediaWiki languages” in Desktop termbox, above? Same as Desktop termbox? Same as Desktop termbox?
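As a small illustration of the “?uselang=ar” trick for this row, the following Python sketch adds or replaces the uselang parameter on a page URL. The helper name with_uselang is hypothetical; only the uselang parameter itself comes from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_uselang(url, lang):
    """Return url with its uselang query parameter set to lang."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous uselang value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_uselang("https://www.wikidata.org/wiki/Q42", "ar"))
# https://www.wikidata.org/wiki/Q42?uselang=ar
```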
 
Languages of the Translate Extension: visible eg when a documentation page has translations enabled Name of the language in its own language + icon indicating the level of translation done Same as “the MediaWiki languages” in Desktop termbox, above?

Note that <languages/> only shows languages for which a (partial) translation exists; you can select other languages on Special:Translate

Same as Desktop termbox? Same as Desktop termbox?
Languages available in the BabelBox Language code, level, generated sentence in the language

Has effect on: what languages are shown first on desktop & mobile termbox

What languages are available depends on the babel box sub-templates that are available. The user can edit their user page and add whatever language code they want to add, even unsupported ones. These are shown as red links and don’t have an effect, until support is added. The community provides babel box sub-templates for languages they want to support.
 
result of adding "de-2" on user page
Existing language versions of Wikipedias On a Wikidata Item: indicated by the language code and the title of the article in each language. Same for other Wikimedia projects.

On Wikipedia: name of the language in its own language

m:Special:SiteMatrix I think it first needs to be added as an interface language, and then add a wiki, specifically populateSitesTable.php. m:Requests for new languages (for proposal) and Incubator (for development)

Note this will be changed in the future, see phab:T228745

 
sample main page (zh)
Interface language of the WDQS On the interface: switch command with the symbol, name of the language in its own language Languages supported by jquery.uls, which are periodically imported from wikimedia/language-data. Follow wikimedia/language-data instructions, then follow jquery.uls instructions, then update version in wikidata/query/gui. (Then wait for a deployment, but the process for that is supposed to change soon anyways.) Unclear.
 
note "English" in the top right corner
Items for languages in Wikidata Wikidata contains an extensive "list" of items about languages, langoids.


Can be used with qualifier language of work or name (P407)
Included in proposals for new Wikipedia language editions on meta

Wikidata items create an item for the language create an item for the language
Magic word {{#language:}} MediaWiki can output the language name with a magic word. Sample: {{#language:en-gb}} renders British English MediaWiki languages (the ones supported by MediaWiki directly, plus $wgExtraLanguageNames) Same as Desktop termbox Same as Desktop termbox
Page content language In "page information" or with magic words {{CONTENTLANG}}, {{CONTENTLANGUAGE}}. By default the same as wiki language, can be changed by translation administrators. in MediaWiki page properties ask a translation administrator
Language associated with Wikimedia sitelink schema:inLanguage on WDQS, can be different from language code in URL: https://w.wiki/br3 . See phab:T145535 for similar. ask at Wikidata:Contact the development team
Language associated with property Some properties have a language/writing system associated with them that is stated in its label or description. Samples: name in hiero markup (P7383), name in kana (P1814), transliteration properties free text in label or description, possibly with statements on property property creator or update of label/description make a property proposal or suggest change on property talk page

Lists used

codes
language names
  • ..
  • configuration
  • labels of Wikidata items (sample: English )

General ideas

General ideas about its use on Wikidata:

  • use available codes whenever possible
  • identifying the language of a word in Wikidata doesn't require living native speakers nor a Wikipedia language edition
  • avoid creating codes for macrolanguages when the word belongs to a specific, clearly identified language within that macrolanguage
  • add a script subtag (e.g. "-cyrl") when the writing system isn't the primary one for the language or it's unclear what that is
  • use a region subtag (e.g. "-gb") to describe regional variants
  • use lowercase for subtags
  • ask for the addition of IETF language tags when existing ones don't describe the language of a monolingual string accurately
  • use "mis" while the appropriate tag is not available at Wikidata
  • for lexemes lacking the appropriate language tag, use:
    • the appropriate parent code (e.g. "mis", or an actual language code such as "eo"),
    • followed by "-x-" to introduce a private use subtag,
    • and the QID for the langoid.
Sample: "eo-x-Q3505590" for System H (Q3505590). See query for others: https://w.wiki/cMB
If non-private subtags for IETF language tags are available, these should be requested.
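The recipe for lexemes lacking an appropriate tag (parent code, then "-x-", then the QID) can be sketched as a small Python helper. The function name private_use_tag is hypothetical and the QID check is an assumption about what a reasonable input looks like; it does not verify that the item exists or that it describes a langoid.

```python
import re


def private_use_tag(parent_code, qid):
    """Build '<parent>-x-<QID>', e.g. 'eo-x-Q3505590'."""
    # Assumption: a QID is 'Q' followed by a positive integer.
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a QID: {qid!r}")
    # The parent code is lowercased, following the advice above;
    # the QID keeps its capital Q, as in the sample.
    return f"{parent_code.lower()}-x-{qid}"


print(private_use_tag("eo", "Q3505590"))  # eo-x-Q3505590
print(private_use_tag("mis", "Q1234"))    # mis-x-Q1234
```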