Wikidata:Property proposal/Ukrainian romanization

Ukrainian national romanization

Originally proposed at Wikidata:Property proposal/Generic

Done: Ukrainian national romanization (P9373) (Talk and documentation)

Description	romanization method for Ukrainian text (transliteration from Ukrainian Cyrillic alphabet to Latin alphabet)
Represents	Ukrainian National System (Q94489556)
Data type	String
Template parameter	Any title or name in a Ukrainian subject, including official romanized name for Ukrainian place names.
Domain	Any Ukrainian text. Geography and biography. Lexemes and forms. Should be tagged with `uk-Latn`.[1]
Allowed values	Any text. Romanized Ukrainian text matches the regex `[a-ik-pr-vyzA-IK-PR-VYZ ]+`, but values may also include punctuation, foreign-language text, etc.
Example 1	Universe (Q1): всесвіт → vsesvit
Example 2	Kyiv (Q1899): Київ → Kyiv
Example 3	Viktor Yushchenko (Q1459699): Віктор Андрійович Ющенко → Viktor Andriiovych Yushchenko
Example 4	Ukrainian National System (Q94489556): Про впорядкування транслітерації українського алфавіту латиницею → Pro vporiadkuvannia transliteratsii ukrainskoho alfavitu latynytseiu
Source	w:en:Romanization of Ukrainian, United Nations Romanization System in Ukraine (PDF), PCGN Romanization of Ukrainian (PDF)
Planned use	Marking official Latin-alphabet place names in Ukraine. Importing or validating them from Ukraine’s open data portal data.gov.ua. Transcluding them into the “name_official” field in infoboxes on en.Wikipedia.
See also	Sixteen other instances of transliteration system (Q20085670): Query. Subject item of this property: Ukrainian National System (Q94489556). Other romanization systems for Ukrainian (subjects): ALA-LC romanization of Ukrainian (Q100903064), BGN/PCGN romanization of Ukrainian (Q56345230), German romanization of Ukrainian (Duden) (Q56343745), ISO 9 (Q913336).
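
The pattern in the Allowed values row can be checked against the four examples in the table. The following is a minimal Python sketch, assuming that a whole-string match is the intended check (the proposal only gives the character class). The function name `is_plain_national_romanization` is made up for illustration.

```python
import re

# Character class copied from the "Allowed values" row: the Latin letters the
# Ukrainian National System can produce (no j, q, w, x) plus the space.
NATIONAL_PATTERN = re.compile(r"[a-ik-pr-vyzA-IK-PR-VYZ ]+")


def is_plain_national_romanization(text: str) -> bool:
    """Return True if the whole string uses only characters from the pattern.

    Assumption: a full match is wanted. Real values may also contain
    punctuation or foreign-language text, which this check rejects.
    """
    return NATIONAL_PATTERN.fullmatch(text) is not None


examples = [
    "vsesvit",
    "Kyiv",
    "Viktor Andriiovych Yushchenko",
    "Pro vporiadkuvannia transliteratsii ukrainskoho alfavitu latynytseiu",
]
for value in examples:
    print(value, is_plain_national_romanization(value))
```

A value such as "Kyjiv" fails the check because j is not in the character class, so the pattern can help flag strings romanized with a different scheme.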

Motivation

This is necessary for using romanized Ukrainian in most contexts. I want to get Ukrainian official Latin-alphabet names into en.Wikipedia articles. —Michael Z. 2020-10-29 03:35 z 03:35, 29 October 2020 (UTC)

Technical description

This standard is mandated for official documents in Ukraine, including passports, geographical names and maps, public signage, street names, transit, etc. The initial version was adopted by Ukraine in 1994 and the current version was enacted into law in 2010; it was adopted by the United Nations in 2012 and by the BGN/PCGN in 2019, so it is widely used and here to stay. It was adopted in en.Wikipedia for geographic names in 2008 and for almost everything else in 2019. It is a simple and accessible transcription method, based on English phonemes (kh, ts, ch, sh, shch, iu/yu, ia/ya, rather than the ch, c, č, š, šč, ju, ja of European “international” schemes), that forgoes diacritics and is designed for international use. It is not reversible: the original Cyrillic text cannot be reliably derived from the romanized form. It could be reliably generated by an algorithm.
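
To illustrate the claim that the romanization can be generated by an algorithm, here is a minimal Python sketch. The letter table and the two positional rules (word-initial є, ї, й, ю, я, and зг written as zgh) are reproduced from memory of the 2010 resolution and should be verified against the official text. The function name `romanize_national` and the simple capitalization rule are assumptions made for this example.

```python
# Base table, lowercase only. Assumption: reproduced from the 2010 resolution;
# check against the official table before relying on it.
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "є": "ie", "ж": "zh", "з": "z", "и": "y", "і": "i", "ї": "i", "й": "i",
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "shch", "ю": "iu", "я": "ia",
    # The soft sign and the apostrophe variants are dropped.
    "ь": "", "'": "", "\u2019": "", "\u02bc": "",
}
# Letters that are written differently at the start of a word.
WORD_INITIAL = {"є": "ye", "ї": "yi", "й": "y", "ю": "yu", "я": "ya"}


def romanize_national(text: str) -> str:
    out = []
    prev = ""  # previous source character, lowercased
    for ch in text:
        low = ch.lower()
        if low not in BASE:
            out.append(ch)  # spaces, punctuation and foreign text pass through
        else:
            if low in WORD_INITIAL and prev not in BASE:
                latin = WORD_INITIAL[low]   # start of text or after a non-letter
            elif low == "г" and prev == "з":
                latin = "gh"                # "зг" becomes "zgh", not "zh"
            else:
                latin = BASE[low]
            # Simplification: only the first Latin letter is capitalized, so
            # all-caps input (e.g. "ЮЩЕНКО") is not handled properly.
            out.append(latin.capitalize() if ch != low else latin)
        prev = low
    return "".join(out)


print(romanize_national("Київ"))  # Kyiv
print(romanize_national("Віктор Андрійович Ющенко"))
```

Because the apostrophe is in the table, a letter that follows it is treated as word-internal, so "Знам'янка" gives "Znamianka". An apostrophe used as an opening quotation mark would be misread by this sketch.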

Discussion

Why not, but please specify that it's an English romanisation (zh for this romanisation, whilst it would be j in French, for instance). Bouzinac 10:40, 29 October 2020 (UTC)
I alluded to it, but now I’ve added more technical description. I would like to see other romanizations added as well, including European “international” ones in addition to ISO 9:1995 (P2183), as well as national ones like the French Dictionnaire de la langue française and German Duden and DIN. —Michael Z. 2020-10-29 15:00 z 15:00, 29 October 2020 (UTC)
Trimmed the description. It's way too long. NMaia (talk) 06:11, 2 November 2020 (UTC)
Looking at the English descriptions of similar properties ( https://w.wiki/kQX ), I'm trying to figure out which one best describes the expected value. To follow them, maybe "romanization according to the Ukrainian National System" or "representation according to the Ukrainian National System of transliteration of Cyrillic text to Latin script" could work. I think it helps to clearly identify the method used in the description. BTW, the property value is the result, not the method itself. As long as it fits in the field, I wouldn't worry about it being too long. Given the state of descriptions of similar properties, I'd move ahead and improve them later on. Unless there is some good reason why these two aren't needed, I'd support their creation. --- Jura 14:03, 6 November 2020 (UTC)
Ah, this description will appear in the context of using this property. That wasn’t evident from the proposal form. I wrote it for evaluators of this proposal. I’ll give it a bit of thought and maybe revise. —Michael Z. 2020-11-06 15:18 z 15:18, 6 November 2020 (UTC)
- I edited some of the other descriptions. For a query with the items for each system, see https://w.wiki/kUp . Maybe @Nikki:, who proposed some of the other properties, wants to comment. --- Jura 07:42, 7 November 2020 (UTC)
  - Thank you. Distilling each respective property’s description:
    - <system> transliteration of a <language> text (usually to be used as a qualifier)
    - reading of a <language> name in <script>
    - reading of a <language> name in <script>
    - romanization of <language> with <system>
    - romanisation following <system> of <language>
    - romanized <language> following <system>
    - transliteration of text from <script> to Latin script according to <system>
    - Latin transliteration of <script> text (used as qualifier)
    - conversion of text to alternate script (use as a qualifier for monolingual text statements; please use specific property if possible)
    - transliteration from <script> to Latin script with <system>
    - official <country> transliteration method for <language>
    - transliteration from <language> to Latin script with <system>
    - representation of a IPA phoneme in <script>
    - transliteration from <script> to Latin script following <system>
    - representation according to <system> for transliterating <set of scripts>
    - transliteration of <script> and <set of scripts>
    - transcription of name in <script>
  - Each of those is a noun phrase. Most can be interpreted as referring to the converted text representation, one to the underlying original-language expression, and one to a method of text conversion.
  - On the other hand, I wonder if it might be clearer to make the noun implicit and refer to the application of the method. For example, label: romanized by the Ukrainian National System; description: text transliterated from the Ukrainian Cyrillic alphabet to the Latin alphabet according to the Ukrainian National System of 2010. This may help editors understand how to use the property as a qualifier on derived text, and it also clearly distinguishes the property from the method, Ukrainian National System (Q94489556). —Michael Z. 2020-11-07 22:46 z 22:46, 7 November 2020 (UTC)

ALA-LC romanization for Ukrainian

Originally proposed at Wikidata:Property proposal/Generic

Done: ALA-LC romanization for Ukrainian (P9453) (Talk and documentation)

Description	romanization method for Ukrainian text (transliteration from Ukrainian Cyrillic alphabet to Latin alphabet)
Represents	ALA-LC romanization of Ukrainian (Q100903064)
Data type	String
Template parameter	In any cited Ukrainian-language reference source, to represent title, author, place of publication, etc.
Domain	Any Ukrainian text. Bibliographic data. Lexemes and Forms. Should be tagged with `uk-Latn` or `uk-Latn-alalc97`.[2]
Allowed values	Any text. Romanized Ukrainian text matches the regex `[a-iĭïk-pr-vyzA-IĬÏK-PR-VYZ ͡ʹ]+`, but values may also include punctuation, foreign-language text, etc. (Not sure how regex handles the combining double inverted breve ͡ (Q87498562), which occurs in the combinations i͡e I͡e I͡E z͡h Z͡h Z͡H t͡s T͡s T͡S i͡u I͡u I͡U i͡a I͡a I͡A, or its equivalent with the combining ligatures ︠ (Q87544019) and ︡ (Q87544022): i︠e︡ I︠E︡ z︠h︡ Z︠H︡ t︠s︡ T︠S︡ i︠u︡ I︠U︡ i︠a︡ I︠A︡.)
Example 1	Universe (Q1): всесвіт → vsesvit
Example 2	Kyiv (Q1899): Київ → Kyïv
Example 3	Viktor Yushchenko (Q1459699): Віктор Андрійович Ющенко → Viktor Andriĭovych I͡ushchenko
Example 4	Ukrainian National System (Q94489556): Про впорядкування транслітерації українського алфавіту латиницею → Pro vpori͡adkuvanni͡a transliterat͡siï ukraïns′koho alfavitu latynyt͡sei͡u
Source	w:en:Romanization of Ukrainian, Library of Congress Ukrainian (PDF)
Planned use	Adding transliterations of Ukrainian-language reference citations.
See also	Sixteen other instances of transliteration system (Q20085670): Query. Subject item of this property: ALA-LC romanization of Ukrainian (Q100903064). Other romanization systems for Ukrainian (subjects): BGN/PCGN romanization of Ukrainian (Q56345230), German romanization of Ukrainian (Duden) (Q56343745), ISO 9 (Q913336), Ukrainian National System (Q94489556).
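
On the open question in the Allowed values row about how a regex handles the combining double inverted breve: in Python's `re` module, a combining mark inside a character class is simply one more code point, so it matches wherever it occurs in the string. The sketch below writes the class with escapes so that every code point is visible. It assumes a whole-string match is wanted and that the soft-sign mark is the modifier letter prime (U+02B9). The names `ALA_LC_PATTERN` and `is_plain_ala_lc` are made up for illustration, and other regex engines may behave differently.

```python
import re
import unicodedata

TIE = "\u0361"             # combining double inverted breve
PRIME = "\u02b9"           # modifier letter prime (assumed soft-sign mark)
LIGATURE_LEFT = "\ufe20"   # combining ligature left half
LIGATURE_RIGHT = "\ufe21"  # combining ligature right half

# Same letters as the class in the "Allowed values" row; the two ligature
# halves are added here as an optional extension.
ALA_LC_PATTERN = re.compile(
    "[a-i\u012d\u00efk-pr-vyzA-I\u012c\u00cfK-PR-VYZ "
    + TIE + PRIME + LIGATURE_LEFT + LIGATURE_RIGHT
    + "]+"
)


def is_plain_ala_lc(text: str) -> bool:
    # NFC folds "i" + combining breve or diaeresis into the precomposed
    # letters listed in the class; the tie has no precomposed form and
    # stays a separate code point.
    normalized = unicodedata.normalize("NFC", text)
    return ALA_LC_PATTERN.fullmatch(normalized) is not None


print(is_plain_ala_lc("Viktor Andri\u012dovych I\u0361ushchenko"))
print(is_plain_ala_lc("Kyi\u0308v"))  # decomposed i + diaeresis
```

Without the normalization step, the decomposed spelling of ï would not match, because U+0308 is not in the class. Look-alike characters such as the prime (U+2032) are also rejected, so a value typed with that character in place of U+02B9 would need the class to be extended.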

Motivation

This is necessary for bibliographic data about Ukrainian-language sources. —Michael Z. 2020-10-29 03:35 z 03:35, 29 October 2020 (UTC)

Technical description

This is a standard of the American Library Association and Library of Congress, used in English-language library catalogues, by publishers, etc. in the USA, UK, Canada, and elsewhere. It is a precise transcription method, based on English phonemes (kh, t͡s, ch, sh, shch, i͡u, i͡a, rather than the ch, c, č, š, šč, ju, ja of European “international” schemes). It is unknown whether it is reversible, that is, whether the original Cyrillic text can be derived from the romanized form. It could be reliably generated by an algorithm.
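
As a rough illustration of algorithmic generation, the sketch below treats ALA-LC as a letter-by-letter mapping with no positional rules. The table is an assumption based on the combinations listed in this proposal and on the Library of Congress table; the apostrophe is left untouched because this proposal does not say how it is handled. The function name `romanize_ala_lc` is hypothetical.

```python
TIE = "\u0361"  # combining double inverted breve, placed after the first letter

# Lowercase table. Assumption: consistent with the examples in this proposal;
# verify against the Library of Congress table before relying on it.
ALA_LC = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "є": "i" + TIE + "e", "ж": "z" + TIE + "h", "з": "z", "и": "y",
    "і": "i", "ї": "\u00ef", "й": "\u012d", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "t" + TIE + "s", "ч": "ch", "ш": "sh",
    "щ": "shch", "ю": "i" + TIE + "u", "я": "i" + TIE + "a",
    "ь": "\u02b9",  # modifier letter prime
}


def romanize_ala_lc(text: str) -> str:
    out = []
    for ch in text:
        low = ch.lower()
        latin = ALA_LC.get(low)
        if latin is None:
            out.append(ch)                  # anything not in the table passes through
        elif ch != low:
            out.append(latin.capitalize())  # "Ю" becomes "I" + tie + "u"
        else:
            out.append(latin)
    return "".join(out)


print(romanize_ala_lc("Київ"))
print(romanize_ala_lc("Віктор Андрійович Ющенко"))
```

On the reversibility question: under this assumed table, "кг" and "х" both come out as "kh", and "шч" and "щ" both come out as "shch", which suggests the mapping is not strictly reversible. That observation depends on the table above being accurate.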

Discussion

Support both - good to have these romanizations available here. ArthurPSmith (talk) 18:22, 25 March 2021 (UTC)

@Mzajac, ArthurPSmith: I have created Ukrainian national romanization (P9373), but I am wondering on which entities we should use this property. Should we apply it to items, as suggested here, or to lexeme forms, as for other romanization systems (see Cantonese Transliteration Scheme transliteration (P9323) for example)? Or maybe both? Pamputt (talk) 07:46, 29 March 2021 (UTC)

Thanks! Sorry, I don’t have much insight on the question as I haven’t edited lexemes. These properties are very much needed to annotate native names for people, places, and organizations, for given names and family names, for creative works’ titles, subtitles, and author name strings, and for the native names of things and concepts native to the respective languages. I imagine lexemes can replace or supplement some of this, but not all of it. 14:01, 29 March 2021 (UTC)

(That said, I am puzzled by the main value constraint, as I have added hundreds of transliterations mainly as qualifiers.) —Michael Z. 14:06, 29 March 2021 (UTC)

Yes, I thought this was mostly for a qualifier on monolingual text written in Ukrainian; I believe that's how we use most other such romanization properties. The proposal also mentioned use with lexemes, which is fine too, and there it would presumably be a main value. ArthurPSmith (talk) 19:15, 30 March 2021 (UTC)
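
The two usages discussed above can be sketched as statement data in the Wikibase JSON layout. This is only an illustration: the choice of native label (P1705) as the monolingual text property is an assumption, and the helper names are hypothetical.

```python
import json


def string_snak(prop: str, value: str) -> dict:
    """Build a snak holding a plain string value."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"value": value, "type": "string"},
    }


def as_qualifier(native_text: str, romanized: str) -> dict:
    """Usage 1: P9373 qualifies a monolingual text statement on an item.

    Assumption: P1705 (native label) stands in for whichever monolingual
    text property carries the Ukrainian name.
    """
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": "P1705",
            "datavalue": {
                "value": {"text": native_text, "language": "uk"},
                "type": "monolingualtext",
            },
        },
        "qualifiers": {"P9373": [string_snak("P9373", romanized)]},
    }


def as_main_value(romanized: str) -> dict:
    """Usage 2: P9373 is the main value, e.g. on a lexeme form."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": string_snak("P9373", romanized),
    }


print(json.dumps(as_qualifier("Київ", "Kyiv"), ensure_ascii=False, indent=2))
print(json.dumps(as_main_value("vsesvit"), ensure_ascii=False, indent=2))
```

In the first shape the romanization is tied to one specific Ukrainian string; in the second it stands alone, which only makes sense where the entity itself is the Ukrainian text, as with a lexeme form.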

@Mzajac, ArthurPSmith: ALA-LC romanization for Ukrainian (P9453) Done. Pamputt (talk) 16:27, 17 April 2021 (UTC)