Wikidata:Properties for deletion/P9453

ALA-LC romanization for Ukrainian (P9453)

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.

No consensus to delete at this time. Please consider a well-framed RfC to look at the whole issue in more detail. -- Martin (MSGJ · talk) 20:58, 27 May 2022 (UTC)

We already have a property for ALA-LC romanisation - ALA-LC romanization (P8991) - and I don't see any need for this one to be separate. It has only been used 6 times so far, so migrating the statements would be easy and deleting it shouldn't break anything.

(pinging people from the original proposal @ArthurPSmith, Pamputt, Mzajac:)

--Nikki (talk) 08:04, 30 August 2021 (UTC)

I just noticed on the property talk page that Mzajac thinks we should have specific properties for every romanisation table ALA-LC has. I don't think that's a good idea because it would require a lot of properties (making it much harder for people to find the right one to use) for little benefit. On https://www.loc.gov/catdir/cpso/roman.html there are 75 different files (which sometimes have different variations for different languages, e.g. Non-Slavic Languages (in Cyrillic Script)) and the index of languages covers around 150 different languages. The tables can also change over time (e.g. the 1997 version for Korean would use "dijain" for "디자인", while the 2009 version would use "tijain").

For the most part, there is one ALA-LC romanisation system for each language and script. In the cases where it's ambiguous (e.g. Khakass in the non-Slavic languages file), I would suggest using determination method (P459) to specify the exact version.

- Nikki (talk) 08:50, 30 August 2021 (UTC)

Nikki makes a good point here; I have no strong opinion on whether this is really needed or not. ArthurPSmith (talk) 13:32, 30 August 2021 (UTC)

Firstly, a romanization is precisely defined by a single table. For example, ALA-LC Romanization for Ukrainian and ALA-LC Romanization for Russian are two different tables, intended for alphabets that share many of the same letters, some of which are pronounced differently. Applied to the same string of characters, they can yield different results. ALA-LC Romanization is a very large set of tables. For Slavic languages, there are also some simplified variations (“modified Library of Congress”) intended for use in text rather than for library cataloguing. ALA-LC has tried to maintain some phonological consistency across its tables, so that languages can be compared. In contrast, BGN/PCGN is a set of tables coming from different national authorities, so its schemes are diverse. Some systems are for only one language and have only one table. There are also “universal” transliteration systems like ISO 9:1995, which has one table for all Cyrillic-script text. Some systems have changed their standards over time, so, e.g., there’s a BGN/PCGN-1965 romanization for Ukrainian and a 2019 one. For unambiguous identification, and for automation, just “ALA-LC romanization” is not good enough.
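To illustrate the point that two tables can give different results for the same string, here is a minimal Python sketch. The two dictionaries are tiny hand-picked excerpts, not the full ALA-LC tables, and the names `UKRAINIAN_EXCERPT`, `RUSSIAN_EXCERPT` and `romanize` are invented for this example. It relies only on the well-known differences that г is romanized as "h" for Ukrainian and "g" for Russian, and и as "y" for Ukrainian and "i" for Russian.

```python
# Hypothetical excerpts of two romanization tables (lowercase only).
# These are NOT the complete ALA-LC tables, just enough letters for the demo.
UKRAINIAN_EXCERPT = {"г": "h", "и": "y", "м": "m", "о": "o", "р": "r", "а": "a"}
RUSSIAN_EXCERPT = {"г": "g", "и": "i", "м": "m", "о": "o", "р": "r", "а": "a"}


def romanize(text: str, table: dict[str, str]) -> str:
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


# The same Cyrillic strings, romanized with each table.
for word in ("гора", "мир"):
    print(word, romanize(word, UKRAINIAN_EXCERPT), romanize(word, RUSSIAN_EXCERPT))
```

A real table would also need uppercase letters, ligature marks and context-dependent rules, which this sketch leaves out.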

Secondly, and more importantly, romanization as it is modelled in Wikidata is a hot mess, and solitary changes like this won’t fix it. We need to create an overall framework. I started proposing properties based on others already present. But even now I can’t hold in my head at once all of the ways to indicate romanization here: sometimes it’s a statement (for items representing names), sometimes a property on a statement, sometimes a determination method property, and I don’t even know anything about how lexemes work. It applies differently to monolingual text or generic strings that may contain multiple languages. So I have no idea whether deleting this property is an improvement or not.

In comparison, IETF language tagging is a flexible scheme for identifying romanized text with a variable amount of precision, and this might be a useful model to inform us going forward. I can construct a language tag that indicates language, writing system, or method of romanization referring to a general system or a specific version of it. --Michael Z. 17:26, 3 September 2021 (UTC)
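The variable precision of language tags mentioned above can be sketched in Python as follows. The helper `romanized_tag` is invented for this example and does no validation against the IANA subtag registry. The variant subtag `alalc97` is, as far as I know, the registered subtag for the 1997 edition of ALA-LC romanization; treat its use here as an assumption to check against the registry.

```python
from typing import Optional


def romanized_tag(language: str,
                  script: Optional[str] = None,
                  variant: Optional[str] = None) -> str:
    """Join subtags in the order language-script-variant (hypothetical helper).

    No registry validation is performed; the caller supplies valid subtags.
    """
    parts = [language.lower()]
    if script:
        parts.append(script.title())   # script subtags are conventionally Title case
    if variant:
        parts.append(variant.lower())  # variant subtags are conventionally lowercase
    return "-".join(parts)


# Increasing precision: language only, language in Latin script,
# language in Latin script according to a specific romanization system.
print(romanized_tag("uk"))
print(romanized_tag("uk", "Latn"))
print(romanized_tag("uk", "Latn", "alalc97"))
```

Each added subtag narrows the meaning, so a consumer that only understands `uk-Latn` can still make sense of the more specific tag.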

Keep. Per Michael above, the scale of the cultural differences between RU and UA makes the potential merging of properties a no-go. --Liuxinyu970226 (talk) 01:04, 7 September 2021 (UTC)

I don't see why ALA-LC romanization (P8991) could not be used, qualified by the particular table: ALA-LC romanization of Ukrainian (Q100903064) has an item. I tried it out on Б (Q16291). UWashPrincipalCataloger (talk) 21:32, 28 September 2021 (UTC)
Because if I remember correctly, Г (Q16335) is pronounced as /ɦ/ in Ukrainian, not /g/? Liuxinyu970226 (talk) 07:15, 11 January 2022 (UTC)
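For readers weighing the suggestion above to use the general property with a qualifier naming the table, here is a rough Python sketch of what such a statement could look like. The `Statement` class and `table_of` helper are a simplified model invented for this example, not the Wikibase data format. The choice of determination method (P459) as the qualifier follows Nikki's earlier suggestion, and the value "b" is an illustrative assumption, not necessarily what was entered on Б (Q16291).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Statement:
    """Simplified, hypothetical model of a statement with qualifiers."""
    property_id: str
    value: str
    qualifiers: dict[str, str] = field(default_factory=dict)


def table_of(statement: Statement) -> Optional[str]:
    """Return the item naming the romanization table, if one is given via P459."""
    return statement.qualifiers.get("P459")


# General ALA-LC property (P8991), qualified by the Ukrainian table (Q100903064).
qualified = Statement("P8991", "b", {"P459": "Q100903064"})
# The same property without a qualifier leaves the table unspecified.
unqualified = Statement("P8991", "b")

print(table_of(qualified))
print(table_of(unqualified))
```

Under this model, a consumer that needs the exact table has to read the qualifier, whereas a dedicated property such as P9453 would carry that information in the property itself.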

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.