Wikidata:Property proposal/Latvian transcription

Latvian transcription

Originally proposed at Wikidata:Property proposal/Person

Description: transcription of name in Latvian orthography
Data type: String
Domain: given name or family name
Example 1: Depardieu (Q25212638) → Depardjē
Example 2: Gerard (Q1261347) → Žerārs
Example 3: Eduard (Q17190238) → Eduards
Example 4: Klein (Q428024) → Kleins
Example 5: Kleins (Q83385239) → novalue (or no statement)
Example 6: Palmer (Q568872) → Pālmers
Example 7: Liu (Q804970) → Liu
Planned use: add to given name/family name items
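
As a rough illustration of the examples above, the sketch below stores them in a plain Python mapping. The names `NO_VALUE`, `EXAMPLES` and `latvian_transcription` are made up for this sketch and are not part of Wikidata or any tool. It only aims to show the difference between a name with an explicit "novalue" (Kleins) and a name whose transcription happens to equal the original spelling (Liu).

```python
from typing import Optional

# Hypothetical sentinel standing in for a "novalue" statement.
NO_VALUE = None

# QID -> (name as spelled on the item, proposed Latvian transcription).
# Values are copied from the examples listed in the proposal.
EXAMPLES: dict[str, tuple[str, Optional[str]]] = {
    "Q25212638": ("Depardieu", "Depardjē"),
    "Q1261347": ("Gerard", "Žerārs"),
    "Q17190238": ("Eduard", "Eduards"),
    "Q428024": ("Klein", "Kleins"),
    "Q83385239": ("Kleins", NO_VALUE),  # already a Latvian spelling
    "Q568872": ("Palmer", "Pālmers"),
    "Q804970": ("Liu", "Liu"),  # transcription equals the original
}


def latvian_transcription(qid: str) -> Optional[str]:
    """Return the example transcription for a name item, or NO_VALUE."""
    _name, transcription = EXAMPLES[qid]
    return transcription
```

With this layout, `latvian_transcription("Q83385239")` yields the sentinel, while `latvian_transcription("Q804970")` yields the unchanged string "Liu".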

  Notified participants of WikiProject Names

Motivation

See w:Latvian_name#Spelling as an introduction. This was brought up on WikiProject Names some time ago, but we haven't really solved it.

Currently these transcriptions are sometimes inserted as labels in Latvian. While this works fine on items for people (e.g. "Gerard Depardieu"), it works less well on items for given names or family names. There it makes them irregular when compared to other languages with Latin script. Accordingly, placing them in a separate property on name items seems preferable.

Please add more samples. --- Jura 00:07, 24 November 2019 (UTC)

Discussion

I have not done much work with this on Wikidata yet. However, I am not convinced that using the label for this is a bad approach. The original spelling can be included in the description and as an alias (as is done in Russian). This makes me wonder what the Latvian label for Russian names (for example Eduard (Q17190238)) should be if it is decided to use the proposed property. I am aware that Lithuanians also practice transcription (example lt:Donaldas Trampas), although not as strictly as Latvians. --Papuass (talk) 08:50, 28 November 2019 (UTC)

Adding website which contains a database of person names + additional resources. --Papuass (talk) 09:07, 28 November 2019 (UTC)
@Papuass: For name items, I think the initial approach was to have various labels depending on the language, even for languages with the same script, as most languages do sometimes use an equivalent for a name. At some point, someone switched that to the current system, where different spellings in the same script are added to separate items only. The item can then hold transliterations/transcriptions (with one of the many corresponding properties) and link to equivalent names with "see also". It takes some amount of explaining to get there, but I think it has proven to be fairly simple and I'd rather not start making exceptions to that. --- Jura 17:00, 1 December 2019 (UTC)
Good Morning Jura! Why are you notifying me, did you find a mistake somewhere? --HarryNº2 (talk) 08:56, 23 January 2020 (UTC)
No, I noticed you frequently edit name items (where this property is to be used). Checking now, I notice you also add the same string to en and lv (as this proposal suggests). --- Jura 09:00, 23 January 2020 (UTC)
The problem is caused by this script [1] from @Harmonia Amanda:. There is a similar problem with female Czech and Slovak family names that generally end in -ová, e.g. Angela Merkel → Angela Merkelová or Hillary Clinton → Hillary Clintonová. --HarryNº2 (talk) 09:39, 23 January 2020 (UTC)
I don't think it's a problem, it's just two different ways of handling an aspect. To link "Merkelová" from "Merkel", there is surname for other sex (P5278). --- Jura 09:50, 23 January 2020 (UTC)
Maybe we should talk about that on the project page. --HarryNº2 (talk) 10:12, 23 January 2020 (UTC)
The gender inflection? (It could use some work, especially as P5278 isn't much used. I don't think it needs to be solved the same way).
Latvian had been raised there. --- Jura 10:16, 23 January 2020 (UTC)
OK, I will stop my work on the project until a solution is found. HarryNº2 (talk) 10:56, 23 January 2020 (UTC)
@HarryNº2: I don't think it should impact your edits. BTW, do you support the proposal? --- Jura 13:20, 9 February 2020 (UTC)
The namescript deals with Latvian as it does with Cyrillic-script languages (which sometimes transliterate between each other and sometimes do not; it is really difficult to know which case applies in a fully automated way that a generic script can handle). So the namescript adds the Latin-script string as the label, mentions it in the description, and adds it as an alias. The people doing the editing can then change the label (if it is in fact a Spanish name that should be transliterated) or delete the alias (if it is a Latvian name). That was the technical solution chosen at WikidataCon 2017, where we had a discussion on this (and on other technical problems affecting automated scripts with a broad scope). The idea was clearly not to set in stone that the English label should be the same as the Latvian one, only to say that it was not a wrong value in any case, so we could use it, and someone who knows Latvian could easily query for it and make appropriate changes. Namescript never reached official script status and is not used by bots, so I never got around to explaining all its quirks, since it was only used by a few people who knew how to use it. If it is starting to gain traction, I should definitely write an explanation page.
That said, I fail to see why we need a new property. The label in Latvian should be changed to reflect the transliteration in that language (even if the original name is in Latin script too), and we should also add the generic "transliteration" property with the appropriate qualifier. --Harmonia Amanda (talk) 12:12, 23 January 2020 (UTC)
  • I'm not sure whether the script works as intended: at Palmer (Q568872), for example, the transcription label "Pālmers" had been present, but the script set it to "Palmer". Even if it hadn't been present before, there is now no way of telling whether a Latvian label is correct or was just set by default. It might have worked if the script had avoided overwriting existing labels and had left missing ones blank by default. --- Jura 04:39, 24 January 2020 (UTC)
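
To make the suggested behaviour concrete, here is a minimal Python sketch of a label-filling policy that never overwrites an existing label and leaves the label blank for languages that transcribe names. It is not the actual namescript; the function `fill_labels` and the set `TRANSCRIBING_LANGUAGES` are hypothetical names chosen for illustration, and treating only "lv" as a transcribing language is an assumption.

```python
# Assumption for this sketch: languages whose labels need a human-checked
# transcription rather than a copy of the Latin-script spelling.
TRANSCRIBING_LANGUAGES = {"lv"}


def fill_labels(labels: dict[str, str], name: str, languages: list[str]) -> dict[str, str]:
    """Return a copy of `labels` with `name` added where it is safe to do so.

    - An existing label is never overwritten.
    - For transcribing languages a missing label stays missing, so a blank
      label signals "not yet checked" instead of a possibly wrong default.
    """
    updated = dict(labels)
    for lang in languages:
        if lang in updated:
            continue  # keep whatever an editor already entered
        if lang in TRANSCRIBING_LANGUAGES:
            continue  # leave blank for someone who knows the language
        updated[lang] = name
    return updated
```

Under this policy an item that already carries the Latvian label "Pālmers" keeps it, and an item without a Latvian label simply gets none, so the labels that are present can be trusted as deliberate.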
  Weak support   Comment I see the problem and this would maybe help with it, but I'm not a big fan of language-specific properties ("Scandinavian middle name", "second name in a Spanish family name", etc.) as I see no end to the different variations. Could this perhaps be solved by expanding the scope of Property:P2440, or could it be made more generic and not locked to Latvian? After all, there are many more languages that do this, or something similar, and I wouldn't like to see a property for each. Moebeus (talk) 10:58, 23 January 2020 (UTC)
  • It does, but to me it also confirms we're maybe heading down the wrong path here in terms of the modelling: the list of Property:P2440 subproperties can easily grow tenfold and it would still not be enough. I still support your proposal, but I hope we can circle back to the larger issue in a future session. Moebeus (talk) 09:27, 16 February 2020 (UTC)
  Weak support Will we have transcription properties for all languages with these problems? What about a generic "transcription" property, with a qualifier like "in language Latvian"? --Frettie (talk) 12:02, 23 January 2020 (UTC)

Still confused

@Jura1: I am still confused about whether this property helps me. What should the Latvian label of Iya (Q4205774) be now? The proper transcription is "Ija". But that label is also needed for the Latvian name Ija (Q97021357) (with the same description, which causes a conflict). Papuass (talk) 08:51, 7 July 2020 (UTC)

  • Not really, as Q4205774 is not in Latin script and Q97021357 is about the Latvian name. I did fix the English descriptions of both.
If there were a Latin-script name that was transcribed into Latvian as "Ija", the property could be of use. I suppose you could still add it to Q4205774, but that isn't its primary use.
Hope that helps. --- Jura 09:00, 7 July 2020 (UTC)