Wikidata talk:WikiProject Names

female given name/male given name or feminine given name/masculine given name?

Personally I think the English labels on Q11879590 and Q12308941 should be changed to "feminine given name" and "masculine given name". Using "female" and "male" seems a little off to me: the names themselves don't have a sex like a living being does, they're simply gendered linguistically or associated with a gender in life. Generally when words are gendered they're described as masculine (Q499327) or feminine (Q1775415), not male or female. English-language Wikipedia uses the categories en:Category:Feminine given names and en:Category:Masculine given names. Unless anyone objects, I would like to change the items. StarTrekker (talk) 09:02, 29 August 2023 (UTC)

Sounds good to me (your proposal). Azertus (talk) 19:35, 29 August 2023 (UTC)
As an additional data point, most definitions on English Wiktionary seem to use male/female [1]. --Azertus (talk) 14:28, 30 August 2023 (UTC)
The members of the project have agreed on male given name and female given name for Wikidata. I see no need to change anything here, especially since it affects hundreds of thousands of records. Please discuss in detail before making further changes. If there are not enough supporters here, please continue the discussion on the Wikidata:Project chat. The consent of one person is not enough for such a far-reaching change, let alone for tasking a bot with changing all records afterwards. This also affects several tools, which would have to be changed accordingly. --Quick-O-Mat (talk) 17:05, 14 October 2023 (UTC)

Konstantinos Tragas or Konstantinos Trangas

Hello, would you have a solution for Commons:Category:Konstantinos Tragas (see also the talk page of George E. Koronaios, which has to do with Greek and Greeklish)? Thank you so much. Lotje (talk) 09:38, 29 August 2023 (UTC)

Norwegian citizen Last names

I have had an excellent query made by @Tagishsimon, giving me a list of all last names for Norwegian citizens. Now I have an Excel spreadsheet containing (only) last names, about 12,500 different ones, for Norwegian political prisoners during WW2. How can I compare these last names and eventually have new last names added to Wikidata? Breg Pmt (talk) 20:01, 24 October 2023 (UTC)

Depends on your needs and knowledge. OpenRefine certainly seems like a good start for this project. --Emu (talk) 07:42, 25 October 2023 (UTC)
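
For anyone who prefers a script to OpenRefine, the comparison itself could be sketched roughly like this. It is only a minimal sketch under assumptions: the spreadsheet is exported as CSV with a column called `surname`, the Wikidata surnames are already available as a plain list of labels, and the function names are made up for illustration. The sample names are placeholders, not real data from either list.

```python
import csv
import io
import unicodedata


def normalize(name: str) -> str:
    """Normalize a surname for comparison: NFC form, trimmed, case-folded."""
    return unicodedata.normalize("NFC", name).strip().casefold()


def read_column(csv_text: str, column: str) -> list[str]:
    """Read one column from CSV text (e.g. a spreadsheet exported as CSV)."""
    return [row[column] for row in csv.DictReader(io.StringIO(csv_text))]


def missing_surnames(spreadsheet_names, wikidata_names) -> list[str]:
    """Return spreadsheet surnames that have no match among the Wikidata labels."""
    known = {normalize(n) for n in wikidata_names}
    missing = {}
    for raw in spreadsheet_names:
        key = normalize(raw)
        if key and key not in known:
            missing.setdefault(key, raw.strip())  # keep the first spelling seen
    return sorted(missing.values(), key=normalize)


# Placeholder data standing in for the spreadsheet export and the query result.
prisoners = read_column("surname\nHansen\nØverland\nhansen\nAasen\n", "surname")
wikidata = ["Hansen", "Olsen", "Aasen"]
print(missing_surnames(prisoners, wikidata))  # ['Øverland']
```

The resulting list would still need a manual check before new items are created, since a missing label can also mean a spelling variant of an existing name.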

Should Japanese names in name in native language (P1559) be specified with the language code "ja-hani" or simply "ja"?

Looking through the results of this query, Japanese names in name in native language (P1559) are specified with the language code "ja-hani" (Japanese (Kanji script)). To native Japanese speakers it sounds pretty odd to deliberately specify the code as "Japanese (Kanji script)" instead of "日本語 (ja)". Although it is accurate to apply "ja-hani" to Japanese names that are written only in Kanji, using these separate language codes could make query results differ between "ja" and "ja-hani" names, which could be detrimental to information retrieval. Therefore, I'd like to ask you to change all of them to "ja" as the standard language code for Japanese names. Doraemonplus (talk) 11:30, 6 November 2023 (UTC)

I think ja is the right code to use. Script, country, etc, subtags are only intended to be used when they're actually necessary, not just because you can. Name items should have writing system (P282) with the writing system and name in kana (P1814) as a qualifier (for names in kanji), so I don't see why it would be necessary to use ja-hani. - Nikki (talk) 21:41, 29 December 2023 (UTC)
A bit late but I agree. --Data Consolidation Officer (talk) 16:02, 13 April 2024 (UTC)

Reducing redundancy

Items for names take up a lot of space in Wikidata. For example, there are just under 600,000 items for surnames. That is only around 0.5% of all items, but the labels on these items which are the same as the English label account for 10% of all labels in Wikidata. The aliases which are the same as the English label account for a third of all aliases.

The size of Wikidata is causing problems, most notably for the query service, which is likely to stop working at some point in the next few years (see Wikidata:SPARQL query service/WDQS backend update) if we continue the way we are.

The developers are working on adding support for using the language code "mul" on labels (phab:T285156), designed to be used for things like this instead of copying the same label to hundreds of languages (and I hope they will also work on adding some simple dynamically generated descriptions after that - phab:T303677).

I think we can reduce the amount of redundancy on items for names before then though:

  • We could remove labels for country variants of a language, if they're the same as the first fallback language, because the fallback language is still the same language/script. This would remove at least 5 million redundant labels.
  • We could do the same for descriptions, for the same reason. This would remove at least 5 million redundant descriptions.
  • We could remove aliases which match another label and are in the wrong script, because they are not needed for searching and are entered under the wrong language anyway. It's hard to calculate using a query, but I think this would remove at least 50 million redundant aliases.

If people agree, I should be able to make a bot to do this.
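
The check behind the first bullet above could be sketched roughly as follows. This is only a minimal sketch, not the actual bot: the function name is made up, and the small `FALLBACK` table is an illustrative assumption standing in for the real fallback chains defined by MediaWiki.

```python
# Illustrative fallback table only; the real chains are defined by MediaWiki.
FALLBACK = {"en-gb": "en", "en-ca": "en", "de-at": "de", "de-ch": "de", "pt-br": "pt"}


def redundant_variant_labels(labels: dict[str, str]) -> list[str]:
    """Return language codes whose label equals the label of their first fallback."""
    redundant = []
    for code, text in labels.items():
        parent = FALLBACK.get(code)
        if parent is not None and labels.get(parent) == text:
            redundant.append(code)
    return sorted(redundant)


# Hypothetical labels of a surname item, keyed by language code.
labels = {"en": "Smith", "en-gb": "Smith", "en-ca": "Smith", "de": "Smith", "de-ch": "Schmid"}
print(redundant_variant_labels(labels))  # ['en-ca', 'en-gb']
```

A variant label that differs from its fallback ("de-ch" in the example) is kept, so only exact duplicates would be removed.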

- Nikki (talk) 21:24, 29 December 2023 (UTC)

I usually never think about the size of Wikidata, but you're right that at a scale this big it has to be considered. I am even okay with letting a property function as a description, such as P31 being name (which would automatically give the description "name" unless it's overwritten by something else), which would also work for removing many languages at once. Also, if something has one name that applies to many languages, could there be ways of combining them?
Anyway, I support this effort and see it as vital to the sustainability of open data. Egezort (talk) 22:32, 29 December 2023 (UTC)

Double given name

I met double given name (Q1243157) for the first time today.

The context was the artist (William) Francis Marshall (Q21459938), with given name (P735) = William Francis (Q104831048) coded by @Arroser: in 2021.

This seems to me quite wrong. IMO, in English at least, William Francis (Q104831048) is just a combination of two first names, not any kind of joint name; and (at least in English) even if somebody is habitually addressed by two first names, IMO (apart from a very few exceptions) those names would not be regarded as a joint or compound first name unless they were hyphenated.

Looking at query https://w.wiki/8hgg it seems there are quite a lot of these.
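
A query of this kind can be sketched roughly as follows. This is not necessarily the query behind the short link above, only a guess at its general shape; the helper name `double_given_name_query` and the `LIMIT` are assumptions, while P735, P31 and Q1243157 are identifiers already mentioned on this page.

```python
def double_given_name_query(limit: int = 100) -> str:
    """Build a SPARQL query listing people whose given name (P735)
    is an instance of double given name (Q1243157)."""
    return f"""
SELECT ?person ?name WHERE {{
  ?person wdt:P735 ?name .      # given name of the person
  ?name wdt:P31 wd:Q1243157 .   # the name item is a double given name
}}
LIMIT {limit}
""".strip()


print(double_given_name_query(10))
```

The printed text can be pasted into the query service; restricting the results to English names would need an extra condition that is not sketched here.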

I see William Francis (Q104831048) was created by User:Moebeus in 2021 and has a Commons category (since 2016). Even so, I believe it should be deleted as not a real thing; along with almost all other English examples of this.

Do others agree? Jheald (talk) 12:47, 2 January 2024 (UTC)

I no longer edit names (except for adding missing ones when I need them). I appreciate the ping; if you want to delete any of the ones I've created, that's okay with me. Moebeus (talk) 15:04, 2 January 2024 (UTC)
@Jheald I am not sure the double given name item is necessary, but joint names which are conventionally spelled with a space in English and not a hyphen are relatively common. Punjabi first names often consist of a first part followed by an honorific, gendered, or tribal suffix; for example, in the name Satwant Kaur (Q113570497) the Kaur part is what makes it a female name. I don't know what is included as "English names" here, as most names used by English speakers are derived from other languages (Francis from Latin, for example), but there are a number of very common Hebrew-origin names spelled with a space in English as well, such as Mary Anne and Anne Marie. I have no idea about William Francis specifically, but spaces on their own should not be treated as a reason for considering a single name to be two names. عُثمان (talk) 20:01, 2 January 2024 (UTC)

Ivan vs. Iwan

There is an article named Iwan which has a redirect from Ivan. The article is linked to Iwan (Q25342533) and the redirect to Ivan (Q830350). Now there is a complaint that it is hard to connect the German article Iwan with e.g. the English article Ivan. I proposed merging these two Wikidata items, but got the answer that this would be disliked here, along with a link to Wikidata:WikiProject Names. I can't see any reason there not to merge these two items, so I'm asking here whether this merge would be a problem. It's the same name: Iwan is the correct German transcription of the Russian name Иван, just as Ivan is the English transcription of it. Senechthon (talk) 22:54, 25 March 2024 (UTC)

There have been numerous discussions about this issue, even with the "Иван" example: Wikidata_talk:WikiProject_Names/Archive/1#Cyrillic_-_values_for_личное_имя_(P735), Wikidata_talk:WikiProject_Names/Archive/1#Constantin_/_Konstantin_/_Constantine_merger_at_Q7111053... The status quo is that even a difference in an accent mark warrants a separate item. --Infovarius (talk) 21:31, 27 March 2024 (UTC)

Which items should be added as P735?

For example, consider a Russian person like d:Q2587276. Should it only have a name item with Cyrillic script like d:Q2253934 as given name (P735)? Or should it also have an item with Latin script like Q18130730 for the name's transcription as given name (P735)? D3rT!m (talk) 15:21, 1 April 2024 (UTC)

@D3rT!m: yes, there should be only one item, the one in the original language (as transcriptions can vary and there can be multiple ones). I corrected the item you linked. Cheers, VIGNERON (talk) 05:33, 9 April 2024 (UTC)
Okay, thank you! D3rT!m (talk) 09:27, 9 April 2024 (UTC)

@D3rT!m: Not an answer, but a remark: This question can be seen in the broader context of how names should be represented. I (still) have the opinion that a “clean” modelling of names would require reifying them, i.e. having one item per name (with properties like given name, surname, included honorifics etc. being properties of that item, transcriptions would be properties of the given name, surname etc.) referred to from a namebearer by something like name (P2561). But I feel that such a modelling wouldn’t be welcome (because it would be perceived as too complicated). (I also think that there are way too many items about humans – they make SPARQL queries time out too easily for them to be able to answer many interesting questions –, but that’s another can of worms.) --Data Consolidation Officer (talk) 16:00, 13 April 2024 (UTC)
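
To make the shape of such a reified model concrete, here is a rough sketch as plain data classes. It is only an illustration of the idea described above, not an existing Wikidata schema: the class and field names are hypothetical, and the example surname is made up (the given name and its transcriptions are the Иван / Ivan / Iwan example discussed on this page).

```python
from dataclasses import dataclass, field


@dataclass
class NamePart:
    """One component of a name, e.g. a given name or a surname."""
    role: str  # hypothetical roles: "given name", "surname", "honorific"
    text: str  # spelling in the original script
    transcriptions: dict[str, str] = field(default_factory=dict)  # language code -> transcription


@dataclass
class ReifiedName:
    """A full name modelled as its own item, referred to from the name bearer."""
    parts: list[NamePart]

    def original(self) -> str:
        return " ".join(p.text for p in self.parts)

    def transcribed(self, lang: str) -> str:
        # Fall back to the original spelling when no transcription is recorded.
        return " ".join(p.transcriptions.get(lang, p.text) for p in self.parts)


name = ReifiedName([
    NamePart("given name", "Иван", {"en": "Ivan", "de": "Iwan"}),
    NamePart("surname", "Иванов", {"en": "Ivanov", "de": "Iwanow"}),
])
print(name.transcribed("de"))  # Iwan Iwanow
```

In this sketch the person would point to one `ReifiedName`, and the transcriptions hang off the individual parts instead of being separate name items.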

@Data Consolidation Officer: I missed your answer, what do you propose exactly? (I'm not sure what your proposal would improve/solve.) Names are very hard and the current solution is not perfect, but I guess it's the best trade-off (and if anything, I would split same names into more items, like Berger (Q1260304), which presently conflates two very different names in French and German). Cheers, VIGNERON (talk) 12:06, 21 April 2024 (UTC)