Wikidata:Requests for permissions/Bot/AmmarBot
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved--Ymblanter (talk) 19:57, 23 July 2021 (UTC)[reply]
AmmarBot
AmmarBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ammarpad (talk • contribs • logs)
Task/s: Replace the object named as (P1932) qualifier with subject named as (P1810) on GND ID (P227) statements.
Code: update_gnd_id_qualifiers.py
Function details:
- Find items that are instance of human (Q5) and have a GND ID (P227) statement with an object named as (P1932) qualifier
- Replace object named as (P1932) with subject named as (P1810)
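The two steps above can be sketched in Python as follows. This is a minimal illustration, not the code in update_gnd_id_qualifiers.py: the query text, the helper name `swap_qualifier`, and the sample claim are all assumptions made for this example, and the sketch only transforms a claim dict in memory without contacting Wikidata.

```python
import copy

OLD_QUALIFIER = "P1932"  # object named as
NEW_QUALIFIER = "P1810"  # subject named as

# Assumed query for step 1 (not taken from the bot's script): humans whose
# GND ID statement carries an "object named as" qualifier.
FIND_ITEMS_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        p:P227 ?statement .
  ?statement pq:P1932 ?name .
}
"""


def swap_qualifier(claim, old=OLD_QUALIFIER, new=NEW_QUALIFIER):
    """Return a copy of a claim dict with qualifier `old` moved to `new`.

    Hypothetical helper for step 2. The claim is expected in the Wikibase
    JSON layout ("qualifiers" keyed by property ID, "qualifiers-order").
    The input is not modified; claims without `old` come back unchanged.
    """
    result = copy.deepcopy(claim)
    qualifiers = result.get("qualifiers", {})
    if old not in qualifiers:
        return result

    moved = qualifiers.pop(old)
    for snak in moved:
        snak["property"] = new
        snak.pop("hash", None)  # the old hash no longer matches the snak
    qualifiers.setdefault(new, []).extend(moved)

    # Keep the qualifier's position and avoid listing `new` twice.
    if "qualifiers-order" in result:
        order = []
        for pid in result["qualifiers-order"]:
            pid = new if pid == old else pid
            if pid not in order:
                order.append(pid)
        result["qualifiers-order"] = order
    return result


# Made-up sample claim with placeholder values.
sample_claim = {
    "mainsnak": {
        "snaktype": "value",
        "property": "P227",
        "datavalue": {"value": "123456789", "type": "string"},
    },
    "qualifiers": {
        "P1932": [
            {
                "snaktype": "value",
                "property": "P1932",
                "datavalue": {"value": "Example, Name", "type": "string"},
                "hash": "0" * 40,
            }
        ]
    },
    "qualifiers-order": ["P1932"],
}

updated = swap_qualifier(sample_claim)
print(sorted(updated["qualifiers"]))  # ['P1810']
```

The qualifier value itself is copied unchanged, which corresponds to moving the name from one property to the other without reformatting it.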
Requested at Wikidata:Bot_requests#request_to_replace_qualifiers_in_GND_ID_(2021-06-07)
--Ammarpad (talk) 13:54, 10 June 2021 (UTC)[reply]
- Support I've been working with Ammarpad through an Outreachy internship. While this task isn't directly related to that, their coding skills are good and they will make a good bot operator here. Thanks. Mike Peel (talk) 08:26, 11 June 2021 (UTC)[reply]
- Can we see some sample edits? BrokenSegue (talk) 07:23, 17 June 2021 (UTC)[reply]
- (that said code generally seems fine) BrokenSegue (talk) 07:28, 17 June 2021 (UTC)[reply]
- Done. (Note: A few edits at the beginning, such as [1] [2], updated retrieved (P813) even though we didn't modify the qualifier there. I have fixed that now.) Ammarpad (talk) 07:52, 19 June 2021 (UTC)[reply]
- @Ammarpad: In this edit we reversed the order of their name? Is this intentional? Seem like the original version was the correct one? BrokenSegue (talk) 00:13, 21 June 2021 (UTC)[reply]
- @BrokenSegue: The requester @Kolja21: suggested that in the request (the gndo:preferredNameForThePerson property). The property is a concatenation of forename + surname, and I found it does not exist in some of the entities, so I changed it to use manual concatenation of the forename and surname. But we can retain the surname + forename as is. Ammarpad (talk) 14:11, 21 June 2021 (UTC)[reply]
- The name in the source (field "Person") should be copied 1:1 to subject named as (P1810) without changing the order of the name. [3] @BrokenSegue: Thanks for noticing this problem. --Kolja21 (talk) 14:29, 21 June 2021 (UTC)[reply]
- @Kolja21: can you clarify, please? There's no "person" field in the data. This is the data for Vitaliy V. Khutoryanskiy (Q37840153). You can see there's only "preferredName" and then "surname" and "forename" that are relevant here. "preferredName" is concatenation of forename + surname in that order. Ammarpad (talk) 17:51, 21 June 2021 (UTC)[reply]
- @Ammarpad: In the case of GND 1168595304 "person" = "surname", "forename". Since http://d-nb.info/gnd/1168595304 is the source shown in Wikidata subject named as (P1810) must match this source. Sorry for the trouble. --Kolja21 (talk) 21:34, 21 June 2021 (UTC)[reply]
- Also the prefixes could create problems: GND 112456439X "Di Pellegrino, Giuseppe" changed to "Giuseppe, Di Pellegrino". [4] --Kolja21 (talk) 14:38, 21 June 2021 (UTC)[reply]
- This is the same issue as above. The "Di" is not prefix as I understand it, the entire Di Pellegrino is supposed to be the surname. Ammarpad (talk) 17:51, 21 June 2021 (UTC)[reply]
- You are right. In this case ("Di" part of "surname") there is indeed nothing to consider. Example with a real prefix: GND 104264489. "Moltke, Adam Gottlob Detlef von" = surname, forename + prefix. If this is getting too complicated (Arabic prefixes like "al-" etc.) you can switch the source. You have two options:
- a) subject named as (P1810): Moltke, Adam Gottlob Detlef von, source: GND ID (P227) → d-nb.info (field "Person").
- b) subject named as (P1810): Adam Gottlob Detlef von Moltke, source hub.culturegraph.org (field "preferredName") --Kolja21 (talk) 21:53, 21 June 2021 (UTC)[reply]
- But shouldn’t the named as part match the “official” source at http://d-nb.info/gnd/104264489? That’s what I’ve been doing for quite some time creating many instances that would have to be fixed by yet another bot run, if this format (Lastname, Firstname) is changed to another format (such as Firstname Lastname) … If the "preferredNameForThePerson" from https://d-nb.info/gnd/104264489/about/lds is missing (which indeed sometimes is the case), an error list for closer human inspection might be enough. --Emu (talk) 22:13, 21 June 2021 (UTC)[reply]
- @Emu, Kolja21: OK, should I just use "preferredName" where it exists and leave the field blank where it does not? (Though I am not sure if the edit can be saved with a blank name.) Ammarpad (talk) 07:19, 22 June 2021 (UTC)[reply]
- @Ammarpad: Again sorry for the trouble. I should have thought about the different sources. Like Emu said d-nb.info is the preferred (“official”) source but if someone uses a different source it's ok. So I don't know how to make sure that named as will be identical in all cases. --Kolja21 (talk) 16:19, 22 June 2021 (UTC)[reply]
- @Ammarpad: Another solution would be to just leave out those cases (i.e. not touching them by bot) – it would be even better than a blank named as, as we could query the remaining stated in cases and fix them manually. I would be happy to help fixing those cases once the bot is done. --Emu (talk) 16:24, 22 June 2021 (UTC)[reply]
- +1. Let's start with the unambiguous cases (Lastname, Firstname) and see how many difficult cases are left over. --Kolja21 (talk) 16:41, 22 June 2021 (UTC)[reply]
- @Kolja21, Emu: Are you sure about that? BrokenSegue suggested that the right form is "forename [prefix] surname", see his comments below or diff here. You see it's kind of the opposite of (Lastname, Firstname). Maybe we should just descope the task to only updating the qualifier (object named as (P1932) -> subject named as (P1810)), which is the most important thing here. Is that OK with you? Ammarpad (talk) 11:55, 29 June 2021 (UTC)[reply]
- @Ammarpad: There is no right or wrong. There are different sources with different fields. Descoping the task would be ok with me. Thanks for your patience. --Kolja21 (talk) 13:10, 29 June 2021 (UTC)[reply]
- @Ammarpad, Kolja21, BrokenSegue: Pretty sure. See any of the official guidelines on editing the GND, for example page 3 of this document. Of course, this only applies to modern Western non-Icelandic names. (The “Lastname, Firstname” format dates way back to the original purpose of the GND in cataloguing.) --Emu (talk) 13:14, 29 June 2021 (UTC)[reply]
- Thank you both! I descoped the task and updated the script. Ammarpad (talk) 13:44, 29 June 2021 (UTC)[reply]
- @Ammarpad: Seems to me that this line is just wrong. Why would you do "forename, surname"? Seems to me "forename surname" makes more sense. Is this a cultural thing? BrokenSegue (talk) 18:07, 21 June 2021 (UTC)[reply]
- @BrokenSegue: As I said above, that's how it's provided in the data; see the "preferredName" key in this data. It's the third key after @context and @id. But I just changed the code to use the surname + forename; I have no preference for any format. Is that OK? The conditions/format are now:
- If there's only forename, use it (I came across entities without surname)
- If there are forename and surname, use them as (surname, forename)
- If there's a "prefix", append it to the above (i.e. surname, forename prefix)
- Is this ok? Ammarpad (talk) 18:28, 21 June 2021 (UTC)[reply]
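As a rough illustration, the three conditions listed above could be written as a small Python function. This is only a sketch of the rules as stated at this point in the discussion, not the bot's actual code: the function name `format_name` and its parameters are hypothetical, and it assumes a prefix is appended after a lone forename in the same way as after "surname, forename". The task was later descoped, so the bot did not end up reformatting names.

```python
def format_name(forename, surname=None, prefix=None):
    """Build a name string from GND-style name parts (hypothetical helper).

    Rules, as listed in the discussion:
      1. only a forename      -> "forename"
      2. forename and surname -> "surname, forename"
      3. a prefix, if present, is appended to the result of 1 or 2
    """
    if surname:
        name = f"{surname}, {forename}"
    else:
        name = forename
    if prefix:
        name = f"{name} {prefix}"
    return name


# Names taken from the examples quoted in this discussion.
print(format_name("Giuseppe", "Di Pellegrino"))
# Di Pellegrino, Giuseppe
print(format_name("Adam Gottlob Detlef", "Moltke", "von"))
# Moltke, Adam Gottlob Detlef von
```

Note that "Di Pellegrino" is passed as the whole surname rather than as a prefix, matching the reading given earlier in the thread.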
- @Ammarpad: I don't understand. The preferredName in that link is "Vitaliy Khutoryanskiy" with no comma. Why would you add the comma? Your changed code still adds the comma. BrokenSegue (talk) 18:31, 21 June 2021 (UTC)[reply]
- @BrokenSegue: It depends on what source you cite, see above: there are two options. If Culturegraph is the source the simplest and best way would be to add: stated in (P248): Culturegraph (Q107311325) + reference URL (P854) + retrieved (P813). --Kolja21 (talk) 22:12, 21 June 2021 (UTC)[reply]
- Updated now. Please check. Ammarpad (talk) 18:50, 21 June 2021 (UTC)[reply]
- I'm still confused. Why is it "surname forename"? Isn't the other way more typical, e.g. "forename [prefix] surname"? BrokenSegue (talk) 05:12, 22 June 2021 (UTC)[reply]
- I am not sure why you're still confused. It was forename + surname. You said surname + forename is more common. I changed the code to do that, and now you say you're confused again. Can you give me the exact format that I should use? Just give me the format and I'll change the code to do that. I am just implementing; I have no preference for any format. Ammarpad (talk) 07:06, 22 June 2021 (UTC)[reply]
- Really sorry for making this so hard but I don't recall saying "surname +forename is more common". I think "forename [prefix] surname" is the right form. As far as I can tell it never was "forename + surname" since the diff I pointed at above included a comma. BrokenSegue (talk) 17:16, 22 June 2021 (UTC)[reply]
- Thanks for clarifying. I was not angry actually. Ammarpad (talk) 11:45, 29 June 2021 (UTC)[reply]
- Support --Kolja21 (talk) 14:31, 21 June 2021 (UTC)[reply]
- Hi all, is there anything I can do to speed this up? It has been some weeks now. Thanks. Ammarpad (talk) 10:54, 20 July 2021 (UTC)[reply]
- @Ammarpad: yeah there's often delays here. maybe ping the administrator's noticeboard? BrokenSegue (talk) 17:47, 23 July 2021 (UTC)[reply]