Wikidata:Requests for permissions/Bot/WikiTennisBot 1
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Lymantria (talk) 09:19, 29 December 2020 (UTC)
WikiTennisBot 1
WikiTennisBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Mad melone (talk • contribs • logs)
Task/s: As per Wikidata:Bot_requests#ITF-url-conversion, the International Tennis Federation has changed its website structure and, as a consequence, a new ID has been introduced. There is now ITF player ID before 2020 (archived) (P599), which was used with the old website structure and can still be used (it forwards to the new structure), and ITF player ID 2020 (P8618) for the new structure. Although some of the conversion has been done manually, I believe there is potential for a bot run.
Code: Please note that I am a newbie when it comes to bot programming, and to programming in general. Therefore, please be critical.
- Code: https://github.com/madmelone/WikiTennisBot/blob/main/bots/ITF%20property%20change/bot.py
- List of relevant items: https://github.com/madmelone/WikiTennisBot/blob/main/bots/ITF%20property%20change/items_newformat.txt
- List of erroneous ITF player ID before 2020 (archived) (P599) statements which need to be corrected manually
Function details: Bot works as follows:
- Get all players with an ITF player ID before 2020 (archived) (P599) statement OK
- Get all new links for the ITF player ID 2020 (P8618) statement OK
- Run the bot for updates
- Check whether the item already has an ITF player ID 2020 (P8618) statement. If not, add a new statement; if so, update it only when it differs from the result of step 2
- (Potentially) Set all ITF player ID before 2020 (archived) (P599) statements to "deprecated"
- Write a bot log (small examples via User:WikiTennisBot/log)
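The decision in step 4 can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in bot.py: the function name `decide_p8618_action` and the returned labels are hypothetical, `existing_ids` stands for the P8618 values already on an item, and `new_id` for the value obtained in step 2.

```python
def decide_p8618_action(existing_ids, new_id):
    """Hypothetical helper: decide what to do with one item's P8618 statement.

    existing_ids -- list of P8618 values already present on the item
    new_id       -- the P8618 value resolved from the new ITF website
    """
    if not existing_ids:
        # No P8618 statement yet: add a new one.
        return "add"
    if new_id in existing_ids:
        # The item already carries the resolved value: nothing to do.
        return "skip"
    # A P8618 statement exists but differs from the resolved value.
    return "update"
```

Keeping the decision separate from the actual edit makes it possible to check the logic on a list of sample items before any live edit is made.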
I encourage feedback on both the code and the idea, therefore pinging @Edoderoo:
--Mad melone (talk) 14:49, 7 November 2020 (UTC)
One question I already have is how to save the rank of a statement, because item.setRank("deprecated") doesn't seem to work for me. Thanks --Mad melone (talk) 15:48, 7 November 2020 (UTC)
- Support this in general. Nice idea to use selenium webdriver here considering the difficulties with the redirects.
However, I would not deprecate non-resolving or redirecting P599 statements, as "dead link" or "redirecting link" do not mean that the identifier is invalid. Anyways, if you deal with ranks in pywikibot you probably look for pywikibot.Claim.changeRank (instead of "setRank").
Generally the code is still a bit unstructured which does not make it appealing to go through it. However, I am pretty sure that most bot code here does not get a thorough review anyways ;-) In case you get the flag, please start with small batches that you can verify easily using your bot's contributions page and only increase the batch size once you gain more confidence. —MisterSynergy (talk)
- MisterSynergy Thanks for the feedback. Using claim.changeRank gives an error.
    item.get()
    if item.claims:
        if "P599" in item.claims:
            if item_data[i]["itf_old"] == item.claims["P599"][0].getTarget():
                # Set P599 statement deprecated if not already
                claim = pywikibot.Claim(repo, "P599")
                if not claim.getRank() == "deprecated":
                    print(claim.getRank())
                    claim.changeRank("deprecated")
    ...
It prints out "normal", so I know I am in the right branch of the decision tree, but I get the following error: "AttributeError: 'Claim' object has no attribute 'title'". I don't know what to make of that. It would be great if you or someone else with experience could push me in the right direction, but I will keep working on it myself regardless. --Mad melone (talk) 19:28, 11 November 2020 (UTC)
- You are not editing the existing claim, thus it fails. Try something like this (which I put into a function just for demo purposes; not tested, but should do):
    def deprecateRank(item, oldId):
        item.get()
        if not item.claims:  # no statements found in this item
            return
        if 'P599' not in item.claims:  # property P599 not found in this item
            return
        for claim in item.claims['P599']:  # mind that item.claims['P599'] is a list of all P599 claims over which you want to loop here
            if claim.getTarget() != oldId:  # claim has a different value, skip
                continue
            if claim.getRank() == 'deprecated':  # claim is already deprecated, skip
                continue
            claim.changeRank('deprecated')
- MisterSynergy (talk) 19:53, 11 November 2020 (UTC)
- Support - right now the tedious work is done manually, which is a waste of time and (mental) energy. Edoderoo (talk) 10:49, 10 November 2020 (UTC)
- Maybe the bot could set "end cause"="redirect" as qualifier for identifiers that are no longer active (while keeping normal rank). Also, please do a few actual live edits as samples. Looking at Q56254533 mostly confused me.
@MisterSynergy: would you kindly change the sorting of properties so that P8618 appears before P599 on items? --- Jura 10:52, 10 November 2020 (UTC)
- I made a couple of edits, see https://www.wikidata.org/wiki/Special:Contributions/WikiTennisBot. I identified an error caused by a trailing "/" in the URL and fixed it; re-running the affected items after the fix worked fine. As discussed above, I don't intend to deprecate P599 but only to insert P8618 claims, which seems to work fine. I could also run this without a bot flag, but would appreciate your feedback regardless. Please advise how to continue. --Mad melone (talk) 20:07, 11 November 2020 (UTC)
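The exact fix for the trailing "/" is not shown in this discussion. As a rough sketch of one way to guard against it, a URL can be normalized before it is compared or split into an ID. The function name `normalize_url` and the example.org address below are hypothetical and do not reflect the real ITF URL format.

```python
def normalize_url(url):
    """Hypothetical helper: drop surrounding whitespace and trailing slashes
    so that 'https://host/path/' and 'https://host/path' compare equal."""
    return url.strip().rstrip("/")


# Placeholder URLs, not real ITF addresses.
with_slash = normalize_url("https://example.org/players/some-player/123/")
without_slash = normalize_url("https://example.org/players/some-player/123")
```

After normalization both variants yield the same string, so the last path segment can be taken reliably with `url.rsplit("/", 1)[-1]`.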
- I made a larger batch of edits this morning (376 edits, see User:WikiTennisBot/log/20201311 07-37-34 ITF Property Change). I checked more than 200 of those edits and all of them look fine to me. It would be great if I could get the "go" signal for the remaining ~2,000 edits. --Mad melone (talk) 12:30, 13 November 2020 (UTC) @MisterSynergy: @Jura1:
- Are we ready for approval here? --Ymblanter (talk) 20:58, 24 November 2020 (UTC)
- We may assume so after a month of silence. Lymantria (talk) 09:19, 29 December 2020 (UTC)