Wikidata:Requests for permissions/Bot/ComplexPortalBot
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Lymantria (talk) 20:26, 19 June 2021 (UTC)
ComplexPortalBot
ComplexPortalBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: TiagoLubiana (talk • contribs • logs)
Task/s: Add and update protein complex entities from the European Bioinformatics Institute Complex Portal platform.
Code: Available at https://github.com/lubianat/complex_bot
Function details: For a set of curated species, the bot will (in synchrony with Complex Portal curation):
- Add items for macromolecular complexes absent from Wikidata
- Add labels, aliases, descriptions and core statements (e.g., "instance of")
- Link macromolecular complex items to their components via "part of" relations
- Link macromolecular complex items to Gene Ontology terms
More details about the process are available at User:ProteinBoxBot/2020_complex_portal
--TiagoLubiana (talk) 17:06, 23 February 2021 (UTC)
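As an illustration of the function details in the request above, here is a minimal sketch of how the planned statements for one complex might be assembled. It is not taken from the bot's repository. `ComplexRecord`, `plan_statements` and all `Q_...` identifiers are hypothetical placeholders. P31 and P680 are the properties named in this discussion; using "has part(s)" (P527) for component links is an assumption, since the request only speaks of "part of" relations.

```python
from dataclasses import dataclass, field

INSTANCE_OF = "P31"          # "instance of", named in the discussion
MOLECULAR_FUNCTION = "P680"  # "molecular function", named in the discussion
HAS_PART = "P527"            # assumption: property used for component links


@dataclass
class ComplexRecord:
    """Hypothetical, simplified view of one Complex Portal entry."""
    label: str
    aliases: list
    description: str
    component_qids: list = field(default_factory=list)
    go_function_qids: list = field(default_factory=list)


def plan_statements(record, complex_class_qid):
    """Return (property, target QID) pairs wanted on the complex item."""
    planned = [(INSTANCE_OF, complex_class_qid)]
    planned += [(HAS_PART, qid) for qid in record.component_qids]
    planned += [(MOLECULAR_FUNCTION, qid) for qid in record.go_function_qids]
    return planned


record = ComplexRecord(
    label="example complex",
    aliases=["example alias"],
    description="macromolecular complex (placeholder description)",
    component_qids=["Q_COMPONENT_A", "Q_COMPONENT_B"],  # placeholder IDs
    go_function_qids=["Q_GO_TERM"],                      # placeholder ID
)
for prop, target in plan_statements(record, "Q_COMPLEX_CLASS"):
    print(prop, target)
```

The sketch only builds a plan in memory; writing to Wikidata, logging in and attaching references are left out on purpose.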
Support --Andrawaag (talk) 18:20, 23 February 2021 (UTC)
Support Andrew Su (talk) 19:28, 4 March 2021 (UTC)
Support --Jvcavv (talk) 20:47, 4 March 2021 (UTC)
Support --SCIdude (talk) 08:28, 5 March 2021 (UTC)
Support --Sulhasan (talk) 20:04, 9 March 2021 (UTC)
Support --Bmeldal (talk) 15:18, 19 May 2021 (UTC)
Comment Sample edits are being run. The bot is doing only a couple at a time due to API rate limits for new users. TiagoLubiana (talk) 12:25, 8 March 2021 (UTC)
Comment Sample edits were successfully done. The bot is ready for scale-up. TiagoLubiana (talk) 14:13, 15 April 2021 (UTC)
- Question Does this bot overwrite or delete existing information? If so, how, when and why? --- Jura 07:48, 16 April 2021 (UTC)
- Strong oppose given the open questions. (This is to avoid it being approved prematurely, before the question is actually sorted out.) --- Jura 07:48, 16 April 2021 (UTC)
- @Jura1: The bot follows the default behaviour of https://github.com/SuLab/WikidataIntegrator and does not explicitly delete information. I'd gladly check that, but currently I am not able to do more test runs (repeated maxlags over 40 seconds). I'll try again and do a couple more checks. Thank you for the comment. TiagoLubiana (talk) 19:48, 16 April 2021 (UTC)
- How about overwriting (or replacing)? --- Jura 19:58, 16 April 2021 (UTC)
- @Jura1: It was doing that, indeed, which is not appropriate. So thank you very much for the catch! I have fixed the source code, and it does not overwrite or delete information anymore. It simply appends a new statement if it doesn't exist already. I've tested the behavior on CST complex (Q105777252). TiagoLubiana (talk) 14:00, 21 April 2021 (UTC)
- @Jura1: Given the fix, would it be possible for you to lift the strong oppose? Best, TiagoLubiana (talk) 15:16, 22 April 2021 (UTC)
- Can you do three test edits showing correct updates (please provide links to the diffs)? To me, the last edit in the item you linked above as a sample still shows overwriting that shouldn't happen. --- Jura 10:18, 23 April 2021 (UTC)
- He is updating references. In my bot I'm not doing this, but I would like to see a document/discussion discouraging this. --SCIdude (talk) 14:31, 23 April 2021 (UTC)
- @Jura1: it is only overwriting references, indeed. In the last edit that you quoted (this), the bot has not removed the molecular function (P680) and the instance of (P31) statements that were added manually (they are the only unreferenced statements on the page). As @SCIdude: mentions, it would be nice to see a document on that. However, if you think it is important, I'll invest some time to remove that too. TiagoLubiana (talk) 21:58, 23 April 2021 (UTC)
- I don't think it should be doing that. Maybe ask on Wikidata:Project chat if you think it's a good idea. --- Jura 17:10, 27 April 2021 (UTC)
- @Jura1: can you elaborate a bit on why you object to this? In the bot edit, the reference is updated with a new timestamp after the bot updated that item. Complex Portal, like other curated databases, is a living resource that is kept up to date in line with new insights. Complex Portal is the primary curation source, which means any Wikidata entry based on this resource should mirror Complex Portal's updates in line with its publication cycles to ensure compatibility of available data. The bot only updates or verifies statements added from Complex Portal, not statements from other sources. Updating the references of the primary statements with each Complex Portal release keeps the Wikidata entries up to date and clean, while all edits are kept in the history for users to refer back to if required. --Andrawaag (talk) 15:03, 19 May 2021 (UTC)
- It's just consistent with the way most other data in Wikidata is updated. Statements are not meant to be continuously re-written as this is done in Wikipedia. Ranks and qualifiers are meant to keep track of validity, see Help:Ranks. If the resource isn't stable, maybe it shouldn't be imported at all. --- Jura 06:38, 20 May 2021 (UTC)
- The data was not changing, I think. Only the metadata. Do you mean you want at some point 500 references for the same statement, the most recent reference ranked "best"? Is that technically even possible at this moment? --Egon Willighagen (talk) 09:31, 20 May 2021 (UTC)
- @Jura1: That is not accurate. Statements are constantly being updated in Wikidata. There is even an API call, "wbeditentity", that updates a full item in one go, which is much less expensive on the API than updating one statement, rank or reference at a time. That is the technical perspective, but even at the content level, Wikidata is constantly in flux. Every time, for example, a new head of state or government is instated, the statement about the previous person is updated: a new statement is created with the new head of state, and the previous statement is updated with qualifiers and sometimes new references. This has nothing to do with a resource being stable or unstable, but more with knowledge evolving. Other times new references emerge that need to be added to the reference blob of a statement as well. --Andrawaag (talk) 10:06, 20 May 2021 (UTC)
- Technically I can edit your comment and re-write it to say something else, but that doesn't mean I should or that it's acceptable here or in Wikipedia to do so. Wikipedia does allow an article to be fully rewritten based on new information and generally doesn't require including the historical evolution of a topic. This is different from Wikidata, where Help:Ranking outlines the approach for historic information. It's correct that qualifiers can be added to statements with new references, but that doesn't mean previous information should be re-written or deleted entirely. Also, I'm aware that we had a few legacy bots (possibly operated by yourself or people collaborating with you) that weren't thoroughly reviewed before they started operating and that occasionally led to discussions with other users when they deleted information, or to requests for deletion of entire batches of items. I hope they have been fixed in the meantime. --- Jura 07:39, 31 May 2021 (UTC)
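To make the behaviour discussed in this thread concrete, here is a minimal sketch of "append a new statement if it doesn't exist already" that also leaves existing references untouched. It is only an illustration under stated assumptions, not the bot's actual code and not the WikidataIntegrator API. `merge_statements`, the plain-dict statement layout, the `Q_...` identifiers and the retrieval dates are all hypothetical, and P527 as the component property is an assumption.

```python
import copy


def merge_statements(existing, incoming):
    """Append incoming statements that are not already present.

    Statements are plain dicts with 'property', 'value' and 'references'.
    Existing statements, including their references, are never modified.
    """
    merged = copy.deepcopy(existing)
    seen = {(s["property"], s["value"]) for s in merged}
    for statement in incoming:
        key = (statement["property"], statement["value"])
        if key not in seen:
            merged.append(copy.deepcopy(statement))
            seen.add(key)
    return merged


# Placeholder data: one manual statement, one previously imported statement.
existing = [
    {"property": "P31", "value": "Q_COMPLEX_CLASS", "references": []},
    {"property": "P680", "value": "Q_GO_TERM",
     "references": [{"retrieved": "2021-04-01"}]},
]
# A later run sees the same P680 value again plus one new component link.
incoming = [
    {"property": "P680", "value": "Q_GO_TERM",
     "references": [{"retrieved": "2021-04-21"}]},
    {"property": "P527", "value": "Q_COMPONENT_A",
     "references": [{"retrieved": "2021-04-21"}]},
]
merged = merge_statements(existing, incoming)
```

In this sketch the repeated P680 statement keeps its original reference, so a run that finds nothing new changes nothing, and only the new component link is appended. Whether an unchanged statement should instead gain an additional reference is exactly the policy question debated above.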
Support --Egon Willighagen (talk) 15:37, 19 May 2021 (UTC)
Comment Not sure how the digression about the api above is relevant here, just to summarize the basics: items, statements and references (such as the retrieved date) shouldn't be deleted, removed or overwritten. New information can be added: new items, new statements to existing items, new qualifiers and references to existing statements, etc. --- Jura 07:39, 31 May 2021 (UTC)
- @Jura1: Just to be clear. You state there is consensus that a new reference should be added even if the same bot retrieves the same data from the same database one day later? If so, there is no reference for your statement. Your refusal of support for this bot seems to be based on unsupported claims and thus to be invalid. --SCIdude (talk) 10:01, 31 May 2021 (UTC)
- If new information is added, then a reference should be provided.
You shouldn't "touch up" all items and statements merely because you run the bot on a daily basis on a series of items without adding anything. I'm not aware of any consensus for such "touch-up" bots operating on Wikidata. --- Jura 10:07, 31 May 2021 (UTC)
- @Jura1: Will you withdraw your refusal of support if the bot owner pledges to not "touch up" references in the future? --SCIdude (talk) 10:12, 31 May 2021 (UTC)
- The question is less whether I (or anybody else) support or oppose the bot than what concerns (or arguments) were advanced. As far as I'm concerned, I think with this, everything I raised has been addressed, so it could move ahead. --- Jura 10:38, 31 May 2021 (UTC)
What's the outcome?
- @SCIdude: finally, what will it be doing for updates? --- Jura 09:57, 3 July 2021 (UTC)