Wikidata talk:Bots
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024.
When do you need to create a bot account?
This page currently doesn't mention any criteria for when a mass-editing operation would need a bot account. This is relevant because QuickStatements enables regular users to perform mass-editing operations. The only information I could find was on the QuickStatements help page, which currently says "Very large runs or potentially-controversial runs should go through the approval process described in Wikidata:Bots.", but that is not a well-defined criterion. Silver hr (talk) 20:52, 16 July 2022 (UTC)
Unattributed proxy edits
Picking up on this thread, I propose that we add:
Bots that proxy edits
In order for a bot to make edits on behalf of another user, for which the botop is not responsible, then:
Bovlb (talk) 18:03, 30 August 2022 (UTC)
- I think we should rather phase out the use of proxy bot accounts completely. As far as I am aware, OAuth allows tools to make edits from the Wikimedia account of the tool user anyway.
- Btw. we do have a related situation at User talk:Reinheitsgebot#Who is triggering edits of this account?. —MisterSynergy (talk) 18:32, 30 August 2022 (UTC)
- Eliminating proxy edits entirely would also meet my needs. Bovlb (talk) 20:47, 30 August 2022 (UTC)
- Using proxy bots used to have three advantages:
- Before the introduction of OAuth, it was the only possibility. This is no longer true. (Extra grant needs to be requested through OAuth for the tool to be able to edit, but users should be comfortable with granting it if they want to use an editing tool.)
- Their edits can be marked as bot edits. This is what you want to prohibit.
- They can use higher API limits than ordinary users. This is what would remain, although I’m not sure if bots can actually take advantage of this, since they need to respect replication lag. It’s also a question if it’s an advantage or disadvantage that any logged-in user may quickly edit many pages.
- Bots have two additional rights that autoconfirmed users don’t have: "suppressredirect" (not create redirects from source pages when moving pages) and "nominornewtalk" (not have minor edits to discussion pages trigger the new messages prompt). These apply only to wikitext pages, not to entities, so they’re mostly uninteresting for Wikidata bots, especially proxy bots.
- Considering that the only advantage that would remain is the higher API limits, and even that is of questionable value, I’m also for entirely banning proxy bots. However, I think such an important policy change should be discussed at a more visible place, e.g. on Wikidata:Project chat, so that all interested people can take part. —Tacsipacsi (talk) 08:08, 31 August 2022 (UTC)
- On "API limits": bot accounts have the right "apihighlimits" which allows them to read data from the API more efficiently in some scenarios. However, they do not have "noratelimit" any longer: the maximum edit rate for both bots and regular users is 90/minute. Bot accounts cannot edit quicker than regular ones. —MisterSynergy (talk) 08:58, 31 August 2022 (UTC)
- Oh right, so bots can only query a bit more quickly. Thanks to continuation, this is probably a negligible difference, and even if/when not, nothing stops the tool from querying through a bot account; we want to ban only proxied edits. Then there’s really no reason to use proxy bots. —Tacsipacsi (talk) 18:42, 1 September 2022 (UTC)
Formally describing bot tasks
We have quite a few bots.
I recently created a bot to give a better overview of our various bots by scraping the bot User:* profiles for {{Bot}}.
One bot can perform many different tasks. I would like to make the individual bot tasks discoverable by the involved properties,
e.g. show me all bots that add official website (P856) as a main statement, or all bots that use point in time (P585) as a qualifier, or all bots that edit lexemes.
Currently bot tasks are only described in free text ... so this would require us to introduce a way to formally describe the tasks of a bot.
I therefore suggest the introduction of a new tasks parameter for {{Bot}} which would accept a JSON array where each contained object has the following properties:
- description: English description of the task in plaintext (no wiki markup). Mentions of properties are automatically linkified.
- space: In which space the edit is performed. Acceptable values are entity types (Item, Property, Lexeme, Sense, Form) or Wikitext, to denote that the edit changes regular wikitext pages.
Additionally a task may specify one of the following:
- "properties": { "mainStatement": [...], "qualifier": [...], "reference": [...] } lets tasks that add or remove claims specify which properties they use
- "fingerprint": true specifies that the task edits labels, descriptions and/or aliases
- "sitelinks": true specifies that the task adds or removes sitelinks
- "sitelink_badges": true specifies that the task adds or removes sitelink badges
The JSON would reside directly in the wikitext, making it easy to scrape. For humans visiting the page, the JSON would be rendered via Module:BotTasks, as shown in the following examples.
This is just my first idea of how to formally describe bot tasks ... feedback is very much welcome!
--Push-f (talk) 08:42, 8 December 2022 (UTC)
Examples
Space | Description | Main statement properties | Qualifier properties | Reference properties |
---|---|---|---|---|
Item | Add software version identifier (P348) to items that have source code repository URL (P1324) set to a GitHub.com repository | software version identifier (P348) | publication date (P577) | reference URL (P854), retrieved (P813), title (P1476), publication date (P577) |
Item | Add official website (P856) to items that have source code repository URL (P1324) set to a GitHub.com repository | official website (P856) | | reference URL (P854), retrieved (P813) |

Space | Description | Main statement properties | Qualifier properties | Reference properties |
---|---|---|---|---|
Item | Add pronunciation audio (P443) claims for records made on lingualibre.org | pronunciation audio (P443) | | reference URL (P854) |
Form | Add pronunciation audio (P443) claims for records made on lingualibre.org | pronunciation audio (P443) | language of work or name (P407) | reference URL (P854) |

Space | Description | Properties involved in the edit |
---|---|---|
Item | Adds descriptions for various languages | This task edits labels, descriptions and/or aliases. |
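As a rough illustration of the proposal, the first and third example tables above might be written as the following tasks values, together with a small check and a lookup by property. This is only a sketch under assumptions: the bot names ExampleBotA and ExampleBotB, the use of bare property IDs such as "P348" inside the arrays, and the helper names validate_task and bots_using are all hypothetical and not part of the proposal.

```python
import json

# Hypothetical "tasks" values, transcribed from the first and third example
# tables. Bot names and the bare "P..." ID format are assumptions.
EXAMPLE_BOTS = {
    "ExampleBotA": json.loads("""
    [
      {
        "description": "Add software version identifier (P348) to items that have source code repository URL (P1324) set to a GitHub.com repository",
        "space": "Item",
        "properties": {
          "mainStatement": ["P348"],
          "qualifier": ["P577"],
          "reference": ["P854", "P813", "P1476", "P577"]
        }
      },
      {
        "description": "Add official website (P856) to items that have source code repository URL (P1324) set to a GitHub.com repository",
        "space": "Item",
        "properties": {
          "mainStatement": ["P856"],
          "qualifier": [],
          "reference": ["P854", "P813"]
        }
      }
    ]
    """),
    "ExampleBotB": json.loads("""
    [
      {
        "description": "Adds descriptions for various languages",
        "space": "Item",
        "fingerprint": true
      }
    ]
    """),
}

# Values for "space" and the three property roles, as listed in the proposal.
ALLOWED_SPACES = {"Item", "Property", "Lexeme", "Sense", "Form", "Wikitext"}
ROLES = ("mainStatement", "qualifier", "reference")


def validate_task(task):
    """Return a list of problems found in one task object (empty if none)."""
    problems = []
    if not isinstance(task.get("description"), str):
        problems.append("description must be a string")
    if task.get("space") not in ALLOWED_SPACES:
        problems.append("unknown space")
    for role in task.get("properties", {}):
        if role not in ROLES:
            problems.append("unknown property role: " + role)
    return problems


def bots_using(bots, property_id, role):
    """Names of bots with at least one task using property_id in the given role."""
    return sorted(
        name
        for name, tasks in bots.items()
        if any(property_id in t.get("properties", {}).get(role, []) for t in tasks)
    )
```

With data in this shape, a query such as "all bots that add official website (P856) as a main statement" becomes bots_using(EXAMPLE_BOTS, "P856", "mainStatement").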
Discussion
Somehow this feels too static in my opinion:
- My own bots currently have more than 10 tasks; I am also co-maintaining Deltabot and PLbot meanwhile, with more than 50 different scripts
- Some tasks involve non-content namespaces
- Some tasks involve actions such as "patrol", or "protect", "delete", etc. (admin-bot); some interact with sitelinks and badges, or terms in the widest sense; some may use "undo" or "rollback"
- Some tasks may decide what to do on-the-fly
It would be quite an ask to provide a definite list of things the bots edit during operation. —MisterSynergy (talk) 09:24, 8 December 2022 (UTC)
- I guess by non-content namespaces you mean regular wiki pages? I already accounted for those with "space": "Wikitext".
- Right, I think it's okay if we leave out admin actions such as "patrol", "protect" and "delete" for now. Most bots aren't admin bots anyway.
- I just added three other options, "fingerprint", "sitelinks" and "sitelink_badges" ... note that I am not proposing to model these in detail (e.g. which bot edits which labels/descriptions/aliases in which languages, or which sitelinks are edited)... I think it's good enough to be able to distinguish a bot that only edits properties from a bot that only edits something in the fingerprint or something about sitelinks.
- I don't know what you mean by "terms in the widest sense".
- So yes I don't think this scheme has to cover everything, I think it's already valuable if it can describe most tasks of the average bot.
- --Push-f (talk) 16:28, 8 December 2022 (UTC)
Is a bot flag required for a bot that is expected to make very few edits (if any)?
Please see phab:T370842 and wikipedia:Wikipedia:Administrators'_noticeboard#Bot_to_inform_temp_users_of_expiry for context. On Wikidata, it appears that this feature is almost never used, and my question is whether I still need to go through the bot approval process for this. Leaderboard (talk) 07:12, 17 August 2024 (UTC)
- I thought we automatically accept global bots. Ymblanter (talk) 18:42, 18 August 2024 (UTC)
- @Ymblanter: meta:Global bots is a specific flag which requires a two-week global discussion period, which this bot does not have. (also: global bots are disabled on this wiki anyway) Leaderboard (talk) 06:11, 19 August 2024 (UTC)
- Then I would say it would be good to go through a request, mostly to see whether the community thinks the task is worthwhile to perform. Ymblanter (talk) 06:48, 19 August 2024 (UTC)
- Deploying a bot on Wikidata to notify users of rights which will expire sounds like a waste of effort to me. Multichill (talk) 20:13, 20 August 2024 (UTC)
- Hi @Multichill: you are right in that this is pretty much never used on Wikidata (I wasn't able to find one in limited testing), but I intend to run the bot globally on as many wikis as possible and hence this process (because nothing stops this from being used in the future). Ideally this is something that shouldn't require any kind of approval at all, but the rules don't work like that. Leaderboard (talk) 18:26, 21 August 2024 (UTC)
Are bots subject to edit rate limits?
And if so, is there an approval process for getting the noratelimit right, short of becoming a sysop? 2600:1003:B13C:72FD:9AC6:700D:1BF1:6E08 21:47, 14 November 2024 (UTC)
- Yes, accounts with the bot flag are rate-limited at 90 edits per minute, just like regular user accounts, and there is no approval process for getting the noratelimit right.
- Admins do have this right only because certain admin functions might otherwise not work properly. —MisterSynergy (talk) 23:50, 14 November 2024 (UTC)
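For bot operators who want to stay under the limit on the client side, a minimal sketch of a sliding-window throttle might look like the following. It assumes the 90 edits per minute figure given above. The class name EditThrottle and its parameters are hypothetical, and this is not how the server enforces the limit. It also does not handle replication lag, which bots are expected to respect separately.

```python
import time
from collections import deque


class EditThrottle:
    """Allow at most max_edits calls to acquire() per window seconds."""

    def __init__(self, max_edits=90, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_edits = max_edits
        self.window = window
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # times of edits still inside the window

    def wait_time(self):
        """Seconds until the next edit is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Forget edits that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_edits:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self):
        """Block until an edit is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._stamps.append(self._clock())
```

A bot loop would call acquire() on a shared EditThrottle instance immediately before each write request.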