User:BrokenSegue/API Proxy

Bot API Proxy Proposal

Problem definitions

  • Bots on Wikidata are overloading the servers
  • Wikidata has limited ability to manage the load that bots cause
    • No ability to prioritize between bots
    • No ability to pause individual bots during an incident
  • Bots send their updates to Wikidata inefficiently (making multiple edits when one larger edit would do)
  • Bots compete with one another for capacity under the acceptable max lag, so jobs take longer to run

Proposed solution

  • Develop a Wikibase API proxy and encourage bots to point at it instead of directly at Wikidata
    • The proxy makes Wikibase behave like an "eventually consistent" database
    • Read requests are forwarded directly to the underlying Wikibase installation
    • Writes are stored and queued, and the proxy responds immediately as if they had already been executed
    • Later, the queued writes are fed into Wikidata at a controlled rate under the bots' own accounts
    • Inefficient runs of multiple edits can be condensed into a single larger edit
  • The proxy would have a UI which would allow administrators to monitor/control the bots:
    • Assign priorities to bots
    • Throttle back all bots temporarily
    • Disable an individual bot
    • Monitor the size of the bot edit queue
  • The proxy could expose an additional API that bots could use to check whether their edits have gone through
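
To make the queueing idea concrete, here is a minimal Python sketch of how the proxy's write queue might behave. It is only an illustration under assumptions: the names WriteQueue, PendingEdit, submit and next_batch are hypothetical, the response dict is a placeholder rather than the real Wikibase response format, and a real proxy would need durable storage instead of an in-memory dict. Writes from the same bot to the same entity are merged into one pending edit, and the drain step honors per-bot priorities and pauses.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class PendingEdit:
    """One condensed edit: all queued changes by one bot to one entity."""
    bot: str
    entity_id: str
    changes: list = field(default_factory=list)
    seq: int = 0  # arrival order, used to break ties between equal priorities


class WriteQueue:
    """Hypothetical in-memory write queue (a sketch, not a real proxy)."""

    def __init__(self, default_priority=10):
        self._pending = {}    # (bot, entity_id) -> PendingEdit
        self._priority = {}   # bot -> priority, lower numbers drain first
        self._default_priority = default_priority
        self._paused = set()
        self._seq = itertools.count()

    def set_priority(self, bot, priority):
        self._priority[bot] = priority

    def pause(self, bot):
        self._paused.add(bot)

    def resume(self, bot):
        self._paused.discard(bot)

    def submit(self, bot, entity_id, change):
        """Queue a write and answer immediately, before it reaches the wiki."""
        key = (bot, entity_id)
        edit = self._pending.get(key)
        if edit is None:
            edit = PendingEdit(bot, entity_id, seq=next(self._seq))
            self._pending[key] = edit
        # Condense: later writes to the same entity join the same pending edit.
        edit.changes.append(change)
        return {"queued": True, "entity": entity_id}  # placeholder response

    def size(self, bot=None):
        """Number of pending edits, optionally for a single bot."""
        return sum(1 for e in self._pending.values()
                   if bot is None or e.bot == bot)

    def next_batch(self, limit):
        """Remove and return up to `limit` edits from bots that are not paused."""
        ready = [e for e in self._pending.values() if e.bot not in self._paused]
        ready.sort(key=lambda e: (
            self._priority.get(e.bot, self._default_priority), e.seq))
        batch = ready[:limit]
        for e in batch:
            del self._pending[(e.bot, e.entity_id)]
        return batch
```

A worker would call next_batch periodically, choosing the limit according to current server load, and submit each returned PendingEdit to Wikidata as a single edit under that bot's account. Merging changes by simple concatenation is a simplification; a real implementation would have to decide which kinds of edits can safely be combined.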

Expected objections

  • Not all bots will be tolerant of the "eventually consistent" mode of operation
    • We will sometimes have to return fake IDs, which the bots will then use in subsequent writes
    • Bots that need to be able to read back their writes immediately will not be able to use the proxy
  • If the edit queue becomes too long then edit conflicts may arise
    • We can offer bot writers options under which a queued edit is canceled if:
      • More than some amount of time has passed
      • Someone else has since edited the item
      • Someone else has since edited that statement
  • Non-trivial to implement
    • I am willing to put in the time to implement this, provided there is buy-in to run the proxy and bot operators are willing to migrate.
  • Bot operators may be unwilling to adopt
    • For many bot operators the change will be very small because the API is identical. It could be just a config change.
    • Bots will operate much faster and without edit rate limits, which should make bot operators happy
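
The cancellation options listed under the edit-conflict objection could be expressed as a per-bot policy that the proxy checks just before it applies a queued edit. The sketch below is hypothetical: QueuedEdit, CancelPolicy, should_cancel and the statement fingerprint are illustrative names, not part of any existing API, and comparing revision IDs does not by itself tell us whether the intervening edit was made by someone else or by the same bot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedEdit:
    queued_at: float               # time the proxy accepted the write (epoch seconds)
    base_revision: int             # item revision seen when the write was queued
    statement_hash: Optional[str]  # hypothetical fingerprint of the target statement


@dataclass
class CancelPolicy:
    """Opt-in cancellation rules chosen by the bot operator (assumed design)."""
    max_age_seconds: Optional[float] = None
    cancel_if_item_edited: bool = False
    cancel_if_statement_edited: bool = False


def should_cancel(edit, policy, now, current_revision, current_statement_hash):
    """Return a reason string if the edit should be dropped, otherwise None."""
    if (policy.max_age_seconds is not None
            and now - edit.queued_at > policy.max_age_seconds):
        return "expired"
    if policy.cancel_if_item_edited and current_revision != edit.base_revision:
        return "item_changed"
    if (policy.cancel_if_statement_edited
            and current_statement_hash != edit.statement_hash):
        return "statement_changed"
    return None
```

The returned reason could also be reported through the status API, so that a bot can learn why one of its edits was dropped.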

Feedback?

Comments appreciated on the talk page. BrokenSegue (talk) 02:57, 16 October 2020 (UTC)