Function details: The bot goes through items that have no sitelinks and checks their history pages. If an item had just one sitelink (which has since been removed), the bot checks that sitelink and, if it now exists in another item, merges the two items. In the first phase I'll merge only pairs where the empty item has no statements at all.
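A minimal sketch of the first-phase decision described above, not the bot's actual code. The `Item` class, the `removed_sitelinks` field and the `sitelink_index` lookup are hypothetical stand-ins for data the bot would read from item histories and from Wikidata itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Sitelink = Tuple[str, str]  # (dbname, page title), e.g. ("enwiki", "Douglas Adams")


@dataclass
class Item:
    """Simplified, hypothetical stand-in for a Wikidata item."""
    qid: str
    sitelinks: Dict[str, str] = field(default_factory=dict)
    statements: List[str] = field(default_factory=list)
    # Sitelinks that appear in the item's history but are no longer present.
    removed_sitelinks: List[Sitelink] = field(default_factory=list)


def find_merge_target(item: Item, sitelink_index: Dict[Sitelink, str]) -> Optional[str]:
    """Return the QID that now holds the removed sitelink, or None to skip."""
    if item.sitelinks:                      # item is not empty
        return None
    if item.statements:                     # first phase: no statements at all
        return None
    if len(item.removed_sitelinks) != 1:    # only the unambiguous one-sitelink case
        return None
    target = sitelink_index.get(item.removed_sitelinks[0])
    if target is None or target == item.qid:
        return None
    return target
```

Anything that does not match the single unambiguous case returns `None`, so the item is left for a human rather than guessed at.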
It deletes them completely automatically, right? If so, it seems like an "adminbot", which has been controversial in the past. Of course, what really matters is the current community consensus.
Personally, I don't see any issue as long as there aren't too many false positives and the deletions are monitored. The Anonymouse (talk) 06:53, 18 November 2013 (UTC)
Yes, it deletes automatically, but the deletions show up in recent changes and anyone can check them. Amir (talk) 15:51, 19 November 2013 (UTC)
@Ladsgroup: can you post the code, please? I myself have made several thousand deletions using merge.py, so I wouldn't be concerned about this task as long as you take full responsibility for it. --Ricordisamoa 23:19, 22 November 2013 (UTC)
I can send you the code if you want (sorry for the delay, I forgot). Amir (talk) 17:37, 23 November 2013 (UTC)
@Ladsgroup: I'd prefer the bot code to be public to the community, given how potentially dangerous this task is. --Ricordisamoa 03:20, 7 December 2013 (UTC)
@Bene* Why should the merge go into the higher ID? I strongly vote to merge into the most used item or the lower one. Help:Merge also says in the section "Select recipient item": "choose the one with the lowest Q####, as it is the oldest item" — Felix Reimann (talk) 09:44, 21 February 2014 (UTC)
Hi, where did I recommend merging into the higher ID? I only said that the most used item, or the one with more sitelinks or statements, should be chosen. I assume you agree with that. -- Bene*talk 15:52, 21 February 2014 (UTC)
I was not sure what you meant by "bigger item". If you meant bigger in terms of the number of interwiki links or backlinks, then I'm fine with it. — Felix Reimann (talk) 16:10, 21 February 2014 (UTC)
@Ladsgroup: you should construct the Site object by using the DBname directly, not by removing "wiki" from it to get the language code, since that may break with sites other than Wikipedia. And BTW, the getReferences method of the 'pywikibot/core' branch comes with a built-in namespace filter. --Ricordisamoa 18:39, 16 January 2014 (UTC)
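A small illustration of why stripping "wiki" from a DBname is fragile. This is a sketch only: `naive_lang_code`, `KNOWN_SITES` and `resolve_site` are hypothetical names, and the hand-written table merely stands in for whatever DBname lookup the framework provides.

```python
from typing import Dict, Optional, Tuple


def naive_lang_code(dbname: str) -> str:
    """The fragile approach: drop 'wiki' and treat the rest as a language code."""
    return dbname.replace("wiki", "")


# Hypothetical lookup table keyed by the full DBname: (code, family).
KNOWN_SITES: Dict[str, Tuple[str, str]] = {
    "enwiki": ("en", "wikipedia"),
    "zh_min_nanwiki": ("zh-min-nan", "wikipedia"),
    "enwikivoyage": ("en", "wikivoyage"),
    "commonswiki": ("commons", "commons"),
}


def resolve_site(dbname: str) -> Optional[Tuple[str, str]]:
    """Resolve the DBname as a whole; unknown names return None so the caller can skip."""
    return KNOWN_SITES.get(dbname)


for name in ("enwiki", "zh_min_nanwiki", "enwikivoyage", "commonswiki"):
    print(name, "->", repr(naive_lang_code(name)), "vs", resolve_site(name))
```

The naive version only happens to work for plain `xxwiki` names: it turns `enwikivoyage` into `envoyage`, leaves the underscores in `zh_min_nan`, and treats `commons` as if it were a language.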
About the first one, I'll fix it. As for the latter, I'm running via compat (I'm a little bit old-school). Amir (talk) 18:41, 16 January 2014 (UTC)
Can you make a (semi-random) list of these empty items to give me an idea what we're talking about? And to others: Please don't start deleting or editing them, that defeats the point. Multichill (talk) 10:06, 22 February 2014 (UTC)
This is a very important comment. Could you generate the full list too? I think the community must review it very carefully before approving this task. — Ivan A. Krestinin (talk) 18:50, 22 February 2014 (UTC)
I'm working on making a sample list. Amir (talk) 09:23, 23 February 2014 (UTC)
It is not a good idea. The bigger item can have a bad description in some languages. General idea: the bot must not lose any good information. If the bot cannot save all the information, or cannot decide which information is better, it must skip the conflicting item pair. Such pairs need manual processing. — Ivan A. Krestinin (talk) 18:44, 25 February 2014 (UTC)
@Ivan A. Krestinin: you're right, I'm going to change it to check for description conflicts and, if it can't decide, skip the pair. Thank you for sharing this concern. Best, Amir (talk) 07:22, 1 March 2014 (UTC)
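The "skip on conflict" rule discussed here could look roughly like the sketch below. It assumes descriptions are plain `{language: text}` dictionaries, and the function names are hypothetical; it is not the code the bot actually runs.

```python
from typing import Dict, Optional

Descriptions = Dict[str, str]  # language code -> description text


def descriptions_conflict(a: Descriptions, b: Descriptions) -> bool:
    """True if both items describe the same language with different text."""
    for lang, text in a.items():
        other = b.get(lang)
        if other is not None and other.strip() != text.strip():
            return True
    return False


def merged_descriptions(a: Descriptions, b: Descriptions) -> Optional[Descriptions]:
    """Union of both description sets, or None when the pair needs manual review."""
    if descriptions_conflict(a, b):
        return None  # the bot cannot decide which text is better, so it skips
    merged = dict(b)
    merged.update(a)
    return merged
```

Because a merge only happens when no language has two competing texts, no description is ever overwritten; every conflicting pair is left for a human.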
So we have this backlog. With this we'll clear that, but what happens after that? Maybe we can create daily reports for admins to check? Could be based on the date of the last revision. Multichill (talk) 20:20, 24 February 2014 (UTC)
If you want, my bot can post a deletion log somewhere, giving the reason for each deletion. It's not a big deal. Amir (talk) 17:02, 25 February 2014 (UTC)
@Ivan A. Krestinin: It's fixed now. @Ricordisamoa: this is great, but I can't use it for several reasons: 1- I use compat, not core (I'd love to port it, but I don't have time for that right now). 2- The most important issue here is to check that the item is empty enough to delete, and because the Q-number doesn't matter anymore, my code won't do merging now (I wrote code to merge, but I can't use it because the Q-number no longer matters, and it's not okay to copy the content of a big item to another item just because the latter has a lower Q-id). Amir (talk) 13:45, 8 March 2014 (UTC)
According to bot policy, yes. Also, you might want to run your bot on your main admin account and make a few test deletions, if you think it needs testing. The Anonymouse [talk] 15:42, 8 May 2014 (UTC)
Thanks for posting code Amir. I think the skip condition "if linkpage.namespace()==0:" should be more inclusive, or possibly even removed (initially?). I have had items deleted when they were being discussed on a task force discussion page, and for a non-admin this is very annoying as all that is left is Qddddd, with no way to remember what it was. (I had one restored, but there is still one red link at Wikidata talk:Sport results task force). Also consider that there might be an active discussion about the item on WD:RfD or WD:PC. Perhaps run the bot skipping any item with an incoming link until the backlog is cleared, then let humans review the backlog of items with incoming links from other namespaces, and then try to introduce some logic which only deletes the ones that are not valuable. John Vandenberg (talk) 17:16, 13 May 2014 (UTC)
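To make the two skip policies concrete, here is a hedged sketch that works on a plain list of the namespace numbers of pages linking to an item. The `should_skip` helper and its `blocking_namespaces` parameter are hypothetical and are not taken from the posted bot code.

```python
from typing import Iterable, Optional, Set

MAIN_NAMESPACE = 0  # item pages


def should_skip(incoming_namespaces: Iterable[int],
                blocking_namespaces: Optional[Set[int]] = None) -> bool:
    """Decide whether an empty item must be left alone.

    blocking_namespaces=None means any incoming link blocks deletion
    (the conservative policy); a set restricts the check to those namespaces.
    """
    links = list(incoming_namespaces)
    if blocking_namespaces is None:
        return bool(links)
    return any(ns in blocking_namespaces for ns in links)


# A link from a project talk page (namespace 5), e.g. a task force discussion.
task_force_link = [5]
print(should_skip(task_force_link, {MAIN_NAMESPACE}))  # main-namespace-only check
print(should_skip(task_force_link))                    # any-incoming-link check
```

With the main-namespace-only check the item discussed on a task force page would still be deleted, whereas the conservative policy keeps it until a human has reviewed the link.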
I think excluding the user and user talk namespaces is good enough. What do you think? Amir (talk) 11:39, 14 May 2014 (UTC)
@Ladsgroup:, just checking ... do you mean something like the following?