Wikidata:Bot requests/Archive/2016/05

Merging items (de-sv)

A lot of new, apparently duplicate items were recently added to User:Pasleim/projectmerge/dewiki-svwiki, candidates for merging: items for homonymous disambiguation pages on the Swedish (sv:Kategori:Robotskapade förgreningssidor) and German (de:Kategorie:Begriffsklärung) Wikipedias. I think such pairs of items should be merged wherever there are no sitelink conflicts. XXN, 19:32, 29 May 2016 (UTC)

Note that there are also a lot of duplicates within svwiki itself. They cannot be merged today, but may be in the distant future. -- Innocent bystander (talk) 16:04, 2 June 2016 (UTC)
  Done ~600. XXN, 00:20, 5 June 2016 (UTC)
This section was archived on a request by: XXN, 00:20, 5 June 2016 (UTC)

The cost of the hryvnia

Hi. Here is a potentially very interesting but not easy task, requiring a bot, preferably running on a regular basis via Wikitech. The National Bank of Ukraine publishes daily exchange rates in XML and JSON (http://bank.gov.ua/NBUStatService/v1/statdirectory/exchange?xml, http://bank.gov.ua/NBUStatService/v1/statdirectory/exchange?json). If a bot could fetch that data daily at about 9 am UTC+3 and update Hryvnia (Q81893), the relevant templates could then fill projects with up-to-date statistical information. Here is an example of how the information is filled in [1] --Максим Підліснюк (talk) 23:53, 11 May 2016 (UTC)
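For illustration, the fetching side of such a bot could look like the following Python sketch. The JSON field names (cc, rate, exchangedate) are assumptions based on the NBU API described above, and a real bot would add the Wikidata update step (e.g. via pywikibot).

```python
# Minimal sketch: fetch the NBU daily rates and extract the UAH price of a
# few foreign currencies. Field names are assumptions based on the NBU API.
import requests

NBU_URL = "http://bank.gov.ua/NBUStatService/v1/statdirectory/exchange?json"
WANTED = {"USD", "EUR", "RUB"}  # currencies discussed in this thread

def fetch_rates():
    rows = requests.get(NBU_URL, timeout=30).json()
    return {row["cc"]: (row["rate"], row["exchangedate"])
            for row in rows if row.get("cc") in WANTED}

if __name__ == "__main__":
    for cc, (rate, date) in sorted(fetch_rates().items()):
        # "rate" is the number of hryvnias one unit of the currency costs,
        # i.e. 1 USD = ~25 UAH, not the other way around (see below).
        print("%s: 1 %s = %s UAH" % (date, cc, rate))
```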

I could do it. Do we need to consider licensing issues? – T.seppelt (talk) 18:44, 12 May 2016 (UTC)
@T.seppelt: No. First, these are statistics, which involve no authorial creativity. Second, this data is issued by the government of Ukraine (the National Bank of Ukraine) within its powers and is therefore free according to Art. 10 of the Law of Ukraine «On Copyright and Related Rights». Finally, according to ch. 2, Art. 10-1 of the Law of Ukraine «On Access to Public Information», public information in the form of open data may be freely reused and redistributed: anyone is free to copy, publish, distribute and use it, including for commercial purposes and in combination with other information or by including it in their own products, with attribution to the source of the information. --Максим Підліснюк (talk) 19:11, 12 May 2016 (UTC)
@Максим Підліснюк:   Done I added the task to the crontab; it will run every day. Please check the bot's edits over the next few days. I limited it to Euro (Q4916) and United States dollar (Q4917). This can of course be changed, but I thought these currencies were the most relevant. -- T.seppelt (talk) 08:26, 16 May 2016 (UTC)
@T.seppelt: please add Russian ruble (Q41044). Thanks --Максим Підліснюк (talk) 19:21, 17 May 2016 (UTC)
@Максим Підліснюк: I'll do it in the coming days. -- T.seppelt (talk) 19:26, 17 May 2016 (UTC)
@T.seppelt: something is wrong: you say that Hryvnia (Q81893) = ~25 USD, but in reality it is the reverse: 1 USD = ~25 UAH. So you should invert these numbers, or put the data on the opposite items. --Infovarius (talk) 16:03, 3 June 2016 (UTC)
@Infovarius: I fixed the bug and also added Russian ruble (Q41044). -- T.seppelt (talk) 08:27, 4 June 2016 (UTC)
This section was archived on a request by: T.seppelt (talk) 16:09, 10 June 2016 (UTC)

Fix globes for non-Earth coordinates

While phab:T56097 remains unfixed, it's impossible to enter non-Earth coordinates with the right globe using the website, so we have to choose between adding coordinates with the wrong globe or not adding the coordinates at all.

I'm looking for someone who can run a bot regularly (once a day would be good) which looks for items where the globe of the coordinates doesn't match the located on astronomical body (P376) statement and then updates the globe of the coordinates. This query might be useful.

- Nikki (talk) 11:09, 3 May 2016 (UTC)
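For illustration, such a bot could be sketched as follows with pywikibot and the query service. This is a sketch under assumptions (notably pywikibot's Coordinate globe_item parameter, and that the mismatch is on the first P625 statement), not LocatorBot's actual code.

```python
# Sketch: find items whose coordinate globe disagrees with P376 and
# re-save the coordinate with the globe taken from P376. Illustrative
# only; a production bot would need throttling and error handling.
import pywikibot
from pywikibot import pagegenerators

QUERY = """
SELECT DISTINCT ?item WHERE {
  ?item p:P625/psv:P625 [ wikibase:geoGlobe ?globe ] ;
        wdt:P376 ?body .
  FILTER(?globe != ?body)   # coordinate globe != located-on body
}
"""

site = pywikibot.Site("wikidata", "wikidata")
for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=site):
    item.get()
    body = item.claims["P376"][0].getTarget()   # the astronomical body item
    claim = item.claims["P625"][0]
    old = claim.getTarget()
    new = pywikibot.Coordinate(old.lat, old.lon, precision=old.precision,
                               globe_item=body, site=site)
    claim.changeTarget(new, summary="set coordinate globe to match P376")
```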

See https://www.wikidata.org/wiki/User:LocatorBot. If you put a correct located on astronomical body (P376) statement on the item, it will do the rest. --Smalyshev (WMF) (talk) 00:41, 4 May 2016 (UTC)
This section was archived on a request by: --Pasleim (talk) 11:45, 27 June 2016 (UTC)

Import Identifiers from Commons Authority Control templates

Commons has finally joined the Wikidata community with the enabling of arbitrary access on Commons, and I have just rewritten c:Module:Authority control to use identifiers from Wikidata. When comparing Wikidata and Commons identifiers, there are occasional mismatches, as well as Wikidata items missing identifiers that are specified on Commons. Help is needed to write bots that can copy identifiers from Commons to Wikidata. See c:Category:Authority control maintenance subcategories. --Jarekt (talk) 16:19, 12 May 2016 (UTC)

Something for User:T.seppelt? --Pasleim (talk) 17:03, 12 May 2016 (UTC)
Definitely, I have a script ready. Unfortunately I'm off for the weekend. Next week I'm going to file the request for approval on Commons. – T.seppelt (talk) 18:32, 12 May 2016 (UTC)
T.seppelt, if you need any help with the request or with navigating Commons, please contact me directly on my talk page on Commons. --Jarekt (talk) 13:17, 18 May 2016 (UTC)
We also have hundreds of pages with identifiers on Commons that need to be copied to Wikidata. See for example c:Category:Wikidata with missing NLA identifier. All Creator and Institution templates there have an NLA identifier while the corresponding Wikidata items do not have Libraries Australia ID (P409). Those "just" need to be imported to Wikidata without touching any pages on Commons (or getting approval there). We also have categories like c:Category:Pages with mismatching GND identifier, where GND ID (P227) does not match the GND value stored on Commons: either one of them is wrong, or both are correct. --Jarekt (talk) 13:14, 18 May 2016 (UTC)
My bot will take care of this. It's part of the process which I already used on several Wikimedia projects. Thank you Jarekt, -- T.seppelt (talk) 13:27, 18 May 2016 (UTC)
@Jarekt: It's not that simple: as the data is imported into Wikidata, it should be removed from Commons, so that it is stored once, not twice. Commons' templates will then fetch the data values from Wikidata as required. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:11, 18 May 2016 (UTC)
Andy, once the data is moved to Wikidata, the pages will automatically move to c:Category:Pages using authority control with parameters matching Wikidata. I then routinely remove identifiers from {{Authority Control}} templates found there, so they only use Wikidata. It seems that T.seppelt has a working process which he has used "on several Wikimedia projects", and I might be reinventing the wheel here, but it seems to me that in the majority of cases all we need to do is add the Wikidata Q-code, verify that the identifiers match Wikidata, and remove the identifiers from Commons. Only a small minority of files need more complicated processing. --Jarekt (talk) 16:21, 18 May 2016 (UTC)
I was probably a bit unclear about what my bot actually does: it loads all pages in a tracking category for pages with local (redundant or not) parameters. These values are then compared to the identifiers stored in Wikidata. If values are missing, the bot adds them to Wikidata. At the end it removes all values from Commons which can be found on Wikidata. Since everything on Commons is a bit more complicated (the Q parameter etc.), I'm very glad that you created a pretty smart module and a more advanced tracking system. The bot will manage it. I'd recommend letting it start and then seeing what we can do with the left-overs. Warm regards, –T.seppelt (talk) 05:25, 19 May 2016 (UTC)
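For reference, the workflow described above could be sketched roughly as follows in Python with pywikibot. The parameter/property pair (NLA/P409, taken from the example category above) and the way the item is found (a wikidata= template parameter) are illustrative assumptions, not the actual bot code.

```python
# Rough sketch of the described workflow: read a local identifier from a
# tracked Commons page, copy it to the linked Wikidata item if missing,
# then drop the local parameter. Regexes are deliberately naive.
import re
import pywikibot
from pywikibot import pagegenerators

commons = pywikibot.Site("commons", "commons")
repo = pywikibot.Site("wikidata", "wikidata")

PARAM, PROP = "NLA", "P409"  # Libraries Australia ID, per the example above
cat = pywikibot.Category(commons,
                         "Category:Wikidata with missing NLA identifier")

for page in pagegenerators.CategorizedPageGenerator(cat):
    text = page.text
    qid = re.search(r"\|\s*wikidata\s*=\s*(Q\d+)", text)      # assumed param
    local = re.search(r"\|\s*%s\s*=\s*([^|}\n]+)" % PARAM, text)
    if not (qid and local):
        continue
    item = pywikibot.ItemPage(repo, qid.group(1))
    item.get()
    if PROP not in item.claims:  # identifier missing on Wikidata: copy it
        claim = pywikibot.Claim(repo, PROP)
        claim.setTarget(local.group(1).strip())
        item.addClaim(claim, summary="import %s from Commons" % PROP)
    # once the value is on Wikidata, the local parameter can be removed
    page.text = text.replace(local.group(0), "")
    page.save(summary="remove %s parameter now stored on Wikidata" % PARAM)
```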
This section was archived on a request by: --Pasleim (talk) 17:36, 7 July 2016 (UTC)

Undo edits by Edoderoobot

User:Edoderoobot has added thousands of incorrect instance of (P31) statements (see User talk:Edoderoobot#Hundreds_.28if_not_thousands.29_of_incorrect_P31_statements). It's now been nearly 10 weeks since User:Edoderoo was first told about it, and despite numerous messages I have seen no progress at all. Although Wikidata:Bots makes it quite clear that it's Edoderoo's responsibility to clean up the mess, the incorrect statements have been there far too long already and I don't want them to remain indefinitely, so I'm asking here to see if someone else is willing to help.

The problematic statements are instance of (P31) statements with imported from Wikimedia project (P143) Swedish Wikipedia (Q169514) as a reference, so I'd like a bot to go through all the edits by Edoderoobot and remove any statement it added that matches that pattern as being questionable. If the bot is able to determine whether a statement is actually correct and only remove the incorrect ones, that would be a bonus.

- Nikki (talk) 17:32, 21 May 2016 (UTC)
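For reference, the removal described above could be sketched as follows with pywikibot and the query service. This is an illustration only: it targets any P31 statement referenced with "imported from: Swedish Wikipedia" regardless of who added it, so restricting it to Edoderoobot's edits would additionally require inspecting each item's history.

```python
# Sketch: strip P31 claims whose reference says "imported from (P143):
# Swedish Wikipedia (Q169514)", the pattern Nikki describes above.
import pywikibot
from pywikibot import pagegenerators

QUERY = """
SELECT DISTINCT ?item WHERE {
  ?item p:P31 ?st .
  ?st prov:wasDerivedFrom/pr:P143 wd:Q169514 .
}
"""

site = pywikibot.Site("wikidata", "wikidata")
for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=site):
    item.get()
    for claim in list(item.claims.get("P31", [])):
        from_svwiki = any(
            source.getTarget() is not None
            and source.getTarget().id == "Q169514"
            for ref in claim.sources          # each ref maps prop -> claims
            for source in ref.get("P143", []))
        if from_svwiki:
            item.removeClaims(
                [claim],
                summary="remove questionable P31 imported from svwiki")
```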

@Alphos: Could you help here? Matěj Suchánek (talk) 12:30, 22 May 2016 (UTC)
I most definitely can, but I only just noticed this ping, and it's getting awfully late for me - and unlike some, I'd rather be available to shut RollBot down immediately, should anything undesirable happen
The last "offending" edit seems to be on May 5th (and Edoderoobot seems to be inactive since then), but could you point me to the first one, or at least give me the date ? If not don't worry, I'll find it tomorrow.
Another note: some of these edits appear to be legitimate, and RollBot cannot discriminate: should it revert all of them nonetheless?
Alphos (talk) 22:37, 22 May 2016 (UTC)
@Nikki: It's been a few days, and I haven't started RollBot on the task yet - a good thing, as it turns out I was thinking of another bad set of edits Edoderoobot made, for which I'll probably contact Tacsipacsi, EncycloPetey, Tobias1984 and Multichill to offer RollBot's services, once the task at hand is done.
I really need more details (mainly the first and last edits in that set of "bad" edits), and possibly a decision as to "nuking" (reverting pages to their former state regardless of edits made by other users since, thus also reverting subsequent edits by other people); RollBot usually doesn't nuke, leaving pages with intervening edits by other users alone and listing them instead, but it can nuke.
It may seem counterintuitive, but my bot doesn't technically revert edits, it brings pages/entities to their prior state, and there is a difference.
Alphos (talk) 13:46, 26 May 2016 (UTC)
I'm really not sure when it started or ended. :( The biggest problem for me is finding the edits. Edoderoobot was doing multiple bot tasks simultaneously, so the edits adding incorrect P31 statements are mixed in with thousands of unrelated edits to descriptions, and there are far more edits than I can possibly go through by hand, so I can't actually find all the bad edits. The examples I have to hand are from the 2nd and 11th of March. I'm not aware of any bad edits since I first reported it on the 15th of March (but I could just have missed them amongst all the other edits), and I think I've also seen bad edits from February.
Since there were multiple things happening at the same time, reverting everything between two dates would also revert lots of unrelated edits, and I'm not sure how many of those also had issues. It would work a lot better if the bot could filter based on the edit summary; the description edits have a different summary from the P31 ones.
I'm not sure about nuking. Of the handful of examples I have to hand, most have been edited since: some of those edits are good, some are bad (based on the bad bot edits), others just undid the bad edits. If there were a list of items (and it's not too many), I could maybe check them by hand, but like I said, I can't even find the bad edits. :/ - Nikki (talk) 14:52, 26 May 2016 (UTC)
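For reference, the summary filter Nikki suggests could be prototyped against the contributions API; here is a minimal Python sketch with pywikibot. The summary substring is an assumption and would need to be confirmed from a known bad edit first.

```python
# Sketch: list Edoderoobot's edits in the suspect window whose edit
# summary looks like a P31 addition, so they can be reviewed/reverted.
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
start = pywikibot.Timestamp(2016, 3, 15)  # newer bound (newest to oldest)
end = pywikibot.Timestamp(2016, 2, 1)     # older bound

for contrib in site.usercontribs(user="Edoderoobot", start=start, end=end):
    comment = contrib.get("comment", "")
    if "[[Property:P31]]" in comment:     # assumed summary pattern
        print(contrib["revid"], contrib["title"], comment)
```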
Bots that do several tasks at once are a nightmare, and it's even worse when the tasks aren't thoroughly tested first >_<
But now that I have the dates (further in the past than I initially thought), I can try to see whether there's a way to do it (whether, for instance, there are time slots where Edoderoobot worked on one task rather than another, and work in segments on the slots for that P31 task), or whether, per your suggestion, I can (or should) alter RollBot to add a summary-matching condition, whether by blacklisting or whitelisting or both - this alteration will however take me some more time to get into, my health being what it is.
I'll keep you posted either way.
I'll also ask the contributors to the second complaint I saw on that bot's talk page if they want me to do anything about it.
Alphos (talk) 15:30, 26 May 2016 (UTC)
@Nikki: It seems that, for Edoderoobot's March spree, Stryn has already removed a significant chunk using Autolist (example) - more or less all the ones I looked at before starting RollBot.
That doesn't excuse the fact that Edoderoo didn't do it in over a month, with Stryn having done it in late April instead. It does, however, mean that RollBot is probably not needed here.
On another note, thanks for the summary-matching suggestion; I'll definitely think about implementing it
Alphos (talk) 16:31, 26 May 2016 (UTC)
PS: I'll contact Tacsipacsi, EncycloPetey, Tobias1984 and Multichill to see if they need help with their issue with Edoderoobot.
I don't need help, thanks. I work on a very limited set of items here regularly. My issue was the addition of English phrases to data items as Dutch labels, and the bot continued to make the same mistake after the user was alerted and responded. For the literary entries where I saw this happen, I've cleaned up the problems already. For the taxonomic entries, I haven't looked them over because the issue will be far more complicated: many plants, animals, and other organisms will have no Dutch name and will be known only by their Latin binomial. --EncycloPetey (talk) 17:47, 26 May 2016 (UTC)
Thanks for your message  
I noticed it too after reading the section of Edoderoo's talk page you replied in. The same occurred for categories, where the English labels were pretty much copy-pasted into the Dutch labels, which Multichill made a note of.
Thanks also for your work rolling back those changes. Next time, you may consider asking RollBot to do it for you   It's a bit crude ("Hulk SMASH!", if you will), but it does the deed!
I'll wait for the other users to chime in on what they noticed.
Alphos (talk) 18:31, 26 May 2016 (UTC)
I have reverted an enormous number of edits, especially the ones where the source was sv-wiki and the item didn't have an sv-wiki sitelink. I believe this request can therefore be archived? Edoderoo (talk) 17:03, 11 August 2016 (UTC)
This section was archived on a request by: --Pasleim (talk) 12:00, 26 September 2016 (UTC)

date of birth (P569) with century precision (7)

For date of birth (P569) with century precision (7), values could be changed as follows:

Change | Sample (displays as "20. century")                             | WQS
From   | +(century)00-00-00T00:00:00Z/7, e.g. +2000-00-00T00:00:00Z/7   | 2000-01-01
To     | +(century-1)01-00-00T00:00:00Z/7, e.g. +1901-00-00T00:00:00Z/7 | 1901-01-01

For dates of birth, it seems that for century-precision "dates" it would be better to specify the first year of the century rather than the last one.

When queried at WQS, these appear as January 1.
--- Jura 07:38, 16 May 2016 (UTC)
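A small sketch of the proposed value rewrite, assuming timestamps in the usual Wikibase form shown above; BCE (negative) years would need separate handling.

```python
# Turn a century-precision value expressed as the century's last year
# ("+2000-00-00T00:00:00Z", precision 7, "20. century") into its first
# year ("+1901-00-00T00:00:00Z"), as proposed above.
def first_year_of_century(value):
    sign, year, rest = value[0], int(value[1:5]), value[5:]
    assert sign == "+", "BCE years not handled in this sketch"
    assert year % 100 == 0, "expects the century given as its last year"
    return "%s%04d%s" % (sign, year - 99, rest)

assert first_year_of_century("+2000-00-00T00:00:00Z") == \
    "+1901-00-00T00:00:00Z"
```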

  Oppose With the current implementation of the time datatype, lower-order elements can be omitted for reduced precision without any further calculation. --Pasleim (talk) 09:14, 16 May 2016 (UTC)
That actually leads you to mix up 20th-century people (born maybe 1910) with people from the 21st century (born 2005).
--- Jura 09:59, 16 May 2016 (UTC)
I don't understand your example. A person born in 1910 has the value +1910-00-00T00:00:00Z/9, born in 2005 the value +2005-00-00T00:00:00Z/9, born in the 20th century the value +2000-00-00T00:00:00Z/7 and born in the 21st century the value +2100-00-00T00:00:00Z/7. If precision 9 is given, you have to omit everything except the first 4 digits; with precision 7, you have to omit everything except the first 2 digits. --Pasleim (talk) 10:34, 16 May 2016 (UTC)
The example I have in mind is a person born (maybe) in 1910 using +2000-00-00T00:00:00Z/7, compared to a person born in 2005 using +2005-00-00T00:00:00Z/9. If you just use wdt:P569 rounded to the century digits, you get "20" for both.
--- Jura 15:30, 16 May 2016 (UTC)

Labels from name properties

For people for whom we know family name (P734) and given name (P735), it could be possible to autogenerate a label in several relevant languages following certain rules. Could someone set up a bot to do this? author  TomT0m / talk page 10:13, 15 May 2016 (UTC)

  Notified participants of WikiProject Names
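A minimal sketch (assuming pywikibot) of what such a rule could look like. The use of the English labels of the name items as the source, and the restriction to items with exactly one given and one family name, are assumptions; as the discussion below shows, any real bot would also need a vetted per-language "safe list".

```python
# Build "<given> <family>" from the P735/P734 name items, or return None
# when the rule cannot be applied safely.
import pywikibot

def suggested_label(item):
    item.get()
    given = item.claims.get("P735", [])
    family = item.claims.get("P734", [])
    if len(given) != 1 or len(family) != 1:
        return None  # multiple/middle names etc.: skip, see objections below
    parts = []
    for claim in (given[0], family[0]):
        name_item = claim.getTarget()
        if name_item is None:        # "unknown value" / "no value"
            return None
        name_item.get()
        label = name_item.labels.get("en")
        if not label:
            return None
        parts.append(label)
    return " ".join(parts)

site = pywikibot.Site("wikidata", "wikidata")
item = pywikibot.ItemPage(site, "Q2831")  # Michael Jackson, example below
print(suggested_label(item))  # "Michael Jackson", if statements/labels exist
```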

@TomT0m: I can imagine working on this, but I feel it could be controversial (therefore I want more comments on this). Query for this: tinyurl.com/zzfoooo. Matěj Suchánek (talk) 09:45, 12 July 2016 (UTC)
I have seen people insist on one item for each spelling of a name, which means an approach like this would be unreliable (at best) when languages don't copy the original spelling. I think something based on this idea could work though if it takes into account things like where the person is from and the target language (and it would be better if people who speak the target language can confirm that they would expect the original spelling to be used for all people from those countries, because there might be things which are different that we're not aware of).
For example, I can't think of many examples of British people whose names are written differently in German so if a person is British and the names match the English label, using the same label for the German label sounds like it would be very unlikely to cause a problem. At the other extreme, Japanese writes almost all foreign names in katakana based on the pronunciation, so Michael Jackson (Q2831) (American), Michael Mittermeier (Q45083) (German) and Michael Laudrup (Q188720) (Danish) are all written differently in Japanese despite all having the same given name (P735) Michael (Q4927524) statement.
Generating the expected name from the statements and comparing it to the most appropriate label seems like a good sanity check. If the name expected from the statements doesn't match the actual label, there must be a reason for it. Some of the labels or statements could be wrong and need fixing, or perhaps the person is most commonly known by a different name.
Looking at that query, a few things already stand out to me: It says "Kirsten Johnson" for Czech for Kirsten Johnson (Q6416089), but the Czech sitelink for Betsey Johnson (Q467665) is "Betsey Johnsonová". For Azeri it also says "Kirsten Johnson", but the Azeri sitelink for Boris Johnson (Q180589) is "Boris Conson". It says "Bert Jansen (příjmení)" for Czech for Bert Jansen (Q1988186).
- Nikki (talk) 10:36, 15 July 2016 (UTC)
I support doing this, per the proposal, for some languages, but not for others. I'd be happy to collaborate on drawing up a "safe list" of languages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:41, 12 July 2016 (UTC)
P.S. See also #Taxon labels, above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:00, 12 July 2016 (UTC)
Just be careful: names originating in Cyrillic script are transcribed differently depending on the source and target languages. In Swedish we also have different transcriptions of Middle Eastern and North African Arabic names. -- Innocent bystander (talk) 19:14, 12 July 2016 (UTC)
To be honest, @TomT0m: I'm really, really wary about this. I have seen too many P735/P734 errors to believe we can accurately expand on them, and too many correct P735/P734 uses from which the label cannot be deduced (for example, a second or third given name used as the primary one, pseudonyms used as family names, nobility, etc.). It seems to me that every label would have to be checked manually, and that's not possible with a bot. @Ash_Crow: had started working on something a little different: if people had the exact same label in English, French and German, he expanded that label to all languages with Latin script and the same naming conventions. --Harmonia Amanda (talk) 09:32, 20 July 2016 (UTC)
@Harmonia Amanda: Which is worse: an item with no label at all, and thus possibly very difficult to identify, or an item with a close but inaccurate label? Considering that labels are probably missing in a very large number of languages in most cases, I think imperfect information is better than no information at all. Besides, a good way to improve data quality is actually to use the data, which helps spot and correct errors. I guess that to be really useful as well as easy to maintain, the bot should check whether a name property has been modified since the last time it set the label. That way corrections would propagate to each language in a minimal number of edits, and it would be clear that we should focus on the name properties to optimise the cross-project effort. author  TomT0m / talk page 09:41, 20 July 2016 (UTC)
Given the number of errors already present, especially for ancient/medieval people (Romans for a start) when names used to be translated, I'd be *very* careful.
I suggest you limit your action to people who:
  1. lived in the last century or so
  2. already have at least one label in one language matching the first+last name combination
  3. don't have a pseudonym (P742) or nickname (P1449).
Ash Crow (talk) 10:53, 20 July 2016 (UTC)
Strong reticence. This task would lead to too many false positives. Furthermore, there is an active community around the question of names, available to work on these topics. The bot would impede their work. --Dereckson (talk) 12:12, 20 July 2016 (UTC)
@Dereckson, Ash Crow, Harmonia Amanda: Can you be (publicly) more specific about how that work would be impeded? Maybe solutions can be found to make everyone happy.
I already have one suggestion: add the computed label only as an alias. author  TomT0m / talk page 12:28, 20 July 2016 (UTC)

@Matěj Suchánek: are you still interested in working on this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:51, 10 October 2016 (UTC)

I don't usually take on potentially controversial tasks, and many concerns and objections were raised here. So not for the time being. Matěj Suchánek (talk) 12:58, 10 October 2016 (UTC)
This section was archived on a request by: --- Jura 12:11, 21 November 2016 (UTC)