Wikidata:Bot requests/Archive/2018/07

Set Preferred rank in Brazil (Q155)

There are 87 items with both countries Brazil (Q155) and Portuguese Empire (Q200464). Brazil should have Preferred rank. I think that @PreferentialBot: can do this; please see: https://petscan.wmflabs.org/?psid=5044537 --JotaCartas (talk) 14:38, 5 July 2018 (UTC)

This section was archived on a request by: Edgars2007 (talk) 13:54, 15 July 2018 (UTC)

Linking categories in Spanish Wikisource

Request date: 8 June 2018, by: Ninovolador (Spanish WS admin)

Task description

I would like to systematically link some categories on Spanish Wikisource. Most subcategories under s:es:Category:Años follow the format Category:NXXXX, Category:FXXXX or Category:PXXXX, where XXXX is the year. N means "births", F "deaths" and P "published". I want to systematically link every subcategory to its equivalent.

So, basically, my request is to link (just an example, these categories are already linked):

And so for every subcategory under s:es:Category:Años.
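A minimal pywikibot sketch of how such a linking bot might work (untested; the enwiki category naming patterns and the "Categoría:" prefix matching are assumptions based on the description above, not a confirmed implementation):

  # Sketch: link s:es:Category:N1900-style categories to the Wikidata items
  # of their assumed enwiki equivalents (e.g. en:Category:1900 births).
  import re
  import pywikibot
  from pywikibot.exceptions import NoPageError

  # Assumed enwiki naming patterns for each prefix.
  PATTERNS = {"N": "Category:%s births",
              "F": "Category:%s deaths",
              "P": "Category:Works published in %s"}

  eswikisource = pywikibot.Site("es", "wikisource")
  enwiki = pywikibot.Site("en", "wikipedia")

  root = pywikibot.Category(eswikisource, "Category:Años")
  for cat in root.subcategories():
      m = re.fullmatch(r"Categoría:([NFP])(\d{4})", cat.title())
      if not m:
          continue
      prefix, year = m.groups()
      target = pywikibot.Category(enwiki, PATTERNS[prefix] % year)
      try:
          item = pywikibot.ItemPage.fromPage(target)  # item of the enwiki category
      except NoPageError:
          continue  # no Wikidata item for this enwiki category
      item.get()
      if "eswikisource" not in item.sitelinks:  # don't overwrite existing links
          item.setSitelink(cat, summary="Link Spanish Wikisource year category")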

Request process

@Ninovolador: Done this. There are some leftovers, however (fewer than the approx. 600 initial categories :) ). I did the mapping against enwiki (not -source), FYI. --Edgars2007 (talk) 14:55, 15 July 2018 (UTC)

@Edgars2007: Thank you very much!! -- Ninovolador (talk) 03:00, 16 July 2018 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 08:34, 22 July 2018 (UTC)

Import area codes P473 from CSV file

Request date: 9 July 2018, by: Krauss

Link to discussions justifying the request
Task description

Get the city table from github.com/datasets-br/city-codes/data/br-city-codes.csv
(see the columns wdId, the Wikidata item, and ddd, the city area code).

Example: Itu/SP has wdId=Q957653 and ddd=11 in the br-city-codes table, so I set the statement Q957653#P473 by hand. The right way is to do it by bot.
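A tiny sketch of turning that CSV into QuickStatements V1 commands (assuming the column names wdId and ddd as described above; P473 takes a string value, hence the quotes):

  import csv

  with open("br-city-codes.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          if row.get("wdId") and row.get("ddd"):
              # V1 format: item TAB property TAB value, strings double-quoted
              print('%s\tP473\t"%s"' % (row["wdId"], row["ddd"]))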

Licence of data to import (if relevant)

CC0

Discussion

@Krauss: Sounds like this can be done with Help:QuickStatements. --Marsupium (talk) 11:21, 13 July 2018 (UTC)

@Marsupium: Thanks, I need to do my homework (can you help?), but YES (!) it is a good solution. --Krauss (talk) 01:29, 15 July 2018 (UTC)
@Marsupium: and others, I edited (!) with a simple qid,P1585 /* comment */ CSV file... but on submitting, there are errors:
  • Q818261 is a "wrong" entity! The correct "Alagoinhas/BA" is Q22050101! But (as Q22050101 > Q22050101) how to delete or redirect it? Should I move the older to the newer, or the newer to the older?
  • Q975677 is a "wrong" entity! The correct "Antônio Carlos/SC" is Q22063985! But (as Q22063985 > Q975677) how to delete or redirect it? Should I move the older to the newer, or the newer to the older?
  • ... many similar errors
@Marsupium: first, the Wikipedia articles have to be merged. And only AFTER that can you merge these Wikidata items. Please don't change labels/descriptions in the way you did it. --Edgars2007 (talk) 13:12, 15 July 2018 (UTC)
Sorry, Marsupium, I'm a little bit too tired to do things right :D The ping was meant for @Krauss:. --Edgars2007 (talk) 13:13, 15 July 2018 (UTC)
Thanks @Edgars2007: I will not move or delete ~50 Wikidata items: please answer the question about "how to fix the Wikidata bug". I am asserting that "Q22050101 and Q818261 are duplicates" and that "Q22050101 is correct", so we can fix Q818261 --Krauss (talk) 14:38, 15 July 2018 (UTC)

PS: the discussion about duplicates is a side effect of my "want to import area codes P473" initiative, because I can't insert statements into wrong (e.g. non-city) entities. There are ~50 errors in ~5500 cities of Brazil. --Krauss (talk) 14:43, 15 July 2018 (UTC)

I already answered that question. The Wikipedia articles from the "wrong" Wikidata item have to be merged into the Wikipedia articles which are in the "right" Wikidata item. But for now they have to be marked as duplicates. You can use this gadget for that. --Edgars2007 (talk) 14:51, 15 July 2018 (UTC)

Thanks for the discussion, @Edgars2007: now I am back to the main task. I was using https://tools.wmflabs.org/quickstatements/#/batch/3277... But there are 26 errors in 26 items, and no description of the error. What is wrong with it??

qid,P473
Q304652,62 /*Abadia de Goiás/GO */
Q582223,34 /*Abadia dos Dourados/MG */
Q304716,62 /*Abadiânia/GO */
Q1615444,37 /*Abaeté/MG */
Q1615298,91 /*Abaetetuba/PA */
Q1796441,88 /*Abaiara/CE */

When I do it "by hand" there are no errors. See e.g. Q942803.

@Krauss: In both QuickStatements formats the strings have to be in double quotes (see Help:QuickStatements#Add simple statement), like this. In CSV format, for some reason it seems to work with four double quotes before and one after (the special meaning of double quotes in CSV seems to interfere here; I don't understand how, but it still works). And in CSV format the comments need their own "#" column, so like this:
qid,P473,#
Q304652,""""62",Abadia de Goiás/GO
Q582223,""""34",Abadia dos Dourados/MG
BTW: It would probably be good to add a source to the statements …
Cheers, --Marsupium (talk) 20:19, 15 July 2018 (UTC), edited 20:53, 15 July 2018 (UTC)
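One possible explanation for the quoting trick (a sketch of lenient CSV parsing, not QuickStatements' actual parser): per RFC 4180, a doubled quote inside a quoted field is an escaped quote, so a lenient parser reads """"62" as the literal text "62" including both quote characters, which happens to be exactly QuickStatements' quoted-string syntax. Python's csv module, for instance, parses it that way:

  import csv, io
  row = next(csv.reader(io.StringIO('Q304652,""""62",Abadia de Goiás/GO')))
  print(row)  # ['Q304652', '"62"', 'Abadia de Goiás/GO']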
@Marsupium: Thanks! It is a strange CSV syntax (what makes sense is "strin and the CSV quoting as """strin")... The extra quoting is not any standard CSV, see the W3C recommendation of 2016 or RFC 4180 of 2005... OK, let's dance to the music,
See my batches: #3291 br-ddd-b01a2, #3293 br-ddd-b01, #3294 br-ddd-b02, #3295 br-ddd-b03, #3296 br-ddd-b04, #3297 br-ddd-b05, #3298 br-ddd-b06. It is working, with ~5500 items!

... Only ~120 of ~5500 with errors after a lot of work; QuickStatements works fine (!!), a good result. But let's understand the little errors, which I am not understanding:

Hi @Marsupium: again a problem with QuickStatements, see batch/3312, where the input syntax was perfect and 2 items were done... But all the others marked with "error" have no error (!); I did them by hand and it is perfect, see a complete list with links. --Krauss (talk) 21:52, 16 July 2018 (UTC)

@Krauss: I'm sorry about that, but I'm afraid I don't know anything more than I've already written. I have to admit that unfortunately QuickStatements is often quite idiosyncratic. Sorry! --Marsupium (talk) 21:00, 17 July 2018 (UTC)
@Marsupium: and @Edgars2007: Thanks for all the help and discussion!
The "Import area codes P473 from CSV file" is completed.
This section was archived on a request by: Matěj Suchánek (talk) 08:23, 22 July 2018 (UTC)

Migrate to P1480


Request date: 12 July 2018, by: Swpb

Link to discussions justifying the request
Task description

Migrate all statements using nature of statement (P5102) = disputed (Q18912752) to sourcing circumstances (P1480) = disputed (Q18912752) (see query). Swpb (talk) 17:47, 12 July 2018 (UTC)
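A hedged pywikibot sketch of what the migration might look like (assuming, as is typical for these properties, that disputed (Q18912752) appears as a qualifier; the SPARQL is modeled on the request, since the linked query is not reproduced here):

  import pywikibot
  from pywikibot import pagegenerators

  # Items with at least one statement qualified P5102 = disputed (assumed query).
  QUERY = """
  SELECT DISTINCT ?item WHERE {
    ?item ?p ?statement .
    ?statement pq:P5102 wd:Q18912752 .
  }
  """

  repo = pywikibot.Site("wikidata", "wikidata").data_repository()
  disputed = pywikibot.ItemPage(repo, "Q18912752")

  for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
      item.get()
      for claims in item.claims.values():
          for claim in claims:
              for qual in list(claim.qualifiers.get("P5102", [])):
                  if qual.getTarget() == disputed:
                      claim.removeQualifier(qual)
                      new_qual = pywikibot.Claim(repo, "P1480", is_qualifier=True)
                      new_qual.setTarget(disputed)
                      claim.addQualifier(new_qual, summary="Migrate P5102 to P1480")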

Discussion
Request process

  Done --Pasleim (talk) 22:39, 4 October 2018 (UTC)

This section was archived on a request by: --Pasleim (talk) 22:39, 4 October 2018 (UTC)

CAPS cleanup

There is a series of entries like Q55227955, created by bot, that have the names inverted and the family names in CAPS. In the meantime, these labels have been copied to some other languages.
--- Jura 09:12, 5 July 2018 (UTC)
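A small illustration of the label fix (the exact "FAMILYNAME Givenname" pattern is an assumption from the description above; accented uppercase letters would need a wider character class):

  import re

  def fix_caps_label(label: str) -> str | None:
      """Turn 'DUPONT Jean' into 'Jean Dupont'; None if no match."""
      m = re.fullmatch(r"([A-Z][A-Z'-]+) (.+)", label)
      if not m:
          return None
      family, given = m.groups()
      return "%s %s" % (given, family.title())

  print(fix_caps_label("DUPONT Jean"))  # Jean Dupont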

heavy coord job

Request date: 14 July 2018, by: Conny

Link to discussions justifying the request
Task description

Wikidata items with pictures from Commons and the same coordinates on both projects should display on Wikidata whether the picture's coordinates are the camera position or the object position.

Discussion

@Conny: It's unclear what you want. Please explain it in more detail and provide an example edit. On Property talk:P625#camera/object_position I voiced my doubt if this should be on Wikidata. Multichill (talk) 14:06, 17 July 2018 (UTC)

@Multichill: We can exchange here again :). My sample is on the linked talk page. Thank you, Conny (talk) 04:17, 18 July 2018 (UTC).
Request process

Normalize dates with low precision

Request date: 17 July 2018, by: Jc86035

Link to discussions justifying the request
Task description

Currently all times between 1 January 1901 and 31 December 2000 with a precision of century are displayed as "20. century". If one enters "20. century" into a date field the date is stored as the start of 2000 (+2000-00-00T00:00:00/7), which is internally interpreted as 2000–2099 even though the obvious intent of the user interface is to indicate 1901–2000. Since the documentation conflicts with the user interface, there is no correct way to interpret this information, and some external tools which interpret the data "correctly" do not reflect editors' intent.

This is not ideal, and since it looks like this isn't going to be fixed in Wikibase (I have posted at Wikidata:Contact the development team but haven't got a response yet, and the Phabricator bugs have been open for years), I think a bot should convert every date value with a precision of decade, century or millennium to the midpoint of the period indicated by the label in English, so that there is less room for misinterpretation. For example, something that says "20. century" should ideally be converted to +1951-00-00T00:00:00/7 (or alternatively to +1950-00-00T00:00:00/7), so that it is read as 1901–2000 by humans looking at the item, as 1901–2000 (or 1900–1999) by some external tools, and as 1900–1999 by other external tools.

Classes of dates this would apply to:

  • Decades (maybe) – e.g. dates within 1950–1959 to +1954-00-00T00:00:00/8
  • Centuries – e.g. dates within 1901–2000 to +1951-00-00T00:00:00/7 or +1950-00-00T00:00:00/7 (depending on what everyone prefers)
  • Millennia – e.g. dates within 1001–2000 to +1501-00-00T00:00:00/6 or +1500-00-00T00:00:00/6

For everything less precise (and over five digits), the value is displayed as "X years" (e.g. −10000-00-00T00:00:00Z/5 is displayed as "10000 years BCE"). Incorrect precisions for years under five digits could otherwise be fixed, but it looks like the user interface just doesn't bother parsing them, because people don't name groups of a myriad years.

While this is obviously not perfect and not the best solution, it is better than waiting an indefinite time for the WMF to get around to it; and if the user interface is corrected then most of the data will have to be modified anyway. Values which have ambiguous meaning (i.e. those which can be identified as not having been added with the wikidata.org user interface) should be checked before normalization by means of communication with the user who added them. Jc86035 (talk) 11:33, 17 July 2018 (UTC) (edited 14:13, 17 July 2018 (UTC) and 16:16, 17 July 2018 (UTC))
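A sketch of the proposed conversion (assumptions: century and millennium values carry the off-by-one label problem described above, decade values are stored as the decade's first year, and the midpoint convention follows the examples in the list; negative years are deliberately left out, matching the caution about BCE dates):

  def normalize(value: str, precision: int) -> str:
      """Move a decade/century/millennium timestamp to its midpoint year."""
      year = int(value[1:value.index("-", 1)])  # "+2000-00-00T00:00:00Z" -> 2000
      if precision == 8:                  # decade: "+1950/8" labels 1950-1959
          midpoint = year + 4
      else:                               # century (7) or millennium (6)
          span = 100 if precision == 7 else 1000
          start = year - span + 1         # UI label intent: "+2000/7" = 1901-2000
          midpoint = start + span // 2    # 1951 for "20. century"
      return "+%04d-00-00T00:00:00Z/%d" % (midpoint, precision)

  print(normalize("+2000-00-00T00:00:00Z", 7))  # +1951-00-00T00:00:00Z/7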

Discussion

I think the reasoning applies to centuries and millennia. I'd have to think about it a bit longer for decades. While I'm thinking, perhaps Jc86035 would clarify the request by explicitly stating every precision the request applies to. Also, when discussing precision, I find it inadvisable to use the terms "lower" or "greater". Terms such as "better" and "worse" or "looser" and "tighter" seem better to me.

I suppose this bot would have to be run on a regular schedule until the problem is fixed by the developers (or the Wikidata project is shuttered, whichever comes first). Jc3s5h (talk) 12:19, 17 July 2018 (UTC)

@Jc3s5h: I think running it at least once a day would be good. I've edited the proposal so that it only applies to decades, centuries and millennia, because Wikibase handles less precise dates differently (in general the time handling seems very nonstandard to me, probably because most people don't need to represent 13.7 billion years ago in the same time format as yesterday). Jc86035 (talk) 14:48, 17 July 2018 (UTC)
  •   Support (edit conflict) I think it is a good idea. I have also found it confusing on occasion. If there are templates and other tools that interpret Wikidata dates incorrectly, that is their bug, and it is beyond our control to debug each tool that uses Wikidata. However, I think it is a good idea to default to the mid-point of the time period for some of the confusing ones. (I would not do it for years, but for decades, centuries and millennia, that would be fine.) --Jarekt (talk) 12:28, 17 July 2018 (UTC)
  •   Oppose Looks like a workaround instead of solving the real problem. Multichill (talk) 14:02, 17 July 2018 (UTC)
    @Multichill: Yes, obviously this isn't the best solution, but the Phabricator bug is three years old now, so it's not as if Wikibase's date handling is suddenly going to be improved after years of nothing; we may as well deal with it regardless of the pace of software development. The longer the issue sits around, the more data everyone has to revalidate after the issue is fixed. (They don't have enough staff to deal with comparatively minor things like this. Dozens of bugs e.g. in Kartographer have been closed as wontfix just because "there's no product manager".) Jc86035 (talk) 14:23, 17 July 2018 (UTC)
    Furthermore, currently ISO 8601 necessitates us using things like earliest date (P1319) and latest date (P1326) if there's any sort of non-trivial uncertainty range, yet Wikibase stores the user's initial value anyway. Wikibase does a lot of odd things like the aforementioned non-standard century handling and allowing "0 BCE" as a valid date. I don't think they have the resources to fix stuff like this. Jc86035 (talk) 14:43, 17 July 2018 (UTC)
  • Question: As an example, if the bot finds the datestamp +1900-01-01T00:00:00Z with precision 7, should it convert it to +1950-01-01T00:00:00Z or +1850-01-01T00:00:00Z? I do not believe a bot can trace back to when an entry was last changed and see if it was changed interactively or through the API. If through the API, I think the year should be 1950. If interactively, I think it should be 1850. Perhaps we could somehow examine the history of contributions, and see how many dates are entered interactively vs. through the API. If one is vastly more frequent than the other, we could go with whichever predominates.
Whatever we do, we should stay away from any year before AD 101. As Jc86035 points out, the issues are compounded for BCE, and there are also some tricky points before 101. Jc3s5h (talk) 21:39, 17 July 2018 (UTC)
  • @Jc3s5h: Could you not find it through differing edit summaries (e.g. lots of additions by one user with #quickstatements)? I think it would be difficult but it would be possible with something like the WikiBlame tool. Jc86035 (talk) 06:06, 18 July 2018 (UTC)
    I would like to see a substantial sample, which would have to be gathered automatically. For example, all the date edits made on Wikidata for an entire day on each of 100 randomly chosen days. Jc3s5h (talk) 11:56, 18 July 2018 (UTC)
Request process

Replace

Request date: 21 July 2018, by: Микола Василечко

Link to discussions justifying the request
Task description

In Ukrainian: change the description "сторінка значень в проекті Вікімедіа" to "сторінка значень проекту Вікімедіа" (both mean "Wikimedia disambiguation page"; the new wording drops the preposition "в"). --Микола Василечко (talk) 20:05, 21 July 2018 (UTC)

Licence of data to import (if relevant)
Discussion

That's 1,000,000+ replacements. Was there any discussion justifying this task? Or why should it be done? Matěj Suchánek (talk) 08:21, 22 July 2018 (UTC)
I'm thinking... Wouldn't such massive changes be better done directly in the DB? Say "update ... set ukdesc='сторінка значень проекту Вікімедіа' where ukdesc='сторінка значень в проекті Вікімедіа'". --Edgars2007 (talk) 12:15, 22 July 2018 (UTC)
Maybe, but I doubt this would be allowed. Don't forget you need a way to update the JSON of the items, propagate this to WDQS, etc... Matěj Suchánek (talk) 12:56, 22 July 2018 (UTC)
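For the record, an API-based sketch of the replacement (as opposed to the direct-DB route above, which would skip JSON/WDQS propagation); a single SPARQL query cannot return a million items, so a real run would page through a dump, but the mechanics would look like:

  import pywikibot
  from pywikibot import pagegenerators

  OLD = "сторінка значень в проекті Вікімедіа"
  NEW = "сторінка значень проекту Вікімедіа"

  repo = pywikibot.Site("wikidata", "wikidata").data_repository()
  QUERY = 'SELECT ?item WHERE { ?item schema:description "%s"@uk . } LIMIT 1000' % OLD

  for item in pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo):
      item.get()
      if item.descriptions.get("uk") == OLD:  # re-check against live data
          item.editDescriptions({"uk": NEW}, summary="Update Ukrainian description")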

Fix labels for P31=Q5

A series of items about people was created recently whose labels include the enwiki disambiguator. Sample: Q55647666. This should be removed from the label.
--- Jura 15:17, 24 July 2018 (UTC)
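A quick sketch of the cleanup (assuming the labels end in a parenthetical disambiguator, which should be verified against items like Q55647666 first):

  import re

  def strip_disambiguator(label: str) -> str:
      """Drop a trailing parenthetical, e.g. 'John Smith (born 1970)'."""
      return re.sub(r" \([^)]*\)$", "", label)

  print(strip_disambiguator("John Smith (American football)"))  # John Smith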