
Wikidata:Bot requests
If you have a bot request, add a new section using the button and state exactly what you want. To reduce the processing time, first discuss the legitimacy of your request with the community in the Project chat or on the relevant WikiProject's talk page. Please refer to previous discussions justifying the task in your request.

For botflag requests, see Wikidata:Requests for permissions.

Tools available to all users which can be used to accomplish the work without the need for a bot:

  1. PetScan for creating items from Wikimedia pages and/or adding the same statements to items
  2. QuickStatements for creating items and/or adding different statements to items
  3. Harvest Templates for importing statements from Wikimedia projects
  4. Descriptioner for adding descriptions to many items
  5. OpenRefine to import any type of data from tabular sources
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2019/01.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 2 days.

You may find these related resources helpful:

Dataset Imports · Why import data into Wikidata · Learn how to import data · Bot requests · Ask a data import question


Resolving Wikinews categories from "wikimedia category"

Request date: 12 August 2017, by: Billinghurst (talkcontribslogs)

Link to discussions justifying the request
Task description

A number of Wikinews items which are categories have been labelled as instance of (P31) -> Wikimedia category (Q4167836), which I am told is incorrect. It would be worthwhile running a query for where this occurs, removing these statements, and removing the corresponding labels where the Wikinews link is the only interwiki link on an item, while generating a list of items where it exists among a bundle of sister interwiki links. This will enable an easier merge of those independent items into their corresponding existing items, and a tidy-up of items that are confused.
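A query along these lines might list the affected items (a sketch, assuming the English Wikinews is the wiki of interest):

SELECT ?item ?article WHERE {
  ?item wdt:P31 wd:Q4167836 .                            # labelled as Wikimedia category
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikinews.org/> .  # has an enwikinews sitelink
}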

Discussion

Lymantria (talkcontribslogs) and I were at cross-purposes, as items with that statement were being merged into main items. We had a confluence of rule issues: merging Wikimedia category items into main items versus merging Wikinews categories into them. Now better understood.

Request process

Redirects after archival

Request date: 11 September 2017, by: Jsamwrites (talkcontribslogs)

Link to discussions justifying the request
Task description

Retain links to the original discussion sections on discussion pages even after archival, by creating redirects.

Licence of data to import (if relevant)
Discussion


Request process

Semi-automated import of information from Commons categories containing a "Category definition: Object" template

Request date: 5 February 2018, by: Rama (talkcontribslogs)

Link to discussions justifying the request
Task description

Commons categories about one specific object (such as a work of art, an archaeological item, etc.) can be described with a "Category definition: Object" template [1]. This information essentially duplicates what is, or should be, on Wikidata.

To prove this point, I have drafted a "User:Rama/Catdef" template that uses Lua to import all relevant information from Wikidata and reproduces all the features of "Category definition: Object", while requiring only the Q-number as a parameter (see Category:The_Seated_Scribe for instance). This template has the advantage of using Wikidata labels to render the information, and is thus much more multilingual than the hand-labelled version (try fr, de, ja, etc.).

I am now proposing to deploy another script to do the same thing the other way round: import data from the Commons templates into the relevant fields of Wikidata. Given the variety of ways a human can label or mislabel information in a template such as "Category definition: Object", I think the script should be a helper tool for importing data: it is to be run on one category at a time, with a human checking the result, and correcting and completing the Wikidata entry as required. So far I have been testing and refining my script on subcategories of [2] Category:Ship models in the Musée national de la Marine. You can see the result in the first 25 categories or so, and the corresponding Wikidata entries.

The tool is presently in the form of a Python script with a simple command-line interface:

./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 reads the information from Commons, parses it, renders the various fields in the console for debugging purposes, and creates the required Wikibase objects (e.g. text fields for inventory numbers, Q-items for artists and collections, WbQuantity for dimensions, WbTime for dates, etc.)
./read_commons_template.py Category:Scale_model_of_Corse-MnM_29_MG_78 --commit does all of the above, creates a new Q-item on Wikidata, and commits all the information to the relevant fields.

Ideally, once all the desired features are implemented and tested, this script might be useful as a tool where one could enter the

Licence of data to import (if relevant)

The information is already on Wikimedia Commons and is common public knowledge.

Discussion


Request process

Remove statement with Gregorian date earlier than 1584 (Q26961029)

SELECT ?item ?property (YEAR(?year) AS ?yr)
{
    hint:Query hint:optimizer "None".
    ?a pq:P31 wd:Q26961029 .
    ?item ?p ?a .
    ?a ?psv ?x .
    ?x wikibase:timeValue ?year .
    ?x wikibase:timePrecision 7 .
    ?x wikibase:timeCalendarModel wd:Q1985727 .
    ?property wikibase:statementValue ?psv .
    ?property wikibase:claim ?p .
}
LIMIT 15704

Try it!

The above dates have year precision and Proleptic Gregorian calendar (Q1985727) as calendar model. I think they could be converted to Julian and the qualifier statement with Gregorian date earlier than 1584 (Q26961029) removed.
--- Jura 09:19, 24 February 2018 (UTC)

  Support --Marsupium (talk) 23:01, 28 April 2018 (UTC)
Presumably some such statements are actually intended to be Gregorian year, no? --Yair rand (talk) 02:56, 5 September 2018 (UTC)
Sample? --- Jura 05:52, 4 December 2018 (UTC)
I checked some lunar eclipses from Antiquity (a kind of event for which we can compute accurate dates) and the dates were all stated in the Julian calendar and correctly entered into Wikidata. --Pyfisch (talk) 12:50, 21 December 2018 (UTC)

Crossref Journals

Request date: 27 March 2018, by: Mahdimoqri (talkcontribslogs)

Link to discussions justifying the request
Task description
  • Add missing journals from Crossref
Licence of data to import (if relevant)
Discussion


Request process

"place on Earth" descriptionsEdit

Can someone remove the descriptions added by the batches mentioned on AN? The users shouldn't have made those edits, and the descriptions are fairly useless. Please delete all in a single edit.
--- Jura 08:38, 15 May 2018 (UTC)

"AN" link is dead. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 29 May 2018 (UTC)
Not anymore. Matěj Suchánek (talk) 08:45, 22 July 2018 (UTC)
Dead, as of just now Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:12, 1 September 2018 (UTC)

Populating Commons maps category (P3722)

Request date: 30 May 2018, by: Thierry Caro (talkcontribslogs)

Link to discussions justifying the request
  • None.
Task description

Take all instances of subclasses of geographical object (Q618123). Look for those that have a Commons category (P373) statement and visit the corresponding Commons category. If it includes another category named "Maps of" followed by its own name, import this value as Commons maps category (P3722) to the item. This would be useful to the French Wikipedia, where we now have Category:Pages using Wikidata property P3722 (Q54473574) automatically populated through Template:Geographical links (Q28528875).
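As a sketch, candidate items could be pulled with a query like this (checking the Commons category tree itself would still need a bot or PetScan, and the large class tree may require batching to avoid timeouts):

SELECT ?item ?commonsCat WHERE {
  ?item wdt:P31/wdt:P279* wd:Q618123 ;   # instance of a subclass of geographical object
        wdt:P373 ?commonsCat .           # has a Commons category
  MINUS { ?item wdt:P3722 [] . }         # but no Commons maps category yet
}
LIMIT 1000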

  Comment. Hi. Is there still someone here? Thierry Caro (talk) 23:27, 8 November 2018 (UTC)
Licence of data to import (if relevant)
Discussion
Request process

Normalize references

Request date: 13 July 2018, by: Marsupium (talkcontribslogs)

Link to discussions justifying the request
Task description

Often a single source website or database is indicated inconsistently, in various manners. To improve this situation, some queries and follow-up edits on references could be made. This is a task that would best be done continuously and gradually adapted to more cases. Thus, perhaps this task fits well within the scope of DeltaBot, User:Pasleim?

  1. Add the ID property (and if feasible also stated in (P248)) to references with a corresponding reference URL (P854) where missing.
  2. Add stated in (P248) to references with a corresponding ID property where missing.
  3. (For later: merge references where a source website or database is (accidentally) used twice.)

The issue exists for ULAN ID (P245), RKDartists ID (P650) and probably many more source websites or databases. I have examined ULAN ID (P245), and these would be the queries and edits to be done:

  1. SELECT ?entity ?prop ?ref ?id WHERE {
      ?entity ?prop ?statement.
      ?statement prov:wasDerivedFrom ?ref.
      ?ref pr:P854 ?refURL.
      MINUS { ?ref pr:P245 []. }
      FILTER REGEX(STR(?refURL), "^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)")
      BIND(REPLACE(STR(?refURL),"^https?://(vocab.getty.edu/(page/)?ulan/|www.getty.edu/vow/ULANFullDisplay.*?&subjectid=)","") AS ?id)
    }
    LIMIT 500
    
    Try it! → add ULAN ID (P245) = ?id to the reference (now >1k cases)
  2. SELECT ?entity ?prop ?ref WHERE {
      ?entity ?prop [ prov:wasDerivedFrom ?ref ].
      ?ref pr:P245 [].
      MINUS { ?ref pr:P248 wd:Q2494649. }
    }
    
    Try it! → add stated in (P248) = Union List of Artist Names (Q2494649) to the reference (now ca. 70 cases)

Thanks for any comments!

Discussion


Request process

I'm working on it myself now, see Topic:Unj4mc05g2qpj1gs for the process. Any help or advice still welcome! --Marsupium (talk) 18:20, 30 October 2018 (UTC)

Normalize dates with low precision

Request date: 17 July 2018, by: Jc86035 (talkcontribslogs)

Link to discussions justifying the request
Task description

Currently all times between 1 January 1901 and 31 December 2000 with a precision of century are displayed as "20. century". If one enters "20. century" into a date field the date is stored as the start of 2000 (+2000-00-00T00:00:00/7), which is internally interpreted as 2000–2099 even though the obvious intent of the user interface is to indicate 1901–2000. Since the documentation conflicts with the user interface, there is no correct way to interpret this information, and some external tools which interpret the data "correctly" do not reflect editors' intent.

This is not ideal, and since it looks like this isn't going to be fixed in Wikibase (I have posted at Wikidata:Contact the development team but haven't got a response yet, and the Phabricator bugs have been open for years), I think a bot should convert every date value with a precision of decade, century or millennium to the midpoint of the period indicated by the label in English, so that there is less room for misinterpretation. For example, something that says "20. century" should ideally be converted to +1951-00-00T00:00:00/7 (or alternatively to +1950-00-00T00:00:00/7), so that it is read as 1901–2000 by humans looking at the item, as 1901–2000 (or 1900–1999) by some external tools, and as 1900–1999 by other external tools.

Classes of dates this would apply to:

  • Decades (maybe) – e.g. dates within 1950–1959 to +1954-00-00T00:00:00/8
  • Centuries – e.g. dates within 1901–2000 to +1951-00-00T00:00:00/7 or +1950-00-00T00:00:00/7 (depending on what everyone prefers)
  • Millennia – e.g. dates within 1001–2000 to +1501-00-00T00:00:00/6 or +1500-00-00T00:00:00/6

For everything less accurate (and over five digits), the value is displayed as "X years" (e.g. −10000-00-00T00:00:00Z/5 is displayed as "10000 years BCE"). Incorrect precisions for years under five digits could otherwise be fixed, but it looks like the user interface just doesn't bother parsing them because people don't name groups of a myriad years.

While this is obviously not perfect and not the best solution, it is better than waiting an indefinite time for the WMF to get around to it; and if the user interface is corrected then most of the data will have to be modified anyway. Values which have ambiguous meaning (i.e. those which can be identified as not having been added with the wikidata.org user interface) should be checked before normalization by means of communication with the user who added them. Jc86035 (talk) 11:33, 17 July 2018 (UTC) (edited 14:13, 17 July 2018 (UTC) and 16:16, 17 July 2018 (UTC))
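Following the pattern of the Gregorian-date query earlier on this page, here is a sketch of a query for the century-precision values such a bot would touch:

SELECT ?item ?property ?time WHERE {
  ?property wikibase:claim ?p ;
            wikibase:statementValue ?psv .
  ?item ?p ?stmt .
  ?stmt ?psv ?value .
  ?value wikibase:timeValue ?time ;
         wikibase:timePrecision 7 .    # 7 = century; 8 = decade, 6 = millennium
}
LIMIT 100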

Discussion

I think the reasoning applies to centuries and millennia. I'd have to think about it a bit longer for decades. While I'm thinking, perhaps Jc86035 would clarify the request by explicitly stating every precision the request applies to. Also, when discussing precision, I find it inadvisable to use the terms "lower" or "greater". Terms such as "better" and "worse" or "looser" and "tighter" seem better to me.

I suppose this bot would have to be run on a regular schedule until the problem is fixed by the developers (or the Wikidata project is shuttered, whichever comes first). Jc3s5h (talk) 12:19, 17 July 2018 (UTC)

@Jc3s5h: I think running it at least once a day would be good. I've edited the proposal so that it only applies to decades, centuries and millennia, because Wikibase handles less precise dates differently (in general the time handling seems very nonstandard to me, probably because most people don't need to represent 13.7 billion years ago in the same time format as yesterday). Jc86035 (talk) 14:48, 17 July 2018 (UTC)
  •   Support (edit conflict) I think it is a good idea. I have also found it confusing on occasion. If there are templates and other tools that interpret Wikidata dates incorrectly, that is their bug, and it is beyond our control to debug each tool that uses Wikidata. However, I think it is a good idea to default to the mid-point of the time period for some of the confusing ones. (I would not do it for years, but for decades, centuries and millennia it would be fine.) --Jarekt (talk) 12:28, 17 July 2018 (UTC)
  •   Oppose looks like a work around instead of solving the real problem. Multichill (talk) 14:02, 17 July 2018 (UTC)
    @Multichill: Yes, obviously this isn't the best solution, but the Phabricator bug is three years old now so it's not like Wikibase's date handling is suddenly going to be improved after years of nothing, so we may as well deal with it regardless of the pace of software development. The longer the issue sits around, the more data everyone has to revalidate after the issue is fixed. (They don't have enough staff to deal with comparatively minor things like this. Dozens of bugs e.g. in Kartographer have been closed as wontfix just because "there's no product manager".) Jc86035 (talk) 14:23, 17 July 2018 (UTC)
    Furthermore, currently ISO 8601 necessitates us using things like earliest date (P1319) and latest date (P1326) if there's any sort of non-trivial uncertainty range, yet Wikibase stores the user's initial value anyway. Wikibase does a lot of odd things like the aforementioned non-standard century handling and allowing "0 BCE" as a valid date. I don't think they have the resources to fix stuff like this. Jc86035 (talk) 14:43, 17 July 2018 (UTC)
  • Question: As an example, if the bot finds the datestamp +1900-01-01T00:00:00Z with precision 7, should it convert it to +1950-01-01T00:00:00Z or +1850-01-01T00:00:00Z? I do not believe a bot can trace back to when an entry was last changed and see if it was changed interactively or through the API. If through the API, I think the year should be 1950. If interactively, I think it should be 1850. Perhaps we could somehow examine the history of contributions, and see how many dates are entered interactively vs. the API. If one is vastly more frequent than the other, we could go with whatever predominates.
Whatever we do, we should stay away from any year before AD 101. As Jc86035 points out, the issues are compounded for BCE, and there are also some tricky points before 101. Jc3s5h (talk) 21:39, 17 July 2018 (UTC)
  • @Jc3s5h: Could you not find it through differing edit summaries (e.g. lots of additions by one user with #quickstatements)? I think it would be difficult but it would be possible with something like the WikiBlame tool. Jc86035 (talk) 06:06, 18 July 2018 (UTC)
    I would like to see a substantial sample, which would have to be gathered automatically. For example, all the date edits make on Wikidata for an entire day on each of 100 randomly chosen days. Jc3s5h (talk) 11:56, 18 July 2018 (UTC)
Request process

Request date: 21 July 2018, by: Микола Василечко (talkcontribslogs)

Link to discussions justifying the request
Task description
Licence of data to import (if relevant)
Discussion


Request process

Replace

In Ukrainian: "сторінка значень в проекті Вікімедіа" → "сторінка значень проекту Вікімедіа" ("Wikimedia project disambiguation page") --Микола Василечко (talk) 20:05, 21 July 2018 (UTC)

That's 1,000,000+ replacements. Was there any discussion justifying this task? Or why should it be done? Matěj Suchánek (talk) 08:21, 22 July 2018 (UTC)
I'm thinking... Wouldn't such massive changes be better done in DB directly... Say "update ... set ukdesc='сторінка значень проекту Вікімедіа' where ukdesc='сторінка значень в проекті Вікімедіа'". --Edgars2007 (talk) 12:15, 22 July 2018 (UTC)
Maybe but I doubt this would be allowed. Don't forget you need to have a way to update JSON of items, propagate this to WDQS etc... Matěj Suchánek (talk) 12:56, 22 July 2018 (UTC)

Fix labels for P31=Q5

A series of items about people were created recently with labels that include the enwiki disambiguator. Sample: Q55647666. This should be removed from the label.
--- Jura 15:17, 24 July 2018 (UTC)
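A rough query to surface affected labels might look like this (a sketch; matching on a parenthesis will also catch legitimate labels, so the results need review):

SELECT ?item ?label WHERE {
  ?item wdt:P31 wd:Q5 ;
        rdfs:label ?label .
  FILTER(LANG(?label) = "en")
  FILTER(CONTAINS(?label, "("))    # labels carrying a disambiguator
}
LIMIT 100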

Harvesting data from MusicBrainz

I have a list of 143,503 artists that have an entry at MusicBrainz. A bot could harvest data (links and miscellaneous info like birth dates) from the MusicBrainz database and add it to Wikidata. An example of such harvesting can be found here: adding an AllMusic ID found in the MusicBrainz database. My list is in CSV format and looks like this:

  • http://www.wikidata.org/entity/Q71616,Max von Schenkendorf,d120de82-a486-4ec0-ab64-0c2c3a5b46f8
  • http://www.wikidata.org/entity/Q71626,DJ Dean,b768f66f-3260-41ec-941d-e7c82b1cb87b

and it is actually the result of this script - initial discussion. -- OneMusicDream (talk) 23:48, 30 July 2018 (UTC)

@OneMusicDream: Because MusicBrainz is a wiki (a user editable website), with no sourcing, you may hit WD:BLP issues importing personal data like birth dates. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:21, 3 August 2018 (UTC)
@Pigsonthewing: Thanks for your reply. I was thinking about this: if the MusicBrainz birth dates have an error rate of X%, and X is acceptably low, then maybe it's worth having a huge amount of data instead of not having it. Such errors will be corrected in time. If it's not worth collecting, then I think that collecting at least the external links (or just a part of them, like only the Discogs and AllMusic links) for those artists is probably worth doing. Partial collecting. And then maybe in the future doing more partial collecting. And so on. OneMusicDream (talk) 16:41, 5 August 2018 (UTC)

Cleanup bot description

Can someone fix the following descriptions:

SELECT *
{
    ?item schema:description "Non-profit organisation in the USA"@en 
}

Try it! Currently some 1,300.
--- Jura 22:09, 3 August 2018 (UTC)

  Doing… [3] Matěj Suchánek (talk) 07:45, 4 August 2018 (UTC)
  • Thanks for fixing the caps. As it's in the US, maybe "organization" should be used too.
    --- Jura 07:58, 4 August 2018 (UTC)

Database of Classical Scholars

Request date: 9 August 2018, by: Jonathan Groß (talkcontribslogs)

Task description

The Database of Classical Scholars has moved to a new home, and in the process all IDs changed. While this might be regrettable in itself, the editors of the database are very forthcoming in trying to alleviate the problem. They sent me a chart (link to Google Drive) which contains, among other things, the OLD and NEW IDs in columns G and H.

The main task would be to fetch all items with Database of Classical Scholars ID (P1935) and replace the old IDs with the new ones. The links would also have to be adapted to the new format: contrary to before, links to entries in the database are no longer just IDs but also require a combination of surname and given name(s). The chart has these values in columns B to D. The new IDs consist of ID[four Arabic digits]-SURNAME[in caps]-Firstname-Lastname.

In addition, one could check corresponding pages to these items on dewiki and enwiki which are using Template:DBCS (Q26006668): In case one of those templates has an old ID, replace it with the new one.

Many thanks in advance! Jonathan Groß (talk) 14:18, 9 August 2018 (UTC)

Discussion


Request process

Accepted by (Magnus Manske (talk) 14:31, 9 August 2018 (UTC)) and under process

Actually, this might not work as expected. For the new ID 8494, the URL is https://dbcs.rutgers.edu/all-scholars/8494-ABBOTT-Kenneth-Morgan but how can we link there with just the ID? https://dbcs.rutgers.edu/all-scholars/8494 doesn't work, and neither does https://dbcs.rutgers.edu/index.php?page=person&id=8494 . Not touching this until we clear this up. --Magnus Manske (talk) 14:38, 9 August 2018 (UTC)
Yes, that's a problem. I suggested to the editors that they install some sort of redirects, so that just-ID-links work as well. Let's see what they say. I'll get back to you then. Jonathan Groß (talk) 14:43, 9 August 2018 (UTC)
Given that the old URLs are broken anyway, I made a new Mix'n'match catalog, matched the entries according to the spreadsheet, and am now changing the IDs to the new ones. I preserved the old catalog (deactivated), in case we need it again. --Magnus Manske (talk) 15:10, 9 August 2018 (UTC)
Thanks a lot! Jonathan Groß (talk) 07:17, 10 August 2018 (UTC)

Hungarian citizenship

Request date: 12 August 2018, by: Bencemac (talkcontribslogs)

Task description

I would like to request the following changes where instance of (P31) is human (Q5) and country of citizenship (P27) is Hungary (Q28). I tried to write the queries for them but was not able to; please forgive me, I am bad at Wikidata Query (yet). If this is possible to do, I would have similar requests in the future (before 1946). Bencemac (talk) 17:59, 12 August 2018 (UTC)

1. Born after 1989-10-23 (optional)

  • the person is still alive:
  • the person is dead

2. Born after 1949-08-20

  • the person died before 1989-10-23:
  • the person died after 1989-10-23:
  • the person is still alive:

3. Born after 1946-02-01

  • the person died before 1949-08-20:
  • the person died before 1989-10-23:
  • the person died after 1989-10-23:
  • the person is still alive:
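As a starting point, here is a sketch of a query for the first case (born after 1989-10-23 and still alive); the other cases would add the corresponding date of death (P570) filters:

SELECT ?person ?birthdate WHERE {
  ?person wdt:P31 wd:Q5 ;
          wdt:P27 wd:Q28 ;
          wdt:P569 ?birthdate .
  FILTER(?birthdate > "1989-10-23T00:00:00Z"^^xsd:dateTime)
  MINUS { ?person wdt:P570 [] . }    # no date of death recorded
}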
Discussion
SELECT ?person ?personLabel ?birthdate ?deathdate 
WHERE {
  ?person wdt:P31 wd:Q5;
          wdt:P27 wd:Q28;
          wdt:P569 ?birthdate;
          wdt:P570 ?deathdate.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

SELECT ?person ?personLabel ?birthdate ?deathdate 
WHERE {
  ?person wdt:P31 wd:Q5;
          wdt:P27 wd:Q16410;
          wdt:P569 ?birthdate;
          wdt:P570 ?deathdate.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

@Bencemac: You can use the above queries to extract data from WD and export it into Excel, where you can perform your modifications and later import the changes into WD. But a query can't modify the data; you need another tool for that. Snipre (talk) 20:17, 28 August 2018 (UTC)

@Snipre: Thanks, but I do not want to do it myself. I have never done this before and I am afraid of doing something bad, so it would need an experienced user. Bencemac (talk) 14:39, 29 August 2018 (UTC)
Request process

Clean up multiple coordinates in wiki items

Request date: 17 August 2018, by: Bouzinac (talkcontribslogs)

Link to discussions justifying the request
Topic:Uizegviq5kuoe2xx
Task description

For every wiki item that has multiple coordinate location (P625) statements at the same normal rank and none at preferred rank, set exactly one coordinate location (P625) to preferred. Order of preference: to be determined by what the bot is capable of.
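A sketch of a query for the affected items (several normal-rank coordinates, none preferred):

SELECT ?item WHERE {
  ?item p:P625 ?stmt .
  ?stmt wikibase:rank wikibase:NormalRank .
  MINUS { ?item p:P625/wikibase:rank wikibase:PreferredRank . }
}
GROUP BY ?item
HAVING(COUNT(?stmt) > 1)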

Licence of data to import (if relevant)
Discussion
IMO the proper solution is to decide which one is correct (or import it from elsewhere). It makes no sense to me to have "more and less preferred" coordinates. Matěj Suchánek (talk) 11:06, 17 August 2018 (UTC)
  • From the discussion on French project chat, the request was limited to airports and the idea is to not set preferred ranks to coordinates of some wikis. Given that cebwiki coordinates are known to be rounded, I think it is reasonable to do that for these. Obviously, if there is a large difference between the values, one should probably check them manually.
    --- Jura 11:16, 17 August 2018 (UTC)
Request process

Incorrectly imported text fields

Request date: 17 August 2018, by: Jc86035 (talkcontribslogs)

Task description

Special:Search/all: u00 reveals that there are about 1,700 items with auto-generated labels/descriptions/other text fields which contain escape codes for Unicode characters. Special:Search/all: & insource:/&/ has some others, and there are probably more searches which would turn up errors like these. (Some of the descriptions, like the one for Tito Pérez (Q29521467), are also clearly for a different language to the one specified – for that item, the "en" description is clearly not in English.) I think these should be fixed, preferably by looking at the batch of edits in which the bad data was imported and fixing each batch of items. Jc86035 (talk) 18:13, 17 August 2018 (UTC)

Discussion
Yay for better indexing. I will probably restore Wikidata:Requests for permissions/Bot/MatSuBot 6. Matěj Suchánek (talk) 10:47, 18 August 2018 (UTC)
Request process

Importing data from epcrugby.com (Property P3666)

Request date: 27 August 2018, by: Blackcat (talkcontribslogs)

Hello, both the URL formatter and the data value for EPCR player ID (P3666) have changed. The more important is the latter, because the former can easily be changed. Basically, we now have a different ID for each player (e.g. for Martin Castrogiovanni (Q1039026) the old one was www.epcrugby.fr/info/archives_jouers.php?player=143&includeref=dynamic and the new one is https://www.epcrugby.com/player?PlayGuid=MC322193), which currently makes the property unusable.

Task description

In this sandbox I dumped all the items that use said property. Is there a way to acquire the new values from the EPCR website?

Licence of data to import (if relevant)
Discussion

@Blackcat: What about the licence of the data on epcrugby.com? Snipre (talk) 19:52, 28 August 2018 (UTC)

Honestly I don't know, @Snipre:; anyway, we don't have to import any data, only acquire the new ID for each player who played in the European rugby cups; we already have a property for that. -- Blackcat (talk) 20:19, 28 August 2018 (UTC)
When I looked at the first page, I found "© 2018 Content European Professional Club Rugby, Statistical Data © European Professional Club Rugby". Even extracting the IDs can be a problem if there is systematic extraction by a bot. The best option is to ask the website to explicitly free the IDs. Snipre (talk) 20:26, 28 August 2018 (UTC)
Request process

elevation above sea level (P2044) values imported from ceb-Wiki

Request date: 6 September 2018, by: Ahoerstemeier (talkcontribslogs)

Link to discussions justifying the request
  • Many items have their elevation imported from the Cebuano Wikipedia. However, the way the bot created the values is very faulty; especially due to inaccurate coordinates, the value can be off by up to 500 m! Thus most of the values are utter nonsense, and some are a rough approximation, but certainly not good data. To make things worse, the qualifier with imported from Wikimedia project (P143) often wasn't added. For an extreme example see Knittelkar Spitze (Q1777201).
Task description

Firstly, a bot has to add all the missing imported from Wikimedia project (P143) references omitted in the original infobox harvesting. Secondly, especially for mountains and hills, the values have to be set to deprecated rank, to keep them from poisoning our good data.
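A sketch of a query for elevation statements referenced as imported from the Cebuano Wikipedia (assuming Q837615 is the item for the Cebuano Wikipedia):

SELECT ?item ?value WHERE {
  ?item p:P2044 ?stmt .
  ?stmt ps:P2044 ?value ;
        prov:wasDerivedFrom/pr:P143 wd:Q837615 .
}
LIMIT 100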

Licence of data to import (if relevant)
Discussion


Request process

Import and maintain nominal GDP for countries from the World Bank Data API

Request date: 17 September 2018, by: WDBot (talkcontribslogs)

Link to discussions justifying the request
Task description

A bot to load nominal GDP from the World Bank API and write it to Wikidata country items (property https://www.wikidata.org/wiki/Property:P2131).

  1. load the country information (retrieved from query.wikidata.org and copy-pasted into the script; see the query sketch below)
  2. iterate over each country
  3. check whether data (nominal GDP in US dollars) is available on the World Bank; if not, go to the next country
  4. load the first value of the WB data
    1. check all nominal GDP statements to see whether the value is already present
    2. skip if the value is already present
    3. write the value if it is not present
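For step 1, the country list could come from a query along these lines (a sketch; P297 is the ISO 3166-1 alpha-2 code that the World Bank URLs above use):

SELECT ?country ?iso WHERE {
  ?country wdt:P31 wd:Q6256 ;    # instance of country
           wdt:P297 ?iso .       # ISO 3166-1 alpha-2 code
}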

Code: link to code [4]

You can find test edits here:

  • Bulgaria (example where there is no data): [5]
  • Germany (example with only one missing value, for the year 2000): [6]
Licence of data to import (if relevant)

CC BY-4.0 - see here https://datacatalog.worldbank.org/public-licenses#cc-by

Discussion
I think the references should also have a property pointing to either the World Bank or its database, rather than only pointing to the URL. Maybe publisher (P123) or published in (P1433). --Yair rand (talk) 21:07, 17 September 2018 (UTC)
Hi Yair rand and thank you for your feedback. I have adjusted the script to write "publisher" too. Here you can see the example for France and USA on test.wikidata.org. You can see the new script here. --WDBot (talk) 20:24, 18 September 2018 (UTC)
If you need the approval for the bot, you're looking for Wikidata:Requests for permissions/Bot — regards, Revi 06:27, 19 September 2018 (UTC)
Thank you, Revi. I have now created a request. Cheers! --WDBot (talk) 18:07, 20 September 2018 (UTC)
Request process

Create new items indiscriminately

Apparently, users are looking forward to this. Per Wikidata:Requests_for_permissions/Bot/GZWDer_(flood)_4, GZWDer_(flood) was approved for this. As its operator isn't currently active, maybe someone else wants to do it. --- Jura 16:42, 20 September 2018 (UTC)

Updating templates' descriptions in Russian

Request date: 7 October 2018, by: Wikisaurus (talkcontribslogs)

Link to discussions justifying the request

Looks obvious.

Task description

Most templates have the standard Russian description "шаблон проекта Викимедиа" ("Wikimedia project template"), per d:Q11266439, but quite a number of them have the old descriptions "шаблон в проекте Викимедиа" and "шаблон проекта Викимедия". Please replace them with the modern description. I cannot do it myself with QuickStatements because the SPARQL queries give me only a small portion of the results: if I replace LIMIT 100 with a larger limit, they collapse. The SPARQL queries are below (they are a bit strange because there are some problems with Cyrillic, I believe):

SELECT ?item ?itemlab ?itemdesc WHERE {
  ?item wdt:P31 wd:Q11266439 .
  wd:Q6537516 schema:description ?wrongdesc1
  filter (lang(?wrongdesc1) = "ru") .
  OPTIONAL { ?item schema:description ?itemdesc
  filter (lang(?itemdesc) = "ru") }
  filter (?itemdesc = ?wrongdesc1)
} LIMIT 100

Try it!

SELECT ?item ?itemlab ?itemdesc WHERE {
  ?item wdt:P31 wd:Q11266439 .
  wd:Q6459244 schema:description ?wrongdesc1
  filter (lang(?wrongdesc1) = "ru") .
  OPTIONAL { ?item schema:description ?itemdesc
  filter (lang(?itemdesc) = "ru") }
  filter (?itemdesc = ?wrongdesc1)
} LIMIT 100

Try it!

Licence of data to import (if relevant)
Discussion
The query can be as simple as:
SELECT ?item ?desc {
  VALUES ?desc { "шаблон в проекте Викимедиа"@ru "шаблон проекта Викимедия"@ru } .
  ?item schema:description ?desc .
}
Try it!
Matěj Suchánek (talk) 16:47, 8 October 2018 (UTC)
It looks like magic and works in a moment. Matej, thank you! Wikisaurus (talk) 20:43, 12 October 2018 (UTC)
Request process

Add interwiki conflicts listings to talk pages

Request date: 23 October 2018, by: Yair rand (talkcontribslogs)

Task description

I think it would be helpful if items listed at Wikidata:Interwiki conflicts had notices and links placed on their talk pages, to improve discoverability. There are over a thousand items listed, and no easy way for a user to determine whether any individual item is listed there. --Yair rand (talk) 04:43, 23 October 2018 (UTC)

Discussion


Request process


Copy lemma to F1

For lexemes without forms, could a bot create the form by copying the lemma? Sample edit: https://www.wikidata.org/w/index.php?title=Lexeme:L8896&diff=772695679&oldid=772692662

Please skip any lexemes that already have forms. --- Jura 08:51, 25 October 2018 (UTC)
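A sketch of a query for lexemes without any form:

SELECT ?lexeme ?lemma WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          wikibase:lemma ?lemma .
  MINUS { ?lexeme ontolex:lexicalForm [] . }
}
LIMIT 100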


Add annual country-level unemployment rate (P1198)

It would be interesting to have annual data for each country (one value per year on country items). I'm not sure what the most suitable sources are for each country.

When discussing a query with CalvinBall, I noticed that Q30#P1198 currently only has one value (for 2013). --- Jura 12:34, 26 October 2018 (UTC)

Hi Jura, we could use the WDBot to do this job. The source could be World Bank Data; here is an example for the USA: https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS?locations=US. The World Bank uses ILO estimates, which have the following nice properties (check the Details button for the indicator on the WB's page):
Statistical Concept and Methodology: [...] The standard definition of unemployed persons is those individuals without work, seeking work in a recent past period, and currently available for work, including people who have lost their jobs or who have voluntarily left work. Persons who did not look for work but have an arrangements for a future job are also counted as unemployed. Some unemployment is unavoidable. At any time some workers are temporarily unemployed between jobs as employers look for the right workers and workers search for better jobs. It is the labour force or the economically active portion of the population that serves as the base for this indicator, not the total population. The series is part of the ILO estimates and is harmonized to ensure comparability across countries and over time by accounting for differences in data source, scope of coverage, methodology, and other country-specific factors. The estimates are based mainly on nationally representative labor force surveys, with other sources (population censuses and nationally reported estimates) used only when no survey data are available..
If this is fine for you I would make a request for bot permission (the script is already available). Datawiki30 (talk) 15:50, 26 October 2018 (UTC)
  • Sounds good. Maybe the qualifier criterion used (P1013) could be used with an item that describes the applied methodology. Eventually, we might have numbers with different methodologies for the same year. --- Jura 16:04, 26 October 2018 (UTC)
Hi Jura, and thank you for your feedback. Do we really need the additional qualifier? Similar to the GDP, I would just use "stated in" = World Bank database and "reference URL" = https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS?locations=XX, where XX is the ISO code of the country. My opinion is that qualifiers trying to explain the data are too short to describe the method behind it... For new methods I would just suggest proposing a new property (like there are different properties for total, male and female population). Cheers! Datawiki30 (talk) 16:36, 26 October 2018 (UTC)
I think it's useful. The value would just be a (new) specific item. There is no need for its label or description to include the full text. I think it's an advantage as the above numbers may be useful for cross-country comparison, but some users might be looking for just one country and expect the methodology preferred in the country. For your bot it might be possible to do all in one edit, so the additional work would be marginal. --- Jura 16:44, 26 October 2018 (UTC)
Thank you Jura for your comment. There are no technical obstacles; the bot can handle this. I suppose you mean that we could have structurally different values in the same property: for example, the ILO value for country A could be 5% where, for the same year and country, the Eurostat value could be 3%. Is this the case? Datawiki30 (talk) 21:29, 26 October 2018 (UTC)
  • Yes. I had in mind mainly national agencies that might have 4% instead of 10% (by whatever method), but it's the same issue. --- Jura 16:05, 2 November 2018 (UTC)
@Jura1: OK. Could you please take a look here? After a discussion in the project chat and here, I think it would be best to import only the most recent data (for example, for 2017). Otherwise we could have problems with the loading time of the country pages. I would be glad to see your comment there. Cheers! Datawiki30 (talk) 19:18, 12 November 2018 (UTC)
  • This isn't really helpful for looking at the evolution. I think it could easily hold annual data for the last 50 years. There are several other properties that have annual data. --- Jura 04:24, 13 November 2018 (UTC)

Move descriptions in German from the English to the German description field

Special:Search/Beruf/Funktion seems to find a lot. Sample edit: [7] --- Jura 16:07, 2 November 2018 (UTC)


{{Section resolved|Manually moved 9 entries. Can't find more German descriptions in the English field. Pyfisch (talk) 17:32, 1 December 2018 (UTC)}}

  • @Pyfisch: did you click on the search link? It currently gives 15,607 results, and none of the ones I clicked on are in English. Maybe you need to change your interface language to English. --- Jura 05:52, 4 December 2018 (UTC)
  • Here is a query that finds some [8]. --- Jura 06:00, 4 December 2018 (UTC)
Interesting. I searched with "Pages in this language: English", which only shows a single result. But if I switch the interface language to English I get a whole lot of entities. Thanks for the SPARQL query! I am not sure, though, whether we want to move all these descriptions to German, because they are rather long, unlike other descriptions that serve to disambiguate between persons of the same name. --Pyfisch (talk) 09:53, 4 December 2018 (UTC)
Feel free to improve them, but we surely don't want German text in the English description field. --- Jura 09:58, 4 December 2018 (UTC)
I am now moving descriptions from English to German. (details) --Pyfisch (talk) 09:27, 11 December 2018 (UTC)
(more) --Pyfisch (talk) 11:18, 11 December 2018 (UTC)
(more) --Pyfisch (talk) 13:37, 11 December 2018 (UTC)
(last one) I should have fixed most (all). @Jura1: If you have a query that produces more items, please tell me. --Pyfisch (talk) 19:00, 11 December 2018 (UTC)
Thanks. Seems mostly done. Searching for "Konfession" finds a few more. --- Jura 16:50, 12 December 2018 (UTC)
"Konfession" done. --Pyfisch (talk) 21:24, 14 December 2018 (UTC)

BLKÖ

Most pages in https://de.wikisource.org/wiki/Kategorie:BLK%C3%96 (27,209 pages) seem to lack items (http://petscan.wmflabs.org/?psid=6382466, currently 26,641 pages).

I think it would be worth creating them, as well as an item for the person who is the subject of the article when it can't be matched with one of the existing items. --- Jura 07:43, 8 November 2018 (UTC)

Proposal

To get this started, I propose this structure for articles. It also mentions from which source each statement is imported. As I see it, besides the structure for articles, the structure for volumes and for person subjects with imported data also needs to be decided. Additionally, described by source (P1343) should probably be added to new and existing person subjects. --Pyfisch (talk) 22:29, 11 December 2018 (UTC)

Article


I've made a preliminary data export. It contains all BLKÖ articles with GND, Bearbeitungsstand (processing status), etc. The articles are linked based on the stated GND, Wikipedia and Wikisource articles; where there was a conflict, multiple Q-numbers are given. I also searched for items linked to the articles and unfortunately found many that describe the person instead of the text (they will need to be split). The last four columns state the date/place of birth/death from the text. The dates vary in accuracy:
  • year-month-day, year-month, only year
  • ~ before a date indicates an imprecise date
  • > before a date indicates dates stated as "nach 1804" ("after 1804")
  • A before a date indicates "Anfang/erste Tage" (beginning/first days of)
  • E before a date indicates "Ende/letzte Tage" (end/last days of)
  • M before a date indicates "Mitte" (middle of)
  • ? BLKÖ knows the person was dead but does not know when he/she died

The places will need to be manually matched to Q-items. The first column contains some metadata about the kind of page. There are:

  • empty: person
  • L: Liste (list)
  • F: family, Wappen (coat of arms), Genealogie (genealogy)
  • R: cross reference
  • P: prelude
  • H: note about names and alternate spellings
  • N: corrections, Nachträge (addenda)

Each group should get a distinct is-a property. @Jura1: Do you like it? This is just for viewing, a later version will be editable to make manual changes before the import. --Pyfisch (talk) 22:14, 18 December 2018 (UTC)

  • I like the approach. BTW, there is Help:Dates that attempts to summarize how to add incomplete dates. --- Jura 14:05, 20 December 2018 (UTC)
    • editable data export. Updated the exported data. The sheet "articles" is already cleaned up. But I need help matching the ~4,000 place names in the sheet "places" to Wikidata Q-items. --Pyfisch (talk) 16:07, 22 December 2018 (UTC)

Clinical Trials

Request date: 8 November 2018, by: Mahdimoqri (talkcontribslogs)

Link to discussions justifying the request

https://www.wikidata.org/w/index.php?title=Wikidata:Dataset_Imports/Clinical_Trials*

Task description
Licence of data to import (if relevant)
Discussion


Request process

trainer-stations

Request date: 12 November 2018, by: Fundriver (talkcontribslogs)

Task description

Is it possible to harvest the trainer data for head coach of sports team (P6087) from different infoboxes on the German Wikipedia? It should be pretty similar to the harvesting for member of sports team (P54) and could be done with the same syntax for different sports, because the infoboxes are similar across the different sports on the German Wikipedia (except for ice hockey): you could use trainer_tabelle in Template:Infobox Rugby Union biography (Q14373909), Template:Infobox football biography (Q5616966), Template:Infobox basketball biography (Q5831659) and Template:Infobox floorball player (Q20963207) with the same technique. You probably just need to take care not to import data that isn't totally clear. Sometimes there is a "(Co-Tr.)", "U-21" or "U21" in addition to the wikilink, which needs manual oversight. But, for example, a "(Co-Tr.)" could be used to refine a statement, if that is possible. Fundriver (talk) 09:52, 12 November 2018 (UTC)

Licence of data to import (if relevant)
Discussion


Request process

Cleanup VIAF dates

There are a series of imports of dates that need to be fixed, please see Topic:Un0f1g1eylmopgqu and the discussions linked there, notably Wikidata:Project_chat/Archive/2018/10#Bad_birthdays with details on how VIAF formats them. --- Jura 05:28, 14 November 2018 (UTC)


import writers

When adding values for screenwriter (P58), I notice that these persons frequently don't have Wikidata items yet.

It would be helpful to identify a few sources for these and create the corresponding items. Ideally, every TV episode would have its writers included. --- Jura 15:05, 18 November 2018 (UTC)


adding data from scoresway.com

Request date: 22 November 2018, by: Amirh123 (talkcontribslogs)

Hi, please add the player data from scoresway.com.

Link to discussions justifying the request
Task description
Licence of data to import (if relevant)
Discussion

Import Schizosaccharomyces pombe protein-coding genes

Request date: 6 December 2018, by: Anlock (talkcontribslogs)

Link to discussions justifying the request
Task description

The PomBase database manually curates and maintains the coding inventory of the S. pombe genome. I would like to upload the protein-coding genes of S. pombe as per this request: https://www.wikidata.org/wiki/Wikidata:Property_proposal/PomBase_systematic_ID

The dataset is located here: https://docs.google.com/spreadsheets/d/1nrFcoQJirshUYbgI8-O3sjIDUonDHM_gLClJrrm3zZY/

Licence of data to import (if relevant)

Creative Commons Attribution 4.0 International license (CC-BY)

Discussion


Request process

Fix item redirects in grammatical features and other aspects of Lexemes

Request date: 10 December 2018, by: ArthurPSmith (talkcontribslogs)

Link to discussions justifying the request
Task description

In general, if an item has been redirected in the main namespace, right now I believe User:PLbot fixes statements with that item as value. However, this does not seem to be happening for grammatical features or other aspects of lexemes; in particular, when we merged Q24133704 into present participle (Q10345583), that left (thousands?) of lexeme forms with the old grammatical feature value. It would be really nice to have a bot fix this (and watch for similar things in future)!
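A sketch of a query for forms whose grammatical feature points to a redirected item (WDQS exposes item redirects as owl:sameAs):

SELECT ?form ?feature ?target WHERE {
  ?form wikibase:grammaticalFeature ?feature .
  ?feature owl:sameAs ?target .    # ?feature has been redirected to ?target
}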

Licence of data to import (if relevant)
Discussion

Actually it's User:KrBot that normally fixes redirects. Ideally that bot would just be updated to handle the special ways that lexemes use Wikidata items? ArthurPSmith (talk) 19:16, 10 December 2018 (UTC)

Request process

Add original title of scientific articles

There are some articles that have their title (P1476) value enclosed in square brackets. This means that the title was translated to English and the article's original title wasn't in English.

Sample: https://www.wikidata.org/w/index.php?title=Q27687073&oldid=555470366

Generally, the following should be done:

  1. deprecate existing P1476 statement
  2. add the original title with title (P1476)
  3. add the label in the original language
  4. remove [] from the English label

--- Jura 11:03, 11 December 2018 (UTC)
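A sketch of a query for the affected items (scanning all scholarly articles may time out, so it would likely need batching or a dump run):

SELECT ?item ?title WHERE {
  ?item wdt:P31 wd:Q13442814 ;
        wdt:P1476 ?title .
  FILTER(STRSTARTS(STR(?title), "[") && STRENDS(STR(?title), "]"))
}
LIMIT 100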

Reviews in articles

When doing checks on titles, I found that some items with P31 = scholarly article (Q13442814) include an ISBN in the title (P1476) value.

Sample: Q28768784.

Ideally, these would have a statement main subject (P921) pointing to the item about the work. --- Jura 19:10, 13 December 2018 (UTC)

Marking as preferred the current time zone in Property:P421

Request date: 14 December 2018, by: Antenor81 (talkcontribslogs)

Link to discussions justifying the request
Task description

The request is to mark the current time zone as preferred in Property:P421 when the property also contains old zones that are no longer applicable. This is useful because the other wiki projects are not able to import the current time zone if it does not have a superior rank.
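A sketch of a query for items where an ended time zone still shares normal rank with the current one (assuming the old zones carry end time (P582) qualifiers):

SELECT DISTINCT ?item WHERE {
  ?item p:P421 ?old .
  ?old pq:P582 [] ;
       wikibase:rank wikibase:NormalRank .
  MINUS { ?item p:P421/wikibase:rank wikibase:PreferredRank . }
}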

Licence of data to import (if relevant)
Discussion
  • Now that I re-read this, is this about daylight saving time or time zones in general? You haven't specified any changes that should be done.
    • If the timezone changes for a country or a region, you could just add an end date for the old timezone and a start date for the new one. Maybe @PreferentialBot: can then set the preferred rank, as it already does for other properties.
    • For DST, maybe we should put this in another (new) property.
-- Jura 14:09, 20 December 2018 (UTC)

@Antenor81, T.seppelt, Nikosguard, Rachmat04, علاء: @ShinePhantom, ViscoBot, Vyom25, Liridon: fyi --- Jura 06:52, 21 December 2018 (UTC)

I mean time zones in general. For example, in this case there are two time zones, the old one (UTC+3, end date 26 March 2016) and the new one (UTC+4, start date 26 March 2016). I noticed on it.wikipedia that the system was not able to import the time zone from Wikidata because both time zones had the same rank (normal). In the same example, the current time zone now has preferred rank and the old time zone has normal rank, and everything is working. So, if we want the other wiki projects to import the current time zone, it should have a superior rank. --Antenor81 (talk) 07:59, 23 December 2018 (UTC)
Request process

Patronage/clientèle patronage (P3872), rank-preferred for latest year available

Request date: 1 January 2019, by: Bouzinac (talkcontribslogs)

Link to discussions justifying the request
Task description

Update any element with P3872: if there are values for one or more years, set the latest year to preferred rank and set the other years, if present, to normal rank. For instance, see

And this should be executed once per year (as there might be new data). Thanks a lot!
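A sketch of a query for items whose patronage values all sit at normal rank (assuming the yearly figures carry point in time (P585) qualifiers):

SELECT ?item (MAX(?date) AS ?latest) WHERE {
  ?item p:P3872 ?stmt .
  ?stmt pq:P585 ?date ;
        wikibase:rank wikibase:NormalRank .
  MINUS { ?item p:P3872/wikibase:rank wikibase:PreferredRank . }
}
GROUP BY ?item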

Licence of data to import (if relevant)
Discussion
Request process


pq:P31 -> pq:P3831 for trailer (Q622550)

"Now" that we have object has role (P3831) could someone kindly switch the qualifier from P31 to P3831 for qualifier values trailer (Q622550).

It's mostly YouTube video ID (P1651) statements, but also a few others. Currently 345. --- Jura 13:01, 2 January 2019 (UTC)
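A sketch of a query for the YouTube video ID (P1651) part of the task:

SELECT ?item ?stmt WHERE {
  ?item p:P1651 ?stmt .
  ?stmt pq:P31 wd:Q622550 .    # qualifier instance of = trailer
}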

request for a bot to import population data

Request date: 10 January 2019, by: Histobot (talkcontribslogs)

Link to discussions justifying the request
Task description

Import municipal population data for Dutch municipalities from the Statistics Netherlands (https://www.cbs.nl) open data portal, from the dataset https://opendata.cbs.nl/ODataApi/odata/37259ned. Specifically, add population data from 1960 to 2017 to every municipality using the population property. This will facilitate the use of reliable and consistent population data in other projects. Look, for instance, at the municipality of Zwolle and its population data. I will try to add this data using OpenRefine.

Licence of data to import (if relevant)

CC BY 4.0

Discussion

Histobot (talk) 15:56, 10 January 2019 (UTC)

Request process

Auto-adding complementary values

Request date: 11 January 2019, by: Jc86035 (talkcontribslogs)

Link to discussions justifying the request
Task description

There should be a bot to add complementary values for Genius artist ID (P2373), Genius album ID (P6217) and Genius song ID (P6218). For all Genius artist ID (P2373) values without Genius artist numeric ID (P6351), the bot should add the first match of the regex \{"name":"artist_id","values":["(\d+)" in the linked page, and vice versa with the first match of the regex "slug":"([0-9A-Z][0-9a-z-]*[0-9a-z]|[0-9A-Z])". Preferred and deprecated ranks should be inferred when adding new values; if multiple statements to be added have the same value but different ranks, only the statement with the higher rank should be used. The values should be checked periodically to see whether they still match, and errors should be reported somewhere (probably on-wiki). The same should also be implemented for the other two pairs of properties, Genius album ID (P6217)/Genius artist numeric ID (P6351) and Genius song ID (P6218)/Genius song numeric ID (P6361).
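A sketch of a query for the first direction (artist IDs lacking their numeric counterpart):

SELECT ?item ?geniusId WHERE {
  ?item wdt:P2373 ?geniusId .       # Genius artist ID (slug)
  MINUS { ?item wdt:P6351 [] . }    # no numeric ID yet
}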

Licence of data to import (if relevant)

N/A (presumed not copyrightable)

Discussion

All of the properties now exist. Jc86035 (talk) 10:57, 15 January 2019 (UTC)

Request process

Adding main subject (P921) to scholarly articles based on relevant keywords in the title and description

Request date: 17 January 2019, by: Thibdx (talkcontribslogs)

Task description

The goal of this bot is to add main subject (P921) to scholarly articles.

The metadata of scholarly articles in Wikipedias is quite hard to maintain by hand because the rate of creation of these articles exceeds the capacity of the community to generate data, so automation would be a great help.

In many cases, finding a specific keyword in a scholarly article makes it obvious that it is a main subject of the article. This is the case for most technical terms that do not have a double meaning.

For example :

A list of such pairs could be stored on a protected wikipage. For each keyword, the bot would search scholarly articles and add the related main subject (P921) statement if the keyword is in the title. Of course, each pair would have to be tested first to ensure data consistency.
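As a sketch, for a given keyword/topic pair the candidate articles could be pulled like this (wd:Q123456 is a placeholder for the topic item and "keyword" for the term; a full-text scan may time out, so a real run would likely need batching):

SELECT ?item ?title WHERE {
  ?item wdt:P31 wd:Q13442814 ;
        wdt:P1476 ?title .
  FILTER(CONTAINS(LCASE(STR(?title)), "keyword"))   # hypothetical keyword
  MINUS { ?item wdt:P921 wd:Q123456 . }             # placeholder topic item
}
LIMIT 100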

Human-readable algorithm
Wikidata:WikiProject Materials/ScholarTopicsBot
Getting the work done

If an experienced dev thinks this could be one of their priorities, I would be glad to hand it over to them. If not, I can try to do it myself. I'm not a dev at all; the only thing I have done so far is modify some scripts. It would be helpful if you could point me to examples of the following:

  • A bot that extracts content from a wikipage
  • A bot that lists Qids using a query
  • A bot that adds statements to items

Regards

Discussion


Request process

Add JSTOR ID

Request date: 19 January 2019, by: GZWDer (talkcontribslogs)

Link to discussions justifying the request
Task description
Licence of data to import (if relevant)
Discussion


Request process