Wikidata:Contact the development team/Archive/2020/10

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Query service GUI cut-n-paste problems (August 13)

I keep accidentally deleting parts when trying to cut and paste pieces of SPARQL queries on https://query.wikidata.org .

Maybe it's a problem with my browser, etc. (of which I will spare you the details), but maybe others have the same problem. --- Jura 08:24, 13 August 2020 (UTC)

I did have such a puzzling problem yesterday, when I tried to paste the string GGNZKBPHAMNUOU-UHFFFAOYSA-N but I cannot reproduce it now. --SCIdude (talk) 06:28, 25 August 2020 (UTC)
@Jura1: Could you describe, step by step, the actions you're performing, so we (and other editors) can try to reproduce the issue? Feel free to go into detail here (e.g. "1. I select part of the query, 2. I use the key combination Ctrl+C", etc.). Lea Lacroix (WMDE) (talk) 10:56, 21 September 2020 (UTC)
Somehow I was hoping others had the same problem and would be able to describe it accurately. Here is my attempt:
  1. Try https://query.wikidata.org/#TEST1%0ATEST3%0ATEST2
  2. Then place the cursor before TEST3
  3. select the line with Shift+↓
  4. try Shift-Del to cut
  5. This deletes TEST3 and TEST2.
  6. Pasting just restores TEST2, so somehow Shift-Del is done twice.
Maybe I should get a new keyboard, but I didn't notice it happening on mediawiki pages ;) --- Jura 09:30, 30 September 2020 (UTC)
I tried to follow your instructions and I can't reproduce it. Can anyone else? --Lydia Pintscher (WMDE) (talk) 15:42, 2 October 2020 (UTC)

Choicelist for country (P17)

Hello: in order to help data quality, when an item has dates (start time (P580), end time (P582), or point in time (P585)), the choice list shown when one starts typing a country into country (P17) should suggest countries that existed at the time of the item's dates. E.g. for an old French item, it should preferably propose Kingdom of France (Q70972) and not France (Q142). Bouzinac💬✒️💛 05:37, 30 September 2020 (UTC)
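
To illustrate the kind of time-aware lookup such a suggester would need, here is a minimal sketch (an illustration, not part of the request): it asks WDQS for countries that existed at a given date, assuming historical country (Q3024240), inception (P571) and dissolved date (P576) as the relevant modelling; the example date is arbitrary.

  # Hypothetical sketch: find countries that existed at a given date,
  # the kind of time-aware lookup the suggester would need for P17.
  import requests

  ENDPOINT = "https://query.wikidata.org/sparql"
  QUERY = """
  SELECT ?country ?countryLabel WHERE {
    ?country wdt:P31 wd:Q3024240 ;    # instance of: historical country
             wdt:P571 ?inception ;    # inception
             wdt:P576 ?dissolved .    # dissolved, abolished or demolished date
    FILTER(?inception <= "1700-01-01T00:00:00Z"^^xsd:dateTime)
    FILTER(?dissolved >= "1700-01-01T00:00:00Z"^^xsd:dateTime)
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }
  LIMIT 50
  """

  r = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"},
                   headers={"User-Agent": "suggester-sketch/0.1"})
  for row in r.json()["results"]["bindings"]:
      print(row["country"]["value"], row["countryLabel"]["value"])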

Thanks for this suggestion. Unfortunately, the current suggester is not able to do such a thing, and improving it would be a non-trivial change. I created a Phabricator ticket to start the discussion but I can't promise it will be done any time soon. Lea Lacroix (WMDE) (talk) 13:47, 5 October 2020 (UTC)

String to external-id

We have Open Food Facts food additive ID (P1820), Open Food Facts food category ID (P1821) and Open Food Facts ingredient ID (P5930). For no reason (at least I don't see one), two of them are external-id while one has the string datatype. Open Food Facts food additive ID (P1820) should be changed to external-id too. Wostr (talk) 21:50, 26 September 2020 (UTC)

I created a ticket for it. Could you leave a message on the talk page of the property, just to inform people and make sure that no one disagrees? Thanks! Lea Lacroix (WMDE) (talk) 10:55, 28 September 2020 (UTC)
Thanks. Messages left at Property talk:P1820, Wikidata talk:WikiProject Food, Wikidata talk:WikiProject Nutrition. Wostr (talk) 11:40, 28 September 2020 (UTC)
I see that the datatype has been changed. Thank you. Wostr (talk) 16:13, 7 October 2020 (UTC)

PetScan has been returning "502 Bad Gateway" for some days

See

for details. Also see

--M2k~dewiki (talk) 14:47, 6 October 2020 (UTC)

Ping @Magnus Manske:. It's not clear to me how the current PetScan issue is connected to phab:T232620. I browsed the various tickets, both on Phabricator and Bitbucket, and you're the only person connecting these issues: could you explain your reasoning a bit more? Lea Lacroix (WMDE) (talk) 15:38, 6 October 2020 (UTC)

Missing QIDs

Any idea why most QIDs get skipped at Special:NewPages?

Sample:

  • 17:10, 9 September 2020 (Q99051383) (hist) [764 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:
  • 17:10, 9 September 2020 (Q99051380) (hist) [600 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:
  • 17:10, 9 September 2020 (Q99051378) (hist) [5,348 bytes] NordNordWest (talk | contribs) (Created a new Item: Item duplicated from Q99051178)
  • 17:10, 9 September 2020 (Q99051376) (hist) [798 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:
  • 17:10, 9 September 2020 (Q99051372) (hist) [598 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:
  • 17:10, 9 September 2020 (Q99051370) (hist) [522 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:
  • 17:10, 9 September 2020 (Q99051366) (hist) [530 bytes] RegularBot (talk | contribs) (Created a new Item: Bot:

Malfunctioning bot? --- Jura 17:19, 9 September 2020 (UTC)

See the next section.--GZWDer (talk) 18:11, 9 September 2020 (UTC)
It's unclear whether the old bug has any link to your and/or someone else's bot activity. --- Jura 11:43, 12 September 2020 (UTC)
A cause for this problem might be that in the last few days some indexes have been updated not immediately but with a delay of several days. For example, this PetScan result should show only articles without Wikidata items; in recent days it has also shown entries where the articles have already been connected to a (new or existing) Wikidata item. --M2k~dewiki (talk) 13:34, 12 September 2020 (UTC)
Also see Wikidata:Project_chat#Missing_increments_for_new_creates_Items. --M2k~dewiki (talk)
We're working on it, see phab:T232620 and subtasks. Lea Lacroix (WMDE) (talk) 10:46, 6 October 2020 (UTC)
If phab:T232620 explains it, does that mean there is a blocked user hammering Wikidata with item creation requests? Has someone contacted the ISP these requests come from? Or is it just an unrelated ticket? Please explain in plain English (or French) what the root cause is and what is being done about it. --- Jura 10:53, 6 October 2020 (UTC)
Yes, this is the right ticket, and we're still working on it. Indeed, it seems that one or several blocked accounts are trying to create items (we don't know if it's on purpose or not; it may be a malfunctioning bot). When this kind of request happens, the software "reserves" a QID that will then never be used. We didn't contact the ISP; we're trying to solve the root problem instead. Lea Lacroix (WMDE) (talk) 11:05, 12 October 2020 (UTC)
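
For anyone who wants to check which QIDs in a range were created and which were reserved but skipped, here is a minimal sketch (an illustration, not from the discussion) using the wbgetentities API; never-created IDs come back flagged as missing:

  # Hypothetical sketch: probe a QID range to see which IDs exist and
  # which were "reserved" but never created (reported as missing).
  import requests

  API = "https://www.wikidata.org/w/api.php"
  ids = "|".join(f"Q{n}" for n in range(99051366, 99051384))

  r = requests.get(API, params={
      "action": "wbgetentities",
      "ids": ids,
      "props": "info",   # metadata only, no claims
      "format": "json",
  }, headers={"User-Agent": "qid-gap-probe/0.1"})

  for qid, entity in sorted(r.json()["entities"].items()):
      print(qid, "missing (skipped)" if "missing" in entity else "exists")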

Deployment and configuration

Hello! We want to deploy our own Wikidata instance for querying, with our own data based on a dump. I'm now trying to find information about which technology stack is used and what machine configuration everything runs on. Can you please tell me in which direction to look? There are many sources of information around. Thanks in advance.  – The preceding unsigned comment was added by Ulyana7 (talk • contribs).

Hello @Ulyana7:, thanks for your message. I'm not sure what you are trying to do and for which purpose (query Wikidata, query your own data set?): feel free to give more details :)
In case you would like to query Wikidata with your own query service: see mw:Wikidata_Query_Service/User_Manual#Standalone_service
In case you would like to add your own data set in a separate Wikibase instance: see https://wikiba.se/setup/ and let me know if you have issues or questions.
Cheers, Lea Lacroix (WMDE) (talk) 11:09, 12 October 2020 (UTC)
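
As a small illustration of the first option: once a standalone query service is running, it is queried exactly like the public endpoint, just at a different URL. A hedged sketch, assuming the default local Blazegraph address described in the user manual:

  # Hypothetical sketch: query a locally running standalone query service.
  # The endpoint URL is an assumption (the default from the user manual).
  import requests

  LOCAL_ENDPOINT = "http://localhost:9999/bigdata/namespace/wdq/sparql"
  QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"

  r = requests.get(LOCAL_ENDPOINT, params={"query": QUERY, "format": "json"})
  print(r.json()["results"]["bindings"][0]["triples"]["value"])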

QS not stopped by lag>30s

We are at the moment at lag=38s, and this is not a single incident today. While bot edits don't seem to be the cause of today's lag, it is a surprise that QuickStatements still does 90 edits/min, filling the queue and causing the maxlag situation to last more than 30 minutes now. Please fix this: QS is just a bot, so stop it until the lag situation has cleared, please. --SCIdude (talk) 16:09, 18 October 2020 (UTC)

@SCIdude: There are about 4 QuickStatements batches running right now, all local afaics. I count 16 edits per minute for the four, taking 16:18 as the example minute. Please supply evidence of the supposed 90/min when maxlag > 5 s. (I have warned a pywiki bot owner whose bot is ignoring maxlag.) --Tagishsimon (talk) 16:24, 18 October 2020 (UTC)
And no server-side QS batch has run since 2020-10-18 13:17:36. --Tagishsimon (talk) 16:26, 18 October 2020 (UTC)
And with the exception of the bot I've already warned, the grafana picture suggests that bots are paying proper attention to maxlag. --Tagishsimon (talk) 16:28, 18 October 2020 (UTC)
To continue: the misbehaving bot seems to have been switched off. Maxlag has plunged. Throughout this outage, Quickstatements was not the cause. --Tagishsimon (talk) 16:40, 18 October 2020 (UTC)
While I got the rate wrong (18/min instead of 90/min), QS still edited during lag > 5 s. And I never stated it was the cause of the situation, although it certainly was one reason it persisted. --SCIdude (talk) 17:20, 18 October 2020 (UTC)
No. Each discrete instance of QS edited at the rate of 4 edits per minute. All of them adhered to maxlag recommendations - https://www.mediawiki.org/wiki/Manual:Maxlag_parameter. And here you are trying to get QS knocked on the head for no good reason at all. --Tagishsimon (talk) 17:23, 18 October 2020 (UTC)
QuickStatements intentionally edits "slowly" while maxlag > 5, in order to show the user that it has not hung. "Slowly" means that it should not be editing faster than the usual "manual editing speed". ---MisterSynergy (talk) 19:57, 18 October 2020 (UTC)
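
For reference, the maxlag convention discussed above is something any API client can implement; here is a minimal hedged sketch (not QuickStatements' actual code): send maxlag=5 with each request and back off when the API reports that the replicas are lagged.

  # Hypothetical sketch of the maxlag convention (not QuickStatements'
  # actual code): pass maxlag=5 and wait-and-retry on a "maxlag" error.
  import time
  import requests

  API = "https://www.wikidata.org/w/api.php"
  session = requests.Session()
  session.headers["User-Agent"] = "maxlag-demo/0.1"

  def api_call(params):
      params = dict(params, format="json", maxlag=5)
      while True:
          resp = session.post(API, data=params).json()
          error = resp.get("error", {})
          if error.get("code") == "maxlag":
              # Replicas are lagged: wait instead of piling edits on.
              time.sleep(max(float(error.get("lag", 5)), 5))
              continue
          return resp

  # Example read-only call; writes would additionally need login and a token.
  print(api_call({"action": "query", "meta": "siteinfo"}))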

SPARQL Query stopped working October 8th/9th

Wikidata:WikiProject Video games/New video game items is busted because the WDQS query stopped working on October 8th; there is a bit of a discussion here: Wikidata talk:WikiProject Video games/New video game items

If you replace "BIND(IRI(REPLACE(?id, '^(.+)$', ?formatterUrl)) AS ?url)" with "BIND('foo' as ?url)" in the SPARQL query (link to broken query here) it works again. I'm fairly certain the query didn't change at any point recently, and you can see from the edit history that it stopped updating around October 8th. I assume this is a bug in the SPARQL parser, but I'm not really sure.

Thanks. Nicereddy (talk) 03:24, 20 October 2020 (UTC)

@Nicereddy: If you scroll all the way to the bottom of the error message, you can see the problem:
Caused by: java.lang.IndexOutOfBoundsException: No group 2
	at java.util.regex.Matcher.start(Matcher.java:375)
	at java.util.regex.Matcher.appendReplacement(Matcher.java:880)
	at java.util.regex.Matcher.replaceAll(Matcher.java:955)
This means that the replacement part of the REPLACE (i. e., the ?formatterUrl) now contains a $2, but the pattern (^(.+)$) only has one group. There’s no bug in the SPARQL parser, someone just edited a formatter URL (P1630) so it no longer works with this query. (You can also query formatter URLs with $2 to see that the likely culprits are Fandom article ID (P6262) and P6623 (P6623).) --Lucas Werkmeister (WMDE) (talk) 09:27, 20 October 2020 (UTC)
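
To list those likely culprits, one can run the check Lucas describes; a minimal sketch (my phrasing of it), selecting properties whose formatter URL (P1630) contains "$2":

  # Hypothetical sketch: list properties whose formatter URL (P1630)
  # contains "$2", which breaks REPLACE with a single-group pattern.
  import requests

  ENDPOINT = "https://query.wikidata.org/sparql"
  QUERY = """
  SELECT ?property ?formatterUrl WHERE {
    ?property wdt:P1630 ?formatterUrl .
    FILTER(CONTAINS(?formatterUrl, "$2"))
  }
  """

  r = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"},
                   headers={"User-Agent": "formatter-url-check/0.1"})
  for row in r.json()["results"]["bindings"]:
      print(row["property"]["value"], row["formatterUrl"]["value"])
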
As a small point of interest, and without doubting your diagnosis, Lucas (which is easily testable by adding FILTER(!CONTAINS(?formatterUrl, "$2")) to the troubled query), nor suggesting there's any cause for work to be done here: I cannot see edits in the P6623 history or the P6262 history which coincide with the 8 October demise of Nicereddy's query. Does that suggest the parser's handling of REPLACE might have changed? --Tagishsimon (talk) 09:58, 20 October 2020 (UTC)
No changes have been made to the SPARQL parser or the regex engine used by WDQS; the explanations given by Lucas are correct. If the query has not changed on this date, then the only explanation is a change in the data: if it's not directly related to a change in P6623 or P6262, then something else might have changed, causing these properties to be taken into account when they were not previously. DCausse (WMF) (talk) 12:15, 20 October 2020 (UTC)
Good point. This, perhaps. --Tagishsimon (talk) 13:15, 20 October 2020 (UTC)

Wrong links to Enciclopedia Treccani

Well, I see that the URLs of this family are incorrectly composed by Wikidata.

E.g.: in the Neapolitan wafer item d:Q176927, I get an error instead of the page.

astiotalk 16:00, 9 October 2020 (UTC)

@Astio_k: then the formatter URL (P1630) needs to be adjusted at Property:P4223#P1630; the best place to discuss this is probably the property talk page, {{Ping}}ing the participants of the original property proposal. Note that the change won’t be visible in all items with a Treccani's Enciclopedia Italiana ID (P4223) directly, see T112081. --Lucas Werkmeister (WMDE) (talk) 17:48, 9 October 2020 (UTC)
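
For context, applying a formatter URL (P1630) is just a placeholder substitution: the identifier value replaces "$1" in the pattern. A trivial sketch with hypothetical values:

  # Hypothetical sketch: how a formatter URL (P1630) is applied — the
  # identifier value replaces the "$1" placeholder in the pattern.
  formatter_url = "https://example.org/enciclopedia/$1"  # hypothetical pattern
  identifier = "some-id"                                 # hypothetical ID value
  print(formatter_url.replace("$1", identifier))
  # -> https://example.org/enciclopedia/some-id
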
@Astio_k: It's simpler than that: the property to use in this case is Treccani ID (P3365) (designed for "Treccani - Enciclopedia online"), not Treccani's Enciclopedia Italiana ID (P4223) (designed for "Treccani - Enciclopedia Italiana").   Solved now :) --Epìdosis 17:56, 9 October 2020 (UTC) P.S. The correct place for asking similar questions is Wikidata:Bar, not this page
Thanks a lot, but the description of Treccani's Enciclopedia Italiana ID is "identifier for the Enciclopedia Italiana on Treccani website", and that's the place where I found the page.
astiotalk 18:00, 9 October 2020 (UTC)
@Astio_k: The description is OK: as you can see, the article says "Enciclopedia on line", not "Enciclopedia Italiana"; and as you can see from the comparison of this and this, they are two distinct entities: "Enciclopedia on line" refers to Treccani ID (P3365), "Enciclopedia Italiana" refers to Treccani's Enciclopedia Italiana ID (P4223). --Epìdosis 19:27, 9 October 2020 (UTC)
@Epìdosis: : ok, thanks. -- astiotalk 21:19, 22 October 2020 (UTC)

SPARQL endpoint for test.wikidata.org

Hi all. A question has already been asked (04/2020) about a "SPARQL GUI for test.wikidata.org", and the answer was "No".

My question is not about the GUI, but about the SPARQL endpoint. I am writing a bot that first retrieves data over SPARQL and then, if necessary, creates items using WDTK. Naturally, I have to test it in the sandbox (which is "test.wikidata.org") first, but if the only SPARQL endpoint I can use is the "main" one, then this testing is going to be... interesting :) Like, it will find an item in the main WD, then update that item over the MW API in the test WD (where, most probably, the item will not even exist in the first place, as there are far fewer items defined in the test WD).

Is it too expensive to run a SPARQL endpoint for the test WD?

thanks! --62mkv (talk) 17:39, 27 October 2020 (UTC)

Sadly, it is unlikely that we'll have the resources any time soon to support a SPARQL endpoint for "test.wikidata.org". To make it work, this would require a lot of supporting infrastructure (event mechanism, streaming updates, etc.) that is not in place at the moment, or some amount of customization to get a working update mechanism, and of course the ongoing maintenance of such a complex service. You could have a look at creating fakes or stubs to validate your bot, creating scenarios that focus only on the specifics of your bot instead of requiring a full implementation of Wikidata / WDQS. --GLederrey (WMF) (talk) 10:33, 28 October 2020 (UTC)
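
To make the fakes-or-stubs suggestion concrete, here is a minimal sketch (an illustration, not an official pattern): the bot takes its query function as a parameter, so a test can inject canned SPARQL results instead of hitting the real endpoint.

  # Hypothetical sketch of the fakes-or-stubs approach: the bot depends
  # on a query function it is handed, so tests can pass a stub that
  # returns canned bindings instead of calling the live WDQS endpoint.
  import requests

  def live_query(sparql):
      r = requests.get("https://query.wikidata.org/sparql",
                       params={"query": sparql, "format": "json"},
                       headers={"User-Agent": "bot-sketch/0.1"})
      return r.json()["results"]["bindings"]

  def stub_query(sparql):
      # Canned test result: pretend exactly one matching item exists.
      return [{"item": {"value": "http://www.wikidata.org/entity/Q42"}}]

  def find_items_to_update(query_fn, sparql):
      return [row["item"]["value"] for row in query_fn(sparql)]

  # Production run:  find_items_to_update(live_query, MY_QUERY)
  # Test run:        assert find_items_to_update(stub_query, "ignored") == [
  #                      "http://www.wikidata.org/entity/Q42"]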