Wikidata:Contact the development team/Archive/2021/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

language of https://wikisource.org/ : currently "en" (phab:T138332)

At Wikidata:Project_chat#Sitelinks_to_Incubator, there was a short discussion about https://wikisource.org/ among others.

It appears that the language of sitelinks is currently "en" (see the query there). It is applied to the schema:name value and added as the value of schema:inLanguage.

Given that https://wikisource.org/ hosts pages in a series of languages, I don't think the language can be determined in advance. Accordingly, I think it should be "und" (code for "undetermined language").

If individual pages at https://wikisource.org/ included multiple languages at the same time, the code would be "mul". It's true that the global code for https://wikisource.org/ would be "mul", but that doesn't apply to individual sitelinks.

Given the above, the patch currently in the pipeline (phab:T138332) should be updated to use "und". I don't think the question of which language code to use was discussed anywhere (if it was, please provide a link). --- Jura 10:16, 11 February 2021 (UTC)

@Jarekt, matej_suchanek: who participated in some of the discussions around https://wikisource.org/ . --- Jura 10:16, 11 February 2021 (UTC)


Badges

A simple way to do that would be to include the language as a badge, probably by adding (e.g.) Mingrelian (Q13359) directly as a badge. The languages on the above list would need to be made available.
Bots that maintain other badges could add them when needed. --- Jura 13:22, 11 February 2021 (UTC)
  • Instead of
?article schema:isPartOf <https://wikisource.org/> ; schema:inLanguage ?lang .
the language code would be at:
?article schema:isPartOf <https://wikisource.org/> ; wikibase:badge/wdt:P424 ?lang .
--- Jura 21:33, 11 February 2021 (UTC)
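
For comparison, the two patterns above can be wrapped in complete queries. The following is only a rough sketch: the helper name build_language_query, the SELECT variables and the LIMIT are made up for illustration, and the badge variant is hypothetical, since it assumes language items such as Mingrelian (Q13359) have been made available as badges, which is only a proposal in this thread.

```python
# Prefixes as used in the Wikidata RDF export / query service.
PREFIXES = """\
PREFIX schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
"""

SITE = "<https://wikisource.org/>"


def build_language_query(use_badge: bool, limit: int = 100) -> str:
    """Return a SPARQL query listing wikisource.org sitelinks with a language code.

    use_badge=False reads the current schema:inLanguage value.
    use_badge=True reads the code via a language badge (hypothetical:
    such badges do not exist unless the proposal is implemented).
    """
    if use_badge:
        lang_pattern = "wikibase:badge/wdt:P424 ?lang"
    else:
        lang_pattern = "schema:inLanguage ?lang"
    return (
        PREFIXES
        + "SELECT ?article ?item ?lang WHERE {\n"
        + "  ?article schema:about ?item ;\n"
        + f"           schema:isPartOf {SITE} ;\n"
        + f"           {lang_pattern} .\n"
        + "}\n"
        + f"LIMIT {limit}\n"
    )


if __name__ == "__main__":
    print(build_language_query(use_badge=False))
    print(build_language_query(use_badge=True))
```

Either string can be pasted into the query service; only the last triple pattern differs between the two.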

@Lydia Pintscher (WMDE): do you need more info on the two above? --- Jura 08:30, 19 February 2021 (UTC)

The default content language as I understand it for wikisource.org is English. Until that is changed I don't think we should do something different. Moving that information to badges is an option but to be honest too much disruption for little gain as far as I can see. It'd mean creating a lot of new badges of questionable use and re-users would need to look at two different places for the same type of information. That doesn't seem right. --Lydia Pintscher (WMDE) (talk) 14:32, 20 February 2021 (UTC)
@Lydia Pintscher (WMDE): I hope we agree that being able to query the language of the page behind a sitelink is a feature that is needed, both for contributors and re-users.
I suppose one could use content language of wikisource.org. This can be set on a page level in page properties.
If not, where else would we be querying the language instead? Especially, as you seem to prefer that a single feature gives the info? --- Jura 14:49, 20 February 2021 (UTC)
In an ideal world I agree with you that each multilingual Wikisource sitelink should have the language code of the article itself. I've talked to a few more people and given their lack of enthusiasm for the problem and the effort that'd be required to do this it doesn't seem worth the effort at this point. We have more pressing fish to fry unfortunately :( --Lydia Pintscher (WMDE) (talk) 17:32, 22 February 2021 (UTC)
@Lydia Pintscher (WMDE): Agree that the ideal solution probably requires too many development resources. Syncing with page properties might not be simple.
Accordingly, it might be worth having a second look at the proposal(s) above.
- The use of badges seems fit for the purpose and could easily be maintained by bot, like most other badges. Further, adding badges is a task that could probably be handled by relatively junior members of your development team, if not directly by the community liaison(s). The solution could also work for the few cases where we have permanent duplicated item (P2959) (the old story of some wikis having the same page in several languages).
- As far as the default language code for https://wikisource.org/ goes, I'm still not convinced that using "en" is a good idea if the wiki actually uses page properties to override it. "en" might just be the default of the interface (which is another question). Also, we would have two Wikisource editions with the same language code. Where was "en" determined? If it wasn't actively done, leaving it undetermined ("und") seems preferable. After all, sushi preparation is fairly delicate and frying it might be safer.
--- Jura 08:33, 23 February 2021 (UTC)
  • @Lydia Pintscher (WMDE): It seems that there isn't much interest in the question. How shall we fix this? In the way suggested above or in the way announced by Mohammed Sadat? At least we seem to agree that it shouldn't be in "en", as Sourceswiki has works in >150 languages that would appear as being in "en". --- Jura 10:04, 26 February 2021 (UTC)
    • Hey :) to be honest I think we'll just leave it as is until there is enough demand to justify spending time on it. Everyone else I've asked so far was "whatever", which to me means I should be spending the team's time on other things. --Lydia Pintscher (WMDE) (talk) 17:32, 28 February 2021 (UTC)
      • @Lydia Pintscher (WMDE): Who did you ask? I don't see any communication with either community. If you just talk to WMDE staff, this isn't really development for the Wikidata community. It would be good to at least try to implement it as announced to the community. Clearly, the implementation at Sourceswiki is somewhat below community expectations and needs finishing. Interwikis that don't appear as interwikis, how could that even happen? --- Jura 13:09, 1 March 2021 (UTC)

Lack of whitespace hampers find-on-page in Wikidata items

The Edit > Find command in the current Safari/macOS, and the Find on Page in Safari/iPadOS, both use “Begins with” matching to find text strings on the page. If I search, for example, on the page Universe (Q1) for the third value name “studied by,” nothing is found. But if I search for “by”, then I find the end of the string. If I search for “valuestudied by”, then I find the end of the “+ add value” link in the previous statement run together with the heading of the “studied by” statement.

Looking at the source code, I see that the only white space between the strings is some UNIX linefeed characters, contained in HTML block elements but not in inline elements. The addition of some real whitespace in the templates where it wouldn’t interfere with the page layout would probably correct this.

As a workaround, I can switch to “Contains” matching on the Mac (but not on the iOS device). Most editors probably aren’t aware this option exists.

This inconsistent findability of items on the page is confusing and hampers the usability of the editing interface, especially in items with a lot of statements. It makes it difficult or impossible for editors to find statements or verify their existence or absence on the page, and could lead to the creation of duplicate statements. —Michael Z. 18:33, 15 February 2021 (UTC)
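
The matching behaviour described above can be approximated in a few lines. This is only a sketch: it assumes that “Begins with” matching means the query has to start at a word boundary (Safari's actual algorithm is not documented here), and the function name and sample strings are made up for illustration.

```python
import re


def begins_with_match(page_text: str, query: str) -> bool:
    """Return True if query occurs in page_text starting at a word boundary.

    Assumes the query itself starts with a letter or digit, so that the
    regex \\b anchor sits between a non-word and a word character.
    """
    pattern = r"\b" + re.escape(query)
    return re.search(pattern, page_text, re.IGNORECASE) is not None


# Text as it seems to be seen by the search: no whitespace between the
# "+ add value" link and the next statement heading.
run_together = "+ add valuestudied by"
# The same text with a real space between the two elements.
with_space = "+ add value studied by"

if __name__ == "__main__":
    for text in (run_together, with_space):
        for query in ("studied by", "by", "valuestudied by"):
            print(repr(text), repr(query), begins_with_match(text, query))
```

With the run-together text, “studied by” sits in the middle of the pseudo-word “valuestudied” and is not matched, while “by” and “valuestudied by” are. With a space inserted, “studied by” is matched as well.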

I've just tried find with Safari/macOS, and indeed a) the search mode was 'Begins with' and b) the 'studied by' value was found. Can't at the moment speak for iOS. But I would raise a concern about amending Wikidata to make good what seems to be a deficiency in the browser search, presuming the issue does exist in iOS. --Tagishsimon (talk) 22:09, 15 February 2021 (UTC)
Hm, then I will try turning off gadgets and see if it’s one of them. —Michael Z. 01:59, 16 February 2021 (UTC)
@Mzajac: Do you have updates about what happened when you turned off gadgets? -Mohammed Sadat (WMDE) (talk) 15:20, 25 February 2021 (UTC)
Hi, @Mohammed Sadat (WMDE):. Sorry for the delay. To be clear, the problem exists on both Mac and iOS. The workaround of changing “begins with” to “contains” is possible only on the Mac.
On the Mac, I tried:
  1. Disabled all gadgets (how can I determine the default set?)
  2. Disabled User:Mzajac/common.js
  3. Disabled my global meta:User:Mzajac/global.css
  4. Disabled Safari extensions.
  5. Tried every other Appearance>Skin
  6. Confirmed I have no Beta features turned on
None of it helped. While refreshing, I did notice that sometimes the text is found if I hit command-G while the page is not finished loading and rendering. But in every case, it still fails to find after it finishes.
  7. Logged out of Wikidata. This worked: I could find begins with “studied” on Q1.
So I still don’t know the cause of the problem, or how to solve it while logged in. Suggestions? —Michael Z. 17:40, 9 March 2021 (UTC)
Thanks for providing these updates, Michael. I'm unable to reproduce the problem, however, so I will hold off on creating a ticket for us to look into until one or more other people are able to confirm this issue as well. -Mohammed Sadat (WMDE) (talk) 20:09, 11 March 2021 (UTC)

Malformed json in latest Wikidata dump

Hi,

I have a script to extract entities from Wikidata dumps, that I've been running successfully for years.

The last time I ran it, on the current latest-all.json.bz2 (03-Mar-2021 14:10, size 63323125695), it complained about malformed JSON:

 ijson.common.IncompleteJSONError: parse error: after array element, I expect ',' or ']'
         :[]}},"lastrevid":1374358285}{"type":"item","id":"Q27","labe
                    (right here) ------^

The script runs multiple threads in parallel, so it can "crash" on some threads while continuing on others. That is how I noticed that the error happens not only at that point, but also in a couple more places in the JSON.

I'm currently downloading the version that is .gz (rather than .bz2) to try running on it (not very hopefully, to be honest). Does anyone know why this is happening? Have there been changes in how Wikidata dumps are generated, or maybe it was just an error in this particular dump?

The last successful extraction happened at the beginning of January, on a .bz2 with size 61247031499 (I'm not able to find it on the dumps page).


--Motagirl2 (talk) 12:53, 6 March 2021 (UTC)

Someone else also reported having issues at phab:T276643. - Nikki (talk) 13:34, 6 March 2021 (UTC)
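
The excerpt in the error message shows two entity objects directly adjacent (`}{`) with no comma or line break between them. As a possible stopgap until a corrected dump is available, a reader can decode entities one by one and tolerate a missing separator. The following is a minimal sketch, not the script mentioned above: it assumes the usual dump layout of a JSON array with one entity per line, the function name iter_entities is made up, and it uses only json.JSONDecoder.raw_decode from the Python standard library.

```python
import json


def iter_entities(lines):
    """Yield entity dicts from dump lines, tolerating objects run together.

    Assumes each line holds one or more JSON objects, optionally followed
    by a trailing comma, and that "[" and "]" appear on lines of their own.
    """
    decoder = json.JSONDecoder()
    for line in lines:
        text = line.strip()
        if text in ("", "[", "]"):
            continue
        pos = 0
        while pos < len(text):
            # Skip separators (or their absence) between objects.
            while pos < len(text) and text[pos] in " \t,":
                pos += 1
            if pos >= len(text):
                break
            # raw_decode returns the object and the index where it ended.
            entity, pos = decoder.raw_decode(text, pos)
            yield entity


if __name__ == "__main__":
    sample = [
        "[",
        '{"type":"item","id":"Q1"},',
        '{"type":"item","id":"Q26"}{"type":"item","id":"Q27"},',
        '{"type":"item","id":"Q28"}',
        "]",
    ]
    print([entity["id"] for entity in iter_entities(sample)])
```

For a real dump, the lines could come from bz2.open(path, "rt", encoding="utf-8"), so that the file is streamed rather than loaded into memory. This only works around the missing separator; it does not tell whether any entities are missing from the dump.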

Query namespace

(referred from Wikipedia:) A while ago, I noticed the Query namespace on Wikidata and the connections it has to the gadget namespaces. First of all, MediaWiki:Namespaceprotected is the message shown to users editing restricted namespaces. Unlike the Spanish Wikipedia File namespace (Archivo in Spanish), attempting to protect the page does allow you to view the protection levels of the page. However, the Gadget namespace on Wikipedia cannot be protected. This does not appear to be a built-in feature of the Gadget namespace, as it does not appear to be a problem over at Miraheze. My observation: any namespace that cannot be edited by anyone locally will give an error when trying to protect a page. The same applies to the Query namespace. However, the Query namespace appears to be even harder to edit, as d:Special:ListGroupRights returns zero results for an attempt to find (query-update) on the page, nor is the right listed anywhere on m:Special:GlobalGroupPermissions. Unlike the Gadget namespace, the Query namespace appears to be a custom namespace for Wikidata. Is this namespace related to query.wikidata.org? And who can edit pages in the Query namespace? Just curious, as it seems like a custom namespace for Wikidata that is not currently used and is protected for no specific reason. 54nd60x (talk) 13:51, 7 March 2021 (UTC)

That namespace is not in use and it never was; there are no pages in it, per https://www.wikidata.org/wiki/Special:AllPages?from=&to=&namespace=122, and nobody can create pages in it. As far as I know, it was reserved for future use when Wikidata was set up in 2012, but was ultimately never put to use as things somehow turned out differently. As far as I am aware, there are no plans to use it in the future. —MisterSynergy (talk) 12:50, 11 March 2021 (UTC)
@MisterSynergy: Shouldn't developers be able to create pages in that namespace, as they were (I assume) the ones that created the namespace themselves? 54nd60x (talk) 13:04, 11 March 2021 (UTC)
Well, yes. If you can change the server settings/MediaWiki configuration, you can probably enable an account to write to this namespace. Usually the devs do not play around like this. "Nobody" from above refers to regular community members. —MisterSynergy (talk) 13:13, 11 March 2021 (UTC)

@MisterSynergy: Thanks for all of the explanation. However, I still don't understand why this namespace was protected in the first place and what its original purpose was planned to be. 54nd60x (talk) 13:23, 11 March 2021 (UTC)

The Query namespace has been set up for future work regarding automated list generation and related uses, but nothing that will happen any time soon. We added this information on Help:Namespaces to help clarify this point. Lea Lacroix (WMDE) (talk) 12:02, 15 March 2021 (UTC)

Page move on enwp doesn't update here

Hi, probably this is related to a known issue, hence my post here rather than phabricator. This page move on enwp should have automatically updated the sitelink at Holger Rune (Q48200561), but it didn't (leading to my bot creating a duplicate for it). Looking closer, Subaryan (who did the move) doesn't seem to have an account here, I thought they were auto-created whenever such an action took place? Thanks. Mike Peel (talk) 10:08, 12 March 2021 (UTC)

Nope, the user needs to have an account here for it to work. There's a ticket at phab:T143486 about it. - Nikki (talk) 17:35, 12 March 2021 (UTC)

Constraint not working

The statement Lansac (Q190130) located in the administrative territorial entity (P131) canton of la Vallée de l'Agly (Q17623468) should trigger a constraint violation (value-type constraint (Q21510865)) because canton of la Vallée de l'Agly (Q17623468) is not an instance or subclass of administrative territorial entity (Q56061), but it is not working. Could you please see why? Thanks. Ayack (talk) 17:12, 15 March 2021 (UTC)

I've opened a ticket at phab:T277524 to investigate it. --Lydia Pintscher (WMDE) (talk) 09:02, 16 March 2021 (UTC)
Thank you Lydia. Ayack (talk) 09:42, 16 March 2021 (UTC)

White field

See Wikidata:Project chat/Archive/2021/03#White field. 217.117.125.88 18:02, 18 March 2021 (UTC)

Can you please provide a screenshot? It's difficult to understand or reproduce your issue without seeing what is happening. Thanks, Lea Lacroix (WMDE) (talk) 10:58, 29 March 2021 (UTC)

Suggested properties not present on other items with the same instance

Item Q97370495 is an instance of television series season (Q3464665).

Currently it has two other statements:

I find it odd that three of the suggested other properties are not present at all on other instances:

Accordingly, they are pointless.

Another one is fairly rare (and not terribly helpful):

Even another suggested property, publication date (P577), isn't generally used. --- Jura 12:48, 28 March 2021 (UTC)

I suspect suggestions are based on the properties used, not their values. One of the larger (~130k) sets of items using the combination P31 and P179 is UK law items, for which the suggestions are highly appropriate. --Tagishsimon (talk) 13:23, 28 March 2021 (UTC)
I think it's both: properties used + values of P31/P279/possibly some others. Otherwise suggestions wouldn't be different if an item just had a P31. --- Jura 13:34, 28 March 2021 (UTC)
Suggestion here is that P31 value does not have much influence. https://phabricator.wikimedia.org/T132839#4125004 --Tagishsimon (talk) 14:05, 28 March 2021 (UTC)
Hopefully things have evolved three years later. Is there any relevance to the suggestions mentioned above? --- Jura 14:16, 28 March 2021 (UTC)
The comment that is linked above is still relevant. We're doing the best we can with the existing suggestion system, but it's far from perfect. FYI, there's a team of researchers who have been working on alternatives. Lea Lacroix (WMDE) (talk) 11:04, 29 March 2021 (UTC)

We cannot make sitelinks to redirects, despite promises in this thread. @Lydia Pintscher (WMDE): can you plz give some information? --SCIdude (talk) 11:09, 20 March 2021 (UTC)

I'm really really sorry. I've been too swamped and didn't get to reworking the ticket yet for the developers to reflect the new decision :( I'll do that today still. --Lydia Pintscher (WMDE) (talk) 11:58, 31 March 2021 (UTC)

Quadruplicate

Maybe it helps gain some insight into the "true duplicate" problem: Category:19th century in Castilla–La Mancha (Q98489873) was created four times with the same sitelink. Merged them today: Special:WhatLinksHere/Q98489873. --- Jura 15:40, 22 March 2021 (UTC)

Thank you! --Lydia Pintscher (WMDE) (talk) 11:55, 31 March 2021 (UTC)