Wikidata:Contact the development team/Archive/2018/12

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Constraint violations no longer displayed when saving a statement

For several weeks, constraint violations have no longer been displayed when you save a statement. You have to manually purge the item several times to make them appear, which makes them useless. Is this a known bug? — Ayack (talk) 12:28, 1 December 2018 (UTC)

Hmm, no, that's not a known bug. I just tried it on the sandbox and it worked fine for me. Strange. Can someone else please try as well? Can you please check if there are any errors in the developer console of your browser? --Lydia Pintscher (WMDE) (talk) 14:28, 1 December 2018 (UTC)
It seems to be caused by a script in my common.js, but I haven't found which one yet. — Ayack (talk) 17:40, 4 December 2018 (UTC)

Wikidata on Google doesn't show an image

I created a Wikidata item for a rapper's Wikipedia article a while back, and every time I search on Google the Wikidata information shows up but without any image. How can I fix this problem?  – The preceding unsigned comment was added by 2600:6c5a:477f:d8a3:eddb:d4f8:c418:2851 (talk • contribs) at 08:06, 7 December 2018‎ (UTC).

It is a common misconception that Wikidata offers a direct path into Google's search results. Google may or may not rely on data from Wikidata, and there is nothing you or we can do about this. —MisterSynergy (talk) 08:24, 7 December 2018 (UTC)

Knowledge Panel Relations

Dear Wiki team,

I acted in a film called Nicabob, and the film has a Knowledge Panel. I was wondering what I can do to get a Knowledge Panel for myself, since I was the lead actor in the film?


Sincerely,

Nicolaas Migliore

What "Knowledge Pannel" are you referring to? ·addshore· talk to me! 12:37, 18 December 2018 (UTC)

Duplicates in RDF dump?

Hi, for a performance evaluation we loaded the RDF "all" dump into Neptune.

The load time was long, as expected. I was surprised, however, by the large number of duplicates being reported, and could not really find anything on the internet about this. So I am wondering: are we doing something wrong, or are these duplicates real? And if so, why are they there?

{
    "status" : "200 OK",
    "payload" : {
        "feedCount" : [
            {
                "LOAD_COMPLETED" : 1
            }
        ],
        "overallStatus" : {
            "fullUri" : "s3://foobarbaz/neptune/benchmark/wikidata",
            "runNumber" : 1,
            "retryNumber" : 0,
            "status" : "LOAD_COMPLETED",
            "totalTimeSpent" : 91353,
            "totalRecords" : 8057810937,
            "totalDuplicates" : 37334435,
            "parsingErrors" : 0,
            "datatypeMismatchErrors" : 0,
            "insertErrors" : 0
        }
    }
}

BR Thomas

@Lucas Werkmeister (WMDE): Can you answer? --Lydia Pintscher (WMDE) (talk) 14:29, 1 December 2018 (UTC)
(For readability, that's 37 million duplicates in 8 billion records.) I'm not sure exactly how we produce the dumps, but it's not too surprising to me that there would be some duplicate triples in the output. For example, if data for each entity is dumped individually and then combined, reference nodes which occur on multiple entities will be described multiple times: a reference node is identified by its hash, so multiple statements or entities with the same reference – e.g. imported from Wikimedia project (P143): English Wikipedia (Q328) – will result in the same reference node, but each entity's dump will repeat that reference node's triples. And I think that is more or less how our dumps are produced.
However, while this somewhat bloats the dumps (hopefully, after compression, not too much), I don’t think it should be a problem. Repeating a triple in RDF is meaningless as far as I’m aware, so I assume the number of duplicates is just diagnostic output by Neptune with no effect on the resulting graph. --Lucas Werkmeister (WMDE) (talk) 10:36, 3 December 2018 (UTC)
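For anyone curious to see this sharing on the live data, here is a query sketch (it may well time out on the full endpoint, hence the LIMIT): it counts how many statements point at reference nodes carrying imported from Wikimedia project (P143) English Wikipedia (Q328) - each shared node is one whose triples a per-entity dump would emit more than once.
# prov:wasDerivedFrom links a statement to its reference node;
# pr:P143 / wd:Q328 is the "imported from English Wikipedia" reference
# used as the example above.
SELECT ?ref (COUNT(DISTINCT ?statement) AS ?uses) WHERE {
  ?ref pr:P143 wd:Q328 .
  ?statement prov:wasDerivedFrom ?ref .
}
GROUP BY ?ref
HAVING (COUNT(DISTINCT ?statement) > 1)
LIMIT 10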
Yes, duplicate triples in RDF are fine; they can be stored only once, and this should be the case for every graph DB. We could in theory eliminate them, but that would add a lot of complexity to the code, and I don't think it's worth it. Values/refs can sometimes be duplicated too, since multiple items can refer to the same one; we also have shards, and even within shards we use a kind of Bloom filter, which is not perfect and can have false negatives. Smalyshev (WMF) (talk) 21:20, 10 December 2018 (UTC)

Distinct value constraint not working

At Q60050658 and Q5227381, NLM Unique ID (P1055) has the same value, but no violation is shown. --- Jura 14:41, 23 December 2018 (UTC)

Items have now been merged. If you find another example, let us know. Lea Lacroix (WMDE) (talk) 13:08, 27 December 2018 (UTC)
Can you investigate why it happened? Otherwise some might get the impression that bug reports don't get followed up on. --- Jura 09:15, 28 December 2018 (UTC)
I just tested the constraint and it worked for me. Without a broken example, I don't think we can do more, unfortunately. --Lydia Pintscher (WMDE) (talk) 14:52, 30 December 2018 (UTC)
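For reference, here is a query sketch for hunting further examples: it lists pairs of items that share the same NLM Unique ID (P1055) value, i.e. exactly the cases the distinct-value constraint should flag.
SELECT ?value ?item1 ?item2 WHERE {
  ?item1 wdt:P1055 ?value .
  ?item2 wdt:P1055 ?value .
  # keep each pair only once and skip self-matches
  FILTER (STR(?item1) < STR(?item2))
}
LIMIT 100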

2019 development plan

I read through the plan for 2019 - thanks for putting this together, lots of good stuff in there! I'm particularly looking forward to seeing how ShEx and signed statements will work, and reworking the lexeme UI would be nice too (any concrete ideas on that yet?). And the plans for expanding federation, Wikibase, etc. all sound good!!!

Two things I think are really important for our community, though, are not mentioned. Maybe they're not issues for the Wikidata dev team, but then who would handle them?

  1. We get lots of requests for (likely to be problematic) inverse properties because of a Lua limitation - can this be addressed somehow? In SPARQL terms, the problem is that we need to allow Lua to fetch statements where the item is the object, not just the subject, of the triple (see the query sketch after this list). I would say this falls under "Encourage more data use", but it also helps improve data quality - inverse properties are a real source of confusion, since the same logical statement is present in two different ways, with potentially different references, qualifiers, and even conflicting values.
  2. The redirect issue (see last year's RFC that had pretty overwhelming community support - I'm not aware of any Wikidata RFC that has had such heavy participation). While there is some need for further community discussion about what to do, one clear need is that "redirects must be clearly visible and discriminated from other links". Can the dev team at least look at some support for flagging sitelinks that are currently redirects, and possibly work with the community on further defining what else may be needed here? The discussion has so far been hampered by a lack of communication from the dev side on this, and given the number of people concerned I think it's really important to get it addressed.
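To make the first point concrete, here is a sketch of the reverse lookup that is trivial in SPARQL but has no Lua equivalent, using author (P50) and Douglas Adams (Q42) purely as an illustration:
# All items whose author (P50) statement has Q42 as its object -
# i.e. statements fetched "from the object side", the access
# direction Lua currently lacks.
SELECT ?work WHERE {
  ?work wdt:P50 wd:Q42 .
}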

Thanks for all you're doing!!! ArthurPSmith (talk) 16:12, 27 December 2018 (UTC)

Here's one place with discussion of the first issue. But it comes up a lot. ArthurPSmith (talk) 19:16, 28 December 2018 (UTC)
Hey :)
For inverse properties, you are talking about accessing them from Lua on Wikipedia, for example? I hope we can make progress in this direction as part of automated list generation, but the details of that are still very much in the air. I'll add it to my notes for that.
About redirects: I'm still waiting for some movement on my comment there from the editors - basically seeing if we can make use of properties to define the needed relations. How would you like to see the redirects flagged? In the item, or as a list of all sitelinks to redirects across all items? --Lydia Pintscher (WMDE) (talk) 14:57, 30 December 2018 (UTC)
@Lydia Pintscher (WMDE): The easiest way to mark that a sitelink links to a redirect is probably to use the badge system. In the recent Community Wishlist survey this suggestion received 37 supports, making it the 4th most requested feature in the Wikidata category. The presence of such badges would be directly accessible from SPARQL, making it easy to generate lists of sitelinks to redirects in particular subject domains or globally. A popular suggestion in the Community Wishlist was that sitelinks to redirects could be indicated in the interwiki sidebar of Wikipedia articles by presenting them in italics; but a distinguishing icon could also be used. This all ought to be fairly easy to implement, as the required infrastructure essentially all exists already, as used to indicate good and featured articles.
The idea of using properties to generate additional "approximate" sitelinks is interesting, but would need considerable further research as to when such sitelinks are appropriate and helpful, when they are not, and what targets for them (if any) would be most appropriate. They would also, very importantly, need a manual override, to allow editors to determine where an "approximate" sitelink should point. The existing sitelinks to redirects (as approved by the RfC) are what would provide that manual functionality. There are already over 25,000 intentionally sitelinked redirects on en-wiki, marked as such with en:Template:Wikidata redirect, with equivalent templates on other big wikis. The first step is to better understand how those existing manually-added sitelinks to redirects are being used, and to better identify them to readers and editors. Those are in place now, they are not controversial, but they should be better labelled. That was the dominant view of the RfC, and it's time to implement it. Once that is done, and we can see how the labelling beds down, only then IMO should we consider the much more radical step of showing an additional virtual set of further guessed sitelinks automatically. Jheald (talk) 18:12, 30 December 2018 (UTC)
+1 to implementing the results of the RfC and the Community Wishlist. Marking existing redirects in italics seems like a good improvement over the current situation. --Micru (talk) 19:03, 30 December 2018 (UTC)
Ok. I'll check with the team. I fear they'll look at me with horror in their eyes because of performance issues but let's see. --Lydia Pintscher (WMDE) (talk) 19:07, 30 December 2018 (UTC)
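For context on the badge point above: badges are already exposed in the query service's RDF model, so a redirect badge would be queryable exactly like the existing ones. A sketch listing English Wikipedia sitelinks that carry the good article badge (Q17437798) - a sitelink-to-redirect badge would simply be a different badge item:
# English Wikipedia sitelinks carrying the good article badge.
SELECT ?item ?sitelink WHERE {
  ?sitelink schema:about ?item ;
            schema:isPartOf <https://en.wikipedia.org/> ;
            wikibase:badge wd:Q17437798 .
}
LIMIT 100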

Qualifiers are leaking to other values?

I'm trying to get the official websites of the capitals of Japanese prefectures; however, some entries have multiple websites for different purposes. Looking at the entry for Chiba, there is a qualifier "of" (P642) on the unwanted URL, so I tried to select by that qualifier. However, the query returns all links. Am I misunderstanding something, or is this a bug?

 SELECT ?capital ?capitalLabel ?url_official ?ofLabel WHERE {
   ?capital wdt:P31  wd:Q17221353 ;
            wdt:P856 ?url_official ;
            p:P856 [pq:P642 ?of] .
            
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
 }

--Mwil.jp (talk) 07:20, 31 December 2018 (UTC)

This is not a bug. Your query matches items that have a P856 statement and a P856 statement with a P642 qualifier - these two statements can be, but don't have to be, the same one. Thus, if an item has multiple official websites, you get all of them as long as just one statement has the P642 qualifier. The query you want is
SELECT ?capital ?capitalLabel ?url_official ?ofLabel WHERE {
  ?capital wdt:P31  wd:Q17221353 ;
           p:P856 ?statement .
  ?statement ps:P856 ?url_official ;
             pq:P642 ?of.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
And in case you want all official websites without the "of" qualifier, use
SELECT ?capital ?capitalLabel ?url_official WHERE {
  ?capital wdt:P31  wd:Q17221353 ;
           p:P856 ?statement .
  ?statement ps:P856 ?url_official .
  MINUS { ?statement pq:P642 ?of }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
--Pasleim (talk) 10:12, 31 December 2018 (UTC)