Wikidata:Database evaluation

In recent years, Wikidata has grown in use and become a major large-scale ontological knowledge base[1]. This rapid growth can give rise to several deficiencies in the data it provides. These may concern the hierarchical organization of Wikidata (a misuse of instance of (P31) or subclass of (P279))[2] or other issues, such as items lacking labels in a given language[3].

Because such issues strongly affect the linked data quality of Wikidata, this page provides simple SPARQL queries that can be used to track, and consequently fix, common problems in Wikidata.

Description

The queries on this page were developed during SPARQL: Be connected to Wikidata, the first Wikidata meeting of the University of Sfax, held from 25 to 27 June 2019. They assess different aspects of Wikidata's data quality, such as hierarchical organization, reference support and language support. Some of the queries are inspired by Wikidata:SPARQL query service/queries/examples and User:Pigsonthewing/Queries.

Hierarchy verification:

  • Anti-pattern 1: Checks whether a Wikidata item is both an instance of and a subclass of the same item[2].
  • Anti-pattern 2: Checks whether a Wikidata item is an instance of or a subclass of two items that are themselves taxonomically related[2].

Property use:

  • Properties used in the same way as instance of (P31)
  • Properties that are the inverse of subclass of (P279)

Missing information:

  • Buildings in Tunisia without geo-locations
  • Buildings in Tunisia without images
  • Buildings in Tunisia without an English Wikipedia article

Language support:

  • Properties without labels in Arabic
  • Properties without descriptions in Arabic
  • Colours without labels in Arabic
  • Colours without descriptions in Arabic
  • Items and properties where the description is the same as the label

Reference support:

  • Wikidata statements not supported by references

Hierarchy evaluation

Anti-pattern 1

#by Csisc, 2019-06-27
SELECT DISTINCT * WHERE {
  ?C wdt:P31 ?X .   # ?C is an instance of ?X
  ?C wdt:P279 ?X .  # and also a subclass of the same ?X
}
LIMIT 100

Anti-pattern 2

#by Csisc, 2019-06-27
SELECT DISTINCT * WHERE {
  ?C wdt:P31|wdt:P279 ?B .
  # "+" rather than "*": a zero-length path would let ?A equal ?B and match trivially
  ?B wdt:P31|(wdt:P279+) ?A .
  ?C wdt:P31|wdt:P279 ?A .
}
LIMIT 100
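
A narrower variant can be easier to interpret. The sketch below looks only at items that are direct instances of two classes, one of which is a transitive subclass of the other. It is an illustrative assumption about one sub-case of the anti-pattern, not a query taken from the cited paper, and the variable names are arbitrary.

```sparql
# Sketch (not part of the original page): redundant instance-of statements
SELECT DISTINCT ?item ?narrower ?broader WHERE {
  ?item wdt:P31 ?narrower .        # item is an instance of the narrower class
  ?narrower wdt:P279+ ?broader .   # narrower class is a transitive subclass of the broader one
  ?item wdt:P31 ?broader .         # item is also stated to be an instance of the broader class
}
LIMIT 100
```

Each result row names the item and the two related classes, so the redundant statement can be reviewed by hand before anything is changed.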

Property use

Properties used in the same way as instance of (P31)

#by Csisc, 2019-06-27
SELECT DISTINCT ?prop WHERE {
  ?X wdt:P31 ?Y .
  ?X ?prop ?Y .
  FILTER(?prop!=wdt:P31)
}
LIMIT 10

Properties that are the inverse of subclass of (P279)

#by Csisc, 2019-06-27
SELECT DISTINCT ?prop WHERE {
  ?X wdt:P279 ?Y .
  ?Y ?prop ?X .
}
LIMIT 10

Missing information

Buildings in Tunisia without geo-locations

#by Pigsonthewing and Csisc, 2019-07-05
SELECT ?item ?itemLabel WHERE {
  ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
    wdt:P17 wd:Q948.
  MINUS { ?item wdt:P625 []. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Buildings in Tunisia without images

#by Pigsonthewing and Csisc, 2019-07-05
SELECT DISTINCT ?item ?itemLabel WITH {
  SELECT DISTINCT ?item WHERE {
    ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
      wdt:P17 wd:Q948 .
  }
} AS %i
WHERE {
  INCLUDE %i
  FILTER NOT EXISTS { ?item wdt:P18 [] . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

See also: https://tools.wmflabs.org/wikishootme/

Buildings in Tunisia without an English Wikipedia article

#by Pigsonthewing and Csisc, 2019-07-05
SELECT DISTINCT ?item ?itemLabel WITH {
  SELECT DISTINCT ?item WHERE {
    ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
      wdt:P17 wd:Q948 .
  }
} AS %i
WHERE {
  INCLUDE %i
  FILTER NOT EXISTS {
    ?article schema:about ?item ;
      schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Language support

Properties without labels in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
  ?item rdf:type wikibase:Property .
  OPTIONAL { ?item rdfs:label ?label1 . FILTER(LANG(?label1) = "ar") }
  FILTER(!BOUND(?label1))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr" }
}
LIMIT 100
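
The OPTIONAL and !BOUND idiom used above can also be written with FILTER NOT EXISTS, which the buildings queries on this page already use. The following is a minimal sketch of that alternative form for the same check. It is not part of the original set of queries, and the variable name ?label is an arbitrary choice.

```sparql
# Sketch (not part of the original page): properties without an Arabic label,
# expressed with FILTER NOT EXISTS instead of OPTIONAL + !BOUND
SELECT ?item ?itemLabel
WHERE
{
  ?item rdf:type wikibase:Property .
  FILTER NOT EXISTS {
    ?item rdfs:label ?label .
    FILTER(LANG(?label) = "ar")
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr" }
}
LIMIT 100
```

The same substitution should apply to the other language-support queries by swapping rdfs:label for schema:description or changing the language code.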

Properties without descriptions in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
  ?item rdf:type wikibase:Property .
  OPTIONAL { ?item schema:description ?label1 . FILTER(LANG(?label1) = "ar") }
  FILTER(!BOUND(?label1))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "ar,en,fr" }
}
LIMIT 100

Colours without labels in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q1075 .
  OPTIONAL { ?item rdfs:label ?label1 . FILTER(LANG(?label1) = "ar") }
  FILTER(!BOUND(?label1))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr" }
}
LIMIT 100

Colours without descriptions in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q1075 .
  OPTIONAL { ?item schema:description ?label1 . FILTER(LANG(?label1) = "ar") }
  FILTER(!BOUND(?label1))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "ar,en,fr" }
}
LIMIT 100

Items and properties where the description is the same as the label

#by YMS, 2016-09-26
SELECT ?item ?label (lang(?label) as ?lang) WHERE {
  ?item rdfs:label ?label .
  ?item schema:description ?label .
} 
LIMIT 100

Reference support

Wikidata statements not supported by references

#by Csisc, 2019-06-27
SELECT DISTINCT ?item ?prop WHERE {
  ?item ?prop ?statement .
  # Keep only full-statement predicates (p:Pxxx); this excludes wdt: and the
  # other prop/... namespaces, whose objects are not statement nodes
  FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/P"))
  # The statement node carries no reference
  FILTER NOT EXISTS { ?statement prov:wasDerivedFrom ?derivedFrom . }
}
LIMIT 100
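
The same check can be scoped to a single property and a bounded set of items, which makes the results easier to act on. As a sketch only, the query below reuses the "buildings in Tunisia" selection from the Missing information section and lists those whose country (P17) statement has no reference. The choice of P17 and of this item set is an assumption made for illustration, not part of the original page.

```sparql
# Sketch (not part of the original page): buildings in Tunisia whose
# country (P17) statement is not supported by a reference
SELECT DISTINCT ?item ?itemLabel WHERE {
  ?item (wdt:P31/(wdt:P279*)) wd:Q41176 ;
        wdt:P17 wd:Q948 .
  ?item p:P17 ?statement .   # full statement node for country (P17)
  FILTER NOT EXISTS { ?statement prov:wasDerivedFrom ?ref . }   # no reference attached
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```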

References

  1. Pellissier Tanon, T., Vrandečić, D., Schaffert, S., Steiner, T., & Pintscher, L. (2016, April). From Freebase to Wikidata: The great migration. In Proceedings of the 25th International Conference on World Wide Web (pp. 1419-1428). International World Wide Web Conferences Steering Committee.
  2. Brasileiro, F., Almeida, J. P. A., Carvalho, V. A., & Guizzardi, G. (2016, April). Applying a multi-level modeling theory to assess taxonomic hierarchies in Wikidata. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 975-980). International World Wide Web Conferences Steering Committee.
  3. Kaffee, L. A., Piscopo, A., Vougiouklis, P., Simperl, E., Carr, L., & Pintscher, L. (2017, August). A glimpse into Babel: An analysis of multilinguality in Wikidata. In Proceedings of the 13th International Symposium on Open Collaboration (p. 14). ACM.