
Wikidata:Database evaluation

In recent years, Wikidata has grown in use and become a major large-scale ontological knowledge base[1]. This rapid growth can give rise to several deficiencies in the data the database provides. These issues can relate to the hierarchical organization of Wikidata (a misuse of instance of (P31) or subclass of (P279))[2] or to other problems, such as items lacking labels in a given language[3].

Given the significant effect of such issues on the linked data quality of Wikidata, this page was created to provide simple SPARQL queries that can be used to track, and consequently resolve, common problems in Wikidata.

Description

The queries provided by this page were developed during SPARQL: Be connected to Wikidata, the first Wikidata meeting of the University of Sfax, held from 25 to 27 June 2019. They assess different aspects of Wikidata's data quality, such as hierarchical organization, reference support and language support. Some of these queries are inspired by Wikidata:SPARQL query service/queries/examples and User:Pigsonthewing/Queries.

Hierarchy verification:

  • Anti-pattern 1: Checks whether a Wikidata item is both an instance of and a subclass of the same item[2].
  • Anti-pattern 2: Checks whether a Wikidata item is an instance of or a subclass of two items that are themselves taxonomically related[2].

Property use:

  • Properties that are used the same as Instance Of
  • Properties that are the inverse of Subclass Of

Missing information:

  • Buildings in Tunisia without geo-locations
  • Buildings in Tunisia without images
  • Buildings in Tunisia without an English Wikipedia article

Language support:

  • Properties without labels in Arabic
  • Properties without descriptions in Arabic
  • Colours without labels in Arabic
  • Colours without descriptions in Arabic
  • Items and properties where the description is the same as the label

Reference support:

  • Wikidata statements not supported by references
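
The queries on this page can be pasted into the Wikidata Query Service interface, but they can also be sent to its SPARQL endpoint from a script. The following is a minimal sketch using only the Python standard library. The helper names `build_request` and `run_query` and the User-Agent string are illustrative assumptions, not part of the original page; replace the User-Agent with one that identifies your own tool.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Placeholder User-Agent (an assumption for this sketch).
DEFAULT_USER_AGENT = "DatabaseEvaluationSketch/0.1 (example script)"


def build_request(query: str, user_agent: str = DEFAULT_USER_AGENT) -> urllib.request.Request:
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": user_agent},
    )


def run_query(query: str) -> list:
    """Send the query (needs network access) and return the result bindings."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `run_query` with the text of any query from this page should return a list of dictionaries, one per result row, keyed by variable name (for example `row["item"]["value"]`). Long-running queries may still hit the service's timeout.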

Hierarchy evaluation

Anti-pattern 1

#by Csisc, 2019-06-27
SELECT DISTINCT * WHERE {
 ?C wdt:P31 ?X .
 ?C wdt:P279 ?X .} 
LIMIT 100
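
To make the check concrete, here is a small in-memory sketch of the same test on toy data. The identifiers (`Q_dog`, `Q_animal`, and so on) are invented for illustration and are not real Wikidata items; the function simply intersects the "instance of" pairs with the "subclass of" pairs, which is what the two triple patterns in the query amount to.

```python
# Toy triples (subject, property, object); identifiers are made up.
TRIPLES = {
    ("Q_poodle", "P279", "Q_dog"),
    ("Q_dog", "P279", "Q_animal"),
    ("Q_dog", "P31", "Q_animal"),  # same target as the P279 statement above
    ("Q_rex", "P31", "Q_dog"),
}


def anti_pattern_1(triples):
    """Return sorted (item, target) pairs linked by both P31 and P279."""
    instance_of = {(s, o) for s, p, o in triples if p == "P31"}
    subclass_of = {(s, o) for s, p, o in triples if p == "P279"}
    return sorted(instance_of & subclass_of)


print(anti_pattern_1(TRIPLES))  # [('Q_dog', 'Q_animal')]
```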

Anti-pattern 2

#by Csisc, 2019-06-27
SELECT DISTINCT * WHERE {
 ?C wdt:P31|wdt:P279 ?B .
 # wdt:P279+ rather than wdt:P279*: the zero-length path would let ?B equal ?A and match every item
 ?B wdt:P31|wdt:P279+ ?A .
 ?C wdt:P31|wdt:P279 ?A . } 
LIMIT 100
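
The same idea can be sketched on toy data. This example assumes that "taxonomically related" means the lower item reaches the upper item through a single instance of (P31) link or through one or more subclass of (P279) links; that reading, the function names and the identifiers are all assumptions made for illustration.

```python
def subclass_closure(triples):
    """Map each item to every class reachable via one or more P279 links."""
    direct = {}
    for s, p, o in triples:
        if p == "P279":
            direct.setdefault(s, set()).add(o)
    closure = {}
    for start in direct:
        seen, stack = set(), list(direct[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(direct.get(node, ()))
        closure[start] = seen
    return closure


def anti_pattern_2(triples):
    """Return sorted (item, lower, upper) where item points at both lower
    and upper, and lower is itself placed below upper."""
    parents = {}
    for s, p, o in triples:
        if p in ("P31", "P279"):
            parents.setdefault(s, set()).add(o)
    above = subclass_closure(triples)
    for s, p, o in triples:
        if p == "P31":  # a single P31 link also places s below o
            above.setdefault(s, set()).add(o)
    hits = []
    for item, targets in parents.items():
        for lower in targets:
            for upper in targets:
                if upper != lower and upper in above.get(lower, set()):
                    hits.append((item, lower, upper))
    return sorted(hits)


TOY = {
    ("Q_rex", "P31", "Q_dog"),
    ("Q_rex", "P31", "Q_animal"),  # redundant: Q_dog is already below Q_animal
    ("Q_dog", "P279", "Q_mammal"),
    ("Q_mammal", "P279", "Q_animal"),
    ("Q_tom", "P31", "Q_cat"),
}

print(anti_pattern_2(TOY))  # [('Q_rex', 'Q_dog', 'Q_animal')]
```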

Property use

Properties that are used the same as Instance Of

#by Csisc, 2019-06-27
SELECT DISTINCT ?prop WHERE {
  ?X wdt:P31 ?Y .
  ?X ?prop ?Y .
  FILTER(?prop!=wdt:P31)
}
LIMIT 10

Properties that are the inverse of Subclass Of

#by Csisc, 2019-06-27
SELECT DISTINCT ?prop WHERE {
  ?X wdt:P279 ?Y .
  ?Y ?prop ?X .
}
LIMIT 10

Missing information

Buildings in Tunisia without geo-locations

#by Pigsonthewing and Csisc, 2019-07-05
SELECT ?item ?itemLabel WHERE {
  ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
    wdt:P17 wd:Q948.
  MINUS { ?item wdt:P625 []. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Buildings in Tunisia without images

#by Pigsonthewing and Csisc, 2019-07-05
SELECT distinct ?item ?itemLabel with 
{
  select distinct ?item where
  {    ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
    wdt:P17 wd:Q948 . }
} as %i
where
{include %i
  filter not exists {?item wdt:P18 [] .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


See also: https://tools.wmflabs.org/wikishootme/
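
The missing-coordinates and missing-images checks above differ mainly in the property being tested, so the pattern can be parameterized. The sketch below builds the query text in the simpler `MINUS` form used by the geo-location query; the function name `missing_property_query` and its parameters are hypothetical, introduced only for this example.

```python
import re


def missing_property_query(class_qid: str, country_qid: str, property_pid: str) -> str:
    """Build a query for instances of a class in a country that lack a property."""
    checks = ((class_qid, r"Q\d+"), (country_qid, r"Q\d+"), (property_pid, r"P\d+"))
    for value, pattern in checks:
        # Reject anything that is not a plain Wikidata identifier.
        if not re.fullmatch(pattern, value):
            raise ValueError(f"unexpected identifier: {value!r}")
    return f"""SELECT ?item ?itemLabel WHERE {{
  ?item (wdt:P31/(wdt:P279*)) wd:{class_qid};
    wdt:P17 wd:{country_qid}.
  MINUS {{ ?item wdt:{property_pid} []. }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""


# Buildings (Q41176) in Tunisia (Q948) without an image (P18).
print(missing_property_query("Q41176", "Q948", "P18"))
```

Swapping `"P18"` for `"P625"` reproduces the geo-location check; other classes or countries only need their own identifiers.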

Buildings in Tunisia without an English Wikipedia article

#by Pigsonthewing and Csisc, 2019-07-05
SELECT distinct ?item ?itemLabel with 
{
  select distinct ?item where
  {  ?item (wdt:P31/(wdt:P279*)) wd:Q41176;
    wdt:P17 wd:Q948 . }
} as %i
where
{include %i
  filter not exists {?article schema:about ?item ;
          schema:isPartOf <https://en.wikipedia.org/> .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Language support

Properties without labels in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
	?item rdf:type wikibase:Property .
  	OPTIONAL {?item rdfs:label ?label1 	filter(lang(?label1) = "ar")}
	FILTER(!BOUND(?label1))
 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr" }
}
LIMIT 100

Properties without descriptions in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
	?item rdf:type wikibase:Property .
  	OPTIONAL {?item schema:description ?label1 	filter(lang(?label1) = "ar")}
	FILTER(!BOUND(?label1))
 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "ar,en,fr" }
}
LIMIT 100

Colours without labels in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
	?item wdt:P31 wd:Q1075 .
  	OPTIONAL {?item rdfs:label ?label1 	filter(lang(?label1) = "ar")}
	FILTER(!BOUND(?label1))
 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr" }
}
LIMIT 100

Colours without descriptions in Arabic

#by Csisc, 2019-06-27
SELECT ?item ?itemLabel
WHERE
{
	?item wdt:P31 wd:Q1075 .
  	OPTIONAL {?item schema:description ?label1 	filter(lang(?label1) = "ar")}
	FILTER(!BOUND(?label1))
 
    SERVICE wikibase:label { bd:serviceParam wikibase:language "ar,en,fr" }
}
LIMIT 100
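
The four queries above share one shape and differ only in the item selector, the predicate (`rdfs:label` or `schema:description`) and the language code. As a rough sketch, the text can be generated for any language; the function name `missing_text_query` and its parameters are assumptions made for this example, and the label service languages are fixed to "en,fr" for simplicity.

```python
def missing_text_query(selector: str, kind: str, language: str, limit: int = 100) -> str:
    """Build a query for items lacking a label or description in a language."""
    predicates = {"label": "rdfs:label", "description": "schema:description"}
    if kind not in predicates:
        raise ValueError(f"kind must be one of {sorted(predicates)}")
    # Accept only simple language codes such as "ar" or "zh-hans".
    if not language.replace("-", "").isalnum():
        raise ValueError(f"unexpected language code: {language!r}")
    return f"""SELECT ?item ?itemLabel
WHERE
{{
  {selector}
  OPTIONAL {{ ?item {predicates[kind]} ?text FILTER(LANG(?text) = "{language}") }}
  FILTER(!BOUND(?text))

  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en,fr" }}
}}
LIMIT {limit}"""


# Colours (Q1075) without a description in Arabic.
print(missing_text_query("?item wdt:P31 wd:Q1075 .", "description", "ar"))
```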

Items and properties where the description is the same as the label

#by YMS, 2016-09-26
SELECT ?item ?label (lang(?label) as ?lang) WHERE {
  ?item rdfs:label ?label .
  ?item schema:description ?label .
} 
LIMIT 100

Reference support

Wikidata statements not supported by references

#by Csisc, 2019-06-27
SELECT DISTINCT ?item ?prop WHERE {
  ?item ?prop ?statement .
  # Keep only p: predicates (item to statement node); the earlier regex filters also let ps:, pq: and pr: triples through
  FILTER(STRSTARTS(STR(?prop), "http://www.wikidata.org/prop/P"))
  FILTER NOT EXISTS { ?statement prov:wasDerivedFrom ?derivedFrom . }
}
LIMIT 100

References

  1. Pellissier Tanon, T., Vrandečić, D., Schaffert, S., Steiner, T., & Pintscher, L. (2016, April). From Freebase to Wikidata: The great migration. In Proceedings of the 25th International Conference on World Wide Web (pp. 1419-1428). International World Wide Web Conferences Steering Committee.
  2. Brasileiro, F., Almeida, J. P. A., Carvalho, V. A., & Guizzardi, G. (2016, April). Applying a multi-level modeling theory to assess taxonomic hierarchies in Wikidata. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 975-980). International World Wide Web Conferences Steering Committee.
  3. Kaffee, L. A., Piscopo, A., Vougiouklis, P., Simperl, E., Carr, L., & Pintscher, L. (2017, August). A glimpse into Babel: An analysis of multilinguality in Wikidata. In Proceedings of the 13th International Symposium on Open Collaboration (p. 14). ACM.