User:Lucas Werkmeister (WMDE)/Wikimania 2018/Quality Control on Wikidata using Constraints and Shape Expressions

Abstract

Data quality is an important topic on Wikidata, the knowledge base providing structured data to Wikimedia projects and the world. In addition to the usual on-wiki means of quality control (patrolling, recent changes, watchlists, …), the structured nature of Wikidata’s data enables structured quality control mechanisms as well, which automatically verify that data is consistent across the whole knowledge base and conforms to certain expected structures. Two such mechanisms are presented. Property constraints began as a community-driven process and are now also supported by the Wikidata development team; each property stores information on how it should and should not be used. Shape expressions (ShEx) are a language for describing the structure of an RDF dataset, developed in a W3C community group; community members have started to use them for Wikidata as well, based on Wikidata’s RDF export format, which is also used in the Wikidata Query Service. Both have their strengths and weaknesses (shape expressions are more powerful, while property constraints are easier to understand), and both have their place in the Wikidata community. This presentation introduces both mechanisms and shows how they can be used.
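To make the idea of a property constraint concrete, here is a minimal sketch of one kind of check: flagging items that carry more than one value for a property expected to have a single value. It runs on toy in-memory data rather than on Wikidata itself. The item names, the statement tuples, and the `single_value_violations` helper are all hypothetical, chosen for illustration, and do not reflect how the Wikidata constraint checker is actually implemented.

```python
from collections import Counter

# Toy data (hypothetical): (item, property, value) triples standing in
# for statements. Real Wikidata statements are considerably richer.
STATEMENTS = [
    ("item-A", "date of birth", "1952-03-11"),
    ("item-A", "instance of", "human"),
    ("item-B", "date of birth", "1900-01-01"),
    ("item-B", "date of birth", "1901-01-01"),
]

# Hypothetical constraint table: properties expected to have one value per item.
SINGLE_VALUE_PROPERTIES = {"date of birth"}


def single_value_violations(statements, single_value_properties):
    """Return sorted (item, property) pairs with more than one value."""
    counts = Counter(
        (item, prop)
        for item, prop, _value in statements
        if prop in single_value_properties
    )
    return sorted(pair for pair, n in counts.items() if n > 1)


violations = single_value_violations(STATEMENTS, SINGLE_VALUE_PROPERTIES)
print(violations)
```

The point of the sketch is that the rule lives alongside the property rather than in the checking code, which mirrors the abstract's description of constraint information being stored on each property.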

Usernames of all presenters

User:Lucas Werkmeister (WMDE), User:Daniel Mietchen, Andra Waagmeester

Keywords

  • wikidata
  • quality
  • constraints
  • shape expressions
  • shex
  • rdf

Outcome

Attendees will gain a deeper understanding of the property constraints mechanism and a preview of how shape expressions may shape the future of quality control on Wikidata.

Topics

  • Communities & Collaboration
  • Research & Academia
  • Technology & Software

Relationship to the theme

  • Content quality