Wikidata:WikidataCon 2017/Notes/Using Wikidata Query and QuickStatements to automatically amend Wikidata items

Title: Using Wikidata Query and QuickStatements to automatically amend Wikidata items

Note-taker(s): Lucas

Speaker(s)

Name or username: Geert Van Pamel (Geertivp)

Abstract

Using Wikidata Query to verify the quality and completeness of Wikidata items. How to find anomalies. How to find missing language translations. How to find mismatched key/value pairs. How to perform analysis, reports, and statistics. Using QuickStatements to automatically create and amend Wikidata items. How to build lists.

Examples: Create Labels in other languages, add (missing) descriptions, count the most used description for a group of items, align/correct descriptions.

Collaborative notes of the session

No two items can have the same label + description combination in the same language. QuickStatements batches can run into this constraint.

checkConstraints can detect more problems as well, e.g. duplicate properties (e.g. two dates of birth, one with only a year and one with a full date) and missing reciprocal or symmetric properties

description may contain a female job description (“actress”) but sex or gender (P21) may be male, or vice versa
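A minimal sketch of such a mismatch check, assuming the query results have already been reduced to (Q-ID, description, P21 value) rows. The word list and the function name are hypothetical and not from the session; Q6581097 and Q6581072 are the Wikidata items for male and female.

```python
# Values of sex or gender (P21) on Wikidata.
MALE = "Q6581097"
FEMALE = "Q6581072"

# Hypothetical, deliberately tiny word list (assumption, not from the session).
# A real check would need a longer list for each language.
GENDERED_WORDS = {
    "actress": FEMALE,
    "businesswoman": FEMALE,
    "businessman": MALE,
}


def find_gender_mismatches(rows):
    """Yield (qid, word) when a gendered word in the description disagrees with P21."""
    for qid, description, gender in rows:
        if gender not in (MALE, FEMALE):
            continue  # other or missing values are left for manual review
        for raw_word in description.lower().split():
            word = raw_word.strip(",.;()")
            expected = GENDERED_WORDS.get(word)
            if expected is not None and expected != gender:
                yield qid, word
```

The flagged items still need a manual look, since either the description or the P21 value may be the wrong one.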

Sister city example: a query searching for Belgian cities with a sister city that doesn’t refer back to the first city. The gaps can be fixed automatically with QuickStatements.

Amsterdam has sister city Rotterdam, Rotterdam doesn’t have sister city Amsterdam
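The query shown in the session is not recorded in these notes. The following is a sketch of what such a query could look like, wrapped in a hypothetical Python helper so that the country can be swapped. It assumes P190 is the sister city property, P17 is country, and Q31 is Belgium; it matches anything located in the country that has a P190 statement, not only items typed as cities.

```python
import re

QID = re.compile(r"Q[1-9]\d*")
PID = re.compile(r"P[1-9]\d*")


def reciprocal_gap_query(country_qid: str = "Q31", prop: str = "P190") -> str:
    """Build a SPARQL query for items whose `prop` target does not link back."""
    if not QID.fullmatch(country_qid):
        raise ValueError(f"not a Q-ID: {country_qid!r}")
    if not PID.fullmatch(prop):
        raise ValueError(f"not a property ID: {prop!r}")
    return f"""SELECT ?city ?sister WHERE {{
  ?city wdt:P17 wd:{country_qid} ;
        wdt:{prop} ?sister .
  FILTER NOT EXISTS {{ ?sister wdt:{prop} ?city . }}
}}"""


if __name__ == "__main__":
    # Paste the printed query into the Wikidata Query Service.
    print(reciprocal_gap_query())
```

Each result row is a pair where the first item names the second as sister city but the reverse statement is missing.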

Also shown by the checkConstraints gadget (in Preferences > Gadgets).

Process:

download query results as CSV

open with Excel

extract the Q-ID from the “wd:Q123” column with =MID(FIELD;4;20) (start position 4, so that the leading “Q” is kept)

Harmonia points to the CSV2QuickStatements tool, which does this automatically: https://tools.wmflabs.org/ash-dev/wdutils/csv2quickstatements.php

paste into QuickStatements (make sure to log in when using QuickStatements for the first time)

sister city added
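The spreadsheet step above could also be scripted. This is a minimal sketch, assuming the downloaded CSV has columns named `city` and `sister` (hypothetical names that depend on the query variables) and that each cell ends in a Q-ID, whether it is written as “wd:Q123” or as a full entity URI. It emits tab-separated commands in the QuickStatements version 1 format (item, property, value).

```python
import csv
import io
import re

QID_AT_END = re.compile(r"(Q\d+)$")


def extract_qid(value: str) -> str:
    """Return the trailing Q-ID of 'wd:Q123' or '.../entity/Q123'."""
    match = QID_AT_END.search(value.strip())
    if match is None:
        raise ValueError(f"no Q-ID found in {value!r}")
    return match.group(1)


def reciprocal_commands(csv_text: str, prop: str = "P190") -> list[str]:
    """Turn (city, sister) rows into 'sister <TAB> prop <TAB> city' commands."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        f"{extract_qid(row['sister'])}\t{prop}\t{extract_qid(row['city'])}"
        for row in reader
    ]


if __name__ == "__main__":
    # Placeholder Q-IDs for illustration only.
    sample = "city,sister\nwd:Q100,wd:Q200\n"
    print("\n".join(reciprocal_commands(sample)))
```

The printed lines can then be pasted into QuickStatements as in the manual process; reviewing them before running the batch is still advisable.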

QuickStatements can also fix qualifiers, add labels/descriptions, …
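As an illustration of the label and description case, here is a small sketch that builds such commands in the QuickStatements version 1 format, where `L` or `D` followed by a language code sets a label or description and the text goes in double quotes. The helper names are hypothetical. Because the notes do not cover how quotes inside the text should be escaped, the sketch simply rejects such values instead of guessing.

```python
import re

QID = re.compile(r"Q[1-9]\d*")
LANG = re.compile(r"[a-z]{2,3}(-[a-z0-9]+)*")


def _text_command(qid: str, kind: str, lang: str, text: str) -> str:
    """Build one tab-separated command; kind is 'L' (label) or 'D' (description)."""
    if not QID.fullmatch(qid):
        raise ValueError(f"not a Q-ID: {qid!r}")
    if not LANG.fullmatch(lang):
        raise ValueError(f"unexpected language code: {lang!r}")
    if any(ch in text for ch in '"\t\n'):
        raise ValueError("quotes, tabs and newlines are not handled in this sketch")
    return f'{qid}\t{kind}{lang}\t"{text}"'


def label_command(qid: str, lang: str, text: str) -> str:
    return _text_command(qid, "L", lang, text)


def description_command(qid: str, lang: str, text: str) -> str:
    return _text_command(qid, "D", lang, text)


if __name__ == "__main__":
    # Q4115189 is the Wikidata Sandbox item, used here as a safe target.
    print(label_command("Q4115189", "nl", "Wikidata-zandbak"))
    print(description_command("Q4115189", "nl", "testitem"))
```

Keep in mind the label + description uniqueness rule when generating many descriptions at once.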

Questions / Answers

Overview of the session

Constraints
  • Wikidata is extremely open
  • Anyone can edit
  • Constraints are not proactively checked => only visible after saving the data
Techniques
  • Run Wikidata query
  • Detect missing data
  • Detect wrong data
  • Generate transaction file
  • Run QuickStatements
  • Review errors
  • Correct manually

Pitfalls
  • Take care; avoid mistakes; verify
  • Be prepared for negative feedback
  • When creating a new subject (item), add at least one statement
Culture
  • Importance of languages
  • Multilingual countries
  • EN is most used
  • Get your language/culture known in EN => others will translate/build in their own language
  • Add a WM link

External links