Wikidata:WikidataCon 2017/Notes/Wikidata Toolkit: How to use Wikidata in Java

Title: How to use Wikidata in Java

Note-taker(s): Lucas

Speaker(s)

Name or username: Tpt

Contact (email, Twitter, etc.): @Tpt93 thomas pellissier-tanon.fr

Slides: https://docs.google.com/presentation/d/e/2PACX-1vTusPACVFXTzEd64pDU1WqzM1BFMK3uJ-SO0ATA-OAkJiLZmjkGL7YVkqJQOxXVl7YIhBxN8Eut7Rid/pub

Documentation: https://www.mediawiki.org/wiki/Wikidata_Toolkit

Eclipse setup: https://www.mediawiki.org/wiki/Wikidata_Toolkit/Eclipse_setup

git clone https://github.com/Wikidata/Wikidata-Toolkit.git

Dump: https://people.wikimedia.org/~hoo/tmp/wikidata-20171028-all-first2500.json.gz

http://wikidata.org/entity/Q24075199

Abstract

Wikidata Toolkit is a Java library for accessing Wikidata and other Wikibase installations. It can be used to create bots and to download and parse dumps, e.g. for complex analysis of Wikidata content. The aim of this workshop is to give a quick introduction to Wikidata Toolkit and to show an easy way to create bots and manipulate Wikidata dumps.

Collaborative notes of the session

WikidataToolkit

Wikidata client library in Java, developed in 2014/15 by TU Dresden

fast processing based on dumps (e.g. for queries that cannot be run on WDQS)

easy editing, without all the complexity of the API

proof of concept of RDF mapping

nice Java objects for Wikidata items, can be used with Java Stream APIs

dump processing: automatically downloads most recent dump and processes it with your processor

RDF export is mostly outdated

some statistics utilities and lots of examples
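
To make the dump-processing idea above more concrete, here is a minimal sketch that uses only the Java standard library. It is not Wikidata Toolkit's API: the class `DumpLineSketch` and the method `forEachEntity` are hypothetical names made up for illustration. It relies on the layout of the Wikidata JSON dumps (one JSON array, one entity per line, each line ending in a comma) and hands the raw JSON text of each entity to a "processor" callback via the Java Stream API. The toolkit itself goes further and turns each entity into Java objects before calling your processor.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.Consumer;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class for illustration; not part of Wikidata Toolkit.
class DumpLineSketch {

    /**
     * Feeds the raw JSON text of every entity in the dump to the processor
     * and returns how many entities were seen.
     */
    static long forEachEntity(BufferedReader dump, Consumer<String> processor) {
        long[] count = {0};
        dump.lines()
            .map(String::trim)
            // Skip blank lines and the "[" / "]" lines that frame the array.
            .filter(line -> !line.isEmpty() && !line.equals("[") && !line.equals("]"))
            // Drop the trailing comma that separates one entity from the next.
            .map(line -> line.endsWith(",") ? line.substring(0, line.length() - 1) : line)
            .forEach(json -> {
                processor.accept(json);
                count[0]++;
            });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: DumpLineSketch <path to .json.gz dump>");
            return;
        }
        // args[0] is the path to a local gzipped dump file.
        try (BufferedReader dump = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(Paths.get(args[0]))),
                StandardCharsets.UTF_8))) {
            long[] characters = {0};
            // Example processor: add up the length of every entity's JSON text.
            long entities = forEachEntity(dump, json -> characters[0] += json.length());
            System.out.println(entities + " entities, " + characters[0] + " characters of JSON");
        }
    }
}
```

Because the sketch reads line by line, it also works on a dump that was cut off after the first N lines, where the closing `]` is missing. A real processor would parse each line with a JSON library instead of treating it as plain text.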

hands-on session: with computer and an IDE (see slides for instructions)

TutorialExample will fail at first because it runs in offline mode. To get it working, disable offline mode (comment out line 51), start the program, stop it again quickly, and overwrite the partially downloaded file with the truncated one from above (“Dump”).