Wikidata:WikidataCon 2017/Notes/Data donations: The process to get your data into Wikidata

Title: Data donations: The process to get your data into Wikidata

Note-taker(s): Gikü (first 20 min)

Speaker(s)

Name or username: Jens Ohlig, Wikimedia Deutschland

Contact (email, Twitter, etc.): User:Jens Ohlig (WMDE)

Useful links

Abstract

After you have convinced an institution to donate data (or, if you are an institution: after you have been convinced) the real fun starts — how to upload all the data? Wikidata lacks a big friendly upload button, but that should not put you off. We'll look at various parts of a typical workflow:

  • collecting data (copyright! database rights! fun!)
  • cleaning data (repairing faulty data with Google Spreadsheets and/or OpenRefine)
  • upload (QuickStatements, Pywikibot and everything else)

Collaborative notes of the session

Data donation often starts with organizations offering to donate their data to Wikidata so that it is not lost.

Jens is the first point of contact: he assesses the situation and directs the organization to a specific project or to specific people.

(An educational video clip about data donations is shown.)

Important to know: a donation does not mean uploading data to WD and forgetting about it; the data should also be maintained and checked afterwards.

Data pipeline: Find --> Get --> Verify --> Clean --> Analyse --> Present

For Wikidata it means:

Preparation: Describe data --> Import in a spreadsheet --> Define WD data structure --> Format data in the spreadsheet so that it is importable to WD
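The last preparation step can be illustrated with a small script. This is a minimal sketch, not something shown in the session: it assumes the spreadsheet has been exported as CSV with hypothetical `name` and `description` columns, and it emits tab-separated commands in the QuickStatements V1 text format (`CREATE`, then `LAST` lines for label, description and statements). The function name `rows_to_quickstatements` and the choice of P31 (instance of) with Q5 (human) are purely illustrative.

```python
import csv
import io


def rows_to_quickstatements(csv_text, instance_of):
    """Turn CSV rows into QuickStatements V1 commands (one new item per row).

    Assumes hypothetical 'name' and 'description' columns and values that
    contain no double quotes or tabs; real data needs cleaning first.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("name") or "").strip()
        if not name:
            continue  # a row without a label cannot become a useful item
        lines.append("CREATE")
        lines.append(f'LAST\tLen\t"{name}"')  # English label
        description = (row.get("description") or "").strip()
        if description:
            lines.append(f'LAST\tDen\t"{description}"')  # English description
        lines.append(f"LAST\tP31\t{instance_of}")  # "instance of" statement
    return "\n".join(lines)


# Illustrative input only; not real data from any donation.
sample = "name,description\nAda Example,fictional person\n"
print(rows_to_quickstatements(sample, "Q5"))
```

The resulting text could then be pasted into QuickStatements for review before anything is written to Wikidata.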

Importing: Import the data into Mix'n'Match --> Match data to existing entities --> Import data to WD --> Add notes to description and archive

It is advised to import a sample batch first, so that everybody can become familiar with the process on a smaller, less risky scale.
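One way to pick such a sample batch is sketched below. It assumes the prepared rows are already in memory as a list; the function name `sample_batch`, the default size of 50 and the fixed seed are arbitrary choices for illustration, not recommendations from the talk.

```python
import random


def sample_batch(rows, size=50, seed=0):
    """Return a reproducible random sample of at most `size` rows."""
    rows = list(rows)
    if len(rows) <= size:
        return rows  # small datasets are used whole
    # A fixed seed gives every collaborator the same sample to review.
    return random.Random(seed).sample(rows, size)
```

A random sample, rather than just the first rows, is more likely to surface formatting problems that only occur deeper in the dataset.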

To be improved:

  • point of contact of the community
  • tools; presenter believes they are far from being performant

Questions / Answers

Q: most important tool? most critical first step?

A: basically, improving any of the present tools would be significant

A2: even documenting tools would be significant

Q: what about skills with the present tools within institutions?

A: depends on the institution; institutions depend on paid staff

Q: Train the trainers - we have too few people

"the Foundation should just throw money at it" ??? No, but we need professional resources at some point.