Wikidata:WikidataCon 2017/Notes/Data donations: The process to get your data into Wikidata
Title: Data donations: The process to get your data into Wikidata
Note-taker(s): Gikü (first 20 min)
Speaker(s)
Name or username: Jens Ohlig, Wikimedia Deutschland
Contact (email, Twitter, etc.): User:Jens Ohlig (WMDE)
- Useful links
Abstract
After you have convinced an institution to donate data (or, if you are an institution: after you have been convinced) the real fun starts — how to upload all the data? Wikidata lacks a big friendly upload button, but that should not put you off. We'll look at various parts of a typical workflow:
- collecting data (copyright! database rights! fun!)
- cleaning data (repairing faulty data with Google Spreadsheets and/or OpenRefine)
- upload (QuickStatements, Pywikibot and everything else)
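As an illustration of the cleaning step, here is a minimal Python sketch. It is not the method from the talk (which names Google Spreadsheets and OpenRefine); it only shows the kind of repairs meant, on a CSV export. The function name `clean_rows` and the column names in the usage note are hypothetical.

```python
import csv
import io


def clean_rows(raw_csv: str) -> list[dict[str, str]]:
    """Trim whitespace, drop empty rows, and remove exact duplicate rows."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    seen = set()
    cleaned = []
    for row in reader:
        # Collapse runs of whitespace and strip both ends of every cell.
        # Cells missing from a short row arrive as None; surplus cells
        # (key None) are ignored.
        tidy = {
            key.strip(): " ".join((value or "").split())
            for key, value in row.items()
            if key is not None
        }
        if not any(tidy.values()):
            continue  # no data in any column
        fingerprint = tuple(tidy.items())
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned
```

For example, calling `clean_rows` on a sheet with the hypothetical columns `name` and `city` would merge `"  Ada   Lovelace "` and `"Ada Lovelace"` into a single row. Real donations usually need domain-specific checks on top of this (dates, identifiers, name variants).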
Collaborative notes of the session
Data donation often starts with organizations offering to donate their data to Wikidata so that it is not lost.
Jens is the first point of contact: he assesses the situation and directs the organization to a specific project or to specific people.
An educational video clip about data donations is shown:
- https://commons.wikimedia.org/wiki/File:Donating_data_to_Wikidata.webm
- https://www.youtube.com/watch?v=YMAKwwLsGW0
Important to know: a donation does not mean uploading data to WD and forgetting about it; the data should also be maintained and checked regularly
Data pipeline: Find --> Get --> Verify --> Clean --> Analyse --> Present
For Wikidata it means:
Preparation: Describe data --> Import in a spreadsheet --> Define WD data structure --> Format data in the spreadsheet so that it is importable to WD
Importing: Import the data into Mix'n'Match --> Match data to existing entities --> import data to WD --> add notes to description and archive
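The last steps (matched rows in, WD statements out) can be sketched in Python as below. This is an assumption-laden illustration, not the speaker's tooling: it turns spreadsheet rows into tab-separated QuickStatements (version 1) commands. The function name and the column names `qid`, `label` and `instance_of` are hypothetical. Rows that already have a match reuse that item ID; unmatched rows create a new item.

```python
def to_quickstatements(rows: list[dict[str, str]]) -> str:
    """Build QuickStatements V1 commands from matched spreadsheet rows."""
    commands = []
    for row in rows:
        qid = row.get("qid", "").strip()
        if qid:
            # The row was matched to an existing item: add statements to it.
            subject = qid
        else:
            # No match: create a new item and give it an English label.
            commands.append("CREATE")
            subject = "LAST"
            commands.append(f'{subject}\tLen\t"{row["label"]}"')
        # P31 is the "instance of" property on Wikidata.
        commands.append(f'{subject}\tP31\t{row["instance_of"]}')
    return "\n".join(commands)


if __name__ == "__main__":
    example = [
        {"qid": "Q4115189", "label": "Sandbox", "instance_of": "Q5"},
        {"qid": "", "label": "Example person", "instance_of": "Q5"},
    ]
    print(to_quickstatements(example))
```

The output can be pasted into QuickStatements for review before running. The sketch does not escape double quotes inside labels and adds no references or descriptions, both of which a real import would need.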
It is advised to first import a sample batch, so that everybody becomes familiar with the process on a smaller, less risky scale
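One way to pick such a sample batch, sketched in Python under the assumption that the prepared rows are already in a list. The function name `split_sample` and the default batch size of 20 are hypothetical choices, not figures from the talk.

```python
import random


def split_sample(rows: list, sample_size: int = 20, seed: int = 0) -> tuple[list, list]:
    """Split rows into a small random sample batch and the remaining rows."""
    rng = random.Random(seed)  # fixed seed so the same sample can be reproduced
    count = min(sample_size, len(rows))
    chosen = set(rng.sample(range(len(rows)), count))
    # Keep the original row order in both parts.
    sample = [row for index, row in enumerate(rows) if index in chosen]
    rest = [row for index, row in enumerate(rows) if index not in chosen]
    return sample, rest
```

A random sample tends to surface more kinds of problems than the first few rows of a sheet, which are often the cleanest. The remaining rows are imported only after the sample has been reviewed on WD.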
To be improved:
- point of contact of the community
- tools; presenter believes they are far from being performant
Questions / Answers
Q: most important tool? most critical first step?
A: basically, improving any present tool would be significant
A2: even documenting tools would be significant
Q: skills about the present tools in institutions
A: depends on the institution; institutions depend on paid staff
Q: Train the trainers - we have too few people
"the Foundation should just throw money at it" ??? No, but we need professional resources at some point.