Wikidata:WikiProject WLM/How to map WLM data example
How to map data from the WLM database to Wikidata
How you can help out in practice
Before it's possible to write the code that moves data from the Wiki Loves Monuments (WLM) database to Wikidata, the fields and values must be mapped. You don't have to know how to code to help out with this part. Ideally, you instead know a lot about the monuments in the database.
Basically this means answering the general questions listed further down and filling in the answers in your own tables, copied (and maybe improved) from the ones in this document.
Create a page, e.g. as a subpage of Wikidata:WikiProject WLM/Mapping tables here on Wikidata, to hold your mappings as wikitables. Paste the empty tables into a new page that is already linked from the status table on Wikidata:WikiProject_Cultural_heritage, and when you want feedback, please add "Check the mapping" or something similar in the 'Status' column of that table.
If we already know the property, because it is either implicit or taken care of by another field, database table, etc., we just put a hyphen ('-') in the property column of that row.
Note: Please add the category Category:WLM Wikidata mapping to the page so others can easily find them, comment or help out with the coding etc. Please also add your mapping to Wikidata:WikiProject WLM/Mapping tables
The main questions are:
- Which data fields can be filled in for every object (i.e. row)? See "Step 1" below.
- Which field values can you map to existing Wikidata items just by looking at them?
- What new Wikidata items and properties do you need to create to be able to map to Wikidata?
- How can one use external data sources to look up unknown field values?
How does the stored data look?
Each heritage object is stored in two places in the WLM database. Firstly, as a row in the general table, called monuments_all. This table contains basic information about all the heritage objects in order to make them comparable across countries. Furthermore, each country has its own dedicated table(s) with more specific information. Depending on the type of the data, each country can have more than one table. For example, there are four Sweden-specific tables: se-arbetsl, se-bbr, se-fornmin and se-ship, which store data about working-life museums, buildings, ancient monuments and historical watercraft, respectively.
In the following example you can see how a row from the se-bbr table, representing an item of built cultural heritage in Sweden, has been mapped.
Green background indicates the most important fields.
Step 1: What is implicitly known about the data?
Some Wikidata properties are self-evident by the object being in a particular list. Those can be identified and mapped directly.
E.g. all the objects are from the country (P17) named Sweden (Q34)
field name | value | Wikidata property | comment |
---|---|---|---|
heritage status | | heritage designation (P1435) | This needs to be one of the three (complex) protection types (Q24284071, Q24284072 or Q24284073). Needs a bbr query (Source uses both http://kulturarvsdata.se/raa/bbr/{ID} |
country | Sweden (Q34) | country (P17) | |
wikiproject | sv.wikipedia | - | used for getting Wikidata items from wikilinks |
is a(n) | architectural ensemble (Q1497375) | instance of (P31) | All objects in se-bbr are complexes, not buildings |
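The implicit mappings above could be kept as a small lookup per country table, so the import code can start every row with the same claims. This is only a sketch: the dictionary layout and the function name `implicit_claims` are assumptions made for illustration, while the property and item ids come from the table above.

```python
# Claims that hold for every row of a given WLM country table.
# Layout and names are illustrative; ids are taken from the table above.
IMPLICIT_CLAIMS = {
    "se-bbr": {
        "P17": "Q34",       # country: Sweden
        "P31": "Q1497375",  # instance of: architectural ensemble
    },
}


def implicit_claims(table_name):
    """Return a fresh copy of the claims shared by all rows of a table."""
    # heritage designation (P1435) is left out on purpose: it differs per
    # object and has to be looked up in the source database.
    return dict(IMPLICIT_CLAIMS.get(table_name, {}))
```

Returning a copy means each row can add its own claims without changing the shared defaults.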
Example:
In Sweden there are three different types of legal protection for different types of cultural heritage, so we created three new items:
- governmental listed building complex (Q24284071) for buildings owned by the state
- individual listed building complex (Q24284072) for privately owned buildings
- ecclesiastical listed building complex (Q24284073) for older buildings owned by the Swedish church.
Which legal protection each monument goes under is not stored in the WLM database. We therefore need to look that up by querying the source database via their API.
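Once the protection type has been looked up, turning it into a heritage designation (P1435) claim is a plain table lookup. The sketch below assumes the lookup result has already been reduced to one of three short keys; those keys and the function name are hypothetical, and the actual API call is not shown because this page does not describe it.

```python
# Hypothetical short keys for the three Swedish protection types.
# The Q-ids are the three items listed above.
HERITAGE_DESIGNATION = {
    "governmental": "Q24284071",
    "individual": "Q24284072",
    "ecclesiastical": "Q24284073",
}


def designation_claim(protection_type):
    """Return a (property, item) pair for P1435, or None if the type is unknown."""
    item = HERITAGE_DESIGNATION.get(protection_type)
    if item is None:
        return None
    return ("P1435", item)
```

Returning None for an unknown type lets the caller skip the statement instead of writing a wrong one.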
Step 2: analyze the WLM data and begin mapping
Look at a few examples of the full data in the database behind the list on Commons. If you know how to get data out of a SQL-file, you can download a full dump of the Wiki Loves Monuments database [1]. Otherwise feel free to contact Mattias Östmar (WMSE) and I'll help you get the data out into a format that you can easily look through.
2.1 Data in the general table monuments_all
Below is one row from the Swedish list of cultural heritage buildings with values filled in for each field. I've added an English translation of the field value in bold parentheses after it for your convenience.
heritage field | example value | Wikidata property (Search current mappings AND on Wikidata. If not found: propose a new property) | Conversion (e.g. Split into multiple Q-values) | comment |
---|---|---|---|---|
country | se-bbr | - | lookup type of protection from RAÄ if not church (visible in URL) | |
lang | sv | - | ||
id | 21300000013599 (Note: Is it unique? If not, add in comments. Which organisation/institution is using it? Add the URL in field 'registrant_url' below!) | Swedish Open Cultural Heritage URI (P1260) | needs a raa/bbr/ prefix | The prefix is either raa/bbr/ or raa/bbra/, but the only way of figuring out which is to request both http://kulturarvsdata.se/raa/bbr/21300000003265 and http://kulturarvsdata.se/raa/bbra/21300000003265 and check which one doesn't give a 404 |
project | wikipedia | - | ||
adm0 | se | - | handled by Implicit mappings | |
adm1 | se-ab | - | not needed since we know municipality | |
adm2 | Vaxholm | located in the administrative territorial entity (P131) | map free text municipality name to an item (differs from the label by the suffix "(s) kommun" and possible capitalization) | The target item must have instance of (P31) municipality of Sweden (Q127448) |
adm3 | null | - | ||
adm4 | null | - | ||
name | Bogesunds slott (Bogesund 1:82; f.d. Bogesund 1:1) | label, P? | Wikisyntax needs to be stripped | String needs to be split where the bracketed text (fastighetsbeteckning) becomes the value of P? (to be identified). Note that the first part may be wikilinked and contain multiple parts. |
adress | somethingville | ? | The values are a mixture of places and addresses (wikilinked and not). For wikilinked addresses a combination of located on street (P669) and house number (P670) can be used; for plain text addresses use street address (P969). For non-addresses: We can use location (P276) if they are wikilinked, but nothing otherwise. | |
municipality | Vaxholm | - | Handled by adm2 | |
lat | 59.39395 | coordinate location (P625) | Combines the values of both lat and lon | |
lon | 18.28676 | see lat | see lat | |
image | Bogesunds Slott.JPG | image (P18) | ||
commonscat | Bogesunds_slott | Commons category (P373) | ||
source | //sv.wikipedia.org/w/index.php?title=Lista_%C3%B6ver_byggnadsminnen_i_Stockholms_l%C3%A4n&oldid=35302942 | - | ||
monument_article | Bogesunds_slott | see comment | get item from link | Gives the (likely) object for the monument |
registrant_url | http://www.bebyggelseregistret.raa.se/bbr2/anlaggning/visaHistorik.raa?page=historik&visaHistorik=true&anlaggningId=21300000013599 | - | ||
changed | 2016-06-02 04:49:26 | see comment | if statements are sourced then this might be useful to store |
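A few of the conversions in the table above can be sketched in code. This is an illustration under assumptions, not the project's actual import code: all function names are made up here, `exists` is a caller-supplied check (for example an HTTP request that returns False on a 404), and the municipality helper only builds candidate labels that still have to be matched against items that are instance of (P31) municipality of Sweden (Q127448).

```python
import re

URI_BASE = "http://kulturarvsdata.se/"
URI_PREFIXES = ("raa/bbr/", "raa/bbra/")


def resolve_uri_suffix(bbr_id, exists):
    """Return the prefixed id whose URI resolves, or None if neither does.

    `exists` is a hypothetical callable taking a URL and returning True
    unless the server answers with a 404.
    """
    for prefix in URI_PREFIXES:
        if exists(URI_BASE + prefix + bbr_id):
            return prefix + bbr_id
    return None


def split_name(name):
    """Split 'Label (designation)' into (label, designation or None)."""
    # Strip wikilink syntax: [[target|text]] -> text, [[target]] -> target.
    plain = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", name)
    match = re.match(r"^(.*?)\s*\((.*)\)\s*$", plain)
    if match:
        return match.group(1), match.group(2)
    return plain.strip(), None


def municipality_labels(adm2):
    """Candidate item labels for a free-text municipality name."""
    return [adm2 + " kommun", adm2 + "s kommun"]


def coordinates(lat, lon):
    """Combine the separate lat and lon fields into one P625 value."""
    return (float(lat), float(lon))
```

Passing `exists` in as a parameter keeps the network access out of the parsing code, so the helpers can be tried on a handful of example rows before any requests are made.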
Below is one example row from the full data table behind the Swedish list of cultural heritage buildings. Again, I've added an English translation of the field value in bold parentheses after it for your convenience.
2.2. Map data in the country-specific table, e.g. monuments_all_se-bbr_(sv)
After we've looked into the general table data above, we can now see that a lot of the field values are handled by parsing that table. We can also note that the field function contains two separate pieces of information that we want to parse out separately. We therefore create a duplicate row in the table so we can fit in two values for Wikidata properties.
heritage field | example value | Wikidata property | Conversion | comment |
---|---|---|---|---|
bbr | 21300000013599 | - | handled by monuments_all | |
namn (name) | Bogesunds slott (Bogesund 1:82; f.d. Bogesund 1:1) | - | handled by monuments_all | |
region-iso | SE-AB | - | handled by monuments_all | |
funktion (function) | Slott (2 byggnader) | has use (P366) | Strip the parentheses and split the rest on commas. | can only be used after words have been mapped to items |
funktion (function) | Slott (2 byggnader) | has part(s) of the class (P2670) | Isolate the number of buildings | ⟨ subject ⟩ has part(s) of the class (P2670) ⟨ building (Q41176) ⟩ quantity (P1114) ⟨ "number of buildings" ⟩ |
byggar (building year) | 1640-talet | inception (P571) | convert dates to ISO form | can be used directly for exact dates/years. For spans, use decade/century and then P580+P582 or P1319+P1326 qualifiers, per https://www.wikidata.org/wiki/Property_talk:P571#Precision |
arkitekt (architect) | | architect (P84) | get item(s) from link(s) | can only be used if wikilinked and target is instance of (P31) human (Q5) |
plats (place) | | - | handled by monuments_all | |
kommun (municipality) | Vaxholm | - | handled by monuments_all | |
lat | 59.39395 | - | handled by monuments_all | |
lon | 18.28676 | - | handled by monuments_all | |
bild (image) | Bogesunds Slott.JPG | - | handled by monuments_all | |
commonscat | Bogesunds_slott | - | handled by monuments_all | |
source | //sv.wikipedia.org/w/index.php?title=Lista_%C3%B6ver_byggnadsminnen_i_Stockholms_l%C3%A4n&oldid=35302942 | - | handled by monuments_all | |
changed | 2016-06-02 04:49:26 | - | handled by monuments_all | |
monument_article | Bogesunds_slott | - | handled by monuments_all | |
registrant_url | http://www.bebyggelseregistret.raa.se/bbr2/anlaggning/visaHistorik.raa?page=historik&visaHistorik=true&anlaggningId=21300000013599 | - | handled by monuments_all |
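The two conversions that are specific to this table, splitting funktion and interpreting byggar, might look roughly like this. It is a sketch with made-up function names. The precision numbers follow Wikidata's time data type (9 = year, 8 = decade, 7 = century), and treating a value such as "1600-talet" as a century rather than a decade is an assumption that should be checked against the real data.

```python
import re


def parse_funktion(value):
    """Split e.g. 'Slott (2 byggnader)' into (list of uses, building count)."""
    count = None
    match = re.search(r"\((\d+)\s+byggnad(?:er)?\)", value)
    if match:
        count = int(match.group(1))
    # Drop every parenthesised part, then split the remainder on commas.
    stripped = re.sub(r"\s*\([^)]*\)", "", value)
    uses = [part.strip() for part in stripped.split(",") if part.strip()]
    return uses, count


def parse_byggar(value):
    """Return (year, precision) for an exact year or an 'NNNN-talet' span.

    Returns None for anything else, so unclear values can be reviewed by hand.
    """
    value = value.strip()
    if re.fullmatch(r"\d{4}", value):
        return int(value), 9
    match = re.fullmatch(r"(\d{4})-talet", value)
    if match:
        year = int(match.group(1))
        # Assumption: round hundreds denote a century, everything else a decade.
        return year, (7 if year % 100 == 0 else 8)
    return None
```

The uses returned by `parse_funktion` are still plain Swedish words; as the table notes, they can only become has use (P366) values after each word has been mapped to an item.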
2.3. Example of how the object looks on Wikidata
Now you can take a look at the final Wikidata object Bogesund Castle (Q24284121).