Wikidata:WikiProject WLM/How to map WLM data example

How to map data from the WLM database to Wikidata

How you can help out in practice

Before it's possible to write the code that moves data from the Wiki Loves Monuments (WLM) database to Wikidata, the fields and values must be mapped. You don't have to know how to code to help out with this part; ideally, you instead know a lot about the monuments in the database.

In practice, this means answering the general questions listed further down and recording the answers in your own tables, copied (and perhaps improved) from the ones in this document.

Create a page, e.g. as a subpage of Wikidata:WikiProject WLM/Mapping tables here on Wikidata, and copy the empty tables into it as wikitables for your mappings. The new page should be one that is already linked from the status table on Wikidata:WikiProject_Cultural_heritage. When you want feedback, please add "Check the mapping" or something similar to the 'Status' column of that table.

If we already know the property, because it is either implicit or taken care of by another field, database table, etc., we just put a hyphen ('-') in the property column of that row.

Note: Please add the category Category:WLM Wikidata mapping to the page so that others can easily find it, comment on it or help out with the coding. Please also add your mapping to Wikidata:WikiProject WLM/Mapping tables.

The main questions are:

  • What data fields can be filled in for every object (i.e. row)? See "Step 1" below.
  • Which field values can be mapped to existing Wikidata items just by looking at them?
  • What new Wikidata items and properties do you need to create to be able to map to Wikidata?
  • How can external data sources be used to look up unknown field values?

How does the stored data look?

Each heritage object is stored in two places in the WLM database. Firstly, as a row in the general table, called monuments_all. This table contains basic information about all the heritage objects in order to make them comparable across countries. Furthermore, each country has its own dedicated table(s) with more specific information. Depending on the type of the data, each country can have more than one table. For example, there are four Sweden-specific tables: se-arbetsl, se-bbr, se-fornmin and se-ship, which store data about working-life museums, buildings, ancient monuments and historical watercraft, respectively.

In the following example you can see how a row from the se-bbr table, representing an item of built cultural heritage in Sweden, has been mapped.

Green background indicates the most important fields.

Step 1: What is implicitly known about the data?

Some Wikidata properties follow directly from the object being in a particular list. Those can be identified and mapped directly.

E.g. all the objects are from the country (P17) named Sweden (Q34).

Implicit data

| field name | value | Wikidata property | comment |
|---|---|---|---|
| heritage status | | heritage designation (P1435) | This needs to be one of the three (complex) protection types (Q24284071, Q24284072 or Q24284073). Needs a bbr query to determine which (the source uses both http://kulturarvsdata.se/raa/bbr/{ID} and http://kulturarvsdata.se/raa/bbra/{ID}). |
| country | Sweden (Q34) | country (P17) | |
| wikiproject | sv.wikipedia | - | used for getting Wikidata items from wikilinks |
| is a(n) | architectural ensemble (Q1497375) | instance of (P31) | All objects in se-bbr are complexes, not buildings |
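As a minimal sketch, the implicit mappings for se-bbr could be kept as constants that are applied to every row. The names `IMPLICIT_CLAIMS`, `ALLOWED_P1435` and `base_claims` are hypothetical and only illustrate the idea; the property and item IDs are the ones from the table.

```python
# Statements that hold for every row of se-bbr (IDs taken from the table).
IMPLICIT_CLAIMS = {
    "P17": "Q34",       # country: Sweden
    "P31": "Q1497375",  # instance of: architectural ensemble
}

# heritage designation (P1435) is implicit as a property, but its value
# has to be looked up per object, so only the allowed values are listed.
ALLOWED_P1435 = {"Q24284071", "Q24284072", "Q24284073"}


def base_claims(protection_type=None):
    """Return the implicit claims, plus P1435 once its value is known."""
    claims = dict(IMPLICIT_CLAIMS)
    if protection_type is not None:
        if protection_type not in ALLOWED_P1435:
            raise ValueError(f"unexpected protection type: {protection_type}")
        claims["P1435"] = protection_type
    return claims
```

Keeping the implicit values in one place means the per-row code only has to deal with the fields that actually vary.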

Example:

In Sweden there are three different types of legal protection for different types of cultural heritage, so we created three new items:

governmental listed building complex (Q24284071) for buildings owned by the state

individual listed building complex (Q24284072) for privately owned buildings

ecclesiastical listed building complex (Q24284073) for older buildings owned by the Swedish church.

Which legal protection applies to each monument is not stored in the WLM database. We therefore need to look it up by querying the source database via its API.
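The lookup has to start by finding out which of the two source URLs exists for a given ID. The sketch below probes both and returns the prefixed identifier only when exactly one of them answers. The function names and the injectable `exists` parameter are assumptions made for illustration, and how the result then translates into one of the three protection types is not covered here.

```python
import urllib.error
import urllib.request

BASE = "http://kulturarvsdata.se/"
PREFIXES = ("raa/bbr/", "raa/bbra/")


def url_exists(url):
    """Return True if the URL answers without an HTTP error such as 404."""
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError:
        return False


def resolve_identifier(object_id, exists=url_exists):
    """Return 'raa/bbr/<id>' or 'raa/bbra/<id>', whichever one exists.

    Returns None when neither or both URLs answer, so that ambiguous
    cases can be reviewed by hand instead of being guessed.
    """
    hits = [p + str(object_id) for p in PREFIXES if exists(BASE + p + str(object_id))]
    return hits[0] if len(hits) == 1 else None
```

Passing the checker in as a parameter makes it possible to try the logic without network access, e.g. `resolve_identifier("21300000013599", exists=lambda url: "/bbr/" in url)`.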

Step 2: analyze the WLM data and begin mapping

Look at a few examples of the full data in the database behind the list on Commons. If you know how to get data out of an SQL file, you can download a full dump of the Wiki Loves Monuments database [1]. Otherwise, feel free to contact Mattias Östmar (WMSE) and I'll help you get the data out into a format that you can easily look through.

2.1 Data in the general table monuments_all

Below is one row from the list of cultural heritage buildings in Sweden, with values filled in for each field. I've added an English translation of the field value in bold parentheses after it for your convenience.

| heritage field | example value | Wikidata property (search current mappings AND on Wikidata; if not found, propose a new property) | Conversion (e.g. split into multiple Q-values) | comment |
|---|---|---|---|---|
| country | se-bbr | - | | look up the type of protection from RAÄ if not a church (visible in URL) |
| lang | sv | - | | |
| id | 21300000013599 (Note: Is it unique? If not, add in comments. Which organisation/institution is using it? Add the URL in field 'registrant_url' below!) | Swedish Open Cultural Heritage URI (P1260) | needs a raa/bbr/ or raa/bbra/ prefix | The prefix is either raa/bbr/ or raa/bbra/, but the only way of figuring out which is to examine both http://kulturarvsdata.se/raa/bbr/21300000003265 and http://kulturarvsdata.se/raa/bbra/21300000003265 and check which one doesn't give a 404 |
| project | wikipedia | - | | |
| adm0 | se | - | | handled by the implicit mappings |
| adm1 | se-ab | - | | not needed since we know the municipality |
| adm2 | Vaxholm | located in the administrative territorial entity (P131) | map the free-text municipality name to an item (differs from the label by the suffix "(s) kommun" and possibly capitalization) | The target item must have instance of (P31) municipality of Sweden (Q127448) |
| adm3 | null | - | | |
| adm4 | null | - | | |
| name | Bogesunds slott (Bogesund 1:82; f.d. Bogesund 1:1) | label, P? | Wikisyntax needs to be stripped. The string needs to be split so that the bracketed text (fastighetsbeteckning) becomes the value of P? (to be identified). | Note that the first part may be wikilinked and contain multiple parts. |
| adress | somethingville | ? | | The values are a mixture of places and addresses (wikilinked and not). For wikilinked addresses a combination of located on street (P669) and house number (P670) can be used; for plain-text addresses use street address (P969). For non-addresses we can use location (P276) if they are wikilinked, but nothing otherwise. |
| municipality | Vaxholm | - | | handled by adm2 |
| lat | 59.39395 | coordinate location (P625) | combines the values of both lat and lon | |
| lon | 18.28676 | see lat | see lat | |
| image | Bogesunds Slott.JPG | image (P18) | | |
| commonscat | Bogesunds_slott | Commons category (P373) | | |
| source | //sv.wikipedia.org/w/index.php?title=Lista_%C3%B6ver_byggnadsminnen_i_Stockholms_l%C3%A4n&oldid=35302942 | - | | |
| monument_article | Bogesunds_slott | see comment | get item from link | Gives the (likely) object for the monument |
| registrant_url | http://www.bebyggelseregistret.raa.se/bbr2/anlaggning/visaHistorik.raa?page=historik&visaHistorik=true&anlaggningId=21300000013599 | - | | |
| changed | 2016-06-02 04:49:26 | see comment | | if statements are sourced then this might be useful to store |
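A few of the conversions in the table above can be sketched in Python. This is only an illustration under some assumptions: the helper names are hypothetical, wikisyntax is assumed to be limited to `[[target]]` and `[[target|label]]` links, and the municipality check only compares labels, so the instance of (P31) condition on the target item still has to be verified separately.

```python
import re

# Matches [[target]] and [[target|label]]; group 1 is the visible text.
WIKILINK = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]+)\]\]")


def strip_wikilinks(text):
    """Replace wikilinks with their visible text."""
    return WIKILINK.sub(r"\1", text)


def split_name(raw):
    """Split 'Label (designation)' into (label, designation or None)."""
    text = strip_wikilinks(raw).strip()
    match = re.fullmatch(r"(.*?)\s*\(([^()]*)\)", text)
    if match is None:
        return text, None
    return match.group(1), match.group(2)


def matches_municipality(adm2, item_label):
    """Compare an adm2 value with an item label such as 'Vaxholms kommun'."""
    wanted = strip_wikilinks(adm2).strip().casefold()
    label = item_label.strip().casefold()
    return label in (f"{wanted}s kommun", f"{wanted} kommun")


def coordinate(lat, lon):
    """Combine lat and lon into one (latitude, longitude) pair."""
    if lat is None or lon is None:
        return None
    return float(lat), float(lon)
```

For the example row, `split_name("Bogesunds slott (Bogesund 1:82; f.d. Bogesund 1:1)")` separates the label from the bracketed designation, and `coordinate("59.39395", "18.28676")` yields the single value needed for coordinate location (P625).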

Below is one example row from the full data table behind the Swedish list of cultural heritage buildings. Again, I've added an English translation of the field value in bold parentheses after it for your convenience.

2.2. Map data in the country-specific table, e.g. monuments_all_se-bbr_(sv)

Having looked into the general table above, we can see that many of the field values are already handled by parsing that table. We can also note that the field funktion (function) contains two separate pieces of information that we want to parse out separately. We therefore duplicate that row in the table so that it can hold values for two Wikidata properties.
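The split of the funktion field can be sketched as follows. The function name and the assumption that the building count always appears as "(N byggnader)" or "(N byggnad)" are mine, based only on the single example value "Slott (2 byggnader)"; other patterns in the real data would need their own handling.

```python
import re

# Assumed pattern for the bracketed building count, e.g. "(2 byggnader)".
COUNT = re.compile(r"\((\d+)\s+byggnad(?:er)?\)")


def parse_funktion(raw):
    """Return (list of use terms, number of buildings or None)."""
    match = COUNT.search(raw)
    count = int(match.group(1)) if match else None
    # Strip every parenthesis, then split what is left on commas.
    without_parens = re.sub(r"\([^()]*\)", "", raw)
    uses = [part.strip() for part in without_parens.split(",") if part.strip()]
    return uses, count
```

The use terms are still plain Swedish words at this point; they can only become has use (P366) values once each word has been mapped to an item.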

| heritage field | example value | Wikidata property | Conversion | comment |
|---|---|---|---|---|
| bbr | 21300000013599 | - | | handled by monuments_all |
| namn (name) | Bogesunds slott (Bogesund 1:82; f.d. Bogesund 1:1) | - | | handled by monuments_all |
| region-iso | SE-AB | - | | handled by monuments_all |
| funktion (function) | Slott (2 byggnader) | has use (P366) | strip the parenthesis and split the rest on commas | can only be used after the words have been mapped to items |
| funktion (function) | Slott (2 byggnader) | has part(s) of the class (P2670) | isolate the number of buildings | ⟨subject⟩ has part(s) of the class (P2670) ⟨building (Q41176)⟩, with the qualifier quantity (P1114) ⟨"number of buildings"⟩ |
| byggar (building year) | 1640-talet | inception (P571) | convert dates to ISO form | Can be used directly for exact dates/years. For spans use decade/century precision and then P580+P582 or P1319+P1326 qualifiers, per https://www.wikidata.org/wiki/Property_talk:P571#Precision |
| arkitekt (architect) | | architect (P84) | get item(s) from link(s) | can only be used if wikilinked and the target is instance of (P31) human (Q5) |
| plats (place) | | - | | handled by monuments_all |
| kommun (municipality) | Vaxholm | - | | handled by monuments_all |
| lat | 59.39395 | - | | handled by monuments_all |
| lon | 18.28676 | - | | handled by monuments_all |
| bild (image) | Bogesunds Slott.JPG | - | | handled by monuments_all |
| commonscat | Bogesunds_slott | - | | handled by monuments_all |
| source | //sv.wikipedia.org/w/index.php?title=Lista_%C3%B6ver_byggnadsminnen_i_Stockholms_l%C3%A4n&oldid=35302942 | - | | handled by monuments_all |
| changed | 2016-06-02 04:49:26 | - | | handled by monuments_all |
| monument_article | Bogesunds_slott | - | | handled by monuments_all |
| registrant_url | http://www.bebyggelseregistret.raa.se/bbr2/anlaggning/visaHistorik.raa?page=historik&visaHistorik=true&anlaggningId=21300000013599 | - | | handled by monuments_all |
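The date conversion for byggar could be sketched like this. It is an illustration under stated assumptions: only plain four-digit years and values of the form "NNNN-talet" are recognised, a value ending in "00" is read as a century and one ending in a single "0" as a decade (which is ambiguous for values such as "1600-talet"), and the function name is hypothetical. The precision numbers are the codes Wikidata uses for time values (9 = year, 8 = decade, 7 = century).

```python
import re

YEAR, DECADE, CENTURY = 9, 8, 7  # Wikidata time precision codes


def parse_byggar(raw):
    """Return (year, precision, earliest, latest), or None if not recognised.

    earliest and latest are candidate values for the P1319 and P1326
    qualifiers and are None for exact years.
    """
    text = raw.strip()
    if re.fullmatch(r"\d{4}", text):
        return int(text), YEAR, None, None
    match = re.fullmatch(r"(\d{4})-talet", text)
    if match is None:
        return None
    start = int(match.group(1))
    if start % 100 == 0:
        return start, CENTURY, start, start + 99
    if start % 10 == 0:
        return start, DECADE, start, start + 9
    return None
```

Returning None for anything unrecognised keeps free-text values out of inception (P571) until someone has looked at them.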

2.3. Example of how the object looks on Wikidata

Now you can take a look at the final Wikidata object Bogesund Castle (Q24284121).