Wikidata:WikiProject University degrees/Reports/Iceland

Iceland (Q189), Summer term 2018

Preparation

We started by looking at the data available on Anabin (https://anabin.kmk.org). Anabin lists 21 universities for Iceland.

A closer look at those universities revealed that not all of them still exist; several have been merged into other universities. After working through the data and filtering out the merged institutions, 7 universities remained that are currently active in Iceland. Only one of those 7 was not yet present in Wikidata, so we created it (Bifröst University (Q3107095)). Additionally, we added information to existing university items, such as the official website and postal address.

Regarding the degrees themselves, Anabin only lists the type of degree, e.g. Bachelor of Science, and not the academic discipline in which the degree can be earned.

After the new property grants (P5460) was approved, we decided to add the academic fields in which a university awards the degree as qualifiers to the statement. This makes it possible to ask which university awards a specific degree in a specific academic field.
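A question of that kind could be phrased as a SPARQL query over the statement and its qualifier. The following is a minimal sketch in Python that only builds the query text; it assumes the academic field is stored with the qualifier academic major (P812) and uses the standard Wikidata Query Service prefixes p:, ps: and pq:. The function name and the example pairing of degree and field are ours, chosen for illustration only.

```python
import re

# Wikidata item IDs look like "Q" followed by a positive integer.
_QID = re.compile(r"^Q[1-9][0-9]*$")


def degree_in_field_query(degree_qid, major_qid, language="en"):
    """Build a SPARQL query for universities granting a degree in a given field.

    Assumption: the field is attached to the grants (P5460) statement
    as an academic major (P812) qualifier.
    """
    for qid in (degree_qid, major_qid):
        if not _QID.match(qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    lines = [
        "SELECT DISTINCT ?university ?universityLabel WHERE {",
        "  ?university p:P5460 ?statement .",             # the full statement node
        f"  ?statement ps:P5460 wd:{degree_qid} .",        # main value: the degree
        f"  ?statement pq:P812 wd:{major_qid} .",          # qualifier: the field
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "'
        + language + '". }',
        "}",
        "ORDER BY ?universityLabel",
    ]
    return "\n".join(lines)


# Illustrative pairing only: Bachelor of Arts (Q1765120) in design (Q82604).
print(degree_in_field_query("Q1765120", "Q82604"))
```

The printed text can be pasted into the Wikidata Query Service. Whether it returns any rows depends on the data present at the time.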

Modelling it this way also allows additional information to be added, such as when the university started awarding a degree in a discipline and the default duration of the program. It is also possible to add the time frame during which the university awarded the degree for a specific program. The problem is finding the data: not even all programs at a single university list this information in a uniform way.

To avoid the risk of adding false data, we decided to rely on primary sources for the information on degrees and academic disciplines, looking only at the websites of the universities themselves. In the case of the University of Iceland (Q196559), only graduate programs were listed on the English website; undergraduate programs are apparently only available in Icelandic. Taking the data from an automatically translated Icelandic website seemed too error-prone to us, so we added only master's degrees for this university.

For four universities we were not able to find any information for the property students count (P2196). The newly created property grants (P5460) carries a constraint that requires a students count (P2196) statement on the item, so these items violate the constraint because the students count is missing.

We also noticed that some academic major items were either not set up correctly or were missing instance of (P31). Searching for a suitable academic major item often returned many items with the same label, so each candidate had to be checked to see whether it was semantically suitable for our needs. For example, Bifröst University (Q3107095) offers a Bachelor of Arts (Q1765120) program in "Philosophy, Politics and Economics". We found the item Philosophy, Politics and Economics (Q2777850), but it is not compatible with the qualifier academic major (P812): it is an instance of (P31) academic degree (Q189533) rather than of academic discipline (Q11862829), specialty (Q1047113) or science (Q336). In addition, for creative disciplines such as fine art (Q219625) or design (Q82604), it makes more sense semantically to use instance of (P31) specialty (Q1047113) rather than academic discipline (Q11862829).

Of all the universities in Iceland, only one offers a unique degree. It initially appeared to be a Master of Science (Q950900) with the academic major "Project Management", but our research showed that it is a degree in its own right: Master of Science in Project Management (Q6785257).
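The compatibility check on candidate academic major items described above can be sketched as a small set test. This is only an illustration: the allowed classes are the three named in this report, the candidate data is typed in by hand rather than fetched from Wikidata, and the function name is hypothetical.

```python
from typing import Iterable

# Classes named in this report as acceptable for an academic major (P812) value.
ALLOWED_MAJOR_CLASSES = frozenset({
    "Q11862829",  # academic discipline
    "Q1047113",   # specialty
    "Q336",       # science
})


def is_usable_as_major(instance_of: Iterable[str]) -> bool:
    """Return True if any instance of (P31) value is one of the allowed classes."""
    return not ALLOWED_MAJOR_CLASSES.isdisjoint(instance_of)


# Hand-entered P31 values for illustration (not retrieved from Wikidata).
candidates = {
    # Philosophy, Politics and Economics: instance of academic degree (Q189533)
    "Q2777850": ["Q189533"],
    # design, typed as specialty (Q1047113) as the report suggests
    "Q82604": ["Q1047113"],
}

for qid, classes in candidates.items():
    verdict = "usable" if is_usable_as_major(classes) else "not usable"
    print(qid, verdict, "as academic major")
```

An item with no instance of (P31) at all is reported as not usable, which matches the missing-P31 case mentioned above.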

Anabin Parser

One aspect of this project was collecting the data from which we created our insert statements for Wikidata. We developed a Python-based crawler to gather all the information provided by Anabin. Unfortunately, Anabin has no API that returns the data in a format like JSON, so we had to parse the HTML of the site, which makes it somewhat harder to extract all the information. First, we request the list of degrees, which gives us a table of all degrees offered in Iceland. Each degree has a unique ID, which we use to request further details about that degree. These details include a list of institutions that offer the degree, so we make one more request to gather the details of each institution: the address, email, website, location and some further information in a comment field, as well as a list of degrees offered by the institution. In total, we made one request for the list of degrees, one request per degree for its details, and one request for each institution listed for a degree. We then merged all the information into one array and finally wrote it to a CSV file.
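The HTML-parsing step can be illustrated with Python's standard library alone. The sketch below is not the project's crawler: the sample markup, the column layout and the names TableRowParser and parse_rows are invented for illustration, and the real Anabin pages may be structured differently. It only shows how table rows can be pulled out of HTML when no JSON API is available.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows that contain only <th> cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample markup: an ID column and a degree-type column.
SAMPLE = """
<table>
  <tr><th>id</th><th>degree</th></tr>
  <tr><td>101</td><td>Bachelor of Science</td></tr>
  <tr><td>102</td><td>Master of Science</td></tr>
</table>
"""

for degree_id, degree_type in parse_rows(SAMPLE):
    # In a crawler, degree_id would drive the follow-up request for details.
    print(degree_id, degree_type)
```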

This was the hardest part, because the information often contained characters that were interpreted as delimiters within the CSV. In the end we had a CSV with a lot of information, which could then be used to create the insert statements for Wikidata. With a few changes and improvements, the crawler could also be used to collect information for other countries in Anabin, where considerably more information is available.
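One common way to avoid the delimiter problem is to let a CSV library do the quoting instead of joining fields by hand. The report does not say how the project's CSV was written, so the following is only a sketch using Python's csv module; the records are invented and the helper names are ours.

```python
import csv
import io

# Invented records containing commas, quotes and a line break.
records = [
    {"institution": "Example University",
     "comment": 'Formerly "Example College", merged; see notes'},
    {"institution": "Another Institute",
     "comment": "First line\nsecond line"},
]
FIELDS = ["institution", "comment"]


def to_csv(rows, fieldnames):
    """Serialise rows to CSV text, quoting every field."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text, newline=""))]


text = to_csv(records, FIELDS)
# The round trip preserves commas, quotes and embedded line breaks.
print(from_csv(text) == records)
```

When writing to a real file, the csv documentation recommends opening it with newline="" for the same reason.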

Importing the data

Using the output of the Anabin parser, we cleaned up and checked the data in the generated CSV and imported it using QuickStatements (https://tools.wmflabs.org/quickstatements/). We used version 1 of QuickStatements because it seemed to work better than version 2. To import the academic degrees, we created one template CSV for creating new academic majors and another for adding the academic degrees and majors to the universities. We then filled those templates with the information provided on the official university websites and imported them.
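Turning such a template into import commands could look roughly like the sketch below. It assumes the tab-separated version 1 command syntax of QuickStatements (item, property, value, then qualifier pairs; CREATE and LAST for new items) as we understand it. The template column names, the function names and the example degree/major pairing are hypothetical and not taken from the project's files.

```python
import csv
import io


def grants_command(university_qid, degree_qid, major_qids=()):
    """One command line: item, grants (P5460), degree, then P812 qualifier pairs."""
    fields = [university_qid, "P5460", degree_qid]
    for major in major_qids:
        fields += ["P812", major]
    return "\t".join(fields)


def create_major_commands(label_en, class_qid="Q11862829"):
    """Commands creating a new item with an English label and instance of (P31)."""
    if '"' in label_en or "\t" in label_en:
        raise ValueError("label must not contain double quotes or tabs")
    return ["CREATE", f'LAST\tLen\t"{label_en}"', f"LAST\tP31\t{class_qid}"]


def commands_from_template(text):
    """Read a template with hypothetical columns university, degree, major."""
    commands = []
    for row in csv.DictReader(io.StringIO(text)):
        majors = [row["major"]] if row["major"] else []
        commands.append(grants_command(row["university"], row["degree"], majors))
    return commands


# Illustrative rows only; the major in the first row is a placeholder pairing.
TEMPLATE = (
    "university,degree,major\n"
    "Q196559,Q950900,Q82604\n"
    "Q3107095,Q1765120,\n"
)

print("\n".join(commands_from_template(TEMPLATE)))
print("\n".join(create_major_commands("Example Studies")))
```

Generated commands are best reviewed by eye before pasting them into the tool, since a wrong item ID produces a valid but incorrect statement.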

While doing this, we observed that QuickStatements always added the academic major to the existing grants (P5460) statement for the given degree. We were not able to figure out how to change this so that the result matches what you get when adding it by hand, namely a separate statement for the same degree with its own qualifiers. This prevented us from adding additional qualifiers per academic major in those cases. For the University of Iceland (Q196559), we added such separate statements for two of the programs to show that it works.

Better error messages during the import would also have been helpful. QuickStatements only reported that an error had occurred, not why, and it took some time to work out the causes.

Queries

All universities in Iceland

The following query uses these:

  • Properties: instance of (P31), country (P17)

    SELECT DISTINCT ?university ?universityLabel WHERE {
      ?university wdt:P31 wd:Q3918 .  # university
      ?university wdt:P17 wd:Q189 .   # Iceland
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY ?universityLabel

All universities in Iceland granting a Master of Science degree

The following query uses these:

  • Properties: instance of (P31), country (P17), grants (P5460)

    SELECT DISTINCT ?university ?universityLabel WHERE {
      ?university wdt:P31 wd:Q3918 .      # university
      ?university wdt:P17 wd:Q189 .       # Iceland
      ?university wdt:P5460 wd:Q950900 .  # Master of Science
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY ?universityLabel

The same queries can also be run worldwide

The following query uses these:

  • Properties: instance of (P31), country (P17), grants (P5460)

    SELECT DISTINCT ?country ?countryLabel ?university ?universityLabel WHERE {
      ?university wdt:P31 wd:Q3918 .      # university
      ?university wdt:P17 ?country .      # any country
      ?university wdt:P5460 wd:Q950900 .  # Master of Science
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    ORDER BY ?universityLabel