Wikidata:WikiProject University degrees/Reports/Bulgaria

Bulgaria (Q219), Summer term 2018

Choose a country and explore the Anabin database for this country. What universities offer which degrees?

We chose Bulgaria because teammate Ralitsa has Bulgarian heritage and could therefore give us valuable information about the educational system of her homeland. A slight problem, however, was that Bulgarian is written in Cyrillic letters, which teammate Alex cannot read or write. All the translation work therefore had to be done by Ralitsa, because Google Translate has only limited capabilities.

We researched all the Bulgarian universities listed in Anabin and noticed that some of them had several entries under different names. The reason is that many of them were renamed after the fall of the Soviet Union and then renamed again when Bulgaria joined the EU, so many academic institutions were listed three times. It took some time to figure out the real number of universities we had to work with, but in the end we concluded that there are 18 current universities in Bulgaria. However, we could only find information on 15 of them, because the other 3 had no websites and we could not find out about their courses from other sources either. The universities' websites were often badly made, with a horrible GUI, and usable only in Bulgarian: some offered no alternative language, and those that did had not put much effort into it, so the English versions mostly led to 404 error pages. These, however, were written in English :)

What kind of queries could potentially be asked of this data? Include them on the queries page!

Even though we had already written some queries for the last report, writing a new one was not that easy. Our idea was a query that shows which universities offer a specific degree, for example every instance of university that offers a master's degree in Electrical Engineering. Getting the instances of university was easy, and so was filtering the degrees. The difficult part was understanding how to get at the qualifier. After some research on the internet, a post on Stack Overflow gave us the answer, so we were able to create the query. (Query, Result)

The following query uses these:

  • Properties: instance of (P31), grants (P5460), academic major (P812)
    SELECT DISTINCT ?university ?universityLabel ?degreeLabel ?majorLabel WHERE {
      ?university wdt:P31 wd:Q3918 .      # instance of: university
      ?university p:P5460 ?statement .    # the full "grants" statement, not just its value
      ?statement ps:P5460 ?degree ;       # degree granted by this statement
                 pq:P812 ?major .         # qualifier on the same statement: academic major

      FILTER(?degree = wd:Q183816)        # master's degree
      FILTER(?major = wd:Q55636433)       # Electrical Engineering
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }

    ORDER BY ?universityLabel ?degreeLabel ?majorLabel
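
To ask the same question for other degrees or majors, the query text can be generated from two item IDs. The following is only a minimal sketch and not something we used in the project; `build_degree_query` is a hypothetical helper name, and the sketch assumes the major is stored as a qualifier (P812) on the grants (P5460) statement, as described above.

```python
def build_degree_query(degree_qid: str, major_qid: str) -> str:
    """Return a SPARQL query listing universities that grant the given
    degree with the given academic major as a qualifier."""
    return f"""
SELECT DISTINCT ?university ?universityLabel ?degreeLabel ?majorLabel WHERE {{
  VALUES ?degree {{ wd:{degree_qid} }}
  VALUES ?major {{ wd:{major_qid} }}
  ?university wdt:P31 wd:Q3918 ;        # instance of: university
              p:P5460 ?statement .      # "grants" statement node
  ?statement ps:P5460 ?degree ;         # the degree that is granted
             pq:P812 ?major .           # qualifier: academic major
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}
ORDER BY ?universityLabel
""".strip()


# Master's degree (Q183816) in Electrical Engineering (Q55636433)
query = build_degree_query("Q183816", "Q55636433")
print(query)
```

The printed text can be pasted into the Wikidata Query Service. Using `VALUES` instead of `FILTER` keeps the two IDs in one obvious place.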
    

Check if the university already exists in Wikidata. If not, create the university item. Do you have any other information that needs to be added?

Many of the Bulgarian universities already had entries in Wikidata. Only one had no entry at all. We added it manually to get to know the principles of using Wikidata and adding information, and included all the information we could gather about this university (University Prof. Dr. Asen Zlatarov).
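
We did this check by hand in the Wikidata search box. It could also be scripted against the MediaWiki API module `wbsearchentities`. The snippet below is a rough sketch under that assumption; `build_search_url` and `find_items` are hypothetical helper names, the User-Agent string is a placeholder, and `find_items` needs network access, so it is defined here but not called.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(label: str, language: str = "en") -> str:
    """Build a wbsearchentities request URL for an item label."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "item",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def find_items(label: str, language: str = "en") -> list:
    """Return (id, label, description) tuples for matching items.

    Performs a network request against the live Wikidata API.
    """
    request = urllib.request.Request(
        build_search_url(label, language),
        headers={"User-Agent": "degree-report-sketch/0.1"},  # placeholder
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in data.get("search", [])
    ]


# Only builds the URL; call find_items(...) to actually search.
print(build_search_url("Asen Zlatarov"))
```

An empty result list is only a hint that the item is missing. The Bulgarian name and older names of the university should be searched as well before creating a new item.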

Check if the degrees offered already exist in Wikidata. If not, create the degrees (make SURE that they are not already there)

This step was rather tough. A peculiarity of Bulgarian is that it can also be written in Latin letters, which is more widespread among the younger generation. Everything therefore had to be carefully converted to Cyrillic and translated into English so that it made sense. After gathering all the degrees and the majors on offer, we had to map them to the English terminology. The familiar system of Bachelor and Master of Arts, Science, Law, Nursing, etc. does not apply to the Bulgarian system, which only has bachelor's degrees and master's degrees without any further specification.

Our plan was to write a scraper to collect all the majors, but since Anabin stored different information about the same universities under different names, we had to get the complete and current lists from the universities' own websites. We would therefore have had to write 15 different scraper scripts, and even if we had found the time for that, it would not have been simple because, as mentioned earlier, most of the sites were not well structured. In the end we decided to collect the majors manually and translate them one by one. After compiling the majors of each university and translating them into the English system, we put them into an Excel sheet.

Connect up the university to the degree, including information on whether it is generally accepted.

After researching how to add our data to Wikidata, we found the tool OpenRefine, which was formerly maintained by Google as Google Refine and is now an open-source project. Finding the right way to add our data was a bit hard, because the tool is not very self-explanatory for our case. Wikidata demanded references, labels and descriptions for every entry, and every major and degree needed an instance of or subclass of statement to be accepted. But after experimenting and reading a few tutorials, we finally worked out the right schema for adding the information.
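
The shape of one such statement is: university, grants (P5460), degree, with academic major (P812) as a qualifier and a reference URL (P854) as the source. The sketch below writes this out in the tab-separated QuickStatements V1 syntax. QuickStatements is a different tool from the OpenRefine workflow we actually used and only serves as an illustration here. `grants_statement` is a hypothetical helper name, Q4115189 is the Wikidata Sandbox item standing in for a university, and the URL is a placeholder.

```python
def grants_statement(university_qid: str, degree_qid: str,
                     major_qid: str, reference_url: str) -> str:
    """Build one QuickStatements V1 line: a "grants" statement with an
    "academic major" qualifier and a reference URL as its source."""
    parts = [
        university_qid,          # subject item
        "P5460", degree_qid,     # grants -> degree
        "P812", major_qid,       # qualifier: academic major
        "S854", f'"{reference_url}"',  # source: reference URL (string value)
    ]
    return "\t".join(parts)


line = grants_statement(
    "Q4115189",                        # Wikidata Sandbox, placeholder subject
    "Q183816",                         # master's degree
    "Q55636433",                       # Electrical Engineering
    "https://example.org/programmes",  # placeholder reference
)
print(line)
```

One line per major keeps each qualifier attached to the right degree statement, which is also what the query on the queries page relies on.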

 

Some universities had surprisingly long lists of majors, so we decided to insert 5 per university, which added up to 90 majors in total. The actual number would have been about 850. We added our information and exported all entries.

Unfortunately, we made some minor mistakes while adding the data, such as creating a few duplicate items, or creating new items where merging two existing ones would have been more correct. We also capitalized words that English does not capitalize. A user on Wikidata suggested fixing these issues, which we will do. Furthermore, at the beginning of the project we accidentally created a new master's degree item during one of our first attempts, which led to wrong results when we tried our queries. We merged these items (Q55693871 & Q183816), which solved the issue right away.
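
Duplicates that differ only in capitalization or spacing can be spotted before an upload by comparing normalised labels. This is a minimal sketch, not something we ran in the project; `normalise` and `find_duplicate_labels` are hypothetical helper names, and the labels in the sample list are illustrative, only the item IDs come from this report.

```python
from collections import defaultdict


def normalise(label: str) -> str:
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.casefold().split())


def find_duplicate_labels(items):
    """Group (qid, label) pairs by normalised label and return only
    the groups that contain more than one item."""
    groups = defaultdict(list)
    for qid, label in items:
        groups[normalise(label)].append(qid)
    return {label: qids for label, qids in groups.items() if len(qids) > 1}


items = [
    ("Q55693871", "Master's Degree"),   # illustrative label
    ("Q183816", "master's degree"),
    ("Q55636433", "Electrical Engineering"),
]
duplicates = find_duplicate_labels(items)
print(duplicates)
```

Items that end up in the same group are candidates for a merge, not automatic duplicates, so each pair still needs a manual look.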

 

Add additional data as discovered and test the queries

We added all the degrees and majors and created an item for the university that did not yet exist in Wikidata.

If there is time, include another country!

Unfortunately, there was no time to create items for another country. We did not know how long it would take to repeat the whole process for a second country, and with the upcoming exams and showtime preparations, nobody could spare the time.