Wikidata:WikiProject Biodiversity/Event 20220321 Arise Hackathon 2022

Arise Hackathon 2022

Team Trixidata

Game plan

WikiProject Biodiversity is a wiki project that aims to extend the different Wikimedia platforms (i.e. Wikipedia, Wikimedia Commons, and Wikidata) with knowledge about biodiversity. This page is a subpage of that project, created specifically for the Arise Hackathon 2022. It serves as preparation, to collect and fine-tune ideas for the actual hackathon.

The aim of the Trixidata project is to create a showcase for Naturalis by providing a concrete example of publishing Naturalis's sources on the open source platform Wikipedia, and by building an iterative pipeline to enrich a graph on biodiversity.

The results of the project are:

  1. an architecture diagram of the pipeline from Naturalis's sources to open sources
  2. a working pipeline between those sources
  3. a number of Wikipedia pages that can be filled in easily (human friendly)
  4. an automatically refreshing Wikidata background to enrich a graph (computer friendly)
  5. a presentation to explain the principles and sketch the opportunities for Naturalis

Tasks

Architecture

Make a sketch of the pipeline that connects the sources of Naturalis with the open sources on biodiversity.

Write missing Wikipedia articles on insects

Here we will identify the (insect) species for which Wikipedia articles are missing. For those species we will create Wikipedia stubs with knowledge collected from a variety of respected resources. Where possible we will create more informative Wikipedia articles (in English and Dutch).
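
The identification step could be sketched as follows. This is a minimal illustration, not the team's actual method: it assumes the species are already available as Wikidata entity JSON, where each entity carries a "sitelinks" mapping keyed by site id (such as "enwiki" or "nlwiki"). The function name `missing_articles` and the sample ids and titles are hypothetical placeholders.

```python
# Sketch: find species items that lack a Wikipedia article in a given language.
# Assumption: `entities` follows the shape of Wikidata entity JSON, where each
# entity has a "sitelinks" mapping keyed by site id (e.g. "enwiki", "nlwiki").

def missing_articles(entities, sites=("enwiki", "nlwiki")):
    """Return {site: [entity ids that have no sitelink to that site]}."""
    missing = {site: [] for site in sites}
    for entity_id, entity in entities.items():
        sitelinks = entity.get("sitelinks", {})
        for site in sites:
            if site not in sitelinks:
                missing[site].append(entity_id)
    return missing


# Made-up sample data; the ids and titles are placeholders, not real items.
sample_entities = {
    "Q_EXAMPLE_1": {"sitelinks": {"enwiki": {"title": "Example beetle"}}},
    "Q_EXAMPLE_2": {"sitelinks": {}},
    "Q_EXAMPLE_3": {
        "sitelinks": {
            "enwiki": {"title": "Example moth"},
            "nlwiki": {"title": "Voorbeeldmot"},
        }
    },
}

result = missing_articles(sample_entities)
for site, entity_ids in result.items():
    print(site, entity_ids)
```

The resulting lists per site can serve as a worklist of stubs to write in English and Dutch.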

Describe Collectors and Collections from the Dutch Caribbean

Understanding who collected the specimens is an important element of understanding the nature of data from a collection. The specimens in a collection are the result of people's interests, the places they went and the times they went there. Knowing who the collectors are is also important for giving credit and for understanding research trends and the funding of biodiversity research. We will discover who the insect collectors of the Dutch Caribbean were, when they worked and where they came from, linking them to work we have already done on Montserrat and the Cayman Islands.
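
As a rough sketch of how "when they worked" and "where they collected" might be derived from specimen records, the example below groups records by collector. It assumes the records use the Darwin Core terms `recordedBy`, `eventDate` and `country`, and that `eventDate` starts with a four-digit year. The function name `summarize_collectors` and all sample records are made up for illustration. Where collectors came from is not in such records and would need other sources.

```python
# Sketch: summarise specimen records per collector.
# Assumptions: records are dicts with Darwin Core-style keys "recordedBy",
# "eventDate" (starting with a four-digit year) and "country".

def summarize_collectors(records):
    """Return {collector: {"first_year", "last_year", "places"}}."""
    collected = {}
    for rec in records:
        collector = rec.get("recordedBy")
        if not collector:
            continue  # skip records without a named collector
        entry = collected.setdefault(collector, {"years": set(), "places": set()})
        date = rec.get("eventDate") or ""
        if len(date) >= 4 and date[:4].isdigit():
            entry["years"].add(int(date[:4]))
        if rec.get("country"):
            entry["places"].add(rec["country"])

    summary = {}
    for collector, entry in sorted(collected.items()):
        years = entry["years"]
        summary[collector] = {
            "first_year": min(years) if years else None,
            "last_year": max(years) if years else None,
            "places": sorted(entry["places"]),
        }
    return summary


# Made-up records; collector names and dates are placeholders.
records = [
    {"recordedBy": "Collector A", "eventDate": "1936-05-12", "country": "Curaçao"},
    {"recordedBy": "Collector A", "eventDate": "1949-11-03", "country": "Aruba"},
    {"recordedBy": "Collector B", "eventDate": "1955", "country": "Bonaire"},
    {"recordedBy": "", "eventDate": "1960-01-01", "country": "Aruba"},
]

summary = summarize_collectors(records)
for collector, info in summary.items():
    print(collector, info)
```

In practice collector names would first need to be disambiguated and linked to person items, which this sketch does not attempt.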

Make species interaction explicit

Much is known about species interactions. Here we use Wikidata to collect interaction data from, for example, GloBI. Once described in Wikidata, the knowledge can be queried and reused there.

Linked data and Wikidata allow storing data in an intuitive data format. Data about a concept is stored using so-called statements or triples, which follow a "subject-predicate-object" form. This is inspired by the [https://en.wikipedia.org/wiki/Subject%E2%80%93verb%E2%80%93object_word_order subject-verb-object word order] used in the grammars of various natural languages, which allows a more intuitive way of dealing with data. However, legacy data formats are more aligned with two-dimensional dataframes. Tools are needed to bridge traditional data formats with the formats used in linked data and Wikidata. In this task we review existing tools to align various biodiversity resources with linked-data forms. If possible, we will prototype other intuitive input tools to generate linked data.
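
To illustrate the gap between a two-dimensional table and the "subject-predicate-object" form, the sketch below turns each table row into triples: one column supplies the subject, every other column name becomes a predicate, and the cell value becomes the object. The function name `rows_to_triples` and the column names are assumptions made for this example. In Wikidata, subjects and predicates would be item and property identifiers rather than the plain labels used here for readability.

```python
# Sketch: convert table rows (a list of dicts) into subject-predicate-object
# triples. Column names and the helper name are illustrative assumptions.

def rows_to_triples(rows, subject_column):
    """Return a list of (subject, predicate, object) tuples."""
    triples = []
    for row in rows:
        subject = row[subject_column]
        for column, value in row.items():
            # The subject column names the row; empty cells carry no statement.
            if column == subject_column or value in (None, ""):
                continue
            triples.append((subject, column, value))
    return triples


table = [
    {"taxon": "Danaus plexippus", "taxon rank": "species", "parent taxon": "Danaus"},
    {"taxon": "Danaus", "taxon rank": "genus", "parent taxon": ""},
]

triples = rows_to_triples(table, "taxon")
for triple in triples:
    print(triple)
```

Each row yields one triple per non-empty cell, so sparse tables produce fewer statements instead of empty values.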