User:ProteinBoxBot/2020 Disease Ontology sprint

Overall summary

This sprint aims to cover the Disease Ontology (Q5282129) in Wikidata. The Disease Ontology (Q5282129) has been synchronized with Wikidata since 2015. For mappings between the Disease Ontology (Q5282129) and other disease identifiers, we relied on the dbxref predicates in the Disease Ontology. The semantics of this predicate are not precise enough to express exact mappings between DO and those resources, so we paused the bot to investigate the issue and propose a fix. The Disease Ontology (Q5282129) now uses skos:exactMatch to express exact matches between resources. Implementing a new bot while also removing artefacts caused by previous bot runs turned out to be difficult. To move forward, two tracks are followed in parallel: a staging wikibase on wbstack is used to model the Disease Ontology in an environment similar to Wikidata, and artefacts from previous bot runs are fixed. Once the wikibase entity schema is aligned with the native schema of the Disease Ontology (Q5282129) and the previous edits have been either updated or deleted, the bot edits on Wikidata will resume.
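The distinction between the two kinds of mapping can be sketched in a few lines of Python. This is only an illustration, not the bot's code: the `Mapping` type, the `split_mappings` helper, the predicate label `oboInOwl:hasDbXref` for the dbxref annotations, and the placeholder identifiers are all assumptions made for the example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """One cross-reference taken from the ontology (hypothetical structure)."""
    doid: str       # Disease Ontology term the mapping belongs to
    predicate: str  # predicate used in the ontology for this mapping
    target: str     # identifier in the external resource


# Only this predicate is treated as an exact mapping.
EXACT_PREDICATE = "skos:exactMatch"


def split_mappings(mappings):
    """Separate exact matches from mappings whose semantics are imprecise."""
    exact = [m for m in mappings if m.predicate == EXACT_PREDICATE]
    imprecise = [m for m in mappings if m.predicate != EXACT_PREDICATE]
    return exact, imprecise


# Placeholder data, not real Disease Ontology content.
sample = [
    Mapping("DOID:0000001", "skos:exactMatch", "EXAMPLE:A"),
    Mapping("DOID:0000001", "oboInOwl:hasDbXref", "EXAMPLE:B"),
]

exact, imprecise = split_mappings(sample)
```

Under this sketch, only the entries in `exact` would be candidates for an exact mapping on Wikidata, while the entries in `imprecise` would need review before being kept, updated, or removed.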

Participants

Gameplan

  • Set up a staging wikibase on wbstack (done)
  • Remove or update mappings between DO and other disease identifier providers from Wikidata
  • Update the EntitySchema for Disease Ontology

Example disease

Disease | Disease Ontology ID | Wikidata item | Mapping
asthma (Q35869) | DOID:2841 | Q35869 | MESH:D001249, OMIM:600807
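
As a rough sketch of how the mapping column of the asthma example could be turned into candidate Wikidata statements, the Python below splits each identifier into a prefix and a local ID. The helper names are hypothetical, and the prefix-to-property table (P486 for MeSH descriptor ID, P492 for OMIM ID) is an assumption that should be checked against the properties actually used on Wikidata.

```python
# Assumed lookup from identifier prefix to Wikidata property ID.
PREFIX_TO_PROPERTY = {
    "MESH": "P486",  # MeSH descriptor ID (assumed)
    "OMIM": "P492",  # OMIM ID (assumed)
}


def parse_curie(curie):
    """Split an identifier such as 'MESH:D001249' into (prefix, local ID)."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {curie!r}")
    return prefix, local_id


def mapping_cell_to_statements(mapping_cell):
    """Turn a comma-separated mapping cell into (property, value) pairs."""
    statements = []
    for curie in mapping_cell.split(","):
        prefix, local_id = parse_curie(curie.strip())
        # An unknown prefix raises KeyError, so nothing is mapped silently.
        statements.append((PREFIX_TO_PROPERTY[prefix], local_id))
    return statements


# Mapping cell copied from the asthma row.
statements = mapping_cell_to_statements("MESH:D001249, OMIM:600807")
```

Failing on unknown prefixes is a deliberate choice in this sketch: a mapping that cannot be tied to a known property is surfaced for review instead of being written to the item.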