Software Used in Scientific Research

This page is a planning document for research we aim to present at WikidataCon.

YUL goal

The digital preservation department would like to provide evidence as part of our rationale for decisions about which types of scientific software to prioritize for preservation by the library.

Taking the set of scientific publications in PubMed (or some subset of these papers), we would like to see which pieces of software are mentioned in these research papers.

Exploratory work

In Semantic Annotation of Data Processing Pipelines in Scientific Publications (Q30098399), the researchers have exposed their data via a SPARQL endpoint here.

Using the following query:

PREFIX dms: <>
PREFIX dc: <>
PREFIX rdf: <>
PREFIX prov: <>
SELECT DISTINCT ?papername ?softname
WHERE {
  ?paper dms:describesExperiment ?experiment .
  ?experiment dms:usedSoftware ?software .
  ?software prov:value ?softname .
  ?paper dc:title ?papername .
}

We get a set of results about which software is mentioned in which papers from their data set (proceedings of 4 semantic web conferences).
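As a minimal sketch of how such results could be processed further, assuming the endpoint returns the standard SPARQL 1.1 JSON results format, the following Python groups software names by paper title. The function name `software_by_paper` and the sample titles and tool names are illustrative placeholders, not data from the study.

```python
import json
from collections import defaultdict


def software_by_paper(results_json):
    """Group software names by paper title.

    Expects a string in the SPARQL 1.1 JSON results format with the
    variables ?papername and ?softname bound in every row.
    """
    data = json.loads(results_json)
    grouped = defaultdict(set)  # a set drops repeated (paper, software) rows
    for row in data["results"]["bindings"]:
        grouped[row["papername"]["value"]].add(row["softname"]["value"])
    return {paper: sorted(names) for paper, names in grouped.items()}


def _row(paper, software):
    # Helper for building made-up sample rows in the results format.
    return {
        "papername": {"type": "literal", "value": paper},
        "softname": {"type": "literal", "value": software},
    }


# Made-up sample data (placeholders only, not results from the endpoint).
SAMPLE = json.dumps({
    "head": {"vars": ["papername", "softname"]},
    "results": {"bindings": [
        _row("Example paper A", "Tool Y"),
        _row("Example paper A", "Tool X"),
        _row("Example paper A", "Tool X"),
        _row("Example paper B", "Tool X"),
    ]},
})

if __name__ == "__main__":
    for paper, tools in software_by_paper(SAMPLE).items():
        print(paper, "->", ", ".join(tools))
```

Grouping by title is a simplification: two papers with identical titles would be merged, so selecting the ?paper IRI as well would be more robust.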

Using OpenRefine to reconcile against Wikidata items that are instances of software (Q7397), we get these results: relationships between papers and pieces of software that already have items in Wikidata.
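To illustrate what reconciling against software items involves, here is a rough sketch that builds a query for the Wikidata Query Service. It is not how OpenRefine works internally: OpenRefine does fuzzy matching, whereas this query only finds exact label matches. The function name `software_lookup_query` is hypothetical, and the query relies on the `wd:`, `wdt:` and `rdfs:` prefixes that the Wikidata Query Service predefines.

```python
def software_lookup_query(name, lang="en"):
    """Build a SPARQL query for Wikidata items whose label is exactly `name`
    and that are instances of software (Q7397) or of one of its subclasses.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item WHERE {\n"
        "  ?item wdt:P31/wdt:P279* wd:Q7397 ;\n"  # instance of / subclass of
        f'        rdfs:label "{escaped}"@{lang} .\n'
        "}"
    )


if __name__ == "__main__":
    print(software_lookup_query("OpenRefine"))
```

Exact matching will miss spelling variants and version suffixes in the extracted names, which is one reason a reconciliation tool with manual review is a better fit for the real workflow.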

Related Work

Draft Proposal for Submission

  • increasing awareness of the value of software citation
  • demonstration of a method for how to automatically identify, structure and publish statements documenting which software has been reported to have been used in which publication
  • the role Wikidata could play in cultivating this practice
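The "structure and publish" step in the second point could be sketched as follows, assuming the paper and software items have already been reconciled to Wikidata QIDs. The sketch emits tab-separated lines in the QuickStatements V1 style (item, property, value). The function name `quickstatements_lines` is hypothetical, and the property is deliberately left as a parameter: which Wikidata property should link a publication to the software it reports using is a modelling decision this page does not settle.

```python
import re

_QID = re.compile(r"Q[1-9][0-9]*")
_PID = re.compile(r"P[1-9][0-9]*")


def quickstatements_lines(pairs, prop):
    """Turn (paper_qid, software_qid) pairs into tab-separated statements.

    `prop` is the property ID chosen by the caller (an assumption of this
    sketch, not something prescribed by the planning page).
    """
    if not _PID.fullmatch(prop):
        raise ValueError(f"not a property ID: {prop!r}")
    lines = []
    for paper, software in pairs:
        for qid in (paper, software):
            if not _QID.fullmatch(qid):
                raise ValueError(f"not an item ID: {qid!r}")
        lines.append(f"{paper}\t{prop}\t{software}")
    return "\n".join(lines)
```

Validating the identifiers before emitting anything keeps unreconciled names (plain strings rather than QIDs) from slipping into a batch upload.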