User:YULdigitalpreservation/UsesSoftware

Software Used in Scientific Research

This page is a planning document for research we aim to present at WikidataCon.

YUL goal

The digital preservation department would like to provide evidence as part of our rationale for decisions about which types of scientific software to prioritize for preservation by the library.

Taking the set of scientific publications in PubMed (or some subset of these papers), we would like to see which pieces of software are mentioned in these research papers.

Exploratory work

In Semantic Annotation of Data Processing Pipelines in Scientific Publications (Q30098399), the researchers have exposed their data via a SPARQL endpoint here.

Using the following query:

PREFIX dms: <https://github.com/mesbahs/DMS/blob/master/dms.owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT DISTINCT ?papername ?softname
WHERE {
  ?paper dms:describesExperiment ?experiment .
  ?experiment dms:usedSoftware ?software .
  ?software prov:value ?softname .
  ?paper dc:title ?papername .
}

We get a set of results showing which software is mentioned in which papers in their data set (the proceedings of four semantic web conferences).
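The query above could also be run from a script rather than through the endpoint's web form. The following is a minimal Python sketch, assuming the endpoint accepts GET requests under the standard SPARQL protocol and can return SPARQL JSON results. The function names `run_query` and `parse_bindings` are hypothetical, and the endpoint URL is left as a parameter because it is not given on this page.

```python
import json
import urllib.parse
import urllib.request

# The exploratory query from this page, unchanged in meaning.
QUERY = """
PREFIX dms: <https://github.com/mesbahs/DMS/blob/master/dms.owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT DISTINCT ?papername ?softname
WHERE {
  ?paper dms:describesExperiment ?experiment .
  ?experiment dms:usedSoftware ?software .
  ?software prov:value ?softname .
  ?paper dc:title ?papername .
}
"""


def parse_bindings(results):
    """Turn a SPARQL JSON results document into (paper title, software name) pairs."""
    pairs = []
    for row in results["results"]["bindings"]:
        pairs.append((row["papername"]["value"], row["softname"]["value"]))
    return pairs


def run_query(endpoint_url, query=QUERY):
    """Send the query to endpoint_url (supplied by the caller) and parse the response.

    Assumption: the endpoint honours the Accept header below.
    """
    params = urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        f"{endpoint_url}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Keeping the parsing step separate from the network call means the pairs can also be produced from a results file saved from the endpoint's web interface.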

Using OpenRefine to reconcile the software names against Wikidata items that are instances of software (Q7397), we get these results: relationships between papers and pieces of software that already have items in Wikidata.
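For a quick check outside OpenRefine, a software name can be looked up on the Wikidata Query Service. The sketch below only builds the query text. It assumes an exact label match, which is stricter than OpenRefine's fuzzy reconciliation, and it relies on the `wd:`, `wdt:` and `rdfs:` prefixes that the Wikidata Query Service predefines. The function name `build_software_lookup_query` is hypothetical.

```python
def build_software_lookup_query(name, lang="en"):
    """Build a SPARQL query for Wikidata items labelled `name` that are software.

    P31 is "instance of", P279 is "subclass of", and Q7397 is "software", so the
    property path also matches instances of subclasses of software.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{lang} ;\n'
        "        wdt:P31/wdt:P279* wd:Q7397 .\n"
        "}\n"
        "LIMIT 5"
    )
```

Names that return no item under this query are not necessarily missing from Wikidata. They may be recorded under a different label or alias, which is where OpenRefine's candidate matching is more useful.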

Related Work

Draft Proposal for Submission

  • increasing awareness of the value of software citation
  • demonstration of a method for how to automatically identify, structure and publish statements documenting which software has been reported to have been used in which publication
  • the role Wikidata could play in cultivating this practice