Wikidata:Property proposal/ChemRxiv ID

ChemRxiv ID

Originally proposed at Wikidata:Property proposal/Generic

   Done: ChemRxiv ID (P9262) (Talk and documentation)
Description: identifier of a document in ChemRxiv, a preprint server for the chemical sciences launched in 2016
Represents: ChemRxiv (Q50012382)
Data type: External identifier
Template parameter: ChemRxiv at en:Template:Cite journal
Domain: scholarly article (Q13442814), preprint (Q580922)
Allowed values: \d+
Example 1: Enantioselective conjugate addition of catalytically generated zinc homoenolate (Q104931192) → 13607438
Example 2: COVID-19 knowledge extractor (COKE): a tool and a web portal to extract drug-target protein associations from the CORD-19 corpus of scientific publications on COVID-19 (Q103837480) → 13289222
Example 3: Structure-based drug design of an inhibitor of the SARS-CoV-2 (COVID-19) main protease using free software: a tutorial for students and scientists (Q98577324) → 12791954
Example 4: Additional examples can be listed via Scholia (https://scholia.toolforge.org/venue/Q50012382)
Source: https://chemrxiv.org/
Number of IDs in source: 7357 as of 2021-01-23, growing at a rate on the scale of hundreds per month
Formatter URL: https://doi.org/10.26434/CHEMRXIV.$1
Robot and gadget jobs: A bot should be created to upload all existing preprints, update the metadata for existing items, and automatically add new ones. I've already sent a pull request to enable this capability with the Wikidata Integrator project on GitHub (https://github.com/SuLab/WikidataIntegrator/pull/170)
See also: arXiv ID (P818), BioRxiv ID (P3951)
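
To illustrate how the allowed-values pattern and the formatter URL fit together, here is a minimal Python sketch. It assumes the pattern `\d+` must match the whole identifier and that `$1` is replaced by the identifier verbatim. The function name `format_chemrxiv_url` is hypothetical and not part of any Wikidata tooling.

```python
import re

# Values copied from the proposal's "Allowed values" and "Formatter URL" fields.
# re.ASCII restricts \d to 0-9 (an assumption; the proposal only says \d+).
ALLOWED_VALUES = re.compile(r"\d+", re.ASCII)
FORMATTER_URL = "https://doi.org/10.26434/CHEMRXIV.$1"


def format_chemrxiv_url(identifier: str) -> str:
    """Validate a ChemRxiv ID and expand it into the formatter URL.

    Hypothetical helper for illustration only.
    """
    if not ALLOWED_VALUES.fullmatch(identifier):
        raise ValueError(f"not a valid ChemRxiv ID: {identifier!r}")
    # Substitute the identifier for the $1 placeholder.
    return FORMATTER_URL.replace("$1", identifier)


if __name__ == "__main__":
    # Identifiers taken from Examples 1-3 of the proposal.
    for chemrxiv_id in ("13607438", "13289222", "12791954"):
        print(format_chemrxiv_url(chemrxiv_id))
```

An identifier containing anything other than digits raises `ValueError` instead of producing a malformed link.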

Motivation

Like arXiv and bioRxiv, ChemRxiv is a domain-specific preprint server that covers the chemical sciences. While ChemRxiv assigns DOIs to its articles, it is valuable to use preprint server-specific identifiers to later link the same Wikidata item describing an article to both its preprint and its postprint (as is the case for other articles that have bioRxiv and ChemRxiv preprints).

Because new ChemRxiv identifiers are minted with each submission, I'm not sure what to enter for "expected completeness". Wikidata could be complete at any given time, but since more identifiers will follow, I'm not sure whether this means "always incomplete". Neither arXiv ID (P818) nor BioRxiv ID (P3951) lists this attribute, so perhaps it can be omitted here as well. Cthoyt (talk) 00:08, 23 January 2021 (UTC)

Discussion

  • I don't know much about documenting scholarly articles (Q13442814). Is DOI (P356) still going to be needed/created as a statement after this property is approved? --Lectrician1 (talk) 04:17, 23 January 2021 (UTC)
    • @Lectrician1: This identifier is only for a tiny subset of scholarly articles (7357 according to the proposal here), so yes the DOI property will definitely still be needed. However, I'm wondering if we should be adding items for preprints - aren't most of them usually published elsewhere? Shouldn't we just have one item for the article (with both the DOI for the published article and a preprint ID if there is one)? ArthurPSmith (talk) 17:00, 25 January 2021 (UTC)
    • @Lectrician1: @ArthurPSmith: Typically, scholarly articles on Wikidata use the DOI to point to the post-print. In the example of arXiv, no DOI is assigned by the preprint server. In the case of bioRxiv and ChemRxiv, the preprint servers do assign DOIs. Further, the internal identifier of a document can be formatted into a DOI string to generate the appropriate DOI. When documents are only pre-prints from bioRxiv, they have the bioRxiv DOI, and when they are published, the DOI is changed to the post-print's DOI. Usually an identifier from PubMed is the arbiter of whether something is a post-print. As it is currently done, the same item points to both the preprint (using the preprint server-specific Wikidata property) and the post-print (using a DOI). On the idea of whether we should be adding preprints or not - I think with tools like Scholia that summarize the academic output of an individual, it's nice to have the most up-to-date information, including pre-prints. To be honest, as a scientist, I am more often reading preprints these days rather than waiting for content to be peer reviewed (the goodness of peer review is a debate for somewhere else ;)). Cthoyt (talk) 16:54, 5 February 2021 (UTC)
  •   Support, especially given that we also have BioRxiv ID (P3951) and arXiv ID (P818). It will allow us to track the evolution of a manuscript from pre-print to publication. This can be important because publication can take months to years and the manuscript can change dramatically in the meantime. --Hannes Röst (talk) 17:55, 4 March 2021 (UTC)
  •   Done as ChemRxiv ID (P9262). @Cthoyt, Lectrician1, ArthurPSmith, Hannes Röst: UWashPrincipalCataloger (talk) 08:25, 7 March 2021 (UTC)