Wikidata:Property proposal/JSTOR publisher ID

JSTOR publisher ID

Originally proposed at Wikidata:Property proposal/Authority control

Description: identifier for a publisher on JSTOR
Represents: JSTOR (Q1420342)
Data type: External identifier
Domain: publisher (Q2085381), educational institution (Q2385804), scientific organization (Q45103187) (this may need to be adjusted – the class hierarchy for educational orgs confuses me)
Allowed values: [a-zA-Z0-9_-]{2,24} [1]
Example 1: Butler University (Q1017974) → butler
Example 2: Wiley (Q1479654) → black
Example 3: Yale University Press (Q255147) → yale
Example 4: Yale University (Q49112) → yaleuniv
Example 5: Institute of Peace and Conflict Studies (Q616167) → IPCS
Source: https://www.jstor.org/publishers
Mix'n'match: 1240
Planned use: External links on Wikipedia, potential for autofilling citation details
Number of IDs in source: 2084 [2]
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://www.jstor.org/publisher/$1
Robot and gadget jobs: Can be easily scraped with Mix'n'match
See also: JSTOR journal ID (P1230), JSTOR article ID (P888), JSTOR topic ID (P3827)
  1. I analyzed all the IDs at https://www.jstor.org/publishers, and found that all IDs are limited to the following set of characters: -1357CDGILNPSU_abcdefghijklmnopqrstuvwxyz. IDs can be as short as 2 characters (e.g. dv) or as long as 24 characters (e.g. britassocbiolanthroosteo). IDs appear to be case-sensitive (e.g. IPCS works but ipcs does not), so we must allow uppercase letters.
  2. Based on number of matches for CSS selector #content a[href^='/publisher/'] at https://www.jstor.org/publishers. Retrieved 18 Aug 2020.
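
As a rough illustration of how the allowed-values pattern and the formatter URL fit together, here is a minimal Python sketch. The pattern and URL are copied from the proposal above; the function names `is_valid_publisher_id` and `publisher_url` are hypothetical and exist only for this example.

```python
import re

# Pattern taken from the "Allowed values" field of this proposal.
ALLOWED_VALUES = re.compile(r"[a-zA-Z0-9_-]{2,24}")

# Formatter URL from this proposal; "$1" is the placeholder for the ID.
FORMATTER_URL = "https://www.jstor.org/publisher/$1"


def is_valid_publisher_id(candidate: str) -> bool:
    """Return True if the whole string matches the allowed-values pattern.

    (Hypothetical helper name, not part of any Wikidata tooling.)
    """
    return ALLOWED_VALUES.fullmatch(candidate) is not None


def publisher_url(publisher_id: str) -> str:
    """Build the JSTOR publisher URL for an ID that passes the format check."""
    if not is_valid_publisher_id(publisher_id):
        raise ValueError(f"not a well-formed JSTOR publisher ID: {publisher_id!r}")
    return FORMATTER_URL.replace("$1", publisher_id)


if __name__ == "__main__":
    # IDs taken from the examples and footnote 1 above.
    for example in ["butler", "black", "yale", "yaleuniv", "IPCS", "dv"]:
        print(example, publisher_url(example))
```

Note that this only checks the format. Because IDs appear to be case-sensitive, a string such as `ipcs` passes the pattern even though, per footnote 1, only `IPCS` resolves on JSTOR.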

Motivation

JSTOR is a major source of scholarly articles and is cited frequently on Wikimedia projects. This property could be used to autofill the publisher (P123) property for articles and journals hosted on JSTOR. If you check out a few example pages for JSTOR article ID (P888) and JSTOR journal ID (P1230) (e.g. example journal, example article), you'll see that JSTOR includes a prominent link to the publisher of each journal or article, which could easily be scraped. IagoQnsi (talk) 19:04, 18 August 2020 (UTC)

Discussion