Wikidata:Property proposal/HathiTrust Volume Identifier
HathiTrust Volume Identifier
Originally proposed at Wikidata:Property proposal/Authority control
Description | Volume Identifier for HathiTrust
Alphanumeric ID from the HathiTrust Digital Library, a large-scale collaborative repository of digital content from research libraries, including content digitized via the Google Books project and Internet Archive digitization initiatives, as well as content digitized locally by libraries. For more information see w:HathiTrust. Each item in the registry has a permanent Volume ID and a stable URL (see below), so it would be easy to link the item on Wikidata to the resource on HathiTrust. HathiTrust ID (P1844) covers the HathiTrust record number, which represents a work's bibliographic data and is not an immutable ID. Because this identifier represents a specific scan entity, as opposed to a bibliographic record, it is more like the HathiTrust counterpart to Internet Archive ID (P724) |
---|---|
Data type | External identifier |
Domain | version, edition or translation (Q3331189), e.g. of book (Q571) |
Allowed values | [a-z0-9]+\.[a-z0-9_\-:/]+ |
Example 1 | Essays on practical agriculture (Q51469955) → loc.ark:/13960/t3902rm3s |
Example 2 | The birds of Long Island (Q51420434) → hvd.hn4t8l |
Example 3 | How to Play Chess (Q19049739) → uc2.ark:/13960/t3pv6f03j |
Source | https://www.hathitrust.org/hathifiles_description |
Planned use | Linking editions to scans online |
Number of IDs in source | 17,455,698 digitised volumes |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://hdl.handle.net/2027/$1 |
Robot and gadget jobs | Probably all uses of Commons:Template:HathiTrust can be imported, and any volume with an OCLC number may be able to be linked. |
See also | HathiTrust ID (P1844) |
Distinct-values constraint | yes |
Wikidata project | WikiProject Books (Q8487081) and WikiProject Academic Journals (Q59961429) |
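As a rough illustration of how the "Allowed values" pattern and the "Formatter URL" above fit together, here is a minimal Python sketch. The pattern, the URL template and the three example IDs are copied from this proposal; the function names `is_valid_volume_id` and `volume_url` are hypothetical helpers invented for the example, and plain substitution of `$1` without any escaping is an assumption about how the formatter would be applied.

```python
import re

# Pattern copied from the "Allowed values" row of this proposal.
VOLUME_ID_PATTERN = re.compile(r"[a-z0-9]+\.[a-z0-9_\-:/]+")

# Formatter URL copied from this proposal; "$1" stands for the volume ID.
FORMATTER_URL = "https://hdl.handle.net/2027/$1"


def is_valid_volume_id(volume_id: str) -> bool:
    """Return True if the whole string matches the proposed pattern."""
    return VOLUME_ID_PATTERN.fullmatch(volume_id) is not None


def volume_url(volume_id: str) -> str:
    """Build a link by substituting the ID into the formatter URL.

    Assumption: the ID is inserted as-is, with no percent-encoding.
    """
    if not is_valid_volume_id(volume_id):
        raise ValueError(f"not a valid volume ID: {volume_id!r}")
    return FORMATTER_URL.replace("$1", volume_id)


if __name__ == "__main__":
    # The three example values listed in the proposal.
    for example in (
        "loc.ark:/13960/t3902rm3s",
        "hvd.hn4t8l",
        "uc2.ark:/13960/t3pv6f03j",
    ):
        print(example, "->", volume_url(example))
```

A purely numeric record number such as 001730317 (a HathiTrust ID (P1844) value) contains no dot and so would be rejected by this pattern, which gives a simple way to tell the two kinds of identifier apart.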
Motivation
Linking to scan authority control data from Wikisource author, portal and index pages. Inductiveload (talk) 16:43, 3 February 2021 (UTC)
Discussion
WikiProject Books has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Inductiveload (talk) 16:50, 3 February 2021 (UTC)
- Comment this sounds useful, is there already a mapping available from Internet Archive ID (P724) to HathiTrust ID? Can you explain the difference to HathiTrust ID (P1844) - is this proposal here for a *specific* scan and HathiTrust ID (P1844) is for the work? So each HathiTrust ID (P1844) could have 0, one or more scan events each with its own "HathiTrust Volume Identifier"? An example would be 001730317 (War and Peace) where there is one book, but there are two different scan events (v2 and v3). Why do you say that HathiTrust ID (P1844) is not stable, that sounds concerning and a long-term issue? Assuming that the https://catalog.hathitrust.org/Record/001730317 is stable (which you say it is not), I would argue that we would not need to record the individual scans as we could just link to the record of the book and Wikidata would not have to keep track of all individual scan events. Best --Hannes Röst (talk) 18:36, 3 February 2021 (UTC)
- @Hannes Röst: I don't think there is a mapping from IA -> HT. Many IA works are scanned by IA themselves, and most Google scans come via Google books (with extra processing that sometimes trashes images, which usually means the HT scans are better quality). It's possible that sometimes an IA and HT record can be linked up via the OCLC number (and/or LCCN?).
- Hathi's documentation says about HathiTrust ID (P1844): "HathiTrust's record number for the associated bibliographic record: HathiTrust record numbers are not permanent and can change over time." I don't know how often that actually happens, though. I guess this is so they can split and merge bibliographic records as needed.
- Probably the biggest use of this as a separate property to HathiTrust ID (P1844) is for things like your example (a multi-volume work) and periodicals like The Electrical Engineer (Q105221968), which has https://catalog.hathitrust.org/Record/000554163, which contains links for each volume (collection of issues, in this case on a 6-monthly basis). So, not all Hathi Volume IDs under one Record ID point to scans of the same thing. In this case, there are also scan events that do refer to the same thing, scanned multiple times (e.g. 1, 2 and 3). Files uploaded at Commons from Hathi will use the volume ID (often with Commons:Template:HathiTrust, often not), not the record ID, since the document is obviously tied to the specific scan event.
- See also Wikidata_talk:WikiProject_Periodicals#Properties_for_periodicals_tiers:_work/series/volume/issue/article, where I am trying to figure out how to represent a multi-tier work hierarchy with a view to using it to drive Wikisource pages. Our scans at WS are obviously generally (sometimes they're composited) tied to a single HT or IA (or whatever) scan.
- Inductiveload (talk) 20:37, 3 February 2021 (UTC)
- Comment Why not just use Handle ID (P1184) for these? Mahir256 (talk) 02:17, 4 February 2021 (UTC)
- @Mahir256: probably because I hadn't seen that property :-/. Is there a canonical way to express that a Handle ID represents a scan of the item in question? Inductiveload (talk) 09:33, 4 February 2021 (UTC)
- @Inductiveload: Can we consider this proposal withdrawn? --Emu (talk) 17:36, 29 March 2021 (UTC)
- @Emu: I think so, yes. But I still don't have an answer for "Is there a canonical way to express that a Handle ID represents a scan of the item in question?", since Handle IDs can represent other things as well. Inductiveload (talk) 21:55, 29 March 2021 (UTC)
- @Inductiveload: According to this query, the qualifier collection (P195) is used for this --Emu (talk) 22:04, 29 March 2021 (UTC)
- @Emu: is that not more "where the item/record is" rather than "what relationship this Handle ID has to the entity" (in this case "is a scan of it", possibly with other image-related metadata)? Inductiveload (talk) 22:19, 29 March 2021 (UTC)
- @Inductiveload: Well, I’m not an expert on modeling such statements, but collection (P195) is used together with accedaCRIS (Q52155934) and that might be what you have in mind? --Emu (talk) 22:41, 29 March 2021 (UTC)