Wikidata:WikiProject BHL/Updates
under construction
TO DO
Authors
- Match more BHL authors to items here -- BHL authors run through OpenRefine, if they have (i) dates and (ii) titles with items here
- Convert more author-name-strings to authors -- Updated. Some issues persisting with diacritics.
- Add family names from BHL to author items
- Add 'named as' to BHL authors that don't have them -- but some not updated in triplestore: tinyurl.com/yauzplqh (567)
- Consistency checks -- eg published before born: tinyurl.com/y7pej6hr (236) (some may have been owners not authors)
- Re-check against BHL dates: query for comparison file: tinyurl.com/y8w9mo4g
- Entomologists of the World ID (P5370) with no BHL creator ID (P4081): tinyurl.com/ybpmn5m7 (6717)
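The last check in the list (items with an Entomologists of the World ID but no BHL creator ID) can be written as a SPARQL query for the Wikidata Query Service. The sketch below is a reconstruction for illustration only: it is not necessarily the query behind the shortened link, and `build_wdqs_url` is a hypothetical helper name. It only builds the request URL and does not contact the service.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query shape: items that have P5370 but no P4081 statement.
QUERY = """
SELECT ?item ?entoId WHERE {
  ?item wdt:P5370 ?entoId .
  FILTER NOT EXISTS { ?item wdt:P4081 ?bhlCreatorId . }
}
"""


def build_wdqs_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_wdqs_url(QUERY)
```

The same pattern should work for the other checks by swapping the property IDs in the query text.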
Periodicals
- Initial posting to WikiProject Periodicals
- Identify periodicals and change instance of (P31) for titles with ISSNs, BPHs, 'periodical' identification from LoC LCCN (some), lots of volumes tinyurl.com/yaxx6wle (129) (some) (looked at with > 10 vols), 'annual' volume labels (some), catalogue series
  - information also in 'genre' section of BHL title pages
- Merge titles with duplicate ISSNs: BHL <-> non-BHL tinyurl.com/yd5k2qhw (32324), also by number of BHL items: tinyurl.com/y96ol4tm <- Priority; BHL <-> BHL tinyurl.com/y7k5397x (18)
- Compare against BHL links from Tropicos publication ID (P4904), IPNI publication ID (P2008), wikispecies, es-wiki
- investigation/merge list -- first pass done -- second list
- Journals where BHL range extends outside title stated lifetime: tinyurl.com/y9djhebw
- Better classification of periodicals
- Find periodicals in BHL titles as yet without items
- Qualify ISSNs with multiple values when 'print' / 'online'
Books
- Initial posting to WikiProject Books
- Improve classification: editions / tech reports / catalogues / periodicals etc (use keyword info, series info, ...)
- Identify multiple editions of the same book; create work items
- Add subject topics -- see discussion - may need new property?
Page refs, etc
- Try to use page refs to match articles / containing publications / creators; cf query: tinyurl.com/ybsalpb9 with sample page link: tinyurl.com/y77yzhxy
- -- first pass done
- full work available at URL (P953) "biodiversitylibrary" -- tinyurl.com/y95dglul (24)
- DOI (P356) "10.5962/bhl.title" -- tinyurl.com/ybrud2fh
- BioStor work ID (P5315) -- tinyurl.com/y733czfp
- Mine Wikispecies for BHL links, not necessarily carried forward, eg [1]
- -- ISSN & source pages done
- Look for related abbreviations: tinyurl.com/y9gc2qpp
- Periodicals still unmatched to existing item: tinyurl.com/yapmoujh (2308)
- Periodicals with multiple BHLs: tinyurl.com/y8tdzjo4
Commons
- Link titles -> Commonscats (but hold off until more refined here?)
Data Uploads -- BHL
Data from per-title RIS files
Fields, for the 63,229 Wikidata 'title' items; the fields are repeated for each volume ("BHL item")
- start time (P580), end time (P582), number of works (P3740) added as qualifiers for serials, based on entries; more serials to find?
- KW -- keyword (444907) -- awaiting decision on Wikidata:Property proposal/subject facet
- AU -- author (216374) -- added (as strings) in Magnus's initial upload. Conversion to items needed. Roles can vary. (eg "former owner").
- ER -- end record (129278) -- end of info for each BHL item
- UR -- URL (129278) -- <=> BHL item identifier
- TY -- type (129278) -- always "BOOK"
- TI -- title (129278) -- added in Magnus's initial upload. (But check?)
- PY -- Publication year (128661) -- added in Magnus's initial upload. (But check?)
- CY -- City (of publication) (117826) -- 41,570 added where matched to items. Needs sanity checking. Try again on remainder.
- PB -- Publisher (115581) -- added as string. Needs conversion into items. Discuss how many items for publishing houses with long histories.
- N1 -- notes (92954)
- VL -- volume (91243)
- SN -- shelf number (31813)
- ET -- edition (5354) -- added for 4536
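Tag counts like those above can be reproduced by scanning the RIS files line by line. This is a minimal sketch, assuming the usual RIS layout of a two-character tag, two spaces, a hyphen and the value; `count_ris_tags` and the sample record are hypothetical and are not taken from BHL data.

```python
import re
from collections import Counter

# A RIS line looks like "TY  - BOOK": tag, two spaces, hyphen, optional value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def count_ris_tags(text: str) -> Counter:
    """Count how often each RIS tag occurs in the given text."""
    counts = Counter()
    for line in text.splitlines():
        match = RIS_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical single record, for illustration only.
SAMPLE = """TY  - BOOK
TI  - A hypothetical title
AU  - Author, First
AU  - Author, Second
PY  - 1850
UR  - https://example.org/item/1
ER  -
"""

tag_counts = count_ris_tags(SAMPLE)
```

Because each volume ends with an ER line, the ER count doubles as a count of BHL items in a file.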
BHL files
The BHL offers a number of files for download -- see data exports page, and schema
title.txt
-- TitleID / MARCBibID / MARCLeader / FullTitle / ShortTitle / PublicationDetails / CallNumber / StartYear / EndYear / LanguageCode / TL2Author / TitleURL / CreationDate
- add inception (P571) / dissolved, abolished or demolished date (P576) for periodicals, as qualifier -- more periodicals to find?
- may be able to add some language of work or name (P407)s
titleidentifier.txt
-- TitleID / IdentifierName / IdentifierValue / CreationDate
- OCLC control number (P243) (102363) (but check); MARC001 (97037); NAL (49731); Library of Congress Control Number (LCCN) (bibliographic) (P1144) (as "DLC") (27940) -- now added for almost all; heavy verification was needed; OAI (12922); WonderFetch (10270); Dewey Decimal Classification (P1036) (7694); ISBN-10 (P957) (4543); ISSN (P236) (1168); NLM Unique ID (P1055) (862) -- fmt doesn't match; GPO (729); CODEN (P1159) (407); Soulsby (238); TL2 (189); Abbreviation (? short name (P1813)) (100); BPH journal ID (P4569) (69) -- fmt doesn't match;
subject.txt
-- TitleID / Subject / Creation Date
creator.txt
-- TitleID / CreatorType / CreatorName / CreationDate
doi.txt
-- EntityType / EntityID / DOI / CreationDate
item.txt
-- ItemID / TitleID / ThumbnailPageID / BarCode / MARCItemID / CallNumber / VolumeInfo / ItemURL / LocalID / Year / InstitutionName / ZQuery / CreationDate
part.txt
partcreator.txt
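Per-identifier counts such as those listed for titleidentifier.txt can be tallied with the standard csv module. The sketch below assumes the export is tab-delimited with a header row that uses the column names given above; that layout is an assumption to check against the actual download. `count_identifier_names` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_identifier_names(handle) -> Counter:
    """Tally the IdentifierName column of a tab-delimited export."""
    # QUOTE_NONE: treat quote characters in values as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(row["IdentifierName"] for row in reader)


# Hypothetical rows, for illustration only.
SAMPLE = (
    "TitleID\tIdentifierName\tIdentifierValue\tCreationDate\n"
    "1\tOCLC\t0000001\t2000-01-01\n"
    "1\tDLC\t00000001\t2000-01-01\n"
    "2\tOCLC\t0000002\t2000-01-01\n"
    "2\tISSN\t0000-0000\t2000-01-01\n"
)

identifier_counts = count_identifier_names(io.StringIO(SAMPLE))
```

For a real file, pass `open(path, encoding="utf-8", newline="")` in place of the `io.StringIO` object.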
BHL pages
Data also available by API -- see [2], in particular GetTitleMetadata()
It appears the API return can sometimes give additional information, eg by breaking some of the fields up structurally -- such as splitting an organisational name into that of an umbrella body and a subunit, or a series identification in a journal name. It also sometimes has fields specifying the nature of a value, such as the type of a note.
On the other hand, there may also be details on the HTML page not returned by the API, for example the related works of Journal of Botany, being a second series of the Botanical Miscellany (Q51434285) (BHL title 234) (held in the "TitleAssociation" table? -- cf BHL internal table documentation at [3]), or the 'series' of BHL #102113.
Information available includes:
- Title/Bibliographic Level
- Creator/Relationship
- Variant/Title, Variant/TitleVariantTypeName
headings from HTML page
Headers on "Details" tab of HTML pages (for 3473 currently marked as periodicals):
- Title (3473); Publication info (3473); By (3473); Genre (3473); Material Type (3453); Language (3295); Subjects (3118); Identifiers: (3111); Notes: (2767); Related Titles (2048); Call Number (1905); Title Variants: (1678); BHL Collections: (1401); DOI (516); Classification (322); Edition (7);
- Direct values:
- Genre: Journal (2827), Book (644), "" (2) -- Material Type: Published material (3451), Archival material (1), Map (1) -- Language: English (2270), German (365), French (285), Spanish (76), etc -- Classification = Dewey
- BHL Collections: New York Botanical Garden (558); Seed & Nursery Catalogs (526); Expanding Access to Biodiversity Literature (189); etc...
- Attribute/Value pairs:
- Related Titles: Succeeded by: (1571); Preceded by: (1066); Series: (577); Related/Analytical: (285); Other: (142); Supplement: (91); With: (38); Parent: (32); Contained In: (2)
- Title Variants: Alternative: (1620); Abbreviated: (793); Uniform: (207); Translated: (81);
fields from API
Detailed fields available, via BHL API:
- Type of item:
  - BibliographicLevel (3472 = 3470 + 2 null) -- Serial (2825); Monograph/Item (525); Collection (117); Monographic component part (2); Serial component part (1)
    - cf MARC header, position 07
  - MaterialType (3472 = 3452 + 20) -- Published material (3450); Archival material (1); Maps (1)
- Title-related:
  - TitleID (3472)
  - TitleURL (3472)
  - ShortTitle (3472 = 3471 + 1)
  - FullTitle (3472)
  - SortTitle (3472)
  - UniformTitle
  - PartName (1648 = 100 + 1548)
  - PartNumber (1666 = 90 + 1576)
  - Edition (1629 = 7 + 1622)
- Publication:
  - PublisherPlace (3399 = 3337 + 62)
  - PublisherName (3279 = 3080 + 199)
  - PublicationDate (2970 = 2315 + 655)
  - PublicationFrequency (2256 = 1324 + 932) -- Annual (241); Irregular (210); Monthly (123); Quarterly (109); Annual. (93); Monthly. (62); Irregular. (48); Quarterly, (32); Bimonthly (25); Monthly, (23); Quarterly. (20); Semiannual (16); Biennial (14); Bimonthly, (13); Semiannual, (12); Annual, (11); Weekly. (11); Frequency varies (10); Irregular, (10); Weekly (9); Weekly, (9); Bimonthly. (8); Annual (irregular) (7); Four no. a year (7); Frequency varies. (7); Unknown (7); Semiannual. (6); Four no. a year. (5); Semimonthly (5); Two no. a year (5); Biennial, (4); Monthly (irregular) (4); Three issues yearly (4); Ceased. (3); Monthly, except July and Aug. (3); Semimonthly. (3); Three times a year, (3); year (3); 2 no. a year, (2); 3 no. a year, (2); 4 no. a year (2); 4 no. a year, (2); 4 no. a year. (2); 6 no. a year, (2); Annual (Irregular) (2); Biennial. (2); Bimonthly (irregular) (2); Biweekly (2); Desconhecida (2); Four no. a year, (2); Monthly (except Aug. and Sept.) (2); Monthly (except June-Aug.) (2); Semimonthly, (2); Six no. a year (2); Ten no. a year (2); Three no. a year, (2); Trimestral (2); Two issues yearly. (2); 10 no. a year (1); 10 nos. a year. (1); 11 no. a year (bimonthly July/Aug.) (1); 12 no. a year, (1)
- Other:
  - Doi (516)
  - CallNumber (2503 = 1904 + 599)
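Many of the PublicationFrequency values listed above differ only in trailing punctuation or capitalisation ("Annual", "Annual.", "Annual,"). A minimal sketch of one way to fold such variants together before mapping them to items follows; the normalisation rule (strip trailing "." and ",", lowercase) is an assumption rather than anything BHL or Wikidata prescribes, and the function names are hypothetical. The sample counts are copied from the list above.

```python
from collections import Counter


def normalise_frequency(value: str) -> str:
    """Strip surrounding whitespace and trailing '.' or ',', then lowercase."""
    return value.strip().rstrip(".,").strip().lower()


def merge_frequency_counts(raw_counts: dict) -> Counter:
    """Sum the counts of labels that normalise to the same string."""
    merged = Counter()
    for label, n in raw_counts.items():
        merged[normalise_frequency(label)] += n
    return merged


# Counts copied from the PublicationFrequency list.
RAW = {
    "Annual": 241, "Annual.": 93, "Annual,": 11,
    "Monthly": 123, "Monthly.": 62, "Monthly,": 23,
}

merged = merge_frequency_counts(RAW)
# merged["annual"] is 345 and merged["monthly"] is 208
```

Synonyms such as "Four no. a year" and "Quarterly", or non-English labels such as "Trimestral", would still need a hand-made mapping.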
one-to-many:
- Identifiers (3472 = 3395 + 77) -> TitleIdentifier (11289)
  - IdentifierValue (11289)
  - IdentifierName (11289) -- OCLC (3465); MARC001 (2679); DLC (1326); WonderFetch (1054); NAL (921); ISSN (892); DDC (330); CODEN (318); NLM (98); Abbreviation (70); BPH (53); ISBN (44); GPO (22); OAI (9); TL2 (7); Soulsby (1)
- Subjects (3472 = 3117 + 355) -> Subject (13575)
  - SubjectText (13471 + 104)
- Authors (3472 = 3034 + 438) -> Creator (5042)
  - CreatorID (5042)
  - Name (5042)
  - Title (28 + 5014)
  - FullerForm (141 + 4901)
  - Unit (1029 + 4103)
  - Dates (561 + 4481)
  - Location (7 + 5035)
  - Numeration (0 + 5042)
  - Role (5042) -- Added Entry -- Corporate Name (MARC 710) (3205); Main Entry -- Corporate Name (MARC 110) (995); Added Entry -- Personal Name (MARC 700) (710); Main Entry -- Personal Name (MARC 100) (107); Main Entry -- Meeting Name (MARC 111) (13); Added Entry -- Meeting Name (MARC 711) (12)
  - Relationship (177 + 4865) -- ed. (39); engraver (25); ill. (22); ed (14); editor (9); engraver. (9); printer. (8); ill (7); printer (7); publisher (7); illustrator (6); lithographer (5); issuing body (3); Auteur (2); author (2); client (2); printer of plates (2); Editor (1); colorist. (1); editor. (1); former owner (1)
  - TitleOfWork (114 + 4928)
- Notes (3472 = 2766 + 706) -> TitleNote (7393)
  - NoteText (7393)
  - NoteTypeName (7393) -- General (3375); Numbering Peculiarities (965); Linking Entry Complexity (673); Coverage Selective (527); Issuing Body (465); Location Given (342); Coverage Unknown (263); Language (247); Summary (228); Supplement (162); Contents (66); Coverage Complete (39); Citation/References (27); Former Title Complexity (5); Formatted Contents (4); Incomplete Contents (2); Scope and Content (2); Partial Contents (1)
  - NoteSequence (7393)
- Variants (3472 = 1567 + 1905) -> TitleVariant (2494)
  - Title (2494)
  - TitleVariantTypeName (2494) -- Alternative (1620); Abbreviated (793); Parallel (75); Translated (6)
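The "total = filled + empty" figures in this section can be produced by a simple tally over the API records once they have been fetched and parsed. The sketch below assumes each record is already a Python dict keyed by field name; that shape, the `tally_filled` helper and the sample records are all hypothetical and do not describe the API's actual response format.

```python
from collections import Counter


def tally_filled(records, fields):
    """Return {field: (total, filled, empty)} across all records."""
    filled = Counter()
    for record in records:
        for field in fields:
            # Treat missing keys, None, "" and [] as empty.
            if record.get(field) not in (None, "", []):
                filled[field] += 1
    total = len(records)
    return {f: (total, filled[f], total - filled[f]) for f in fields}


# Hypothetical records, for illustration only.
RECORDS = [
    {"TitleID": 1, "Edition": "2nd ed.", "Doi": None},
    {"TitleID": 2, "Edition": "", "Doi": None},
    {"TitleID": 3, "Edition": None},
]

summary = tally_filled(RECORDS, ["TitleID", "Edition", "Doi"])
```

Here `summary["Edition"]` is `(3, 1, 2)`, which corresponds to writing "Edition (3 = 1 + 2)" in the notation used above.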
Data uploads -- other sources
Internet Archive
- number of pages (P1104) -- added for single-volume works
- language of work or name (P407), Open Library ID (P648) -- added for single-volume works
WorldCat / OCLC linked data
- NOTE: not CC0 -- harvested data cannot be uploaded here (except OCLC IDs, which are to be considered Public Domain)
Library of Congress
- Extraction in progress. Many of the stated LCCNs (in the titleIdentifier file) appear not to resolve correctly.
- Fields: CALL NUMBER (29269)(2005); Request in (29269)(2005); Status (15612)(1348); LCCN (13181)(797); LCCN Permalink (13181)(797); Main title (13181)(797); Type of material (13181)(797); Browse by shelf order (13162)(794); LC classification (13162)(794); Description (13161)(789); Published/Created (13150)(783); LC Subjects (12750)(787); Links (9829)(500); Links available (9330)(524); Other system no. (6905)(673); Notes (5502)(555); Additional formats (4678)(604); Related names (4269)(571); Geographic area code (2666)(330); Dewey class no. (1758)(485); Corporate name (1282)(196); Related titles (447)(32); Language code (322)(88); Uniform title (206)(59); Meeting name (41)(3);
- Not serials: Personal name (10955)(6); Edition (1196)(); Contents (522)(); ISBN (124)(); Dissertation note (101)(); Shelf Location (86)(); References (83)(); Incomplete contents (35)(); Collection (16)();
- Low counts: Geographic class no. (5)(); Constituent unit (4)(); Invalid ISBN (4)(); Scale info (4)(); Acquisition source (3)(); Partial contents (3)(); Repository (3)(); Rights advisory (3)(); Report no. (2)(); Abstract (1)(); Computer file info (1)(); Copyright reg no. (1)(); Invalid system no. (1)(); Performer (1)(); Publisher no. (1)(); UPC/EAN (1)();
- Mostly not serials: Series (2818)(54); LC copy (915)(22);
- Serials upweighted: NAL class no. (465)(230); NLM class no. (306)(167); Government doc no. (161)(55);
- Mostly serials: Older receipts (1262)(868); Current frequency (684)(640); Other Subjects (581)(551); Form/Genre (483)(464); Invalid LCCN (311)(303); Content type (294)(287); Media type (294)(287); Carrier type (292)(285); Additional Links (208)(198); National bib no. (178)(121); Reproduction no./Source (121)(119); Portion of title (111)(75); Variant title (95)(62); Summary (54)(49); Other title (44)(42); Published/Produced (24)(18); GPO item no. (22)(16); Cover title (19)(12); Subject keywords (17)(16); Spine title (16)(11); Running title (13)(11); Canadian class no. (11)(9); Title translation (11)(10); Caption title (10)(7); Added title page title (7)(5);
- Serial specific: Publication history (650)(650); ISSN (477)(477); Linking ISSN (458)(458); Serial key title (453)(453); National bib agency no. (428)(424); Abbreviated title (394)(394); CODEN (272)(272); Indexes (268)(268); Continued by (242)(241); Continues (226)(226); Former frequency (152)(152); Latest receipts (117)(117); Indexed selectively by (92)(92); Supplements (58)(58); Merger of (57)(57); Postal reg no. (53)(53); Has supplement (42)(42); Related item (34)(34); Incorrect ISSN (28)(28); Absorbed (27)(27); Continues in part (27)(27); Indexed by (25)(25); Absorbed by (23)(23); Split into (21)(21); Continued in part by (16)(16); Invalid ISSN (16)(16); Parallel title (15)(15); Issued with (13)(13); Supplement to (12)(12); Other standard no. (6)(6); Other edition (4)(4); Absorbed in part (3)(3); Former title (3)(3); Invalid CODEN (3)(3); Other class no. (2)(2); Overseas acq no. (2)(2); Invalid GPO item no. (1)(1);
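The groupings above ("Not serials", "Mostly serials", "Serial specific", ...) follow from comparing the two counts given for each field. The sketch below parses entries of the form `Name (total)(serials);` and computes the serial share. It assumes that the first number counts all records with the field, that the second counts those records which are serials, and that empty parentheses mean zero; that reading is inferred from the group labels, not stated in the notes. `parse_field_counts` is a hypothetical helper.

```python
import re

# Matches e.g. "LC copy (915)(22)" or "Edition (1196)()".
FIELD = re.compile(r"\s*([^;()]+?)\s*\((\d+)\)\((\d*)\)")


def parse_field_counts(text):
    """Return (name, total, serials, serial_share) for each entry."""
    rows = []
    for name, total, serials in FIELD.findall(text):
        total_n = int(total)
        serial_n = int(serials) if serials else 0  # "()" read as zero
        share = serial_n / total_n if total_n else 0.0
        rows.append((name, total_n, serial_n, share))
    return rows


# Entries copied from the lists above.
TEXT = "Series (2818)(54); LC copy (915)(22); Edition (1196)(); ISSN (477)(477);"

rows = parse_field_counts(TEXT)
```

Sorting `rows` by the last element gives a ranking from fields found almost only on monographs to fields found only on serials.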