Wikidata:WikiProject 20th Century Press Archives/Data structure

 


 

The folder metadata from the data source is available in a preliminary RDF structure:

For discussion about modeling issues, see the talk page.

External identifier property for PM20 folders

PM20 folder ID (P4293) (documentation and discussion)

Qualifiers

The following qualifiers are in regular use with PM20 folder ID (P4293):

Considered qualifiers:

Reference statements

for the properties below use:

Persons

The mapping of the metadata from person folders is still incomplete:

type property pid datatype cardinality rdf transformation
Lde:Len skos:prefLabel adjust_label (label_type=last_first)
Dde schema:hasOccupation
Den
I instance of P31 item 1.1 Q5 (fix) OR Q8436 (fix) if label contains "<Familie>"
P PM20 Folder ID P4293 string 1.1 dcterms:identifier
P GND ID P227 string 0.1 gndo:gndIdentifier
P date of birth P569 date 0.n schema:birthDate format_date()
P date of death P570 date 0.n schema:deathDate format_date()
P occupation P106 item 0.n zbwext:activity/schema:about map to wd items, for certain fields of activity
P field of work P101 item 0.n zbwext:activity/schema:about map to wd items, for certain fields of activity
P family P53 item 0.n dct:hasPart (inv) not for families

types: L=label, D=description, P=property, I=implied property
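Two of the rules in the person mapping can be sketched in a few lines. This is only an illustration: the behaviour of `adjust_label` (turning a "Last, First" label into "First Last") is an assumption inferred from the parameter name `label_type=last_first`, and the sample labels are made up.

```python
FAMILY_MARKER = "<Familie>"  # marker named in the mapping table


def instance_of(label: str) -> str:
    """Return the P31 target: Q8436 if the label carries the family marker, else Q5."""
    return "Q8436" if FAMILY_MARKER in label else "Q5"


def adjust_label(label: str, label_type: str = "last_first") -> str:
    """Assumed behaviour: reorder a 'Last, First' label into 'First Last'.

    Labels without a comma, or with another label_type, are only stripped.
    """
    if label_type != "last_first" or "," not in label:
        return label.strip()
    last, first = label.split(",", 1)
    return f"{first.strip()} {last.strip()}"


# Illustrative labels, not taken from the PM20 data.
person_label = adjust_label("Mustermann, Erika")
family_type = instance_of("Mustermann <Familie>")
```

A real transformation would probably also have to deal with name additions and the family marker inside the label itself; that is left out here.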

Organizations/companies

Interlinking

General considerations about categories (subjects, wares, and geographical locations)

Country/subject folder items, defined by existing location + PM20 subject classification (Draft)

Each folder in the country/subject archive is defined by a combination of

  • a geographical entity (countries, geographical regions, a few cities, the whole world)
  • a subject category (within a hierarchy of subject categories)

The model developed here has to be extensible: currently only part of the digitized material has been organized in well-described folders, while the rest exists only as ranges of images on digitized roll films.

New properties

Open question: use ID (precise, but closed) or notation (fuzzier, extensible)? Probably: use notation/signature/code as external ID, for future extensibility without an external database (beyond the film material itself).

Implementation: The purl.org address, e.g. http://purl.org/pm20/category/geo/s/A9, redirects to https://pm20.zbw.eu/category/geo/s/A9, which will redirect to https://pm20.zbw.eu/category/geo/i/140905 (numerical identifier via Apache RewriteMap). Currently it redirects to http://zbw.eu/beta/pm20voc/ag/140905, and that to https://zbw.eu/beta/skosmos/pm20ag/en/page/140905, a Skosmos concept page. --Jneubert (talk) 08:02, 24 July 2020 (UTC)
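The planned redirect chain can be written down as a small sketch. The URL patterns and the A9 / 140905 pair come from the comment above. The dictionary standing in for the Apache RewriteMap and the function name are assumptions for illustration, and vocabularies other than `geo` are not confirmed here.

```python
PURL_BASE = "http://purl.org/pm20/category"
SITE_BASE = "https://pm20.zbw.eu/category"


def redirect_chain(signature: str, rewrite_map: dict, vocab: str = "geo") -> list:
    """List the URLs visited when a signature is resolved to its numeric ID.

    rewrite_map is a plain dict used here in place of the Apache RewriteMap.
    Raises KeyError for a signature that is not in the map.
    """
    numeric_id = rewrite_map[signature]
    return [
        f"{PURL_BASE}/{vocab}/s/{signature}",   # stable purl.org address
        f"{SITE_BASE}/{vocab}/s/{signature}",   # signature-based address
        f"{SITE_BASE}/{vocab}/i/{numeric_id}",  # numeric-identifier address
    ]


chain = redirect_chain("A9", {"A9": "140905"})
```

Only the first URL would be stored in Wikidata (through the formatter URL of the property); the later hops stay an implementation detail on the server side.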
name | pid | datatype | links to | comment | temporary use for example creation
PM20 subject code | P8484 | external | term entry and list of folders by "country" | at the PM20 subject category item | -
PM20 geo code | P8483 | external | term entry and list of folders by subject and ware | at the real world location item | -
PM20 ware ID | P10890 | external | term entry and list of folders by "country" | at the real world ware/product/product class item or a special PM20 ware category (Q111973176) (in a few cases) | catalog code (P528)
PM20 film section ID | P11822 | external | image from PM20 film or fiche | with mandatory qualifier number of pages (P1104), indicating a range of follow-up images (possibly across multiple films/fiches) | inventory number (P217)

Alternatives:

  • one generic property for notation (something like skos:notation - already existing? preliminarily use short name (P1813)?), similar to catalog code (P528), in combination with catalog (P972)
  • Or: use with formatter URL instead of list property? (No: this only works as a non-extensible list (generated HTML page) plus a lookup mechanism (notation -> id).)
  • The preliminary implementation of external-id target pages with Skosmos leaves the lists almost hidden as rdfs:seeAlso links.
  • A future implementation will use a signature -> id redirect.
  • To do later: static HTML page, perhaps with a customized 404 message "Signature not defined in the static representation of PM20 vocabulary (as of ...)"
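How a static signature -> id lookup with the customized 404 message might behave is sketched below. The status codes, the function name, the in-memory table and the snapshot date are assumptions for illustration only. The path pattern and the A9 / 140905 pair are taken from the implementation note above.

```python
from datetime import date

NOT_FOUND = (
    "Signature not defined in the static representation "
    "of PM20 vocabulary (as of {as_of})"
)


def resolve(signature: str, table: dict, as_of: date, vocab: str = "geo"):
    """Return (status, body): a redirect target for known signatures, else a 404 text."""
    numeric_id = table.get(signature)
    if numeric_id is None:
        return 404, NOT_FOUND.format(as_of=as_of.isoformat())
    return 302, f"/category/{vocab}/i/{numeric_id}"


table = {"A9": "140905"}      # one known pair; a real table would be generated
snapshot = date(2020, 7, 24)  # placeholder date of the static snapshot

hit = resolve("A9", table, snapshot)
miss = resolve("ZZ", table, snapshot)
```

The point of the dated message is that a signature missing from the static snapshot may still be valid on the film material, so the response should not read as "does not exist".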

Non-uniqueness problems:

PM20 subject category (Q92707903)

currently ca. 1400 categories

example items (partOf/hasPart hierarchy):

PM20 subject category system (Q92732036)

e Health situation, general (Q92714111)
e1 Individual diseases and their control (Q92707235)
n Economy, general (Q96748601)
type property pid datatype cardinality source property transformation
Lde skos:prefLabel @de
Len skos:prefLabel @en
Dde "Systematikstelle des Pressearchiv 20. Jahrhundert" (fix)
Den "subject category of the 20th Century Press Archives" (fix)
Ade PM20 subject code + labelDe
Aen PM20 subject code + labelEn
P instance of P31 item 1.1 PM20 subject category (Q92707903) (fix)
P part of P361 item 1.1 super_class()
P PM20 subject code P8484 external 1.1
P main subject P921 item 0.1 manual lookup
P has part P527 item 0.1

type: L=label, D=description, P=property, I=implied property

PM20 subject code (P8484) holds the short form of the notation (e.g., n24 Sm12), as it was used in the signatures put onto the clippings. A fuller form useful for sorting (e.g., n 24 SM 012) could be added to the category item as a label with the language code zxx (no linguistic content (Q22282939)), but this does not work, neither with QS (Lzxx) nor interactively. Another possible option: use series ordinal (P1545) as a qualifier for PM20 subject code (P8484) (example).
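If the sortable form were stored as a series ordinal (P1545) qualifier, it would have to be derived from the short code. The sketch below reproduces the single example given above (n24 Sm12 -> n 24 SM 012). The rule it applies is an assumption inferred from that one example, not the documented PM20 convention: letters and digits are separated, and from the second part onwards letters are upper-cased and numbers zero-padded to three digits.

```python
import re

_PART = re.compile(r"([A-Za-z]+)(\d*)")


def sort_form(code: str, width: int = 3) -> str:
    """Derive a sortable form from a short PM20 subject code (assumed rule)."""
    out = []
    for index, part in enumerate(code.split()):
        match = _PART.fullmatch(part)
        if match is None:
            out.append(part)  # unexpected shape: keep unchanged
            continue
        letters, digits = match.groups()
        if index > 0:
            # Later parts: upper-case letters, zero-pad the number.
            letters = letters.upper()
            digits = digits.zfill(width) if digits else digits
        out.append(f"{letters} {digits}".strip())
    return " ".join(out)


example = sort_form("n24 Sm12")
```

Codes with other shapes would need to be checked against the full notation list before such a rule is relied on.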

PM20 country/subject folder (Q91257459)

currently ca. 9000 subject folders

example items:

type property pid datatype cardinality restriction source property transformation
Lde skos:prefLabel
Len derived from English location and class labels?
Dde "Mappe aus dem Pressearchiv 20. Jahrhundert" (fix)
Den "folder of the 20th Century Press Archives" (fix)
I instance of P31 item 1.1 PM20 country/subject folder (Q91257459) (fix)
P part of P361 item 1.1 20th Century Press Archives (Q36948990) (fix)
P facet of P1269 item 1.1 subclass/instance of human-geographic territorial entity (Q15642541) zbwext:country qualified with PM20 geo code (P8483), used for lookup
P main subject P921 item 1.1 instance of PM20 subject category (Q92707903) zbwext:subject qualified with PM20 subject code (P8484), used for lookup
P IIIF manifest P6108 url 1.1 manifest_url() TODO display with plugin?
P PM20 folder ID P4293 external 1.1 starting with "sh/" dct:identifier when also used for films, alternatively with new "PM20 film" property
P number of works P3740 quantity 1.1 zbwext:totalDocCount
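To make the mapping above concrete, the following sketch turns one folder record into QuickStatements V1-style tab-separated commands. The property and item IDs are the ones listed in the table. The `SubjectFolder` class, its field names and all sample values except Q96748601 with code "n" are hypothetical placeholders, not real PM20 data.

```python
from dataclasses import dataclass


@dataclass
class SubjectFolder:
    folder_id: str      # PM20 folder ID, expected to start with "sh/"
    label_de: str
    geo_qid: str        # item for the geographical entity
    geo_code: str       # PM20 geo code, used as qualifier
    subject_qid: str    # item for the PM20 subject category
    subject_code: str   # PM20 subject code, used as qualifier
    manifest_url: str
    doc_count: int


def quoted(text: str) -> str:
    """Wrap a string value in double quotes, as used for string-like values."""
    return f'"{text}"'


def to_quickstatements(f: SubjectFolder) -> str:
    if not f.folder_id.startswith("sh/"):
        raise ValueError("country/subject folder IDs must start with 'sh/'")
    rows = [
        ["CREATE"],
        ["LAST", "Lde", quoted(f.label_de)],
        ["LAST", "Dde", quoted("Mappe aus dem Pressearchiv 20. Jahrhundert")],
        ["LAST", "Den", quoted("folder of the 20th Century Press Archives")],
        ["LAST", "P31", "Q91257459"],
        ["LAST", "P361", "Q36948990"],
        ["LAST", "P1269", f.geo_qid, "P8483", quoted(f.geo_code)],
        ["LAST", "P921", f.subject_qid, "P8484", quoted(f.subject_code)],
        ["LAST", "P6108", quoted(f.manifest_url)],
        ["LAST", "P4293", quoted(f.folder_id)],
        ["LAST", "P3740", str(f.doc_count)],
    ]
    return "\n".join("\t".join(row) for row in rows)


example = SubjectFolder(
    folder_id="sh/example",           # placeholder ID
    label_de="Beispielland : Wirtschaft, Allgemein",  # placeholder label
    geo_qid="Q999999999",             # placeholder item
    geo_code="A9",
    subject_qid="Q96748601",          # n Economy, general
    subject_code="n",
    manifest_url="https://example.org/manifest.json",  # placeholder URL
    doc_count=12,
)
output = to_quickstatements(example)
```

The English label is left out on purpose, since the table still marks its derivation as an open question.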

Possible extension for image ranges on films or fiches

  • May be applied to countries, wares, company folders, subject folders (missing person folders are not digitized)
  • Additional property: PM20 film section ID (P11822) (see above) - created
  • described as: range of microfilm images of the 20th Century Press Archives (start position given as property value, qualified with number of pages). (Later on perhaps extended to digitized microfiches, or otherwise the latter in a separate property)
  • Formatter URL: https://pm20.zbw.eu/film/
  • value example: h2/sh/S2043H/1151
  • mandatory qualifier: number of pages (P1104)
    • on films, one image normally contains two pages
    • calculation of P1104 for films uses the start of the next range: ( ( {number of images of start film} - {value} ) + {number of images of all intermediate films} + ( {next range start} - 1 ) ) * 2 (two pages per image; not necessarily exact due to start/end page, images with only one page, etc.)
  • optional qualifier: including (P1012) subcategory (Q92876464)
    indicates the inclusion of sub-folder hierarchy
  • multiple entries per item are possible! (h1, h2, h3, ...)
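The page estimate for a film section can be sketched as follows. The function covers the case described in the list, where the next range starts on a later film. Reading the start position from the last path segment of the property value is an assumption, and the image counts in the example are invented.

```python
def start_position(value: str) -> int:
    """Assumed: the last path segment of the property value is the start image."""
    return int(value.rsplit("/", 1)[1])


def estimate_pages(start_film_images: int, start: int,
                   intermediate_film_images: list,
                   next_range_start: int) -> int:
    """Estimate number of pages (P1104) for a range that ends on a later film.

    Follows the rule given above: images left on the start film, plus all
    images of intermediate films, plus the images before the next range
    start, times two pages per image. The result is approximate.
    """
    images = (
        (start_film_images - start)
        + sum(intermediate_film_images)
        + (next_range_start - 1)
    )
    return images * 2


start = start_position("h2/sh/S2043H/1151")
# Invented numbers: start film with 1200 images, no intermediate film,
# next range starting at image 40 of the following film.
pages = estimate_pages(1200, start, [], 40)
```

A range that starts and ends on the same film is not covered by this rule and would need its own case.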

Ware folders

PM20 ware category (Q111973176)

Only used for special categories, mostly collections of concepts (like Axe, hatchet, hammer (Q113376049)). Normally, commodity and ware categories use existing regular items.

type property pid datatype cardinality source property transformation
Lde skos:prefLabel @de
Len skos:prefLabel @en
Dde "spezielle Kategorie von Waren des Pressearchiv 20. Jahrhundert" (fix)
Den "special category of commodities/wares of the 20th Century Press Archives" (fix)
Ade ?
Aen ?
P instance of P31 item 1.1 PM20 ware category (Q111973176) (fix)
P PM20 ware ID P10890 external 1.1
P main subject P921 item 0.1 manual lookup

type: L=label, D=description, P=property, I=implied property

PM20 ware/country folder (Q113376528)

type property pid datatype cardinality restriction source property transformation
Lde skos:prefLabel
Len derived from English location and class labels?
Dde "Mappe aus dem Pressearchiv 20. Jahrhundert" (fix)
Den "folder of the 20th Century Press Archives" (fix)
I instance of P31 item 1.1 PM20 ware/country folder (Q113376528) (fix)
P part of P361 item 1.1 20th Century Press Archives (Q36948990) (fix)
P facet of P1269 item 1.1 subclass/instance of human-geographic territorial entity (Q15642541) zbwext:country qualified with PM20 geo code (P8483), used for lookup
P main subject P921 item 1.1 instance of PM20 ware category (Q111973176) or other commodity/ware item zbwext:ware qualified with PM20 ware ID (P10890), used for lookup
P IIIF manifest P6108 url 1.1 manifest_url() TODO display with plugin?
P PM20 folder ID P4293 external 1.1 starting with "wa/" dct:identifier when also used for films, alternatively with new "PM20 film" property
P number of works P3740 quantity 1.1 zbwext:totalDocCount

Related on Wikidata

Other collections

  • Wikidata:WikiProject DNB (Dictionary of National Biography) - Wikisource entries and their modelling and cross-linking in WD

Other classifications

Possibly interesting properties