Data Model: Modelling Manuscripts in a Wikibase

This data model is an attempt to describe manuscripts in Wikidata. It takes into account predecessors in other RDF-based databases; the most prominent examples are Biblissima's model (GitHub) and the FactGrid data model for manuscripts developed by Marco Heiles in 2020/21.

Fingerprint (label, description, alias)

The label of a manuscript item should make it recognisable to anyone familiar with it. While manuscripts are commonly referenced by their signatures (consisting of city, institution, record set, shelf number), some manuscript classes have canonical catalogue numbers which scholars use instead of library signatures. For example, Septuagint manuscripts are cited by their siglum in the Rahlfs catalogue, New Testament manuscripts by Gregory-Aland numbers, and so on. This matters all the more because catalogues often have different and competing concepts of a manuscript: while Pinakes records manuscripts as they are housed in libraries today, both Rahlfs's and Gregory/Aland's catalogues conceive of manuscripts as they were originally created. Both concepts are valid and should be represented by interlinking Wikidata items.

Some manuscripts, especially the more famous ones, have what can be called “trivial names”, which are widely used in publications and may be more familiar than library signatures. These trivial names often serve as the titles (lemmata) of Wikipedia articles. Examples are the Archimedes palimpsest, the Fragmenta Dublinensia, the Madrid Skylitzes, the Vienna Dioscorides, the Codex Gigas, and the Cotton Genesis.

With these considerations in mind, our rules for labels are as follows:

  1. In general, the label for a manuscript item should be its signature in the least ambiguous form.
  2. If a manuscript is part of a group of manuscripts which are more commonly referred to by a canonical catalog code, this code should be used as the label and the signature moved to the aliases.
  3. If a manuscript has a familiar trivial name, this name supersedes both canonical catalog codes and library signatures, which both become aliases.
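
To see how these rules play out on existing items, a query along the following lines may help. It is only a sketch: it lists items carrying a Gregory-Aland-Number (P1577) with their English label and aliases, so that one can check by eye whether the catalog code, the signature or a trivial name ended up as the label. The variable names, the restriction to English and the LIMIT are illustrative choices, not part of the model.

```sparql
#title:Labels and aliases of items with a Gregory-Aland number (sketch)
SELECT ?item ?ga ?label
       (GROUP_CONCAT(DISTINCT ?alias; separator=" | ") AS ?aliases)
WHERE {
  # Items that have a Gregory-Aland number, with their English label
  ?item wdt:P1577 ?ga ;
        rdfs:label ?label .
  FILTER(LANG(?label) = "en")
  # Aliases are optional: items without any alias still appear
  OPTIONAL {
    ?item skos:altLabel ?alias .
    FILTER(LANG(?alias) = "en")
  }
}
GROUP BY ?item ?ga ?label
ORDER BY ?ga
LIMIT 50
```

The same pattern should work for other catalog identifiers, for example Rahlfs number (P12116), by swapping the property.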

The description should be as specific as possible. It need not repeat information already conveyed in the label. It can summarise the intellectual contents of the manuscript or state basic properties (such as the manuscript having illuminations or being a palimpsest, the creation date, the commissioner, etc.). For mass imports, generic descriptions like “Greek New Testament manuscript with catena commentary” are acceptable, so long as the items are clearly distinguishable and have statements with additional information.

The alias of a manuscript should always contain the most familiar or useful designations of a manuscript, including but not limited to:

  • library signature in full form (if not already used as label) as well as abbreviations
  • catalog codes beside the canonical one
  • obscure or obsolete trivial names
  • sigla used in critical editions to designate this manuscript

Classes of manuscripts

The standard designation for any manuscript item (MS) is instance of (P31) manuscript (Q87167). This is replaced in some cases by:

Other uses like instance of (P31) codex (Q213924), instance of (P31) book (Q571), instance of (P31) papyrus scroll (Q113016548), instance of (P31) papyrus fragment (Q95065857), instance of (P31) lectionary (Q284465) are discouraged.
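
To get a feeling for how widespread the discouraged classes are, something like the following sketch could be used. It counts items per class for four of the values named above. book (Q571) is left out on purpose, since it is used far beyond manuscripts and would say little here; this omission is a choice made for the sketch, not a rule of the model.

```sparql
#title:Use of discouraged instance-of values (sketch)
SELECT ?class ?classLabel (COUNT(?item) AS ?count) WHERE {
  # codex, papyrus scroll, papyrus fragment, lectionary
  VALUES ?class { wd:Q213924 wd:Q113016548 wd:Q95065857 wd:Q284465 }
  ?item wdt:P31 ?class .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?class ?classLabel
ORDER BY DESC(?count)
```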

Relation to volume (Q105763458) needs to be clarified.

Subclasses of manuscript (Q87167): query. As of 17 October 2023, there are 187 Wikidata items classified as subclass of (P279) manuscript (Q87167). A tree chart can be generated with the Wikidata generic tree tool.
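
A list of the direct subclasses can be produced with a query of roughly this shape. It is a minimal sketch that only follows a single subclass of (P279) step; the number of results will drift from the figure quoted above as Wikidata changes.

```sparql
#title:Direct subclasses of manuscript (sketch)
SELECT ?class ?classLabel ?classDescription WHERE {
  # One P279 step only; use wdt:P279+ instead to walk the whole tree
  ?class wdt:P279 wd:Q87167 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?classLabel
```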

Looking through these cases one gets the impression of a dense, overgrown forest with entangled vines and brambles all over the place. Rather than taking out the machete or the chainsaw, I carefully slip and slither my way through it and list the connections. What I find is manuscripts classified according to form, material, regional origin, time period and language; I find literary genres and use cases, painting techniques, flag alphabets. What a mess!

After this survey the next step would be to identify where to start. We cannot transform a whole jungle into a tidy garden in one fell swoop. What we can do is stake a claim and start by delineating, weeding out, prodding and pruning the taxonomy.

  • For the representation of hierarchical data, maybe we could learn something from User:Jheald/aat.

Property statements for manuscripts

Material properties

Title | ID | Data type | Description | Examples | Inverse
instance of | P31 | Item | instance of: Assigns the manuscript to a class. Default is manuscript (Q87167) or illuminated manuscript (Q48498). Use of manuscript fragment (Q30103158) needs to be reviewed. Where appropriate, palimpsest (Q274076) should be used (instead of or in addition to the former?). chained book (Q19602268) can be used as an additional value. Subclasses of manuscript (Q87167) need to be reviewed as well. | Archimedes Palimpsest <instance of> palimpsest | -
made from material | P186 | Item | material: Describes materials which make up the manuscript, most importantly writing support (e.g. papyrus (Q125576), parchment (Q226697), vellum (Q378274), paper (Q11472)), ideally also ink, binding, cover etc. | chocolate <made from material> cocoa bean | -
number of pages | P1104 | Quantity | page, leaf, page of plates, leaf of plates, unnumbered page of plates and number of pages: Specifies the number of folia of a manuscript. | The Diary of a Young Girl (First Edition) <number of pages> 285 page | -
state of conservation | P5816 | Item | conservation state: States the current condition of a manuscript (one of preserved (Q56557591), not completed (Q20734200), mildly damaged (Q107531416), damaged (Q106379705), demolished or destroyed (Q56556915), unlocated, probably destroyed (Q106959824), unknown preservation status (Q66890153) or disassembled (Q61962974); maybe restored (Q75505084), too; where appropriate, with point in time (P585)). Where appropriate, several values can be given with qualifiers start time (P580) and end time (P582). | Saint-André-des-Arts Church <state of conservation> demolished or destroyed | -
height | P2048 | Quantity | height, human height and body length: Specifies the vertical dimension of a manuscript. | Eiffel Tower <height> 324 metre | -
width | P2049 | Quantity | width and road width: Specifies the horizontal dimension of a manuscript. | Mona Lisa <width> 53 centimetre | -

Missing properties:

  • foliation
  • quire organisation
  • number of quires
  • codicological units (booklets)
  • book format (for exact dimensions, height (P2048), width (P2049) are also adequate)
  • for paper: watermarks

Creation properties

For palimpsests, both lower and upper script are described within the same manuscript item. Where appropriate, statements should be qualified with applies to part (P518) lower script (Q122901270) and applies to part (P518) upper script (Q122901275).
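
The qualifier pattern described here can be inspected with a query like the one below. This is a sketch under the assumption that the qualifiers are entered exactly as stated; it lists any statement, on any property, that is qualified with one of the two script layers.

```sparql
#title:Statements qualified with lower or upper script (sketch)
SELECT ?item ?itemLabel ?prop ?propLabel ?part ?partLabel WHERE {
  # lower script, upper script
  VALUES ?part { wd:Q122901270 wd:Q122901275 }
  # Any statement node carrying the qualifier applies to part (P518)
  ?statement pq:P518 ?part .
  ?item ?p ?statement .
  # Map the p: predicate back to its property entity for a readable label
  ?prop wikibase:claim ?p .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```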

Title | ID | Data type | Description | Examples | Inverse
inception | P571 | Point in time | date of establishment: Marks the time of creation for a manuscript. | Society of Jesus <inception> | -
location of creation | P1071 | Item | Names the place of creation for a manuscript. | Mona Lisa <location of creation> Florence | -
transcribed by | P11603 | Item | copyist: Names the scribe(s) of a manuscript. The redundancy with calligrapher (P6819) needs to be addressed. | Old English Rune Poem <transcribed by> Humfrey Wanley | -
annotator | P11105 | Item | annotator: Names the annotator(s) of a manuscript. | Al-Bidāyah wa-al-nihāyah ( At-Turki ) <annotator> Abdallah Ben Abdel Mohsen At-Turki | -
illustrator | P110 | Item | illustrator: Names the illuminator(s) of a manuscript. | The Little Prince <illustrator> Antoine de Saint-Exupéry | -
music transcriber | P9260 | Item | music transcriber: Names the person who added musical notation to a manuscript. For Greek manuscripts, the value will probably always be unknown. | Tumi robe nirobe <music transcriber> Jyotirindranath Tagore | -
script style | P9302 | Item | script style: Specifies the script style of a manuscript. | Uncial 053 <script style> uncial script | -
commissioned by | P88 | Item | commissioner: Names the commissioner of a manuscript. | Arc de Triomphe <commissioned by> Napoleon | -

Content properties

Title | ID | Data type | Description | Examples | Inverse
language of work or name | P407 | Item | language: This property states the language or languages used in writing or annotating this manuscript (specify with applies to part (P518)). | Autobiografia di Alice Toklas (translation) <language of work or name> Italian | -
exemplar of | P1574 | Item | individual copy of a work: This property lists works (written work (Q47461344)) whose text is present in the manuscript (specify with section, verse, paragraph, or clause (P958)). | Isaiah scroll <exemplar of> Isaiah | -
genre | P136 | Item | genre and by genre: Specifies the intended (and actual) use of a manuscript (lectionary (Q284465), book of hours (Q727715)). | Winter in Wartime <genre> war film | -
image | P18 | Commons media file | illustration and image: For illustration purposes. | Douglas Adams <image> Douglas adams portrait cropped.jpg | -
type of musical notation | P12041 | Item | musical notation: For type of musical notation present in the manuscript. | P-Cua IV-3ª Gav. 44 (22) <type of musical notation> neume | -

Stemmatic and editorial properties

Title | ID | Data type | Description | Examples | Inverse
part of | P361 | Item | part: This property assigns the manuscript to a group or family of manuscripts. (Possibly a new property "part of manuscript group" might be better.) | ear <part of> head | has part(s)
based on | P144 | Item | based on: This property refers to manuscript(s) which were used as a model in copying the manuscript (specify with applies to part (P518)). | A <based on> Α | derivative work
derivative work | P4969 | Item | derivative work: This property refers to manuscript(s) which are copies (apographa) of the manuscript (specify with section, verse, paragraph, or clause (P958)). | Debian <derivative work> Ubuntu | based on
has edition or translation | P747 | Item | version, edition or translation, translated edition and source text: This property lists editions which used the manuscript. (Possibly a new property "used in edition" might be better.) | Pride and Prejudice <has edition or translation> Pride and Prejudice | edition or translation of

Missing properties:

Catalog, provenance and housing properties

Title | ID | Data type | Description | Examples | Inverse
owned by | P127 | Item | proprietor, property and owned by: Names the owner(s) of a manuscript (use with qualifiers start time (P580) and end time (P582), current owner with preferred rank). Value may be identical to that of collection (P195). | Choupette <owned by> Karl Lagerfeld | owner of
collection | P195 | Item | collection and editorial collection: Specifies the holding institution where the manuscript was/is housed. If there are different known values (for example, when a manuscript went from one institution to another), qualifiers start time (P580) and end time (P582) should be used and the current institution be stated with preferred rank. | Rosetta Stone <collection> British Museum | -
fonds | P12095 | Item | fonds: Specifies the fonds the manuscript is part of. If there are different known values (for example, when a manuscript went from one fonds to another), qualifiers start time (P580) and end time (P582) should be used and the current fonds be stated with preferred rank. | Peterborough Chronicle <fonds> Laud fonds | -
inventory number | P217 | String | accession number: Shelf mark of the manuscript. TBD if in short form (specific to collection/fonds, but not unique) or in longer form (as in scholarly literature). | The Night Watch <inventory number> SK-C-5 | -
catalog code | P528 | String | catalogue: Catalogue code(s) of a manuscript (use with qualifier catalog (P972)). | The Night Watch <catalog code> 2016 | -
Commons category | P373 | String | Commons category: Corresponding Wikimedia Commons category (Q24574745). | Calosoma reticulatum <Commons category> Calosoma reticulatum | -
full work available at URL | P953 | URL | digital library: URL to digital copies of the work. | L'Odyssée, 'Poésie homérique' <full work available at URL> | -
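
The combination of ranks and time qualifiers on collection (P195) can be read back as a holding history. The following sketch assumes statements are modelled as described in the table. It goes through the full statement nodes (p:/ps:/pq:) rather than wdt:, because wdt: only returns the best-ranked statements and would hide former collections. Variable names and the LIMIT are illustrative.

```sparql
#title:Holding history of manuscripts via collection statements (sketch)
SELECT ?item ?itemLabel ?collection ?collectionLabel ?start ?end ?rank WHERE {
  ?item wdt:P31/wdt:P279* wd:Q87167 ;
        p:P195 ?statement .
  ?statement ps:P195 ?collection ;
             wikibase:rank ?rank .
  OPTIONAL { ?statement pq:P580 ?start . }
  OPTIONAL { ?statement pq:P582 ?end . }
  # Keep only statements that carry at least one of the two time qualifiers
  FILTER(BOUND(?start) || BOUND(?end))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?item ?start
LIMIT 200
```

The current holding institution should then show up with the rank wikibase:PreferredRank and no end time.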

Identifiers for manuscripts

Property | Created | Database reports | Associated tasks | Related properties
Gregory-Aland-Number (P1577) | 2014-10-24 | report | - | -
BerlPap ID (P1948) | 2014-06-14 | report | - | -
Catalogue of Illuminated Manuscripts ID (P3702) | 2017-02-25 | report | - | -
Medieval Libraries of Great Britain ID (P3768) | 2017-03-20 | report | - | -
Manus Online manuscript ID (P4752) | 2018-01-17 | report | - | -
IIIF manifest URL (P6108) | 2018-11-12 | report | - | -
Mirabile manuscript ID (P7989) | 2020-03-19 | report | - | -
Trismegistos text ID (P8532) | 2020-08-18 | report | - | -
Medieval Manuscripts in Oxford Libraries manuscript ID (P9015) | 2021-01-03 | report | - | -
Initiale ID (P10236) | 2022-01-05 | report | - | -
Mapping Manuscript Migrations manuscript ID (P10481) | 2022-03-12 | report | - | -
Diktyon ID (P12042) | 2023-09-23 | report | WikiProject Manuscripts/Pinakes | -
Catenae Catalogue ID (P12109) | 2023-10-22 | report | WikiProject Manuscripts/Catenae | Parpulov group (P12110)
Rahlfs number (P12116) | 2023-10-22 | report | WikiProject Manuscripts/LXX | -
cagb manuscript ID (P12131) | 2023-11-07 | report | WikiProject Manuscripts/cagb | -
BnF archives and manuscripts ID (P12207) | 2023-12-08 | report | - | -

Query to get the number of uses of each of the above identifiers:
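
One way such a count might be written is sketched below. The property list is copied from the table above. Note that the query counts all uses of each property, not only those on items classified as manuscripts.

```sparql
#title:Number of uses of the manuscript identifiers (sketch)
SELECT ?prop ?propLabel (COUNT(?value) AS ?uses) WHERE {
  VALUES ?prop {
    wd:P1577 wd:P1948 wd:P3702 wd:P3768 wd:P4752 wd:P6108
    wd:P7989 wd:P8532 wd:P9015 wd:P10236 wd:P10481 wd:P12042
    wd:P12109 wd:P12116 wd:P12131 wd:P12207
  }
  # Resolve each property entity to its direct (wdt:) predicate
  ?prop wikibase:directClaim ?p .
  ?item ?p ?value .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?prop ?propLabel
ORDER BY DESC(?uses)
```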

The following query uses these:

Showcase items

Showcase queries

Image gallery of most notable manuscripts

#title:The 100 manuscripts with most sitelinks
#defaultView:ImageGrid
SELECT ?q (MIN(?image) AS ?img) ?qLabel ?linkcount WHERE {
  ?q (wdt:P31/wdt:P279*) wd:Q87167.
  ?q wikibase:sitelinks ?linkcount .
  ?q wdt:P18 ?image.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
    ?q rdfs:label ?qLabel.
  }
}
GROUP BY ?q ?qLabel ?linkcount
ORDER BY DESC(?linkcount)
LIMIT 100

External identifiers for manuscripts

#title:Identifiers and count of manuscripts
SELECT ?id ?idlabel (COUNT(?item) AS ?count) WITH {
  # Named subquery: collect all manuscript items once
  SELECT DISTINCT ?item WHERE {
    ?item wdt:P31/wdt:P279* wd:Q87167.
  }
} AS %subquery WHERE {
  INCLUDE %subquery .
  ?id wikibase:propertyType wikibase:ExternalId; wikibase:claim ?p; wikibase:statementProperty ?ps.
  ?item ?p [ ?ps ?value ] .
  ?id rdfs:label ?idlabel FILTER (lang(?idlabel) = "en").
} GROUP BY ?id ?idlabel

Count of manuscripts by present collection

See also Wikidata:WikiProject Manuscripts/Dashboard

#title:Count of manuscripts by collection
SELECT ?collection ?collectionLabel (COUNT(?item) AS ?count) WHERE {
  ?item wdt:P31/wdt:P279* wd:Q87167.
  # Only collection statements without an end time, i.e. present holdings
  ?item p:P195 ?s. ?s ps:P195 ?collection . MINUS {?s pq:P582 []}.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} GROUP BY ?collection ?collectionLabel

Map of manuscript collections

#title:Map of manuscript collections
#defaultView:Map
SELECT ?collection ?collectionLabel ?coords ?link WHERE {
  {
    SELECT DISTINCT ?collection ?coords WHERE {
      ?q (wdt:P31/wdt:P279*) wd:Q87167;
         wdt:P195 ?collection .
      ?collection wdt:P625 ?coords.
    }
  }
  OPTIONAL { ?collection wdt:P856 ?link }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}

Manuscripts with no collection property

#title:Manuscripts with no collection property
SELECT DISTINCT ?q ?qLabel ?qDescription ?enwp WHERE {
  ?q (wdt:P31/wdt:P279*) wd:Q87167.
  MINUS { ?q wdt:P195 [] }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
  # The site IRI inside <> is missing here and must be supplied for this clause to match
  OPTIONAL { ?enwp schema:about ?q ; schema:isPartOf <> }
}

Count of manuscripts by language

#title:Count of manuscripts by language
SELECT ?lang ?langLabel ?langDescription (COUNT(DISTINCT ?q) AS ?mcount) WHERE {
  ?q (wdt:P31/wdt:P279*) wd:Q87167.
  ?q wdt:P407 ?lang .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
  }
}
GROUP BY ?lang ?langLabel ?langDescription
ORDER BY DESC(?mcount)

Count of manuscripts by material

#title:Count of manuscripts by material used
SELECT ?material ?l ?d (COUNT(DISTINCT ?q) AS ?mcount) WHERE {
  ?q (wdt:P31/wdt:P279*) wd:Q87167.
  ?q p:P195 ?s. ?s ps:P195 ?collection . MINUS {?s pq:P582 []}.
  MINUS { ?s ps:P195 wd:Q1322278 }
  ?q wdt:P186 ?material .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?material rdfs:label ?l; schema:description ?d
  }
}
GROUP BY ?material ?l ?d
ORDER BY DESC(?mcount)

External links