Wikidata:Property proposal/Thesaurus Linguae Aegyptiae object ID
Thesaurus Linguae Aegyptiae object ID
Originally proposed at Wikidata:Property proposal/Creative work
Motivation
To support LOD in the realm of Egyptian texts, we would like to
- import a subset of the TLA's Egyptian text objects into Wikidata, creating independent Wikidata IDs for them (so far only a few Ancient Egyptian text artefacts are in Wikidata),
- refer to Wikidata IDs in TLA text object entries,
- link Wikidata entries of Ancient Egyptian text artefacts to TLA via the requested property.
--Dwer (talk) 20:29, 16 November 2023 (UTC)
Discussion
- Support Sphekaleon (talk) 07:57, 23 November 2023 (UTC)
- Support This must be useful for using TLA objects as linked open data --Somiyagawa (talk) 13:49, 22 November 2023 (UTC)
- Support I think it is a useful addition to be able to link hieroglyphic texts to external repositories. Situxx (talk) 14:46, 19 November 2023 (UTC)
- Support Maculosae tegmine lyncis (talk) 10:43, 20 November 2023 (UTC)
- Support Tobias Paul (talk) 10:08, 21 November 2023 (UTC)
- Comment I would rather propose separate properties: Thesaurus Linguae Aegyptiae work ID for written works (Thesaurus Linguae Aegyptiae objects with Datentyp "Sammelüberschrift") and Thesaurus Linguae Aegyptiae object ID for objects (Thesaurus Linguae Aegyptiae objects with Datentyp "Objekt"). Thesaurus Linguae Aegyptiae work ID would have the subject type constraint written work (Q47461344), text (Q234460). Thesaurus Linguae Aegyptiae object ID would have the subject type constraint artificial object (Q16686448). Otherwise it gets messed up. For example, Westcar Papyrus (Q591301) is a Wikidata entry for both the papyrus and the tales, but TLA has two separate IDs: SEPNO7UILFCDPKLYRADUNRV3QA for the papyrus and TI5T3F4AMZHSXK5NXBTT2Y6N4Q for the tales. Hence, the Wikidata entry cannot be unambiguously linked to a TLA object using a composite work/object property as proposed above.
- Work ID example: Instructions of Kagemni (Q3069960)→U45UGJQ3WFEBRDS6TDCRBST66Y
- Work ID example: The Maxims of Ptahhotep (Q963743)→PG6PVAQFHND67GCROZAIT3AA64
- Object ID example: Prisse Papyrus (Q1632596)→NHXAUFJ7AZAYFKTYQUXLHWKXUE
- --Ailintom (talk) 15:39, 21 November 2023 (UTC)
- Comment It is true that, from a conceptual perspective, (a) text artefacts (objects) and (b) abstract texts ('The Bible', 'The Teachings of Ptahhotep') should be distinguished.
However, the TLA uses the same data structure (BTSTCObject) and the same URL endpoint thesaurus-linguae-aegyptiae/object/$1 for both (only differentiating between the two via additional metadata).
I wonder whether it is allowed to apply for two conceptually different properties that point to one and the same URL endpoint. Is this an option?
BTW: The TLA differentiates between text artifacts ('Papyrus Prisse'; BTSTCObject; URL endpoint thesaurus-linguae-aegyptiae/object/$1) and abstract texts ('The Teachings of Ptahhotep'; BTSTCObject; URL endpoint thesaurus-linguae-aegyptiae/object/$1), on the one hand, and concrete written texts ('the copy/version of The Teachings of Ptahhotep on Papyrus Prisse'; BTSText; URL endpoint thesaurus-linguae-aegyptiae/text/$1), on the other hand. The differentiation between a concrete written work/text and an abstract work/text still seems to be alien to Wikidata?
--Dwer (talk) 17:54, 21 November 2023 (UTC)
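To illustrate the point about the shared endpoint, here is a minimal Python sketch of how a Wikidata-style formatter pattern with `$1` expands. The two patterns are copied as quoted in the comment above; the function name `expand` is hypothetical, and no host is prepended because none is given here. The IDs are the Westcar Papyrus examples quoted earlier in this discussion.

```python
from urllib.parse import quote

# Patterns exactly as quoted in the discussion; the host part is not given
# there, so none is assumed here.
OBJECT_PATTERN = "thesaurus-linguae-aegyptiae/object/$1"
TEXT_PATTERN = "thesaurus-linguae-aegyptiae/text/$1"


def expand(pattern: str, identifier: str) -> str:
    """Replace the `$1` placeholder with a percent-encoded identifier."""
    if "$1" not in pattern:
        raise ValueError("pattern has no $1 placeholder")
    return pattern.replace("$1", quote(identifier, safe=""))


# Papyrus (artefact) and tales (abstract work) of the Westcar example:
papyrus_url = expand(OBJECT_PATTERN, "SEPNO7UILFCDPKLYRADUNRV3QA")
tales_url = expand(OBJECT_PATTERN, "TI5T3F4AMZHSXK5NXBTT2Y6N4Q")
```

Both results share the `object/` path, so the URL alone does not reveal whether an ID denotes an artefact or an abstract work; only the TLA metadata does.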
- Comment Notified participants of WikiProject Antiquity --Jahl de Vautban (talk) 21:17, 21 November 2023 (UTC)
- Comment Regarding the concerns raised by Ailintom and Dwer's reply: In my opinion, we don't need to create different properties for the TLA objects just because TLA puts textual artifacts (papyri, inscriptions etc.) in the same category as abstract concepts of texts (written work (Q47461344)). We must, however, make sure to keep the distinction between the two clear on Wikidata; it is far from alien to us. For comparison, we have both Berlin Chronicle (Q21100459) for the abstract work and Egyptian Museum and Papyrus Collection, P 13296 (Q21100575) for the textual artifact on which it is transmitted. Two concepts, two items. This can be done easily with one and the same property, so no need to change the URL endpoint on the TLA's end.
On another note, I find it curious that the TLA has opted to treat concrete objects and (intangible, abstract) texts in this way, but it is no cause for concern in the creation of this property. Trismegistos has chosen to label its entries as "texts" when, clearly, they are textual artifacts (tangible objects). This had led to confusion on Wikidata, which I have since corrected.
Oh, and BTW, I Support the creation of this property. Jonathan Groß (talk) 16:48, 22 November 2023 (UTC)
- Comment Jonathan Groß, in that case we would probably need a clear guideline to decide which of the two different TLA text/object IDs we provide for the still very numerous Wikidata entries that describe both the artefact (manuscript) and the abstract work (literary work). The problem is that TLA does not have a SPARQL interface; hence, when querying Wikidata with SPARQL, you will have no idea whether the provided TLA ID refers to an artefact or an abstract work. This will render the provided property useless (or cumbersome to use, for one can of course scrape the TLA website with a Python script or the like) in many cases when you want to extract TLA IDs from Wikidata for reuse in other LOD projects. (And if one day the TLA provides a semantic interface to their data, I guess they will no longer use the same class for artefacts and works, and we will have to split the property anyway.) I think we are lucky that the TLA, unlike Trismegistos, does differentiate between abstract works, artefacts, and concrete written texts, and we should make use of this distinction to provide a more semantically transparent mapping. --Ailintom (talk) 07:38, 23 November 2023 (UTC)
- @Ailintom: As a guideline, I suggest disambiguating the Wikidata items which currently conflate artifact and work. See our discussion here. It would be a boon if you could list examples of this for us to fix. Best, Jonathan Groß (talk) 08:03, 23 November 2023 (UTC)
- Comment The core business of the TLA is text artifacts and concrete texts (versions). These text artifacts, however, are organized in a corpus tree which also contains various kinds of captions such as "Letters", but eventually also works such as Teachings of Ptahhotep. So there are excellent landing pages for works, under which you will also find the text artifacts and the concrete texts linked, but the caption was not tailor-made for works. The actual question is therefore why text artifacts and captions were differentiated only on the metadata level and not on the data type level (which technically determines the URL endpoint). No one had foreseen these technical implications. It is probably wise indeed to restrict this very proposal to text artifacts and to make a separate proposal for Thesaurus Linguae Aegyptiae work IDs (leading to the same URL endpoint), as suggested by @Ailintom:. --Dwer (talk) 11:22, 25 November 2023 (UTC)
- Support I'm on board with that, too. Jonathan Groß (talk) 12:25, 25 November 2023 (UTC)
- @Dwer, Ailintom, Sphekaleon, Somiyagawa, Situxx, Maculosae tegmine lyncis: @Tobias Paul, Jahl de Vautban: Done I've created two properties: Thesaurus Linguae Aegyptiae object ID (P12185) for objects (textual artifacts) and Thesaurus Linguae Aegyptiae textual work ID (P12186) for works (intellectual contents, texts). Both use the same formatter URL but have slightly different property constraints. This should facilitate data evaluation. BTW, if you want to import much data from the TLA into Wikidata, I suggest seeking help and guidance at the WikiProject Antiquity. Jonathan Groß (talk) 21:16, 28 November 2023 (UTC)
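As a rough illustration of how the split could facilitate data evaluation, the following Python sketch builds a SPARQL query that returns TLA IDs together with their kind, based only on which of the two properties carries the ID. The function name `build_tla_query`, the variable names, and the "object"/"work" labels are assumptions made for this example; only the property numbers P12185 and P12186 come from the comment above. The query relies on the `wdt:` prefix that the Wikidata Query Service predefines.

```python
# Property numbers as stated above; the kind labels are illustrative only.
TLA_PROPERTIES = {"P12185": "object", "P12186": "work"}


def build_tla_query(limit: int = 100) -> str:
    """Return a SPARQL query listing items with a TLA ID and its kind."""
    if limit < 1:
        raise ValueError("limit must be positive")
    # One UNION branch per property, each tagging its results with a kind.
    branches = [
        '  { ?item wdt:%s ?tlaId . BIND("%s" AS ?kind) }' % (pid, kind)
        for pid, kind in TLA_PROPERTIES.items()
    ]
    body = "\n  UNION\n".join(branches)
    return "SELECT ?item ?tlaId ?kind WHERE {\n%s\n}\nLIMIT %d" % (body, limit)


query = build_tla_query(limit=10)
```

The resulting string can be pasted into the Wikidata Query Service. Because the kind is derived from the property, no lookup on the TLA side is needed to tell artefacts from works.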