User:Marsupium/Blog

I have decided to start, as an experiment, a small series of posts about small things connected to my activity at Wikidata. I take notes for myself anyway, but some of them might be more useful to me, and perhaps to others, if I write them out in an understandable form. Feel free to correct the posts or to add comments. Thanks, --Marsupium (talk) 06:30, 13 April 2014 (UTC), changed 01:19, 19 April 2014 (UTC), 09:53, 15 August 2017 (UTC)

Unlike its transclusion on User:Marsupium, this page also shows posts in draft state.

Ordering our triples

--Marsupium (talk) 11:36, 20 June 2018 (UTC)

On quality

Wikidata is growing fast. That's great. Being the database it is, it cannot be free of errors. Quality has to be traded off against quantity. IMHO less quantity and more quality would be preferable. I would like to be able to tell the Wikipedias, many of which are sceptical about Wikidata's quality and its usefulness for them,[q 1] that their doubts are unfounded. I'm afraid they are not.

There are errors by manual editors and errors by scripts and bots, all of which lower quality. I spend much of my time here correcting bot errors. Many of them are avoidable, many even easily so. I think higher standards for bots and their requests could help:

  • Bots shouldn't get a flag for one task and afterwards use it for many others.
  • Bots shouldn't continue with new additions if they have made harmful ones that are not yet corrected.
  • Bots running with code that is not public should be a well-founded exception. Bot code should be required to be open source.

Rules can help with this, but a culture of putting more effort into reducing bot errors would be nicer. I appreciate all the work bot operators put into their contributions and I'd like to encourage everyone to take care, improve code, point out errors and help to eliminate them! --Marsupium (talk) 09:53, 15 August 2017 (UTC)

  1. German Wikipedia tends to be such a place, caring a lot about preserving standards.

another issue process: Topic:Uodrphi2rhc499ng coming from reinheit… affecting MnM powered edits and causing overreactions

Where I still plan to talk to others:

here: 2016-09 ("Thespiaden"), 2016-11
also, the edit summary "Updated item: nl-description" is wrong; what changes is the label, e.g. [2], [3]
the logs do not work: last entries at https://goo .gl/BezTim are from 2016

same problem with BotMultichill: [4] also BotMultichillT: based on ULAN and many other QS fiddlers and aliases which aren't aliases: User talk:Multichill/Archives/2018/June#Absurd aliases and from 2019-06: User talk:Multichill#About BotMultichillT/ULAN

use c:Module:I18n/name etc.; for a blacklist also: "d.y." (sv)

see also Wikidata:Project chat/Archive/2018/01#Automatically copied labels to other languages

now a start for a constructive approach to prevent further mess and eventually fix the already existing: User:Marsupium/Person label copying blacklist
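As a rough illustration of how such a blacklist could be used, the sketch below checks a label for language-specific terms before a bot copies it to other languages. Only "d.y." comes from the notes above; the other terms, the function name and the substring matching are assumptions made for the example and not the logic of any existing bot.

```python
# Hypothetical blacklist: only "d.y." is taken from the notes above,
# the remaining entries are assumed examples of language-specific terms.
BLACKLIST = ("d.y.", "d.ä.", "the Elder", "the Younger")


def is_safe_to_copy(label: str, blacklist=BLACKLIST) -> bool:
    """Return False if the label contains a term that is specific to one language."""
    lowered = label.casefold()
    # Plain case-insensitive substring matching; a real check would probably
    # need word boundaries and per-language lists.
    return not any(term.casefold() in lowered for term in blacklist)
```

A bot would call `is_safe_to_copy(label)` before adding the label in another language and skip the copy when it returns False.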

Dupe sources

A haphazard selection (additions welcome). The items created there typically have a P31 statement, about 4 statements with non-ID properties and at least 1 statement with an ID property:

Bot | Import | No. of item creations | Time | Props added | Example edit(s)
Reinheitsgebot | mixnmatch_people_creator | >3000 | 2018 | P31, P27, P569, P570, P106, >=2 ID props: all applicable sourced | [5]
BotMultichill | Web umania import | | 2018 | P31, P4887, P106, P21 (+ P569, P570 from Reinheitsgebot): all applicable sourced | [6]
BotMultichill | "Creating artist based on RKD: Person won the prize" | | 2017 | P31, P650, P21, P106, P569, P19, perhaps more: all applicable sourced | [7]

mixnmatch_people_creator has created the same person twice in one run, e.g. Leon Moran (Q52148493), Joseph Henry Bush (Q52154574). How to avoid this: test the IDs.
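One way to "test IDs" within a single run could look like the following minimal sketch. It assumes each candidate item is a plain dict carrying its external identifiers keyed by property ID; the record shape and the function name are hypothetical and not taken from any of the bots mentioned here.

```python
from typing import Iterable

# Hypothetical record shape, e.g. {"label": "Person A", "ids": {"P650": "1"}}
Candidate = dict


def dedupe_by_ids(candidates: Iterable[Candidate]) -> tuple[list[Candidate], list[Candidate]]:
    """Split candidates into (to_create, skipped).

    A candidate is skipped when any of its (property, value) ID pairs
    has already been seen earlier in the same run.
    """
    seen: set[tuple[str, str]] = set()
    to_create: list[Candidate] = []
    skipped: list[Candidate] = []
    for cand in candidates:
        pairs = {(prop, str(value)) for prop, value in cand.get("ids", {}).items()}
        is_duplicate = bool(pairs & seen)
        # Remember all IDs, also those of skipped candidates, so that
        # later records sharing any of them are caught as well.
        seen |= pairs
        (skipped if is_duplicate else to_create).append(cand)
    return to_create, skipped
```

This only catches duplicates inside one batch. Checking the same IDs against items that already exist on Wikidata would be a separate step.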

also some:

  • BotMultichill: "Creating new item for Stedelijk Museum voor Actuele Kunst artist"
  • BotMultichill: "Creating artist based on RKD: Painter has more than 3 works in RKDimages"
  • BotMultichill: "Creating artist based on RKD: Painter with works in RKDimages and date and place of birth known"

Collaboration on the Physical Objects Class Tree

Concerned with the progress of Wikidata:WikiProject Visual arts, I try to improve the interoperability of Wikidata with commons:Template:Artwork, which was, if I remember correctly, transcluded more than 500,000 times (when [8] still showed this information). This work includes mapping commons:Template:I18n/objects to a physical object class tree on Wikidata with the help of the Art & Architecture Thesaurus (Q611299). In doing so I often enter the fields of other disciplines and at the same time run into issues with the different language versions of Wikipedia. A great, and perhaps the only, way to create high-value data will be collaboration between projects from different domains and different languages. The projects at WikiProject Archaeology (Q10801979) and Portal:Archaeology (Q7076022), or even de:Wikipedia:Wikidata trifft Archäologie 2013, might be the ones that can help Wikidata:WikiProject Visual arts determine whether arrowhead (Q1643900) should be a subclass of (P279) projectile point (Q2308299), and the projects at Portal:Fashion (Q13341443) might be able to help distinguish the forms of headgear (Q14952), which caused me some trouble some weeks ago. --Marsupium (talk) 01:19, 19 April 2014 (UTC)

Some Critique of Iconclass

WARNING: This section was written some time ago. I'm afraid it contains errors that I still plan to do some research on. If you think anything is not correct, I would be happy if you edited it or left a remark. --Marsupium (talk) 09:53, 15 August 2017 (UTC)

  • An issue with Iconclass (Q1502787), more or less serious depending on the use, is that it is only partially formalised. Apart perhaps from some recurring annexes, Iconclass is a monohierarchical classification. Between the time Iconclass “took shape in the early 1950s”[i 1] and its digital publication this probably did not really matter: a multihierarchical classification published in the mostly linear form of books would have been hard to use anyway. However, now that the classification is published as Linked Open Data[i 2] at the latest, this is a problem that heavily reduces the value of Iconclass for Wikidata. The problem becomes evident in a passage from an official guide to Iconclass:

    “This richness is not due to the fact that Iconclass linguistically differentiates between meanings of the word "praying". It is simply because Iconclass contains concepts that may be represented in various ways, like "public prayer" and "private prayer", but also concepts we can use to describe the visualization of prayer, like "hands folded"; and of course, many scenes of prayer from the bible, classical history and classical mythology. So, there is a fundamental difference between the word we use as a search term and the various concepts or groups of concepts that are linked to that word. This difference can be summarized in one word: context. Of course, the word "praying" has its own semantic richness, but that will never match the historical, thematic and narrative contexts which the Iconclass browser unfolds for us.”[i 3]

  • Classification: The classification is shaped by the former paper form of Iconclass. Unfortunately the classification is monohierarchical, which is an annoying limitation. Moreover, the number of subclasses per level is limited quite arbitrarily by the use of a digit for each of the first two levels, a letter for the third, and digits again for the further levels.[i 4]
  • Nonsense and redundancy of the classification: As a consequence of the poor formalisation, Iconclass lists notations that are redundant, nonsensical, or both. For example, there are both "11F(+2) the Virgin Mary (+ Mary)" and "11F7(+2) specific aspects ~ Madonna-representations (N.B. secondary notations only) (+ Mary)".
  • Formalisation: Another point where the missing formalisation becomes evident is the so-called "Structural Digits",[i 4] which would be better expressed explicitly.
  • Controlled vocabulary: Iconclass is a controlled vocabulary. This has the advantage that no unexpected values appear. With semantic web technologies, however, open systems that are still well formalised have become possible. Especially considering the structurally necessary incompleteness of Iconclass, for example regarding non-European art, open systems could probably offer more expressive power.
  • The system of labelling the main and additional minor parts of an image might be improved by expressing the precise relationship between these parts in a formalised manner.
  • More: Comments such as the one at http://www.iconclass.org/rkd/46C11/ ("Comments: Jan. 6, 2011, 1:33 p.m. Hans Brandhorst says: 46C119 manoeuvering, navigating We don't have the concept for traffic on land as we do have for travel on water. Seems useful to have it: "coachman" could be: 46C1191...") may indicate more problems.
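To make the level scheme from the "Classification" point concrete (a digit, a digit, a letter, then further digits), here is a minimal sketch that reads the chain of broader classes off a notation's prefixes. The regular expression is a simplification assumed for this example: it treats everything from the first bracket onwards as opaque and ignores other Iconclass features, so it is not a full parser of the notation.

```python
import re

# Assumed simplified pattern: digit, digit, letter, further digits, and an
# optional remainder starting with "(" that this sketch does not interpret.
_BASIC = re.compile(r"^(\d)(?:(\d)(?:([A-Z])(\d*))?)?(\(.*)?$")


def basic_ancestors(notation: str) -> list[str]:
    """Return the monohierarchical chain of prefixes up to the first bracket."""
    match = _BASIC.match(notation.strip())
    if match is None:
        raise ValueError(f"not covered by the simplified pattern: {notation!r}")
    first, second, letter, digits, _rest = match.groups()
    chain = [first]
    if second:
        chain.append(chain[-1] + second)
    if letter:
        chain.append(chain[-1] + letter)
    # From the fourth level on, each further digit adds one level.
    for digit in digits or "":
        chain.append(chain[-1] + digit)
    return chain
```

For "46C119" this yields 4, 46, 46C, 46C1, 46C11 and 46C119. Each notation has exactly one parent here, and nothing in the notation itself says how "11F(+2)" and "11F7(+2)" relate beyond the shared prefix.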

--Marsupium (talk), before 23:23, 20 July 2014 (UTC), 23:18, 10 September 2017 (UTC)

  1. History of Iconclass
  2. ICONCLASS as Linked Open Data
  3. Etienne Posthumus; Hans Brandhorst: A PRACTICAL GUIDE TO THE ICONCLASS 2100 BROWSER. November 2009, p. 6.
  4. Contents of Iconclass

PS: The Getty Vocabulary Program now provides the Getty Iconography Authority, which looks promising at first glance. See the Guidelines (2017-09-01, as of 2017-09-26); the slides by Patricia Harpring, revised June 2016; and "CONA and the Iconography Authority: Linking and Relationships Are Unique", International Terminology Working Group Meeting, 23 August 2016, Patricia Harpring (similar content). --Marsupium (talk) 14:49, 26 September 2017 (UTC)

PPS: It seems to be an unintended limitation that some "structural digits" after a bracket cannot be given without definite content in the bracket. The need for this can be seen in Madonna with Canon Joris van der Paele (Q2480921): depicts Iconclass notation (P1257): 61B2(...)12, from https://rkd.nl/en/explore/images/2149. --Marsupium (talk) 16:59, 21 January 2019 (UTC)