Wikidata:Property proposal/Unique image of unicode char

Unique image of unicode char

Originally proposed at Wikidata:Property proposal/Sister projects

   Not done
Description: Image of a visible Unicode char in SVG format, to be shown on wiktionaries (lemma pages for single Unicode chars) or wikipedias (article pages for single Unicode chars) beside the char as rendered by the OS and browser (with risk of "unknown"), see sv.wiktionary.org
Represents: none
Data type: Commons media file
Template parameter: no parameter yet; usable for boxes like en.wikipedia.org or eo.wiktionary.org
Domain: instances of Unicode character (Q29654788)
Allowed values: filename available on Commons with extension ".svg"
Allowed units: N/A
Example 1: exclamation mark (Q166764) -> "Bang.svg"
Example 2: 𞊬 (Q109636842) -> "TOTO LETTER BREATHY AE.svg"
Example 3:
Source: N/A
Planned use: see description above
Robot and gadget jobs: a bot should add this for all Unicode chars (instances of Unicode character (Q29654788)) where available; many images are listed at https://commons.wikimedia.org/w/index.php?title=Special:Contributions/Ekirahardian&limit=500&target=Ekirahardian
See also: Unicode character name (P9382)
Single-value constraint: yes

Motivation

Global unicode attack Taylor 49 (talk) 20:28, 15 April 2022 (UTC) Taylor 49 (talk) 16:02, 20 July 2022 (UTC)

Late enhancement: There is already a related property, namely image (P18). Unfortunately it can contain several images, or only vaguely related images. For example, @ (Q10714) contains a stamp instead of a clean picture of the character "@". Thus I propose a new property "Unique image of unicode char" that would be restricted to one value and to the SVG file format only. The scope could well be widened to other types of images than Unicode chars, for example traffic signs (road and rail), as long as there is a well-established definition of how the sign is supposed to look. The ultimate purpose of this is to centralize the Unicode database that currently exists on hundreds of wiktionaries in thousands of modules (example). Instead of creating and maintaining those modules (on some wikis they even got mass-deleted), all data would be here at Wikidata. Querying Wikidata can take some 100 ms, which is a bit longer than reading a Lua module, but this is done only once per wiki page, and the time limit is 10 s. After all, this is much better than having thousands of modules scattered everywhere and making the same edits on hundreds of wikis in order to keep the results consistent. The images of chars uploaded to Commons are ultimately useful, and this property would make it easier to use them on wiktionaries and other wikis. Taylor 49 (talk) 16:24, 20 July 2022 (UTC)

Discussion

  Comment @Multichill There is a reason why I propose a new property different from image (P18): the new one is restricted to one value, and also SVG file format only. Taylor 49 (talk) 06:43, 16 April 2022 (UTC)
That's what you're proposing, not why you are proposing this. On exclamation mark (Q166764) it seems to work now. Why change it? Multichill (talk) 09:56, 16 April 2022 (UTC)
@Taylor 49: If there's a reason you are proposing it, writing it into the motivation box is the way to go. Project chat links aren't stable when discussions get archived. ChristianKl 12:30, 20 July 2022 (UTC)
@ChristianKl: I fixed the link above, it works again. BTW: YES we CAN. How do I find the Q-item with Unicode code point (P4213) having value "6666" from Lua? Taylor 49 (talk) 16:08, 20 July 2022 (UTC)
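Side note on the lookup question: this does not cover the Lua side, but outside a module one way to find an item by a property value is a SPARQL query against the Wikidata Query Service. A minimal sketch, assuming P4213 stores the code point as a plain hexadecimal string; the function names are hypothetical and nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def code_point_query(code_point: str) -> str:
    """Build a SPARQL query for items whose P4213 value equals code_point.

    Assumption: P4213 holds the code point as a hexadecimal string.
    """
    if not code_point or any(c not in "0123456789abcdefABCDEF" for c in code_point):
        raise ValueError("expected a hexadecimal code point such as '6666'")
    return 'SELECT ?item WHERE { ?item wdt:P4213 "%s" . }' % code_point


def code_point_query_url(code_point: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": code_point_query(code_point), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(code_point_query("6666"))
    print(code_point_query_url("6666"))
```

The hexadecimal check is there only to keep arbitrary text out of the query string.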
  Comment @Nw520 @Multichill There can be multiple images such as here tilde (Q11167) or junk images such as here @ (Q10714) or here (Q3594829). It's easy to filter by extension, but difficult to impossible to analyze content. Thus, the easiest solution is a new property restricting to one value, and explicitly intended as "unique image of unicode char" rather than "char in any context" that generic image (P18) is for. Taylor 49 (talk) 10:32, 16 April 2022 (UTC)
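To illustrate the "easy to filter by extension" part: a minimal sketch, assuming the P18 values of an item are already available as a list of Commons file names. The function name and the sample file names are made up for illustration.

```python
def svg_only(filenames):
    """Keep only the file names ending in .svg (case-insensitive)."""
    return [name for name in filenames if name.lower().endswith(".svg")]


if __name__ == "__main__":
    # Hypothetical P18 values of a single item.
    p18_values = ["At sign on a stamp.jpg", "At sign.svg", "At sign outline.SVG"]
    candidates = svg_only(p18_values)
    print(candidates)
    # The extension filter can still leave several candidates, and it
    # says nothing about what each image actually shows.
    print(len(candidates) == 1)
```

With the sample list two SVG files remain, so the filter alone does not give a single clean image.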
Consistent use of a qualifier, as suggested, would enable analysis of content. So still unclear why a distinct property is required. If there are advantages, what are they? --Tagishsimon (talk) 13:03, 16 April 2022 (UTC)
@Tagishsimon What qualifier would you suggest?
The advantages or differences:
  • P18 allows multiple images, this would allow only one
  • P18 allows all formats, this would allow only SVG
  • P18 is for "anything that makes the subject understandable" and context is allowed (char on a stamp for example), whereas this would be intended for the bare char (on white and transparent background) and nothing else, preferably in a further standardized form (see Wikidata:Project_chat#Global_unicode_attack). The scope is different, or at least substantially narrower.
You can always consider a property as redundant, for example number of cylinders (P1100) can be replaced by has part(s) of the class (P2670) with the qualifier quantity (P1114).
Taylor 49 (talk) 17:28, 16 April 2022 (UTC)
Switched from oppose to neutral. I am still not happy with how narrow the proposed property's domain would be (many other concepts could benefit from an image property that is restricted to a digital / not photographed representation, e.g. street signs, military symbols, etc…) and think that restricting it to one specific file extension (what if no SVG is available?) and limiting it to only one statement (what if there are multiple styles available?) is problematic. --Nw520 (talk) 20:07, 16 April 2022 (UTC)
  Comment Like Nw520 I also think it's preferable to have properties with a wide scope plus a qualifier rather than creating new properties. Here's a preliminary suggestion for a P18 qualifier: X -> image (P18) -> Y, with qualifier has characteristic (P1552) -> standardized (Q105223943); alternatively X -> image (P18) -> Y, with qualifier object has role (P3831) -> standardized (Q105223943). This is sufficient to solve the problem of distinguishing a new standardized picture from old pictures. Infrastruktur (talk) 21:41, 16 April 2022 (UTC)
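A minimal sketch of how a consumer could pick out such qualified statements, assuming the item is available in the standard Wikidata entity JSON layout (claims, mainsnak, qualifiers). The function name and the sample data are hypothetical.

```python
STANDARDIZED = "Q105223943"  # standardized


def standardized_images(entity, qualifier_pid="P1552", marker_qid=STANDARDIZED):
    """Return P18 file names whose statement has qualifier_pid = marker_qid."""
    found = []
    for statement in entity.get("claims", {}).get("P18", []):
        mainsnak = statement.get("mainsnak", {})
        if mainsnak.get("snaktype") != "value":
            continue  # skip "no value" / "unknown value" statements
        qualifier_snaks = statement.get("qualifiers", {}).get(qualifier_pid, [])
        qids = {
            snak["datavalue"]["value"]["id"]
            for snak in qualifier_snaks
            if snak.get("snaktype") == "value"
        }
        if marker_qid in qids:
            found.append(mainsnak["datavalue"]["value"])
    return found


if __name__ == "__main__":
    # Made-up item with one plain and one qualified P18 statement.
    sample = {
        "claims": {
            "P18": [
                {"mainsnak": {"snaktype": "value",
                              "datavalue": {"value": "Char on a stamp.jpg"}}},
                {"mainsnak": {"snaktype": "value",
                              "datavalue": {"value": "Char.svg"}},
                 "qualifiers": {"P1552": [
                     {"snaktype": "value",
                      "datavalue": {"value": {"id": STANDARDIZED}}}]}},
            ]
        }
    }
    print(standardized_images(sample))
```

Passing qualifier_pid="P3831" would cover the alternative modelling with object has role.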
> what if no SVG is available?
Create it; until then, no unique image is available (you can still use P18).
> what if there are multiple styles available?
Pick the most typical one (not italic, not outline, ...), if several are best, pick randomly among them.
> many other concepts could benefit
True (maybe not many, but at least some other concepts). If someone wants to propose a property "unique naked image of sign, char, symbol in SVG format" with a bit wider domain, then we can discard this proposal in favor of that one. Taylor 49 (talk) 18:56, 17 April 2022 (UTC)
  Oppose I don't think this would make sense as a property. Different projects may prefer different styles (e.g. serif vs sans serif), so it shouldn't be limited to one value. Sometimes there isn't an SVG image available, so it shouldn't be limited to SVGs. Most of the time it would be no different from the image (P18) statement. It would be useful to be able to find images following a consistent style, but that includes more than just Unicode characters (e.g. maps) and there's no limit on how many styles there can be, so I think the only sensible approach would be to use qualifiers. There are some potentially useful qualifiers already, like script style (P9302), typeface/font used (P2739) and for color scheme (P8798), but we would probably need a new one (something like "image scheme", maybe). - Nikki (talk) 10:59, 30 May 2022 (UTC)
  Comment > serif vs sans serif
Well, the official charts do not have this absurd problem. Just follow the style that can be found there. Taylor 49 (talk) 16:29, 20 July 2022 (UTC)
Your preference may be to use the same style as the Unicode charts, which is serif for Latin script, but the English Wiktionary seems to prefer sans serif for Latin script. - Nikki (talk) 10:05, 25 July 2022 (UTC)

  Not done @Taylor 49, Tagishsimon, Multichill, Nikki, Infrastruktur, Nw520: BrokenSegue (talk) 05:36, 14 September 2022 (UTC)