Dan Polansky

Classes and instances

Hi. diamond (Q5283) and quartz (Q43010) are not supposed to be set as instances of mineral (Q7946) (nor gemstone (Q83437)). Conflictingly, they are both already set as subclasses of (subclasses of) mineral (Q7946). Metaclasses exist instead for the classification of classes; in this case there is the metaclass mineral species (Q12089225). An actual mineral instance would be a concrete one such as the Krupp Diamond (Q1790227). See also Wikidata talk:WikiProject Mineralogy/Properties/Archive/2018/06 and Help:Basic membership properties. 2001:7D0:81FD:BC80:647A:E42D:74E1:2829 08:36, 3 December 2022 (UTC)
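
To make the point above concrete, here is a minimal Python sketch of the conflict being described: an item declared an instance of a class that it is also a (transitive) subclass of. The statement tables are toy data, not a dump of the real items; in particular the intermediate class "native element mineral" and the function names are assumptions made for illustration.

```python
# Toy statements, loosely modelled on the items discussed above.
# The intermediate class "native element mineral" is a hypothetical
# placeholder; the real subclass chain on Wikidata may differ.
SUBCLASS_OF = {            # stands in for subclass of (P279)
    "diamond": {"native element mineral"},
    "native element mineral": {"mineral"},
}
INSTANCE_OF = {            # stands in for instance of (P31)
    "diamond": {"mineral", "mineral species"},
    "Krupp Diamond": {"diamond"},
}


def superclasses(item, subclass_of):
    """Return every class reachable from item via one or more subclass steps."""
    seen = set()
    stack = list(subclass_of.get(item, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return seen


def conflicting_classes(item, instance_of, subclass_of):
    """Classes that item is declared both an instance of and a subclass of."""
    return instance_of.get(item, set()) & superclasses(item, subclass_of)


print(conflicting_classes("diamond", INSTANCE_OF, SUBCLASS_OF))
print(conflicting_classes("Krupp Diamond", INSTANCE_OF, SUBCLASS_OF))
```

In this toy data only "mineral" is flagged for diamond; the metaclass "mineral species" is not, because diamond is not a subclass of it.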

That is rather confusing to me. I posted something to Talk:Q5283 (diamond). To my mind, "diamond" is not a class and has no instances and is like "gold" or "bronze" in that regard, but Wikidata currently claims otherwise. However, researching this further, I now realize that the word "diamond" is used in two senses, one countable referring to individual stones and one uncountable referring to the material. --Dan Polansky (talk) 12:38, 16 December 2022 (UTC)Reply

The diamond (Q5283) item is horribly conflated; according to en:Diamond it belongs to the mineral class only and should not be considered a gemstone (Q83437) (whether an instance or a class), nor should the descriptions refer to both senses (like "Bill is both an itemized list of expenses and the 42nd POTUS; now put that into your ontology pipe and smoke it"). The classification as a gemstone instead applies to diamond (Q17413599), the other sense. Since you appear to already have a discussion about this (still without even mentioning diamond (Q17413599)) I'd rather not get involved, but I'll leave it to you (collectively) to resolve the matter. I have however added a third sense (the mineral) to diamond (L31754) and corrected the item for this sense (P5137) links (the gem sense and lemma confusingly referred to the mineral item). --SM5POR (talk) 06:42, 30 December 2022 (UTC)Reply

Unfortunately, English Wikipedia seems to be the only one making a clear distinction between the mineral and the gemstone by having an article about each of the two subjects, while diamond (Q5283) links to articles in numerous languages covering either the mineral, the gemstone, or both (I just checked a few of them; most seem to begin discussing the mineral, but then also cover cut diamonds to a greater or lesser extent). Depending on the dominance of either subject, a case could be made for creating another item and moving one category of article links there, then turn the ambiguous one into an instance of (P31) Wikipedia article covering multiple topics (Q21484471). SM5POR (talk) 07:18, 30 December 2022 (UTC)Reply

A systematic approach towards classifying the Wikipedia articles could be this:

1. Pick out a number of related articles and image files in (say) English Wikipedia.
2. Map those articles/images to their corresponding Wikidata items.
3. Group those items according to topic (mineral/chemical structure vs gemstones/brilliants/cuts).
4. Associate any Wikipedia articles linked to said items with those same groups.
5. Search the articles linked to item diamond (Q5283) for links (in the text) to articles/images grouped in the previous step.
   - If any previously unlisted (but relevant) articles/images turn up during the search, restart the procedure from step 2 with these articles included.
6. Calculate the amount of text (characters, words, sections, whatever) continuously dominated by links to either group.
7. Sort the articles according to percentage mineral/gemstone text.
8. Verify that the articles at the extreme ends of this sorted list indeed seem to cover the topics using approximately those amounts of text.

Steps 4-7 should be suitable for automated processing, once the necessary code for matching links and calculating text amounts has been developed. The sorted list could then be used to find articles preferably linked to a different item. --SM5POR (talk) 09:08, 30 December 2022 (UTC)
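
The scoring and sorting part of the procedure above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: articles are taken to be already split into paragraphs with their outgoing links extracted, "continuously dominated" is simplified to "the paragraph has more links to one group than to the other", and the link titles in the two groups as well as the function names are hypothetical.

```python
# Hypothetical link groups; in the procedure above they would come from
# grouping the Wikidata items and their linked articles by topic.
MINERAL = {"Carbon", "Crystal structure", "Mohs scale"}
GEMSTONE = {"Brilliant (diamond cut)", "Carat (mass)", "Jewellery"}


def mineral_share(paragraphs, mineral=MINERAL, gemstone=GEMSTONE):
    """Share of text dominated by mineral links.

    paragraphs is a list of (text, links) pairs. A paragraph counts for
    the group it has more links to; ties and link-free paragraphs are
    ignored. Returns None if no paragraph could be classified.
    """
    mineral_chars = gemstone_chars = 0
    for text, links in paragraphs:
        m = sum(1 for link in links if link in mineral)
        g = sum(1 for link in links if link in gemstone)
        if m > g:
            mineral_chars += len(text)
        elif g > m:
            gemstone_chars += len(text)
    total = mineral_chars + gemstone_chars
    return mineral_chars / total if total else None


def rank_articles(articles):
    """Sort articles from most mineral-dominated to most gemstone-dominated.

    articles maps an article title to its list of (text, links) pairs.
    Articles that could not be classified at all are left out.
    """
    scored = [(title, mineral_share(paras)) for title, paras in articles.items()]
    scored = [(title, share) for title, share in scored if share is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The two ends of the list returned by rank_articles are the articles one would check by hand in the final verification step.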

Thank you. I am not sure I have enough enthusiasm for this effort on "diamond", especially as it impacts Wikipedia. If you have energy to do it, it will help the ontology. --Dan Polansky (talk) 09:17, 30 December 2022 (UTC)

How to do Wikidatese

Some tips and tricks:

Heavy nominalization, use of nouns.
meaningless:
- does-not-have-quality meaning
- has absence of meaning
utterly meaningless:
- does-not-have-quality meaning has-quality utterness (or something like that)
common
- has-quality commonness
is element or compound:
- subclass-of disjoint-union list-item element list-item compound

Purpose of nominalization: allow choice of appropriate meaning of the component word in question: if there were entities for "meaningless", they would have to duplicate all the meanings of "meaning", or if not all, then many of them. If we instead say "lacks meaning", we may choose the appropriate sense on the "meaning" node.

The result is very unnatural, though. It takes a lot of learning, I guess.

--Dan Polansky (talk) 18:04, 23 December 2022 (UTC)

Amphiboles

I think I provided a clear reason for my revert in the edit summary (Special:Diff/1795182314). In your subsequent reverts and edit comments you responded with pure argumentum ad hominem. This is neither constructive nor collegial in any way. Anonymous editing is not prohibited in Wikidata. You suggest the talk page should be used (Special:Diff/1795182597). Surely, if the motivation behind an edit is unclear or if there is an actual editorial disagreement, then it needs to be discussed on the talk page. However, in this case you should have responded on the talk page first, instead of making an unsubstantiated revert, if you think that my edit summary really wasn't clear. Adding sources is appreciated, but adding a source does not in itself mean that your edit is correct, as sources can be misused.

The comments that you added to the given P31 statement indicate that you quite clearly misunderstood what "mineral supergroup" is. It is a metaclass, a rank in mineral classification (other ranks being species, subgroup, class etc.). This is indicated in the mineral supergroup (Q3977918) item; it is not indicated that this item is for a group of concrete objects, as you suggest in your commentary.

As for the removal of the P279=mineral statement: as said, the item is already set as a subclass of (subclasses of) this class, meaning that amphiboles being a mineral subclass is already indicated by means of subsumption (see further explanation here). This is why parent classes from the subclass tree other than the direct one usually aren't repeated for subclasses in Wikidata.
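
As an illustration of the subsumption argument above, here is a minimal Python sketch that reports direct parent classes which are already implied through another direct parent. The subclass chain used for amphibole is a made-up example (the intermediate classes are assumptions, not the actual statements on the item), and the function names are hypothetical.

```python
# Hypothetical subclass-of (P279) statements; the intermediate classes
# are assumptions chosen only to illustrate the idea.
SUBCLASS_OF = {
    "amphibole": {"inosilicate", "mineral"},
    "inosilicate": {"silicate mineral"},
    "silicate mineral": {"mineral"},
}


def ancestors(item, subclass_of):
    """Return every class reachable from item via one or more subclass steps."""
    seen = set()
    stack = list(subclass_of.get(item, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return seen


def redundant_parents(item, subclass_of):
    """Direct parents of item that are already implied via another direct parent."""
    direct = subclass_of.get(item, set())
    redundant = set()
    for parent in direct:
        for other in direct - {parent}:
            if parent in ancestors(other, subclass_of):
                redundant.add(parent)
    return redundant


print(redundant_parents("amphibole", SUBCLASS_OF))
```

With this toy data the direct statement "amphibole subclass of mineral" is reported as redundant, since "mineral" is already reachable through "inosilicate".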

Identifiers that you for some reason restored were obviously mismatched (matching Q2628631 and Q17159). I hope you don't suggest that I need your permission to fix obvious errors.

You also quite clearly misuse comment (DEPRECATED) (P2315). As indicated on the property page, this property is (or was before deprecation) for use on pages of other properties. If another appropriate property does not already exist, then a new one should be created, instead of repurposing an existing one. But normally talk pages are used for this particular purpose. 2001:7D0:81FD:BC80:2C13:8104:D438:96B3 10:55, 24 December 2022 (UTC)

Please discuss on the talk page of the disputed item and we can see what to do about the item. Removing well sourced statements is a non-option, whatever other merits your edits may have. --Dan Polansky (talk) 10:58, 24 December 2022 (UTC)

I made a mistake in reverting your two corrections. In the meantime, I have undone my mistake. I am suspicious of anonymous editors. They show disrespect for identification. --Dan Polansky (talk) 11:01, 24 December 2022 (UTC)

My use of "comment" is a use, not a misuse, as I explained on the talk page of the property. Those who oppose its use must do much better in way of argumentation. --Dan Polansky (talk) 11:02, 24 December 2022 (UTC)

My comment is mainly about various aspects of your conduct. This is why I posted here. As for anything specific to amphiboles, I'm happy to discuss it on the talk page of the relevant item.

removing well sourced statements is a non-option – as said and as explained in the case of this particular P31 statement, you quite obviously misused the source. I don't think that keeping the statement (or its qualifier/source) is justified in such a case.

I wasn't aware of your comment/proposal on the property talk page. Nonetheless, your proposal hasn't been approved, properties generally aren't repurposed, and this is not the original approved use case. And in any case, if you look around a little in Wikidata, you will see that item pages (as opposed to their talk pages) generally aren't used for such commentary in any form. 2001:7D0:81FD:BC80:2C13:8104:D438:96B3 11:15, 24 December 2022 (UTC)

Disputes about individual items need to be either on the item talk pages or in another relevant public venue. "misused" is not the term; "misunderstood" is the term. This you can explain in a revert edit summary or on the item's talk page. --Dan Polansky (talk) 11:18, 24 December 2022 (UTC)

I now see this is from an IP different from the one who made the reverts. Super rude, if you ask me. The idea of identification is rejected by the person behind the IP, an ultimate irony on Wikidata project. --Dan Polansky (talk) 11:20, 24 December 2022 (UTC)

Hm, when did I suggest otherwise? Above, in the first post, I linked my revert and I explicitly said it was my revert.

I already agreed that it's better to discuss anything specific to amphibole (Q17159) on its talk page, and I'm happy to respond over there to any relevant question.

Not sure what you imply with "misused" not being a term. It's the past tense of the verb misuse (wikt:misuse#Verb). 2001:7D0:81FD:BC80:2C13:8104:D438:96B3 11:32, 24 December 2022 (UTC)

I think what we see here is an abuse of power. The power consists in the powerful entity not being properly identified while the entity being subjugated is properly identified. Interesting. --Dan Polansky (talk) 11:35, 24 December 2022 (UTC)

Ultimate conceptual mind map

Working on some of the upper or middle parts of the ontology, I can't help thinking the ontology part of Wikidata serves as a kind of ultimate mind map. Each entry is a tree structure of nodes, nodes being concepts and their relations, and apart from concepts, individual entities. Forcing nodes to be concepts perhaps makes it a bit more like a concept map, though. However, the strictures of having to pick from available concepts add tremendous value, something a concept map, which is something on a sheet of paper, can never achieve, since a concept map uses words to identify concepts, and that works rather poorly. --Dan Polansky (talk) 11:20, 25 December 2022 (UTC)

Identifying concepts via defining statements

Concepts are to be identified by 1) genus, 2) differentia, and these would include 3) contextual information, including field that uses the concept. As per Help:Description#Descriptions_are_not_definitions:

"The concept represented by the item is defined by the statements not the description. If you need to distinguish an item from another item, add the right statements with sources before you add a special description. Also when searching items don't rely on the description's correctness, check the statements to ensure you found the right item"

Dan Polansky (talk) 14:46, 25 December 2022 (UTC)

LISP macros

Am I writing LISP macros? It seems to be the case. Wow. So that's the connection between LISP and artificial intelligence? --Dan Polansky (talk) 15:16, 25 December 2022 (UTC)

"LISP macros" is probably not the right term. That said, the hyphenation that one could use for various purposes including concept disambiguation is reminiscent of LISP. And since LISP was a tool for "artificial intelligence", one wonders whether this whole triplet idea has something to do with LISP. --Dan Polansky (talk) 12:07, 29 December 2022 (UTC)Reply

I find it interesting that you mention LISP, because the comment (DEPRECATED) (P2315) dispute reminded me of the Interlisp (Q4386338) "*" comment macro that is actually inserted (without being evaluated) into the code structure itself, like "(* SOME COMMENT)", in contrast to the conventional ";some comment" <EOL> syntax that leaves the comments outside the code already at parse time. But I guess that's just a coincidence.

I never really used Interlisp myself except on some occasion when I was confined to a Xerox Daybreak (Q8043909) workstation (Xerox 1186) in the 1980's, but my main experience is with Maclisp (Q1882973), and later Common Lisp (Q849146). I have envisioned using some LISP-like data structure for 3-D object modeling (ray tracing applications), but never got around to actually implementing it. Having previously used LISP for code prototyping, I now primarily use Python for that.

As to whether the Wikidata triples were influenced by LISP, at least searching google for "wikidata triples lisp" doesn't suggest a strong connection. I guess that's something you may have to ask Denny Vrandečić (Q18618629) about (if he has time to answer, or you can find some information already recorded). --SM5POR (talk) 05:50, 30 December 2022 (UTC)Reply

being a superclass

I should not create an entity for being a superclass. It is a being, but it is not a quality. If I could create it, I could say X has-quality being-a-superclass of X. There are workarounds. I can say has-quality superclass of X. That is a hack since superclass is not a quality. --Dan Polansky (talk) 16:10, 25 December 2022 (UTC)Reply

It does not seem to be a quality, since it takes an argument, and that is something qualities usually do not seem to do. --Dan Polansky (talk) 16:58, 29 December 2022 (UTC)

I'm not familiar with your terminology ("quality", "argument"); some practical examples could be helpful to illustrate exactly what you are referring to. To me, a superclass is just the opposite of a subclass, and the latter property is sufficient to tell what is a subclass of what; no corresponding superclass statement required (in SPARQL you can even write "?y ^wdt:P279 ?x" which is equivalent to "?x wdt:P279 ?y"). --SM5POR (talk) 09:25, 30 December 2022 (UTC)
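
The same idea as the inverse SPARQL path mentioned above can be sketched in a few lines of Python: the "superclass of" direction is obtained by reading the stored subclass statements backwards, so nothing extra has to be stored. The pairs below are toy data chosen for illustration and do not claim to reproduce the actual statements on these items; the function names are hypothetical.

```python
# Toy (subclass, superclass) pairs standing in for P279 statements.
# They are illustrative only, not the actual Wikidata statements.
SUBCLASS_PAIRS = [
    ("diamond", "mineral"),
    ("quartz", "mineral"),
    ("mineral", "substance"),
]


def direct_superclasses(cls, pairs):
    """Forward reading, like '?x wdt:P279 ?y' with ?x bound to cls."""
    return {sup for sub, sup in pairs if sub == cls}


def direct_subclasses(cls, pairs):
    """Backward reading, like '?y ^wdt:P279 ?x' with ?y bound to cls."""
    return {sub for sub, sup in pairs if sup == cls}


print(direct_superclasses("diamond", SUBCLASS_PAIRS))
print(direct_subclasses("mineral", SUBCLASS_PAIRS))
```

Both functions scan the same list of pairs; the only difference is which side of the pair is matched, which is why a separate superclass statement adds no information.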

Use of workarounds

To get things done, I am forced to use workarounds, such as free textual comments, although other workarounds are preferable, especially use of qualifiers. The ontology in Wikidata is in many places broken or largely undeveloped, suggesting that many editors have little idea of what Wikidata (Wikiconcepts) can do for concept identification and disambiguation. What is usually there is genus, but differentia are usually missing as structured statements, and when they are there, they do not trace to sources. Often, the genus is wrong. --Dan Polansky (talk) 16:15, 25 December 2022 (UTC)Reply

Workarounds are fine (to a reasonable extent), as long as they don't contradict community consensus. Even when there is a lack of community consensus on an issue, it's often helpful to follow established conventions in order for other editors to understand what you are doing, unless those conventions are grossly inadequate or even misguided. I have seen many statements and edits that look totally bonkers (and accidentally made one or two of those myself); they may however be the result of unintentional (and undiscovered) typos, poor user interfaces (see User talk:Marsupium#A long time ago in a property far, far away... for a theory of mine in a particularly odd case), bulk robot edits, or blind copying of earlier bad habits. Many "descriptions" are nothing more than over-simplified renderings of the subject item's instance of (P31) statement, or even completely empty.

There are indeed also genuinely different (and conflicting) opinions on what constitutes a class vs an instance (such as "only physical objects can be instances", "whenever there are two or more objects of something, it's a class", "a subclass of X cannot simultaneously be an instance of Y", "when in doubt, make X an instance of Y as well as a subclass of Y just to make sure", "you cannot have an instance of an instance", "let's make everything a class to keep it simple" etc).

But I also believe there are numerous editors doing a tremendous job without ever having cared about things like "genus" or "differentia" (and neither had I, until you made me look them up, nor had my spell-checker with respect to the latter word).

In any case, whenever there is a dispute over what approach to take, community consensus trumps technical excellence, every time, with no exceptions. A community project cannot work in any other way. But with patience and persistent efforts, the community may eventually be persuaded to embrace and support the superior tools and methods. In the meantime, we'll simply have to deal with the rough edges and obstacles popping up everywhere. --SM5POR (talk) 11:00, 30 December 2022 (UTC)Reply

Is there any Wikidata policy about what "consensus" means here? I ask since, in Wikipedia, "consensus" is defined as arising from arguments regardless of numerical superiority, which contradicts the general meaning of "consensus". --Dan Polansky (talk) 11:54, 30 December 2022 (UTC)

I agree that using the word "consensus" may be a bit sloppy, and it could even be me who is primarily to blame for the misuse in this thread. I don't think there is a single formal procedure that spans all of Wikidata, but wherever any procedure is required, such as when a property is proposed for creation, it seems sufficient to have majority support from those few who care enough to react to the announcement of the proposal and participate in the discussion. If both Support and Oppose present substantial arguments and the numerical difference is small, it's possible that other factors than numerical strength will determine the issue. The procedure itself is thus subject to case-by-case considerations; I haven't seen anyone challenge an outcome of a discussion on strictly formal grounds, but I haven't engaged myself in too many discussions either.

I believe there will hardly ever be a "last chance"; if you find an issue that was settled years ago but you strongly disagree with what was decided, you are free to reopen the issue and propose a change.

My problem so far hasn't been to persuade a majority of opponents to change their minds and support my proposals, but to get anyone to even bother to comment on them! It's not that I actually need formal approval to go ahead with my ideas, but I want to know what at least one other person thinks of them before I spend hours doing something the community may later reject, either for contradicting some policy I accidentally failed to learn about, or out of mere ignorance of the glorious prospects I have in mind.

And by "one person" I do mean anyone, including you. Feel free to comment on any of these issues:

You may also want to take a look at the following and let me know whether you think I got them right (I do believe I did, but I'm not seeing my efforts being scrutinized by anyone whatsoever):

Talk:Q26085352#Class or set
Property talk:P5307#Type constraint limited to classes (typo in comment: I meant "qualifiers", not "constraints")
Property talk:P7510#Property scope
Property talk:P1073#Useful as qualifier

--SM5POR (talk) 19:13, 30 December 2022 (UTC)

@Dan Polansky, @ChristianKl, @ArthurPSmith: With thousands of human editors and robots assuming that existing properties are (intended to be) used in accordance to specifications, I find it a bit risky to "kidnap" one of those properties and repurpose it; it's not like there are any automatic signs saying "warning: the floor is wet" or similar... I hadn't read your superclass proposal before and seriously thought you had misunderstood the purpose of model item (P5869) and a few other properties that I have now deprecated (while leaving the references in place for you to recover them).

But there are the sandbox items and properties which are meant for experimental use, though I'm not entirely sure under what conditions (I'm using Sandbox-Lexeme (P5188) myself for a similar purpose). For your purpose I would suggest Sandbox-Item (P369), which has the same datatype. You will have to tolerate others simultaneously using the same property for other experiments, but that shouldn't be much of a problem as long as everybody understands that sandbox properties are off-limits for regular, long-term use, constraints may change unexpectedly etc. To allow easy identification of your particular use I suggest adding a qualifier, as in linguistic unit (Q11953984): Sandbox-Item (P369) = morpheme (Q43249), qualified by subject has role (P2868) = superclass (Q21652121), and you could add Wikimedia community discussion URL (P7930) with a link to the property proposal or something if you want others to learn about it (not necessarily to every statement; once per key subject item and property is certainly enough). --SM5POR (talk) 16:11, 2 January 2023 (UTC)

Adjectival labels in qualities

I found such a case in the concept of "polysemy". It makes sense: adjectives in natural language are translated into qualities. I do not know what the policy is or how widespread the practice is. Some editors seem to think that the labels do not need to be an exact match, which makes little sense to me. But the adjectival label is in fact an exact match, conceptually speaking. It is just that to translate adjectives into ontologese, we need to turn them into nouns denoting qualities. But why would labels not help us do just that? --Dan Polansky (talk) 17:29, 25 December 2022 (UTC)

I believe there are policies regarding labels, but I haven't paid much attention to them. Property labels may well be more critical than regular item ones.

With the addition of the lexeme database, I think the significance of item labels can be lowered further. In my opinion, the main purpose of labels is to serve as convenient reminders to Wikidata users and application developers of what a particular item or property is all about, and if they need further clarifications, they can look at the descriptions and aliases. From the perspective of a SPARQL client and its end-user, the labels ideally shouldn't matter any more than, say, the name of an internal software function or subroutine. I'm concerned however that some Wikipedia infoboxes and similar are dependent on the property labels for rendering text to be seen by Wikipedia end users, as I have heard rumors about Wikipedia editors asking for properties such as "subclass of" be relabeled to "type of" since that expression better reflects the language familiar to an average Wikipedia reader!

As the initial set of items defined in Wikidata came from Wikipedia articles in English and various other languages, the labels were simply copied from the article titles, for which reason most of them are nouns (it might be interesting to do a statistical survey of labels to see what parts of speech they represent today, and to what extent).

With new items being defined without a history as Wikipedia entries, we have to find a labeling standard anyway, and your suggestion to use adjectives for qualities seems like a reasonable one. But where is it stated that the label must be a noun? Looking at polysemy (Q191928) I see that it has "polysemic" and "polysemous" declared as aliases. The corresponding Wikipedia entry also has the noun as its label, supposedly because it's grammatically easier to describe a noun than an adjective, while the adjective is geared towards using it in some other context (switching parts of speech in either context is possible but may sound weird, like "Enhance your calm, John Spartan" in Demolition Man).

For advanced linguistic applications such as text recognition, production and translation I don't look at labels, but at the lexeme section of Wikidata, which is designed to handle any parts of speech, word forms or senses with all their grammatical features. There is a property item for this sense (P5137) that may form the crucial link between lexemes and their semantics as defined by those items.

The lexemes polysemous (L339382), polysemously (L197388), polysemically (L197387) and polysemantically (L197386) (one adjective and three adverbs) exist, but none of them have any senses defined yet, and therefore no item for this sense (P5137) claims. Looking for adjectives in any language, I find bisexual (L291659) in Basque, and as expected it links to bisexuality (Q43200), with nouns as labels.

As long as there is a one-to-one mapping between an adjective and a noun, I assume a single item will suffice to provide the semantics necessary for an automatic text production system to pick the right word form.

Do you need the adjective spelled out for any particular reason, or is it a matter of cosmetics? --SM5POR (talk) 13:48, 30 December 2022 (UTC)

I have no critical need for inclusion of adjectival labels, and if this is forbidden, I don't mind very much. I just find it a neat idea for qualities, to reinforce their nature. I think it is very useful to look at labels from the point of view of machine translation: machine sees a label in text and needs to determine which entity/concept corresponds to that label. What the machine sees in the text is usually an adjective, not a noun, as far as adjectives are concerned. Wikidata may not necessarily be used for that purpose, but I think this machine-translation way of looking at Wikidata brings a useful perspective to improve quality of Wikidata. --Dan Polansky (talk) 14:02, 30 December 2022 (UTC)Reply

Again, if you are serious about translations or other linguistic stuff, don't look at labels, look at lexemes. The labels are structurally limited and inadequate for automated processing. I regard them as shorthand reminders, like handwritten post-it notes hastily slapped onto boxes and binders to provide hints about what's inside. Using them for machine translation will be like writing a novel using only refrigerator magnet words. The designers of Wikidata may have had higher ambitions for them originally, but they really pale in comparison to the lexeme database which was introduced years later. --SM5POR (talk) 11:58, 31 December 2022 (UTC)Reply

The above understanding does not seem to match the label guideline, by saying "handwritten post-it notes hastily slapped". The guideline says the label should be the most common one for the subject, and that does not suggest haste or carelessness, but rather an objective principle of choice. And when tracing a statement to a source, the source should ideally use the same labels for the subject and the object of the statement, and when it does not, that's what qualifiers are for, including "subject named as" and "object named as". Labels may work especially well for English with its limited inflection of nouns: there's the singular, the plural, and that's it. --Dan Polansky (talk) 12:20, 31 December 2022 (UTC)Reply

Well, it's the usual problem with analogies: I'm not directly equating labels with post-it notes; I'm trying to describe the difference between labels and lexemes as congruent with the difference between post-it notes and a dictionary (without actually spelling out that latter difference, thereby making it difficult to get my point across).

True, there are conventions for how labels are written (singular, lowercase initial etc), but the purpose of that is to make it easier for human editors to interpret them. I encountered some classes labeled in plural like mayors of Camden County, New Jersey (Q50056268) which I consider utterly confusing; are they intentionally referring to the class of a collective (a group) of mayors, or simply to the class of mayor? Because an instance of the mayor group class is a single group of mayors, not a single mayor, much like an instance of ethnic group in Indonesia (Q83828) is a single ethnic group, not a single Indonesian.

Human language is so much richer than the restricted vocabulary of Wikidata labels. Those labels will not tell you what their plural forms, genders, cases or tenses are; all that is found in the lexeme database. There is still a lot missing, such as rules of grammar telling which cases or tenses to use. And by asking for some labels to be expressed as adjectives, you will lose the corresponding nouns instead ("Be calm, John Spartan").

Ah, you were talking about English labels only? Then you may perhaps find the singular form of nouns sufficient. But how do you find the proper endings when translating to German?

There is actually too much linguistic information like "subject named as" and "female form of label" cluttering up Q-space, but it probably got started before the lexeme entries were designed. I want linguistic data confined to the lexeme section, where it can be fully documented and processed for several hundred different languages. --SM5POR (talk) 19:41, 31 December 2022 (UTC)Reply

All labels to be attested and ideally common

Help:Label: "The label is the most common name that the item would be known by", boldface mine. This suggests that frequency of use trumps other considerations; GNV could be used to resolve label disagreements. Other aid in label disagreement resolution could be labels used by authority control sources (thesauri) to which the item traces. Dan Polansky (talk) 19:21, 25 December 2022 (UTC)Reply

And: " Reflect common usage Because the aim is to use the name that an item would be known by to the most readers, labels should reflect common usage. When it comes to scientific names, for example, of a species, labels should use a species' common name, however items must always also have the scientific name listed as Alias. If a species has several common names, a reasonable effort should be made to determine which of them is the most commonly used, e.g. by consulting references. The other names should be placed in the alias field along with the scientific name. If a species does not have a common name, the scientific name can be used as the label. Note that individual breeds do not have scientific names. Every breed of dog, from the Siberian Husky to the Chihuahua, is part of the species Canis lupus familiaris. " --Dan Polansky (talk) 19:23, 25 December 2022 (UTC)Reply

A correction: if no label can be attested in a language, a transparently named unattested label seems to be better than no label, since otherwise the user of that language has no way to select the entity in their native language, creating a disincentive to participation in languages other than English. Maybe the label does not even need to be transparently named: it suffices that it is easy enough to understand to take the user from the label to the entity in their mind. --Dan Polansky (talk) 19:35, 29 December 2022 (UTC)

I had reason to pick labels for two new items when splitting logic family (Q173359) into two distinct topics covered in the same article. Since I hadn't yet read the rule you refer to here, I tried to come up with labels that would be non-ambiguous and easy to understand, and decided to use "logic circuit technology family" and "logic circuit design methodology", while adding "logic family" as an alias to both. I don't know which expressions are most common in this industry, but I would guess it's "logic family". I have simply left my constructed labels in place for lack of authoritative sources. --SM5POR (talk) 14:04, 30 December 2022 (UTC)Reply

Teamwork?

I think you (and sometimes others) raise a number of important points above, some of which I agree with, while others not. Rather than dive right into each one of them separately, I wonder if you would be interested in some systematic teamwork, once we find out where we agree and can avoid crossing each other's path?

Your use of comment (DEPRECATED) (P2315), which I have also considered (but not embraced) made me look a bit further and find Wikidata:Property proposal/see talk page discussion at. That in turn led me to amend the constraints for Wikimedia community discussion URL (P7930), which I hadn't seen before but is just what I have been looking for. Thanks (even if it was unintentional of you)!

When I'm in Rome, I try to do as the Romans do. --SM5POR (talk) 09:57, 28 December 2022 (UTC)Reply

I think you are right. --Dan Polansky (talk) 10:31, 29 December 2022 (UTC)Reply

Actually, "comment" is a bad name for the purpose; "note" would be better, since "comment" points to the concept of opinion, as in "comment is free". I have some other uses that I find worthwhile: 1) textual statement, 2) textual note under a trace to a source, indicating e.g. which sense from a dictionary was meant, e.g. "1a". It seems to me that Wikidata is poor in generics. That may be a good thing in part, but it also has disadvantages. Of course, there could be a qualifier "dictionary sense", having text as its object/2nd argument. The problem is that whenever a need arises, one first has to go through the bureaucracy of approving a new property or qualifier, and that creates a barrier to contribution and productive work. But I recognize that there are deep advantages in avoiding generics. --Dan Polansky (talk) 10:55, 29 December 2022 (UTC)

I don't know you, and I realize I may be preaching to the choir, but what follows is my general take on this community effort. My apologies if all this is old news to you. That said, here goes:

Technical excellence is one thing, while massive collaboration is another, and these two qualities may be difficult to combine. I usually work alone, and I can be a bit of a perfectionist, making me spontaneously disapprove of a lot of things I see in Wikidata. But I also try to be a diplomat and not visibly write everything that I dislike off as substandard, at least not until I have learned what the community at large thinks about it.

About generics: True, Wikidata has plenty of special purpose properties that are the result of individual needs rather than some coherent plan; one of my main gripes with the property system is the insane number of external reference properties, since they might just as well have been implemented using a single external reference property, stating the external resource as its main value item and the ID as a qualifier. But this is exactly why the property approval procedure is necessary; without it we would probably have a million properties, most of them useless or at least redundant several times over.

However, the large total number of properties (now around 10,000) is actually a feature, not a bug, when you view Wikidata as a system subject to organic evolution. A high number of participant editors makes the wisdom of the crowd (Q1753943) phenomenon a significant factor, and once in a while, some individual idea among thousands may actually turn out to be the best that has happened since sliced bread.

My reaction, when I first experienced one of my edits being reverted, was one of surprise, followed by curiosity: What did I do wrong? Can I learn something from this? You can read my initial account of that experience at User:SM5POR#Nematode infections (I eventually gave up on fixing the original problem, and went on to more interesting tasks).

I recently added my personal "agenda" for Wikidata to the beginning of said page. For the moment I'm working on the of (P642) qualifier deprecation effort; that meaningless "property" was a mistake in the early days of Wikidata. I expect comment (DEPRECATED) (P2315) to eventually be deleted as well; you will find my opinion on that property at Property talk:P2315. To clarify, I do see the need for a "comment" field, but in the interest of preserving a permanent record of all comments regardless of what happens to the item, I oppose placing that field inside the item itself. Maybe Wikimedia community discussion URL (P7930) isn't perfect, but it's a step in the right direction.

Feel free to browse my user pages, and please let me know if you find something you disagree with, so that I may either learn that I have been wrong or improve my explanatory skills. Meanwhile I'll take a look at some of the issues you have posted on this talk page of yours. --SM5POR (talk) 16:24, 29 December 2022 (UTC)Reply

Thank you so much for your kind interest. I'll look closer into what you wrote and see whether I have more to add or ask. --Dan Polansky (talk) 16:29, 29 December 2022 (UTC)Reply

@Dan Polansky: Dan, I want to apologize for getting in your way even as that was precisely what I had hoped not to. Now I want to explain why I even bothered to contact you in the first place:

I was trying to resolve some subclass loops that I have long suspected contribute to poor performance and timeouts when doing SPARQL searches involving wdt:P279* property paths (I still don't know if that theory is correct, but I thought it might be worth another try). The first loop I found involved the item word (Q8171), and when I tracked it down in the edit history I got the initial impression that you had made a correct edit which had then been incorrectly reverted by another editor I know.
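A minimal sketch of how such subclass loops could be detected offline, assuming the subclass of (P279) statements have already been exported into a plain dictionary. The function name and the toy items "A" to "D" are hypothetical and stand in for real item IDs; nothing here queries Wikidata.

```python
from typing import Dict, List, Optional


def find_subclass_loop(subclass_of: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one loop as a list of items (first item repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for parent in subclass_of.get(node, []):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # parent is already on the current path: a loop is closed
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                loop = visit(parent)
                if loop:
                    return loop
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(subclass_of):
        if colour.get(start, WHITE) == WHITE:
            loop = visit(start)
            if loop:
                return loop
    return None


# Toy data (hypothetical): A -> B -> C -> A forms a loop, D merely points into it.
looped = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
acyclic = {"A": ["B"], "B": ["C"], "D": ["A"]}

print(find_subclass_loop(looped))
print(find_subclass_loop(acyclic))
```

The recursion is fine for toy data; a real subclass tree may be deep enough to need an iterative traversal instead.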

I therefore planned to approach the other editor about this apparent mistake, but first I wanted to find out who you were and what past experience you had. I found your statistics with English Wikipedia truly impressive (and still do); it looked like you had spent 30-40 times as much work on multiple Wikipedia projects over the past decade as I have done on Wikidata since I discovered this project less than three years ago, and my own work on both Swedish and English Wikipedia together pales even when compared to what you have accomplished on Wikidata during the few months you have been active here. I thought your contributions and skill would be of great use on Wikidata.

Then I found out about your unorthodox use of the comment (DEPRECATED) (P2315) property and the resulting fallout with the admins. "Unfortunate" was my personal reaction, but such things happen sometimes. It shouldn't have to be a disaster if you could simply learn from this experience how Wikidata is best developed, but I feared it might take a long time if you had nobody to conduct a low-key dialogue with, and that's when I decided to step in and introduce myself above.

Thinking back over past decades, I believe I have eventually managed to understand my own limitations in life; that certainly wasn't always the case. One thing I know I'm bad at is making assessments about personal chemistry, especially in an on-line setting. These days I simply try to be friendly and hope for the best; if it doesn't work out I can always walk away.

Since your talk page is technically a public forum, I will not enter into a detailed analysis of anyone's behavior, but I hope you can read between my lines what my general impression is. The mistake I'm most concerned with is however my own; I don't think I contributed very much if anything at all, and I fear I might even have made matters worse. That said, I think I have one trait that I'm proud of: I never, ever, really give up hope. Setbacks happen, true, and I may decide in some cases that my best option is to simply walk away, but that feeling of hope seems to stick to me like an alien creature from a 1950's science fiction/horror movie clutching to my back. And I'm not really walking away this time either, but I do think I should stay out of your way for the time being. You are still welcome to contact me, and I will be happy to reply, but from now on the initiative will be yours.

And may the Source be with you! --SM5POR (talk) 12:09, 5 January 2023 (UTC)Reply

I fear I may have woken false hopes in you. I may be of less help to your objectives than you hoped. I hope you may have gained something nonetheless, perhaps at least a little. Your expression of interest was an act of kindness, and is appreciated by me. I apologize if I created a wrong impression by my actions. --Dan Polansky (talk) 14:35, 5 January 2023 (UTC)

Dictionaries as sources

It is unclear whether dictionaries are good sources for ontology work, since the ontology implied in dictionary definitions is all too often broken and inconsistent. Multiple dictionaries seem to have the definition of "entity" broken, or maybe they have some reasons to think that existence is either a quality or property of "entity". --Dan Polansky (talk) 12:08, 29 December 2022 (UTC)

I don't consider dictionaries authoritative as sources to resolve ontology issues other than indirectly when the need arises, say by looking up a city to find out what country it's located in and similar basic facts, then translating that to the Wikidata ontology for geographical objects in order to make it an instance of (P31) the appropriate class. In particular, the abstract concepts close to entity (Q35120) should be discussed in Wikidata:WikiProject Ontology and not left to depend on arbitrary dictionary definitions for their place in the tree. There are also traps caused by multiple language-specific senses with subtle differences not necessarily recognized by conventional sources; see class (L3934), entity (L1393), event (L4036), object (L5848), subject (L5338) and property (L3826) for examples (comparing the terminology in different languages might be helpful though; I'd really love to see a cross-language table of all the senses associated with the English-language lexemes tied to the fundamental items).

I began drawing a new sub-tree surrounding entity (Q35120) some two years ago simply because I considered the existing ad-hoc tree a mess with maybe a hundred different sub-classes directly attached to entity (Q35120) or to some immediate subclass. I like hierarchical data structures and I try to avoid redundancy whenever possible, but I'm not entirely confident with the philosophical terminology and wouldn't mind getting a second opinion on it. I admit I haven't paid attention to what references have been cited for the instance of (P31) or subclass of (P279) claims. --SM5POR (talk) 19:20, 29 December 2022 (UTC)Reply

While they are not entirely authoritative, especially the English dictionaries often do a decent ontological job at least in some parts of ontology. For instance, they meticulously state "quality or state" when the entity in question is in fact two entities, one quality and one state. And since they probably also employ some ontologists, especially M-W given their practice of delegating some senses to other words, which would be very hard to do without an electronic ontology, chances are what they are doing at the top of the ontology is at least worth looking at and thinking about.

As for "a city to find out what country it's located", that's not a matter for ontology, from what I understand. Ontology is not about concrete objects. --Dan Polansky (talk) 19:31, 29 December 2022 (UTC)Reply

Well, then you trust dictionaries more than I do, but I'm fine with that. And of course locating a city in a world gazetteer isn't ontology; that was simply my way of describing how I have so far disregarded dictionaries in trying to understand the Wikidata ontology. I may well have been overly cautious in that respect, and your view may help me get a better understanding of which sources I can trust for what kind of work.

But ultimately, the most authoritative source on what the Wikidata ontology should look like is the Wikidata project itself. --SM5POR (talk) 12:10, 31 December 2022 (UTC)Reply

One should perhaps take dictionaries with a lot of reservation, far from trusting them blindly, for ontology purposes. My point is rather that one probably does not need to ignore dictionaries altogether but rather take them into account to some limited extent. As for "the most authoritative source on what the Wikidata ontology should look like is the Wikidata project itself", that sounds like a circular authority principle, something fundamentally unworkable. What Wikidata ought to do, in my view, is document reliable sources via statements, and then make its own "mind", as it were, as to which statements to accept and which to deprecate. Thus, Wikidata should depend on sources and yet have a quasi-autonomy, by having the meta-authority to adjudicate between authorities. --Dan Polansky (talk) 07:42, 11 January 2023 (UTC)Reply

At most one superclass

It seems to me that ideally, each class should have at most one superclass, since the superclass statement should really be one of definition, not an extra-definition statement. When there are multiple superclasses, it may mean that the superclass is disputed in good sources. Then, one of the multiple superclasses should be deprecated or the main superclass (genus) should be marked in green. These are assumptions, not certainties. They rest on some other assumptions: 1) entities in the ontology part are controlled via definitions; 2) definitions are created via structural statements, not via textual descriptions; 3) classes should be defined using one genus and only one. It also seems that allowing multiple superclasses would prevent machines from doing more robust consistency checking. --Dan Polansky (talk) 14:46, 29 December 2022 (UTC)
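A minimal sketch of the kind of machine check this would allow, assuming superclass statements are available as (subject, superclass, rank) tuples. The function name, the rank strings and the toy classes are illustrative assumptions, not actual Wikidata content; deprecated statements are ignored, following the idea that a disputed superclass gets deprecated.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def classes_with_multiple_superclasses(
    statements: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Map each class to its non-deprecated superclasses when there is more than one."""
    supers = defaultdict(set)
    for subject, superclass, rank in statements:
        if rank != "deprecated":          # assumption: deprecated claims do not count
            supers[subject].add(superclass)
    return {s: sorted(p) for s, p in supers.items() if len(p) > 1}


# Toy statements (hypothetical labels, not taken from Wikidata).
toy_statements = [
    ("houseboat", "boat", "normal"),
    ("houseboat", "house", "normal"),
    ("rowboat", "boat", "normal"),
    ("amphibious car", "car", "normal"),
    ("amphibious car", "boat", "deprecated"),
]

print(classes_with_multiple_superclasses(toy_statements))
```

Under the single-genus assumption, every class reported by such a check would need one superclass deprecated or one marked as preferred.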

I began writing a comment to this, but kept rewriting it as I couldn't really figure out what you were aiming at. Could you please provide some examples of how you would like instance of (P31) and subclass of (P279) links to connect to items, compared to how said items are linked today? Can an item have multiple instance of (P31) statements, but not more than one subclass of (P279) statement? --SM5POR (talk) 12:20, 31 December 2022 (UTC)

Avoid using qualifiers to make statements

Ideally, one should not use qualifiers to make statements. Such use may be acceptable as a temporary workaround, but not as an ultimate good. Of course, one may use qualifiers to make meta-statements.

Thus, to say X is pair of Y and Z, one should not use "of" as qualifier. Instead:

X subclass-of pair
X subclass-of-arg1 Y
X subclass-of-arg2 Z

Or something of the sort. Whether Wikidata has all the tools required for that remains to be seen. --Dan Polansky (talk) 16:56, 29 December 2022 (UTC)Reply
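A minimal sketch of how the three statements above could be read back as a pair, assuming they are stored as plain (subject, property, object) triples. The property names subclass-of-arg1 and subclass-of-arg2 are the hypothetical ones from the listing, not existing Wikidata properties, and the helper names are made up for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# The three statements from the listing, with X, Y and Z as placeholders.
triples: List[Triple] = [
    ("X", "subclass-of", "pair"),
    ("X", "subclass-of-arg1", "Y"),
    ("X", "subclass-of-arg2", "Z"),
]


def superclasses(all_triples: List[Triple], subject: str) -> List[str]:
    """Classes the subject is directly declared a subclass of."""
    return [o for s, p, o in all_triples if s == subject and p == "subclass-of"]


def components(all_triples: List[Triple], subject: str) -> Tuple[str, ...]:
    """Arguments of the pair, ordered by the argN suffix (fine for up to nine arguments)."""
    args = {
        p: o
        for s, p, o in all_triples
        if s == subject and p.startswith("subclass-of-arg")
    }
    return tuple(args[name] for name in sorted(args))


print(superclasses(triples, "X"))
print(components(triples, "X"))
```

No qualifier is involved: all three facts are main statements about X, and X is a subclass of pair only, not of Y or Z.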

If by any chance you are referring to the qualifier of (P642), rest assured it's on its way out. It was a mistake to create it in the first place, since it doesn't have any language-neutral semantics and has turned out impossible to translate to, say, Chinese. "Of" is an English preposition with maybe a dozen different senses. We are working on defining new properties (not only qualifiers) to replace existing of (P642) constructs.

That said, I'm not sure I understand your point here. The triples linking subject item A to object item B only, without the help of qualifiers, are pretty limited in their expressiveness, as both A and B have to be notable (i.e. they should map to some real-world concept of general interest).

What do you mean by "pair of Y and Z"? Is it a subclass of Y and a subclass of Z simultaneously (AND, intersection), or is it something else? To cite an example claim in wikitext, you can use Template:statement like this:

Maybe even better use some real example items in place of generic variables like X, Y and Z.

The neat thing about qualifiers is that you can use them to make statements about non-notable items, say, a cable connector on a particular piece of electronic equipment. Example:

IBM Personal Computer (Q202712) has part(s) (P527) VGA connector (Q539719)
- connects with (P2789) computer monitor (Q5290)

Without the qualifier, you would have to create a separate item "The VGA connector on an IBM PC" (of questionable notability) and link it using two main statements like:

IBM Personal Computer (Q202712) has part(s) (P527) IBM PC VGA connector
IBM PC VGA connector connects with (P2789) computer monitor (Q5290)

Then to describe a computer in full detail you will need separate items for every discrete part of it having its own property values, such as the main CPU, the GPU and any other co-processor, the memory subsystem, the I/O subsystem, keyboard, monitor, network interface and any other peripherals. That will be a lot of items, each one having to pass as "notable".

It may be simpler for the ontology software to process, yes, but will it also be conceptually simpler for the human ontologist (hmm, my spell-checker doesn't recognize that as a word) to visualize and map to the real world? I doubt it.

The "snak" data structure (not sure where that term comes from) effectively turns each main statement into a miniature item with its own main value, rank, qualifiers and references. By rejecting qualifiers as nothing but a temporary workaround, I'm concerned you are throwing a lot of good functionality out the window.
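A minimal sketch of the "miniature item" idea, using the IBM PC example from earlier in this comment. The Statement class is a deliberate simplification invented for illustration and is not the actual Wikibase JSON layout; "NEW-ITEM" is a hypothetical placeholder for the extra item that would have to be created.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Statement:
    """Simplified statement: main value plus rank, qualifiers and references."""
    subject: str
    prop: str
    value: str
    rank: str = "normal"
    qualifiers: List[Tuple[str, str]] = field(default_factory=list)
    references: List[str] = field(default_factory=list)


existing_items = {"Q202712", "Q539719", "Q5290"}

# With a qualifier: one statement, no new item.
with_qualifier = [
    Statement("Q202712", "P527", "Q539719", qualifiers=[("P2789", "Q5290")]),
]

# Without qualifiers: two main statements and an extra item of doubtful notability.
without_qualifier = [
    Statement("Q202712", "P527", "NEW-ITEM"),
    Statement("NEW-ITEM", "P2789", "Q5290"),
]


def new_items_needed(statements: List[Statement], known: Set[str]) -> Set[str]:
    """Items referred to by the statements that do not exist yet."""
    mentioned: Set[str] = set()
    for st in statements:
        mentioned.update([st.subject, st.value])
        mentioned.update(value for _, value in st.qualifiers)
    return mentioned - known


print(new_items_needed(with_qualifier, existing_items))
print(new_items_needed(without_qualifier, existing_items))
```

The comparison only counts items; it says nothing about which of the two forms is easier for software to process.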

Granted, there are bad qualifiers as well as bad uses of good qualifiers, but then let's focus on eliminating those rather than dismiss all qualifiers in general. --SM5POR (talk) 03:32, 30 December 2022 (UTC)Reply

Re: "What do you mean by "pair of Y and Z"? Is it a subclass of Y and a subclass of Z simultaneously (AND, intersection), or is it something else?" Ontologically, a pair is not a subclass of any of its component superclasses. Thus a pair of two words is not a word; it is a pair. The same is true for any tuple. I needed a pair for the entity of "lexical unit", which I understood from the sources to be a pair of a lexeme with one of its meanings; in this case, the pair is a pair, not a lexeme and not a meaning.

Re: "The neat thing about qualifiers is that you can use them to make statements about non-notable items": Sure, but it seems to be a workaround, the violation of the notion of "qualifier". --Dan Polansky (talk) 07:06, 30 December 2022 (UTC)Reply

Okay, I understand your pair is a composite object then, an assembly (Q811367)? Because that's where the physical matter diverges from the class path, and the superclass turns into an abstract structure, set, or other collective of component items, while the physical matter is split into those individual items via has part(s) (P527) instead, see electrical cable (Q188447) for an example. This happens also when the component items are identical; a group of buildings is not a subclass of building, it's a subclass of group. Same thing with groups of words, even if the component words are just as abstract as the group they form. --SM5POR (talk) 12:48, 31 December 2022 (UTC)Reply

I don't see how describing the internals of a computer (or just any multi-part artifact) violates the notion of a qualifier. Another example: A stereo amplifier set has two loudspeakers (left and right). You write amplifier set has part(s) (P527) loudspeaker, qualified by quantity (P1114) "2". To state the same fact without the qualifier you would need to first define a loudspeaker pair, then make that pair a part of the amplifier set. In my view, that is a workaround to cope with the lack of qualifiers. Same thing if you want to specify the impedance, physical size or commercial brand of loudspeakers used in the set; qualifiers allow you to do all that without defining an item for every combination of components and their attributes. --SM5POR (talk) 13:16, 31 December 2022 (UTC)

Re: "I don't see how describing the internals of a computer (or just any multi-part artifact) violates the notion of a qualifier": My understanding of qualifier is rather narrow: it is something that qualifies the statement in question. Thus, "disputed" is a qualifier, and so are "subject named as X" and "object named as X". So is "uncertain". The qualifier should talk about the statement, not about the object of the statement (not about argument 3 in the statement triple). Thus, property connects with (P2789) should not be used as a qualifier, in the ideal world. Here, people will use workarounds, but we may want to converge toward avoidance of these workarounds. --Dan Polansky (talk) 13:40, 2 January 2023 (UTC)Reply

Now I'm getting it; by "qualifier" you are referring to a much more limited concept than what I think the creators of Wikidata ever intended it to be (but I may well be wrong; I simply wasn't involved when Wikidata was launched in 2012, as I discovered it only in 2020). So, while I doubt that this will change, would you approve of simply calling those 2nd-order properties something other than "qualifiers", or do you think they serve no structural purpose at all, regardless of what they are called? I agree that it would be an improvement to clarify the distinction between them and the true qualifiers. --SM5POR (talk) 07:44, 4 January 2023 (UTC)Reply

Opposite can act as complete differentia

Thus, if one defines non-physical entity as:

1) subclass-of entity

2) opposite of physical entity

That is all that one structurally needs to do; the substance of the structural definition is in physical entity. In particular, whenever physical entity is defined as having quality X, non-physical entity is defined as having quality non-X. --Dan Polansky (talk) 17:39, 29 December 2022 (UTC)Reply
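A minimal sketch of that derivation, assuming definitions are stored as small dictionaries. The keys subclass_of, opposite_of and qualities, as well as the "non-" prefix convention for negated qualities, are assumptions made for this illustration only.

```python
from typing import Dict, Set


def negate(quality: str) -> str:
    """Turn X into non-X and non-X back into X."""
    return quality[4:] if quality.startswith("non-") else "non-" + quality


def qualities(name: str, definitions: Dict[str, dict]) -> Set[str]:
    """Qualities of a class, derived from its opposite when not stated directly."""
    definition = definitions[name]
    if "qualities" in definition:
        return set(definition["qualities"])
    return {negate(q) for q in qualities(definition["opposite_of"], definitions)}


definitions = {
    # The substance of the definition lives here.
    "physical entity": {"subclass_of": "entity", "qualities": {"X"}},
    # Only genus and opposite are stated; the qualities follow.
    "non-physical entity": {"subclass_of": "entity", "opposite_of": "physical entity"},
}

print(qualities("non-physical entity", definitions))
```

If a further quality is later added to physical entity, the negated quality appears for non-physical entity without touching its definition.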

Relation to UML and database data modelling

Statements in entities establish relations that are reminiscent of relations between classes in UML data modelling. The relations include subclass-of and has-part(s). What seems missing is the generic association; instead, there are specific properties for labeled associations. The structural definitions in Wikidata seem to reveal an implied data model. --Dan Polansky (talk) 11:41, 30 December 2022 (UTC)Reply

Being, essence and existence

Here is my conjecture about the thread title, to be corrected later (Conjectures and Refutations):

being
- essence
- existence

But what is that? Let's have a look. Consider this sentence: "God is almighty". Let us suppose we are atheists. Is that sentence true? Surprisingly, it is in some sense, since God is almighty by definition. But in natural language, the sentence sounds misleading, implying existence. We can do better: "God, if he is out there, is almighty". Both sentences use the word "is" and therefore invoke being. The being by which "God is almighty" is true is the essence. One may also say that almightiness is part of God's essence. But there is also existence. What is that? It's "being out there". Where? That's a metaphor or another figure of speech. The language user has to figure out from the context of this discussion what is meant here. Thus, we obtain that there are different kinds of being: essence and existence. They are reminiscent of Sosein and Dasein, but whether they are the same thing would need to be verified. Also, a definition may be said to be the statement of the essence. That further contrasts with accident: things true of the subject but not part of the definition of the subject. An example would probably be found somewhere in Aristotle. Being is from Aristotle, but existence, so the sources say, is of a much later date.

Popper warns against essentialism. He instructs us to read definitions "from right to left", treating the definition as an answer to the question, by what shorthand shall we call the right-hand side? Whether he is completely right would need clarification, but he makes an interesting point worth considering. One way to understand his criticism is this: definitions almost necessarily identify simple intensions. While they may match natural kinds one-to-one in this world, this does not work across possible worlds. Let's take this: man is the rational animal. The essence is extremely simple, consisting only of two elements. Now let's check possible worlds and see centaurs or other rational horse-like creatures. They meet this definition of human, but they are not humans. Humans are a natural kind, not a simple intensional essence. A consequence of this is that dictionary definitions of natural kinds, like humans, are always wrong or incomplete, by identifying a simple intensional essence instead of the natural kind. Whether a definition could be written to identify the natural kind is questionable; there is some kind of indexicalness in the way a natural kind is picked out. Similarly, we cannot provide a definition of "this world" (in the many worlds hypothesis) and need an indexical "this"; any combination of characteristics of this world would narrow down the worlds but would still fail to single out "this one". In mathematics, there is a somewhat similar fact that there are many more mathematical objects than there are phrases to identify them. One probably cannot even say how many mathematical objects there are; "the largest cardinal number" has essence, but no existence, or, for mere mortals like the present speaker, for each cardinal number, there is a larger cardinal number. --Dan Polansky (talk) 08:16, 31 December 2022 (UTC)

But what is entity? Whatever it is, an entity usually has essence (but beware of natural kinds). An entity may or may not have existence. But what is the difference between entity and essence, both from the same Latin roots? Could we not say that there are essences and some of them have existence? For some reason, we say that an entity has essence rather than being numerically identical to essence. We do not say that essence has existence; we say that an entity has existence. (That is probably because essence exists as essence, as an abstract object.) But what is it for a class to have existence? Are classes abstract objects or concrete objects or a mixture of both? Can an instance lack existence? To answer these questions, we may again need to remind ourselves of the duality between a thing and its representation. By necessity, Wikidata entities are representations. Even a representation has existence, as a representation, but what it often lacks is the thing corresponding to it. Thus, if we create a Wikidata entity for Peter Meter, create an intensional definition for it (and no other seems possible, but beware of images), and it turns out there is no such thing, we may say that Peter Meter does not exist, and yet Peter Meter is an instance, but dually, it is also a representation of an instance.

There is another characterization of essence, as that which identifies the entity. Thus, essence is something like definition. Other things about the entity are accidents, and are extra-definition statements. This applies to instances as well as classes. This has a bearing on the interpretation of modalities, esp. necessity, across possible worlds, since an entity has the same essence in different possible worlds, but not necessarily the same accidents.

To repeat, this is a conjecture. This is an extension of Popperian philosophy of science. The idea of conjectures and refutations applies to scientific theories, but one may extend the broad idea and process to any conjecture, whether scientific, mathematical or philosophical. The idea that a similar process has more bearing on mathematics than one would have guessed is covered by Lakatos. --Dan Polansky (talk) 10:24, 31 December 2022 (UTC)Reply

Infrastructure

I'm afraid most theoretical discussions of philosophy, logic etc fly a bit above my head. I could possibly understand it if I made a serious effort; it's just that I don't see the point of it and therefore don't want to spend the time required.

To make the theory a bit more appealing to me, I want to see how it relates to real-world objects and issues, such as where in the ontology those objects belong. Taking an example from a few years ago, I planned to work on the infrastructure (Q121359) item, which I saw as an abstract class of concepts such as transport infrastructure (Q376799), cyberinfrastructure (Q1450531), research infrastructure (Q1438053) and so on. I viewed those as instances of infrastructure, much like lawyer, engineer and plumber are instances of profession (Q28640) (i.e. you cannot hire a "profession" or even a generic "professional", nor study to become one without specialization, therefore the different professions aren't subclasses of profession).

However, as I browsed the various infrastructure-related items, I found that almost all of them were declared to be subclasses of infrastructure rather than instances. Right now, only scientific infrastructure (Q111544419) is an instance of (P31) infrastructure (Q121359), but it's also a subclass of knowledge infrastructure, which in turn is a subclass of... yeah, you guessed it. In short, a mess. And I decided not to get entangled in it at that time.

I can't tell how other editors see these relationships. To me, the choice between instance and subclass is certainly not an arbitrary one, as it has to be consistent with the rules of class logic. For one thing, subclass is a transitive relation, while instance is not. Or, as some terminology has it, instance of (P31) is transitive over subclass of (P279) (but not over itself, like subclass of (P279) is).
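A minimal sketch of that rule, assuming both relations are held in plain dictionaries. The toy data reflects the modelling argued for in this comment (transport infrastructure as an instance of infrastructure), not the current state of Wikidata, and "some railway line" is a made-up instance.

```python
from typing import Dict, List, Set


def superclasses(cls: str, subclass_of: Dict[str, List[str]]) -> Set[str]:
    """All direct and indirect superclasses: subclass-of is transitive."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classes_of(item: str,
               instance_of: Dict[str, List[str]],
               subclass_of: Dict[str, List[str]]) -> Set[str]:
    """Classes an item belongs to: one instance-of step, then any number of subclass-of steps."""
    result: Set[str] = set()
    for cls in instance_of.get(item, []):
        result.add(cls)
        result |= superclasses(cls, subclass_of)
    return result


subclass_of = {"rail infrastructure": ["transport infrastructure"]}
instance_of = {
    "some railway line": ["rail infrastructure"],        # hypothetical instance
    "transport infrastructure": ["infrastructure"],      # modelling proposed above
}

print(classes_of("some railway line", instance_of, subclass_of))
```

The railway line ends up in rail infrastructure and transport infrastructure but not in infrastructure, because a second instance-of step is never followed.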

Nobody can create a "general-purpose infrastructure" that can serve every need from transport and telecommunications to health care and research; each general infrastructure class has to be built according to the needs of that particular class. There is a discrete step of abstraction between the instance and its class making them have different types. I have no problem with rail infrastructure being a subclass of transport infrastructure since they will share certain properties. It's just that final continuum between "X-type infrastructure" and "infrastructure as such" that is missing.

Do we agree on this, or are there aspects I have overlooked? --SM5POR (talk) 18:35, 31 December 2022 (UTC)Reply

The structural definition (defining statements) in infrastructure (Q121359) is currently mostly lacking. What is needed is a proper genus and proper differentia. Then we may figure out which subclasses are true subclasses and why, at least formally.

Informally, I would have to think harder, and start in dictionaries, probably. Are all these infrastructures really the same concept of infrastructure? From a quick analysis, I cannot tell. Infrastructure is a rather abstract concept. I may be able to figure out something better later. --Dan Polansky (talk) 14:17, 2 January 2023 (UTC)

Thank you! I'm not sure myself about the individual concepts of infrastructure at play, but if there are indeed two or more distinct concepts, then we would need additional "infrastructure" metaclasses, as I cannot imagine one class having some types of infrastructures as instances, and others as subclasses. It's the fact that infrastructure (Q121359) is abstract that makes me cringe at it seemingly having concrete subclasses.

And I'm certainly not in a hurry; that infrastructure mess has been growing mold for at least a couple of years now, it's an example of a recurring problem in many areas at Wikidata, and I'm more concerned about getting that model item maintenance procedure up and running (I imagine we could have robots regularly comparing other items to the model ones to detect congruence irregularities such as using instance of (P31) in place of subclass of (P279) or vice versa). I also have a few property proposals in mind (model lexemes etc, that's what I'm using the Sandbox-Lexeme (P5188) property for right now).

Speaking of infrastructure, your imaginative use of carries (P2505) in linguistics made me investigate what a "zářez" was and assign an English label ("notch") to it! I don't even know what it's called in Swedish; "urskärning" maybe... --SM5POR (talk) 19:30, 2 January 2023 (UTC)Reply

I also deprecated the statement transport structure (Q2516121) subclass of (P279) transport infrastructure (Q376799) and replaced it with transport structure (Q2516121) part of (P361) transport infrastructure (Q376799), as the subject item is just a single physical component of the object, alongside other physical components such as highways and marine freight terminals, or abstract components like railway administration and cross-border tax-free businesses. There is simply far too much use of subclass of (P279). --SM5POR (talk) 20:44, 2 January 2023 (UTC)

I recommend having a look at the authorities to which authority control traces: some of them have some good material. When I want to get a hold over a concept, I start systematically filling the "subject named as" qualifier for the authorities to see immediately which of these are in English, and to make sure they are talking about the same word or at least concept. Then I look at the English sources that have something like definition, e.g. WordNet. One may even consult non-English ones using Google Translate if one wants. --Dan Polansky (talk) 13:48, 3 January 2023 (UTC)Reply

By the way, are you sure that grapheme (Q2545446), orthographic word (Q115863220) and sentence (Q37124094) really should be subclasses of linguistic unit (Q11953984) (as they are right now in Wikidata), and not instances? There are in fact three items of the latter kind: yes (Q6452715), variant (Q115159493), and Q12943823. Looks carefully planned... not. SM5POR (talk) 21:03, 2 January 2023 (UTC)Reply

To illustrate why I ask that question, consider the following statements (and fill in the blanks):

How long is a grapheme? Typically 1 letter/character.
How long is a word? From one white-space delimiter to the next, typically 1-15 letters.
How long is a sentence? From one sentence delimiter (period etc) to the next, typically between two and a few hundred letters.
How long is a linguistic unit? From _____________ to the next, typically between _____ and _______ letters/characters.

See my point? --SM5POR (talk) 21:18, 2 January 2023 (UTC)Reply

"A grapheme is a linguistic unit": the leading indefinite article reveals subclass-of, not instance-of. The same is true of word and sentence. Thus, linguistic unit is not linguistic unit type. --Dan Polansky (talk) 13:48, 3 January 2023 (UTC)Reply
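The distinction between "linguistic unit" and "linguistic unit type" can be illustrated loosely with Python classes and a metaclass. This is an analogy only, under the assumption that Python's class model is close enough to the instance-of/subclass-of pair; the class names are invented for the illustration. One known mismatch: in Python the base class `LinguisticUnit` is itself an instance of the metaclass, which an ontology need not accept.

```python
class LinguisticUnitType(type):
    """Metaclass: its instances are classes such as Grapheme or Sentence."""


class LinguisticUnit(metaclass=LinguisticUnitType):
    """Class: its instances are concrete units occurring in a text."""


class Grapheme(LinguisticUnit):
    pass


class Sentence(LinguisticUnit):
    pass


letter_a = Grapheme()

# "A grapheme is a linguistic unit": subclass-of between the two classes.
print(issubclass(Grapheme, LinguisticUnit))      # True
# A concrete grapheme is an instance of both Grapheme and LinguisticUnit.
print(isinstance(letter_a, LinguisticUnit))      # True
# The class Grapheme is an instance of the type, not of LinguisticUnit.
print(isinstance(Grapheme, LinguisticUnitType))  # True
print(isinstance(Grapheme, LinguisticUnit))      # False
```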

Okay, I see; that makes sense I think. But isn't it the linguistic unit type you want to define then, in terms of instances of it? Because a subclass can theoretically be defined in an almost infinite number of ways (say, the union of all consonant graphemes in scripts except Cherokee, all non-verbs with a prime number of letters, and anything written by William Faulkner on a Sunday), and I don't see how that contributes anything of more value than what you can pull out of thin air, hence my sudden skepticism towards your superclass earlier today.

My apologies for that silly example, but my faith in the quality of current Wikidata contents hit a new low when the former RSFSR popped up in that subclass tree I browsed; the explanation was that the RSFSR had been defined as a subclass of cratonym when it should clearly be an instance. Don't waste your time arguing about it with me unless you really want to; there are probably a lot of other poorly chosen statements messing with our queries, and I ought to find ways to eliminate them rather than throw them in the way of the work done by others.

One thing I do want to eliminate are all those "instances of term" as they are language-dependent and therefore conflated with the language-neutral items as well as with labels in any other language. We may be talking about something like 50,000 items here. --SM5POR (talk) 00:54, 4 January 2023 (UTC)Reply

But maybe linguistic unit is not a class; I am not sure. It is a concept similar to linguistic entity, which is any entity relating to language. Thus, linguistic unit would not be a genus of anything. My interpretation of the indefinite article business may be ontologically too naive. On the other hand, it would mean that abstract object would not be a class either. --Dan Polansky (talk) 13:51, 3 January 2023 (UTC)Reply

About how long it is: how long is an abstract object, typically? After all, numbers are abstract objects and so are geometric shapes, yet these are ontologically very different kinds of entities. Is abstract object a class? Or is it perhaps more something like a generic interface (in Java terminology) or a concept? How do we bind a class to the concepts that it "implements"? --Dan Polansky (talk) 13:56, 3 January 2023 (UTC)

Change request: set primary labels to unique values using disambiguators

The current guideline Help:Label mandates that preferred terms for entities ("labels") need not be unique (one term can be shared by multiple entities).

Furthermore, Help:Label#Disambiguation_information_belongs_in_the_description mandates the preferred terms for entities ("labels") should not use disambiguating brackets.

In my experience with Wikidata, these two rules combined cause endless frustration, because the resulting design has poor usability.

The reason is that entities in statements appear via their primary labels (preferred terms), not via their numerical Q identifiers. As a result, a mere glance at the user interface does not reveal which entity is meant in a statement. When an entity states "different from" and then lists multiple entities with the same primary label as the subject entity, it is not clear what these entities are.

To give an example, "thesaurus" is now the label (preferred term) of two entities/concepts, which could instead carry the primary labels "thesaurus (dictionary)" and "thesaurus (IR)". In this case, there are just two entities, but in the case of, say, "Ministry of Education", there can be rather many. (Sure, sometimes the label is, say, "Ministry of Education of Germany" and there is no problem, but this is far from the norm.) Unique labels would greatly simplify reviewing the categorization of entities as thesauri, since one would see at a glance whether an entity is classified as "thesaurus (dictionary)" or "thesaurus (IR)".
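The proposal can be sketched as a display rule: show the bare label when it is unique, and append a bracketed disambiguator only when several entities share it. The entity identifiers and the `(label, disambiguator)` layout below are hypothetical and chosen for illustration; nothing here reflects how Wikidata actually stores or renders labels.

```python
from collections import Counter


def display_labels(entities):
    """Map each entity id to a display label that is unique where possible.

    `entities` maps an id to a (label, disambiguator) pair; the disambiguator
    is used only when the bare label is shared by more than one entity.
    """
    label_counts = Counter(label for label, _ in entities.values())
    result = {}
    for entity_id, (label, disambiguator) in entities.items():
        if label_counts[label] > 1 and disambiguator:
            result[entity_id] = f"{label} ({disambiguator})"
        else:
            result[entity_id] = label
    return result


# Hypothetical ids; only the labels come from the example in the text.
entities = {
    "E1": ("thesaurus", "dictionary"),
    "E2": ("thesaurus", "IR"),
    "E3": ("Ministry of Education of Germany", "Germany"),
}
for entity_id, shown in display_labels(entities).items():
    print(entity_id, shown)
```

Under this rule the bare term could still be kept as the lead alternative term, so matching against running text is unaffected.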

The Help page does not trace the guideline to any rationale. One rationale for using a term without the disambiguating bracket is to make sure it matches the actual term used in a sentence in a text. However, in general, an entity/concept is referred to in a text by any of multiple labels, whether the primary label or an alternative label (that is, the preferred term or any of the alternative terms). The obvious solution is to set the preferred term without the brackets as the lead alternative term.

One valid objection is that a term containing a bracket is not really a preferred term since it is not expected to be used in text invoking the entity/concept. Perhaps the primary label should not be thought of as the "preferred term". However, thesauri (IR) sometimes do use disambiguating brackets in their preferred terms; examples are currently missing (TBD).

In any case, the state in which a guideline does not trace to any rationale or discussion is highly unsatisfactory, violating the spirit of Wikipedia consensus process, by which the strength of argument is decisive. And if no arguments are stated or traced to, it is the authority of the guideline that is implied to be decisive, regardless of any arguments.

As for the policy or guidelines status of Help:Label, it seems to have no formal status, being indicated as proposed policy or guidelines. What that means is unclear; it is probably part of the partially Orwellian character of Wikimedia projects where policies are not policies, since "ignore all rules", and "ignore all rules" is not meant literally anyway since "ignore" means "do not pay any attention to", meaning "don't bother to read or know", which is impractical nonsense. My best guess is that Help:Label is a de facto overridable policy; thus, it is a near-policy or set of binding rules, with override being possible in well justified exceptional cases. What is probably not possible is to start systematically adding "(X)" disambiguators since then other editors will remove them, saying "remove per Help:Label". Dan Polansky (talk) 14:38, 26 January 2023 (UTC)Reply

P31: Q179797

Hi Dan, I saw that you made this. Could you explain why this was not correct? In particular, how would you specify the language of the thesaurus? On the French Wiktionary, we will be working on the organisation of Wiktionary categories this month, so we are now discussing the best way to model that. Pamputt (talk) 10:22, 2 February 2023 (UTC)Reply

Greetings. One thesaurus page in Wiktionary is not a thesaurus; it is a thesaurus page or entry. Thus, Thesaurus:chat is not an instance of thesaurus. Thus, e.g. the English Wiktionary has one thesaurus, not multiple thesauri, except perhaps one thesaurus per language, but the collection of all thesaurus pages for all languages in the English Wiktionary can be seen as one thesaurus. Is this explanation clear? I will try to answer all questions as best as I can. --Dan Polansky (talk) 16:15, 2 February 2023 (UTC)Reply

About specifying the language of the page, e.g. in Thesaurus:fr:chat (Q30699903), I guess one could do it via a qualifier of the statement "X instance of Wiktionary thesaurus page", and there are other options. --Dan Polansky (talk) 16:17, 2 February 2023 (UTC)Reply

Hello Dan! I am responsible for raising this issue. In fact, I am interested in making a better structure for entities linked to categories in Wiktionaries and for thesauri. I made a small query to look for the indication of language done the way you deleted. Actually, I think you are right, it was not done in the best way. Still, it is important to connect each thesaurus with its language. I am not sure about the best way to indicate it. Would it be a general language of work or name (P407)? And what about the relation between a language-specific thesaurus and the non-language-specific one (umbrella thesaurus? metathesaurus?)? Well, if you have any idea, you are welcome to share it! 🙂 Noé (talk) 15:59, 3 February 2023 (UTC)

Greetings to French Wiktionary and Thesaurus, if I remember correctly :). I am not sure. My problem was with the use of instance-of for what were not genuine instances. I am sorry I am not of much help. I think "language of work or name" that you proposed could be okay; even if someone thinks it imperfect, it does not seem like an unacceptable workaround or abuse. We do need to make use of workarounds, just not too bad ones, and the challenge is to distinguish the acceptable workarounds from the unacceptable ones. --Dan Polansky (talk) 17:04, 3 February 2023 (UTC)

Kripkean identification or essence of persons

From the point of view of Kripkean rigid designation, the following seem to be key defining characteristics of a person:

year of birth (or even the date)
place of birth

That would be the same across possible worlds.

Worldly achievements (author, composer) are not part of Kripkean rigid designation since they do not hold necessarily true in all worlds that have forked from the world into which the person was born. (There are other interpretations than forking or branching, but let us leave these aside for now.) Nonetheless, achievements for which people are noted are a good fit for the description field.

The year of death is not necessarily the same in all forked/branched possible worlds, but it still feels like a good fit for the description field.

An example description field:

Komárom-born Austrian composer (1870-1948)

The above states:

place of birth
nationality (whatever that is supposed to mean exactly; let's suppose they spoke German at home)
role of achievement
the years of birth and death

Looks fine to me. The original description "Austrian composer" seems too context-free. Asking the readers to scroll down the page to find statements that, when put together, provide a summary of a person's identity seems impractical. --Dan Polansky (talk) 07:06, 28 February 2023 (UTC)Reply

Later: By way of something of a debate:

across-possible-worlds identifying characteristics could have been the circumstances of conception, not birth. The circumstances would include time and place, and possibly the parents.
across-possible-worlds identifying characteristics could have been the circumstances of christening (name giving), not conception or birth. The circumstances would include time and place, and possibly the parents.

It is not very clear which of these is the most Kripkean, in part since it is not very clear what exactly Kripkean rigid designation really is. The only thing that should be clear is that "the author of Waverley" is not part of the definition or essential identification of Walter Scott, since he could have failed to write the book on an alternate branch of the world, a branch that would branch off after the identity-defining event, be it the conception, the birth or the christening. --Dan Polansky (talk) 09:55, 2 March 2023 (UTC)

Mineral resources

I hoped we had settled this in previous discussions (in Talk:Q12131447 in particular). Now, after Special:Diff/1864556920, it looks like we need to take this road again. Various aspects of Help:Label were pointed out to you previously, and yet for some reason you once again misuse the label field to provide an ad hoc description instead of the actual term in common usage. You claim you reverted because my edit "introduced inaccuracies and removed valid reliable sources", while this assessment actually applies to the overall mess that you made in this item previously. As pointed out in the previous discussion, you ignored what it says in some of the very same sources that you referenced yourself, and regretfully you still do. Please do re-familiarize yourself with everything that we already discussed earlier before you feel an urge to undo my edits again. Several other users have pointed out (also above) that you need to familiarize yourself with what Wikidata is (not a dictionary etc.) and how it works. You can't just reinvent it and then force your ways on other users. 2001:7D0:81FD:BC80:DD4D:3D4F:EC5A:2ED4 11:35, 30 March 2023 (UTC)

I propose we discuss any disagreements on the talk page of the disputed entry, here mineral resource (Q889659). I will try to provide reasonably detailed rationales in my edit summaries, although I admit sometimes the rationale stays on the general level and lacks detail. I will try to do my best to address issues in most amicable manner while heeding the objectives of verifiability especially via tracing to reliable sources, and accuracy especially in reference to the specific wording of definitions as found in reliable sources. --Dan Polansky (talk) 11:55, 30 March 2023 (UTC)Reply

Well, my comment above, as well as our previous discussion, is mostly about your conduct and general practices on Wikidata, and less about the subject matter of any particular item. As for verifiability, I hate to go in circles with this, but I'd like to emphasize once again that statements need to be verified based on reliable sources where the subject matter matches that of the Wikidata item, not based on random reliable sources that happen to use the same English-language term in any sense. The subject of a Wikidata item is normally drawn from the set of Wikipedia articles attached to this item, and so generally relevant sources need to be determined based on Wikipedia articles, not the other way around. 2001:7D0:81FD:BC80:DD4D:3D4F:EC5A:2ED4 12:19, 30 March 2023 (UTC)

As I pointed out before, the practice of an unidentified entity (an anonymous editor with almost no contributions) attacking an identified entity is probably problematic. For the subject matter, the talk page of Q889659 is a good place, and I posted there. Here I will only point out that for mineral resource (Q889659), Wikipedias are of little use since they are no authoritative source of the definition (which establishes the identity of the entity/concept) and that different sources in different languages define the mineral-resource-sounding concept differently. For the entity under discussion, the first question is what exact concept it aims to capture and based on what authoritative sources (not Wikipedias), and once that is clarified, the entity can be further refined. The talk page of that entity is probably a good place to discuss the substantial issue. --Dan Polansky (talk) 13:02, 30 March 2023 (UTC)

Your first remark is once again an argumentum ad hominem. I'm here to respond to your revert, not to attack you. If you act in a way that disrespectfully ignores large parts of the previous discussion, then you also might expect that the tone of my response in turn won't be as amicable as you might like it to be, though.

As for Wikipedias, they of course aren't authoritative sources, but this is not the point. The point raised several times in the previous discussion was that we nonetheless need to check the content of the attached Wikipedia articles to determine *which* authoritative sources are relevant to the item subject, since Wikidata items in the first place are usually created based on some set of Wikipedia articles. As discussed thoroughly before, an authoritative source may use a term in distinct senses, and in this case there is no way to match a source properly to a Wikidata item based on this source alone. 2001:7D0:81FD:BC80:DD4D:3D4F:EC5A:2ED4 13:39, 30 March 2023 (UTC)

It seems to me--but I am not sure--that it is a problematic practice to let anonymous editors post on user talk pages of non-anonymous editors. --Dan Polansky (talk) 14:17, 30 March 2023 (UTC)Reply

iw

Thank you for editing interwiki (after me), which was on 27 March 2023 disabled by Microsoft for older eO. Also disabled were thanking, subscribing (signing works only per remembering --_~_~_~_~), and so on. --Kusurija (talk) 13:51, 24 April 2023 (UTC)

You are welcome. I edited Category:Japanese terms spelled with 澄 read as ちょう (Q37665091) the way you suggested. --Dan Polansky (talk) 14:13, 24 April 2023 (UTC)Reply

Classification of thesauri

Assumptions:

A thesaurus should be ranked as a "thesaurus", not as "controlled vocabulary", since it is the latter via transitivity: each thesaurus is a controlled vocabulary.
When a thesaurus (controlled vocab with BT, NT, RT, etc.) is ranked only as "controlled vocabulary", it should be replaced with "thesaurus".
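The first assumption can be checked mechanically: a class listed on an item is redundant when another, more specific class on the same item already implies it via subclass-of. The sketch below assumes a hypothetical in-memory subclass map keyed by plain strings rather than Q identifiers. The second assumption (an item ranked only as "controlled vocabulary" that is in fact a thesaurus) cannot be detected this way and still needs an editor's judgement.

```python
def ancestors(cls, subclass_of):
    """Return all direct and indirect superclasses of `cls`."""
    seen = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def redundant_classes(item_classes, subclass_of):
    """Return the classes on an item that a more specific class already implies."""
    item_classes = set(item_classes)
    redundant = set()
    for cls in item_classes:
        redundant |= ancestors(cls, subclass_of) & item_classes
    return redundant


# Hypothetical subclass map: each thesaurus is a controlled vocabulary.
subclass_of = {"thesaurus": {"controlled vocabulary"}}

print(redundant_classes({"thesaurus", "controlled vocabulary"}, subclass_of))
print(redundant_classes({"controlled vocabulary"}, subclass_of))
```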

Hope it makes sense to anyone concerned. --Dan Polansky (talk) 09:48, 17 May 2023 (UTC)Reply

Ontological precision and splitting of interwikis

I am in the process of splitting authority control/authority file (single entity) to authority control and authority file. This may result in splitting of interwikis depending on the option chosen.

Option 1: Match Wikipedias based on ontological precision.
Option 2: Keep all Wikipedias interlinked from, say, authority file.

Pros and cons:

Option 1: Achieves ontological precision (keeping separate entities separate) on the Wikipedia level. It is up to national Wikipedias to perform ontological alignment in the article titles if they so wish, e.g. by renaming de:Normdatei to de:Authoritätkontrolle. At worst, interwiki becomes more compartmentalized, but going to Wikidata with its entity interconnections could solve the issue.
Option 2: Creates best topical interconnection. But is ontologically imprecise.

There should ideally be a policy or a guideline on this. --Dan Polansky (talk) 06:51, 18 May 2023 (UTC)Reply

I think the statement authority control (Q118455746) topic's main template (P1424) Template:Authority control (Q3907614) should be moved back to authority file (Q36524), as the template is not about the subject of authority control. I think of it more like a navbox (Wikimedia navigational template (Q11753321)) for authority data. /Autom (talk) 06:45, 21 May 2023 (UTC)

I am not sure. Better raise this on the talk page of one of the concerned entities and ping me? --Dan Polansky (talk) 08:26, 21 May 2023 (UTC)Reply

Meta-descriptions

Hi Dan,

certainly you simply want to improve the ontology, but I sincerely ask you to refrain from meta-descriptions. If a foreign language has concepts which you don't understand, e.g. Q11738556, it is really not necessary to comment on them in a demeaning manner: whatever Polish "kazus" refers to. Best wishes U. M. Owen (talk) 09:47, 8 June 2023 (UTC)

I don't think there is anything demeaning about "whatever Polish "kazus" refers to"; it is an accurate description of the state of knowledge at a point in time of the entry. In particular, Q11738556 does not have any Polish definition, merely a Polish label, and it traces to no authoritative sources, and it has no other identifying items, so 'whatever Polish "kazus" refers to' is the most accurate identification one can come up with. It is a description that indicates that the Polish term, as it were, owns the identity of the entry. If you have a preferred alternative phrasing that you would find less demeaning, we can consider it, and explore alternatives. If you have a link to a related discussion, policy, guidelines and the like, I am all ears. --Dan Polansky (talk) 10:12, 8 June 2023 (UTC)

Imagine that all other nations would use the English description field for purposes like you. The description field is not meant for personal comments.

If you want to work this out please just ask a Polish user like @Matlin:.--U. M. Owen (talk) 10:16, 8 June 2023 (UTC)Reply

It is not a personal comment: it is an accurate description from the point of view of anyone who speaks English but not Polish; and the Polish term probably has multiple meanings, so knowing what the Polish term means does not establish the identity of Q11738556 either. Q11738556 is fundamentally defective and lacks proper identification; the English description is a reflection of that. One hope that I hold for that English description is that it will prevent editors from guessing at best translations into other languages without becoming fully aware that the entity is essentially undefined and underidentified; furthermore, the description acknowledges that even though the English label was chosen as "case", the person entering it did not really know what they were entering. (Although I should do better to note that the label is an original provisional translation.) --Dan Polansky (talk) 10:22, 8 June 2023 (UTC)Reply

Do you speak a decent Polish? How have you translated kazus into English?--U. M. Owen (talk) 10:25, 8 June 2023 (UTC)Reply

I am a native Czech speaker and speak no Polish. Polish kazus is almost certainly of Latin origin, cognate to English case, Czech kauza, etc. By combination of online dictionaries (including those linked from wiktionary:en:kazus) and translation services, I figured out the Polish Wikipedia is about something like a superset of legal case and administrative case, for which "case" would be a good approximate translation. That is as far as the label "case" is concerned; it is non-empty, probably approximately correct and now indicated in the description to be an original translation. Someone who speaks both English and Polish should ideally step in, trace the entry to an authoritative source (which Polish WP is not), thereby establish the identity of the entity/concept and fill in real definition in Polish and English. --Dan Polansky (talk) 10:34, 8 June 2023 (UTC)Reply

If you can't find an English definition for this item why do you translate it [=invent a new translation for it]? As long it can't be matched to an English concept named case IMHO it is the Polish concept of kazus.

Someone who speaks both English and Polish should ideally step in, trace the entry to an authoritative source (which Polish WP is not), thereby establish the identity of the entity/concept and fill in real definition in Polish and English. +1--U. M. Owen (talk) 10:38, 8 June 2023 (UTC)Reply

Okay, you have an opinion, I have an opinion. What does the policy, guideline or previous relevant discussion say? What I entered was accurate as far as I can tell and non-misleading; what you are doing, as far as I can tell, is invent new rules for me to follow. What is your native tongue? --Dan Polansky (talk) 10:45, 8 June 2023 (UTC)Reply

You may keep the description, but finding tentative translations for concepts which you haven't fully understood is original research.--U. M. Owen (talk) 11:03, 8 June 2023 (UTC)Reply

It may be original research to an extent (performed with the use of sources in Polish and English), but as long as the description points that out, what is the problem? The problem I am trying to solve is that the entity has no English label and therefore appears as a Q-number when referenced from other entities. I would be happy with other labels, such as "case (OR)", "case (original research)", "case (original translation)", "case (tentative)", "case (from Polish)" or whatever else addresses the concerns. An empty English label is bad and provides a dysfunctional user experience. --Dan Polansky (talk) 11:09, 8 June 2023 (UTC)

This is definitely not what the description field is meant for. An item description normally describes the item subject. It should say what Polish "kazus" means in a particular sense, not literally "what the word means in Polish". At best such a description is just pointless, meaning more or less the same as "hey guys, let's write something useful here someday". Everyone can check labels/descriptions in other languages themselves and draw from that. It is likely that this random user commentary confuses some people who know the language and otherwise would be able to fill in a proper English description. I also rather find that this description is demeaning, as it easily reads as saying that this item is some junk item due to having only a Polish-language label that normal English-speaking people don't even understand.

More or less the same issue has been brought to your attention several times before, e.g. here: user commentary in any form does not belong to the main namespace.

As for items without English labels, likely everyone agrees that eventually English labels should be added. But made up and unreliable labels cause confusion rather than be of any help. It's not only about English labels anyway. Missing labels in any language cause imperfect user experience. 2001:7D0:81DB:1480:65E5:403A:C78C:F885 15:58, 8 June 2023 (UTC)Reply

I have come to believe that unidentified IPs attacking a well identified contributor with a reputation at stake are an evil. It is even more evil when the attacker does not use a constant IP but rather switches IPs. The IP risks nothing; the well identified contributor takes risks by trying to do good things for the project even if somewhat unorthodox things. If someone has some good reading on the subject, I am all ears so that I do not have to rely on my own deliberations and intuitions. --Dan Polansky (talk) 16:05, 8 June 2023 (UTC)Reply

This is a discussion about the proper usage of descriptions and not about any person. Therefore nobody's reputation is at stake.--U. M. Owen (talk) 11:38, 12 June 2023 (UTC)Reply

Works, their editions and translations

Wikidata allows tracking editions of a work as separate items. Thus, if a work has 3 editions, there are going to be 4 items: 1 for the book, 3 for the editions. This was done for Rejzek's etymological dictionary: Czech Etymological Dictionary (Q19679433).

A similar principle holds for translations, not only editions, I figure.
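The counting above (one item for the work plus one per edition or translation) can be sketched with two small data classes. This is a toy model under assumed names, not the actual Wikidata data model; the German edition entry in the example is a placeholder, and language codes are plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Edition:
    title: str
    language: str  # language code of this edition or translation


@dataclass
class Work:
    title: str
    language: str  # language of the original work
    editions: List[Edition] = field(default_factory=list)

    def add_edition(self, title: str, language: str) -> Edition:
        edition = Edition(title, language)
        self.editions.append(edition)
        return edition

    def item_count(self) -> int:
        # One item for the work itself plus one per edition or translation.
        return 1 + len(self.editions)

    def translations(self) -> List[Edition]:
        # Editions whose language differs from that of the original work.
        return [e for e in self.editions if e.language != self.language]


work = Work("Alles Leben ist Problemlösen", "de")
work.add_edition("Alles Leben ist Problemlösen", "de")  # placeholder edition
work.add_edition("All Life is Problem Solving", "en")   # English translation
print(work.item_count())
print([e.title for e in work.translations()])
```

In this sketch each translation keeps its own title and language, which mirrors the separate-item approach rather than the single-item approach wished for in the next paragraph.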

I am now working on All Life is Problem Solving and Alles Leben ist Problemlösen. I created an item for All Life is Problem Solving as distinct from Alles Leben ist Problemlösen, but I am not entirely happy about it. It is All Life is Problem Solving (Q123492610). I would find it cool to be able to track multiple translations to different languages in a single item for the book, but I do not know whether this was ever done. The problem of different ISBNs would have to be solved. A link to a guide on this would be great to have. There is Wikidata:WikiProject Books, but it is overloaded with information I presently do not care about.

--Dan Polansky (talk) 15:20, 20 November 2023 (UTC)Reply

Example: “What Do You Care What Other People Think?” (Q2712899) is a book, and it traces to two translations, French and Czech. The English primary item features Czech and French titles as Czech and French labels. --Dan Polansky (talk) 09:03, 27 November 2023 (UTC)Reply

Hypothesis: A Wikidata item for an English book, film, etc. can contain a Czech label different from the English one, and the Czech label does not need to be a direct translation if it matches a Czech title under which a Czech translation was published. However, All Life is Problem Solving (Q123492610) is not an English original but rather a translation; should then the various language labels in the English translation be the same as in the original edition? Or should all language labels in the translation item carry the title in the language of the translation (this would neatly differentiate the displayed labels in "edition or translation of" part of the primary entry)? The title could be rendered as a transliteration of a non-Latin script where applicable (for Latin script languages), but it would still be different from a rendering of the title into the target language. --Dan Polansky (talk) 09:54, 27 November 2023 (UTC)Reply

Invitation to participate in the WQT UI requirements elicitation online workshop

Dear Dan_Polansky,

I hope you are doing well,

We are a group of researchers from King’s College London working on developing WQT (Wikidata Quality Toolkit), which will support a diverse set of editors in curating and validating Wikidata content.

We are inviting you to participate in an online workshop aimed at understanding the requirements for designing effective and easy-to-use user interfaces (UI) for three tools within WQT that can support the daily activities of Wikidata editors: recommending items to edit based on their personal preferences, finding items that need better references, and generating entity schemas automatically for better item quality.

The main activity during this workshop will be UI mockup sketching. To facilitate this, we encourage you to attend the workshop using a tablet or laptop with PowerPoint installed or any other drawing tools you prefer. This will allow for a more interactive and productive session as we delve into the UI mockup sketching activities.

Participation is completely voluntary. You should only take part if you want to and choosing not to take part will not disadvantage you in any way. However, your cooperation will be valuable for the WQT design. Please note that all data and responses collected during the workshop will be used solely for the purpose of improving the WQT and understanding editor requirements. We will analyze the results in an anonymized form, ensuring your privacy is protected. Personal information will be kept confidential and will be deleted once it has served its purpose in this research.

The online workshop, which will be held on April 5th, should take no more than 3 hours.

If you agree to participate in this workshop, please either contact me at kholoud.alghamdi@kcl.ac.uk or use this form to register your interest https://forms.office.com/e/9mrE8rXZVg Then, I will contact you with all the instructions for the workshop.

For more information about my project, please read this page: https://king-s-knowledge-graph-lab.github.io/WikidataQualityToolkit/

If you have further questions or require more information, don't hesitate to contact me at the email address mentioned above.

Thank you for considering taking part in this project.

Regards Kholoudsaa (talk) 16:38, 19 March 2024 (UTC)Reply
