Wikidata:Requests for comment/Adopt Help:Classification as an official help page
An editor has requested the community to provide input on "Adopt Help:Classification as an official help page" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Help:Classification (in its current wording) will not be adopted as an official help page. --Pasleim (talk) 09:07, 20 August 2016 (UTC)
Introduction
Classification properties in Wikidata have a history that goes back to the beginning of the project (on the community side, that is; Wikidata's designers debated this even earlier and made design choices about it :) ).
At first, the GND main type property played this role, together with the properties described in Help:Basic membership properties. Only the latter exist now, as the former was deleted by community decision (see for example WD:Requests for comment/Migrating away from GND main type).
There are still open debates and controversial discussions in the community about their usage, though. One of the issues has been: can we use both instance of (P31) and subclass of (P279) on the same item? This issue is sometimes controversial outside of Wikidata as well. Some argue this should not be possible because of a philosophical principle, the type–token distinction (Q175928). Others argue so for theoretical reasons concerning the properties of some computer languages in which logical inferences can be made, either for performance reasons or so that every question that can be asked of a system using the language has an answer (an example is http://www.w3.org/TR/owl-ref/#OWLLite). Other languages do not have this restriction, such as en:RDF or newer versions of en:Web Ontology Language.
It is entirely possible in Wikidata, as Wikidata allows any property to be used on any item without restriction. See Help:Classification for examples of useful uses.
As the issue has been raised a lot, for example with @Denny, Emw, Jobu0101, Micru: ... I wrote a help page, Help:Classification, which is currently a draft. I would like to remove the {{Draft}}
and so I am asking the community whether it is possible to do so. Hence this RfC.
Late addition: an important point in favor is also this thread, where Markus, a well-known Wikidatan who worked with Denny on SMW and works on Wikidata Toolkit, among other things, makes important points and notes the importance of getting this right.
Discussions and questions
- Someone ping me in the next week or so if I don't get around to commenting on this. I have a strong feeling about metaclasses that is both agreeable and disagreeable to TomT0m's thoughts. --Izno (talk) 20:52, 14 April 2015 (UTC)
- @Izno: It would help if you could express your feelings a little bit :)
- Please merge Help:Classification and Wikidata:Item classification in any case as it's confusing to have two pages on the same topic. -- Bene* talk 21:28, 14 April 2015 (UTC)
- @Bene*: As far as I'm concerned, Wikidata:Item classification feels weird and was created by a user who has been blocked. I don't know what to save from it. It seems based on principles I don't understand. TomT0m (talk) 07:57, 15 April 2015 (UTC)
- It's still confusing though to have both pages. I'll tag that one for deletion then and link to this RFC, to make one single and widely accepted help page rather than several single-opinion-based pages. -- Bene* talk 08:01, 15 April 2015 (UTC)
- Why we would have items that are both classes and instances is not (yet) understandable to me. Isn't Nimitz-class aircraft carrier (Q309336) a subclass of <ship> rather than of <ship class>? "Suppose we want to find all ship that are ship classes in the similar design sense": do we want to find items like Nimitz-class aircraft carrier (Q309336) or USS Ronald Reagan (Q211658)? The example with the chemical elements should either be explained better or completely removed. --Pasleim (talk) 09:01, 15 April 2015 (UTC)
- @Pasleim: Don't hesitate to add an annotation if you spot a mistake, because <stroke>I can't see where it is stated that Nimitz-class aircraft carrier (Q309336) is a subclass of <ship>.</stroke> Oh, someone corrected it, that's why.
- The concept of metaclass is just a way of creating circular classification: this is the thing to avoid. The ship classification can be simplified to:
- USS Ronald Reagan (Q211658) is an instance of Nimitz-class aircraft carrier (Q309336)
- Nimitz-class aircraft carrier (Q309336) is a subclass of supercarrier (Q1186981)
- supercarrier (Q1186981) is a subclass of fleet carrier (Q869609)
- fleet carrier (Q869609) is a subclass of aircraft carrier (Q17205)
- aircraft carrier (Q17205) is a subclass of warship (Q3114762)
- warship (Q3114762) is a subclass of naval vessel (Q177597)
- naval vessel (Q177597) is a subclass of ship (Q11446)
- Now imagine this classification with your concept of class and metaclass. Which item is a class and which one a metaclass? Snipre (talk) 13:50, 15 April 2015 (UTC)
- @Snipre: First, it is definitely not MY metaclass concept. See the references in the English article on metaclass (Q19478619). Second, see the relevant section of Help:Classification on why to have items with both instance of and subclass of statements. You don't answer this argument; also note that your class hierarchy is exactly like mine, except that you can't put the ship class item, for which there is an article in Wikipedia, in it. TomT0m (talk) 14:51, 15 April 2015 (UTC)
- @Snipre:
- USS Ronald Reagan (Q211658) is an instance of Nimitz-class aircraft carrier (Q309336)
- Nimitz-class aircraft carrier (Q309336) is a subclass of supercarrier (Q1186981)
- supercarrier (Q1186981) is a subclass of fleet carrier (Q869609)
- fleet carrier (Q869609) is a subclass of aircraft carrier (Q17205)
- aircraft carrier (Q17205) is a subclass of warship (Q3114762)
- warship (Q3114762) is a subclass of naval vessel (Q177597)
- naval vessel (Q177597) is a subclass of ship (Q11446)
- Nimitz-class aircraft carrier (Q309336) is an instance of ship class (Q559026)
- supercarrier (Q1186981) is an instance of ship type (Q2235308)
- fleet carrier (Q869609) is an instance of ship type (Q2235308)
- aircraft carrier (Q17205) is an instance of ship type (Q2235308)
- warship (Q3114762) is an instance of ship type (Q2235308)
- naval vessel (Q177597) is an instance of ship type (Q2235308)
- ship class (Q559026) is a subclass of class (Q16889133)
- ship type (Q2235308) is a subclass of class (Q16889133)
- Visite fortuitement prolongée (talk) 14:35, 31 May 2015 (UTC)
- I edited a few seconds before your comment. Is it correct now? Visite fortuitement prolongée (talk) 14:53, 31 May 2015 (UTC)
- Yes it is. Deleted. TomT0m (talk) 15:16, 31 May 2015 (UTC)
- I would add . TomT0m (talk) 15:19, 31 May 2015 (UTC)[reply]
- Another display, with a table layout that I borrow from TomT0m in #Discussions about administrative subdivisions:
- @Snipre:
USS Ronald Reagan (Q211658) | Nimitz-class aircraft carrier (Q309336) | ship class (Q559026) |
 | supercarrier (Q1186981) | ship type (Q2235308) |
 | fleet carrier (Q869609) | ship type (Q2235308) |
 | aircraft carrier (Q17205) | ship type (Q2235308) |
 | warship (Q3114762) | ship type (Q2235308) |
 | naval vessel (Q177597) | ship type (Q2235308) |
 | ship (Q11446) | class (Q16889133) |
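The statements listed and tabulated above can be sketched as a small in-memory graph. This is only an illustration under stated assumptions: the two dictionaries and the helper functions `superclasses` and `classes_of` are hypothetical names, not part of any Wikidata API, and each item is assumed to carry at most one subclass of (P279) and one instance of (P31) statement.

```python
# Hypothetical in-memory copy of the statements from the discussion above.
# subclass of (P279): item -> direct superclass
SUBCLASS_OF = {
    "Nimitz-class aircraft carrier": "supercarrier",
    "supercarrier": "fleet carrier",
    "fleet carrier": "aircraft carrier",
    "aircraft carrier": "warship",
    "warship": "naval vessel",
    "naval vessel": "ship",
    "ship class": "class",
    "ship type": "class",
}

# instance of (P31): item -> direct class
INSTANCE_OF = {
    "USS Ronald Reagan": "Nimitz-class aircraft carrier",
    "Nimitz-class aircraft carrier": "ship class",
    "supercarrier": "ship type",
    "fleet carrier": "ship type",
    "aircraft carrier": "ship type",
    "warship": "ship type",
    "naval vessel": "ship type",
}


def superclasses(item):
    """Follow subclass of transitively; nearest superclass first."""
    chain = []
    while item in SUBCLASS_OF:
        item = SUBCLASS_OF[item]
        chain.append(item)
    return chain


def classes_of(item):
    """Classes the item is an instance of: the direct class plus its superclasses.

    instance of is applied once only, so it is not chained with itself.
    """
    direct = INSTANCE_OF.get(item)
    if direct is None:
        return []
    return [direct] + superclasses(direct)


print(classes_of("USS Ronald Reagan"))
print(classes_of("Nimitz-class aircraft carrier"))
```

Under these assumptions the individual ship ends up in the ship hierarchy only, while the Nimitz class item is found both as a class of ships (through subclass of) and as a member of ship class (through instance of).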
- On the problem of the definition of "individual": according to the proposed classification, an item can be an instance of something if the item is one of the "concrete objects, or events involving concrete objects, who are localized in time and space". So with that definition I can't write "anger is an instance of feeling". Feelings aren't concrete or localized objects. So the definition is not good, because it can't be generalized.
- Then the localization in space and time is not relevant for atoms: I can describe several energy levels of a specific atom with several items. So for one atom I can have several items, and this does not agree with the proposed definition of "instance of".
- Then the atom example is a good example of a closed mind applying a rule without any knowledge of the subject: yes, you can localize an individual atom in time and space, but in the end, is it something we want to do in Wikidata (i.e. create specific items for individual atoms)? The proposed classification bypasses the most important question: what is the framework of WD? What granularity do we want in WD? This can't be solved by a simple page of 3-5 paragraphs: each domain, each field develops its own limits, so "instance of" has to be defined by projects or teams which have a good understanding of the subject. Yes, we can create specific items for individual atoms, but if we don't want to reach that granularity, why can't we apply "instance of" at the lowest classification level defined/used in WD rather than in a hypothetical system?
- A better definition of what is an "instance of" is the most detailed description an object can have according to the available properties. If we have no possibility in WD to describe the position of an atom at a defined time, or no property able to distinguish between 2 cars, and we don't want to be able to make that distinction, why do we need to apply so rigorous a rule?
- To build an ontology we first need to define the framework (what is the purpose of the ontology) and then the granularity (what is the maximal level of description we want to reach). Snipre (talk) 14:23, 15 April 2015 (UTC)
- Take each time you were angry. These are the tokens in the sense of the token/type relationship. Then angryness is the class of all those events. If we want to classify further, then we go to higher levels and say that scientists or philosophers considered angryness as some cardinal sentiment type, or whatever they imagined. That is not the case for the class of all the times I was angry, which is something else. This just works and fits Wikidata's needs, as there are articles about pretty much anything. We need that power and solid basis. TomT0m (talk) 14:51, 15 April 2015 (UTC)
- And for the localization of the angryness, do you take only the head or the rest of the body? The middle of the head, or 1 cm below the highest point of the head (hair not included)? Snipre (talk) 08:21, 16 April 2015 (UTC)
- Some place where you were: that's precise enough localisation for our purpose. If you take physical principles like quantum physics, even the localisation of a particle is a complex notion :) This would not imply that a physicist would say that an atom is not localised at all. TomT0m (talk) 09:51, 16 April 2015 (UTC)
- "Yes, we can create specific items for individual atoms": this is not the question. There are cases like car models where it is entirely possible to have individuals. In that case it is a lot simpler to follow the same principle in both cases. A newbie would only have to follow one instruction; otherwise the structure is not robust. Say we decided that some class would not have any instances, and it turns out it eventually has some in Wikidata. No luck, we have to modify the model, and that could have implications for every consumer that made some assumptions. With Hydrogen subclass of Atom and Hydrogen instance of chemical element, it does not matter if someone later creates a "first atom in the world" article for some reason. This is robust modeling, and mathematically robust. And no, I don't think you answered my argument below. TomT0m (talk) 16:04, 15 April 2015 (UTC)
- Again, I answered your argument: you can't provide in one single page all possible cases which can be "instance of". This is a domain-specific definition. The only general way is, as I defined it, to say that the most detailed object in the specific domain is "an instance of" something, and then to refer to the specific classification described by the projects to classify the items correctly. And be consistent: with your assumption of possible future changes in the model, you have to propose avoiding ANY use of "instance of". And with proper modelling, if you can query all "subclass of" or "instance of" statements, you can easily modify the classification of a class by bot. Snipre (talk) 18:01, 15 April 2015 (UTC)
- Absolutely not! An atom is a token in the sense given, and always will be! Tokens are members of classes, in this case classes of atoms, and by definition we can link tokens to classes using instance of. Then with a second use of instance of we can link classes to classes of classes (sets of sets of atoms). With well-defined things we can use instance of twice. And as instance of, contrary to subclass of, is not transitive, a token will never be a member of a class of classes. This is just elementary maths with a solid definition of what tokens are. As atoms will always be tokens, this is robust. And we can use instance of. TomT0m (talk) 18:11, 15 April 2015 (UTC)
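The set reading in the comment above can be tried out directly with Python frozensets. This is a minimal sketch, assuming instance of is read as set membership and subclass of as set inclusion; the atom names and the helpers `instance_of` and `subclass_of` are hypothetical and only serve the illustration.

```python
# Hypothetical stand-ins for individual atoms (tokens).
h1, h2, o1 = "hydrogen atom 1", "hydrogen atom 2", "oxygen atom 1"

# Classes are sets of tokens.
hydrogen = frozenset({h1, h2})
oxygen = frozenset({o1})
atom = hydrogen | oxygen  # every hydrogen or oxygen atom is an atom

# A metaclass is a set of classes (a set of sets of atoms).
chemical_element = frozenset({hydrogen, oxygen})


def instance_of(x, cls):
    """Read 'instance of' as set membership."""
    return x in cls


def subclass_of(a, b):
    """Read 'subclass of' as set inclusion."""
    return a <= b


# First use of instance of: token -> class.
print(instance_of(h1, hydrogen))
# Second use of instance of: class -> metaclass.
print(instance_of(hydrogen, chemical_element))
# instance of does not chain: the token is not a member of the metaclass.
print(instance_of(h1, chemical_element))
# subclass of does carry membership upward.
print(subclass_of(hydrogen, atom) and instance_of(h1, atom))
```

In this reading the individual atom stays out of the metaclass by construction, whatever tokens are added to the classes later.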
- I never said that an atom was not a potential "instance of". The problem is again whether this is something we want to model: my proposal is to define as "instance of" the lowest level of the classification, or the most detailed item of the classification. Then if we decide to model down to the individual atom, this becomes the "instance of", and if we stay at the level of the element, that becomes the "instance of". You are typically what I defined as a closed mind: you have your mathematical reasoning (which is correct), but you completely skip the fact that a mathematical formulation is a tool in the modeling and not the goal. Snipre (talk) 08:13, 16 April 2015 (UTC)
- @Snipre: Nope, I rely on a philosophical principle (the token/class distinction) to link the maths to the real world :) This is very different. And this actually helps us because it just works, as you are beginning to understand, and can serve as a guideline for the whole project. This means that when a newbie has a problem classifying anything, we will just have to explain this, and not something a little more complicated and area-dependent like "OK, these objects exist, but we decided in this area that we will model this and not that, but in this case we do that". And we see a lot of those discussions. So I'll happily say I'm in a minimalist philosophical position here :) But a closed mind... no, that's not how I would define myself. I would rather see a closed mind as somebody who can't generalize and is focused on one thing :) and is bothered when somebody wants to discuss with him and suggests he might change something. I think I proved I could discuss :) TomT0m (talk) 09:47, 16 April 2015 (UTC)
- I also have to answer the claim that domain-specific particularities are not accounted for: this is totally unfair. The model takes very well into account, for example, the specificities of the usual ship classification. The ship class example shows very well that there are classifications used in the real world, that they are often not stupid, and that they are based on things similar to those principles. The model can represent those real-world classifications without conflict, with the same unifying principles. So no, I don't accept the argument. This respects the domain specificities very well. More: it's a useful tool to express them. TomT0m (talk) 10:28, 16 April 2015 (UTC)
An example of mess in the class tree: the Linux kernel classification
I don't know exactly how to solve this, but here is an example of mess in the current situation, due to the lack of clear concepts to apply in this case: Talk:Q14579#Q14579Q16686448. I claim that by using metaclasses we can make this tree a lot simpler. That's all for now. TomT0m (talk) 12:20, 16 April 2015 (UTC)
- Could you explain a bit more why it is a mess? It would be helpful if you point out specific things you object to there. --Laboramus (talk) 20:13, 19 April 2015 (UTC)
- @Laboramus: I don't know exactly how; it is just a feeling that it is overly complicated and vague. It does not seem right, for example, to have a subclass of sequence and a subclass of instruction among the parent classes of the Linux kernel (in software programming, we could have a sequence of any type, like a sequence of numbers, but a sequence of numbers would not be a number itself). I would be less disturbed by subclass of sequence of instructions, for example. It also does not seem right that in the software case we do not do better than in the modeling of books. FRBR, with book/edition/manifestation and so on, seems better defined.
- Let's begin by thinking in terms of the class/token relation first. I think the tokens of a piece of software are many, and of different types. For example:
- we use the word instance when several processes of the same software run on the same machine. This is a token of execution, and it is localized in time and space.
- when we buy an edition of a piece of software at the supermarket, or when the coder releases a new version of their work, there is a realisation as an archive file, which is then duplicated.
- not sure I got everything :)
- Which of these tokens are we referring to when we speak of the Linux kernel? If we want to classify software, it seems reasonable to have, for example, a software type class. It's probably not a subclass of software, as its subclasses would then be its instances ...
- I think that if we answer all these questions correctly, we will be able to clean this tree. TomT0m (talk) 07:46, 20 April 2015 (UTC)
Discussions about administrative subdivisions
- This tool is useful in a vast variety of cases and gives a good foundation when it can be applied, and looking closer, it can be applied a lot. Of course there will always be corner cases, like pure maths, but the instances of love are the times when I'm in love. Love is a class of these events. Then philosophers or scientists may call this a cardinal sentiment; love is one such class of sentiments. As for physical theories, these are theories that apply in certain physical cases, for example the experiments built to test them ...
- I'm not saying, this must be clear, that it is universal; I doubt it is useful in pure maths, for example, but it is nonetheless useful, and it is a starting point in cases where we do not know where to start. TomT0m (talk) 07:25, 20 April 2015 (UTC)
- TomT0m, quick note: "angryness" is spelled "angriness". English speakers would typically use the word "anger" to express the emotion as discussed here. Compare "love", "loveliness". Emw (talk) 12:18, 20 April 2015 (UTC)[reply]
- @Laboramus: Another example of a use case where things do not seem clear to everyone: administrative divisions. I think of this because of a question on project chat about it:
- The token/class/metaclass division can solve a problem encountered by the old category system:
- Tokens are administrative divisions like the city of Paris
- Classes could be commune in the case of France, an entity with a mayor and so on
- There are other things in the Wikipedia category system that are used to classify territorial entities (I take administrative territorial entity of a single country (Q15916867) as an example). This is not a class: saying that a French commune like Paris is an administrative territorial entity of a single country does not make sense. Saying that French commune is an instance of administrative territorial entity of a single country does. This category is used in Wikipedia to classify territorial entities, but the category system is not as well defined as the rest. I proposed a correct model on the administrative territorial entity project, but it seems people flattened the 2-level hierarchy into a simple token/type one, as was done in the Wikipedia category system. This help page is here to give a solid foundation and avoid this kind of mess. A type like French commune can very well be an instance of type of administrative division, and this is useful for sorting out such cases if we want to respect the definition of subclass of: an instance of A is also an instance of B if A is a subclass of B. This becomes inconsistent if Paris is an administrative territorial entity of a single country, which does not make sense since, as far as I know, this is the case for every single territorial entity. This would mean that administrative territorial entity of a single country is equal to administrative territorial entity. On the other hand, if we want to classify administrative entity types, we can have a metaclass of all French territorial entity types, and everything is consistent again, while retaining links between the items. I proposed the following scheme, but we can see that people have deleted the items because they did not understand this, refused metaclasses for bad reasons, and flattened everything into a single, incorrect hierarchy:
Administrative division classifications (horizontal: instance of; vertical: subclass of)
1 | 2 | 3 |
- | administrative territorial entity (Q56061) | type of administrative division |
- | administrative territorial entity of France (Q192498) | class that regroups administrative divisions of a single country |
Paris (Q90) ; Nantes (Q12191) | commune of France (Q484170) | class of French administrative division |
- Problems I see:
- Administrative territorial entity in France is not an instance of table of administrative divisions by country. Part of would be slightly better, but even that is awkward. Something like represented in or listed in would be better.
- Mangled formatting and missing table data that would be obvious and not exist if you previewed your edits or read what you've written after saving.
- Having a link to the previous discussion on your proposed model for administrative entities would also help. Emw (talk) 12:48, 20 April 2015 (UTC)
- Corrected. The table is unreadable because the items have been deleted, merged and so on between discussions. The reference: Wikidata_talk:Country_subdivision_task_force#A_3_layer_model_to_class_subdivisions_and_their_type_with_instance_of_.28P31.29_and_subclass_of_.28P279.29 TomT0m (talk) 13:40, 20 April 2015 (UTC)
Another example: language family
a language family is not this.
@Giftzwerg 88: Can you please read Help:Classification and then comment on whether you still think it is a good idea to say that language family is a subclass of language? If you have no clue, it is an example similar to the ship class one :). TomT0m (talk) 16:06, 22 April 2015 (UTC)
Ship class
I do not understand the sentence: "Suppose we want to find all ship that are ship classes in the similar design sense." No ship is a ship class, so "ship that are ship classes" = ∅. And it is not told what is the "similar design sense". Visite fortuitement prolongée (talk) 13:55, 31 May 2015 (UTC)
- corrected, thanks. TomT0m (talk) 14:27, 31 May 2015 (UTC)
Modelling Methodology
There is a Help:Modelling page that I think is closer to what is needed. Perhaps that page could be promoted to be on the main help page. Peter F. Patel-Schneider (talk) 16:27, 21 October 2015 (UTC)
Opinions on adopting Help:Classification
- Oppose By principle: why do we have two help pages (Help:Classification and Help:Basic membership properties) for this classification problem? We need to have only one page to be sure we have one unique policy. Snipre (talk) 10:00, 15 April 2015 (UTC)
- OK, but this is only a presentation question. Do you agree with the content, and if you don't understand something, what don't you understand? Then we can move on. TomT0m (talk) 13:01, 15 April 2015 (UTC)
- I keep my opposition due to the concept of metaclass: this is just an unclear distinction which will increase the complexity of the classification and lead to an increase in conceptual items without any clear definition.
- @Snipre: What do you find unclear exactly? It won't add complexity beyond what is already there, such as with atoms and chemical elements, which are existing items; it will give a good foundation to express the relationship between them, without adding any property. The chemical element concept is not a class of atoms. Hydrogen is. TomT0m (talk) 13:51, 15 April 2015 (UTC)
- "The chemical element concept is not a class of atom." Why not? What is your definition of chemical element? If I take the definition from en:WP: "A chemical element (often just element when the chemical context is implicit) is a pure chemical substance consisting of a single type of atom..." then there is a relation between chemical element and atom.
- We can define chemical element as a class of atoms having the same atomic number and hydrogen as a class of atoms having the atomic number 1. Snipre (talk) 14:41, 15 April 2015 (UTC)
- @Snipre: I take the French definition (more precisely, the one from the French Wikipedia) of chemical element (I have already had this discussion tons of times with Emw; I even pointed out inconsistencies in an ontology he quoted, which turned out to be a bad one if I remember correctly, so I am beginning to understand the subject pretty well). The English definition is a bit different; it turns out they are different.
- The English one (for completeness)
- a chemical element is a pure substance (a substance made of only one type of atom; for example, a cloud of dihydrogen is an instance of the Hydrogen class). So in that case, a chemical element is a type of substance.
- The one from the French Wikipedia
- Un élément chimique désigne l'ensemble des atomes caractérisés par un nombre défini de protons dans leur noyau atomique. (A chemical element is the set of all atoms that have the same number of protons.) Then Hydrogen is a chemical element. But is it a chemical element in the subclass of sense or in the instance of sense? Let's suppose it is in the subclass of sense. I'll show this is inconsistent with the definition. Let's say <Hydrogen> subclass of <chemical element>. Now say <the first hydrogen atom in the universe> instance of <Hydrogen>. OK, by definition of subclass of (x instance of A and A subclass of B, so x instance of B) we get <the first hydrogen atom in the universe> instance of <chemical element>. So by the definition of chemical element, this would mean that the first hydrogen atom in the universe is the set of all atoms that have the same number of protons, which is clearly wrong, as it is not a set by itself; it is an individual. There is a set of chemical elements, {Hydrogen, Oxygen, ...}, I think we agree. Then, by replacing Hydrogen with its atoms, we get chemical elements = {{the first hydrogen atom, ...}, {...}, ...}. In that sense a metaclass is just a set of sets; this is just simple maths. And how can we separate the atoms into those subsets? By their number of protons, which is associated with the Hydrogen class, not with the chemical elements metaclass. TomT0m (talk) 15:42, 15 April 2015 (UTC)
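The argument above can be restated as a short derivation. The symbols are introduced here for illustration only and are not part of the original comment: $a$ stands for the first hydrogen atom, $H$ for the class Hydrogen (a set of atoms), $O$ for Oxygen and $E$ for chemical element, with instance of read as membership ($\in$) and subclass of as inclusion ($\subseteq$).

```latex
\begin{align*}
&\text{Assumption to refute: } H \subseteq E \quad (\text{Hydrogen subclass of chemical element}).\\
&\text{Given: } a \in H \quad (\text{the atom is an instance of Hydrogen}).\\
&\text{Rule: } x \in A \ \wedge\ A \subseteq B \ \Longrightarrow\ x \in B, \text{ hence } a \in E.\\
&\text{But every member of } E \text{ is a set of atoms, and } a \text{ is a single atom: contradiction.}\\[4pt]
&\text{Consistent reading: } a \in H, \qquad H \in E, \qquad
E = \{H, O, \dots\} = \{\{a, \dots\}, \{\dots\}, \dots\}, \qquad a \notin E.
\end{align*}
```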
- @TomT0m: I answered that argument above in the discussions section: the problem of < the first hydrogen atom in the universe > has to be discussed only if we want to create items for individual atoms. If yes, then we will go further into your discussion of that problem; if not, we don't have to spend hours on a problem which exists only in your mind. Please continue the discussion above. Snipre (talk) 15:51, 15 April 2015 (UTC)
- TomT0m: Do you really want to model individual atoms (see Avogadro constant (Q6203))? --Succu (talk) 18:27, 15 April 2015 (UTC)
- @Succu: *sigh* no. That's absolutely not the point. (But it's enough that there is only one article on one particular atom, for some reason, on Wikipedia and ...) Also note that there is another example: there are instances of ships, a ship class item, and a corresponding article. The point is to model all that in a consistent way so that there are no special cases, for example. And we can. TomT0m (talk) 18:38, 15 April 2015 (UTC)
- Sure, this is only one important point, TomT0m. Using subClassOf implies you want to model individuals. If this is not the case, remove the example. Why is your proposed classification of ships universal? --Succu (talk) 18:57, 15 April 2015 (UTC)
- @Succu: "Using subClassOf implies you want to model individuals." No, it does not. What is supposed to imply that? Is there an implicit hypothesis I'm not aware of? It is not my classification of ships that is universal. It's nothing. It's just the classification scheme that works to classify atoms, chemical elements and hydrogen, as well as ships, classes of ships in the en:Ship_class sense, and USS whatever, using the exact same principle (tokens/class/metaclass). TomT0m (talk) 19:10, 15 April 2015 (UTC)
- TomT0m, what do you think is the purpose of creating subclass hierarchies? --Succu (talk) 20:04, 15 April 2015 (UTC)
- @Succu: Choose between expressing knowledge as precisely as possible, or fulfilling the closed world assumption (Q1102454). Unfortunately the latter is impossible, so I'll stick to the former, together with the more reasonable open-world assumption (Q851949). And you, what do you think? TomT0m (talk) 20:29, 15 April 2015 (UTC)
- By the way, may I ask you about this? There is no comment: https://www.wikidata.org/w/index.php?title=Q108149&oldid=210967777&diff=prev TomT0m (talk)
- Do you know a single person on earth who can name the first (enumerated) instance of the first ion (Q36496)? --Succu (talk) 20:52, 15 April 2015 (UTC)
- Why would that matter? I propose a model for which we don't care. TomT0m (talk) 20:59, 15 April 2015 (UTC)
- @Succu: isotope is also a metaclass anyway :) I'm not sure I catch everything about the nuclide concept though. TomT0m (talk) 20:49, 15 April 2015 (UTC)[reply]
- nucleon (Q102165) fits? -Succu (talk) 21:47, 15 April 2015 (UTC)[reply]
- @Succu: No, not at all; this has nothing to do with atom types. But the number of nucleon (Q102165)(s) is used to define types of atoms. The name of those classes of atoms is isobar (Q516369). And for sure, isobar (Q516369)(s) are kinds of nuclides (subclass of). As this is a class of classes of atoms, this fits. TomT0m (talk) 09:35, 6 May 2015 (UTC)[reply]
- Oppose per Snipre. We need something that (1) is concise, (2) references external authorities, (3) uses concrete examples, and (4) cites pre-existing consensus on Wikidata. Help:Classification does none of those. --Haplology (talk) 11:33, 15 April 2015 (UTC)[reply]
- There are references here ( http://osdir.com/ml/general/2014-09/msg50876.html ) and many more in the English article metaclass (Q19478619). But, as with Snipre: what is your opinion on the concepts? TomT0m (talk) 13:01, 15 April 2015 (UTC)[reply]
- I find your sudden interest in my opinion a little hard to believe, considering that you summarily reverted me over this matter a few hours ago. My opinion on the concepts is that it makes sense to continue with the way things were before, but it doesn't matter because the help page is a non-starter for the reasons I mentioned above. It has to acknowledge other people's opinions, including those of the thousands of other editors who made this site, in the page itself, not in arguments behind the scenes. It's irrelevant if someone is right: this is collaborative.
- I'm done with this. --Haplology (talk) 15:14, 15 April 2015 (UTC)[reply]
- Let it be noted that TomT0m created and wrote en:w:Metaclass (semantic web) himself, except for minor fixes. That's wonderful but it proves my point. --Haplology (talk) 15:20, 15 April 2015 (UTC)[reply]
- (edit conflict) @Haplology: Please don't be that way, this is not helpful. I replied to you on project chat; why not continue this here and not there? I don't understand. I cite my sources in that article, and yes, I wrote it because I got interested in the subject through the Wikidata case. It turned out I found enough to write a Wikipedia article, which has been genuinely kept and reviewed! How stupid of me. TomT0m (talk) 15:25, 15 April 2015 (UTC)[reply]
- Oppose, may support after overhaul. Help:Classification has an important and admirable goal, but it needs a lot of work and is far from suitable as an official Help page in its current form. I think having separate pages for Help:Classification and Help:Basic membership properties could be reasonable, if the many problems in the former were resolved.
- The page's many grammatical errors and stylistic oddities derail attention from the content. Readers will quickly dismiss documentation with such consistently broken English. An example of one of the smaller papercuts: use of a space before punctuation marks. English never uses a space before colons (e.g. "Superclasses and subclasses : relationships") or question marks ("But what is the relationship, then ?"). Another papercut: phrases like "to class classes themselves" are very unidiomatic -- "class" can be used as a verb in English, but in this context it looks quite odd. English speakers virtually always use the word "classify" instead of using "class" as a verb. Further grammatical and stylistic issues abound. If an author wants to be taken seriously when writing in English, regardless of their native tongue, they must write well in English.
- More importantly, the content itself is often misfocused and incorrect. For example, "The Classification of Wikidatians subclass of The Wikidatians" is not only too contrived, it is also wrong. "The classification of Wikidatans" (note conventional spelling, without the extra "i") is not in any conventional interpretation of that English phrase a subclass of Wikidatan. Virtually all English speakers would interpret "the classification of Wikidatans" as either A) the activity of classifying Wikidatans or (less commonly) B) the information artifact output by that activity. Neither of those is a type of Wikidatan.
- We should also avoid using Wikidatans or other in-group references as examples. "Wikidatans" is very much not as familiar a term as "Wikipedian", so the term presents an unnecessary road bump in learning. Even if it were, many of the people reading Help:Classification will not identify as Wikidatans or Wikipedians, making the coverage of this unfamiliar (but important) topic even more foreign. In short, we need less navel-gazing and more relatable examples; lose the "Wikidatan" stuff.
- We should also avoid contrived examples like "humans who have read this page subclass of human". (I've adapted this from the current page's "The Wikidatans who read this page" example.) Examples like "city with more than 10,000 inhabitants subclass of city", "chemical element in period 1 subclass of chemical element" and "bird that exists in Europe subclass of bird" are technically valid, but inadvisable, because reliable sources overwhelmingly do not build hierarchies along those axes -- the examples are contrived. Those examples should be modeled using reified properties like population, period in periodic table, and inhabited region. Help:Classification should probably include some examples it currently uses as anti-patterns.
- Also, we should avoid referring to metaclasses as "classes of classes". That's vague and ambiguous. A "class of classes" could just as well be an ordinary class whose subclasses are classes and none of whose instances are classes. A metaclass is more precisely "a class whose instances are classes".
- Finally, we should avoid including File:Atom_classes.svg in the page. It is overcomplicated, visually quite messy, and as far as I am aware not viewed as a good way to classify chemical entities by anyone but the author. An image with fewer overlapping lines would be a start. Get rid of the asterisk lines and use some other distinguishing feature like color. Put arrowheads on all lines instead of relying on left-right separation of column categories. Add more padding around the perimeter of the graphic. Then we can begin to discuss whether it is a reasonable way to classify elements. Emw (talk) 18:34, 18 April 2015 (UTC)[reply]
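To make the definition above concrete ("a class whose instances are classes"), here is a minimal Python sketch. The statements are toy data reusing the hydrogen example from earlier in this discussion, not real Wikidata claims, and the helper names `classes` and `metaclasses` are hypothetical. It only illustrates the definition; it does not settle whether this is a good way to classify chemical entities.

```python
# Toy statements; labels are illustrative, not real Wikidata claims.
INSTANCE_OF = {
    "this hydrogen atom": {"hydrogen"},
    "hydrogen": {"chemical element"},
}
SUBCLASS_OF = {
    "hydrogen": {"atom"},
}


def classes(instance_of, subclass_of):
    """Items used as a class: any subject or object of 'subclass of',
    and any object of 'instance of'."""
    found = set(subclass_of)
    for targets in subclass_of.values():
        found |= targets
    for targets in instance_of.values():
        found |= targets
    return found


def metaclasses(instance_of, subclass_of):
    """Classes that have at least one instance which is itself a class."""
    known = classes(instance_of, subclass_of)
    return {
        target
        for item, targets in instance_of.items()
        if item in known
        for target in targets
    }


print(metaclasses(INSTANCE_OF, SUBCLASS_OF))
```

In this toy data only "chemical element" qualifies, because its instance "hydrogen" is itself a class; "atom" has a subclass but no class among its instances, so it stays an ordinary class.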
- @Emw: "Those examples should be modeled using reified properties like population": why reified? They are just properties. Those modellings are not exclusive; see the paragraph I added about intensional class definition using the hypothetical "contributes to" property. Plus, see http://www.insee.fr/fr/themes/tableau.asp?reg_id=0&id=201 : "urban unit of more than 100,000 inhabitants" is a class used by the French official statistical organisation in its statistics, which makes this class useful, at least for this purpose. TomT0m (talk) 13:59, 19 April 2015 (UTC)[reply]
- TomT0m, your post is mangled. Please preview before clicking save, or at least read what you have written after you post. We have had this problem many times before. Also, when replying to a comment, please indent your reply one level more than the preceding post. This aids readability. Emw (talk) 14:37, 19 April 2015 (UTC)[reply]
- Still mangled, TomT0m, even after your partial fix. Please put greater care into the basics of correct format. Emw (talk) 15:37, 19 April 2015 (UTC)[reply]
- @Emw: Sorry for that, but I don't know what you mean. Could you make the correction such that I understand what you mean by mangled and we can move on ? TomT0m (talk) 15:42, 19 April 2015 (UTC)[reply]
- The formatting issues are not significant enough to further bog down this discussion.
- By "reified" I mean a normal property. The fact that "urban unit of more than 100,000 inhabitants" appears in an HTML table does not entail that that thing should be modeled as its own resource in a concept hierarchy -- or that the INSEE actually models it as a class in an ontology.
- I never suggested that modeling with a normal property like population is mutually exclusive with modeling it via subclass of. I am asserting that we should avoid creating such classes in Wikidata and using them in examples. Doing so is similar to avoiding asserted subclass of where the object is "cities with a male mayor" or "cities that were destroyed in World War II". Many such classes could foreseeably be used, but creating Wikidata items for them is not a good idea. Such subclass of claims should be inferred via queries, not directly asserted in the UI. Emw (talk) 16:22, 19 April 2015 (UTC)[reply]
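As a rough illustration of inferring such membership with a query instead of asserting it on each item, here is a minimal Python sketch. The `Place` record, the place names, and the population figures are invented for the example, and the function merely stands in for whatever query engine Wikidata ends up with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """Hypothetical item carrying an ordinary population property."""
    name: str
    population: int


def places_over(places, threshold):
    """Infer membership in 'place with more than `threshold` inhabitants'
    from the population property; nothing is asserted on the items."""
    return [p.name for p in places if p.population > threshold]


# Invented data for illustration only.
PLACES = [Place("Fooville", 250_000), Place("Barton", 40_000)]

print(places_over(PLACES, 100_000))
```

Changing the threshold needs no new class item and no edits to the places themselves, which is the practical argument for keeping only the property on the item.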
- @Emw: It's too early to say that. Queries are not ready yet, and I don't think it is a good idea to put restrictions on these, as that could be conflict-prone for no real benefit. INSEE probably uses a vast variety of criteria to express data. I just found an ontology of the data they use ( http://rdf.insee.fr/def/geo/insee-geo-onto.ttl ) on their RDF page, where RDF data can be downloaded. I'm not saying we have to import all this, but I think that if we impose too many constraints without enough experience, as we are not mature enough, it could be a significant pain in the future to introduce more complex concepts, like expressing statements on queries or asserting the membership of some item in a class via a query associated with that class. Plus, it is really easy to extend a query to include the members of a class. The query could become:
(X where X.population > 100000 or X instance of urban area with more than 100000 inhabitants)
And we do not know the power of the query engine yet. We will understand later what is manageable, what is overkill, and what is needed. TomT0m (talk) 16:44, 19 April 2015 (UTC)[reply]
- TomT0m, I don't think it's too early to tell. Several Wikidata SPARQL endpoints already exist. Also, we can safely assume that a future Wikidata query engine will support basic operators like >, <, and =.
- Furthermore, I would point out that the INSEE OWL ontology you linked -- http://rdf.insee.fr/def/geo/insee-geo-onto.ttl -- contains no class for "urban unit of more than 100,000 inhabitants" or any other classes for a similar hard-coded threshold number of inhabitants. The INSEE ontology doesn't seem to include any population data, but it is worth noting that the HTML table on their website is not represented as its own class in their ontology.
- The query
(X where X.population > 100000 OR X instance of urban area with more than 100000 inhabitants)
is clearly suboptimal. It is redundant. More concerningly, it also suggests we should have claims like "Fooville instance of urban area with more than 100000 inhabitants" asserted directly in the UI. Denny and others have agreed that directly using subclass of or instance of several times in a Wikidata item is not a good idea. Experienced ontologists have written about the pitfalls of such an approach and recommend asserted monohierarchies and inferred polyhierarchies as a better strategy for the use of instance of and subclass of. Emw (talk) 17:30, 19 April 2015 (UTC)[reply]
- @Emw: Not convinced. I don't think that in Wikidata we can reach the level of integration in ontology modelling that the article requires. And I don't really understand why you are talking about (explicit) multiple inheritance in our case, since we are talking about some urban area class, and urban area with more than 100000 inhabitants would only have one parent here. Another thing: in our case there could be situations, probably a lot of them I think, where we will have some classes but not the information, expressed as statements, that would be necessary to compute the extension of the class. So I'll continue to be careful and say that we do not know enough at the moment to take such decisions ... I think this happens a lot, for example in administrative geography, where cities are not villages or major cities, the Flying Spaghetti Monster knows why ... TomT0m (talk) 18:32, 19 April 2015 (UTC)[reply]
- @Emw: Also, I think you misread the threads about the query engine. It is not yet certain that there will be a full-featured RDF endpoint available, and while the on-site query engine will be based on SPARQL, the language probably won't be SPARQL. TomT0m (talk) 18:35, 19 April 2015 (UTC)[reply]
- The engine we are building is a SPARQL engine on top of an RDF triple store. So initially, the only language supported internally would be SPARQL. However, as for the public SPARQL endpoint, it is not 100% clear what we will use; it depends on what people would want us to have. --Smalyshev (WMF) (talk) 19:53, 19 April 2015 (UTC)[reply]
- One last thing: whether the membership of the class is computed or explicitly stated does not have a lot to do with the class concept itself :) If there is a query on Wikidata, we can just as well treat it as a class defined in extension. This implies that a bot could either detect that the claim is redundant when some user adds it explicitly, or put it in a report like "claim of class membership with not enough information to compute the truth value of the query expression" or "claim with class membership assertion for which the class expression is false". TomT0m (talk) 18:50, 19 April 2015 (UTC)[reply]
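The bot check described in this comment could look roughly like the following Python sketch. It assumes the class is defined by a population threshold and that a missing population is represented as `None`; the function name and the short report labels are hypothetical stand-ins for the reports named above.

```python
from typing import Optional

# Assumed class definition: "urban area with more than 100000 inhabitants".
THRESHOLD = 100_000


def check_membership_claim(population: Optional[int], asserted: bool) -> str:
    """Classify an explicit class-membership claim against the class query."""
    if not asserted:
        return "no claim"      # nothing for the bot to check
    if population is None:
        return "unverifiable"  # not enough information to evaluate the query
    if population > THRESHOLD:
        return "redundant"     # the query already implies the claim
    return "contradicted"      # claim asserted, but the class expression is false


print(check_membership_claim(250_000, True))
print(check_membership_claim(None, True))
print(check_membership_claim(40_000, True))
```

The "unverifiable" case is the one that matters for the argument in this thread: it covers items that carry the class claim but not the statements needed to compute it.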
I don't think that limiting instances to concrete physical objects/events is a good idea. There are a lot of things in Wikidata (as in life) that are not concrete objects. Is love (Q316) a class? Of what? What about Love (Q6690289)? Is return on investment (Q939134) a class? Of what? Is elastic modulus (Q192005) a class? Is Ka (Q360536) a class?
As such, I do not feel the proposal gives me specific tools I can use to determine classification and decide when to use instance of (P31) and when subclass of (P279), besides a very narrow band of cases which are not that hard anyway. This is not the hard case: But look at the items having no definitions - there are plenty of cases which would give one pause. That's what I'd like to get some help with. --Laboramus (talk) 20:14, 19 April 2015 (UTC)[reply]
- @Laboramus: Did you see the recent edits? Love is similar to anger.
- moved to #administrativedivisions in the discussion section for clarity
- Oppose, may support after overhaul per User:Emw. Furthermore, the title could be more specific and say what is classified. Is it users, items, properties, all? All of the things listed at Wikidata:Glossary? FreightXPress (talk) 19:22, 25 April 2015 (UTC)[reply]
- Hello and welcome to Wikidata. You remind me of somebody who renamed the page to "item classification" and left. I don't think it's a good idea, first because we don't classify items, we classify the objects they refer to (we classify the Eiffel Tower, not the item about the Eiffel Tower). Second, because it's quite obvious, as this is the main namespace and so the main content of the database. TomT0m (talk) 19:26, 26 April 2015 (UTC)[reply]
- Oppose prefer "instance of human" for people. --- Jura 13:46, 3 May 2015 (UTC)[reply]
- @Jura1: It's a different question ... This is a question of which classes we create or not, but the help page is mostly about the concepts. If you had said that it is not true, that would have been different. But it is irrelevant that you prefer generic over specific classes. TomT0m (talk) 14:22, 3 May 2015 (UTC)[reply]
After reading and, I guess, understanding the draft, I oppose renaming it Help:Classification. The draft is not about classification in general, but about class classification and metaclasses. This is not a guide for beginners, but a guide for advanced users. Therefore, I strongly suggest naming it Wikidata:Class classification (similar to Wikidata:Item classification) if it is adopted as an official help page. Help:Classification would list the 3 pages. Visite fortuitement prolongée (talk) 13:43, 31 May 2015 (UTC)[reply]
- See Wikidata:Classification. Visite fortuitement prolongée (talk) 14:08, 31 May 2015 (UTC)[reply]
- @Visite fortuitement prolongée: Really? There is a bigger section and more explanation about classes and instances in Help:Classification than in Item classification. Moreover, classes are items; I don't get your reasoning here, it's inconsistent. You're just building a mess. TomT0m (talk) 14:30, 31 May 2015 (UTC)[reply]
- I confused "item" and "instance". My mistake. Forget my comment. Visite fortuitement prolongée (talk) 14:41, 31 May 2015 (UTC)[reply]