Wikidata talk:WikiProject Taxonomy/Archive/2013/12

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Wikidata:Project chat#Some good items

At the project chat, they collect items which should be "featured". I like the idea of presenting some items as examples that show Wikidata beginners how to model taxa. I propose to collect some items, ideally with increasing complexity, for example:

  1. A basic taxon showing all necessary properties like scientific name, rank, higher taxon, authors
  2. A taxon with picture and basionym
  3. A species in a monotypic genus
  4. Ex-authors, incertae sedis, ....

This could be presented as a guide introducing the new properties in each step and, using the taxobox, showing the result. Of course, all presented items should be sourced sufficiently. If you like this idea, we could also think of removing some parts of the Wikidata:Taxonomy task force main page (e.g., the list with the rank properties (P75 (P75), ...) or the rank item lists like Wikidata:Taxonomy_task_force#Bacteria_under_the_rules_of_International_Code_of_Nomenclature_of_Bacteria), which could be moved to a subpage, if still needed.  — Felix Reimann (talk) 08:11, 15 October 2013 (UTC)

You mean writing some Help-pages? Well, it would not hurt to do so, but it is not so easy: for example a taxon has no author(s), unless you mean the circumscribing author(s). It is the name that has (an) author(s). And the present model for providing authors only works for ranks above that of genus. - Brya (talk) 10:52, 15 October 2013 (UTC)
Yep, we need something like a Best Practice or a Step-by-Step Guide. --Succu (talk) 11:16, 15 October 2013 (UTC)

There is a new page called Wikidata:Showcase items. --Succu (talk) 20:51, 7 December 2013 (UTC)

Fossils

Relevant to this WikiProject: Wikidata:Property_proposal/Natural_science#fossils --Tobias1984 (talk) 10:31, 4 December 2013 (UTC)

WP-articles to be deleted

Do we need a place for them? Acacia baileyana (Q9562555) is a bot-generated article from ZipcodeZoo (Q15078690). It is impossible to deduce which taxon is meant. So what do we do with them? --Succu (talk) 19:09, 7 December 2013 (UTC)

Good question. I encountered a few of these, but left them with a label like "unknown hybrid of ..." (like Quercus ×hopeiensis (Q11141337), Spiraea vanhouttei (Q15246097), unknown Cypripedium hybrid (Q10869590), Temnocephala brevicornis intermedia (Q1940528)). I guess deleting them at the source is the only solution. Maybe we could have a list? - Brya (talk) 06:58, 8 December 2013 (UTC)

instance of (P31) and constraints

Say we have a constraint that applies to monophyletic taxa and that does not apply to non-monophyletic ones. This will be possible to do with P31 because it's the general rule in Wikidata. To do the same with the specific taxon typing properties, we would need the same kind of technical capabilities, i.e. applying constraints wrt. a value of this constraint. So what advantage remains in having the two type systems? TomT0m (talk) 22:57, 11 December 2013 (UTC)

If monotypic or not, this is only of minor importance (many taxoboxes use this to show additional information about the parent monotypic item). This is only one classification among others (also: not of a complex structure) and can perfectly reside with all the other classifications. It does not interact with the taxonomically important properties themselves. For the rest of the discussion, see Wikidata:Project chat#P279 & P31 (& P361) for taxons (included species)  — Felix Reimann (talk) 19:11, 12 December 2013 (UTC)

zh-wiki

I have been looking at Wikidata:Database reports/Constraint violations/P225 and there seems to be a flood of duplications from zh-wiki where there are two bot-created pages per species. I have eliminated about ten of these, but this is not fun. I put in a message with a bot-operator, but there seems to be more than one bot owner involved. This may be / could turn into a huge problem. - Brya (talk) 05:29, 1 December 2013 (UTC)

zhwiki has about 50,000 articles about plants. About 25,000 were without taxon name (P225) and a further 5,000 without a data item. So I decided to make use of zhwiki taxoboxes to provide taxon name (P225). I did not expect to generate that many constraint violations. If it's helpful, I will try to analyze them at the end of next week and provide a list. --Succu (talk) 07:54, 1 December 2013 (UTC)
Thank you. I am feeling very uncomfortable on zh-wiki as I can read very little, but the problem exists there, and cannot be fixed here unless it is first dealt with on zh-wiki. I am not sure now if it is more than one bot, and it seems to be recent activity (November 2013?). Perhaps we can ask user Liangent to look into it, or are there others? - Brya (talk) 08:13, 1 December 2013 (UTC)
The bot-owner responded, and says he stopped months ago. He also claims there cannot be many of them. However, that is not my impression. - Brya (talk) 10:53, 1 December 2013 (UTC)
Now about halfway through the A (in Al-) and counting just short of two dozen. - Brya (talk) 17:08, 1 December 2013 (UTC)
I'm nearly done with the list. But we have to wait some days until all constraints are reported at the constraints list. @Brya: I think it would be more effective, if you would concentrate on duplicates of genera, because I worked a lot on this constraint list too. --Succu (talk) 17:34, 1 December 2013 (UTC)
It is somewhat depressing to resolve dozens of cases only to see the number of constraint violations go up by some seven hundred. It really starts to look as if somebody will need to run a bot at zh-wiki. - Brya (talk) 17:24, 2 December 2013 (UTC)
Be patient. This will go on for at least the next two or three days. Then I'll check the list. With the list of duplicates we can feed a bot on zh-wiki. I made a simple check on vi-wiki too. There are thousands of articles without any connection to Wikidata, including around 2,000 articles about cactus species alone. --Succu (talk) 17:53, 2 December 2013 (UTC)
OK, but a bot could be run now, just checking all the pages created by this bot against existing pages. - Brya (talk) 19:22, 2 December 2013 (UTC)
I'm almost done with the list of zh-wiki duplicates. Currently there are nearly 2000 of them in the constraint list. --Succu (talk) 19:49, 6 December 2013 (UTC)
That is a depressingly large number, although I am not really surprised. Anyway, it is good to see the number of constraint violations go down! - Brya (talk) 07:26, 7 December 2013 (UTC)
It took a while to resolve the remaining wrong zh-interwikis, but now the list is ready. I hope @Liangent:, @Stevenliuyi:, or @Zolo: can help to resolve these issues. --Succu (talk) 16:04, 7 December 2013 (UTC)
Thanks for the work, I have left a message at the Chinese Village Pump as things need to be sorted out on Wikipedia first. --Zolo (talk) 17:04, 7 December 2013 (UTC)

There's already a list of duplicated articles at zh:User:Liangent/csdtln, solely based on article data (no Wikidata data are used. If you're curious: all articles with a binomial name in Taxobox are put in a hidden category zh:Category:TaxoboxLatinName, with sort key set to their binomial names). It's just that nobody acts on it. Liangent (talk) 13:57, 11 December 2013 (UTC)
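For readers who want to see the idea, here is a minimal sketch of how duplicates could be found from such a category listing, assuming each article is available as a (title, binomial name) pair. The function name, the input format and the sample data are hypothetical and are not taken from the actual zh-wiki list.

```python
from collections import defaultdict


def find_duplicate_names(articles):
    """Group article titles by binomial name and keep names with 2+ articles.

    `articles` is an iterable of (title, binomial_name) pairs, e.g. read from
    the sort keys of a category listing (hypothetical input format).
    """
    titles_by_name = defaultdict(list)
    for title, binomial_name in articles:
        # Normalise whitespace only; typos in names still need manual cleanup.
        key = " ".join(binomial_name.split())
        titles_by_name[key].append(title)
    return {
        name: sorted(titles)
        for name, titles in titles_by_name.items()
        if len(titles) > 1
    }


# Toy data, not taken from zh-wiki.
sample = [
    ("Article A", "Acacia baileyana"),
    ("Article B", "Acacia  baileyana"),
    ("Article C", "Quercus robur"),
]
duplicates = find_duplicate_names(sample)
```

An exact-match grouping like this would miss pairs whose binomial names differ by a typo, so those would still have to be found by hand.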

If I understand it correctly this list has existed for over half a year? It would be really great if somebody started to convert all the duplicates into redirects. There sure seem to be a lot of bot-operators active in zh-wiki, and it should be a nice project for somebody? - Brya (talk) 18:05, 11 December 2013 (UTC)
It seems people want to actually merge them, though no one is actively working on them now. Articles using different names are from different databases, so descriptions are usually not duplicated (for example, described from different aspects). Also there're some typos, in binomial names or Chinese names, which need manual cleanup. Liangent (talk) 11:09, 13 December 2013 (UTC)
Thank you Liangent for contacting them. Perhaps we could tag all these items with something like instance of (P31) Duplication (Q55414) and said to be the same as (P460) <corresponding taxon> and remove all taxon specific claims?  — Felix Reimann (talk) 12:06, 13 December 2013 (UTC)
Or instance of (P31) Wikipedia:Proposed article mergers (Q6596462)?  — Felix Reimann (talk) 12:10, 13 December 2013 (UTC)
As I understand it we have at least one problem: which items should we tag? The one(s) with the highest q-number or the latest created article(s)? --Succu (talk) 19:21, 13 December 2013 (UTC)
My impression is that the items with the highest q-number will be the latest created articles. And these latest created article(s) are the shortest, so the earlier ones look to be preferable. Of course this is just going by length, I cannot read them. And the best is to merge them, while retaining any useful content present in either. - Brya (talk) 07:48, 14 December 2013 (UTC)
Redirecting is better than deleting them, but merging is better than redirecting them, if there is indeed some content worth merging (I cannot tell). As long as the duplication is eliminated Wikidata will be happy ;). - Brya (talk) 17:34, 13 December 2013 (UTC)
Since the contents are not duplicates, it is worth merging them manually. Maybe you can send the list here and expect some positive responses (but not too many). Bluedeck 15:34, 15 December 2013 (UTC).

Order of statements

I think the topmost statements should be arranged in the following order:

  1. taxon name (P225)
  2. taxon rank (P105)
  3. parent taxon (P171)

If you agree with me, then we should recommend this order of statements in the tutorial. --Succu (talk) 18:55, 11 December 2013 (UTC)

I agree. - Brya (talk) 19:01, 11 December 2013 (UTC)
  Support - But we already need a bot to fix the order in probably a few thousand items. I never thought order would be important, because it would be dealt with by a template. Now that we have to do it by clicking on the arrows, we have a lot more to think about. --Tobias1984 (talk) 19:46, 11 December 2013 (UTC)
Let's wait a little bit until the dust has settled, before we let the bots free... ;) It's a recommendation, not a law. ---Succu (talk) 19:58, 11 December 2013 (UTC)
I also agree. However, I think moving these three statements to the beginning of the page is not of utmost importance as this is only for the readers of Wikidata. By contrast, if a taxon has more than one scientific name (Wikidata:Database reports/Constraint violations/P225) or alternative parent taxa (!), the ranks are of huge importance. I propose:
  • Preferred rank: the name/parent according to the most recent scientific reference. P225, P105, and P171 statements which have a preferred rank set shall always have a scientific source.
  • Normal rank: Each name/parent according to "imported from Wikipedia" or a scientific reference, which is not the most recent one.
  • Deprecated rank: Each name/parent, which is from a source known to be outdated. See for example the taxon name of Hierophis viridiflavus (Q844254).
I plan/propose to use this also in the taxobox: If a statement of preferred rank is given, this is used; else if a statement of normal rank is given, this is used; else a deprecated one. This means, if we (or our bots) add a new statement for a source from June 2012, we would set this statement to preferred, if and only if there is no other statement of a more recent source for the same property. If, however, there is a preferred statement from August 2001, we set its rank back to normal. The date of the cited source as basis of the rank setting may however always be overwritten if, for example, an older work is still more exact in a specific part compared to the newer one.  — Felix Reimann (talk) 21:31, 11 December 2013 (UTC)
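A minimal sketch of the default rule proposed above, assuming statements are plain dictionaries with a `rank` and an optional `source_date`. The data layout and function names are hypothetical and do not reflect any existing taxobox module or bot; manual overrides of the default are not modelled.

```python
from datetime import date

RANK_ORDER = ("preferred", "normal", "deprecated")


def pick_statements(statements):
    """Return the statements of the best rank present (preferred first)."""
    for rank in RANK_ORDER:
        chosen = [s for s in statements if s["rank"] == rank]
        if chosen:
            return chosen
    return []


def add_statement(statements, value, source_date):
    """Add a sourced statement for one property, applying the default rule.

    The new statement becomes preferred only if no existing statement cites
    a more recent source; in that case older preferred ones go back to normal.
    """
    has_newer = any(
        s["source_date"] is not None and s["source_date"] > source_date
        for s in statements
    )
    if has_newer:
        rank = "normal"
    else:
        rank = "preferred"
        for s in statements:
            if s["rank"] == "preferred":
                s["rank"] = "normal"
    statements.append({"value": value, "rank": rank, "source_date": source_date})
    return statements


# Toy example: a preferred statement from August 2001, then a source from June 2012.
parents = [{"value": "A", "rank": "preferred", "source_date": date(2001, 8, 1)}]
add_statement(parents, "B", date(2012, 6, 1))
shown = [s["value"] for s in pick_statements(parents)]
```

In this toy run the 2001 statement is demoted to normal and the taxobox fallback picks only the 2012 one.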
I would also expect that ranking would be much more important, as that has implications for external reuse beyond any given person who will view the item (which are probably few and far between). I'm not sure that I agree that "most recent" description always makes sense as the preferred. Some sources will take precedence over others, and we should prefer the statement which is most authoritative. --Izno (talk) 22:26, 11 December 2013 (UTC)
True. As I said: "the rank setting may however always be overwritten." I tried to describe the default. Exceptions are often sensible.  — Felix Reimann (talk) 22:46, 11 December 2013 (UTC)
For example, we may need to prefer two statements rather than one, as in some cases there will be controversy... --Izno (talk) 22:28, 11 December 2013 (UTC)
Also true - better than seeing edit wars. Then, each specific user should decide either according to the references or - as fallback - at random.  — Felix Reimann (talk) 22:46, 11 December 2013 (UTC)
@FelixReimann: Seems reasonable to me. Let's try it, to get some experience. --Succu (talk) 10:34, 15 December 2013 (UTC)
I would tend to agree that taxon rank should precede parent taxon, though I'm unsure whether the name should come before or after the two. --Izno (talk) 22:26, 11 December 2013 (UTC)

P31

I think the general rule, across the whole project, will be to put instance of (P31) on top. How will you interact with that rule? TomT0m (talk) 21:18, 11 December 2013 (UTC)
Let's see what the results will be. Hopefully, setting P31 on top is recommended but not enforced. For taxa, the taxon name brings much more information (entropy!).  — Felix Reimann (talk) 21:31, 11 December 2013 (UTC)
If enforced then - I don't care - we will set the three important properties above right after P31. But this is definitely not worth starting an edit war over.  — Felix Reimann (talk) 21:35, 11 December 2013 (UTC)
My answer is blowing in the wind (because of multiple edit conflicts). --Succu (talk) 21:39, 11 December 2013 (UTC)

Please do not inject namecalling, Felix. Calling someone who is a proponent of a particular property a "fanboy" is not appropriate.

My opinion on the matter is that P31/P279 should be placed nearest to their analogs of specific type properties i.e. instance of with taxon rank and subclass of with parent taxon, where these are specified. It makes the data duplication apparent to those who are interested, while also creating a definition closer to consistent with the rest of Wikidata. --Izno (talk) 22:26, 11 December 2013 (UTC)

rephrased the respective part  — Felix Reimann (talk) 22:40, 11 December 2013 (UTC)
Thanks. --Izno (talk) 15:48, 12 December 2013 (UTC)
I would expect that to be the case except where there are more specific type properties currently defined. For each of those specific type properties, we should seek a not-across-the-board resolution local to each property, IMO. Something that needs discussion in a different place, I think. WD:Requests for comment/Ranking and ordering, go go! :) --Izno (talk) 22:26, 11 December 2013 (UTC)
Of course you have some domain-specific examples at hand that we can evaluate? --Succu (talk) 22:33, 11 December 2013 (UTC)
I am still not really convinced that adding P31=taxon is a good idea, as it is pretty much redundant. So, I certainly would be unhappy to see it at the top, obscuring more important matters.
        I don't have any doubt that taxon name should be on top: it is probably the most meaningful claim to fill in. It feels right to have taxon rank following this, although in itself it is not an important item: it is there mostly to be machine-readable (it still does not pick up "clade" as being a rank).
        Of course, "parent taxon" is very meaningful and important: it should be near taxon name. However, I do not follow this ranking of "parent taxon". It is not so that newest is necessarily best, but it is hard to know when it is and is not. I would prefer having the ability in a Taxobox Module to set a preference (with at the moment APG III as the default for the higher plants). Any preference among parent taxa would really need to be substantiated with references, otherwise it is just an opinion. - Brya (talk) 06:06, 12 December 2013 (UTC)

Question

I am getting the impression that if I add extra claims (to a pre-existing list of claims) these are inserted at the top and no longer at the bottom? This is very unhandy. - Brya (talk) 04:34, 13 December 2013 (UTC)

I think this has something to do with this problem. --Succu (talk) 13:16, 13 December 2013 (UTC)

Each taxon should be taxon

I add {{Constraint:Value type|class=Q427626|relation=instance}} to P:P105 but Succu reverts. Any reasonable objections from anyone? --Infovarius (talk) 19:05, 13 December 2013 (UTC)
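To make the effect of such a template concrete, here is a rough sketch of what a "value type" check with `relation=instance` amounts to, assuming a small in-memory copy of the data. The function, the data layout and the `Qx_...` identifiers are placeholders for illustration; this is not how the constraint reports are actually generated, and subclass chains are not followed.

```python
def value_type_violations(items, prop, required_class):
    """Return (item, value) pairs where a value of `prop` is not a direct
    instance (P31) of `required_class`."""
    violations = []
    for item_id, claims in sorted(items.items()):
        for value in claims.get(prop, []):
            value_classes = items.get(value, {}).get("P31", [])
            if required_class not in value_classes:
                violations.append((item_id, value))
    return violations


# Toy in-memory data; the "Qx_..." ids are placeholders, not real items.
items = {
    "Qx_rank_species": {"P31": ["Q427626"]},
    "Qx_not_a_rank": {"P31": []},
    "Qx_taxon_1": {"P105": ["Qx_rank_species"]},
    "Qx_taxon_2": {"P105": ["Qx_not_a_rank"]},
}
violations = value_type_violations(items, "P105", "Q427626")
```

Every value of P105 that lacks the required P31 statement shows up as one violation, which is why the size of the resulting report depends on how consistently the value items are tagged.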

I don't understand. Is it about "taxon" or "taxonomic rank"? Requiring "instance of taxon" is not helpful; this will just generate an enormous list of violations, as it is not a particularly good idea to add "instance of taxon". Requiring "taxonomic rank" does not seem to be doing any good either, as there already is a list of allowed ranks. - Brya (talk) 07:37, 14 December 2013 (UTC)
instance of (P31) is a project-wide property, and this case is not an exception. It is becoming, as it is throughout the semantic web standards, a structuring property. It tells anybody the nature of an item by linking to taxon (in our case) the same way other items do. In the future we can expect a constraint system that does consistency checks, for example by checking that every property used in the instances is listed somewhere on the <taxon> item help pages or statements. Other questions? TomT0m (talk) 23:26, 15 December 2013 (UTC)
Such a time may come. That could be a good time to consider leaving the project. - Brya (talk) 06:49, 16 December 2013 (UTC)
@TomT0m: - I can understand the requirement for each item to have the instance of (P31) or subclass of (P279) statement. But this should only be in addition to WikiProject-internal taxonomies, and should never replace them. I also think that the WikiProject should decide how to use those two properties or that it should be decided in an RfC. I don't think that setting everything as instance of (P31) = taxon makes very much sense. I would favor setting all species as subclass of (P279) = species and the same with higher levels. Using instance of (P31) in this case would be ambiguous, because we have some real instances of famous animals, which then would be indistinguishable from the group they belong to. Also, I think Succu reverted you because a page with almost a million constraint violations would crash most browsers. --Tobias1984 (talk) 10:07, 16 December 2013 (UTC)
This would not be ambiguous at all as a <taxon> has a definition. If I'm not wrong, a <specie> is a subclass of (P279) <taxon> as every species are <taxons>. A taxon is also a group of animals, by definition, so every specie should be marked instance of (P31) <species>, and <Tobias1984> instance of (P31) <human>. <human> is a subclass of <organisms> at the same time, as it is a taxon. This is a global project policy as nobody has ever Help:basic membership properties is one of the most important pages in this project imho, but we could go through an RfC. Note that these properties have far more use than in this project; they are properties used by almost every ontology and they are well defined. TomT0m (talk) 16:52, 16 December 2013 (UTC)
I think you are wrong. Ranks are an enumeration of terms giving a stage in the hierarchy a name. They are not taxa themselves. --Succu (talk) 17:14, 16 December 2013 (UTC)
They are classes of taxa, i.e. for each rank there is a set of taxa associated with it. All of these taxa (are supposed to) have properties in common, which is the definition of a class of taxa. TomT0m (talk) 21:31, 16 December 2013 (UTC)
Sure: All green colored things are classes of green. That's true, but is generally not a useful model. --Succu (talk) 21:47, 16 December 2013 (UTC)
Well, the usefulness of ranks is essentially, and more and more, historical in modern classification. Anyway, it's useful information for dealing with the type of classification you want to model, and there are several interesting classifications for living things ... so this at least can be used to mark this type of classification. You might want to use the classification in which the ranks are really meaningful, maybe historical, or you might want to use a more modern one that ignores them completely, like a phylogenetic one. In any case you have to specify the one you want to use, and tagging classes (taxons in our case) as instances of a classification is useful for that. You could for example build a tree with all the classes of the Linné classification and their relationships in it. TomT0m (talk) 22:22, 16 December 2013 (UTC)
I don't know what you want to tell me and how your answer is related to the problem. Aside from PhyloCode (Q1189395) in all codes ranks are used. These are represented by code of nomenclature (P944). --Succu (talk) 22:44, 16 December 2013 (UTC)
Yes, ranks indicate levels in a hierarchy. On the other hand <specie> and <taxons> are nothing except misspellings. It is possible to argue logic in all kinds of directions, but this does not necessarily lead to anything except a big tangle. - Brya (talk) 18:08, 16 December 2013 (UTC)
Why be so aggressive? I see no reason. I see no big tangle here; these logical bases are used in a lot of ontologies, and for consistency I find it useful to use them. TomT0m (talk) 21:31, 16 December 2013 (UTC)
1) There is an enormous amount of work left to be done here, and there are some real problems to be solved. It does not help to start inventing ways to make things more difficult than they already are.
2) There are any number of projects (even projects that started out promising) where "logical bases" were adopted with deplorable results, creating tangle after tangle. The last thing the world needs is another tangle based on 'logic'. - Brya (talk) 06:31, 17 December 2013 (UTC)

I think any further discussion of this topic should take the route of an RfC. It seems like the discussion would benefit from a project-wide reality check. Hope this admin-action is in the best interest of all participants. --Tobias1984 (talk) 06:48, 17 December 2013 (UTC)

„a project-wide reality check”? First of all an RfC should clarify that OWL and semantic web integration are part of the Wikidata project goals. --Succu (talk) 09:06, 17 December 2013 (UTC)
Linked Data is at the root of the project, Wikidata has an RDF export and supports different ways of downloading datasets; the question is already settled. The question is whether or not it is interesting to be closer to the upper-level ontologies of the domains, and whether or not the complexity is manageable: if a help page can clearly solve what is on the table, there is not much real complexity. TomT0m (talk) 12:22, 17 December 2013 (UTC)
a) Where can I find the Wikidata RDF export? b) If the question that OWL and semantic web integration are part of the Wikidata project goals is already settled, then please tell me where I can find the discussion. --Succu (talk) 12:38, 17 December 2013 (UTC) Again we left the problem.
[1] [2] [3] slide 8 for integration with the linked data web, this is at the root of the project. In that sense, it's obvious that integration with other standards is on the table, even if you are right that it is up to the community to decide to what extent. But for bibliographical data, for example, this is a crucial problem. Note that integration with other standards may be a way to solve problems because the ontologies do: ontologies can be a way to encode problem-solving knowledge into a solution found by others. TomT0m (talk) 12:47, 17 December 2013 (UTC)
Ok, let's put it on the record: there is no RDF export at the moment, only wishful thinking, like a lot of other things around Wikidata. And we should end this discussion. --Succu (talk) 13:02, 17 December 2013 (UTC)
Are any of these bugs related? Helder.wiki (talk) 13:56, 17 December 2013 (UTC)
Yes. And neither in the existing RDF export code nor in the development road map do P31 or P293 have any special semantics. They are properties like all others.  — Felix Reimann (talk) 14:17, 17 December 2013 (UTC)
Yes, they are, but clearly they play a role in interoperability with other frameworks. The RDF framework is clearly on the developers' minds; Daniel and Denny are (were) the fuel of this project and have spoken of that since day -1. It's not useless to keep in mind that Wikidata could be an important piece of the Linked Data web, and probably already is: we store a mapping between entity ids of Freebase, VIAF and tons of other databases. TomT0m (talk)
I am sure that Freebase is very happy that it is linked to from here, nobody else does, but why is there a Freebase at all? Does it serve any purpose? And indeed there is enthusiastic linking to ITIS, but what purpose does this serve? At some points there are discrepancies between content existing here and ITIS, but we already knew ITIS is full of errors and finding the errors (again) helps nobody. ITIS is not improved thereby and it is just a headache to the users here. The more things are linked the more errors are linked up, and the less information value remains. - Brya (talk) 17:30, 17 December 2013 (UTC)
It is a matter of synchronisation and versioning, and of whether or not we need the ITIS information. If claims from ITIS are sourced and versioned, syncing them here is a matter of automated work: updating and marking some statements as deprecated. Is the ITIS information useful for Wikispecies or Wikidata? Better to think about how we can interoperate with them. TomT0m (talk) 18:08, 17 December 2013 (UTC)
No, we don't need ITIS, and no, we cannot inter-operate with them (nobody can, except their source databases). - Brya (talk) 19:09, 17 December 2013 (UTC)
I vote for tagging P171 being a subproperty of P279. Who knows which will be possible earlier? RDF is also not supported yet (we do not even generate RDF dumps, only XML). Also, no tools are available which provide the same features for model checking as we currently have. Why should we switch? You say it's better in theory. Where is the proof? We already outlined several points where it is good to have separate properties (and our model works).  — Felix Reimann (talk) 14:17, 17 December 2013 (UTC)
It's even more out of the scope of the dev team to implement subproperties as far as I know (and they planned that, they did implement them in Semantic MediaWiki, so if they did not in Wikidata, they have a reason. It's not the case for class/instance as it's part of the initial data model, which means that if they did not, they probably thought Wikidata could live with gadgets or external tools for that). Apart from that: classes and instances are absolutely not mutually exclusive with using other properties. In a sense, in RDF, classes can be defined as being particular cases of another class, but also as being the set of the instances of a class that satisfy some property wrt. the values of a query; for example we could define the class of all genus taxons as the set of all taxons that have a <genus> rank. In the future we might be able to tag a class with a query, for example. We should stop the property versus classification debate; it seems a bad thing to me to think they are mutually exclusive. The things tagging with instance of (P31) gives: stating explicitly the kind of taxon it is, with a chosen taxon class tree. And a tool to solve point of view problems: different trees, potentially different classes of taxons, historical classes of taxons, and so on. TomT0m (talk) 16:05, 17 December 2013 (UTC)
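As a small illustration of the "class defined by a query" idea in the comment above, here is a sketch that derives the set of genus-rank taxa from rank statements instead of storing a separate class membership. The data layout is hypothetical, and plain strings stand in for rank items.

```python
def members_by_rank(taxa, rank):
    """Return the names of all taxa whose taxon rank (P105) equals `rank`."""
    return sorted(
        name for name, claims in taxa.items() if claims.get("P105") == rank
    )


# Toy data; strings stand in for the rank and parent items.
taxa = {
    "Acacia": {"P105": "genus"},
    "Acacia baileyana": {"P105": "species", "P171": "Acacia"},
    "Quercus": {"P105": "genus"},
}
genus_class = members_by_rank(taxa, "genus")
```

Under this reading the rank property and the derived class carry the same information, which is the point made about the two not being mutually exclusive.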
Well, "tagging with instance of (P31) gives: stating explicitly the kind of taxon it is, with a chosen taxon class tree." is the opposite of what we do. If it is tagged with "instance of (P31) = Fantastica" this is a denial of "is Fantastica". - Brya (talk) 17:47, 17 December 2013 (UTC)
I find it difficult to find a meaning here. TomT0m (talk) 18:08, 17 December 2013 (UTC)
At last agreement! Even if it is only that there is no meaning in what you write. - Brya (talk) 19:09, 17 December 2013 (UTC)

Additionally, I would point to an answer of Emw to a personal question about why he supports using ontologies and standards. Lastly, I would point out that I'm not a biological ontology expert and that others probably face the same problems you face and have stored that in an ontology. I think it would benefit the members of this project to have a look at how they do things. TomT0m (talk) 16:11, 17 December 2013 (UTC)

Then stop discussing about nothing but hot air. And thanks for your „advice“. --Succu (talk) 17:08, 17 December 2013 (UTC)

One amendment: OWL indeed defines the subproperty relation: [4]. Thus, I hereby state

SubObjectPropertyOf( :P105 :P31 )

Problem solved. Did not someone say this is not allowed in OWL? Indeed, it seems that also W3C sees the need of property hierarchies...  — Felix Reimann (talk) 13:05, 18 December 2013 (UTC)

I hope to soon write a fuller comment on this issue, but I just wanted to quickly note that the fact that OWL supports subproperties has never been in question. I link to rdfs:subPropertyOf in many comments I make on this issue; OWL is essentially a superset of RDFS. The question is whether any OWL ontologies, anywhere, support subproperties of the special properties rdf:type or rdfs:subClassOf, the bases of instance of (P31) and subclass of (P279). I am not aware of any, and I have looked through more than a few ontologies. Major, multi-domain ontologies that I am aware of use only rdf:type and not any subproperties of rdf:type. There are several arguments against having a proliferation of domain-specific subproperties of rdf:type (i.e., 'instance of'); this is one of them. Emw (talk) 13:28, 18 December 2013 (UTC)
You are not aware of any usages of subproperties Emw? That's a weak argument not to use them. --Succu (talk) 18:41, 18 December 2013 (UTC)
No, he said he was not aware of any use of any subproperty of subclass of and instance of. The discussion seems especially hard here; I feel like I am in a flame war on some Internet discussion forum and I can't understand why ... TomT0m (talk) 21:12, 18 December 2013 (UTC)
Dear TomT0m, maybe Emw should answer him/herself? --Succu (talk) 21:54, 18 December 2013 (UTC)
Succu, TomT0m said the basic gist of what I was going to say. Again, the fact that OWL supports subproperties -- i.e. "any usages of subproperties" -- has never been in question. The question at hand is more specific: do any OWL ontologies use subproperties of rdf:type (instance of) or rdfs:subClassOf (subclass of)? I have surveyed major ontologies that deal with many domains, read through relevant papers and presentations, and I have never seen an example of such. I have asked for such an example for months from supporters of this idea, and the response has thus far been that none have been found. This raises at least two questions:
  • 1. Does OWL (or, for that matter, RDFS) even support creating subproperties of rdf:type and rdfs:subClassOf? If you go to the OWL 2 Primer, click 'Show RDF/XML Syntax' up top, then go to the OWL 2 QL section and look at the example use of subproperties, you'll see that the 'hasFather' property is subsumed by the 'hasParent' property via rdfs:subPropertyOf. Both of these properties are declared with 'owl:ObjectProperty'. However, 'rdf:type' and 'rdfs:subClassOf' are special properties, as indicated by their 'rdf:' and 'rdfs:' namespace prefixes. They are built-in properties. Does OWL have defined handling of 'rdfs:subPropertyOf' for these special built-in properties?
  • 2. If the answer to (1) is "yes", then why are such subproperties of rdf:type and rdfs:subClassOf absent from broad, multi-domain ontologies, where one would expect to see them?
I accept the possibility that my survey of major ontologies might be deficient, or that I might have missed something in a paper or presentation. There would be an easy way to demonstrate that. Show me an ontology that has a subproperty of rdf:type or rdfs:subClassOf.
Now, one might summarily dismiss the idea that compatibility with OWL is a significant concern, but I would find that myopic. OWL is a core component of the Semantic Web. It is the standard format used to encode an ontology (including different domains or branches within an ontology), which is what we are doing on Wikidata. Organizing our properties around OWL conventions will help Wikidata, and thus Wikipedia, be interoperable with the rest of the Semantic Web. Emw (talk) 03:23, 19 December 2013 (UTC)
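As a rough illustration of the question raised above (a sketch, not taken from any existing ontology): the generic RDFS subproperty rule says that if p is a subproperty of q and (s p o) holds, then (s q o) holds as well. The snippet below applies that rule mechanically to a hypothetical property `ex:parentTaxon` declared as a subproperty of `rdfs:subClassOf`. The `ex:` names and the helper `apply_subproperty_rule` are assumptions made for illustration only, and the sketch says nothing about whether OWL profiles or actual reasoners accept such a declaration, which is exactly the open question.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def apply_subproperty_rule(triples):
    """Close a set of (subject, predicate, object) triples under the rule:
    (p rdfs:subPropertyOf q) and (s p o)  =>  (s q o)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect all declared subproperty pairs (p, q).
        sub_props = [(s, o) for (s, p, o) in closed if p == SUBPROPERTY_OF]
        for p, q in sub_props:
            for (s, pred, o) in list(closed):
                if pred == p and (s, q, o) not in closed:
                    closed.add((s, q, o))
                    changed = True
    return closed


# Hypothetical data: 'ex:parentTaxon' is an illustrative name, not a real vocabulary term.
triples = {
    ("ex:parentTaxon", "rdfs:subPropertyOf", "rdfs:subClassOf"),
    ("ex:Cactaceae", "ex:parentTaxon", "ex:Caryophyllales"),
}
inferred = apply_subproperty_rule(triples)
```

Under this naive reading, the 'parent taxon' statement would also yield a plain `rdfs:subClassOf` statement; whether standard tooling treats the built-in properties this way is not settled by the sketch.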
Emw, if you were undertaking a code review (Q1342704) of some major projects written in C# (Q2370) and none of these projects made use of the keyword yield, would your recommendation be never to use this keyword? --Succu (talk) 18:30, 19 December 2013 (UTC)
Succu, if 'yield' were not a supported feature of C#, then yes, I would recommend not using the construct. As the C# documentation and Q&A forums show, however, 'yield' is a supported feature of the technology and widely used. My argument above is that after having looked at the documentation and many ontologies, having a subproperty of 'rdfs:subClassOf' like 'parent taxon' is more like extending C#'s built-in 'yield' keyword and applying it throughout a project without knowing whether any C# compiler supports it and without being able to cite examples of any other C# projects doing so. Emw (talk) 15:16, 27 December 2013 (UTC)
yield is a supported feature, and so is rdfs:subClassOf. The question was about using it or not. --Succu (talk) 19:58, 27 December 2013 (UTC)

Sorry for the long delay: What I meant by "reality-check RfC" is that it never hurts to discuss things with a wider audience. I would strongly advocate that WikiProjects should be allowed to work on ontologies independently of the more general "instance of / subclass of" classification, which is more compatible with current semantic web properties. @TomT0m: Another thing to consider is that this WikiProject is doing an incredible amount of pioneering work. It might very well be that the policies drafted here will be adopted by the semantic web community for biological taxonomy. We have certainly already reached points where the properties of WikiProject Medicine are an improvement over current standards. I would also strongly advocate that the members of this WikiProject try to publish some of their ontological findings in blog posts or the scientific literature. --Tobias1984 (talk) 21:17, 18 December 2013 (UTC)

Tobias, an RFC on this topic has been open for months: see the Many or few type properties RFC. There has also been major discussion happening in Wikidata's public squares: see P279 & P31 (& P361) for taxons (included species) and the recent slew of 'Properties for deletion' requests for domain-specific 'type of' properties, particularly for type of astronomical object (P60) and type of administrative division (P132). If you do think a new RFC should be created, please have an uninvolved administrator close the one linked above. Since you express a strong opinion on the matter above, I would also recommend having someone that has not been involved in any of these discussions open the RFC. Emw (talk) 03:45, 19 December 2013 (UTC)
Well, this P279 & P31 (& P361) for taxons (included species) struck me as a very good example of somebody who does not have a clue wanting to muscle in on something where he does not belong. I cannot say that I am very interested in ontologies, but it certainly would have been nice if those concerned with ontologies would have their concerns addressed and settled before the project got into swing (let alone full swing). BTW: it would also have been nice if some thought had been put into designing an at least somewhat user-friendly User Interface.
        What I do know is that designing any data structure without familiarity with the kind of data to be handled and the uses they are to be put to is a recipe for disaster. And, yes, I do know there are plenty of highly-paid experts who will claim the opposite, but the whole string of expensive failed projects that they are leaving in their wake is not a commendation. - Brya (talk) 06:30, 19 December 2013 (UTC)
Yeah well, at least the unpaid non-experts are discussing something concrete :) Just use a property wide instance of (P31) taxon for taxons, which has no real drawbacks with respect to the current model and is consistent with the direction the project is taking. This is not this project versus the rest of the world, I hope. TomT0m (talk) 08:32, 19 December 2013 (UTC)
My discussion cycles detector says now: STOP. We do not talk about the same things now, at least the following three topics are mixed up:
  1. Should we delete P105 and P171 and move everything to P31 and P279
  2. Should taxa have some P31 property set
  3. Do we need a Constraint Violation report ensuring 2. (the start of this thread)
Regarding 1.: Here all members of Project Taxonomy vote strongly against it. I refrain from repeating the arguments again now. If you want to discuss this, create a WORKING PROPOSAL first (i.e., what would you model and how; so far we have heard different modelling schemes, depending on which better fits the current argument, and this is not helpful). Also, just saying it will all be good is not sufficient. Thus, create a proposal; otherwise we cannot falsify the scheme and this discussion will be endless.
Regarding 2.: I see the need for P31 as a property that each Wikidata item has to have. It is required for an easy distinction between e.g. the deity Zeus (Q34201) and the fish genus Zeus (Q1960850). Thus, I (and others of this Project) actively tag taxa and related items with P31=(monotypic taxon|taxon|fossiltbd|cultivar|trivial name|polyphyly|...).
Regarding 3.: We actively work manually [sic!] with the constraint violation reports. Thus, we would prefer not to have additional reports with hundreds of thousands of violations that no human will ever start to fix. We know that there is open work, but it is open work for bots. Endless constraint violation reports do not help. Moreover, they might even do harm, because (a) beginners (whom we need!) do not know which violation reports are important and might even give up on seeing that there are millions and millions more violations to fix, and (b) for experienced folks, always needing to load megabytes of reports while we are interested only in the small subset our bots cannot correct is also annoying. Thus, I understand Succu's revert.  — Felix Reimann (talk) 10:00, 19 December 2013 (UTC)
@FelixReimann:. Thanks for the essential answer to the topic :) Now you can see in Wikidata:Database reports/Constraint violations/P105 that my (second) constraint is working more efficiently than the old "One of" (first). Infovarius (talk) 10:17, 20 December 2013 (UTC)
For this Wiki-project to work we need a property "is a component part of" (parent taxon) which is used in a one-to-many relationship in both directions (up and down: one parent to many children, and one child to many parents) and we need a property "is not a component part of, but is a manifestation of" (subclass of). The present set up works moderately well. - Brya (talk) 18:23, 19 December 2013 (UTC)
Brya, the distinction between "parts" and "subsets" is a notable topic in ontology. They are two perspectives in knowledge representation. Mereology deals with part-whole relations, and as you likely know set theory deals with subset-superset and element-set relations. OWL has its basis in a discipline called description logic (DL), which focuses on bringing operators like ⊆ and ∈ to knowledge representation in the form of subclass of (P279) (rdfs:subClassOf) and instance of (P31) (rdf:type). Wikidata's top-level mereological property, part of (P361), does not have a formal definition in DL. At the end of the day, to my understanding this means that part-whole relations are not defined in a computationally effective way, and so are not well represented in semantic reasoning engines. As you know I don't think it's a good idea to make domain-specific subproperties of 'subclass of', but if 'parent taxon' were a subproperty of anything, I think it would be a subproperty of 'subclass of' and not 'part of' -- that would be closer to the standard representation of biological taxa in ontologies.
Some resources that might interest you:
I say this just as some hopefully relevant background, mostly independent of this particular discussion. Emw (talk) 05:19, 20 December 2013 (UTC)
Yes, I am aware there is a formidable (even daunting) amount of literature. I am also aware there is less than complete agreement on these matters.
        And, again, I am aware that there is a strong tendency among all too many who deal with such matters to become lost in whatever brand of logic they believe in. If reality does not compute, then obviously reality has to be cut down to measure, so that it does fit. I am not going to go along with that. - Brya (talk) 12:01, 20 December 2013 (UTC)

Simple proposal of classification using instance of and subclass of

  1. Tag each taxon item with instance of (P31) = <taxon>
  2. Replace parent taxon with subclass of (P279).

Consequences :

  1. Wikidata consistency, same properties in the whole project, easier for contributors to learn the project
  2. Query to build the subclass tree: search all item pairs that are both instance of (P31) = <taxon> and between which there is a subclass relation, instead of searching all item pairs for which there is a parent taxon property
  3. Rewrite the help page of the project, or delete some of the statements to link to the general Wikidata properties.

Bonus:

  • during a transition phase, and maybe forever, keep the parent taxon property.
  • taxonomies are sometimes contradictory; it becomes possible to import some of the inconsistent trees of life without problems by marking taxon items as instance of (P31) <specific tree of life taxon> :

Otherwise : anything ? TomT0m (talk) 17:17, 19 December 2013 (UTC)
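A minimal sketch of consequence 2 above, assuming statements are available as plain (item, property, value) triples. The function name `taxon_subclass_tree`, the placeholder value "taxon", and the two hypothetical non-taxon items are illustrative assumptions, not an existing tool or real data.

```python
from collections import defaultdict


def taxon_subclass_tree(statements, taxon_class="taxon"):
    """Map each parent item to its children, keeping only subclass of (P279)
    links where BOTH ends are tagged instance of (P31) = taxon_class."""
    taxa = {s for (s, p, o) in statements if p == "P31" and o == taxon_class}
    children = defaultdict(list)
    for s, p, o in statements:
        if p == "P279" and s in taxa and o in taxa:
            children[o].append(s)
    return dict(children)


statements = [
    ("Q14560", "P31", "taxon"),    # Cactaceae
    ("Q21808", "P31", "taxon"),    # Caryophyllales
    ("Q14560", "P279", "Q21808"),  # Cactaceae subclass of Caryophyllales
    # Hypothetical non-taxon items: their P279 link is filtered out.
    ("hypothetical-fruit", "P279", "hypothetical-food"),
]
tree = taxon_subclass_tree(statements)
```

The filter on both ends is what would keep other uses of subclass of (P279) out of the taxon tree under this proposal.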

declined --Succu (talk) 17:27, 19 December 2013 (UTC)
@TomT0m:Cactaceae (Q14560) are a subclass of? --Succu (talk) 18:50, 19 December 2013 (UTC)
Caryophyllales (Q21808), among others. Take any plant you would say is a cactus. Is it also a member of Caryophyllales, for any cactus? Then cactus is a subclass of Caryophyllales. TomT0m (talk) 18:59, 19 December 2013 (UTC)
If I understand your statement right Cactaceae (Q14560) is not a subclass of Centrospermae, Ficoidales and Opuntiales? --Succu (talk) 19:10, 19 December 2013 (UTC)
No, it's just the first one I chose, but it should also be the case for the others. Except that en:Centrospermae is used in an old classification and the statement should be marked as such, with something like a deprecated rank. But it's true in a classification, as a property of classifications should be that any individual classified at a low level of a classification belongs to the upper level as well. That is the exact meaning of subclass of. TomT0m (talk) 19:59, 19 December 2013 (UTC)
Except Centrospermae? It's a valid but outdated classification. So we have a deprecated subclass of? How well does this fit with the general usage of "subclass of"? --Succu (talk) 20:15, 19 December 2013 (UTC)
Hmm, are you aware of the ranking functionality of Wikibase, or are you just trolling? The preferred rank is to be used for valid and up-to-date classifications, normal is for statements that are probably correct but historical, and deprecated is for historical sources that were believed to be true at the time but have become obsolete, or in which mistakes were found. The claims are valid in the sense that they were stated in the classification, but the current classification supersedes these historical ones and shows a different point of view. It's valid in the sense that it was stated in a perfectly respectable classification at the time, but this classification is different from the state-of-the-art one. See Wikidata:Glossary#Claims_and_Statements: there is no need for a deprecated subclass of property; it's a feature of the software to mark some claims as deprecated. TomT0m (talk) 20:35, 19 December 2013 (UTC)
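A minimal sketch of how ranked statements could be consumed, assuming each statement is a simple dict with a "rank" key. The selection rule (use preferred statements if any exist, otherwise normal ones, never deprecated ones) and the ranks assigned to the example values are assumptions for illustration; they are not an agreed modelling of Cactaceae.

```python
def best_statements(statements):
    """Return preferred-rank statements if any exist, otherwise normal-rank ones.
    Deprecated statements are never returned."""
    preferred = [s for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s["rank"] == "normal"]


# Hypothetical ranking of parent classifications mentioned in this thread.
cactaceae_parents = [
    {"value": "Caryophyllales (Q21808)", "rank": "preferred"},
    {"value": "Opuntiales", "rank": "normal"},
    {"value": "Centrospermae", "rank": "deprecated"},
]
best = best_statements(cactaceae_parents)
```

All three statements stay stored on the item; only the consumer decides which ranks to read, which is the point of using ranks rather than a separate "deprecated" property.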
Sure I'm aware of the ranking feature. And a proposal how to cope with it lays on the dusty table. Am I right that your answer to my initial question is: it depends? --20:57, 19 December 2013 (UTC)
No, you are not. I said subclass of always fits, and the proposition from Felix seems good to me (except that the difference between dusty and not the most recent is kind of fuzzy ;) ). And if you think that it depends, on what do you think it depends? The only exceptions I can imagine are for something that is at the limits of the definition of classification. I can't see a difference between the classification of living things and other kinds. TomT0m (talk) 21:11, 19 December 2013 (UTC)
Fine, then you can answer my initial question: Cactaceae (Q14560) are a subclass of? --Succu (talk) 21:18, 19 December 2013 (UTC)
If you want to shut my mouth, stop playing games and express yourself clearly. Yes, a class generally has several instances, so there is a plural in the name. Or is it that you think this rank is a collection of taxons and not a taxon by itself? (Oh wait, replace taxon by class and this also works!) Or is it that the word class has a meaning in Linnaean taxonomy? No, the plural is not a problem. Because none of these answers will change mine: this still fits. TomT0m (talk) 21:28, 19 December 2013 (UTC)
I think we (i.e. the active members of this project) have expressed ourselves clearly in this endless discussion. And yes: shut up, if you are not ready to answer a simple question. --Succu (talk) 21:40, 19 December 2013 (UTC)
stingy ! TomT0m (talk) 22:08, 19 December 2013 (UTC)
Maybe it's „stingy” for you, but not for us. But let me be more blunt: did you improve any item within the scope of this project? Have you ever written an article about a „taxon”? --Succu (talk) 22:28, 19 December 2013 (UTC)
No need to say that much to make that point. I'm not part of the cast, OK. But I plan, or will try, to push a gadget based on the class type, which could suggest the properties needed for any type of item, and which will (or at least could) work for your project as well as for any other specific project. And maybe some more specific properties for more specific taxons. A kind of missing-props gadget extendable by the community, for example just by editing the class item docpage or the class itself. With which drawbacks? TomT0m (talk) 22:46, 19 December 2013 (UTC)
I wish you good luck with your intended gadget, but my question is not answered Cactaceae (Q14560) are a subclass of? --Succu (talk) 23:09, 19 December 2013 (UTC)
try this. --Succu (talk) 23:20, 19 December 2013 (UTC)
OK, this went out of control. I have no clue which kind of answer you want (instances of with a plural do not really make sense to me), nor can I read your thoughts. What your link makes me think is how suboptimal our tools are. If we want to cooperate, we had better be clear about what we want to do and why, and ask ourselves whether we really do things the right way or how we can improve, but this kind of mind game really does not help. But I'm not hostile. TomT0m (talk) 23:36, 19 December 2013 (UTC)
That does not show. Anyway, dividing up parent-taxon relationships into deprecated and valid is not in conformity with the use they are to be put to. Wikidata is supposed to support Wikipedia, and NPoV is (supposed to be) the cornerstone of Wikipedia. - Brya (talk) 04:50, 20 December 2013 (UTC)
It may be a 'simple' proposal, but simplistic may be a better description. For example it does not say what to do with what is now "subclass of", nor how to convince everybody to stop using "subclass of" for what they are now using it. That alone is enough to sink it. - Brya (talk) 18:13, 19 December 2013 (UTC)
subclass of is exactly that now, by definition. And as the rule is to link to the lowest class in the hierarchy, it's exactly parent taxon between taxons (classes of beings). TomT0m (talk) 18:21, 19 December 2013 (UTC)
To emphasise the point, I just read the Caryophyllales article in French Wikipedia, and I'll quote: "classification phylogénétique APG (1998) et la classification phylogénétique APG II (2003) ont étendu cet ordre. L'ordre est plus près de la sous-classe Caryophyllidae de Cronquist" (the APG (1998) and APG II (2003) phylogenetic classifications extended this order. The order is closer to Cronquist's subclass Caryophyllidae). So I'll say yes, in modern classification, an order is some kind of class. TomT0m (talk) 19:05, 19 December 2013 (UTC) OK, I think I'm beginning to get it, there is a class level in the classical classification of living things ... TomT0m (talk) 19:08, 19 December 2013 (UTC)
You can define anything you like, but that does not make it so. That is exactly the problem: you make ex cathedra pronouncements and expect the world to conform to your point of view. - Brya (talk) 18:26, 19 December 2013 (UTC)
Absolutely not. It's how it is defined in Help:Basic membership properties, it is more and more adopted in other parts of Wikidata as the base of the models, and it is how it is used by users, except for honest mistakes, which can happen with parent taxon as well. Filtering subclass of statements is easy by checking the type of the subject and the object; verification is done, as in other parts of the project, by sourcing the statement to a biological database and/or to scientific publications. TomT0m (talk) 18:35, 19 December 2013 (UTC)
Nope, that is not how it is used by any user I have seen at work here. And unless you first convince all users to use it that way, it is not possible to switch. - Brya (talk) 18:41, 19 December 2013 (UTC)
Can you be clearer ? TomT0m (talk) 21:11, 19 December 2013 (UTC)
"Subclass of" is commonly used for all kinds of things, see for example Pyrus (Q434). - Brya (talk) 04:50, 20 December 2013 (UTC)
Yeah, and it's not a problem. Pyrus (Q434) is not a taxon, and subclass of (P279) is also used to classify other things. That fits perfectly in my proposal; for a concrete use you would just have to work with taxon items only. TomT0m (talk) 16:44, 20 December 2013 (UTC)
Sure, if we have to redo everything to do it just your way, you don't have a problem; it is everybody else who will have problems. - Brya (talk) 17:27, 20 December 2013 (UTC)
Redo everything seems kind of an overstatement. TomT0m (talk) 19:05, 20 December 2013 (UTC)
Well, "throw everything out and start from zero", no, but "revisit every item and revise it", yes. Also redesign any algorithm that already exists. - Brya (talk) 06:15, 21 December 2013 (UTC)
No, design a bot and launch it, which is child's play on Wikidata. Also, I made it clear in my proposal that we could keep the current structure as well for as long as we want; we don't have to delete current statements. TomT0m (talk) 11:45, 21 December 2013 (UTC)
Then your bot and you will be happy, but the rest of the world ... - Brya (talk) 12:26, 21 December 2013 (UTC)
Who is the rest of the world? One advantage of aligning practices project-wide is to share the experiences and the tools from one part of the project with other parts. Cooperation and discussion should lead to a consensus where everybody is happy and in which what is useful for one project is useful for everybody. If we are open to each other's points of view. TomT0m (talk) 12:45, 21 December 2013 (UTC)
Ah, so you will be going out into the world to build consensus. Anyway, you learned a valuable lesson here of how not to go about it (at least I hope you paid attention and learned something). - Brya (talk) 13:23, 21 December 2013 (UTC)
  • OK, the lesson is that talking and arguing is harmful; I love your moral :) . Obviously we are nowhere close to a consensus here, and I never planned to claim so; I do not want to put bad faith into this. On the other hand, I'll stop answering out-of-scope subjects, as that is not useful either. TomT0m (talk) 13:53, 21 December 2013 (UTC)
  • Brya, instance of (P31) is currently used in over 4.6 million items. subclass of (P279) is used in all domains other than biological taxonomy, where such claims have been actively removed. Please show me another group that removes P279 statements like this, anywhere on Wikidata. While there are a small handful of domains that still use domain-specific type properties, to my knowledge they all also use P279, including, notably, redundant classifications. While consensus is currently building for replacing domain-specific type properties with P31 and P279, consensus that P31 and P279 are applicable to all items on Wikidata already exists. Then, of course, there's also the fact that the rest of the Semantic Web uses P31 and P279 (in the form of rdf:type and rdfs:subClassOf) instead of domain-specific type properties. Emw (talk) 14:09, 21 December 2013 (UTC)
I removed your simplifying, erroneous tree hierarchy Emw. --Succu (talk) 20:29, 23 December 2013 (UTC)
Succu, please show me the error. I used the 'parent taxon' values of human (Q5) to construct the taxonomic hierarchy for it using 'subclass of', so whatever errors exist would be trivially fixable without removing the 'subclass of' claim. Perhaps you could also respond to the points above and below. Emw (talk) 14:49, 27 December 2013 (UTC)
Simply start with the different circumscriptions of Hominini... Taxonomy is not static. --Succu (talk) 20:33, 27 December 2013 (UTC)
Taxonomy is not static, of course not, but we have the same tools to handle that for the subclass tree as with parent taxon: claims are sourced, so a claim says that this author thinks that ... is a subclass/parent taxon of ..., and a claim can be marked as obsolete, which says that this author thought so but nowadays nobody really believes it anymore. This makes no difference. With the bonus that we can also tag taxons with the tree they are supposed to belong to, as there are also different trees built by different authors that could be included in Wikidata, and we can handle that. TomT0m (talk) 14:50, 28 December 2013 (UTC)
A quick look at the use of instance of (P31) shows that this is quite varied, and not at all different from how I would use it. Hitting the "random item" button a number of times trying to establish usage of subclass of (P279) did not turn up any case of its being used, let alone a pervasive use. BTW, this "Wikiproject" does use subclass of (P279) (even where I would not use it), so it would not be an exception even if a pattern could be established. So, I don't follow you at all. - Brya (talk) 16:13, 21 December 2013 (UTC)
Brya, please show me examples of where this WikiProject uses subclass of (P279). What I've seen is P279 being actively removed from items about classes of organism by members of this group. Regarding its pervasiveness, I would note that while the property is used on only about 26 thousand items, one would not expect its claim count to be in the same order of magnitude as instance of (P31). Wikidata has many more items about instances than about classes; there are many more instances of Homo sapiens than there are subclasses of Homo. By a more relevant measure of pervasiveness, P279 is used everywhere: it connects classes in a hierarchy across all of Wikidata's many domains into a taxonomy of all knowledge. There is one exception, where P279 is not the primary (or only) mechanism to create class hierarchies: organismic taxonomy. Emw (talk) 16:57, 21 December 2013 (UTC)
Well, 26 000 out of 14 000 000 items is quite a small number (<.2% ?), so probably this Wikigroup is using it above average. The one golden rule on discussions on taxonomy is that if Homo is dragged into it the topic is no longer taxonomy, if ever it was. - Brya (talk) 17:22, 21 December 2013 (UTC)
Brya, again -- as I explained clearly in my previous post -- claim count is not a relevant metric of pervasiveness for subclass of (P279). The relevant metric is the breadth of domains it is used to construct class hierarchies within. As shown here (have you looked at that?), P279 is used across virtually every taxonomy of knowledge -- except organismic taxonomy. Do you contest my claim that P279 has been actively removed from items about classes of organism by members of this group? And please explain what you mean when you say "The one golden rule on discussions on taxonomy is that if Homo is dragged into it the topic is no longer taxonomy." As you can see in my previous post, I was referring to the genus Homo (Q171283); the comment was clearly about taxonomy. Emw (talk) 17:39, 21 December 2013 (UTC)

How do you plan to connect
For all the above examples, we would like to use P279. All are valid classifications, i.e., you would have "preferred" rank. Additionally, there are taxa with several "preferred"-ranked parent taxa. Don't you see that the usability will be extremely bad if, on top of all the other classifications, an organism also has our half a dozen parent taxa? Just because no one has implemented a subproperty of rdf:type. Still, this is a weak argument. Do you have an example of an OWL model of all organisms? (Pure bio databases do not count, as they are interested only in taxonomy, i.e., we would forbid all of the above examples.) Freebase (which is quite similar to what we implement here) also has specialized properties for biology.  — Felix Reimann (talk) 09:30, 20 December 2013 (UTC)
Felix, these are great examples for discussion. The relation between human (Q5) and person (Q215627) was a major part of a discussion about how to classify items like Coco Chanel. To answer your question, I think 'human subclass of person, human subclass of Homo' is a rare case where multiple values for subclass of (P279) make sense. The next question is which, if either, of these values should be preferred. For that I think Wikipedia has an answer: see Human. The infobox at right indicates a taxonomic subclass relation is preferred. While person is mentioned, it is mentioned in the question of the legal concept of personhood as it concerns abortion law. 'Person' is a ubiquitous colloquial alias for 'human' at least in English, but the formal definition of 'person' is a topic of law, and historically fluid (tectonically so). The statement human (Q5) subclass of (P279) person (Q215627) itself is controversial: it entails all humans are persons, which means Wikidata is implicitly making a statement about the beginning of human personhood. On the other hand, human (Q5) subclass of (P279) Homo (Q171283) is universally accepted without qualification, so I think that should be the preferred 'subclass of' claim.
Most of the other examples seem like they can be captured in conventional, non-type properties. For house cat (Q146), for example, it would probably be better to create a property like diet (Px) to capture the relation between house cat (Q146) and carnivore (Q81875): i.e., I think house cat (Q146) diet (Px) meat (Q10990) would be better than house cat (Q146) subclass of (P279) carnivore (Q81875). Ideally, I think that claim would be applied to the most distant superior taxon for which house cat (Q146) diet (Px) meat (Q10990) would apply. Using conventional properties for attributes like diet would make Wikidata's property system more expressive, while preserving subclass of (P279) for a preferred type claim, or multiple type claims when a subject is classified along two very different axes. Emw (talk) 13:58, 20 December 2013 (UTC)
I don't fully agree, house cat (Q146) subclass of (P279) carnivore (Q81875) could be used to state a constraint on all its subclasses that states that their diet is a subclass of animals, for example. TomT0m (talk) 16:51, 20 December 2013 (UTC)
I will repeat that it is not the question if house cat (Q146) diet (Px) meat (Q10990) would be better than house cat (Q146) subclass of (P279) carnivore (Q81875). The question in hand is not only if the majority of the users in this project feel that way, but especially if this has been fully implemented for all those relationships. Until that moment has arrived this is the wrong place for this discussion. - Brya (talk) 17:27, 20 December 2013 (UTC)
Brya, "I will repeat"? Please show me where the point has been made before. Whether a claim like house cat (Q146) diet (Px) meat (Q10990) would be better than house cat (Q146) subclass of (P279) carnivore (Q81875) is the central question at hand in my reply to Felix.
Properties like instance of (P31) and subclass of (P279) can be used as extremely broad catch-all properties, but that doesn't mean they should be. Such usage diminishes the structured classification for which P31 and P279 exist to much less structured tagging. This necessarily also works to the detriment of our property ecosystem, stunting the creation of new and useful properties. In reality, we could eliminate (or simply not create) many conventional properties and express them as one of a multitude of classes for our items. A plausible hypothetical example:
Option 1
subclass of (P279) Felis silvestris (Q43576)
diet (Px) meat (Q10990)
IUCN conservation status (P141) Least Concern (Q211005)
temporal range start (P523) Neolithic (Q36422)
...
Option 2
subclass of (P279) Felis silvestris (Q43576), carnivore (Q81875), organisms of Least Concern (Qx), Neolithic organisms (Qy), ...
In fact, the claim house cat (Q146) subclass of (P279) carnivore (Q81875) is already redundant. house cat (Q146) is already a subclass of carnivore (Q81875): replacing P171 with P279 would allow us to express house cat (Q146) subclass of (P279) Carnivora (Q25306) in a way readily deducible by external standards-compliant projects, and Carnivora (Q25306) is defined by diet (Px) meat (Q10990).
The class of a subject is a special property that entails the subject has certain values for other, conventional properties. As explained above, house cat (Q146) subclass of (P279) Carnivora (Q25306) entails house cat (Q146) diet (Px) meat (Q10990). The taxonomic classification of house cat (Q146) also entails other attributes, e.g. house cat (Q146) subclass of (P279) mammal (Q7377) would entail thermoregulation (Py) endothermy (Q15401780) while house cat (Q146) subclass of (P279) Placentalia (Q25833) would entail mode of birth (Pz) viviparity (Q120446). Emw (talk) 18:52, 21 December 2013 (UTC)
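A minimal sketch of the entailment idea described above: a value stated once on a class is looked up for any subclass by walking up the subclass of links. The helper `inherited_value`, the shortened subclass chain (intermediate ranks omitted), and the placeholder properties diet (Px) and thermoregulation (Py) are assumptions for illustration only.

```python
def inherited_value(item, prop, subclass_of, claims):
    """Breadth-first walk up the subclass hierarchy; return the value of `prop`
    from the nearest class that states it, or None if no superclass does."""
    seen = set()
    queue = [item]
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        if prop in claims.get(current, {}):
            return claims[current][prop]
        queue.extend(subclass_of.get(current, []))
    return None


# Simplified, hypothetical chain; real taxonomy has more intermediate ranks.
subclass_of = {
    "house cat (Q146)": ["Felis silvestris (Q43576)"],
    "Felis silvestris (Q43576)": ["Carnivora (Q25306)"],
    "Carnivora (Q25306)": ["mammal (Q7377)"],
}
claims = {
    "Carnivora (Q25306)": {"diet (Px)": "meat (Q10990)"},
    "mammal (Q7377)": {"thermoregulation (Py)": "endothermy (Q15401780)"},
}
diet = inherited_value("house cat (Q146)", "diet (Px)", subclass_of, claims)
```

In this sketch nothing is stated on house cat (Q146) itself; both values are derived from its superclasses, which is the sense in which the class "entails" the conventional properties.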
So your general goal is to replace well sourced property/value pairs like IUCN conservation status (P141) = Least Concern (Q211005) with artificial classes like organisms of Least Concern (Qx) and subclass of (P279) of organisms of Least Concern (Qx)? --Succu (talk) 20:12, 27 December 2013 (UTC)
As I have said quite a lot of times already, this is absolutely not exclusive :), we can do both. I personally have no objections to IUCN conservation status (P141). A good definition of the class of organisms of Least Concern would actually need this property. As for the artificial class, every class is somewhat artificial; what is important is the relevance of the class definitions. TomT0m (talk) 13:19, 28 December 2013 (UTC)
Redundancy is seldom a good idea. But I think you should test this idea and replace sex or gender (P21) = female (Q6581072) by subclass of (P279) of human females. --Succu (talk) 16:03, 28 December 2013 (UTC)
Nobody wants to do that. TomT0m (talk) 18:39, 29 December 2013 (UTC)
But this is the intention of Option 2: replacing properties with classes. Why not this replacement? --Succu (talk) 19:21, 29 December 2013 (UTC)
He says we could, not we should. Plus, I will repeat that these options are not exclusive; they are in many ways complementary (for example, a more complex class such as women can be defined as instance of (P31) = <human> and sex = female). TomT0m (talk) 20:51, 29 December 2013 (UTC)
Maybe we should do some magic and call him by his name, Emw? --Succu (talk) 21:06, 29 December 2013 (UTC)
  • As for redundancy, it may sometimes not be a good idea, but I'm not so sure that is the case on Wikidata. For example, it could be a protection against mistaken edits or vandalism: if the redundant properties are not consistent, there is a problem. On an open wiki, that can happen at any moment. Actually, in communication redundancy is important: for example, in natural language there are a lot of redundancies, and network code carries redundant information almost all the time, as the network is rarely perfect. TomT0m (talk) 18:48, 29 December 2013 (UTC)
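A minimal sketch of the consistency check suggested above, assuming a hypothetical policy under which parent taxon (P171) and subclass of (P279) are kept as redundant copies on taxon items. The function `inconsistent_items` and the second item (an invented id with invented values) are illustrative assumptions, not an existing report.

```python
def inconsistent_items(items):
    """Return the sorted ids of items whose parent taxon (P171) values
    differ from their subclass of (P279) values."""
    return sorted(
        qid
        for qid, claims in items.items()
        if set(claims.get("P171", [])) != set(claims.get("P279", []))
    )


items = {
    # Consistent: both properties point to Caryophyllales (Q21808).
    "Q14560": {"P171": ["Q21808"], "P279": ["Q21808"]},
    # Hypothetical item where one copy was changed by mistake or vandalism.
    "Q-hypothetical": {"P171": ["Q21808"], "P279": ["Q-wrong"]},
}
flagged = inconsistent_items(items)
```

A mismatch does not say which of the two values is wrong; it only flags the item for a human or a bot to look at.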