Wikidata talk:WikiProject Taxonomy/Archive/2015/08

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Possible duplicates (taxon name (P225))

I've made some simple spell checks for taxon name (P225) and detected more than 5000 possible duplicates. Please help to check them. --Succu (talk) 16:40, 14 August 2015 (UTC)

This is bad. Not unexpected but bad. Some of them are easy, but very likely there will be dubious cases in the animals, very hard (to impossible) to crack. Just goes to show how useful a property "gender" would be. - Brya (talk) 18:36, 14 August 2015 (UTC)
Those look like the standard bot-created Wikipedia pages just from a single spot check (Efferia californica (Q14511125) and Efferia californicus (Q14627038)). There are bound to be some false positives, but otherwise... --Izno (talk) 18:38, 14 August 2015 (UTC)
That is not really saying much as the vast majority of items on taxa deal with bot-created Wikipedia pages ... - Brya (talk) 18:56, 14 August 2015 (UTC)
Not everyone knows what's obvious and what isn't. (Ergo, it's not obvious.) --Izno (talk) 05:16, 15 August 2015 (UTC)
Well, there are some users to whom nothing is obvious, no matter what. - Brya (talk) 06:07, 15 August 2015 (UTC)
I think this is only the tip of the iceberg and it's much worse. Spelling errors in generic names are harder to detect (Q943119, Q10795776). --Succu (talk) 19:13, 14 August 2015 (UTC)
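The kind of check discussed in this thread can be sketched roughly as follows. This is only an illustration, not Succu's actual method: the list of endings and the helper names `normalize` and `possible_duplicates` are assumptions made for the example, and the sketch only catches epithets that differ by a Latin gender ending, as in the Efferia pair Izno spot-checked. Misspelled generic names would need a different approach.

```python
from collections import defaultdict

# Hypothetical set of interchangeable Latin gender endings (an assumption,
# not the rule set actually used for the report discussed above).
GENDER_ENDINGS = ("us", "um", "a")


def normalize(name: str) -> str:
    """Reduce a binomial to a key that ignores the epithet's gender ending."""
    parts = name.strip().split()
    if len(parts) != 2:
        # Not a binomial: compare on the lower-cased name only.
        return name.strip().lower()
    genus, epithet = parts[0].lower(), parts[1].lower()
    for ending in GENDER_ENDINGS:
        # Keep at least three letters of stem so short epithets are untouched.
        if epithet.endswith(ending) and len(epithet) > len(ending) + 2:
            epithet = epithet[: -len(ending)]
            break
    return f"{genus} {epithet}"


def possible_duplicates(items):
    """Group item IDs whose taxon names share the same normalized key."""
    groups = defaultdict(list)
    for qid, name in items.items():
        groups[normalize(name)].append(qid)
    return {key: sorted(qids) for key, qids in groups.items() if len(qids) > 1}


sample = {
    "Q14511125": "Efferia californica",
    "Q14627038": "Efferia californicus",
    "Q140": "Panthera leo",
}
duplicates = possible_duplicates(sample)
```

Every group returned is only a candidate: as noted above, there are bound to be false positives, so each pair still needs a human check.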

Translation

Some people on the French Bistro and on the equivalent project on frwiki have said they have problems with all the help pages being in English, so I took it upon myself to ask for this page to be made translatable. As members of this project, do you have any objection?

Second point: the tutorial, which is probably even more interesting but is unfinished and as such likely to change. Is it time to mark it translatable? author  TomT0m / talk page 08:41, 12 August 2015 (UTC)

The tutorial does not feel ready for translation. We are still feeling our way into a basic structure. Probably need several more properties. - Brya (talk) 17:05, 12 August 2015 (UTC)
I agree. There are open tasks, e.g. the integration of the solution we found for types. The project page is not a help page. It provides a basis for discussing things related to taxonomy. Most of it (the property list) is autotranslated. The rest is not really important. --Succu (talk) 17:46, 12 August 2015 (UTC)
Well, translations can be modified as well, the page does not change that much, and translating it may help others join the discussion. author  TomT0m / talk page 17:53, 12 August 2015 (UTC)
I'm sure that people contributing to an English discussion here don't need a translation of the project page. --Succu (talk) 18:08, 12 August 2015 (UTC)
I'm talking about people who do not contribute here but do contribute to local taxonomy projects. They probably have an opinion all the same, and someone like me can act as the link if they do not easily converse in English, for example (and I'm not talking about those who could write in English but will not because of the language; people are twisted sometimes). author  TomT0m / talk page 18:13, 12 August 2015 (UTC)
TomT0m, not a single local taxonomy project links to this page. --Succu (talk) 19:40, 12 August 2015 (UTC)
and you don't want this to change ? author  TomT0m / talk page 19:58, 12 August 2015 (UTC)
Ask these „people on the French Bistro“ to change this, TomT0m. --Succu (talk) 20:22, 12 August 2015 (UTC)
What do you think I'm doing? ;) This is not the easiest thing to do in taxonomy because of all the controversies and almost political issues, and people tend to fear losing control, but discussion after discussion this is changing. Some are interested in Wikidata but already have too much on their plate to take the time to come here. There is also the language issue with people who prefer to discuss in French (it is especially for them that it's important to show goodwill on the translation side). But if it's just for the interwiki, I guess you or I can sort this out alone in no time :) author  TomT0m / talk page 20:32, 12 August 2015 (UTC)
Did asking e.g. User:Liné1 never come to your mind? --Succu (talk) 20:46, 12 August 2015 (UTC)
If this kind of stuff stopped me I would have quit this project a long time ago. We see so much good stuff and so much that is depressing that if you focus on the bad you're overwhelmed. And then Wikipedia just can't exist and is just a pile of poo. author  TomT0m / talk page 20:58, 12 August 2015 (UTC)
TomT0m, I hope you are well?! --Succu (talk) 21:12, 12 August 2015 (UTC)

OK, nobody opposed, except Succu elsewhere. I'll put the <translate>...</translate> tags back once you are done with the Succu/Tinm edit conflicts. author  TomT0m / talk page 19:16, 24 August 2015 (UTC)

@Succu: Are you kidding? Seriously, reverted again? What game are you playing? author  TomT0m / talk page 12:00, 25 August 2015 (UTC)
I'm the one who mainly maintains this page at the moment. I see no argument that a translation is useful. --Succu (talk) 12:08, 25 August 2015 (UTC)
That's really annoying. Are you really waiting for me to make the change just for the pleasure of reverting it??? author  TomT0m / talk page 12:37, 25 August 2015 (UTC)

  WikiProject Taxonomy has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Time for a project-wide consultation. Personally, I see it as a request from taxonomists in the French WikiProject (I think Tinm can concur). author  TomT0m / talk page 12:37, 25 August 2015 (UTC)

Succu's right, he's the current reference user for this project, you can't act as if « nobody opposed » and he didn't exist. If there's no agreement, that's how it is. Nevertheless, I think having a taxonomy tutorial/help page in French would be something useful and in line with Wikidata's principles, and following en:WP:Be bold TomT0m should be allowed to translate this page if he wishes. — Tinm (talk) 14:48, 25 August 2015 (UTC)
There is very little real text in the page, and looking at this, TomT0m has translated most of it already, so I am not sure what the problem is? - Brya (talk) 16:56, 25 August 2015 (UTC)
If I understood correctly, there are translation tags in the text to annotate the passages to be translated, and Succu decided it was "complicated" and reverted everything, destroying my work of course, because the translations were unusable after that. A translation admin then seems to have undone the process. I can't understand this in an international project. author  TomT0m / talk page 17:40, 25 August 2015 (UTC)

original combination (P1403) and basionym (Q810198)

I'm not sure what priority this has (probably not much) but at some point we will probably want to clarify the relationship between original combination (P1403) (zoology) and basionym (Q810198) (botany). There's also “basonym” in microbiology, for which there isn't an item. A merger might be conceivable as frwiki, enwiki, dewiki all have a single article for these terms, while eswiki and itwiki don't seem to mention original combination (P1403) at all. — Tinm (talk) 16:39, 24 August 2015 (UTC)

Yes, at some point it would be a good idea to add a property "basonym". It just hasn't come up. - Brya (talk) 17:29, 24 August 2015 (UTC)
Maybe based on (P144) can totally work :) author  TomT0m / talk page 17:56, 24 August 2015 (UTC)
I don't think so, TomT0m. Basionym (and protonym) have a very specific meaning. — Tinm (talk) 18:06, 24 August 2015 (UTC)
Sure... --Succu (talk) 18:04, 24 August 2015 (UTC)
@Tinm: Are there ambiguities in that context? I mean, if I read the etymology on frwiki it means « nom ayant servi de base à la formation d'un autre nom », i.e. an old name was used to form another, new name. If the item is a taxon name, is there a possibility that this relation occurs in other contexts? Because if there is not, it's entirely possible to label based on (P144) as "basionym" when both the subject and the object are taxonomic names. author  TomT0m / talk page 18:33, 24 August 2015 (UTC)
It is not wise to go by descriptions in Wikidata for concepts that are fairly complex in their details. - Brya (talk) 19:05, 24 August 2015 (UTC)
This is a waste of time. You're totally overthinking it. First, based on (P144) is intended for works of art (like when a movie is based on a book), and second, even if it had a more generic meaning and we could use it, we would still probably want to subclass it into something like “based on the original taxonomic name...” to specify that the modifications of taxonomic names obey the rules given by the systematic codes. That's precisely what basionym means already, as its etymology indeed illustrates. You can create a “stuff that is inspired by other stuff” class if you want, and make basionym a subclass of it, but this is much too abstract. — Tinm (talk) 19:18, 24 August 2015 (UTC)
A similar property can totally be expanded in scope if it's justified. Property creation in Wikidata is painful so it can even be quite convenient. author  TomT0m / talk page 19:21, 24 August 2015 (UTC)
But in this case there's little doubt that we want a systematics-specific property. — Tinm (talk) 19:42, 24 August 2015 (UTC)

Edit conflict...

Succu insists that the properties for authors, botanist author abbreviation (P428) and author citation (zoology) (P835), should appear in the middle of the list describing the properties for taxon (Q16521) on the project page, which I find absurd. What do you think? Here is the disputed edit (I moved these properties to “Related items”). — Tinm (talk) 18:04, 24 August 2015 (UTC)

These are not „related items”, but properties. If you want to reorganize the overview, make a proposal. --Succu (talk) 18:08, 24 August 2015 (UTC)
I would create a “Related properties” section alongside our “Related items” one. It stands to reason... — Tinm (talk) 18:20, 24 August 2015 (UTC)
These "optional properties" are a mixture, with some dealing with the taxon (IUCN), and others dealing with the name (author, type). If anything is to be removed, then other properties should be also. Maybe at some point in the future, once we have a lot more of them, it will make sense to everybody to make two lists. - Brya (talk) 18:59, 24 August 2015 (UTC)
That's not quite right, Brya. All the properties that are listed, other than the two I've given and IPNI author ID (P586) (I had missed it before), are intended for taxon (Q16521) items. (Incidentally, IPNI author ID (P586) improperly declares itself a Wikidata property related to taxa (Q18609040)). — Tinm (talk) 19:32, 24 August 2015 (UTC)
@Tinm: Note that there is the property properties for this type (P1963) to link a class to a property that applies to its instances. The {{PropertyForThisType}} template generates a query to list all these properties plus all of the properties of the parent classes. author  TomT0m / talk page 19:40, 24 August 2015 (UTC)
TomT0m: Is enumerating domain properties for each domain class an OWL recommendation? --Succu (talk) 20:52, 24 August 2015 (UTC)
No more than creating a list of properties in a WikiProject page is an OWL recommendation. author  TomT0m / talk page 08:00, 25 August 2015 (UTC)
@Tinm: I make it about half and half: a taxon does not have a type, a name has a type (at least under the ICNafp, it is more complicated for the ICZN). "Code of nomenclature" applies to names, not to taxa. Etc. Ranges, geographical or temporal, apply to the taxon. Arguably, "taxon synonym" can belong to either, depending on whether it is a nomenclatural / objective or taxonomic / subjective synonym. - Brya (talk) 05:08, 25 August 2015 (UTC)
Oh, I see what you mean. But my point is that technically, regardless of what a property theoretically applies to, these Wikidata properties all have a taxon (Q16521) item as subject, except for a couple, e.g. botanist author abbreviation (P428), which are properties for human (Q5) items and are not used (even as qualifiers) for taxon (Q16521) items. — Tinm (talk) 15:08, 25 August 2015 (UTC)
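The lookup TomT0m describes for properties for this type (P1963), i.e. listing the properties declared on a class plus those of its parent classes, can be sketched as below. This is a toy model under stated assumptions, not the query that {{PropertyForThisType}} generates: the `classes` dictionary, the `PARENT` class ID and the properties assigned to each class are all hypothetical example data.

```python
def properties_for(class_id, classes):
    """Collect properties declared on a class and on all of its parent classes."""
    collected, seen, stack = [], set(), [class_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the subclass graph
        seen.add(current)
        entry = classes.get(current, {})
        for prop in entry.get("properties", []):
            if prop not in collected:
                collected.append(prop)
        stack.extend(entry.get("subclass_of", []))
    return collected


# Hypothetical declarations; "PARENT" stands in for whatever class taxon
# (Q16521) is a subclass of, and the property lists are illustrative only.
classes = {
    "Q16521": {"subclass_of": ["PARENT"], "properties": ["P225", "P171"]},
    "PARENT": {"properties": ["P31"]},
    "Q5": {"properties": ["P428", "P835"]},
}
taxon_properties = properties_for("Q16521", classes)
```

In this toy data the author properties P428 and P835 never show up for taxon (Q16521), which mirrors Tinm's point that they belong to human (Q5) items.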

If we want to reorganize the tables, we should keep in mind that BHL page ID (P687), IUCN taxon ID (P627), IPNI author ID (P586), IPNI publication ID (P2008), ZooBank author ID (P2006) and ZooBank publication ID (P2007) should only be used within references. --Succu (talk) 15:29, 25 August 2015 (UTC)

Sorry Succu, I really don't get what you mean here... — Tinm (talk) 15:55, 25 August 2015 (UTC)
E.g. IPNI author ID (P586) is used as part of the reference for botanist author abbreviation (P428). --Succu (talk) 16:08, 25 August 2015 (UTC)
Is it really intuitive or practical ? As far as I can see, current items totally ignore these instructions. — Tinm (talk) 16:39, 25 August 2015 (UTC)
It is true that "botanist author abbreviation (P428)" and "author citation (zoology) (P835)" are odd, in that they are not used directly in items dealing with taxa (items would be easier to read if they were). - Brya (talk) 17:03, 25 August 2015 (UTC)
That's what I meant. I don't get your point though, isn't it natural to have the information that Linnaeus's abbreviation is "L." at Linnaeus ? — Tinm (talk) 22:38, 25 August 2015 (UTC)

Protesilaus protesilaus

Moved from here

Hi Succu, I recently corrected some errors in the item Q7251582, Protesilaus protesilaus. There is only one taxon with that epithet, which can be found combined with several generic names. The original combination is Papilio protesilaus Linnaeus, 1758. Also the combination Eurytides protesilaus can be found. The current accepted combination is Protesilaus protesilaus but each language version of Wikipedia has to decide for itself whether it takes up the name as Eurytides or as Protesilaus. If a language version would have two articles on this one taxon, then that would be an error. To cut things short: there should only be one item on this taxon in Wikidata to which the articles in different language versions are linked. I don't know exactly how Wikidata copes with redirects on any language version.

When I tried to link the Dutch article to Wikidata, a few weeks ago, I got an error message, and it appeared there were two items dedicated to this one taxon, so I did a merger. A few days later I found out Brya had undone my corrections, so I tried to find out what had gone wrong, did not find anything, and did the best I could to restore what in my opinion was the correct situation. I tried to add a statement on the original combination Papilio protesilaus to the item, but for some strange reason this was automatically changed into another combination of protesilaus, for which there was a Wikidata item, and it was impossible for me to add Papilio protesilaus by hand. Any other name for the original combination didn't make sense, so I undid that.

Wikidata starts to pose real problems for taxonomists if it becomes very difficult if not impossible to add such undisputable claims as an original combination of a name.

I just discovered that you in turn undid my changes in this item. As it is my intention to use Wikidata for the correct links between language versions, and to add correct claims and statements to the items included, I hope you can give me a clue as to what went wrong in your opinion. Sincerely, Wikiklaas (talk) 13:31, 22 August 2015 (UTC)

Hi, Wikiklaas. Are you aware of the Tutorial at Wikidata:WikiProject Taxonomy? --Succu (talk) 09:09, 23 August 2015 (UTC)
If I may, my edits of 1 aug undid Wikiklaas' merge, and instead just moved the one iw-link, I further added a "taxon synonym" relationship (unfortunately unreferenced). This merge had no beneficial aspects, if the relationship is as Wikiklaas portrays it. If a name is not current (is regarded as non-current by the references used) then it is perfectly handled by treating it as a synonym.
        Putting more than one taxon name in an item is an ugly emergency measure, only to be suffered when different Wikipedias use different names (or for higher-order taxa, where priority does not apply), to be resolved as soon as possible. It is easy to see why: if all names ever applied to a taxon were to be dumped into "taxon name" Wikidata would become an unmanageable morass. - Brya (talk) 10:36, 23 August 2015 (UTC)
There is only one butterfly taxon denoted by the epithet protesilaus. If, let's say, the Swedish Wikipedia has it as Eurytides protesilaus and the Dutch has it as Protesilaus protesilaus, then it is obvious these articles should be linked in the same Wikidata item as they refer to the same taxon: they are objective synonyms. My edits were no more than that: bring together in one item the articles on different language versions that are objective synonyms. After all, the idea is that one can find an interwiki link to the article on the same subject (same taxon) on the Swedish version in the Dutch version. This is not effected by creating "taxon synonym" links but by placing the objective synonyms in the same item. Because Eurytides protesilaus, Protesilaus protesilaus and Papilio protesilaus are objective synonyms, any language version should have no more than one article on the basis of that epithet; any other mention of that name (as the original combination in Papilio or the combination in the alternative genus), will have to be a redirect on that language version. By treating the name as a "taxon synonym" the correct interwikilinks are not made, and the system fails in doing what it was supposed to do. Wikiklaas (talk) 22:55, 24 August 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Wikiklaas: “Each language version of Wikipedia has to decide for itself whether it takes up the name as Eurytides or as Protesilaus.” You are right, each Wikipedia can choose. But Wikidata can decide on which name it retains as well ; in this case item Protesilaus protesilaus (Q7251582) is the main Wikidata entry for this taxon, while item Eurytides protesilaus (Q13627517) is merely Wikidata's way of keeping track of the existence of a synonymous name. This means that there shouldn't be any Wikipedia articles linked to Q13627517. Even if they're entitled Eurytides protesilaus, they should be added to Q7251582. The Wikipedia title needn't be the same as that of the Wikidata article. Succu and Brya, let me know if I'm wrong. — Tinm (talk) 01:57, 25 August 2015 (UTC)

As to the purpose of Wikidata, I see this as simply a "central storage for the structured data" for Wikimedia, and especially Wikipedia, with its NPoV mission. Its purpose is not to build The One True Tree of Life, but to store and properly reference all notable points of view. So, the purpose is to reference names so that Wikipedias can decide which names to harvest, and how. If a Wikipedia wants to have more than two pages on the same name (or taxon), who are we to stop them? But they should be real pages: the bot-created pages on sv-wiki (with nothing more than nomenclatural data) are indeed ugly duplicates, but again, who is going to stop them?
        The approach advocated by Wikiklaas is possible (and will have support among entomologists), and can be effected by a new property "species-group name". However, the disadvantages seem obvious: 1) the world at large won't recognize the results and 2) it is going to result in a more complex structure.
        One more point: what pragmatically connects iw-links is putting them in the same item, which is what I did in the first place. - Brya (talk) 05:41, 25 August 2015 (UTC)
An item with multiple scientific names (objective synonyms) and/or all sitelinks has one drawback: it's not possible to create the correct taxobox for all Wikipedias. --Succu (talk) 14:32, 25 August 2015 (UTC)
That's correct. There is a totally valid rationale for this: there is no such thing as "the" correct taxobox. A taxobox is a representation of a scientific view. Choices on which view is presented should be made by well-informed editors on local projects, not by Wikidata. Especially not if there are editors on Wikidata who think there is only one correct taxobox for a certain taxon (a name and all its objective synonyms), let alone only one correct taxobox for a taxon whose status is disputed (accepted as valid by some, seen as a synonym of another name by others). I agree with @Tinm: in the above comment that Wikipedias should only link to one item. In the case of objective synonyms, that is, as in the protesilaus case. And that was what I was trying to realise when my edits were reverted. In the case of subjective synonyms, I'm sure the approach of Brya is much more realistic, and also much more flexible in the long run. Wikiklaas (talk) 21:25, 25 August 2015 (UTC)
Well, with the right software it would be possible. And, in turn, as such software does not exist, it is harder to decide on a good model. A circular problem. So my choice is to set up a structure that can hold all the information available, trusting that someday the software will be there to handle it.- Brya (talk) 16:46, 25 August 2015 (UTC)
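The arrangement Tinm outlines above (one main item carrying all sitelinks, with a separate item merely tracking the synonymous name) can be checked mechanically. The sketch below is only an illustration of that proposal, which the thread does not settle: the dictionary layout, the function name `misplaced_sitelinks` and the svwiki sitelink on Q13627517 are hypothetical (the latter echoes Wikiklaas's "let's say" example).

```python
def misplaced_sitelinks(items, synonyms):
    """Report sitelinks that sit on synonym items instead of the main item.

    `items` maps an item ID to its data; `synonyms` maps a main item ID to
    the IDs of the items that only record a synonymous name.
    """
    misplaced = {}
    for main, synonym_ids in synonyms.items():
        for synonym_id in synonym_ids:
            links = items.get(synonym_id, {}).get("sitelinks", {})
            if links:
                misplaced[synonym_id] = {"move_to": main, "sitelinks": dict(links)}
    return misplaced


# Hypothetical snapshot of the two items discussed in this thread.
items = {
    "Q7251582": {
        "taxon_name": "Protesilaus protesilaus",
        "sitelinks": {"nlwiki": "Protesilaus protesilaus"},
    },
    "Q13627517": {
        "taxon_name": "Eurytides protesilaus",
        "sitelinks": {"svwiki": "Eurytides protesilaus"},
    },
}
report = misplaced_sitelinks(items, {"Q7251582": ["Q13627517"]})
```

Under this model the report would ask for the svwiki link to be moved to Q7251582, whatever title the article carries locally.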

Wouldn't it be possible to use ranks

@Succu Wouldn't it be possible to use ranks ? I've seen these mentioned here and there but I don't know how they work yet. — Tinm (talk) 18:03, 25 August 2015 (UTC)
Setting a rank is always like expressing a taxonomic opinion. We use ranks with parent taxon (P171) for widely accepted higher level taxonomies e.g. APG III system (Q156982). --Succu (talk) 18:20, 25 August 2015 (UTC)
(See also my answer regarding POV on the project discussion page.) We must also keep in mind that in most cases of multiple scientific names there is a scientific consensus and no opinion involved. I'd find it very elegant if we opted for statements in the style of “<lion (Q140)> taxon name (P225) <Felis leo, Linnaeus 1758, deprecated>”. And ranks are flexible enough to handle the cases in which POV would be a problem; we can just leave the “normal” (or the “preferred”, whatever we choose as a default) rank on several statements. From the perspective of a Wikipedia user or program that seeks to import a correct name, the questions to answer are the same anyway, whether we regroup the scientific names or not, and arguably it'd be more elegant and practical if there's a single item. — Tinm (talk) 18:34, 25 August 2015 (UTC)
There are lots and lots of names that are completely stable. But there are also lots of names where two (or more) different positions exist. And in those cases it will prove, often enough, that both sides claim there is consensus. In practice there is a convoluted but seamless gradient of all kinds of situations. - Brya (talk) 18:45, 25 August 2015 (UTC)
Sure enough. ;) I think we could use rank:deprecated for completely undisputed claims and rank:normal and rank:preferred to handle the most subtle cases. For instance we can have two normal-ranked names if there's a two-sided dispute that we don't want to decide, or if there's a widely used name against a marginal (but still supported by some) name we can use one preferred-rank name and one normal-rank name; and at the same time the old, consensually deprecated names for the same taxa can be there with deprecated ranks. — Tinm (talk) 20:13, 25 August 2015 (UTC)
I'm sorry, but my talk page is the wrong place to discuss potential usages of the rank feature. Please move to the project page. --Succu (talk) 20:28, 25 August 2015 (UTC)
Should I move the whole discussion ? It doesn't seem to concern you personally anyway. — Tinm (talk) 20:47, 25 August 2015 (UTC)
Whatever you like. But I think your thread started here. --Succu (talk) 20:53, 25 August 2015 (UTC)
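The rank scheme Tinm proposes in this thread could be consumed by a client along the following lines. This is a minimal sketch, assuming the usual reading of Wikidata ranks (use preferred statements if any exist, otherwise normal ones, never deprecated ones); the statement layout and the function name `best_names` are made up for the example and are not an agreed model for taxon name (P225).

```python
def best_names(statements):
    """Return the names with the best available rank, ignoring deprecated ones."""
    preferred = [s["name"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["name"] for s in statements if s["rank"] == "normal"]


# Hypothetical statements on lion (Q140), following Tinm's example.
lion = [
    {"name": "Panthera leo", "rank": "normal"},
    {"name": "Felis leo", "rank": "deprecated"},
]

# A two-sided dispute left undecided: both names stay at normal rank.
disputed = [
    {"name": "Name A", "rank": "normal"},
    {"name": "Name B", "rank": "normal"},
]

# A widely used name against a marginal one.
weighted = [
    {"name": "Name A", "rank": "preferred"},
    {"name": "Name B", "rank": "normal"},
]
```

A program importing names would then get one name for the lion, both names in the undecided dispute, and only the preferred one in the last case, which is where Succu's objection applies: choosing the preferred name is itself a taxonomic opinion.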

──────────────────────────────────────────────────────────────────────────────────────────────────── As the sideways have now gone their own ways, I still didn't get the answer as to what you saw as a problem when you reverted my edits. Wikiklaas (talk) 21:03, 25 August 2015 (UTC)

Wikiklaas, what exactly was not answered in this conversation? --Succu (talk) 21:07, 25 August 2015 (UTC)
Well, I was trying to realise what Tinm described above: making one item on Protesilaus protesilaus where all the Wikipedias, no matter what combination they used, were linking to, and my edits were reverted twice, once by you. I would like to know what problem you saw when you undid my edits and restored a previous version of the item. Wikiklaas (talk) 21:30, 25 August 2015 (UTC)
I don't think it's worth arguing on that specific case. The revert was because Brya deemed the current situation more appropriate than your modifications, and he's more experienced than you regarding this project. Now as far as I can tell, there's not yet a set-in-stone rule and there's room to discuss what the best way of managing synonyms is. — Tinm (talk) 22:15, 25 August 2015 (UTC)
Well, what I was doing was "making one item on Protesilaus protesilaus where all the Wikipedias, no matter what combination they used, were linking to", changing from Wikiklaas' mixed item on both Protesilaus protesilaus and Eurytides protesilaus. - Brya (talk) 05:32, 26 August 2015 (UTC)

Comments on taxon common name (P1843)

First I apologize for opening so many topics at the same time. These are just comments that cross my mind as I explore the project and that I thought were relevant. They're mostly for future reference as I don't have unlimited time, so treat them with low priority if you wish to.

Also I am not sure how much discussion has already been dedicated to this subject (which is known to be a recurrent source of controversy on all the Wikipedias :D). Let me know if I'm missing obvious things.

I am quite sceptical about how common names are currently handled. There seem to be obvious and major shortcomings. From what I have understood, there are several parallel systems in use:

  1. taxon common name (P1843) property : I am not too sure that this solution would work well on a large scale (with many languages), although it's probably possible to find good solutions to this problem. However, there are critical issues with adding common names as monolingual text, notably (1) it's not possible to ask which species correspond to a given common name and (2) it doesn't intrinsically allow linking to the item that represents the common name itself, if any (e.g. cod (Q13194939), fox (Q8331)).
  2. instance of (P31) common name (Q502895) : Whatever solution we choose for handling common names, this class should probably play a role. Nevertheless it cannot really be used directly, because common names are intrinsically monolingual, in existence and scope.
  3. using the common name as label : it would only work for common names that are unambiguous and unique, and it relies on unstructured data. Hardly conceivable. I've seen approximate common name translations on many items and I already know it'll be a huge mess to clean up because of the freedom associated with labels. For instance, (en)poppy was labelled (fr)coquelicot in French but actually (fr)coquelicot only refers to the (en)field poppy (aka (en)corn poppy, etc.); that item is labelled (en)Papaver rhoeas / (fr)coquelicot and the English common names aren't given.
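Shortcoming (1) of method 1, finding which taxa a given common name refers to, is at bottom a reverse lookup over monolingual text values. The sketch below shows one way a client could build such an index from data it has already retrieved; the input layout and the function name `index_common_names` are assumptions for illustration, and nothing here says how (or whether) Wikidata itself exposes such a query.

```python
from collections import defaultdict


def index_common_names(taxa):
    """Map each (language, common name) pair to the set of taxon items using it."""
    index = defaultdict(set)
    for qid, names in taxa.items():
        for language, text in names:
            # Lower-case the text so lookups are case-insensitive.
            index[(language, text.lower())].add(qid)
    return index


# Hypothetical taxon common name (P1843) values for two items named above.
taxa = {
    "Q130201": [("fr", "coquelicot"), ("en", "field poppy"), ("en", "corn poppy")],
    "Q8332": [("en", "red fox")],
}
index = index_common_names(taxa)
coquelicot_taxa = index[("fr", "coquelicot")]
```

Keying on the language as well as the text keeps (fr)coquelicot from being conflated with a broader English term such as (en)poppy, which is the labelling mistake described in point 3.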

I'll stop here for now, and for the moment I can't see a solution that combines the advantages of methods 1 and 2. Otherwise it'd be a good idea to come up with recommendations regarding point 3 relatively quickly, esp. as I believe that, unlike on the Wikipedias, we'll probably agree rather quickly that “any item whose instance of (P31) is taxon (Q16521) must normally be labelled according to its taxon name (P225)”.

Here are a couple of examples I've found useful (it's important to consider several languages) : Bellis perennis (Q26158) ; Castanea sativa (Q22699) ; Papaver rhoeas (Q130201) and poppy (Q967457) ; red fox (Q8332) and fox (Q8331) ; mouse (Q2751034) (n.b. (fr)souris has a slightly different meaning than (en)mouse ; (fr)souris absolutely mustn't be linked to the Mus genus) ; edit: another cool example, fr:Fourmi rouge and en:Red ant

Tinm (talk) 21:28, 24 August 2015 (UTC)

Would you mind introducing yourself, Tinm, and the thesis you are working on? --Succu (talk) 21:50, 24 August 2015 (UTC)
Of course, no problem. I'm French, and I've contributed for some time on the French Wikipedia, especially in the biology project. I've got a degree in evolutionary genetics. Wikidata is a rather hot topic on frwiki at the moment, both in general and in biology, so I started roaming a bit on Wikidata about a month ago and I think I'm starting to understand how things work a bit better now. I've followed the taxonomy project pages from the start; I was partly pulled here by TomT0m, who's also a regular contributor on frwiki. Regarding taxonomy, we're only just starting to think about integrating some Wikidata content, so I'd be curious to know about the initiatives that were made on dewiki, if any. — Tinm (talk) 02:32, 25 August 2015 (UTC)
Obviously, common names are a headache, unless dealing with Official Lists, that assign one official common name to one recognized species. And even then ... - Brya (talk) 05:14, 25 August 2015 (UTC)
#1: (1) This should be possible. (2) A common name for a group of taxa should have common name (Q502895). I'm not sure that linking fox (Q8331) to all species and genera is useful.
#2: common name (Q502895) is applied to items that are not taxa. But sometimes Wikipedia articles have taxoboxes.
#3: According to Help:Label this is possible. But there are some issues to resolve. We need a consensus on Wikidata to do this (RfC?). What label is applied to Rutilus alburnoides (Q840418)? What happens to the label when a sitelink is moved on Wikipedia?
--Succu (talk) 10:11, 25 August 2015 (UTC)
Well, to me Q840418 is irresolvable, a pure headache. The obvious solution is to split it (one name -> one item), while keeping the iw-links together in one of the items. Separate (but linked items) offer the potential to add plenty of references, without getting tangled up. - Brya (talk) 10:58, 25 August 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Succu It's good if #1.1 is possible, but it's only through the API, right? Regarding #1.2, it can be handled through “linked pages” (i.e. Vulpes probably should be linked to fox anyway) so I don't think it's a problem. Regarding #2, on frwiki we have a “common name” box for these cases, and that works well, but I don't know about the other versions. We may want to let the Wikipedias handle their own issues, too; if a page is actually dedicated to a common name, so be it. Finally, #3 is more difficult than I thought. What's clear though is that the label's content is unpredictable so we don't want to rely on it. @Brya that's the synonyms problem :D We could use ranks too. But whether we want one item for each (scientific) name seems a rather urgent matter to decide on. — Tinm (talk) 16:25, 25 August 2015 (UTC)


@Tinm: ; Taxonomy in the broad and specific sense : I'll give my impressions on this issue as a non-taxonomist who has looked into classification issues in general and how they are modeled and used in the semantic web. First, Wikidata is not really built for linguistic attributes, but for concepts. A "common name" is not really a concept. It does, however, denote a set of animals or organisms, and that is the real topic for Wikidata. Sets of organisms or objects are usually organized in classes (in the general sense, not the taxonomic rank specifically), which are basically groups of organisms together with the reason why someone chose to group them that way. My opinion is that the main object Wikidata is interested in is the set of organisms that some people, the French for example, call by that name. Biological taxonomy builds classes called taxa using scientific criteria, but ordinary people are not bound by this; they may name organisms however they want regardless of taxonomy and group them in ways that make no sense taxonomically. The relation can pretty much be modeled using knowledge-representation classes. For example, let's say we have a common name "whatever" that refers to two similar-looking taxa T1 and T2 that are barely related taxonomically but that ordinary people confuse. We simply create an item for "whatever" and say its organisms are the union of T1 and T2. This item is NOT a taxon, so we don't put a

⟨ whatever ⟩ instance of (P31)   ⟨ taxon ⟩

statement. This pretty much models a homonymy page where it is said "whatever can refer to either T1 animals or T2 animals". We can also imagine other properties to model the relations between taxa and common sets of organisms. For example, if "whatever" (in England) is an equivalent of "quelque chose" in France, but not totally, we can have a property like "almost equals" or "partly covers", with "whatever" and "quelque chose" each having their own items linked by this property. This might need elaboration but conceptually it is quite powerful. author  TomT0m / talk page 07:59, 25 August 2015 (UTC)
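The union idea above can be pictured with a small Python sketch. This is only an illustration under assumptions: the names "whatever", "T1" and "T2" come from the example in the comment, while the organism identifiers, the `ITEMS` dictionary and the `union_of` key are invented for the sketch and are not Wikidata properties.

```python
# Hypothetical item data. "whatever", "T1" and "T2" are the names used in the
# discussion; the organism identifiers and the "union_of" key are invented.
T1 = frozenset({"organism-1", "organism-2"})
T2 = frozenset({"organism-3"})

ITEMS = {
    "T1": {"instance_of": ["taxon"], "members": T1},
    "T2": {"instance_of": ["taxon"], "members": T2},
    # The common-name item is a class of organisms, but it is NOT a taxon.
    "whatever": {"instance_of": [], "union_of": ["T1", "T2"]},
}


def members(name):
    """Return the set of organisms covered by an item."""
    item = ITEMS[name]
    if "union_of" in item:
        result = frozenset()
        for part in item["union_of"]:
            result |= members(part)
        return result
    return item["members"]


def is_taxon(name):
    """True only if the item carries an 'instance of: taxon' statement."""
    return "taxon" in ITEMS[name]["instance_of"]
```

In this sketch "whatever" covers every organism of T1 and T2 without being flagged as a taxon itself, which is the distinction the comment is making.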

This is what we are doing by using "instance of: common name", and what Tinm accepted, but he notes that what is lumped together in one language may not be lumped together in another language. This is indeed so, but in the short term it is not much of a problem, as so far we are dealing with few cases, of powerful memes, present in several languages. Long term this may be different. - Brya (talk) 10:51, 25 August 2015 (UTC)
Not really if I understand correctly. If it is a name it can't be a class of organisms, it's a linguistic entity. A group of organisms is named in some way, it "is" not a name ... author  TomT0m / talk page 11:28, 25 August 2015 (UTC)
I think what TomT0m suggests is to write “instance of: fox”. But again this would mean any given taxon item would have lots of “instance of” statements and consequently (1) it would no longer be easy (e.g. for a program) to know what kind of item it's dealing with (2) reading such “instance of” statements requires exploring the ontology upwards to know the meta-class each class is in; plus, to do this you more or less need to have a guess about what you're looking for. So it's really not practical. — Tinm (talk) 15:50, 25 August 2015 (UTC)
@Tinm: no, subclass of (P279), see Help:Classification. Checking the meta class, in this case, only means checking instance of (P31), however. You seem to know a lot for an ignorant :) author  TomT0m / talk page 16:00, 25 August 2015 (UTC)
Indeed sorry, that didn't make sense, my brain doesn't work this morning. Argument 1 disappears but it doesn't help with argument 2. Whatever the view of the question you adopt, a “common name” property is exactly what we want. We can discuss the kind of value we want it to have but the principle of that property is hardly debatable. — Tinm (talk) 16:47, 25 August 2015 (UTC)
Actually the label is enough if we're discussing vernacular names that are not taxa. For taxa that match the common name, I'm not sure why we could not use the label/aliases either, because we have a property for the scientific name, so it's usable whatever the label in the language is. This poses no "mono/multilingual" label problem, since the label is naturally multilingual and the scientific name is language independent. author  TomT0m / talk page 17:28, 25 August 2015 (UTC)
Well, the way I see it, it is either "one name, one item" or put every property referring to that name in as a qualifier to that name (basionym, IPNI number, Tropicos number, GBIF number, EOL number, etc). Then it would be possible to have more than one taxon name in an item. It would mean having several mini-items within the item. It would still leave the parent taxon issue to be tackled, and it also means giving up on checking for constraint violations. It looks hopelessly complicated to me. - Brya (talk) 17:14, 25 August 2015 (UTC)
I think there is no alternative for "one name, one item" approach. --Succu (talk) 17:59, 25 August 2015 (UTC)
Which means we'd have items for even cryptic names, and also that we'll have to accept some degree of POV. At the very least the interwiki links will have to be regrouped under one particular name, but more than that we'll want to regroup all the concrete data under that name. In other words “one name, one item” implies having reference Wikidata names. — Tinm (talk) 18:12, 25 August 2015 (UTC)
I was not referring to sitelinks. But if you want to „regrouped under one particular name” only one NPOV option is available: put all the sitelinks to the item which covers the „first name usage” (basionym (P566), original combination (P1403), basonym). That means e.g. moving all sitelinks from lion (Q140) to Felis leo (Q15294488) or from Opuntia ficus-indica (Q144412) to Cactus ficus-indica (Q12294235). Looks strange? It is! --Succu (talk) 20:56, 26 August 2015 (UTC)

Various questions

Hello,
I'm (mostly) the main contributor to French taxoboxes (templates/modules), and I have some questions about how some specific information is stored in WD for taxa.
Sorry for this big list of questions. I read the project description but I'm not really clear about some things that can be important.

Let's start with the author: how do I deduce from the author + date fields how to create the displayed name? For Panthera leo, for example, the correct display is "(Linnaeus, 1758)". The date is fine. I have Carl von Linné for the name. I guess I can find the French link (if it exists) on the 'Carl von Linné' item. But how do I choose between "Linné" and "Linnaeus" in the abbreviation list? And how do I know when to add the parentheses? Also, how are more complex cases handled, such as "Duméril in Duméril & Bibron", or even more complicated ones in botany (e.g. "(Pascher) Korshikov ex H. Ettl")?
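On the parentheses question, for zoological names only: under the zoological code, author and year are put in parentheses when the species is now placed in a genus other than the one it was originally described in. A minimal Python sketch of that rule follows, assuming the original genus is already known (for the lion, Felis leo is the original combination). The function name and its parameters are hypothetical, and it does not cover "in" citations or botanical authorship.

```python
def zoological_authority(author, year, current_genus, original_genus):
    """Format a simple zoological author citation.

    Hypothetical helper: parentheses are added only when the current genus
    differs from the genus of the original combination.
    """
    citation = f"{author}, {year}"
    if current_genus != original_genus:
        return f"({citation})"
    return citation


# Panthera leo was originally described as Felis leo.
lion = zoological_authority("Linnaeus", 1758, "Panthera", "Felis")
```

Choosing between "Linné" and "Linnaeus" is a separate problem that this sketch does not address; the author string is simply passed in.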

But in my opinion the main question is how to represent classifications. Different classifications differ in various pieces of information about taxa:

  • taxon name: example → -a corrected to -us, but not in all classifications
  • taxon name (bis): the same taxon is a synonym or not
  • higher taxa: some taxa are attached to a different parent taxon
  • rank: the same taxon can have a different rank (class, subclass…) depending on the classification
  • image/distribution map: both depend on lower taxa. I recently corrected the picture for Uroplatus sikorae, which was a picture of Uroplatus sameiti; that can be right in certain classifications, as it was previously considered a subspecies (U. sikorae sameiti).
  • synonymy, incertae sedis, …: can also be different
  • even the author(s) can change, as can the presence of parentheses…

My supposition is that − when it is the same taxon − we should add each piece of information with a "source" that is the corresponding classification, and use it as a filter (and maybe have a "default" value? But that means an unsourced value, which seems bad).
But maybe that is complicated to apply, no?
This point is very important for us. Each (biology) project has selected classifications that apply to all taxoboxes (even if taxa that exist only in other classifications are also created, of course).

In addition (but for later discussions ;)) we have taxoboxes − mostly in botany − which show both the "classical" classification and the phylogenetic one. How is "phylo" represented? And likewise, there are different phylogenetic classifications.

On my side, I manage a bot dedicated to biology. I have several instances of various classifications in it, with a lot of data for taxa (including IUCN/CITES values), so I can upload a lot of material (I guess I will start with reptiles, my "main" area) if I know how to represent it in WD and in a way that allows us to keep all our editorial choices.

Thanks in advance. Regards, Hexasoft (talk) 19:02, 26 August 2015 (UTC) note: sorry, my English is mostly technical so I'm not always precise in my words when writing something other than computer science documents ;)

A lot of questions indeed. Do you know Wikidata:WikiProject Taxonomy/Tutorial and Module:Taxobox? There you'll find some answers, I think. --Succu (talk) 19:23, 26 August 2015 (UTC)
Thanks. I guess that this module will answer my questions about authors (as I can see complex examples).
However, the part that seems to deal with classifications is this section, but the given example is hard-coded and does not seem to use the classification-based data inside (in the example) Q5146943. By the way, I don't understand the way it is managed: I can see 2 scientific names with classifications, and another one with a different scientific name but no "source", but handling the author+date.
Is this approach the recommended one? I mean no source = taxon data + sources for different data? Does this also (possibly) imply author+date if they differ in other classifications?
If I want to import (let's say) the reptile classification from Reptile Database, should I create all data under "source = ReptileDB", or should I put it in "default" if it does not exist yet?
Regards, Hexasoft (talk) 20:57, 26 August 2015 (UTC)
More than 30,000 statements are referenced with Reptile Database (Q1644501). --Succu (talk) 21:21, 26 August 2015 (UTC)
Nice. But anyway, it doesn't explain to me how to use it, because the example in Wikidata:WikiProject_Taxonomy/Tutorial#Taxonomy_changes is hard-coded. I will look at the module code anyway to see if it is handled inside (but I guess not, because otherwise it would be used in the examples).
BTW how is the "history" of a classification handled? I mean Reptile Database is updated about once or twice a year, so many taxa (items) can change. Should we just modify the items' content or deal with dates and so on? Regards, Hexasoft (talk) 21:29, 26 August 2015 (UTC)
Have a look at Cactaceae (Q14560). If the treatment of a taxon at Reptile Database (Q1644501) is changed what should we do here?-Succu (talk) 21:49, 26 August 2015 (UTC)
I mean: should we keep track of changes with "from" and "to" date qualifiers, or just keep the "current state of classification"?
I'm not sure that such a history can be useful per se, but I prefer to ask about practices (it can be useful in a way I don't know).
For Cactaceae (Q14560), is it about selecting a classification? My previous remark about not knowing how to use it was about selecting a given classification and extracting all data with this criterion. My purpose is to allow showing a taxobox from a given classification's point of view. Regards, Hexasoft (talk) 22:09, 26 August 2015 (UTC)
It is indeed the intent to make it possible to build taxoboxes according to a particular classification. The area I am familiar with is the flowering plants where the most important recent classifications (Cronquist [classical], APG, APG II, APG III [phylo]) have been put in, completely. These are published classifications, so no issues about updates.
        I can't help you with the Desserobdella picta case, as I don't understand it. Anyway, it is not how I think we should be doing things. I think we should migrate completely to a "one name, one item" approach, connecting the items by (referenced) statements like "taxon synonym".
        For authors, it will be quite a while before Wikidata can be used. I would have preferred a simple string, which could be copied directly. The approach adopted means that an author citation depends on other items, which may not exist yet, or may not have complete details on authorship. Long term the chosen approach is more versatile, but short term ... - Brya (talk) 04:13, 27 August 2015 (UTC)
Decisions can be changed ... if the current one is unrealistic, why not a change of policy ? author  TomT0m / talk page 08:19, 27 August 2015 (UTC)
Well… A simple string could be useful for a "quick start" for authors. But it is not very nice: author link(s) will not update when pages move or if an abbreviation changes, for example. But the current system seems very complicated and it will really take a huge amount of time to complete the necessary data! (and what about authors only known by their last name?)
@Brya: not sure I understand. The same "name" (e.g. "Felidae") has different values for some properties depending on the classification you are looking at. E.g. in some classifications (ITIS, MSW…) Felidae is in suborder Feliformia, in others (CoL…) it is in order Carnivora; in ITIS and NCBI it is in suborder Feliformia, in others in superfamily Feloidea.
I don't know what the best way / the decided way is to represent that (in order to be able to get "taxonomy from classification X's point of view"). Regards, Hexasoft (talk) 12:05, 27 August 2015 (UTC)
Well, different classifications are not much of a problem. It is quite possible to put in five, or ten, statements with a different parent taxon, each referenced with a different classification (CoL does not count, and ITIS only in an emergency). The Cactaceae are as good an example as any. So with the right programming it is possible to read an entire classification. Of course, that does not help with different circumscriptions: Malvaceae-according-to-Cronquist is quite different from Malvaceae-according-to-APG.
        As to author citations, the problems are 1) with algae, fungi and plants 2) prokaryotes and 3) full citations for animals. It seems quite doable to import strings of complex author citations (these are available in all the databases). It will be a lot of work, a few hundred thousand names, but it is a matter of importing existing strings into existing items. And these will be directly ready to use for Wikipedias. And "abbreviations" don't change: they were designed to be stable. The system now chosen will yield usable results only when all the basionyms (respectively basonyms) have been tracked down and a new item created for each. It seems safe to state that this is going to take much, much longer, and will be much more work. Long term it will work, but in the short term ... - Brya (talk) 16:43, 27 August 2015 (UTC)
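The approach described above (several parent taxon statements on one item, each referenced with a different classification) could be read by a program roughly as follows. This is a sketch under assumptions: the tuples are a simplified stand-in for parent taxon (P171) statements, the Felidae placements are the ones quoted earlier in this thread rather than checked against the databases, the Mammalia rows are added only to close the chain, and `build_index` and `lineage` are hypothetical helper names.

```python
from collections import defaultdict

# (taxon, parent taxon, classification used as reference).
# Illustrative data only, based on the Felidae example given in this thread.
STATEMENTS = [
    ("Felidae", "Feliformia", "ITIS"),
    ("Feliformia", "Carnivora", "ITIS"),
    ("Carnivora", "Mammalia", "ITIS"),
    ("Felidae", "Carnivora", "CoL"),
    ("Carnivora", "Mammalia", "CoL"),
]


def build_index(statements):
    """Group parent statements by the classification they are referenced with."""
    index = defaultdict(dict)
    for taxon, parent, source in statements:
        index[source][taxon] = parent
    return index


def lineage(taxon, source, index):
    """Walk upwards, following only statements referenced with `source`."""
    parents = index.get(source, {})
    chain = [taxon]
    seen = {taxon}
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in seen:  # guard against circular data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


INDEX = build_index(STATEMENTS)
```

With this data, `lineage("Felidae", "ITIS", INDEX)` passes through Feliformia while `lineage("Felidae", "CoL", INDEX)` goes straight to Carnivora. As noted above, this only selects a hierarchy; it does not capture differing circumscriptions of the same name.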
I think the most sorrow-stricken case is #3. Widely used references like Mammal Species of the World (Third edition) (Q1538807) do not care a lot about original combination (P1403). #1 is easier to resolve, but needs a lot of cross checks. #2: LPSN (Q6595107) is the reference, but LPSN URL (P1991) has not been used much until now. It's doable but needs a lot of energy and help to be done. --Succu (talk) 22:20, 27 August 2015 (UTC)
Ok. So it is probably possible to get a classification from WD, but hardly the author citation and not yet an image legend. On my side, I don't want to have part of the data here and part on WP for the same article; it leads to duplicated edits.
As, in addition, it is not possible to see from WP the changes on WD that modify an article's content (i.e. in the watchlist or in the local history), I will clearly not include WD in our templates/modules for the moment − apart maybe from very simple information such as IUCN/CITES that is still mostly handled by bots.
Thanks for your time and your answers. I will maybe try to feed WD with subsets of data from reptiles to perform tests (but not in main).
Regards, Hexasoft (talk) 08:41, 28 August 2015 (UTC)

Examples of synonyms

I have been asked to give examples of how to handle synonyms. I am still feeling my way into this. At present:

Rhaponticum chinense (Q20850513) is a current name (according to the Flora of China), with
Klaseopsis chinensis (Q20850345) a synonym, and
Serratula chinensis (Q10906008) a synonym, and also the basionym of the other two.

The iw-links are together and happen to be in Serratula chinensis (Q10906008).

Another example

Rocio octofasciata (Q3737) is a current name (according to Catalog of Fishes) with
Cichlasoma octofasciata (Q10904620) a synonym, and
Heros octofasciatus (Q20873580) a synonym, and also the original combination.

But according to ITIS the synonymy is the other way around. The iw-links are together and happen to be in Rocio octofasciata (Q3737).

So, it is possible to have both "taxon synonym" and "synonym of" in the same item, with appropriate references. An open question is how many references should be used: if the statements are clean (that is if the MSW-ID:10400125 is in item Q195023 and not in Q20829969) this may be clear enough, be explicit enough, without also putting them in as references to "taxon synonym" (this would save a lot of work). References should preferably be 'real taxonomic references' anyway. - Brya (talk) 05:27, 28 August 2015 (UTC)
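The two examples above can be written down as referenced "synonym of" statements and queried per reference. The following Python sketch assumes a flat list of tuples in place of real Wikidata statements; `current_name` and `synonyms_of` are hypothetical helpers. The ITIS view of the second example is left out because the thread does not spell out which name ITIS treats as current.

```python
# (synonym, current name, reference), taken from the two examples above.
SYNONYM_OF = [
    ("Klaseopsis chinensis", "Rhaponticum chinense", "Flora of China"),
    ("Serratula chinensis", "Rhaponticum chinense", "Flora of China"),
    ("Cichlasoma octofasciata", "Rocio octofasciata", "Catalog of Fishes"),
    ("Heros octofasciatus", "Rocio octofasciata", "Catalog of Fishes"),
]


def current_name(name, reference):
    """Resolve a name to its current name according to one reference.

    If the reference has no 'synonym of' statement for the name, the name is
    returned unchanged (which does not prove that the reference accepts it).
    """
    for synonym, current, ref in SYNONYM_OF:
        if synonym == name and ref == reference:
            return current
    return name


def synonyms_of(name, reference):
    """List the synonyms a reference gives for a current name."""
    return sorted(s for s, current, ref in SYNONYM_OF
                  if current == name and ref == reference)
```

Because every statement carries its reference, a second source with the opposite synonymy can be added as further tuples without touching the existing ones.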

Hi Brya. Thank you for the examples. I'll look into it. — Tinm (talk) 18:23, 28 August 2015 (UTC)

incertae sedis

Is it ok to use incertae sedis (Q235536) for parent taxon (P171) (example, I see it is still not used anywhere), or just to set the value to unknown? --Termininja (talk) 10:50, 30 August 2015 (UTC)

incertae sedis (P678) is used as a qualifier to parent taxon (P171) e.g. Biophomopsis (Q10429817). --Succu (talk) 10:58, 30 August 2015 (UTC) PS: BTW: I started working on fungi missing parent taxon (P171) yesterday.
There is always a parent taxon, but sometimes a genus has not been assigned to a family, etc. So fill in the known parent taxon, and use "incertae sedis" to indicate the (primary) ranks at which the genus has not been assigned. For fungi the Index Fungorum is useful for information and for linking to. - Brya (talk) 11:13, 30 August 2015 (UTC)
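As a rough picture of the advice above (fill in the known parent taxon and mark the unassigned ranks with "incertae sedis" as a qualifier), here is a small Python sketch. The class, its field names and the example taxon names are all invented for illustration; they do not reproduce the Wikidata data model or the actual placement of any genus.

```python
from dataclasses import dataclass, field


@dataclass
class ParentTaxonStatement:
    """Hypothetical stand-in for a parent taxon statement with a qualifier."""
    taxon: str
    parent: str  # the nearest parent taxon that IS known
    incertae_sedis: list = field(default_factory=list)  # ranks not assigned


def describe(statement):
    """Render the statement as a short human-readable line."""
    text = f"{statement.taxon}: parent taxon {statement.parent}"
    if statement.incertae_sedis:
        ranks = ", ".join(statement.incertae_sedis)
        text += f" (incertae sedis at rank: {ranks})"
    return text


# Placeholder names, not real taxa.
example = ParentTaxonStatement("Genus-X", "Class-Y", ["order", "family"])
```

The parent value is never left empty here; the qualifier only records which intermediate ranks are missing.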

Oh, I didn't know that there is a property for this. Thanks --Termininja (talk) 11:42, 30 August 2015 (UTC)
