Wikidata talk:WikiProject Taxonomy/Archive/2016/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Data model

Latest comment: 8 years ago, 28 comments, 5 people in discussion

(@Brya, Succu: Here is the idea I mentioned on the Phenobot talk page.)

Below is an attempt to solve the known issues with the current data model for taxonomy and names. It is a draft, so your thoughts are welcome! :)

The limitations I am thinking of are, among other things :

it is hard to make several names coexist, especially when they are associated with different taxonomic points of view
there is no consistency between how taxon names and other names are handled; also, the way names that cannot refer to taxa (e.g. invalid names) are handled creates problems (they get dedicated items, but taxonomic author statements still (would) have to be used as qualifiers, etc.)
some taxa (i.e. groups of organisms) have multiple items where all the statements have to be duplicated almost identically; to my eyes this is really an aberration. By itself this creates various problems, starting with the question of where sitelinks should go, but it also largely explains why the rules of the current model are hard to grasp, and why many contributors have found it unintuitive.

The model I propose below is conceptually clearer and more powerful. And although switching to it would clearly require a significant effort, it is definitely feasible.

Details

Proposed model

The basic idea of the model is that items that deal with taxa (i.e. groups of organisms) and items that deal with names should be distinct, at least in principle (see below). Thus, we could have a <nomenclatural name> class beside the <taxon> class. Subclasses such as <botanical name>, <earlier homonym> (itself a subclass of the former), etc., are possible. Instances of this class (name items) would declare a “latinized spelling” or similar property.

Distinguishing names and taxa like this would solve a good part of the problems we face. In particular the dependence between systematics and nomenclature can be taken care of by writing “applies to part <[name]>” on the “parent taxon” property statements to link those statements to the values of the “taxon name” property statements. (If this is not clear, see the examples, below).

In principle, the downside of this system would be that it is necessary to create at least two items for each taxon : the taxon itself and its name. (The total number of statements would be nearly the same as before, though.)

Combined items

However, to keep editing simple, combined taxon/name items are possible. In this case, the combined item would declare both “P31 <taxon>” and “P31 <nomenclatural name>”.

In some ways this would be similar to the current system, but the “P31 <nomenclatural name>” statement makes a huge difference: properties that apply to taxa and properties that apply to names can be defined as such (because each property explicitly declares what it applies to), so both aspects of the item can live independently without causing problems (no qualifiers-or-not-qualifiers issues, etc.).

This may seem a bit messy, but because the combined item would be an instance of each class, the price to pay to allow combined items is rather small. In particular the symmetry between names (if there are several) is not necessarily broken, or at least it is possible to access the data as if it were not. It even makes sense for the item to reference itself for the “taxon name” property (“taxon name <self>”), as it is clear that the subject is the item in the sense of the taxon and the object is the item in the sense of the name. This extra, seemingly pointless statement is the main downside, but even if it is weird it still makes sense.

The other issue to consider is that splitting a combined item, if it happened to be necessary, requires updating all the links that pointed to the part that has been moved. For biology-related properties this would not be a problem, as we can expect them to declare the class of the objects they point to (e.g. hybrid of (P1531) obviously points to a taxon, not a name), but for very general properties it may not be possible to know for sure. It is likely the taxon, as in “<Arms of Canada (Q41549)> depicts (P180) <lion (Q140)>”, but it would be hard to trust a bot on such calls.

Not a taxon name

One last thing, regarding names that cannot be used for taxa (those that are not “instance of <taxon name>”, see the examples below): in order to keep the database clean, they should not be allowed to link to other names. In technical terms, the property “synonym” should require its subject to be an instance of <taxon name> (needless to say, the same goes for “taxon name”, which is a property for taxon items, whereas “synonym” is a property for names). Names that can't be used for taxa should only include the minimal information for self-definition, such as a reference, an author (the “author” property is not to be used as a qualifier, as we now distinguish between taxa and names) and a date. And that's it. In addition, at the taxon name item(s), synonym statements pointing to names that are not instances of <taxon name> could be given the deprecated rank, so that not-taxon names could be ignored altogether when retrieving synonyms. Not-taxon names should also not be used to express taxonomic points of view (not used with “applies to part”); in short, they should only ever be referenced through “synonym” claims having the deprecated rank.

All of these rules ensure that such names will never swamp the database. Also, note that if the applicability of a name to a taxon is debated, the corresponding item may be an instance of both <taxon name> and <not a taxon name> with the appropriate specific references (from a database point of view such a name would behave like a taxon name).
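As a minimal sketch of the retrieval rule described above, synonym statements with the deprecated rank can simply be skipped when collecting synonyms. The `Statement` record and the `usable_synonyms` helper are hypothetical names made up for illustration; they are not part of Wikidata or of any existing library, and the sample statements are borrowed from the Drosophila and Panthera examples in this proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified stand-in for a Wikidata statement."""
    prop: str             # property label, e.g. "synonym"
    value: str            # label of the item the statement points to
    rank: str = "normal"  # "preferred", "normal" or "deprecated"


def usable_synonyms(statements):
    """Return synonym values, ignoring statements with the deprecated rank."""
    return [
        s.value
        for s in statements
        if s.prop == "synonym" and s.rank != "deprecated"
    ]


# Sample data taken from the examples in the proposal.
drosophila_melanogaster = [
    Statement("synonym", "Drosophila ampelophila", rank="deprecated"),
    Statement("parent taxon", "Drosophila"),
]
panthera_leo = [Statement("synonym", "Felis leo")]

print(usable_synonyms(drosophila_melanogaster))
print(usable_synonyms(panthera_leo))
```

With this convention a consumer never has to inspect the class of the target name: filtering on rank alone is enough to leave out the not-taxon names.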

Concrete example

Concretely, we would have something like what's in the code blocks below.

First some name classes :

scientific name -- generic class that we probably need to handle names not covered by the codes e.g. strains or things like Bacteria/Eubacteria, Eukaryota/Eukarya...

nomenclatural name (name that is governed by one of the biological codes of nomenclature)
subclass of <scientific name>

zoological name (name governed by the International Code of Zoological Nomenclature)
subclass of <nomenclatural name>
-- most names should be instances of this class, or <botanical name>, <microbiological name>, etc.

original combination
subclass of <zoological name>

recombination
subclass of <zoological name>

-- etc. for all codes and all terms, but in the beginning we only need zoological/botanical/... name, and we can give items more precise classes later. I mentioned orig. comb. and recombination only because they are relevant for Panthera leo, and they're necessary to write the author strings, e.g. “(Linnaeus, 1758)”, with parentheses.

More importantly, we would also have :

taxon name (name that can legally refer to a taxon)
subclass of <scientific name> -- again I intended this to be a subclass of nomenclatural name, but with strains and all this is probably more simple

not a taxon name (nomenclatural name that cannot legally refer to a taxon)
subclass of <nomenclatural name>

valid name
subclass of <zoological name>
subclass of <taxon name>

invalid name
subclass of <zoological name>
subclass of <not a taxon name>

-- etc. for other codes, but again it is maybe not necessary to go into too much detail for now; it is obvious that a name that is both an instance of <botanical name> and <taxon name> must be a correct name.
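To make the class listing above easier to check, here is a small sketch that encodes the "subclass of" lines as a dictionary and tests class membership transitively. The dictionary copies the hierarchy from the listing; the function names `ancestors` and `is_instance_of` are assumptions made for this illustration only.

```python
# "subclass of" relations copied from the listing above.
SUBCLASS_OF = {
    "nomenclatural name": {"scientific name"},
    "zoological name": {"nomenclatural name"},
    "original combination": {"zoological name"},
    "recombination": {"zoological name"},
    "taxon name": {"scientific name"},
    "not a taxon name": {"nomenclatural name"},
    "valid name": {"zoological name", "taxon name"},
    "invalid name": {"zoological name", "not a taxon name"},
}


def ancestors(cls):
    """Return every class reachable from cls through 'subclass of'."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(instance_classes, target):
    """True if any declared class is the target or a subclass of it."""
    return any(c == target or target in ancestors(c) for c in instance_classes)


# A constraint such as "the subject of 'synonym' must be a taxon name"
# then reduces to one call.
print(is_instance_of({"valid name"}, "taxon name"))
print(is_instance_of({"invalid name"}, "taxon name"))
```

Under this reading an item declared as <invalid name> is still a <scientific name>, but it never passes a check for <taxon name>, which is what the synonym constraint relies on.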

And the actual items would look like this :

Drosophila melanogaster :

Drosophila melanogaster (species of insect)
instance of <taxon> -- taxon part
instance of <zoological name> -- name part (a more precise subclass is possible)
instance of <valid name> -- also name part, if possible names should always explicitly be instances of valid/invalid name in addition to their nature
latinized spelling <"Drosophila melanogaster">
author <Meigen>
description date <1830>
synonym (rank:deprecated) <Drosophila ampelophila (zoological name)>
taxon name <[self]>
rank <species>
parent taxon <Drosophila>

Drosophila ampelophila (zoological name)
instance of <junior synonym> -- or of <zoological name>
instance of <invalid name>
author <Loew>
description date <1862>

Panthera leo ; here we actually have the choice to use a combined item or not as there are two valid names (possibly some taxa may use one system and some the other, I think it doesn't create problems as long as it is clear what's what, and to my eyes it is). Anyway, the example is given with a combined item :

Panthera leo (the species of mammal lion)
instance of <taxon>
instance of <recombination> -- or of <zoological name>
instance of <valid name>
-- name-related properties
latinized spelling <"Panthera leo">
original combination <Felis leo> -- or we can just write instance of <recombination> of <Felis leo>
synonym <Felis leo>
-- taxon-related properties
taxon name <[self]>
taxon name <Felis leo>
rank <species>
parent taxon <Panthera> applies to part <[self]> -- i.e. applies to “Panthera leo”
parent taxon <Felis> applies to part <Felis leo>
-- etc.

Felis leo (zoological name)
instance of <original combination>
instance of <valid name>
latinized spelling <"Felis leo">
author <Linnaeus>
description date <1758>
synonym <Panthera leo>
etc.

The weirdest part is definitely the “taxon name <self>” statement at the combined item. This statement may be optional as it is implicit but I think it is better to explicitly preserve the symmetry between names. The statement may still be optional for taxa that declare a single taxon name, though. In any case it does not seem desirable to /require/ a split of the combined item as soon as a second taxon name is added.

Finally, Uroplatus sameiti is a taxon that has been promoted from subspecies to species :

Uroplatus sameiti (species of lizard)
instance of <taxon>
taxon name <Uroplatus sameiti (name)>
taxon name <Uroplatus sikorae sameiti (name)>
rank <species> applies to part <Uroplatus sameiti (name)>
rank <subspecies> applies to part <Uroplatus sikorae sameiti (name)>
parent taxon <Uroplatus> applies to part <Uroplatus sameiti (name)>
parent taxon <Uroplatus sikorae> applies to part <Uroplatus sikorae sameiti (name)>

-- Varanus brevicauda -- a name with alternative spellings

Varanus brevicauda (zoological name)
instance of <zoological name> -- a more precise subclass may of course be used
instance of <valid name>
latinized spelling <"Varanus brevicauda"> stated in <ITIS>
latinized spelling <"Varanus brevicaudus"> stated in <The Reptile Database>
author <George Albert Boulenger>
description date <1898>
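The role of the “applies to part” qualifier in the Uroplatus sameiti item above can be sketched as a small lookup: given the name that expresses a taxonomic point of view, keep only the statements that are unqualified or qualified with that name. The `Claim` record and the `view_under_name` helper are hypothetical and only mirror the pseudo-statements of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical statement with an optional 'applies to part' qualifier."""
    prop: str
    value: str
    applies_to_part: Optional[str] = None


# Statements copied from the Uroplatus sameiti example.
UROPLATUS_SAMEITI = [
    Claim("taxon name", "Uroplatus sameiti"),
    Claim("taxon name", "Uroplatus sikorae sameiti"),
    Claim("rank", "species", "Uroplatus sameiti"),
    Claim("rank", "subspecies", "Uroplatus sikorae sameiti"),
    Claim("parent taxon", "Uroplatus", "Uroplatus sameiti"),
    Claim("parent taxon", "Uroplatus sikorae", "Uroplatus sikorae sameiti"),
]


def view_under_name(claims, name):
    """Collect the taxon statements that hold when the taxon is called `name`."""
    view = {}
    for claim in claims:
        if claim.prop == "taxon name":
            continue  # the names themselves are not name-dependent
        if claim.applies_to_part in (None, name):
            view.setdefault(claim.prop, []).append(claim.value)
    return view


print(view_under_name(UROPLATUS_SAMEITI, "Uroplatus sameiti"))
print(view_under_name(UROPLATUS_SAMEITI, "Uroplatus sikorae sameiti"))
```

Statements without a qualifier are treated as valid under every name, which is one possible reading of the proposal; a stricter reading could require the qualifier as soon as an item declares more than one taxon name.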

Getting there

Good but... How would we move to this new data model, is it not too big of a change ? Well, it's still a big change of course but the model is actually very close to what already exists in most cases, and the transition can be automated entirely (or almost) and it's easy to track its completion.

Currently, all items are in the “combined” case, they contain information about both the taxon and the taxon name. We would therefore need to:

decide if we want to make the change ;
work out the details, in particular : sort the properties that apply to taxa and names ;
add “instance of <[zoological/botanical/...] name>” to all items. For the few items that include several (homotypic) names, first create the additional name items.
last, merge the items that represent the same taxon with different names. For each taxon :
1. split the name parts of the items
2. merge the taxon parts of the items as necessary. This concerns I think about a thousand items and it is possible to automate a good part if not all of it by using “applies to part” wherever a property (rank/parent taxon) is used in both items with different values, or if the property is used in one item and not the other.
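The merge in step 2 could be automated roughly as follows. This is only a sketch under simplifying assumptions: each source item is reduced to a dictionary with one value per property, and `merge_taxon_parts` is a hypothetical helper, not an existing bot function. A property is qualified with “applies to part” whenever its value differs between items or it is missing from one of them.

```python
def merge_taxon_parts(items):
    """Merge the taxon statements of several items describing the same taxon.

    `items` maps a name to a {property: value} dictionary (one value per
    property, a simplification). Returns (property, value, applies_to_part)
    tuples, where applies_to_part is None for statements shared by all items.
    """
    props = sorted({p for stmts in items.values() for p in stmts})
    merged = []
    for prop in props:
        values = {name: stmts[prop] for name, stmts in items.items() if prop in stmts}
        shared = len(values) == len(items) and len(set(values.values())) == 1
        if shared:
            merged.append((prop, next(iter(values.values())), None))
        else:
            # Differing or partly missing values: qualify each one by its name.
            for name in sorted(values):
                merged.append((prop, values[name], name))
    return merged


# Toy input based on the Panthera leo / Felis leo example.
lion_items = {
    "Panthera leo": {"rank": "species", "parent taxon": "Panthera"},
    "Felis leo": {"rank": "species", "parent taxon": "Felis"},
}

for statement in merge_taxon_parts(lion_items):
    print(statement)
```

Real items carry multiple values, references and ranks per property, so an actual migration would need more care than this; the sketch only shows that the rule for adding qualifiers is mechanical.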

That's it. I hope I conveyed the idea and showed why it would be efficient. Anyway, your comments are welcome ! If there are things that are not clear, please point it out.

—Tinm (d) 01:35, 23 February 2016 (UTC)

Comments

Thank you for doing all this work. I cannot take all this in so quickly, among other points, this uses the word "valid", which is best avoided in any discussion in this area. After a first quick read, I get the impression that this is not dealing well with uncertainty, which is one of the big issues. There also appears to be a need to add frequent qualifiers to indicate what a claim applies to, which is something best avoided. I will have to reread it a few times to understand it better. - Brya (talk) 06:11, 23 February 2016 (UTC)

@Brya: I have tried to remove the problematic uses of valid name in the above (it is now only used in the zoological context, in which its use is acceptable... I believe?). Otherwise, yes, qualifiers need to be added to statements that are name-dependent, but this kind of statement is strictly necessary no matter how we approach the problem; we have to express the dependency between parent taxon (or other properties) and names in one way or another, and I think this is the most direct way. --Tinm (d) 04:39, 24 February 2016 (UTC)

When the use of "valid name" is required in a formal statement, then it is OK. Having said that, I think it is wise to try and avoid separate botanical and zoological terms whenever possible. Otherwise we would have to adopt the accompanying structure, as well. Also, the vast majority of users will just be confused by these. - Brya (talk) 06:38, 24 February 2016 (UTC)

Just a short comment. More will follow later. There is no chance/risk whatsoever to submerge the database. The database can easily handle one item for every name ever published. But very good to think through this mess with some names being taxon and other not where it currently is a mess. --Averater (talk) 08:04, 23 February 2016 (UTC)

PS. What is the point of subclassing the types of names to that extent? What does the second part regarding <taxon name>/<not a taxon name> add? Those are not standard anywhere and add no information compared to the subclassing from scientific name. A name can be valid according to some code and it can be used in the scientific community. Whether it is used is expressed by its being applied to a taxon, and whether it is valid is expressed by the first type of subclassing. --Averater (talk) 08:19, 23 February 2016 (UTC)

@Averater: Thanks for your feedback. Regarding the flooding of the database, there is no doubt that there is no risk of technical failure, but having many unimportant names can make retrieving the important names more difficult from a user's/programmer's perspective. Although there's no absolute need for it (we can just access all the items and filter out those that are “not a taxon name”), it is also true that these names can really be viewed as deprecated, since a nomenclatural ruling has labelled them so. So for me it is totally okay to mark them as deprecated. But regardless, it would work either way; it's just a matter of implementation (and we can talk about it). The subclassing is also largely debatable: as long as we have zoological name, botanical name, etc. for every code there isn't much difference, at least as long as all the subclasses (if any) inherit from these base classes. The main interest of subclasses, though, is that among other things they would enable us to compute the “author strings” (provided the input information is correct). Finally, I think the <taxon name>/<not a taxon name> classes are useful as generic classes to include all of the (in-)valid names (for animals), (in-)correct names (for plants), etc.; we want to be able to define the properties in a rather flexible way. --Tinm (d) 04:39, 24 February 2016 (UTC)

I would say that having fewer names actually hides the important names, both because, as it is now, it is hard to see from an "unimportant" name whether it is a synonym of an important name, and because a name you can't find does not help you in finding the important name. If it were easy to see from the synonym item which name is the proper name for a taxon, and easy to find that synonym (with all its properties), it would also be easy to find the proper name. All synonyms are from some point of view deprecated names and should lead towards the important name. A somewhat bad comparison is to compare synonyms with parent taxon. Both are one-to-many relations, but from a child node it is easy to find the parent, in contrast with synonyms, where it is easy to find the many from the one. --Averater (talk) 07:28, 25 February 2016 (UTC)

A few comments: I don't like the idea of "combined items", which make the model more complex (why?) and continue to mix the notion of name with the taxonomy itself, which is not really a good thing imho.

I don't really understand the notion of a "deprecated synonym". In the Wikidata semantics it means "something that used to be a synonym but is not anymore". Is it not just a deprecated name? Which would make the "synonym" property expendable. author TomT0m / talk page 08:09, 23 February 2016 (UTC)

@TomT0m: Hi TomT0m! I somewhat agree regarding combined items (although names and taxonomy are really not independent): from a computational perspective a complete separation between taxa and names makes more sense. However, I believe it is very important to think of the people who will browse and edit items by hand, and for them combined items are really much more practical, almost unavoidable. Regarding the deprecated statement on synonyms... I guess it depends on how broadly you want to define a synonym. You could say that a name that cannot be used for a taxon is indeed not really a synonym, as in principle it cannot “be used in place of” the considered taxon name. --Tinm (d) 04:39, 24 February 2016 (UTC)

Unless someone develops a gadget that splits a browser window in two and shows the item about the names next to the item about the taxon, you may be right. But I think that if you make a gadget feature request in the right place it can be done in no time. author TomT0m / talk page 11:08, 24 February 2016 (UTC)

To make it possible to have separate items for taxa and for names, it would be necessary to have a comprehensive system of taxon-ID's, presumably a separate ID for each recircumscription of the taxon. Such a system is possible, in principle, and here and there some work is being done. But it is not here yet, and we certainly cannot build one.

For the foreseeable future we can only indicate a taxon by its "taxon name", and given that this system has worked quite well for over two hundred and fifty years, there is not really any objection to doing so. The issue is that in many cases, as taxa are not stable in circumscription and position, there is a somewhat complicated relationship between taxa and names. - Brya (talk) 12:04, 24 February 2016 (UTC)

The first part of your message does not quite add up with the second part. This works very well, but it leads to hard-to-manage complexity? Then it just does not work very well. From the second part: a taxon name is an identifier, but in fact it's not stable, so it lacks the most important feature of an identifier. So it's not a good identifier. author TomT0m / talk page 12:10, 24 February 2016 (UTC)

You are reversing the situation. There is a hard to manage reality. The system of "taxon names" is the best that science has been able to come up with, over the centuries. And not for want of trying. - Brya (talk) 12:26, 24 February 2016 (UTC)

Alternative

Thanks for providing your thoughts, Tinm. But I think your proposed model is extremely complex and not easy to understand or use; it causes constraint violations and does not solve all of the problems. A more straightforward way could follow these lines:

In all items with taxon name (P225) (taxon name items), replace the value of instance of (P31) with scientific name (Q15730631).
Create a new (additional) item (taxon item) for every item which has a taxon name (P225) now. Move all the sitelinks to this item and tag it with instance of (P31)=taxon (Q16521).
Create a new property taxon labeled as. The property takes items which are subclasses of name (Q82799).
Add all the taxon name items related to a taxon item to that item with the help of the new property taxon labeled as.
Ensure that all instances of taxon name are unique within an underlying Code and are correctly published according to the rules of that Code.
Make sure subclasses of name other than taxon name don't violate any constraints of publication-related properties such as "published in" etc.

Some problems solved with taxon item:

A local taxobox can lookup the values of taxon labeled as to find the fitting taxon name (taxon name resolution).
Moving an article to another taxon name within Wikipedia does not cause any changes here. If a new taxon name is introduced, then of course it has to be added to taxon labeled as.
Monotypic taxa are no problem anymore.
No further need to discuss where a sitelink should be located.

Advantage for taxon name items:

The only change we have to make is moving the sitelinks to a separate taxon item.
There is no need to artificially classify a taxon name item as botanical name, zoological name... This can be determined by following parent taxon (P171) up to an item with code of nomenclature (P944). If this does not resolve properly, the item itself can be tagged with code of nomenclature (P944).
We can model these items along the underlying Codes.
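The lookup described in the list above (follow parent taxon until an item carrying a code of nomenclature is found) can be sketched like this. The two dictionaries are toy stand-ins for parent taxon (P171) and code of nomenclature (P944) statements, the parent chain is deliberately shortened, and `resolve_code` is a hypothetical helper name.

```python
# Toy data: a shortened, illustrative parent chain (intermediate ranks omitted).
PARENT_TAXON = {
    "Panthera leo": "Panthera",
    "Panthera": "Felidae",
    "Felidae": "Animalia",
}
# Assumed placement of the code statement on the top-level item.
CODE_OF_NOMENCLATURE = {"Animalia": "ICZN"}


def resolve_code(item, parent_taxon, code_of_nomenclature, max_depth=100):
    """Walk up the parent chain until an item declares a code of nomenclature.

    Returns None if the chain ends, or if max_depth is reached (which also
    guards against accidental cycles in the data).
    """
    current = item
    for _ in range(max_depth):
        if current in code_of_nomenclature:
            return code_of_nomenclature[current]
        if current not in parent_taxon:
            return None
        current = parent_taxon[current]
    return None


print(resolve_code("Panthera leo", PARENT_TAXON, CODE_OF_NOMENCLATURE))
```

An item tagged directly with a code is found on the first iteration, which corresponds to the fallback mentioned in the list for chains that do not resolve properly.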

--Succu (talk) 13:37, 24 February 2016 (UTC)

The idea of having clean items focused on a name is of course attractive. But it is likely that there will always be the question of what to put in what item (IUCN status, flora treatment, etc). Also the point about an ID for the taxon (see above) remains. And how will this play out for heterotypic synonyms? - Brya (talk) 17:53, 24 February 2016 (UTC)

I forgot some minor points: taxon name items do not need a label because the labels in all languages should be equal. And we need of course a new constraint which ensures that these items don't have sitelinks. Our Q-Id would become a true LSID.

At the moment nothing should be changed. That means IUCN status goes to the taxon name item which corresponds to the scientific name at IUCN. The same is true for database ids and so on.

With taxon name items we are able to model taxon concepts based on circumscription of scientific names and/or traits. Each taxon concept will have its own item and thus an ID.

--Succu (talk) 18:39, 24 February 2016 (UTC)

One more thought: All properties tagged as Wikidata property related to taxa (Q18609040) belong to taxon name items. --Succu (talk) 19:06, 24 February 2016 (UTC)

As I understand it, the taxon items would hold only the sitelinks (presumably including the Wikispecies and Commons links). These would hold no real taxon-specific properties, so I am not sure how appropriate it is to call these "taxon items". In a way, roughly speaking, these pages function analogously to types, in that they are central anchoring points. Still not sure about heterotypic synonyms; these would be linked to taxon name items? - Brya (talk) 05:27, 25 February 2016 (UTC)

If you prefer, call it a sitelink container for Wikimedia articles which declare to describe a taxon. I think only articles with homotypic/objective synonyms should be linked together. --Succu (talk) 09:14, 25 February 2016 (UTC)

Well, this "Wikimedia articles which declare to describe a taxon" covers a lot of territory. The Swedish Wikipedia is full of pages that declare to describe a taxon, but in reality are devoted to fictitious taxa. If this proposal means that these pages would be back in as "taxon items" this would be a bad thing. - Brya (talk) 11:39, 25 February 2016 (UTC)

I hope taxon concepts are not based on „fictitious taxa”. --Succu (talk) 23:20, 24 March 2016 (UTC)

No, but that still leaves the issue unresolved of what exactly is a taxon item (and what exactly is in it)? Is it a container for links to Wikimedia pages on the same concept (not necessarily linked to a taxon), or is it focused on taxon concepts? If the first is the case it must be clear that not every container relates to a taxon. Or would it be restricted to pages on real taxa (excluding the pages on fictitious taxa)? - Brya (talk) 06:19, 25 March 2016 (UTC)

This proposal seems to add nothing. We already have a clear link between Wikipedia article and Wikidata item. The link between a name for a taxon and the taxon is much less clear. How would this proposal help with unclear name/taxon combinations, cases where several names apply to the same taxon, or where a taxon does not have a valid name? Would it be helpful in storing more information, such as synonyms and all authors for those synonyms? --Averater (talk) 07:20, 25 February 2016 (UTC)

As mentioned above, unfortunately we haven't got a „clear link between Wikipedia article and Wikidata item”. Another issue is that old bots mixed up sitelinks belonging to different kingdoms and connected names based on some dubious ad hoc algorithm. We have resolved a lot (not all) of this kind of issue by now. --Succu (talk) 23:20, 24 March 2016 (UTC)

@Succu: You don't say how the system I propose is complex. I do not think it is; on the contrary, it is very simple. Maybe I gave too many details right away, when to a large extent this was just to demonstrate how this system could be expanded in a natural and efficient way to include information such as the nature of names (recombination etc.). I also see major inconsistencies in your system (that arise from your willingness to transfer as little as you can to the newly created taxon items and to not grant non-taxon names a proper status), but if possible I would first like to hear a more developed criticism of my proposal. Otherwise, the two systems are close in many of their properties, as they are based on the same idea. All the advantages you mention apply to the system I described just as well. --Tinm (d) 04:04, 26 February 2016 (UTC)

It would help if you could restate the core of your proposal, supported by a clear example. I am trying to understand it but I keep being distracted by non-essentials. - Brya (talk) 11:45, 26 February 2016 (UTC)

Yep, Tinm, it would be helpful if you could sketch the steps we have to take to migrate from the current model to that proposed by you. --Succu (talk) 21:17, 26 February 2016 (UTC)

Fictional taxa

Latest comment: 8 years ago, 3 comments, 3 people in discussion

Since I get no reasons at all other than that it would be correct and factual, I am starting another thread here instead of edit warring. It concerns this and this, where the heading is "violations of Wikipedia policy (at least 50% fictitious taxa)", which I tried to change to the more factual "Possibly nonexisting taxa", since it concerns taxa where the name is against some code. It actually has nothing to do with whether the taxa exist or not, and certainly nothing to do with fictional creatures such as elves or Santa Claus. A better headline that reflects the content is desired. If it does violate some policy, it would also be good to have that specified. It is quite a stretch for one user to make such claims, and certainly not something everyone agrees on. --Averater (talk) 07:17, 1 March 2016 (UTC)

You never contributed to that page. And I see no reason to change the subheading. --Succu (talk) 07:52, 1 March 2016 (UTC)

"Fictional taxa" are taxa occurring in a work of fiction. A taxon found only with Harry Potter is a fictional taxon. So these are not involved here. In the meantime, Averater has amply demonstrated that he lives in a universe of his own, not affected by Wikipedia policy or by reality: he would best be in his place in a website of his own, where he can express himself freely. - Brya (talk) 11:38, 1 March 2016 (UTC)

Property for Catalogue of Life (Q38840)?

Latest comment: 8 years ago, 19 comments, 4 people in discussion

Hi, I'm wondering if we have a property for identifier of Catalogue of Life (Q38840) (I can't find one). If we do not have one so far, I think we could request a new property: http://www.catalogueoflife.org/col/details/species/id/$1. It is another important source of taxonomy reference. --Philip Tzou (talk) 02:33, 26 February 2016 (UTC)

So far, we have managed to avoid this. CoL contains an amount of error that is off the scale (only ZipCodeZoo seems to be worse). Not having a property for this does not really help much as somebody with a bot has imported a great deal of CoL-material into svwiki, cebwiki, and warwiki, so we ended up with quite a bit of junk anyway. Also GBIF and EoL have accepted CoL, which also does not help in keeping this material out. - Brya (talk) 05:33, 26 February 2016 (UTC)

Ok, I was wondering why. So that's the reason. --Philip Tzou (talk) 06:08, 26 February 2016 (UTC)

Of course it would be a good idea to have that as another property. Since they have the ambitious goal of covering all taxa and are used as a quick reference, it would be helpful. Do you know if CoL has identifiers that are stable enough to use? --Averater (talk) 08:11, 26 February 2016 (UTC)

I found that the hex identifiers used in the 2015 archive have become more stable now (and are still valid in the recent 2016 version). But we can still give it some time to see if they are really stable. --Philip Tzou (talk) 08:18, 26 February 2016 (UTC)

CoL's LSIDs change every year. --Succu (talk) 08:37, 26 February 2016 (UTC)

They both have a yearly edition and a monthly one. I did a quick search now in both the monthly (jan 2016) and the 2015 version and got the same species with the same id in both versions. However for 2011 I got different species. Their id now is far longer than in the 2011 version so maybe they have changed it so their id really are unique now and then also stable. Could be worth checking out. Shouldn't be too hard to send them an email and ask. --Averater (talk) 09:37, 26 February 2016 (UTC)

There is no need to believe me. Read Roderic's article Catalogue of Life and LSIDs: a catalogue of fail. BTW: The LSID for Loxodonta africana now is urn:lsid:catalogueoflife.org:taxon:44c6631f-6ae5-11e5-9d43-bc764e092680:col20160129. --Succu (talk) 10:46, 26 February 2016 (UTC)

What are you saying and what do you mean with that blog post? For Loxodonta africana 7b498777d8b86d615d26fb2555362a5d can be used. It is an id that works both for the current jan 2016 and the 2015 edition. --Averater (talk) 11:07, 26 February 2016 (UTC)

The internal id switched in 2015 from "6884408" to "7b498777d8b86d615d26fb2555362a5d". The LSID given in the UI is the one that should be used. But this one isn't stable as Roderic's article explains. --Succu (talk) 11:31, 26 February 2016 (UTC)
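To make the two kinds of identifier in this exchange easier to compare, here is a small sketch that splits an LSID into its parts. It assumes the usual LSID layout `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`; the `Lsid` tuple and the helpers `parse_lsid` and `same_object` are hypothetical names, and the only real value used is the LSID quoted above.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


def same_object(a, b):
    """True if two LSIDs agree on everything except the revision part."""
    return parse_lsid(a)[:3] == parse_lsid(b)[:3]


# The LSID for Loxodonta africana quoted in the discussion above.
QUOTED = "urn:lsid:catalogueoflife.org:taxon:44c6631f-6ae5-11e5-9d43-bc764e092680:col20160129"
print(parse_lsid(QUOTED))
```

In the quoted value the last component appears to be an edition stamp, so the full string differs between editions even if the object part were kept; whether the object part itself stays the same across editions is exactly what is disputed in this thread, and the sketch does not settle it.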

Why should the other id be used? An id has to be stable, but there is nothing that forbids several ids from pointing to the same item. If any of those ids now points to something else there is a problem, but having many ids for the same item is not a problem. That blog post is also from 2013, and since they have changed their id system to longer ids it is worth checking how their ids work now. --Averater (talk) 14:22, 26 February 2016 (UTC)

Surprisingly internal database ids are not meant to be stable. CoL offers no way to cite them. Have fun reading Life Science Identifier (Q6459954) and with your private investigations (=OR). --Succu (talk) 21:08, 26 February 2016 (UTC)

That is just a guess if the id in the link is or is not stable. The most important feature for an id is their unique value. Links are supposed to always work meaning that they have reasons not to change them. If those id are for any year version or only will work in the right one is the relevant question. --Averater (talk) 07:52, 27 February 2016 (UTC)

Shouldn't every identifier (Q853614) be unique and, if externally exposed as an LSID, stable as a Uniform Resource Identifier (Q61694)? --Succu (talk) 23:12, 27 February 2016 (UTC)

The "value" must be unique, meaning that an id must not point to two (or more) different items. And for us to be able to use them, they must be stable, in the sense that an id pointing to some item today must also point to the same item tomorrow. Having many ids pointing to the same item is no problem. Compare with redirects. Any web address can be considered an id pointing to some content. If we have redirects, we have many ids pointing to the same content. And we should not (easily) change where a redirect points, and if some content moves we should (normally) leave a redirect so the old address still works. With CoL it seems that the same content in the 2015 and 2016 versions has the same id (at least the same id works, which is enough for us). But if they consider every version an independent database, unlinked from the previous one and with different items, we have a problem, since we would then need one identifier property for every version, which would be a mess (at least in the current state of Wikidata). --Averater (talk) 07:41, 28 February 2016 (UTC)

I agree that we can write an email to ask them whether they plan to change their IDs in the future, since it looks possible to me that they moved to long hex IDs because they were trying to make IDs stable in the future. Of course we should move slowly until we are 100% sure about it. And I also agree that there are too many indents here :P --Philip Tzou (talk) 11:16, 28 February 2016 (UTC)
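The stability criterion stated in this comment (several ids may point to one item, but one id must never point to different items over time) can be written as a simple check between two editions of a database. The function name and both toy editions are hypothetical; the ids are made up and are not real CoL identifiers.

```python
def find_unstable_ids(edition_a, edition_b):
    """Return the ids present in both editions that point to different taxa.

    Each edition is a dictionary mapping an id to a taxon name. Ids that
    exist in only one edition are ignored, and several ids pointing to the
    same taxon are not reported, since that is not a problem by itself.
    """
    shared = edition_a.keys() & edition_b.keys()
    return sorted(i for i in shared if edition_a[i] != edition_b[i])


# Made-up editions for illustration only.
edition_2015 = {
    "id-1": "Loxodonta africana",
    "id-2": "Panthera leo",
    "id-3": "Drosophila melanogaster",
}
edition_2016 = {
    "id-1": "Loxodonta africana",       # unchanged: fine
    "id-2": "Uroplatus sameiti",        # same id, different taxon: unstable
    "id-4": "Drosophila melanogaster",  # new id for a known taxon: fine
}

print(find_unstable_ids(edition_2015, edition_2016))
```

A check like this only shows whether ids were reused for other taxa between two snapshots; it says nothing about the publisher's intentions, which is why asking them directly, as suggested in the thread, would still be needed.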

@Averater, Philip Tzou: did you get a response? --Succu (talk) 22:08, 23 March 2016 (UTC)

Oh, I didn't do it. How about you @Averater? --Philip Tzou (talk) 05:57, 28 March 2016 (UTC)

Me neither. --Averater (talk) 06:18, 28 March 2016 (UTC)