Wikidata:Property proposal/taxon synonym string

taxon synonym string

Originally proposed at Wikidata:Property proposal/Natural science

   Not done
Description: synonym of the taxon name
Represents: synonym (Q1040689)
Data type: String
Template parameter: "synonyms" in en:template:speciesbox, en:template:taxobox, and en:template:automatic taxobox
Domain: taxa
Allowed values: scientific names
Example 1: Phidippus johnsoni (Q675345) → Attus johnsonii
Example 2: Phidippus johnsoni (Q675345) → Phidippus bicolor
Example 3: Phidippus johnsoni (Q675345) → Dendryphantes johnsoni
Planned use: addition of taxon synonyms for jumping spiders (99+% of which don't have items)
Robot and gadget jobs: Nothing currently planned, but there's a lot of potential here
See also: taxon synonym (P1420)

Motivation

The original proposal for taxon synonym (P1420) was to have it be a string property. However, the property was controversially created as an item property instead at the insistence of a single user. Fast forward six years and taxon synonym (P1420) is barely used (14,425 uses) despite the fact that almost every taxon on the planet has synonyms and many have dozens. This means that Wikidata still doesn't have useful data on taxon synonyms for 99.9% of taxa (even though Wikipedia, Wikispecies, and even Commons have this data). The fact is, no one is going to create Wikidata items for the millions of synonyms that exist (and it's debatable whether or not they even should in many cases, e.g. objective synonyms, taxonomic vandalism, etc.). These problems have been discussed ad nauseam on the property talk page without any solutions being reached, mainly because we keep letting the perfect be the enemy of the good. The only practical way forward, IMO, is to adopt Brya's compromise proposal to have separate item and string properties, similar to how author (P50) and author name string (P2093) work. This should allow us to finally move forward with adding more comprehensive synonym data to Wikidata while still allowing users to flesh out full items for taxon synonyms if they need to. Kaldari (talk) 16:01, 28 June 2020 (UTC)

How this property would be used in relation to taxon synonym (P1420)

Use of "taxon synonym string" (with taxon author (P405) and year of publication of scientific name for taxon (P574) qualifiers) would be the default for most taxon synonyms. In cases where there are ...

  1. Competing taxon concepts in recent literature (e.g. Prasophyllum uroglossum (Q65946197) vs. Prasophyllum fuscum (Q15488111))
  2. Disagreements about taxon level in recent literature (e.g. Bos taurus (Q20747334) vs. Bos primigenius taurus (Q20747320))
  3. Deprecated taxon names that are notable in their own right (e.g. Attus (Q4818757))

... use of taxon synonym (P1420) would be encouraged. In other words, we should only use taxon synonym (P1420) where there is a concrete reason for a Wikidata item to exist for that synonym and the information can't be handled equally well with just a string and qualifiers. Kaldari (talk) 20:32, 28 June 2020 (UTC)
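The usage rule above can be sketched in code. This is only a minimal illustration in Python, not part of the proposal: `PXXXX` is a placeholder because the proposed property has no ID, the claim structure is a simplified dict rather than real Wikibase JSON, and the names `SynonymInfo`, `choose_property`, and `build_string_claim` are hypothetical. The three boolean flags simply mirror the three listed exceptions.

```python
from dataclasses import dataclass
from typing import Optional

TAXON_SYNONYM = "P1420"          # existing item-valued property
TAXON_SYNONYM_STRING = "PXXXX"   # placeholder: the proposed property has no ID
TAXON_AUTHOR = "P405"            # qualifier: taxon author
YEAR_OF_PUBLICATION = "P574"     # qualifier: year of publication of scientific name


@dataclass
class SynonymInfo:
    """What an editor knows about one synonym (hypothetical structure)."""
    name: str
    author_qid: Optional[str] = None   # QID of the author item, if known
    year: Optional[int] = None         # year the name was published, if known
    competing_concept: bool = False    # case 1: competing taxon concepts
    rank_disagreement: bool = False    # case 2: disagreement about taxon level
    notable_name: bool = False         # case 3: deprecated name notable in itself


def choose_property(s: SynonymInfo) -> str:
    """Return the item property for the three exceptions, else the string property."""
    if s.competing_concept or s.rank_disagreement or s.notable_name:
        return TAXON_SYNONYM
    return TAXON_SYNONYM_STRING


def build_string_claim(s: SynonymInfo) -> dict:
    """Build a simplified string claim; qualifiers are added only when known."""
    qualifiers = {}
    if s.author_qid is not None:
        qualifiers[TAXON_AUTHOR] = s.author_qid
    if s.year is not None:
        qualifiers[YEAR_OF_PUBLICATION] = s.year
    return {
        "property": TAXON_SYNONYM_STRING,
        "value": s.name,
        "qualifiers": qualifiers,
    }
```

Under these assumptions, `choose_property(SynonymInfo("Attus johnsonii"))` selects the string property, while `choose_property(SynonymInfo("Attus", notable_name=True))` selects taxon synonym (P1420), matching the Attus (Q4818757) example in the list.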

Discussion

  •   Oppose:
  1. The purpose of author name string (P2093) is "undifferentiated author", but taxon name is usually unique so it is possible to create items for individual names.
  2. This proposed solution assumes there is only one canonical name for each taxon, but the canonical name may differ between classification systems

--GZWDer (talk) 16:14, 28 June 2020 (UTC)

@GZWDer:
  1. That's not accurate. According to the description (and how the property is actually used), author name string (P2093) is for when the author is undifferentiated or doesn't have an item. Regardless, just because something is "possible" doesn't mean people are actually going to do it. For example, I really want to add synonyms to spider species, but I'm not about to create 100,000+ items in order to do it. Nor do I expect that anyone is going to create hundreds of thousands of Wikidata items for every author of a paper listed in Wikidata.
  2. True, in which case you can still use taxon synonym (P1420).
Kaldari (talk) 16:23, 28 June 2020 (UTC)
We need consistency, instead of different ways to achieve the same thing. --GZWDer (talk) 16:28, 28 June 2020 (UTC)
If there were a way to have properties that could be either strings or items, I would agree with you. Kaldari (talk) 16:30, 28 June 2020 (UTC)
  Comment There seems to be a classic conflict here: does Wikidata capture as much information as it can without trying to model everything in detail (in the hopes that it can be sorted out later), or does it postpone adding data that hasn't (yet) been modelled? Individuals seem divided on this topic; the question is whether both approaches can exist side by side. The author name string (P2093) example works well because (a) it's clear that a huge amount of bibliographic data couldn't be added to Wikidata if we insisted on having authors as items, and (b) there are people and bots mapping author strings to author items. Having the string data in the first place makes this mapping possible, and doesn't seem to negatively impact those who would prefer to have authors as items. Perhaps this proposal could be strengthened by showing how we could move from synonyms as strings to synonyms as items? That way, those who prefer items could see a path to that goal, and hence we could develop bots to make that transition as and when we have information and/or time. I think most would agree that the modelling of taxonomic names and taxa in Wikidata is, at best, open to improvement, but how it is to be improved seems unclear. It is tempting to argue that we shouldn't mess with taxonomic properties until that model is clarified, but that will be enormously frustrating to those who "just want to get things done." --Rdmpage (talk) 17:10, 28 June 2020 (UTC)
  Comment I very much support Rod's pragmatic approach and, indeed, an expansion of Kaldari's original proposal to explain the mechanics of supporting a transition from string to item (when appropriate) would be a great addition. In my opinion, having followed the usage of author-related properties for a while, this incremental approach is solid. I'd be curious to hear from @Magnus Manske: the success of the author string vs item resolution depends to a large extent on the tools he created. --DarTar (talk) 18:22, 28 June 2020 (UTC)[reply]
@Rdmpage, DarTar: I don't think Wikidata claims always have to "progress" from strings to items. There are cases where it makes sense for an author to be a string forever (a minor author in a minor academic paper who has no other citations). Similarly, there are numerous cases (the majority, IMO) where it makes sense for a taxonomic synonym to be just a string (with qualifiers). The mussel species Anodonta cygnea has 568 synonyms. Creating separate items for each one (most of which only occur within a single paper), would be pointless and create a maintenance nightmare. I've now outlined above how I think this property should be used in relation to taxon synonym (P1420). Let me know if you think that makes sense. Kaldari (talk) 20:47, 28 June 2020 (UTC)[reply]
So adding 568 (qualified and referenced) synonym strings to Swan mussel (Q777865) will not "create a maintenance nightmare"? --Succu (talk) 21:12, 28 June 2020 (UTC)
Creating items for fake species does more than create "a maintenance nightmare": it makes Wikidata a serious threat to information on taxa, on a global scale. - Brya (talk) 03:53, 29 June 2020 (UTC)
@Succu: No more so than what Wikipedia already deals with. If, however, we had 568 items for all the synonyms (the current method), every time the accepted scientific name changed, we would have to update 568 items (rather than just updating one item). Anodonta cygnea may not be the best example, as there doesn't seem to be an item for the taxon name Anodonta cygnea specifically, but Phidippus audax (Q134277), for example, has several dozen synonyms. Most of them are listed in Wikipedia, but none are listed in Wikidata. Is there a downside to allowing string synonyms? I realize that it doesn't allow quite as much data to be associated with the synonym, but having name, author, and date is a lot better than having nothing, which is what we have now and it avoids the problems associated with having a proliferation of synonym items. I hope you will consider this idea with an open mind as your opinion carries a lot of weight here. Kaldari (talk) 14:15, 29 June 2020 (UTC)[reply]
I also am in agreement with Rod (@Rdmpage:) here. In particular, I do not think the comparison between authors and taxon names is appropriate. Large numbers of existing synonyms are important names and carry the potential of future usage; as such, I think these should be items, not strings. I accept there are some, objective synonyms being an example, that clearly will never be used. But I do think the aim should be to create items for these. I also agree that the entire structure of the taxonomic data needs a significant amount of scrutiny and redesign. It may be, as Rod says, a better use of people's time to first get the overall structure into the most useful form. Cheers Scott Thomson (Faendalimas) talk 01:50, 9 July 2020 (UTC)
  •   Question Is there a consensus that the current handling of synonyms in Wikidata should change to string-based solutions? Vojtěch Dostál (talk) 07:38, 29 June 2020 (UTC)
    • @Vojtěch Dostál: No, but there is also no consensus that the handling of synonyms should be item-based (see the original taxon synonym (P1420) proposal and various discussions on that property's talk page). By having both options we can flexibly accommodate either method when it makes sense to. I'm open to tweaking the guidelines about when to use either, but it's clear that the current item-only method isn't adequate. Kaldari (talk) 15:18, 29 June 2020 (UTC)[reply]
      • To me it's not that clear. I am afraid of breaking the status quo, which has its merits, and creating two parallel taxonomic systems in Wikidata which will not communicate with each other. @Kaldari: I have one more question if you don't mind: How will a bot which imports from Wikispecies know when to use items and when strings? Clearly, some synonyms deserve an item of their own, and classifying them as mere strings attached to a different taxon is an over-simplification of the taxonomic reality. Vojtěch Dostál (talk) 15:27, 29 June 2020 (UTC)
        • @Vojtěch Dostál: As suggested by Rod and Dario's discussion above, the default would be to initially bot-create the synonyms as strings under the item linked from Wikispecies. Any synonyms that merited their own items could then be changed into taxon synonym (P1420) claims as needed by humans. I expect that that would be the exception rather than the rule, however, and most synonyms would remain as strings with qualifiers. I see the two properties as being complementary, not separate and parallel, in the same way that author (P50) and author name string (P2093) have allowed us to have more comprehensive author data than we would have with author (P50) alone. Kaldari (talk) 15:40, 29 June 2020 (UTC)
  •   Question what happens if a name currently in "taxon name" becomes a synonym? Will you repurpose the item and change taxon name, apply the string? --- Jura 08:52, 29 June 2020 (UTC)
    • @Jura1: This should complement the use of taxon name (P225) well (which is also a string property). In the event that the taxon name becomes a synonym, the values and qualifiers under taxon name would be moved to a "taxon synonym string" claim under the same item and a new taxon name (P225) claim would be created. As you suggest, the item would simply be repurposed (or merged with another item) and no new item would need to be created. This also saves us from having to move all the sitelinks (or split them up based on which ones move to the new name). It should be a much easier process and much more complementary to our sister projects. Kaldari (talk) 15:04, 29 June 2020 (UTC)
      • It was a question, not a suggestion on my side. I don't think items should be repurposed in general. Your response suggests that you want to move away from a taxon-name-based system. If you merely want to improve enwiki's handling of sitelinks, you could opt for Commons' approach. --- Jura 15:10, 29 June 2020 (UTC)
        • @Jura1: By repurposed, I simply meant the same item would receive the new taxon name (if the name of that taxon was changed). The item itself would still represent the exact same taxon. In other words, taxon (Q16521) items would actually represent taxa, rather than our current awkward system of taxon items sometimes representing just taxon names and sometimes representing both taxa and taxon names. Does that make sense? Kaldari (talk) 15:23, 29 June 2020 (UTC)
          • I think I understand the two approaches. Maybe it's important to note that there is a key difference from the "author name"/"author" pair of properties you mentioned: "author name" is used for bulk imports because insufficient information is available to create an item for the author, and large-scale creations for these do happen once some identifier is available. For synonyms, I think one can't really create the string version without having sufficient information to create separate items as well. Also, the renaming you outlined illustrates the uncertainty introduced by no longer providing a stable identifier (QID) for taxon names. --- Jura 07:23, 30 June 2020 (UTC)
            • There may be enough information to create an item, but 1) many synonyms are not notable enough to have an item and 2) in the current model creating an item would promote the synonym into a species (a taxon), even when it cannot be one, by definition. Another fake species will have been created. - Brya (talk) 05:23, 1 July 2020 (UTC)
              • I don't agree with the amalgamation you are making, but the proposal doesn't change anything about (2). As for (1), it seems odd that that should depend on the existence of a sitelink. --- Jura 06:02, 1 July 2020 (UTC)
                • As for 2), with the proposed property it would be possible to add synonyms as strings, without implying in any way that they are, or represent, actual species, just like in any taxonomic paper. Creating an item for a synonym unambiguously states ("instance of: taxon") that they are species, even though it is possible to add a qualifier "this is a fake species" (there is a large variety of such qualifiers): the latter will confuse just about any user. - Brya (talk) 05:06, 2 July 2020 (UTC)[reply]
                  • @Jura1: I removed the sitelink criterion. To address your other concern, I think the uncertainty introduced by no longer providing a stable identifier (QID) for taxon names is far less significant than the confusion currently caused by having to have a separate item for every taxon synonym (even synonyms which are simply misspellings or taxonomic vandalism). If you have any suggestions for improving the proposal, I'm open to ideas. Kaldari (talk) 00:55, 9 July 2020 (UTC)
  •   Support The proof of the pudding is in the eating. I don't see the apocalyptic scenarios described above. Wikidata is a linked-data store, and core to linked data is the possibility to reroute links. SPARQL even has the CONSTRUCT query that enables this. So even if this proposal becomes obsolete because all taxa are perfectly defined in their proper Wikidata items, there is really no long-term risk in accepting this proposal, which would enable quite a few use cases. The worst-case scenario is that the need for this property becomes obsolete, in which case all we have to do is run a simple bot operation that removes these deprecated statements, for which Wikidata has even invented ranks. --Andrawaag (talk) 14:55, 29 June 2020 (UTC)
  •   Support Taxon synonyms refer to the same entity, and thus modeling it this way is better than creating a new item. ChristianKl 17:03, 29 June 2020 (UTC)
  •   Support With the understanding that (as proposed) this will exist alongside P1420, and that most synonyms won't ever be transferred to P1420, although some may be (if notable enough). - Brya (talk) 04:18, 30 June 2020 (UTC)
  • For the moment   Oppose. Some reasons:
    1. taxon name (P225) is not intended to reflect the taxon concept currently in use. P225 should only be changed in very rare cases. A lot of other items and properties rely on it. In fact, you'll get a warning if you try to change the value of P225.
    2. I missed the word "reference" in your explanation. Subjective/heterotypic synonyms reflect someone's taxonomic opinion. Wikimedia projects are not a reference at all. In your example, Phidippus audax (Q134277), enWP has no reference for the listed synonyms. World Spider Catalog is a good reference for species etc. considered as synonyms. But you should always cite the version or use retrieved (P813).
    3. Your "synonym strings" are dumb. They show no relationship to each other (e.g. based on the same type). Should they be ordered somehow (e.g. by date)?
    4. If you consider your "synonym strings" as an analogy to taxon name (P225) (including all allowed qualifiers) and you have all this information at hand, why not create an item? It's cheap and supported by tools like QS.
    5. In my opinion all synonyms provided by reliable sources are "notable enough" to get an item. Of course not all of these items are allowed to label a taxon concept.
  • --Succu (talk)
    •   Comment Can we please refrain from calling others' contributions "dumb" and be more convincing and less judgemental?
      • @Andrawaag: I read "dumb" to mean the string was "dumb", not that the contribution was "dumb". In other words, the synonym string is just a set of text characters, not a thing (item) linked to other useful information.
    • Why is notability a criterion in selecting the value type of a property? In that case each InChI (P234) and InChIKey (P235) (just two random examples) should point to individual items too. Creating items is not as cheap as suggested. Yes, you can do that with QS or through the Wikidata API, but a lot of contributions also come from single users adding single statements. Then the workflow is suddenly frustrating. My approach in those cases (and I know how to write bots) is to open a second tab, create the new Wikidata item, select the newly created and populated QID, and paste that into the statement whose property requires an item. This workflow is just not user-friendly, and yet it is easier than achieving the same in QS. --Andrawaag (talk) 08:53, 1 July 2020 (UTC)
      • If you want to add a new monotypic species you'll of course have to add the new genus first. This is not a flaw in usability. In a database this is a normal workflow, such as adding a new customer before creating an invoice for the new customer. If you are creating the invoice with a text editor then this restriction will vanish at the cost of structure. BTW: here you can omit adding parent taxon (P171) and hope the constraint will be fixed by someone else. A common pattern... --Succu (talk)
  • It's not clear to me that the world fits any model we make of it in this regard. That, I suppose, is a support for the "mixed model" Kaldari proposes. It would be a good idea to work out what we do with synonyms if two species (say) become one, or one becomes two. Maybe this is an established concept in taxonomy. Also do we label synonyms with nomen erratum etc? How do we deal with updated or contradictory synonymy lists?
I hope your experience with author strings to authors is better than Wikipedia's. We have many citations where we would be so much better off with a simple "authors=" parameter, rather than a list of first and last names which doesn't handle all the oddities of human naming. All the best: Rich Farmbrough 16:26, 7 July 2020 (UTC).
@User:Rich Farmbrough: lumpers and splitters (Q1662868) is an old taxonomic "problem". So what do Wikipedias do if
  1. two species become one (lumped / merged)
  2. one becomes two (split; Crocodylus halli (Q68594258) is a recent example)
with regard to the related sitelinks? --Succu (talk) 19:06, 7 July 2020 (UTC)
We have a lesser problem: we can have an article about some or many species, and we can have multiple articles about one species. We can have redirects where we need them, and change these things on a far more ad hoc basis than Wikidata. That's not to say we always succeed; I'm not highly involved in that area. All the best: Rich Farmbrough 00:23, 19 July 2020 (UTC).
  •   Oppose for the present. I have sympathy for Kaldari's frustration with the current system, which I share. But the problem is a deep one, and I'm not convinced that this patch is, overall, an improvement.
    1. The deep problem is that Wikidata does not model taxa as opposed to taxon names. Numerous discussions have resulted in the conclusion that we don't know how to model taxa and taxon names. (My summary, and it is only a summary, is here if you aren't familiar with these discussions.) Rich Farmbrough's comment "It's not clear to me that the world fits any model we make of it" seems very apposite to me.
    2. Wikidata taxon items are used to retrieve taxon identifiers, e.g. in the English Wikipedia's taxonbars. So any synonym that has an identifier in a taxonomic database needs to be an item, not a string. However, taxonomic databases have very different policies. The World Spider Catalog (WSC) does not have identifiers for taxon names, but for taxa. If the name it accepts for the taxon changes, the identifier remains the same. Treating synonyms as strings works well for data extracted from the WSC only if the taxon name of the item can be changed, i.e. the item really does represent a taxon, not a taxon name. GBIF has identifiers for all the taxon names it lists, including somewhat oddly both "Araneus diadematus Clerck, 1758" and "Araneus diadematus (Clerck, 1758)". So to retrieve GBIF's identifiers, the synonyms need to be treated as items.
I do take the point about the best being the enemy of the good, and we should strive to improve the modelling of taxa and taxon names in Wikidata regardless of the fact that there may not even be a way to do so properly. However, I'm not (yet) convinced this proposal alone is an improvement. The present system, which muddles taxa and taxon names, more-or-less works. Peter coxhead (talk) 06:42, 9 July 2020 (UTC)[reply]
@Peter coxhead: Regarding #2, that is why I am proposing a hybrid system rather than just changing taxon synonym (P1420) to a string property. In cases where there are legitimate reasons to have separate items (and separate Wikipedia taxonbars), we could accommodate that. Wikipedia, however, generally follows the 1 species = 1 article model, so this would actually make Wikidata more compatible with Wikipedia, not less, as it's more likely that relevant data for a species won't be split across multiple items. The downside of not including every database identifier for every synonym is a very minor downside, in my opinion, and I see very little practical impact from it. Kaldari (talk) 13:48, 9 July 2020 (UTC)[reply]
@Peter coxhead: Any thoughts regarding my response above? I would also like to mention that many database identifiers for synonyms just go to useless pages which have no more information than author and publication year, thus providing no additional information.[1] Are there specific cases you are thinking about where losing a database identifier for a synonym (that doesn't meet one of the criteria for creating a separate item) would actually result in some detriment to a Wikidata user or re-user? FWIW, I don't know of any Wikidata re-uses that are utilizing database identifiers for synonyms, so this seems like more of a theoretical concern than a practical problem. Kaldari (talk) 17:53, 16 July 2020 (UTC)[reply]
@Kaldari: sorry to be slow, I don't monitor Wikidata regularly. Databases differ significantly in what they do about synonyms, which complicates the issue.
  • The World Spider Catalog's identifier is for the taxon; if the name its compilers accept changes, the ID goes to the new name, and the old one is just listed as a synonym. So for this database, nothing is gained by having taxon names as items. Strings would be fine.
  • Tropicos (a plant taxonomic database) records names for those taxa it covers, listing for each any other sources that accepted those names. Its identifiers for synonymous taxon names are useful, because the entries contain useful information.
  • Many sources linked by taxon name IDs are not regularly updated, if at all; for example for plants the online Flora of China and the online Flora of North America. So they will often be using names no longer accepted in recent sources. Their entries can only be found under IDs for synonyms.
"I don't know of any Wikidata re-uses that are utilizing database identifiers for synonyms" – look at the taxonbar at the bottom of en:Reynoutria japonica. Calflora and FNA (Flora of North America) are useful sources, but they both use the synonym Fallopia japonica. If Fallopia japonica (Q899672) were not an item, we wouldn't have these IDs and hence links. Peter coxhead (talk) 06:35, 24 July 2020 (UTC)
The muddle between taxa and taxon names is one of the remaining places where names get confused with the concepts to which the names point. Having fewer of those cases, and only those cases which are needed for Wikipedia compatibility, is desirable.
The present system also has a problem when linking a taxon from within Wikidata with properties such as instance of (P31) or found in taxon (P703). Whenever we have synonyms, you would have to link to both the taxon and its synonym if you want to log all the relevant relations. ChristianKl 20:06, 9 July 2020 (UTC)
Wouldn't found in taxon (P703) be better linked to a circumscription (Q5121761)? --Succu (talk) 20:26, 9 July 2020 (UTC)

OK, for the sake of having somewhere to say this: I have been thinking about what @Rdmpage: and @Peter coxhead: have said, both here and elsewhere. I do not mean to hijack this, and this is not so much an issue with this proposal as with how Wikidata does taxonomy in general. As I and others have said above, the whole methodology here needs rebuilding, your structure. It's almost like you're trying to be a hybrid of a database and an encyclopedia. Wikidata is a database; you should treat it as one.

  1. Change the name of "instance of taxon" to "instance of taxon name"; this removes the confusion. Both Rod and Peter have discussed this issue and I agree with them. Then it is clear you are databasing all the available names for life.
  2. All names must be items; start with the names in current usage but add in synonyms as you can.
  3. All names are listed under their original combinations as the mainspace name. Your first statement will be the instance of taxon name; your second will be the original reference used in the circumscription of the name.
  4. Have statements for the type data and the type locality.
  5. A statement declaring the current nomenclatural status of the name, e.g. available, valid, homonym, etc., whatever is relevant.
  6. Then have a statement with the currently used combination, with the reference that made this change. Your parent identifier works for this.
  7. Then a statement with a list of its known synonyms.
  8. Then put in the extras, like images etc.
  9. Then your various identifiers, such as ITIS, and various checklists that use it, aggregators, whatever.

There may be others that can be justified, but these are the ones you need. Then you are databasing the information that is absolutely needed; the data can then be used elsewhere throughout Wikimedia and possibly further afield. You must include all parents, so you need items for any proposed ranks, including subgenera, subspecies, subfamilies etc. This is very hit and miss at present. Now, whether you put species in genera, or species in subgenera and subgenera in genera, I do not care. But you need to have a way of acknowledging the existence of the subgroups and having them queried in a way that the result places them in these ranks, if that's what the user wants. Databases serve: they provide information for other applications. Some of you may find this lecture on Wikiversity I created useful [2]. Cheers Scott Thomson (Faendalimas) talk 10:31, 10 July 2020 (UTC)

Does "all names are listed under their original combinations as the mainspace name." mean what I think it means? That is, have items only for original combinations, and add all subsequent combinations (as strings) to that item?
        Upon first glance this proposal seems to maintain all current problems and add new ones; it would lose information on a grand scale. - Brya (talk) 11:00, 10 July 2020 (UTC)
        P.S. The Wikiversity page should make clear that it does not apply to Nomenclature and Taxonomy in general, but to Zoological Nomenclature and Zoological Taxonomy only. - Brya (talk) 11:00, 10 July 2020 (UTC)
Yes, I do think you're better off having all names under the original combination; however, I also said all names should be items, not strings. Strings would cost you a lot of data, and I would not recommend that. I think this could be done without data loss; as I said, I did not suggest strings, and I did say to add in everything else you have, e.g. images. I would not refer to this as a proposal; it would need a lot of fleshing out and discussion first. At present I am just thinking out loud. What is here so far may not solve every problem. Of course it adds work; restructuring a database always does, so if that's the added problem, sure. I do not for one minute think it would be easy, but to discuss it further it should be done elsewhere. Since both the title and subtitle of the Wikiversity article state that it's using turtles as a case study, I figured the zoology part was clear. Cheers Scott Thomson (Faendalimas) talk 11:47, 10 July 2020 (UTC)
I am not sure if I understand how that would be a change in this respect from the current situation. All names are already linked to the original combination (where this applies), so the most this would mean would be a property that is the reverse of "original combination" (like a property "subsequent combination")?
        From the above I got the impression you wanted to gather all information in the item on the original combination. Logically, this would mean adding names (subsequent combinations) as strings. - Brya (talk) 16:39, 10 July 2020 (UTC)
@Faendalimas: I proposed basically the same thing two years ago and it went nowhere. Your proposal retains many of the problems of the current system and would be a maintenance nightmare. Every time a species was renamed, all of its synonyms would have to be updated. I'm tired of us going around and around in circles on this. I just want an easy way to add more synonym data to Wikidata without causing any other problems. Nothing that I'm proposing makes other more comprehensive solutions impossible in the future. Can't we just take a small step towards improving the current situation rather than endlessly re-hashing the entire taxonomy system (which always leads nowhere)? Please consider the merits of this modest proposal and whether or not it would actually lead to more useful data in Wikidata (which I think it would). Sure it's not perfect, but it's an improvement, and it doesn't make the existing situation any worse. Kaldari (talk) 20:10, 10 July 2020 (UTC)[reply]
Sorry, it was not my wish to hijack this. I saw it as an opportunity to highlight some of the issues of what is clearly a hybrid database. Anyway, above I have supported your proposal. Cheers Scott Thomson (Faendalimas) talk 21:00, 10 July 2020 (UTC)
Faendalimas, essentially, you proposed the status quo. It's clear that creating new items may appear difficult to users who have never done it, but I don't see how this new property would help users get started with Wikidata. That a few (automated) maintenance steps could (or should) be added isn't really an argument in favor of this property proposal. --- Jura 21:17, 10 July 2020 (UTC)
Jura, essentially I agree with you; I do not think anything I commented was in favor of this. But in the absence of a full redesign this will allow data to be added. I think people should be encouraged to make items instead of strings. However, in the absence of items at least something is there. I have supported it by concession, not because I think it's perfect. Cheers Scott Thomson (Faendalimas) talk 22:03, 10 July 2020 (UTC)
@Faendalimas: what you don't seem to be taking into account is that Wikidata needs to be able to model its data sources, and in particular their IDs (especially for the English Wikipedia, whose taxonbar uses IDs as entries to those sources). Repeating what I wrote above, for some sources, like the World Spider Catalog, the ID is for the taxon: if the accepted name changes, the entry has a new name but the same ID. Strings will work fine. In other sources, like Tropicos, IDs are for names, and there is useful information under the name. For these cases, we need each synonym to have a link to an ID. Strings won't work for this. Peter coxhead (talk) 06:35, 24 July 2020 (UTC)[reply]
@Peter coxhead: Couldn't we just add the taxon name IDs as qualifiers to the synonym string property? We would just need to change their property scope constraint, but that's easy. Then no data would be lost. Kaldari (talk) 15:11, 21 August 2020 (UTC)[reply]
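To make the qualifier idea concrete, here is a minimal sketch of how a synonym string statement could carry name-level IDs. It is only an illustration under stated assumptions: `SynonymStatement`, `ids_for_synonym`, the "ExampleNameDB ID" qualifier and the value `PLACEHOLDER-1` are all hypothetical and do not correspond to any existing property or real identifier. Only the item and the two synonym strings come from the examples at the top of this proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SynonymStatement:
    """One hypothetical 'taxon synonym string' statement on a taxon item."""
    taxon_item: str   # item the statement sits on, e.g. "Q675345"
    synonym: str      # the synonym itself, stored as a plain string
    # Qualifiers: name-level IDs from external sources (labels are made up).
    name_ids: Dict[str, str] = field(default_factory=dict)


def ids_for_synonym(statements: List[SynonymStatement], synonym: str) -> Dict[str, str]:
    """Return the name-level IDs attached to a given synonym string, if any."""
    for statement in statements:
        if statement.synonym == synonym:
            return dict(statement.name_ids)
    return {}


# Synonym strings taken from the proposal's examples; the ID is a placeholder.
statements = [
    SynonymStatement("Q675345", "Attus johnsonii", {"ExampleNameDB ID": "PLACEHOLDER-1"}),
    SynonymStatement("Q675345", "Phidippus bicolor"),
]
```

In this sketch a synonym with no name-level ID simply has no qualifiers, so sources whose IDs are per taxon would not need any.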

  Comment: As I see it, a prime problem is what is intended by "name". In the real world this is governed by the various Codes of nomenclature. These have ordered names into the following categories:

  1. not-formal names (not available, not validly published)
  2. formal names, that however may not ever be used as the correct name of a taxon (dead names, objectively invalid names, permanently invalid names) and
  3. formal names, that may be used as the correct name of a taxon, depending on taxonomy.

From the last category any taxonomist will select a subset that he accepts as correct names, but different taxonomists can select different subsets.

There are the following basic models that could be adopted:

  1. Single-Point-of-View: create a taxonomic point of view which adopts a set of correct names, and treats all other names as synonyms (without making a distinction between categories). Much of the taxonomic literature adopts this format, as does Wikispecies and, I guess, Commons. Pretty much by definition, it would only be compatible with itself, and not with different Wikipedias using different sets of correct names. A further problem with this SPoV would be that it would violate the NPoV and NOR policies of Wikipedias.
  2. Hybrid model: every name that potentially can be used as the correct name of a taxon can have its own item. By adding references it is possible to indicate which names are accepted by which taxonomist. Each of these items has information about the name (nomenclatural information) and the taxon (for which it has been used as the correct name). This is what happens in the real world, and to such an extent that many in the real world cannot distinguish between the correct name of a taxon and the taxon. This hybrid model can only work if a parallel structure is set up for names in category 2 (formal names that are dead, objectively invalid names, permanently invalid names) which interacts with the structure of potentially correct names.
  3. Names only. Items are for names only, containing nomenclatural information only (of all three categories). This will require setting up a parallel structure for taxa, with items on taxonomic information, and information about taxa. Only this parallel structure may serve as the interface between Wikipedias and the name-only items.

The problem is that different users prefer different models but without accepting the consequences of their chosen model, and just import data which they alter by where they put it. - Brya (talk) 05:27, 11 July 2020 (UTC)[reply]

There is another option by using TNUs but it requires also a lot of good design. Happy to explain it but I think I have taken up enough of this proposal with what is an issue to the side, I feel I am being unfair to the proposal. Set up a place to discuss the Wikidata taxonomy model and I am happy to provide input. Cheers Scott Thomson (Faendalimas) talk 06:41, 11 July 2020 (UTC)[reply]
@Faendalimas: https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Taxonomy would be a good place to talk about the Wikidata taxonomy model. ChristianKl 09:01, 11 July 2020 (UTC)[reply]
  •   Comment Given that this discussion is rumbling on without obvious resolution, and you will find similar, seemingly endless discussions involving people paid to think about taxonomy, might a pragmatic (and hence admittedly unsatisfactory) solution be to (a) use "also known as" as the place for synonyms as strings, and (b) if you want to specify that a name is a synonym, create an item for that name and connect it to the taxon of interest using taxon synonym (P1420)? For example, Anolis cristifer (Q5315308) has no taxon synonym (P1420) but there are synonyms in "also known as" (Query). Now, "also known as" can include all manner of things, such as common names, etc., and so isn't as precise as taxon synonym (P1420), but if the goal is to help discoverability then having a set of alternative labels for something is all you need. One could argue that if you really want to distinguish something as a taxonomic synonym (rather than just an alternative label), then you really should go to the trouble of spelling that out via taxon synonym (P1420). This seems a halfway approach that acknowledges that there is not universal agreement on the best way to model taxonomy in Wikidata, nor what the current model actually is, but still lets people do things without waiting for a resolution of the modelling issue. --Rdmpage (talk) 10:57, 16 July 2020 (UTC)[reply]
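A rough sketch of this two-tier approach, for illustration only: aliases serve discoverability, while taxon synonym (P1420) links carry the explicit claim. The `items` dictionary is a hypothetical in-memory stand-in, not real Wikidata content. The ID `Q_HYPOTHETICAL_1`, the placement of these strings as aliases, and the helper names `find_by_name` and `explicit_synonyms` are all assumptions.

```python
from typing import List, Set

# Hypothetical stand-in for a few items; not a dump of real Wikidata data.
items = {
    "Q675345": {
        "label": "Phidippus johnsoni",
        # "also known as" mixes synonyms and common names without distinction.
        "aliases": ["Attus johnsonii", "Phidippus bicolor", "red-backed jumping spider"],
        # Explicit synonym claims point at items (placeholder ID).
        "P1420": ["Q_HYPOTHETICAL_1"],
    },
    "Q_HYPOTHETICAL_1": {"label": "Attus johnsonii", "aliases": [], "P1420": []},
}


def find_by_name(name: str) -> List[str]:
    """Discoverability: match a name against labels and aliases alike."""
    wanted = name.casefold()
    hits = []
    for qid, item in items.items():
        names = [item["label"], *item["aliases"]]
        if any(wanted == candidate.casefold() for candidate in names):
            hits.append(qid)
    return hits


def explicit_synonyms(qid: str) -> Set[str]:
    """Precision: only names asserted as synonyms via P1420 item links."""
    return {items[target]["label"] for target in items[qid]["P1420"]}
```

Here a search on any alias finds the taxon, but only names linked through P1420 are reported as synonyms, so a common name stored as an alias is never mistaken for one.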