Wikidata:Property proposal/taxon author citation

taxon author citation

Originally proposed at Wikidata:Property proposal/Natural science

Data type: String
Allowed values: string pattern
Example 1: Pseudalethe poliophrys (Q27075654) → Sharpe, 1902
Example 2: Cerasus maximowiczii (Q11179187) → (Rupr.) Kom. (1932)
Example 3: Tromikosoma uranus (Q2178790) → (Thomson, 1877)
Planned use: Infoboxes such as this one, and/or combining taxon name (P225) with the proposed property to retrieve the exact and proper taxonomic citation
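
The planned use, combining taxon name (P225) with the proposed string, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `full_citation` is a hypothetical helper name, and the taxon names are taken from the item labels in the three examples, assuming the P225 value of each item equals its label.

```python
def full_citation(taxon_name: str, author_citation: str) -> str:
    """Join a taxon name (P225 value) and an author citation string.

    Hypothetical helper: the citation is treated as an opaque string,
    so no botanical or zoological rule is applied here.
    """
    return f"{taxon_name.strip()} {author_citation.strip()}"


# Values copied from the three examples in the proposal.
# Assumption: each item's P225 value is the same as its label.
examples = [
    ("Pseudalethe poliophrys", "Sharpe, 1902"),
    ("Cerasus maximowiczii", "(Rupr.) Kom. (1932)"),
    ("Tromikosoma uranus", "(Thomson, 1877)"),
]

for name, citation in examples:
    print(full_citation(name, citation))
```

Because the property value is a plain string, the helper does no parsing; whatever nomenclatural convention applies is carried by the stored value itself.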


I have heard that this has already been discussed, but this is a different property from taxon author (P405). I think there is both a need and enough room for both properties. One is about a human (the author) and is useful as such; the other will be used to retrieve the exact nomenclatural author citation. See also [1] made by user:Liné1. Christian Ferrer (talk) 14:52, 1 December 2018 (UTC)


Achim Raschka (talk)
Andrawaag (talk)
Brya (talk)
CanadianCodhead (talk)
Christian Ferrer (talk)
Dan Koehl (talk)
Daniel Mietchen (talk)
FelixReimann (talk)
Infomuse (talk)
Infovarius (talk)
Jean-Marc Vanel
Joel Sachs
Josve05a (talk)
Klortho (talk)
Lymantria (talk)
Mellis (talk)
Michael Goodyear
Mr. Fulano (talk)
Nis Jørgensen
Peter Coxhead
Andy Mabbett (talk)
Prot D
Rod Page
Strobilomyces (talk)
Tommy Kronkvist (talk)
Tris T7 TT me
William Avery
  Notified participants of WikiProject Taxonomy Christian Ferrer (talk) 15:05, 1 December 2018 (UTC)

  •   Oppose WD is about structured data. --Succu (talk) 15:15, 1 December 2018 (UTC)
    • A citation is obviously data and, here, is information specific to each taxon; this property will integrate it into the structure, which makes it structured data. Christian Ferrer (talk) 15:27, 1 December 2018 (UTC)
    • A string of text that can be specific to each item is structured data, no less so than taxon name (P225), which is also a property with string values. Christian Ferrer (talk) 15:33, 1 December 2018 (UTC)
      • @Christian Ferrer: I remember a taxon name where two different persons named Li were involved in establishing the name. I was able to link them to the correct items. Are you aware that ICBN and ICZN have different rules for shortening authorship citations with et al. (Q311624)? --Succu (talk) 22:47, 1 December 2018 (UTC)
        @Succu: That's exactly the purpose of this proposal: to retrieve the true author citations, whatever the kind of taxon (botany, zoology, etc.) and whatever the different rules that govern them, and then to be able to export this information to external places, for example infoboxes. Is there one example in any Wikimedia project (or on the web?) where code manages to retrieve the right citation without using a string value, while respecting the different nomenclatures such as the ones you mention above? That's exactly the purpose of this proposal: to store the author citation that follows the different nomenclatures. Christian Ferrer (talk) 04:05, 2 December 2018 (UTC)
        The purpose is to store the content of the "authority" field(s) that you can see in the various taxoboxes on ENWP, Commons, etc. It is exactly because there are several different nomenclatures, and because some cases are indeed very complicated, that we need a string value. Christian Ferrer (talk) 04:30, 2 December 2018 (UTC)
  •   Oppose Duplicative. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:59, 1 December 2018 (UTC)
    • Duplicative of what? What is the property that retrieves the nomenclature? Christian Ferrer (talk) 16:04, 1 December 2018 (UTC)
    • Is it possible that the next person who votes in opposition proposes something that works to retrieve the nomenclatural author citations ([2], [3]), which are very important in the scientific world and specific to each taxon (therefore structured data)? Is it possible? Christian Ferrer (talk) 16:09, 1 December 2018 (UTC) If you are not interested in scientific classifications, and in the structuring of the related scientific data, you may have to go elsewhere rather than prevent others from advancing. Christian Ferrer (talk) 16:13, 1 December 2018 (UTC)
  •   Support sure. Long-term this will be superfluous, but short-term this may well be useful. There is no telling how long it will take to get "taxon author" and "year of taxon name publication" completed, but it looks like it will not be anytime soon. - Brya (talk) 16:43, 1 December 2018 (UTC)
  •   Support I think this would be a very good idea, if it can be used to store the standard author string, which can be very much more complex than the examples above. If I understand it right, the proposal is to store the standard author string followed by the publication date in the new property. In my view the current system of holding this author information, described in the taxonomy tutorial, is too complicated to be practical. Please see this section of the tutorial talk page for the algorithm which I think is needed in the current system to extract the author string from the Wikidata data structure. Will people really take all that into account? Strobilomyces (talk) 18:05, 1 December 2018 (UTC)
    • Yes, you understood correctly: the purpose is to store, as string data, the part that is missing from what is currently stored with taxon name (P225). By simply combining both we will have the exact taxonomic citation. And yes, I too am sorry that this is not possible automatically, neither currently nor in the short or medium term; I am not even sure that it will ever be possible, though I will be happy if it happens one day. Christian Ferrer (talk) 18:37, 1 December 2018 (UTC)
      • Well, I think it is theoretically possible to store the author information in WD and derive the author string, but it needs a lot of population and data cleansing work. In fact I was intending to start to do this for fungi. But it is a good pragmatic decision to have the author string immediately available. Strobilomyces (talk) 21:05, 1 December 2018 (UTC)
      • I would like to make a further comment. There is a bug in the current data model for storing the author information, since the order of the authors in the author string is important but there is no satisfactory way of indicating this. At present the authors stay in the order in which they were entered, but in theory we should not rely on this: the order is not formally part of the data model and future modifications could change it. Currently if someone wants to change the order of authors, they must overwrite and re-enter the whole statement. In this sense a new property including the author string would not be 100% duplication. Strobilomyces (talk) 16:28, 9 December 2018 (UTC)
        • Could you please refer to an article of any of the Codes where keeping the order of authors is mandatory or even recommended? --Succu (talk) 21:49, 11 December 2018 (UTC)
      • The Codes are very clear that authors are to be cited in a particular order. For example see Art. 46 of the ICNafp and the Examples therein. There is no explicit statement that the order may not be reversed or altered (except Rec. 46C.2), as this is felt to be unnecessary (see Preamble 13). - Brya (talk) 03:43, 12 December 2018 (UTC)
  •   Support I have been enjoying the year of taxon name publication thing. Most of the Hook. fil. (1844) are wrong, due to an improvised title page (aka publisher kludge) back in 1844. I like the feeling of "cleaning up".--RaboKarbakian (talk) 18:10, 1 December 2018 (UTC)
    • Good example! Is the value of this new property string: Hook. fil. (1844), Hook. fil., 1844, Hook. f. (1844), Hook. f., 1844, Hook.f. (1844) or Hook.f., 1844?! RaboKarbakian: To which publication are you referring? --Succu (talk) 19:08, 1 December 2018 (UTC)
    • I remember the problem. The dates are well known. --Succu (talk) 19:33, 1 December 2018 (UTC)
      • @RaboKarbakian, Succu: Also, we should perhaps consider that within the fields of botany and mycology all author abbreviations are (supposed to be) unique for a specific author (for example "Alexander", i.e. Edward Johnston Alexander, 1901–1985), whereas within zoology they are not (e.g. "Alexander" can refer to at least seven different zoologists). –Tommy Kronkvist (talk), 22:16, 2 December 2018 (UTC).
  •   Oppose Duplicative. MargaretRDonald (talk) 18:46, 1 December 2018 (UTC)
  •   Support. This will be a good way to quickly improve the scientific quality of Wikidata. Today the authority citations are still missing on most taxon pages, which makes Wikidata look like an unserious source. 1. One major reason for this lack of correct name citations: it is far too complicated for editors to add the correct author citations without producing errors. There are not only two different ways of handling botanical and zoological new combinations; in addition, several new items have to be created for the basionym or the original name, and for authors that do not yet exist. That is extremely uncomfortable and a huge obstacle even for the most well-meaning editor. The full process of linking should of course follow as a second step, and can be done by editors experienced with these complicated Wikidata habits. 2. Adding a simple text string will also help programmers who need this combined data on other Wikimedia projects. --Thiotrix (talk) 13:57, 2 December 2018 (UTC)
    • At the beginning of WD there was a similar concern about creating items for references. Nowadays WD is full of unused items representing useful references. It's the same old song whenever something new and slightly more complex is introduced. --Succu (talk) 22:52, 3 December 2018 (UTC)
  •   Support I think this will be useful until the author and date are linked properly, and even then, will be good as a constraint check. --99of9 (talk) 01:27, 3 December 2018 (UTC)
    • Could you please give an example of such a constraint check, 99of9? --Succu (talk) 22:29, 3 December 2018 (UTC)
      • @Succu: for example, checking for cases where a string of numerals in the citation string does not match a value in year of taxon name publication (P574). Something more complicated could check surname matches. --99of9 (talk) 02:08, 4 December 2018 (UTC)
        • Circular argument. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:32, 5 December 2018 (UTC)
          • I can see why you say that, and in some sense many statements are duplicative and circular once the constraint is satisfied (e.g. many of our properties with inverse properties). But unsatisfied constraints are certainly not circular, and are useful for QA. So, for example, let's say a taxon item had been mis-linked to an incorrect taxon author (P405), because the author had the exact same name, and was also a biologist. With this new property, we would have the chance of picking this up, because the value of botanist author abbreviation (P428) on the linked wrong author's item would not match the abbreviation in this new property. --99of9 (talk) 01:12, 6 December 2018 (UTC)
            • A better way to do this is to check the reference that refers to the publication that established the new name or combination. That's possible without this new property. Until now nobody has installed such a complex constraint check. Even some of our simpler and more fundamental constraints are not fixed here, because only very few users here are interested in doing this hard work. --Succu (talk) 21:27, 6 December 2018 (UTC)
              • I am one of the few. Your method sounds good too, but much of QA is about having multiple ways of finding a problem. --99of9 (talk) 05:28, 19 February 2019 (UTC)
  •   Support Clearly and badly needed. A scientific name should not be separated from its author citation. @Succu: Wikidata is structured, yes, but biological nomenclature was designed before computers. Liné1 (talk) 10:12, 3 December 2018 (UTC)
    • What is essential is a clear reference to the nomenclatural act. Giving the authorship (with year) is only a shortcut for doing this. And it's not mandatory in the Codes. --Succu (talk) 22:35, 3 December 2018 (UTC)
      • Authority is totally mandatory and is part of what makes a taxon name unique: Physeter microps Linnaeus, 1758 is the Sperm whale, Physeter microps Fabricius, 1780 is the Killer whale !!! Liné1 (talk) 15:10, 8 December 2018 (UTC)
    • An author citation is optional. Neither Physeter microps Linnaeus, 1758 nor Physeter microps Fabricius, 1780 is the valid name of a taxon; both are synonyms. Neither name has a Wikidata item. In addition, it would take a minor miracle for Physeter microps Fabricius, 1780 to be the valid name of a taxon, since it started out as a later homonym. - Brya (talk) 12:33, 9 December 2018 (UTC)
      • And this same document states just after: "The original author and date of a name should be cited at least once in each work dealing with the taxon denoted by that name. This is especially important in distinguishing between homonyms and in identifying species-group names which are not in their original combinations". Hexasoft (talk) 09:48, 11 January 2019 (UTC)
        • Sure, but that does not contradict what I said. - Brya (talk) 12:07, 11 January 2019 (UTC)
  •   Support The citation string (which is part of the taxon definition) can't be deduced from author and source (i.e. author name + date) because it also reflects the taxon's history, which can lead to very complex citations (see examples). I can't figure out why images can have legends but taxa can't have author citations. Hexasoft (talk) 09:46, 11 January 2019 (UTC)
    • The "[c]itation string" can be deduced from the Wikidata item, once Wikidata is complete. It is just that Wikidata won't be complete for a long time. The "taxon definition" is something else entirely, and (except in a few cases) is not dealt with by Wikidata at all. Any taxon definition will have its own authorship, often different from that of the name. - Brya (talk) 12:07, 11 January 2019 (UTC)
      • I can't figure out how the citation can be deduced from the WD item. I mean, how can you deduce that it is "Linnaeus, 1758", or "(Linnaeus, 1758)", or "L., 1753", or "(L.) Merr., 1917"? Hexasoft (talk) 12:58, 11 January 2019 (UTC)
    • By looking at "basionym" or "original combination". There is a taxobox module that can take care of this automatically. - Brya (talk) 17:30, 11 January 2019 (UTC)
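
The constraint check suggested earlier in this discussion (a string of numerals in the citation string that does not match the value of year of taxon name publication (P574)) could look roughly like the sketch below. It is a minimal illustration, not an existing Wikidata constraint: the function names are hypothetical, the regular expression assuming four-digit years between 1500 and 2099 is an assumption, and the inputs are plain Python values rather than data fetched from Wikidata.

```python
import re
from typing import Optional, Set

# Assumption: publication years appear as four-digit numbers from 1500 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def years_in_citation(citation: str) -> Set[int]:
    """Return every four-digit year found in an author citation string."""
    return {int(year) for year in YEAR_PATTERN.findall(citation)}


def year_mismatch(citation: str, publication_year: Optional[int]) -> bool:
    """Flag a citation whose years do not include the P574 value.

    Returns False when either side is missing, because then there is
    nothing to compare.
    """
    years = years_in_citation(citation)
    if publication_year is None or not years:
        return False
    return publication_year not in years


print(year_mismatch("Sharpe, 1902", 1902))      # years agree, not flagged
print(year_mismatch("(Thomson, 1877)", 1878))   # years differ, flagged
print(year_mismatch("(Rupr.) Kom.", 1932))      # no year in string, not flagged
```

A flagged item is only a candidate for human review: the check compares digits and knows nothing about which year a given Code expects in the citation.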

  Info At the end of last year around one fifth of our taxon names had taxon author (P405) and year of taxon name publication (P574). This is ten times more than at the end of 2015. So we are making progress. --Succu (talk) 19:09, 11 January 2019 (UTC)

Yes, which is why this is a somewhat academic exercise. Half a million is a lot, but still only a fifth of the taxon names. I don't expect to see anybody adding tens of thousands of author strings, let alone millions. Still for some groups, it may make a difference. - Brya (talk) 09:07, 12 January 2019 (UTC)
It's modest progress in making structured data a reality. I cannot see any hint as to which source the values (> 2,500,000) of this new property would be drawn from. I hope not from Wikimedia projects... --Succu (talk) 20:41, 12 January 2019 (UTC)
  •   Support This is a good idea to move forward. In a way it is similar to author (P50) and author name string (P2093), which helped in moving the citation graph to where it is now. Andrawaag (talk) 22:59, 14 February 2019 (UTC)
    • Is it? My concerns about the format have not been addressed. Should we allow more than one value (e.g. the ex-author is not mandatory)? Should this new string property be removed if the "structure version" is present? --Succu (talk) 19:50, 15 February 2019 (UTC)
"Should we allow more than one value?" This is no more and no less an issue than with a potential "structure version", and no more an issue than in the other databases, encyclopedias, books, articles, etc. that give taxon names with a nomenclatural form of the author citation. I created this property proposal in order to make it possible to retrieve (at least) one correct author citation in other projects, not to list all the possibilities. I guess that a single "accepted" value should be OK; in any case that was what I had in mind. In my opinion we should prefer the most common form. The main issue is that without this property we do not have even one single accepted/correct value. But the door is open, and I'm not strictly opposed to this property accepting multiple values if there is a real interest, although that is not really my goal. "Should this new string property be removed if the "structure version" is present?" The purpose of this property is specifically to overcome the lack of a "structure version" for retrieving the author citation, a "structured version" which, as far as I am aware, is not about to exist in the near or medium term. Therefore yes: if a structured version is one day available (I insist on the "one day"), and if it works well and gives a satisfactory result, then why not remove this property, which will have lost its interest. Christian Ferrer (talk) 23:02, 15 February 2019 (UTC)

@Tommy Kronkvist, Strobilomyces, Brya, MargaretRDonald, Hexasoft, Thiotrix: @Andrawaag, Pigsonthewing, Liné1, Succu, 99of9, Christian Ferrer: @RaboKarbakian:   Done: taxon author citation (P6507). − Pintoch (talk) 08:41, 16 February 2019 (UTC)