Wikidata:WikiProject Taxonomy/Tutorial

This page is a work in progress, not an article or policy, and may be incomplete and/or unreliable.
Please offer suggestions on the talk page.


This tutorial explains how to build a Wikidata item that deals with a taxon. A taxon is a group of organisms that a taxonomist has declared to be a unit. The scope of WikiProject Taxonomy includes taxa of all kinds: animals, plants, fungi, algae, prokaryotes and viruses, as well as fossils of these.

Obviously, Wikidata is not intended to be a Checklist (a list of 'right' scientific names), but is intended to hold data (preferably referenced data) that allows various Checklists (from various points of view: "names that are right according to ...") to be generated.

Basic properties

[Example taxobox: Panthera uncia; rank: species; scientific name of species: Panthera uncia]

Each item about a taxon like Panthera uncia (Q30197) should have the following four basic properties:

  • instance of (P31): Marks the item as a taxon. The value is taxon (Q16521).
  • taxon name (P225): The scientific name of the taxon. In this example: Panthera uncia. "Taxon name" is intended for the correct name of the taxon (correct according to the referenced viewpoint), usually the name under which it is treated on a Wikipedia. Synonyms should not be included here, nor should orthographical variants.
  • taxon rank (P105): The rank of the taxon. In this example: species (Q7432). Currently used ranks can be found here. If the taxon does not have a rank, as for example magnoliids (Q846071), set P105 to "no value", a special property status found to the left of the form field where the rank is entered.
  • parent taxon (P171): The next higher taxon (according to a particular classification; it generally is important to add the source, but a species automatically belongs to the genus that supplied the first part of its name). In this example: Panthera (Q127960). If this genus were divided into subgenera, parent taxon (P171) could point to the relevant subgenus (for subgenera, it is important to add good references).
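As an illustration, the basic properties can be pictured as a small set of statements attached to an item. The following is a minimal sketch in Python, assuming the four basic properties are instance of (P31), taxon name (P225), taxon rank (P105) and parent taxon (P171). The dictionary layout and the helper `missing_basic_properties` are hypothetical and are not part of any Wikidata API.

```python
# Hypothetical in-memory model of an item's statements (not the Wikidata API).
REQUIRED_PROPERTIES = {
    "P31": "instance of",
    "P225": "taxon name",
    "P105": "taxon rank",
    "P171": "parent taxon",
}

# Statements for Panthera uncia (Q30197), using the values given in the text.
panthera_uncia = {
    "P31": "Q16521",           # taxon
    "P225": "Panthera uncia",  # scientific name
    "P105": "Q7432",           # species
    "P171": "Q127960",         # Panthera
}


def missing_basic_properties(statements):
    """Return the IDs of basic properties that have no statement at all."""
    return [pid for pid in REQUIRED_PROPERTIES if pid not in statements]


print(missing_basic_properties(panthera_uncia))
```

A check like this only tests for presence; whether a value is correct still depends on the referenced taxonomic viewpoint.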

Label, description, picture and common name

Label, description, and aliases

The label usually is the scientific name of the taxon.

The description serves for disambiguation. It should be short and should tell a reader who knows nothing about the topic approximately where the item belongs. Descriptions that are useful for disambiguation are "species of mammal" or "genus of annelids"; these tell the reader roughly what kind of organism it is. Indicating the rank of the taxon is also helpful, especially for higher taxa. The scientific names of higher taxa have a termination that indicates the rank. However, many readers are not familiar with these terminations and will not know that Haliotidae is a family of animals, while Columbiformes is an order, so it is helpful to use "family of molluscs" and "order of birds". (For Desserobdella picta it is unhelpful to say that it is a species in the genus Desserobdella; the reader could already deduce that from the name, and is unlikely to know what kind of organism Desserobdella is: a fungus, a bacterium, or something else?)

An image for the taxon can be added with image (P18), giving the name of the image file on Commons.

A common name can be added with taxon common name (P1843): the property demands that the language is indicated. Each language may have any number of common names for any taxon. The property should be referenced.

Subgenera (zoology)

In zoology, the name of a subgenus consists of one word (Article 4.1). Some databases use the format "Plaxiphora (Plaxiphora)" to indicate a subgenus Plaxiphora placed in the genus Plaxiphora; that is, they provide not only the name of the subgenus but also its taxonomic position (as used in that database). This is not a good idea for Wikidata, as any subgenus can, in principle, be placed in any genus. Wikidata has, in principle (no exceptions are known), one item for each zoological subgeneric name (one name, one item). This item for a subgenus can have as many parent taxa (genera) as can be found in the literature. So for Wikidata, it is appropriate to use as label only the name itself (consisting of one word).
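When names are imported from a database that uses the "Genus (Subgenus)" format, the one-word label can be separated from the placement. The sketch below assumes that exact input format; the function name `split_subgenus` and the regular expression are illustrative choices, not an established tool.

```python
import re

# Matches the database convention "Genus (Subgenus)", e.g. "Plaxiphora (Plaxiphora)".
_GENUS_SUBGENUS = re.compile(r"^\s*([A-Z][a-z]+)\s*\(([A-Z][a-z]+)\)\s*$")


def split_subgenus(database_name):
    """Return (label, parent_genus) for a zoological subgenus name.

    The label is the one-word subgeneric name; the genus is the placement
    used by the source database. A bare one-word name yields parent_genus=None.
    """
    match = _GENUS_SUBGENUS.match(database_name)
    if match:
        genus, subgenus = match.groups()
        return subgenus, genus
    return database_name.strip(), None


print(split_subgenus("Plaxiphora (Plaxiphora)"))
```

The genus returned here would go into parent taxon (P171) with a reference to the database, while only the one-word name becomes the label.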

Naming rules

Scientific names are governed by Codes of nomenclature, which are described in the sections below.

The property Code of nomenclature (P944) should only be added to items for higher-level taxa, such as kingdoms or above.


[Example taxobox: Desserobdella picta; rank: species; parent taxon: Glossiphoniidae (unknown rank); author citation: Verrill, 1872]

The authorship of the name of the taxon can be set with the following basic qualifiers of taxon name (P225):

  • taxon author (P405): value = the item representing the author. If there is no such item, it has to be created first. If a name has more than one author, all of them should be added (in the proper sequence). In this example, the author is Addison Emery Verrill (Q353120).
  • year of publication of scientific name for taxon (P574): value = the year of publication of the name. In this example: 1872.

The whole author citation with year of publication can also be added as a single string using taxon author citation (P6507). For instance, the value for Clitocybe gibba (Q46795429) is "(Pers.) P.Kumm. (1871)".

The exact form of an author citation differs, depending on which Code of nomenclature (P944) applies.

The zoological Code (ICZN)

The International Code of Zoological Nomenclature (Q13011) applies to animals, and typically uses

  • Bos taurus Linnaeus, 1758 besides
  • Nanger dama (Pallas, 1766).

That is,

  1. it uses the full last name of the author(s). For example, "Linnaeus" is used for Carolus Linnaeus (and not the latter-day noble name von Linné). And
  2. it uses parentheses "( )" to indicate that the taxon has later been moved to a different genus. In Wikidata, this can be indicated by adding "instance of recombination" (instance of (P31)=recombination (Q14594740)) as a qualifier.
    Note: the website Mammal Species of the World (erroneously) cites all authors without parentheses.
  3. mentioning the year is not required, but is done very often.
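The three conventions above can be sketched as a small formatting function. This is only an illustration: the function name `zoological_citation`, its parameters, and the use of "&" to join several authors are assumptions made for the example.

```python
def zoological_citation(taxon_name, authors, year=None, recombination=False):
    """Build a zoological-style citation from full author last names.

    Parentheses are added when the taxon has been moved to a different
    genus (a recombination); the year is optional.
    """
    author_part = " & ".join(authors)  # joining with "&" is an assumption
    if year is not None:
        author_part = f"{author_part}, {year}"
    if recombination:
        author_part = f"({author_part})"
    return f"{taxon_name} {author_part}"


print(zoological_citation("Bos taurus", ["Linnaeus"], 1758))
print(zoological_citation("Nanger dama", ["Pallas"], 1766, recombination=True))
```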

Special properties

  • original combination (P1403) can be used if the original combination has its own item. In that case, the original combination will have the same authorship, but without "instance of recombination" as a qualifier. For example, Felis leo (Q15294488) is the original combination of Panthera leo.
  • replaced synonym (for nom. nov.) (P694) is used to indicate an earlier name, when a nomen novum has replaced it. A nomen novum or replacement name is a new name for a taxon that had a completely different name before. A replacement name has its own author(s) and date of publication. The only thing it has in common with the earlier name is the nomenclatural type. The earlier name needs to have its own item; if it does not exist, create it.
  • author citation (zoology) (P835) can be used to accommodate differences between zoology and botany in the customary form of the name of the author, when used in a citation. The claim is to be placed in the item on the author. As its value it takes the form of the name of the author as used in zoology: this may be just the last name, or the last name with initials, as the case may be (some last names are shared by different authors, and initials are helpful in telling them apart).

The ‘botanical’ Code (ICNafp, formerly ICBN)

[Example taxobox: Gentianales; scientific name of order; higher taxa shown: core eudicots, euasterids I; author citation: Juss. ex Bercht. & J.Presl (1820)]

The International Code of Nomenclature for algae, fungi, and plants (Q693148) applies to plants, fungi, and algae, and uses standard author abbreviations, for example "Juss. ex Bercht. & J.Presl (1820)" for Gentianales (Q21754).

For taxa above the rank of genus, author(s) of a basionym are never cited.

Special properties

It is possible to express the authorship and status using the following additional properties:

  • basionym (P566): Both the author(s) of the original name and the author(s) who placed the taxon in its current rank and/or position must be shown. The author(s) of the original name (basionym) are not added here but are added at the item of the original name (basionym), which is then linked to by using basionym (P566); see for example Maihuenia (Q134515). The basionym must have its own item (unfortunately, Wikidata is missing many such items); if it does not exist, create it.
  • replaced synonym (for nom. nov.) (P694) is used to indicate an earlier name, when a nomen novum has replaced it. A nomen novum or replacement name is a new name for a taxon that had a completely different name before. A replacement name has its own author(s) and date of publication. The only thing it has in common with the earlier name is the nomenclatural type. The earlier name needs to have its own item; if it does not exist, create it.
  • ex taxon author (P697): An "ex-author" (who did not himself publish the name, but who is credited with inventing the name) can be added with this property. For Gentianales (Q21754), the authors Friedrich von Berchtold (Q940150) and Jan Svatopluk Presl (Q379593) decided to honor Antoine Laurent de Jussieu (Q223963) as ex-author.
  • botanist author abbreviation (P428) can be used to accommodate differences between zoology and botany in the customary form of the name of the author, when used in a citation. The claim is to be placed in the item on the author. As its value it takes the standard form (standard abbreviation) of an author's name (a unique form, a one-to-one relationship). For example, the official abbreviation of Carolus Linnaeus (later also named von Linné) is "L.". In this case, add botanist author abbreviation (P428)="L." to the author's item Carl Linnaeus (Q1043).
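How the pieces (standard abbreviations, ex-authors, basionym authors, year) combine into one citation string can be sketched as follows. The function `botanical_author_citation` and its parameter names are hypothetical; the two strings reproduced are the ones quoted in this tutorial for Gentianales and Clitocybe gibba, and the layout with the year in parentheses follows those examples only.

```python
def botanical_author_citation(authors, year=None, ex_authors=(), basionym_authors=()):
    """Assemble a botanical-style author citation from standard abbreviations.

    basionym_authors go first, in parentheses; ex_authors precede the
    publishing authors, separated by "ex"; the year follows in parentheses.
    """
    parts = []
    if basionym_authors:
        parts.append("(" + " & ".join(basionym_authors) + ")")
    publishing = " & ".join(authors)
    if ex_authors:
        publishing = " & ".join(ex_authors) + " ex " + publishing
    parts.append(publishing)
    if year is not None:
        parts.append(f"({year})")
    return " ".join(parts)


print(botanical_author_citation(["Bercht.", "J.Presl"], 1820, ex_authors=["Juss."]))
print(botanical_author_citation(["P.Kumm."], 1871, basionym_authors=["Pers."]))
```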

The bacteriological Code (ICNP, formerly ICNB or BC)

The International Code of Nomenclature of Prokaryotes (Q743780) applies to bacteria, and uses

  • Bacillus subtilis (Ehrenberg 1835) Cohn 1872,

that is, with a basionym, two sets of authors and two years of publication.
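A citation in this form can be taken apart into its two author-and-year pairs. The following sketch assumes the exact layout of the example above (no comma before the year, basionym authorship in parentheses); the pattern and the function `parse_prokaryote_citation` are illustrative and will not cover every citation found in the literature.

```python
import re

# Layout assumed: "Name (BasionymAuthors YYYY) Authors YYYY"
_CITATION = re.compile(
    r"^(?P<name>[A-Z][a-z]+(?: [a-z]+)*) "
    r"\((?P<basionym_authors>.+?) (?P<basionym_year>\d{4})\) "
    r"(?P<authors>.+?) (?P<year>\d{4})$"
)


def parse_prokaryote_citation(citation):
    """Split a citation into name, both sets of authors and both years.

    Returns None when the string does not follow the assumed layout.
    """
    match = _CITATION.match(citation.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["basionym_year"] = int(parts["basionym_year"])
    parts["year"] = int(parts["year"])
    return parts


print(parse_prokaryote_citation("Bacillus subtilis (Ehrenberg 1835) Cohn 1872"))
```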

The Code of Virus Classification (ICTV)

The International Code of Virus Classification and Nomenclature (Q14920640) applies to virus taxa. No author citations are given.

A name that is based on an earlier name (TODO)

The type of a name

[Explanatory note: The type of a name ("nomenclatural type", "name-bearing type") anchors a name to a taxon. There is a popular myth that a type defines the taxon, but this is an obvious impossibility (except perhaps for species of insects). A name has only one type, except perhaps when an epitype has been added. In the case of animals it may occur that several syntypes together form the type (until a taxonomist comes along to clear up the situation).]

Adding the type of a name (type (Q3707858)) can be done with taxonomic type (P427). This uses "item" as its datatype, so this works for names above the rank of species.


Ginkgo (Q149461), type (Q3707858): Ginkgo biloba (Q43284).

In principle, this will also work for names at the rank of species and below, but this requires making an item for each and every type (specimen, illustration or description). Making such new items will be most rewarding for type specimens that are available online, which can be linked to.


Ginkgo biloba (Q43284), lectotype (Q2439719) : Gordon, Herb. Linn. No. 1292B.2 (LINN) (Q20819552)
Acanthocalycium ferrarii (Q337692), holotype (Q1061403) : Rausch 572 (Q19359611)

Note that for the ICNafp it is also correct (actually not just correct, but more accurate) to state:

Ginkgo (Q149461), type (Q3707858): Gordon, Herb. Linn. No. 1292B.2 (LINN) (Q20819552) ,

but this is not possible under the zoological Code, so is not generally recommended (the exception is when this represents the actual situation, that is, a specimen or illustration is the type of a genus, but not the type of a species).

  • Original combination (P1403)
    The zoological Code (ICZN) is very complicated, including in its rules for types. The Wikidata property "original combination (P1403)" is intended to indicate the combination (name) in which the "species-group name" was first published. The property is used when a species or subspecies has been assigned to a different position (in another genus). Both names have the same type (and also the same authorship and the same date).
    - Parus cristatus is the original combination for Lophophanes cristatus (Q207831).
  • Basionym (P566)
    In the Code for names of algae, fungi and plants (ICNafp), a basionym (P566) is the name that another name is based on. This property is used when a taxon has been assigned to a different position (in another genus), or a different rank. Both names have the same type (but each name has its own authorship and date).
    - Libocedrus papuana (Q15043538) is the basionym for Papuacedrus papuana (Q133224).
    - Sempervivum sect. Jovibarba (Q15903589) is the basionym for the generic name Jovibarba (Q159607).
  • Replaced synonym (P694) and replacement name (Q749462)
    The property "replaced synonym (for nom. nov.) (P694)" expresses the relationship with a replacement name (Q749462). The new name, the replacement name (nomen novum), has the same type as the older, replaced name (each name has its own authorship and date). A replacement name is published to replace:
    - a name that can never be used: Fleroya (Q5862554) is the replacement name for Hallea (Q5642838), which proved to be a later homonym (and thus can never be used).
    - a name that in a particular position cannot be used as a basionym. Polygonum bistorta (Q12954907) cannot be used as a basionym for a name in Bistorta (Bistorta bistorta is not a possible name), so a new name was published, Bistorta officinalis (Q112917), with the same type (a replacement name). Similarly, Douglasia gormanii (Q17244429) cannot be used as a basionym for a combination in Androsace, as there already was a name Androsace gormanii (applying to another plant), so a new name was published, Androsace constancei (Q15316421), with the same type (a replacement name).


There are many databases available on the World Wide Web, but these vary in many respects. Some are purely nomenclatural databases, like International Plant Names Index (Q922063) (IPNI); these can be used to reference a nomenclatural act (like the publishing of a name), but they do not express any taxonomic viewpoint. There are databases that are organized on a nomenclatural basis, but with some taxonomy included, like Tropicos (Q2578548). Then there are databases that are organized from a single taxonomic viewpoint (with "wrong" and "right" names). So it is always important to keep in mind what kind of database one is dealing with.

The quality of databases can vary strongly. Some databases are known to have significant amounts of outright nonsense, like ZipcodeZoo (Q15078690), Catalogue of Life (Q38840) (CoL), and The Plant List (Q625817) (for the material wrongly copied from Tropicos). Encyclopedia of Life (Q82486) (EoL) copies all the content of Catalogue of Life, and is therefore not reliable as such (Global Biodiversity Information Facility (Q1531570) (GBIF) also copied a lot of errors from CoL, but appears to have successfully cleared these up). The Integrated Taxonomic Information System (Q82575) (ITIS) is better, but is mostly behind the times.

Other databases such as Avibase (Q20749148), Catalog of Fishes (Q9185167), GRIN Taxonomy for Plants (Q19576476), Reptile Database (Q1644501), and World Register of Marine Species (Q604063) (WoRMS) are very well kept and can generally be used as a reference.

When a taxonomic viewpoint needs to be referenced (and taxonomic viewpoints usually do need to be referenced), online databases generally are to be used only as a temporary measure. It is much preferred to use taxonomic literature as a reference.

Taxonomy and Wikidata

Wikidata aims

  1. to link similar pages on Wikipedias (so-called interwiki links) and
  2. to provide standardized data. For taxa such standardized data include the scientific name, the author citation of that name and the date of publication of that name. This can be used in Wikipedias for example in templates.


Ideally, in Wikidata, each taxon should have its own item, marked with instance of (P31)=taxon (Q16521). If the same taxon (aka the same group of organisms) with the same name has more than one item, the items should be merged. If a particular Wikipedia has two separate pages by the same name (duplicate pages), this cannot be resolved here, but can only be dealt with after that Wikipedia has dealt with it.

However, it is quite possible that a Wikipedia correctly has two or more pages with the same scientific name, dealing with different uses of that name (different circumscriptions). For example, the same name used for a 'big' taxon and a 'small' taxon: Magnoliidae as used by Cronquist (1981) is much smaller than (and is subdivided quite differently from) Magnoliidae as used by Chase and Reveal (2009).

Also, it may be that one particular taxon has been given different names by different authors, as the result of different placements. Often enough, there will be different names which may or may not apply to the same taxon. If these are heterotypic names, each name must have a separate item, and taxon synonym (P1420) can be used (preferably with a good reference). At the moment a synonym can only be indicated if the name does have its own item. For homotypic names, see below.

In some cases there is no clear-cut solution for how a taxon should be dealt with on Wikidata. What is, and is not, possible on Wikidata depends partially on what is in the Wikipedias. As a rule of thumb, any scientific name that was used in prominent literature for a taxon may be given its own item, regardless of whether or not it has iw-links.

Not to be treated as taxa

What should not be treated as taxa are products of taxa (fruits, timbers, etc). Products of taxa can be linked to taxa by natural product of taxon (P1582) and this taxon is source of (P1672). See below.

What also should not be treated as taxa are cultivated forms (dog breeds, pigeon breeds, etc). Cultivated forms can be linked by "instance of:". The exception is named cultivars, which may be given a "taxon name", although these really are not taxa (= taxonomic groups) either.

What also should not be treated as taxa are names that cannot be used as the correct name of a taxon. See below.

Taxonomy changes

[Example taxobox: Desserobdella picta (Verrill, 1872); rank: species; other names shown: Clepsine picta Verrill, 1872 and Placobdella picta (Verrill, 1872)]

Taxonomy is a science and subject to change. The default way for Wikidata to handle different scientific viewpoints is by using a separate item for every name. Like Wikipedia, Wikidata is not supposed to choose The One True Taxonomy, but rather to document the various scientific viewpoints. Where taxonomy is concerned, references are important: a reference is preferably a taxonomic work (general databases like ITIS, The Plant List, etc are not all that helpful or reliable).

Taxon synonym

Synonyms are non-current names (that is, non-current from the referenced point of view). At this stage, it is possible to add synonyms via taxon synonym (P1420), but only when there already is an item for this name. It is desirable to add a reference to any use of taxon synonym (P1420), as what is, and what is not, a synonym will vary, depending on taxonomic viewpoint.

The property "taxon synonym" P1420 is primarily intended to generate a list of synonyms for a taxobox, so it should contain well-known synonyms as values. Theoretically it should become possible to produce more than one list of synonyms from one and the same item, representing more than one taxonomic point of view, by using the references provided. In practice, this is a long way off.
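The idea of producing one synonym list per taxonomic viewpoint can be pictured as grouping the synonym claims by the reference that supports them. The sketch below is purely illustrative: the claim tuples, the placeholder names "Synonym X", "Synonym Y" and "Reference A", "Reference B", and the function `synonyms_by_reference` are all hypothetical, and no such feature is claimed to exist on Wikidata.

```python
from collections import defaultdict


def synonyms_by_reference(claims):
    """Group (synonym, reference) pairs into one synonym list per reference."""
    grouped = defaultdict(list)
    for synonym, reference in claims:
        grouped[reference].append(synonym)
    return dict(grouped)


# Placeholder data: each taxon synonym (P1420) claim paired with its reference.
claims = [
    ("Synonym X", "Reference A"),
    ("Synonym Y", "Reference A"),
    ("Synonym X", "Reference B"),
]

for reference, synonyms in synonyms_by_reference(claims).items():
    print(reference, synonyms)
```

This only works to the extent that every use of taxon synonym (P1420) actually carries a reference.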

Possibly, some time in the future, there may be an additional property that allows to add synonyms as strings, but for the moment they can only be added as "also known as".

The reverse relationship ("this name is a synonym of") can be indicated by "instance of: synonym", with a qualifier "of (P642)" pointing at the current name item - but perhaps a new property would be a good idea.

Monotypic taxon

Use of "instance of monotypic taxon" (Q310890) indicates that there is only one species in that genus, only one genus in that family, or only one family in that order. This makes it possible (for Wikipedias that prefer doing so) to have a taxobox focused on both a species and a genus (or both a genus and a family, or both a family and an order).

In the case of a genus that contains only one species, it generally will be possible to put all the sitelinks (of both the genus and the species) in the same item, that of the species. [For dinosaurs (and the like), it may happen that there is only one item, that of the genus, where sitelinks of both the species and the genus are placed.] This does not hold for a genus and a family, or a family and an order.

It is desirable to accompany any use of "instance of monotypic taxon (Q310890)" by a good taxonomic reference. After all, the number of lower-level taxa in a taxon may well vary, depending on taxonomic viewpoint: one taxonomist's one-species family is another's six-hundred species family.

At some point a user ran a bot that rendered every "taxon" as a "monotypic taxon", so there are many inappropriate claims to be found of "monotypic taxon" that need to be converted.

Homotypic names

Homotypic names are names based on the same type (a type is an objective reference, an anchoring point). Usually, homotypic species names have the same specific epithet/name.

Nanger dama and Gazella dama are different names in use, reflecting different taxonomic viewpoints. For data purposes, this is best reflected by two different items, with the author citation "Pallas, 1766" placed in the item of the original name (Gazella dama), with original combination (P1403) = Gazella dama in the item of the other, derived name (Nanger dama). In one or both items taxon synonym (P1420) can be used to connect the items, referenced whenever possible (different references may take different points of view). Also, "instance of" "synonym" (Q1040689) "of" "Q####" can be used to connect items (preferably also referenced).
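The two-item arrangement can be sketched as a small data model in which the derived name looks up its authorship through the original combination. The dictionary keys (`gazella_dama`, `nanger_dama`), the field names and the function `resolve_citation` are hypothetical stand-ins for real item IDs and properties.

```python
# Hypothetical stand-ins for two Wikidata items; keys are not real item IDs.
items = {
    "gazella_dama": {
        "taxon name": "Gazella dama",
        "author": "Pallas",
        "year": 1766,
    },
    "nanger_dama": {
        "taxon name": "Nanger dama",
        "original combination": "gazella_dama",  # models original combination (P1403)
    },
}


def resolve_citation(key, items):
    """Return the name with its author citation.

    A derived name takes its authorship from the original combination
    and shows it in parentheses.
    """
    item = items[key]
    original_key = item.get("original combination")
    if original_key is None:
        return f'{item["taxon name"]} {item["author"]}, {item["year"]}'
    original = items[original_key]
    return f'{item["taxon name"]} ({original["author"]}, {original["year"]})'


print(resolve_citation("gazella_dama", items))
print(resolve_citation("nanger_dama", items))
```

Storing the authorship once, in the item of the original name, keeps the two items from drifting apart.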

The various claims should be put in the item of the name they apply to. Normally, the interwiki-links are all put in one of the items (for connectivity): see Carduelis flammea (Q206701) and Common Redpoll (Q20754771). Hopefully all the various information and all the linked sites can be found by the end-users through the taxon synonym (P1420).

However, it is not unheard of for a Wikipedia to have multiple pages on the same species, each under one of the synonyms. In that case, there is no other option than to put these in the items which use that name.

Higher-level taxa

For higher-level taxa the nomenclature can be very involved, like the conifers (which have lots of names). It may be an option to put all the names in taxon name (P225), but presumably, it is best to have an item for each name, connected by said to be the same as (P460).

Names that are not taxa

Under any of the nomenclature Codes, there exist more scientific names than there are taxa. Most of these exist to reflect various taxonomic (scientific) insights. The aim of a Code of nomenclature is to provide one correct name for any one taxon, with one particular circumscription, rank and position. On the other hand, changes in circumscription, rank or position (taxonomic viewpoint) may lead to different names.

But there also exist very many scientific names that can never be used for a taxon (no matter what taxonomic viewpoint is adopted), names that for all practical purposes are effectively dead (or semi-dead). They are best forgotten except for processes internal to nomenclature.

As such scientific names are not notable (as a rule; there may be rare exceptions), they do not deserve a Wikidata item, unless they serve a structural purpose. Nevertheless, there are Wikipedias that have pages on them (although they should not), in which case there perforce needs to be a Wikidata item. Also, sometimes such names do serve a structural purpose. Then they will have an item of their own as well.

Such items should hold "instance of (P31) later homonym", "earlier homonym", "illegitimate name", etc, as may be the case. It may be helpful to add "instance of (P31) [family, etc]", to indicate where approximately it belongs. Clearly, "taxon name (P225)" is not appropriate: it would be nice if there were a separate property for these names, but there is not one yet. The name accompanied by its author citation can be entered in the "label" field.

The -ii question

A recurring problem is that of names that can be found in two forms: one ending in -i and one ending in -ii (the difference is an honorific "i", added to give credit). The rules differ between animals on the one hand and algae, fungi and plants on the other.


The zoological Code prefers the formation of scientific names, honouring a person, with the single -i, but in earlier times many names were published with the double -ii. The original form (as when first published) is to be used, except when an altered form has been very widely adopted ("prevailing usage"). In practice, this means that it is desirable to have a taxonomist specialising in that particular group decide on this.

The zoological Code also allows personal names to be treated as Latin. A recurring issue is personal names in -a; these can be given a normal Latin genitive -ae. In the literature many cases can be found where this is 'corrected' to -ai: in such cases the original form is to be used.

If a scientific name honours a woman or more than one person, -i (or -ii) should be corrected to the appropriate form (-(i)ae for a woman, -(i)orum for more than one person, unless they are all women when it should be -(i)arum).


For algae, fungi and plants the basic rule is that the original spelling (when first published) is to be followed, except when a name honours a person by using his surname. A scientific name ending in -i based on a surname of a man ending in a consonant should be corrected to -ii (but -i after a vowel or -er: -ai, -ei, -ii, -oi, -ui, -yi, -eri). If a scientific name honours a woman or more than one person, -i (or -ii) should be corrected to the appropriate form (-iae for a woman, -iorum for more than one person, unless they are all women when it should be -iarum).
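The basic termination rule for a name honouring one man by his surname, as stated above, can be sketched as a simple function. This is a deliberately narrow illustration: the function `masculine_epithet` is hypothetical, the surnames are arbitrary examples, and the sketch ignores Latin or latinized personal names, given names, geographic names and the other cases where the original spelling must be kept.

```python
# "y" is treated as a vowel, matching the "-yi" ending listed in the rule.
VOWELS = "aeiouy"


def masculine_epithet(surname):
    """Form the epithet for one man's surname under the rule as stated.

    Surnames ending in a vowel or in -er take -i; all others take -ii.
    """
    stem = surname.lower()
    if stem.endswith("er") or stem[-1] in VOWELS:
        return stem + "i"
    return stem + "ii"


for surname in ["Smith", "Hooker", "Gray"]:
    print(surname, "->", masculine_epithet(surname))
```

A result from such a function is at most a hint that a spelling needs checking; whether a published epithet should be corrected still requires establishing whom or what the name honours.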

Not to be corrected are scientific names basing themselves on 1) personal names already in Latin (caroli for Carolus, Charles, Carl, Karl, Karel, etc, petri for Petrus, Peter, Pete, etc) or having a well-known latinized form (Hieracium bauhini after Bauhin, in the contemporary Latin literature known as "Bauhinus"), 2) ad hoc latinizations of personal names (if sufficiently distinct) and 3) translations into Latin of personal names (Schoenoplectus tabernaemontani after Bergzabern, Wollemia nobilis after Noble). These are to be left in the original form. There are also some unusual Latin genitives (for personal names on -o and -on), like Muhlenbergia richardsonis (after Richardson) which are to be accepted as they are.

There is a very extensive and old tradition of using Latin forms for a given name, so in practice these deserve extra caution. For a person named Charles both the traditional Latin form caroli (for Carolus) and the modern form charlesii can be correct, depending on the original spelling. An epithet published as alberti (after a given name Albertus, Albert, Bert, etc) is correct, as is an epithet published as albertii (after a given name Albert).

Also to be left as they are, are names after geographic features, even though these have names looking like personal names.

In practice this means that it is important to make certain that a scientific name honours a person (and not a geographic feature), and determining how it fits in Latin (classical Latin, early Latin, medieval Latin, botanical Latin).

Taxon is source of

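Sometimes taxa are the source of economically important products. Such products are not treated as taxa, but can be linked to the taxon with natural product of taxon (P1582) and this taxon is source of (P1672).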


An ad hoc Glossary for taxonomy on Wikidata:

  • taxon name = a scientific name that can be used as the correct name of a taxon (depending on taxonomic point of view)
  • taxon = a group of organisms that a taxonomist judges to be a unit. A taxon usually has a correct scientific name, but unnamed taxa can be found in the literature
  • fossil taxon = a taxon described on the basis of fossil material (as in the ICNafp)
  • clade = a taxon circumscribed using cladistic methods (a monophyletic group)
  • taxon synonym / synonym = a scientific name that also applies (in some way) to this taxon, but is not the correct name. A synonym may be a name that in a different taxonomic point of view would be the correct name for a taxon, but this need not be the case
  • later homonym = a scientific name (published for a taxon) with the same spelling as another, earlier scientific name (for a different taxon) (as in the ICNafp). A later homonym may never be used as the correct name of a taxon (no matter the taxonomic point of view)
  • earlier homonym = a scientific name (published for a taxon) with the same spelling as another scientific name (for a different taxon) that was published later, and with the later name being protected (conserved) (as in the ICNafp). An earlier homonym may never be used as the correct name of a taxon (no matter the taxonomic point of view)
  • basionym = usual meaning (as in the ICNafp)
  • original combination = (in analogy with basionym, but for animals) the first combination that used this particular second part of a species name or this particular third part of a subspecies name.
  • designation = not formally established as a scientific name; does not exist as a scientific name (as in the ICNafp), and obviously may never be used as the correct name of a taxon (no matter the taxonomic point of view)
  • unavailable combination is used for a combination with a generic name that may not be used (such as illegitimate names, later homonyms): the resultant combination may never be used as the correct name of a taxon (no matter the taxonomic point of view)

