Wikidata:Property proposal/Natural science


This page is for the proposal of new properties.

Before proposing a property

  1. Check if the property already exists by looking at Wikidata:List of properties (research on manual list) and Special:ListProperties.
  2. Check if the property was previously proposed or is on the pending list.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Start writing the documentation based on the preload form below and add it in the appropriate section.

Creating the property

  1. Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the proposal, by a property creator or an administrator.
  3. See steps when creating properties.

  On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2021/03.


Physics/astronomy

conductor for

   Under discussion
Description: the subject allows the object to flow through
Data type: Item
Example 1: gas pipe (Q55753892): pipe for the passage of gas → gas (Q11432): one of the four fundamental states of matter
Example 2: artery (Q9655): blood vessel that carry blood away from the heart → blood (Q7873): organic fluid which transports nutrients throughout the organism
Example 3: electrical conductor (Q124291): material that allows the flow of electrical current → electricity (Q12725): physical phenomena associated with the presence and flow of electric charge

Motivation

I was searching for a way to express that blood flows through arteries and found that we currently lack a property to describe this relation. I'm not 100% sure that "conductor" is the best word, and I'm open to other suggestions. ChristianKl❫ 13:13, 4 December 2020 (UTC)

Tobias1984
Snipre
Physikerwelt
Pamputt
Petermahlzahn
Jibe-b
Restu20
Daniel Mietchen
TomT0m
ArthurPSmith
Mu301
Sarilho1
SR5
DavRosen
Danmichaelo
Ptolusque
PhilMINT
Malore
Thibdx
Ranjithsiji
Niko.georgiev
Simon Villeneuve
Toni 001
Marc André Miron
  Notified participants of WikiProject Physics
Tobias1984
Doc James
Bluerasberry
Wouterstomp
Gambo7
Daniel Mietchen
Andrew Su
Peter.C
Klortho
Remember
Matthiassamwald
Projekt ANA
Andrux
Pavel Dušek
Was a bee
Alepfu
FloNight
Genewiki123
Emw
emitraka
Lschriml
Mvolz
Franciaio
User:Lucas559
User:Jtuom
Chris Mungall
ChristianKl
Gstupp
Geoide
Sintakso
علاء
Dr. Abhijeet Safai
Adert
CFCF
Jtuom
Lucas559
Drchriswilliams
Okkn
CAPTAIN RAJU
LeadSongDog
Ozzie10aaaa
Sami Mlouhi
Marsupium
Netha Hussain
Abhijeet Safai
ShelleyAdams
Fractaler
Seppi333
Shani Evenstein
Csisc
linuxo
Arash
Morgankevinj
Anandhisuresh
TiagoLubiana
ZI Jony
Antoine2711
Viveknalgirkar
JustScienceJS
Leptospira
Scossin
Starsign1971
Bibeyjj
  Notified participants of WikiProject Medicine

Discussion

  Comment: Why not adapt the carries (P2505) property? The two don't have the same fields, but they seem to carry the same idea. Bouzinac💬✒️💛 17:47, 4 December 2020 (UTC)
Given the examples of the property it seems to be something different. Tioga Pass (Q2436036) carries (P2505) California State Route 120 (Q758090) but Tioga Pass (Q2436036) <conductor for> motor car (Q1420) would match the relationship in my proposed property. ChristianKl❫ 21:32, 4 December 2020 (UTC)

Biology

Please visit Wikidata:WikiProject Taxonomy for more information. To notify participants use {{Ping project|Taxonomy}}
Please visit Wikidata:WikiProject Biology for more information. To notify participants use {{Ping project|Biology}}

taxon synonym string

   Under discussion
Description: synonym of the taxon name
Represents: synonym (Q1040689)
Data type: String
Template parameter: "synonyms" in en:template:speciesbox, en:template:taxobox, and en:template:automatic taxobox
Domain: taxa
Allowed values: scientific names
Example 1: Phidippus johnsoni (Q675345) → Attus johnsonii
Example 2: Phidippus johnsoni (Q675345) → Phidippus bicolor
Example 3: Phidippus johnsoni (Q675345) → Dendryphantes johnsoni
Planned use: addition of taxon synonyms for jumping spiders (99+% of which don't have items)
Robot and gadget jobs: Nothing currently planned, but there's a lot of potential here
See also: taxon synonym (P1420)

Motivation

The original proposal for taxon synonym (P1420) was to have it be a string property. However, the property was controversially created as an item property instead at the insistence of a single user. Fast forward six years and taxon synonym (P1420) is barely used (14,425 uses) despite the fact that almost every taxon on the planet has synonyms and many have dozens. This means that Wikidata still doesn't have useful data on taxon synonyms for 99.9% of taxa (even though Wikipedia, Wikispecies, and even Commons have this data). The fact is, no one is going to create Wikidata items for the millions of synonyms that exist (and it's debatable whether or not they even should in many cases, e.g. objective synonyms, taxonomic vandalism, etc.). These problems have been discussed ad nauseam on the property talk page without any solutions being reached, mainly because we keep letting the perfect be the enemy of the good. The only practical way forward, IMO, is to adopt Brya's compromise proposal to have separate item and string properties, similar to how author (P50) and author name string (P2093) work. This should allow us to finally move forward with adding more comprehensive synonym data to Wikidata while still allowing users to flesh out full items for taxon synonyms if they need to. Kaldari (talk) 16:01, 28 June 2020 (UTC)

How this property would be used in relation to taxon synonym (P1420)

Use of "taxon synonym string" (with taxon author (P405) and year of taxon publication (P574) qualifiers) would be the default for most taxon synonyms. In cases where there are ...

  1. Competing taxon concepts in recent literature (e.g. Prasophyllum uroglossum (Q65946197) vs. Prasophyllum fuscum (Q15488111))
  2. Disagreements about taxon level in recent literature (e.g. Bos taurus (Q20747334) vs. Bos primigenius taurus (Q20747320))
  3. Deprecated taxon names that are notable in their own right (e.g. Attus (Q4818757))

... use of taxon synonym (P1420) would be encouraged. In other words, we should only use taxon synonym (P1420) where there is a concrete reason for a Wikidata item to exist for that synonym and the information can't be equally-well handled with just a string and qualifiers. Kaldari (talk) 20:32, 28 June 2020 (UTC)
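
The rule of thumb above can be sketched as a small decision function. This is only an illustration under stated assumptions: `SynonymCandidate`, its field names, and `choose_property` are hypothetical names invented for this sketch, and since "taxon synonym string" has no property ID yet, it is represented by a placeholder label.

```python
from dataclasses import dataclass

# Hypothetical placeholder: the proposed property has no ID yet.
STRING_PROPERTY = "taxon synonym string (proposed)"
# Existing item-valued property: taxon synonym (P1420).
ITEM_PROPERTY = "P1420"


@dataclass
class SynonymCandidate:
    """Facts an editor or bot might know about one synonym (illustrative fields)."""
    name: str
    competing_concept_in_recent_literature: bool = False  # criterion 1
    rank_disputed_in_recent_literature: bool = False      # criterion 2
    notable_in_its_own_right: bool = False                # criterion 3


def choose_property(candidate: SynonymCandidate) -> str:
    """Pick the item property only if one of the three listed criteria holds."""
    needs_item = (
        candidate.competing_concept_in_recent_literature
        or candidate.rank_disputed_in_recent_literature
        or candidate.notable_in_its_own_right
    )
    # The string property is the default for everything else.
    return ITEM_PROPERTY if needs_item else STRING_PROPERTY
```

With this sketch, a synonym with no flags set falls through to the string property, while one flagged as notable in its own right is routed to P1420. Deciding whether a flag applies would remain a human judgement.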

Discussion

  •   Oppose:
  1. The purpose of author name string (P2093) is "undifferentiated author", but a taxon name is usually unique, so it is possible to create items for individual names.
  2. This proposed solution assumes there is only one canonical name for each taxon, but the canonical name may differ between classification systems.

--GZWDer (talk) 16:14, 28 June 2020 (UTC)

@GZWDer:
  1. That's not accurate. According to the description (and how the property is actually used), author name string (P2093) is for when the author is undifferentiated or doesn't have an item. Regardless, just because something is "possible" doesn't mean people are actually going to do it. For example, I really want to add synonyms to spider species, but I'm not about to create 100,000+ items in order to do it. Nor do I expect that anyone is going to create hundreds of thousands of Wikidata items for every author of a paper listed in Wikidata.
  2. True, in which case you can still use taxon synonym (P1420).
Kaldari (talk) 16:23, 28 June 2020 (UTC)
We need consistency, instead of different ways to achieve the same thing. --GZWDer (talk) 16:28, 28 June 2020 (UTC)
If there was a way to have properties that could be either strings or items, I would agree with you. Kaldari (talk) 16:30, 28 June 2020 (UTC)
  Comment There seems to be a classic conflict here: does Wikidata capture as much information as it can without trying to model everything in detail (in the hopes that it can be sorted out later), or does it postpone adding data that hasn't (yet) been modelled? Individuals seem divided on this topic; the question is whether both approaches can exist side by side. The author name string (P2093) example works well because (a) it's clear that a huge amount of bibliographic data couldn't be added to Wikidata if we insisted on having authors as items, and (b) there are people and bots mapping author strings to author items. Having the string data in the first place makes this mapping possible, and doesn't seem to negatively impact those who would prefer to have authors as items. Perhaps this proposal could be strengthened by showing how we could move from synonyms as strings to synonyms as items? That way, those who prefer items could see a path to that goal, and hence we could develop bots to make that transition as and when we have information and/or time. I think most would agree that the modelling of taxonomic names and taxa in Wikidata is, at best, open to improvement, but how it is to be improved seems unclear. It is tempting to argue that we shouldn't mess with taxonomic properties until that model is clarified, but that will be enormously frustrating to those who "just want to get things done." --Rdmpage (talk) 17:10, 28 June 2020 (UTC)
  Comment I very much support Rod's pragmatic approach and, indeed, an expansion of Kaldari's original proposal to explain the mechanics of supporting a transition from string to item (when appropriate) would be a great addition. In my opinion, having followed the usage of author-related properties for a while, this incremental approach is solid. I'd be curious to hear from @Magnus Manske: the success of the author string vs item resolution depends to a large extent on the tools he created. --DarTar (talk) 18:22, 28 June 2020 (UTC)
@Rdmpage, DarTar: I don't think Wikidata claims always have to "progress" from strings to items. There are cases where it makes sense for an author to be a string forever (a minor author in a minor academic paper who has no other citations). Similarly, there are numerous cases (the majority, IMO) where it makes sense for a taxonomic synonym to be just a string (with qualifiers). The mussel species Anodonta cygnea has 568 synonyms. Creating separate items for each one (most of which only occur within a single paper), would be pointless and create a maintenance nightmare. I've now outlined above how I think this property should be used in relation to taxon synonym (P1420). Let me know if you think that makes sense. Kaldari (talk) 20:47, 28 June 2020 (UTC)
So adding 568 (qualified and referenced) synonym strings to Swan mussel (Q777865) will not „create a maintenance nightmare“? --Succu (talk) 21:12, 28 June 2020 (UTC)
Creating items for fake species does more than create "a maintenance nightmare": it makes Wikidata a serious threat to information on taxa, on a global scale. - Brya (talk) 03:53, 29 June 2020 (UTC)
@Succu: No more so than what Wikipedia already deals with. If, however, we had 568 items for all the synonyms (the current method), every time the accepted scientific name changed, we would have to update 568 items (rather than just updating one item). Anodonta cygnea may not be the best example, as there doesn't seem to be an item for the taxon name Anodonta cygnea specifically, but Phidippus audax (Q134277), for example, has several dozen synonyms. Most of them are listed in Wikipedia, but none are listed in Wikidata. Is there a downside to allowing string synonyms? I realize that it doesn't allow quite as much data to be associated with the synonym, but having name, author, and date is a lot better than having nothing, which is what we have now and it avoids the problems associated with having a proliferation of synonym items. I hope you will consider this idea with an open mind as your opinion carries a lot of weight here. Kaldari (talk) 14:15, 29 June 2020 (UTC)
I also am in agreement with Rod (@Rdmpage:) here. In particular, I do not think the comparison between authors and taxon names is appropriate. Large numbers of existing synonyms are important names and carry the potential of future usage; as such, I think these should be items, not strings. I accept there are some (objective synonyms are an example) that clearly will never be used. But I do think the aim should be to create items for these. I also agree that the entire struct of the taxonomic data needs a significant amount of scrutiny and redesign. It may be, as Rod says, a better use of people's time to first get the overall structure into the most useful form. Cheers Scott Thomson (Faendalimas) talk 01:50, 9 July 2020 (UTC)
  •   Question Is there a consensus that the current handling of synonyms in Wikidata should change to string-based solutions? Vojtěch Dostál (talk) 07:38, 29 June 2020 (UTC)
    • @Vojtěch Dostál: No, but there is also no consensus that the handling of synonyms should be item-based (see the original taxon synonym (P1420) proposal and various discussions on that property's talk page). By having both options we can flexibly accommodate either method when it makes sense to. I'm open to tweaking the guidelines about when to use either, but it's clear that the current item-only method isn't adequate. Kaldari (talk) 15:18, 29 June 2020 (UTC)
      • To me it's not that clear. I am afraid of breaking the status quo which has its merits, and creating two parallel taxonomic systems in Wikidata which will not communicate with each other. @Kaldari: I have one more question if you don't mind: How will a bot which imports from Wikispecies know when to use items and when strings? Clearly, some synonyms deserve an item of their own and classifying them as mere strings attached to a different taxon is an over-simplification of the taxonomic reality. Vojtěch Dostál (talk) 15:27, 29 June 2020 (UTC)
        • @Vojtěch Dostál: As suggested by Rod and Dario's discussion above, the default would be to initially bot-create the synonyms as strings under the item linked from Wikispecies. Any synonyms that merited their own items could then be changed into taxon synonym (P1420) claims as needed by humans. I expect that that would be the exception rather than the rule, however, and most synonyms would remain as strings with qualifiers. I see the two properties as being complementary, not separate and parallel, in the same way that author (P50) and author name string (P2093) have allowed us to have more comprehensive author data than we would have with just author (P50) alone. Kaldari (talk) 15:40, 29 June 2020 (UTC)
  •   Question what happens if a name currently in "taxon name" becomes a synonym? Will you repurpose the item and change taxon name, apply the string? --- Jura 08:52, 29 June 2020 (UTC)
    • @Jura1: This should complement the use of taxon name (P225) well (which is also a string property). In the event that the taxon name becomes a synonym, the values and qualifiers under taxon name would be moved to a "taxon synonym string" claim under the same item and a new taxon name (P225) claim would be created. As you suggest, the item would simply be repurposed (or merged with another item) and no new item would need to be created. This also saves us from having to move all the sitelinks (or split them up based on which ones move to the new name). It should be a much easier process and much more complementary to our sister projects. Kaldari (talk) 15:04, 29 June 2020 (UTC)
      • It was a question, not a suggestion on my side. I don't think items should be repurposed in general. Your response suggests that you want to move from a taxon name based system. If you merely want to improve enwiki's handling of sitelinks, you could opt for Commons' approach. --- Jura 15:10, 29 June 2020 (UTC)
        • @Jura1: By repurposed, I simply meant the same item would receive the new taxon name (if the name of that taxon was changed). The item itself would still represent the exact same taxon. In other words, taxon (Q16521) items would actually represent taxons, rather than our current awkward system of taxon items sometimes representing just taxon names and sometimes representing both taxons and taxon names. Does that make sense? Kaldari (talk) 15:23, 29 June 2020 (UTC)
          • I think I understand the two approaches. Maybe it's important to note that there is a key difference to the "author name"/"author" pair of properties you mentioned: "author name" is used for bulk imports because insufficient information is available to create an item for the author, and large-scale creations for these do happen once some identifier is available. For synonyms, I think one can't really create the string version without having sufficient information to create separate items as well. Also, the renaming you outlined illustrates the uncertainty introduced by no longer providing a stable identifier (QID) for taxon names. --- Jura 07:23, 30 June 2020 (UTC)
            • There may be enough information to create an item, but 1) many synonyms are not notable enough to have an item and 2) in the current model creating an item would promote the synonym into a species (a taxon), even when it cannot be one, by definition. Another fake species will have been created. - Brya (talk) 05:23, 1 July 2020 (UTC)
              • I don't agree about the amalgamation you are doing, but the proposal doesn't change anything about (2). As for (1), it seems odd that that should depend on the existence of a sitelink. --- Jura 06:02, 1 July 2020 (UTC)
                • As for 2), with the proposed property it would be possible to add synonyms as strings, without implying in any way that they are, or represent, actual species, just like in any taxonomic paper. Creating an item for a synonym unambiguously states ("instance of: taxon") that they are species, even though it is possible to add a qualifier "this is a fake species" (there is a large variety of such qualifiers): the latter will confuse just about any user. - Brya (talk) 05:06, 2 July 2020 (UTC)
                  • @Jura1: I removed the sitelink criterion. To address your other concern, I think the uncertainty introduced by no longer providing a stable identifier (QID) for taxon names is far less significant than the confusion currently caused by having to have a separate item for every taxon synonym (even synonyms which are simply misspellings or taxonomic vandalism). If you have any suggestions for improving the proposal, I'm open to ideas. Kaldari (talk) 00:55, 9 July 2020 (UTC)
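
The renaming step discussed in this sub-thread (the old taxon name (P225) value and its qualifiers move to a "taxon synonym string" claim on the same item, and a new P225 claim is created) can be sketched as follows. This is a minimal illustration, not Wikibase's actual JSON format: the simplified dictionary layout, the `rename_taxon` helper, and the placeholder key for the proposed property are all assumptions made for this sketch.

```python
from copy import deepcopy

# Hypothetical placeholder key: the proposed property has no ID yet.
SYNONYM_STRING = "taxon synonym string (proposed)"


def rename_taxon(item: dict, new_name: str, new_qualifiers: dict) -> dict:
    """Return a copy of `item` whose old P225 claim has become a synonym string.

    `item` uses a simplified, made-up layout:
    {"id": ..., "claims": {"P225": {"value": ..., "qualifiers": {...}}}}
    """
    updated = deepcopy(item)  # leave the caller's data untouched
    old_claim = updated["claims"].pop("P225")
    # The old name keeps its qualifiers (e.g. author, year) as a synonym string.
    updated["claims"].setdefault(SYNONYM_STRING, []).append(old_claim)
    # A new taxon name claim is created on the same item; the item ID is unchanged.
    updated["claims"]["P225"] = {"value": new_name, "qualifiers": new_qualifiers}
    return updated
```

In this model the item ID stays attached to the taxon while the name changes, which is the behaviour Kaldari describes and also the loss of a stable per-name identifier that Jura objects to.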
  •   Support The proof of the pudding is in the eating. I don't see the apocalyptic scenarios described above. Wikidata is a linked-data store and core to linked data is the possibility to reroute links. SPARQL even has the CONSTRUCT query that enables this. So even if this proposal becomes obsolete because all taxa are perfectly defined in their proper Wikidata items, there is really no long-term risk in accepting this proposal, which would enable quite some use cases. Worst case scenario is that the need for this property becomes obsolete, in which case all we have to do is a simple bot operation that removes these deprecated statements, for which Wikidata even has invented ranks. --Andrawaag (talk) 14:55, 29 June 2020 (UTC)
  •   Support Taxon synonyms refer to the same entity and thus modeling it this way is better than creating a new item. ChristianKl❫ 17:03, 29 June 2020 (UTC)
  •   Support With the understanding that (as proposed) this will exist besides P1420, and that most synonyms won't ever be transferred to P1420, although some may be (if notable enough). - Brya (talk) 04:18, 30 June 2020 (UTC)
  • For the moment   Oppose. Some reasons:
    1. taxon name (P225) is not intended to reflect the taxon concept currently in use. P225 should only be changed in very rare cases. A lot of other items and properties rely on it. In fact, you'll get a warning if you try to change the value of P225.
    2. I missed the word "reference" in your explanation. Subjective/heterotypic synonyms reflect a taxonomic opinion of someone. Wikimedia projects are not a reference at all. In your example Phidippus audax (Q134277), enWP has no reference for the listed synonyms. World Spider Catalog is a good reference for species etc. considered as synonyms. But you should always cite the version or use retrieved (P813).
    3. Your "synonym strings" are dumb. They show no relationship to each other (e.g. based on the same type). Should they be ordered somehow (e.g. by date)?
    4. If you consider your "synonym strings" as an analogy to taxon name (P225) (including all allowed qualifiers) and you have all this information at hand, why not create an item? It's cheap and supported by tools like QS.
    5. In my opinion all synonyms provided by reliable sources are "notable enough" to get an item. Of course not all of these items are allowed to label a taxon concept.
  • --Succu (talk)
    •   Comment Can we please refrain from calling others' contributions "dumb" and be more convincing and less judgemental?
      • @Andrawaag: I read "dumb" to mean the string was "dumb", not that the contribution was "dumb". In other words, the synonym string is just a set of text characters, not a thing (item) linked to other useful information.
    • Why is notability a criterion in selecting the value type of a property? In that case each InChI (P234) and InChIKey (P235) (just two random examples) should point to individual items too. Creating items is not as cheap as suggested. Yes, you can do that with QS or through the Wikidata API, but a lot of contributions are also coming from single users adding single statements. Then the workflow is suddenly frustrating. My approach in those cases (and I know how to write bots) is to open a second tab, create the new Wikidata item, select the newly created and populated Wikidata QID, and paste that in the statements of which the property requires an item. This workflow is just not user-friendly, and yet it is easier than achieving the same in QS. --Andrawaag (talk) 08:53, 1 July 2020 (UTC)
      • If you want to add a new monotypic species you'll of course have to add the new genus first. This is not a flaw in usability. In a database this is a normal workflow, such as adding a new customer before creating an invoice for the new customer. If you are creating the invoice with a text editor then this restriction will vanish at the cost of structure. BTW: here you can omit adding parent taxon (P171) and hope the constraint will be fixed by someone else. A common pattern... --Succu (talk)
  • It's not clear to me that the world fits any model we make of it in this regard. That, I suppose, is a support for the "mixed model" Kaldari proposes. It would be a good idea to work out what we do with synonyms if two species (say) become one, or one becomes two. Maybe this is an established concept in taxonomy. Also do we label synonyms with nomen erratum etc? How do we deal with updated or contradictory synonymy lists?
I hope your experience with author strings to authors is better than Wikipedia's. We have many citations where we would be so much better off with a simple "authors=" parameter, rather than a list of first and last names which doesn't handle all the oddities of human naming. All the best: Rich Farmbrough 16:26, 7 July 2020 (UTC).
@User:Rich Farmbrough: lumpers and splitters (Q1662868) is an old taxonomic "problem". So what do Wikipedias do if
  1. two species become one (lumped / merged)
  2. one becomes two (split; Crocodylus halli (Q68594258) is a recent example)
in regard of the related sitelinks? --Succu (talk) 19:06, 7 July 2020 (UTC)
We have a lesser problem: we can have an article about some or many species, and we can have multiple articles about one species. We can have redirects where we need them, and change these things on a far more ad-hoc basis than Wikidata. That's not to say we always succeed; I'm not highly involved in that area. All the best: Rich Farmbrough 00:23, 19 July 2020 (UTC).
  •   Oppose for the present. I have sympathy for Kaldari's frustration with the current system, which I share. But the problem is a deep one, and I'm not convinced that this patch is, overall, an improvement.
    1. The deep problem is that Wikidata does not model taxa as opposed to taxon names. Numerous discussions have resulted in the conclusion that we don't know how to model taxa and taxon names. (My summary, and it is only a summary, is here if you aren't familiar with these discussions.) Rich Farmbrough's comment It's not clear to me that the world fits any model we make of it seems very apposite to me.
    2. Wikidata taxon items are used to retrieve taxon identifiers, e.g. in the English Wikipedia's taxonbars. So any synonym that has an identifier in a taxonomic database needs to be an item, not a string. However, taxonomic databases have very different policies. The World Spider Catalog (WSC) does not have identifiers for taxon names, but for taxa. If the name it accepts for the taxon changes, the identifier remains the same. Treating synonyms as strings works well for data extracted from the WSC only if the taxon name of the item can be changed, i.e. the item really does represent a taxon, not a taxon name. GBIF has identifiers for all the taxon names it lists, including somewhat oddly both "Araneus diadematus Clerck, 1758" and "Araneus diadematus (Clerck, 1758)". So to retrieve GBIF's identifiers, the synonyms need to be treated as items.
I do take the point about the best being the enemy of the good, and we should strive to improve the modelling of taxa and taxon names in Wikidata regardless of the fact that there may not even be a way to do so properly. However, I'm not (yet) convinced this proposal alone is an improvement. The present system, which muddles taxa and taxon names, more-or-less works. Peter coxhead (talk) 06:42, 9 July 2020 (UTC)
@Peter coxhead: Regarding #2, that is why I am proposing a hybrid system rather than just changing taxon synonym (P1420) to a string property. In cases where there are legitimate reasons to have separate items (and separate Wikipedia taxonbars), we could accommodate that. Wikipedia, however, generally follows the 1 species = 1 article model, so this would actually make Wikidata more compatible with Wikipedia, not less, as it's more likely that relevant data for a species won't be split across multiple items. The downside of not including every database identifier for every synonym is a very minor downside, in my opinion, and I see very little practical impact from it. Kaldari (talk) 13:48, 9 July 2020 (UTC)
@Peter coxhead: Any thoughts regarding my response above? I would also like to mention that many database identifiers for synonyms just go to useless pages which have no more information than author and publication year, thus providing no additional information.[1] Are there specific cases you are thinking about where losing a database identifier for a synonym (that doesn't meet one of the criteria for creating a separate item) would actually result in some detriment to a Wikidata user or re-user? FWIW, I don't know of any Wikidata re-uses that are utilizing database identifiers for synonyms, so this seems like more of a theoretical concern than a practical problem. Kaldari (talk) 17:53, 16 July 2020 (UTC)
@Kaldari: sorry to be slow, I don't monitor Wikidata regularly. Databases differ significantly in what they do about synonyms, which complicates the issue.
  • The World Spider Catalog's identifier is for the taxon; if the name its compilers accept changes, the ID goes to the new name, and the old one is just listed as a synonym. So for this database, nothing is gained by having taxon names as items. Strings would be fine.
  • Tropicos (a plant taxonomic database) records names for those taxa it covers, listing for each any other sources that accepted those names. Its identifiers for synonymous taxon names are useful, because the entries contain useful information.
  • Many sources linked by taxon name IDs are not regularly updated, if at all; for example for plants the online Flora of China and the online Flora of North America. So they will often be using names no longer accepted in recent sources. Their entries can only be found under IDs for synonyms.
I don't know of any Wikidata re-uses that are utilizing database identifiers for synonyms – look at the taxonbar at the bottom of en:Reynoutria japonica. Calflora and FNA (Flora of North America) are useful sources, but they both use the synonym Fallopia japonica. If Fallopia japonica (Q899672) were not an item, we wouldn't have these IDs and hence links. Peter coxhead (talk) 06:35, 24 July 2020 (UTC)
The muddle between taxons and taxon names is one of the remaining places where names get confused with the concepts towards which the name points. Having fewer of those cases, and only those cases which are needed for Wikipedia compatibility, is desirable.
The present system also has a problem when linking a taxon from within Wikidata with properties such as instance of (P31) or found in taxon (P703). Whenever we have synonyms, you would have to link to both the taxon and its synonym if you want to log all the relevant relations. ChristianKl❫ 20:06, 9 July 2020 (UTC)
Shouldn't found in taxon (P703) better be linked to a circumscription (Q5121761)? --Succu (talk) 20:26, 9 July 2020 (UTC)
  •   Support in the absence of any major fixes to the struct here this is a needed solution for now. Cheers Scott Thomson (Faendalimas) talk 21:00, 10 July 2020 (UTC)
  • Comment

OK, for the sake of having somewhere to say this: I have been thinking about what @Rdmpage: and @Peter coxhead: have said, both here and elsewhere. I do not mean to hijack this, and this is not so much an issue with this proposal as with how Wikidata does taxonomy in general. As I and others have said above, the whole methodology here needs rebuilding: your struct. It's almost like you're trying to be a hybrid of a database and an encyclopedia. Wikidata is a database; you should treat it as one.

  1. change the name of "instance of taxon" to "instance of taxon name"; this removes this confusion. Both Rod and Peter have discussed this issue and I agree with them. Then it is clear you are databasing all the available names for life.
  2. all names must be items; start with the names in current usage but add in synonyms as you can.
  3. all names are listed under their original combinations as the mainspace name. Your first statement will be the instance of taxon name; your second will be the original reference used in the circumscription of the name.
  4. have statements for the type data and the type locality.
  5. a statement declaring the current nomenclatural status of the name, e.g. available, valid, homonym, etc., whatever is relevant
  6. then have a statement with the currently used combination, with the reference that made this change. Your parent identifier works for this.
  7. then a statement with a list of its known synonyms
  8. then put in the extras like images etc.
  9. then your various identifiers such as ITIS, and various checklists that use it, aggregators, whatever.

There may be others that can be justified, but these are the ones you need. Then you are databasing the information that is absolutely needed; the data can then be used elsewhere throughout Wikimedia and possibly further afield. You must include all parents, so you need items for any proposed ranks, including subgenera, subspecies, subfamilies etc. This is very hit and miss at present. Now whether you put species in genera or species in subgenera and subgenera in genera I could not care. But you need to have a way of acknowledging the existence of the subgroups and having them queried in a way that the result places them in these ranks if that's what the user wants. Databases serve: they provide information for other applications. Some of you may find this lecture on Wikiversity I created useful [2]. Cheers Scott Thomson (Faendalimas) talk 10:31, 10 July 2020 (UTC)
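
For readers who find a record layout easier to follow than a list, the nine-point outline above could be sketched roughly as below. This is one possible reading, not a worked-out model: the class name `TaxonNameItem`, every field name, and the `display_name` helper are hypothetical and chosen only to mirror the numbered points.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaxonNameItem:
    """One item per name, listed under its original combination (points 1-3)."""
    original_combination: str                    # point 3: mainspace name
    original_reference: str                      # point 3: reference for the name
    nomenclatural_status: str                    # point 5: e.g. available, valid, homonym
    type_data: Optional[str] = None              # point 4
    type_locality: Optional[str] = None          # point 4
    current_combination: Optional[str] = None    # point 6
    current_combination_reference: Optional[str] = None  # point 6
    synonyms: List[str] = field(default_factory=list)    # point 7: known synonyms
    media: List[str] = field(default_factory=list)       # point 8: images etc.
    identifiers: Dict[str, str] = field(default_factory=dict)  # point 9: e.g. ITIS

    def display_name(self) -> str:
        """Prefer the currently used combination, else the original one."""
        return self.current_combination or self.original_combination
```

Whether `synonyms` should hold item IDs or plain strings is exactly the point debated in the replies that follow, so the sketch leaves it as a list of opaque strings.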

Does "all names are listed under their original combinations as the mainspace name." mean what I think it means? That is, have items only for original combinations, and add all subsequent combinations (as strings) to that item?
        Upon first glance this proposal seems to maintain all current problems and add new ones; it would lose information on a grand scale. - Brya (talk) 11:00, 10 July 2020 (UTC)
        P.S. The Wikiversity page should make clear that it does not apply to Nomenclature and Taxonomy in general, but to Zoological Nomenclature and Zoological Taxonomy only. - Brya (talk) 11:00, 10 July 2020 (UTC)
Yes, I do think you're better off having all names under the original combination; however, I also said all names should be items, not strings. Strings would cost you a lot of data; I would not recommend that. I think this could be done without data loss: as I said, I did not suggest strings, and I did say to add in everything else you have, e.g. images. I would not refer to this as a proposal; it would need a lot of fleshing out and discussion first. At present I am just thinking out loud. What is here so far may not solve every problem. Of course it adds work; restructuring a database always does, so if that's the added problem, sure. I do not for one minute think it would be easy, but to discuss it further it should be done elsewhere. Since both the title and subtitle of the Wikiversity article state that it uses turtles as a case study, I figured the zoology part was clear. Cheers Scott Thomson (Faendalimas) talk 11:47, 10 July 2020 (UTC)
I am not sure if I understand how that would be a change in this respect from the current situation. All names are already linked to the original combination (where this applies), so the most this would mean would be a property that is the reverse of "original combination" (like a property "subsequent combination")?
        From the above I got the impression you wanted to gather all information in the item on the original combination. Logically, this would mean adding names (subsequent combinations) as strings. - Brya (talk) 16:39, 10 July 2020 (UTC)
@Faendalimas: I proposed basically the same thing two years ago and it went nowhere. Your proposal retains many of the problems of the current system and would be a maintenance nightmare. Every time a species was renamed, all of its synonyms would have to be updated. I'm tired of us going around and around in circles on this. I just want an easy way to add more synonym data to Wikidata without causing any other problems. Nothing that I'm proposing makes other more comprehensive solutions impossible in the future. Can't we just take a small step towards improving the current situation rather than endlessly re-hashing the entire taxonomy system (which always leads nowhere)? Please consider the merits of this modest proposal and whether or not it would actually lead to more useful data in Wikidata (which I think it would). Sure it's not perfect, but it's an improvement, and it doesn't make the existing situation any worse. Kaldari (talk) 20:10, 10 July 2020 (UTC)
Sorry, it was not my wish to hijack this. I saw it as an opportunity to highlight some of the issues of what is clearly a hybrid database. Anyway, above I have supported your proposal. Cheers Scott Thomson (Faendalimas) talk 21:00, 10 July 2020 (UTC)
Faendalimas, essentially, you proposed the status quo. It's clear that it may appear difficult to create new items for users who never did that, but I don't see how this new property would help users get started with Wikidata. That a few (automated) maintenance steps could (or should) be added isn't really an argument in favor of this property proposal. --- Jura 21:17, 10 July 2020 (UTC)
Jura, essentially I agree with you; I do not think anything I commented was in favor of this. But in the absence of a full redesign this will allow data to be added. I think people should be encouraged to make items instead of strings. However, in the absence of items at least something is there. I have supported it by concession, not because I think it's perfect. Cheers Scott Thomson (Faendalimas) talk 22:03, 10 July 2020 (UTC)
@Faendalimas: what you don't seem to be taking into account is that Wikidata needs to be able to model its data sources, and in particular their IDs (especially for the English Wikipedia, whose taxonbar uses IDs as entries to those sources). Repeating what I wrote above, for some sources, like the World Spider Catalog, the ID is for the taxon: if the accepted name changes, the entry has a new name but the same ID. Strings will work fine. In other sources, like Tropicos, IDs are for names, and there is useful information under the name. For these cases, we need each synonym to have a link to an ID. Strings won't work for this. Peter coxhead (talk) 06:35, 24 July 2020 (UTC)
@Peter coxhead: Couldn't we just add the taxon name IDs as qualifiers to the synonym string property? We would just need to change their property scope constraint, but that's easy. Then no data would be lost. Kaldari (talk) 15:11, 21 August 2020 (UTC)

  Comment: As I see it, a prime problem is what is intended by "name". In the real world this is governed by the various Codes of nomenclature. These have ordered names into the following categories:

  1. not-formal names (not available, not validly published)
  2. formal names, that however may not ever be used as the correct name of a taxon (dead names, objectively invalid names, permanently invalid names) and
  3. formal names, that may be used as the correct name of a taxon, depending on taxonomy.

From the last category any taxonomist will select a subset that he accepts as correct names, but different taxonomists can select different subsets.

There are the following basic models that could be adopted:

  1. Single-Point-of-View: create a taxonomic point of view which adopts a set of correct names, and treats all other names as synonyms (without making a distinction between categories). Much of the taxonomic literature adopts this format, as does Wikispecies and, I guess, Commons. Pretty much by definition, it would only be compatible with itself, and not with different Wikipedias using different sets of correct names. A further problem with this SPoV would be that it would violate the NPoV and NOR policies of Wikipedias.
  2. Hybrid model: every name that potentially can be used as the correct name of a taxon can have its own item. By adding references it is possible to indicate which names are accepted by which taxonomist. Each of these items has information about the name (nomenclatural information) and the taxon (for which it has been used as the correct name). This is what happens in the real world, and to such an extent that many in the real world cannot distinguish between the correct name of a taxon and the taxon. This hybrid model can only work if a parallel structure is set up for names in category 2 (formal names that are dead, objectively invalid names, permanently invalid names) which interacts with the structure of potentially correct names.
  3. Names only. Items are for names only, containing nomenclatural information only (of all three categories). This will require setting up a parallel structure for taxa, with items on taxonomic information, and information about taxa. Only this parallel structure may serve as the interface between Wikipedias and the name-only items.

The problem is that different users prefer different models but without accepting the consequences of their chosen model, and just import data which they alter by where they put it. - Brya (talk) 05:27, 11 July 2020 (UTC)

There is another option by using TNUs but it requires also a lot of good design. Happy to explain it but I think I have taken up enough of this proposal with what is an issue to the side, I feel I am being unfair to the proposal. Set up a place to discuss the Wikidata taxonomy model and I am happy to provide input. Cheers Scott Thomson (Faendalimas) talk 06:41, 11 July 2020 (UTC)
@Faendalimas: https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Taxonomy would be a good place to talk about the Wikidata taxonomy model. ChristianKl❫ 09:01, 11 July 2020 (UTC)
  •   Comment Given that this discussion is rumbling on without obvious resolution, and you will find similar, seemingly endless discussions involving people paid to think about taxonomy, might a pragmatic (and hence admittedly unsatisfactory) solution be to (a) use "also known as" as the place for synonyms as strings, and (b) if you want to specify that a name is a synonym then create an item for that name and connect it to the taxon of interest using taxon synonym (P1420). For example, Anolis cristifer (Q5315308) has no taxon synonym (P1420) but there are synonyms in "also known as" (Query). Now, "also known as" can include all manner of things, such as common names, etc., and so isn't as precise as taxon synonym (P1420), but if the goal is to help discoverability then having a set of alternative labels for something is all you need. One could argue that if you really want to distinguish something as a taxonomic synonym (rather than just an alternative label), then you really should go to the trouble of spelling that out via taxon synonym (P1420). This seems a halfway approach that acknowledges that there is not universal agreement on the best way to model taxonomy in Wikidata, nor what the current model actually is, but still lets people do things without waiting for a resolution of the modelling issue. --Rdmpage (talk) 10:57, 16 July 2020 (UTC)
  •   Support Good opening arguments. If the community desires standing items for synonyms, IMHO they would better fit the Lexeme framework and get an L-id instead of a Q-id. @Rdmpage:'s suggestion also seems like a good compromise. --TiagoLubiana (talk) 16:31, 8 November 2020 (UTC)
  •   Oppose I don't think creating this is compatible with the current approach. --- Jura 09:14, 14 November 2020 (UTC)

Nomenclature de tous les noms de roses connus ID

   Under discussion
Description: identifier of a rose cultivar in "Nomenclature de tous les noms de roses, connus, avec indication de leur race, obtenteur, année de production, couleur et synonymes", 2nd edition, by Léon Simon and Pierre Cochet
Represents: Nomenclature de tous les noms de roses (Q96643524)
Data type: External identifier
Domain: rose cultivar (Q26817508)
Allowed values: ([1-9]\d{0,3}|1[01]\d\d\d)[a-z]?
Example 1: Rosa 'Belle Poitevine' (Q60965265) → 1409
Example 2: Rosa 'Belle Biblis' (Q83673639) → 1226
Example 3: Rosa 'Belle Herminie' (Q83673647) → 1340
Example 4: Rosa 'Belle Isis' (Q60964253) → 1355
Example 5: Rosa 'James Bougault' (Q64030579) → 4935
Example 6: Rosa 'James Mitchell' (Q83668435) → 4938
Example 7: Rosa 'James Veitch' (Q83672920) → 4941
Example 8: Rosa 'Jeanne de Montfort' (Q83660038) → 5027
Example 9: Rosa 'Jeanne Masson' (Q83673867) → 5044
Example 10: Rosa 'Comte Adrien de Germiny' (Q60965024) → 2388
Example 11: Rosa 'Comte de Chambord' (Q60964318) → 2401
Example 12: Rosa 'Comte de Montalivet' (Q96277233) → 2415
Example 13: Rosa 'Mademoiselle Claudine Perreau' (Q83672486) → 7181
Example 14: Rosa 'Mademoiselle Christine de Nouë' (Q83671406) → 7174
Example 15: Rosa 'Mademoiselle Blanche Lafitte' (Q60964631) → 7164
Example 16: Rosa 'Princesse Louise' (Q83674468) → 9105
Example 17: Rosa 'Adrienne de Cardoville' (Q83673585) → 288
Source: Nomenclature de tous les noms de roses (Q96643524) pp. 1-170
Planned use: add to some existing ones [3], create new ones
Number of IDs in source: >10000
Expected completeness: always incomplete (Q21873886)
See also: Les Roses cultivées à L'Haÿ en 1902 ID (P8662)

Motivation

Provides a simple way to cross-reference our items about cultivars with Nomenclature de tous les noms de roses (Q96643524). --- Jura 12:08, 18 July 2020 (UTC)

The identifier used in the work is unique. It is generally numeric (<12000), sometimes followed by a letter (a-z). I added that to the regex above. A formatter url isn't available, but this isn't a requirement for external-id properties. --- Jura 14:32, 20 August 2020 (UTC)
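As a rough illustration of the format described above, the following Python sketch checks candidate values against the proposed pattern. It assumes the trailing letter is optional ("sometimes followed by a letter"); the function name is hypothetical, and the lettered value "18a" is an invented test input, not an entry taken from the book.

```python
import re

# Pattern from the proposal; the trailing letter is treated as optional,
# following the description "generally numeric (<12000), sometimes
# followed by a letter (a-z)".
ROSE_ID = re.compile(r"([1-9]\d{0,3}|1[01]\d\d\d)[a-z]?")


def is_valid_rose_id(value: str) -> bool:
    """Return True if the whole string matches the proposed format."""
    return ROSE_ID.fullmatch(value) is not None


# Values from the proposal's examples.
for value in ["1409", "288", "9105"]:
    print(value, is_valid_rose_id(value))

# Invented inputs: a lettered value, and two values outside the range.
for value in ["18a", "12000", "0"]:
    print(value, is_valid_rose_id(value))
```

Using fullmatch rather than search matters here: a partial match would accept strings such as "12000", whose first four digits look like a valid number.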

99of9
Achim Raschka (talk)
Andrawaag (talk)
Brya (talk)
CanadianCodhead (talk)
Canley
Circeus
Dan Koehl (talk)
Daniel Mietchen (talk)
Enwebb
Faendalimas
FelixReimann (talk)
Hyperik (talk)
Infomuse (talk)
Infovarius (talk)
Jean-Marc Vanel
Joel Sachs
Klortho (talk)
Lymantria (talk)
Magnefl (talk)
MPF
Manojk
MargaretRDonald
Mellis (talk)
Michael Goodyear
Mr. Fulano (talk)
Nis Jørgensen
Oronsay
PEAK99
Peter Coxhead
PhiLiP
Andy Mabbett (talk)
Plantdrew
Prot D
pvmoutside
RaboKarbakian
Rod Page
Strobilomyces (talk)
Stuchka (talk)
Succu (talk)
TiagoLubiana (talk)
Tinm
Tom.Reding
TomT0m
Tommy Kronkvist (talk)
Tris T7 TT me
Tubezlob
William Avery
Minorax
Culex
Koala0090
Mike Krüger
Friesen5000
Salgo60
TED
GoEThe (talk)
Estopedist1
Leptospira
  Notified participants of WikiProject Taxonomy --- Jura 12:08, 18 July 2020 (UTC)

M.A.Miron Vincnet41 Tubezlob Prime Lemur Tris T7 TT me Infomuse TED   Notified participants of WikiProject Botany --- Jura 08:52, 19 August 2020 (UTC)

Discussion

  • this numbers? --Succu (talk) 20:15, 18 July 2020 (UTC)
  •   Support   Neutral Well founded. Nothing else to add. As Succu pointed out, Catalog and Catalog number might be enough for this need. M.A. Miron (📬) 00:20, 20 August 2020 (UTC)
    • Personally, I find a distinct property makes checks on old roses easier. I'd try to do more than copy the same label to countless languages separately. --- Jura 09:04, 22 August 2020 (UTC)
  •   Oppose Use stated in (P248)=Nomenclature de tous les noms de roses (Q96643524) and catalog code (P528), page(s) (P304). --Succu (talk) 06:41, 20 August 2020 (UTC)
    • The primary use of this property would be for main statements (see domain/samples). stated in (P248) can't be used for that. (I don't think the page number is important once we have the id.) The idea is to include Nomenclature de tous les noms de roses (Q96643524) like a database, version 1906. --- Jura 12:41, 20 August 2020 (UTC)
      • Could you please give examples where this number is cited? --Succu (talk) 16:36, 20 August 2020 (UTC)
        • French Wikipedia uses the work as a reference, but cites with page numbers rather than the id. --- Jura 16:41, 20 August 2020 (UTC)
          • So I guess this number is not really notable. stated in (P248) for taxon name (P225) as suggested should work. --Succu (talk) 20:56, 20 August 2020 (UTC)
            • If the names were unique, one might do that. However they are not. I think French Wikipedia just didn't get to the level where it matters.
              Also, we would have a harder time trying to figure out which ones are missing.
              In any case, while it could also be used as a reference, the primary use would be for main statements with a unique value constraint.
              BTW, As there are identifiers with a-z after the number, I suspect the identifiers are stable between the first and the second edition, but I haven't found a copy of the first edition. --- Jura 04:45, 21 August 2020 (UTC)
            • I see the point made by Succu; catalogue code does the trick (example done for Rosa 'Belle Poitevine' (Q60965265)) M.A. Miron (📬) 04:35, 21 August 2020 (UTC)
              • @Marc André Miron: yes, that could be another option. It would be equivalent except for the unique value constraint. Also one would have to add slightly more (i.e. the qualifier). --- Jura 04:45, 21 August 2020 (UTC)
                • @Jura1: About the unique value constraint: I noticed that in the book, some species seem to have multiple entries, say for example Rose à coeur jaune that has id 18 and 19. Can't we create two catalogue codes pointing to the same catalogue for any Wikidata item? M.A. Miron (📬) 05:09, 21 August 2020 (UTC)
                  • @Marc André Miron: they might be two different cultivars (see "obtenteurs", also "année" on one), so I'd create two items. If they were the same, yes, the two identifiers would be on the same item. The unique value constraint mainly avoids/detects that e.g. "18" appears on two separate items. --- Jura 05:23, 21 August 2020 (UTC)
                  • @Marc André Miron: Les Roses cultivées à L'Haÿ en 1902 ID (P8662) can show how it works out. --- Jura 07:28, 14 October 2020 (UTC)
                    • It didn't work at all. A random example: Rosa alba (Q478530) (=300) has two numbers 137 and 3300. --Succu (talk) 19:07, 22 October 2020 (UTC)
                      • Good pick. Fixed that one: there are indeed two for this: I think it means they planted the same at two places: once in the collection botanique and once in the collection horticole. Cultivars should have unique numbers in Hay's. --- Jura 19:22, 22 October 2020 (UTC)
                        • The point is that these numbers are not intended to be an ID (or a catalogue number). --Succu (talk) 19:56, 22 October 2020 (UTC)
                        • How would you qualify Hay's? --- Jura 21:02, 22 October 2020 (UTC)
                          • Qualify? It's simply a reference (mentioned in). --Succu (talk) 20:58, 23 October 2020 (UTC)
                            • I don't quite see the difference, but Hay's is probably better discussed elsewhere. --- Jura 09:43, 24 October 2020 (UTC)
  •   Support I find catalog codes less useful than dedicated properties. Mix'n'match is harder, constraints are harder, queries are harder.... So when there are enough items, I support making a property. --99of9 (talk) 01:22, 28 September 2020 (UTC)
  •   Support --Tinker Bell 17:59, 22 October 2020 (UTC)
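The unique value constraint discussed in this thread can be sketched in a few lines of Python. This is only an illustration of the idea (one identifier should not appear on two separate items), not how Wikidata implements constraints; the function name and the item "Q_HYPOTHETICAL" are invented for the example, while the other item/ID pairs come from the proposal's examples.

```python
from collections import defaultdict


def find_shared_ids(statements):
    """Return {identifier: sorted item list} for IDs used on more than one item.

    `statements` is an iterable of (item, identifier) pairs.
    """
    items_by_id = defaultdict(set)
    for item, identifier in statements:
        items_by_id[identifier].add(item)
    return {
        identifier: sorted(items)
        for identifier, items in items_by_id.items()
        if len(items) > 1
    }


statements = [
    ("Q60965265", "1409"),       # Rosa 'Belle Poitevine'
    ("Q83673639", "1226"),       # Rosa 'Belle Biblis'
    ("Q83673647", "1340"),       # Rosa 'Belle Herminie'
    ("Q_HYPOTHETICAL", "1340"),  # invented duplicate, for illustration only
]

print(find_shared_ids(statements))
```

Note that an item carrying two different identifiers is not reported by this check; that situation is the concern of a single-value check, which is a separate question.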

MinDat taxon ID

   Under discussion
Description: identifier for a taxon in the MinDat database
Represents: MinDat (Q15221937)
Data type: External identifier
Domain: taxon (Q16521)
Allowed values: P{0,1}[1-9]\d*
Example 1: Crocodylus checchiai (Q5187412) → 8689919
Example 2: animal (Q729) → 1
Example 3: deuterostome (Q150866) → P67145
Example 4: Neusticemys (Q48837953) → 4818809
Formatter URL: https://www.mindat.org/taxon-$1.html
See also: MinDat mineral ID (P6263), MinDat Locality ID (P6265)

Motivation

MinDat database for taxon. Pamputt (talk) 11:59, 24 July 2020 (UTC)
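For readers unfamiliar with formatter URLs, here is a minimal Python sketch of how an identifier is turned into a link: the value is validated against the proposed pattern and then substituted for $1. The function name is hypothetical; the pattern, formatter URL and example values are the ones given in this proposal.

```python
import re

# Allowed-values pattern and formatter URL as given in the proposal.
MINDAT_TAXON_ID = re.compile(r"P{0,1}[1-9]\d*")
FORMATTER_URL = "https://www.mindat.org/taxon-$1.html"


def mindat_taxon_url(identifier: str) -> str:
    """Validate the identifier and substitute it for $1 in the formatter URL."""
    if MINDAT_TAXON_ID.fullmatch(identifier) is None:
        raise ValueError(f"not a valid MinDat taxon ID: {identifier!r}")
    return FORMATTER_URL.replace("$1", identifier)


# Example values from the proposal.
for identifier in ["8689919", "1", "P67145"]:
    print(mindat_taxon_url(identifier))
```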

Discussion

  Info: „BETA TEST - Fossil data and pages are very much experimental and under development. Please report any problems”. --Succu (talk) 12:21, 24 July 2020 (UTC)

  Notified participants of WikiProject Taxonomy Should we have this property? ChristianKl❫ 23:00, 29 December 2020 (UTC)

EPA Ecoregion Level 1 Code

   Under discussion
Description: part of a classification system for ecological areas with similar habitat in North America
Represents: Level I ecoregion (Q98544667)
Data type: External identifier
Domain: Level I ecoregion (Q98544667)
Allowed values: ([1-9]|1[0-5])
Example 1: Tundra (Q98825169) → 2
Example 2: Arctic Cordillera (Q98825182) → 1
Example 3: Tropical Wet Forests (Q98825183) → 15
Source: https://www.epa.gov/eco-research/ecoregions
Planned use: I will create instances of Q98544667 for the 15 identities in this class of ecoregions. See my code here for building the Wikidata item information and links for all EPA Ecoregions.
Number of IDs in source: 15
Expected completeness: eventually complete (Q21873974)
See also

Motivation

User TimK_MSI started creating items for two of the classes of EPA ecoregions some time ago (User_talk:TimK_MSI). Without an identifier property, he incorporated an identifier into some of the items by including a series ordinal on the instance of property in cases like Q5330161. I am now finishing out the work to generate items and linkages for all EPA ecoregion classifications as a persistent, resolvable, online resource for work in my lab and by others who work with these data for various analytical purposes. I would like to put identifiers on each item, properly registered to official properties, such that we can execute queries from external data systems that already have the original EPA Ecoregion codes in use. Skybristol (talk) 14:13, 25 August 2020 (UTC)

Discussion

  •   Comment please add three samples to each proposal. --- Jura 15:53, 26 August 2020 (UTC)
    • I did add 4 examples of the actual code/identifier values in the template for all 5 of the ecoregion identifiers I proposed. In the case of Level 1 Ecoregions, these really are as simple as the numeric identifiers I included as examples. You can review samples of the items to which these identifiers will apply via this instance of query, where you will see that I've temporarily put a compound form of the identifier into the aliases. The reason we need separate identifier properties for these is not only for proper semantics but also to apply a formatter URL property that will facilitate linking from the items to third party resources such as map server queries to return full boundary geometry for the items. Skybristol (talk) 22:05, 1 September 2020 (UTC)
      • Would you insert them in the template? See another proposal on how it's done. --- Jura 22:12, 1 September 2020 (UTC)
        • Thanks for pointing me in the right direction on this. I put specific item into the examples as directed. Skybristol (talk) 12:33, 10 September 2020 (UTC)
  •   Support I have also updated the regular expression to be more precise (only 16 values so no need to accept numbers like 1293797) and fixed some minor formatting with the proposal. --Dhx1 (talk) 15:38, 23 September 2020 (UTC)
    •   Comment Domain also changed to ecoregion (Q295469). --Dhx1 (talk) 15:43, 23 September 2020 (UTC)
      •   Comment Undone as the original domain was more precise and correct. --Dhx1 (talk) 16:06, 23 September 2020 (UTC)
    •   Comment @Skybristol: I changed the number of items from 16 to 15 per the official website at [4] and the English Wikipedia article that both state only 15 identifiers exist in the series. Can you clarify which is correct before I provide support to this proposal? --Dhx1 (talk) 15:55, 23 September 2020 (UTC)
    •   Comment The description at [5] indicates this identifier was issued by Commission for Environmental Cooperation (Q2986468) and not United States Environmental Protection Agency (Q460173). I suggest that the name of this property could therefore be "CEC 2006 Ecoregion Level 1 Code" and the source URI/official website becomes [6]. --Dhx1 (talk) 16:37, 23 September 2020 (UTC)
  •   Comment Is there a reason for making all the levels separate properties? I feel that two properties, "CEC Ecoregion code" linked to (what's the QID?) and "EPA Ecoregion code" linked to United States Environmental Protection Agency ecoregion (Q52111282), will suffice. The levels are pretty obvious from the formatting for both humans and machines.--Artoria2e5 (talk) 06:00, 2 February 2021 (UTC)
  •   Support, an important property for biology.--Arbnos (talk) 14:59, 19 February 2021 (UTC)
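To make concrete the point that the level can be read off the code's format, here is a small Python sketch. It uses the allowed-values patterns stated in the Level 1, Level 2 and Level 3 proposals on this page; the function names are hypothetical, and the sketch takes no position on whether one property or several should be created.

```python
import re

# Allowed-values patterns as stated in the Level 1-3 proposals on this page.
LEVEL_PATTERNS = {
    1: re.compile(r"([1-9]|1[0-5])"),
    2: re.compile(r"([1-9]|1[0-6])\.[1-6]"),
    3: re.compile(r"([1-9]|1[0-6])\.[1-6]\.([1-9]|1[0-5])"),
}


def ecoregion_level(code: str):
    """Return 1, 2 or 3 depending on which pattern the code matches, else None."""
    for level, pattern in LEVEL_PATTERNS.items():
        if pattern.fullmatch(code):
            return level
    return None


def parent_code(code: str):
    """Drop the last dotted component, e.g. '10.1.6' -> '10.1'; None at the top level."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


# Example values from the proposals.
for code in ["2", "14.6", "10.1.6"]:
    print(code, ecoregion_level(code), parent_code(code))
```

With the dotted syntax, the parent ecoregion is recoverable from the code alone, which is the trade-off raised in the discussion: it mixes levels inside one value, but it also makes a single property workable.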

EPA Ecoregion Level 2 Code

   Under discussion
Description: part of a classification system for ecological areas with similar habitat in North America
Represents: Level II ecoregion (Q98544662)
Data type: External identifier
Domain: Level II ecoregion (Q98544662)
Allowed values: ([1-9]|1[0-6])\.[1-6]
Example 1: Sierra And Plains Of El Cabo (Q98825449) → 14.6
Example 2: Sierra Los Tuxtlas (Q98825451) → 15.3
Example 3: Coastal Plain And Hills Of Soconusco (Q98825454) → 15.6
Source: https://www.epa.gov/eco-research/ecoregions
Planned use: I will create instances of Q98544662 for the 50 identities in this class of ecoregions. See my code here for building the Wikidata item information and links for all EPA Ecoregions.
Number of IDs in source: 50
Expected completeness: eventually complete (Q21873974)
See also

Motivation

User TimK_MSI started creating items for two of the classes of EPA ecoregions some time ago (User_talk:TimK_MSI). Without an identifier property, he incorporated an identifier into some of the items by including a series ordinal on the instance of property in cases like Q5330161. I am now finishing out the work to generate items and linkages for all EPA ecoregion classifications as a persistent, resolvable, online resource for work in my lab and by others who work with these data for various analytical purposes. I would like to put identifiers on each item, properly registered to official properties, such that we can execute queries from external data systems that already have the original EPA Ecoregion codes in use. Skybristol (talk) 14:13, 25 August 2020 (UTC)

Discussion

  •   Weak oppose @Skybristol: The official website at [7] states there are only 50 identifiers not the 51 you originally stated. Can you clarify which is correct? I updated the proposal to have 50 IDs until confirmed otherwise. Regex also improved but my weak oppose at this stage on the proposal is due to the level1.level2 syntax. Could the format instead just be changed to [1-6] so it doesn't mix level 1 and level 2 identifiers together? If there is a reason for using a level1.level2.level3 syntax I would argue that only one property proposal would be required. --Dhx1 (talk) 16:03, 23 September 2020 (UTC)
  •   Comment The description at [8] indicates this identifier was issued by Commission for Environmental Cooperation (Q2986468) and not United States Environmental Protection Agency (Q460173). I suggest that the name of this property could therefore be "CEC 2006 Ecoregion Level 2 Code" and the source URI/official website becomes [9]. --Dhx1 (talk) 16:41, 23 September 2020 (UTC)

EPA Ecoregion Level 3 Code

   Under discussion
Description: part of a classification system for ecological areas with similar habitat in North America
Represents: Level III ecoregion (Q52111338)
Data type: External identifier
Domain: Level III ecoregion (Q52111338)
Allowed values: ([1-9]|1[0-6])\.[1-6]\.([1-9]|1[0-5])
Example 1: Colorado Plateaus (Q2984378) → 10.1.6
Example 2: Columbia Plateau (Q3391942) → 10.1.2
Example 3: Coast Range (Q5138204) → 7.1.8
Source: https://www.epa.gov/eco-research/ecoregions
Planned use: I will create instances of Q52111338 for the 182 identities in this class of ecoregions. See my code here for building the Wikidata item information and links for all EPA Ecoregions.
Number of IDs in source: 182
Expected completeness: eventually complete (Q21873974)
See also

Motivation

User TimK_MSI started creating items for two of the classes of EPA ecoregions some time ago (User_talk:TimK_MSI). Without an identifier property, he incorporated an identifier into some of the items by including a series ordinal on the instance of property in cases like Q5330161. I am now finishing out the work to generate items and linkages for all EPA ecoregion classifications as a persistent, resolvable, online resource for work in my lab and by others who work with these data for various analytical purposes. I would like to put identifiers on each item, properly registered to official properties, such that we can execute queries from external data systems that already have the original EPA Ecoregion codes in use. Skybristol (talk) 14:13, 25 August 2020 (UTC)

Discussion

  •   Weak oppose @Skybristol: Regex improved but my weak oppose at this stage on the proposal is due to the level1.level2.level3 syntax. Could the format instead just be changed to ([1-9]|1[0-5]) (a number between 1 and 15) so it doesn't mix level 1, level 2 and level 3 identifiers together? If there is a reason for using a level1.level2.level3 syntax I would argue that only one property proposal would be required. --Dhx1
  •   Comment The description at [10] indicates this identifier was issued by Commission for Environmental Cooperation (Q2986468) and not United States Environmental Protection Agency (Q460173). I suggest that the name of this property could therefore be "CEC 2006 Ecoregion Level 3 Code" and the source URI/official website becomes [11]. --Dhx1 (talk) 16:39, 23 September 2020 (UTC)

EPA Ecoregion US Level 3 Code

   Under discussion
Description: part of a classification system for ecological areas with similar habitat in the United States
Represents: Level III ecoregion (Q52111338)
Data type: External identifier
Domain: Q52111338
Allowed values: [0-9]+
Example 1: Eastern Great Lakes Lowlands (Q5330161) → 83
Example 2: Snake River Plain (Q7547081) → 12
Example 3: Western Allegheny Plateau (Q17148740) → 70
Source: https://www.epa.gov/eco-research/ecoregions
Planned use: Instances of Q52111338 for the 85 Level 3 Ecoregions in the contiguous U.S. will contain identifiers aligned with both the North American and U.S. concepts. See my code here for building the Wikidata item information and links for all EPA Ecoregions.
Number of IDs in source: 85
Expected completeness: eventually complete (Q21873974)
See also

Motivation

User TimK_MSI started creating items for two of the classes of EPA ecoregions some time ago (User_talk:TimK_MSI). Without an identifier property, he incorporated an identifier into some of the items by including a series ordinal on the instance of property in cases like Q5330161. I am now finishing out the work to generate items and linkages for all EPA ecoregion classifications as a persistent, resolvable, online resource for work in my lab and by others who work with these data for various analytical purposes. I would like to put identifiers on each item, properly registered to official properties, such that we can execute queries from external data systems that already have the original EPA Ecoregion codes in use. Skybristol (talk) 14:13, 25 August 2020 (UTC)

Discussion

  •   Comment The examples should follow the pattern of other property proposals - start with an existing Wikidata item that would have a statement using this property, then the value it would have. ArthurPSmith (talk) 19:33, 2 September 2020 (UTC)
    • Thank you for pointing me in the right direction. I linked the examples to specific existing items. Skybristol (talk) 12:33, 10 September 2020 (UTC)
  •   Weak oppose @Skybristol: The subject item and domain should be different from the one used in Wikidata:Property_proposal/EPA_Ecoregion_Level_3_Code. As there are 5 classification identifiers proposed for ecoregions, each one should have a unique subject item (for the unique identifier/code) and a separate unique domain item (for the ecoregion). The source website indicates at [12] that there are 105 ecoregions identified, 1-85 and 101-120 instead of the 85 specified in this proposal. Which is correct? With those things fixed I'm OK to support this proposal. --Dhx1 (talk) 16:24, 23 September 2020 (UTC)
    •   Comment As a further suggestion: the official website/source URI for this property should be changed to [13] and "US" removed from the title of the proposal. --Dhx1 (talk) 16:44, 23 September 2020 (UTC)

EPA Ecoregion US Level 4 Code

   Under discussion
Description: part of a classification system for ecological areas with similar habitat in the United States
Represents: Level IV ecoregion (Q52111409)
Data type: External identifier
Domain: Q52111409
Allowed values: [0-9]+[a-z]+
Example 1: Tawas Lake Plain (Q52087723) → 50ah
Example 2: Southern Coast and Islands (Q53625714) → 76d
Example 3: Lansing Loamy Plain (Q53772624) → 56g
Source: https://www.epa.gov/eco-research/ecoregions
Planned use: Instances of Q52111409 for the 967 Level 4 Ecoregions in the contiguous U.S. will be created as items. See my code here for building the Wikidata item information and links for all EPA Ecoregions.
Number of IDs in source: 967
Expected completeness: eventually complete (Q21873974)
See also

Motivation

User TimK_MSI started creating items for two of the classes of EPA ecoregions some time ago (User_talk:TimK_MSI). Without an identifier property, he incorporated an identifier into some of the items by including a series ordinal on the instance of property in cases like Q5330161. I am now finishing out the work to generate items and linkages for all EPA ecoregion classifications as a persistent, resolvable, online resource for work in my lab and by others who work with these data for various analytical purposes. I would like to put identifiers on each item, properly registered to official properties, such that we can execute queries from external data systems that already have the original EPA Ecoregion codes in use. Skybristol (talk) 14:13, 25 August 2020 (UTC)

Discussion

  •   Comment Your description needs to be much shorter - move the bulk of what you have written there to the motivation section. ArthurPSmith (talk) 18:33, 25 August 2020 (UTC)
    • Thank you for the comment. I shortened the description to bare essentials in each of the 5 identifier property proposals. I wasn't sure that the description actually went into the property item and was trying to be thorough. Please let me know if that makes better sense now. Skybristol (talk) 22:13, 1 September 2020 (UTC)
  •   Support Thanks for fixing up the proposal, this looks good to me. However, I am wondering how you plan to connect these Ecoregion codes to actual locations - is that part of what you want to do here? Will that require another property? ArthurPSmith (talk) 17:05, 10 September 2020 (UTC)
    • Thanks for the question. The source data I referenced behind this are geospatial data with mapped boundaries for each ecoregion unit. The process I put together builds a representative point location and adds that to the Wikidata items as a geocoordinate statement, allowing for a quick, basic orientation to the general area. The statements also include intersections with political boundaries in the applicable North American countries. I was looking into options for publicly hosting the actual boundary data and would be happy for some advice on that. I looked into loading them to Wikimedia Commons as there is some documentation about loading spatial data there, but I haven't yet figured out how to actually get geojson files loaded. Another option would be to include a reference to a geospatial web service from the EPA source as the formatter URL configuration of the identifier properties. The services will respond to queries based on the ecoregion codes, so that would be a route to link each logical entity to its corresponding full geospatial location/boundary. There are other examples in Wikidata such as political jurisdictions that do link to a corresponding spatial representation in Wikimedia Commons, so that seems like the way to go, but any input on direction would be welcome. Skybristol (talk) 14:44, 15 September 2020 (UTC)
  •   Weak oppose for the same reasons I provided at Wikidata:Property_proposal/EPA_Ecoregion_US_Level_3_Code and also due to the proposed format resulting in a merger of the EPA level 3 (number) and EPA level 4 (alphabetical letter suffix) identifiers into the one proposal. My preference is two properties, one the level 3 numerical prefix and the other being the level 4 alphabetical letter suffix. If there is a need to keep the merged identifiers such as "58af" in the single Wikidata property, I would then suggest that only one property is needed to cover both EPA level 3 and EPA level 4. --Dhx1 (talk) 16:51, 23 September 2020 (UTC)
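The concern about merged Level 3 and Level 4 identifiers can be made concrete with a small check. The following is a minimal Python sketch, not part of the proposal: it assumes the allowed-values pattern is applied as a whole-string match, and the relaxed pattern `[0-9]+[a-z]+` and the helper name `split_level4_code` are illustrative assumptions.

```python
import re

# Pattern as written in the proposal's "Allowed values" field.
PROPOSED = re.compile(r"[0-9]+[a-z]")
# Assumed relaxation (not from the proposal): allow multi-letter suffixes.
RELAXED = re.compile(r"([0-9]+)([a-z]+)")


def split_level4_code(code):
    """Split a Level 4 code into (Level 3 number, Level 4 letter suffix)."""
    match = RELAXED.fullmatch(code)
    if match is None:
        raise ValueError(f"not a Level 4 ecoregion code: {code!r}")
    return match.group(1), match.group(2)


# Codes quoted in the proposal and in the discussion above.
for code in ["50ah", "76d", "56g", "58af"]:
    fits_proposed = PROPOSED.fullmatch(code) is not None
    print(code, "fits proposed pattern:", fits_proposed, "->", split_level4_code(code))
```

Under that whole-string assumption, two-letter suffixes such as "50ah" and "58af" do not fit the pattern as proposed, while the split shows how a single merged value still carries both the Level 3 and the Level 4 part.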

has surface

   Under discussion
Description: the object is the 2-dimensional surface that partly or completely surrounds the subject in 3-dimensional space
Data type: Item
Example 1: deltoid muscle (Q130243) → surface of deltoid (Q66591974)
Example 2: liver (Q9368) → surface of liver (Q66509673)
Example 3: organ (Q712378) → surface of organ (Q66515789)
See also: has boundary (P4777)

Motivation

We currently have no way to model the relationship between deltoid muscle (Q130243) and surface of deltoid (Q66591974). ChristianKl❫ 19:38, 6 December 2020 (UTC)

ChristianKl (talk) 14:41, 8 July 2016 (UTC) Iwan.Aucamp (talk) 15:13, 22 March 2020 (UTC) Was a bee (talk) 14:48, 23 September 2017 (UTC) Okkn (talk) 02:20, 25 October 2017 (UTC) JS (talk)   Notified participants of WikiProject Anatomy

Discussion

  •   Support I assume this could be applied to mathematical, as well as biological, objects? ArthurPSmith (talk) 19:24, 7 December 2020 (UTC)
    • @ArthurPSmith: In the current wording it works with 3-dimensional mathematical objects. I'm not sure whether mathematicians define the term surface also for objects that are not 3-dimensional.

Opensofias
Tobias1984
Arthur Rubin
Cuvwb
TomT0m
Physikerwelt
Lymantria
Bigbossfarin
Infovarius
Helder
PhilMINT
Malore
Nomen ad hoc
Lore.mazza51
Wikisaurus
The Anome
The-erinaceous-one
Daniel Mietchen
Haansn08
Xenmorpha
  Notified participants of WikiProject Mathematics ChristianKl❫ 19:49, 7 December 2020 (UTC)

  •   Support --Tinker Bell 19:37, 7 December 2020 (UTC)
  • I talked with a friend who's a mathematician and now I specified it a bit better. Additionally, I made it explicit that sometimes the surface is not completely surrounding for cases like the surface of the heart that gets pierced by an artery/vein. ChristianKl❫ 18:20, 8 December 2020 (UTC)
  • I switched it to be more open for surfaces that don't completely surround the object. ChristianKl❫ 18:09, 10 December 2020 (UTC)
  •   Oppose How is the proposed property different than has boundary (P4777)? If we want to create a new property specifically for biological items, I'm fine with that, but in mathematical items we should use has boundary (P4777). As far as the mathematical definition of a surface, we define a surface as a two-dimensional space. A surface, however, can be embedded in higher dimensional spaces. So, a sphere is two-dimensional because if you zoom in on the surface it has no thickness--it looks like a 2D plane. We often represent a sphere in three dimensions, however. But we could also put it in four-dimensional space, five-dimensional space, etc. — The Erinaceous One 🦔 09:26, 14 December 2020 (UTC)
@The-erinaceous-one: has boundary (P4777) was created with the idea that it's what you get when you project a 3D entity onto a 2D plane. It's intended to be the relationship between Walls of Jerusalem (Q2918723) and Old City (Q213274). ChristianKl❫ 12:47, 14 December 2020 (UTC)
@ChristianKl: I see. The mathematical definition of "boundary" is much broader than the meaning of has boundary (P4777) that you described because the boundary of an object can be n-dimensional with n >= 0 (depending on the object). Semantically, I've never heard anybody say "a <mathematical object> has surface <x>" so it doesn't make sense to try to shoehorn mathematical objects into the proposed property. I would much rather broaden the definition of has boundary (P4777) to allow for mathematical objects in more than two dimensions. — The Erinaceous One 🦔 00:43, 23 December 2020 (UTC)
@The-erinaceous-one: For objects in the physical world there's a massive difference between being surrounded when you project into 2D and being surrounded in 3D and it doesn't really make sense to model them with the same relation. ChristianKl❫ 01:35, 23 December 2020 (UTC)
@ChristianKl: I just don't see the number of dimensions as a quality that merits a new property. There are four-dimensional objects that have boundaries too. Would we need another new property? In fact, on hyperball (Q3776995) there is the statement: hyperball (Q3776995) has boundary (P4777) n-sphere (Q306610). This means that different instances of Q3776995 will have boundaries of various dimensions. How would you propose modeling this? — The Erinaceous One 🦔 12:04, 23 December 2020 (UTC)
Given the current definition of has boundary (P4777) that comes from its property discussion, the statement hyperball (Q3776995) has boundary (P4777) n-sphere (Q306610) is clearly wrong.
The relationship cell (Q7868) → cell surface (Q189094) is not the same as the relationship Walls of Jerusalem (Q2918723) → Walls of Jerusalem (Q2918723). Both cell (Q7868) and Walls of Jerusalem (Q2918723) happen to be items that exist in 3D reality, so it's not possible to unambiguously infer which meaning is intended when the same property is used. ChristianKl❫ 20:42, 23 December 2020 (UTC)
@ChristianKl: I think we need to reconsider the details of has boundary (P4777). The current description is unclear. The first part says, "element that's on the two dimensional border that surrounds the subject," but the second part says, "the limit of an entity." The meaning of the two parts, together, is ambiguous. The first part states the narrower usage of "has boundary" you described, whereas the second part indicates a broader usage that is in line with the mathematical usage of "boundary." Additionally, in the has boundary (P4777) property proposal, there wasn't discussion regarding the dimension of the boundary; you mention it once, but nobody else commented on it before the property was created. In other words, there wasn't a consensus at the time that "has boundary" only applies to 2D objects with a 1D boundary (or a 2D projection of a 3D object). (Side note: if we determine P4777 should only apply to a border in a 2D projection, then the description should say, "element on the one-dimensional border that surrounds the subject," since a curve in a plane is a one-dimensional object.)
If we go with the narrow meaning of P4777 then I have two concerns. The first is that it destroys its utility as a mathematical property. Replacing it with "has surface" is better than nothing but it is contrary to the semantics used within mathematics. Secondly, I am concerned that restricting has boundary (P4777) to only the 2D case causes the property to arbitrarily include some types of political/legal boundaries and exclude other types. In particular, consider a fictional Sci-Fi universe with boundaries between nations "X" and "Y" in outer space. The "X-Y boundary" would be a 2D boundary in 3D space and the relationship between "X" and "X-Y boundary" is exactly the same as the relationship between "the US" and the "US-Mexico border." Similar cases also arise in the real world when you consider the vertical boundaries of a piece of property or nation [14]. Allowing P4777 to use the broader meaning would increase its usefulness without requiring a proliferation of properties.
Regarding the ambiguity you mentioned, I don't see this being a real source of confusion (for humans). It is usually clear when we talk about the boundary of a geographic item, we are talking about the boundary of the projection on the map. If it's not clear, we could make it explicit by adding a qualifier along the lines of United States has border US-Mexico border / with respect to map projection. — The Erinaceous One 🦔 12:30, 26 December 2020 (UTC)

Cephalopod Ontology entity ID

   Under discussion
Description: identifier for an anatomical entity in Cephalopod Ontology, an anatomical and developmental ontology for cephalopods
Represents: Cephalopod Ontology (Q104030182)
Data type: External identifier
Domain: anatomical entity (Q27043950)
Example 1: stylet (Q7629472) → 0000245
Example 2: photosensitive vesicle (Q104030333) → 0000200
Example 3: gill (Q132390) → 0000122
Formatter URL: http://purl.obolibrary.org/obo/CEPH_$1

Motivation

I found that stylet (Q7629472) currently has no good authority reference, and referencing Cephalopod Ontology seems desirable. ChristianKl❫ 15:04, 8 December 2020 (UTC)

ChristianKl (talk) 14:41, 8 July 2016 (UTC) Iwan.Aucamp (talk) 15:13, 22 March 2020 (UTC) Was a bee (talk) 14:48, 23 September 2017 (UTC) Okkn (talk) 02:20, 25 October 2017 (UTC) JS (talk)   Notified participants of WikiProject Anatomy

Discussion

Fungal gross anatomy entity ID

   Under discussion
Description: identifier for an anatomical entity in Fungal gross anatomy, a structured controlled vocabulary for the anatomy of fungi
Represents: Fungal gross anatomy (Q81661616)
Data type: External identifier
Domain: anatomical entity (Q27043950)
Example 1: fungal structure (Q56883667) → 0000001
Example 2: hypha (Q193129) → 0001001
Example 3: conidiophore (Q104030668) → 0000043
Formatter URL: http://purl.obolibrary.org/obo/FAO_$1

Motivation

We currently have no authority control property for fungal anatomy. ChristianKl❫ 15:16, 8 December 2020 (UTC)

Discussion

  Oppose whilst this is a useful controlled vocabulary, it has only 120 class terms. For me, that is not enough to require a new property and I would suggest using equivalent class (P1709) to create a relationship instead. Simon Cobb (User:Sic19 ; talk page) 10:39, 21 December 2020 (UTC)

Plant Ontology entity ID

   Under discussion
Description: identifier for an entity in Plant Ontology, a structured vocabulary and database resource that links plant anatomy, morphology and growth and development to plant genomics data
Represents: Plant Ontology (Q55118572)
Data type: External identifier
Example 1: seta (Q126780) → 0025066
Example 2: Leptoid (Q11754738) → 0025033
Example 3: fruit (Q1364) → 0025496
Formatter URL: http://purl.obolibrary.org/obo/PO_$1

Motivation

We currently lack authority control information for our items about plant anatomy. ChristianKl❫ 18:00, 8 December 2020 (UTC)

Discussion

Having a formatter url is very useful to look up the definition and relationships. It also makes it easy to query for plant structures that don't have a Plant Ontology entity ID. ChristianKl❫ 21:15, 8 December 2020 (UTC)
Claiming formatter URL (P1630) as „very useful to look up the definition and relationships“ is your POV. Could you please give a SPARQL query to detect „plant structures“ (whatever this means) we don't have here and needs a refinement. --Succu (talk) 22:26, 8 December 2020 (UTC)
Complaining in the same paragraph about "whatever this means" in regards to "plant structure" and being skeptical about it being useful to access a definition for plant structure from an authority like Plant Ontology seems circular. ChristianKl❫ 22:31, 8 December 2020 (UTC)
Simple answer   Oppose. --Succu (talk) 22:37, 8 December 2020 (UTC)
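For readers following the request for a query, the kind of "items lacking the identifier" lookup mentioned above could look roughly like the sketch below. It only builds the query text. The function name is hypothetical, the property ID `P0000` is a placeholder because the proposed property has no number yet, and using fruit (Q1364) as the root class is purely illustrative; nobody in the discussion specified a class.

```python
def missing_id_query(class_qid, property_pid, limit=100):
    """Build a SPARQL query for items in a subclass tree that lack an identifier."""
    return f"""SELECT ?item ?itemLabel WHERE {{
  ?item wdt:P279* wd:{class_qid} .
  FILTER NOT EXISTS {{ ?item wdt:{property_pid} ?id . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}"""


# Placeholder arguments: Q1364 (fruit) as an example root class,
# P0000 standing in for the not-yet-created property.
print(missing_id_query("Q1364", "P0000"))
```

The query walks subclass of (P279) from the chosen root and keeps only items without a statement for the given property, so the result depends entirely on which root class is picked.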

Hymenoptera Anatomy Ontology entity ID

   Under discussion
Description: identifier for an anatomical entity in Hymenoptera Anatomy Ontology, a structured controlled vocabulary of the anatomy of the Hymenoptera (bees, wasps, sawflies and ants)
Represents: Hymenoptera Anatomy Ontology (Q81661648)
Data type: External identifier
Example 1: arolium (Q697956) → 0000148
Example 2: Galea (Q1491790) → 0000368
Example 3: maxilla (Q350956) → 0000513
Formatter URL: http://purl.obolibrary.org/obo/HAO_$1

Motivation

Given our current lack of authority control for Hymenoptera it would be good to have a property for this. ChristianKl❫ 18:39, 8 December 2020 (UTC)

Discussion

GTDB taxon

   Under discussion
Description: Taxonomic identifier in the Genome Taxonomy Database, a proposed phylogenomic nomenclature of prokaryotes
Represents: Genome Taxonomy Database (Q104830506)
Data type: External identifier
Domain: prokaryotes (Q19081)
Allowed values: [dpcofgs]__[A-Z][-[:alnum:]_ ]*
Example 1: Paenarthrobacter ureafaciens (Q55180430) → s__Paenarthrobacter ureafaciens
Example 2: CPR group (Q27110262) → p__Patescibacteria
Example 3: Asgard (Q45003302) → p__Asgardarchaeota
Example 4: Clostridia (Q132809) → c__Clostridia; c__Clostridia_A
Example 5: Escherichia coli (Q25419) → s__Escherichia coli; s__Escherichia coli_C; s__Escherichia coli_D; s__Escherichia dysenteriae; s__Escherichia fergusonii; s__Escherichia flexneri
Source: see Q104830506#P1343
Number of IDs in source: > 45,000 (45,416 as of r85)
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://gtdb.ecogenomic.org/tree?r=$1
Robot and gadget jobs: maybe
Applicable "stated in"-value: Genome Taxonomy Database (Q104830506)
Distinct values constraint: no
Wikidata project: WikiProject Tree of life (Q8503033)

Motivation

The GTDB provides a "clean", if unorthodox, tree of life without paraphyletic or oddly-ranked groups. It is especially helpful to provide GTDB links on taxa that are known to be weird themselves or include such taxa. Artoria2e5 (talk) 05:40, 2 February 2021 (UTC)

Discussion

99of9
Achim Raschka (talk)
Andrawaag (talk)
Brya (talk)
CanadianCodhead (talk)
Canley
Circeus
Dan Koehl (talk)
Daniel Mietchen (talk)
Enwebb
Faendalimas
FelixReimann (talk)
Hyperik (talk)
Infomuse (talk)
Infovarius (talk)
Jean-Marc Vanel
Joel Sachs
Klortho (talk)
Lymantria (talk)
Magnefl (talk)
MPF
Manojk
MargaretRDonald
Mellis (talk)
Michael Goodyear
Mr. Fulano (talk)
Nis Jørgensen
Oronsay
PEAK99
Peter Coxhead
PhiLiP
Andy Mabbett (talk)
Plantdrew
Prot D
pvmoutside
RaboKarbakian
Rod Page
Strobilomyces (talk)
Stuchka (talk)
Succu (talk)
TiagoLubiana (talk)
Tinm
Tom.Reding
TomT0m
Tommy Kronkvist (talk)
Tris T7 TT me
Tubezlob
William Avery
Minorax
Culex
Koala0090
Mike Krüger
Friesen5000
Salgo60
TED
GoEThe (talk)
Estopedist1
Leptospira
  Notified participants of WikiProject Taxonomy

  •   Comment @Artoria2e5: I attempted to link your examples, but the space character breaks linking. How do you plan to handle the IDs with space characters in them? ArthurPSmith (talk) 22:21, 2 February 2021 (UTC)
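One possible way to deal with spaces, offered only as a sketch, is to percent-encode the identifier before substituting it into the formatter URL. The Python below translates the proposed POSIX-style pattern into Python syntax (assuming ASCII letters and digits for `[:alnum:]`). The helper name `gtdb_url` is hypothetical, and whether the GTDB site accepts `%20` in the `r` parameter is an untested assumption.

```python
import re
from urllib.parse import quote

# Python translation of the proposed pattern [dpcofgs]__[A-Z][-[:alnum:]_ ]*
# (ASCII letters and digits assumed for [:alnum:]).
GTDB_ID = re.compile(r"[dpcofgs]__[A-Z][-A-Za-z0-9_ ]*")

# Formatter URL as given in the proposal.
FORMATTER_URL = "https://gtdb.ecogenomic.org/tree?r=$1"


def gtdb_url(taxon_id):
    """Validate a GTDB taxon ID and build a link with spaces percent-encoded."""
    if GTDB_ID.fullmatch(taxon_id) is None:
        raise ValueError(f"unexpected GTDB taxon ID: {taxon_id!r}")
    return FORMATTER_URL.replace("$1", quote(taxon_id, safe=""))


print(gtdb_url("s__Paenarthrobacter ureafaciens"))
print(gtdb_url("p__Patescibacteria"))
```

Identifiers without spaces pass through unchanged, so the encoding step would only affect species-level values such as the first example.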

Biochemistry/molecular biology

Please visit Wikidata:WikiProject Molecular biology for more information. To notify participants use {{Ping project|Molecular biology}}

inverse agonist of

   Ready Create
Description: substance whose binding to a protein produces the opposite response as an agonist binding to that protein
Data type: Item
Domain: inverse agonist (Q1671731)
Allowed values: Wikidata items of instance of (P31)/subclass of (P279): chemical compound (Q11173), protein (Q8054), or protein family (Q417841)
Example 1: pimavanserin (Q7194603) → HTR2A (Q14891424)
Example 2: pimavanserin tartrate (Q27284759) → HTR2A (Q14891424)
Example 3: SR9243 (Q27162823) → Liver X receptor (Q3454539)
Example 4: SR9243 (Q27162823) → NR1H2 (Q18032292)
Example 5: SR9243 (Q27162823) → NR1H3 (Q18034995)
Source: Guide to Pharmacology, ChEMBL, BindingDB, PDB, ChEBI

Motivation

Other pharmacological relationships (e.g., agonist of (P3772), antagonist of (P3773), positive allosteric modulator of (P3778)) have already been assigned Wikidata properties and there is ongoing work to incorporate more relationships into the OBO Relation Ontology (https://github.com/oborel/obo-relations/issues/369, https://github.com/oborel/obo-relations/issues/371). Inverse agonism is another mechanism of action that's a bit less common than the others, but is very interesting. We want to upload a lot of relations inferred from ChEBI based on manual curation from https://github.com/chemical-roles/chemical-roles and this relation will support encoding some of those.  – The preceding unsigned comment was added by Cthoyt (talk • contribs).

Discussion

  • @Cthoyt: The description should define the term inverse agonist instead of just repeating it. ChristianKl❫ 18:14, 20 June 2020 (UTC)
  •   Support ChristianKl❫ 20:17, 1 December 2020 (UTC)

Andrew Su
Marc Robinson-Rechavi
Pierre Lindenbaum
Michael Kuhn
Boghog
Emw
Chandres
Dan Bolser
Pradyumna
Chinmay
Timo Willemsen
Salvatore Loguercio
Tobias1984
Daniel Mietchen
Optimale
Mcnabber091
Ben Moore
Alex Bateman
Klortho
Hypothalamus
Vojtěch Dostál
Gtsulab
Andra Waagmeester
Sebotic
Mvolz
Toniher
Elvira Mitraka
David Bikard
Dan Lawson
Francesco Sirocco
Konrad U. Förstner (talk)
Chris Mungall (talk)
Kristina Hettne
Hardwigg
i9606
Putmantime
Tinm
Karima Rafes
Finn Årup Nielsen
Jasper Koehorst
Till Sauerwein
Crowegian
Nothingserious
Okkn
AlexanderPico
Amos Bairoch
Gstupp
DePiep
Was a bee
SarahKeating
Muhammad Elhossary
Ptolusque
Netha
Damian Szklarczyk
Kpjas
Thibdx
Juliansteinb
TiagoLubiana
SCIdude
Photocyte
Yusra Haider
JS
Hannes Röst
Kritika Dusad
T.Shafee(evo&evo) (talk)
GoEThe
  Notified participants of WikiProject Molecular biology
Saehrimnir
Leyo
Snipre
Jasper Deng
Dcirovic
Walkerma
Egon Willighagen
Denise Slenter
Daniel Mietchen
Kopiersperre
Emily Temple-Wood
Pablo Busatto (Almondega)
Antony Williams (EPA)
TomT0m
Wostr
Devon Fyson
User:DePiep
User:DavRosen
Benjaminabel
99of9
Kubaello
Fractaler
Sebotic
Netha
Hugo
Samuel Clark
Tris T7
Leiem
Christianhauck
SCIdude
Binter
Photocyte
Robert Giessmann
Cord Wiljes
Jonathan Bisson
GrndStt
Ameisenigel
Charles Tapley Hoyt
ChemHobby
  Notified participants of WikiProject Chemistry ChristianKl❫ 20:18, 1 December 2020 (UTC)

  • @ChristianKl: What are the next steps? Do we need feedback from more specific people? Perhaps @DeSl:? Cthoyt (talk) 16:57, 5 February 2021 (UTC)
  •   Support, an important property for chemistry.--Arbnos (talk) 14:44, 19 February 2021 (UTC)

Basic Formal Ontology ID

   Under discussion
Description: identifier in Basic Formal Ontology (BFO), a top-level ontology developed by Barry Smith and his associates for the purposes of promoting interoperability among domain ontologies
Represents: Basic Formal Ontology (Q4866972)
Data type: External identifier
Example 1: independent continuant (Q53617489) → BFO_0000004
Example 2: entity (Q35120) → BFO_0000001
Example 3: continuant (Q103940464) → BFO_0000002
Formatter URL: http://purl.obolibrary.org/obo/BFO_$1

Motivation

Basic Formal Ontology gets used a lot in biology, so it's worth being able to reference it directly even when at the moment there aren't that many entries. ChristianKl❫ 20:07, 6 December 2020 (UTC)

  Notified participants of WikiProject Molecular biology ChristianKl❫ 20:09, 6 December 2020 (UTC)

Discussion

  Support +   Comment It seems reasonable. I wonder if we want all OBO Foundry ontologies to have specific properties on Wikidata. I think they are generally very useful. As a side note, one of the examples says independent continuant (Q53617489) → 0000002, but BFO_0000002 is Class: continuant, I think it was a typo or something. Best, TiagoLubiana (talk) 02:40, 7 December 2020 (UTC)

  • I fixed the value. I copy-pasted and forgot to fill in the proper values. I do think that we likely want all the OBO Foundry ontologies to have external ID properties in Wikidata. ChristianKl❫ 10:45, 7 December 2020 (UTC)
  •   Support --Tinker Bell 01:44, 8 December 2020 (UTC)
  •   Oppose Basic Formal Ontology has only 35 class terms. Linking with equivalent class (P1709) would be sufficient. Simon Cobb (User:Sic19 ; talk page) 09:48, 21 December 2020 (UTC)
    • While it has few items, having the ability to look up the definition quickly is valuable. The fact that the items are at the top of the ontology also means that they are likely to be seen much more often than a lot of other items. ChristianKl❫ 20:33, 22 December 2020 (UTC)
@ChristianKl: Yes, the ability to look up the definition is useful and it is possible to do that without creating a property. Not sure I properly understand your point about items on top of the ontology being seen more often but it does not seem to justify the creation of this property, especially as we already have a property to store exactly the same information and connect to these terms.
Also, can this proposal be placed alongside your other OBO property proposals so they can be discussed together? Succu has correctly noted that the prefix, in this instance BFO_, is part of the ID and this should not be overlooked. Although there are extant OBO ontology properties, e.g. Environment Ontology ID (P3859) and OBO Gazetteer ID (P6778), they are seldom utilised and I think it would be sensible to discuss the option of a single property (with the prefix included to identify the ontology) as an alternative to a property for each ontology. Or, if we can establish a consensus to (potentially) create properties for each ontology, I will withdraw my objections based on the number of terms. Simon Cobb (User:Sic19 ; talk page) 08:29, 23 December 2020 (UTC)
@Sic19: There are entities that have identifiers in multiple OBO ontologies. That means that, given our semantics around external identifiers, a property that takes all IDs wouldn't be an external ID. When a user looks at a property it's valuable when all the external identifiers are listed together and that doesn't happen with a string property.
I did switch the ID name to BFO_.
I have created a list of all OBO ontologies at https://www.wikidata.org/wiki/Wikidata:WikiProject_Bioinformatics/OBO_Foundary_Ontologies . Most of them don't have Wikidata properties at present. While I do believe that it would be beneficial to eventually have properties for all of them, I currently don't propose to add Wikidata properties without planned usages. The ontologies I did propose were ontologies that touch items I have worked with. ChristianKl❫ 15:27, 23 December 2020 (UTC)
@ChristianKl: regarding identifiers in multiple ontologies, see the RDA value vocabularies ID proposal below, which implies it is possible to have an external identifier datatype property in this scenario. It also deals with the issue of some of the vocabularies being too small to require a specific property. Simon Cobb (User:Sic19 ; talk page) 16:32, 23 December 2020 (UTC)
@Cmungall: you proposed OBO Gazetteer ID (P6778), do you have plans to fill it with more data? ChristianKl❫ 15:27, 23 December 2020 (UTC)
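The question of whether the prefix belongs in the identifier or in the formatter URL can be illustrated with a short sketch. The Python below is an illustration only: `expand_formatter` is a hypothetical helper mimicking `$1` substitution, and the prefix-free formatter URL is an assumed alternative that does not appear in the proposal.

```python
def expand_formatter(formatter_url, identifier):
    """Substitute an identifier for the $1 placeholder of a formatter URL."""
    return formatter_url.replace("$1", identifier)


# Formatter URL as listed in the proposal, with the prefix inside the URL.
PREFIX_IN_URL = "http://purl.obolibrary.org/obo/BFO_$1"
# Assumed alternative (not in the proposal) for IDs that carry the prefix.
PREFIX_IN_ID = "http://purl.obolibrary.org/obo/$1"

print(expand_formatter(PREFIX_IN_URL, "0000004"))
print(expand_formatter(PREFIX_IN_URL, "BFO_0000004"))  # prefix appears twice
print(expand_formatter(PREFIX_IN_ID, "BFO_0000004"))
```

If values are stored with the prefix, as in the revised examples, the listed formatter URL would double it. A prefix-free formatter would avoid that, and is also the shape a single shared OBO property would need.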

Chemistry

Please visit Wikidata:WikiProject Chemistry for more information. To notify participants use {{Ping project|Chemistry}}

isotopically modified form of

Motivation

Another property to model relations between molecular entities. The label 'isotopically modified form of' has been chosen instead of 'isotopically modified compound of' to make it possible to link other molecular entities (other than compounds, like ions) with each other. This is IMO the only possible method to link a molecular entity with natural isotopic composition (assuming that every item about a molecular entity is that by default) with isotopically modified molecular entities. isotopically modified form subclass of (P279) regular item about a molecular entity would be possible if (1) chemical compounds in WD were classes (right now every chemical compound is classified as an instance of a class), (2) items about molecular entities in WD were assumed to have any/undefined isotopic composition, not natural isotopic composition. This property is designed based on hydrated form of (P4770), i.e. unidirectional + item requires statement constraint (Q21503247) that requires isotopic compound (Q22332141) as a superclass. Wostr (talk) 18:34, 18 October 2020 (UTC)

  Notified participants of WikiProject Chemistry

Discussion

  • I think subclass of (P279) is the right approach here, it is what we use for the isotopes themselves. There isn't any one "natural" isotopic composition anyway - there are some standard Earth surface ones, but I think it's a bit silly to say that a compound found, say, on Mars, is not identified by the associated wikidata item for the generic compound, even if the isotopic composition is different. ArthurPSmith (talk) 16:58, 19 October 2020 (UTC)
    • @ArthurPSmith: as I wrote above, this approach is not possible right now and won't be possible in the foreseeable future. The reasons are that chemical compounds are now classified as instances — i.e. it's not possible to add a subclass or an instance to an instance — and there was no agreement to change that. The problem is that chemical compound (Q11173) has dual use now — it is a part of classification of chemical species (most classes of chemical compounds classified as structural class of chemical compounds (Q47154513) and group of chemical compounds (Q56256086) /with subclasses/ have chemical compound (Q11173) as a superclass at some point) and is a metaclass for every item describing stereochemically or isotopically defined compound (just like structural class of chemical compounds (Q47154513) is a metaclass for structural classes of chemical compounds). That led to the existing situation in which most compounds are classified using instance of (P31). I don't see any possibility to change this situation right now, because there is no good data model proposed for chemical species and every discussion in WikiProject Chemistry leads to the same result: no consensus due to too few people participating in the discussion. Wostr (talk) 18:07, 19 October 2020 (UTC)
    • About the rest of your comment: almost every item about chemical species have 'mass' as a property, usually calculated using CIAAW data. So every such chemical species is assumed to have some default (natural) isotopic composition. Wostr (talk) 18:20, 19 October 2020 (UTC)
      • @Wostr: You state "it's not possible to add a subclass or an instance to an instance" - why not? We have that all over the place in Wikidata. Every chemical species is a class in the sense that it represents all the various possible real-world manifestations of that species, which would be instances of the class. As to the existing mass property, they probably should have the determination method (P459) qualifier reflecting the isotopic assumptions. ArthurPSmith (talk) 18:41, 19 October 2020 (UTC)
        • @ArthurPSmith: as to why not — because right now chemical species are not classes in WD. All chemical species are instance of (P31) chemical compound (Q11173), and adding isotopically modified compound subclass of (P279) regular chemical compound (or even isotopically modified compound instance of (P31) regular chemical compound) triggers a violation (and rightly so) — as an example, you can't add liothyronine I-131 (Q27269725) subclass of (P279) liothyronine (Q327362) as long as there is no subclass of (P279) statement in liothyronine (Q327362) (ideally liothyronine (Q327362) subclass of (P279) DL-triiodothyronine (Q27163652) without redundant liothyronine (Q327362) subclass of (P279) chemical compound (Q11173)). Wostr (talk) 19:30, 19 October 2020 (UTC)
          • Well, this is exactly why I argued that subclass of (P279) should be used much more broadly for such abstract concepts (there is very likely no Wikidata item that refers to, for example, a single specific physical molecule of anything). What if you want to define both stereochemically and isotopically? Subclasses work, the current distinction between instance and class in these cases is just broken. ArthurPSmith (talk) 13:15, 21 October 2020 (UTC)
            • While I agree this is some kind of subclass you still need to indicate the fact that the isotopic content is different, together with the compound you are doing this relation. If you are happy with a long list of P279 statements that each has a qualifier as to the kind of subclass, then we can abandon other properties altogether. --SCIdude (talk) 13:23, 27 October 2020 (UTC)
              • @SCIdude: I assume the isotopic content would be indicated by has part (P527) statements on the isotopic compound item. What other superclasses are you thinking there should be for such items? I guess you could have 2 or 3, but I wouldn't expect a "long list of P279 statements" - can you explain a bit more why you think that would happen? ArthurPSmith (talk) 18:02, 27 October 2020 (UTC)
                • How should I know the future? If any property that can be interpreted as subclass is abandoned in favor of subclass then the number of such statements will rise. "antagonist of" will be subclass of antagonist, "inhibits" will be subclass of inhibitor, "tautomer", "stereoisomer" just to name a few. Be consistent! --SCIdude (talk) 18:13, 27 October 2020 (UTC)
                • I don't think that has part (P527) (and its inverse) should be used for the elemental composition of chemical compounds. It is too general; right now it is used for different things, which makes it very inconsistent. With classes like compound of carbon (Q2901852) in a classification tree, using has part (P527) could be abandoned. Also, there is a possibility to have classes like 'compound of carbon-14' (being a subclass of 'compound of carbon' and 'isotopically modified compound'). Wostr (talk) 13:45, 28 October 2020 (UTC)
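
The classification-tree idea in the last comment can be sketched as a small subclass graph. This is only an illustration under stated assumptions: 'compound of carbon-14' is the hypothetical class named in the comment, not an existing item, the edges to 'chemical compound' are assumed for the sake of the example, and `superclasses` is a made-up helper name.

```python
# Toy subclass-of (P279) graph; plain labels stand in for Wikidata items.
# "compound of carbon-14" is the hypothetical class from the comment above,
# and the two edges leading to "chemical compound" are assumptions.
SUBCLASS_OF = {
    "compound of carbon-14": {"compound of carbon", "isotopically modified compound"},
    "compound of carbon": {"chemical compound"},
    "isotopically modified compound": {"chemical compound"},
    "chemical compound": set(),
}


def superclasses(item, graph=SUBCLASS_OF):
    """Return every class reachable from item by following subclass-of links."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for parent in graph.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

In this sketch an item placed in 'compound of carbon-14' is reachable from both parent classes, so the elemental and isotopic information comes from the tree rather than from has part (P527) statements.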

Medicine

Please visit Wikidata:WikiProject Medicine for more information. To notify participants use {{Ping project|Medicine}}

Dose

   Under discussion
Represents: dose (Q473420)
Data type: Quantity
Allowed units: Milligrams, micrograms, international units
Example:
route of administration (P636) = oral administration (Q285166)
event interval (P2257) = 1 day
Source: en:Dose (biochemistry)
Planned use: Plan is to use as a qualifier for the price of different medications
Robot and gadget jobs: Eventually
Motivation

Doses of medications will be required for listing prices or defining how a medication is usually taken.

For example, the typical dose of amoxicillin is 500 mg po TID (by mouth, three times a day).

The wholesale price of 500 mg of amoxicillin is 0.063 USD as of July 17th, 2019 in the United States.[15]

Right now we are using "quantity", but when one adds 400 mg it gives a warning. Doc James (talk · contribs · email) 04:33, 17 July 2019 (UTC)
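
The arithmetic behind using a dose as a qualifier for a price can be sketched as follows. This is a minimal illustration, not a proposed data model: the `Dose` class and the `daily_cost` helper are hypothetical names, "po TID" is read as oral administration three times a day, and the only figures used are the 500 mg dose and the 0.063 USD wholesale price quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dose:
    """Hypothetical container for the pieces of a dose statement."""
    amount: float        # numeric amount, e.g. 500
    unit: str            # unit of the amount, e.g. "mg"
    route: str           # route of administration (P636) value
    doses_per_day: int   # TID = three times a day


def daily_cost(dose, price_per_dose):
    """Cost of one day of treatment, in the currency of price_per_dose."""
    return round(dose.doses_per_day * price_per_dose, 3)


# Figures taken from the motivation text: 500 mg po TID, 0.063 USD per 500 mg.
amoxicillin = Dose(
    amount=500,
    unit="mg",
    route="oral administration (Q285166)",
    doses_per_day=3,
)
```

With these numbers, `daily_cost(amoxicillin, 0.063)` works out to 3 × 0.063 = 0.189 USD per day, which is only meaningful when the price is tied to a stated dose.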

Discussion

Tobias1984
Doc James
Bluerasberry
Wouterstomp
Gambo7
Daniel Mietchen
Andrew Su
Peter.C
Klortho
Remember
Matthiassamwald
Projekt ANA
Andrux
Pavel Dušek
Was a bee
Alepfu
FloNight
Genewiki123
Emw
emitraka
Lschriml
Mvolz
Franciaio
User:Lucas559
User:Jtuom
Chris Mungall
ChristianKl
Gstupp
Geoide
Sintakso
علاء
Dr. Abhijeet Safai
Adert
CFCF
Jtuom
Lucas559
Drchriswilliams
Okkn
CAPTAIN RAJU
LeadSongDog
Ozzie10aaaa
Sami Mlouhi
Marsupium
Netha Hussain
Abhijeet Safai
ShelleyAdams
Fractaler
Seppi333
Shani Evenstein
Csisc
linuxo
Arash
Morgankevinj
Anandhisuresh
TiagoLubiana
ZI Jony
Antoine2711
Viveknalgirkar
JustScienceJS
Leptospira
Scossin
Starsign1971
Bibeyjj
  Notified participants of WikiProject Medicine

  • @Doc James: Please provide a description for the property and examples.
    I'm a bit wary about the potential of people adding a dose to the chemical compounds. Intuitively it seems to me like only packaged drugs have a dose and the dose isn't a property of the underlying chemical substances. ChristianKl❫ 14:06, 17 July 2019 (UTC)
  •   Question Should this property also address dose units for things other than chemicals, such as ionizing radiation (Q186161)? LeadSongDog (talk) 17:25, 17 July 2019 (UTC)
    • User:LeadSongDog you mean in a medication related context? Doc James (talk · contribs · email) 21:38, 18 July 2019 (UTC)
      • No, not just in a medication-related context: in an ionizing radiation safety context, workers wear dosimetry badges to track occupational exposure; in semiconductors, defined quantities of neutrons cause defined p-type doping effects; in a photochemistry context, defined quantities of light are used to power many reactions (e.g. photosynthesis, epoxy polymerization, cross-linking of synthetic rubbers, etc.); in food safety, defined radiation doses can be used to sterile-package uncooked foods as an alternative to pasteurization. LeadSongDog (talk) 15:44, 19 July 2019 (UTC)
  •   Oppose Isn't this what defined daily dose (P4250) is for? There are also some related properties like acceptable daily intake (P2542) that cover other aspects of this. In any case the current proposal is too ill-defined to be suitable here. ArthurPSmith (talk) 18:47, 17 July 2019 (UTC)
    Oh, hang on, you just want a property that works like "quantity" but allows units/non-integer values? It looks like we don't have something for that at all - how about calling this "amount" then? ArthurPSmith (talk) 18:53, 17 July 2019 (UTC)
    User:ArthurPSmith yes perfectly happy with calling it "amount". What we are looking for is a dosage which is an amount for a medication. DDD is a specific type of amount as is ADI. Doc James (talk · contribs · email) 20:56, 18 July 2019 (UTC)
    @Doc James: We might want to try a completely new proposal for that - this one seems a little too tied to the medical context; we should also come up with some other examples where it would be useful, I'm sure they exist! ArthurPSmith (talk) 21:02, 18 July 2019 (UTC)
    User:ArthurPSmith Actually it might be best to tie this specifically to medications. We could remove a constraint from quantity (P1114) but that could have a lot of negative effects. Doc James (talk · contribs · email) 21:11, 18 July 2019 (UTC)
  • I checked defined daily dose (P4250) and acceptable daily intake (P2542) to see how the proposal modeled examples for those properties. There is hardly any modeling. I tried to do some modeling here in this edit. Before it referred to an item, and now there is still that item linked but I also tried to move the administration and event frequencies here for discussion. To talk about a dose, we need an amount, a time period, and an administration route, right? Is it still worthwhile to describe a dose with less information than that? What other information is also helpful?
    I am unsure with Arthur about overlap with those existing properties, which may be sufficient. Blue Rasberry (talk) 19:42, 17 July 2019 (UTC)
    • We may also need formulation, like is it extended release, oral dissolving tablet, liquid, tablet, capsule, etc. Doc James (talk · contribs · email) 21:01, 18 July 2019 (UTC)
  •   Support It is needed for the particular usage of medicine. Antoine2711 (talk) 21:11, 18 July 2019 (UTC)
  •   Oppose Use "quantity", and fix the constraints. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:01, 23 August 2019 (UTC)

GoodRx ID

   Under discussion
Represents: GoodRx (Q30640316)
Data type: External identifier
Template parameter: Proposed as "GoodRx" in en:template:infobox drug and en:template:Drugbox external links (the intent is to pull GoodRx URLs from Wikidata to supply the corresponding fields in these templates)
Domain: pharmaceutical product (Q28885102), prescription drug (Q1643563), medication (Q12140), drug (Q8386), chemical compound (Q11173)
Example 1: DL-amphetamine (Q179452) → Adderall
Example 2: DL-amphetamine (Q179452) → Adderall-XR
Example 3: DL-amphetamine (Q179452) → Evekeo
Example 4: DL-amphetamine (Q179452) → Adzenys-XR-ODT
Example 5: Adderall (Q935761) → Adderall
Example 6: Adderall (Q935761) → Adderall-XR
Example 7: methylphenidate (Q422112) → Ritalin
Example 8: methylphenidate (Q422112) → Ritalin-LA
Example 9: methylphenidate (Q422112) → Concerta
Example 10: Insulin Lispro (Q3492616) → Humalog
Source: https://www.goodrx.com
Planned use: Add the corresponding IDs to each Wikidata item from GoodRx
Number of IDs in source: ~6000
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://www.goodrx.com/$1
Robot and gadget jobs: Adding GoodRx IDs to Wikidata to permit use on Wikipedia
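
How a formatter URL turns a stored identifier into a link can be sketched as below. This is a simplified illustration: `format_url` is a hypothetical helper, the substitution is a plain string replacement of `$1`, and whether each resulting address resolves on goodrx.com is not checked here.

```python
# Formatter URL as given in the proposal; "$1" marks where the ID goes.
FORMATTER_URL = "https://www.goodrx.com/$1"


def format_url(identifier, formatter=FORMATTER_URL):
    """Substitute an external identifier into a formatter URL."""
    return formatter.replace("$1", identifier)


# IDs from Examples 1 to 4 of the proposal.
example_ids = ["Adderall", "Adderall-XR", "Evekeo", "Adzenys-XR-ODT"]
example_urls = [format_url(value) for value in example_ids]
```

Because the identifier is the only variable part, a template on Wikipedia would need just the stored ID values to build its links.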

Motivation

  Notified participants of WikiProject Medicine

I'd like to add GoodRx IDs to Wikidata to permit linking to GoodRx webpages through templates on Wikipedia. I intend to code a bot script to do this using pywikibot; I'm aware that I need to get the bot approved. GoodRx provides pharmacy price data and coupons for prescription drugs in the US. Seppi333 (Insert ) 11:43, 16 August 2019 (UTC)

Discussion

  • Support What is proposed here seems like routine integration of the identifiers from a popular medical database into Wikidata.
    Wikidata collects many names for what should be the same drug. We are still relying on all of these products by various names resolving as equivalent to one name. GoodRx is another layer on top of that, depending on the foundational assumption that all our data is correct. I question whether the information we have is correct. The pharma industry is playing many anti-consumer games by marketing all these various names into the marketplace. Such as things are, this mapping plan with GoodRx matches Wikidata's current quality and the best quality data that NIH and similar databases present for import and reconciliation with Wikidata. If we ever have separate Wikidata items for various product names, then we could easily split this GoodRx cataloging system into more specific name articles. Blue Rasberry (talk) 12:24, 16 August 2019 (UTC)
    The fact that two drugs share the same active ingredient doesn't mean that they are the same product. The way a drug is manufactured often has clinical effects.
    I don't believe that items about chemical compounds should link to items about individual product names. I would want items for the individual named products to be created if you want to add external ids of individual named products to items. Otherwise, I Support having the property. There should be a single value constraint. ChristianKl❫ 12:43, 16 August 2019 (UTC)
    The main reason I linked 2 items to "Adderall" and "Adderall-XR" is that on en-wiki, en:Adderall and en:Amphetamine both exist; the Adderall article is about a specific mixture of amphetamine enantiomers (1:3 levoamphetamine to dextroamphetamine) in clinical use, whereas the Amphetamine article is about the compound in general (i.e., 1:1 racemic and any enantiomeric mixtures of levo- and dextro-amphetamine). But, for what it's worth, DL-amphetamine (Q179452) already lists Adderall, Evekeo, and several other brands under a different property (active ingredient in (P3780)) pertaining to brands in which amphetamine is or was previously an active ingredient.
    Also, a single value constraint would almost entirely preclude the use I had in mind for this property (i.e., pulling the urls from Wikidata and linking to the corresponding GoodRx pages in Wikipedia templates). Seppi333 (Insert ) 13:32, 16 August 2019 (UTC)
    may help clarify why there's so much confusion. Multiple regulators. Multiple database systems. Multiple producers. Multiple labels. Multiple formulations. Multiple dosages. Seems almost like it was designed to confound multi-national studies. Anywho, different is different: we should be careful that we do not conflate referents through sloppy handling of identifiers. This could cause serious harm. If it is made glaringly clear that formulations may differ, there might be value in identifying "related" referents. LeadSongDog (talk) 19:02, 16 August 2019 (UTC)
    When it comes to adding external ids to Wikidata the first priority is to keep our order on Wikidata. Just because Wikipedia versions mix different concepts on the same page doesn't mean that we should do so as well.
    In those cases it might make sense to sooner or later mark in the templates on Wikipedia which concepts are actually covered by the pages.
    There's potential drama involved here by drawing links to thousands of pages of a for-profit unicorn startup and it seems to me like till now you haven't got a clear consensus from EnWiki that those links are considered welcome on EnWiki. Going through a bot request on EnWiki leaves less potential for drama afterwards.
    @Doc_James: What do you think here? ChristianKl❫ 09:05, 20 August 2019 (UTC)
  •   Support David (talk) 05:34, 17 August 2019 (UTC)
  • Can we also import some pricing data as well? More useful than just a link though of course more complicated to do. Doc James (talk · contribs · email) 09:44, 20 August 2019 (UTC)
    • @Doc James: When it comes to pricing data it seems to me even more important that it's the price for a specific drug and not the general compound. Importing pricing data might be copyright sensitive. It would make sense to ask GoodRx what they think about such an import.
      I think you have a good understanding of the opinions of the medical Wikipedians on Enwiki. How much potential for conflict for linking to a for-profit website like this in infoboxes do you see? How do you think the relevant consensus should be established over there? ChristianKl❫ 13:00, 20 August 2019 (UTC)
    • @Seppi333: did you have any contact with GoodRx about this import project and how they stand on it? ChristianKl❫ 13:00, 20 August 2019 (UTC)
      • What is the "specific drug" versus "general compound"? We specifically label medications by the INN and redirect all brands to generics (except when a brand is used for more than one separate medication)
        The price is a data point and not copyrightable. Doc James (talk · contribs · email) 13:37, 20 August 2019 (UTC)
        • When you buy a drug in a pharmacy you are not only getting the compound but a lot of other things in the same pill. Some compounds can be given orally as well as intravenously and are sold in different drug formulations.
          To the extent that we have links to generics, we could simply link to the generics in GoodRx. What benefit would we get from also linking to brand names on it? ChristianKl❫ 14:58, 20 August 2019 (UTC)
        • "We [...] redirect all brands to generics" No, we do not: Concerta (Q10868995); Adderall (Q935761); Ritalin (Q47521826). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:00, 23 August 2019 (UTC)
          • @Pigsonthewing: I don't see why a many-to-one linking is a problem. This will, in fact, be an m-to-n association between our database and theirs because we have additional IDs on some medications but not others (it's going to be a painstaking process for me to resolve all of the identifiers between the two databases, but I'm hoping I can get USANs (very likely) or INNs (probably) for the drug names, as I'd only have maybe 5-10% remaining from the union of both sets to sort out (e.g., pages on obscure brand and/or generic pharmaceuticals that are poorly linked on WD/WP, of which there are quite a few)). The whole point of this proposal is not to establish a 1-to-1 bijective association between two databases; it's about facilitating data utilization on other projects. In any event, the only reason that I am proposing the brand name is that (1), (2) the prototype brand of a particular chemical, and particularly a drug class, is at least as recognizable to the public - if not more so - than its generic name, (3) doctors often prescribe a brand name, with generic substitution being intended, because the drug product (i.e., dosage form, active ingredient, and its excipients) is inherently more clear to a pharmacist than prescribing the dosage form and the generic drug name, as that is not a drug product. Case in point, there is a qualitative but rather technical chemical difference in the dosage forms of Adderall XR and Mydayis (see table; the difference in cost between them is several hundred US dollars because one is off-patent and the other is not), and I have no idea how a doctor, much less a patient, would be able to distinguish between the two if a generic term were used. Seppi333 (Insert ) 06:09, 8 September 2019 (UTC)
      • It seems a bit premature for me to ask for their data without having established any form of consensus on WP to merit access to potentially proprietary information. That said, there isn't one fixed way of doing what I'm proposing, so I'm open to hearing what others have to say and, if there's consensus to do this only in a certain manner, I figured I'd broach the issue by making my request and stating those conditions upfront. If they're fine with that, great. If not, I could probably still get the data I need from a cloud-based NLP AI, but I probably wouldn't get all the relevant data (e.g., GoodRx/retail price, dosage form, dose, brand name, generic name, and/or other data items) on every drug in their database if that ends up being the only alternative. It would be a bit of a pain in the ass to go that route because they have brand name drugs and - assuming they're no longer patented - the corresponding generic(s) redirected to the same uri, so I'd be generating duplicate data on alternate identifiers through redirects. I'd probably have to delete the redundancies and save the corresponding identifiers from a web scrape after-the-fact rather than preclude writing that data because I don't think a web-scraping AI would be programmed with niche functionality like that. Seppi333 (Insert ) 06:09, 8 September 2019 (UTC)
  •   Oppose "Ritalin", for example, is an identifier for Ritalin (Q47521826), not methylphenidate (Q422112). (Also, FYI, I get the error "GoodRx is not available outside of the United States." when trying to access the site.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:05, 23 August 2019 (UTC)
  •   Oppose as long as this is intended to be used in items about 'chemical substances'. We already have many problems with linking between 'chemical substance' ↔ 'active ingredient' ↔ 'pharmaceutical product'. IDs that described specific pharmaceutical products (brands) should be added to items that describe specific product/brand, not to the items about chemical substances nor about active ingredients. Wostr (talk) 16:33, 15 June 2020 (UTC)
  •   Oppose Given the intent in using it for items about 'chemical substances'. ChristianKl❫ 00:57, 7 December 2020 (UTC)
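
The disagreement over a single value constraint versus an m-to-n mapping can be made concrete with the ten examples from the proposal. The sketch below is an illustration only: `violations` is a hypothetical helper, and it mimics the idea of single-value and distinct-values checks rather than Wikidata's actual constraint implementation.

```python
from collections import defaultdict

# (item, GoodRx ID) pairs copied from Examples 1 to 10 of the proposal.
STATEMENTS = [
    ("DL-amphetamine (Q179452)", "Adderall"),
    ("DL-amphetamine (Q179452)", "Adderall-XR"),
    ("DL-amphetamine (Q179452)", "Evekeo"),
    ("DL-amphetamine (Q179452)", "Adzenys-XR-ODT"),
    ("Adderall (Q935761)", "Adderall"),
    ("Adderall (Q935761)", "Adderall-XR"),
    ("methylphenidate (Q422112)", "Ritalin"),
    ("methylphenidate (Q422112)", "Ritalin-LA"),
    ("methylphenidate (Q422112)", "Concerta"),
    ("Insulin Lispro (Q3492616)", "Humalog"),
]


def violations(pairs, key_index):
    """Group pairs by one side and keep the keys mapped to several values.

    key_index=0 groups by item (a single-value style check);
    key_index=1 groups by ID (a distinct-values style check).
    """
    grouped = defaultdict(set)
    for pair in pairs:
        grouped[pair[key_index]].add(pair[1 - key_index])
    return {key: sorted(values) for key, values in grouped.items() if len(values) > 1}


items_with_several_ids = violations(STATEMENTS, 0)
ids_on_several_items = violations(STATEMENTS, 1)
```

On the proposal's own examples, three of the four items carry more than one ID and two IDs appear on more than one item, which is why the proposer expects a single value constraint to rule out the intended use.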

EudraCT trial ID

   Ready Create
Description: identifier for a trial in the EU Clinical Trials Register, the clinical trial registry of the European Medicines Agency
Data type: External identifier
Domain: clinical trial (Q30612)
Example 1: A Phase 1/2 Study Evaluating ABT-751 in Combination With Alimta in Advanced Non-Small Cell Lung Cancer (Q65537971) → 2006-002830-38
Example 2: Growth Hormone Treatment in Children Born Small for Gestational Age (SGA) (Q66024900) → 2017-000914-47
Example 3: Treatment of Children and Adolescents With Growth Failure Associated With Primary IGF-1 Deficiency (Q65390763) → 2019-000844-81
Example 4: 2020-001271-33/FR
Planned use: interesting in the context of articles describing clinical trials during the COVID-19 pandemic
Number of IDs in source: 37778
Expected completeness: always incomplete (Q21873886)
Formatter URL: https://www.clinicaltrialsregister.eu/ctr-search/search?query=$1
See also: ClinicalTrials.gov Identifier (P3098) and OpenTrials ID (P6220)
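
The proposal gives no format constraint, so the sketch below infers one from the four examples: four digits, six digits, two digits, with an optional two-letter country suffix as in Example 4. The pattern is an assumption drawn from those examples, not an official specification, and `parse_eudract` and `search_url` are hypothetical helper names.

```python
import re

# Pattern inferred from the examples in this proposal (an assumption):
# YYYY-NNNNNN-NN, optionally followed by "/" and a two-letter country code.
EUDRACT_RE = re.compile(r"(\d{4}-\d{6}-\d{2})(?:/([A-Z]{2}))?")
FORMATTER_URL = "https://www.clinicaltrialsregister.eu/ctr-search/search?query=$1"


def parse_eudract(value):
    """Return (trial number, country suffix or None), or None if no match."""
    match = EUDRACT_RE.fullmatch(value)
    if match is None:
        return None
    return match.group(1), match.group(2)


def search_url(value):
    """Build the register search URL from the trial number without suffix."""
    parsed = parse_eudract(value)
    if parsed is None:
        raise ValueError(f"not a recognised EudraCT number: {value!r}")
    return FORMATTER_URL.replace("$1", parsed[0])
```

Separating the country suffix from the trial number is one way to decide whether values such as the one in Example 4 should be stored with or without the suffix.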

Motivation

I apologize for this submission being so incomplete. Maybe somebody else will be interested and fill in all the missing information. This is a bit technical for me. Lewisiscrazy (talk) 17:27, 19 April 2020 (UTC)

Property for the EU Clinical Trials Register (Q96183789) identifier, as used on the https://clinicaltrialsregister.eu/ website of the European Medicines Agency (Q130146). It complements the ClinicalTrials.gov Identifier (P3098) and the OpenTrials ID (P6220). --Egon Willighagen (talk) 15:29, 23 August 2020 (UTC)

Discussion

number of deaths in senior care homes

   Under discussion
Description: total (cumulative) number of people who died in senior care homes, since start or as a direct result of an event or cause
Represents: death toll (Q65096341)
Data type: Quantity
Template parameter: not yet but could be added as a new field e.g. in the outbreak infobox: https://en.wikipedia.org/wiki/Template:Infobox_outbreak
Domain: pandemics e.g. Q81068910, measures or observations e.g. Q193181
Allowed values: quantity (positive integer)
Example 1: COVID-19 pandemic (Q81068910) → 10560 (in France, at 2020-08-18, provided by https://github.com/opencovid19-fr/data)
Example 2: influenza (Q2840) → 1000
Example 3: heat wave (Q215864) → 555
Planned use: France consolidates COVID19 death numbers by splitting global deaths and those specifically in senior care homes (EHPAD). We want to use this property to distinguish this in the reports per department, per region etc.
See also: number of deaths (P1120)

Motivation

The motivation comes from the way France consolidates daily figures of cases related to the COVID19 pandemic: it makes the distinction between the global number of deaths and the number of deaths specifically in senior care homes (called EHPAD in France). Health care services need this distinction to assess the effectiveness of security measures set up in those places.

I assume this distinction probably makes sense in other countries, although I have no evidence to support this. Hence the generic term "care homes for seniors" rather than the specific French EHPAD term.

The property could equally be applied to any dramatic event such as an earthquake, heat wave, etc.

Discussion

  •   Oppose The examples don't look like they make sense. ChristianKl❫ 21:25, 24 August 2020 (UTC)
@ChristianKl: Can you be more specific? Maybe I wrote it in the wrong way (I'm not used to the syntax in these pages, sorry about that). The point is to denote cases such as:
COVID-19 pandemic (Q81068910) entailed 1000 deaths in senior care homes. The same property can be used for the seasonal influenza or for a heat wave: for instance, the 2003 heat wave (https://en.wikipedia.org/wiki/2003_European_heat_wave) killed 14,802 people in France, mostly among the elderly.
Does it make more sense this way?
2003 European heat wave (Q1136557) is a different kind of item than heat wave (Q215864). ChristianKl❫ 16:19, 27 August 2020 (UTC)
You are right, 2003 European heat wave (Q1136557) is an instance of heat wave (Q215864) which is not an event in itself but a more general concept. I can update the example with 2003 European heat wave (Q1136557) instead of heat wave (Q215864), this will make more sense. Are you ok with that? --- Franck Michel 16:40, 27 August 2020 (UTC)
Samples are generally real statements that could be added to Wikidata. --- Jura 19:39, 27 August 2020 (UTC)
I'm not sure I understand what you mean. Can you elaborate please? — Franck Michel (talk) 11:39, 7 September 2020 (UTC)
"1000" and "555" look like made-up numbers. Please add references for all three statements. --- Jura 18:39, 21 September 2020 (UTC)
These are supposed to be *examples*, not actual facts ready to be published in Wikidata. So yes indeed, these are made up, yet does this hamper thinking about whether this property is worthwhile or not? I added a valid figure to the Covid19 pandemic. Franck Michel (talk) 07:09, 22 September 2020 (UTC)
Proposals almost always include three actual samples. We tend to not create the ones that don't have that. --- Jura 07:15, 22 September 2020 (UTC)
@Franck Michel: Property proposals are about thinking how certain facts can be modeled in Wikidata. For that it's important to have examples of actual facts. ChristianKl❫ 14:38, 22 September 2020 (UTC)
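
What reviewers mean by an "actual sample" can be sketched as a statement that carries its context and a source. This is a rough illustration under assumptions: the `Statement` class and `is_usable_sample` check are hypothetical, the proposed property has no ID yet, and the use of country (P17) and point in time (P585) as qualifiers is one possible modeling, not something agreed in this discussion.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Statement:
    """Hypothetical in-memory stand-in for one Wikidata statement."""
    subject: str
    prop: str
    value: int
    qualifiers: dict = field(default_factory=dict)
    reference_url: Optional[str] = None


def is_usable_sample(statement):
    """A sample counts here only if it is a positive integer with a source."""
    return (
        isinstance(statement.value, int)
        and statement.value > 0
        and statement.reference_url is not None
    )


PROPOSED = "number of deaths in senior care homes (proposed, no ID yet)"

# Example 1 from the proposal, with its context moved into qualifiers.
covid_france = Statement(
    subject="COVID-19 pandemic (Q81068910)",
    prop=PROPOSED,
    value=10560,
    qualifiers={"country (P17)": "France", "point in time (P585)": "2020-08-18"},
    reference_url="https://github.com/opencovid19-fr/data",
)

# Example 2 from the proposal: a placeholder figure without any source.
influenza_placeholder = Statement(
    subject="influenza (Q2840)",
    prop=PROPOSED,
    value=1000,
)
```

Only the first statement passes the check, which mirrors the objection that the figures "1000" and "555" cannot be verified.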

number of cases in intensive care

   Under discussion
Description: number of people in intensive care services since start or as a direct result of an event or cause
Represents: intensive care medicine (Q679690)
Data type: number (positive integer) - invalid datatype (not in Module:i18n/datatype)
Domain: pandemic (Q12184), measures or observations e.g. observation (Q193181)
Allowed values: quantity (positive integer)
Example 1: COVID-19 pandemic (Q81068910) → 1000
Example 2: influenza (Q2840) → 1000
Example 3: heat wave (Q215864) → 555
Planned use: Count the number of people in intensive care following a certain event such as the COVID19 pandemic
See also

Motivation

Wikidata already has useful properties to report statistics on the spreading of the COVID19 pandemic: number of hospitalized cases (P8049), number of cases (P1603), number of deaths (P1120), number of recoveries (P8010).

Similarly, health care services also need to track the number of people in intensive care so as to assess the criticality of the crisis.

The property could equally be applied to any other disease or event such as a heat wave.

Discussion

Same comment, I'm sorry I don't get your point. Can you elaborate please? — Franck Michel (talk) 11:41, 7 September 2020 (UTC)

DeCS ID

   Ready Create
Description: identifier in the Health Sciences Descriptors thesaurus
Represents: Health Sciences Descriptors (Q5690673)
Data type: External identifier
Domain: item
Allowed values: [1-9]\d*
Example 1: anhedonia (Q545365) → 54515
Example 2: dutasteride (Q424760) → 55943
Example 3: abdomen (Q9597) → 5
Example 4: Gammacoronavirus (Q16977225) → 56930
Source: https://decs.bvsalud.org/I/DeCS2020_Alfab.htm
Mix'n'match: 4099 and 4100, but they are incomplete.
Planned use: Addition of the identifier to Wikidata items.
Number of IDs in source: 34118 (as of 2020)
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://decs.bvsalud.org/en/ths/resource/?id=$1
Distinct values constraint: Yes
Wikidata project: WikiProject Medicine (Q4099686)
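
The allowed-values pattern can be checked against the proposal's own examples with a short sketch. It assumes the pattern must match the whole value, which is how a format constraint is normally read; `is_valid_decs_id` is a hypothetical helper name.

```python
import re

# Allowed values as given in the proposal: a positive integer, no leading zero.
ALLOWED_VALUES = re.compile(r"[1-9]\d*")


def is_valid_decs_id(value):
    """True if the whole string matches the allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(value) is not None


# IDs from Examples 1 to 4 of the proposal.
example_ids = ["54515", "55943", "5", "56930"]
all_examples_valid = all(is_valid_decs_id(value) for value in example_ids)
```

Under this reading a zero-padded value such as "054515" is rejected, so an import from the source would have to strip any padding before adding statements.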

Motivation

  Notified participants of WikiProject Medicine

Health Sciences Descriptors (Q5690673) (DeCS) is an open multilingual thesaurus (in Portuguese, Spanish, French and English) created by BIREME (Latin American and Caribbean Center on Health Sciences Information) (Q6496267) and originally developed from the Medical Subject Headings (Q199897) thesaurus by the United States National Library of Medicine. I believe this would be a nice addition to the authority control of the related Wikipedias. -sasha- (talk) 02:00, 14 January 2021 (UTC)

Discussion

  •   Support. However, I'd suggest using the mapping relation type (P4390) qualifier for at least some chemistry-related entries (and maybe others too), as in the case of MeSH ID, since there may be entries that are not 100% equivalent to WD entries. Examples can be easily found in DeCS too: Hydrocarbons is not equivalent to our hydrocarbon (Q43648), because the DeCS entry also covers derivatives of hydrocarbons (i.e. a lot of compounds that are in fact not hydrocarbons). Wostr (talk) 10:54, 14 January 2021 (UTC)

  Notified participants of WikiProject Medicine

Mineralogy

Please visit Wikidata:WikiProject Mineralogy for more information. To notify participants use {{Ping project|Mineralogy}}

Computer science

Please visit Wikidata:WikiProject Informatics for more information. To notify participants use {{Ping project|Informatics}}


Geology

Please visit Wikidata:WikiProject Geology for more information.

Geography

Mathematics

Please visit Wikidata:WikiProject Mathematics for more information. To notify participants use {{Ping project|Mathematics}}

Material

Please visit Wikidata:WikiProject Materials for more information. To notify participants use {{Ping project|Materials}}

Glaciology

All

Nutrition