Wikidata:Property proposal/standard atomic weight

standard atomic weight

Originally proposed at Wikidata:Property proposal/Natural science

   Not done
Description: Is a relative atomic mass (Q41377) of chemical elements for terrestrial samples (sources), as defined (specified) by the Commission on Isotopic Abundances and Atomic Weights (Q15647945) (CIAAW)
Represents: standard atomic weight (Q28912964)
Data type: Number (not available yet)
Domain: chemical element (Q11344) (84 out of 118)
Allowed values: (\d+.\d+\(\d+\)|\[\d+.\d+, \d+.\d+\])
Example:
Format and edit filter validation: see allowed values
Source: CIAAW, technical report 2013
Planned use:
  1. add to the chemical element items.
  2. check with current mass (taken from PubChem database)
Robot and gadget jobs: Present PubChem retrieved values seem outdated, incorrect
See also: proposal/conventional atomic weight
Motivation

Various quantity names and values are used for the atomic weight of chemical elements. The quantity 'relative atomic mass' (Ar) is often used, but it is generic. Over time, the CIAAW (an IUPAC commission) has improved the measurements and methods, and it publishes a more specific, well-defined value named 'standard atomic weight'. CIAAW only uses terrestrial, natural sources/samples. Their published values are useful for most materials found or created on Earth. The standard atomic weight has great authority and is widely used. For consistency throughout, it is advisable that Wikidata and its consumers use the same, authorised mass definition, name and value. Per SI, one could write the quantity symbol as Ar, CIAAW or Ar, standard. The CIAAW values may be updated every two years (2017, 2019, ...). DePiep (talk) 19:40, 7 March 2017 (UTC)

Discussion
  •   Support Gstupp (talk) 19:55, 7 March 2017 (UTC)
  •   Support we're in the middle of a discussion about chemical elements vs atoms; this is a property that applies to the bulk (average), not to individual atoms, so it's of interest to that discussion also... ArthurPSmith (talk) 17:11, 8 March 2017 (UTC)
  •   Support As property on chemical element (Q11344). --Egon Willighagen (talk) 08:31, 13 March 2017 (UTC)
  •   Support Walkerma (talk) 02:47, 9 March 2017 (UTC)
  •   Oppose It is no longer clear what is being proposed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:26, 9 March 2017 (UTC)
  •   Comment So the only opposer has no content argument. Well then. -DePiep (talk) 21:12, 1 April 2017 (UTC)
  •   Oppose. The trio of proposals is confused and duplicative of mass (P2067). The proposals want the quantity to be a pure ratio (i.e., number), but that ratio is just the [atomic mass]/[some standard atomic mass unit]. In other words, the unit of measurement for atomic mass is [some standard atomic mass unit]. mass (P2067) can use dalton (Q483261) (u or Dalton). The topic is about a physical quantity, so having units attached to the value is a plus; a normalized value query (psn:) should work. We don't want one dimensionless property for temperature ratio to one °R and another for temperature ratio to one Kelvin. Another fracture line of the proposals is tying the properties to how the numbers are specified. The "standard atomic weight" proposal is supposed to be an interval, the "conventional atomic weight" proposal is supposed to be a single number derived from "standard atomic weight", and "relative atomic mass" is the "standard atomic weight" for nonstandard situations. Quantities already do that. For the first case, one can query P2067 for wikibase:quantityLowerBound and wikibase:quantityUpperBound to get an interval. For the second case, use wikibase:quantityAmount to get a single value. For the third case, there is no requirement that P2067 must be restricted to a particular source's definition. Yes, there is confusion about what upper and lower bounds may mean (what confidence level), but that problem exists throughout the project. And yes, there is a distinction between an average atomic mass of a sample and the mass of actual atoms. These proposals don't address that population issue. Glrx (talk) 20:09, 5 August 2017 (UTC)
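The query Glrx describes could look roughly like the sketch below, which only builds the SPARQL text and does not contact any server. It assumes the standard prefixes of the Wikidata Query Service (wd:, p:, psv:, wikibase:) and uses Q556 as the item ID for hydrogen; the helper name bounds_query is made up for illustration.

```python
def bounds_query(item_qid, prop_pid="P2067"):
    """Return SPARQL text asking for amount, bounds and unit of a quantity statement."""
    return f"""
SELECT ?amount ?lower ?upper ?unit WHERE {{
  wd:{item_qid} p:{prop_pid}/psv:{prop_pid} ?node .
  ?node wikibase:quantityAmount ?amount ;
        wikibase:quantityUnit ?unit .
  # Bounds are optional: a statement may carry only an amount.
  OPTIONAL {{ ?node wikibase:quantityLowerBound ?lower ;
                    wikibase:quantityUpperBound ?upper . }}
}}
""".strip()

# Q556 is assumed here to be the item for hydrogen.
query = bounds_query("Q556")
print(query)
```

Whether the bounds returned this way can be read as a CIAAW interval rather than an uncertainty is exactly the open question in the discussion; the query itself cannot tell the two apart.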

Datatype issues

  • It's not clear what is meant by the two examples, which use a different format. Neither is a single number, as the proposed datatype suggests. Please clarify. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:21, 7 March 2017 (UTC)
    • The Datatype only recognises 'number', not any specification. So I left it at that. The two examples show the two patterns CIAAW publishes. We'll have to accommodate those somehow. My first try for the pattern: (\d+.\d+\(\d+\)|\[\d+.\d+, \d+.\d+\]).
      1. I reckon the first one is a single number, including an uncertainty as is common in physics. I don't mind the way of counting, as long as Wikidata can treat it as a number for future calculations (e.g. for molar mass & uncertainty).
      2. CIAAW gives no value for 34 unstable radioactive elements (think: 43Tc, 61Pm and roughly from 84Po, polonium, up). -DePiep (talk) 21:14, 7 March 2017 (UTC)
      • That's no clearer. A number-type property has quantitative values like "123", "6.78" or "-23.89". They may have a precision, such as "2765+-2". Are you proposing a property that takes as its value strings like "4.002602(2)" and "[1.00784, 1.00811]"? If not, what exact values would it take in those cases? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:20, 7 March 2017 (UTC)
This is the clearest I can get. It is what CIAAW publishes, and their Technical report 2013 refers to the BIPM VIM (www.bipm.org/en/publications/guides/vim) for value notation. If there are limits to Wikidata possibilities - please say so. -DePiep (talk) 21:47, 7 March 2017 (UTC)
@Pigsonthewing: the (2) is standard notation for +- 2 in the last digit, it would have to be entered in Wikidata as 4.002602 +- 0.000002. @DePiep: I assume the second example means there are two values given; these would have to be entered as two *separate* claims for this value, it can't be done on one line. If it is instead just a range, then it can be done with +- notation about the mid-point value. ArthurPSmith (talk) 17:11, 8 March 2017 (UTC)
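The reading of the "(2)" notation given above can be sketched in a few lines of Python. This is only an illustration, not something CIAAW or Wikidata provides: the function name parse_concise is made up, and the pattern escapes the decimal point, which the proposed pattern leaves unescaped.

```python
import re
from decimal import Decimal

# Concise notation: the digits in parentheses apply to the last decimal digit(s).
CONCISE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)$")

def parse_concise(text):
    """Return (value, uncertainty) as Decimals for a string like '4.002602(2)'."""
    match = CONCISE.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not in concise notation: {text!r}")
    whole, frac, unc = match.groups()
    value = Decimal(f"{whole}.{frac}")
    # Shift the parenthesised digits to the position of the last decimal digit.
    uncertainty = Decimal(unc).scaleb(-len(frac))
    return value, uncertainty

value, uncertainty = parse_concise("4.002602(2)")
print(value, "+-", uncertainty)
```

Decimal is used instead of float so that the published digits, including trailing zeros, are kept exactly as written.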
The note on "(+/-2)" is correct. Then, the [d, d] notation is an interval (that is the correct word) of values (borders included), explicitly without any statistical indication. The mid-value is no more probable than any other value in the interval. I'd say writing it with +- in WD would suggest an uncertainty, which CIAAW wants to avoid. (Background: CIAAW uses actual samples/sources from the Earth. Through their radioactive history those can vary for a given element. A crude example by me: hydrogen in air can have developed differently (radioactively speaking) than hydrogen in minerals or oceans. So CIAAW finds different mixes of isotopes in the real-life sources for hydrogen. This is presented as an interval. OTOH, the regular numbers are narrow averages, as one would more readily expect.)
Now how to handle this? Having an actual number (with uncertainty) is great because WD users (that could be automated external 'infoboxes') can use that to calculate the molecular mass of a chemical compound, preferably even handling the uncertainties right.
I assume we cannot have two properties for the same formal quantity. Is there a WD-internal option to say "use Prop-A when present (the numeric one), else Prop-B (the textual interval one)"? We cannot leave that to programmers (like Lua in enwiki).
For now, I cannot think of anything better than: make it a string (regex controlled), and the reader will have to extract the number, uncertainty or border from it before any calculating (or re-formatting) can be done. Ouch. Of course, whatever WD does: any calculation will always have to make a choice on how to use the interval borders (produce two values? use the midvalue?).
A secondary solution: CIAAW also publishes a 'conventional' simple value for these interval'ed elements (12/84); this is also proposed as a property. These are single-number values, a little less perfect and without uncertainty, e.g. for commercial and trade usage (not for pharma and sciences, one would say). So, any value-extracting WD user could say: "IF Property 'conventional value' exists THEN use that one to avoid the interval, ELSE use the standard value". This would leave unsolved that those single-number values are still in this same textual property, so they cannot be numeric.
TL;DR: 1. CIAAW published values are sacred, including the interval form. Better not to rewrite them; if needed, write them as a literal text string... 2. Single numbers would be very useful as numbers, e.g. for calculations (72/84 elements). 3. Interval-value elements (12/84) should also get the single-number 'conventional atomic weight' (property), also published by CIAAW. This value can be used in less-extreme calculations. -DePiep (talk) 18:12, 8 March 2017 (UTC)
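The "IF conventional value exists THEN use it, ELSE use the standard value" rule could be written by a consumer roughly as follows. This is a sketch under assumptions: the dictionary layout and the function name weight_for_calculation are invented for illustration, and the hydrogen and helium figures are the ones quoted in this discussion.

```python
from decimal import Decimal

def weight_for_calculation(standard, conventional=None):
    """Pick a single number for arithmetic such as a molar mass.

    standard is either {"value": ..., "uncertainty": ...} or
    {"lower": ..., "upper": ...}; conventional is an optional Decimal.
    """
    if conventional is not None:
        # A conventional value exists: use it and avoid the interval.
        return conventional
    if "value" in standard:
        return standard["value"]
    # An interval without a conventional value: no number is privileged.
    raise ValueError("interval given without a conventional value")

hydrogen = weight_for_calculation(
    {"lower": Decimal("1.00784"), "upper": Decimal("1.00811")},
    conventional=Decimal("1.008"),
)
helium = weight_for_calculation(
    {"value": Decimal("4.002602"), "uncertainty": Decimal("0.000002")}
)
print(hydrogen, helium)
```

The function deliberately refuses to invent a midpoint for a bare interval, since the interval carries no statistical meaning.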
@ArthurPSmith: in Special:ListDatatypes I see datatype "Quantity" with 'lower bound' and 'upper bound' (next to 'amount'). Q1: is this how the "number±number" that I see in those live item properties is generated? Q2: are there pros and cons when we use these for an interval? Can we leave the 'amount' empty? -DePiep (talk) 15:41, 9 March 2017 (UTC)
@DePiep: I don't think you can leave "amount" empty, however that might be a matter for testing. Certainly you can't do that using the current Wikidata user interface, but it might be possible via an API call. I don't know how the UI would display such a thing. But there is no need to treat the +- represented here as an "uncertainty" of any standard sort, it could represent a flat distribution for a range, just with an appropriate value for the uncertainty corresponds to (P2571) qualifier. See for example the half-life statement on caesium-117 (Q1940002) for how it's been used. Instead of "standard deviation" I would put "range" or something like that as the value. ArthurPSmith (talk) 16:47, 9 March 2017 (UTC)
* Benjaminabel (talk) 10:25, 9 March 2017 (UTC)
I don't think it's necessary to create two properties for standard atomic weight; we could simply store ranges and values with uncertainty differently, like this:
  • Hydrogen: Ar(H) = [1.007 84, 1.008 11] since 2009, stored as quantity: value: 1.008, lowerBound: 1.00784, upperBound: 1.00811
  • Helium: Ar(He) = 4.002 602(2) since 1983, stored as quantity: value: 4.002602, lowerBound: 4.002600, upperBound: 4.002604
This first states that we can set the upper and lower values of the interval in Datatype Quantity. Secondly, it assumes we can (or must?) add the amount value, which may not be the midvalue and may even be outside of the interval (cases: conventional value for oxygen, thallium).
First requirement (a must): In every situation, we require that a value is presented as it is defined. Whether it is a value with uncertainty or an interval, there must be two different forms available (one is used per element). By its definition, it is not acceptable that an interval is presented as a single-value-with-uncertainty. We cannot compromise on the formal value published.
Second requirement (a would-be-nice): We could consider adding the conventional value for interval values (1.008 for hydrogen; 12 such elements). If this can be done, fine. However, this may not compromise requirement #1. This conventional value may not obscure the standard interval value. This asks for a different way of fetching, right?
Can the datatype Quantity serve this? (how can we test this?) -DePiep (talk) 15:57, 10 March 2017 (UTC)
@DePiep: yes the datatype supports this, although the web UI currently does not. You would need to interact with it using the API (there are Python, Java, etc. libraries for this). test.wikidata.org exists precisely for testing things like this. ArthurPSmith (talk) 17:17, 13 March 2017 (UTC)
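For anyone trying this on test.wikidata.org, the quantity values discussed above might be assembled along these lines before being handed to an API client. This is a sketch only: it builds the JSON and sends nothing, the helper names are made up, the field names follow the Wikibase quantity JSON as I understand it (amounts as signed strings, unit "1" for a dimensionless number), and the helium bounds apply the +- 0.000002 reading of "(2)".

```python
import json
from decimal import Decimal

def signed(number):
    """Format a Decimal with an explicit sign, e.g. '+1.008'."""
    return f"{number:+f}"

def quantity_datavalue(amount, lower, upper, unit="1"):
    """Build a quantity datavalue dict; unit "1" is assumed to mean dimensionless."""
    return {
        "type": "quantity",
        "value": {
            "amount": signed(amount),
            "unit": unit,
            "lowerBound": signed(lower),
            "upperBound": signed(upper),
        },
    }

# Hydrogen: interval bounds, with the conventional value as the amount.
hydrogen = quantity_datavalue(Decimal("1.008"), Decimal("1.00784"), Decimal("1.00811"))
# Helium: value with uncertainty, expressed as symmetric bounds.
helium = quantity_datavalue(Decimal("4.002602"), Decimal("4.002600"), Decimal("4.002604"))
print(json.dumps(hydrogen, indent=2))
```

Note that both elements end up in the same shape, so the structure alone does not say whether the bounds are an interval or an uncertainty; a qualifier such as uncertainty corresponds to (P2571) would still be needed to satisfy the first requirement.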

Orthographic issues

How does this property relate to 'chemical element', and to 'relative atomic mass'? Part of, instance of? -DePiep (talk) 21:46, 9 March 2017 (UTC)

Overview

Developing. -DePiep (talk) 21:46, 9 March 2017 (UTC)
related: Q28919928 -- note: "abridged", not "rounded" -DePiep (talk) 00:02, 10 March 2017 (UTC)

@DePiep, ArthurPSmith, Gstupp, Benjaminabel: @Pigsonthewing, Glrx, Gstupp, Willighagen, Walkerma: Not done, given that this proposal has gotten stale. If you are still interested in a property for this purpose, feel free to create a new property proposal. ChristianKl (talk) 18:39, 1 November 2017 (UTC)