Wikidata:WikiProject Chemistry/Proposal:Models

Chemistry is an interesting field with many concepts. We have pure chemical compound (Q11173), chemical substance (Q79529), mixture (Q169336), ion (Q36496), chemical element (Q11344), and much, much more. Even chemical compounds are hard: Wikipedia (Q52) has entries for fully stereochemically defined compounds, for racemic mixtures, for compound classes (like fatty acid (Q61476)), and for compounds with undefined stereochemistry.

The WikiProject Chemistry only defines guidance for pure chemical compounds: "each pure chemical substance (i.e. not mixtures or solutions) the property instance of (P31) with the value chemical compound (Q11173)". However, it does not provide guidance on how to model many of the other topics, leading to questions:

(and many more)

On this page I am proposing some models (and I will want to use ShEx for this, see these WikidataCon slides) to start a discussion to formalize the approaches we want to use. This matters (lol) to us because WikiPathways (Q7999828) contains a lot of compound classes (like fatty acid (Q61476)). For this reason I have started aspects for Scholia (Q45340488) to visualize a specific chemical and (still a pull request) chemical classes. However, as Scholia uses the Wikidata SPARQL endpoint, it benefits from consistently structured data.

Chemical Compounds

All chemicals having the following properties have to be considered as instance of (P31):chemical compound (Q11173) or instance of a subclass of chemical compound (Q11173):

  • constant chemical composition
  • composed of several elements
  • no global electric charge
  • fully defined stereochemistry (i.e. cis/trans or E/Z configurations, o-, p-, m- configurations, D/L configurations, R/S configurations, endo/exo configurations)
  • atoms or groups of atoms are linked by covalent, ionic, metallic or coordinate covalent bonds
  • can be isolated in pure form and is stable enough to allow measurement of chemical and physical properties such as melting point → this is controversial: hypothetical compounds, and compounds that cannot be isolated in pure form but are known from their derivatives, would be excluded. Cf. en:Category:Hypothetical chemical compounds. Wostr (talk) 21:39, 9 January 2018 (UTC)

This definition includes neutral salts, stable radicals like nitric oxide (radical) (Q207843), some hydrates and some coordination complexes.


Particular cases

A) Chemicals with incompletely defined stereochemistry have to be classified as subclass of (P279):chemical compound (Q11173) and not as instance of (P31):chemical compound (Q11173)


B) Chemicals composed of only one chemical element have to be classified as instance of (P31):simple substance (Q2512777) and not as instance of (P31):chemical compound (Q11173)
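The criteria and the two particular cases above can be read as a small decision rule. The following is a minimal Python sketch of that reading, not an agreed model: the `Chemical` record and its field names are hypothetical and only stand in for facts an editor would already know about the item, while the P and Q identifiers are the ones quoted in the text. The criteria on bond types and isolability are left out because they are hard to express as simple flags.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Identifiers quoted from the text above.
P_INSTANCE_OF = "P31"
P_SUBCLASS_OF = "P279"
Q_CHEMICAL_COMPOUND = "Q11173"
Q_SIMPLE_SUBSTANCE = "Q2512777"


@dataclass
class Chemical:
    """Hypothetical record; field names are illustrative, not Wikidata properties."""
    n_elements: int                    # number of distinct chemical elements
    net_charge: int = 0                # global electric charge
    constant_composition: bool = True  # False for mixtures and solutions
    stereo_fully_defined: bool = True  # False if any stereocentre is unspecified


def propose_statement(chem: Chemical) -> Optional[Tuple[str, str]]:
    """Return a (property, value) pair, or None if the rules above do not cover the item."""
    if not chem.constant_composition or chem.net_charge != 0:
        return None  # mixtures and ions are not covered by this section
    if chem.n_elements == 1:
        return (P_INSTANCE_OF, Q_SIMPLE_SUBSTANCE)   # particular case B
    if not chem.stereo_fully_defined:
        return (P_SUBCLASS_OF, Q_CHEMICAL_COMPOUND)  # particular case A
    return (P_INSTANCE_OF, Q_CHEMICAL_COMPOUND)      # general rule
```

For example, a two-element neutral compound with defined stereochemistry gets `("P31", "Q11173")`, while the same record with `stereo_fully_defined=False` gets `("P279", "Q11173")`.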


Propositions by Wostr

  1. Treat all chemical compounds as portions of matter and classify under chemical compound (Q11173) (which would be subclass of (P279) = chemical substance (Q79529)). Thus, all chemical compounds would be classes. Chemical compound = chemical substance composed of electrically neutral molecular entities made of atoms of at least two chemical elements.
    However, this approach has some inconsistencies, but IMHO it is closer to what we are trying to collect in WD (substances with their properties and uses [e.g. by classification in drug classes], and not just entities).
  2. A more consistent approach is to adopt the ChEBI classification. As every chemical compound would be treated as a molecule (entity), not as a portion of matter (substance), the definition would be: electrically neutral molecular entity made of atoms of at least two chemical elements. 'Water molecule' would be equal to 'water', so only one item would be necessary. But: items for chemical compounds would then carry many properties that pertain only to chemical substances (like surface tension, vapor pressure, safety classification and many others). And the second, bigger BUT: chemical elements cannot be treated as 'entities' because 'chemical element' includes all the isotopes, all the allotropic forms etc.

(Racemic) Mixtures

Because the mixture does not generally define the ratio of the amount of the components, I opt for defining it as a class.



(**) Which would also indicate that the compound is a subclass of (P279) chemical substance (Q79529)

Compound Classes

Wikidata supports properties for two important databases that define compound classes: ChEBI (Q902623) (ChEBI ID (P683)) and LIPID MAPS (Q20968889) (LIPID MAPS ID (P2063)).
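To see which items currently carry either identifier, one could build a query for the Wikidata SPARQL endpoint. The sketch below only assembles the query text; the function name `class_identifier_query` is hypothetical, and the query does not by itself tell compound classes apart from individual compounds, since both databases also contain the latter.

```python
def class_identifier_query(limit: int = 10) -> str:
    """Build a SPARQL query for items with a ChEBI ID (P683) or a LIPID MAPS ID (P2063)."""
    # The wdt: prefix is predefined on the Wikidata Query Service.
    return f"""
SELECT ?item ?chebi ?lipidmaps WHERE {{
  {{ ?item wdt:P683 ?chebi . }}
  UNION
  {{ ?item wdt:P2063 ?lipidmaps . }}
}}
LIMIT {int(limit)}
""".strip()


if __name__ == "__main__":
    # Paste the printed text into the query service to run it.
    print(class_identifier_query(5))
```

Adding a pattern on instance of (P31) or subclass of (P279) would narrow the result once a model for compound classes has been agreed on.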



Proposition by Wostr


Every item about a compound class should have instance of (P31) with one of the metaclasses from above.

Chemical Elements

See Wikidata:Elements WikiProject