Wikidata:WikiProject Chemistry/Tools
Classification trees
To make the best use of Wikidata, several classifications have to be applied to each chemical compound:
- All pure components are defined by the property instance of (P31) with value chemical compound (Q11173).
- All pure components are defined by their elemental composition (properties still missing).
- All pure components are defined by their functional groups (property and list of functional groups used in that classification still missing).
Chemical element
- Mandatory classification:
- Each element is an instance of (P31) chemical element (Q11344).
- Each element is part of a group (Q83306).
- Each element is part of a period (Q101843).
- Each element is a subclass of isotope (Q25276).
- Mandatory properties:
- ..
Chemical compound
- A chemical compound and its mineral form have to be split into two different items.
Definition
- chemical compound: each instance of chemical compound (Q11173) is a class of molecules of identical atomic structure.
- isotopic compound: each instance of isotopic compound (Q22332141) is a class of molecules of identical isotopic structure.
Examples
amyl alcohol (Q248797) - subclass of (P279): chemical compound (Q11173)
- 1-pentanol (Q151733), 3-pentanol (Q590622) - instance of (P31): chemical compound (Q11173); subclass of (P279): amyl alcohol (Q248797)
- 2-pentanol (Q210479) - subclass of (P279): amyl alcohol (Q248797)
- (R)-2-pentanol, (S)-2-pentanol - instance of (P31): chemical compound (Q11173); subclass of (P279): 2-pentanol (Q210479)
water (Q283) - instance of (P31): chemical compound (Q11173)
- heavy water (Q155890) - subclass of (P279): water (Q283); instance of (P31): isotopic compound (Q22332141)
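The examples above can be read as a small graph of instance of (P31) and subclass of (P279) statements. The following is a minimal sketch, not project tooling: it copies those statements into two dictionaries keyed by label (the dictionary layout and the helper names `superclasses` and `members` are assumptions made for illustration) and walks subclass of (P279) transitively.

```python
# Statements copied from the examples above, keyed by label for readability.
P31 = {  # instance of
    "1-pentanol": {"chemical compound"},
    "3-pentanol": {"chemical compound"},
    "(R)-2-pentanol": {"chemical compound"},
    "(S)-2-pentanol": {"chemical compound"},
    "water": {"chemical compound"},
    "heavy water": {"isotopic compound"},
}
P279 = {  # subclass of
    "amyl alcohol": {"chemical compound"},
    "1-pentanol": {"amyl alcohol"},
    "3-pentanol": {"amyl alcohol"},
    "2-pentanol": {"amyl alcohol"},
    "(R)-2-pentanol": {"2-pentanol"},
    "(S)-2-pentanol": {"2-pentanol"},
    "heavy water": {"water"},
}


def superclasses(item):
    """Return every class reachable from item via one or more P279 steps."""
    seen = set()
    stack = list(P279.get(item, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(P279.get(cls, ()))
    return seen


def members(cls):
    """Return the sorted labels of all items that sit below cls via P279."""
    return sorted(item for item in P279 if cls in superclasses(item))


if __name__ == "__main__":
    print(superclasses("(R)-2-pentanol"))
    print(members("amyl alcohol"))
```

In this model, a group item such as amyl alcohol collects its isomers through subclass of (P279), while each single compound still carries its own instance of (P31) statement.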
Properties
- Mandatory properties:
- CAS Registry Number (P231): Only one value per chemical compound. Don't mix different numbers representing different compounds in the same item:
- For drugs: only the CAS number of the active substance in the organic form. Salt forms have to be described in other items.
- For hydrates: only the CAS number for the hydrate defined in the label of the item. Other hydrates have to be described in other items.
- PubChem CID (P662): Only one number per item and only the CID number.
- chemical formula (P274): A formula can be written in several ways; always give the w:en:Hill formula first, followed by any formulas based on other rules.
- canonical SMILES (P233): Use the canonical version of the SMILES identifier: this ensures a unique version of the SMILES.
- InChI (P234): Use the standard version of the InChI identifier, like 1S/C2H6O/c1-2-3/h3H,2H2,1H3 and not 1/C2H6O/c1-2-3/h3H,2H2,1H3.
- InChIKey (P235): Use the standard version of the InChIKey identifier, which has a single character at the end of the identifier (an N in the following example): BQJCRHHNABKAKU-KBQPJGBKSA-N.
- Required properties:
- melting point (P2101): Important for compounds that are solid at room temperature.
- boiling point (P2102): Important for compounds that are liquid at room temperature. Always indicate the pressure at which the measurement was made by using under pressure (P2077) as a qualifier.
- Optional properties:
- Identifiers:
- Physical properties:
- Safety properties:
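The formatting rules for chemical formula (P274), InChI (P234) and InChIKey (P235) listed above can be checked mechanically before a value is added. The sketch below is only an illustration: the function names are hypothetical, the Hill ordering is implemented as "carbon first, hydrogen second, the rest alphabetically, or fully alphabetical when there is no carbon", and the InChIKey pattern assumes the usual layout of a standard key (14 letters, then 8 letters followed by the flag `S` and version letter `A`, then one final letter).

```python
import re

# Assumed layout of a standard InChIKey: 14 letters, 8 letters + "SA", 1 letter.
STANDARD_INCHIKEY = re.compile(r"[A-Z]{14}-[A-Z]{8}SA-[A-Z]")


def hill_formula(counts):
    """Build a Hill-order formula from a mapping of element symbol to count."""
    symbols = sorted(s for s, n in counts.items() if n > 0)
    if "C" in symbols:
        # With carbon present: C first, then H, then the others alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in symbols else []) + rest
    else:
        # Without carbon: all elements alphabetically, hydrogen included.
        ordered = symbols
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in ordered)


def is_standard_inchi(inchi):
    """Accept values with or without the 'InChI=' prefix; require version '1S'."""
    body = inchi[len("InChI="):] if inchi.startswith("InChI=") else inchi
    return body.startswith("1S/")


def is_standard_inchikey(key):
    """Check the key against the assumed standard InChIKey layout."""
    return STANDARD_INCHIKEY.fullmatch(key) is not None


if __name__ == "__main__":
    print(hill_formula({"O": 1, "H": 6, "C": 2}))
    print(is_standard_inchi("1S/C2H6O/c1-2-3/h3H,2H2,1H3"))
    print(is_standard_inchikey("BQJCRHHNABKAKU-KBQPJGBKSA-N"))
```

These checks only look at the shape of a value; they do not verify that the identifier belongs to the compound described by the item.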
Statistics
selection 1: # instance of: chemical compound (Q11173)
selection 2: UNION of instance of: chemical compound (Q11173), InChIKey
selection 3: anything having InChI, InChIKey
selection 4: UNION of instance of: chemical compound (Q11173), InChI, InChIKey
selection 5: UNION of instance of: chemical compound (Q11173), InChI, InChIKey, CAS number, ChEBI ID, CHEMBL ID, PubChem ID
selection 6: UNION of instance of: chemical compound (Q11173), CAS number, PubChem ID, ChEBI ID
selection 7: UNION of instance of: chemical compound (Q11173), PubChem Compound ID
selection 8: UNION of instance of: chemical compound (Q11173), CAS number
selection 9: UNION of instance of: chemical compound (Q11173), ChEBI ID
Date | Selection 1 | Selection 2 | Selection 3 | Selection 4 | Selection 5 | Selection 6 | Selection 7 | Selection 8 | Selection 9 |
---|---|---|---|---|---|---|---|---|---|
2016-12-14 | 156543 | 151158 | 148568 | 10881 | 19592 | 144315 | 69890 | 82410 | |
2017-02-07 | 156629 | 151231 | 148619 | 10838 | |||||
2017-10-14 | 156823 | 151315 | 147676 | 11266 | 20012 | 144595 | 69885 | 82989 | |
2018-02-19 | 157012 | 151521 | 148858 | 11259 | 20005 | 144761 | 69925 | 83002 | |
2018-07-02 | 157061 | 151617 | 148994 | 11616 | 20804 | 144852 | 70541 | 84511 | |
2019-05-01 | 162492 | 156457 | 11604 | 20976 | 149908 | 71148 | 84616 | ||
2019-11-17 | 216837 | 157554 | 154400 | 12711 | 22392 | 150872 | 124207 | 85893 | |
2020-07-20 | 1063182 | 997119 | 989961 | 14010 | 25072 | 200828 | ? | 85774 | |
2021-03-21 | 1201284 | 1135517 | 1128891 | 14227 | 25832 | 654272 | 917634 | 98371 | |
2021-11-14 | 1223684 | 1157978 | 1151350 | 14080 | 25535 | 678244 | 917610 | 98050 | |
2022-08-13 | 1244986 | 1179424 | 1172560 | 15566 | 32124 | 934132 | 925861 | 107281 |
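Each number in the table is the size of one selection. A selection that requires several statements on the same item can be counted with a single aggregate query. The sketch below generates such a query from a list of property IDs; the helper name `build_count_query` and the variable names `?v0`, `?v1`, ... are assumptions for illustration, and the sketch does not claim to be how the figures above were produced.

```python
import re

INDENT = " " * len("  ?compound ")  # aligns continuation lines under the first predicate


def build_count_query(property_ids, require_compound=True):
    """Return a SPARQL query counting items that carry all given properties."""
    for pid in property_ids:
        if not re.fullmatch(r"P[1-9][0-9]*", pid):
            raise ValueError(f"not a property ID: {pid!r}")
    patterns = []
    if require_compound:
        patterns.append("wdt:P31 wd:Q11173")  # instance of: chemical compound
    patterns += [f"wdt:{pid} ?v{i}" for i, pid in enumerate(property_ids)]
    if not patterns:
        raise ValueError("at least one condition is required")
    body = (" ;\n" + INDENT).join(patterns)
    return (
        "SELECT (COUNT(DISTINCT ?compound) AS ?count) WHERE {\n"
        "  ?compound " + body + "\n"
        "}"
    )


if __name__ == "__main__":
    # Selection 2: chemical compounds with an InChIKey (P235).
    print(build_count_query(["P235"]))
    # Selection 3: anything having InChI (P234) and InChIKey (P235).
    print(build_count_query(["P234", "P235"], require_compound=False))
```

`COUNT(DISTINCT ?compound)` is used so that an item with several values for one property is counted once.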
Dashboard
Top groupings (Minimum 20 items) | Top Properties (used at least 0 times per grouping) | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Name | Count | CAS Registry Number (P231) | InChI (P234) | InChIKey (P235) | ChEMBL ID (P592) | PubChem CID (P662) | ChEBI ID (P683) | DSSTox substance ID (P3117) | ECHA Substance Infocard ID (P2566) | ZVG number (P679) |
chemical compound (Q11173) | 177 | 62.71% (111) | 15.82% (28) | 15.82% (28) | 5.08% (9) | 20.9% (37) | 2.82% (5) | 7.34% (13) | 5.08% (9) | 2.26% (4) |
type of chemical entity (Q113145171) | 22 | 100.0% (22) | 86.36% (19) | 86.36% (19) | 31.82% (7) | 77.27% (17) | 18.18% (4) | 50.0% (11) | 22.73% (5) | 9.09% (2) |
Totals (all items) | 177 | 62.71% (111) | 15.82% (28) | 15.82% (28) | 5.08% (9) | 20.9% (37) | 2.82% (5) | 7.34% (13) | 5.08% (9) | 2.26% (4) |
Selection SPARQL queries
# instance of: chemical compound (Q11173)
#title: Give me a list of all chemical compounds
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173
}
#title: Give me a list of all instances of chemical compound and of its subclasses
SELECT * WHERE {
  ?compound wdt:P31/wdt:P279* wd:Q11173
}
UNION of instance of: chemical compound (Q11173), InChIKey
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P235 ?inchikey
}
anything having InChI, InChIKey
SELECT * WHERE {
  ?compound wdt:P234 ?inchi ;
            wdt:P235 ?inchikey
}
UNION of instance of: chemical compound (Q11173), InChI, InChIKey
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P234 ?inchi ;
            wdt:P235 ?inchikey
}
UNION of instance of: chemical compound (Q11173), InChI, InChIKey, CAS number, ChEBI ID, CHEMBL ID, PubChem Compound ID
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P234 ?inchi ;
            wdt:P235 ?inchikey ;
            wdt:P231 ?cas ;
            wdt:P683 ?chebi ;
            wdt:P592 ?chembl ;
            wdt:P662 ?pubchem
}
UNION of instance of: chemical compound (Q11173), CAS number, PubChem Compound ID, ChEBI ID
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P231 ?cas ;
            wdt:P662 ?pubchem ;
            wdt:P683 ?chebi
}
UNION of instance of: chemical compound (Q11173), PubChem Compound ID
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P662 ?pubchem
}
UNION of instance of: chemical compound (Q11173), CAS number
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P231 ?cas
}
UNION of instance of: chemical compound (Q11173), ChEBI ID
SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P683 ?chebi
}
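The selection queries above can also be sent from a script instead of the query editor. The following is a hedged sketch that only prepares the HTTP request for the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql; the helper name `build_wdqs_request` and the User-Agent string are placeholders, and the actual network call is left commented out because the larger selections may exceed the service's time limit.

```python
from urllib.parse import urlencode
from urllib.request import Request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_wdqs_request(query, user_agent="chemistry-tools-example/0.1 (placeholder contact)"):
    """Prepare a GET request that asks the endpoint for SPARQL JSON results."""
    url = WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": user_agent,  # replace the placeholder with a real contact
    }
    return Request(url, headers=headers)


# Selection 9 from the list above: chemical compounds with a ChEBI ID.
query = """SELECT * WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P683 ?chebi
}"""
request = build_wdqs_request(query)

# To actually run it (network access required):
# import json
# from urllib.request import urlopen
# with urlopen(request) as response:
#     rows = json.load(response)["results"]["bindings"]
```

Passing the query through `urlencode` takes care of escaping spaces, newlines and the `?` of the variable names.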
Data to curate
Chemical compound without InChIKey
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173
  # items that carry a P10718 statement are skipped as well
  MINUS { ?compound wdt:P10718 [] }
  MINUS { ?compound wdt:P235 [] }
}
Chemical compound without PubChem CID
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173
  OPTIONAL { ?compound wdt:P662 ?d }
  FILTER (!bound(?d))
}
Chemical compound without CAS number
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173
  MINUS { ?compound wdt:P231 [] }
}
Chemical compound with InChI but without InChIKey
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P234 ?inchi
  OPTIONAL { ?compound wdt:P235 ?d }
  FILTER (!bound(?d))
}
Chemical compound with InChIKey but without PubChem CID and inverse
# InChIKey present, PubChem CID missing
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P235 ?inchikey
  OPTIONAL { ?compound wdt:P662 ?d }
  FILTER (!bound(?d))
}
# Inverse: PubChem CID present, InChIKey missing
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P662 ?pubchemcid
  OPTIONAL { ?compound wdt:P235 ?d }
  FILTER (!bound(?d))
}
Chemical compound with InChIKey but without PubChem CID
SELECT ?compound WHERE {
  ?compound wdt:P31 wd:Q11173 ;
            wdt:P235 ?inchikey
  OPTIONAL { ?compound wdt:P662 ?d }
  FILTER (!bound(?d))
}
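When a curation query is too heavy to run in one piece, the same gaps can be found by running two simple selections and comparing their results locally. The sketch below assumes the standard SPARQL JSON results layout (`results` / `bindings` / variable / `value`) and Wikidata entity URIs of the form `http://www.wikidata.org/entity/Q...`; the helper names and the sample input are invented for illustration and do not reflect the real state of any item.

```python
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def extract_qids(result_json, variable="compound"):
    """Collect the item IDs bound to one variable in a SPARQL JSON result."""
    qids = set()
    for row in result_json["results"]["bindings"]:
        uri = row[variable]["value"]
        if uri.startswith(ENTITY_PREFIX):
            qids.add(uri[len(ENTITY_PREFIX):])
    return qids


def curation_gaps(with_inchikey, with_pubchem_cid):
    """Return items that have one identifier but lack the other."""
    return {
        "missing_pubchem_cid": sorted(with_inchikey - with_pubchem_cid),
        "missing_inchikey": sorted(with_pubchem_cid - with_inchikey),
    }


def sample_result(*qids):
    """Build a made-up result document in the SPARQL JSON layout."""
    return {
        "results": {
            "bindings": [
                {"compound": {"type": "uri", "value": ENTITY_PREFIX + q}} for q in qids
            ]
        }
    }


# Invented sample input, for demonstration only.
inchikey_result = sample_result("Q283", "Q151733")
pubchem_result = sample_result("Q283", "Q590622")

gaps = curation_gaps(extract_qids(inchikey_result), extract_qids(pubchem_result))
print(gaps)
```

The two set differences correspond to the two directions of the "InChIKey but without PubChem CID and inverse" queries above.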
SPARQL queries
...
Visualizations
CDK Depict
Toolforge hosts a running CDK Depict instance at https://cdkdepict.toolforge.org/
It can be used via a Wikidata gadget with https://www.wikidata.org/wiki/User:Egon_Willighagen/cdkdepict_gadget.js
Image gallery of all elements (ordered by atomic number)
#defaultView:ImageGrid
SELECT DISTINCT ?element ?elementLabel ?pic ?atomicNumber
{
  ?element wdt:P31 wd:Q11344 ;          # element
           wdt:P1086 ?atomicNumber .    # atomic number
  OPTIONAL { ?element wdt:P18 ?pic }    # picture
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de" }
}
ORDER BY ASC(?atomicNumber)
Bubble chart of all elements by atomic mass
#defaultView:BubbleChart
SELECT DISTINCT ?element ?elementLabel (SAMPLE(?mass) AS ?oneMass)
{
  ?element wdt:P31 wd:Q11344 ;    # element
           wdt:P2067 ?mass .      # mass
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de" }
}
GROUP BY ?element ?elementLabel
ORDER BY DESC(?oneMass)
Timeline of the elements by the time of their discovery
#defaultView:Timeline
SELECT DISTINCT ?element ?elementLabel ?pic ?discoveryDate
{
  ?element wdt:P31 wd:Q11344 ;           # element
           wdt:P575 ?discoveryDate .     # date of discovery
  OPTIONAL { ?element wdt:P18 ?pic }     # picture
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de" }
}
ORDER BY ASC(?discoveryDate)
Birthplaces of Chemistry Nobel Prize laureates (as a map)
#defaultView:Map
SELECT DISTINCT ?person ?geo ?personLabel
WHERE {
  ?person wdt:P166 wd:Q44585 ;     # award received: Nobel Prize in Chemistry
          wdt:P19 ?birthplace .    # place of birth
  ?birthplace wdt:P625 ?geo .      # coordinates of the birthplace
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,de" }
}
Ions and corresponding salts families
Chemical element and corresponding simple substance
- Wikidata overview: