Wikidata talk:WikiProject Chemistry/Archive/2016
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
eChemPortal
The OECD eChemPortal is a valuable database of information on chemical substances. I recommend including a link to that database in the items of chemicals. In the case of pseudocumene (Q376994) for example, the link is http://www.echemportal.org/echemportal/substancesearch/substancesearch_execute.action?allParticipants=true&numberType=CAS&number=95-63-6. --Leyo 15:55, 4 February 2016 (UTC)
- Leyo No, because this is not a database but a weblink database. It can be a tool to find data in other databases, but it is better to link directly to the original databases where the data are than to point to a database which points to other databases. Snipre (talk) 18:09, 4 February 2016 (UTC)
- OK, let's call it metadatabase. It's more than just a number of blind links to (possible) entries in other databases. The only problem for Wikidata is that there is no ID other than the CAS number. --Leyo 23:37, 4 February 2016 (UTC)
- From WD point of view, we should have only one parameter defining a unique entry in the database. From what I see, this is not the case for this database. Snipre (talk) 16:01, 9 February 2016 (UTC)
- The CAS number is “only one parameter”. Is the problem that it is not a parameter specific to the eChemPortal? --Leyo 23:21, 11 February 2016 (UTC)
- Leyo We already have the CAS number as property so no need to create anything: people can search using this information using the search tool of the database.
- But if you want to create a link using the CAS number, we already have Magnus' tool, which connects a CAS number to all databases that use the CAS number as a search parameter: see here for the case of methanol. Snipre (talk) 08:22, 15 February 2016 (UTC)
- As far as I see, the tool is not linked in items of chemicals. However, my point is, that the eChemPortal should be accessible directly from there. --Leyo 00:43, 17 February 2016 (UTC) P.S. There are several dead links in Magnus' tool.
- Leyo Seems that eChemPortal changed the way to link to their data. How do you want to create a link to eChemPortal ? Snipre (talk) 19:01, 17 February 2016 (UTC)
- I am not sure how exactly it should be done. That's why I am asking. ;-) --Leyo 02:16, 18 February 2016 (UTC)
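For reference, a minimal sketch of how a CAS-based link could be generated, following the URL pattern quoted for pseudocumene at the start of this thread. The helper name `echemportal_url` is an assumption made for illustration, and since eChemPortal seems to have changed its linking scheme (see above), the resulting URL may no longer resolve.

```python
from urllib.parse import urlencode

# Base URL copied from the pseudocumene example in this thread; it may be outdated.
ECHEMPORTAL_SEARCH = (
    "http://www.echemportal.org/echemportal/substancesearch/"
    "substancesearch_execute.action"
)


def echemportal_url(cas_number: str) -> str:
    """Build a CAS-based search URL (hypothetical helper, not an official API)."""
    query = urlencode(
        {"allParticipants": "true", "numberType": "CAS", "number": cas_number}
    )
    return f"{ECHEMPORTAL_SEARCH}?{query}"


print(echemportal_url("95-63-6"))
```

Because the only input is the CAS number, the same link could be derived from the existing CAS property without storing anything new on the item.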
- We should be using the InChI / InChIKey as main unique identifier for most compounds (e.g. all organic compounds). --Egon Willighagen (talk) 18:30, 14 February 2016 (UTC)
- Egon Willighagen InChI is not really human-friendly for comparison purposes. InChIKey is better but still complex. But to be honest we should have a tool which creates a drawing of the chemical and the corresponding SMILES, InChI and InChIKey. These four elements have to be created at the same time and shouldn't have different origins. In that way PubChem is a good tool because it creates these four elements together. Snipre (talk) 08:43, 15 February 2016 (UTC)
- Snipre Yes, PubChem CID works for me. They have a bot that can help, see User:ProteinBoxBot and pinging User:Andrawaag. For the Wikidata:WikiProject_Medicine/Zika project I am using QuickStatements (see this source code: https://github.com/egonw/zikaVirus; both options use Bioclipse): 1. take a SMILES, generate InChI and InChIKey, lookup PubChem CID, and create a QuickStatement (linked to the paper in which the compound was mentioned); 2. take a ChEMBL ID, look up SMILES, InChI, and InChIKey in ChEMBL, and create a QuickStatement (with their permission to copy the data for these Zika-related compounds). In both cases, I visualize the 2D structures in Bioclipse (with the ui.view() command), to make sure things look OK. I have not had time for this, but need to learn how to write (and mostly use) bots, and then talk to the PubChem people, to autopopulate items with PubChem CIDs with additional CCZero/PD data from PubChem (as earmarked by them). Egon Willighagen (talk) 10:39, 15 February 2016 (UTC)
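The last step of the workflow described above (turning a set of identifiers into a QuickStatement) could look roughly like the sketch below. It assumes the SMILES, InChI, InChIKey and PubChem CID have already been generated together by one tool (Bioclipse in the description above), and it only formats them in the tab-separated QuickStatements V1 syntax. The function name, the methanol example values and the use of the Wikidata Sandbox item as a stand-in for the source paper are assumptions for illustration.

```python
def quickstatements_for_compound(label, smiles, inchi, inchikey, pubchem_cid, source_qid):
    """Format one new compound item as QuickStatements V1 commands.

    Each statement carries an S248 ("stated in") reference to source_qid.
    """
    statements = [
        ("P233", smiles),       # canonical SMILES
        ("P234", inchi),        # InChI
        ("P235", inchikey),     # InChIKey
        ("P662", pubchem_cid),  # PubChem CID
    ]
    lines = ["CREATE", "\t".join(["LAST", "Len", f'"{label}"'])]
    for prop, value in statements:
        lines.append("\t".join(["LAST", prop, f'"{value}"', "S248", source_qid]))
    return "\n".join(lines)


# Illustrative input: methanol, with the Wikidata Sandbox item as placeholder source.
print(
    quickstatements_for_compound(
        label="methanol",
        smiles="CO",
        inchi="InChI=1S/CH4O/c1-2/h2H,1H3",
        inchikey="OKKJLVBELUTLKV-UHFFFAOYSA-N",
        pubchem_cid="887",
        source_qid="Q4115189",
    )
)
```

Before running such commands, an existing item with the same InChIKey should be looked up first, otherwise `CREATE` would produce a duplicate.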
Open Beauty Facts
Hey all
Notified participants of WikiProject Chemistry,
The volunteers behind Open Food Facts are attacking Cosmetics :-)
Just like what we did for food, we're going to create a worldwide open data base of all cosmetic products, with ingredients, allergens, categories, brands, reference photos, using mobile phones. The effort has started at http://world.openbeautyfacts.org
We'll get a full list of ingredients in your favorite lipstick, shampoo or creams. We're starting to realize how much chemistry is actually involved.
- My first step was to get the UNII (P652) ids imported (Mix'n Match).
- We're going to improve the cosmetics articles, and hopefully manage to better link them with the underlying molecules (parabens, quaterniums…) (hierarchy)
- We're also going to try and get as many Colour Index International constitution ID (P2027) as possible in relevant chemistry items (they're only in labels right now, and not part of the Chemistry infobox), as they're often used on shampoos.
Let me know if you have any ideas, either for Open Beauty Facts or on how to improve the cosmetic situation on Wikidata.
Elements
There are two ways to define elements:
- Types of atoms with the same atomic number (aka. the set of all atoms for some atomic number)
- Types of substances with only one type of atoms (in definition 1)
It seems that we have an interwiki conflict here, because en:Chemical elements uses 1. as its main definition, while, for example, fr:élément chimique uses the second. I'm afraid I failed to gain a consensus just for this pair of languages to align the definitions, so I guess we won't avoid the splitting of items; this will give work to WD:XLINK. For each element … The good news is that it will clarify some classification issues (assuming we solve the same problem for chemical substance, molecular entity and all):
- Hydrogen (en) = hydrogène élémentaire/hydrogène pur (fr)
- Hydrogène (fr) = Hydrogen atom (en)
- élément chimique (fr) = "type of atom occurring in some element" (en)
Does that seem correct ? @Emw, Snipre: of course.
- I am very much in favor of splitting things. For several reasons: one "element" can have two or more substances. Oxygen has at least molecular oxygen and ozone, carbon has several, excluding all the pure-carbon molecular structures (buckyballs, graphenes). Is there consensus now? Egon Willighagen (talk) 06:39, 13 April 2016 (UTC)
Compounds with several CAS numbers
What should we do with tartaric acid (Q194322) for example?
CAS | Comment |
---|---|
133-37-9 | DL |
87-69-4 | L-(+) |
147-71-7 | D-(–) |
147-73-9 | meso |
526-83-0 | D-(–) ? |
--Kopiersperre (talk) 10:14, 10 March 2016 (UTC)
- We have to separate mixture of isomers and isomers. So we will have 3 items at least:
- - one for the mixture (DL)
- - one for the D form
- - one for the L form
- Snipre (talk) 14:04, 10 March 2016 (UTC)
- There should be separate items for dextrorotatory isomers, laevorotatory isomers, and racemic mixtures. I have been creating separate items for D- and L-isomers in certain cases. James Hare (NIOSH) (talk) 15:16, 10 March 2016 (UTC)
@James Hare (NIOSH): Can you provide the Q numbers of the items you created? We should transfer the data from tartaric acid (Q194322). In this case we have to create another item for the meso form, which is a component too. For the second CAS number for the D form, we should check whether these 2 numbers are correct and, in that case, check which one is the currently valid one. Snipre (talk) 08:07, 11 March 2016 (UTC)
- @Kopiersperre: L-tartaric acid (Q23034944), D-tartaric acid (Q23034947) and (S)-tartaric acid (Q23034950). Snipre (talk) 13:23, 11 March 2016 (UTC)
- There is still a problem: there is one CAS number for the racemic mixture which is different from the generic tartaric acid. Snipre (talk) 13:31, 11 March 2016 (UTC)
- Very much supporting the split up. The more precise we are, the better we do. This particular case is important to research I do in various projects. Egon Willighagen (talk) 06:41, 13 April 2016 (UTC)
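As a first sanity check on the CAS numbers listed in the table above, the check digit can be verified. This is a minimal sketch using the standard CAS checksum rule (the last digit equals the weighted sum of the other digits, counted from the right, modulo 10); the function name is an assumption for illustration.

```python
import re


def cas_check_digit_ok(cas: str) -> bool:
    """Return True if the string is a well-formed CAS number with a valid check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if not match:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Rightmost body digit has weight 1, the next one weight 2, and so on.
    total = sum(weight * int(digit) for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check


for cas in ["133-37-9", "87-69-4", "147-71-7", "147-73-9", "526-83-0"]:
    print(cas, cas_check_digit_ok(cas))
```

All five numbers in the table pass this check, so the checksum only rules out typos. It cannot tell which of the two numbers given for the D-(–) form is the currently valid one; that still needs to be checked against a source.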
Silicic acids
silicic acids (Q16524585) and orthosilicic acid (Q422843) should be merged. Could you please help me investigate the right CAS numbers (see also silica gel (Q308976) and metasilicic acid (Q3604536))?--Kopiersperre (talk) 17:34, 10 March 2016 (UTC)
- @Kopiersperre: Please look at the German articles for both items silicic acids (Q16524585) and orthosilicic acid (Q422843): one is about the family of silicic acids and the other one about a specific form of silicic acid. The German links will prevent any merge action, so perhaps you should analyze them first. But for me these items shouldn't be merged, just relabeled. Snipre (talk) 14:23, 12 March 2016 (UTC)
- Rename orthosilicic acid (Q422843), create disilicic acid (Q23038943), metasilicic acid (Q23038949) and pyrosilicic acid (Q23038952). Snipre (talk) 14:43, 12 March 2016 (UTC)
- Thanks for the solution. Created trisilic acid (Q23038984).--Kopiersperre (talk) 15:11, 12 March 2016 (UTC)
- Trisilic or trisilicic acid ? --Chris.urs-o (talk) 12:27, 17 June 2016 (UTC)
Items for hypothetical compounds?
What do you think on User talk:Marsupium#Ammonia / ammonium hydroxide? Should a separate item be created for the hypothetical ammonium hydroxide described by AAT record 300266781? How are similar cases handled such as elements not existing under common conditions? Cheers and thanks for pinging! --Marsupium (talk) 14:21, 17 April 2016 (UTC)
- @Marsupium: Create a different item because ammonium hydroxide is only one type of molecule in an ammonia solution. Ammonia solution is a chemical substance, meaning a mixture of different types of molecules, and one class of these molecules is ammonium hydroxide. So you can connect ammonia solution with ammonium hydroxide using the property "part of". Snipre (talk) 19:59, 8 June 2016 (UTC)
- OK, thanks! But the problem is that ammonium hydroxide seems not to exist actually by itself outside ammonia solution. Which instance of (P31) shall <ammonium hydroxide> get? --Marsupium (talk) 18:48, 13 June 2016 (UTC)
- @Marsupium: I don't know but hypothetical compound is not correct: this compound exists but only in small quantity and in certain conditions. Snipre (talk) 15:15, 14 June 2016 (UTC)
- OK, thank you! I thought about that. If there is no obligation to point that out, I'll simply create the item. --Marsupium (talk) 18:12, 14 June 2016 (UTC)
Import parts of UniProtKB
Hi, what is necessary for an import of 550.000 reviewed items with their properties "accession number, protein name, gene name, organism, GO - molecular and biological function, keywords, length, mass and sequence"? We already have their permission to import. Here's the archived discussion from the project chat and here's the section at Portal:Gene Wiki, thanks, --Ghilt (talk) 07:15, 8 June 2016 (UTC)
- @Ghilt: You need
- a spreadsheet with all the data or at least an API to extract data from UniProtKB
- the list of all items about protein with the corresponding UniProtKB identifier
- a matching table between the wikidata properties and the corresponding UniProtKB parameters
- an agreement from contributors working in the field of biology to import all the mentioned data.
- and finally a bot operator ready to do the job. Don't forget to ask him to add after each statement import the reference using as example help:Sources, section databases.
- The goal of Wikidata is not to import all data from all databases. You should aim mainly for data which can be useful for Wikipedia. The best approach is first to analyze infoboxes from different WPs like en:WP, de:WP and fr:WP to see what kind of data is used in the articles. Then you can start to extract all corresponding data from UniProtKB. Snipre (talk) 19:55, 8 June 2016 (UTC)
- Hi Snipre, thank you very much for the reply. The 555,000 items are not the full database, only their reviewed items. The data is used for writing protein articles on wikipedia. The matching table and the agreement shouldn't be a problem. But the API might, as it was difficult to get answers to my questions in either section (gene wiki on en.wp, wd Partnerships_and_data_imports and wd project chat) and i can't code sufficiently. Is there anybody who can help with that? --Ghilt (talk) 08:47, 10 June 2016 (UTC)
- @Ghilt: Wikidata: Bot request. Snipre (talk) 11:29, 11 June 2016 (UTC)
- Thanks again, i'll try that --Ghilt (talk) 21:58, 11 June 2016 (UTC)
- @Ghilt: I am a little surprised to discover this proposal here and that you did not find the 377,000 UniprotKB items (SwissProt curated items, SPARQL query) we (project molbio/Gene Wiki team) already imported. We have all code in place and could do a full Swissprot import anytime required, but we prefer to do it species-wise, so we can link genes and proteins as described in the data model the Wikiproject Molecular Biology agreed on. Please see our papers on this [1] [2]. Sebotic (talk) 08:51, 16 June 2016 (UTC)
- @Sebotic: Thanks for the reply. I had checked two typical protein items for molecular weight and length and didn't find the info, which is why i started at the project chat, followed by Portal:Gene_Wiki at en.wp, Partnerships and data imports, on this page and at Portal:Biology. And I finally found you! As i didn't intend to reinvent the wheel, your reply is a great help! This way, i don't need to import the 551,000. Should i discuss the creation of the properties "GO - molecular and biological function, keywords, length, mass and sequence" and the subsequent imports here or there? Cheers, --Ghilt (talk) 17:58, 16 June 2016 (UTC)
- @Ghilt: The Wikidata protein items already have the full Gene Ontology annotations, which are maintained by our bot, directly from the original source QuickGo, so no need to add anything. Regarding length, mass and sequence: Length could be determined from sequence, so no need to add that, but there is a general agreement in WD project Molbio, not to add protein or nucleic acid sequences at this point, but let the users go to the original source if they need sequence info. This decision makes sense, as the current character limit for most WD text field properties is 400. Regarding mass: Several months ago, mass has been proposed as a property in the domain of chemistry, but it has been declined, because the mass of a molecule can be calculated from its chemical formula. Best, Sebotic (talk) 18:31, 22 June 2016 (UTC)
- By the way, here is the German version of the template infobox protein, cheers, --Ghilt (talk) 08:08, 17 June 2016 (UTC)
- If sequences aren't feasible, how about importing the length? And I would really like to have the mass for writing protein articles without having to calculate each one or to go look at Uniprot. Cheers, --Ghilt (talk) 18:43, 22 June 2016 (UTC)
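The argument made above, that a mass can be calculated from a chemical formula, can be illustrated with a small sketch. It handles only flat formulas such as C4H6O6 (no parentheses, charges or isotopes), uses a deliberately short table of standard atomic weights, and the function name is an assumption. For proteins the same calculation would need the sequence, which is exactly the data that is not stored.

```python
import re

# Abridged standard atomic weights in g/mol; extend as needed.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(formula: str) -> float:
    """Sum atomic weights for a flat formula like 'C4H6O6' (no parentheses)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


print(round(molar_mass("C4H6O6"), 3))  # tartaric acid
```

An element missing from the table raises a `KeyError`, which is preferable to silently returning a wrong mass.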
Moving this discussion to Project Molecular biology, cheers --Ghilt (talk) 20:42, 20 June 2016 (UTC)
- BTW, i'll be in Esino Lario, who else? --Ghilt (talk) 15:11, 23 June 2016 (UTC)
- Not possible for me. But if you have good experience there please feel free to report here your comments. Snipre (talk) 15:17, 23 June 2016 (UTC)
- It actually was a great experience, the people of Esino Lario were incredibly welcoming. There were 'We welcome Wikipedians' signs on every fourth house and there were even drive-by hollers of 'I love Wikipedia'. The local bakery renamed its cookies to 'Wikipedia's cookies'. The talks were ok, they're accessible on YouTube, but more important was meeting some of the Wikipedians I only knew by writing and pinning a face and a character to their name. Cheers, --Ghilt (talk) 18:07, 29 June 2016 (UTC)
- Thanks for comment. It is always a good thing when we have positive feedback: this can help us to take part to the events in the future. Snipre (talk) 07:11, 30 June 2016 (UTC)
Philadelphia ACS meeting
Hello! There will be a Wikipedia Edit-a-thon at the national ACS meeting in Philadelphia next month. Will anyone from this group be there, to show ignorant chemists such as myself how to contribute to chemistry on Wikidata? Would anyone be able to give a short talk on what Wikidata is and how it will (hopefully) be used within Wikipedia? Walkerma (talk) 22:52, 15 July 2016 (UTC)
- @Walkerma: Sorry, I live in Europe and have no plans for holidays in the next weeks. I can only suggest that you start by reading some help pages on the general structure of WD; then, once you have more detailed questions, I will try to answer them. My reading suggestions:
- Wikidata:Introduction and Help:About_data for general overview.
- Wikidata:Tours. These 2 small tutorials are quite good as description of the WD interface.
- Help:Contents: a bunch of help pages on different topics if you want to go further.
- Snipre (talk) 08:06, 20 July 2016 (UTC)
- Thanks - I'll try to work through these. If I get anywhere, I may try to contribute a couple of slides on it to the Edit-a-thon, just to explain the concept to the chemists who show up. Walkerma (talk) 02:59, 21 July 2016 (UTC)
GHS hazard statements
We already have items with H phrases and with P phrases. In my opinion unsourced hazard statements should get deleted.--Kopiersperre (talk) 14:40, 5 August 2016 (UTC)
I would like to import the first big chunk of P728 (P728) and P940 (P940). I think the only viable way is creating one item for every possible phrase or phrase combination (see the list).--Kopiersperre (talk) 14:42, 5 August 2016 (UTC)
- The statements are strings, not items. --Izno (talk) 15:39, 5 August 2016 (UTC)
- I know, but this should be changed.--Kopiersperre (talk) 08:31, 6 August 2016 (UTC)
- BTW what will be the source of the statements you want to import? ∼Wostr (talk) 17:27, 6 August 2016 (UTC)
- German Wikipedia, which basically means GESTIS database (Q15811170). It will be a test, no plans to import everything.--Kopiersperre (talk) 19:43, 6 August 2016 (UTC)
- @Kopiersperre, Izno: Please don't use Wikipedia to import data into WD: we already have enough complaints about the quality of these data to look for other sources. For GHS data please use the data from the ECHA, available here as an Excel sheet. You can use the CAS number and the EINECS number to identify the item before the import. Thanks Snipre (talk) 09:56, 8 August 2016 (UTC)
- As Snipre above. Neither Wikipedias nor unofficial sources/databases/MSDSs should be used for GHS properties. Only the ECHA database and the harmonised classification (not the notified classification, as it varies greatly depending on the producer) should be included in WD. I think we should also use applies to jurisdiction (P1001) = European Union (Q458), because classification in other parts of the world can be different from the European classification and labelling included in CLP and ATPs (e.g. U.S. OSHA may have its own official c&l for certain substances). ∼Wostr (talk) 13:13, 8 August 2016 (UTC)
- I think we should perhaps change the data structure. My concern is about different sets of H phrases. For example, if source A says that compound C should be labelled with H202 and H400 and source B says that the labelling for C is H201 and H401, how can we later retrieve the correct set of H phrases according to only one source?
- Instead of having different statements P728 (P728) and to have to filter them in order to get one unique labeling according to one source, we should create a new property Safety classification and to group all H phrases as qualifiers.
- Example:
- Safety classification: GHS hazard statement (Q28360)
- P728 (P728): H201
- P728 (P728): H401
- Stated in : Source B
- Safety classification: GHS hazard statement (Q28360)
- P728 (P728): H202
- P728 (P728): H400
- Stated in : Source A
- Snipre (talk) 13:44, 8 August 2016 (UTC)
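The grouping proposed above can be sketched as a small data model: each classification statement keeps its H phrases together with its source, so a complete set can be retrieved per source without filtering individual statements. The class and function names are assumptions for illustration and do not correspond to an existing Wikidata property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyClassification:
    """One classification statement: H phrases as qualifiers plus one source."""
    scheme: str           # e.g. "GHS hazard statement (Q28360)"
    h_phrases: frozenset  # qualifier values, e.g. {"H201", "H401"}
    stated_in: str        # the source the whole set comes from


def h_phrases_from(classifications, source):
    """Return the complete H-phrase set given by exactly one source."""
    for classification in classifications:
        if classification.stated_in == source:
            return set(classification.h_phrases)
    return set()


# The two sources from the example above, for the same compound C.
compound_c = [
    SafetyClassification("GHS hazard statement (Q28360)", frozenset({"H201", "H401"}), "Source B"),
    SafetyClassification("GHS hazard statement (Q28360)", frozenset({"H202", "H400"}), "Source A"),
]

print(sorted(h_phrases_from(compound_c, "Source A")))
```

With separate P728 statements the same lookup would require reading the reference of every single statement and regrouping them, which is the filtering step this structure avoids.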
Table of valid phrases
Could you please help me fill out these tables? Some phrases (*) were altered by later ATPs.--Kopiersperre (talk) 10:36, 8 August 2016 (UTC)
- @Kopiersperre: Please have a look at [3], page 34. Can you find a tool to extract data from pdf ? Snipre (talk) 11:17, 8 August 2016 (UTC)
- @Kopiersperre: I found a better way: go to [4], then select all H phrases, select one language and choose the button "Download selected phrases as PLS" and you get an Excel sheet with all phrases. Repeat the same with other languages, then copy and paste the content of the different sheets into one document and you have your list. Snipre (talk) 11:32, 8 August 2016 (UTC)
- @Snipre: Very good solution.--Kopiersperre (talk) 13:45, 8 August 2016 (UTC)
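The copy-paste step could also be done with a short script. The sketch below assumes each downloaded sheet has been saved as CSV with the columns `code` and `phrase`; these column names, the function name and the two sample rows are assumptions for illustration, since the actual layout of the export is not described here.

```python
import csv
import io


def merge_phrase_sheets(sheets):
    """Merge per-language CSV exports into {code: {language: phrase}}."""
    merged = {}
    for language, csv_text in sheets.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            merged.setdefault(row["code"], {})[language] = row["phrase"]
    return merged


# Illustrative content standing in for two exported sheets.
sheets = {
    "en": "code,phrase\nH225,Highly flammable liquid and vapour\n",
    "de": "code,phrase\nH225,Flüssigkeit und Dampf leicht entzündbar\n",
}

for code, texts in merge_phrase_sheets(sheets).items():
    print(code, texts)
```

Keying on the phrase code rather than on row position keeps the languages aligned even if one export is sorted differently or lacks a phrase.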
Approximate values of dipole moment
CRC Handbook of Chemistry and Physics (95th edition) (Q20887890) contains a table with dipole moments; some of them are given with good precision, but some are marked with "≈" ("Values measured in the gas phase that are questionable because of undetermined error sources are indicated as approximate") or enclosed in brackets ("Values obtained by liquid phase measurements, which sometimes have large errors because of association effects"). How can I add this information to WD? In propyl formate (Q421045) I tried to use sourcing circumstances (P1480) with circa (Q5727902) for [1,89] D, but I don't think it's a good option – this is not an approximate value, but just an undetermined uncertainty. sourcing circumstances (P1480) would fit better for values marked with "≈", but I'm not sure if this is the right use of this property. ∼Wostr (talk) 23:10, 17 August 2016 (UTC)
- @Wostr: The best is to use the original references and not the Handbook for these values in order to define which is the error. Snipre (talk) 09:47, 18 August 2016 (UTC)
- @Snipre: I checked the first value marked with "≈" in the original source and it is marked with "Q" = "Questionable value" (there is a serious question about the best value to select or where there is insufficient information on which to base meaningful estimate of accuracy (...) They may be regarded as giving a rough estimate of the magnitude of the moment but are not of sufficient accuracy for quantitative use). The table in the CRC is based on 68 sources, mainly published before 2000, so for some compounds there may be better measurements of DM in the literature, but for some there may not be any other value. ∼Wostr (talk) 13:07, 18 August 2016 (UTC)
Pigments
I would like to add many printing and coating pigments (example Pigment Yellow 138 (Q26705718)). Am I right that there is no generic property for color?--Kopiersperre (talk) 16:18, 26 August 2016 (UTC)
- Kopiersperre There is color (P462) for general colors description. But if you want to describe the color with a more detailed way there is sRGB color hex triplet (P465). Snipre (talk) 16:35, 26 August 2016 (UTC)
DSSTOX substance identifier
Please also have a look at the proposed property for the EPA CompTox Dashboard identifier. User:ChemConnector has uploaded some 700 thousand InChIKey<>DTXSID mappings as CCZero to Figshare, and I want to include that information in Wikidata. For this, I will want to use a bot task, and will soon write up a task proposal. The goal would then be to add mappings for Wikidata entries with matching InChIKeys, but I can also imagine creating new compound entries for InChIKeys not found in Wikidata yet. Comments on that second part most welcome. --Egon Willighagen (talk) 08:53, 28 August 2016 (UTC)
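The two parts of this plan (adding mappings to existing items, and listing InChIKeys that have no item yet) can be sketched as a simple lookup. The function name and all identifiers in the sample data are placeholders for illustration, not real DTXSIDs or item numbers.

```python
def split_mappings(inchikey_to_dtxsid, inchikey_to_qid):
    """Separate mappings that match an existing item from those that do not.

    Returns (matches, missing): matches maps item ID -> DTXSID,
    missing lists InChIKeys without any Wikidata item.
    """
    matches = {}
    missing = []
    for inchikey, dtxsid in inchikey_to_dtxsid.items():
        qid = inchikey_to_qid.get(inchikey)
        if qid is None:
            missing.append(inchikey)
        else:
            matches[qid] = dtxsid
    return matches, missing


# Placeholder data: two mappings, only one of which has a matching item.
mapping = {"KEY-AAA": "DTXSID-PLACEHOLDER-1", "KEY-BBB": "DTXSID-PLACEHOLDER-2"}
existing = {"KEY-AAA": "Q-PLACEHOLDER-1"}

matches, missing = split_mappings(mapping, existing)
print(matches, missing)
```

Keeping the unmatched InChIKeys in a separate list makes it possible to run the first part as a bot task while the question of creating new items is still being discussed.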
Import of ChEBI
Hello everyone, I will start importing all actual chemical compounds represented in ChEBI. Furthermore, I would like to import and maintain the full ChEBI ontology structure. This would enable a unique representation of chemical compounds in Wikidata and would highly improve the quality of chemical compounds in Wikidata. I have done that successfully with the Gene Ontology, which has a similar size and complexity, and have therefore shown that this is feasible.
For long term maintenance: The source code for this will be AGPLv3, available on our bitbucket repo [5] so in worst case, somebody else could take over and run the bot. Nevertheless, I would like to know your opinion on this. Best, Sebotic (talk) 20:43, 22 June 2016 (UTC)
- @Sebotic: Not in favor of importing an external ontology in WD. Why do we have to maintain in WD an ontology defined and modified in another website ? The goal of WD is not to integrate everything from other databases but to link databases.
- Same reasoning for the import of all chemicals from ChEBI. I don't see the interest of just being a mirror of another website. It is better to work at the interface of the existing databases than to just copy-paste data from one. Instead of importing data from one database, I propose that you match data from different databases like ChEBI, ChemIDplus, ChemSpider, PubChem, ChEMBL or GESTIS and import the data which are the same in all databases. ChEBI is just one database among several others, so I don't understand why Wikidata should be the mirror of this database and not of the others. Snipre (talk) 11:38, 23 June 2016 (UTC)
- @Snipre: Sorry for the delayed reply! The reason why I think ChEBI would be valuable is that it is the best chemical ontology currently available. It brings a ton of classification which could form the basis of further work by the WD community. The only thing which maybe should not be imported is tautomers, as they have the same InChI (key). In general, I would want to import data from several sources, but certainly not as a separate item per source but as a unified item with all the identifiers on it (CAS, InChI key, InChI, canonical SMILES, isomeric SMILES, CID, SID, ChEMBL, SureChEMBL, IUPHAR/GtoP, Drugbank, etc). The common id should be the InChI key: not perfect, but the best which is out there. Certainly, an important part is proper referencing, which is fairly easy as soon as the data sources have been determined. If we succeed, we would end up with the highest-quality open corpus of chemical compounds, with the most data/ids per compound to be found anywhere, which I think is great. Sebotic (talk) 01:13, 28 June 2016 (UTC)
- @Sebotic: No problem for the delay. For the data I am sure you have good expertise. My only concern is to have a control process which works before the import of data. I am really tired of correcting statements and merging duplicates each time large chemical data imports are done because people didn't do a correct job of data matching before the import. My recommendations are the following:
- Before creating any new item check if another item already shares an identifier with your data set. And don't use label or page title of Wikipedia article as matching criteria.
- Import data in one item only if you can match at least two identifiers between your data set and the data already present in the item.
- If during the data import you detect the existence of an existing value for the property you want to import, compare the existing value with the value you want to import and if there is a difference don't import your data but create a conflict report in order to analyze the item later
- For the question of the ontology, even if ChEBI is a good reference, we first have to check if the ChEBI ontology can match the overall Wikidata ontology. Wikidata can't be the sum of different ontologies if we want to have a unique way to query and to display data independently of the knowledge domains. For example, what happens if the ChEBI ontology allows items with both instance of/subclass of in one item but Wikidata does not?
- I know that the ontology of Wikidata is very unclear but we need to be careful to keep a homogeneous system. Snipre (talk) 09:46, 28 June 2016 (UTC)
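The matching recommendations above could be sketched as a small decision function. This is only an illustration under stated assumptions: the function names, the plain-dict data layout and the item labels are hypothetical, and the handling of cases the recommendations do not cover (only one matching identifier, or several candidate items) is an assumption that sends them to a report rather than importing.

```python
def shared_identifiers(record, item):
    """Identifier types whose values are identical in the data set record and the item."""
    return sorted(k for k, v in record.items() if item.get(k) == v)


def conflicting_identifiers(record, item):
    """Identifier types present on both sides but with different values."""
    return sorted(k for k, v in record.items() if k in item and item[k] != v)


def decide(record, items, min_matches=2):
    """Return (action, item label, details) for one record of the data set.

    record: dict of identifier type -> value, e.g. {"cas": "67-56-1"}
    items:  dict of item label -> dict of identifiers already on that item
    Labels or Wikipedia page titles are deliberately not used for matching.
    """
    candidates = []
    for label, item in items.items():
        shared = shared_identifiers(record, item)
        if shared:
            candidates.append((label, shared, conflicting_identifiers(record, item)))

    if not candidates:
        # No existing item shares an identifier: creating a new item is safe.
        return ("create", None, [])
    if len(candidates) > 1:
        # Assumption: several candidate items is a case for manual analysis.
        return ("report", None, [label for label, _, _ in candidates])

    label, shared, conflicts = candidates[0]
    if conflicts:
        # An existing value differs from the value to import: do not import.
        return ("report", label, conflicts)
    if len(shared) >= min_matches:
        return ("import", label, shared)
    # Assumption: a single matching identifier is not enough to import.
    return ("report", label, shared)


methanol = {"cas": "67-56-1", "inchikey": "OKKJLVBELUTLKV-UHFFFAOYSA-N", "cid": "887"}
existing = {"item-1": {"cas": "67-56-1", "inchikey": "OKKJLVBELUTLKV-UHFFFAOYSA-N"}}
print(decide(methanol, existing))
```

The "report" results are meant to be collected into a conflict list that is analyzed later, instead of being written to the items.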
- @Sebotic: I guess you have also seen the Mix'n'Match already? I love to see ChEBI fully in Wikidata. Now, ChEBI has a lot of ionic species (which becomes very clear when you run the Mix'n'Match in Game mode :) Do you also plan to include these? Also, will you include the links between the compounds, as the ChEBI ontology defined, particularly for these ions? --Egon Willighagen (talk) 08:40, 28 August 2016 (UTC)
- @Egon Willighagen, Snipre: Well, after some more consideration and taking into account the concerns raised by Snipre, I think that importing all of ChEBI might not be too useful at this point. E.g. all the ions and enantiomers do not have enough chemical identifiers to be really useful in Wikidata. Moreover, ChEBI has many edges which currently don't exist in Wikidata, so they would all need to be proposed and approved (subclass of and has role already exist, so most of the core graph could be imported). What I will definitely do is make sure that all 'primary' (organic) compounds make it into Wikidata. That said, I would have the bot code ready to do a full import; the only things missing are edges (WD properties) and a general consensus that the full import should be done. Sebotic (talk) 18:05, 29 August 2016 (UTC)
- @Snipre, Sebotic: These two aspects of chemical compounds, along with pureness (compounds vs substance) are important. How about we start ironing out how Wikidata should model these things? Are ions notable enough (probably, given that other databases support them?)? Should compounds with unspecified stereochemistry be instances or subclasses? And, quite related, how will we model compound classes and other "things" that are more than one distinct (isomeric) chemical structure on Wikipedia? It seems to me, we have critical mass. It seems to me that @ChemConnector, Walkerma, Pigsonthewing: (first two have been very active in the Wikipedia Chemistry team along, and Andy has been at the Royal Society of Chemistry (Q905549)) will likely join in these discussions too, and then we have critical mass. This defines a group of experienced chemists who think Wikidata should be used in science. I'd say, let's do it! Let's define the framework and do that final clean up. Within not too long, we can beat several popular scientific databases in quality. (And then we submit a paper to the Journal of Cheminformatics (Q6294930) with our results, along the lines of Wikipedia Chemical Structure Explorer: substructure and similarity searching of molecules from Wikipedia (Q21957425). This will undoubtedly attract more scholarly chemists!) --Egon Willighagen (talk) 05:22, 30 August 2016 (UTC)
- @Egon Willighagen:
- * All ions are notable
- * compounds with unspecified stereochemistry are defined as subclasses of chemical compounds and compounds with specified stereochemistry are defined as instance of items describing compounds with unspecified stereochemistry (see relations between L-lactide (Q24757824), D-lactide (Q24757832), (R,S)-lactide (Q24757839), and lactide (Q421313))
- * Next problems to solve:
- how isotopic compound (Q22332141) should be structured compared to chemical compound ? Is heavy water (Q155890) an instance of water (Q283) ?
- how should we treat tautomers ? Two items or an item ? Which criterion can be used to define if a tautomer can have 2 items or not ?
- what is the granularity of the structure for chemical compounds: can we consider ethanol as an instance or as a class ?
- Snipre (talk) 10:10, 30 August 2016 (UTC)
- @Snipre: Cool, thanks for the details! The first of the next problems is indeed interesting, because ontologically seen, an instance of an instance is not typically done. ChEBI actually models even water (Q283) as a class. That's not that unreasonable, as a water molecule instance is something in your mouth right now, and the 'chemical compound' water is just the concept of it. Tautomers is another hard one. Personally, I like to have all chemical graphs as separate entities, actually like ChEBI does. However, if you say chemical compounds have a 1-to-1 relation to the Standard InChI, then we have a problem. Worse, the Standard InChI does not consider everything a tautomer that a biologist/chemist would (it's an incomplete model). So, the current answer following from the compound<>InChI link is: both two and a single item. The third problem to solve is related to the first. But this is the discussion we indeed need to have. What is the central concept of a chemical compound? That has major implications for the identifiers side of this. To me, the more explicit we are, the better we serve the scientific community. --Egon Willighagen (talk) 12:08, 30 August 2016 (UTC)
Importing COSING
[Discussion with Magnus: Matching CoSing numbers using multiple identifiers]
The CoSing number has recently been created for chemical compounds. It is the EU canonical identifier for chemistry and cosmetics, and as a result there are 25,000 identifiers, as well as identifiers for all the other chemistry systems, and interesting info for properties and labels.
I had first thought of truncating the file for import using Mix'n'Match, but I wondered if someone has the skills to maximize the utility of the file.
Source
Snippet
COSING Ref No | INCI name | INN name | Ph. Eur. Name | CAS No | EINECS/ELINCS No | Chem/IUPAC Name / Description | Restriction | Function | Update Date |
38946 | ZEA MAYS STARCH | starch | maydis amylum | 9005-25-8 | 232-679-6 | Zea Mays Starch is a high-polymeric carbohydrate material usually derived from the peeled seeds of the Corn, Zea mays L., Gramineae | - | ABRASIVE, ABSORBENT, ANTICAKING, SKIN PROTECTING, VISCOSITY CONTROLLING | 15/10/2010 |
Property talk:P3073#Importing the identifiers
- @Teolemon: I don't think the Mix'n'Match tool is necessary: the dataset contains the CAS number and the EINECS number, so you can use those identifiers to identify the item to which the CoSing number should be added in WD. The best would be to check, when possible, whether the CAS and EINECS numbers in the item are identical to the ones present in the dataset from the CoSing database. Snipre (talk) 11:33, 29 August 2016 (UTC)
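A minimal sketch of that CAS/EINECS lookup, assuming rows shaped like the snippet above (pipe-separated, first six columns in the order of the header). The function names, the column keys and the item labels are hypothetical, and the three status words are only one possible way to record the outcome of the check.

```python
def parse_cosing_row(line):
    """Split one pipe-separated CoSing row and keep the first six columns."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    keys = ["cosing", "inci", "inn", "ph_eur", "cas", "einecs"]
    return dict(zip(keys, cells))


def match_cosing(row, items):
    """Find items with the same CAS number and compare their EINECS number.

    items: dict of item label -> {"cas": ..., "einecs": ...} (einecs optional)
    Returns a list of (item label, status) pairs.
    """
    hits = []
    for label, ids in items.items():
        if ids.get("cas") != row["cas"]:
            continue
        einecs = ids.get("einecs")
        if einecs is None:
            status = "cas-only"    # nothing to cross-check against
        elif einecs == row["einecs"]:
            status = "confirmed"   # both identifiers agree
        else:
            status = "conflict"    # same CAS, different EINECS: check by hand
        hits.append((label, status))
    return hits


row = parse_cosing_row(
    "38946 | ZEA MAYS STARCH | starch | maydis amylum | 9005-25-8 | 232-679-6 | "
    "Zea Mays Starch is a high-polymeric carbohydrate material | - | ABRASIVE | 15/10/2010 |"
)
items = {"item-a": {"cas": "9005-25-8", "einecs": "232-679-6"}}
print(match_cosing(row, items))
```

Only the "confirmed" hits would be candidates for adding the CoSing number automatically; the other two statuses are better left for review.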
qualifier to indicate a conformer
Is there any way to indicate the conformer? Dipole moments are sometimes measured for a specific conformer (gauche, trans etc.), but I do not think there should be different items for every conformer as they are the same molecule. ∼Wostr (talk) 21:56, 30 August 2016 (UTC)
- From what I know, no. Snipre (talk) 22:10, 30 August 2016 (UTC)
Problem with mixture and solution
I have a problem with items describing mixtures, especially aqueous solutions of salts or other soluble substances. First, these items can't be classified as chemical compound, but can we classify them as chemical substance or as mixture? My problem with items describing solutions like barium hydroxide solution (Q809681) is the large number of different solutions which can be represented by this item. If I take the IUPAC definition of chemical substance, I read "Physical properties such as density, refractive index, electric conductivity, melting point etc. characterize the chemical substance". As I understand the definition, barium hydroxide solution (Q809681) can't be classified as chemical substance because I can't assign one density or refractive index to the item barium hydroxide solution (Q809681): the density is valid only for one solution, for example water 70%/barium hydroxide 30%, but not for the solution water 99%/barium hydroxide 1%.
So this already solves a problem: barium hydroxide solution (Q809681) is not an instance of mixture or chemical substance but a subclass of mixture/subclass of chemical substance, as barium hydroxide solution (Q809681) represents an infinity of solutions having different compositions from 0.0001 to 99.9999%.
Then the next question: can we put as a constraint that identifiers used to identify a pure substance can't be used to identify aqueous solutions of the same substance? Even if this is allowed in general by external rules outside of Wikidata? An example is the CAS number, which is used for pure substances and their aqueous solutions. This creates a mess in our constraint reports, so I would like to formalize the restriction of CAS numbers to pure substances only and exclude the use of the same CAS number for aqueous solutions. Comments? Snipre (talk) 21:09, 13 September 2016 (UTC)
- I am not sure if we should use 'chemical substance' in opposition to mixture (or maybe we shouldn't use it at all). It is a very imprecise term and its definition may depend on language, author/source etc. (e.g. in Polish chemical literature from the 60s–70s chemical substances are divided into pure substances /compounds, elements/ and mixtures). Even the IUPAC definition is not as precise as it should be: the solution (mixture) of two substances with specified composition would be a mixture and a substance at the same time (both conditions are fulfilled: constant composition, characteristic physical properties). And we also have legal definitions: substance is a [chemical] mixture (pure substance + necessary additives + technological impurities) and mixture is a mixture/solution of two or more substances [EU CLP definitions].
- You're right with barium hydroxide solution (Q809681): it should be classified as 'subclass of' mixture (but IMHO better 'subclass of' saturated solution -> solution -> mixture).
- And yes, we should limit the use of CAS number to 'pure chemical substances' only. I think that the lack of distinction between solutions and compounds in the CAS Registry is not intentional but only the result of practical reasons, so there is no substantive reasoning behind it. ∼Wostr (talk) 20:00, 16 September 2016 (UTC)
Annotation in which species chemical compounds are found
I am adding this to shed some light of what I am up to with Wikidata. At the moment I am close to the first steps of developing a bot based on the User:ProteinBoxBot code base and made a first request. This bot can help import a lot of data, but also help add missing information. For example, Christopher Southan just reported a list of about 700 hundred PubChem CID (P662) for entries with SMILES: https://twitter.com/cdsouthan/status/769814678197460993 Pulling in this information is easy. For now, I will focus on the biology side of things, and plan to annotate compounds and the species they are found in, e.g. using knowledge in the WikiPathways database (see Wikidata:Requests_for_permissions/Bot/UreomiczBot 1). User:Wostr pointed out on my Discussion page that found in taxon (P703) can be used directly on the Wikidata entry being added/edited, so, instead of instance of (P31). There are quite a few species specific metabolite database where this can be sourced from. I stress how important it is to have this kind of information, because academic researchers now often face the problem that they have measured compounds from human samples of unknown chemical identity (in any typical untargeted metabolomics experiment). More info can be found in this report of a recent student project on Figshare (https://figshare.com/articles/Volatile_Organic_Compounds_A_Detailed_Account_of_Identity_Origin_Activity_and_Pathways/3466805) and the H2020 project proposal Enabling Open Science: Wikidata for Research (Wiki4R) (Q26707522). --Egon Willighagen (talk) 08:49, 28 August 2016 (UTC)
- @Egon Willighagen: Before doing any advertising to use WD in scientific research, we have to implement a control system which allows us to sell WD as a reference database. To be able to reach that objective we should perform a single step: for each "instance of: chemical compound", a unique value for InChI (P234) with the corresponding InChIKey (P235) has to be provided.
- But currently we have
- 21969 items with "instance of: chemical compound"
- 14471 items with a value for InChI (P234)
- 15519 items with a value for InChIKey (P235)
- 14584 items with "instance of: chemical compound" and a value for InChI (P234) and for InChIKey (P235) ???
- In short, we have to be able, for once, to offer in WD one fixed list of chemicals clearly identified with a coherent set of identifiers (mainly InChI, InChIKey and chemical structure) from the same source or generated by the same system. We are far from that situation now, so for me trying to sell WD as a tool for scientific research is just a bad idea and a way to lose any trust for the future. Snipre (talk) 12:25, 29 August 2016 (UTC)
- @Snipre:, I am not claiming Wikidata is perfect yet. There are indeed a number of problems, but I would like to see your results that show that Wikidata is doing worse than scientific databases. Many of the latter have a certain scope, and only a few use InChI as a basis. The above issues need a lot of attention, and the bot I am developing can help. E.g. it is trivial to add InChIs and InChIKeys for chemical compounds with a SMILES. Finding inconsistencies too. The fact that the number of "instance of: chemical compound" is currently higher than the number of InChIs does not worry me at all: many compound classes are annotated as "instance of: chemical compound" rather than "subclass of: chemical compound", and compound classes simply do not have an InChI. Furthermore, there are chemical substances annotated as compound, etc, etc. Yes, there is plenty to clean up, but that's why Wikidata should be at the center of science, as it is an open database to which all scholars can contribute, without having to worry about being able to reuse their own contributions later. I would love to sit down with you and a few other Wikidata chemists and iron out some ideas! What about a (virtual) meet up soon? --Egon Willighagen (talk) 13:07, 29 August 2016 (UTC)
- @Egon Willighagen: The problem with the items defined as chemical compound without an InChI is that they are not completely identified. One quarter of our database is not fully defined and this is why I prefer to slow down the use of WD by external users. We can always discuss next steps, but it would be great to put the different options on paper first in order to already have an idea of the possible work to perform before starting the discussion. My proposal is developed there. The talk page can be used to add other ideas and we will update the page once an agreement is found. Snipre (talk) 09:50, 30 August 2016 (UTC)
- @Snipre: Great! Let's continue talking there then! Mind you, there are some people who want to solve this problem, including me and Sebotic. And I know for a fact ChemConnector has that interest too. These are scientists, not users, but developers and data providers. Over the next few days, I will run some scripts to quantify the current quality. There will be a lot of manual work to be done. Also, I think you overestimate with 'a quarter'... not everything now qualified as compound really is a compound that should have an InChI. More in that talk page asap! --Egon Willighagen (talk) 10:25, 30 August 2016 (UTC)
- @Snipre: BTW, sn-glycerol 3-phosphate(2-) (Q26711901) is a new compound which, according to searching on PubChem CID and InChIKey, was not yet in Wikidata. Adding missing information (or correcting info, if needed) can be automated. Feedback on that new compound page is appreciated. --Egon Willighagen (talk) 12:26, 30 August 2016 (UTC)
- @Egon Willighagen: Seems OK, but it would be perfect if you could follow the recommendation of Help:Sources#Databases and add at least the "retrieved date". Title is a good thing too but less important. Snipre (talk) 22:15, 30 August 2016 (UTC)
- @Snipre: Agreed about the 'retrieved date', but setting that requires a URL in the calendarModel property of the data, which causes the abuseFilter to overreact, so I cannot set that right now. See e.g. this log message. --Egon Willighagen (talk) 11:47, 31 August 2016 (UTC)
- @Egon Willighagen: Please report the problem to the dev team or the abuse filter admin. Seems to be a programming problem. Snipre (talk) 19:13, 31 August 2016 (UTC)
- @Snipre, Egon Willighagen: This is the current count of core identifiers of chemical items in Wikidata. By chemical items, I mean items that are either instance of or subclass of chemical compound, or that just have a CAS, CID, InChI (key) or SMILES but lack an instance or subclass categorization. Snipre, I see your concerns, but I have invested quite some time into the chem compound space in WD now and I am confident that there will be substantial improvements in the coming weeks. Sebotic (talk) 17:46, 29 August 2016 (UTC)
chem items | 24809 | subclass of or instance of chemical compound, or having cas, cid, inchi(key), smiles. |
article | 16391 | links to en.wikipedia.org |
mass | 637 | |
chemSpider | 11462 | |
pubchem_cid | 16906 | |
unii | 11826 | |
mesh_id | 927 | |
kegg_id | 4065 | |
mesh_code | 3 | |
chebi | 4464 | |
drugbank | 2682 | |
chembl | 5461 | |
iuphar | 1033 | |
cas | 19692 | |
csmiles | 15635 | |
inchi | 14470 | |
inchi_key | 14943 | |
chemical_formula | 19475 | |
atc_code | 1709 | |
ismiles | 16 |
Here is a list of 240 items with conflicts on the structure level, where the InChI key does not match for some of the identifiers on an item. This is usually due to identifiers with incomplete stereochemistry having been added, or just the wrong stereochemistry, or the wrong compound in the first place. I think I can clean up a good share of those by just taking the majority vote of the identifiers for an InChI key and then using this key to populate the item. This is certainly not error free. Otherwise, these 240 could be fixed by hand; all that is required is to delete any PubChem CID, InChI key, ChEMBL, ChEBI, UNII, or ChemSpider ID which is incorrect. After that, one valid identifier on an item is sufficient to let the item be populated by my bots. Btw: these 240 are the result of a consistency check for 4000 items, so approximately 1/4th of all compounds with a PubChem CID in Wikidata. Sebotic (talk) 09:20, 20 September 2016 (UTC)
- @Sebotic: Thanks for your work. But I can't work now on that curation, at least not before 2 weeks. Snipre (talk) 19:25, 22 September 2016 (UTC)
- @Snipre: I will try to get as many of them resolved through other automatic means, so that finally, we end up with a number which can be handled more easily. Manual curation, in my experience, is a very time consuming process, so I think it should be the last resort. Let's see how quickly we can resolve it. The biggest challenge for some of these will be to choose the one with the most appropriate ('correct') stereochemistry. Sebotic (talk) 20:12, 22 September 2016 (UTC)
UPDATE: that is now the full list of 1,279 items in the space of chem items with PubChem ID which need inspection and curation, out of 17,709 (7.2%)
UPDATE: Now 1,284 compounds; a few new items appeared and a few fixed ones were removed.
UPDATE: Managed to bring it down to 1,030, updated list accordingly. What I see frequently are InChI keys where most major resources agree on, but PubChem has a different one (different connectivity) and the one agreed on by the other resources can be found in the PubChem data provider supplied descriptions, that's not ideal... Sebotic (talk) 08:57, 30 September 2016 (UTC)
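The majority-vote idea mentioned above could look roughly like the following sketch. It is not the actual bot code: the function names are hypothetical, the keys "KEY-A" and "KEY-B" are placeholders rather than real InChI keys, and requiring a strict majority (more than half of the sources) is an assumption, chosen so that ties are left for manual curation.

```python
from collections import Counter


def majority_inchikey(keys_by_source):
    """Return the InChI key reported by more than half of the sources, else None.

    keys_by_source: dict of source name -> InChI key resolved from that
    source's identifier on the item.
    """
    if not keys_by_source:
        return None
    counts = Counter(keys_by_source.values())
    top_key, top_votes = counts.most_common(1)[0]
    # A strict majority rules out ties, which should be inspected by hand.
    if top_votes * 2 > len(keys_by_source):
        return top_key
    return None


def outlier_sources(keys_by_source):
    """Sources whose key disagrees with the majority key (empty if no majority)."""
    winner = majority_inchikey(keys_by_source)
    if winner is None:
        return []
    return sorted(s for s, k in keys_by_source.items() if k != winner)


votes = {"pubchem": "KEY-B", "chembl": "KEY-A", "chebi": "KEY-A", "unii": "KEY-A"}
print(majority_inchikey(votes), outlier_sources(votes))
```

The identifiers from the outlier sources are the ones that would be removed from the item, and items without a majority stay on the list for manual inspection.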
http://www.wikidata.org/entity/Q5275604 Corrected, confusion between digermane (Q5275604) and digermanium (Q27183266) http://www.wikidata.org/entity/Q905418 http://www.wikidata.org/entity/Q670450 http://www.wikidata.org/entity/Q5319233 http://www.wikidata.org/entity/Q5976757 http://www.wikidata.org/entity/Q421676 http://www.wikidata.org/entity/Q408221 http://www.wikidata.org/entity/Q937085 http://www.wikidata.org/entity/Q5144763 http://www.wikidata.org/entity/Q7272074 http://www.wikidata.org/entity/Q7841934 http://www.wikidata.org/entity/Q7182926 http://www.wikidata.org/entity/Q4596915 http://www.wikidata.org/entity/Q420070 Corrected, confusion between sodium percarbonate (Q420070) and sodium hydroperoxy(oxo)methanolate (Q27216890) http://www.wikidata.org/entity/Q6508313 http://www.wikidata.org/entity/Q415798 http://www.wikidata.org/entity/Q2393155 http://www.wikidata.org/entity/Q1890177 http://www.wikidata.org/entity/Q744577 http://www.wikidata.org/entity/Q6948223 http://www.wikidata.org/entity/Q6714672 http://www.wikidata.org/entity/Q4115930 http://www.wikidata.org/entity/Q996659 http://www.wikidata.org/entity/Q4941885 http://www.wikidata.org/entity/Q15410255 http://www.wikidata.org/entity/Q4946240 http://www.wikidata.org/entity/Q2033359 http://www.wikidata.org/entity/Q262613 http://www.wikidata.org/entity/Q4122197 http://www.wikidata.org/entity/Q367994 Not corrected, two ways to represent the molecule: with ionic bond or covalent bond between calcium atom anf nitrogen atom http://www.wikidata.org/entity/Q5516411 http://www.wikidata.org/entity/Q15720554 http://www.wikidata.org/entity/Q7197544 http://www.wikidata.org/entity/Q409648 http://www.wikidata.org/entity/Q7234718 http://www.wikidata.org/entity/Q3529346 http://www.wikidata.org/entity/Q209404 http://www.wikidata.org/entity/Q13019044 http://www.wikidata.org/entity/Q4348637 http://www.wikidata.org/entity/Q20707800 http://www.wikidata.org/entity/Q425180 http://www.wikidata.org/entity/Q7914945 
http://www.wikidata.org/entity/Q15409364 http://www.wikidata.org/entity/Q2817111 http://www.wikidata.org/entity/Q4737382 http://www.wikidata.org/entity/Q423829 http://www.wikidata.org/entity/Q18206495 http://www.wikidata.org/entity/Q473546 http://www.wikidata.org/entity/Q3055852 http://www.wikidata.org/entity/Q417819 http://www.wikidata.org/entity/Q15709156 http://www.wikidata.org/entity/Q4887561 http://www.wikidata.org/entity/Q3315722 http://www.wikidata.org/entity/Q2823219 http://www.wikidata.org/entity/Q241678 http://www.wikidata.org/entity/Q385657 http://www.wikidata.org/entity/Q2616064 http://www.wikidata.org/entity/Q5159421 http://www.wikidata.org/entity/Q2669979 http://www.wikidata.org/entity/Q9137074 http://www.wikidata.org/entity/Q11163063 http://www.wikidata.org/entity/Q7280124 http://www.wikidata.org/entity/Q5276483 http://www.wikidata.org/entity/Q417003 http://www.wikidata.org/entity/Q15410253 http://www.wikidata.org/entity/Q3008693 http://www.wikidata.org/entity/Q19903618 http://www.wikidata.org/entity/Q2640914 http://www.wikidata.org/entity/Q681387 http://www.wikidata.org/entity/Q420527 http://www.wikidata.org/entity/Q6960846 http://www.wikidata.org/entity/Q1014287 http://www.wikidata.org/entity/Q4807880 http://www.wikidata.org/entity/Q424481 http://www.wikidata.org/entity/Q4596785 http://www.wikidata.org/entity/Q424726 http://www.wikidata.org/entity/Q487064 http://www.wikidata.org/entity/Q19597253 http://www.wikidata.org/entity/Q15634253 http://www.wikidata.org/entity/Q15425783 http://www.wikidata.org/entity/Q135270 http://www.wikidata.org/entity/Q1280166 http://www.wikidata.org/entity/Q425085 http://www.wikidata.org/entity/Q5383934 http://www.wikidata.org/entity/Q4890795 http://www.wikidata.org/entity/Q409676 http://www.wikidata.org/entity/Q419170 http://www.wikidata.org/entity/Q7388910 http://www.wikidata.org/entity/Q5190964 http://www.wikidata.org/entity/Q661724 http://www.wikidata.org/entity/Q21099667 http://www.wikidata.org/entity/Q7777226 
http://www.wikidata.org/entity/Q415768 http://www.wikidata.org/entity/Q5018816 http://www.wikidata.org/entity/Q15720561 http://www.wikidata.org/entity/Q1810456 http://www.wikidata.org/entity/Q29428 http://www.wikidata.org/entity/Q2710669 http://www.wikidata.org/entity/Q7119043 http://www.wikidata.org/entity/Q4759444 http://www.wikidata.org/entity/Q132037 http://www.wikidata.org/entity/Q5057301 http://www.wikidata.org/entity/Q4545648 http://www.wikidata.org/entity/Q4918616 http://www.wikidata.org/entity/Q133878 http://www.wikidata.org/entity/Q4682573 http://www.wikidata.org/entity/Q4676582 http://www.wikidata.org/entity/Q7507259 http://www.wikidata.org/entity/Q1318344 http://www.wikidata.org/entity/Q6606395 http://www.wikidata.org/entity/Q176525 http://www.wikidata.org/entity/Q7390603 http://www.wikidata.org/entity/Q3927866 http://www.wikidata.org/entity/Q5572308 http://www.wikidata.org/entity/Q5251502 http://www.wikidata.org/entity/Q3120938 http://www.wikidata.org/entity/Q3832015 http://www.wikidata.org/entity/Q337231 http://www.wikidata.org/entity/Q421246 http://www.wikidata.org/entity/Q4691976 http://www.wikidata.org/entity/Q2823250 http://www.wikidata.org/entity/Q6951349 http://www.wikidata.org/entity/Q5572322 http://www.wikidata.org/entity/Q2912342 http://www.wikidata.org/entity/Q239593 http://www.wikidata.org/entity/Q4890804 http://www.wikidata.org/entity/Q7814247 http://www.wikidata.org/entity/Q4864613 http://www.wikidata.org/entity/Q4119810 http://www.wikidata.org/entity/Q721202 http://www.wikidata.org/entity/Q4774534 http://www.wikidata.org/entity/Q4445816 http://www.wikidata.org/entity/Q15605490 http://www.wikidata.org/entity/Q11071947 http://www.wikidata.org/entity/Q4993812 http://www.wikidata.org/entity/Q739601 http://www.wikidata.org/entity/Q415484 http://www.wikidata.org/entity/Q5036049 http://www.wikidata.org/entity/Q5137363 http://www.wikidata.org/entity/Q418258 http://www.wikidata.org/entity/Q421389 http://www.wikidata.org/entity/Q15624043 
http://www.wikidata.org/entity/Q10864413 http://www.wikidata.org/entity/Q3079150 http://www.wikidata.org/entity/Q415909 http://www.wikidata.org/entity/Q5891570 http://www.wikidata.org/entity/Q15720548 http://www.wikidata.org/entity/Q7039308 http://www.wikidata.org/entity/Q415920 http://www.wikidata.org/entity/Q4673300 http://www.wikidata.org/entity/Q15408428 http://www.wikidata.org/entity/Q425240 http://www.wikidata.org/entity/Q10859631 http://www.wikidata.org/entity/Q4811191 http://www.wikidata.org/entity/Q4748577 http://www.wikidata.org/entity/Q26998367 http://www.wikidata.org/entity/Q5150954 http://www.wikidata.org/entity/Q373791 http://www.wikidata.org/entity/Q7671452 http://www.wikidata.org/entity/Q10861060 http://www.wikidata.org/entity/Q422305 http://www.wikidata.org/entity/Q6678756 http://www.wikidata.org/entity/Q5849706 http://www.wikidata.org/entity/Q4737395 http://www.wikidata.org/entity/Q5441133 http://www.wikidata.org/entity/Q8059839 http://www.wikidata.org/entity/Q4596757 http://www.wikidata.org/entity/Q943416 http://www.wikidata.org/entity/Q96385 http://www.wikidata.org/entity/Q12062354 http://www.wikidata.org/entity/Q132442 http://www.wikidata.org/entity/Q7846702 http://www.wikidata.org/entity/Q5871697 http://www.wikidata.org/entity/Q5010980 http://www.wikidata.org/entity/Q421291 http://www.wikidata.org/entity/Q4639640 http://www.wikidata.org/entity/Q420284 http://www.wikidata.org/entity/Q4789030 http://www.wikidata.org/entity/Q425036 http://www.wikidata.org/entity/Q5057233 http://www.wikidata.org/entity/Q4024215 http://www.wikidata.org/entity/Q20880500 http://www.wikidata.org/entity/Q5637062 http://www.wikidata.org/entity/Q4748994 http://www.wikidata.org/entity/Q5278705 http://www.wikidata.org/entity/Q15628035 http://www.wikidata.org/entity/Q413572 http://www.wikidata.org/entity/Q245487 http://www.wikidata.org/entity/Q2363204 http://www.wikidata.org/entity/Q903824 http://www.wikidata.org/entity/Q390239 http://www.wikidata.org/entity/Q904599 
http://www.wikidata.org/entity/Q410095 http://www.wikidata.org/entity/Q4596899 http://www.wikidata.org/entity/Q4596815 http://www.wikidata.org/entity/Q407373 http://www.wikidata.org/entity/Q411175 http://www.wikidata.org/entity/Q369048 http://www.wikidata.org/entity/Q1281115 http://www.wikidata.org/entity/Q1097932 http://www.wikidata.org/entity/Q6951323 http://www.wikidata.org/entity/Q7119896 http://www.wikidata.org/entity/Q5013768 http://www.wikidata.org/entity/Q10880252 http://www.wikidata.org/entity/Q426921 http://www.wikidata.org/entity/Q413703 http://www.wikidata.org/entity/Q15634177 http://www.wikidata.org/entity/Q3083814 http://www.wikidata.org/entity/Q20817136 http://www.wikidata.org/entity/Q4721853 http://www.wikidata.org/entity/Q4746184 http://www.wikidata.org/entity/Q417309 http://www.wikidata.org/entity/Q853845 http://www.wikidata.org/entity/Q7923143 http://www.wikidata.org/entity/Q408132 http://www.wikidata.org/entity/Q5057298 http://www.wikidata.org/entity/Q8043134 http://www.wikidata.org/entity/Q409818 http://www.wikidata.org/entity/Q5256385 http://www.wikidata.org/entity/Q7099512 http://www.wikidata.org/entity/Q419308 http://www.wikidata.org/entity/Q2090740 http://www.wikidata.org/entity/Q414824 http://www.wikidata.org/entity/Q7800111 http://www.wikidata.org/entity/Q15409399 http://www.wikidata.org/entity/Q3058085 http://www.wikidata.org/entity/Q6581446 http://www.wikidata.org/entity/Q2044136 http://www.wikidata.org/entity/Q4689286 http://www.wikidata.org/entity/Q5528046 http://www.wikidata.org/entity/Q5521314 http://www.wikidata.org/entity/Q4637036 http://www.wikidata.org/entity/Q2074372 http://www.wikidata.org/entity/Q5506958 http://www.wikidata.org/entity/Q554818 http://www.wikidata.org/entity/Q4353536 http://www.wikidata.org/entity/Q4982752 http://www.wikidata.org/entity/Q7698194 http://www.wikidata.org/entity/Q5010983 http://www.wikidata.org/entity/Q61184 http://www.wikidata.org/entity/Q2482223 http://www.wikidata.org/entity/Q7076760 
http://www.wikidata.org/entity/Q21098925 http://www.wikidata.org/entity/Q4708928 http://www.wikidata.org/entity/Q7260199 http://www.wikidata.org/entity/Q258653 http://www.wikidata.org/entity/Q3066718 http://www.wikidata.org/entity/Q19596037 http://www.wikidata.org/entity/Q898299 http://www.wikidata.org/entity/Q16935646 http://www.wikidata.org/entity/Q18358153 http://www.wikidata.org/entity/Q419209 http://www.wikidata.org/entity/Q21098923 http://www.wikidata.org/entity/Q4650413 http://www.wikidata.org/entity/Q423692 http://www.wikidata.org/entity/Q411899 http://www.wikidata.org/entity/Q662425 http://www.wikidata.org/entity/Q6816337 http://www.wikidata.org/entity/Q15633962 http://www.wikidata.org/entity/Q424091 http://www.wikidata.org/entity/Q6374755 http://www.wikidata.org/entity/Q248891 http://www.wikidata.org/entity/Q419846 http://www.wikidata.org/entity/Q5200299 http://www.wikidata.org/entity/Q7316883 http://www.wikidata.org/entity/Q424871 http://www.wikidata.org/entity/Q414359 http://www.wikidata.org/entity/Q4764702 http://www.wikidata.org/entity/Q416534 http://www.wikidata.org/entity/Q4641512 http://www.wikidata.org/entity/Q3825917 http://www.wikidata.org/entity/Q6528191 http://www.wikidata.org/entity/Q20706932 http://www.wikidata.org/entity/Q21045149 http://www.wikidata.org/entity/Q4596810 http://www.wikidata.org/entity/Q5140608 http://www.wikidata.org/entity/Q4392082 http://www.wikidata.org/entity/Q15410989 http://www.wikidata.org/entity/Q413127 http://www.wikidata.org/entity/Q2983921 http://www.wikidata.org/entity/Q417304 http://www.wikidata.org/entity/Q5050928 http://www.wikidata.org/entity/Q5277314 http://www.wikidata.org/entity/Q763802 http://www.wikidata.org/entity/Q6172522 http://www.wikidata.org/entity/Q413036 http://www.wikidata.org/entity/Q15409373 http://www.wikidata.org/entity/Q310828 http://www.wikidata.org/entity/Q179619 http://www.wikidata.org/entity/Q1074417 http://www.wikidata.org/entity/Q4903628 http://www.wikidata.org/entity/Q2034517 
http://www.wikidata.org/entity/Q5204319 http://www.wikidata.org/entity/Q19903180 http://www.wikidata.org/entity/Q1387655 http://www.wikidata.org/entity/Q6593308 http://www.wikidata.org/entity/Q3817447 http://www.wikidata.org/entity/Q7851139 http://www.wikidata.org/entity/Q7808830 http://www.wikidata.org/entity/Q419415 http://www.wikidata.org/entity/Q412994 http://www.wikidata.org/entity/Q7912519 http://www.wikidata.org/entity/Q3243737 http://www.wikidata.org/entity/Q658 http://www.wikidata.org/entity/Q411138 http://www.wikidata.org/entity/Q421074 http://www.wikidata.org/entity/Q5283993 http://www.wikidata.org/entity/Q5383826 http://www.wikidata.org/entity/Q21099568 http://www.wikidata.org/entity/Q7119205 http://www.wikidata.org/entity/Q7863562 http://www.wikidata.org/entity/Q2406759 http://www.wikidata.org/entity/Q15627472 http://www.wikidata.org/entity/Q6122828 http://www.wikidata.org/entity/Q5201339 http://www.wikidata.org/entity/Q419070 http://www.wikidata.org/entity/Q15927659 http://www.wikidata.org/entity/Q2629981 http://www.wikidata.org/entity/Q6997373 http://www.wikidata.org/entity/Q5264591 http://www.wikidata.org/entity/Q4646883 http://www.wikidata.org/entity/Q3132209 http://www.wikidata.org/entity/Q4596759 http://www.wikidata.org/entity/Q5024643 http://www.wikidata.org/entity/Q18344013 http://www.wikidata.org/entity/Q2912604 http://www.wikidata.org/entity/Q7050960 http://www.wikidata.org/entity/Q5057289 http://www.wikidata.org/entity/Q4545730 http://www.wikidata.org/entity/Q414394 http://www.wikidata.org/entity/Q594482 http://www.wikidata.org/entity/Q4918919 http://www.wikidata.org/entity/Q7762 http://www.wikidata.org/entity/Q5332581 http://www.wikidata.org/entity/Q420191 http://www.wikidata.org/entity/Q2117581 http://www.wikidata.org/entity/Q7395228 http://www.wikidata.org/entity/Q198473 http://www.wikidata.org/entity/Q15634055 http://www.wikidata.org/entity/Q7263592 http://www.wikidata.org/entity/Q423910 http://www.wikidata.org/entity/Q414619 
http://www.wikidata.org/entity/Q7936365 http://www.wikidata.org/entity/Q5137434 http://www.wikidata.org/entity/Q15426238 http://www.wikidata.org/entity/Q7957934 http://www.wikidata.org/entity/Q7367466 http://www.wikidata.org/entity/Q5323095 http://www.wikidata.org/entity/Q1235560 http://www.wikidata.org/entity/Q6581305 http://www.wikidata.org/entity/Q408360 http://www.wikidata.org/entity/Q794084 http://www.wikidata.org/entity/Q420056 http://www.wikidata.org/entity/Q252251 http://www.wikidata.org/entity/Q6823338 http://www.wikidata.org/entity/Q409054 http://www.wikidata.org/entity/Q1101052 http://www.wikidata.org/entity/Q2943815 http://www.wikidata.org/entity/Q13566119 http://www.wikidata.org/entity/Q5571076 http://www.wikidata.org/entity/Q417484 http://www.wikidata.org/entity/Q470900 http://www.wikidata.org/entity/Q7321711 http://www.wikidata.org/entity/Q416677 http://www.wikidata.org/entity/Q2823194 http://www.wikidata.org/entity/Q722387 http://www.wikidata.org/entity/Q4981136 http://www.wikidata.org/entity/Q5120032 http://www.wikidata.org/entity/Q539395 http://www.wikidata.org/entity/Q45044 http://www.wikidata.org/entity/Q5134843 http://www.wikidata.org/entity/Q4680659 http://www.wikidata.org/entity/Q5275247 http://www.wikidata.org/entity/Q421634 http://www.wikidata.org/entity/Q5319 http://www.wikidata.org/entity/Q5102982 http://www.wikidata.org/entity/Q61416 http://www.wikidata.org/entity/Q416904 http://www.wikidata.org/entity/Q4807670 http://www.wikidata.org/entity/Q2823840 http://www.wikidata.org/entity/Q409216 http://www.wikidata.org/entity/Q416513 http://www.wikidata.org/entity/Q3007886 http://www.wikidata.org/entity/Q7671383 http://www.wikidata.org/entity/Q1586727 http://www.wikidata.org/entity/Q3973521 http://www.wikidata.org/entity/Q3029787 http://www.wikidata.org/entity/Q421255 http://www.wikidata.org/entity/Q596946 http://www.wikidata.org/entity/Q6456961 http://www.wikidata.org/entity/Q5418554 http://www.wikidata.org/entity/Q5332352 
http://www.wikidata.org/entity/Q4797402 http://www.wikidata.org/entity/Q416641 http://www.wikidata.org/entity/Q6138969 http://www.wikidata.org/entity/Q4716536 http://www.wikidata.org/entity/Q6542719 http://www.wikidata.org/entity/Q11350933 http://www.wikidata.org/entity/Q5102980 http://www.wikidata.org/entity/Q5264607 http://www.wikidata.org/entity/Q3347765 http://www.wikidata.org/entity/Q422504 http://www.wikidata.org/entity/Q414591 http://www.wikidata.org/entity/Q4596737 http://www.wikidata.org/entity/Q413258 http://www.wikidata.org/entity/Q1076381 http://www.wikidata.org/entity/Q419900 http://www.wikidata.org/entity/Q223099 http://www.wikidata.org/entity/Q4673314 http://www.wikidata.org/entity/Q6839784 http://www.wikidata.org/entity/Q423412 http://www.wikidata.org/entity/Q7234707 http://www.wikidata.org/entity/Q419895 http://www.wikidata.org/entity/Q5199864 http://www.wikidata.org/entity/Q6482030 http://www.wikidata.org/entity/Q2408443 http://www.wikidata.org/entity/Q4596911 http://www.wikidata.org/entity/Q4736748 http://www.wikidata.org/entity/Q14200355 http://www.wikidata.org/entity/Q411484 http://www.wikidata.org/entity/Q4637034 http://www.wikidata.org/entity/Q15634079 http://www.wikidata.org/entity/Q7260204 http://www.wikidata.org/entity/Q417103 http://www.wikidata.org/entity/Q7675206 http://www.wikidata.org/entity/Q5015902 http://www.wikidata.org/entity/Q143289 http://www.wikidata.org/entity/Q3077500 http://www.wikidata.org/entity/Q15269704 http://www.wikidata.org/entity/Q2816006 http://www.wikidata.org/entity/Q419775 http://www.wikidata.org/entity/Q15409431 http://www.wikidata.org/entity/Q2823201 http://www.wikidata.org/entity/Q4811598 http://www.wikidata.org/entity/Q5319234 http://www.wikidata.org/entity/Q4650970 http://www.wikidata.org/entity/Q2581447 http://www.wikidata.org/entity/Q4646882 http://www.wikidata.org/entity/Q7646983 http://www.wikidata.org/entity/Q21098924 http://www.wikidata.org/entity/Q4747075 http://www.wikidata.org/entity/Q3607822 
http://www.wikidata.org/entity/Q2823244 http://www.wikidata.org/entity/Q6931218 http://www.wikidata.org/entity/Q7573806 http://www.wikidata.org/entity/Q5057221 http://www.wikidata.org/entity/Q26272 http://www.wikidata.org/entity/Q7843285 http://www.wikidata.org/entity/Q409035 http://www.wikidata.org/entity/Q2708007 http://www.wikidata.org/entity/Q2602246 http://www.wikidata.org/entity/Q3469748 http://www.wikidata.org/entity/Q15634126 http://www.wikidata.org/entity/Q419642 http://www.wikidata.org/entity/Q425064 http://www.wikidata.org/entity/Q5443648 http://www.wikidata.org/entity/Q6787831 http://www.wikidata.org/entity/Q411844 http://www.wikidata.org/entity/Q4885099 http://www.wikidata.org/entity/Q407658 http://www.wikidata.org/entity/Q3980350 http://www.wikidata.org/entity/Q5198686 http://www.wikidata.org/entity/Q2850134 http://www.wikidata.org/entity/Q904668 http://www.wikidata.org/entity/Q4981048 http://www.wikidata.org/entity/Q114391 http://www.wikidata.org/entity/Q5748732 http://www.wikidata.org/entity/Q2742455 http://www.wikidata.org/entity/Q7838854 http://www.wikidata.org/entity/Q16069783 http://www.wikidata.org/entity/Q7777225 http://www.wikidata.org/entity/Q4445833 http://www.wikidata.org/entity/Q3388802 http://www.wikidata.org/entity/Q5359421 http://www.wikidata.org/entity/Q423275 http://www.wikidata.org/entity/Q13578067 http://www.wikidata.org/entity/Q1187513 http://www.wikidata.org/entity/Q7322878 http://www.wikidata.org/entity/Q15411007 http://www.wikidata.org/entity/Q424250 http://www.wikidata.org/entity/Q544393 http://www.wikidata.org/entity/Q4734058 http://www.wikidata.org/entity/Q421761 http://www.wikidata.org/entity/Q425065 http://www.wikidata.org/entity/Q2008962 http://www.wikidata.org/entity/Q5518498 http://www.wikidata.org/entity/Q4890905 http://www.wikidata.org/entity/Q424528 http://www.wikidata.org/entity/Q2614009 http://www.wikidata.org/entity/Q3553093 http://www.wikidata.org/entity/Q4832247 http://www.wikidata.org/entity/Q510113 
http://www.wikidata.org/entity/Q5749096 http://www.wikidata.org/entity/Q2653981 http://www.wikidata.org/entity/Q6896941 http://www.wikidata.org/entity/Q3915149 http://www.wikidata.org/entity/Q687686 http://www.wikidata.org/entity/Q192553 http://www.wikidata.org/entity/Q15708273 http://www.wikidata.org/entity/Q11786072 http://www.wikidata.org/entity/Q4737384 http://www.wikidata.org/entity/Q411426 http://www.wikidata.org/entity/Q413278 http://www.wikidata.org/entity/Q415646 http://www.wikidata.org/entity/Q1829318 http://www.wikidata.org/entity/Q73972 http://www.wikidata.org/entity/Q3592644 http://www.wikidata.org/entity/Q5404857 http://www.wikidata.org/entity/Q4674302 http://www.wikidata.org/entity/Q1630230 http://www.wikidata.org/entity/Q5089008 http://www.wikidata.org/entity/Q704923 http://www.wikidata.org/entity/Q7074645 http://www.wikidata.org/entity/Q407446 http://www.wikidata.org/entity/Q4138107 http://www.wikidata.org/entity/Q5162311 http://www.wikidata.org/entity/Q7204785 http://www.wikidata.org/entity/Q4890770 http://www.wikidata.org/entity/Q2331543 http://www.wikidata.org/entity/Q2657418 http://www.wikidata.org/entity/Q4391972 http://www.wikidata.org/entity/Q7165030 http://www.wikidata.org/entity/Q413849 http://www.wikidata.org/entity/Q7071996 http://www.wikidata.org/entity/Q21098845 http://www.wikidata.org/entity/Q899416 http://www.wikidata.org/entity/Q419421 http://www.wikidata.org/entity/Q620084 http://www.wikidata.org/entity/Q15408415 http://www.wikidata.org/entity/Q5280075 http://www.wikidata.org/entity/Q3991659 http://www.wikidata.org/entity/Q3080860 http://www.wikidata.org/entity/Q2896809 http://www.wikidata.org/entity/Q4332794 http://www.wikidata.org/entity/Q2930096 http://www.wikidata.org/entity/Q4828930 http://www.wikidata.org/entity/Q907070 http://www.wikidata.org/entity/Q420138 http://www.wikidata.org/entity/Q6990833 http://www.wikidata.org/entity/Q415392 http://www.wikidata.org/entity/Q416716 http://www.wikidata.org/entity/Q417755 
http://www.wikidata.org/entity/Q4673311 http://www.wikidata.org/entity/Q7514072 http://www.wikidata.org/entity/Q7116885 http://www.wikidata.org/entity/Q417250 http://www.wikidata.org/entity/Q407891 http://www.wikidata.org/entity/Q4586731 http://www.wikidata.org/entity/Q15712807 http://www.wikidata.org/entity/Q15427895 http://www.wikidata.org/entity/Q4454241 http://www.wikidata.org/entity/Q3814656 http://www.wikidata.org/entity/Q2594649 http://www.wikidata.org/entity/Q15088351 http://www.wikidata.org/entity/Q909931 http://www.wikidata.org/entity/Q5003182 http://www.wikidata.org/entity/Q424684 http://www.wikidata.org/entity/Q4747307 http://www.wikidata.org/entity/Q3680915 http://www.wikidata.org/entity/Q571037 http://www.wikidata.org/entity/Q2073868 http://www.wikidata.org/entity/Q423846 http://www.wikidata.org/entity/Q5104342 http://www.wikidata.org/entity/Q4938924 http://www.wikidata.org/entity/Q5509469 http://www.wikidata.org/entity/Q740439 http://www.wikidata.org/entity/Q1117877 http://www.wikidata.org/entity/Q779118 http://www.wikidata.org/entity/Q5519727 http://www.wikidata.org/entity/Q419193 http://www.wikidata.org/entity/Q4642883 http://www.wikidata.org/entity/Q5276413 http://www.wikidata.org/entity/Q7072002 http://www.wikidata.org/entity/Q421116 http://www.wikidata.org/entity/Q5199357 http://www.wikidata.org/entity/Q5443567 http://www.wikidata.org/entity/Q425059 http://www.wikidata.org/entity/Q2553496 http://www.wikidata.org/entity/Q3629883 http://www.wikidata.org/entity/Q2943814 http://www.wikidata.org/entity/Q5011453 http://www.wikidata.org/entity/Q2866762 http://www.wikidata.org/entity/Q411046 http://www.wikidata.org/entity/Q408805 http://www.wikidata.org/entity/Q7395917 http://www.wikidata.org/entity/Q15410941 http://www.wikidata.org/entity/Q3333710 http://www.wikidata.org/entity/Q421905 http://www.wikidata.org/entity/Q18386276 http://www.wikidata.org/entity/Q417538 http://www.wikidata.org/entity/Q21098950 http://www.wikidata.org/entity/Q7800905 
http://www.wikidata.org/entity/Q13024951 http://www.wikidata.org/entity/Q7269871 http://www.wikidata.org/entity/Q427105 http://www.wikidata.org/entity/Q6583647 http://www.wikidata.org/entity/Q7197959 http://www.wikidata.org/entity/Q5332458 http://www.wikidata.org/entity/Q7099046 http://www.wikidata.org/entity/Q2823302 http://www.wikidata.org/entity/Q7670227 http://www.wikidata.org/entity/Q4132745 http://www.wikidata.org/entity/Q5268487 http://www.wikidata.org/entity/Q21099604 http://www.wikidata.org/entity/Q6951351 http://www.wikidata.org/entity/Q15708268 http://www.wikidata.org/entity/Q706868 http://www.wikidata.org/entity/Q4445835 http://www.wikidata.org/entity/Q21099637 http://www.wikidata.org/entity/Q4545805 http://www.wikidata.org/entity/Q425053 http://www.wikidata.org/entity/Q3706873 http://www.wikidata.org/entity/Q7842218 http://www.wikidata.org/entity/Q422777 http://www.wikidata.org/entity/Q4119955 http://www.wikidata.org/entity/Q15426208 http://www.wikidata.org/entity/Q5161074 http://www.wikidata.org/entity/Q7851973 http://www.wikidata.org/entity/Q413805 http://www.wikidata.org/entity/Q3599478 http://www.wikidata.org/entity/Q5200296 http://www.wikidata.org/entity/Q1951971 http://www.wikidata.org/entity/Q3641126 http://www.wikidata.org/entity/Q8052674 http://www.wikidata.org/entity/Q419226 http://www.wikidata.org/entity/Q423398 http://www.wikidata.org/entity/Q6806652 http://www.wikidata.org/entity/Q415872 http://www.wikidata.org/entity/Q5513695 http://www.wikidata.org/entity/Q4177124 http://www.wikidata.org/entity/Q19833284 http://www.wikidata.org/entity/Q5113892 http://www.wikidata.org/entity/Q2064889 http://www.wikidata.org/entity/Q15402123 http://www.wikidata.org/entity/Q6710338 http://www.wikidata.org/entity/Q7116606 http://www.wikidata.org/entity/Q4463083 http://www.wikidata.org/entity/Q7263189 http://www.wikidata.org/entity/Q3277932 http://www.wikidata.org/entity/Q808801 http://www.wikidata.org/entity/Q15410232 http://www.wikidata.org/entity/Q7051399 
http://www.wikidata.org/entity/Q778163 http://www.wikidata.org/entity/Q6824053 http://www.wikidata.org/entity/Q4735601 http://www.wikidata.org/entity/Q7263674 http://www.wikidata.org/entity/Q7119395 http://www.wikidata.org/entity/Q418758 http://www.wikidata.org/entity/Q1065083 http://www.wikidata.org/entity/Q475631 http://www.wikidata.org/entity/Q7119048 http://www.wikidata.org/entity/Q7558263 http://www.wikidata.org/entity/Q7198091 http://www.wikidata.org/entity/Q7636084 http://www.wikidata.org/entity/Q4745975 http://www.wikidata.org/entity/Q4864584 http://www.wikidata.org/entity/Q287582 http://www.wikidata.org/entity/Q2993328 http://www.wikidata.org/entity/Q5199059 http://www.wikidata.org/entity/Q3940320 http://www.wikidata.org/entity/Q424958 http://www.wikidata.org/entity/Q18208892 http://www.wikidata.org/entity/Q164403 http://www.wikidata.org/entity/Q1989071 http://www.wikidata.org/entity/Q4161099 http://www.wikidata.org/entity/Q384709 http://www.wikidata.org/entity/Q27267 http://www.wikidata.org/entity/Q410521 http://www.wikidata.org/entity/Q8041945 http://www.wikidata.org/entity/Q18211886 http://www.wikidata.org/entity/Q15427885 http://www.wikidata.org/entity/Q15427926 http://www.wikidata.org/entity/Q961081 http://www.wikidata.org/entity/Q1046522 http://www.wikidata.org/entity/Q7678897 http://www.wikidata.org/entity/Q4652482 http://www.wikidata.org/entity/Q4491065 http://www.wikidata.org/entity/Q6004128 http://www.wikidata.org/entity/Q6961010 http://www.wikidata.org/entity/Q2701649 http://www.wikidata.org/entity/Q21099606 http://www.wikidata.org/entity/Q930170 http://www.wikidata.org/entity/Q21099663 http://www.wikidata.org/entity/Q421894 http://www.wikidata.org/entity/Q17074532 http://www.wikidata.org/entity/Q415024 http://www.wikidata.org/entity/Q423783 http://www.wikidata.org/entity/Q3072948 http://www.wikidata.org/entity/Q421598 http://www.wikidata.org/entity/Q5010218 http://www.wikidata.org/entity/Q5470216 http://www.wikidata.org/entity/Q3637333 
http://www.wikidata.org/entity/Q410036 http://www.wikidata.org/entity/Q5955671 http://www.wikidata.org/entity/Q3849795 http://www.wikidata.org/entity/Q5199001 http://www.wikidata.org/entity/Q17299859 http://www.wikidata.org/entity/Q4637047 http://www.wikidata.org/entity/Q3429577 http://www.wikidata.org/entity/Q4748447 http://www.wikidata.org/entity/Q6535822 http://www.wikidata.org/entity/Q5379483 http://www.wikidata.org/entity/Q18349230 http://www.wikidata.org/entity/Q757702 http://www.wikidata.org/entity/Q905058 http://www.wikidata.org/entity/Q7669595 http://www.wikidata.org/entity/Q15410969 http://www.wikidata.org/entity/Q60457 http://www.wikidata.org/entity/Q329022 http://www.wikidata.org/entity/Q4652479 http://www.wikidata.org/entity/Q2930105 http://www.wikidata.org/entity/Q2627834 http://www.wikidata.org/entity/Q2315302 http://www.wikidata.org/entity/Q6003986 http://www.wikidata.org/entity/Q416490 http://www.wikidata.org/entity/Q6965855 http://www.wikidata.org/entity/Q422327 http://www.wikidata.org/entity/Q7294040 http://www.wikidata.org/entity/Q4841341 http://www.wikidata.org/entity/Q421320 http://www.wikidata.org/entity/Q408201 http://www.wikidata.org/entity/Q259015 http://www.wikidata.org/entity/Q2709086 http://www.wikidata.org/entity/Q14035740 http://www.wikidata.org/entity/Q782318 http://www.wikidata.org/entity/Q423531 http://www.wikidata.org/entity/Q4163873 http://www.wikidata.org/entity/Q409743 http://www.wikidata.org/entity/Q249208 http://www.wikidata.org/entity/Q5242815 http://www.wikidata.org/entity/Q144917 http://www.wikidata.org/entity/Q284367 http://www.wikidata.org/entity/Q413733 http://www.wikidata.org/entity/Q4673302 http://www.wikidata.org/entity/Q10861089 http://www.wikidata.org/entity/Q3410841 http://www.wikidata.org/entity/Q19597525 http://www.wikidata.org/entity/Q3026455 http://www.wikidata.org/entity/Q15409424 http://www.wikidata.org/entity/Q15411005 http://www.wikidata.org/entity/Q2622702 http://www.wikidata.org/entity/Q4673057 
http://www.wikidata.org/entity/Q2261930 http://www.wikidata.org/entity/Q413559 http://www.wikidata.org/entity/Q900922 http://www.wikidata.org/entity/Q420934 http://www.wikidata.org/entity/Q5382029 http://www.wikidata.org/entity/Q5113817 http://www.wikidata.org/entity/Q412291 http://www.wikidata.org/entity/Q412191 http://www.wikidata.org/entity/Q421272 http://www.wikidata.org/entity/Q4117486 http://www.wikidata.org/entity/Q964482 http://www.wikidata.org/entity/Q5276420 http://www.wikidata.org/entity/Q7921024 http://www.wikidata.org/entity/Q4891024 http://www.wikidata.org/entity/Q15426197 http://www.wikidata.org/entity/Q15634054 http://www.wikidata.org/entity/Q4161299 http://www.wikidata.org/entity/Q7238143 http://www.wikidata.org/entity/Q4596853 http://www.wikidata.org/entity/Q3979404 http://www.wikidata.org/entity/Q3381514 http://www.wikidata.org/entity/Q5712560 http://www.wikidata.org/entity/Q3604267 http://www.wikidata.org/entity/Q4041747 http://www.wikidata.org/entity/Q5404502 http://www.wikidata.org/entity/Q15632788 http://www.wikidata.org/entity/Q420043 http://www.wikidata.org/entity/Q2823228 http://www.wikidata.org/entity/Q225854 http://www.wikidata.org/entity/Q3604498 http://www.wikidata.org/entity/Q7118739 http://www.wikidata.org/entity/Q368222 http://www.wikidata.org/entity/Q4637180 http://www.wikidata.org/entity/Q267896 http://www.wikidata.org/entity/Q1839256 http://www.wikidata.org/entity/Q7706543 http://www.wikidata.org/entity/Q1104482 http://www.wikidata.org/entity/Q13024942 http://www.wikidata.org/entity/Q988591 http://www.wikidata.org/entity/Q6072216 http://www.wikidata.org/entity/Q413299 http://www.wikidata.org/entity/Q81890 http://www.wikidata.org/entity/Q139883 http://www.wikidata.org/entity/Q15991360 http://www.wikidata.org/entity/Q909387 http://www.wikidata.org/entity/Q6078756 http://www.wikidata.org/entity/Q6839436 http://www.wikidata.org/entity/Q10861003 http://www.wikidata.org/entity/Q2267471 http://www.wikidata.org/entity/Q2985253 
http://www.wikidata.org/entity/Q18604129 http://www.wikidata.org/entity/Q920725 http://www.wikidata.org/entity/Q20054555 http://www.wikidata.org/entity/Q886862 http://www.wikidata.org/entity/Q4068819 http://www.wikidata.org/entity/Q4834702 http://www.wikidata.org/entity/Q7698204 http://www.wikidata.org/entity/Q2790082 http://www.wikidata.org/entity/Q3032708 http://www.wikidata.org/entity/Q415410 http://www.wikidata.org/entity/Q4790694 http://www.wikidata.org/entity/Q5611751 http://www.wikidata.org/entity/Q425248 http://www.wikidata.org/entity/Q5383774 http://www.wikidata.org/entity/Q287745 http://www.wikidata.org/entity/Q415588 http://www.wikidata.org/entity/Q418735 http://www.wikidata.org/entity/Q7101735 http://www.wikidata.org/entity/Q4734921 http://www.wikidata.org/entity/Q6647969 http://www.wikidata.org/entity/Q419478 http://www.wikidata.org/entity/Q3757667 http://www.wikidata.org/entity/Q419361 http://www.wikidata.org/entity/Q4642874 http://www.wikidata.org/entity/Q2705859 http://www.wikidata.org/entity/Q3109285 http://www.wikidata.org/entity/Q15708035 http://www.wikidata.org/entity/Q7352933 http://www.wikidata.org/entity/Q15410921 http://www.wikidata.org/entity/Q172409 http://www.wikidata.org/entity/Q2700587 http://www.wikidata.org/entity/Q4634059 http://www.wikidata.org/entity/Q420354 http://www.wikidata.org/entity/Q6062315 http://www.wikidata.org/entity/Q647580 http://www.wikidata.org/entity/Q3276808 http://www.wikidata.org/entity/Q5578972 http://www.wikidata.org/entity/Q4644278 http://www.wikidata.org/entity/Q10858037 http://www.wikidata.org/entity/Q1033359 http://www.wikidata.org/entity/Q5057294 http://www.wikidata.org/entity/Q17318234 http://www.wikidata.org/entity/Q412742 http://www.wikidata.org/entity/Q965955 http://www.wikidata.org/entity/Q60279 http://www.wikidata.org/entity/Q5200429 http://www.wikidata.org/entity/Q7680336 http://www.wikidata.org/entity/Q7784698 http://www.wikidata.org/entity/Q5409893 http://www.wikidata.org/entity/Q3347162 
http://www.wikidata.org/entity/Q411087 http://www.wikidata.org/entity/Q412874 http://www.wikidata.org/entity/Q10859487 http://www.wikidata.org/entity/Q15410276 http://www.wikidata.org/entity/Q7071299 http://www.wikidata.org/entity/Q407189 http://www.wikidata.org/entity/Q15303950 http://www.wikidata.org/entity/Q3985292 http://www.wikidata.org/entity/Q19595855 http://www.wikidata.org/entity/Q2777979 http://www.wikidata.org/entity/Q283033 http://www.wikidata.org/entity/Q3831365 http://www.wikidata.org/entity/Q3517399 http://www.wikidata.org/entity/Q1786341 http://www.wikidata.org/entity/Q894130 http://www.wikidata.org/entity/Q5748740 http://www.wikidata.org/entity/Q28775 http://www.wikidata.org/entity/Q7245902 http://www.wikidata.org/entity/Q18155805 http://www.wikidata.org/entity/Q15634052 http://www.wikidata.org/entity/Q417044 http://www.wikidata.org/entity/Q7848584 http://www.wikidata.org/entity/Q120384 http://www.wikidata.org/entity/Q26998317 http://www.wikidata.org/entity/Q21400577 http://www.wikidata.org/entity/Q7050437 http://www.wikidata.org/entity/Q3277888 http://www.wikidata.org/entity/Q5198906 http://www.wikidata.org/entity/Q4642864 http://www.wikidata.org/entity/Q3935171 http://www.wikidata.org/entity/Q4677960 http://www.wikidata.org/entity/Q18209997 http://www.wikidata.org/entity/Q5415983 http://www.wikidata.org/entity/Q3044728 http://www.wikidata.org/entity/Q5049003 http://www.wikidata.org/entity/Q7843270 http://www.wikidata.org/entity/Q5102988 http://www.wikidata.org/entity/Q7811972 http://www.wikidata.org/entity/Q5120191 http://www.wikidata.org/entity/Q5205953 http://www.wikidata.org/entity/Q413596 http://www.wikidata.org/entity/Q20707829 http://www.wikidata.org/entity/Q15410217 http://www.wikidata.org/entity/Q2031142 http://www.wikidata.org/entity/Q904411 http://www.wikidata.org/entity/Q426660 http://www.wikidata.org/entity/Q7250468 http://www.wikidata.org/entity/Q900926 http://www.wikidata.org/entity/Q868435 http://www.wikidata.org/entity/Q4737376 
http://www.wikidata.org/entity/Q278972 http://www.wikidata.org/entity/Q420087 http://www.wikidata.org/entity/Q833649 http://www.wikidata.org/entity/Q26979 http://www.wikidata.org/entity/Q349427 http://www.wikidata.org/entity/Q8214050 http://www.wikidata.org/entity/Q7915670 http://www.wikidata.org/entity/Q3512695 http://www.wikidata.org/entity/Q4637178 http://www.wikidata.org/entity/Q7368629 http://www.wikidata.org/entity/Q18357634 http://www.wikidata.org/entity/Q15425284 http://www.wikidata.org/entity/Q5991162 http://www.wikidata.org/entity/Q411909 http://www.wikidata.org/entity/Q6913406 http://www.wikidata.org/entity/Q2813821 http://www.wikidata.org/entity/Q410875 http://www.wikidata.org/entity/Q7106486 http://www.wikidata.org/entity/Q5049581 http://www.wikidata.org/entity/Q1761300 http://www.wikidata.org/entity/Q4652498 http://www.wikidata.org/entity/Q4701917 http://www.wikidata.org/entity/Q367258 http://www.wikidata.org/entity/Q3570564 http://www.wikidata.org/entity/Q4732178 http://www.wikidata.org/entity/Q7280510 http://www.wikidata.org/entity/Q7120083 http://www.wikidata.org/entity/Q1960495 http://www.wikidata.org/entity/Q7181329 http://www.wikidata.org/entity/Q410614 http://www.wikidata.org/entity/Q4364572 http://www.wikidata.org/entity/Q408256 http://www.wikidata.org/entity/Q426524 http://www.wikidata.org/entity/Q4676694 http://www.wikidata.org/entity/Q2823286 http://www.wikidata.org/entity/Q15409437 http://www.wikidata.org/entity/Q4499058 http://www.wikidata.org/entity/Q4779987 http://www.wikidata.org/entity/Q4836836 http://www.wikidata.org/entity/Q5984942 http://www.wikidata.org/entity/Q7067904 http://www.wikidata.org/entity/Q5120034 http://www.wikidata.org/entity/Q419849 http://www.wikidata.org/entity/Q2629234 http://www.wikidata.org/entity/Q904475 http://www.wikidata.org/entity/Q4352981 http://www.wikidata.org/entity/Q12744507 http://www.wikidata.org/entity/Q5203006 http://www.wikidata.org/entity/Q161294 http://www.wikidata.org/entity/Q8074586 
http://www.wikidata.org/entity/Q4745983 http://www.wikidata.org/entity/Q2972710 http://www.wikidata.org/entity/Q424223 http://www.wikidata.org/entity/Q6927482 http://www.wikidata.org/entity/Q412805 http://www.wikidata.org/entity/Q5047057 http://www.wikidata.org/entity/Q7706553 http://www.wikidata.org/entity/Q6808812 http://www.wikidata.org/entity/Q3351791 http://www.wikidata.org/entity/Q2706622 http://www.wikidata.org/entity/Q2288772 http://www.wikidata.org/entity/Q151446 http://www.wikidata.org/entity/Q618730 http://www.wikidata.org/entity/Q21045227 http://www.wikidata.org/entity/Q743705 http://www.wikidata.org/entity/Q6518814 http://www.wikidata.org/entity/Q44944 http://www.wikidata.org/entity/Q3596763 http://www.wikidata.org/entity/Q286793 http://www.wikidata.org/entity/Q958387 http://www.wikidata.org/entity/Q4674080 http://www.wikidata.org/entity/Q413762 http://www.wikidata.org/entity/Q18209791 http://www.wikidata.org/entity/Q418564 http://www.wikidata.org/entity/Q7316807 http://www.wikidata.org/entity/Q7277486 http://www.wikidata.org/entity/Q2281857 http://www.wikidata.org/entity/Q421235 http://www.wikidata.org/entity/Q18347446 http://www.wikidata.org/entity/Q19904197 http://www.wikidata.org/entity/Q424851 http://www.wikidata.org/entity/Q1490748 http://www.wikidata.org/entity/Q6951371 http://www.wikidata.org/entity/Q425165 http://www.wikidata.org/entity/Q424541 http://www.wikidata.org/entity/Q5519258 http://www.wikidata.org/entity/Q4641536 http://www.wikidata.org/entity/Q379123 http://www.wikidata.org/entity/Q4832281 http://www.wikidata.org/entity/Q553129 http://www.wikidata.org/entity/Q7181438 http://www.wikidata.org/entity/Q423223 http://www.wikidata.org/entity/Q424931 http://www.wikidata.org/entity/Q422652 http://www.wikidata.org/entity/Q5398839 http://www.wikidata.org/entity/Q748200 http://www.wikidata.org/entity/Q16634590 http://www.wikidata.org/entity/Q414317 http://www.wikidata.org/entity/Q4973576 http://www.wikidata.org/entity/Q8213894 
http://www.wikidata.org/entity/Q6681547 http://www.wikidata.org/entity/Q2436886 http://www.wikidata.org/entity/Q3151476 http://www.wikidata.org/entity/Q908742 http://www.wikidata.org/entity/Q5009205 http://www.wikidata.org/entity/Q19597398 http://www.wikidata.org/entity/Q3429576 http://www.wikidata.org/entity/Q418886 http://www.wikidata.org/entity/Q7251822 http://www.wikidata.org/entity/Q3546864 http://www.wikidata.org/entity/Q4782227 http://www.wikidata.org/entity/Q7681179 http://www.wikidata.org/entity/Q14521943 http://www.wikidata.org/entity/Q15409426 http://www.wikidata.org/entity/Q4545640 http://www.wikidata.org/entity/Q1072477 http://www.wikidata.org/entity/Q417674 http://www.wikidata.org/entity/Q5261117 http://www.wikidata.org/entity/Q12746850 http://www.wikidata.org/entity/Q347621 http://www.wikidata.org/entity/Q21098991 http://www.wikidata.org/entity/Q620072 http://www.wikidata.org/entity/Q410281 http://www.wikidata.org/entity/Q420212 http://www.wikidata.org/entity/Q4639568 http://www.wikidata.org/entity/Q1027605 http://www.wikidata.org/entity/Q5385177 http://www.wikidata.org/entity/Q4674081 http://www.wikidata.org/entity/Q5135146 http://www.wikidata.org/entity/Q10859673 http://www.wikidata.org/entity/Q4914076 http://www.wikidata.org/entity/Q15628029 http://www.wikidata.org/entity/Q20707021 http://www.wikidata.org/entity/Q19414 http://www.wikidata.org/entity/Q4353551 http://www.wikidata.org/entity/Q7860340 http://www.wikidata.org/entity/Q415945 http://www.wikidata.org/entity/Q5272281
Bot importations
@ProteinBoxBot, SoCalChemBot, TaxonBot:, @Doc Taxon, Sebotic, Andrawaag: Please announce your import campaigns for chemicals and proteins on this page, including when each campaign ends. This will help with data curation and prevent bots from reimporting bad data after a manual correction.
Then we have to think about the future: we can't let the bots operate in the same manner after data curation. Even if a database provides some data, we can't just erase what is present in WD after manual curation. So future bot actions should avoid any data deletion and focus on data comparison, generating reports that indicate conflicts.
Then a remark for those bots adding molar mass as mass to chemicals. This is not a very good solution because the data can mix monoisotopic mass and average molecular mass. It would be best to provide only the number of each kind of atom and let people calculate the molecular mass according to their own choice.
Thank you. Snipre (talk) 08:12, 7 October 2016 (UTC)
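The suggestion to store only atom counts and derive the mass on demand can be illustrated with a minimal Python sketch. The two lookup tables and the function name `molecular_mass` are illustrative assumptions, not part of any bot discussed here; the average values are rounded standard atomic weights and therefore assume natural terrestrial isotope abundances.

```python
# Masses in unified atomic mass units (u) for a few common elements only.
# MONOISOTOPIC uses the most abundant isotope of each element; AVERAGE uses
# rounded standard atomic weights (assumes natural terrestrial abundances).
MONOISOTOPIC = {"H": 1.00782503, "C": 12.0, "N": 14.00307401, "O": 15.99491462}
AVERAGE = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_mass(atom_counts, mass_table):
    """Sum the per-element masses weighted by the number of atoms."""
    return sum(mass_table[element] * count for element, count in atom_counts.items())


# Methanol, CH4O, described only by its atom counts.
methanol = {"C": 1, "H": 4, "O": 1}
monoisotopic = molecular_mass(methanol, MONOISOTOPIC)
average = molecular_mass(methanol, AVERAGE)
print(f"monoisotopic: {monoisotopic:.4f} u, average: {average:.3f} u")
```

For methanol the two values differ by roughly 0.016 u, which is why a single unqualified mass statement is ambiguous.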
- +1.--Kopiersperre (talk) 11:07, 8 October 2016 (UTC)
- +1 --Ghilt (talk) 16:24, 9 October 2016 (UTC)
- +1 --Mabschaaf (talk) 16:38, 9 October 2016 (UTC)
- +1. A remark concerning TaxonBot: This task of adding ECHA Substance Infocard ID (P2566) based on CAS Registry Number (P231) was done based on pre-curated data. I am currently working on the remaining issues. --Leyo 17:40, 9 October 2016 (UTC)
- @Snipre: OK, if that runs as planned, ChEBI compound imports should be done by Wednesday. After that, there will be a UNII compound import run of another 5 days; maybe I will find a way to do imports faster. Regarding the curation: All of the newly imported/created items are in good shape, centered around an InChI key. The items which need human intervention are listed above (~1,030). These need to be centered around one InChI key too. In addition to those items, there are about another 1,000 that still have some wrong IDs on them (CAS, UNII, ChEBI, ChEMBL), but SMILES, InChI (key), PubChem CID and ChemSpider are OK. I can remove these with a bot automatically.
- For curation after a bot run: To make sure that a bot does not overwrite good curation, the curated values need to have good references according to the Wikidata ref guidelines for databases. Otherwise any curation work is futile, as references are essential for a bot to recognize human curation. But this only works for statements; labels, descriptions and aliases do not have refs. In addition, it is not realistic to do only one-time imports, because the original data sources evolve, improve, and expand.
- So these need to be kept in sync. What happens when data is imported into Wikidata once and never synced afterwards has been demonstrated by the import of chemical compound data from various infoboxes of various Wikipedias, which was done basically once with no continuous syncs afterwards. This is one major contributor to why there are still a ton of issues in the chem space of Wikidata.
- So if we can agree on the mandatory requirement for good refs, I will modify my bots so that they always keep the manually curated parts with good refs. Otherwise, there is no way to find out who made a good contribution. A list of user names is not a good way, because this will exclude anyone not on that list. Furthermore, keeping everything which has been contributed by users is also not a good way, because I have seen many wrong contributions from users who either had no idea of chemistry or were playing one of these curation games and got it wrong there. Ideas or suggestions?
- Regarding the mass: What I import is the monoisotopic mass as stated in PubChem. I can add a qualifier to make that more explicit, but I cannot see how a data user should be able to calculate an average mass if the user does not know the isotopic distribution. But I can certainly add average mass as well (or any other). Sebotic (talk) 07:58, 10 October 2016 (UTC)
- @Sebotic: Thanks for your answer. My concern is currently about the duplication of items about the same chemical: the bots add data once in one item, then in the second item. This can be solved by merging, but the problem is to be sure that both items are about the same chemical. Then we arrive at the second problem: the confusion between mixtures of stereoisomers and pure stereoisomers. This is a real problem, especially for the CAS number. I have huge problems curating CAS numbers because too few databases provide this identifier.
- Concerning curation, currently I delete and merge, with no new additions. The problem is that sometimes the original databases are wrong (typical case of confusion between mixtures of stereoisomers and pure stereoisomers) and I can't replace the wrong data with correct data (typical case for CAS numbers). In that case references don't help. And this is why the sync is not a good idea: after data import and curation, WD is no longer a compilation of what is given by other databases, but is a database in its own right and should be considered as such. Future imports are no longer possible; only comparison and conflict reports should be generated. No more massive bot actions, only manual correction based on bot comparison. I agree with you about references as the key element to judge whether data should be kept, but in the future bots will only play the role of data comparison, not data import or correction (to a large extent at least).
- Bot work is not a problem, but you have to agree that their actions will change after the first import: sync is not the goal; the goal is to provide a coherent set of data about one topic. If we agree on that, then we can go to the next step, which is the definition of a system where bots provide reports and contributors use them to curate and correct. So please don't work alone in your corner but try to work with this project when dealing with chemicals. The case of the mass is a good example of the lack of discussion: even if you are working from good reasoning, nobody knows which kind of mass you imported and no rules are defined for future data imports or additions. So the risk is very high that without guidelines, after some weeks people will mix different data using the same property. Snipre (talk) 08:53, 10 October 2016 (UTC)
- @Snipre: I agree that orienting around CAS is not a good idea, but this ID is so widely used that we should add it if we can. Therefore, I strongly advocate for using InChI keys; these are tied to the structure and uniquely identify a compound. So my basic premise is: The structure comes from scientific literature or chem/pharma companies; at Wikidata, we do not have the means to make a comment on the structure of a compound, which can be good or bad or incomplete. This is why I think that for some compounds, we will need to live with 2 or more versions of a compound, because the real, true structure is incomplete or wrong versions of the structure exist. The connectivity can serve as the common basis (in most cases there is no disagreement on that) but the isomerism might differ. These different isomers can be connected to each other using Wikidata properties, and can be detected by using SPARQL queries. And over time, hopefully, many of those will resolve, but certainly, we will not reach a point where each and every compound has a high quality structure. So if I add 2 compounds with the same connectivity but different isomer info, how do you know whether one is better than the other, or whether they are just 2 different isomers? I see no problem in having parallel versions. We can also have a compound without any stereoisomeric info as a minimum requirement and one or more defined stereoisomers, ideally of good quality.
- Regarding bot imports: I completely agree that Wikidata is an independent database and should not be the aggregator of other databases. On the contrary, we should make use of our flexibility and community curation. That said, senior figures in PubChem have told me personally that they are interested in tapping/using the community curation done in Wikidata. Still, I think we need to find mechanisms to not touch the community curated things, but still import the improvements made in the original sources. Moreover, we definitely need the new compounds added from those resources, because these are usually compounds of high interest (e.g. in Drugbank, UNII) with high medical/biological relevance and of public interest and also with biologic activity. As I said above, I think proper references are one way of doing that.
- Regarding import efforts and import of special data: I agree that I should have discussed the import of 'mass' beforehand. I will also announce any import campaigns beforehand.
- For error detection, I think we should use SPARQL queries. I also have a bot which can continuously check if SMILES and InChI (key) are consistent on an item and file a report if not. Sebotic (talk) 21:14, 10 October 2016 (UTC)
- @Sebotic: I don't have any problem with the CAS number; my problem is when you import the CAS number from PubChem and I delete it in WD because someone made a mistake by importing the wrong CAS number into PubChem. I can't provide the correct CAS number because I don't have access to SciFinder and most of the time Google can't provide a good answer. The main problem is the data curation in PubChem: they should do the same as us and analyze their CAS numbers to check if they are consistent with their structures.
- Just have a look at this compound in PubChem as an example: someone put the wrong chemical formula as the title for this entry in the PubChem database. I can't change that wrong data, so please don't reimport it with your bot. This is my only concern.
- About stereoisomers, I think we should focus on only two kinds of compounds: those which are completely defined and those which are not defined at all. The latter serve to group all possible completely defined stereoisomers. We don't need to create items for all possible stereoisomers in a systematic way, but when there is confusion between mixture and pure forms we should split the data in order to avoid the confusion in the future.
- For your other remarks, I think we agree on the main principle. The only difference from your approach concerns importing: I don't think we need to import data into WD in order to curate them afterwards. Once we have a fairly stable set of chemicals in WD, we can work by comparison using bots and then create conflict reports. And only after a manual check can we import the data from other databases. If we agree on that, we will avoid a lot of discussion later.
- For now I am working with the constraint violation reports: every day I can see the results of the curation. I don't need to use SPARQL for the moment. But if you want to create the queries, just do it. Snipre (talk) 22:28, 10 October 2016 (UTC)
- @Snipre: Regarding CAS numbers: I agree that PubChem does not do the best possible job here. But the CAS numbers I import are actually from UNII and ChEBI, not directly from PubChem. Still, these could be incorrect. I have access to SciFinder, Reaxys, etc., but I am very sure that I am not allowed to do a systematic import of CAS or Beilstein numbers to WD. One way to go could be to ask ACS directly if they would be willing to contribute an InChI key to CAS number mapping file.
- On the importation: In principle, I agree that it's a good idea to detect and log conflicts instead of overwriting. Two questions here: Is that feasible for thousands of items? And where would I post such a list of conflicts, so that it can be processed quickly? Text which then needs to be copied and pasted around by a curator is not a good way. I also agree on the stereochemistry part: either fully defined stereochemistry or no stereochemistry. But I think we need some flexibility here, because for very many important compounds, the only stereochemistry which exists is partial. From what I have seen, this is very common for naturally occurring, larger molecules. But for cases where several stereoisomers exist, take only the fully defined ones. Sebotic (talk) 23:40, 10 October 2016 (UTC)
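The "compare and report instead of overwrite" approach discussed in this thread could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function name `conflict_report`, the dictionary layout, and the item ID `Q_EXAMPLE` are hypothetical, and the conflicting external CAS value is a deliberately invalid placeholder. Only pseudocumene (Q376994) with CAS number 95-63-6 is taken from this page.

```python
def conflict_report(wikidata, external, properties):
    """List disagreements between Wikidata and an external source.

    Both inputs map item ID -> {property ID: value}. Nothing is written back:
    the function only reports, so curated values are never overwritten.
    """
    report = []
    for item_id, ext_values in external.items():
        wd_values = wikidata.get(item_id, {})
        for prop in properties:
            wd_value = wd_values.get(prop)
            ext_value = ext_values.get(prop)
            # Only a value present on both sides and different is a conflict.
            if wd_value is not None and ext_value is not None and wd_value != ext_value:
                report.append(
                    {"item": item_id, "property": prop,
                     "wikidata": wd_value, "external": ext_value}
                )
    return report


wikidata = {
    "Q376994": {"P231": "95-63-6"},       # pseudocumene, CAS from this page
    "Q_EXAMPLE": {"P231": "PLACEHOLDER"}, # hypothetical item
}
external = {
    "Q376994": {"P231": "00-00-0"},       # placeholder for a wrong external value
    "Q_EXAMPLE": {"P231": "PLACEHOLDER"}, # agrees, so not reported
}
report = conflict_report(wikidata, external, ["P231"])
for row in report:
    print(row)
```

A structured list like this could be posted as a table or fed to a review tool, which would address the concern about curators copying and pasting plain text.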
I suggest importing the remaining ZVG number (P679) based on CAS Registry Number (P231) and/or EC number (P232). This task is unlikely to create significant issues. Of the 8745 ZVG numbers available in an Excel list from there, we currently have slightly less than half. --Leyo 15:19, 10 October 2016 (UTC)
- @Leyo: I'm reluctant to import entries like 900063. The remaining ZVG entries seem not relevant to me, but when anyone imports data, we should do so, too.--Kopiersperre (talk) 16:14, 10 October 2016 (UTC)
- I did not ask for the creation of new items based on this list. IMHO it is sufficient to add those with a match in either CAS or EC number (or both). --Leyo 17:46, 10 October 2016 (UTC)
- @Leyo: Sorry for getting you wrong. The import was done by this Mix-n-Match catalog and can be resumed at any time. But I think, there is not much to do.--Kopiersperre (talk) 21:46, 10 October 2016 (UTC)
- Before any importation, can we first do a data comparison? For example: take the list of CAS numbers from Gestis, match the items by their CAS number and then compare the EINECS number from Gestis with the EINECS number from WD. If both the CAS number and the EINECS number match, then we can think about data importation if and only if the CAS number is used only once. My concern about Gestis is that it can use the same CAS number/EINECS number several times, as for hydrogen chloride and hydrochloric acid solution (two ZVG numbers but one CAS number and one EINECS number).
- But before any importation we have to solve all violations of the constraints for CAS numbers and EINECS numbers. Snipre (talk) 21:08, 10 October 2016 (UTC)
- There are only very few constraint violations for the latter. Unclear cases should be skipped, and if possible, listed for manual review. --Leyo 23:37, 10 October 2016 (UTC)
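The matching rule proposed in this thread (match on CAS number, require the EINECS/EC number to agree, import only when the CAS number is used exactly once, and list everything else for manual review) can be sketched in Python as follows. The function name `plan_zvg_import`, the record layout, and all identifiers in the sample data are hypothetical placeholders, not real ZVG, CAS or EC values.

```python
from collections import Counter


def plan_zvg_import(gestis_rows, wd_items):
    """Split Gestis rows into safe imports and cases for manual review.

    Rows without a CAS match in Wikidata are skipped (no new items created).
    """
    gestis_cas_count = Counter(row["cas"] for row in gestis_rows)
    wd_cas_count = Counter(item["cas"] for item in wd_items)
    wd_by_cas = {item["cas"]: item for item in wd_items}

    to_import, to_review = [], []
    for row in gestis_rows:
        item = wd_by_cas.get(row["cas"])
        if item is None:
            continue
        cas_is_unique = gestis_cas_count[row["cas"]] == 1 and wd_cas_count[row["cas"]] == 1
        if cas_is_unique and row["ec"] == item["ec"]:
            to_import.append((item["qid"], row["zvg"]))
        else:
            to_review.append((item["qid"], row["zvg"]))
    return to_import, to_review


# Placeholder data: CAS-1 appears twice in Gestis, like a pure substance
# and its solution sharing one CAS number.
gestis_rows = [
    {"zvg": "ZVG_1", "cas": "CAS-1", "ec": "EC-1"},
    {"zvg": "ZVG_2", "cas": "CAS-1", "ec": "EC-1"},
    {"zvg": "ZVG_3", "cas": "CAS-2", "ec": "EC-2"},
    {"zvg": "ZVG_4", "cas": "CAS-9", "ec": "EC-9"},  # no Wikidata match
]
wd_items = [
    {"qid": "Q_A", "cas": "CAS-1", "ec": "EC-1"},
    {"qid": "Q_B", "cas": "CAS-2", "ec": "EC-2"},
]
to_import, to_review = plan_zvg_import(gestis_rows, wd_items)
print("import:", to_import)
print("review:", to_review)
```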
Creating items for Cosmetic properties
In the COSING EU database about cosmetics, the chemical components have one or several cosmetic properties. We might want to create items for those before importing COSING data. --Teolemon (talk) 15:57, 16 October 2016 (UTC)
en:ABRASIVE en:definition:Removes materials from various body surfaces or aids mechanical tooth cleaning or improves gloss
en:ABSORBENT en:definition:Takes up water- and/or oil-soluble dissolved or finely dispersed substances
en:ANTICAKING en:definition:Allows free flow of solid particles and thus avoids agglomeration of powdered cosmetics into lumps or hard masses
en:ANTICORROSIVE en:definition:Prevents corrosion of the packaging
en:ANTIDANDRUFF en:definition:Helps control dandruff
en:ANTIFOAMING en:definition:Suppresses foam during manufacturing or reduces the tendency of finished products to generate foam
en:ANTIMICROBIAL en:definition:Helps control the growth of micro-organisms on the skin
en:ANTIOXIDANT en:definition:Inhibits reactions promoted by oxygen, thus avoiding oxidation and rancidity
en:ANTIPERSPIRANT en:definition:Reduces perspiration
en:ANTIPLAQUE en:definition:Helps protect against plaque
en:ANTISEBORRHOEIC en:definition:Helps control sebum production
en:ANTISTATIC en:definition:Reduces static electricity by neutralising electrical charge on a surface
en:ASTRINGENT en:definition:Contracts the skin
en:BINDING en:definition:Provides cohesion in cosmetics
en:BLEACHING en:definition:Lightens the shade of hair or skin
en:BUFFERING en:definition:Stabilises the pH of cosmetics
en:BULKING en:definition:Reduces bulk density of cosmetics
en:CHELATING en:definition:Reacts and forms complexes with metal ions which could affect the stability and/or appearance of cosmetics
en:CLEANSING en:definition:Helps to keep the body surface clean
en:COSMETIC COLORANT en:definition:Colours cosmetics and/or imparts colour to the skin and/or its appendages. All colours listed are substances on the positive list of colorants (Annex IV of the Cosmetics Directive)
en:DENATURANT en:definition:Renders cosmetics unpalatable. Mostly added to cosmetics containing ethyl alcohol
en:DEODORANT en:definition:Reduces or masks unpleasant body odours
en:DEPILATORY en:definition:Removes unwanted body hair
en:DETANGLING en:definition:Reduces or eliminates hair intertwining due to hair surface alteration or damage and, thus, helps combing
en:EMOLLIENT en:definition:Softens and smooths the skin
en:EMULSIFYING en:definition:Promotes the formation of intimate mixtures of non-miscible liquids by altering the interfacial tension
en:EMULSION STABILISING en:definition:Helps the process of emulsification and improves emulsion stability and shelf-life
en:FILM FORMING en:definition:Produces, upon application, a continuous film on skin, hair or nails
en:FLAVOURING en:definition:Gives flavour to the cosmetic product
en:FOAM BOOSTING en:definition:Improves the quality of the foam produced by a system by increasing one or more of the following properties: volume, texture and/or stability
en:FOAMING en:definition:Traps numerous small bubbles of air or other gas within a small volume of liquid by modifying the surface tension of the liquid
en:GEL FORMING en:definition:Gives the consistency of a gel (a semi-solid preparation with some elasticity) to a liquid preparation
en:HAIR CONDITIONING en:definition:Leaves the hair easy to comb, supple, soft and shiny and/or imparts volume, lightness, gloss, etc.
en:HAIR DYEING en:definition:Colours hair
en:HAIR FIXING en:definition:Permits physical control of hair style
en:HAIR WAVING OR STRAIGHTENING en:definition:Modifies the chemical structure of the hair, allowing it to be set in the style required
en:HUMECTANT en:definition:Holds and retains moisture
en:HYDROTROPE en:definition:Enhances the solubility of substance which is only slightly soluble in water
en:KERATOLYTIC en:definition:Helps eliminate the dead cells of the stratum corneum
en:MASKING en:definition:Reduces or inhibits the basic odour or taste of the product
en:MOISTURISING en:definition:Increases the water content of the skin and helps keep it soft and smooth
en:NAIL CONDITIONING en:definition:Improves the cosmetic characteristics of the nail
en:NOT REPORTED en:definition:NOT REPORTED
en:OPACIFYING en:definition:Reduces transparency or translucency of cosmetics
en:ORAL CARE en:definition:Provides cosmetic effects to the oral cavity, e.g. cleansing, deodorising, protecting
en:OXIDISING en:definition:Changes the chemical nature of another substance by adding oxygen or removing hydrogen
en:PEARLESCENT en:definition:Imparts a nacreous appearance to cosmetics
en:PERFUMING en:definition:Used for perfume and aromatic raw materials (Section II)
en:PLASTICISER en:definition:Softens and makes supple another substance that otherwise could not be easily deformed, spread or worked out
en:PRESERVATIVE en:definition:Inhibits primarily the development of micro-organisms in cosmetics. All preservatives listed are substances on the positive list of preservatives (Annex VI of the Cosmetics Directive)
en:PROPELLANT en:definition:Generates pressure in an aerosol pack, expelling contents when the valve is opened. Some liquefied propellants can act as solvents
en:REDUCING en:definition:Changes the chemical nature of another substance by adding hydrogen or removing oxygen
en:REFATTING en:definition:Replenishes the lipids of the hair or of the top layers of the skin
en:REFRESHING en:definition:Imparts a pleasant freshness to the skin
en:SKIN CONDITIONING en:definition:Maintains the skin in good condition
en:SKIN PROTECTING en:definition:Helps to avoid harmful effects to the skin from external factors
en:SMOOTHING en:definition:Seeks to achieve an even skin surface by decreasing roughness or irregularities
en:SOLVENT en:definition:Dissolves other substances
en:SOOTHING en:definition:Helps lightening discomfort of the skin or of the scalp
en:STABILISING en:definition:Improves ingredients or formulation stability and shelf-life
en:SURFACTANT en:definition:Lowers the surface tension of cosmetics as well as aids the even distribution of the product when used
en:TANNING en:definition:Darkens the skin with or without exposure to UV
en:TONIC en:definition:Produces a feeling of well-being on skin and hair
en:UV ABSORBER en:definition:Protects the cosmetic product from the effects of UV-light
en:UV FILTER en:definition:Filters certain UV rays in order to protect the skin or the hair from harmful effects of these rays. All UV filters listed are substances on the positive list of UV filters (Annex VII of the Cosmetics Directive)
en:VISCOSITY CONTROLLING en:definition:Increases or decreases the viscosity of cosmetics
- The best is to use the property has use (P366) for that list. Snipre (talk) 20:01, 16 October 2016 (UTC)
- I actually have a multilingual taxonomy with many more languages - http://en.wiki.openbeautyfacts.org/Global_properties_taxonomy --Teolemon (talk) 21:07, 16 October 2016 (UTC)
Importing Pigment CICN numbers (Colour Index)
- The Colour Index International constitution ID (P2027) was created a while ago.
- Many CICN numbers are present in labels or infoboxes ("CI 12345", "C.I. 12345", "Colour Index 12345").
- So far, while looking for Mix'n'match import candidates, I've only found short lists of pigments that are 20-100 values long.
- However, the labels and external databases seem to have most of them.
Is it possible to source them from an external db or to extract them from labels with a regex?
It would be tremendously useful for Open Beauty Facts; that way we could decipher what's in your shampoo: List of ingredients of your favorite shampoo
--Teolemon (talk) 08:46, 17 October 2016 (UTC)
- Some data can be found here but CI number is a non-free system. Snipre (talk) 10:00, 17 October 2016 (UTC)
- My understanding is that having the identifiers on an item to link to their system is not an issue ? If they claim it's proprietary, this is very disturbing, since it is used on all the cosmetics you use daily, as if it was a standard… --Teolemon (talk) 12:11, 17 October 2016 (UTC)
- List ready at the bottom of the page. It contains what I think are the CAS number and the CICN. I'm not quite sure how to add statements based on another statement. http://en.wiki.openbeautyfacts.org/Global_colour_index_taxonomy --Teolemon (talk) 12:53, 17 October 2016 (UTC)
- The problem is not to import the data, the problem is to access the data. I don't think there is a free database with all CI numbers. We can use the numbers but we can't import all the database in WD. Snipre (talk) 14:02, 17 October 2016 (UTC)
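As a rough illustration of pulling CICN values out of labels with a regex, here is a minimal Python sketch covering the three spellings mentioned in this thread. The pattern, the function name `extract_ci_number`, and the sample labels are assumptions for illustration; in particular, the sketch assumes a plain five-digit number and ignores any suffixes that real Colour Index entries may carry.

```python
import re

# Matches "CI 12345", "C.I. 12345" and "Colour Index 12345" (also "Color Index").
# Assumption: the constitution number is exactly five digits.
CI_PATTERN = re.compile(
    r"\b(?:C\.?\s?I\.?|Colou?r\s+Index)\s*(\d{5})\b",
    re.IGNORECASE,
)


def extract_ci_number(label):
    """Return the five-digit number found in a label, or None if absent."""
    match = CI_PATTERN.search(label)
    return match.group(1) if match else None


labels = [
    "CI 12345",
    "C.I. 12345 (pigment)",
    "Colour Index 12345",
    "a label without any number",
    "CI 123456",  # six digits: rejected under the five-digit assumption
]
results = [extract_ci_number(label) for label in labels]
for label, number in zip(labels, results):
    print(f"{label!r} -> {number}")
```

Any values extracted this way would still need a manual check before being added as statements, since labels can mention a number without being about that pigment.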
Creating Wikidata Items for GHS hazard statements
Currently the GHS hazard statements are stored as strings in items. I feel that creating items for each GHS hazard statement could be interesting, especially since H302 will translate not only to an English sentence, but to sentences in many languages. Here's what I have done for Open Beauty Facts:
- http://en.wiki.openbeautyfacts.org/Global_risks_taxonomy
- http://en.wiki.openbeautyfacts.org/Global_safety_taxonomy
--Teolemon (talk) 15:41, 16 October 2016 (UTC)
- We can create the new properties under Wikidata:Property_proposal/Natural_science#Chemistry. Snipre (talk) 20:04, 16 October 2016 (UTC)
- @Teolemon: That's what I proposed here. I think we need no property, we should just change P728 (P728) and P940 (P940) from string to property.--Kopiersperre (talk) 20:08, 16 October 2016 (UTC)
- @Kopiersperre: We can't change the datatype of a property: we have to create a new one. Snipre (talk) 09:22, 24 October 2016 (UTC)
- Since the properties are used in very few items, we may remove them all and then change the datatype. --Leyo 17:56, 24 October 2016 (UTC)
- My source is : http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:32001L0059:EN:HTML --Teolemon (talk) 21:31, 16 October 2016 (UTC)
- Well, this is about the system previous to GHS (CLP in the EU), i.e. Dangerous Substances Directive (Q899329). --Leyo 17:56, 24 October 2016 (UTC)
Some resources
To invite you to share your tools, I have created a section for chemistry-related SPARQL queries in Wikidata:WikiProject_Chemistry/Tools#SPARQL_queries. To avoid reimport of wrong data from external databases, please report all errors in Wikidata:WikiProject_Chemistry/References#Report_of_errors_in_reference_databases. I hope we will find a way to contact the administrators of the different databases at some point to inform them about problems in their datasets. Snipre (talk) 11:46, 19 October 2016 (UTC)
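One check of the kind that could be shared there is grouping items by the first block of their InChIKey, which encodes the connectivity; items sharing that block are candidates for the stereoisomer-versus-mixture confusion discussed in the bot importation thread. The Python sketch below is only an illustration: the function name `group_by_connectivity` and the item IDs are hypothetical, and the input is assumed to be an export of item ID to InChIKey pairs.

```python
from collections import defaultdict


def group_by_connectivity(inchikeys_by_item):
    """Group item IDs by the first (14-character) block of their InChIKey.

    Only groups with more than one item are returned, since those are the
    ones that may need a stereoisomer/mixture review.
    """
    groups = defaultdict(list)
    for item_id, inchikey in inchikeys_by_item.items():
        connectivity_block = inchikey.split("-")[0]
        groups[connectivity_block].append(item_id)
    return {block: sorted(ids) for block, ids in groups.items() if len(ids) > 1}


# Item IDs are placeholders; the InChIKeys are for the compounds named.
items = {
    "item_A": "QNAYBMKLOCPYGJ-REOHSZPGSA-N",  # L-alanine
    "item_B": "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",  # alanine, stereochemistry unspecified
    "item_C": "OKKJLVBELUTLKV-UHFFFAOYSA-N",  # methanol
}
candidates = group_by_connectivity(items)
print(candidates)
```

A group is not an error in itself: a fully defined stereoisomer and an undefined parent can legitimately coexist, so the output is a review list rather than a merge list.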
IECIC id (cosmetics in China)
Please review https://www.wikidata.org/wiki/Wikidata:Property_proposal/IECIC_id --Teolemon (talk) 07:18, 20 October 2016 (UTC)
Qualifier for reactions
Notified participants of WikiProject Chemistry
Has anyone figured out how to document reactions? Or is there a place where this is already being discussed? Over on Meta-Wiki I discovered there has been a proposal for a Wikichem (see discussion), which I think is a great idea but could use more feedback. And I foresee its implementation depending on how much Wikidata can support. Devon Fyson (talk) 05:25, 11 November 2016 (UTC)
- @Devon Fyson: With WD, we don't need to create a new structure Wikichem. And when you see the activity in this project, which is the most similar to a Wikichem, I think you can easily deduce that a Wikichem will have very few contributors.
- About reactions, there is no rule or model yet, but if you want you can start a section under Wikidata:WikiProject_Chemistry/Tools and put a draft reaction model there. Snipre (talk) 20:06, 15 November 2016 (UTC)