Documentation

KEGG ID
identifier from databases dealing with genomes, enzymatic pathways, and biological chemicals
Description: ID in Kyoto Encyclopedia of Genes and Genomes (Q909442), a collection of online databases dealing with genomes, enzymatic pathways, and biological chemicals. The PATHWAY database records networks of molecular interactions in the cells, and variants of them specific to particular organisms. In July 2011, KEGG switched to a subscription model, and access via FTP is no longer free.
Applicable "stated in" value: Kyoto Encyclopedia of Genes and Genomes (Q909442)
Data type: External identifier
Template parameter: Template:Infobox drug (Q6033882) KEGG, Template:Chembox (Q52426) KEGG
Domain
According to this template: genomes, enzymatic pathways, and biological chemicals
According to statements in the property:
type of chemical entity (Q113145171), crude drug (Q735160), medication (Q12140), chemical element (Q11344), structural class of chemical entities (Q47154513), group of chemical entities (Q55640599), biological process (Q2996394), disease (Q12136) or mixture (Q169336)
When possible, data should only be stored as statements
Allowed values: [A-Z]\d+
Example
According to this template: DL-ascorbic acid (Q193598) = "D00018", ketamine (Q243547) = "D08098", phenothiazine (Q410846) = "D02601". Should the leading "D" be omitted?
According to statements in the property:
DL-ascorbic acid (Q193598): D00018 (RDF)
sulfur dioxide (Q5282): D05961 (RDF)
When possible, data should only be stored as statements
Source: https://www.kegg.jp/
Formatter URL: https://www.kegg.jp/entry/$1
Robot and gadget jobs: Gather data from infoboxes and website.
Tracking: usage: Category:Pages using Wikidata property P665 (Q26250085)
Lists
Proposal discussion: Proposal discussion
Current uses
Total: 21,255
Main statement: 21,228 (99.9% of uses)
Qualifier: 2 (<0.1% of uses)
Reference: 25 (0.1% of uses)
Search for values
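The formatter URL listed above turns a stored value into a link by substituting the value for the `$1` placeholder. A minimal Python sketch of that substitution follows. The function name `kegg_entry_url` is hypothetical, and stripping surrounding whitespace is an assumption made here so that a value such as "D08098 " does not produce a broken link.

```python
# Formatter URL as listed in the property documentation; "$1" is the placeholder.
FORMATTER_URL = "https://www.kegg.jp/entry/$1"


def kegg_entry_url(kegg_id: str) -> str:
    """Return the entry URL for a KEGG ID (hypothetical helper).

    Surrounding whitespace is stripped first; this is an assumption of the
    sketch, not documented behaviour of the formatter.
    """
    return FORMATTER_URL.replace("$1", kegg_id.strip())


print(kegg_entry_url("D00018"))  # https://www.kegg.jp/entry/D00018
```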
Format “([CDEGHRK]|hsa|map|rn|RC)\d{5}”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303). Known exceptions: Folate biosynthesis (Q60602831)
List of violations of this constraint: Database reports/Constraint violations/P665#Format, SPARQL
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P665#Unique value, SPARQL (every item), SPARQL (by value)
Format “[A-Za-z]*\d+”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P665#Format, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P665#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used by specified way only (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P665#Scope, SPARQL
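The two format constraints listed above differ in strictness, which can be checked locally before adding a value. The sketch below uses Python's `re` module rather than PCRE (these two patterns use only syntax common to both) and assumes that a value has to match the pattern in full, hence `fullmatch`. The names `STRICT`, `LOOSE`, and `check_formats` are hypothetical.

```python
import re

# Patterns copied from the two format constraints on this property.
STRICT = re.compile(r"([CDEGHRK]|hsa|map|rn|RC)\d{5}")
LOOSE = re.compile(r"[A-Za-z]*\d+")


def check_formats(value: str) -> dict:
    """Report which of the two format constraints a value satisfies.

    Assumes the whole value must match, so fullmatch is used.
    """
    return {
        "strict": STRICT.fullmatch(value) is not None,
        "loose": LOOSE.fullmatch(value) is not None,
    }


for candidate in ["D00018", "map00790", "X00018", "D08098 "]:
    print(repr(candidate), check_formats(candidate))
```

Under these assumptions, a value with an unlisted prefix letter passes only the looser pattern, and a value with a trailing space passes neither.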


Split identifier into KEGG Drug Identifier, KEGG Compound Identifier, etc?

Hi all, since there is now a restriction to have only one KEGG identifier, this causes problems for compounds which actually do have multiple identifiers, like a KEGG Drug, a KEGG Compound, and a KEGG Glycan identifier. I propose to split up this identifier into the various identifiers. I will write these up this weekend. Any objections? Or things you want me to take into account when writing this up? --Egon Willighagen (talk) 10:23, 15 December 2018 (UTC)

  Notified participants of WikiProject Chemistry

@Egon Willighagen: It looks like the majority of the constraint violations refer to the "distinct values constraint" (same value on multiple items) and not a "unique values constraint" (multiple statements on a single item), which is the problem that I think your solution would solve. Unless I'm misreading this? If I've got that right, then I would either support removing that constraint, or simply letting those rare (proportionally speaking) constraint violations exist. Best, Andrew Su (talk) 01:04, 20 December 2018 (UTC)
@Andrew Su: Yes, removing the "unique values constraint" would solve part of the problem. I do not know the full scale of the issue, but it is indeed likely fewer than a few tens of thousands. But by making it more specific we can validate things more accurately. By just removing the "unique values constraint" we cannot check with the constraints system that it only has one "Cxxxxx" identifier (but there are other ways of doing that, of course). --Egon Willighagen (talk) 07:34, 20 December 2018 (UTC)
@Egon Willighagen: just FYI, I don't believe that the "unique values constraint" is currently applied to this property. So it sounds like you are proposing creating more specific identifier properties and adding that constraint to each. That sounds fine to me, though I also think that the status quo is sufficient for the Gene Wiki team's use cases... Best, Andrew Su (talk) 17:48, 20 December 2018 (UTC)
@Egon Willighagen: The values that should be flagged are those from the same category (drug, compound etc). They start with the same character and this can be checked by a complex constraint query. No need for multiple properties. --SCIdude (talk) 06:46, 20 August 2020 (UTC)
True. I could actually run a script that tags everything with a KEGG identifier starting with a D as drug... --Egon Willighagen (talk) 06:43, 21 August 2020 (UTC)
@Jura1: I think many KEGG Drug entries also have KEGG Compound identifiers. The exact number is not easy to determine from KEGG itself, but it is likely a few thousand, with close to 1000 already in Wikidata. --Egon Willighagen (talk) 07:26, 20 December 2018 (UTC)
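The same-category check suggested in this thread (flag an item only when it carries two identifiers with the same leading prefix) could be prototyped offline along the following lines. This is only a sketch in Python, not the complex constraint query mentioned above. The function name `same_category_conflicts`, the item keys, and the sample values are all placeholders invented for illustration, not real Wikidata mappings.

```python
import re
from collections import defaultdict

# Leading letters of a KEGG ID, e.g. "D" (drug), "C" (compound), "map".
PREFIX = re.compile(r"[A-Za-z]+")


def same_category_conflicts(ids_by_item: dict) -> dict:
    """Return items that hold more than one KEGG ID with the same prefix."""
    conflicts = {}
    for item, kegg_ids in ids_by_item.items():
        by_prefix = defaultdict(list)
        for kegg_id in kegg_ids:
            match = PREFIX.match(kegg_id)
            by_prefix[match.group(0) if match else ""].append(kegg_id)
        # Keep only prefixes that occur more than once on this item.
        duplicated = {p: ids for p, ids in by_prefix.items() if len(ids) > 1}
        if duplicated:
            conflicts[item] = duplicated
    return conflicts


# Placeholder data: item_a has one drug and one compound ID (acceptable),
# item_b has two compound IDs (flagged).
sample = {
    "item_a": ["D00001", "C00001"],
    "item_b": ["C00001", "C00002"],
}
print(same_category_conflicts(sample))
```

With this approach an item that has both a drug and a compound identifier is left alone, which is the case the thread wants to allow without creating separate properties.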