User:ProteinBoxBot/Drug items
Introduction
The ProteinBoxBot maintains information about Genes, Diseases and Drugs in Wikidata. Each of these three domains is maintained by its own sub-process of the main bot.
The objective of the Drug sub-bot is to add and maintain Wikidata items for all drugs relevant to human health.
Intended Scope
FDA-approved drugs and drug combinations.
Items maintained by this bot
The following Wikidata query retrieves all items currently maintained by the bot, that is, all items that are both instances of a pharmaceutical drug (CLAIM[31:12140]) and have a Drugbank ID (CLAIM[715]).
- Autolist (human) CLAIM[31:12140] AND CLAIM[715]
- API query (machine) CLAIM[31:12140] AND CLAIM[715]
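The CLAIM expression above can also be written as a SPARQL query for the Wikidata Query Service. The following is a minimal sketch, not part of the bot: the helper name `claims_to_sparql` is hypothetical, and the only assumption about the endpoint is that the `wd:` and `wdt:` prefixes are predefined there.

```python
def claims_to_sparql(claims):
    """Build a SPARQL query from (property, item) pairs.

    (31, 12140) means "P31 has the value Q12140";
    (715, None) means "has any value for P715".
    All pairs are combined with AND, as in the CLAIM expression.
    """
    patterns = []
    for index, (prop, item) in enumerate(claims):
        if item is None:
            # Any value: bind it to a throwaway variable.
            patterns.append(f"?item wdt:P{prop} ?value{index} .")
        else:
            patterns.append(f"?item wdt:P{prop} wd:Q{item} .")
    body = "\n  ".join(patterns)
    return f"SELECT DISTINCT ?item WHERE {{\n  {body}\n}}"


# CLAIM[31:12140] AND CLAIM[715]
query = claims_to_sparql([(31, 12140), (715, None)])
print(query)
```

The resulting string can be pasted into the query service or sent to it by any HTTP client.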
Bot test edits
- Calcidiol
- Biotin
- Aspartame
- Menadione
- Ampicillin
- Abciximab
- Cyclosporine
- Palivizumab
- Famciclovir
In order to query all bot test edit items, please use CLAIM[31:12140] AND CLAIM[715]
Prototype items
Data sources
The data sources for this effort are the open databases National Drug File, DrugBank, PubChem and ChEMBL. DrugBank is used to determine the list of compounds currently approved by the FDA; it also provides a set of basic identifiers. To acquire more data, the RDF API of PubChem is used to retrieve the PubChem ID and MeSH ID. The National Drug File REST API is used to get the FDA UNII, and ChEMBL is queried directly for the ChEMBL ID.
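The aggregation described above could be sketched as follows. This is an illustration under stated assumptions, not the bot's code: the function `merge_drug_record`, the field names and the placeholder values are all hypothetical, and no source API is actually called.

```python
def merge_drug_record(drugbank_entry, *source_lookups):
    """Merge identifiers from secondary sources into a DrugBank-based record.

    The DrugBank entry is treated as the starting point; values from later
    sources only fill fields that are still missing, and empty results
    (None) are skipped.
    """
    record = dict(drugbank_entry)
    for lookup in source_lookups:
        for field, value in lookup.items():
            if value is not None and field not in record:
                record[field] = value
    return record


# Placeholder values only; real identifiers would come from the sources.
drugbank = {"drugbank_id": "DRUGBANK-PLACEHOLDER", "label": "example drug"}
pubchem = {"pubchem_id": "PUBCHEM-PLACEHOLDER", "mesh_id": "MESH-PLACEHOLDER"}
ndf = {"unii": "UNII-PLACEHOLDER"}
chembl = {"chembl_id": None}  # simulates a lookup that found nothing

record = merge_drug_record(drugbank, pubchem, ndf, chembl)
print(sorted(record))
```

Letting the first source win on conflicts is one possible design choice; the page does not state how the bot resolves disagreements between sources.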
Output
The bot will be able to add new drugs appearing in its source databases to Wikidata. It will also modify existing Wikidata drug items if new information becomes available in these data sources. In addition to labels and aliases, the data added for a Wikidata drug item will initially comprise the properties listed below.
Properties edited by the bot
Property | Description | Datatype |
---|---|---|
Property:P31 | instance of | item |
Property:P636 | route of administration | item |
Property:P267 | ATC code | string |
Property:P231 | CAS registry number | string |
Property:P486 | MeSH ID | string |
Property:P672 | MeSH Code | string |
Property:P662 | PubChem ID | External ID |
Property:P661 | ChemSpider ID | External ID |
Property:P652 | UNII | External ID |
Property:P665 | KEGG ID | External ID |
Property:P683 | ChEBI ID | External ID |
Property:P274 | chemical formula | string |
Property:P715 | Drugbank ID | External ID |
Property:P592 | ChEMBL ID | External ID |
Property:P233 | SMILES | string |
Property:P234 | InChI | string |
Property:P235 | InChIKey | string |
Property:P2275 | World Health Organisation International Nonproprietary Name | Monolingual text |
Property:P657 | RTECS Number | string |
Property:P2115 | NDF-RT ID | External ID |
Implementation
The bot is split into two parts: the drug data aggregator, and the actual drug bot, which uses the aggregated data to write to Wikidata. The bot code is open source and available for inspection.
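The hand-off between the two parts might look like the sketch below, in which an aggregated record is turned into (property, value) pairs ready for writing. The property IDs are taken from the table of properties edited by the bot; the record field names, the `record_to_claims` function and the placeholder values are hypothetical, and the actual write to Wikidata is deliberately left out.

```python
# Hypothetical record fields mapped to property IDs from the table above.
FIELD_TO_PROPERTY = {
    "drugbank_id": "P715",
    "pubchem_id": "P662",
    "mesh_id": "P486",
    "unii": "P652",
    "chembl_id": "P592",
}


def record_to_claims(record):
    """Convert an aggregated drug record into (property, value) pairs.

    Every item is marked as an instance of pharmaceutical drug
    (P31 -> Q12140); fields that are missing or empty are skipped.
    """
    claims = [("P31", "Q12140")]
    for field, prop in FIELD_TO_PROPERTY.items():
        value = record.get(field)
        if value:
            claims.append((prop, value))
    return claims


# Placeholder record, as the aggregator part might produce it.
example = {"drugbank_id": "DRUGBANK-PLACEHOLDER", "unii": "UNII-PLACEHOLDER"}
claims = record_to_claims(example)
print(claims)
```

Keeping the mapping in a single dictionary means a newly supported property only needs one extra entry on the writing side.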
Bot approval
Bot approval discussion August 2015: Wikidata:Requests_for_permissions/Bot/ProteinBoxBot_4