User:ProteinBoxBot/2016 ShEx sprint

Overall summary

Exploring Shape Expressions (ShEx) to validate data added to Wikidata by bots.

Participants

Gameplan

  1. Extract a subset of the data added by PBB (ProteinBoxBot) to use as a test case.
  2. Write a shape expression that checks whether a Wikidata item on a disease contains a Disease Ontology ID.
  3. Identify a set of valuable data expressions.
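
Step 2 could start from something like the following. This is a minimal sketch, not the shape written during the sprint: the shape name <Disease> is hypothetical, and it assumes the Disease Ontology ID is stored under property P699 and exposed as a plain literal through the direct-property namespace used elsewhere on this page.

```shex
PREFIX direct: <http://www.wikidata.org/prop/direct/>

# Hypothetical shape name; not taken from the sprint notes.
start = @<Disease>

<Disease> {
  # Assumption: P699 is the Disease Ontology ID property.
  # "+" requires at least one value, so an item without any
  # Disease Ontology ID fails validation.
  direct:P699 LITERAL+
}
```

Because the shape is open (it does not use CLOSED), any other statements on the disease item are ignored, so it only reports items where the identifier is missing.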

Example Validation

Validate this gene

Requirements

Validate references also?

gene.ttl

PREFIX direct: <http://www.wikidata.org/prop/direct/>
PREFIX ent: <http://www.wikidata.org/entity/> 
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pref: <http://www.wikidata.org/prop/reference/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pstate: <http://www.wikidata.org/prop/statement/>
PREFIX qual: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ref: <http://www.wikidata.org/reference/>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX state: <http://www.wikidata.org/entity/statement/>
PREFIX val: <http://www.wikidata.org/prop/reference/value/>
PREFIX wikba: <http://wikiba.se/ontology#>
PREFIX wikd: <http://wikiba.se/ontology#>
PREFIX xml: <http://www.w3.org/XML/1998/namespace>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ent:Q17861702 p:P1057 state:Q17861702-CA3E84B9-2404-428B-9AD2-3DE29ABB3FA2 ;
   p:P2548 state:Q17861702-0ED35367-33E1-4946-B508-F1437EE9A4B8 ;
   p:P279 state:Q17861702-7AA31ED4-C3FB-42B9-8CC6-6F4BAA1D5EA7 ;
   p:P2888 state:Q17861702-92391B71-07BB-4E1C-9381-408818596076 ;
   p:P351 state:Q17861702-B42D312B-8E55-454C-9D9B-30CE8470897E ;
   p:P353 state:Q17861702-FCC4CAB2-5B33-4271-B880-158CDC703397 ;
   p:P354 state:Q17861702-540AA318-7549-4C73-9532-827BC7D75238 ;
   p:P593 state:Q17861702-A34B8893-7ECC-4E49-9A39-4C548742DE8C ;
   p:P594 state:Q17861702-370E4E59-9E21-43F5-A6DF-10083805A620 ;
   p:P639 state:Q17861702-EBBC167A-027D-4008-96FA-95C66053A6CB ;
   p:P644 state:Q17861702-016DE9B5-A71B-4652-B4F3-E608ABB6E730,
       state:Q17861702-47FCA8A0-60CD-4D5B-8B8D-5E696FF32463 ;
   p:P645 state:Q17861702-8FDE462B-8548-4F9E-B11F-2E76CAD61229,
       state:Q17861702-95EA5477-1655-4A01-9E3C-29359694C3F1 ;
   p:P684 state:Q17861702-28BC9302-6B24-4BE1-B1DF-E221C3B9A222 ;
   p:P688 state:Q17861702-431573F8-5238-4758-8254-79BAFB5DE2DD ;
   p:P692 state:Q17861702-BD3AC56D-BEF7-40A5-89A1-5C7876428961,
       state:Q17861702-D03376D8-1EC0-4711-9062-2904D448D03D ;
   p:P703 state:Q17861702-995FB37D-C9FB-4C15-A532-30101C313EA0 ;
   p:P704 state:Q17861702-0DBED063-9218-468C-958D-77A716E869D4,
       state:Q17861702-2D01758A-4B9A-4313-A43C-7F6A412B93AF,
       state:Q17861702-EABB21D9-582F-42CC-BDDB-CF01CDC985A4 ;
   direct:P1057 ent:Q220677 ;
   direct:P2548 ent:Q22809680 ;
   direct:P279 ent:Q20747295 ;
   direct:P684 ent:Q18296779 ;
   direct:P688 ent:Q21100363 ;
   direct:P703 ent:Q15978631 .
state:Q17861702-CA3E84B9-2404-428B-9AD2-3DE29ABB3FA2 a wikba:BestRank ;
   wikba:rank wikba:NormalRank ;
   prov:wasDerivedFrom <http://www.wikidata.org/reference/548aa04930b65df7ca12bc71748040453baf6186> ;
   qual:P659 ent:Q20966585,
       ent:Q21067546 ;
   pstate:P1057 ent:Q220677 .


gene.shex

PREFIX direct: <http://www.wikidata.org/prop/direct/>
PREFIX ent: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pref: <http://www.wikidata.org/prop/reference/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pstate: <http://www.wikidata.org/prop/statement/>
PREFIX qual: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ref: <http://www.wikidata.org/reference/>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX state: <http://www.wikidata.org/entity/statement/>
PREFIX val: <http://www.wikidata.org/prop/reference/value/>
PREFIX wikba: <http://wikiba.se/ontology#>
PREFIX wikd: <http://wikiba.se/ontology#>
PREFIX xml: <http://www.w3.org/XML/1998/namespace>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
start=@<Q17861702>
<Q17861702> {
#  p:P1057 @wikba:BestRank ;
#  p:P2548 @wikba:BestRank ;
#  p:P279 @wikba:BestRank ;
#  p:P2888 @wikba:BestRank ;
#  p:P351 @wikba:BestRank ;
#  p:P353 @wikba:BestRank ;
#  p:P354 @wikba:BestRank ;
#  p:P593 @wikba:BestRank ;
#  p:P594 @wikba:BestRank ;
#  p:P639 @wikba:BestRank ;
#  p:P644 @wikba:BestRank+ ;
#  p:P645 @wikba:BestRank+ ;
#  p:P684 @wikba:BestRank ;
#  p:P688 @wikba:BestRank ;
#  p:P692 @wikba:BestRank+ ;
#  p:P703 @wikba:BestRank ;
#  p:P704 @wikba:BestRank+ ;
#  direct:P1057 @<Chromosome> ;
#  direct:P2548 @<Strand> ;
#  direct:P279 @<ProtCoding> ;
#  direct:P684 @<Gene> ;
#  direct:P688 @<Q21100363> ;
#  direct:P703 @<Species>
}
wikba:BestRank {
  a [wikba:BestRank];
  wikba:rank [wikba:NormalRank];
  prov:wasDerivedFrom @<Ref>
}
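<Ref> {
  pref:P248 IRI ;
  # ";" binds tighter than "|", so the two ID alternatives are grouped
  # explicitly. Assumed intent: either ID, always with a retrieved date.
  ( pref:P594 LITERAL
  | pref:P351 LITERAL
  ) ;
  pref:P813 xsd:dateTime ;
  val:P813 IRI
}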
<Chromosome> {
}
<Strand> {
}
<ProtCoding> {
}
<Gene> {
}
<Q21100363> {
}
<Species> {
}

Links