Wikidata:Wikidata curricula/Activities/Explore some SPARQL queries to check data quality

A good collection of such queries sits at Wikidata:SPARQL_query_service/queries/examples#Queries_for_maintenance.

Some more such queries are listed below.

Papers with both a P2093 (author name string) and a P50 (author) statement for the same P1545 (series ordinal)

The following query uses these:

  • Properties: author (P50), author name string (P2093), series ordinal (P1545)
    # Papers with both a P2093 (author name string) and a P50 (author) statement for the same P1545 (series ordinal)
    SELECT ?q ?series_id ?author_q ?author_name WHERE {
      # author (P50) statement and its series ordinal qualifier
      ?q p:P50 ?author_statement .
      ?author_statement ps:P50 ?author_q ;
                        pq:P1545 ?series_id .
      # author name string (P2093) statement carrying the same series ordinal
      ?q p:P2093 ?author_name_statement .
      ?author_name_statement ps:P2093 ?author_name ;
                             pq:P1545 ?series_id .
    }
    LIMIT 50
    

Papers with titles (P1476) that are all UPPER CASE

The following query uses these:

  • Properties: instance of (P31), title (P1476)
    #Papers with titles that are all UPPER CASE
    SELECT ?item ?title WHERE {
      ?item wdt:P31 wd:Q13442814; #scholarly article
            wdt:P1476 ?title. #title
      FILTER(REGEX(STR(?title), "^[\\p{Lu}\\p{M}\\p{N}\\p{P}\\p{Z}]+$")) # only uppercase letters, marks, digits, punctuation, and separators
    }
    LIMIT 1000
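
The pattern above also matches titles that contain no letters at all, such as one made up only of digits and punctuation, because it only restricts which characters may appear. A possible refinement, sketched here on the assumption that such titles are unwanted noise, adds a second filter requiring at least one uppercase letter. The extra FILTER is an addition for illustration, not part of the original query, and it may make the query slower.

```sparql
# Papers with all-uppercase titles that contain at least one letter (illustrative sketch)
SELECT ?item ?title WHERE {
  ?item wdt:P31 wd:Q13442814;   # scholarly article
        wdt:P1476 ?title.       # title
  # same character whitelist as the query above
  FILTER(REGEX(STR(?title), "^[\\p{Lu}\\p{M}\\p{N}\\p{P}\\p{Z}]+$"))
  # assumption: additionally require an uppercase letter somewhere in the title
  FILTER(REGEX(STR(?title), "\\p{Lu}"))
}
LIMIT 1000
```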
    

People with a P496 (ORCID iD) statement but no P21 (sex or gender) statement

The following query uses these:

  • Properties: ORCID iD (P496), instance of (P31), author (P50), sex or gender (P21)
    # People with a P496 (ORCID iD) statement but no P21 (sex or gender) statement who are the author (P50) of at least one item
    SELECT DISTINCT ?author ?authorLabel #(COUNT(?paper) AS ?count)
    WHERE
    {
        ?author wdt:P496 ?orcid ; 
                wdt:P31 wd:Q5 .
        ?paper wdt:P50 ?author .
        MINUS { ?author wdt:P21 ?gender . }
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
    }
    #GROUP BY ?author ?authorLabel
    #ORDER BY DESC(?count)
    LIMIT 200
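
The commented-out lines in the query above hint at an aggregate variant that ranks people by how many items list them as author, which can help prioritise the most prolific authors for cleanup. A minimal sketch with those lines enabled follows. Using COUNT(DISTINCT ?paper) rather than a plain COUNT is an assumption made here to avoid double counting when a person has more than one ORCID iD value. Because the results must be grouped and sorted before LIMIT applies, this version may time out on the public query service.

```sparql
# People with a P496 (ORCID iD) statement but no P21 (sex or gender) statement,
# ranked by the number of items that list them as author (illustrative sketch)
SELECT ?author ?authorLabel (COUNT(DISTINCT ?paper) AS ?count)
WHERE
{
    ?author wdt:P496 ?orcid ;
            wdt:P31 wd:Q5 .
    ?paper wdt:P50 ?author .
    MINUS { ?author wdt:P21 ?gender . }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?author ?authorLabel
ORDER BY DESC(?count)
LIMIT 200
```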
    

People with a P496 (ORCID iD) statement but no P106 (occupation) statement

The following query uses these:

  • Properties: ORCID iD (P496), instance of (P31), author (P50), occupation (P106)
    # People with a P496 (ORCID iD) statement but no P106 (occupation) statement who are the author (P50) of at least one item
    SELECT DISTINCT ?author #?authorLabel (COUNT(?paper) AS ?count)
    WHERE
    {
        ?author wdt:P496 ?orcid ; 
                wdt:P31 wd:Q5 .
        ?paper wdt:P50 ?author .
        MINUS { ?author wdt:P106 ?occupation . }
    #	SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
    }
    #GROUP BY ?author ?authorLabel
    #ORDER BY DESC(?count)
    LIMIT 200
    

See also