Wikidata:WikiProject Collection highlights National Library of the Netherlands/Flora Batava/Data quality
This page is part of the WikiProject Collection highlights National Library of the Netherlands, subproject Flora Batava. This project is part of the Wikidata efforts of the Koninklijke Bibliotheek Nederland.
Data models, quality & completeness
Data models
- Datamodel for a volume (based on Volume 01)
- Datamodel for a plate (based on Plate 0041)
- See also the graph visualisation of the semantic relations between the entities linked to Plate 0098 (inspired by the Wikidata Knowledge Grapher tool)
#defaultView:Graph
#TEMPLATE={ "template": { "en": "All statements of ?item containing another item" }, "variables": { "?item": {} } }
SELECT ?item ?itemLabel ?itemImage ?value ?valueLabel ?valueImage ?edgeLabel WHERE {
  BIND(wd:Q118398291 AS ?item)
  ?item ?wdt ?value.
  ?edge a wikibase:Property;
        wikibase:propertyType wikibase:WikibaseItem; # note: to show all statements, removing this is not enough, the graph view only shows entities
        wikibase:directClaim ?wdt.
  OPTIONAL { ?item wdt:P18 ?itemImage. }
  OPTIONAL { ?value wdt:P18 ?valueImage. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Property completeness dashboards
Powered by InteGraality
SPARQL queries to monitor data quality and completeness
All queries in [1]
Starting from Wikidata
- 1
- 2
Starting from Wikimedia Commons
TO WORK ON: Presence of compulsory structured data fields in Commons files
For the 4 groups of Commons files that can be distinguished, the following structured data fields & values are compulsory:
TO WORK ON: For the botanical plates
The following structured data fields & values are compulsory (example: File:Veronica chamaedrys - Pl0001 - FloraBatava-KB-v01.jpg):
- depicts (P180) = Qid of corresponding plate on Wikidata (for instance Plate 0001, Flora Batava (KB), volume 1 (Q118315786))
- digital representation of (P6243) = Qid of corresponding plate on Wikidata (for instance Plate 0001, Flora Batava (KB), volume 1 (Q118315786))
- main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
- collection (P195) = Koninklijke Bibliotheek (Q1526131)
- instance of (P31) = digital image (Q1250322)
- copyright status (P6216) = public domain (Q19652)
- source of file (P7482) = file available on the internet (Q74228490) with qualifiers
  - operator (P137) = kb.nl (Q93997197)
  - described at URL (P973) = URL of image on kb.nl (for instance https://galerij.kb.nl/kb.html#/nl/florabatava01/page/16/zoom/3/lat/-67.06743335108297/lng/-51.328125)
# For all botanical plates on Commons (in Category:Flora Batava - KB copy, and its subcategories), are all required structured data fields present?
# Example File:Veronica chamaedrys - Pl0001 - FloraBatava-KB-v01.jpg:
# - depicts (P180) = Qid of corresponding plate on Wikidata (for instance Plate 0001, Flora Batava (KB), volume 1 (Q118315786))
# - digital representation of (P6243) = Qid of corresponding plate on Wikidata (for instance Plate 0001, Flora Batava (KB), volume 1 (Q118315786))
# - main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
# - collection (P195) = Koninklijke Bibliotheek (Q1526131)
# - instance of (P31) = digital image (Q1250322)
# - copyright status (P6216) = public domain (Q19652)
# - source of file (P7482) = file available on the internet (Q74228490) with qualifiers
# - operator (P137) = kb.nl (Q93997197)
# - described at URL (P973) = URL of image on kb.nl (for instance https://galerij.kb.nl/kb.html#/nl/florabatava01/page/16/zoom/3/lat/-67.06743335108297/lng/-51.328125)
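No query is given yet for the botanical plates. The following is an untested sketch, modelled on the text-page queries on this page and intended for the Wikimedia Commons Query Service. The variable names ?depicts and ?digitalrepresentationof are introduced here for illustration, the title pattern for plates is the one used to exclude plates in the non-plate query, and the gcmlimit values are copied from the text-page queries; all of these are assumptions rather than project conventions.

```sparql
# Sketch (untested): required structured data fields for the botanical plates
SELECT DISTINCT
  ?plate ?file ?title
  ?depicts ?digitalrepresentationof ?mainsubject
  ?collection ?instanceof ?copyrightstatus
  ?source ?operator ?describedatURL
WITH
{
  SELECT ?file ?title
  WHERE
  {
    # Members of the root category
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle "Category:Flora Batava - KB copy" .
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmlimit "100" .
      ?categoryName wikibase:apiOutput mwapi:title .
      ?ns wikibase:apiOutput "@ns".
    }
    FILTER (?ns = "14") # (sub)categories only
    # Files in each subcategory
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle ?categoryName.
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmtype "file" .
      bd:serviceParam mwapi:gcmlimit "3000" .
      ?title wikibase:apiOutput mwapi:title .
      ?pageid wikibase:apiOutput "@pageid" .
    }
    BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
  }
}
AS %get_files
WHERE
{
  INCLUDE %get_files
  # Keep plate files only, e.g. "... - Pl0001 - FloraBatava-KB-v01.jpg"
  FILTER(REGEX(?title,"(Pl[0-9]{4} - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  BIND(STRBEFORE(STRAFTER(?title," - Pl")," - FloraBatava") AS ?plate)
  OPTIONAL{ ?file wdt:P180 ?depicts. }                  # depicts (P180) = plate item on Wikidata
  OPTIONAL{ ?file wdt:P6243 ?digitalrepresentationof. } # digital representation of (P6243) = plate item on Wikidata
  OPTIONAL{ ?file wdt:P921 ?mainsubject. }              # main subject (P921) = depicted plant species
  OPTIONAL{ ?file wdt:P195 ?collection. }               # collection (P195) = Koninklijke Bibliotheek (Q1526131)
  OPTIONAL{ ?file wdt:P31 ?instanceof.
            FILTER (?instanceof = wd:Q1250322) }        # instance of (P31) = digital image (Q1250322)
  OPTIONAL{ ?file wdt:P6216 ?copyrightstatus. }         # copyright status (P6216) = public domain (Q19652)
  # source of file (P7482) with qualifiers operator (P137) and described at URL (P973)
  OPTIONAL{ ?file p:P7482 [ps:P7482 ?source; pq:P137 ?operator; pq:P973 ?describedatURL]. }
} ORDER BY ?plate
```

An empty cell in the result means the corresponding field is missing on that file. The sketch does not yet check that ?depicts and ?digitalrepresentationof point to the same plate item.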
For the text pages in Dutch
The following structured data fields & values are compulsory (example: File:Veronica chamaedrys - Pl0001 - DescriptionNL01 - FloraBatava-KB-v01.jpg):
- main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
- collection (P195) = Koninklijke Bibliotheek (Q1526131)
- instance of (P31) = digital image (Q1250322)
- copyright status (P6216) = public domain (Q19652)
- source of file (P7482) = file available on the internet (Q74228490) with qualifiers
  - operator (P137) = kb.nl (Q93997197)
  - described at URL (P973) = URL of image on kb.nl (for instance https://galerij.kb.nl/kb.html#/nl/florabatava01/page/16/zoom/3/lat/-67.06743335108297/lng/-51.328125)
# For all text pages in Dutch (in Category:Flora Batava - KB copy, and its subcategories), are all required structured data fields present?
# Example File:Veronica chamaedrys - Pl0001 - DescriptionNL01 - FloraBatava-KB-v01.jpg:
# - main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
# - collection (P195) = Koninklijke Bibliotheek (Q1526131)
# - instance of (P31) = digital image (Q1250322)
# - copyright status (P6216) = public domain (Q19652)
# - source of file (P7482) = file available on the internet (Q74228490) with qualifiers
# - operator (P137) = kb.nl (Q93997197)
# - described at URL (P973) = URL of image on kb.nl (for instance https://galerij.kb.nl/kb.html#/nl/florabatava01/page/16/zoom/3/lat/-67.06743335108297/lng/-51.328125)
SELECT DISTINCT
  ?plate ?file ?title
  ?mainsubject ?mainsubjectLabel ?mainsubjectIsA ?collection ?instanceof ?copyrightstatus
  ?source ?operator ?describedatURL
WITH
{
  SELECT ?file ?title
  WHERE
  {
    # Members of the root category
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle "Category:Flora Batava - KB copy" .
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmlimit "50" .
      ?categoryName wikibase:apiOutput mwapi:title .
      ?ns wikibase:apiOutput "@ns".
    }
    FILTER (?ns = "14") # (sub)categories only
    # Files in each subcategory
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle ?categoryName.
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmtype "file" .
      bd:serviceParam mwapi:gcmlimit "1000" .
      ?title wikibase:apiOutput mwapi:title .
      ?pageid wikibase:apiOutput "@pageid" .
    }
    BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
  }
}
AS %get_files
WHERE
{
  INCLUDE %get_files
  BIND(STRBEFORE(STRAFTER(?title," - Pl")," - DescriptionNL") AS ?plate)
  # Keep Dutch text pages only; the dot before "jpg" is escaped so it matches a literal dot
  FILTER(REGEX(?title,"(DescriptionNL[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  OPTIONAL{ ?file wdt:P921 ?mainsubject. # main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
    SERVICE <https://query.wikidata.org/sparql> {
      ?mainsubject rdfs:label ?mainsubjectLabel FILTER (lang(?mainsubjectLabel) = "la").
      ?mainsubject wdt:P31 ?mainsubjectIsA.
      FILTER (?mainsubjectIsA = wd:Q16521) # ?mainsubject must be instance of taxon/Q16521
    }
  }
  OPTIONAL{ ?file wdt:P31 ?instanceof.
    FILTER (?instanceof = wd:Q1250322).
  } # ?instanceof must be a digital image (Q1250322)
  OPTIONAL{ ?file wdt:P6216 ?copyrightstatus.
  } # copyright status (P6216) = public domain (Q19652)
  OPTIONAL{ ?file wdt:P195 ?collection.
  } # collection (P195) = Koninklijke Bibliotheek (Q1526131)
  # source of file (P7482) with qualifiers operator (P137) and described at URL (P973)
  OPTIONAL{ ?file p:P7482 [ps:P7482 ?source; pq:P137 ?operator; pq:P973 ?describedatURL].}
} ORDER BY ?plate
For the text pages in French
The following structured data fields & values are compulsory (example: File:Veronica chamaedrys - Pl0001 - DescriptionFR01 - FloraBatava-KB-v01.jpg):
- main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
- collection (P195) = Koninklijke Bibliotheek (Q1526131)
- instance of (P31) = digital image (Q1250322)
- copyright status (P6216) = public domain (Q19652)
# For all text pages in French (in Category:Flora Batava - KB copy, and its subcategories), are all required structured data fields present?
# Example File:Veronica chamaedrys - Pl0001 - DescriptionFR01 - FloraBatava-KB-v01.jpg:
# - main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
# - collection (P195) = Koninklijke Bibliotheek (Q1526131)
# - instance of (P31) = digital image (Q1250322)
# - copyright status (P6216) = public domain (Q19652)
SELECT DISTINCT
  ?plate ?file ?title ?mainsubject ?mainsubjectLabel ?mainsubjectIsA ?collection ?instanceof ?copyrightstatus
WITH
{
  SELECT ?file ?title
  WHERE
  {
    # Members of the root category
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle "Category:Flora Batava - KB copy" .
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmlimit "100" .
      ?categoryName wikibase:apiOutput mwapi:title .
      ?ns wikibase:apiOutput "@ns".
    }
    FILTER (?ns = "14") # (sub)categories only
    # Files in each subcategory
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle ?categoryName.
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmtype "file" .
      bd:serviceParam mwapi:gcmlimit "3000" .
      ?title wikibase:apiOutput mwapi:title .
      ?pageid wikibase:apiOutput "@pageid" .
    }
    BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
  }
}
AS %get_files
WHERE
{
  INCLUDE %get_files
  BIND(STRBEFORE(STRAFTER(?title," - Pl")," - DescriptionFR") AS ?plate)
  # Keep French text pages only; the dot before "jpg" is escaped so it matches a literal dot
  FILTER(REGEX(?title,"(DescriptionFR[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  OPTIONAL{ ?file wdt:P921 ?mainsubject. # main subject (P921) = Qid of depicted plant species (for instance Veronica chamaedrys (Q157343))
    SERVICE <https://query.wikidata.org/sparql> {
      ?mainsubject rdfs:label ?mainsubjectLabel FILTER (lang(?mainsubjectLabel) = "la").
      ?mainsubject wdt:P31 ?mainsubjectIsA.
      FILTER (?mainsubjectIsA = wd:Q16521) # ?mainsubject must be instance of taxon/Q16521
    }
  }
  OPTIONAL{ ?file wdt:P31 ?instanceof.
    FILTER (?instanceof = wd:Q1250322).
  } # ?instanceof must be a digital image (Q1250322)
  OPTIONAL{ ?file wdt:P6216 ?copyrightstatus.
  } # copyright status (P6216) = public domain (Q19652)
  OPTIONAL{ ?file wdt:P195 ?collection.
  } # collection (P195) = Koninklijke Bibliotheek (Q1526131)
} ORDER BY ?plate
For the non-plate, non-text pages
The following structured data fields & values are compulsory (example: File:Preface NL 01 - FloraBatava-KB-v01.jpg):
- collection (P195) = Koninklijke Bibliotheek (Q1526131)
- instance of (P31) = digital image (Q1250322)
- copyright status (P6216) = public domain (Q19652)
# For all non-plate, non-text pages (in Category:Flora Batava - KB copy, and its subcategories), are all required structured data fields present?
# Example File:Preface NL 01 - FloraBatava-KB-v01.jpg
# - collection (P195) = Koninklijke Bibliotheek (Q1526131)
# - instance of (P31) = digital image (Q1250322)
# - copyright status (P6216) = public domain (Q19652)
SELECT DISTINCT
  ?file ?title ?collection ?instanceof ?copyrightstatus
WITH
{
  SELECT ?file ?title
  WHERE
  {
    # Members of the root category
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle "Category:Flora Batava - KB copy" .
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmlimit "100" .
      ?categoryName wikibase:apiOutput mwapi:title .
      ?ns wikibase:apiOutput "@ns".
    }
    FILTER (?ns = "14") # (sub)categories only
    # Files in each subcategory
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle ?categoryName.
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmtype "file" .
      bd:serviceParam mwapi:gcmlimit "3000" .
      ?title wikibase:apiOutput mwapi:title .
      ?pageid wikibase:apiOutput "@pageid" .
    }
    BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
  }
}
AS %get_files
WHERE
{
  INCLUDE %get_files
  # Exclude plates and Dutch/French text pages; the dot before "jpg" is escaped so it matches a literal dot
  FILTER(!REGEX(?title,"(Pl[0-9]{4} - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  FILTER(!REGEX(?title,"(DescriptionFR[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  FILTER(!REGEX(?title,"(DescriptionNL[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  OPTIONAL{ ?file wdt:P195 ?collection.} # collection (P195) = Koninklijke Bibliotheek (Q1526131)
  OPTIONAL{ ?file wdt:P6216 ?copyrightstatus.} # copyright status (P6216) = public domain (Q19652)
  OPTIONAL{ ?file wdt:P31 ?instanceof.
    FILTER (?instanceof = wd:Q1250322)} # ?instanceof must be a digital image (Q1250322)
} ORDER BY ?title
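The query above lists every non-plate, non-text file, complete or not. As a possible variant, the untested sketch below keeps only the files where at least one compulsory field is missing or has a value other than the required one. It reuses the same file-collection subquery; the extra value filters and the final FILTER on unbound variables are additions made here for illustration, not part of the project's own queries.

```sparql
# Sketch (untested): non-plate, non-text pages with at least one missing or wrong compulsory field
SELECT DISTINCT
  ?file ?title ?collection ?instanceof ?copyrightstatus
WITH
{
  SELECT ?file ?title
  WHERE
  {
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle "Category:Flora Batava - KB copy" .
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmlimit "100" .
      ?categoryName wikibase:apiOutput mwapi:title .
      ?ns wikibase:apiOutput "@ns".
    }
    FILTER (?ns = "14") # (sub)categories only
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
      bd:serviceParam mwapi:gcmtitle ?categoryName.
      bd:serviceParam mwapi:generator "categorymembers" .
      bd:serviceParam mwapi:gcmtype "file" .
      bd:serviceParam mwapi:gcmlimit "3000" .
      ?title wikibase:apiOutput mwapi:title .
      ?pageid wikibase:apiOutput "@pageid" .
    }
    BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
  }
}
AS %get_files
WHERE
{
  INCLUDE %get_files
  FILTER(!REGEX(?title,"(Pl[0-9]{4} - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  FILTER(!REGEX(?title,"(DescriptionFR[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  FILTER(!REGEX(?title,"(DescriptionNL[0-9]+ - FloraBatava-KB-v[0-9]{2}\\.jpg)$"))
  # Each variable is bound only when the field holds the required value
  OPTIONAL{ ?file wdt:P195 ?collection.
            FILTER (?collection = wd:Q1526131) }      # Koninklijke Bibliotheek
  OPTIONAL{ ?file wdt:P31 ?instanceof.
            FILTER (?instanceof = wd:Q1250322) }      # digital image
  OPTIONAL{ ?file wdt:P6216 ?copyrightstatus.
            FILTER (?copyrightstatus = wd:Q19652) }   # public domain
  # Keep a file only if at least one required value is absent
  FILTER(!BOUND(?collection) || !BOUND(?instanceof) || !BOUND(?copyrightstatus))
} ORDER BY ?title
```

An empty result would indicate that all files in this group carry the three required values. The same final FILTER pattern could be applied to the plate and text-page queries by listing their compulsory variables.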