User:Epìdosis/NLG

Lessons about Wikidata at the National Library of Greece (Q1467610) (24-26 May 2023).

Scheduled program
  • introduction to Wikidata
  • creating and editing items
  • Mix'n'match
  • managing bibliographic items in Wikidata
  • entity management in Wikidata, use of information sources (VIAF, ISNI, etc.)
  • basics of SPARQL queries; finding duplicate IDs in the NLG authority file through Wikidata
  • comparing NLG and Wikidata in order to find missing or wrong data on one side
  • doing semi-automatic imports from NLG to Wikidata or from Wikidata to NLG

Introduction to Wikidata

What's Wikidata

  • a knowledge base (Q593744) (~ a database (Q8513)): it stores structured data
  • a Wikimedia Foundation project (Q14827288): it exchanges data with other Wikimedia projects (including Wikipedia and Commons); users use one Wikimedia account on all the Wikimedia projects
  • a wiki (Q171): it can be edited without specific computer skills
    • a Wikibase (Q16354758) website: it uses a specific data model (see below), shared by other websites using Wikibase (the "Wikibase ecosystem"); its data can be queried through a SPARQL endpoint (see below)
  • a community of users: it is a collaborative project, where people speaking different languages and interested in different fields of study interact in order to improve the data

(statistics: Wikidata WikiScan; Wikidata Navel Gazer)

Data model

  • data in Wikidata are structured in triples: subject - predicate - object (e.g. Rome (Q220) - country (P17) - Italy (Q38))
    • the subject of a statement is an item (items have IDs starting with Q)
    • the predicate of a statement is a property (properties have IDs starting with P)
    • the object of a statement can be an item, or a string, or a date, or a URL, or coordinates etc.
  • an item is divided into four parts (e.g. Domenico Comparetti (Q1158982), classical philologist (Q16267607))
    • labels, descriptions, aliases (item - is labelled in language X as - NAME string)
    • statements (item - has as value of property Pn - VALUE)
    • identifiers (item - has as value of property Pn - IDENTIFIER string)
    • sitelinks (item - has as sitelink in the Wikimedia project X - ARTICLE TITLE string)
  • a property is divided into four parts (e.g. member of the deme (P2462), SBN author ID (P396))
    • labels, descriptions, aliases (property - is labelled in language X as - NAME string)
    • data type (property - has as data type - DATA TYPE)
    • statements (property - has as value of property Pn - VALUE)
    • constraints (property - has as value of the property "property constraint" (P2302) - VALUE)
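
The triple structure described above can be sketched in a few lines of Python. This is only an illustration of the subject - predicate - object shape, not the actual Wikibase implementation: the `Statement` class and the two regular expressions are hypothetical names introduced here, and the model is simplified (it ignores qualifiers, references and ranks, and assumes, as in the list above, that the subject is an item).

```python
import re
from dataclasses import dataclass

# Hypothetical helpers for illustration only; not part of any Wikidata library.
ITEM_ID = re.compile(r"Q[1-9][0-9]*")      # items have IDs starting with Q
PROPERTY_ID = re.compile(r"P[1-9][0-9]*")  # properties have IDs starting with P


@dataclass(frozen=True)
class Statement:
    """One triple: subject (item) - predicate (property) - object (value)."""

    subject: str    # item ID, e.g. "Q220"
    predicate: str  # property ID, e.g. "P17"
    obj: str        # item ID, or a literal such as a string, date or URL

    def __post_init__(self) -> None:
        # Reject triples whose subject or predicate has the wrong kind of ID.
        if not ITEM_ID.fullmatch(self.subject):
            raise ValueError(f"subject must be an item ID (Q...): {self.subject!r}")
        if not PROPERTY_ID.fullmatch(self.predicate):
            raise ValueError(f"predicate must be a property ID (P...): {self.predicate!r}")

    def object_is_item(self) -> bool:
        """True when the object is itself an item rather than a literal."""
        return ITEM_ID.fullmatch(self.obj) is not None


# The example from the list above: Rome (Q220) - country (P17) - Italy (Q38)
rome_country = Statement("Q220", "P17", "Q38")
```

Here `rome_country.object_is_item()` is true because the object is another item; a statement whose value is a plain string (an identifier, for instance) would return false.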

(interactive list of properties: Wikidata Propbrowse; properties regarding Greece in {{Greece properties}})

Editing content

(a visualization of Wikidata edits in real time: Listen to Wikidata)

// User scripts to load from your personal common.js page:
importScript( 'User:Magnus Manske/mixnmatch gadget.js' );
importScript( 'User:Bargioni/UseAsRef.js' );
importScript( 'User:Bargioni/viaf.js' );
importScript( 'User:Epìdosis/moreIdentifiers settings.js' );
importScript( 'User:Bargioni/moreIdentifiers.js' );

Reasons for using Wikidata

  • central node in the semantic web: Wikidata links to more than 8k databases, plus other websites, and many more websites link to Wikidata; interconnected with the growing Wikibase ecosystem (see map 1, map 2)
  • cross-domain: it collects and connects data from sources of different fields
  • multilingual: it allows multiple labels, descriptions, aliases divided by language, so that data are also easily translatable (a recent example of data reuse at PLWABN: https://dbn.bn.org.pl/descriptor-details/9810701965205606)
  • collaborative: an active community of users collaborates for the quantitative and qualitative improvement of the data
  • flexible: the ontology can be enriched and restructured in order to adapt to different sources; new properties can be added
  • reusable: structured data in Wikidata are released under the CC0 license

Mix'n'match

Bibliographic items

Entity management

SPARQL queries
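
As a starting point for the program item on finding duplicate IDs, the sketch below builds a SPARQL query that lists items holding more than one value of a given identifier property. It is a minimal example under stated assumptions: the function name `duplicate_id_query` is hypothetical, P3348 is assumed to be the NLG ID property (check it before use), and the `wdt:` prefix is assumed to be predefined, as it is on the Wikidata Query Service. Two NLG IDs on one item may point to a duplicate in the authority file, but may also come from a wrong match in Wikidata, so each result needs a manual check.

```python
import re


def duplicate_id_query(property_id: str = "P3348", limit: int = 100) -> str:
    """Build a SPARQL query for items with two or more values of property_id.

    P3348 is assumed to be the NLG ID property; pass another ID if needed.
    """
    if not re.fullmatch(r"P[1-9][0-9]*", property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?item (COUNT(?id) AS ?ids) WHERE {\n"
        f"  ?item wdt:{property_id} ?id .\n"
        "}\n"
        "GROUP BY ?item\n"
        "HAVING (COUNT(?id) > 1)\n"
        "ORDER BY DESC(?ids)\n"
        f"LIMIT {limit}"
    )


# Print the query text so it can be pasted into a SPARQL endpoint.
print(duplicate_id_query())
```

The function only produces the query text; running it against an endpoint is left out on purpose, so the example has no network dependency.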

Further reading

Conference 22-23/05/2023 (all):

Koha Community Newsletter (June 2023): https://koha-community.org/koha-community-newsletter-june-2023/#kohaandrda

Users

Tutors
NLG

Statistics

Data from SPARQL queries (2023-2024):

Parameter 27/05 29/05 31/05 03/06 09/06 18/06 26/06 08/07 18/08 19/08 25/08 23/09 25/10 28/02/2024
NLG IDs (query) 40181 40590 41676 41931 42892 43477 43771 44253 45107 46042 46888 46997 47533 48135
Items with NLG IDs (query) 39589 39987 40981 41214 42098 42653 42929 43378 44187 45081 45898 46003 46534 47110
Items with NLG IDs and VIAF IDs (query) 38089 38517 39410 39694 40591 41134 41401 41879 42663 43510 44302 44406 44904 46646
Items with NLG IDs and ISNI IDs (query) 36027 36415 37243 37505 38346 38865 39126 39576 40317 41128 41903 42009 42435 42992
Items with NLG IDs and referenced birth date (query) 34699 35044 35848 36079 36859 37341 37583 37962 38641 39400 40137 40229 40633 41078
Items with NLG IDs and referenced occupation (query) 27045 27300 27936 28303 28957 29409 29695 30358 30955 31527 32264 32439 32769 33274
Items with NLG IDs and SBN IDs (query) 9549 9643 9861 9940 10130 10236 10339 10506 10722 10845 10978 11054 11343 13752
Items with at least one ID of a Greek or Cypriot authority file (query) 57344 57725 58611 59102 59939 60502 60804 61260 84370 85116 85817 86173 86794 87836
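
The figures in the table are easier to compare as percentages. The sketch below takes the first and last columns of the table and computes, for each date, the share of items with NLG IDs that also have a VIAF or ISNI ID, plus the difference between the number of NLG IDs and the number of items carrying them. The names `SNAPSHOTS`, `coverage` and `report` are hypothetical, and reading that difference as "extra IDs on items that already have one" is an assumption about how the underlying queries count.

```python
# Figures copied from the first and last columns of the table above.
SNAPSHOTS = {
    "27/05/2023": {"ids": 40181, "items": 39589, "viaf": 38089, "isni": 36027},
    "28/02/2024": {"ids": 48135, "items": 47110, "viaf": 46646, "isni": 42992},
}


def coverage(part: int, total: int) -> float:
    """Percentage of `total` items that are also counted in `part`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def report(snapshots: dict) -> list:
    """Return one summary row per snapshot date, in insertion order."""
    rows = []
    for date, counts in snapshots.items():
        rows.append({
            "date": date,
            "viaf_pct": round(coverage(counts["viaf"], counts["items"]), 1),
            "isni_pct": round(coverage(counts["isni"], counts["items"]), 1),
            # Assumed to approximate the number of additional NLG IDs
            # sitting on items that already have one.
            "extra_ids": counts["ids"] - counts["items"],
        })
    return rows


for row in report(SNAPSHOTS):
    print(row)
```

The other rows of the table (referenced birth date, referenced occupation, SBN IDs) can be added to `SNAPSHOTS` in the same way.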

Data from Wikidata Navel Gazer (2023-2024):

Parameter 01/05 01/06 01/07 01/08 01/09 01/10 01/11 01/12 01/01/2024 01/02 01/03 01/04
NLG IDs 40241 42027 44086 44632 47015 47091 47770 47802 47955 48087 48162 48260
NLG IDs added in the previous month 91 1785 2034 572 2382 76 505 173 101 119 71 84

Things to be checked:

Duplications to be solved elsewhere:

Items to be improved: