Wikidata:WikidataCon 2017/Submissions/ClassRank: discovering the relevance of each class in Wikidata

 This is an Open submission for WikidataCon 2017 that has not yet been reviewed by the members of the Program Committee.

Submission no. 7
Title of the submission
ClassRank: discovering the relevance of each class in Wikidata.

Author(s) of the submission
Daniel Fernández-Álvarez.
E-mail address
danifdezalvarez@gmail.com
Country of origin
Spain.
Affiliation, if any (organisation, company etc.)
WESO Research group, University of Oviedo.

Type of session
Something in between a "talk" and a "discussion".
Length of session
The content can be explained and discussed in 30 minutes. A longer talk would risk drifting into details that are too technical and less interesting.
Ideal number of attendees
Everyone interested in the topic.
EtherPad for documentation
https://etherpad.wikimedia.org/p/WikidataCon-7

Abstract

The knowledge contained in Wikidata is provided by a wide and heterogeneous community, which expands the graph in ways that are hard to predict. How can we maintain summaries of this huge amount of information? Which topics are the most linked? And what kinds of SPARQL queries allow us to access that content?

We think that the notion of "class" can be a key element in answering those questions, and we have developed the ClassRank algorithm for that purpose. ClassRank borrows ideas from PageRank-like algorithms and adapts them to the domain of classes. Our approach detects the most relevant classes in an RDF graph according to the centrality of their instances. This can be helpful for several reasons:

  • A class is an abstract concept that can be seen as a topic grouping a set of individuals (instances). A ranking of class relevance can therefore be used as a ranking of topic relevance, which makes it possible to summarize the content of the graph.
  • All the instances of a class share a common nature and are expected to fit a certain basic set of properties (schema). Using these shared properties makes it possible to design SPARQL queries that involve all those individuals at once.
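As a rough illustration of the idea of ranking classes by the centrality of their instances, here is a minimal sketch in Python. It is not the ClassRank prototype: it assumes plain PageRank as the centrality measure and a simple sum as the aggregation step, and the entities, links, and function names (`pagerank`, `class_rank`, `EDGES`, `INSTANCE_OF`) are all hypothetical toy data chosen for the example.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    score = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is spread evenly.
        dangling = sum(score[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_score = {node: base for node in nodes}
        for source, targets in out_links.items():
            if not targets:
                continue
            share = damping * score[source] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


def class_rank(edges, instance_of):
    """Assumed aggregation: a class scores the sum of its instances' PageRank."""
    scores = pagerank(edges)
    totals = defaultdict(float)
    for instance, classes in instance_of.items():
        for cls in classes:
            totals[cls] += scores.get(instance, 0.0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy graph: links between entities (property labels omitted).
EDGES = [
    ("cervantes", "spain"),
    ("cervantes", "madrid"),
    ("picasso", "spain"),
    ("madrid", "spain"),
    ("oviedo", "spain"),
    ("spain", "madrid"),
]

# Hypothetical "instance of" statements, kept apart from the link graph.
INSTANCE_OF = {
    "cervantes": ["human"],
    "picasso": ["human"],
    "madrid": ["city"],
    "oviedo": ["city"],
    "spain": ["country"],
}

ranking = class_rank(EDGES, INSTANCE_OF)
for cls, value in ranking:
    print(f"{cls}: {value:.4f}")
```

In this toy graph the two humans receive no incoming links, so the class "human" ends up at the bottom of the ranking even though it has as many instances as "city". The point of the sketch is that class relevance follows how central the instances are, not how many of them exist.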

During this talk, I would like to present and discuss the ideas behind our approach. We have implemented a prototype of ClassRank and used it to measure class relevance in Wikidata. I would also like to present and discuss the results obtained.


What will attendees take away from this session?
  1. An introduction to the ClassRank algorithm.
  2. An overview of the class relevance distribution in Wikidata.
Slides or further information
Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest.

  1. Would be great if we could play around with your implementation ahead of the talk. --Daniel Mietchen (talk) 07:09, 31 July 2017 (UTC)
  2. ArthurPSmith (talk) 15:19, 31 July 2017 (UTC)
  3. ...