Wikidata:WikiProject Reasoning/Use cases

This page is for gathering examples of rules of inference that we might want to express in Wikidata. Not all of these use cases might be desirable in the end. This page is not for discussing if a particular rule would be good in a particular case, but for finding out which kinds of rules are needed in general.

This page is part of the Wikidata:WikiProject Reasoning.

Format of this page

Use cases should each describe one kind of rule, specific requirements, and examples where it would apply.
Each use case has one section where it can also be discussed.
A rule on this page is an informal statement of the form "if premise then conclusion".
Rules can refer to many objects (items, properties, values, statements). Some of these objects are variables (they can take different values for each rule application); some of these objects are constants (they are specific objects defined in the rule). The variables are written with a leading "?" to clarify that they are not fixed.

Use cases

Symmetry of individual properties

Many properties are declared symmetric by the Template:Constraint:Symmetric. For a particular property Pxxx, this corresponds to a rule like this:

If item ?A has a statement ?S1 with property Pxxx and value ?B,

then item ?B has a statement ?S2 with property Pxxx and value ?A.

Example:

Douglas Adams (Q42) has spouse (P26) with value Jane Belson (Q14623681), therefore Jane Belson (Q14623681) has spouse (P26) with value Douglas Adams (Q42).
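The rule for a single property can be sketched in a few lines. This is only an illustration, assuming statements are reduced to (subject, property, value) triples, which is a simplification of Wikidata's data model; the function name `apply_symmetry` is made up for this sketch.

```python
# Statements are modelled as (subject, property, value) triples.
# This is an assumption made for the sketch, not Wikidata's real data model.
def apply_symmetry(statements, prop):
    """Return the statements plus the inverse of every statement using `prop`."""
    inferred = {(value, p, subject)
                for (subject, p, value) in statements
                if p == prop}
    return set(statements) | inferred


# Douglas Adams (Q42) has spouse (P26) Jane Belson (Q14623681)
statements = {("Q42", "P26", "Q14623681")}
closed = apply_symmetry(statements, "P26")
```

Applying the function a second time adds nothing new, since the inverse of an inferred statement is the original statement.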

Sometimes, qualifiers of the statement ?S1 should be copied to the statement ?S2. For example, this is typically the case for start time (P580) and end time (P582). There should be a way to say which qualifiers are to be copied.

Examples:

Douglas Adams (Q42) has spouse (P26) with value Jane Belson (Q14623681) and qualifiers start time (P580) with value 25 November 1991 and end time (P582) with value 11 May 2001. Therefore Jane Belson (Q14623681) has spouse (P26) with value Douglas Adams (Q42) and qualifiers start time (P580) with value 25 November 1991 and end time (P582) with value 11 May 2001.
United States of America (Q30) has diplomatic relation (P530) with value Afghanistan (Q889) and qualifiers statement is subject of (P805) with value Afghanistan–United States relations (Q4689214) and diplomatic mission sent (P531) with value Embassy of the United States, Kabul (Q894835). Therefore, Afghanistan (Q889) has diplomatic relation (P530) with value United States of America (Q30) and qualifier statement is subject of (P805) with value Afghanistan–United States relations (Q4689214), but without the value Embassy of the United States, Kabul (Q894835) for qualifier diplomatic mission sent (P531) (there might be another appropriate value for this qualifier, but this we cannot know from the statement on United States of America (Q30)).
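One possible way to say which qualifiers are copied is a per-property list. The sketch below assumes such a table (`COPY_QUALIFIERS`, a hypothetical name) and a simplified `Statement` record; neither exists in Wikidata in this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    prop: str
    value: str
    qualifiers: tuple = ()  # tuple of (qualifier property, value) pairs


# Hypothetical configuration: qualifiers that survive the inversion.
COPY_QUALIFIERS = {
    "P26": {"P580", "P582"},  # spouse: start time, end time
    "P530": {"P805"},         # diplomatic relation: statement is subject of
}


def invert(st):
    """Build the symmetric statement, copying only the configured qualifiers."""
    keep = COPY_QUALIFIERS.get(st.prop, set())
    quals = tuple(q for q in st.qualifiers if q[0] in keep)
    return Statement(st.value, st.prop, st.subject, quals)


us_afg = Statement("Q30", "P530", "Q889",
                   (("P805", "Q4689214"), ("P531", "Q894835")))
afg_us = invert(us_afg)
```

Here `afg_us` keeps statement is subject of (P805) but drops diplomatic mission sent (P531), as in the second example above.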

Symmetry of properties that are declared symmetric

Wikidata has a way to say that properties are symmetric by making them instance of (P31) symmetric property (Q18647518). Therefore, instead of writing down one rule for every symmetric property, it would be better to have a rule that says the following:

If item ?A has a statement ?S1 with property ?P and value ?B,

and property ?P has a statement ?S2 with property instance of (P31) and value symmetric property (Q18647518)

then item ?B has a statement ?S3 with property ?P and value ?A.

Example:

Douglas Adams (Q42) has spouse (P26) with value Jane Belson (Q14623681), and spouse (P26) has instance of (P31) with value symmetric property (Q18647518), therefore Jane Belson (Q14623681) has spouse (P26) with value Douglas Adams (Q42).
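A sketch of the data-driven variant, under the same simplifying assumption that statements are (subject, property, value) triples and that statements about properties live in the same set. The function name `infer_symmetric` is invented for this example.

```python
SYMMETRIC_PROPERTY = "Q18647518"
INSTANCE_OF = "P31"


def infer_symmetric(statements):
    """Return the new statements implied by properties declared symmetric."""
    # Properties that are an instance of "symmetric property".
    symmetric_props = {s for (s, p, v) in statements
                       if p == INSTANCE_OF and v == SYMMETRIC_PROPERTY}
    inferred = {(v, p, s) for (s, p, v) in statements if p in symmetric_props}
    return inferred - set(statements)


statements = {
    ("P26", "P31", "Q18647518"),   # spouse is declared symmetric
    ("Q42", "P26", "Q14623681"),   # Douglas Adams, spouse, Jane Belson
    ("Q42", "Pxxx", "Qyyy"),       # placeholder statement with an undeclared property
}
new_statements = infer_symmetric(statements)
```

Only the spouse statement is inverted; the placeholder property is left alone because nothing declares it symmetric.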

Discussion:

An advantage of this approach is that one rule applies to many properties, based on their data. --Markus Krötzsch (talk) 11:49, 27 August 2015 (UTC)
A disadvantage of this approach is that it might be difficult to specify qualifier handling for all possible qualifiers in this one rule. It is not possible to handle the same qualifier in different ways for different symmetric properties. This may indicate that the general classification of properties as symmetric property (Q18647518) is too coarse to be used in reasoning. --Markus Krötzsch (talk) 11:49, 27 August 2015 (UTC)

Subclass of is transitive

It is usually assumed that subclass of (P279) is transitive in the following sense:

If entity ?A has a statement ?S1 with property subclass of (P279) and value ?B

and entity ?B has a statement ?S2 with property subclass of (P279) and value ?C

then entity ?A has a statement ?S3 with property subclass of (P279) and value ?C

Normally, there are no qualifiers for subclass of (P279). It would not be clear if the above rule is still correct with some (unexpected) qualifier. Therefore, it would be good to specify that the rule can only be applied if statements ?S1 and ?S2 do not have any qualifiers.
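The restriction to qualifier-free statements could look as follows. This is a naive fixed-point computation for illustration only; the input format (a flag saying whether the statement has qualifiers) and the class names are assumptions of the sketch.

```python
def subclass_closure(statements):
    """statements: iterable of (subclass, superclass, has_qualifiers)
    for subclass of (P279). Statements with qualifiers are ignored."""
    closure = {(a, b) for (a, b, has_quals) in statements if not has_quals}
    changed = True
    while changed:  # repeat until no new pair can be derived
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


statements = [
    ("A", "B", False),
    ("B", "C", False),
    ("C", "D", True),   # has a qualifier, so it takes no part in the inference
]
closure = subclass_closure(statements)
```

The pair ("A", "C") is derived, but nothing involving "D" is, because the only statement mentioning "D" carries a qualifier.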

Female parents are mothers

There is a discussion to remove gender from parenthood relations. The gendered parenthood relations could then be inferred with a rule like the following:

If entity ?A has a statement ?S1 with property Pparent and value ?B

and entity ?B has a statement ?S2 with property sex or gender (P21) and value female (Q6581072),

then entity ?A has a statement ?S3 with property mother (P25) and value ?B

A similar rule could be given for father (P22). Note that to express such a rule, it would be useful to keep properties father (P22) and mother (P25) even if they are not to be used in statements. In a similar way, one could create other "virtual properties" that are only meant to express inferred statements.

It is not clear which qualifiers might be relevant in this inference rule. It might be best to restrict the inference to statements without any qualifiers.
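A minimal sketch of the mother rule over qualifier-free triples. "Pparent" is the placeholder used in the rule above, not a real property ID, and the item names are invented.

```python
PARENT = "Pparent"          # placeholder from the rule above, not a real property
SEX_OR_GENDER = "P21"
FEMALE = "Q6581072"
MOTHER = "P25"


def infer_mothers(statements):
    """Derive mother (P25) statements from parent and gender statements."""
    female = {s for (s, p, v) in statements
              if p == SEX_OR_GENDER and v == FEMALE}
    return {(child, MOTHER, parent)
            for (child, p, parent) in statements
            if p == PARENT and parent in female}


statements = {
    ("child", PARENT, "parent1"),
    ("child", PARENT, "parent2"),
    ("parent1", SEX_OR_GENDER, FEMALE),
}
mothers = infer_mothers(statements)
```

Nothing is inferred for "parent2", since no gender statement is available for that item.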

Offices of heads of government

An example of a rule that involves several properties and specific items, but which is still universally true by definition:

If item ?A has a statement ?S1 with property head of government (P6) and value ?B

and item ?A has statement ?S2 with property office held by head of state (P1906) and value ?C

then item ?B has statement ?S3 with property position held (P39) and value ?C.

Example:

Cyprus (Q229) has a statement with property head of government (P6) and value Dimitris Christofias (Q57334), and Cyprus (Q229) has a statement with property office held by head of state (P1906) and value President of Cyprus (Q841760), therefore Dimitris Christofias (Q57334) has a statement with property position held (P39) and value President of Cyprus (Q841760).

Note that in this case, the temporal qualifiers start time (P580) and end time (P582) should be copied from statement ?S1 to statement ?S3, but not from statement ?S2. Where the office held by head of state (P1906) changes over time (e.g. from a king to a president), statement ?S2 will need start and end time qualifiers, which will make the rules much more complicated. The rule above may only apply to preferred (i.e. current) values of statements, or only to statements with only one value. Alternatively, if ?S2 has values ?C1 and ?C2, then we can allow ?S3 to have values ?C1 or ?C2 and ignore the time qualifiers on ?S2.
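The qualifier handling described here (copy temporal qualifiers from ?S1 only) can be sketched as below. Statements are plain dictionaries, and the qualifier values "START" and "OTHER" are placeholders rather than real dates; both are assumptions of this example.

```python
TEMPORAL = ("P580", "P582")  # start time, end time


def infer_positions(statements):
    """Derive position held (P39) statements, copying temporal
    qualifiers from the head of government statement only."""
    inferred = []
    for s1 in statements:
        if s1["prop"] != "P6":  # head of government
            continue
        for s2 in statements:
            if s2["prop"] == "P1906" and s2["subject"] == s1["subject"]:
                quals = {q: v for q, v in s1["qualifiers"].items()
                         if q in TEMPORAL}
                inferred.append({"subject": s1["value"], "prop": "P39",
                                 "value": s2["value"], "qualifiers": quals})
    return inferred


statements = [
    {"subject": "Q229", "prop": "P6", "value": "Q57334",
     "qualifiers": {"P580": "START"}},   # placeholder value, copied to ?S3
    {"subject": "Q229", "prop": "P1906", "value": "Q841760",
     "qualifiers": {"P580": "OTHER"}},   # placeholder value, not copied
]
positions = infer_positions(statements)
```

The sketch does not check that the time ranges of ?S1 and ?S2 overlap, which is exactly the complication mentioned above.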

Discussion:

Since the number of states is not so large, it might be easier to have a specific rule for each to control the details. In this case, one would not have to check for office held by head of state (P1906). --Markus Krötzsch (talk) 20:30, 27 August 2015 (UTC)
A rule should never depend on whether there are one or more statements for one property. The meaning of Wikidata statements should not change based on other statements in general. Moreover, having two statements does not mean that there are two values; maybe there are just two different values from two references. It would be very hard to base a rule on this. --Markus Krötzsch (talk) 20:30, 27 August 2015 (UTC)

Publication date propagation, value propagation

If an article ?A was published in a journal edition ?B and the publication date of ?B is ?D, then the publication date of ?A is also ?D, and conversely. author TomT0m / talk page 13:41, 28 August 2015 (UTC)
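A sketch of propagation in both directions, assuming two simple mappings (article to edition, and item to known date). The item names and the date are invented, and conflicting dates are not detected here.

```python
def propagate_dates(published_in, pub_date):
    """published_in: article -> edition; pub_date: item -> known date.
    Returns a new mapping with dates copied in both directions."""
    dates = dict(pub_date)
    changed = True
    while changed:  # repeat until nothing new can be filled in
        changed = False
        for article, edition in published_in.items():
            for src, dst in ((edition, article), (article, edition)):
                if src in dates and dst not in dates:
                    dates[dst] = dates[src]
                    changed = True
    return dates


published_in = {"article1": "edition1", "article2": "edition1"}
pub_date = {"article1": "2015-08-28"}   # hypothetical date
dates = propagate_dates(published_in, pub_date)
```

The date known for one article first reaches the edition (the converse direction) and from there the other article in the same edition.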

Types and properties and their instances: a painting on canvas (Q16669405) has a canvas (Q20155905)

If a class of objects has common properties, for example any painting on canvas has a canvas, we should be able to infer that any instance of that class also has a canvas, or that any oil painting (oil paint (Q296955)) is made with the relevant painting technique and is painted with oil and pigments ... author TomT0m / talk page 17:22, 31 August 2015 (UTC)

Part of

In some cases, when applied to a class as in the example, some properties just would not make sense for a physical object: pigments are used to paint, but the class of paintings itself, as an abstract object, doesn't have pigments. But some properties, like part of, can apply to classes and instances with a different meaning. It's usually said (see Refining "part of") that an instance (token) can only be a part of a class, and a class can only be part of a class. But what if all instances of some class are part of the same concrete object?

Examples (taken from Wikidata_talk:Requests_for_comment/Help:Basic_membership_properties, thanks to ArthurPSmith and Filceolaire):

<inner planets> are <part of> the <Solar System>
every college of some university U is an instance of a class <college of ?U university>, as well as a part of the ?U university

Thus, as things stand, we can't express that every planet in the <solar system planet> class has a <part of: solar system> statement just by adding <part of: solar system> to the <solar system planet> item.

Update: After some thought, I'd suggest a model to achieve that. If every instance I of a class C has a property with main snak pair (prop, val), then the class should have a statement

⟨ C ⟩ has quality ⟨ https://www.wikidata.org/wiki/Q23766486 ⟩
prop ⟨ val ⟩

and from this it should be inferred that

⟨ I ⟩ prop ⟨ val ⟩
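The suggested model could be applied roughly as follows. The sketch assumes the (prop, val) pairs have already been read from the qualifiers of each class's "has quality" statement into a dictionary; the item labels and the function name are invented, and subclass chains are not followed.

```python
def infer_from_class_qualities(instance_of, class_qualities):
    """instance_of: iterable of (instance, class) pairs.
    class_qualities: class -> list of (prop, val) pairs taken from the
    qualifiers of its "has quality" statement.
    Returns the inferred (instance, prop, val) statements."""
    return {(inst, prop, val)
            for inst, cls in instance_of
            for prop, val in class_qualities.get(cls, [])}


instance_of = [
    ("Earth", "solar system planet"),
    ("Mars", "solar system planet"),
    ("Sirius", "star"),
]
class_qualities = {"solar system planet": [("part of", "Solar System")]}
inferred = infer_from_class_qualities(instance_of, class_qualities)
```

Every instance of the planet class receives the part of statement, while the class item itself gets no direct part of statement.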

Applies to part and part of

An example to illustrate:

⟨ some picture ⟩ color (P462) ⟨ blue ⟩
applies to part ⟨ sky ⟩

could as well be written

⟨ some picture ⟩ has part ⟨ the picture's sky ⟩
⟨ the picture's sky ⟩ color (P462) ⟨ blue ⟩

Decomposing an image or a painting hierarchically and creating items for the parts and subparts might be the only way to describe a picture in depth. applies to part can be used when the hierarchy is not too deep and helps reduce the number of items, but it hardly applies in general, as the parts need an identifier somehow and might themselves be decomposed ... Inferring one representation from the other should go from the particular case to the more generic one, and might need the creation of a blank item ... @Filceolaire: as we discussed this issue some time ago :) author TomT0m / talk page 09:56, 2 September 2015 (UTC)

I don't think it is our job to "describe a picture in depth". That is what the image file is for. I really don't want to start creating items for parts of a picture. Let's just stick to qualifiers and that will go as deep as we need to go. Joe Filceolaire (talk) 10:06, 2 September 2015 (UTC)

@Filceolaire: I'm just collecting use cases. I saw people start doing this on the French Bistro, that is all. author TomT0m / talk page 10:11, 2 September 2015 (UTC)

valid in period and begin/end dates

Use case description

valid in period (P1264) is a qualifier that indicates the period of time during which a statement is valid. It reduces redundant dates when the same time period is used often, and helps the user enter facts efficiently.

Possible rule

if ?st =
⟨ subject ⟩ Wikidata property ⟨ object or value ⟩
valid in period (P1264) ⟨ ?period ⟩
and
⟨ ?period ⟩ start time (P580) ⟨ ?bd ⟩
and
⟨ ?period ⟩ end time (P582) ⟨ ?ed ⟩
then
?st should be augmented with
⟨ subject ⟩ Wikidata property ⟨ object or value ⟩
start time (P580) ⟨ ?bd ⟩
end time (P582) ⟨ ?ed ⟩
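A sketch of this augmentation, assuming statements are dictionaries with a qualifier mapping and that the start and end of each period have already been looked up. The period item "Qperiod", the property "Pxxx" and the dates are placeholders.

```python
def expand_valid_in_period(statement, periods):
    """Add start time (P580) and end time (P582) qualifiers derived from
    the valid in period (P1264) qualifier. Existing qualifiers are kept."""
    quals = dict(statement["qualifiers"])      # copy, do not mutate the input
    period = quals.get("P1264")
    if period in periods:
        start, end = periods[period]
        quals.setdefault("P580", start)        # never overwrite an explicit date
        quals.setdefault("P582", end)
    return {**statement, "qualifiers": quals}


periods = {"Qperiod": ("2014-01-01", "2014-12-31")}   # hypothetical period item
statement = {"subject": "subject", "prop": "Pxxx", "value": "value",
             "qualifiers": {"P1264": "Qperiod"}}
expanded = expand_valid_in_period(statement, periods)
```

Using `setdefault` is a design choice of the sketch: an explicit start or end time on the statement wins over the one derived from the period.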

Discussion

author TomT0m / talk page 08:29, 3 September 2015 (UTC)

statement is subject of and facet of

An item whose subject is a statement on another item is a facet of that other item:

If item ?A has a statement ?S1 with property ?P1 and qualifier statement is subject of (P805) and qualifier value ?B,

then item ?B has a statement ?S2 with property facet of (P1269) and value ?A.

Examples:

⟨ Barack Obama (Q76) ⟩ position held (P39) ⟨ President of the United States (Q11696) ⟩
statement is subject of (P805) ⟨ presidency of Barack Obama (Q1379733) ⟩
⟨ presidency of Barack Obama (Q1379733) ⟩ facet of (P1269) ⟨ Barack Obama (Q76) ⟩
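The rule can be sketched as a single pass over statements with their qualifiers. Statements are modelled as (subject, property, value, qualifiers) tuples with one value per qualifier, which is an assumption of this example.

```python
STATEMENT_IS_SUBJECT_OF = "P805"
FACET_OF = "P1269"


def infer_facets(statements):
    """statements: iterable of (subject, prop, value, qualifiers), where
    qualifiers maps a qualifier property to a single value."""
    return {(quals[STATEMENT_IS_SUBJECT_OF], FACET_OF, subject)
            for (subject, prop, value, quals) in statements
            if STATEMENT_IS_SUBJECT_OF in quals}


statements = [
    # Barack Obama, position held, President of the United States
    ("Q76", "P39", "Q11696", {"P805": "Q1379733"}),
]
facets = infer_facets(statements)
```

The main property ?P1 plays no role in the inference, which matches the rule above: any statement carrying the qualifier triggers it.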

Discussion:

Nationalities

Just to record the idea: it seems that the country of citizenship (P27) of many people can change because of a change in the administrative status of the country they belong to. For example, for Harry Secombe (Q762958) there is a statement

⟨ Harry Secombe (Q762958) ⟩ country of citizenship (P27) ⟨ United Kingdom of Great Britain and Ireland (Q174193) ⟩
end time (P582) ⟨ 1927 ⟩
end cause (P1534) ⟨ Royal and Parliamentary Titles Act 1927 (Q7375047) ⟩

and a statement

⟨ subject ⟩ country of citizenship (P27) ⟨ United Kingdom (Q145) ⟩

It seems that there could be many other people in the same situation, and that inference could be a good way to avoid this kind of redundancy, but I have no idea what kind of rule would be needed for this. Maybe the biggest problem is to build the ontology needed to write such a rule :) Actually the example is bad, as the country was split and it seems impossible to know whether the person became Irish or stayed British; a better example could be a merger of countries, such as German reunification (Q56039), or other recent cases. author TomT0m / talk page 12:06, 25 October 2018 (UTC)

(For the record, this idea was inspired by a question on the French Wikidata project chat, which wondered why « britanny » was shown twice in the frwiki infobox, once with an end date.) author TomT0m / talk page 11:45, 25 October 2018 (UTC)