Wikidata:Requests for comment/Kinship
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Merging of P:P7 and P:P9 was approved; the discussion died after May 20. Possible open issues should be discussed in a new RfC. --Sannita - not just another it.wiki sysop 00:10, 31 July 2013 (UTC)[reply]
I proposed merging P:P7 (brother) and P:P9 (sister) at Wikidata:Project chat. User:Izno pointed out to me quite well that the current family relationship system is a mess. So I ask you to discuss here whether to implement my proposal (see below) or else what we should do in general about this.
Current system
- Properties
- P:P7 brother (male)
- P:P9 sister (female)
- P:P22 father (male)
- P:P25 mother (female)
- P:P26 spouse (male or female)
- P:P29 uncle (male, paternal or maternal)
- P:P40 child (male or female)
- P:P43 stepfather (male) [note: husband of the subject's mother excludes husbands of homosexual fathers]
- P:P44 stepmother (female) [note: wife of the subject's father excludes wives of homosexual mothers]
- P:P45 grandparent (male or female, maternal or paternal)
- P:P139 aunt (female, paternal or maternal)
- P:P451 cohabitant (male or female) [note: it is unclear what is meant by this, see discussion page]
As you can see, this system is highly inconsistent and redundant.
If we choose to keep going down the redundancy path, there is no valid argument against the creation of these properties:
- cousin (what about second cousin twice removed, qualifiers?)
- grandchild
- stepbrother
- stepsister
- half-sister
- half-brother
- niece
- nephew
- etc.
At least we should clean up the inconsistency by deciding whether to keep different properties for different sexes or merge them together. Otherwise there is also no valid argument against:
- son
- daughter
- female cousin
- male cousin
- maternal grandmother
- paternal grandmother
- maternal grandfather
- paternal grandfather
- husband
- wife
- grandson
- granddaughter
- maternal aunt
- paternal aunt
- maternal uncle
- paternal uncle
- etc.
Proposed systems
Option 1
So I propose a minimalistic alternative. The one we use, or should be using, for every other item category as well.
- Properties
- P:P22 biological father (male)
- P:P25 biological mother (female)
- P:Pα legal parent (male or female)
- P:P26 spouse (male or female)
- P:Pβ1 to P:Pβn (male or female) [note: any number of different non-marriage sexual relationships, from unofficial to on the same terms as legal marriage]
Every other family relation can be derived from these statements, no redundancies. —★PοωερZtalk 03:06, 26 April 2013 (UTC)[reply]
Option 2
*P:P7 brother (male)
*P:P9 sister (female)
*P:P29 uncle (male, paternal or maternal)
- P:P40 child (male or female)
*P:P43 stepfather (male) [note: husband of the subject's mother excludes husbands of homosexual fathers]
*P:P44 stepmother (female) [note: wife of the subject's father excludes wives of homosexual mothers]
*P:P45 grandparent (male or female, maternal or paternal)
*P:P139 aunt (female, paternal or maternal)
- and a new property called family relation which can be used for uncle, aunt, cousin,... when missing relations occur. The type of relation will be described by qualifiers. Example: family relation: John Doe, qualifier: cousin.
- and for legal parents and adopted child we can use a qualifier again, so there is no need to distinguish biological parent from legal one through the property, only with the help of a qualifier. Snipre (talk) 08:00, 26 April 2013 (UTC)[reply]
Discussion
I also prefer a simple system, however there is a problem here. In many cases, we can't derive the relationship of two people because we have no information about the intermediary people. For instance, if we only know two people are cousins and there is no more information we can get from any source, how do we link them together? So I propose another property called "indirect relationship" and use a qualifier to identify the relationship type to deal with this situation. This property should only be used when the intermediary people are unknown. --Stevenliuyi (talk) 03:35, 26 April 2013 (UTC)[reply]
- This surely is a rare occasion, but I see your point. There is little information about many historical characters. I can't think of an example off the top of my head, but let's pretend Jesus III. succeeded his cousin Jesus II. as King of Dystopia, but only an ancient ruler list is known as source, no further information. We can't skip this data so we need something like your proposal for these cases. —★PοωερZtalk 03:44, 26 April 2013 (UTC)[reply]
Support I agree with the general idea here: deducible properties should generally be avoided. Here, that means properties like 'stepmother', 'stepfather', 'aunt', 'uncle', and 'grandparent' should not exist because they are trivially deducible from the properties 'mother', 'father', 'brother' and 'sister', 'wife' and 'husband'. This minimalist approach has some obstacles, but they're resolvable. For example, what should be done if you want to determine the subject's brother-in-law (who has a Wikidata item about him), but the subject's husband doesn't have a Wikidata item about him? Two possible solutions are:
- Create a Wikidata item for the otherwise non-notable husband (as implied in a comment in the brother-in-law property proposal), so that reasoners can deduce the answer by traversing the subject's family tree with existing properties.
- Create a Wikidata claim using a generic 'relative' property which directly specifies the brother-in-law, and don't make a Wikidata item for the husband. Then qualify what type of relative the person is with a 'relation' property, which takes an item like brother-in-law.
The second approach seems like it would better handle more distant relations, like an ancestor or descendant separated by many generations of people that don't have Wikidata items about them. Emw (talk) 04:01, 26 April 2013 (UTC)[reply]
- I go with number 1. It's not like Wikidata will run out of space for items. The hypothetical husband might not be of interest now, but there could always be someone querying for "list of husbands of chemists" or something along that line. I also fear people arguing whether to use property relative because sometimes it might not be clear if the items are properly linked at a first glance.
As to why mother and father, not just parent: It's about whose egg / sperm cell it actually was, since sex is not clear enough in this regard. If both parents are marked intersex, there is no way a bot could distinguish that. —★PοωερZtalk 04:35, 26 April 2013 (UTC)[reply]
- Do you have an actual example of a person notable enough to have an item on Wikidata whose parents are both intersex? Gabbe (talk) 08:44, 26 April 2013 (UTC)[reply]
Comment Personally I think the really unnecessary property is P:P40 (child). For the other properties, there are conceivable cases where we know that "X" and "Y" are siblings, for example, but there isn't any information about their parents, so the "brother/sister/sibling" property is the only way of specifying this kinship. But there is no conceivable instance where we can say that "X is the father/mother/parent of Y" where it isn't by necessity the case that "Y is the child of X". Parameter P40 is totally redundant. Gabbe (talk) 08:53, 26 April 2013 (UTC)[reply]
Question To what extent do you think these properties will be used outside enwp? -- Lavallen (block) 14:29, 26 April 2013 (UTC)[reply]
- My initial impression is that these properties will be used on all Wikipedias. I think it's also likely that they will be used by third parties beyond Wikimedia. Emw (talk) 15:14, 27 April 2013 (UTC)[reply]
- Maybe, but I have doubts, since I know that there are many users who think that notability is not an inheritance. -- Lavallen (block) 11:34, 29 April 2013 (UTC)[reply]
Comment
Option 1 is good and I Support it. (edit: see new comment below)
- As for Option 2, I see multiple problems. First, "child" is redundant if we have a way of indicating parenthood. Second, it says that "mother" and "father" could mean biological or legal, which would be indicated by qualifiers. How would we indicate that a legal parent is intersex? Using mother implies the female sex, and using father implies male. What if two legal parents are women; is one of them the father, or are they both mothers? What if a legal father changes his sex? The item pointed to by father will have "sex: female", which is inconsistent.
- As for indirect relations where the intermediate persons are known, but aren't notable by themselves, I think the structural notability option covers their inclusion as items. However, for claims of indirect relations where the intermediate persons are unknown, I agree with Stevenliuyi that we would need an "indirect relation" property, along with a qualifying property to indicate what kind of relation it is.
- Silver hr (talk) 17:38, 26 April 2013 (UTC)[reply]
- Just make all properties independent of a specific sex --Pyfisch (talk) 19:26, 26 April 2013 (UTC)[reply]
- Regarding the first problem with Option 2 -- "child" is redundant if we have a way of indicating parenthood -- I think it's worth noting that 'sibling' can also be considered redundant: Jane's siblings could be inferred by seeing which items specify Jane's parents as their parents. So the most minimalist design for genealogical properties is to use just one property: parent. Does that mean 'sibling' and 'child' should not exist? I'm not so sure. Emw (talk) 15:09, 27 April 2013 (UTC)[reply]
- "Sibling" and "parent" properties are not always redundant to one another. There are pairs of items who we know to be siblings, without knowing anything more. For instance, if I want to indicate that Q5491071 and Q12006095 are sisters, is it really a good idea to create a phantom parent item, about whom nothing further will ever be known? "Parent" and "child", however, are always necessarily redundant. There is no conceivable situation where "item X is the parent of item Y" where it will not be the case that "item Y is the child of item X". Gabbe (talk) 07:53, 29 April 2013 (UTC)[reply]
- A property relative with the respective qualifiers (sibling, cousin, etc.) has been proposed by a number of people to deal with the rare cases we don't know the actual family tree for sure. —★PοωερZtalk 08:17, 29 April 2013 (UTC)[reply]
- Unfortunately, I have to completely reverse my position on this, after some thinking. The reason is that one of Wikidata's initial requirements is that "Wikidata will not be about the truth, but about statements and their references". This carries the consequence that if a source says "A has uncle B", that's exactly what we have to record. We cannot claim that the source said "A has mother X, X has brother B", even though that's what's implied. So, even though I prefer a minimalist design with only the bare necessary properties and inference of everything else, I am now forced to admit that we will need as many properties as necessary to accurately record what sources literally say. Silver hr (talk) 02:53, 25 May 2013 (UTC)[reply]
- Comment – But references in many languages would not use a word with the same meaning as "uncle" in English. In these languages the sources say things like "sibling of mother" or "sibling of parent" or "brother of mother", or other things that may involve the persons' relative age. If you wouldn't accept any of these references for an "uncle" claim, Wikidata would become biased toward the English language. Byrial (talk) 06:24, 25 May 2013 (UTC)[reply]
- Comment This problem is entirely solved with the proposed special relationship + qualifier (in this case uncle) property. —★PοωερZtalk 07:23, 25 May 2013 (UTC)[reply]
- Oppose (keep the current system). Redundancy is not necessarily a bad thing. It is more complicated to infer a non-existing property (at inclusion time) than to add another property (using a bot). For example, if a wiki would like to present separate lists of brothers and sisters, it would take a more complicated query to check the sex of each sibling. Additionally, if an item is found to be the brother of some other item, a bot can add the property sex to the first item (as well as the main type person to both items). Also, as others have pointed out, we may need to keep track of relationships (like cousin, uncle, etc) between two items without having items for the common ancestors. Rsocol (talk) 03:51, 27 April 2013 (UTC)[reply]
- Comment I'm not really clear about the disadvantages of the current series of properties. What is the problem that you are trying to resolve? -- Docu at 05:09, 29 April 2013 (UTC)[reply]
- Inconsistency bugs me and I'm generally opposed to redundant properties. —★PοωερZtalk 07:34, 29 April 2013 (UTC)[reply]
We should work this out step by step. I think most of us agree:
- brother/sister is redundant (and potentially imprecise) to sex and should be replaced by sibling,
- child is redundant. —★PοωερZtalk 08:17, 29 April 2013 (UTC)[reply]
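The two points above can be sketched in a few lines, assuming parent statements are available as plain (child, parent) pairs and that each item carries a recorded sex. All names here (`children_index`, `brothers_and_sisters`, the sample item labels) are hypothetical and only illustrate the argument; they are not part of Wikidata.

```python
from collections import defaultdict

# Hypothetical input: (child, parent) pairs taken from parent statements.
PARENT_STATEMENTS = [("Y1", "X"), ("Y2", "X"), ("Z1", "W")]


def children_index(parent_statements):
    """Invert parent statements; no separate child property is stored."""
    children = defaultdict(list)
    for child, parent in parent_statements:
        children[parent].append(child)
    return dict(children)


def brothers_and_sisters(person, sibling_of, sex_of):
    """Split a person's siblings by the sex recorded on each sibling's item."""
    siblings = sibling_of.get(person, [])
    brothers = [s for s in siblings if sex_of.get(s) == "male"]
    sisters = [s for s in siblings if sex_of.get(s) == "female"]
    # Siblings with any other, or no, recorded sex end up in neither list.
    return brothers, sisters
```

The second function is also the "more complicated query" mentioned earlier in the discussion: a single sibling property is enough, but the consumer has to look up the sex of each sibling.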
Comment I think kinship relations should be generally gender/sex neutral from an IT point of view since gender/sex is not a property of a relationship but a property of a person.--Trockennasenaffe (talk) 10:00, 29 April 2013 (UTC)[reply]
- I completely agree, with one exception. I try to make it clear by a famous, simple and fictional example: Eric Cartman's mother Liane Cartman (no item yet) is technically speaking intersex and Eric's biological father. This cannot be properly described by a gender neutral parent property. —★PοωερZtalk 10:49, 29 April 2013 (UTC)[reply]
- That's true. So biological father (male), biological mother (female) and P:Pα legal parent as in Option 1 is probably best.--Trockennasenaffe (talk) 11:05, 29 April 2013 (UTC)[reply]
Just came to my head: There is a difference between Wikidata lacking information about who are the biological parents and actual data unknown (i.e. adoption, etc.). The former can just be left blank until data comes in, but the latter is information we should store here. Since father/mother is datatype item, how can this be done? —★PοωερZtalk 11:51, 29 April 2013 (UTC)[reply]
- I think Wikidata in general lacks the possibility to distinguish between "property value nonexistent" and "property value unknown". Maybe you could create an item "unknown" for this purpose. In this case, however, I don't see the problem, since every person has biological parents even if they are unknown.--Trockennasenaffe (talk) 12:08, 29 April 2013 (UTC)[reply]
- When you're adding a statement to an item, there's a little blue icon to the left of the second input box. Clicking on this opens a menu containing the options "Custom value", "Unknown value", and "No value". Select "Unknown value" to show that the value is actually unknown, and not just not added yet. --Yair rand (talk) 17:54, 29 April 2013 (UTC)[reply]
- Oh, thanks. Nice feature I wasn't aware of. —★PοωερZtalk 17:59, 29 April 2013 (UTC)[reply]
- The feature does not work with the MonoBook skin (nothing happens when you click on the icon). Where do we report such problems? Rsocol (talk) 19:43, 29 April 2013 (UTC)[reply]
- Either at bugzilla: or WD:Contact the developers. --Yair rand (talk) 19:52, 29 April 2013 (UTC)[reply]
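The distinction raised in this thread (a statement not added yet versus a value that is actually unknown) can be modelled roughly as follows. The three string values mirror the snak types of the Wikibase data model (value, somevalue, novalue), which correspond to the "Custom value", "Unknown value" and "No value" menu options; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SnakType(Enum):
    VALUE = "value"            # "Custom value": a concrete item is given
    SOME_VALUE = "somevalue"   # "Unknown value": a value exists but is not known
    NO_VALUE = "novalue"       # "No value": the property has no value at all


@dataclass
class Statement:
    subject: str
    prop: str
    snak_type: SnakType
    value: Optional[str] = None


def father_of(person, statements):
    """Report what is recorded about a person's father, if anything."""
    for st in statements:
        if st.subject == person and st.prop == "father":
            if st.snak_type is SnakType.VALUE:
                return st.value
            if st.snak_type is SnakType.SOME_VALUE:
                return "unknown value"
            return "no value"
    # No statement at all: the data simply has not been entered yet.
    return "not recorded yet"


# Hypothetical example: an adoptee whose biological father is unknown.
STATEMENTS = [Statement("adoptee", "father", SnakType.SOME_VALUE)]
```

With this, `father_of("adoptee", STATEMENTS)` and `father_of("someone else", STATEMENTS)` give different answers, which is the difference between "unknown" and "left blank until data comes in".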
Comment No property is redundant, only brother/sister can be merged. What about the case when there are two famous persons who are siblings, but their parents are not notable? How can we say they are brother and sister when there is no property for it?
Wikidata is not a genealogical database, but a database of notable things (persons), so it might be impossible to deduce indirect relationships, because some parts are missing. I prefer Option 2 with at least Property:Sibling too. JAn Dudík (talk) 13:21, 29 April 2013 (UTC)[reply]
- Have you read Wikidata:Notability? Who are the parents of a relevant person is relevant data by itself, I'd say, but no. 3 makes your argument entirely invalid.
- Have you even read this discussion yet? The idea of an indirect relation property has come up multiple times now for the case where the linking people are actually missing (and not defined out of existence because of policy criteria); the type of relation can be expressed by qualifiers this way. —★PοωερZtalk 13:40, 29 April 2013 (UTC)[reply]
- For these, are you suggesting to use: "Property:indirect relation" : (item about the person/relative), qualifier: "Property:relation type": (item: e.g. aunt) ? -- Docu at 22:17, 29 April 2013 (UTC)[reply]
- Yeah, something along that line. But again, just for the rare cases we have a source stating person1 is person2's aunt and there is no other information available whatsoever. Aunt is defined as "a person who is the sister or sister-in-law of a parent" (4 possibilities, and that's just under the premise each parent only has one brother and sister and the brothers only had one wife each), this is kinda vague information anyway. —★PοωερZtalk 22:41, 29 April 2013 (UTC)[reply]
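A minimal sketch of the "indirect relation" plus "relation type" qualifier idea discussed in this thread, assuming a claim is just a subject, a property, a value and a dictionary of qualifiers. The `Claim` class and the helper are hypothetical, and the property names are the working names used in the discussion, not existing Wikidata properties. The last lines enumerate the four readings of "aunt" mentioned above.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Claim:
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


# "person2 has aunt person1", with no items for the people in between.
CLAIMS = [
    Claim("person2", "indirect relation", "person1", {"relation type": "aunt"}),
]


def relatives(claims, subject, relation_type):
    """Return the relatives of one type recorded through the generic property."""
    return [
        c.value
        for c in claims
        if c.subject == subject
        and c.prop == "indirect relation"
        and c.qualifiers.get("relation type") == relation_type
    ]


# "Sister or sister-in-law of a parent": two parents times two kinds of link.
AUNT_READINGS = [
    f"{parent}'s {link}"
    for parent, link in product(("father", "mother"), ("sister", "brother's wife"))
]
```

The qualifier records exactly what the source says, while `AUNT_READINGS` shows why such a claim stays vague: it cannot tell which of the four family-tree paths actually applies.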
New Section
Okay, once and for all, I'll show you why Option 1 is not going to lose any information.
- A and B are notable persons (i.e. at least one valid sitelink to a Wikipedia page)
- Premise: Parents of notable persons fulfil the structural need of filling the parents/father/mother statement.
Relations we have to cover (i.e. family properties existing as of now):
- sibling
- uncle/aunt
- stepparent
- grandparent
Relation: | A is sibling of B | A is uncle/aunt of B | A is stepparent of B | A is grandparent of B |
---|---|---|---|---|
Statement 1 | A parent: C | A parent: C | B parent: C | B parent: C |
Statement 2 | B parent: C | B parent: D | A spouse: C | C parent: A |
Statement 3 | | D parent: C | | |
And along the way, this method has more explanatory power than the current one. It is clear whether A and B are brothers or just half-brothers, whether A is a patrilinear or matrilinear uncle/grandparent of B, etc., etc. without ever using any qualifier and just two properties (parent and spouse; but as I said before, I prefer different properties for father/mother, skipped sex completely for the sake of simplicity here). —★PοωερZtalk 14:20, 29 April 2013 (UTC)[reply]
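The table can be turned into a small sketch, assuming statements are stored as plain (subject, property, value) triples and that only "parent" and "spouse" are recorded. The sample names and all function names are hypothetical; the comments map each check back to the statements in the table.

```python
# Hypothetical statement store: (subject, property, value) triples.
STATEMENTS = [
    ("Ann", "parent", "Carl"),
    ("Dan", "parent", "Carl"),
    ("Ben", "parent", "Dan"),
    ("Eve", "spouse", "Dan"),
]


def build_index(statements):
    """Group the triples into parent and spouse lookups."""
    parents, spouses = {}, {}
    for subject, prop, value in statements:
        if prop == "parent":
            parents.setdefault(subject, set()).add(value)
        elif prop == "spouse":
            # Spouse is symmetric, so record both directions.
            spouses.setdefault(subject, set()).add(value)
            spouses.setdefault(value, set()).add(subject)
    return parents, spouses


PARENTS, SPOUSES = build_index(STATEMENTS)


def parents_of(x):
    return PARENTS.get(x, set())


def is_sibling(a, b):
    # A parent: C, B parent: C
    return a != b and bool(parents_of(a) & parents_of(b))


def is_uncle_or_aunt(a, b):
    # A parent: C, B parent: D, D parent: C (A is a sibling of B's parent D)
    return any(is_sibling(a, d) for d in parents_of(b))


def is_stepparent(a, b):
    # B parent: C, A spouse: C. Excluding the case where A is a parent of B
    # is an added assumption; the table does not state it.
    married_to_parent = bool(SPOUSES.get(a, set()) & parents_of(b))
    return married_to_parent and a not in parents_of(b)


def is_grandparent(a, b):
    # B parent: C, C parent: A
    return any(a in parents_of(c) for c in parents_of(b))
```

In the sample data Ann and Dan share the parent Carl, Ben is Dan's child and Eve is Dan's spouse, so each of the four relations from the table occurs once. Note that the uncle/aunt check only follows the blood line, as in the table; relatives by marriage would need an extra spouse step.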
I think we still need the "child" property. Let me give a concrete example. Here is a database entry about an ancient Chinese person. In the kin section, you can find "S1: Ouyang Fa", "S2: Ouyang Yi", "S3: Ouyang Fei", "S4: Ouyang Bian", which represent the person's eldest son, second son, third son and fourth son. How do we record this kind of information? According to [1] and [2], we don't know some of his sons' birth years, so we can't derive the information based on birth years. I think in this case we need the child property, and use qualifiers such as "relation type: eldest son". --Stevenliuyi (talk) 23:26, 29 April 2013 (UTC)[reply]
- Child is still redundant as this information can also be stored at the parent statement. I'd also go with a more general approach at this: not relation type but simply rank or order (1, 2, etc.). This would be widely useful across Wikidata. And (just a quick thought) in this particular context could also provide a way to properly express a twin relation (when birth dates aren't known) by assigning the same number to them.
- Moreover, child groups different things together and is therefore imprecise (at least as it is now). In Wikidata:List_of_properties#Relationship it is defined as the opposite of mother/father and stepmother/stepfather, which is fairly useless since you can be a legal parent without marrying a biological parent and also can marry a biological parent without becoming legal parent of the children in most countries. —★PοωερZtalk 00:31, 30 April 2013 (UTC)[reply]
- Yes "child" is redundant somehow but I think it's still useful. First, for the above example, I prefer to use the qualifier "relation type" because it's more flexible to adapt to different cultures and situations. For instance, in Chinese culture there is a concept called "trueborn eldest son" who may or may not be the first-born son. It's difficult for "rank/order" to represent this concept. And there are also other "trueborn sons" whose order we don't know; using a "child" property and qualifiers is a simple solution to deal with this situation, and perhaps many other situations in different cultures. Second, using a "child" property we can have a statement like "child: no" for those who don't have kids. I think that is also useful information to include.
- Besides, although we don't want much redundant data, it doesn't mean we can't tolerate any redundant data. There are several other useful reciprocal properties, such as preceded by and followed by. Even for the "spouse" property, the data is redundant now because we add it to both the wife items and the husband items. I don't think we should delete half of all "spouse" statements just to prevent redundant data. I agree with you that the child property is imprecise now, but we can simply split it into "biological child" and "adopted child". --Stevenliuyi (talk) 02:07, 30 April 2013 (UTC)[reply]
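The rank/order proposal from this thread can be sketched as follows, assuming each parent statement on a child's item carries an optional numeric rank qualifier. The triple layout and the function name are hypothetical, and "father" stands in for the unnamed person from the database entry. Children that share a rank are grouped together, which is how the twin case mentioned above would come out.

```python
from itertools import groupby

# Hypothetical (child, parent, rank) triples; rank is None when unknown.
STATEMENTS = [
    ("Ouyang Yi", "father", 2),
    ("Ouyang Fa", "father", 1),
    ("Ouyang Bian", "father", 4),
    ("Ouyang Fei", "father", 3),
]


def children_in_order(parent, statements):
    """Return the ranked children of a parent, grouped by rank in ascending order."""
    ranked = sorted(
        (rank, child)
        for child, p, rank in statements
        if p == parent and rank is not None
    )
    # Children with the same rank (for example twins) share one group.
    return [
        [child for _, child in group]
        for _, group in groupby(ranked, key=lambda pair: pair[0])
    ]
```

A plain number cannot express a concept such as "trueborn eldest son", which is the objection raised in the reply; that case would still need a separate "relation type" qualifier.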
- Comment – How two people are related, and what sex one of them has, are two distinct things which should not be mixed up in a database. The database should have different fields for relationship and sex, and each user can then combine them as wanted, for instance to match the words in one's language. It is impossible to make predefined combinations to suit all, as the need for combinations differs between people. Therefore there should be no separate properties for sister and brother, aunt and uncle, female cousin and male cousin (which happen to have different words in my native languages), daughter and son, wife and husband etc. I intentionally did not include mother and father in the list, because I think the relationships here are different, as the mother bore the child while the father did not. That is to me the fundamental thing to decide first. Then we can always discuss how many of the relationships we should have. Byrial (talk) 13:19, 5 May 2013 (UTC)[reply]
- Exactly, that's why I said, let's do it step-by-step so we can get finished eventually instead of rambling about details preventing us from doing anything in the end. So for everybody who isn't aware of it yet: Wikidata:Properties_for_deletion#Property:P9 (see version history for admin wild west) —★PοωερZtalk 13:30, 5 May 2013 (UTC)[reply]
Comment there are various resources on the Web to this topic, I think this is a good starting point: http://jay.askren.net/Projects/SemWeb/ --FischX (talk) 01:46, 6 May 2013 (UTC)[reply]
Support - Seems good. I cannot see why Wikidata cannot incorporate a genealogical database. --凡其Fanchy 18:42, 14 May 2013 (UTC)[reply]
On genealogy websites all you have to do is specify a person's parents, and all the relationships, brother, sister, half-brother, etc. are determined automatically. Quite a simple thing to do. Danrok (talk) 18:34, 20 May 2013 (UTC)[reply]
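How a brother and a half-brother would be told apart automatically can be sketched like this, assuming separate father and mother lookups keyed by item, as in Option 1. The function name and the sample labels are hypothetical. When a parent is not recorded for one of the two people, the sketch only claims "at least half-sibling".

```python
def sibling_kind(a, b, father, mother):
    """Classify two people from their recorded father and mother."""

    def compare(index):
        pa, pb = index.get(a), index.get(b)
        if pa is None or pb is None:
            return None  # not recorded for at least one of them
        return pa == pb

    results = (compare(father), compare(mother))
    if results == (True, True):
        return "full sibling"
    if True in results:
        # One shared parent; the other is either different or not recorded.
        return "half-sibling" if False in results else "at least half-sibling"
    return None


# Hypothetical sample data.
FATHER = {"a": "F", "b": "F", "c": "F", "d": "F"}
MOTHER = {"a": "M", "b": "M", "c": "N"}
```

Here `a` and `b` share both parents, `a` and `c` share only the father, and `d` has no recorded mother, so the three cases produce three different answers.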
The merging of P7 and P9 is closed. It doesn't make much sense to keep on discussing here before phase III has started. —★PοωερZtalk 04:04, 20 May 2013 (UTC)[reply]
- So should this RFC be temporarily closed? --Yair rand (talk) 17:02, 17 June 2013 (UTC)[reply]
- I think so. Emw (talk) 14:35, 29 June 2013 (UTC)[reply]