Wikidata:Property proposal/personal pronoun

preferred pronoun

Originally proposed at Wikidata:Property proposal/Person

Description: personal pronoun(s) this person uses
Represents: personal pronoun (Q468801)
Data type: Lexeme
Domain: human (Q5), fictional character (Q95074)
Allowed values: lexical category pronoun (Q36224)
Example 1: Chelsea Manning (Q298423) → she (L484) (source: her Twitter bio)
Example 2: Lucas Werkmeister (Q57387675) → he (L485) (source: my Twitter bio) and er (L41654) (not sure if there’s an exact source for this)
Example 3: Amandla Stenberg (Q195691) → she (L484) and they (L371) (source: [1]); to reflect common usage, she (L484) could use preferred rank, not sure
See also: sex or gender (P21)

Motivation

If you want to talk or write about a person, you will most likely need to know their pronouns, at least in languages with gendered pronouns like English. This information can often be inferred from the person’s sex or gender (P21), but this is not always the case, especially for non-binary people. Recording it as a separate statement, using a new “personal pronoun” property, would make sense to me. (It would also be possible to concoct items like “non-binary who uses she/her and they/them” for use in sex or gender (P21), but that’s just absurd IMHO.)

This was discussed back in 2013, tangentially under gender identity and briefly in the dedicated proposal personal pronoun. (Pinging participants: AdamBMorgan, Filceolaire, Yair rand, Tobias1984, Danrok, Pere prlpz, Giftzwerg 88, MarrickLip, Jaredzimmerman (WMF).) When reading those discussions, keep in mind that the English label of P:P21 back then was sex, though the {{P}} template will render it as today’s sex or gender.

I think it’s time to discuss this property again, because we now have the correct datatype for it: a set of pronouns corresponds to a lexeme. This lets us specify all the different forms of a set of pronouns (she, her, hers, herself, …) using a single value, and also accounts for the fact that pronouns work differently between languages.
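To illustrate the point about a single lexeme value covering every form, here is a minimal sketch in Python. It is not how Wikibase stores lexemes; the class names, the `form_for` helper and the role keys ("subject", "object", "possessive", "reflexive") are all assumptions made up for this example. Only the IDs (Q298423, L484, Q36224) and the four forms come from the proposal.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Lexeme:
    """Hypothetical, simplified stand-in for a Wikidata lexeme."""
    lexeme_id: str          # e.g. "L484"
    language: str           # language code of the lexeme
    lexical_category: str   # item ID of the lexical category
    forms: Dict[str, str]   # role name (assumed labels) -> written form


@dataclass
class PronounStatement:
    """One 'personal pronoun' statement: a subject item and one lexeme value."""
    subject: str
    pronoun: Lexeme


# she (L484), with the forms listed in the proposal text.
SHE = Lexeme(
    lexeme_id="L484",
    language="en",
    lexical_category="Q36224",  # pronoun
    forms={
        "subject": "she",
        "object": "her",
        "possessive": "hers",
        "reflexive": "herself",
    },
)


def form_for(statement: PronounStatement, role: str) -> str:
    """Look up one form of the pronoun through the single lexeme value."""
    return statement.pronoun.forms[role]


# Example 1 from the proposal: Chelsea Manning (Q298423) -> she (L484).
manning = PronounStatement(subject="Q298423", pronoun=SHE)
```

With this shape, a data user needs only the one statement to reach any form, for instance `form_for(manning, "reflexive")`.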

Questions that I’m still unsure about:

  • What is the expected completeness of the property – how often will it be used? Like sex or gender (P21) or more like sexual orientation (P91)?
  • In how many languages should it be specified per person? All the languages the person speaks? Or more like all the languages in which one might talk/write about the person (i. e. need a pronoun to refer to them by)?
  • Should the English label be “personal pronoun” or “pronoun”? The former seems more specific, but the current lexical category of values like he (L485)/she (L484)/they (L371) is Q36224 (pronoun) rather than Q468801 (personal pronoun). For now, “preferred pronoun”, as suggested by Ash Crow.

Lucas Werkmeister (talk) 21:08, 20 January 2019 (UTC)

Discussion

  • First of all, I don't think we should be adding such extra statements in cases when the content can be deduced from existing P21 data. In the relevant contexts, are any of the pronouns ever split from their groups? Would anything be gained from splitting "he" from "him", instead of using all relevant pronouns as a group? How do these things work across languages? (Does anyone go by a masculine pronoun in one language and a feminine one in the other? Are there direct translations that work?) Also, pronouns in many languages depend on many things other than P21 attributes, and may be different depending on who is speaking. Taking that into account, I don't think we should have a preferred pronoun property with a lexeme datatype. --Yair rand (talk) 00:43, 21 January 2019 (UTC)
  • “are any of the pronouns ever split from their groups?” I don’t think so, which is why the proposed datatype is “lexeme”, not “form”.
  • “How do these things work across languages?” I don’t know for other languages, but at least in German there’s no non-binary pronoun that’s as widely accepted as English they (L371) (though there are multiple proposals, of course), so I could imagine someone going by they (L371) in English but er (L41654) or sie (L41653) in German. But that’s a hypothetical example, I don’t know enough non-binary German people to have a real example, unfortunately.
  • “Taking that into account, I don't think we should have a preferred pronoun property with a lexeme datatype.” I don’t understand how you come to that conclusion, to be honest. I’m proposing this property because the situation is too complicated to expect data users to infer this information. If we have a property, we can deal with the complexity through the normal Wikidata means – mostly qualifiers, I’d assume, though we need more examples before we can start to figure it out in earnest. If we don’t have a property, we can’t even describe the problem. --Lucas Werkmeister (talk) 01:26, 22 January 2019 (UTC)
  • I don't think we can infer the pronouns from P21. Some non-binary people use only gender-neutral pronouns while some others use both "they" and "she". In French, there are several gender-neutral pronouns (iel, ael, ul, ol, im, em, ille, el and probably others, all equivalent to the singular they) and you cannot guess the preferred one from the P21. For the question of translation I think the best is to stick to sources. If someone gives their preferred pronouns in English and German on their Twitter bio, you cannot deduce the pronoun that they prefer used in French from that (so a lexeme datatype is better than an item there). As for the English label for the property, I think it should be "preferred pronoun" to stress that it is a personal choice from the subject of the item, not a matter of opinion from the contributors or external sources - and so, we cannot accept "he" as a value for Chelsea Manning (Q298423) because some transphobic paper uses it. Overall, I Support the creation of the property, as long as there is a mandatory reference to a statement by the person. -Ash Crow (talk) 21:05, 23 January 2019 (UTC)
  • @Lucas Werkmeister: Sorry, I didn't realize that all of those were structured as forms of a single lexeme. Still, is this also the case for gendered first- and second-person pronouns, in languages where those pronouns are gendered? My assumption is that gendered first-, second-, and third-person pronouns would not share a single base lexeme, so if this datatype were used there would need to be a separate value for each. In languages that use different pronouns depending on the speaker's gender, or relative status or relationship, or level of formality, these would probably not share a single lexeme and would therefore need separate statements for each. (I'm working under the assumption that there are at least a few languages where this would come out to dozens of statements, despite not knowing of any particular such examples. If someone who knows more about languages believes that that's probably not the case, I would appreciate the correction.) --Yair rand (talk) 00:49, 24 January 2019 (UTC)
  • @Yair rand: Hm, I didn’t think about those cases. Gendered first- and second-person pronouns could, I believe, still be stored under this property (in that case, there would be not only multiple statements for the pronouns in different languages, but also multiple statements for the different grammatical kinds of pronouns). This might not scale well across the other issues you suggested (speaker’s gender etc.), but I don’t know how much of a problem that would be in practice, since neither of us knows any particular examples where that would be the case. More input from others would definitely be useful here. --Lucas Werkmeister (talk) 23:20, 24 January 2019 (UTC)
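
The idea of one statement per language, raised several times above, can be sketched roughly as follows. This is an illustration under assumptions, not a description of any existing tool: the `PronounStatement` class and the `pronouns_for` helper are hypothetical, the language codes are assigned by hand, and the preferred rank on she (L484) for Amandla Stenberg is only the tentative suggestion from Example 3. The selection rule (preferred-rank statements win if there are any, otherwise all matching statements) is meant to mirror the usual "best rank" convention.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PronounStatement:
    """Hypothetical flat record for one 'personal pronoun' statement."""
    subject: str          # item ID of the person
    lexeme_id: str        # lexeme ID of the pronoun
    language: str         # language code of that lexeme (assigned by hand here)
    rank: str = "normal"  # "preferred" or "normal"


# Examples 2 and 3 from the proposal.
STATEMENTS = [
    PronounStatement("Q57387675", "L485", "en"),    # he
    PronounStatement("Q57387675", "L41654", "de"),  # er
    PronounStatement("Q195691", "L484", "en", rank="preferred"),  # she (rank is tentative)
    PronounStatement("Q195691", "L371", "en"),      # they
]


def pronouns_for(statements: List[PronounStatement],
                 subject: str, language: str) -> List[str]:
    """Return the lexeme IDs to use for a subject in one language.

    Preferred-rank statements are returned if any exist; otherwise all
    matching statements are returned. An empty list means no pronoun is
    recorded for that language, and nothing is inferred from other languages.
    """
    matches = [s for s in statements
               if s.subject == subject and s.language == language]
    preferred = [s for s in matches if s.rank == "preferred"]
    chosen = preferred or matches
    return [s.lexeme_id for s in chosen]
```

Asking for a language with no statement, such as `pronouns_for(STATEMENTS, "Q57387675", "fr")`, returns an empty list rather than guessing, in line with the "stick to sources" point made above.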

@Amire80, Robin van der Vliet, Wildly boy, MichaelSchoenitzer, Jura1: @Jsamwrites, Lucas Werkmeister, Yair rand: Done: personal pronoun (P6553) Pintoch (talk) 09:04, 1 March 2019 (UTC)