Wikidata:Property proposal/has contraction

has contraction

Originally proposed at Wikidata:Property proposal/Lexemes

   Not done
Description: this lexeme forms in some uses a contraction with another lexeme. Use the word formed by the contraction as value.
Data type: Form
Domain: lexemes, not prefixes/suffixes in languages where this is thought to be useful. Initially: French (add if needed).
Example 1: "ledit" Lexeme:L19375-F1 → "dudit" Lexeme:L19364-F1
Example 2: de (L2379) → "du"
Example 3: lequel (L11158)-F1 → auquel (L19396)-F1
Example 4: "lesdits" Lexeme:L19375-F2 → "desdits" Lexeme:L19364-F2
Planned use: add to applicable fr items

Motivation

Helps find cases where the listed forms don't apply. --- Jura 07:57, 9 September 2018 (UTC)

Discussion

  •   Comment doesn't seem to raise any concerns, ready to go? --- Jura 11:27, 13 October 2018 (UTC)
  • @Jura1: I think it would be good to have at least a few support votes first. (I do not work with lexemes myself so I will not vote, although it looks sensible) − Pintoch (talk) 17:16, 14 October 2018 (UTC)
  • @Jura1, Pintoch: Something to identify contractions sounds useful, however, again, I don't understand the way in which this property has been proposed. Typically a contraction comes about from a combination of 2 lexemes to make a third: for an English example, "have" + "not" -> "haven't". From the examples it looks like this proposed property would link "have" to "haven't", which is fine, except the examples are at the form level while I think in most cases the combination happens at the top lexeme level (so "hasn't" and "hadn't" would need their own individual statements with this proposal, but just one statement would cover all forms if it was at the lexeme level). I'm not familiar enough with French to know if the common cases are similar to English in that regard or not, but it probably bears some discussion here. Also, how should the second lexeme ('not' in my example) be treated - as a qualifier? Or perhaps it should be the value of the statement with the contracted form linked as a qualifier? I think the modeling here needs to be discussed a bit more. Maybe on the Lexicographic Data talk page? ArthurPSmith (talk) 14:44, 15 October 2018 (UTC)
  •   Comment I have a feeling that it can be used for Polish lexemes too, like "żem"="że"+"jestem (form of być)", but somehow I don't understand the proposal or how to use it. Intuitively I would use combines lexemes (P5238) with object form (P5548) qualifier. KaMan (talk) 06:49, 16 October 2018 (UTC)
    • It might work, but I got the impression that for Polish it's the rule rather than the exception. P5238 might work better on "żem" than "że". --- Jura 11:49, 16 October 2018 (UTC)
  • I fixed the description above to read "value" instead of "subject". Format is <subject/entity with statement> <property> <object/property value>. --- Jura 14:34, 16 October 2018 (UTC)
  • @Jura1: I note your new example which does help a little here - so you could link Lexeme:L19375 to Lexeme:L19364 at the lexeme level as I suggested with this property, except the form "de ladite" Lexeme:L19364-F3 is not a contraction. Maybe there's a better label for the property so that would still be ok at the lexeme level? I definitely don't think it's a good idea to limit this to French - many languages have lexemes like this that are contractions of a phrase of two or more words. ArthurPSmith (talk) 18:14, 16 October 2018 (UTC)
    • I don't think it should be limited to French. It's just that I find it useful for French in the way it's defined and I'm not sure if it works well for random languages. For each language, one should consider whether it's worth using and, if yes, just note it in the property documentation. For English, as you pointed out, it might be better to use a similar property that defines it on lexeme level (it would probably have another datatype). For Polish, the ideal level might be on a lexical subcategory pointing to another lexical subcategory (which might be doable with some existing item based property). Anyway, these are all interesting points, but not really relevant to an optimal way to use it in French. --- Jura 13:23, 17 October 2018 (UTC)
    • @ArthurPSmith, KaMan: What do you think of the above? Shall we create two more proposals for lexeme- and item-datatype? Depending on the language, one or several can apply. --- Jura 10:19, 20 October 2018 (UTC)
      I don't know French well enough to have an opinion, and for Polish I think I'll stay with combines lexemes (P5238) because I understand it better. KaMan (talk) 10:37, 20 October 2018 (UTC)
      • I probably shouldn't express an opinion about Polish, but maybe for Polish an interesting statement could be that (some or all) <insert lexical category of "że" mentioned above> form contractions with verbs (lexical category of Lexeme:L3524#L3524-F2 mentioned above). So the statement would be on the items for the lexical categories or their subclasses. Lexemes wouldn't have any statements beyond what you mentioned. --- Jura 10:45, 20 October 2018 (UTC)
  • @Jura1: I'm not comfortable with the proposal as it is, I think it needs further discussion. Maybe bring it up on the Lexicographical data talk page? Or maybe there should be a wikiproject for discussing lexeme modeling issues like this? ArthurPSmith (talk) 14:24, 22 October 2018 (UTC)
    • It seems fairly straightforward. If you are not comfortable with modeling proposals for French, I can't really help you. Why didn't you formulate one for English with what you had in mind? --- Jura 17:42, 23 October 2018 (UTC)

  Not done No support. --Micru (talk) 09:45, 22 December 2018 (UTC)