Wikidata:Property proposal/Lexemes

This page is for the proposal of new properties.

Before proposing a property

  1. Check if the property already exists by looking at Wikidata:List of properties (research on manual list) and Special:ListProperties.
  2. Check if the property was previously proposed or is on the pending list.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Start writing the documentation based on the preload form below and add it in the appropriate section.

Creating the property

  1. Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the proposal, by a property creator or an administrator.
  3. See steps when creating properties.

  On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2020/05.

Lexeme

Etymology

Grammar

position with respect to the noun

   Under discussion
Description: the position this lexeme must take with respect to the noun; normally used with adjectives
Data type: Item
Domain: lexeme
Allowed values: before (Q79030196) and after (Q79030284)
Example 1: red (L3271) → before (Q79030196)
Example 2: abertzale (L82679) → after (Q79030284)
Example 3: bueno (L230409) → before (Q79030196)
Planned use: add information about adjectives
Expected completeness: always incomplete (Q21873886)

Motivation

This is relevant if we need to model a language with automatic tools, and also for translation. Where to place the adjective (or other words) is difficult for many people, and machines need to learn it too. Theklan (talk) 15:03, 18 December 2019 (UTC)

Discussion

  Support - Theklan (talk) 15:03, 18 December 2019 (UTC)

Hmm, I'm not sure this is the right way to model this, although I agree something is needed to cover this aspect of language. Another aspect of this is the "adjective order" rules, see en:Adjective#Order - how could that be modeled? It seems like maybe we ought to have something like "classes" for lexemes; we could use instance of (P31) or a custom property to indicate? Not sure how to proceed on this... ArthurPSmith (talk) 20:10, 18 December 2019 (UTC)
@ArthurPSmith: I don't know if this is the best way to do this, but in Basque this is a relevant aspect. And I suspect that it exists also in other languages. -Theklan (talk) 12:42, 23 December 2019 (UTC)

  Comment This might vary by sense, cf. French grand (L9217): un grand homme (‘a great man/celebrity’) vs. un homme grand (‘a tall man’). With the proposal being that this be a property of lexemes, would there then be two different lexemes for French grand, namely grand (Lxxx1) with position with respect to the noun: before, and grand (Lxxx2) with position with respect to the noun: after (see wikt:fr:grand#Notes)?―BlaueBlüte (talk) 04:41, 26 December 2019 (UTC)

  Comment Regarding before (Q79030196) and after (Q79030284), why do we have adverbs ( instance of (P31) adverb (Q380057)) in the item space? (In which case, which language should they be taken from? And should that language be used regardless of the language of the lexemes described by this proposed property?)
Items should represent more abstract concepts (e.g., anteposition (placement of a dependent phrase before the one it depends on) instance of (P31) placement of dependent phrase), not the adverbs used to denote those concepts.―BlaueBlüte (talk) 04:41, 26 December 2019 (UTC)

  Comment “with respect to the noun” might be inconveniently narrow, as the lexeme in question might also be a noun phrase or other nominal phrase. And speaking of which, are there languages that have non-arbitrary ordering of certain kinds of adjectives or similar? For example, adjectives that are always positioned after the noun, but before any other adjectives that are also positioned after the noun? Like, ‘ranks’ of adjectives? Maybe the range for this property should in that case be (positive and negative) numeric, e.g., negative numbers meaning before, positive numbers meaning after the ‘noun’, with the absolute value indicating how far (with respect to potential further adjectives as attributes to the same ‘noun’).―BlaueBlüte (talk) 04:41, 26 December 2019 (UTC)
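
The signed-rank idea in the comment above can be sketched as follows. This is only an illustration under stated assumptions: the function name `order_phrase` and the rank values in the usage note are hypothetical and are not part of the proposal or of any Wikidata data.

```python
from typing import List, Tuple


def order_phrase(noun: str, modifiers: List[Tuple[str, int]]) -> List[str]:
    """Arrange modifiers around a noun using signed ranks.

    Assumed convention (from the comment above): a negative rank places the
    modifier before the noun, a positive rank after it, and a larger absolute
    value places it farther from the noun. Rank 0 is reserved for the noun.
    """
    for word, rank in modifiers:
        if rank == 0:
            raise ValueError(f"rank 0 is reserved for the noun: {word!r}")
    # Treat the noun as rank 0 and sort everything by rank.
    ranked = list(modifiers) + [(noun, 0)]
    return [word for word, _ in sorted(ranked, key=lambda item: item[1])]
```

With hypothetical ranks, `order_phrase("maison", [("blanche", 1), ("petite", -1)])` returns `["petite", "maison", "blanche"]`. A single integer per lexeme covers both the before/after distinction and the relative order among several modifiers, but it still cannot express the sense-dependent placement raised for French grand.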

  •   Weak oppose If it were by sense, this would be very good for open-source translations, but for now it needs some changes, as User:BlaueBlüte says.

  Comment after (Q79030284) and before (Q79030196) are currently instance of (P31) adverb (Q380057). They should not be words, but concepts IMHO. — Finn Årup Nielsen (fnielsen) (talk) 00:33, 28 February 2020 (UTC)

feminine form of lexeme

   Under discussion
Description: lexeme that is the feminine form of the subject lexeme
Represents: feminine (Q1775415)
Data type: Lexeme
Domain: lexeme
Example 1: king (L9670) → queen (L1380)
Example 2: acteur (L13374) → actrice (L12849)
Example 3: padre (L221496) → madre (L47362)
Example 4: lion (L17815) → lioness (L43104)
Planned use: to be used where it doesn't make sense to add the feminine form as a Form to the subject Lexeme, but instead is considered to be a separate Lexeme
See also: female form of label (P2521), male form of label (P3321)

Motivation

Currently there is no single way to do this. Theklan (talk) 13:47, 25 October 2019 (UTC)

Discussion

  Comment I think the general idea makes sense, but it shouldn't be limited to feminine / masculine form (many languages have more than these two genders, and some don't have both of these but have others). A more generic property "gendered version" or whatnot might make sense :) (the word "form" might not be ideal because we already use it for forms of a lexeme).--Reosarevok (talk) 15:38, 25 October 2019 (UTC)

  Comment @Theklan: This relationship can already be represented in the Item space by using female form of label (P2521) and male form of label (P3321), and at the Sense level by using antonym (P5974) qualified with criterion used (P1013) = gender binary (Q5530970). It could be good to have similar properties for the Lexeme level. Perhaps they should be called "female form of lexeme" and "male form of lexeme" to better mirror the existing Item-typed properties, and to specify that it is at the Lexeme level rather than the Form level (as the current proposed name is a bit ambiguous in that regard). Also, it would be good to include a few more use cases other than just the "king & queen" example. Liamjamesperritt (talk)

Maybe the solution is to have a general gender representation property, so we can have more than two options. Also I don't consider that actor and actress are antonyms. -Theklan (talk) 08:23, 29 October 2019 (UTC)

  Comment @Theklan, Reosarevok, Fnielsen, ArthurPSmith: With permission from Theklan, I updated the proposal to make it a little clearer. Liamjamesperritt (talk) 05:18, 9 November 2019 (UTC)

The symmetric proposal is here: Wikidata:Property proposal/masculine form. Finn Årup Nielsen (fnielsen) (talk) 12:38, 13 November 2019 (UTC)

  Comment This property should be used on sense level not lexeme. Dichotomy might not apply to all senses of a given lexeme. --Lexicolover (talk) 22:04, 4 December 2019 (UTC)

  Comment The proposed name for the property is not correct. King/queen are not forms of the lexeme, these are two separate lexemes describing close concepts which differ by gender of the subject. Thus something like "male/female describing term for this sense" should be better. --Lexicolover (talk) 22:04, 4 December 2019 (UTC)

  Comment Is it possible to generalize this concept to cover cases like sheep/ram/lamb etc.? --Lexicolover (talk) 22:04, 4 December 2019 (UTC)

  Comment All your proposals are really interesting. We need something to denote this (sheep/ram/lamb is a good example) and antonym is not valid (queen is not the antonym of king). Can someone make a proposal that could handle everything? -Theklan (talk) 15:11, 18 December 2019 (UTC)

  Strong oppose The proposal conflates, as it seems, multiple aspects that are customarily discussed as separate phenomena, and usefully so in both linguistics and its practical applications; and needlessly inscribes a concrete notion of ‘gender’ that applies only to a few languages.

  • king (L9670) vs. queen (L1380) is a question of mere semantics and so should operate on the level of senses. This could be described using item for this sense (P5137); as an example, see Gräfin (L34190). This could then be queried like so:
    SELECT ?l1 ?lemma1 ?sense1 ?gloss1 ?l2 ?lemma2 ?sense2 ?gloss2 WHERE {
      # Assuming that we have a given sense ?sense1 (here: L34190-S1),
      FILTER (?sense1 = wd:L34190-S1)
    
      # we are looking for a ?sense2 that has the same conceptualized meaning ?meaning,
      ?sense1 wdt:P5137 ?meaning.
      ?sense2 wdt:P5137 ?meaning.
      
      # but we don’t want the ?meaning to be the gender aspect of the sense.
      MINUS { 
        ?meaning wdt:P31 wd:Q48277, wd:Q48264
      }
    
      # We also want the ?sense2 to be about a lexeme in the same language
      ?l1 dct:language/^dct:language ?l2.
      # where ?l1 and ?l2 are the lexemes for the senses ?sense1 and ?sense2, respectively  
      ?sense1 ^ontolex:sense ?l1.
      ?sense2 ^ontolex:sense ?l2.
      # (Note that we could write the above three triples as the single line
      #   ?sense1 ^ontolex:sense/dct:language/^dct:language/ontolex:sense ?sense2.
      # but then we wouldn’t be binding lexemes ?l1 and ?l2, which we might want for debugging.)
    
      # And we don’t want ?sense2 to be identical to the original ?sense1
      FILTER (?sense2 != ?sense1)
                                  
      # Finally, for debugging purposes, we’re binding the lemmas and glosses for ?sense1 and ?sense2
      ?l1 wikibase:lemma ?lemma1.
      ?sense1 skos:definition ?gloss1.
      ?l2 wikibase:lemma ?lemma2.
      ?sense2 skos:definition ?gloss2.
    }
    LIMIT 1
    
    (Technically, while the main code is agnostic to specific genders, the above is a query that retrieves a ‘masculine form’ because it is applied to a ‘feminine form’ and the only other ‘form’ currently defined is one that may be viewed as either masculine or generic. But the query would work in reverse as well, although the lexeme properties might not be fully populated yet.) (Incidentally, @Lexicolover: This could also address the case of “sheep” vs. “ram” vs. “lamb”; in the case of the latter in particular by substituting baby animal (Q12038335) for the gender items.)
  • acteur (L13374) vs. actrice (L12849) might be about etymology, and so in regard to that should operate on the level of lexemes using derived from (P5191): actrice (L12849) derived from (P5191) acteur (L13374), maybe with a qualifier like mode of derivation (P5886) -trice (L25480) or using some better-suited property (maybe one more closely related to gender inflection (Q1124523), whose current label “gender inflection” is maybe not quite on point):
    actrice (L12849) derived from (P5191) acteur (L13374) / mode of derivation (P5886) -trice (L25480)
    lioness (L43104) derived from (P5191) lion (L17815) / mode of derivation (P5886) -ess

    A query equivalent to what this proposed property might have been intended to be about in regards to etymology might search for, depending on the notion of ‘gender’ applicable to the language in question,
  • The general nonapplicability of a dichotomous notion of ‘gender’ has already been pointed out. I might add that even under the assumption of a dichotomous (real-world) gender,
  • @Theklan: If indeed a “single way to do this” is desired, one might build a query (and maybe service around that) that somehow combines the two queries sketched above, though it is not clear to me how that would yield a well-defined result in the case of languages that, for example, have both (real-world-)gender-dependent lexemes and (language-intrinsic) grammatical gender.

BlaueBlüte (talk) 07:38, 26 December 2019 (UTC)

masculine form of lexeme

   Under discussion
Description: lexeme that is the masculine form of the subject lexeme
Represents: masculine (Q499327)
Data type: Lexeme
Domain: lexeme
Example 1: queen (L1380) → king (L9670)
Example 2: actrice (L12849) → acteur (L13374)
Example 3: madre (L47362) → padre (L221496)
Example 4: lioness (L43104) → lion (L17815)
Planned use: to be used where it doesn't make sense to add the masculine form as a Form to the subject Lexeme, but instead is considered to be a separate Lexeme
See also: female form of label (P2521), male form of label (P3321)

Motivation

Currently there is no single way to do this. Theklan (talk) 13:47, 25 October 2019 (UTC)

Discussion

Used with verb

   Under discussion
Description: word used with this adjective; probably translated as "to be" into English.
Data type: Lexeme
Domain: allowed on senses of lexemes; not just adjectives, to allow stuff like "hambre" having "tener" on it
Allowed values: senses of lexemes
Example 1: bueno (L230409) S1 → ser (L5140)
Example 2: bonito (L232551) S1 → estar (L5141)
Example 3: MISSING
Expected completeness: eventually complete (Q21873974) (limited number of languages have a distinction like this)

Motivation

For machine translation, there needs to be a way to check which word to use with an adjective, stored on each sense. In English this is not a problem; there is just one word for "to be": "I am sad" (currently), "I am a Wikipedian", and "I am hungry" all use the same word. However, this is different in other languages, e.g. Spanish. The three examples are translated as "yo estoy triste", "yo soy Wikipedista", and "yo tengo hambre". This property would allow that distinction to be recorded for adjectives. P.S. I probably messed up some values of the template; feel free to correct them. DemonDays64 | Talk to me 02:34, 14 January 2020 (UTC) (please ping on reply)
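
As a rough sketch of how a translation tool might consume such statements, the lookup below maps a sense to the lemma of its linking verb. The two sense-to-verb pairs are the examples given in the proposal; the dictionary layout and the function name `verb_for_sense` are illustrative assumptions, not an existing Wikidata API.

```python
from typing import Optional

# Sense ID -> lexeme ID of the verb, taken from the proposal's two examples.
USED_WITH_VERB = {
    "L230409-S1": "L5140",  # bueno, sense 1 -> ser
    "L232551-S1": "L5141",  # bonito, sense 1 -> estar
}

# Lexeme ID -> lemma, as labelled in the proposal.
LEMMAS = {"L5140": "ser", "L5141": "estar"}


def verb_for_sense(sense_id: str) -> Optional[str]:
    """Return the lemma of the verb recorded for a sense, or None if no
    statement is stored (the caller then has to fall back to a default)."""
    verb_id = USED_WITH_VERB.get(sense_id)
    if verb_id is None:
        return None
    return LEMMAS.get(verb_id)
```

Storing the statement on the sense rather than the lexeme means a lexeme whose senses take different verbs can carry one statement per sense.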

Discussion

  •   Support However the label needs to be something more standardized - what would you call this relationship in Spanish? ArthurPSmith (talk) 18:23, 14 January 2020 (UTC)
@ArthurPSmith: hmm...so "to be" is called a wikipedia:Copula. However, that is only used when referring to ser and estar, not the other ones used with adjectives. Maybe this needs some re-phrasing. Maybe "used with verb" would be better? DemonDays64 | Talk to me 15:41, 15 January 2020 (UTC)
Maybe "linking verb"? But from the page en:Copula (linguistics) it seems many languages don't use a verb for this at all, but some other mechanism. So "copula" probably wouldn't work if you want to restrict it to verbs; is there a more general way to model this here? ArthurPSmith (talk) 16:03, 15 January 2020 (UTC)
  Oppose The way this property is phrased looks like it's very English centric and not focused on general categories that linguists use. ChristianKl❫ 09:32, 15 January 2020 (UTC)
@ChristianKl: yeah the name doesn’t translate well. Maybe "used with verb" would be better, to be clearer outside of English, in languages without a general word for "to be"? DemonDays64 | Talk to me 15:41, 15 January 2020 (UTC)
The field of linguistics has put a lot of effort into researching how to classify language. Before adding a property like this we should look at how linguists model the problem domain. ChristianKl❫ 07:49, 21 January 2020 (UTC)

cognate

   Under discussion
Description: Lexeme in another language derived from the same word in an old language
Represents: cognate (Q690548)
Data type: Lexeme
Domain: Lexeme
Example 1: hound (L6419) → hund (L31499)
Example 2: bryn (L252176) → brow (L16178)
Example 3: father (L3624) → père (L2245)
See also: derived from (P5191)

Motivation

Etymological information is conveyed by the derived from (P5191) property. Parallel to this we could establish the proposed property, with which lexemes in two (modern) languages could indicate a shared derivational root. This has the N×N downside that is also an issue with the translation (P5972) property: all cognates in other languages need to be specified on all lexemes. However, in some cases it may be easier to establish cognates directly rather than to record the complete derivation chain and find cognates via SPARQL. Finn Årup Nielsen (fnielsen) (talk) 00:52, 28 February 2020 (UTC)
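
For comparison, the alternative mentioned above (finding cognates by walking derived from (P5191) chains) can be sketched on a toy graph. The ancestor IDs `PROTO-1` and `PROTO-2` and all language codes are hypothetical placeholders, not real Wikidata data; only the lexeme IDs L6419, L31499 and L3624 come from the proposal.

```python
from typing import Dict, List, Set

# Toy "derived from" edges: lexeme -> direct etymons (placeholder ancestors).
DERIVED_FROM: Dict[str, List[str]] = {
    "L6419": ["PROTO-1"],   # hound
    "L31499": ["PROTO-1"],  # hund
    "L3624": ["PROTO-2"],   # father
}

# Language codes are illustrative assumptions.
LANGUAGE: Dict[str, str] = {
    "L6419": "en", "L31499": "da", "L3624": "en",
    "PROTO-1": "proto", "PROTO-2": "proto",
}


def ancestors(lexeme: str, derived_from: Dict[str, List[str]]) -> Set[str]:
    """Collect every etymon reachable through the derivation chain."""
    seen: Set[str] = set()
    stack = list(derived_from.get(lexeme, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(derived_from.get(current, []))
    return seen


def cognates(lexeme: str, derived_from: Dict[str, List[str]],
             language: Dict[str, str]) -> Set[str]:
    """Lexemes in another language that share an ancestor with `lexeme`,
    excluding its own ancestors and descendants."""
    mine = ancestors(lexeme, derived_from)
    result: Set[str] = set()
    for other in language:
        if other == lexeme or language[other] == language[lexeme]:
            continue
        theirs = ancestors(other, derived_from)
        if other in mine or lexeme in theirs:
            continue
        if mine & theirs:
            result.add(other)
    return result
```

This only works when both chains are recorded all the way up to a common etymon, which is the gap a direct cognate statement would cover.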

Discussion

transitivity

   Under discussion
Description: property for transitivity (Q830012)
Represents: transitivity (Q830012)
Data type: Item
Domain: verbs
Example 1: создать (L297630) → transitive verb (Q1774805)
Example 2: MISSING
Example 3: MISSING

Motivation

(Add your motivation for this property here.) Deepsaged (talk) 06:27, 30 April 2020 (UTC)

Discussion

I think this property should be applied on Sense because some verbs can be both transitive and intransitive depending on the sense, at least in French. Pamputt (talk) 07:13, 30 April 2020 (UTC)

My preferred option would be to add it to lexemes and when a verb can be transitive or intransitive, also add transitivity statements to individual senses. We could maybe use ambitransitive verb (Q4115075) at the lexeme level. - Nikki (talk) 11:25, 30 April 2020 (UTC)

@Pamputt: Even if we use this on senses, we still need a transitivity property, or am I missing something? Deepsaged (talk) 20:55, 30 April 2020 (UTC)

Yes, you are right, this property is interesting anyway. I was just talking about where to place this proposal in this page. Pamputt (talk) 08:25, 1 May 2020 (UTC)

I just found has quality (P1552), and I see examples where such relations are marked with it. Please look at жена (L108683) has quality (P1552) animate (Q51927507). I do not think it is a good idea to use has quality (P1552) for all cases where we do not have an exact property. If we try to write SPARQL to get animacy or transitivity marked with has quality (P1552), we will have to list all animacy/transitivity variants in the query or join an additional level using subclass of (P279) animacy (Q1250335). So I think we also need to create an animacy property. Deepsaged (talk) 09:10, 1 May 2020 (UTC)

  • @Deepsaged: There should be a description that explains what the property is about. Ideally, one that doesn't just repeat the word transitivity. A good description allows people to find better translations when a word can be translated in multiple ways. Wikitext is also not allowed in descriptions. ChristianKl❫ 20:56, 4 May 2020 (UTC)
  • @ChristianKl: I do not know how to explain it in any other way than is done in transitivity (Q830012). Should I copy the text "property of verbs that relates to whether a verb can take direct objects and how many such objects a verb can take" from it into the description?

structure word class

   Under discussion
Description: a group of Lojban structure words that have the same grammatical use (can appear interchangeably in sentences, as far as the grammar is concerned) but differ in meaning or other usage
Represents: Category:Lojban selma'o (Q31732340)
Data type: Item
Domain: lexeme, where Lexical class == function word (Q2120608). See for example zo (L299039), (L299541), (L299538), bau (L299042).
Allowed values: the jbo-language label of any item that is an instance of Category:Lojban selma'o (Q31732340). (SPARQL query)
Example 1: zo (L299039) → ZO (Q93923538)
Example 2: (L299541) → A (Q93920876)
Example 3: (L299538) → A (Q93920876)
Example 4: bau (L299042) → BAI (Q93923669)
Source: The Complete Lojban Language, v1.2.6, section 2.18, Lojban wiktionary entry
Planned use: Entering Lojban cmavo lexemes
Expected completeness: eventually complete (Q21873974)
Robot and gadget jobs: Yes

Motivation

As part of an effort to provide a dictionary between Lojban and other languages, we want to add data about Lojban structure words (called cmavo in Lojban, and very roughly equivalent to English prepositions). All Lojban structure words are categorized into classes called selma'o in Lojban. The set of selma'o is closed and limited. So, we would like to have a selma'o or (in English) "structure word class" property that we can add to any cmavo lexeme. --Robert.Baruch (talk) 02:27, 11 May 2020 (UTC)

Discussion

  •   Support why not. I do not know the Lojban language but if some people need this, let's add this property. Pamputt (talk) 07:19, 11 May 2020 (UTC)
  •   Strong support --Tinker Bell 01:14, 12 May 2020 (UTC)

I'm not sure if the allowed values should reference Category:Lojban selma'o (Q31732340) or if I should make a new item that isn't a Category? For example, currently ZO (Q93923538) is an instance of Category:Lojban selma'o (Q31732340), but should it instead be an instance of a new item called just "Lojban selma'o"? I should be able to make that change... --Robert.Baruch (talk) 21:51, 17 May 2020 (UTC)

(Answering my own question) Looking at various bits of documentation, I think this help article is most relevant: "All items that represent Wikimedia categories should use category's main topic (P301) to link to the item that is the subject of the Category." In fact, Category:Lojban selma'o (Q31732340) does not have such a property, so I think it's clear I should create the item appropriate for that property, and then link all selma'o to that item rather than to Category:Lojban selma'o (Q31732340). --Robert.Baruch (talk) 15:39, 23 May 2020 (UTC)

Looking over some of the other property proposals, I wonder if we should use instance of (P31) instead. Then, if you want to search for all cmavo for a given selma'o, a SPARQL query that looks for all X such that X instance-of Y and Y instance-of Category:Lojban selma'o (Q31732340) should work. This is the equivalent of X belongs-to-structure-word-class Y. --Robert.Baruch (talk) 15:19, 23 May 2020 (UTC)
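
The two-hop lookup described in the last comment (X instance-of Y, and Y instance-of the selma'o class item) can be sketched over a small in-memory triple list. This is an illustration only: the lexeme-to-item pairs are the proposal's examples, but storing them as instance of (P31) statements on lexemes, and A and BAI being instances of Q31732340, are assumptions made for the sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

P31 = "P31"                  # instance of
SELMAHO_CLASS = "Q31732340"  # Category:Lojban selma'o

TRIPLES: List[Triple] = [
    ("L299039", P31, "Q93923538"),  # zo -> ZO
    ("L299541", P31, "Q93920876"),  # -> A
    ("L299538", P31, "Q93920876"),  # -> A
    ("L299042", P31, "Q93923669"),  # bau -> BAI
    ("Q93923538", P31, SELMAHO_CLASS),
    ("Q93920876", P31, SELMAHO_CLASS),  # assumed
    ("Q93923669", P31, SELMAHO_CLASS),  # assumed
]


def cmavo_by_selmaho(triples: List[Triple],
                     selmaho_class: str = SELMAHO_CLASS) -> Dict[str, List[str]]:
    """Group every X by Y where X P31 Y and Y P31 selmaho_class."""
    # First hop: find every Y that is an instance of the class item.
    classes = {s for s, p, o in triples if p == P31 and o == selmaho_class}
    # Second hop: collect every X that is an instance of one of those Y.
    grouped: Dict[str, List[str]] = defaultdict(list)
    for s, p, o in triples:
        if p == P31 and o in classes:
            grouped[o].append(s)
    return {cls: sorted(members) for cls, members in grouped.items()}
```

A dedicated property would replace the second hop with a single direct statement, at the cost of one more property to maintain.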

Form

common misspellings

   Under discussion
Description: common misspellings of this form
Represents: misspelled word
Data type: String
Domain: form
Example 1: L3280-F1 → "fuscia" (incorrect for fuchsia (L3280))
Example 2: L3280-F1 → "fuschia" (incorrect for fuchsia (L3280))
Example 3: L36116-F1 → "abbonnemang" (incorrect for abonnemang (L36116))
Example 4: L36116-F1 → "abbonemang" (incorrect for abonnemang (L36116))
See also: Wikidata:Property proposal/correct form

Motivation

This makes it possible to easily create e.g. a spell checker that recommends a correction. See discussion here.--So9q (talk) 21:23, 22 March 2020 (UTC)

Discussion

  Support I support this proposal in this form (more in the linked discussion), on the condition that we have an applicable definition of common misspelling. --Lexicolover (talk) 12:17, 24 March 2020 (UTC)

See discussion here: Wikidata_talk:Lexicographical_data#Common_misspellings_data--So9q (talk) 19:55, 24 March 2020 (UTC)

  Neutral we need something to solve this problem but I'm not sure if a simple property is the simplest solution here. A broader system for all sorts of variants would be more difficult but better in the long run, as correct/incorrect spelling is often not a binary situation (see "colour"/"color" in English; correctness is contextual here). Cheers, VIGNERON (talk) 20:44, 25 March 2020 (UTC)

  Neutral: what is your definition of "common"? It sounds a bit arbitrary... Nomen ad hoc (talk) 07:30, 26 March 2020 (UTC).

@Nomen ad hoc: that point can easily be defined objectively by frequency. If a misspelling is over a threshold, let's say 5%, then it's "common". We can use a tool like Google Books Ngram Viewer to see the frequency. We can also rely on sources; dictionaries (especially the descriptivist ones) often give the common misspellings. Cheers, VIGNERON (talk) 08:55, 26 March 2020 (UTC)
@Nomen ad hoc: see proposed definition here: Wikidata_talk:Lexicographical_data#Common_misspellings_data--So9q (talk) 10:41, 26 March 2020 (UTC)
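
The frequency threshold suggested above, combined with the spell-checker use case from the motivation, can be sketched as follows. The occurrence counts in the usage note are invented, and taking the share over all occurrences (correct spelling plus variants) is an assumption, since the discussion does not fix the denominator.

```python
from typing import Dict, List, Optional


def common_misspellings(correct_count: int, variant_counts: Dict[str, int],
                        threshold: float = 0.05) -> List[str]:
    """Return the variants whose share of all occurrences exceeds `threshold`.

    The share is computed over correct_count plus all variant counts
    (an assumption made for this sketch).
    """
    total = correct_count + sum(variant_counts.values())
    if total == 0:
        return []
    return sorted(v for v, c in variant_counts.items() if c / total > threshold)


def suggest(word: str, misspellings: Dict[str, List[str]]) -> Optional[str]:
    """Look up the correct spelling for a known misspelling, else None.

    `misspellings` maps a correct spelling to its recorded misspellings,
    mirroring string statements stored on a form.
    """
    for correct, variants in misspellings.items():
        if word in variants:
            return correct
    return None
```

With invented counts, `common_misspellings(900, {"fuschia": 80, "fuscia": 20})` returns `["fuschia"]` (8% against 2%), and `suggest("fuschia", {"fuchsia": ["fuscia", "fuschia"]})` returns `"fuchsia"`.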

  Support Finn Årup Nielsen (fnielsen) (talk) 11:25, 26 March 2020 (UTC)

  •   Comment I preferred the initial version of this proposal [1] or the earlier proposal (correct form) using form datatype. --- Jura 15:49, 26 March 2020 (UTC)
    • Actually, the direction of the earlier proposal seems preferable (correct form). If the form is only known as a misspelling, "grammatical feature" could include that too. If it's also something else, the "grammatical feature" would just include that "something else". --- Jura 13:28, 1 April 2020 (UTC)
  •   Oppose per above. --- Jura 13:59, 24 April 2020 (UTC)

Sense

pertainym

   Under discussion
Description: this lexeme sense is of or pertaining to the value lexeme sense
Represents: pertainym (Q86527217)
Data type: Sense
Domain: sense
Example 1: L7279-S1 (slowly) → L1388-S1 (slow)
Example 2: L252311-S1 (Danish) → L252308-S1 (Denmark)
Example 3: L43202-S1 (fremtidig @ da) → L46329-S1 (fremtid @ da)
Source: https://globalwordnet.github.io/gwadoc/#pertainym
See also: periphrastic definition (P7219)

Motivation

We lack good methods to link the senses of non-noun lexemes or to link senses across word classes (e.g., hypernymy applies within a word class). A "pertainym" is a concept from the wordnet community (see https://globalwordnet.github.io/gwadoc/#pertainym); it would allow us to link some non-nouns together, specifically some adjectives to nouns and some adverbs to adjectives. Finn Årup Nielsen (fnielsen) (talk) 18:31, 27 February 2020 (UTC)

Discussion

terminology

   Under discussion
Description: terminology where the sense is used
Represents: terminology (Q8380731)
Data type: Item
Domain: sense
Allowed values: terminology (Q8380731)
Example 1: ipso facto (L227969), sense 3 → legal terminology (Q76419834)
Example 2: window (L3327), sense 1 → computing terminology (Q3457057)
Example 3: note (L4316), sense 2 → music terminology (Q77655668)
Planned use: using on many lexemes
See also: language style (P6191)

Motivation

I suggested this on Wikidata:Property_proposal/part of terminology, but I didn't propose it formally. --Tinker Bell 05:27, 19 March 2020 (UTC)

Discussion

  Support Yes I think lexemes is the place for this, and I guess we don't currently have a property that's quite right for it. ArthurPSmith (talk) 18:20, 19 March 2020 (UTC)
  Neutral I approve of the idea, but I don't like the requirement of a terminology item, if only because even within computing or legal terminology, subfields may apply different definitions... Circeus (talk) 17:41, 24 March 2020 (UTC)
@Circeus: do you have an example? --Tinker Bell 07:12, 29 March 2020 (UTC)
Leaning towards   Support Clearly   Support. I'm just wondering: are we sure we couldn't use other properties (maybe facet of (P1269)?), and isn't terminology (discipline) (Q1725664) probably the wrong item? Here it's "terminology" as in « The set of terms actually used in any business, art, science, or the like; nomenclature; technical terms. », the second sense at en:wikt:terminology, while terminology (discipline) (Q1725664) is more about the first sense of the same entry, « The doctrine of terms ». A new item is needed (and would be useful for other items like nomenclature (Q863247)). Cheers, VIGNERON (talk) 17:39, 30 March 2020 (UTC)
@VIGNERON: I've corrected the item. --Tinker Bell 23:34, 30 March 2020 (UTC)
Thanks (somehow I missed this item, no need for a new item then). And what about my first question? Cheers, VIGNERON (talk) 07:35, 31 March 2020 (UTC)
@VIGNERON: Before opening this proposal, I considered using part of (P361). But I realized that it is not a sense that belongs to a terminology, but its concept, so I'm pretty convinced this new property is necessary. I don't see how facet of (P1269) can be used here. --Tinker Bell 08:41, 1 April 2020 (UTC)
TBH, I'm not sure either, I just wanted to be sure you consider other options. Thanks. Cheers, VIGNERON (talk) 08:51, 1 April 2020 (UTC)
  Comment I support adding information about field of use. Just a question: is it necessary to limit the values to terminology (Q8380731)? Would linking to mathematics (Q395), architecture (Q12271) or card game (Q142714) not be sufficient? --Lexicolover (talk) 21:27, 31 March 2020 (UTC)
@Lexicolover: I think it's useful to restrict the allowed values to a set of topics that can be identified and referred easily. For example, the sense about mouse (Q7987) could use computing (Q179310) or computer (Q68)? I would prefer using computing terminology (Q3457057) because it's clearer: the topic isn't too specific, nor too broad. --Tinker Bell 07:18, 1 April 2020 (UTC)
@Tinker Bell:   Disagree It seems you have a need to have an identifier for "lists of words used in a domain". This list can be generated at any time with SPARQL. There are tens of thousands of domains, and so with your opinion, there would be a need to maintain tens of thousands of such lists or identifiers for such topics (really just compound concepts. computing + terminology), like "bicycle terminology" "poker terminology" "browser terminology"... Since "lexeme" actually means "a lexical unit of language", then we are already in the realm of "terminology" and "words". There's no need to create compound concepts.

  Disagree I was looking around for this today, to mark the domains "business" "finance" on S2 sense for Lexeme acceptance while reviewing the same sense on Wiktionary. "field of study" is much more appropriate and allows both broad as well as precise data enrichment. "bow" on a violin -> domain: music, bowed string instrument, violin and similarly a "bow" on a boat -> domain: vessel, marine. "domain" within Wikidata seems to be conceptually wrapped up in "field of study" which seems reasonable, since any "class" can be a field of study. But I would be OK with usage of facet of (P1269) statements on Senses as others mention which seems to be a common alternative way for many Wikidata editors to say "field of study" and where it is typically used as "a domain or field of study" predicate. It seems we already have this ability now to apply any statement to an individual Sense. So I can already apply "facet of" or "field of study" to any Sense. So, I don't see a need for this property to be introduced. Instead, we should just improve the Lexeme documentation with tips on how to apply "field of study" or "facet of" Thadguidry (talk) 16:19, 22 April 2020 (UTC)

  • @Thadguidry: Using studied by (P2579) would be fine for items, but not for lexemes. For example, music terminology (Q77655668) has statement studied by (P2579) music theory (Q193544). We can say that a terminology itself is studied by some field or discipline, but a lexeme by itself is not studied by the field it belongs to, but by lexicology (Q178433). About using facet of (P1269), its description says "item that offers a broader perspective on the same topic". I can't see how a lexeme fulfills that description. --Tinker Bell 01:25, 25 April 2020 (UTC)
  • @Tinker Bell: Hi! Sure, there are lots of relationships that can be made between "words" (lexemes) and "concepts" (items). I guess what I'm trying to lead up to is this... What does the community feel are the most important relationships between the two? Do you think it's more important to have links between the two that describe a relationship where "word senses" are studied, used, or both in a domain? About facet of (P1269): that could be broadened to be used on both items and lexemes, correct? I favor sensible reuse of properties between the two, i.e. I don't think there is anything inherently wrong with saying that a "word sense" can be a facet of some domain. And I think that is the core of this proposal: trying to offer a way to say that a "word sense" has some relationship to some domain. And the question that everyone has, I think, is this... What is the best way to say that some "word sense" is used in some domain? Can we reuse existing properties and make small changes to allow that? I think so; for instance, it might be easier to expand used by (P1535) to "used by/in" to allow a triple pattern of "bow" -> "used by/in" -> "music". Or would that really confuse things, so that we need to create new properties? Thadguidry (talk) 02:36, 25 April 2020 (UTC)

location of lexeme usage

   Under discussion
Description: for lexemes which are considered regional, the main locations where the lexeme is used
Data type: Item
Domain: lexemes
Allowed values: geographical locations
Example 1: carer (L290514) → caregiver (Q553079) (British word for caregiver)
Example 2: MISSING
Example 3: MISSING
See also: location of sense usage (P6084)

Motivation

To be able to mark lexemes which are regional. Uziel302 (talk) 10:02, 26 March 2020 (UTC)

Discussion

Siddhaṃ alphabet

Pronunciation