Wikidata:Property proposal/used in phrase

used in phrase

Originally proposed at Wikidata:Property proposal/Lexemes

Withdrawn

Description	phrase in which the lexeme is used
Data type	Lexeme
Domain	lexeme
Example 1	ovce (L10561) → černá ovce (L221051)
Example 2	hole (L3430) → black hole (L2890)
Example 3	MISSING

Motivation

Some explanatory dictionaries list phrases for their entries, and I haven't found a fitting property for this. This property would make it possible to indicate existing phrases in which the lexeme is used, possibly with a different sense than on its own. Querying for such phrases without it would return incorrect matches (homonyms). Lexicolover (talk) 19:49, 29 December 2019 (UTC)

Discussion

Comment How about modeling the inverse relation? For example:
black hole (L2890) → has part(s) (P527) → black (L3273), with qualifier object of statement has role (P3831): attributive adjective (to be created) or attributive (Q4818723)
black hole (L2890) → has part(s) (P527) → hole (L3430), with qualifier object of statement has role (P3831): head noun (to be created)
Of course, the qualifiers in the above example are not strictly necessary to express the semantics of the proposed property.―BlaueBlüte (talk) 20:15, 29 December 2019 (UTC)

@BlaueBlüte: I like it. In your opinion, is it better to have just this one inverse relation or both (for when the data is reused by third-party tools)? For some reason there are many properties that are inverses of each other. --Lexicolover (talk) 19:10, 30 December 2019 (UTC)

@Lexicolover: (Using only) this inverse-relation approach would probably avoid a lot of clutter on the pages of lexemes from which many other words are derived/which are used in many phrases; there would only be as many occurrences of the (inverse) property on any lexeme as a phrase has tokens (2 occurrences in each of your examples černá ovce (L221051) and black hole (L2890)).
Your proposed property could be helpful in cases where data users (e.g., Wikipedias, Wiktionaries) want to list all derived lexemes or all phrases that use a given lexeme, but are, for technical reasons, limited to querying items (or here, lexemes) and cannot run SPARQL queries, which is true for example in the dynamic population of certain Wikipedia infoboxes (which are about items, not lexemes, as far as I can tell). I don’t know if there are any such cases when it comes to derived lexemes/phrases.
By the way, rather than ~~has part(s) (P527)~~, there seems to be consensus that combines lexemes (P5238), potentially with qualifier object form (P5548), should be used. Mahir256, I think this might be what you had in mind.―BlaueBlüte (talk) 22:34, 30 December 2019 (UTC)
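
The trade-off described above can be illustrated with a minimal sketch: if only the phrase-to-component direction is stored, the proposed "used in phrase" direction can still be derived by inverting those statements. The dictionary below is a hypothetical in-memory sample (not a Wikidata API call or a real export format), the name `used_in_phrase` is invented for illustration, and the lexeme IDs are only the ones quoted in this discussion.

```python
from collections import defaultdict

# Hypothetical sample of phrase -> component statements, e.g. as they might be
# recorded with combines lexemes (P5238). Not fetched from Wikidata.
COMBINES = {
    "L2890": ["L3273", "L3430"],  # black hole -> black, hole
    "L221051": ["L10561"],        # černá ovce -> ovce (other component's ID is not given in this discussion)
}


def used_in_phrase(combines):
    """Invert phrase -> components into component -> phrases."""
    inverse = defaultdict(list)
    for phrase, components in combines.items():
        for component in components:
            inverse[component].append(phrase)
    return dict(inverse)


if __name__ == "__main__":
    for lexeme, phrases in sorted(used_in_phrase(COMBINES).items()):
        print(lexeme, "is used in", ", ".join(phrases))
```

A consumer that can post-process the data this way does not need the inverse stored explicitly; a consumer limited to reading a single lexeme page, as in the infobox case mentioned above, cannot do this inversion.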

@BlaueBlüte: Thank you. Your original proposal is nice and smart, but as for combines lexemes (P5238), I'd rather use nothing than this property. The property itself says that the subject of this property should be a compound. I don't know about other languages, but at least in my own, compounds, derivations and collocations are completely different things. In my opinion it is better to add nothing than essentially wrong information. --Lexicolover (talk) 14:21, 31 December 2019 (UTC)

@Lexicolover: How is using combines lexemes (P5238) more ‘wrong’ than using has part(s) (P527)? Where do you see the difference between compounds and collocations when it comes to identifying their components? Also, see the usage example for combines lexemes (P5238) involving avoir besoin de (L9441), which might be an instance of what you describe as a collocation.
If you’re concerned about the property labels and descriptions focusing on compounds too much, maybe those need some clarification; see also the discussion about using combines lexemes (P5238) instead of creating a new property “combines lexemes”.―BlaueBlüte (talk) 19:06, 31 December 2019 (UTC)

It is simple: "compound" is a linguistic term with a specific meaning, while "has part" is something generic. As for that example, I don't know a thing about other languages (as I stated above) or about the authority of whoever added those examples, but if the scope of the property is compounds, then using it for something that is not a compound is just wrong, and it causes our data to be wrong. I don't have the knowledge to create the "big picture on how we want morphology and etymology in Wikidata lexemes", but I know that I don't want to add statements that I believe are not correct. --Lexicolover (talk) 20:27, 1 January 2020 (UTC)

Oppose in favor of a solution similar to that suggested by BlaueBlüte (which I thought was already the case for compound words/phrases). Mahir256 (talk) 18:19, 30 December 2019 (UTC)

Oppose per BlaueBlüte. --Tinker Bell ★ ♥ 04:37, 31 December 2019 (UTC)

Since there is clear consensus against such properties, I hereby withdraw the proposal. --Lexicolover (talk) 20:27, 1 January 2020 (UTC)