Wikidata:Property proposal/used in phrase
used in phrase
Originally proposed at Wikidata:Property proposal/Lexemes
Description | phrase in which the lexeme is used |
---|---|
Data type | Lexeme |
Domain | lexeme |
Example 1 | ovce (L10561) → černá ovce (L221051) |
Example 2 | hole (L3430) → black hole (L2890) |
Example 3 | MISSING |
Motivation
Some explanatory dictionaries list phrases for their entries, and I haven't found a fitting property for this. This property would make it possible to indicate existing phrases in which the lexeme is used, where it might have a different sense. Querying for such a thing would return incorrect matches (homonyms). Lexicolover (talk) 19:49, 29 December 2019 (UTC)
Discussion
Comment How about modeling the inverse relation, for example
black hole (L2890) has part(s) (P527) black (L3273)
black hole (L2890) has part(s) (P527) hole (L3430)
? Of course the qualifiers in the above example are not strictly necessary to express the semantics of the proposed property.―BlaueBlüte (talk) 20:15, 29 December 2019 (UTC)
- @BlaueBlüte: I like it. In your opinion, is it better to have just this one inverse relation or both (for when the data is reused by third-party tools)? For some reason there are many properties that are inverses of each other. --Lexicolover (talk) 19:10, 30 December 2019 (UTC)
- @Lexicolover: (Using only) this inverse-relation approach would probably avoid a lot of clutter on the pages of lexemes from which many other words are derived/which are used in many phrases; there would only be as many occurrences of the (inverse) property on any lexeme as a phrase has tokens (2 occurrences in each of your examples černá ovce (L221051) and black hole (L2890)).
Your proposed property could be helpful in cases where data users (e.g., Wikipedias, Wiktionaries) want to list all derived lexemes or all phrases that use a given lexeme, but are, for technical reasons, limited to querying items (or here, lexemes) and cannot run SPARQL queries, which is true for example in the dynamic population of certain Wikipedia infoboxes (which are about items, not lexemes, as far as I can tell). I don’t know if there are any such cases when it comes to derived lexemes/phrases.
By the way, rather than has part(s) (P527), there seems to be consensus that combines lexemes (P5238), potentially with qualifier object form (P5548), should be used. Mahir256, I think this might be what you had in mind.―BlaueBlüte (talk) 22:34, 30 December 2019 (UTC)
- @BlaueBlüte: Thank you. Your original proposal is nice and smart, but as for combines lexemes (P5238), I'd rather use nothing than this property. The property itself says that the subject item of this property should be a compound. I don't know about other languages, but at least in my own, compounds, derivations and collocations are completely different things. In my opinion it is better to add nothing than essentially wrong information. --Lexicolover (talk) 14:21, 31 December 2019 (UTC)
- @Lexicolover: How is using combines lexemes (P5238) more ‘wrong’ than using has part(s) (P527)? Where do you see the difference between compounds and collocations when it comes to identifying their components? Also, see the usage example for combines lexemes (P5238) involving avoir besoin de (L9441), which might be an instance of what you describe as a collocation.
If you’re concerned about the property labels and descriptions focusing on compounds too much, maybe those need some clarification; see also the discussion about using combines lexemes (P5238) instead of creating a new property “combines lexemes”.―BlaueBlüte (talk) 19:06, 31 December 2019 (UTC)
- It is simple: a compound is a linguistic term with a specific meaning, while "has part" is something generic. As for that example, I don't know a thing about other languages (as I stated above) or about the authority of whoever added those examples, but if the scope of the property is compounds, then using it for something that is not a compound is just wrong, and it causes our data to be wrong. I don't have the knowledge to create a "big picture on how we want morphology and etymology in Wikidata lexemes", but I know that I don't want to add statements that I believe are not correct. --Lexicolover (talk) 20:27, 1 January 2020 (UTC)
Oppose in favor of a solution similar to that suggested by BlaueBlüte (which I thought was already the case for compound words/phrases). Mahir256 (talk) 18:19, 30 December 2019 (UTC)
- Oppose per BlaueBlüte. --Tinker Bell ★ ♥ 04:37, 31 December 2019 (UTC)
Since there is clear consensus against such properties I hereby withdraw the proposal. --Lexicolover (talk) 20:27, 1 January 2020 (UTC)
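For readers wondering how the inverse-relation approach from the discussion would still answer the question "which phrases use this lexeme?", here is a minimal sketch. It is only an illustration, not how Wikidata or its query service works internally: the in-memory statement list, the function name `phrases_using`, and the choice of P5238 as the default property are all assumptions made for the example (the thread itself did not settle on P5238 for collocations). The lexeme IDs are the ones quoted above.

```python
from collections import defaultdict

# Hypothetical in-memory statements: (phrase lexeme, property, component lexeme).
# Only components that are named with an ID in the discussion are listed.
STATEMENTS = [
    ("L2890", "P5238", "L3273"),     # black hole -> black
    ("L2890", "P5238", "L3430"),     # black hole -> hole
    ("L221051", "P5238", "L10561"),  # černá ovce -> ovce
]


def phrases_using(statements, prop="P5238"):
    """Invert phrase -> component statements into component -> [phrases]."""
    index = defaultdict(list)
    for phrase, p, component in statements:
        if p == prop:  # ignore statements that use other properties
            index[component].append(phrase)
    return dict(index)


index = phrases_using(STATEMENTS)
print(index.get("L3430", []))  # phrases that use "hole": ['L2890']
```

The point of the sketch is that the "used in phrase" direction can be derived from statements stored only on the phrase, so the component lexeme's page does not need a statement of its own for every phrase it appears in.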