Wikidata:Property proposal/has semantic role (2nd proposal)

‎has semantic role (2nd proposal)


Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: item that describes a role in an event/action class
Data type: Item
Domain: item, occurrence (Q1190554)
Example 1: military offensive (Q2001676) → "has semantic role" → attacker (Q31924059); qualifier: object of statement has role (P3831) → agent (Q392648)
Example 2: military offensive (Q2001676) → "has semantic role" → defender (Q111729140); qualifier: object of statement has role (P3831) → theme (Q118826633)
Example 3: throwing (Q12898216) → "has semantic role" → actor (Q23894381); qualifier: object of statement has role (P3831) → agent (Q392648)
Example 4: throwing (Q12898216) → "has semantic role" → projectile (Q49393); qualifier: object of statement has role (P3831) → theme (Q118826633)
Example 5: throwing (Q12898216) → "has semantic role" → target (Q1047579); qualifier: object of statement has role (P3831) → destination (Q111335358)
Planned use: add to (possibly newly created) items describing occurrences/actions
Expected completeness: always incomplete (Q21873886)


This proposal is a substantial revision of Wikidata:Property proposal/has semantic role.

Motivation


Consider concepts that describe classes of events, actions and processes, roughly the subclasses of "occurrence (Q1190554)". For lack of a better inclusive term, we call them "event/action" classes. (They are sometimes called "eventualities" in linguistic literature.) All event/action classes have core semantic roles, as illustrated by widely used resources such as "FrameNet (Q1322093)", "VerbNet (Q7920918)" and "PropBank (Q7250039)". For example, "eating" has an "eater" and something "eaten"; "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and something "eaten", independently of how it is expressed and in what language. While Wikidata has over 300 existing properties for roles in event/action instances (e.g., "participant (P710)", "victim(s) (P8032)"), very few are used with event/action classes. The two most common are "practiced by (P3095)" and "uses (P2283)". The vast majority of event/action classes have no statements describing semantic roles. For example, until very recently, "military offensive (Q2001676)" didn't have any semantic roles at all. Clearly, every military offensive has an attacker and a defender. We added these roles using two statements:

military offensive (Q2001676) → has characteristic (P1552) → attacker (Q31924059); qualifier: object of statement has role (P3831) → agent (Q392648)

military offensive (Q2001676) → has characteristic (P1552) → defender (Q111729140); qualifier: object of statement has role (P3831) → theme (Q118826633)

Here, "agent (Q392648)" and "theme (Q118826633)" are instances of "thematic relation (Q613930)". The property "has characteristic (P1552)" is extremely generic and has many uses. Our proposed “has semantic role” property would be a specific sub-property of "has characteristic (P1552)" for designating semantic roles.
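
As a rough illustration only (it is not part of the proposal), the two statements above can be modeled as a main property/value pair that carries an "object of statement has role (P3831)" qualifier. The `Statement` class and the `thematic_roles` helper are hypothetical names introduced for this sketch, which works on in-memory data and does not call any Wikidata client library.

```python
from dataclasses import dataclass

P_HAS_CHARACTERISTIC = "P1552"   # has characteristic
P_OBJECT_HAS_ROLE = "P3831"      # object of statement has role


@dataclass(frozen=True)
class Statement:
    """Hypothetical in-memory model of one Wikidata statement."""
    subject: str              # QID of the event/action class
    prop: str                 # PID of the main property
    value: str                # QID of the role filler
    qualifiers: tuple = ()    # (PID, QID) pairs attached to the statement


# The two statements quoted above for military offensive (Q2001676).
statements = [
    Statement("Q2001676", P_HAS_CHARACTERISTIC, "Q31924059",
              ((P_OBJECT_HAS_ROLE, "Q392648"),)),      # attacker, agent
    Statement("Q2001676", P_HAS_CHARACTERISTIC, "Q111729140",
              ((P_OBJECT_HAS_ROLE, "Q118826633"),)),   # defender, theme
]


def thematic_roles(stmts, subject):
    """Map each role filler of `subject` to the thematic relation in its P3831 qualifier."""
    roles = {}
    for s in stmts:
        if s.subject != subject:
            continue
        for pid, qid in s.qualifiers:
            if pid == P_OBJECT_HAS_ROLE:
                roles[s.value] = qid
    return roles
```

Under the proposal, only the main property would differ: "has semantic role" would take the place of P1552, while the qualifier would stay the same.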

Some of the existing event/action classes already have statements indicating semantic roles. For example, the creator in "creation (Q11398090)" is indicated by the "practiced by (P3095)" property. We would not change this, but, since this property has many uses, we added a qualifier:

creation (Q11398090) → practiced by (P3095) → creator (Q2500638); qualifier: object of statement has role (P3831) → agent (Q392648)

The item "creation (Q11398090)" did not have a statement for the "object of creation" role. So, we added:

creation (Q11398090) → has characteristic (P1552) → artificial object (Q16686448); qualifier: object of statement has role (P3831) → theme (Q118826633)

If we had the proposed "has semantic role" property, we would have used it instead of the generic "has characteristic (P1552)" property.

This proposal is a part of a wider project: "Wikidata:WikiProject_Events_and_Role_Frames". We encourage the interested parties to visit and join the project discussion. Anatole Gershman (talk) 21:53, 23 July 2024 (UTC)[reply]

Discussion

UMR semantic roles | Wikidata property
actor, causer | "agent of action" (currently proposed)
force, stimulus, cause, reason | has cause (P828)
undergoer, patient, theme, affectee | object of occurrence (P12912) or object class of occurrence (P12913)
recipient, goal | destination of transfer (P12694) or destination point (P1444) (note, not has goal (P3712), which describes a desired state, whereas the UMR role seems to describe a location)
experiencer | no specific existing or proposed property, but probably largely covered by object of occurrence (P12912)/object class of occurrence (P12913)
instrument | uses (P2283)
start | start point (P1427) or source of transfer (P12693)
companion | together with (P1706) (note that a companion is relative to an agent rather than an action directly, which is why this is a qualifier)
material/source | source of material (P2647), made from material (P186), or source of transfer (P12693) as appropriate (this is really more than one semantic role)
place | location (P276) and its sub-properties
temporal | any Wikidata property with datatype 'time' (Q18636219)
extent | various numerical-valued properties
manner | has characteristic (P1552)
purpose | has goal (P3712)
attribute | has characteristic (P1552) and various others; this "role" is pretty nonspecific
result | has effect (P1542)
direction | direction (P560), towards (P5051), terminus (P559), depending on subject class
  • Some of these properties also accept subjects that are not actions/occurrences, but that doesn't impede their use for filling in semantic slots for actions/occurrences. So you can see that the semantic roles given by UMR are already well covered, especially if "agent of action" is created, and in fact in many cases the existing properties are more fine-grained than UMR. Swpb (talk) 16:06, 31 July 2024 (UTC)[reply]
    @Swpb Wow! I had no idea these roles could be so well represented by existing properties. I can quibble a bit. Experiencer and Stimulus are actually quite distinct and deserve more attention, and I'm still dithering about Manner and Attribute being "has characteristic" but on the whole this is pretty comprehensive. I would still want them all collected together under "has semantic role" which provides a way to cluster these properties together as serving this purpose. In addition, as broad as this list is, and as applicable as it is, there are always verbs in every language that have participants that do not fit any of these categories. "has semantic role" also provides a general backoff category for the participants that just don't fit anything else. Why can't we have both, "has semantic role" and "has agent of action"? MarthaStonePalmer (talk) 22:05, 31 July 2024 (UTC)[reply]
    To me, the fact that this mapping surprises anyone suggests that the "Events and Role Frames" WikiProject has been working in too much isolation from the rest of the project, and should step back and reconsider the redundancy of its approach and its integration. The point is not that Wikidata already has a perfect property for every possible semantic role, but that the mapping of semantic roles to Wikidata properties is already extremely close, just through natural development that didn't originally have semantics in mind, and can continue to be brought closer. The table only lists the roles given by UMR, but I'm confident that for just about any other role you can think of, there is an appropriate property, and if not, one can be proposed. The usual way of grouping properties by function would be to create a Q-item like "Wikidata property that may be used to represent a semantic role", and make the applicable properties instances of it. The problem with having "has semantic role" and properties representing specific roles is that of overlap – on an item like military offensive (Q2001676), would you have both "agent of action"/object class of occurrence (P12913) statements and "has semantic role" statements? I would think not, because the former accomplishes the task better. But then what items would you use "has semantic role" on? I think my table shows that in almost all cases, there is a better property to use. (To wit, the statement at the end of the Motivation section would be better expressed as creation (Q11398090) → has effect (P1542) → artificial object (Q16686448).) You could argue that "has semantic role" would just be a pseudo-parent property, not meant for use except as a place-holder when a better role-specific property doesn't yet exist, but it would not work that way in practice: generic properties (like of (P642) and the former "as") get used and abused. I see its creation as having a huge downside for very little upside, and I think the proposal should be withdrawn while the WikiProject contemplates the issue. Swpb (talk) 14:15, 1 August 2024 (UTC)[reply]
    @Swpb Of course the correct approach would be to always use the more specific property if available. However, you still haven't addressed my point about recourse to a general "has semantic role" property when there are no appropriate more specific "role" properties available. On the other point, how is "thing thrown" a selection preference? The "thing" could be anything concrete, including a building if Superman is around, or anything abstract, as in an "election" being thrown. I take your point about defining a Q item to collect all of these "role" properties together, so I'll withdraw the suggestion of using "has semantic role" for that purpose. It could just be one of such properties used primarily for backoff purposes, and for things like Experiencer and Stimulus until we come up with better property definitions. I really don't see what we are suggesting as competing with what you are doing, but rather complementing it. MarthaStonePalmer (talk) 17:26, 1 August 2024 (UTC)[reply]
    I thought I did address the idea of recourse to a general property when no appropriate property is available: first, there almost always is an appropriate property available, and second, from experience, when a "general" property exists, many editors will use it when they should be using a more specific property, instead of figuring out what that specific property is. That creates ongoing cleanup work for other editors. If "has semantic role" is created, I'd want to see the property description make VERY clear that it is not to be used where a more appropriate property exists, and direct editors to a table of such properties – but I think even with that, there will be a lot of lazy misapplication. If there is a gap in role-specific properties, like "experiencer", we should propose that property and then create statements with it, instead of creating temporary statements to be migrated later. As to the distinction between semantic roles and selection preferences when metaphor gets involved, I'll leave that to you linguists – it doesn't seem to bear directly on my main concern with the proposal. Swpb (talk) 18:10, 1 August 2024 (UTC)[reply]
    @Swpb I totally agree with wanting to discourage use of a general property when more specific properties are available. We went through your table above in our meeting this afternoon, and we're perfectly happy with almost all of your mappings. We also much prefer the use of properties for semantic roles - they embody the implied semantic relationships more naturally. That's why we've persisted for the last 8 months in our endeavor to get at least one such property approved. We were just so daunted by the thought of trying the same thing with more than one property that we didn't even consider it. But if you've already done most of the work, more power to you, and we will happily tag along. With respect to the table, after due deliberation on whether together with (P1706) would really work since it is primarily a qualifier, and looking at several examples, we decided it's fine. The same for has characteristic (P1552) for Manner and Attribute. After initial reservations, we all came around. However, Extent and Direction need some refinement. Extent needs to be broader since it isn't always numerical. Any type of change in degree can be included in Extent, and could be described as imprecisely as "an extreme increase in foreclosures." Direction is also primarily a trajectory which needs to be carefully distinguished from an end point (up above in Goal), but that just means removing terminus (P559). That still leaves Experiencer and Stimulus, which can be addressed on another day. There are still two arguments in favor of "has semantic role". 1) As a place to include either your table above or at least a link to it, as well as a link to an appropriately revised version of our "Wikidata:WikiProject_Events_and_Role_Frames" and clear directions about using specific roles if possible;
    2) As a catchall when the traditional labels aren't good fits. In English there are predicating elements like "contain, exceed, yield," and "possess" whose arguments don't easily fit traditional roles. A "storage tank" that contains toxic chemicals isn't really an Agent or an Undergoer. Similarly for the "performance" in "her performance exceeds expectations". FrameNet labels the combatants in "The combatants yielded to the invaders" as Capitulators, again, not exactly Agents. Same for Fiona in "Fiona possessed a quirky sense of humor and flaming red hair that were hard to forget." This is one of the reasons FrameNet ended up with over 2000 distinct Frame Elements. Maybe those kinds of events will rarely, if ever, get modeled in Wikidata, but, just in case, we could handle them without going to those lengths. MarthaStonePalmer (talk) 23:46, 1 August 2024 (UTC)[reply]
    Hi Martha, I'm delighted your team is recognizing the power of existing properties to express semantic relationships, and I'm happy to work with you to clean up the mappings and identify gaps – I made the table in about 10 minutes, so I'm not surprised it isn't perfect. I assume the best place for those discussions will be on the WikiProject; you can ping me as needed. That said, I can't accept your two remaining arguments for the "has semantic role" property:
    1) Properties are meant to be used; if the primary goal is to make it easier for editors to find a different property they should really be using, there are much more appropriate ways to do that: properties are categorized by subject, and there are navigation boxes for properties in different subject areas. There are a number of tools for searching for properties, such as the Prop explorer. There are about 650 subclasses that are used to organize and link related properties together. And there are numerous properties for linking two properties directly, including related property (P1659), subproperty of (P1647), complementary property (P8882), and inverse property (P1696); and properties for this type (P1963) to indicate properties that are appropriate for a given class. And editors can create and share their own means of organizing properties; I have one of my own. Creating a new property just to direct editors to a list of properties is not appropriate.
    2) For the examples you give of statements that might require a catchall, there are, again, already properties that are appropriate, or could be appropriate with very little tweaking: without knowing the exact senses you have in mind, we have contains (P4330), greater than (Q47035128) ("exceeds"), product or material produced or service provided (P1056) and by-product (P2821) (various senses of "yield"), owner of (P1830) and has characteristic (P1552) for different senses of "possess". By simply adding properties for specific relations as the need arises, Wikidata has already developed quite a deep bench, and when a new need is demonstrated, it's not usually hard to get it met, either through a new property or rescoping an existing one. This model does not need, and in fact in some ways suffers from, the presence of catchall properties.
    For those reasons, it's still my position that this proposal should be withdrawn. If everything I'm saying is wrong and there's a real need for this property, it won't be hard to get it approved later, but if it is found to be problematic after being created and used, it will be a bit of work to get the cat back in the bag. Swpb (talk) 15:13, 2 August 2024 (UTC)[reply]
    @Swpb Your position appears to be that there are already existing properties (namely the properties for thematic roles) that cover some of the cases where "has semantic role" is to be used for and that the missing ones can easily be added. But what about cases where there is no existing (or even not existing but generally accepted) thematic role? Martha has mentioned several of these but there are many others. I don't expect that there are properties in Wikidata for all the possible situations that could arise and it doesn't seem possible to handle these situations by creating a new property on a case-by-case basis. The proposal here is that "has semantic role" can be used as a general property and it will cover all the situations without having to create a lot of new properties. Is there another way that this could be handled? If so, what is it? Without a fallback I don't see how this important kind of information can be captured in Wikidata. Peter F. Patel-Schneider (talk) 14:14, 13 August 2024 (UTC)[reply]
    My position is that there are existing properties not just for some, but for virtually all cases, with greater or lesser specificity. I think I've provided appropriate properties for every case given so far, including Martha's latest; if there are, as you say, many other cases that do not fall into any of the roles already mapped to existing properties, I have yet to see them. As I've explained, the problem with "general" or "fallback" properties is that they always end up being used in place of more appropriate, specific properties. This is not idle speculation; this is exactly what happened with the now-deprecated "as" property, and the current property of (P642) which is taking an enormous effort to deprecate. There is certainly no gap in Wikidata's ability to express semantic roles that is large enough to justify creating another such headache. The biggest gap is that of agent, and there is an active proposal to close it, which AWesterinen of the Event Roles and Frames WikiProject is opposing. Swpb (talk) 15:24, 13 August 2024 (UTC)[reply]
  • @Peter F. Patel-Schneider, MarthaStonePalmer, HajicJanSr, SkatjeMyers, Kitchengoose, AWesterinen: Pinging to make you aware of my rationale for opposing, in case it affects your thoughts on the proposal. Swpb (talk) 17:50, 30 July 2024 (UTC)[reply]
    @Swpb Thanks for your thoughtful comments. We could indeed use object class of occurrence (P12913) for most selectional preferences. Events don't always come across as actions, and so it might be a bit counter-intuitive at times. We can also keep object of occurrence (P12912) where it is currently being used, just as we can keep "practiced by" for the "eater" of "eating." We can add it to our table of semantic roles in our Project description, "Wikidata:WikiProject_Events_and_Role_Frames". But if you look at that table you'll see that we have a lot of additional roles, most of which do not have properties already defined. One of our main goals is to come up with a consistent, predictable way of defining event/action participants and an easily understood process for doing so. I don't see why we couldn't say that object of occurrence (P12912) is a subproperty of our proposed "has semantic role", unifying what are currently quite diverse ways of specifying participants. It is hard to do that for "practiced by" since it has a lot of alternative uses, but maybe that isn't true of object of occurrence (P12912)? MarthaStonePalmer (talk) 23:08, 30 July 2024 (UTC)[reply]
    Since object of occurrence (P12912) and object class of occurrence (P12913) extend to events that are not actions per se, their labels or descriptions could be adjusted to reflect this, but I have found that in most cases where there is an undergoer, the event is an action. You say that most of the semantic roles in Anatole's table don't have properties, but I don't believe that's true – the existing properties just haven't been explicitly mapped to semantic roles before, but I have done so in the table in my reply to Peter above. Logically, object of occurrence (P12912) and the other role-specific properties could be sub-properties of the one proposed here (which is misnamed because it really indicates a selectional preference/requirement rather than a semantic role), but I don't see when you'd ever want to use the latter property when the more specific former ones cover all the roles we have identified. Swpb (talk) 16:18, 31 July 2024 (UTC)[reply]
    @Swpb I commented on your table up above. I'm impressed with the coverage provided by existing properties, but I still see a hierarchy of semantic role properties as being valuable, as explained above. I don't agree when you say "has semantic role" is really for selectional preferences, not roles, although I can see how "defender (Q111729140)" and "projectile (Q49393)" might have caused that confusion. Selectional preferences are really a separate issue. Our intention is to use the participant descriptions in the PropBank Frame Files that are intended to be very action specific and very intuitive. With that in mind we should have said "entity attacked" rather than "defender (Q111729140)". Since "defender (Q111729140)" was an existing Q item that was close to "entity attacked" we used it. But a selectional preference for either "entity attacked" or "defender (Q111729140)" would be different, something like "animate"/"organization". "projectile (Q49393)" is actually "thing thrown" in the PropBank frame and that's what we should have used instead. "projectile (Q49393)" is even more confusing. Our idea is to populate the participant information semi-automatically using both the PropBank specific descriptions as well as the more general UMR roles that you have listed in the table which are also associated with the PropBank function tags. MarthaStonePalmer (talk) 22:35, 31 July 2024 (UTC)[reply]
    I've mostly responded above, but I'll add here that I really don't think the problem is one of imprecise labels. For all intents and purposes, "projectile" and "thing thrown" are the same thing. On an earlier version of this proposal, I argued against creating a whole set of semantic-derived items that more or less mirror existing items; that's just a recipe for confusion. This is the reason we have aliases. Swpb (talk) 14:22, 1 August 2024 (UTC)[reply]
    @Swpb I still strongly support this "has semantic role" proposal and indeed have issues with the [agent (class) of action proposal] (as noted on that page). I am very supportive of reusing specific properties for instance level declarations such as "agent of action", object of occurrence (P12912), uses (P2283) for the role of instruments, etc. Please see my comments on the "agent of action" proposal. AWesterinen (talk) 05:26, 8 August 2024 (UTC)[reply]
@AWesterinen: You're welcome to maintain your support, but, with respect, I don't think you've engaged with any of my criticisms of this proposal, either here or on the agent of action proposal page. I've responded to your comments on that proposal there, but I don't see anything to respond to with respect to this proposal. Swpb (talk) 14:25, 9 August 2024 (UTC)[reply]
  •   Support I think this makes sense. However it looks like examples 4 and 5 are mixed up? ArthurPSmith (talk) 17:41, 30 July 2024 (UTC)[reply]
    @User:ArthurPSmith, I think you're right. I switched them. Thanks for catching that! MarthaStonePalmer (talk) 20:56, 30 July 2024 (UTC)[reply]
    I would think projectile (Q49393) is the instrument (Q6535309) of throwing (Q12898216) rather than a theme or destination. Swpb (talk) 17:45, 30 July 2024 (UTC)[reply]
    @MarthaStonePalmer, so an (optional) instrument in "throwing" could be gloves? Or something like a slingshot (but in a throwing context)? Or simply the hand? All of these seem correct to me in one way or another Egezort (talk) 19:42, 6 August 2024 (UTC)[reply]
    @Swpb. Instruments as thematic relations are typically intermediaries: the key in "I unlocked the door with a skeleton key", the screwdriver in "I repaired the fan with a screwdriver", etc. In the prototypical case the Agent has contact with the Instrument and the Instrument has contact with the Patient or Theme. In the throwing event, the frisbee isn't being used by the agent to accomplish a particular purpose; it is the thing in motion as a direct result of the Agent's action, so it wouldn't typically be labeled as an Instrument. MarthaStonePalmer (talk) 20:51, 30 July 2024 (UTC)[reply]
    @Egezort ‘Threw the ball with my hand’ or ‘kick the wall with my foot’ sound a bit odd because the hand and the foot are already implicit, but you can certainly ‘hit someone with a fist/stick/pillow/etc’ where they are alternative instruments. Or ‘catch the ball with a baseball glove/catcher’s mitt/a racket/etc.’. MarthaStonePalmer (talk) 15:39, 13 August 2024 (UTC)[reply]
  •   Comment If the creator(s) of this proposal are not inclined to withdraw it, I believe it is time to close it as failed. The central problem I pointed out almost a month ago – the mess invariably created by overlapping/generic/fallback properties – has not been refuted, and an overriding need for such a property (i.e., semantic roles that do not map to any existing property) has not been demonstrated. Six of the eight supporters of the proposal (including the proposer) are participants of the WikiProject that developed it (rather than arriving here independently); all eight announced their support before I fully explained the problem, and the three who have replied since have (I think they would agree) not refuted that problem. MarthaStonePalmer, apparently speaking for the WikiProject, wrote that "we're perfectly happy with almost all of your mappings. We also much prefer the use of properties for semantic roles". Another project member, Peter F. Patel-Schneider, has moved on to developing a model for expressing semantic information that lines up with the longstanding (if previously implicit) practice of using properties to represent specific semantic roles. No one has spoken in support of this proposal in the past 14 days. I believe there is ample reason this property should not be created, little energy remaining behind the push to create it, and the proposal remaining open may be distracting from efforts on more appropriate modeling of event semantics. Swpb (talk) 14:08, 27 August 2024 (UTC)[reply]
  •   Not done, no consensus of proposed property at this time based on the above discussion. Regards, ZI Jony (Talk) 19:52, 5 September 2024 (UTC)[reply]
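
For readers who want to experiment with the role-to-property table from the discussion above, it can be sketched as a simple lookup. This is an illustrative subset only: the property IDs are copied from that table, while the dictionary name `UMR_ROLE_TO_PROPERTIES` and the function `candidate_properties` are hypothetical names made up for this sketch. An empty list marks a role for which the table names no existing property ("agent of action" was still only a proposal).

```python
# Subset of the UMR role-to-property table from the discussion.
# An empty list means the table lists no existing property for the role.
UMR_ROLE_TO_PROPERTIES = {
    "actor": [],                         # "agent of action" was only proposed
    "undergoer": ["P12912", "P12913"],   # object (class) of occurrence
    "instrument": ["P2283"],             # uses
    "start": ["P1427", "P12693"],        # start point, source of transfer
    "companion": ["P1706"],              # together with (used as a qualifier)
    "place": ["P276"],                   # location
    "manner": ["P1552"],                 # has characteristic
    "purpose": ["P3712"],                # has goal
    "result": ["P1542"],                 # has effect
}


def candidate_properties(role: str) -> list[str]:
    """Return the property IDs listed for a UMR role, or [] if none is listed.

    Roles missing from the dictionary also yield [], so an empty result only
    means "nothing found in this subset", not that no property could exist.
    """
    return list(UMR_ROLE_TO_PROPERTIES.get(role.strip().lower(), []))
```

For example, `candidate_properties("Instrument")` yields `["P2283"]`, whereas `candidate_properties("actor")` yields an empty list, which corresponds to the gap that the separate "agent of action" proposal was meant to close.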