Wikidata:WikiProject Events and Role Frames

WikiProject Events and Role Frames

The primary aims of WikiProject Events and Role Frames are

  • to define a set of properties that consistently model eventualities (states, processes and events) and their participants (Pustejovsky, 2021);
  • to fill gaps in Wikidata regarding items for states/processes/events/actions; and
  • to encourage use of the proposed model and newly introduced items across Wikidata.

Pustejovsky, James (2021). The Role of Event-Based Representations and Reasoning in Language. In Caselli T, Hovy E, Palmer M, Vossen P, eds., Computational Analysis of Storylines: Making Sense of Events. Cambridge University Press.

Join us!

Motivation

One of the known weaknesses of Wikidata is the spotty coverage of eventualities (processes, states, events) and their prototypical participant structures. Let’s look at the spotty coverage issue first.

Spotty State/Process/Event/Action Coverage

One of the most common verbs in most languages is “to bring”, e.g., “I brought flowers to my mother”, “J'ai apporté des fleurs à ma mère”, “Я принёс цветы маме”, “Ich habe meiner Mutter Blumen mitgebracht”. Until we added "bringing (Q124457329)" in February 2024, there was no such concept in Wikidata. We examined over 11,500 rolesets contained in PropBank (Q7250039) that describe English predicating expressions (mostly verbs) and identified over 7500 potentially missing Wikidata items [(arw) Currently, there are over 7500 English verbs defined as lexemes. It would be interesting to understand the overlap and/or what is missing.]. Each of these “gaps” needs further examination to determine if it warrants a new item, but the list gives us a starting point. We want to emphasize that although we start with English, the gaps are semantic, not lexical and we should use multiple languages to identify semantic gaps. [(arw) There are many Russian lexemes currently defined, many/most by the Lexicator tool].

State/Process/Event/Action Role Structures

All states/processes/events/actions have core semantic roles - "eating" has the "eater" and the "eaten", "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and the "eaten" independently of how and in which language it is expressed. Most of the existing items for such classes do not mention these roles. For example, "throwing (Q12898216)", defined as “launching of a ballistic projectile by hand” does not have any statements that indicate the existence of the thrower, the target, or the projectile, let alone the specifications of the kinds of entities these attributes are likely to be.

Some Wikidata items for event/action concepts include statements for some of the semantic roles. For example, "eating (Q213449)" uses the "practiced by (P3095)" property whose object is "eater (Q20984678)". Although "practiced by (P3095)" is defined as “type of agents that study this subject or work in this field”, it is often used to indicate an agent of an action. [(arw) This misuse should either be corrected in the property definition or by the creation of a new property.] But, as its description suggests, it has other uses. In "eating (Q213449)", property "uses (P2283)" points to "food (Q2095)" to indicate the "eaten". This property also has many uses. Since no Wikidata property is used exclusively to indicate a semantic role, the existing properties [pfps a property is something like uses (P2283), I suspect you mean that the information needs to be added to a property value in the event concept] will need a qualifier such as "object has role (P3831)" to indicate that the object [pfps I would use value instead here] is a semantic role.

Caveats

This project does not address the problem of ontological consistency of Wikidata items. But, as we examine Wikidata events, we might also fill in the gaps in some of the “subclass of” properties. For example, departure (Q21171241) is not currently a subclass of going (Q19279529). The item execution (Q3966286) defined as “homicide as capital punishment” does not seem to be connected to capital punishment (Q8454).

[(arw) There are other semantic issues beyond subclass of (P279) - such as skos:altLabel ... for example, bringing (Q124457329) is a subclass of moving (Q115095261) which has an altLabel of "renaming".]

Proposal

The proposal outlines a step-by-step procedure for expanding Wikidata state/process/event/action coverage. It has four steps:

1. Adding missing Wikidata state/process/event/action classes [(arw) Add concepts which are Q items, lexemes which are L items, and lexeme senses which are S items. Tying a sense to a Wikidata concept/Q item is accomplished using item for this sense (P5137) or predicate for (P9970).]

2. Adding missing state/process/event/action roles [(arw) Do this by defining a new property for a lexeme's sense such as "has semantic argument" which then references a Q item (for ex, creator (Q2500638) for the role, "creator" for the verb sense, "create.01"). This statement has two properties: (1) "has semantic role" that references a Wikidata item equivalent to its semantic role in the table below (e.g., Actor => actor (Q23894381)), and (2) "defined using property" that references a P item with the explicit semantics for that sense (such as creator (P170)).]

3. Specifying selectional preferences for state/process/event/action roles

4. Adding role specifications to the state/process/event/action instances


Step 1. Adding missing Wikidata state/process/event/action classes

We propose to go systematically over the PropBank RoleSets. For each RoleSet, we look for an existing Q item class that represents the same concept. For example, when we examine PropBank's "see.01" defined as "to perceive an object with one's eyes", we find two relevant Q items: "visual perception (Q162668)" and "seeing (Q25374341)". The former is described as "ability to interpret the surrounding environment using light in the visible spectrum" and the latter as "the event of perceiving something using eyesight" which looks like a better candidate.

[(arw) But when reviewing PropBank, one often sees many possible senses of a word. This highlights the value of defining lexemes and their various senses. In addition, the sense can specifically reference a PropBank term ("see.01") by creating a new identifier, "PropBank ID".]

If no such item is found, we create one. For example, we could not find a Q item equivalent to PropBank's "bring.01" defined as "carry along with, move literally or metaphorically". We created a new Q item "bringing (Q124457329)" described as "transporting something toward somebody/somewhere". We also added labels in Russian and French and made it a "subclass of" "moving (Q115095261)".

[(arw) Translations of lexeme senses are supported via the property, translation (P5972).]

Step 2. Adding missing event/action roles

When we find or create an appropriate state/process/event/action Q item, we go over the roles of the RoleSet. In the "eat.01" example, there are two roles: the "consumer, eater" and the "meal". For each role, we look for a Q item statement that describes the role.

eating (Q213449) → practiced by (P3095) → eater (Q20984678)

describes the "consumer, eater" role. RoleSet "eat.01" indicates that this role is a "PAG" - a Prototypical Agent. Since the "practiced by (P3095)" property may have uses other than designating a semantic frame role, we add a qualifier:

eating (Q213449) → practiced by (P3095) → eater (Q20984678) [qualifier: object has role (P3831) → agent (Q392648)]

[(arw) An alternative rendering with a Lexeme would be to create Lxxxx-eat with Sense Lxxxx-eat-S1 (defined using the property, ontolex:sense). Then, Lxxxx-eat-S1 could use the "has semantic argument" property to reference eater (Q20984678). That statement would be further qualified with "has semantic role" referencing actor (Q23894381) (which may be a better match than agent (Q392648)) and "defined using property" referencing practiced by (P3095) (assuming that the definition was updated or referencing a new property).]

We provide a mapping between PropBank's prototypical roles such as "PAG" and "PPT" and the corresponding Q items. [(arw) The table below seems perfect for this use.]

PropBank's "meal" role of "eat.01" is a PPT (Prototypical Patient). The statement that best describes it is:

eating (Q213449) → uses (P2283) → food (Q2095)

We added a qualifier to this statement:

eating (Q213449) → uses (P2283) → food (Q2095) [qualifier: object has role (P3831) → theme (Q118826633)]
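As a rough illustration, the two qualified "eating" statements above could be prepared for batch entry. The sketch below assumes the tab-separated QuickStatements V1 layout (item, property, value, followed by qualifier property/value pairs); the helper name `quickstatement` is hypothetical and not part of any tool.

```python
OBJECT_HAS_ROLE = "P3831"  # object has role


def quickstatement(subject, prop, value, qualifiers=()):
    """Return one tab-separated line: subject, property, value, then qualifier pairs."""
    fields = [subject, prop, value]
    for qualifier_property, qualifier_value in qualifiers:
        fields.extend([qualifier_property, qualifier_value])
    return "\t".join(fields)


lines = [
    # eating -> practiced by -> eater, qualified as agent
    quickstatement("Q213449", "P3095", "Q20984678", [(OBJECT_HAS_ROLE, "Q392648")]),
    # eating -> uses -> food, qualified as theme
    quickstatement("Q213449", "P2283", "Q2095", [(OBJECT_HAS_ROLE, "Q118826633")]),
]

print("\n".join(lines))
```

Keeping the role in a qualifier leaves the main statement untouched, so other uses of "practiced by (P3095)" and "uses (P2283)" are not affected.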

[(arw) The example here would be to add detail to Lxxxx-eat-S1. The "has semantic argument" property would reference food (Q2095). The "has semantic argument" property would reference theme (Q118826633). The "defined using property" property would reference uses (P2283) or a new property.]

If we cannot find a suitable qualifier object [pfps value for object has role (P3831)?], we can use "semantic role (Q117747915)" as a "back off". [(arw) Should we explicitly declare the various Q items in the table as subclasses/instances of semantic role (Q117747915)?]
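The lookup with a "back off" could be sketched as follows. This is a simplification: it assumes one Q item per PropBank function tag, using the choices made in the worked examples of this section (PAG as agent, PPT as theme) plus two entries from the mapping table in the Semantic Roles section, whereas the full table maps some tags to several items. The names `FUNCTION_TAG_TO_ROLE_ITEM` and `role_item_for` are hypothetical.

```python
SEMANTIC_ROLE_BACKOFF = "Q117747915"  # semantic role, used when no better item is known

# Assumed one-to-one simplification of the tag-to-item mapping.
FUNCTION_TAG_TO_ROLE_ITEM = {
    "PAG": "Q392648",     # agent
    "PPT": "Q118826633",  # theme
    "GOL": "Q109405570",  # goal
    "LOC": "Q109377685",  # location
}


def role_item_for(function_tag):
    """Return the Q item for a function tag, or the generic back-off item."""
    return FUNCTION_TAG_TO_ROLE_ITEM.get(function_tag.upper(), SEMANTIC_ROLE_BACKOFF)


print(role_item_for("PAG"))  # mapped tag
print(role_item_for("VSP"))  # unmapped tag falls back to semantic role
```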

When an event/action class does not have a statement describing a core semantic role, we look for an existing Q item that most closely describes that role. For example, "creation (Q11398090)" (process during which something comes into being and gains its characteristics) corresponds to PropBank's "create.01". It has a statement for the "creator" role but no statements for "Arg1-PPT thing created". The item that best describes the object of creation is "artificial object (Q3619132)". To complete the frame, we add the following statement:

creation (Q11398090) → has characteristic (P1552) → artificial object (Q3619132) [qualifier: object has role (P3831) → theme (Q118826633)]

We used the very generic "has characteristic (P1552)" property because we could not find a more specific existing one. The qualifier provides the interpretation.

Another example is "military offensive (Q2001676)" which does not have statements describing the attacker and the defender. The item "attacker (Q31924059)" seems appropriate for the attacker role and "defender (Q111729140)" for the defender role:

military offensive (Q2001676) → has characteristic (P1552) → attacker (Q31924059) [qualifier: object has role (P3831) → agent (Q392648)]

military offensive (Q2001676) → has characteristic (P1552) → defender (Q111729140) [qualifier: object has role (P3831) → theme (Q118826633)]

(pfps Why are thematic roles showing up here?)

When no role Q item is found, we need to create one.


Step 3. Specifying selectional preferences for event/action roles

Each role in an event/action frame typically describes the classes of entities that would normally be expected to play that role in that frame's instances. For example, we normally expect that the "eater" in an "eating" instance would be an organism. Because these expectations can be violated, we call them selectional preferences rather than restrictions. Wikidata has no existing property for specifying selectional preferences, so we resort to a common substitution: "has characteristic (P1552)" combined with an "object has role (P3831)" qualifier:

eater (Q20984678) → has characteristic (P1552) → organism (Q7239) [qualifier: object has role (P3831) → selectional preference (Q124051768)]
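To illustrate why these are preferences and not restrictions, a consumer of the data might treat a mismatch as a warning instead of an error. The sketch below uses a toy "subclass of" hierarchy with plain labels standing in for Q items; the hierarchy, the names and the warning wording are all hypothetical.

```python
# Toy subclass-of edges (child -> parents); labels stand in for real Q items.
SUBCLASS_OF = {
    "human": ["animal"],
    "animal": ["organism"],
    "organism": [],
    "machine": ["artificial object"],
    "artificial object": [],
}

# Preferred filler class per role, as in eater -> organism.
SELECTIONAL_PREFERENCE = {"eater": "organism"}


def is_subclass_of(cls, ancestor):
    """Walk the toy hierarchy upward; a class counts as a subclass of itself."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, []))
    return False


def check_preference(role, filler_class):
    """Return None when the filler fits the preference, else a warning string."""
    preferred = SELECTIONAL_PREFERENCE.get(role)
    if preferred is None or is_subclass_of(filler_class, preferred):
        return None
    return f"{filler_class} is an unusual {role}: expected a kind of {preferred}"


print(check_preference("eater", "human"))
print(check_preference("eater", "machine"))
```

Returning a message instead of raising an exception keeps unusual but valid fillers possible.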


[(arw) Why wouldn't the "semantic argument" alone give you this detail? As you say, eater (Q20984678) is defined as "human or other live being who eats something" and uses has characteristic (P1552) to reference organism (Q7239). This is true regardless of its semantic role.]


Step 4. Adding role specifications to the event/action instances

[(arw) An approach would be to check the Q item of which the instance is a type and then check if that Q item is referenced using item for this sense (P5137) or predicate for (P9970) by a lexeme sense. For example, "create.01" would reference creation (Q11398090). Then, examine that sense's "semantic argument"s (two different ones: creator (Q2500638) for creator and work (Q386724) for the thing created) and use the properties referenced by the arguments' "defined using property" to reference the entities in that role (creator (P170) for creator and has effect (P1542) for the thing created).

Alternately, perhaps it would be easier to create a new property that tied an instance directly to its sense and then use object has role (P3831) on that statement to reference the "semantic argument" directly.]

When a new event/action instance is created, ideally, the creator should consult the class of the instance and make sure that the semantic roles of the class are instantiated. For example, suppose we want to enter the event of the creation of Mickey Mouse by Walt Disney on 18 November 1928. Let's call the ID for this event Q_mm_creation. Wikidata uses over 300 properties to indicate event/action instance roles. We can pick "creator (P170)" for the creator role, "has effect (P1542)" for the created artifact role and "point in time (P585)" for time. We add the following three statements:

Q_mm_creation → creator (P170) → Walt Disney (Q8704) [qualifier: object has role (P3831) → creator (Q2500638)]

Q_mm_creation → has effect (P1542) → Mickey Mouse (Q11934) [qualifier: object has role (P3831) → artificial object (Q3619132)]

Q_mm_creation → point in time (P585) → 18 November 1928 [qualifier: object has role (P3831) → point in time (Q186408)]

We are using the "object has role (P3831)" qualifier to specify the role played by the object. In the case of event/action classes, we used high-level semantic role items such as "agent (Q392648)" or "theme (Q118826633)" as the objects of "object has role (P3831)". In the case of event/action instances, we use the actual role items such as "creator (Q2500638)" or "attacker (Q31924059)".
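A check that an instance instantiates the roles of its class could look roughly like the sketch below. It assumes that Q_mm_creation is an instance of creation (Q11398090) and that the expected role items are the two used in the statements above; the data layout and the names `EXPECTED_ROLES` and `missing_roles` are hypothetical.

```python
OBJECT_HAS_ROLE = "P3831"  # object has role

# Role items expected on instances of a class (assumed from the example above).
EXPECTED_ROLES = {
    "Q11398090": {"Q2500638", "Q3619132"},  # creation: creator, artificial object
}


def missing_roles(class_qid, statements):
    """Return expected role items that no statement qualifier mentions.

    statements is a list of (property, value, qualifiers) tuples,
    where qualifiers maps qualifier property IDs to values.
    """
    present = {qualifiers.get(OBJECT_HAS_ROLE) for _, _, qualifiers in statements}
    return EXPECTED_ROLES.get(class_qid, set()) - present


# An incomplete draft of Q_mm_creation: the created artifact is not yet stated.
draft = [
    ("P170", "Q8704", {OBJECT_HAS_ROLE: "Q2500638"}),
    ("P585", "1928-11-18", {OBJECT_HAS_ROLE: "Q186408"}),
]
print(missing_roles("Q11398090", draft))

# Adding the has effect (P1542) statement completes the frame.
complete = draft + [("P1542", "Q11934", {OBJECT_HAS_ROLE: "Q3619132"})]
print(missing_roles("Q11398090", complete))
```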

Also note that we do not propose to attach the "default" roles such as "location", "start", "end", "point in time" to the event/action classes since all events/actions take place in defined time and place. The instances, though, should specify them (if known). See Semantic Roles below for more details.

Wikidata contains a very large number of event/action instances. For example, "Petsamo–Kirkenes Offensive (Q705222)" is one of many instances of "military offensive (Q2001676)". Currently, it has the following statements for the attacker and the defender roles:

Petsamo–Kirkenes Offensive (Q705222) → participant (P710) → Soviet Union (Q15180)

Petsamo–Kirkenes Offensive (Q705222) → participant (P710) → Nazi Germany (Q7318)

These statements do not specify who was the attacker and who was the defender. Ideally, we should add the "object has role (P3831)" qualifier to indicate the role:

Petsamo–Kirkenes Offensive (Q705222) → participant (P710) → Soviet Union (Q15180) [qualifier: object has role (P3831) → attacker (Q31924059)]

Petsamo–Kirkenes Offensive (Q705222) → participant (P710) → Nazi Germany (Q7318) [qualifier: object has role (P3831) → defender (Q111729140)]

[(arw) Note that there is a target (P533) property already defined, although an attacker/instigator property may also be needed if distinguishing participant (P710) and target (P533) is not enough.]

Obviously, we cannot inspect all existing event/action instances and "fix" them but, at least, we now have a method for doing it.
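One way to list candidates for this kind of clean-up might be a query for "participant (P710)" statements that lack an "object has role (P3831)" qualifier. The sketch below only builds the SPARQL text and does not run it; it assumes the standard Wikidata Query Service prefixes (wd:, wdt:, p:, ps:, pq:), and the function name is hypothetical. Note that wdt:P31 matches direct instances only, so instances of subclasses would need an extended pattern.

```python
def unqualified_participants_query(event_class_qid, limit=100):
    """Build SPARQL listing participant statements without a role qualifier."""
    return f"""
SELECT ?event ?participant WHERE {{
  ?event wdt:P31 wd:{event_class_qid} ;
         p:P710 ?statement .
  ?statement ps:P710 ?participant .
  FILTER NOT EXISTS {{ ?statement pq:P3831 ?role . }}
}}
LIMIT {int(limit)}
""".strip()


# Instances of military offensive (Q2001676) whose participants have no role.
query = unqualified_participants_query("Q2001676")
print(query)
```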

--Anatole Gershman (talk) 21:33, 12 June 2024 (UTC)

Semantic Roles

Semantic roles, originally termed cases, are often also referred to as predicate arguments, slots, thematic relations (VerbNet, LIRICS), frame elements (FrameNet), etc. Thematic relations traditionally only refer to the core arguments of the predicating element, and do not include more adjunct-like information found in temporal and locative modifiers. The latter can be applied quite generally and are considered more peripheral. Defining adjuncts precisely has remained a persistent challenge for the linguistics community, making it difficult to distinguish consistently between core and peripheral arguments. The term "semantic roles" can encompass both. Time and place are critical elements of useful descriptions of event instances.

Our stated goal is a mapping between Wikidata items and PropBank semantic roles. The original aim of PropBank was to add semantic role information to the syntactic structures in the Penn Treebank. Since there is no one-to-one mapping between syntactic constituents and semantic roles, annotators were asked to examine every clause in the Penn Treebank featuring a specific lexical item, such as "throw" as a predicating element, and assign the most suitable semantic role label to each one. A PropBank Frame File, listing the different senses of the lexical item and appropriate argument structures for each one, was referred to during this process. For example, the frame for "throw", as listed below, indicates an ARG0-PAG, a prototypical agent (Dowty, 1990), an ARG1-PPT, a prototypical patient or theme, and an ARG2-GOL (the goal or destination of the entity being thrown). There can be up to six numbered core arguments, and a dozen additional peripheral ARGM's, marked individually with function tags such as manner (MNR), locative (LOC), direction (DIR), comitative/accompanier (COM), etc. There are also several more syntactic function tags, to mark modals (MOD), negations (NEG), discourse markers (DIS), etc. The full list with their definitions can be found in the PropBank Guidelines, available at the PropBank GitHub site linked below. The example frame files referred to above are provided below in the PropBank Frame File Examples subsection. After the original 50K sentence Penn Treebank was PropBanked, funding was provided to expand the number of genres and now almost 2M tokens of English have been PropBanked, as well as several other languages including Chinese, Arabic, Korean, Hindi, Urdu, German, French, etc. English PropBank has also been mapped to VerbNet and FrameNet as part of SemLink: Mapping together PropBank/VerbNet/FrameNet, and one can browse a combined representation of those three resources at the Unified Verb Index. 
PropBank's coverage has also been extended to provide support for Abstract Meaning Representation (AMR) annotation (which uses PropBank Frame Files), unifying PropBank rolesets across different parts of speech.
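For tooling that reads frame files, role labels such as "ARG0-PAG" or "ARGM-TMP" can be split into an argument index and a function tag. The sketch below assumes only the label shapes described in this section (numbered arguments 0 to 5, or M for peripheral arguments, followed by a three-letter tag); the function name and the returned layout are hypothetical.

```python
import re

# ARG + (0-5 for core arguments, M for peripheral ones) + "-" + three-letter function tag
ROLE_LABEL = re.compile(r"^ARG(?P<index>[0-5M])-(?P<tag>[A-Z]{3})$")


def parse_role_label(label):
    """Split a label such as 'ARG0-PAG' into its core flag, index and function tag."""
    match = ROLE_LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized role label: {label!r}")
    index = match.group("index")
    return {
        "core": index != "M",
        "index": None if index == "M" else int(index),
        "tag": match.group("tag"),
    }


print(parse_role_label("ARG0-PAG"))
print(parse_role_label("ARGM-TMP"))
```

Rejecting malformed labels early makes typos in hand-copied frames easier to spot.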

Below is a table listing our recommended semantic role labels for Wikidata. They are mapped to PropBank labels and adopted from the Uniform Meaning Representation (UMR) project. They have been carefully reviewed to ensure that they accommodate cross-linguistic typological variation (Bonial et al., 2011, A Hierarchical Unification of LIRICS and VerbNet Semantic Roles (Q118174236); Van Gysel et al., 2021, Designing a Uniform Meaning Representation for Natural Language Processing (Q115519832)). For the most part, we rely on existing Wikidata thematic relation definitions to realize our PropBank semantic roles, which at this point also include Start, Temporal and Place.

UMR/PropBank Semantic Roles to Wikidata Items Mapping
UMR Semantic Role | Wikidata item | PropBank Function Tag | Description | Example
Actor | actor (Q23894381) | PAG | An animate entity who performs an action | "The chef prepared the meal."
Force | force (Q126009669) | PAG | An event or inanimate entity that acts upon an undergoer in a way that is usually spontaneous, forceful, and direct | "The wind blew the door open."
Causer | agent (Q392648) | PAG, CAU | An animate entity that acts on another actor to cause them to engage in the action | "My grandmother made me eat liver."
Undergoer | undergoer (Q111335542) | PPT | The entity that undergoes the action when it is not clearly a Patient or Theme. | "The kitten licked her fingers."
Patient | patient (Q170212) | PPT | Subclass of undergoer. The patient is an undergoer in an event that is usually structurally changed, for instance by experiencing a change of state or condition; is often acted upon by an agent; is causally involved or directly affected by other participants; and exists independently of the event. | "The chef prepared the meal."; "The roommates painted the walls."
Theme | theme (Q118826633) | PPT | Subclass of undergoer. The theme is an undergoer that is central to an event or state that does not have control over the way the event occurs, is not structurally changed by the event, and/or is characterized as being in a certain position or condition throughout the state. Often in motion. | "She packed her suitcase for the trip."
Recipient | recipient (Q20820253), addressee (Q19720921) | GOL | The entity that receives something. | "The librarian handed me a book."
Experiencer | experiencer (Q1242505) | PPT, PAG | The entity that directly experiences a sensation or emotion | "Many tourists saw the accident."; "He felt a sense of relief."
Stimulus | stimulus (Q109566760) | PAG, CAU | The entity that causes an emotional or mental state | "The loud noise startled the cat."
Instrument | instrument (Q6535309) | MNR | An inanimate entity used to perform an action or event | "The rock broke the window."; "She cut the paper with scissors."
Start | origin (Q3885844) | DIR | The entity from which an action originates or the starting point of an action or event | "I flew from Heathrow."; "The bidding opened at $5."
Goal | goal (Q109405570) | GOL | Where an action is directed. In motion verbs, the final destination | "He ran to the store."
Companion | companion (Q106645134) | COM | An animate entity that accompanies another entity or entities and is presented as an oblique argument; who an action was done with | "I went to the movies with friends."
Material/Source | material (Q214609), source (Q31464082) | DIR | The location, entity, or material from which an action or event originates | "Water flowed from the faucet."; "I milked the cow."; "The shirt is made of cotton."
Place | location (Q109377685) | LOC | The place where an event or action occurs | "The party will be at the park."
Affectee | affectee (Q125995757) | PPT | Entity positively or negatively affected by the circumstances of an event or action without being the primary undergoer | "The movie made her cry."
Cause | cause (Q2574811) | CAU | Why an event or inanimate entity brings about an action or event | "The pool was closed because of lightning."
Temporal | duration (Q2199864), time (Q12322185), Frequency (Q125995799) | TMP | When an action took place. This includes all temporal referents, such as dates, duration, frequency, order, repetition, etc. | "He went to the store yesterday."; "I've been reading email for three hours."; "They cleaned the kitchen first."; "She lost her keys again."
Extent | extent (Q125953445) | EXT | The degree or amount to which something happens | "He ran five miles."; "The price increased by 5%."
Manner | means (Q12774177) | MNR | The way in which something is performed | "He worked quickly and mechanically."
Reason | cause (Q2574811) | PRP | The reason, explanation or justification for an event or action | "I went to the store because we were out of milk."; "He left early because he had another meeting."
Purpose | cause (Q2574811) | PRP | The purpose or intended objective of an event or action | "I went to the store to buy milk."; "He left early to get to another meeting."
Attribute | attribute (Q109674924) | PRD | The quality or characteristic ascribed to an entity | "The house is big."
Result | result (Q2995644) | PRD | The entity described by a secondary predicate | "She kicked the door shut."; "You scared me to death."; "He painted the door red."
Direction | direction (Q2151613) | DIR | Motion along a specified (literal or figurative) path | "I walked down the street."; "I turned left."

--Anatole Gershman (talk) 22:04, 21 June 2024 (UTC)

Example PropBank Frames

Here are the complete PropBank frames referenced above.


bring.01 - carry along with, move literally or metaphorically bring (v.)

Roles:

  • ARG0-PAG: bringer
  • ARG1-PPT: thing brought
  • ARG2-GOL: benefactive or destination (brought-for, brought-to)
  • ARG3-PRD: attribute, state after bringing, secondary action
  • ARG4-DIR: ablative, brought-from

active, benefactive: She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT] .

eat.01 - consume, consuming

Aliases: eat (v.) eating (n.)

Roles:

  • ARG0-PAG: consumer, eater
  • ARG1-PPT: meal

Arg0, 1: His [ARG0-PAG] eating [REL] carrots [ARG1-PPT] constantly [ARGM-TMP] has tinted his skin a suspiciously bright orange hue .

throw.01 - throw, sending through the air, manually, projection of an object through space

Aliases: throw (v.) throwing (n.) throw (n.)

Roles:

  • ARG0-PAG: thrower
  • ARG1-PPT: thing thrown
  • ARG2-GOL: thrown at, to, over, etc

see.01 - view

Aliases: see (v.) seeing (n.) sight (v.) sight (n.)

Roles:

  • ARG0-PAG: viewer
  • ARG1-PPT: thing viewed
  • ARG2-PRD: attribute of arg1, further description

sight-n: both args: The climax is his visit to the dead man 's house and his [ARG0-PAG] sight [REL] of the body [ARG1-PPT] .

create.01 - create

Aliases: create (v.) creation (n.)

Roles:

  • ARG0-PAG: creator
  • ARG1-PPT: thing created
  • ARG2-VSP: materials used
  • ARG3-GOL: benefactive
  • ARG4-PRD: attribute of arg1

Creation [REL] of a new , realistic U.S. policy [ARG1-PPT]

attack.01 - to make an attack, criticize strongly

Aliases: attacking (n.) attack (n.) attack (v.)

Roles:

  • ARG0-PAG: attacker
  • ARG1-PPT: entity attacked
  • ARG2-PRD: attribute

Metaphorical attack, illness: The new medication has reduced Sally 's [ARG1-PPT] asthma [ARG0-PAG] attacks [REL] .


Potential New Properties

a) has semantic role

As we mentioned above, few of the existing event/action classes have fully specified semantic roles. For example, "creation (Q11398090)" does not have a statement for the "created". We used "has characteristic (P1552)" - a very generic property, with an "object has role (P3831)" qualifier to indicate the role function:

creation (Q11398090) → has characteristic (P1552) → artificial object (Q3619132) [qualifier: object has role (P3831) → theme (Q118826633)]

It seems desirable to have a more specific property, "P_has_semantic_role", for this purpose. It would be a sub-property of "has characteristic (P1552)" and would be used when no existing property such as "practiced by (P3095)" can be found to indicate a semantic role. We would still use the qualifier to indicate the role function.

b) has selectional preference

We also mentioned that Wikidata does not have an existing property to specify selectional preferences and that we had to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object has role (P3831)" qualifier:

eater (Q20984678) → has characteristic (P1552) → organism (Q7239) [qualifier: object has role (P3831) → selectional preference (Q124051768)]

Here again, it seems desirable to have a more specific property, "P_has_selectional_preference". With this dedicated property, there would be no need for the "object has role (P3831)" qualifier.

--Anatole Gershman (talk) 15:46, 24 June 2024 (UTC)

Statistics

(to be filled in later)

Queries

(to be filled in later)

Current tasks

(to be filled in later)

Participants

The participants listed below can be notified using the following template in discussions:
{{Ping project|Events and Role Frames}}
