Wikidata:WikiProject Events and Role Frames
The primary aims of WikiProject Events and Role Frames are
- to define a set of properties that consistently model eventualities (states, processes and events) and their participants (Pustejovsky, 2021);
- to fill gaps in Wikidata regarding items for states/processes/events/actions; and
- to encourage use of the proposed model and newly introduced items across Wikidata.
Pustejovsky, James (2021). The Role of Event-Based Representations and Reasoning in Language. In Caselli T, Hovy E, Palmer M, Vossen P, eds., Computational Analysis of Storylines: Making Sense of Events. Cambridge University Press.
Motivation
One of the known weaknesses of Wikidata is the spotty coverage of eventualities (processes, states, events) and their prototypical participant structures. Let’s look at the spotty coverage issue first.
Spotty State/Process/Event/Action Coverage
One of the most common verbs in most languages is “to bring”, e.g., “I brought flowers to my mother”, “J'ai apporté des fleurs à ma mère”, “Я принёс цветы маме”, “Ich habe meiner Mutter Blumen mitgebracht”. Until we added "bringing (Q124457329)" in February 2024, there was no such concept in Wikidata. We examined over 11,500 rolesets contained in PropBank (Q7250039) that describe English predicating expressions (mostly verbs) and identified over 7,500 potentially missing Wikidata items [(arw) Currently, there are over 7,500 English verbs defined as lexemes. It would be interesting to understand the overlap and/or what is missing.]. Each of these “gaps” needs further examination to determine if it warrants a new item, but the list gives us a starting point. We want to emphasize that although we start with English, the gaps are semantic, not lexical, and we should use multiple languages to identify semantic gaps. [(arw) There are many Russian lexemes currently defined, many/most by the Lexicator tool.]
State/Process/Event/Action Role Structures
All states/processes/events/actions have core semantic roles - "eating" has the "eater" and the "eaten", "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and the "eaten" independently of how and in which language it is expressed. Most of the existing items for such classes do not mention these roles. For example, "throwing (Q12898216)", defined as “launching of a ballistic projectile by hand” does not have any statements that indicate the existence of the thrower, the target, or the projectile, let alone the specifications of the kinds of entities these attributes are likely to be.
Some Wikidata items for event/action concepts include statements for some of the semantic roles. For example, "eating (Q213449)" uses the "practiced by (P3095)" property whose object is "eater (Q20984678)". Although "practiced by (P3095)" is defined as “type of agents that study this subject or work in this field”, it is often used to indicate an agent of an action. [(arw) This misuse should either be corrected in the property definition or by the creation of a new property.] But, as its description suggests, it has other uses. In "eating (Q213449)", property "uses (P2283)" points to "food (Q2095)" to indicate the "eaten". This property also has many uses. Since no Wikidata property is used exclusively to indicate a semantic role, the existing properties [pfps a property is something like uses (P2283), I suspect you mean that the information needs to be added to a property value in the event concept] will need a qualifier such as "object has role (P3831)" to indicate that the object [pfps I would use value instead here] is a semantic role.
Caveats
This project does not address the problem of ontological consistency of Wikidata items. But, as we examine Wikidata events, we might also fill in the gaps in some of the “subclass of” properties. For example, departure (Q21171241) is not currently a subclass of going (Q19279529). The item execution (Q3966286) defined as “homicide as capital punishment” does not seem to be connected to capital punishment (Q8454).
[(arw) There are other semantic issues beyond subclass of (P279) - such as skos:altLabel ... for example, bringing (Q124457329) is a subclass of moving (Q115095261) which has an altLabel of "renaming".]
Proposal
The proposal outlines a step-by-step procedure for expanding Wikidata state/process/event/action coverage. It has four steps:
1. Adding missing Wikidata state/process/event/action classes [(arw) Add concepts which are Q items, lexemes which are L items, and lexeme senses which are S items. Tying a sense to a Wikidata concept/Q item is accomplished using item for this sense (P5137) or predicate for (P9970).]
2. Adding missing state/process/event/action roles [(arw) Do this by defining a new property for a lexeme's sense such as "has semantic argument" which then references a Q item (for ex, creator (Q2500638) for the role, "creator" for the verb sense, "create.01"). This statement has two properties: (1) "has semantic role" that references a Wikidata item equivalent to its semantic role in the table below (e.g., Actor => actor (Q23894381)), and (2) "defined using property" that references a P item with the explicit semantics for that sense (such as creator (P170)).]
3. Specifying selectional preferences for state/process/event/action roles
4. Adding role specifications to the state/process/event/action instances
Step 1. Adding missing Wikidata state/process/event/action classes
We propose to go systematically over the PropBank RoleSets. For each RoleSet, we look for an existing Q item class that represents the same concept. For example, when we examine PropBank's "see.01" defined as "to perceive an object with one's eyes", we find two relevant Q items: "visual perception (Q162668)" and "seeing (Q25374341)". The former is described as "ability to interpret the surrounding environment using light in the visible spectrum" and the latter as "the event of perceiving something using eyesight" which looks like a better candidate.
[(arw) But when reviewing PropBank, one often sees many possible senses of a word. This highlights the value of defining lexemes and their various senses. In addition, the sense can specifically reference a PropBank term ("see.01") by creating a new identifier, "PropBank ID".]
If no such item is found, we create one. For example, we could not find a Q item equivalent to PropBank's "bring.01" defined as "carry along with, move literally or metaphorically". We created a new Q item "bringing (Q124457329)" described as "transporting something toward somebody/somewhere". We also added labels in Russian and French and made it a "subclass of" "moving (Q115095261)".
[(arw) Translations of lexeme senses are supported via the property, translation (P5972).]
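The lookup-then-create decision in this step can be sketched in Python as follows. This is only an illustration: the `CANDIDATES` and `ACCEPTED` tables are hypothetical hand-built data standing in for a reviewer's search results and decisions, and `resolve_roleset` is an invented helper name, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    qid: str
    label: str
    description: str


# Hypothetical search results a reviewer collected for each roleset.
CANDIDATES = {
    "see.01": [
        Candidate("Q162668", "visual perception",
                  "ability to interpret the surrounding environment "
                  "using light in the visible spectrum"),
        Candidate("Q25374341", "seeing",
                  "the event of perceiving something using eyesight"),
    ],
    "bring.01": [],  # nothing suitable was found before February 2024
}

# Hypothetical reviewer decisions: which candidate represents the roleset.
ACCEPTED = {"see.01": "Q25374341"}


def resolve_roleset(roleset_id: str) -> Optional[Candidate]:
    """Return the accepted candidate, or None if a new item is needed."""
    accepted_qid = ACCEPTED.get(roleset_id)
    for candidate in CANDIDATES.get(roleset_id, []):
        if candidate.qid == accepted_qid:
            return candidate
    return None


for roleset in ("see.01", "bring.01"):
    match = resolve_roleset(roleset)
    action = f"reuse {match.label} ({match.qid})" if match else "create a new item"
    print(f"{roleset}: {action}")
```

Keeping the human decision in a separate table reflects that choosing between candidates such as "visual perception" and "seeing" is a judgment call rather than something the sketch tries to automate.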
Step 2. Adding missing event/action roles
When we find or create an appropriate state/process/event/action Q item, we go over the roles of the RoleSet. In the "eat.01" example, there are two roles: the "consumer, eater" and the "meal". For each role, we look for a Q item statement that describes the role.
eating (Q213449) practiced by (P3095) eater (Q20984678)
describes the "consumer, eater" role. RoleSet "eat.01" indicates that this role is a "PAG" - a Prototypical Agent. Since the "practiced by (P3095)" property may have uses other than designating a semantic frame role, we add a qualifier:
eating (Q213449) practiced by (P3095) eater (Q20984678)
[(arw) An alternative rendering with a Lexeme would be to create Lxxxx-eat with Sense Lxxxx-eat-S1 (defined using the property, ontolex:sense). Then, Lxxxx-eat-S1 could use the "has semantic argument" property to reference eater (Q20984678). That statement would be further qualified with "has semantic role" referencing actor (Q23894381) (which may be a better match than agent (Q392648)) and "defined using property" referencing practiced by (P3095) (assuming that the definition was updated or referencing a new property).]
We provide a mapping between PropBank's prototypical roles such as "PAG" and "PPT" and the corresponding Q items. [(arw) The table below seems perfect for this use.]
PropBank's "meal" role of "eat.01" is a PPT (Prototypical Patient). The statement that best describes it is:
eating (Q213449) uses (P2283) food (Q2095)
We added a qualifier to this statement:
eating (Q213449) uses (P2283) food (Q2095)
[(arw) The example here would be to add detail to Lxxxx-eat-S1. The "has semantic argument" property would reference food (Q2095). The "has semantic role" property would reference theme (Q118826633). The "defined using property" property would reference uses (P2283) or a new property.]
If we cannot find a suitable qualifier object [pfps value for object has role (P3831)?], we can use "semantic role (Q117747915)" as a "back off". [(arw) Should we explicitly declare the various Q items in the table as subclasses/instances of semantic role (Q117747915)?]
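The qualifier and its back-off can be sketched as a small data model. This is a minimal sketch, not a Wikidata client: `Statement`, `role_item_for` and `with_role` are hypothetical names, and the `TAG_TO_ROLE` table is an assumption that picks agent (Q392648) for PAG and theme (Q118826633) for PPT because the text names those two items as qualifier values for classes.

```python
from dataclasses import dataclass, field

OBJECT_HAS_ROLE = "P3831"             # object has role
SEMANTIC_ROLE_BACKOFF = "Q117747915"  # semantic role, used as the back-off

# Assumed, illustrative mapping from PropBank function tags to role items.
TAG_TO_ROLE = {"PAG": "Q392648", "PPT": "Q118826633"}


@dataclass
class Statement:
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


def role_item_for(tag: str) -> str:
    """Map a function tag to a role item, backing off to 'semantic role'."""
    return TAG_TO_ROLE.get(tag, SEMANTIC_ROLE_BACKOFF)


def with_role(statement: Statement, tag: str) -> Statement:
    """Return a copy of the statement with an 'object has role' qualifier."""
    qualifiers = dict(statement.qualifiers)
    qualifiers[OBJECT_HAS_ROLE] = role_item_for(tag)
    return Statement(statement.subject, statement.prop, statement.value, qualifiers)


# eating (Q213449) practiced by (P3095) eater (Q20984678), role tagged PAG
eater = with_role(Statement("Q213449", "P3095", "Q20984678"), "PAG")
# eating (Q213449) uses (P2283) food (Q2095), role tagged PPT
meal = with_role(Statement("Q213449", "P2283", "Q2095"), "PPT")

print(eater.qualifiers)      # qualifier for the eater role
print(meal.qualifiers)       # qualifier for the meal role
print(role_item_for("VSP"))  # unmapped tag falls back to semantic role
```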
When an event/action class does not have a statement describing a core semantic role, we look for an existing Q item that most closely describes that role. For example, "creation (Q11398090)" (process during which something comes into being and gains its characteristics) corresponds to PropBank's "create.01". It has a statement for the "creator" role but no statements for "Arg1-PPT thing created". The item that best describes the object of creation is "artificial object (Q3619132)". To complete the frame, we add the following statement:
creation (Q11398090) has characteristic (P1552) artificial object (Q3619132)
We used the very generic "has characteristic (P1552)" property because we could not find a more specific existing one. The qualifier provides the interpretation.
Another example is "military offensive (Q2001676)", which does not have statements describing the attacker and the defender. The item "attacker (Q31924059)" seems appropriate for the attacker role and "defender (Q111729140)" for the defender role:
military offensive (Q2001676) has characteristic (P1552) attacker (Q31924059)
military offensive (Q2001676) has characteristic (P1552) defender (Q111729140)
(pfps Why are thematic roles showing up here?)
When no role Q item is found, we need to create one.
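Finding the roles that still lack a statement can be sketched as a simple set comparison. The sketch below covers only the two create.01 roles discussed in the text; the `COVERED` and `PROPOSED_FILLERS` tables are hand-entered assumptions, and the function names are invented for illustration.

```python
# Core roles of create.01 discussed in the text (argument -> description).
CREATE_01_CORE = {"ARG0": "creator", "ARG1": "thing created"}

# Arguments already covered by a statement on creation (Q11398090).
COVERED = {"ARG0"}

# Hand-picked fillers for uncovered arguments, taken from the text.
PROPOSED_FILLERS = {"ARG1": "Q3619132"}  # artificial object

HAS_CHARACTERISTIC = "P1552"


def uncovered_roles(core_roles, covered):
    """List core arguments with no statement yet, in frame order."""
    return [arg for arg in core_roles if arg not in covered]


def proposed_statements(class_qid, core_roles, covered, fillers):
    """Build (subject, property, value) triples for uncovered arguments."""
    return [
        (class_qid, HAS_CHARACTERISTIC, fillers[arg])
        for arg in uncovered_roles(core_roles, covered)
        if arg in fillers
    ]


print(uncovered_roles(CREATE_01_CORE, COVERED))
print(proposed_statements("Q11398090", CREATE_01_CORE, COVERED, PROPOSED_FILLERS))
```

An argument with no entry in `PROPOSED_FILLERS` is silently skipped here; in practice that is the case where a new role item would have to be created first.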
Step 3. Specifying selectional preferences for event/action roles
Each role in an event/action frame typically describes the classes of entities that would normally be expected to play that role in that frame's instances. For example, we normally expect that the "eater" in an "eating" instance would be an organism. Because these expectations could be violated, we call them selectional preferences, not restrictions. Unfortunately, Wikidata does not have an existing property to specify selectional preferences, and we have to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object has role (P3831)" qualifier:
eater (Q20984678) has characteristic (P1552) organism (Q7239)
[(arw) Why wouldn't the "semantic argument" alone give you this detail? As you say, eater (Q20984678) is defined as "human or other live being who eats something" and uses has characteristic (P1552) to reference organism (Q7239). This is true regardless of its semantic role.]
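The difference between a preference and a restriction can be made concrete with a check that reports a mismatch without rejecting the filler. This is a toy sketch: the `toy:` identifiers and the `FILLER_CLASSES` table are placeholders invented for illustration, not Wikidata items, and `check_preference` is a hypothetical helper.

```python
# eater (Q20984678) prefers fillers that are organisms (Q7239).
PREFERENCES = {"Q20984678": "Q7239"}

# Hypothetical toy data: filler -> classes it belongs to.
# The "toy:" IDs are placeholders, not Wikidata items.
FILLER_CLASSES = {
    "toy:cat": {"Q7239"},
    "toy:paper_shredder": set(),
}


def check_preference(role_qid, filler):
    """Classify a filler against the role's preference without rejecting it."""
    preferred = PREFERENCES.get(role_qid)
    if preferred is None:
        return "no preference recorded"
    if preferred in FILLER_CLASSES.get(filler, set()):
        return "matches preference"
    return "unusual but allowed"


for filler in ("toy:cat", "toy:paper_shredder"):
    print(filler, "->", check_preference("Q20984678", filler))
```

Returning a label instead of raising an error mirrors the point that a metaphorical "eater" violates the expectation but is still a valid filler.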
Step 4. Adding role specifications to the event/action instances
[(arw) An approach would be to check the Q item of which the instance is a type and then check if that Q item is referenced using item for this sense (P5137) or predicate for (P9970) by a lexeme sense. For example, "create.01" would reference creation (Q11398090). Then, examine that sense's "semantic argument"s (two different ones: creator (Q2500638) for creator and work (Q386724) for the thing created) and use the properties referenced by the arguments' "defined using property" to reference the entities in that role (creator (P170) for creator and has effect (P1542) for the thing created).
Alternately, perhaps it would be easier to create a new property that tied an instance directly to its sense and then use object has role (P3831) on that statement to reference the "semantic argument" directly.]
When a new event/action instance is created, ideally, the creator should consult the class of the instance and make sure that the semantic roles of the class are instantiated. For example, suppose we want to enter the creation of Mickey Mouse by Walt Disney on 18 November 1928 as an event. Let's call the ID for this event Q_mm_creation. Wikidata uses over 300 properties to indicate event/action instance roles. We can pick "creator (P170)" for the creator role, "has effect (P1542)" for the created artifact role and "point in time (P585)" for time. We add the following three statements:
Q_mm_creation creator (P170) Walt Disney (Q8704)
Q_mm_creation has effect (P1542) Mickey Mouse (Q11934)
Q_mm_creation point in time (P585) 18 November 1928
We are using the "object has role (P3831)" qualifier to specify the role played by the object. In the case of event/action classes, we used high-level semantic role items such as "agent (Q392648)" or "theme (Q118826633)" as the objects of "object has role (P3831)". In the case of event/action instances, we use the actual role items such as "creator (Q2500638)" or "attacker (Q31924059)".
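The instance statements and their role qualifiers might be laid out as below. This is a sketch under assumptions: `Q_mm_creation` is the placeholder ID from the text rather than a real item, the use of work (Q386724) as the role item for the created artifact is taken from the (arw) comment above and may not be the final choice, and `render` is a hypothetical helper for display only.

```python
from datetime import date

OBJECT_HAS_ROLE = "P3831"  # object has role

# (subject, property, value, qualifiers)
statements = [
    # creator (P170) -> Walt Disney (Q8704), role: creator (Q2500638)
    ("Q_mm_creation", "P170", "Q8704", {OBJECT_HAS_ROLE: "Q2500638"}),
    # has effect (P1542) -> Mickey Mouse (Q11934), role: work (Q386724), assumed
    ("Q_mm_creation", "P1542", "Q11934", {OBJECT_HAS_ROLE: "Q386724"}),
    # point in time (P585): a default role, so no role qualifier
    ("Q_mm_creation", "P585", date(1928, 11, 18), {}),
]


def render(statement):
    """Format one statement and its qualifiers as a single line of text."""
    subject, prop, value, qualifiers = statement
    if isinstance(value, date):
        value = value.isoformat()
    line = f"{subject} {prop} {value}"
    for q_prop, q_value in qualifiers.items():
        line += f" [{q_prop}: {q_value}]"
    return line


for statement in statements:
    print(render(statement))
```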
Also note that we do not propose to attach the "default" roles such as "location", "start", "end", "point in time" to the event/action classes since all events/actions take place in defined time and place. The instances, though, should specify them (if known). See Semantic Roles below for more details.
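A completeness check for a new instance could look roughly like this. The sketch assumes that the class frame has already been reduced to a set of expected role items and that point in time (P585) is the only default property being checked; `missing_parts` and both tables are hypothetical, and work (Q386724) as the role of the created artifact is an assumption.

```python
OBJECT_HAS_ROLE = "P3831"  # object has role

# Role items the class frame expects: creator (Q2500638) and, as an
# assumption, work (Q386724) for the thing created.
CLASS_ROLES = {"Q2500638", "Q386724"}

# Default properties that instances should carry when known.
DEFAULT_PROPERTIES = {"P585"}  # point in time


def missing_parts(statements, class_roles, default_properties):
    """Return (missing role items, missing default properties), both sorted."""
    roles_present = {
        qualifiers[OBJECT_HAS_ROLE]
        for _, _, qualifiers in statements
        if OBJECT_HAS_ROLE in qualifiers
    }
    props_present = {prop for prop, _, _ in statements}
    return (
        sorted(class_roles - roles_present),
        sorted(default_properties - props_present),
    )


# An incomplete instance: only the creator statement has been entered.
instance = [("P170", "Q8704", {OBJECT_HAS_ROLE: "Q2500638"})]

print(missing_parts(instance, CLASS_ROLES, DEFAULT_PROPERTIES))
```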
Wikidata contains a very large number of event/action instances. For example, "Petsamo–Kirkenes Offensive (Q705222)" is one of many instances of "military offensive (Q2001676)". Currently, it has the following statements for the attacker and the defender roles:
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Soviet Union (Q15180)
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Nazi Germany (Q7318)
These statements do not specify who was the attacker and who was the defender. Ideally, we should add the "object has role (P3831)" qualifier to indicate the role:
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Soviet Union (Q15180)
Petsamo–Kirkenes Offensive (Q705222) participant (P710) Nazi Germany (Q7318)
[(arw) Note that there is a target (P533) property already defined, although an attacker/instigator property may also be needed if distinguishing participant (P710) and target (P533) is not enough.]
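Adding the role qualifier to existing participant statements can be sketched as follows. The role assignment in `ROLE_OF` is an assumption supplied for illustration (the Soviet Union as attacker and Nazi Germany as defender, based on the general history of the offensive); it is not read from Wikidata, and `add_role_qualifiers` is a hypothetical helper.

```python
OBJECT_HAS_ROLE = "P3831"    # object has role
ATTACKER = "Q31924059"       # attacker
DEFENDER = "Q111729140"      # defender

# Existing statements on Petsamo–Kirkenes Offensive (Q705222).
existing = [
    ("Q705222", "P710", "Q15180"),  # participant: Soviet Union
    ("Q705222", "P710", "Q7318"),   # participant: Nazi Germany
]

# Assumed role assignment, entered by hand for this example.
ROLE_OF = {"Q15180": ATTACKER, "Q7318": DEFENDER}


def add_role_qualifiers(statements, role_of):
    """Attach an 'object has role' qualifier wherever a role is known."""
    qualified = []
    for subject, prop, value in statements:
        qualifiers = {OBJECT_HAS_ROLE: role_of[value]} if value in role_of else {}
        qualified.append((subject, prop, value, qualifiers))
    return qualified


for statement in add_role_qualifiers(existing, ROLE_OF):
    print(statement)
```

Participants without a known role keep an empty qualifier set, so the sketch never guesses a role it was not given.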
Obviously, we cannot inspect all existing event/action instances and "fix" them but, at least, we now have a method for doing it.
Semantic Roles
Semantic roles, originally termed cases, are often also referred to as predicate arguments, slots, thematic relations (VerbNet, LIRICS), frame elements (FrameNet), etc. Thematic relations traditionally only refer to the core arguments of the predicating element, and do not include more adjunct-like information found in temporal and locative modifiers. The latter can be applied quite generally and are considered more peripheral. Defining adjuncts precisely has remained a persistent challenge for the linguistics community, making it difficult to distinguish consistently between core and peripheral arguments. The term "semantic roles" can encompass both. Time and place are critical elements of useful descriptions of event instances.
Our stated goal is a mapping between Wikidata items and PropBank semantic roles. The original aim of PropBank was to add semantic role information to the syntactic structures in the Penn Treebank. Since there is no one-to-one mapping between syntactic constituents and semantic roles, annotators were asked to examine every clause in the Penn Treebank featuring a specific lexical item, such as "throw" as a predicating element, and assign the most suitable semantic role label to each one. A PropBank Frame File, listing the different senses of the lexical item and appropriate argument structures for each one, was referred to during this process. For example, the frame for "throw", as listed below, indicates an ARG0-PAG, a prototypical agent (Dowty, 1990), an ARG1-PPT, a prototypical patient or theme, and an ARG2-GOL (the goal or destination of the entity being thrown). There can be up to six numbered core arguments, and a dozen additional peripheral ARGM's, marked individually with function tags such as manner (MNR), locative (LOC), direction (DIR), comitative/accompanier (COM), etc. There are also several more syntactic function tags, to mark modals (MOD), negations (NEG), discourse markers (DIS), etc. The full list with their definitions can be found in the PropBank Guidelines, available at the PropBank GitHub site linked below. The example frame files referred to above are provided below in the PropBank Frame File Examples subsection. After the original 50K sentence Penn Treebank was PropBanked, funding was provided to expand the number of genres and now almost 2M tokens of English have been PropBanked, as well as several other languages including Chinese, Arabic, Korean, Hindi, Urdu, German, French, etc. English PropBank has also been mapped to VerbNet and FrameNet as part of SemLink: Mapping together PropBank/VerbNet/FrameNet, and one can browse a combined representation of those three resources at the Unified Verb Index. 
PropBank's coverage has also been extended to provide support for Abstract Meaning Representation (AMR) annotation (which uses PropBank Frame Files), unifying PropBank rolesets across different parts of speech.
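PropBank examples on this page are written with a bracketed label after each labelled word, for instance "She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT] .". A minimal sketch for reading that flattened notation is given below. It assumes each label follows its span directly, and because the notation does not mark where a span begins, everything since the previous label is treated as the span; `labelled_spans` is a hypothetical helper, not a PropBank tool.

```python
import re

# A span of text followed by a bracketed label such as [ARG0-PAG] or [REL].
LABELLED_SPAN = re.compile(r"(.+?)\s*\[([A-Z0-9-]+)\]")


def labelled_spans(example: str):
    """Return (text, label) pairs in order of appearance."""
    return [(text.strip(), label) for text, label in LABELLED_SPAN.findall(example)]


example = "She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT] ."
for text, label in labelled_spans(example):
    print(f"{label:9} {text}")
```

Text after the last label, such as the final period, is ignored by this sketch.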
Below is a table listing our recommended semantic role labels for Wikidata that are mapped to PropBank labels and are adopted from the Uniform Meaning Representation (UMR) project. They have been carefully reviewed to ensure that they accommodate cross-linguistic typological variation (Bonial et al., 2011, A Hierarchical Unification of LIRICS and VerbNet Semantic Roles (Q118174236); Van Gysel et al., 2021, Designing a Uniform Meaning Representation for Natural Language Processing (Q115519832)). For the most part we are relying on existing Wikidata Thematic Relation definitions to realize our PropBank semantic roles, which at this point also include Start, Temporal and Place.
UMR Semantic Role | Wikidata item | PropBank Function Tag | Description | Example |
---|---|---|---|---|
Actor | actor (Q23894381) | PAG | An animate entity who performs an action | "The chef prepared the meal." |
Force | force (Q126009669) | PAG | An event or inanimate entity that acts upon an undergoer in a way that is usually spontaneous, forceful, and direct | "The wind blew the door open." |
Causer | agent (Q392648) | PAG, CAU | An animate entity who acts on another actor to cause them to engage in the action | "My grandmother made me eat liver." |
Undergoer | undergoer (Q111335542) | PPT | The entity that undergoes the action when it is not clearly a Patient or Theme. | "The kitten licked her fingers." |
Patient | patient (Q170212) | PPT | Subclass of undergoer. The patient is an undergoer in an event that is usually structurally changed, for instance by experiencing a change of state or condition; is often acted upon by an agent; is causally involved or directly affected by other participants; and exists independently of the event. | "The chef prepared the meal." "The roommates painted the walls." |
Theme | theme (Q118826633) | PPT | Subclass of undergoer. The theme is an undergoer that is central to an event or state that does not have control over the way the event occurs, is not structurally changed by the event, and/or is characterized as being in a certain position or condition throughout the state. Often in motion. | "She packed her suitcase for the trip." |
Recipient | recipient (Q20820253), addressee (Q19720921) | GOL | The entity that receives something. | "The librarian handed me a book." |
Experiencer | experiencer (Q1242505) | PPT, PAG | The entity that directly experiences a sensation or emotion | "Many tourists saw the accident." "He felt a sense of relief." |
Stimulus | stimulus (Q109566760) | PAG, CAU | The entity that causes an emotional or mental state | "The loud noise startled the cat." |
Instrument | instrument (Q6535309) | MNR | An inanimate entity used to perform an action or event | "The rock broke the window." "She cut the paper with scissors." |
Start | origin (Q3885844) | DIR | The entity from which an action originates or the starting point of an action or event | "I flew from Heathrow." "The bidding opened at $5." |
Goal | goal (Q109405570) | GOL | Where an action is directed. In motion verbs, the final destination | "He ran to the store." |
Companion | companion (Q106645134) | COM | An animate entity that accompanies another entity or entities and is presented as an oblique argument; who an action was done with | "I went to the movies with friends." |
Material/Source | material (Q214609), source (Q31464082) | DIR | The location, entity, or material from which an action or event originates | "Water flowed from the faucet" "I milked the cow." "The shirt is made of cotton." |
Place | location (Q109377685) | LOC | The place where an event or action occurs | "The party will be at the park." |
Affectee | affectee (Q125995757) | PPT | Entity positively or negatively affected by the circumstances of an event or action without being the primary undergoer | "The movie made her cry." |
Cause | cause (Q2574811) | CAU | Why an event or inanimate entity brings about an action or event | "The pool was closed because of lightning." |
Temporal | duration (Q2199864), time (Q12322185), Frequency (Q125995799) | TMP | When an action took place. This includes all temporal referents, such as dates, duration, frequency, order, repetition, etc. | "He went to the store yesterday." "I've been reading email for three hours." "They cleaned the kitchen first." "She lost her keys again." |
Extent | extent (Q125953445) | EXT | The degree or amount to which something happens | "He ran five miles." "The price increased by 5%." |
Manner | means (Q12774177) | MNR | The way in which something is performed | "He worked quickly and mechanically." |
Reason | cause (Q2574811) | PRP | The reason, explanation or justification for an event or action | "I went to the store because we were out of milk." "He left early because he had another meeting." |
Purpose | cause (Q2574811) | PRP | The purpose or intended objective of an event or action | "I went to the store to buy milk." "He left early to get to another meeting." |
Attribute | attribute (Q109674924) | PRD | The quality or characteristic ascribed to an entity | "The house is big." |
Result | result (Q2995644) | PRD | The entity described by a secondary predicate | "She kicked the door shut." "You scared me to death." "He painted the door red." |
Direction | direction (Q2151613) | DIR | Motion along a specified (literal or figurative) path | "I walked down the street." "I turned left." |
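Because several UMR roles share one PropBank function tag, going from a tag to a role item is a one-to-many lookup. The sketch below indexes a few rows copied from the table to show this; the subset of rows and the helper name `roles_by_tag` are chosen for illustration only.

```python
from collections import defaultdict

# (UMR role, Wikidata item, PropBank function tags), copied from the table.
ROWS = [
    ("Actor", "Q23894381", ["PAG"]),
    ("Force", "Q126009669", ["PAG"]),
    ("Causer", "Q392648", ["PAG", "CAU"]),
    ("Undergoer", "Q111335542", ["PPT"]),
    ("Patient", "Q170212", ["PPT"]),
    ("Theme", "Q118826633", ["PPT"]),
    ("Goal", "Q109405570", ["GOL"]),
    ("Place", "Q109377685", ["LOC"]),
]


def roles_by_tag(rows):
    """Index (role, item) pairs by every function tag they are listed under."""
    index = defaultdict(list)
    for role, qid, tags in rows:
        for tag in tags:
            index[tag].append((role, qid))
    return dict(index)


INDEX = roles_by_tag(ROWS)
for tag in ("PAG", "PPT", "LOC"):
    print(tag, [role for role, _ in INDEX[tag]])
```

A tag such as PAG returns several candidates, so choosing among them still requires looking at the roleset's role description.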
Example PropBank Frames
Here are the complete PropBank frames referenced above.
bring.01 - carry along with, move literally or metaphorically
bring (v.)
Roles:
- ARG0-PAG: bringer
- ARG1-PPT: thing brought
- ARG2-GOL: benefactive or destination brought-for, brought-to
- ARG3-PRD: attribute, state after bringing, secondary action
- ARG4-DIR: ablative, brought-from
active, benefactive: She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT] .
eat.01 - consume, consuming
Aliases: eat (v.) eating (n.)
Roles:
- ARG0-PAG: consumer, eater
- ARG1-PPT: meal
Arg0, 1: His [ARG0-PAG] eating [REL] carrots [ARG1-PPT] constantly [ARGM-TMP] has tinted his skin a suspiciously bright orange hue .
throw.01 - throw, sending through the air, manually, projection of an object through space
Aliases: throw (v.) throwing (n.) throw (n.)
Roles:
- ARG0-PAG: thrower
- ARG1-PPT: thing thrown
- ARG2-GOL: thrown at, to, over, etc
see.01 - view
Aliases: see (v.) seeing (n.) sight (v.) sight (n.)
Roles:
- ARG0-PAG: viewer
- ARG1-PPT: thing viewed
- ARG2-PRD: attribute of arg1, further description
sight-n: both args: The climax is his visit to the dead man 's house and his [ARG0-PAG] sight [REL] of the body [ARG1-PPT] .
create.01 - create
Aliases: create (v.) creation (n.)
Roles:
- ARG0-PAG: creator
- ARG1-PPT: thing created
- ARG2-VSP: materials used
- ARG3-GOL: benefactive
- ARG4-PRD: attribute of arg1
Creation [REL] of a new , realistic U.S. policy [ARG1-PPT]
attack.01 - to make an attack, criticize strongly
Aliases: attacking (n.) attack (n.) attack (v.)
Roles:
- ARG0-PAG: attacker
- ARG1-PPT: entity attacked
- ARG2-PRD: attribute
Metaphorical attack, illness: The new medication has reduced Sally 's [ARG1-PPT] asthma [ARG0-PAG] attacks [REL] .
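The role lines in the frames above follow a regular "- ARGn-TAG: description" pattern, so they can be read mechanically. The following is a minimal sketch for the plain-text listing used on this page only; it is not a parser for PropBank's actual frame files, and `Role` and `parse_roles` are hypothetical names.

```python
import re
from typing import NamedTuple


class Role(NamedTuple):
    arg: str          # numbered argument, e.g. "ARG0"
    tag: str          # function tag, e.g. "PAG"
    description: str  # free-text role description


# Matches lines such as "- ARG0-PAG: thrower".
ROLE_LINE = re.compile(r"^-\s*(ARG\d)-([A-Z]+):\s*(.+?)\s*$")


def parse_roles(text: str):
    """Extract the role lines of one frame, skipping everything else."""
    roles = []
    for line in text.splitlines():
        match = ROLE_LINE.match(line.strip())
        if match:
            roles.append(Role(*match.groups()))
    return roles


THROW_01 = """\
throw.01 - throw, sending through the air, manually, projection of an object through space
Roles:
- ARG0-PAG: thrower
- ARG1-PPT: thing thrown
- ARG2-GOL: thrown at, to, over, etc
"""

for role in parse_roles(THROW_01):
    print(role.arg, role.tag, role.description)
```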
Potential New Properties
a) has semantic role
As we mentioned above, few of the existing event/action classes have fully specified semantic roles. For example, "creation (Q11398090)" does not have a statement for the "created". We used "has characteristic (P1552)" - a very generic property, with an "object has role (P3831)" qualifier to indicate the role function:
creation (Q11398090) has characteristic (P1552) artificial object (Q3619132)
It seems desirable to have a more specific property, "P_has_semantic_role", for this purpose. It would be a sub-property of "has characteristic (P1552)" and used when no existing property such as "practiced by (P3095)" could be found to indicate a semantic role. We would still use the qualifier to indicate the role function.
b) has selectional preference
We also mentioned that Wikidata does not have an existing property to specify selectional preferences and that we had to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object has role (P3831)" qualifier:
eater (Q20984678) has characteristic (P1552) organism (Q7239)
Here again, it seems desirable to have a more specific property, "P_has_selectional_preference". With this dedicated property, there will be no need for the "object has role (P3831)" qualifier.
Statistics
(to be filled in later)
Queries
(to be filled in later)
Current tasks
(to be filled in later)
Participants
The participants listed below can be notified using the following template in discussions: {{Ping project|Events and Role Frames}}
Related links
- PropBank
- FrameNet
- VerbNet
- UVI
- The DARPA Wikidata overlay: Wikidata as an ontology for natural language processing (Q119958789)
- PDT-Vallex
- ENG-Vallex
- NomVallex
- AnCorpaVerb
- ADESSE: Base de datos de Verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español
- Pattern Dictionary of English Verbs
- Mapping Czech Verbal Valency to PropBank Argument Labels
- A Visual Dictionary of Tibetan Verb Valency
- ValPal