User:Amadalvarez/sports statistics property

Background

To date, properties related to sports information have been underdeveloped. Player and team characteristics have average coverage, but seasons, leagues, championships, matches, etc. have items with poor content. I think the main reason lies in the variety of performances and data among different sports that makes it difficult to share a homogeneous ontology. If we focus on the quantitative properties that measure results, we can find:

New trend

In late February 2021, a block of baseball-only statistical properties was created to measure several actions of players, and some for matches: number of at bats (P9180), number of hits (P9184), bases on balls (P9188), runs batted in (P9190), stolen bases (P9217), doubles hit (P9220), triples hit (P9225). A request is now open for 13 new properties (1 for rugby, 1 for baseball and 11 for basketball) to collect statistical totals.

We're opening a can of worms! The number of statistical indicators managed by sports specialists is high, though I understand that this is reasonable from the point of view of someone who deals with a single sport. Additionally, these indicators have more than one dimension, as they are usually shown by player, team, season, match, etc. and their combinations (e.g. goals of a player in a match, in a season, in a team, in his career, etc.). If we multiply this by the number of sports (except for cases like P1351, which is shared by many sports), the number of properties will become huge. Also, if someone is able to maintain a new statistical indicator, who will be able to approve or reject the creation of a new property for, say, number of turnover (Q354115), or the Pace Factor [1] of an NBA team?

Proposal

 
X axis = "sport statistics indicators"
Y axis = scope of measurement.
Z axis = each "slice" of this dimension is one "subject item" (player, club, league, match,etc)

My proposal is to handle all these indicators in a common structure, based on the traditional OLAP cube with 3 dimensions: subject, object and scope. Each combination of these three dimensions points to a quantity, that is, the value of the indicator. I'm not talking about having an OLAP system; I'm just borrowing the idea of a 3-dimensional representation.

Component description

A "single wildcard property" called “sports statistics”, “statistic indicator value”, “quantity of” (or whatever we agree on, it doesn't matter now) will define the "indicator" we need to collect in a sport item. Let's look at the three dimensions:

  • the subject of the measure (player, team, competition, match, ...).
      • It answers the question "Whose indicator is it?".
      • In our case, it is the item.
      • We may show it on the Z-axis of the cube.
  • the object of the measure (points, goals, rebounds, fast laps, etc.).
      • It answers the question "What is the concept measured?".
      • Here we will use the new wildcard property to define the indicator concept. It's not a closed list, as would happen with specific properties for each indicator.
      • We may show it on the X-axis of the cube.
  • the scope of the measure.
      • It answers the question "Which part does the measure correspond to?". Usually this is a "period of time" rather than a date, for example: season, time in a league, match, career, etc. However, it can also be used to define other criteria, for instance, which member of a team the indicator refers to.
      • We may show it on the Y-axis of the cube.

If we project these concepts onto the cube, the cell at coordinate (X, Y, Z) contains the measured quantity.
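The cube idea can be sketched in a few lines of Python. This is only an illustration, not a proposed implementation: the names `Cell`, `cube` and `slice_cube` are invented here, and the values are copied from the examples on this page.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One coordinate of the cube."""
    subject: str    # Z axis: the item the indicator belongs to
    indicator: str  # X axis: the concept being measured
    scope: str      # Y axis: the part the measure applies to


# Each cell holds the measured quantity (values from the examples on this page).
cube = {
    Cell("Q41421", "Q2353718", "Q282049"): 32292,    # Michael Jordan, point, career
    Cell("Q41421", "Q654355", "Q282049"): 6672,      # Michael Jordan, rebound, career
    Cell("Q41421", "Q2353718", "Q1321749"): 2427,    # Michael Jordan, point, 1996–97 NBA season
    Cell("Q128109", "Q2353718", "Q63859186"): 6942,  # Chicago Bulls, point, 2019–20 season
}


def slice_cube(cube, *, subject=None, indicator=None, scope=None):
    """Return the cells that match every coordinate that was given."""
    return {
        cell: value
        for cell, value in cube.items()
        if (subject is None or cell.subject == subject)
        and (indicator is None or cell.indicator == indicator)
        and (scope is None or cell.scope == scope)
    }


# All career figures of Michael Jordan: fix Z (subject) and Y (scope), leave X free.
career = slice_cube(cube, subject="Q41421", scope="Q282049")
print(career)
```

Fixing one or two coordinates and leaving the rest free gives the usual views (a player's career, a season, a single indicator across subjects) without needing one property per combination.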

Examples of use

Here we can see different situations, such as the figures of the whole career, a season or a match for the case of a player, a team or a match.

The "value" of the "quantity of" property is the object to be measured;
the value of the P518 qualifier is the scope of the measurement;
and the value of the P1114 qualifier is the measured figure.
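As a minimal sketch of that layout, the statement can be modelled as a main value plus two qualifiers. The dictionary shape below is a simplification invented for illustration and is not the actual Wikibase JSON format; `Pnnn` is a placeholder, since the proposed property has no ID yet, and `StatStatement` is a hypothetical helper.

```python
from dataclasses import dataclass

NEW_PROPERTY = "Pnnn"     # placeholder: the proposed property has no ID yet
APPLIES_TO_PART = "P518"  # qualifier holding the scope
QUANTITY = "P1114"        # qualifier holding the measured figure


@dataclass(frozen=True)
class StatStatement:
    subject: str    # item that carries the statement
    indicator: str  # main value: the object to be measured
    scope: str      # value of the P518 qualifier
    amount: int     # value of the P1114 qualifier

    def to_claim(self) -> dict:
        """Simplified claim layout (an assumption, not the Wikibase schema)."""
        return {
            "item": self.subject,
            "property": NEW_PROPERTY,
            "value": self.indicator,
            "qualifiers": {
                APPLIES_TO_PART: self.scope,
                QUANTITY: self.amount,
            },
        }


# Michael Jordan (Q41421), point (Q2353718), career (Q282049), 32292
claim = StatStatement("Q41421", "Q2353718", "Q282049", 32292).to_claim()
print(claim)
```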

In player item:

Whole career statistics
Michael Jordan (Q41421) | quantity of (new) | basketball game (Q18431960) | applies to part (P518): career (Q282049) | quantity (P1114): 1072
Michael Jordan (Q41421) | quantity of (new) | point (Q2353718) | applies to part (P518): career (Q282049) | quantity (P1114): 32292
Michael Jordan (Q41421) | quantity of (new) | rebound (Q654355) | applies to part (P518): career (Q282049) | quantity (P1114): 6672
Michael Jordan (Q41421) | quantity of (new) | assist (Q1510817) | applies to part (P518): career (Q282049) | quantity (P1114): 5633
Michael Jordan (Q41421) | quantity of (new) | three-point field goal (Q746826) | applies to part (P518): career (Q282049) | quantity (P1114): 536
.... etc.
Time in league statistics
Dani Pedrosa (Q313959) | quantity of (new) | podium (Q5688743) | applies to part (P518): MotoGP (Q10858737) | quantity (P1114): 112
Dani Pedrosa (Q313959) | quantity of (new) | pole position (Q588596) | applies to part (P518): MotoGP (Q10858737) | quantity (P1114): 31
Dani Pedrosa (Q313959) | quantity of (new) | fastest lap (Q310258) | applies to part (P518): MotoGP (Q10858737) | quantity (P1114): 44
Dani Pedrosa (Q313959) | quantity of (new) | point (Q3393320) | applies to part (P518): MotoGP (Q10858737) | quantity (P1114): 3087
Dani Pedrosa (Q313959) | quantity of (new) | fastest lap (Q310258) | applies to part (P518): 250cc/Moto2 (Q15635270) | quantity (P1114): 15
Dani Pedrosa (Q313959) | quantity of (new) | point (Q3393320) | applies to part (P518): 250cc/Moto2 (Q15635270) | quantity (P1114): 626
Season statistics
Michael Jordan (Q41421) | quantity of (new) | basketball game (Q18431960) | applies to part (P518): 1996–97 NBA season (Q1321749) | quantity (P1114): 82
Michael Jordan (Q41421) | quantity of (new) | point (Q2353718) | applies to part (P518): 1996–97 NBA season (Q1321749) | quantity (P1114): 2427
.... etc.
Match statistics
Michael Jordan (Q41421) | quantity of (new) | point (Q2353718) | applies to part (P518): Game 4 of 1988 NBA Playoffs Eastern Conference First Round, Chicago Bulls at Cleveland Cavaliers (Q56670521) | quantity (P1114): 44
Michael Jordan (Q41421) | quantity of (new) | rebound (Q654355) | applies to part (P518): Game 4 of 1988 NBA Playoffs Eastern Conference First Round, Chicago Bulls at Cleveland Cavaliers (Q56670521) | quantity (P1114): 5

In team item:

Chicago Bulls (Q128109) | quantity of (new) | point (Q2353718) | applies to part (P518): 2019–20 Chicago Bulls season (Q63859186) | quantity (P1114): 6942

In match item: examples equivalent to the requested properties.

1989 Georgetown vs. Princeton men's basketball game (Q90747291) | quantity of (new) | rebound (Q654355) | applies to part (P518): Bob Scrabis (Q98841453) | quantity (P1114): 2
1989 Georgetown vs. Princeton men's basketball game (Q90747291) | quantity of (new) | rebound (Q654355) | applies to part (P518): Princeton Tigers men's basketball (Q7245013) | quantity (P1114): 13

Some considerations

  • Although not represented in the diagram, the model may incorporate other additional qualifiers.

Overlapping with existing properties

Once this property is approved, it will offer a solution for many of the newly created properties and for those still in the proposal phase. For the older, widely populated properties, a very careful analysis will be necessary, since in some cases they are applied to multiple situations and with different structures. In general, we could say:

  • Current uses as main property can be migrated with few changes. In many cases, the current usage is erroneous and should be fixed in advance.
  • When it is being used to display total data (career), it will be possible to migrate it to the new property.
  • Many uses as a qualifier of properties P1350, P1351 seem reasonable to maintain, especially when they document a "scope" that has no associated item:
      to Marek Štěch (Q984140) | participant in (P1344) | 2018–19 FA Cup (Q54866623)
      + Marek Štěch (Q984140) | Pnnn | association football match (Q16466010) | applies to part (P518): 2018–19 FA Cup (Q54866623) | quantity (P1114): 0
  • Many uses as a qualifier of properties P1355, P1356, P1357, P1358, P1359 could be migrated to the new property, but results that leave heterogeneous structures for the same kind of data must be avoided.
  • In any case, the systems that currently use these structures must be carefully considered during the migration process.
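For the career-total case in the list above, a migration step could look roughly like the sketch below. It is only an illustration under stated assumptions: `Pnnn` stands for the not-yet-created property, the indicator items `Q_GOAL` and `Q_SHOT` and the subject `Q_PLAYER` are placeholders rather than real identifiers, the amount is an invented value, and the output dictionary is a simplified layout, not the Wikibase JSON format.

```python
NEW_PROPERTY = "Pnnn"     # placeholder for the proposed property
APPLIES_TO_PART = "P518"
QUANTITY = "P1114"
CAREER = "Q282049"        # career, the scope item used in the examples on this page

# Hypothetical mapping from career-total properties to indicator items.
# The right-hand values are placeholders, not real QIDs.
CAREER_TOTAL_TO_INDICATOR = {
    "P6509": "Q_GOAL",  # total goals in career
    "P6543": "Q_SHOT",  # total shots in career
}


def migrate_career_total(item, old_property, amount):
    """Rewrite one career-total statement in the proposed structure.

    Returns None when the property is not in the mapping, so that
    unknown statements are left untouched.
    """
    indicator = CAREER_TOTAL_TO_INDICATOR.get(old_property)
    if indicator is None:
        return None
    return {
        "item": item,
        "property": NEW_PROPERTY,
        "value": indicator,
        "qualifiers": {APPLIES_TO_PART: CAREER, QUANTITY: amount},
    }


# Invented example: placeholder player with an invented total of 100.
migrated = migrate_career_total("Q_PLAYER", "P6509", 100)
print(migrated)
```

Qualifier uses with heterogeneous parent properties would need their own, more careful rules, which this sketch does not attempt.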

The following table shows the current situation.

Property | Defined as | # entries as principal | # entries as qualifier | Scope
number of matches played/races/starts (P1350) | Undefined | 39190 | 675513 | principal: career for teams, career for competitions (28222); players participation in competition (12406). qualifier: results within a team (P54) or in a competition (P1344) for players; results of participant teams in competitions (222454)
number of points/goals/set scored (P1351) | Undefined | 6297 | 756543 | Similar to P1350
number of wins (P1355) | Qualifier | 37! | 31958 | principal: usually shows career info for players and info that must be P1350 for teams. qualifier: results within P1344, P1923, P2416, P710, mainly
number of losses (P1356) | Undefined | 25 | 30956 | principal: career info for players and autodefined scope items for teams. qualifier: results within P1344, P1923, P710, mainly
number of draws/ties (P1357) | Qualifier | 22! | 30732 | principal: career info for players and autodefined scope items for teams. qualifier: results within P1344, P1923, P710, mainly
points for (P1358) | Qualifier | 14! | 101002 | principal: career info for players and autodefined scope items for teams. qualifier: results within P1344, P1923, P710, mainly
number of points/goals conceded (P1359) | Undefined | 242 | 16958 | principal: career info for players (goalkeepers, mainly) and goals received for teams/season. qualifier: results within P1323, P1944, P710, mainly
total goals in career (P6509) | Undefined | 11494 | 4 | career
total shots in career (P6543) | Undefined | 6189 | 2 | career
total points in career (P6544) | Undefined | 7265 | 4 | career
penalty minutes in career (P6546) | Undefined | 7250 | 0 | career
career plus-minus rating (P6547) | Undefined | 6189 | 0 | career
century breaks (P4912) | Principal | 502 | 1 | career
highest break (P6590) | Undefined | 658 | 0 | career
doubles record (P555) | Undefined | 3935 | 5 | career
singles record (P564) | Undefined | 3956 | 4 | career
number of at bats (P9180) | Both | 3 | 4 | principal: career for player. qualifier: match for competitions
number of hits (P9184) | Both | 3 | 4 | principal: career for player. qualifier: match for competitions
bases on balls (P9188) | Both | 2 | 3 | principal: career for player. qualifier: match for competitions
stolen bases (P9217) | Both | 4 | 6 | principal: career for player. qualifier: match for competitions
doubles hit (P9220) | Both | 2 | 4 | principal: career for player. qualifier: match for competitions
triples hit (P9225) | Both | 2 | 4 | principal: career for player. qualifier: match for competitions