Wikidata:Requests for comment/P155/P156 as qualifiers only, rather than as main statements

An editor has asked the community to provide input on "P155" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

This request for comment concerns the use of P155/P156 (follows/followed by) as main statements rather than as qualifiers, and argues that current main statement uses of these properties be moved to qualifiers on other properties.

(This is largely derived from posts made by Mahir256 on Project chat and on P156's talk page, and uses of the first person below refer to Mahir256.)

Request for comment

Motivation

As a prelude to the primary motivation for this RfC, what follows is an elaborated version of an argument I made to another admin more than four years ago in favor of requiring that follows (P155)/followed by (P156) be used as qualifiers:

To detach the ordering of any of these items from the sequences to which they belong (by not having P155 or P156 as qualifiers on some other clarifying property) seems rather disingenuous. Numerous other sequences (of recurring events, of multiple series of texts, of story arcs in TV shows, or of really anything else) may be adjusted in exactly this same fashion. A rather similar argument can be made with respect to the use of similar sequence properties such as replaces (P1365)/replaced by (P1366) as qualifiers only (and in fact has likely been made before).

Background

On 11 July 2019 Matěj Suchánek added a property scope constraint to followed by (P156) (copied over shortly after to follows (P155)), one of the generic properties used to indicate an adjacent element of some series of which something is a part, that establishes that these should only be used as main statements or as qualifiers. This, however, appeared to contradict discussion on the latter property's talk page that suggested that it should only be used as a qualifier, and my understanding at that time continued to be that P155/P156 were only meaningful if they were tied to some series, and that this tie is best maintained through these properties' qualifying a statement which defines the series in question. Thus back in late September I removed the "as main statement" qualifier on those constraints. Those removals, however, were shortly thereafter questioned and reverted (which, in retrospect, was fine, given that the discussions which drive this RfC had not yet happened at that time).

Some proposed migration paths

I was asked by User:MisterSynergy in early October to provide some suggestions for migrating statement uses of P155/P156 to qualifiers. Below are some of these, listed in decreasing order of the number of items with certain (groups of) P31s. Most of these follow similar threads, and can be expanded to include other sequences using the same properties as those noted in the suggestions. (Note that the use counts, as of 4 October 2020, are for follows (P155) only, on the assumption that a similar quantity of uses of followed by (P156) are also present.)

  1. There are ~225k uses on items with P31/P279* minor planet (Q1022867).
  2. There are ~105k uses on items with P31 album (Q482994).
  3. There are ~80k uses on items with P31/P279* sporting event (Q16510064).
  4. There are ~46k uses on items with P31 biographical article (Q19389637).
  5. There are ~56k uses on items with P31/P279* Wikimedia category (Q4167836).
  6. There are ~62k uses on items with P31/P279* season (Q20852192).
  7. There are ~51k uses on items with P31 television series episode (Q21191270).
    • As is seen on an episode like Project Daedalus (Q56605474), the statements could be moved to qualifiers on either part of the series (P179) for the show as a whole or season (P4908) for the specific season of which the episode is a part.
    • Example: Consider the episode Gethsemane (Q5554456).
      • To say it "follows" Demons (Q5256287) might seem unambiguous, but the "followed by" counterpart could be either Redux (Q50279722) (since the episode is part of the series The X-Files (Q2744)) or "no value" (since it is the final episode of the series The X-Files, season 4 (Q3468921)), and to use one (or even both!) as main statements would introduce ambiguity.
      • Before: at this revision, it's not evident from the P155 and P156 statements, either implicitly or explicitly, that the episode is the last in its season but not the last in the overall TV series.
      • After: at this revision, the P155/P156 qualifiers make the episode's being the last in its season but not the last in the overall TV series clear.
  8. There are ~27k uses on items with P31 events in a specific year or time period (Q18340514).

The above list is of course not exhaustive; I would be happy to suggest a migration path for some class of items with P155/P156 main statements not covered by any of the cases above.
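The episode case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the `NO_VALUE` placeholder and the function name `move_sequence_to_qualifiers` are hypothetical and do not correspond to the Wikibase JSON format or to any bot framework, and the "before" main statements are assumed for the sake of the example. Only the item and property IDs come from the Gethsemane example.

```python
from copy import deepcopy

NO_VALUE = None  # hypothetical stand-in for Wikidata's "no value"


def move_sequence_to_qualifiers(item, neighbours):
    """Drop main P155/P156 statements and attach them as qualifiers.

    item:       {property_id: [{"value": qid, "qualifiers": {...}}, ...]}
    neighbours: {(property_id, sequence_qid): {"P155": ..., "P156": ...}}
    Returns a new item; the input is left untouched.
    """
    result = deepcopy(item)
    # Remove the unscoped main statements.
    result.pop("P155", None)
    result.pop("P156", None)
    # Attach the neighbours to the statement that names each sequence.
    for (prop, sequence), links in neighbours.items():
        for statement in result.get(prop, []):
            if statement["value"] == sequence:
                statement["qualifiers"].update(links)
    return result


# Gethsemane (Q5554456); the "before" state is assumed for illustration.
gethsemane = {
    "P179": [{"value": "Q2744", "qualifiers": {}}],      # part of the series: The X-Files
    "P4908": [{"value": "Q3468921", "qualifiers": {}}],  # season: The X-Files, season 4
    "P155": [{"value": "Q5256287", "qualifiers": {}}],   # follows: Demons
    "P156": [{"value": "Q50279722", "qualifiers": {}}],  # followed by: Redux
}

# One entry per sequence: the answer differs for the series and the season.
neighbours = {
    ("P179", "Q2744"): {"P155": "Q5256287", "P156": "Q50279722"},
    ("P4908", "Q3468921"): {"P155": "Q5256287", "P156": NO_VALUE},
}

migrated = move_sequence_to_qualifiers(gethsemane, neighbours)
```

After the call, `migrated` has no main P155/P156 statements; the P179 statement says the episode is followed by Redux, while the P4908 statement says it is followed by nothing within its season.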

Counters

What follows here are some rebuttals to some (predicted) responses to the prior iteration of the migration proposed above:

  • What about the status quo?: that is the whole point of this RfC, no?
  • What about qualifying P155/P156 with the series in question?: I find this unworkable for two reasons: 1) it decouples a sequence (noted by a main statement with a given property) to which an item belongs from the items in that sequence (qualified with that same property or something similar), making it prone to desynchronization (avoiding which is the whole point of this RfC in the first place), and 2) it appears to suggest that an item inherently has some position in an ordering (indicated by P155/P156 being more prominent than the sequence property qualifying it) that end users (no matter how much you inform them about any qualifiers that might be present) may well mis(s/interpret), whereas confining the scope of P155/P156 to a particular property will generally avoid this problem.
  • What about the awkwardness of qualifying, say, instance of (P31)?: I will agree that this does suggest a deficiency in the sorts of properties to which sequence information can be attached on Wikidata, but this does not automatically validate the ill-defined nature of an incredibly generic sequence relation property which as a main statement adopts whatever meaning the reader wishes (this meaning, of course, entirely invisible and foreign to a SPARQL query or to external users of our data). I would be happy to support new properties which P155/P156 can qualify to replace relations where P155/P156 is qualifying P31, but absent those new properties I stand by those entries in the list above involving qualifiers to P31.
  • What about people moving P1365/P1366 to qualifiers?: This RfC does not concern either of those properties, at least on account of their specificity as to the sequence relation being described; there are in fact valid uses of those properties as main statements, in addition to their use as qualifiers on other properties such as position held (P39).
  • (This list may be expanded as other arguments are brought up here.)

If there are some circumstances from a data modeling standpoint (we all know that badly modeled data on Wikidata hurts its users) that prevent the adoption of a constraint on these properties requiring them to be used as qualifiers on some other property, then it would help to either define 1) what sort of otherwise usable "series item" might be useful to have for those circumstances, or 2) what other properties might be needed to better specify the special sequence relation in question in those circumstances, rather than to "just give up" and leave these vague main statements.

General discussion

?head (p:prop/pq:P156)* ?item , or
?tail (p:prop/pq:P155)* ?item
This will pull out all the people who follow each other in the particular role defined by p:prop. (And a trick can then be used to sort that list into sequence).
It is not possible to do the same thing if the role is defined by a subject has role (P2868) qualifier on follows (P155). For a single step you can write:
?tail p:P155 ?stmt . ?stmt ps:P155 ?predecessor . FILTER EXISTS { ?stmt pq:P2868 ?role }
But statements like that cannot be chained together, because you cannot put a FILTER requirement inside the ( .. )* construction.
Based on this my view would be that if at all possible follows (P155) and followed by (P156) should be used as qualifiers, so that the main property can indicate in what regard A is followed by B. -- Jheald (talk) 17:55, 22 February 2021 (UTC)[reply]
This can be avoided by insisting on a main statement of the form: A → part of the series (P179) → B, and then using the follows (P155) qualifier on that. IMO this should be used, even if there is also a statement A → instance of (P31) → B. Jheald (talk) 18:03, 22 February 2021 (UTC)[reply]
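The chaining idea can also be illustrated outside SPARQL. The following is a minimal Python sketch, assuming a hypothetical in-memory model in which every statement carries its own qualifiers; `follow_chain` and the `items` dictionary are illustrative names, not part of any Wikidata API, and the statements shown are example data built from the episode IDs mentioned in the migration section. Note that, unlike a bare property path, the sketch also matches on the main value (the series), so two sequences attached to the same item stay separate.

```python
NO_VALUE = None  # hypothetical stand-in for Wikidata's "no value"


def follow_chain(items, head, prop, series):
    """Walk "followed by" (P156) qualifiers on `prop` statements whose
    main value is `series`, starting at `head`; return the visited QIDs."""
    chain, seen, current = [], set(), head
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)  # guards against accidental cycles in the data
        successor = None
        for statement in items.get(current, {}).get(prop, []):
            if statement["value"] == series:
                successor = statement["qualifiers"].get("P156")
        current = successor
    return chain


# Illustrative data: Demons -> Gethsemane -> Redux within The X-Files (Q2744),
# but Gethsemane closes The X-Files, season 4 (Q3468921).
items = {
    "Q5256287": {  # Demons
        "P179": [{"value": "Q2744", "qualifiers": {"P156": "Q5554456"}}],
        "P4908": [{"value": "Q3468921", "qualifiers": {"P156": "Q5554456"}}],
    },
    "Q5554456": {  # Gethsemane
        "P179": [{"value": "Q2744", "qualifiers": {"P156": "Q50279722"}}],
        "P4908": [{"value": "Q3468921", "qualifiers": {"P156": NO_VALUE}}],
    },
    "Q50279722": {  # Redux
        "P179": [{"value": "Q2744", "qualifiers": {"P155": "Q5554456"}}],
    },
}

series_chain = follow_chain(items, "Q5256287", "P179", "Q2744")
season_chain = follow_chain(items, "Q5256287", "P4908", "Q3468921")
```

Here `series_chain` runs through all three episodes, while `season_chain` stops at Gethsemane, because the successor is read from the statement that names the sequence rather than from the item as a whole.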
@Mahir256: Thanks. For some names we do feel separate items are warranted. We've used both P155/P156 and P1365/P1366 properties in these items because it was never clear which was the preferred property and the definitions seemed to overlap. We also wanted to make sure that however they were queried, the items would be returned in results. UWashPrincipalCataloger (talk) 06:17, 23 February 2021 (UTC)[reply]
  • It looks like the conversion would be a megaproject. Are you planning to do the work yourself? If so, it would be feasible to work on those specific cases where it's pretty obvious that they would be better done as qualifiers, without needing consensus that there are no cases where they should be main statements. Ghouston (talk) 09:54, 23 February 2021 (UTC)[reply]
    • @Ghouston: If no one else wishes to help address this, I could start on it more earnestly. If the now-blocked User:MovieFex hadn't complained about it, then I'd have gone ahead in those cases without this RfC, but @MisterSynergy:, who asked for migration paths, has not yet directly responded to them, so I await a response in that case. Mahir256 (talk) 15:43, 23 February 2021 (UTC)[reply]
      • I'm still not convinced that we should move all P155/P156 to qualifiers. Main value P155/P156 claims with a qualifier to indicate the series seems cleaner and more manageable to me. However, in case we do move to a qualifier-only setting, the migration plan does already cover quite a significant portion of the current cases that need to be updated. There will nevertheless remain a lot of cases which need to be dealt with individually. —MisterSynergy (talk) 19:03, 23 February 2021 (UTC)[reply]
        • @MisterSynergy: "cleaner and more manageable to me": but not compatible with queries. See above. Jheald (talk) 19:12, 23 February 2021 (UTC)[reply]
          • I do not understand your example, to be honest. You seem to imply that there is exactly one series per property and the main value is irrelevant, but this is clearly not the case. The example query somehow needs to specify the series (main value), and then it is equally difficult as with the inverse order (P155/P156 with qualifiers). Or do I miss something here? —MisterSynergy (talk) 19:36, 23 February 2021 (UTC)[reply]
            • @MisterSynergy: It's true that not every property has only one series. In particular if there were different statements of the form A → part of the series (P179) → X(i) for different series orderings X(i), my query could not separate out the different series X(i) starting from a particular item A. (Nor, I think, can any other.) But for some properties there is only one series, and for those properties using a form that allows expressions of the form ?head (p:prop/pq:P156)* ?item allows a chain related to that property to be distinguished from chains related to any other property. Jheald (talk) 20:29, 23 February 2021 (UTC)[reply]
              • From what I understand, most series would be using a rather generic property, meaning the main value matters. Can you name some examples where this is not the case? I think we should not consider this a reason to favor the P155/P156-as-qualifier approach just because of some edge cases. Most series would probably not be queried any easier if we were to go this path. —MisterSynergy (talk) 20:45, 23 February 2021 (UTC)[reply]
        • I once added the qualifiers to some award statements, e.g., for Founder’s Medal (Q26268774), e.g., as used on award received (P166) on Peter Scott (Q731311). Isn't it more natural as a qualifier, than to have a separate main statement indicating the award? Ghouston (talk) 21:21, 23 February 2021 (UTC)[reply]
        • They are also sometimes found on position held (P39) statements to give a sequence of office holders. Ghouston (talk) 21:43, 23 February 2021 (UTC)[reply]
        • @MisterSynergy: Can you explain where the "uncleanliness" and "unmanageability" of tying the preceding and following elements of a sequence to that sequence, while indicating the greater importance of the sequence to an item compared to the adjacent elements in that sequence, comes from? Mahir256 (talk) 22:59, 23 February 2021 (UTC)[reply]
          • The qualifier scheme is difficult to manage because editors drop these qualifiers wherever they can, often without much consideration of whether these P155/P156 properties should be used at all and whether a suitable main property already exists in the item. We are going to end up with plenty of P155/P156 qualifiers at various places, which would effectively only make it more difficult to query them; this would potentially also violate plenty of constraint definitions, since the range of allowed qualifiers is often limited but P155/P156 are not yet included.
            The preceding discussion with User:Ghouston sort of proves my point here. If you add P155/P156 as direct statements that require a "series" qualifier of some nature, you need to actively think about what is appropriate and whether the direct statements are applicable at all.
            In general, I think two factors are relevant for decisions such as this one. First of all: can it be queried nicely (IMO no difference in both settings, Jheald's argument above seems to cover some niches only); and second: which solution would presumably result in better data quality due to its better editor guidance. If we standardize this at all, I really think that direct P155/P156 have a clear advantage here over the proposed qualifier scheme, thus I would only support a change into that direction. —MisterSynergy (talk) 23:11, 23 February 2021 (UTC)[reply]
            • @MisterSynergy: "better data quality due to its better editor guidance": I don’t think your argument really holds here. This pair of properties can be, and actually is, used anywhere, even in cases where it’s not at all clear which series is at stake, for example because there is no series-like statement at all. You can use « followed by » on a human item, and there actually are such uses: https://w.wiki/32Lb returns 71 results when I execute the query. As for data quality, a constraint had to be added to forbid using those properties with « human ». That’s like allowing them anywhere while forbidding them wherever they should not be used … That does not seem easy to manage at all to me. By contrast, allowing them as qualifiers on a set of relevant properties is easier to manage, even more so if they are not used as main statements at all. I don’t think allowing them as main statements is a practice that actually leads to better-quality data. author  TomT0m / talk page 11:43, 24 February 2021 (UTC) ADD: By contrast, I think it’s pretty easy to create a community culture/rule on the principle « whenever a sequence is involved (that is, if a thing follows another), the data related to the sequence should be put in qualifiers ». I realize this rule would also mean putting the episode/season number of a video series in qualifiers. author  TomT0m / talk page 11:46, 24 February 2021 (UTC)[reply]
        Mahir recently asked me to reconsider my opposition to the proposal. I still cannot get behind the actual proposal as presented above, but—unlike before—I would support a migration if there is *only exactly one* main property (possibly part of the series (P179)) that hosts all P155/P156 qualifiers. This would still come with some drawbacks, but I think it would be a doable change that I could support. —MisterSynergy (talk) 22:17, 16 March 2022 (UTC)[reply]
  •   Support said it before and will say it again. I cannot see how they can be anything but qualifiers to not be ambiguous on all occasions. Their use should be explicit, not implicit.  — billinghurst sDrewth 11:18, 25 February 2021 (UTC)[reply]
  •   Support A uniform model would be easier for both editors and re-users. VIGNERON (talk) 18:48, 25 February 2021 (UTC)[reply]
  •   Support I agree with billinghurst. Ambiguous statements are not immediately useful to a machine. --SilentSpike (talk) 22:37, 19 November 2021 (UTC)[reply]
  •   Strong support Very much agree with this, as per the above discussion about ambiguity. Where something is part of a series by its very identity (e.g. numbers, dates etc.) then we should use these as qualifiers on instance of (P31). However, there is nothing stopping us also using these as qualifiers for part of (P361) (or whatever else defines the series). Theknightwho (talk) 01:30, 9 December 2021 (UTC)[reply]
  •   Support I also agree with Theknightwho that many times P361 is, in fact, a series. Amadalvarez (talk) 05:57, 15 September 2022 (UTC)[reply]
  •   Support I was convinced by the reasoning above. Popperipopp (talk) 10:54, 15 September 2022 (UTC)[reply]
  •   Oppose I'm worried this will make it harder to manipulate/generate this data programmatically BrokenSegue (talk) 01:39, 18 September 2022 (UTC)[reply]
  •   Weak oppose There are already too many proposed properties that can be accompanied by follows (P155) and followed by (P156). This will result in inconsistency in data modelling, as new editors and data reusers will not read this page. As a user I can't find out what to do even after reading this page. Midleading (talk) 10:01, 5 November 2022 (UTC)[reply]

Some data

follows (P155) is used on items with a variety of instance of (P31) values – see that list (and the number of uses per class), among which are « city », Mikko Niskanen (Q224765), calendar months, historical periods, biographical articles, cycling teams … It’s sometimes used to indicate a rank hierarchy in a set of ranks, wikimedial models (?), submarine classes … You actually have to know what kind of item it is to guess the meaning. author  TomT0m / talk page 13:20, 24 February 2021 (UTC)[reply]