Wikidata:Property proposal/verifiability of property

verifiability of property

Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: Verifiability of this property, one of "Verified", "Human verifiable", "no value"
Data type: Item
Domain: property
Allowed values: "Verified", "Human verifiable", "no value"
Example 1: stated in (P248) → "Verified"
Example 2: imported from Wikimedia project (P143) → "Human verifiable"
Example 3: retrieved (P813) → no value
Example 4: reference URL (P854) → "Verified"
Planned use: Sourcing SPARQL query
Robot and gadget jobs: no

Motivation

We need a way to ensure that results from the Wikidata Query Service are reliably sourced. Currently there is no simple way to require a higher reliability standard in the Wikidata Query Service than "wdt", which means that unsourced statements, or statements imported by bots from Wikipedia, will appear in query results. Simply requiring that a source be provided doesn't work, because the source can be imported from Wikimedia project (P143) Wikipedia (Q52). Requiring any source other than imported from Wikimedia project (P143) doesn't work either, because imported from Wikimedia project (P143) can be accompanied by retrieved (P813), and the retrieved (P813) triple will be matched by the SPARQL engine as a valid source. Not to mention that we are starting to have new equivalents of imported from Wikimedia project (P143), such as Wikimedia import URL (P4656) and inferred from (P3452).
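
The failure mode described above can be illustrated with a minimal sketch. It assumes the standard Wikidata Query Service prefixes (p:, ps:, pr:, prov:, wikibase:), and the variable names are illustrative only. The query tries to exclude Wikipedia imports by rejecting pr:P143, yet a reference that combines imported from Wikimedia project (P143) with retrieved (P813) still matches through its pr:P813 triple:

```sparql
# Naive attempt: accept any reference property except P143.
SELECT ?item ?value ?refProp ?refValue WHERE {
  ?item p:P31 ?statement .
  ?statement ps:P31 ?value ;
             a wikibase:BestRank ;
             prov:wasDerivedFrom ?refNode .
  ?refNode ?refProp ?refValue .
  # Restrict ?refProp to the reference predicate (pr:...) of some property.
  ?property wikibase:reference ?refProp .
  # This drops the pr:P143 triple, but a pr:P813 triple on the same
  # reference node still satisfies the pattern.
  FILTER(?refProp != pr:P143)
}
LIMIT 100
```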

Using this new property, we can write a SPARQL query that requires the data to be properly sourced. Example:

SELECT ?item ?value ?reliablesource ?reference WHERE {
  ?item p:P31 [ps:P31 ?value;
               a wikibase:BestRank;
               prov:wasDerivedFrom [?reliablereference ?reference]].
  ?reliablesource <http://example.com/verifiability> <http://example.com/Verified>;
                  wikibase:reference ?reliablereference.
}

Alternatively, we can link the new items "Verified" or "Human verifiable" to properties using existing properties, for example instance of (P31) or has characteristic (P1552). But with a new property we can even define a hierarchy of reliability. For example, if the statement "Verified" next lower rank (P3729) "Human verifiable" exists, we can write the following SPARQL query, which requires each result to have at least one source, even one linked to Wikipedia (Q52), but excludes unsourced statements and bogus sources, for example a reference with only retrieved (P813):

SELECT ?item ?value ?reliablesource ?reference WHERE {
  ?item p:P31 [ps:P31 ?value;
               a wikibase:BestRank;
               prov:wasDerivedFrom [?reliablereference ?reference]].
  ?reliablesource <http://example.com/verifiability>/wdt:P3729* <http://example.com/HumanVerifiable>;
                  wikibase:reference ?reliablereference.
}

Midleading (talk) 12:28, 10 December 2019 (UTC)

Data Model: There are multiple ways to model source reliability. We should decide which one we want to use.

  1. "Generic reliable source" > "Unreliable source", as proposed above.
  2. "Reliable source from trusted organizations and governments" > "Generic reliable source" > "Unreliable source". Defines a multi-level reliability hierarchy. More levels can be added if necessary.
  3. "Reliable source of science", "Reliable source of law", "Reliable source of medicine", "Reliable source of bibliography", "Generic reliable source", "Unreliable source". Defines reliability by area.
  4. "Primary source", "Secondary source", "Tertiary source", "Generic reliable source", "Unreliable source".

We also need to discuss the criterion used to define this property: is it about verifiability or about reliability? Midleading (talk) 08:06, 11 December 2019 (UTC)

Discussion

Using a subclass technically works, but the new items used to specify property verifiability are subclasses of Wikimedia internal item (Q17442446), as they are specific to Wikidata. The properties themselves are subclass of Wikidata property to indicate a source (Q18608359) but not an indirect subclass of Wikimedia internal item (Q17442446). Also, this new property uses the parent relationship next higher rank (P3730)/next lower rank (P3729), which is different from subclass of (P279). This property doesn't indicate that the results are indeed reliable: a statement can still be incorrect or vandalized, but the reference is listed in the query result, so you can verify it by consulting the sources. Without this property, you will be overwhelmed by a huge list of values with no means to verify them.--Midleading (talk) 02:25, 12 December 2019 (UTC)
Properties are specific to Wikidata too, I'm not sure what your point is there. Here's specifically what I'm suggesting: create new items like "Wikidata property to indicate a verified source", "Wikidata property to indicate a human-verifiable source" as subclasses of Wikidata property to indicate a source (Q18608359), and make stated in (P248) an instance of the first, imported from Wikimedia project (P143) an instance of the second, and leave retrieved (P813) as it is. You can relate the new items with next lower rank (P3729) etc. relationships if you like, and a modified version of your SPARQL will work the same way, but using the subclass relation instead of a new property. ArthurPSmith (talk) 18:25, 12 December 2019 (UTC)
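
A modified query along the lines suggested above might look like the following sketch. The class IRI <http://example.com/VerifiedSourceProperty> is a placeholder for the not-yet-existing item "Wikidata property to indicate a verified source"; it is an assumption, in the same spirit as the example.com IRIs used in the proposal.

```sparql
SELECT ?item ?value ?reliablesource ?reference WHERE {
  ?item p:P31 [ps:P31 ?value;
               a wikibase:BestRank;
               prov:wasDerivedFrom [?reliablereference ?reference]].
  # The property must be an instance of the (hypothetical) class,
  # or of any of its subclasses, instead of carrying a new property.
  ?reliablesource wdt:P31/wdt:P279* <http://example.com/VerifiedSourceProperty>;
                  wikibase:reference ?reliablereference.
}
```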
I don't think properties are specific to Wikidata, because many instances of Wikidata property to indicate a source (Q18608359) have an external equivalent property, for example publication date (P577) = schema:datePublished. Making them an instance of Wikimedia internal item (Q17442446) is just like saying Universe (Q1) is an instance of "Wikipedia article". I'm fine with using a subclass instead of a new property, provided we keep the ontology stable so that we aren't forced to use an ugly/slow "?reliablesource wdt:P31/(wdt:P279|wdt:P3729)* wd:Q123456789" and/or a VALUES statement in the coming years, and provided the new subclass can't be used to infer that the properties are Wikimedia-only.--Midleading (talk) 05:29, 13 December 2019 (UTC)
They're already listed as instances of Wikidata property to indicate a source (Q18608359), so whether or not you consider them Wikimedia-only, making them instances of subclasses of that won't change anything in that respect. We also have many existing items that were created for purposes like this and should remain stable, for example all the property constraint-related items. ArthurPSmith (talk) 18:47, 13 December 2019 (UTC)
@Midleading: That property exists to express a relationship that's defined by policy. This policy wants to say things about what sources are reliable in the absence of policy. A data model of how we think about source reliability should be codified in policy and RfC-approved. ChristianKl 11:23, 4 September 2020 (UTC)
Wikidata:Living_people exists because we protect the rights of living people. If a user violates that policy, he or she might be warned or blocked. But this one? I may violate this "reliable sources" policy, if defined, at any time. Nobody would be blocked. I don't like such a useless policy being codified. If such a policy were defined, it is the statement in the reference that matters, not the property itself. --Midleading (talk) 05:32, 5 September 2020 (UTC)
As explained from the start, this doesn't work because (1) there are references which combine imported from Wikimedia project (P143) and retrieved (P813), and (2) there are other equivalents of imported from Wikimedia project (P143), for example Wikimedia import URL (P4656). --Midleading (talk) 05:32, 5 September 2020 (UTC)
  Oppose I don't get the benefits from this proposal tbh. There should be no credibility differences between bot edits and human edits. If there are low quality bot edits, request the bot to be blocked. If you do not want to use information imported from Wikimedia projects, filter for imported from Wikimedia project (P143). -- Dr.üsenfieber (talk) 09:10, 12 September 2020 (UTC)
  Oppose Filter out statements which are backed only by reference properties which you don't find reliable. If needed, it would also be possible to make reference properties instance of (P31) {some type of classification system for reference properties} if there is a reasonable classification system in existence. --Dhx1 (talk) 14:03, 21 December 2020 (UTC)
  Oppose as per Dr.üsenfieber. --FocalPoint (talk) 09:14, 14 February 2021 (UTC)
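
The filtering approach suggested in the comments above could be sketched as follows. The deny-list of reference properties is an assumption assembled from the properties named in this proposal, not an agreed classification, and the query assumes the standard Wikidata Query Service prefixes.

```sparql
SELECT ?item ?value ?refProp ?refValue WHERE {
  ?item p:P31 ?statement .
  ?statement ps:P31 ?value ;
             a wikibase:BestRank ;
             prov:wasDerivedFrom ?refNode .
  ?refNode ?refProp ?refValue .
  # Restrict ?refProp to the reference predicate (pr:...) of some property.
  ?property wikibase:reference ?refProp .
  # Hypothetical deny-list: import-style properties plus the bare
  # retrieval date, so that at least one other reference property is needed.
  FILTER(?refProp NOT IN (pr:P143, pr:P4656, pr:P3452, pr:P813))
}
LIMIT 100
```

Unlike a plain exclusion of pr:P143, this deny-list also rejects pr:P813, so a reference consisting only of an import property and a retrieval date yields no match. The list would have to be maintained by hand as new import properties appear.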

  Not done Clear lack of consensus. JesseW (talk) 02:34, 12 March 2021 (UTC)