Wikidata:Bot requests/Archive/2021/01

Library catalogs (2021-01-28)

Request date: 28 January 2021, by: Epìdosis

Task description

More specific queries:

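# References stated in (P248) wd:Q576951 on any statement of items that have a P396 value (?sbn)
SELECT ?item ?ref WHERE {
  ?item wdt:P396 ?sbn.
  ?ref pr:P248 wd:Q576951.
  ?statement prov:wasDerivedFrom ?ref.
  ?item ?p ?statement.
}

# References stated in (P248) wd:Q856423 on any statement of items that have a P7293 value (?plwabn)
SELECT ?item ?ref WHERE {
  ?item wdt:P7293 ?plwabn.
  ?ref pr:P248 wd:Q856423.
  ?statement prov:wasDerivedFrom ?ref.
  ?item ?p ?statement.
}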
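Both queries return one row per (?item, ?ref) pair, so an item appears once for each matching reference. As a minimal sketch (not the bot's actual code), the rows could be grouped per item before any edit is made. It assumes the response uses the standard SPARQL JSON results format; the function name and the sample data, including the placeholder QIDs and reference URIs, are hypothetical.

```python
from collections import defaultdict


def group_refs_by_item(sparql_json):
    """Map each item QID to the set of reference-node URIs bound to ?ref."""
    refs_by_item = defaultdict(set)
    for row in sparql_json["results"]["bindings"]:
        # ?item comes back as a full entity URI; keep only the trailing QID.
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        refs_by_item[qid].add(row["ref"]["value"])
    return dict(refs_by_item)


# Hypothetical response: two references on one item, one on another.
sample = {
    "head": {"vars": ["item", "ref"]},
    "results": {"bindings": [
        {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
         "ref": {"type": "uri", "value": "http://www.wikidata.org/reference/aaa"}},
        {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
         "ref": {"type": "uri", "value": "http://www.wikidata.org/reference/bbb"}},
        {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2"},
         "ref": {"type": "uri", "value": "http://www.wikidata.org/reference/ccc"}},
    ]},
}

grouped = group_refs_by_item(sample)
```

Grouping first makes it easy to see how many references each item carries before deciding what to do with them.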

Thanks @Ladsgroup:! --Epìdosis 08:47, 2 February 2021 (UTC)

Discussion
This is going to take a while. It's pretty big. Amir (talk) 19:12, 1 February 2021 (UTC)
User:Epìdosis. Hey, can you add a more specific query? Thanks. Amir (talk) 08:11, 2 February 2021 (UTC)
The first part is done. Doing the second part now. Amir (talk) 07:33, 4 February 2021 (UTC)
Request process

Accepted by (Amir (talk) 19:07, 1 February 2021 (UTC)) and under process

Task completed (21:52, 12 February 2021 (UTC))

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Amir (talk) 21:52, 12 February 2021 (UTC)

Archivio Storico Ricordi multiple references

Request date: 1 January 2020, by: Epìdosis

Link to discussions justifying the request
  • ...
Task description

Given the following query:

SELECT DISTINCT ?item
WHERE {
  ?item wdt:P8290 ?asr .
  ?item p:P569 ?statement.
  ?reference1 pr:P248 wd:Q3621644.
  ?reference2 pr:P248 wd:Q3621644.
  ?statement prov:wasDerivedFrom ?reference1.
  ?statement prov:wasDerivedFrom ?reference2.
  FILTER (?reference1 != ?reference2)
}

Typically, date of birth (P569) has two references: one with stated in (P248) + Archivio Storico Ricordi person ID (P8290) + retrieved (P813), and another with only stated in (P248) + Archivio Storico Ricordi person ID (P8290). The second, which lacks the retrieval date, should always be deleted.
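The selection rule can be sketched as follows. This is only an illustration under stated assumptions: each reference is modelled as a plain dict from property ID to value (not the real Wikibase JSON), the function name and sample values are hypothetical, and the sketch deliberately deletes nothing unless a reference with retrieved (P813) also exists on the same statement.

```python
ASR = "Q3621644"  # source item used with stated in (P248) in the query above


def refs_to_delete(references):
    """Return the Archivio Storico Ricordi references that lack retrieved (P813).

    Nothing is returned unless at least one ASR reference with P813 remains,
    so a statement is never left without its ASR reference.
    """
    asr_refs = [ref for ref in references if ref.get("P248") == ASR]
    has_dated = any("P813" in ref for ref in asr_refs)
    if not has_dated:
        return []
    return [ref for ref in asr_refs if "P813" not in ref]


# Hypothetical references on one date of birth (P569) statement.
refs = [
    {"P248": ASR, "P8290": "12345", "P813": "2020-06-01"},
    {"P248": ASR, "P8290": "12345"},
    {"P248": "Q_OTHER_SOURCE"},  # placeholder for an unrelated reference
]

to_delete = refs_to_delete(refs)
```

The guard on `has_dated` is a design choice of this sketch rather than part of the request; it keeps the operation safe on statements that do not match the typical pattern.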

The same should be repeated, substituting each of the following for P569 in the above query: place of birth (P19), date of death (P570), place of death (P20).
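A minimal sketch of that substitution, assuming the query text is kept as a template and only the statement property changes. The names `QUERY_TEMPLATE`, `PROPERTIES` and `queries` are hypothetical.

```python
# Query from the task description, with the statement property left as a placeholder.
# Literal SPARQL braces are doubled so that str.format leaves them intact.
QUERY_TEMPLATE = """SELECT DISTINCT ?item
WHERE {{
  ?item wdt:P8290 ?asr .
  ?item p:{prop} ?statement.
  ?reference1 pr:P248 wd:Q3621644.
  ?reference2 pr:P248 wd:Q3621644.
  ?statement prov:wasDerivedFrom ?reference1.
  ?statement prov:wasDerivedFrom ?reference2.
  FILTER (?reference1 != ?reference2)
}}"""

# The four properties named in the request.
PROPERTIES = {
    "P569": "date of birth",
    "P19": "place of birth",
    "P570": "date of death",
    "P20": "place of death",
}

# One query string per property, keyed by property ID.
queries = {prop: QUERY_TEMPLATE.format(prop=prop) for prop in PROPERTIES}
```

Each entry of `queries` can then be run separately, which keeps the four result sets apart.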

Discussion

Pinging @Ladsgroup:, as his bot is probably ready to do this. --Epìdosis 22:36, 1 January 2021 (UTC)

@Epìdosis: Started Amir (talk) 17:02, 27 February 2021 (UTC)
Request process

Accepted by (Amir (talk) 17:01, 27 February 2021 (UTC)) and under process
Task completed (14:11, 6 March 2021 (UTC))

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Amir (talk) 14:11, 6 March 2021 (UTC)

Admin bot for deletion of 100k non-notable items

Request date: 19 January 2021, by: Epìdosis

Link to discussions justifying the request
Task description

Bot-deletion of the following items:

Discussion

I ping @Ladsgroup: and @MisterSynergy: as I know they have admin-bots. Thanks in advance, --Epìdosis 16:50, 19 January 2021 (UTC)

  • Not sure whether we should delete those at all. You can find plenty of similar datasets with *very* limited obvious use for our project. I'd say the items do meet the notability policy, so a much clearer consensus should be reached IMO, and it should be clear how this consensus relates to other similar situations. We could probably delete millions of items for the same reason, but do we want to? —MisterSynergy (talk) 17:02, 19 January 2021 (UTC)
    @MisterSynergy: I know that the situation is unclear and that it concerns tens of thousands of items, maybe more. For this reason I opened a general discussion (the third link above) and waited for a week with no feedback, whilst in the Project chat (the first link above) there seemed to be wide consensus for deletion. If you want to advertise the discussion on other pages, or open an RfC or whatever, I obviously support this, as I fully agree about the need to reach a clear conclusion on this point. Thanks as always, --Epìdosis 17:18, 19 January 2021 (UTC)
    Well, I don't find it that clear. There are quite a few complaints about the bot operator not having had their import approved as a separate bot task, but User:GZWDer is right that this is not explicitly required anywhere. In general, we have never managed to cleanly define batch editing and its distinction from bot editing in our policies, and we have never managed to update the bot policy in a way that suits Wikidata. It is still based on experience gained with bots in Wikipedias before Wikidata was launched, although this project relies on automated (bot) editing *much* more than Wikipedias do.
    To me, the discussions linked above seem to be fueled by the aversion to User:GZWDer that many users seem to feel. To be quite honest, I also do not like their behavior in most cases, as they aggressively edit in the gray area of our policies, and they are not very open to input from other users. This is genuinely a problem in a collaborative project, but as long as they do not clearly violate policies, which is not the case here in my opinion, I do not see a reason to suppress their contributions; please also mind that in my opinion Wikidata:Deletion policy does not allow the use of the deletion tool here (I do consider these items notable according to WD:N).
    So, I think we should instead try to improve the policies so that this should not happen again, rather than to set a precedent for a sympathy-based use of the deletion tool. —MisterSynergy (talk) 20:01, 19 January 2021 (UTC)
    @MisterSynergy: OK, I perfectly agree about the need to update our bot editing policy in order to avoid post factum discussions about large numbers of edits. However, my point isn't about the conduct of GZWDer, but about the fact that, according to my interpretation, these items should be deleted because they don't meet WD:N. The discussion I opened was an attempt to find consensus on whether or not they meet WD:N, and no user contested my interpretation that they didn't, so I have also edited Help:Sources accordingly. I think that two separate discussions would then be useful: one about bot policies; the other, already open but deserted as of now, about whether encyclopedia articles without Wikisource sitelinks can fit WD:N (about which I am personally skeptical). --Epìdosis 21:45, 19 January 2021 (UTC)
    With the DOI claims, there is no doubt that they do meet the notability requirements. There also seem to be valid references on all (or at least most) of the claims. Notability is not an issue here; a deletion based on a not-notable claim would be completely at odds with our standard practice. —MisterSynergy (talk) 21:58, 19 January 2021 (UTC)
    @MisterSynergy: I partially disagree about their notability for one reason: as in these cases the DOI (P356) in fact coincides with the respective identifiers present in the items of the subjects of these articles, I agree with what @Bovlb: said in the Project chat: "Unless we're going to get a lot more information on these items, it seems to me that this sort of import would be better embodied in an identifier property." In general, my position is that the notability of items containing DOI (P356) can be taken for granted unless the DOI (P356) overlaps with an existing Wikidata property. I would prefer having a brief discussion somewhere on whether any item having DOI (P356) is notable, in order to finally add a statement DOI (P356) instance of (P31) Wikidata property for an identifier that suggests notability (Q62589316) and reach a general conclusion on this point. --Epìdosis 22:14, 19 January 2021 (UTC) P.S. As I'm not an expert in copyright, just a little confirmation: importing this sort of bibliographic metadata is CC0-compliant, isn't it?
    Well, the Benezit ID (P2843) identifier and the DOI are not identical. The identifier property was poorly managed in the past on Wikidata, but apparently mistakes/poor decisions have been made on the side of the external database as well which sort of contributed to the mess on the property page. The DOIs identify the encyclopedia articles, and the Benezit ID (P2843) identifiers identify the persons described in the articles. Of course, the URLs should *not* be identical, but Benezit ID (P2843) unfortunately uses DOI urls since March 2020 (Special:Diff/1130357608). The formatter URL should instead point to the URL which the DOI resolves to—unfortunately the identifiers would have to be changed as well then (i.e. rather make a new property and reset Benezit ID (P2843) to the pre-March 2020 state).
    The amount of content which is available in the items is not a relevant factor. There is no rule that there should be a "lot more information" available about something in order to be admissible here. —MisterSynergy (talk) 23:21, 19 January 2021 (UTC)

OK, thanks @MisterSynergy: for all the answers. As of now, it is quite clear that this problem needs further discussion on other pages, and I'm now convinced that the deletion should probably not be performed anyway, as there is little doubt that these items are notable because of DOI (P356). We can close, at least for now, this bot-request. Thanks again and good night, --Epìdosis 23:29, 19 January 2021 (UTC)

Request process
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Vojtěch Dostál (talk) 12:30, 15 June 2021 (UTC)