On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at User talk:Ivan A. Krestinin/Archive.

Incorrect Link Rot

I'm not sure why you deprecated the DOI on Barringer Medal citation for Michael R. Dence (Q101634288). It's a working DOI, so that deprecation seems unfounded. Trilotat (talk) 07:27, 26 November 2020 (UTC)

Do you mean this edit? The bot did not change the deprecation rank. It just upper-cased the value. — Ivan A. Krestinin (talk) 15:49, 2 January 2021 (UTC)

Non-capturing regex group (?:)

Hi Ivan,

At Wikidata:Property_proposal/URL_match_pattern, we are trying to figure out what would be a sensible default pattern for replacement.

At Property_talk:P973, this would probably be \2, but, if Krbot supports non-capturing regex groups, we could use "\1".

this tries to test it with that url. Will Krbot convert it? It seems to be busy with other things in the meantime.

Wikidata:Property_proposal/URL_match_pattern could probably be useful for Krbot as well. used by (P1535) could qualify the ones used by Krbot. --- Jura 07:54, 27 November 2020 (UTC)

PCRE v8.39 is used by KrBot. Please see the library documentation for supported syntax. As I understand it, the general idea is to replace {{Autofix}} with properties. The Autofix template currently supports several different use cases. Do you have an idea how to describe all of them using properties? — Ivan A. Krestinin (talk) 16:03, 2 January 2021 (UTC)
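The effect of a non-capturing group on backreference numbering can be sketched as follows. This is only an illustration in Python's `re` module, not KrBot's PCRE code; the `(?:...)` syntax is the same in both, and the `example.org` URL pattern is a made-up placeholder.

```python
import re

# Hypothetical URL pattern (example.org is a placeholder, not a real rule).
# The optional "www." and the path alternatives use (?:...) so they do not
# consume a group number; the identifier is therefore group 1, not group 2 or 3.
PATTERN = re.compile(r"^https?://(?:www\.)?example\.org/(?:item|record)/(\d+)$")


def extract_id(url):
    """Return the identifier captured as group 1, or None if the URL does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


def rewrite(url):
    """Replace a matching URL by its identifier using the \\1 backreference."""
    return PATTERN.sub(r"\1", url)
```

With capturing parentheses in place of `(?:...)`, the identifier would move to a higher group number, which is why the replacement would then have to be "\2" or "\3".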

Connection to other wikis

Hey! Can you move Q9212417 to Q8564503 ?

The topic is the same: en.: Category:Jazz clubs

it.: Categoria:Locali jazz

The move will allow the connection to other wikis.

Thanks! --CoolJazz5 (talk) 10:59, 2 December 2020 (UTC)

@CoolJazz5 I've done this. In future, please specify links to the pages that you write about. This greatly simplifies execution of your request. Michgrig (talk) 22:41, 2 December 2020 (UTC)
Michgrig Ok, thanks!

Question about the edits made by the bot

Why did the bot edit the two entries Q87402631 and Q104417514? It is true that the announcements that the bot moved were displayed with a warning. But that doesn't mean they're wrong too. --Gymnicus (talk) 10:48, 24 December 2020 (UTC)

As I noticed just after resetting, there were no warnings at all. For this reason, the edits of the bot make even less sense from my point of view. --Gymnicus (talk) 10:59, 24 December 2020 (UTC)
It is a very common mistake to set person-specific properties on items that describe groups of people. See this edit as an example. The bot fixes such cases. It is logically incorrect to use properties like country of citizenship (P27) or sex or gender (P21) for groups of humans. — Ivan A. Krestinin (talk) 16:21, 2 January 2021 (UTC)
I don't see it as easy as you do. For the property country of citizenship (P27) I would go with you. But I can't understand that with the property occupation (P106). There is also an example with the data object Igor and Grichka Bogdanoff (Q12920) where your bot does not remove this property and this statement was added on November 22, 2020. Then why does he remove it from the examples I mentioned? --Gymnicus (talk) 10:28, 6 January 2021 (UTC)

Request

Could you make your bot:

  • Add P750 > Q1486288 to items with P2725
  • Add P750 > Q1052025 to items with P5944
  • Add P750 > Q1052025 to items with P5971
  • Add P750 > Q1052025 to items with P5999
  • Add P750 > Q22905933 to items with P7294
  • Add P750 > Q135288 to items with P5885

--Trade (talk) 21:55, 27 December 2020 (UTC)

Please add statements like:
property constraint (P2302): item-requires-statement constraint (Q21503247)
  property (P2306): distributed by (P750)
  item of property constraint (P2305): PlayStation Store (Q1052025)
  constraint status (P2316): mandatory constraint (Q21502408)
Bot fixes such statements automatically. — Ivan A. Krestinin (talk) 16:49, 2 January 2021 (UTC)

Fenestella troubles

There are two different genera named Fenestella, namely an animal one (Q20975616) and a fungus one (Q17317929). Despite the names, they are completely different. So should their corresponding article items be, and their category items (Q9651255 and Q18283983, respectively) as well. However, because of the name confusion, there have been mistakes, like linking the Swedish fungus article to the Commons category for the animals. Unfortunately, these mistakes also seem to have caused confusion in Wikidata.

I do not understand why your robot talks about a "redirect from Q20975616 to Q17317929" in the summary of this edit, but I know that this edit, and others mentioned in the summary details, have contributed to the confusion. I'll revert these edits, correct what I can, and add further "different from" properties to the central items (the four enumerated supra). I hope that this will lessen the risks for this particular confusion in the future. Best regards, JoergenB (talk) 20:01, 29 December 2020 (UTC)

I think different from (P1889) should be enough. Thank you for your work! — Ivan A. Krestinin (talk) 17:17, 2 January 2021 (UTC)

Ανδρέας (Greek given name) abusively mutated to Slavic equivalent(?) Анджей

Problem here may be caused by one or more edits in Q14941830... To correct it manually is hard: got an error message: 'Item Q87263878 already has label "Ανδρέας" associated with language code el, using the same description text.' Perhaps a bot can be more helpful.

Happy New Year, Ivan A. Krestinin, Klaas `Z4␟` V:  11:31, 1 January 2021 (UTC) ( on behalf of NameProjectMembers & notifiers)

Looks like everything was fixed already. Happy New Year! — Ivan A. Krestinin (talk) 17:21, 2 January 2021 (UTC)

Protection of user page

Hi! As suggested here, I've raised the protection level of your user page to administrators, in order to avoid accidental creations by registered users. If you want to create the page in the future, you can always make a request to WD:AN. Best regards and happy 2021! --Epìdosis 14:46, 1 January 2021 (UTC)

Good, thank you! Happy new year! — Ivan A. Krestinin (talk) 17:35, 2 January 2021 (UTC)

Error in BnF correction

Hi,

The correction made here is wrong; the correct ID is 12746940n.

When the value starts with FRBNF, the last character is always wrong and must be recalculated.

eru [Talk] [french wiki] 17:54, 1 January 2021 (UTC)
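The recalculation mentioned above can be sketched as follows. This is a hedged illustration, not KrBot's code: it assumes the check character follows the NOID scheme (position-weighted sum over a 29-symbol alphabet) computed over the ARK name "cb" plus the eight digits. That assumption reproduces the ID 12746940n given above. The function names and the sample FRBNF input are hypothetical.

```python
# Alphabet of the NOID check-character scheme (digits plus consonants, 29 symbols).
NOID_ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def bnf_check_char(digits):
    """Compute the check character for an 8-digit BnF number (without prefix)."""
    # Assumption: the check is computed over the ARK name "cb" + digits.
    name = "cb" + digits
    total = sum(
        NOID_ALPHABET.index(ch) * position
        for position, ch in enumerate(name, start=1)
    )
    return NOID_ALPHABET[total % len(NOID_ALPHABET)]


def from_frbnf(value):
    """Drop the 'FRBNF' prefix, keep the 8 digits, and append a recomputed check character."""
    digits = value[len("FRBNF"):][:8]
    return digits + bnf_check_char(digits)
```

For example, `bnf_check_char("12746940")` yields the final character of the ID quoted above; the trailing character of an FRBNF-prefixed input is simply discarded by `from_frbnf`.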

What did KrBot do?

Hi Ivan, I don't get the reason for this change by your bot. As far as I can see, the bot didn't change anything :-)

and of course - a happy new year. greetings from Germany --Z thomas (talk) 13:01, 2 January 2021 (UTC)

Hi Thomas, the bot removed non-printable characters at the end of the name. Happy new year! — Ivan A. Krestinin (talk) 17:53, 2 January 2021 (UTC)
Thanks. I assumed something like that. Greetings --Z thomas (talk) 18:19, 2 January 2021 (UTC)
Hi, could you also add non-breaking spaces (\u202F and \u00A0) and multiple spaces (\s{2,} -> " ") to your script, please? Recently there was a not very successful import from DACS (ping @Hannolans:), containing such spaces in en/nl labels. --Lockal (talk) 16:31, 9 January 2021 (UTC)
Ah, I wasn't aware of this; this was a download from the unmatched Mix'n'match entries that I uploaded with OpenRefine. It would be great if the bot could repair this. Fixing double spaces would also be very useful. --Hannolans (talk) 22:36, 9 January 2021 (UTC)
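The clean-up requested here could look roughly like the sketch below. It is an assumption about how such a rule might be written, not the bot's actual script; the function name `normalize_label` is hypothetical, and trimming the ends is an extra step beyond what was asked for.

```python
import re


def normalize_label(label):
    """Normalize whitespace in a label.

    1. Replace no-break spaces (U+00A0, U+202F) with ordinary spaces.
    2. Collapse runs of two or more whitespace characters into one space.
    3. Trim whitespace at both ends (extra step, not part of the request).
    """
    label = re.sub("[\u00A0\u202F]", " ", label)
    label = re.sub(r"\s{2,}", " ", label)
    return label.strip()
```

Replacing the no-break spaces first means that a mixed run such as a no-break space followed by an ordinary space is also collapsed by the second rule.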

Murder -> Homicide

Hi!

Could you check the discussion at Property_talk:P1196#Allow_assassination? and, unless there is a good reason that I and others have so far managed to overlook, kindly ask your bot to stop edit-warring? Best, --Matthias Winkelmann (talk) 19:58, 2 January 2021 (UTC)

Hi! Do you mean this edit? Please remove the corresponding {{Autofix}} rule from the Property talk:P1196 page if the rule is inapplicable in some cases. — Ivan A. Krestinin (talk) 22:53, 14 February 2021 (UTC)

Edits made based on wrong constraints on MUBI ID

Hey, Trade added two item-requires-statement constraints (Q21503247) on MUBI film ID (P7299) that weren't right, since MUBI has IDs for many films that are not on MUBI. They are basically there to show up in search and then suggest similar titles for potential viewers. Your bot is adding statements based on these constraints. Could you undo them? Thanks, Máté (talk) 05:18, 3 January 2021 (UTC)

Seconding this. Please undo these edits. Trivialist (talk) 18:50, 24 January 2021 (UTC)
Hi! Done — Ivan A. Krestinin (talk) 23:58, 14 February 2021 (UTC)
Thanks! Máté (talk) 08:21, 15 February 2021 (UTC)

Bot adding of distributed by (P750)

Hi! I have noticed that you have added a number of distributed by (P750) statements based on identifiers on items. First of all, I have not found any bot request for this work. This is needed for the approval of a new task. For music items it's not correct to add Spotify (Q689141), Tidal (Q19711013), Deezer (Q602243), etc. as distributors. That is like saying ham (Q170486) is distributed by (P750) Walmart (Q483551). Within the music industry, music distribution is the way that the music industry makes recorded music available to consumers. This is often done by a record label. So, please remove the added statements on music-related items and create a bot request for this work. --Premeditated (talk) 14:08, 5 January 2021 (UTC)

Wait, I thought we followed the same model with music releases as we do with video games and film? --Trade (talk) 15:20, 5 January 2021 (UTC)
What do you mean? Like Rocky IV (Q387638) is distributed by (P750) Metro-Goldwyn-Mayer (Q179200), not Netflix (Q907311) or Apple TV+ (Q62446736) (just examples) because it is available on those sites. For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games. --Premeditated (talk) 09:02, 6 January 2021 (UTC)
The theatrical and home media video versions of Rocky IV (Q387638) are distributed by (P750) Metro-Goldwyn-Mayer (Q179200), while the video on demand (Q723685) version is distributed by (P750) Netflix (Q907311) (by virtue of being distributed on Netflix's video-on-demand service).
'For games I guess there is more of a publisher type of distribution, but I don't know much about how that works for games' A publisher is the one who publishes the game. A distributor is the website that the game download is sold on, though sometimes there are exceptions for physical releases and streaming platforms. @Premeditated: --Trade (talk) 12:59, 8 January 2021 (UTC)
@Trade: I think you are mixing up distribution format (P437) with distributed by (P750). Like The Beatles (Q3295515) has distribution format (P437) music streaming (Q15982450). --Premeditated (talk) 13:47, 8 January 2021 (UTC)

I corrected my examples. So, why do you think that listing music streaming platforms is outside the scope of distributed by (P750)? @Premeditated:--Trade (talk) 00:42, 10 January 2021 (UTC)

@Trade: Sorry for the late response. I think that a new property named "distribution platform" should be created, which could be used for all of those platforms like Steam (Q337535), Spotify (Q689141), Microsoft Store (Q135288), etc., instead of cluttering distributed by (P750). - Premeditated (talk) 12:22, 20 January 2021 (UTC)

KrBot malfunction at Wikidata:Database reports/Constraint violations/P8988

Hello, your bot failed to detect any violations at Wikidata:Database reports/Constraint violations/P8988, which is improbable. Please could you look at what is wrong? Vojtěch Dostál (talk) 15:08, 5 January 2021 (UTC)

Looks like everything is fine with the page now. — Ivan A. Krestinin (talk) 22:45, 14 February 2021 (UTC)

KrBot malfunction at Wikidata:Database reports/identical birth and death dates

Hello, some entries at Wikidata:Database reports/identical birth and death dates were fixed some days ago but not removed, could you have a look? Some examples:

Thank you! --Emu (talk) 21:40, 6 January 2021 (UTC)

Now all the items are removed. Maybe somebody fixed the items. Or maybe it was some caching issue... — Ivan A. Krestinin (talk) 22:43, 14 February 2021 (UTC)

KrBot2 sleeping

Hello Ivan! KrBot2 has fallen asleep. Please wake it up! :) Thanks Palotabarát (talk) 00:13, 13 February 2021 (UTC)

  • Hi! The latest JSON dumps on https://dumps.wikimedia.org/other/wikidata/ are corrupted. They have a very small size (~200 MB instead of ~86 GB) and an invalid JSON structure. I added a check for the dump size. This should help. It would be good to describe the issue in some Wikimedia issue tracker. — Ivan A. Krestinin (talk) 22:34, 14 February 2021 (UTC)
Ah, I get it. Now I know where the dump is, which gives the data. Thanks for the reply! Palotabarát (talk) 00:15, 15 February 2021 (UTC)
Unfortunately, 20210303.json.gz is also corrupted. The issue is tracked here. — Ivan A. Krestinin (talk) 09:54, 8 March 2021 (UTC)
  Fixed — Ivan A. Krestinin (talk) 19:52, 29 March 2021 (UTC)
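A size check of the kind described in this thread might look like the sketch below. The threshold and the function names are assumptions chosen for illustration (the thread only gives ~200 MB for a corrupted dump and ~86 GB for a healthy one); the bot's real check is not shown on this page.

```python
import os

# Assumed threshold: well above the ~200 MB corrupted dumps,
# well below the ~86 GB of a healthy dump.
MIN_DUMP_BYTES = 10 * 1024**3


def dump_size_ok(size_bytes, min_bytes=MIN_DUMP_BYTES):
    """Return True if the dump is large enough to be plausibly complete."""
    return size_bytes >= min_bytes


def dump_file_ok(path, min_bytes=MIN_DUMP_BYTES):
    """Check the size of a dump file on disk before starting to process it."""
    return dump_size_ok(os.path.getsize(path), min_bytes)
```

A size check only catches truncated files; a dump of plausible size with invalid JSON would still have to be detected while parsing.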

Resolving redirects

My understanding is that withdrawn identifiers should be handled by deprecating them and marking them as withdrawn. Your bot is instead simply replacing them. The withdrawn (and replaced) identifiers are still used in other systems and linking them may still be desired.

For example withdrawn VIAF identifiers are still used by Worldcat Identities. Though Worldcat should update and merge their entries, until they do the old VIAF ID is still useful. Int21h (talk) 00:12, 15 March 2021 (UTC)

@Int21h: Hi! For VIAF ID (P214) there was consensus here for the removal of redirected and withdrawn IDs since VIAF clusterization has many problems (e.g. Q212872#P214) and keeping trace of it would be quite problematic. Bye, --Epìdosis 07:44, 15 March 2021 (UTC)
Ok thanks I wasn't aware of previous discussions. Good to know! Int21h (talk) 16:28, 15 March 2021 (UTC)

KrBot and Single Constraint

Hi, would it be possible, when checking the Single Constraint violations of identifiers, to ignore the ones that have a deprecated rank and reason for deprecated rank (P2241): redirect (Q45403344) as a qualifier? One example would be this one, which is listed in the constraint report. Those are considered valid values and should be kept, so having them in the report makes maintenance and cleanup harder. -- Agabi10 (talk) 18:00, 18 March 2021 (UTC)

Hello, there are technical difficulties in implementing this. Maybe I can propose an alternative. Does IMDb provide a way to get all valid (non-redirect) identifiers? If so, I can create a bot that fixes such redirects continuously. — Ivan A. Krestinin (talk) 19:56, 29 March 2021 (UTC)
I don't know if it allows getting all identifiers, but at least for now they shouldn't be replaced, as long as they have been valid identifiers they should be kept with deprecated rank. If checking the qualifier is too much trouble just ignoring the statements with deprecated rank when creating the report would be more feasible? -- Agabi10 (talk) 09:45, 7 April 2021 (UTC)
@Agabi10: That's a good interim solution - yes, skipping items with deprecated statements would really be best. Vojtěch Dostál (talk) 10:12, 7 April 2021 (UTC)
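The interim solution discussed above (ignoring deprecated-rank statements when checking the single-value constraint) can be sketched as follows. The data layout, the function name and the sample identifier strings are hypothetical; this is not KrBot's implementation.

```python
def single_value_violation(statements, ignore_deprecated=True):
    """Return True if an item has more than one distinct value for a property.

    statements: list of (value, rank) tuples, where rank is one of
                "preferred", "normal" or "deprecated".
    ignore_deprecated: when True, deprecated statements are not counted,
                       so a kept-but-deprecated old identifier is not reported.
    """
    values = {
        value
        for value, rank in statements
        if not (ignore_deprecated and rank == "deprecated")
    }
    return len(values) > 1
```

With one normal-rank value and one deprecated redirect value, the item is reported only when `ignore_deprecated` is switched off.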

Reverted merge

Hello. Q20540007 was mistakenly merged with Q17165321. Then KrBot re-linked statements pointing to the redirect. Then the merge was reverted. Could you also revert the bot-actions? Thanks in advance. Greetings, --FriedhelmW (talk) 14:52, 21 March 2021 (UTC)

Hello, Done — Ivan A. Krestinin (talk) 19:57, 29 March 2021 (UTC)
Thank you! --FriedhelmW (talk) 16:09, 30 March 2021 (UTC)

NGA number

Hello. I have seen your bot doing great work fixing up light characteristic (P1030) and ARLHS lighthouse ID (P2980). I wonder if you might be able to help with NGA lighthouse ID (P3563)? Often these are written as a 5-digit number but are missing the 3-digit volume prefix. (Compare [1]). The volume depends on the geographic area, which may be deducible from country (P17). This map shows how the 7 volumes are distributed. If you can help, that would be great. MSGJ (talk) 21:45, 22 March 2021 (UTC)

Hi, this is outside the bot's current tasks. It is better to post the request at Wikidata:Bot requests. — Ivan A. Krestinin (talk) 20:06, 29 March 2021 (UTC)

David van Dantzig

@KrBot: Hi Ivan: The University of Utrecht was never the employer of David van Dantzig. Please see the biography of van Dantzig written by Gerard Alberts, Twee geesten van de wiskunde : biografie van David van Dantzig, published in 2000, or the paper by his student Jan Hemelrijk, The Statistical Work of David Van Dantzig (1900-1959), published in 1960, or the short biography in Academic Genealogy of Mathematicians (page 310) by Sooyoung Chang, published in 2011. Moreover, Utrecht University is cited neither in the Complete Dictionary of Scientific Biography nor in MacTutor History of Mathematics.--Ferran Mir (talk) 11:22, 23 March 2021 (UTC)

Please @KrBot:, read my arguments against the statement that University of Utrecht was employer of David van Dantzig.--Ferran Mir (talk) 15:00, 23 March 2021 (UTC)
Hi, Ferran, KrBot is just a bot :) It uses a very simple rule: each item with a Catalogus Professorum Academiae Rheno-Traiectinae ID (P2862) property should have an employer (P108) = Utrecht University (Q221653) statement, according to the constraints specified on Property:P2862. This edit will help. — Ivan A. Krestinin (talk) 20:17, 29 March 2021 (UTC)
OK @KrBot: @Ivan A. Krestinin:, I have seen the exception included in the restriction. That's right! Thanks.--Ferran Mir (talk) 07:40, 30 March 2021 (UTC)

Qualifier reason for deprecated rank (P2241) on property constraints

Hi Ivan A. Krestinin,

to use Help:Property_constraints_portal/Entity_suggestions_from_constraint_definitions, some constraint statements have the above qualifier (and deprecated rank). Can you skip those constraints for Krbot? In the most recent update, the report throws an error. --- Jura 10:23, 13 April 2021 (UTC)

Hi Jura, I added the property to the ignore list. The current update is already in progress, so it will still report the error. The next update should be fine. — Ivan A. Krestinin (talk) 20:48, 24 April 2021 (UTC)

Petit-Rocher Lighthouse (Q106498634)

Bot is doing strange things on Petit-Rocher Lighthouse (Q106498634) — Martin (MSGJ · talk) 20:17, 19 April 2021 (UTC)

Bot executes rules from {{Autofix}}. The rules were added by Jura. Better to discuss the issue with him. — Ivan A. Krestinin (talk) 20:32, 24 April 2021 (UTC)
Yes, it seems to work as planned (removing the dots). --- Jura 20:48, 24 April 2021 (UTC)
It took 8 edits to do it though? — Martin (MSGJ · talk) 19:29, 25 April 2021 (UTC)
To avoid breaking things, I think I did "."→" " as sometimes the space following them was missing and some "." shouldn't be replaced.
We could probably have more Autofix rules that try to do it in fewer steps, but then these would have to be checked on every run as well.
This report has more patterns that might need to be normalized, but it's a tricky thing. --- Jura 07:11, 29 April 2021 (UTC)


constraint scope (P4680) qualifier, error on KrBot update

Hi Ivan,

Maybe the qualifier should be handled somehow or ignored.

I removed it at [2] for [3], but maybe there are cases where it's useful (possibly at this property). --- Jura 15:11, 29 April 2021 (UTC)

single-value constraint (Q19474404) is checked for the main value only. So the property does not look useful. — Ivan A. Krestinin (talk) 15:57, 10 May 2021 (UTC)
I tend to agree. I think people started adding them as there was some oddity with the Wikibase extension (initially checking by default everywhere). Or was that about the distinct value constraint? Go figure. --- Jura 09:38, 12 May 2021 (UTC)
In that case, would you be able to provide constraint reports while ignoring that qualifier (instead of throwing an error and producing no report, as is currently done)? Mahir256 (talk) 18:35, 6 June 2021 (UTC)
@Ivan A. Krestinin: Thoughts on the idea of ignoring that qualifier? Mahir256 (talk) 17:56, 17 June 2021 (UTC)
Hi! I reviewed several usages of the property. As far as I can see, it looks completely redundant. Why not just remove it? It is also very confusing because it is similar to property scope (P5314). — Ivan A. Krestinin (talk) 20:49, 21 June 2021 (UTC)
@Ivan A. Krestinin: I agree that it is redundant for you, given that you only check main values, which is why I'm asking if you could ignore it and possibly other properties not applicable in that situation when generating reports. I believe that P4680 is still useful for the gadget for which @Lucas Werkmeister (WMDE): proposed that property in the first place, and possibly for other future tools which are developed for constraint checking. (@MisterSynergy:, as the proposer of P5314, who might have more to say on that point.) Mahir256 (talk) 21:42, 21 June 2021 (UTC)
property scope (P5314) defines where a property might be used, while constraint scope (P4680) defines where the constraint should be checked.
As an example, consider identifier properties. They are usually allowed (via property scope (P5314)) as main values and references. A distinct-values constraint (Q21502410), however, should not be checked on references, as the reference value might occur on different claims and even different items.
MisterSynergy (talk) 09:39, 23 June 2021 (UTC)
distinct-values constraint (Q21502410) is checked for main values only. I do not see any reason to duplicate this fact on each property page. — Ivan A. Krestinin (talk) 22:56, 23 June 2021 (UTC)

resolved redirects

I came here to pat the bot. I had no idea that I had left so many links to my redirects. Today, there was a good bot, thanks for that.--RaboKarbakian (talk) 00:26, 2 May 2021 (UTC)

Lingering uses of redirects in statements

Hi! I understand your bot is responsible for fixing statements that refer to redirected items. I see that high mass X-ray binary (Q71963720) was redirected on 2019-12-10T17:42:33‎, but there are still many statements using the old item (e.g. here is one I fixed today). Do you know why these are not getting fixed? (I noticed this when trying to work out why there were so many type violations on Wikidata:Database_reports/Constraint_violations/P59.) Cheers, Bovlb (talk) 16:30, 26 May 2021 (UTC)

Hello, currently the bot fixes redirects only once per redirect. After this, the bot adds the item to a special "already fixed" list and ignores it. The bot fixed all redirects to Q71963720 on 2019-12-12, but after this the redirect was used by User:Ghuron, see [4] as an example. It looks like I need to create a special algorithm to detect reuse of already-fixed redirects. — Ivan A. Krestinin (talk) 09:47, 6 June 2021 (UTC)
I need to fix that in my script, thanks Ghuron (talk) 15:45, 6 June 2021 (UTC)
I also added detection of such items to my bot. Links to high mass X-ray binary (Q71963720) were fixed. — Ivan A. Krestinin (talk) 05:59, 7 June 2021 (UTC)
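The behaviour described in this thread (fix each redirect once, remember it in an "already fixed" list, and additionally detect redirects that were used again afterwards) can be sketched roughly as below. All names and the data layout are hypothetical; the bot's real algorithm is not published on this page.

```python
def redirects_needing_fix(redirects, already_fixed, referenced_items):
    """Return the redirects that should be resolved in this run.

    redirects:        dict mapping a redirected item ID to its target ID
    already_fixed:    set of redirected item IDs handled in an earlier run
    referenced_items: set of item IDs currently used as statement values
    """
    todo = {}
    for source, target in redirects.items():
        first_time = source not in already_fixed
        # Reuse: the redirect was fixed before, but statements point to it again.
        reused = source in already_fixed and source in referenced_items
        if first_time or reused:
            todo[source] = target
    return todo
```

Without the `reused` branch, a redirect that was fixed once would be skipped forever, which matches the lingering links reported above.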

Removal of "occupation" for Peter and Rosemary Grant

Hi, your bot is removing one of the occupations ("evolutionary biologist", the most important one !) of Peter and Rosemary Grant (Q3657692), as in this edit, and I cannot understand why ? (by the way, a bot should probably not repeat twice the same edit if it has been manually reverted; it should likely lead to a discussion). Cheers, Schutz (talk) 09:11, 28 May 2021 (UTC)

Hi, the bot removes person-specific properties such as birth/death dates, nationality, spoken language, occupation, etc. from items about groups of people. It is a very common mistake that is repeated many times by different bots, semi-automatic procedures and some users. — Ivan A. Krestinin (talk) 09:53, 6 June 2021 (UTC)
In any case, the bot should not blindly remove several times the same information -- it should alert the user instead. But I don't really see why "writer", in this case, is kept, while "evolutionary biologist" is not. In this case, the latter is not an error, as the couple worked together as evolutionary biologists. The problem is that by simply removing the property, nothing meaningful appears in the infobox at w:fr:Peter et Rosemary Grant (at the moment it is only "writer", translated in French). If you have any suggestion about how the interesting information can be displayed (in other words, how the Wikidata item can include the information that the pair has worked as evolutionary biologists, and not "writers", so that this information can trickle down to the infobox), I'd love to hear it. Otherwise, could you change your bot so that it does not remove this useful (and correct) information ? Many thanks in advance, Schutz (talk) 13:54, 5 July 2021 (UTC)
Wikidata is not just a collection of information. The information should also be structured. The type of error discussed is too common; too many users make it from time to time. The bot has already fixed 4649 cases of this type... 4649 notifications on users' pages would look like a spam bot :) I changed the article a bit to make its information cleaner. — Ivan A. Krestinin (talk) 22:17, 12 July 2021 (UTC)

Request

Can this bot replace %20 with a space in all current and future values of Namuwiki ID (P8885) (for example, from 머라이어%20캐리 to 머라이어 캐리)? Thanks. Hddty (talk) 01:05, 9 June 2021 (UTC)

  Done — Ivan A. Krestinin (talk) 17:00, 10 June 2021 (UTC)
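The requested replacement is a plain string substitution. A minimal sketch, with a hypothetical function name and limited strictly to what was asked (only "%20" is touched, no other percent-escapes):

```python
def fix_namuwiki_id(value):
    """Replace the percent-encoded space '%20' with a literal space."""
    return value.replace("%20", " ")
```

A general percent-decoder would also rewrite other escapes, which goes beyond the request, so the sketch deliberately avoids it.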

Resolving redirect (Q12368077_Q5334199)

As a consequence of a bad merge there are now 292 erroneous links[5]. These links should be "unresolved". May I also suggest that your bot wait longer after a merge before resolving links, perhaps a week, or even a month, so that it is likelier that bad merges are caught beforehand. 2001:7D0:81DA:F780:A8E0:C965:57F9:B464 06:53, 24 June 2021 (UTC)

  Done — Ivan A. Krestinin (talk) 07:59, 26 June 2021 (UTC)

Update the report of P2991

Hi Ivan. Could you maybe let your KrBot run over the property IBSF athlete ID (P2991) regarding Wikidata:Database reports/Constraint violations/P2991? I had removed several bugs, but I also noticed that there are definitely false messages and I would be interested to know whether these will now be removed. --Gymnicus (talk) 11:25, 25 June 2021 (UTC)

Thank you --Gymnicus (talk) 14:54, 25 June 2021 (UTC)

To merge

Why no updates of User:Ivan A. Krestinin/To merge anymore? - FakirNL (talk) 08:24, 2 July 2021 (UTC)

Hi! The bot failed on wrong different from (P1889) values like this and this. I added some checks to skip such values. The reports should be updated in 1-2 days. — Ivan A. Krestinin (talk) 09:26, 3 July 2021 (UTC)

Wrong statements based on wrong constraint

Hey, could you please revert edits based on this statement? The constraint was erroneous. – Máté (talk) 04:50, 4 July 2021 (UTC)

  Done — Ivan A. Krestinin (talk) 10:45, 4 July 2021 (UTC)


replacement value (P9729)

Hi Ivan,

Similarly to replacement property (P6824), can you ignore this when found in property constraint? Otherwise KrBot would generate an error. --- Jura 11:33, 11 July 2021 (UTC)

Hi Jura, interesting property. I added it to the ignored list for conflicts-with constraint (Q21502838) and none-of constraint (Q52558054). But it looks like the bot should do something more than just ignore this property. Currently the property is used somewhat randomly. For example, I do not understand the property's usage for location of creation (P1071):
Do you have any ideas how to make its usage more structured? — Ivan A. Krestinin (talk) 21:22, 12 July 2021 (UTC)
I had seen that use too, but wasn't sure what to think of it. The problem seems to be that the replacement isn't always applicable. Personally, I'd remove that.
The samples at replacement value (P9729) are closer to how I'd use them. @Dhx1: documented them at Help:Property_constraints_portal/None_of.
I noticed platform (P400) has plenty of constraints that can use it. --- Jura 14:54, 13 July 2021 (UTC)

Thank you

Just wanted to drop a big "thank you" for cleaning up the apparent mess that Pi bot made with all the duplication. Great catch. Huntster (t @ c) 23:52, 15 August 2021 (UTC)

New constraint Label in Language

Hi Ivan,

Please see phab:T195178. To test the future deployment, I added label in language constraint (Q108139345) at [6]. You might need to have Krbot skip it. --- Jura 12:30, 7 September 2021 (UTC)

Hi Jura, I added a fake implementation for the constraint. It is not so hard to add a real implementation, but this requires loading information about labels, which takes some memory. Unfortunately, memory is a critical resource now. See my message below for details. — Ivan A. Krestinin (talk) 20:58, 8 September 2021 (UTC)
I had previously implemented it with complex constraints, see Help:Property_constraints_portal/Label_language. --- Jura 07:59, 9 September 2021 (UTC)

KrBot stuck? (Wikidata:Database_reports/Constraint_violations/P2088)

Wikidata:Database_reports/Constraint_violations/P2088 has not been regenerated for 16 days.

Is there a way to force it to revalidate this property? --Vladimir Alexiev (talk) 14:24, 8 September 2021 (UTC)

  • Hi Vladimir, the report was updated just now. About the situation in general: it was 10 days, not 16. Unfortunately, I have a hardware issue. Wikidata is growing. The bot now requires 106 GB of memory to load and process all the data, but my server has only 32 GB of RAM. Using an SSD for swap makes the processing possible, but very slow: ~10 days per update cycle. I have already made several memory usage optimizations. This helped, but Wikidata continues to grow. — Ivan A. Krestinin (talk) 20:53, 8 September 2021 (UTC)
    106 GB is a lot, but couldn’t you get more RAM by moving to Wikimedia servers? This task is critical to the Wikidata community, so I think WMF should provide the amount of resources reasonably needed if you decide to move your bot to their servers. —Tacsipacsi (talk) 21:36, 8 September 2021 (UTC)
    I created a new task for discussion of this, see https://phabricator.wikimedia.org/T290635 --So9q (talk) 09:16, 9 September 2021 (UTC)
  • Thanks Ivan! @Tacsipacsi: completely agree, WMD should take over this crucial service.
# Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial
SELECT DISTINCT ?item ?itemLabel ?value WHERE {
	?statement wikibase:hasViolationForConstraint wds:P2088-DD4CDCEA-B3F6-4F02-9CFB-4A9E312B73A8 .
	?item p:P2088 ?statement .
	?statement ps:P2088 ?value.
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
  • Unfortunately this returns fewer violations compared to the pages generated by KrBot. See the comment in the query: "Note: before https://phabricator.wikimedia.org/T201150 is fixed, the result will only be partial"
  • @Ivan A. Krestinin: Will you open-source KrBot? Otherwise WMDE won't take over its running, and seem willing to rewrite the bot. Please comment in https://phabricator.wikimedia.org/T290635
    • I posted some comments on the tasks. I can publish the constraints-related code. But continuing with two implementations living in parallel (one inside the WD core, another in the bot) is not a good direction. — Ivan A. Krestinin (talk) 23:10, 18 September 2021 (UTC)
    Any updates? The bot is still not refreshing constraint violations. Germartin1 (talk) 20:21, 4 December 2021 (UTC)
    The refresh is nearly complete now. The bot skipped one update cycle due to a bug. Good news: I made some optimizations. The next update cycle should take ~5.5 days instead of ~9 days. — Ivan A. Krestinin (talk) 22:25, 10 December 2021 (UTC)
    More good news: I made more optimizations. Update cycle now is less than 1.5 days. — Ivan A. Krestinin (talk) 12:50, 5 January 2022 (UTC)

"Unique value" violations due to duplicate external-id

Looking at https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P2088#%22Unique_value%22_violations, we see many Qnnn values that are the same.

I described them as "false positives" but then looked at some instances eg https://www.wikidata.org/wiki/Q5013693#P2088 and see that indeed there's a problem: the same external-id is recorded with and without a reference. The one without reference should be removed --Vladimir Alexiev (talk) 12:55, 9 September 2021 (UTC)

Culture Bas-Saint-Laurent

[Original message in French]

Hi Ivan,

As an organization, we have taken care to complete the entry for Culture Bas-Saint-Laurent (Q108475391). We are going to request the removal of entry Q87727973, since it is now obsolete.

Thank you.

 – The preceding unsigned comment was added by an unidentified user.

  • I apologize. I do not understand your message. Google Translate did not help me. Could you write it in English? — Ivan A. Krestinin (talk) 23:18, 18 September 2021 (UTC)

Conditional item-requires-statement constraint (Q21503247)

Hi Ivan,

{{Autofix}} allows adding additional statements based on existing values.

An interesting enhancement could be to do this as a constraint as well, e.g.

if currentProperty + currentPropertyValue then requiredProperty + requiredPropertyValue

Also:

if currentProperty + currentPropertyValue then requiredProperty

Maybe the property in the condition could be an argument as well:

if currentProperty + otherProperty + otherPropertyValue then requiredProperty
if currentProperty + otherProperty + otherPropertyValue then requiredProperty + requiredPropertyValue

--- Jura 09:46, 12 September 2021 (UTC)

Hi Jura, could you provide some examples for testing? — Ivan A. Krestinin (talk) 10:46, 19 September 2021 (UTC)
How about these? --- Jura 14:28, 19 September 2021 (UTC)
I misread your message the first time. Now I understand your idea. It is possible to create such a constraint. But maybe {{Autofix}} is enough? Do we need to additionally control such cases using constraints? — Ivan A. Krestinin (talk) 14:46, 19 September 2021 (UTC)
I will try to dig up better examples. When Autofix is (safely) possible (for item datatype properties with a predefined value), the constraint wouldn't be that useful. --- Jura 14:53, 19 September 2021 (UTC)

Samples:

Sorry for the delay. Happy holidays. --- Jura 00:32, 24 December 2021 (UTC)
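
The conditional rules proposed in this thread can be sketched in Python. This is only an illustration, not KrBot or Wikibase code: the function name and the simplified item shape (a dict from property ID to a list of values) are assumptions made for the example.

```python
from typing import Optional


def violates_conditional_requirement(
    item: dict,
    condition_property: str,
    condition_value: str,
    required_property: str,
    required_value: Optional[str] = None,
) -> bool:
    """Return True if the condition holds but the requirement is not met.

    `item` is a simplified, hypothetical shape: property ID -> list of values.
    """
    # The rule only applies when the item carries the condition statement.
    if condition_value not in item.get(condition_property, []):
        return False
    values = item.get(required_property, [])
    if required_value is None:
        # "then requiredProperty": any value satisfies the rule.
        return not values
    # "then requiredProperty + requiredPropertyValue": a specific value is needed.
    return required_value not in values
```

Because the condition property is an ordinary argument, the same function covers both the currentProperty form and the otherProperty form of the rule.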

COSPAR and Coordinates

Hello Ivan. Regarding your reverts to my removals at COSPAR ID (P247) and coordinate location (P625), you have to realize that more than just satellites are sent into space and receive COSPAR IDs. Probes sent to other worlds have need for coordinates (primarily Mars since that planet is supported in our system, but others are as well), and it's entirely possible that Earth-bound spacecraft may potentially have a need for it as well. My point is, making these two properties mandatorily conflicting doesn't make sense in modern spaceflight. Huntster (t @ c) 13:25, 19 September 2021 (UTC)

The landing point is just one of the points in a spacecraft's life. So we can specify coordinates for some event, but not for the spacecraft as a whole. It is the same as specifying geographic coordinates for a human. Just add geo coords as a qualifier to some event. Like this or this. — Ivan A. Krestinin (talk) 13:49, 19 September 2021 (UTC)


spouse (P26) duplicate statements

Hi Ivan,

What do you think of Wikidata:Bot_requests#Merge_multiple_P26_statements? Didn't your bot merge some statements? --- Jura 21:15, 22 September 2021 (UTC)

Hi Jura, usually my bot does not clean up such duplicates because the values have different qualifiers. I started a special job for this case. It is in progress now. — Ivan A. Krestinin (talk) 23:01, 22 September 2021 (UTC)
  Done, please check the remaining 18 items manually. The bot failed to resolve the data conflicts in them. — Ivan A. Krestinin (talk) 21:49, 23 September 2021 (UTC)

Wikidata:Database reports/Constraint violations/P1397#"Single value" violations

Good day. Some strange update has arrived here. The first and second items have no second value, and the third has no property at all. And it looks like this almost everywhere. It feels as if the bot generated the report back on 17 September but published it only now, on 28 September. 185.16.139.123 20:52, 28 September 2021 (UTC)

  • Everything has updated, thanks. One more question: how do I set up exceptions correctly? Some red error popped up there. 185.16.139.123 01:21, 29 September 2021 (UTC)
  • And Rienioja (Q4395039) needs to be excluded from Single value somehow: 0405722, 0405733. 185.16.139.123 02:25, 29 September 2021 (UTC)
    • Greetings, constraints and their exceptions are configured on the Property:P1397 page. I think I fixed everything, so it should work as needed. I also set up automatic replacements of the form 123456 -> 0123456; after some time the bot will fix all detected problems of this kind. — Ivan A. Krestinin (talk) 22:39, 29 September 2021 (UTC)
      • Thanks. Nothing is made for people: it turns out anonymous users cannot configure properties. 185.16.139.123 22:56, 29 September 2021 (UTC)
        • I did not know that. Maybe if you explain the reasons for working without an account well at Wikidata:Administrators' noticeboard, that specific page will be unlocked. — Ivan A. Krestinin (talk) 23:06, 29 September 2021 (UTC)
          • No, they are all disabled. It is not even protection; there are simply no edit buttons on any Property. In such a situation, explaining anything higher up is useless. 185.16.139.123 00:10, 30 September 2021 (UTC)
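
The 123456 -> 0123456 replacement mentioned in this thread amounts to prefixing a zero to six-digit values. A minimal Python sketch, assuming the rule applies only to values that are exactly six ASCII digits (the function name is hypothetical, and the real rule is whatever is configured for the property):

```python
import re

# Assumption: only values made of exactly six ASCII digits are rewritten.
SIX_DIGITS = re.compile(r"[0-9]{6}")


def pad_identifier(value: str) -> str:
    """Prefix a zero to a six-digit value; return anything else unchanged."""
    if SIX_DIGITS.fullmatch(value):
        return "0" + value
    return value
```

Values that already have seven digits, such as 0405722, pass through untouched, so applying the rule repeatedly is harmless.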

DOI format restriction

Hi, I noticed something strange at the DOI property, maybe you can identify the root of the problem? As you can see in Wikipedia in Health Professional Schools: from an Opponent to an Ally (Q108747926), the DOI property is falling under a format restriction, [a-z]*. Not sure how to fix it. Good contributions, Ederporto (talk) 00:13, 30 September 2021 (UTC)

Hello, just use upper case: [7] — Ivan A. Krestinin (talk) 16:09, 30 September 2021 (UTC)
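
For anyone preparing DOI values in bulk before adding them, the upper-case advice can be applied with a one-line normalization. This is a sketch with a hypothetical helper name; it relies on DOI names being case-insensitive, so upper-casing does not change which document is identified.

```python
def normalize_doi(doi: str) -> str:
    """Trim surrounding whitespace and upper-case the DOI."""
    return doi.strip().upper()
```

For example, normalize_doi("10.1000/abc123") gives "10.1000/ABC123" (an illustrative DOI, not one from this discussion).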

Men's basketball

The bot is adding "men's basketball" to male basketball players. This property is not for individuals; it is for clubs, teams, competitions. Therefore all those bot contributions create an exclamation mark (!) which can be avoided by stopping this activity. When I remove the thingy from the individual sportsmen's items, the bot comes and adds it again! (Now I used an exclamation mark. :) Cheers. --E4024 (talk) 15:23, 3 October 2021 (UTC)

Hi, could you provide a link to a sample edit? — Ivan A. Krestinin (talk) 19:26, 3 October 2021 (UTC)
Ömer Faruk Yurtseven (Q18129444) and many others... --E4024 (talk) 23:13, 3 October 2021 (UTC)
This bot behavior was caused by this edit. I deleted the constraint, so the bot will not make such edits anymore. I also added a conflicts-with constraint for better protection. What should we do with the existing values of competition class (P2094) in human (Q5) items? 1. Delete all such values. 2. Delete only those added by my bot. 3. Move all values to sports discipline competed in (P2416). 4. Something else? — Ivan A. Krestinin (talk) 16:30, 4 October 2021 (UTC)

Merge overlaps

Hi. I am trying to work out how we can avoid reporting different versions of a work as duplicates, without having to put them on a "do not merge" list or mark them as different. Example

Here we have the parent work, and respective versions or translations xwiki, and they are listed on the parent. We will have an expectation that this would be a widespread situation as more and more works are transcribed. Is it worthwhile not listing as duplicate where they are both listed on the parent with has edition or translation (P747). Thanks for the consideration.  — billinghurst sDrewth 23:47, 10 October 2021 (UTC)

Hello, it is better to ask User:Pasleim about this. He is the author of the reports. Possibly adding different from (P1889) would help. — Ivan A. Krestinin (talk) 17:09, 11 October 2021 (UTC)

Wikidata:Database reports/Constraint violations/P1566

Good day. The GeoNames reports are some kind of hell; sorting them out would take two or three lifetimes. Could you check that the settings are accurate? Alternatively, split the lists by country, call for volunteers... In short, the problem is not solving itself. 194.50.15.241 05:35, 12 October 2021 (UTC)

Greetings, a similar hell is going on here in practically every popular property. Unfortunately, I cannot keep up with all the properties. You can find the report generation settings on Property:P1566. Problems that are not too widespread, for example incorrect format, are easier to fix by hand; there are only 10 items there. For mass problems you can try to identify groups of errors and propose some automated procedures for fixing them. Wikidata:Bot requests can help you with this. — Ivan A. Krestinin (talk) 05:59, 12 October 2021 (UTC)
What I was more interested in is whether your bot could sort the problems by item type (rivers, lakes, mountains) and by country (Russia, CIS). That would then be feasible to work through. 194.50.15.241 20:13, 12 October 2021 (UTC)
Yes, this can be achieved by adding the group by (P2304) property. See how it is done, for example, here: Property:P1538. There is one unpleasant detail, though: grouping by two properties at once will not work. It is either by country or by type. You can also try to build an arbitrary report using SPARQL. — Ivan A. Krestinin (talk) 20:48, 12 October 2021 (UTC)
Anonymous users are forbidden from editing properties. Could you help? Types: lake (Q23397), river (Q4022), mountain (Q8502); countries: Russia (Q159), Ukraine (Q212), Belarus (Q184). 194.50.15.241 18:48, 13 October 2021 (UTC)
I added grouping by country. But it is better to create an account and continue yourself. Or use SPARQL. An example that gets all items in Russia that have more than one code:
SELECT ?item ?itemLabel
WHERE
{
	{
		SELECT DISTINCT ?item {
			?item wdt:P1566 ?value1 .
			?item wdt:P1566 ?value2 .
			?item wdt:P17 wd:Q159 .
			FILTER( ?value1 != ?value2 ) .
		}
	} .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "ru,en" } .
}
— Ivan A. Krestinin (talk) 21:33, 14 October 2021 (UTC)
Thank you. As I understand it, the grouping will appear when the report is updated? The current version is from 8 October. 194.50.15.241 03:18, 16 October 2021 (UTC)
Yes, at the next update. The Wikidata database has grown a lot recently; unfortunately the bot now needs about ten days to generate the next version of the reports. SPARQL is more convenient in this respect. — Ivan A. Krestinin (talk) 15:14, 16 October 2021 (UTC)

Q10497835

Is it possible to reverse this replacement? Eurohunter (talk) 15:48, 15 October 2021 (UTC)

  Done — Ivan A. Krestinin (talk) 09:53, 16 October 2021 (UTC)

An actress changes sex and becomes an actor... :-)

Hello Ivan. In the item Maurane (Q509029), a Belgian female singer, KrBot repeatedly changes another one of her occupations, actress, to its male counterpart: "actor". I don't know why but could you please solve this? Thanks a lot in advance: Tatvam (talk) 16:10, 15 October 2021 (UTC)

@Tatvam: This processing is quite intentional. Because it's not that the data object actor (Q33999) only describes male actors, but also female actors, i.e. actresses. The item is no different from the data object singer-songwriter (Q488205), which also describes both female and male persons. --Gymnicus (talk) 18:45, 15 October 2021 (UTC)
Thank you for your answer, but I used the data object actress (Q21169216), not actor (Q33999) and I would like it to stay like that. It is KrBot which repeatedly changes actress (Q21169216) to actor (Q33999) without reason. Tatvam (talk) 18:57, 15 October 2021 (UTC)
@Tatvam: If you do not want this change you would have to raise this concern on the discussion page of occupation (P106), because there the bot is asked to make these changes. --Gymnicus (talk) 19:19, 15 October 2021 (UTC)
Hello, @Tatvam:, KrBot makes the changes because Property talk:P106 has an {{Autofix}} template for this value. Please discuss the case on Property talk:P106 and delete the autofix if required. — Ivan A. Krestinin (talk) 10:47, 16 October 2021 (UTC)

Problematic bot

Your bot (KrBot) is replacing proper item (Q4354683) with some nonsensical disambiguation page (Q3537858), tens of pages are affected. Please stop that. --Orijentolog (talk) 18:17, 15 October 2021 (UTC)

@Orijentolog: Ivan can do very little about this now. The bot is programmed to resolve redirect links, and the two data objects mentioned were merged by a user on May 18, 2021 and only separated from each other on October 4th. In the meantime, the bot did its job and replaced the redirect links. The bot cannot see that the merge was wrong. --Gymnicus (talk) 18:41, 15 October 2021 (UTC)
Thanks for the info, it's mostly OK now because I fixed most mistakes manually. I just want to be sure that bot won't repeat the same mistakes. Greetings to both. :) Orijentolog (talk) 18:45, 15 October 2021 (UTC)
@Orijentolog: The bot should not change it back, since it is no longer a redirect link and in principle the bot ignores these links now. If such changes happen, then something is wrong with the programming. --Gymnicus (talk) 19:06, 15 October 2021 (UTC)
Hi, Orijentolog, I reverted the bot's changes. The bot waits for some time before resolving redirects. But a wrong merge that exists for a long time creates issues not only for my bot: humans and other bots use only the target item. So reverting an old wrong merge requires reviewing all links in any case. — Ivan A. Krestinin (talk) 14:35, 16 October 2021 (UTC)

Lexeme constraint

Did you see Property talk:P1296#Lexeme language? FogueraC (talk) 07:43, 16 October 2021 (UTC)

Hi, I get too many notifications, so I did not see the {{Ping}} mention. Sorry. — Ivan A. Krestinin (talk) 15:10, 16 October 2021 (UTC)
No problem. And thanks! FogueraC (talk) 15:59, 16 October 2021 (UTC)

detected wrong merge

I just discovered a faulty merge, which unfortunately also led to edits by your bot. Could you see if you can undo the edits made by your bot where it changed the data object Chowdhury (Q30971895) to Chowdhry (Q1068345)? --Gymnicus (talk) 22:59, 16 October 2021 (UTC)

I've created a new item - Q108911685 for non-Latin surnames. Probably it also needs separation. --Infovarius (talk) 23:39, 16 October 2021 (UTC)
@Infovarius: Thank you, I also see that as useful. But shouldn't we also separate the individual languages (Bengali, Nepalese, Urdu and Newari)? At least the names look very different to me as a layman who has no idea about these languages. --Gymnicus (talk) 11:07, 17 October 2021 (UTC)
I reverted my bot edits. — Ivan A. Krestinin (talk) 01:56, 17 October 2021 (UTC)
Thank you very much --Gymnicus (talk) 11:07, 17 October 2021 (UTC)

Why removing constraint status (P2316) from IAPH code (P8425) and Guía Digital del Patrimonio Cultural de Andalucía identifier (P3318)

Hi;

Why have you removed this property? I would ask you to undo the change, please. —Ismael Olea (talk) 10:58, 31 October 2021 (UTC)

Hi, both have too many violations (more than four hundred). That is too many for a mandatory constraint (Q21502408). This flag was created for monitoring and manually fixing a small number of unexpected cases. But the mechanism is broken now: Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations stopped updating because its current size is ~7 Mb (the page size limit is 2 Mb). — Ivan A. Krestinin (talk) 11:26, 31 October 2021 (UTC)

Removing Swedish Open Cultural Heritage URI (P1260)

Why is KrBot removing Swedish Open Cultural Heritage URI (P1260) like this? Swedish Open Cultural Heritage URI (P1260) is allowed to have duplicate values. /ℇsquilo 13:40, 4 November 2021 (UTC)

Why would you want a duplicate value? — Martin (MSGJ · talk) 15:14, 4 November 2021 (UTC)
Several exactly equal values are a mistake in most cases. I can add Swedish Open Cultural Heritage URI (P1260) as an exception. But maybe it is better to add some qualifier to the values? For example applies to part (P518). Currently the values look very strange to humans too. It looks like Wikidata has a single item for the lighthouse, and the Swedish database has a single record as well. It is not obvious why the identifier should be specified twice. — Ivan A. Krestinin (talk) 15:24, 4 November 2021 (UTC)
I guess the squirrel didn't notice that the values were exactly the same! — Martin (MSGJ · talk) 18:09, 4 November 2021 (UTC)
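
The duplicate rule discussed in this thread (identical values are removed unless a qualifier distinguishes them) could look roughly like the Python sketch below. It is not KrBot's code: the flat statement shape, with a "value" string and an optional "qualifiers" dict of strings, is a simplification assumed for the example.

```python
def drop_exact_duplicates(statements: list) -> list:
    """Keep the first statement for each (value, qualifiers) combination."""
    seen = set()
    kept = []
    for statement in statements:
        # Qualifiers are part of the key, so a qualified value is never
        # treated as a duplicate of an unqualified one.
        qualifiers = tuple(sorted(statement.get("qualifiers", {}).items()))
        key = (statement["value"], qualifiers)
        if key not in seen:
            seen.add(key)
            kept.append(statement)
    return kept
```

With this rule, adding applies to part (P518) to one of two equal values is enough to keep both.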

Constraint Violation Statistics

Hi Ivan,

I'm currently conducting research on the constraints violations of Wikidata and I have found your bot KrBot2. My question is if the queries/scripts for violations counting are available in some git repo, or if there is another way to get them. Thank you for your time!

Cheers, Nicolas

Hi Nicolas, the code is currently located in a private repo. The code loads full and incremental Wikidata dumps. This process takes ~9 days and requires a significant amount of memory. So I am not sure that the code will be useful for your task. Maybe this report will be enough for your research. — Ivan A. Krestinin (talk) 14:36, 5 November 2021 (UTC)

Commons gallery moved to Commons category

Hi! Why did the bot make this edit? The gallery and the category are two completely different concepts; existence of the one doesn’t imply the existence of the other, and even if they both exist, their name may not match (cf. c:Category:Moscow vs c:Москва). Now no statement carries the information that there’s a gallery about Evolution. And I don’t even see anything on Property talk:P935 that would instruct the bot to do so, so I can’t stop it. (Which is unfortunately often an issue with your bots: not open source so I can’t use the source look, often unclear edit summaries, no community control over certain tasks.) —Tacsipacsi (talk) 14:07, 7 November 2021 (UTC)

Hello, Evolution (Q336251) does not have a gallery on Commons. Evolution (software) is a redirect to the category. I agree, the edit summary is a bit confusing in this case. — Ivan A. Krestinin (talk) 16:24, 7 November 2021 (UTC)
I see. Yes, please use an appropriate edit summary in this case—the bot didn’t move the statement value, it removed it because it was no longer appropriate. As I explained above, I can’t even imagine a case where this edit summary was appropriate assuming the edits themselves are correct. —Tacsipacsi (talk) 00:55, 8 November 2021 (UTC)

Update on constraints reports

Nice bot. Any chance to get an update on the constraints report for Norwegian historical register of persons ID (P4574)? I might report a few more properties with recently added constraints soon assuming that's ok. Thanks. --Infrastruktur (T | C) 07:45, 18 November 2021 (UTC)

Hello, the bot does not touch a report if the only change is the item count or date. I updated it manually. The bot did not detect any constraint violations. — Ivan A. Krestinin (talk) 22:07, 18 November 2021 (UTC)

How to undo consequence of an incorrect merge?

Hi,

There was an incorrect merge of Alsace wine (Q80114014) and Alsatian Vineyard Route (Q1334019) (done by Andrew Dalby and undone by Jon Harald Søby last October, thanks).

But I noticed today that your bot replaced the first with the second, which makes the current situation a mess: 700+ wines are now defined as a road... (and this also causes 700+ constraint violations), see Q41437058 for one example.

My question: is there a simple way to undo the replacement? (or at least to find the list to do an overwrite with Quickstatements).

Cheers, VIGNERON (talk) 12:45, 20 November 2021 (UTC)

Sorry about that. It seemed useful at the time and I had no idea that this chain reaction would happen. Andrew Dalby (talk) 14:05, 20 November 2021 (UTC)
@Andrew Dalby: no problem, it happens... that's why I'm very careful when merging, there can quickly be dire consequences but errare humanum est so shikata ga nai. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)
Thanks for your reply, @VIGNERON:. Thinking it over, I guess one should consider before merging whether the pages have, or ought to have, the same instance of (P31) values. It will normally be true. But in this case it wouldn't have been true, and that would have been a warning. Andrew Dalby (talk) 15:38, 23 November 2021 (UTC)
Hello, I rolled back the link changes. Previously, links like this were also used for rollback. See User_talk:Ivan_A._Krestinin/Archive#suggestion: using edit groups for solving redirects. for details. @Pintoch:, @Pasleim: currently the edit group tool shows an 'Edit group "Q80114014_Q1334019" not found' error. Is it a bug? — Ivan A. Krestinin (talk) 22:00, 20 November 2021 (UTC)
@Ivan : yes, I noticed this error too, I was wondering if it was just me or not. Anyway, thanks for the quick answer and I'll let you look into it. Cheers, VIGNERON (talk) 08:14, 21 November 2021 (UTC)
@VIGNERON, Ivan A. Krestinin: thanks for the notification. KrBot still seems to have its edits tracked by EditGroups (https://editgroups.toolforge.org/?user=KrBot) but somehow this batch seems to have been missed, it is not clear to me why. I will look into the problem. − Pintoch (talk) 14:36, 24 November 2021 (UTC)

Unitless range constraint (Q21510860) for united quantity isn't checked "correctly"

Many properties are naturally expressed with units, but have unitless range constraints. (This should be deprecated and fixed, but that's another discussion).

For example, duration (P2047) has a constraint which Wikibase interprets as meaning that the maximum allowed duration is a billion seconds.

KrBot2 currently doesn't list violations of the billion-seconds constraint, such as 70 years or more after author's death (Q29870196). Wikibase does. (In this case, the constraint is inappropriate and should be removed, IMHO).

If it's not too hard to do, it would be nice if KrBot2 and Wikibase could agree on how to interpret such constraints.

Streetmathematician (talk) 13:29, 22 November 2021 (UTC)

Originally the "Range" constraint checked the value only; units were ignored by this constraint. It looks like it was reimplemented in Wikibase using a somewhat strange normalization algorithm. I agree that some properties may require taking units into account, but there are few examples of such cases. The constraint on duration (P2047) looks like an erroneous use of the "Range" constraint. For example, the lifetime of the Sun is more than 1000000000 seconds. I fixed the constraint. The reason why units are not supported by the Range constraint is very simple: conversion from one unit to another might be non-trivial. For example Mach number (Q160669) -> kilometre per hour (Q180154). Also, the set of all units used on Wikidata is not defined. I can implement unit support for some specific cases, but not for all possible units. — Ivan A. Krestinin (talk) 17:05, 23 November 2021 (UTC)
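
The difference between the two interpretations can be sketched in Python. This is an illustration only, not the KrBot2 or Wikibase implementation: the function names, the small conversion table, and the use of a Julian year of 365.25 days are all assumptions made for the example.

```python
# Hypothetical conversion table; real unit handling would need far more care.
SECONDS_PER_UNIT = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "year": 31_557_600,  # Julian year of 365.25 days, an assumption
}


def in_range_value_only(amount: float, minimum: float, maximum: float) -> bool:
    """Value-only reading: compare the bare number and ignore the unit."""
    return minimum <= amount <= maximum


def in_range_with_units(amount: float, unit: str, minimum: float, maximum: float):
    """Normalizing reading: convert to seconds first.

    Returns None when the unit has no known conversion, which is the
    difficulty with arbitrary units raised above.
    """
    factor = SECONDS_PER_UNIT.get(unit)
    if factor is None:
        return None
    return minimum <= amount * factor <= maximum
```

A value of 70 years passes the value-only check against a maximum of 1000000000 but fails once converted to seconds, which matches the disagreement reported in this thread.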

Please don't edit Q4233718

The edits to anonymous (Q4233718) by your bot are incorrect. Please make sure that your bot doesn't edit that item. Multichill (talk) 16:45, 27 November 2021 (UTC)

Fixed: [8], [9], [10], [11], [12]. — Ivan A. Krestinin (talk) 21:11, 27 November 2021 (UTC)

Validation link

Hello! Your bot used old dead link vwo.osm.rambler.ru for this list: Wikidata:Database reports/Constraint violations/P884. Машъал (talk) 11:25, 8 December 2021 (UTC)

Greetings, the bot always takes the first value from the formatter URL (P1630) property, regardless of its rank. That is the limitation. I worked around the problem by swapping the patterns. Come to think of it, what is the point of keeping outdated patterns on the property page... History belongs on the history page... — Ivan A. Krestinin (talk) 22:33, 10 December 2021 (UTC)
Thanks. I do not know why either, especially since the link is dead. But someone set it up that way; maybe the rules require it? Машъал (talk) 19:06, 14 December 2021 (UTC)
No, there are no special rules on this; someone simply did not want to delete the outdated link. — Ivan A. Krestinin (talk) 21:18, 14 December 2021 (UTC)
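
A rank-aware way to choose the formatter URL (P1630) value, instead of always taking the first one, might look like the following Python sketch. This is not how KrBot works (the thread says the bot takes the first value); the statement shape with "url" and "rank" keys and the function name are assumptions for illustration.

```python
# Wikidata statement ranks, ordered from best to worst.
RANK_ORDER = {"preferred": 0, "normal": 1, "deprecated": 2}


def pick_formatter_url(statements: list):
    """Return the URL of the best-ranked non-deprecated statement, or None."""
    candidates = [s for s in statements if s["rank"] != "deprecated"]
    if not candidates:
        return None
    # min() returns the first statement among those sharing the best rank,
    # so the original order still breaks ties.
    return min(candidates, key=lambda s: RANK_ORDER[s["rank"]])["url"]
```

With such a rule, an outdated pattern could stay on the property page with deprecated rank without ending up in the reports.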

Scholarly article duplicates

Hi - I've gone through the list at User:Ivan A. Krestinin/To merge/Scholarly articles as it was a few weeks ago, and merged a large number of them (over 1000). However I see the list has been updated. Would it be possible to sort this list by items recently created (for example Q109?????? duplicates at the top, etc.) Or is the list possibly incomplete and items might be added that were created a long time ago but weren't caught by your checks yet? ArthurPSmith (talk) 21:29, 13 December 2021 (UTC)

Oh - I just realized you put it in a sortable table so I could have sorted on Qid from the start! Anyway, I guess I'll wait for the next update to this to see what I may have missed. I've contacted one person who was creating duplicates and that seems to have ceased, so hopefully we won't get so many going forward. ArthurPSmith (talk) 21:35, 13 December 2021 (UTC)
Hi Arthur, you did a great job, thank you! The list is incomplete of course. It is limited by size (1 Mb). The full report size is 36 Mb now. The bot sorts the report by an internal rank, so the end of the full report contains mostly false positives. I update the report after Wikidata:Database reports/Constraint violations/P356 and some other pages are updated. — Ivan A. Krestinin (talk) 21:17, 14 December 2021 (UTC)

Database reports/identical birth and death dates

The valuable report Wikidata:Database reports/identical birth and death dates/1 seems to contain a lot of matches with 1 January for date of birth or date of death at the moment, most of which are probably spurious precision. Would it be worth suppressing these, as they very seldom represent an actual match ? Jheald (talk) 13:00, 23 December 2021 (UTC)

For me, the best solution here is just to fix the wrong values. Did you contact the user who created the values with wrong precision? Maybe they have some tool for fixing them. — Ivan A. Krestinin (talk) 22:26, 23 December 2021 (UTC)
You're probably right, that it's better to try to fix issues than just to hide them. I've been in touch with Ghuron, who created about 760 entries like this as part of a recent upload of data from The Righteous Among the Nations Database (Q77598447) at Yad Vashem. Some others may have been created back in 2014 by User:GPUBot (since blocked). There may also be others again, created in other uploads (cf https://w.wiki/4bcR - quite a diverse set of items); but with luck the situation should become clearer once the Yad Vashem ones are sorted out. Jheald (talk) 17:23, 24 December 2021 (UTC)
I started task that fixes January 1 values from The Righteous Among the Nations Database (Q77598447), edit example: [13]. — Ivan A. Krestinin (talk) 08:31, 25 December 2021 (UTC)
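
A fix of this kind can be sketched in Python on the Wikibase time value format, where precision 11 means day and 9 means year. The function below is only an illustration with a hypothetical name, not the task the bot ran; it is safe only for values from a source known to record nothing more precise than the year, as in the upload discussed here.

```python
import re

DAY_PRECISION = 11   # Wikibase precision code for "day"
YEAR_PRECISION = 9   # Wikibase precision code for "year"

# Matches timestamps such as "+1920-01-01T00:00:00Z".
JANUARY_FIRST = re.compile(r"^[+-][0-9]+-01-01T")


def downgrade_january_first(time_value: dict) -> dict:
    """Return a copy with year precision if a day-precision value is 1 January."""
    if time_value["precision"] == DAY_PRECISION and JANUARY_FIRST.match(time_value["time"]):
        return {**time_value, "precision": YEAR_PRECISION}
    return dict(time_value)
```

The timestamp string is left as it is; only the precision field changes.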

Autofix - P17

Hi, can you stop the automated edits of Catalonia to Spain? While I am totally against these edits as established by our Wikipedia community, Catalonia, like other nations such as Kurdistan, is not only in what is considered a sovereign state, but divided, in our case, in two (Spain and France). Therefore, it is possible that certain technically wrong edits may be made. Regards, --KajenCAT (talk) 10:12, 7 January 2022 (UTC)

Hello, just remove the {{Autofix|pattern=Q5705|replacement=Q29}} line from Property talk:P17. I do not know the situation with Catalonia in detail. Maybe it is good to start a discussion on Property talk:P17 before or after removing the autofix template. — Ivan A. Krestinin (talk) 17:46, 7 January 2022 (UTC)
Thank you for your response. I will provisionally withdraw it and open the subject. Thank you again. KajenCAT (talk) 23:16, 7 January 2022 (UTC)

Remove audio podcast

Can you please remove distribution format: audio podcast from JRE episodes such as JRE #312 - Steve Rinella, Bryan Callen (Q109306593)? Most episodes are video podcasts and only very few of them are audio only. Germartin1 (talk) 10:41, 8 January 2022 (UTC)

Hello, I added a rollback task to the bot's task list. The bot should roll back 1420 items today or tomorrow. — Ivan A. Krestinin (talk) 18:16, 8 January 2022 (UTC)
  Done — Ivan A. Krestinin (talk) 19:44, 9 January 2022 (UTC)
Thanks, what about these ones, some of them are video podcasts https://www.wikidata.org/w/index.php?title=Q101011923&type=revision&diff=1408455138&oldid=1335982903 Germartin1 (talk) 11:28, 14 January 2022 (UTC)
  Done I reverted also edits based on Spotify show ID (P5916). — Ivan A. Krestinin (talk) 01:10, 15 January 2022 (UTC)

Adding internet archive identifiers to items for people

Hello,

Can your bot stop adding an internet archive identifier to items for people such as Q110486431? Thank you. Gamaliel (talk) 16:59, 10 January 2022 (UTC)

Hello, just delete the line {{Autofix|pattern=<nowiki>https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)</nowiki>|replacement=\1|move_to=P724}} from Property talk:P973. This job was added by Jura1 several years ago. Maybe it is good to discuss it with him. I also added one more conflicting value. This should prevent such edits as well. — Ivan A. Krestinin (talk) 17:48, 10 January 2022 (UTC)
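
To see what the pattern and replacement in that Autofix line do to a URL, here is a Python sketch. KrBot uses PCRE rather than Python's re module, and whether it requires the whole value to match is an assumption here; the function name is hypothetical.

```python
import re

# Pattern copied from the Autofix line quoted above.
PATTERN = re.compile(r"https?://archive\.org/details/([0-9A-Za-z@][0-9A-Za-z._-]+)")


def apply_autofix(url: str):
    """Return the captured identifier (the \\1 replacement), or None if no match.

    Assumption: the whole value has to match the pattern.
    """
    match = PATTERN.fullmatch(url)
    return match.expand(r"\1") if match else None
```

Under that assumption a URL with a trailing path segment is left alone, since "/" is not in the allowed character class. The move_to=P724 part of the template, which moves the result to another property, is not modelled here.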

Wikidata:WikiProject Russia

Good evening! I would like to invite you, as an engineer of the Russian Wikipedia, to WikiProject Russia. You could help the community integrate RuWiki data more closely into the common data bank, create needed properties, and so on. MasterRus21thCentury (talk) 16:54, 18 January 2022 (UTC)

Greetings, if there are specific tasks, feel free to ask. I do not have much free time, but I may be able to handle some tasks. — Ivan A. Krestinin (talk) 17:01, 18 January 2022 (UTC)
For example, right now you can take part in the discussion of proposed Wikidata properties from Russian sources. MasterRus21thCentury (talk) 17:35, 18 January 2022 (UTC)

Closing Wikidata property proposals

Ivan, hi! Could you close the discussions on Wikidata properties? No new properties have been created since Monday, and 61 properties have piled up awaiting a decision by an administrator or property creator. MasterRus21thCentury (talk) 17:12, 21 January 2022 (UTC)

Greetings, I rarely create new properties; it is better to ask other users. I mainly specialize in automated procedures for maintaining data integrity and quality. — Ivan A. Krestinin (talk) 17:15, 21 January 2022 (UTC)

Many articles with PubMed ID = 9541661

Hi. I find there are 24 articles with PubMed ID = 9541661. e.g., Indirect (repeat) prescribing (Q84597236), The pharmaceutical industry (Q84597219). Can we recover the edits? And is this a one-time event? Kanashimi (talk) 06:23, 23 January 2022 (UTC)

Hello, it was a one-time task. The edit looks correct: both 19790808 and 19790797 were deleted by PubMed. PubMed marked them as duplicates of 9541661. It looks like all these IDs were merged because they are actually a single large work, and the Wikidata items correspond to chapters of this work. Usually we have no separate item for each chapter on Wikidata. I suggest merging all these items, following PubMed. — Ivan A. Krestinin (talk) 10:28, 23 January 2022 (UTC)
Thank you. Kanashimi (talk) 10:36, 23 January 2022 (UTC)

Taxonomy bug?

Hi, no idea how this happened, just reporting: https://www.wikidata.org/w/index.php?title=Q469652&diff=1564671901&oldid=1564583128&diffmode=source

Best, AdrianoRutz (talk) 12:52, 28 January 2022 (UTC)

  • Hi, Adriano! The reference URL (P854) property was used as a qualifier. That is a mistake in most cases. The bot fixed this mistake by moving it to the references section. — Ivan A. Krestinin (talk) 16:41, 28 January 2022 (UTC)
    Oh yes I see...did not go far enough to see the original source of the wrong modif...wasn't the bot sorry! AdrianoRutz (talk) 06:24, 30 January 2022 (UTC)
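
The fix described in this thread, moving reference URL (P854) from the qualifiers to the references, can be sketched in Python on the Wikibase statement JSON layout. This is a simplified illustration, not the bot's code: the function name is hypothetical, and the sketch always creates a new reference block rather than merging into an existing one.

```python
import copy


def move_qualifier_to_reference(statement: dict, prop: str = "P854") -> dict:
    """Return a copy of the statement with `prop` moved from qualifiers to a new reference."""
    fixed = copy.deepcopy(statement)
    snaks = fixed.get("qualifiers", {}).pop(prop, None)
    if not snaks:
        # Nothing to move: the property is not used as a qualifier here.
        return fixed
    if prop in fixed.get("qualifiers-order", []):
        fixed["qualifiers-order"].remove(prop)
    fixed.setdefault("references", []).append(
        {"snaks": {prop: snaks}, "snaks-order": [prop]}
    )
    return fixed
```

The input statement is not modified, so the original and the fixed version can be compared before anything is saved.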

regular constraint reports

Hi Ivan,

Seems constraints reports are much more frequently updated, almost daily. Excellent news. Thanks!

Maybe we should mention it on Wikidata:Status_updates/Next#Other_Noteworthy_Stuff --- Jura 08:22, 10 February 2022 (UTC)

  • Hi, Jura, some optimizations, switching to RapidJSON library and $1500 for new hardware. Result: 15 hours instead of 9-10 days for the reports update cycle. — Ivan A. Krestinin (talk) 16:15, 10 February 2022 (UTC)
    @Ivan A. Krestinin Fantastic! Vojtěch Dostál (talk) 17:10, 10 February 2022 (UTC)
Cool. I added a note to the weekly news. --- Jura 13:59, 11 February 2022 (UTC)
Update: +$280 for extending the RAM, and the update cycle is 9 hours now. In practice the update frequency is limited by the 24-hour period of incremental dump generation. — Ivan A. Krestinin (talk) 21:21, 15 February 2022 (UTC)

stats on of (P642) as qualifier by property

Maybe you have seen Property talk:P642.

I think it would be helpful to have statistics about the properties currently using it as qualifiers.

As there are 14 million uses, this is hard to do on query service.

I noticed the constraint report for P31 has them (197165).

Do you have a simple way to generate a summary for all properties (even those without allowed qualifier constraints, e.g. P279)? --- Jura 12:01, 10 February 2022 (UTC)

It is not so simple, but I am thinking about a possible implementation. — Ivan A. Krestinin (talk) 16:33, 11 February 2022 (UTC)
In the meantime we got some approximation with the query Vojtěch provided.
Maybe stats on each pair property / qualifier could be interesting, beyond P642.
OTOH, the problematic ones might not necessarily be the most used. Personally, I think "applies to part" is the most problematic one. --- Jura 17:01, 11 February 2022 (UTC)
@Jura1: usage report: User:Jura1/P642 usage. — Ivan A. Krestinin (talk) 10:48, 13 February 2022 (UTC)

instance of (P31) removal of maintained by wikiproject

Why would you do this? Lectrician1 (talk) 01:11, 11 February 2022 (UTC)

Just because it throws an error and is not something commonly used. Question: can the discussed cases be fixed automatically, or do they require non-trivial manual work? Maybe it is better to add {{Autofix}} or something like it? — Ivan A. Krestinin (talk) 16:42, 11 February 2022 (UTC)
Then why don't we just make it an allowed qualifier? I don't think we should autofix this stuff. Lectrician1 (talk) 17:47, 12 February 2022 (UTC)
I just applied the old principle: entities should not be multiplied beyond necessity. Is the qualifier used for some automated work? — Ivan A. Krestinin (talk) 21:09, 15 February 2022 (UTC)
@Ivan A. Krestinin It's to give people an idea about who to contact if they have questions about the constraint. A lot of the constraints are for managing the Wikiproject Music data model, which is complex, and new contributors might have questions about it. Lectrician1 (talk) 01:06, 16 February 2022 (UTC)
I added the qualifier to the ignored qualifiers list. Please roll back my edit. — Ivan A. Krestinin (talk) 21:41, 16 February 2022 (UTC)

identical dates and deprecated January 1

Hi Ivan,

As we kept getting entries with deprecated January 1 dates, I started listing them at False_positives#pairs_with_deprecated_January_1_date. I left some notes about it at #January_1_as_date.

Since then, more get created with deprecated rank directly added (sample: Q110842925#P569).

Accordingly, I'd filter any deprecated "January 1"-date by default. --- Jura 10:35, 11 February 2022 (UTC)

DOI removal

Good day. Why does your bot remove DOI codes that, although non-working, are confirmed by a source? Among other things, they help avoid duplicate items. --INS Pirat ( t | c ) 05:11, 12 February 2022 (UTC)

Greetings. If anything, they get in the way of finding and merging duplicate items. An article usually has only one correct code, but it can have any number of incorrect ones. As a result, there end up being two items that describe the same article but carry different DOI codes. A mass cleanup of incorrect codes was carried out some time ago, and as a result the number of merged items already exceeds 50 thousand. Some information about this work: Property talk:P356#15138 wrong values. — Ivan A. Krestinin (talk) 07:07, 12 February 2022 (UTC)
You are talking about incorrect codes in general, not about the case at hand. And how do they get in the way? When a duplicate item is created from the same source, the DOI will be flagged as already in use in exactly the same way (even though it does not work). And a work can have several valid codes too. --INS Pirat ( t | c ) 09:53, 12 February 2022 (UTC)
There were many pairs of items where one had the correct code and the other an incorrect one. A bot or a human saw two items with different codes and drew the logical conclusion: these are different articles and must not be merged. By the way, do you know where these incorrect codes come from? What produces such a large number of invalid values? Also, DOI is not the only identifier that can be used to look for duplicates. For the article we are discussing, duplicates can perfectly well be found through the valid value of Cairn publication ID (P4700). — Ivan A. Krestinin (talk) 16:10, 12 February 2022 (UTC)
I don't quite understand your position. Yes, it is not the only identifier. But I don't think that prevents the use of others. There is a fact: the primary source gives a particular DOI. I recorded it in the appropriate way (that is what deprecated rank and qualifiers are for). If, I repeat, a work has more than one working DOI, the situation is the same as the one you describe. --INS Pirat ( t | c ) 16:54, 12 February 2022 (UTC)
The position is simple: if we turn Wikidata into a collection of misconceptions (even if marked as such), it becomes extremely hard to perform even simple operations such as finding duplicate items. The problem is made worse by the fact that some users start marking correct codes with deprecated rank en masse and adding the same qualifiers. And then it becomes complete hell. Let's just not drag invalid values into Wikidata without a real need. The fact that we can do it does not mean that we should. — Ivan A. Krestinin (talk) 17:24, 12 February 2022 (UTC)
Why would anyone do that, let alone en masse? Where have you seen this? And users' actions should not affect whether information is admissible. It is still not clear what obstacles to duplicate detection you see (as I said, on the contrary, it should help). More generally: bots should not touch deliberately entered, non-vandal information (say, with ranks/qualifiers) at all, or at least not a second time after being reverted. --INS Pirat ( t | c ) 20:45, 12 February 2022 (UTC)
See, for example, these edits: [14], [15], [16]; it is a different identifier there, not DOI, but the point is the same. Let me describe the whole story in detail. Duplicates among items describing scholarly articles have been, and still are, uploaded by the thousands. I decided to work on mass merges. The work matters because with this much data the SPARQL engine will die in the near future, and cleaning out duplicates will at least postpone that a little. The main danger of this work is merging too much: reverting one wrong merge is hard, and reverting a couple of hundred wrong merges is a disaster. So the algorithms have to be very paranoid: at the slightest difference the merge must be aborted. Relying on identifier ranks does not work here, because "deprecated" rank is assigned fairly randomly (see the examples above). The bot ran successfully and merged around 10,000 pairs of items. After that I started analysing the cases where the bot did not merge similar items. It turned out that in a large number of cases one item had a correct code and the other an invalid one, or both had invalid codes. This was not limited to DOI, but DOI was one of the most informative and most "polluted" identifiers. Then began a long story of cleaning out invalid codes. Some of the broken codes were deleted as complete junk that came from who knows where and did not even look like a DOI in format. Then codes turned up that looked like DOIs in format but were not DOIs. I had to make arrangements with CNRI, the organization that maintains this identifier, to validate all 27 million codes we have. The bot ran for more than a month, but in the end it cleaned out almost 80 thousand codes that were not DOIs. After all this work, more than 50 thousand pairs of items have been merged and the work continues. Today alone the bot merged more than 2000 pairs of items.
As for not repeating changes that someone has reverted: on the one hand I agree with you, it would probably be great to act exactly that way. But there are three problems: 1. it is technically quite complex, and technically complex systems usually contain many bugs and are therefore prone to invalid behaviour; 2. many types of errors are repeated many times by different users; 3. on Wikidata one has to operate on tens of thousands, if not millions, of items, and fixing the cases with reverts by hand is simply not feasible. — Ivan A. Krestinin (talk) 22:47, 12 February 2022 (UTC)
This is somewhat orthogonal to the topic of relying on sources, but it is more convincing. (Why didn't you start with this?)
1) Can the algorithm merge items where one has a correct DOI and the other has none (and no other unique identifier)? 2) Can the algorithm merge items with different but valid DOIs? --INS Pirat ( t | c ) 17:45, 17 February 2022 (UTC)
It is pointless to rely on a source when it gives an obviously erroneous identifier. A typo, data pulled in incorrectly, anything can happen at large volumes. Should we now turn into a collection of every error in the world... 1) If an item has no identifier at all, the bot can still find a match by title (P1476). 2) No, at the moment the algorithm relies only on exact matches. There is also a report of what looked similar to the bot but which it "did not dare" to merge automatically: User:Ivan A. Krestinin/To merge/Scholarly articles. — Ivan A. Krestinin (talk) 20:20, 17 February 2022 (UTC)
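
Illustration for readers of this archive: the "paranoid" merge rule described in this thread (exact matches only, abort at the slightest difference) might look roughly like the Python sketch below. The real bot is unpublished C++ and also uses title matching, so the function name, the data layout and the sample identifier values are assumptions made for this example.

```python
def can_merge(item_a, item_b):
    """Conservative duplicate check (illustrative only).

    Each item is a dict mapping a property ID to the set of identifier
    values it carries. Merging is allowed only if at least one shared
    identifier matches exactly and no shared identifier differs.
    """
    matched = False
    for prop in set(item_a) & set(item_b):
        if item_a[prop] != item_b[prop]:
            return False  # any difference aborts the merge
        matched = True
    return matched


# Hypothetical values: the same PubMed ID, but one item has a junk DOI.
good = {"P356": {"10.1000/ABC"}, "P698": {"123"}}
junk = {"P356": {"10.1000/XYZ"}, "P698": {"123"}}
clean = {"P698": {"123"}}

print(can_merge(good, junk))   # the conflicting DOI blocks the merge
print(can_merge(good, clean))  # with the bad DOI removed, the pair merges
```

The second call shows why removing an invalid DOI can unblock a merge that a rule of this kind would otherwise refuse.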

Regarding Wikidata Q3105247

You recently changed the SBN author ID (P396): IT\ICCU\BVEV\090371 to BVEV090371, but now the authority control of en.Wikipedia says: "The ICCU id BVEV090371 is not valid". Why? --Ruotailfoglio (talk) 16:29, 20 February 2022 (UTC)

Hello, it is most probably some kind of cache. I do not see any marks at Q3105247#P396. Please try F5 and Ctrl+F5 in your browser; maybe it is the browser cache. Or wait a day or two. — Ivan A. Krestinin (talk) 16:33, 20 February 2022 (UTC)
Thank you! Ruotailfoglio (talk) 18:24, 21 February 2022 (UTC)
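
Illustration for readers of this archive: the value change mentioned above (IT\ICCU\BVEV\090371 to BVEV090371) can be expressed as a small normalisation function. The pattern is inferred from this single example and is an assumption, as is the function name; the actual rule the bot applies is not shown on this page.

```python
import re

# Assumed shape of the old notation: IT\ICCU\<prefix>\<digits>.
_OLD_SBN = re.compile(r"^IT\\ICCU\\([A-Z0-9]+)\\(\d+)$")


def normalize_sbn_id(value):
    """Convert the old backslash notation to the compact form.
    Values that do not match the assumed old shape are returned unchanged."""
    match = _OLD_SBN.match(value)
    if match is None:
        return value
    return match.group(1) + match.group(2)


print(normalize_sbn_id("IT\\ICCU\\BVEV\\090371"))  # BVEV090371
```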

Reverting an Autofix?

Hello,

Last week a mistaken {{Autofix}} was added to platform (P400) (replacing personal computer (Q16338) with Microsoft Windows (Q1406)), and KrBot duly made the replacement on thousands of items. Is there a way to revert all these autofixes? See Property_talk:P400#Autofixes for details. Thanks! Jean-Fred (talk) 20:29, 21 February 2022 (UTC)

edit to Arryn Diaz

Did you review the source? Arlo Barnes (talk) 22:12, 24 February 2022 (UTC)

Very quickly. Skeleton is not a sex or gender. It looks like an inappropriate value for sex or gender (P21); please see the property constraints. — Ivan A. Krestinin (talk) 23:30, 24 February 2022 (UTC)

Code sample

Hi Ivan, hope you're having a great day :) Any hope of seeing the code that Krbot2 uses to update the constraint violation reports? I'd really appreciate it. Joseph202 (talk) 17:17, 1 March 2022 (UTC)

Hi Joseph, the code has not been published. The bot is written in C++. The constraint report update task shares a lot of code with other wiki-related tasks (200+ tasks). Please write me an email and I can send some parts. Or put some questions here if what you actually need is implementation details rather than the code. — Ivan A. Krestinin (talk) 03:54, 2 March 2022 (UTC)
@Ivan A. Krestinin: Thank you for your reply, actually, I want to use the code on a third-party installation of wikibase, that's why I was asking Joseph202 (talk) 17:18, 6 May 2022 (UTC)
The bot works with dumps. Do you have plans to provide dumps in a compatible format? What amount of data is planned? We can think about connecting my bot to your project too. — Ivan A. Krestinin (talk) 18:47, 6 May 2022 (UTC)
@Ivan A. Krestinin: Currently, the only way we get/generate dumps is via our Special:DataDump, although there seems to be a way via the API that I haven't tried before.
But you can have a look if you wish. Joseph202 (talk) 20:40, 6 May 2022 (UTC)
@Ivan A. Krestinin: Hi, I trust you're having a great day. Per the above, We actually get dumps via the Special:EntityData special page, we can also get for the formats that Wikidata can get too. How can we begin to work on this?
Hope to hear from you soon. Joseph202 (talk) 08:26, 12 May 2022 (UTC)
Hi Joseph, I took a look at the project. One problem is the different identifiers. For example P31, P279, Q21502404 are all hard-coded now and would need to be parametrized. Another question is project size. The constraints system requires some effort to deploy and maintain. I am not sure it is reasonable to use it on small projects. Maybe it is better to focus on increasing the data volume first. — Ivan A. Krestinin (talk) 19:15, 12 May 2022 (UTC)
Hello, thanks for replying.
Yes, the IDs are different. Is it not possible to configure it to fit Gratisdata? And yes, there are over 3000 data items available currently and still counting. Is that not considered a large volume?
I'd love to hear from you Ivan, thank you! Joseph202 (talk) 18:37, 13 May 2022 (UTC)

explain why constraints aren't applicable on {P7883}

Hi, I noticed you removed mandatory constraint (Q21502408) from multiple properties, including Historical Marker Database ID (P7883). Would you clarify why these were removed? Given that they were removed without comment and I'm the one who added them, I'd like to know how to apply these more accurately in the future. Wolfgang8741 (talk) 19:25, 2 March 2022 (UTC)

Hello, the constraints have 100+ violations and do not look like something easy to fix. Could you add the flag after fixing most of the violations? Wikidata:Database reports/Constraint violations/Mandatory constraints/Violations was made as a tool for quickly reverting vandalism or wrong edits. But the report is unmaintainable now: it is too large. I am trying to improve the quality situation using different approaches. — Ivan A. Krestinin (talk) 19:42, 2 March 2022 (UTC)
Ah, thanks for the explanation. Prior to adding the type constraint, there was no means of ensuring consistent use across the IDs, which is why applying the type constraints generated a large report. I started a discussion to constrain and clean up the marker IDs at WikiProject_Cultural_heritage#Adding_item_Type_Constraints_for_Historic_Marker_Properties, ideally leading to a model for IDs related to markers. I'm still digging deeper into Wikidata's structure for nudging towards data consistency and preventing conflation of concepts. Does one of your approaches rely upon the Wikidata:Database_reports/EntitySchema_directory? These constraints weren't meant to be left in place, but to prompt cleanup of the IDs. So what I'm hearing is that once the IDs are cleaned up, adding the mandatory constraint would be appropriate. Wolfgang8741 (talk) 20:32, 2 March 2022 (UTC)
I do not use Wikidata:Database_reports/EntitySchema_directory directly. My approaches use property constraint (P2302), {{Autofix}}, and many different custom bot tasks, for example automatic cleanup of duplicate values. Most tasks focus on fixing various common types of mistakes. Adding the "mandatory" mark is good practice after completing work on a property. — Ivan A. Krestinin (talk) 22:07, 2 March 2022 (UTC)

Yandex Zen ID (P8816)

Good day. For a number of persons (example; I personally know of at least 14 such persons), the bot changes the value of Yandex Zen ID (P8816) by removing the "id/" part, after which the Yandex.Zen page no longer opens. Formally the bot acts in accordance with the defined URL pattern https://zen.yandex.ru/$1, but in practice working links in such items stop working. As I understand it, when the identifier was discussed, it was simply overlooked that besides the main pattern, which most persons on Yandex.Zen have, some also have this one with "id/". So the question is how the problem can be solved. Either configure the bot so that it does not change the identifier for such items, or what other options are there? Thank you. --Uchastnik1 (talk) 11:12, 3 March 2022 (UTC)

  • As I understand it, the replacement happened after this edit. --Uchastnik1 (talk) 11:40, 3 March 2022 (UTC)
  • I looked through the bot's contributions for the relevant period; the number of such articles/Wikidata items has grown to about 20, and it also turned out that this concerns not only persons but other entities as well, for example: Kion, Вокруг ТВ, Холодильник.ру. So one can hardly say that, in the spirit of the original conditions under which the identifier was created, these entities were not meant to be covered (the difference here is purely technical, in this additional "id/", nothing more, and persons are affected in the same way). --Uchastnik1 (talk) 14:55, 3 March 2022 (UTC)
    • Greetings. Yes, you found the right edit, the one that told the bot to strip the id prefix. If it is reverted, the bot will stop doing this. I will start reverting these changes a bit later. It would also be good to fix the description and properties on the Property:P8816 page so that they allow such a prefix. — Ivan A. Krestinin (talk) 16:48, 3 March 2022 (UTC)
      • Many thanks, colleague (I tried to do something like that here (thought it would work), but nothing seemed to change, so I reverted it; maybe I should not have reverted it or should have done it differently, but I'm afraid my knowledge is not yet sufficient for that). --Uchastnik1 (talk) 17:18, 3 March 2022 (UTC) UPD: reverted here. --Uchastnik1 (talk) 17:30, 3 March 2022 (UTC)
        • I have reverted the bot's edits. Sorry for not doing it right away. — Ivan A. Krestinin (talk) 22:12, 12 March 2022 (UTC)
          Thanks, colleague (yes, everything works now). Uchastnik1 (talk) 20:42, 18 March 2022 (UTC)
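
Illustration for readers of this archive: why stripping the "id/" prefix breaks links can be seen by expanding the formatter URL https://zen.yandex.ru/$1 quoted above. The identifier value and both function names are hypothetical.

```python
FORMATTER = "https://zen.yandex.ru/$1"  # URL pattern quoted in this thread


def build_url(value, formatter=FORMATTER):
    """Expand a formatter URL by substituting the identifier for $1."""
    return formatter.replace("$1", value)


def strip_id_prefix(value):
    """The transformation that was later reverted: drop a leading 'id/'."""
    return value[3:] if value.startswith("id/") else value


original = "id/abc123"  # hypothetical identifier
print(build_url(original))                   # https://zen.yandex.ru/id/abc123
print(build_url(strip_id_prefix(original)))  # https://zen.yandex.ru/abc123
```

The two results are different addresses, so the prefix has to be treated as part of the identifier for such items.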

Remove also non-existing files from references and qualifiers

Could your bot also remove non-existing files from references and qualifiers? I have found a few of them and fixed them manually (example), but it would be nice if this were done automatically. I have seen your bot make similar changes, so maybe it could handle this as well?

Similarly, bot could change a filename of a Commons file used in a reference or a qualifier, when it is moved on Commons (example). Mitar (talk) 14:22, 11 March 2022 (UTC)

Hi, I am working on fixing qualifiers. But processing references will be a harder task. — Ivan A. Krestinin (talk) 13:55, 12 March 2022 (UTC)

Check P2190 constraint config as it moves from string to numeric ID

Hi, could you double-check whether I migrated the property constraints correctly from string to numeric on C-SPAN person ID (P2190)? This move was discussed on the property talk page and project chat. Wolfgang8741 (talk) 15:11, 14 March 2022 (UTC)

Hi, everything is fine. I just improved the property a bit. — Ivan A. Krestinin (talk) 20:49, 14 March 2022 (UTC)
Hi, thanks for that. Looking at the constraint report, the deprecated string values are flagged for format violation. Shouldn't a deprecated value be exempt from the currently accepted format checks as well as from a single value constraint? Two reasons for retaining the deprecated IDs are
1. matching with archived versions of the data that use the old ID, which allows checking data consistency over time or as of when the data was initially added
2. assisting in matching existing data to convert to the new identifier.
This is partially a technical question and partially a statement, as I noticed that a few IDs have been removed instead of being deprecated per Help:Deprecation. The values were valid prior to the transition, just less reliable, and they could still be matched to the Internet Archive or another archive. Wolfgang8741 (talk) 15:05, 19 March 2022 (UTC)
More ideas:
  • Some external resources may use C-SPAN person ID (P2190) for their own purposes, and they may assume a stable identifier format. Generally it is better to use different properties for different identifier types. See Leopoldina member ID (P3413) and Leopoldina member ID (new) (P10299) as an example.
  • The format of deprecated statements needs to be checked too, because deprecated values can be vandalized or filled in incorrectly just like normal-rank values.
I fixed the constraints, and now we have zero violations. But splitting the property is the more correct way. — Ivan A. Krestinin (talk) 22:00, 21 March 2022 (UTC)
Thank you, that would have been helpful guidance a while back. Where is it documented that splitting the property is the more correct way? This is an important process to have documented. I asked both on the property talk page and on Project chat for this guidance, and for nearly a month no one answered my questions about the process for changing the property format with certainty or a clear path. How is one supposed to find the "more correct way" documentation or learn it? Should we go about splitting the property so the ID can be properly constrained for monitoring? Wolfgang8741 (talk) 16:37, 24 March 2022 (UTC)
I know too little about the Wikidata project documentation. Actually, different approaches were used for different properties in the past. I just highlighted the best approach from my point of view and listed the reasons why it is the best. You are right, splitting allows the constraints to be improved. — Ivan A. Krestinin (talk) 19:39, 24 March 2022 (UTC)

Languages statistics on a lexeme property

Hello,

On Wikidata:Database reports/Constraint violations/P10338#Languages_statistics, it is stated that the property Dico en ligne Le Robert ID (P10338) is used 9 times. In fact, it is much more than that. Do you know why the statistics seem incorrect? Maybe it is related to the fact that lexemes are in a separate dump (I don't know how your bot works, so it's just a blind guess)?

Cheers, — Envlh (talk) 21:48, 27 March 2022 (UTC)

Hello, this looks like a real bug. I will investigate it. Thank you. — Ivan A. Krestinin (talk) 23:00, 31 March 2022 (UTC)
  Fixed. — Ivan A. Krestinin (talk) 20:37, 8 April 2022 (UTC)
Thank you! I confirm it is properly working since you fixed it :) Cheers, — Envlh (talk) 16:23, 25 April 2022 (UTC)

Must revert some KrBot changes

Hello!

There is a problem: two different entries, dispersed settlement (Q1372205) and dispersed settlement in Latvia (Q16352482), were merged. That was on March 26. On March 28 KrBot updated claim values (see this change). There are many changes that must be reverted. Can these changes be undone by the bot? --Treisijs (talk) 12:19, 30 March 2022 (UTC)

  Done. — Ivan A. Krestinin (talk) 23:01, 31 March 2022 (UTC)

Does KrBot still update P214 monthly?

Hi, this article https://ejournals.bc.edu/index.php/ital/article/view/12959 states that KrBot "updates links in Wikidata items to redirected VIAF clusters and removes links to abandoned VIAF clusters." on a monthly basis. Is it still the case? I'm not sure, based on the statistics I got here: https://bambots.brucemyers.com/NavelGazer.php. Thanks!

Hello, the bot had some trouble and not all items were processed. I fixed the issue, and all items have been processed now. Everything should be fine. Thank you! — Ivan A. Krestinin (talk) 20:18, 8 April 2022 (UTC)
Thanks for the info and for the great job! 193.52.26.94 06:49, 11 April 2022 (UTC)

Wikimedia import URL constraint violations

Edits like this one trigger this constraint violation. The constraint was added by user:Tacsipacsi (and last edited by user:Nikki). Not sure how to best solve it? You could use http://www.wikidata.org/entity/Q111645043#P10039 & http://www.wikidata.org/entity/P10039#P2302 (or even http://www.wikidata.org/entity/P10039#P10039$4d1d74b4-4a3f-cc5b-c760-d133e2ac8fd9)? Or we can just remove the whole constraint... Multichill (talk) 16:36, 18 April 2022 (UTC)

Looks like a valid use case (more or less: as an outsider, it may not be obvious at first what those URLs mean). I definitely don’t recommend working around the constraint by using a different URL: if a constraint is wrong, it should be fixed, not worked around. However, I still see a constraint here: using Wikimedia import URL (P4656) with a Wikidata URL is valid if, and only if, an inferred from (P3452) statement is also present (which it makes more precise). Unfortunately this cannot be expressed with the current constraint system, so this constraint either needs to be replaced by a complex constraint that handles this situation, or a new property needs to be created for this purpose (which could have non-complex constraints that require that it points to a Wikidata URL and that it’s always used together with P3452). The former avoids creating yet another property; the latter lets us continue to use non-complex constraints, which provide feedback to the user in context, not only on some hidden constraint report pages. —Tacsipacsi (talk) 15:41, 19 April 2022 (UTC)

Reverting redirect resolutions after wrong merger

Hi Ivan, human sexual activity (Q608) and Sex (Q98040930) were wrongly merged. Could you please revert the redirect resolutions? Thanks, Máté (talk) 05:45, 22 April 2022 (UTC)

  Done. — Ivan A. Krestinin (talk) 16:58, 23 April 2022 (UTC)

Self link error

Hello! Very nice things your bot does, however it's making an understandable mistake. It is removing the statement has part or parts (P527) YouTube comment (Q110875209) on YouTube comment (Q110875209). I undid it with the reason "YouTube comments can have other YouTube comments as replies", but it did it again. AntisocialRyan (Talk) 14:56, 24 April 2022 (UTC)

Hello, usually a recursive link in has part or parts (P527) is just a mistake. This is why the bot removes it. But your case is interesting. I think a reply comment is another comment. It is linked to the original comment by a specific "reply to" relation, which is not the same as the "has part" relation. Let's take an analogy from another area to check this. For example, "human has part human" because one human is the child of another human. This statement looks wrong. So I think the statement "comment on YouTube has part comment on YouTube" is wrong too. — Ivan A. Krestinin (talk) 16:12, 24 April 2022 (UTC)
Alright, I see where you're coming from actually. YouTube comments can have replies, but replies to YouTube comments can't have more replies. I will create a new item for this, thanks! AntisocialRyan (Talk) 16:59, 24 April 2022 (UTC)

Monks

Hi Ivan! Could these edits be done by your bot so that it does not delete the value but replaces it with monk (Q733786)?

I understand why you remove the monastic order a person belongs to from the occupation property, but if it is not recorded that the person is a monk, the item can be left without an occupation (see). Thanks Palotabarát (talk) 21:52, 26 April 2022 (UTC)

Hi @Palotabarát:! Thanks for raising the point. We are trying to establish a standard for data regarding members of Catholic religious orders at Wikidata_talk:WikiProject_Religions#Members_of_Catholic_religious_orders; since there wasn't any objection, I applied this change, but of course it is reversible and improvement can be made (e.g. adding new P106s in order to fill the gap). We can continue the discussion there. Good night, --Epìdosis 22:42, 26 April 2022 (UTC)

Laurence Olivier Award for Best xxxx

Hi, Ivan. The KrBot changed a number of entries I made yesterday for the series of Laurence Olivier Awards (Laurence Olivier Award for Best Actor in a Supporting Role in a Musical (Q19870586), Laurence Olivier Award for Best Actress in a Supporting Role in a Musical (Q19870588), Laurence Olivier Award for Outstanding Achievement in Music (Q16995976), Laurence Olivier Award for Best Actress (Q6500774), etc.), changing the country from "England" to "United Kingdom".

The Olivier Awards are not presented to theatre productions in the entire United Kingdom, made up of the individual countries of England, Scotland and Wales, along with Northern Ireland. These awards are only presented for theatre work in England (specifically London's West End theatre district), while the other countries of the United Kingdom have their own theatre awards.

When I chose "England", it even comes up with the phrase "constituent country of the United Kingdom", as it is, in fact, an actual country within the UK.

I tried using the statement for "applies to jurisdiction", set to England, but that makes my statement for London raise the flag "An entity with located in the administrative territorial entity should also have a statement country." I want the country to be England, as that is the country that matters, but the bot will just change it to United Kingdom.

I do not think that the KrBot needs to be changed, nor anything like that. Just need your help. Do you know a way that I can tell it "England" and "London", and not have any flags nor have a bot make a change? Thanks. Jmg38 (talk) 02:09, 28 April 2022 (UTC)

Hi again. I think I'll use "country (P17)" of United Kingdom, and "applies to jurisdiction (P1001)" of London. That captures everything I need, and avoids having to mention England at all while also avoiding having to fuss about England being a country, as the real important part is the "applies to jurisdiction (P1001)" of London. The KrBot was helpful in ways you may not have expected, as it forced me to think through what I was doing, which is never a bad thing! Thank you. Jmg38 (talk) 05:16, 28 April 2022 (UTC)
Hello! Could you review the previous England-related discussions on Property talk:P17? The bot just executes the {{Autofix}} rule from the property discussion page. You may delete the rule, and the bot will stop making such edits. — Ivan A. Krestinin (talk) 17:03, 28 April 2022 (UTC)

Q7686436 and Q105550321 merge and replace

Hello Ivan! @Howard61313: has incorrectly merged the occupation (military aviator (Q105550321)) and the category (Category:Military aviators (Q7686436)), and Krbot2 has changed the occupation of all the persons affected (example). Can you revert that? (links) Thanks. Palotabarát (talk) 10:57, 1 May 2022 (UTC)

  Done. — Ivan A. Krestinin (talk) 18:46, 1 May 2022 (UTC)
Thank You! Palotabarát (talk) 07:05, 2 May 2022 (UTC)

Single value constraints for Dutch municipalities

Dear Ivan A. Krestinin,

In your report, DATABASE_REPORTS/CONSTRAINT_VIOLATIONS/P382#"SINGLE_VALUE"_VIOLATIONS, I notice that it probably looks for items with more than one claim of an identifier (which should be unique by definition). A Dutch CBS municipality code can become obsolete (then it should have an end date), be changed into a new code (and then there are two), or be a whole new code when a new municipality is created. For instance, at Etten-Leur Q9833#P382 two CBS codes are claimed, which is correct: the old one has an end date and the other is the current one. Recently I also updated its rank in an attempt to work with it from a script.

Would it be possible, in your database report, to include a check on the end date and perhaps a double check on the rank, and only report when more than one code is claimed without an end date or with equal rank?

Furthermore, if you are interested, perhaps the check on end dates could be automated. I recently tried something with the CBS API and found a table (70739ned) with the dates registered. See a script for population data and a script for surface areas where the table is used for mapping. Please forgive my level of programming, I gave it my best try. If you have any questions, want to work together or have remarks for me, please let me know. (My availability varies, but I intend to respond asap.)

Best regards, Démarche Modi (talk) 18:02, 3 May 2022 (UTC)

And see https://nl.wikipedia.org/wiki/Gebruiker:D%C3%A9marche_Modi/Kladblok/python/cbs_codes for the whole table I mentioned. Démarche Modi (talk) 18:54, 3 May 2022 (UTC)
If more values are possible over time, the property should have single-best-value constraint (Q52060874) instead of single-value constraint (Q19474404). And the item statement should have normal rank instead of deprecated, as deprecated rank is not for once-correct information. —Tacsipacsi (talk) 01:03, 4 May 2022 (UTC)
Specifying separator (P4155) for the constraint fixes the issue. See [17]. Of course, start time (P580) and end time (P582) need to be specified in the items. — Ivan A. Krestinin (talk) 21:45, 4 May 2022 (UTC)
All done (for the ones reported with violations), statements have normal rank again and are provided with a start and end date. For the remaining bulk (without warnings in the report) I will consider a bot request. Démarche Modi (talk) 14:55, 6 May 2022 (UTC)
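
As a rough illustration of the check requested in this thread (report only when more than one code is still current), here is a minimal sketch in Python. It assumes statements shaped loosely like Wikibase JSON, with a `rank` field and a `qualifiers` map keyed by property ID. The function names and the sample codes are hypothetical, and this is not a description of how KrBot builds its reports.

```python
END_TIME = "P582"  # end time qualifier


def current_values(statements):
    """Return statements that are not deprecated and have no end time."""
    return [
        s for s in statements
        if s.get("rank", "normal") != "deprecated"
        and END_TIME not in s.get("qualifiers", {})
    ]


def violates_single_value(statements):
    """Flag an item only when more than one statement is still current."""
    return len(current_values(statements)) > 1


# Hypothetical sample: one retired code with an end date, one current code.
sample = [
    {"value": "OLD-CODE", "rank": "normal",
     "qualifiers": {"P582": ["2000-01-01"]}},
    {"value": "NEW-CODE", "rank": "preferred", "qualifiers": {}},
]

print(violates_single_value(sample))  # False
```

With start and end dates filled in on every item, a filter of this kind would leave only the genuinely ambiguous cases in the report.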

what right does your robot have to deny reality?

https://www.wikidata.org/w/index.php?title=Q28105001&diff=prev&oldid=1634524482

?????????????

A.BourgeoisP (talk) 14:53, 8 May 2022 (UTC)

The statement is already on "significant event"; the bot would have added it if it weren't already there. Whether that is the right property is beyond me. AntisocialRyan (Talk) 15:32, 8 May 2022 (UTC)
https://www.wikidata.org/w/index.php?title=Q28105001&action=history
A.BourgeoisP (talk) 12:11, 14 May 2022 (UTC)
@A.BourgeoisP: position held (P39) shouldn't be used for oldest human (Q254917) (it's for formal positions), use significant event (P793) instead. Reverting the bot over and over again is as useful as banging your head against a wall. Multichill (talk) 12:42, 14 May 2022 (UTC)
Why are oldest person in France (Q107344155) and list of European supercentenarians (Q1637694) in P39 while oldest human (Q254917) is in P793? What justifies separating them? Read the infobox of the French version of the article: we see oldest person in France (Q107344155) and list of European supercentenarians (Q1637694) but not oldest human (Q254917)! Why? That is ridiculous and stupid! Please solve this problem. Wikipedia.fr does not have to be vandalized by the action of a robot on Wikidata! A.BourgeoisP (talk) 19:03, 14 May 2022 (UTC)
Helloooo!? A.BourgeoisP (talk) 19:26, 17 May 2022 (UTC)
Hello, the bot makes such changes because the page Property talk:P39 has the template {{Autofix|pattern=Q254917|replacement=Q254917|move_to=P793}}. Just delete the template and the bot will stop this activity. But I recommend discussing the issue on Property talk:P39 or on Wikidata:Project chat first. — Ivan A. Krestinin (talk) 20:41, 17 May 2022 (UTC)
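
To make the effect of the quoted template concrete, here is a simplified Python sketch of what a rule with `pattern`, `replacement` and `move_to` could do to an item's claims. The three parameter values come from the template above. Everything else is an assumption made for illustration: the `apply_autofix` function, the dictionary model of claims, full-value matching, and the use of Python's `re` module (KrBot itself uses PCRE, whose syntax differs in places).

```python
import re

# Rule parameters copied from the template quoted above.
RULE = {"pattern": "Q254917", "replacement": "Q254917", "move_to": "P793"}


def apply_autofix(claims, source_prop, rule):
    """Move values of source_prop that match rule['pattern'] to rule['move_to'].

    `claims` maps property IDs to lists of item IDs (a simplified model).
    The input is not modified; a new dictionary is returned.
    """
    result = {prop: list(values) for prop, values in claims.items()}
    kept = []
    for value in list(result.get(source_prop, [])):
        if re.fullmatch(rule["pattern"], value):
            new_value = re.sub(rule["pattern"], rule["replacement"], value)
            target = result.setdefault(rule["move_to"], [])
            if new_value not in target:  # skip values that are already present
                target.append(new_value)
        else:
            kept.append(value)
    result[source_prop] = kept
    return result


claims = {"P39": ["Q254917", "Q107344155"], "P793": []}
print(apply_autofix(claims, "P39", RULE))
```

In this model only the matching value leaves P39; other values, such as Q107344155, stay where they are until a rule covers them too.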

compositions ≠ musical works

en:Category:2020 compositions (Q97275139) doesn't contain en:Category:2020 albums. be:Катэгорыя:Музычныя_творы_2020_года, ru:Категория:Музыкальные_произведения_2020_года and zh:Category:2020年音樂作品 (Q111684969) contain en:Category:2020 albums. Compositions ≠ musical works; musical works contain compositions. -- 15:32, 15 May 2022 (UTC)

Q9059213 ≠ Q5626704; Q9059213 contains Q5626704. -- 15:36, 15 May 2022 (UTC)
  • I merged the items because "ru:Категория:Музыкальные произведения XXXX года" is the same as "en:Category:XXXX compositions". I do not see a difference between "Composition" and "Musical work". They look like full synonyms after translation into my native language. Could you fix the issue? Fixing it looks like too hard a task for me. You may use different from (P1889) to prevent wrong merges in future. — Ivan A. Krestinin (talk) 15:51, 15 May 2022 (UTC)

"Determination method" error

Hello! determination method (P459) was added as a main statement to just setting up my twttr (Q64790997). Instead, it should be added as a qualifier to number of likes/upvotes (P10649), number of comments (P10651), number of dislikes/downvotes (P10650), and number of reblogs/shares (P10756).

https://www.wikidata.org/w/index.php?title=Q64790997&diff=1640345856&oldid=1639539094

This would actually be super helpful because it is annoying to add! AntisocialRyan (Talk) 23:52, 16 May 2022 (UTC)

Hello! The bot will stop such edits after this change. Could you discuss this edit with User:Trade? — Ivan A. Krestinin (talk) 20:56, 17 May 2022 (UTC)
Hello! That isn't the problem, the problem is that it was added as a main statement to the item and not a qualifier of one of the properties.
See its revision here, which I have since reversed: https://www.wikidata.org/w/index.php?title=Q64790997&oldid=1640622098
It did it to a number of items. AntisocialRyan (Talk) 21:10, 17 May 2022 (UTC)
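
The difference between a main statement and a qualifier, as described in this thread, can be sketched in Python. This is only an illustration under stated assumptions: the item is modelled as a plain dictionary of statements, the function `move_to_qualifiers` is hypothetical, and `Q_METHOD` and `Q_CLASS` are placeholder values rather than real item IDs.

```python
DETERMINATION_METHOD = "P459"
# The four count properties named in the request above.
COUNT_PROPERTIES = ("P10649", "P10650", "P10651", "P10756")


def move_to_qualifiers(item):
    """Return a copy of `item` with P459 main statements turned into qualifiers.

    `item` maps property IDs to lists of statements; each statement is a dict
    with a "value" and an optional "qualifiers" map (a simplified model).
    """
    methods = [s["value"] for s in item.get(DETERMINATION_METHOD, [])]
    fixed = {}
    for prop, statements in item.items():
        if prop == DETERMINATION_METHOD:
            continue  # the main statement is dropped
        new_statements = []
        for s in statements:
            quals = {k: list(v) for k, v in s.get("qualifiers", {}).items()}
            if prop in COUNT_PROPERTIES:
                existing = quals.setdefault(DETERMINATION_METHOD, []) if methods else []
                for m in methods:
                    if m not in existing:
                        existing.append(m)
            new_statements.append({**s, "qualifiers": quals})
        fixed[prop] = new_statements
    return fixed


# Placeholder values, not real item IDs.
item = {
    "P459": [{"value": "Q_METHOD"}],
    "P10649": [{"value": 1000}],
    "P31": [{"value": "Q_CLASS"}],
}
print(move_to_qualifiers(item))
```

Here the method ends up attached to the like count only, and statements of unrelated properties are left without it.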