
Welcome to Wikidata, GZWDer!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, please ask me on my talk page. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards! --Bill william compton (talk) 14:26, 28 March 2013 (UTC)



Reckless false edits

I have reverted the reckless false edits you made to Template:Q11608 in which you falsified the birth and death dates. Jc3s5h (talk) 21:57, 3 January 2020 (UTC)

Two Olivias

I'm still of the opinion that we should merge Olivia Evelyn Mary Fletcher-Vane (Q76152859) into Olivia Vane (Q76361064), or delete the Peerage item. The heart of the matter is: how much confidence do we need before we are allowed to say two records refer to the same person? It is pretty clear from Olivia's academic profiles that she is also known as Olivia Fletcher-Vane, she is from England, and the years listed in her education profile imply that she must have been born around 1991 - all of these corroborate the information from the Peerage entry. I would say that's enough confidence. Even if somebody published her date of birth, one could still argue it's insufficient - what if there were two Olivia Fletcher-Vanes born on the same day? There's no proof to the contrary, and we should act based on the most plausible representation of the facts, not a rigid requirement for external proofs of equivalence. Deryck Chan (talk) 13:41, 10 January 2020 (UTC)

@Deryck Chan: It may be of interest that for several years Chinese Wikipedia had two articles (and then two items) for one person, because of the lack of public sources about the relationship (see zh:Talk:鬼頭桃菜). The community also argues about the validity of self-published sources.--GZWDer (talk) 19:20, 10 January 2020 (UTC)
In the future, Wikidata is likely to have many more cases of multiple profiles (i.e. items with sets of identifiers/sources) in different fields about one living person. One may argue a simple merge may compromise the privacy of the people involved.--GZWDer (talk) 19:24, 10 January 2020 (UTC)
Another matter is that different wikis (and other sources) use different standards when handling BLP data. When some wikis decide to remove some personal information (e.g. this), it may still be included in other Wikipedias, and the data goes to Wikidata. Wikidata may also have dates imported from external databases (reliable or not) that come from sources considered unacceptable in Wikipedia (like information brokers).--GZWDer (talk) 19:33, 10 January 2020 (UTC)
Could always ask her if she thinks it should be merged or not, there's an email address at https://www.oliviavane.co.uk/research/about.html. Ghouston (talk) 00:57, 28 February 2020 (UTC)
Deryck Chan knows her personally.--GZWDer (talk) 00:59, 28 February 2020 (UTC)

Merge

You caused such a mess; I can't stop finding duplicate items, e.g. Q75455126. Please fix it...--Arnaugir (talk) 20:11, 10 January 2020 (UTC)

@Arnaugir: Duplicate items should not be taken to RFD. See Help:Merge.--GZWDer (talk) 20:15, 10 January 2020 (UTC)
This doesn't invalidate my assessment above.--Arnaugir (talk) 09:02, 11 January 2020 (UTC)

Same remark: can't you do something to fix the situation? Like doing some merges automatically? At least for the items with the same label, same birth date and same death date (with precision to the day)? See https://w.wiki/FM7 (sadly the query times out without the LIMIT - do you know a way to override that?). I did around 50 by hand and there were no false positives (with such precise conditions, it's very improbable). Cheers, VIGNERON (talk) 13:38, 12 January 2020 (UTC)
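(A note for anyone scripting this: when the query service times out, the same day-precise matching can be done offline against a dump. A minimal sketch in Python; the record shape and field names are hypothetical, not any existing tool's format:)

```python
from collections import defaultdict

def merge_candidates(items):
    """Group person records by (label, birth date, death date).

    `items` is an iterable of dicts shaped like
    {"qid": "Q1", "label": "...", "born": "1900-01-02", "died": "1980-03-04"}
    (hypothetical field names).  Records missing a birth or death date
    are skipped; any group holding more than one QID is a merge candidate.
    """
    groups = defaultdict(list)
    for item in items:
        if item.get("born") and item.get("died"):
            groups[(item["label"], item["born"], item["died"])].append(item["qid"])
    # Keep only groups with at least two items
    return {key: qids for key, qids in groups.items() if len(qids) > 1}
```

Each group is still only a candidate: as noted above, with day-precise dates and identical labels false positives are very unlikely, but a human glance before merging costs little.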

CAS numbers, InChIs etc.

Could you modify your future additions so that they do not add statements that cause constraint violations? You have added many CAS numbers, InChIs etc. Many of them are correct, but in some items such statements were deleted for a reason during manual curation of the data (databases have errors, and these have to be curated manually, e.g. by moving statements to other items). Unfortunately, you can't deduce automatically which statements were deleted from an item and are no longer present in WD (that's why deprecated rank should be used in such cases), but you should take constraints into consideration during automatic imports. Otherwise, all manual curation of data becomes a Sisyphean task. Regards, Wostr (talk) 22:22, 13 January 2020 (UTC)
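(For bot operators reading along: constraint reports can be queried per item through the wbcheckconstraints module of the WikibaseQualityConstraints extension, so an import job can at least flag items it has just pushed into violation. A hedged Python sketch; the response nesting assumed below is a simplification and should be verified against a live response:)

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://www.wikidata.org/w/api.php"

def extract_violations(response, qid):
    """Collect (property, constraint type) pairs whose check status is
    'violation' from a wbcheckconstraints response.  The nesting walked
    here (claims -> property -> statement -> mainsnak -> results) is an
    assumption about the module's output, not a guaranteed schema."""
    found = []
    claims = response.get("wbcheckconstraints", {}).get(qid, {}).get("claims", {})
    for pid, statements in claims.items():
        for statement in statements:
            for result in statement.get("mainsnak", {}).get("results", []):
                if result.get("status") == "violation":
                    found.append((pid, result.get("constraint", {}).get("type")))
    return found

def fetch_report(qid):
    """Fetch the live constraint report for one item (network call)."""
    url = API + "?" + urlencode({"action": "wbcheckconstraints",
                                 "id": qid, "format": "json"})
    with urlopen(url) as resp:
        return json.load(resp)
```

A bot could run `extract_violations(fetch_report(qid), qid)` after each write and log anything it just made worse, rather than leaving the whole report to manual curators.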

Please help

Hello, please link this article to Wikipedia with an item ID: Elmirasharifimoghadam

Mehdii158 (talk) 12:19, 16 January 2020 (UTC)

Peerage project

Hi! What's the current status of the import project from The Peerage website? I've been doing some reconciliation today from a London history source, and I have hit a number of cases where there are marriages recorded on the Peerage site for two people who have items here, but no spouse (P26) link. I thought these connections had been comprehensively imported. They're really important, for trying to trace chains of Peerage people who all need to be merged. Are there a lot still to do, or were most of them added? Jheald (talk) 19:55, 16 January 2020 (UTC)

I do have a plan to import spouses, but this requires analyzing the text. Another issue is the question of what counts as a spouse (the father/mother relationship is rather objective), so I also propose checking other sources before importing (The Peerage does not have a consistent format for describing spouses; genealogics.org does, but that would require matching and mapping against another external database).--GZWDer (talk) 21:46, 16 January 2020 (UTC)
  • I see. I just came to bring up the same issue. There seems to be a gap of some 48000 items that don't have any family relationships: [1]. --- Jura 11:29, 29 January 2020 (UTC)

The formats seem to be:

  • "(name) married <..>"

If there is some other parent mentioned:

  • (He or she) married <..>
    (He or she) married, (firstly|secondly), <..>

Hope that helps. --- Jura 07:34, 30 January 2020 (UTC)

It would also be very useful to have any "position held" information -- eg to chase down duplicate members of parliament, Lord mayors of London, bishops and archbishops, etc. Jheald (talk) 09:59, 30 January 2020 (UTC)
It looks like there may also be some further dates of birth & dates of death that were missed, eg on William Ireland Thomas de Courcy Wheeler (Q76354837). Jheald (talk) 08:42, 1 February 2020 (UTC)
This is a relatively new entry, not in Mix'n'Match yet.--GZWDer (talk) 11:41, 1 February 2020 (UTC)

Q22101484

Hi! I have noticed that there might be a problem in the above item created by your bot. The zhwiki article name in this item should be linked to Q7590467, according to the article names of the other Wikipedia versions. I have added the Chinese article name to Q7590467 and deleted the name on Q22101484 (so Q22101484 is now empty); however, when I checked the zhwiki article, the link to enwiki is "Churches in Galveston, Texas", but enwiki does not have an article with this name. I am not sure what to do to solve these issues and I would like to request your help. Thanks.--BenedictusFX (talk) 18:41, 29 January 2020 (UTC)

@BenedictusFX: Help:Merge--GZWDer (talk) 18:43, 29 January 2020 (UTC)
I have merged the two items, but it seems that the linkage to enwiki on zhwiki is "Churches in Galveston, Texas" and not "St. Mary Cathedral Basilica (Galveston, Texas)".--BenedictusFX (talk) 18:54, 29 January 2020 (UTC)
The local interwiki link should be removed.--GZWDer (talk) 18:56, 29 January 2020 (UTC)
I have noticed your edit on zhwiki (I first thought I had to make that edit myself, until I saw your edit on zhwiki and realised what you meant; I have since reverted my own edit) and the problem is now solved. Thanks for your help!--BenedictusFX (talk) 19:05, 29 January 2020 (UTC)


Labels from import

Hi GZWDer, the problem mentioned at User_talk:GZWDer/2019#Prefixes_in_labels is still unresolved. To help you clean it up, I fixed some and another user fixed some more, but many are still there. Please attempt to fix them before adding more. --- Jura 10:48, 2 February 2020 (UTC)

Notability criteria for new items

I notice that you've been creating some new items from MnM, with information limited to e.g. just a Geni.com ID and a Genealogics ID, and no apparent connection to any person with an existing Wikidata item.

Perhaps I am wrong in the above, but do you consider these people notable? Because on the face of it, they don't appear to meet our criteria. Jheald (talk) 23:05, 5 February 2020 (UTC)

@Jheald: They (recent ones) may be found here. At least they have an ancestor that is notable.--GZWDer (talk) 23:09, 5 February 2020 (UTC)
Who was that? I'm wary about pushing "inherited notability" further than one or perhaps at most two generations or degrees of separation. Jheald (talk) 23:12, 5 February 2020 (UTC)
  • If I can butt in: "inherited notability" is a concept from English Wikipedia. "Our criteria" is Wikidata:Notability in that entries on people must link to an outside source to show that they actually exist, and were not created as a prank or vandalism. Any genealogical database could be imported, if someone was willing to do the tremendous work load of running mix_and_match and then merging all the duplicates, and running various error detection queries. There appears to be a moratorium on adding any new large data sets of people because of current computational constraints. The query servers currently time out for even simple date queries for instance_of=human combined with at least two other fields. --RAN (talk) 00:43, 26 February 2020 (UTC)

wiktionary items

please create Wiktionary items in Wikidata with your bot Amirh123 (talk) 07:59, 10 February 2020 (UTC)

DOI upper-case constraint

I notice you added this constraint recently on property P356 - do you intend to run anything to fix these? Or is there an existing bot that does this? ArthurPSmith (talk) 20:57, 18 February 2020 (UTC)

This will be fixed once I have collected all DOIs (likely via this tool). But I have other tasks for now.--GZWDer (talk) 21:00, 18 February 2020 (UTC)
Looks like I added a lot of them; I'm working on fixing them now. Thanks for the pointer to the dumper tool, it's a lot more useful than the constraint violations page in cases like this! ArthurPSmith (talk) 15:21, 24 February 2020 (UTC)

d:Q86281122 d:Q86281152

? 91.197.junr3170 (talk) 15:32, 25 February 2020 (UTC)

Wikidata:Database reports/unmarked supercentenarians

I noticed you helping with the error correction for people who died before they were born; here is another good one: Wikidata:Database reports/unmarked supercentenarians. If they are over 110 years old, mark them as supercentenarians; otherwise they are more errors, typos, and vandalism. --RAN (talk) 00:28, 26 February 2020 (UTC)

merge

https://www.wikidata.org/wiki/Q4416508

https://www.wikidata.org/wiki/Q85800152

Abieule (talk) 19:26, 27 February 2020 (UTC)

Something wrong: you created a duplicate of an existing item

Q86372350 is obviously a duplicate of the long-existing Q13283399

--Slb nsk (talk) 10:54, 29 February 2020 (UTC)

Wikidata:Database reports/Humans with missing claims/P2600

Wikidata:Database reports/Humans with missing claims/P2600 is outdated, any way to update? 77.11.15.97 13:07, 5 March 2020 (UTC)

Large ..

Hi GZWDer,

if some are available, would you be so kind to upload articles about Q84263196 in priority? Thanks. --- Jura 20:54, 7 March 2020 (UTC)

Creating duplicates

Hi GZWDer. What is the point of creating an item without any general label defined, when another item already exists on the same topic with a label equal to the title of the article on the French Wikipedia (French label)? Here is another example. Rather than create a useless duplicate, could you please check whether an item exists and, if so, directly link the Wikipedia article to that item. Thanks! Regards. --Ideawipik (talk) 19:02, 9 March 2020 (UTC)

  • Here you may check whether there are existing items for a Wikipedia article, but often there are too many to check. After the items are created there are more tools to find duplicates, such as projectmerge.--GZWDer (talk) 19:07, 9 March 2020 (UTC)
Thanks, GZWDer, for the answer and the links. I understand the case and probably don't get all the facts. I just regret the artificial inflation of item ID numbers, and the fact that it is much more difficult to merge items than to link a wiki article (without a WD item) to an existing item.
Perhaps your bot (or script) should not create an item when another already exists with the same name. For example: Q3605395 (Adohoun) / Q86685478 (a duplicate item with no label, for a frwiki article with the same title), and several articles from fr:Modèle:Palette Mono (département).
@Pasleim: couldn't this projectmerge list be built directly from the wiki sites (or dumps) instead of from Wikidata items, when no WD item has been created?
For sure, we should strongly recommend that page translators/creators link their new pages to existing items. Regards. --Ideawipik (talk) 19:02, 23 March 2020 (UTC)

TR genea prop|P7977

I don't see the underlying template but we have P7977. --RAN (talk) 21:27, 19 March 2020 (UTC)

Andrew Drake

Thanks for catching my error! I will add the correct generations and link them together. The DAR website is terrible; there are supposed to be 4 other searchable databases, but I can't find the search page for them. There is a descendant database, a grave database and a bible database. Even finding the ancestor database from their home page seems impossible. Have you found the search page for their other DBs? If so, let me know and I will add a link to my Wikidata home page. --RAN (talk) 16:53, 24 March 2020 (UTC)

Swedish nature reserves

Thank you for all the fine work you have done! I was happy to find 4900+ nature reserves of Sweden imported by you, today :) --So9q (talk) 12:16, 27 March 2020 (UTC)

A lot of new nature reserves are being created right now in Sweden. I just loaded the latest source http://gpt.vic-metria.nu/data/land/NR.zip into JOSM and it reported 5041 features (with names) some 100 more than we have. Could you import the missing ones? The NVRID/Naturvårdsregistret ID (P3613) is unique I believe so you could filter based on that. Thanks in advance!--So9q (talk) 12:44, 27 March 2020 (UTC)

Q86688005

Q86688005 is junk; I've nominated it for deletion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:31, 7 April 2020 (UTC)

likewise Q87846062. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:33, 7 April 2020 (UTC)
and Q88114658. In each case you're creating items based, apparently, on invalid ORCID iDs. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:37, 7 April 2020 (UTC)
and Q87259374, where you added the ORCID iD "fix spelling". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:38, 7 April 2020 (UTC)
and Q86653963. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:44, 7 April 2020 (UTC)
Per What links here this rarely happens (compared with the several million ORCIDs the bot handled). I don't think special treatment is required, as the bot will not revisit the same ID, and the import from the current source (European PubMed Central) is mostly complete (more than 99% for the first 30 million IDs).--GZWDer (talk) 10:50, 7 April 2020 (UTC)
Why is your bot handling "several million ORCIDs"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:05, 16 April 2020 (UTC)
I don't mean several million different ORCIDs (though it may reach that stage in the future), but the bot processed that number of total authors with ORCIDs (each person may appear multiple times) and added that number of author (P50) statements.--GZWDer (talk) 22:15, 16 April 2020 (UTC)
These errors are rare; I don't see why they cannot be fixed manually.--GZWDer (talk) 04:42, 13 May 2020 (UTC)

Your reply does not address why you are adding such bad values, nor does it say what you are going to do to reduce or prevent such issues in the future.

On Q88225145 you added the ORCID iD http://orcid.org/0000-0003-4122-373. That is clearly junk; it is nowhere near matching the regex for that property. Why are you not checking your data before adding it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:28, 16 April 2020 (UTC)

On Q88335805 you added the ORCID iD [1], [1], [1], [1], [1], [1], [1], [1]. There are no other statements, other than that the subject is a human. Do you think that is acceptable? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:32, 16 April 2020 (UTC)


Yeah dude, your bot is bad. It can be good too, but also bad. Check out ORCID ID#Format violations. For example, it occasionally adds full URLs as ORCID iDs, and has trouble distinguishing between lowercase and uppercase X; see for instance Cafer Akkoz (Q92466947) and Elham Anisi (Q90621031). Please give your bot a stern talking-to, and maybe set it (or some other bot) on the journey of fixing its mistakes. -Animalparty (talk) 04:36, 13 May 2020 (UTC)
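(Most of the malformed iDs described in this thread, full URLs, truncated values, lowercase 'x', are catchable before import: an ORCID iD is sixteen characters whose last character is an ISO 7064 MOD 11-2 check digit. A minimal validator sketch in Python, not the bot's actual code:)

```python
import re

# 16 characters in four hyphenated groups; the final character may be
# an uppercase 'X', which represents the checksum value 10.
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")

def valid_orcid(orcid: str) -> bool:
    """Check the shape and the ISO 7064 MOD 11-2 check digit of an
    ORCID iD.  Full URLs, truncated values and lowercase 'x' fail the
    pattern; mistyped digits fail the checksum."""
    if not ORCID_RE.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    remainder = (12 - total % 11) % 11
    check = "X" if remainder == 10 else str(remainder)
    return digits[15] == check
```

For example, the well-known documentation iD 0000-0002-1825-0097 passes, while the value from Q88225145 mentioned above fails on format alone.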

CAS COVID-19 Anti-Viral Candidate Compounds

This is another time you are adding statements without any consideration of property constraints. You've managed to add one statement per item with two constraint violations: catalog (P972) with violation of value type constraint (Q21510865) and CAS Registry Number (P231) with violation of property scope constraint (Q53869507). Firstly, CAS number as a reference is nonsensical as it does not reference anything (and it should be modified or deleted), secondly – how do you plan to fix this situation you've created? Wostr (talk) 16:44, 14 April 2020 (UTC)

I'm coming to this because I repeatedly find CAS links with nonexistent CAS numbers. Seems you only make promises about cleaning up, no actual work? --SCIdude (talk) 09:09, 2 June 2020 (UTC)
@SCIdude: Currently these items do not have constraint violations. Do you mean the CASID is invalid?--GZWDer (talk) 09:14, 2 June 2020 (UTC)
Yes, see Q90545647 for example. --SCIdude (talk) 09:17, 2 June 2020 (UTC)
@SCIdude: This means CAS itself published a list of entries with some invalid CAS IDs. I will deprecate these IDs, but you should confirm they are not found in SciFinder (no other source has a complete list of CAS entries).--GZWDer (talk) 09:20, 2 June 2020 (UTC)

Q88579292

What is the point of items like Q88579292? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:04, 16 April 2020 (UTC)

@Pigsonthewing: Per Wikidata:Property_proposal/Archive/39#P2093, use of author name string (P2093) is temporary in nature. If an item can be found then it should be used. What's the problem with this item? Do you have evidence that it conflates multiple people? (In that case you can ask ORCID to lock the profile.)--GZWDer (talk) 22:12, 16 April 2020 (UTC)
The only issue I found is someone incorrectly merged it.--GZWDer (talk) 22:21, 16 April 2020 (UTC)
I didn't say anything about author name string. I didn't say anything about the bad merge, which I already fixed. I asked "what is the point of items like that"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:30, 16 April 2020 (UTC)
Use of author name string (P2093) is temporary in nature, so author (P50) is preferred if available. Many users are resolving authors using various tools, but this should be done automatically where possible.--GZWDer (talk) 22:32, 16 April 2020 (UTC)
again: I didn't say anything about author name string. I asked "what is the point of items like that"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:39, 16 April 2020 (UTC)
They work as normal items and more information can be added. Other tools also make use of them (example).--GZWDer (talk) 22:56, 16 April 2020 (UTC)

Bot appears to have created duplicate item

Hello. It looks like your bot has created a duplicate item? Q88643388 was created much more recently than Q16729936. I am new to Wikidata so not sure how this all works, but thought I should let you know. Cheers Ballofstring (talk) 22:50, 16 April 2020 (UTC)

@Ballofstring: In this case Philippa Howden-Chapman (Q16729936) did not link to an ORCID, so the bot could not find the item. See Help:Merge for how to handle this.--GZWDer (talk) 22:53, 16 April 2020 (UTC)
Ah right. Thanks for the link to the merge page. I have done that now. Many thanks Ballofstring (talk) 23:18, 16 April 2020 (UTC)

LargeDatasetBot imported invalid DOI please cleanup

Have you noticed that your bot LargeDatasetBot has created a few invalid DOIs? Sample query (not all from your bot): https://w.wiki/NL5 Any chance you're already in the process of cleaning these up? Wolfgang8741 (talk) 10:28, 19 April 2020 (UTC)

Once I have imported all entries (currently there are 1.28 million left; the import will be completed in no more than 32 days) I will fix them. I am not planning an immediate fix, as I need to deal with the holes (entries that failed to import) first, and if the invalid DOIs are removed now they will just be imported again. Eventually the invalid DOIs may simply be removed.--GZWDer (talk) 14:54, 19 April 2020 (UTC)
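(A syntax-level filter would already catch the DOIs flagged by the sample query above before they are ever written. A small sketch in Python, using a loose variant of the commonly cited Crossref-style pattern; it checks form only, not whether the DOI resolves:)

```python
import re

# Loose variant of the pattern Crossref suggests for modern DOIs:
# a "10." prefix, a 4-9 digit registrant code, then a non-empty suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")

def plausible_doi(doi: str) -> bool:
    """Syntax check only: catches truncations, stray prefixes such as
    'doi:' or 'https://doi.org/', and empty suffixes.  It cannot tell
    whether the DOI actually resolves."""
    return bool(DOI_RE.match(doi))
```

Checking resolution would need an extra HTTP request per DOI against https://doi.org/, which is why a cheap syntactic gate like this is worth running first during bulk imports.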

Block of LargeDatasetBot

Please, see User talk:LargeDatasetBot. Lymantria (talk) 17:23, 4 May 2020 (UTC)

Discussion at project chat

Hi! You might want to join this discussion as you have been indirectly mentioned. Bovlb (talk) 18:09, 12 May 2020 (UTC)

An item for every second

Please, stop your bot right now. All those items are perfectly useless at the moment, and have absolutely no need for being created. --Sannita - not just another it.wiki sysop 10:05, 22 May 2020 (UTC)

I am doing this as an echo of Wikidata:Property proposal/local time. People use refine date (P4241) to express time of day, but it cannot express everything.--GZWDer (talk) 12:47, 22 May 2020 (UTC)

Requesting Q76335638 for deletion

I'm new, but I'm not sure this meets the notability guideline. Thanks, WikiMacaroons (talk) 11:21, 27 May 2020 (UTC)

William Graham

Just noticed your said-to-be-the-same-as edits to William Graham (Q8010075) - I've checked and all three items do seem to be the same man (the husband of each daughter has an ODNB entry, and both of these state their wife's father was the Glasgow MP). Do you mind if I just merge them directly or is there any reason I should hold off? Andrew Gray (talk) 18:05, 4 June 2020 (UTC)

@Andrew Gray: Yes, I only added them because I did not find any indication in the article that William Graham had any more issue (children). Note I also asked on WikiTree.--GZWDer (talk) 18:09, 4 June 2020 (UTC)
@Andrew Gray: If possible, please also check Joseph Lawrence (Q76129051) and Joseph Lawrence (Q13529965).--GZWDer (talk) 18:11, 4 June 2020 (UTC)
Great, thanks - I'll merge the Grahams, and I've left a note at Wikitree. Lawrence is definitely mergable as well - Pollen's ODNB entry says "he married Maud Beatrice, daughter of Joseph Lawrence, a prominent Conservative MP". I'll do that as well. Andrew Gray (talk) 18:21, 4 June 2020 (UTC)
@Andrew Gray: Check Paulina Pepys (Q75495083) - the parents provided in The Peerage seem wrong, but there are some websites following it.--GZWDer (talk) 18:34, 4 June 2020 (UTC)
Hmm. The ODNB for her son Edward has "... the second but eldest surviving son of Sir Sydney Montagu (c.1571–1644) of Hinchingbrooke, Huntingdonshire, MP for Huntingdonshire, master of requests, and groom of the bedchamber to James I, and his wife, Paulina, formerly Pepys (d. 1638)"; the two-volume biography of Edward just says she was not very rich and doesn't name her parents, but it does point to a copy of her father's will here. This confirms her father was the John Pepys of Cottenham who died in 1589, with surviving children Elizabeth, Edith, Susan, Paulina, John (died 1604), Thomas (the elder), Thomas (the younger, d 1615), Robert (d 1630), Apollo (d 1644) & Talbot. His wife at the time of his death was Anne.
The footnote on p 104 about the eldest son John (died 1604) states that he married Elizabeth Bendish. Talbot is the same as Talbot Pepys (Q7679037), whose History of Parliament entry has "6th s. of John Pepys (d.1589) of Cottenham, Cambs. and Edith, da. and h. of Edmund Talbot of Cottenham".
So I think this indicates that John Pepys (Q75588710) is the brother of Paulina, and John Pepys (Q75588707) is the father - the opposite of the way the Peerage has it. Andrew Gray (talk) 18:57, 4 June 2020 (UTC)
@Andrew Gray: Joan Champernowne (Q6204937) - are the wife of Robert Gamage and the wife of Anthony Denny one person? Wikipedia does not mention her marriage to Robert Gamage, and WikiTree has two profiles, but they have the same day of death.--GZWDer (talk) 19:38, 4 June 2020 (UTC)
I suspect two different people (sisters?). History of Parliament & ODNB for Denny say they married in 1538 and she survived him as his widow in 1549, dying 1553. There's no date given for Gamage's marriage but their son was born 1535 and Robert Gamage died 1553 as well. This means she couldn't have remarried after being widowed, so it would imply that they had divorced, which was pretty unusual at the time and so I would expect the sources to have mentioned it somewhere if it was the case. Andrew Gray (talk) 20:31, 4 June 2020 (UTC)

Duplicate

Hi,

did you see that LargeDatasetBot sometimes creates duplicates? See e.g. Q84983076 and Q61657899 (now merged).

Best, 86.193.172.227 14:09, 9 June 2020 (UTC)

As the other item does not have an ORCID, the bot cannot find it. In the future, when more data are imported into these items, they will be easier to find.--GZWDer (talk) 17:42, 9 June 2020 (UTC)

RE: Q81110974

I have restored it. I have deleted a bunch of items on non-notable relatives. I deleted that one too because there were no sources other than the IDs. Esteban16 (talk) 21:34, 10 June 2020 (UTC)

Additions of unreferenced statements

According to Wikidata:Bots#Statement adding bots, bot operators should "Add sources to any statement that is added unless it has been agreed the data is 'common knowledge' in which case the bot should state where the information has been copied from." When I see a P31 statement saying "case report" or "meta-analysis", I expect this. Please comply with the policy. Charles Matthews (talk) 10:28, 17 June 2020 (UTC)

@Charles Matthews: If you open the European PMC reference, such as [3], you will see the pubType field (this is where the data comes from). Alternatively, the same thing appears when clicking the PubMed link. I will add a source when doing review articles.--GZWDer (talk) 10:33, 17 June 2020 (UTC)

Thank you for your opinion on this matter. I asked whether you will comply with the policy. It seems the answer is no. Charles Matthews (talk) 10:38, 17 June 2020 (UTC)

Invalid page numbers (Zootaxa)

Please fix this mess. --Succu (talk) 20:47, 17 June 2020 (UTC)

About the Xu Kexin incident

You previously felt that my description of the Xu Kexin incident item was inaccurate; I have changed it to: "speech controversy involving a second-generation official". Please take note. Assifbus (talk) 14:23, 18 June 2020 (UTC)

Your statement is clearly wrong. Xu Kexin herself said she is a second-generation official, and according to mainland netizens' digging, her father is the director of a local government agency, so I don't think it is inappropriate to call her a second-generation official. Assifbus (talk) 14:30, 18 June 2020 (UTC)

@Assifbus: If there are reliable sources describing the background of Xu Kexin's father, I suggest first writing it into zh:许可馨事件#当事人. A reminder: content considered a BLP violation on Wikipedia is not acceptable here either.--GZWDer (talk) 14:32, 18 June 2020 (UTC)

Then how about writing here: "speech controversy in mainland China"? Assifbus (talk) 14:36, 18 June 2020 (UTC)

@Assifbus: That works.--GZWDer (talk) 14:38, 18 June 2020 (UTC)

Q59039699

https://www.wikidata.org/w/index.php?title=Q59039699&diff=1030757198&oldid=964626363

I can't figure out the basis for matching these two David Ridgleys together, do you have any idea? I'm not sure that "David Ridgley" the librarian is the same as "David Latimer Ridgely" the son of the Governor. Gamaliel (talk) 12:41, 19 June 2020 (UTC)

@Gamaliel: The data is imported from the Mix'n'Match system; see here.--GZWDer (talk) 12:43, 19 June 2020 (UTC)
Looks like I created this mess! I'll clean it up, because the more I dig the more I'm convinced these are different people. Thanks. Gamaliel (talk) 12:47, 19 June 2020 (UTC)

GZWDer_(flood) going at ~1000 edits per minute

GZWDer, your bot account currently creates new items at roughly 1000 edits per minute (see here, "max single user rate"), which is beyond the edit resources available for all editors at Wikidata. Please slow it down to something of the order of 100/min maximum in order to avoid a block. —MisterSynergy (talk) 08:17, 12 July 2020 (UTC)

@MisterSynergy: I am monitoring the maxlag and currently it seems normal. Are there other issues involved in edit speed?--GZWDer (talk) 08:20, 12 July 2020 (UTC)
It is related to maxlag, and you have already brought it beyond 5 s. It was ~1 s before you started your bot. Be considerate with the edit resources, and be aware that the sawtooth pattern in maxlag starts at roughly 700 edits per minute *for all editors*. If you are going at 1000/min, you alone already bring the servers into trouble. —MisterSynergy (talk) 08:26, 12 July 2020 (UTC)
@MisterSynergy: Several weeks ago, when a bot was editing at <100/min, it took ~20 minutes for maxlag to climb to five seconds. Now it takes even longer. So I do not think such edits are really expensive for the query services. There are far fewer triples involved in a 300-byte item than in a 30 KB item.--GZWDer (talk) 08:31, 12 July 2020 (UTC)
It depends on other editors as well. Activity in the recent days was rather low, thus you cannot directly compare these situations.
Be aware that this is not a negotiable request. I am aware that there is no formal rate limit defined for bots currently. Yet, there is simply no reason to create pages at that rate which is known to strain the servers beyond all acceptable limits and thus I am prepared to issue a block in case I see this again. —MisterSynergy (talk) 08:36, 12 July 2020 (UTC)
Soon this will become a regular-running process using cron (instead of running a very large batch once or twice each year), so this will no longer happen.--GZWDer (talk) 08:37, 12 July 2020 (UTC)

As you seem to be unwilling to change the pacing and continued to go at edit rates of 500–1000 edits per minute with User:GZWDer (flood), I have now blocked the bot account after maxlag went beyond 5s again. Please implement a considerate throttling mechanism so that I can unblock it again. —MisterSynergy (talk) 10:18, 12 July 2020 (UTC)

@MisterSynergy: I have stopped the current process.--GZWDer (talk) 10:26, 12 July 2020 (UTC)
Okay, I am going to unblock now. If I see you going at this pace again, I will not hesitate to reblock, and request removal of the botflag from your bot account.
I have no idea how you edit, so I cannot help you how to fix the issue. You either need to drastically reduce the number of concurrent tasks (if there are any), or hard-code some sort of "sleep" command with a reasonable waiting time after each successful edit into your scripts (or do both). —MisterSynergy (talk) 10:33, 12 July 2020 (UTC)
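(The standard way to implement such throttling is the maxlag protocol: send a maxlag parameter with every write, and when the API refuses with error code "maxlag", sleep for the Retry-After interval and try again. A sketch in Python; the session object is duck-typed, e.g. a requests.Session, so the retry logic stays testable offline:)

```python
import time

API = "https://www.wikidata.org/w/api.php"

def api_call(session, params, maxlag=5, max_retries=10):
    """POST one API request with the standard `maxlag` parameter.

    When replication lag exceeds `maxlag` seconds, the API refuses the
    request with error code 'maxlag' and suggests a pause via the
    Retry-After header; we sleep for that interval and retry.
    `session` only needs a requests-style .post() method."""
    payload = dict(params, maxlag=maxlag, format="json")
    for _ in range(max_retries):
        resp = session.post(API, data=payload)
        data = resp.json()
        if data.get("error", {}).get("code") == "maxlag":
            time.sleep(int(resp.headers.get("Retry-After", 5)))
            continue
        return data
    raise RuntimeError("servers stayed lagged; giving up")
```

Pywikibot ships comparable behaviour out of the box via its maxlag setting, which is one argument for frameworks over hand-rolled mass-edit scripts.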

Merged Q5651245, Q72500102

You created the duplicate Q72500102 so I merged them now. --SCIdude (talk) 08:19, 12 July 2020 (UTC)

Merged Q418993, Q72461336

You created the duplicate Q72461336. --SCIdude (talk) 08:35, 12 July 2020 (UTC)

Merged Q204178, Q72460898

You created the duplicate Q72460898. --SCIdude (talk) 08:43, 12 July 2020 (UTC)

Merged Q7103624, Q72497972

You created the duplicate Q72497972. --SCIdude (talk) 09:03, 12 July 2020 (UTC)

Merged Q26840979, Q72516856

You created the duplicate. --SCIdude (talk) 09:48, 12 July 2020 (UTC)

Merged Q22330463, Q72487289

You created the duplicate. --SCIdude (talk) 09:57, 12 July 2020 (UTC)

Q97159590

Auto-creating new items like this doesn't make sense. You created a duplicate. Emptywords (talk) 12:42, 12 July 2020 (UTC)

Redundant item

Q97280767 is redundant to Q63126171. —Justin (koavf)TCM 19:52, 12 July 2020 (UTC)

See also Q63137866 and Q97187299. Why are you doing this? @MisterSynergy:. —Justin (koavf)TCM 19:59, 12 July 2020 (UTC)
They are "hidden" duplicates, which are not easy to find until an item is created.--GZWDer (talk) 20:35, 12 July 2020 (UTC)

Merged Q27109396, Q72488957

You created the duplicate. --SCIdude (talk) 09:17, 13 July 2020 (UTC)

Merged Q27155622, Q72443030

You created the duplicate. --SCIdude (talk) 09:42, 13 July 2020 (UTC)

Bot importing *everything* from Danish Wikipedia

Yesterday I noticed that you imported *every single* item from the Danish list of new articles. The page says that new pages will be bot-imported three weeks after creation at the earliest, to give us reasonable time to look at them. If you do this again, I will ask to have your account blocked. --Hjart (talk) 21:09, 13 July 2020 (UTC)

@Hjart: I have changed the page age limit to seven days. Are there any people actively handling unconnected pages?--GZWDer (talk) 23:17, 13 July 2020 (UTC)
3 weeks please. I know I am regularly checking out that page. --Hjart (talk) 07:12, 14 July 2020 (UTC)
Your bot is creating quite a lot of duplicates, as you should know. Finding the right item to connect a page to can be too tricky for a bot. That's why people are actively handling those pages. Hjart (talk) 05:40, 16 July 2020 (UTC)

Duplicate values

Wikidata:Database_reports/Constraint_violations/P8150#"Unique_value"_violations

Some 2000 entries there seem to be duplicates added by you, e.g. at [4] and [5]. --- Jura 07:42, 15 July 2020 (UTC)

@Jura1: I may fix them, though this is not the highest priority. Eventually these will be removed by KrBot.--GZWDer (talk) 07:46, 15 July 2020 (UTC)
Apparently they aren't (they have been there for a month now). How come your tool adds them twice? Can you investigate the bug and fix it? What is the "sandbox identifier" that was added as well? --- Jura 07:50, 15 July 2020 (UTC)
@Jura1: It was Semantic Scholar corpus ID (P8299), at that time the property did not exist. I will fix them soon.--GZWDer (talk) 07:53, 15 July 2020 (UTC)
If there isn't too much noise, maybe you could try autofix for P8299. --- Jura 07:57, 15 July 2020 (UTC)
I will use QuickStatements.--GZWDer (talk) 07:58, 15 July 2020 (UTC)
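For reference, QuickStatements (v1 command format) removes a statement when the command is prefixed with `-`. A small helper to generate removal commands for the duplicate values could look like this; the input pairs are hypothetical, and P8150 is the property from this thread.

```python
def qs_remove_commands(duplicates):
    """Turn (item, value) pairs into tab-separated QuickStatements v1
    commands that remove a string-valued P8150 statement.
    A leading '-' marks a command as a removal."""
    return [f'-{qid}\tP8150\t"{value}"' for qid, value in duplicates]


# Hypothetical example: remove a duplicated value from Q1.
commands = qs_remove_commands([("Q1", "12345")])
```

The resulting lines can then be pasted into the QuickStatements batch input.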

Delete this

Q97417109. This element is not used; it duplicates Q8456102.--Jordan Joestar (talk) 05:23, 16 July 2020 (UTC)

Category items lacking property P31

Hi, your bot created the item Category:Eisenhower Fellows (Q97429282), which is good, but it should also have added the property instance of (P31), which is mandatory. Can you do that in your future item creations, please? —capmo (talk) 13:47, 16 July 2020 (UTC)

Merge

Hi GZWDer. Can you merge Q97599100 and Q11276425? They are the same disease. I wrote a new article but forgot to link the Wikidata item. Thanks. --KediÇobanı🐈 11:45, 23 July 2020 (UTC)

@KediÇobanı: Help:Merge--GZWDer (talk) 11:47, 23 July 2020 (UTC)
Thanks. --KediÇobanı🐈 12:07, 23 July 2020 (UTC)

Please stop adding duplicates

Your practice of adding items for articles without making any checks for pre-existing items degrades Wikidata and forces other users to spend effort cleaning up.

If you *must* pick up articles with no items, please at least allow a decent time - a fortnight after creation, perhaps - for users who take more care than your bot does to do a better job than your bot is doing. --Tagishsimon (talk) 00:23, 24 July 2020 (UTC)

  • 1. Merging duplicates should be as easy as creating new items; 2. it is helpful if items are created earlier, as they may be used by others. If there are projects that actively take care of unconnected pages, please let me know.--GZWDer (talk) 04:15, 24 July 2020 (UTC)
YOU are the one adding duplicates. It is YOUR JOB to 1. merge them, or 2. not create duplicates. --SCIdude (talk) 08:31, 24 July 2020 (UTC)
Unconnected pages may be hidden duplicates. Creating items makes them visible.--GZWDer (talk) 08:33, 24 July 2020 (UTC)
Unconnected items may be checked by humans. "Making them visible" by creating wikidata items causes them to disappear from the wikipedia lists of new articles (in effect making them invisible), tricking us into believing they may have been properly taken care of, when they really have not.--Hjart (talk) 12:27, 24 July 2020 (UTC)
@Hjart: This assumes that there are enough people regularly checking unconnected pages, which is not the case for many large wikis. If you know of a wiki doing so, let me know.--GZWDer (talk) 13:05, 24 July 2020 (UTC)
I am aware that many (even quite seasoned) Wikipedia editors appear unfamiliar with Wikidata. The solution here is to communicate the purpose of Wikidata, how it works, etc. more clearly, not to trick all of us into believing everything is OK. --Hjart (talk) 13:22, 24 July 2020 (UTC)
nl-wiki is pretty much up to speed. But if your bot creates blank duplicates one minute after an article is created, things might look better than they really are. And your argument "items can be merged" is also false: duplicate items cannot be tracked down as easily as "no item yet" can. Better to put energy into creating 10 valuable items than 100 nonsense items "just to blank the list". Edoderoo (talk) 16:33, 6 August 2020 (UTC)
This is irrelevant. Adding duplicates only makes items visible if a person does the merging work. YOU make the duplicate, YOU do the work of merging; it is YOUR duty. --SCIdude (talk) 15:25, 24 July 2020 (UTC)
So the other option is not creating items at all; however, that also makes duplicates much more difficult to find.--GZWDer (talk) 15:31, 24 July 2020 (UTC)
Of the duplicates created by you that I've found, many were several years old. Many I found only by luck, and many others only by comparing Wikipedias. --Hjart (talk) 15:48, 24 July 2020 (UTC)
That does not mean not creating new items is a good idea. By comparing Wikipedias you can find some connections, but once an item is created there are more tools to find duplicates.--GZWDer (talk) 15:51, 24 July 2020 (UTC)
Please enlighten me then. What are those tools? You are creating tons of items with no data at all. Take a village somewhere with different names in different languages. With no data, how can we tell that it's the same village by looking at Wikidata? --Hjart (talk) 05:03, 25 July 2020 (UTC)
User:Pasleim/projectmerge, User:Ivan_A._Krestinin/To_merge, toollabs:wikidata-game/distributed/#game=1, none of which can be used without an item. The only tool that works without an item is Duplicity, where I regularly find a huge backlog.--GZWDer (talk) 05:10, 25 July 2020 (UTC)
User:Pasleim/projectmerge works only for items with the same sitelink titles. User:Ivan_A._Krestinin/To_merge appears to work only for items which actually have data. toollabs:wikidata-game/distributed/#game=1 appears to be for manually adding to items (which, as we can see, may be duplicates anyway). --Hjart (talk) 05:21, 25 July 2020 (UTC)
User:Ivan_A._Krestinin/To_merge works across sitelinks; for example, if separate items were created for en:1234 BC and de:1234 v. Chr., they will be reported by KrBot. Duplicates can also be found when searching Wikidata. In fact, I felt much regret when someone imported a list of geographic locations from various sources (before we emptied the cebwiki backlog), as there may exist cebwiki articles that users cannot find easily without items. Duplicates will eventually happen; better that they happen sooner rather than later. I hate infinitely growing unconnected-page backlogs.--GZWDer (talk) 05:31, 25 July 2020 (UTC)
Even then User:Ivan_A._Krestinin/To_merge appears to depend on items actually having data. With differently named sitelinks and no data it's useless. --Hjart (talk) 06:08, 25 July 2020 (UTC)
In User:Ivan_A._Krestinin/To_merge/enwiki you can find a number of items without statements. In addition, it is only one of the tools for finding duplicates. Appearing on that page makes duplicates discoverable; they may never be discovered if no item is ever created. Users who work on connecting new articles can still do so.--GZWDer (talk) 06:10, 25 July 2020 (UTC)
There are a lot of items on that page. Please point me to actual items in there without statements. Studying the Danish version I found a bunch of false positives. Do also (again) note that (at least on the Danish WP) once items are created, articles are removed from the list of new articles, making them harder to check. --Hjart (talk) 06:25, 25 July 2020 (UTC)
E.g. Q97277816. If your Wikipedia version has enough people handling the Wikidata connection of new pages (such as the Dutch one), that is fine. But most Wikipedias do not.--GZWDer (talk) 06:31, 25 July 2020 (UTC)
Why did you create Q97277816 without checking whether it was a duplicate? Why are you leaving it to other editors to manually merge items like this? --Hjart (talk) 07:18, 25 July 2020 (UTC)
In the English Wikipedia alone, there were 30,000-40,000 unconnected pages. It is impossible to check them one by one, and in reality nobody is doing so.--GZWDer (talk) 07:21, 25 July 2020 (UTC)
And you think that's an excuse for not at least trying? --Hjart (talk) 08:18, 25 July 2020 (UTC)
It is no more difficult to merge duplicates than to check individual unconnected pages.--GZWDer (talk) 08:27, 25 July 2020 (UTC)
I think your actions hide problems more than they help solve them. Did you communicate with anybody on e.g. the Danish WP before emptying our list of new articles?
That is an argument for not creating items for very new articles (to give people time to work on them), but on some wikis there seem to be no people working specifically on unconnected pages at all.--GZWDer (talk) 08:49, 25 July 2020 (UTC)
  • +1 It's not our job to clear up your shit. Lots of tools and 3rd party uses rely critically on WD being well de-duplicated; and for our own use it is crucial, to make sure that relevant info drawn in from external projects all gets added to the one item, not splattered all over the project. It's a fundamental issue of good data hygiene, of crucial importance to the project's success and wellbeing. I am fed up with putting my own work on hold because there is shit that you have created that has to be cleared up as a priority. And even more fed up that you seem to take no responsibility over what you do. No effort to avoid creating it. No effort to clean up after yourself. Not even any effort apparently to help others fix the shit you create. Rule #1 around here: if you create shit, then you own it, and it's on you to do all you can to clear it up. Take some responsibility, rather than forcing everybody else to clear up after your excrement. Jheald (talk) 16:47, 24 July 2020 (UTC)
    • See also Wikidata:Project_chat/Archive/2019/01#Mass_creation_of_new_items,_no_properties,_no_deduplication - my only intention is not to let the backlog of unconnected pages grow indefinitely. The only difference is that previously items were created once or twice each year, whereas now it is done regularly. Do you think 1. we should never create new items en masse (unless we can guarantee zero duplicates, which is usually not the case), or even 2. the best way to prevent duplicates is to stop editing Wikidata completely? There are many tools to find and merge duplicates afterwards.--GZWDer (talk) 16:58, 24 July 2020 (UTC)
      • So you're hiding the problem and making it a bigger effort to solve. This is not a solution to a problem, but a solution in search of a problem. Edoderoo (talk) 16:36, 6 August 2020 (UTC)
  • GZWDer, it seems a lot of headaches are caused not just by your mass imports, but by your apparent aversion to pre-planning and community discussion. Even giving a courtesy notice like "Hey I'm going to be importing hundreds of thousands of names from a random database, lots of them probably already have Wikidata items, and may have unreliable data" would help, but even better would be to announce: "Here's what I plan to do: let's discuss how best to minimize duplication beforehand, reconcile duplicates and merges afterwards, and tackle this with concerted community effort." But no, you largely work alone, in silence, dumping truckloads of messy data, in hopes that someone, some day, will clean up your messes. Wikidata is a community of editors, not just a personal game for people who like operating bots. Please act accordingly. -Animalparty (talk) 18:22, 24 July 2020 (UTC)
  • We have discussed what the bot does before: Wikidata:Requests_for_permissions/Bot/GZWDer_(flood)_4.--GZWDer (talk) 18:26, 24 July 2020 (UTC)
This is a lie. That bot request only describes one task, but the bot did much more than what is discussed there. See for example this discussion about your chemistry import. You have created 600 duplicates and have not helped with a single merge. --SCIdude (talk) 04:23, 25 July 2020 (UTC)

In general terms, the "mission creep" in GZWDer's bot permissions should be a concern for those who granted those permissions. There should be an audit. Charles Matthews (talk) 05:07, 25 July 2020 (UTC)

I agree. I think it's time to reconsider those permissions. --Hjart (talk) 05:24, 25 July 2020 (UTC)
+1 --Voyagerim (talk) 07:56, 25 July 2020 (UTC)
+1 --Sabas88 (talk) 16:26, 6 August 2020 (UTC)
Ideally, if every wiki had enough users cleaning up unconnected pages, my bot would be useless. Otherwise, there is a need to clean up an infinitely growing backlog of unconnected pages.--GZWDer (talk) 16:37, 6 August 2020 (UTC)
Just wanted to say, reasoning on a per-wiki basis is not the complete story, as there is also a per-project/theme dimension. There may or may not be editors combing through all en.wiki unconnected articles; but I (and others) certainly do comb through en.wiki articles about video games to connect them (see here). I also comb through empty items connected to en.wiki articles about video games (the overwhelming majority of which you created) and merge a large part of them.
So from my perspective, you're just emptying one backlog to fill up another one that is more annoying to process.
I also disagree with what you say above regarding tool support. On backlog no. 1, the PetScan integration with Duplicity makes it overall a breeze to spot articles with potential matches; and for the articles where really no item exists, at the very least I'd create them with P31:Q7889 (again, PetScan makes this easy). Conversely, when processing backlog no. 2, I can either start improving the item by adding statements and identifiers (and maybe find out later that it has a dupe, making most of that work a waste of time) or try my luck with Special:Search, an annoying process. I take a stab at backlog no. 2 pretty much every day and I have not been able to get it under 100 empty items for months on end.
I would really suggest to reconsider this workflow.
Jean-Fred (talk) 18:06, 6 August 2020 (UTC)

Q97621619 and similar

Hello. I notice that your bot account seems to create items for new Wikinews articles. I would suggest changing this to only create an item for a published article rather than drafts like this. It is more than likely that this WN draft will be deleted, and consequently leave an empty WD item. Green Giant (talk) 21:45, 26 July 2020 (UTC)

@Green Giant: I will skip articles with Template:Develop (Q20765099) or Template:Abandoned (Q17586361). If there are other templates for drafts, please let me know.--GZWDer (talk) 21:47, 26 July 2020 (UTC)
Cheers. I would also add any page that has one of the following: Template:Editing (Q17588240), Template:Tasks (Q13420881), Template:Minimal (Q17589095), Template:Prepare (Q17586294), Template:Review (Q13421187), n:Template:Quick review and Template:Breaking review (Q17586502). Green Giant (talk) 22:03, 26 July 2020 (UTC)
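A skip check of the kind discussed here could be as simple as the following sketch. The function name and input format are illustrative assumptions; the template titles are the ones listed in this thread, and a real bot would fetch a page's templates from the MediaWiki API (e.g. `action=query&prop=templates`).

```python
# Draft/review templates mentioned in the thread; a page transcluding
# any of these should be skipped by the item-creation bot.
DRAFT_TEMPLATES = {
    "Template:Develop", "Template:Abandoned", "Template:Editing",
    "Template:Tasks", "Template:Minimal", "Template:Prepare",
    "Template:Review", "Template:Quick review", "Template:Breaking review",
}


def is_draft(page_templates):
    """Return True if the page transcludes any draft/review template.
    `page_templates` is an iterable of template titles for the page."""
    return not DRAFT_TEMPLATES.isdisjoint(page_templates)
```

A published article with none of these templates would pass the check and get an item; anything still under development would be skipped.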

Deny option exists?

I've unlinked a file you created (Q97798633) leaving it dangling in the wind, and I'd unlink Q97861414 as well, only it would probably just encourage yet another uselessly created item, so I left it. Please see this discussion at the community portal. Thanks, Mathglot (talk) 00:58, 30 July 2020 (UTC)

@Mathglot: See Help:Merge and WD:RFD. For this article, I have added a soft redirect template so that it will not be imported again.--GZWDer (talk) 01:08, 30 July 2020 (UTC)
Thank you! Mathglot (talk) 01:12, 30 July 2020 (UTC)

Connect ms:HTTP 404 to Q206219

Hi. Can you help me connect ms:HTTP 404 to Q206219? --Syed Muhammad Al Hafiz (talk) 14:11, 30 July 2020 (UTC)

@Syed Muhammad Al Hafiz: Help:Merge--GZWDer (talk) 14:12, 30 July 2020 (UTC)

Thank you.--Syed Muhammad Al Hafiz (talk) 14:13, 30 July 2020 (UTC)

DSSTOX compound IDs

You have imported these IDs yesterday. Where is your bot permission to do this kind of task? --SCIdude (talk) 05:53, 6 August 2020 (UTC)

I have discussed the intention in the property proposal.--GZWDer (talk) 06:00, 6 August 2020 (UTC)
So there is no permission? What is your opinion on the notion that imports to items that are part of a WikiProject require at least a note on the project's talk page? Also, given a DSSTOX entry, how do you identify the item on which to put that statement? --SCIdude (talk) 07:26, 6 August 2020 (UTC)
The DSSTOX dump contains a mapping from DTXSID to DTXCID.--GZWDer (talk) 07:28, 6 August 2020 (UTC)
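Loading such a mapping from the dump could look roughly like this. The file format and column names below are assumptions for illustration; the actual DSSTox dump schema may differ.

```python
import csv


def load_dtx_mapping(path):
    """Build a DTXSID -> DTXCID lookup from a CSV dump of DSSTox.
    Column names are hypothetical; adjust to the real dump schema."""
    mapping = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            sid = row.get("dsstox_substance_id")
            cid = row.get("dsstox_compound_id")
            if sid and cid:
                mapping[sid] = cid
    return mapping
```

With such a lookup, a bot holding an item's DTXSID statement can derive the corresponding DTXCID value without querying the DSSTox website per item.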
And you have used that mapping? So, you pretend to have no opinion on communicating with a WikiProject, despite frequently failing to do so. What is your approach to teamwork, in general? --SCIdude (talk) 07:40, 6 August 2020 (UTC)