Improving lexemes

Hi. I see you have been adding a bunch of Vietnamese lexemes lately. I have a few questions:

  • do you know any Vietnamese?
  • where did you find the lemmas?
  • would you like to help improve lexemes in other languages as well? I recently wrote LexUse; it's almost done, and most languages, e.g. English, have very few examples. Are you interested in helping add good examples once the script is mature?
  • are you using a script? Is the code public in that case?
  • we are talking about lexemes in the Telegram channel https://t.me/joinchat/ICn09hkymb2dwpFKwGo5uA and you are welcome to join the discussion there on how to improve the lexemes. We are 42 participants now and things are going on almost all the time.

Cheers--So9q (talk) 04:59, 3 January 2021 (UTC)

I have improved them. These lexemes are created manually. It is obviously not possible to create every needed lexeme by hand (say 2000*50 "core" lexemes), but it takes time to develop a mature bot, and it is not easy to find a usable source either. I do not use Telegram.--GZWDer (talk) 08:52, 3 January 2021 (UTC)
I agree it takes time to create good bot code. It has taken me some time to program Wikidata:LexUse, but I'm quite happy with the outcome, and the number of usage examples on Swedish lexemes has doubled since I wrote it. :D I can really recommend wikidataintegrator. The author is very easy to communicate with and the library is of good quality.--So9q (talk) 18:47, 12 January 2021 (UTC)
WikidataIntegrator is a heavyweight tool, and it is not necessary if we only want to import lexemes with specific data. However, the more important thing is that we need an agreement about the structure of a lexeme in some languages (this is why previously imported lexemes have no forms and senses: they are intended to be filled in later, once an agreement is reached).--GZWDer (talk) 18:54, 12 January 2021 (UTC)

Review of your bot actions and bot requests posted in Wikidata channel in Telegram today

Hi. Since you are not on Telegram, I just wanted to notify you about this here: "I found this: https://www.wikidata.org/wiki/Wikidata:Administrators%27_noticeboard/Archive/2020/08#Please_reconsider_bot_permissions_for_User:GZWDer_(flood) -> lost botflag for the flood account on 7 August 2020. Here are the details on the request for import of lexemes: https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/GZWDer_(flood)_5 Mahir blocked the bot from the Lexeme namespace forever on 3 Nov 2020 after the user went ahead importing lexemes without the support of the community first, see https://www.wikidata.org/w/index.php?title=Special:Log&page=User%3AGZWDer_%28flood%29&type=block

There are a lot of concerns about the current and past bot requests by this user. See the latest by Pintoch, 30 December 2020, near the bottom: "I still oppose this, as I am not confident the operator can respect the views of the community on this. General lack of trust in them given the history in this area." https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/RegularBot_2

It seems to me from reading these pages that GZWDer is really trying to mold their ideas/requests to fit the community, but I get the impression that they still have a long way to go before trust is (re)established.

This comment from Jura on a recent bot request sums it up: "Can you do a sum up of recently raised issues and provide ways we can check that they are fixed? You can't just open a new bot request and expect people re-repeat every problem you are meant to fix every time." https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/RegularBot_3

So basically the users in the above pages do not trust GZWDer to clean up the mess the bot might create. I think it's wise to ask GZWDer to clean up their previous bot mess first, and then, once good standing has been reached, we can discuss further bot request ideas for improving Wikidata. Until then any request should be denied IMO."

Good luck with the clean up--So9q (talk) 18:41, 12 January 2021 (UTC)

@So9q: Any users who have concerns about specific edits, please post at User:GZWDer/issues so that I can see how many issues there are.--GZWDer (talk) 18:45, 12 January 2021 (UTC)
@So9q: As I do not use Telegram, please ping any participants of Telegram chats here or direct them to the User:GZWDer/issues page.--GZWDer (talk) 18:51, 12 January 2021 (UTC)
done--So9q (talk) 08:27, 13 January 2021 (UTC)
@So9q: I have closed the first issue. I will start fixing more issues via semi-automatic tools with my main account (if the flood account is still blocked) on Monday unless User:Multichill says otherwise. Fixing some issues may require discussion, so they cannot be worked on currently, such as this one.--GZWDer (talk) 19:07, 15 January 2021 (UTC)

Unauthorized bot User:GZWDer (flood)

The bot account User:GZWDer (flood) got its bot flag removed last year by User:Ymblanter so it's no longer authorized to operate here. Please get it authorized again before doing any more edits. Multichill (talk) 18:52, 12 January 2021 (UTC)

One issue is that users raised concerns at Wikidata:Requests for permissions/Bot/RegularBot 3 and I have responded to them, but they have not replied further. This does not help with getting an approval.--GZWDer (talk) 18:56, 12 January 2021 (UTC)
That's unfortunate, but that's not the account I'm referring to. A bot flag is a statement of trust: Trust to do correct edits, trust to not abuse it, trust to fix issues, trust that you will clean up the mess that might be caused by a robot, etc.
It looks like multiple people lost that trust in you to the point that the authorization was revoked. It's going to be hard to regain that trust. Running an unauthorized bot (or running a bot under your main account) will only make matters worse. Multichill (talk) 19:14, 12 January 2021 (UTC)
@Multichill: I would prefer other users to summarize issues (or mess) at User:GZWDer/issues so that they may see how the issues got fixed. Talk page threads are not permanent.--GZWDer (talk) 19:19, 12 January 2021 (UTC)
As a side note, we need to define good practices for (semi)automatic editing. Many users are doing semi (or fully) automatic edits (e.g. imports) without any approval, and we do not have a clear distinction between tool-aided edits (e.g. Mix'n'Match) and semi-automatic ones.--GZWDer (talk) 19:28, 12 January 2021 (UTC)
You decided to do edits with the unauthorized bot account so I changed the block to a complete block. Multichill (talk) 11:42, 13 January 2021 (UTC)
@Multichill: Can I use this account for fixing issues only? I am not planning to do other edits until issues are resolved. Alternatively you may provide other routes to resolve the problem.--GZWDer (talk) 13:10, 13 January 2021 (UTC)
@Multichill, So9q: As a note: I am fixing existing issues but am not certain what is considered acceptable by the community. Some issues such as these 900 items are not easily fixable without automatic tools.--GZWDer (talk) 20:02, 13 January 2021 (UTC)
Hi again. I have a concrete suggestion for how you can go forward with semi-automated edits and slowly build up your reputation again. Do you want it?--So9q (talk) 06:05, 14 January 2021 (UTC)
@So9q: I need a formal statement about semi-automated edits from @Multichill: first. Note that, to keep user contributions readable, I prefer to use the GZWDer (flood) account as a legitimate alternative account for all semi-automated edits, even without a bot flag (and I did so for some time).--GZWDer (talk) 08:06, 14 January 2021 (UTC)
I don't understand. If your user account is still unblocked, why can't you just use that? A bot account is for fully automated edits (only, if you ask me). That's why we have an approval process and extra rules. It's possible and allowed to do semi-automated edits with a user account (I use LexUse with mine, for example). Are you willing to hear my suggestion now?--So9q (talk) 12:15, 14 January 2021 (UTC)
The tools or scripts may be run on the main account, but 1. when I check my contributions, it is not easy to distinguish manual edits from ones produced by bots; 2. Multichill does not recommend running them on the main account.--GZWDer (talk) 12:37, 14 January 2021 (UTC)

Items with mismatched ORCID and name

See Fluorescence and Multiphoton Imaging for Tissue Characterization of a Model of Postmenopausal Ovarian Cancer (Q92079503). It has an author with serial #1 whose name matches the author name string with serial #2. I followed the author for serial # (Travis W. Sawyer (Q91471773)). The ORCID for that item is wrong; it belongs to another named person. Strange, but that ORCID's actual owner (Travis W. Sawyer) should be the article's #1 author. I don't know what list you used to create items, but it was corrupted. I have found four or five items where the ORCID doesn't belong to them. It's frustrating when I run ORCIDator to disambiguate authors based on what they have in their ORCID and the names don't match (and ORCID fails). I've corrected the error, but I hope you'll take the time to investigate the history of these items.

I'm not sure what you need to do to isolate your bot's failures, particularly given these errors were made so long ago. It may be irreparable. I certainly hope you've improved your bot so it doesn't continue these errors. I know they may be a small fraction of your bot's output, but they still create problems with the data that are hard to uncover.

Thanks for your work to add content. Trilotat (talk) 22:32, 13 January 2021 (UTC)

I have a suggestion for a solution. Remove all author information from this corrupted dataset. We keep the article item. In this case the author linked from the article was missing an ORCID despite it being stated in the source at https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=EXT_ID:32311117%20AND%20SRC:MED&resulttype=core&format=json
That gives me the impression that this import was not carefully considered before being rushed into Wikidata. I'm generally in favor of deleting and reimporting rather than trying to clean up a mess like this. If anyone is willing to run a cleanup bot I'm all ears; until then I suggest we delete all incorrect statements and someone else will import correctly later on.
If the ORCID is missing for some of the authors, it might be a good idea to contact the source and ask them to fix it instead of creating items for authors in WD with no external ID like ORCID. How does that sound?--So9q (talk) 06:15, 14 January 2021 (UTC)
@Trilotat: It is common for someone to claim another person's paper (by an author with the same name) as their own contribution; as I recall, a tenured professor was fired for doing so. See User_talk:LargeDatasetBot/archive#Bad_data_in_Orcid; if you see any ORCID record that does not refer to one single person, please report it to ORCID so that ORCID may lock the record.--GZWDer (talk) 08:11, 14 January 2021 (UTC)
The error is in the source for Q91471779 where one author's ORCID is linked to another author. The author item Q91471773 was then created from the ORCID but using the source where it was used incorrectly. When Q92079503 was created it was just matched to the existing item based on ORCID. Comparing the names in the source with the names in the items would find some of these errors, although sometimes it will be the same person under a different name. If it's a corrupted dataset it could affect any part of it, not just the author information. More likely it occasionally has incorrect data, similar to most sources used in Wikidata. Peter James (talk) 12:16, 14 January 2021 (UTC)
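
The name comparison suggested above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function names, the input shapes (a list of name/ORCID pairs from the source and a mapping from ORCID to existing item label) and the matching rule (same family name and same first initial) are all assumptions, and the ORCID values used in the usage note are placeholders.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and punctuation, and split into tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return cleaned.split()


def names_compatible(source_name, item_label):
    """Assumed rule: same last token (family name) and same first initial."""
    a, b = normalize(source_name), normalize(item_label)
    if not a or not b:
        return False
    return a[-1] == b[-1] and a[0][0] == b[0][0]


def find_mismatches(source_authors, labels_by_orcid):
    """Return (orcid, source name, item label) triples that need human review.

    source_authors: list of (name, orcid) pairs taken from the source record.
    labels_by_orcid: dict mapping an ORCID to the label of the existing item.
    """
    mismatches = []
    for name, orcid in source_authors:
        label = labels_by_orcid.get(orcid)
        if label is not None and not names_compatible(name, label):
            mismatches.append((orcid, name, label))
    return mismatches
```

For example, `find_mismatches([("Jennifer Barton", "ORCID-B")], {"ORCID-B": "Travis W. Sawyer"})` would flag that pair. A flagged pair is only a candidate for review, since the same person may legitimately appear under a different name.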

How did you connect ORCID to Q91471777

Can you tell me the logic/source you used to connect the ORCID to this author? I generated an item for Jennifer Barton (Q104805370), the confirmed author of Evaluation of segmentation algorithms for optical coherence tomography images of ovarian tissue (Q91471779). I created a new item because I couldn’t confirm they were the same person. You generated the scholarly article Q91471779 and connected it to Q91471777 with an ORCID; if I knew your logic for assigning that ORCID we could connect that ORCID to Q104805370. If you can clarify, please let me know and I’ll disambiguate. I’ve connected many items to the other person so please allow me to do the disambiguation. Thanks, Trilotat (talk) 14:42, 14 January 2021 (UTC)

  • Here is the source of the data, and you can see the ORCID is assigned to the wrong author. Each article is imported only once and I did not see any other articles by Sawyer. Note this bot is not able to find authors without an ORCID.--GZWDer (talk) 15:16, 14 January 2021 (UTC)
    It appears to be correct - unless they (or the university or the publisher) submitted the ORCID of another person with the same name instead of their own. It's the other ORCID there that is assigned to the wrong author. Peter James (talk) 17:05, 14 January 2021 (UTC)
    EuropePMC has a "claim author" tool, and people may make errors claiming others' work.--GZWDer (talk) 17:07, 14 January 2021 (UTC)

After MGP, how about AcademicTree?

You've done a great job importing MGP information to Wikidata. Have you considered doing the same kind of import for Academic Tree (Academic Tree ID (P2381))? --Bender235 (talk) 21:49, 18 January 2021 (UTC)

As there would be a large number of potential duplicates, we need to spend some time matching existing items first.--GZWDer (talk) 06:15, 19 January 2021 (UTC)
Is there no way to match advisor-advisee pairs from MGP to Academic Tree data? --Bender235 (talk) 20:39, 19 January 2021 (UTC)
Not yet.--GZWDer (talk) 20:41, 19 January 2021 (UTC)
I see. I hope you consider it in the long run, though. Apart from that, there is also the RePEc Genealogy, based on RePEc Short-ID (P2428). Another source to keep in mind. Cheers. --Bender235 (talk) 03:35, 20 January 2021 (UTC)


Hi GZWDer, if you think OS is needed, please contact an OSer. Personally, I'd have done with mere deletion and obviously not requested it there. Thanks. --- Jura 13:51, 22 January 2021 (UTC)

An OSer is pinged in edit summary.--GZWDer (talk) 14:01, 22 January 2021 (UTC)
Somehow I doubt that works. --- Jura 15:30, 22 January 2021 (UTC)
To avoid the Streisand effect, it is still recommended not to request removal of private information in a public venue.--GZWDer (talk) 21:22, 22 January 2021 (UTC)
  • Can you do the necessary? It's still not done. --- Jura 09:27, 23 January 2021 (UTC)
    • @Jura1: please email oversight@wikidata.org --GZWDer (talk) 16:25, 23 January 2021 (UTC)
      • If you still think it's needed, please do the necessary. Personally, I think the requested deletion is sufficient. If you won't proceed, I will undo your edit. --- Jura 16:57, 23 January 2021 (UTC)
        • @DannyS712: What's your opinion for that? You can see the history of AN to see what I refer to.--GZWDer (talk) 17:00, 23 January 2021 (UTC)
          AN has a rather large history - not sure what you're discussing, but if it's something that needs oversight, emailing the OS team is the way to do it, and if it just needs rev-del I would ask on IRC in #wikidata-revdel DannyS712 (talk) 01:08, 24 January 2021 (UTC)
  • So what do you prefer to do? --- Jura 12:57, 24 January 2021 (UTC)

Bulk deletion request

See this request. Greetings, --Dick Bos (talk) 16:31, 31 January 2021 (UTC)

Three-way author mismatch

The current version of George K C Wong (Q91935629) seems to confuse three people: the label is "George K C Wong", while "Philippe Bijlenga" (for whom we have Philippe Bijlenga (Q42305437)) is listed as an alias (due to a merge from Q91935643 via QS batch 38147), and the ORCID points to an entry under "Jose Suarez". All of this suggests that your workflows for these matters would benefit from close inspection. --Daniel Mietchen (talk) 21:30, 16 February 2021 (UTC)

@Daniel Mietchen: A simple investigation found that in [1] and [2], the author "George K. C. Wong" is linked to ORCID 0000-0003-0548-9936, while in [3] and [4], the corresponding author "Philippe Bijlenga" is also linked to ORCID 0000-0003-0548-9936. So @ArthurPSmith: should we ask ORCID to disable this ID?--GZWDer (talk) 08:32, 18 February 2021 (UTC)
Wow, that's a mess. Do you know why your code merged those two people? Anyway, ORCID should definitely be notified that George K. C. Wong appears to be using this ID that has a very different name. GZWDer are you willing to do that? Click on "Help" in the bottom right, then "Contact Us" and select the right option (maybe "Other" in this case) and fill in the form. ArthurPSmith (talk) 14:01, 18 February 2021 (UTC)
The likely situation is: (1) the bot processed George K. C. Wong/0000-0003-0548-9936 and there was no item for that ORCID; (2) the bot created a new item for that information; (3) the bot processed Philippe Bijlenga/0000-0003-0548-9936, but due to query service lag it did not find the new item (there is an LRU cache, but the entry may have been purged by other ORCIDs); (4) the bot created another item for that ORCID; (5) these two items had the same ORCID and were merged.--GZWDer (talk) 14:12, 18 February 2021 (UTC)
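
One way to avoid the duplicate creation in steps (3) and (4) is to remember every item the bot itself created during a run, without eviction, and to consult that record before trusting a possibly lagging lookup. The following is a minimal sketch under stated assumptions, not the bot's actual code: the class name and the two callables (`lookup` for the query service, `create` for item creation) are hypothetical.

```python
class ItemResolver:
    """Resolve an ORCID to an item ID without creating duplicates in one run."""

    def __init__(self, lookup, create):
        # lookup: hypothetical callable, orcid -> item ID or None (may lag behind).
        # create: hypothetical callable, (orcid, name) -> ID of the new item.
        self._lookup = lookup
        self._create = create
        # Items created in this run; unlike an LRU cache, nothing is evicted.
        self._created = {}

    def resolve(self, orcid, name):
        # Check our own creations first, since the lookup may not see them yet.
        if orcid in self._created:
            return self._created[orcid]
        item = self._lookup(orcid)
        if item is None:
            item = self._create(orcid, name)
            self._created[orcid] = item
        return item
```

With this in place, processing "George K. C. Wong" and then "Philippe Bijlenga" under the same ORCID would return the same item even if the lookup still returns nothing. It would not detect that the two names disagree, so a wrongly assigned ORCID in the source still needs a separate check.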
Both articles have a list of authors and collaborators and Suarez is the first name in the collaborators, so a more likely explanation is that the ORCID was added in the wrong place. Some items that now link to Q91935629 should link to either Q42305437 or an item for Jose I Suarez. Peter James (talk) 09:44, 3 March 2021 (UTC)

Unsourced claim, again

Hi. About mendelevium (Q1898) and element symbol (P246), and your [5] revert.

As I wrote with my revert: you did not source this claim, which is otherwise neither known nor common. Then you reverted *without providing a source*. The Help link does not support or justify your claim. And, I must say, that response is very paternalistic. I strongly suggest you comply with basic wiki quality standards (instead of edit warring with empty es's). -DePiep (talk) 21:12, 25 February 2021 (UTC)