Wikidata:Requests for permissions/Bot/Dexbot 11
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Withdrawn by operator (at least for now). — PinkAmpers&(Je vous invite à me parler) 04:17, 12 March 2018 (UTC)
Dexbot
Dexbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ladsgroup (talk • contribs • logs)
Task/s: Auto-transliterating for names of humans
Code: Based on pwb, probably publish it soon.
Function details: The code analyses dumps of Wikidata and can build an auto-transliteration system for any given pair of languages from them. I started with Persian and Hebrew (some test edits: [1] [2]) --Amir (talk) 18:14, 7 April 2015 (UTC)
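The bot's code was not published in this request, so the following is only a minimal sketch of how a dump-derived transliteration table for a language pair could work, not the operator's actual method. It assumes labels can be aligned token by token, and the names `build_token_table`, `transliterate` and `min_count`, as well as the toy data, are hypothetical.

```python
from collections import Counter, defaultdict


def build_token_table(label_pairs):
    """Count how often each source token is aligned with each target token.

    label_pairs: iterable of (source_label, target_label) strings, for
    example the labels of one item in two languages. Only pairs with the
    same number of whitespace-separated tokens are used, aligned by position.
    """
    table = defaultdict(Counter)
    for source, target in label_pairs:
        src_tokens, tgt_tokens = source.split(), target.split()
        if len(src_tokens) != len(tgt_tokens):
            continue  # cannot be aligned token by token, so skip the pair
        for src, tgt in zip(src_tokens, tgt_tokens):
            table[src][tgt] += 1
    return table


def transliterate(label, table, min_count=2):
    """Return a transliterated label, or None if any token is unknown or rare."""
    result = []
    for token in label.split():
        candidates = table.get(token)
        if not candidates:
            return None  # token never seen in the training pairs
        best, count = candidates.most_common(1)[0]
        if count < min_count:
            return None  # too little evidence to write a label
        result.append(best)
    return " ".join(result)


# Toy data: upper-case strings stand in for labels in the target script.
pairs = [
    ("alan moore", "ALAN MOORE"),
    ("alan arkin", "ALAN ARKIN"),
    ("jason lee", "JASON LEE"),
    ("jason moore", "JASON MOORE"),
]
table = build_token_table(pairs)
print(transliterate("jason moore", table))  # JASON MOORE
print(transliterate("alan lee", table))     # None ("lee" was seen only once)
```

Returning `None` instead of a best guess keeps such a sketch conservative: a label is only proposed when every token has been seen at least `min_count` times with the same counterpart.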
Comment: please let me know when you try your system on a Cyrillic language. I'd like to see it myself. --Infovarius (talk) 14:10, 8 April 2015 (UTC)
- @Infovarius: I work with pairs of languages, like fa and he (where the bot adds Persian transliterations based on Hebrew and vice versa). Which pair of languages do you suggest? en and ru? Amir (talk) 11:54, 9 April 2015 (UTC)
- Probably you should have stated this in your request. Your phrase "I started with" has encouraged me :) No, I don't suggest Russian as I understand the complexity of the task. --Infovarius (talk) 13:16, 10 April 2015 (UTC)
- @Infovarius: I don't think Russian is so complicated that it should be abandoned. I took care of lots of different issues, including country of citizenship, etc., so it's not hard for this bot. I was asking which language you think is the best pair for Russian *to start with*. Amir (talk) 21:11, 10 April 2015 (UTC)
- Will the bot be able to detect delicate labels such as King An of Han (Q387311)? --Pasleim (talk) 19:24, 13 April 2015 (UTC)
- Are we ready for approval here? --Ymblanter (talk) 16:08, 15 April 2015 (UTC)
- Just a caveat when dealing with Chinese languages: Chinese to Latin script (and vice versa) transliterations are rarely standardized. For example, Alan Turing's given name might be transliterated into 艾伦 or 阿兰 (as in the case of Alan Moore (Q205739)) or 亚伦 (as in the case of Alan Arkin (Q108283)). These Chinese characters roughly resemble "Alan" when pronounced, but due to regional differences (i.e. mainland China, Taiwan, Hong Kong, etc.), they result in different transliterations. Even when two people's names are transliterated in the same region, the results can differ. There is simply no standardization on this matter. —Wylve (talk) 14:53, 23 April 2015 (UTC)
- Hmm, User:Wylve, just a question: is it wrong to put "亚伦" for Alan in Alan Turing? Amir (talk) 12:36, 25 April 2015 (UTC)
- It's not wrong, but it might not be the only way people refer to Alan Turing in Chinese. The lead sentence of Turing's article on zhwiki mentions that "Alan" is also transliterated as 阿兰. —Wylve (talk) 20:48, 25 April 2015 (UTC)
- @Wylve: I made 50 auto-transliterations [3]; please check and say whether anything is wrong or unusual. Thanks Amir (talk) 20:05, 16 May 2015 (UTC)
- I can't verify every name, since some of those people aren't mentioned in Chinese news sources. My standard of what is "wrong" or "unusual" is whether the transliterations you've produced are used predominantly in reliable and reputable sources. It is hard to judge sometimes, as there is a variety of transliterations used. For instance:
- Jonathan Ross is transliterated as 强纳·森罗斯 and also 喬納森·羅斯
- Leonard B. Jordan is also transliterated as 萊昂納德·B·喬丹
- Jimmy Bennett is also transliterated as 吉米·本内特, 吉米班奈, 吉米班奈特.
- Jason Lee is also named 杰森·李.
- "Scott" from A. O. Scott is also transliterated as 史考特.
- All of your edits should be fine when read in Chinese, as they all sound like the English names. Also, I have found this page ([4]), which documents Xinhua News Agency (Q204839)'s official transliterations of names. These transliterations are considered official only in Mainland China. —Wylve (talk) 21:58, 16 May 2015 (UTC)
@Ladsgroup, Wylve: Does this look okay for approval, or is there something we're missing? I don't speak (or read, for that matter) Chinese. Hazard SJ 05:40, 28 December 2015 (UTC)
- Amir: Sounds cool. Regarding the he-fa pair
- Is it expected to run on any entity of type human (Q5) that has a he label but no fa label, and vice versa? Do you think it would also be desirable to do this for places with country (P17)=Israel (Q801) (he to fa) and country (P17)=Iran (Q794) (fa to he)?
- Please note there are some exceptions in FA-HE pairing - see here: he:ויקיפדיה:כללים לתעתיק מפרסית. (before the table there is a short list of first names that are considered as exceptions)
- Tagging Amire80 and Eldad, who may have further advice. Eran (talk) 18:53, 4 January 2017 (UTC)
- @Ladsgroup: Only human names? How about geographical objects (populated places, rivers, etc.)? Right now I'm thinking of manually transliterating some batches of names of Ukrainian localities and harvesting them into WD; should I leave this task to your bot? :) --XXN, 14:49, 12 May 2017 (UTC)
- @Ladsgroup: Are you still working on this? — PinkAmpers&(Je vous invite à me parler) 23:26, 4 March 2018 (UTC)
- @PinkAmpersand: I'd love to; this is one of the most exciting things that can happen in labeling in Wikidata, but unfortunately I don't have time for this. I hereby withdraw from this request. Amir (talk) 23:57, 6 March 2018 (UTC)
- Okay. I'll close this out. Feel free to reopen anytime you'd like. Just remember to remove the entry from the list of archived requests. — PinkAmpers&(Je vous invite à me parler) 04:17, 12 March 2018 (UTC)