Wikidata:Requests for permissions/Bot/GZWDer (flood) 3

GZWDer (flood) 3

GZWDer (flood) (talk · contribs · new items · SUL · Block log · User rights log · User rights · xtools)
Operator: GZWDer (talk · contribs · logs)

Task/s: Creating items for all Unicode characters

Code: Unavailable for now

Function details: Creating items for 137,439 characters (probably excluding those not in Normalization Forms):

  1. Label in all languages (if the character is printable; otherwise only Unicode name of the character in English)
  2. Alias in all languages for U+XXXX and in English for Unicode name of the character
  3. Description in every language that has a label for Unicode character (P487)
  4. instance of (P31): Unicode character (Q29654788)
  5. Unicode character (P487)
  6. Unicode hex codepoint (P4213)
  7. Unicode block (P5522)
  8. writing system (P282)
  9. image (P18) (if available)
  10. HTML entity (P4575) (if available)
  11. For characters in Han script also many additional properties; see Wikidata:WikiProject CJKV character

Where a character already has an item, the existing item will be updated.
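As a rough illustration of the function details above, here is a minimal Python sketch of the data that could be derived for one character from the standard library alone. It is not the operator's code (which is stated to be unavailable). The function name `describe_character`, the shape of the returned dictionary, and the bare-hex format used for P4213 are assumptions made for this example. Unicode block (P5522), writing system (P282) and image (P18) are left out because the standard library does not provide them.

```python
import unicodedata
from html.entities import codepoint2name


def describe_character(ch: str) -> dict:
    """Collect label, aliases and a few claims for one character (hypothetical helper)."""
    cp = ord(ch)
    # Unassigned and control characters have no name; fall back to "".
    name = unicodedata.name(ch, "")
    hex_cp = f"{cp:04X}"  # assumed value format for P4213, e.g. "03A9"

    # Printable characters are labelled with themselves,
    # everything else with the English Unicode name.
    label = ch if ch.isprintable() else name

    aliases = [f"U+{hex_cp}"]
    if name and name != label:
        aliases.append(name)

    claims = {
        "P31": "Q29654788",  # instance of: Unicode character
        "P487": ch,          # Unicode character
        "P4213": hex_cp,     # Unicode hex codepoint
    }
    # HTML entity (P4575) only if a named HTML 4 entity exists.
    entity = codepoint2name.get(cp)
    if entity:
        claims["P4575"] = f"&{entity};"

    return {"label": label, "aliases": aliases, "claims": claims}


omega = describe_character("\u03A9")
soft_hyphen = describe_character("\u00AD")
```

Here `omega` gets the character itself as its label, while `soft_hyphen`, which is not printable, is labelled with its Unicode name instead.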

Question: Do we need only one item for characters with the same normalized forms, e.g. Ω (U+03A9, GREEK CAPITAL LETTER OMEGA) and Ω (U+2126, OHM SIGN)?--GZWDer (talk) 23:08, 23 July 2018 (UTC)

CJKV characters belonging to CJK Compatibility Ideographs (Q2493848) and CJK Compatibility Ideographs Supplement (Q2493862), such as 著 (U+FA5F) (Q55726748) and 著 (U+2F99F) (Q55738328), will need to be split from their normalized form, e.g. (Q54918611), as each of them has different properties. KevinUp (talk) 14:03, 25 July 2018 (UTC)
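To make the normalization question concrete, the following Python sketch groups code points by their NFC form using the standard `unicodedata` module. The helper name `group_by_nfc` is hypothetical, and NFC is only one possible choice of normalization form; the request does not say which form would be used.

```python
import unicodedata
from collections import defaultdict


def group_by_nfc(chars):
    """Map each NFC-normalized form to the code points that normalize to it."""
    groups = defaultdict(list)
    for ch in chars:
        normalized = unicodedata.normalize("NFC", ch)
        groups[normalized].append(f"U+{ord(ch):04X}")
    return dict(groups)


# GREEK CAPITAL LETTER OMEGA, OHM SIGN, and U+8457 with its
# compatibility ideograph U+FA5F.
groups = group_by_nfc(["\u03A9", "\u2126", "\u8457", "\uFA5F"])
```

Code points that land in the same group are the ones for which a "one item or several" decision is needed. Keeping a separate item per code point, as suggested for the CJK compatibility ideographs, would mean using such a grouping only to link related items rather than to merge them.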

Request filed per suggestion on Wikidata:Property proposal/Unicode block.--GZWDer (talk) 23:08, 23 July 2018 (UTC)

  Support I have already expressed my wish to import such dataset. Matěj Suchánek (talk) 09:25, 25 July 2018 (UTC)
  Support @GZWDer: Thank you for initiating this task. Also, feel free to add yourself as a participant of Wikidata:WikiProject CJKV character. [1] KevinUp (talk) 14:03, 25 July 2018 (UTC)
  Support Thank you for your contribution. If possible, I hope you will also add other code (P3295) values such as JIS X 0213 (Q6108269) and Big5 (Q858372) to the items you create or update. --Okkn (talk) 16:35, 26 July 2018 (UTC)
  • Oppose the use of the flood account for this. Given the problems with unapproved, defective bot runs under the "GZWDer (flood)" account, I'd rather see this done with a new account named "bot", as per policy.
    --- Jura 04:50, 31 July 2018 (UTC)
  • Perhaps we could do a test run of this bot with some of the 88,889 items required by Wikidata:WikiProject CJKV character and take note of any potential issues with this bot. @GZWDer: You might want to take note of the account policy required. KevinUp (talk) 10:12, 31 July 2018 (UTC)
  • This account has had a bot flag for over four years. While most bot accounts contain the word "bot", there is nothing in the bot policy that requires it, and a small number of accounts with the bot flag have different names. As I understand it, there is also no technical difference between an account with a flood flag and an account with a bot flag, except for who can assign and remove the flags. - Nikki (talk) 19:14, 1 August 2018 (UTC)
  • The flood account was created and authorized for activities that aren't actually bot activities, whereas this new task is one. Given that defective bot tasks have already been run with the flood account, I don't think any actual bot tasks should be authorized for it. It's enough that I already had to clean up tens of thousands of GZWDer's edits.
    --- Jura 19:46, 1 August 2018 (UTC)
I am ready to approve this request, after a (positive) decision is taken at Wikidata:Requests for permissions/Bot/GZWDer (flood) 4. Lymantria (talk) 09:11, 3 September 2018 (UTC)
  • Wouldn't these fit better into the Lexeme namespace? --- Jura 10:31, 11 September 2018 (UTC)
    There is no language with all Unicode characters as lexemes. KaMan (talk) 14:31, 11 September 2018 (UTC)
    Not really a problem. Language codes provide for such cases. --- Jura 14:42, 11 September 2018 (UTC)
    I'm not talking about the language code but the language field of the lexeme, where you select the Q-item of the language. KaMan (talk) 14:46, 11 September 2018 (UTC)
    Which is mapped to a language code. --- Jura 14:48, 11 September 2018 (UTC)
Note: I'm going to be inactive because of real-life issues, so this request is On hold for now. Comments are still welcome, but I won't be able to answer them until January 2019.--GZWDer (talk) 12:08, 13 September 2018 (UTC)