Wikidata:Requests for permissions/Bot/Emijrpbot 9
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Ymblanter (talk) 18:45, 12 July 2017 (UTC)
Emijrpbot 9
Emijrpbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Emijrp (talk • contribs • logs)
Task/s: add accent-free aliases to items whose English label contains accented characters. This makes the items easier to find with the search engine.
Code: code will be based on this Spanish script
Function details:
The bot finds items with English labels and adds an alias in which each accented character is replaced by its unaccented equivalent. Examples:
- Max Trénel (Q1522612), bot adds alias "Max Trenel"
- Sólo Para Fanáticos (Q7666252), bot adds alias "Solo Para Fanaticos"
- etc
The bot only processes an item when its label consists solely of characters matching "[a-záéíóú\-\. ]". Characters to be replaced:
- á -> a
- é -> e
- í -> i
- ó -> o
- ú -> u
I can add replacements for àèìòù if desired.
Examples of items that will be skipped:
- Łódź (Q580)
- labels with ü, ñ, ç
- labels in Cyrillic, Chinese, Japanese, Hebrew, Arabic, or any other non-Latin script
I am currently running an approved task for Spanish-language labels; you can see plenty of examples there. --Emijrp (talk) 10:44, 8 July 2017 (UTC)
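For illustration only, the filter and replacement rules described above could be sketched in Python roughly as follows. This is not the bot's actual code, which is not shown in this request. The function name `ascii_alias` is hypothetical, and the uppercase handling is an assumption: the request lists only lowercase characters, but its examples contain capital letters.

```python
import re

# Character class from the request, extended to uppercase letters.
# ASSUMPTION: the request lists lowercase only; capitals are added here
# because the example labels ("Max Trénel") contain them.
ALLOWED = re.compile(r"[a-zA-ZáéíóúÁÉÍÓÚ\-. ]+")

# Replacements listed in the request, plus assumed uppercase counterparts.
REPLACEMENTS = str.maketrans("áéíóúÁÉÍÓÚ", "aeiouAEIOU")


def ascii_alias(label):
    """Return the accent-free alias for a label, or None if the item is skipped."""
    if not ALLOWED.fullmatch(label):
        return None  # label contains characters outside the allowed set
    alias = label.translate(REPLACEMENTS)
    if alias == label:
        return None  # nothing was replaced, so no alias is needed
    return alias
```

With these rules, "Max Trénel" yields "Max Trenel", while "Łódź", labels containing ü, ñ or ç, and labels in non-Latin scripts all return `None` and are skipped.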
- This is very useful; I would support a more complete character translation list if you can come up with one (to cover all the non-Cyrillic European languages and Turkish, for instance). ArthurPSmith (talk) 15:48, 11 July 2017 (UTC)
- I am open to adding more replacements. A possible more complete set: [áàâäǎa̋ȁ] -> a. A side note: I will skip given-name items, following a suggestion by @Jura1:. Emijrp (talk) 16:21, 11 July 2017 (UTC)
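As a side illustration of the extended set, one alternative to listing every accented variant by hand is Unicode decomposition: normalise the label to NFD and drop the combining marks. This is a different technique from the explicit character list proposed in this discussion, not something the operator has said the bot will use, and the function name `strip_diacritics` is hypothetical. A minimal sketch using Python's standard `unicodedata` module:

```python
import unicodedata


def strip_diacritics(text):
    """Decompose text (NFD) and drop nonspacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) != "Mn"  # Mn = nonspacing mark (accents)
    )
```

Two caveats apply. Letters without a canonical decomposition, such as Ł, pass through unchanged, so a label like "Łódź" would still need to be skipped. The approach also folds ü, ñ and ç to u, n and c, which the request above deliberately leaves alone, so a whitelist of permitted characters would still be needed.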