User:YMS/LC

Label Collector

  • A script to semi-automatically import labels, descriptions and aliases for multiple languages based on the articles in Wikipedia and other Wikimedia projects (screenshot)
  • Installation: Import User:YMS/labelcollect.js from your user JavaScript file (e.g. common.js)
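
A minimal sketch of what the import line in common.js might look like. The URL below follows the usual MediaWiki raw-script form and is an assumption, not something stated on this page; `mw.loader.load` is the standard MediaWiki call for loading a script by URL.

```javascript
// Assumed raw-script URL for the page named in the installation note.
var lcUrl = 'https://www.wikidata.org/w/index.php' +
    '?title=User:YMS/labelcollect.js' +
    '&action=raw&ctype=text/javascript';

// "mw" only exists on wiki pages; the guard keeps this snippet harmless elsewhere.
if (typeof mw !== 'undefined') {
    mw.loader.load(lcUrl);
}
```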


The tool makes suggestions for labels, descriptions and aliases.

  • Descriptions are suggested by extracting definitions from article introductions.
    • Additionally, descriptions may be suggested by transforming Wikidata statements (e.g. P31:Q571+P50:Q892 -> "book by J.R.R. Tolkien").
    • Be aware that this is a rough process and may lead to incorrect or badly phrased suggestions.
  • Aliases are suggested based on what's set bold in the article introduction.
  • Labels are suggested from the article title.
    • Additionally, for some item types (disambiguation pages, persons, ...) labels are suggested if no language using that same writing system has a different label.
  • Please keep in mind that these suggestions are not always meaningful, and even if they are, they might still be too long, too short or in the wrong grammatical form.
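
The statement-based description from the example above (P31:Q571+P50:Q892) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the script's actual code: the lookup tables and the function name `describeFromStatements` are hypothetical, and a real implementation would fetch labels from Wikidata rather than hard-code them.

```javascript
// Hypothetical lookup tables, hard-coded for the example only.
var typeNouns = { Q571: 'book' };             // values of P31 (instance of)
var authorNames = { Q892: 'J.R.R. Tolkien' }; // values of P50 (author)

// Build an English description from the first P31 and P50 values, if known.
function describeFromStatements(statements) {
    var type = typeNouns[(statements.P31 || [])[0]];
    if (!type) {
        return null; // no known type, so no suggestion
    }
    var author = authorNames[(statements.P50 || [])[0]];
    return author ? type + ' by ' + author : type;
}

console.log(describeFromStatements({ P31: ['Q571'], P50: ['Q892'] }));
```

Even in this tiny form the weakness is visible: the phrasing depends entirely on fixed templates, which is why such suggestions need review.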


The tool provides a mask to compare existing labels, descriptions and aliases with the suggestions made by the script, and edit all at once.

  • The mask displays all languages that are either in your Babel languages, or have a sitelink, or have a label or description.
  • Additionally, article introductions are displayed to allow further review and manual copy & paste.
  • For each language, every existing project (Wikipedia, Wikivoyage, ...) is displayed.
  • The mask lets you review and edit all fields and all languages at once.
  • Suggestions and own input can be reset to the existing value by clicking the "x" button next to each input field.
  • If a description already exists, no suggestion is filled in automatically, but one can be loaded via the "?!" button as long as it differs from the existing text.
  • If nothing should be changed for a certain language, the whole entry can be collapsed - changes in collapsed entries will not be saved.
  • Any language that is not in your Babel languages, has no sitelink, or already has both a label and a description is collapsed by default.
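
The display and collapse rules above can be summarized in a short sketch. The field names (`inBabel`, `hasSitelink`, `hasLabel`, `hasDescription`) and both function names are hypothetical and chosen for illustration only; the script's real data structures may differ.

```javascript
// A language is listed if it is a Babel language or has a sitelink, label or description.
function isDisplayed(lang) {
    return lang.inBabel || lang.hasSitelink || lang.hasLabel || lang.hasDescription;
}

// A listed language starts expanded only if it is a Babel language with a sitelink
// that still lacks a label or a description; everything else starts collapsed.
function isExpandedByDefault(lang) {
    return lang.inBabel && lang.hasSitelink && !(lang.hasLabel && lang.hasDescription);
}

var example = { inBabel: true, hasSitelink: true, hasLabel: true, hasDescription: false };
console.log(isDisplayed(example), isExpandedByDefault(example));
```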


Usage

  • Click the "Label Collector" link in the tools link box in the navigation bar either on...
    • an item page: This item will be loaded, and if you go on afterwards, the item with the next ID is loaded and so on.
    • a user page: The last item that user edited will be loaded, and afterwards the second-last, and so on.
  • This help page is displayed as a landing page. After a moment, it should get replaced by the editor user interface.
    • An external library (XRegExp) is loaded from the servers of Cloudflare (Q4778915). If this fails, it may be a certificate issue: try to load the library yourself and allow your browser to load pages from this server. If you mistrust content delivered by Cloudflare, a commercial service that gives me no control over the hosted files, don't use the script.
  • The editor will contain all the languages you have Babel boxes for on your user page, and all the languages that have a sitelink, label, description or alias for that item.
    • All other languages can be added manually with the "Add" button.
  • All language entries can either be expanded or collapsed by clicking the entry header.
    • Any changes made in entries that are collapsed will not be saved when you click "Save".
    • On startup, only entries that are in your Babel languages and have a sitelink but are missing a label or description are expanded.
    • If there is only one entry in total, it will always be expanded on startup.
  • Expanded entries show the existing label, aliases and description on the left side, and suggestions for new ones on the right side. Below, the introductions of all sitelinks (e.g. "[w]" for Wikipedia) are displayed and can be collapsed by a click on the triangle symbol. Additionally, a description made from the Wikidata statements ("[d]") is displayed, if possible.
    • For every expanded entry, Label Collector tries to create a description from the introduction of all sitelinks. Additionally, a description suggestion is created from the Wikidata statements.
    • Label suggestions are taken from the page title, brackets removed. Additionally, labels may be suggested for all languages using a certain script (e.g. Latin) if all existing labels in that script are the same.
    • Alias suggestions are taken from bold text in the article introductions.
    • If a suggestion is not automatically inserted in the corresponding input field (e.g. because there's already an existing value, or another suggestion from another project), it can be inserted using the "?!" button.
    • The suggestions can be edited in the input field.
    • If any change was made to a label, alias or description either by the automatic suggestion or manually, the input field is colored green.
    • Any suggestion or manual input can be reverted to the existing value by the reset button ("x").
  • Clicking either "Save & Next" or "Save & Close" saves all changes in all languages that are not collapsed.
    • All changed fields (colored green) will be saved, no matter if you changed them manually or the tool made a suggestion.
    • Multiple spaces and trailing or leading spaces are removed automatically.
    • Any "|" in the alias field will be used to separate multiple aliases.
    • Always check all suggestions made by the program. They may require some manual adjustments or even be completely absurd. Please see Help:Label, Help:Aliases and Help:Description for the general requirements for these.
  • When finished, the next item with no label or description can be loaded automatically ("Save & Next" or "Skip").
    • Items are skipped unless at least one of your Babel languages has a sitelink but still lacks a label or description. Checking this may take a while, especially if you're starting with something like Q1.
    • Only a limited number of items can be checked at once. Under certain conditions, finding the next item to edit may fail.
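
The label suggestion and the cleanup applied on saving can be sketched as below. These functions are illustrative only, not the script's own code, and all three names are hypothetical. One assumption is labeled in the code: "brackets removed" is taken to mean dropping a trailing parenthetical such as " (planet)" from the page title.

```javascript
// Assumption: "brackets removed" means stripping a trailing "(...)" qualifier.
function labelFromTitle(title) {
    return title.replace(/\s*\([^()]*\)\s*$/, '');
}

// Collapse runs of spaces into one and trim leading and trailing whitespace.
function cleanValue(text) {
    return text.replace(/ {2,}/g, ' ').trim();
}

// Split the alias field on "|", clean each piece, and drop empty pieces.
function splitAliases(field) {
    return field.split('|').map(cleanValue).filter(function (alias) {
        return alias !== '';
    });
}

console.log(labelFromTitle('Mercury (planet)'));
console.log(splitAliases(' Hobbit | There and  Back Again || '));
```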


Please help to improve this tool.

  • To improve the description recognition for your language, extend the regular expressions defined in the languageData variable of labelcollect2.js, or give me some advice on how I could do better.
  • Report bugs, request features, give suggestions and ask questions on the tool's talk page. Thank you.
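
For contributors who have not worked with this kind of pattern before, here is a rough idea of how a regular expression can pull a definition out of an article introduction. The object `exampleLanguageData`, its `definition` key and the function `extractDefinition` are hypothetical; the real languageData variable may be organized quite differently, and the script uses XRegExp rather than the plain RegExp shown here.

```javascript
// Hypothetical per-language patterns; not the actual structure of languageData.
var exampleLanguageData = {
    en: { definition: /\b(?:is|was|are|were)\s+(?:an?|the)\s+([^.]+)\./ }
};

// Return the text after "is a" / "was an" / ... up to the first full stop, or null.
function extractDefinition(intro, lang) {
    var data = exampleLanguageData[lang];
    if (!data) {
        return null; // no pattern defined for this language
    }
    var match = data.definition.exec(intro);
    return match ? match[1].trim() : null;
}

console.log(extractDefinition('Douglas Adams was an English author and humorist.', 'en'));
```

A pattern this simple stops at the first full stop, so abbreviations such as "J. R. R." cut the result short. Cases like that are where language-specific improvements help most.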