Wikidata:Translators' noticeboard/Archive/2020/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Problem to add interwiki/data

On uk:wikt, I need to link a page with its ru/en entry (mostly ru). I enter the language code correctly, but when I enter the page name I get an error: "The page «Шаблон:імен de (e)s er ern/n» could not be found on the site «zhwiktionary». The external client site «zhwiktionary» did not provide information about the page «Шаблон:імен de (e)s er ern/n»." It didn't find the page on zhwiktionary. WTF?--Albedo (talk) 10:57, 14 March 2020 (UTC)

This looks like phab:T247477. --Matěj Suchánek (talk) 10:57, 15 March 2020 (UTC)

Translation extension and Wikidata:Glossary format

What would be needed to get translations working if this is split into subpages? e.g. Wikidata:Glossary/alias, Help:Glossary/article placeholder. The subpages could be transcluded back into Wikidata:Glossary or into other pages (e.g. Help:Wikibase extensions (list), Help:Alias (summary section) ). --- Jura 20:56, 1 March 2020 (UTC)

@Jura1: {{TNT}} can handle it, but humans not as well: it would mean many small pages, all translations would need to be migrated, it would be harder to follow changes through the watchlist, etc. I’d recommend using Labeled Section Transclusion instead. —Tacsipacsi (talk) 00:54, 2 March 2020 (UTC)
@Jura1, Tacsipacsi: yes, there is also {{TNTN}}, if we just want to translate the link label but land on an en page; for example, "Bistro global" in fr landing on Project Chat. But I am of the same opinion as Tacsipacsi: for the page you mention in the title, that makes too many subpages to manage (and very small ones at that).
Now, I try to "read between the lines". I know you have a history and you are someone established in this project. So I wondered why you wanted to do something so bizarre (I am not criticizing you). I think you want to classify both the original terms and the translated terms, right? For example in French: « Élément » before « Identifiant externe » (the reverse in English, "External ID" before "Item"). Am I right? If so, there is a much easier way, do I write the guidelines for you here? Cordially. ps. It's better to talk about the underlying problem to get to the heart of the problem faster. Eihel (talk) 03:37, 2 March 2020 (UTC)
Sorting already works, but it's a bit odd that the term isn't available separately from the definition. --- Jura 11:06, 2 March 2020 (UTC)
it seems that for translators, it shouldn't really change: at least if they are all added to the same group of messages.
The size of the units shouldn't change either.
The main problem there is now would persist: they are still about completely unrelated topics (besides being about Wikidata).
A separate page for each entry would make it much easier to query and retrieve them. Users wouldn't need a new extension either.
(Other options I had thought about were the use of lexemes or a monolingual-text property, but this would make translations much more complicated.)
Maybe if we start the pages at Help:Glossary/, the extension would directly propose the old translations as new ones (for unchanged texts). --- Jura 11:06, 2 March 2020 (UTC)
@Jura1: The problem with aggregate groups is that they are not multi-level, that is, if we create an aggregate group for the glossary pages, they would no longer be in the Help aggregate group. Many small pages count not when translating, but when using usual wiki tools to check changes: watchlist, RC etc., as well as when making changes to the English original: now, multiple sections can be changed at once, which makes almost anything faster (typo fixes, adding new items, updating multiple items as Wikidata and the Wikibase software changes etc.)
Why is it a problem that they are about unrelated topics, and how would it be solved by splitting and re-transcluding them? A glossary inevitably contains slightly related concepts, and I don’t feel it’s an issue at all. The Wikidata:Glossary page would still contain these unrelated topics if they’re transcluded rather than being right there in the wiki code.
What users would need a new extension? We’re on Wikidata, and LST is installed here (see Special:Version). Where do you want to query it? If only on Wikidata pages, a template can be created, which takes only a glossary item name, so transcluding in other pages would not be a problem.
As far as I know, the extension only proposes translations as long as they are present somewhere, that is, not after they have been removed from the original page. Although I don’t feel it a huge concern, as I would expect the person implementing this change to transfer the translations ASAP, so either the translation memory isn’t used anyways, or the original page can be kept for the time being. (If this change is implemented at all, of course.) —Tacsipacsi (talk) 11:12, 4 March 2020 (UTC)
Interesting. Didn't know we had LST. I wonder if it's actually used. Reassuring that translations wouldn't be lost.
BTW, I tried an alternate approach at Wikidata:WikiProject_Movies/new_films#Test. View in Special:Translate. What do you think of it? Changes would all be one page. --- Jura 13:12, 6 March 2020 (UTC)
Easy-to-understand is certainly not among its strengths. :) How is it better than other methods mentioned in this section? Or is it just a solution for a random different problem, and you don’t intend to use it on Wikidata:Glossary? As it’s currently full of sandbox items and properties, I don’t see how it would work “in the wild”, but Listeria’s dynamics is probably not very compatible with the staticity Translate needs (frequent updates mean frequent need for mark for translation, English original strings can often change, even translation units can dynamically appear and disappear). —Tacsipacsi (talk) 23:03, 6 March 2020 (UTC)
Well, the output there is similar to what's found on Wikidata:Glossary.
The main difference is that changes would all be on one page, not on subpages (apparently one of your concerns). The translation function would work similarly to today.
An advantage could be that translations get a unique id, e.g. Translations:Wikidata:Glossary/15397819/fr for the entry from Q15397819#Glossary_entry.
Updates would only happen if one edits an entry. There shouldn't be more changes than there are today.
The sample currently lacks links to edit the statements on items, but that's fairly easy to add and could also happen with other tools.
Did I miss anything from a translator's perspective? --- Jura 01:54, 7 March 2020 (UTC)
Oh, I’m starting to understand what you plan: you want to transclude Translations: namespace pages directly, don’t you? Unfortunately, translation variables wouldn’t work this way: for example, when there’s "Par exemple, l’élément $q a le libellé […]" on Translations:Wikidata:Glossary/22/fr, that results in a literal $q when transcluded directly, while it’s replaced by [[Q2]] when it appears on Wikidata:Glossary/fr (or is transcluded from there).
Of course, using items doesn’t necessarily imply more frequent updates, but I feel that more people would boldly edit items than they edit translatable pages directly. —Tacsipacsi (talk) 23:51, 7 March 2020 (UTC)
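The difference described in the comment above can be illustrated with a toy model. This is only a sketch, not the Translate extension's actual code: the function names `render_on_translation_page` and `transclude_unit_directly`, and the simple `$name` substitution rule, are assumptions made for illustration. Only the example sentence, the `$q` placeholder and the `[[Q2]]` value come from the discussion.

```python
import re

def render_on_translation_page(unit_text: str, variables: dict) -> str:
    """Toy model: fill each $name placeholder with the value that the
    source page defines for it (hypothetical substitution rule)."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched.
        return variables.get(name, match.group(0))
    return re.sub(r"\$(\w+)", substitute, unit_text)

def transclude_unit_directly(unit_text: str) -> str:
    """Toy model: a unit page transcluded on its own has no access to
    the variable values, so the placeholder stays literal."""
    return unit_text

# Example unit text and variable value taken from the discussion.
unit_fr = "Par exemple, l’élément $q a le libellé […]"
variables = {"q": "[[Q2]]"}

print(render_on_translation_page(unit_fr, variables))
print(transclude_unit_directly(unit_fr))
```

In this model the first call yields the sentence with the link in place, while the second still contains the raw `$q`, which is the breakage a reader would see.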
I think it's one of the problems of the glossary that it isn't edited or known much, if at all.
Good point about the tvar-variables. Somehow I couldn't get them to work in the query with Listeria anyways, so I went with qid anchors. Seems that works fairly well.
I thought primarily of making it possible to retrieve the text (if needed) from the subpages, but working transclusions help too.
This more structured approach can avoid that we get two glossaries on Wikidata with entries about lexeme, form, sense (as we do now). --- Jura 12:23, 8 March 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Jura1: Translation variables and anchors are two different things: anchors are before the translation unit, while translation variables are in the middle of it (in your model, this means the item statements). Making the translation less convenient for the translators (by losing the translation variables) would require very strong reasoning, and I still don’t see it here. Neither do I see how involving items would help reduce the number of glossaries we have. Actually, what subpages are you speaking about? I think I’m getting lost… (By the way, we already have unique translation unit IDs: the translation IDs are unique within each translatable page, and page titles are unique.) —Tacsipacsi (talk) 23:20, 10 March 2020 (UTC)

  • I'm not sure how one could find the translation IDs now from merely having the name of the entry. It seems to me that translations are currently somewhat lost in the translation extension.
Can you explain what translators would lose from the anchor system? (I assume they understand the link text).
Duplication could be avoided by generating different glossaries from partially the same items (e.g. lexeme would be in both glossaries). --- Jura 17:34, 11 March 2020 (UTC)
The translation IDs can be got from the English page’s source code (the <!--T:n--> comments), and also with the search, e.g. Special:Search/conflation inlanguage:en prefix:Translations:Wikidata:Glossary (to search for “conflation”).
Translators wouldn’t lose the anchor system, as i) that would remain, as far as I understand; and ii) that’s outside the translation area anyway. They would lose the translation variables, which make it easier to paste texts that shouldn’t be translated into the middle of the translated text (the presence of these variables is also a strong warning that these should not be translated; without them it’s more likely that someone will translate something that shouldn’t be translated). See for example my above example, where the QID link is within a translation variable.
That deduplication is only for English, the translations wouldn’t be deduplicated. It’s more than nothing, but I don’t think it outweighs the costs of this change. —Tacsipacsi (talk) 00:40, 13 March 2020 (UTC)
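The lookup of translation IDs from the page source, as described in the comment above, does not have to be done by hand. A minimal sketch, assuming the source wikitext has already been fetched as a string: the helper name `translation_unit_titles` and the sample text are hypothetical, while the `<!--T:n-->` marker and the `Translations:<page>/<id>/<lang>` title pattern are the ones mentioned in this discussion.

```python
import re

# Unit markers as they appear in the source of a translatable page,
# e.g. <!--T:22-->
UNIT_MARKER = re.compile(r"<!--\s*T:(\S+?)\s*-->")

def translation_unit_titles(wikitext: str, page: str, lang: str) -> list:
    """Return the Translations: page title for every unit marker found
    in the given wikitext, in order of appearance."""
    return [
        f"Translations:{page}/{unit_id}/{lang}"
        for unit_id in UNIT_MARKER.findall(wikitext)
    ]

# Hypothetical excerpt of a glossary source page (not the real text).
sample = """<!--T:21-->
; Alias
: An alternative name for an item.

<!--T:22-->
; Label
: The main name given to an item.
"""

for title in translation_unit_titles(sample, "Wikidata:Glossary", "fr"):
    print(title)
```

Mapping an entry name to its ID would still require matching the text that follows each marker, which this sketch leaves out.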
It's clear that translation IDs can be found, but it seems hard to do other than manually, one by one.
When writing the English version, I found these variables somewhat confusing .. but I guess I hadn't really figured them out. It does seem to be something where inclusion or not needs to be evaluated. For that to work in Listeria, one would probably need to convert it to the "\u"-notation (within the query). At least, that's what I ended up using to get some of the other things working.
Translations wouldn't be deduplicated (as they aren't now either), but the text would be identical so the translation extension could suggest the translation that was already done. Overall, I think that would be an advantage. BTW Thanks for the detailed feedback: I think it really helps me evaluate this. --- Jura 10:03, 13 March 2020 (UTC)
It’s still not clear for me why you need the translation IDs. It matters a lot how often they will be needed: once, once a year, once a day? For what purpose?
Translation variables are within the text, and there may be any number of them in one translation unit, so they have nothing to do in the query; if you go this way, they’ll have to be in the item statements. But if they’re not obvious for you, you’ll almost certainly make a mistake, which is a lot harder to catch, as the items don’t process these translation units, so there’s no immediate feedback. (A translatable page looks awful if the translation markup is imperfect, so it’s instantly obvious that there’s some issue, right in the preview. If the page is updated with Listeriabot, this breakage occurs only after the bot update, so the page is actually broken for readers as well.)
The English originals can already be standardized in the current system, it just needs a little attention. And more people can help, as it’s far less cryptic than the Listeria-based solution. By the way, are you sure that the resulting items will be notable? Point 1 is out of the question, so is point 2; and I don’t think they qualify under point 3, either. —Tacsipacsi (talk) 00:48, 14 March 2020 (UTC)
Personally, I'm happy with the English text being accessible in a structured, queryable way, but the translation IDs allow anybody else to get the same in another language.
It's clear that the approach is somewhat different from the present one, but for translators that impact should be minimal. --- Jura 09:40, 14 March 2020 (UTC)
I still don’t see any actual or probable use case that needs these IDs. For translators the process doesn’t change indeed, given that translation variables continue to work.
I think we’ve run out of new arguments, so probably it’s time to finish this discussion. I’m still not a fan of your solution, but I will accept if you implement it (if the translation variables remain and the notability of the new items is accepted by the community). —Tacsipacsi (talk) 01:09, 15 March 2020 (UTC)