Wikidata talk:Lexicographical data/Archive/2024/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Broken lexeme

I've created the lexeme кантонский (L1259271) in a semi-automatic way and it has an error (in P898). But now I can't create the lexeme at all! Can anyone? Or maybe admins? --Infovarius (talk) 11:03, 21 February 2024 (UTC)

@Infovarius: wow, that's a strange and nasty bug. Did you try the suggested removal of the problematic statement with the API to see if it fixes it? Also, out of curiosity, what tool or script did you use? Cheers, VIGNERON (talk) 17:15, 2 March 2024 (UTC)
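The thread does not say which API call was suggested. As a minimal sketch, assuming the Wikibase action API module `wbremoveclaims` is the intended route, the request could be assembled as below. The function name `buildRemoveClaimRequest`, the statement GUID, and the token are hypothetical placeholders; the real GUID of the P898 statement and a valid CSRF token would have to be fetched first.

```javascript
// Sketch only: builds (but does not send) a POST request for the
// Wikibase action API module "wbremoveclaims".
// statementGuid and csrfToken are placeholders supplied by the caller.
function buildRemoveClaimRequest(statementGuid, csrfToken) {
  const body = new URLSearchParams({
    action: "wbremoveclaims", // module that deletes statements by GUID
    claim: statementGuid,     // GUID of the statement to remove
    token: csrfToken,         // CSRF token of the logged-in user
    format: "json",
  });
  return { url: "https://www.wikidata.org/w/api.php", method: "POST", body };
}

// Hypothetical usage; the GUID below is NOT the real one.
const example = buildRemoveClaimRequest("L1259271$EXAMPLE-GUID", "EXAMPLE-TOKEN");
console.log(example.body.toString());
```

Building the request separately from sending it makes it easy to inspect the parameters before anything is changed on the live lexeme.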

Merging of Toki Pona homographs

I'm not an expert, so please feel free to correct this paragraph: Toki Pona has fewer parts of speech. A word can be used as a noun, verb or adjective based on context. For English, we have separate lexemes for each word class: light (L2889) [noun], light (L4183) [verb] and light (L4122) [adjective], which are interconnected with the homograph lexeme (P5402) property. So far we have used the same scheme for the Toki Pona lexemes pona (L220755) [verb], pona (L220753) [adjective], pona (L1236807) [noun]. But unlike Toki Pona words, English words change their forms depending (among other things) on what part of speech they belong to.

It was suggested in a property proposal that those homograph lexemes should all be merged into a single lexeme using the lexical category content word (Q789016). Does everybody agree with this approach? @Ookap, JnpoJuwan, Sobsz, Binarycat32: – Shisma (talk) 14:53, 28 February 2024 (UTC)
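To make the proposal concrete, here is a rough sketch of how merge candidates could be found: group lexemes that share a lemma and whose category is noun, verb or adjective, then plan one content-word lexeme per group. The lexeme IDs and Q789016 come from the discussion above; the record shape and the function name `planContentWordMerges` are assumptions for illustration, not an existing tool or API.

```javascript
// Hypothetical in-memory records; IDs and categories are the ones
// quoted in the discussion, the object shape is an assumption.
const lexemes = [
  { id: "L220755", lemma: "pona", category: "verb" },
  { id: "L220753", lemma: "pona", category: "adjective" },
  { id: "L1236807", lemma: "pona", category: "noun" },
];

// Categories that would collapse into a single content word.
const MERGEABLE = new Set(["noun", "verb", "adjective"]);

// Returns one merge plan per lemma that has two or more mergeable lexemes.
function planContentWordMerges(records) {
  const byLemma = new Map();
  for (const rec of records) {
    if (!MERGEABLE.has(rec.category)) continue; // leave other categories alone
    if (!byLemma.has(rec.lemma)) byLemma.set(rec.lemma, []);
    byLemma.get(rec.lemma).push(rec.id);
  }
  const result = [];
  for (const [lemma, ids] of byLemma) {
    if (ids.length < 2) continue; // nothing to merge
    result.push({ lemma, category: "content word (Q789016)", mergeFrom: ids });
  }
  return result;
}

console.log(planContentWordMerges(lexemes));
```

The sketch only produces a plan; whether the senses of the merged lexeme stay separate or are combined is exactly the question under discussion here.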

I was not aware of Toki Pona until I read this, but this plan seems reasonable. Would you plan to have a collection of senses corresponding to the parts of speech in other languages then? ArthurPSmith (talk) 20:24, 28 February 2024 (UTC)
I'd imagine it like having one sense that represents multiple parts of speech. I made an example Lexeme pona (L1273370) [content word] – Shisma (talk) 08:50, 29 February 2024 (UTC)
No, those really are different semantic meanings and should be separate senses, in my opinion. That would be necessary for translation also as pointed out below. ArthurPSmith (talk) 16:50, 29 February 2024 (UTC)
Why would that be necessary? It seems to work pretty well without. Shisma (talk) 18:14, 29 February 2024 (UTC)
I think I understand the confusion now. No, Lexemes can still have multiple senses. My point is they just shouldn't be divided into parts of speech:
[screenshot]
Shisma (talk) 18:52, 29 February 2024 (UTC)
but toki pona doesn't have the same concept of "parts of speech". it has two kinds of words: particles and content words.
the translations in that screenshot seem pretty accurate, toki pona does in fact describe all of those concepts with one word. Binarycat32 (talk) 20:54, 3 March 2024 (UTC)
I agree with this. Although there are parts of speech in toki pona, they are quite different from those in most natural languages. Nouns, verbs, and modifiers (adjectives and adverbs) are all content words, while there are also preverbs, particles, and prepositions. Therefore, I think it is a good idea to merge all noun, verb, adjective, and adverb lexemes into content word lexemes. Ookap (talk) 00:38, 29 February 2024 (UTC)
I don't like the style of a combined sense. How would we then map a Toki Pona word to words in other languages? I'd propose having different senses for different (traditional) parts of speech. --Infovarius (talk) 10:12, 29 February 2024 (UTC)

Láadan (Q35757) also has fluid parts-of-speech categories; I usually just pick one and worry about exacting correctness later. Of course, for tok since the pu is only ~120 words, I guess later is now... Arlo Barnes (talk) 21:38, 1 March 2024 (UTC)

@Pamputt, VIGNERON:, the 2 testers as users

Hi, I'm working on a script that lets you navigate, initially, between Wikidata and the Wiktionaries; over the last few days I have expanded it so that you can also navigate

  • back from wiktionaries to Wikidata lexeme
  • from a lexeme to other lexemes (when there are several lexemes with the same label)
  • also from items to lexemes/senses.

It is still a work in progress and needs polish, so I’d be happy if you could test it and tell me whether it is self-explanatory to use and what you like or dislike about it! You can put the following line

mw.loader.load("//www.wikidata.org/w/index.php?title=User:TomT0m/LexToWiktionary.js/sandbox.js&oldid=2051468341&action=raw&ctype=text/javascript"); // Gadget to go back and forth from wiktionary to Wikidata lexemes

in your global.js on meta so that it’s available on wiktionaries too.

The first 2 ways to navigate are through interwiki button(s), next to the traditional one or added at the same place (on Vector). The "item => lexeme" one is special: currently I don’t add such a button, and the information is added in the "alias" section of the labels/description section for an item.

What I’d especially like to know is whether the imperfections of the script are OK for you or whether you are bothered by some glitches, such as the page jumping when the interwiki button is added on load, and the same for the placement for the lexemes … Is it worth spending a lot of time on polish? author  TomT0m / talk page 18:02, 11 January 2024 (UTC)

(Sorry, not right now: the announcement is a little premature, it’s totally broken at the moment. Please hold.) author  TomT0m / talk page 19:59, 11 January 2024 (UTC) Never mind, it’s repaired; I added an oldid to the link to the gadget sandbox in case I break it again.
It works, thanks! --Infovarius (talk) 11:03, 21 February 2024 (UTC)
I get an error when I try to load it in a private window to test it. By the way, there is User:Nikki/LexemeInterwikiLinks.js which uses Cognate to create links from lexemes to Wiktionaries, and User:Nikki/LinkLabelsToLexemes.js which links labels and aliases to lexemes. - Nikki (talk) 08:37, 5 March 2024 (UTC)