This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

space as a lexeme

Should I create a lexeme for the space character (the one typed with the spacebar)? I wanted to use it in the following statement:

schwarzes Loch (L613720)
combines lexemes (P5238)
schwarz (L181224)
series ordinal (P1545) → 1
[space]
series ordinal (P1545) → 2
Loch (L303361)
series ordinal (P1545) → 3

As we do for interfixes. Is it a good idea? I don't know. If yes: which language is it? Should every language have its own space? Some languages have different spaces with different meanings. --Shisma (talk) 07:42, 30 October 2021 (UTC)

@Shisma: I do not think having a lexeme for 'space', as used in most languages, is a good idea. Compared to infixes (which are much more likely to be heard and understood to provide meaning in speech), the contribution of a space to meaning beyond 'separates words' is quite minimal, and thus it feels to me like a lot of extra bloat on lexemes for no particular reason. In a prior section on this page (I think), I floated the idea of having decimal P1545 values for different parts of a single word in a multi-part lexeme, and at least in my view that would be a less disruptive convention (see the example below for to a fare-thee-well (L201340)):
combines lexemes (P5238)
to (L248738)
series ordinal (P1545) → 1
a (L2767)
series ordinal (P1545) → 2
fare (L16725)
series ordinal (P1545) → 3.1
thou (L18745)
series ordinal (P1545) → 3.2
well (L3219)
series ordinal (P1545) → 3.3
(One exception I might consider useful, although only very marginally, is the ideographic space, as used in CJK scripts to offset the names of the highest among individuals.) Mahir256 (talk) 12:18, 30 October 2021 (UTC)
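
A minimal sketch of how the decimal-ordinal convention above could be read back into a written form. It assumes (this is not settled in the discussion) that parts sharing the same integer part of P1545 belong to one hyphenated word, and that a change in the integer part marks a space. The names `ordinal_key` and `join_parts` are hypothetical, and the input uses surface forms (for example "thee" for thou (L18745)) rather than lexeme IDs.

```python
from typing import List, Tuple

# Hypothetical input shape: (surface form, P1545 qualifier value as a string).
Part = Tuple[str, str]


def ordinal_key(ordinal: str) -> Tuple[int, ...]:
    """Parse '3.2' as (3, 2) so that '3.10' sorts after '3.9'."""
    return tuple(int(piece) for piece in ordinal.split("."))


def join_parts(parts: List[Part], word_sep: str = " ", inner_sep: str = "-") -> str:
    """Join parts in ordinal order.

    Assumption: parts with the same integer part are joined with inner_sep,
    parts with different integer parts are joined with word_sep.
    """
    ordered = sorted(parts, key=lambda part: ordinal_key(part[1]))
    pieces: List[str] = []
    previous_word = None
    for text, ordinal in ordered:
        word = ordinal_key(ordinal)[0]
        if previous_word is not None:
            pieces.append(inner_sep if word == previous_word else word_sep)
        pieces.append(text)
        previous_word = word
    return "".join(pieces)


parts = [("to", "1"), ("a", "2"), ("fare", "3.1"), ("thee", "3.2"), ("well", "3.3")]
print(join_parts(parts))  # to a fare-thee-well
```

Parsing the ordinal into a tuple of integers rather than a float keeps "3.10" distinct from "3.1", which matters if a word ever has ten or more components.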
@Shisma: +1 to Mahir256: a lexeme for space seems odd (and if we create one space, why not all of them? There are at least twenty different spaces). But we need some way to indicate this information, especially in French, where a lot of hyphens were removed in 1990 (see orthographic corrections of French in 1990 (Q486561)), and in English, where variants with and without a hyphen/space exist. Here is an idea:
combines lexemes (P5238)
schwarz (L181224)
series ordinal (P1545) → 1
subject form (P5830) → schwarzes
no value
series ordinal (P1545) → 2
some property (object has role (P3831)?) → space (Q380933)
Loch (L303361)
series ordinal (P1545) → 3
What do you think?
Cheers, VIGNERON (talk) 17:37, 13 November 2021 (UTC)
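
A rough sketch of how the "no value plus role qualifier" idea above could be modelled and rendered. Everything here is an assumption for illustration: the `Component` class, the `SEPARATORS` table and the `render` function are hypothetical, and the qualifier that would carry space (Q380933) is still an open question in the proposal.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Component:
    """One value of combines lexemes (P5238), simplified for illustration."""
    ordinal: int                      # series ordinal (P1545)
    lexeme_id: Optional[str] = None   # None models the "no value" statement
    surface: Optional[str] = None     # form used in the compound, e.g. from P5830
    role: Optional[str] = None        # item describing a separator, e.g. Q380933


# Hypothetical lookup from separator item to the character it stands for.
# Only space (Q380933) appears in the discussion.
SEPARATORS: Dict[str, str] = {"Q380933": " "}


def render(components: List[Component]) -> str:
    """Concatenate surfaces and separators in ordinal order."""
    pieces: List[str] = []
    for component in sorted(components, key=lambda c: c.ordinal):
        if component.lexeme_id is None:
            # "no value": look up the separator; unknown roles render as nothing.
            pieces.append(SEPARATORS.get(component.role or "", ""))
        else:
            pieces.append(component.surface or "")
    return "".join(pieces)


schwarzes_loch = [
    Component(ordinal=1, lexeme_id="L181224", surface="schwarzes"),
    Component(ordinal=2, role="Q380933"),
    Component(ordinal=3, lexeme_id="L303361", surface="Loch"),
]
print(render(schwarzes_loch))  # schwarzes Loch
```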
@VIGNERON, Shisma: I am still not entirely certain that having separate statements for word boundaries is the most efficient way to do it. To elaborate on my initial reply from 30 October, I've provided an example within that reply (see the altered P1545 values), which I continue to contend is less intrusive of an indication than an entirely new statement. (The actual P1545 format to indicate a word vs. hyphenated-component boundary could be settled on differently so that unambiguous vs. ambiguous boundary statements might be differently marked.) Mahir256 (talk) 17:50, 13 November 2021 (UTC)

Is ‚abbreviation‘ a lexical category?

Or should it be a noun or verb depending on what the abbreviation stands for? -Shisma (talk) 10:37, 2 November 2021 (UTC)

@Shisma: Could you provide an example where P5191/P5185 would be different for the abbreviation vs. for its unabbreviated form? (The judgment of Duden needn't be entirely authoritative on particular German lexemes, no?) Mahir256 (talk) 17:52, 13 November 2021 (UTC)
I can say they put a lot of thought into what constitutes a separate lexeme and what is merely a form. According to Duden, UN is a proper noun (Q147276) without gender and United Nations is a plurale tantum (Q138246) of feminine gender. I'd guess UN is derived from United Nations and United Nations… well isn't 🤣 – Shisma (talk) 18:25, 13 November 2021 (UTC)
@Shisma, ArthurPSmith: hmm, tough question. I don't think "abbreviation" is a lexical category, but it probably should be data stored somewhere, and a separate lexeme is probably a good idea. That said, it depends on the abbreviation, whether it's lexicalised or not. For instance "laser" is obviously lexicalised and clearly a lexeme on its own (most people have probably forgotten it's an abbreviation). UN is probably lexicalised in most languages (at least in German, per the Duden, in French per the TLFi - which doesn't have the full form, BTW - and in English per the Cambridge, Oxford and Merriam-Webster dictionaries) but it needs to be attested by references. Cheers, VIGNERON (talk) 17:10, 13 November 2021 (UTC)

Automatically add translations?

Hi,

Do you think it would be a good idea to automatically add translations to each sense?

  1. based on item for this sense (P5137): every sense that links to the same entity with the property item for this sense (P5137) OR
  2. based on existing translations: If pain (L13072) (S1) is a translation of pano (L290180) (S1): then pano (L290180) (S1) should be a translation of pain (L13072) (S1).

I expected there should be a tool for that but there isn't (?). Is it a good idea? Should I build it? --Shisma (talk) 16:12, 26 November 2021 (UTC)

@Shisma: The latter might be a good thing, as long as it doesn't start causing people to turn the set of sense links "A<->B<->C<->D<->E" into "A<->E"; the former will just end up adding bloat to lexemes and so I would advise against it. Mahir256 (talk) 17:27, 26 November 2021 (UTC)
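
A minimal sketch of the second option (making existing translation links symmetric), assuming the translation statements have already been loaded into a plain dictionary keyed by sense ID. The function name `missing_reverse_links` and the input shape are hypothetical; fetching from and writing back to Wikidata are left out. The sketch only reports reverse links to add and never removes or shortcuts existing ones, so a chain A<->B<->C stays as it is.

```python
from typing import Dict, Set, Tuple

# Hypothetical input shape: sense ID -> set of sense IDs it lists as translations.
Links = Dict[str, Set[str]]


def missing_reverse_links(links: Links) -> Set[Tuple[str, str]]:
    """Return (sense, translation) pairs that would make every link two-way."""
    missing: Set[Tuple[str, str]] = set()
    for source, targets in links.items():
        for target in targets:
            if source not in links.get(target, set()):
                # target does not yet point back at source
                missing.add((target, source))
    return missing


# pano (L290180) S1 lists pain (L13072) S1 as a translation, but not vice versa.
links = {
    "L290180-S1": {"L13072-S1"},
    "L13072-S1": set(),
}
print(missing_reverse_links(links))  # {('L13072-S1', 'L290180-S1')}
```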

Is there a property for lexemes that sound identical but have different meanings and spellings?

Like Laib (L493284) and Leib (L613407) --Shisma (talk) 09:17, 27 November 2021 (UTC)
