Wikidata:Property proposal/Lexemes


This page is for the proposal of new properties.

Before proposing a property

  1. Search if the property already exists.
  2. Search if the property has already been proposed.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
  6. Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.

Creating the property

  1. Once consensus is reached, change status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the creation of the proposal, by a property creator or an administrator.
  3. See property creation policy.

Wikibase lexeme

   Under discussion
Description: relationship between similar Javanese lexemes across the language's various registers (social variants), mainly the ngoko (Q12500634) register (plain Javanese), the krama (Q12492493) register (high/polite Javanese), and the madya (Q13091955) register (middle Javanese)
Data type: Lexeme
Domain: lexeme senses, in particular forms with spelling alternatives
Example 1: kowé/kowe/ꦏꦺꦴꦮꦺ (L2328) ("ngoko" register) and sampéyan/sampeyan/ꦱꦩ꧀ꦥꦺꦪꦤ꧀ (L1322036) ("krama" register) both mean "you" but belong to different social registers: the former is considered casual, the latter more formal and polite. For reference, see the online Javanese dictionary at https://www.sastra.org/leksikon (tick the "kata utuh" checkbox when searching to exclude partial matches). For more information on ngoko and krama, see the introduction to this Javanese-English dictionary: https://www.sastra.org/bahasa-dan-budaya/kamus-dan-leksikon/1703-javanese-english-dictionary-horne-1974-1968, especially sections 4.1 (Organization of the Entries) and 5 (SOCIAL STYLES). See also: en.wp and jv.wikt (https://jv.wiktionary.org/wiki/Wikisastra:Tabel_krama-ngoko)
Example 2: (update 18 August) gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
Example 3: (update 18 August) endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)

Motivation


I'm planning to add more Javanese lexemes, but many words exist in several registers, and using synonym (P5973) for them is not correct: although the words have the same meaning, they differ in usage, and there are also many synonyms within a single register (for example, "you" has 4 or more synonyms in "ngoko" and 3 or more different words in "krama"). A dedicated property would make it possible to search and query the relationships between registers. As you can see from the links provided above, these relationships are not one-to-one. While the "ngoko" form is considered the default, not every "ngoko" word has a "krama" equivalent (only about 1000 do without affixation, many more with affixation), and even fewer have a "madya" or other-register equivalent ("krama inggil", etc.). Some "krama" words correspond to several "ngoko" words, because they are not true synonyms but rather substitution words for a different social context. Therefore this property should support multiple relationships. For example:

"you"
  • ngoko: kowe (synonyms: ko'ên, kohên, kowên)
  • madya: samang, andika (synonym: dika)
  • krama: sampeyan (synonyms: bênampeyan, bênangpeyan)
  • krama inggil: panjênêngan (synonyms: nandalêm, paduka)
"to say, to tell"
  • ngoko: kandha (synonyms: clathu, ngomong, kêcap, wara, gotèk, cluluk, wuwus, etc.)
  • krama: criyos, sanjang (synonyms: sajang, wicantên, etc.)
  • krama andhap: matur
  • krama inggil: andika, ngêndika (synonym: unandika)
  • kawi: angling
(things related to hand / "tangan")
  • ngoko: tangan; krama inggil: asta. This is a simple noun pair, but the verbs get complicated:
  • krama inggil: ngasta (ng- + asta) serves as a substitution for several ngoko verbs: 1 nyambut gawe (to work), 2 nggawa (to bring, take, carry), 3 nandang (to do), 4 nyekel (to hold, grasp, to handle), 5 mulang (to teach)

Bennylin (talk) 18:23, 9 August 2024 (UTC)
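
The multi-register relationships described in the motivation above could be sketched roughly as follows. This is only an illustration in Python, not a proposed implementation: the dictionary layout and the function name `build_register_links` are assumptions, and the data contains only the primary words from the "you" and "to say, to tell" examples (synonyms within a register are left out).

```python
from collections import defaultdict

# Primary words per register, copied from the two examples above.
# The nesting (meaning -> register -> words) is a hypothetical layout.
REGISTER_FORMS = {
    "you": {
        "ngoko": ["kowe"],
        "madya": ["samang", "andika"],
        "krama": ["sampeyan"],
        "krama inggil": ["panjênêngan"],
    },
    "to say, to tell": {
        "ngoko": ["kandha"],
        "krama": ["criyos", "sanjang"],
        "krama andhap": ["matur"],
        "krama inggil": ["andika", "ngêndika"],
        "kawi": ["angling"],
    },
}


def build_register_links(forms_by_meaning):
    """Map (word, register) to its counterparts in every other register."""
    links = defaultdict(dict)
    for registers in forms_by_meaning.values():
        for source_register, source_words in registers.items():
            for word in source_words:
                for target_register, target_words in registers.items():
                    if target_register == source_register:
                        continue  # same-register words are synonyms, not variants
                    links[(word, source_register)].setdefault(
                        target_register, []
                    ).extend(target_words)
    return dict(links)


LINKS = build_register_links(REGISTER_FORMS)
print(LINKS[("kandha", "ngoko")]["krama"])
```

Keying on the pair (word, register) rather than on the word alone matters here: "andika" appears as madya for "you" and as krama inggil for "to say, to tell", so the two uses must be kept apart.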

Update 18 August


Just to make it clearer, on behalf of Javanese speakers, we would like to request 5 new properties:

The first and foremost reason is that most Javanese dictionaries (monolingual, bilingual jv-id, jv-en, jv-nl) separate Javanese lexemes into mainly these 5 registers and link each lexeme to its counterparts seamlessly. Secondly, the currently available property, synonym (P5973), doesn't fit our need for specific linking from one lexeme to another. Besides, synonymy in Javanese is called dasanama (lit. ten names), which is distinct from register (Jv: unggah-ungguh). In the future, I believe these 5 new properties would make it much easier to "transform" words, phrases, and sentences from one register to another (e.g. via WikiFunctions or other tools).

I've added two new examples to the form above:

  • mountain: gunung/ꦒꦸꦤꦸꦁ (L680638) (ngoko), redi/rêdi/ꦉꦢꦶ (L45622) (krama)
    • L680638-S1 should have the property "krama variations: L45622-S1" instead of "synonym: L45622-S1"
    • Likewise, L45622-S1 should have the property "ngoko variations: L680638-S1" instead of "synonym: L680638-S1"
    • both lexemes could have the following synonyms: ancala, indra, endra, ancala, ardi; ardya, arga, asalingga, awukir, aldaka, hyang parwata, imandri, himawan, himawat, nala, cala, dri, tambana, wanawasa, wukir, wukira, parsa of parswa, parasu, parswa = paraswa, praswa, parwaka, par(of pwar)wata, prawata, parja, pradesa, pra(of prê)bata, par(of pêr)bata, par(of pêr)bwata, par(of pêr)byata, padaka, jambangan, mahahimawan, mahendra, mèru, malaya, gana, gunungan, giri, gori, girindra, girinata, gorata, giriwara, gêgêr, basulingga, byata, ngasrama. These all mean "mountain" in Javanese.
  • head: endhas/êndhas/ꦲꦼꦤ꧀ꦝꦱ꧀ (L413183) (ngoko), sirah/ꦱꦶꦫꦃ (L999025) (krama), mastaka/ꦩꦱ꧀ꦠꦏ (L413863) (krama inggil)
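
The reciprocal statements described for the "mountain" pair could be generated mechanically. The following is a minimal Python sketch under stated assumptions: the helper name `reciprocal_statements` is hypothetical, and the property labels are formed as "<register> variations" to follow the wording used in the example, since the final property names have not been decided.

```python
from itertools import permutations


def reciprocal_statements(entries):
    """Build (subject, property label, object) triples linking every pair.

    `entries` is a list of (entity ID, register) tuples. Each subject gets
    one statement per counterpart, labelled with the counterpart's register.
    """
    return [
        (subject, f"{target_register} variations", target)
        for (subject, _), (target, target_register) in permutations(entries, 2)
    ]


# Sense IDs and registers taken from the mountain example above.
mountain = [("L680638-S1", "ngoko"), ("L45622-S1", "krama")]
for statement in reciprocal_statements(mountain):
    print(statement)
```

Applied to a triplet such as the "head" example (ngoko, krama, krama inggil), the same helper would yield six statements, two per lexeme.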

Discussion

They're incorrect
So, you see, many synonyms of endhas/sirah/mastaka (head) have the register ngoko, krama, or both, but none of them is paired as _the_ register variant of the triplet endhas/sirah/mastaka. Therefore we need dedicated properties to store these values. Most relations are one-to-one; a few are one-to-two or two-to-one, but never one-to-many. Bennylin (talk) 11:04, 23 August 2024 (UTC)
@Mahir256, would you like to give your opinion? Regards, ZI Jony (Talk) 18:31, 16 September 2024 (UTC)

Wikibase form


Wikibase sense


Other


has kanji reading

   Under discussion
Description: phonetic reading or pronunciation of the kanji
Data type: String
Domain: instances of sinogram (Q17300291)
Example 1: (Q3594955) よん
  Qualifiers:
    subject lexeme (P6254): /よん (L625228)
    sinogram reading pattern (P5244): kun'yomi (Q1147749)
Example 2: (Q3594955)
  Qualifiers:
    subject lexeme (P6254): / (L641752)
    sinogram reading pattern (P5244): on'yomi (Q718498)
Example 3: (Q3594998) うみ
  Qualifiers:
    subject lexeme (P6254): /うみ (L5120)
    sinogram reading pattern (P5244): kun'yomi (Q1147749)
Example 4: (Q3594998) カイ
  Qualifiers:
    subject lexeme (P6254): no value
    sinogram reading pattern (P5244): on'yomi (Q718498)
See also: sinogram reading pattern (P5244)
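
To make the shape of these statements concrete, here is a rough Python sketch of a reading with its two qualifiers. The class `KanjiReading` and the helper `readings_without_lexeme` are hypothetical names used only for illustration. The data is limited to Examples 1, 3 and 4, and "no value" for subject lexeme (P6254) is modelled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KanjiReading:
    sinogram: str                  # item ID of the sinogram
    reading: str                   # string value of the proposed property
    pattern: str                   # sinogram reading pattern (P5244) qualifier
    subject_lexeme: Optional[str]  # subject lexeme (P6254); None = "no value"


READINGS = [
    KanjiReading("Q3594955", "よん", "Q1147749", "L625228"),  # kun'yomi
    KanjiReading("Q3594998", "うみ", "Q1147749", "L5120"),    # kun'yomi
    KanjiReading("Q3594998", "カイ", "Q718498", None),        # on'yomi
]


def readings_without_lexeme(readings):
    """Return the readings that are not backed by any lexeme."""
    return [r for r in readings if r.subject_lexeme is None]


print([r.reading for r in readings_without_lexeme(READINGS)])
```

Readings of this kind, which have no lexeme of their own, are the ones the proposal aims to preserve on the sinogram item.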

Motivation


In Japanese, Chinese characters can be read with different vocalisations. With lexemes we currently only cover those sounds that make up actual words. See the examples /よん (L625228) and / (L641752), where forms that use the kanji have a sinogram reading pattern (P5244) statement.

Sometimes, however, readings don't make up real words but are merely affixes that can be used in compounds. We currently clutter these readings under a lexeme that happens to have the same kanji representation. But those lexemes usually have a different etymology and external IDs that don't apply to the reading. These readings also sometimes don't share the same senses.

I want to split all these lexemes so that every lexeme represents only a single reading. Readings that do not constitute words would be deleted in the process, but I'd strive to preserve them, and I think the sinogram entity is the right place for that. –Shisma (talk) 10:00, 27 August 2024 (UTC)

I'm merely interested in Japanese, but am not a speaker of it. If I said something horribly wrong here, please correct me. –Shisma (talk) 10:18, 27 August 2024 (UTC)

Should we transliterate on'yomi readings to katakana? –Shisma (talk) 11:45, 27 August 2024 (UTC)

Indeed, in kanji dictionaries published in Japan, on'yomi (Q718498) readings are usually written in katakana (Q82946). --Okkn (talk) 01:37, 28 August 2024 (UTC)[reply]
updated –Shisma (talk) 14:14, 28 August 2024 (UTC)[reply]

Discussion


@Duesentrieb, Afaz, Was a bee, Deryck Chan, NMaia, Okkn: pinging everybody involved with the proposal of sinogram reading pattern (P5244). –Shisma (talk) 10:16, 27 August 2024 (UTC)

  Notified participants of WikiProject Japan