Wikidata:Lexicographical data/Documentation/Languages/id
Subclass of | Malay |
---|---|
Native label | Bahasa Indonesia |
Short name | Indonesian |
Named after | Indonesia |
Country | Indonesia |
Indigenous to | Indonesia |
Coordinate location | 6°10′30″S 106°49′39″E |
Linguistic typology | subject–verb–object, agglutinative language, zero-marking language, noun-adjective, synthetic language |
Writing system | Latin script |
Language regulatory body | Agency for Language Development and Cultivation |
Ethnologue language status | 1 National |
Studied in | indonesiology |
Described at URL | https://afbo.info/languages/42 |
Related category | Category:Indonesian pronunciation |
Wikimedia language code | id |
This page documents the Indonesian lexeme model for Wikidata. You are welcome to edit and contribute to this page. See also the other language models for comparison.
General
- Language item: Indonesian (Q9240)
- Language code: id
- Wikidata:Lexicographical data
- Wikidata:Lexicographical data/Documentation
- Wikidata:Lexicographical data/Documentation/Languages
- List of Indonesian lexemes in Wikidata: Number of lexemes 19686 (24 Oct 2022)
Lexical categories
- noun (Q1084), e.g.: cinta (L15072)
- Base nouns (nomina dasar)
- Derived nouns (nomina turunan)
- verb (Q24905), e.g.: keluar (L31530)
- transitive verb (Q1774805), e.g.: mencinta (L31535)
- Base verbs (verba asal)
- Derived verbs (verba turunan)
- adjective (Q34698), e.g.: suka (L238379)
- Base adjectives (adjektiva dasar)
- Derived adjectives (adjektiva turunan)
- adverb (Q380057), e.g.: agak (L498541)
- numeral (Q63116), e.g.: satu (L498544)
- pronoun (Q36224), e.g.: aku (L2474)
- Particles (partikel):
- grammatical particle (Q184943), e.g.: ah (L498548)
- conjunction (Q36484), e.g.: dan (L2325)
- preposition (Q4833830), e.g.: akan (L498545)
- article (Q103184), e.g.: sang (L498546)
- interjection (Q83034), e.g.: aduh (L498547)
- precategorial (Q107000399) / root (Q111029)?, e.g.: bunuh (L31532)
- proper noun (Q147276) - do not add this type for now, until clear guidelines and examples exist.
Non-word categories
- letter (Q9788) (huruf), e.g.: a (L498537) - see Malay alphabet (Q2673515)
- prefix (Q134830) (awalan), e.g.: meng-/ng- (L15073)
- suffix (Q102047) (akhiran), e.g.: -an (L498538)
- confix (Q1133968) (awalan+akhiran), e.g.: peng-an (L479450)
- infix (Q201322) (sisipan), e.g.: -em- (L498539)
- proverb (Q35102) (peribahasa), e.g.: masuk angin (L498540)
- abbreviation (Q102786) (singkatan dan/atau akronim), e.g.: AD (L498542), angkot (L498543)
Lemma
- The lemma is the form of a word used as a dictionary headword. Indonesian dictionaries usually use the root word as the lemma.
- Following the definition of id:Leksem from Gunawan, et al. (1948), a lexeme is the smallest unit in a language that represents a concept or a symbol. It does not have to be a root word. Many Indonesian root words are in fact bound morphemes that have neither meaning nor lexical class until they receive affixation (derivation).
- A Wikidata lexeme for Indonesian is therefore not necessarily a root word; it can also be a derived word.
- Therefore, all of these are valid separate lexemes: cinta (L15072) (noun), cinta (L238377) (adjective), mencinta (L31535) (verb), ajar (L31557), pelajar (L6592), pengajar (L6593)
- Indonesian lemmas should start with a lowercase letter, just like in Wiktionary (lemmas are case-sensitive).
Spelling variants
- subject of further discussion
Dictionary forms are mostly used in writing, in formal speech, and in encyclopedias such as Indonesian Wikipedia. In daily usage, most people use informal spellings, informal pronunciations, and slang. This covers informal speech and informal writing (social media, personal writing, etc.), and the varieties differ from region to region, influenced by many of the 700+ local languages of Indonesia. These variants very seldom enter dictionaries (where they are usually marked as 'regional'/'local language' words). Some people have tried to compile them into "kamus gaul" (slang dictionaries), even though many of the words are no longer considered slang and are already very common. Other factors to consider include the high number of bilingual and trilingual speakers, code-switching/code-mixing, and bahasa 'gado-gado' (mixed language).
When the orthography requires special letter(s) besides the 26 letters of the Indonesian alphabet, both orthographies (with and without special letters) should be used in the page title.
The following lexemes have at least two spelling variants. The first (proper) variant should be as defined in KBBI, and the second variant is marked as id-x-Q4200642.
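The variant code used on this page follows a simple pattern: the base code id, the private-use marker -x-, then the Wikidata item ID of the variety. A minimal sketch of building and splitting such codes is given here; the function names are hypothetical and not part of any Wikidata tool.

```python
import re
from typing import Optional, Tuple

BASE_CODE = "id"  # Wikimedia language code for Indonesian

def variant_code(qid: str) -> str:
    """Build a private-use code such as id-x-Q4200642 from an item ID."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"{BASE_CODE}-x-{qid}"

def split_code(code: str) -> Tuple[str, Optional[str]]:
    """Split a code into (base language code, item ID or None)."""
    base, sep, qid = code.partition("-x-")
    return (base, qid if sep else None)

print(variant_code("Q4200642"))     # code for Indonesian slang (Q4200642)
print(split_code("id-x-Q4200642"))  # variant spelling
print(split_code("id"))             # standard spelling, no variant item
```

Keeping the item ID inside the code means a variety can be added later without a new language code, as long as an item for it exists.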
Regional languages / variants
- subject of further discussion
Need a way to better handle lexemes in 700+ languages of Indonesia. e.g.: lexemes (usually nouns) in multiple languages that have the same sense.
- Indonesian languages (Brandstetter, Blagden (tr.), 1916)
- Indonesian (ind)
- Malayic languages
- (Historical "Standard Malay"): Riau Malay/Court Malay
- Indonesian Malay
- (non-creole Malay): Jambi Malay, Minang, Banjar, etc.
- Malaysian Malay
- Malay (mly) (i.e. Standard Malaysian) (zsm), Kedah Malay, Sabah Malay, etc.
- Singaporean Malay
- Bruneian Malay
- (Malayic language outside of the above): Thai Malay (Pattani Malay), Sri Lankan Malay, etc.
- (Trade and creole Malay): Baba Malay, Betawi, Ambonese Malay, Manado Malay, Papuan Malay, etc.
- Indonesian non-Malayic languages (excluding foreign languages, Hokkien, other Sinitic, Arabic, Indic, etc.)
- Languages of Java: Javanese, Sundanese, Madurese, etc.
- Languages of Kalimantan
- Languages of Sulawesi
- Languages of Sumatra (non-Malayic): Aceh, Batak, Nias, Mentawai, etc.
- Languages of Maluku
- Languages of Lesser Sunda Islands: Balinese, Lombok, Bima, Tetun, etc.
- Languages of Indonesian Papua (this region alone has around 270 languages)
- (Indonesian languages outside of the above): Philippines, Madagascar, Formosa
- Formal/standard language (bahasa formal/baku; formal language (Q192161), standard language (Q399495)): Indonesian (Q9240)
- Informal/non-standard/everyday language (bahasa nonformal/informal/tidak baku/sehari-hari; colloquial language (Q901711)): Indonesian slang (Q4200642). It comprises two kinds of dialect: regiolects (based on geographic region) and sociolects (based on social group).
- Slang/street language/regiolects (bahasa gaul/bahasa jalanan/regiolek; street language (Q12473490), regiolect (Q455374)): varied; there can be slang of region A, B, C, etc.
- Jakarta regional language: id-x-Q106296763, prokem (Q106296763)
- Medan language: id-x-Q5972, Medan (Q5972), e.g. kesal/palak (L1329104)
- Discussion: if the variants differ too much, can they still be called variants, rather than different languages/lexemes/lemmas with the same meaning? E.g. kesal/palak (L1329104): kesal is id, palak is id-x-Q5972 (Medan)
- Secret language games (bahasa permainan rahasia; secret language game (Q186427)) of Java: id-x-Q13092737, Malang Javanese (Q13092737) (walikan); id-x-Q3741, Yogyakarta (Q3741) (dagadu); etc.
- etc.
- Sociolects (bahasa sosiolek; sociolect (Q207101), slang (Q8102))
- Internet language: id-x-Q1337 (leet (Q1337)): Alay (Q65205207)
- LGBT language: id-x-Q2537421 (LGBT slang (Q2537421)): Bahasa Binan (Q4842492)
- etc.
Statements
- instance of (P31): root (Q111029), derivation (Q728001), or compound (Q245423); common noun (Q2428747), etc.
- usage example (P5831)
- derived from lexeme (P5191) (e.g. from Malay lexeme) or if instance of (P31) is compound (Q245423); said to be the same as lexeme (P11577)? e.g. janggut/jenggot (L303173)
- root (P5920): if instance of (P31) is derivation (Q728001)
- combines lexemes (P5238): if instance of (P31) is derivation (Q728001)
- grammatical gender (P5185): masculine (Q499327), feminine (Q1775415) - in the rare case that the noun is a loanword or with foreign affixes that denote gender
- classifier (P5978): for ekor (L1137669), e.g. lumba-lumba (L6591)
- described by source (P1343)?
- fabrication method (P2079) / has grammatical case (P2989): for reduplication (Q221446), e.g. lumba-lumba (L6591)
- TODO: make connection between them
Dialects: Indonesian dictionaries incorporate many words from local languages and mark them as such in the lemmas, although there is almost no etymological dictionary of Indonesian.
Indonesian:
- language registers (ragam bahasa): archaic, conversational, honorific, coarse, classical, etc.
- fields of study (bidang ilmu): (many kinds)
Homography, homophony, and homonymy
If two similar lexemes share a lemma, a pronunciation, or both, they are to be kept separate and connected to each other via the following statements:
- homograph lexeme (P5402) homograph (Q223981) (same writing, different pronunciation), e.g. apel (fruit) and apel (ceremony), seri (the opposite of paralel) and seri (beautiful; good), kecap (kitchen condiment) and kecap (smacking of the lips/tongue)
- homoglyph (P2444) homonym (Q160843) (same writing and pronunciation), e.g. bunga (L1126192) and bunga (L1126193) (both are nouns), suka (L238379) and suka (L238378), cinta (L15072) and cinta (L238377) (one noun, one adjective)
- homophone form (P10822) homophone (Q221079) (different writing, same pronunciation), e.g. bank and bang, tank and tang
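The three cases above reduce to two comparisons: whether the writing matches and whether the pronunciation matches. The sketch below makes that decision table explicit. The function name is hypothetical, and the pronunciation arguments in the usage lines are opaque stand-in labels, not real transcriptions.

```python
from typing import Optional

def relation(lemma_a: str, pron_a: str, lemma_b: str, pron_b: str) -> Optional[str]:
    """Classify two distinct lexemes by shared writing and/or pronunciation."""
    same_writing = lemma_a == lemma_b
    same_sound = pron_a == pron_b
    if same_writing and same_sound:
        return "homonym"    # same writing and pronunciation
    if same_writing:
        return "homograph"  # same writing, different pronunciation
    if same_sound:
        return "homophone"  # different writing, same pronunciation
    return None             # unrelated: no linking statement needed

# Stand-in pronunciation labels ("p1", "p2") only mark equal vs. different.
print(relation("apel", "p1", "apel", "p2"))
print(relation("bunga", "p1", "bunga", "p1"))
print(relation("bank", "p1", "bang", "p1"))
```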
Senses
Concept or symbol represented by the lexeme, glosses (in Indonesian language), definitions (meaning): e.g. kucing (L498558)
Claims
- image (P18): find Indonesian-related images. e.g. janggut, sate, atlet
- Commons category (P373) e.g. L801
- item for this sense (P5137)
- translation (P5972)
- synonym (P5973)
- language style (P6191): e.g. colloquial language (Q901711)
- predicate for (P9970): e.g. berenang (L575390) - predicate for swimming (Q6388)
Forms
to be completed
- root (Q111029)
- Always a root word that has no other lexical category (precategorial (Q107000399)). It has no other forms and no sense/gloss, e.g. bunuh (L31532). Lemmas with multiple (ambiguous) roots: berikan (L574420) - ber+ikan, beri+kan; berilah (L573868) - ber+ilah, beri+lah
- noun (Q1084)
- the base noun form can most of the time denote either singular or plural (Q106644026). It can be a root word or an affixed word (-an, pe- (an), per- (an), etc.).
- singular (Q110786) (tunggal); plural (Q146786) (jamak/bentuk terulang); interrogative (Q12021746) (interogatif/penanya); affirmation and negation (Q3745428) (affirmatif/penegas/penekanan); first-person possessive (Q71470598) (posesif orang pertama); second-person possessive (Q71470837) (posesif orang kedua); third-person possessive (Q71470909) (posesif orang ketiga); and their combinations.
- TODO: false reduplication (singular)
- adjective (Q34698)
- positive (Q3482678) (root), equative case (Q3177653) (se-), superlative (Q1817208) (ter-), excessive (Q1385613) (ke-an)
- TODO: reduplication, infix (-em-), foreign suffixes, compound (synonym/antonym) adjective forms, denominal & deverbal adjectives (pe-, meng-, ber-, ter-)
- verb (Q24905)
- It can be a root word or an affixed word. If it is the latter, the base verb form should be active (Q1317831) (me-/member-/memper-, ber-) or passive (Q1194697) intransitive verb (Q1166153) (ter-).
- For active voice (me-), add these forms: passive (Q1194697); transitive case (Q17140008); first-person possessive (Q71470598); second-person possessive (Q71470837); third-person possessive (Q71470909); affirmation and negation (Q3745428); interrogative (Q12021746); and their combinations.
- (In Indonesian terms: for the me-/memper-/member- (-i, -kan) forms, add the passive forms di-/ku-/kau-(per) (-i, -kan), the transitive forms (per-/ber-) -i/-kan (without me-), the possessives (-ku, -mu, -nya), the affirmative/emphatic (-lah), and the interrogative (-kah).)
- For passive intransitive voice (ter-) (accidental or completed action): TODO
- TODO: reduplication verb forms, compound verb forms
- Adverb, numerals, etc.
- TODO
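As a rough illustration of how the noun forms listed above multiply, the sketch below combines a base noun with its reduplicated plural, the possessive suffixes (-ku, -mu, -nya), and the particles (-lah, -kah). It assumes plain full reduplication with a hyphen and a fixed suffix order; it is not the form set actually used for Indonesian lexemes in Wikidata, and the function name is hypothetical.

```python
from itertools import product

POSSESSIVES = ["", "ku", "mu", "nya"]  # none, first, second, third person
PARTICLES = ["", "lah", "kah"]         # none, affirmative, interrogative

def noun_forms(base: str) -> list:
    """Enumerate stem + possessive + particle combinations for a noun."""
    stems = [base, f"{base}-{base}"]   # singular/unmarked, reduplicated plural
    return [stem + poss + part
            for stem, poss, part in product(stems, POSSESSIVES, PARTICLES)]

forms = noun_forms("kucing")
print(len(forms))
print(forms[:4])
```

Nouns that are already reduplicated in their base form (the "false reduplication" TODO above) would need separate handling, which this sketch does not attempt.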
Claims
Statements placed on Forms:
- pronunciation audio (P443)
- IPA transcription (P898)?
- word stem (P5187)/hyphenation (P5279)
- pronunciation (P7243)
- alternative form (P8530)? e.g. atlet/atlit (L1122428)
Old spellings
Some other variants include archaic/classical words (usually of Sanskrit, Arabic, Dutch, Chinese, etc. origin), variants from before the spelling reforms (there were several), variants in pronouncing the letter 'e' (schwa or non-schwa), and variants considered incorrect by Great Dictionary of the Indonesian Language (Q4200623) (KBBI). Other things to consider: pronunciations, affixation variants (mempelajari/memelajari), preposition (di) versus prefix (di-) variants, f/p/v variants, swarabakti (-er-/-r-) variants, mem- + [p] and men- + [t] variants, etc.
If a lexeme has multiple spelling variants, the most recent orthography should be used (currently based on Great Dictionary of the Indonesian Language (Q4200623)). Spelling variants that are no longer considered valid can be added in the Forms section, marked with a language code specifying the last orthography reform that considered the variant valid. Here are some examples of language codes that might be used:
- cuci (L498556) - verb, kucing (L498558) - noun
- cuci, kucing:
- id-x-Q65205295: Q65205295 (Ejaan Baru, 1967-1972), Ejaan LBK [Lembaga Bahasa dan Kesusastraan], then called "Ejaan Baru"
- id-x-Q5378777: Enhanced Indonesian Spelling System (Q5378777) (EYD, 1972-2015); in 1972 Ejaan Baru was codified into EYD [Ejaan Bahasa Indonesia Yang Disempurnakan], with minor revisions in 1987, 2009, and 2015
- id-x-Q25470128: Q25470128 (EBI, 2015-now); in 2015 EYD was renamed EBI [Ejaan Bahasa Indonesia]
- Note: no need to enter the codes above, they should all be -> id
- tjutji, kutjing: id-x-Q7314707, Republican Spelling System (Q7314707) (Ejaan Soewandi/Ejaan Republik)
- tjoetji, koetjing: id-x-Q7330819, Van Ophuijsen Spelling System (Q7330819) (Ejaan van Ophuijsen) - Dutch spelling
Spelling variants that are no longer valid must always be verified.
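The cuci/kucing examples above follow regular letter substitutions, so candidate old spellings can be generated mechanically and then verified against a source. The sketch below is a naive, lowercase-only approximation: the c/tj and u/oe pairs come from the examples on this page, while the other pairs (j/dj, y/j, ny/nj, sy/sj, kh/ch) are added from general knowledge of the pre-1972 orthographies. It ignores diacritics and diphthongs such as au, and the function names are hypothetical.

```python
import re

# Current spelling -> pre-1972 spelling; digraphs are listed before single
# letters so that "ny" is matched before "y", and so on.
_OLD_LETTERS = {"ny": "nj", "sy": "sj", "kh": "ch", "c": "tj", "j": "dj", "y": "j"}
_PATTERN = re.compile("ny|sy|kh|c|j|y")

def to_republican(word: str) -> str:
    """Approximate the Republican (Soewandi) spelling of a current spelling."""
    # A single regex pass avoids re-converting letters produced by the mapping.
    return _PATTERN.sub(lambda m: _OLD_LETTERS[m.group()], word)

def to_van_ophuijsen(word: str) -> str:
    """Approximate the Van Ophuijsen spelling: Republican rules plus u -> oe."""
    return to_republican(word).replace("u", "oe")

for word in ("cuci", "kucing"):
    print(word, to_republican(word), to_van_ophuijsen(word))
```

Because the conversion over-generates in some cases, its output is only a candidate list; each generated form still needs to be checked before it is added.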
Regional variants code: id-x-Q4200642, Indonesian slang (Q4200642), e.g.: lihat/liat (L6542)
Tools
Steps
- add importScript( 'User:Bennylin/jsonLexeme.js' ); to your personal common.js (for example https://www.wikidata.org/wiki/User:Bennylin/common.js)
- two new buttons will appear in the left sidebar: Buat Leksem (Create Lexeme) and Sunting Leksem (Edit Lexeme)
- Click Buat Leksem, enter the JSON code, then click Buat (Create). 😎
- (The JSON code can be obtained by invoking the Lexeme-id module with the appropriate parameters. See the documentation at https://www.wikidata.org/w/index.php?title=User:Bennylin/sandbox&oldid=2191449597)
- Sunting Leksem works like Buat Leksem, but requires the ID of the lexeme to be edited. If you are on a lexeme page when you click "Sunting Leksem", the lexeme ID is filled in automatically
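The exact JSON expected by jsonLexeme.js is described in the linked documentation, not here. As a rough orientation only, the sketch below builds a minimal lexeme in the general Wikibase lexeme JSON shape (language item, lexical category, and lemmas keyed by language code). That the script accepts exactly this shape is an assumption, and the function name is hypothetical.

```python
import json

def lexeme_json(lemma: str, lexical_category: str,
                language_item: str = "Q9240", language_code: str = "id") -> str:
    """Serialize a minimal lexeme: one lemma, its language, and its category."""
    payload = {
        "type": "lexeme",
        "language": language_item,            # Indonesian (Q9240)
        "lexicalCategory": lexical_category,  # e.g. noun (Q1084)
        "lemmas": {
            language_code: {"language": language_code, "value": lemma},
        },
    }
    return json.dumps(payload, ensure_ascii=False, indent=2)

print(lexeme_json("kucing", "Q1084"))
```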
List of lexemes
Lexemes, forms and senses
Statistic | Number | Query link | Date |
---|---|---|---|
Lexemes | 20,150 | [1] | 12:59, 30 June 2024 (UTC) |
Forms | 412,524 | [2] | 12:59, 30 June 2024 (UTC) |
Senses | 615 | [3] | 12:59, 30 June 2024 (UTC) |
By lexical categories and affixation
Verbs: ~15000 lexemes
- (Q24905 verb ← query)
- 12976 results 08:22, 29 June 2024 (UTC)
- base verbs (verba asal): ~800 lexemes (~400 precategorial)
- base compound verbs & derived compound verbs: ~1500 lexemes
- reduplicated derived verbs: ~1300 lexemes
- derived verbs from free bases (verba turunan dasar bebas):
- Done ber-: 2,385 lexemes, ber- (reduplicated) (498), ber- (phrasal) (60)
- (L1330962 ber- ← link)
- Done ber-an: 299, ber-kan: 63, berke-an: 32, bersi-: 15
- (L1330987 ber-an ← link)
- (L1330988 ber-kan ← link)
- (L1331078 berke-an ← link)
- (L1331079 bersi- ← link)
- Done ter-, ter-i: 23, ter-kan: 51, ter- (reduplicated) (~150)
- (L1331003 ter-/ke- ← link)
- (L1331007 ter-i ← link)
- (L1331008 ter-kan ← link)
- Done meng- (di-, ku-, kau-): ~4300 lexemes, meng- (reduplicated) (~300)
- Done meng-i (-i, di-i, ku-i, kau-i): ~1000 lexemes, meng-i (reduplicated) (~30)
- (L1330953 meng-i ← link)
- (L1330954 -i ← link)
- Done meng-kan (-kan, di-kan, ku-kan, kau-kan): ~2500 lexemes, meng-kan (reduplicated) (~80)
- Done memper- (per-, diper-, kuper-, kauper-) (-i, -kan): 140
- (L1330959 memper- ← link)
- (L1330963 per- ← link)
- Done memper-i (per-i, diper-i, kuper-i, kauper-i): 22
- Done memper-kan (per-kan, diper-kan, kuper-kan, kauper-kan): 208
- member-, member-kan
- se- (verb), -an (verb), ke-an (verb)
Nouns: ~43000 lexemes
- (Q1084 noun ← query)
- 7067 results 08:22, 29 June 2024 (UTC)
- base nouns (nomina dasar): 18000 lexemes (each noun has 23 forms, for a total of 98,900 forms!)
- base & derived compound nouns: 18000 lexemes
- Done -an: ~1600 noun lexemes
- (L498538 -an ← link)
- Done pe-: 110, pe-an: 84
- (L1330997 pe- ← link)
- (L1330999 pe-an ← link)
- Done peng-: ~1500, peng-an: ~1500
- (L1330998 peng- ← link)
- (L479450 peng-an ← link)
- Done per- (noun): 15, per-an: ~500
- (L1330963 per- ← link)
- (L1330991 per-an ← link)
- Done ke-: 19, ke-an: ~1400
- (L1331000 ke- ← link)
- (L1331001 ke-an ← link)
- Done se-: ~300, se-an: 24
- (L1331002 se- ← link)
- (L1331074 se-an ← link)
- Done keter-an: 38, keber-an: 15, kepeng-an: 15, kese-an: 13
- perse-an
- Done other nouns: 5, ketidak-an: 7
Adjectives: ~5000 lexemes
- (Q34698 adjective ← query)
- 184 results 08:22, 29 June 2024 (UTC)
- base adjectives (adjektiva dasar): ~4500 lexemes
- other adjectives
Adverbs: ~300 lexemes
- 3 results 08:22, 29 June 2024 (UTC)
- (Q380057 adverb ← query)
Particles: ~200 lexemes
- 3 results 08:22, 29 June 2024 (UTC)
- (Q184943 particle ← query)
Numerals: ~200 lexemes
- 1 result 08:22, 29 June 2024 (UTC)
- (Q63116 numeral ← query)
Pronouns: ~50 lexemes
- 4 results 08:22, 29 June 2024 (UTC)
- (Q36224 pronoun ← query)
Others: interrogatives 45, prepositions 45, conjunctions 24, interjections 7, articles 6
- (Q2304610 interrogative word (kata tanya) ← query): 0 results 08:22, 29 June 2024 (UTC)
- (Q4833830 preposition (preposisi) ← query): 5 results 08:22, 29 June 2024 (UTC)
- (Q36484 conjunction (konjungsi) ← query): 5 results 08:22, 29 June 2024 (UTC)
- (Q83034 interjection (interjeksi) ← query): 2 results 08:22, 29 June 2024 (UTC)
- (Q103184 article (artikula) ← query): 1 result 08:22, 29 June 2024 (UTC)
- (Q1867204 function word (kata tugas) ← query): 0 results 08:22, 29 June 2024 (UTC)
- (Q3916780 numeral classifier (kata bantu bilangan) ← query): 0 results 08:22, 29 June 2024 (UTC)
- (Q63153 classifier (kata penggolong) ← query): 5 results 08:22, 29 June 2024 (UTC)
- (Q35102 proverb (peribahasa) ← query): 4 results 08:22, 29 June 2024 (UTC)
- wd:L498540 masuk angin
- wd:L1119282 pucuk dicinta, ulam tiba
- wd:L1119283 patah tongkat berjeremang
- wd:L1120358 tong kosong nyaring bunyinya
- (Q9788 letter (huruf) ← query): 1 result 08:22, 29 June 2024 (UTC)
- wd:L498537 a
- (Q102786 abbreviation (singkatan) ← query): 1 result 08:22, 29 June 2024 (UTC)
- wd:L498542 AD
- (Q101244 acronym (akronim) ← query): 1 result 08:22, 29 June 2024 (UTC)
- wd:L498543 angkot
Affixes:
- Confixes: 26 (Q1133968 confix ← query): 5 results 08:22, 29 June 2024 (UTC)
- Prefixes: 13 (Q134830 prefix ← query): 5 results 08:22, 29 June 2024 (UTC)
- Suffixes: 3 (Q102047 suffix ← query): 5 results 08:22, 29 June 2024 (UTC)
- -an, -kan/-in, -i
- -kah, -lah, -ku, -mu, -nya
- Infixes: 3 (Q201322 infix ← query): 5 results 08:22, 29 June 2024 (UTC)
- (L1331010 -el- ← link)
- (L498539 -em- ← link)
- (L1331011 -er- ← link)
- Precategorial / bound base morphemes: ~2000
Longest lemmas
- Longest words with affixation (without reduplication/phrasal affixation)
- Wikidata:Lists/lexemes/id/Interesting#Longest_words
- mengaktualisasikan
- mendemiliterisasi
- mendiskualifikasikan
- pendiversifikasian
- pengintensifikasian
- menginternasionalkan
- penginternasionalan
- menginventarisasikan
- mengualifikasikan
- meliberalisasikan
- bersimaharajalela
- menasionalisasi
- menasionalisasikan
- berperikemanusiaan
- merasionalisasi
- merehabilitasikan
- meresosialisasi
- mensosialisasikan
- pertelekomunikasian
- memvisualisasikan
- with phrasal affixation: mempertanggungjawabkannyalah (L1331435-F13) - verb
- with reduplication + affixation: seberuntung-beruntungnya (L523931-F1) (base lexeme) - adjective, perbendaharaan-perbendaharaannyalah (L698954-F22) - noun
- with alternative form/additional affixes: kecentang-perenangan-kecentang-perenangannyalah (L700048-F22) - noun
- See also
- Wikidata:Lexicographical data/Statistics/Count of lexemes by lexical category (list of lexical categories)
- Wikidata:Lexicographical coverage/id/Missing
- id:wikt:Wikikamus:ProyekWiki bahasa Indonesia/Daftar kata
- id:wikt:Pengguna:Bennylin/Sense - list of words based on the number of senses (KBBI < 2016): id:wikt:keras, id:wikt:sesuai, id:wikt:pun
- Wikidata:Lists/lexemes/id/Interesting
- Wikidata:Lists/lexemes/id/Issues (need Indonesian specific issues queries) - see User:Nikki/German/Issues
Queries
Indonesian nouns (noun (Q1084))
SELECT ?l ?lemma WHERE {
?l a ontolex:LexicalEntry ; dct:language wd:Q9240 ; wikibase:lexicalCategory wd:Q1084 ; wikibase:lemma ?lemma .
}
Indonesian verbs (verb (Q24905))
SELECT ?l ?lemma WHERE {
?l a ontolex:LexicalEntry ; dct:language wd:Q9240 ; wikibase:lexicalCategory wd:Q24905 ; wikibase:lemma ?lemma .
}
Indonesian adjectives (adjective (Q34698))
SELECT ?l ?lemma WHERE {
?l a ontolex:LexicalEntry ; dct:language wd:Q9240 ; wikibase:lexicalCategory wd:Q34698 ; wikibase:lemma ?lemma .
}
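The three queries above differ only in the lexical category item. A small helper can generate the query text for any category and build a request URL for the Wikidata Query Service. This is a sketch only: the function names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"  # Wikidata Query Service

def category_query(category_qid: str, language_qid: str = "Q9240") -> str:
    """Return the SPARQL text listing lexemes of one lexical category."""
    return (
        "SELECT ?l ?lemma WHERE {\n"
        f"?l a ontolex:LexicalEntry ; dct:language wd:{language_qid} ; "
        f"wikibase:lexicalCategory wd:{category_qid} ; wikibase:lemma ?lemma .\n"
        "}"
    )

def query_url(category_qid: str) -> str:
    """Build a GET URL asking the endpoint for JSON results."""
    params = {"query": category_query(category_qid), "format": "json"}
    return ENDPOINT + "?" + urlencode(params)

print(category_query("Q380057"))  # adverb (Q380057)
print(query_url("Q380057"))
```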
Get all existing Indonesian lexemes
The following query uses these:
- Items: Indonesian (Q9240)
SELECT ?lexeme ?lemma ?category ?categoryLabel WHERE {
?lexeme dct:language wd:Q9240; wikibase:lemma ?lemma; wikibase:lexicalCategory ?category.
FILTER(LANG(?lemma) = "id")
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],id". }
}
ORDER BY ?categoryLabel ?lemma
LIMIT 100
- Ordered by newest to oldest creation time
The following query uses these:
- Items: Indonesian (Q9240)
SELECT ?lno ?lexeme ?lemma ?category ?categoryLabel WHERE {
?lexeme dct:language wd:Q9240; wikibase:lemma ?lemma; wikibase:lexicalCategory ?category .
FILTER(LANG(?lemma) = "id")
BIND(xsd:integer(substr(str(?lexeme), 33)) as ?lno)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],id". }
}
ORDER BY DESC(?lno)
LIMIT 100
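The offset 33 in substr(str(?lexeme), 33) comes from the length of the entity URI prefix: the prefix plus the letter L is 32 characters long, and SPARQL substr is 1-based, so position 33 is the first digit of the lexeme number. The same arithmetic in Python (0-based slicing), with a hypothetical helper name:

```python
ENTITY_PREFIX = "http://www.wikidata.org/entity/"

def lexeme_number(uri: str) -> int:
    """Extract the numeric part of a lexeme URI, e.g. .../entity/L498558."""
    # Skip the prefix and the leading "L"; what remains is the number.
    return int(uri[len(ENTITY_PREFIX) + 1:])

print(len(ENTITY_PREFIX + "L"))  # characters skipped before the digits
print(lexeme_number("http://www.wikidata.org/entity/L498558"))
```

Sorting by this number approximates creation order, since lexeme IDs are assigned incrementally.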
- Uses date last modified
The following query uses these:
- Items: Indonesian (Q9240)
SELECT ?lexeme ?lemma ?modified WHERE { ?lexeme dct:language wd:Q9240; wikibase:lemma ?lemma; schema:dateModified ?modified. } ORDER BY DESC(?modified) LIMIT 100
Get the count of lexemes in Indonesian belonging to different lexical categories
The following query uses these:
- Items: Indonesian (Q9240)
SELECT ?category ?categoryLabel (COUNT(DISTINCT ?lexeme) AS ?count) WHERE {
?lexeme dct:language wd:Q9240; wikibase:lexicalCategory ?category.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],id". }
}
GROUP BY ?category ?categoryLabel
ORDER BY ?count
Others
- https://w.wiki/AXao - comparing languages of Indonesia (thanks Nikki)
Resources
- Great Dictionary of the Indonesian Language (Q4200623): https://kbbi.kemdikbud.go.id/
- Enhanced Indonesian Spelling System (Q5378777) (EyD / EBI): s:id:Pedoman Umum Ejaan Bahasa Indonesia yang Disempurnakan (1987)
- Q107272455 (PUPI) / pedoman penyerapan istilah asing
- Kamus Umum Bahasa Indonesia (Q97236177) (Purwadarminta, 1953, 1966)
- the headword (kata kepala) is the word used as the starting point for finding derived words, etc., and is taken from the base word (kata pokok). But if the base word is hard to find, ... those [derived words] are listed as headwords
- the base word with its examples, compound words, expressions, proverbs, etc.
- reduplicated words (from the base word)
- se- forms; ber- (an) forms; me-[/di-/ku-/kau-] (i, kan), memper-[/diper-/kuper-/kauper-] (i, kan), ter- (i, kan) forms, [member-kan, berke-an, bersi-, -i, -kan]; -an forms, [pe- (an), per- (an), ke- (an), ke(ber-, ter-, peng-, se-, tidak-)-an; -kah; -lah; -ku, -mu, -nya]
- s:An introduction to Indonesian linguistics, being four essays.djvu (1910-15 book about linguistics of Indonesian languages; translated to English in 1916), one of the earliest to differentiate it from Malay, p. vii. Word-base = kata dasar (word stem (Q210523)); root = akar kata (root (Q111029)). Common Indonesian and Original Indonesian (pp.77)
- Commons:Category:Indonesian dictionaries
- Category:Indonesian lemmas (Q30524912)
- id:wikt: Indonesian Wiktionary
- Module:Lexeme-id - adding lexemes via JSON, quick and easy. Documentation
See also
- Wikidata:Wiktionary
- Wikidata:Glossary for Wiktionary
- Wikidata:Tools/Lexicographical data
- https://ordia.toolforge.org/language/Q9240 - list and statistics of Indonesian lexemes in Wikidata
- https://ordia.toolforge.org/language/ - statistics of the languages with the most lexemes in Wikidata
- https://ordia.toolforge.org/search?q=air
- https://leksem-indonesia.toolforge.org/
- https://machtsinn.toolforge.org/?lang=9240
- Recent changes
- Wikidata:Lexicographical data/Documentation/Languages/ms