Wikidata:Lexicographical data

Welcome to the project page for lexicographical data!

What is lexicographical data?

Since its launch in 2012, Wikidata, the multilingual knowledge base, has focused mainly on concepts: Q-items describe a thing or an idea, not the word for it. Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities called Lexemes (L), Forms (F) and Senses (S). You can learn more about the data model on the documentation page.
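To make the three entity types concrete, here is a minimal sketch that classifies an entity ID as a Lexeme, Form or Sense. It assumes the usual ID shapes: a Lexeme looks like L123, and its Forms and Senses are numbered under it as L123-F1 and L123-S1. The helper name `parse_entity_id` and the `LexEntityId` type are hypothetical and exist only for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed ID shapes: "L123" (Lexeme), "L123-F1" (Form), "L123-S1" (Sense).
_ENTITY_ID = re.compile(r"L([1-9][0-9]*)(?:-([FS])([1-9][0-9]*))?")


class LexEntityId(NamedTuple):
    kind: str               # "lexeme", "form" or "sense"
    lexeme_id: str          # ID of the owning Lexeme, e.g. "L123"
    sub_id: Optional[str]   # "F1", "S2", ... or None for a Lexeme itself


def parse_entity_id(raw: str) -> LexEntityId:
    """Classify a lexicographical entity ID (hypothetical helper)."""
    match = _ENTITY_ID.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a Lexeme, Form or Sense ID: {raw!r}")
    number, letter, index = match.groups()
    lexeme_id = f"L{number}"
    if letter is None:
        return LexEntityId("lexeme", lexeme_id, None)
    kind = "form" if letter == "F" else "sense"
    return LexEntityId(kind, lexeme_id, f"{letter}{index}")


if __name__ == "__main__":
    for example in ("L123", "L123-F1", "L1234-S2"):
        print(example, "->", parse_entity_id(example))
```

A Q-item ID such as Q42 is rejected with a `ValueError`, which mirrors the distinction drawn above between concepts and the words that describe them.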

The structured description of words is directly connected to the concepts. It allows editors to describe precisely all words in all languages, and it is reusable, just like the rest of Wikidata's content, by multiple tools and queries: everything that the community creates to play with words. Lexicographical data can be reused on the Wikimedia projects, and can provide support for Wiktionary.
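As one illustration of reuse through queries, the sketch below builds a SPARQL query string that lists lemmas for a given language. It only constructs text and sends nothing over the network. The function name `build_lemma_query` is hypothetical, and the predicates (`ontolex:LexicalEntry`, `dct:language`, `wikibase:lemma`) reflect the Lexeme RDF mapping as commonly used in Query Service examples; treat them as assumptions and check them against the data model documentation before relying on them.

```python
import re

# Prefixes are spelled out so the query is self-contained.
PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>",
    "PREFIX wikibase: <http://wikiba.se/ontology#>",
    "PREFIX dct: <http://purl.org/dc/terms/>",
    "PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>",
)


def build_lemma_query(language_item: str, limit: int = 10) -> str:
    """Return a SPARQL query listing lexemes and lemmas for one language.

    language_item is the Q-item of the language (assumed: "Q1860" for English).
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", language_item):
        raise ValueError(f"not a Q-item ID: {language_item!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    body = (
        "SELECT ?lexeme ?lemma WHERE {",
        "  ?lexeme a ontolex:LexicalEntry ;",
        f"          dct:language wd:{language_item} ;",
        "          wikibase:lemma ?lemma .",
        "}",
        f"LIMIT {limit}",
    )
    return "\n".join(PREFIXES + body)


if __name__ == "__main__":
    print(build_lemma_query("Q1860", limit=5))
```

The printed query can be pasted into a SPARQL endpoint that serves lexicographical data. Validating the Q-item before interpolating it keeps arbitrary text out of the query.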

Timeline

2012: first discussions about including lexicographical data in Wikidata
2013–2016: many discussions with editors and developers, leading to several versions of the development plan
2016: start of the development
2017: continuing the development of the structure (Wikibase/Lexeme), development of several tools for Wiktionary (Sitelinks)
May 23rd, 2018: deployment of the first version of lexicographical data (done)
October 16th, 2018: enabling lexicographical data in the Query Service (done)
October 18th, 2018: enabling Senses (done)
2018–2022: iteration of the project, maintenance

Useful links

Lexicographical properties

General	item for this sense grammatical gender conjugation class word stem derived from lexeme (mode of derivation, object form, object sense) Wikidata property example for lexemes Wikidata property example for forms officialized by attested in first attested from stroke count combines lexemes auxiliary verb homograph lexeme Han character in this lexeme valency requires grammatical feature usage example (subject form, subject sense) paradigm class root creates lexeme type translation synonym antonym troponym of false friend Wikidata property example for senses classifier location of sense usage language style collective noun for animals variety of lexeme, form or sense grammatical aspect gloss quote pertainym of predicate for said to be the same as lexeme semantic derivation of

Phonetics	pronunciation audio IPA transcription X-SAMPA code pronunciation variety Slavistic Phonetic Alphabet transcription hyphenation tone or pitch accent class position of accent nucleus position of devoiced vowel position of nasal sonant UPA transcription pronunciation IAST transliteration

Other properties useful in lexicography	image described at URL described by source quotation Bharati Braille

Values for property instance of or has characteristic of the lexeme	plurale tantum/collective noun/singulare tantum inanimate/animate reconstructed word acronym

Values for property instance of or has characteristic of the form	obsolete form depreciative form rare form potential form non-depreciative form vocalic form non-vocalic form colloquial form strong form weak form incorrect form former form spelling recommended by Duden alternative spelling

Values for property language style of the sense	outdatedness colloquial language archaism rare idiomatic humorous euphemism vulgarism pejorative neologism profanity

Sandboxes	sandbox (L123) sandbox 2 (L1234) Sandbox-Lexeme Sandbox-Form Sandbox-Sense

Dictionaries and databases (list per language)	IHO Hydrographic Dictionary (S-32) Number SJP Online ID Oxford English Dictionary entry ID (pre-July 2023) SGJP Online ID Doroszewski Online ID Kopaliński Online ID WSO Online ID WSJP ID Dobry słownik ID Treccani Vocabulary ID SPXVI ID Słownik języka polskiego XVII i XVIII wieku ID Uralonet ID Álgu lexeme ID Oqaasileriffik online dictionary ID Ġabra lexeme ID Oudnederlands Woordenboek GTB ID Vroegmiddelnederlands Woordenboek GTB ID Middelnederlandsch Woordenboek GTB ID Techopedia ID DanNet 2.2 word ID Bantu Lexical Reconstructions ID Elhuyar Dictionary ID ePSD ID Ahotsak lexeme Sri Granth word ID TLFi ID Diccionario de la lengua española word (non-ID) Punjabipedia ID PIV Online ID Reta Vortaro ID Svenska Akademiens Ordbok unique ID ODLIS ID APA Dictionary of Psychology entry Biology Online Biology Dictionary entry synonymer.se ID sense on DHLE Kielitoimiston sanakirja ID Svensk ordbok ID IGI Global Dictionary ID Lur Encyclopedic Dictionary ID VerbaAlpina ID Investopedia term ID Glossary of Astronomical Terms ID Arabic Ontology lemma ID Mindat.org Glossary of Mineralogical Terms ID Revised Mandarin Chinese Dictionary ID Merriam-Webster online dictionary entry Dictionary.com entry Collins Online English Dictionary entry The Britannica Dictionary entry Dicionário de Gentílicos e Topónimos lemma ID Ma'agarim ID JLect entry ID Online Torwali Dictionary ID Little Academic Dictionary ID 18th Century Russian Dictionary ID Urdu Lughat ID Middle English Dictionary entry ID Taiwanese-Japanese Dictionary ID Cambridge Dictionary entry (British English) Cambridge Dictionary entry (American English) Macmillan Dictionary entry (British English) Macmillan Dictionary entry (American English) STEDT ID Infopédia entry Intercontinental Dictionary Series unit ID elexiko ID OWID Neologismenwörterbuch ID OWID Deutsches Fremdwörterbuch ID OWID Sprichwörterbuch ID OWID Kommunikationsverben ID Kleines Wörterbuch der Verlaufsformen im Deutschen ID
Dictionary of Frequently-Used Taiwanese Hokkien ID Ushakov Dictionary ID Bosworth-Toller's Anglo-Saxon Dictionary Online ID Sindhi English Dictionary ID Encyclopedia of Italian ID Jeju Dialect Dictionary ID Online Aboriginal Language Dictionary ID Dictionary of South African English entry ID Digital Daijisen ID Jewish English Lexicon ID Michaelis ID LSJ Wiki ID Tatoeba sentence ID Quranic Arabic Corpus root ID Law Insider Legal Dictionary entry woordenlijst.org ID Mandarin-Cantonese Comparative Study ID Woordenboek der Nederlandsche Taal GTB ID Ordbok över Finlands svenska folkmål ID Il Nuovo De Mauro ID Arabic Ontology lexical concept ID Oxford English Dictionary object ID (post-July 2023) OSL ID The Oxford Dictionary of Phrase and Fable ID The Law Dictionary entry Explanatory Ukrainian Dictionary ID Australian Oxford Dictionary ID The New Zealand Oxford Dictionary ID Canadian Oxford Dictionary ID Slovenian Etymological Dictionary ID Meurgorf identifier New Oxford Rhyming Dictionary ID A Dictionary of Biology ID A Dictionary of Plant Sciences ID A Dictionary of Zoology ID Irish-English Dictionary ID

Wikidata:Lexicographical data • Examples & resources • Property proposals • Create a new Lexeme • Wikidata Lexeme Forms • Property used with senses • Template:Language properties and (with their qualifiers)