Babel user information
kea-N This user has a native understanding of Kabuverdianu.
pt-5 This user has a professional level of Portuguese.
en-4 This user has near native speaker knowledge of English.
pt-BR-3 This user can contribute with an advanced level of Portuguese in the Brazilian dialect.
es-2 This user has an intermediate knowledge of Spanish.
gl-2 This user has an intermediate knowledge of Galician.
Users by language

About me: Waldir@meta.wikimedia

Work in progress

To do

Assorted

Geography

Population

  • Population of cities of Portugal: https://w.wiki/45s
    • Needs distinction between cities and municipalities (e.g. Braga vs. Braga (city))
    • Needs more data :)
    • Talk to Rui Cavaco Barrosa about this
    • INE.pt: Conceitos por tema: Território (concepts by topic: territory)
      • Distrito (district), Município/Concelho (municipality), Freguesia (civil parish)
      • Vila (town): Continuous population cluster with more than 3000 electors, having at least half of the following: a) medical post; b) pharmacy; c) cultural or performance centre; d) collective public transport; e) post office; f) commercial and hospitality establishments; g) school; h) bank.
      • Lugar (locality): Population cluster with ten or more residential buildings and a name of its own, regardless of whether it belongs to one or more freguesias. The buildings must not be more than 200 metres apart, except where separated by collective facilities such as roads, sports fields, gardens, etc.
      • Lugar urbano (urban locality): Lugar with a population of 2000 or more.
      • Quarteirão (city block): Set of buildings in an urban area bounded by streets.
      • Subúrbio (suburb): Urbanised territory on the periphery of a markedly urban population centre (i.e. the city centre).
      • Cidade (city): Continuous population cluster with more than 8000 electors, having at least half of the following: hospital with inpatient service; pharmacies; fire brigade; performance venue and cultural centre; museum and library; hotels and inns; primary and secondary schools; pre-primary schools and nurseries; urban and suburban public transport; public parks or gardens.
      • TODO: create a Wikidata item for these concepts
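
The Vila and Cidade definitions above boil down to "more than N electors, plus at least half of a checklist". A minimal Python sketch of that rule follows, only as an aid for sanity-checking data; the facility identifiers, the Settlement class and the function names are all made up for illustration, and real city or town status is not necessarily assigned by this check alone.

```python
from dataclasses import dataclass, field

# Checklists transcribed from the definitions above; the identifiers are illustrative.
VILA_FACILITIES = frozenset({
    "medical_post", "pharmacy", "cultural_centre", "public_transport",
    "post_office", "shops_and_hotels", "school", "bank",
})
CIDADE_FACILITIES = frozenset({
    "hospital", "pharmacies", "fire_brigade", "theatre_and_cultural_centre",
    "museum_and_library", "hotels", "primary_and_secondary_schools",
    "preschools_and_nurseries", "public_transport", "parks_or_gardens",
})


@dataclass
class Settlement:
    """Hypothetical record for a continuous population cluster."""
    name: str
    electors: int
    facilities: frozenset = field(default_factory=frozenset)


def meets(settlement, min_electors, checklist):
    """True if electors exceed the threshold and at least half the checklist is present."""
    present = len(settlement.facilities & checklist)
    return settlement.electors > min_electors and 2 * present >= len(checklist)


def classify(settlement):
    """Return 'cidade', 'vila' or 'neither' according to the two checklists."""
    if meets(settlement, 8000, CIDADE_FACILITIES):
        return "cidade"
    if meets(settlement, 3000, VILA_FACILITIES):
        return "vila"
    return "neither"
```

The cidade test runs first because a settlement that passes it would usually pass the vila test as well.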

Maps

Map of PPP median incomes
  • WIP: auto-generated map of median incomes in European countries
    • Work in progress! Still needs:
      • Display the results (value and corresponding year, if available) in map form
        • Here's a start
        • Here's a complex example that colors the shape files based on data: tinyurl.com/yckrqpcl (w.wiki URL shortening fails, possibly due to excessive length)
        • documentation
      • Ensure the units are PPS rather than raw Euros or other currency
        • That ought to be done by using psn instead of psv, but it doesn't seem to be working :(

Country and subdivision codes

Portugal
International

Cape Verde

Labels & descriptions in kea

Lexemes (kea dictionary)

Data sources

Wikimedia:

Other:

  • OmegaWiki
  • Glosbe
  • Verbix
  • Books
    • Dicionário Caboverdiano—Português (Manuel Veiga)
    • Léxico do dialecto crioulo do Arquipélago de Cabo Verde (Armando Napoleão Rodrigues Fernandes)

Intermediary storage:

Tools
  • Apparently this tool (written in Java) was used to convert Basque Wiktionary entries into Wikidata entries
Automated lists of Wikidata KEA lexemes

Check more ideas from the list at Wikidata:Lexicographical data/Ideas of queries

Automatic count of kea lexemes

This list is periodically updated by a bot. Manual changes to the list will be removed on the next update!

Query: SELECT (count(?item) as ?keaWords) WHERE { ?item a ontolex:LexicalEntry ; dct:language [wdt:P220 'kea'] . }
End of automatically generated list.
Automatic list of kea lexemes

This list is periodically updated by a bot. Manual changes to the list will be removed on the next update!

Query: SELECT ?item ?lexemeLabel WHERE { ?item a ontolex:LexicalEntry ; dct:language wd:Q35963 ; wikibase:lemma ?lexemeLabel . }
End of automatically generated list.
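
For running the two queries above outside the bot-maintained lists, a small Python sketch can turn a query into a Wikidata Query Service request URL. It only builds the URL and does not perform the request; the function name wdqs_url is made up here, and the count query is rewritten to use wd:Q35963 (the language item from the list query) instead of the P220 lookup.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Count query from the section above, using the language item directly.
COUNT_KEA_LEXEMES = """
SELECT (COUNT(?item) AS ?keaWords) WHERE {
  ?item a ontolex:LexicalEntry ;
        dct:language wd:Q35963 .
}
"""


def wdqs_url(query, fmt="json"):
    """Return a GET URL for the given SPARQL query (no network access happens here)."""
    params = {"query": query.strip(), "format": fmt}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(wdqs_url(COUNT_KEA_LEXEMES))
```

The resulting URL can be pasted into a browser or passed to any HTTP client.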

TEDxPraia

Goal: Add items for the TEDxPraia event (tedxpraia.com, TED event ID 19377) and talks

Scripts / gadgets

  • Auto-fill item titles (labels) with corresponding language's article title (using the same algorithm as link piping to cut out parentheticals, etc.)
  • Suggest content for unfilled descriptions, with:
    • First sentence of corresponding language's article
    • Automatic translation of description in other languages, in the order defined by translatewiki's fallback chain (should be accessible through API), ultimately falling back to English
  • Highlight (in bold) the label for the current interface language, or move it to the top
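
A rough Python sketch of the first two ideas above, to pin down the intended behaviour before writing an actual gadget. Both function names are made up. label_from_title only strips a trailing parenthetical, which is a simplification of what link piping does, and suggest_description only picks which existing description to start from (the translation step itself is out of scope); the fallback chain is passed in as a plain list rather than fetched from any API.

```python
import re

# Matches one trailing parenthetical such as " (city)" at the end of a title.
_TRAILING_PAREN = re.compile(r"\s*\([^()]*\)\s*$")


def label_from_title(title):
    """Suggest a label by dropping a trailing parenthetical from an article title."""
    stripped = _TRAILING_PAREN.sub("", title).strip()
    # If the whole title was a parenthetical, keep it unchanged.
    return stripped or title


def suggest_description(descriptions, fallback_chain):
    """Pick the first non-empty description along the fallback chain, then English.

    descriptions maps language codes to description strings.
    Returns a (language, text) tuple, or None if nothing is available.
    """
    for lang in [*fallback_chain, "en"]:
        text = descriptions.get(lang)
        if text:
            return lang, text
    return None
```

For example, label_from_title("Braga (city)") gives "Braga", while a title without a parenthetical is returned as-is.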

Guitar chords

Goal: model guitar chords in Wikidata.

Software data

  • Repology
    • See Comment by Repology's maintainer
    • See discussion in the property talk page
    • Automated report of outdated software versions in Wikidata
      • TODO: to connect this with one of the software version updater tools
    • Automated reports of packages missing in Wikidata that are in other repos: Arch, DistroWatch, etc.
      • TODO: convert these into a Mix'n'match catalog. Ideally weighted/filtered by number of (unrelated) repos?
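
One possible reading of "weighted/filtered by number of (unrelated) repos", sketched in Python: count the distinct repository families in which each missing package appears, drop packages seen in too few, and sort the rest. The input format (pairs of package name and repo family) and the function name are assumptions; this does not reflect the actual layout of the Repology reports.

```python
from collections import defaultdict


def rank_missing(sightings, min_families=2):
    """Rank packages by the number of distinct repo families that ship them.

    sightings is an iterable of (package, repo_family) pairs (hypothetical format).
    Returns (package, family_count) tuples, most widespread first, ties by name.
    """
    families = defaultdict(set)
    for package, family in sightings:
        families[package].add(family)
    kept = [(pkg, len(fams)) for pkg, fams in families.items() if len(fams) >= min_families]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))
```

Counting families rather than individual repositories is meant to avoid inflating packages that merely appear in several releases of the same distro.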

Unix distro manifests

I.e. the set of packages that come pre-installed with (specific versions of) Unix-like operating systems (distros)

Listeria

Try replacing the table at pt:Prémio Camões#Premiados with Listeria, based on a query like this: https://w.wiki/LJB

Lexemes

Fonts

Useful stuff

Assorted

  • languages to skip on wikidata game: zh,ja,ru,uk,hu,ko,pl,tr,et,el,ar,bg,vi
  • languages to prefer on wikidata game: pt,gl,es,it,ro,fr
  • wikitext to produce a link to an item by giving one of its sitelinks: {{Item}} (there's no option for "item by label", since multiple items with the same label can't be automatically disambiguated)
  • Narrowing down search results: To search for Wikidata items by their title on a given site, use Special:ItemByTitle.
  • According to Special:MyLanguageFallbackChain, the languages that appear in item pages are determined by the contents of the {{#babel}} box in the userpage.

Data model

TABernacle

  • TABernacle: provide a list of items for the rows and a list of properties for the columns; the tool fills in the matrix, helps identify missing data, and lets you add it directly.
  • There are short descriptions at Wikidata:Tools/Query data and the Tools directory
  • Issues / needed improvements
    • No way to sort the table columns (e.g. to locate empty cells)

Queries

Query building interfaces

Notes:

  • Neither VizQuery nor Wikidata Query Builder allows combining conditions with OR
  • Neither VizQuery nor Wikidata Query Builder allows specifying non-property conditions (number of sitelinks, label/description, ...)

REST endpoint

Documentation

Introduction / general reference
Prefixes
  • General reference
    • Prefixes (wd:, wdt:, etc.) are used to qualify elements of a query (operators and operands) depending on their type (e.g. item, property, value, etc.)
    • About prefixes
    • Full list
  • List of prefixes
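    • wd = Wikidata entity (e.g. ___)
      • wds = Wikidata statement (e.g. ___)
      • wdv = Wikidata value (e.g. ___)
      • wdt = Wikidata property in its direct ("truthy") form: links an item straight to the simple value of its best-rank statements (roughly equivalent to p + ps as shown below, but without access to ranks, qualifiers or references)
    • p = a property statement (e.g. ?item p:P123 ?prop.)
      • ps = prop/statement/: the simple value of a property statement (e.g. ?item p:P123 ?prop. ?prop ps:P123 ?propValue.)
        • psv = prop/statement/value/: the full value node of a statement; for quantities, the amount and unit as written in the statement are read from that node via wikibase:quantityAmount and wikibase:quantityUnit
        • psn = prop/statement/value-normalized/: like psv, but the value node is normalized to the base unit of the measured quantity (as far as I can tell, only available for units that have a conversion defined)
      • pq = prop/qualifier/: a qualifier for a property statement (e.g. ?item p:P123 ?prop. ?prop pq:P456 ?propQualifier.)
        • pqv = prop/qualifier/value/: the full value node of a qualifier (the qualifier counterpart of psv)
      • pr = prop/reference/: the simple value of a property on a reference node (e.g. ?prop prov:wasDerivedFrom ?ref. ?ref pr:P854 ?url.)
        • prv = prop/reference/value/: the full value node of a reference property (the reference counterpart of psv)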
    • TODO: add examples above where missing
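
As a quick reference, the prefixes listed above can be written out as a Python mapping to the IRI namespaces they abbreviate, with a tiny helper to expand a prefixed name. The names PREFIXES and expand are made up for this sketch; the namespaces are the ones the query service predefines.

```python
# Namespace IRIs behind the prefixes discussed above.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wds": "http://www.wikidata.org/entity/statement/",
    "wdv": "http://www.wikidata.org/value/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
    "psv": "http://www.wikidata.org/prop/statement/value/",
    "psn": "http://www.wikidata.org/prop/statement/value-normalized/",
    "pq": "http://www.wikidata.org/prop/qualifier/",
    "pqv": "http://www.wikidata.org/prop/qualifier/value/",
    "pr": "http://www.wikidata.org/prop/reference/",
    "prv": "http://www.wikidata.org/prop/reference/value/",
}


def expand(prefixed_name):
    """Expand e.g. 'wdt:P31' into its full IRI; raises KeyError for unknown prefixes."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    for name in ("wd:Q35963", "wdt:P31", "psn:P2047"):
        print(name, "->", expand(name))
```

Seeing the full paths side by side makes the p / ps / psv / psn hierarchy in the list easier to remember.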
Query syntax cheatsheet

(Condensed/edited from the excellent, but awfully verbose, Wikidata:SPARQL tutorial)

  • The core structure of any query is a semantic triple (subject, predicate, object).
    The "predicate" represents the relationship between subject and object, so I'll call it "relation" to make this clearer:
    • ?subject wdt:relation wd:object.
  • The object of one triple can be the subject of another triple, which allows building more complex queries:
    • ?nephew wdt:child ?father. ?father wdt:brother wd:uncle.
    • There are also two shorthands for this:
      • ?nephew wdt:child/wdt:brother wd:uncle. — using the path separator character / to chain predicates together, creating a "property path" from the subject to the object.
      • ?nephew wdt:child [ wdt:brother wd:uncle ]. — using [] to nest a partial triple, where the omitted part is the missing piece in the outer triple.
  • Use , to append another object to the previous triple, reusing both the subject and the predicate:
    • ?subject wdt:relation wd:object1, wd:object2.
  • Use ; to append a predicate-object to the previous triple's subject:
    • ?subject wdt:relation1 wd:object1;
      wdt:relation2 wd:object2.
  • Predicates can be combined using regex-like syntax:
    • Use the regex-like quantifiers *, + and ? to represent how many times a predicate appears in the query:
      • ?descendant wdt:child+ ?ancestor.
    • The two constructs above are commonly used to specify the notion "instance of X or of any subclass of X":
      • ?subject wdt:P31/wdt:P279* ?object.
    • As in regex, | means OR:
      • ?itemA wdt:relation1|wdt:relation2 wd:itemB.
    • As in regex, () groups expressions.
  • More useful info: https://www.slideshare.net/LeeFeigenbaum/sparql-cheat-sheet
    • UNION / MINUS (slide 8)
    • literal values (strings, numbers, ...)
    • comparison operators (!, &&, ||, <, =, !=, ...)
    • more predicate path operators (^, !, ...)
    • underspecified triples (e.g. two or even 3 variables)
  • Wikidata-specific helpers
    • label and description
      • Include SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
      • for a variable ?foo representing an item, that automatically binds its label to ?fooLabel, and its description to ?fooDescription
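
Putting several cheatsheet items together (a property path with / and *, plus the label service), here is a hedged Python sketch that assembles a complete query for "instances of X or of any subclass of X". The function name and the QID validation are illustrative choices, not part of any existing tool.

```python
import re

_QID = re.compile(r"^Q[1-9][0-9]*$")


def instances_query(class_qid, limit=10):
    """Build a SPARQL query listing instances of a class or of any of its subclasses."""
    if not _QID.match(class_qid):
        raise ValueError(f"not an item ID: {class_qid!r}")
    return f"""SELECT ?item ?itemLabel ?itemDescription WHERE {{
  ?item wdt:P31/wdt:P279* wd:{class_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}
LIMIT {int(limit)}"""


if __name__ == "__main__":
    # Q515 is the item for "city".
    print(instances_query("Q515", limit=5))
```

Because ?item is selected, the label service binds ?itemLabel and ?itemDescription automatically, as described in the last bullet.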

Example queries

Tools to work with scholarly works

aka academic publications (scientific papers, theses / dissertations, books, etc.)

Books

QuickStatements reference

  • QuickStatements v1 (deprecated)
  • QuickStatements v2 (recommended)
    • Can import commands in the v1 format
    • The batch mode ("run in background") doesn't seem too reliable; I got some errors, but then wasn't able to see what they were

Examples

Example 1: RescueTime (Q34637733): software version identifier (P348) = "2.12.5.1503"; publication date (P577) = +2017-06-09T00:00:00Z/11; platform (P400) = Microsoft Windows (Q1406); version type (P548) = stable version (Q2804309) (others listed here); reference URL (P854) = "https://www.rescuetime.com/updates/win_release_notes.html"; title (P1476) = "RescueTime for Windows Release Notes" (English).

Q34637733	P348	"2.12.5.1503"	P577	+2017-06-09T00:00:00Z/11	P400	Q1406	P548	Q2804309	S854	"https://www.rescuetime.com/updates/win_release_notes.html"	S1476	en:"RescueTime for Windows Release Notes"

Simplified template:

<item>	P348	"<version number>"	P577	+<date>T00:00:00Z/11	S854	"<url>"	S1476	en:"<title>"

Observations:

  • Note how source (reference) properties must be provided using the nonstandard "S" prefix — so "S854" instead of "P854".
  • Note that the whitespace characters are tabs, not spaces
  • Note that timestamps must have zero time
  • Note that the reference title requires a language specifier, here indicated by the en: prefix.
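
The simplified template and the observations above can be captured in a short Python helper that emits one tab-separated command line. This is a sketch under the assumptions of the template (day precision /11, English title by default); the function name is made up and no escaping of quotes inside values is attempted.

```python
from datetime import date


def version_statement(item, version, released, url, title, lang="en"):
    """Build one QuickStatements v1 line for a software version with a reference.

    released must be a datetime.date; the time part is always zero and the
    precision suffix /11 means "day".
    """
    fields = [
        item,
        "P348", f'"{version}"',                            # software version identifier
        "P577", f"+{released.isoformat()}T00:00:00Z/11",   # publication date (qualifier)
        "S854", f'"{url}"',                                # reference URL, S prefix
        "S1476", f'{lang}:"{title}"',                      # reference title with language
    ]
    return "\t".join(fields)  # separators are tabs, not spaces


if __name__ == "__main__":
    print(version_statement(
        "Q34637733", "2.12.5.1503", date(2017, 6, 9),
        "https://www.rescuetime.com/updates/win_release_notes.html",
        "RescueTime for Windows Release Notes",
    ))
```

Using a date object instead of a raw string makes it harder to end up with a malformed timestamp.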

Example 2 → (TODO: human-readable translation)

CREATE
LAST	Len	"Buying Lumber"
LAST	Den	"song from the soundtrack of the 2000 game The Sims"
LAST	P361	Q7764364	P1545	"4"	P2047	306U11574
LAST	P31	Q217199
LAST	P86	Q943225
CREATE
LAST	Len	"Mall Rat"
LAST	Den	"song from the soundtrack of the 2000 game The Sims"
LAST	P361	Q7764364	P1545	"5"	P2047	164U11574
LAST	P31	Q217199
LAST	P86	Q943225

Observations:

  • Note the usage of CREATE and LAST directives, since we're creating new items, rather than adding statements to an existing item
  • Note the Len and Den, for the English label and description
  • Note how each line can only contain a single statement, but a given statement (e.g. part of (P361)) can have any number of qualifiers.
  • Note how the duration (P2047) is given in seconds, with the unit written as U11574 (a U prefix), even though the unit's item is second (Q11574).
  • Note how the number for series ordinal (P1545) is provided as a string, even though a plain number should work as a quantity, according to the docs ("unit is optional")
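
When many similar items need creating, the CREATE / LAST block of Example 2 is easy to generate. The Python sketch below reproduces one such block per track; the function and parameter names are made up, and the default item IDs are simply copied from the example rather than looked up.

```python
def new_track_commands(title, ordinal, seconds, description,
                       part_of="Q7764364", instance_of="Q217199", composer="Q943225"):
    """Return the QuickStatements v1 lines that create one track item.

    Defaults are the item IDs used in Example 2; fields are tab-separated.
    """
    return [
        "CREATE",
        f'LAST\tLen\t"{title}"',            # English label
        f'LAST\tDen\t"{description}"',      # English description
        # part of, with series ordinal (as a string) and duration in seconds (unit U11574)
        f'LAST\tP361\t{part_of}\tP1545\t"{ordinal}"\tP2047\t{seconds}U11574',
        f"LAST\tP31\t{instance_of}",
        f"LAST\tP86\t{composer}",
    ]


if __name__ == "__main__":
    description = "song from the soundtrack of the 2000 game The Sims"
    tracks = [("Buying Lumber", 4, 306), ("Mall Rat", 5, 164)]
    for title, ordinal, seconds in tracks:
        print("\n".join(new_track_commands(title, ordinal, seconds, description)))
```

Each call yields six lines, so a whole track list can be pasted into QuickStatements as one batch.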

Example 3 → (TODO: human-readable translation)

Q2986828	P348	"CLDR 30.0.1"	S854	"http://cldr.unicode.org/index/downloads/cldr-30#TOC-CLDR-30.0.1-Maintenance-Release"	S1476	en:"CLDR 30 Release Note"	S958	"CLDR 30.0.1 Maintenance Release"

Example 4 → (TODO: human-readable translation)

Q839063	P1324	"http://git.savannah.gnu.org/cgit/oddmuse.git/"	P8423	Q186055

Lexemes

FAQ

TODO: Create a quickstart / FAQ / examples page in Wikidata:Lexicographical data

  • What are lexemes?
    • words, phrases/expressions, prefixes, acronyms, etc.
  • Wikidata vs. Wiktionary
    • In Wiktionary each page contains all homographs of a word, with sections for each language, and subsections for each lexical category (verb, noun, etc.)
      • There are separate pages for different forms of the same word, e.g. house vs. houses, pequeno vs. pequena, etc.
    • In Wikidata each page (lexeme) contains the homographs sharing the same writing-language combination, with one "sense" for each meaning of the word
      • Each variation (e.g. house & houses, pequeno & pequena, etc.) is stored in the same page, as a different "form" of the lexeme (forms belong to the lexeme itself, not to a sense)
      • Other homographs are connected via homograph lexeme (P5402) (these can be for any language, not just the same as the lexeme's)
  • Lexeme vs. other Wikidata items
    • The lexeme has statements that describe the word
    • The item has statements that describe the concept

Tools

Data model

Also:

Problems

  • The creation form shows a "language variant" field when the entered language is not recognized.
    • See Help:Monolingual text languages and the tracking ticket phab:T144272.
    • For some reason the list of languages is restricted to the one approved by the Language committee for new Wikipedias, rather than e.g. the full list of languages from CLDR
    • To request a language to be supported, a new Phabricator task needs to be created in the same model as one of the child tasks of the one linked above
    • As a workaround, new lexemes can still be created by using the mis language code, as described in the help page linked above.
    • For Kabuverdianu, the code is actually already available/linked (since phab:T127435), but the extra field was appearing nonetheless in the creation form; possibly that was due to a "no value" value for the language code, which was just removed in this edit.
      • TODO: if the problem persists, maybe a new Phabricator issue needs to be created.

Question-answering

Tools:

See also:

Benchmarks:

Mini-bios

Thanks to the Wikidata Game, it will be possible to move quickly to a state where Wikidata has all the information needed to build automated mini-bios in the form

<label> (<place of birth>, <date of birth> – <place of death>, <date of death>) was a <country of citizenship> <occupation> who <description>.

In fact, the description field for people in Wikidata should probably forgo occupation and nationality, and go straight to their claim to notability, since the former are redundant with the corresponding fields.
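
A minimal Python sketch of the mini-bio template, assuming the values have already been fetched from Wikidata and rendered as plain strings. The Person class and its field names are made up, the citizenship is assumed to be available as a demonym (e.g. "Cape Verdean"), and the description is assumed to be phrased so that it can follow "who".

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical, already-formatted values for one person."""
    label: str
    birth_place: str
    birth_date: str
    death_place: str
    death_date: str
    citizenship: str   # demonym, e.g. "Cape Verdean"
    occupation: str
    description: str   # claim to notability, phrased to follow "who"


def mini_bio(p):
    """Fill the mini-bio template, choosing 'a' or 'an' by the demonym's first letter."""
    article = "an" if p.citizenship.lower().startswith(tuple("aeiou")) else "a"
    return (
        f"{p.label} ({p.birth_place}, {p.birth_date} – {p.death_place}, {p.death_date}) "
        f"was {article} {p.citizenship} {p.occupation} who {p.description}."
    )
```

This also shows why a description that repeats occupation and nationality would make the generated sentence redundant.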

This proposal was originally posted here.

Related resources