Wikidata:Item quality


Grading scheme

Class A
Criteria: Items containing all relevant statements, with solid references, and complete translations, aliases, sitelinks, and a high quality image.
More details:
  1. A plurality of external references for non-trivial statements
  2. Appropriate ranks
  3. Qualifiers where applicable
Reader's experience: All available information is recorded with reliable references.
Examples: Douglas Adams (Q42), Donald Trump (Q22686), Iron Man 2 (Q205028)

Class B
Criteria: Items containing all of the most important statements, with good references, translations, aliases, sitelinks, and an image.
More details:
  • The most important properties for this type of item have statements with:
  1. External references for non-trivial statements
  2. Some appropriate ranks
  3. Some qualifiers (if applicable)
Reader's experience: All of the basic information and some extended information, with references.
Examples: Hans van Mierlo (Q288771), Cinderella (Q15046091), Wressle Castle (Q8037764)

Class C
Criteria: Items containing most critical statements, with some references, translations, aliases, and sitelinks.
More details:
  • The critical properties for this type of item have statements with:
  1. References for some non-trivial statements
Reader's experience: Most of the basic information that you'd expect is available. May not be well referenced or complete.
Examples: Langenburg (Q82382), Montblanc (Q761735), Juana Díaz (Q2361577)

Class D
Criteria: Items with some basic statements, but lacking in references, translations, and aliases.
More details:
  • Some relevant properties for this type of item have statements
  • Has a label and description
  • Minimal applicable aliases
Reader's experience: The statements need to provide enough information to easily identify the item.
Examples: Putative PAS/PAC sensor protein Smed_0429 (Q23609272), Johann Heinrich Geymüller (Q1694538), Gustavus Simmons (Q381312)

Class E
Criteria: All items that do not match grade "D" criteria.
Examples: Arosa airfield (Q1433477), Saint-Fortunat chapel (Q22968194), Coniophora arida (Q10646558)

Notes

Relevant statements (completeness)

High quality items should contain all relevant statements. For instance, items with the statement instance of: human (Q5) should contain statements with the properties sex or gender, date of birth, place of birth, and so on. Editors are encouraged to use their own judgment of completeness when evaluating whether an item contains all relevant statements.
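As a rough illustration of the completeness check described above, the following sketch reports which expected properties are missing from an item. The function name, the `EXPECTED_BY_CLASS` table, and the input shape (a set of property IDs already present on the item) are assumptions made for this example; the list of expected properties covers only the three named above and is not an official checklist.

```python
# Hypothetical table: expected properties per "instance of" value.
# Only the three properties named in the text are listed; the real
# set of relevant properties is a matter of editorial judgment.
EXPECTED_BY_CLASS = {
    "Q5": {  # human
        "P21": "sex or gender",
        "P569": "date of birth",
        "P19": "place of birth",
    },
}


def missing_properties(instance_of, present_properties):
    """Return the labels of expected properties that the item lacks."""
    expected = EXPECTED_BY_CLASS.get(instance_of, {})
    return sorted(
        label for pid, label in expected.items()
        if pid not in present_properties
    )


# Example: a human item that has no place of birth statement yet.
item_properties = {"P31", "P21", "P569"}
print(missing_properties("Q5", item_properties))
```

A check like this can only flag obvious gaps; it does not replace the judgment call the guideline asks for.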

The definition of "applicable"

Several criteria, such as those related to images, aliases, and qualifiers, carry the note "applicable". This means that some quality classes allow items to lack images, aliases, or qualifiers, because these may not be applicable to the item. For instance, an image is not applicable to an item like externality (Q275372).

Statement sources

See also Help:Sources

Most statements should indicate where the data comes from via a source. Sources are not required for undisputed common knowledge, for statements that refer to an external source of information (e.g. authority control), or when the item itself is a source for a statement (e.g. the author of a book).

External references

Some statements will be sourced with a reference to Wikipedia (Q52) or another Wikimedia wiki, but in general, external references are desirable for most types of statements. See Help:Sources for exceptions.

Plurality of references

A well referenced item will have sources that come from more than one reference work, and those works should not be correlated with one another.
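One way to approximate the two ideas above (external references and plurality) is to count the distinct non-Wikimedia hosts among a statement's reference URLs. This is only a sketch: the function names and the list of Wikimedia host suffixes are assumptions made for this example, and distinct hosts are a crude proxy, since two different sites can still be correlated.

```python
from urllib.parse import urlparse

# Assumption: host suffixes treated as Wikimedia wikis (not exhaustive).
WIKIMEDIA_SUFFIXES = ("wikipedia.org", "wikidata.org", "wikimedia.org")


def is_wikimedia(host):
    """True if the host is one of the assumed Wikimedia domains."""
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in WIKIMEDIA_SUFFIXES
    )


def external_hosts(reference_urls):
    """Collect the distinct non-Wikimedia hosts among the reference URLs."""
    hosts = set()
    for url in reference_urls:
        host = urlparse(url).hostname
        if host and not is_wikimedia(host):
            hosts.add(host)
    return hosts


def has_plurality(reference_urls):
    """True if the references come from more than one external host."""
    return len(external_hosts(reference_urls)) > 1


refs = [
    "https://en.wikipedia.org/wiki/Douglas_Adams",
    "https://example.org/biography",
    "https://example.com/obituary",
]
print(external_hosts(refs), has_plurality(refs))
```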

The references of identifiers

The quality criteria also evaluate the references of identifiers, so these references should be counted as well.

Sitelinks

Not all items need to have sitelinks. For instance, biology-related items such as hsa-miR-424-5p (Q27595296) and SRY (sex determining region Y)-box 9 (Q21990154) are unlikely to have sitelinks.

High quality media

See also Wikipedia:Manual of Style/Images (Q16745099)

High quality items have images and other media that are significant and relevant in the topic's context, not primarily decorative.

Poor-quality images (dark or blurry; showing the subject too small, hidden in clutter, or ambiguous; and so on) should not be used. Think carefully about which images best illustrate the subject matter.

Translations

Labels, aliases, and descriptions are language-specific and require translation. The table below lists the top 10 languages by internet users, total number of speakers, and Wikipedia editors. Eight languages appear in all three lists: Chinese, Spanish, English, Arabic, Portuguese, Russian, French and German. To reach the largest audiences, the highest quality Wikidata items and properties should provide translations for these eight languages. Some items have specific languages for which translations are especially relevant. For example, kanji should have translations in Japanese and Chinese, and Bihar should have translations in Hindi and English.

Rank | Internet users[1] | Total number of speakers[2] | Wikipedia editors[3]
1 | English | Chinese | English
2 | Chinese | English | German
3 | Spanish | Spanish | French
4 | Arabic | Hindi | Spanish
5 | Portuguese | Arabic | Japanese
6 | Japanese | Malay | Russian
7 | Malay | Russian | Italian
8 | Russian | French | Chinese
9 | French | Portuguese | Portuguese
10 | German | German | Arabic
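The set of languages common to all three columns can be checked with a plain set intersection. The sketch below copies the three rankings from the table; the variable and function names are chosen for this example only.

```python
# Top-10 rankings copied from the table above.
internet_users = ["English", "Chinese", "Spanish", "Arabic", "Portuguese",
                  "Japanese", "Malay", "Russian", "French", "German"]
total_speakers = ["Chinese", "English", "Spanish", "Hindi", "Arabic",
                  "Malay", "Russian", "French", "Portuguese", "German"]
wikipedia_editors = ["English", "German", "French", "Spanish", "Japanese",
                     "Russian", "Italian", "Chinese", "Portuguese", "Arabic"]


def languages_in_all_lists(*rankings):
    """Return the languages that appear in every ranking."""
    common = set(rankings[0])
    for ranking in rankings[1:]:
        common &= set(ranking)
    return common


core_languages = languages_in_all_lists(
    internet_users, total_speakers, wikipedia_editors
)
print(sorted(core_languages))
```

Japanese, Malay, Hindi and Italian each appear in at least one list but not in all three, which is why they fall outside the intersection.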

Different scripts

The scripts used in translations matter independently of the actual language of the translation. For example, having Douglas Adams (Q42) include both "Douglas Adams"(en) and "道格拉斯·亞當斯"(zh) as different translations allows far more people to read the name than an identical "Douglas Adams"(pt). Having Latin script, Cyrillic script, Arabic script, and Chinese script represented allows nearly all speakers of the top languages to read the name of an item, even if their specific language is not translated.
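To see which scripts an item's labels already cover, one rough approach is to look at the Unicode character name of the first letter in each label. This is a sketch under assumptions: the function names are invented for this example, and the first word of the Unicode name (such as LATIN, CYRILLIC, ARABIC, or CJK) is only a crude stand-in for the script.

```python
import unicodedata


def script_of(text):
    """Guess a label's script from the Unicode name of its first letter."""
    for ch in text:
        if ch.isalpha():
            # e.g. "LATIN CAPITAL LETTER D" -> "LATIN"
            return unicodedata.name(ch, "UNKNOWN").split()[0]
    return None


def scripts_covered(labels):
    """Return the set of scripts used across a {language: label} mapping."""
    return {s for s in map(script_of, labels.values()) if s}


labels = {
    "en": "Douglas Adams",
    "zh": "道格拉斯·亞當斯",
    "pt": "Douglas Adams",
}
print(scripts_covered(labels))
```

Here the Portuguese label adds a language but no new script, while the Chinese label adds one.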


ORES

Adding the following code to your common.js will show the quality class and score in item pages and history pages: importScript("User:EpochFail/ArticleQuality.js");

It uses the ORES API (https://ores.wikimedia.org/v3/scores/wikidatawiki?format=json&revids=<revid>&models=itemquality). The result of the ORES API is the probability of each class (A, B, C, D, E). The predicted class is the class with the highest probability, and the score is calculated as 5*<probability of A>+4*<probability of B>+3*<probability of C>+2*<probability of D>+1*<probability of E>.
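The prediction and score can be reproduced from the class probabilities with a few lines of code. The sketch below assumes the probabilities have already been extracted into a dictionary keyed by class letter; the function names and the sample probabilities are invented for this example, and no request is sent to the API.

```python
# Weights from the score formula: A counts 5, down to E counting 1.
WEIGHTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def ores_url(revid):
    """Build the request URL for one revision, following the pattern above."""
    return (
        "https://ores.wikimedia.org/v3/scores/wikidatawiki"
        f"?format=json&revids={revid}&models=itemquality"
    )


def predicted_class(probability):
    """The predicted class is the one with the highest probability."""
    return max(probability, key=probability.get)


def weighted_score(probability):
    """Weighted sum of class probabilities, between 1 and 5."""
    return sum(WEIGHTS[cls] * probability[cls] for cls in WEIGHTS)


# Made-up probabilities for illustration only.
sample = {"A": 0.1, "B": 0.2, "C": 0.4, "D": 0.2, "E": 0.1}
print(predicted_class(sample), round(weighted_score(sample), 2))
```

Because the probabilities sum to 1, the score ranges from 1 (certainly E) to 5 (certainly A), so it distinguishes items within the same predicted class.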

References

  1. https://en.wikipedia.org/wiki/Languages_used_on_the_Internet#Internet_users_by_language
  2. https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers
  3. https://en.wikipedia.org/wiki/List_of_Wikipedias