STATS! and State of Wikidata
Who is editing Wikidata? Wikipedians? Bots? How many claims have references? How much vandalism is there? How much is Wikidata used in the Wikipedias?
- Wikimania State of Wikidata 2013.
- Wikidata:Statistics/Wikipedia/Type of content
- User:Jura1/People charts
- 2014_04_wikidata_by_country.pdf, Magnus Manske
- https://stats.wikimedia.org/wikispecial/EN/ReportCardTopWikis.htm#lang_wikidata Wikidata at a glance
- meta:Research:Wikidata gap analysis
- Language usage on Wikidata, February 1, 2016 / addshore
Distribution of items (2015-09-07)
Number of Wikidata edits (in millions) for each action set in the item namespace (Claudia Müller-Birn, Benjamin Karran, Janette Lehmann, Markus Luczak-Rösch: "Peer-production system or collaborative ontology engineering effort: what is Wikidata?" OpenSym 2015 Proceedings)
Statements reference to other sources by statement type (4/2013 - 12/2013). https://tools.wmflabs.org/wikidata-todo/stats.php
- Wikidata:Database reports/Number of edits (as of 2015-05-01)
- Number of edits (in all namespaces): 209,819,199:
- edited by IPs: 0.5% (User:Jura1/test3)
- edited by bots: 82.6%
- edited with WiDaR: 12.3%
- edited by registered users without WiDaR: ~4%
- 2015-04:
- total edits: 5,475,415
- IP edits: 27,938
- bot edits: 2,947,547
- WiDaR edits: 2,031,065
- edits by registered users without WiDaR: 468,865 (~8.6%) (widar/bot?)
- Wikidata:Database reports/Number of edits/plot:
- SuccuBot (de): 45,212,327 (14.8%)
- Research Bot (de): 19,776,148 (6.5%)
- PLbot (de): 12,487,091 (4.1%)
- Reinheitsgebot (de): 11,624,213 (3.8%)
- QuickStatementsBot (de): 11,367,426 (3.7%)
- ProteinBoxBot (de/nl): 12,211,661 (4.0%)
- Edoderoobot (nl): 22,833,627 (7.5%)
- RobotMichiel1972 (nl): 10,017,807 (3.3%)
- BotNinja (bg): 26,965,135 (8.8%)
- ValterVBot (it): 25,361,973 (8.3%)
- Dexbot (fa): 23,857,242 (7.8%)
- KrBot (ru): 19,804,772 (6.5%)
- Mr.Ibrahembot (ar): 19,685,878 (6.4%)
- Emijrpbot (es): 18,644,530 (6.1%)
- Harej (en): 15,190,706 (5.0%)
- GZWDer (flood) (zh): 10,554,910 (3.5%)
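The April 2015 edit counts listed above can be converted into shares with a few lines of arithmetic. This is a minimal sketch that uses only the counts given on this page; the category labels and the helper name `edit_shares` are illustrative assumptions, not part of any Wikidata tool or report.

```python
# Edit counts for April 2015, copied from the list above.
APRIL_2015_EDITS = {
    "IP": 27_938,
    "bot": 2_947_547,
    "WiDaR": 2_031_065,
    "registered, no WiDaR": 468_865,
}
TOTAL_EDITS = 5_475_415


def edit_shares(counts, total):
    """Return each category's share of `total` in percent, rounded to 0.1."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Sanity check: the four categories account for every edit in the month.
assert sum(APRIL_2015_EDITS.values()) == TOTAL_EDITS

shares = edit_shares(APRIL_2015_EDITS, TOTAL_EDITS)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

On these counts, bots account for roughly 54% of the April 2015 edits and WiDaR for roughly 37%, compared with 82.6% and 12.3% over the whole edit history.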
Wikidata content stats
Wikidata in % of all statements (about 70 million)
Groups by number of statements with specific properties.
- Wikidata:Project_chat/Archive/2015/06#Revert_analysis: I analyzed all 2073 reverts of IP edits during the last 30 days and identified the country of origin of the IP's. In the graph below all countries with more than ten reverts are listed. Remarkably, seven out of the ten top countries are Spanish speaking countries. In total, IP's from these seven countries are responsible for 1117 reverted edits or 55% of all reverted edits during the studied period. I hypothesize that many IP's are coming from Spanish Wikipedia articles with a [editar datos en Wikidata] link as in es:Jesé Rodríguez. Such links are one hand very welcome as Wikipedia authors can faster improve the connected Wikidata item, on the other hand they open a new playground for vandals. I checked if the vandalizing IP addresses from Spanish speaking countries are also active on Spanish Wikipedia. Though only 14% of these IP's have also reverted edits on Spanish Wikipedia. This means that vandals on Wikidata and Spanish Wikipedia are different people but most probably many Wikidata vandals are coming from Spanish Wikipedia --Pasleim (talk) 20:26, 15 June 2015 (UTC)
- I created a second plot showing the number of reverts by country relative to the total number of IP edits made in the country. Only countries with more than 100 edits are shown. In average, 7% of all IP edits get reverted. --Pasleim (A) (talk) 20:08, 16 June 2015 (UTC)
- Stefan Heindorf, Martin Potthast, Benno Stein, Gregor Engels: "Towards Vandalism Detection in Knowledge Bases: Corpus Construction and Analysis", SIGIR’15, August 09-13, 2015, Santiago, Chile. http://dx.doi.org/10.1145/2766462.2767804 : "Our corpus is based on a database dump of the full revision history of Wikidata until November 7, 2014 (...) about 85% of revisions are made automatically by bots approved by Wikidata’s community. (...) As we are interested in detecting ill-intentioned contributions by humans, and not errors in bots, we base our corpus on the 24 million manual revisions."
- Revisions made on Wikidata: 167,802,227 (100%)
- Revisions made on meta pages: 1,211,628 (1%)
- Revisions made on special items: 11,167 (0%)
- Revisions made automatically: 142,574,999 (85%)
- Revisions made manually: 24,004,433 (14%)
- "Of the 24 million manual revisions made on Wikidata, a total of 103,205 have been reverted via rollbacks, and 64,820 via undo/restore. Based on our below validity analysis, we label roll-back revisions as vandalism, whereas this cannot be done with confidence for undo/restore revisions."
- (Figure 2: Manual revisions on Wikidata per month. Revisions affecting textual content (labels, descriptions, and aliases) are distinguished from revisions affecting structural content (statements and sitelinks). Major growth events are labeled.) "The first jump of growth rate was caused by enabling statement creation for first time. In the months around this event, Wikidata was connected to the Wikipedias in various languages, adding millions of statements and sitelinks. (...) The second growth rate increase is due to the emergence of semi-automatic editing tools for Wikidata, most notably the Wikidata Game."
- Vandalized Item Categories: "Table 1: Top vandalized items Cristiano Ronaldo, Lionel Messi, One Direction, Portal:Featured content, Justin Bieber, Barack Obama, English Wikipedia, Selena Gomez (...) the least vandalized category Places gets almost 4 times as much attention by all editors (31%) (...) The focus of vandals deviates significantly from typical editors (...) while categorizing the revision samples, we noticed that 11% of the vandalized items concerned India, cross-cutting all categories, compared to 0.5% overall."
- Vandalized Content Types: "About 57% of the vandalism happens in textual content like labels, descriptions, and aliases; and about 40% happens in structural content like statements and sitelinks. The remaining 2% of miscellaneous vandalism includes merging of items and indecisive cases."
- Vandals: "About 86% (88,592 of 103,205) of vandalism on Wikidata originates from anonymous users. (...) Unregistered users primarily vandalize textual content and sitelinks, whereas registered users primarily vandalize statements and sitelinks."
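The figures quoted above from Heindorf et al. can be cross-checked with simple arithmetic. This is a minimal sketch that uses only the numbers quoted on this page; the variable names and the helper `pct` are illustrative assumptions, and the rollback rate among manual revisions is derived here rather than quoted from the paper.

```python
# Figures quoted above from Heindorf et al. (SIGIR'15).
TOTAL_REVISIONS = 167_802_227
REVISIONS = {
    "meta pages": 1_211_628,
    "special items": 11_167,
    "automatic": 142_574_999,
    "manual": 24_004_433,
}
ROLLBACK_REVERTED = 103_205     # labeled as vandalism in the paper
UNDO_RESTORE_REVERTED = 64_820  # not labeled as vandalism with confidence
ANONYMOUS_VANDALISM = 88_592    # rollback-reverted revisions by anonymous users


def pct(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


# Sanity check: the four groups add up to the total revision count.
assert sum(REVISIONS.values()) == TOTAL_REVISIONS

revision_shares = {name: pct(n, TOTAL_REVISIONS) for name, n in REVISIONS.items()}
rollback_rate = pct(ROLLBACK_REVERTED, REVISIONS["manual"])
anonymous_share = pct(ANONYMOUS_VANDALISM, ROLLBACK_REVERTED)

for name, share in revision_shares.items():
    print(f"{name}: {share:.1f}% of all revisions")
print(f"rollback-reverted: {rollback_rate:.2f}% of manual revisions")
print(f"anonymous: {anonymous_share:.1f}% of rollback-reverted revisions")
```

The group totals reproduce the 85% and 14% shares quoted for automatic and manual revisions, and the anonymous share reproduces the quoted 86%. The derived rollback rate is well under 1% of manual revisions.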
Wikidata in Wikipedia
- overview at de:Benutzer:Mabschaaf/WD-Nutzung in deWP
State of Wikidata 2015 (Wikimania Mexico 2015)
"Tell us about Wikidata at "your" Wikipedia"
- meta:Tell us about your Wikipedia
- meta:Tell us about Korean Wikipedia etc.
- Wikidata:Project_chat/Archive/2015/05#Using Wikidata on small wikis
- How do you track usage of WD in your WP? (frWP categories, deWP tables?)
- Do you have RfC/consensus/rules about use of WD? (enWP, deWP)
- Experiments with WD? (huWP automatic population graph per WD)
- Best/worst experience? (ocWP)
- WD project pages on WD? (village pump, ...)
- Other things you would like to tell us?