User:Lectrician1/The grand mess of data on Wikimedia

There are multiple data-related projects under way across the Wikimedia projects at the moment. It is quite a mess to grasp and manage, and it is hard to ensure that the projects do not conflict with one another. However, this work is important, and it is probably some of the most technically useful work the Wikimedia community is doing.

This list was made to summarize all of those projects. Almost all of the descriptions are copied from the links that precede them. Their rights belong to their authors.

Feel free to add to this list!

Abstract Wikipedia

  • The goal of Abstract Wikipedia is to let more people share more knowledge in more languages. Abstract Wikipedia is a conceptual extension of Wikidata. In Abstract Wikipedia, people can create and maintain Wikipedia articles in a language-independent way. A particular language Wikipedia can translate this language-independent article into its language. Code does the translation.
  • Wikifunctions is a new Wikimedia project that allows anyone to create and maintain code. This is useful in many different ways. It provides a catalog of all kinds of functions that anyone can call, write, maintain, and use. It also provides code that translates the language-independent article from Abstract Wikipedia into the language of a Wikipedia. This allows everyone to read the article in their language. Wikifunctions will use knowledge about words and entities from Wikidata.
    • mw:Extension:WikiLambda: The WikiLambda extension provides Wikimedia wikis with a wikitext parser function that calls for the evaluation of functions written, managed, and evaluated on a central wiki.
      It is in early development and forms the core of the "Wikifunctions" software stack, as part of the work towards m:Abstract Wikipedia.

Wikidata

  • mw:Wikidata Bridge: The Wikidata Bridge (formerly known as “client editing”) is a project aiming to make it possible to edit Wikidata’s data directly from Wikipedia. This will be achieved by an interface, connected to the infobox, that users can access directly from their local wiki.
  • The Query namespace: Will be used for queries that generate lists on Wikipedia and maybe example queries on Wikidata too.
    • Wikidata:Listeria: Listeria is a Wikidata tool that allows the use of SPARQL queries to define lists, and provides a bot (ListeriaBot) that will update wiki pages containing these lists whenever the results of their defining SPARQL queries change.
  • Wiktionary: Wikidata aims to support Wiktionary editors and content, by including lexicographical data in the knowledge base and also by providing automatic language links for the projects.
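
To make the SPARQL-defined lists above more concrete, here is a minimal sketch of how a list-defining query could be turned into a request URL for the public Wikidata Query Service endpoint. The query itself (instances of house cat, Q146) and the helper name build_query_url are illustrative assumptions; this is not how ListeriaBot is implemented, and Listeria's own template parameters are not shown here.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: items that are an instance of (P31) house cat (Q146).
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
""".strip()


def build_query_url(sparql, endpoint=WDQS_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results.

    build_query_url is a hypothetical helper for this sketch only.
    """
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    # Printing the URL needs no network access; fetching it would.
    print(build_query_url(EXAMPLE_QUERY))
```

A bot in the style described above would run such a query periodically and rewrite the list on the wiki page only when the results differ from the previous run.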

Structured Data Across Wikimedia

  • Ability to add Wikidata metadata to Wikipedia articles and sections.
  • Recommendations to Commons users to upload images that depict particular things, so that the images can be added to Wikipedia articles.

Tabular Data

Tabular data allows users to create CSV-like tables of data, and use them from other wikis to create automatic tables, lists, and graphs.

Commons

Commons tabular data offers an alternative to Wikidata that can be especially useful for time series, numerical data, or data that are only available under a CC-BY or CC-BY-SA licence. In order to make these tables more usable, we need to link them from Commons and standardize them using Wikidata conventions and identifiers.
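
As a rough illustration of what a CSV-like table looks like to code, the sketch below parses a small JSON document and turns its rows into dictionaries keyed by column name. The layout (license, description, schema.fields, data) is modelled on the JSON format of tabular data pages and should be checked against the tabular data help page; the values and the helper name rows_as_dicts are made up for this example.

```python
import json

# Hypothetical page content, modelled on the JSON layout of tabular data pages.
# The numbers are made-up placeholder values.
EXAMPLE_TABLE = json.loads("""
{
  "license": "CC0-1.0",
  "description": {"en": "Example time series (made-up values)"},
  "schema": {
    "fields": [
      {"name": "year", "type": "number", "title": {"en": "Year"}},
      {"name": "value", "type": "number", "title": {"en": "Value"}}
    ]
  },
  "data": [[2001, 10], [2002, 12], [2003, 15]]
}
""")


def rows_as_dicts(table):
    """Pair every row with the column names declared in the schema."""
    names = [field["name"] for field in table["schema"]["fields"]]
    return [dict(zip(names, row)) for row in table["data"]]


if __name__ == "__main__":
    for row in rows_as_dicts(EXAMPLE_TABLE):
        print(row)
```

Keeping the column names and types in a schema separate from the rows is what lets other wikis build tables, lists, and graphs from the same data without re-parsing free text.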

Structured Data on Commons

Structured data on Commons is multilingual information about a media file that can be understood by humans, with enough consistency that it can also be uniformly processed by machines. Files on Wikimedia Commons can be described with multilingual concepts from Wikidata, Wikimedia's knowledge base.
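
A simplified model can show why describing files with Wikidata concepts helps both humans and machines. The class below is a sketch only: it is not the actual Wikibase JSON for a file, and the field names captions and statements, the fallback rule, and the example file are assumptions made for illustration. P180 ("depicts") and Q146 (house cat) are real Wikidata identifiers.

```python
from dataclasses import dataclass, field

DEPICTS = "P180"  # Wikidata property "depicts"


@dataclass
class MediaInfo:
    """Simplified, hypothetical model of the structured data of one file."""
    captions: dict = field(default_factory=dict)    # language code -> caption
    statements: dict = field(default_factory=dict)  # property ID -> list of item IDs

    def caption(self, lang, fallback="en"):
        """Return the caption in lang, or in the fallback language if missing."""
        return self.captions.get(lang, self.captions.get(fallback))

    def depicted_items(self):
        """Return the Wikidata item IDs that the file is stated to depict."""
        return self.statements.get(DEPICTS, [])


example_file = MediaInfo(
    captions={"en": "A cat on a sofa", "de": "Eine Katze auf einem Sofa"},
    statements={DEPICTS: ["Q146"]},  # Q146 = house cat
)
```

Because the statement points at an item ID rather than at a word in one language, a search for files depicting Q146 works the same way regardless of the language of the caption.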

Multi-Content Revisions

  • MCR provides a way to store content in multiple slots on a page. The content may all be of the same kind (use the same content model), or be of different kinds. This can be thought of like attachments on an email.
  • MCR is designed to remove the need to embed structured data in wikitext.
  • The storage mechanism for MCR is complete and has been in production since 2019. The migration of the database schema on Wikimedia systems was completed in 2020, and support for the old schema was removed in the 1.35 release.
    The original vision for MCR included an easy way for extensions to define where the additional content would be shown on the page, and how it would be edited. As of 2020, this part of the vision has not been implemented since it was not needed for the initial use case (Structured Data on Commons). A generalized editing mechanism also seemed conceptually questionable, especially for content models that are not text based and require an interactive user interface for editing.
    • One example of structured data embedded in wikitext is the way TemplateData places metadata about template parameters on the template page using a special syntax. Instead, this information could be stored in a separate slot, in a machine-readable form such as JSON. This would enable the creation of a specialized API and a dedicated user interface for displaying and manipulating this information.
    • Another example is categories: wikitext uses a special syntax to place pages in categories. The complex nature of the wikitext syntax makes it hard to reliably extract or change these categories. If the community decides that this should change, MCR could be used to store categories apart from the wikitext (but still as part of the same page), as a data structure that can easily be manipulated. Provided the user interface allows it, changing the text and the categories can still be done in a single edit.
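
The slot idea can be sketched with a small data model. This is a conceptual illustration under stated assumptions, not MediaWiki's PHP API: the classes Slot and Revision, the helper template_params, and the slot role "templatedata" are hypothetical, while "main" is the role MCR uses for a page's primary content.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One piece of content in a revision, with its own content model."""
    content_model: str  # e.g. "wikitext" or "json"
    content: str


@dataclass
class Revision:
    """A revision holding several slots, like attachments on an email."""
    slots: dict = field(default_factory=dict)  # role name -> Slot

    def get(self, role):
        return self.slots[role]


# A template page whose parameter metadata lives in its own JSON slot
# (hypothetical role name) instead of being embedded in the wikitext.
rev = Revision(slots={
    "main": Slot("wikitext", "Hello, {{{name}}}!"),
    "templatedata": Slot("json", json.dumps({"params": {"name": {"type": "string"}}})),
})


def template_params(revision):
    """Read parameter names from the JSON slot; no wikitext parsing needed."""
    return sorted(json.loads(revision.get("templatedata").content)["params"])
```

Both slots belong to the same revision, which is why, as noted above, the text and the structured data could still be changed together in a single edit.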

Synchronization of wiki data between wikis

  • mw:Global templates: Templates and modules that can be shared between wikis.
    • mw:Multilingual Templates and Modules: This project makes it possible for modules and templates to be used on multiple wikis, without any modifications. All translations are stored in one place, accessible from everywhere.
      • The DiBabel tool allows users to copy templates and modules from the origin (mediawiki.org) to all other sites/languages listed in Wikidata for that page, automatically translating dependent template and module names.
  • Global Gadgets: The Global gadgets project aims to reduce the friction for using a popular gadget that's in use on another wiki.