Wikidata:Schemas

Wikidata Schemas




What is a schema?

A schema defines the structure of data stored in a database. Schemas are widely used across the web and apply to thousands of different kinds of data about the world. Wikidata schemas provide a standardised structure for data on a subject area. They allow you to define models for any type or grouping of items. For example, E10 is the schema for a human.




Wikidata Schemas

Using schemas on Wikidata helps to improve data quality and usability, grow the community and reduce conflicts, and make it easier for organisations to share data with, and reuse data from, Wikidata.

  • Data quality: Providing clear guidance on data structure increases data quality and data completeness by allowing people to find and use the most appropriate schema for a subject. It decreases common misunderstandings about the data and repeated mistakes in the structure. Schemas can also be used to automatically check the quality of Wikidata items.
  • Usability: A standardised structure makes data easier to find and use in queries. It increases trust in query results and makes queries simpler and more consistent. It also makes it easier to build tools that use Wikidata data, both for other Wikimedia projects and for third-party applications.
  • Community growth and health: Schemas give people new to Wikidata an easier way to learn and to produce higher-quality work, which helps the community grow. They also reduce arguments by recording community agreements on how items should be modelled for a subject.
  • Encourages sharing of data: A clear structure for data gives organisations more confidence that the data they share will remain intact and be built upon after a donation. It also makes it possible to write queries that check data quality is being maintained.

For a more technical description of Wikidata Schemas please see Wikidata:WikiProject Schemas.




The value of using Wikidata Schemas

Wikidata Schemas provide a way to check how the data on a particular type of item should be structured. For example, there is a schema for a person which defines the statements expected for an item about a person on Wikidata.

Wikidata Schemas can be based on other Wikidata Schemas. For example, a simple “author” Wikidata Schema can be created by referencing the existing “human” schema and adding extra statements like “occupation = author”. This process of referencing other schemas has some major benefits:

  1. Saves time when creating new schemas
  2. Avoids having to repeat community discussions about data structures that have already been “agreed”
  3. Reduces the chance that different groups of editors make different choices about structure for similar items, increasing data consistency
  4. Defines a hierarchy of schemas that can be used to help editors explore and discover the correct one to use. For example, a user might want to see all schemas that are based on the “human” schema to check if there is one available for a “chemist”, or further expand “chemist” to see if a “biochemist” is available.
  5. Can be used with automated tools for generating reports on a list of items. The report will inform you which statements do not conform to the schema you are checking against.
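The idea of basing one schema on another, and of exploring the resulting hierarchy, can be sketched in a few lines of Python. This is only an illustration: actual Wikidata Schemas are written in ShEx, and the `SCHEMAS` dictionary, its `based_on` and `statements` keys, and both helper functions are hypothetical names invented for this sketch. It also assumes that a schema simply adds its own statements to those of the schema it references.

```python
# Hypothetical, simplified stand-in for schemas: each entry names the schema
# it is based on (or None) and the extra statements it expects.
SCHEMAS = {
    "human": {"based_on": None, "statements": [("instance of", "human")]},
    "author": {"based_on": "human", "statements": [("occupation", "author")]},
    "chemist": {"based_on": "human", "statements": [("occupation", "chemist")]},
    "biochemist": {"based_on": "chemist", "statements": [("occupation", "biochemist")]},
}


def expected_statements(name, schemas=SCHEMAS):
    """Collect the statements a schema expects, inherited ones first."""
    schema = schemas[name]
    parent = schema["based_on"]
    inherited = expected_statements(parent, schemas) if parent else []
    return inherited + schema["statements"]


def schemas_based_on(name, schemas=SCHEMAS):
    """List the schemas that directly reference the given schema."""
    return sorted(n for n, s in schemas.items() if s["based_on"] == name)


print(expected_statements("author"))
print(schemas_based_on("human"))    # is there a schema for a chemist?
print(schemas_based_on("chemist"))  # expand further: is there a biochemist?
```

Because “author” only lists what it adds, any later change agreed for “human” would be picked up automatically, which is the time-saving and consistency benefit described in the list above.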

The schemas are linked to the Wikidata item for that class by a statement (e.g. human (Q5) → Wikidata Schema → E10). Note: this is not yet available. The property proposal is on hold, waiting for a Phabricator ticket to be resolved.




Use a Wikidata Schema

Find a Wikidata Schema

There are several ways to find Wikidata Schemas:

  1. (Coming soon) Look on the subject item: use the search box to find the Wikidata item for the subject you want to work on, e.g. if you want to create an item for a playwright, look at playwright (Q214917). On this item you will find a property called Wikidata Schema; click on its value to see the schema for the item. If you have questions about the schema or how to follow it, please ask on Wikidata:Project chat.
  2. Wikidata:List of schemas has a full list of schemas to search and explore.
  3. Wikidata:WikiProjects is a list of WikiProjects, which often maintain a list of the schemas they use.

If you’re unable to find a schema you can:

  1. Search for a more generic term, e.g. ‘writers’ instead of ‘playwrights’.
  2. Ask on Wikidata:Project chat for help.
  3. Consider proposing a schema.

Adding information to items

  1. Use the schema to understand the kinds of statements that can be made about the subject and the best way to model the information.
  2. Add as many statements (with references) as you can that appear in the schema to the items you want to create or improve.
  3. When you can, add information to other related items to improve them, e.g. when adding information about plays to an item about a playwright (Q214917), go to the items for those plays and add the author.

Write a query

Wikidata Schemas provide the structure needed to find a type of item using a Wikidata query. Anyone who is familiar with the SPARQL query language should be able to read the schema conditions easily, as the two syntaxes have many similarities.

A query based on a schema is not guaranteed to find every relevant item on Wikidata, but it will give you a list of results that follow the “recommended” structure. If you need to find as much data as possible, you may need to expand your query to include other ways the data has been modelled.
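As a rough illustration of turning a schema's expected statements into a query, the sketch below builds a SPARQL string in Python. The `build_query` helper and the `PLAYWRIGHT` list are hypothetical and only cover simple "property = value" statements; the `wd:` and `wdt:` prefixes are the ones predefined by the Wikidata Query Service. The identifiers used are P31 (instance of), Q5 (human), P106 (occupation) and Q214917 (playwright).

```python
def build_query(required, limit=10):
    """Build a SPARQL query selecting items that have every required statement."""
    patterns = "\n".join(
        f"  ?item wdt:{prop} wd:{value} ." for prop, value in required
    )
    return f"SELECT ?item WHERE {{\n{patterns}\n}}\nLIMIT {limit}"


# Hypothetical reading of a "playwright" schema as (property, value) pairs:
# instance of (P31) = human (Q5), occupation (P106) = playwright (Q214917).
PLAYWRIGHT = [("P31", "Q5"), ("P106", "Q214917")]

print(build_query(PLAYWRIGHT))
```

A query built this way only returns items that already follow the recommended structure. Finding items modelled in other ways would need additional or looser patterns.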

Check data consistency

One of the most common uses for Wikidata Schemas is to check items for data quality, e.g. by generating reports showing which items need fixing, or statistics indicating how complete a list of items is. There are many more uses for Wikidata Schemas, and they will become significantly more powerful as more tools are developed to make use of them.
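A minimal sketch of such a report, written in Python, might look like the following. It is not how any existing Wikidata tool works: the function names are hypothetical, the two example items are made up, and a schema is reduced to a list of required (property, value) pairs, which is far simpler than what a real schema can express.

```python
def missing_statements(statements, required):
    """Return the required (property, value) pairs an item does not have."""
    return [pair for pair in required if pair not in statements]


def conformance_report(items, required):
    """Map each non-conforming item to what it lacks, plus the conforming share."""
    problems = {}
    for label, statements in items.items():
        missing = missing_statements(statements, required)
        if missing:
            problems[label] = missing
    conforming = len(items) - len(problems)
    share = conforming / len(items) if items else 0.0
    return problems, share


# Required statements, assumed for illustration: instance of (P31) = human (Q5)
# and occupation (P106) = playwright (Q214917).
REQUIRED = [("P31", "Q5"), ("P106", "Q214917")]

# Made-up example items, keyed by a placeholder label rather than a real item ID.
ITEMS = {
    "example item A": {("P31", "Q5"), ("P106", "Q214917")},
    "example item B": {("P31", "Q5")},
}

problems, share = conformance_report(ITEMS, REQUIRED)
print(problems)  # items that need fixing, with their missing statements
print(share)     # fraction of items that already conform
```

The first return value corresponds to a list of items that need fixing, and the second to a simple completeness statistic.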




Create a new Wikidata Schema

Wikidata:Schema proposals: If you’re unable to find a schema for the subject, you can propose creating one. Schemas are created through discussion with the community, which helps to take a wide range of use cases and contexts into account. Wikidata:Schema proposals is the central discussion area (similar to Wikidata:Property proposal) for proposing and collaborating on new schemas.