User:Lectrician1/A better Wikidata

This is a proposal for a better ontological and database structure for Wikidata.

This proposal was triggered by my difficulties with the property instance of (P31), works/expressions/manifestations/items, and entities changing over time.

Particularly, these types of questions:

Ontology

Main ideas

  • Humans interpret the universe through their exposure to how things change over time.
  • Things are often
  • Our universe is made of entities. For example, "the object is red". "object" and "red" are both entities.
  • Humans often abstract entities
  • When there are unique patterns of properties among other properties, humans group and abstract them as their own properties.
  • When properties change at a point in time because of other properties, humans recognize this as an action.

Elements

So from these main ideas we can create elements that will make up our ontology and database.

  • Properties
  • Ideas
  • Actions

Properties

Interpreting these elements

I see properties as the collection of all the properties we distinguish over time on a timeline. All statements have a start (and end) date and can have varying "depth".

For example:

  • JYP, is, company
    • cause: founding of JYP
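
A minimal sketch of how such a statement might be stored, assuming hypothetical field names (`subject`, `predicate`, `value`, `start`, `end`, `cause`) that are not part of Wikidata's current data model. The start date below is a placeholder for illustration, not a sourced founding date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One subject-predicate-value triple with a validity interval (hypothetical model)."""
    subject: str
    predicate: str
    value: str
    start: date                  # when the statement became true
    end: Optional[date] = None   # None means the statement is still true
    cause: Optional[str] = None  # the occurrence that made the statement true

    def holds_on(self, day: date) -> bool:
        """Return True if the statement is valid on the given day."""
        if day < self.start:
            return False
        return self.end is None or day <= self.end


# Placeholder date for illustration only, not a sourced founding date.
jyp_is_company = Statement(
    subject="JYP",
    predicate="is",
    value="company",
    start=date(2000, 1, 1),
    cause="founding of JYP",
)
```

Keeping the cause on the statement itself lets an editor record only a date at first and link a fuller description of the occurrence later.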

Database

Abstract

  • Minimize quantity of editing
  • Maximize speed of editing
  • Maximize understanding of entities for speakers of all languages
  • All data should be structured.
  • All data should be referenced.

Properties

  • Property flexibility. Properties should describe relationships appropriate for the type of item they are used on. For example, in English, we say Twice (Q20645861) "has members", not has part(s) (P527).
  • No more qualifiers. Qualifiers are extremely inconsistent in usage and intelligibility, and are therefore very hard to build consistent structured data with.
    • Text-based values in Wikidata right now should be replaced by Lexemes where possible.
    • No manually-created labels or descriptions. Data should describe itself and not require any human-centered content.
    • Labels and aliases can be seen as replaced by the various name (P2561) and identifier properties. This should help clean up the recurrent issue where the name of something is documented in a label or alias but not a
    • This would require a more powerful entity search tool, but the tradeoff is worth it.
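
One possible reading of the property flexibility point above, as a minimal sketch: the same underlying property is shown with wording that depends on the type of item it is used on. The lookup tables, the `property_label` function, and the "musical group" type name are all assumptions made for illustration; the proposal could equally be implemented with separate properties.

```python
# Hypothetical mapping: (property id, item type) -> wording shown to editors.
TYPE_SPECIFIC_LABELS = {
    ("P527", "musical group"): "has members",
}

# Fallback wording used when no type-specific wording is defined.
DEFAULT_LABELS = {
    "P527": "has part(s)",
}


def property_label(property_id, item_type):
    """Return the wording for a property as it applies to a given item type."""
    return TYPE_SPECIFIC_LABELS.get(
        (property_id, item_type), DEFAULT_LABELS[property_id]
    )


label_for_group = property_label("P527", "musical group")
label_for_other = property_label("P527", "building")
```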

Structure

What is a human? A human is an evolving organism. It is very hard to "group" the stages of a human under a single concept, because a human doesn't really "start"; "human" is just the general idea of the organism. So how do we relate its stages back to the original idea of a human? Well, let's first make a timeline to show the stages of human development.

  1. ovaries, produce, egg
  2. egg, grows into, zygote
    1. egg, fertilized, sperm
  3. zygote, grows into, embryo
  4. embryo, grows into, fetus
  5. fetus, grows into, infant
  6. infant, grows into, child
  7. child, grows into, adult
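
The timeline above can be written down as plain subject, predicate, object triples. The sketch below is only an illustration, and the `stages_from` helper is a hypothetical name; it follows the "grows into" links to recover the ordered stages from any starting point.

```python
# Each entry is a (subject, predicate, object) triple taken from the list above.
TRIPLES = [
    ("ovaries", "produce", "egg"),
    ("egg", "grows into", "zygote"),
    ("egg", "fertilized", "sperm"),
    ("zygote", "grows into", "embryo"),
    ("embryo", "grows into", "fetus"),
    ("fetus", "grows into", "infant"),
    ("infant", "grows into", "child"),
    ("child", "grows into", "adult"),
]


def stages_from(start, triples, predicate="grows into"):
    """Follow `predicate` links from `start` and return the stages in order."""
    successor = {s: o for s, p, o in triples if p == predicate}
    stages = [start]
    while stages[-1] in successor:
        nxt = successor[stages[-1]]
        if nxt in stages:  # guard against cycles in malformed data
            break
        stages.append(nxt)
    return stages


human_stages = stages_from("egg", TRIPLES)
```

With the stages stored as a chain like this, "human" does not need its own start point: it can be related to whichever stages of the chain a community agrees to group under that idea.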


Relating back to the Main ideas, every property should have an occurrence that causes it to occur. We shouldn't expect editors to document exactly how the occurrence happened, so we need to build a model that offers flexibility in data quality, yet consistency in structure.

Let's go through some basic examples to get an idea of how we should do this.

"X human was born on January 1st 2020"

We need to define what each part of this statement is.

What is X? X is the human composition of matter at the point in time in which they were considered born. Now X is a valid

Time and events

Time and events are really weird in Wikidata.

Some thoughts:

  • Every statement should have some data that describes how the statement became true and, if and when it did, how it became false.
  • Everything evolves over time. Entities seem to be treated as concrete and unchanging, which is not good. For example, a human develops from an embryo, to a fetus, to a human.
  • Part of my plan for a revised ontology is getting rid of qualifiers. I believe this is possible through the creation of items for most qualifiers.
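
A rough sketch of what replacing qualifiers with items could look like, under stated assumptions: the `Item` and `Statement` classes, the `became_true_by` and `became_false_by` fields, and the `reify_qualifiers` function are all hypothetical, and the example values are placeholders. The sketch moves every qualifier onto a single new event item, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """An entity described only by its own statements (hypothetical model)."""
    item_id: str
    statements: list = field(default_factory=list)  # (predicate, value) pairs


@dataclass
class Statement:
    """A plain triple that points at the events which started and ended it."""
    subject: str
    predicate: str
    value: str
    became_true_by: Optional[str] = None   # id of the event item that made it true
    became_false_by: Optional[str] = None  # id of the event item that ended it, if any


def reify_qualifiers(subject, predicate, value, qualifiers, new_id):
    """Turn a qualified statement into a plain statement plus an event item.

    `qualifiers` maps qualifier name -> value; all of them move onto the
    new event item as ordinary statements.
    """
    event = Item(item_id=new_id, statements=sorted(qualifiers.items()))
    statement = Statement(subject, predicate, value, became_true_by=event.item_id)
    return statement, event


# Placeholder values for illustration only.
statement, event = reify_qualifiers(
    "X", "member of", "Y", {"start time": "2020-01-01"}, "E1"
)
```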

Problems to address

  • How do we relate things from ontologies that do not correspond to our ontology? For example, MusicBrainz conflates many elements of a music release into a single entity, whereas we might have many.