Welcome to the WikiProject Personal Data


This project aims to reconstruct how personal data flows across diverse data controllers (GDPR), using WikiData as the underlying infrastructure. This methodology is flexible enough to be applied to different areas of concern (see list of subprojects).

To do this, the project has some specific goals:

  • Determine the data structure needed for collaborators to contribute easily
  • Gather interested actors and communities that help populate the database
  • Build tools that can use the data and produce useful information

WikiProject PersonalData (Q60228333) was started by Pdehaye as part of PersonalData.IO (Q59695802), a nonprofit organization based in Geneva, Switzerland, that promotes digital rights and trust in the digital world.

Many initiatives across different sectors study how corporations use personal data and build tools around it, but we believe a vital disconnect remains between them: the separate outputs of those tools are not aggregated. The consequence is a lack of information on how personal data flows through the system, which actors are involved (platforms, ad tech companies, app trackers) and how they are interlinked across devices, techniques, jurisdictions and platforms. This also slows down the development of the new tools people need to exercise their rights in a hassle-free manner.

This project aims to tackle the problem by proposing an open, collaborative methodology that provides information to institutions, users, policy makers and specific communities.

See [1] for a high-level description.

We will map nodes (data controllers), the types of data exchanged, the flows of data between them, the presence or absence of privacy policies, and more.
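As a rough illustration of the elements listed above, the map can be thought of as a small graph of controllers and flows. The sketch below is only an assumption made for illustration: the class names, field names and example controllers are hypothetical, and the project's actual structure is the one described on the DataModel page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Controller:
    """A node in the map: one data controller (hypothetical structure)."""
    name: str
    # None records that no privacy policy has been found for this controller.
    privacy_policy_url: Optional[str] = None


@dataclass(frozen=True)
class DataFlow:
    """A directed edge: one type of data passed from one controller to another."""
    source: Controller
    recipient: Controller
    data_type: str


def recipients_of(flows, controller):
    """Return the sorted names of controllers that receive data from `controller`."""
    return sorted({f.recipient.name for f in flows if f.source == controller})


def missing_policy(controllers):
    """Return the names of controllers with no recorded privacy policy."""
    return [c.name for c in controllers if c.privacy_policy_url is None]


# Made-up example data, not taken from Wikidata.
app = Controller("ExampleApp", "https://example.org/privacy")
tracker = Controller("ExampleTracker")
flows = [DataFlow(app, tracker, "device identifier")]

print(recipients_of(flows, app))
print(missing_policy([app, tracker]))
```

Recording a missing policy explicitly (rather than leaving the field out) keeps the "presence or absence of privacy policies" question answerable from the data itself.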

The mapping process can be divided into two connected stages: writing (entering data) and reading (retrieving data).

  • ‘Write’ sources: 'watchdog' projects (Ghostery, Lightbeam, etc.), DPA registers, community mapping efforts, corporate actors mapping their own ecosystems (LUMAscape), ads.txt, automated scripts
  • ‘Read’ tools: data rights tools aimed at superusers (journalists, activists, academics); ecosystem visualizations.
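A 'read' tool could start from a SPARQL query against the public Wikidata Query Service. The following is a minimal sketch under stated assumptions, not the project's own tooling: the function names and the User-Agent string are hypothetical, and the only property used is P856 (official website). The properties the project actually relies on are defined in its data model.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(item_ids):
    """Build a SPARQL query listing the official website (P856) of each item."""
    values = " ".join(f"wd:{qid}" for qid in item_ids)
    return (
        "SELECT ?controller ?controllerLabel ?website WHERE {\n"
        f"  VALUES ?controller {{ {values} }}\n"
        "  OPTIONAL { ?controller wdt:P856 ?website. }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def run_query(query):
    """Send the query to the endpoint and return the result bindings (needs network)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace it with one identifying your own tool.
        headers={"User-Agent": "PersonalDataSketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Q59695802 is the PersonalData.IO item mentioned on this page.
query = build_query(["Q59695802"])
print(query)
```

Calling `run_query(query)` in a Jupyter notebook would return one binding per matching item, which a visualization or data rights tool could then consume.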

We will use open source tools such as Jupyter notebooks and Wikibase in order to build on those efforts, reach out to their existing communities and support collaboration, transparency and freedom of use.

Because the methodology is flexible, each subproject and its community will be the driving force behind its growth. That is why community building is a very important component of this project.

See Wikidata:WikiProject_PersonalData/DataModel


See Wikidata:WikiProject PersonalData/Queries

A group of people is interested in learning how transportation network apps handle personal data, but most information is missing. They join the project and start mapping in WikiData the data controllers they're interested in, such as Uber. They record the URL of Uber's privacy policy and the e-mail address of its data protection officer. With the policy information, they can also map which kinds of data the controller uses and with whom it shares them.

The reading tools make it easier to generate access requests, so with this basic data the group can generate multiple access requests to the data controller. Using other reading tools, they can extract more information from the data they receive, which can be fed back into the map to answer new questions.
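To make the access request step concrete, here is a minimal sketch of how a reading tool might draft one from the mapped data. It is an illustration only: the template wording, the function name and the example values are hypothetical, and the only outside fact assumed is that the right of access is set out in Article 15 GDPR.

```python
from string import Template

# Hypothetical letter template; a real tool would use reviewed wording.
REQUEST_TEMPLATE = Template(
    "To: $dpo_email\n"
    "Subject: Data subject access request (Article 15 GDPR)\n"
    "\n"
    "Dear Data Protection Officer of $controller,\n"
    "\n"
    "I request a copy of the personal data you hold about me, together with\n"
    "information on the purposes of processing and the recipients of the data.\n"
    "\n"
    "Regards,\n"
    "$requester\n"
)


def draft_access_request(controller, dpo_email, requester):
    """Fill the template with values taken from the mapped controller data."""
    return REQUEST_TEMPLATE.substitute(
        controller=controller, dpo_email=dpo_email, requester=requester
    )


# Made-up values; in practice they would come from the controller's item.
letter = draft_access_request("ExampleRides", "dpo@example.org", "A. Member")
print(letter)
```

Because the controller name and contact address come from the shared map, the same function can be run for every member of the group or for every controller in a subproject.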

Additionally, on top of the community mapping, other sources of information complete the map (tools that already exist, like Ghostery or Lightbeam). By sharing this information in a common structure, the big picture of the personal data ecosystem starts to reconstruct itself, answering existing questions and creating new ones.

  • "EasyQuery", a gadget that lets you visualize relationships between items on the same page. You can also search for similar instances, based on simple selection criteria. You can activate it in your User preferences page. Look for the 3 dots once installed!
  • "QuickStatements", a tool that allows you to edit a batch of Wikidata items at once, based on a simple set of text commands or even CSV files
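For the batch editing tool, the text commands can be produced from a spreadsheet export. The sketch below assumes the tab-separated command format (item, property, value, with string values in double quotes); the function name and the CSV column names are hypothetical, and Q4115189 is used as a placeholder item, believed to be the Wikidata Sandbox. Review any generated batch before submitting it.

```python
import csv
import io


def to_quickstatements(csv_text):
    """Turn CSV rows (item, property, value) into tab-separated commands."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # String and URL values are wrapped in double quotes.
        value = '"' + row["value"] + '"'
        commands.append("\t".join([row["item"], row["property"], value]))
    return "\n".join(commands)


# Hypothetical input: one official website (P856) statement on a placeholder item.
example_csv = "item,property,value\nQ4115189,P856,https://example.org/\n"
batch = to_quickstatements(example_csv)
print(batch)
```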

Take a look at our subprojects and start completing the missing items and statements.

If you're interested in one area in particular, you can propose your own subproject and start mapping the data controllers you're interested in.

Feel free to start topics or ask questions on the Discussion page.

