Wikidata talk:WikiProject PersonalData
What could be mapped?
- data protection authorities, contact details, languages
- legal basis for processing personal data
- some information about articles
- user rights ("Art 20 GDPR is portability, which is also in Philippine data protection law")
- names of national laws relating to data protection
- by sector/country, main brands
- by sector/country, main backend players
- data transfer instruments
- privacy policy URLs
- third parties (for instance, who processes PayPal's data)
- URLs for Data Transfer Agreements
- API URLs
- products
- regulatory texts (Terms of Service/Privacy Policies)
- typology of data collected
- roles
- ads.txt
Items
So we're interested in mapping data controllers (corporate actors but also others, such as political parties), their privacy policies, and the types of data they collect.
What structure do we need?
1. 'Data controller' item
2. Items for each data controller
3. Items for each of their privacy policies
4. Items for types of data collected by controllers
   - Should link types of data collected as instances of personal data
- For 2: a data controller is identified as such by a statement saying "is instance of: data controller".
- For 4: this seems to be the most difficult, but we can start with the non-exhaustive list of identifiers specified in the GDPR text. When there's a link, it means there's an item in Wikidata for it.
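To make the proposed structure concrete, here is a rough sketch in Python. It is only an illustration: the `Item` class is a made-up stand-in for a Wikidata item, "Example Corp" is a fictional controller, and property names are plain labels rather than real property IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Stand-in for a Wikidata item: a label plus (property, value) statements."""
    label: str
    statements: list = field(default_factory=list)

    def add(self, prop: str, value: "Item") -> None:
        """Attach a statement such as ("instance of", <Item>)."""
        self.statements.append((prop, value))

    def values(self, prop: str) -> list:
        """Return the labels of all values stated for the given property."""
        return [v.label for p, v in self.statements if p == prop]


# 1. The 'data controller' class item, plus the 'personal data' class.
data_controller = Item("data controller")
personal_data = Item("personal data")

# 2. One item per controller ("Example Corp" is fictional).
example_corp = Item("Example Corp")
example_corp.add("instance of", data_controller)

# 3. One item per privacy policy.
policy = Item("Example Corp privacy policy")
policy.add("instance of", Item("privacy policy"))

# 4. Types of data collected, each an instance of personal data.
ip_address = Item("IP address")
ip_address.add("instance of", personal_data)
```

With this layout, asking which items are data controllers amounts to checking `values("instance of")` on each item.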
Types of data collected
From GDPR:
Personal data means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person
Examples of identifiers from GDPR
- name
- identification number
- location data
- an online identifier
Examples of online identifiers from GDPR & ICO
- internet protocol (IP) addresses
- cookie identifiers
- other identifiers such as radio frequency identification (RFID) tags
- MAC addresses
- advertising IDs
- pixel tags
- account handles
- device fingerprints
Special categories of personal data in GDPR
- race
- ethnic origin
- political opinions
- religious or philosophical beliefs
- trade union membership
- genetic data
- biometric data (where this is used for identification purposes)
- health data
- sex life
- sexual orientation
Statements
Of a controller
- is instance of: data controller
qualifiers of the statement:
- main regulatory text: controller's privacy policy
- uses: [type of data: personal name, IP address, etc]
Problem with property "uses": it is not obvious that it refers to using personal data from users/customers. Should we propose that Wikidata add a property "collects personal data"?
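A simplified picture of the statement described above, as a Python dictionary. This is not the actual Wikidata JSON format, just a sketch; the controller's policy name is fictional and properties are written as labels.

```python
# Hypothetical, simplified representation of one statement on a controller item.
controller_statement = {
    "property": "instance of",
    "value": "data controller",
    "qualifiers": [
        ("main regulatory text", "Example Corp privacy policy"),
        ("uses", "personal name"),
        ("uses", "IP address"),
    ],
}


def collected_data_types(statement: dict) -> list:
    """Return the values of every 'uses' qualifier on the statement."""
    return [value for prop, value in statement["qualifiers"] if prop == "uses"]
```

Because "uses" sits on the "data controller" statement here, its values can be read as data types without confusing them with other things the organization uses. A dedicated "collects personal data" property would make that reading explicit instead of implied by position.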
Of a controller's privacy policy
- is instance of: privacy policy
- official website: URL
- e-mail: xx@xx.com
- applies to part: organization where it applies
Problem with property "e-mail": sometimes the contact is a form URL rather than an email address, which violates the property's constraints.
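One way to handle the email/form problem before adding a statement is to classify the contact value first and only use "e-mail" when it really is an address. A minimal sketch, assuming a deliberately loose email pattern (not a full validator) and a hypothetical helper name `classify_contact`:

```python
import re
from urllib.parse import urlparse

# Loose pattern: something@something.tld, no whitespace. Not a full RFC check.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_contact(value: str) -> str:
    """Classify a privacy contact as 'email', 'form URL', or 'unknown'."""
    value = value.strip()
    # Accept the mailto: form as well as a bare address.
    if value.lower().startswith("mailto:"):
        value = value[len("mailto:"):]
    if EMAIL_RE.match(value):
        return "email"
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "form URL"
    return "unknown"
```

Values classified as "form URL" would then need a different, URL-typed property; which one is still open.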
Data we have by now
Data controller item was created.
Up to now we have the following controllers 'tagged':
Privacy policies:
Issues to solve
1. Inverse property of "complies with". Discussion ongoing here
organization --main regulatory text--> policy
policy --applies to part--> organization
2. Property for connecting organization with type of data collected
I tried: organization --uses--> personal data
I then added qualifiers to the personal data statement: name, address, phone, etc. There are two problems: i) not all items can be used (e.g. I couldn't assign IP address as a qualifier, even though the item exists in Wikidata), and ii) it doesn't seem correct in terms of data structure.
After that I tried qualifying the statement organization --is instance of--> data controller with uses: type of data collected (see the Uber example).
3. "Uses" is the nearest available property for connecting organization ----> personal data
A more accurate property would be collects/gathers --> name, address, IP address, etc.
"Uses" serves many purposes besides data collection, which could cause problems later if we want to run data analysis. Should we propose a "collects" property?
Cassandreces (talk) 17:35, 27 December 2018 (UTC)
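On issue 1: until an inverse property is settled, the two directions (organization --main regulatory text--> policy and policy --applies to part--> organization) have to be kept in sync by hand. A small sketch of a consistency check, assuming the statements have already been exported as (subject, property, value) triples with label strings; the function name and the example organizations are made up.

```python
def missing_inverses(triples, forward="main regulatory text", backward="applies to part"):
    """List the backward statements that are missing for each forward statement."""
    existing = set(triples)
    missing = []
    for subject, prop, obj in triples:
        # Every org --forward--> policy should be mirrored by policy --backward--> org.
        if prop == forward and (obj, backward, subject) not in existing:
            missing.append((obj, backward, subject))
    return missing


example_triples = [
    ("Example Corp", "main regulatory text", "Example Corp privacy policy"),
    ("Example Corp privacy policy", "applies to part", "Example Corp"),
    ("Other Org", "main regulatory text", "Other Org privacy policy"),
]
```

Running `missing_inverses(example_triples)` flags only the policy of "Other Org", since its "applies to part" statement has not been added.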
For some of those we need to explore the properties that we are considering. For instance, item operated (P121) is a subproperty of uses (P2283), which might be appropriate in some circumstances. Pdehaye (talk) 16:40, 2 January 2019 (UTC)
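For the data analysis concern, the qualifier-based modelling from issue 2 could be queried roughly as below. This is a sketch only: it builds a SPARQL string for the Wikidata Query Service, uses P31 for "instance of" and P2283 for "uses" as a qualifier, and takes the QID of the data controller item as a parameter because no QID is given on this page. The function name is hypothetical.

```python
def build_collected_data_query(controller_class_qid: str, qualifier_pid: str = "P2283") -> str:
    """Build a SPARQL query listing organizations and the data types
    attached as qualifiers to their 'instance of: data controller' statement."""
    return f"""
SELECT ?org ?orgLabel ?dataType ?dataTypeLabel WHERE {{
  ?org p:P31 ?statement .
  ?statement ps:P31 wd:{controller_class_qid} ;
             pq:{qualifier_pid} ?dataType .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()
```

Passing `qualifier_pid="P121"` would run the same query for item operated instead of uses. Since the query only looks at qualifiers on the data controller statement, unrelated "uses" statements elsewhere on the item do not show up in the results.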