Property talk:P227
Documentation
identifier from an international authority file of names, subjects, and organizations (please don't use type n = name, disambiguation) - Deutsche Nationalbibliothek
The Integrated Authority File (GND) is managed by the German National Library in cooperation with various library networks in German-speaking Europe and other partners. Please look up GND at Online-GND or DNB-Portal. (VIAF is helpful but also often incorrect or outdated, and it mixes two identifier systems, which in some cases produces dead links.)
GND ID (P227) (Template:Entities)
- GND 1019646128: Stan Lauryssens (b. 1946), type Tp (person) = Yes
- GND 122968751: Stan Lauryssens (no info), type Tn (name, a placeholder) = No
- Help page in German: de:Hilfe:GND#Wichtige Unterschiede bei GND-Datensätzen
- VIAF 120062731 Stan Lauryssens = Yes
- VIAF 293348885 Stan Lauryssens (undifferentiated) = No
- Update: now redirects to VIAF:120062731
Known VIAF problems
- Johannes Fabian (Q15641418): in January 2014, VIAF 91414487 merged three GNDs, only one of which was correct.
- VIAF changes numbers with dashes: Instituto Brasileiro de Geografia e Estatística (Q268072).
- GND 1026669-0 (correct)
- VIAF-GND 004164695 ("404 Not Found")
- The strange thing is that when you go to http://d-nb.info/gnd/1026669-0, GND says "idn=004164695".
- Same name + same year of birth = different person
- In some cases VIAF merges two persons because only one of them has a GND
- Please use: GND = "no value" (for the person without a GND) [1]
- For items often confused use: different from (P1889)
- For unknown reasons VIAF is not importing all GND ids
- Samuel Ramos (Q7412445) = GND 1022446479 (created 16-05-12)
- June 2015: 3 years later, the GND ID had still not been harvested by VIAF
- DNB and VIAF have been made aware of the problem. There might have been a harvesting glitch during the first weeks of the GND going live in April 2012: GND records which have never been touched since then are still unknown to VIAF (as an estimate, about 15,000 GND records for persons created in early summer 2012 are not represented in VIAF). [7. Jun. 2015 Gymel]
- Update: GND 1022446479 added to VIAF:59099151 on 2015-07-12.
- In some cases VIAF clusters get deleted instead of merged
- Åke Blomström (Q270863): VIAF 228866914; taken care of by KrBot
- In rare cases VIAF clusters are reused for a different item
- William of Ockham (Q43936): VIAF:41835567 in 2015
- Lorenzo Traversagni (Q18674108): VIAF:41835567 in 2016
- William of Ockham (Q43936): VIAF:262145669298005170004 (created 2016-02-28)
- Drew Fudenberg (Q1258707): till 2016-09-25 VIAF:59183378 (now: "First National Bank of Lynn")
Allowed qualifiers
See: property constraint (P2302)
- Person
- named as (P1810)
- birth name (P1477): please use → named as (P1810)
- pseudonym (P742)
- occupation (P106)
- sex or gender (P21) → example: geographer (Q901402) (male / female)
- Organization or geographical object
- stated as (P1932) → example: universe (Q1) ("Weltall" / "Kosmos <Begriff>")
- start time (P580) → example: Jerusalem (Q1218) ("Jerusalem" / from 1948: "al-Quds", Jordanien)
- end time (P582) → example: Jerusalem (Q1218) ("Jerusalem" / till 1967: "al-Quds", Jordanien)
- point in time (P585)
- located in the administrative territorial entity (P131)
- applies to part (P518) → example: Peace Palace (Q834448) (architectural structure / organization)
- Work
- author (P50) → example: Xenien (Q523255) (Goethe / Schiller)
- title (P1476)
- inception (P571): please use → point in time (P585)
- Duplicate
- Please mark preferred GND with "preferred" rank (see Help:Ranking)
Subject headings
For subject headings the GND uses quasi-synonym (Q2122467), which is useful for libraries but does not fit Wikidata.
Example:
- publisher (Q2085381), GND 4063004-3
- book publisher (Q1320047), GND ID (P227): 4063004-3
- add qualifier stated as (P1932): Verlag
- add qualifier criterion used (P1013): quasi-synonym (Q2122467)
See also
- Error list: Property talk:P227/Duplicates
- Error list: Wikidata:WikiProject Authority control/Tn
Format “(1[012]?\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])”: value must be formatted using this pattern (PCRE syntax). (Help) List of this constraint violations: Database reports/Constraint violations/P227#Format, hourly updated report, SPARQL, SPARQL (new)
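As a rough illustration of the format constraint, the pattern can be applied with Python's `re` module. This is only a sketch: the function name `is_valid_gnd_id` is made up for this example, and passing the check says nothing about whether the record exists or whether it is a Tp or a Tn.

```python
import re

# Pattern copied from the format constraint on this page.
# re.ASCII restricts \d to the digits 0-9.
GND_ID_PATTERN = re.compile(
    r"(1[012]?\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])",
    re.ASCII,
)


def is_valid_gnd_id(value: str) -> bool:
    """Return True if the whole string matches the P227 format pattern.

    Hypothetical helper for illustration; it checks the format only.
    """
    return GND_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # IDs taken from the examples on this page.
    for candidate in ["1019646128", "4063004-3", "1026669-0", "004164695"]:
        print(candidate, is_valid_gnd_id(candidate))
```

The last value is the internal number quoted in the VIAF problems list above; it starts with a zero, so it does not fit any branch of the pattern.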
Conflicts with “instance of (P31): Wikimedia disambiguation page (Q4167410), Wikimedia list article (Q13406463)”: this property must not be used with the listed properties and values. (Help) Exceptions are possible as rare values may exist. Known exceptions: Kingdom of Granada (Q1495642). List of this constraint violations: Database reports/Constraint violations/P227#Conflicts with P31, SPARQL, SPARQL (new)
Single value: this property generally contains a single value. (Help) Exceptions are possible as rare values may exist. Known exceptions: American Association for Crystal Growth (Q104668029). List of this constraint violations: Database reports/Constraint violations/P227#Single value, SPARQL, SPARQL (new)
Distinct values: this property likely contains a value that is different from all other items. (Help) Exceptions are possible as rare values may exist. Known exceptions: Q196538, Q912313, Q19786, Q83367, Q302, Q51666, Q695368, Q206587, Q1301203, Q8222382, Q35672, Q1779748, Q15407350, Q15407351, Q20852, Q2293670, Q84, Q23306, Q520867, Q1110145, Q1787342, Q1803227, Q695, Q1588974, Q164256, Q1111292, Q6257, Q1662807, Q1502013, Q2515177, Q414110, Q514802, Q1780615, Q1780476, Q630163, Q1454729, Q152002, Q955464, Q439072, Q841090, Q1360467, Q695316, Q1454727, Q683834, Q864640, Q772835, Q853085, Q116, Q1097498, Q58968, Q381142, Q6256, Q7275, Q179043, Q214518, Q1232145, Q1803430, Q315027, Q1571264, Q329676, Q7380391, Q1803272, Q1787368, Q1305171, Q5261695, Q14756366, Q34497, Q192184, Q151843, Q301751, Q675085, Q1454726, Q17955, Q3059502, Q44352, Q155570, Q1460, Q4951156, Q15608499, Q2456068, Q40, Q533534, Q21010, Q15058181, Q1148511, Q170213, Q7473516, Q1490, Q67996, Q3905364, Q865, Q22502, Q707297, Q17353989, Q637238, Q690821, Q13362, Q693570, Q19831595, Q1206262, Q170390, Q697084, Q5923, Q1803420, Q2839, Q16332967, Q20883, Q20892, Q1396026, Q1787565, Q20791505, Q13344, Q157575, Q20808141, Q429850, Q19311569, Q277759, Q1700470, Q742421, Q2088357, Q41662, Q8867089, Q11634, Q350268, Q321087, Q13165156, Q6877935, Q104602244, Q163446, Q5309708, Q105375866. List of this constraint violations: Database reports/Constraint violations/P227#Unique value, SPARQL (every item), SPARQL (by value), SPARQL (new)
Qualifiers “pseudonym (P742), named as (P1810), sex or gender (P21), stated as (P1932), point in time (P585), start time (P580), end time (P582), located in the administrative territorial entity (P131), title (P1476), applies to part (P518), author (P50), criterion used (P1013), occupation (P106), reason for deprecation (P2241), reason for preferred rank (P7452), alternate names (P4970), mapping relation type (P4390), latest start date (P8555), earliest date (P1319), earliest end date (P8554), latest date (P1326), applies to name of item (P5168)”: this property should be used only with the listed qualifiers. (Help) Exceptions are possible as rare values may exist. List of this constraint violations: Database reports/Constraint violations/P227#Allowed qualifiers, SPARQL, SPARQL (new)
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified way. (Help) Exceptions are possible as rare values may exist. List of this constraint violations: Database reports/Constraint violations/P227#scope, SPARQL, SPARQL (new)
Conflicts with “instance of (P31): Wikimedia category (Q4167836)”: this property must not be used with the listed properties and values. (Help) List of this constraint violations: Database reports/Constraint violations/P227#Conflicts with P31, hourly updated report, search, SPARQL, SPARQL (new)
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type. (Help) Exceptions are possible as rare values may exist. List of this constraint violations: Database reports/Constraint violations/P227#allowed entity types, SPARQL (new)
For Persons, see de:Wikipedia:GND/Fehlermeldung. Hints for other types at User:Jneubert/GND_errors
Error type | Item(s) affected | Description/Duplicates | Exception | Reported | Resolved (also unpublished) | Resolved and published | Wikidata updated |
---|
This property is being used by:
Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)
GND ID - type human should not have GND ID containing -: GND ID exists, type is human, GND ID contains - (Help)
Violations query: SELECT ?item ?gndid { ?item wdt:P227 ?gndid; wdt:P31 wd:Q5. FILTER(REGEX(STR(?gndid), "-")) }
List of this constraint violations: Database reports/Complex constraint violations/P227#GND ID - type human should not have GND ID containing -
GND ID - type human should have sex: GND ID exists, type is human, sex missing (Help)
Violations query: SELECT ?item ?gndid {?item wdt:P227 ?gndid; wdt:P31 wd:Q5; MINUS {?item wdt:P21 []} }
List of this constraint violations: Database reports/Complex constraint violations/P227#GND ID - type human should have sex
GND ID - type human should have VIAF: GND ID exists, type is human, VIAF missing (Help)
Violations query: SELECT ?item ?gndid {?item wdt:P227 ?gndid; wdt:P31 wd:Q5; MINUS {?item wdt:P214 []} }
List of this constraint violations: Database reports/Complex constraint violations/P227#GND ID - type human should have VIAF
GND ID - type human should not be instance of a subclass of human: GND ID exists, type is human, is instance of a subclass of human (Help)
Violations query: SELECT ?item ?gndid { ?item wdt:P31/wdt:P279+ wd:Q5 . ?item wdt:P31 ?type . ?item wdt:P227 ?gndid }
List of this constraint violations: Database reports/Complex constraint violations/P227#GND ID - type human should not be instance of a subclass of human
Discussion
Deprecate Tn claims; which "reason for deprecation" qualifier?
By now there are again more than 1000 Tns in Wikidata, so apparently users are still importing them from somewhere. I can batch-deprecate all of them to avoid another re-import, and add a reason for deprecation (P2241) qualifier. However, which value would be appropriate for this qualifier? Simply incorrect identifier value (Q54975531) would be possible, but somewhat meaningless. Do we have something more specific? --MisterSynergy (talk) 20:21, 1 August 2019 (UTC)
- I believe that Tns don’t fit any of the criteria in Help:Deprecation, so in theory they should just be deleted periodically. But as you said, they keep being added. The problem with incorrect identifier value (Q54975531) is that the wording does not convey any information about the real reason for deprecation to an unfamiliar user. Wouldn’t it be best to create a new item for “undifferentiated [in the sense of GND or LCCN]” and use this? --Emu (talk) 21:49, 1 August 2019 (UTC)
- Shouldn't be necessary since DNB announced yesterday the deletion of all Tn records by June 16, 2020. -- Gymel (talk) 19:19, 29 August 2019 (UTC)
I support Emu's idea to create a new item for "undifferentiated". Even if the DNB deletes all Tns in June 2020 (which I doubt), these IDs will still remain in other databases for years, if not for decades. Though the qualifier should only be added if a source is given; otherwise I would just delete the Tn. BTW: The list Wikidata:WikiProject Authority control/Tn is quite helpful for maintenance. --Kolja21 (talk) 00:00, 1 September 2019 (UTC)
@MisterSynergy, Kolja21, Emu, Gymel:, such tool editing would be very helpful, see my comment in section "#Duplicates". Deleting all or deprecating all would help to see real duplicates via SPARQL. GND real duplicates are bad, because they also result in VIAF duplicates. MrProperLawAndOrder (talk) 19:35, 11 May 2020 (UTC) / Linkfix Property talk:P227#Duplicates. --Kolja21 (talk) 20:42, 11 May 2020 (UTC)
(careful) import from VIAF?
Hi everyone, I was at a conference the last couple of days and some people mentioned that the GND coverage on Wikidata is a bit on the low side. For other countries I sometimes import links based on VIAF (example) and I could do the same for GND. https://w.wiki/C$5 gives an overview of potential candidates. After reading through this page I see I have to do some filtering:
- Check if the GND entry actually exists (VIAF seems to contain dead links)
- Check if the GND entry is of type person and not of type name
Do you think this is a good plan? Has this been tried before? Do I need to apply additional filtering to prevent errors? Multichill (talk) 11:55, 30 November 2019 (UTC)
- Sounds great! A challenging project. User:Magnus Manske is one of the experts. AFAIK a third filter is needed: there are old GNDs with a dash. Example: GND 4029236-8 for Cairo (= VIAF DNB-040292363). If you start with GND, Type p (person) you can ignore this problem (it only affects corporate bodies and geographical place names). --Kolja21 (talk) 16:47, 30 November 2019 (UTC)
- Well, if you had asked before the recent mass VIAF import took place, I would have supported this idea. Now I am not sure, as there were plenty of wrong VIAF identifiers imported recently. For persons, VIAF does pretty aggressive automatic matching of their clusters to existing Wikidata items, based on "same name + same year of birth" comparisons, which results in way too many wrong matches. That said, it would probably be safe to import GNDs from VIAF clusters about anything except humans. --MisterSynergy (talk) 17:21, 30 November 2019 (UTC)
- How about a direct import? In VIAF clusters, the GND part usually has quite a lot of detail. Someone might already be doing that with the subset of economists. --- Jura 17:29, 30 November 2019 (UTC)
- Could you prepare a random sample import set of about 500 GND? We could check them for systematic and specific problems before the big import. Not a big fan of the last VIAF import either, almost all the changes on my watchlist were faulty. --Emu (talk) 17:56, 30 November 2019 (UTC)
- @Kolja21, MisterSynergy, Jura1, Emu: I did a small test run. Please have a look.
- As a general remark: If a link is incorrect, please don't remove it, but set it to rank deprecated with reason for deprecation (P2241) set to applies to other person (Q35773207). This avoids re-introducing mistakes. Multichill (talk) 19:54, 2 December 2019 (UTC)
- I've checked 10 edits: 8 are good (some of them were even missing on German WP), 2 were wrong:
- Imho even the wrong edits are helpful if they are marked as "rank deprecated" since many editors on Wikidata do the same kind of import. BTW: Is there a list like "the 500 most common names"? The bot could ignore persons with these names or put these edits on a separate list for "please check"? --Kolja21 (talk) 22:08, 2 December 2019 (UTC)
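The filters discussed in this thread (entry exists, type is person rather than name, no dashed ID for a person) could be sketched as follows. This is a minimal illustration, not what any bot actually ran: the `GndCandidate` record and its fields are hypothetical, and how `exists` and `record_type` get filled in (for example by looking each ID up at the DNB) is left out.

```python
from dataclasses import dataclass


@dataclass
class GndCandidate:
    """Hypothetical record for one GND ID harvested from a VIAF cluster."""
    gnd_id: str
    exists: bool       # assumed to come from a lookup of the ID at the DNB
    record_type: str   # "Tp" (person) or "Tn" (name), as used on this page


def keep_candidate(candidate: GndCandidate) -> bool:
    """Apply the three filters mentioned in the thread to a person import."""
    if not candidate.exists:            # VIAF seems to contain dead links
        return False
    if candidate.record_type != "Tp":   # skip Tn (name) placeholders
        return False
    if "-" in candidate.gnd_id:         # dashed IDs are not person records
        return False
    return True


if __name__ == "__main__":
    sample = [
        GndCandidate("1019646128", True, "Tp"),
        GndCandidate("122968751", True, "Tn"),
    ]
    print([c.gnd_id for c in sample if keep_candidate(c)])
```

A common-names check, as suggested in the last comment, would be a fourth filter that routes candidates to a "please check" list instead of dropping them.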
For humans add VIAF if GND exists
@Multichill: could your bot for humans add VIAF ID if GND ID and DtBio ID (P7902) are equal and present and VIAF missing? One can reach the VIAF cluster via GND ID, e.g. for P227=P7902=1047557762 the link is https://viaf.org/viaf/sourceID/DNB%7C1047557762 MrProperLawAndOrder (talk) 22:36, 23 May 2020 (UTC)
- I'm pretty sure I didn't continue this because the error rate was too high. Not sure. I have no plans to work on this anytime soon. Multichill (talk) 08:47, 24 May 2020 (UTC)
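For reference, the lookup link quoted in the request above can be built from a GND ID by percent-encoding the `DNB|<id>` source identifier. A small sketch, where the function name `viaf_lookup_url` is invented for this example and only the URL shape is taken from the comment:

```python
from urllib.parse import quote

# Base of the link quoted in the request above.
VIAF_SOURCE_ID_BASE = "https://viaf.org/viaf/sourceID/"


def viaf_lookup_url(gnd_id: str) -> str:
    """Build the VIAF sourceID link for a GND ID (hypothetical helper).

    safe="" makes quote() encode the "|" separator as %7C.
    """
    return VIAF_SOURCE_ID_BASE + quote(f"DNB|{gnd_id}", safe="")


if __name__ == "__main__":
    print(viaf_lookup_url("1047557762"))
```

This only constructs the link; whether VIAF resolves it, and whether the resulting cluster is the right person, still has to be checked.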
GND saturation of Wikidata
GND-only items currently saturate almost every other application. Given that we have more than 160,000 items with merely GND IDs, can we see an outline of how this will be fixed?
According to MrProperLawAndOrder (see talk page of @Mike Peel:), they count on @Bargioni: (or @Epìdosis:) to fix it for them [2]. --- Jura 10:03, 25 May 2020 (UTC)
- @Jura1: you are aware of the fact that your claim about me is a personal attack? I never said what you claim. MrProperLawAndOrder (talk) 06:28, 26 May 2020 (UTC)
- @Jura1: In less than one week dates of birth and death will be imported by Bargioni from GND ID (P227). --Epìdosis 10:06, 25 May 2020 (UTC)
- Given the number of items, it seems unlikely that this can be done in a week, but I think we can hold that long. --- Jura 10:13, 25 May 2020 (UTC)
- @Jura1: Work in progress. We have to access GND a lot of times to grab dates. If more info is available, I'll grab it too. -- Bargioni 🗣 10:35, 25 May 2020 (UTC)
- Didn't they have downloadable dump? It might be easier to just create new items from scratch and nuke the others. --- Jura 10:38, 25 May 2020 (UTC)
- @Jura1: importing from the most recent dump would mean to import information that is already outdated. MrProperLawAndOrder (talk) 13:22, 25 May 2020 (UTC)
- @Jura1: Can you provide a source for the claim in your first sentence? MrProperLawAndOrder (talk) 12:54, 25 May 2020 (UTC)
- @Jura1: reminder. MrProperLawAndOrder (talk) 14:05, 25 May 2020 (UTC)
- @Jura1: reminder. MrProperLawAndOrder (talk) 23:01, 25 May 2020 (UTC)
- Let's see what @Bargioni: thinks of the dump approach. --- Jura 13:28, 25 May 2020 (UTC)
- @Jura1: I'm in trouble with my home (due to covid lockdown) network connection. The provider... :-( Anyway, please add a link to the GND dump: I'll try to evaluate it against the one by one record approach I was thinking to use. Thx. -- Bargioni 🗣 14:16, 25 May 2020 (UTC)
- @Bargioni, Jura1: Here are all the dumps. --Epìdosis 14:53, 25 May 2020 (UTC)
- @Epìdosis, Bargioni, Jura1: what is this about? I avoided importing time information from outdated data sources to ensure best quality and now you are planning on using old dumps? MrProperLawAndOrder (talk) 23:06, 25 May 2020 (UTC)
- @MrProperLawAndOrder: I think that it is highly improbable that the data have worsened in two months; the only problem may be some missing death dates. Anyway, my message was just meant to show the existence and location of these dumps. I've now spoken with Bargioni and he told me that he is not going to use dumps, but to retrieve single GND entries, as first said. Of course it will be a long process, given that we are speaking of tens of thousands of entries. Bye, --Epìdosis 08:08, 26 May 2020 (UTC)
Wrong gender imported from GND
Can you repair this: https://www.wikidata.org/w/index.php?title=Q94853704&oldid=1185186853 person is obviously female. --- Jura 10:13, 25 May 2020 (UTC)
- More examples are Q95335213, Q95338703, Q95339302, Q95350061, Q95349834, Q95348529, Q95349608, Q95350494. DNB seems to have wrong gender data (male instead of female) in at least some cases, even if they show the right (female) form of occupation (e.g. "Schriftstellerin" instead of "Schriftsteller"). --M2k~dewiki (talk) 12:12, 25 May 2020 (UTC)
Should be
SELECT ?person ?gnd
WHERE {
  ?person wdt:P227 ?gnd .
  ?person wdt:P7902 ?gnd .
  MINUS { ?person wdt:P569 ?b . }
  MINUS { ?person wdt:P570 ?d . }
  ?person wdt:P31 wd:Q5 .
  ?person wdt:P21 wd:Q6581072 .
  ?person wdt:P735 ?firstname .
  ?firstname wdt:P31 wd:Q12308941 .
}
ORDER BY DESC(?person)
and
SELECT ?person ?gnd
WHERE {
  ?person wdt:P227 ?gnd .
  ?person wdt:P7902 ?gnd .
  MINUS { ?person wdt:P569 ?b . }
  MINUS { ?person wdt:P570 ?d . }
  ?person wdt:P31 wd:Q5 .
  ?person wdt:P21 wd:Q6581097 .
  ?person wdt:P735 ?firstname .
  ?firstname wdt:P31 wd:Q11879590 .
}
ORDER BY DESC(?person)
However, it works only on items that have given name (P735); items created in the last few days probably don't have it yet. --Epìdosis 13:52, 25 May 2020 (UTC)
- @Jura1, M2k~dewiki: Thank you for the list of differing data. In GND gender and occupation are added separately so an actress can be male. All errors concerning articles in German WP have been corrected a few months ago. I've added the new items to this list: de:Wikipedia:GND/Fehlermeldung/Mai 2020#Todesjahr nach 1850. These errors will be corrected as well. --Kolja21 (talk) 00:32, 26 May 2020 (UTC)
Removal of redirected IDs
Notified participants of WikiProject Authority control. Hi all! Until recently, redirected GND IDs were periodically removed by KrBot, maintained by @Ivan A. Krestinin:; recently the bot was blocked (see here and here) because it was argued that these IDs should be kept, deprecated and given reason for deprecation (P2241) redirection (Q8143062). I think that it can be appropriate to act in this way for single authority control IDs; at the same time, it is inconsistent to have only some redirected IDs. We should decide in this discussion whether we want that
- (1) the bot always deprecates redirected IDs (which may be inconsistent, as in the past many redirected IDs have been deleted; but this solution is still somewhat possible), never removing them
- (2) the bot deprecates the redirected IDs only in some cases (we should establish a criterion, and it should be possible for the bot to understand and respect this criterion, which may not be easy) and removes them in all the other cases
- (3) the bot always removes the redirected IDs, unless they have already been deprecated (we should establish a criterion and apply it manually)
- (4) the bot always removes redirected IDs, as it did before the block
In my opinion, option 3 is probably a good compromise, at least temporarily; so, if no objections are raised, I will unblock the bot (at least for this task) on the 10th of June, asking Ivan not to remove IDs which are already deprecated.
If objections are raised about this temporary compromise and in the meantime the bot gets unblocked for other tasks (e.g. for VIAF tasks, see this discussion), I will ask Ivan not to edit GND IDs until some consensus is reached about the above proposals. --Epìdosis 09:47, 4 June 2020 (UTC)
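To make the difference between the four options listed above concrete, here is a small decision-function sketch. It is only an illustration of the proposal, not KrBot's code: the function name, the string return values and the `meets_criterion` flag (standing in for the criterion that option 2 would still have to define) are all assumptions.

```python
def action_for_redirected_id(option: int, rank: str,
                             meets_criterion: bool = False) -> str:
    """Return what a bot would do with one redirected GND ID.

    option: 1-4, in the order the options are listed above.
    rank: current rank of the statement ("normal", "preferred", "deprecated").
    meets_criterion: placeholder for the yet-undefined criterion of option 2.
    """
    if option == 1:
        return "deprecate"  # never remove
    if option == 2:
        return "deprecate" if meets_criterion else "remove"
    if option == 3:
        # leave manually deprecated IDs alone, remove the rest
        return "keep" if rank == "deprecated" else "remove"
    if option == 4:
        return "remove"     # behaviour before the block
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    for opt in (1, 2, 3, 4):
        print(opt, action_for_redirected_id(opt, rank="deprecated"))
```

Under option 3 the only statements that survive are those someone has already deprecated by hand, which is why the proposal describes the criterion as applied manually.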
- Redirected IDs should never have been removed; any edits that did so should be reverted. The fact that some were wrongly removed in the past should not be used as a reason to remove more in the future. Your point 3 is not a good compromise; it would continue the harm done by such removals. I am opposed to the block being lifted unless an undertaking is given to remove no IDs. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:15, 4 June 2020 (UTC)
- What is the scale of this? (db size GND: actual entries, number of redirects, number added over some period of time; at Wikidata: number of redirects removed in a run). --- Jura 13:11, 4 June 2020 (UTC)
- Oppose option 3 and 4. Agree with Andy Mabbett: no GND IDs should be removed, if they still apply to the same entity. And what is the source for the bot's operations? Is it the VIAF DB? See also User talk:Ivan A. Krestinin#VIAF replacement without adjusting the reference, which still said "imported from Wikimedia project : German Wikipedia" MrProperLawAndOrder (talk) 03:40, 7 June 2020 (UTC)
- The data to use for GND is in GND LDS / GND dump. If the next dump is out, a bot or QS could add the redirects. I don't know what the value of having this is outside GND "piz" (=Q5). @Kolja21, Raymond: third parties use piz a lot and often don't change the old IDs. So it is important for resolving. Q5/piz should be the most important? MrProperLawAndOrder (talk) 02:07, 7 June 2020 (UTC)
- Care should be taken: WD mixes entities from GND, e.g. pseudonym GNDs are on the human. But each redirect applies to one GND entity. One should think about how to store this information. Most important seems to be to do it for humans, as many 3rd parties also only have items for humans, not one for "real" person and one for what GND calls "pseudonym". MrProperLawAndOrder (talk) 03:45, 7 June 2020 (UTC)
- Support I support the removal of redirects. I have not read any valid reasons yet why these should be kept. Andy states his opinion against removal, but never actually gives any arguments. Neither does MrProper. What are the benefits for Wikidata and/or entities using the data that WD provides if we allow for redirects to be kept? Keeping redirected IDs would be the same as keeping Wikipedia article redirects after those articles have been renamed. Is that done? --Sotho Tal Ker (talk) 15:07, 7 June 2020 (UTC)
- You ask what is the benefit of keeping and claim I didn't state a reason, while I did, not only for keeping but also for adding, at 02:07, 7 June 2020 right on this page. You don't only oppose adding them, but also support removing them. Since you asked for a benefit, why didn't you state a benefit for the action 'you' ask for, namely removal of verifiable information provided by a high quality source? MrProperLawAndOrder (talk) 23:22, 7 June 2020 (UTC)
- You might want to clarify what you mean by your statement at 02:07, 7 June 2020. Because to me it does not make any sense. Why are old redirects important for resolving? Which third parties use piz a lot? Why does it matter in this context for redirects? Why would any third party be interested in obtaining current values AND older values that are redirects? Good reasons for keeping redirects are surely possible, i.e. this is correct historical information or the redirects are still valid data in itself. But none of this has been provided, only some vague gibberish. But feel free to elaborate more. Benefits for removal of redirects are easy to see:
- Values for redirects point to the same data as the current values. There is no use in keeping distinct values if they actually point to the same source. See also my example for Wikipedia redirects above.
- Values are constantly updated. Third parties can be confident that they always get the up-to-date values without any mixed-in deprecated values. Of course this could be filtered as well, but removal will make the handling for third parties much easier.
- A cleaner database. There will only be a few items left with more than one value and those cases are well substantiated, like pseudonyms which have their own GND entry.
- Keeping redirects will waste computing power as these redirects have then to be resolved externally. This is not energy efficient.
- I never asked for anything, I just picked one of the options provided. But I am indeed in favor of removing outdated data. Knowing that an item had this, this and that GND value, which have since been merged into a single item, is of no practical use. The only reason these redirects are kept by the authority data providers is that they were intended to be permanent. This is much better than suddenly having a dead link but it does not mean that these redirects have to be kept forever, especially not by secondary databases like Wikidata. In my opinion, any sensible approach would be to just replace the older value with the new one and move on.
- From another discussion I can see that people want to keep redirects that are already present, but do not want to add other redirects that exist. How does that make any sense? And I really would love to see your "high quality source" for "verifiable information" that still includes redirects. Unless it is a GND data dump, which could be classified as mediocre at its best. --Sotho Tal Ker (talk) 02:26, 8 June 2020 (UTC)
- Re "The only reason these redirects are kept by the authority data providers is that they were intended to be permanent." + "it does not mean that these redirects have to be kept forever" : could you post your definitions of "permanent" and "forever"?
- Re "You might want to clarify what you mean by your statement at 02:07, 7 June 2020. Because to me it does not make any sense." :
- "Why are old redirects important for resolving?" : Without them it isn't possible.
- "Which third parties use piz a lot?" : de:Wikipedia:BEACON#Datenquellen zu Personen (GND)
- "Why does it matter in this context for redirects?" : If nobody would use the values, the values would have no value.
- "Why would any third party be interested in obtaining current values AND older values that are redirects?" : Cleaning up their data, matching with other piz users
- Re "Good reasons for keeping redirects are surely possible, i.e. this is correct historical information or the redirects are still valid data in itself." - That is what all the requests for keeping are about.
- Re "And I really would love to see your "high quality source" for "verifiable information" that still includes redirects. Unless it is a GND data dump, which could be classified as mediocre at its best." - GND LDS.
- MrProperLawAndOrder (talk) 13:51, 8 June 2020 (UTC)
- Any English dictionary can give you the definitions I use. Permanent: "continuing or enduring without fundamental or marked change". Forever: "for a limitless time". Please also note that I explicitly stated the quoted second part for secondary databases. Primary databases are expected (at least by me) to never delete any redirects. But the explanations you gave only partially satisfy me. They make sense for primary databases, but why should secondary databases like Wikidata keep obsolete values? I also do not see any commonalities between the linked BEACON-sources and Wikidata. The data in those sources clearly link to GND values and often their own websites. Where does Wikidata come into play and why would these third parties be interested in obsolete values stored in Wikidata? I already explained the reasons why redirects exist: So that users can update their data without getting any dead links or inconsistency, for example to avoid stuff like this: [4]. But this only applies to the primary database. Why would any secondary source want to keep obsolete values? If a clean up of their data is needed, I would advise those third parties to use the primary database directly, not any secondary one which usually lags behind a bit. If requests for keeping are made for historical purposes, why do I not see anyone mentioning this in the discussion? And lastly: The GND LDS dump is not of "high quality", sorry to be blunt. There are lots and lots of issues, but this is not the point in this discussion.
- Long reply made short: In my opinion your arguments apply mostly to primary databases which are the maintainers of original values. Wikidata is only a user of that data like those other mentioned third parties. But maybe I am missing something here. --Sotho Tal Ker (talk) 20:46, 8 June 2020 (UTC)
- @Sotho Tal Ker: thanks for insisting. A third party that uses an old GND can link via a resolver to WD. If the value is deleted, it cannot do that any more. "High quality" in general or not, GND is, at least for the redirects, the authoritative source, even if they make errors in their redirects. GND LDS is current data; a dump can be outdated. GND IDs for humans are widely re-used by sites providing beacon files. Beacons cannot be safely matched if the third parties use different IDs for the same human. Also note, Deutsche Biographie supports redirected IDs; they use the dump or GND LDS or whatever to obtain them. WD could do the same. MrProperLawAndOrder (talk) 22:58, 8 June 2020 (UTC)
- You might want to clarify what you mean by your statement at 02:07, 7 June 2020. Because to me it does not make any sense. Why are old redirects important for resolving? Which third parties use piz a lot? Why does it matter in this context for redirects? Why would any third party be interested in obtaining current values AND older values that are redirects? Good reasons for keeping redirects are surely possible, i.e. this is correct historical information or the redirects are still valid data in itself. But none of this has been provided, only some vague gibberish. But feel free to elaborate more. Benefits for removal of redirects are easy to see:
Better to continue in the general RfC: Wikidata:Requests for comment/Handling of stored IDs after they've been deleted or redirected in the external database. --Epìdosis 21:41, 8 June 2020 (UTC)
Replacement of redirected values
@Epìdosis, Pigsonthewing, Bargioni, Kolja21, Raymond: the bot didn't only remove, it replaced, and it did so on deprecated values, without changing the rank, so the redirect target was then marked as deprecated in WD. And, worse, it did so on items that already had the target value as preferred value, creating inconsistency in WD [5]. These kinds of edits by the bot are just a disaster. Maybe it did more harm to the data than any vandal. MrProperLawAndOrder (talk) 20:26, 8 June 2020 (UTC)
- Yes, these edits are the main problem. KrBot helped a lot to find duplicates (and also fixed internal VIAF-GND-IDs, since VIAF has problems with dashes) but overwriting values that have references or ranks is unacceptable. --Kolja21 (talk) 20:38, 8 June 2020 (UTC)
Closing this point
Can the users who haven't been block evading clarify what they prefer for this property? I think the options are:
- (a.) delete redirecting and deleted ids
- (b.) delete redirecting ids
- (c.) delete deleted ids
- (d.) skip deletion of (manually or otherwise) deprecated ids
- (e.) add all redirecting ids
I think krbot did (a). I'm not really sure that we have the resources to maintain anything beyond that. We could do (d.), but that implies that when an id is fixed somewhere, redirecting ids are fixed as well. I would be glad if Krbot would be reactivated soon. @Epìdosis, Bargioni, Kolja21, Raymond, Ivan A. Krestinin: --- Jura 09:04, 20 June 2020 (UTC)
- I would go with (d). At least for the GND I can confirm that redirected ids are permanent. Raymond (talk) 09:08, 20 June 2020 (UTC)
- The risk at Wikidata is that the redirecting GND id and the actual GND are on different items. If (d.) concerns only a few items, this wouldn't be much of a risk. --- Jura 09:12, 20 June 2020 (UTC)
- Comment @Jura1: The question is very much pertinent and I perfectly agree with the risk you underline in option (d.) [I've already started looking for these cases, see the second query here], but the problem should probably not be discussed here, since the general RfC Handling of stored IDs after they've been deleted or redirected in the external database is still open and was going in another direction: seemingly there was consensus for deprecating deleted and redirected IDs and some users also supported adding already redirected IDs (which still doesn't convince me completely). So I think the discussion should probably continue with a general perspective in the RfC, since, if here a few users decide something for this property and then the RfC is closed with a different result, the latter would prevail. --Epìdosis 09:30, 20 June 2020 (UTC)
- RFCs are only applicable if the other discussions didn't lead to a result. If you feel other users should be invited to comment on this, please ping them. The general approach for IDs doesn't exclude that we use better solutions for some IDs. --- Jura 09:34, 20 June 2020 (UTC)
- I think we should probably mention @MisterSynergy: (the main supporter of deprecation of redirected/deleted IDs and of their insertion ex novo in items in the RfC) and also {{ping project|Authority control}} (I don't do it myself because I've received complaints for supposedly using it too frequently). --Epìdosis 09:44, 20 June 2020 (UTC)
- I would prefer to standardize the handling of redirecting and deleted identifiers for all properties, including this one. Everything else is pretty difficult to implement and teach to the community. ---MisterSynergy (talk) 09:58, 20 June 2020 (UTC)
- If we can formulate an approach that covers a decent variety of use cases, why not. I don't really see that in the RFC's initial proposal: it seems to assume that there are only permanent redirects and everything else is stable. With external-ids at Wikidata covering properties like this one, social media account names, VIAF and ISO country codes, there are at least four different things that need to be explicitly mentioned. --- Jura 10:42, 20 June 2020 (UTC)
- So shall we go for (d)? These would be skipped. It should be possible to phrase this into a more general summary of how to handle them. --- Jura 17:30, 27 June 2020 (UTC)
- @Jura1: Two problems: first, I would prefer waiting for the closure of the general RfC, although I acknowledge that it could require a long wait, so maybe a provisional solution is the best way to have KrBot restart its work; second, I haven't fully understood what solution d) implies: the bot should not touch deprecated IDs (OK), but when finding values which GND has redirected but which aren't yet deprecated on Wikidata, would it a) delete them b) deprecate them c) skip them? --Epìdosis 17:52, 27 June 2020 (UTC)
- As you are aware, RFC shouldn't be used to duplicate ongoing discussions elsewhere (that is here). Option (d) would skip the statements in the query when doing the updates it does currently. --- Jura 17:55, 27 June 2020 (UTC)
- Given that everyone had time to comment, I think we can re-activate this per option (d). --- Jura 23:45, 15 July 2020 (UTC)
- @Jura1: I agree, but could you please clarify this point for me: "if the bot finds a GND value which has normal rank on Wikidata while in GND it redirects to another value, should the bot remove the GND value from Wikidata, according to option (d)?" --Epìdosis 09:38, 16 July 2020 (UTC)
- If it runs as it did for now, I think it would update it to the new value effectively deleting the old one. Ivan runs the bot, so he would be the person to confirm this. Option (d) only changes what happens to deprecated statements. Someone else would need to find a way to maintain the redirecting deprecated statements (or people shouldn't rely on them to point to the correct redirect target). --- Jura 09:49, 16 July 2020 (UTC)
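As a reading aid for option (d) as described in this thread, here is a minimal sketch of how such an update pass might treat the statements of a single item. Everything in it is an assumption for illustration: the `Statement` class, the `apply_option_d` function, and the placeholder IDs are hypothetical and are not KrBot's actual code or the Wikidata API. Deprecated statements are skipped untouched; a non-deprecated value that GND redirects is updated to the target, or dropped if the target is already present on the item with a non-deprecated rank.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Statement:
    """Hypothetical stand-in for one P227 statement on an item."""
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"


def apply_option_d(statements: List[Statement],
                   redirects: Dict[str, str]) -> List[Statement]:
    """Sketch of option (d): never touch deprecated statements.

    `redirects` maps an old ID to its redirect target (assumed to be
    already resolved; redirect chains are not followed here).
    """
    # Non-deprecated values that are current, i.e. not redirected themselves.
    live = {s.value for s in statements
            if s.rank != "deprecated" and s.value not in redirects}
    result: List[Statement] = []
    for s in statements:
        if s.rank == "deprecated" or s.value not in redirects:
            result.append(s)  # skipped by the bot, kept as-is
            continue
        target = redirects[s.value]
        if target in live:
            continue  # target already on the item: drop the old value
        live.add(target)
        result.append(Statement(target, s.rank))  # update, keeping the rank
    return result


# Placeholder IDs, not real GND identifiers.
item = [Statement("OLD-1", "deprecated"),
        Statement("OLD-2", "normal"),
        Statement("NEW-1", "preferred")]
updated = apply_option_d(item, {"OLD-1": "NEW-1", "OLD-2": "NEW-1"})
```

In this example the deprecated `OLD-1` survives unchanged, `OLD-2` is dropped because its target is already present, and `NEW-1` keeps its preferred rank, so the target never ends up marked as deprecated. As noted above, nothing in this pass keeps the remaining deprecated statements pointing at the correct redirect target; that would need separate maintenance.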
Usage of P227 in dewiki
@Christian140, Kolja21, Raymond, emu, berita: Re de:Wikipedia:Umfragen/Normdaten aus Wikidata maybe some in the discussion are not aware of the recent changes in WD. 100000+ new items about humans that are in Deutsche Biographie have been created, VIAF ID is now added to them, directly from the GND DB, dozens of duplicated items that have been created from a dewiki article have been found and merged. On humans GND ID became the second VIAF source ID after ISNI that is used on 1 million items, see Wikidata:VIAF/type/human. The items are enriched not only with more data from GND but also by other edits. When the next GND dump is out, maybe user:Bargioni and user:Epìdosis can analyze the whole dump and maybe create more items, one could start with those that have relationships with other humans or have a value for ISNI and VIAF.
In WD it is also very easy to find format violations, recently I found cases in dewiki where a GND was stored in the field VIAF, this can also happen in WD, but a warning symbol will be displayed after saving.
Would be interesting to hear, what is bad about using the data from WD. The discussion isn't only about GND but also about VIAF and LCCN. Maybe a tool would be nice that allows easy editing of VIAF, GND, LCCN in WD by dewiki-users. One can link directly to the properties, e.g. P227: Q57188#P227, but this is not as user friendly as displaying the three fields next to each other. I don't know how a change in P227 appears in article watch lists, maybe there is something to be improved too. MrProperLawAndOrder (talk) 19:25, 8 June 2020 (UTC)
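To illustrate the kind of format check mentioned above (a GND stored in the VIAF field), here is a minimal sketch. The two patterns are deliberately simplified assumptions and are not the actual format constraints defined on the properties; `PATTERNS` and `format_warning` are hypothetical names. The only facts relied on are that VIAF IDs are purely numeric and that GND IDs may carry a dash and an `X` check character.

```python
import re
from typing import Optional

# Assumption: simplified shapes for illustration only, not the real
# property constraints used on Wikidata.
PATTERNS = {
    "VIAF": re.compile(r"[1-9]\d*"),    # digits only
    "GND": re.compile(r"\d+-?[0-9X]"),  # digits, optional dash, check character
}


def format_warning(field: str, value: str) -> Optional[str]:
    """Return a warning text if `value` does not fit the shape of `field`."""
    if PATTERNS[field].fullmatch(value):
        return None
    return f"{value!r} does not look like a {field} ID"


# A dashed GND ID placed in the VIAF field is flagged.
print(format_warning("VIAF", "1026669-0"))
# The same value is accepted in the GND field.
print(format_warning("GND", "1026669-0"))
```

A pure shape check like this cannot catch a purely numeric GND ID entered as VIAF, so it complements rather than replaces cross-checking against the source database.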
- Regarding editing Wikidata by (de)wiki-users also see:
- Wikidata:Client editing prototype
- de:Wikipedia:Wikidata/InfoboxErweiterung --M2k~dewiki (talk) 19:39, 8 June 2020 (UTC)
"Maybe a tool would be nice ..." deWP has a nice tool, see: de:Hilfe:Normdaten#Helferlein. --Kolja21 (talk) 20:20, 8 June 2020 (UTC)
- Kolja21, I meant a tool for editing in WD. If the tool could do that, that would be nice. Instead of storing in Vorlage:Normdaten the data would be stored in WD. So, this part could be made the same as is in dewiki now. dewiki also has a field for type, this could be done in WD via qualifier, one could even store the entity subtypes, piz, pis etc. which is more than dewiki currently has? MrProperLawAndOrder (talk) 20:33, 8 June 2020 (UTC)
- Just a few random examples from the last days:
- bot and batch problems: Stefan Haas (Q15433293) (history): Silewe removed an incorrect VIAF, Bargioni re-entered it, I removed it. Sometimes it’s ping-pong over years.
- plain wrong: Otto Keller (Q2039491) (history) – all identifiers save for the Austrian parliament had the wrong Otto Keller. I found out because there was no entry in de.wp (but in WD) and I could cross-check – that would have been impossible with the envisioned Wikidata only approach.
- especially worrisome: Michael Fischer (Q95316658) used to be correctly considered distinct from Michael Fischer (Q21588913). Then User:MrProperLawAndOrder merged the two for some reason (my only information is: batch #35935 which isn’t helpful at all). Again, I could find the problem because there still is Authority Control in de.wp.
- I fail to see how a tool would be of any help here. --Emu (talk) 20:48, 8 June 2020 (UTC)
- Emu, the third one was me. I restored Q95316658 and merged the duplicate Q96106664 created by an unclickable temporary batch [6]. If you click on the batch link from my batch you can find more information. Today I also created a section for these batches here on the page: Property talk:P227#VIAF batch merge using QS. user:Raymond recently reported two errors in my batch merges. I merged 4400 - sorry for the three errors found so far. All merges involved Deutsche Biographie for which I created many new items. MrProperLawAndOrder (talk) 21:23, 8 June 2020 (UTC) //// (edit conflict, putting it here, extension to the last sentence of the post) All merges involved Deutsche Biographie for which I created many new items and that recently got a VIAF directly from the GND DB and where the VIAF value existed on another item. There were so many duplicates that I thought some mass operation could help - of course I feared wrong mergers. I included a check for name and for date of birth. MrProperLawAndOrder (talk) 22:02, 8 June 2020 (UTC)
- I’m not sure how what you did now is any improvement over the solution I created. Anyway, we’ve reached the core of the problem: If I point out a problem (or even hundreds of problems), the answer is always threefold: 1) the error rate is low (easy if it’s hard to find errors) 2) all problems can be solved with some gizmo 3) philosophical concerns about the relationship between de.wp and Wikidata. --Emu (talk) 21:47, 8 June 2020 (UTC)
- @Emu: the merge was wrong, Q95316658 should not be a redirect to the other M. Fischer, and your item had a higher number, but WD's standard is to merge into the lower, so I merged into Q95316658. I don't see where my answer was threefold by the definition you provided. Please criticize my answer directly. MrProperLawAndOrder (talk) 23:03, 8 June 2020 (UTC)
- Okay, I understand. After looking at it again: You also caused the problem with Otto Keller (and I solved it). So to answer your original question: Yes, I am aware of the recent changes. No, they don’t change my position. --Emu (talk) 14:01, 9 June 2020 (UTC)
- Emu, which "problem with Otto Keller"? MrProperLawAndOrder (talk) 17:34, 9 June 2020 (UTC)
- Otto Keller (Q95342132) was incorrectly merged with Otto Keller (Q2039491) --Emu (talk) 21:06, 9 June 2020 (UTC)
- @Emu: thank you, added to Property talk:P227#VIAF batch merge using QS. So, I am happy that out of the three errors two were mine, it means fewer different causes. MrProperLawAndOrder (talk) 00:04, 10 June 2020 (UTC)
- @MrProperLawAndOrder: For the GND type WD had P107 (P107). This property was deleted. Also properties for GNDName, GNDCheck and REMARK are missing. --Kolja21 (talk) 20:49, 8 June 2020 (UTC)
- @Kolja21: one could ask for undeletion, it is verifiable information from a high quality source. One could restrict the scope to qualifier, so it is attached to the GND, not the item. I don't know about the value of GNDName, Tn seems to be phased out. We should think about GNDCheck and REMARK. But that would be helpful for WD anyway. MrProperLawAndOrder (talk) 21:07, 8 June 2020 (UTC)