Property talk:P227


identifier from an international authority file of names, subjects, and organizations (please don't use type n = name, disambiguation) - Deutsche Nationalbibliothek
Represents: GND ID (Q54506313)
Associated item: German National Library (Q27302)
Has quality: VIAF component (Q26921380)
Data type: External identifier
Template parameter: en:Template:Authority control: |GND=, Template:Authority control (Q3907614)
Allowed values: 1[012]?\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X]
Example:
  universe (Q1): 4079154-3 (RDF)
  Jehan Sadat (Q212190): 118604740 (RDF)
  disabled persons (Q15978181): 4005279-5 (RDF) and 4189397-9 (RDF)
  Wolfgang Domhardt (Q1504481): 1200055586 (RDF)
  Washington County (Q484538): 4253976-6 (RDF)
Formatter URL: $1
Robot and gadget jobs: DeltaBot does the following jobs:
Tracking (differences): Category:GND different on Wikidata (Q55746867)
Tracking (usage): Category:Pages with GND identifiers (Q8709075)
Tracking (local yes, WD no): Category:GND not on Wikidata (Q56825653)
Related to country: Germany (Q183) (See 125 others)
See also: DNB editions (P1292), Deutsche Biographie ID (P7902), Sächsische Biografie ID (P1710)
  • <items with the most statements of this property>
  • Count of items by number of statements (chart)
  • Count of items by number of sitelinks (chart)
  • Items with the most identifier properties
  • Items with no other external identifier
  • Items with no other statements
  • <most recently created items>
  • Items with novalue claims
  • Items with unknown value claims
  • Usage history
  • Mix'n'match (Report)
  • Database reports/Constraint violations/P227
  • <random list>
  • Proposal discussion
    Current uses: 1,297,119 out of 15,208,093 (9% complete)
    Search for values
    Explanations

    The Integrated Authority File (GND) is managed by the German National Library in cooperation with various library networks in German-speaking Europe and other partners. Please look up GND IDs at Online-GND or the DNB-Portal. (VIAF is helpful but often incorrect or outdated, and it mixes two identifier systems, which in some cases produces dead links.)

    GND ID (P227) (Template:Entities)

    1. GND 1019646128: Stan Lauryssens (b. 1946), type Tp (person) = Yes
    2. GND 122968751: Stan Lauryssens (no info), type Tn (name, a placeholder) = No

    VIAF ID (P214)

    1. VIAF 120062731 Stan Lauryssens = Yes
    2. VIAF 293348885 Stan Lauryssens (undifferentiated) = No

    Known VIAF problems

    1. Johannes Fabian (Q15641418): in January 2014, VIAF 91414487 merged three GNDs, only one of which was correct:
      1. GND 107342049: Fabian, Johannes R., undifferentiated
      2. GND 122878310: Fabian, Johannes (* 1937), Amerikan. Anthropologe
      3. GND 172084180: Fabian, Johannes R., Dipl.-Ing.
        • Update: now both Wikidata and VIAF have only one GND
    2. VIAF changes numbers with dashes: Instituto Brasileiro de Geografia e Estatística (Q268072).
      1. GND 1026669-0 (correct)
      2. VIAF-GND 004164695 ("404 Not Found")
    3. Same name + same year of birth = different person
      1. In some cases VIAF merges two persons because only one of them has a GND
      2. Please use: GND = "no value" (for the person without a GND) [1]
      3. For items often confused use: different from (P1889)
    4. For unknown reasons VIAF is not importing all GND ids
      1. Samuel Ramos (Q7412445) = GND 1022446479 (created 16-05-12)
      2. June 2015: 3 years later, the GND ID had still not been harvested by VIAF
        • DNB and VIAF have been made aware of the problem. There might have been a harvesting glitch during the first weeks after the GND went live in April 2012: GND records which have never been touched since then are still unknown to VIAF (as an estimate, about 15,000 GND records for persons created in early summer 2012 are not represented in VIAF). [7 Jun. 2015, Gymel]
        • Update: GND 1022446479 added to VIAF:59099151 on 2015-07-12.
    5. In some cases VIAF clusters get deleted instead of merged
      1. Åke Blomström (Q270863): VIAF 228866914; taken care of by KrBot
    6. In rare cases VIAF clusters are reused for a different item
      1. William of Ockham (Q43936): VIAF:41835567 in 2015
        1. Lorenzo Traversagni (Q18674108): VIAF:41835567 in 2016
        2. William of Ockham (Q43936): VIAF:262145669298005170004 (created 2016-02-28)
      2. Drew Fudenberg (Q1258707): till 2016-09-25 VIAF:59183378 (now: "First National Bank of Lynn")

    Allowed qualifiers

    See: property constraint (P2302)

    • Duplicate
      • Please mark preferred GND with "preferred" rank (see Help:Ranking)

    Subject headings

    For subject headings, the GND uses quasi-synonyms (Q2122467), which is useful for libraries but does not fit Wikidata.


    See also

    Format “|(1[012]?\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X])”: value must be formatted using this pattern (PCRE syntax). (Help)
    List of violations of this constraint: Database reports/Constraint violations/P227#Format, hourly updated report, SPARQL, SPARQL (new)
    Conflicts with “instance of (P31): Wikimedia disambiguation page (Q4167410), Wikimedia list article (Q13406463)”: this property must not be used with the listed properties and values. (Help)
    Exceptions are possible as rare values may exist. Known exceptions: Kingdom of Granada (Q1495642)
    List of violations of this constraint: Database reports/Constraint violations/P227#Conflicts with P31, SPARQL, SPARQL (new)
    Single value: this property generally contains a single value. (Help)
    Exceptions are possible as rare values may exist.
    List of violations of this constraint: Database reports/Constraint violations/P227#Single value, SPARQL, SPARQL (new)
    Scope is as main value (Q54828448), as references (Q54828450): the property must be used only in the specified way. (Help)
    Exceptions are possible as rare values may exist.
    List of violations of this constraint: Database reports/Constraint violations/P227#scope, SPARQL, SPARQL (new)
    Conflicts with “instance of (P31): Wikimedia category (Q4167836)”: this property must not be used with the listed properties and values. (Help)
    List of violations of this constraint: Database reports/Constraint violations/P227#Conflicts with P31, hourly updated report, search, SPARQL, SPARQL (new)
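
    The format constraint can also be checked offline before a value is added. The following is a minimal Python sketch, not the checker that Wikidata itself runs: it applies the pattern from the "Allowed values" line with `re.fullmatch`. The function name `is_valid_gnd_format` is made up for illustration. The sample values are examples listed on this page, plus the truncated value 11876980 that is reported as invalid further down.

```python
import re

# Pattern copied from the "Allowed values" line of this page.
GND_PATTERN = re.compile(
    r"1[012]?\d{7}[0-9X]|[47]\d{6}-\d|[1-9]\d{0,7}-[0-9X]|3\d{7}[0-9X]"
)


def is_valid_gnd_format(value: str) -> bool:
    """Return True if the whole string matches the P227 format pattern.

    This checks only the shape of the identifier. It says nothing about
    whether the record exists or whether it is a Tn (name) record.
    """
    return GND_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    samples = ["4079154-3", "118604740", "1200055586", "1026669-0", "11876980"]
    for candidate in samples:
        print(candidate, is_valid_gnd_format(candidate))
```

    A value that passes this check can still be wrong for the item, so the check is only a first filter.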
    Error reports

    For Persons, see de:Wikipedia:GND/Fehlermeldung. Hints for other types at User:Jneubert/GND_errors

    Ideally this information should be moved to the item for the reference and be transcluded from there.
    Error type | Item(s) affected | Description/Duplicates | Exception | Reported | Resolved (also unpublished) | Resolved and published | Wikidata updated


    Deprecate Tn claims; which "reason for deprecation" qualifier?

    There are meanwhile more than 1000 Tns in Wikidata again, so apparently users are still importing those ones from somewhere. I can batch-deprecate all of them to avoid another re-import, and add a reason for deprecation (P2241) qualifier. However, which value would be appropriate for this qualifier? Simply incorrect identifier value (Q54975531) would be possible, but somewhat meaningless. Do we have something more specific? —MisterSynergy (talk) 20:21, 1 August 2019 (UTC)

    I believe that Tns don’t fit any of the criteria in Help:Deprecation, so in theory they should just be deleted periodically. But as you said, they keep being added. The problem with incorrect identifier value (Q54975531) is that the wording does not convey any information about the real reason for deprecation to an unfamiliar user. Wouldn’t it be best to create a new item for “undifferentiated [in the sense of GND or LCCN]” and use this? --Emu (talk) 21:49, 1 August 2019 (UTC)
    Shouldn't be necessary since DNB announced yesterday the deletion of all Tn records by June 16, 2020. -- Gymel (talk) 19:19, 29 August 2019 (UTC)
    I support Emu's idea to create a new item for "undifferentiated". Even if the DNB deletes all Tns in June 2020 (which I doubt), these IDs will still remain in other databases for years, if not for decades. However, the qualifier should only be added if a source is given; otherwise I would just delete the Tn. BTW: The list Wikidata:WikiProject Authority control/Tn is quite helpful for maintenance. --Kolja21 (talk) 00:00, 1 September 2019 (UTC)

    @MisterSynergy, Kolja21, Emu, Gymel:, such tool editing would be very helpful, see my comment in section "#Duplicates". Deleting all or deprecating all would help to see real duplicates via SPARQL. GND real duplicates are bad, because they also result in VIAF duplicates. MrProperLawAndOrder (talk) 19:35, 11 May 2020 (UTC) / Linkfix Property talk:P227#Duplicates. --Kolja21 (talk) 20:42, 11 May 2020 (UTC)

    The Tn mentioned in #Duplicates was inserted via WE-Framework gadget from ruwiki [2] @Chath: do you know why the GND 156791218 was inserted? It is a Tn/placeholder/name id, not a regular person id. MrProperLawAndOrder (talk) 21:24, 11 May 2020 (UTC)

    (careful) import from VIAF?

    Hi everyone, I was at a conference the last couple of days and some people mentioned that the GND coverage on Wikidata is a bit on the low side. For other countries I sometimes import links based on VIAF (example) and I could do the same for GND.$5 gives an overview of potential candidates. After reading through this page I see I have to do some filtering:

    1. Check if the GND entry actually exists (viaf seems to contain dead links)
    2. Check if the GND entry is of type person and not of type name

    Do you think this is a good plan? Has this been tried before? Do I need to apply additional filtering to prevent errors? Multichill (talk) 11:55, 30 November 2019 (UTC)

    Sounds great! A challenging project. User:Magnus Manske is one of the experts. AFAIK a third filter is needed. There are old GNDs with a dash. Example: GND 4029236-8 for Cairo (= VIAF DNB-040292363). If you start with GND, Type p (person) you can ignore this problem (it only affects corporate bodies and geographical place names). --Kolja21 (talk) 16:47, 30 November 2019 (UTC)
    Well, if you had asked before the recent mass VIAF import took place, I would have supported this idea. Now I am not sure, as there were plenty of wrong VIAF identifiers imported recently. For persons, VIAF does pretty aggressive automatic matching of its clusters to existing Wikidata items, based on "same name + same year of birth" comparisons, which results in way too many wrong matches. That said, it would probably be safe to import GNDs from VIAF clusters about anything except humans. —MisterSynergy (talk) 17:21, 30 November 2019 (UTC)
    How about a direct import? In VIAF clusters, the GND part usually has quite a lot of detail. Someone might already be doing that with the subset of economists. --- Jura 17:29, 30 November 2019 (UTC)
    Could you prepare a random sample import set of about 500 GND? We could check them for systematic and specific problems before the big import. Not a big fan of the last VIAF import either, almost all the changes on my watchlist were faulty. --Emu (talk) 17:56, 30 November 2019 (UTC)
    @Kolja21, MisterSynergy, Jura1, Emu: I did a small test run. Please have a look.
    As a general remark: If a link is incorrect, please don't remove it, but set it to rank deprecated with reason for deprecation (P2241) set to applies to other person (Q35773207). This avoids re-introducing mistakes. Multichill (talk) 19:54, 2 December 2019 (UTC)
    I've checked 10 edits: 8 are good (some of them were even missing on German WP), 2 were wrong:
    Imho even the wrong edits are helpful if they are marked as "rank deprecated" since many editors on Wikidata do the same kind of import. BTW: Is there a list like "the 500 most common names"? The bot could ignore persons with these names or put these edits on a separate list for "please check"? --Kolja21 (talk) 22:08, 2 December 2019 (UTC)

    @Multichill: could your bot for humans add VIAF ID if GND ID and DtBio ID (P7902) are equal and present and VIAF missing? One can reach the VIAF cluster via GND ID, e.g. for P227=P7902=1047557762 the link is MrProperLawAndOrder (talk) 22:36, 23 May 2020 (UTC)

    I'm pretty sure I didn't continue this because the error rate was too high. Not sure. I have no plans to work on this anytime soon. Multichill (talk) 08:47, 24 May 2020 (UTC)
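
    The two checks proposed at the start of this section could be sketched as follows. This is a hypothetical outline in Python, not Multichill's bot: the `Candidate` record, its field names, and the idea that existence and record type were already looked up beforehand are all invented for illustration. Only the two rules come from the discussion: skip GND values that do not resolve, and keep only type Tp (person), never Tn (name). Restricting to Tp also sidesteps the dashed-ID problem mentioned by Kolja21, which only affects corporate bodies and place names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Candidate:
    """One GND value proposed for an item via a VIAF cluster (hypothetical shape)."""
    qid: str        # Wikidata item ID (placeholder values in the demo below)
    gnd: str        # GND identifier taken from the VIAF cluster
    exists: bool    # result of a prior lookup: does the GND record resolve?
    gnd_type: str   # result of a prior lookup: "Tp" (person), "Tn" (name), ...


def filter_person_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only candidates that pass both checks from the discussion."""
    kept = []
    for c in candidates:
        if not c.exists:
            # Check 1: VIAF seems to contain dead GND links.
            continue
        if c.gnd_type != "Tp":
            # Check 2: only differentiated persons, never Tn placeholders.
            continue
        kept.append(c)
    return kept


if __name__ == "__main__":
    # GND values taken from the Stan Lauryssens example on this page;
    # the item IDs and the third entry are placeholders.
    demo = [
        Candidate("Q_EXAMPLE_1", "1019646128", True, "Tp"),
        Candidate("Q_EXAMPLE_1", "122968751", True, "Tn"),
        Candidate("Q_EXAMPLE_2", "100000000", False, "Tp"),
    ]
    for c in filter_person_candidates(demo):
        print(c.qid, c.gnd)
```

    As the replies above point out, passing both checks does not rule out a wrong match when VIAF has merged two persons with the same name and year of birth, so a manually reviewed sample is still needed.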

    Create GND humans from Deutsche Biographie

    I am currently creating humans that are in the DtBio database; all have P21. The QS tool seems to have a bug, and duplicates are created. I will use SPARQL to find the duplicates and a QS batch to merge them (batch description "P7902 merge"). During the process, however, the "human" section of Property talk:P227/Duplicates could be less useful.

    The items don't have much data yet, the next things to add is VIAF and time information. But instead of using DtBio for time information I would prefer to get that data from the GND website, which should be more up to date. VIAF IDs could come from GND site too, or some of the VIAF bots can do it. MrProperLawAndOrder (talk) 18:10, 19 May 2020 (UTC)

    Progress report 2020-05-23

    400000 humans now exist that have P7902 and P227; 100000 of them are new. Each of the 400000 has a value for P21, which was also added to many long-existing items. GND duplicates were down to 10 [3], of which a further two were fixed, and one wrong GND assignment was resolved. Several merges added new name forms to longer existing items. I got several notifications due to new links to the new items. I also saw bots and users adding new data to the items.

    Now running a second import of about 60000. Each has P21 - as before - and a value in at least one of the fields date of birth or date of death on the DtBio website. This adds many living artists, including many from MrProperLawAndOrder (talk) 16:04, 23 May 2020 (UTC)

    Please complete the items before creating more of them. Items with merely a GND identifier are mostly useless and saturate matching for any other application. --- Jura 09:23, 24 May 2020 (UTC)
    Also see User_talk:Mike_Peel#Matching_existing_wikidata_objects_with_unconnected_articles (diff). --M2k~dewiki (talk) 10:23, 24 May 2020 (UTC)
    As mentioned 19 May "the next things to add is VIAF and time information". GND is CC0 and offers a linked data service, so additional information is ready to be added by bots. I am working on that. MrProperLawAndOrder (talk) 15:09, 24 May 2020 (UTC)

    Query (all items above Q95000000 should belong to this import):

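    # Items whose GND ID equals their Deutsche Biographie ID
    # and that have neither a date of birth nor a date of death
    SELECT ?person ?gnd
    WHERE {
      ?person wdt:P227 ?gnd .
      ?person wdt:P7902 ?gnd .
      MINUS { ?person wdt:P569 ?b . }
      MINUS { ?person wdt:P570 ?d . }
    }
    ORDER BY DESC(?person)

    --Epìdosis 11:03, 24 May 2020 (UTC)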

    Can we flag them somehow, similar to ORCID ones. --- Jura 11:50, 24 May 2020 (UTC)
    @Epìdosis: How can you know, especially if so many other items are shown in your query, demonstrating that GND-DtBio items without P569 and P570 have existed for a long time? MrProperLawAndOrder (talk) 14:15, 24 May 2020 (UTC)

    Query limited to instance of (P31) human (Q5):

    # Same as the query above, restricted to humans
    SELECT ?person ?gnd
    WHERE {
      ?person wdt:P227 ?gnd .
      ?person wdt:P7902 ?gnd .
      MINUS { ?person wdt:P569 ?b . }
      MINUS { ?person wdt:P570 ?d . }
      ?person wdt:P31 wd:Q5 .
    }
    ORDER BY DESC(?person)

    --Epìdosis 15:34, 24 May 2020 (UTC)
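
    A query like the one above can also be run outside the browser. Here is a minimal Python sketch that only builds the request URL, assuming the public endpoint https://query.wikidata.org/sparql with its `query` and `format` URL parameters. `build_wdqs_url` is an illustrative helper name, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (assumed here; check the WDQS manual).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Humans whose GND ID equals their DtBio ID and that lack both P569 and P570.
QUERY = """
SELECT ?person ?gnd
WHERE {
  ?person wdt:P227 ?gnd .
  ?person wdt:P7902 ?gnd .
  ?person wdt:P31 wd:Q5 .
  MINUS { ?person wdt:P569 ?b . }
  MINUS { ?person wdt:P570 ?d . }
}
ORDER BY DESC(?person)
"""


def build_wdqs_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": query.strip(), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_wdqs_url(QUERY))
```

    The resulting URL can be fetched with any HTTP client. A descriptive User-Agent header is advisable for automated requests.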

    Progress report 2020-05-24

    475000 humans exist that have P7902 and P227. Each has a value for P21. Duplicates created due to the bug in QS have all been merged; Property_talk:P227/Duplicates#Human is at 6. @Jura1, M2k~dewiki, Epìdosis: Adding more information is high on the priority list. As mentioned 19 May "the next things to add is VIAF and time information". GND is CC0 and offers a linked data service, so additional information is ready to be added by bots. As mentioned 23 May, the new items have already proved useful to others who linked them. I am working with Kolja21 on DtBio; we made huge progress, fixing hundreds of wrong assignments of IDs and creating new items to disambiguate. MrProperLawAndOrder (talk) 15:07, 24 May 2020 (UTC)

    • Please do not create any further items before time information is added. Items that merely have identifiers are useless to Wikidata. We can all find identifiers elsewhere if we want them. --- Jura 15:14, 24 May 2020 (UTC)
      @MrProperLawAndOrder: Thank you for the import. In less than one week (hopefully) I can work with @Bargioni: to import from GND ID (P227) the date of birth (P569) and/or the date of death (P570) for all the items not already having them. I agree about waiting with the creation of new items until the existing ones have these data added. --Epìdosis 15:15, 24 May 2020 (UTC)
      @Epìdosis: please explain the benefit of waiting with the creation of new items. Other tools need them as a basis; also note that items without b/d existed before. It is one task to initialize the items and there are several other tasks to enrich them. Maybe Kolja21 can share more about DtBio humans, but AFAICT they are high value items, high quality - GND is not just any other identifier but controlled by DNB and gives access to CC0 LOD (linked open data). DtBio humans are a subset of GND humans that are found on dozens of third party websites in Germany. And of the DtBio, I chose yet another subset.
      It could be helpful if the enrichment work would be coordinated and not restricted to DtBio humans but performed for all GND humans. DtBio is just a subset.
      For me personally it would be easier to run each enrichment task only once. If creation of more DtBio humans is postponed, then the tasks have to be run again.
      MrProperLawAndOrder (talk) 15:46, 24 May 2020 (UTC)
      @MrProperLawAndOrder: OK, it seems to make sense. Could you report here when you think to have finished imports for a while, so that we can concentrate on adding dates of birth/death? Also for me and Bargioni "it would be easier to run each enrichment task only once" :) --Epìdosis 15:53, 24 May 2020 (UTC)
      I don't agree to that approach. The items already saturate other tasks and Wikidata capacity is limited. If you continue the import, I will ask for a reblock. --- Jura 15:57, 24 May 2020 (UTC)
      These properties are marked as Wikidata property for an identifier that suggests notability (Q62589316). I have seen a lot of questionable imports in Wikidata - some items are even created without a label - but imho MrProperLawAndOrder does great work. The new items he added are used for maintenance work and help to track family relationships and to track down duplicates and incorrect life data. --Kolja21 (talk) 16:41, 24 May 2020 (UTC)
      @Kolja21, MrProperLawAndOrder: Exactly: these items are certainly notable and useful; of course, the more data are imported, the more useful the items are. So, has this subset of DtBio been completely imported? --Epìdosis 16:49, 24 May 2020 (UTC)
      • Let's see how the import of additional information for existing items goes before assessing this iteration of their efforts. --- Jura 16:50, 24 May 2020 (UTC)
    The existence of GND-only objects created more manual work, since new articles can not be connected automatically anymore by Pi bot: every item has to be opened manually to check the information behind the GND and see whether it describes the same person with the same year of birth/death, because this information is lacking in the newly created objects (also see User_talk:Mike_Peel#Matching_existing_wikidata_objects_with_unconnected_articles (diff)).
    In addition, when creating new GND-only-objects, it seems that already existing objects have not been taken into account and therefore now have to be merged manually, for example:
    Also see User_talk:MrProperLawAndOrder#Mathilde_Welcker_(Q94753027)_and_Mathilde_Welcker_(Q94753026)_are_identical (diff) --M2k~dewiki (talk) 16:51, 24 May 2020 (UTC)
    From my point of view, the problem with Quickstatements (?!?), which seems to create two or more identical GND-only objects in some cases, should also be analyzed and solved before creating new objects. Merging duplicates afterwards is only a workaround, not an actual solution to the initial problem. It would be better to avoid creating duplicates in the first place and solve the root cause first. --M2k~dewiki (talk) 17:01, 24 May 2020 (UTC)
    @M2k~dewiki: it is not correct that existing objects were not taken into account. You mention 9 items to be merged "manually", do you know how many I merged manually? Why don't you merge them? "The existence of GND-only objects" - this section is about items having type=human, label en/de/nl, sex, GND ID, DtBio ID; if you refer to them as "GND-only" then you are not correctly portraying them. "... created more manual work," and reduced a lot of other manual work. "since new articles can not be connected automatically anymore by Pi bot" - actually, we talk about items that have a GND; the project hosting the article should ensure that the correct GND is attached to it, and that is much safer than working on name and year of birth. Re "Mathilde_Welcker_(Q94753027)_and_Mathilde_Welcker_(Q94753026)_are_identical": this thread was created due to a bug in QS and has been solved; you added unrelated information. Please put any relevant information here.
    Re QS bug - I told you I have no control over the QS software and I don't see community consensus to disable the tool, but if you are interested in that, try it. It is not specific to P227/P7902 items. MrProperLawAndOrder (talk) 17:20, 24 May 2020 (UTC)
    You can't create duplicates merely because you don't want to match your import against existing items with full dates. Q93871865 had all that, but you created Q95340356. This is different from the QS bug that apparently you are checking. --- Jura 17:29, 24 May 2020 (UTC)
    I can as you can and did. I am doing high quality work, checking against "items with full dates" is not a measure to prevent duplicates. And nine duplicates when 175000 new items have been created is much better than your BLKÖ rate I guess. MrProperLawAndOrder (talk) 01:09, 25 May 2020 (UTC)

    Regarding Quickstatements also see

    --M2k~dewiki (talk) 17:42, 24 May 2020 (UTC)

    Progress report 2020-05-25

    There are still items having P7902 but no P227. Before bots run on the DtBio humans these should be reviewed. Magnus Manske and Thierry Caro inserted many wrong P7902 via some tools, e.g. [4]. @Epìdosis: you asked "has this subset of DtBio been completely imported" - not yet, but Jura1 opposes, so I halted it. 475000 DtBio humans exist in WD; the DtBio database contains 764044 GND-IDed humans (they still have some humans without GND). WD on 2020-05-16 had 695656 GND humans (300000 thereof in DtBio); as of today WD has 868476 GND humans. I don't know how to proceed now. To create claims about parents, spouses, children the target items have to exist first. When parsing the GND LOD (linked open data) and updating the items, it could be more economical to do several item updates at once. I have the impression that one can add several statements and only get one new item revision - this could reduce hardware usage and speed up the whole process. MrProperLawAndOrder (talk) 02:54, 25 May 2020 (UTC)
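
    On adding several statements in one revision: the Wikibase action API has a `wbeditentity` module that takes a JSON `data` parameter, and several claims can be placed in it. A minimal Python sketch of building such a request body follows. It assumes string-valued properties such as external identifiers. `string_claim` and `build_edit_params` are illustrative helper names, the item ID in the demo is a placeholder, and token handling and the HTTP request itself are left out. Whether this really reduces load in practice is not verified here.

```python
import json
from typing import Dict, List


def string_claim(prop: str, value: str) -> dict:
    """Build one statement with a plain string value (e.g. an external ID)."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def build_edit_params(qid: str, claims: List[dict], summary: str) -> Dict[str, str]:
    """Bundle all claims into the parameters of a single wbeditentity call.

    A CSRF token ("token") is also required by the API; obtaining it
    is omitted in this sketch.
    """
    return {
        "action": "wbeditentity",
        "id": qid,
        "data": json.dumps({"claims": claims}),
        "summary": summary,
        "format": "json",
    }


if __name__ == "__main__":
    # The value 1047557762 is the P227 = P7902 example mentioned on this page;
    # "Q_PLACEHOLDER" stands in for a real item ID.
    params = build_edit_params(
        "Q_PLACEHOLDER",
        [string_claim("P227", "1047557762"), string_claim("P7902", "1047557762")],
        "add GND ID and DtBio ID together",
    )
    print(params["data"])
```

    Both statements travel in one request here, which is the point of the sketch; references and qualifiers would need additional keys in each claim.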

    Progress report 2020-05-26

    I halted the import due to claims by user:M2k~dewiki and attacks and outright threat by user:Jura1. But there is still progress, since user:M2k~dewiki creates new items, e.g. Holger Dietrich having Lde, P31, P227 (with ref imported). One can easily see that compared to the DtBio creations done before: (Lde, Len, Lnl, P31, P21, P227, P7902) such an item has fewer statements. @M2k~dewiki: if you create new GND humans, would it be possible for you to add sex? Maybe add Len, since the majority of users looks at WD using the English interface? How will P7902 be added? MrProperLawAndOrder (talk) 06:52, 26 May 2020 (UTC)

    Progress report 2020-05-30

    (section added after the one for 2020-06-01 was created; status, unless noted otherwise, as of 2020-05-30): No evidence found for the claimed breaking of tools; on the contrary, the new items could in theory prevent wrong article attachments by showing that other humans having the same name etc. exist. All humans known to be female added. Several humans known to be male added. 1 million GND humans reached, which makes GND the third most used ID, after VIAF and ISNI and not counting ORCID, followed by LC. 2020-05-31: Some cases found via Property talk:P227/Duplicates where the GND was changed by KrBot due to mergers in the GND DB, resulting in duplicated GND IDs in WD; the IDs have also been changed on the DtBio site. Noted at Talk:Q1202222#Updating during 2020-05, where user:Kolja21 added an observation that one redirect works on the DtBio site but not in GND itself. These ID changes before the publication of a new GND dump make me a bit reluctant to add new items.

    Statistics 2020-06-01 - Most used VIAF-related IDs on instances of humans
    Property           | Items   | Distinct IDs | ID claims
    VIAF (P214)        | 1693076 | 1714325      | 1718597
    ISNI (P213)        | 1057262 | 1064999      | 1066499
    GND (P227)         | 1002043 | 1003072      | 1003084
    LC (P244)          | 965085  | 966336       | 966532
    DtBio (P7902)      | 608743  | 608833       | 608835
    NTA (P1006)        | 474457  | 475677       | 476179
    BNF (P268)         | 423979  | 428027       | 428052
    CBDB (P497)        | 421578  | 422650       | 422652
    IdRef/SUDOC (P269) | 420612  | 422415       | 422605
    BNE (P950)         | 156891  | 158719       | 158790

    MrProperLawAndOrder (talk) 12:32, 1 June 2020 (UTC)

    @MrProperLawAndOrder: Very good, this should probably be copied also in Wikidata:VIAF/partner :) --Epìdosis 13:36, 1 June 2020 (UTC)
    User:Epìdosis, not sure which subpage, but a page Wikidata:VIAF/type/human could be helpful for various stuff, so I created it and copied the table there. Curious how the numbers will be after the current DtBio enrichment round. MrProperLawAndOrder (talk) 14:26, 1 June 2020 (UTC)

    Progress report 2020-06-01

    @Bargioni: is now starting his import of statements from GND. In this first phase, consisting of about 616k statements through QuickStatements (plus their references), the import will cover only the items recently created by @MrProperLawAndOrder: and will add, if available, date of birth (P569), date of death (P570), VIAF ID (P214) and ISNI (P213) referenced from GND ID (P227). A second phase has already been scheduled with the import of other statements (occupation (P106), kinships, languages spoken, written or signed (P1412) etc.). --Epìdosis 08:57, 1 June 2020 (UTC)

    The import of DtBio helped to find some invalid GNDs, see Wikidata:Database reports/Constraint violations/P227. Reason: In some rare cases DtBio made drag&drop errors like GND 11876980 instead of GND 118769804. Example: Emil Wohlwill (Q95730). --Kolja21 (talk) 12:47, 1 June 2020 (UTC)

    Edoderoobot adding false values for P244 [5] MrProperLawAndOrder (talk) 18:42, 1 June 2020 (UTC)

    @Epìdosis: no VIAF added [6] [7] is this because of Edoderoobot having inserted a value in P214? Everywhere where someone else added a value recently, the QS batch will not add a value+reference? MrProperLawAndOrder (talk) 20:35, 1 June 2020 (UTC)

    @MrProperLawAndOrder: No, such additions by Edoderoobot, correct or incorrect, do not influence my QS batches; in the three cases you mention, no VIAF was added just because GND contained no VIAF; I also guess that the wrong additions by Edoderoobot were due to some error of programming which managed badly the cases where no VIAF existed and, instead of skipping these items, added this incorrect value. --Epìdosis 20:40, 1 June 2020 (UTC)
    And Wilhelm Bader sen. (Q2571811) seems to be a GND duplicate. --Epìdosis 20:46, 1 June 2020 (UTC)

    Newly created duplicates BLKÖ via quickstatements without batch number

    [8], user:Jura1, could you run your QS command in a way that makes them easier to review? How did you check to not create duplicates? MrProperLawAndOrder (talk) 01:03, 25 May 2020 (UTC)

    Also, on the item above you removed "Ritter" from the name given in the BLKÖ article title, but on [9] you keep "Gräfin". What mechanism did you use? Since you asked others to "complete" their items before creating new ones, whilst they have a plan to enrich them, why did you not complete this one and not add 2x VIAF, 2x GND, 1x ISNI and do you even have a plan to enrich your items with authority control numbers? A GND for the Gräfin is stated in the Wikisource article about her since 2012 [10]. MrProperLawAndOrder (talk) 03:20, 25 May 2020 (UTC)

    • As far as P227 is concerned, existing items with GND had been linked. Adding GND or GND-based IDs to newly created items is currently not a priority, but some other things are being done (unrelated to Property talk:P227).
    Contrary to the 160000 GND only items, all items already have additional information at Wikimedia. I don't expect @Bargioni: to complete them for me [11]. --- Jura 10:03, 25 May 2020 (UTC)
    @Jura1: Anyway, work in progress. Unfortunately we have to access GND a lot of times to grab dates. -- Bargioni 🗣 10:29, 25 May 2020 (UTC)
    @Bargioni: what do you mean by work in progress, did you already start? If so, what exactly are you importing from GND. There is much more to obtain than only birth and death information. MrProperLawAndOrder (talk) 12:59, 25 May 2020 (UTC)
    @Jura1: can you answer the question regarding your system for keeping Gräfin but deleting Ritter? MrProperLawAndOrder (talk) 12:56, 25 May 2020 (UTC)
    @Jura1: if you don't bother importing the high quality identifier GND could you at least add VIAF and sex? You are increasing the number of constraint violations. See Property_talk:P1818#New_items_without_sex,_GND,_VIAF. MrProperLawAndOrder (talk) 13:03, 25 May 2020 (UTC)
    I think I answered as far as GND is concerned. --- Jura 13:27, 25 May 2020 (UTC)
    @Jura1: GND was not the concern, VIAF and missing sex were. You created several new constraint violations by adding humans without sex. MrProperLawAndOrder (talk) 14:03, 25 May 2020 (UTC)
    If it wasn't you adding GND entries, it must be me ;) --- Jura 14:14, 25 May 2020 (UTC)
    @Jura1: will you fix the missing sex values on items that you created that cause new constraint violations? MrProperLawAndOrder (talk) 23:03, 25 May 2020 (UTC)

    GND saturation of Wikidata

    GND-only items currently saturate almost every other application. Given that we have more than 160,000 items with merely GND IDs, can we see an outline of how this will be fixed?

    According to MrProperLawAndOrder (see talk page of @Mike Peel:), they count on @Bargioni: (or @Epìdosis:) to fix it for them [12]. --- Jura 10:03, 25 May 2020 (UTC)

    @Jura1: you are aware of the fact that your claim about me is a personal attack? I never said what you claim. MrProperLawAndOrder (talk) 06:28, 26 May 2020 (UTC)
    @Jura1: In less than one week dates of birth and death will be imported by Bargioni from GND ID (P227). --Epìdosis 10:06, 25 May 2020 (UTC)
    Given the number of items, it seems unlikely that this can be done in a week, but I think we can hold that long. --- Jura 10:13, 25 May 2020 (UTC)
    @Jura1: Work in progress. We have to access GND a lot of times to grab dates. If more info is available, I'll grab it too. -- Bargioni 🗣 10:35, 25 May 2020 (UTC)
    Didn't they have downloadable dump? It might be easier to just create new items from scratch and nuke the others. --- Jura 10:38, 25 May 2020 (UTC)
    @Jura1: importing from the most recent dump would mean to import information that is already outdated. MrProperLawAndOrder (talk) 13:22, 25 May 2020 (UTC)
    @Jura1: Can you provide a source for the claim in your first sentence? MrProperLawAndOrder (talk) 12:54, 25 May 2020 (UTC)
    @Jura1: reminder. MrProperLawAndOrder (talk) 14:05, 25 May 2020 (UTC)
    @Jura1: reminder. MrProperLawAndOrder (talk) 23:01, 25 May 2020 (UTC)
    • Let's see what @Bargioni: thinks of the dump approach. --- Jura 13:28, 25 May 2020 (UTC)
      • @Jura1: I'm in trouble with my home (due to covid lockdown) network connection. The provider... :-( Anyway, please add a link to the GND dump: I'll try to evaluate it against the one by one record approach I was thinking to use. Thx. -- Bargioni 🗣 14:16, 25 May 2020 (UTC)
        • @Bargioni, Jura1: Here all the dumps. --Epìdosis 14:53, 25 May 2020 (UTC)
          • @Epìdosis, Bargioni, Jura1: what is this about? I avoided importing time information from outdated data sources to ensure best quality and now you are planning on using old dumps? MrProperLawAndOrder (talk) 23:06, 25 May 2020 (UTC)
            • @MrProperLawAndOrder: I think that it is highly improbable that the data have worsened in two months; the only problem may be some missing death dates. Anyway, my message was just meant to show the existence and location of these dumps. I've now spoken with Bargioni and he told me that he is not going to use dumps, but to retrieve single GND entries, as first said. Of course it will be a long process, given that we are speaking of tens of thousands of entries. Bye, --Epìdosis 08:08, 26 May 2020 (UTC)

    Wrong gender imported from GND

    Can you repair this: person is obviously female. --- Jura 10:13, 25 May 2020 (UTC)

    More examples are Q95335213, Q95338703, Q95339302, Q95350061, Q95349834, Q95348529, Q95349608, Q95350494. DNB seems to have wrong gender data (male instead of female) in at least some cases, even if they show the right (female) form of occupation (e.g. "Schriftstellerin" instead of "Schriftsteller"). --M2k~dewiki (talk) 12:12, 25 May 2020 (UTC)
    @Jura1: it's a wiki, you can deprecate the statement and provide what you think is correct. MrProperLawAndOrder (talk) 13:20, 25 May 2020 (UTC)

    Seen and fixed on other items before. A way to have the information available and prevent re-import is to deprecate the statement instead of removing [13]. Take and give: WD takes from GND but also can give back. @Kolja21, M2k~dewiki: what do you think? Of course one can also compare whole lists of items. Regarding occupation information: Not always do they have only the female occupation, have seen at least having female and male form, where I wondered if the person has changed sex during their life. MrProperLawAndOrder (talk) 13:18, 25 May 2020 (UTC)

    Apparently it's a systematic issue (cf. above the report by another user). Apparently you are aware of the quality issue. Please double check your import and fix the problems we identified. --- Jura 13:25, 25 May 2020 (UTC)
    A possible additional plausibility check for all data could be a check for the first name. While same first names are possible for both genders (e.g. "Andrea" could be male Q18177306 or female Q18177321, most first names are unique for one gender, e.g. Rosemarie Q18087887, Ursula Q1087262, ...). This might give a list with items which should be checked. --M2k~dewiki (talk) 13:28, 25 May 2020 (UTC)
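A minimal sketch of the plausibility check described above, assuming each record is a plain dictionary with `first_name`, `gender` and `occupations` keys. The name tables and the rule that an occupation ending in "in" hints at a feminine form are illustrative assumptions, not GND or Wikidata features; ambiguous names such as "Andrea" deliberately give no hint. The output is only a list of candidates for manual review.

```python
# Hypothetical lookup tables; a real check would use given-name items from Wikidata.
FEMALE_NAMES = {"Rosemarie", "Ursula"}
MALE_NAMES = {"Wolfgang", "Johann"}


def gender_hints(record):
    """Collect weak hints about gender from the first name and occupation forms."""
    hints = set()
    name = record["first_name"]
    if name in FEMALE_NAMES:
        hints.add("female")
    elif name in MALE_NAMES:
        hints.add("male")
    for occupation in record.get("occupations", []):
        # Assumption: a German occupation ending in "in" is a feminine form,
        # e.g. "Schriftstellerin". This heuristic can misfire.
        if occupation.endswith("in"):
            hints.add("female")
    return hints


def needs_review(record):
    """Return True when any hint contradicts the recorded gender."""
    return any(hint != record["gender"] for hint in gender_hints(record))


records = [
    {"first_name": "Rosemarie", "gender": "male", "occupations": ["Schriftstellerin"]},
    {"first_name": "Andrea", "gender": "male", "occupations": []},
    {"first_name": "Ursula", "gender": "female", "occupations": ["Schriftstellerin"]},
]
to_check = [r["first_name"] for r in records if needs_review(r)]
```

Records without any hint are not flagged, so the check can only narrow a list down; it cannot confirm that a value is right.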
    I think it highlights problems contributors have to take into account and that large scale imports aren't suitable for new contributors as starters. --- Jura 13:33, 25 May 2020 (UTC)
    @Jura1: before attacking high quality imports made by new users, you could for a start have a look at the constraint violations that you add and your creation of duplicated imports. MrProperLawAndOrder (talk) 13:40, 25 May 2020 (UTC)
    I think you are confusing your edits with mine. As this is the talk page of P227 it's very likely. --- Jura 13:43, 25 May 2020 (UTC)
    @Jura1: No, I am not. And to claim that I confuse something is a personal attack. Please refrain from making such statements. MrProperLawAndOrder (talk) 13:46, 25 May 2020 (UTC)
    So what's the constraint violation I added? --- Jura 13:48, 25 May 2020 (UTC)
    @Jura1: You created new humans without sex. MrProperLawAndOrder (talk) 17:09, 25 May 2020 (UTC)
    And what constraint did I violate? --- Jura 17:11, 25 May 2020 (UTC)
    @Jura1: if you see there is a system behind the issue, then please share it here. Of course you can also fix. MrProperLawAndOrder (talk) 13:35, 25 May 2020 (UTC)
    No, as any other contributor, we expect you to first try to fix your import, then, if you need help, we can try to sort it out. --- Jura 13:38, 25 May 2020 (UTC)
    @Jura1: you still didn't share the system, where is it? The high quality import was correct. Wikidata exists to store information found in reliable sources. GND is regarded as a reliable source. MrProperLawAndOrder (talk) 13:42, 25 May 2020 (UTC)
    I think M2k explained it to you. Maybe you want to do a root cause analysis for all samples given above. As long as this isn't done, the source can't be considered high quality for this property. --- Jura 13:45, 25 May 2020 (UTC)
    @Jura1: You may think so, but he didn't. Anyway, where is the system behind what you claim to be a "systematic issue"? MrProperLawAndOrder (talk) 13:48, 25 May 2020 (UTC)
    So why is the gender for the above people incorrect at GND? --- Jura 13:50, 25 May 2020 (UTC)
    @Jura1: "why" in the above questions can mean at least two kinds of questions
    1. Why is it judged to be wrong: a possible answer because it contradicts what the first name implies and what is stored in occupation - but note the occupation field could also be wrong
    2. Why was it added to GND DB despite being judged wrong: Because someone with write access to GND added it that way.
    MrProperLawAndOrder (talk) 13:57, 25 May 2020 (UTC)
    Ok, so we don't really know how GND determines gender and it's just something that "someone with write access to GND added it that way". How can that be considered "high quality"? --- Jura 14:01, 25 May 2020 (UTC)
    @Jura1: what does "that" in your last sentence refer to? The last lines were about assumed incorrect values, the first mention of "high quality" was in relation to the import. Errors in data fields at the source do not change the quality of an import. Nor do errors in some data fields mean that a whole database is not of high quality. MrProperLawAndOrder (talk) 16:08, 25 May 2020 (UTC)
    Please re-read the above. It's not an assumption the value is incorrect, we know it. We don't need low quality fields from databases even if you consider the db "high quality" (or is it your edits there?). --- Jura 16:17, 25 May 2020 (UTC)
    @Jura1: before asking others to re-read, can you re-read my question from 16:08 and answer it? MrProperLawAndOrder (talk) 17:07, 25 May 2020 (UTC)

    Should be

    # Humans recorded as female (Q6581072) whose given name is a male given name (Q12308941)
    SELECT ?person ?gnd
    WHERE { 
      ?person wdt:P227 ?gnd . 
      ?person wdt:P7902 ?gnd .
      MINUS { ?person wdt:P569 ?b . }
      MINUS { ?person wdt:P570 ?d . }
      ?person wdt:P31 wd:Q5 .
      ?person wdt:P21 wd:Q6581072 .
      ?person wdt:P735 ?firstname . 
      ?firstname wdt:P31 wd:Q12308941 .
    }
    ORDER BY DESC(?person)


    # Humans recorded as male (Q6581097) whose given name is a female given name (Q11879590)
    SELECT ?person ?gnd
    WHERE { 
      ?person wdt:P227 ?gnd . 
      ?person wdt:P7902 ?gnd .
      MINUS { ?person wdt:P569 ?b . }
      MINUS { ?person wdt:P570 ?d . }
      ?person wdt:P31 wd:Q5 .
      ?person wdt:P21 wd:Q6581097 .
      ?person wdt:P735 ?firstname . 
      ?firstname wdt:P31 wd:Q11879590 .
    }
    ORDER BY DESC(?person)

    However, it works only on items having given name (P735): probably items created in the last days don't have it yet. --Epìdosis 13:52, 25 May 2020 (UTC)

    @Jura1, M2k~dewiki: Thank you for the list of differing data. In GND gender and occupation are added separately so an actress can be male. All errors concerning articles in German WP have been corrected a few months ago. I've added the new items to this list: de:Wikipedia:GND/Fehlermeldung/Mai 2020#Todesjahr nach 1850. These errors will be corrected as well. --Kolja21 (talk) 00:32, 26 May 2020 (UTC)
      Done The GNDs with a wrong gender have been corrected. --Kolja21 (talk) 20:03, 28 May 2020 (UTC)

    Start fixing by adding given name

    @Jura1, M2k~dewiki: Do you know tools to add given name to more GND humans? MrProperLawAndOrder (talk) 17:32, 25 May 2020 (UTC) - 266056 results in 45170 ms MrProperLawAndOrder (talk) 17:48, 25 May 2020 (UTC)

    Request for evidence for claim Wrong gender imported from GND

    @Jura1: you started this section with the headline "Wrong gender imported from GND". Please provide evidence for that claim. MrProperLawAndOrder (talk) 18:07, 25 May 2020 (UTC)

    Enriching GND humans from GND database

    Re "Unfortunately we have to access GND a lot of times to grab dates. -- Bargioni 🗣 10:29, 25 May 2020 (UTC)"

    @Bargioni, Epìdosis: could you explain

    1. on which GND humans the process is running
    2. from where information is obtained
    3. what information is obtained
    4. what is added

    ? MrProperLawAndOrder (talk) 13:29, 25 May 2020 (UTC)

    1. We are working on the humans listed in this query:
      SELECT ?person ?gnd
      WHERE { 
        ?person wdt:P227 ?gnd . 
        ?person wdt:P7902 ?gnd .
        MINUS { ?person wdt:P569 ?b . }
        MINUS { ?person wdt:P570 ?d . }
        ?person wdt:P31 wd:Q5 .
      }
      ORDER BY DESC(?person)
    2. The information will be obtained from GND (GND ID (P227))
    3. We will obtain date of birth (P569) and/or date of death (P570), maybe also other information (we are reasoning about that)
    4. We will add date of birth (P569) and/or date of death (P570) whenever they have day precision, month precision or year precision; other information (e.g. occupation (P106)) will maybe be added in the next weeks

    --Epìdosis 13:47, 25 May 2020 (UTC)
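Point 4 depends on telling day, month and year precision apart. A minimal sketch, assuming the date arrives as a string of the form `YYYY`, `YYYY-MM` or `YYYY-MM-DD` (that input shape is an assumption about the extracted data, not something stated here). It produces the time notation used by QuickStatements, where precision 9 means year, 10 month and 11 day; anything else, such as "ca. 20. Jh.", is skipped.

```python
import re

# Accepts YYYY, YYYY-MM or YYYY-MM-DD; the accepted input shapes are an assumption.
DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?")


def to_quickstatements_time(value):
    """Convert a date string to QuickStatements time notation, or None if unusable."""
    match = DATE_PATTERN.fullmatch(value.strip())
    if match is None:
        return None  # vague or free-text dates are not imported
    year, month, day = match.groups()
    if day:
        precision = 11  # day
    elif month:
        precision = 10  # month
    else:
        precision = 9   # year
    return f"+{year}-{month or '00'}-{day or '00'}T00:00:00Z/{precision}"


year_only = to_quickstatements_time("1906")
full_date = to_quickstatements_time("1906-05-12")
vague = to_quickstatements_time("ca. 20. Jh.")
```

Returning `None` for anything that does not match keeps approximate life dates out of the batch rather than guessing a precision for them.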

    @Epìdosis: RE 1 from which URL do you read? MrProperLawAndOrder (talk) 13:50, 25 May 2020 (UTC)
    I guess from the RDF data of each ID (e.g. for, but I'm honestly not sure, because I'm not able to do such imports, while @Bargioni: is :) --Epìdosis 13:56, 25 May 2020 (UTC)
    @Epìdosis: can you ask Bargioni? That place also contains VIAF and if available ISNI and relationships to other humans. MrProperLawAndOrder (talk) 14:00, 25 May 2020 (UTC)
    @MrProperLawAndOrder, Bargioni: Good idea, we can add VIAF ID (P214) and ISNI (P213); we will have a look at genealogies. Probably we will start working on GND tomorrow. --Epìdosis 14:06, 25 May 2020 (UTC)
    Is there a way to import GND's reference as well? --- Jura 14:12, 25 May 2020 (UTC)
    @Jura1: Obviously statements will have references to GND like the ones you can see in Johann Friedrich Wilhelm Dornheim (Q94690240) to FAST or VIAF; @MrProperLawAndOrder: we will add VIAF and ISNI whenever present. --Epìdosis 14:28, 25 May 2020 (UTC)
    @Epìdosis: That's not exactly what I had in mind. GND has (or had) that nice, but somewhat complicated feature, that, as a tertiary reference, it stored the reference for its information (it used to be a code that could be decoded with some other list). --- Jura 14:32, 25 May 2020 (UTC)
    @Jura1: OK, now I understand: of course it is good that GND stores references for its statements. However, importing them in our references would probably require creating some new items and possibly other problems. For this reason, we prefer, at least for now, referencing imported information to GND; in the future it will obviously be possible, with more time available (now, as you justly note, it is crucial to add fundamental information such as birth/death dates as soon as possible), extracting also references listed in GND. Thank you very much for the suggestion! --Epìdosis 14:41, 25 May 2020 (UTC)
    I think the number of such sources is rather limited (it could be ADB or BLKÖ) and allows to determine the quality of DNB. I agree that the priority should be the dates. References with them would be nice. --- Jura 15:04, 25 May 2020 (UTC)
    The list of sources seems quite long indeed. --Epìdosis 15:32, 25 May 2020 (UTC)
    @Epìdosis: and nobody has explained here how that could be useful. WD stores references for individual statements. MrProperLawAndOrder (talk) 15:39, 25 May 2020 (UTC)
    Statements ideally have references. Wikipedia can be included in the reference section as a source, but it isn't considered a reference. The same goes for any other tertiary source. --- Jura 16:07, 25 May 2020 (UTC)
    Still no indication that "references for individual statements" do exist, because only that can be stored in WD. MrProperLawAndOrder (talk) 17:03, 25 May 2020 (UTC)

    Reading LDS from DNB website

    Storing LDS

    @Epìdosis: will the whole LDS file be stored for later extraction of information? MrProperLawAndOrder (talk) 14:23, 25 May 2020 (UTC)

    @MrProperLawAndOrder: I'm not sure what you mean, but (if I understand correctly) I think we can do it, if you are interested. --Epìdosis 14:28, 25 May 2020 (UTC)
    @Epìdosis: it's the proper way of doing it. It was said above "Unfortunately we have to access GND a lot of times to grab dates.", if one finds an error in the process of writing to WD one can then go back to local dump instead of reading from DNB website again. MrProperLawAndOrder (talk) 14:51, 25 May 2020 (UTC)
    OK, I understand. --Epìdosis 14:57, 25 May 2020 (UTC)
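The idea of keeping a local copy so that a failed write does not force another read from the DNB website could be sketched as below. The `fetch` callable stands in for whatever actually retrieves a record; it, the JSON file layout and the placeholder ID are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_record(gnd_id, cache_dir, fetch):
    """Return the record for gnd_id, reading the local copy when one exists."""
    path = Path(cache_dir) / f"{gnd_id}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    record = fetch(gnd_id)  # only reached on a cache miss
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record), encoding="utf-8")
    return record


# Demonstration with a stand-in fetcher that records how often it is called.
calls = []


def fake_fetch(gnd_id):
    calls.append(gnd_id)
    return {"gnd": gnd_id, "note": "placeholder record"}


with tempfile.TemporaryDirectory() as tmp:
    first = load_record("example-id", tmp, fake_fetch)
    second = load_record("example-id", tmp, fake_fetch)  # served from disk
```

A cached copy goes stale like any dump, so the fetch date would have to be kept alongside it if freshness matters.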

    Which LDS items to read

    1. 764044 [14] items having GND IDs and sex=0 (unknown), 1 (male), 2 (female) plus some other info defined in "fl". Adjust rows to 1000000 to get all lines and fl to defgnd to only get the GND. Due to opposition and threats by one user above, they are not all in WD. But maybe you download all of them.
    2. all other GND humans in WD that miss gender, VIAF, ISNI, b or d. MrProperLawAndOrder (talk) 14:51, 25 May 2020 (UTC)

    I think you can find all the data in the files authorities-person_lds_20200213.jsonld.gz and/or authorities-person_lds_20200213.rdf.gz and/or authorities-person_lds_20200213.ttl.gz here. --Epìdosis 15:01, 25 May 2020 (UTC)

    @Epìdosis: that's outdated. MrProperLawAndOrder (talk) 15:18, 25 May 2020 (UTC)
    • Given that there are data quality issues with that field that haven't been explained yet (see #Wrong_gender_imported_from_GND above), I think we should skip that field. An exception could be made if that information is sourced. --- Jura 15:11, 25 May 2020 (UTC)
    @Jura1: this section is about which LDS items to read. Sex is a required information for humans. Not reading the LDS for these is no help. "An exception could be made if that information is sourced." - if the field is missing in WD by definition it cannot be sourced. MrProperLawAndOrder (talk) 15:18, 25 May 2020 (UTC)
    If GND's reference for the information can be provided that means. It's the same field as discussed above and apparently the way GND determines this can't be explained (see #Wrong_gender_imported_from_GND). --- Jura 15:23, 25 May 2020 (UTC)
    @Jura1: What do your statements have to do with the topic of this section which is named "Which LDS items to read"? MrProperLawAndOrder (talk) 15:29, 25 May 2020 (UTC)
    Can you explain the difference between the two? --- Jura 15:31, 25 May 2020 (UTC)
    @Jura1: This section is not for posting unrelated stuff and then asking others to explain differences between unrelated stuff and the topic. Still waiting for you to explain the relation. MrProperLawAndOrder (talk) 15:35, 25 May 2020 (UTC)
    @Jura1, MrProperLawAndOrder: Anyway, this discussion is practically useless: as from this query, no human having GND ID (P227) and Deutsche Biographie ID (P7902) misses sex or gender (P21), so no sex or gender (P21) will be imported from GND. --Epìdosis 15:37, 25 May 2020 (UTC)
    I think they probably already had been dumped into Wikidata. Makes one wonder about the quality of other available fields. --- Jura 15:41, 25 May 2020 (UTC)
    @Epìdosis: this section isn't restricted to DtBio items nor is it about which fields to write to WD. See #2 "all other GND humans in WD that miss gender, VIAF, ISNI, b or d. " If one enriches from LDS, one can also do it in the same run for other GND humans. And there are DtBio humans not created yet, which in DtBio website have unknown sex. MrProperLawAndOrder (talk) 15:47, 25 May 2020 (UTC)
    OK, so considering also items not having Deutsche Biographie ID (P7902) I see nearly 15k items needing sex or gender (P21). We will evaluate how to do the import. --Epìdosis 16:02, 25 May 2020 (UTC)
    @Epìdosis: there are more GND-without-DtBio humans that could be enriched. P21 is only one field to write. VIAF, ISNI, d+b are very helpful to find duplicates. Please read LDS for each GND human missing any of the fields VIAF, d+b too. MrProperLawAndOrder (talk) 16:24, 25 May 2020 (UTC)

    Use LDS dump

    @Epìdosis, Bargioni, Kolja21: According to the next dump may come in June/July. So, it could be much much easier to use that one. DtBio itself seems to use the dumps. The only problem is the pressuring by Jura1, and the claimed problems by M2k.

    I would favor to progress with careful processes as it was going until recently. MrProperLawAndOrder (talk) 13:59, 26 May 2020 (UTC)

    Use VIAF website

    For a given ID one can look up the VIAF via redirect:


    MrProperLawAndOrder (talk) 13:59, 26 May 2020 (UTC)

    Analyzing LDS data

    @Epìdosis, Kolja21: The LDS (linked data service) could be an extension to the actual GND DB. Still good as a source, but one should keep that in mind. I think I have once seen duplicated VIAF and/or ISNI information inside, meaning, it could come from different sources. VIAF maybe comes from OCLC. ISNI could also come from GND DB directly. VIAF could be of great help, see an example of an almost two year old duplicate that I fixed today based on VIAF [15]. MrProperLawAndOrder (talk) 14:19, 25 May 2020 (UTC)

    Fields for time

    gndo:dateOfBirth "1906"^^xsd:gYear
    gndo:dateOfDeath "1971"^^xsd:gYear


    Quelle	NDB/ADB-online;Internet
    Zeit	Lebensdaten: 1906-1971 (Lebensdaten ca.)
    Lebensdaten: ca. 20. Jh.

    MrProperLawAndOrder (talk) 20:37, 25 May 2020 (UTC)

    gndo:periodOfActivity "1978-";

    MrProperLawAndOrder (talk) 00:44, 28 May 2020 (UTC)
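A rough sketch of how literal lines like the ones quoted above might be picked apart, assuming one property per line in the Turtle-like notation shown. A regular expression is used only for illustration; a proper RDF parser would be the more robust choice for whole records. The function names are hypothetical.

```python
import re

# Matches lines such as: gndo:dateOfBirth "1906"^^xsd:gYear
LITERAL = re.compile(r'gndo:(\w+)\s+"([^"]*)"(?:\^\^(\S+?))?\s*[;.]?\s*$')


def parse_literal(line):
    """Split one gndo literal line into property, value and optional datatype."""
    match = LITERAL.match(line.strip())
    if match is None:
        return None
    prop, value, datatype = match.groups()
    return {"property": prop, "value": value, "datatype": datatype}


def split_period(value):
    """Split a period such as '1978-' into (start, end); an open end becomes None."""
    start, _, end = value.partition("-")
    return (start or None, end or None)


birth = parse_literal('gndo:dateOfBirth "1906"^^xsd:gYear')
activity = parse_literal('gndo:periodOfActivity "1978-";')
period = split_period(activity["value"])
```

An open-ended period of activity is kept as a start year with no end, which leaves it to the importer to decide whether such a value is written at all.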

    Field for occupation

    gndo:professionOrOccupation <>;

    Stated as another GND object, note that GND distinguishes this field by sex, so one for Taxifahrer and one for Taxifahrerin.

    Writing LDS data to WD

    Manner of writing LDS data to WD

    @Epìdosis: - will it be done via QS or a bot? If by bot, will it be one edit adding several things? MrProperLawAndOrder (talk) 15:00, 25 May 2020 (UTC)

    @Epìdosis: Through QS, as always: one edit for each statement + one edit for each reference to GND. --Epìdosis 15:01, 25 May 2020 (UTC)
    @Epìdosis: Keep in mind that QS can add statements multiple times. Will you use the QS website interface? Via web UI would mean copy paste statements in groups - I had to split my create item lists in several pieces because the web UI didn't accept longer lists. I don't know if one can import longer lists via the command line. MrProperLawAndOrder (talk) 15:13, 25 May 2020 (UTC)
    @MrProperLawAndOrder: Bargioni tried importing through the command line, but it failed, so we use the website interface; while it is true that batch mode can generate duplicates when creating items, we have never had a problem with duplicated statements when adding to existing items. It is true that QS doesn't accept big batches, so we will split the import. --Epìdosis 15:18, 25 May 2020 (UTC)
    @Epìdosis: did it fail because of the size? I had issues with QS adding same statements multiple times not only when creating items but also when adding IDs, they could be found via unique constraints. Anyway, there are probably other tools that later will remove exact duplicates of statements. MrProperLawAndOrder (talk) 15:26, 25 May 2020 (UTC)

    Writing via QS web UI: will you use run in background so there is a proper batch id and one can link to sets? MrProperLawAndOrder (talk) 15:26, 25 May 2020 (UTC)

    Obviously Bargioni will use batch mode. --Epìdosis 15:30, 25 May 2020 (UTC)
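To illustrate the points about duplicated statements and splitting the import, here is a sketch that builds tab-separated QuickStatements lines with a "stated in" (S248) reference, drops exact duplicates before submission, and cuts the result into chunks. The batch size, the use of Q36578 as the item for GND (to be verified before use) and the sandbox item Q4115189 as target are assumptions for the example.

```python
GND_ITEM = "Q36578"  # assumed item for the Integrated Authority File; verify before use


def qs_line(qid, prop, value, source_item=GND_ITEM):
    """One QuickStatements V1 line: statement plus a 'stated in' reference."""
    return "\t".join([qid, prop, value, "S248", source_item])


def build_batches(statements, batch_size):
    """Drop exact duplicate statements, then split the lines into batches."""
    seen = set()
    lines = []
    for statement in statements:
        if statement in seen:
            continue  # the same (item, property, value) was already queued
        seen.add(statement)
        lines.append(qs_line(*statement))
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


statements = [
    ("Q4115189", "P569", "+1906-00-00T00:00:00Z/9"),
    ("Q4115189", "P569", "+1906-00-00T00:00:00Z/9"),  # exact duplicate
    ("Q4115189", "P570", "+1971-00-00T00:00:00Z/9"),
]
batches = build_batches(statements, batch_size=1)
```

Removing duplicates locally only protects against repeats within one list; it says nothing about statements already present on the item.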

    Fields to write to WD

    1. P214 VIAF (priority, to find duplicates in WD since many existing humans have no GND, query in section Property talk:P227#VIAF distinct value violations involving GND humans)
    2. P213 ISNI (similar to VIAF)
    3. time (priority, to solve potential issues reported regarding tool usage)
      • date of birth (priority)
      • date of death (priority)
    4. place
      • place of birth
      • place of death
    5. P21 sex (priority, reduce constraint violations, but GND contains errors: known female had a value for male; probably safe to add "female")
    6. relationships
      • mother
      • father
    7. occupation

    MrProperLawAndOrder (talk) 16:37, 25 May 2020 (UTC)

    @Epìdosis: what do you think? MrProperLawAndOrder (talk) 16:58, 25 May 2020 (UTC)
    It is OK. I hope Bargioni has his Internet connection fixed soon, since now it's unfortunately broken. Bye, --Epìdosis 17:01, 25 May 2020 (UTC)
    I'd start with time, then occupation. Agree about "f". I don't think adding all clustered ids is priority. --- Jura 17:06, 25 May 2020 (UTC)
    At the point for VIAF I added "query in section Property talk:P227#VIAF distinct value violations involving GND humans" MrProperLawAndOrder (talk) 16:20, 27 May 2020 (UTC)

    Coordinate writing to WD

    @Edoderoo, Epìdosis: Edoderoo is also working on it, see Topic:Vn7dpnl9v9dw6fer. How to coordinate to avoid duplicated work? Seems Edoderoo doesn't need QS. No idea how he does it. MrProperLawAndOrder (talk) 10:06, 28 May 2020 (UTC)

    I wrote a script in python with Pywikibot. Edoderoo (talk) 12:01, 28 May 2020 (UTC)

    GND DB data quality - we know it - original research

    Because of some values assumed by User:Jura1 and others to be wrong, User:Jura1 wrote "It's not an assumption the value is incorrect, we know it. We don't need low quality fields from databases [...]" [16]

    What does the community think

    1. shall any field that ever had one wrong value in any external DB be viewed as "We don't need"
      1. which other GND DB data fields had wrong values according to the paradigm of "we know it"
    2. shall WD store information found in external sources or shall it store "we know it." and how would that be referenced?

    MrProperLawAndOrder (talk) 17:26, 25 May 2020 (UTC)

    • Do you actually disagree with the assessment that the value is incorrect?
    The explanation you gave for why the value you uploaded was that way on GND is "someone with write access to GND added it that way".
    I don't think this is a satisfactory explanation for this, nor did you provide any for all other samples listed. --- Jura 17:31, 25 May 2020 (UTC)
    You are again offtopic. What individual contributors think about individual values is irrelevant here. The topic is how to use external databases etc. Please show the diff for your statement about me starting "The explanation ...". MrProperLawAndOrder (talk) 17:54, 25 May 2020 (UTC)
    • In Wikidata we do care about truth and don't like to copy mistakes from external databases. When we discover that there was a bad value in GND the default is to deprecate the value on our side. Whenever we do import data it makes sense to think about the data quality of our imports.
    When it comes to big imports of data the discussion of what should be imported is best done in a bot request. ChristianKl❫ 17:05, 26 May 2020 (UTC)
    @Christian: We in Wikidata know that. If you want to join this discussion please explain how you can help. --Kolja21 (talk) 17:54, 26 May 2020 (UTC)

    VIAF distinct value violations involving GND humans

    Since DtBio is a subset, checking that first could help more. The recently created DtBio humans mostly have no VIAF yet. @Kolja21, Epìdosis: might be interesting for you. MrProperLawAndOrder (talk) 16:13, 27 May 2020 (UTC)

    1. example: Q64711035 vs Peter Münch (Q64739515). Both are authors, both are born 1960. No chance for VIAF algorithm and a human will get crazy checking these edits. It's hard enough focusing on one authority file but a cluster kills you.
    2. see also Property talk:P214/Duplicates --Kolja21 (talk) 15:29, 31 May 2020 (UTC)

    @Kolja21: Yes, some cases are almost unsolvable. I solved some where DtBio is involved. Maybe we can bring down that number. The queries should be improved, e.g. show only items with birthday and where year of birth and / or death is the same or something like that. MrProperLawAndOrder (talk) 16:42, 31 May 2020 (UTC)
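The narrowing suggested here (only show pairs that share a VIAF ID and also agree on year of birth or death) might look like the sketch below. The item dictionaries and their keys are hypothetical, and the QIDs are placeholders. As the Peter Münch example shows, two different people can share a birth year, so the result is a shortlist for human checking and not a merge list.

```python
from collections import defaultdict
from itertools import combinations


def likely_duplicates(items):
    """Pairs of items sharing a VIAF ID and the same known birth or death year."""
    by_viaf = defaultdict(list)
    for item in items:
        by_viaf[item["viaf"]].append(item)
    pairs = []
    for group in by_viaf.values():
        for a, b in combinations(group, 2):
            same_birth = a.get("birth") is not None and a.get("birth") == b.get("birth")
            same_death = a.get("death") is not None and a.get("death") == b.get("death")
            if same_birth or same_death:
                pairs.append((a["qid"], b["qid"]))
    return pairs


# Placeholder data; QIDs and VIAF values are invented for the example.
items = [
    {"qid": "Q-A", "viaf": "v1", "birth": 1960, "death": None},
    {"qid": "Q-B", "viaf": "v1", "birth": 1960, "death": None},
    {"qid": "Q-C", "viaf": "v1", "birth": None, "death": None},
    {"qid": "Q-D", "viaf": "v2", "birth": 1960, "death": None},
]
candidates = likely_duplicates(items)
```

Items without any dates never appear in the shortlist, which matches the suggestion to show only items that have a birthday recorded.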

    The reports are currently inflated due to Edoderoobot adding "p://" for P214 on distinct items, creating many more distinct value violations. MrProperLawAndOrder (talk) 19:04, 1 June 2020 (UTC)

    I removed these values, manually, since QS doesn't let me log-in.[17] MrProperLawAndOrder (talk) 11:16, 2 June 2020 (UTC)

    Reinheitsgebot adding data from CERL

    [18] - makes no sense at all. This is just a copy from GND DB. And Reinheitsgebot is not even doing it directly from CERL but from a MnM catalog. @Epìdosis, Kolja21: it's now some weeks that problems with that bot editing DtBio items have been made public, but no sign it is stopping. MrProperLawAndOrder (talk) 20:24, 1 June 2020 (UTC)

    My request about DtBio is still here waiting. However, in my opinion the edit you report in this section regarding CERL is perfectly correct. --Epìdosis 20:32, 1 June 2020 (UTC)
    CERL Thesaurus (Q60909659) focuses on the records of Europe's book heritage. It looks as if the project will not be developed further but it is still a reliable source. --Kolja21 (talk) 21:23, 1 June 2020 (UTC)

    @Epìdosis: it's correct, but it makes no sense. You can create a copy of GND and then link to the copy. @Kolja21: I think you said something like VIAF isn't a source, but a search engine. CERL for this record isn't even that, it is just a copy service. All data on is from GND plus the note that it (the subject) is found in DtBio. Not even a hyperlink to DtBio, not the record, not the homepage. It is not very nice to the planet earth to waste resources like that. Let's try to make it better with WD :-) MrProperLawAndOrder (talk) 06:00, 3 June 2020 (UTC)

    GND removed

    [19] - would be helpful to have tools to detect such removals. MrProperLawAndOrder (talk) 05:52, 3 June 2020 (UTC)
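One simple way such a tool could work is to compare two snapshots of item-to-GND mappings taken at different times, for example two saved query results. The snapshot format used here, a dictionary from QID to a set of GND IDs, is an assumption, and the IDs are placeholders.

```python
def removed_ids(before, after):
    """Map each item to the GND IDs present in `before` but missing in `after`."""
    removed = {}
    for qid, old_ids in before.items():
        missing = old_ids - after.get(qid, set())
        if missing:
            removed[qid] = missing
    return removed


# Placeholder snapshots; QIDs and IDs are invented for the example.
before = {"Q-A": {"id-1", "id-2"}, "Q-B": {"id-3"}}
after = {"Q-A": {"id-1"}, "Q-B": {"id-3"}}
removals = removed_ids(before, after)
```

A snapshot comparison cannot tell who removed a value or why; for that the item history still has to be checked.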

    Removal of redirected IDs

    T.seppelt (talk) 21:00, 18 February 2016 (UTC) Vladimir Alexiev (talk) 11:59, 13 March 2017 (UTC) GerardM (talk) 15:58, 26 March 2017 (UTC) Jonathan Groß (talk) 17:52, 26 March 2017 (UTC) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits Jneubert (talk) 13:47, 29 April 2017 (UTC) Framawiki (please notify !) (talk) Sic19 (talk) 20:42, 12 July 2017 (UTC) Wikidelo (talk) 21:15, 8 May 2018 (UTC) salgo60 Salgo60 (talk) 07:09, 10 June 2018 (UTC) ArthurPSmith (talk) 19:52, 22 August 2018 (UTC) PKM (talk) 19:40, 23 August 2018 (UTC) Ettorerizza (talk) 06:44, 8 October 2018 (UTC) Fuzheado (talk) 03:47, 19 December 2018 (UTC) Daniel Mietchen (talk) 16:30, 7 April 2019 (UTC) Eihel (talk) 15:13, 19 June 2019 (UTC) NAH (talk) 20:29, 18 August 2019 (UTC) Iwan.Aucamp (talk) 21:48, 3 October 2019 (UTC) Epìdosis (talk) 23:49, 22 November 2019 (UTC) Sotho Tal Ker (talk) 00:52, 1 May 2020 (UTC) Bargioni (talk) 09:48, 02 May 2020 (UTC) --Carlobia (talk) 14:34, 11 May 2020 (UTC)

      Notified participants of WikiProject Authority control Hi all! Until recently, redirected GND IDs were periodically removed by KrBot, maintained by @Ivan A. Krestinin:; the bot was recently blocked (see here and here) because it was argued that these IDs should be kept, deprecating them and adding reason for deprecation (P2241) redirection (Q8143062). I think that it can be appropriate to act in the aforementioned way for single authority control IDs; at the same time, it is inconsistent to have only some redirected IDs. We should decide in this discussion whether we want that

    1. the bot always deprecates redirected IDs (which may be inconsistent, as in the past many redirected IDs have been deleted; but this solution is still somewhat possible), never removing them
    2. the bot deprecates the redirected IDs only in some cases (we should establish a criterion and it should be possible for the bot to understand and respect this criterion, which may not be easy) and removes them in all the other cases
    3. the bot always removes the redirected IDs, unless they have already been deprecated (we should establish a criterion and apply it manually)
    4. the bot always removes redirected IDs, as it did before the block

    In my opinion, option 3 is probably a good compromise, at least temporarily; so, if no objections are raised, I will unblock the bot (at least for this task) on the 10th of June, asking Ivan not to remove IDs which are already deprecated.

    If objections are raised about this temporary compromise and in the meanwhile the bot gets unblocked for other tasks (e.g. for VIAF tasks, see this discussion), I will ask Ivan not to edit GND IDs until some consensus is reached about the above proposals. --Epìdosis 09:47, 4 June 2020 (UTC)
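For comparison, the bot behaviour under options 1, 3 and 4 can be written as a small decision function. This is only an illustration of the proposal as worded above, not KrBot's code; option 2 is left out because its criterion has not been defined, and the function name and return values are hypothetical.

```python
def bot_action(is_redirect, is_deprecated, option):
    """What the bot would do with one GND statement under a given option."""
    if not is_redirect:
        return "keep"  # only redirected IDs are under discussion
    if option == 1:
        # always deprecate, never remove
        return "keep" if is_deprecated else "deprecate"
    if option == 3:
        # remove unless someone has already deprecated the statement
        return "keep" if is_deprecated else "remove"
    if option == 4:
        # remove in every case, as before the block
        return "remove"
    raise ValueError("option 2 needs a criterion that is not defined yet")


option3_plain = bot_action(is_redirect=True, is_deprecated=False, option=3)
option3_deprecated = bot_action(is_redirect=True, is_deprecated=True, option=3)
option4_deprecated = bot_action(is_redirect=True, is_deprecated=True, option=4)
```

Written this way, the only difference between options 3 and 4 is the treatment of statements that were deprecated by hand.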

    Redirected IDs should never have been removed; any edits that did so should be reverted. The fact that some were wrongly removed in the past should not be used as a reason to remove more in the future. Your point 3 is not a good compromise; it would continue the harm done by such removals. I am opposed to the block being lifted unless an undertaking is given to remove no IDs. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:15, 4 June 2020 (UTC)
    • What is the scale of this? (db size GND: actual entries, number of redirects, number added over some period of time; at Wikidata: number of redirects removed in a run). --- Jura 13:11, 4 June 2020 (UTC)