Wikidata talk:WikiProject Authority control/Archive 2

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.


VIAF games - some corrections

@Vladimir Alexiev: Your org query is incorrect - instead of wdt:P31+ you need to use wdt:P31/wdt:P279* - from this I get 1443285 organizations in Wikidata, of which 121404 have a VIAF statement, so about 2.9% of VIAF orgs are co-referenced, and 8.4% of WD orgs. ArthurPSmith (talk) 19:57, 22 August 2018 (UTC)
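The correction above can be sketched as follows. `wdt:P31+` only follows chains of "instance of", whereas classes are linked to each other by "subclass of" (P279), hence `wdt:P31/wdt:P279*`. This is a minimal sketch, not the query that was actually run: the root class `wd:Q43229` (organization) and the helper name `share_percent` are assumptions, and a count of this size may time out on the public query service.

```python
# Sketch only: the root class Q43229 (organization) is an assumption,
# it is not stated in the comment above.
ORG_WITH_VIAF_QUERY = """
SELECT (COUNT(DISTINCT ?org) AS ?count) WHERE {
  ?org wdt:P31/wdt:P279* wd:Q43229 .   # instance of organization or of any subclass
  ?org wdt:P214 ?viaf .                # has a VIAF ID statement
}
"""


def share_percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Figures quoted in the comment above.
wd_orgs = 1443285
wd_orgs_with_viaf = 121404
print(share_percent(wd_orgs_with_viaf, wd_orgs))  # 8.4
```

The 2.9% figure for VIAF depends on the total number of VIAF organization clusters, which is not given here, so it is not recomputed.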

Problem in ULAN and VIAF

Based on the dates, it looks like the ULAN and VIAF records cited at Charles M. Nes Jr. (Q56256668) have conflated two father-and-son architects - Charles M. Nes Jr. (1907-1989) info here and Charles M. Nes III (1928-2009) obituary here. Other than creating separate articles for these two men, how should we handle the authority control errors? - PKM (talk) 22:32, 26 August 2018 (UTC)

@PKM: we need to create both entries, assign the external ids as non unique and notify VIAF and ULAN of the constraint violations --Vladimir Alexiev (talk) 03:09, 28 August 2018 (UTC)

Done. These are Charles M. Nes Jr. (Q56256668) (father) and Charles M. Nes III (Q56285407) (son). I expect ULAN means the father and just has the birth date wrong, but I have left the identifiers on both records for now. - PKM (talk) 21:22, 28 August 2018 (UTC)

Benedetto Bembo

@Vladimir Alexiev: From your perspective, how should Wikidata handle items like this and this which both almost certainly represent Benedetto Bembo (Q3638043)? - PKM (talk) 21:52, 26 August 2018 (UTC)

@PKM: I think that's clear: assign both VIAF to the wikidata item, and proliferate the 3 other ids as well --Vladimir Alexiev (talk) 03:06, 28 August 2018 (UTC)

@Vladimir Alexiev: Thank you, that makes sense to me. - PKM (talk) 03:26, 28 August 2018 (UTC)

VIAF error

VIAF has conflated Donald King (Q59621224) and Donald King (Q59621258) - is there anything we can do to straighten this out other than having the two Wikidata items with their proper authority control records? - PKM (talk) 05:16, 11 December 2018 (UTC)

FYI, I sent them email and these entries are now separated. - PKM (talk) 23:12, 10 January 2019 (UTC)
In the past I have noted a couple of such items at en:Wikipedia:VIAF/errors#VIAF_merges_different_identities_(into_one_cluster) -- not sure if VIAF are still monitoring this (and it would be good if they removed items once they were done), but I think they may be. Jheald (talk) 15:27, 11 January 2019 (UTC)
Thanks for that link! - PKM (talk) 19:50, 11 January 2019 (UTC)

VIAF and deprecation

Is it a known problem that VIAF disregards Wikidata deprecation ranks, in the sense that it makes no distinction between a regular statement and a deprecated statement? --Emu (talk) 12:55, 1 September 2019 (UTC)

CONOR

Hi. VIAF recently added CONOR.SI ID (P1280)[1], so what is the best way to import the CONOR IDs from VIAF that are not yet on Wikidata? --Sporti (talk) 08:21, 2 July 2019 (UTC)

External identifiers

Hi All,

I wrote a proposal about expanding the statements regarding external identifiers, please have a look and add your opinions: https://www.wikidata.org/wiki/Wikidata:Project_chat#External_Identifiers_-_expanding_statements,_best_practice --Adam Harangozó (talk) 15:27, 1 November 2019 (UTC)

Best practices for a subject with multiple VIAF entries?

Russell Iredell (Q77506786) appears to have three VIAF IDs, one each for ISNI, ULAN, and BNF. There are many similar cases like this. Should all VIAF IDs be added to the item? Should the VIAF IDs be added to Wikidata:WikiProject Authority control/VIAF errors? (I don't quite understand the message "Please do not report clusters which need only to be merged..."). Please advise, thanks. -Animalparty (talk) 03:57, 7 December 2019 (UTC)

User:Animalparty, I think add all VIAFs and regarding the message: don't report items with multiple VIAFs as errors - if the only error is that they need merging. MrProperLawAndOrder (talk) 16:09, 9 May 2020 (UTC)
@Animalparty: I agree exactly with @MrProperLawAndOrder:: just add all the VIAFs if this is the only problem; Wikidata:WikiProject Authority control/VIAF errors collects only cases in which VIAF clusters confuse different subjects. --Epìdosis 16:21, 9 May 2020 (UTC)

Statistics instance of human

Statistics for external identifier usage on instances of human
Property | Plain name | Property number | Instances of Q5 having at least one statement (2020-05-16) | Statements on instances of Q5 (2020-05-16) | Comment
VIAF ID (P214) | VIAF | 214 | 1655332 | 1679933 | https://w.wiki/Qxa
ISNI (P213) | ISNI | 213 | 1043706 | 1052903 | https://w.wiki/Qxc
Library of Congress authority ID (P244) | LC | 244 | 944803 | 946238 | https://w.wiki/Qxd
GND ID (P227) | GND | 227 | 695656 | 696715 | https://w.wiki/Qxe
Bibliothèque nationale de France ID (P268) | BNF | 268 | 421886 | 425955 | https://w.wiki/Qxh
CBDB ID (P497) | CBDB | 497 | 421591 | 422653 | https://w.wiki/Qxg
IdRef ID (P269) | SUDOC | 269 | 419758 | 421744 | https://w.wiki/Qxi
Nationale Thesaurus voor Auteursnamen ID (P1006) | NTA | 1006 | 400319 | 402028 | https://w.wiki/Qxj
IMDb ID (P345) | IMDb | 345 | 350098 | 350870 | https://w.wiki/Qxm
Deutsche Biographie (GND) ID (P7902) | DtBio | 7902 | 300059 | 300095 | https://w.wiki/Qxn
National Library of Spain ID (P950) | BNE | 950 | 155982 | 157863 | https://w.wiki/Qxp
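# Count humans with at least one VIAF ID, and the VIAF ID statements on them
SELECT (COUNT(DISTINCT ?item) AS ?count_item) (COUNT(?id) AS ?count_id)
WHERE {
  ?item wdt:P31 wd:Q5 .    # instance of human
  ?item wdt:P214 ?id .     # VIAF ID
}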

MrProperLawAndOrder (talk) 01:25, 16 May 2020 (UTC)

@MrProperLawAndOrder: Not sure if this is relevant to you, but your same query with ORCID iD (P496) yields over 1.4 million people. ArthurPSmith (talk) 17:24, 18 May 2020 (UTC)
@ArthurPSmith: interesting, thank you! And no timeout, while for VIAF if asking for both counts at the same time, I get timeout. MrProperLawAndOrder (talk) 18:14, 18 May 2020 (UTC)
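Since the same query is reused here with a different property (P496 instead of P214), a small helper can generate it for any external-ID property. This is only a sketch: the function name `build_count_query` is hypothetical, and it only builds the query text; whether a given property times out on the query service is not something it can predict.

```python
import re


def build_count_query(property_id: str) -> str:
    """Build the counting query above for an arbitrary property such as 'P496'."""
    # Guard against malformed input before interpolating it into the query text.
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count_item) (COUNT(?id) AS ?count_id)\n"
        "WHERE {\n"
        "  ?item wdt:P31 wd:Q5 .\n"
        f"  ?item wdt:{property_id} ?id .\n"
        "}"
    )


print(build_count_query("P496"))
```

If asking for both counts at once times out, as reported for VIAF, one option is to request the two COUNT expressions in two separate queries.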

Archive main page

Hi all! The main page Wikidata:WikiProject Authority control contains a lot of useful information, but is very long and a lot of its content dates to past years, so it might be somehow outdated; I've also noticed that some of the materials (like proposals or suggestions) it contains are usually more familiar to talk pages (so to this page) than to main WikiProject pages. So, if there are no objections, tomorrow I would archive all the paragraphs from "Data sources" to "Item creation from a thesaurus concept via Quickstatements" in Wikidata:WikiProject Authority control/Archive, in order to start then building a new infrastructure of up-to-date stats. Thank you all, --Epìdosis 09:37, 10 May 2020 (UTC)

User:Epìdosis, before archiving, could you start the new page? Maybe at /new? So others can better see what you envision. Archiving and moving /new to main can be done later. MrProperLawAndOrder (talk) 02:10, 11 May 2020 (UTC)
What does it cover? Difference to VIAF members, difference to external IDs. What about pages by type? Is an archive needed, or could the page just be changed step by step. MrProperLawAndOrder (talk) 03:03, 11 May 2020 (UTC)
@Epìdosis: refactoring that page is badly needed and your contribution will be much appreciated. However, rather than moving it to Archive, I vote to organize it by topic. We have at least 2 major authorities that merit separate subpages: AAT and VIAF. Projects like "visual arts" have this nice tabbed structure for sub-pages, do you know how to do that? Also, https://www.wikidata.org/wiki/User_talk:Vladimir_Alexiev#VIAF_-_update needs to be merged into the table https://www.wikidata.org/wiki/Wikidata:WikiProject_Authority_control#VIAF_Links_per_Source . Cheers! --Vladimir Alexiev (talk) 07:42, 29 May 2020 (UTC)
@Vladimir Alexiev: Thank you! I'm very busy in these days, but I'll certainly try to do something when I have a bit more concentration. Bye :) --Epìdosis 08:35, 29 May 2020 (UTC)
@Vladimir Alexiev, Epìdosis, Kolja21: Before working on visual changes, what would it contain? If VIAF is such an important part, maybe work on Wikidata:VIAF and subpages there? With so much VIAF content, the other stuff is harder to see. What do you think about copying the section VIAF Games to Wikidata:VIAF and let people interested in VIAF work there? Some people like me may have a strong focus on VIAF-related work and there are places in WD already treating VIAF-related IDs differently, e.g. the property sorting. The URL "Wikidata:VIAF" would be much shorter making it easier to go there, to maintain subpages and by doing so make working on VIAF-related things easier. New editors and third parties would have a central page to understand the relations between WD and VIAF. MrProperLawAndOrder (talk) 16:21, 29 May 2020 (UTC)
@MrProperLawAndOrder: Thank you for creating Wikidata:VIAF, very good idea: I perfectly agree that "new editors and third parties would [and should!] have a central page to understand the relations between WD and VIAF". If you and @Vladimir Alexiev: want to start working on updating the data in that page, it is obviously welcome! --Epìdosis 19:28, 29 May 2020 (UTC)

VIAF cluster pages

Wikidata:VIAF/cluster should be the central page describing the clusters. I suggest bringing the error pages closer to that area and especially making them subpages of Wikidata:VIAF to make navigation between the VIAF pages easier. Probably subpages by type could make working easier, e.g. some people like to work on human clusters, others on organizations etc. But for now, I propose:

  1. Wikidata:WikiProject Authority control/VIAF linking to multiple items ->
    1. Wikidata:VIAF/cluster linking to multiple Wikidata items
    2. Wikidata:VIAF/cluster/linking to multiple Wikidata items - allows autonavigation to go to /cluster
  2. Wikidata:WikiProject Authority control/VIAF errors ->
    1. Wikidata:VIAF/cluster containing errors - it's about errors in their cluster, not errors in Wikidata items
    2. Wikidata:VIAF/cluster/containing errors - allows autonavigation to go to /cluster

MrProperLawAndOrder (talk) 00:45, 31 May 2020 (UTC)

@Kolja21, Epìdosis: what do you think? MrProperLawAndOrder (talk) 11:44, 31 May 2020 (UTC)

OK, it makes sense; I would prefer options 2. --Epìdosis 12:15, 31 May 2020 (UTC)
Probably we need both. "VIAF cluster linking to multiple Wikidata items" is only the most common of multiple errors. How can we trace if a cluster has changed? --Kolja21 (talk) 13:28, 31 May 2020 (UTC)

@Kolja21, Epìdosis: I checked the second and all reported errors seemed to be conflations of sources, with a subclass of conflations where only Wikidata should be removed. So I created, using options 2 to get autonavigation, which Epìdosis also supported:

  1. Wikidata:VIAF/cluster/linking to multiple Wikidata items
  2. Wikidata:VIAF/cluster/conflating entities

I don't know of other cluster-related errors that are collected. VIAF only(?) merges sources, so most other errors would probably be in the sources. Talk on new types of errors should probably be done at Wikidata talk:VIAF. MrProperLawAndOrder (talk) 23:46, 31 May 2020 (UTC)

Hi @MrProperLawAndOrder:, here are some suggestions:

  1. Move WD:VIAF under "WD Project Authority Control"
  2. Break up and incorporate ALL text on "WD Project Authority Control" concerning VIAF into these VIAF pages; in particular "VIAF Games"
  3. In particular, rather than the poor and obsolete section https://www.wikidata.org/wiki/Wikidata:VIAF/cluster#Identifiers_in_VIAF_DB, use my section with the table that lists WD props, country etc. But update it with the latest stats from my personal page (there is a link there)
  4. Regarding "cluster-related errors": break up the "VIAF errors" page into its subsections: who to contact for reporting, the different errors listed... In particular I've been adding to a section "Worldcat Identities errors" from reverts of my recent import of that data (1.5M links)
  5. Don't put it under "clusters" but directly under VIAF. cheers --Vladimir Alexiev (talk) 07:48, 4 June 2020 (UTC)

ISNI new website

Just saw they have a more readable website now, still c/o EDItEUR: https://www.isni.org MrProperLawAndOrder (talk) 14:04, 3 June 2020 (UTC)

  1. Property talk:P213#URL broken 2020-06-07 - proposal to switch to LD URL MrProperLawAndOrder (talk) 23:18, 11 June 2020 (UTC)
  2. Property talk:P213#URL broken 2020-06-11 - proposal to switch to LD URL - broken again. MrProperLawAndOrder (talk) 23:18, 11 June 2020 (UTC)

Resolver

Is there any resolver provided by WD itself? I could only find, e.g. for GND, https://wikidata-todo.toolforge.org/resolver.php?prop=227 MrProperLawAndOrder (talk) 13:58, 8 June 2020 (UTC)

@MrProperLawAndOrder: If you go to the Talk page for, for example GND ID (P227) (i.e. to Property talk:P227) you'll notice at the bottom of the Documentation Template a box "Search for values" that does this sort of resolution. You can also just type into the search box on any page at the top right the same string used there - "haswbstatement:P227=XXXX" with XXXX the identifier you want to find. ArthurPSmith (talk) 17:18, 12 June 2020 (UTC)
@ArthurPSmith: no redirecting to the actual content. Most IDs are only found on one item. MrProperLawAndOrder (talk) 22:59, 12 June 2020 (UTC)
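The search string mentioned above can be turned into a link as in the following sketch. The function name is hypothetical and `XXXX` is the same placeholder used above; as noted in the reply, this produces a search results page, not a redirect to the matching item.

```python
from urllib.parse import urlencode


def haswbstatement_search_url(property_id: str, value: str) -> str:
    """Return a Wikidata search URL for items having property_id = value."""
    query = f"haswbstatement:{property_id}={value}"
    # urlencode percent-encodes ':' and '=' inside the search string.
    return "https://www.wikidata.org/w/index.php?" + urlencode({"search": query})


print(haswbstatement_search_url("P227", "XXXX"))
```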

LCCN error report

Has anybody been able to successfully make an error report to the LoC in case of errors in the LCCN? I have tried the form on the website and email, but to no avail (even after several months). --Emu (talk) 09:23, 15 July 2020 (UTC)

Probably not; reports are being collected in Wikidata:WikiProject Authority control/LCCN errors in order to keep track of them, hoping they will consider the page in the future. --Epìdosis 10:17, 15 July 2020 (UTC)

Renewal of main page

Hi all! I've renewed the main page of the project, moving all the old content to Wikidata:WikiProject Authority control/Archive. If you have any proposal about how to improve the main page, you can post it here for discussion or you can also use Wikidata:WikiProject Authority control/Draft to experiment with your ideas. Thank you all in advance! --Epìdosis 15:08, 3 August 2020 (UTC)

  • @Epìdosis: Thank you! One of the things I find frustrating about this project is how much old, out-of-date information there is, which makes it really hard to know how best to contribute. As of today, 22000 items in AAT have been matched to Wikidata (54.4%). I'll add that to the draft Main Page. - PKM (talk) 21:12, 11 September 2020 (UTC)
@PKM: Thanks, I agree that moving old information to the archive was very much needed. Maybe I can move to Archive also the section "History" (what do you think?). You can update whichever part of the page you consider outdated (obviously except the Archive, which is conserved as it is for historical purposes), also without passing through the draft. If you have any particular proposal, you can also discuss it here. Thank you very much in advance, of course! --Epìdosis 10:26, 12 September 2020 (UTC)
@Epìdosis: I've copied my new section on AAT from the draft to the main page. I plan to also make a section for ULAN. As for History, I think we should add more recent info (if we have it) and then maybe archive everything prior to 2017? - PKM (talk) 20:26, 12 September 2020 (UTC)
@PKM: Great for AAT. I would archive prior to 2019 (substantially all the section as it is now), restarting the new section from now. I've seen you made a reference in draft to User:Magnus Manske/authority control.js: it was and it is still a good gadget, but it often doesn't keep pace with VIAF evolution, sometimes causing problems (e.g. IDs getting added to old properties instead of new ones et similia), so it's better to indicate User:Sotho Tal Ker/authority control.js (the constantly updated version of the previous) or my favorite User:Bargioni/moreIdentifiers - all three, anyway, are already listed in Wikidata:VIAF/cluster#Gadgets. --Epìdosis 20:37, 12 September 2020 (UTC)
@Epìdosis: Thanks for the heads-up on the AuthorityControl gadget. I took out my edit, and I'll be trying one of the other tools you recommended.
I have added a section on ULAN to the draft page; it's a bit more subjective and I'd like your opinion before it goes live on the project page.
Starting the History section over with 2019 works for me! - PKM (talk) 21:19, 12 September 2020 (UTC)
@PKM: ULAN section seems good to me, you can move it forward :) --Epìdosis 21:39, 12 September 2020 (UTC)
@Epìdosis: Thanks for checking. Done! - PKM (talk) 21:50, 12 September 2020 (UTC)
  • @Epìdosis: Thank you for updating the page! I am working on the items for people from the Dictionary of Art Historians. Many items were imported 5 years ago, contain little information and aren't cross-referenced with any other authority control. I am primarily adding VIAF IDs to these items and I am planning to import more information about these people from both the dictionary and authority control data. I have some questions for which I'd like to find a definite answer on this page:
    1. What are the best practices for marking conflated VIAF clusters/wrong authority control identifiers?
    2. Are there (currently) any bots adding additional authority control identifiers based on VIAF clusters?
    3. Which rules are there for bots working with authority control data? (A common complaint seems to be that bots mess everything up: w:de:Wikipedia:Umfragen/Normdaten aus Wikidata)
    4. If you copy claims from authority control, how should the source look like? I have seen:
  • Probably some of the answers are not yet decided upon, but I think the community should do so at some point. --Pyfisch (talk) 08:35, 12 September 2020 (UTC)
  • @Pyfisch: Very very interesting questions, which deserve a clear guideline maybe to be placed in Wikidata:VIAF or one of its subpages. For the moment, I try to answer in detail here, then maybe we can draft something. So:
    1. Two actions are usually good to manage values of VIAF ID (P214) which are in fact conflated clusters:
      • especially in cases where the cluster is really a big mess (e.g. two IDs related to one person, three to another person), you can simply add the VIAF ID (P214) with deprecated rank and the qualifier reason for deprecated rank (P2241): conflation (Q14946528) (e.g. Jadwiga Łubieńska (Q11716014))
      • in cases where the cluster contains many IDs and only a small number (i.e. one or two) are wrong, I think using the previous solution would be excessive; it's much better to use User:Bargioni/moreIdentifiers and add a report about single IDs which should be moved away from the present cluster, using the button ⚡
      • As for other authority control IDs, I tend to think that removing the IDs not pertinent to the item is probably the easiest solution; anyway, you can also leave them with deprecated rank and the qualifier reason for deprecated rank (P2241): applies to other person (Q35773207)
      • If you find that a confusion between two persons is very probable, you can (and should) add the reciprocal different from (P1889) to both items - it will be useful for solving future confusions and for avoiding possible wrong merges
    2. Substantially no, since conflations in VIAF clusters are not so rare and so importing IDs from them without a lot of precaution tends to be risky; I think that a bot systematically importing Nationale Thesaurus voor Auteursnamen ID (P1006) from VIAF ID (P214) is still active, for the other IDs (luckily) nothing; I think that, after solving all the "unique value" violations regarding VIAF member IDs and at least most of the "unique value" violations regarding P214 itself, we can maybe approach possible imports from VIAF ID (P214) with a little more boldness;
      • anyway, in my opinion our approach for the diffusion of VIAF member IDs should rely more on creating and managing good Mix'n'match catalogs than on importing IDs via bot from VIAF; the process is obviously much slower, but also much more precise
    3. While normal bots have to pass through Wikidata:Requests for permissions/Bot, which provides the necessary control, the majority of bot-like imports come through QuickStatements or (worse) through OpenRefine, and there is still no a priori control (only a posteriori) on such imports, although I have a vague reminiscence of users asking somewhere to impose some a priori restriction on them; so in fact there is no regulation for most of the imports (which we can start discussing, of course! I feel this point is very important, as do you and many users on German Wikipedia, although I think that the biggest damage was inflicted in the first years of Wikidata and most recent imports are usually pretty much correct, unfortunately with some exceptions)
    4. To reference data from authority control, solution 1 (only stated in (P248)) should always be avoided as too generic, solution 4 (only reference URL (P854)) should also be avoided as nearly impossible to query, solution 3 (stated in (P248)+subject named as (P1810)) isn't so bad but lacks stated in (P248) and especially lacks retrieved (P813), which should always be present for the verifiability of the information, so I always adopt solution 2 (stated in (P248)+GND ID (P227)+retrieved (P813)) - the absolutely best reference would be like a solution 5 (stated in (P248)+GND ID (P227)+subject named as (P1810)+retrieved (P813)), but since it's somewhat easy to bot-add subject named as (P1810) at a later time, solution 2 is also perfectly fine and I always recommend it since it has the two key requisites: easily verifiable through the combination of GND ID (P227)+retrieved (P813) and easy to query through either stated in (P248) or GND ID (P227)
      • As you can see, references are still not uniform, but since I think no user would prefer solutions 1-3-4 over solution 2 or (better) my solution 5, I think that using a bot to enforce a good standard would de facto impose it (since new users usually learn how to construct an item by looking at existing items, if the great majority of existing items have the references added in a precise way, then it's highly probable that a large number of users will learn this standard and reproduce it autonomously).
  • If you have any other question, I will be surely glad to answer you! Bye, --Epìdosis 10:26, 12 September 2020 (UTC)
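The first option in point 1 above (VIAF ID with deprecated rank plus a "reason for deprecated rank" qualifier) can be sketched as a data structure. The layout below follows the Wikibase JSON data model as far as I understand it and should be checked against the Wikibase documentation before being used in an actual edit; the function name and the example VIAF value are hypothetical.

```python
def deprecated_viaf_claim(viaf_id: str, reason_qid: str = "Q14946528") -> dict:
    """Build a statement dict: P214 = viaf_id, rank deprecated, qualifier P2241 = reason.

    The default reason Q14946528 is 'conflation', as named in the discussion above.
    """
    if not (reason_qid.startswith("Q") and reason_qid[1:].isdigit()):
        raise ValueError(f"not an item ID: {reason_qid!r}")
    reason_snak = {
        "snaktype": "value",
        "property": "P2241",  # reason for deprecated rank
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {
                "entity-type": "item",
                "numeric-id": int(reason_qid[1:]),
                "id": reason_qid,
            },
        },
    }
    return {
        "type": "statement",
        "rank": "deprecated",
        "mainsnak": {
            "snaktype": "value",
            "property": "P214",  # VIAF ID
            "datavalue": {"type": "string", "value": viaf_id},
        },
        "qualifiers": {"P2241": [reason_snak]},
    }


# Hypothetical VIAF value, for illustration only.
claim = deprecated_viaf_claim("000000000")
print(claim["rank"], claim["qualifiers"]["P2241"][0]["datavalue"]["value"]["id"])
```

For IDs that belong to another person, the same shape would apply with `reason_qid="Q35773207"` (applies to other person) and the relevant property in place of P214.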
@Epìdosis: I agree with you, just one thing: the use of subject named as (P1810) in references might be misleading. Take Julius Klob (Q60815735) and its place of birth value. Most users will see Olomouc, both sources call the city Olmütz. The subject named as (P1810) value Klob, Julius would be a little confusing, most would probably expect Olmütz here. --Emu (talk) 16:48, 12 September 2020 (UTC)
@Emu: Maybe subject named as (P1810) is a little misleading, but that can be solved improving its label and description. Actually, P1810 should always be used for indicating which is the name of the subject of the item in the source used (so "Klob, Julius"); if you want to specify which is the exact formulation of the information you are sourcing in the source (e.g. the source says "Olmütz"), you should add (I've done it as an experiment) to stated in (P248)+GND ID (P227)+subject named as (P1810)+retrieved (P813) a fifth one, quotation (P1683), with value "Geburtsort: Olmütz". Probably it's not used so frequently, but I think it works well. --Epìdosis 16:57, 12 September 2020 (UTC)
@Emu, Epìdosis: I usually address this problem by using object stated in reference as (P5997) + the string in the source material at the end of the reference for the specific statement, so <stated in reference as> "Olmütz" in your example. - PKM (talk) 21:34, 12 September 2020 (UTC)
In my opinion object stated in reference as (P5997) is a perfectly good alternative to quotation (P1683), of course. --Epìdosis 21:39, 12 September 2020 (UTC)
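The reference layout recommended in this thread (stated in + external ID + retrieved, optionally followed by object stated in reference as) can be sketched as a QuickStatements V1 line, where reference properties take an `S` prefix. This is a sketch under assumptions: the function name is hypothetical, Q36578 is assumed to be the item for GND, and the item, value and GND ID in the example call are placeholders (Q4115189 is the Wikidata Sandbox).

```python
from datetime import date
from typing import Optional


def quickstatements_line(item: str, prop: str, value: str, stated_in: str,
                         id_prop: str, id_value: str, retrieved: date,
                         stated_as: Optional[str] = None) -> str:
    """Build one tab-separated QuickStatements V1 line with a reference."""
    parts = [
        item, prop, value,
        "S248", stated_in,                                # stated in
        "S" + id_prop[1:], f'"{id_value}"',               # e.g. S227 for GND ID
        "S813", f"+{retrieved.isoformat()}T00:00:00Z/11", # retrieved, day precision
    ]
    if stated_as is not None:
        parts += ["S5997", f'"{stated_as}"']              # object stated in reference as
    return "\t".join(parts)


# Placeholder values throughout; Q36578 as the GND item is an assumption.
print(quickstatements_line("Q4115189", "P19", "Q64", "Q36578",
                           "P227", "000000000", date(2020, 9, 12),
                           stated_as="Olmütz"))
```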

RKDartists

I’d like to add a section on RKDartists, partly because BotMultichillT will add a ton of useful statements with references once you have an RKDartists identifier - see this diff. I need to be sure I know all the things the bot will do. - PKM (talk) 04:05, 13 September 2020 (UTC)

Sounds good to me. I noticed that the references use reference URL (P854) instead of RKDartists ID (P650). Maybe this should be changed? --Pyfisch (talk) 09:24, 13 September 2020 (UTC)
I think it should be changed; I would also use the order stated in (P248)+RKDartists ID (P650)+retrieved (P813) (with P813 at the end); for the rest, obviously the bot does a very good work. --Epìdosis 09:42, 13 September 2020 (UTC)
Also RKD staff actively respond to issues at User talk:RKDdata. - PKM (talk) 02:51, 14 September 2020 (UTC)
As of yesterday the bot is adding “stated in” + “reference URL” + “retrieved”. :-) - PKM (talk) 03:37, 15 September 2020 (UTC)

Duplicate items indicated by VIAF

MrProperLawAndOrder (talk • contribs • logs) (blocked permanently) created a lot of items based on the Deutsche Biographie, but some of these persons already have an item. Later VIAF was added and now it is apparent that there are about 5000 duplicate items: [2]. Most of them are true duplicates but one needs to check carefully. What is the most efficient way to check this, so this doesn't take months? --Pyfisch (talk) 20:06, 15 November 2020 (UTC)

Many items lack dob, dod, so there are only a few hundred items with matching dates: [3] --Pyfisch (talk) 21:17, 15 November 2020 (UTC)
Just tried a few from your list. It’s really hard (but also very interesting) and I can’t think of a really good way beyond reviewing every single one. But maybe Marv1N can weigh in, as he has a lot of experience in handling duplicates? --Emu (talk) 22:06, 15 November 2020 (UTC)
Well... I merge items by hand when I find one (two) randomly (for example when making a genealogical connection). So, this method is highly inefficient (I am afraid). --marv1N (talk) 08:18, 16 November 2020 (UTC)
There are also ~500 instances of GND IDs that are found on two items; many are duplicates and they are easier to check than the VIAF ones. --Pyfisch (talk) 11:01, 16 November 2020 (UTC)
Thanks for the link. The import of DB-IDs was very helpful. I also will work off some of the duplicates. --Kolja21 (talk) 11:35, 18 November 2020 (UTC)
@Pyfisch: I've emptied manually the ~500 unique-value GND constraint violations. --Epìdosis 15:30, 22 November 2020 (UTC)

Also see

https://w.wiki/rP4 : A slightly improved query that returns conflicts only once (str(?a)<str(?b)) and queries more languages --Vladimir Alexiev (talk) 16:52, 20 December 2020 (UTC)
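The idea of returning each conflict only once (the `str(?a)<str(?b)` filter) can be illustrated offline. The sketch below is not the linked query; it assumes you already have (item, VIAF ID) pairs, for example from a query result, and the function name and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def items_sharing_id(statements):
    """Given (item, external_id) pairs, return (a, b, external_id) triples with a < b.

    Sorting the items before pairing them means each conflicting pair is
    reported once, which is what the string comparison in the query achieves.
    """
    by_id = defaultdict(set)
    for item, ext_id in statements:
        by_id[ext_id].add(item)
    conflicts = []
    for ext_id, items in sorted(by_id.items()):
        for a, b in combinations(sorted(items), 2):
            conflicts.append((a, b, ext_id))
    return conflicts


# Hypothetical sample data.
sample = [("Q2", "111"), ("Q1", "111"), ("Q3", "222")]
print(items_sharing_id(sample))  # [('Q1', 'Q2', '111')]
```

Each reported pair is only a candidate duplicate; as noted above, the items still need to be checked before merging.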

What I do not understand is, the DNB records typically also have a VIAF record. It is something you find at the VIAF records. Also all the VIAF items typically have a Wikidata identifier. I thought there was an API to query VIAF for this. Thanks, GerardM (talk) 10:24, 22 December 2020 (UTC)
VIAF API spec
GND/DNB and VIAF are different in that GND entries are curated manually and establish an identity. VIAF holds algorithmically created clusters of such identities (from many national libraries and other bodies) - very useful, but there can always be errors. At some point in time Wikidata was used as a mechanism to spot and correct errors in these clusters (when, say, a GND id is attached to one item, and a SUDOC id from the same cluster is attached to another item, this is clearly suspect and could be used to automatically split a cluster. Similar with merging.) I'm not sure if such a mechanism is functional in the VIAF processing right now. Jneubert (talk) 10:54, 22 December 2020 (UTC)
@Jneubert: We are still collecting reports of errors in VIAF here, but we don't know if someone at VIAF looks at them. --Epìdosis 11:58, 22 December 2020 (UTC)
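The check described above (a GND ID from a cluster attached to one item, a SUDOC ID from the same cluster attached to another) can be sketched as follows. This only illustrates the idea; it is not VIAF's actual processing, and the function name, input shapes and sample data are all hypothetical.

```python
def suspect_clusters(clusters, id_to_item):
    """Return {viaf_id: [items]} for clusters whose member IDs sit on different items.

    clusters:   {viaf_id: [(source, source_id), ...]}
    id_to_item: {(source, source_id): wikidata_item}
    """
    suspects = {}
    for viaf_id, members in clusters.items():
        # Collect the distinct items that the cluster's member IDs are attached to.
        items = {id_to_item[m] for m in members if m in id_to_item}
        if len(items) > 1:
            suspects[viaf_id] = sorted(items)
    return suspects


# Hypothetical sample data.
clusters = {
    "v1": [("GND", "g1"), ("SUDOC", "s1")],
    "v2": [("GND", "g2"), ("SUDOC", "s2")],
}
id_to_item = {
    ("GND", "g1"): "Q1", ("SUDOC", "s1"): "Q2",
    ("GND", "g2"): "Q3", ("SUDOC", "s2"): "Q3",
}
print(suspect_clusters(clusters, id_to_item))  # {'v1': ['Q1', 'Q2']}
```

A flagged cluster may equally point to duplicate Wikidata items rather than a conflated cluster, so the result is a list of cases to review, not of confirmed errors.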

Change P920 property’s data type from “string” to “external identifier”

Hi all! This is my first contribution in this amazing project. I'm writing to you because the subject headings for public libraries maintained by the Spanish Ministry of Education, Culture and Sport P920 is a Wikidata property related to a thesaurus Q89560413, like P1014, and I believe its data type should be changed from “string” to “external identifier”. What do you think?


  Notified participants of WikiProject Authority control @Lea Lacroix (WMDE):

Silva Selva (talk) 17:20, 30 November 2020 (UTC)

Hi @Silva Selva:, I don't think it's possible to change the datatype of an existing property and therefore a new property might need to be created with the correct datatype. Not certain about that though. Simon Cobb (User:Sic19 ; talk page) 18:15, 30 November 2020 (UTC)
Changing datatype from string to external id is surely possible, it has already been done (e.g. for ISIL (P791)), and I Support this change. Thanks for proposing. --Epìdosis 18:19, 30 November 2020 (UTC)
Thanks for your support @Epìdosis:! I didn't add the context sorry @Sic19:! I should've said I proposed this change to the development team (who can do this change), but they need to make sure the community approves, so if you do, would you also add your support? :) Thank you both for reading/commenting this thread! Silva Selva (talk) 18:29, 30 November 2020 (UTC)
Yes, since it is possible I Support the change. Simon Cobb (User:Sic19 ; talk page) 18:40, 30 November 2020 (UTC)
Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:12, 30 November 2020 (UTC)
Yay! Thank you for your support @Sic19:, @Pigsonthewing:, @PKM:, @Jneubert:, @Epìdosis:. @Lea Lacroix (WMDE): is there anything else you need for this case? Silva Selva (talk) 18:01, 1 December 2020 (UTC)
Thanks! I created a ticket and will let you know if we need anything else. Lea Lacroix (WMDE) (talk) 08:50, 2 December 2020 (UTC)

Matching whole VIAF clusters (people only)

Using Bargioni and Epìdosis' fantastic tool moreIdentifiers, I've just added all of the missing ID's that were in the VIAF cluster for James P. Kennett and Richard B. Firestone. This allowed me to add references (using UseAsRef) to both of their date of birth statements, huzzah! It got me thinking though; I checked through them all to make sure that they actually referred to the person I was adding the info to, and I checked on the little graph thing on the VIAF page to make sure there were no single connections, but some of the records were a bit sparse (like this bibsys record for Dr. Kennett) and pretty much only had their full name.

I read on the talk page for moreIdentifiers that (as you would assume) there are sometimes errors, and that linking records "in the blind" can just propagate them. My question is: how likely is this? Should I have added only the IDs for the records that I could be 100% sure about, or is doing a cursory check and giving VIAF the benefit of the doubt usually okay? I'm also wondering whether adding all of the IDs is actually useful. Is there any other reason not to match a whole cluster? Thanks! Aluxosm (talk) 18:08, 1 June 2021 (UTC)

@Aluxosm: Errors are very common in two specific categories of people: people with very common names (e.g. John Brown, James Green); people with names in non-Latin alphabets, especially from East Asia (Chinese/Korean/Japanese scripts). These categories require very high attention. In other cases usually a cursory check is sufficient. --Epìdosis 18:14, 1 June 2021 (UTC)
@Aluxosm: Thx a lot! About VIAF clustering errors, in my opinion they depend on the algorithms used by VIAF itself, where no human checks or edits are applied. But the clustering process we perform in Wikidata can be very accurate. MoreIdentifiers simplifies the registration, but the quality still depends on us. -- Bargioni 🗣 20:05, 1 June 2021 (UTC)
@Epìdosis, Bargioni: Sounds a lot like the issues I've encountered using Author Disambiguator, I guess that makes sense. I think I was hoping that they would be using some super advanced method or just have a human checking over each record haha. Thanks for your input and again for the great tools! Aluxosm (talk) 13:06, 2 June 2021 (UTC)
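
The two high-risk categories named above (very common names, and names in non-Latin scripts) could be pre-screened before a whole cluster is accepted. The following is only a minimal sketch under stated assumptions: the `COMMON_NAMES` set and both function names are hypothetical, nothing here is part of moreIdentifiers or VIAF, and a flag means "check this cluster by hand", not "the cluster is wrong".

```python
import unicodedata

# Hypothetical sample of very common names (taken from the examples in this
# thread); a real list would have to be far larger.
COMMON_NAMES = {"john brown", "james green"}


def has_non_latin_letters(label: str) -> bool:
    """Return True if any alphabetic character is not a Latin-script letter."""
    for ch in label:
        # Unicode character names of Latin letters contain the word "LATIN",
        # including accented ones such as "ì".
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            return True
    return False


def needs_close_review(label: str) -> bool:
    """Flag labels in the two error-prone categories for a careful manual check."""
    normalized = " ".join(label.lower().split())
    return normalized in COMMON_NAMES or has_non_latin_letters(label)


if __name__ == "__main__":
    for name in ["John Brown", "James P. Kennett", "陳大文"]:
        print(name, needs_close_review(name))
```

A label that is not flagged still deserves the cursory check described above; the sketch only helps decide where to spend the extra attention.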

Label for P8179 (Canadiana NCF ID)

Hi everyone! I'm affiliated with Library and Archives Canada and was wondering if there would be support for revising the Wikidata label and definition of the ID found in our name authority file (P8179). We feel a label such as Canadiana Name Authority File ID would better describe the ID. I've added a topic to the talk page for the property with more info but I thought I would also post here for your input. Thanks! --Ostapt (talk) 12:48, 26 October 2021 (UTC)

User:Magnus Manske/authority control.js has stopped adding

Hi. For years I have been using MM's simple lookup-and-add tool to quickly add the relevant authority detail for new Wikisource authors. Today I find that the additions have stopped, although it still does the lookup. Is there an alternative tool, or do I need to go back to MM and ask him to investigate? I had a look at Bargioni's scripts, though they don't work straight out of the box. I am not interested in having to manually add the individual VIAF(s) and then add the clusters, nor in faffing around with config files for something that should be a super easy lookup and add. Also noting that mix'n'match doesn't work on new items, and there is an unknown and unpredictable delay before it functions, and I would prefer to do these as I create the items. Thanks. (please ping me if there is a ready replacement)  -- billinghurst sDrewth 23:52, 29 October 2021 (UTC)

Also noting that MM's tool was able to do the birth and death year matches, and to extract full birth and death dates if known, a very handy feature.  -- billinghurst sDrewth 00:00, 30 October 2021 (UTC)
