User talk:Vladimir Alexiev/Archive 1

See also User talk:Vladimir Alexiev/Archive

VIAF - update

Hi! I've noticed the beautiful tables you posted in Wikidata:WikiProject Authority control#VIAF Games. Would it be possible to update them in the future? The last import of 570k VIAF identifiers (an increase of over 30%) might have changed the picture from a year ago. Thank you very much, --Epìdosis 17:02, 24 November 2019 (UTC)

  • @Epìdosis: Updated number of VIAF links in WD.
    • Re-counted 2019-11 with viaf-links-count.pl, see this gist. Tightened the counts: duplicated IDs are not counted twice (see the gist)
    • I also tried counting WD external-ids with SPARQL (.rq) but those queries timed out. HELP NEEDED.
    • Below are the new VIAF link stats but I don't have the time to merge them in Wikidata:WikiProject_Authority_control#VIAF_Links_per_Source. Could someone help with that?
    • You will notice some new types of ID. I've created two proposals but we need to explore the rest and make more proposals.

  Notified participants of WikiProject Authority control

Help needed. Cheers! --Vladimir Alexiev (talk) 16:28, 28 November 2019 (UTC)

     16121 ARBABN
    377331 B2Q
    376803 BAV
   1824982 BIBSYS
    300092 BLBNB
    185349 BNC
    206029 BNCHL
    666416 BNE
   2498551 BNF
     29007 BNL
    647483 CAOONL
     95020 CYT
    181601 DBC
    111686 DE663
   8451411 DNB
     54624 EGAXA
     69054 ERRR
    180762 FAST
    151623 GeoNames
    178348 ICCU
      9991 IMAGINE
   8535276 ISNI
   8454581 Identities-lccn
  11464602 Identities-viaf
    242578 JPG
    594543 KRNLK
  10560894 LC
    747513 LIH
    210682 LNB
     15879 LNL
      8569 MRBNR
    228054 N6I
   1144734 NDL
   1735345 NII
    925678 NKC
   1119857 NLA
      3940 NLB
   1902439 NLI
    159825 NLR
    547775 NSK
     33727 NSZL
   2735696 NTA
   1875069 NUKAT
     87316 ORCID
      1228 PERSEUS
   1619777 PLWABN
    440339 PTBNP
   2194878 RERO
    220157 SELIBR
     61360 SIMACOB
     51717 SKMASNL
       209 SRP
   3528681 SUDOC
    138138 SZ
     85957 UIY
     12342 VLACC
    143730 W2Z
   1951619 WKP
   8012817 Wikipedia
      1518 XA
   2142739 XR
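
For anyone wanting to re-run such a count, here is a minimal Python sketch of the per-source tally described above. It is not the viaf-links-count.pl script from the gist. It assumes a tab-separated links dump in which each line holds a VIAF URI and a `SOURCE|local-id` value, and it reads "duplicated IDs are not counted twice" as dropping repeated (source, id) pairs. The function name and the sample lines are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def count_links_per_source(lines: Iterable[str]) -> Counter:
    """Count distinct (source, local id) pairs per source code.

    Assumed line layout (not confirmed by the thread):
        <VIAF URI> TAB <SOURCE>|<local id>
    """
    seen = set()        # (source, local_id) pairs already counted
    counts = Counter()  # source code -> number of distinct ids
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or "|" not in parts[1]:
            continue  # skip lines that do not match the assumed layout
        source, local_id = parts[1].split("|", 1)
        key = (source, local_id)
        if key in seen:
            continue  # the same id listed again is not counted twice
        seen.add(key)
        counts[source] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not real VIAF data.
    sample = [
        "http://viaf.org/viaf/1\tDNB|100000001\n",
        "http://viaf.org/viaf/2\tDNB|100000001\n",  # duplicate id
        "http://viaf.org/viaf/2\tLC|n00000001\n",
    ]
    for source, n in sorted(count_links_per_source(sample).items()):
        print(f"{n:>10} {source}")
```

Keeping every pair in a set is simple but memory-hungry for a dump with tens of millions of lines. Sorting the file first and comparing adjacent lines would be one alternative.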

Worldcat Identities

I'm inserting 1.7M Worldcat Identities links. I've added them as 85 batches of 40k statements each (https://tools.wmflabs.org/quickstatements/#/batches/Vladimir%20Alexiev), and it will take some months for them to trickle through QS.

Related:

--Vladimir Alexiev (talk) 10:34, 6 February 2020 (UTC)

Bot vs QS

Please check batch https://tools.wmflabs.org/quickstatements/#/batch/25559 in QuickStatements. It stopped after 522 items had been edited and then changed its status to finished. I think you need to start this batch again. -- Hogü-456 (talk) 19:09, 4 February 2020 (UTC)

  • Thanks for notifying me, as I plan to run about 200 more such batches. It is running now and making progress. I did not restart it, so it was some WD internal thing. Vladimir Alexiev (talk) 01:52, 5 February 2020 (UTC)
  • I asked at Wikidata:Bot requests whether a bot could add property pairs to Wikidata from a QuickStatements-like format. I think this would be faster than QuickStatements. I don't know how high the capacity of QuickStatements is, but I think it is lower than what a bot can add.
  • A bot can add several things at the same time, which is not always possible in QuickStatements. Maybe you could consider using a bot if the number of batches you want to upload is that high. I think a bot would need 1.5 months to add them. With QuickStatements you will probably need 5 months.
    • Can you suggest a bot that I can use?
  • Keep in mind that other users also want to upload things, and that there are 2 commands per statement if you add both the property and the source. These numbers are for your information. Please take care not to upload too many batches in one day.
    • Why not? They are all posted, but only 2 of them are moving (at the snail's pace of 30 sec per claim). They are not blocking batches by other people.
  • I think QuickStatements may have problems handling such a large amount, but I am not sure whether this was the reason for the problems with the tool last year. Usually you can't do more than 100,000 edits in one day. In the end it is a good thing that it exists, and most of the time it works well. -- Hogü-456 (talk) 20:56, 5 February 2020 (UTC)
Your comments are interesting, especially the last one about the time needed to add that number. Maybe my information about how the batches run is wrong. I thought they were processed first in, first out, which would have been a problem, and some suggestions on my talk page seem to confirm that. In the last two days QuickStatements has not been running well. You can talk with Magnus and tell him if you have ideas for making it faster. It seems there are other databases that are much faster at adding information. -- Hogü-456 (talk) 17:46, 6 February 2020 (UTC)

https://m.wikidata.org/wiki/Wikidata:Contact_the_development_team#Batch_QuickStatements_has_become_unsustainably_slow shows a speedup of QS. Vladimir Alexiev (talk) 22:48, 6 March 2020 (UTC)
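
The figures quoted in this thread can be put side by side with a little arithmetic. The sketch below is only a back-of-the-envelope estimate for the situation before the speedup. It takes the 85 batches of 40k statements from the Worldcat Identities section and the two rates mentioned above (about 100,000 edits per day, and 30 seconds per claim with 2 batches moving). It assumes that every statement in a batch costs one edit and that each rate stays constant, neither of which the thread confirms. The function name is made up for illustration.

```python
import math


def days_needed(total_commands: int, commands_per_day: float) -> int:
    """Whole days needed to process total_commands at a constant daily rate."""
    return math.ceil(total_commands / commands_per_day)


# 85 batches of 40k statements each, as stated in the thread.
total = 85 * 40_000  # 3,400,000

# Rate A: the "usually not more than 100,000 edits in one day" figure.
print(days_needed(total, 100_000))  # 34

# Rate B: 2 batches moving, each at 30 seconds per claim.
seconds_per_day = 24 * 60 * 60
per_day = 2 * seconds_per_day / 30  # 5,760 claims per day
print(days_needed(total, per_day))  # 591
```

The gap between the two results shows how strongly the estimate depends on which rate actually applies.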

Semantic Scholar

Have you seen the recent progress in adding SemScholar links and metadata? [1] Sj (talk) 14:34, 21 February 2020 (UTC)

Request translation Isabelle de Charriere

Hello Vladimir Alexiev, could you write/translate the article on Isabelle de Charrière for the Bulgarian Wikipedia or find someone else to do that? That would be appreciated. Boss-well63 (talk) 10:44, 13 March 2020 (UTC)

@Boss-well63: Please make a request at https://bg.wikipedia.org/wiki/%D0%A3%D0%B8%D0%BA%D0%B8%D0%BF%D0%B5%D0%B4%D0%B8%D1%8F:%D0%97%D0%B0%D1%8F%D0%B2%D0%BA%D0%B8_%D0%BA%D1%8A%D0%BC_%D0%BF%D0%BE%D0%BB%D0%B8%D0%B3%D0%BB%D0%BE%D1%82%D0%B8 Vladimir Alexiev (talk) 09:05, 15 March 2020 (UTC)
Sorry but I can't read or write Bulgarian. That's why I asked you. Boss-well63 (talk) 16:18, 16 March 2020 (UTC)

Errors with WorldCat

Hi Vladimir. I've noticed that a number of your WorldCat additions match WorldCat entries for geographic locations to Wikidata entries for railway stations (see for example here, here, and here), and one matches a musical group to a station (here). I'm guessing that these are errors in the WorldCat data set. If the dataset has that many errors, should it really be added? Pi.1415926535 (talk) 22:58, 24 March 2020 (UTC)

Hi @Nezdek:! About your revert: If I understand correctly, Q60846274 and Loire Forez Agglomération (Q2986901) are different "incarnations" of the same Public institution of intermunicipal cooperation. I think that https://www.worldcat.org/identities/viaf-138726920 and http://viaf.org/viaf/138726920 describe the same institution. I think that VIAF and WorldCat don't have separate entries for the different "incarnations", so it's correct to apply them to any of those incarnations? --Vladimir Alexiev (talk)

@Vladimir Alexiev: I suggest you check the French Wiki, or the period of existence on Wikidata. There was a merger of 4 organizations in 2017. One of them had approximately the same name as the new one. That's why! Nezdek (discussion) 14:41, 7 April 2020 (UTC)

Joconde IDs

  • Is data.culture.fr still working? (I'm not able to open it).
  • If the IDs are still used, is the old-style ID (Txxx-yyyy) deprecated? Is there always a GUID for each old-style ID entry?
  • I propose to merge all Joconde UUIDs to one property, but the first two questions should be answered first.

--GZWDer (talk) 19:00, 3 April 2020 (UTC)

Are there entities with old style ID only?--GZWDer (talk) 06:23, 4 April 2020 (UTC)

@GZWDer: It appears to be down at the moment. But the thesauri are still valid. I found another description page at https://data.culture.gouv.fr/explore/dataset/les-vocabulaires-du-ministere-de-la-culture-et-de-la-communication/.

Each concept has one URL. The old style URLs (segmented per thesaurus) were migrated from some older system. But the new system GINKO can only allocate UUIDs.

I think we should merge their ID props into one, and that neither of these answers invalidates this.

Unfortunately we cannot have very powerful validation regexps because of that historic inconsistency. Vladimir Alexiev (talk) 06:18, 4 April 2020 (UTC)

Municipality vs. seat in WorldCat

Hi. Regarding Special:Diff/1152172243 and Special:Diff/1152172503: why should we use these identifiers inaccurately? This just encourages users to mess things up further, as illustrated by the addition of an inaccurate WorldCat ID based on an inaccurate VIAF ID. A separate VIAF entity for the municipality may be created later. Until then, I believe it's better to keep the municipality item without a VIAF link. Please note that earlier I also moved these VIAF IDs to the items about the settlements (seats), Q191106 and Q3044083, so the values are no longer distinct. 2001:7D0:81F7:B580:4926:AA1B:8734:8BCA 11:47, 7 April 2020 (UTC)

  • Hi! How can you tell the VIAF entry and WorldCat page are about the seat and not the municipality, given that there is no other entry in VIAF? There is no requirement, nor is it realistic, that WD and other global databases have 1:1 correspondence. If and when VIAF creates another entry, we will split them. --Vladimir Alexiev (talk) 13:26, 7 April 2020 (UTC)
The current municipality of Haapsalu was created recently, in 2017. As the VIAF entry is linked to other databases that refer to older sources, it must match something else, like the settlement. As for Padise, VIAF links only to LCCN (besides Wikidata), which refers to a "populated place", i.e. also not the (former) municipality. Also, note that the name of the latter municipality in Estonian is "Padise vald", and not just "Padise".
Of course there isn't 1:1 correspondence between Wikidata and other databases (and sometimes entries in these other databases really are far too vague, so it's impossible to tell what they are about). As a consequence, it seems natural to me that there is no reason to try to link every entry in another database that has no 1:1 correspondence. Otherwise, what's the point of providing these links if they only contribute to further confusion and errors? 2001:7D0:81F7:B580:4926:AA1B:8734:8BCA 15:13, 7 April 2020 (UTC)
I claim that the VIAF record is about both municipality and seat because the two are so closely related, and because there's no other VIAF record. The WorldCat page is NOT wrong because it shows docs relevant to both the municipality and the seat. What confusion and errors do you mean in this case?
BTW, thank you for all your other fixes; those were real examples of confusion and errors. I'm collecting them at https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors#WorldCat_Identities_errors and will report them to OCLC. --Vladimir Alexiev (talk) 11:23, 8 April 2020 (UTC)
Which docs does this WorldCat entry show that are specifically about the municipality created in 2017? I don't see any.
I doubt that we can conclude anything from the fact that there's no other VIAF record. There are many Wikidata items for which there is no matching or even closely related VIAF entity anyway. Take most of the 4k+ settlements in Estonia: I hope you aren't suggesting that these should be linked to something that is "close enough", e.g. the VIAF entity for the country.
A municipality and its seat may be closely related (or poorly distinguishable) in some other countries, but in Estonia at least the distinction is very clear: both have distinct official boundaries, if a municipality is dissolved the settlement generally remains as such (as is the case for Padise (Q3044083)), and even their names match only partly, if at all.
Inaccurate use of identifiers suggests that the municipality and the seat are the same, while they are actually clearly distinct. Further inaccurate data is then added on the basis of the inaccurate identifiers, which seems to be pretty much what happened when these WorldCat IDs were added based on VIAF.
Ok, I agree with you as long as you don't merely delete the WorldCat ID but move it to the correct item. --Vladimir Alexiev (talk) 12:10, 10 April 2020 (UTC)
As for Special:Diff/1152998089: this WorldCat entry, per its label and the works it mentions, is about some historical school. It links to VIAF and LCCN entries, but those entries are about the settlement; nothing suggests that they are about the school instead. So most likely the WorldCat entry is erroneously linked to these other databases. 2001:7D0:81F7:B580:8C08:F986:5B46:F52E 16:38, 8 April 2020 (UTC)
Yes, the label is about a school. But the labels on the right are various names of the village. There's also a military map "Reihe V. Blatt 5. Fellin. / bearbeitet i. d. Kartogr. Abt. d. Stellv. Generalstabes d. Armee. - Maassstab 1:126 000. - [Berlin], Druckauflage 1915" that's likely about the village. If the books are about a school in the village, then by extension they are also about the village. "WorldCat entry is erroneously linked to these other databases": which other databases do you mean? This page merely shows books from the WorldCat catalog that are indexed with the VIAF and LCSH terms. --Vladimir Alexiev (talk) 12:10, 10 April 2020 (UTC)