Wikidata:Property proposal/Il Sole 24 Ore ID

Il Sole 24 Ore ID

Originally proposed at Wikidata:Property proposal/Authority control

Description: identifier for a topic on the website of Il Sole 24 Ore newspaper
Represents: Il Sole 24 Ore (Q1658262)
Data type: External identifier
Domain: human (Q5), group of humans (Q16334295), institution (Q178706), historical event (Q13418847), company (Q783794)
Allowed values: [a-z]+(-[a-z]+[0-9]+)*
Example 1: abaya (Q305718) → abaya
Example 2: Italian Data Protection Authority (Q2254049) → garante-privacy
Example 3: Zheng Bijian (Q3368176) → zheng-bijian
Example 4: Luca Zaia (Q508338) → luca-zaia
Example 5: Strait of Messina Bridge (Q373856) → ponte-stretto
Example 6: 2018 Italian general election (Q16970032) → elezioni-2018
Source: https://argomenti.ilsole24ore.com
Planned use: Manual addition
Number of IDs in source: thousands
Expected completeness: always incomplete (Q21873886)
Formatter URL: https://argomenti.ilsole24ore.com/$1.html
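
As a rough check of the fields above, the following sketch tests the six example IDs against the "Allowed values" pattern and expands the formatter URL. It assumes the pattern must match the whole identifier. Under that assumption the pattern as written seems to accept only "abaya", because every hyphenated segment is required to end in digits. `RELAXED` is a hypothetical alternative that is not part of the proposal, and the helper names `is_valid` and `build_url` are made up for illustration.

```python
import re

# Pattern copied verbatim from the "Allowed values" field of the proposal.
ALLOWED_VALUES = re.compile(r"[a-z]+(-[a-z]+[0-9]+)*")

# Hypothetical relaxed pattern (an assumption, not part of the proposal):
# every hyphen-separated segment may contain lowercase letters or digits.
RELAXED = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")

# Formatter URL from the proposal; "$1" is the placeholder for the identifier.
FORMATTER_URL = "https://argomenti.ilsole24ore.com/$1.html"

# The six example identifiers listed in the proposal.
EXAMPLE_IDS = [
    "abaya",
    "garante-privacy",
    "zheng-bijian",
    "luca-zaia",
    "ponte-stretto",
    "elezioni-2018",
]


def is_valid(identifier, pattern=ALLOWED_VALUES):
    """Return True if the whole identifier matches the pattern."""
    return pattern.fullmatch(identifier) is not None


def build_url(identifier):
    """Substitute the identifier into the formatter URL."""
    return FORMATTER_URL.replace("$1", identifier)


if __name__ == "__main__":
    for identifier in EXAMPLE_IDS:
        print(
            identifier,
            "as written:", is_valid(identifier),
            "relaxed:", is_valid(identifier, RELAXED),
            build_url(identifier),
        )
```

If the stricter reading is not what was intended, a pattern along the lines of `RELAXED` would cover all six examples, but the right choice depends on the IDs actually found on the site.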

Motivation

A reliable database by one of the most important national newspapers. It is mostly focused on the economy, hence the database's attention to companies, but the scientific articles of Il Sole 24 Ore, for example, are quite accurate as well; in my opinion they are probably slightly better than those of other major newspapers.

They group the entries into subsets, but in reality a single direct link works as well; for example, compare [1] and [2]

In general, the newspaper website also has a simple string search, and it looks like they write more detailed descriptions as topics become more important.--Alexmar983 (talk) 14:49, 18 April 2020 (UTC)

  Notified participants of WikiProject Italy--Alexmar983 (talk) 14:58, 18 April 2020 (UTC)

Discussion

I am so sorry... I checked dozens of links; in all of them the categories were a complication, and I was happy to ignore them as long as the direct link was correct. It looks like you found some bugs. We could try to isolate something, I guess. I wish I had found them myself... very unlucky. Sigh. I don't have time right now, but I will look carefully. But I will have to look so carefully that I guess it duplicates the manual work of adding them. Such a pity. I don't understand why Italian newspapers manage their IDs so poorly; this was the only one that looked solid, so it's worth saving somehow.--Alexmar983 (talk) 16:05, 18 April 2020 (UTC)
(I have simplified my comment, but you can find all the examples in the history of this page; I really checked a lot):
  • The journalist/writer profiles are all fine. I found one with no description, but this is common with established IDs for publishing companies. One or two might be blank, but it looks like a process: if you write more than one article, you get a profile, and that profile is progressively enlarged over the years. So that part can be saved. "People" looks like "writers/journalists", which makes sense since they are both people. It's a direct link... It looks like in this case the profile is always created on purpose (the creation of profiles for recurring writers looks more automatic), so it's never blank and it's quite complete from the beginning.
  • "companies". This category is VERY precious: we have a lot of reliable IDs for people, but topics related to the economy are always less "mapped". They all look fine as links.
  • "keywords" has 3966 entries. I compared A LOT of long and short URL forms... and no problem. I am struggling to find a "bug" of a short URL with no redirect there. Very useful descriptions of technical concepts, BTW. Please let me know if you find any problem with the URL architecture here. The only thing that caught my attention while scrolling the whole list is this one with a question mark ("sell of golden coins" would have been a better string), but strange phrasing or unusually specific IDs are a small minority in every archive. You ignore them the same way you would a wrong VIAF ID.--Alexmar983 (talk) 17:13, 18 April 2020 (UTC)
  • "storie" has only 17 topics, and its very existence is odd. They are simply "keywords" with a looser, less well-defined grouping. It's like the golden coins example above... this one should be "2016", this one "Giro d'Italia 2017", this one should be under "organization"... it should simply be ignored as a whole; it's quite irrelevant. But all redirects seem to work with the shortcut, am I right? Also, the only duplicate, Salone del mobile, seems to link via the shortcut to the one under "parole chiave"... but actually the first one is "Salone del mobile di Milano", while the other one is the combination of all the international editions. The same string is given to both because that category is not fully integrated with the rest, and nobody noticed. It's just this whole subcategory that is nonsense. But IF you use the direct link, you avoid dealing with such nonsense in this case.--Alexmar983 (talk) 17:50, 18 April 2020 (UTC)
  • So the problem is that subcategories should be used for visualisation, not architecture; otherwise they make a mess because they overlap. Or they are randomly distributed... that's why you have "wikipedia" under "companies" and "Vinitaly" in a generic category. That's why they should be ignored, focusing only on the bigger, established ones. For some reason, in "enti ed organizzazioni" ("organizations") they messed up even more than in "storie" (or there are slightly more entries, so different types of mistakes emerge naturally). It's probably because it's a poorly managed leftover category with only 84 entries. This explains why you also found a few non-working short URLs there (and so far only there).
  • My advice: keep the short URL, ignore the 17 entries in "storie" and the 84 entries in "enti ed organizzazioni", use the short URL in all the other, much bigger categories, and don't split the IDs. The short URL looks fine so far. In the beginning I simply sampled in proportion to the size of the categories; that's why I did not find these minor bugs, because they probably don't appear in the bigger categories.--Alexmar983 (talk) 17:50, 18 April 2020 (UTC)
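
To illustrate why sampling in proportion to category size can miss problems confined to small categories, here is a minimal sketch using only the three entry counts quoted in the discussion. The sample size of 50 is a made-up figure, the sizes of the people and companies categories are not given here and are left out, and `proportional_allocation` is a hypothetical helper name.

```python
# Entry counts quoted in the discussion. The sizes of the people and
# companies categories are not stated, so they are omitted (assumption).
CATEGORY_SIZES = {
    "parole chiave": 3966,
    "enti ed organizzazioni": 84,
    "storie": 17,
}

# Hypothetical number of links checked by hand; not a figure from the discussion.
SAMPLE_SIZE = 50


def proportional_allocation(sizes, sample_size):
    """Expected number of sampled entries per category under proportional sampling."""
    total = sum(sizes.values())
    return {name: sample_size * count / total for name, count in sizes.items()}


if __name__ == "__main__":
    allocation = proportional_allocation(CATEGORY_SIZES, SAMPLE_SIZE)
    for name, expected in allocation.items():
        print(f"{name}: about {expected:.1f} of {SAMPLE_SIZE} sampled links")
```

With these numbers, almost the whole sample falls in "parole chiave", about one link in "enti ed organizzazioni", and well under one in "storie", so a fault limited to the two small categories could easily go unnoticed.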
Thank you. I have also written to the website suggesting that they fix those small bugs in a few URLs, but the ID can be used efficiently on thousands of items.--Alexmar983 (talk) 12:03, 22 April 2020 (UTC)