Wikidata talk:WikiProject Periodicals


Notability of newspaper articles

  Notified participants of WikiProject Periodicals Hi all! Recently @Gerwoman: reported to me that @LAP959: is creating some items for newspaper articles; Gerwoman and I tend to think that they could be outside of WD:N (and thus should be deleted), but since I haven't found specific guidelines about the notability of newspaper articles (Wikidata:WikiProject Journalism just redirects to an abandoned Wikidata:WikiProject Journalists), I am opening this discussion in order to establish some sort of guideline about them. Please express your opinion. Thanks, --Epìdosis 15:32, 6 July 2022 (UTC)

-Epìdosis, I try to be guided by the ideas of Wikidata as a repository of sources presented at the WikidataCon in 2017 (video: WikiCite: Wikidata as a structured repository of bibliographic data - Talk at WikidataCon 2017). Like books and scholarly articles, newspaper articles from reliable sources provide the references needed to justify claims, complete with the appropriate metadata (title, author, publisher, date of publication, topics). They can be cited in Wikidata and, maybe in the future, across other projects as well. LAP959 (talk) 15:57, 6 July 2022 (UTC)

Calling also @Veverve: who reported a case. --Gerwoman (talk) 16:16, 6 July 2022 (UTC)

WD:N says NOTHING about whether the source is "reliable": only that "It fulfills a structural need". (That's the third of three criteria cited in WD:N. It seems to me that the first two are special cases of this third criterion.)
More generally, I'm opposed to deleting references, particularly in Wikidata.
I know that w:Wikipedia:Deprecated sources encourages people to delete references to sources officially deprecated. I think it's appropriate and necessary to discuss the relative credibility of specific sources and articles. Wikipedia:Reliability of Wikipedia notes that Wikipedia is virtually unique in getting people with very different perspectives to collaborate in crafting a description of the subject at hand that all sides can more or less live with. In that context, deleting references is an obstacle to that kind of debate.
The best research that I know relating to that is The Wisdom of Polarized Crowds (Q47248083): The authors did a content analysis of all edits to English Wikipedia articles relating to politics, social issues and science from its start to December 1, 2016, roughly 5 percent of the English Wikipedia. They found that the best articles tended to have the most diverse groups of editors. They said that 95 percent of articles could benefit from more conflict; only 5 percent of articles had conflict that on balance was counterproductive.
In sum, we need more and more diverse editors on Wikipedia. Deprecating sources, in my judgment, drives away people we should be cultivating and engaging in serious discussion about the credibility of the sources they want to cite. We should NOT be deleting references they think are credible.
An example of this stupidity: Wikipedia:Kris Kobach#Political positions, accessed 2022-07-06, says, "In October 2017, Kobach wrote a column in Breitbart News which said that immigrants commit a disproportionate share of crimes, ... [citing] a column by Peter Gemma, who is associated with white supremacist groups and the American Holocaust denial movement." That passage in Wikipedia cites an article in the w:Kansas City Star but NEITHER of the two articles in Breitbart in question. How stupid is that? I'm positively discouraged from checking the source on Breitbart myself. I'm supposed to accept the claims of the author of that Kansas City Star article and the Wikipedia editor who posted that summary. To me, that's an attack on the spirit and intent of Wikipedia: I think we should be engaging people who believe Breitbart in honest dialog, to the extent that they are willing to discuss the evidence relating to the claims in those articles. We should NOT be driving them away by deleting references they find credible.
But the standards for notability on Wikidata (WD:N) are much looser than those on Wikipedia: "An item is acceptable [for Wikidata] if and only if ... it meets at least one of" three criteria, and the first two are special cases of the third: "It fulfills a structural need". Thus, for example, w:Breitbart News is deprecated per w:Wikipedia:Deprecated sources, but it has a Wikipedia article and a Wikidata item (Q4960434), and I think that's appropriate. DavidMCEddy (talk) 16:41, 6 July 2022 (UTC)
It is my belief none of those items meet WD:N. Wikidata is not here to gather each and every news article. Some news articles are notable as per pt. 2 of WD:N, e.g. Q111516801, because they are discussed by third party reliable sources. Veverve (talk) 17:31, 6 July 2022 (UTC)
I'm on the fence and in between... Currently they do not exactly meet WD:N (not the ones I've seen at least) but they do have potential value. Linking an article to its periodical and the periodical to its articles is not enough to meet the third criterion (it's obvious and tautological; almost self-referencing). Following @DavidMCEddy: maybe we could make WD:N more explicit to specify that "a structural need" needs to be at least a little meaningful, for instance for articles to be actually used as articles. In the end, I'm not in favour of deletion but rather of deleting only the articles that are not really used (and leaving some time for LAP959 to use them, obviously). Cheers, VIGNERON (talk) 17:58, 6 July 2022 (UTC)
@VIGNERON: the thing is, LAP959 created items for news articles which are useless, only to use said useless items with Property:P1343; this loophole makes them serve a structural need. See for example Q112939802 being used in Q3954094. It is not useful to catalogue each and every news article discussing each and every topic, especially ones widely known and discussed; we already have encyclopedia items for this purpose. Veverve (talk) 18:12, 6 July 2022 (UTC)
@Veverve: there are unused items (except - I agree with you - for obvious uses that indeed don't count) but I don't think they are useless. Let's take an example: Ireland: a woman dies after being prevented from having an abortion (Irlande : une femme meurt après avoir été empêchée d'avorter) (Q112891907) was totally unused, but I added it as a reference on Q5247527#P570 (which was unreferenced so far, with only a link to English Wikipedia, which is not a source) and Q5247527#P509 (which has a citation needed constraint (Q54554025)). It could be used for many other statements: this article contains a lot of data that could serve as references (thus making Wikidata actually smaller than repeating the same URL and metadata each time). Cheers, VIGNERON (talk) 18:33, 6 July 2022 (UTC)
The world has a major problem today with political polarization, e.g., the 6 January w:2021 United States Capitol attack. w:Robert W. McChesney's solution is to dramatically increase the number of local news outlets. (See, e.g., The Local Journalism Initiative: a proposal to protect and extend democracy (Q109978060) or To Protect and Extend Democracy, Recreate Local News Media (Q109978337).)
A definition of "notability" that is too narrow is a form of censorship. The international big money interests who are funding the most divisive media in the world today would LOVE to pay trolls to use Wikimedia Foundation criteria for WD:N or Wikipedia:Wikipedia:Deprecated sources as tools to censor information they don't like.
They'd also be happy with Wikimedia Foundation projects convincing their followers that Wikipedia is biased against them. DavidMCEddy (talk) 20:22, 6 July 2022 (UTC)
If a newspaper article is used as a reference for other substantive information (somebody's birth date, citizenship, employment, etc.) then I think it is fine to have that source article as an item in itself. If there is no use of the article as a reference then I don't think it should be considered notable on its own, and if we need to update the notability guidelines for this specific case (or the slightly more general case of small works that may be used as references) I think that would be fine. ArthurPSmith (talk) 21:12, 6 July 2022 (UTC)
If an individual newspaper article is not used as a reference for anything, then I don't see a problem with it not being in Wikidata, provided we allow time between when someone creates a Wikidata item for a newspaper article and the time they cite it someplace. For example, 1.5 years ago I translated the Wikipedia article on w:Julia Cagé from French into Spanish. In the process, I pressed "post" before I intended to, the article was speedily deleted, and I got blocked from editing on the Spanish-language Wikipedia. Almost three months later, I finally overcame that block, and the es:w:Julia Cagé article was published.
I mention that to say that I think we should not be too trigger happy to delete things from Wikidata if someone claims they plan to use them.
I'd like to ask another question: There's a U.S. Newspaper Directory, 1690-Present maintained by the Library of Congress. The database currently contains 157,521 items. I recently downloaded the entire database and learned that "end_year" = 9999 in 13 percent of the cases. Clearly there are data quality problems. I'd like to see a relationship between Wikidata and this database be developed that would allow people to clean up the Wikidata items to improve that database. For many of these cases with "end_year" = 9999, other information is available to tell whether that newspaper is still publishing, and if not what the actual "end_year" was. In any event, having data like that in Wikidata makes it possible to crowd source data cleaning.
Comments? Thanks, DavidMCEddy (talk) 22:18, 6 July 2022 (UTC)
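As a rough illustration of the data-quality check described above, here is a minimal Python sketch that measures how often a placeholder "end_year" of 9999 appears and lists the rows that would need a real end year. The record layout, the function names, and the sample rows are assumptions made up for this example; they are not the actual Library of Congress export format.

```python
SENTINEL_END_YEAR = 9999  # placeholder value mentioned in the comment above

def sentinel_share(records, field="end_year", sentinel=SENTINEL_END_YEAR):
    """Return the fraction of records whose `field` equals the sentinel."""
    if not records:
        return 0.0
    flagged = sum(1 for record in records if record.get(field) == sentinel)
    return flagged / len(records)

def needs_review(records, field="end_year", sentinel=SENTINEL_END_YEAR):
    """Return the records that still carry the sentinel value."""
    return [record for record in records if record.get(field) == sentinel]

# Hypothetical rows, invented for illustration; not taken from the real directory.
sample_records = [
    {"title": "Example Gazette", "start_year": 1890, "end_year": 1912},
    {"title": "Example Courier", "start_year": 1955, "end_year": 9999},
    {"title": "Example Tribune", "start_year": 1971, "end_year": 2004},
    {"title": "Example Herald", "start_year": 1988, "end_year": 9999},
]

share = sentinel_share(sample_records)
review_titles = [record["title"] for record in needs_review(sample_records)]
```

The list returned by `needs_review` could serve as a worklist for editors who know whether a given newspaper is still publishing.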

How do you plan to determine whether an item about a newspaper article is used as a citation on one of the Wikipedias, for example via Template:Cite Q (Q22321052)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:34, 7 July 2022 (UTC)

Some of your comments are changing my mind, but this is what I thought a few days ago: An article deserves an item in WD if it has some notability by itself, for example a Pulitzer Prize or something like that. Here are some examples: https://www.aarweb.org/AARMBR/About-AAR-/Award-Programs-/Awards/Journalism-Award-Winners-and-Sample-Articles.aspx https://www.npr.org/2022/05/09/1097668290/2022-pulitzer-prize-news-winners?t=1657034095607 Not every article about a relevant event is per se a relevant item for Wikidata. --Gerwoman (talk) 16:01, 7 July 2022 (UTC)
In favor of the notability of all published articles ever, if we really want to "sum all human knowledge". Nomen ad hoc (talk) 06:37, 8 July 2022 (UTC).

  • I am all for news articles having their own entries whether they are used as references in other Wikidata items or not. I imagine one day we will be able to crawl through all the public domain newspaper archives with an AI and create entries for every obituary and funeral notice. The newspaper archive GenealogyBank recently did just that and the index is at FamilySearch. That prompted Ancestry.com to do the same with their Newspapers.com website, index the same material, include wedding announcements, and even ascertain the relationships between the people mentioned. It would be similar to the guy who extracted all the images for Commons from books scanned at the Internet Archive and grabbed any adjacent text for context. --RAN (talk) 19:06, 30 July 2022 (UTC)

requesting advice on a newspaper that was closed down and then reopened

Hello group, Wikidata:Wikiproject Periodicals/Participants I am working on a newspaper that opened in 1906, lasted for only four issues until it was ordered to shut down, but then, eleven years later, was revived under the same title, by the same political party and under the same masthead. I see properties for inception (P571) and "dissolved, abolished or demolished date" (P576). I have been entertaining options such as "followed by", but that requires creating a new WD item for essentially the same title. It would be great to have properties "Ceased publication" and "Resumed publication" but they are not available at this point. Has anyone dealt with a similar situation and can share how they got around this problem? It is such a typical situation that I cannot imagine it has not come up before. MatrosMonk (talk) 20:33, 6 October 2022 (UTC)

@MatrosMonk: Use significant event (P793) with an appropriate value like temporarily closed (Q55653430) though we might need a more suitable item than that? ArthurPSmith (talk) 18:18, 14 October 2022 (UTC)

dissolved, abolished or demolished date

  Notified participants of WikiProject Periodicals Hi all, just to let you know, the use of the property dissolved, abolished or demolished date (P576) on items for periodicals currently gives a type constraint error, and has done for some time now (I'm not sure for how long exactly, though I think it's been a week at least). See for instance Epoch (Q5383780), the example used in the main WikiProject page to demonstrate use of this property. Does anyone have any idea why this is happening now? I've been trying to look into whether recurring event (Q15275719) in particular (one of P576's type constraints) was recently removed from a periodical-related item, for instance, but I haven't found anything yet. Monster Iestyn (talk) 22:20, 27 October 2022 (UTC)

This issue appears to be resolved now (not that anyone responded here in over a month anyway...) Monster Iestyn (talk) 02:49, 8 December 2022 (UTC)
Never mind, it's still happening. Monster Iestyn (talk) 15:30, 10 December 2022 (UTC)

Untangling two Chinese journals

Since its creation (!!!) until today, Science in China. Series A: Mathematics (Q15758572) contained data for two journals from China: "Science in China. Series A: Mathematics" (ISSN 1862-2763) and "Scientia sinica", the English edition of "Zhongguo kexue" (ISSN 0250-7870). I have now transferred all the latter journal's data to the existing Scientia sinica (Q107018940) (which previously had only a cancelled ISSN linked to ISSN 0250-7870), but unfortunately there remain about 380 article items which may have been incorrectly linked to the mathematics one as a result of the conflation (link). It would take forever to check all of these manually, so is there some way a bot can automatically relink them to Scientia sinica (Q107018940) instead, if it can verify which journal each article belongs to? Monster Iestyn (talk) 20:11, 28 April 2023 (UTC)
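One way a bot could make the verification step concrete is sketched below in Python: it decides the target item from the ISSN(s) found in an article's external metadata and declines to decide when the evidence is missing or contradictory. The two ISSNs and QIDs come from the comment above; the function names and the assumption that each article's source metadata exposes a matching ISSN are hypothetical, and this is not an existing bot.

```python
# ISSN -> journal item, as given in the comment above.
ISSN_TO_JOURNAL = {
    "1862-2763": "Q15758572",   # Science in China. Series A: Mathematics
    "0250-7870": "Q107018940",  # Scientia sinica
}

def normalize_issn(raw):
    """Return an ISSN as 'NNNN-NNNN', or None if it is not 8 characters long."""
    compact = raw.replace("-", "").strip().upper()
    if len(compact) != 8:
        return None
    return f"{compact[:4]}-{compact[4:]}"

def target_journal(article_issns):
    """Return the journal QID the ISSNs point to, or None if unclear.

    None is returned when no known ISSN is present or when the ISSNs
    point to both journals, so that such articles go to manual review.
    """
    matches = set()
    for raw in article_issns:
        issn = normalize_issn(raw)
        if issn in ISSN_TO_JOURNAL:
            matches.add(ISSN_TO_JOURNAL[issn])
    return matches.pop() if len(matches) == 1 else None
```

For example, `target_journal(["02507870"])` yields the Scientia sinica item, while an article carrying both ISSNs yields `None` and would be left for a human to check.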

Journal name changes

  Notified participants of WikiProject Periodicals

There are many journals whose names have changed over time. Consider this list of Springer journals, for example, which have had a name change. For many of them, we have two separate items for the old and the new name, which seems to be incorrect (e.g. Palgrave Communications (Q27727606) and Humanities and Social Sciences Communications (Q64521835), or Experientia (Q21385347) and Cellular and Molecular Life Sciences (Q5058352)). That being said, a name change necessitates an update to the ISSN and ISSN-L, and so before and after the name change, the ISSNs are different, and there are occasionally other external sites that give them independent identifiers in these cases as well. Should we be merging such items? If so, how should we mark identifiers like ISSN which apply to the old vs the new name? Should we use time-based qualifiers (start time (P580) and end time (P582)) or title-based ones (official name (P1448), applies to part (P518), or title (P1476))? Thanks. לוכסן (talk) 11:40, 14 July 2023 (UTC)

Hi @לוכסן, I would argue strongly against merging such journals. They have different names and different identifiers; please keep them separate. We have properties to link them together (e.g., replaces (P1365) and replaced by (P1366)), and in some cases relationships can be complicated, involving different dates, publishers, etc. That information can be valuable to those of us trying to understand a journal's history. For a fun example see http://alec-demo.herokuapp.com/Q21386186. To give some context, I work in taxonomy (Q8269924) where I am trying to link names for species to their original publications, often based on data from other publications and databases. These sources typically use the names of journals at the time (or the ISSNs), hence any retrospective renaming (e.g., by merging different names into a single item) makes this project harder. It also has implications for projects that use Wikidata to generate citations (e.g., https://en.wikipedia.org/wiki/Template:Cite_Q). If items for old journal names are merged with items for newer names, citations in Wikipedia that are generated by CiteQ will be incorrect. Rdmpage (talk) 06:24, 15 July 2023 (UTC)
@לוכסן, Rdmpage: I'm not sure, but here are some thoughts:
  • if the name is the only thing that changes and it's the same journal (looking at the Wikipedia article may also be a good clue), then having two different items will produce a lot of redundancy and confusion
    • if we really want to keep them separate then to avoid redundancy we should have 3 items per the "Bonnie and Clyde" solution: one general item for the journal and one for each of the two titles (with only the data specific to each title)
  • usually we merge them, and qualifiers are made precisely to handle such changes (e.g. if a person or city changes its name, then we have only one item with the names qualified)
  • if it's the same journal with a different title, then replaces (P1365)/replaced by (P1366) is incorrect
  • if templates like Cite_Q are not good enough to look at qualifiers then the template should be corrected; we don't model data according to bad tools.
  • if there is more change than just the name (for instance two journals merged or split, like European Journal of Taxonomy (Q21386186), a change in periodicity, etc.), then yes, obviously, too many qualifiers defeat the purpose, a merge is often technically impossible and separate items should be preferred (and again the "Bonnie and Clyde" solution may be applied if needed)
  • for the record, 43968 periodicals have more than one title: https://qlever.cs.uni-freiburg.de/wikidata/d3U212 (and up to 14 titles!).
  • here is another example: Annales de Bretagne et des Pays de l'Ouest (Q2850663)/Annales de Bretagne (Q96701910), it's clearly the same journal (the Wikipedia articles are on both and - per copyright law - the Wikisource texts are only the old ones) and the current situation is messy (triggering a lot of constraint violations).
Cheers, VIGNERON en résidence (talk) 13:10, 18 July 2023 (UTC)
@VIGNERON en résidence Obviously there are different ways to model these situations. I favour having distinct items if major identifiers such as Property:P1687 or Property:P8375 differ. In the example of Q2850663 and Q96701910 most of the constraint violations are because of having multiple web sites for the journals without indicating a preferred value, and a lack of language qualifiers, rather than anything to do with having two items for the “same” journal. I think it is reasonable to model this as two items, one following the other. That keeps things simple and avoids having to add date range qualifiers to lots of properties.
I understand the concern about modelling data versus supporting tools, but surely these will always be concerns. If the model is too complex to be easy to use then we won’t have useful tools. Tools like CiteQ aim to bring value to users outside Wikidata itself, I think it would be useful to consider how modelling data affects those who make use of the data.
Obviously there is no one answer to multiple items versus one item with date qualifiers, and I suspect different Wikidata communities will have different approaches. But in cases like journals which are linked to multiple other items (such as articles) and on which other projects (Scholia, WikiCite, CiteQ) depend, can I suggest caution before people decide to merge things that they consider to be the “same” when others might not share that view. Rdmpage (talk) 07:13, 20 July 2023 (UTC)
Yes we should be cautious about merging things - however there are many journals which have occasional name changes without any other change (resulting in ISSN-L changes) and I don't see any useful purpose in keeping separate items in this case when we don't do so for people or organizations or other entities that may change their names. ArthurPSmith (talk) 16:04, 24 July 2023 (UTC)
@Rdmpage, @VIGNERON en résidence, @ArthurPSmith:
Thank you all for your help. The case I was immediately interested in was for the journal Ha'Ivrit (Q6590196), whose name from the journal's inception in the year 5705 AM (Q2817621) up until 5770 AM (Q2740731) was "לשוננו לעם", but starting in the 5771 AM (Q2817680) edition, its name became "העברית". Making separate items for this would be very confusing, and not helpful, as the identity of the journal did not meaningfully change and, for example, regardless of whether you refer to the journal under the old or new name, it has the same inception (P571), and it would be incorrect to say that the journal, under the new name, was founded later, as it's truly the same entity, just with an updated name (and due to ISSN rules, new name means new ISSN, so also that).
I want to add the ISSN(s) to the item, but was blocked wondering how to do this considering it has had two.
(As a side note: Note the desire to mark the start and end dates according to the Hebrew calendar as the editions of the journal are identified by Hebrew calendar years and not Gregorian calendar years, (as similarly discussed in Wikidata talk:WikiProject Award#Annual awards according to other calendar systems in the context of prizes which are awarded according to Hebrew calendar (Q44722) years).)
לוכסן (talk) 19:42, 27 August 2023 (UTC)
Tagging @Rdmpage, VIGNERON en résidence, and ArthurPSmith once more. לוכסן (talk) 21:51, 22 September 2023 (UTC)
I can't say much beyond my own preference is to have separate items for journals that change ISSN, and connect them by a property describing their relationship. This avoids issues with multiple identifiers, and name changes may be meaningful to some users, even if the journal is the "same". I can't say more because the item being discussed Ha'Ivrit (Q6590196) has no identifiers, at least none that work. The one link it has returns a 404. Rdmpage (talk) 18:59, 23 September 2023 (UTC)
@Rdmpage: Thank you for the response. If you wouldn't mind taking a second look at Ha'Ivrit (Q6590196), I added a number of identifiers to it now. One for the original name "Leshonenu La'Am" and a second for the new name "Ha'Ivrit". Thank you. לוכסן (talk) 19:23, 23 September 2023 (UTC)
@לוכסן: Use qualifiers start time (P580) and end time (P582) on the ISSNs, official name, website, or any other properties that might be different - though if there are a large number of differing properties then I agree with Rdmpage's advice to split it into two. In general though it is simpler to handle fewer items than more, and clearer to end-users who may otherwise wonder which one to use. ArthurPSmith (talk) 16:52, 25 September 2023 (UTC)

As before, if this was a journal I was working on I would split it into two, following the external identifiers ISSN, OCLC, and Sudoc, and link the two, together with start and end dates. How important this is probably depends on what the goal is. I want to link articles to the correct journal (typically via ISSN) and ensure that articles can be correctly cited by tools such as citeproc.js, hence I tend to treat distinct ISSNs as distinct journals. I suspect people will choose whichever solution works best for them, and this may evolve over time (hopefully without edit wars). Rdmpage (talk) 08:32, 24 September 2023 (UTC)
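To make the single-item option discussed in this thread more tangible, here is a minimal Python sketch of how a consumer (a citation tool, say) could pick the value that applies in a given year when statements carry start time (P580) and end time (P582) qualifiers. The two titles and the Hebrew-calendar years are those given for Ha'Ivrit (Q6590196) above; the `QualifiedValue` class and `value_at` function are hypothetical names and do not reflect the actual Wikidata data model or API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class QualifiedValue:
    value: str
    start_year: Optional[int] = None  # stands in for start time (P580)
    end_year: Optional[int] = None    # stands in for end time (P582)

def value_at(statements, year):
    """Return the first value whose qualifier range covers `year`, else None."""
    for statement in statements:
        after_start = statement.start_year is None or statement.start_year <= year
        before_end = statement.end_year is None or year <= statement.end_year
        if after_start and before_end:
            return statement.value
    return None

# Official names of Q6590196 with Hebrew-calendar years, as described in the thread.
official_names = [
    QualifiedValue("לשוננו לעם", start_year=5705, end_year=5770),
    QualifiedValue("העברית", start_year=5771),
]

name_in_5750 = value_at(official_names, 5750)
name_in_5771 = value_at(official_names, 5771)
```

The same lookup would work for ISSNs or websites qualified in the same way. With separate items linked by replaces (P1365) and replaced by (P1366), no such lookup is needed, which is the simplicity argument made for the split approach.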

Describing a page range for an article within a PDF file

I asked at https://www.wikidata.org/wiki/Wikidata:Project_chat#Describing_a_page_range_for_an_article_within_a_PDF_file whether, when referencing a magazine available as a PDF with 2 leader pages before the numbered pages, I should refer to the visible page number or the PDF page number in the article, and whether it would be worthwhile to document the number of lead pages in the journal description so the PDF page could be computed from the visible page. Vicarage (talk) 14:02, 29 September 2023 (UTC)
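The computation mentioned above is a fixed offset. A minimal Python sketch follows, assuming the visible numbering starts at 1 immediately after the leader pages and runs without gaps; the function names are made up for illustration.

```python
def pdf_page(visible_page, lead_pages):
    """Convert a printed (visible) page number to a 1-based PDF page index."""
    if visible_page < 1 or lead_pages < 0:
        raise ValueError("visible_page must be >= 1 and lead_pages >= 0")
    return visible_page + lead_pages

def pdf_page_range(first_visible, last_visible, lead_pages):
    """Convert a visible page range to the matching PDF page range."""
    return pdf_page(first_visible, lead_pages), pdf_page(last_visible, lead_pages)

# With 2 leader pages, an article on visible pages 10-14 sits on PDF pages 12-16.
article_range = pdf_page_range(10, 14, lead_pages=2)
```

Storing the visible page numbers on the article and the lead-page count on the issue would let either form be derived from the other.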

USNPL ID links now go to malware

The domain for the US Newspaper Links (USNPL) now redirects to malware:
https://www.usnpl.com/

So the 6K Wikidata items with a USNPL ID now have malware links, e.g., Wikidata's NY Times page has this malware URL:
https://www.usnpl.com/search/newspapers?q=2293

Proposed fixes for the USNPL ID property are either:
1. Remove the "URL match pattern", or,
2. Change it to go to the Wayback Machine. For instance, this URL finds the latest capture in 2022 (when all the USNPL pages were still online):
https://web.archive.org/web/2022/https://www.usnpl.com/search/newspapers?q=5812

So the URL match pattern could be:
^https://web.archive.org/web/2022/https?:\/\/(?:www\.)?usnpl\.com\/search\/newspapers\?q=([1-9]\d*)

(Also posted this in Wikipedia_talk:WikiProject_Newspapers/Wikidata)
Hearvox (talk) 19:26, 11 January 2024 (UTC)
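The proposed pattern can be checked against the example Wayback Machine URL with a short Python sketch. `PROPOSED` is the pattern exactly as written above; `STRICTER` is a hypothetical variant, not part of the proposal, that escapes the dots in the archive host so they match only a literal dot. The `extract_id` helper is also made up for this example.

```python
import re

# Pattern exactly as proposed above.
PROPOSED = r"^https://web.archive.org/web/2022/https?:\/\/(?:www\.)?usnpl\.com\/search\/newspapers\?q=([1-9]\d*)"

# Hypothetical stricter variant: dots in the archive host are escaped.
STRICTER = r"^https://web\.archive\.org/web/2022/https?://(?:www\.)?usnpl\.com/search/newspapers\?q=([1-9]\d*)"

def extract_id(url, pattern=PROPOSED):
    """Return the USNPL ID captured from `url`, or None if it does not match."""
    match = re.match(pattern, url)
    return match.group(1) if match else None

example_url = "https://web.archive.org/web/2022/https://www.usnpl.com/search/newspapers?q=5812"
example_id = extract_id(example_url)
```

Both patterns capture the ID from the example URL. The unescaped dots in `PROPOSED` would also accept a host such as `webXarchive.org`, which is why the stricter variant may be worth considering.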

Introducing the media directory

The media directory (https://observablehq.com/@pac02/media-directory) is a simple tool which makes it easy to see the list of newspapers, broadcasting channels and TV channels by country. PAC2 (talk) 21:56, 17 March 2024 (UTC)
