Wikidata talk:WikiProject Books

Do review apply to works or editions? edit

Latest comment: 4 months ago9 comments4 people in discussion

Hi there, @INS Pirat: and I have been disagreeing here on whether reviews like E. J. KENNEY, W. V. CLAUSEN (edd.), The Cambridge History of Classical Literature, II: Latin Literature. Cambridge, University Press, 1982. xviii, 974 pp. Pr. £ 40,- (Q123251163) or The Cambridge History of Classical Literature 2: Latin Literature (Q123251162) apply to the work or the edition (currently only Q123250189 exists, whose P31 needs to be determined). I'm arguing for the latter, as the reviewers are reviewing what they have in their hands, and that is an edition, not a work . The fact that some review go so far as to include detail such as date and place of publication, publisher, pages number, and price makes me all the more inclined to consider that only editions are reviewed. Happy to hear other voices on that. --Jahl de Vautban (talk) 19:57, 31 October 2023 (UTC)Reply

The beginning. I would elaborate on my last point in there. We demonstrate the notability of a literary work as an article topic in Wikipedia, providing the reviews. The fact, that some reviews may contain the publication data of a specific edition or the multiple editions, usually doesn't affect the reviews themselves (and their mutual relevance) at all. INS Pirat ( t | c ) 20:59, 31 October 2023 (UTC)Reply

To me it seems that while there are certainly reviews that have a specific edition of a work as their subject (for example, a revised edition of an academic reference work or a new translation of a classic), most reviews take as their subject the work itself, and some might not even reference a specific edition. If a review discusses a work in general and references a specific edition, I see no reason not to link both items as its subject. Note that if the more specific review of (P6977) rather than the general main subject (P921) is to be used for linking reviews with editions, its current description and constraints would need to be changed. Pfadintegral (talk) 07:40, 1 November 2023 (UTC)Reply

@Pfadintegral: would you have an example of a review that don’t mention a specific edition? I admit that I have only ever see reviews for editions and I wasn’t aware some could exist for works. In the end however I find it odd that review of (P6977) could link to either work or editions. I would be fine however with a triangle where reviewreview of (P6977)edition, editionedition or translation of (P629)work and workdescribed by source (P1343)review. —-Jahl de Vautban (talk) 07:23, 2 November 2023 (UTC)Reply

Take this review for example: https://www.tor.com/2010/11/04/overloading-the-senses-samuel-delanys-nova/ - which has been republished in a print book and has a UID at the Internet Speculative Fiction database, so it should be notable by the standards of Wikidata. Pfadintegral (talk) 09:30, 2 November 2023 (UTC)Reply

Okay, I admit that for this kind of review it is hard to pinpoint to a given edition, thought I still got the feeling that this is more of a shortcoming and that it shouldn't be the norm. I had hoped for more point of views but two-to one I stand wrong. @INS Pirat: you can revert me back to where it was. --Jahl de Vautban (talk) 09:38, 5 November 2023 (UTC)Reply

I’m sorry I missed this discussion. I would say Walton's piece is more of an essay than it is a book review, and perhaps assessments of works rather than editions should be essays or scholarly articles (depending on context) and assessments of editions, especially when issued shortly after publication, can best be considered book reviews. - PKM (talk) 00:20, 19 November 2023 (UTC)Reply

What is the claim, that the reviewers generally care which specific edition they review, based on? INS Pirat ( t | c ) 13:07, 19 November 2023 (UTC)Reply

I often read reviews whose author made comments on the editing (typos, or whether the notes are end note or foot notes) or cited some pages because they wanted to engage with the content of that page - none of that should be relevant to the work. --Jahl de Vautban (talk) 18:12, 20 November 2023 (UTC)Reply

Formats & Forms edit

Latest comment: 5 months ago1 comment1 person in discussion

I’ve been cleaning up some confusion with “book format” in the sense of folio, quarto, octavo and “book format” in the sense hardcover, ebook, audio book. The sitelinks and all but one of the external IDs on book format (Q18602566) were specifically about the folio, quarto sense, so I’ve started my cleanup there. The subclasses and instances were all over the place (I think because the description was vague). Now we have book format (Q18602566), book form (Q104624828), book distribution format (Q123330346) and print book format (Q82046811) - but book form (Q104624828) still allows for confusion and needs more thought. Does anyone have ideas about how to proceed? - PKM (talk) 03:36, 5 November 2023 (UTC)Reply

What should Diana Gabaldon's Outlander series look like on Wikidata? edit

Latest comment: 2 months ago6 comments3 people in discussion

At first I thought that a bunch of the books in Outlander series (Q18153036) were duplicates, e.g., Written in My Own Heart's Blood (Q17184122) and Written in my own heart's blood (Q54870952). But looking at the sitelinks led me to it:Diana_Gabaldon#Serie_di_Outlander which (via Google Translate) seems to indicate that most of the books were published in two parts in Italy. So what's the correct way to model that in Wikidata? Should each of those books have one item for the overall work, and one item for each part? Should the items for each part be works, or editions? dseomn (talk) 01:24, 17 November 2023 (UTC)Reply

@Dseomn: I'm not sure how much of intended by the author was the division of the books as it was done by the Italian publisher. I have therefore doubt that they should be considered works on their own, but they could stand as part of a work (Q88392887) perhaps? --Jahl de Vautban (talk) 18:19, 20 November 2023 (UTC)Reply

That makes sense. I'm still not sure how to handle the different items though. Most of the labels, descriptions, and claims on both items in each pair seem to be about the work as a whole, and it's just the itwiki sitelinks that are different. So maybe it would make the most sense to create new items for each of the parts of a work, move the itwiki links to those, and then merge the existing items into each other? Or should I try to figure out which of each pair is the overall work, and only create 1 new item for each pair, to move the itwiki link to? dseomn (talk) 01:37, 23 November 2023 (UTC)Reply

It looks like https://sv.wikipedia.org/wiki/Sagan_om_Drakens_%C3%A5terkomst has the same issue. dseomn (talk) 18:45, 15 January 2024 (UTC)Reply

On inventaire.io (Q32193244), we model these cases strictly like this: isbn:9789127163201 is one edition of multiple works; isbn:9782253132905 and isbn:9782253132912 are one work split into two editions in French.

This keeps the structure neat and rigid as they are often done for practical or financial reasons by the publisher, but that solution is not possible here with the site links. I think it would be best to make objects like this instances of a new object that is subclass of (P279) both Wikimedia duplicated page (Q17362920) and part of a work (Q88392887). This would make obvious both what's going on and that is is an artifact of Wikidata following another Wikimedia project. /Autom (talk) 10:59, 20 January 2024 (UTC)Reply

Why Wikimedia duplicated page (Q17362920)? There are pages for the book and for parts of the book, but I don't think I've seen cases where there are multiple pages for the book as a whole. dseomn (talk) 18:04, 20 January 2024 (UTC)Reply

OCLC Classify is being discontinued edit

Latest comment: 4 months ago1 comment1 person in discussion

I've started a conversation at meta:Talk:The Wikipedia Library § OCLC Classify is being discontinued that is relevant to Wikidata:WikiProject Books. Daask (talk) 18:23, 20 November 2023 (UTC)Reply

Property proposal: Walmart product ID edit

Latest comment: 4 months ago1 comment1 person in discussion

Hello! This property proposal hasn't got any participant in seven weeks. May any member of this WikiProject give their opinion on it? Thank you, Horcrux (talk) 19:11, 20 November 2023 (UTC)Reply

What are these items supposed to be? edit

Latest comment: 4 months ago3 comments2 people in discussion

I can't figure out what literary forms of (Q26213430) and Literary forms and genres (Q30032140) are supposed to be, they link to nothing on Wikidata and seem superfluous. StarTrekker (talk) 14:53, 4 December 2023 (UTC)Reply

If you follow the link ID on literary forms of (Q26213430) it points to an article. I think it's meant to be a data item for that web-based article. --EncycloPetey (talk) 15:38, 4 December 2023 (UTC)Reply

I've looked at both links and I'm having a hard time figuring out what the descriptions and statements should be for them.StarTrekker (talk) 14:54, 10 December 2023 (UTC)Reply

Property proposal:Book format edit

Latest comment: 4 months ago1 comment1 person in discussion

I've proposed a new property Wikidata:Property proposal/Book format. Please help improve it and comment! - PKM (talk) 23:38, 13 December 2023 (UTC)Reply

edited volume as P31 edit

Latest comment: 3 months ago3 comments2 people in discussion

Hi there, currently edited volume (Q1711593) isn't modelled as a subclass of version, edition or translation (Q3331189), yet it is mostly used as P31 for editions. Should we change the subclasses or replace its use as P31? --Jahl de Vautban (talk) 10:43, 31 December 2023 (UTC)Reply

I wouldn't change that. An edited volume (Q1711593) will have its origin as a work, and then have editions / translation. I've been working on just such a work on Wikisource, a 12th/13th century edited collection of Japanese poetry, but in English translation. The original is treated as a "work" in library databases. All of the translations need a "work" item to be translations of, and most libraries have a "work" as well as instances. When working with ancient and classical literature, this happens a lot. --EncycloPetey (talk) 18:36, 31 December 2023 (UTC)Reply

Ah, I see. So the edited volume (Q1711593) is the work and there might be the edition, just like Heroism in the Harry Potter Series (Q56708797). However stuff like Domestic Space in the Roman World: Pompeii and Beyond (Q47091855) seem to conflate the work and the edition. Probably a lot of cleaning is necessary anyway, but I'll refrain from a global batch solution. --Jahl de Vautban (talk) 13:21, 1 January 2024 (UTC)Reply

Book sections only with graphic material edit

Latest comment: 2 months ago3 comments2 people in discussion

Hi, recently I created Q124045529, which is a book containing severals sections by different people. What's special about it is that it not only contains literary texts but also sections consisting only of graphic or photographic material, e. g. by Signe Pierce (Q23664376). So these are not illustrations of the texts, but artistic works on their own. For now I listed these artistic contributors as author (P50), but I'm looking for a more suitable property since I think neither illustrator (P110) nor contributor to the creative work or subject (P767) is what would be right to use here. Would be glad for suggestions, thanks! Dorades (talk) 23:48, 12 January 2024 (UTC)Reply

@Dorades: hi, a bit late but you'll probably need to create individual sections for each text/graphic material and states the author on them, not on the book itself. I don't think that graphic materials would act any differently than chapters in the textual sense. Also, currently Q124045529 conflates the work and the edition, you may want to have a look at Wikidata:WikiProject_Books#Bibliographic_properties to untangle it. --Jahl de Vautban (talk) 12:08, 25 January 2024 (UTC)Reply

Thank you, I created Q124363942 accordingly. As for creating items for each section, is that necessary? Do you have an example at hand? --Dorades (talk) 15:41, 27 January 2024 (UTC)Reply

Property proposal: volume title edit

Latest comment: 2 months ago1 comment1 person in discussion

D6194c-1cc have proposed a new property Wikidata:Property proposal/volume title. Please help improve it and comment! Regards, ZI Jony ^(Talk) 12:30, 24 January 2024 (UTC)Reply

Book bots edit

Latest comment: 2 months ago1 comment1 person in discussion

There is a relevant discussion on book bots here. Please, join to help define format for labels/descriptions and other stuff. Thanks. Emijrp (talk) 20:15, 15 February 2024 (UTC)Reply

is "Book ID in publisher website" acceptable property edit

Latest comment: 1 month ago2 comments1 person in discussion

Some users proposed property of Book ID in some publishers websites. I think no need for this property when the published books have ISBN and ISBN is global book ID. I want to discuss this here, to make a unified opinion about. حبيشان (talk) 08:21, 19 February 2024 (UTC)Reply

As I see there are 3 publisher have their own book Id property https://w.wiki/9JrF. حبيشان (talk) 07:28, 28 February 2024 (UTC)Reply

Large-scale books metadata imports? edit

Latest comment: 1 month ago9 comments5 people in discussion

When checking if some book has an item in WikiData, most of the time I found it to be missing. Are there not large scale imports of available book sources? I have not checked in depth, could you please post some links about what's already there? It could also be good to have this info on the WikiProject page directly.

I think books are a kind of item that really shouldn't be added manually by hand, but rather be imported at large scale as the former would be a futile way to bind valuable volunteer time resources and because there already are quite large books databases.

There is the OpenLibrary data, ISBNdb.com (maybe they'd make it free on request), and BookBrainz (downloadable). Some further ways are here, many other books sources with potential APIs or dumps are listed here, and since recently one could also use Anna's Archive. For the latter it seems one could download Anna's Archive ISBNdb scrape json dump. The latter currently seems to be the most constructive efficient best approach.

Tools would check if the item already exists, and if it does update it with any additional data; any data conflicts should also be tracked so people can set whatever is better or both (either via semi-automatic edits or by changing the import scripts or properties so that it can add both).

I kind of thought most books in online databases would already be in Wikidata but apparently they aren't. The imports of scientific studies are even less extensive; things like Wikidata:Scholia could start to be useful (e.g. by AI-set "main subjects" data and statistical charts) if maybe 40% of all studies or 60% of cited/notable ones were included but it seems like currently not even 5% of all have been integrated. Prototyperspective (talk) 12:39, 20 February 2024 (UTC)Reply

@Prototyperspective: Yes, it is strange and interesting. Currently, "only" ~70,000 books and ~500,000 versions/editions/translations are available in Wikidata. According to Google, there are over 120,000,000 different book editions. I think we should have an item for each (here in this wikiproject the writen work/edition concepts are prefered, I am fine with it, it is a good solution). But if that is too much, we could start creating items for all written works and editions by any author who has an item. And import any written work/edition by an author when he/she is added to Wikidata, gradually improving our coverage. I am trying to do that with CC-0 metadata from the National Library of Spain (see this). Emijrp (talk) 21:37, 21 February 2024 (UTC)Reply

There are HUGE numbers of books not here yet, but part of the problem includes: the inconsistencies in various databases, leading to duplicate listings that have to be manually identified and corrected, as well as incorrectly merged items in those other databases that get mismatched with existing items here. There is also inconsistency in major library databases, such as the Library of Congress in the US. There have also been attempts to import data from Wikipedias, but these have resulted in a mishmash of data that then has to be manually cleaned up. Things like an ISBN on a 19th century publication. But you are correct, there are many, many books not yet present in Wikidata. --EncycloPetey (talk) 22:40, 21 February 2024 (UTC)Reply

Well it is a question. For example Czech National Library, which has a bibliography for about 7 milion entries was blocking the relase even it should be Public Domain for ages. Recently I heard that they finnaly release it. So it may be the reason, that bibliography holders are blocking that elsewhere.

Regarding the manual adding, I am doint that. I am doing that because of previous statement, that we were not sure, wether we would be ever able to import Czech bibliography, so I would not prohibit that. Or better to say, why contributor cannot do what they want? Juandev (talk) 22:44, 25 February 2024 (UTC)Reply

Personally I don't think it's particularly useful to do mass imports of book metadata into Wikidata right now (though eventually we should, let's say within the next 20 years). If you're interested in mass imports, it's probably best to start importing into OpenLibrary, where the data is actively used and maintained. There are still big gaps there, as the bulk imports focused on some countries. If/when Wikidata starts having a use case for metadata about millions of books, it will be easier to import from OpenLibrary. Nemo 07:54, 23 February 2024 (UTC)Reply

If Wikidata does not even have a comprehensive dataset on books what exactly is it or will it be good for? Books-metadata is one of the first things that comes to mind where an open freely-accessible structured dataset would be useful (once it's as comprehensive as alternatives).

In addition, if volunteers enter the data by hand due to missing data on books, that draws out valuable contributor-time. Fixing this issue thus improves the state of open knowledge overall.
The use-cases include scientific research, visualization, Scholia, integrating things like a module for "Most-cited, most-popular, most-relevant books about this topic" for Wikipedia articles, archives completeness evaluations, upcoming AI-scientist software agents, structured-data based search engines, open source apps like ebook readers, and more. All of this would only be possible once the dataset becomes more complete.
The same also applies to other contents like podcasts, studies, foods, and software – the data only becomes valuable once it becomes reasonably complete. This is about books since for these there already readily available datasets to integrate centrally here.

One issue is that the open nature of Wikidata means that it would be near impossible to make sure people don't add vandalizing/false data into items when that many items exist. Thus, non-bot edits to these items should somehow be tracked separately so they can be checked and maybe also other measures such as semi-locking these items to only bot- and reviewed-edits. I don't think Wikidata should aim to be anything but the most comprehensive open structured data repository of the world. If it doesn't even contain books metadata it falls short of even the basics.

Concerning OpenLibrary that I also linked above: isn't that far less comprehensive than the other datasets I linked such as the ISBNdb json dump? One can bulk-revise items after bulk importing all so it's not a one-time thing.

@Emijrp: Interesting! But why don't you put the estimated number of known/not-totally-insignificant books there instead of only saying that the "Total number of books […] is unknown"? That 120 M number for example would be better than none and there can be multiple estimates each with source. Thanks also for the info about the bot, I was partly looking to see which bulk import efforts are currently being done.

@EncycloPetey: Then it seems like the code for identifying duplicate items needs to be improved. However, I think it would be difficult to misidentify books as the same when they have ISBN IDs and most items seem to have these. There could be reports of mergers / likely duplicates to review where people only need to click a button to merge or correct misidentified separate items. It would be a much better situation if all books without such issues were imported and all items with such issues were on hold and on a list of items with inconsistency issues. I think in most of the latter cases, which could be worked on at a later point, the solution could usually be as simple as importing the data from both databases with a reference to the database so we have both and/or can pick what is better. This wouldn't mean a mishmash but having the data of both so data-users can simply choose which data to pick. Prototyperspective (talk) 13:13, 23 February 2024 (UTC)Reply

No, ISBNs do not exist on "most books". In fact the majority of books present on the Wikisource projects do not have ISBNs because those books predate the invention of ISBNs. Believing that ISBNs will somehow solve problems is a naive approach. Some editions have multiple ISBNs associated with them, and many, many books have no ISBN associated with them. --EncycloPetey (talk) 17:39, 23 February 2024 (UTC)Reply

I was referring to most items in the data dumps, especially the json one.

Wikisource probably has mostly very old books. If "most items" (you misquoted me btw) in the dumps indeed don't have an ISBN, then one could at least import those that have and I was just asking about what's being done in regards to book imports and what the difficulties are. If ISBNs are not a good way for a substantial fraction of importable items then I guess one could use book title + author and I wasn't coming to this claiming to have a fully fledged out way that I'm proposing or anything like that. Prototyperspective (talk) 18:43, 23 February 2024 (UTC)Reply

Also, are you away of the difference between a work and an edition? ISBNs are for editions, and there is no easy way to automate connecting editions to their work data item. I've seen other folks do automated imports which went very wrong in this regard. --EncycloPetey (talk) 20:57, 23 February 2024 (UTC)Reply

What about audiobook? edit

Latest comment: 1 month ago1 comment1 person in discussion

In my understanding audiobook (Q106833) is distribution format (P437) of version, edition or translation (Q3331189). But I don't see any information on this page and real usage in Wikidata is different. Am I right? Skim (talk) 21:03, 24 February 2024 (UTC)Reply

Version or book for P31 edit

Latest comment: 1 month ago4 comments3 people in discussion

Recently, well about year ago, Wikidata started to push me not using Book or Journal as a value for P31, but Q3331189 (version). So OK, but where do I indicate it is a book or journal then? Moreover if P31 equals to Q3331189 services like Zotero have a problem to map it and create a Zotero citation out of it. Juandev (talk) 22:39, 25 February 2024 (UTC)Reply

For journals you have Wikidata:WikiProject_Periodicals. Skim (talk) 21:32, 27 February 2024 (UTC)Reply

I use the distribution format (P437) property (usually as a qualifier to ISBN) with the ebook (Q128093), printed book (Q11396303), hardback (Q193955), softcover (Q990683) and paperback (Q193934) values for books. D6194c-1cc (talk) 05:33, 6 March 2024 (UTC)Reply

Also, take a look at book edition (Q57933693). D6194c-1cc (talk) 07:46, 8 March 2024 (UTC)Reply

Property proposal: Hindawi Foundation book ID edit

Latest comment: 1 month ago2 comments2 people in discussion

MSMST1543 have proposed a new property Wikidata:Property proposal/Hindawi Foundation book ID. Please help improve it and comment! Regards, ZI Jony ^(Talk) 18:21, 28 February 2024 (UTC)Reply

See my comment Above. حبيشان (talk) 17:49, 29 February 2024 (UTC)Reply

Self-published works edit

Latest comment: 1 month ago1 comment1 person in discussion

I started a discussion at Property talk:P123 § Self-published works about how to model self-published works. Daask (talk) 18:33, 16 March 2024 (UTC)Reply

Property proposal: az.lib.ru author ID edit

Latest comment: 19 days ago1 comment1 person in discussion

Wikidata:Property_proposal/az.lib.ru_author_ID - If you have an opinion on whether this property should or should not exist, please vote. Podbrushkin (talk) 14:39, 29 March 2024 (UTC)Reply

Add topic