Wikidata talk:WikiProject Books

On this page, old discussions are archived. See: 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023.

What should Diana Gabaldon's Outlander series look like on Wikidata?

At first I thought that a bunch of the books in Outlander series (Q18153036) were duplicates, e.g., Written in My Own Heart's Blood (Q17184122) and Written in my own heart's blood (Q54870952). But looking at the sitelinks led me to it:Diana_Gabaldon#Serie_di_Outlander which (via Google Translate) seems to indicate that most of the books were published in two parts in Italy. So what's the correct way to model that in Wikidata? Should each of those books have one item for the overall work, and one item for each part? Should the items for each part be works, or editions? dseomn (talk) 01:24, 17 November 2023 (UTC)

@Dseomn: I'm not sure how much the division of the books, as done by the Italian publisher, was intended by the author. I therefore doubt that the parts should be considered works in their own right, but perhaps they could be modelled as part of a work (Q88392887)? --Jahl de Vautban (talk) 18:19, 20 November 2023 (UTC)
That makes sense. I'm still not sure how to handle the different items though. Most of the labels, descriptions, and claims on both items in each pair seem to be about the work as a whole, and it's just the itwiki sitelinks that are different. So maybe it would make the most sense to create new items for each of the parts of a work, move the itwiki links to those, and then merge the existing items into each other? Or should I try to figure out which of each pair is the overall work, and only create 1 new item for each pair, to move the itwiki link to? dseomn (talk) 01:37, 23 November 2023 (UTC)
It looks like https://sv.wikipedia.org/wiki/Sagan_om_Drakens_%C3%A5terkomst has the same issue. dseomn (talk) 18:45, 15 January 2024 (UTC)
On inventaire.io (Q32193244), we model these cases strictly like this: isbn:9789127163201 is one edition of multiple works; isbn:9782253132905 and isbn:9782253132912 are one work split into two editions in French.
This keeps the structure neat and rigid, since such splits are often made for practical or financial reasons by the publisher, but that solution is not possible here because of the sitelinks. I think it would be best to make items like this instances of a new item that is a subclass of (P279) both Wikimedia duplicated page (Q17362920) and part of a work (Q88392887). This would make obvious both what's going on and that it is an artifact of Wikidata following another Wikimedia project. /Autom (talk) 10:59, 20 January 2024 (UTC)
Why Wikimedia duplicated page (Q17362920)? There are pages for the book and for parts of the book, but I don't think I've seen cases where there are multiple pages for the book as a whole. dseomn (talk) 18:04, 20 January 2024 (UTC)
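
For readers following the modelling question, the two inventaire.io cases described above can be sketched as a small data structure. This is only an illustration: the class names, the `edition_of` field (loosely analogous to "edition or translation of", P629) and the work labels are hypothetical; only the three ISBNs come from the discussion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    label: str  # placeholder label, not taken from any real item


@dataclass
class Edition:
    isbn: str
    # Works realised by this edition (hypothetical field, cf. P629).
    edition_of: List[Work] = field(default_factory=list)


def editions_of(work: Work, editions: List[Edition]) -> List[Edition]:
    """Return every edition that contains the given work."""
    return [e for e in editions if work in e.edition_of]


# Case 1: one edition bundling several works.
omnibus = Edition("9789127163201", [Work("work A"), Work("work B")])

# Case 2: one work published as two separate editions.
split_work = Work("work C")
part_one = Edition("9782253132905", [split_work])
part_two = Edition("9782253132912", [split_work])

catalogue = [omnibus, part_one, part_two]
```

In this sketch the split parts stay editions of a single work, so no extra "part" work is needed; the open question in the thread is how to reconcile that with sitelinks, which attach to items rather than to this kind of structure.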

edited volume as P31

Hi there, currently edited volume (Q1711593) isn't modelled as a subclass of version, edition or translation (Q3331189), yet it is mostly used as P31 for editions. Should we change the subclasses or replace its use as P31? --Jahl de Vautban (talk) 10:43, 31 December 2023 (UTC)

I wouldn't change that. An edited volume (Q1711593) will have its origin as a work, and then have editions / translations. I've been working on just such a work on Wikisource, a 12th/13th century edited collection of Japanese poetry, but in English translation. The original is treated as a "work" in library databases. All of the translations need a "work" item to be translations of, and most libraries have a "work" as well as instances. When working with ancient and classical literature, this happens a lot. --EncycloPetey (talk) 18:36, 31 December 2023 (UTC)
Ah, I see. So the edited volume (Q1711593) is the work and there might be the edition, just like Heroism in the Harry Potter Series (Q56708797). However, items like Domestic Space in the Roman World: Pompeii and Beyond (Q47091855) seem to conflate the work and the edition. Probably a lot of cleaning is necessary anyway, but I'll refrain from a global batch solution. --Jahl de Vautban (talk) 13:21, 1 January 2024 (UTC)

Book sections only with graphic material

Hi, recently I created Q124045529, which is a book containing several sections by different people. What's special about it is that it contains not only literary texts but also sections consisting only of graphic or photographic material, e.g. by Signe Pierce (Q23664376). So these are not illustrations of the texts, but artistic works in their own right. For now I have listed these artistic contributors as author (P50), but I'm looking for a more suitable property, since I think neither illustrator (P110) nor contributor to the creative work or subject (P767) is right here. I would be glad for suggestions, thanks! Dorades (talk) 23:48, 12 January 2024 (UTC)

@Dorades: hi, a bit late, but you'll probably need to create individual items for each text/graphic section and state the author on them, not on the book itself. I don't think that graphic materials would act any differently than chapters in the textual sense. Also, Q124045529 currently conflates the work and the edition; you may want to have a look at Wikidata:WikiProject_Books#Bibliographic_properties to untangle it. --Jahl de Vautban (talk) 12:08, 25 January 2024 (UTC)
Thank you, I created Q124363942 accordingly. As for creating items for each section, is that necessary? Do you have an example at hand? --Dorades (talk) 15:41, 27 January 2024 (UTC)

Property proposal: volume title

D6194c-1cc has proposed a new property Wikidata:Property proposal/volume title. Please help improve it and comment! Regards, ZI Jony (Talk) 12:30, 24 January 2024 (UTC)

Book bots

There is a relevant discussion on book bots here. Please join to help define the format for labels/descriptions and other things. Thanks. Emijrp (talk) 20:15, 15 February 2024 (UTC)

Is "Book ID in publisher website" an acceptable property?

Some users have proposed properties for book IDs on individual publishers' websites. I think there is no need for such properties when the published books have an ISBN, since the ISBN is a global book ID. I want to discuss this here so that we can reach a unified opinion. حبيشان (talk) 08:21, 19 February 2024 (UTC)

As far as I can see, 3 publishers already have their own book ID property: https://w.wiki/9JrF. حبيشان (talk) 07:28, 28 February 2024 (UTC)

Large-scale books metadata imports?

When checking whether a given book has an item in Wikidata, most of the time I found it to be missing. Are there no large-scale imports from available book sources? I have not checked in depth, so could you please post some links about what already exists? It would also be good to have this info directly on the WikiProject page.

I think books are a kind of item that really shouldn't be added by hand but rather imported at large scale: manual entry ties up valuable volunteer time, and quite large book databases already exist.

There is the OpenLibrary data, ISBNdb.com (maybe they'd make it free on request), and BookBrainz (downloadable). Some further ways are here, many other book sources with potential APIs or dumps are listed here, and more recently one could also use Anna's Archive. For the latter, it seems one could download the Anna's Archive ISBNdb scrape JSON dump, which currently seems to be the most efficient approach.

Tools would check if the item already exists, and if it does update it with any additional data; any data conflicts should also be tracked so people can set whatever is better or both (either via semi-automatic edits or by changing the import scripts or properties so that it can add both).
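
The check-update-track workflow described in the paragraph above might look roughly like the following. This is a minimal sketch, not an existing import tool: the function names, the flat dictionary records and the `isbn13` key are all assumptions made for illustration, and a real importer would have to go through the Wikidata API and its statement/reference model.

```python
def merge_record(existing, incoming, source):
    """Add new fields to an existing record and report disagreements.

    Existing values are never overwritten; differing values are returned
    as conflicts so that a person (or a later script) can decide.
    """
    merged = dict(existing)
    conflicts = []
    for prop, value in incoming.items():
        if prop not in merged:
            merged[prop] = value  # additional data: safe to add
        elif merged[prop] != value:
            conflicts.append({
                "property": prop,
                "existing": merged[prop],
                "incoming": value,
                "source": source,
            })
    return merged, conflicts


def import_records(store, records, source, key="isbn13"):
    """Create missing records, update existing ones, return all conflicts."""
    report = []
    for record in records:
        record_key = record[key]
        if record_key in store:
            store[record_key], conflicts = merge_record(store[record_key], record, source)
            report.extend(conflicts)
        else:
            store[record_key] = dict(record)
    return report
```

Keeping the conflict report separate from the merge step is one way to support both options mentioned above: a reviewer can pick the better value, or the script can later be changed to store both values with their sources.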

I kind of thought most books in online databases would already be in Wikidata but apparently they aren't. The imports of scientific studies are even less extensive; things like Wikidata:Scholia could start to be useful (e.g. by AI-set "main subjects" data and statistical charts) if maybe 40% of all studies or 60% of cited/notable ones were included but it seems like currently not even 5% of all have been integrated. Prototyperspective (talk) 12:39, 20 February 2024 (UTC)

@Prototyperspective: Yes, it is strange and interesting. Currently, "only" ~70,000 books and ~500,000 versions/editions/translations are available in Wikidata. According to Google, there are over 120,000,000 different book editions. I think we should have an item for each (here in this WikiProject the written work/edition concepts are preferred; I am fine with that, it is a good solution). But if that is too much, we could start by creating items for all written works and editions by any author who has an item, and import any written work/edition by an author when he/she is added to Wikidata, gradually improving our coverage. I am trying to do that with CC-0 metadata from the National Library of Spain (see this). Emijrp (talk) 21:37, 21 February 2024 (UTC)
There are HUGE numbers of books not here yet, but part of the problem is the inconsistencies in various databases, leading to duplicate listings that have to be manually identified and corrected, as well as incorrectly merged items in those other databases that get mismatched with existing items here. There is also inconsistency in major library databases, such as the Library of Congress in the US. There have also been attempts to import data from Wikipedias, but these have resulted in a mishmash of data that then has to be manually cleaned up: things like an ISBN on a 19th century publication. But you are correct, there are many, many books not yet present in Wikidata. --EncycloPetey (talk) 22:40, 21 February 2024 (UTC)
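
One of the cleanup problems mentioned above, an ISBN attached to a 19th century publication, lends itself to a simple automated flag. The following is a rough sketch only: the record fields and the function name are hypothetical, and the cutoff year is an assumption based on book numbering having been introduced in the late 1960s.

```python
# Assumption: no edition published before roughly this year was issued
# with an ISBN (or its SBN predecessor). Adjust if a stricter or looser
# cutoff is wanted.
ISBN_ERA_START = 1967


def has_anachronistic_isbn(record):
    """Flag records that carry an ISBN but predate ISBNs.

    `record` is a plain dict with optional "isbn" and "publication_year"
    keys (hypothetical field names). Records without a year are not flagged.
    """
    year = record.get("publication_year")
    return bool(record.get("isbn")) and year is not None and year < ISBN_ERA_START
```

A flagged record usually means that a modern reprint's ISBN was attached to the original edition, so the sensible outcome is a review list rather than automatic deletion.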
Well, that is a good question. For example, the Czech National Library, which has a bibliography of about 7 million entries, was blocking the release even though it should have been in the public domain for ages. Recently I heard that they have finally released it. So the reason may be that bibliography holders elsewhere are blocking release as well.
Regarding manual adding: I am doing that, because of the situation described above. We were not sure whether we would ever be able to import the Czech bibliography, so I would not prohibit it. Or, better said, why shouldn't contributors do what they want? Juandev (talk) 22:44, 25 February 2024 (UTC)

Personally I don't think it's particularly useful to do mass imports of book metadata into Wikidata right now (though eventually we should, let's say within the next 20 years). If you're interested in mass imports, it's probably best to start importing into OpenLibrary, where the data is actively used and maintained. There are still big gaps there, as the bulk imports focused on some countries. If/when Wikidata starts having a use case for metadata about millions of books, it will be easier to import from OpenLibrary. Nemo 07:54, 23 February 2024 (UTC)

If Wikidata does not even have a comprehensive dataset on books what exactly is it or will it be good for? Books-metadata is one of the first things that comes to mind where an open freely-accessible structured dataset would be useful (once it's as comprehensive as alternatives).
  • In addition, if volunteers enter the data by hand due to missing data on books, that draws out valuable contributor-time. Fixing this issue thus improves the state of open knowledge overall.
  • The use-cases include scientific research, visualization, Scholia, integrating things like a module for "Most-cited, most-popular, most-relevant books about this topic" for Wikipedia articles, archives completeness evaluations, upcoming AI-scientist software agents, structured-data based search engines, open source apps like ebook readers, and more. All of this would only be possible once the dataset becomes more complete.
  • The same also applies to other contents like podcasts, studies, foods, and software – the data only becomes valuable once it becomes reasonably complete. This is about books since for these there are already readily available datasets to integrate centrally here.
One issue is that the open nature of Wikidata means that it would be near impossible to make sure people don't add vandalizing/false data into items when that many items exist. Thus, non-bot edits to these items should somehow be tracked separately so they can be checked, and other measures may be needed too, such as semi-locking these items so that only bot edits and reviewed edits are allowed. I don't think Wikidata should aim to be anything but the most comprehensive open structured data repository of the world. If it doesn't even contain books metadata it falls short of even the basics.
Concerning OpenLibrary that I also linked above: isn't that far less comprehensive than the other datasets I linked such as the ISBNdb json dump? One can bulk-revise items after bulk importing all so it's not a one-time thing.
@Emijrp: Interesting! But why don't you put the estimated number of known/not-totally-insignificant books there instead of only saying that the "Total number of books […] is unknown"? That 120 M number for example would be better than none and there can be multiple estimates each with source. Thanks also for the info about the bot, I was partly looking to see which bulk import efforts are currently being done.
@EncycloPetey: Then it seems like the code for identifying duplicate items needs to be improved. However, I think it would be difficult to misidentify books as the same when they have ISBN IDs and most items seem to have these. There could be reports of mergers / likely duplicates to review where people only need to click a button to merge or correct misidentified separate items. It would be a much better situation if all books without such issues were imported and all items with such issues were on hold and on a list of items with inconsistency issues. I think in most of the latter cases, which could be worked on at a later point, the solution could usually be as simple as importing the data from both databases with a reference to the database so we have both and/or can pick what is better. This wouldn't mean a mishmash but having the data of both so data-users can simply choose which data to pick. Prototyperspective (talk) 13:13, 23 February 2024 (UTC)
No, ISBNs do not exist on "most books". In fact the majority of books present on the Wikisource projects do not have ISBNs because those books predate the invention of ISBNs. Believing that ISBNs will somehow solve problems is a naive approach. Some editions have multiple ISBNs associated with them, and many, many books have no ISBN associated with them. --EncycloPetey (talk) 17:39, 23 February 2024 (UTC)
I was referring to most items in the data dumps, especially the json one.
Wikisource probably has mostly very old books. If "most items" (you misquoted me btw) in the dumps indeed don't have an ISBN, then one could at least import those that do; I was just asking what's being done in regard to book imports and what the difficulties are. If ISBNs are not a good way to handle a substantial fraction of importable items, then I guess one could use book title + author. I wasn't coming to this claiming to have a fully fleshed-out proposal or anything like that. Prototyperspective (talk) 18:43, 23 February 2024 (UTC)
Also, are you aware of the difference between a work and an edition? ISBNs are for editions, and there is no easy way to automate connecting editions to their work data item. I've seen other folks do automated imports which went very wrong in this regard. --EncycloPetey (talk) 20:57, 23 February 2024 (UTC)
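
To make the matching ideas in this thread concrete, here is a hedged sketch of a duplicate-detection key: a validated, normalised ISBN-13 where one is available, falling back to title + author otherwise. The ISBN-10 and ISBN-13 check-digit rules are the standard ones; the record fields and the `match_key` helper are hypothetical. Note that such a key only groups editions and, as pointed out above, does nothing to link editions to their work.

```python
import re


def _clean(isbn):
    """Strip hyphens and spaces, upper-case a trailing 'x'."""
    return re.sub(r"[\s-]", "", isbn).upper()


def _isbn13_weighted_sum(digits):
    """Weighted sum used by ISBN-13: weights alternate 1, 3, 1, 3, ..."""
    return sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(digits))


def is_valid_isbn13(s):
    return bool(re.fullmatch(r"\d{13}", s)) and _isbn13_weighted_sum(s) % 10 == 0


def is_valid_isbn10(s):
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    # Weights run 10 down to 1; 'X' stands for the value 10.
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def to_isbn13(isbn):
    """Return a normalised ISBN-13, or None if the input is not a valid ISBN."""
    s = _clean(isbn)
    if is_valid_isbn13(s):
        return s
    if is_valid_isbn10(s):
        core = "978" + s[:9]
        check = (10 - _isbn13_weighted_sum(core) % 10) % 10
        return core + str(check)
    return None


def match_key(record):
    """Key for grouping likely-identical editions (hypothetical record fields)."""
    isbn13 = to_isbn13(record.get("isbn") or "")
    if isbn13:
        return ("isbn", isbn13)
    title = " ".join(record.get("title", "").casefold().split())
    author = " ".join(record.get("author", "").casefold().split())
    return ("title+author", title, author)
```

Records whose ISBN fails validation fall through to the title + author key, which is much weaker; in practice such records would be better placed on a review list than merged automatically.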

What about audiobook?

In my understanding, audiobook (Q106833) is a distribution format (P437) of a version, edition or translation (Q3331189). But I don't see any information about this on this page, and real usage in Wikidata is different. Am I right? Skim (talk) 21:03, 24 February 2024 (UTC)

I think these are entered in the audio field when an audio file is available. I guess multi-part audiobooks should be converted to one file. I have added a few audiobooks from these and there are also many Librivox audiobooks. See Commons:Category:Audiobooks by language. It would be nice if a script or bot added the files we have on Wikimedia Commons, and maybe also ones hosted on the Internet Archive. Then maybe audiobooks could be added as a feature to WikiVibes and be displayed automatically on the associated Wikipedia page. You are probably asking mainly about something different, namely whether "audiobook" is widely set as distribution format when an audiobook is available; I don't know. If there is no script or bot that checks audiobook sites like Audible and adds audiobooks as a distribution format, that would be valuable for data completeness, but I think the more important task is making the data useful, or doing things that are already useful, rather than completing data for the sake of completeness. Prototyperspective (talk) 11:04, 11 June 2024 (UTC)

Version or book for P31

Recently, well, about a year ago, Wikidata started to push me not to use book or journal as a value for P31, but Q3331189 (version) instead. OK, but where do I then indicate that it is a book or a journal? Moreover, if P31 is Q3331189, services like Zotero have trouble mapping it and creating a Zotero citation from it. Juandev (talk) 22:39, 25 February 2024 (UTC)

For journals you have Wikidata:WikiProject_Periodicals. Skim (talk) 21:32, 27 February 2024 (UTC)
I use the distribution format (P437) property (usually as a qualifier to ISBN) with the ebook (Q128093), printed book (Q11396303), hardback (Q193955), softcover (Q990683) and paperback (Q193934) values for books. D6194c-1cc (talk) 05:33, 6 March 2024 (UTC)
Also, take a look at book edition (Q57933693). D6194c-1cc (talk) 07:46, 8 March 2024 (UTC)

Property proposal: Hindawi Foundation book ID

MSMST1543 has proposed a new property Wikidata:Property proposal/Hindawi Foundation book ID. Please help improve it and comment! Regards, ZI Jony (Talk) 18:21, 28 February 2024 (UTC)

See my comment above. حبيشان (talk) 17:49, 29 February 2024 (UTC)

Self-published works

I started a discussion at Property talk:P123 § Self-published works about how to model self-published works. Daask (talk) 18:33, 16 March 2024 (UTC)

Property proposal: az.lib.ru author ID

Wikidata:Property_proposal/az.lib.ru_author_ID - If you have an opinion on whether this property should or should not exist, please vote. Podbrushkin (talk) 14:39, 29 March 2024 (UTC)

Notability of vanity press

Hi,

With Fralambert, we are wondering whether vanity-press (self-published) works are notable enough for Wikidata. For instance, Q62648172/Q62662230 (especially in this case, where the items have no sitelinks and are only linked to each other, so they clearly fail the 1st and 3rd criteria of WD:N, but what about the 2nd?).

Cheers, VIGNERON (talk) 15:45, 18 May 2024 (UTC)

Form and genre

Hi y'all,

We have two properties: genre (P136) and form of creative work (P7937). Since the two are not fully independent (you can technically have an autobiographic poem, but 99% of the time autobiographies are narrations in prose), there is some overlap and confusion in how people use them (also, P136 is the older property, so some people only know and use it, and don't know or use P7937...).

I tried to run some queries (which sadly time out :/), and I could use some help with:

  • finding strange or wrong uses (like https://w.wiki/AAHZ where the exact same value is used as genre (P136) and form of creative work (P7937) !).
  • defining how to correct these strange or wrong uses (sometimes it's easy, like play (Q25379) is a form and clearly not a genre, but not always)
  • more generally and ideally, do we have good sources to be used as references (which could solve a large part of the questioning on each individual item)
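
As a starting point for the first item in the list above, the following sketch builds a SPARQL query for items where the same value is used for both genre (P136) and form of creative work (P7937). It is a guess at an equivalent query, not necessarily what the short link contains; the helper name and the `limit` parameter are assumptions, and the `wdt:` prefix is the one predefined by the Wikidata Query Service.

```python
def same_genre_and_form_query(limit=100):
    """Build a SPARQL query listing items whose P136 and P7937 share a value.

    The LIMIT keeps the result set small; it may help with timeouts but
    does not guarantee that the query finishes in time.
    """
    return (
        "SELECT ?item ?value WHERE {\n"
        "  ?item wdt:P136 ?value ;\n"
        "        wdt:P7937 ?value .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Paste the printed text into the Wikidata Query Service.
    print(same_genre_and_form_query(50))
```

Restricting the pattern further, for example to a single value at a time, is another option if the unrestricted query still times out.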

Ping @EncycloPetey, Bodhisattwa: who worked on this matter (see for instance Wikidata_talk:WikiProject_Books/2023#Form_vs_genre for a previous talk on the same subject; there may be other relevant discussions, and we should probably make a page to document this clearly somewhere).

Cheers, VIGNERON (talk) 06:59, 23 May 2024 (UTC)

Format for Adding Indexers on Book Items

Wanted some advice from WikiProject Books on which format would be more suitable to add/credit 'indexers' in published works (before I try for a full property proposal):

Index by (Indexer/Indexed by)
  Chitra Karunanayake

Or,

editor
  Mei Yen Chua
    subject has role: indexer
    subject named as: contributing editor

  • The first option is harder to find direct references for; however, they do exist, and it would be better overall, as Wikidata could add more value to these book items and to the people who index.
  • The second option has references, including Amazon, where indexers are credited as 'contributing editor'.
  • There are also many database indexers, so if a property is created, it would be useful for databases as well.

Or, are there other alternatives? Would anyone be interested in combining catalogers and indexers as one property? What issues do you feel could arise with these options or a future property proposal? Wallacegromit1 (talk) 11:46, 2 June 2024 (UTC)

Labels for edition

Hi y'all,

Help:Default values for labels and aliases is moving along. I would like to add version, edition or translation (Q3331189) as an example where the "mul" label should be used. What do you think?

Cheers, VIGNERON (talk) 13:15, 3 July 2024 (UTC)

Seems reasonable, but not sure if we need to distinguish between version, edition or translation (Q3331189) and scholarly article (Q13442814) under "Titles". --Jahl de Vautban (talk) 13:26, 3 July 2024 (UTC)
@Jahl de Vautban: true, for people who know bibliography it may seem obvious and a bit redundant; but for the sake of clarity, I would prefer to be explicit and list both (conversely, written work (Q47461344) may seem similar, "it's also a book", but is not the same case). Cheers, VIGNERON (talk) 08:36, 5 July 2024 (UTC)

Property Proposal: indexer

Hi All,

Kindly requesting your input, as Support or Oppose, on the following property proposal for 'indexer' at https://www.wikidata.org/wiki/Wikidata:Property_proposal/indexer

You may also Comment to improve or critique the proposal. I appreciate any constructive feedback. Wallacegromit1 (talk) 07:36, 8 July 2024 (UTC)
