Wikidata talk:WikiProject Books

Active discussions
On this page, old discussions are archived. See: 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020.

The ISBN13 Format Regex is broken

I believe the format constraint is wrong. Of the 2157 violations, 1833 are of the format 978-9X-.... But according to the prose explanation of the format, and the booksellers I checked, a two-digit second group is fine when it starts with a 9. Even one of the two property examples listed on the property does not validate!

To be honest, I would suggest scrapping the dashes and simplifying the format constraint to ^97[89]\d{10}$, i.e. "Start with 978 or 979 and have 13 digits", plus the checksum. It's terribly annoying to try to guess where the dashes belong when websites routinely leave them out, and the data is practically useless: on the standard page for ISBNs, there's a link to this atrocity of a 40-line SPARQL query:

SELECT ?isbn13 ?statement WHERE{
 ?statement prov:wasDerivedFrom ?ref .
 ?ref pr:P212 ?isbn13 .
 BIND(lcase(?isbn13)AS?l)
 BIND(lcase("9780954771003")AS?i)
 BIND(SUBSTR(?i,4,1)AS?i41)BIND(SUBSTR(?i,4,2)AS?i42)
 BIND(SUBSTR(?i,4,3)AS?i43)BIND(SUBSTR(?i,4,4)AS?i44)
 BIND(SUBSTR(?i,4,5)AS?i45)BIND(SUBSTR(?i,6,7)AS?i67)
 BIND(SUBSTR(?i,7,6)AS?i76)BIND(SUBSTR(?i,8,5)AS?i85)
 BIND(SUBSTR(?i,9,4)AS?i94)BIND(SUBSTR(?i,10,3)AS?i103)
 BIND(SUBSTR(?i,11,2)AS?i112)BIND(SUBSTR(?i,12,1)AS?i121)
 BIND("-"AS?h)BIND(CONCAT(SUBSTR(?i,1,3),?h)AS?i13)
 BIND(CONCAT(?h,SUBSTR(?i,13,1))AS?x)
 FILTER(CONTAINS(?l, ?i)||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,1),?h,?i67,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,2),?h,?i76,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,3),?h,?i85,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,4),?h,?i94,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,5),?h,?i103,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,6),?h,?i112,?x))||
  CONTAINS(?l,CONCAT(?i13,?i41,?h,SUBSTR(?i,5,7),?h,?i121,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,1),?h,?i76,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,2),?h,?i85,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,3),?h,?i94,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,4),?h,?i103,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,5),?h,?i112,?x))||
  CONTAINS(?l,CONCAT(?i13,?i42,?h,SUBSTR(?i,6,6),?h,?i121,?x))||
  CONTAINS(?l,CONCAT(?i13,?i43,?h,SUBSTR(?i,7,1),?h,?i85,?x))||
  CONTAINS(?l,CONCAT(?i13,?i43,?h,SUBSTR(?i,7,2),?h,?i94,?x))||
  CONTAINS(?l,CONCAT(?i13,?i43,?h,SUBSTR(?i,7,3),?h,?i103,?x))||
  CONTAINS(?l,CONCAT(?i13,?i43,?h,SUBSTR(?i,7,4),?h,?i112,?x))||
  CONTAINS(?l,CONCAT(?i13,?i43,?h,SUBSTR(?i,7,5),?h,?i121,?x))||
  CONTAINS(?l,CONCAT(?i13,?i44,?h,SUBSTR(?i,8,1),?h,?i94,?x))||
  CONTAINS(?l,CONCAT(?i13,?i44,?h,SUBSTR(?i,8,2),?h,?i103,?x))||
  CONTAINS(?l,CONCAT(?i13,?i44,?h,SUBSTR(?i,8,3),?h,?i112,?x))||
  CONTAINS(?l,CONCAT(?i13,?i44,?h,SUBSTR(?i,8,4),?h,?i121,?x))||
  CONTAINS(?l,CONCAT(?i13,?i45,?h,SUBSTR(?i,9,1),?h,?i103,?x))||
  CONTAINS(?l,CONCAT(?i13,?i45,?h,SUBSTR(?i,9,2),?h,?i112,?x))||
  CONTAINS(?l,CONCAT(?i13,?i45,?h,SUBSTR(?i,9,3),?h,?i121,?x)))}

Try it! ("Try it" the template says. Because it's in on the joke that the query does not even work.)

There is some information encoded in these groups, but by my estimation there are about two people on the planet that intuitively know the publisher or country just from looking at it, and neither of them are well-liked by their friends. But, in any case, the ISBN Institute (it's a thing) publishes the data needed to dasherize an ISBN if one ever feels the urge. --Matthias Winkelmann (talk) 13:43, 9 August 2020 (UTC)
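As a rough illustration of the simplified check proposed above (the pattern ^97[89]\d{10}$ plus the checksum), here is a minimal Python sketch. The function name is_valid_isbn13 is made up for this example, and the sketch only shows what such a validation could look like; it is not how Wikidata's format constraint is implemented.

```python
import re

# Pattern proposed in the comment above: 978 or 979 followed by ten digits.
# re.ASCII restricts \d to 0-9; fullmatch() replaces the ^...$ anchors.
ISBN13_PLAIN = re.compile(r"97[89]\d{10}", re.ASCII)


def is_valid_isbn13(value: str) -> bool:
    """Return True for a hyphen-free ISBN-13 with a correct check digit."""
    if not ISBN13_PLAIN.fullmatch(value):
        return False
    # ISBN-13 checksum: digits are weighted 1, 3, 1, 3, ... from the left,
    # and the weighted sum over all 13 digits must be a multiple of 10.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(value))
    return total % 10 == 0


if __name__ == "__main__":
    print(is_valid_isbn13("9780954771003"))      # True (ISBN used in the query above)
    print(is_valid_isbn13("9780954771004"))      # False (wrong check digit)
    print(is_valid_isbn13("978-0-9547710-0-3"))  # False (hyphens not accepted here)
```

Under this approach the hyphens would have to be stripped before validation, which is exactly the trade-off being proposed.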

There is https://www.loc.gov/publish/pcn/isbncnvt_pcn.html to hyphenate ISBNs. Guessing the locations of the hyphens is not possible. The blocks have variable lengths in order to have enough numbers for a few large publishers, but also numbers for many small publishers. --Pasleim (talk) 19:40, 10 August 2020 (UTC)
+1 with @Pasleim:, no guessing needed or involved here. And the hyphens are needed and should be kept for many obvious reasons already explained. « there are about two people on the planet » I'm pretty sure there are more than 2 librarians/booksellers/publishers/editors on the planet :P
That said, @Matthias Winkelmann: is right: the constraint is indeed wrong and should be corrected (and maybe completed and/or split into several simpler constraints to make them more understandable). Could someone locate the problem and fix it? edit: the more I look, the more errors I find in this regex (most if not all 979 ISBNs generate errors too). I suggest adopting a simpler regex, something like 97[89]-\d{1,5}-\d{2,7}-\d{1,6}-\d maybe?
@Matthias Winkelmann: the query does seem to work fine, what is the problem you've seen? I see no result but that's to be expected: no item uses this ISBN as a ref, does it?
Cdlt, VIGNERON (talk) 11:49, 24 August 2020 (UTC)
I've edited the query above to include the ISBN13 of Ulysses (2004) (Q28599849), and it doesn't find the item. Neither in the format without dashes, nor with them. The main search finds the edition, but only if you manage to enter the ISBN with dashes, in all the right places. Even assuming that query can somehow be fixed, it is completely insane. This is probably the most well-known identifier there is. But to use it requires copy and pasting this...thing, which probably consumes half of the allotted time the query service gives you. In reality, finding items by ISBN would tend to be just part of a query. But any such query is immediately rendered unreadable and, for people with remnants of professional pride, unsharable.
The main problem with the dashed format is that ISBNs formatted in that way are not unique: there are dozens of different ways to split 13 numbers into five sets that satisfy the regular expression constraint, even if we manage to fix it.
Yes, there is in fact one "true" format for any given ISBN. But because that truth is essentially a list of ISBNs (instead of some rules), users have no way of getting to that format from a dash-less ISBN except to find some tool that does it for them. That probably happens quite often: I just checked Google, OpenLibrary, and Amazon. The first two show ISBNs without any dashes, while Amazon splits off the initial 978. With the possibilities as they are, we are indeed capable of running a bot to occasionally correct them. But we can't offer any help at time of entry. That means users wishing to add an ISBN get constraint violations in return, and not knowing about the bot, or the relative weakness of constraints' prescriptive power, will waste time or give up in frustration. I believe the lack of citations for even potentially controversial statements on high-profile items is among WD's top problems, and entering ISBNs is on the critical path for adding them (even though it's a small part of how terrible that process is, but I digress...).
A user might then add dashes in appropriate places that satisfy the regex without actually being "correct" (this is what I meant by "guessing", above). Unless that bot does its work immediately, this leads to several problems: multiple items can have the same ISBN, but they will not be marked as such because they are cosmetically different (this is also a problem between ISBN10s and 13s). They also cannot be found by search, except by trying all possible combinations. Reconciliation is unlikely to work, etc. Essentially none of the benefits of an identifier actually materialise, even though the ISBN system is pretty close to being the "canonical", widely known, identifier.
I'm not too invested in this issue since books aren't items I usually work with for my actual job, and it's fine if people want to fix the obvious problems, and the world will keep turning. But if it's the work involved that scares people, I'd be willing to handle everything that's within my power. --Matthias Winkelmann (talk) 18:33, 24 August 2020 (UTC)
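To make the non-uniqueness point concrete, the following Python sketch enumerates every way of cutting a 13-digit ISBN into five hyphenated groups and keeps those accepted by a pattern. It uses the simplified pattern 97[89]-\d{1,5}-\d{2,7}-\d{1,6}-\d suggested earlier in this thread, which is an assumption for the sake of the example: the constraint actually configured on the property is different and is not reproduced here. The function name matching_hyphenations is likewise made up.

```python
import re
from itertools import combinations

# Simplified pattern suggested earlier in the thread (NOT the real constraint).
SUGGESTED = re.compile(r"97[89]-\d{1,5}-\d{2,7}-\d{1,6}-\d", re.ASCII)


def matching_hyphenations(isbn13: str) -> list[str]:
    """List every five-group hyphenation of isbn13 that the pattern accepts."""
    results = []
    # Pick four cut positions among the 12 gaps between the 13 digits.
    for cuts in combinations(range(1, 13), 4):
        bounds = (0, *cuts, 13)
        candidate = "-".join(isbn13[a:b] for a, b in zip(bounds, bounds[1:]))
        if SUGGESTED.fullmatch(candidate):
            results.append(candidate)
    return results


if __name__ == "__main__":
    forms = matching_hyphenations("9780954771003")
    print(len(forms))
    for form in forms:
        print(form)
```

Every string it prints denotes the same 13-digit number, yet a plain string comparison treats each of them as a different identifier.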
@Matthias Winkelmann: « it doesn't find the item » oh I see now! This query was not made to find the item but to find the items citing this ISBN in references. Indeed, we probably should change that; we could either make 2 separate queries or expand the existing query. Both are very easy (it's just the first two lines of the query: replace them with ?item wdt:P212 ?isbn13 . and add ?item in the SELECT to find the item itself). Just tell me which is best and clearer.
« A user might then add dashes in appropriate places that satisfy the regex without actually being "correct" » is it even possible? For a given ISBN there is one and only one possible position for the dashes (AFAIK), so if the regex is correct (which is a big if, the current regex still not being perfect) then the ISBN is correct too. I don't see any exception, or how "guessing" can be wrong.
Cheers, VIGNERON (talk) 07:28, 25 August 2020 (UTC)
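For illustration, an item lookup along the lines described above (?item wdt:P212 ?isbn13 with ?item added to the SELECT) could be generated as in the Python sketch below. The helper name item_lookup_query is hypothetical, and the FILTER with REPLACE is an addition not proposed in the comment: it compares the stored value with hyphens and spaces removed, so the caller does not need to know where the dashes go. Such a filter has to inspect every P212 value, so the query may well be slow or time out on the query service; treat it as a sketch rather than a recommended query.

```python
def item_lookup_query(isbn13: str) -> str:
    """Build a SPARQL query that finds items by ISBN-13, ignoring hyphens."""
    # Keep only ASCII digits; this also keeps quotes out of the query text.
    digits = "".join(ch for ch in isbn13 if ch in "0123456789")
    if len(digits) != 13:
        raise ValueError(f"expected 13 digits, got {len(digits)}")
    return (
        "SELECT ?item ?isbn13 WHERE {\n"
        "  ?item wdt:P212 ?isbn13 .\n"
        f'  FILTER(REPLACE(?isbn13, "[- ]", "") = "{digits}")\n'
        "}"
    )


if __name__ == "__main__":
    print(item_lookup_query("978-0-9547710-0-3"))
```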
The hyphenation of ISBNs is a presentation "human readable" form, and spaces are just as correct as hyphens, according to the (latest, 7th ed) ISBN Users' Manual (section 5). "Note: The use of hyphens or spaces has no lexical significance and is purely to enhance readability."
The position, or lack, of hyphens or spaces does not change the 13 digit ISBN id number. The current regex is over-restrictive and is not a good way to store ISBNs in a database. If hyphenation is important, it should be added back in the view layer, and not affect storage or querying. This is why other publishing and library organisations use un-hyphenated formats for cataloguing and interchange (see the ONIX metadata description in section 8 of the User's Manual). ISBN-13 (P212) is a subproperty of Global Trade Item Number (P3962), but anything matching the current ISBN regex cannot validate against the EAN13/GTIN13 regex, which would be appropriate, and potentially useful. Storing hyphens in ISBN is akin to storing thousands separators in a database, and I don't consider that an exaggeration. I can see why ISBNs on Wikipedia might want to have hyphens for citations, which are more of a display format, and my opinion on that is less formed. For a sub-property of Wikidata property to identify books (Q29547399) I think hyphens are inappropriate.
The LOC ISBN hyphenation form linked above can easily be outwitted by entering an ISBN from a non-Bowker region. For example, for the Chinese ISBN 9787503829901, the forms 978-7-50-382990-1, 978-7-503-82990-1 and 978-7-5038-2990-1 are all hyphenation guesses that validate against the regex, while LOC says "Unable to hyphenate this ISBN!" The 'safest' guess of 978-7-50382990-1 does not validate: the regex demands an arbitrary hyphen to split the last non-check-digit block. Putting that hyphen in the wrong place does not change the registrant or their publication element; that's always correctly represented by the ISBN, it simply makes it harder to search for or compare. Unfortunately putting the hyphen in the correct place also makes search and comparison at scale harder.
Another example is 9787543644748, which according to the printing on the book is correctly hyphenated as 978-7-5436-4474-8. The Bowker based online tools cannot split this as Bowker does not administer that region/range, and the https://www.isbn-international.org/range_file_generation mentioned does not help correctly split the 'registrant' from the 'publication element' as it does not list individual registrants by region. I believe this provides examples of what 'guessing' means, and also that there is no single source of information to make the full 'dasherization' for all possible ISBNs -- the link from the official ISBN Institute is not sufficient. A lot of the publisher registrant information is also commercially protected, so I wouldn't consider that information is even supposed to be freely or publicly available.
The 2 people who can interpret the split groupings can still do with or without the explicit hyphens or spaces, and if it's important, it's generally better to strip out any well-meaning hyphens and form the groupings specific to the particular registrant or region of interest. At least that's my experience, maybe I shouldn't speak for the other one ;)
I support a simplification of the ISBN regex (@Matthias Winkelmann:'s first regex suggestion looks like a correct ISBN 13 validation to me) rather than more bot effort into splitting stored ISBNs into (possibly incorrect?) human readable forms. Removing hyphens would make the data more useful and interoperable with other datasets. Any existing tools which can reliably make the splits should be used usefully in display contexts. Salpynx (talk) 23:24, 3 September 2020 (UTC)
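The storage argument above can be sketched in a few lines of Python: if hyphens and spaces are stripped before storing or comparing, all the hyphenation guesses for the Chinese ISBN mentioned above collapse to one key. The function name normalize_isbn13 is hypothetical, and the sketch deliberately does not attempt to re-insert hyphens for display, since that needs range data that is not part of this discussion.

```python
def normalize_isbn13(value: str) -> str:
    """Strip hyphens and spaces, which carry no lexical significance."""
    digits = value.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a 13-digit ISBN: {value!r}")
    return digits


if __name__ == "__main__":
    # Hyphenation guesses quoted in the comment above, plus a spaced form.
    guesses = [
        "978-7-50-382990-1",
        "978-7-503-82990-1",
        "978-7-5038-2990-1",
        "978-7-50382990-1",
        "978 7 5038 2990 1",
    ]
    print({normalize_isbn13(g) for g in guesses})  # {'9787503829901'}
```

With a normalized key, duplicate detection, search and reconciliation become plain string equality, and any reliable hyphenation tool can still be applied when the value is displayed.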

Two WikiCite grant programs - applications closing soon

I just wanted to make sure that people who frequent this project page are aware of these two grant programs currently open, highly relevant to this Wikiproject's activities. They were announced on the main project chat a month ago (and various other places) but I wanted to write here specifically too. Apply by 1 October.

1. Project & events [$2-10k]

2. e-Scholarships [per-diem calculated on your city; 1-5 people (single, or as a 'remote group') for 2-4 days, for COVID-era "stay at home" projects. Paid in advance living allowance, no expense report required.]

There is lots of documentation, eligibility requirements, selection criteria, program design principles at those links. Please check them out. Sincerely, LWyatt (WMF) (talk) 13:58, 18 September 2020 (UTC)

Need help importing kn (Kannada) language data related to books & newspapers onto Wikidata

I need some hands-on help to import data on kn-language books and newspapers into Wikidata. I spoke to User:VIGNERON this afternoon about this in the WS session we had. Looking forward to your support in taking this forward. Thank you in advance. Omshivaprakash (talk) 15:26, 29 September 2020 (UTC)

Dewey Decimal property in works table -- update needed

Hi! In doing some editing of a work item, some colleagues and I noticed that the Dewey property in the work item properties table is not the right one. I think that rather than P1036 it should be P8359. I'm happy to go in and change it, but wanted to flag it here first. Many thanks! Aap1890 (talk) 20:08, 9 October 2020 (UTC)

Dewey Decimal Classification (works and editions) (P8359) is new. I didn't follow the discussion: Wikidata:Property proposal/Library classifications' IDs for topics. --Kolja21 (talk) 02:45, 16 October 2020 (UTC)

book edition (Q57933693)

Hi to all. We either need to build Q57933693 into our documentation, or we need to get it otherwise managed. It just becomes an extra level of confusion and management to have random unattached items out there.  — billinghurst sDrewth 01:12, 19 October 2020 (UTC)

Should short story (Q49084) be used with instance of, genre, or both?

Our usage here is completely inconsistent and seems to be totally random. So what should it be? For a short story, such as The Library of Babel (Q473), should short story (Q49084) be set as instance of, genre, or both? Please ping me on any replies. Thanks. Kaldari (talk) 15:40, 27 October 2020 (UTC)

@Kaldari: There was a discussion at Wikidata_talk:WikiProject_Books/2019#Novel_used_with_is_een_(P31)_instead_of_genre_(P136), but it doesn't seem to me to have reached any conclusion. I think that the problem is the definition of en:genre, which is a somewhat all-encompassing concept "Genres may be determined by literary technique, tone, content, or even (as in the case of fiction) length". Generally, when we have a specific property such as genre (P136) then we don't want to record the same information redundantly with instance of (P31). It's acceptable to have multiple P136 values, so setting genres of "short story" and "science fiction literature" can be done without needing to define "science fiction short story" (although this exists in any case: science fiction short story (Q21905924).) Ghouston (talk) 23:26, 29 October 2020 (UTC)
I look at this the way library catalogs do. When a library catalogs a work, they do it as Novel--Science fiction--Modern. Or Short story collection--Horror--Early modern. They give its form, its genre, and its period this way. So the best general form descriptor that's more specific than "something written" is what I use as the instance of. --EncycloPetey (talk) 15:49, 1 November 2020 (UTC)
What about for an edition item, is it not valid to set the genre to "short story"? Using P31 doesn't seem appropriate, since it's already set to version, edition, or translation (Q3331189). Ghouston (talk) 02:00, 2 November 2020 (UTC)
An edition item is, by nature, a sub-item of the primary item. If the sub-item is an edition or a translation of a short story, why would it be inappropriate to use both values on P31? We also have the option of using the qualifier of (P642) so it can be "Edition (of) short story". --EncycloPetey (talk) 22:19, 2 November 2020 (UTC)
I agree that many written work items are inconsistent. Whatever consensus is agreed upon, it would be wise to have periodic semi-automated bot-assisted sweeps (maybe twice a year, or more often if warranted) that can rectify and standardize the properties that go astray (I still get confused about whether a novel or non-fiction book should be an instance of written work (Q47461344) or literary work (Q7725634), and completely forget about form of creative work (P7937)). There are gazillions of permutations of works, editions, genres, and forms. Wikidata is still largely 'whatever works', Wild Wild West, and with tens of millions of items (probably soon to be hundreds of millions), it will take concerted effort to ensure standard and consistent data organization. -Animalparty (talk) 02:29, 3 November 2020 (UTC)

anthology edition

I haven't seen an edition like this before: Strange New Worlds Ⅱ (Q77035851). There are multiple values of edition or translation of (P629), each matching one of the stories collected in the work. I think maybe it's misusing the property? Ghouston (talk) 11:44, 10 November 2020 (UTC)

How to deal with author duos when each also has an individual item?

Example: Hard to Be a God (Q211927) currently has three values for author (P50): Arkady and Boris Strugatsky (Q153796), Boris Strugatsky (Q59054) and Arkady Strugatsky (Q61699). This is problematic because:

  1. It duplicates information.
  2. Naive processing will think there are three authors when there are only two. To get around this you have to specifically check each author item to see if any is "part" of another. That doesn't seem right.

However, I can also see why having all three listed makes it easier to query by both the duo or one of the individuals.

My compromise was to mark the duo item as "preferred" to differentiate. This allows "truthy" queries to give a reasonable result, while also retaining the flexibility to be able to include the normal-ranked individuals if need be (remember wdt isn't intended to cover everything). However this change was reverted (CC: User:Infovarius).

I could have just as easily marked the individuals as preferred instead, would that be better? If statement ranks aren't the answer, what is? This seems like an appropriate use of them to me, but maybe I'm mistaken. --NoInkling (talk) 12:09, 20 November 2020 (UTC)
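The "naive processing" problem described above can be illustrated with a small Python sketch. It assumes the consumer already has a mapping from group items to their member items (for example gathered from "has part" style statements; both that data source and the names MEMBERS and individual_authors are assumptions made for this example). Each group is expanded into its members before counting, so the three author values collapse to two people.

```python
def individual_authors(authors: set[str], members: dict[str, set[str]]) -> set[str]:
    """Expand group items into their members so nobody is counted twice."""
    result: set[str] = set()
    for qid in authors:
        # A group item contributes its members; an individual contributes itself.
        result |= members.get(qid, {qid})
    return result


# Hypothetical lookup table: the duo item and its two members.
MEMBERS = {"Q153796": {"Q59054", "Q61699"}}

if __name__ == "__main__":
    # Author values currently listed on Hard to Be a God (Q211927).
    listed = {"Q153796", "Q59054", "Q61699"}
    print(len(listed))                               # 3
    print(len(individual_authors(listed, MEMBERS)))  # 2
```

This is exactly the extra per-item check that the comment above argues a data consumer should not have to perform, which is why a rank-based or single-representation convention is being asked for.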

Just as another point of data in a slightly different domain, The Matrix (Q83495) currently doesn't include The Wachowskis (Q195719) at all, even though they were credited as a duo. --NoInkling (talk) 12:30, 20 November 2020 (UTC)

Personally, I'd set Boris Strugatsky (Q59054) and Arkady Strugatsky (Q61699) as the authors and omit the duo item. Ghouston (talk) 21:33, 20 November 2020 (UTC)

Banned books

Not too long ago, I introduced the statement instance of (P31) banned book (Q65770793) to Ulysses (Q6511) in the context of a wider discussion on the property "banned", which has not been created yet.

Later, another person split this single statement with multiple qualifiers (different dates in the US & UK) into separate statements.

After I brought this example up in the "banned" property discussion, an admin reverted it on sight and suggested that I ask here for better ways of modeling the encoding of data about banned books. Should there be a more general property "banned", should we use "instance of banned book", or are there other better solutions?

(I think the admin prefers not including information about the banned status on the entity itself, but rather on the banning entity -- sometimes created specifically for the single purpose of identifying the document banning the book (e.g. town council meeting, court decision, government edict, law, school board minutes, Index Librorum Prohibitorum, etc.).)

Since my proposal in May 2020 an active voice version "prohibits" was created using one of my main examples (the 13th Amendment "prohibits" slavery in the US), which is a good case for active voice use. In my opinion, Slaughterhouse-Five (Q265954) may be a pretty good case for passive voice use (banned by a town council and a school board). The question is larger, obviously, than just "banned books", but I'm wondering how you would model this given your expertise.

Any thoughts? SashiRolls (talk) 23:03, 10 December 2020 (UTC)

  • Let's start by discussing Ulysses (Q6511) instance of (P31) banned book (Q65770793). SashiRolls used it when to circumvent the property proposal process when the property was not created as he wanted. Whether or not we generally want to store information about whether a book was banned in the item of the book itself, this would not be the ideal way to store that information.
If a school board bans a book, that can mean many different things. It can mean that it doesn't allow its teachers to use it as part of the curriculum. It can mean ordering school libraries to get rid of the book. If the school board of New York bans Ulysses (Q6511) in any of those ways you can say that the book is banned in New York and a reader doesn't know what "banned" means.
If we instead model the information as "school board resolution XY" prohibits (P8739) Ulysses (Q6511) the nature of the ban is easier for the reader to assess and reason about.
For some books that are banned in different schools you likely could find 100s of different places where the book is banned. If we allow those 100 places being listed with separate instance of (P31) statements that would be a major headache.
Even listing them via another property would increase page-load time and bloat up the item about the book. ChristianKl❫ 00:03, 11 December 2020 (UTC)
ChristianKl: Stop the aggressive behaviour in line #1 immediately.
I found the entity related to banned books (which had been created prior to any discussion of the property "banned in") and did not understand why that entity would exist if it were not meant to be used as a category; so I used it to describe Ulysses with the help of another seemingly experienced user. SashiRolls (talk) 15:20, 11 December 2020 (UTC)
Cleaning up is not aggressive behavior no matter how often you call it that. The item for banned book (Q65770793) exists because https://fi.wikipedia.org/wiki/Kielletyt_kirjat makes it a notable item. ChristianKl❫ 17:22, 11 December 2020 (UTC)
Treat your elders with respect, Christian. Claiming that I created a claim "when to circumvent the property proposal process" is disrespectful, ungrammatical, and wrong. SashiRolls (talk) 17:45, 11 December 2020 (UTC)
  • "banned" is never an inherent property of a work of literature. It is a property it holds in relation to a specific group. So "banned book" is not an appropriate value for "instance of", because it applies to a relationship of the work to an entity, and not to any property of the work itself. --EncycloPetey (talk) 00:58, 11 December 2020 (UTC)
    ^^THIS^^, and it would have a start and finish date, which doesn't suit instance of (P31). I would think that it is just a significant event (P793) usage then qualified with by whom, when, etc.  — billinghurst sDrewth 10:28, 11 December 2020 (UTC)
    prohibits (P8739) still seems to be superior when you actually know what happened. significant event (P793) does have the advantage of not allowing non-significant bannings by a random school board and thus not bloating an item with hundreds of statements. ChristianKl❫ 13:43, 11 December 2020 (UTC)
    I have used prohibits (P8739) and its complementary property permits (P8738), resulting in property constraint violations on Obscenity trial of Ulysses in The Little Review (Q16153541) & United States v. One Book Called Ulysses (Q2895062). Could you explain why that is happening here, but did not happen on Thirteenth Amendment to the United States Constitution (Q175613)? I suspect it may be that there is no entity for a "US district court case"? SashiRolls (talk) 16:24, 11 December 2020 (UTC)
    The constraint warnings happen because there wasn't much effort put into setting up the constraint types. Now, the constraint is set for legal act (Q1864008)/decision (Q16513426). Creating an item for the "US district court case" class is just like creating an item for an individual "US district court case". It's easily created. ChristianKl❫ 17:26, 11 December 2020 (UTC)
    Billinghurst, this results in a value type constraint violation if I use the entities "banned book" or "book censorship in the United States" as the value since those are not events. Instead using the cases I mentioned just above solves that problem, but now it does not show the basic information that the book was censored/banned (though the reader may be able to guess that). Similarly on the event pages (case pages) mentioned above there is no way for SPARQL to infer that Ulysses is a book. This is no help for generating a list of banned books. Hmm... I guess you can figure out via SPARQL that it's a literary work: https://w.wiki/qEZ SashiRolls (talk) 17:15, 11 December 2020 (UTC)
    Thank you for this answer, which corresponds pretty closely to what I guessed the reasoning behind the deletion must have been. I am still trying to understand the reasoning for treating special events like the very early award received (P166) & nominated for (P1411) (which record similar "accidental" a posteriori-type relationships) differently than book bannings. SashiRolls (talk) 15:20, 11 December 2020 (UTC)
  • award received (P166) & nominated for (P1411) both have a very clear meaning. It's clear what it means to receive an award. Bannings need more context. Just like there are different awards there are different bannings and additionally the jurisdiction of the banning matters. ChristianKl❫ 17:18, 11 December 2020 (UTC)
This is why the property proposal is "banned in". SashiRolls (talk) 18:39, 11 December 2020 (UTC)
When used with the banning that doesn't tell us about the nature of the banning. Ulysses (Q6511) significant event (P793) Obscenity trial of Ulysses in The Little Review (Q16153541) is much better at that task. ChristianKl❫ 15:36, 12 December 2020 (UTC)
  • @Infovarius: can you explain why you disagree with the consensus here to the point of readding Ulysses (Q6511) instance of (P31) banned book (Q65770793)? ChristianKl❫ 15:36, 12 December 2020 (UTC)
    • Because I wasn't aware of this discussion. And without this claim there is no info about the banning in the UK. I won't object if this is expressed in another way. --Infovarius (talk) 22:20, 12 December 2020 (UTC)
      • I added a bit on the discussion page which may include leads towards a "banning" primary source. This problem could have been avoided had a message been left on the talk page of Ulysses (Q6511) when the British library article by an Oxford professor was summarily deleted. ChristianKl, please use the talk page in future. It appears you did do so on this burning urine (Q40924) issue from the same day you deleted this source, not sure why it wasn't done here... SashiRolls (talk) 22:58, 12 December 2020 (UTC)
        • I didn't delete any items for a British Library article by an Oxford professor. To the extent that a British Library article by an Oxford professor is notable, it deserves its own item. I deleted content that violated the general way data gets modeled on Wikidata. The decision of how a relationship is modeled is not one for the talk page of an individual item because it's a broader issue. Wikidata is not Wikipedia.
I sent you here to WikiProject Books because that's actually the venue for the question of how certain information about books should be modeled. It's helpful to have discussions in one place. ChristianKl❫ 01:29, 13 December 2020 (UTC)
Here is the diff of you deleting the British Library article by an Oxford professor. You just deleted the same reference again, so I'm finding it difficult to believe you could have overlooked that fact. Edit-warring is not useful. Problem-solving is. You have just removed information about the fact that Ulysses was effectively censored in the UK from 1922-1936 again, saying nothing on the talk page about your removal of that information with its corroborating reference, but instead suggested it was wiser not to leave any record of previously richer versions containing more information there. Very perplexing behaviour for a putatively collaborative project. Still waiting to see how you propose encoding that UK censorship without "banned in"... SashiRolls (talk)

Have you guys considered using banned content rating (Q104153449)? @SashiRolls: --Trade (talk) 01:09, 13 December 2020 (UTC)

  • @Trade: that's a metaclass for items like BPjM restricted (Q102129161), used with properties that are about content ratings like USK rating (P914). We have a lot of those properties because it's valuable to structure such relations in a very precise way. SashiRolls wants to store much more general statements than we are used to storing about content ratings.
The fact that a restriction on selling certain games to people under 18 is effectively modeled as a ban is a good illustration of why just slapping instance of (P31) "banned X" on items is so problematic. Banning means lots of different things. ChristianKl❫ 01:54, 13 December 2020 (UTC)

What to do about existing mixed work/edition items?

According to this query we have more than 16,000 items with ISBN-13 (P212) and publisher (P123) (indicating a book edition) that are neither an instance of version, edition, or translation (Q3331189) nor have an edition. I found two cases:

Could we make the first or both of these cases also instance of version, edition, or translation (Q3331189)? -- JakobVoss (talk) 07:59, 14 December 2020 (UTC)

  • I made a suggestion at Wikidata_talk:WikiProject_Periodicals for an item class representing a conflation of literary work (Q7725634) and version, edition, or translation (Q3331189). I'm assuming that many articles published in periodicals, including academic articles, will never have separate edition items, so it should be handled explicitly somehow. A problem, however, is what you do if you decide later to split the work and edition, since the item may be linked from elsewhere as either a work (such as from VIAF) or as an edition (such as a reference in Wikipedia). This problem would also occur with conflations as they are currently encoded. Perhaps you would just leave the item as a conflation of a work and one of its editions, with additional edition items created as required. Ghouston (talk) 01:04, 29 December 2020 (UTC)
    (general comment about what I have done around nexus). It is an imperfect model as there are so many variations. My approach (not calling it a solution) has been to list biographical compilations as version, edition, or translation (Q3331189) instance, and probably as its own work item (if I can be bothered). Each biographical component has been listed as biographical article (Q19389637) and is linked to the edition with published in (P1433), as I think that the whole work is reproduced as a new edition, never really as individual articles. Example: Men of Kent and Kentishmen (Q101589898) containing William Prude (Q104636618). Yes, less than perfect.
    Follow similar process if I have a compilation of papers; or chapters of a book. Works, though imperfect.
    Poetry compilations, it doesn't work as individual pieces get a life of their own, in republishing by author, by other authors, or translations. So I always make each item Q3331189, and will look to complete a parent item.
    there are other variations and it defeats good rules, so to me my internal guidance is to whether it is referenced (article/chapter/...) or readily reproduced (separate edition). I think it falls outside of any guidance, but it is my working practice (until someone gives me something better).  — billinghurst sDrewth 13:30, 7 January 2021 (UTC)
  • The problem with that is that such conflations aren't a very good database model. Perhaps a better idea would be to accept that for many works it's only necessary to have an edition item. The "work" item would only be needed if you want either a) to link multiple edition items together or b) to include external links to "work" items in other databases. Ghouston (talk) 06:05, 3 January 2021 (UTC)

Best way to indicate a cross reference subpage?

The listing of articles within an edition is always an interesting exercise. When I have done so for biographical and encyclopaedic compilations, I have generally ignored that they are their own editions and labelled them as articles of some sort.

Also, in some of these works there is a 'see' cross-reference/redirect. Where they have been reproduced in DNB they were made an instance of DNB redirect page (Q19648608) (a subclass of cross-reference (Q1302249) and Wikimedia redirect page (Q21528878)). I am now a bit stuck on how these would be represented where they are recreated at Wikisource, as there is no generic equivalent, and I am looking to use something on O'Kelly, John Thomas (Q99196993), as both of the subclass items mentioned throw up a violation. I don't wish to create a DNB equivalent in this case. @ChristianKl:  — billinghurst sDrewth 22:36, 28 December 2020 (UTC)

  • I am describing what currently exists within DNB and has existed for years, which is what I attempted to copy. So I would prefer solutions about what we can do, rather than being told what I shouldn't.  — billinghurst sDrewth 07:16, 3 January 2021 (UTC)

Instance of literary work vs book

I've created my first WD item for a book, which will be followed by matching, separate items for each of its four translated editions (all by different publishers, different dates). Stuck on the first Statement: Instance of - which is correct, literary work or book? The "book" has won numerous awards, so does that relate to "book" (as a publication) rather than the creative aspect of a "literary work"? -- Deborahjay (talk) 16:04, 3 February 2021 (UTC)

I think a "book" is an edition, and you'd want a literary work item to link together multiple editions. Ghouston (talk) 22:36, 3 February 2021 (UTC)
A book is not an edition in my camp, and I avoid the term book as it is too indeterminate IMNSHO. Use of literary work can never be wrong, and then I always just use edition for each version. About the only time that I vary is when it is a speech that is later published, so I will always identify the speech and the date it was given as a different criterion. @Deborahjay: Can I suggest that you utilise the WE framework gadget/tool, as it allows you to do a work or an edition and populate all in one fell swoop. It is set up at English Wikisource as a gadget, though I have set mine to run globally through my global.js file.  — billinghurst sDrewth 04:55, 4 February 2021 (UTC)
We've decided before not to use "book" since it can refer to many different levels: a work, an edition, a physical copy, a format of manufacture, a section of a classical work, and many other things besides. --EncycloPetey (talk) 01:35, 3 March 2021 (UTC)

Conflicting information

The identifiers on Nightmare Japan: Contemporary Japanese Horror Cinema (2008 Rodopi Publishers ed.) (Q105713976) are giving conflicting information regarding publication date and number of pages. Can anyone help me sort it out? --Trade (talk) 00:56, 3 March 2021 (UTC)

For the date, I'd probably go with 2008. Amazon has a date on its page of November 19, 2007, but using the "look inside", the date printed in the book is 2008, and that's what other sources have. "1 January 2008" is probably just another way of saying 2008. If you also look inside at the table of contents, the numbered pages run from 1 to 217 plus the size of the index, but if you scroll to the end of the book, it finishes on page 219. I don't know whether the page count in Wikidata is supposed to include front matter, etc., though. Property talk:P1104 isn't entirely conclusive, to me. Ghouston (talk) 01:46, 3 March 2021 (UTC)