Wikidata talk:WikiProject Books/2017

WikiCite 2017 applications open through February 27

  WikiProject Books has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

 

(apologies for cross-posting) We just announced that applications for WikiCite 2017 are open until February 27, 2017. WikiCite 2017 is a 3-day conference, summit and hack day to be hosted in Vienna, Austria, on May 23-25, 2017. It expands efforts started last year with WikiCite 2016 to design a central bibliographic repository, as well as tools and strategies to improve information quality and verifiability in Wikimedia projects. Our goal is to bring together Wikimedia contributors, data modelers, information and library science experts, software engineers, designers and academic researchers who have experience working with citations and bibliographic data in Wikipedia, Wikidata and other Wikimedia projects. Members of WikiProject Books are among the most relevant and knowledgeable communities in Wikidata on the topic of structured data for sources and it would be fantastic to see you in Vienna. We have (limited) travel funding available; if you're interested in participating, please consider submitting an application. This year's event will be held at the same venue as the Wikimedia Hackathon and we'll be able to accommodate up to 100 participants. Any questions? Get in touch with the organizers at: wikicite@wikimedia.org --DarTar (talk) 18:25, 11 February 2017 (UTC)

A question I also asked on Meta: why is this a closed event? It seems that last year's event already wasn't open the way most Wikimedia events are. Does WMF allow funds to be used for private reunions?
--- Jura 15:47, 26 February 2017 (UTC)

Genres for non-fiction works

I've been struggling with how to categorize non-fiction works like books on sewing and embroidery techniques. For the time being, I have made a new genre textile arts and crafts (Q28940853) and put it directly under nonfiction (Q213051), but I'd like to build out at least a skeleton of a genre hierarchy for crafts.

nonfiction (Q213051) seems to have <instances of> that are non-fiction works and its subclass tree has more refined genres. It includes a tree:

I note that Publishers Weekly has a set of 17 categories of adult non-fiction. Has anyone looked at mapping these to Wikidata's genres?

I expect I've missed a bunch of prior conversations about genre in this project, so forgive me if there's guidance somewhere that I haven't seen. - PKM (talk) 01:54, 14 March 2017 (UTC)

Some missing properties for language textbooks

Textbooks often have a rating for the level of the language you will reach. This textbook for example (https://www.oebv.at/search?global-query=szia) covers the topics needed to pass the A2 test in the GERS system (known in English as CEFR). Also, we don't seem to have properties to link to the workbook and the CD that accompany the textbook. Does anyone have an opinion about how best to capture these relations? --Tobias1984 (talk) 17:10, 26 March 2017 (UTC)

What percentage of textbooks, in WD and in the language concerned, are rated by this kind of parameter? What is the purpose of this data in WD? Everyone can work on what they want, but since you ask for the community's opinion, I give you mine. Snipre (talk) 21:06, 26 March 2017 (UTC)

Property for "number of volumes" of an edition

What property should be used to record the number of volumes a multi-volume edition is published in? The only discussion I've found is this, which isn't very conclusive. There's number of parts of this work (P2635), which seems the closest (there's also number of episodes (P1113), which is sort of similar). Perhaps there's no need for this, because each volume should be its own item and use volume (P478) to record the volume number? Sam Wilson 08:51, 6 April 2017 (UTC)

I'm not certain that each volume is its own item. While some works have volumes that were published over time, e.g. the 63 volumes of DNB, the errata and the first and second supplements were published at the same time. Other works were similarly published all at once, and the only reason they are in multiple volumes is the limitation of the bookbinding process. I do know that where we have transcribed multi-volume works, we have added each volume as a Wikisource Index link.  — billinghurst sDrewth 12:59, 6 April 2017 (UTC)
There are two cases:
  • The case where a collection is composed of several volumes. Most of the time in this case, each volume is considered as a distinct work with its own set of authors and identifiers like ISBN. In that case each volume is a work and the relation with the collection item should be done by part of (P361).
  • The case of one work divided into several volumes for printing reasons. The first thing to check is whether each volume has a different ISBN. If yes, we should create an item for each volume to avoid any constraint violation and use part of (P361) to link all volumes to a general work item.
If there is no ISBN for each volume, then we have to discuss whether we want separate items. We can use one item for the general work without any mention of the fact that several volumes exist, and add the volume number in the reference section of the statement, as we do now with pages. Snipre (talk) 13:18, 6 April 2017 (UTC)
Either way, we simply do need to record the number of volumes in the edition/series. And fwiw the Wikisources are generally working on works preceding ISBNs, and these older works are the ones most likely to be multi-volume works published all at once because of said binding limitations. Example: s:My Life in Two Hemispheres / My Life in Two Hemispheres (Q19065902).  — billinghurst sDrewth 22:40, 6 April 2017 (UTC)
It looks like this has been proposed before: Wikidata:Property proposal/Number of volumes and the solution is to use number of parts of this work (P2635) with volume (Q1238720) or chapter (Q1980247) as the units. I'll add something to the documentation here. Sam Wilson 04:20, 7 April 2017 (UTC)
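The convention settled on above (P2635 with a unit item) can be sketched in a few lines of Python. This is only an illustration: the helper name `number_of_parts` and the flat dictionary layout are assumptions made for the example, loosely modelled on how Wikibase stores quantities (a signed amount string plus a unit entity URI), and not an official client API.

```python
# Minimal sketch, assuming a simplified quantity layout (hypothetical helper).
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"

VOLUME = "Q1238720"   # volume, as named in the thread
CHAPTER = "Q1980247"  # chapter, as named in the thread


def number_of_parts(amount: int, unit_qid: str) -> dict:
    """Describe a 'number of parts of this work' (P2635) statement with a unit."""
    if amount < 1:
        raise ValueError("an edition has at least one part")
    return {
        "property": "P2635",
        "amount": f"+{amount}",            # quantities carry an explicit sign
        "unit": WIKIDATA_ENTITY + unit_qid,  # the unit is itself an item
    }


# The 63 volumes of DNB mentioned earlier in the thread.
dnb_volumes = number_of_parts(63, VOLUME)
print(dnb_volumes)
```

Using the unit to distinguish volumes from chapters keeps a single property usable for both cases, which is the point of the earlier property proposal.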

Wikidata:Property proposal/contributed to

Hi. The proposal that I put forward to fix some of the misapplication of contributor to the creative work or subject (P767) is quiescent. I would appreciate any comments on whether this is a good or a bad idea. I see that we need this property so that, where we have contributors to journals, periodicals, or extensive multi-volume works, we can list on their author page what they contributed to. Thanks.  — billinghurst sDrewth 06:56, 21 April 2017 (UTC)

To note that this has been done, and is available as contributed to creative work (P3919).  — billinghurst sDrewth 00:53, 25 May 2017 (UTC)

Editions that are 'editions of' more than one work

Just out of interest: editions that belong to more than one work. There are quite a few! Sam Wilson 20:22, 24 May 2017 (UTC)

I did find that Bible translation (Q86860) was a subclass of edition, which would be problematic. There needs to be a constraint ensuring that a work cannot be subsidiary to a version, edition or translation (Q3331189).  — billinghurst sDrewth 00:46, 25 May 2017 (UTC)
Also found that something like Septuagint (Q29334) is an "edition or translation" that has editions. Should that be caught in your query? It should be showing as a work, not an edition.  — billinghurst sDrewth 00:51, 25 May 2017 (UTC)
I see that a number of these are biblical, so it would seem that someone needs to create the intervening parent works for the editions. For the query, would you please be able to add the language of the underlying edition, to help people work out whether they have a hope of solving it?  — billinghurst sDrewth 01:05, 25 May 2017 (UTC)
The query probably needs a tweak: Q23541390 should not be appearing. Otherwise we have an upstream problem rather than one at the work level: hosed data. It definitely needs a tweak; I see Så går det till på Saltkråkan (Q20855913).  — billinghurst sDrewth 01:15, 25 May 2017 (UTC)
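For readers who want to reproduce this kind of check offline, here is a rough Python sketch of the grouping step. It assumes the "edition or translation of" (P629) statements have already been exported as (edition, work) pairs; the function name and the sample identifiers are invented for the illustration and do not correspond to real items or to the query linked above.

```python
from collections import defaultdict


def editions_with_several_works(claims):
    """Return {edition: [works]} for editions linked to more than one work.

    `claims` is an iterable of (edition_id, work_id) pairs, one pair per
    'edition or translation of' statement.
    """
    works_by_edition = defaultdict(set)
    for edition, work in claims:
        works_by_edition[edition].add(work)
    return {
        edition: sorted(works)
        for edition, works in works_by_edition.items()
        if len(works) > 1
    }


# Hypothetical sample data: E2 claims to be an edition of two works.
sample = [("E1", "W1"), ("E2", "W1"), ("E2", "W2"), ("E3", "W3"), ("E3", "W3")]
suspects = editions_with_several_works(sample)
print(suspects)
```

A duplicated statement (E3 above) is deliberately not reported, since it points at a single work.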

Clarification in Wikidata:WikiProject Books documentation

Hi all,

There are a lot of properties on Wikidata:WikiProject Books, which is awesome, but as there are so many, I find it a bit difficult to tell which properties are really important/relevant/useful and which are not (and I'm used to these structures and have a librarian background; I can't imagine how hard it would be for a newcomer).

May I suggest splitting the tables into two parts? (I'm not sure about the names of the subsections, but probably something along the lines of « mandatory/most common properties » and « extensive/full/other properties ».) What do you think?

Cdlt, VIGNERON (talk) 10:57, 26 June 2017 (UTC)

Can't someone compile statistics on how they are currently used? If there is a clear group of outliers, this is probably going to be so for a long time.--Alexmar983 (talk) 11:24, 26 June 2017 (UTC)
It's quite easy; for book (Q571):
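#defaultView:BubbleChart
# Count how often each property is used on instances of book (Q571)
SELECT ?property ?propertyLabel ?propertyDescription ?count WHERE {
	{
		# Subquery: usage count for every predicate found on those items
		SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
			?item wdt:P31 wd:Q571 .
			?item ?propertyclaim [] .
		} GROUP BY ?propertyclaim
	}
	# Keep only statement predicates and map each one to its property entity
	?property wikibase:claim ?propertyclaim .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
} ORDER BY DESC(?count)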
(by the way, you can see that some of the most used properties are not supposed to be used on book (Q571), like publication date (P577) or ISBN-13 (P212) :( ).
And for version, edition or translation (Q3331189) :
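#defaultView:BubbleChart
# Same count for instances of version, edition or translation (Q3331189)
SELECT ?property ?propertyLabel ?propertyDescription ?count WHERE {
	{
		# Subquery: usage count for every predicate found on those items
		SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
			?item wdt:P31 wd:Q3331189 .
			?item ?propertyclaim [] .
		} GROUP BY ?propertyclaim
	}
	# Keep only statement predicates and map each one to its property entity
	?property wikibase:claim ?propertyclaim .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
} ORDER BY DESC(?count)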
Cdlt, VIGNERON (talk) 11:36, 26 June 2017 (UTC)
It would be interesting to know how many items WikiProject Books considers meet its standards and include a core set of properties. I keep hearing that this WikiProject can't move ahead because of this or that problem (mostly by users only commenting on talk pages, not actual contributors). So from those actually contributing here, I'd be interested in hearing where it stands.
--- Jura 11:51, 26 June 2017 (UTC)
The « core set of properties » is not well-defined (if defined at all); that is exactly what I'm asking for here.
As for meeting the standard, as I mentioned and as the queries show, there are some significant violations (of the 94215 items with P31 = Q571, there are 46131 claims with publication date (P577) = 49% and 25677 claims with ISBN-13 (P212) = 27%). I spend a lot of time correcting them, but I feel I'm kind of alone (at least, clearly, the number of violations is not going down :( ). I think and I hope that some clarifications would help to solve the problem.
Cdlt, VIGNERON (talk) 12:13, 26 June 2017 (UTC)
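As a quick sanity check of the percentages quoted above, the arithmetic can be redone in Python. The three counts are the ones given in the comment; the helper name `share` is just a label chosen for this sketch.

```python
# Counts quoted in the comment above (June 2017 snapshot).
total_books = 94215            # items with P31 = Q571
with_publication_date = 46131  # claims with publication date (P577)
with_isbn13 = 25677            # claims with ISBN-13 (P212)


def share(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


print(share(with_publication_date, total_books), "% with P577")
print(share(with_isbn13, total_books), "% with P212")
```

Both rounded values agree with the 49% and 27% stated in the thread.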
At Wikidata:WikiProject Movies/Properties#Core_properties I tried to determine what would be the core properties and did some statistics based on that Wikidata:WikiProject_Movies/Numbers#Key_indicators. For films, the properties are fairly stable, but many items are incomplete. Still, I think it's progressing nicely. This without much theory about how it could or ought to be abstracted into some other model.
--- Jura 12:33, 26 June 2017 (UTC)
Thanks for the example; that's exactly what I wish for WikiProject Books. Other wikidatalibrarians, what do you think? Cdlt, VIGNERON (talk) 12:40, 26 June 2017 (UTC)
I really hope Chiara and I can dedicate some time to this in the following weeks/months. We have sort of a working draft from the WikiCite conference, and if some users here (like you, Vigneron ;-) can help us clean up some of the existing mess and populate some other items, we can make that the standard. Unfortunately, books are in my experience much more complicated than movies and the like, also because we have a few more centuries of production and cataloging ;-) Aubrey (talk) 13:11, 26 June 2017 (UTC)
I think our main problem is that most editors who are not part of the Books project will enter both works and editions as "books". The problem isn't that the "wrong" properties are added to books, but rather that the common-sense non-specialist notion that The Lord of the Rings is a book isn't correct in Wikidata. I believe we need to accept that "book" is a perfectly natural label for "a published work" and that editors will continue to use it that way. I look forward to seeing the work that came out of WikiCite. - PKM (talk) 18:54, 26 June 2017 (UTC)
If we consider that the current approach isn't working out, one needs to find a different way. Maybe a Wikibase feature could make inclusion of edition information easier.
--- Jura 11:02, 27 June 2017 (UTC)
@PKM, Jura1: is the current approach really not working? Sure, there is a lot of mess, but nothing that can't be solved if we give clear and precise rules (which is not quite the case right now) and if we roll up our sleeves and work on implementing them. The global structure sounds good to me; indeed, the problem is more in the workflow. Moreover, because of the Wikisources, I don't see how we can avoid making the FRBR distinction. Can we move forward or not? Cdlt, VIGNERON (talk) 11:45, 30 June 2017 (UTC)
Supposedly, any scheme can be made to work with sufficient work, but this doesn't necessarily mean it's one that works well or is even practically relevant. In this case, it seems that even people who keep advocating this approach for books (and other works) don't actually use it, either due to a lack of general interest in the projects or due to an inability to actually use it. Maybe the experience with datatypes for Wiktionary could eventually help find a solution that could work out. In the meantime, a simplified approach that focuses on a few main properties could work out better.
--- Jura 10:15, 1 July 2017 (UTC)

Proposal

Right now, we only have one table with the following properties (in this order) :

Like on Wikidata:WikiProject Movies/Properties, I'd like to split it into several tables :

This is only a first proposal to start a concrete discussion of what is in the core (mandatory properties always expected) and what is not (I tried to split it according to what I know and what is used, but I'm not sure). What do you think? Cdlt, VIGNERON (talk) 12:07, 30 June 2017 (UTC)

I confess: I am an intruder, I signed and forgot. So you can ignore what I say, but here it is.
It seems clear. We start and we improve over the years. I work with newbies, they need "structure" and, if possible, core properties, especially those which might be shared with other items should be separated. Some newbies start from the very generic ones, other newbies have some experience so they have to be introduced to the more specific ones. Some order does help.
I suggest in any case, from my experience with other items, clarifying the "image" property. It will be difficult to reorganize it later. People, for example, have many types of minor image properties, from signature to grave/tomb... We should have something more precise such as "cover of first edition" IMHO.--Alexmar983 (talk) 12:38, 30 June 2017 (UTC)
Having a list with core properties is of course useful. But what would be really helpful is an input mask for books (and articles) and an import tool. In LibraryThing I just need to click on a catalog of one of the major libraries to import the data. Nobody, except a few experts, should need to read a handbook and remember dozens of properties to enter bibliographical data. Wikidata is still running in DOS-Mode. --Kolja21 (talk) 13:08, 30 June 2017 (UTC)
I think the list of core vs. other properties should be different depending on what the item is an instance of. For a book, journal, etc., "inception" might be a core property, while for an edition or a journal article "publication date" might be a core property instead. Jc3s5h (talk) 13:55, 30 June 2017 (UTC)
@Alexmar983: intruding is good; one of the problems here is being easily understandable by newcomers, so your external point of view is very useful.
@Kolja21: very true, but don't these tools need a definition of which properties are important/mandatory and which are not? BTW, how does LibraryThing handle copyright? (I understood it could be a major problem for mass import.)
@Jc3s5h: also true, but I'm not entirely sure; for me a core property is a property that is always needed, regardless of the document type. As for inception (P571), maybe (probably) it's not a core property at all; I added it because the query above shows it's the 5th most used property on instance of (P31) = book (Q571) (reminder: I'm only talking about the « Work item properties » for now; we can talk about the other sections afterwards).
Cdlt, VIGNERON (talk) 09:40, 1 July 2017 (UTC)
For movies, "core" properties should be available for any film (currently this isn't entirely true). "main" properties might eventually vary depending on genre, but I also used frequency of actual use in Wikidata to distinguish from "core" ones.
As to your suggestion here, personally, I wouldn't include conditional properties (e.g. subtitle (P1680)).
--- Jura 10:15, 1 July 2017 (UTC)
If a core property is always needed, and we are only discussing works, not editions, then "inception" is a core property; it would be equivalent to the "orig-date" parameter in citation templates, but only the "orig-date" of the 1st edition. "Author" is not a core property because some works are created by some sort of committee, and in some citation styles, only the publisher is cited. Also, various editions may have different authors, especially technical works. "Genre" is not a core property because technical works do not have any genre (at least, not the way genre (P136) is currently described: "a creative work's genre or the genre in which an artist works"). Jc3s5h (talk) 12:10, 1 July 2017 (UTC)


I think it's a great idea to split the table up into these groups. I'm not sure we need to worry too much about classes outside of books and "sub-books" (of which there are currently 92; we could probably clean these up a bit) but even just for books and their editions (are we assuming that everything of class 'book' or one of its subclasses has editions?) I think it'd be great to get some more clarity. Be bold, I say. Sam Wilson 10:20, 1 July 2017 (UTC)
@VIGNERON: Splitting the properties into different categories is not enough: just have a look at the discussion below. Having a list of core properties doesn't say anything about when or how to use them. Just take the example of subtitle (P1680): what do we have to do if we have a work without a subtitle? Do I have to add some kind of empty subtitle statement? If yes, with which value? Then what about author (P50): if I have an edition, do I have to add the author, or is the link to the work item enough to access the author's name? If we use the FRBR system we need to have a list of core properties and optional properties for each level:
| Properties | Work | Edition (in original language) | Edition (translated version) | Exemplar | Manuscript |
|---|---|---|---|---|---|
| instance of (P31) | Mandatory | Mandatory with value version, edition or translation (Q3331189) | Mandatory with value version, edition or translation (Q3331189) | ... | ... |
| has edition or translation (P747) | Optional | na | na | ... | ... |
| title (P1476) | Mandatory | Mandatory | Mandatory in the translated language | ... | ... |
| subtitle (P1680) | Mandatory | Mandatory | Mandatory in the translated language | ... | ... |
| publication date (P577) | na | Mandatory | Mandatory | ... | ... |
| place of publication (P291) | na | Mandatory | Mandatory | ... | na |
| inception (P571) | ? | na | na | ... | ... |
| genre (P136) | Mandatory | - [1] | - [1] | ... | ... |
| author (P50) | Mandatory | - [1] | - [1] | ... | ... |
| author of foreword (P2679) | na | Mandatory | Mandatory | Mandatory | na |
| author of afterword (P2680) | na | Mandatory | Mandatory | Mandatory | na |
| editor (P98) | na | Mandatory | Mandatory | Mandatory | na |
| contributor to the creative work or subject (P767) | ? | ... | ... | ... | ... |
| publisher (P123) | na | Mandatory | Mandatory | ... | ... |
| translator (P655) | na | na | Mandatory | ... | ... |
| illustrator (P110) | na | Optional | Optional | ... | ... |

na: not applicable (the property should never be used in this class of items)
mandatory: the property must always be present in the item, with a value, no value or unknown value
optional: the property can be present in the item (with a value) or not
[1] Values are retrieved from the work item linked to the edition item
Then, as described by Jc3s5h, there are always exceptions, and depending on the number of exceptions, a rule has to be defined in order to handle them. Finally, I just take the example of Help:Sources: having a list of properties is not sufficient; you have to define a list of cases, and for each case you have to create your list of core properties with optional properties and, when needed, describe the special cases and the way to add data when the standard model is not applicable. So the first step is not to define one, two or three lists of properties but to define the standard cases. For the moment we have at least three: work, edition and real book, like a manuscript in a museum. So we need the core properties for the work, edition and real book cases. The question is: do we have more cases than these three? Snipre (talk) 11:48, 3 July 2017 (UTC)
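To make the per-level idea concrete, here is a small Python sketch of how such a table could drive an automated check. Only a handful of rows from the table above are encoded, the "edition" level corresponds to the "Edition (in original language)" column, and the names `MODEL` and `check_item` are invented for this illustration; it is a sketch of the proposal, not an existing Wikidata constraint mechanism.

```python
MANDATORY, OPTIONAL, NOT_APPLICABLE = "mandatory", "optional", "na"

# A few rows of the proposed table, keyed by FRBR level (assumed subset).
MODEL = {
    "work": {
        "P31": MANDATORY, "P747": OPTIONAL, "P1476": MANDATORY,
        "P577": NOT_APPLICABLE, "P291": NOT_APPLICABLE, "P123": NOT_APPLICABLE,
    },
    "edition": {
        "P31": MANDATORY, "P747": NOT_APPLICABLE, "P1476": MANDATORY,
        "P577": MANDATORY, "P291": MANDATORY, "P123": MANDATORY,
    },
}


def check_item(level, properties):
    """Compare the properties present on an item with the rules for its level."""
    rules = MODEL[level]
    present = set(properties)
    missing = sorted(p for p, rule in rules.items()
                     if rule == MANDATORY and p not in present)
    forbidden = sorted(p for p in present
                       if rules.get(p) == NOT_APPLICABLE)
    return {"missing": missing, "forbidden": forbidden}


# A "work" item carrying a publication date (P577), as in the statistics above.
# P212 is not covered by this partial model, so it is not reported.
report = check_item("work", {"P31", "P1476", "P577", "P212"})
print(report)
```

Properties absent from the model are ignored rather than flagged, which leaves room for the exceptions and special cases mentioned above.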
Can we create a page like Wikidata:WikiProject Books/Data model in order to have a place to discuss and to propose things ? Snipre (talk) 11:56, 3 July 2017 (UTC)
@Snipre: Of course it's not enough; it's a beginning and we are on a wiki. Plus, I prefer to go slowly but steadily (for instance, we all know that instance of (P31) is the elephant in the room and we probably should entirely rethink it, or that the duplication of author (P50) is not optimal and the value should indeed be retrieved instead; but those questions are both too big and too small to be relevant here).
As for the exceptions, depending on how you count, 100% of « books » can be exceptions as they're all different (manuscript, codex, volumen, incunabulum, journal, magazine, paper, publication, edition, tome, volume, textbook, comic book, flyer, booklet, script, libretto, etc. etc. etc.) and you can distinguish billions of different cases. The three FRBR levels seem to be a good trade-off: a bit complex, but simple and precise enough without being too complicated.
I like the format of your table (except that I don't see why you split edition and translation, and that I'm not sure what you're calling « Real book »; is it exemplar? FYI, all four FRBR levels are commonly named book). We should probably use it in the introduction of the page; what do you bookworms think?
Cdlt, VIGNERON (talk) 15:15, 3 July 2017 (UTC)
Yeah, I think we should try to find an approach that is easier for contributors (and users). Theoreticians and Wikipedians alike tend to get lost with the current approach and don't know how to add things.
--- Jura 15:23, 3 July 2017 (UTC)
The current user interface seems ill-suited to do this sort of thing. The chief problem I see in creating an automated system is the need to check for items that already exist before a given citation is created, such as works, books, publishers, and authors, since the names of these items are not completely predictable, so they are hard to search for reliably. Jc3s5h (talk) 15:33, 3 July 2017 (UTC)
I don't really get your point: in WD we use items, not strings, so even if we create 2 or 3 different items for the same publisher, for example, we can merge the items later without making any major changes in the WP articles: the data will be automatically updated. This is not a critical problem preventing us from starting work: I am currently curating hundreds of duplicates about chemicals after bot imports. Snipre (talk) 15:59, 3 July 2017 (UTC)
@VIGNERON: Here are some differences in our visions: I don't think we should start from too simple a model, because we will never do the work twice, and without a proposed starting classification contributors will create their own.
If I take your list (manuscript, codex, volumen, incunabulum, journal, magazine, paper, publication, edition, tome, volume, textbook, comic book, flyer, booklet, script, libretto,...) I think we can already group these different types of books into 2-4 groups (here we need the input of specialists) in order to start the classification. It will always be easier to go deeper into the classification later when starting from 2-4 groups of books than from one big group.
As for the reason I split edition into 2 groups: the differences between the two groups are growing. The language is different (for an edition in the original language we can assume that the language is not necessary, as we can find it in the work item), there is a translator (a specific property), and often the translation is made from another edition in the original language, so we can create a link between a translated edition and an edition in the original language using a property yet to be created.
And yes, real book means exemplar. Snipre (talk) 15:51, 3 July 2017 (UTC)
I don't think that you can summarize the huge diversity of books in 2-4 groups. They are not even on exactly the same levels. Above all, the definition level: some terms are very specific (incunable (Q216665) is a book printed in Europe between 1450 and 1501; libretto (Q131084) is the book containing the text of a musical work, or the text itself; etc.), some are very general (manuscript: everything written by hand; publication: everything published; etc.). But there is a large cultural dimension too.
While we of course need to improve the current classification (some parts are not bad but some parts are a Brobdingnagian mess), I'm not sure this is the thing we need the most right now. Here, I would prefer to focus on making the structure we already have better and easier to understand. If people want to work on classification, a new thread would be more appropriate.
Cdlt, VIGNERON (talk) 16:56, 3 July 2017 (UTC)

Talk:Q3331189 shows editions and works loop hierarchically

Looking at Talk:Q3331189 it seems that we have loops within the hierarchy of how and where editions fall. Creative work, intellectual and work appear and loop.  — billinghurst sDrewth 21:50, 26 June 2017 (UTC)

It is quite complex but it seems there is only one direct loop between text (Q234460) and literary work (Q7725634), each subclass of (P279) of the other (the second added by Laddo last month).
Cdlt, VIGNERON (talk) 07:47, 27 June 2017 (UTC)
My bad, I corrected that. -- LaddΩ chat ;) 12:37, 27 June 2017 (UTC)
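Loops like the one found here can be detected mechanically. The following Python sketch runs a depth-first search over a subclass of (P279) graph given as a plain dictionary; the function name `find_cycle` and the tiny two-node graph are assumptions made for the illustration (only the text / literary work pair comes from the thread, and which of the two statements was removed is not specified there).

```python
def find_cycle(subclass_of):
    """Return a list of classes forming a loop, or None if the graph has none.

    `subclass_of` maps each class to the list of its direct superclasses.
    """
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for parent in subclass_of.get(node, ()):
            if parent in in_progress:
                # Reached a class that is still on the current path: a loop.
                return path[path.index(parent):] + [parent]
            if parent not in done:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in list(subclass_of):
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# The direct loop described above: each class is a subclass of the other.
looped = {"Q7725634": ["Q234460"], "Q234460": ["Q7725634"]}
# Hypothetical repaired graph with one of the two statements removed.
repaired = {"Q7725634": ["Q234460"], "Q234460": []}

print(find_cycle(looped))
print(find_cycle(repaired))
```

The recursion is fine for small excerpts of the hierarchy; a very deep class tree would call for an iterative version.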

Should a novel in several volumes be considered a book or a book series?

I'm interested in your opinion on this discussion with @Andreasmperu:, where I argue that 1Q84 (Q208971) should have the claim instance of (P31): book series (Q277759) and not instance of (P31): book (Q571) given it was published in several volumes: what is the convention here? Any previous cases of this kind to rely on? -- Maxlath (talk) 11:08, 30 June 2017 (UTC)

We had a discussion here on the same subject: Topic:Tlnf8myx0twxvzhh (in French). I have no strong opinion on the matter; I just wish for us to be a bit more consistent (right now, most items on books are more or less a useless and unusable mess). Cdlt, VIGNERON (talk) 11:28, 30 June 2017 (UTC)
I think a set of volumes = a book (Q571), and so isn't a book series; each item is a volume (Q1238720), and the collection as a whole is a version, edition or translation (Q3331189) of a book (Q571). I think the difference between a volume and an item in a book series is that the latter can, at least to some extent, stand on its own (thematically, story-wise, etc.). Sam Wilson 10:42, 1 July 2017 (UTC)

P50 of edition (when there is already the same P50 on works)

Hi,

As pointed out by Snipre above, duplicating P50 on both the work item and its edition items is quite unnecessary and generates needless work for Wikidata users and bots. I've looked at Wikidata queries and some bibliographical resources, and I see no case where a work and its edition can have different authors (if we exclude the people augmenting the editions, but they're not really authors and we have different and more specific properties for them, like editor (P98), author of foreword (P2679) and author of afterword (P2680)).

Here some reasons:

  • it's very easy to retrieve the author (P50) from the work item.
  • duplication is per se not a good thing; it doubles, at least, the maintenance and the constraint checks. I say double, but for works that have tens, hundreds or thousands of editions, which is not that unusual, the job is 10, 100, 1000 times more complex, and it's even worse if the work has more than one author, again something not unusual. In short: less is more
  • it's more consistent (I queried for items where the author is different on the work and edition items; almost all of the results are human mistakes and inconsistencies like Brothers Grimm (Q2793) on one item and Jacob Grimm (Q6701) and Wilhelm Grimm (Q6714) on the other item...)
  • the label of author (P50) in most languages refers to the text or work, not the edition.
  • etc.


So I suggest removing author (P50) from Wikidata:WikiProject Books#Edition item properties and from the Template:Book properties Édition line (and other places in the documentation). What do you think? Did I miss something?

Ping to the top 10 users of this property: @Jura1, Sergey kudryavtsev, Nono314, Daniel Mietchen, Hsarrazin, ShinePhantom: @555, CEM-air, DavidMar86hdf, FischX:

Cdlt, VIGNERON (talk) 14:44, 4 July 2017 (UTC)

Care must be taken, in case the "author" is the translator or editor, put into the wrong location. But as long as the "author" on the edition is confirmed to be the same as for the work, there should be no issues I can think of. --EncycloPetey (talk) 15:44, 4 July 2017 (UTC)
The table above is just a proposition. Feel free to change it if necessary. Snipre (talk) 15:46, 4 July 2017 (UTC)
Fundamentals of Physics (Q5508989) is currently in its 10th edition, and the latest edition had three authors (according to the Wikipedia article about the work, and internet advertisements). The first edition, Fundamentals of Physics (Revised Printing) (Q29161386) only had two authors (I possess this edition). So the edition is a safer place to put the authors than the work. Jc3s5h (talk) 23:48, 4 July 2017 (UTC)
There are always going to be exceptions that do not fit the general rule. A book that is an anthology of poems or short stories will not have an author for the volume, but an editor. It will be the component poems / stories included in the volume that have authors, not the book itself. And in the first edition of the anthology, those component poems / stories will be editions of the works they represent, which will be a mess to try to set up, especially for an anthology that runs to multiple editions. --EncycloPetey (talk) 02:51, 5 July 2017 (UTC)

It will be a very bad thing to remove author (P50) from an edition, as editions are what we should use for referencing on Wikipedia, other Wikimedia wikis and outside these environments. As I see it, it will create unnecessary complexity if the author information should be fetched from the work, - and as stated above the work may change author depending on the edition. It is not uncommon for textbooks to change author. In cases where we want to reference a textbook (or other factual non-fiction book) and it is not available on Wikidata, we would enter the edition (and not necessarily the work). In this case we would lack the author information, if we choose to not accept author information on the edition page. I see no complication in duplicating the author information on the work and edition items. — Finn Årup Nielsen (fnielsen) (talk) 14:43, 13 July 2017 (UTC)

@Fnielsen: After the merge of P407 and P364, the original language will be extracted from the work item and no longer from the edition item, so this complexity is already something we will need to take into account. And I am not so sure about your comment on the use of editions in WP articles: most of the time WP articles deal with the work and perhaps with the first edition. Few articles focus on other editions. Finally, creating edition items without the corresponding work item is a bad practice which has to be prohibited. We have to be coherent in our data structure because we are working with a database, not just applying the system of WP in WD: if we have different data models we won't be able to ensure correct data extraction, and this is the priority for WD as a database. Snipre (talk) 15:26, 13 July 2017 (UTC)
@Snipre: I see no reason not to include language of work or name (P407) and original language of film or TV show (P364) in the edition. I agree with you that Wikipedia articles are almost always about works. However, when referencing we should use editions. I know that English Wikipedia is reluctant to use Wikidata (items) for referencing, but there are plenty of other places we can use Wikidata items as references (I use it with Scholia for instance, see https://github.com/fnielsen/scholia/wiki). I cannot see why it would be a problem to create an edition item without a work item. In many cases where we do not have a Wikipedia article or multiple editions, there might be little reason to create a work item. — Finn Årup Nielsen (fnielsen) (talk) 15:47, 13 July 2017 (UTC)
@Fnielsen, Snipre: I agree with Fnielsen that citations should be made to editions, not works. One obvious reason for this is that a statement that appears on page 100 in the first edition might appear on page 120 in the second edition, so if the citation were made to the work, many readers would be unable to find the statement, and might form the opinion that the editor who provided the citation is a liar or a slob. Jc3s5h (talk) 15:57, 13 July 2017 (UTC)
It may also be that "facts" change between editions. I recall the famous Irving v Penguin Books and Lipstadt (Q4382268), where Richard J. Evans (Q607427) and his assistant bought several (all?) copies of The Destruction of Dresden (Q3208262), showing different numbers of people killed in the different editions. I believe this was an important point in the expensive lawsuit. — Finn Årup Nielsen (fnielsen) (talk) 16:47, 13 July 2017 (UTC)
@Fnielsen, Jc3s5h: Sorry I missed the "for referencing" in the first sentence of Fnielsen. So yes, I agree about the use of edition for referencing purpose in WP, and we agree that most WP articles about books are related to work and not edition. So this is clear now.
But I maintain my comment about the necessity of creating a work item for each edition item created: this is the only way to link all editions together and to run queries on works. How would you extract the list of books written by an author if you only have edition items? Using editions in the query, you will get all possible editions of the same work, and all translations, and you will have to find a way to filter that list to get only the relevant titles. Having the work items allows you to retrieve only the works and then to have the list of titles published by the author in their languages.
You can't think only of your referencing purpose; you have to think in terms of data structure.
And Fnielsen, you didn't understand my comment about languages: currently two properties were used for languages, so it was possible to extract the original language from the edition item, in the future this won't be possible anymore as we will have only one language property. The original language will be defined as the language of the work item so if you want to extract the original language after the properties merge, you will need to extract the work item from the edition statements and then to extract the language statement from the work item. Snipre (talk) 17:03, 13 July 2017 (UTC)
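A minimal sketch of the two lookups discussed above, using plain Python dictionaries in place of real Wikidata items. The item ids (Q_WORK, Q_ED_EN and so on) and the sample data are hypothetical; only the property ids author (P50), language of work or name (P407) and edition or translation of (P629) are real.

```python
# Toy item store: item id -> {property id: [values]}.
# All item ids and values here are invented for illustration.
ITEMS = {
    "Q_WORK": {"P50": ["Q_AUTHOR"], "P407": ["French"]},
    "Q_ED_FR": {"P629": ["Q_WORK"], "P50": ["Q_AUTHOR"], "P407": ["French"]},
    "Q_ED_EN": {"P629": ["Q_WORK"], "P50": ["Q_AUTHOR"], "P407": ["English"]},
    # An edition created without a work item.
    "Q_ORPHAN_ED": {"P50": ["Q_AUTHOR"], "P407": ["German"]},
}


def original_languages(edition_id, items=ITEMS):
    """Follow P629 from an edition to its work(s) and return the work's P407 values."""
    langs = []
    for work_id in items[edition_id].get("P629", []):
        langs.extend(items[work_id].get("P407", []))
    return langs


def works_by_author(author_id, items=ITEMS):
    """Return the distinct works of an author, reached through the author's editions."""
    works = set()
    for props in items.values():
        if author_id in props.get("P50", []):
            works.update(props.get("P629", []))
    return sorted(works)
```

In this toy data the two linked editions collapse to a single work, while the edition without a work item has no original language to resolve and does not appear in the list of works at all.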

It's fairly common for revised editions of textbooks and reference works to have different co-authors. Example: Survey of Historic Costume (Q28946447), Survey of Historic Costume (Q28946484) and Survey of Historic Costume (Q29002765). - PKM (talk) 21:39, 13 July 2017 (UTC)

Basic statistics

To get a general idea of where we stand, it might be worth trying to determine how many works the various items refer to.

Then determine a few basic aspects: who, when, genre, title, .. Several properties could be consolidated into one. To make things easier, try to avoid bibles and works first published > 100 years ago.

Once done, one could attempt to determine the number of specific forms (editions) various elements refer to.

This is somewhat different from the approach suggested above based on films, but given the current structure of items, it should work better.
--- Jura 07:10, 5 July 2017 (UTC)

@Jura1: Can you explain more precisely what you mean and expect? Do you mean something like a list of types of works? If so, this list pretty much covers it and, regardless of genre and type, all books have the same core properties already outlined by this project (needless to say, with a heck of a lot of improvements still to make). What more do you want? (Personally, I think improving the presentation and the pedagogy is the more pressing matter, but I can help on other matters if someone wants.) Cdlt, VIGNERON (talk) 08:26, 5 July 2017 (UTC)
The purpose is to get an idea about the number of works being described similarly to the one I suggested before.
As for pedagogy, I think it might be fairly simple: most sitelinks are about works. What people need for references is generally at the other end of the field. Looking at some of the edit history of items, it appears that VIAFbot might have combined the two.
--- Jura 08:34, 5 July 2017 (UTC)
Sorry, I don't really understand why you want to know the number (does it matter if we have 300 000 - probably a good guesstimate of the number of « books » right now on Wikidata - or 130 000 000 books - Google estimation of number of unique books in the World) and more importantly, I don't get what you want and how (so I can't really help you... I'm poking in the dark here).
Not exactly. In absolute terms, most sitelinks are *not* to works; most sitelinks *to Wikipedia* are probably about works (de facto not true right now: works rank only second for frwiki [1] and third for enwiki). Most sitelinks to Wikisource are about editions (and the Wikisources have a far greater total number of books than the Wikipedias). And that is only if we talk about FRBR levels (which, despite the explanations and efforts of this project, are widely unused): most sitelinks to frws have instance of (P31) = poem (Q5185279) (10043 items), then article (Q191067) (9521 items) and after that encyclopedia article (Q17329259) (3823 items); for enws the most used P31 is biographical article (Q19389637), in 35086 items.
Cdlt, VIGNERON (talk) 11:02, 5 July 2017 (UTC)
Good point about Wikisources, these sitelinks are different. I had in mind Wikipedia. Wikisource things should mostly be self-contained, primarily relevant to that project.
Back to the initial question: if we have no idea how many there are (first step), you can't really assess their completeness (second step) or what work would need to be done (third step). Maybe the conclusion is that we only have items about works that include identifiers for one or several editions.
--- Jura 13:26, 5 July 2017 (UTC)
Wikisource is deeply linked with other Wikimedia projects (before Wikidata, it was the most central project of the Wikimedia galaxy), I'm pretty sure that putting them away is a bad idea.
I have many ideas (and, even better, some bibliographic knowledge) but I just don't understand the question. Could you please explain it?
About completeness, I'm not sure that completion is something we should aim for (Wikidata has 28 million items; do we really want to create at least 130 million items? And I'm not counting the millions of other items for the writers of these books). About the work to be done, in my mind it's pretty clear: make the data we already have more consistent and richer (a lot of items have only very basic information), and I aim to improve the documentation to make this goal easier to reach.
Cdlt, VIGNERON (talk) 15:13, 5 July 2017 (UTC)
Completeness in the sense of whether or not an item has statements with specific properties. Which properties are useful varies depending on the purpose. In general, full details are easily available elsewhere.
--- Jura 15:47, 5 July 2017 (UTC)

What is a manuscript ?

I have some trouble understanding what a manuscript is. From what I know, we can have 2 types of manuscript:

  1. A manuscript is the working document of the author before any publication. This is a modern definition and, most of the time, this kind of manuscript can be linked to a work and a first edition (the manuscript was then published and several exemplars were released). This is a special case of exemplar.
  2. A manuscript is a unique version: it is the only edition and the only exemplar of a work. This definition relates more to old documents from before printing. In that case we have to decide whether we want a corresponding work item for each manuscript item.

So can we have a clear definition of exemplar and of manuscript, in order to be able to differentiate these two classes of documents? Currently manuscript is defined as "document written by hand". Is that correct? Can we merge exemplar and manuscript into one single class of document? As it is defined now, manuscript is only a special format of the exemplar class and not a different class of document. Snipre (talk) 14:24, 5 July 2017 (UTC)

@Snipre: a manuscript is indeed anything written by hand; this lato sensu meaning is the one of the item manuscript (Q87167) and it seems correct to me. The two examples you give are specific stricto sensu meanings for which items should probably be created (as subclasses of manuscript (Q87167), with your second example having illuminated manuscript (Q48498) as a subclass). Besides, I understand your definitions but I'm not sure they are entirely correct (in particular, « this is the only edition and the only exemplar of a work » is usually true but not always: see the manuscripts of the Bible, for instance, where there are multiple exemplars of the same edition).
An exemplar is very different, as it's not necessarily written by hand. Take any book in the real physical world: this is an exemplar (in cataloguing and FRBR, each library has an item for each of its exemplars, allowing it to say « our XX exemplar of the Misérables is borrowed by M. John Doe until 'Christmas in July' »). The confusion comes from the fact that most notable exemplars are manuscripts (especially in Wikidata), but not all of them are (all the incunabula, for instance, are noteworthy exemplars but, by definition, printed and not handwritten).
Cdlt, VIGNERON (talk) 14:57, 5 July 2017 (UTC)
@VIGNERON: Here is a small problem: what about a draft of a book typed by the author on a typewriter before publishing? According to my first definition, this kind of document is a manuscript; according to the definition "anything written by hand", this draft is not a manuscript. Perhaps my problem comes from my French background, where manuscript is used both for the author's draft before publication (written by hand or not) and for the old, unique, handwritten documents from before printing.
If we want to be consistent, this implies that every manuscript is an exemplar and can be treated like this class of document, with an edition and a work item, so manuscript should be considered a special case of exemplar without a dedicated list of properties. Correct? Snipre (talk) 15:12, 5 July 2017 (UTC)
@Snipre: Are you referring to typescript (Q28924364)? (Indeed, this is a document that is more or less wrongly, but commonly, called a manuscript, and the item should probably be corrected.) The linguistic confusion is not specific to French, as it exists in other languages too (the Webster entries in English are quite confusing as well). I'm curious; I will look into it to see how far the lexicalisation of the non-handwritten manuscript has gone.
Yes, every manuscript is an exemplar (and, as far as I can see, no matter the meaning of manuscript). I think the end of your sentence is correct too (but I don't see manuscript as a special case or a subclass of exemplar, in the same way that I don't see a person as a subclass of human; maybe the answer is in the idea I suggested last year: #P31, FRBR and VIGNERON maybe crazy idea).
Cdlt, VIGNERON (talk) 15:30, 5 July 2017 (UTC)
@VIGNERON: OK, so the splitting of the properties into 2 tables, one for exemplar and one for manuscript, on the main page of the project is wrong: we should merge both tables into one. The presentation of the property lists is very confusing because it suggests that manuscript is another class alongside work, edition and exemplar. So in my table above I have to delete the manuscript column and only consider work, edition and exemplar.
The main consequence I see is that manuscript can't be used alone as instance of: a poem can be a manuscript, a libretto (Q131084) can be a manuscript. If manuscript is defined as "something written by hand", this means we still need a second instance of to define the something in "something written by hand". In that case we should create a new property "format" where the value can be manuscript. Snipre (talk) 15:38, 5 July 2017 (UTC)
@Snipre: yes, or more precisely: either you choose the 3 FRBR levels or you choose something completely different, but mixing up two different things is strange.
As for our "what is a manuscript" question, I've got a headache about this and I'm unsure. We need at least 3 items with the same label (in all languages, apparently and AFAIK). The first is manuscript (Q87167) (I don't see anything to correct, maybe just UNESCO Thesaurus ID (P3916)) and the two others are your two examples.
The second would be: Len manuscript, Den « original copy of a work submitted for publication », Lfr manuscrit, Dfr « version originelle d’une œuvre soumise à publication » (please complete). I'm not sure what property to use on this one: subclass of (P279) = manuscript (Q87167)?? (Apparently, considering typewriting as similar to handwriting and opposed to printing is something new; still unclear, but the lexicalisation is in progress or almost done.) Or a more (maybe too) general subclass of (P279) = document (Q49848)? Probably Bibliothèque nationale de France ID (P268) = 13618158b (ping Hsarrazin, what do you think?). More clearly, manuscripts by Anne Frank (Q29571913) should use this item as the value for instance of (P31) instead of Q87167, and typescript (Q28924364) should have subclass of (P279) this item.
The third would be: Len manuscript, Den « handwritten document, usually created before the invention of the printing press », Aen medieval manuscript (maybe in Len?), Lfr manuscrit, Dfr « document écrit à la main, généralement avant l'invention de l'imprimerie », Afr manuscrit médiéval (idem, maybe in Lfr). Clearly, it should have subclass of (P279) = manuscript (Q87167), and illuminated manuscript (Q48498) should have subclass of (P279) this item. Apparently, most of the 4000+ items with instance of (P31) = manuscript (Q87167) should use this new item instead.
(@Spinster: I think I remember that you were interested in (and worked on?) medieval manuscripts; maybe there is some knowledge you can share.)
Cdlt, VIGNERON (talk) 15:54, 6 July 2017 (UTC)
@VIGNERON: if I understand your point correctly, I totally agree with you :D - 3 items for manuscript (and 2 of them subclasses of the 1st one)... seems very clear -- after reading 4 or 5 times, because I'm tired -- --Hsarrazin (talk) 19:48, 6 July 2017 (UTC)
What I see when I spend some time on this subject is the necessity of clarifying the classifications (yes, classifications in the plural). People try to use one single term to characterize a document, but in reality each document has to be assessed under different criteria:
FRBR level | Type of writing | Media | Form | Genre
Work | na | na | roman, poem, novel | Epic, romance, SF, horror
Edition | (manuscript)/(typescript)/printed | book, audio-book, codex, wax tablet, scroll | |
Exemplar | manuscript/typescript/printed | | |
Each item about a literary work has to have a value for these 5 different classifications:
So a draft of an SF novel handwritten by its author and lying on a publisher's desk should be classified as: exemplar, manuscript (in the sense of written by hand), manuscript (in the sense of a stack of paper sheets), roman, SF.
A Bible written by hand by a monk in the 12th century and used in religious ceremonies should be classified as: exemplar, manuscript (in the sense of written by hand), book (in the sense of a stack of bound paper sheets between 2 rigid covers), religious text, allegory/historical story.
Snipre (talk) 20:33, 6 July 2017 (UTC)
@VIGNERON: We need two items for manuscript: one for the type of writing and one for the draft document of an author before publication. But for old documents written by hand by monks, I don't think we need a specific item. I don't see the difference between a book written by hand and a manuscript according to your last definition. If we agree to differentiate the type of writing from the form, then we can express everything with 2 terms and avoid using ambiguous terms. Snipre (talk) 20:46, 6 July 2017 (UTC)
The best approach is to take examples and start the classification: Federal Charter of 1291 (Q662637), Codex Vaticanus Graecus 1209 (Q209285) and Rhind Mathematical Papyrus (Q213540). All 3 of these documents can be considered manuscripts, but their forms are so different that using the term manuscript doesn't help to classify them correctly.
Document | FRBR level | Type of writing | Media | Form | Genre
Federal Charter of 1291 (Q662637) | Exemplar | manuscript (written by hand) | scroll | ? | legal document
Codex Vaticanus Graecus 1209 (Q209285) | Exemplar | manuscript (written by hand) | codex | ? | religious text
Rhind Mathematical Papyrus (Q213540) | Exemplar | manuscript (written by hand) | scroll | ? | scientific document
So instead of using manuscript, it is better to use the more precise term codex, which covers most of the handwritten documents of the Middle Ages. Snipre (talk) 09:59, 7 July 2017 (UTC)
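As a rough illustration of the multi-criteria classification proposed above, the three example documents can be written as records with one field per criterion. This is only a sketch: the class and field names are hypothetical, and the values are copied from the table, with None standing in for the cells marked "?".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    """One value per criterion; the field names are illustrative only."""
    frbr_level: str
    type_of_writing: Optional[str]
    media: Optional[str]
    form: Optional[str]   # None where the table above has "?"
    genre: Optional[str]


# Values taken from the table in the discussion above.
DOCUMENTS = {
    "Q662637": Classification("exemplar", "written by hand", "scroll", None, "legal document"),
    "Q209285": Classification("exemplar", "written by hand", "codex", None, "religious text"),
    "Q213540": Classification("exemplar", "written by hand", "scroll", None, "scientific document"),
}


def by_media(media):
    """Return the ids of documents whose media matches, in sorted order."""
    return sorted(qid for qid, c in DOCUMENTS.items() if c.media == media)
```

All three records share the same type of writing, so filtering on that criterion alone cannot tell them apart, while `by_media("codex")` and `by_media("scroll")` do separate them.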

I have argued before, that we need to differentiate between autograph (Q9026959) and file copy (Q332148) in linking work and (physical) manuscript. --HHill (talk) 12:20, 7 July 2017 (UTC)

@HHill: And what do you call a book draft typed on a typewriter by the author? file copy (Q332148) is not correct, as the draft is the unique document.
We have to choose one system which covers most of the cases: if we choose autograph (Q9026959) for handwritten documents provided by their author, we need a parallel term for documents typed on a typewriter by their author.
Or we can choose the system proposed above, where we separate the parameter "type of writing" from the document and use 2 properties to define the document, which allows all possible combinations. Snipre (talk) 13:52, 7 July 2017 (UTC)
The current English description of autograph (Q9026959) is incorrect, and incorrect for this use. It says "document transcribed entirely in the handwriting of its author". [Emphasis added.] But transcribe means to copy from another source, meaning that a manuscript can only be a handwritten copy, not a work that flows from the author's mind to the paper. Jc3s5h (talk) 14:17, 7 July 2017 (UTC)
Well, as a stage in the book production/publication process it's obviously a draft. And as a physical object it is a typescript (which, at least in the first half of the 20th century, might have ended up in the manuscript collection of a library). Today, many if not most of the stages in the writing and publication process can be performed on a computer, and still people refer to a manuscript stage, rather metaphorically. And if you look closer at the writing process, there is often not just one manuscript per work but many different draft stages (which physically might be represented by manuscripts, typescripts or computer files). In the artisanal period of book production the same may be true for the printed books: not all present the same identical text. And if you go back to a time when all stages were in manuscript, there is even more room for variation (and not just because of scribal errors). There is the author's own exemplar (or even several of them), which he used or uses for drafting and/or writing; there might be a clean copy he made, or whose production he supervised, to be presented to a patron; and there are copies made by others during different stages of the writing process. So the equation one manuscript = one work of literature = one first edition is way too simple to be useful. And for the description of physical manuscripts we should stick to things we can observe: was this particular manuscript handwritten by the author of this text, or has someone else copied it by hand (and believe me, this can get messy enough)? Trying to squeeze said handwritten object into a publication model which very likely didn't exist at the time may be misleading. --HHill (talk) 10:03, 8 July 2017 (UTC)
@HHill: There are few electronic copies which are used as reference or which are sold on the collectors' market. We are not speaking about the process of book creation but are mainly focused on what is left after the creation of the books. Most electronic copies of intermediate manuscripts are destroyed once the book is printed. So we can leave these cases aside for the moment. Snipre (talk) 07:30, 20 July 2017 (UTC)
@Snipre: electronic copies which are used as reference are you talking about preprint (Q580922)? --HHill (talk) 13:20, 24 July 2017 (UTC)
@HHill: I am referring to all electronic versions prior to the published version. Snipre (talk) 18:08, 24 July 2017 (UTC)
I've observed electronic drafts of standards, such as ISO 8601 being used as references because they are free while the official versions are quite expensive. Jc3s5h (talk) 19:00, 24 July 2017 (UTC)
@Jc3s5h: Drafts of norms are published, often with a particular numbering or an identification parameter. This is not the case for the working versions of a book before publication. I think you don't understand the topic of the discussion. Your draft of a norm corresponds to the manuscript of a book, in electronic form, not to the working documents where each author puts their comments or modifications as the writing progresses. Snipre (talk) 20:00, 24 July 2017 (UTC)
I like the two definitions of "manuscript" at the Oxford English Dictionaries. I think we need two items for manuscript. - PKM (talk) 20:47, 14 July 2017 (UTC)
I agree we need 2 items: one for all documents written by hand from the old papyrus to the modern documents (the support and the age don't play any role) and one for the document, hand written or not, of an author before the book is accepted by any editors. Snipre (talk) 07:30, 20 July 2017 (UTC)

Editions and federated Commons?

Wondering if anyone has played with the test federated components of Commons, and the interplay with an edition of a work within test data. From my early and naive poking around, there is nothing available, and it would seem that we have a very direct requirement to see that it works well, as it means that we can start with WD, then import the file. The big issue is which one comes first, and the linking, and the data pull availability, etc., as we always seem to be reactively building relationships between things, and therefore re-entering information in multiple places, rather than being able to create based on what is in place at one site and to inhale that data elsewhere.  — billinghurst sDrewth 01:24, 20 July 2017 (UTC)

Change made to obverse page was to remove language of work or name (P407) and original language of film or TV show (P364), and put in place alternate P2439 (P2439). I have reverted that change at this time as it appears to me that the discussion is to merge, though I do not see the consensus to migrate to the new property.  — billinghurst sDrewth 10:55, 23 July 2017 (UTC)

The edit was again applied, with a link to the general conversation. I do not see that P2439 has been agreed as the replacement. This should be discussed rather than enforced.  — billinghurst sDrewth 04:27, 24 July 2017 (UTC)

to remove language of work or name (P407) and original language of film or TV show (P364), and put in place alternate P2439 (P2439)

Wrong: it says "P2439 or P407"

The edit was again applied, with a link to the general conversation.

Far better than only P407 without any explanation. d1g (talk) 04:43, 24 July 2017 (UTC)
There is simply no consensus for P2439, please do not add it. If you think that it should be added then the appropriate place for that discussion is here in the project talk page, not on some discussion to delete. As I said elsewhere, if you wish to deprecate P364, and replace with P407, then make that edit. More than that is not appropriate.  — billinghurst sDrewth 06:07, 24 July 2017 (UTC)

There is simply no consensus for P2439, please do not add it

You should read the text below very carefully:

Property for language (P407 or P2439) is under discussion

@billinghurst: please point us to the words that indicate "consensus for P2439", thanks! 06:16, 24 July 2017 (UTC)  – The preceding unsigned comment was added by d1gggg (talk • contribs).
d1gggg: P2439 isn't for books, so don't insert it here. It's just disruptive.
--- Jura 06:27, 24 July 2017 (UTC)
I have replaced P364 with P407 to clarify with d1gggg that I have no issue with that action as a community consensus.  — billinghurst sDrewth 13:52, 24 July 2017 (UTC)

P2439

@Jura1: you seem to have forgotten your own message...

  1. I can read "language associated with this item" in 6 languages
  2. Comment I changed the domain to "any". --- Jura 05:16, 12 December 2014 (UTC)

d1g (talk) 06:49, 24 July 2017 (UTC)

@Jura1: anyway, most comments in 2014 and now argue against multiple specific properties for languages (now language of work or name (P407) and P2439 (P2439)) in favor of a single one.
Maybe the edit by User:Srittau was a step in the right direction?
We don't need exceptions for the sake of exceptions, don't you think? d1g (talk) 07:08, 24 July 2017 (UTC)

book series conventions

Is there a place where book series conventions have been discussed? I would especially be interested in knowing if there is a consensus around where properties like series ordinal (P1545), follows (P155), followed by (P156) are expected to be set on a work: as qualifiers on the part of the series (P179) claim or as separate claims?

If it hasn't been discussed yet, please leave your opinion on the matter below. -- Maxlath (talk) 16:13, 17 August 2017 (UTC)
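To make the two options in the question concrete, here is a small sketch of both shapes, plus a reader that accepts either. The dictionaries are a simplified stand-in, not the Wikibase JSON format, and the item ids (Q_SERIES, Q_BOOK_1, Q_BOOK_3) are hypothetical; the property ids are the ones named above. Series ordinal (P1545) is shown only in the qualifier shape; that omission is a choice made for this sketch.

```python
# Shape A: ordering data as qualifiers on the "part of the series" (P179) claim.
WORK_QUALIFIERS = {
    "P179": [
        {
            "value": "Q_SERIES",
            "qualifiers": {"P1545": ["2"], "P155": ["Q_BOOK_1"], "P156": ["Q_BOOK_3"]},
        }
    ]
}

# Shape B: the same links as separate top-level claims on the work.
WORK_SEPARATE = {
    "P179": [{"value": "Q_SERIES", "qualifiers": {}}],
    "P155": [{"value": "Q_BOOK_1", "qualifiers": {}}],
    "P156": [{"value": "Q_BOOK_3", "qualifiers": {}}],
}


def predecessor_in_series(work, series_id):
    """Return the 'follows' (P155) value for a series.

    A qualifier on the matching P179 claim wins; otherwise fall back to a
    top-level P155 claim, and to None if neither is present.
    """
    for claim in work.get("P179", []):
        if claim["value"] == series_id:
            scoped = claim["qualifiers"].get("P155")
            if scoped:
                return scoped[0]
    top_level = work.get("P155", [])
    return top_level[0]["value"] if top_level else None
```

One difference the sketch makes visible: in the qualifier shape the ordering is tied to one particular series, whereas a top-level follows (P155) claim does not say which series it belongs to if a work is part of several.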

Integrating information about books in Wiktionary

Hello,

In the process of supporting Wiktionary with Wikidata, we will soon enable access to Wikidata data from English Wiktionary. We suggested experimenting first with their "citation" namespace, where they put quotes from books or other media, including information about the source. For now they are doing it manually, and Wikidata should be helpful for a semi-automatic display of the basic information about a book. In the long run, this should also help improve the quality of data in this field.

The project will start in September, and I expect the Wiktionary community to have a lot of questions regarding how we organize books in Wikidata :)

When I created an example to see how it could be done, I ran into some questions myself.

  • where should one find information about the publisher or the ISBN? Is it in the item about the book, or should an item about the edition exist?
  • for Wiktionary users, is it better to display data from an edition of the book or from the work itself? Or both?
  • does anyone know a bit of Lua and could help me build a small template for the example?
  • in general, what advice or best practices would you give to Wiktionary users who would like to create items about books if they don't exist yet?

Thanks a lot for your help, Lea Lacroix (WMDE) (talk) 10:41, 23 August 2017 (UTC)

I will inform itwikt users.--Alexmar983 (talk) 10:49, 23 August 2017 (UTC)
Hi Lea Lacroix (WMDE),
Great news!
Here are some answers, according to our data models:
  • information about the publisher or ISBN should always be on the edition item.
  • from the edition (a work is intangible, it doesn't even have a specific language, so you can't take a citation from it).
  • joker, sorry (@Maxlath, Zolo: ?)
  • always create a specific item for the specific edition; it's a bit more complicated but it's far better in the long run.
Cdlt, VIGNERON (talk) 10:59, 23 August 2017 (UTC)
I have added an "item" and a "page" parameter to {{Quote}}. They can be used to provide the source.
example
{{quote|And that is the reason why a young buck with an intelligent looking calf's head before him, is somehow one of the saddest sights you can see. The head looks a sort of reproachfully at him, with an “'''Et tu Brute!'''”|item=Q37805498|page=326}} gives
And that is the reason why a young buck with an intelligent looking calf's head before him, is somehow one of the saddest sights you can see. The head looks a sort of reproachfully at him, with an “Et tu Brute!”
Herman Melville, Moby-Dick, New York City: Harper, , p. 326, doi: 10.5962/BHL.TITLE.62077
However, it will need to be substantially refined, and of course it would have to be rebuilt on enwiktionary, which does not have the same module. Actually, I wanted to test it on Wiktionary, until I realized that we can't do so until the Wikibase extension is deployed there (and they will probably need to develop their own modules anyway). Zolo (talk) 23 August 2017 (UTC)
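The example output above has an empty slot after "Harper", presumably because the edition item has no publication date yet. One possible refinement is to skip missing fields when assembling the citation line. The sketch below shows the idea in Python; it is not the code of the module behind {{Quote}}, and the record layout and field names are hypothetical.

```python
def format_citation(edition, page=None):
    """Join the available fields of an edition record, skipping missing ones."""
    # "Place: Publisher" collapses to whichever part is present.
    place_publisher = ": ".join(
        part for part in (edition.get("place"), edition.get("publisher")) if part
    )
    fields = [
        edition.get("author"),
        edition.get("title"),
        place_publisher,
        edition.get("date"),
        f"p. {page}" if page is not None else None,
        f"doi: {edition['doi']}" if edition.get("doi") else None,
    ]
    # Empty strings and None are dropped, so no ", ," gaps are produced.
    return ", ".join(field for field in fields if field)


# Hypothetical record mirroring the example above, with the date missing.
EDITION = {
    "author": "Herman Melville",
    "title": "Moby-Dick",
    "place": "New York City",
    "publisher": "Harper",
    "date": None,
    "doi": "10.5962/BHL.TITLE.62077",
}

print(format_citation(EDITION, page=326))
# Herman Melville, Moby-Dick, New York City: Harper, p. 326, doi: 10.5962/BHL.TITLE.62077
```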


  • Feedback:
    • It would be very helpful if the citation would be in a standard bibliographic format that is currently used in the world of library science (i.e., APA, Chicago, Harvard, MLA, Turabian). So as not to "reinvent the wheel."
    • Also, it would be very helpful if the four RefToolbar 2.0 citation styles (web, news, book, journal) were incorporated since the data is separated by pipes and is therefore a bit more machine-readable. Hand creating citations like this is for the birds.... :-)
      • See CPPC publications list here
      • Almost all of these books are in Wikidata and are in the standard cite book template, so feel free to use them as examples.
    • I don't use Wiktionary, but it seems like the more aligned with the rest of the sister Wikimedia projects the better.
    • Hope this is helpful. -- Erika aka BrillLyle (talk) 22:03, 23 August 2017 (UTC)
    • One of the main problems: the item will change, and not always for the better. (Publishers are renamed and bought by other companies. Users love to merge works with editions, change titles and language etc.) Traditionally, if I add a source I can be sure the bibliographic data is safe. One way to solve the problem could be adding an oldid:
      • item=Q37805498 -> Q37805498&oldid=545456543
    • Now with one click I could check what source was meant and how it has changed.
    • Additionally for quality control it would be helpful having a parameter like "source checked" or "citation checked". --Kolja21 (talk) 10:50, 25 August 2017 (UTC)
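The oldid suggestion above could be sketched as follows, assuming the usual MediaWiki `index.php` URL form with `title` and `oldid` parameters. The function name is hypothetical, and the revision number is the illustrative one from the comment above, not a checked revision.

```python
from urllib.parse import urlencode


def pinned_item_url(item_id, oldid=None):
    """Build a Wikidata URL for an item, optionally pinned to one revision."""
    params = {"title": item_id}
    if oldid is not None:
        params["oldid"] = str(oldid)
    return "https://www.wikidata.org/w/index.php?" + urlencode(params)


print(pinned_item_url("Q37805498", 545456543))
# https://www.wikidata.org/w/index.php?title=Q37805498&oldid=545456543
```

Storing the revision id next to the item id keeps the citation stable while still letting a reader compare the pinned state with the current one.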
  • Some comments
    • The citation format will not follow any standard used in library science. Wiktionary citations are organized first by date, then give the citation, followed by the quoted text on a separate line; if the quotation is not in English, a translation into English is supplied.
    • Because quotations can be in another language (any language), templates supporting non-Latin scripts must be employed.
    • Citations are not always from "books" in the usual sense. Quotations often come from plays, poems, magazines, newspapers, religious texts, famous manuscripts, etc.
    • Citations on Wiktionary include information not usually placed into standard citation format, such as including both chapter and page number for a work divided into chapters, especially when a copy exists on Wikisource, since it is desirable to have a link to the copy of the work from which the citation was taken.
    • The link data and display data do not always follow the same system, depending on how the particular work was set up on Wikisource. For example, if a book has chapters given in Roman numerals, but the chapters on Wikisource were set up as subpages using Arabic numerals, then a conversion must take place. For classical works, this can apply to the "books" into which a work is divided as well.
    • Likewise, links are included to author pages on Wikipedia for the author of the work, but the author of a work is not identified on the data item for the edition, but on the data item for the work. When a particular translation is referenced, then the translator should be indicated and linked as well.
    • As someone who has worked extensively on both Wiktionary and Wikisource, I can tell you that the data does not yet exist to begin using Wikidata for Wiktionary citations, and that adding it will be an enormous effort. --EncycloPetey (talk) 04:57, 26 August 2017 (UTC)
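The numeral conversion mentioned above (chapters displayed in Roman numerals, subpages named with Arabic numerals) can be sketched as follows. This is only an illustration: the helper names are hypothetical, the "Title/Chapter N" subpage scheme is an assumption that will not hold for every work on Wikisource, and the converter does no strict validation of the numeral.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'LXV' to an integer (no strict validation)."""
    total = 0
    previous = 0
    # Read right to left: a symbol smaller than the largest one seen so far
    # is subtractive (the I in IV), otherwise it is added.
    for char in reversed(numeral.upper()):
        value = ROMAN_VALUES[char]  # KeyError on a non-Roman character
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def wikisource_subpage(work_title: str, displayed_chapter: str) -> str:
    """Build a subpage title, assuming an Arabic-numeral 'Title/Chapter N' scheme."""
    return f"{work_title}/Chapter {roman_to_int(displayed_chapter)}"


print(wikisource_subpage("Moby-Dick", "LXV"))
```

The display value ("Chapter LXV") and the link target would still have to be stored or derived separately for each work, which is part of the effort described above.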

Books and edition

This is a separate thread about a detail of the discussion just above.

@BrillLyle: did you really mean Moby-Dick (Q174596)? This is the item about the work, which is supposed to be intangible (so no citation and no pages). Plus, it seems that this item doesn't really fit the standard of this project and confuses the book and its edition. Shouldn't it be a specific item about the edition? Like Moby-Dick (Q37805498), where you can indeed find this citation: s:en:Moby-Dick/Chapter 65 (this item is a stub that needs improvements too; I created it yesterday in order to split the work and edition).

@Lea Lacroix (WMDE): for information.

Cdlt, VIGNERON (talk) 16:24, 24 August 2017 (UTC)

Explaining the relationship between a book and a novella set in the same 'universe'

Hi all

I've added the item Party Discipline (Q39129615), which is a novella set in the same 'universe' as the book Walkaway (Q30068660). How can I explain the relationship between the two items?

Thanks

--John Cummings (talk) 07:57, 10 September 2017 (UTC)

You could create "Walkaway universe" <instance of> fictional universe (Q559618) and set both books there. This would be consistent with Dilbert universe (Q20826245) and Majesty universe (Q18005554).
You could use this post as a reference- PKM (talk) 20:46, 10 September 2017 (UTC)
Perfect, thanks PKM, this is what I was thinking, I just wasn't sure if two works were enough to have a separate item for the universe. --John Cummings (talk) 21:47, 15 September 2017 (UTC)

Trying to build a showcase item for an 18th century book

I'm trying to persuade academics and funders to use Wikidata as a platform for a semantic web of books, and am creating a small set of data as an exemplar. See An address to the people of England, Ireland, and Scotland, on the present important crisis of affairs (Q39638127) and its linked editions, with an example query for editions and their digital surrogates at User:MartinPoulter/queries/ee#Editions_of_works_by_Catharine_Macaulay.2C_and_their_online_text I'd be grateful for other users to take a look at it and see if there can be improvements. The book quotes extensively from Political Disquisitions (Q39643140), which is a multi-volume non-fiction work, and it would be great if we could use that Wikidata item as an exemplar for a representation for that type of work. Some issues that have come up:

  • I am using full work available at URL (P953) to link to digital surrogates. However, there are two importantly different types: transcriptions (such as in Wikisource, Project Gutenberg, or the Oxford Text Archive) and scans + OCR (such as in the Internet Archive or Google Books). We seem to have the same property for "full scan" and "full text". Can this distinction be made with qualifiers, or is there a case for two separate properties?
  • One of the editions has an ISBN, according to Worldcat, and on investigation it looks like the ISBN applies to a microfiche scan. I don't distinguish the printed book edition and the microfiche in the items I've created, and I can see arguments that they should never be mixed and that the microfiche copy should have a separate item, or alternatively that they are not different enough to create different items for, and the ISBN added to an 18th century edition adds useful information.
  • I've set these books' type as non-fiction work (Q20540385) rather than literary work (Q7725634) because the latter's English description indicates "aesthetic or recreative purposes", which would seem to exclude a political book. Some of the discussion in this Wikiproject seems to take literary work (Q7725634) as more inclusive of works of literature in general.

MartinPoulter (talk) 12:39, 14 September 2017 (UTC)

Nice work! I've added some statements, feel free to modify or remove them if you see fit. ~nmaia d 16:08, 14 September 2017 (UTC)
I've removed an ISBN, 0-665-20482-5, which is specific to the second edition. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 07:50, 15 September 2017 (UTC)
Great work! I use document file on Wikimedia Commons (P996) for full scan files in commons.--Mauricio V. Genta (talk) 18:25, 15 September 2017 (UTC)

Is catalog an instance or a genre?

Hi,

Everything is in the title, but here is a more detailed explanation.

I added the claim

⟨ Q40220792      ⟩ genre (P136)   ⟨ exhibition catalogue (Q780605)      ⟩

(a subclass of catalogue (Q2352616)) but it triggered a constraint violation (as catalogue (Q2352616) is not a genre, which is very true) so I changed it to

. Even if the second seems better, I'm still unsure, as neither of these claims really satisfies me... What would you do?

PS: here is a crude query to see the current uses of catalogue (Q2352616): 202 items with instance of (P31) and only 8 items with genre (P136)

SELECT ?prop (COUNT(?book) AS ?count)
WHERE
{
  ?book ?prop wd:Q2352616 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?prop
ORDER BY DESC (?count)

Cdlt, VIGNERON (talk) 10:49, 4 October 2017 (UTC)

I stumbled upon the same question, also for encyclopedia (Q5292): 569 statements with instance of (P31) vs. 283 with genre (P136). --Marsupium (talk) 17:39, 5 October 2017 (UTC)
I would say neither is a genre. There are catalogs and encyclopedias for all sorts of things in various genres: general history, sports, animals, etc. Jane023 (talk) 12:50, 6 October 2017 (UTC)
Thank you! Okay! It also causes trouble here. How best to model Catalog of the paintings on show at the Rijksmuseum in 1956 (Q16986324) then? :-) There probably should be some connection to catalogue (Q2352616). And catalogue (Q2352616) isn't a subclass of book (Q571). Use two instance of (P31) statements? --Marsupium (talk) 09:29, 7 October 2017 (UTC)
Why not make it instance of both catalogue (Q2352616) and book (Q571)? -- JakobVoss (talk) 17:35, 13 October 2017 (UTC)

How to properly manage Complete works editions for known authors ?

Hello all,

Many prolific authors' works have been compiled in so-called "Complete works" (in French "Œuvres complètes"), which are generally anything but complete... and we have a lot of those on wikisources...

Complete works by a publisher can be very different, and contain different works than CW by another one. So my question is : how do I work on a CW edition of an author..?

Example : s:fr:Œuvres_complètes_de_Voltaire (Voltaire's Complete works - Garnier, 1877). Obviously, this is a version, edition or translation (Q3331189), since it has a specific publisher and a specific date... but what would be the corresponding edition or translation of (P629) ?

How can I state what exactly this "Complete works" is... apart from the list of all the texts it comprises, which is very, very, very long...

Should I create a hypothetical "Complete works of (so-and-so)", which would not be a "work"... since it is the compilation of all the works ?

In fact, do we have a special treatment for this kind of compilation ?

Thanks for your help ! --Hsarrazin (talk) 12:46, 6 October 2017 (UTC)

It depends how long "very, very, very long" is. We have lots of room on Wikidata for items about works by famous people. Jane023 (talk) 12:52, 6 October 2017 (UTC)
I think the question behind this is « Is a "Complete works" a work ? » And I'm not sure how to answer that but I guess we could do as if it is true (for the sake of consistency and all). Cdlt, VIGNERON (talk) 12:56, 6 October 2017 (UTC)
@Jane023:
well, you may click on the provided link and see for yourself... let's say I have absolutely NO INTENTION to try and list them in the item...
but I need this item to be able to link the different works that are IN IT, like Memnon (Q41659922) - s:fr:Memnon, to the main publication.
@VIGNERON: how would you complete Complete Works of Voltaire (Q41659984) then, please ? --Hsarrazin (talk) 13:18, 6 October 2017 (UTC)
I would create a second item for the work as usual and link them with edition or translation of (P629)/has edition or translation (P747). Cdlt, VIGNERON (talk) 13:26, 6 October 2017 (UTC)
What I would do is to create a new item (if it does not already exist) called "complete works edition" or something similar and add it for P31 of the edition items. And don't create any edition or translation of (P629) for them because I don't think a work item for "the complete work of Y by Z" is going to be used anywhere outside of this edition or translation of (P629) (but I would be happy to be proved wrong). Tpt (talk) 13:31, 6 October 2017 (UTC)
@Tpt: I had that idea too... this way "complete works edition" could be a subclass of edition, but wouldn't need edition or translation of (P629).
it would be more logical :) --Hsarrazin (talk) 13:35, 6 October 2017 (UTC)

Well if each blue link is a text, then that list doesn't seem very long to me. The linked item that has been translated into other Wikisource versions is a bit odd though - how can an item Memnon (Q41659922) be the same text if it is translated into different languages? If you can't link to the French version then these should not be linking to each other either. Conversely they could all be interwikilinks of the same item with proper modelling I suspect. I guess I don't quite understand what you are trying to do. What is Garnier Frères (Q29186262)? Jane023 (talk) 14:09, 6 October 2017 (UTC)

Garnier Frères (Q29186262) is the publisher.... and has been merged with Garnier Frères (Q733801).
@Jane023: each blue line is a volume of texts. Some of them may contain hundreds of texts :) --Hsarrazin (talk) 09:17, 7 October 2017 (UTC)
I did not build the item Memnon (Q19698441). It represents the work. But many language links have been added to it (like thousands of actual items, where wikisource links have been added to the work item with all wp-links: an example of frws texts wrongly linked). All those still have to be divided into as many items as there are ws editions... ;)
Also, there is still a need for a template (or better, wikisource automatic process) that would link all editions of a same work on other or same ws-project, since (for now) separating editions in different items removes the links between languages :( @Tpt: --Hsarrazin (talk) 09:17, 7 October 2017 (UTC)
How about treating said Complete works as a book series (Q277759)? --HHill (talk) 14:35, 6 October 2017 (UTC)
I think it's the same problem we face when managing a compilation, selection or anthology of works by an author but only that a 'complete works' aims to be more.. complete (failing to achieve that many times). I mean, couldn't we use the already existing compilation (Q1614239), since it's a compilation of works but 'more complete' ?.--Zeroth (talk) 14:40, 6 October 2017 (UTC)

This seems very close to what we discussed above as 'composite editions': I think the right way to do it is to link the edition item to all the works it is composed of, via edition or translation of (P629), to keep the 'work' level to entities that came into existence by the will of the author (+ canonical/well established posthumous works). Generally speaking, I think we should try to keep the work level as ignorant as possible of publishers' actions, a "pure world of Ideas" from which the editions are composed. book series (Q277759) should also belong to this world of Ideas (ex: there is only one Gaston (Q737153) serie (singular intended), despite all its re-publications in different arrangements). But then, on the editions level, we could also have the means to mark those identifiable types of editions: there could be a way to mark those editions as a 'complete works of author X' type of edition, either via their instance of (P31) value, using a subclass of version, edition or translation (Q3331189), or some other dedicated property: edition -> 'complete works of' -> author. -- Maxlath (talk) 17:46, 6 October 2017 (UTC)

If you have a book called "Complete works of X", published while X was still alive, it is obviously not complete, but that is not a problem. It is a "reference work", genre "monograph", and has various parts (namely the works). You add works, by assigning the property "part of". You don't need to have items for all works in the book, though that would be nice. Why do you need this concept of "editions"? If a new edition comes out that is 10% thicker because X has continued to go on writing for another 5 years, no problem, you just add a new item for the new edition and say that it is "based on" the former book. Jane023 (talk) 19:09, 6 October 2017 (UTC)
those editions were not published during Voltaire's life, but much later. There were a lot of those, compiled by a lot of reference editors (so referential that these editions are known by the name of the editor), like for Montaigne, or Pascal, or many other classical authors (I speak of French authors here, but I guess it's the same for Shakespeare or many other world famous prolific authors).
each of them contains different works, as they were not based on the previous edition, but on a scientific work of edition, directly from the author's manuscripts, and sometimes works previously attributed to the author can be re-attributed…
I was not aware of the "composite edition" previous discussion, but it doesn't answer the specific question of "complete works"… as much as 2 or 3 works can easily be listed in the item for the edition, the "complete works" on some authors, tends to make that nearly impossible.
my best idea would be to have
  • a "complete work edition" item (same for all authors, with description like "edition of all works of an author, complete or tending to completeness" (to imply that it may be "not complete"). This is by essence, not a selection, nor a choice.
  • This item would be a subclass of edition, but not have any work to be linked to through "edition of"
  • Then, it could be used as instance of (P31) on any "CW of so-and-so", with Author, publisher, date, editor, etc., number of parts (since a lot of them include dozens of volumes) like, any edition item.
  • each edition of the text included could then be linked to it through published in (P1433), which would allow for a list of texts in it through SPARQL.
this way, the texts edited on wikisource could all be related to this "CW edition", when they are edited (we do not intend to create items for texts before they are edited on wikisource for now, "create when needed")…
what do you think ? @VIGNERON, Tpt: --Hsarrazin (talk) 09:17, 7 October 2017 (UTC)
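The last point of the proposal (listing the texts of a collected edition through published in (P1433)) could be sketched like this. It is only an illustration: the helper name is made up, the query is built as a string for the Wikidata Query Service and is not sent anywhere, and it assumes each text's edition item carries a direct P1433 statement pointing at the "Complete works" item.

```python
import re

# SPARQL template; the doubled braces are literal braces for str.format().
QUERY_TEMPLATE = """SELECT ?text ?textLabel WHERE {{
  ?text wdt:P1433 wd:{qid} .   # published in -> the collected edition
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{languages}". }}
}}
ORDER BY ?textLabel"""


def texts_in_collection_query(qid: str, languages: str = "fr,en") -> str:
    """Return a query listing every item published in the given collection item."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not an item ID: {qid!r}")
    return QUERY_TEMPLATE.format(qid=qid, languages=languages)


# Complete Works of Voltaire (Q41659984), the item discussed in this thread.
print(texts_in_collection_query("Q41659984"))
```

With this modelling the "Complete works" item itself never has to enumerate its contents; the list is derived from the edition items of the individual texts.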

I think you are approaching this from the wrong perspective. Of course it is a good idea to plan for the future and try to model how the data will look eventually. It is quite another thing to contribute to Wikidata. Going back to my original question on this page, namely what are the minimal statements needed to properly describe a book (something which has still not been answered by this project), I would like to know what are the minimal statements required to create items about what you call "editions"? Because I just spent an hour trying to backtrack to figure out what the heck that "Editions Garniers" was. I have linked it up along with another loose "Editions Classiques Garniers" to "Freres Garniers" here: Garnier Frères (Q29186262). When you create items, either make sure they link to a Wikipedia or Wikisource article somewhere, have an English label and description, or at least 4 statements describing what they are. Otherwise we can't help you, I'm afraid. My advice is to create the items you want created and then discuss them here one by one. Once you have a group of e.g. 10 items that form what you call a "CW", then we can talk about creating a "CW" item structure. Jane023 (talk) 10:27, 7 October 2017 (UTC)

@Jane023: thanks for not patronizing and diverting the subject of the question… I do not see what the question about Garnier has to do with the question of Complete works (the fact that Garnier published some of them is incidental).
what I call an edition is version, edition or translation (Q3331189) - most of ws items are version, edition or translation (Q3331189) per definition. the name of the publisher should not have bothered you, since it was properly linked as publisher (P123).
the minimal statements for version, edition or translation (Q3331189) would be, as stated here, instance of (P31) => version, edition or translation (Q3331189) + edition or translation of (P629) and author (P50), title (P1476), language of work or name (P407), publisher (P123)/published in (P1433), publication date (P577) + document file on Wikimedia Commons (P996) if there is one on ws… which is the best source for French books.
also, for now, I am not trying to create items ; I'm trying to fix the mess created by an uninformed bot that mass-imported pages from wikisource, 2 years ago, without bothering to even warn wikisource contributors… most of these items do not even have a P31, and cannot be completed properly without the link to the work, that generally has to be created, because no wikipedia has an item about it…
a lot of these items have also been improperly merged with the item containing other wikisource editions (in other languages) and/or with items containing the work, generally for the only purpose of interwikilinks, without trying to come on the project Books and understand what it is. All this has to be cleaned... but to clean completely, I need to link my edition of a work to the main one, Complete Works of Voltaire (Q41659984) being one of them. number of texts that already exist on ws, from Complete Works of Voltaire (Q41659984), and from them, 86 already exist on wikidata…
the minimal statements for a Complete works of ... would be instance of (P31) (and no link to a work, or link to a fictive work ? - this is the point of my question), author (P50), publisher (P123), title (P1476), language of work or name (P407), publication date (P577), number of parts of this work (P2635), because there are generally much more than 1… + document file on Wikimedia Commons (P996) if it's a wikisource item with FS, and Bibliothèque nationale de France ID (P268) whenever I can find it.
for the Complete works question, do you tell me to create them boldly ? or do you ask that I clean up everything but instance of (P31) on a dozen of CW items ?, and then, ask you if it's allright ?... --Hsarrazin (talk) 12:48, 7 October 2017 (UTC)
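The two lists of minimal statements given above can be written down as checkable profiles. The sketch below works on a plain dictionary of claims, not on the live API; the function and profile names are hypothetical, and the stub item is invented for illustration (only Q3331189 and Q733801 come from this discussion).

```python
from typing import Dict, List, Sequence

# Minimal statements for an edition, as listed in the comment above;
# publisher (P123) and published in (P1433) are treated as alternatives.
EDITION_PROFILE = ("P31", "P629", "P50", "P1476", "P407", "P577")
EDITION_ANY_OF = ("P123", "P1433")
# Minimal statements proposed above for a "Complete works of ..." item.
COMPLETE_WORKS_PROFILE = ("P31", "P50", "P123", "P1476", "P407", "P577", "P2635")


def missing_statements(
    claims: Dict[str, List[str]],
    required: Sequence[str],
    any_of: Sequence[str] = (),
) -> List[str]:
    """List the required properties that have no value on the item."""
    missing = [prop for prop in required if not claims.get(prop)]
    if any_of and not any(claims.get(prop) for prop in any_of):
        missing.append(" or ".join(any_of))
    return missing


# Invented stub: only a class, a title and a publisher are filled in.
stub = {"P31": ["Q3331189"], "P1476": ["(some title)"], "P123": ["Q733801"]}
print(missing_statements(stub, EDITION_PROFILE, EDITION_ANY_OF))
print(missing_statements(stub, COMPLETE_WORKS_PROFILE))
```

Whether P629 belongs in the second profile is exactly the open question of this thread, so it is left out there.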
Yes I totally agree with Billinghurst on this one. There is cleanup work (thanks for all your efforts, btw) and then there is data modelling, and the two should not be confused. The main key to cleaning up Wikidata messes is finding the items linked to the mess (which can be extremely difficult when the items just have one language label and no other links to speak of), rounding them all up somehow and stuffing them in a list to be dealt with later. The key is making them findable enough so that others can also help in the cleanup process. I agree that a few years ago there was a lot of stuff uploaded from various wikisource projects (not just French) that have been left hanging. I suggest just adding everything you find on Voltaire for example as part of Category:Works by Voltaire (Q6828533) for now. Theoretically anything with a scan will be on Commons and will be in that category, so at least it will be on the radar for Structured Commons, etc. Jane023 (talk) 13:17, 7 October 2017 (UTC)
@billinghurst: thanks.
I don't try to determine whether it is indeed complete or not :)
I'm trying to determine how to qualify Complete Works of Voltaire (Q41659984) (and many others, not yet created). Is it a work (Q386724) or a version, edition or translation (Q3331189)? I ask because a version, edition or translation (Q3331189) needs an edition or translation of (P629) statement, and there should not be a publisher (P123) on a work (Q386724).
I like compilation (Q1614239) which is very general, but has neither instance of (P31) nor subclass of (P279). sooo, the question is the same : work (Q386724) or version, edition or translation (Q3331189) ? if it's a subclass of version, edition or translation (Q3331189) that will be fine with me :) --Hsarrazin (talk) 13:06, 7 October 2017 (UTC)
@Hsarrazin: it is a version, edition or translation (Q3331189). The parent of the edition in this case would be the compilation that I mentioned, and if it is notable then it would be what would have the wikipedia link.  — billinghurst sDrewth 13:13, 7 October 2017 (UTC)

Everything printed has a publisher, even if it is a self-published manuscript. In such cases (also for hand-written letters), the archive-holder is the publisher (even if you took the photograph, the archive "publishes" the document by releasing it to you). Jane023 (talk) 13:28, 7 October 2017 (UTC)

@billinghurst: Please don't mix compilation (Q1614239) with the work/edition/exemplar classification: a work can be a compilation, an edition can be a compilation and an exemplar can be a compilation. Mixing different types of classification makes a mess later when trying to retrieve data. A text can be described by:
- FRBR type: work, edition or exemplar
- format:...
- genre:...
I don't know in which type of classification compilation (Q1614239) can be used, we can use instance of (P31) if we want but not as replacement of work/edition/exemplar. Snipre (talk) 21:53, 8 October 2017 (UTC)

I also like compilation (Q1614239) and think that's the best choice for this matter, but lack the knowledge to complete the properties mentioned by @Hsarrazin: (instance of (P31) and subclass of (P279)). Maybe someone could complete them and we can start using it.--Zeroth (talk) 23:44, 8 October 2017 (UTC)

"One-shot" editions

This is a question that I received from a frwikisource contributor @Aristoi:, who is not a librarian, and tries to systematically add his edited texts referenced properly on wikidata, and is quite thorough with it (\o/ for him)

Q19166418 was only published once, and won't probably ever have another edition[1]… Is it really necessary to create a work item for it, or is there another way to manage it ? Should such an item be at the same time version, edition or translation (Q3331189) and literary work (Q7725634) ?

Thanks for simple do-able answer and not philosophical dissertation about the aim of the project :) --Hsarrazin (talk) 15:45, 7 October 2017 (UTC)

@Hsarrazin: If you were a machine programmed to extract data from both edition and work items to create a data set, how would you deal with the case of an edition item without a corresponding work item? Does that answer your question? A model is a model; every exception has to be declared and identified to allow automatic data processing to handle the data correctly. Snipre (talk) 14:45, 8 October 2017 (UTC)
@Snipre, Aristoi:
which is the question : is there a solution for simple editions, not to create 2 items with a specific claim ? (which would be good, if we don't want to totally disgust non-librarian contributors from other projects to contribute here… ;) --Hsarrazin (talk) 14:50, 8 October 2017 (UTC)
No. And for why, see above. Snipre (talk) 18:35, 8 October 2017 (UTC)
Surely the “machine” could be programmed to handle items which were both “work” and “edition”. I have a lot of faith in programmers. - PKM (talk) 19:23, 8 October 2017 (UTC)
@PKM: It is not only programmers who have to deal with exceptions: non-programmers will use wikidata through SPARQL queries, so which is better: 1) to respect some constraints once when adding the data and to have a very easy data extraction during the rest of WD life or 2) to have once an easy data addition and then thousands of difficult queries during the rest of WD life ?
Somebody has to do the job once: the guy who adds the data or the guy who extracts the data. And as there is only one data entry compared to thousands of data extractions, it is worth having a simple model which has some constraints instead of a very flexible model which will require a lot of control loops to ensure a correct data extraction. This is efficiency, and if people don't want to lose time entering data, some tools exist to create items automatically starting from a good spreadsheet: it is possible to avoid all the manual data entry. A database does not work like an ordinary document: to be efficient, you need to work with database tools. Snipre (talk) 21:35, 8 October 2017 (UTC)
@Snipre: I agree with you on the use of efficient database tools to contribute...
Unfortunately, while we were working on efficient and systematic tools to transfer our data to wikidata, some bot (that I will not name, but is pretty well known) imported 25 thousand of our pages, without any claim... even empty pages and pages that are not supposed to go on wikidata, like technical pages "/texte entier", which are only created to ease pdf export, and also chapters (but not all, would be too good a job), etc.
after that, a lot of people, thinking it was better for interwiki links, merged those empty items (editions) with the corresponding "work" items from wikipedia...
and now, the mess must be cleaned up... can't see how to automatize it : it is a case by case task...
also, nice contributors from wikisource (who are not opposed to wd like our sister project's wp) take care to transfer their work to wikidata once completed... but have difficulties with the 2-level cataloguing : I don't think it is a good idea to repel them from contributing through too strict rules, and very long and hard... because it's always a 1 book at a time process for them.
It would be nice to have a claim of some kind to pile up those incomplete data, and then batch-process them regularly to complete them, a kind of workflow...
so, any suggestions, someone ? constructive ones, I mean, to ease the process... :) --Hsarrazin (talk) 10:42, 11 October 2017 (UTC)
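One possible shape for such a workflow is a triage pass that sorts items into worklists by the cleanup step they still need. This is a sketch only: it runs on an in-memory dictionary of claims, not on Wikidata, the bucket names are invented, and the item keys are placeholders, not real item IDs.

```python
from collections import defaultdict
from typing import Dict, List

EDITION_CLASS = "Q3331189"  # version, edition or translation


def triage(items: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Group item keys by the cleanup step they still need."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for key, claims in items.items():
        classes = claims.get("P31", [])
        if not classes:
            # Imported without any claim: nothing can be decided yet.
            buckets["needs P31"].append(key)
        elif EDITION_CLASS in classes and not claims.get("P629"):
            # An edition whose work item is missing or not linked.
            buckets["edition without work"].append(key)
        else:
            buckets["ok"].append(key)
    return dict(buckets)


# Placeholder keys and values, not real Wikidata IDs.
sample = {
    "item-a": {},
    "item-b": {"P31": [EDITION_CLASS]},
    "item-c": {"P31": [EDITION_CLASS], "P629": ["work-x"]},
}
print(triage(sample))
```

Contributors could then keep adding one book at a time, and the "edition without work" list could be batch-processed regularly by whoever is comfortable with the two-level model.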
Hurrah Hsarrazin, well said. The sources that many of us are using for old works are not well-formatted to a spreadsheet. I am significantly more likely to be working on one item with depth, than many items with shallowness. So I have an edition or a work, be it at Wikisource/Commons/IA, so I am working at the edition level, and usually there is no creative work component. Or I am working on a newspaper article or a biographical article published in other works; or something to cover a range of articles.

I can work on the production of a lecture from a lecture series, so I should create 1) the lecture, 2) the lecture series, 3) the work level, then the edition level, and maybe other components like author, the location of the talk, the subject matter, etc. For these lectures they are usually published once, especially being contemporaneous or very specific to a narrow topic. Guess what, I end up just creating the edition level, more towards the bare minimum as repetitively typing data, and sorting out all the weird issues/rules for sorting through WD.  — billinghurst sDrewth 01:54, 12 October 2017 (UTC)

I'd object creating more than one item for these cases --JakobVoss (talk) 17:24, 13 October 2017 (UTC)
  1. which should be the case of the majority of wikisource texts
Hi, Do we have to create a work item for collection like Q43198303 (s:fr:Livre:Tolstoï - Les Rayons de l’aube.djvu)? Of course, a work and an edition item are needed for each of the text included, but I don't see the need to create 2 items for the collection. IMO, this is not a work, but a collection of works. We could have Wikipedia articles about some of the essays, and links to other languages, but on the collection level, a work item would be useless. Opinions? Regards, Yann (talk) 23:34, 16 November 2017 (UTC)
Imho we should only create work items when it makes sense (i.e. when we deal with two or more different editions). With this approach we only add complexity when required, from the scientific article that has only one edition (and one item) to the Bible with one item about the abstract complete work, one item for each different major version (the Catholic Bible...), one work item for each book... Whatever we do for books we will have to deal with one item for both work and edition for scientific articles and with complex things like the Bible, so I don't think that the argument "let's have at least two items for automated tools" stands, because automated tools will have to deal with a much bigger complexity anyway. There are probably going to be many more humans than programs dealing with the data, so let's optimize for humans. So, if you ask yourself the question "should we create two items?" the answer in my point of view is "if you don't find a good reason to create two just don't split". Tpt (talk) 09:06, 17 November 2017 (UTC)

work (Q386724) vs creative work (Q17537576) — occasionally problematic

How do people see the difference, if any, between the two? I ask because work (Q386724) is set for published in (P1433), whereas creative work (Q17537576) is used by contributed to creative work (P3919). The difference is problematic with constraints and it seems that they have just developed separately. I think that we should be looking to clarify and possibly realign.  — billinghurst sDrewth 03:05, 11 October 2017 (UTC)

Can a work be not creative ? Snipre (talk) 22:05, 11 October 2017 (UTC)
There is a clear disconnect with the constraints and classification, so a rhetorical question is not helpful. One could propose that all works have a level of creativity, and as such how copyright is often viewed, hence why I brought the matter here.  — billinghurst sDrewth 00:10, 12 October 2017 (UTC)
In the US, for copyright purposes, directories, or other information where there is only one reasonable way to arrange the information, is not eligible for copyright because it isn't creative. See w:Feist Publications, Inc., v. Rural Telephone Service Co.
The data/information is not copyrightable, though the production can be due to its creativity in formatting, presentation, artwork. So if someone burns a CD with that data, they can rework the data, they cannot start burning and selling the CD. Which is the work? What differentiation between calling it a work, or a creative work? Are they the same, or is one subsidiary to the other?  — billinghurst sDrewth
"What differentiation between calling it a work, or a creative work?" So again: can a work be not creative? My question was not rhetorical; it was just a way to solve the problem. We have a problem of definition: we just need to describe each item correctly with an appropriate definition and everything will be clear. Snipre (talk) 09:15, 17 October 2017 (UTC)

Cases as examples

Some contributors have complained about the lack of information on how to model books and other works. I have started to create some cases and to propose structures for storing the data in WD. See this page to access the cases and the structures.

If you can:

  • comment on the existing cases and the proposed structures
  • propose other cases in order to create a library of cases which can be offered to contributors later

Thanks. Snipre (talk) 09:20, 17 October 2017 (UTC)

Book in two volumes


Hi! While creating Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781) I had some doubts. Is there a guide anywhere for books with volumes? I used The Lord of the Rings (Q15228) as a template, but I'm not sure that was the right case. In particular, this project's homepage says that number of parts of this work (P2635) applies only to editions, but I put it on the work. Moreover, I added identifiers to the work, like Bibliothèque nationale de France ID (P268), but it isn't in the work table because the BNF normally contains editions. Can someone help me to resolve those contradictions? --AlessioMela (talk) 10:03, 19 October 2017 (UTC)

This is a difficult case but if we follow the data structure, we need:
Total 5 items. Snipre (talk) 13:31, 20 October 2017 (UTC)
@Snipre, AlessioMela:
Hi, I would say "volumes" (physical division) are typically edition items, since the same book can be published in 1 volume by a publisher, 2 or 3 by another. "tomes" are another matter... tomes are an intellectual division, and therefore could be catalogued as parts of a work.
I don't see why we would need 5 items. 1 for the work and one for the edition (specifying number of parts of this work (P2635)) seems enough for me, unless you really want to catalog each part of the book separately, because 1 part is about a specific subject, and the other about a 2nd specific subject. Otherwise, something like Anna Karénine (Q40905546) (edition) and Anna Karenina (Q147787) (work that still needs to be cleaned of various editions info), should be enough.
The Lord of the Rings (Q15228) is an example of book with 3 different parts, that were written separately, have different titles and edition dates, which indeed need separate items for each part. But a lot of books in 2 or more volumes do not need such a complex structure, unless, of course, you want to detail the content.  ;) --Hsarrazin (talk) 19:54, 20 October 2017 (UTC)
@Hsarrazin: Since when can a work collection be composed of 2 editions? A work can be composed of other works but not of editions. If you have several editions of each of the volumes mentioned above, with your system you are not able to correctly link the items for the second and later editions to the item of the work collection.
Take the generic case:
  • A1, item for the work collection
  • C1a, first edition of the first volume
  • C1b, first edition of the second volume
  • C2a, second edition of the first volume
  • C2b, second edition of the second volume
If you don't have a work item for each volume, you have only one way to link A to C1a and C1b:
With 2 additional work items Aa and Ab, you can write
These relations are necessary to be homogeneous with the basic case one work/one volume. Can you propose the same relations with less than 7 items ? Snipre (talk) 20:39, 20 October 2017 (UTC)
@AlessioMela: Do you know if there are several editions of Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781)? Even if this is not the case here, it is the case for other work collections, so we need one unique system able to handle all cases, not different data structures depending on the case. Without a unique structure, we won't be able to extract the data correctly using a simple query. Snipre (talk) 20:50, 20 October 2017 (UTC)
@Snipre: yes, there are other more recent editions. To reply to a previous question, the two tomes need one item each in order to link them to the corresponding djvu file and, in future, to wikisource. --AlessioMela (talk) 15:16, 23 October 2017 (UTC) P.S. it's quite common in catalogs to have this kind of situation: one work but printed in two tomes.
@Snipre: - you read me wrong, I never said that a work could be composed of 2 editions. I said that volumes and tomes should not be confused, because volumes are parts of an edition, while tomes are parts of a work.
Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781) is not a collection of works. It is a single work, that has been published in 2 volumes because of the size of the book, not because there are 2 different works in it. Another edition could be in 4 volumes, or 3, or 1, but there still is only 1 work ; therefore I do not see why you want to make a volume of an edition a separate work. The Lord of the Rings (Q15228) is a wrong example for Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781).
I agree with you that, for a collection of works, one item per work, + 1 item for the collection is necessary, but it is not the case for Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781). That is why I proposed Anna Karenina (Q147787) as a more similar example. --Hsarrazin (talk) 19:27, 23 October 2017 (UTC)
@Hsarrazin: Ok, the case is more subtle than I thought, but the distinction you proposed (one work in 2 volumes) can't be modeled. Just think about the second edition of the work: how do you link the work item with the second edition volumes? Again, what is your proposal to link:
A1, the work, C1a, C1b, both volumes of the first edition and C2a and C2b, both volumes of the second edition ?
Currently we have A1, which corresponds to Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781), C1a to Description Historique et Chronologique des Monnaiees de la République Romaine Tome Premier (Q42251879) and C1b to Description Historique et Chronologique des Monnaies de la République Romaine Tome Deuxième (Q40796567), and the following relations:
A1 has part(s) (P527) C1a, A1 has part(s) (P527) C1b
If I create QXXX which corresponds to C2a, how can I link A1 with C2a ? By adding a new statement A1 has part(s) (P527) C2a ? Snipre (talk) 10:13, 25 October 2017 (UTC)
@Snipre:
no, my proposition is not to create items for the volumes. Volume items are not relevant, no more than an item for page 150-229 would be relevant.
my proposition, which is the solution used by libraries which began to use frbr is to have 1 item for the work, and 1 item for the edition, the edition item being specified as number of parts of this work (P2635) -> 2 volumes. and no has part(s) (P527).
there is no need to complexify for no purpose : a volume is just that... a physical division, generally made for easier manipulation, because of the size of the book ; it makes no sense to have an item for it.
Volumes do not need to have an item if they don't match an intellectual part of the work (a "tome"), which could be anything from chronological to thematical or geographical. IF there is such a division, AND it is interesting to have 1 item for such division, THEN, we can have 1 item for each part (a work part, a "tome"), and an item for each edition of each tome. But, in other cases, like this specific book, there is absolutely no need for an item for each volume.
therefore, Description Historique et Chronologique des Monnaiees de la République Romaine Tome Premier (Q42251879) and Description Historique et Chronologique des Monnaies de la République Romaine Tome Deuxième (Q40796567) are not needed. only Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781) and another item for the edition, with number of parts of this work (P2635).
if another edition, in 1 2, 3 or 4 volumes exists, another item for this edition, indicating the number of volumes, will be enough. :)
if you need a division of the work in parts, for wikisource purpose (which I can understand completely), then use parts of the work or chapters... not volumes. - then, of course, each part or chapter needs a "work item" AND an "edition item" (I also do it on wikisource, but this is not needed for any book in many volumes ; only for wikisource purpose… :)
also, AlessioMela, if the "volume" items are for the sole purpose of linking to djvu, then you can also do like Le Diable à Paris (1868 ed.) (Q42367359) - and yes, there will be 2 items for each short story in this book ;) -- Hsarrazin (talk) 11:45, 28 October 2017 (UTC)
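The one-work / one-edition layout proposed above can be sketched in Python, with plain dictionaries standing in for Wikidata items. This is only an illustration under stated assumptions: the edition identifier Q_EDITION_1 and the helper name work_and_edition are hypothetical, and the use of edition or translation of (P629) as the link from edition to work is taken from its use elsewhere on this page.

```python
# Plain dictionaries stand in for Wikidata items.
WORK = "Q42251781"        # the work item named in this thread
EDITION = "Q_EDITION_1"   # hypothetical placeholder, not a real item


def work_and_edition(work_id, edition_id, volume_count):
    """Build the one-work / one-edition layout.

    The volume count is a plain value on the edition item, so no
    per-volume items and no has part(s) (P527) statements are created.
    """
    return {
        work_id: {},                  # the work carries no volume data
        edition_id: {
            "P629": [work_id],        # edition or translation of
            "P2635": volume_count,    # number of parts of this work
        },
    }


items = work_and_edition(WORK, EDITION, 2)
for item_id, claims in items.items():
    print(item_id, claims)
```

A second edition in a different number of volumes would simply be another call with a new edition identifier; the work item stays untouched.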
@Hsarrazin: Re volumes, I would generally agree that all volumes belong to a work item and should not have individual items, though I would limit that to where they are published concurrently, or close to simultaneously. The few exceptions that I see are works that have been published in multiple volumes over a range of years, where the publishing details change. My favourite example is the DNB, which was published from 1885 to 1900 in 63 volumes, with different editors, contributors, times of publication, etc. So, I would like to suggest that your words above be our general guidance, though not a hard rule, i.e. follow the guidance unless you can really justify that a work is different.  — billinghurst sDrewth 13:42, 28 October 2017 (UTC)
@Hsarrazin: Your proposal is valid but does not hold up against the driving force in WD: data. You claim that volume 1 and volume 2 should not be separated into 2 items, but as the two volumes have different values for the same properties, like number of pages or ISBN, people will create 2 items, one for each volume, in order to be able to add that kind of information. And I suppose that other identifiers are different for each volume, so in that case the need for 2 items becomes critical. Snipre (talk) 23:11, 30 October 2017 (UTC)
@Snipre: I did not say that you could not, I said it was not necessary - i.e. you don't have to - only if you want to :)
besides, no wikisource book have any ISBN : they are PD, and thus published well before 1970 ;) --Hsarrazin (talk) 08:36, 31 October 2017 (UTC)
@Hsarrazin: Don't be so restrictive: Wikidata is not designed only to support Wikisource, so you have to take care of other cases outside the Wikisource field. And if you are right in saying it is not necessary, what do you propose to do when the case arises (the case: contributors creating several items for volumes in order to add volume-specific identifiers)? To be helpful, your proposal has to cover all cases.
Just for information, there is an edition of Description Historique et Chronologique des Monnaies de la République Romaine (Q42251781) with the following data:
  • Tome 1: ISBN = 978-1497359550, number of pages = 632 pages
  • Tome 2: ISBN = 978-1497359567, number of pages = 674 pages
So my case is not hypothetical. Snipre (talk) 17:23, 31 October 2017 (UTC)
@Snipre: I was not restrictive : the original question was about wikisource use with 2 djvu.
I say : when a contributor asks if s/he must create an item per volume, I answer "no" you are not obliged to. When a contributor asks if s/he can : I say OK - this is not an obligation. that's all.
but they also could add different info about different volumes in the same item, as qualifiers for example :)
I say create when needed not because you have to... --Hsarrazin (talk) 20:09, 31 October 2017 (UTC)
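The qualifier idea mentioned just above can be sketched as follows. This is a rough illustration, not a description of how Wikidata stores statements: the item identifier Q_EDITION_REPRINT, the property keys isbn13 and number_of_pages, and the qualifier key volume are all hypothetical placeholders, while the ISBNs and page counts are the ones quoted earlier in this thread.

```python
# One edition item carrying per-volume data as qualified values.
edition = {
    "id": "Q_EDITION_REPRINT",   # hypothetical placeholder item
    "isbn13": [
        {"value": "978-1497359550", "qualifiers": {"volume": 1}},
        {"value": "978-1497359567", "qualifiers": {"volume": 2}},
    ],
    "number_of_pages": [
        {"value": 632, "qualifiers": {"volume": 1}},
        {"value": 674, "qualifiers": {"volume": 2}},
    ],
}


def values_for_volume(item, volume):
    """Collect every qualified value that belongs to one volume."""
    return {
        prop: statement["value"]
        for prop, statements in item.items()
        if isinstance(statements, list)   # skip the plain "id" entry
        for statement in statements
        if statement["qualifiers"].get("volume") == volume
    }


print(values_for_volume(edition, 1))
print(values_for_volume(edition, 2))
```

The trade-off is the one debated here: a single item stays compact, but anything that needs to point at one volume only (a scan, a transcription) has no item of its own to link to.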

Ok, back to the margin…

You should not be obliged to systematically have a work part for an edition part - sometimes volumes match a part of a work, and sometimes they simply don't.

I am thinking here of a specific work that gave us a nightmare afternoon on wikisource a few years ago: Mémoires de Saint-Simon. This single work was published in the same year (1856), by the same publisher, Hachette (Q349566), in 13 and in 20 volumes, in different formats and typefaces. The content is absolutely the same, but divided differently, to accommodate the printing needs.

This work has no clear parts, except for chronological logic, since it is memoir (Q112983).

There is absolutely no way you can apply the frbr model as described above to those editions in a rational way: we would have to create 66 items ((20+13)×2) just for those 2 editions, and none of those parts would match each other... and this same work has also been published in 2, 5, 12, 16, 45 volumes (not counting the editions I do not know of, and the translations abroad: hundreds of editions).
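The item count can be checked with a few lines of Python. The function name items_needed is hypothetical, and the sketch assumes, as the argument here does, that every volume of every edition would need both an edition-part item and a matching work-part item.

```python
def items_needed(volume_counts, items_per_volume=2):
    """Items required if every volume of every edition gets both an
    edition-part item and a matching work-part item."""
    return sum(volume_counts) * items_per_volume


# The two 1856 Hachette editions: 13 and 20 volumes.
print(items_needed([13, 20]))

# Adding the other known editions in 2, 5, 12, 16 and 45 volumes.
print(items_needed([13, 20, 2, 5, 12, 16, 45]))
```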

That would be nonsense to apply the 1 work-part matches 1 edition-part to this book... None would match...and it would create a mess.

So I say, only create volumes when you really need the volume item, not just for the sake of having 1 book 1 item... this is useless and contrary to the FRBR model, which was created to be able to link together editions of a same item, not to divide a single work into artificial parts that do not exist in the work itself.

Here I see 2 possibilities to accommodate this:

  1. do not create volume items and use qualifiers (or multiple values) to add what you want to the edition item.
  2. create a "physical part of an edition" item, that could be used as P31 on parts, without the need for a "part of a work" counterpart item. Only editions need a work item. This would allow to create volumes, without leading to the artificial creation of "part of work" that do not match the reality of the work.

So, what do you think, all ? --Hsarrazin (talk) 09:11, 1 November 2017 (UTC)

Comment This model looks fine to me. Regards, Yann (talk) 09:35, 1 November 2017 (UTC)
What Yann said, and in addition to that, please pity the non-Wikisource contributor who is adding a book, in 1-100 volumes, and is coming from e.g. Wikipedia or some other place, and has no idea what you are talking about. There must be a clear pathway to split the items you need from that initial book item in the normal wiki-way to (eventually) link to various wikisource pages if the book ever gets uploaded. Jane023 (talk) 09:41, 1 November 2017 (UTC)
wikisource or non-wikisource user, Jane023 ;) - I would even say librarians and non-librarians, as many wikisource contributors are not librarians either… - the 1 item for each volume is NOT frbr... FRBR is 1 work item/X edition items. Forcing the creation of items for volumes is contrary to FRBR. --Hsarrazin (talk) 10:59, 1 November 2017 (UTC)
Amen to that! Jane023 (talk) 11:15, 1 November 2017 (UTC)
I agree with Hsarrazin, we should avoid creating items for volumes. It could greatly increase the number of items with probably very little data in them. Moreover, we already have Commons files and Wikisource Index: pages that are on the volume level. Tpt (talk) 13:46, 1 November 2017 (UTC)

Rereading the above, I would just like to say that "volume" = "tome". There are physical divisions as well as logical divisions across editions, depending on time period. You basically can't generalize in any meaningful way that separates these (hoping to keep any decisions made here to be the same across Wikisource languages). Jane023 (talk) 09:35, 2 November 2017 (UTC)

I'm not sure about English, but in French it's clearly « volume@fr = physical divisions vs. tome@fr = logical divisions ». But in the end, this doesn't really matter, as most of the time the divisions for the volume and the tome are the same. Plus, we seem to have reached a consensus « don't create several items if you don't need them », so it matters even less. Cdlt, VIGNERON (talk) 10:59, 2 November 2017 (UTC)
@Tpt: How do we reconcile "we should avoid items for volumes" with "we already have Commons files and Wikisource Index: pages that are on the volume level"? When there are different digital forms of the different volumes of one specific edition of a work: Wikisource transcriptions, Commons files, Archive.org files, Google Books links... shouldn't they each have an item? I take the points that we want to avoid unnecessary complication if possible, and that for many purposes the volume is not a significant division, but it seems it is a significant division when we are getting scans or transcriptions of the book. MartinPoulter (talk) 17:10, 2 November 2017 (UTC)
@MartinPoulter: Use different properties to link to the different repositories. Snipre (talk) 21:35, 2 November 2017 (UTC)
@MartinPoulter: Sorry for the late answer. What we could do is put these data in the future Commons metadata system that is going to be at file (i.e. scan) level. It would allow us to avoid having both a Wikidata item and a file description entity for something quite similar. Tpt (talk) 09:53, 17 November 2017 (UTC)

Importing books from the Open Library

Hoi, we have a good cooperation going with the Open Library. We have already synchronised identifiers for authors. As a next move we want to add publications by these authors that 1) have an external identifier like an ISBN or a Library of Congress identifier, and 2) have a readable version available, like a PDF or an ebook. The point is very much to enable people to read these books.

What else is needed to make this happen? How do we indicate the format available? Thanks, GerardM (talk) 11:16, 25 October 2017 (UTC)

@GerardM:
1) Map all WD properties with Open Library data structure
2) Does Open Library provide data only about editions ?
3) Set up an importation process with a bot based on the data structure described in Wikidata:WikiProject Books
4) Request the importation approval in Wikidata:Bot requests (points 3 and 4 can be done together, but are better discussed with a bot operator first)
Snipre (talk) 11:37, 25 October 2017 (UTC)
To be honest, I do not care all too much about the niceties; with the ISBN or the LoC identifiers we can retrofit edition information. The point is that with a book's name or its author we find a public. The rest is extra. Thanks, GerardM (talk) 11:40, 25 October 2017 (UTC)
If you have to set up a bot to import some data, it is worth adding the rest of the data: it is really better to spend a little more time on that step, because what will bring value in WD is data completeness. Snipre (talk) 14:19, 25 October 2017 (UTC)
Some of us do place priority about edition vs book level descriptions, and it sounds as if it is something readable then it is edition data. Noting that we will already have numbers of these works, and many will have WP descriptions. So if they are going to be imported they should be new items, and edition level. Some will link with our existing edition items anyway and at those points we can merge if they are found to be duplicates. We will still face the issue of work vs. edition data; matching and linking existing works with new editions; and trying to get easy creation of work level items where we have existing edition level items.  — billinghurst sDrewth 22:26, 25 October 2017 (UTC)
I agree, the edition-work split is important (e.g. we've got so much of this sort of thing, that is just going to have to be cleaned up one day; let's not make more of it). But anything with an ISBN is by definition an edition. We could import only those, and it'd still be a big step. Sam Wilson 06:08, 27 October 2017 (UTC)
  Comment things with ISBN details are pretty unlikely to be at the WSes, so they are a cleaner edition import.  — billinghurst sDrewth 11:18, 27 October 2017 (UTC)
But there are other IDs like DNB, Internet-Archive-ID, Google-Books-ID, Open-Library-ID, OCLC-ID...
--> Wikidata:WikiProject Books#Edition item properties --WikiAnika (talk) 11:28, 27 October 2017 (UTC)

Obviously, the data as is available will be included. It will include at least one external number. A short story can be included in many places and as long as it is the same text, the difference is academic.

On a different subject; for Wikisource we do not know what is ready for reading. In my opinion all the information about Wikisource is only relevant to the Wikisourcerers as it does not make a difference in finding a public for their work. Thanks, GerardM (talk) 13:42, 27 October 2017 (UTC)

Query requests

Would someone with the skill be able to write separate queries that

  1. identify work level items that have a wikisource link (an indicator that an edition level item should be created, to which we can move the WS link if necessary)
  2. identify items with links to multiple wikisources (an indication that the item needs to be split into multiple levels)

Thanks.  — billinghurst sDrewth 16:28, 29 October 2017 (UTC)

@Billinghurst:
Petscan is your friend there ;) --Hsarrazin (talk) 09:13, 1 November 2017 (UTC)
@Billinghurst:
Here is your first query for all wikisources:
# All instances of book (Q571) that have a sitelink on any Wikisource
SELECT DISTINCT ?book ?page WHERE {
    ?book wdt:P31 wd:Q571 .
    ?page schema:about ?book .
    ?page schema:isPartOf/wikibase:wikiGroup "wikisource" .
}
Or only for one wikisource:
SELECT DISTINCT ?book ?page WHERE {
    ?book wdt:P31 wd:Q571 .
    ?page schema:about ?book .
    ?page schema:isPartOf <https://en.wikisource.org/> .
}
For your second query:
# Count Wikisource sitelinks per book and keep the books with more than one
SELECT ?book ?bookLabel (COUNT(?page) AS ?count) WHERE {
    ?book wdt:P31 wd:Q571 .
    ?page schema:about ?book .
    ?page schema:isPartOf/wikibase:wikiGroup "wikisource" .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?book ?bookLabel
HAVING (?count > 1)
ORDER BY DESC(?count)
We are at the limits of what the Query service can do (you even may have to relaunch it several times as it can TimeOut sometimes). And indeed, I agree with @Hsarrazin:, especially for the first one, PetScan can be far more useful here (a lot more options like including or excluding templates, etc.).
Cdlt, VIGNERON (talk) 10:03, 1 November 2017 (UTC)
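For anyone who prefers to run such a query from a script and retry on timeouts, here is a minimal Python sketch using only the standard library. The function names, the User-Agent string and the min_links parameter are assumptions made for illustration; the query text follows the second query above, and run_query needs network access, so it is defined but not called here.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"  # public query service


def multi_wikisource_query(min_links=2, item_class="Q571"):
    """Build a query listing items of one class that have at least
    `min_links` Wikisource sitelinks."""
    return f"""
SELECT ?book (COUNT(?page) AS ?count) WHERE {{
  ?book wdt:P31 wd:{item_class} .
  ?page schema:about ?book .
  ?page schema:isPartOf/wikibase:wikiGroup "wikisource" .
}}
GROUP BY ?book
HAVING (COUNT(?page) >= {min_links})
ORDER BY DESC(?count)
""".strip()


def run_query(query, timeout=60):
    """Send the query and return the result rows.

    Needs network access and may time out on heavy queries.
    """
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "books-query-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


print(multi_wikisource_query())
```

Calling run_query(multi_wikisource_query()) would return one row per item; wrapping that call in a small retry loop is one way to cope with the occasional timeout.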
my preferred Petscan search would work from a wikisource category (validated texts for example) and look for those with more than 1 link, like this, which is on frws (an obvious problem, since there should not be any other link) :) --Hsarrazin (talk) 10:28, 1 November 2017 (UTC)

use of has edition or translation (P747)

Concerning the use of has edition or translation (P747), I read all discussions above, and could not determine whether this should only be used for 1st edition of a work, or to link all edition items.

The second solution seems too big, since some works may have hundreds of editions. But I have seen the first solution on many items... soo... which is the correct use ? Is there a consensus, or not yet ? --Hsarrazin (talk) 10:41, 1 November 2017 (UTC)

Strange, it was always obvious to me that all editions should be listed (I didn't even know there was a discussion about it). If not, some editions would be isolated and unlinked (or linked in a strange way with other properties), which is very bad. What would be the problem with having a hundred or a thousand editions? (This applies only to a limited number of items about very famous works; the vast majority of books only have one edition.) Cdlt, VIGNERON (talk) 11:02, 1 November 2017 (UTC)
well, they would always be linked through edition or translation of (P629) ;) --Hsarrazin (talk) 11:37, 1 November 2017 (UTC)
@Hsarrazin: yes but if
⟨ A ⟩ edition or translation of (P629)   ⟨ B ⟩
then the reverse claim
⟨ B ⟩ has edition or translation (P747)   ⟨ A ⟩
is mandatory, so back to square one… Cdlt, VIGNERON (talk) 11:41, 1 November 2017 (UTC)
@VIGNERON: I agree completely. The reason I asked is because I read otherwise (just read on Wikidata_talk:WikiProject_Books#Wikiproject_Books_2.0, and did not want to do wrong :) --Hsarrazin (talk) 12:10, 1 November 2017 (UTC)
@Hsarrazin: well, this proposal was very good but I think this specific point is wrong, even if I understand the fear that some items may contain too much data (also, I'm not sure about the need to use two different properties for splitting translations and editions, but that's another point). @Nonoranonqui, Aubrey: what do you think? Cdlt, VIGNERON (talk) 16:51, 1 November 2017 (UTC)
I'm on a train to Florence to meet with Nonoranonqui and talk about this stuff! ;-) Personally, I don't like has edition or translation (P747), and I very much like edition or translation of (P629). To me, if we want to know how many editions does book A have, we need to do a query. It's much easier to put edition or translation of (P629) in all the edition items than to track them and then update book A work item... Aubrey (talk) 09:00, 3 November 2017 (UTC)
I tend to agree with Aubrey. When there was no query service, inverse properties were necessary to have access to data. Now that they are no longer necessary, I would prefer to track only major editions (like the 1st edition) with has edition or translation (P747) and use edition or translation of (P629) to get all editions. :/ --Hsarrazin (talk) 09:24, 3 November 2017 (UTC)
  •   Comment as an ultimate goal at a point in time I will agree with you, though not at this point of time. 1) It is the only ready means to see from a work that there are editions below the parent; 2) from the push side if there is a work at an WP then this is a useful means to then track down to a WS item; whereas having to run a query for an WP to see whether there is a chance of a link, and a link from "edition of" seems a hard way to do something, 3) until we get better data, and placeholders like this, it is about the single means to manage and reverse incorrect merges of editions to books by the helpful clueless.; 4) they are the one ready means to navigate between books and editions when one is generating and populating data, especially where splitting works. When the system is able to automatically show these shadow properties on an item I am in favour of their retention.(comment written by Billinghurst)
I'd explore something that I don't really know: how does Wikidata work with inverse properties? This is a problem that other WD projects have surely coped with. At the moment, I don't want to delete the property, but I'd very much prefer it to be maintained by a bot. Also, has edition or translation (P747) takes items as values, not numbers, which many users mistakenly enter. But, again: it's ok to have it at the moment, knowing that we need to resolve it in the future. Aubrey (talk) 11:19, 3 November 2017 (UTC)
I agree that these perfectly symmetric properties are not really required, but it does seem that they are required at the moment. For example, there is currently no way to run a query from Lua, and so if one wants to list all editions-and-translations of a work on Wikisource, the only way to do it is to populate has edition or translation (P747). But as people have said, this can reliably be done by a bot, because there are no edge cases in which the inverse doesn't hold true. So I'd say: let's keep it, and add the values, and work up a bot that does it for us (from either end of the relation). Sam Wilson 01:21, 6 November 2017 (UTC)
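The core of such a bot is a comparison between the two directions of the relation. A minimal Python sketch of that step follows, assuming the P629 and P747 statements have already been fetched into dictionaries; the function name and the identifiers E1, W1 and so on are hypothetical placeholders, and no real bot framework is used.

```python
def missing_inverse_statements(edition_of, has_edition):
    """Return (work, edition) pairs where the edition points to the work
    via P629 but the work lacks the matching P747 statement.

    edition_of:  dict mapping edition id -> set of work ids (P629)
    has_edition: dict mapping work id -> set of edition ids (P747)
    """
    missing = []
    for edition, works in sorted(edition_of.items()):
        for work in sorted(works):
            if edition not in has_edition.get(work, set()):
                missing.append((work, edition))
    return missing


# Hypothetical placeholder ids, not real items.
edition_of = {"E1": {"W1"}, "E2": {"W1"}, "E3": {"W2"}}
has_edition = {"W1": {"E1"}}

for work, edition in missing_inverse_statements(edition_of, has_edition):
    print(f"{work} P747 {edition}")
```

Swapping the two arguments' roles gives the check from the other end of the relation, i.e. P747 statements that lack the matching P629.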

misuse of NNL item ID (P3959)

Every instance thus far where I have come across NNL item ID (P3959), it has been misused. This property is supposed to link to work identifiers from work data items, but what I'm finding is that it links to copy records of Hebrew translations from the work data item, which should never happen. --EncycloPetey (talk) 19:22, 5 November 2017 (UTC)

make it clearer in the description? maybe look to add/suggest a constraint that flags the addition/use as incorrect.  — billinghurst sDrewth 23:43, 6 November 2017 (UTC)
I also noticed what you say EncycloPetey, but not reading hebrew, I did not dare remove them. Do you think I should ?
in fact, do you know if there are real work IDs in NNL catalog, or only editions ids (like in many libraries) ? if there are not, maybe rephrase the property as "id for an edition" ? (in all languages) --Hsarrazin (talk) 10:53, 7 November 2017 (UTC)
If you follow the links, you get the library record, which includes the standard library catalog headings in English. Without a single exception, every single record I've come across includes a "...--Translations into Hebrew" header, which means it is not an authority record for the original work. I honestly don't know if the NNL catalog has any work IDs or authority records, but I certainly haven't seen any linked. So the way in which the property is being used does not match its description in Wikidata at all. --EncycloPetey (talk) 13:47, 7 November 2017 (UTC)
mmmmh. no, in fact, I get the standard library catalog heading in hebrew (e.g. http://aleph.nli.org.il/F/?func=direct&con_lng=heb&local_base=nnl01&doc_number=001251501 from Don Quixote (Q480)'s first NNL link). And when I click on the "English" button to go to the english interface, I get an English page that invites me to make a search, not the En version of the notice. But what I can clearly see is that there is a place set, and a date... which means it is an edition ^^
almost all NNL item ID (P3959) data were added by a single user, @זאב קטן: (3614/3743 according to NavelGazer data), which means that if we can agree with them, it would be much easier...
is there someone who understands hebrew on the Books project, who could search the database, and maybe help us here ? --Hsarrazin (talk) 14:04, 7 November 2017 (UTC)
Look about halfway down the listings you mentioned (try the second, third, etc items). You'll see catalog information in English such as "Fiction--Translations into Hebrew". I just looked at the listings on Don Quixote (Q480) that you mentioned and the problem is clearly visible to me. I'm not sure why you're not seeing it, and no, you don't have to click on the "English" button to see the cataloging information. The items are standard in English, and are placed among the Hebrew catalog information. --EncycloPetey (talk) 14:54, 7 November 2017 (UTC)
@Nahum, Dovi: Are you or any of your heWS colleagues able to shed light on this topic?  — billinghurst sDrewth 22:01, 7 November 2017 (UTC)
I don't think the NNL has "works" as such in its database, only records of editions. I don't know what "NNL work ID" should link to. I believe the NNL ID property should link to the edition(s) that exist in the library, because that is the place where the information about the work is stored at this library. This at least has been my personal experience with their database.--Nahum (talk) 22:22, 7 November 2017 (UTC)
Then it sounds (and looks to me) as though what the NNL database has are records of its specific holdings, which would mean specific copies of books rather than works or editions. If that is indeed what it lists, then we have no framework yet for including them on Wikidata. You could argue that the listings could also be treated as editions (or translations), but those would require listing as separate data items rather than inclusion on the data item for the work. --EncycloPetey (talk) 04:08, 8 November 2017 (UTC)
Hello! I'm sorry, I only happened across this just now. I'm bilingual English/Hebrew, and very active in cataloging books with these systems. I'm also doing this in cooperation with the Israel National Library cataloging staff, so I'll be glad to try to clarify anything as well as I can.
For now, I'll just say that National Library of Israel ID (old) (P949) is parallel to VIAF ID (P214), while NNL item ID (P3959) is parallel to Library of Congress Control Number (LCCN) (bibliographic) (P1144). "Authority records" for works are at National Library of Israel ID (old) (P949). -- Shilonite (talk) 11:22, 13 December 2017 (UTC)
Every use of NNL item ID (P3959) I've come across is using it as parallel to VIAF ID (P214), which (as you say) is incorrect. The name is also confusing, since "work" has a specific meaning in WikiProject_Books --EncycloPetey (talk) 14:20, 15 December 2017 (UTC)
I'm sorry I didn't get back to you till now (I'm not very well...). I understand what you are saying, but please, so that we are both ‘on the same page’, try to examine the cataloging that I have done, and see if it agrees with your method.
If it does, and we are both working by the same method, then I will go back and correct whatever you find erroneous. If not, please point out the differences to me.
For example, what's your opinion about this: https://www.wikidata.org/w/index.php?title=Q20278655&diff=next&oldid=600585790 . ...and in general, what I've done with that book. I'm asking to learn, in case I've misunderstood.
Thank you, Shilonite (talk) 11:42, 4 January 2018 (UTC)

Distinguishing the types of digital text

In book scholarship there are crucial differences between types of editions, serving different scholarly purposes. See for example these course notes. To take a practical example, a Project Gutenberg edition of a book gives you the electronic text but does not show you what the pages looked like, whereas Archive.org or Google Books usually provide visual scans of the book (plus uncorrected OCR). Which you want to use depends on whether your interest is in readable text or in the exact layout and typography (or handwriting) of the original. full work available at URL (P953) is not specific enough to tell the user what the link will provide, so we need to express these different types. One list given to me by an academic has these core types:

These could be used in instance of (P31) statements, but my interest is in using them to distinguish the different digital versions of a text in the Wikidata entry for an edition. It seems like the most appropriate way to do this is with the qualifier object has role (P3831). See Political Disquisitions (1775 edition) (Q42788256) for an example, where I represent that all 3 volumes of this edition of this book are available from the Oxford Text Archive in a detailed text edition while page scans of individual volumes are available from other sources. I'm feeling my way here, so I welcome other perspectives and improvements. MartinPoulter (talk) 15:37, 7 November 2017 (UTC)

@MartinPoulter: So that I don't presume or misread ... can you say more about how you see these being used? Are you looking to use these as qualifiers for full work, or are you looking to use these as the base instance of (P31)? The classifications in your notes above cover various types of new editions, or secondary, or maybe tertiary, reproductions.

Then I suppose I am trying to figure out how/where reproductions like enWS and Gutenberg's works would fall. I have considered them per the edition published at the time, though they are really becoming editions of their own; does that then mean we shouldn't link them to those editions, as they are their own derivatives? [We so need professional guidance]  — billinghurst sDrewth 22:36, 7 November 2017 (UTC)

Hmm, now you talk of "digital text": how are we seeing that as different from version, edition or translation (Q3331189)?  — billinghurst sDrewth 22:41, 7 November 2017 (UTC)
@billinghurst: For my purposes, I'm looking to apply facsimile (Q194070) and diplomatic edition (Q42794047) as qualifiers for full work available at URL (P953), though I could imagine annotated edition (Q4769619) and eclectic edition (Q42793760) being in use for instance of (P31) (and possibly there could be exceptions both ways). I'm not proposing to create separate entries for, say, the Oxford Text Archive transcription of an edition. As I understand it, that properly belongs as a full work available at URL (P953) property in the entry for that edition. I'm looking to tag the full work available at URL (P953) property in a way that tells the user whether they will get the faithful text of the book or faithful scans of the book. Maybe the "edition" terminology is confusing when used this way, but this is the terminology of academic bibliography. Hope this clarifies. MartinPoulter (talk) 20:48, 14 November 2017 (UTC)
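A minimal sketch of how the qualifier approach described above could be read back out, assuming the statements are modelled as full work available at URL (P953) with object has role (P3831) as a qualifier. The function name is hypothetical; the `p:`, `ps:` and `pq:` prefixes are the ones the Wikidata Query Service uses for statements, statement values and qualifiers. The code only builds the query text and does not contact any server.

```python
def build_full_text_query(edition_qid):
    """Build a SPARQL query listing the full-text URLs of one edition item,
    together with the optional 'object has role' (P3831) qualifier."""
    # Basic sanity check so that arbitrary text is not pasted into the query.
    if not (edition_qid.startswith("Q") and edition_qid[1:].isdigit()):
        raise ValueError("expected an item ID such as 'Q42788256'")
    lines = [
        "SELECT ?url ?role WHERE {",
        f"  wd:{edition_qid} p:P953 ?statement .",       # the P953 statement node
        "  ?statement ps:P953 ?url .",                    # its value: the URL
        "  OPTIONAL { ?statement pq:P3831 ?role . }",     # its role qualifier, if any
        "}",
    ]
    return "\n".join(lines)


# Example item taken from the discussion above.
query = build_full_text_query("Q42788256")
print(query)
```

With a query of this shape, a reader could tell links qualified as facsimile (Q194070) apart from links qualified as diplomatic edition (Q42794047), provided the qualifiers have actually been added.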

Are we conflating editions and translations; or are we missing translations as their own works?

To me when we have the creative work (Q17537576) or the variations that we use, we then have edition or translation of (P629). Are we right to combine editions and translations? If we have a translation by an author it gets its own copyright for the translator, and to me that makes it its own creative work (Q17537576) of which then there can be their own editions. I know for the work of Anton Chekhov (Q5685) that enWS has the same works by different translators, so at the work level, are we right to group all the Russian language editions, and the variety of different language translations under the one Property:P629?  — billinghurst sDrewth 22:51, 7 November 2017 (UTC)

technically, a translation is not a creative work (Q17537576), it is a derivative work (Q836950). --Hsarrazin (talk) 19:15, 8 November 2017 (UTC)
What is the gain? We can always try to complexify the system, but for which purpose? And when I read about the difficulties some contributors have creating work and edition items, I think the creation of an additional work item for translations will just be a nightmare (one item for the translation as edition, one item for the translated work and one item for the original work). Snipre (talk) 01:04, 8 November 2017 (UTC)
The gain is clarity, consistency, and a disentanglement of conflated concepts that are actually very different. An editor of an edition does a very different job from a translator, and the results of the two processes require different kinds of information. Stuffing them together into a single block has resulted in all sorts of editing headaches. --EncycloPetey (talk) 01:27, 8 November 2017 (UTC)
I understand the conundrum, I have hesitated to post over this matter for months. I am more trying to have an open conversation and deciding to do nothing with our eyes wide open, rather than having to unpick a situation with "Why didn't I say something earlier". In the whole conversations as they have persisted, we have the issue of the conceptual idea (creative work/work/...), to the manifestation/output (edition/book/...). We will continue to be caught by this until we do a far better job explaining this matter.

At the Wikisources we are governed by public domain/free licences, we list at the conceptual level, and reproduce at the manifestation(s). So when we have translations we need to explain the concept of dual licenses. At this point in time we manage all the data and manually apply licenses, though that is not the best way to undertake the curation, especially when it is common data across WP/WS/Commons. Ultimately we should be able to suitably licence translations according to the concept of author and translator irrespective of edition, and one day it will be fed from Wikidata. [And I am probably doing a shithouse job of describing as I need a whiteboard and a marker and to draw pictures, supported by hand-waving, rather than explanation.]  — billinghurst sDrewth 03:54, 8 November 2017 (UTC)

@EncycloPetey: Please provide an example how the distinction of editions in original language and translations will help to clarify the situation: just write the relations between the editions in original languages, the translations and the corresponding work items. I did the job with the current system and I will be happy to compare with your simplified system. Snipre (talk) 17:11, 8 November 2017 (UTC)
I don't understand the symbolic language you have used to describe the relations. Please convert your model into prose or some other understandable form, if you would like me to assess it. --EncycloPetey (talk) 17:14, 8 November 2017 (UTC)
@EncycloPetey: You don't need to understand my model, you need to describe once your model, using words, graphics or what is relevant. But try once to put your ideas on the paper and SHOW where the simplicity is. Snipre (talk) 20:39, 27 November 2017 (UTC)
@billinghurst: Ok, you indicate the possible gain, even if I don't see clearly what prevents you now from doing what you want to do, but I would like to see the cost of that model modification in terms of item relations: can you show us how we would have to link the different edition, translation and work items with your model? We don't need discussions, we need diagrams to be able to validate a model and that's what is missing now. An ontology follows mathematical rules so discussions are useless: tables, diagrams, systematic descriptions of relations, that's what is important.
You mention some automatic addition of license values to items; this implies bots, so you should convert your idea into some programming language. Snipre (talk) 17:11, 8 November 2017 (UTC)
the problem of translations is one of the reasons why FRBR uses 3 levels to describe books (+1 for exemplars, which is not our problem): there is a need for an intermediate level between the original work and the edition... but the modelling on wikidata seems really difficult, and I'm not sure the linking of editions to the original work through a "translation" level would allow the retrieval of info like "date of creation of the original work" from the edition item. :/
moreover, like billinghurst says, it's already a very difficult task to explain on wikisource how the 2-level model works… if we have to apply a 3-level model, it will be a nightmare
and, if it could probably be achieved for books (with a lot of difficulties), it would be absolute hell for poems and short texts... --Hsarrazin (talk) 19:15, 8 November 2017 (UTC)

First hack at some cases

  • A1Y1: a translation of work A1 of author X1 by translator Y1 ... language detail
  • A1Y2: a translation of work A1 of author X1 by translator Y2 ... language detail
  • A2Y1: a translation of work A2 of author X2 by translator Y1 ... language detail
  • A2Y3: a translation of work A2 of author X2 by translator Y3 ... language detail

manifestations of these cases each roll into the edition model thereafter; they are just editions (and editions of the translation)

So A1 has editions in the same language or translations into other languages. A1 does not have editions in other languages except via the translations.
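The cases above can be sketched as a small data model. This is only an illustration under stated assumptions: the class and function names are hypothetical, the language codes are placeholders (the cases above leave the language detail open), and nothing here corresponds to an agreed Wikidata model. It shows the one structural point being argued: a work has editions in its own language directly, and editions in other languages only via a translation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Edition:
    label: str


@dataclass
class Translation:
    label: str        # e.g. "A1Y1"
    translator: str   # e.g. "Y1"
    language: str     # placeholder code, not taken from the discussion
    editions: List[Edition] = field(default_factory=list)


@dataclass
class Work:
    label: str        # e.g. "A1"
    author: str       # e.g. "X1"
    language: str     # placeholder code for the original language
    editions: List[Edition] = field(default_factory=list)
    translations: List[Translation] = field(default_factory=list)


def editions_by_language(work: Work) -> Dict[str, List[str]]:
    """Group edition labels by language; other languages are reached only
    through a translation of the work."""
    result: Dict[str, List[str]] = {}
    if work.editions:
        result[work.language] = [e.label for e in work.editions]
    for translation in work.translations:
        labels = result.setdefault(translation.language, [])
        labels.extend(e.label for e in translation.editions)
    return result


# Hypothetical instance of cases A1Y1 and A1Y2; edition labels are invented.
a1 = Work(
    label="A1", author="X1", language="lang-0",
    editions=[Edition("A1 edition 1")],
    translations=[
        Translation("A1Y1", "Y1", "lang-1",
                    [Edition("A1Y1 edition 1"), Edition("A1Y1 edition 2")]),
        Translation("A1Y2", "Y2", "lang-1", [Edition("A1Y2 edition 1")]),
    ],
)
grouped = editions_by_language(a1)
```

Here each translation keeps its own list of editions, which is the situation a single work-to-edition link has trouble expressing.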

the why

So we need a means to identify the one translation of a work, then the variety of places that it appears. Please feel perfectly entitled to update this for clarity. If I can get time at a whiteboard, then I will.  — billinghurst sDrewth 21:22, 8 November 2017 (UTC)

  •   Oppose having a new distinct property for translation without at least one good reason; I don't see the problem with the current uses of edition or translation of (P629). Moreover, as @Hsarrazin: pointed out, there are some over-simplifications in the initial statements: a translation doesn't really have its « own copyright » (see derivative work (Q836950)) and, on the other hand, an edition can also be considered a derivative work (Q836950) with protection of its own. More importantly, FRBR doesn't care about copyright to distinguish the levels, nor should we. Cdlt, VIGNERON (talk) 09:12, 9 November 2017 (UTC)
      Comment I understand that a translation is a derivative work, even so it does have its own copyright as a creative work. Many pages around that explain this, eg. http://bookwormtranslations.com/copyright-law-and-translation-what-you-need-to-know/  — billinghurst sDrewth 11:16, 9 November 2017 (UTC)
    Well, as always with laws, it's complicated; a translation doesn't really have its « own copyright » but it has « some copyright of its own » (as such, the translator is not the sole author of the translation but just the co-author with the author of the original work, and depending on the country the translator can have fewer rights in the translation than the original author).
    But anyway, I don't see why and how copyright comes in here; translations are very specific editions but they are still editions (and there are editions far stranger than translations: should we have a different property when an editor transforms a poem in verse into a poem in prose? and vice versa? or when other significant changes are made to the original work? In some extreme cases, it is better just to consider that the modifications are so significant that this is an entirely new work, for instance An Iliad (Q548338) with Iliad (Q8275)).
    Cdlt, VIGNERON (talk) 12:03, 9 November 2017 (UTC)
A translation can definitely be a new work ("FRBR-style"), because, as @Nonoranonqui: patiently explained to me, the fundamental distinction between works is the "authorial responsibility"... So a translation is both a new work and based on/derived from/a translation of another one. But we probably don't need a new property: we can create an item for a translation, and use
  1. edition or translation of (P629)
  2. translator (P655)
  3. based on (P144)
If I'm not mistaken, these 3 properties give us what we need for understanding the relationship between a book and its translation. A query could look at authors, languages and what not to understand everything. I'm not a very good wikidatian, but simple and clear properties are better for everyone: we still have queries for complex relations between items. 80.181.62.189 16:46, 13 November 2017 (UTC)
Except it doesn't. Where a translation has multiple editions of its own, this model fails or is corrupted.  — billinghurst sDrewth 21:22, 13 November 2017 (UTC)
I fear that this part of the problem has no solution, from a theoretical point of view. A good translation is both a work and an edition, even for librarians. It's like the wave-particle issue in physics: it's both, depending on how you look at it and what your needs are. Wikidata works with items, which should be "unique". But books don't work that way. So we have to deal with the ambiguity of what we need. I suggest everyone read this very good free book from @Kcoyle:, she's a great librarian and information professional, and also she's one of us ;-) Aubrey (talk) 09:50, 17 November 2017 (UTC)
@Aubrey: Wrong, there is no obvious solution but we can define a solution with some advantages/disadvantages. We just need to have a logical solution which can be handled by any programming or query language, such as SPARQL. Snipre (talk) 15:15, 29 November 2017 (UTC)
@billinghurst: I think we need to distinguish 2 different problems:
Take the case
E1, an edition of work W1 with author X1 in language L1
E2, an edition of work W2 with author X2 in language L1
T1, a translation of edition E1 by translator X3 in language L2
T2, a translation of edition E2 by translator X4 in language L2
If an editor decides to create a new book containing T1 and T2 as
E3, an edition of work W3 by editor X5 containing T1 and T2
There is no problem to create new items for E3 and W3 if we consider that collecting different works is a kind of new work. The Wikidata model is able to handle that situation.
The second problem is to link E1 and W1 to T1 and T2.
To be correct, the information that T1 and T2 are parts of E1/W1 has to be recorded in W1 and not in E1. Then we have to answer the question: can we accept the following relations
* W1 has part T1
* W1 has part T2 ? Snipre (talk) 23:55, 18 November 2017 (UTC)
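Since the case above is stated as a set of relations, it can be written down as plain triples and queried. This is a sketch only: the relation names are illustrative labels rather than Wikidata property names, the helper functions are hypothetical, and the last two triples are the relations whose acceptability is being asked about, not settled practice.

```python
# Relations exactly as listed in the case above.
TRIPLES = [
    ("E1", "edition of", "W1"),
    ("E2", "edition of", "W2"),
    ("T1", "translation of", "E1"),
    ("T2", "translation of", "E2"),
    ("E3", "edition of", "W3"),
    ("E3", "contains", "T1"),
    ("E3", "contains", "T2"),
    # The two relations in question:
    ("W1", "has part", "T1"),
    ("W1", "has part", "T2"),
]


def objects(subject, relation, triples=TRIPLES):
    """Return every object linked from `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


def subjects(relation, obj, triples=TRIPLES):
    """Return every subject linked to `obj` by `relation`."""
    return [s for s, r, o in triples if r == relation and o == obj]


contents_of_e3 = objects("E3", "contains")
holders_of_t2 = subjects("has part", "T2")
```

Writing the relations this way makes it easy to see which items each translation is reachable from, which is the point the question turns on.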
I still don't understand the distinction between edition and translation, and even less the need for a distinction.
I'm not sure I understand the case you present here either; do you have a concrete example? For W1 has part T1, T2, there are already some cases, see this query. Is it what you were thinking about?
Cdlt, VIGNERON (talk) 08:53, 22 November 2017 (UTC)
The use of based on (P144) to link a translation to the document used as the original text for the translation is not the best choice: some books, like this one, are a translation of this one, which is based on the game Mass Effect (Q275960). So based on (P144) could be used twice on the same item, once for the translation relation and once for the topic relation. Better to avoid that situation. Snipre (talk) 15:15, 29 November 2017 (UTC)

Decameron

Does anyone know of a Linked Open Data dataset for the stories in Boccaccio's Decameron? We have articles on a couple of stories (Day 1 Tale 1 of the Decameron (Q18600581), Cymon and Iphigenia (Q26710491)) but no structure for the days and the stories for each day that I can find. It would be nice not to have to do this from scratch. - PKM (talk) 20:38, 13 November 2017 (UTC)

To start: brigata (Q43256358), days.
--- Jura 16:07, 17 November 2017 (UTC)
Oh excellent! I can add some references to these. - PKM (talk) 21:04, 19 November 2017 (UTC)
@Jura1: Wow, I have realized just how much deep structure you built here! I am stunned.
I have added novella (Q43334491): short prose tale popular in Renaissance Italy, progenitor of the short story <different from> the modern genre novella (Q149537): written, fictional, prose narrative normally longer than a short story but shorter than a novel, and made novella in the Decameron (Q43303440) a subclass. Much more work to do as time permits. Onward! - PKM (talk) 21:54, 19 November 2017 (UTC)
@PKM: I started a list at Decameron editions and translations and included what I found at enwiki/wikisource. Maybe it's possible to give it a reasonable coverage.
--- Jura 12:29, 24 November 2017 (UTC)
@Jura1: thank you for this page and thank you for creating items about editions and translations. As you've seen, I've made some corrections to fit the model of WikiProject Books; you reverted me but I see no reason not to use the model of WikiProject Books (especially as there is another discussion, which is leaning more toward keeping the current model). Cdlt, VIGNERON (talk) 14:16, 24 November 2017 (UTC)
It seems consistent with the current model, except maybe that the manuscripts should use "exemplar of" and not "edition of". I don't mind if you change that. I noticed that some of the items used the wrong "translation" item, thus the constraint violations. It's fixed now.
--- Jura 14:20, 24 November 2017 (UTC)
I see many points not respecting the model: there was an edition of an edition (but edition or translation of (P629) is not transitive; corrected now), there was a wrong instance of (P31) (thank you for fixing it), there are still several constraint violations (for manuscripts but not only; identifiers too, eg. something is wrong on Q16438#P1256) and in the end, there is a lot of missing information and some wrong information (like Q16438#P577: the property should be inception (P571) and the values should be better indicated, more precise and referenced with a better source, like the entry in the Treccani). Cdlt, VIGNERON (talk) 14:37, 24 November 2017 (UTC)
Q16438 isn't even on the list. The source at enwiki I was mentioning is at w:The_Decameron#Translations_into_English. It should be possible to find the same information in Wikidata. Other languages have similar lists.
--- Jura 14:54, 24 November 2017 (UTC)
Q16438 is the work; it's not on the list but it's the most important item of this list.
And please, learn how to use edition or translation of (P629) and has edition or translation (P747) as it was intended (between a work and an edition, never between two editions ; more information on Wikidata:WikiProject Books and on en:Functional Requirements for Bibliographic Records).
Cdlt, VIGNERON (talk) 15:39, 24 November 2017 (UTC)
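The rule stated above (edition or translation of (P629) links an edition to a work, never one edition to another) lends itself to a simple consistency check. The following is a sketch with hypothetical names, assuming the P629 links have already been extracted into a plain mapping from item to target; the item IDs other than Q16438 are invented placeholders.

```python
def find_edition_chains(p629_links):
    """Return (item, target) pairs where the P629 target itself carries a
    P629 link, i.e. where an edition points at another edition."""
    return [
        (item, target)
        for item, target in p629_links.items()
        if target in p629_links
    ]


# Q16438 is the work item mentioned above; "EDITION-A" and "EDITION-B"
# are placeholders, not real item IDs.
links = {
    "EDITION-A": "Q16438",      # edition -> work: fits the model
    "EDITION-B": "EDITION-A",   # edition -> edition: should be flagged
}
violations = find_edition_chains(links)
```

Each flagged pair would then need its P629 value repointed at the work item by hand, since the check cannot know which work was intended.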
No problem. I thought you were trying to present some argument and reference about the items on the list you were breaking. Yes, I think we all agree that Wikidata isn't complete yet and you are obviously invited to contribute. A list of French translations could be interesting ..
--- Jura 15:56, 24 November 2017 (UTC)

Additional properties

Why aren't these properties used at all: country of origin (P495) and after a work by (P1877)? --Infovarius (talk) 10:35, 16 November 2017 (UTC)

Hi, Infovarius
AFAIK, after a work by (P1877) is more for artworks (like an etching after a work by (P1877) an original painting)... for books, I'd probably use based on (P144) or inspired by (P941) - these should be applied on the work item, of course, not on the edition.
as for country of origin (P495), what is the point of giving a country of origin? the work has an author, and a language; the country in which the author lived at the time is not necessarily the origin of the work: see Voltaire (Q9068)'s works, written in French, but written in Prussia, and published in Prussia (because of French censorship)... should they have Kingdom of Prussia (Q27306) as country of origin (P495)? this seems rather inadequate.
on version, edition or translation (Q3331189) items there is already place of publication (P291) - why would you add country of origin (P495) ? --Hsarrazin (talk) 11:17, 16 November 2017 (UTC)
after a work by (P1877) has a different sense from based on (P144) or inspired by (P941) - it has value "person" not "work".
I understand difficulties like the one with Voltaire (Q9068). But what if everything is unambiguous: the work was created and first published in one country, which is also the author's country of citizenship. Why not mark this in the work item? --Infovarius (talk) 16:45, 17 November 2017 (UTC)
Have you referred to the creation proposal Wikidata:Property_proposal/Archive/31#P1877 ?  — billinghurst sDrewth 06:27, 18 November 2017 (UTC)
@Infovarius: do you have an example for after a work by (P1877)? based on (P144) and inspired by (P941) seem more than enough to me in all cases I can think of (if it is really after *a* work by a person it seems more accurate to directly link to this work instead of the person, plus see the hijacking of after a work by (P1877) which was not at all intended to be used in that way :/ ).
For country of origin (P495), I don't see the need: there are plenty of ways to find where a book comes from (directly with a property like place of publication (P291) - which is far more intuitive and easier to reference - or through the author(s)'s data). Is there a case where the value in country of origin (P495) would be different from the value in place of publication (P291)?
Cdlt, VIGNERON (talk) 10:05, 22 November 2017 (UTC)

I was checking: of the 94,017 items with instance of (P31) = book (Q571), 34,025 (36 %) have a country of origin (P495). Maybe it should be accepted on works, as place of publication (P291) is only for editions. It's redundant (which is bad in itself) but it would make it easier to do queries and other stuff (like using the redundancy to check consistency). Cdlt, VIGNERON (talk) 14:43, 24 November 2017 (UTC)

Language property

I am completely confused which property (language of work or name (P407) or original language of film or TV show (P364)) should be used for books, works and films and which is deprecated and will be deleted. User:Pasleim deletes P407 statements, sometimes deletes P364, User:VIGNERON deletes P364. Can you come to an agreement and explain to others? --Infovarius (talk) 20:35, 23 November 2017 (UTC)

original language of film or TV show (P364) is deprecated and in the process of deletion (for several months now, it's even written in the original language of film or TV show (P364) description) as it was meaningless most of the time (for multiple reasons, but thanks to the FRBR model). For information, I deleted original language of film or TV show (P364) only on items about 'edition' *and* when there was already a language of work or name (P407) with the exact same value (about 200 items IIRC). So globally, never use original language of film or TV show (P364) and always use language of work or name (P407). The first removal you cite was an obvious mistake. Cdlt, VIGNERON (talk) 21:12, 23 November 2017 (UTC)
Consensus was reached on WD:PFD to merge original language of film or TV show (P364) into language of work or name (P407). However, members of the WikiProject Movies insist on keeping both properties for movies. If you think this is confusing, your comment is highly appreciated on Wikidata:Properties for deletion#Closure of stale thread. --Pasleim (talk) 08:30, 24 November 2017 (UTC)
@Jura1: what are you talking about? The plan is quite clear and logical, see Wikidata:WikiProject Books. And AFAIK, information is not lost (at least not by me, I checked that the information was already there before deleting the deprecated property). Cdlt, VIGNERON (talk) 09:17, 24 November 2017 (UTC)
For items like Q17352560, there were at least two clues that it is an edition: 1. not in a language spoken by the author and 2. link to Wikisource. I improved the item (which wasn't following the plan at all, so it is illogical to use this item as an example of an alleged failure of the plan); I think it's clear now.
Cdlt, VIGNERON (talk) 09:17, 24 November 2017 (UTC)
We were looking for a conversion plan. Not that it matters now, we already lost the information in relation to books.
https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 was correct when it was created/edited, but the change of the property on other items made us lose the information that it was just the language of the edition. The same probably applies to all similar items. You will probably need to find a new source to rebuild the information.
--- Jura 09:27, 24 November 2017 (UTC)
https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 was wrong since the beginning. It had a sitelink to Wikisource but was instance of a work. --Pasleim (talk) 09:45, 24 November 2017 (UTC)
Somehow I got the impression the contributor who made it is an expert in the field. So if the approach isn't clear to them, it's unlikely to scale well.
--- Jura 09:54, 24 November 2017 (UTC)
@Jura1: again: what on Earth are you talking about? this edit is clearly and entirely correct, what is wrong with it? (besides the obvious missing properties on the same item but that's beside the point, the item is better after this addition ; and why are you even mentioning it? it's very loosely related to the problem here). What information is lost exactly? For the conversion plan, it's quite easy: delete all original language of film or TV show (P364) and replace them by language of work or name (P407) with the same value (and in bonus: check the instance of (P31) and other properties like edition or translation of (P629) and country calling code (P474)). Cdlt, VIGNERON (talk) 10:11, 24 November 2017 (UTC)
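The conversion step described above (remove each original language of film or TV show (P364) value and keep it as language of work or name (P407)) can be sketched on a simplified item representation. This is an illustration only: the function name and the dict-of-lists layout are assumptions, not the Wikibase data format or the code of any bot that actually ran, and it does not perform the follow-up checks on instance of (P31) and the other properties that the plan also calls for.

```python
def convert_language_claims(claims):
    """Return a copy of `claims` in which every P364 value has been moved
    to P407, without duplicating values already present there."""
    converted = {prop: list(values) for prop, values in claims.items()}
    old_values = converted.pop("P364", [])
    if old_values:
        merged = converted.setdefault("P407", [])
        for value in old_values:
            if value not in merged:
                merged.append(value)
    return converted


# Illustrative item: instance of book (Q571) with French (Q150) recorded
# under the deprecated property only.
before = {"P31": ["Q571"], "P364": ["Q150"]}
after = convert_language_claims(before)
```

Because the function works on a copy, the original claims stay available for comparison, which is one way to confirm that no value went missing in the move.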
I think you are confusing things. Pasleim is stating that the item was wrong to begin with. At least you seem to be satisfied with the approach that seems to be applied for books.
--- Jura 10:16, 24 November 2017 (UTC)
I don't think I'm confusing things, but clearly I'm confused by you. The item Q17352560 was wrong in the beginning as it was empty and missing a lot of properties, and the instance of (P31) was too general (but reminder: it was created back in 2014). In 2017, @Hsarrazin: added a language of work or name (P407) and it was a good thing. The only « mistake » (but can we really call it that?) is that she didn't add other properties or correct the P31, but the edit in itself was good. In the end, none of that really matters as original language of film or TV show (P364) is not at all involved here.
Can we move on and use a more relevant example? For instance Q19157120 and the P364 deletion I made two days ago. Is there anything you consider as lost here? and why? (I don't see any loss but maybe I'm missing something). If not, do you have an explicit example?
Cdlt, VIGNERON (talk) 10:25, 24 November 2017 (UTC)
Apparently Pasleim and you disagree on https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 . I'm not sure what I can add to help you with this. I'm not aware that an edit is or can be considered incorrect if one doesn't add more statements.
--- Jura 10:31, 24 November 2017 (UTC)
I actually fully agree with VIGNERON. P31 needed to be corrected which wasn't done till today, but this correction should have happened independently of the language property merge. --Pasleim (talk) 10:37, 24 November 2017 (UTC)
Well, P31 on https://www.wikidata.org/w/index.php?title=Q17352560&oldid=563333657 being too general and wrong isn't really the same. It's not independent of the language properties (original language of work and language of edition) because the use of P407 made it clear that it may not have been the original language of the written work. Once all written works just use language of edition, it's no longer clear.
--- Jura 10:43, 24 November 2017 (UTC)
It's maybe not entirely independent (everything is connected in this Universe) and indeed on this particular example it had to be corrected/completed but now, it seems good to me for this item. Do you have other examples where it's unclear or where allegedly « information was lost »? PS: language of work or name (P407) is *not* « language of edition ». Cdlt, VIGNERON (talk) 10:53, 24 November 2017 (UTC)
Ideally, the conversion plan would have taken care of such problems. As Pasleim is doing it, maybe he can detail.
--- Jura 11:00, 24 November 2017 (UTC)
I'm not the only user who developed a conversion plan and I'm not the only user doing the conversion (btw, I'm not even a member of this WikiProject).
I think you had too high expectations of the conversion. language of work or name (P407) was not and is not "language of edition". Whether an item is an edition or a work is defined by P31, P279, sitelinks and external identifiers. If these values were set wrong, they are still wrong now after conversion, but it didn't lead to any information loss. --Pasleim (talk) 11:22, 24 November 2017 (UTC)
So if P364 is going to be deprecated I don't understand why User:Pasleim is massively deleting P407 in favor of P364? --Infovarius (talk) 15:27, 24 November 2017 (UTC)
The current rules of WikiProject Movies say to use P364 for movies, therefore I remove P407 in cases where it is redundant to P364. Whether P364 is going to be deprecated depends on whether or not users accept community consensus. --Pasleim (talk) 18:38, 24 November 2017 (UTC)
We reached consensus to deprecate P364, but we never reached a consensus about the manner in which it would be deprecated, or how the data currently in the property would be handled. The most obvious problem is that the property is explicitly for language of a work, and not for language of editions, nor does the process of deprecation attend to the issue of marking source languages for translations or editions of translations, which is not always the same as the language of the ultimate source (=work). Nor does it solve the problem that works themselves do not have a language; only individual editions / copies will have a language. A "work" refers to the creative piece independently of any specific copy. --EncycloPetey (talk) 05:09, 25 November 2017 (UTC)
Why do you think that a work doesn't have a language?? Usually literary works are created in one and only one language, which is the language of the work. And editions in other languages are just translations from the original language (translation itself can be regarded as creation of a new creative work, in a different language). Infovarius (talk) 19:51, 29 November 2017 (UTC)
Sorry ? of course all works (textual works obviously) must have a language. But it is language of work or name (P407) not original language of film or TV show (P364). --Hsarrazin (talk) 20:00, 29 November 2017 (UTC)
@Infovarius: But the work data item is for the work as a whole, meaning every edition and not any specific edition. A work can appear in any language to which it is edited or translated. We have chosen to eliminate the "original language" property, and now have no means of indicating the original language unless there is a "first edition". This itself is a problematic issue, as some works have no known first edition, and some have a first serialized edition that predates the first bound (book) edition, etc.
We also have no property for marking "language of edition" or "language of translation". We only have a property for "language of work". --EncycloPetey (talk) 02:06, 30 November 2017 (UTC)
Maybe I am not understanding, but if we are talking about the "work" the language P407 is used, and it replaces P364. Editions have P407, and have no requirement for original language as you refer back to the work.  — billinghurst sDrewth 08:50, 30 November 2017 (UTC)
Billinghurst: Why would an edition be marked with language of work or name (P407), since that property is explicitly for the language of the work? Editions are not works. Also, how do we mark the source language for translations and for editions of translations? We currently have no logical means of doing that. Yes, there can be pointers back to a "work", but for translations we cannot agree on whether the translation is an "edition" or is a "work" and needs its own edition data items.
And, yes, current practice puts language of work or name (P407) on works, but that makes no logical sense. Language is not a property of a work; it is a property of an edition. The language can differ in various translations/editions, so it is not a property native to the work. An item's properties must be invariant, or they are not properties of that data item. "Author" is a property of a work, because a work will always have that author, and this is why we do not replicate the author information on all the data items for the editions. But the date of publication varies with every edition, so we do not put "date of publication" on the work item, but rather on the individual items for each edition. The work instead gets a "date of first publication", or no date at all. The "language" property is in the same category as "date"; it varies with editions/translations, and is not inherent to the work. Yes, a work has an original language of composition, but we've decided to eliminate that property. --EncycloPetey (talk) 14:35, 30 November 2017 (UTC)
language of work or name (P407) on a work is the language in which it was originally composed by the author... How can you say that it makes no sense to put a language on a work... the language is intrinsic to the work... this way, when an edition is in the same language, it means it was not translated, whereas when it is different it means it is a translation... Work notice at BnF (for ex.)
what characterizes a work is:
  1. an author,
  2. a title (sometimes conventional),
  3. a language,
  4. a date of creation.
Without a language, how can you say that Shakespeare wrote in English, Molière in French or Goethe in German ? --Hsarrazin (talk) 15:14, 30 November 2017 (UTC)
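The comparison described above (an edition in the same language as its work is untranslated, one in a different language is a translation) can be sketched as a query. This is only an illustration, assuming that both the work and the edition carry language of work or name (P407) and that the edition points to the work with edition or translation of (P629); the variable names and the LIMIT are arbitrary choices.

```sparql
# Sketch: editions whose language differs from the language of their work,
# i.e. candidates for being translations.
SELECT ?edition ?editionLabel ?editionLanguageLabel ?work ?workLabel ?workLanguageLabel WHERE {
  ?edition wdt:P629 ?work ;              # edition or translation of
           wdt:P407 ?editionLanguage .   # language stated on the edition
  ?work wdt:P407 ?workLanguage .         # language stated on the work
  FILTER(?editionLanguage != ?workLanguage)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```

Items that lack P407 on either level simply do not appear, so the result says nothing about them.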
Pretty much my point of view. I would even take it a step further and say that all language belongs on a work, not on edition. Though to do that I have to go back to my argument that each translation is a work too. Any edition of a work, or of a translation, has to be in the same language of its respective parent.  — billinghurst sDrewth 17:02, 30 November 2017 (UTC)
I may be wrong but I think we are mixing very different definitions and senses of the word work here. The sense of work in FRBR (which I will write as work_frbr) is very narrow. Editions (and by extension translations, which are expressions_frbr that we defined to be equivalent to edition_wikidata) are not work_frbr but they are work. When P407 says work, I believe this is lato sensu, not stricto sensu. billinghurst: I hear your argument but I feel this is unnecessary or at least I don't see the need (and meanwhile, I see a lot of potential trouble, especially as languages are not always clearly delimited, one can argue that Shakespeare and Molière were not writing in English or French but Early Modern English (Q1472196) and Classical French (Q3100376)). Cdlt, VIGNERON (talk) 17:20, 30 November 2017 (UTC)
Just to gum things up a bit more, the newest version of FRBR, which calls itself the Library Reference Model - LRM adds a new attribute "LRM-E2-A2 Representative expression attribute" - the "representative expression" being the hedge term for "original work." This came out of some (admittedly limited) studies that showed that users of bibliographic data often had a special "feel" for all editions being in relation to the original. This hasn't been entirely incorporated into library cataloging, but the interesting thing to me is how it coincides with how works are treated in Wikipedia entries, often with a fair number of data elements relating to the original. The "LRM" hedges on this because libraries are often cataloging works where either the original is unknown or the cataloger is just not reasonably going to have the time to do the research on it. Rather than put this at a work level the LRM lets you set one of the expressions as the "representative" one. It looks to me like WD could tag one edition as the "original", but it would probably also be useful to carry the original language in all editions. That data is available in the library record format for all published language works.
Note, I do not deny that letting people set one edition as the original could be dangerous, but it also could be very useful for many works. Kcoyle (talk) 19:09, 1 December 2018 (UTC)

Ancient Greek works

@billinghurst, Hsarrazin, VIGNERON, Snipre: There are currently 13 items left which use both P364 and P407.

SELECT ?item ?itemLabel WHERE {
  ?item wdt:P364 []; wdt:P407 [] .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

EncycloPetey and I disagree on how to apply the guidelines on the front page of this WikiProject. Can one of you help us with these items? --Pasleim (talk) 09:34, 30 November 2017 (UTC)

@EncycloPetey, Pasleim: what is the disagreement exactly? Didn't we all agree that original language of film or TV show (P364) is deprecated and shouldn't be used? (and if it weren't used for movies too, the property would probably be already deleted for months)
This is maybe beside the point but I've looked at the results and there is something wrong: The Comedies of Aristophanes (Q21286489) is indicated as a work in instance of (P31) but much of the data indicates this is in fact an edition (the 1853 edition by Hickie according to the link to WS and the publication date (P577)). All the items seem to be in the same situation.
Cdlt, VIGNERON (talk) 11:22, 30 November 2017 (UTC)
IMO, original language of film or TV show (P364) should be removed from The Comedies of Aristophanes (Q21286489) but EncycloPetey reverted that change 4 times during the last months [2] [3] [4] [5]. I also think that it should be marked as edition or translation but also that edit was reverted by them [6]. --Pasleim (talk) 12:28, 30 November 2017 (UTC)
@VIGNERON: If you still don't understand the disagreement after months and months of discussion, then explaining it to you all over again isn't going to do any good, is it? Please look at all the previous discussion. There is a lot of it already.
Re: The Comedies of Aristophanes (Q21286489). If you believe this is an edition, then what is it an edition of? What is the work? --EncycloPetey (talk) 14:37, 30 November 2017 (UTC)
@EncycloPetey: I saw the discussions and was wondering if there was a new argument because I thought (wrongly apparently) we were over this, almost all original language of film or TV show (P364) has been removed, your 13 texts are the only ones left.
True, the case of an anthology is a bit strange, but since you use properties for editions, it seems better to say it's an edition (or subclass of edition). It should be checked, but an item for s:en:Comedies of Aristophanes would be good for the work, wouldn't it? The other way around, I can ask you almost the same question: « If you believe this is a work, then what is its edition? ». And on other items like The House of Atreus (Q30349006) you add instance of (P31) = translation (Q7553), which is weird as translation (Q7553) is an action; you probably mean translated text (Q39811647), which is a subclass of version, edition or translation (Q3331189). Cdlt, VIGNERON (talk) 14:56, 30 November 2017 (UTC)
Why would you think it was resolved? No consensus on what action to take was ever reached. We agreed to deprecate the one property, but never agreed on how to go about that or how to preserve the information it indicates.
Re: the anthology: huh? No, s:en:Comedies of Aristophanes would not be good for the work. That's a disambiguation page for multiple works that bear the same title. There is no work for this to come from. This is, as you say, an anthology composed entirely of components that are editions, but itself has no parent work to come from. As I keep saying, the data model we are using is both flawed and incomplete, so an appeal to that flawed and incomplete model is merely circular reasoning. --EncycloPetey (talk) 16:18, 30 November 2017 (UTC)
I thought it was resolved since all original language of film or TV show (P364) has been removed without trouble (except these 13 items). Is there anyone else except you who disagree?
Why not use s:en:Comedies of Aristophanes (and replace the s:en:disambig with s:en:Template:Versions, as it is closer to the second)? If we take the criteria given by Hsarrazin, all these anthologies have the same author, the same title, the same language, and the same date of creation. It seems to fit the bill. True, the content is different for each edition, but a work has no content so it doesn't really matter (and other properties are already here to make that explicit). Another solution is to create a specific work for each different version (a bit overkill but it works too).
FRBR may not be perfect and there always will be tricky cases but this is the best system I know. Plus, it's quite well documented and this project agreed on to use FRBR. Do you know a better cataloguing system?
Cdlt, VIGNERON (talk) 17:04, 30 November 2017 (UTC)
@EncycloPetey: "...the data model we are using is both flawed and incomplete..." This is perhaps true but where are your contributions to solve that ? Perhaps it is time to act as a contributor and to propose a better model. Can we hope once to see contribution to a model ? Snipre (talk) 21:17, 30 November 2017 (UTC)
@Snipre: So, you agree that the data model is both flawed and incomplete, and are willing to explore change? This is the first indication that you or anyone else has made that change might be possible. If we can get more of the community to agree to this, then we will be able to proceed. Up to now, everyone has been pushing to spread the flaws. --EncycloPetey (talk) 21:21, 30 November 2017 (UTC)
@EncycloPetey: Read again what I said: "This is perhaps true that the data model we are using is both flawed and incomplete". But as you never show how to complete or improve the current model how can I judge what is missing ? Until you propose something, and this is something I ask you to do since several weeks, the current model is the best we have and we have to use it in order to be coherent inside WD. I prefer to have a bad solution than only criticisms saying we can do better. Snipre (talk) 21:30, 30 November 2017 (UTC)
Since you're not willing to move forward or admit change, it's disingenuous to criticize others for not proposing solutions. If you're not going to implement the ideas of others, there's no reason to propose those ideas. --EncycloPetey (talk) 21:34, 30 November 2017 (UTC)
@EncycloPetey: Where do you read that I am not ready to change my mind? Please link to one of my comments saying that. You are not being logical, because you asked people to change their mind BEFORE showing them any reasons to do it. We have a model which needs to be extended, but we have something real. And you, what do you have? You never presented anything, so for me you have nothing to propose. WD is not a poker game, so put your cards on the table or leave the table. Snipre (talk) 21:54, 3 December 2017 (UTC)
  •   Comment I agree with Pasleim on these. We can call the original language by reference to the parent work rather than trying to replicate it in every edition. For that set of works, that smattering of the instance of (P31) is just getting ugly ... book/translation/edition. Compilations don't fit the model; I wonder how we go with a "Greatest Speeches of ..." compilation, it is going to shred your models. /me throws his hands into the air, and leaves it to the experts. I think I will stick with doing editions.  — billinghurst sDrewth 17:20, 30 November 2017 (UTC)
  • The correct way to handle a case like The Comedies of Aristophanes (Q21286489) is to consider it as a normal book: we need a work item and an edition item. The work item for this book will contain the links to the work items of the works composing the book.
So having this case:
  • A1: a work of author X1 in language L1 represented in WD by QAAA
  • A2: a work of author X2 in language L2 represented in WD by QBBB
  • D1: a combined edition of A1 and A2 by translator X3 in language L3
To represent D1 in wikidata we need 2 items:
QXXX, Work item for D1 with the following statements:
QYYY, Edition item for D1 with the following statements:
Do you agree with the proposed model ? Snipre (talk) 21:17, 30 November 2017 (UTC)
@billinghurst, Hsarrazin, VIGNERON, Snipre, EncycloPetey, Pasleim: With my proposition we solve the question of original language of film or TV show (P364). Snipre (talk) 21:20, 30 November 2017 (UTC)
With which proposition is that? I see nothing that solves the questions currently under consideration. What we need are two new properties: (1) language of composition (or first performance, or first publication), and (2) language of edition (which might be adaptable from "language of work"). And for translations we need some third item to indicate language of source text, and/or the identity of the source text, from which the translation was prepared. --EncycloPetey (talk) 21:24, 30 November 2017 (UTC)
@EncycloPetey:Can't you use your capacity to infer ? This is one principle of database: use relations to deduce information not written. Example, if I said that all dogs are mammals and Floppy is a dog, can't you deduce that Floppy is a mammal even if I don't say it ?
So if you don't have the language of the original text in an item defining a translation, go to the item of the corresponding original text. The original language of D1 is the language of A1 and A2, so I have to extract the language value from QAAA and QBBB.
If I want to know the gender of the author of The Knights (Q1215817), why do I have to look in item Aristophanes (Q43353) and not in The Knights (Q1215817) ? Snipre (talk) 21:54, 30 November 2017 (UTC)
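The inference described here can be written as a query that reads the language and the author's gender from the linked items instead of from the edition. This is a rough sketch only: it assumes that editions of The Knights (Q1215817) are linked with edition or translation of (P629), that the work carries language of work or name (P407) and author (P50), and that the author item carries sex or gender (P21). The variable names are illustrative.

```sparql
# Sketch: nothing is stored on the edition except the link to its work;
# language and author data are followed from there.
SELECT ?edition ?editionLabel ?originalLanguageLabel ?authorLabel ?genderLabel WHERE {
  VALUES ?work { wd:Q1215817 }                 # The Knights
  ?edition wdt:P629 ?work .                    # edition or translation of
  OPTIONAL { ?work wdt:P407 ?originalLanguage . }
  OPTIONAL {
    ?work wdt:P50 ?author .                    # author
    OPTIONAL { ?author wdt:P21 ?gender . }     # sex or gender
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

The sketch only works where the chain of links exists and points at the text the translation was actually made from, which is the point contested in the replies.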
I'm sorry that you still can't understand the problem after all the discussion we've been through. Your analogy is flawed for all the reasons we've discussed elsewhere. If all dogs are mammals then that is an invariant quality. It does not change. Likewise, the gender of an author does not (usually) change, so there is no need to replicate it or mark it elsewhere because it is an invariant quality. But language of a work varies and does change, and it is context dependent upon the particular edition, translation, or performance of that work, so it is not an invariant property. Likewise "date" of a work depends upon the specific edition, translation, or performance. You cannot deduce anything when the values are inconstant.
And we've already been through the problem of identifying "original" texts. There is no means of marking that reliably. A translation of a text might be made from the "original", or it might be made from a derivative text in another language. --EncycloPetey (talk) 01:13, 1 December 2017 (UTC)
Please (again) don't mix work and work_frbr: work may have several languages (and even that is a bit dubious to me) but work_frbr clearly always has only one language (the one inside the head of the author). For translation (Q7553) vs. second-hand translation (Q23808533), this is a different and separate matter (already discussed in multiple sections of this page): whether an edition of Hamlet in German has been translated from one of the first editions in Early Modern English or from a different edition in English or French doesn't change the fact that Shakespeare was thinking his work_frbr in Early Modern English or that the first editions were in that language; this is clearly invariant. Cdlt, VIGNERON (talk) 14:27, 1 December 2017 (UTC)
Edit: my mistake, in FRBR, work_frbr has no language; this is a property of expression_frbr only. Cdlt, VIGNERON (talk) 14:48, 1 December 2017 (UTC)
@Snipre: it seems good to me and it seems to be more or less what the FRBR recommends : FRBR 2008 (look on pages 30, 67 and passim, could you take a look and confirm if it fits or not?). Cdlt, VIGNERON (talk) 14:27, 1 December 2017 (UTC)
  •   Comment If P407 by itself isn't sufficient to express the information, it probably needs qualifiers. If cases are rare, this might scale. If these are frequent, a solution with a dedicated property might be needed. I don't see how it helps us determine the meaning of a statement if we just say that one solution is "correct" or "what mother recommends implicitly". Once we have a solution, one can try to determine if it can be interpreted in this or that scheme.
    --- Jura 14:33, 1 December 2017 (UTC)
    @Jura1: the solution chosen is quite simple: P407 is the language of the item, if P407 is used on an item with P31 = work (or subclass of), then this is the language of the work, if P407 is used on an item with P31 = edition, then this is the language of the edition (and if P407 is used on something else, then look at the P31). We can use qualifier to make it more explicit and duplicate the P31 but honestly, you just have to look at the P31 to already infer a clear answer. Cdlt, VIGNERON (talk) 14:48, 1 December 2017 (UTC)
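A small sketch of the reading rule described above: fetch P407 together with P31 so that the class tells you whether the language is that of a work or of an edition. The two items are ones mentioned in this discussion; the interpretation of each class is left to the reader, and nothing here is an agreed convention.

```sparql
# Sketch: show each item's class next to its language statement.
SELECT ?item ?itemLabel ?class ?classLabel ?language ?languageLabel WHERE {
  VALUES ?item { wd:Q21286489 wd:Q30349006 }   # The Comedies of Aristophanes, The House of Atreus
  ?item wdt:P31 ?class ;                       # instance of
        wdt:P407 ?language .                   # language of work or name
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

An item with several P31 values returns one row per class, which makes mixed book/translation/edition typing easy to spot.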
@VIGNERON: That's what I have been saying for several months: a compilation of works is a new work. But here I see a potential problem for some particular cases: if I have a work for one original text I don't have a work for a corresponding translated edition of that work, so if someone decides to publish a new book containing the original text and the translated text, then the proposed model requires a work item for the translation. Does it mean we need a work item for every translation? Perhaps not, but we have to find a solution for this case. For books, this case is rare, but for poems it is more frequent. Snipre (talk) 02:30, 3 December 2017 (UTC)
@Snipre: It's not uncommon for texts at all, and is common far beyond poetry. It applies to drama, correspondence, essays, and most of all it applies to a high proportion of classical literature (Greek, Latin, Chinese, etc.) where parallel texts are common and also anthologies of translations are common. --EncycloPetey (talk) 21:22, 3 December 2017 (UTC)
@EncycloPetey: And ? Do you have a solution or a proposition ? Snipre (talk) 21:44, 3 December 2017 (UTC)
A large part of our problem, in a nutshell, is that we are limited to a binary system of [ "work" or "edition/translation" ]. Translations are neither wholly one or the other, yet they do have editions. So, we need a third option of "translation" that effectively lies in between the levels of "work" and "edition". That doesn't solve all the issues, but would be a positive step if we could implement it. --EncycloPetey (talk) 21:51, 3 December 2017 (UTC)
@EncycloPetey: Good. This is a first step. Can you please provide then the relations between the work, the edition in original language and your new class translation in order to see how complex the model is. Do we have all properties or do we need to create some new ones ? Snipre (talk) 22:06, 3 December 2017 (UTC)
I'm not sure what you're asking or what you're driving at. I know "relation" in the mathematical sense and the biological sense, but think it must have a slightly different meaning the way you are using it. How do you expect a response to be framed? --EncycloPetey (talk) 22:13, 3 December 2017 (UTC)
We have 3 classes (work, edition, translation) so we need at least 3 relations and possible 3 others if we want to have reverse properties. Snipre (talk) 23:41, 3 December 2017 (UTC)
The approach at Wikidata:Lists/Decameron editions and translations works out quite well. We just need to find a good way to express what language something was translated from.
--- Jura 07:48, 4 December 2017 (UTC)
With your approach one can easily determine the language something was translated from by following the edition or translation of (P629) chain. The concern is that you are using edition or translation of (P629) to link both edition with translation and translation with original work. But maybe widen the scope of P629/P747 is more comprehensible than creating a handful new properties. --Pasleim (talk) 11:23, 4 December 2017 (UTC)
Agree for this sample, but it's more complicated for EncycloPetey's. I noticed that there was some inconsistency in the labels of P629/P747. Maybe "translation" should be included in both properties and all languages.
--- Jura 16:45, 4 December 2017 (UTC)
RE Pasleim: "With your approach one can easily determine the language something was translated from by following the edition or translation of (P629) chain." But that won't always work. Assuming that the chain exists, and is complete, and isn't confounded by more than one layer of translation/edition (not all these conditions are always met), all that the end of the chain may tell you is the language of an ultimate work, not necessarily the language from which the translation was made. The English book The Waning of the Middle Ages is ultimately a translation of a Dutch work, but the translation was made from an unpublished French translation that was radically different from the original Dutch. I also have a book I'm working with on Wikisource where the original language of composition was German, but the English translation was published first because of the death of the author before the German could be published. So the language of composition and the language of first publication are not the same. Our methods for indicating basic information like author, date, and language are too simplistic to cope with a lot of the data we need to record. --EncycloPetey (talk) 17:06, 4 December 2017 (UTC)
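For reference, following the chain can be sketched with a property path. This lists every item reachable from a starting edition through edition or translation of (P629), with its language where one is stated, so intermediate translations that have their own items show up alongside the ultimate work. The starting item is the Decameron edition discussed on this page; the rest is an assumption-laden illustration, not an agreed method.

```sparql
# Sketch: walk the P629 chain upwards from one edition and report
# the language stated on each item along the way.
SELECT ?ancestor ?ancestorLabel ?language ?languageLabel WHERE {
  VALUES ?start { wd:Q43475477 }
  ?start wdt:P629+ ?ancestor .                  # one or more P629 steps
  OPTIONAL { ?ancestor wdt:P407 ?language . }   # language of work or name
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

The query returns the reachable items unordered, and it cannot reveal a source text that has no item, such as an unpublished intermediate translation.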
────────────────────────────────────────────────────────────────────────────────────────────────────
A main principle of database design is to avoid duplicate data. There is a lot of literature on the web explaining why duplicate data is bad in a database. If we agree to aim for a good database design, we store the language of the original work only once, namely on the item about the original work. The same with the author. The question left to answer is then how to link editions/derived works/translations with the item about the original work. We currently have edition or translation of (P629)/has edition or translation (P747) and has part(s) (P527)/part of (P361) and published in (P1433). If you think this is not sufficient, please make proposals for new properties. --Pasleim (talk) 17:38, 4 December 2017 (UTC)
WMF DE seems to be for duplicates (at least Wikibase supports symmetric constraints and explicitly doesn't develop better alternatives). WMF gives grants for triplicate schemes .. So I think with an occasional supplementary statement we are still much closer to the ideal.
--- Jura 18:50, 4 December 2017 (UTC)

Edition of an edition

Hi,

It seems obvious and trivial to me, and as documented on the property page, the main page here Wikidata:WikiProject Books and in the FRBR, that an « edition of an edition » is not possible and doesn't even make sense (an edition is by definition the thing edited from a work). So logically, I corrected it on Декамерон (Q43475477) but Jura1 reverted me and is asking for « references ».

For me it's as obvious as « the sky is blue » or « water is wet », I don't know what more to explain... Any idea, remarks, etc. ?

Cdlt, VIGNERON (talk) 16:01, 24 November 2017 (UTC)

PS: to be sure, I checked again in the FRBR, it's clearly stated « Translations from one language to another, musical transcriptions and arrangements, and dubbed or subtitled versions of a film are also considered simply as different expressions of the same original work. » (FRBR, pages 17-18)

Q43475477 is an edition of a 19th-century translation: Q43169039.
Similar to Q43517456 which is an 1860 edition of the 15th century translation Q43516994.
Maybe the Commons sitelinks shouldn't be on these items.
The objective is to provide a full list of translations at Wikidata:Lists/Decameron editions and translations
similar to w:The_Decameron#Translations_into_English.
--- Jura 16:14, 24 November 2017 (UTC)
I don't speak Russian well enough to know whether the edition Q43475477 is based on the edition Q43169039 (BTW, they're both editions, as translations are editions). But in any case, the property to indicate this information is based on (P144), not has edition or translation (P747)/edition or translation of (P629). For the list, it is easier to create the exact same list when all editions are linked to the same work (the SPARQL request would be shorter with just P629 and not P629+, which doesn't really make sense as P629 is not transitive).
Cdlt, VIGNERON (talk) 16:20, 24 November 2017 (UTC)
The way the FRBR Group 1 classes have been implemented on Wikidata does not allow to express that two editions (frbr:Manifestation) are the embodiments of the same frbr:Expression. based on (P144) is not adequate to express this, as it could also be used to express that an adaptation is based on a particular translation. I'm not sure to what extent it is necessary in Jura1's example and use case to actually be able to express such subtleties. I can however understand that some confusion may arise if one has the distinction between frbr:Expression and frbr:Manifestation in mind. --Beat Estermann (talk) 00:25, 26 November 2017 (UTC)
It seems that the labels/definitions of edition or translation of (P629) aren't the same in all languages. At some point, "translation" was added to English ([7]) and some other languages. The approach chosen for Decameron seems consistent with current constraints.
--- Jura 13:29, 26 November 2017 (UTC)
« edition of » and « edition or translation of » are the same thing, as translations are editions (at least until now in this project and in FRBR); the precision in the label is just a way to make this more explicit for users. Formally there are no constraints right now to forbid an edition of an edition, but there should be, as (I feel) this is not at all in the spirit of this project, where 'edition of' is supposed to link only the Work and Edition levels, not items inside the Edition level. Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)
  Comment This seems to be what I have been addressing at #Are we conflating editions and translations; or are we missing translations as their own works?. The generic "translation" can be at the work level, or at the edition level. Like the misuse of "book" which can relate the creative work, or a specific edition of the work. The difference in the jargon is not important to most people.  — billinghurst sDrewth 23:04, 26 November 2017 (UTC)
@billinghurst: exactly but even if we choose to consider editions and translations to be different things (which I think to be a bad and unnecessary idea, as most databases and references consider translations to be editions) then we would need a new property 'translation of' and in this case we wouldn't have edition of editions, right ? Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)

For information, right now there are 341 results for edition of edition:

SELECT ?item1 ?item1Label ?item2 ?item2Label ?item3 ?item3Label WHERE {
  ?item1 wdt:P629 ?item2 .
  ?item2 wdt:P629 ?item3 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Some of them seem to violate multiple constraints (including a lot of manuscripts, which probably should use exemplar of (P1574) instead). What should we do with these items?

Cdlt, VIGNERON (talk) 23:03, 28 November 2017 (UTC)
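To see what kinds of items sit at the bottom of these chains (for example, how many are manuscripts), one possible breakdown is to count them by their instance of (P31) value. This is only a sketch built on the query above; it assumes the items carry a P31 statement, and items without one are not counted.

```sparql
# Sketch: group the "edition of an edition" items by class.
SELECT ?class ?classLabel (COUNT(DISTINCT ?item1) AS ?count) WHERE {
  ?item1 wdt:P629 ?item2 .
  ?item2 wdt:P629 ?item3 .
  ?item1 wdt:P31 ?class .        # instance of
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?class ?classLabel
ORDER BY DESC(?count)
```

An item with several P31 values is counted once under each class, so the counts can add up to more than the number of items.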

@VIGNERON: For exemplar we need to use exemplar of (P1574) and to link the exemplar to the edition or to the work if no edition exists. Snipre (talk) 13:10, 29 November 2017 (UTC)
The main problem comes from this kind of item, Septuagint manuscript (Q7452368): this is typically a Wikipedia structure which doesn't correspond to the Wikidata model and creates an additional layer in the instance/subclass classification without having any meaning in the FRBR classification.
Second problem is this item Septuagint (Q29334) or Vulgate (Q131175) which are defined as work but described as translation of the Bible. We need to find a solution for these items. Snipre (talk) 14:30, 29 November 2017 (UTC)
The Septuagint and Vulgate are effectively anthologies of translations. The Vulgate and Septuagint both have the same collection of translated texts, but "The Bible" can vary in what it contains depending upon the form of Christianity (Coptic, Ethiopian, Orthodox, Protestant) or Hebrew (which will not contain the New Testament). So "The Bible" is not a fixed text nor a definite anthology. --EncycloPetey (talk) 21:31, 30 November 2017 (UTC)

frbr:Expression

Hi,

I'm presently working on the ingest of a pilot dataset of performing arts productions. The Expressions that are to be described in the context of the performing arts are not necessarily editions (often, they have not been published, but we do know who the translator or the adapter was, we know their language, etc.). I would therefore suggest to create a separate class "Expression", corresponding to frbr:Expression, and to slightly modify the description of version, edition or translation (Q3331189) on the Books project page: in fact, version, edition or translation (Q3331189) seems to correspond first and foremost to frbr:Manifestation.

When describing editions from the perspective of physical artefacts, as is most common in the library world, the approach that was used so far, employing the classes version, edition or translation (Q3331189) and creative work (Q17537576), would be maintained as is. However, when describing expressions from the perspective of their content, as is the case in the theatrical databases I'm working with, a more refined data model could be used which distinguishes between the four FRBR Group 1 classes.

I've described the rationale in more detail here and am looking forward to your comments. --Beat Estermann (talk) 00:05, 26 November 2017 (UTC)

@Beat Estermann:
On this page it's indicated: « Not to complicate too much, we didn't use the FRBR terms "expression" or "manifestation", as the boundary between the definitions it's not easy to grasp. So we used "edition" instead, collapsing those 2 FRBR layers in 1 (other conceptual frameworks similar to FRBR (like Bibframe) collapse those 2 layers too). Thus the double layer work - edition has been used for creating Book properties. »
I don't know if our "edition" level is closer to the "expression" or to the "manifestation" FRBR level (if I had to say, I would have said "expression", but it's true that our edition level is, wrongly, seen more as physical than intellectual), and I'm not even sure the question makes sense as it's both by design.
For the creation of a new class for "expression", it's a good idea (at least we would be exactly aligned with FRBR), but I don't know how to make it usable for everyone (most of the problems on this project arise because the simplified model is too complicated already, even if meanwhile some people want to add a fifth level for translation..., so I'm not sure we can deal with one more level).
I've read your text quickly and some things seem a bit strange, but it sounds good overall; I'll try to read it more thoroughly soon.
Cdlt, VIGNERON (talk) 16:18, 28 November 2017 (UTC)