Wikidata talk:WikiProject sum of all paintings/Archive/2014

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Doubles

Something to watch out for: works that are part of one collection, but on long-term loan in another. Such as this work from Quentin Massys, which is in the collection of the Rijksmuseum but on long-term loan at the Mauritshuis:

How to merge them and properly express this? Spinster (talk) 16:15, 9 August 2014 (UTC)

Check out Portrait of a Man with a Red Beard (Q17276171). Do you like the way that is modeled Spinster? Multichill (talk) 21:18, 12 August 2014 (UTC)
Yes. I was already thinking in that direction: add time spans as qualifiers with the work's location. Thanks! Spinster (talk) 14:33, 13 August 2014 (UTC)
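
A minimal sketch of the idea discussed above (location statements carrying time spans as qualifiers), written as plain Python data rather than real Wikidata API calls. The property IDs (P195 collection, P276 location, P580 start time, P582 end time) and all years are assumptions and placeholders for illustration; this is not necessarily how Q17276171 is actually modeled.

```python
# Hypothetical, simplified statements for a painting owned by one museum
# and on long-term loan at another. Years are placeholders, not real dates.
painting = {
    "claims": {
        "P195": [{"value": "Rijksmuseum", "qualifiers": {}}],  # collection (owner)
        "P276": [  # location, one statement per period
            {"value": "Rijksmuseum", "qualifiers": {"P580": 1900, "P582": 1950}},
            {"value": "Mauritshuis", "qualifiers": {"P580": 1950}},  # open-ended loan
        ],
    }
}


def location_in(item, year):
    """Return the location whose [start, end) time span covers `year`, or None."""
    for claim in item["claims"].get("P276", []):
        qualifiers = claim["qualifiers"]
        start = qualifiers.get("P580", float("-inf"))
        end = qualifiers.get("P582", float("inf"))
        if start <= year < end:
            return claim["value"]
    return None
```

With this shape, the collection stays a single statement, so the two museum records describe one item instead of two.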

Creatorless works: artist of the same name from different generations

I'm slowly going through the list of Rijksmuseum works that have no creator yet + fixing these manually.

While doing that, I'm encountering many cases where the import script/bot has (understandably) missed artists that have the same name but are from different generations. Examples: Frans van Mieris the Elder, Frans van Mieris the Younger; Frans Pourbus the Elder, Frans Pourbus the Younger. In the Rijks' database these are respectively named Frans (I) van Mieris, Frans (II) van Mieris; Frans (I) Pourbus, Frans (II) Pourbus. Different naming conventions... When I find examples like this, I add several name variations under 'Also known as', since this will probably help with future imports? Let me know if I can do anything else with such cases. Spinster (talk) 14:42, 13 August 2014 (UTC)

Sorry I don't know of any other way to do it than the way you are doing it. Thanks for your work on this!! Jane023 (talk) 08:01, 16 September 2014 (UTC)
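
As a rough illustration of the 'Also known as' approach above, the following sketch generates alias candidates from a museum-style name such as "Frans (II) van Mieris". The function name, the regular expression and the numeral-to-suffix mapping (I = the Elder, II = the Younger) are assumptions made for this example, not part of any existing import script.

```python
import re

# Assumed mapping from generation numerals to English suffixes.
SUFFIXES = {"I": "the Elder", "II": "the Younger"}


def name_variants(museum_name):
    """Return alias candidates for names of the form 'First (II) Rest'."""
    match = re.match(r"^(\S+) \((I{1,3})\) (.+)$", museum_name)
    if not match:
        # Not a generation-numbered name: nothing to expand.
        return {museum_name}
    first, numeral, rest = match.groups()
    variants = {museum_name, f"{first} {rest} ({numeral})"}
    suffix = SUFFIXES.get(numeral)
    if suffix:
        variants.add(f"{first} {rest} {suffix}")
    return variants
```

The bare name without numeral or suffix is deliberately left out, since it is exactly the ambiguous form that caused the missed matches.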

Public Catalogue Foundation (UK)

The Public Catalogue Foundation has catalogued all known oil paintings in UK publicly funded galleries and collections.

The catalogue is made available for browsing through the BBC Your paintings site. ("About" page).

The metadata appears to be released on an NC licence, according to the Copyright page on the PCF's site, referring the reader, "for ALL other uses", to "actively obtain further consent from the contributing collection".

I think some of the institutions do release their catalogues under CC0, but it seems that e.g. the Tate is not one of them. [1]

It might be useful to contact the PCF, and ask if it would track any institutions which do release their metadata, and whether for such institutions the PCF would give it to us. Jheald (talk) 22:37, 14 August 2014 (UTC)

I think that Jane023 has been in touch with the PCF before... maybe even about these topics. Spinster (talk) 19:17, 15 September 2014 (UTC)
Last I checked they said no, but after Wikimania a small delegation paid a visit to their office in London and the PCF made them this offer, which you are welcome to pursue, btw. You can contact Jonathan Cardy at the WMUK office for details, and otherwise just send them an email at their contact address. They are quite friendly, but just seem scared to collaborate with us in any structural way. If you want the data, Magnus Manske has already downloaded the complete set of artists here, though I would also like to see the whole set of paintings online too. Jane023 (talk) 07:57, 16 September 2014 (UTC)

Europeana

All metadata is available CC0 [2]

The keyword "Painting" with media-type image produces 161,000 hits. [3] Jheald (talk) 22:53, 14 August 2014 (UTC)

I've been importing paintings from Europeana over the last couple of weeks, see Wikidata:WikiProject sum of all paintings/Location/Netherlands. I shared some (crappy) code in git. Multichill (talk) 16:41, 16 August 2014 (UTC)

Dupe or no dupe?

Through the Wikidata game I stumbled upon Q17340437, which seems to be identical to Q17322988 except for the inventory number: "SK-A-4625" vs. "SK-A-791". The paintings File:Bernardus de Bosch I (1709-86). Dichter en kunstbeschermer te Amsterdam Rijksmuseum SK-A-4625.jpeg and File:Bernardus de Bosch I (1709-86). Dichter en kunstbeschermer te Amsterdam Rijksmuseum SK-A-791.jpeg look identical too. Can someone clarify whether this is a dupe in the Rijksmuseum database, or are they really different paintings? In other words: can the Wikidata items be merged or not? Raymond (talk) 14:26, 14 September 2014 (UTC)

Intriguing! I have looked at both images closely, but can't say for sure. Even the frames are extremely alike. They have slightly different metadata (e.g. acquisition date). I have sent an email to the collection department to inquire - will keep you posted if I receive a reply. Spinster (talk) 18:30, 14 September 2014 (UTC)
I got a reply! They are definitely different paintings. There might be a mistake with the images on the website, according to the lady who answered my email. Spinster (talk) 19:09, 15 September 2014 (UTC)
Thanks for clarifying. Raymond (talk) 06:30, 17 September 2014 (UTC)

Sum of all paintings (that have ever been on Belgian/Flemish territory) by Rubens

I had posted this on the talk page of Belgium, but that probably went unnoticed :-)

The Rubenianum has made the metadata of the website rubensonline.be available under CC0 as part of the opencultuurdata.be project. Find more information and various downloads here.

This means that we can import metadata (and, if still needed, images) of many paintings by Rubens. A job beyond my own technical skills, but perhaps someone else might be interested in tackling this? Spinster (talk) 18:36, 14 September 2014 (UTC)

Great news, thanks for posting! I am still working on Frans Hals, but I may take a look when I am done with that. Jane023 (talk) 07:58, 16 September 2014 (UTC)

To-do

  1. Get a list of all paintings on Wikidata without an image, but for which the work is older than (say) 1900, so that an image can be hosted on Commons. Next, see if the image is already on Commons; if so, link it in, and if not, go get it, upload it to Commons and then link it in.
  2. Figure out for copies and lost originals, how to link copies to the original creator in such a way that the creator doesn't look like the creator of the copy
  3. Figure out for copied copies, how to link the intermediate copyist in as a creator
  4. How to set up a "follower of Florentine school", "Flemish school" and any other "schools" that are widely used (e.g. by the Web Gallery of Art) to indicate origins for creatorless works
  5. Get the Mona Lisa and at least one of major "schools" into the "showcase items" as teaching tools

Just some thoughts Jane023 (talk) 09:20, 3 October 2014 (UTC)

Hi Jane! Some thoughts in answer to yours (and I took the freedom to format your post a bit): I think discussion about your points 2 to 5 mainly belong in WikiProject Visual arts here on Wikidata. I already saw some (undecided?) discussion there about how to describe 'school of', 'workshop of', 'follower of', 'after'. Not sure about copies, but that discussion belongs there too.
For your first item, here's an imperfect list but it's a start: all paintings without image, created between years 100 and 1850 (I noticed that 1900 produces too many images still under copyright). As we get more paintings with correct creation dates, this list should grow; and as we add more images to these items, the list should shrink again :-)
My own wishlist would, among many things, include the following:
  1. Have at least one very well-known painter's oeuvre as complete as possible, with all known paintings as items in Wikidata and as much of their metadata filled in, checked, referenced and polished as possible. Think Vermeer, Rembrandt, Rubens, Van Dyck, Hals... As a showcase too. I think you're trying to achieve this with Hals; I'm sometimes working on Rubens but it's a huge job. Vermeer might be doable; I'd be intrigued if we can find a way to integrate the fake Vermeers by Van Meegeren!
  2. Import all metadata from RKDimages. Damn, that database is AMAZING. Heh.

Keep up the good work! Spinster (talk) 19:50, 4 October 2014 (UTC)

Leonardo, Vermeer, and Seurat are good candidates for a complete oeuvre, since each of them produced a couple of dozen works. --Ymblanter (talk) 20:14, 4 October 2014 (UTC)
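
The "paintings without image, created between years 100 and 1850" list mentioned in this thread boils down to a simple filter. The sketch below applies it to in-memory records; the field names (`inception`, `image`) and the sample records are hypothetical placeholders, and a real list would come from a query service rather than a hard-coded list.

```python
# Hypothetical records; ids, years and file names are placeholders.
paintings = [
    {"id": "item-1", "inception": 1630, "image": None},
    {"id": "item-2", "inception": 1630, "image": "Example.jpg"},
    {"id": "item-3", "inception": 1905, "image": None},
    {"id": "item-4", "inception": None, "image": None},
]


def missing_image_candidates(items, earliest=100, latest=1850):
    """Return ids of items with no image and a known creation year in range."""
    return [
        item["id"]
        for item in items
        if not item.get("image")
        and item.get("inception") is not None
        and earliest <= item["inception"] <= latest
    ]


print(missing_image_candidates(paintings))
```

Items without a creation date are skipped on purpose, which is why the list grows as more paintings get correct dates.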

All of Commons artwork

At a quick count, there are ~58K images on Commons that

  • have the {{Artwork}} template, and
  • have a {{Title}} template, and
  • have a {{Creator}} template

From that, it would be simple (aka "I know how to do it, but will require some work") to auto-create Wikidata items, with label (from "title"; multiple languages, where available), creator (from "creator" template), image (duh), and maybe some more (which?) statements parsed from the file description, as available.

  • Is that something that is wanted, and (if done) defended against the usual naysayers?
  • How should I handle duplicates? I could exclude artwork where the title (from the template, not the file name) already exists on Wikidata, but that would exclude quite a few (common titles; haven't checked though). Besides duplicates of existing items, there could be multiple photos of the same artwork in the 58K corpus.
  • What other statements to generate? I can look for common templates that go with artwork, but maybe there are some "special" ones you'd like? ID numbers etc.?

Opinions? @Multichill:? --Magnus Manske (talk) 13:17, 19 November 2014 (UTC)

Commons contains a lot of files using the Artwork template that are not worth importing to Wikidata; these should be filtered out. Also, you want to add some sort of identifier to prevent duplicates. Jean-Frédéric wrote a tool this weekend at the hackathon. Jean-Frédéric, can you respond? Multichill (talk) 19:22, 19 November 2014 (UTC)
Thanks for the ping. Yes, I am working on such a tool (will push my local repo somewhere public tonight) and happy to get more contributions (the number of edge cases we have on Commons… >_<). Will ping here when it is done :) Jean-Fred (talk) 11:12, 20 November 2014 (UTC)
Nice idea! It would also be useful to try to update all possible Wikidata painting items with Google Art images from Commons. Theoretically we have no dupes of those on Commons, am I right? Jane023 (talk) 13:46, 20 November 2014 (UTC)
Hi Magnus Manske. I have begun to work on this question: Artworks import from Wikimedia Commons. The problem of duplicates is important, with 1) items already in Wikidata and 2) duplicates on Commons:
  1. On Wikidata there are currently more than 16,000 visual artworks linked with one (or more) images from Wikimedia Commons: http://www.zone47.com/crotos/?l=en&c18=1&mode=1
  2. Metadata can vary on Wikimedia Commons. Variation in the titles of an artwork is frequent, not because of Commons but because of art history (giving a work a specific title is a recent practice), and the accession number is often not mentioned...
I explored several options but really don't know what the best thing to do is. But I certainly agree that it should be possible to create artwork items on Wikidata en masse from Wikimedia Commons files, and I even think that this is the best big thing to do for artworks on Wikidata, so that we have artworks with images.
As Jane023 suggests, I try to work with homogeneous lots, and I started with the Google Art images. There are fewer than 10 dupes on Commons; 8% of the files are already used in Wikidata, and 12% correspond to artworks that had another image. It was easy to check the 8% automatically, but the 12% were complicated because we often don't have enough metadata to check easily.
To understand the situation, we have to know that the first wave of artwork items on Wikidata came from Wikipedia pages. So one way was to import metadata from DBpedia first. This was done a year ago: the first massive metadata import on artwork items (about 8,000) came from extractions of the DBpedias (it, en, fr, de, es, nl, he), with the method explained here in French: Journey from DBpedia to Wikidata on a bot
So many items have the image from the Wikipedia page, and that is not always (in fact not often) the available Google Art image, which was often uploaded later. On Wikidata, the author or the institution is not always mentioned. Sometimes you have a painting with just a label (a title in a language that I do not know at all), the author name and the institution (Head of a Prostitute (Q11708249)); for this lot it was very often not easy to check, and I had to do it visually. Visual checking is the worst and most frequent obstacle to being effective (but a pleasure too, I admit). And we will have similar problems with other Commons files with the Artwork template. I had the same problem creating items for prints by Dürer, which have many duplicates (Prints of Dürer on Crotos), always with various titles and different institutions too.
But the checking is already done for the Google Art images: http://zone47.com/div/gap.xlsx
I can continue to work on this lot (alignment on Q items and data processing) before creating items on Wikidata. But if anybody has a better idea, I'm interested. Shonagon (talk) 07:23, 21 November 2014 (UTC)
(i) How good is collection (P195) completeness for items in Wikidata? Would it make sense to go collection by collection?
(ii) How good is "other versions" completeness on Commons? If the file page pointed to by an image (P18) property had an "other version" from the Google Art Collection, that would identify a duplicate (or a P18 to be updated). Jheald (talk) 09:29, 21 November 2014 (UTC)
(iii) Has anybody played with any automated similarity detection libraries for images? (Upcoming session at the British Library on this, on Dec 18). Jheald (talk) 09:33, 21 November 2014 (UTC)
(i) I think it's a good approach too, and my preference, because it is a clear and defined scope and we often have the Commons category. Maybe I will continue this way.
(ii) "Other versions" is unfortunately most often not filled in on Wikimedia Commons, even if there are other versions. If you have it, it's information; if you don't, there is no information and no clue. For Google Art images I have found 240 alternative images already in image (P18) but not mentioned in "other versions". If the Google Art image was better, it was set to replace the one in image (P18). (Example with 40-50 items of Uffizi Gallery (Q51252), Autolist)
(iii) A very interesting approach. Personally, I have only used such tools manually (very often Google similar-image search via the Who-stole-my-pictures FF extension). Shonagon (talk) 10:19, 21 November 2014 (UTC)
At the same time, we should also add |wikidata = parameters to files on Commons (all the more needed as we often have several more or less redundant images of the same work). Some partial cases:
None of them will be error-free but I think the error-rate will be acceptable (and a few sanity-checks should be feasible afterwards). I have created Commons:Category:Artworks without Wikidata item to get some numerical feeling. --Zolo (talk) 15:06, 23 November 2014 (UTC)
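
The duplicate-handling question raised in this thread can be sketched as a grouping step: prefer an institution plus accession number as the key when both are present, and fall back to normalized creator plus title otherwise. The record fields (`file`, `title`, `creator`, `institution`, `accession_number`) are hypothetical names for values parsed from the Artwork template; this is one possible heuristic under those assumptions, not the method used by any of the tools mentioned above.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Casefold, strip diacritics and collapse whitespace for loose matching."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def dedupe_key(record):
    """Identifier-based key when available, else a creator/title key."""
    if record.get("institution") and record.get("accession_number"):
        return ("id", normalize(record["institution"]),
                normalize(record["accession_number"]))
    return ("title", normalize(record["creator"]), normalize(record["title"]))


def group_candidates(records):
    """Map each key to the list of Commons file names that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[dedupe_key(record)].append(record["file"])
    return dict(groups)
```

Keying on the accession number keeps look-alike works such as SK-A-4625 and SK-A-791 apart even when title and creator match. A file that lacks the number will not be grouped with one that has it, so title-only groups still need the visual checking described above.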

wdq links used for cleanup

Here are a few links worth looking at every now and then to clean up incomplete imports:

  • [4]: All objects with a statement for inventory number with a collection qualifier, but no collection statement. (simply add collection statement from the qualifier)
  • [5]: All objects with an inventory statement but no collection statement

Note that WDQ does not seem to process page reverts correctly (i.e. it doesn't remove/re-add statements manipulated by a rollback), which is why I also added the 217 property. If this is missing, the item can be skipped.
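
The logic behind the two cleanup lists above can be sketched in Python over simplified item records; this is an illustration of the filtering only, not a WDQ query. P217 (inventory number) and P195 (collection) follow the properties named in this archive, while the record layout, item ids and inventory values are hypothetical placeholders.

```python
# Hypothetical, simplified items; ids and inventory numbers are placeholders.
items = [
    {"id": "item-a", "claims": {
        "P217": [{"value": "INV-1", "qualifiers": {"P195": "Rijksmuseum"}}]}},
    {"id": "item-b", "claims": {
        "P217": [{"value": "INV-2", "qualifiers": {"P195": "Rijksmuseum"}}],
        "P195": [{"value": "Rijksmuseum"}]}},
    {"id": "item-c", "claims": {"P217": [{"value": "INV-3", "qualifiers": {}}]}},
    {"id": "item-d", "claims": {}},
]


def missing_collection(items):
    """Ids of items with an inventory number (P217) but no collection (P195)."""
    return [
        item["id"]
        for item in items
        if item["claims"].get("P217") and not item["claims"].get("P195")
    ]


def proposed_collections(items):
    """For items missing P195, suggest values from P217's collection qualifier."""
    proposals = {}
    for item in items:
        if item["claims"].get("P195"):
            continue  # already has a collection statement
        found = sorted({
            claim["qualifiers"]["P195"]
            for claim in item["claims"].get("P217", [])
            if "P195" in claim.get("qualifiers", {})
        })
        if found:
            proposals[item["id"]] = found
    return proposals
```

Items without P217 (like `item-d` here) are ignored, which mirrors the note that such items can be skipped.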

A more complicated one to deal with is:

Also I'm not sure whether I got this one wrong or if there just aren't any objects for cleanup:
  • [7]: Objects with an inventory number but no collection qualifier.
/Lokal Profil (talk) 10:38, 2 December 2014 (UTC)

Ingesting the list of Swiss heritage institutions into Wikidata - Help needed!

Hi,

So far, I have listed all art museums in Switzerland which have paintings in their holdings: List of Swiss institutions. (There may also be other types of heritage institutions holding paintings, so the list is certainly not complete.) I'm planning to contact these museums in view of the upcoming Swiss Open Cultural Data Hackathon, in order to ask them whether they would be ready to provide the metadata of their paintings (and possibly also scans) under a free license.

The data have been derived from the Swiss GLAM Inventory. There is also a Wikiproject on German Wikipedia, as part of which all Swiss heritage institutions are linked up with existing Wikipedia articles.

Now, instead of adding the Wikidata-Entity-Numbers manually to the list of Swiss institutions having paintings in their holdings, I would prefer to first ingest the Swiss GLAM Inventory into Wikidata, and then add the collection numbers automatically. - Can anybody help me achieve this by script?

Cheers, --Beat Estermann (talk) 10:48, 3 December 2014 (UTC)

Launch of WikiProject Wikidata for research

Hi, this is to let you know that we've launched WikiProject Wikidata for research in order to stimulate a closer interaction between Wikidata and research, both on a technical and a community level. As a first activity, we are drafting a research proposal on the matter (cf. blog post). It would be great if you would see room for interaction! Thanks, --Daniel Mietchen (talk) 01:48, 9 December 2014 (UTC)
