Wikidata:Requests for permissions/Bot/SamoaBot 32
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved--Ymblanter (talk) 11:55, 19 October 2013 (UTC)[reply]
SamoaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ricordisamoa (talk • contribs • logs)
Task/s: import date of birth (P569) and date of death (P570) from Bibliothèque nationale de France (Q193563)
Function details: it uses online JSON files such as this, and tries to parse the "birthdate" and "deathdate" fields. This is an alternative to the "poor-sourced" Task 26. --Ricordisamoa 06:23, 2 June 2013 (UTC)[reply]
- Test edit, of course. --Ricordisamoa 06:24, 2 June 2013 (UTC)[reply]
- PS: it compares the "ark" field of JSON data with the Bibliothèque nationale de France ID (P268) code present on the item, and skips the item if they don't match. --Ricordisamoa 06:25, 2 June 2013 (UTC)[reply]
- Source code will be available soon :-) --Ricordisamoa 06:26, 2 June 2013 (UTC)[reply]
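A minimal sketch of the check described above, not the bot's actual code (which was not posted in this discussion). It assumes a record is an already-parsed JSON object with string fields "ark", "birthdate" and "deathdate", that dates look like "YYYY", "YYYY-MM" or "YYYY-MM-DD", and that the ARK ends with "cb" followed by the P268 value. The function names are hypothetical.

```python
import re

# Assumed date shapes: "YYYY", "YYYY-MM" or "YYYY-MM-DD"; anything else is rejected.
DATE_RE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def parse_date(value):
    """Return (year, month, day), with None for missing parts, or None if unparseable."""
    if not isinstance(value, str):
        return None
    match = DATE_RE.match(value.strip())
    if match is None:
        return None
    year, month, day = (int(g) if g else None for g in match.groups())
    if month is not None and not 1 <= month <= 12:
        return None
    if day is not None and not 1 <= day <= 31:
        return None
    return (year, month, day)


def ark_matches(ark, bnf_id):
    """Assumption: the ARK ends with 'cb' followed by the P268 value on the item."""
    return isinstance(ark, str) and ark.strip().endswith("cb" + bnf_id)


def extract_dates(record, bnf_id):
    """Return parsed dates keyed by property ID, or None when the item must be skipped."""
    if not ark_matches(record.get("ark"), bnf_id):
        return None  # "ark" and P268 disagree: skip the item
    dates = {}
    for field, prop in (("birthdate", "P569"), ("deathdate", "P570")):
        parsed = parse_date(record.get(field))
        if parsed is not None:
            dates[prop] = parsed
    return dates


# Made-up sample record, for illustration only.
sample = {"ark": "ark:/12148/cb00000000x", "birthdate": "1802-02-26", "deathdate": "1885"}
result = extract_dates(sample, "00000000x")
```

Returning None on an identifier mismatch keeps "skip the item" distinct from "matched, but no parseable dates", which yields an empty dictionary.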
Support - Looks fine, Byrial (talk) 07:12, 2 June 2013 (UTC)[reply]
- Is the license suitable for Wikidata? --Ricordisamoa 08:02, 2 June 2013 (UTC)[reply]
- Ups. I am not good at French, but there may be problems with that license. It looks to me like only noncommercial use is free. I guess someone better at both French and law than me should look at this. Byrial (talk) 09:06, 2 June 2013 (UTC)[reply]
- It states "you can do whatever you want with our Json and RDF data as long as you cite the source". Other data are only for non-commercial use. I don't know how exactly that makes sense, but in any case it makes the license for Json data more liberal than Wikipedia's CC-BY-SA license. I don't know if it is suitable, but if it is not, we are going to have problems importing from Wikipedia and many other sources. --Zolo (talk) 09:13, 2 June 2013 (UTC)[reply]
Comment, it seems that data.bnf.fr only contains a small subset of BNF authority files (those linked from Bibliothèque nationale de France ID (P268)). --Zolo (talk) 10:16, 2 June 2013 (UTC)[reply]
Support. Ayack (talk) 10:33, 2 June 2013 (UTC)[reply]
Comment. The license is compatible with CC-BY, not CC0 (public domain), so I don't think we can use this data. Mushroom (talk) 10:42, 2 June 2013 (UTC)[reply]
Comment I don't know the French law. In US these data cannot be protected by copyright law, because the law only protects expressions, not facts. Birth and death dates of people are uncreative facts, which don't meet the threshold of originality (see Feist v. Rural (Q5441583)). --Stevenliuyi (talk) 12:19, 2 June 2013 (UTC)[reply]
- Yes, I think the question is more "what is copyrightable" than "which licenses are compatible with CC0". And that is really something we should start to discuss more broadly: essentially no license is compatible with CC0, and certainly not Wikipedia's CC-BY-SA. --Zolo (talk) 12:30, 2 June 2013 (UTC)[reply]
Comment. I have found the relevant French law, it's called Code de la propriété intellectuelle and it states:
- The authors of translations, adaptations, transformations or arrangements of works of the mind shall enjoy the protection afforded by this Code, without prejudice to the rights of the author of the original work. The same shall apply to the authors of anthologies or collections of miscellaneous works or data, such as databases, which, by reason of the selection or the arrangement of their contents, constitute intellectual creations. Database means a collection of independent works, data or other materials, arranged in a systematic or methodical way, and capable of being individually accessed by electronic or any other means. (French text here)
The English translation above is taken from this book chapter, which also describes how the law has been applied in court. Unfortunately it seems France is a lot more restrictive than the US when it comes to database copyrightability. Mushroom (talk) 22:25, 2 June 2013 (UTC)[reply]
- Thanks. Yes, there are "database rights" in France, and presumably in other countries, that do not exist in the US. Then we should know at what point they start to apply (we are only importing a small part of their database). If it is too restrictive, we may want to decide whether we should abide by local law, or only by the US jurisdiction where Wikimedia is located, just as we have Commons:Template:PD-Art that may not be compatible with all local laws. Or I guess we can still change the license to any other that is permissive enough to be used in Wikipedia. --Zolo (talk) 05:19, 3 June 2013 (UTC)[reply]
- By reading the book above it seems that the threshold of originality is quite difficult to determine and different French courts have given different judgements. Obviously there can't be much apport intellectuel in a simple list of dates, but we never know what a court might decide. If we choose to follow US law and ignore the local ones, I'm afraid some Wikipedias might decide it's too risky to use our data. The root of the problem here is CC0, I can see why it was chosen for Wikidata and personally I think it would be great to have a huge database like this in the public domain, but it would have been easier if we had just used CC-BY-SA. It seems to me by some measure we are doing things backwards, copying data from Wikipedia and other sources without knowing if it's legal or not. There has to be a discussion on this. Mushroom (talk) 15:33, 3 June 2013 (UTC)[reply]
Support It seems that the "Licence ouverte/Open Licence" only requires indicating the source (at least the name of the «Producer») and the date of the extraction; both are done. The producer is given as "imported from Bibliothèque nationale de France", and the edit date in the history corresponds to the date of extraction. The licence states that there are no noncommercial restrictions. So the licence is compatible with Wikidata. --Shonagon (talk) 00:36, 3 June 2013 (UTC)[reply]
- I think the license is compatible with the use we make in Wikidata but not with Wikidata's CC0 license that allows people to reuse parts of Wikidata's content without keeping the mention about the source and such things. --Zolo (talk) 04:42, 3 June 2013 (UTC)[reply]
- Yes, the terms are – in principle – observed, but anyone could remove the source, making the use of the data illegal. --Ricordisamoa 06:02, 3 June 2013 (UTC)[reply]
- Yes that's the problem. If we used CC-BY-SA like Wikipedia, and cited the source properly, it wouldn't be our responsibility if someone blindly copies the data with no attribution. But if someone finds the data here (with license CC0) and uses it illegally, then it's our fault for releasing it under that license. Mushroom (talk) 15:33, 3 June 2013 (UTC)[reply]
Comment Obviously, the license issue should be clarified but assuming that it is resolved, I'm basically in favor. However, we must first determine that the BNF data is reliable and not, say, copied from Wikipedia. The latter is unlikely but nevertheless, if we're going to import a huge chunk of data, it's important to make sure that the source is ok. As far as I'm concerned, the license issue and the reliability issue are both deal-breakers and should be investigated thoroughly. Pichpich (talk) 17:28, 3 June 2013 (UTC)[reply]
- BNF authority data are usually considered very reliable, but I have no proof for that :|. The sources are provided here. Unfortunately, they do not exactly match one source to one claim, so we cannot add the more direct source to Wikidata. --Zolo (talk) 18:58, 3 June 2013 (UTC)[reply]
- Support, unless it's illegal obviously. However, as for CBDB, I would like a more precise source statement than "imported from the BNF". Given that data.bnf.fr is just a prettier, machine-readable version of stated in (P248), I guess that it would be fine to add the BNF ID in the source. --Zolo (talk) 18:58, 3 June 2013 (UTC)[reply]
- Hello, I'm French, and a librarian who works with French law everyday, and thus, I understand a little better French law about databases…
- what is protected under French law is the "database", not individual data, so if someone "illegally copies" the data (especially the date of birth or death), it does not matter, since these data are not the "property" of the BnF; they are "objective facts" that are grouped in the BnF database…
- As for the reliability, each BnF individual Authority file gives the source of the date see here for ex.
- I would be more worried about automatically harvesting the data from the database, unless the BnF is OK with it…
- as for the BnF ID, it is already in the item in property Bibliothèque nationale de France ID (P268) - is it really necessary to reproduce it in the source… the link seems obvious to me ;)
- Support as far as it is legally possible… for now, I'm just adding them "by hand", so, no legal problem ;) --Hsarrazin (talk) 16:28, 4 June 2013 (UTC)[reply]
- We can usually guess what "imported from BNF" means, but there may be databases where things are not that clear. Beside, the source should be transcludable in Wikipedias, and this is much easier if the precise source is explicitly given. The proposed guideline is:
- stated in (P248): BNF (or data.bnf, BNF authority files ?)
- Bibliothèque nationale de France ID (P268): CBDB ID
- point in time (P585): date retrieved
- --Zolo (talk) 06:00, 5 June 2013 (UTC)[reply]
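The proposed source layout could be sketched as a plain mapping from property ID to value. This is only an illustration under assumptions: the function name is hypothetical, the output is not the Wikidata API's reference format, and the "stated in" target is left as a parameter because the discussion had not yet settled on one.

```python
from datetime import date


def build_reference(stated_in_qid, bnf_id, retrieved):
    """Return a simplified reference: property ID -> value (not the Wikidata API format)."""
    return {
        "P248": stated_in_qid,          # stated in: item for the source database
        "P268": bnf_id,                 # identifier of the record within that source
        "P585": retrieved.isoformat(),  # point in time: date retrieved
    }


# Q193563 is Bibliothèque nationale de France; the identifier below is made up.
reference = build_reference("Q193563", "00000000x", date(2013, 6, 5))
```

Keeping the identifier inside the reference makes the source self-describing when it is transcluded elsewhere, which is the concern raised above.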
- Support Zolo's proposal, and addition of "BNF authority file" as Source :) --Hsarrazin (talk) 04:50, 7 June 2013 (UTC)[reply]
Comment I think it should be synchronized with Wikidata:Requests for permissions/Bot/VIAFbot 2, as it essentially uses the same source. --Zolo (talk) 06:00, 5 June 2013 (UTC)[reply]
- Wikidata:Requests for permissions/Bot/VIAFbot 2 uses the same principle as I think is working here: using a library identifier to go get some info from the catalogue. The reason VIAFbot uses "imported from VIAF" is because I'm using the data from VIAF and not the BnF. Most of the sex data in VIAF is from the BnF, but it's also a merge from all the catalogues in VIAF, so it could be from the National Library of Norway or Spain. VIAFbot 2 takes this one step further and looks up the LCCN record from VIAF and then the sex record in LCCN. (I do this because LCCN records multiple sexes in the case of a change.) So I'm sometimes going WD->VIAF->LCCN, and then I write "imported from VIAF". Maybe SamoaBot 32 could do the same thing by using the VIAF link and then the BnF link, or even another catalogue, if there isn't a BnF link? Also, I don't think our bots need to be synchronised to run together, but we do need to make sure we don't duplicate any imports. There's a lot of data to import from these catalogues, enough to share. :) Maximilianklein (talk) 17:08, 5 June 2013 (UTC)[reply]
- And p.s. can you post your code? I'd like to do a similar date import bot using more than the BnF (maybe all the VIAF contributors), and it's quicker to reuse code. Maximilianklein (talk) 17:35, 5 June 2013 (UTC)[reply]
- I've created a GitHub repo, and will post the code there in few days. Regards, --Ricordisamoa 18:23, 5 June 2013 (UTC)[reply]
- Comment Unless the wikidata servers reside in France, then French copyright law is irrelevant. If the servers are in the U.S., then we need to comply with U.S. copyright laws. This is my understanding, at least. Also, I can't see how anyone, anywhere, could claim copyright over dates of births, that just seems nonsensical. Danrok (talk) 23:03, 25 June 2013 (UTC)[reply]
Oppose because of the uncertain legal status. If we tell the world that Wikidata is CC0, then everything should be CC0. I would not like to see anyone sued because of data copied from Wikidata. Saying it's OK if you're not French does not help. — Felix Reimann (talk) 12:48, 3 July 2013 (UTC)[reply]
- License seems OK. See below. Thanks for the clarification! — Felix Reimann (talk) 21:01, 26 September 2013 (UTC)[reply]
Support The English version of the licence is available here: http://data.bnf.fr/docs/Licence-Ouverte-Open-Licence-ENG.pdf (16/09/2013): you are free "To disseminate and redistribute the « Information »" as long as you "Attribute the « Information » by acknowledging its source". -- 194.199.4.201 08:56, 16 September 2013 (UTC) Romain Wenz, for the data.bnf.fr team - data [a] bnf.fr[reply]
- Comment I have been speaking with Romain Wenz, head of the data.bnf team, and I can assure you that it is really him who wrote the line above and that you can use those data. Remi Mathis (talk) 09:01, 16 September 2013 (UTC) (Wikimedia France chair, curator at the BnF, and friend of Romain Wenz)[reply]
- That's wonderful news, thanks! Ricordisamoa, are you ready for this task to be approved now? Legoktm (talk) 17:43, 24 September 2013 (UTC)[reply]
- Yes, of course! --Ricordisamoa 20:22, 7 October 2013 (UTC)[reply]