
Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}}, the first time you mention an item, or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2018/12.

Scholarly articles that are book reviews

If a scholarly article is a book review, should the “main subject” be the edition of the book, or the subject of the book, or both? - PKM (talk) 02:32, 14 November 2018 (UTC)

@PKM: In the absence of a single value constraint on P921, I see no harm in using both. Mahir256 (talk) 03:10, 14 November 2018 (UTC)
@PKM: Typical problem when mixing different classifications: scholarly article is a text format, book review is about the content of a text. WikiProject Books should at some point fix the classification by analyzing in detail the characteristics of a book. Snipre (talk) 07:37, 14 November 2018 (UTC)
@Snipre: While a "scholarly article" as used in WD is clearly more like an "edition" than a work (based on its properties), currently, "scholarly article" is <subclass of> "article" is <subclass of> "work" (while also being a subclass of "publication"). If you have an idea of a way to separate "publications" from "works", please lay it out so we can discuss - I agree this area is fuzzy. But I would say that a "book review" is a type of article either way. In any case, I would not recommend or support making separate "work" and "edition" items for every scholarly article in WD. That way lies madness. - PKM (talk) 20:47, 14 November 2018 (UTC)
@PKM: In relation to your original question, I think "main subject" should point to the book-edition. The item should also be identified somehow (genre?) as a book review. I am dubious about "main subject" also pointing to the book-subject. How much would one learn about the book-subject in the round from the book review? Possibly a case for a weaker subject indicator, perhaps the proposed subject keyword property? Jheald (talk) 22:09, 16 November 2018 (UTC)
@Jheald:, that's my first instinct as well. book review (Q637866) is currently a genre, so that part works. Lots of missing books, alas. Also, we have 2789 items with <instance of> "book review". - PKM (talk) 22:18, 16 November 2018 (UTC)

Just going to throw in that IMO academic journal article (Q18918145) is more accurate than scholarly article (Q13442814) for the instance of (P31) statement of a book review. Circeus (talk) 14:24, 21 November 2018 (UTC)

@Circeus: Agreed! - PKM (talk) 20:51, 24 November 2018 (UTC)

My two cents, @PKM: I created Le nom de peuple Rhedones (Q52160525) some time ago. What do y'all think? Cdlt, VIGNERON (talk) 17:52, 24 November 2018 (UTC)

@VIGNERON: That's very nice! - PKM (talk) 20:51, 24 November 2018 (UTC)
Shouldn't "book review" be a property, rather than an item, as Q637866?
Example: I just created Q59319726 for a book. A published review of that book was previously available as Q58565334. There should be a way to link these two. If "book review" were a property, one could then specify the book as a property of the review. ???
We need some way to link a review with the book it reviewed. Please suggest an alternative or endorse this. Thanks, DavidMCEddy (talk) 03:12, 1 December 2018 (UTC)
Another example: Q58570163 is a review of a book for which a Wikidata item seems not yet to exist. I plan to create a Wikidata entry for that book.
Shouldn't the "scholarly article" that is a book review have a property = "review of" being the book? And shouldn't the book have a property "reviewed as", being the review?
I see properties "review score", "review score by", and "reviewed by". None of these sound to me like either "review of" or "reviewed as".
Thanks, DavidMCEddy (talk) 19:56, 3 December 2018 (UTC)
@DavidMCEddy: I have no problem using the existing "main subject" to include the meaning "review of", but I wouldn't object to a separate property "review of" in lieu of both "main subject" and <instance of> "book review" for these items. That would be a very clean solution. - PKM (talk) 20:48, 10 December 2018 (UTC)

I've also run into a number of "scholarly articles" that are reviews of museum exhibitions. I've addressed these by creating items for the exhibitions and linking from the review using "main subject". (See ‘Opus Anglicanum: Masterpieces of English Medieval Embroidery’, Victoria and Albert Museum, London, 1 October 2016–5 February 2017 (Q57678571).) Whatever solution we come to should probably handle other types of reviews than book reviews (so if we make a new property "review at" or somesuch, it could accept reviews of plays, films, exhibitions, and books). - PKM (talk) 20:44, 10 December 2018 (UTC)

EOL has changed all its URLs/IDs

Encyclopedia of Life (Q82486) has changed all its IDs and URL formats, making Encyclopedia of Life ID (P830) wrong where it is used. Example: Prunus prostrata (Q1258395) links to https://eol.org/pages/631658 but should link to https://eol.org/pages/11164788. Abductive (talk) 07:13, 27 November 2018 (UTC)

  • I updated the description. You might want to propose a new property for the new scheme. --- Jura 07:23, 27 November 2018 (UTC)
    • What, keep the old IDs live on Wikidata, and add the new IDs? Abductive (talk) 07:26, 27 November 2018 (UTC)
    • By the way, on how many items is P830 used? Abductive (talk) 07:27, 27 November 2018 (UTC)
  • Yes, for stability's sake, we don't mix the old and the new scheme. Otherwise data users couldn't be sure what they are getting. --- Jura 07:35, 27 November 2018 (UTC)
  • I'm going to leave the pesky details on creating a property and editing 1,375,794 items to you guys. Abductive (talk) 20:11, 27 November 2018 (UTC)
@Abductive: Your claim „EOL has changed all its IDs and URL formats” is wrong.
@Jura: formatter URL (P1630) has to be changed from http://eol.org/pages/$1/overview to https://eol.org/pages/$1. That solved the general problem.
@Abductive: EOL ids refer to taxon concepts and are subject to change. In the past we had EOL ids that referred to internally used ids and caused similar errors. Maybe references to The Plant List 1.1 (Q15628808) got new ids. Not sure at the moment.
--Succu (talk) 20:15, 27 November 2018 (UTC)
In my example the ID is different. That may not always be the case? Anyways, I have already stated that this is not my forte. It's up to you guys to fix or ignore link rot (Q1193907) as you see fit. Abductive (talk) 20:19, 27 November 2018 (UTC)
Sorry, but rechecking the known 2,237,550 EOL ids against our taxa has a low priority on my todo list. --Succu (talk) 20:37, 27 November 2018 (UTC)
Presumably this can be automated? Perhaps by another with the requisite skill set? Abductive (talk) 05:40, 28 November 2018 (UTC)
Regarding your example: Added by User:BotNinja (= User:Termininja). --Succu (talk) 20:56, 27 November 2018 (UTC)
Okay, so you don't want to fix it either. I completely understand; this is a volunteer project and nobody knows how to write a program to fix these things. Abductive (talk) 02:25, 4 December 2018 (UTC)
You are wrong and as I told you above this has a low priority on my todo list. --Succu (talk) 22:15, 10 December 2018 (UTC)
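The thread above mixes two separate issues: the URL pattern changed (which the formatter URL (P1630) update addresses for every item at once), and some stored IDs no longer point at the right taxon (which a formatter change cannot repair). A minimal Python sketch of the difference, using only the two formatter strings and the two IDs quoted in this discussion; the function name is made up for illustration and is not part of any Wikidata tooling:

```python
# Formatter URLs quoted in the thread; "$1" is the placeholder for the stored ID.
OLD_FORMATTER = "http://eol.org/pages/$1/overview"
NEW_FORMATTER = "https://eol.org/pages/$1"


def expand_formatter(formatter: str, external_id: str) -> str:
    """Substitute a stored external ID into a formatter URL (hypothetical helper)."""
    return formatter.replace("$1", external_id)


# ID stored on Prunus prostrata (Q1258395) versus the ID Abductive says it should be.
stored_id = "631658"
expected_id = "11164788"

# Changing the formatter rewrites the link for every item in one step...
print(expand_formatter(OLD_FORMATTER, stored_id))
print(expand_formatter(NEW_FORMATTER, stored_id))

# ...but it cannot help where the stored ID itself is stale.
print(expand_formatter(NEW_FORMATTER, stored_id)
      == expand_formatter(NEW_FORMATTER, expected_id))
```

So the formatter fix and the ID recheck are independent jobs: the first is a single property edit, the second would need each stored ID compared against whatever EOL currently reports, which is the part nobody in the thread has volunteered for.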

Spam filter madness

The spam filter is preventing me from adding a URL to a search on gov.uk, to a WikiProject page - Wikidata:WikiProject British Politicians/ambassadors. Details here. Can anything be done ... it would be handy if the project page could display the link, which is concerned with info sources for QA. --Tagishsimon (talk) 00:28, 29 November 2018 (UTC)

@Tagishsimon: m:Spam_blacklist/Log is an interesting place. @Billinghurst: regarding this September decision. Mahir256 (talk) 01:25, 29 November 2018 (UTC)
Thanks Mahir256. In fairness, right now, quite a lot of us would like to block the UK government. @Billinghurst: - what was the background? There's a clear use case for us having a link to gov.uk/search - it's the only URL route to a useful set of information; it's a key interface for accessing UK government information. What exactly occurred that led you to block it? I see you've also blocked bbc.co.uk/search. Also seems odd. I'm not seeing any process here ... you seem to propose and then execute the blocks. Where's the casus belli log for these actions? --Tagishsimon (talk) 02:00, 29 November 2018 (UTC)
@Tagishsimon: FWIW we don't do these things randomly, and with the best intentions, though sometimes there are unenvisaged consequences. Always happy to discuss, and make modifications.

Domain search links are a spambot favourite sitting beside hard spam links (have a look at special:log/spamblacklist somewhere where you have sufficient rights) and also Special:Abuselog. I think that they try to leverage "good" links like greylisting or credibility, though also where they have a known search result leading to their items. I do run through various Special:LinkSearch links and have run reports (where possible) from COIBot to watch for consequences prior to actioning, though predominantly focusing on WPs rather than WD. [We don't have a good means to look at all wikis for abuse, or linksearches]. There are also direct spammers who use the search links like url shorteners, which we all know are widely abused. Comfortable with removing it if we don't have a means to improve it, noting though that generally the WPs discourage such search links as they are dynamic and variable in their output and rubbish for references. The community can add an exclusion to Mediawiki:Spam-blacklist if you are looking to allow something locally.  — billinghurst sDrewth 12:14, 29 November 2018 (UTC)

@billinghurst: Thanks. I guess I can kinda see how spammers can game some sites via search, for instance if they can add comments to forums that are indexed. I'm struggling to understand how they can game the gov.uk search, which I'd have thought is pretty thoroughly tied down. I'm afraid I'm not able to cause the special:log/spamblacklist UI to list instances of gov.uk abuse; and in terms of transparency, that would only show refused edits after the domain was blocked, not whatever went on to cause the block in the first place. This is very much not my area, so please forgive what may be staggeringly naive questions. Presuming I'm missing something obvious, can you give me a worked example of a way in which gov.uk/search might be abused? I do appreciate that proactive dealing with spammers is important, and I very much AGF w.r.t. your work in the bilges. --Tagishsimon (talk) 07:29, 30 November 2018 (UTC)
@Tagishsimon: Sometimes it is that the search links are bad/evil, and sometimes they are workable, though indicators of foul play, or solely an attempt to legitimise the spam attempt. If the search links are not necessary crosswiki, then where they are being excessively abused it should be a reasonable defence to exclude them. If one wiki needs those links, or subsets of those links, then the accepted methodology to utilise links is the local whitelist by the expected means of the community (usually either mediawiki talk:spam-whitelist or local admin noticeboards). If communities find an addition problematic, then we remove it.

To my process, when I do blacklisting in a case like this, I will usually monitor spam links for a period of time, and then subsequently check the blacklist logs at several primary sites including enWP, Commons, if language specific, occasionally WD, but not so much. For the gov.uk search regex, I have just reviewed logs for one month for enWP and C, and 3 months for WD, and your attempt is the only legitimate attempt to use that search link against the 100+ spambot attempts. Where I see that an addition is problematic then it either does not happen, or subsequently that it is causing issues, then I reverse it. I am open to suggestions on how to better advertise these blocked search links. There is not a better means to have explanatory text for the system, nor a more convenient, nuanced means to defer a link addition in the mediawiki system. Noting that these defences are generally not sexy enough to get developer resources, and the general ability to stop spambots getting inside the system is totally crap and not one that MW developers evidently dedicate time.  — billinghurst sDrewth 12:00, 30 November 2018 (UTC)

@billinghurst: Thanks again, Billinghurst. I'm not going to push you to whitelist gov.uk/search. Still a bit confused about what the spammers are up to - at least in respect of gov.uk, having got my act together and looked through a few pages of the log. As you say, perhaps trying to legitimise their account, or an edit, or a series of edits. All most odd.
Thank you for the work you do. It all looks and sounds an unenviable but necessary task. I could make suggestions like: the error message when a spamtrap is triggered might be altered to tell the user where to go to look for more info ... but perhaps it's as well to keep the regex & process obscure, all things considered. --Tagishsimon (talk) 12:24, 30 November 2018 (UTC)
Like any aspect of Wikimedia projects which are effectively in the hands of one editor, or a small number of editors, it cannot work properly. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:53, 29 November 2018 (UTC)
Seems like an unhelpful and inaccurate comment. Please enlighten us of the benefit of that contribution.

With regard to the spam management, this probably also works better than no one doing anything, but let us not worry about effectiveness of that aspect. Apart from the component that numbers of people work in the area of managing spam defences against a system that is abused by spambots, there are indeed a small number of people who concentrate some efforts. Now nothing is done out of vision of the community, logs are available of both the spam lists, spam hits and global abuse filters, and messages are provided where edit blocks occur in all scenarios, and this should enable the communities to identify problems and provide suitable feedback. Noting anyone is welcome to come along and help, though it is not attractive mop work, and it does not get any plaudits. Always willing to hear of improved ways to do things, and where problems are caused and solutions are needed, and I look forward to that shared positive approach into the future.  — billinghurst sDrewth 21:33, 29 November 2018 (UTC)

en:MediaWiki_talk:Spam-whitelist/Archives/2018/03#360cities.net is one example. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:33, 1 December 2018 (UTC)
A domain blocked by Beetstra in 2011, after spamming at enWP and others, following a blacklisting requested by the community, and you are pointing to a whitelist request at enWP. What is your point? You are not adding clarity to your criticism, nor providing suggestions on the improvements that could be made.  — billinghurst sDrewth 11:05, 4 December 2018 (UTC)
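The mechanism discussed in this section (a global regex blacklist, overridden where a community adds a local whitelist entry) can be sketched roughly as follows in Python. This is only an illustration of the precedence rule: the actual blacklist entries are not quoted in the thread, so every pattern below is hypothetical, and this is not how the MediaWiki extension is implemented.

```python
import re

# Hypothetical patterns for illustration only; the real entries are not shown in this thread.
GLOBAL_BLACKLIST = [
    re.compile(r"\bgov\.uk/search\b"),
    re.compile(r"\bbbc\.co\.uk/search\b"),
]

# Hypothetical local exception a community might add for one specific search.
LOCAL_WHITELIST = [
    re.compile(r"\bgov\.uk/search\?q=ambassador\b"),
]


def is_blocked(url, blacklist=GLOBAL_BLACKLIST, whitelist=LOCAL_WHITELIST):
    """Return True if the URL hits the blacklist and no whitelist entry rescues it."""
    if any(pattern.search(url) for pattern in whitelist):
        return False  # a local whitelist match takes precedence
    return any(pattern.search(url) for pattern in blacklist)


for candidate in (
    "https://www.gov.uk/search?q=cheap-pills",
    "https://www.gov.uk/search?q=ambassador",
    "https://www.gov.uk/government/news",
):
    print(candidate, is_blocked(candidate))
```

The point of the sketch is that a narrow local exception leaves the general search-link block in place for everything else, which matches the route billinghurst suggests for wikis that need a specific link.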

how to go about getting wikidata or wikipedia to coordinate the efforts of all of humanity to combat climate change and the general degradation of the planet.

With the planet in dire straits, and the threat of being past or approaching the "tipping point", it is necessary for humanity to act and try to save the situation. So the first step, in my opinion, to be effective, is to coordinate all the efforts of all of humanity throughout the planet. This could seem like a daunting challenge, yet it is obviously necessary even though it has never been attempted before. Wikipedia or wikidata, I am not sure which, would be the best website on the internet to do it, due to its vast readership and humanistic bent, and it would be best if there is only one website doing it.

It would take a large team of dedicated volunteers to categorize in many different areas all that is going on, who is doing what, where, how, what do they need, etc. to minimize redundancy and to be sure that nothing is being left out.

There are already websites with lists of things that need to be done such as Drawdown (https://www.drawdown.org/?fbclid=IwAR3Vx2mA5c4_8Kgk_qBlorLi0GAxxQIRQ9bPtB_tADa6U83cH2xFqqmtr1s). And there are many other possible solutions. And groups doing big projects like treesisters.org who have organized women around the world to plant one billion trees per year.

Not only would it be an organizing site, but it would be an opportunity to help unify humanity to a large degree if it becomes popular and with this unity it may be possible to convince governments to do things, like bring back the military to work on remedial projects.

I have a few other ideas that would be useful, but first I need help or ideas how to get wikipedia or wikidata (whichever is most appropriate) to take on the project. Please respond so we can get moving on this, if you agree with the logic of the idea.  – The preceding unsigned comment was added by Mofwoofoo (talk • contribs) at 22:57, November 29, 2018‎ (UTC).

Yes I agree. One idea I had was to conduct a major survey of all remaining single-line rail systems in the US that are near or in large urban areas and run a cost-benefit analysis to change them into double-line rail systems suitable for commuter services. This would drastically reduce the carbon footprint of traffic jams. I have lots of other ideas too, but no idea how to model them in Wikidata. Good luck with your project! Jane023 (talk) 10:06, 30 November 2018 (UTC)
Part of Wikipedia's mission is to be neutral. As such it's not the best platform to organize activism. To the extent that you think that adding certain information to Wikidata is valuable for your environmental activism you might do that, but this is likely not a good place to unify people around activism for a particular cause. ChristianKl❫ 11:42, 30 November 2018 (UTC)
Well I also like to think that we can make contributions that could save the planet. However I admit I don't quite know how to do this. Maybe a Wikidata wikiproject "Environmental activism" or "Climate Change" or "Save the Planet"? Jane023 (talk) 10:58, 1 December 2018 (UTC)
I'm sure you can organize something towards environmental activism if you want to. You could probably organize a pie-eating contest as well, and maybe a for-profit corporation and a book group and a band. However, you certainly would not be permitted to do any of these things on Wikidata itself, as that is not what Wikidata is for. I don't even understand why you would want to do any such thing on Wikidata, it's not like Wikidata discussion pages have some associated +5 effectiveness bonus to whatever you want to do. --Yair rand (talk) 06:44, 3 December 2018 (UTC)
Well I tend to organize a lot of stuff on Wikidata, some political, some art-related, and no one has objected before. I am surprised you feel politics shouldn't have a home on Wikidata. I am really impressed with the work done on politicians, for example. I see no problem in serving up items with few statements to people who may be interested in adding to them. There are lots of politicians interested in climate change per country. Jane023 (talk) 13:21, 5 December 2018 (UTC)
IMO, Wikipedia is an encyclopedia. Like the news, it shouldn't get involved. It should only report what happened. It shouldn't choose a side. It shouldn't try to sway future events. It should simply be a record of historical/important events. Lazypub (talk) 11:52, 1 December 2018 (UTC)
Even encyclopedias are political. In fact, it's hard to find some aspect of human life that is not political. Jane023 (talk) 13:21, 5 December 2018 (UTC)

GlobalFactSyncRE/DBpedia project proposal

DBpedia, which frequently crawls and analyses over 120 Wikipedia language editions, has near complete information about (1) which facts are in infoboxes across all Wikipedias, and (2) where Wikidata is already used in those infoboxes. GlobalFactSyncRE will extract all infobox facts and their references to produce a tool for Wikipedia editors that detects and displays differences across infobox facts in an intelligent way to help sync infoboxes between languages and/or Wikidata. The extracted references will also be used to enhance Wikidata. For more see meta:Grants talk:Project/DBpedia/GlobalFactSyncRE

Please let us know what you think, your opinion is important to us! Thank you!  – The preceding unsigned comment was added by 212.27.205.232 (talk • contribs) at 13:38, 30 November 2018‎ (UTC).
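To make the proposal's core idea concrete, here is a small Python sketch of what "detecting differences across infobox facts" could look like once facts have been extracted per language edition. This is an assumption-laden illustration, not the DBpedia or GlobalFactSyncRE implementation: the tuple layout, the function name, and the sample values are all invented for the example.

```python
from collections import defaultdict


def find_disagreements(facts):
    """Group extracted facts and report those whose values differ between editions.

    `facts` is an iterable of (language, subject, property, value) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    values = defaultdict(lambda: defaultdict(set))
    for language, subject, prop, value in facts:
        values[(subject, prop)][value].add(language)
    # Keep only (subject, property) pairs that carry more than one distinct value.
    return {
        key: {value: sorted(languages) for value, languages in by_value.items()}
        for key, by_value in values.items()
        if len(by_value) > 1
    }


# Invented sample data: three editions agree on "area", two values exist for "population".
sample = [
    ("en", "ExampleTown", "population", "10000"),
    ("de", "ExampleTown", "population", "10000"),
    ("fr", "ExampleTown", "population", "12000"),
    ("en", "ExampleTown", "area", "5"),
    ("fr", "ExampleTown", "area", "5"),
]
print(find_disagreements(sample))
```

A real tool would also need to normalise units, dates and number formats before comparing, and to carry the extracted references along with each value; the sketch leaves all of that out.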

Credited to Vs. real author

Seeking the proverbial wisdom of the crowd here: Do we have a standard way of representing the real creator of a work alongside the "official" creator of a work? If not, any good tips on how to hack it? Is a new property in order, perhaps? I'm looking for something general, but mostly work on music myself. So a classic example would be the "Lennon/McCartney" credits. Moebeus (talk) 01:28, 1 December 2018 (UTC)

@Moebeus: one example I’m aware of: Trump: The Art of the Deal (Q7847758). However, it’s the only work on Wikidata with two author statements disputing each other (source), so it’s not really representative of a “usual” way to model this. --TweetsFactsAndQueries (talk) 10:12, 1 December 2018 (UTC)
@TweetsFactsAndQueries: Heh, while funny that's not exactly what I'm looking for. Disputed statements are a thing apart, what I'm looking for is something more formal, in relation to your example maybe "author=Donald Trump, ghostwriter=Tony Schwartz". Or in the case of the Beatles: How to indicate that "Yesterday" was written by Paul McCartney but credited to Lennon-McCartney. Moebeus (talk) 15:53, 1 December 2018 (UTC)
Real vs real (in the entertainment industry) is one of the reasons I gave up on Wikipedia. I find it hard to be an encyclopedia when you're only allowed to use approved sources, all of which happen to be paid advertisements regurgitating the same cover stories. Lazypub (talk)

This also applies to much older performance works, which are traditionally attributed to an author, but which some (or most) modern scholarship rejects. For example the Sanskrit play Mṛcchakaṭikā (Q3429324) is traditionally ascribed to Śūdraka (Q1331151), a king for whom there is no evidence that he ever existed, and is now assumed to be a pen name of some kind. Or Rhesus (Q667750), a Greek tragedy traditionally credited to Euripides (Q48305), but which is disputed by modern scholarship on the basis of style. In these situations, how do we indicate traditionally-assigned authorship? How do we indicate scholarly doubt or disputation concerning authorship? And what do we offer as an alternative when there is no other individual identified as the author?

We have the same problem with some published literature, such as The Vampyre (Q509070). All of the earliest copies of the novella state on the cover page that Lord Byron was the author (even now French Wikisource identifies him as the author). But the actual author was John William Polidori (Q364264), as evidenced by published letters from both Byron and Polidori asserting the latter's authorship. So how do we indicate for a published edition that the edition claims one individual as author, but the actual author was a different individual? --EncycloPetey (talk) 17:44, 1 December 2018 (UTC)

@EncycloPetey Imagine if the vinyl junkies and the bookworms got together and proposed a couple of new properties, we'd be unstoppable! Jokes aside, hit me up if you want to collaborate on coming up with something that would work for both books and music - we already share a lot of properties so it could make sense. (I guess Project Music is really encroaching on book properties more than sharing, but that's the way the cookie crumbled) Moebeus (talk) 02:31, 2 December 2018 (UTC)

If I had even the start of a workable idea, I'd be proposing it. I'm hoping that someone else has a means to tackle the problem because this is one where I know the problem, but am stumped as to solving it. --EncycloPetey (talk) 02:44, 2 December 2018 (UTC)
How about a property “credited author” AKA “credited to” = “name appearing as the creator of a published work, version or edition, when different from the actual creator”? - PKM (talk) 23:28, 2 December 2018 (UTC)
Perhaps a qualifier approach (object has role (P3831) or nature of statement (P5102)) might be more appropriate, given that for some other properties (e.g. publication date (P577)) the original work's data is kept even if it's factually incorrect. Jc86035 (talk) 00:32, 3 December 2018 (UTC)
@Jc86035: I kinda like the idea of using nature of statement (P5102) but then that requires a relatively strict set of approved and well thought-out values or it will quickly get super-confusing. Moebeus (talk) 00:49, 3 December 2018 (UTC)
I continue to fail to really grasp the intended dividing line between sourcing circumstances (P1480) and nature of statement (P5102) (possibly not helped by a really rather unhelpful label on P1480 ?), but P1480 might also be appropriate ("sourcing circumstances" = "record company fiction") ? Jheald (talk) 12:57, 4 December 2018 (UTC)
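For readers trying to picture the qualifier approach suggested by Jc86035, a rough Python sketch follows. It models an author (P50) statement that carries a nature of statement (P5102) qualifier, using items already mentioned in this thread. The qualifier value is a placeholder, since no agreed item for "traditionally attributed" or "credited only" is named here, and the `Statement` class is a simplification invented for the example rather than the actual Wikibase data model.

```python
from dataclasses import dataclass, field

# Placeholder for whichever item the community would agree on; not a real QID.
TRADITIONALLY_ATTRIBUTED = "Q_PLACEHOLDER_TRADITIONAL_ATTRIBUTION"


@dataclass
class Statement:
    """Simplified stand-in for a Wikidata statement with qualifiers."""
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


statements = [
    # Rhesus (Q667750): author (P50) Euripides (Q48305), flagged as a traditional attribution.
    Statement("Q667750", "P50", "Q48305", {"P5102": TRADITIONALLY_ATTRIBUTED}),
    # The Vampyre (Q509070): author (P50) John William Polidori (Q364264), unqualified.
    Statement("Q509070", "P50", "Q364264"),
]


def split_by_attribution(items, marker=TRADITIONALLY_ATTRIBUTED):
    """Separate plain authorship claims from those flagged with the marker qualifier."""
    flagged = [s for s in items if s.qualifiers.get("P5102") == marker]
    plain = [s for s in items if s.qualifiers.get("P5102") != marker]
    return plain, flagged


plain, flagged = split_by_attribution(statements)
print([s.subject for s in plain], [s.subject for s in flagged])
```

As Moebeus notes, this only stays readable if the set of allowed qualifier values is small and agreed in advance; otherwise data users cannot filter reliably.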

Storing all sports results in Wikidata?

Hey, does anyone know if there has ever been any discussion about expanding Wikidata's sport data to include scores and other data from all matches of professional sports teams? On Wikipedia, it's pretty common to have this data in the articles for each team's season (e.g. en:2017–18 Manchester United F.C. season). This data ends up being duplicated a lot – every result needs to be recorded in both teams' season articles and the league's season article, in every language's Wikipedia. If this data were on Wikidata, it would be less work for editors to update when matches conclude, and it would automatically sync to every page in every language. It would also be incredibly valuable to anyone wishing to reuse Wikidata data for other projects. But, this would mean every sports match would have its own item, and we'd definitely need major buy-in from one of the larger Wikipedias for the data to be kept up to date. I'm curious what others think of implementing something like this. Thanks, IagoQnsi (talk) 17:51, 3 December 2018 (UTC)

The field of sports results is not well developed indeed. Some thoughts:
  • A major concern is about licencing of typically commercial sports results databases. Unlike in the cultural sector where lots of institutions move to open licences, most sports results databases are incompatibly licenced (with proprietary licences), as companies earn quite some money with their data. I am not sure based on which legal basis Wikipedias copy that much data from various proprietary databases, but doing so in Wikidata where we add an explicit CC0 licence tag to the sports results would clearly create some serious headache. At least for me.
  • If we ignore the legal concerns, some further thoughts: you can aim for in depth sports results that cover each and every aspect of a match/game/competition/tournament; this requires a lot of items likely on a per-match/race/fight/etc. basis, which need to be properly connected to each other and properly identified against external resources with identifiers; that is technically possible already now, but the amount of items would clearly make this a demanding task that requires a lot of automation. Alternatively, a simple approach would be to add consolidated results (winner, second, third, …, last place; with some relevant extra data like final scores, race time, etc) to existing items, but ignore all in-depth details that happened before the final round of matches; this way, one wouldn’t need to create lots of extra items and could immediately use such data in infoboxes etc, but one could not replace excessive data in pages such as en:2017–18 Manchester United F.C. season.
  • Another problem of in-depth results is that one also needs a lot of extra items about players, clubs, competitions, etc. that are not yet available. In Wikipedia, you can just add a red link if something is missing, but we cannot do so here. Formally, we are probably able to create many of these missing items as well (as long as they can be matched against an external resource), but this is a significant task by its own which should not be ignored.
  • When presenting sports results, all kinds of obscure special situations have to be considered and properly modeled. This is much easier to get done with unstructured, explanatory wikitext in Wikipedia than with structured data here at Wikidata.
  • There is also a desire to make “sports results in Wikidata” available according to a somewhat generic model that fits “all” types of sports, while still being able to consider sports-specific details. That would allow to gain lots of really interesting insight that is not even remotely available anywhere else. The excessive data collections in Wikipedia have not been designed with that aim; in Wikidata, the general model would be much more complicated to elaborate.
I would really like to see some progress regarding sports results in Wikidata. So, editors, please add more input here. —MisterSynergy (talk) 18:30, 3 December 2018 (UTC)
@MisterSynergy: Thanks for your thorough response. To respond to some of your concerns...
  • Licensing: You can't own the facts. If we went really in-depth (like detailed stats on each player), licensing might be an issue, but I don't think we need to go that far. Just the basics (e.g. teams, league, final score, date, time, referees, venue, line-ups, who/when for goals/discipline/substitutions, etc) would be enough. Our schema is entirely original, so there'd be no realistic basis for a copyright dispute.
  • Missing items: This will definitely happen sometimes, but I don't think it'd be entirely that often. On English Wikipedia, the notability standard for sports is a very low bar: for most major sports, if an athlete has appeared in even one fully professional match, they are considered notable. Thus, we have Wikipedia articles (and thus, Wikidata items) for even obscure players.
  • Lack of structure: This is true to an extent, and making a full data model won't be easy, but I think it's definitely doable. A lot of Wikipedia articles already use highly-structured templates such as en:Template:Football box collapsible (example: en:2017 Atlanta United FC season#Results). There will surely be some edge cases, but I think 99.9% of matches will be represented without major issues.
  • Need for automation: Automation will definitely be needed to initially add the data, but if we can get buy-in from a big Wikipedia, I think we might have a reduced need for ongoing automation. If, say, English Wikipedia started replacing manually-edited templates with Wikidata templates, then the hordes of people who typically update the articles would hopefully learn how to make the update on Wikidata instead (liberal use of "edit on Wikidata" links and explanatory source code comments will likely be necessary). For the initial import, we can hopefully automate importing a lot of data from articles that already use templates (as I mentioned in the "Lack of structure" bullet point). There are also some public domain sports data sources such as sport.db and football.db.
IagoQnsi (talk) 19:01, 3 December 2018 (UTC)
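As a rough illustration of the "just the basics" model described in the bullets above, here is a Python sketch of one record per match from which each team's season view is derived, so a result is entered once rather than in several articles. The field names, the helper function and the sample teams are all hypothetical; no Wikidata properties or existing schema are implied.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    """One record per match; field names are invented for this sketch."""
    match_date: date
    competition: str
    home_team: str
    away_team: str
    home_goals: int
    away_goals: int
    venue: str = ""


def team_results(matches, team):
    """Return (opponent, goals_for, goals_against) for every match `team` played."""
    results = []
    for match in matches:
        if match.home_team == team:
            results.append((match.away_team, match.home_goals, match.away_goals))
        elif match.away_team == team:
            results.append((match.home_team, match.away_goals, match.home_goals))
    return results


# Invented sample data.
season = [
    Match(date(2018, 3, 3), "Example League", "Team A", "Team B", 2, 1, "Example Park"),
    Match(date(2018, 3, 10), "Example League", "Team C", "Team A", 0, 0),
]
print(team_results(season, "Team A"))
print(team_results(season, "Team B"))
```

Line-ups, goal times, referees and the edge cases MisterSynergy mentions would each need further fields or linked records, which is where the modelling effort discussed above would actually go.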
@IagoQnsi: "You can't own the facts": you are right ... if you extract the data from the newspapers, TV sport program or other different sources. But if you extract all data from the same source, then you are wrong. One fact is not copyrighted, but a set of structured facts in one data model is.
Then import from WP is not welcome as this is copyrighted under CC BY SA and WD is licenced under CC0. And I don't speak about the rejection by WP of massive and uncontrolled data from all WPs in WD. Snipre (talk) 20:42, 3 December 2018 (UTC)
@Snipre: In the U.S. (where Wikimedia's servers are located), a database is only copyrightable if there is something original/creative about the way the data was gathered/organized. Facts such as the date, the location, who won, who scored at what time, etc require no creativity to compile, and thus are not protected by copyright. These facts are comparable to phone book information, which the Supreme Court has ruled are not copyrightable, or to data in Wikipedia infoboxes, which we import to Wikidata all the time. It's possible that the visual layout of templates such as en:Template:Football box collapsible could be copyrightable, but we wouldn't be replicating the visual presentation on Wikidata -- we'd just be grabbing the raw data. In the European Union, there is the Database Directive which provides database rights as you described, but as long as we avoid importing data from European sources, we should be fine. More info: meta:Wikilegal/Database Rights, en:Sui generis database right. –IagoQnsi (talk) 21:05, 3 December 2018 (UTC)
Additional note: For importing data from the Wikipedia templates, we would definitely have to omit the 'notes' field, and we can't import the match report URLs unless we're very careful about it (sometimes people will add multiple URLs and/or link to the team's report instead of the league's, which could be argued as a creative decision). But for everything else, I don't think there's any realistic argument that any originality went into compiling that data. –IagoQnsi (talk) 21:22, 3 December 2018 (UTC)
@IagoQnsi: I am aware of meta:Wikilegal/Database Rights so I can only recommend you to read it carefully, especially that comment about copyright protection in the US: "The level of creativity required is low, so it doesn’t have to be very creative — as long as the author had some discretion and made some choices in what to include or how to organize it, the database is likely to be protected." So if a website is copyrighting its database, better be careful before judging that Supreme Court reference can overrule that copyright.
And your choice of using WP instead of original databases to import data into WD is clearly a way to bypass the copyright question. If it is really so clear that no copyright can be applied, why not extract data directly from the original databases? Why use a dataset which is probably corrupted due to manual entry? And finally, you don't provide any argument against the fact that WPs don't want to use WD data because WD is unreliable, due to its use of data from sources not considered reliable references. WP itself considers that WP data can't be used to source other WP statements; only external sources can be used as references. You just reinforce their criticism of WD, so thank you for working so hard to make WD completely useless for WP. Snipre (talk) 20:10, 4 December 2018 (UTC)
@Snipre: I definitely agree that trying to import from premium sports databases that claim copyright is risky and not ideal. This is why I favor importing data from Wikipedia, since there's little risk that anyone would challenge our public domain rationale. I also like using that data because it's what Wikipedia is already using. It will be easier to convince the Wikipedia community to keep using the same data they already have (just migrated to a new place), rather than switching to a completely new data source that they don't know anything about. It doesn't really matter that it's unreliable when they're already using it anyway.
Additionally, my expectation is that we'd only really be importing data to start this project—moving forward, the data would be maintained by Wikipedia users. The existing data on Wikipedia doesn't come from a database – after every match, someone manually updates each article based on match reports. If we started replacing the old manually-edited templates on Wikipedia with new Wikidata-powered templates, then those people who used to update the templates would hopefully learn to update Wikidata instead. We've already got an army of volunteers doing the work of maintaining this data—we just need to redirect them to Wikidata. Of course, this will require major buy-in from the Wikipedia community, but I'm optimistic that we could get consensus—after all, this would be reducing their workload in the long-term. –IagoQnsi (talk) 20:55, 4 December 2018 (UTC)
If you import data from Wikipedia then the resulting data on Wikidata won't be accepted in any of the major Wikipedias. If your goal is to have data that gets used by multiple Wikipedias, it's not a viable strategy. ChristianKl❫ 09:49, 5 December 2018 (UTC)
I've started working on a document outlining precisely what properties a sporting event can/should/must have: User:IagoQnsi/sports. Any feedback would be appreciated. –IagoQnsi (talk) 19:45, 3 December 2018 (UTC)
It seems to me like a collaboration with sport.db and football.db would be ideal as those projects already value putting data into the public domain. Given that they have a forum, how about asking there what they think about collaboration? ChristianKl❫ 10:58, 5 December 2018 (UTC)
@ChristianKl: I'm okay with the idea of using an external data source like football.db. However, if we're choosing to work with them over importing from Wikipedia because we don't think people will trust Wikipedia data, I don't think they're any better – just like Wikipedia's data, theirs is manually maintained by editors. Wikipedia data is probably more trustworthy because it usually includes a link to the official match report as a reference (which we could import to Wikidata). There are other issues with football.db's data as well. It's very incomplete and barebones – except for their World Cup, they only have the team names and the final score, even for the biggest leagues (e.g. English Premier League, Bundesliga, La Liga). For less-popular leagues like Major League Soccer, the data can be years out of date. Wikipedia data is much more detailed and is very up-to-date.
I know accuracy/reliability is a major problem with imported Wikipedia data, but I think it's a solvable problem. After importing the data, we can manually check it against official match reports, and then link those match reports in the reference field. This would probably have to be done rather slowly, one league at a time, but I think it's the only way we can get data that is detailed, accurate, and not-legally-risky. –IagoQnsi (talk) 19:52, 6 December 2018 (UTC)

Wikipedia redirects reflected in wikidata

Hello, we are integrating data from Wikidata partly based on Wikipedia IDs. In many cases the Wikipedia URL has changed in the meantime. The old URL redirects to the new one, which itself points to the relevant Wikidata item. My question: would it be acceptable to also add the old WP URLs to Wikidata, in order to be able to integrate through them in the future?

Here are some examples: http://en.wikipedia.org/wiki/University_of_Grenoble http://en.wikipedia.org/wiki/Catholic_University_of_Leuven_(1834–1968)

Thank you. --Nikola Tulechki (talk) 08:35, 4 December 2018 (UTC)

@Nikola Tulechki: I'm not sure what you're trying to do, but it is technically impossible to add more than one "sitelink" for a Wikipedia edition or other project from a Wikidata item (i.e. one item can only have zero or one links to the English Wikipedia). Is it problematic to extract the Wikidata ID from the current page's title? Are you using a list of Wikipedia articles which is out of date? (If you were to regenerate such a list you could use the Wikidata Query Service to look at Wikidata's data.) Jc86035 (talk) 09:32, 4 December 2018 (UTC)
@Jc86035: Yes, I have some outdated WP links in my source data. I am using federated SPARQL queries to get WD items from them by querying for ?WPurl schema:about ?WDitem . For outdated links like the ones in the example I get no hits. What I want to do is add new schema:about triples using QuickStatements. However, as I am writing this I realize that that may not even be technically possible. --Nikola Tulechki (talk) 12:14, 4 December 2018 (UTC)
@Nikola Tulechki: You could try using w:en:Module:Redirect (i.e. {{subst:#invoke:redirect|main|page-title-1}} % {{subst:#invoke:redirect|main|page-title-2}} ...) in an English Wikipedia edit window and previewing the page to get the current article titles, although you would be limited to 500 articles for each preview. Jc86035 (talk) 13:30, 4 December 2018 (UTC)
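As an alternative to previewing a module call, the outdated URLs from this thread could in principle be resolved through the MediaWiki action API, which can follow redirects (`redirects=1`) and return the connected Wikidata ID (`prop=pageprops`, `ppprop=wikibase_item`). The following is only a minimal sketch: the function names are hypothetical, no network request is made, and the sample response used in the usage note is made up to show the expected shape rather than real data.

```python
from urllib.parse import urlencode, urlsplit, unquote

API = "https://en.wikipedia.org/w/api.php"


def title_from_url(url):
    """Extract the page title from a /wiki/ URL (underscores become spaces)."""
    path = urlsplit(url).path
    return unquote(path.split("/wiki/", 1)[1]).replace("_", " ")


def build_query_url(titles):
    """Build an action=query URL that follows redirects and asks for the Wikidata ID."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "redirects": "1",            # resolve redirects server-side
        "prop": "pageprops",
        "ppprop": "wikibase_item",   # page property holding the Q-ID
        "titles": "|".join(titles),
    }
    return API + "?" + urlencode(params)


def map_titles_to_items(titles, response):
    """Map each requested title to the Q-ID of the page it currently leads to.

    `response` is the decoded JSON of the query above. Titles without a
    connected item (or missing pages) map to None.
    """
    query = response.get("query", {})
    normalized = {n["from"]: n["to"] for n in query.get("normalized", [])}
    redirects = {r["from"]: r["to"] for r in query.get("redirects", [])}
    items = {
        page["title"]: page.get("pageprops", {}).get("wikibase_item")
        for page in query.get("pages", [])
    }
    result = {}
    for title in titles:
        current = normalized.get(title, title)       # title normalization first
        current = redirects.get(current, current)    # then the redirect target
        result[title] = items.get(current)
    return result
```

For example, given a made-up response in which "Old title" redirects to "New title" and the latter is connected to the placeholder ID "Q_PLACEHOLDER", `map_titles_to_items(["Old title"], response)` would return `{"Old title": "Q_PLACEHOLDER"}`. The mapping stays on the consumer's side, so nothing extra has to be stored on Wikidata.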

Where can I find information about the Data: namespace on Commons?

Hi all

I've been trying to find information about the Data: namespace on Commons (which Wikidata uses a lot) and what kind of files are allowed and under which licenses (e.g tabular data and maps) but I can't find any information, even on Commons:File types, does someone know where I can find this? I tried asking on Commons Village pump but no reply :(

Thanks

--John Cummings (talk) 09:38, 4 December 2018 (UTC)

@John Cummings: c:Commons:Data links to c:Help:Tabular Data and c:Help:Map Data. Jc86035 (talk) 09:48, 4 December 2018 (UTC)
Thanks very much @Jc86035:, perfect. Just to double check, I had previously been told in a workshop that files in the Data: namespace on Commons had to be available under CC0, but reading this documentation it appears that CC BY and CC BY-SA are also accepted? Is this correct?
--John Cummings (talk) 10:14, 4 December 2018 (UTC)
@John Cummings: I'm not sure how this works ("technically possible" doesn't mean a project will allow it), but presumably you would have to link to the data page in some way to comply with BY-SA. Jc86035 (talk) 10:26, 4 December 2018 (UTC)
@John Cummings: From the discussion at m:Community_Wishlist_Survey_2019/Multimedia_and_Commons/Allow_non-CC0_licensed_data_for_datasets, it looks as though there may still be some issues to be addressed before the interfaces will be adjusted to allow data to be uploaded that is not CC0. Jheald (talk) 12:48, 4 December 2018 (UTC)
Thanks very much @Jheald:, so is it that the instructions on those pages are wrong? Or is it that Commons is allowing them to be uploaded but there's a bug meaning they can't be used yet? I'm asking because I'm helping to write some instructions for Geoshapes use on Wikidata at User:John Cummings/Geoshapes and I'm getting conflicting information. --John Cummings (talk) 13:46, 4 December 2018 (UTC)
@John Cummings: I haven't tried to upload any datasets yet, but I believe that the uploader only allows you to specify them to be CC0, and that license is hard-coded into the description page (which doesn't use the usual Commons templates or wikitext). But somebody may be able to confirm or correct this understanding? Jheald (talk) 15:50, 4 December 2018 (UTC)
An OSM geoshape of Hungary

@John Cummings: The ability to add non-CC0 data was added and then rolled back a few days later. I've just tried to modify a page I created that uses CC-BY-2.5 and it wouldn't save. But as mentioned at the bug report, you could use CC0-1.0 and note the correct licence in the "sources" or "description" fields, as done here.

BTW, I would advise using "Commons map data" (or similar) instead of "geoshape", as "geoshape" is one of the types of data available directly from OpenStreetMap via Wikidata IDs. (And, yes, the Wikidata property is badly named.) Gareth (talk) 21:46, 4 December 2018 (UTC)

@Gareth: Saying CC-BY-SA 4.0 on a page with "Data available under Creative Commons Zero" seems to be very hostile for data reusers who might not read all text but count on the meta data being correct. ChristianKl❫ 13:12, 5 December 2018 (UTC)
@ChristianKl: This was a suggestion from a WMF developer. And as I mentioned above, saving with the correct licence no longer works, so it's a choice between either leaving pages broken and unusable (which is what I've done because I wasn't able to finish what I started before the changes were rolled back) or providing an incorrect licence and clarifying the situation as best as possible. Gareth (talk) 23:18, 5 December 2018 (UTC)

Identify problems with adding new languages into Wikidata

The development team is aware that several problems occur around the process of adding and using new languages in Wikidata.

With this feedback loop, we would like to list and describe the problems, so we can address them and together with the community, find stable solutions for these different problems.

Please add your input directly on the talk page :) The feedback loop is open until December 18th. Lea Lacroix (WMDE) (talk) 11:35, 4 December 2018 (UTC)

Wikidata weekly summary #341

Facto Post and ScienceSource latest

The latest issue of Facto Post is available on Wikipedia, concentrating on WikiCite and librarians. It also links to the ScienceSource wiki, where text mining has been active, to create annotations. A ScienceSource queries page here displays some SPARQL that runs over there: the Ps and Qs are not Wikidata's, but in that Wikibase (the overloading of the letters is currently inevitable). So you would do better to consult http://sciencesource.wmflabs.org/wiki/SPARQL_and_suggester_queries where there are links that run, even if the page is in comparison black-and-white rather than Technicolor.

This link, there called "Co-occurrence version 3", is the simplest way to understand the ScienceSource project. It illustrates where in a short paper a drug and a disease term occur close together (less than 200 characters apart). We are looking for statements for medical condition treated (P2175), and such places are the best ones to check out.
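The co-occurrence idea can be illustrated with a small sketch. This is not the ScienceSource implementation: the function name, the plain substring matching, and the way the distance is measured (characters between the end of one term and the start of the other) are all assumptions made for illustration.

```python
import re


def cooccurrences(text, drug_terms, disease_terms, max_gap=200):
    """Return (drug, disease, gap) for term pairs less than max_gap characters apart."""

    def spans(terms):
        # Collect (term, start, end) for every case-insensitive match in the text.
        found = []
        for term in terms:
            for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
                found.append((term, match.start(), match.end()))
        return found

    hits = []
    for drug, d_start, d_end in spans(drug_terms):
        for disease, s_start, s_end in spans(disease_terms):
            # Whichever term comes first, the gap is the distance from its end
            # to the start of the other; overlapping matches give a negative gap.
            gap = max(s_start - d_end, d_start - s_end)
            if 0 <= gap < max_gap:
                hits.append((drug, disease, gap))
    return hits
```

Each hit only marks a passage worth checking by a human; proximity alone does not establish a medical condition treated (P2175) statement.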

Leave me a message if you'd like to know more.

Charles Matthews (talk) 15:19, 4 December 2018 (UTC)

Funny bots

The French singer Richard Anthony (Q774501) has apparently authored tens or hundreds of scientific publications. Any scientist named "R Anthony" is now redirecting towards the singer. I don't know why, and what to do, but I'm just letting you know. :) Totodu74 (talk) 17:23, 4 December 2018 (UTC)

ping Daniel Mietchen, looks like a sourcemd/quickstatements batch went wrong --Lucas Werkmeister (talk) 17:39, 4 December 2018 (UTC)
Actually it looks like the source of the problem was a merge back in October from item Q56191657 for "R Anthony". ArthurPSmith (talk) 18:52, 4 December 2018 (UTC)
There were wrong identifiers assigned to the item even earlier, and I wouldn’t be surprised if the merge was done based on the (wrong) identifiers. —MisterSynergy (talk) 19:14, 4 December 2018 (UTC)
Yes, identifier salad that led the bots astray. I think I've cleaned it up for the moment. --Daniel Mietchen (talk) 20:31, 4 December 2018 (UTC)

I came across another scientific musician today, the same type of case by the look of it: Joseph Hughes (Q47698199). I just removed the occupation=researcher and thought nothing of it until I read this and it rang a bell. I went back, and this 80s pop star has a bunch of scholarly articles linked to him. Should I correct these, or will there be a more automated effort to roll them back? Moebeus (talk)

Wikidata in business

I'm giving a talk/workshop about Wikidata in a few days and wanted to include some info on examples of Wikidata use in the business world. Maybe we know something? --Edgars2007 (talk) 18:42, 4 December 2018 (UTC)

Apparently Apple use it, see for example [1]. Ghouston (talk) 02:12, 5 December 2018 (UTC)
Linked geodata: OpenStreetMap-based commercial and community maps use it (for example, Mapbox). OpenStreetMap has a new name/brand suggestion index for contributors, covering banks and shops, with Wikidata IDs (see [2]); OSM wiki: "brand:wikidata"; current tags: OSM taginfo "brand:wikidata". Google and Facebook like Wikidata + OSM: good for SEO! --ImreSamu (talk) 12:57, 6 December 2018 (UTC)
Google uses it in Knowledge Graph, DuckDuckGo uses it for some of the external identifiers. Jc86035 (talk) 11:22, 7 December 2018 (UTC)

Similar items

Is Q1427384 (de:Flatted fifth) the same as diminished fifth (Q12378925) or tritone (Q623939)? I think the former item is describing the same thing as one of the others, but it's not really clear. All three items have dewiki sitelinks. Jc86035 (talk) 18:55, 4 December 2018 (UTC)

  • As a (non-classical) musician, I'd understand them interchangeably, though I don't think I personally would be likely to say "diminished fifth". Usually I'd call it a "flatted fifth" if the lower note was the tonic of the scale I'm in, and otherwise a "tritone". - Jmabel (talk) 20:07, 4 December 2018 (UTC)
    @Jmabel: I've redirected the dewiki article to de:Tritonus and merged the Wikidata item, based on the content of the relevant articles in the English and German Wikipedias. Jc86035 (talk) 15:27, 5 December 2018 (UTC)

Award - what property to use for 'for'

e.g. Kath Weston awarded the Ruth Benedict Prize in 1997, for Render Me, Gender Me: Lesbians Talk Sex, Class, Color, Nation, Studmuffins . What property do I use as a pq: in Q59430803#P166? thx. --Tagishsimon (talk) 21:43, 4 December 2018 (UTC)

for work (P1686), AFAIK. --Kam Solusar (talk) 23:56, 4 December 2018 (UTC)
(Anyone know if there's plans to have the property suggester work from qualifier property constraints? That would have made this easier, I think.) --Yair rand (talk) 20:17, 5 December 2018 (UTC)
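For batch edits, the same statement shape (award received (P166) with a for work (P1686) qualifier) can be written as a QuickStatements V1 line, where qualifier property/value pairs follow the main statement on the same tab-separated row. This is a sketch only: the helper name is hypothetical, and `Q_PRIZE` and `Q_WORK` are placeholders for the real items of the prize and the book, which are not given in this thread.

```python
def award_with_work(item, award, work):
    """Build one QuickStatements V1 row: item, P166, award, then qualifier P1686 = work."""
    return "\t".join([item, "P166", award, "P1686", work])


# Placeholders, not real item IDs: substitute the actual Q-IDs before use.
line = award_with_work("Q59430803", "Q_PRIZE", "Q_WORK")
print(line)
```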

Purpose of Q19803442 and the like

How should Q19803442, Q19803497 and the like be used? Are they supposed to be for names where the person is known just by the initial (I'm looking at Q4647657) and if we find the full name, should that be there too? Or should it only be the letter name? Was this correct or should both have been kept? I can see why it would be helpful to keep track of B.B. King under Ben and Riley for given names but then should he also be under "B." as if that's an actual name? -- Ricky81682 (talk) 01:39, 5 December 2018 (UTC)

For convenience in reading: A. (Q19803442), B. (Q19803497), A. Craig Copetas (Q4647657). - Jmabel (talk) 04:44, 5 December 2018 (UTC)
  • given name (P735) can hold multiple values. Depending on the use, one or the other can be the one people are most interested in. --- Jura 05:45, 7 December 2018 (UTC)

Heads-up: Upcoming GeneDB item creation

The good folks at GeneDB (Q5531047) are the authoritative source for gene annotation data for about 50 species. They are preparing a new web presence, and they would like to base some of it on Wikidata! As part of this, I have test-imported two of their species genes, and accompanying proteins, as items into Wikidata a while ago, and made a prototype browser for that information. Now that proof-of-concept is established, they would like me to go ahead and import their entire database as Wikidata items. That's about 770K genes, and about as many proteins, so we are talking about ~1.5M new items for this, over the next 1-3 months. All items will have a GeneDB ID (P3382) statement, and various referenced statements about genomic location, taxon, orthologs/paralogs, and function (example items: PPPK-DHPS (Q18970312) and hydroxymethyldihydropterin pyrophosphokinase-dihydropteroate synthase (Q56565045)).

We already have complete genes and proteins for several species (human, mouse, various bacteria), and I believe this will be a real boost, both in network effects (think main subject (P921) for publications), and in establishing Wikidata as a serious, reliable cornerstone of modern science and research. --Magnus Manske (talk) 09:56, 5 December 2018 (UTC)

  •   Support This is great. ChristianKl❫ 11:39, 5 December 2018 (UTC)
  •   Support Sounds good to me. Have you talked to the User:ProteinBoxBot people to make sure you're not both importing the same things? Are these items that GeneDB (Q5531047) already assert are not yet in Wikidata? ArthurPSmith (talk) 15:51, 5 December 2018 (UTC)
  •   Support pretty please can someone write a case study for this project? I think it would be super useful to help other organisations understand how Wikidata would be useful for them. I'm very happy to help with a write up John Cummings (talk) 16:15, 5 December 2018 (UTC)
  •   Question Why don't they set up their own wikibase instance and federate it with Wikidata (when that is possible)?--Micru (talk) 16:30, 5 December 2018 (UTC)
    • I don't have a strong enough concept of the standards for inclusion here to have a strong opinion, but I share Micru's question. Not as an objection, but out of curiosity, to know how more involved Wikidatans tend to view such decisions. -Pete F (talk) 23:07, 5 December 2018 (UTC)
      • They are looking into that possibility, but despite the Docker containers available, this is still early stage technology, and "federation" doesn't really work on wikibase level yet (SPARQL, perhaps). If they set up their own wikibase, they would likely do so to track additional, detailed data that would be unsuitable here. --Magnus Manske (talk) 15:02, 6 December 2018 (UTC)
  •   Support Yay! Thanks! --Denny (talk) 22:20, 5 December 2018 (UTC)
  • How would updates happen? --- Jura 08:27, 8 December 2018 (UTC)

How do I model an item for a series of reports

Hi all

I'm trying to model an item for a series of reports produced by UNESCO Reshaping Cultural Policies (Q59456378), there is a Wikipedia article for the reports but I don't know how to model it in Wikidata since it refers to more than one publication. What should Instance of be?

Thanks

--John Cummings (talk) 11:38, 5 December 2018 (UTC)

  • Presumably, each report should get its own Wikidata item. Then there are several different ways you can effectively relate them to the item for the article. If the article corresponds to some recognized grouping, has part (P527) probably is best, along with part of (P361). Also, if the reports have a clear sequence you might consider relating them to one another with follows (P155) / followed by (P156); if each is an update of the one before there is replaces (P1365) / replaced by (P1366). - Jmabel (talk) 18:05, 5 December 2018 (UTC)
    • Ideally this should be modelled the same way as, for example, TV series. @Jura1: I think you worked on this; where can documentation or information be found on current practices or consensus on modelling series? author  TomT0m / talk page 11:37, 6 December 2018 (UTC)
  • @John Cummings: You should provide a better description of the case: is it a series of reports or one report composed of several documents?
In the first case, each report has to have an item and an additional item is required for the series. The report items have to follow the model described by Help:Sources#Reports, policy, legislation and technical documentation. In the second case, we need to provide a clear model to handle that case. Snipre (talk) 12:41, 6 December 2018 (UTC)
@John Cummings: I think it would be useful to make a new item "report series" modeled on existing book series (Q277759). In fact, I am rather surprised we don't have this already. There are a number of items that are defined as a "series of law reports" but are <instance of> "law report", and that seems wrong to me; those should be "report series" (or even a subclass "law report series") with parts of the class "law report". - PKM (talk) 20:59, 6 December 2018 (UTC)
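To make the first case concrete, here is a minimal sketch of the statements that the property pairs mentioned above (has part (P527) / part of (P361), follows (P155) / followed by (P156)) would produce for one series item and an ordered list of report items. The function name and the report IDs in the usage line are hypothetical placeholders; the sketch only builds (subject, property, value) triples and does not edit Wikidata.

```python
def series_statements(series, reports):
    """Return (subject, property, value) triples linking a series and its ordered reports."""
    statements = []
    for index, report in enumerate(reports):
        statements.append((series, "P527", report))   # series: has part
        statements.append((report, "P361", series))   # report: part of
        if index > 0:
            statements.append((report, "P155", reports[index - 1]))  # follows
        if index < len(reports) - 1:
            statements.append((report, "P156", reports[index + 1]))  # followed by
    return statements


# Placeholder IDs for illustration only.
triples = series_statements("Q_SERIES", ["Q_REPORT_1", "Q_REPORT_2"])
```

Whether the series item itself should be an instance of a new "report series" class, as proposed above, is a separate modelling decision that this sketch does not settle.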

How to link a Youtube user?

I only have found YouTube channel ID (P2397), is there any property for user accounts on youtube?--Micru (talk) 16:26, 5 December 2018 (UTC)

@Micru: The YouTube channel ID appears to be a unique identifier for a YouTube user; you can find this identifier by going to a YouTube user page and searching for, among around 130 different places in the source of a given YouTube user's page (when I tried one of the examples on P2397), a meta element with "twitter:url" as its property attribute. (Also note the instruction on the property to add a website username (P554) qualifier to uses of P2397.) Mahir256 (talk) 16:49, 5 December 2018 (UTC)
The channel ID is the user account. Lazypub (talk)

tool for importing information from Commons templates to Wikidata?

Hi, is there a tool that takes information from a Commons image's template and uploads it to Wikidata? Rachel Helps (BYU) (talk) 19:05, 5 December 2018 (UTC)

For some Commons templates QuickStatements is used, see commons:Category:Pages with QuickStatements links.--Jklamo (talk) 23:59, 5 December 2018 (UTC)
thank you! Rachel Helps (BYU) (talk) 21:46, 6 December 2018 (UTC)

Watchlist notices: let users dismiss these notifications

A while ago Tagishsimon noted that there is no simple way for users to hide the notices that are displayed above the watchlist (see the discussion). This is a real issue on small screens, as the list takes a lot of space above the watchlist itself (see this screenshot in 1280x800). I therefore propose that we switch to the solution adopted by the English Wikipedia, where watchlist notices consist of short sentences which can be dismissed with the help of a Javascript gadget (see en:Wikipedia:Watchlist_notices). Please add your support here if you think this would be a good move so that we can get the gadget enabled on Wikidata. @Multichill, Nikki, Galaktos: pinging editors involved in the previous discussion. − Pintoch (talk) 04:05, 6 December 2018 (UTC)

  • Yeah, support. --Tagishsimon (talk) 12:52, 6 December 2018 (UTC)
  • Sounds good, be bold :-) I think some wikis have the option to only show it to certain user groups (like only admins). Might be useful for us too. Multichill (talk) 12:58, 6 December 2018 (UTC)
  • The gadget is suitable for separate lines of notices. We use static sections with dynamic content, not sure how that will work. Sjoerd de Bruin (talk) 13:20, 6 December 2018 (UTC)
    • @Sjoerddebruin: yes, so we should switch to using separate lines of notices. − Pintoch (talk) 23:13, 6 December 2018 (UTC)

WikidataCon, the conference for open data enthusiasts, will take place in Berlin in October 2019

Hello dear Wikidata community,

After the success of the first WikidataCon in 2017 and many conversations regarding the future of this conference, we are now ready to announce that the WikidataCon 2019 will take place on October 25th-26th 2019 in Berlin.

Apart from this first piece of information (save the date in your calendars!), there are some important changes regarding the goal of the conference that we would like to share with you.

After its first iteration, the goals of the conference will evolve. The event will be more focused on networking and strategic discussions. The target group will of course include the Wikidata community in its broad sense, especially people and organizations who are already involved in editing, structuring and reusing the data, but will also reach out to organizations that may be interested in using Wikidata and Wikibase, or contributing in various ways to the evolution of Wikidata.

The program will be adapted to this new goal, and organized around different levels of involvement for the participants: from being inspired (with keynotes, success stories, vision of Wikidata’s future) to diving deeper into the hot topics, but also action: the participants will have the opportunity to work on projects during the conference.

Besides many possible topics that will be submitted by the participants, we plan to have a dedicated track about lexicographical data, discussing the perspectives for further reusing data about words within and beyond Wikidata.

A lot of space will be left for discussions, networking, building connections between the different stakeholders and members of the broad Wikidata community.

The conference will be hosting at least 250 participants. In order to make sure that the access to the conference is fair and to not reproduce the issues of last year (rush to get a ticket and long waiting list), we will set up a more streamlined registration process, where a committee made up of both staff and volunteers will make sure that the applicants will contribute in one way or another to the goal of the conference. I will keep you updated for the next steps of this process.

You can already see Wikidata:WikidataCon 2019 which will be improved step by step over the next months, and you can already use the discussion page. If you have any question, you can also contact me on my talk page or by email: lea.lacroix@wikimedia.de.

We're looking forward to a great WikidataCon 2019, together with you!

Thanks for your attention, Lea Lacroix (WMDE) (talk) 08:24, 6 December 2018 (UTC)

Label when adding a wikipedia page

Can someone explain to me why the title of a wiki page is not automatically used as the label of the item when a Wikipedia page in that language is added from Wikipedia? This is the default behaviour if you create an item, but not if you only add a link to a Wikipedia page. Why does the Wikipedia tool behave this way? Thanks in advance, Paucabot (talk) 10:53, 6 December 2018 (UTC)

As I recall (and it has been a while), this is because of the Wikipedia problem with disambiguation of names, so for people and places (the bulk of early Wikidata items) you would have "Saint Mary's Church, (XXX)" where the XXX is for location, which is generally placed in the description, not the label. That is just one example, but there are many more. Nowadays it might be worth reinvestigating this decision. Jane023 (talk) 11:57, 6 December 2018 (UTC)
Is it a technical problem? Because I think it is worse to have no label than to have misplaced parentheses ... Paucabot (talk) 12:33, 6 December 2018 (UTC)
It's not a technical problem. It's just that the code to enforce/undertake addition of a label based on the sitelink is not in place, because no-one has thought it a good idea to make it so, largely for the reasons set out by Jane023. I have some sympathy for your view, but equally sympathy for the notion that labels are important enough that we don't fudge them. Next, your idea /might/ sort out a single or a small set of language label values, albeit at the cost of the addition of suboptimal labels, but it's not applicable across the board: the Thai label taken from a Thai article is not much use for the English language label. Given there are ~200 languages wanting labels, the solution doesn't go very far. --Tagishsimon (talk) 13:01, 6 December 2018 (UTC)
@Tagishsimon: I'm not asking for the label to be automatically set in every language, only for one language: if I'm adding a Catalan sitelink, then this text would be automatically used for the label in Catalan only. Paucabot (talk) 19:30, 6 December 2018 (UTC)

I created a ticket for a similar idea at phab:T148762 a couple of years ago, where the label would not be set automatically, but where the user is given the choice to set the label for the connected Wikidata item.

Unfortunately, nothing has come out of it yet, but if more people show their interest in features like this, maybe we could finally get something implemented. --Njardarlogar (talk) 16:40, 6 December 2018 (UTC)

Thanks, @Njardarlogar:. I'll ask it there. Paucabot (talk) 19:35, 6 December 2018 (UTC)

I have seen that there is a bot (@MatSuBot:) that does this kind of work: Special:diff/800581443. If it's done with a bot I don't see any reason why not do it automatically from the beginning. Paucabot (talk) 08:48, 7 December 2018 (UTC)

@Paucabot: I don't agree that a strictly automatic solution is good, because then a lot of labels would get wrong capitalization, and the user would have to go over to Wikidata to correct the labels, anyway, if they want the labels to be correct. Automatic labels should be better than the current state, where no labels are added at all (with the exception of the work of bots); but you should be able to adjust the label in the dialogue where you connect the page to item, such that you can ensure that the capitalization is correct and that no disambiguation is present in the label. --Njardarlogar (talk) 11:16, 8 December 2018 (UTC)
@Njardarlogar: If I have to go to Wikidata anyway to get things done, I prefer only having to change capitalization or remove disambiguations to having to add everything from scratch. Maybe we could even get rid of the parentheses automatically ...
Have you seen T58410? They have discarded offering to change the label as an option. Having done so, I think that a strictly automatic solution is better than the current situation. Paucabot (talk) 11:26, 8 December 2018 (UTC)
Having corrected lots of item labels (for various reasons different from Wikipedia titles) I think it's not a good idea to do this automatically, unless it is offered as a yes/no pop-up option when the sitelink is created. Then it is up to the user to have the whole title copied into the label or not. Even this may be a bad idea, I am not too sure how often "Wikipedians with zero Wikidata experience" actually add sitelinks by hand. Jane023 (talk) 11:56, 8 December 2018 (UTC)
@Jane023: Maybe you're right, but this behaviour is not very coherent with the actual behaviour of the tool if the item does not exist (it copies the sitelink to the label) or with the existence of bots that do exactly this work. Paucabot (talk) 12:19, 8 December 2018 (UTC)
@Paucabot: In the ticket I created (which is still open), I actually address the concern about casual users mentioned by hoo (see my reply to Edgars2007): the option for adjusting the label could be under an "advanced" section, or even completely hidden unless the user opts in on the feature in the user settings. I'd also add that connecting Wikipedia pages in different languages is already some distance into advanced users territory, so the addition of an "advanced" section should not be very problematic. If you are really sceptical about such a section and think it will confuse inexperienced users, you could even have it appear only after the user has connected 10 pages.
Regarding the concern about making the linkItem code more complicated, I don't know how important that is; but I doubt it would/should be that much more complicated to e.g. make it launch a new dialogue with its own code that allows the user to set the label after the page has been connected to the item. Again, such a dialogue could be hidden by default, or the user could choose to have the dialogue never appear again.
In short: if there is a will to implement such a feature, there is almost certainly a way. :-) --Njardarlogar (talk) 12:11, 8 December 2018 (UTC)
Thanks for your thoughts, @Njardarlogar:. Paucabot (talk) 12:15, 8 December 2018 (UTC)
How about having a popup in case a new Wikilink gets entered in a language that doesn't already have a label that suggests adding the title as a label but leaving the user the option to adapt it as well? ChristianKl❫ 13:05, 10 December 2018 (UTC)
That would be fine for me, @ChristianKl:. Paucabot (talk) 19:54, 10 December 2018 (UTC)
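
To make the "get rid of the parentheses" idea from this thread concrete, here is a minimal sketch of how a suggested label could be derived from a sitelink title. The function name `suggest_label` is hypothetical and is not part of Wikibase, the linkItem code, or any bot mentioned above. It only drops a single trailing parenthetical and does nothing about capitalization, so the result is a suggestion for the user to confirm or adapt, not a finished label.

```python
import re

# Hypothetical helper for illustration only; not an existing Wikibase API.
# Matches one trailing "(...)" group, plus surrounding whitespace.
_TRAILING_PARENS = re.compile(r"\s*\([^()]*\)\s*$")


def suggest_label(sitelink_title: str) -> str:
    """Return a candidate label by dropping one trailing parenthetical."""
    # Page titles sometimes arrive with underscores instead of spaces.
    title = sitelink_title.replace("_", " ").strip()
    stripped = _TRAILING_PARENS.sub("", title)
    # If stripping would leave nothing, fall back to the full title.
    return stripped or title


print(suggest_label("Mercury (planet)"))    # -> Mercury
print(suggest_label("Bismarckia nobilis"))  # -> Bismarckia nobilis
```

A title whose parentheses are part of the real name (for example "Was (Not Was)") would be shortened incorrectly by this rule, which is one reason to show the suggestion in a popup rather than save it silently.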

Urlencode for external identifier

Is it possible to urlencode the value in an external identifier property? isomeric SMILES (P2017) usually contains symbols like '/' etc. which cannot be used properly in a URL without being encoded first, so in many situations the link in the property is faulty, e.g. http://www.simolecule.com/cdkdepict/depict/bow/svg?smi=CC[C@H]1[C@H]([C@@H](C/C(=C/C=C/[C@@H]([C@H](OC(=O)/C(=C\C(=C\[C@H]([C@H]1O)C)\C)/OC)[C@@H](C)[C@H]([C@H](C)[C@]2(C[C@H]([C@@H]([C@H](O2)/C=C/C)C)O)O)O)OC)/C)C)O&zoom=2.0&annotate=cip should be encoded to something like [3] to work properly. Wostr (talk) 13:08, 6 December 2018 (UTC)

@Wostr: The ID in Wikidata should be the real string form of the ID, whatever it is. URL-encoded strings don't get interpreted properly by the Wikidata client UI anyway. There may be a fix we could work out with the wikidata-externalid-url service; you can experiment with it here: https://tools.wmflabs.org/wikidata-externalid-url/ or let me know if it doesn't seem to be handling the string properly. ArthurPSmith (talk) 19:13, 6 December 2018 (UTC)
@ArthurPSmith: that does not seem to work for isomeric SMILES (P2017); it returns a valid URL (prefix+id+suffix), but it does not encode the identifier: http://tools.wmflabs.org/wikidata-externalid-url/?p=2017&url_prefix=http://www.simolecule.com/cdkdepict/depict/bow/svg%3Fsmi=&url_suffix=%26zoom%3D2.0%26annotate%3Dcip&id=CC[C@H]1[C@H]([C@@H](C/C(=C/C=C/[C@@H]([C@H](OC(=O)/C(=C\C(=C\[C@H]([C@H]1O)C)\C)/OC)[C@@H](C)[C@H]([C@H](C)[C@]2(C[C@H]([C@@H]([C@H](O2)/C=C/C)C)O)O)O)OC)/C)C)O or maybe I'm doing something wrong? Wostr (talk) 20:51, 6 December 2018 (UTC)
I'm sure there's a way to do it, I'll take a look... ArthurPSmith (talk) 01:53, 7 December 2018 (UTC)
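
For reference, the encoding asked for above can be sketched with Python's standard library. This is only an illustration of percent-encoding the identifier before it is placed between the URL prefix and suffix quoted in this thread; it says nothing about how the wikidata-externalid-url tool or the formatter URL mechanism work internally. The function name `build_depict_url` and the short SMILES string are made up for the example.

```python
from urllib.parse import quote

# Prefix and suffix as quoted in the discussion above.
URL_PREFIX = "http://www.simolecule.com/cdkdepict/depict/bow/svg?smi="
URL_SUFFIX = "&zoom=2.0&annotate=cip"


def build_depict_url(smiles: str) -> str:
    """Percent-encode the SMILES value and place it in the depict URL."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    return URL_PREFIX + quote(smiles, safe="") + URL_SUFFIX


# Short illustrative SMILES string, not taken from any Wikidata item.
print(build_depict_url("C/C=C/[C@H](O)C"))
```

With `safe=""`, characters such as `/`, `\`, `=`, `@`, brackets and parentheses all become `%XX` escapes, while the stored identifier itself stays in its real string form, as suggested above.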

Genus and/or species for Bismarckia

The two items Bismarckia (Q13055333) and Bismarckia nobilis (Q664462) seem to be mixed and intertwined. One of them is supposed to cover the plant genus Bismarckia and the other the species Bismarckia nobilis within that genus, but right now it's all a bit confusing. Looking through the different Wikipedia links on Bismarckia nobilis (Q664462), one can see that they sometimes point to a species page, and sometimes to a genus page. I guess one of the problems is that several of the Wikipedias redirect Bismarckia to Bismarckia nobilis since the genus is monotypic and a separate page for the genus therefore seems redundant. This is of course not the case here at Wikidata, where we always keep specific WD items for specific taxa. How can we fix this without mixing up all the identifier links (IPNI, the PlantList, Tropicos, etc.) on each of the item pages? –Tommy Kronkvist (talk), 14:42, 6 December 2018 (UTC).

Pragmatic policy is to keep all the sitelinks together in one item (to make it easier for users to find other pages), in the item of the species in most cases (but in the item for the genus for fossils). When there is a Wikipedia that has pages for both the genus and the species, the sitelinks are split: those for the genus in the genus item, those for the species in the species item. - Brya (talk) 03:52, 8 December 2018 (UTC)

Admin pages Q3907246 and Q4580256

User:Infovarius reverted me. IMHO, that wasn't an improvement.

I suggest we lump "page used to make different requests to administrators" (which Q3907246 was reverted to) together with "page used for communication which requires administrators attention" (Q4580256; this tends to be the same thing) and move link portals to Q9004001. Perhaps create a new item for admin-specific link portals. On w:Wikipedia:Requests for administrator attention, w:pt:Wikipédia:Pedidos and w:nl:Wikipedia:Verzoekpagina voor moderatoren, you can't make any request. (please ping me if you reply) Alexis Jazz (talk) 18:27, 6 December 2018 (UTC)

Highest current QID / LID / PID

Is there an API call to figure out the highest Q ID, L ID, or P ID right now? --Denny (talk) 19:48, 6 December 2018 (UTC)

@Denny: https://www.wikidata.org/w/api.php?action=query&list=recentchanges&rctype=new&rcnamespace=0&rclimit=1 (Sort of cheating, but...) --Yair rand (talk) 20:42, 6 December 2018 (UTC)
Thank you, perfect! :) --Denny (talk) 21:24, 6 December 2018 (UTC)
Highest L is izioki (L171081) --- Jura 04:36, 7 December 2018 (UTC)
:D perfect, thanks, Jura. I was looking for Yair's answer, but you are correct! (for now) --Denny (talk) 16:48, 7 December 2018 (UTC)
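
The recent-changes trick above can be sketched in Python as follows. The query parameters are the ones from Yair rand's URL; `format=json` is added here so the reply can be parsed, and the function names and the sample reply are made up for illustration. As noted, this is "sort of cheating": it reports the most recently created page in a namespace, which is assumed rather than guaranteed to carry the highest ID. No network request is made in the sketch.

```python
import json
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def newest_entity_query(namespace: int = 0) -> str:
    """Build the recentchanges URL for the newest page in a namespace."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "new",
        "rcnamespace": namespace,
        "rclimit": 1,
        "format": "json",  # added for parsing; not in the original URL
    }
    return API + "?" + urlencode(params)


def newest_title(response_text: str) -> str:
    """Extract the page title from a recentchanges JSON reply."""
    data = json.loads(response_text)
    return data["query"]["recentchanges"][0]["title"]


# Made-up sample reply with an invented ID, only to show the shape.
sample = '{"query": {"recentchanges": [{"type": "new", "ns": 0, "title": "Q12345"}]}}'
print(newest_entity_query())
print(newest_title(sample))  # -> Q12345
```

Namespace 0 holds items; to my knowledge properties and lexemes live in namespaces 120 and 146 on Wikidata, so passing those values should give the newest P and L pages, with titles prefixed by "Property:" and "Lexeme:".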

New Wikimedia password policy and requirements

CKoerner (WMF) (talk) 20:03, 6 December 2018 (UTC)

male / female only parenting

I'd like to suggest a new property in order to distinguish between child rearing by a single parent (male only or female only) and child rearing by both parents. There are many animal species where the male is the only parent (seahorses, cassowary, ...). Then there are species where the female does the parenting (tigers, elephants, ...). It would be great if we could add female/male, male/male, and female/female offspring care as well. Thoughts? --Hedwig in Washington (talk) 18:24, 7 December 2018 (UTC)

@Hedwig in Washington: This has to be discussed on Wikidata:Property proposal. Esteban16 (talk) 21:32, 7 December 2018 (UTC)
Yes, it would be possible and useful to have such a property. - Brya (talk) 03:57, 8 December 2018 (UTC)

References not to be displayed unfolded

Is there a way for references not to be displayed unfolded when there is a constraint violation within the reference? It really makes my contributions difficult. Xaris333 (talk) 10:34, 9 December 2018 (UTC)

Hello,
Displaying the reference unfolded when there is a constraint violation is done on purpose: this way, we encourage the community to fix either the content or the constraint. However, if many people complain about it and wish for a way to fold the references, we're ready to think about it further. I hope this answer helps you. Lea Lacroix (WMDE) (talk) 11:48, 10 December 2018 (UTC)
@Lea Lacroix (WMDE): I understand why that was applied, but for me it makes things very difficult. Because of the problem described at Wikidata:Project chat/Archive/2018/11#single-value constraint with title property, the references are displayed unfolded in many items to which I am adding properties. Xaris333 (talk) 16:10, 10 December 2018 (UTC)
I think we really need to clean up the constraint then. I know this sucks but it's better to fix the underlying issue than glossing over it :/ --Lydia Pintscher (WMDE) (talk) 09:45, 11 December 2018 (UTC)

Q11263220 and Dumbing down (Q5313720) - are they the same and should they be merged?

Q11263220 refers to policies like Fado, Football, and Fatima (Q3539497) or 3S Policy (Q10854388) or Bread and circuses (Q845658) that make people either unable or unwilling to care about politics, usually for the purpose of maintaining or strengthening the ruling government's control over a country/society. Can it be considered equivalent to dumbing down, and can the two Wikidata items therefore be merged? C933103 (talk) 16:30, 9 December 2018 (UTC)

Miniature map for coordinate properties

Hello all,

After we deployed the Commons miniature pictures for the image properties, some editors suggested to do something similar with maps. From Wednesday, December 12th, we will also have a miniature map displayed for the properties containing coordinates, like coordinate location (P625). This feature uses mw:Extension:Kartographer and OpenStreetMap.

Viewing the map directly in the item will help the editors to check quickly if the coordinates seem correct, and therefore improve the data quality.

Some useful features:

  • Double-click on the miniature map: display it in full screen
  • On full screen view: click on “external maps” to display different map services and links to many external maps
  • In edit mode: when adding or editing coordinates in the field, the map is automatically updated to fit the coordinates you entered
  • The coordinates are still displayed under the map and link to Geohack

If you have any questions or issues with this feature, let me know. You can also see the related ticket. Lea Lacroix (WMDE) (talk) 15:22, 10 December 2018 (UTC)

Linking musical settings with source texts

What are the correct properties to use when linking a source text with a musical setting of that source text? and when linking an original melody with a set of alternate lyrics written to that tune at a later date? In these cases the melody and the lyrics will be separate items. Sorry if this has been asked before. Beleg Tâl (talk) 15:27, 10 December 2018 (UTC)

Wikidata weekly summary #342

Request for filling out a questionnaire on Wikidata

Dear Wikidata enthusiasts, I have a huge favor to ask of everyone reading this --

TLDR: Please fill out the questionnaire linked here!

And now in more detail: As some of you may know, I'm a PhD candidate at the School of Education, Tel Aviv University, and my research focuses on Wikidata as a learning platform. To succeed in my research, I need lots of data about how users (such as yourselves) interact with Wikidata, so it would be of great help if you could take the time to fill out the questionnaire linked above. It should take between 15 and 30 minutes, depending on how detailed your answers are. I'm striving for at least 100 replies, but this is one of those cases of "the more, the merrier", so really, every single person filling it out is of huge help.

If you have any questions, or are willing to participate in a follow up interview, please feel free to ping me via my talk page, privately or by email (shani.even gmail.com). Thanks in advance for considering filling it out. I will be forever grateful to anyone who can help and promise to update you on my progress. :) Shani Evenstein (talk) 22:46, 10 December 2018 (UTC)

Property:P443

How about moving all instances of this property from items to the appropriate Lexemes? (Only a word can have a pronunciation, not a notion.) --Infovarius (talk) 14:41, 11 December 2018 (UTC)

Yes, that sounds like a worthy project. ChristianKl❫ 15:12, 11 December 2018 (UTC)