Wikidata:Project chat

Wikidata project chat
A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Also see status updates to keep up-to-date on important things around Wikidata.
Requests for deletions can be made here.
Merging instructions can be found here.

IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2017/03.

Query regarding the descriptions of famous CEOs

Hi all,

I was looking at the descriptions of some famous CEOs, such as Marissa Mayer and Tim Cook, which read "American business executive" and "American business executive and engineer" respectively.

I guess it would be better to have specific descriptions, such as "CEO of Yahoo!" for Marissa Mayer and "CEO of Apple Inc" for Tim Cook. However, Sundar Pichai has "Google CEO" as his description. Can anyone please let me know why there is a disparity in the descriptions of different CEOs?

I guess it would be better to have consistent descriptions for all CEOs.

Thanks in advance.

Subramanyam  – The preceding unsigned comment was added by 2001:4998:EFEB:7805:0:0:0:1020 (talk • contribs) at 13. 2. 2017, 04:37‎ (UTC).

I don't see any immediate benefit in changing one machine-generated description to another machine-generated description. Still, there is room to improve, for example by inserting company names, main roles in films, genres, etc. to make namesakes more distinct. --Lockal (talk) 19:07, 19 March 2017 (UTC)

Item quality and the number of sitelinks

Wikidata:Item quality is currently fairly vague about how to measure the quality induced by sitelinks:

  • (A): All appropriate sitelinks to corresponding Wikimedia projects
  • (B): Most appropriate sitelinks to corresponding Wikimedia projects
  • (C): Some sitelinks to corresponding Wikimedia projects
  • (D): At least one sitelink (if applicable)
  • (E): everything else.

There are probably at least two ways of measuring this:

  • 1. In a relative way: does the item include sitelinks to all articles Wikipedias have on the same concept. If there are 10 sitelinks on the item and 1 sitelink on some duplicate item, this should be fine for (B)/(C)/(D)/(E). Assessing this would require spot checking for duplicates.
  • 2. In an absolute way: does the item include sitelinks to all of the some 500 WMF sites we could link. If there are 10 sitelinks on the item and 1 sitelink on some duplicate item, this might only be fine for (D)/(E) and maybe (C). An automated calculation can assess this (WDQS has a simplified way to check it).

The relative way (1) measures something Wikidata contributors build and can fix, the absolute way (2) is just a measurement about a gap in one or the other Wikipedia. What do you think?
--- Jura 15:26, 12 March 2017 (UTC)
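The two ways of measuring can be sketched in a few lines. This is only an illustration, not an existing tool: the function names are made up, the sitelinks are given as plain sets of site codes, and the figure of 500 linkable sites is the rough number quoted above.

```python
from typing import Iterable, Set


def relative_coverage(item_sitelinks: Set[str],
                      duplicate_sitelinks: Iterable[Set[str]]) -> float:
    """Share of the existing articles on a concept that the item links to.

    Articles attached to duplicate items count as existing but unlinked.
    """
    existing = set(item_sitelinks)
    for links in duplicate_sitelinks:
        existing |= set(links)
    return len(item_sitelinks) / len(existing) if existing else 0.0


def absolute_coverage(item_sitelinks: Set[str], total_sites: int = 500) -> float:
    """Share of all linkable WMF sites that the item links to.

    total_sites=500 is the approximate figure from the discussion (assumption).
    """
    return len(item_sitelinks) / total_sites


# The example from the thread: 10 sitelinks on the item, 1 on a duplicate item.
item = {f"wiki{i}" for i in range(10)}      # hypothetical site codes
duplicate = {"wiki_other"}                  # hypothetical site code
relative = relative_coverage(item, [duplicate])
absolute = absolute_coverage(item)
```

For this example the relative measure is 10 out of 11 and the absolute measure is 10 out of 500, which is why the same item could rate high in one reading and low in the other. How either number maps onto grades (A) to (E) is exactly what is left open above.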

I think the relative way is better, because otherwise you will never get class (A) items. You can't force small Wikipedias to create a page about a person who is not at all relevant for their language, like a local rapper. Q.Zanden questions? 16:43, 12 March 2017 (UTC)
Wikidata:Item quality already states this in a "relative" way. To me, the "absolute" proposal is a nonstarter. It just doesn't make any sense. This conversation would be better held on Wikidata talk:Item quality where we can discuss the specific language to include in the criteria. All are welcome to join us there where we are addressing all sorts of questions like this one. --EpochFail (talk) 13:37, 15 March 2017 (UTC)
This discussion about sitelinks as a measure of quality is silly. There are lots of quality items that do not have any sitelinks at all, because they are closely related to something that does, but they have a different purpose on Wikidata - just look at all the "trees" vs "fruits of those trees" vs "seeds of those trees" etc. A typical Wikipedia article would just bundle all those concepts together, which is fine. Jane023 (talk) 15:58, 16 March 2017 (UTC)
Slightly off-topic, is there any publicly available tool by Wikimedia that can be used to measure the (relative/absolute) quality of individual items by the above definition, i.e., the input to the tool is item number and the output is its quality? Jsamwrites (talk) 12:39, 18 March 2017 (UTC)
No. That is what we are trying to get to with the above. --Lydia Pintscher (WMDE) (talk) 13:12, 18 March 2017 (UTC)
What is it that you are trying to achieve? It escapes me. Arguments are made for another approach, but they do not get attention. Neither is there any response to the arguments against this approach. Thanks, GerardM (talk) 09:07, 19 March 2017 (UTC)
Silly or not, I don't think it's entirely clear what the item quality criteria currently state, especially as actually checking for duplicates of the same concept doesn't seem to be an obvious thing to do. Some seem to insist on doing things in a Wikipedia way... maybe we should follow the suggestion made earlier and attempt to allocate Wikidata development resources to more important issues.
--- Jura 08:56, 19 March 2017 (UTC)

Quality is not measured in a Wikipedia way

The current proposal for quality is imho wrong. Quality is in the consistency of the data and in the linking to sources. A source may be a Wikipedia, but for an author, for instance, a link to VIAF is more relevant; it establishes an identity in the international libraries. Data for an item like an author is associated with categories; the data the categories stand for should be included in Wikidata. Wikidata holds an edge over a Wikipedia when it knows about more articles that should be in the category.

When specific data on an item can be qualified, the qualifications are an integral part of the quality involved. Consequently, it is great to know that a person was a ruler of a specific kingdom or sultanate; knowing the dates, predecessor and successor makes for great quality.

It is important to remember how young and incomplete Wikidata is. Things are improving rapidly but the relevance is mostly in the data itself. When language is considered, the fact that we hold the complete structure of Chinese places and their administrative entities makes it of high quality. The fact that we do not have English labels is hardly relevant. Collaboration on maintaining this data is much more relevant. When we think that language support is vital in assessing quality, it follows that attention is given to the usability of other languages. Showing a Q-number when a label is missing is something that needs urgent remediation. There is always at least one label.

The current approach to "quality" is one where single items are considered in isolation. It is much better to consider quality as a function of "set theory". It is more important to consider quality as one of connectedness to other items and sources. Quality can be expressed in the completeness of statements. Arguably when a Wikipedia article has Wikilinks including red links, Wikidata could have an associated statement. Thanks, GerardM (talk) 06:44, 13 March 2017 (UTC)

One thing that occurred to me was that a tool might be quite interesting that determined how many of the bluelinks in the first paragraph of a Wikipedia article were accounted for (and which were not) by statements on the corresponding Wikidata item.
Are there particular types of relationships that aren't yet being well captured? And are there items where the proportion is particularly low? This I think might be interesting to look at. Jheald (talk) 09:39, 13 March 2017 (UTC)
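A rough sketch of what such a tool would compute, assuming the bluelinks of the lead paragraph have already been resolved to item IDs and the statement values of the item have been collected (both steps are left out here, and the function name is made up):

```python
from typing import Iterable, List, Tuple


def link_coverage(lead_link_items: Iterable[str],
                  statement_values: Iterable[str]) -> Tuple[float, List[str]]:
    """Return the share of lead-paragraph links that appear as statement
    values on the item, plus the links that are not accounted for."""
    lead = set(lead_link_items)
    covered = lead & set(statement_values)
    missing = sorted(lead - covered)
    ratio = len(covered) / len(lead) if lead else 1.0
    return ratio, missing


# Hypothetical IDs, for illustration only.
ratio, missing = link_coverage(["Q1", "Q2", "Q3", "Q4"], ["Q2", "Q4", "Q9"])
```

The list of missing links is probably the more useful output: grouped across many articles, it would show which kinds of relationship are not yet being captured.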
Yes I am inclined to agree with this. The trick is identifying the links you are talking about - I assume "instance of" and maybe "from xxx country", but maybe also more specific things regarding that article's "claim to fame" on that specific language Wikipedia. Jane023 (talk) 16:01, 16 March 2017 (UTC)
There is much data in categories that could be harvested on a regular basis. They reflect data such as "educated at", "faculty at", etc. For many categories there are definitions in place (in Wikidata) that can be used for automation. When this is done for all Wikipedias, it would improve Wikidata beyond the quality of single categories. Thanks, GerardM (talk) 16:16, 16 March 2017 (UTC)
Wikidata would be better served with more wiki links and red links. We could do a better job capturing data from categories; there are two obvious strategies and the easiest is to cooperate with the good people of DBpedia. Thanks, GerardM (talk) 14:48, 13 March 2017 (UTC)

Contribute entity URI to Wikidata

This is in reference to an earlier query on contributing entity URI to Wikidata: Wikidata:Project chat/Archive/2017/03#Contribute entity URI to Wikidata

We have further queries below.

Based on our understanding, anyone can add or edit data in Wikidata, and this will be fed into the infobox of the corresponding Wikipedia article. Similarly, anyone can add or edit the data in a Wikipedia article. Does this mean that the data in Wikidata will be overwritten if someone adds or edits the data in the corresponding Wikipedia article? Or is it a case where data is fed only from Wikidata to Wikipedia, but never in the reverse direction?


P.S. We notice you have moved the previous discussion to the archive. How do we continue the current discussion thread?  – The preceding unsigned comment was added by Nlbkos (talk • contribs) at 09:30, 14 March 2017 (UTC).

Should we have a property for "religion" or "religious affiliation"?

I'd like to invite folks to participate in the discussion here: Property talk:P140#Using most specific information and proposed label change.--Pharos (talk) 23:52, 18 March 2017 (UTC)

Abstract classes

So, those here who have a background in software development may be familiar with the concept of "abstract classes" from object-oriented programming languages (e.g. in Java). An abstract class is a class which can have subclasses but cannot itself be directly instantiated. In Wikidata terms, this would be saying that certain items should only be the targets of P279 not P31. So, do we have a way to express that constraint right now? Or would it make sense to introduce such machinery? Some items are so abstract that they should not have any direct instances only subclasses, e.g. geographical object (Q618123) – anything directly an instance of that should be made an instance of some more specific subclass instead. (The next question is, if we don't have any such machinery at present, and if we should, what would be the mechanics of implementing it? e.g., we could introduce an item "Wikidata abstract class", make abstract classes P31 (or maybe P1552) of "Wikidata abstract class", and then add a constraint to P31 saying that target items must not be P31/P1552 "Wikidata abstract class".) SJK (talk) 11:55, 19 March 2017 (UTC)
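To make the proposed check concrete, here is a minimal sketch. It assumes statements are available as (subject, property, value) triples, and it uses a placeholder ID for the proposed "Wikidata abstract class" item, which does not exist yet; this is not how the current constraint reports are implemented.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

P31 = "P31"                            # instance of
ABSTRACT_MARKER = "Q_ABSTRACT_CLASS"   # placeholder for the proposed item (hypothetical)


def find_abstract_instances(statements: Iterable[Triple]) -> List[Tuple[str, str]]:
    """Return (item, class) pairs where an item is a direct instance of a
    class that has been marked as abstract."""
    statements = list(statements)
    abstract = {s for s, p, v in statements if p == P31 and v == ABSTRACT_MARKER}
    return [(s, v) for s, p, v in statements if p == P31 and v in abstract]


statements = [
    ("Q618123", "P31", ABSTRACT_MARKER),   # geographical object marked abstract
    ("Q_LAKE", "P279", "Q618123"),         # a subclass: allowed
    ("Q_THING", "P31", "Q618123"),         # a direct instance: violation
    ("Q_POND", "P31", "Q_LAKE"),           # instance of a concrete subclass: allowed
]
violations = find_abstract_instances(statements)
```

Only the direct instance is reported; subclasses of the abstract class and instances of those subclasses pass. The IDs other than Q618123 are invented for the example.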

Though it is very interesting to have the flexibility provided by Wikidata, it's equally important to have the ability to add constraints in cases similar to the use cases specified by SJK. There are many other examples where you can't create instances (e.g., programming paradigm). Properties like official website (P856) do provide restrictions on values by only allowing values that respect the given format. Such restrictions need to be extended to items, maybe in the long run. Jsamwrites (talk) 12:22, 19 March 2017 (UTC)
@Jsamwrites: Yes, I think your idea of somehow putting constraints on items is a good one. To give another example, consider an entity like local government area of Australia (Q1867183). Any instance of that class (and its subclasses) should have country (P17): Australia (Q408), and there should be some way to declare that constraint on local government area of Australia (Q1867183). There are heaps of other examples where "constraints on items" could be used. Now, the question is whether it should be encoded in properties, or as a template on the talk page like we currently have for properties. The former is the ideal solution; however, the fact that we are still using the latter demonstrates that the former solution has some difficulties (some complex constraints are far easier to encode in a template than in a claim, and the template displays it in a user-friendly way, something constraints-as-claims can't really do right now). I wonder how hard it would be to take the existing constraint templates, constraint violation reports, etc. and repurpose them for items? All that said, I think my original idea of a "Wikidata abstract class" can possibly be implemented with just property constraints, and so I would still like to pursue that as a property constraint until item constraints become available. SJK (talk) 12:57, 19 March 2017 (UTC)
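As an illustration of what an "item constraint" check could look like, here is a small sketch under the same simplifying assumption that statements are (subject, property, value) triples. The function names and the Q_... IDs are invented; only Q1867183, P17, P31, P279, Q408 and Q30 are real identifiers.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def subclasses_of(root: str, statements: List[Triple]) -> Set[str]:
    """Return root plus everything reachable from it via subclass of (P279)."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for s, p, v in statements:
            if p == "P279" and v in found and s not in found:
                found.add(s)
                changed = True
    return found


def missing_required_value(statements: Iterable[Triple], cls: str,
                           prop: str, required: str) -> List[str]:
    """List instances of cls (or of its subclasses) lacking prop = required."""
    statements = list(statements)
    classes = subclasses_of(cls, statements)
    instances = {s for s, p, v in statements if p == "P31" and v in classes}
    satisfied = {s for s, p, v in statements if p == prop and v == required}
    return sorted(instances - satisfied)


statements = [
    ("Q_SHIRE", "P279", "Q1867183"),   # hypothetical subclass
    ("Q_A", "P31", "Q1867183"), ("Q_A", "P17", "Q408"),
    ("Q_B", "P31", "Q_SHIRE"),         # no country at all
    ("Q_C", "P31", "Q_SHIRE"), ("Q_C", "P17", "Q30"),   # wrong country
]
violations = missing_required_value(statements, "Q1867183", "P17", "Q408")
```

The hard part is not this check but where the rule itself (class, property, required value) would be declared, which is the claims-versus-templates question raised above.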
  Support Matěj Suchánek (talk) 15:46, 20 March 2017 (UTC)

Pay website to make use of Wikidata?

There is a grant proposal at meta:Grants:Project/EveryPolitician. It asks to provide WMF funds to develop software that would make use of Wikidata (look for the "Funder" column in the budget section). Oddly in a field where our coverage is actually already quite good. Not sure what to think of it.
--- Jura 14:14, 19 March 2017 (UTC)

It happens all too often that I add data in Wikidata for yet another parliament that we do not cover. We certainly are not complete for most of them and some we do not even know. No, our coverage is pathetic and yes, combining the power of data has its place. Thanks, GerardM (talk) 18:29, 19 March 2017 (UTC)
@Jura1: we clearly haven't expressed this part of the proposal very well. This part isn't about us using Wikidata (in terms of pulling data from here) — it's about things like reworking the hundreds of scrapers we have that check every Parliament site across the world looking for updates every day, and instead of simply feeding that information directly into EveryPolitician, feeding it into Wikidata instead. The proposal used to say exactly that, but I edited it to try to show that this will be about much more than just rewriting scrapers, as that's just one part of what we'll be doing. Clearly I made things worse rather than better, so I've tried to broaden that out again.
I agree with GerardM that the coverage in Wikidata at the moment is generally fairly bad, outside a handful of countries where people have put a lot of effort in. Try ordering the table at Wikidata:EveryPolitician#By_country by number of members to see how little of this data is filled in at all, and of course for lots of those there aren't even start/end dates, never mind things like constituencies or party/faction affiliations. When you expand that out to things like Cabinet-level memberships, things get even worse (and a huge proportion of the data that is filled in was imported from infoboxes where the links were to Ministry pages rather than Minister pages, creating a lot of quite broken data).
By my reckoning not much more than about 50% of the current national-level parliamentarians in the world even have Wikidata entries at all (and that reduces quickly as you start to look at historic data in most countries), and the modelling of quite core concepts (like what the national parliament of each country even is) is incredibly inconsistent at the minute, making lots of the queries that serious users of this data would want to make effectively impossible, especially if doing multi-country analysis. There's a lot of work here to be done, and rather than us putting all our effort into building EveryPolitician entirely independently from Wikidata, our proposal is that we channel that work into making sure that Wikidata's coverage in this area really is not only "quite good", but the best source of open political data available. Clearly this isn't something we'd be doing in isolation — and for this to work well, we'll need to work very closely with everyone who is currently contributing in this area. So we'd love to hear more about what you think about the proposal — and particularly where you think we're going to hit the biggest challenges. Some of those are fairly obvious (e.g. that there will be countries where there simply aren't enough, or even any, Wikidata contributors who'll keep the data updated), but doubtless there will be many other issues too. --Oravrattas (talk) 07:20, 21 March 2017 (UTC)
  • Well, Wikidata is a work in progress. I doubt there are many fields where we reached completion. The question is if this topic is more or less advanced compared to others. I think we will soon have reached a good coverage for Finland: apparently something that can be grown through a project on Wikidata itself.
    It's obviously a good thing if websites make use of our data and develop tools that can help us improve our data. Toolserver is a resource that is made for this.
    Still, should WMF finance development of websites just to enable them to make use of Wikidata data? There is already a development backlog at Wikidata itself, so I don't quite see why one would finance development elsewhere when this should be furthered here. If import and list generation were improved at Wikidata, many other topics would benefit from this as well.
    --- Jura 18:14, 21 March 2017 (UTC)
Sure, there are definitely countries where Wikidata already has good coverage, and those should definitely be used as models for how to achieve that everywhere. I'm not so sure that Finland is one of those though. It's just outside the top 10 countries with most members marked in Wikidata with a relevant position held (P39) statement (in this case member of the Parliament of Finland (Q17592486)), with over 2300 people. However, only 162 of those include dates of the membership (most are simply a bare "was a member"), and only 43 of those are listed as a current member (of the 200 there should be). And only 120 of those 2300 memberships include an electoral district (P768) qualifier, and 10 of those constituencies have an instance of (P31) of something other than Electoral district of Finland (Q28657263).
If we run a query to list the current holder of positions that are part of the Cabinet of Finland (Q2366737), we get 8 positions, which out of 15 is a much better ratio than in the majority of countries, but comparing it to …ä_Cabinet shows that 3 of those 8 are actually out of date and have since been replaced (in some cases more than once since the data was accurate).
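Put as percentages, using only the counts quoted above (and treating "over 2300" as exactly 2300, so the first and third figures are upper bounds), a quick calculation looks like this:

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` covered by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


memberships = 2300          # "over 2300" in the text; treated as exact here (assumption)
with_dates = share(162, memberships)       # memberships that include dates
current_listed = share(43, 200)            # current members listed, out of 200 seats
with_district = share(120, memberships)    # memberships with an electoral district qualifier

print(with_dates, current_listed, with_district)   # prints: 7.0 21.5 5.2
```

So roughly 7% of the memberships are dated, about a fifth of the sitting members are marked as current, and about 5% of memberships carry a constituency.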
Presumably much of this information will be organically cleaned up over time. But we've been working for the last two years on gathering all this sort of information for over 200 countries and territories and working out how to keep that up to date. For quite some time we've been pulling information from Wikidata into EveryPolitician to augment the data we get from other sources. But we now believe that putting our efforts into getting much more of that information into Wikidata, and creating better tools for keeping it up to date here, will be a much better approach than us essentially "competing" on building parallel datasets. As a charity, mySociety's purpose is to create and popularise these sorts of datasets and tools. We believe that bringing the work we've done to date on EveryPolitician into Wikidata is the best way to do that, and that together we can greatly increase the amount of well-structured, freely available and usable political data available to parliamentary monitoring groups, campaigners, researchers, and other groups or individuals who require exactly this sort of information.
--Oravrattas (talk) 20:12, 21 March 2017 (UTC)
  • "the modelling of quite core concepts (like what the national parliament of each country even is) is incredibly inconsistent at the minute" - Absolutely, I second that. I care about parliament articles on German Wikipedia and they are horrible. Quite often substandard content to begin with, no infoboxes, no references, and outdated for several election cycles. And I see those same problems in all language Wikipedias, for hundreds of national parliaments and their elections! If Wikidata were as curated as PARLINE by the Inter-Parliamentary Union, we would be much better off.
  • "But we've been working for the last two years on gathering all this sort of information for over 200 countries and territories and working out how to keep that up to date." - That is exactly the kind of thing that Wikidata needs to learn to do: keep data up to date! This is a great learning opportunity, with lessons that can be applied to other knowledge areas in Wikidata. --Atlasowa (talk) 21:04, 23 March 2017 (UTC)

Writing to Wikidata using Bot / REST APIs

Hi All,

I would like to add new 'file format' entries for the property 'readable file format' (P1072) on the item page 'Adobe Photoshop' (Q129793), and I have to do similar operations on the item pages of many other software products. I would like to know whether it is necessary to create a bot for this, or whether calling the REST APIs directly will do the job. Please let me know.

Thanks in advance. Sharmeelaashwin (talk) 12:21, 20 March 2017 (UTC)
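For reference, a single statement can be added through the MediaWiki action API with the Wikibase module wbcreateclaim. The sketch below only builds the request parameters and sends nothing; the helper name is made up, the target item ID is a placeholder rather than a real file format item, and a real request additionally needs a logged-in session and a CSRF token. Whether a separate bot account or bot flag is required for a large batch is a matter of community policy and is not something this sketch settles.

```python
import json

API_URL = "https://www.wikidata.org/w/api.php"


def build_create_claim_params(entity_id: str, property_id: str,
                              target_item_id: str, csrf_token: str) -> dict:
    """Build POST parameters for action=wbcreateclaim with an item value."""
    if not target_item_id.startswith("Q"):
        raise ValueError("target_item_id must look like 'Q123'")
    numeric_id = int(target_item_id[1:])
    return {
        "action": "wbcreateclaim",
        "entity": entity_id,
        "property": property_id,
        "snaktype": "value",
        # Item values are passed as a JSON-encoded string.
        "value": json.dumps({"entity-type": "item", "numeric-id": numeric_id}),
        "token": csrf_token,
        "format": "json",
    }


# Q12345 and the token are placeholders for illustration only.
params = build_create_claim_params("Q129793", "P1072", "Q12345", "TOKEN")
```

These parameters would then be sent as a POST request to API_URL from an authenticated session. For many items, the same function can be called in a loop, ideally with a pause between edits.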

A challenge to us all

Hoi, suppose you are the project manager who needs to report on quite a big project, for instance the "100 women" that the BBC features each year. Arguably all those women have merit and they are at least notable enough to have a Wikidata entry. At this moment there are four years. What queries would you want in order to monitor the progress of the project? For your information, there are four queries I use in a Listeria page for each of the years.

The objective of this challenge is to really consider how Wikidata can be used to manage our editathons, our projects. So please consider what it is that you need, what the prerequisites are and how multiple queries give the insight necessary to report to a GLAM / a BBC / a chapter or the WMF. Thanks, GerardM (talk) 18:39, 19 March 2017 (UTC)

PS Not all the items exist yet. I am adding them based on the information in the en.wp article.


I have added all the missing items for 2016 and 2015. Questions that I have are: how and where do I include sources on a person for editors of a Wikipedia article? What is the breakdown of the countries involved, and to what extent is the information in existing articles available in Wikidata? Thanks, GerardM (talk) 07:55, 20 March 2017 (UTC)

This is great Gerard. They can use this as a task list for WiR editathons as well as the ongoing annual project! -- Erika aka BrillLyle (talk) 05:00, 21 March 2017 (UTC)

Help request

There was once: 'iron (native)' 'instance of' 'mineral'
User:Infovarius prefers:
  • Qualifier: 'as' 'mineral'
User:Chris.urs-o prefers:
  • 'Iron (native)' 'instance of' 'native metal'
  • 'Iron (native)' 'instance of' 'iron (element)'.
  • Qualifier: 'as' 'mineral'
There is another possibility:
  • 'Iron (native)' 'instance of' 'native metal'
  • 'Iron (native)' 'instance of' 'mineral'.
  • Qualifier: 'as' 'iron (element)'
Any suggestions? Regards --Chris.urs-o (talk) 09:12, 20 March 2017 (UTC)
Why not have it as a qualifier? Thanks, GerardM (talk) 09:43, 20 March 2017 (UTC)
We're trying to deprecate as (P794) because it's too vague and untranslatable beyond European languages. I would prefer telluric iron (Q2248028) subclass of (P279): native metal (Q310948). Deryck Chan (talk) 13:56, 20 March 2017 (UTC)
Ohh. I assume that we need a qualifier. --Chris.urs-o (talk) 16:00, 20 March 2017 (UTC)
@Deryck Chan: Could you rename it 'qualifier, strict sense' instead of 'as', and not deprecate it, please? --Chris.urs-o (talk) 05:45, 22 March 2017 (UTC)
@Chris.urs-o: See WD:PFD#As (P794). Deryck Chan (talk) 14:33, 22 March 2017 (UTC)

I demand justice! (just a clickbait title for better justice properties)

Looking at politicians' biographies, I noticed that many had been convicted of libel, or had sued others for libel, by a judge from a certain court, and had paid damages of a certain amount. I added one such statement for a French politician, which was swiftly reverted (with no explanation so far, even though I asked for one). Thus a couple of questions:

  • Has it been debated and set aside due to privacy issues?
  • Can we add this information for public (especially political) figures?
  • Can we perhaps do better than the current situation (by adding properties to simplify description)? A wikiproject/taskforce?

--Teolemon (talk) 09:42, 20 March 2017 (UTC)

We have, as yet, no equivalent of the en:Wikipedia:Biographies of living persons policy (Wikidata:Living people exists as an unadopted proposal), but are nonetheless bound by this Wikimedia Foundation resolution: foundation:Resolution:Biographies of living people. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:03, 20 March 2017 (UTC)

Weekly Summary #252

copy/paste whole statements?

Is there any script or gadget available to copy & paste (or cut & paste) whole statements from one item to another? (including all the qualifiers, references, etc).

There have been a few times recently when I have found a statement on the wrong item of two items with a similar label, and this would be really useful. Jheald (talk) 10:01, 21 March 2017 (UTC)

  • There is a longstanding feature request for this. Supposedly it's not a priority.
    --- Jura 18:03, 21 March 2017 (UTC)

Free the Open Library identifiers from Freebase

Hoi, I would like for us to approve all the Open Library identifiers for authors that are part of the Freebase data we have. It will enable us to link more humans to both VIAF and OL. Thanks, GerardM (talk) 11:15, 21 March 2017 (UTC)

@GerardM: What do you mean by "approve"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:53, 21 March 2017 (UTC)
What I mean is approve them and make them available in Wikidata. They are now in Freebase purgatory. Thanks, GerardM (talk) 12:55, 21 March 2017 (UTC)
@GerardM: there are no pending property proposals listed at Wikidata:Property proposal/Overview that refer to either "Freebase" or "Open Library", as far as I can see. Can you point to at least an example of what you're talking about here? ArthurPSmith (talk) 17:38, 21 March 2017 (UTC)
I think he might be referring to statements in the Primary Sources tool.
--- Jura 18:17, 21 March 2017 (UTC)
yes I do. Thanks, GerardM (talk) 05:29, 22 March 2017 (UTC)

VIAF identifiers as well?

It would be great if we could have the VIAF identifiers as well. Do we know how many of them are only in the Freebase collection? Thanks, GerardM (talk) 21:52, 22 March 2017 (UTC)


I have a list with items and their area P2046.

1) How can I also add the unit? [1]

2) It is not adding decimals.

Xaris333 (talk) 14:22, 21 March 2017 (UTC)

I believe it did not work in the original quickstatements (which you link to) but Magnus is working on a QuickStatements 2 here that may support this - if it doesn't work there please contact him! ArthurPSmith (talk) 17:41, 21 March 2017 (UTC)
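In case it helps, here is a small sketch that generates tab-separated lines with a unit and with the decimals kept. It assumes that the newer tool accepts quantities written as the amount followed by U and the numeric ID of the unit item (please verify this against the tool's help page before running a batch). Q712226 is assumed here to be the item for square kilometre, Q4115189 is the Wikidata Sandbox item, and the function name is made up.

```python
from decimal import Decimal


def quantity_line(item_id: str, property_id: str,
                  amount: str, unit_item_id: str) -> str:
    """Build one tab-separated line: item, property, <amount>U<unit id>.

    The amount is passed as a string and handled with Decimal so that
    decimals are kept exactly as written instead of being rounded.
    """
    value = format(Decimal(amount), "f")
    unit_number = unit_item_id.lstrip("Q")
    return "\t".join([item_id, property_id, f"{value}U{unit_number}"])


# Sandbox item, area (P2046), 12.5, unit assumed to be square kilometre.
line = quantity_line("Q4115189", "P2046", "12.5", "Q712226")
```

Passing the amounts as strings avoids the floating-point conversions that can silently drop or alter decimals when a list is prepared in a spreadsheet or script.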

Label capitalization


I am a semi-active Wikidata editor and I often find that, at least in Catalan, there are a lot of labels which are common names that shouldn't be capitalized (according to Help:Label#Capitalization). I have changed a lot of them, but of course going one by one by hand isn't going to make a dent in the overall set of items. So my questions are:

  1. Is there any script that can make automatic changes to de-capitalize labels which have certain characteristics? I know I could program some kind of bot, but if possible I'd like to avoid it.
  2. Is there any experience with this? I searched a little and the issue has come up in the past but I haven't found a solution nor a convincing answer.

Thank you in advance.--Arnaugir (talk) 15:25, 21 March 2017 (UTC)

Almost every item with a subclass of (P279) statement, or that is the value of an instance of (P31) statement on another item, is a generic concept in some form and should not be capitalized in English and other languages with that rule (unless its name involves some other specific proper noun). Tools like QuickStatements (Q20084080) can make it relatively simple to change a large number of labels quickly. ArthurPSmith (talk) 17:45, 21 March 2017 (UTC)
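As a sketch of how such a batch could be prepared: the snippet below lower-cases the first letter of each label unless the first word is a known proper noun or looks like an acronym, and writes out tab-separated label commands of the form item, L plus language code, quoted label. The proper-noun list, the item IDs and the function names are made up for the example, and the output should be reviewed by hand before it is submitted anywhere.

```python
from typing import Dict, FrozenSet, List


def decapitalize(label: str, proper_nouns: FrozenSet[str] = frozenset()) -> str:
    """Lower-case the first letter unless the first word must stay capitalized."""
    words = label.split()
    if not words:
        return label
    first = words[0]
    # Keep known proper nouns and all-caps acronyms (e.g. "DNA") untouched.
    if first in proper_nouns or (len(first) > 1 and first.isupper()):
        return label
    return label[0].lower() + label[1:]


def label_commands(labels: Dict[str, str], lang: str,
                   proper_nouns: FrozenSet[str] = frozenset()) -> List[str]:
    """Emit one command line per label that actually changes."""
    lines = []
    for item_id, label in labels.items():
        new_label = decapitalize(label, proper_nouns)
        if new_label != label:
            lines.append(f'{item_id}\tL{lang}\t"{new_label}"')
    return lines


# Hypothetical item IDs with Catalan labels.
labels = {"Q1": "Taula", "Q2": "Barcelona", "Q3": "DNA polimerasa", "Q4": "cadira"}
commands = label_commands(labels, "ca", frozenset({"Barcelona"}))
```

Restricting the input to items that have a subclass of (P279) statement, or that are used as the value of instance of (P31), would follow the rule of thumb given above and reduce the risk of touching proper names.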

Item Quality Pilot Campaign

Hi everyone!

My name is Glorian. I am a student working on a thesis about applying machine learning to evaluate Wikidata item quality. I am aiming to develop a kind of ORES for evaluating item quality. Such a tool would enable us to easily find low-quality items and fix them. That way, I hope we can improve our data quality.

As the first step in developing this tool, I am going to launch a campaign for grading the quality of Wikidata items in the near future. Prior to launching such a campaign, I decided to launch a pilot campaign for testing my sample.

Since you are the experts on item quality, I would like to ask you to help me grade item quality in the pilot campaign. Feel free to invite your peers to help grade the items in this pilot campaign. Also, it would be fantastic if you could give me feedback on the campaign. To participate in the pilot campaign, you can go to this link and click “Request Workset”. For hints on grading the items, you can refer to the criteria here.

Many thanks for your help! --Glorian WD (talk) 16:40, 21 March 2017 (UTC)

When those are the criteria, I hardly think your work is worth it. Thanks, GerardM (talk) 17:02, 21 March 2017 (UTC)
If we are to express quality in a meaningful way, compare the number of incoming and outgoing links of an article with the incoming and outgoing links of an item. When there are no obvious surprises, such as homonyms linked in error, we know we are in reasonable shape. When there are issues, we at least have a tool that informs us about the quality of either a Wikipedia article or a Wikidata item. Quality does not exist in isolation and it should be actionable. Thanks, GerardM (talk) 17:32, 21 March 2017 (UTC)
Our quality prediction models are built on "actionable metrics". This is a term we have been using for a long time, so it is something we're sensitive to. See Warncke-Wang et al.'s "Tell me more: an actionable quality model for Wikipedia" for the research that our modeling strategy is based on. I'm not sure what your problem is. --EpochFail (talk) 19:15, 21 March 2017 (UTC)
Exactly, wrong project. This is Wikidata where things are different. Thanks, GerardM (talk) 05:34, 22 March 2017 (UTC)
Thanks. I'm familiar with the distinction. --EpochFail (talk) 18:28, 22 March 2017 (UTC)
@Glorian WD: Please see meta:Research:Index. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:13, 21 March 2017 (UTC)
Hi Andy, I'm working with Glorian on this stuff. I'm not sure why you're directing Glorian to m:R:I, but FWIW, this is hardly a research project at all and much more an attempt to put together basic tooling for Wikidata. Our article quality models for English, Russian, and French Wikipedia have been very popular, so we're trying to get similar functionality implemented for Wikidata. --EpochFail (talk) 19:15, 21 March 2017 (UTC)
Because of "...a student who is working on a thesis about applying machine learning to evaluate...". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 07:48, 22 March 2017 (UTC)
Gotcha. Makes sense. --EpochFail (talk) 15:46, 22 March 2017 (UTC)
Maybe you'd want to test it first within the WMDE/WMF staff member-linked usergroup who drafted the quality criteria? This way, you could do a new draft of the criteria and propose this to the community.
--- Jura 18:02, 21 March 2017 (UTC)
We've already done that before releasing this open pilot. --EpochFail (talk) 19:15, 21 March 2017 (UTC)
Yeah. I think we have done drafting the quality criteria. What we are going to learn from the feedback of this pilot campaign could be used to revise the existing quality criteria. --Glorian WD (talk) 16:41, 22 March 2017 (UTC)
If I'm right your „campaign“ is about judging the correctness of Wikipedia edits. Why should this be done by Wikidata users? --Succu (talk) 20:51, 21 March 2017 (UTC)
Succu: After authorizing there is a step where you have to choose Wikidata. I guess you chose the wrong one? I am using the tool right now to evaluate Wikidata items. Syced (talk) 07:14, 22 March 2017 (UTC)
Without changing anything I now get a set of items. --Succu (talk) 15:32, 22 March 2017 (UTC)
Great, please let us know when results are ready! I guess most items are either D or E. While evaluating I got a redirect item, do you also want to evaluate the quality of redirects? A redirect has no reason to be of bad quality I believe. Cheers! Syced (talk) 07:14, 22 March 2017 (UTC)
Thanks! Good catch with redirects. This is exactly the type of issue that we're looking out for in the pilot. We should remove those redirects from the dataset for the full labeling campaign (which will be ~5k items). --EpochFail (talk) 15:46, 22 March 2017 (UTC)
If you don't do it already, I suggest you extract some information about the image of each Wikidata item: size, contrast, sharpness, and give this information as input to your machine learning. This information can be extracted using ImageMagick (command-line) and OpenCV (library). Syced (talk) 08:44, 22 March 2017 (UTC)
Good idea. Glorian WD, I don't think we have a page put together for proposed feature engineering strategies. Could you create one at Wikidata:Item quality/Modeling or something like that? It'll be good to capture these ideas for when we get to start on feature engineering work. --EpochFail (talk) 15:46, 22 March 2017 (UTC)
Thanks for the suggestion Syced. I have created the page Wikidata:Item quality/Modeling and added your suggestion there! --Glorian WD (talk) 21:59, 22 March 2017 (UTC)
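As a rough illustration of the image measurements suggested above, here is a minimal pure-Python sketch. It is only a stand-in: in practice ImageMagick or OpenCV would do this work, as Syced proposed. The function names, the use of standard deviation for contrast and the use of Laplacian variance as a sharpness proxy are assumptions made for the example, not something specified in this thread.

```python
from statistics import pstdev


def rms_contrast(gray):
    """Contrast as the population standard deviation of pixel intensities."""
    return pstdev([p for row in gray for p in row])


def laplacian_variance(gray):
    """Sharpness proxy: variance of a 4-neighbour Laplacian over interior pixels."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def image_features(gray):
    """Collect size, contrast and sharpness for one grayscale image (list of rows)."""
    return {
        "width": len(gray[0]),
        "height": len(gray),
        "contrast": rms_contrast(gray),
        "sharpness": laplacian_variance(gray),
    }


# Two tiny made-up 4x4 images: a uniform grey one and a checkerboard.
flat = [[128] * 4 for _ in range(4)]
checker = [[0 if (x + y) % 2 == 0 else 255 for x in range(4)] for y in range(4)]

flat_features = image_features(flat)
checker_features = image_features(checker)
```

The resulting dictionary could be appended to whatever other features the model already uses; the uniform image scores zero on both contrast and sharpness, while the checkerboard scores high on both.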
Again, this model is based on Wikipedia. Popularity on Wikipedia projects has no relevance to Wikidata quality. The research is not about Wikidata so what is the basis for all this? Thanks, GerardM (talk) 08:00, 22 March 2017 (UTC)
it is not about the popularity. why don't you let the boffins develop some metrics so we can shove them in the face of the wikipedia know it alls. Slowking4 (talk) 14:55, 22 March 2017 (UTC)
In the last discussion I asked for a use-case of developing this feature and I don't think I got an answer from you. Can you tell us a user story of how you believe the feature you develop will be used for the benefit of Wikidata? ChristianKl (talk) 13:48, 22 March 2017 (UTC)
if i may - if the ORES can incorporate editor judgement of quality, it will allow you to make work groups by quality level focusing effort. and it will allow you to assess quality improvement over time. Slowking4 (talk) 15:18, 22 March 2017 (UTC)
Thanks Slowking4. ChristianKl we're working on a brief summary of the expected benefit when we announce the larger labeling campaign (this is just a pilot to work out the bugs). Here's the short list of some of the things we have been able to do with quality models in Wikipedias that should have corresponding interesting things in Wikidata too: measuring content coverage dynamics, helping students write articles, and helping WikiProjects prioritize re-assessments. Fuzheado has been using our article quality models in his classroom to help students know the impact of their contributions to Wikipedia. The Wiki Education Foundation is using the article quality models to recommend work to students (basically, trying to fill in quality gaps). These are just a few things that are useful with quality models in other wikis. I'm sure Wikidata will have unique use cases too, but this provides a bit of a sample. --EpochFail (talk) 15:55, 22 March 2017 (UTC)
I think it's worthy of criticism that you try to build a quality metric without first looking at the use cases. As I said in the previous discussion I don't think the fact that quality metrics are useful for Wikipedia matter here. The fact that you simply try to copy something from Wikipedia and not listen to feedback enough to bring up the same talking point again, pushes me into a direction that it's worthy to oppose your project.
Outcome metrics that are designed to let people reallocate their efforts to other tasks have the potential to create harm, so the concern isn't just that the project is useless.
I don't think that telling students that edit Wikidata that they should optimize for some quality metric is likely to be beneficial. It makes more sense for the student to add data that they believe is missing and where they think adding the data creates an improvement. That might mean that they create a lot of items with only 3 statements. On Wikipedia creating a lot of stubs is discouraged and when you try to force a system on Wikidata that copies those Wikipedia norms, I think that's bad for our project. ChristianKl (talk) 16:22, 22 March 2017 (UTC)
ChristianKl, we did think of use-cases ahead of time. See my list above and Lydia's comments in this discussion. Also, I'm pretty sure that Lydia Pintscher was the person who first asked me and Glorian to take a look at this project, so I'm sure she had use-cases in mind. It seems you're not at all familiar with Fuzheado's work with the article quality metric and his students. Maybe you could familiarize yourself before raising such strong, negative predictions. The quality model will not capture "Wikipedia's norms" or whatever it is that you mean by that. --EpochFail (talk) 18:27, 22 March 2017 (UTC)

I am not impressed by my trial of a work set. Criteria are somewhat subjective and to my surprise I was asked to judge redirect pages. Lymantria (talk) 15:45, 22 March 2017 (UTC)

Lymantria thanks for pointing that out. It's certainly something we missed when filtering nonsense out of the dataset. Catching issues like this is exactly why we're running this small pilot. We'll make sure to have those filtered out for the larger full campaign. --EpochFail (talk) 15:57, 22 March 2017 (UTC)
Oh! One more thing. You're right that the criteria are subjective. That is by design. We decided that we wanted to capture Wikidata editors' judgement about quality and that the criteria should only help you make judgements. In this case, we're going to compare ratings by different editors to find out where judgement does and does not show consistency. We'll likely need to iterate on the criteria a bit before we're ready for the full labeling campaign to make sure that we make good use of the time we spend labeling. --EpochFail (talk) 16:00, 22 March 2017 (UTC)
Hi Lymantria! Thanks for the feedback. We will make sure we eliminate the redirect pages in the real (full) campaign. Concerning the subjective criteria, as EpochFail pointed out, we did this intentionally because we want to capture the editors' definition/judgment of item quality. Do you have any other feedback that you want to point out? --Glorian WD (talk) 16:32, 22 March 2017 (UTC)
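To make "compare ratings by different editors" concrete, here is a minimal sketch of one possible consistency measure for two raters using the A to E classes. The thread does not say which statistic will be used; Cohen's kappa is an assumption chosen for illustration, and the ratings below are made up.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty list of items")
    n = len(ratings_a)
    # Share of items on which the two raters chose the same class.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected if each rater assigned classes independently
    # with their own observed class frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical class throughout
    return (observed - expected) / (1 - expected)


# Made-up ratings of the same four items by two editors.
editor_1 = ["D", "D", "E", "E"]
editor_2 = ["D", "E", "D", "E"]

kappa_same = cohen_kappa(editor_1, editor_1)
kappa_mixed = cohen_kappa(editor_1, editor_2)
```

A value near 1 means the two editors read the criteria the same way; a value near 0 means their agreement is no better than chance, which would point to criteria that need another iteration.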
Hi, Yes, I do have some other feedback concerning the subjectivity of the criteria. I understand that you are looking for judgements and that judgements do imply a portion of subjectivity.
First is concerning examples at Wikidata:Item quality. Comparing Coniophora arida (Q10646558) to Gustavus Simmons (Q381312) I think that the first is not a very good example for an E and perhaps deserves a better judgement than the second. Both have descriptions in four languages, but the first has more external sources (through identifiers). I would perhaps judge it with C rather than E - which I think is a relatively big difference. It's because of these examples that I find it hard to understand the intention of the classification A-E.
I wonder why class A requires a good image, while not all subjects - think of more conceptual ones - can have one.
An open question of mine: Could an item with instance of (P31)  Wikimedia disambiguation page (Q4167410) perhaps be judged D without more statements? What more statements would one expect? Should the description then perhaps have "statement(s)" instead of the pure plural? Lymantria (talk) 07:31, 23 March 2017 (UTC)
Lymantria, it looks like the examples that I put on Wikidata:Item quality are a bit outdated; they were added before the existing criteria were written. Based on the existing criteria, I agree that those two items should fall into a higher class.
Regarding the image in class "A", note that there is "if applicable" on that criterion. This means that there are class "A" items for which an image is not applicable.
I am considering removing items with instance of (P31)  Wikimedia disambiguation page (Q4167410) from the sample of the full campaign. What do you think about this? --Glorian WD (talk) 15:06, 23 March 2017 (UTC)
Perhaps removing instance of (P31)  Wikimedia disambiguation page (Q4167410) is a good idea, as well as instance of (P31)  Wikimedia category (Q4167836), instance of (P31)  Wikimedia template (Q11266439) and perhaps instance of (P31)  Wikimedia list article (Q13406463). But perhaps these could normally be D class as well. Lymantria (talk) 15:28, 23 March 2017 (UTC)
Lymantria, thanks! I have added this to my notes for improving the sample prior to launching the full campaign. Are you still participating in the pilot campaign? Perhaps you will find more bugs ;) --Glorian WD (talk) 16:39, 24 March 2017 (UTC)
The A to E criteria are subjective and created by you. If you are really interested in the opinions of Wikidata editors on which items are high quality, it doesn't make sense to tell the Wikidata editors which criteria they are supposed to use. If you are sincerely interested in the opinion of Wikidata editors about what quality means you shouldn't prime them with your own sense of what quality is supposed to mean.
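The sample clean-up discussed above could be sketched as follows. This is only an illustration: the item records are a simplified, hypothetical structure (not the actual dataset format), and whether all four classes should really be excluded is still an open question in this thread.

```python
# Classes named above as candidates for exclusion from the full campaign.
EXCLUDED_CLASSES = {
    "Q4167410",   # Wikimedia disambiguation page
    "Q4167836",   # Wikimedia category
    "Q11266439",  # Wikimedia template
    "Q13406463",  # Wikimedia list article
}


def keep_for_labeling(item):
    """Return True if the item should stay in the labeling sample."""
    if item.get("is_redirect"):
        return False  # redirects turned up in the pilot and should be dropped
    return not (set(item.get("P31", [])) & EXCLUDED_CLASSES)


# Hypothetical sample records: id, instance-of values, redirect flag.
sample = [
    {"id": "item-1", "P31": ["Q5"], "is_redirect": False},
    {"id": "item-2", "P31": ["Q4167410"], "is_redirect": False},
    {"id": "item-3", "P31": [], "is_redirect": True},
    {"id": "item-4", "P31": ["Q13406463"], "is_redirect": False},
]

filtered_ids = [item["id"] for item in sample if keep_for_labeling(item)]
```

If these classes are instead kept and simply treated as class D, the same predicate could be used to assign that default rather than to drop the items.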
We had an RFC about quality. It suggests that Accuracy, Objectivity, Reputation and Consistency are important for quality. I don't see those aspects in the proposed A to E criteria. I haven't seen any explicit reasoning why they aren't included. Telling people who you ask to judge quality to assess quality in a way that doesn't address them is problematic. ChristianKl (talk) 09:27, 24 March 2017 (UTC)
ChristianKl, the existing criteria that we have now are, by and large, built upon the Wikidata:Showcase_items, plus the discussion with the editors here. I think the Showcase item criteria are the best that we have now for signifying item quality. I am aware of this RFC. I did contact the author of this RFC and he said that he has still to incorporate the community feedback into this quality framework. This means it might be wise not to base my work on this RFC for now. Moreover, before I started this research, Lydia Pintscher (WMDE) even suggested using the Showcase items criteria for this research. --Glorian WD (talk) 16:49, 24 March 2017 (UTC)
There has been no vote to adopt that document as policy. I don't like the process of making it a de facto policy without a vote by making it the basis of the quality assessment process. It would be easy at this point to simply let people rate items on a 10 point scale and see what Wikidata editors actually think constitutes quality. I don't see a good argument for why the editors should be told what they should consider to be quality when the goal is to get their opinions about the item quality (which is the spirit in which you started this thread). ChristianKl (talk) 17:30, 24 March 2017 (UTC)

A serious chat

Hey folks, we need to have a serious chat about the reception that Glorian WD received here. He's doing some important work because he believes it will help Wikidata. Almost all of the replies have been intensely negative. This really disappointed me and I think that it's time to reflect on how we treat good-faith contributors -- students or otherwise. Feedback is great! Positive feedback expands a project and incorporates more perspectives. Negative feedback attempts to control and shut down ideas that people don't personally agree with; it assumes stupidity and bad-faith, and it will make our community incredibly hard to approach for newcomers. Are we OK with going down the same road as the big Wikipedias by making newcomers not feel welcome? We should be better than this. We should strive to be welcoming, to see the value in each others' work, and to increase that value constructively. In the end, positive feedback is the best hope we have of actually achieving our goals. --EpochFail (talk) 20:31, 22 March 2017 (UTC)

When we are to have a chat, what do you want to talk about? When I read YOUR feedback, you do not address any of the feedback given. I desperately want better feedback and what I get is something that does not provide something I understand we will benefit from. I have given specific scenarios, use cases how Wikidata WILL improve quality both for Wikipedia and Wikidata. There is nothing from you. When you want a serious chat do not blame it on the other because your model of quality is rejected. If anything it is not the student who is given this task that is to blame. What is implemented is a flawed system that may iterate into something useful eventually. Not during his/her project.
When we talk about quality it is sources that are so important. But what sources and how do you concentrate on sources? A previous student did his/her thesis. Did not finish the functionality and we are left with nothing. When you stand for quality let's talk and let's talk seriously. But let us argue about quality and convince. I am on record on how we can improve quality and as far as I am concerned we have not talked. Thanks, GerardM (talk) 20:55, 22 March 2017 (UTC)
Thanks for illustrating my point. How do you know that the quality model will not be useful while Glorian is working on it? I believe in Glorian and in the value of his efforts. I think that the work he's already done will prove to be useful. Who is that past student you speak of and why are you faulting Glorian for this past student's failure to do what you wanted them to do? --EpochFail (talk) 21:28, 22 March 2017 (UTC)
<grin> In that case we are both good illustrators. </grin> Obviously you have to stand by your man. You fail to address any of the points made. Disappointing. Thanks, GerardM (talk) 21:34, 22 March 2017 (UTC)
Shouldn't he speak for himself? I think your meddling ("we"), EpochFail, didn't help him. --Succu (talk) 21:50, 22 March 2017 (UTC)
Glorian is a newcomer. I'm decidedly not (been around for ~10 years), so I'm a bit more empowered to push back against this kind of behavior. --EpochFail (talk) 22:37, 22 March 2017 (UTC)
If it comes to WD, EpochFail, you are a newcomer too. So how does your empowerment to push back against this kind of behavior help Glorian to fight his way through the Wikimedia jungle? --Succu (talk) 22:53, 23 March 2017 (UTC)
Ok, so the talk is with you. You state that you feel empowered to "push back".. I would say you throw your weight around and you do not listen, you do not respond to the pushback for the proposal that is on the table. We are told to accept because "you and some others" have an opinion that something might be good. I tell you and I am longer with the Wikimedia Foundation and I have way more edits on Wikidata and I have published on my blog more about quality in Wikidata than you have. Both our weight, and I am overweight is immaterial when we keep it to arguments.
Quality in Wikipedia is NOT in single items, it is the relationship of items and in the relationship with sources. When a Wikipedia article (a source) states that there are 24 awardees for an award and Wikidata has 24 awardees, it is quality. When the numbers do not match, there is an issue. When someone is called an author and he does not have a VIAF identifier, an author did not publish or is not known in any of the worldwide libraries.. That is a quality issue.
I am not telling you anything new here but it is this kind of actionable quality that matters. When an awardee has little information except for a label and the fact that he is an awardee, the value of the item is high because he completes a set of data. The problem with your approach is that people will mistake your single item based approach and put it up for deletion. When they feel empowered enough by the "low quality" and, yes it is just a Muslim, they may even speedily delete.
With 25,380,559 items and counting there is no meaningful way to manage single items. My watchlist is too big. I have 168 items on it at the moment and there is no way I can check them all. When a bot adds labels it is easily 10 times that much. The descriptions for Wikidata items are useless and we have something better that could replace it but we do not even discuss it. I use Reasonator as a tool because it is accepted that the Wikidata edit screen sucks but ok we suck and we have not had the resources to do something about it.
When you want to engage the community and make a dent in quality consider what we have. It is a dataset. Address quality as a set theory problem and you get traction on many ends. When I am interested in "100 women (BBC)" let me subscribe on my watchlist and have all associated items trigger my watchlist. When I am interested in "Black Lunch Table" same thing. But once you start thinking about quality and engagement in this way, you will understand why the current approach lacks any sense of connection. It does not do so at all.
So the data can be split up in parts to get attention from people having an interest, the data itself can be tracked in this way as well. When DBpedia and Wikidata differ in opinion on a set of items, it is for both projects a quality issue. Have a subset with these differences and you are engaging communities in a meaningful way. When the {{authority control}} on English Wikipedia does not exist on authors, we do not inform readers about VIAF or Open Library. The first informs, the second provides access to authors and freely licensed books. That is operational quality, that is how you engage our community. You provide information that engages and makes for a positive difference.
So you can persist and tell us how important the work that you throw at us is. Or you can react to the criticism to the arguments. So far you are in violent opposition to comments from the community and you do not address any of the issues raised. More importantly you cannot even indicate how your approach will make a positive difference. In my opinion, it is at best a first step on a learning curve about Wikidata for you. It is a lesson many of us have learned already. Thanks, GerardM (talk) 03:40, 23 March 2017 (UTC)
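The set-based checks described above (matching counts of awardees, or differences between two datasets) can be illustrated with a minimal sketch. The identifiers and the two input lists are made up for the example; nothing here reflects an existing tool.

```python
def compare_sets(source_a, source_b):
    """Report what each source has that the other lacks, and whether sizes match."""
    a, b = set(source_a), set(source_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "counts_match": len(a) == len(b),
    }


# Hypothetical awardee lists: one taken from an article, one from Wikidata.
article_awardees = ["person-01", "person-02", "person-03", "person-04"]
wikidata_awardees = ["person-01", "person-02", "person-04"]

report = compare_sets(article_awardees, wikidata_awardees)
```

An empty difference on both sides means the two sources agree on the set; anything else is a short, actionable list of items to look at, for either project.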
Gerard, I thought I had explained everything to you yesterday and you had understood and agreed that the project can be beneficial. I am disappointed that I apparently failed completely. We'll conclude this discussion here and Glorian and I will sit down and work on a page explaining the project more. --Lydia Pintscher (WMDE) (talk) 08:47, 23 March 2017 (UTC)
I am happy with an experiment. I am happy with an initial approach to see where it gets us. I am happy for you and Christian to work on this. I do not accept it as a valid approach because the data representation is oversimplified. I said so at the time in our conversation. You asked me to wait and see because there is merit in being able to identify problems in items, and there is. When EpochFail throws his weight around, and insists that the approach is valid (a professional assessment), he has to appreciate that as a professional working on research, he will be challenged for the validity of what he says. It is not, he knows it.
Lydia you did not fail. Things happened. I am not happy about it and I doubt anyone is. Thanks, GerardM (talk) 06:42, 24 March 2017 (UTC)
The fact that you claim that the work he's doing is important doesn't mean that it's important. It also doesn't mean that the project is net positive. I'm willing to be convinced that it's net positive but till now I'm not seeing a reasonable effort being made. In the previous discussion I asked for use-cases. I don't think that was an unreasonable demand. Since those haven't been provided and there's an attempt to get other people to do something to support the project, I approached this discussion more negatively.
I haven't seen you or Glorian acknowledging the possible harm created by having bad quality metrics. In particular I do have the concern that this project will encourage that a given amount of information is spread over fewer items than would be desirable.
I do think that quality is important but I have the impression that this project seeks the keys under the lamppost. I for example believe that an item with high quality sources is better than an item with lower quality sources. Unfortunately we don't have direct data about source quality on Wikidata and as a result the model that Glorian wants to build is unlikely to pick it up. There's currently a grant proposal in that direction, but that's a different project than Glorian's.
The model is also likely unable to say whether entries on Wikidata accurately reflect what a source says. That's important for quality but your algorithm won't be able to pick it up.
If someone runs a bot and adds 100,000 new statements to Wikidata someone might look at the tool that's supposed to be built here to decide whether or not the bot improves or lowers Wikidata quality. If the metric is bad that's potentially harmful. ChristianKl (talk) 17:39, 23 March 2017 (UTC)

My use case

As a big open project, it is important to have statistics showing how quality evolves over time. But more personally, here is the use case I would like to be able to have:

I spend a lot of time improving embassy items on Wikidata. A great tool would tell me: "Look at this embassy item, its picture is very small and over 80% of its properties have no reference".

Cheers! Syced (talk) 08:19, 23 March 2017 (UTC)

Great idea. Thanks Syced! --Glorian WD (talk) 15:12, 23 March 2017 (UTC)
If the output of this project is a quality score, then it won't tell you "the picture is very small". Given that Wikidata itself doesn't host the data about the image size I'm also not sure that the trained model would easily pick it up. For that use-case another output than a direct quality score would likely be better.
There are advantages to having statistics of how quality evolves over time but that's only true when the quality score measures what it's supposed to measure. ChristianKl (talk) 17:57, 23 March 2017 (UTC)
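A per-item diagnostic of the kind described in this use case, as opposed to a single quality score, might look like the sketch below. It assumes the standard Wikibase entity JSON layout, where "claims" maps each property to a list of statements and a statement may carry a "references" list. The threshold, the function names and the sample entity are assumptions for illustration; image size is left out because, as noted above, that data is not stored on Wikidata itself.

```python
def unreferenced_share(entity):
    """Fraction of an entity's statements that carry no reference."""
    statements = [s for group in entity.get("claims", {}).values() for s in group]
    if not statements:
        return 0.0
    missing = sum(1 for s in statements if not s.get("references"))
    return missing / len(statements)


def diagnose(entity, threshold=0.8):
    """Return human-readable hints instead of a single score."""
    hints = []
    share = unreferenced_share(entity)
    if share > threshold:
        hints.append(f"{share:.0%} of statements have no reference")
    return hints


# Made-up entity with six statements, only one of which is referenced.
sample_entity = {
    "id": "example-embassy",  # placeholder, not a real item id
    "claims": {
        "P31": [{"references": [{"snaks": {}}]}],
        "P17": [{}],
        "P131": [{}],
        "P625": [{"references": []}],
        "P18": [{}],
        "P137": [{}],
    },
}

hints = diagnose(sample_entity)
```

Output of this kind names the specific gap, so an editor working through a set of items (embassies, for instance) can act on it directly.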

Landing (Information) Page

Hi folks,
Thanks for all responses! I do appreciate them :)
I have created a page for clarifying what I am trying to do. You can find that page here. I hope things are clearer now! --Glorian WD (talk) 17:13, 24 March 2017 (UTC)

Thoughts on Quality

I have my thoughts about quality and you can find them here. As you will read I do support the project by Glorian WD. It has potential, it is a first iterative step in the right direction and I support him in this. It will bring us one step in the right direction of better quality but it will not bring us an absolute quality. For that we need more discussion.

While I support this project also because of its limitations, it is important that a discussion about Wikidata qualities and how they apply to all the Wikimedia projects is held. This discussion should be open and assumptions held should be made transparent and discussed. Wikidata is not Wikipedia and what works for Wikipedia and its community does not necessarily work here. There are multiple objectives for this discussion.

  • Find a common ground and see how we can expand this.
  • Make clear that we are truly talking about Wikidata quality and not conflating this with external arguments.
  • Expand the understanding of the qualities of Wikidata and how they can be expanded
  • Build trust.
  • Grow the Wikidata community.

Thanks, GerardM (talk) 09:04, 25 March 2017 (UTC)

Help needed with donating images


The heritage institution I am working for is currently considering donating ca 1300 images (digitized glass-negatives) to wikimedia/wikidata under a CC-BY-4.0 licence. I'd like to get into contact with someone from the community who can guide me through this process. We are based in The Hague, The Netherlands.

Many thanks in advance. DirkJanse (talk) 08:03, 22 March 2017 (UTC)

Hoi, you can contact me.. I am based in Almere. Thanks, GerardM (talk) 08:20, 22 March 2017 (UTC)
FYI I have had contacts. Know the basic requirements and linked to the Dutch chapter as there is a potential for a lot of great cooperation. Thanks, GerardM (talk) 13:59, 22 March 2017 (UTC)
DirkJanse: Please write to with a subject line such as "Wikimedia Commons collection donation". See Thanks! Syced (talk) 08:23, 23 March 2017 (UTC)


How do you register that someone was a slave? Who was the "owner"? Phillis Wheatley is one example. She is in a category with American slaves. Thanks, GerardM (talk) 10:21, 22 March 2017 (UTC)

In the discussion for the creation of social classification (P3716) this use-case was spoken about. ChristianKl (talk) 16:46, 22 March 2017 (UTC)
Thanks, I have added many USA slaves. GerardM (talk) 03:43, 23 March 2017 (UTC)

Items as captions

The problem with media legend (P2096) is that there is no automation; you actually have to enter a caption in every language. Compare this to the two images used on Alces alces (Q35517), which use sex or gender (P21) to specify the sex of the animal in the picture. Is there something similar that can be used on Salmo (Q310436) to specify that the (current) image shows Atlantic salmon (Q188879)? Could instance of (P31) be used?

Another idea is to have a property like "Primary focus of media" and "secondary focus of media", which could e.g. be used to describe a Great Tit (Q25485) (primary focus) sitting on a branch of Corylus avellana (Q124969) (secondary focus), which could then be used to generate (the somewhat imprecise) "Great Tit (Q25485) together with Corylus avellana (Q124969)" in e.g. infoboxes. --Njardarlogar (talk) 10:52, 22 March 2017 (UTC)

Can you use depicts (P180), which is a Wikidata property to describe media items (Q28464773)? - PKM (talk) 20:05, 22 March 2017 (UTC)
Not immediately clear to me if it is intended for photographs, and it seems to require an item (with work (Q386724) for instance of (P31)). A good candidate, though. --Njardarlogar (talk) 11:00, 23 March 2017 (UTC)
Isn't this what Structured data for Commons is going to handle? Matěj Suchánek (talk) 13:38, 23 March 2017 (UTC)
Maybe, I'll have to look into it (links for future reference: Wikidata:WikiProject Commons and c:Commons:Structured data). --Njardarlogar (talk) 09:37, 25 March 2017 (UTC)
Ah, yes, looking at what seems to be a test deployment, I see the property 'depicts' exists and is used for photographs. Good. --Njardarlogar (talk) 09:46, 25 March 2017 (UTC)

Nicolás Antonio de Arredondo

Q5929942: Wrong year of birth. Please change to 1726. If you need proof see source on English Wikipedia. – 16:33, 22 March 2017 (UTC)

1726 is the correct year but the wrong date. There is a reference to a more exact date. Fix English Wikipedia. Thanks, GerardM (talk) 03:45, 23 March 2017 (UTC)
I just added the full date today. It's sourced to a thesis, so I'm not certain that's a reliable source in EN wikipedia. We should be able to track down a better source. - PKM (talk) 04:08, 23 March 2017 (UTC)

Constraints on Follows/Followed by

Looking at followed by (P156) and follows (P155), it seems there has been considerable discussion about whether these properties can be used as statements or must be qualifiers to some sort of series statement. It appears constraints to allow these properties only on qualifiers were added in 2015. One of the Wikidata property examples for <followed by> is "March (Q110) <followed by> April (Q118)". These are not qualifiers, but statements. If there's going to be a constraint then at least the example properties should be consistent with that. - PKM (talk) 19:53, 22 March 2017 (UTC)

don't know about this particular case, but in general Wikidata property examples don't actually work for properties that are used as qualifiers (or in references). I'm not sure how it could even be fixed... ArthurPSmith (talk) 20:12, 22 March 2017 (UTC)
If the constraint to qualifiers is valid, the fix would be to make an item for a series called "months of the year" and then say March <series> months of the year <follows> February, <followed by> April. Personally, I'd rather have the constraint removed. But if there is consensus that the constraint is valid, then the property examples should all be items that use the property within the rules of the constraint. - PKM (talk) 21:46, 23 March 2017 (UTC)

Outlier points in administrative territorial entities

This query may be of interest:

It finds the furthest-from-average place for each one of a set of administrative units (eg, in the link above, for places in each civil parish (Q1115575) in North West England (Q47967)), and lists the worst case for each unit, in descending order by distance.

I am finding it quite useful to identify where a number of places have been included in the wrong one of two districts with similar names, which tend to come at the very top of the list; and also, further down the list, some places that have slightly incorrect coordinates. One day it may be possible to do more automated boundary checking; but I am finding this useful to identify the worst errors.

The set of administrative entities being looked at can be changed by changing the requirement ?cp wdt:P31 wd:Q1115575 . ?cp wdt:P131+ wd:Q47967 in lines 4 and 5 of the query, to any other set of administrative units.

To investigate the outliers further, it is useful to plot all the points in the unit with a query like, putting the item number for the relevant administrative unit in line 3.

I wasn't able to get the query to run to look at all the civil parishes in the UK at once (any further optimisations very welcome); but it seems to be going quite happily looking region by region, and may be worth trying for other parts of the world too. Jheald (talk) 20:34, 22 March 2017 (UTC)
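The logic described above (for each administrative unit, find the place furthest from the unit's average position, then rank units by that distance) can be sketched outside SPARQL as follows. This is not the query referred to in the post, which is not reproduced here. Taking the plain arithmetic mean of latitude and longitude as the "average", and all names and coordinates below, are assumptions for the example.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def worst_outliers(places):
    """For each unit, return its furthest-from-average place, worst units first."""
    by_unit = defaultdict(list)
    for unit, name, lat, lon in places:
        by_unit[unit].append((name, lat, lon))
    worst = []
    for unit, members in by_unit.items():
        mean_lat = sum(m[1] for m in members) / len(members)
        mean_lon = sum(m[2] for m in members) / len(members)
        name, dist = max(
            ((n, haversine_km(mean_lat, mean_lon, la, lo)) for n, la, lo in members),
            key=lambda pair: pair[1],
        )
        worst.append((unit, name, dist))
    return sorted(worst, key=lambda row: row[2], reverse=True)


# Made-up data: (unit, place, latitude, longitude).
places = [
    ("Parish X", "Village 1", 54.00, -2.00),
    ("Parish X", "Village 2", 54.01, -2.01),
    ("Parish X", "Village 3", 54.02, -1.99),
    ("Parish X", "Misfiled place", 51.50, -0.10),
    ("Parish Y", "Hamlet 1", 53.50, -2.50),
    ("Parish Y", "Hamlet 2", 53.51, -2.51),
]

ranking = worst_outliers(places)
```

As in the post, a place assigned to the wrong unit ends up at the top of the ranking, while small coordinate errors show up further down.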

Icon set for Wikidata classes

Hi all,

I created an icon set covering high-level Wikidata classes such as hospital/park/prison/school/etc.

The goal is to have a stylistically coherent collection of icons for use in Wikidata-powered applications. I will use it in my app that shows a map of nearby Wikidata items.

While many items have an image, images can not be used because most are unintelligible when reduced to the size of a map pin. The icons could also be used by Reasonator-like data explorers when an item image is not available. Example: If HospitalA has no image, showing a picture of HospitalB would be misleading, but showing an icon representing a generic hospital could be a way to make the data more visual.

New icons very welcome as long as their style is coherent with the others. And if you know an existing project that already does exactly this (stylistically coherent icon set linked to Wikidata) please let me know! Thanks a lot :-) Syced (talk) 08:47, 23 March 2017 (UTC)

In addition to the image property we have icon (P2910). ChristianKl (talk) 13:27, 23 March 2017 (UTC)
ChristianKl: These icons do not have a coherent style, unfortunately. Also, the examples for this property are items for which an icon is well-established, for instance a wheelchair... so I guess I will get reverted if I design an icon for "animal" and add it, right? If not, I will start adding icons for many items, but I am sure that would cause a major controversy, in particular because the icons are original designs with zero reference. Thanks! Syced (talk) 13:51, 23 March 2017 (UTC)
The domain for icon (P2910) doesn't suggest that it shouldn't be used for a concept like animals, so I don't see why someone would revert. A coherent style does happen to be a concern. Maybe there's a way to signal the style of the icon via a qualifier? ChristianKl (talk) 13:59, 23 March 2017 (UTC)
or a reference - perhaps Syced should create a Wikidata Item to anchor this icon project and use it as source for the icons added here? Or perhaps this should be its own property, if this is to be a formal wikiproject of some sort? It seems to me it's a nice idea at least. ArthurPSmith (talk) 14:04, 23 March 2017 (UTC)
Yes, it might make sense to simply have an item for the project on the Github page and then use stated in (P248). ChristianKl (talk) 16:56, 23 March 2017 (UTC)
I saw your question and examples ("I re-used the excellent Open-SVG-Map-Icons icon set and so far matched about 5% of the icons to Wikidata classes"), great. I think the icon set could have its own property: "wikidata class icon" (not just "any" icon, P2910). Oh, and please upload this free icon set to Wikimedia Commons! Another huge icon set is by Nicolas Mollet: c:Category:Map icons by Nicolas Mollet (but: cc-by-sa!), have a look! --Atlasowa (talk) 19:45, 23 March 2017 (UTC)

BTW, Syced: Is your "app that shows a map of nearby Wikidata items" related to or --Atlasowa (talk) 20:04, 23 March 2017 (UTC)

Huh. So, a year or so ago I was trying to extend the WikidataInfo script to allow users to view bits of data and add simple statements from Wikipedia, using little popout bubbles filled with clear icons. For example, for books there would be a little popout menu filled with just icons for various genres (a UFO for scifi, dragon for fantasy, etc), and clicking on one would add the appropriate statement to the Wikidata items. Unfortunately, I was only able to find a few pictures in different styles, and while I was able to use unicode symbols for some things ('👤', '📖', '🎮', '📰' for P31, '♂', '♀', etc for P21), that didn't help much.
I don't suppose you're planning to expand this to non-geographic classes? This could be really useful for a lot of things. (Also, you may want to add the icons to Commons, if they're not there already.) --Yair rand (talk) 23:10, 23 March 2017 (UTC)
I uploaded two of the icons to Commons and added them as icon (P2910) yesterday, and apparently they have not been reverted yet. In order to get a better idea I wrote a query that shows icons used for 100 random items. As you can see, there is a bit of everything, but black-on-white with roundy shapes and no gradient seems to be the most common type. Syced (talk) 04:56, 24 March 2017 (UTC)
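A query of the kind mentioned above might look roughly like the following. This is only a sketch: the actual query is not shown in this thread, the helper name icon_sample_query is made up, and a plain LIMIT returns an arbitrary set of items rather than a truly random sample.

```python
def icon_sample_query(limit: int = 100) -> str:
    """Build a SPARQL query listing items that have an icon (P2910) statement.

    Hypothetical helper; the text is meant to be pasted into the
    Wikidata Query Service, nothing is sent over the network here.
    """
    return f"""
SELECT ?item ?itemLabel ?icon WHERE {{
  ?item wdt:P2910 ?icon .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
""".strip()


if __name__ == "__main__":
    print(icon_sample_query())
```

Looking at the returned icons side by side is a quick way to judge how much the existing styles differ.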


When someone holds an office as acting, how do we code that?

Example: The current Chief of Defence (Denmark) Bjørn Ingemann Bisserup earlier held the office twice as acting. I tried to encode it in position (P39) under the qualifier subject of the statement (P805), but that qualifier should really point to a more detailed article. Are there better suggestions?

Poul G (talk) 08:57, 23 March 2017 (UTC)

Validation failed: Negative pattern matched: /^\s|[\v\t]|\s$/

Something weird happens when I am trying to change label and description (see error in the title). Please try to set ru-label for Q11079271 "хеширование". --Infovarius (talk) 20:00, 23 March 2017 (UTC)

Strange: it doesn't work with lower case, but it works with upper case... --ValterVB (talk) 20:27, 23 March 2017 (UTC)
Sorry. A bug slipped through here. We've created a patch and it is waiting for deployment later today or early next week. --Lydia Pintscher (WMDE) (talk) 12:23, 24 March 2017 (UTC)
Same thing here: for the past few hours I still cannot save "者" in a label or description. Please try "数学者" or "利用者". --Suisui (talk) 13:50, 24 March 2017 (UTC)
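One way to explore the lower-case/upper-case difference reported above is to run the pattern from the error message against UTF-8 bytes instead of decoded characters. The sketch below is a guess at the failure mode, not the confirmed cause. It assumes that a byte-oriented matcher treats \v as the bytes 0x0A-0x0D plus 0x85, and the function names are made up for illustration.

```python
import re

# ASSUMPTION: in byte-oriented matching, \v covers bytes 0x0A-0x0D and 0x85.
# Python's own \v is 0x0B only, so the class is spelled out by hand.
BYTE_MODE = re.compile(rb"^\s|[\x0a-\x0d\x85\t]|\s$")

# The same idea applied to decoded text, where U+0085 is a code point
# and not a byte that can appear inside another character.
TEXT_MODE = re.compile(r"^\s|[\n\x0b\x0c\r\x85\u2028\u2029\t]|\s$")


def rejected_in_byte_mode(label: str) -> bool:
    """True if the negative pattern matches the UTF-8 bytes of the label."""
    return BYTE_MODE.search(label.encode("utf-8")) is not None


def rejected_in_text_mode(label: str) -> bool:
    """True if the negative pattern matches the decoded label."""
    return TEXT_MODE.search(label) is not None


if __name__ == "__main__":
    for label in ["хеширование", "Хеширование", "数学者"]:
        print(label, rejected_in_byte_mode(label), rejected_in_text_mode(label))
```

Under that assumption, "хеширование" and "数学者" are rejected in byte mode while "Хеширование" passes, and none of them is rejected when the text is matched as decoded characters. That is consistent with the reports in this thread.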

Golden Hind (Q546198) / Golden Hinde (Q20870362)

Could somebody take a look at Golden Hind (Q546198) and Golden Hinde (Q20870362) for me?

Golden Hind (Q546198) originally tripped my attention because it has coordinates for one replica ship (in Devon), but a located in the administrative territorial entity (P131) for another (in London). But should it have either, if the original ship was broken up in the late 1600s?

On the other hand, no wikis other than en-wiki (and Commons) have a separate article for the replicas, and even en-wiki only has one, for the replica in London. All other wikis treat everything in the one article. (Though fr-wiki and de-wiki do have redlinks on a disambiguation page: Golden Hind (Q3110070)) And, also, how should one indicate the relationship between the items? I added based on (P144) on Golden Hinde (Q20870362) for the replica in London, but I'm not sure that's right; and should there be something on the original ("inspired"? "copied as"?), to indicate a later version was made? (Something that could also be relevant for artworks).

Grateful for any thoughts or advice. Jheald (talk) 20:45, 23 March 2017 (UTC)

For Plimoth Jacket (Q28916697), a recreation of a 1620s-style embroidered woman's jacket, which is based on two historical items (one for the cut and shape, and one for the embroidery pattern), I used Plimoth Jacket (Q28916697) <inspired by> Layton jacket (Q6759619) <applies to part> cut (Q11626671). I couldn't find an appropriate property for "inspiration for" or similar on the original jacket's item, though I would like such a property. - PKM (talk) 21:15, 23 March 2017 (UTC)
For the Golden Hinde in particular, you might look at Vera Cruz (Q9696493) and Götheborg (Q2702575), which are both <instance of> ship replica (Q3456301) (which is a <subclass of> both museum ship (Q575727) and replica (Q1232589)). [And of no relevance whatsoever, I toured the Golden Hinde when it was in California many many years ago.]- PKM (talk) 22:10, 23 March 2017 (UTC)
And further, you might qualify ship replica (Q3456301) <of> Golden Hind (Q546198) in addition to your <based on> property. - PKM (talk) 22:16, 23 March 2017 (UTC)
ship replica (Q3456301) is a great find, thank you very much -- exactly what was needed.
Regarding the properties on Golden Hind (Q546198), which are a mixture of data about the replicas and the original (in line with the articles on most wikis), any thoughts on the best way forward? Jheald (talk) 22:32, 23 March 2017 (UTC)
I would split up the statements, statements about the original on Golden Hind (Q546198), statements about the replica on the replica. The Queen really did have the original Golden Hind established near Deptford as a memorial (?the first "ship museum" may be a stretch), according to Stow, you can cite here, p. 41 of the PDF, 23 of the book, so I'd change "ship museum" to "memorial" with a location at Deptford and start date 1581. Not clear when it was broken up, but we ought to be able to dig up a "not later than" date. - PKM (talk) 00:35, 24 March 2017 (UTC)

Cannot add an alias

I am trying to add a Ukrainian alias to Wenkheim (Q830544). I want to add Венк фон Венкхейм, which is an alternative spelling to the article name Венк фон Венкгейм. However, I am getting a strange error: Could not save due to an error. Malformed input: Венк фон Венкхейм. I don't understand what it means. Does anyone know where this comes from? — NickK (talk) 21:39, 23 March 2017 (UTC)

Crazily, the problem seems to be with the lowercase Cyrillic letter х (Х (Q179860)). Does anyone know why this can happen? — NickK (talk) 21:45, 23 March 2017 (UTC)
As there is no answer here and I could not identify any workaround, I filed it as a bug: phab:T161263. NickK (talk) 22:28, 23 March 2017 (UTC)
I am also getting the same error with labels, descriptions, and aliases using the Bengali letter অ (try adding it to the start of twenty-eighth (Q28469738)'s Bengali label). Mahir256 (talk) 01:22, 24 March 2017 (UTC)

A fix is in progress and should be deployed here later today or early next week. Sorry for the issue. --Lydia Pintscher (WMDE) (talk) 12:25, 24 March 2017 (UTC)

I guess it should be deployed ASAP? reported on Korean VP too. — Revi 12:26, 24 March 2017 (UTC)
Deployments on a Friday are highly discouraged and only allowed in very extreme exceptional cases. I will try to get this in as I said but I can't promise it. --Lydia Pintscher (WMDE) (talk) 12:28, 24 March 2017 (UTC)
FYI. Here are testcases on Douglas Adams (Q42). "더글러스 애덤스" OK. "더글러스 노엘 애덤스" failed. ( screenshot: File:Wikidata-error-더글러스-노엘-애덤스.png ) --Jmkim dot com (talk) 12:35, 24 March 2017 (UTC)
If "Unbreak now!" bug blocking non-latin text from being saved (Bengali, Korean, Ukrainian chars are all non-latin, so I'm guessing so. Correct me if wrong.) is not a "very extreme exceptional case", what would be considered "very extreme exceptional cases"? — Revi 12:41, 24 March 2017 (UTC)
"site does not load" or "we are leaking passwords" would be ;-) --Lydia Pintscher (WMDE) (talk) 12:43, 24 March 2017 (UTC)
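The test cases reported in this section can be compared at the byte level. This is a minimal sketch, assuming (without confirmation from the developers here) that the byte 0x85 in the UTF-8 encoding is what the failing inputs have in common. chars_with_byte is a hypothetical helper.

```python
def chars_with_byte(text: str, byte: int = 0x85) -> list:
    """Return the characters of `text` whose UTF-8 encoding contains `byte`."""
    return [ch for ch in text if byte in ch.encode("utf-8")]


# Strings quoted in this section. The outcome is what was reported here,
# or "not reported" where the thread does not say.
reports = {
    "Венк фон Венкхейм": "failed",
    "Венк фон Венкгейм": "not reported",  # existing article name
    "더글러스 애덤스": "ok",
    "더글러스 노엘 애덤스": "failed",
    "অ": "failed",
}

if __name__ == "__main__":
    for text, outcome in reports.items():
        print(outcome, repr(text), chars_with_byte(text))
```

For the strings above, only the ones reported as failing contain a character with that byte: х in the Ukrainian alias, 노 in the longer Korean label, and অ itself.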


I was unable to merge no label (Q29017199) and Julius Lovy (Q16947552). 23:33, 23 March 2017 (UTC)

  Done MechQuester (talk) 00:24, 24 March 2017 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 14:06, 24 March 2017 (UTC)

Property:grammatical gender?

Is there a generic counterpart to gender of a scientific name of a genus (P2433) (i.e. grammatical gender only) that is suitable for specifying a per-language grammatical gender? For example, in Italian a box (Q188075) is a feminine gender (Q1775415) thing. --Valerio Bozzolan (talk) 01:48, 24 March 2017 (UTC)

I do know that I had the most difficult time getting the property approved, because there was great fear that it would be used for purposes other than generic names of organisms, so no, that would be misuse. See the discussion of the property. - Brya (talk) 04:15, 24 March 2017 (UTC)
@Visite fortuitement prolongée: I've seen that you suggested "grammatical gender of latin language nouns" in that discussion, and I think that it could solve the problem. --Valerio Bozzolan (talk) 12:47, 24 March 2017 (UTC)

Criminal organisations

Moved to Wikidata:Форум

Questions about Russian articles should be discussed at the Russian forum. Infovarius (talk) 12:34, 24 March 2017 (UTC)

Infovarius. I wasn't just talking about Russian articles, I was also asking about a dewiki article. Also, I don't speak Russian (or German), so I am not sure if I should be posting in English on the Russian or German forums. I think if someone came to this page and started posting in Russian or German, some people might not appreciate it. I really think the discussion should be moved back. SJK (talk) 21:38, 24 March 2017 (UTC)

Entity/property usage calculator

Hey guys, I've just added a new tool, listed as "linkscount" in the gadget preferences, that quickly calculates and reports the number of links to a page/entity/property/template. It is analogous to the "Transclusion count" link on English Wikipedia but much easier and faster to use. I often want to know exactly how many times a property is used on Wikidata, or how widely a template is used on other wikis; the tool makes this easy by adding a "(count)" link on Special:WhatLinksHere. For example, see Special:WhatLinksHere/Property:P961 or Special:WhatLinksHere/Template:Navbox once you've enabled the tool through the gadget preferences. I hope you find it useful. Cheers −ebrahimtalk 09:17, 24 March 2017 (UTC)

geographical envelope/shell concept in Russian geography

Can someone who speaks Russian/Ukrainian/etc confirm what no label (Q2627400) is in English? If I understand this correctly, it is called "geographic envelope"/"geographical envelope"/"geographical shell"/"geographic shell" in English, and it is a concept which is influential in (ex-)Soviet/Russian geographical science but is not used so much in geography in other parts of the world. Is that right? (I was going to edit the item but was worried that I am guessing more than knowing what I am doing.) If that is right, what should P31/P279 be on this item? SJK (talk) 09:40, 24 March 2017 (UTC)

I don't know the English equivalent for this but I'll try to explain. It is the main (and only?) subject of geographical studies (physical geography, mainly). It is a layer within roughly ±10 km of the Earth's surface. It consists of the crust, hydrosphere, biosphere, inner atmosphere (tropo- and maybe strato-) and anthroposphere. I don't know which P31/P279 is suitable. Infovarius (talk) 12:25, 24 March 2017 (UTC)

New filters for Recent Changes - Beta deployment scheduled


(Sorry to write in English for non English-speakers. Please help translate to your language and also inform everyone about this change, thanks!)

The Collaboration team is going to launch a new Beta feature on your wiki, New filters for edit review. This deployment is planned for April 11 (to be confirmed). This Beta feature is an improvement of the current ORES Beta feature.

What is this new feature?

This feature improves Special:RecentChanges and Special:RecentChangesLinked by adding new useful features that will ease vandalism tracking and support of newcomers:

  • Filtering - filter recent changes with easy-to-use and powerful filters combinations.
  • Highlighting - add a colored background to the different changes you are monitoring, to quickly identify the ones that matter to you.
  • Quality and Intent Filters - use ORES predictions to identify real vandalism or good faith intent contributions that need assistance.

What will happen?

At the moment, the ORES beta feature is available in your Beta preferences. Once the deployment is done:

  • Users who have enabled the ORES beta feature will be using the New filters for RC Beta feature (no action needed, unless you want to opt out).
  • ORES predictions highlighting will be activated by default for all users (not in Beta anymore).
    • It is symbolized by an "r" in the RecentChanges pages, for all users.
    • Users who want to change the accuracy level of ORES predictions or opt out can do so in their preferences.

How to prepare for this change?

You can discover the purpose of this project by visiting the quick tour help page. Also, please check the documentation (and help to translate it). Please also have a look at the translations of the interface on The messages are prefixed as rcfilters- and ores-rcfilters-.

For an early trial, the new filters will be available on, Polish Wikipedia and Portuguese Wikipedia as a Beta feature, on March 28, 13:00 UTC.

You can ping me if you have questions. I'll reply on Monday.

All the best, Trizek (WMF) (talk) 12:46, 24 March 2017 (UTC)

NLC (P1213) has some strange ID/URL mechanisms

Property:P1213 currently documents a pretty simplistic mechanism for generating the URL, which is wrong considering that w:zh:Module:Authority control uses another external parameter, NLC_URL, for the acc_sequence parameter. (For example, this author has a sequence of 000221367 and an identifier of 000080574. Putting his sequence into the FIND-ACC function yields a list of his books. I am not sure if there is a thing like FIND-IDENTIFIER.) Should Wikidata create a new property for storing the acc_sequence that builds the actual URL? Or should Wikidata actually shift to this NLC_URL parameter, as it is actually how you find other identifiers? (I don't think I am ever going to find out how one gets any of these things in the first place.)

Pinging User:Dabao qian (zhwp). --Artoria2e5 (talk) 14:03, 24 March 2017 (UTC)

Error while trying to set description

I've added a Russian label for item Q2548040, but when I try to set the description to 'химическое соединение' (without quotes; it means chemical compound), I receive the error message: "Could not save due to an error. Malformed input: химическое соединение". I get the same error on item Q413421, but some days ago I successfully set this description for item Q28976361. Def2010 (talk) 19:22, 25 March 2017 (UTC)

#Cannot add an alias Matěj Suchánek (talk) 19:38, 25 March 2017 (UTC)