Wikidata:Project chat/Archive/2020/05

Q36930430 and Machowicz

This originated with a wrong spelling of a category for Richard Machowicz (Q2150152) on Commons. I have fixed it on Commons. There are still dozens of language labels to be corrected in the WD item. Do I have to change them by hand, or is there some other way? Kpjas (talk) 06:38, 1 May 2020 (UTC)

Empty mass and Gross mass

Hi, Is there a proper way to define the empty mass and the gross mass of, for instance, a car? I could not find proper qualifiers for the mass property. Thanks. Sovxx (talk) 10:35, 1 May 2020 (UTC)

How do we keep bot owners from importing the same bad data over and over

How do we keep bot owners from importing the same bad data over and over?

For example, there are multiple instances on Virgil (Q1398) where the bot BotMultichillT:

imports a large batch of bad data
which is then removed
then imported again a few months later
removed again
imported again
removed again
imported again
removed again
imported again
removed again
imported again
removed again

How are we ever going to make progress as editors cleaning up bad data if the bot owners keep putting the bad data back in? --EncycloPetey (talk) 14:29, 26 March 2020 (UTC)

See Help:Deprecation. You keep the entry but mark it as a false claim. The entry can't be reinserted as it already exists but no one pays attention to it because it is recorded as false. From Hill To Shore (talk) 15:20, 26 March 2020 (UTC)
It's not possible to mark aliases that way. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)
In the specific example you link above where the bot is inserting several different language labels into the English fields, you need to flag it up to the bot operator. Either the source that the bot is using needs to be put on a black list or the logic of where the bot is inserting the data needs to be improved. From Hill To Shore (talk) 15:28, 26 March 2020 (UTC)
Several people have brought this and similar edits to the bot owner's attention multiple times. There are at least two active threads about ULAN data (from three editors) on his talk page right now. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)
The first step is to ping the bot owner to see if they will engage here. @Multichill:. If the bot owner doesn't respond to multiple requests on their talk page or pings to related discussions then you can flag it to administrators to intervene. What often happens is that we find the bot owner is busy and had either not picked up the earlier messages or misunderstood the implications. Most issues can be resolved once the bot operator engages in the discussion. From Hill To Shore (talk) 16:21, 26 March 2020 (UTC)
@From Hill To Shore: The bot owner has replied below. He refuses to change his bot and is accusing me of vandalism. He has reverted me [1] and claims all the data is valid as English aliases, contrary to my explanations below, simply because they are in another database. --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)
  • "Bad" doesn't really say much. Sometimes it happens that bots have code errors or do mis-mappings and the result should be different or added elsewhere. This means that the bot's code has a design issue and should be blocked.
Here this doesn't seem true. I think we discussed these alias here before and concluded that it's a good idea to add them. Why do you keep deleting them?
It's a common feature of people born before spellings of names were standardized that there isn't just one that was used to refer to them. Bear in mind that Wikidata is not Wikipedia nor an encyclopedia.
If you think some are problematic, you could add them as statements with the given reference, deprecated rank, and a reason for deprecation. Such cleanup would be most helpful, merely suppressing referenced data is not. --- Jura 15:34, 26 March 2020 (UTC)
"Publiusz Wergiliusz Maro" is not English; it's Polish.
"Publio Virgilio Maron" is not English; it's French.
"Vergil." is not an alias; it's Vergil with a period added.
You think that "... Virgil" is a valid alias? Why? Why would we need an alias with preceding ellipsis?
In short, I don't think you properly looked at the list of added aliases. I don't see how adding 68 statements about deprecation is a good solution to the problem of alias cruft. --EncycloPetey (talk) 15:41, 26 March 2020 (UTC)

I have no intention of changing the behavior of the bot because the Union List of Artist Names (ULAN) considers these valid aliases. You should not be removing these aliases. See https://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500337098 to see that these aliases are all individually sourced. Removing these aliases borders on vandalism. Multichill (talk) 16:41, 26 March 2020 (UTC)

Removing bad data is not vandalism; it improves the quality of Wikidata content. Why are you refusing to alter the behavior of your bot? --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)
In your opinion it's bad data. This opinion is not backed by any sources. In my opinion these are valid and useful aliases and my opinion is backed by various sources. Shouting generally doesn't improve your point. Multichill (talk) 17:03, 26 March 2020 (UTC)
In your opinion, does "Vernacular" mean "English"? Your bot edits indicate that you think so. The data at ULAN you are adding is marked as Vernacular, but your bot is dumping the entire lot of it into the English aliases field, regardless of what language the data is in. The ULAN data does not indicate its language; it is therefore inappropriate to repeatedly claim that it is English. I have not "shouted" anywhere; I have bolded the key lines of my discussion for the ease of readers who do not wish to slog through all of the discussion. Emphasizing is not shouting. --EncycloPetey (talk) 17:45, 26 March 2020 (UTC)
You are asserting that the ULAN aliases are perfect. But this is clearly not the case. Once imported, if those aliases are improved, the improvements should persist. It would appear that bot-imported aliases, plus improvements by other Wikidata editors, are superior to the original ULAN aliases, and so the original ULAN aliases should not take precedence. —Scs (talk) 22:49, 26 March 2020 (UTC)
That's not my field, but in general: the fact that some organisation considers something valid does not mean that something is valid. There are many examples in chemistry databases where there are names that are obviously incorrect for an alias in WD (e.g. broader/narrower/related concepts that should have or already have different item in WD). Wostr (talk) 16:50, 26 March 2020 (UTC)
@Multichill, call me a vandal if you dare, will you? — Mike Novikoff 14:20, 31 March 2020 (UTC)

It is indeed a problem, I see a lot of incorrect aliases of chemical compounds that are a result of automatic imports (incl. aliases in different languages matched to English, aliases that are names of broader/narrower/related concepts, aliases with capital letters despite the fact that the same alias is already present without capitals). Sometimes such erroneous aliases are propagated to other languages (eg. English aliases are copied to Welsh, British English etc.) However, I don't think there is a universal solution for this — many of these errors are caused by imperfections in imported databases — but I think that every mass import should be discussed before it starts (at least a month earlier) in relevant WikiProject or in other place. This could reduce the number of errors. Wostr (talk) 16:47, 26 March 2020 (UTC)

    • I think there was some gene bot that messed up a lot of aliases. I think they were mostly fixed. Also, importing redirects from Wikipedia as aliases, as was done for some languages, is known to be problematic. None of this is relevant to the referenced additions by the bot above. --- Jura 17:07, 26 March 2020 (UTC)
  • What is the primary purpose of aliases? To me they are simply a way to improve search results. If a variant of a name is likely to be used for search purposes, then it is useful to have that as an alternate label. The more aliases the better, generally. Is there some other use for aliases of which I am unaware, that is damaged by having too many? ArthurPSmith (talk) 17:29, 26 March 2020 (UTC)
    But should Polish, French, and German aliases appear in the English alias field? Should aliases that simply add a period after the name be added? --EncycloPetey (talk) 17:52, 26 March 2020 (UTC)
    I'd say that enabling searching is one of two equally-important purposes of aliases. But "enabling searching" does not necessarily require explicitly including every possible misspelling and abbreviation and punctuation variation. —Scs (talk) 12:09, 27 March 2020 (UTC)
  • I'm guessing the problem here is that the Union List of Artist Names does not tag aliases by language, but of course Wikidata does. So yes, the ULAN aliases are all valid; what's invalid is importing them into Wikidata and tagging them as "en". So we need to figure out a way of augmenting the ULAN aliases with language mappings for proper importing, or else find a way to import them into Wikidata either without a language tag, or with some kind of "unknown" or "unspecified" language tag. Or -- here's another idea -- instead of deleting the "bad" aliases, re-tag them with better languages, and then teach the bot not to re-import an alias if it's present under any language tag. —Scs (talk) 17:49, 26 March 2020 (UTC)
    That's part of the problem. The ULAN database also has aliases that are the same as other aliases, but with punctuation added, such as the name followed by a period, or the name preceded by ellipses. --EncycloPetey (talk) 17:50, 26 March 2020 (UTC)
I was about to say, the bot should be filtering those out, too, because in this case they're clearly unnecessary. The tricky part is that there are other cases where punctuation can be more significant. So it's not immediately obvious what a bot's rules should be for when to strip insignificant punctuation, or which punctuation is insignificant.
In any case, it seems we do need a better consensus on bot activity. This is the second time in as many weeks we've had complaints here about bots importing poor-quality data. It seems to me that in such situations bot operators should not just be falling back on the defense of "the bot is fine, and the data is fine, stop complaining". —Scs (talk) 18:00, 26 March 2020 (UTC)

This issue was raised back in 2018 (part1, part2), where I noted that many of the aliases being added by BotMultichillT from ULAN simply do not conform to our alias policy. The issue of adding alternative names marked as "vernacular" or "undetermined language" as English aliases was called out, as was the fact that many ULAN aliases are simply related concepts. The botop was asked to make the bot respect the work of editors removing these incorrect aliases, but apparently nothing was done. Bovlb (talk) 18:19, 26 March 2020 (UTC)

Can we block the bot until the problem is fixed? It's added the bad data in yet again since this discussion started. --EncycloPetey (talk) 20:15, 26 March 2020 (UTC)
  • So, having spent a lot of time in editing items about ancient Greek and Latin literature, I agree with many opinions expressed above, namely by @EncycloPetey, Scs, Bovlb:: it's true that ULAN contains a lot of aliases whose language is not stated; consequently, all these names exist, but it's wrong to add them indiscriminately as English aliases, because most of them aren't used in English sources, but in sources written in other languages. I have myself removed a lot of such aliases in the past years and months (this today). Since the problem has already been discussed (as stated by Bovlb), I think the bot should stop additions - or be blocked - until a consensus is reached; my suggestion is that the bot should at least never add an English alias when it is present in at least one label or alias other than English (of course there are a few exceptions, e.g. Publius Vergilius Maro can be a valid label/alias both in English and in Latin and in other languages, but it's better to manage those cases manually). --Epìdosis 20:34, 26 March 2020 (UTC)
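The rule suggested above (never add an English alias when the same string is already a label or alias in another language) could be sketched roughly as follows. This is only an illustration: the dict-per-language layout and the function name `should_add_english_alias` are assumptions, not the bot's actual code or any real Wikidata client API.

```python
def should_add_english_alias(candidate, labels, aliases):
    """Return False if candidate already appears under any non-English language.

    labels:  hypothetical mapping of language code -> label string
    aliases: hypothetical mapping of language code -> list of alias strings
    """
    wanted = candidate.casefold()
    for lang, label in labels.items():
        if lang != "en" and label.casefold() == wanted:
            return False
    for lang, names in aliases.items():
        if lang != "en" and any(name.casefold() == wanted for name in names):
            return False
    return True
```

Note that a string such as "Publius Vergilius Maro", which is legitimately shared between English and Latin, would be rejected by this check, so those exceptions would still have to be handled manually, as suggested above.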
  • Thanks for those links, @Bovlb:. At the risk of reopening that debate (and at the further risk of seeming to criticize the unsung work of bot operators, which is certainly not my intent), I have to point something out.
It's claimed that these "extra" aliases, the ones that some here are complaining about and trying to trim down, are important for searching. That might be true if we had a really dumb search engine, but we don't: the Mediawiki search engine(s) is/are pretty good. If you were to search for, say, "P. Vergilius Maro", and if Q1398 did not have an alias with that exact spelling, your search would still find Q1398 perfectly easily. (I tested this hypothesis by searching for "Q. Vergilius Maro". Similarly if you search for "Mantoano Virgilio" even though the closest explicitly-listed alias is "Virgilio Mantoano".) So there may be a reason for preserving the full breadth of these "extra" aliases, but enabling better searching isn't it. —Scs (talk) 11:31, 27 March 2020 (UTC)
  • Easily with Special:Search maybe, provided that the relevant string property is indexed, but most searching at Wikidata is done with the entity selector. If it hasn't happened in 8 years, and it's not possible otherwise by now, it's unlikely ever to happen.
Anyways, it's still not stated why it's a problem to have more than 2 alias for an item. What are you trying to do with them? Maybe the usecase for not having them should be stated. --- Jura 11:51, 27 March 2020 (UTC)
  • Some of the points I have seen for retaining this data dump of aliases under the English tag suggest that the field is used solely for searching on Wikidata. This is incorrect. The aliases are reused elsewhere, such as in the Commons Creator template, providing a way for users of many projects to interact with our content. By dumping multiple language alias data against the English entry, we present English users with a nonsensical list of names (many of which can't even be read). Of a more serious and damaging nature though, we are hiding the native labels from non-English readers, because the content is against the wrong filter. From Hill To Shore (talk) 13:00, 27 March 2020 (UTC)
    • Can you provide us a sample from Commons creator template you see as problematic? --- Jura 13:12, 27 March 2020 (UTC)
      • I can't give you a specific example of a creator that has been affected by this problem as it is unclear on the scale of the issue. If bots and human editors are edit warring over this, there may be no specific cases that have had their data read by Commons. However, as a hypothetical example, see Commons:Creator:Rowland Langmaid. This shows a number of aliases if you are viewing Commons in English but will show different aliases if you are set to a different language. From Hill To Shore (talk) 13:30, 27 March 2020 (UTC)
  • It's a problem to have supposed aliases that aren't really aliases because someone writing something is liable to feel free to choose one as a stylistic matter, and end up writing something inappropriate. - Jmabel (talk) 15:37, 27 March 2020 (UTC)

I looked at some cases people are not happy about the bot importing many aliases (Virgil (Q1398), Homer (Q6691), Jerome (Q44248) & Dante Alighieri (Q1067)). The common denominator is that these people don't seem to be a (subclass of) visual artist (Q3391743). I updated the query to only work on visual artists. Is that a good compromise? Multichill (talk) 18:18, 27 March 2020 (UTC)

This update will limit the problems, but this is not a generic solution. --NicoScribe (talk) 18:40, 27 March 2020 (UTC)
Moreover do you plan to remove the incorrect values that have already been imported? --NicoScribe (talk) 15:52, 31 March 2020 (UTC)

This is not a problem of a bot, but a different understanding of what we should/should not be allowed as alias between one user and one bot-operator. Blaming the bot-script is not going to solve this. Starting a discussion with the bot-owner might either solve the issue, or end as agree to disagree. Edoderoo (talk) 10:29, 30 March 2020 (UTC)

What do you mean by "one user"? There's a lot of users complaining for more than four years already. Could it for once end as enough is enough? — Mike Novikoff 15:33, 30 March 2020 (UTC)
@Edoderoo: I am not really blaming the bots of Multichill. They are doing what their operator wants. Moreover "The contributions of a bot account remain the responsibility of its operator [...] In the case of any damage caused by a bot, the bot operator is asked to stop the bot. [...] The bot operator is responsible for cleaning up any damage caused by the bot" per Wikidata:Bots policy.
What do you mean by "one user"? Look at the old discussions and the current one: this is not "between one user and one bot-operator", this is "between 25 users and one bot-operator". Starting the old discussions with the bot-operator did not solve all the issues. --NicoScribe (talk) 15:39, 31 March 2020 (UTC)
What do you mean by one user. Counter question: why is the subject bot owners in plural? When the problem is with only one bot owner, it is brought as a generic problem between bot owners and the smart people. Discussing with such pre-positioning is not offering any solution. Edoderoo (talk) 16:11, 31 March 2020 (UTC)
@Edoderoo: yes, this discussion's title should include the words one bot owner instead of bot owners. So, what is your solution when 25 users disagree with one bot-operator? --NicoScribe (talk) 16:26, 31 March 2020 (UTC)
@Edoderoo: what is your solution when 25 users disagree with one bot-operator, please? If your solution is to talk, that is exactly what we are trying to do here. --NicoScribe (talk) 14:37, 4 April 2020 (UTC)
I guess our next move should be to ask for a block of BotMultichillT at AN. It clearly goes against the consensus, yet the bot owner says "I *am* right, period". Some other admin will put *another* period there, won't he? Personally, I'd suggest forbidding the use of ULAN at all for any of Multichill's bots. Four years were more than enough to show his mighty skills to filter out junk from crap, let's strike a balance now. And I sincerely hope that WD is not like ruwiki (which I had left more than a year ago) where any user with admin flag will do whatever he wants and nothing will ever stop him. — Mike Novikoff 21:44, 7 April 2020 (UTC)
  • Multichill already "filter out the non-latin strings" so I do not understand why Multichill would be unable to filter out some keywords, such as "called", "dit", "detto", "genannt", "surnommé", "plus connu sous le nom de", "eigentlich", "known as". --NicoScribe (talk) 15:52, 31 March 2020 (UTC)
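A keyword filter of the kind described above might look something like this minimal sketch. The keyword list is the one quoted in the comment; the function name and the whole-word matching rule are assumptions for illustration and say nothing about how the bot actually filters strings.

```python
import re

# Keywords quoted in the discussion above.
KEYWORDS = (
    "called", "dit", "detto", "genannt", "surnommé",
    "plus connu sous le nom de", "eigentlich", "known as",
)

# Match a keyword only when it is not embedded in a longer word,
# so that a name like "Edith" does not trigger on "dit".
_KEYWORD_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")(?!\w)",
    re.IGNORECASE,
)


def contains_descriptive_keyword(alias):
    """Return True if the alias contains one of the listed keywords as a whole word."""
    return _KEYWORD_RE.search(alias) is not None
```

A filter like this can still produce false positives (for instance a real name that happens to be one of the keywords), so flagged strings would probably be better queued for human review than silently dropped.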
  • Note also the Wikidata v. Renon case. Looks like an utter madness, doesn't it? I can't readily propose an algorithm to filter out such things, and moreover I don't suppose it's my burden. What does it mean in practice? Does it mean that Renon should persist for some another five or ten years? Just imagine all the people... no, alas, just imagine how globally this "Renon" will be replicated all across the Universe... well, across the Internet by then. — Mike Novikoff 18:25, 31 March 2020 (UTC)
  • I wrote to the Getty (the Vocabulary Program chief editor and 2 IT people): There's a heated discussion about Multichill (the author of the Sum of All Paintings project) importing ULAN names and aliases indiscriminately. The example being discussed is Virgil. Two problems are pointed out:
    • 1. Many aliases are a duplicate of another, with just a trailing dot added. Sometimes the dot indicates an abbreviation, but not always. Even an example with leading ellipsis is pointed out. In some cases the dots are required to show abbreviation (e.g. "Pub.V.M." and "P.V.M.") but in other cases they are parasitic (e.g. "Wergiliusz" vs "Wergiliusz."). Since the dots are not useful for searching, can the Getty fix some of these problems?
    • 2. There is no language tag, so his dumping of all aliases in Wikidata as EN is incorrect. But I don't see an easy way for the Getty to fix this... --Vladimir Alexiev (talk) 06:21, 13 April 2020 (UTC)
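For the first problem, one conservative check a bot could apply is to treat an alias as a punctuation-only duplicate only when stripping a leading ellipsis or a single trailing dot yields a name that is already known. The sketch below is an assumption-laden illustration (the function names are hypothetical); it deliberately leaves abbreviations such as "P.V.M." alone because their stripped form is not an existing name.

```python
import re

# A leading ellipsis written either as three dots or as the single character.
_LEADING_ELLIPSIS = re.compile(r"^(?:\.{3}|…)\s*")


def strip_edge_punctuation(alias):
    """Remove a leading ellipsis and at most one trailing dot."""
    stripped = _LEADING_ELLIPSIS.sub("", alias.strip())
    if stripped.endswith("."):
        stripped = stripped[:-1]
    return stripped.strip()


def is_punctuation_duplicate(alias, existing):
    """True if alias differs from an already-known name only by edge punctuation."""
    stripped = strip_edge_punctuation(alias)
    return stripped != alias and stripped in existing
```

With the existing names "Virgil", "Vergil" and "Wergiliusz", this would flag "Vergil.", "Wergiliusz." and "... Virgil" but keep "P.V.M.". It does not decide which punctuation is significant in general; it only avoids adding a second copy of a name that is already there.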
If this set of aliases is generally good (that is, if it's worth bulk-importing at all), but if it contains these anomalies that can't readily be automatically filtered out or properly tagged, one solution would be to let the volunteers here clean them up (as indeed some volunteers have been trying to do). That would work if the bot can be persuaded not to re-import the same data over and over. —Scs (talk) 11:13, 13 April 2020 (UTC)
If I were running this bot, I would alter its current algorithm:
    for each alias a in external database D
        if a is not present on the relevant entity e in Wikidata
            add a to entity e
and change it to:
    for each alias a in external database D
        if a is not in auxiliary list X and is not present on the relevant entity e in Wikidata
            add a to entity e
            add a to auxiliary list X
Scs (talk) 12:17, 14 April 2020 (UTC)
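As a rough, runnable rendering of the second algorithm, here is a minimal sketch. The names `sync_aliases` and `already_offered` (standing in for "auxiliary list X") are hypothetical, plain Python sets stand in for both the Wikidata entity and the persistent list, and nothing here reflects the bot's real code.

```python
def sync_aliases(external_aliases, entity_aliases, already_offered):
    """Add each external alias to the entity at most once over the bot's lifetime.

    external_aliases: iterable of alias strings from the external database D
    entity_aliases:   set of aliases currently on the entity e (mutated in place)
    already_offered:  set playing the role of auxiliary list X (mutated in place)
    Returns the list of aliases added during this run.
    """
    added = []
    for alias in external_aliases:
        if alias in already_offered or alias in entity_aliases:
            continue
        entity_aliases.add(alias)
        already_offered.add(alias)
        added.append(alias)
    return added
```

If an editor removes an alias after the bot has added it, the next run skips it because it is recorded in the auxiliary set. One consequence of following the pseudocode literally is that an alias which was already on the entity before the first run is not recorded, so it could still be re-added once after a human removes it. In practice the auxiliary set would also need to be stored per entity and persisted between runs.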
@scs: Such a change would resolve the issue of bots edit-warring with human editors but, as ghuron noted in a different discussion, "I need to create infrastructure that can cache hundreds of millions of individual statements. This task alone is significantly more complex than the rest of the bot code." Bovlb (talk) 16:50, 14 April 2020 (UTC)
@Bovlb: That's a good point; thanks for reminding me.
The difference here is that (a) I don't get the impression there are hundreds of millions of ULAN aliases being imported, and (b) aliases don't have rank, so the people who would like to see the aliases cleaned up have no choice but to try to convince the bot operator to change its algorithm.
(If this problem persists, I'm thinking that the people who would like to see the aliases cleaned up are going to have no choice but to seriously propose adding a whole new ranking/deprecation mechanism for aliases.)
The other question I would ask is whether the ULAN aliases are so vital a dataset that a 100% synchronization process between them and Wikidata has to be ongoing. If not, it seems like it would be enough to make one pass through the importation process, then call it done and move on to some other database to import. Or (since it's true that new people worthy of inclusion are being born and identified every day) there could be a simple mechanism to limit the continuous import to process only people newly added to ULAN. (That would be simple, without requiring an entire "auxiliary list X", if it happens to be the case that ULAN's IDs, like Wikidata's Q-ids, are monotonic.) —Scs (talk) 11:39, 15 April 2020 (UTC)
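The "only process newly added people" idea could be sketched with a simple high-water mark, under the unverified assumption made above that the external IDs only ever increase. The record layout (dicts with an integer `id`) and the function name are hypothetical.

```python
def select_new_records(records, last_seen_id):
    """Return the records with an id above last_seen_id, plus the new high-water mark.

    Assumes, without verification, that IDs in the external database are
    assigned in increasing order, so anything above the mark is new.
    """
    fresh = sorted(
        (record for record in records if record["id"] > last_seen_id),
        key=lambda record: record["id"],
    )
    new_mark = fresh[-1]["id"] if fresh else last_seen_id
    return fresh, new_mark
```

The bot would store `new_mark` after each run and pass it back in as `last_seen_id` next time, so records that have already been processed (and possibly cleaned up by editors) are never revisited.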

Abandoned the ULAN alias bot

No way to please people and I'm done with the unconstructive comments, nasty remarks and demotivating comments. Zero motivation to spend any more time on this so I'm abandoning it. I hope this makes you all very happy and proud of what you achieved. Multichill (talk) 18:54, 16 April 2020 (UTC)

I don't know if you're referring to me, but no, of course it doesn't make me happy. I don't like to see useful functionality discarded, and I don't like to see anyone as angry as you obviously are. But I really don't think much that was said here was nasty or unconstructive -- I at least was trying very hard (perhaps without success) to be constructive. I'm sorry if I failed at that. —Scs (talk) 12:21, 17 April 2020 (UTC)
Perhaps we are supposed to show much woe (like in the story of Juliet and Romeo), but I'll just say "thank you" instead.
And please get it right: thanks are not for all these years, just for now. I hope that we really won't see any of these 'Ko-zyöl Shmo-zёl' or 'kooklo-wod Brat.chuk' anymore.
And seriously, thank you for seeing the reason. An impossible situation in the fscking ruwiki. Please don't get offended, it's just that we are all here to improve the quality of the data rather than to be proud of such things like a "number of edits" or "I had written a script" or whatever. Remember the CC0 at last. — Mike Novikoff 01:11, 22 April 2020 (UTC)
Suggest to desysop Multichill based on this topic. --117.136.54.124 20:51, 23 April 2020 (UTC)
If you are going to make this personal about him, maybe you should have at least the courage to identify yourself. - Jmabel (talk) 23:43, 23 April 2020 (UTC)
I suppose I should explain the link to the next topic here. As some user who seems primarily to revert other people keeps changing the section header. Above we saw a series of edits showing how users reverted the same edit by the bot several times without attempting to address the matter in a way that could allow us to expand Wikidata's base. Clearly it's an aspect of the story we need to attempt to address. --- Jura 18:51, 27 April 2020 (UTC)
MGIMO finished? — Mike Novikoff 19:29, 27 April 2020 (UTC)
@Jura1: I'm not sure what you mean by "allow us to expand Wikidata's base", and I'm not sure what you would expect those aggrieved editors to do.
The impression I get is that concerns about the quality of some of those bot imports have been raised, many times, and have been repeatedly dismissed. The consensus seems to be that what the bots are doing is fine, and that those (few?) people who are concerned about them should find something else to worry about. (More on this below.) —Scs (talk) 13:52, 1 May 2020 (UTC)
If anything, I'd support desysopping a user who had done THAT much harm to the project, who had been doing it persistently for years (just imagine where any IP or new user would be if they would have done a one hundredth of this!), and haven't even suggested to rollback these edits himself, leaving the task as an exercise for ordinary rollbackers like me. I'm sure I haven't even seen most of these yet, so there's a lot of room for a future uncivil undo summaries. :( — Mike Novikoff 19:29, 27 April 2020 (UTC)
Multichill, what did you mean at 06:28, 23 April 2020?!! — Mike Novikoff 22:44, 27 April 2020 (UTC)
@Mike Novikoff: I'm not Multichill, but based on the arguments that have been taking place here I believe I can summarize the argument for including that alias:
The ULAN aliases come from a respected database, curated by Getty. Any alias that is good enough for ULAN is good enough for Wikidata. The individuals who claim that some few of those aliases are somehow "bad" have adduced no objective proofs for their claims, or at any rate, no proofs which outweigh the demonstrated value of the ULAN aliases as a whole. It's the bot's job to make sure that any alias in ULAN is also in Wikidata, so if someone manually removes one of these aliases from Wikidata, the bot is only doing its job by putting it back.
I'm not saying I agree with this argument (I don't), but I believe that's the argument. If I have misrepresented it I'm happy to be corrected. —Scs (talk) 13:52, 1 May 2020 (UTC)
But no: the Ulan alias bot has - very unfortunately - not been abandoned at all, and persists, apparently on a weekly basis, in including as aliases anything considered as such by ULAN, like here advertisements for enamels, as here on April 16, here on April 23, and here today. I assume next time will be on May 7 ? Sapphorain (talk) 08:00, 30 April 2020 (UTC)
It seems to me we have an impasse between three factions:
  1. We have a few bot operators who interpret their mandate quite broadly: Wikidata should be a proper union of every other database we consider worthy. For a variety of reasons, it is best if our inclusion of another database's items is 100%. Trying to include 99% would mean having to define and remember and apply some definition of the 1% which we wish not to include, and that's not worth it.
  2. We have a few editors who have observed that a few pieces of data imported from those external databases are (to put it charitably) of considerably lower quality. These editors feel that they are improving Wikidata by removing that low-quality data. These editors are frustrated when the bots continue to re-import the same data. If the data in question were imported as statements, they could be deprecated in various ways, but the problem is particularly acute for aliases, which have no ranking or qualification mechanism.
  3. Finally, the majority of editors -- and this includes the more senior, respected, consensus makers -- seem to, basically, not care. Either they fully support the actions of the bots, or they figure that the bots do so much good overall that it's okay if they import a bit of chaff along with the wheat, or they feel that it's too unseemly to ask an underappreciated bot operator to do even more work by reprogramming the bot to be more selective. Furthermore, this majority is (it seems) tired of the complaints of faction #2, and wishes they would go away.
It's always dangerous to factionalize (Lord knows there's far too much of it in the world today), and I feel bad about presenting these divisions so starkly. I know I've made a lot of unwarranted assumptions, and I know what I've presented is an oversimplification, but I'm doing so to make the point that this is how the situation feels to me and, I suspect, to the editors in faction #2. Please, someone, explain how I've gotten this wrong, or even better, suggest what you think we ought to do in order to resolve this imbroglio. Should we be asking bot operators to make their bots more selective? Should we be asking the folks in #2 to settle for "good enough" and move on? Should we be asking the wikibase folks to add qualifiers or ranking mechanisms for aliases, analogous to statements? Or is this not worth talking about? —Scs (talk) 13:52, 1 May 2020 (UTC), revised 14:39, 1 May 2020 (UTC)
@Scs: To me the solution is fairly simple: if the ULAN alias isn't tagged with something more precise than vernacular, then the bot shouldn't import it, because it has a chance not to be in English (and in a number of cases it's not in English). That would considerably lower the amount of bad data being imported, since a good many of the aliases aren't language-refined. Alternatively, and I think you already mentioned it, why should the bot run every single week on every single item? If it wasn't updated in the meantime, there is no point in doing so. --Jahl de Vautban (talk) 14:55, 1 May 2020 (UTC)
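The first suggestion could be sketched as a filter that only lets through aliases carrying a precise language tag matching the target language. The record layout (dicts with `name` and an optional `language` key), the tag spellings and the function name are all assumptions for illustration; the actual ULAN export format may well differ.

```python
# Tags treated as too imprecise to trust (assumed spellings).
IMPRECISE_TAGS = {None, "", "vernacular", "undetermined"}


def importable_aliases(records, target_language="en"):
    """Return names whose language tag is precise and equals target_language."""
    result = []
    for record in records:
        tag = record.get("language")
        normalized = tag.strip().lower() if isinstance(tag, str) else tag
        if normalized in IMPRECISE_TAGS:
            continue  # untagged or merely "vernacular": do not guess the language
        if normalized == target_language:
            result.append(record["name"])
    return result
```

Aliases tagged with another precise language are simply not returned for "en" here; a fuller version could route them to the matching language field instead of discarding them.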
I never argued that the aliases weren't "good". I argued that the supposedly English-language aliases often are not things that are ever used in the English language, and I stand by that. - Jmabel (talk) 16:02, 1 May 2020 (UTC)
@Scs: Before I begin, I want to clarify that I have never seen myself as part of a faction nor do I wish to be part of one. I try to consider any situation on the merits of the arguments presented and leave room to be swayed in support of a position I originally disagreed with (or perhaps to abstain if I can't support but have been convinced to withdraw an objection). My view on this is that all datasets have a degree of error and there is a need to correct those errors and maintain the data. A bot or script is a great way to import data en masse, but we need to acknowledge that it will also speed up the importation of the errors in the original dataset.
If a bot imports a large body of data in a single run and then switches off, I can see no problem. Any errors imported with the data will be cleaned by human editors over time. However, if we have a bot that is continuing to run with the same orders over a long period of time, it will attempt to overwrite the cleaned data with the errors in the original source dataset. We need bot operators to acknowledge that there are errors in the source data they are importing and come up with a method to stop their bots or scripts from reinserting the same error. Most bot operators I have encountered acknowledge this and make attempts to alter their bot's behaviour when an error or problematic action is pointed out to them. The source of the discussion here appears to have been flagged to the bot operator several times over a period of at least two years. That suggests either the bot operator was unwilling to fix the problem, or the nature of the source dataset made it impossible to implement a fix within the script (I have no knowledge of the previous cases or actions, so I make no judgement either way).
In situations where a bot continues to reinsert bad data over cleaned data, the first action is to fix the bot. If that doesn't work, the next action is to stop the bot. When a bot stops the correction of known errors, it has to be seen as crossing the line from a beneficial tool to a detriment to the project. From Hill To Shore (talk) 16:42, 1 May 2020 (UTC)
I've re-read the discussion above and I can't square the debate with SCS's summary. The vast majority of people who have commented have objected to the error and a couple of editors have defended the bot operator's right to import source data regardless of quality. Summarising the debate into opposing camps and stating both are equal looks like a misrepresentation of the consensus. Also, I am unclear on where the third category of "the majority of editors... seem to, basically, not care" has come from. Was this based on the vast majority of editors not choosing to participate in the discussion? If we held all decisions on Wikimedia projects to the burden of participation, we would never reach a consensus. Reading a motive into silence is a logical fallacy. From Hill To Shore (talk) 17:21, 1 May 2020 (UTC)
@From Hill To Shore: Here are the facts, then: (1) users continue to complain about "bad", bot-imported data; (2) the operators of the bots in question continue to not change their bots, and to either not respond to the complaints at all, or state baldly that the bots are fine and they have no intention of changing them; (3) the bots in question continue to not get blocked by administrators. You're right, reading a motive into silence is a fallacy, but the silence here is deafening, and the fact that these bots don't get blocked does, I think, represent a de facto ruling by administrators that the bots are fine.
The conclusion I'm left with is that the users who would delete and/or complain about the bad data are indeed wasting their time. If they delete the data it gets re-inserted; if they complain about the bad data here they may get notes of agreement, but nothing changes. They have no choice, it seems, but to heave a heavy sigh and concede that Wikidata's data quality can't be brought up to their desired level in this regard.
(But I've got to remember that I, too, am wasting my time, and the time of anyone who is reading all this. So far, the only thing I've accomplished is to give the impression that I'm anti-bot, when nothing could be farther from the truth. So I'll try, again, to stop blathering on this topic.) —Scs (talk) 19:22, 1 May 2020 (UTC)
It may be worth drawing up a list of principles for bot editing (we could build on statements in the existing bot processes) and then initiate an RfC to generate a discussion. I'm thinking something along the lines of "bots and scripts are beneficial to wikidata and they allow... however, bots and scripts can sometimes cause problems such as... Bot and script operators should..." The RfC shouldn't be a vote to support or oppose but rather a discussion to talk around the merits of the principles; should we have them, are they the right ones, should they be adopted as a formal guideline or policy and should they be enforced in some way? When does a bot edit switch from being beneficial to being a burden?
As a starting point, I'd advise having further discussions under a more neutral heading. The title at the top of this section implies that bot operators as a group are causing problems, which is not going to set the right tone for a constructive discussion. The subheading is also misleading as we have moved on from talking about an individual bot to a more general discussion of managing situations where a bot operator edits against consensus. From Hill To Shore (talk) 21:20, 1 May 2020 (UTC)
I agree, and I wish someone luck with that. My earlier attempt at defining some bot principles bombed rather spectacularly, as numerous commentators seemed to feel that it treated bot operators much too harshly. —Scs (talk) 23:13, 1 May 2020 (UTC)
Well, all this talk seems in general to be leading nowhere. But in the particular case of the ULAN alias bot, Multichill has stated that he abandoned it. Could he then kindly actually do it? For real? And not just state he did it? This would at least solve this small part of the problem. Sapphorain (talk) 21:43, 1 May 2020 (UTC)
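
The reinsertion problem discussed in this thread can be sketched in code. The following is a minimal illustration, not how BotMultichillT or any real bot works: it assumes a hypothetical importer that keeps a record of aliases human editors have removed and skips them on later runs. The class name, method names, and sample aliases are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AliasImporter:
    """Hypothetical importer that remembers aliases removed by human editors."""

    # (item_id, language, alias) triples that editors have removed before
    rejected: set = field(default_factory=set)

    def record_removal(self, item_id, lang, alias):
        """Remember that an editor removed this alias from this item."""
        self.rejected.add((item_id, lang, alias))

    def plan_import(self, item_id, lang, current_aliases, source_aliases):
        """Return source aliases that are neither present nor previously removed."""
        present = set(current_aliases)
        return [
            alias
            for alias in source_aliases
            if alias not in present and (item_id, lang, alias) not in self.rejected
        ]


# Made-up sample data for Virgil (Q1398)
importer = AliasImporter()
importer.record_removal("Q1398", "en", "Virgilio")
plan = importer.plan_import(
    "Q1398", "en", ["Vergil"], ["Vergil", "Virgilio", "Publius Vergilius Maro"]
)
print(plan)  # ['Publius Vergilius Maro']
```

The removal is keyed by language, so an alias rejected as English could still be imported under another language code, which matches the complaint that non-English names were being added to the English fields.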

Introduction into R

Hello,

In the last months I have thought about how I can make clear how I create the descriptions I add. At the moment this is not clear to other users. Last year in Ulm I gave an introduction to how I do it with a spreadsheet. Since then I have learned a bit of R. I don't know much about it, so it would be great to learn something from other users who use it. I know how to do a VLOOKUP-like operation in R, how to filter data, and how to create subsets and write them to a new file. On the PAWS installation, about which you can find more information at Wikidata:PAWS, it is possible to use R. Is there somebody here who uses R to prepare data before uploading and can tell me something about it? --Hogü-456 (talk) 19:39, 28 April 2020 (UTC)

Most probably use Python or Wolfram Mathematica languages. Nevertheless I saw a couple of diagrams built with R. --Infovarius (talk) 19:34, 1 May 2020 (UTC)
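
Since Python is mentioned as the more common choice, here is a minimal sketch of the workflow described above (a VLOOKUP-like lookup, a filter, and writing a subset out) using only the Python standard library. The function name `vlookup`, the item IDs, and the description texts are placeholders made up for illustration; they are not taken from any real dataset.

```python
import csv
import io


def vlookup(rows, key_field, table, value_field, default=None):
    """Return copies of rows with value_field filled in from table[row[key_field]]."""
    return [{**row, value_field: table.get(row[key_field], default)} for row in rows]


# Made-up sample data; the IDs and texts are placeholders.
rows = [
    {"qid": "Q_EXAMPLE_1", "type": "church"},
    {"qid": "Q_EXAMPLE_2", "type": "school"},
    {"qid": "Q_EXAMPLE_3", "type": "bridge"},
]
descriptions = {"church": "church building in Ulm", "school": "school in Ulm"}

# Lookup step: attach a description to every row (None when no match exists)
merged = vlookup(rows, "type", descriptions, "description")

# Filter step: keep only the rows for which a description was found
subset = [row for row in merged if row["description"] is not None]

# Write the subset as CSV; an in-memory buffer stands in for a real file here.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["qid", "description"], extrasaction="ignore")
writer.writeheader()
writer.writerows(subset)
```

Replacing `io.StringIO()` with `open("subset.csv", "w", newline="")` would write a real file, which could then be checked by hand before any upload.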

Need help with merging

Could someone help me with merging two items? (I've never merged items, so I'm a bit hesitant.) I found two Wikidata items that both concern the Formula 1 activities of Alfa Romeo (so this is incorrect as I see it).

Therefore I think that this Wikidata item, (Q622489), called "Alfa Romeo", needs to be merged into Alfa Romeo in Formula One (Q65960697).

The last item is the newer one; however, I still think the description "Alfa Romeo in Formula One" is better and more specific, and it avoids confusion with the car brand "Alfa Romeo" (Q26921).

Kind regards Saschaporsche (talk) 07:34, 30 April 2020 (UTC)

Help:Merge explains how to merge items covering the same topic. However, the scope of the two items seems to be different (which is why they link to each other). --- Jura 07:39, 30 April 2020 (UTC)

Sorry, maybe I don't get it, but both items cover the same subject: "Formula 1 activities of the brand Alfa Romeo", so why are they separated? Kind regards Saschaporsche (talk) 07:44, 30 April 2020 (UTC)
Looking through the edit history, it does seem odd. I suppose one could conceive having several items, but it's not clear to me why the sitelinks were split merely based on the format of page titles at Wikipedia. --- Jura 07:58, 30 April 2020 (UTC)
So where do we go from here? What "sitelinks" are you talking about? (I'm only a beginner on Wikidata.) Kind regards Saschaporsche (talk) 08:28, 30 April 2020 (UTC)
Both items link to several Wikipedias (sitelinks). Maybe I should look at each page, but it appears that the difference between what is on one item and what is on the other item merely depends on whether that Wikipedia uses a naming convention for their article title that is "Alfa Romeo in Formula One" or "Alfa Romeo (Formula One)". If that is correct, then all the sitelinks should be on the same item. --- Jura 08:34, 30 April 2020 (UTC)
I see this was discussed on English Wikipedia at w:Wikipedia_talk:WikiProject_Formula_One/Archive_53#Wikidata_item_merging. It seems Eurohunter disagrees with the other parties. Stryn (talk) 15:27, 30 April 2020 (UTC)

Thanks for all your comments. I've started to do some work: I moved a couple of sitelinks to Alfa Romeo (Q622489). Now "Alfa Romeo in Formula One (Q65960697)" is empty. Can I now suggest this Wikidata item for deletion? Kind regards Saschaporsche (talk) 18:44, 30 April 2020 (UTC)

The problem with the discussion on enwiki is that it isn't really relevant for Wikidata contributors. --- Jura 19:57, 30 April 2020 (UTC)
Wikidata can have an item for the current Alfa Romeo Racing / Alfa Romeo Racing Orlen, founded in 2019, even if the English Wikipedia decides to cover it in a page with all of Alfa Romeo's historical Formula One activities. Ghouston (talk) 02:17, 1 May 2020 (UTC)
They can't be merged. As I said, there can be different articles about Alfa Romeo, such as the F1 team (even in different periods), the engine supplier, and the whole activity of Alfa Romeo in F1 (team and engine supplier). Eurohunter (talk) 16:45, 1 May 2020 (UTC)
So how can we change things? Because now we have two Wikidata items that don't quite interconnect, in my opinion. Saschaporsche (talk) 19:36, 1 May 2020 (UTC)

I would like to link Template:Error (Q5400225) to https://wikisource.org/wiki/Template:Error. However, I do not know what to enter for the language field. I thought perhaps I could use oldwikisource based on w:en:Help:Interwiki linking#Project titles and shortcuts, but was unsuccessful. How can this be accomplished? Daask (talk) 13:44, 1 May 2020 (UTC)

As far as I know it's not yet possible to link to that particular wiki. There's a ticket at phab:T138332 - Nikki (talk) 18:29, 1 May 2020 (UTC)

Bug with symmetrical properties

Hello! I have found this bug. You may help with it. Thanks! -Theklan (talk) 21:59, 6 May 2020 (UTC)

This section was archived on a request by: --- Jura 10:06, 7 May 2020 (UTC)

Special:MathWikibase

The above special page displays label, description and formula from an item,

e.g. Special:MathWikibase/Q205692 with Poisson distribution (Q205692).

Does it do anything else? Is it documented somewhere? I don't think it's on w:Help:Displaying_a_formula or mw:Extension:Math. --- Jura 12:11, 27 April 2020 (UTC)

@Jura1: I also don't know where the documentation is; I think I saw it announced on the project chat a few months ago, though I can't find it in the archives. There is at least one more feature, seen in Special:MathWikibase/Q35875: using has part(s) (P527) with quantity symbol (string) (P416) as a qualifier you can have it display components of the formula. ―Vahurzpu (talk) 13:24, 27 April 2020 (UTC)
@Vahurzpu: thanks, I don't recall that either. Interesting feature though. Looks like it's using the incorrect properties though. calculated from (P4934) and in defining formula (P7235) should be used. @Lea_Lacroix_(WMDE): would you know where it was announced and could you arrange for the configuration to be adjusted? --- Jura 13:34, 27 April 2020 (UTC)
I don't have any specific information about this page; it was not developed by the Wikidata team. I only found this ticket that may be related. Maybe @Physikerwelt: has more information? Lea Lacroix (WMDE) (talk) 14:55, 27 April 2020 (UTC)
Thanks. By testing, I found it also supports defining formula (P2534) as qualifier (in addition to quantity symbol (string) (P416)).
If a "part" property is to be used, shouldn't it have been has part(s) of the class (P2670)?
The sample given in the ticket shows that it also links to articles with used on enwiki (e.g. on Special:MathWikibase&qid=Q1899432). Cool!
@Andreg-p:: can you adjust it to use calculated from (P4934) and in defining formula (P7235) (qualifier and main statement) ? --- Jura 15:16, 27 April 2020 (UTC)

@Lea_Lacroix_(WMDE): Thank you for bringing this issue to my attention. The math extension currently uses the following properties ([2]):

Changing this behavior requires a two-step process. First, a concrete proposal needs to be established and then approved by the math community. To start this process, someone needs to volunteer to interact with the community and @Andreg-p: for a time span of a few months before this is implemented and rolled out in production. Without a fixed contact person putting work into the implementation is too frustrating, cf. phab:T208758. --Physikerwelt (talk) 10:39, 29 April 2020 (UTC)

Property:P2313 and P2312 for minimum and maximum values outside property constraints

  Notified participants of WikiProject property constraints @Ivan A. Krestinin:


Originally these two properties were created for property constraints (see Help:Property constraints portal/Range).

At some point the English labels were changed to remove "(property constraint)", but the description was not ("qualifier to define a property constraint in combination with P2302"). Some uses are now on items.

Should these be used outside property constraints, used for regular statements, or should we create an additional pair of properties? --- Jura 05:15, 28 April 2020 (UTC)

Yes, a new property is needed, just as maximum date (property constraint) (P2311) is separate from latest date (P1326).--GZWDer (talk) 10:20, 2 May 2020 (UTC)

Swiss municipalities with two coordinates

Hello! Lots of municipality of Switzerland (Q70208) items have two coordinates, making automatic transclusions break. It seems that lots of them are an import from Cebuano Wikipedia. Would it be possible to delete them, as they are redundant (and sometimes even inexact: Eschenbach (Q7092))? -Theklan (talk) 16:47, 1 May 2020 (UTC)

They violate a single-value constraint, so yes, it seems it would be not just possible, but desirable to delete the redundant ones! (One challenge, of course, is picking which one is "better".) —Scs (talk) 17:33, 1 May 2020 (UTC)
If they are imported from another source, would it be better to deprecate one? Deleting it will just lead to eventual reinsertion. From Hill To Shore (talk) 17:38, 1 May 2020 (UTC)
I think it's fine to remove them. Bots don't usually add more coordinates when an item already has some. The ones from the Cebuano Wikipedia are almost certainly caused by merges anyway. - Nikki (talk) 18:12, 1 May 2020 (UTC)
All information imported from Cebuano Wikipedia can safely be removed. I would go as far as to allow a bot removal.--Ymblanter (talk) 18:34, 1 May 2020 (UTC)
@Ymblanter: Can you clarify that statement, please? I pay a fair amount of attention to geodata, and I have observed that:
  • there are many places in the world that have an article only in cebwiki, but
  • the information in cebwiki seems to be accurate enough, because
  • it seems to have been bot-imported (into cebwiki, that is) from some amazingly comprehensive geographic database that I've never heard of and that does not appear to be represented in any of the other wikis.
So, yes, if a particular piece of cebwiki-imported data seems to be inferior to another piece of data we've got, by all means, supersede the cebwiki data with the better data by deleting the cebwiki data. But no, I would never say we should delete all cebwiki data, because much of it appears to be high-quality. —Scs (talk) 19:32, 1 May 2020 (UTC)
Disclaimer: I have no connection with cebwiki; I barely even know what language "ceb" represents.
Indeed, all information in Cebuano wiki is bot imported from elsewhere, and this is exactly the reason why we should not have it here. If we want this information, we should be importing it (presumably, by bot) from the primary sources which that bot used. Btw I have come across really wrong info on the Cebuano Wikipedia, though I will not be able to recollect now where exactly it was.--Ymblanter (talk) 19:38, 1 May 2020 (UTC)
I'm fairly sure the "amazingly comprehensive geographic database" User:Scs is talking about is GeoNames (Q830106). It's probably better to keep non-GeoNames data just because it's more likely to have been entered by a human and thus sanity-checked. Vahurzpu (talk) 20:06, 1 May 2020 (UTC)
I see. Someone created hundreds of stubs on Kazakhstan localities based on GeoNames on the English Wikipedia and did not take into account that the localities were renamed. As a result, we often have two copies of the articles (which presumably made it into Wikidata as well) and it is virtually impossible to figure out that they are the same.--Ymblanter (talk) 20:11, 1 May 2020 (UTC)
The worst data introduced from ceb are the elevation values for geographic objects; especially for mountains and hills they are often totally bogus. And to top it off, the bot which imported it here often did not even add the reference, so it is impossible to notice that the number is a wild guess. Ahoerstemeier (talk) 21:36, 1 May 2020 (UTC)
Coincidentally, just today we got an example of an item which was created in duplicate on ceb.wp and then the error propagated here, and we cannot do anything about it: Wikidata:Bureaucrats' noticeboard#permanent duplicated item (P2959).--Ymblanter (talk) 11:11, 2 May 2020 (UTC)
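
The cleanup rule that emerges from this thread (remove a cebwiki-imported coordinate only when another coordinate already exists on the item) can be sketched as follows. This is an illustration under assumptions, not an existing tool: the claim dictionaries, the `imported_from` key, the `"cebwiki"` marker, and the coordinate values are all made up for the example.

```python
def redundant_coordinates(claims, suspect_source="cebwiki"):
    """Return suspect-sourced claims that are redundant because another coordinate exists.

    Nothing is returned when the item has fewer than two coordinates, or when
    every coordinate comes from the suspect source (left for human review).
    """
    if len(claims) < 2:
        return []
    suspect = [c for c in claims if c.get("imported_from") == suspect_source]
    if len(suspect) == len(claims):
        return []
    return suspect


# Made-up coordinate claims for a single item
claims = [
    {"lat": 47.24, "lon": 8.92, "imported_from": None},
    {"lat": 47.2, "lon": 8.9, "imported_from": "cebwiki"},
]
to_remove = redundant_coordinates(claims)
print(len(to_remove))  # 1
```

The function only proposes candidates. Whether to delete or deprecate them, and how to handle items where both coordinates come from other sources, is the open question in the discussion above.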

date of first performance (P1191)

Does it apply to works in progress/unfinished works/demos? Eurohunter (talk) 23:23, 1 May 2020 (UTC)

  • I would say that it does not. Iwan.Aucamp (talk) 09:43, 2 May 2020 (UTC)
    • @Iwan.Aucamp: In a Wikipedia article we would add a sentence like "on 24 April 2020 part of the song was played on BBC Sounds during the interview" and I wonder if this data could be added to Wikidata. Eurohunter (talk) 13:14, 2 May 2020 (UTC)
      • @Eurohunter: Just my view, I won't make a fuss if you use it but IMO it would be better to have a "performed on" property or something and use that with qualification. The problem with using this property is that when the first performance of the completed work happens it will be a bit of a conflict as to which date is appropriate. Iwan.Aucamp (talk) 15:04, 2 May 2020 (UTC)

How to find which bots are scraping a specific site or database?

I would like to know if there are any bots operating on https://dblp.uni-trier.de/ - but I'm not aware of a simple and straightforward way to check this.

If each bot had an item associated with it and the bot items had a statement such as ?bot uses (P2283) dblp computer science bibliography (Q1224715), then it would be significantly easier to find which bots use what. Ideally each bot should say:

  • What it uses for sources (e.g. dblp)
  • What items it will create (e.g. instance of human)
  • What properties it will set

Maybe the right solution is to have:

  • Bot item
  • Bot task item

And then some of this goes on the Bot task item?

Maybe there is already some way to figure this out. Any input would be appreciated. Iwan.Aucamp (talk) 23:43, 1 May 2020 (UTC)

Each item in an external database should have an identifier which is imported and has its own property. What would be needed is a tool that shows which users have the most edits with this property. --SCIdude (talk) 08:32, 2 May 2020 (UTC)
For example the tool Navel Gazer which however only shows changes not creations (it says "Data derived from database dump wikidatawiki-stub-meta-history.xml" so this is not a normal SPARQL query). For DBLP the associated property would be DBLP author ID (P2456) and the user with most edits (not creations but still) would be Florian.Reitz. --SCIdude (talk) 08:49, 2 May 2020 (UTC)
While this may be a way to find the information I want in some cases there are many cases for which this would not actually yield the information I want even if everything worked. For one, in this case what you found was not actually a bot. Further if a bot goes offline and a new one comes online it will also not work to find the new one. I think the right answer is to have structured data for this, as items. It is really not that difficult and it is a lot cleaner. Iwan.Aucamp (talk) 09:38, 2 May 2020 (UTC)
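
The structured-data idea proposed in this thread (a bot item plus bot task items that state sources, created item types, and properties set) could look roughly like the sketch below. It is only a model of the proposal, not an existing Wikidata schema: the class, the function, and the bot names are hypothetical, while Q1224715, Q830106, Q5, P2456, and P625 are used merely as example values.

```python
from dataclasses import dataclass, field


@dataclass
class BotTask:
    """Hypothetical record of one bot task, mirroring the proposed 'bot task item'."""

    bot: str                                        # name of the bot running the task
    sources: set = field(default_factory=set)       # items for the databases it reads
    creates: set = field(default_factory=set)       # classes of items it may create
    properties: set = field(default_factory=set)    # properties it will set


def bots_using(tasks, source):
    """Return the sorted names of bots with at least one task using the given source."""
    return sorted({task.bot for task in tasks if source in task.sources})


# Made-up bots and tasks for illustration
tasks = [
    BotTask("ExampleAuthorBot", {"Q1224715"}, {"Q5"}, {"P2456"}),
    BotTask("ExampleGeoBot", {"Q830106"}, set(), {"P625"}),
]
print(bots_using(tasks, "Q1224715"))  # ['ExampleAuthorBot']
```

With data of this shape, the question "which bots use dblp?" becomes a simple lookup, and a replacement bot would show up as soon as its task record is added.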

Scooby-Doo duplication

https://www.wikidata.org/wiki/Q936279 and https://www.wikidata.org/wiki/Q205683 are duplicates, but when I try to merge them I either get errors or nothing happens. Any idea of what's going on?

They have separate articles on enwiki (and probably others, I didn't check).
So that's why they are (and have to be) separate entities here, too.
Q936279 is an instance of animated series (Q581714), and Q205683 is an instance of media franchise (Q196600).
Scs (talk) 14:13, 2 May 2020 (UTC)

Should I merge Macclesfield Bank (Q20050783) and Q14592080 or not?

Both look exactly the same de jure. --Liuxinyu970226 (talk) 01:00, 2 May 2020 (UTC)

@Liuxinyu970226: They don't look the same to me, what am I missing? Iwan.Aucamp (talk) 07:43, 2 May 2020 (UTC)
Perhaps the same, or perhaps 中沙大环礁 "Zhongsha Great Atoll" is just part of 中沙岛礁 "Zhongsha Island Reef". Ghouston (talk) 02:03, 3 May 2020 (UTC)

Two people conflated

Nicolaes van Bambeeck (Q4625873) has two people conflated; is there an easy way to tease them apart? They are 100 years apart: the Wikipedia article is about the earlier man, but all the positions held are about the later man. Usually conflation has just one or two values that need to be pulled apart. --RAN (talk) 19:04, 2 May 2020 (UTC)

In these situations I create a duplicate item, make sure the descriptions are disambiguated, and then selectively remove incorrect statements from each one. - PKM (talk) 19:15, 2 May 2020 (UTC)
Two items had been merged; I undid the merge and restored Nicolaas van Bambeeck (Q57151023). Peter James (talk) 22:36, 2 May 2020 (UTC)
A likely explanation is that the Commons categories for Nicolaes use the spelling "Nicolaas"; not sure if it should be changed or if that is an alternative spelling (I checked the identifiers and they don't mention it). Peter James (talk) 22:45, 2 May 2020 (UTC)

Q92453624

I am curious as to why Q92453624 was deleted. What admin rights do we need to see deleted items? --RAN (talk) 02:43, 2 May 2020 (UTC)

Unless you are a member of the Wikidata staff team, you should be an administrator. I think you need the delete right to view deleted revisions. Ahmadtalk 05:00, 2 May 2020 (UTC)
@Ahmad252:Can you clarify why visibility of deletions is restricted? Iwan.Aucamp (talk) 20:31, 3 May 2020 (UTC)
Can we know why the label was suppressed on deletion? --- Jura 07:54, 2 May 2020 (UTC)
The labels have been suppressed in the majority of deleted items for quite some time now. It's quite a pain, @Jura1:--Trade (talk) 19:02, 2 May 2020 (UTC)
@Iwan.Aucamp: Sometimes, there are legal reasons. For example, especially on wikis like Wikipedia or Commons, there are thousands of files and revisions deleted because of being copyright violations. Making them visible to the public will practically violate the copyright and can therefore lead to legal problems for the Wikimedia Foundation. I think this is of less importance on Wikidata, given that items essentially can't contain copyright violations (can they?). Other reasons can be privacy issues, biography of living person (BLP) issues, or a variety of other issues. Generally, the idea is that a page is deleted because it contained something inappropriate, so the only ones who can still see it should be trusted. The thing is that it is an all-or-nothing. To my knowledge, if you give someone the right to view deleted revisions, you will give them the full right to see all deleted revisions. There is no option to limit this access. Ahmadtalk 22:18, 3 May 2020 (UTC)
  • The entry was for a child of another entry, I was just curious what is contained, that led to the deletion, so I can avoid having entries I create deleted. A child of another human entry appears to have a structural need. I once asked if there were restrictions on how many generations of a family can be added and was told there were no restrictions. For instance there are ten generations present for most US presidents and 20 or more for noble families. I am also curious about why some Q-entries go through a consensus deletion process, and others can be deleted by a single editor with deletion rights. Also, why can't we have an expanded list of editors that can view deleted entries? Do we have an appeals process to reverse deletions? --RAN (talk) 08:44, 2 May 2020 (UTC)
It would also be nice if a message was automatically left on the creator's talk page when an entry they created is deleted. You shouldn't have to find out when you go looking for the entry and find it missing. --RAN (talk) 18:41, 2 May 2020 (UTC)

Human lexeme

Just found this lexeme, which seems not to be a lexeme at all but a human created as a lexeme. What's weird is that it's found in a query

select * {
  ?item wdt:P31 wd:Q5 .
} limit 10000

which means a lexeme can very well be found in a regular query about items. Is this a bug or something?

Apart from that, do we have a procedure to find/handle such mistakes ? author  TomT0m / talk page 16:15, 2 May 2020 (UTC)

The lexeme can be deleted as an item Jurgi Kintana Goiriena (Q57659657) already exists. Peter James (talk) 16:33, 2 May 2020 (UTC)
I have done so. —MisterSynergy (talk) 14:48, 3 May 2020 (UTC)
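
One way to spot such cases in query results is to look at the entity ID prefix, since item IDs start with "Q", property IDs with "P", and lexeme IDs with "L". The sketch below assumes results arrive as entity URIs of the form `http://www.wikidata.org/entity/<ID>`; the function names are made up, and the lexeme ID in the sample is a placeholder, not the lexeme discussed above.

```python
def entity_kind(uri):
    """Classify a Wikidata entity URI by the first letter of its ID."""
    entity_id = uri.rsplit("/", 1)[-1]
    return {"Q": "item", "P": "property", "L": "lexeme"}.get(entity_id[:1], "unknown")


def unexpected_lexemes(uris):
    """Return the URIs in a result list that point to lexemes rather than items."""
    return [uri for uri in uris if entity_kind(uri) == "lexeme"]


# Sample result list; L99999 is a placeholder lexeme ID
results = [
    "http://www.wikidata.org/entity/Q57659657",
    "http://www.wikidata.org/entity/L99999",
]
print(unexpected_lexemes(results))  # ['http://www.wikidata.org/entity/L99999']
```

Run over the results of a query like the one above, this would list lexemes that carry an "instance of human" statement so they can be reviewed by hand.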

UserWarning script

For some time now, I've been looking for an easy way to warn users. I couldn't find any UserWarning script, so I began localizing one myself. It can be found here. Given that we don't have many uw templates, I chose a rather simple script. A short documentation is available here. If you have any suggestions, please let me know. Ahmadtalk 05:40, 3 May 2020 (UTC)

  • Nice work but we also have User:Bene*/userwarn.js. Yeah, we don't have many warning templates, but we need to develop more Wikidata-centric ones. ‐‐1997kB (talk) 05:54, 3 May 2020 (UTC)
    Oh, I must've missed that. Wish I'd seen it sooner; it's a nice one (and is also translatable, an important feature for Wikidata). Thanks. Ahmadtalk 06:32, 3 May 2020 (UTC)
  • What exactly do you mean when you say that you want to warn users? If you are talking about a standardized way to send messages to new users when they make certain errors, those messages should likely be templates, given that templates can be read by users in their own language and you might not know which language a new user is fluent in when you want to send them a message.
See also https://www.wikidata.org/wiki/Wikidata:WikiProject_Welcome/Automated_Bot_messages for a potential way to list a bunch of templates for common errors. ChristianKl 07:18, 3 May 2020 (UTC)
@ChristianKl: Yes, I think that is the idea. I agree about the templates, and this tool does the same thing (the messages aren't built-in, I only specified which templates it should use). I got a list of available templates from Category:User warning templates (the list certainly needs a review, though. I will try to do it at least for some templates. Many aren't translatable, and icons differ from one template to another). I like the bot idea, but this tool is a little bit different: it also covers templates that aren't likely to be sent by a bot (e.g. warnings about vandalism, test edits etc.). It is my understanding that Wikidata:WikiProject Welcome/Automated Bot messages covers more auto-detectable issues, right? Ahmadtalk 08:04, 3 May 2020 (UTC)

Merge request

Hiya, yesterday I created an EN Wikipedia page for Bose Ogulu. I added Wikidata information but somehow managed to create a new item instead of editing the existing one. Would it be possible to merge or delete Q71976779 since Q92994095 (the new one) has much more info on it now? Thanks for any help. Mujinga (talk) 10:39, 3 May 2020 (UTC)

  • We would rather merge the more relevant information into the older item. So deletion is not necessary; Q92994095 is to be merged into Q71976779. --Wolverène (talk) 10:44, 3 May 2020 (UTC)
  Done--Ymblanter (talk) 10:46, 3 May 2020 (UTC)
Thanks to both for the fast response, of course it indeed makes sense to merge to the older one, I should have thought of that. Cheers! Mujinga (talk) 10:49, 3 May 2020 (UTC)
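
The convention mentioned in this thread (merge the newer item into the older one) can be sketched as a small helper. It assumes that a lower numeric ID means an earlier creation, which is the usual case but is stated here as an assumption; the function name is made up for illustration.

```python
def merge_direction(qid_a, qid_b):
    """Return (source, target) so that the newer item is merged into the older one.

    Assumes a lower numeric Q-ID means the item was created earlier.
    """
    older, newer = sorted((qid_a, qid_b), key=lambda qid: int(qid[1:]))
    return newer, older


source, target = merge_direction("Q71976779", "Q92994095")
print(source, "->", target)  # Q92994095 -> Q71976779
```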

are there any differences between Grace Frankland (Q88824282) & Grace Frankland (Q4794922) ?

can we consider one of them as duplicate ? Leela52452 (talk) 12:46, 3 May 2020 (UTC)

  Done @Peter_James: thanks for fixing so quickly Leela52452 (talk) 13:20, 3 May 2020 (UTC)

Does Wikidata have a specific policy on the copyright policy of the images it hosts? I couldn't find anything at Wikidata:List of policies and guidelines.

I know that, for example, Commons is stricter than Wikipedia. I'm getting the impression that Wikidata is probably closer to Commons.

(The example I came across that got me thinking about this is Leela (Q121841). She has a low-res, copyrighted, fair-use picture on Wikipedia. Here, and on Commons, there's only a fan dressed up as Leela, which seems wrong, although I suppose someone felt it was better than nothing. At Bender (Q750023) we've similarly got a fan in costume. At Mickey Mouse (Q11934) we've got a "real" image of the character, but it's an old one, claimed to be out of copyright. And at Bart Simpson (Q5480) there's no image at all.)

Not saying there's anything wrong here, just wondering if it's written down anywhere. —Scs (talk) 14:13, 3 May 2020 (UTC)

Hi Scs, Uploading images on Wikidata has been deactivated. This means that locally uploading images is not possible, and that only images that are stored on Wikimedia Commons can be used. Basically this means that Wikidata has the same policy on images and files as Wikimedia Commons.
Fair use is not allowed on Wikimedia Commons and is only allowed on some Wikipedias where the community has arranged a fair use exception with WMF. Romaine (talk) 14:30, 3 May 2020 (UTC)
@Romaine: Thanks for confirming. That's about what I figured. (I assume you meant "Wikidata has the same policy as Commons".)
Anybody know if this is written down anywhere, or is it basically the default across all Wikimedia projects, with narrow, project-specific exceptions as Romaine mentions? —Scs (talk) 14:37, 3 May 2020 (UTC)
To my knowledge, there is no text regarding image use. As User:Romaine mentioned, Wikidata does not host any files (see here). It is worth mentioning, however, that in data items, files are not really used; they are just *linked* (but displayed in the web UI for convenience). Technically one can only link files hosted at Wikimedia Commons, which in fact means that all of their files can be linked from here, and nothing else. —MisterSynergy (talk) 14:45, 3 May 2020 (UTC)
Whoops, yes. Fixed! It is a Wikimedia wide thing, something I would search for on Meta. I quickly also came across m:Non-free content + wmf:Resolution:Licensing policy - Romaine (talk) 14:47, 3 May 2020 (UTC)
@Romaine, MisterSynergy: Thanks, all. —Scs (talk) 23:40, 3 May 2020 (UTC)
From my point of view, it is not clear to infrequent users of Wikidata that the files found in items are not hosted on Wikidata. I think that the text about the license of content in Wikidata should be modified. In my view, it is not clear to everyone what structured data is and what is not. This is the sentence specifying the license of Wikidata: "All structured data from the main, Property, Lexeme, and EntitySchema namespaces is available under the Creative Commons CC0 License." It does not mention that there are properties in the main namespace that embed data, for example maps, pictures, and other media files. --Hogü-456 (talk) 17:50, 3 May 2020 (UTC)
Well, since Wikidata started embedding images on its item pages, these item pages are no longer necessarily public domain or CC-zero, but should at least be freely useable according to the licences on Commons. The underlying data properties only contain the file names and are assumed to be not under copyright (although it's unclear if this is really the case for all long file names). Ghouston (talk) 00:39, 4 May 2020 (UTC)

Q29637965

I propose that we undelete Q29637965, which was deleted after discussion archived here, even though there was little support for deletion, and a case was made that it meets our notability criteria; as is still the case Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:09, 3 May 2020 (UTC)

Property proposal: seats

Hello. After this discussion Wikidata:Project chat/Archive/2020/04#legislative election I proposed Wikidata:Property proposal/Organization#parliament seats. We need a way to add the seats a political party won in elections (for parliament, for municipal council etc.), the seats the party has in a body like parliament, or how many seats a constituency has. A user suggested that I use number of seats in assembly (P1410), but that property should not be used as a qualifier. Please read the property proposal and give your opinion there. I just need a way to add the seats each party won, either with a property we already have or with a new one. Xaris333 (talk) 22:18, 3 May 2020 (UTC)

Android smartphone model

I think that an item like Q91804224 (recently created) is undesirable, since it's just an intersection of concepts. The operating system installed on a device can be better recorded with operating system (P306). Whenever something is done in two ways, it makes it harder to write queries, since you have to check both methods. Also, it may be only a matter of time before somebody decides to subclass it further with other properties, such as brand, making "Samsung android smartphone model", etc. Items like smartphone model (Q19723451) were already unnecessary, in my opinion, since it's just the combination of smartphone (Q22645) and product model (Q10929058). @MProperLawAndOrder:. Ghouston (talk) 03:54, 25 April 2020 (UTC)

I tend to agree these items are generally undesirable for use as instance of (P31) but it shouldn't make much difference if properly subclassed. Vojtěch Dostál (talk) 09:51, 25 April 2020 (UTC)
The harm is firstly the usual problems you get with redundancy in databases: if there are two ways of expressing the same information, you don't know which is correct when they disagree. Secondly, you get an ever-growing forest of intersection items, see Commons categories as an example. Ghouston (talk) 11:17, 25 April 2020 (UTC)
Their username is @MrProperLawAndOrder:. I wouldn't have created Q91804224, Q19723451 looks like it isn't currently necessary, although it could be useful, but these are similar to how I'm using A road (Q18019452) and B road (Q89021600) in instance of (P31) which could also be undesirable for the same reason (it looks like transport network (P16) would also allow these, if there's an item for "classified road" or "numbered road"). What's more important is consistency with similar items. Peter James (talk) 13:07, 25 April 2020 (UTC)
I agree with Ghouston: such items are intersection items and should be neither created nor used. I can’t think of anything that item does that cannot be solved by a tiny bit of SPARQL. Commons 'had to' implement such categories because we had no "good way" of doing intersections.
The issue with the existence of these items is that editors will use them − why wouldn’t they: after all, if it exists, it must be for a good reason.
Jean-Fred (talk) 20:19, 25 April 2020 (UTC)
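
The "tiny bit of SPARQL" mentioned above could look roughly like the query built below. This is only a sketch: it assumes items are modelled as instance of product model (Q10929058), subclass of the product type, with the OS in operating system (P306). The helper name `intersection_query` is made up for illustration, and Q94 is assumed to be the Android item (check it before use).

```python
def intersection_query(product_class: str, operating_system: str, limit: int = 100) -> str:
    """Build a SPARQL query for product models of a given type that run a given OS.

    Hypothetical helper: it only assembles the query text and does not contact
    the query service.
    """
    return f"""SELECT ?item WHERE {{
  ?item wdt:P31 wd:Q10929058 .           # instance of: product model
  ?item wdt:P279* wd:{product_class} .   # subclass of (transitively): the product type
  ?item wdt:P306 wd:{operating_system} . # operating system
}}
LIMIT {limit}"""


# Q22645 is smartphone (from the discussion); Q94 is assumed to be Android.
query = intersection_query("Q22645", "Q94", limit=10)
print(query)
```

With this modelling, the intersection "Android smartphone model" is computed at query time, so no dedicated item is needed.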

@Ghouston, Peter James, Jean-Frédéric: I agree this should be changed, but smartphone model isn't any better. Regarding the reasoning above: The values in instanceOf and subclassOf are by nature intersections of something. That's the whole point of classification.

The reason why I created Android smartphone model was that smartphone model seemed too weakly defined. When is a mobile/cell phone smart?

What can be done? Looking at manufactured goods:

  1. automobile model (Q3231690)
    1. instance of: first-order metaclass (Q24017414) + type of manufactured good (Q22811462)
    2. subclass of: Ford SPORT Wagon 3 puertas año 1998
  2. Android smartphone model (Q91804224)
    1. subclass of: smartphone model
    2. operating system: Android
  3. smartphone model (Q19723451)
    1. instance of: first-order metaclass (Q24017414)
    2. subclass of: cell phone model, computer model
  4. cell phone model (Q19723444)
    1. instance of: first-order metaclass (Q24017414) + type of manufactured good (Q22811462)
    2. subclass of: electronic device model, tangible good, telephone model
  5. telephone model (Q41622600)
    1. instance of: first-order metaclass (Q24017414)
    2. subclass of: model (Q10929058)
  6. electronic device model (Q62008942)
    1. instance of: first-order metaclass (Q24017414)
    2. subclass of: model (Q10929058)
  7. computer model (Q55990535)
    1. instance of: first-order metaclass (Q24017414)
    2. subclass of: electronic device model (Q62008942)
  8. model (Q10929058)
    1. instance of: first-order metaclass (Q24017414)
    2. subclass of: type of manufactured good (Q22811462) !! here the value is in subclass, above it was in instance of
    3. has quality: brand - really? No model without a brand?
    4. equivalent class: http://schema.org/ProductModel

What do you think about only using model (Q10929058)? What type of model can be inferred from subclassOf and/or other properties. MrProperLawAndOrder (talk) 02:02, 26 April 2020 (UTC)

  • There are items in Wikidata to represent concepts, like microprocessor (Q5297), which is a kind of manufactured product, and then items like Motorola 68020 (Q916240) which represents a particular product of this type. That's expressed using the subclass statement (since a lot of these processors have been manufactured, not just one), and the item also declares it to be an instance of product model (Q10929058). As far as I can see, there's nothing to gain in this case by creating another item to represent "model of microprocessor", and it would be redundant. We shouldn't really need to create an item "model of X" for every product type X. I think a bigger problem is working out exactly what qualifies as a "model" in the first place: is Motorola 68020 (Q916240) really a model? Maybe in this case it is, but in other cases, there are microprocessors that come in different clock speeds and cache sizes, and there are cars that have different engine and trim options, and phones that have different variants sold in different countries: at what point is something a model series (Q811701) instead? I suppose at least at the point where the different variants have their own Wikidata items. Whether mobile phone (Q17517) and smartphone (Q22645) can be meaningfully distinguished or not is really an unrelated question, since the OS can be expressed in operating system (P306), if that helps. Creating an item "Android smartphone" would have been better than creating "Android smartphone model", so it can be used in the subclass statement, but I still think it would be an undesirable redundancy. Ghouston (talk) 03:51, 26 April 2020 (UTC)

@Ghouston: agree, additionally creation of subclasses should also be discussed. More model items:

  1. commercial weapon model (Q22809622)
  2. commercial object model (Q22809624)
  3. belt sander model (Q23811260)
  4. circular saw model (Q23811261)
  5. drill model (Q23811262)
  6. grinder model (Q23811263)
  7. hammer drill model (Q23811264)
  8. impact wrench model (Q23811265)
  9. jig saw model (Q23811266)
  10. miter saw model (Q23811267)
  11. orbital sander model (Q23811268)
  12. reciprocating saw model (Q23811269)
  13. screwdriver model (Q23811270)
  14. scooter model (Q23867828)
  15. moped model (Q23868001)

NOTE: Android smartphone model has been replaced with smartphone model on the ~120 items that used it. Before doing so, I made sure each item has a value for OS (it had none in ~25 cases, where I added Android) and is a subclass of smartphone. But I would prefer that this be changed to product model. Then people can argue about whether something is a model, a model series, or a variant of a model. In the end, if any property differs, it is a new class of products. MrProperLawAndOrder (talk) 04:31, 26 April 2020 (UTC)

User:MrProperLawAndOrder replaced a lot of "instance of electronic device model" and "instance of smartphone model" statements with just "instance of model". My impression was that it is better to sort items into a subclass if it has a narrower topic and fits the item. What is now the correct way of tagging? --D-Kuru (talk) 07:24, 2 May 2020 (UTC)

@D-Kuru, Ghouston, Jean-Frédéric, The RedBurn: First I added "subclass of X" to many of the "instance of X model" items, which were completely outside the regular classification tree built with subclass of. I only replaced instance of X model with model if the item was already a subclass of X, and did so to remove redundancy. MrProperLawAndOrder (talk) 07:37, 2 May 2020 (UTC)

instance of smartphone model

As discussed in the section above, "[specific kind of object] model" items used in instance of (P31) don't seem to add anything compared to product model (Q10929058) used in instance of (P31) + "[Specific kind of object]" item used in subclass of (P279). The latter also avoids having to create (and use) a "[specific kind of object] model" item for each and every type and subtype of object, avoids redundancy and allows a more direct link with Wikipedia. The RedBurn (ϕ) 09:54, 2 May 2020 (UTC)
Yes, although there are a few items like automobile model (Q3231690) that have Wikipedia articles, so that we are stuck with them. There are also some pointless subclasses of model series (Q811701), such as automobile model series (Q59773381) and computer model series (Q60484681). Ghouston (talk) 10:11, 2 May 2020 (UTC)

Subclass of model

Below are some of the subclasses of product model (Q10929058). SPARQL ?item wdt:P279* wd:Q10929058 https://w.wiki/PrS returned 455 until I removed "family car subclass of automobile model" [6], now it returns 151.

Electronic devices

  1. handheld game console model (Q67387549)
  2. video game console model (Q56682555)
  3. Android smartphone model (Q91804224)
  4. smartphone model (Q19723451)
  5. cell phone model (Q19723444)
  6. telephone model (Q41622600)
  7. electronic device model (Q62008942)
  8. computer model (Q55990535)
  9. model of calculator (Q19799634)
  10. smartwatch model (Q19799938)
  11. digital camera model (Q20741022)

Tools

  1. belt sander model (Q23811260)
  2. circular saw model (Q23811261)
  3. drill model (Q23811262)
  4. grinder model (Q23811263)
  5. hammer drill model (Q23811264)
  6. impact wrench model (Q23811265)
  7. jig saw model (Q23811266)
  8. miter saw model (Q23811267)
  9. orbital sander model (Q23811268)
  10. reciprocating saw model (Q23811269)
  11. screwdriver model (Q23811270)

Vehicle

  1. Q29048322 vehicle model
  2. Q18758641 watercraft class
  3. Q23867828 scooter model
  4. Q23868001 moped model
  5. Q5411847 Typhoon variant
  6. Q5531853 F-16 Fighting Falcon model
  7. Q5684964 Hunter variant
  8. Q5684968 Hurricane variant
  9. Q6665639 P-3 Orion model
  10. Q7643937 Spitfire model
  11. Q7925343 Viscount model
  12. Q15126161 prototype aircraft model
  13. Q16986328 Constellation variant
  14. ...

Other

  1. Q20888659 camera model
  2. Q22809622 commercial weapon model
  3. Q22809624 commercial object model
  4. Q29889276 model of earth-moving machine
  5. Q29982117 musical instrument model
  6. Q42314054 ammunition model

MrProperLawAndOrder (talk) 06:26, 4 May 2020 (UTC)
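
The drop from 455 to 151 results after removing a single statement follows from how a transitive path such as wdt:P279* works: every class below the removed edge disappears from the result at once. The sketch below illustrates this on a toy graph. The class names and edges are invented for the example and are not the real Wikidata data, and unlike `P279*` the helper does not count the root itself.

```python
from collections import defaultdict


def transitive_subclasses(edges, root):
    """Return every class that reaches `root` through one or more subclass-of edges.

    `edges` is an iterable of (child, parent) pairs.
    """
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)

    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:  # also protects against cycles
                seen.add(child)
                stack.append(child)
    return seen


# Toy data (hypothetical): one misplaced edge pulls in a whole subtree.
edges = [
    ("automobile model", "product model"),
    ("smartphone model", "product model"),
    ("family car", "automobile model"),  # the questionable statement
    ("family car A", "family car"),
    ("family car B", "family car"),
]

before = transitive_subclasses(edges, "product model")
after = transitive_subclasses(
    [e for e in edges if e != ("family car", "automobile model")],
    "product model",
)
print(len(before), len(after))
```

On the toy data the count falls from 5 to 2, because "family car" and everything beneath it is no longer reachable.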

There are some items that miss some sitelinks in the schema:about property. See example:

SELECT ?item {
  ?item schema:about wd:Q5084390 .  
}

Charli XCX (Q5084390) has links to 39 wikis, but the query returns 37 and misses enwiki. Am I getting something wrong about schema:about or is there a problem with these items? --MarioGom (talk) 09:42, 29 April 2020 (UTC)

Indeed, it's missing the sitelinks for the en and pt Wikipedias. It is still unclear what could have happened; I filed a Phabricator task to investigate further. DCausse (WMF) (talk) 10:23, 29 April 2020 (UTC)
There was a duplicate, Charli XCX (Q89621390), which had the two missing sitelinks (English and Portuguese). Before I merged them on 13 April the query returned both items. I have now edited Q5084390; not sure if the query service will be updated by this. Peter James (talk) 11:39, 29 April 2020 (UTC)
Thanks. I think the best approach is to do a full reload of the WDQS servers once all the cleanups related to the wb_items_per_site incident are done. In the meantime (I know that this is far from ideal), doing a null edit on the item should restore the missing sitelinks. I'll keep the linked Phabricator task updated. DCausse (WMF) (talk) 09:22, 30 April 2020 (UTC)
DCausse (WMF) Thank you for the update. It's no big deal for me. Now that I know that my queries are correct, I'll just wait until the fixes eventually propagate. --MarioGom (talk) 19:19, 30 April 2020 (UTC)
No links are missing for Q243057, but the query service says most of the links are http, instead of https; the item page has all https links. Peter James (talk) 21:31, 30 April 2020 (UTC)
I'm a bit puzzled by this one, I've checked on the test server that has been reloaded recently and they appear as https. Hopefully the reload of all the servers will fix these discrepancies. DCausse (WMF) (talk) 09:28, 4 May 2020 (UTC)
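
To find out which items need a null edit, one option is to compare the sitelinks listed on the item page with those returned by the schema:about query. The sketch below only shows that comparison on hard-coded lists. The function names and sample URLs are illustrative assumptions, and http and https forms are treated as the same link so that the scheme discrepancy mentioned above is not reported as a missing sitelink.

```python
def normalize(url: str) -> str:
    """Treat the http and https forms of a URL as the same sitelink."""
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return url


def missing_from_query_service(item_sitelinks, wdqs_sitelinks):
    """Return sitelinks present on the item but absent from the query result."""
    seen = {normalize(u) for u in wdqs_sitelinks}
    return sorted(u for u in item_sitelinks if normalize(u) not in seen)


# Illustrative sample data, not fetched from Wikidata.
item_sitelinks = [
    "https://en.wikipedia.org/wiki/Charli_XCX",
    "https://pt.wikipedia.org/wiki/Charli_XCX",
    "https://es.wikipedia.org/wiki/Charli_XCX",
]
wdqs_sitelinks = ["http://es.wikipedia.org/wiki/Charli_XCX"]

print(missing_from_query_service(item_sitelinks, wdqs_sitelinks))
```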

Other varieties of English than British and Canadian?

I've been working on New Zealand place names as part of Task force tohutō. In New Zealand English (Q44661), Māori loanwords are now generally written with the macron that indicates long vowels, but this isn't true for other varieties of English. The official New Zealand Gazetteer names of many places were recently changed to use macrons, and the Wikipedia en-NZ article-naming conventions were changed to follow suit (please don't debate this with me without reading the 33,000 word RfC which took two years to resolve...)

So we have a situation where I'd like to represent the official name (P1448) of Taupō (Q2397257) as being in NZ English. But my options are En, En-gb, and En-ca. There's no NZ English option, and incidentally no US English? Or Australian English? Very odd. Can somebody point me to the discussion that set British and Canadian as the only English variants in Wikidata; and how would I go about adding NZ English as a pop-up option? —Giantflightlessbirds (talk) 07:42, 30 April 2020 (UTC)

I don't know where en-gb and en-ca came from, they have been in Wikidata for as long as I remember. Perhaps they were just inherited from the MediaWiki software at inception. There are obviously a lot of dialects of English, see en:List of dialects of English. I suspect that allowing all these variants to be specified in Wikidata would be harmful, on balance, because it would encourage massive duplication of data (and database bloat). What happens at present is that sometimes an en label is copied to an en-gb label, pointlessly since the values are identical, but then the en-gb label tends not to get updated when the en label is changed. Couldn't we assume anyway that an official name of an item with country (P17) = New Zealand is in the NZ English dialect, so far as names even have a dialect? Ghouston (talk) 08:23, 30 April 2020 (UTC)
See also Wikidata:Project_chat/Archive/2018/09#British/Canadian_English. Ghouston (talk) 10:34, 30 April 2020 (UTC)
The problem is that New Zealand English has macrons in much of its official naming, and the macrons are likely to be removed when names are not designated as being in NZ English. It is not just a dialectal matter: the spelling includes diacritics which have no place in ordinary English. Hence, there is an imperative for the recognition of en-nz when designating Maori names with macrons in Wikidata. MargaretRDonald (talk) 10:43, 30 April 2020 (UTC)
I don't see any reason why labels of NZ place names would be changed from official NZ versions, any more than the en:Taupō article is likely to be renamed in Wikipedia. Ghouston (talk) 10:51, 30 April 2020 (UTC)
There has in fact been regular Wikidata vandalism going on with macrons being removed from names, and there was quite a bit of resistance and edit warring over renaming Wikipedia articles to the official NZ place name until the RfC was finally closed a month ago. Adding en-NZ might help with this; but I can understand the assumption that a NZ official placename at least is in NZ English. —Giantflightlessbirds (talk) 21:12, 30 April 2020 (UTC)
@Giantflightlessbirds, Ghouston: The process for adding new language codes in this context is controlled by the language committee, which has stalled or given unclear answers on adding language variants before (e.g. en-IN, en-US). Based on the existence of en-GB and en-CA it could be appropriate to add all of these, but I don't know how this would be done. I don't think there's an established process for this, since there haven't recently been any successful requests to add new variants for major languages, although phab:T180771 and phab:T195816 seem to be advancing slightly faster. Jc86035 (talk) 11:51, 30 April 2020 (UTC)
@Lea Lacroix (WMDE): Has there been any progress related to Wikidata:Identify problems with adding new languages into Wikidata? Jc86035 (talk) 11:57, 30 April 2020 (UTC)
I would say there's also a strong case for deprecating & removing en-gb & en-ca rather than adding more en-xx variants (given every other WM project manages okay without them). Maybe we need a broader RFC on this to decide which way to go, given that the current situation is a bit unsatisfactory from all directions? Andrew Gray (talk) 14:51, 1 May 2020 (UTC)
Nothing specific as far as I know. The next step that was suggested on the input page was to create a new process within the Wikidata community, but that's not something we can trigger from the development team. Lea Lacroix (WMDE) (talk) 09:24, 4 May 2020 (UTC)

@Giantflightlessbirds: I think what you've done on Taupō (Q2397257) looks exactly right. en="English" (not en-us) should respect the variant that officially administrates the place, so in fact the macrons are "normal English" for NZ places. Just like the en-Wikipedia style guide about strong national ties to a topic. So, even without en-nz, I think you can proceed comfortably with the macrons. --99of9 (talk) 00:05, 1 May 2020 (UTC)

Yes, I think it's better this way than for example having an en-nz label using a macron and an en label on the same item without. Although the macron would basically disappear in this setup, since only users who request en-nz specifically would see it, if that's the goal, to protect the rest of the English-speaking world from the macron. I don't think that's necessary or desirable, however. Ghouston (talk) 01:30, 1 May 2020 (UTC)
There's currently an en-gb label on the item, without the macron. Does it even make sense to have a specific en-gb label on a New Zealand place name? Ghouston (talk) 02:00, 1 May 2020 (UTC)
Nope, we certainly don't use en-gb! Thanks, everyone, for your help with this. —Giantflightlessbirds (talk) 04:11, 1 May 2020 (UTC)
OK, I think I have it right. name (P2561) (English): Taupo (this is not and has never been an official name); official name (P1448) (English): Taupō, as of 21 June 2019; and native label (P1705) (Maori, sic): Taupō-nui-a-Tia. Someone had put "Taupo" as a native label, so I deleted that. Incidentally, how do we go about editing the name of the Māori language, Te Reo Māori, so it's spelled correctly with a macron? It's very jarring to see the form "Maori" used everywhere in Wikidata. —Giantflightlessbirds (talk) 08:59, 1 May 2020 (UTC)
  • This depends on the context. The item Māori (Q36451) currently has the macron in the en label, but in P1705 the value is Reo Māori (Maori). I don't know where the label (Maori) comes from; perhaps it is coded in the MediaWiki software. Ghouston (talk) 10:25, 1 May 2020 (UTC)
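
The effect of a stale en-gb label discussed in this thread can be illustrated with a simplified model of language fallback. This is a toy sketch: the `FALLBACKS` table and the `display_label` function are assumptions made for the example (the real fallback chains are defined in the MediaWiki software), and the labels are the ones mentioned above for Taupō (Q2397257).

```python
# Simplified, assumed fallback chains: variant codes fall back to plain "en".
FALLBACKS = {"en-gb": ["en"], "en-ca": ["en"]}


def display_label(labels, lang):
    """Return the label shown to a user with interface language `lang`, or None."""
    for code in [lang, *FALLBACKS.get(lang, [])]:
        if code in labels:
            return labels[code]
    return None


labels = {"en": "Taupō", "en-gb": "Taupo"}

# A variant-specific label wins over the fallback, even if it is outdated.
print(display_label(labels, "en-gb"))

# Without the en-gb label, en-gb users fall back to the en label.
labels_without_gb = {"en": "Taupō"}
print(display_label(labels_without_gb, "en-gb"))
```

Under this model, removing the redundant en-gb label is enough for en-gb users to see the macron form.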

Merging "Foreign Office" and "Foreign and Commonwealth Office"?

Q358834 and Q58211956 seem to refer to the same entity. The wiki page for "Foreign and Commonwealth Office" says: "The Foreign and Commonwealth Office (FCO), commonly called the Foreign Office". Q58211956 doesn't link to any wiki pages at all. In my opinion, Q58211956 should be merged into Q358834 (which is clearly older). What do you think? Hdfan2 (talk) 11:52, 3 May 2020 (UTC)

Could not save due to an error. The save has failed.

Hello, I run into a problem when I edit Wikidata pages. The following message appears: "Could not save due to an error. The save has failed." What is the solution? --Omar Ghrida (talk) 00:46, 4 May 2020 (UTC)

Is that the only message you get, without any details? And are you still experiencing it? Sometimes errors come and go for temporary technical issues. Ahmadtalk 06:56, 4 May 2020 (UTC)
@Ahmad252: The problem has been resolved, thank you for your reply --Omar Ghrida (talk) 07:49, 4 May 2020 (UTC)

How to display the {{Wikidata Infobox}} on two different pages on Commons?

I see {{Wikidata Infobox}} displays the same on c:London and c:Category:London. But that template doesn't work the same on c:Ambigram (only c:Category:Ambigrams displays it). Why? -- Basile Morin (talk) 05:38, 4 May 2020 (UTC)

The gallery was specifying the qid manually, but didn't do it right. I fixed it. There's another method, which would involve creating a new Wikidata category item so that there could be two sitelinks to Commons, but it's probably good enough as it is. Ghouston (talk) 06:07, 4 May 2020 (UTC)
Thanks -- Basile Morin (talk) 06:30, 4 May 2020 (UTC)
I created Category:Ambigrams (Q93217145), so both the gallery and category are now sitelinked, and the infobox works without the manual QID. Thanks. Mike Peel (talk) 08:37, 4 May 2020 (UTC)

How to add google book as a reference in a better way

I was trying to add [8] as reference for height of Mahavira (Q9422), but adding the link seems inefficient. Do we have a wikidata item for google books or can we simply add WzEzXDk0v6sC and page number and rest of data gets automatically fetched like title, isbn, author, etc.

@Capankajsmilyo: Create a new item for the edition (not the book itself!) and add Google Books ID (P675) to it; see Help:Source#Books.--GZWDer (talk) 08:35, 4 May 2020 (UTC)
Thanks for the response. Google books seem like a database. Can't we automate Q creation of books just like it was done for other databases like IMDB? Capankajsmilyo (talk) 09:51, 4 May 2020 (UTC)
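
Until something automatic exists, creating the edition item can at least be scripted as QuickStatements input. The sketch below only generates the command text and does not fetch anything from Google Books, so title, author and ISBN still have to be supplied by hand. It assumes Q3331189 is "version, edition or translation" and P629 is "edition or translation of"; the function name and the placeholder title are made up for illustration.

```python
def edition_quickstatements(google_books_id, title, lang="en", work_qid=None):
    """Return QuickStatements (V1, tab-separated) commands that create an edition item.

    Hypothetical helper: the title must be supplied by the caller.
    """
    lines = [
        "CREATE",
        f'LAST\tL{lang}\t"{title}"',         # label in the given language
        "LAST\tP31\tQ3331189",               # instance of: version, edition or translation (assumed QID)
        f'LAST\tP675\t"{google_books_id}"',  # Google Books ID
    ]
    if work_qid:
        lines.append(f"LAST\tP629\t{work_qid}")  # edition or translation of: the work item
    return "\n".join(lines)


# The Google Books ID comes from the question above; the title is a placeholder.
print(edition_quickstatements("WzEzXDk0v6sC", "PLACEHOLDER TITLE"))
```

The resulting edition item can then be used as the value of "stated in" in the reference, together with a page number.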

suggest step to update orcid reference available on caltech library

I want to update "stated in" and "caltech library". What else should be updated in the reference? Is updating the above correct, or should I update just "reference URL"? The ORCID is given on the Caltech thesis page. Leela52452 (talk) 13:52, 4 May 2020 (UTC)

Public genealogical data

Hello,

I'm a member of my local genealogical society and I'm wondering: is the Wikidata community interested in the genealogical data of ordinary people?

If so, I can suggest to my local society that they transfer their genealogical data to Wikidata. We have the birth and death dates of tens of thousands of people.

Where I live, genealogical data on people born 100 years ago or more is in the public domain.

We keep this data in a huge computerized database, but there is certainly a way to retrieve it. Is there a file format or an API that would allow us to easily upload it to Wikidata?

Thank you,

--Milano-2018-10-16 (talk) 20:26, 28 April 2020 (UTC)

Hi Milano-2018-10-16, thanks for your nice idea. Before uploading huge dataset, be sure to check the following:
  • Data input into Wikidata should be notable enough. I am not sure that "ordinary people" are notable enough. Personally, I would be interested in and in favor of a genealogy dataset, but others might object. I've read the notability guideline and discovered that the genealogy topic hasn't yet been settled. Personally, I have always wondered why there hasn't been a genealogy Wikipedia. I'm curious what others will say.
  • New data should first be checked (some of the people might already be in Wikidata) and reconciled with existing objects (for instance, "Pierre-Paul JEAN" should be linked to Pierre-Paul (Q20727006) and Jean (Q12657412), and so on). Bouzinac (talk) 20:52, 28 April 2020 (UTC)
@Milano-2018-10-16: How many entries will it have? Please provide a link to the dataset so the community can evaluate the data.--GZWDer (talk) 22:05, 28 April 2020 (UTC)
+1. (Is it a French-speaking one, as induced by your mother tongue?) Nomen ad hoc (talk) 22:07, 28 April 2020 (UTC).
And does the dataset have any sources? (An ordinary GEDCOM file should not be imported to Wikidata, as it is usually unsourced; but if the dataset is peer reviewed, it may be OK.)--GZWDer (talk) 22:17, 28 April 2020 (UTC)
@Bouzinac, GZWDer: Thank you for your quick answers!
The database I'm thinking of is only available by subscription, but the data from 1620 to 1920 is in the public domain. My idea is to make this portion more easily accessible by transferring it to Wikidata. But, before trying to convince the board of directors of my genealogy association, I want to make sure that the Wikidatians are interested.
I'm referring to the Connolly file, which is an index of Catholic and Protestant baptisms, marriages and burials coming mainly from Quebec. The file contains more than 6,000,000 entries. There are some duplications and inaccuracies in the file, but the data remains on the whole fairly reliable. You are right that the file was compiled by francophones like me, but the data (names of persons and dates) is not in any particular language. The database contains no explicit reference to other sources. However, once a date and place are provided, it is possible to infer which handwritten register the data comes from. To be honest, though, I am no longer certain that the data included in this database meets Wikidata's standards. You can find more info on the Connolly file here.
That said, there are other high quality databases. For example, it would be great to incorporate into Wikidata the free portion of the PRDH database, which is little known but of excellent quality (it is what you might call "peer reviewed"). You can access an English version of the interface here. Please note that scans of the original Quebec registers are, for the most part, made available online by the national library (here).
A few years ago, I suggested to a Wikimedia Canada volunteer that the foundation create WikiGenealogy, a Wikimedia sister project dedicated to genealogy. There was a lot of excitement about the idea, but it seems that no one was able to make it happen. At the time, I was told that the first step might be to record enough genealogical data in Wikidata to lay the foundation for a WikiGenealogy.
So far, I observe that the same datasets are transcribed separately by several associations and genealogists. Why not federate all efforts around a single open platform like Wikidata?
Can I suggest that you organize a vote or create a reference page that would give guidelines in relation to the genealogical data of lesser-known people on Wikidata? If I put together the substance of what has been said here, we would already have a few guidelines to vote on:
  • Data must be from a peer-reviewed data set or be referenced.
  • New data concerning people already in Wikidata must be incorporated (avoid generating duplicates with what is already in Wikidata).
  • New data should be reconciled with objects (thus, first names, surnames, etc., are linked).
  • No genealogical data on people born less than 100 years ago, except for people of notoriety who have already made this information publicly known or whose information has been published in a public source such as a catalogue of authority notices.
--Milano-2018-10-16 (talk) 00:20, 29 April 2020 (UTC)
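
As a rough illustration, the proposed guidelines could be applied as a pre-import triage step like the one sketched below. Everything here is hypothetical: the `Record` fields, the `triage` function and the sample rows are invented for the example, the exception for notable people born less than 100 years ago is not modelled, and the age check works on whole years only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    name: str
    birth_year: int
    source: Optional[str]                 # reference for the data, None if unsourced
    existing_qid: Optional[str] = None    # set when reconciliation found a match


def triage(records, today_year=None, min_age=100):
    """Split records into (create, update, skip) following the proposed guidelines."""
    today_year = today_year or date.today().year
    create, update, skip = [], [], []
    for r in records:
        if r.source is None or today_year - r.birth_year < min_age:
            skip.append(r)      # unsourced, or born less than 100 years ago
        elif r.existing_qid:
            update.append(r)    # already in Wikidata: merge instead of duplicating
        else:
            create.append(r)
    return create, update, skip


# Invented sample rows; "Q_EXISTING" is a placeholder, not a real item.
sample = [
    Record("Person A", 1850, "parish register"),
    Record("Person B", 1900, "parish register", existing_qid="Q_EXISTING"),
    Record("Person C", 1950, "parish register"),
    Record("Person D", 1800, None),
]
create, update, skip = triage(sample, today_year=2020)
print([r.name for r in create], [r.name for r in update], [r.name for r in skip])
```

Reconciliation of first names and surnames with existing items would be a separate step and is not shown.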
  • I think one of the worries we had when we were deciding about uploading all the people in Findagrave was that we were worried about how to disambiguate all the ordinary people, and if adding in several hundred John Smiths would make it difficult to find the John Smith that most people are looking for. I am already having problems with all the ORCID entries of scientists and disambiguating them. But I am sure we can work something out. For the ORCID people we described them as ORCID=143357 or something like that, so we know they are minimally described people, that may be duplicated in the database. --RAN (talk) 03:01, 29 April 2020 (UTC)
  • @Milano-2018-10-16: I think putting this data on the internet would be great. While wikidata may not be the right place for it you could consider WikiTree. Iwan.Aucamp (talk) 07:17, 29 April 2020 (UTC)
It looks like there have been attempts at "Wikidata-genealogy", but they do not seem to be active: https://tools.wmflabs.org/genealogy/wiki/Joseph_Don_Carlos_Young_(1855-1938) . A nice FAQ can be read here: https://www.wikidata.org/wiki/Help:FAQ/Genealogy Bouzinac (talk) 08:01, 29 April 2020 (UTC)
A general overview of Wikimedia and genealogy is at meta:Wikimedia Genealogy Project, WikiProject Genealogy (Q19817878). —Sam Wilson 08:51, 29 April 2020 (UTC)
My 2 cents: I think that specialized genealogy sites are better suited to large genealogy datasets than Wikidata is. Those sites will have a built-in standardized way of prominently displaying the person's birth/death dates and relatives, so you can see at a glance who you are talking about. Whereas with Wikidata the only "identify-at-a-glance" info is in the description, and that has to be done by hand and isn't consistent by any means or always used. There are already too many confusing entries for humans (don't get me started on how scrappy the huge import from "The Peerage" is) and too many duplicates. Again, specialized genealogy sites are likely to have tools for finding duplicates, and for linking with relatives, etc. So, I think the best thing for Wikidata is to only have selected people but make sure to link to resources like Wikitree for exploring their whole family. Thanks to the OP for the idea, though! I am sure we are greatly lacking in information from Quebec in many ways, and the dataset would be useful to Wikidata whether it's hosted here or somewhere else. Levana Taylor (talk) 13:40, 29 April 2020 (UTC)

Hi, thanks to all of you for your research, which allows us to take the discussion further!

I can see that I am not the first nor the last to suggest the creation of a genealogy project on Wikimedia. But reading the answers leads me to ask myself: wouldn't it be possible to use a pre-existing site such as WikiTree as a basis for a project hosted by Wikimedia?

I wonder how projects like Wikidata and Wikispecies got started. Is there a known way to repeat a similar process with a future WikiGenealogy? Who is capable of carrying out such a project?

For sure, it is always possible to list pre-existing external projects, but creating a dedicated and federating Wikimedia project still seems relevant. --Milano-2018-10-16 (talk) 20:17, 29 April 2020 (UTC)

Note 1. Wikitree does not allow mass imports and 2. Data in Wikitree is not in free licenses.--GZWDer (talk) 04:03, 30 April 2020 (UTC)
  • A specialized genealogy database is likely to work better than MediaWiki software as suggested by User:Levana Taylor, but this doesn't remove the complaint that no existing genealogy database seems to be freely licensed. The Wiki model (of anyone can edit without registration) may not be a good match for a site that's constructed from bulk import of data sources such as old censuses or graveyards. Allowing unchecked random edits would corrupt the data, but checking every edit may be too much work. Ghouston (talk) 02:32, 1 May 2020 (UTC)
  • There is the Incubator wiki for new projects, mostly used for new language versions. Here is a link [9]. I don't know what the strategy of the Wikimedia Foundation is, or whether it is possible to get such a project hosted by the Wikimedia Foundation. I don't know much about genealogy data, and I think it is better kept in a database other than Wikidata. A database for that topic that is licensed under a free license is a good idea. --Hogü-456 (talk) 16:00, 1 May 2020 (UTC)
Apparently, there has been an attempt at a wiki-genealogy: https://meta.wikimedia.org/wiki/WikiTree Bouzinac (talk) 20:48, 1 May 2020 (UTC)
  • I believe that importing high quality curated datasets is good. Specific software for genealogy has some advantages but it also has disadvantages. Genealogy software only stores people. It doesn't store anything which those people interacted with. Wikidata can also store information like books a person has written, patents they have been granted and companies the person worked in. ChristianKl 11:21, 4 May 2020 (UTC)

Cypriot municipal elections

Hello. Every 5 years there are Cypriot municipal elections (Q64918845). On election day, each municipality holds one mayoral election (voters vote for persons) and one municipal council election (voters vote for political parties and persons). There are 39 municipalities. I tried to create a structure for Cypriot municipal elections. In my examples, I used only two municipalities (Limassol Municipality (Q28870916) and Nicosia Municipality (Q56037497)) and two different elections (2011, 2016). I am not sure about the structure, and I don't want to apply it to all 39 municipalities and all elections without being sure. Please tell me your opinion. If you know a good example from another country, please let me know. The structure is:

 
 
 
{{{Cypriot municipal elections (Q64918845) }}}
 
 
 

}}

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

}}

 
 
 
 

{{{Cypriot Mayors Elections (Q92282917) }}}

 
 
 
 

{{{Cypriot Municipal Councils Elections (Q92282921) }}}

}}

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

}}

 

{{{Mayor of Limassol Municipality Elections (Q92282907)}}}

 

{{{Mayor of Nicosia Municipality Elections (Q92312582)}}}

 

{{{Municipal Council of Limassol Municipality Elections (Q92282909)}}}

 

{{{Municipal Council of Nicosia Municipality Elections (Q92313829)}}}

}}

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

}}

{{{2011 Mayor of Limassol Municipality Elections (Q92306574)}}}{{{2016 Mayor of Limassol Municipality Elections (Q92284592)}}}{{{2011 Mayor of Nicosia Municipality Elections (Q92312564)}}}{{{2016 Mayor of Nicosia Municipality Elections (Q92312746)}}}{{{2011 Municipal Council of Limassol Municipality Elections (Q92308465)}}}{{{2016 Municipal Council of Limassol Municipality Elections (Q92287822)}}}{{{2011 Municipal Council of Nicosia Municipality Elections (Q92313321)}}}{{{2016 Municipal Council of Nicosia Municipality Elections (Q92314215)}}}

}}


 
 
 
  • Cypriot municipal elections (Q64918845)
      • 2011 Cypriot municipal elections (Q28035577)
          • 2011 Cypriot Mayors Elections (Q92320908)
              • 2011 Mayor of Limassol Municipality Elections (Q92306574)
              • 2011 Mayor of Nicosia Municipality Elections (Q92312564)
          • 2011 Cypriot Municipal Councils Elections (Q92321112)
              • 2011 Municipal Council of Limassol Municipality Elections (Q92308465)
              • 2011 Municipal Council of Nicosia Municipality Elections (Q92313321)
      • 2016 Cypriot municipal elections (Q64995666)
          • 2016 Cypriot Mayors Elections (Q92320983)
              • 2016 Mayor of Limassol Municipality Elections (Q92284592)
              • 2016 Mayor of Nicosia Municipality Elections (Q92312746)
          • 2016 Cypriot Municipal Councils Elections (Q92321117)
              • 2016 Municipal Council of Limassol Municipality Elections (Q92287822)
              • 2016 Municipal Council of Nicosia Municipality Elections (Q92314215)


 
 
 
  • Cypriot municipal elections (Q64918845)
      • Limassol Municipality municipal elections (Q92282911)
          • Mayor of Limassol Municipality Elections (Q92282907)
              • 2011 Mayor of Limassol Municipality Elections (Q92306574)
              • 2016 Mayor of Limassol Municipality Elections (Q92284592)
          • Municipal Council of Limassol Municipality Elections (Q92282909)
              • 2011 Municipal Council of Limassol Municipality Elections (Q92308465)
              • 2016 Municipal Council of Limassol Municipality Elections (Q92287822)
      • Nicosia Municipality municipal elections (Q92312633)
          • Mayor of Nicosia Municipality Elections (Q92312582)
              • 2011 Mayor of Nicosia Municipality Elections (Q92312564)
              • 2016 Mayor of Nicosia Municipality Elections (Q92312746)
          • Municipal Council of Nicosia Municipality Elections (Q92313829)
              • 2011 Municipal Council of Nicosia Municipality Elections (Q92313321)
              • 2016 Municipal Council of Nicosia Municipality Elections (Q92314215)

Xaris333 (talk) 22:44, 28 April 2020 (UTC)

No. Xaris333 (talk) 13:17, 29 April 2020 (UTC)
The structure of the trees looks good to me but you have to be careful what property you use. Items in the lowest row of the tree should be connected with the second lowest row by instance of (P31). For all other connections use subclass of (P279). Nowhere should part of (P361) be used. --Pasleim (talk) 15:11, 29 April 2020 (UTC)
Thank you. I corrected all the items. Xaris333 (talk) 16:54, 29 April 2020 (UTC)
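
The rule Pasleim describes (instance of (P31) for the lowest row, subclass of (P279) everywhere else) can be sketched roughly as follows. This is only an illustration: the nested-dict tree and the function name `statements` are hypothetical, nothing here talks to the Wikidata API, and the sketch assumes that every item without children really belongs to the lowest row.

```python
# Hypothetical in-memory tree: item ID -> dict of child items.
# Items in the lowest row (individual elections) have an empty dict.
TREE = {
    "Q64918845": {                # Cypriot municipal elections
        "Q92282917": {            # Cypriot Mayors Elections
            "Q92282907": {        # Mayor of Limassol Municipality Elections
                "Q92306574": {},  # 2011 edition
                "Q92284592": {},  # 2016 edition
            },
        },
    },
}


def statements(tree):
    """Yield (child, property, parent) triples for every edge of the tree.

    Edges into the lowest row use P31 (instance of); all other edges use
    P279 (subclass of). P361 (part of) is never emitted.
    """
    for parent, children in tree.items():
        for child, grandchildren in children.items():
            prop = "P279" if grandchildren else "P31"
            yield (child, prop, parent)
        # Descend one level and repeat for the children.
        yield from statements(children)


for child, prop, parent in statements(TREE):
    print(child, prop, parent)
```

If the tree is only partially filled in, an intermediate class without children would wrongly receive P31, so the output would still need a manual check before anything is written to items.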

What is the ruling of Wikimedians themselves having entries in Wikidata?

What is the ruling on Wikimedians themselves having entries in Wikidata? Do they really exist and can they be described with reliable sources? What is the current status of that? --RAN (talk) 02:18, 2 May 2020 (UTC)

Essay on principles of automated editing

Recently we have had a few discussions where there have been disputes between users of automated processes (bots, scripts or tools) and other users. Often these disputes seem to be partially caused by confusion over the responsibilities and expectations of different users. For example, we have had some cases where an editor has complained that a bot has made a mistake but the bot operator has replied to say that the bot has correctly imported the source information. There has also been a suggestion that we are forming into factions over those who support all actions by users of automated tools, those that challenge the edits of users of automated tools and those who ignore the problem.
I've put together a userspace essay to explain some of the principles of using automated processes and the methods to resolve disputes. The aim here is not to place blame on anyone (which would increase the perceived split into factions) but to explain some common sense standards of behaviour and how the existing processes can be used to resolve problems.
Does this seem useful and is the content set at the right level? From Hill To Shore (talk) 18:20, 2 May 2020 (UTC)

It looks good to me. Thanks for writing. —Scs (talk) 11:00, 3 May 2020 (UTC)
  • I haven't read it in detail, but it doesn't seem to encourage contributors to use more tools instead of manually making the same edit again and again (e.g. multiple reverts of the same bot edit).
A problem we encounter is that non-contributors to Wikidata seem to be offended that Wikidata is edited by bots. Maybe it should be made clear that data users' interests are considered, but are generally not a factor in determining which tool should be used.
Besides, I don't think any Wikidata contributor has a duty to fix any error introduced by somebody else. Obviously, it's a good and helpful thing to do. --- Jura 11:42, 3 May 2020 (UTC)
If you feel that "duty" is too strong a word then I will be happy to replace it. I'm not sure of your other points though, especially with the caveat that you haven't read the page in detail. Your point about editors constantly reverting the same bot edit (and by implication, a bot operator constantly reverting a manual edit) is covered by the guide - we must direct those users into a discussion to find a consensus and then enforce that consensus if either side continues to edit war. Secondly, I make the point quite clearly that automated edits make a positive contribution to Wikidata and note the advantages of automated processes. However, this is not intended as promotional material for one type of editing over another. The key purpose of my essay is to clarify the misunderstandings between the two groups that you note in your comment and I don't think straying away from a neutral presentation of facts and process will be conducive to either audience.
However, if you want to write your own tool promotion essay then I will be happy to link to it with a "see also" section at the bottom of my essay. We don't have to give out all messages on a single page after all. From Hill To Shore (talk) 15:59, 3 May 2020 (UTC)
I feel like we need some idea of what kind of error rates we are willing to tolerate when using automated tools. We can't be right all of the time and doing all editing manually isn't feasible (and still will result in errors). Realistically the acceptable error rate is a function of what kind of information is being imported. BrokenSegue (talk) 06:13, 4 May 2020 (UTC)

According to Asimov's laws, robots may never do harm to humans. I would consider mass edits by bots above a minimal error rate to be psychological harm to people (editors here who try to fix errors, users of Wikidata expecting correct data, etc.): bots (their operators) should be responsible for cleaning up such errors. --Herzi Pinki (talk) 19:19, 4 May 2020 (UTC)

What happened to automatic data updation based on infobox

When I was here a few years back, a bot was running which fetched details of father, mother, date of birth, etc. from enwiki and added them to the respective Qs. Another bot was running which added data to the other end. By other end I mean father on the son's page and vice versa, based on either Q. Both these activities seem to have stopped. Is there any new policy or something? Capankajsmilyo (talk) 07:08, 4 May 2020 (UTC)

@Capankajsmilyo: 1. Importing such data to Wikidata is usually not a fully automatic process, but is performed with tools like HarvestTemplates or a similar script in Pywikibot. Some properties (dates) are imported by different users occasionally. I do not recommend importing fathers and mothers in such a way, to avoid introducing errors. 2. Data are fetched (transcluded) from Wikidata to Wikipedia. They should not be mass copied by bot. Instead, infoboxes should be adapted so that the Wikidata value may be shown.--GZWDer (talk) 08:43, 4 May 2020 (UTC)
Thanks again for the response. I am not talking about data from Wikidata to Wikipedia but vice versa. Wikipedia communities seem to have a strong feeling against transclusion of Wikidata. However, doing the opposite seems a better option: whatever users edit on Wikipedia is updated on Wikidata (with some quality checks if data already exist in Wikidata). I'm not sure it's the same, but some TemplateData thing existed on Wikipedias before. To elaborate, suppose a new actor / politician / person page is created on Wikipedia, and his/her data gets created on Wikidata too. The editors update the page with details like date of birth, hometown, place of birth, etc., but that data doesn't get updated on Wikidata. Capankajsmilyo (talk) 09:59, 4 May 2020 (UTC)
I'm not sure about biographic infoboxes, but on Commons, Template:Infobox artwork (Q6064255) can still be used to populate missing properties (artist, date, collection, etc.) on respective artwork items, provided appropriate templates or text-strings are in the template. -Animalparty (talk) 21:01, 4 May 2020 (UTC)

Hello! Could someone help me decide if Gregory of Nazianzus (Q25690037) is a duplicate of Gregory of Nazianzus (Q44011) or of Gregory of Nazianzus the Elder (Q935447)? I don't speak the language tag, and because it is transliterated I don't know how to translate it. Thanks. --Jahl de Vautban (talk) 10:06, 4 May 2020 (UTC)

Jahl de Vautban: According to the image links, it was a duplicate of Gregory of Nazianzus (Q44011). Esteban16 (talk) 16:56, 4 May 2020 (UTC)
@Esteban16:, yes, but the dates seemed to point more toward Gregory of Nazianzus the Elder (Q935447), that's why I was confused. Anyway, it's done, and it shouldn't be too hard to relink in case it's wrong. Thanks! --Jahl de Vautban (talk) 17:08, 4 May 2020 (UTC)

Wikidata weekly summary #414

Issue on Commons template drawing from Wikidata

commons:Commons:Village pump#More poorly curated WD content injected everywhere. Someone will probably want to follow up. - Jmabel (talk) 15:40, 4 May 2020 (UTC)

[11][12][13], and this was not even vandalism. Well, I do not particularly care about the Commons community discussions at the moment, but it is hard not to accept that he has a point.--Ymblanter (talk) 15:52, 4 May 2020 (UTC)
Did you mean to include mystery-meat links to revert pages? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:48, 4 May 2020 (UTC)
Yes, indeed, thanks. Will correct now.--Ymblanter (talk) 18:53, 4 May 2020 (UTC)

How to model graves?

A discussion was started in German about how to model graves. Wikidata:Forum#Wie werden Gräber modelliert? As the contributions are only by two people we both wish more involvement here to get a modelling guideline for graves. At the moment there are at least 3 ways to model a grave:

Any application accessing graves needs to know the modelling scheme; without such a scheme it is difficult to query all the graves with SPARQL. Forces:

  • If a grave is used by more than one notable person, modelling an extra object to use it as place of burial (P119) for all the persons buried there, seems to be a good idea.
  • if a grave is also considered to be a monument (e.g. Taj Mahal (Q9141)), it seems to be a good idea to model it separately.
  • Splitting a person from its place of burial (P119) information might create the need to access two items instead of just one, which has performance implications (# expensive calls).
  • transition issues to a common scheme

This is just one particular scheme question. How, in general, should we deal with such modelling issues? best --Herzi Pinki (talk) 09:50, 4 May 2020 (UTC)

I'm not sure if the item Hugo von Hofmannsthal's grave (Q91013436) was created solely because of the category on Commons that predates it, but not all graves are equal, and most don't warrant a separate item. Grant's Tomb (Q1025105) is notable. Arlington National Cemetery (Q216344) is notable, and contains many notable people, but not every headstone warrants a separate item (even if someone on Commons makes a category like Category:headstones of my great-great granddad photographed in 1998 by an orphan named Sven). All humans have heads, and most have arms, but we don't have an item for Barack Obama's head or William Shakespeare's left arm. For most people whose grave is known, I think place of burial (P119) should simply be the cemetery (or park, mausoleum, etc.), with relevant structured data with respect to the specific grave simply added as a qualifier to P119. -Animalparty (talk) 04:54, 5 May 2020 (UTC)
The problem is the definition of "Ort" (place). A "Wohnort" (place of residence) can be a city/village, but also a house. Usually we should go to the lowest possible item in the system. If it is a single grave, use the grave. If it is a graveyard/cemetery/church/whatever, use this. If it is just a city/village/place, go for this. -- Marcus Cyron (talk) 10:41, 5 May 2020 (UTC)
I'm not sure if there is a need for an item about Hugo von Hofmannsthal's grave (Q91013436). For the ones I created at Wikidata:Lists/cemetery/Ireland/Glasnevin_Cemetery#location, I thought having items was useful. OTOH, for many, I don't think we even have items about the person buried there. --- Jura 08:58, 5 May 2020 (UTC)
I was thinking about modelling the graves of honor in Vienna cemeteries (>1000 graves of honor) to find those with missing images of the gravestone e.g. for the next competition. Also to get lists of notable persons with missing items. I was looking for some guideline more than for individual advice and for transition from one model to the other. I would have modelled the location with the person and as qualifiers to the cemetery in place of burial (P119), while @D-Kuru: preferred the more atomic approach (give everything a WD-item like you would give everything an IP-address in the Internet of Things). At the moment I'm stuck in confusion. --Herzi Pinki (talk) 10:27, 5 May 2020 (UTC)
As a sample, this is what is available for Père Lachaise Cemetery (Q311). Maybe I can dig up the App, if you want to adapt it. --- Jura 10:42, 5 May 2020 (UTC)

Property to list elections' results on an area's item?

Hi! Are there properties so that, on an item for an area (a town, state, etc.), results for various elections within it can be listed? If not, this is useful data (that is pretty hard to find sometimes in the case of old elections) which could be used to get sets to make forecasts more easily. Does such a property exist/should it? DemonDays64 | Talk to me 01:50, 5 May 2020 (UTC) (please ping on reply)

  • You mean state governor or US president election results on every single town item? The data seems a bit too granular for Wikidata. Maybe Commons? --- Jura 06:08, 5 May 2020 (UTC)

statistic request

I was wondering whether I could monitor the number of changes to a single property in a certain context (let's say a country (P17)) and in a timeframe, by individual users? E.g. to get an idea of how many images (image (P18)) have been added by each of them. Eventually to assign credits in competitions. best --Herzi Pinki (talk) 10:32, 5 May 2020 (UTC)

Items for countries are getting too large

My work with using Wikidata information in Commons infoboxes has run into several problems recently because including information from country items causes time-outs. Looking at France (Q142) as an example, it's now 1.6MB of data - which seems excessive. Looking through the item doesn't point to a single cause - there are multiple properties with many values that are adding up to cause issues. I don't have any solutions to suggest here, I'm just flagging the issue - does anyone have any thoughts on how to change things here? Thanks. Mike Peel (talk) 20:26, 24 April 2020 (UTC)

Could we move some of the statistics to a new item? For example, create item:Life expectancy of France, move all the data on life expectancy (P2250) to the new item and then link life expectancy (P2250) to the new item. From Hill To Shore (talk) 20:31, 24 April 2020 (UTC)
  • I suppose it's a defect of the Wikidata API, that you are needing to retrieve every statement which has the particular country on the left hand side, when you probably only want a subset. At the same time, there's no way in the API to retrieve statements where the country appears on the right hand side, where it's more obvious that it wouldn't always be possible to return all matches in a single response. Ghouston (talk) 04:12, 25 April 2020 (UTC)
From Wikidata:Request_a_query#How_many_triples_for_Q30, it appears that most triples on country items come from quantity properties, item properties and (surprisingly) sitelinks. Monolingual strings don't seem to have much of an impact. This even if the approach there slightly overestimates quantity properties and references aren't fully factored in. --- Jura 05:41, 1 May 2020 (UTC)
@Mike Peel: Have you reported this problem in Phabricator? It would be good to have some engineers look at it from the infrastructure side as well. Kaldari (talk) 20:09, 5 May 2020 (UTC)
@Kaldari: I haven't posted it on phabricator, as I can't point to a single cause (it seems to be a cumulative effect), or otherwise figure out a good way to report it. If you can think of a good way to report it, please go ahead and I'll comment where I can. Also, it's not as high priority as other issues I've posted on phabricator that are still waiting for a response (e.g., phab:T232927)... Thanks. Mike Peel (talk) 20:34, 5 May 2020 (UTC)
@Mike Peel: Do you have an API query that is reliably failing? Kaldari (talk) 21:01, 5 May 2020 (UTC)

>7,000,000 people

Wikidata now surpassed 7 million items about people (7,135,860 as of now [15]). I wonder if there is a way to break this down into groups where they come from. I imagine:

  • people in Wikipedia (initial group, continuous additions)
  • Peerage (ca. 600,000 Oct 2019)
  • Orcid
  • authors of scholarly articles (ca. ?)
  • etc.

Maybe this has already been done. --- Jura 09:27, 27 April 2020 (UTC)

A search for the most frequent occupations (in WDQS) finds:
  1. politician (Q82955) 607249
  2. researcher (Q1650915) 490418
  3. association football player (Q937857) 260810
  4. writer (Q36180) 246723
  5. actor (Q33999) 222087
  6. painter (Q1028181) 146924
  7. journalist (Q1930187) 118402
  8. university teacher (Q1622272) 103804
which seems a pretty healthy mix. Though a lot of people are missing that occupation (P106) statement too. ArthurPSmith (talk) 14:49, 27 April 2020 (UTC)

Property creator icon

Hello everybody,

I think the current property creator icon isn't very descriptive of the user right, so I have gone ahead and created one (File:Wikidata property creator.svg).

The P in it stands for Property and the barcode is the Morse code of the + sign, so it is a descriptive form of P+, which can be easily recognised as Property creator. I have also created two other versions: [16] and [17], but from a short discussion with fellow admins over IRC the choice is the one I uploaded to Commons.

As this is a big change, I would like to know what the community thinks of it. Regards. ‐‐1997kB (talk) 07:35, 28 April 2020 (UTC)

  • I like the idea of a new icon as the old one is pretty ambiguous. I'm unsure about your proposal as the only way you'd visually know it's for a property creator would be knowing the meaning of the barcode. So I suppose it depends whether the icon is intended to be meaningful without prior context or just to act as a visual identifier. In terms of visual/artistic feedback I also find the very full square appearance of the barcode visually unbalances the icon (and a more minor thing, but I like having the blue in there somewhere). I like your thinking though and nice work, for the balance thing perhaps experiment with clipping the barcode into the shape of a + or a letter C? --SilentSpike (talk) 10:41, 28 April 2020 (UTC)
@SilentSpike: Well.. the idea of barcode representing + icon is from Wikidata logo in which barcode represents word WIKI, so I think it's as meaningful as site logo. In terms of artistic feedback I tried to keep it as simple as possible.
But with your feedback I have also tried something different:
  • PC with wikidata logo morse code [18]
  • P+ with wikidata logo morse code [19]
  • PC where C filled with morse of C [20]
  • P+ where + filled with morse of + [21]
IMO following two of above are most simple and meaningful: File:Wikidata property creator.svg and PC with wikidata logo morse code [22]. ‐‐1997kB (talk) 13:22, 28 April 2020 (UTC)
@1997kB: I'd agree with your two picks. I also had a thought that really as an international project the symbol should avoid English letters (P can be seen as a symbol for the property ID, but C is really for the English word "creator"). So I mocked up:
  • P with a magic wand [23]
as a more symbolic design idea. --SilentSpike (talk) 15:57, 28 April 2020 (UTC)
@SilentSpike: I like the idea of having a magic wand there, but I do not think a magic wand is something that can be associated with creators. Also, the more complex the shape gets, the harder it will be to use in topicons and userboxes, since complex shapes are hard to recognize when the size is reduced. ‐‐1997kB (talk) 03:29, 29 April 2020 (UTC)
Also I have tried adding blue in File:Wikidata property creator.svg. See [24] where P is filled with morse of wikidata logo. ‐‐1997kB (talk) 03:52, 29 April 2020 (UTC)
────────────────────────────────────────────────────────────────────────────────────────────────────
Perhaps your original design is best purely for following the KISS principle. Though I do feel like we could be missing a trick by not having something more immediately symbolic of creation than a barcode that needs to be decrypted into the + sign. It somewhat goes against the nature of an icon. --SilentSpike (talk) 09:24, 29 April 2020 (UTC)
P in darkred (color change 8b0000) with the last 7 bars of the logo and a little + above (darkred). —Eihel (talk) 12:17, 29 April 2020 (UTC)
Yeah, IMO File:Wikidata property creator.svg and [25] are the best ones for now. But let's keep this thread open for at least a week and in the meantime I will also try some other designs. ‐‐1997kB (talk) 15:26, 29 April 2020 (UTC)

Elections with one candidate

Election with one candidate. So no voting took place and the person was elected. How can I show that in Wikidata? 2006 Mayor of Kato Polemidia Municipality Elections (Q93160832). Xaris333 (talk) 01:05, 4 May 2020 (UTC)

This is a so called tacit election. You could add instance of (P31)=tacit election (Q1760295) --Pasleim (talk) 08:33, 5 May 2020 (UTC)
@Pasleim: why not uncontested election (Q85811908)? Are they different thing? Xaris333 (talk) 16:44, 5 May 2020 (UTC)
uncontested election (Q85811908) is slightly more general than tacit election (Q1760295). In an uncontested election (Q85811908) it is possible that voting still takes place even though there is only one candidate. In some election systems, even if there is only one candidate, they can fail because a minimum voter turnout is required. In a tacit election (Q1760295) no voting at all takes place. --Pasleim (talk) 16:54, 5 May 2020 (UTC)
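
The distinction drawn above can be sketched as a small decision rule. This is only an illustration; the function name and its parameters are hypothetical, and it simply picks the more specific class when no voting took place.

```python
TACIT_ELECTION = "Q1760295"        # no voting at all takes place
UNCONTESTED_ELECTION = "Q85811908"  # one candidate, voting may still happen


def single_candidate_class(candidate_count, voting_took_place):
    """Suggest a value for instance of (P31) for a one-candidate election.

    Returns None when the election had more than one candidate, because
    neither class applies in that case.
    """
    if candidate_count != 1:
        return None
    if voting_took_place:
        return UNCONTESTED_ELECTION
    # No ballot was held, so the more specific class fits.
    return TACIT_ELECTION


# The 2006 Kato Polemidia mayoral election: one candidate, no voting.
print(single_candidate_class(1, voting_took_place=False))
```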

Differentiate between part of constellation and/or asterism

 
Constellation and Asterism Draco

The constellation Draco contains all stars within the yellow borders, while the asterism Draco consists only of the stars connected by the green lines.

The star HD 139357 (Q2044186) is part of the constellation Draco (Q8675), but not part of the asterism, while Thuban (Q15714) is part of both the constellation and the asterism.

Describing a star as being part of a constellation seems to be expressed with HD 139357 (Q2044186) constellation (P59) Draco (Q8675), and a constellation having stars with Draco (Q8675) has part(s) (P527) HD 139357 (Q2044186). The constellation can be described with instance of (P31) constellation (Q8928).

How best to describe the difference between an asterism and a constellation? Should there be an asterism Draco and a constellation? Is there a property to make a star part of an asterism?

Looking forward to reading your ideas. Regards, Ogmios (Tratsch) 11:40, 5 May 2020 (UTC)

For instance, Big Dipper (Q10460) has part(s) (P527) Alpha Ursae Majoris (Q13084), Alpha Ursae Majoris (Q13084) part of (P361) Big Dipper (Q10460). Ghuron (talk) 18:03, 5 May 2020 (UTC)

How long it takes to a change to be synced?

Hi, I have changed a value in Wikidata about an hour ago (mobile tagline in a company's page) and it still doesn't seem to be synced. what can I do?

Hi, you can purge the page by adding ?action=purge to the end of its URL, and your changes should appear. Modeum (talk) 18:17, 5 May 2020 (UTC)
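
As a rough sketch of the suggestion above, the purge URL can be built with Python's standard library. The helper name `purge_url` and the example URL are illustrative assumptions; the only thing taken from the advice is the `action=purge` query parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def purge_url(page_url):
    """Return page_url with action=purge set in its query string."""
    parts = urlsplit(page_url)
    # Keep existing parameters, but drop any earlier "action" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "action"
    ]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Illustrative URL only; use the page on which the stale value is displayed.
print(purge_url("https://www.wikidata.org/wiki/Q142"))
```

Opening the resulting URL in a browser normally leads to a confirmation step before the page is actually purged.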

P410: military rank

I'd like to propose changing this property a bit to include ranks of other services, e.g. police, border guard, gendarmerie etc. Right now military or police rank is described as military rank in English and has Wikidata item of this property (P1629) military rank (Q56019) only. I think it should be renamed to service rank with an alias military rank. Inclusion of police rank (Q19476593) was proposed in 2018 by Andreasmperu with no response; in 2019 [26] Lord Yeager added police rank (Q19476593) as allowed value type. Wostr (talk) 20:00, 5 May 2020 (UTC)

Could an admin please look at this property proposal

Hi all

Could an admin please look at Wikidata:Property proposal/Included in curricula, it has been open for almost a month with 11 supports and only one oppose (the two opposes at the top just wanted the inverse property which has now been done).

Thanks very much

--John Cummings (talk) 21:55, 5 May 2020 (UTC)

Mix'n'Match disappointment

I set up Mix'n'Match catalogue 3536 today. It has 24.5K public bodies in the United Kingdom, but only ~700 were matched automatically. I'm wondering whether I could have done something to improve the matches, or whether the issue is that too many of the organisations are not marked up as such in Wikidata? I've only checked a small sample, but Antrim Borough Council (Q16970896), for example, is said to be an "instance of administrative territorial entity of Northern Ireland". Note, though that it failed to match the exact string "Bromsgrove District Council" to the identically-labelled (in English) Bromsgrove District Council (Q73072550), which is, indirectly, an instance of a subclass of organisation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:06, 5 May 2020 (UTC)

Please, merge

Please, merge Q4293437 and Milicz (Q11781234). 217.117.125.72 08:56, 6 May 2020 (UTC)

Discussion about ISIL (P791) and formatter URL (P1630)

I have started a discussion about removing all currently deprecated values for property ISIL (P791) listed at property formatter URL (P1630). Details can be found at this link, please join if you're interested. --Sannita - not just another it.wiki sysop 10:30, 6 May 2020 (UTC)

Wikidata:Requests for permissions/CheckUser#BRPever

In accordance with the policy, here is a notice of my candidacy for CheckUser.-BRP ever 01:26, 7 May 2020 (UTC)

The Peerage and gender

I'm doing some work on fixing misgendering and stumbled upon a huge amount of items that, apparently, have incorrect sex or gender (P21) sourced from The Peerage person ID (P4638). For example, the source states that Sir Joseph Fuller (Q75931107) is a female (F) and uses the pronoun "she", but also that Fuller is "son of" and appointed "Knight Grand Cross". This is highly unlikely. There are other cases where The Peerage states that a male took a married family name from a female. Here's a query to find suspicious cases:

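# Items with a Peerage ID whose recorded gender conflicts with their given name
SELECT ?item
WHERE {
  ?item wdt:P4638 ?thepeerageid .
  {
    # recorded as male, but the given name is a female given name
    ?item wdt:P21 wd:Q6581097 .
    ?item wdt:P735/wdt:P31 wd:Q11879590 .
  }
  UNION
  {
    # recorded as female, but the given name is a male given name
    ?item wdt:P21 wd:Q6581072 .
    ?item wdt:P735/wdt:P31 wd:Q12308941 .
  }
}
LIMIT 100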

How should we handle this? --MarioGom (talk) 23:16, 30 April 2020 (UTC)

Fix it! The Peerage has no shortage of errors, and the mass import has caused no shortage of headaches. Yay! -Animalparty (talk) 23:47, 30 April 2020 (UTC)
I fixed some low-hanging fruit, but this is going to require some semi-automated heuristics with quite a lot of refinement. I'm going to move on to other tasks at the moment, but here's a better query to detect possible errors:
SELECT ?item
WHERE {
  hint:Query hint:optimizer "None" .
  ?item wdt:P4638 ?thepeerageid .
  {
    ?item wdt:P21 wd:Q6581097 .
    ?item wdt:P735/wdt:P31 wd:Q11879590 .
  }
  UNION
  {
    ?item wdt:P21 wd:Q6581072 .
    ?item wdt:P735/wdt:P31 wd:Q12308941 .
  }
  
  # honorific prefix: Sir
  #?item wdt:P511 wd:Q209690 .
  
  ?item wdt:P735 ?given .
  FILTER (?given NOT IN (
    wd:Q16652258, # Joan (female)
    wd:Q1484457, # Joan (unisex)
    wd:Q18001597 # Christian (male)
  ))
  
  ?item p:P21 ?sexstatement .
  ?sexstatement prov:wasDerivedFrom ?sexref .
  ?sexref pr:P248 wd:Q21401824 .
  
}
LIMIT 100
and here's a couple of examples of more detailed references I used to denote the deviation: Lord Esmé Gordon (Q75281890), Sir Thomas George Wilson (Q76216976). --MarioGom (talk) 10:17, 1 May 2020 (UTC)
  • Does the Peerage website correct errors that we point out in a reasonable time frame? It would be much easier to have them correct errors as we detect them, rather than have two values here, with the incorrect one deprecated. We used to have a contact at VIAF that corrected errors within a month, but that appears to have stopped last year. --RAN (talk) 02:22, 2 May 2020 (UTC)
Weeeelll ... here's what the introductory page at The Peerage, written by Darryl Lundy, says. "The site is the result of around 17 years of work by one (somewhat eccentric) person collating information on the British Peers (and some European royals), and then entering it into a range of various genealogy programs... NOTE: this site is a work in progress, due to new reference sources becoming available for these families as well as new births, deaths and marriages. It is possible a few errors have crept in, so please pay attention to the credibility of each of the citations given when evaluating the quality and accuracy of this data. Your help in finding, reporting and fixing any errors is hugely appreciated. I hope you enjoy the information I have collected and presented here. Please contact me via email darryl@thepeerage.com with any corrections or updates you might have. I will do my very best to continue to expand and evolve this site on a regular basis." Levana Taylor (talk) 03:29, 2 May 2020 (UTC)
I've said it before, and I'll say it again: The Peerage, impressive though it is, is literally the work of one random guy. It should not be taken as anything but an amateur project (just like Wikidata, but less readily changeable). Errors, discrepancies, and redundancies abound; some are inherent to historical compilation, others are simple errors of omission or fact. The fact that it's easily accessible online doesn't make it an authoritative source. Many of the apparent mis-genderings appear to be reciprocal, e.g. the "female" Donald Finlay (Q75910034) is married to the "male" Isabel Heathfield Eliott (Q75910031), which suggests a simple coding error, and which might make bot-assisted corrections easier. Unfortunately, the misgenderings also affect values of mother (P25) and father (P22), which would also need to be swapped for all children of misgendered items.  – The preceding unsigned comment was added by Animalparty (talk • contribs) at 04:11, 2 May 2020 (UTC).
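The reciprocal pattern described above could be detected mechanically before any bot-assisted correction. The following is only a minimal sketch: the record layout (`recorded`, `expected`, `spouse`) is hypothetical, and it assumes an "expected" gender is already available from some given-name list, which is itself an imperfect signal (see the Joan and Christian exclusions in the query above).

```python
def find_reciprocal_swaps(people):
    """Return sorted spouse pairs whose recorded genders look swapped.

    `people` maps an item ID to a dict with hypothetical keys:
      "recorded": gender stated by the source
      "expected": gender suggested by the given name
      "spouse":   item ID of the spouse, or None
    """
    pairs = set()
    for pid, rec in people.items():
        spouse_id = rec.get("spouse")
        spouse = people.get(spouse_id)
        if spouse is None:
            continue
        # Reciprocal swap: each partner carries the gender expected for the other.
        if (rec["recorded"] != rec["expected"]
                and spouse["recorded"] != spouse["expected"]
                and rec["recorded"] == spouse["expected"]
                and spouse["recorded"] == rec["expected"]):
            pairs.add(tuple(sorted((pid, spouse_id))))
    return sorted(pairs)


# The pair from the comment above, plus one unaffected placeholder record.
sample = {
    "Q75910034": {"recorded": "female", "expected": "male", "spouse": "Q75910031"},
    "Q75910031": {"recorded": "male", "expected": "female", "spouse": "Q75910034"},
    "X1": {"recorded": "male", "expected": "male", "spouse": None},
}
swapped = find_reciprocal_swaps(sample)
```

Any pair flagged this way would still need the mother (P25) and father (P22) values of the children checked, as noted above; the sketch only finds candidates.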
Why don't we accumulate all errors here: Wikidata:WikiProject Authority control/The Peerage errors and let Darryl@thepeerage know that we store them all in one place. We should also keep a query of all the entries with duplicate values, like we do at the VIAF error page. I looked over my old emails and Darryl responded the same day and made corrections I pointed out to him. I just emailed Darryl and sent him the link to the new error page. Can we migrate the gender errors detected there? No data set is error-free; what matters is whether the errors can be corrected or not. Also note that we are creating errors on our end by incorrectly merging his data to the wrong person using the Mix-N-Match program. I have been correcting people who died before they were born, and people over 120 years old, errors that we caused by merging people of the same name, but the wrong people. --RAN (talk) 17:10, 2 May 2020 (UTC)
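The two symptoms of a bad merge mentioned above (death before birth, age over 120) are simple to check once birth and death dates are in hand. This is a rough sketch in plain Python; the function name is made up for illustration and it assumes full dates, whereas real Wikidata dates often have only year precision.

```python
from datetime import date

MAX_PLAUSIBLE_AGE_YEARS = 120  # threshold taken from the comment above


def lifespan_problems(birth, death):
    """Return a list of plausibility problems for one person's dates."""
    problems = []
    if birth is None or death is None:
        return problems  # nothing to compare
    if death < birth:
        problems.append("died before birth")
    else:
        # Completed years of age at death.
        age = death.year - birth.year - (
            (death.month, death.day) < (birth.month, birth.day)
        )
        if age > MAX_PLAUSIBLE_AGE_YEARS:
            problems.append("older than 120 years")
    return problems
```

An item that trips either check is a candidate for a wrong merge rather than proof of one, so the results would still need manual review.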
Hmmm... I've corrected dozens of errors without keeping track of them. Is there some way of searching my edit history for items with Peerage IDs? (Not that I think going back and hunting for this stuff is a particularly good use of time, but if it was easy, I might do it.) Levana Taylor (talk) 19:34, 2 May 2020 (UTC)
I have added a section with the latest version of the query: Wikidata:WikiProject Authority control/The Peerage errors#Incorrect gender. --MarioGom (talk) 09:31, 7 May 2020 (UTC)

Wikidata user growth is increasing

In May 2016 we had 7882 active editors and in April 2018 we had 9734, which is roughly 10% growth per year. In March 2020 we had 14163 active editors, which suggests we grew about 20% per year over the last two years, double the growth rate of the years before.

I'm very happy that we succeeded in improving on the metric I consider the most important for Wikidata. We still have plenty of issues on Wikidata, but having more people contributing to our project means more people fixing errors. ChristianKl 10:03, 5 May 2020 (UTC)
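The growth figures above can be checked with a compound annual growth rate. This is only a sketch of the arithmetic: the editor counts are the ones quoted in the post, while the 23-month spans are an assumption obtained by counting from May 2016 to April 2018 and from April 2018 to March 2020.

```python
def annualized_growth(start_count, end_count, months):
    """Compound annual growth rate between two monthly snapshots."""
    return (end_count / start_count) ** (12 / months) - 1


# Counts quoted in the post; month spans are assumed from the dates given.
rate_2016_2018 = annualized_growth(7882, 9734, months=23)
rate_2018_2020 = annualized_growth(9734, 14163, months=23)
print(f"{rate_2016_2018:.1%} per year, then {rate_2018_2020:.1%} per year")
```

Under these assumptions the two rates come out slightly above the rounded 10% and 20% in the post, and the second is a little less than double the first.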

  • If I understand the metric correctly, it includes users who moved or deleted pages at Wikipedia. Do we know how the users who actually edited Wikidata evolved? --- Jura 10:15, 5 May 2020 (UTC)
  • It's not a perfect measurement. It includes users that move or delete more than 5 pages per month. Given relatively steady user counts at the individual Wikipedias, it would surprise me if there were suddenly hundreds more of those. ChristianKl 14:41, 5 May 2020 (UTC)
    • Given that the absolute number is probably overestimated, the increase would be even larger if edits at Wikipedia are stable. --- Jura 10:09, 7 May 2020 (UTC)

Mainpage:news

Hi. wikidata:news is full of Wikidata milestone news and shows no interesting data for those who visit main page. Please remove it from main page. دوستدار ایران بزرگ (talk) 08:29, 6 May 2020 (UTC)

You can use many things. I propose Wikidata:Showcase_items, or newly created properties introduced in Wikidata:Status updates, or Wikidata:Map data. دوستدار ایران بزرگ (talk) 18:50, 6 May 2020 (UTC)
Yeah, it's pretty un-newsworthy. There's really no special status at all to being the x millionth piece of data, (but maybe it makes the robot who probably created it happy). It would be nice to see how people in the real world are actually using any of the data. Has Wikidata had any transformative effects on society? Is it increasing the pace of medicinal research and drug discovery? Has it powered the spread or translation of knowledge into new communities? If so, it would be nice to know. It'd be nice to see some of the fruits of the labor of us bots and humans, and would probably help persuade more people to donate or improve data. -Animalparty (talk) 02:51, 7 May 2020 (UTC)

Office and office holder

Is it possible to have one item for the office and another item for the office holder, even if one of them has no article in the Wikipedias, or do they have to be merged into one? I see that there are items for minister and ministry, but, for instance, only for praetor and not for praetorship. I think they are different because the office holder refers to the people in office and the office to the office itself. --Romulanus (talk) 10:08, 6 May 2020 (UTC)

For other uses, I suppose you could create a property for "office of the praetor", if there are some statements that could be made about it. In general, I'd just use "praetor".
It doesn't really matter if Wikipedia has articles about any of them. --- Jura 11:17, 6 May 2020 (UTC)
I'm not arguing about the value to be used with position held (P39), but about whether both items can exist on their own. The merges Jahl de Vautban mentions are what led me to ask. I think they're not the same. --Romulanus (talk) 11:55, 6 May 2020 (UTC)
I'm the one who split one into "decemviri" and "decemvir" ;) It's a bit like council of ministers and minister.
Q663395 (Roman magistrature) and Q20778343 (Roman magistrate) are different as both are classes. I'd keep them separate as well. --- Jura 12:02, 6 May 2020 (UTC)
@Jura1:, I think I more or less get it for the decemviri and the decemvir. To put it in other words, the body of the decemviri is composed of ten decemvir, just like a body of tresuiri is composed of three triumvir, correct? If that's the case, I agree that my merge should be undone.
However, I fail to understand the difference between a magistrature and a magistrate. Could you clarify it? --Jahl de Vautban (talk) 12:29, 6 May 2020 (UTC)
Maybe "the government" and "high-ranking government official" could compare? magistrature could link magistrat with has part(s) of the class (P2670). --- Jura 06:35, 7 May 2020 (UTC)
@Jura1: well, I've tried to think about it but I still can't understand what would be the justification of having "Roman magistratures" and "Roman magistrates", even when I compare with modern day countries. Anyway, I'm in minority here and I won't fight over it. @Romulanus: you can rollback my merges. --Jahl de Vautban (talk) 08:09, 7 May 2020 (UTC)
Both would be in singular. --- Jura 10:05, 7 May 2020 (UTC)

Mediawiki Infobox gene

Hi!

On a computer running a MediaWiki installation that's not part of the WMF project cluster, I've been trying to use v:Module:Infobox gene so that the information displayed in each gene box on, e.g., Wikipedia or Wikiversity shows up, but it doesn't work. There are statements in w:Module:Infobox gene that call the information from Wikidata. How should these be modified, if possible, to call the data from Wikidata? --Marshallsumter (talk) 04:21, 7 May 2020 (UTC)

Indicating uni and department as employer?

Is there an inverse of of (P642) that can be used when stating an employer? Most academics that list an employer just list the university, but the department would also be very useful. Is there a 'more specifically' qualifier to use, as in: person has employer = university ABC, qualifier 'more specifically' = department of XYZ? (E.g. Axel Zeitler (Q60531603) has the employer=University of Cambridge (Q35794), more specifically=Cambridge University Department of Chemical Engineering and Biotechnology (Q5025585)). Any thoughts? T.Shafee(evo&evo) (talk) 02:00, 6 May 2020 (UTC)

You're not employed by your department, you're employed by the larger organization. But perhaps affiliation (P1416) would be appropriate here, either as a top-level statement or as a qualifier on the employer (P108) statement. —Scs (talk) 12:41, 6 May 2020 (UTC)
@Scs: Good idea. Currently affiliation (P1416) can't be used as a qualifier of employer (P108), but it might be a logical location. T.Shafee(evo&evo) (talk) 05:12, 7 May 2020 (UTC)
It looks like part of (P361) is currently used for that, with over 7000 instances, e.g., Jan Harm Tuntler (Q21552450). Another alternative would be member of (P463), if there's a desire to change it. Ghouston (talk) 00:20, 8 May 2020 (UTC)

Should labels of WikiProjects include "Wikipedia:" or not?

Recently @Sawol:'s edit on WikiProject Korea (Q8503515) raised this issue: for many WikiProjects, the labels (at least for "en") omit the "Wikipedia:" namespace prefix. Should they keep omitting it, or should the "Wikipedia:" prefix be restored? --Liuxinyu970226 (talk) 06:47, 6 May 2020 (UTC)

Needs help from users who usually edit those WikiProjects' items: @Harej, Bjung, Ricordisamoa, WhisperToMe: --Liuxinyu970226 (talk) 06:49, 6 May 2020 (UTC)
@Sawol: Pardon?! --Liuxinyu970226 (talk) 07:00, 6 May 2020 (UTC)
@Sawol: Please do not re-add "Wikipedia:" without consensus here. --Liuxinyu970226 (talk) 02:20, 7 May 2020 (UTC)

Those items should not use the namespace prefix "Wikipedia:", as labels are given per language, not per project and there may be non-Wikipedia WikiProjects linked to the item. ---MisterSynergy (talk) 07:17, 6 May 2020 (UTC)

In MediaWiki the default name of that namespace is "Project:". I guess that works? 62 etc (talk) 08:41, 6 May 2020 (UTC)
I would keep the prefix, depending on the language label, if it is meant to have interwiki links to project pages. Otherwise, it gets confusing with projects that have article space articles. See WikiProject Women in Red (Q23875215) and Women in Red (Q43653733). --MarioGom (talk) 08:45, 6 May 2020 (UTC)
@MarioGom: I'm afraid that saying articles as "WikiProjects" are logically wrong, maybe the description should be modified, suggest to say "a movement that..." --Liuxinyu970226 (talk) 02:20, 7 May 2020 (UTC)
@Liuxinyu970226: I can't make any sense of that last thing you said. I can't even parse it. Could you reword? - Jmabel (talk) 15:38, 7 May 2020 (UTC)

Images for countries

User:Victor Knox raises an interesting problem: should we have image (P18) for countries and, if yes, what type? I see several alternatives, let's discuss (and probably vote):

  1. Location map (but we have specific location map (P1943))
  2. Topographic map
  3. Satellite image
  4. The best-known sight
  5. Collage of different sights (but we have montage image (P2716))
  6. Some other variant?

Personally I would prefer 2 or 3. --Infovarius (talk) 22:13, 6 May 2020 (UTC)

IMHO satellite image. strakhov (talk) 04:05, 7 May 2020 (UTC)
Satellite image won't show borders, and for landlocked countries will just be a rectangle of terrain. Or am I missing something? - Jmabel (talk) 15:42, 7 May 2020 (UTC)

Verifiability and notability

As a followup to Wikidata:Administrators' noticeboard#Please restore Q73707267 and Q92453624: it's beyond time that we have a serious conversation about clarifying the second (and arguably third) criteria of Wikidata:Notability, especially in connection with living persons' items. We also need to draw a line on what sources are acceptable; too often I see items created by folks on themselves with nothing but e.g. Instagram links. We need to ensure we send the message that we are not an indiscriminate collection of information, and that "serious and publicly available" does not just mean someone's own social media, or even their biography in an "About Us" page.

I'm open to suggestions, but we need clarity, and pretty urgently, given how many of these items are created each day.--Jasper Deng (talk) 00:46, 4 May 2020 (UTC)

In my opinion, items about entrepreneurs should be deleted if the companies themselves aren't notable. Or if the item doesn't mention any companies at all. --Trade (talk) 01:17, 4 May 2020 (UTC)
Specific guidelines like that could be useful but I'm more looking for general rules applicable to anything, not just entrepreneurs. This is especially important as many outsiders see WD as a place for SEO, given its use by Google.--Jasper Deng (talk) 04:09, 4 May 2020 (UTC)
"aren't notable" - but each company, no matter how small it is, may be mentioned or described in numerous primary sources, and sometimes in aggregations thereof. There is a Chinese website listing most of the companies (more than 200 million) registered in China (it is an aggregation source, which does not indicate any notability in the Wikipedia sense), along with their managers and stakeholders. This means basically all Chinese companies can have an item. I am personally not opposed to such an idea, as some other projects (such as OpenStreetMap) allow them (and we may make Wikidata data useful in OSM), but I am afraid that most users do not want Wikidata to be a yellow pages directory.--GZWDer (talk) 08:22, 4 May 2020 (UTC)
I don't mind WD becoming a Yellow Pages. What I do mind, however, are non-communicative SEO users dumping hordes of terribly made self-promoting items, constantly recreating them through an army of sock puppets and various IP addresses while expecting WD to clean up after them. These people are not just annoying, they are actively malicious. @GZWDer: --Trade (talk) 11:15, 4 May 2020 (UTC)
Also, although I oppose the exclusion of primary sources (they are still serious public data, see Wikidata:Requests for comment/Handle genealogical information; and some kinds of subjects, like paintings, are mainly described in primary sources), in the extreme, allowing them means we can create items for more than 200 million dead people in the United States (from various US censuses), plus 300 million living ones (mentioned in at least one public record - one may even argue that we can import all such records to Wikisource and link them in Wikidata, though this has not yet happened).--GZWDer (talk) 08:32, 4 May 2020 (UTC)
Primary sources cannot be the only source of information on living people; if e.g. one living person is needed as a parent of another, notable one, then we can apply criterion 3 instead, though ideally, we should only use secondary sources.--Jasper Deng (talk) 08:49, 4 May 2020 (UTC)
But this may still mean we can create items for all companies appearing in primary sources (or even yellow pages, which may be secondary) and their managers and stakeholders (for private companies there will only be few ones) will be notable per #3.--GZWDer (talk) 09:01, 4 May 2020 (UTC)
In principle, we can require for living people that all statements are cited to reliable sources, to start with, but this still has two problems: (i) what sources are reliable - for example, the English Wikipedia community recently decided that the Daily Mail is not reliable; should we follow suit? Are databases reliable sources? (ii) if we only keep the statements which are referenced by reliable sources, and the number of these statements is greater than zero - is the item notable? I would argue it is still not necessarily notable (at the end of the day, a telephone book is a reliable source and contains all people, or at least all people with a telephone), and notability according to criterion 2 should be something like: one should be able to create a project page (for example, a Wikipedia article in any language) which still conforms to the policies of the project. But I do not have the slightest idea how to formulate this.--Ymblanter (talk) 10:52, 4 May 2020 (UTC)
"one should be able to create a project page" - this will exclude many things such as scientific articles and paintings. Even limited to people (or living ones), should we include individuals described in a larger article like w:Family_of_Barack_Obama#Malia_Obama_and_Sasha_Obama or w:Lupton family? and which level of "description" is needed for #2?--GZWDer (talk) 11:01, 4 May 2020 (UTC)
I think these are arguably notable according to criterion 3, Structural need.--Ymblanter (talk) 11:29, 4 May 2020 (UTC)
  • One of the reasons I wanted a Wikidata entry as a Wikimedian is that despite my images being released under a creative commons license, or even ones I loaded as public domain, I am still contacted by the organization that wants to publish them, and asked to sign a release, or give permission in an email. See for example this image. I don't want my entry removed based on the malice, whim, or bias, of an individual editor with deletion rights. Read how Isaac Newton deleted his rival Robert Hooke. --RAN (talk) 16:44, 4 May 2020 (UTC)
  • Do we want to discuss the status of Wikidata:Verifiability also? It has been barely touched in years; it would need some significant changes to match current practice here I think... ArthurPSmith (talk) 17:34, 4 May 2020 (UTC)
    • @ArthurPSmith: We do need to revisit that too, sorry for not making it clear from my initial comment. This is the primary reason why some projects like the English Wikipedia are distrustful of our data, and it is the primary reason why notability and living persons cannot be enforced as well as they could be. Also, for those of you who want to cry "WD:UCS": UCS is useful only in cases where consensus is obvious and clear, which is quite clearly not the case here.--Jasper Deng (talk) 20:04, 4 May 2020 (UTC)
      • Not only verifiability should be revisited, but also every issue related to content, e.g. NPOV (m:Neutral point of view says some have it and some do not, how should it be properly settled in Wikidata?), original research (Wikidata de facto allows some original research, such as mass adding main subject (P921)), and also deletion policy (should we set up some speedy deletion criteria and require discussion for deletions outside such criteria?), etc.--GZWDer (talk) 00:04, 5 May 2020 (UTC)
        • I am not sure how you could have NPOV on Wikidata. On a Wikipedia it is fairly straight forward; you consider the balance of sourced material and ensure text is phrased to put more weight on the opinion accepted by the majority of sources, with a smaller note on the alternate view. How do we set weighting on Wikidata beyond uploading multiple versions of a statement and setting preferred rank on the one with the most reliable sources? If we have a controversial statement only mentioned in a minority of sources, do we include it because it is a statement verifiable to at least one source or exclude it because it is not a neutral balance of source material? From Hill To Shore (talk) 00:19, 5 May 2020 (UTC)
          • @From Hill To Shore: NPOV is inherently a smaller problem on WD than on other wikis. The descriptions of items are usually fairly mechanical, and more watered down than Wikipedia lead sentences; if we have doubts, NPOV can be applied in the usual way to the description. The trickiest case is, as you hinted at, conflicting sources on the same claim. Ranking of statements should reflect NPOV in that case. Otherwise, the usual concept of due weight is not as applicable; for example, personal data like height that usually only goes in the infobox gets similar prominence to the person's occupation, which usually takes the majority of a Wikipedia person article. I'm thus not too worried about NPOV. But verifiability in the first place is a more fundamental problem that needs to be addressed urgently.--Jasper Deng (talk) 08:59, 5 May 2020 (UTC)
  • If it were up to me, I'd insist all information be verifiable to reliable sources, and for things that are not trivial like height and weight, ideally something independent of the subject. Note that the Wikipedia standard of using secondary sources is relaxed here. I don't want this discussion to peter out like many others of its kind have, because I think we do need to take this seriously.--Jasper Deng (talk) 10:10, 8 May 2020 (UTC)
It would be less of a problem if adding sources was less cumbersome. --Trade (talk) 13:00, 8 May 2020 (UTC)
This basically means Wikidata:Requests_for_comment/Verifiability_and_living_persons#General_guideline plus secondary sources (or w:WP:ABOUTSELF) should be used for living people. But this does not solve the question of what is a reliable source.--GZWDer (talk) 15:57, 8 May 2020 (UTC)

Notability of Corona deaths

Hoi, I notice that many people who died of COVID-19 and are not notable are included in Wikidata. They are stated to be policemen, inmates, even relatives of celebrities. The level of notability in Wikidata is not high, but this is imho pushing it too far. What do you think? Thanks, GerardM (talk) 01:52, 7 May 2020 (UTC)

I say give up. They are dead, and they are data. One day we will all be in Wikidata. Robots will curate us when we're gone. All Watched Over by Machines of Loving Grace (Q4729854). -Animalparty (talk) 02:53, 7 May 2020 (UTC)
"One day we will all be in Wikidata[citation needed]"? How? Nomen ad hoc (talk) 06:09, 7 May 2020 (UTC).
Someone will invent a structural need, or simply mass import public birth and death records, probably. -Animalparty (talk) 22:50, 7 May 2020 (UTC)
Well, that would sound concerning for me, Animalparty. Nomen ad hoc (talk) 14:14, 8 May 2020 (UTC).
We might look to en:Wikipedia:September 11 victims for guidance, but of course there are far more victims of COVID-19, and Wikidata's notability criteria are weaker. Bovlb (talk) 03:01, 7 May 2020 (UTC)
Not as useful, perhaps, but for completeness see also en:https://en.wikipedia.org/wiki/List_of_deaths_due_to_coronavirus_disease_2019, en:Wikipedia:Articles for deletion/List of deaths from the 2019–20 coronavirus pandemic, and en:Wikipedia:Articles for deletion/List of deaths from the 2019–20 coronavirus pandemic (2nd nomination). Bovlb (talk) 03:43, 7 May 2020 (UTC)
  • I think you are trying to apply English Wikipedia rules to Wikidata. If they actually exist, and can be referenced, they can be included, if someone is willing to do the work of entering them. --RAN (talk) 04:01, 7 May 2020 (UTC)
    • I'm not trying to apply English Wikipedia rules to Wikidata. I am seeking guidance where it is available. They have dealt with similar issues in the past. We are not bound by their rules, but that doesn't mean we can't learn from them. Bovlb (talk) 04:11, 7 May 2020 (UTC)
If they have an obituary we can link to, then they are Wikidata notable. If we want to come up with a scheme to rank people by notability, I am sure we could think of a way to do that. --RAN (talk) 05:54, 7 May 2020 (UTC)
"If they have an obituary", they are surely notable. But in case they haven't? Nomen ad hoc (talk) 06:11, 7 May 2020 (UTC).
  • I think we essentially dropped notability criteria for people when the Peerage was imported .. Is it correct that about 20% them are living people? --- Jura 06:25, 7 May 2020 (UTC)
Wikidata:Notability is clear and has been the same for 7 years: if there are sources (criterion 2) and/or a structural need (criterion 3), then it's notable for Wikidata. No reason to treat COVID-19 deaths differently than the rest of the items. Cheers, VIGNERON (talk) 06:43, 7 May 2020 (UTC)
We have always discussed the phonebook quandary in discussions, and generally the answer has been that there is not much usefulness/utility in importing 10,000 entries labelled only John Smith with just a telephone number, especially since telephone numbers are not permanent. We want to import high density information, databases with multiple fields, so people can be properly disambiguated. --RAN (talk) 18:27, 9 May 2020 (UTC)
      • And my phonebook does not have a way to run SPARQL queries. There are researchers out there who would surely want to find various deaths based on location, profession, etc. Studying the impact of a pandemic would surely be easier with something like Wikidata that does have such a list. --Misc (talk) 10:30, 8 May 2020 (UTC)
          • I think the consensus is that phone books shouldn't be imported into Wikidata. So even if there are sources (criteria 2), this is not thought to be sufficient. --- Jura 10:36, 8 May 2020 (UTC)
  • I don't think notability is that much of an issue for Wikidata so long as reliable primary sources exist (which is a much lower threshold for inclusion than secondary sources). I'm personally much more concerned with the large import of personal data of non-famous persons, especially as Wikidata is increasingly reused by large databases (typically the Knowledge Graph) which gives a large exposure to potentially sensitive information. I believe that at some point the project will need to use different inclusion criteria for living or recently deceased persons. Alexander Doria (talk) 11:14, 7 May 2020 (UTC)
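To illustrate the point made above about querying deaths by location or profession, here is a small sketch that only builds a SPARQL query string for the Wikidata Query Service. It assumes cause of death (P509), place of death (P20), occupation (P106) and COVID-19 (Q84263196) are the relevant identifiers; the function name is made up, and the occupation QID is left to the caller rather than guessed here.

```python
def covid_deaths_query(occupation_qid=None, limit=100):
    """Build a SPARQL query listing people whose cause of death is COVID-19.

    `occupation_qid` is an optional item ID supplied by the caller;
    no particular occupation item is assumed in this sketch.
    """
    lines = [
        "SELECT ?person ?personLabel ?placeLabel WHERE {",
        "  ?person wdt:P509 wd:Q84263196 .",        # cause of death: COVID-19 (assumed QID)
        "  OPTIONAL { ?person wdt:P20 ?place . }",  # place of death, if recorded
    ]
    if occupation_qid:
        lines.append(f"  ?person wdt:P106 wd:{occupation_qid} .")  # occupation filter
    lines += [
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
        "}",
        f"LIMIT {int(limit)}",
    ]
    return "\n".join(lines)


query = covid_deaths_query(limit=50)
```

The resulting string could be pasted into the query service; whether such items should exist at all is, of course, the question this thread is about.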

Duplicated park

I walked to Hells Kitchen Park (Q49499734) to snap a picture and later discovered that there is also a Hell's Kitchen Park (Q5706428). The apostrophe as far as I see is the correct punctuation, and otherwise as far as I see both entries have correct information even though the coordinates are for a different part of the park. Am I handling it correctly by mentioning it here? Jim.henderson (talk) 12:56, 7 May 2020 (UTC)

Thanks for the picture.
It's reasonably easy (though occasionally tricky) to merge duplicate items like this one.
I usually use Special:MergeItems (and I've just used it to merge Q49499734 into Q5706428).
See also Help:Merge; supposedly there's a "merge gadget" described there that might be easier. —Scs (talk) 13:10, 7 May 2020 (UTC)
Yes, the gadget is easy to use. You open the Q item you are merging from, open the merge tool from the menu and insert the Q item you are merging to. From Hill To Shore (talk) 09:43, 8 May 2020 (UTC)
You can use this to merge in either direction, by selecting the appropriate option. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 8 May 2020 (UTC)
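For those who prefer scripting over Special:MergeItems or the gadget, the same merge is exposed through the `wbmergeitems` API action. The sketch below only assembles the request parameters and sends nothing; the helper name is made up, the token is a placeholder, and a real call would need a logged-in session and a CSRF token POSTed to the wiki's `api.php`.

```python
def build_merge_request(from_id, to_id, csrf_token, summary=""):
    """Return POST parameters for a wbmergeitems call (nothing is sent)."""
    for qid in (from_id, to_id):
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not an item ID: {qid!r}")
    if from_id == to_id:
        raise ValueError("cannot merge an item into itself")
    return {
        "action": "wbmergeitems",
        "fromid": from_id,    # item that will become a redirect
        "toid": to_id,        # item that receives the data
        "summary": summary,
        "token": csrf_token,  # placeholder; obtain a real token from the API
        "format": "json",
    }


# The park merge described above, with a dummy token.
params = build_merge_request("Q49499734", "Q5706428", csrf_token="TOKEN")
```

As with the manual tools, the direction matters: the `fromid` item is the one that is emptied.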

Merging two items : Vietnamese boat people and boat people

Hello to all,

I typically contribute to Wikipedia, but while doing so I noticed that pages in different languages are not showing up: specifically Vietnamese boat people Q6449297 and boat people Q494303 should probably be the same item, but I can't seem to be able to merge them. Could someone who knows how to do that resolve the issue?

Thanks in advance, from Wikipedia. --Marteil2003 (talk) 08:58, 8 May 2020 (UTC)

@Marteil2003: It looks like one of these items is for refugees travelling by boat, and the other is for Vietnamese boat people specifically. It's possible that some of the site links are on the wrong item. Ghouston (talk) 09:20, 8 May 2020 (UTC)
@Marteil2003: The markup {{Q|Q6449297}} and {{Q|Q494303}} renders as Vietnamese boat people (Q6449297) and boat people (Q494303) (the labels being shown in the viewer's preferred language). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:54, 8 May 2020 (UTC)
@Ghouston, Pigsonthewing: Ok I think I solved the issue, but could someone go over it ? I have never used Wikidata. Thanks, Marteil2003 (talk) 11:05, 8 May 2020 (UTC)
Yes, it's fine. Ghouston (talk) 11:31, 8 May 2020 (UTC)
Thank you all, this is great. And unsurprisingly it is the Vietnamese Wikipedia that has a minimal pair between these two items. Deryck Chan (talk) 16:03, 9 May 2020 (UTC)

Sort sequence

The Commons category:Aaron Jones sorts on "Aaron", not "Jones" as it should. The only lines in the category description are four parent categories and Template:Wikidata Infobox. If I add DEFAULTSORT, I get an error message saying that there is a preceding default, which is correctly shown as "Jones, Aaron". The Wikidata entry correctly shows his given and family names, so anything coming out of Wikidata should be correct. So, where is the incorrect sort coming from? Jameslwoodward (talk) 14:36, 8 May 2020 (UTC)

Probably phab:T252079 as some of the labels were missing. I purged Commons:Category:Aaron Jones but there seems to be a delay in updating it in the parent categories. Peter James (talk) 18:08, 8 May 2020 (UTC)

P94 = P4004 ?

Hello! Please explain the difference between the properties coat of arms image (P94) and escutcheon image (P4004)! Doc Taxon (talk) 17:10, 8 May 2020 (UTC)

It looks like it was created for "shield image" in w:Template:Infobox_settlement, but apparently that is (currently) meant to hold the coat of arms (description is "Can be used for a place with a coat of arms. ").
@Xaris333, ChristianKl: please double-check. --- Jura 17:22, 8 May 2020 (UTC)
@Jura1: I don't have the domain knowledge about flags to give good guidance here. ChristianKl 23:11, 8 May 2020 (UTC)

coat of arms (Q14659) is different from escutcheon (Q331357). Xaris333 (talk) 18:38, 8 May 2020 (UTC)

I think I know the difference now, thank you for your help Doc Taxon (talk) 14:02, 9 May 2020 (UTC)


Unusual number of statements for a person

What is happening at Prem Raj Pushpakaran (Q61656939)? Is this vandalism? gobonobo + c 01:50, 9 May 2020 (UTC)

Not gonna lie, when I saw the title of this talk page section I thought it was a jab at item #383541. Mahir256 (talk) 10:12, 9 May 2020 (UTC)

Requests for permissions/Bot/নকীব বট

According to the policy, I want to draw your kind attention to the Request for Permission page. - Regards Nokib Sarkar (talk) 02:18, 9 May 2020 (UTC)

Merging two items : Q93734821 and Q65336472

Hello to all, I typically contribute to Wikipedia, but while doing so I noticed that pages in different languages are not showing up: specifically အောင်စိုးသာ Q65336472 and Aung Soe Tha Q93734821 are the same person. I can't link the enwiki page and the mywiki page, so please merge these two items. Thanks Cape Diamond MM (talk) 16:51, 9 May 2020 (UTC)

@Cape Diamond MM: I just merged them, and the interlanguage links now work. If you want to do this yourself in the future, follow the instructions at Help:Merge. Just be careful that the two pages refer to exactly the same concept. Vahurzpu (talk) 19:34, 9 May 2020 (UTC)

Gay villages

A category was imported from English Wikipedia that is almost devoid of references at English Wikipedia. I can see if referenced, it would be a legitimate categorization, but it looks like it is just someone's personal opinion, sprinkled with some referenced ones, and some iconic ones. Can I remove them when I come across them here at Wikidata, if unreferenced? I came across it while filling in information at Asbury Park (Q201127) --RAN (talk) 17:50, 9 May 2020 (UTC)

  • Excellent, thanks!

How to model different versions of scientific articles?

What is open peer review? A systematic review (Q29649956) and What is open peer review? A systematic review (Q30491890) represent two different versions of one scientific article. I don't know what the correct property is to model that.--GZWDer (talk) 02:16, 7 May 2020 (UTC)

I am wondering about the interest to have two items for that? Is only one not enough? Usually, different version of a given paper fix only spelling and minor stuff. It does not change the meaning or the results of the paper. Pamputt (talk) 07:42, 7 May 2020 (UTC)
(edit conflict) Hmm, usually identifiers have the version at the end, indeed, the DOIs differ there, one has ".1", the other ".2". And usually one can get the newest version by just removing the suffix. But this does not work with DOI, does it? How can you get the newest version of that article without knowing which? Anyway I don't think any author wants old versions being referred to. --SCIdude (talk) 07:44, 7 May 2020 (UTC)
What about using only one item and add the several DOI values with possibly a qualifier such as applies to part (P518) with an item equivalent to pre-release version (Q51930650)? Pamputt (talk) 08:20, 7 May 2020 (UTC)
It would make sense to merge the items, set the newest DOI with preferred rank and add qualifiers if they apply, as Pamputt proposed. --MarioGom (talk) 09:26, 7 May 2020 (UTC)
For what it's worth, we do sometimes have distinct entities for different editions of books. One example I found is Gray's Anatomy (Q200306) and Gray's Anatomy (20th edition) (Q19558994). —Scs (talk) 13:22, 7 May 2020 (UTC)

  WikiProject Wikidata for research has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Pamputt (talk) 09:55, 7 May 2020 (UTC)

I just merged the two items, keeping both DOIs. I marked the older DOI as deprecated. Pamputt (talk) 21:47, 7 May 2020 (UTC)
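
The outcome of this thread (one item, several DOI values, the newest preferred and the older ones deprecated) can be sketched in a few lines. This is only an illustration under assumptions: the version is taken to be the trailing ".N" of the DOI, as SCIdude describes, and the DOIs and the function name assign_doi_ranks are made up for the example. Nothing here talks to Wikidata.

```python
import re

# Assumption: a versioned DOI ends in ".<number>", e.g. "....1" and "....2".
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.(?P<version>\d+)$")


def assign_doi_ranks(dois):
    """Return {doi: rank} with the highest version preferred, the rest deprecated.

    Hypothetical helper for illustration; raises ValueError if a DOI has no
    numeric suffix or if the DOIs do not share the same base.
    """
    versions = {}
    bases = set()
    for doi in dois:
        match = _VERSION_SUFFIX.match(doi)
        if match is None:
            raise ValueError(f"no version suffix in DOI: {doi!r}")
        bases.add(match.group("base"))
        versions[doi] = int(match.group("version"))
    if len(bases) != 1:
        raise ValueError("DOIs must be versions of the same article")
    newest = max(versions, key=versions.get)
    return {doi: ("preferred" if doi == newest else "deprecated") for doi in versions}


# Made-up DOIs, not the ones from the items discussed above.
ranks = assign_doi_ranks(["10.1234/example.5678.1", "10.1234/example.5678.2"])
```

Whether older versions should be deprecated or merely left at normal rank is a modelling choice; the sketch follows what Pamputt did here.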


Also in these, any suggestions what to do with Q83567432? It seems to have been a regularly updated newsfeed. --- Jura 10:04, 7 May 2020 (UTC)


Duplicate items: ZooBank vs ORCID

There are a lot of authors duplicated from ZooBank and ORCID imports. I have created a list of high confidence potential duplicates here: User:MarioGom/Potential duplicates/ZooBank vs ORCID. Duplicates can be verified with the links to ZooBank and ORCID and finding duplicate publications. I have queries with wider criteria, but more false positives. Feel free to ping me if you find false positives in the list, or if it should be replenished with more potential duplicates. --MarioGom (talk) 09:23, 7 May 2020 (UTC)

  Notified participants of WikiProject Biology

Nice work. I've confirmed and merged one, and will do a few more. (Is there a button to push to regenerate your list, to avoid duplication of effort?)
Would someone scold me if I suggested that this might indicate additional disambiguation criteria that the ORCID and/or ZooBank bots could use while importing? —Scs (talk) 11:17, 7 May 2020 (UTC)
A bot could indeed look at ORCID and ZooBank duplicates for matches. Scs: you can regenerate the list by clicking on Manually update this list and waiting up to 1 minute for the success page. However, note that changes take time to propagate, so clicking the link will have no effect for some time. I'm not sure how long it takes, but updating again within 10 minutes or so usually has no effect. In the absence of manual updates, it updates daily automatically. --MarioGom (talk) 11:45, 7 May 2020 (UTC)
I wasn't thinking of a separate bot (that's more or less what you've written) -- I was thinking that the bot that originally created, say, Enrique Macpherson (Q30512371) could have checked first and noticed that Enrique Macpherson (Q21340373) already existed. —Scs (talk) 11:52, 7 May 2020 (UTC)
I noticed that several of the dups I just merged had similar qids (Q54522789, Q54537815, Q54539266), and indeed they were all created by QuickStatementsBot in late May 2018, with the edit summary "Created a new Item: #quickstatements; invoked by SourceMD:CreateFromWikispeciesDOIs". I have no idea how to figure out who created that batch, and it's obviously long finished, and creation of such batches might be done more carefully today, but it's the sort of thing I'm thinking of. —Scs (talk) 13:03, 7 May 2020 (UTC)
The query returns at least one false positive. I have added reciprocal different from (P1889) statements to the items concerned. @MarioGom: Please can you adjust your query to exclude such items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:49, 8 May 2020 (UTC)
Andy Mabbett: Done, thank you! --MarioGom (talk) 11:44, 8 May 2020 (UTC)
The false positive was probably because there was a link from Q24206732 to Helen Ward (Q43193512); I changed it to link to Helen Ward (Q25450864). Q59846853 also linked to the wrong person, and I'm not certain that the Helen Ward linked in Q47201908, Q47288737, Q48124447, Q50655465, Q51033516, Q80260479, Q87705425, and Q90234880 (and possibly others) is the same person as Helen Ward (Q43193512). Looks like several different people. Peter James (talk) 12:49, 8 May 2020 (UTC)
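
To make the idea in this thread concrete, here is a rough sketch of the kind of check being discussed: pair up items that share a name where one carries only a ZooBank identifier and the other only an ORCID iD, and skip pairs already marked with different from (P1889). It works on plain in-memory records; the keys qid, label, zoobank and orcid, the function names, and the sample data are all hypothetical, and this is not the query MarioGom actually uses.

```python
from collections import defaultdict
from itertools import combinations


def normalize_name(name):
    # Lower-case and collapse whitespace so trivially different spellings match.
    return " ".join(name.lower().split())


def _has_only(item, present, absent):
    # True if the item has the first identifier but not the second.
    return bool(item.get(present)) and not item.get(absent)


def candidate_duplicates(items, different_from=frozenset()):
    """Return sorted (qid, qid) pairs that look like ZooBank/ORCID duplicates.

    different_from is a set of frozenset({qid_a, qid_b}) pairs that are known
    to be distinct people and must not be reported.
    """
    by_name = defaultdict(list)
    for item in items:
        by_name[normalize_name(item["label"])].append(item)

    pairs = []
    for group in by_name.values():
        for a, b in combinations(group, 2):
            split = (
                _has_only(a, "zoobank", "orcid") and _has_only(b, "orcid", "zoobank")
            ) or (
                _has_only(a, "orcid", "zoobank") and _has_only(b, "zoobank", "orcid")
            )
            if split and frozenset((a["qid"], b["qid"])) not in different_from:
                pairs.append(tuple(sorted((a["qid"], b["qid"]))))
    return sorted(pairs)


# Made-up records for illustration only.
sample = [
    {"qid": "Q_A", "label": "Example Author", "zoobank": "ZB-1", "orcid": None},
    {"qid": "Q_B", "label": "example  author", "zoobank": None, "orcid": "0000-0000-0000-0001"},
    {"qid": "Q_C", "label": "Other Person", "zoobank": None, "orcid": "0000-0000-0000-0002"},
]
found = candidate_duplicates(sample)
```

A name match alone is weak evidence, as the Helen Ward case above shows, so the output is a list of candidates to verify by hand, not a list of merges.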

Duplication because of private ORCID additions

I have noticed many many duplicate items for people because ORCID identifiers are added for people whose information is private at ORCID. The result is that there is no sane way to merge them. Non free ORCID items are really problematic. Thanks, GerardM (talk) 07:33, 8 May 2020 (UTC)

I'm not sure that somebody who only has a name and an ORCID can be assumed to be notable. Ghouston (talk) 09:25, 8 May 2020 (UTC)
Personally, I mostly ignore these two-statement ORCID only items. --- Jura 09:38, 8 May 2020 (UTC)
There are often other sources confirming the use of the ORCID iD by the named individual referred to in the item. I can't say whether that's true in the cases referred to here, because they are not identified. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:46, 8 May 2020 (UTC)
Not when we only have a name and an ORCID identifier that is not informative. They have been added by a bot. When the ORCID iD is NOT private it is easy and obvious to link them to papers. In this case that will not work because of "false friends". Thanks, GerardM (talk) 12:50, 8 May 2020 (UTC)
If it is used in some PubMed publications, you can search for it in EuropePMC, for example (explicit link to EuropePMC author profile). At least you can get some information about the author. They exist for a reason. Also, even if a profile is private, it may provide links to profiles on other websites (e.g. ResearcherID), which may be imported to Wikidata.--GZWDer (talk) 15:46, 8 May 2020 (UTC)
Most of the private ORCID profiles are on Wikidata because they are linked to some publication. Using "what links here", you can usually trace it back to a publication and see the field of research and the author's affiliation (usually on the paper itself) which will, most of the time, uniquely identify the author. Vahurzpu (talk) 19:22, 8 May 2020 (UTC)
How do you ensure that a paper is INDEED linked to that author? Still, the number of duplications is huge. Merging has to be done by hand, and that is not you? Thanks, GerardM (talk) 20:00, 8 May 2020 (UTC)
On the contrary; what I said applies in many cases of duplicate items, where on one of them we only have a name and an ORCID identifier. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:42, 9 May 2020 (UTC)
They cannot be merged because you do not know if it is a false friend. Thanks, GerardM (talk) 05:34, 10 May 2020 (UTC)
The cases I describe can and should be merged, because there is another source confirming the use of the ORCID iD by the named individual referred to. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:24, 10 May 2020 (UTC)

sexual orientation & European General Data Protection Regulation

Hi all, On the Dutch Wikipedia there are some concerns about adding the sexual orientation (P91) of people, especially in relation to the General Data Protection Regulation (Q1172506) (GDPR). Sexual orientation is considered a special category of personal data and very sensitive. Of course, like all statements in Wikidata, if this kind of data is added, it must be only for notable (well-known) people and must be well sourced. There are a few concerns about the collection of this kind of data, even when it is sourced.

First of all, the very existence of a database in which we have collected people together with their sexual orientation is considered sensitive/dangerous. It has been suggested that if this becomes known to a wider audience, for example through articles in the press, it might cause a lot of angry reactions that something like that exists: a database of people and their sexual orientation. (This even in a liberal country like the Netherlands.)

Secondly, another main question that comes up is how information regarding the sexual orientation collected in a structural way is not violating the European GDPR. The question regarding this is I think on what basis this data is collected concerning the privacy of individuals?

If someone can shed light on these matters, that would be appreciated a lot. Thanks! Romaine (talk) 14:57, 28 April 2020 (UTC)

  • Currently, it seems the property is not in the most sensitive category we have (property that may violate privacy (Q44601380)). It likely should be in that category which will result in the standard being "Values for living individuals should generally not be supplied unless they can be considered widespread public knowledge or are openly supplied by the individual themselves (otherwise hidden supporting references are not sufficient)."
Recording data that's openly made public by a person themselves isn't violating their privacy as the person has chosen certain data to be public. Similarly, I don't think we violate privacy when we record data that's considered widespread public knowledge, as it's already out there. ChristianKl 18:15, 28 April 2020 (UTC)
"Recording data that's openly made public by a person themselves isn't violating their privacy as the person has chosen certain data to be public" Note that GDPR says nothing about this. It only knows about 'personal data', 'public interest' (government) and processing/controlling, it does not consider if that data is already widely or narrowly known to the public. Just because you have disclosed personal data, does not give anyone an innate permission to collect this personal data and this was sort of affirmed in this recent lawcase. Any such freedom to do so, would have to come from OTHER laws/liberties within the EU and its signatory countries. TheDJ (talk) 10:30, 29 April 2020 (UTC)
@TheDJ: The lawsuit seems to say that if you store data about a person and could contact them because you have their contact addresses, you have an obligation to contact them. It doesn't say that Bisnode should ask for permission, only that it should notify them, given that it was easy enough to do so with the available contact data. At Wikidata we don't store contact data that could be used to inform the person that we are storing their data. Beyond that, legal issues are the responsibility of the meta:Legal team. ChristianKl 11:44, 11 May 2020 (UTC)
We do: there are Twitter/Instagram/Facebook account names. And I do think that date of birth (P569) would be PII under some circumstances (banks use it to identify me on a regular basis, so that's clearly PII that should be protected), which raises an interesting problem: should Wikidata automatically contact someone on Twitter as soon as the information is entered? We have 112878 humans with X username (P2002) and date of birth (P569). --Misc (talk) 13:02, 11 May 2020 (UTC)
  • Such issues are within the domain of the meta:Legal team. They already have a detailed privacy policy. I'd take it up with them if you want an opinion about Wikidata and GDPR specifically. I doubt that many here have the expertise to interpret GDPR or the ability to set Wikimedia policy. Ghouston (talk) 00:10, 29 April 2020 (UTC)
  • See also Phabricator:T193148 and mw:GDPR_(General_Data_Protection_Regulation)_and_MediaWiki_software. Ghouston (talk) 03:33, 29 April 2020 (UTC)
  • "Secondly, another main question that comes up is how information regarding the sexual orientation collected in a structural way is not violating the European GDPR. The question regarding this is I think on what basis this data is collected concerning the privacy of individuals?". Within GDPR, there are basically only 2 ways allowing us to collect this data. 1: consent or 2: the 'public good' exception. Now consent is a bit hard for us, because it requires explicit consent from the subject to the party managing/controlling the data, so that almost never applies and actually comes with a whole slew of extra responsibilities and rights (like the subject can change its mind). The primary 'defense' for Wikidata would most likely be 'public good'. GDPR is intentionally vague here as to what that means, rather it defines it as "freedom of expression and information, including processing for journalistic purposes and the purposes of academic, artistic or literary expression." (Art 85) and indicates that member states need to 'reconcile' their laws to make sure that these other freedoms are not infringed upon by GDPR. Also this does not contradict the rights of the individual. Subjects may still ask for erasure, rectification, access to the data and have the right to object to processing. But what happens in such cases when both parties cannot agree is almost always up to a court case. Another important point is this 'structural way' in which we collect from other sources. As far as I can determine, this is not really covered by GDPR, but arguably we collect for another purpose than the 'primary purpose of collection' of a specific piece of information ('in the news'), which is potentially a problem. However there are exceptions for archiving in the public interest (art 89), but don't expect anyone drafting this EU law to have considered what that means for something as unique as Wikidata. As I often state: if you are looking for a clear line on what is allowed and what is not, then you won't find it in this EU law, because it is intentionally vague; we will find out in court. TheDJ (talk) 10:17, 29 April 2020 (UTC)
Hi, Romaine asked me to weigh in as well since I specialise in data protection law in my day job.
Put very simply, the GDPR does not apply to the article namespace, because it falls under the exception of art. 85 GDPR which refers to the law of the EU member states protecting "the right to freedom of expression and information, including processing [personal data] for journalistic purposes and the purposes of academic expression". This rule covers both the question of whether we are allowed to publish personal data about a living EU resident in the first place (art. 6 GDPR), as well as the question of so-called sensitive data (art. 9 GDPR).
You can read the full text of the GDPR here or, in an easier format, here. For further reading, you may want to check the GDPR's recitals, namely nos. 4, 65, and 153.
I hope that this answers your questions; let me know if there is anything I can clarify. Kind regards, --Gnom (Diskussion) make Wikipedia green! 10:41, 29 April 2020 (UTC)
@Romaine:, for a start, I think the Project:LGBT folks would have appreciated a heads-up before so many items were touched at once, especially since this mostly impacts non-heterosexual folks (as almost no one records heterosexuality in sexual orientation (P91), except a few folks importing data from Russian Wikipedia as far as I know). Some people on the project (for example, me) work to keep that information accurate, and just removing a ton at once makes this job a lot more difficult (because while we can query info without a reference, it is harder to query info that is no longer there). This will also cause trouble for people using this property, such as people looking for articles about LGBT folks to translate during June for Pride month. I should also point out that removing this information is not protecting anyone but pushing LGBT folks back into the closet, and doing that 1 week after National Lesbian Day of Visibility (Q19833027), during the preparation of events for June, is not great timing. Just looking at the diffs, I see Adèle Haenel (who was in the news before the pandemic due to the Polanski controversy in France), Janelle Monáe (who is notably out), Bertrand Delanoë (openly gay, ex-mayor of Paris), etc, etc. Removing that information instead of extracting the sources from Wikipedia is taking the lazy road and pushing the hard work onto others, and I frankly expect more collaboration. Unless you can show that most of the data is incorrect, the removal should be reverted and the data cleaned case by case. If there is a problem, start with a report, not by removing stuff. --Misc (talk) 21:02, 1 May 2020 (UTC)
Hi Misc, Sorry, I had not thought of the LGBT project, my apologies. I understand your position that it can sometimes be hard to find this data; however, most statements came directly from Wikipedia and should still be in the articles, so they are not lost. Another portion was completely unsourced. As this is extremely sensitive personal information, it is required to have a proper source (and not Wikipedia as source). There can't be exceptions to that. Already in 2016 this property was cleared of any statements without a source, and it is very disappointing to see, when I looked again in the past few days, that statements were added without a proper independent source. And please be aware, it is the responsibility of the person who adds this data to add a proper source.
Whether all the data was of good quality is questionable. In a discussion on the Dutch Wikipedia various items were listed as examples of how terrible the quality was.
Please also note: On request of some users from the Dutch community, the Dutch chapter and German chapter have been asked to look into the matter of adding statements regarding someone's sexual orientation, and there are some big concerns raised. Later this year they want to organise a Wikimedia wide discussion if this kind of data is wanted and acceptable.
In this context, I think the discussion should be a clean and mature one, based on having this statement added only with proper independent sources. Simply restoring the data in the items will likely cause trouble in that discussion. I would say, if you want to have this property completely deleted from Wikidata, please go ahead restoring all the badly sourced items as they are.
I understand there are frustrations, but it is also frustrating to see that the simple requirement of sources is not followed. Greetings - Romaine (talk) 21:50, 1 May 2020 (UTC)
I agree. A collaborative way of doing it would have been to list the unsourced elements and give us time to add sources. This impacts work done on gender gap projects in all languages by updating lists of suggested translations. These changes should be reverted. Nattes à chat (talk) 21:25, 1 May 2020 (UTC)
Romaine has removed roughly 2000 statements as far as I am aware. Spotting a source, verifying the content, and adding a reference requires something like 5 min/claim. Hard to imagine that anyone or any project would spend 150+ hours of work for such a repair. Particularly since the problem has apparently been ignored since the last cleanup in 2016. Adding sensitive data such as sexual orientation (P91) to Wikidata is extremely laborious, and we cannot just start an import script to do this job, unlike for lots of other data. —MisterSynergy (talk) 22:29, 1 May 2020 (UTC)
Could you please provide a complete list of all the items affected by this massive modification somewhere, so that the people who try to work on Wikidata using SPARQL queries to find these unsourced items can continue, at their own rhythm (we are all volunteers)? Maybe the system should just find a way of blocking such edits by asking the contributors to first add references. To me it is a flaw in the design of the system if one can do a forbidden action. It also seems undemocratic if one can revert in five minutes the work of hundreds of contributors without giving them prior notice. And since it was not done before, maybe you could now start a conversation with the LGBTIQ project here? Nattes à chat (talk) 09:50, 2 May 2020 (UTC)
I would also appreciate a list so that way people have the opportunity to even start adding in sources. A concrete example of how this negatively impacted project work can be seen in the work I'm doing today, which is not going particularly well thanks to these P91s being deleted. It took me a ridiculous amount of time to wade through Wikipedia articles in multiple languages looking for people for a contest who meet certain criteria, whereas before it would have been a simple query. I found 6... -Yupik (talk) 10:16, 2 May 2020 (UTC)
Special:Contributions/Romaine is a place to start with. There are plenty of "older" pages. —MisterSynergy (talk) 10:38, 2 May 2020 (UTC)
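
For those who want to list statements that still lack a reference (so that sources can be added before any removal), a query along the following lines could be run on the Wikidata Query Service. This is a minimal sketch that only builds the query text; the function name is made up, and it assumes the query service's predefined p:, ps: and prov: prefixes. It cannot recover statements that have already been removed.

```python
import re


def unreferenced_statements_query(property_id="P91", limit=500):
    """Build a SPARQL query for statements of a property with no reference.

    Hypothetical helper: it returns the query text only and sends nothing.
    """
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?item ?statement ?value WHERE {\n"
        f"  ?item p:{property_id} ?statement .\n"
        f"  ?statement ps:{property_id} ?value .\n"
        # Keep only statements that have no reference node attached at all.
        "  FILTER NOT EXISTS { ?statement prov:wasDerivedFrom ?reference . }\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = unreferenced_statements_query("P91", limit=10)
```

Note that this treats any reference as a source, including ones that only say the value was imported from a Wikipedia; filtering those out would need an extra condition.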

@Ecritures, Ciell: for information, as this is happening on the Dutch Wikipedia. Nattes à chat (talk) 09:52, 2 May 2020 (UTC)

Please, can the people who had the idea in the first place and reverted everything before we got the opportunity to save a useful list make just a little effort to produce that list? That would be appreciated. Nattes à chat (talk) 14:08, 2 May 2020 (UTC)
  • The Dutch discussion is very long, and I haven't read it all, but I assume it repeats itself quite a lot. There's discussion about whether under GDPR, building a database (in Wikidata or Wikipedia, particularly lists and categories) that includes sensitive information for living people such as medical condition (P1050), sexual orientation (P91), religion or worldview (P140) really meets any journalistic or academic purpose. Then there's discussion about, even if it's legal, whether it's ethical and invades privacy, and is tabloid journalism at best. There's the complaint that making this data available in an easily-accessed multilingual form may put people at risk. There's also doubt about whether the information in Wikidata has been entered reliably, e.g., lacking references entirely, or inferred from invalid assumptions (e.g., that being in a same-sex marriage implies that somebody is homosexual, or that an opposite-sex marriage implies that somebody is heterosexual, or that somebody mentioning in an interview that they are attracted to a particular person of the same sex implies anything). I could add an opposite point of view, that I haven't noticed in that discussion, but may be there somewhere, that the pride movement has been around for a long time now, and its goal is that LGBT people should not be hidden in a closet but accepted as a normal part of society. Turning Wikimedia projects into a homosexual-free zone, as though they are part of Poland or similar places [28], may not be seen as ideal either. Ghouston (talk) 00:57, 3 May 2020 (UTC)
    • I strongly concur with Ghouston's last statement. - Jmabel (talk) 01:43, 3 May 2020 (UTC)
      • Thank you @Ghouston:. I also wonder if GDPR could be invoked to remove information on items which are not "European", and I would like to know if there was a check on whether this information was adequately sourced in the Wikipedia articles used as reference. In which case, we should at least have a list of the impacted elements so that we can add adequate references. Some LGBTIQ people are out and even say so in the media, and it is an important issue for many of them, so it would be rather inappropriate to delete this information "out of respect" for them, making it seem like something to be hidden. One thing is sure: we have to be careful about how and why this data is added or retrieved. Nattes à chat (talk) 10:54, 3 May 2020 (UTC)
@Nattes à chat: here above Robin van der Vliet has made a list. I just made a list as well which is here. Feel free to copy/move the page elsewhere.
In the history of other items I have noticed that also in 2016 a large removal has taken place of unsourced P91 statements (not in the list above).
"Maybe the system should just find a way of blocking such edits by asking the contributors to first add references." -> I think this is a very good idea.
@Ghouston: "The Dutch discussion is very long, and I haven't read it all, but I assume it repeats itself quite a lot." -> It repeats itself indeed, and it is also including false information and personal insults. To me it is also very emotional. All together makes it a long discussion, but one perspective gets at least 90% of the attention, and because of the "heaviness" of the discussion I think most users do not read it at all.
The first part of the discussion was largely about whether it falls under GDPR or not. Various users stated that Wikipedia and Wikidata have to follow GDPR, but they refused to provide sources for these claims. Then luckily we got some input from a GDPR expert and (if I am not mistaken) from jurists from WMDE. After this, only one user seems to dispute these clear answers.
"There's discussion about whether under GDPR, building a database (in Wikidata or Wikipedia, particularly lists and categories) that includes sensitive information for living people such as medical condition (P1050), sexual orientation (P91), religion or worldview (P140) really meets any journalistic or academic purpose." -> I agree with this, but most of the users replying in the discussion there strongly disagree with that. Multiple times it is said (translation): "I can't think of anything which makes this data fall under any journalistic or academic purpose."
After the GDPR concerns were mostly cleared, the intensity was still overheated. Some points that some users brought up:
  1. Brought up by multiple users: Just before the start of World War II in the Netherlands there was a register of Jewish people, and this database was used to deport people to Nazi concentration camps. Wikidata should not go in that same direction to avoid that such can happen as nobody knows what the future will bring.
    • A large part of the discussion also was about that I find it in principle wrong to compare a collection of data with the WWII and the holocaust, probably the largest suffering of mankind.
  2. Brought up by multiple users: Well known people have not asked to be in a database in which their sexual orientation has been documented. In relation to the previous point, it was said that this can cause public outrage when it comes into the press.
  3. Brought up by multiple users: Governments in countries where there is no freedom of sexual orientation can easily abuse the data from Wikidata for harassment practises, data which is available in any language.
    • There have been some examples of LGBT+ people travelling to LGBT+ unfriendly countries with some issues as result. With this reasoning it has been easily forgotten that: 1. Only people who have themselves publicly stated what sexual orientation they have should be in Wikidata with this statement + source added; 2. It thus already has been (usually) widely in the media so that hostile people/governments can easily find it (and translate tools work pretty well). 3. On Wikipedias in various languages this data is also present.
  4. Brought up by multiple users: Mentioning which sexual orientation someone has is a violation of privacy.
    • This even while multiple times it has been made clear that only people should have this statement if they self-identified and have a reliable source.
  5. Brought up by multiple users: It happened in the media various times that in a talk show on tv a well known person had not the intention to reveal his/her sexual orientation, but with a slip of the tongue this became public unwanted.
  6. One user had looked in Wikidata for the use of P91 and found examples of this statement where it was questionable whether that is actually their sexual orientation.
  7. Some users had come to the conclusion in the discussion that this property should be deleted from Wikidata completely.
    • As reply to this, I mentioned that in the Netherlands there is one famous gay singer who even had a television show about this (a tv show in which he was looking for a partner of the same gender). I asked them, if they really want this property to be deleted completely, they must be able to explain why it is even a bad idea in the case of this gay singer. (They couldn't, and I "enjoyed" the shower of a shitstorm.)
To explain the quality level of discussion, users on Wikidata who add this property to items were compared with an infant playing with matches and petrol: (according to them) this should be removed quickly from their hands.
So I am sorry if I forgot to notify any project/users/etc, if other point of views should also have been mentioned, or if I should have approached the situation differently, a shitstorm was a bit distracting me from it.
All of this does however not take away that a source for this statement is a requirement and that there were a lot of statements with this property added without any source, which is a problem. A problem that seemingly was tackled four years ago and came up in previous discussions. However, the problem came back. It seems to me a good thing to look into this matter and see if this could be avoided in future.
@Ghouston: "I could add an opposite point of view, that I haven't noticed in that discussion, but may be there somewhere, that the pride movement has been around for a long time now, and its goal is that LGBT people should not be hidden in a closet but accepted as a normal part of society. Turning Wikimedia projects into a homosexual-free zone, as though they are part of Poland or similar places [20], may not be seen as ideal either." -> Each year (besides this year due to corona) there is a big Amsterdam Gay Pride (Q478546) which is accepted as a normal part of Dutch society. I did not see anything in the discussion that referred to intolerance. The full discussion was focussed on privacy and the risks of this kind of data in a database. Romaine (talk) 14:05, 3 May 2020 (UTC)
I wasn't intending to accuse anybody of intolerance, but was wondering if a well-meaning attempt to remove information about sexuality, intended to protect individuals, may alternatively be interpreted as putting them back in a closet, which they may not actually appreciate. Ghouston (talk) 14:48, 3 May 2020 (UTC)
In your message I did not see any accusation of intolerance. I think we must be aware of the possibility of intolerance (as it sadly exists in the world) and let that never be a reason why things are done. I think it is certainly possible that a removal of information about sexuality can be interpreted as putting them back, but on the other hand there are also people that would dislike to be in a database with this identification and want privacy. Both wishes can be served, for that reason only people that have openly self-identified to have a sexual orientation are allowed to have this statement, and then only with reliable source. Between the two perspectives this seems balanced to me, with respect for both.
Looking from the perspective of someone who did not openly identify as such, the data must be qualitative and can't invade someone's privacy. Looking from the perspective of someone who did openly declare their sexual orientation, it has to be clear that with adding this data to a statement something went wrong (missing source, while required), and all warning signs have been ignored. Romaine (talk) 15:20, 3 May 2020 (UTC)
Hi @Romaine: Could you provide a list for the deleted items in 2016? That would help IMO. Nattes à chat (talk) 15:47, 3 May 2020 (UTC)
Also, helping to fix the mess that was created seems to me the responsible thing to do. I hope you realize that you have pushed months of work onto others to clean up the mess, and I do consider that if you have time to discuss now, you also have time to fix. --Misc (talk) 16:31, 3 May 2020 (UTC)
"Some users had come to the conclusion in the discussion that this property should be deleted from Wikidata completely" I wonder how our colleagues on the Dutch Wikipedia would react, were we to inform them that we had held a discussion on Wikidata, and decided that certain content on that Wikipedia should be deleted? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:32, 6 May 2020 (UTC)
  • I have said many times with these 'sex' discussions, we can't change or erase history. If you were born Michael, we need to report it. It does not matter if you changed it to Yuesef for religious reasons or Michelle for mental reasons. Quakewoody (talk) 22:54, 3 May 2020 (UTC)
I have now created a Phabricator task T251720 using Robin's list for this for the Hackathon this weekend. -Yupik (talk) 06:34, 4 May 2020 (UTC)
I see that many people in that list are dead. Please note GDPR only protects living people, and thus cannot be used as an argument for them. -Ash Crow (talk) 20:38, 5 May 2020 (UTC)
After making a few reverts, adding the demanded sources, I see that many dead people who were notoriously LGBTIQ+ (Jean Cocteau, Marielle Franco, Simone de Beauvoir, Alice B. Toklas, Colette) are included in these deletions... @Romaine: I would appreciate if you could provide a list of the sexual orientation statements deleted in 2016, which you mention earlier in this thread, so that we could add it to the existing list. Nattes à chat (talk) 22:08, 5 May 2020 (UTC)

  WikiProject LGBT has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.--Trade (talk) 23:58, 8 May 2020 (UTC)

  • I am sorry to state that this whole discussion makes me feel that queer people including me are now going to be forced back to the closet and the sacrifices of many LGBTIQA+ individuals including openly gay Harvey Milk (Q17141) will now be erased from the history. Anyways, erasing of queer histories is not something new, only some lame excuses are needed. John Samuel (talk) 06:46, 9 May 2020 (UTC)
    • Yes, it's funny what occasions cause some people to suddenly become very insistent that We Must Follow Official Policy! Not the first time I've seen this, when it comes to what should or shouldn't be included in author bios. Had a depressing experience with this at ISFDB. Levana Taylor (talk) 07:41, 9 May 2020 (UTC)
    • Agree — it appears that a number of the Dutch conversations circled around the question "does someone's sexual orientation have any journalistic merit [beyond tabloid journalism]?" That is not a question anyone should be using as a metric here; entire fields of study exist around gender/sexuality/sociology/psychology, etc., and various individuals' personal realities are absolutely relevant to any biographical representation of them and their relationships to society (such as @Jsamwrites: example of Harvey Milk). And the Dutch question also wondered about the journalistic merit of including religion and medical conditions... I understand their wariness of having a database of information that could be used nefariously, but for notable people who have fought to be able to practice their particular faith, or who advocate for the rights of people with their medical condition, using "journalistic value" as a reason to delete that information feels all kinds of wrong. Sweet kate (talk) 21:56, 9 May 2020 (UTC)
      • Imagine a Spanish homosexual person who thinks that gay people are underrepresented on the Spanish Wikipedia. When we store the information about sexual orientation they could run a query for homosexual people who live in Spain and have an article on another Wikipedia. It seems strange to me to consider such an activity, which is motivated by providing more visibility to homosexual people, as inherently without journalistic merit. ChristianKl 12:00, 11 May 2020 (UTC)

changed constraint

I've never edited a constraint before. I noticed that prefixes like kilo (Q107428) have the obvious numeric value (P1181), except that they were being flagged as constraint violations. So I added unit prefix (Q15132612) to the type constraint at numeric value (P1181) [29]. Any problem with that? —Scs (talk) 12:02, 9 May 2020 (UTC)

P3921 Wikidata SPARQL query

I've been puzzled by property Wikidata SPARQL query equivalent (P3921). Part of the problem was a typo in its description, which I think I've just fixed. And I guess I can see what this property is for if it's applied to an entity that's a list of other entities. But what does it mean for this property to be applied to something like kilometre (Q828224) or parsec (Q12129)?

Should there be a constraint on P3921 limiting it to entities of type Wikimedia list article (Q13406463) or Wikimedia category (Q4167836)? (I would have thought so, but there must be something I'm missing, because kilometre (Q828224) is actually one of the listed examples at P3921.) —Scs (talk) 12:43, 10 May 2020 (UTC)

About the SPARQL query on units: This seems to be used to list possible conversions. While that might seem useful at first sight, I find it somewhat suboptimal: There are currently about 5000 units in Wikidata. Repeating a query that only differs in one place - the unit item to which the query is attached - seems like clutter (a lot of repetitive work for little gain). It might be better to record such a query with usage instructions in a single place (the talk pages of the conversion properties, or some future "Metrology" WikiProject). Therefore I support removing P3921 statements from unit items. Toni 001 (talk) 07:43, 11 May 2020 (UTC)
The SPARQL equivalent property is a somewhat odd type of statement, different from most others.
For these, it seems an adequate use of the "sparql equivalent" and replacement for the 5000*100 conversions some users attempt to add with conversion to standard unit (P2442).
A side effect is that it highlights a function frequently ignored. Did you know about it before finding the query equivalent? --- Jura 11:04, 11 May 2020 (UTC)
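
The suggestion above to keep one query with usage instructions, rather than one copy per unit item, can be sketched as a template parameterised on the unit. This is only an illustration in Python: the function name build_conversion_query is hypothetical, and the query assumes conversions are stored as conversion to standard unit (P2442) quantity values reachable through the usual p:/psv: paths.

```python
import re

# One shared template; only the unit item changes between uses.
# Doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """
SELECT ?amount ?targetUnit ?targetUnitLabel WHERE {{
  wd:{unit} p:P2442/psv:P2442 ?valueNode .
  ?valueNode wikibase:quantityAmount ?amount ;
             wikibase:quantityUnit ?targetUnit .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


def build_conversion_query(unit_qid: str) -> str:
    """Return the conversion query for one unit item (hypothetical helper)."""
    if not re.fullmatch(r"Q[1-9]\d*", unit_qid):
        raise ValueError(f"not a valid item ID: {unit_qid!r}")
    return QUERY_TEMPLATE.format(unit=unit_qid).strip()


# Example: kilometre (Q828224), one of the units discussed above.
print(build_conversion_query("Q828224"))
```

The printed text can be pasted into the query service; swapping the item ID is the only change needed for another unit.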

Wikidata weekly summary #415

Do you know Wikidata:CheckUser?

Wikidata:CheckUser is active now. --Succu (talk) 21:37, 11 May 2020 (UTC)

Adding grammatical features to forms using QuickCategories

Hello! I have a really big bunch of items with a lot of complex grammatical features to upload. I want to use QuickCategories, as now it is possible to add statements to lexemes (YEAH!), but I can't find the documentation about how to add grammatical features to forms. Is it even possible? -Theklan (talk) 22:31, 11 May 2020 (UTC)

Updating population data for US States

Hi, I've parsed the US Census files as well as linked them to Wikidata pages for all US States. I'd like to update their populations to match the Census ones. Can someone give me guidance on how exactly I should do this?

I guess I'd need to give a reference, what should it be? Also, how to register that "point-in-time" pointing to the latest population data?

Say, I'd like to update https://www.wikidata.org/wiki/Q99 to have population 39,512,223 which is from the 2019 estimates from US Census, from this file: https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv

What should I add exactly?  – The preceding unsigned comment was added by Hyperknot (talk • contribs) at 21:49, 6 April 2020 (UTC).
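
One possible shape for such an update, sketched in Python as a generator of QuickStatements (version 1, tab-separated) commands. It assumes population (P1082) as the main statement, point in time (P585) as the qualifier and reference URL (P854, written S854 in a reference position) as the source. The function name is hypothetical, and the 1 July 2019 reference date is an assumption about the estimate, not something stated in the question.

```python
def population_statement(item: str, population: int, date_iso: str, source_url: str) -> str:
    """Build one QuickStatements V1 line: population + point in time + reference URL."""
    fields = [
        item,
        "P1082", str(population),             # population
        "P585", f"+{date_iso}T00:00:00Z/11",  # point in time, /11 = day precision
        "S854", f'"{source_url}"',            # reference URL (S = source part)
    ]
    return "\t".join(fields)


# California (Q99) with the figure quoted in the question.
# The date below is an assumed reference date for the 2019 estimate.
line = population_statement(
    "Q99",
    39512223,
    "2019-07-01",
    "https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv",
)
print(line)
```

Running this over the parsed Census rows would give one line per state, which can be reviewed before pasting into QuickStatements.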

Dear Sir :

  I want to use sparql to get information from wikidata about COVID-19.
  1. COVID-19 (Q84263196)
   1.1 "number of deaths" of each day from 2020/01/01
   1.2 symptoms
   1.3 possible treatment
  2. 2019–20 COVID-19 pandemic (Q81068910)
   2.1 number of deaths of each day of each country from 2020/01/01
  How could I first find "country" from 2019–20 COVID-19 pandemic (Q81068910)?
  Then I can find "number of deaths" of each day from 2020/01/01 from each country.
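
A rough starting point for these queries, written as Python helpers that only build the SPARQL text (nothing is sent over the network). The modelling is assumed, not confirmed here: that national outbreak items are part of (P361) the pandemic item and carry country (P17), that death counts use number of deaths (P1120) with a point in time (P585) qualifier, and that symptoms and possible treatments sit on the disease item as P780 and P924. Function names are hypothetical.

```python
from datetime import date
from string import Template

DISEASE = "Q84263196"    # COVID-19
PANDEMIC = "Q81068910"   # 2019-20 COVID-19 pandemic

DEATHS_BY_COUNTRY = Template("""\
SELECT ?country ?countryLabel ?date ?deaths WHERE {
  ?outbreak wdt:P361 wd:$pandemic ;
            wdt:P17 ?country ;
            p:P1120 ?statement .
  ?statement ps:P1120 ?deaths ;
             pq:P585 ?date .
  FILTER(?date >= "${since}T00:00:00Z"^^xsd:dateTime)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?countryLabel ?date
""")

DISEASE_VALUES = Template("""\
SELECT ?value ?valueLabel WHERE {
  wd:$disease wdt:$prop ?value .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
""")


def deaths_by_country_query(since: str) -> str:
    """Dated death counts per country from `since` (YYYY-MM-DD) onwards."""
    date.fromisoformat(since)  # raises ValueError on a malformed date
    return DEATHS_BY_COUNTRY.substitute(pandemic=PANDEMIC, since=since)


def disease_values_query(prop: str) -> str:
    """Values of one property on the disease item, e.g. P780 or P924 (assumed IDs)."""
    return DISEASE_VALUES.substitute(disease=DISEASE, prop=prop)


print(deaths_by_country_query("2020-01-01"))
print(disease_values_query("P780"))  # symptoms
print(disease_values_query("P924"))  # possible treatment
```

Each printed query can be pasted into the query service; if the outbreak items turn out to be linked differently, only the first triple pattern needs to change.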

Besides Wikipedia, what is Wikidata actually used for?

This is not a facetious question. Aside from being the backbone of Wikipedia articles, what are some actual real world uses of the data? Is Wikidata used mainly for the personal amusement of tech-savvy people with too much free time, or for serious research and education? Almost everything I see on Wikidata involves ways people can feed the data beast (donate data! play games!), and ways to find trivia like "Municipalities with less than 5,000 inhabitants in the Basque Country" or "Average stardate of Star Trek episodes", but who is actually using the millions of data items, and for what purposes? Outside of Wikimedia, do any projects and platforms rely on the incomplete and idiosyncratic dataset that is Wikidata? As Wikidata editors, are we just freely volunteering our time and effort to help Google and Amazon sell ads more efficiently, and maybe some guy in Moscow who wants to know how many Swedish actors appeared in American TV series starting with the letter "K"? Are we secretly training robot algorithms to replace humans? I guess another way of asking my question is: if Wikidata disappeared entirely tomorrow, would anyone besides those reading this comment take notice? -Animalparty (talk) 18:44, 4 May 2020 (UTC)

I appreciate your question. thanks. I would also be interested in an answer. --Herzi Pinki (talk) 19:24, 4 May 2020 (UTC)
It is a serious question, but luckily I can say that I myself have found Wikidata tremendously useful for a project that has nothing to do with Wikipedia -- namely, identifying the authors of 19th-century magazine articles. I started out creating author pages on Wikisource, but then began submitting corrections and additions to the Curran Index. Here we had a freely-editable, powerful way of organizing and combining the data gleaned from my research, with links to references. It would have been way more trouble to do everything as text notes and scrappy personal databases; plus, the result is something publicly available that I can point people to. Maybe (at a rough estimate) 10% of the authors on my list already had very good Wikidata items, 50% had items that needed some or a lot of improvement, and 40% needed items created. Although, as I say, I found it most useful as a way of organizing my own research, every now and then it would flag up something I wouldn't have spotted easily without it. Example: one suggested attribution in the Curran Index is Thomas Chapman as author of an 1837 poem. I linked the Curran entry to Chapman's item and was in the process of adding the info that they gave about him to his item. What's this? He died in 1828? Reference: the Cambridge Alumni Database (already, before me, linked to the item). In that database, they have a further reference to his obituary in a Cambridge newsletter. Ergo, the attribution suggestion is definitely wrong. So, the idea of Wikidata being a good place for confluence of all the world's disparate data sources is not misguided, even if implementation has been rocky. (I'm not too pleased by the unorganized way that masses of data are being added: we definitely need more data, but people, PLEASE stop creating so many duplicate items.) I guess my attitude to WD is a mix of enthusiasm for the possibilities and frustration at the often-sloppy execution. But it is useful to me. 
Levana Taylor (talk) 20:15, 4 May 2020 (UTC)
Here are some examples:
  1. Analyze and structure document collections by ontologies or lists of names
  2. Chemical database references
  3. subject of published works on data management
  4. Learning to generate one-sentence biographies from Wikidata
  5. IBM - Incorporating resources from Wikidata into applications
  6. What Dutch GLAMs want from Wikidata
  7. Integrating Wikidata at the Library of Congress
  8. book explaining Wikidata's role as a central reference point - we provide a specific identifier for an item/concept/subject and one external database links to it. A second database links to the same item and immediately the two databases can talk to each other without further reference to Wikidata.
There are many more examples of Wikidata usage out there. You just need to try an internet search with a few different terms. Another use that has come up is that many software systems are calling up Wikimedia information through the Wikidata ID. If Wikidata disappeared tomorrow, large numbers of software programmes that call on Wikimedia information would stop working until someone invents an alternative system to do the job that Wikidata does. From Hill To Shore (talk) 20:43, 4 May 2020 (UTC)
I've been routinely relying on Wikidata's collection of organizations (mostly research-related), with OpenRefine, to match up ID's from a variety of different databases, including several internal ones that had never previously been correlated. The multi-linguality is extremely useful as names can be recorded in English or native languages, in different scripts, etc. I've been hoping the same could be done for matching up people's names with identifiers (as Levana Taylor mentions above) but Wikidata is still too incomplete to make much progress there for the databases I work with. But it may be only a few years away from being very useful there. There are a lot of other outside users on a similarly small scale, using Wikidata's collection of information on chemical substances, genes, etc. And there are larger-scale users who are annotating documents using Wikidata - Finland's YLE for example. I believe OpenStreetMap has been using Wikidata, and VIAF has corrected some of their data based on Wikidata feedback. There was a list of such users around somewhere but I can't find it just now... ArthurPSmith (talk) 20:43, 4 May 2020 (UTC)
  • Wikidata started as the knowledge base for Google Knowledge Graph and was donated to the Wikimedia Foundation, it is also used as one of the knowledge databases by Alexa and Siri, you can see that by asking Siri or Alexa about someone that doesn't appear in Wikipedia, and only appears in Wikidata. That person will be described using our description such as "John Xavier Hamilton Smith is an American politician". It takes about 30 days to cycle into Alexa and Siri. I am not sure how Google is handling our data, we seem to have forked. Google Knowledge Graph, the box on the right of the screen, appears to have many more entries than we have in Wikidata, especially geographic entries, down to the building level. --RAN (talk) 22:01, 4 May 2020 (UTC)
I think you misunderstand the history here - Wikidata did not start out as a Google project that was "donated" to WMF, or that we forked. The Google project you're thinking of is Freebase, which was shut down in 2014 (after Wikidata had been up and running for a couple of years) and some data was migrated into Wikidata as part of the shutdown. Andrew Gray (talk) 18:15, 5 May 2020 (UTC)
Thank you for the clarification. --RAN (talk) 19:42, 11 May 2020 (UTC)
At the current time, I don't really know if it is used for much other than some basic applications like search engines. In the future when it is more complete it could be very useful for a lot of things. However, why not now? I think it is because Wikidata is still not all the way there in some major areas, such as sources — there is still a ton of unreferenced data/dubiously referenced "imported from Wikipedia" data. I do not think Wikidata is used for much yet, but in the future when it is more mature it will certainly increase in usage. DemonDays64 | Talk to me 01:58, 5 May 2020 (UTC) (please ping on reply)

My problem: there is no general statement in Wikidata about reliability of the data, completeness of the data, currentness of the data. So Wikidata can only be a weak entry point for further investigations somewhere else. Databases (like geonames) are copied without QA and without reason, just to have exponential growth of Wikidata entries nobody can correct, check, complete any more. Some Bots do not care for duplicates, correctness, cleanup. There are only a few modelling guidelines and no enforcement of such guidelines. There is a lot (A HUGE LOT) of constraint violations - who cares? --Herzi Pinki (talk) 07:10, 5 May 2020 (UTC)

@Herzi Pinki: „Some Bots do not care for duplicates, correctness, cleanup“. Please name them, if you want a change! --Succu (talk) 21:08, 5 May 2020 (UTC)
let's start with Lsjbot and AliciaFagervingWMSE-bot --Herzi Pinki (talk) 21:25, 5 May 2020 (UTC)
User:Lsjbot created not a single item here! --Succu (talk) 19:24, 6 May 2020 (UTC)
@Succu: User:Lsjbot creates articles on sv:WP and ceb:WP based on questionable data from geonames, other bots import those articles from sv:WP and ceb:WP for IMHO often not notable stuff to WD. Then we have a triple of articles in sv:WP and ceb:WP and an item here in WD. E.g. (ceb:Seehorn (bukid), sv:Seehorn (berg), Seehorn (Q21865708) by User:EmausBot, a duplicate of the then existing Seehorn (Q19971449)), (ceb:Vorderer Kesselschneid, sv:Vorderer Kesselschneid, Vordere Kesselschneid (Q21877898) by User:EmausBot, a duplicate of the then existing Vordere Kesselschneid (Q872754), including a spelling error), (ceb:Schönbielhorn, sv:Schönbielhorn, Schönbielhorn (Q22698955) by User:GZWDer (flood), a duplicate of the then existing Schönbielhorn (Q1340383)), (ceb:Gerichtssäule, sv:Gerichtssäule, Großkrut Galgen (Q37900807) by User:AliciaFagervingWMSE-bot, a duplicate of the then existing Großkrut Galgen (Q21875285)). Having these triples makes it difficult to merge, as there are often conflicts with same sitelink. User:Lsjbot even creates duplicate articles on sv:WP and ceb:WP, which get imported here as duplicates too. I resolved this problem by redirecting on sv:WP (sv:special:contributions/Herzi_Pinki) and ceb:WP (ceb:special:contributions/Herzi_Pinki). Mountains at the border having different names on both sides are created as a mountain here and a mountain there, instead of the same thing with two different names (The error is already in the source geonames, I initially tried to fix such problems on geonames too, fearing that otherwise the junk will get reimported from geonames, I also tried to fix errors on OSM, originating from geonames or / and wikidata). GIGO. Bots here and bots there. 
ceb:Angerwald (bukid sa Unggriya) & ceb:Angerwald (bukid sa Awstriya) (both imported by User:EmausBot), en:Maloško Poldne & de:Mallestiger Mittagskogel Mallestiger Mittagskogel (Q1887525) by user:Sk!dbot and User:Legobot the same day; ceb:Plamorder Spitze (in Austria) & :ceb:Plamorderspitze (in Italy).
I feel that imports from sv and ceb should be prohibited and all imported stuff has to be either checked or deleted. Coordinates from geonames are often only to the precision of minutes, which makes nice patterns (Query), but does not tell the truth. elevations of mountains often differ by some hundred meters as they seem to be derived from some geo-model - wrong coordinates - wrong elevation. Close to the border not even the country of a geographical object is reliable. WD is considered to be one of the most important data backbones of the world, thus I feel it is our damned obligation to care for correctness and reliability as much as possible. Better no data than incorrect data. --Herzi Pinki (talk) 21:05, 9 May 2020 (UTC)
The Geonames import via Cebu Wikipedia did create a lot of duplicate entries and ambiguous entries. Most of the duplicates have already been merged. The inherently ambiguous ones are still there; they are all easy to find because they link to the Cebu Wikipedia. It created a lot of work to merge the duplicates, and I was angry about the import, but most of the work is done. Since Wikidata is a database you can do a search for elevation data that differs from your more reliable source, import the correct data, and deprecate the incorrect values. All data sets have error rates. The difference with Wikidata is that it can be corrected. --RAN (talk) 19:54, 11 May 2020 (UTC)
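
The "search for elevation data that differs from your more reliable source" step could look roughly like this in Python, once both datasets are loaded as mappings from item ID to elevation in metres. Everything here is illustrative: the function name, the 50 m tolerance, the placeholder item IDs and the sample numbers are made up for the sketch.

```python
def find_elevation_mismatches(wikidata, trusted, tolerance_m=50.0):
    """Return (item, wikidata_value, trusted_value) where the two sources disagree.

    Items missing from Wikidata are skipped; they are additions, not corrections.
    """
    mismatches = []
    for qid, trusted_value in sorted(trusted.items()):
        current = wikidata.get(qid)
        if current is not None and abs(current - trusted_value) > tolerance_m:
            mismatches.append((qid, current, trusted_value))
    return mismatches


# Made-up sample values with placeholder IDs, only to show the shape of the data.
wikidata_elevations = {"Q-example-1": 2100.0, "Q-example-2": 1500.0, "Q-example-3": 980.0}
trusted_elevations = {"Q-example-1": 2438.0, "Q-example-2": 1504.0}

for qid, old, new in find_elevation_mismatches(wikidata_elevations, trusted_elevations):
    print(f"{qid}: Wikidata has {old} m, trusted source has {new} m")
```

The resulting list is what would then be reviewed, with the trusted value added and the old value set to deprecated rank rather than deleted.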
@-Animalparty: WD was not designed „being the backbone of Wikipedia articles“. Why do you think so? --Succu (talk) 19:50, 5 May 2020 (UTC)
Succu I used "Wikipedia articles" as shorthand for interlinks within the Wikimedia ecosystem. From Wikidata:Main Page: "Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others." I didn't say it was designed for such, but that appears to be the primary justification for its existence. -Animalparty (talk) 20:36, 5 May 2020 (UTC)
Selective reading from Wikidata:Main Page... --Succu (talk) 21:01, 5 May 2020 (UTC)
Okay, how about Wikipedia's Wikidata#Development_history, which cites Meta:Wikidata: "The development of the project is mainly driven by Wikimedia Deutschland and was originally split into three phases: 1) Centralising interlanguage links – links between Wikipedia articles about the same topic in different languages. 2) Providing a central place for infobox data for all Wikipedias. 3) Creating and updating list articles based on data in Wikidata." I fully admit I may not have 100% correct view of reality, but you can see how my view may have been shaped. What do you use Wikidata for? -Animalparty (talk) 21:53, 5 May 2020 (UTC)
Wikidata is also used on infoboxes on various Wikipedias, lots of projects are using listeria to find articles to translate, so a bit more than just interlinks. Now, outside of the movement, there is also inventaire.io. I stumbled on this post for The Visual Novel Database (Q16646841), and OpenStreetmap do use Wikidata as well, cf this key). If you search a bit more, you can see the Library of Congress is working on reusing Wikidata (cf this post). So yeah, there is more than Google/Amazon (and even, vocal assistants do help lots of people who can't use a keyboard, so even if it was helping just vocal assistants, I think it would be good). --Misc (talk) 20:07, 11 May 2020 (UTC)
"Wikidata is also used on infoboxes on various Wikipedias". I was not expecting such a self-defeating argument. The fact that it is used on infoboxes is not progress, it is a plague. The Wikidata policy of tolerating data without any source is in direct contradiction with the principle that on Wikipedia an assertion must be referenced, with an independent reliable source (i.e. in particular, independent from another wiki). This use of imported infoboxes creates circular references, and especially so when new inexperienced contributors import into Wikidata thousands of statements from Wikipedia in some language (and very often... from the Wikipedia infobox!). Sapphorain (talk) 20:42, 11 May 2020 (UTC)
I do not really understand your reasoning. If the data in Wikipedia is always referenced as you seem to imply, then importing from Wikipedia shouldn't be a problem. If it is not always referenced, then importing from Wikidata through an infobox wouldn't be much different from writing it in Wikipedia in the first place, especially since there is already a high level of vandalism and problems on Wikipedia. Sure, Wikidata permits larger scale vandalism, because one only has to add incorrect information in one place and it gets displayed in several places. But this also has the inverse benefit: it is easier to fix in several places, there are more people to watch, and there is no need to update the information several times. Also, pretty bold to come on Wikidata to say that using the content produced by volunteers is not progress, I wonder where the Wikipedia community gets its bad reputation, really. --21:48, 11 May 2020 (UTC)
Then let me be more specific. What happens all the time is that a contributor installs an infobox on some Wikipedia page, through a template that directly imports data from Wikidata, even if it is referenced with zero source. (In passing: the infobox often provides a false impression of respectability to the data it contains; in passing also: there is no way to put a refnec template on one of these data items). Then, another contributor re-imports into Wikidata the data contained in the Wikipedia infobox, with a nice reference, say: "imported from Wikipedia in Spanish". Clearer? Sapphorain (talk) 10:23, 12 May 2020 (UTC)

X numeric user ID (P6552)

X numeric user ID (P6552) can't be used as main value so shouldn't it be moved to just "numeric ID" property and used also with other websites which also has optional numeric ID? Eurohunter (talk) 13:08, 10 May 2020 (UTC)

  • why should it? --- Jura 05:08, 11 May 2020 (UTC)
    • @Jura1: I don't know. It's just an idea/question. Twitter is not the only website with numeric ID and I wonder if for example Genius should be treated in the same way. Eurohunter (talk) 21:54, 11 May 2020 (UTC)
      • @Eurohunter: seems like a reasonable proposal but I would suggest making a new property instead of changing the existing one. One issue is that we will no longer be able to enforce the format of the ids and people may put the wrong content into the field. The reason we did this for Twitter was because the external identifiers (usernames) are not permanent identifiers (accounts can get renamed). I'm unsure if that's relevant for many other services. BrokenSegue (talk) 05:09, 12 May 2020 (UTC)
    • external-id properties are for external-ids .. I think the complete opposite approach is actually preferable. --- Jura 07:25, 12 May 2020 (UTC)

What to do?

I have found some empty items. I don't understand why they exist! What should be done? Reuse them, delete them or leave them empty? This is nonsense! --151.49.56.189 06:49, 12 May 2020 (UTC)

You can look at the page history and try to guess why they were blanked. Options would be to restore a previous version, merge with another item, or list for deletion at Wikidata:Requests for deletions. Ghouston (talk) 07:45, 12 May 2020 (UTC)
What Ghouston said, but to be explicit: Never reuse items. ChristianKl 09:02, 12 May 2020 (UTC)

Loading Description field

Trying to load new records in Wikidata using QuickStatements. While most properties got loaded, we encountered problems when loading the “Description” field using QuickStatements. We also noticed that unlike the other properties, this Description field does not come with a Wikidata Property ID, that starts with “P”. Could this be the cause of the loading problems? If so, how can we load our “Description” info. in Wikidata?

Labels, aliases, descriptions and sitelinks are not properties. https://www.wikidata.org/wiki/Help:QuickStatements#Adding_labels,_aliases,_descriptions_and_sitelinks explains how to add them with QuickStatements. ChristianKl 11:22, 12 May 2020 (UTC)
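
As a rough illustration of the answer above: in the QuickStatements (version 1) text format, descriptions are added with a `Dxx` command, where `xx` is the language code, and labels and aliases with `Lxx` and `Axx`. The helper name `qs_text_command` and the item ID below are placeholders invented for this sketch, and values containing double quotes or tabs are simply rejected rather than escaped.

```python
def qs_text_command(qid: str, kind: str, lang: str, text: str) -> str:
    """Build one tab-separated QuickStatements v1 line.

    kind is 'L' (label), 'A' (alias) or 'D' (description).
    This is a hypothetical helper, not part of QuickStatements itself.
    """
    if kind not in {"L", "A", "D"}:
        raise ValueError("kind must be 'L', 'A' or 'D'")
    # Assumption: we refuse characters that would need escaping
    # instead of guessing how the tool escapes them.
    if any(ch in text for ch in '"\t\n'):
        raise ValueError("text must not contain quotes, tabs or newlines")
    # Text values are wrapped in double quotes; fields are tab-separated.
    return "\t".join([qid, f"{kind}{lang}", f'"{text}"'])


# Placeholder item ID, used for illustration only.
print(qs_text_command("Q4115189", "D", "en", "test description"))
```

The resulting lines can be pasted into the QuickStatements batch window alongside ordinary property statements.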

Hide and Show or Expand

I suggest that the list of links to each project, shown in its own pane on every item, be collapsed by default.

For example:

Wikipedia (147 entries) [Expand]

(147!)

Only links to other Wikimedia projects (such as Meta and so on) would be expanded by default (to see which projects have links).

With this layout there would be space for other things, such as making it more aesthetic by adding each project's logo in a small format:


  Wikipedia (147 entries) [Expand]


--BoldLuis (talk) 12:00, 12 May 2020 (UTC)

Typo

Hi!

I've made a typo when creating Shailaja Pujari (Q94098026). I've named the item Shailaja Oujari but her name's Shailaja Pujari and was wondering how to go about getting it changed. Thanks! --Suonii180 (talk) 12:55, 12 May 2020 (UTC)

Against any label or statement you should see an "edit" button. You can edit the label and publish your change. From Hill To Shore (talk) 13:54, 12 May 2020 (UTC)
(ec) The labels here are editable, I just edited the item.--Ymblanter (talk) 13:56, 12 May 2020 (UTC)

List of possible duplicates (due to database breakage)

Hello all,

During the recent database breakage, while the table containing sitelinks was temporarily unavailable, a lot of duplicate items got created. Several editors worked on fixing the issue, thanks a lot!

From our side, we ran a script to find the possible remaining duplicates. In this csv file, you will find in the first column items that failed to save because a sitelink already existed in another item, and in the second and following columns, the items that already have this sitelink. Not all of these necessarily result from the recent breakage, but the file should provide a good basis to help you check and fix the remaining duplicates. The file contains 2700 lines.

I hope this can be helpful. Thanks for your understanding and sorry for the delay - the script took several days to run. If you need anything else from us, let me know. Cheers, Lea Lacroix (WMDE) (talk) 12:37, 12 May 2020 (UTC)

It looks like most created since the breakage were from QuickStatements, for example batch 28721, with "100% (2153) of 2164 done, 11 errors" - according to EditGroups, 2320 items were created. Peter James (talk) 13:38, 12 May 2020 (UTC)
Such issue was reported in Topic:Vkhd578n4cv9ndew.--GZWDer (talk) 18:13, 12 May 2020 (UTC)

Dropping legacy javascript variables

Hey, raw global variables in JavaScript have been deprecated since 2014 and have been producing deprecation warnings ever since. See phab:T72470 for more details. I think it makes sense to pull the plug after six years (and it's already disabled on mediawiki.org). Search says they are used in some user scripts, but besides that they are not used anywhere else (they were used in one place, but I fixed it). Do you think it's okay to disable them on Monday? Also keep in mind that this doesn't violate Wikidata:Stable Interface Policy, as JavaScript code is explicitly considered unstable. (Clarification: I'm doing this in my volunteer capacity and it has nothing to do with my day job.) Amir (talk) 15:23, 8 May 2020 (UTC)

Batch QID lookup

I have several tens of thousands of statements almost ready to run through QuickStatements for adding family names to persons. But they still need the QIDs of the family names looked up. They're in the form "Alexander Axel Spearman (Q76172777) family name (P734) Spearman" for example. Even though I have a fairly big list in a Microsoft Access file that can do some of the looking up, it only has about half of the necessary family names. Is there a Wikidata tool that can do it -- for instance, is this a job for Mix'n'Match? Levana Taylor (talk) 21:50, 11 May 2020 (UTC)

@Levana Taylor: Try running a query like this, modifying the VALUES line as needed (the query service can deal with hundreds of these at a time surprisingly well). This is solely based on the English-language labels, and so you'll need to double-check that there's a one-to-one mapping. For a somewhat higher reliability, you could swap out rdfs:label for wdt:P1705 (native label), though this might end up leaving out some incompletely-specified surnames. The second approach could also correctly handle non-English surnames, though I'm not sure if you need those for your application.
SELECT ?item ?label WHERE {
  VALUES ?label {"Spearman"@en "Taylor"@en}
  ?item rdfs:label ?label.
  ?item wdt:P31 wd:Q101352.
}
Vahurzpu (talk) 01:40, 12 May 2020 (UTC)
That works excellently, thanks! However, one question about SPARQL syntax, because I'm a SPARQL newbie -- how do I add lcase to the string search to make it case-insensitive? (Lists of Scottish surnames generally don't know whether to write MacDaniel or Macdaniel.) Levana Taylor (talk) 03:08, 12 May 2020 (UTC)
If you only have a few possibilities, you can run them one at a time. Otherwise, it may require using FILTER (slow, but perhaps feasible if there aren't too many instances of Q101352), or use the search engine somehow. Ghouston (talk) 04:10, 12 May 2020 (UTC)
@Ghouston: I think I am not understanding what you're saying ... I have thousands of lookups to do, so no, I'm not running them one at a time, either with a query or a search; and yes, there are way too many Q101352 to chug through them with a slow method. Luckily, the query that Vahurzpu provided is very speedy, and all I have to do is use a text processor to zap my list of names into the form "Radcliffe"@en "Ramsay"@en "Rankine"@en etc. and paste chunks of it into the query. So far I've tried it with 700 at one time and it spits out the result in seconds. What I'm wondering, though, is how to make the search case-insensitive so that if the list contains (e.g.) "Macdaniel" the query will return MacDaniel (Q36999146). If your answer was in fact a reply to that question, I apologize for not understanding! Levana Taylor (talk) 08:19, 12 May 2020 (UTC)
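
The text-processing step described above (turning a list of names into `"Radcliffe"@en "Ramsay"@en ...` and pasting chunks into the query) could be sketched in Python roughly as follows. The function names and the chunk size of 500 are assumptions made for this illustration; the query text is the one Vahurzpu posted.

```python
QUERY_TEMPLATE = """SELECT ?item ?label WHERE {{
  VALUES ?label {{ {values} }}
  ?item rdfs:label ?label.
  ?item wdt:P31 wd:Q101352.
}}"""


def values_chunks(names, lang="en", chunk_size=500):
    """Yield strings such as '"Radcliffe"@en "Ramsay"@en', one per chunk."""
    # Deduplicate, drop empty entries and sort for reproducible output.
    cleaned = sorted({n.strip() for n in names if n.strip()})
    for start in range(0, len(cleaned), chunk_size):
        chunk = cleaned[start:start + chunk_size]
        # Escape backslashes and double quotes for a SPARQL string literal.
        escaped = (n.replace("\\", "\\\\").replace('"', '\\"') for n in chunk)
        yield " ".join(f'"{n}"@{lang}' for n in escaped)


def build_queries(names, lang="en", chunk_size=500):
    """Return one complete query string per chunk of names."""
    return [QUERY_TEMPLATE.format(values=v)
            for v in values_chunks(names, lang, chunk_size)]


for query in build_queries(["Ramsay", "Radcliffe", "Rankine"], chunk_size=2):
    print(query, end="\n\n")
```

Each printed query can be pasted into the query service as is; the results still need the one-to-one check mentioned earlier in the thread.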
Well, if what you want is a simple modification to the query above, to make it run just as fast but case insensitive, I don't believe it's possible. I'm sure somebody else will jump in if I'm wrong. Ghouston (talk) 08:51, 12 May 2020 (UTC)
@Levana Taylor: As Ghouston says, RDF matching (which is what the above SPARQL query does) is based on literal strings and cannot be made case-insensitive. The best approach I have found is to adjust your input strings so that you list both "Macdaniel" and "MacDaniel", for instance, in the VALUES statement; whichever one matches will show up in your results list. However, there are fancier things that can be done, as suggested, using filters with SPARQL functions or regular expressions; it may be better, though, to tie in the built-in Wikidata search interface, which is case-insensitive and also insensitive to accents and other Unicode variations. This is a little complicated, so I recommend Wikidata:Request a query for that. ArthurPSmith (talk) 18:11, 12 May 2020 (UTC)
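
A minimal sketch of the "list both spellings" approach, assuming the only case ambiguity of interest is the letter following a "Mac" or "Mc" prefix. The function name `case_variants` is made up for this example, and it will over-generate for names such as "Mack"; the extra variant simply fails to match anything.

```python
import re


def case_variants(name):
    """Return the name plus its Mac/Mc capitalisation variants, sorted."""
    variants = {name}
    match = re.match(r"^(Mac|Mc)([A-Za-z])(.*)$", name)
    if match:
        prefix, first, rest = match.groups()
        # Add both "MacDaniel" and "Macdaniel" style spellings.
        variants.add(prefix + first.upper() + rest)
        variants.add(prefix + first.lower() + rest)
    return sorted(variants)


names = ["Macdaniel", "Taylor"]
expanded = [v for n in names for v in case_variants(n)]
# Format the expanded list for the VALUES line of the query.
print(" ".join(f'"{v}"@en' for v in expanded))
```

Because both spellings go into VALUES, the query stays an exact-match lookup and keeps its speed; the cost is only a somewhat longer input list.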

──────────────────────────────────────────────────────────────────────────────────────────────────── OK, thanks for the explanation! It's not very important; I'm already having to look the data over by eye, to make sure all the "surnames" my text-processing pulled out are really surnames, so I can fuss with the case all I need to. I'm going through a bunch of "Mac"s right now. BTW, now I've got everything set up it's going really speedy. 300 new family name items created today, 1200 surname statements added, a lot more tomorrow. Levana Taylor (talk) 05:11, 13 May 2020 (UTC)

@Levana Taylor: This should allow you to extract most of what you need (just remove LIMIT 1). Please be careful with people who have more than one name and with surnames that appear just once. --- Jura 07:04, 13 May 2020 (UTC)
Yup, I already caught some typos by checking out the hapax legomena -- the families MacNeil and MacNair nearly lost members to the new families "MacNell" and "MacVair." Also there are a few duplicates caused by the automatic duplicate detector not being able to spot when someone enters a right single quote rather than a straight apostrophe. I ought to find most of those. As for "more than one name," when I go through the list of processed names, I throw out any where I'm not sure whether it's a double surname, and also ones that are complicated for other reasons. Someone else can deal with those, but there are hundreds of thousands of low-hanging fruit. Levana Taylor (talk) 14:20, 13 May 2020 (UTC)

P571: date before

How can I enter a date in the form "before year YYYY"? I was able to find "before" and "after" as qualifiers for additional delimitation, but not a way to enter this in the main value. JiriMatejicek (talk) 12:49, 12 May 2020 (UTC)

@JiriMatejicek: you can set the date to a low-precision value like "1980s", and add a latest date (P1326) qualifier. Ghouston (talk) 06:56, 13 May 2020 (UTC)
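
For batch edits, the suggestion above might be written as a QuickStatements (version 1) line along the following lines. This is only a sketch: it assumes the usual time syntax `+YYYY-MM-DDT00:00:00Z/precision`, with precision 8 for a decade and 9 for a year, and the helper names and the item ID are placeholders.

```python
# Wikibase time precisions used in this sketch.
PRECISION = {"decade": 8, "year": 9}


def qs_time(year: int, precision: str) -> str:
    """Format a year as a QuickStatements time value with a precision suffix."""
    return f"+{year:04d}-00-00T00:00:00Z/{PRECISION[precision]}"


def inception_before(qid: str, decade_start: int, latest_year: int) -> str:
    """Inception (P571) at decade precision, qualified by latest date (P1326)."""
    return "\t".join([
        qid,
        "P571", qs_time(decade_start, "decade"),
        "P1326", qs_time(latest_year, "year"),
    ])


# Placeholder item: "founded in the 1980s, at the latest in 1985".
print(inception_before("Q4115189", 1980, 1985))
```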

Right removal?

I wonder if this removal is right or should be discussed on the discussion page. --Romulanus (talk) 13:18, 6 May 2020 (UTC)

  • I don't think the constraint makes much sense. There are plenty of cases where we know that someone holds a position but don't know when they started. ChristianKl 18:09, 6 May 2020 (UTC)
    • Why not use an « unknown value » in this case, or an approximate date? This could prevent the person from illegitimately disappearing from some queries because the property is absent and « optional » was not used in SPARQL, for example. In any case we know there IS a value, even if it's not known. author TomT0m / talk page 19:33, 6 May 2020 (UTC)
    • I advocate using an approximate date. In any case we know at least the millennium. --Infovarius (talk) 22:15, 6 May 2020 (UTC)
    • I have been using "unknown value" when start time (P580) is not known because of the constraint, but if anyone can put it in and take it out without discussion... What is its purpose? I think a constraint like this, which affects millions of items, should not be put in or taken out so lightly. For the time being, I've undone the edit so that its relevance can be discussed. --Romulanus (talk) 07:51, 7 May 2020 (UTC)
      • My understanding has always been that "unknown value" means that something is generally unknown (i.e. there are no sources containing the information), rather than just that a single editor doesn't happen to currently know it. In the latter case, it should simply be left blank, so that someone else can fill it in later. --Oravrattas (talk) 08:26, 13 May 2020 (UTC)
  • For what it's worth, start times were originally added as mandatory constraints without being discussed, I think, so it's a bit moot either way. I've never been completely sure it's a good idea to make it mandatory rather than suggested (but I'm not sure of the detailed reasoning). At the moment, there are about a million P39 claims (https://w.wiki/QM3), only about half of which have a P580 qualifier (https://w.wiki/QM5). So it's not being treated as a mandatory qualifier... Andrew Gray (talk) 11:46, 9 May 2020 (UTC)
  • I think the constraint makes sense. The fact that there are plenty of cases where we don't know the start time doesn't justify removing the constraint. Think about it from this perspective: we know that for every position held, there exists a start time. The whole point of the constraint is to bring attention to cases where this isn't present so they can be investigated. If a value can't be sourced, then add an "unknown value" to reflect this. --SilentSpike (talk) 09:26, 13 May 2020 (UTC)

Open Data and Open Sensors

Hello everybody,

can we find publicly available sensors through Wikidata? I tried to find sensor instances, yet nothing came up. Maybe I made a mistake.

I am imagining some kind of catalogue in which publicly available sensors are collected and their corresponding REST endpoints are listed. This would foster the reuse of open data. If Wikidata is not the right place to search for this, does anybody have a hint where best to look?

Thank you, best regards!  – The preceding unsigned comment was added by 77.185.98.234 (talk • contribs).

@77.185.98.234: Items can be made on Wikidata for particular devices, if there are external sources that describe them and if somebody is interested in creating them. There are some items for systems-on-a-chip, like Qualcomm Snapdragon S1 MSM7225 (Q18669450), which can then be linked from devices that use them. Ghouston (talk) 01:20, 14 May 2020 (UTC)

Your data isn't rdfxml?

Hi, I downloaded the items Summer Glau (Q236854), Karen Gillan (Q231237), George Washington (Q23) as .rdf files with

wget http://www.wikidata.org/wiki/Special:EntityData/$.rdf

When I try to load the graph with python3-rdflib version 4.2.2-2 in Python 3.7.3, the following error occurs (example):

rdf:nodeID value is not a valid NCName: 6aad7017843afb03b069981cb0e0e5a7

Same with raptor version 2.0.14:

rapper -c Q23.rdf

rapper: Parsing URI file:///home/pi/NLP/Q23.rdf with parser rdfxml

rapper: Error - URI file:///home/pi/NLP/Q23.rdf:5672 - Illegal rdf:nodeID value 6aad7017843afb03b069981cb0e0e5a7

All of this works with Douglas Adams (Q42), your "showcase". Which made me think I could use these libraries.

--Botfiddler (talk) 16:30, 13 May 2020 (UTC)

I can reproduce this; the issue seems to be that the rdf:nodeID values start with a number. The W3C RDF validator complains about this, making me think it's a violation of the spec.
For a workaround, you can use one of the non-RDF/XML serializations, like Turtle. If you want to load the Q42 graph, you can use:
import rdflib
G = rdflib.Graph()
G.parse('https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl', format='turtle')
RDFLib seems to do fine with those. Vahurzpu (talk) 16:53, 13 May 2020 (UTC)
"making me think it's a violation of the spec" - Indeed. According to https://www.w3.org/TR/rdf-syntax-grammar/ and https://www.w3.org/TR/REC-xml/ the value of rdf:nodeID should be an NCName, which may not have a digit as the initial character. See also Wikidata:Project_chat/Archive/2020/04#Blank_node_deprecation_in_WDQS_&_Wikibase_RDF_model and Wikidata:Contact_the_development_team/Query_Service_and_search#Blank_node_deprecation_in_WDQS_&_Wikibase_RDF_model where it is proposed to assign URIs to all current blank nodes (at least for the WDQS, and hopefully also for EntityData). Bovlb (talk) 23:27, 13 May 2020 (UTC)
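
To see which node IDs in a downloaded file trip the parsers, a quick check like the one below may help. It is only an ASCII approximation of the NCName rule (the real production also allows many non-ASCII letters), and the function names are invented for this sketch.

```python
import re

# ASCII-only approximation of an XML NCName: starts with a letter or
# underscore, followed by letters, digits, '.', '-' or '_'.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def looks_like_ncname(value: str) -> bool:
    """True if value passes the simplified NCName check."""
    return _NCNAME_ASCII.fullmatch(value) is not None


def bad_node_ids(rdfxml_text: str, limit: int = 5):
    """Return up to `limit` rdf:nodeID values that fail the check."""
    ids = re.findall(r'rdf:nodeID="([^"]*)"', rdfxml_text)
    return [i for i in ids if not looks_like_ncname(i)][:limit]


# The ID from the error message starts with a digit, so it is rejected.
print(looks_like_ncname("6aad7017843afb03b069981cb0e0e5a7"))
```

This would also explain why some entities parse and others do not: a file only fails when at least one of its hash-like node IDs happens to begin with a digit.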

Specific results

How can I get a list of person pages on hiwiki / sawiki that don't have a page on enwiki but have the same date of birth as a person page on enwiki that doesn't have a page on hiwiki? It's just an idea for finding duplicates between the two. They might be good merge candidates. The final decision can be left to Wikidata editors.  – The preceding unsigned comment was added by Capankajsmilyo (talk • contribs) at 10:02, 4 May 2020‎ (UTC).

@Capankajsmilyo: I had a quick play with this, but couldn't come up with a query that didn't time out. You might have better luck at Wikidata:Request_a_query. Bovlb (talk) 14:48, 14 May 2020 (UTC)

Data about topics covered by a publication

Suppose we want to store data about topics covered by articles of a periodical, pages of a website, chapters of a book etc. See for example atarimagazines.com: it lists all contents of each issue of historical magazines, and many projects like that exist (at least about computing). Is such data in project scope? What would be the correct data structure? --Bultro (talk) 13:50, 7 May 2020 (UTC)

When I look at Special:RecentChanges, I always see lots of work being done adding main subject (P921) to our many scholarly articles (Q13442814). (See for example these edits: [30] [31] [32].) So that's one possibility. —Scs (talk) 14:22, 7 May 2020 (UTC)
If the individual articles don't merit items of their own, you can use applies to part (P518). - Jmabel (talk) 15:54, 7 May 2020 (UTC)
They're unlikely to merit items for each article; I meant how to store indexes for whole publications. Suppose we're talking about Compute! (Q4036474); issue #1 has these contents, with articles about Atari 8-bit family (Q249075), Commodore PET (Q946661), etc. What could be done exactly? An item for each issue? --Bultro (talk) 00:18, 8 May 2020 (UTC)
main subject (P921), with qualifier applies to part (P518). - Jmabel (talk) 01:29, 8 May 2020 (UTC)
Can you make a clear example? --Bultro (talk) 23:49, 8 May 2020 (UTC)
@Bultro: Give me a clear prose example of what you are trying to say about some particular item, and I will model it for you. - Jmabel (talk) 19:06, 9 May 2020 (UTC)
I am trying to say (I don't know in what item) that issue #1 of Compute! contains these topics, and so on for many other issues. That may be just a list of topics represented by other items (like Atari 8-bit family (Q249075), Commodore PET (Q946661), etc.) or even a complete list of articles, with title, author, page.
I think this kind of data has never been stored in Wikidata before, so it may just be an idea --Bultro (talk) 13:39, 10 May 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── We have Compute! (Q4036474) but as far as I can see we have no modeling at all for individual issues of the magazine. So the first step is to create an item (let's call it Q-ISSUE) for that issue. And, now that I think it through further, what I wrote above won't work, because our approach to statements isn't robust enough for what I was thinking, and to model this right you'd have to model down to individual articles having items. So something like:

⟨ Q-ISSUE ⟩ has part(s) (P527)   ⟨ Q-ATARI-ARTICLE-ITEM ⟩
⟨ Q-ATARI-ARTICLE-ITEM ⟩ main subject (P921)   ⟨ Atari 8-bit family (Q249075) ⟩

etc. - Jmabel (talk) 16:20, 10 May 2020 (UTC)

Or better yet

⟨ Q-ATARI-ARTICLE-ITEM ⟩ published in (P1433)   ⟨ Q-ISSUE ⟩
⟨ Q-ATARI-ARTICLE-ITEM ⟩ main subject (P921)   ⟨ Atari 8-bit family (Q249075) ⟩

-- Jmabel (talk) 16:28, 10 May 2020 (UTC)
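
To make the proposed model concrete, here is a small sketch that treats the statements as plain (subject, property, value) triples and recovers the topics of an issue from them. The item names are the placeholders used above; the second article, `Q-PET-ARTICLE-ITEM`, is a hypothetical addition for illustration, and nothing here talks to Wikidata.

```python
# Statements following the "published in" + "main subject" model.
statements = [
    ("Q-ATARI-ARTICLE-ITEM", "P1433", "Q-ISSUE"),  # published in
    ("Q-ATARI-ARTICLE-ITEM", "P921", "Q249075"),   # main subject: Atari 8-bit family
    ("Q-PET-ARTICLE-ITEM", "P1433", "Q-ISSUE"),    # hypothetical second article
    ("Q-PET-ARTICLE-ITEM", "P921", "Q946661"),     # main subject: Commodore PET
]


def topics_of_issue(issue, triples):
    """Collect the main subjects of every article published in the issue."""
    articles = {s for s, p, o in triples if p == "P1433" and o == issue}
    return sorted(o for s, p, o in triples if p == "P921" and s in articles)


print(topics_of_issue("Q-ISSUE", statements))
```

The table of contents is thus never stored on the issue item itself; it is derived by following published in (P1433) backwards, which is why each article needs its own item under this model.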

That makes sense. But if we always have to create a separate item for every article of a magazine, or chapter of a book etc., it gets pretty heavy.
Perhaps a "secondary subject(s)" property? Or something like has part(s) (P527), but allowing for string values? --Bultro (talk) 10:17, 14 May 2020 (UTC)

a) Obviously any work x, which is published in (P1433) any work y must have the same publication date (P577) as work y. Therefore I would like to propose to ensure, that the values for publication date (P577) will be automatically transferred from y to x (and protected against manual changes), if published in (P1433) is used.

b) Seemingly any work x, which is edition or translation of (P629) any work y must have the same author (P50) as work y. If this is true (as I guess), then the value for author (P50) could be automatically transferred from y to x (and protected against manual changes), if edition or translation of (P629) is used.

--Nstrc (talk) 14:08, 10 May 2020 (UTC)

  • That would apply to first publication, or to one particular publication but we seem to use published in (P1433) for things like poems, which can certainly have more than one significant "publication" (e.g. first appearance is in a magazine, later collected in a volume of that poet's poetry, maybe in a major anthology). - Jmabel (talk) 16:53, 10 May 2020 (UTC)
Arguing in this way confuses work level and edition level: "first appearance is in a magazine, later collected in a volume of that poet's poetry, maybe in a major anthology" are different editions of the same work:
  • The poem's "first appearance [...] in a magazine" has of course the same publication date (P577) as the issue of the magazine, in which it is published.
  • The poem's second appearance "in a volume of that poet's poetry" has of course the same publication date (P577) as that volume.
  • etc.
--Nstrc (talk) 17:52, 10 May 2020 (UTC)
a) "are you saying that you created confusion or that I did?"
Sorry, I said: You did.
b) "published in (P1433) gives as a Wikidata property example (P1855)."
Maybe this works for poems. - But even in case of poems I would say:
I would like to know not only that To a Creole Lady (Q3576923) is published in Les Fleurs du mal (Q216578).
Rather I would like to know, that To a Creole Lady (Q3576923) is published
* in Q23890844 (Portuguese edition) on page X,
* in Q17358083 (French edition) on page Y
and
* in Q23890857 (German edition) on page Z.
I.e.: To a Creole Lady (Q3576923) is a WORK; this work has MANY EDITIONS, and each of these editions has a CERTAIN publication date (P577).
c) I'm interested in examples like this:
In Karl Marx (Q9061) is stated: "occupation (P106): journalist (Q1930187)". This statement is proved by: Q56605416 (article about Karl Marx <Q56605416> within the Encyclopædia Britannica [15th edition, 23. print] <= Q56605405>).
Of course, in any case
  • the article about Karl Marx at issue (Q56605416)
must have the same publication date (P577) as
  • Encyclopædia Britannica [15th edition, 23. print] (Q56605405), isn't it?[1]
d) In other words: There can be two kinds of items:
  • items, which list all editions of a certain work,
and
  • items which belong to a CERTAIN edition. I'm interested in the latter case.

References

  1. Unfortunately, in the example at issue
    * neither in Q56605416
    * nor in Volume 7 (Q56605405)
    publication date (P577) is mentioned.
    But: If
    * in Volume 7 (Q56605405)
    a publication date (P577) would be mentioned, this date could be automatically transferred into
    * Q56605416. Isn't it?
--Nstrc (talk) 00:29, 11 May 2020 (UTC)

"if it's published in something that's published in parts such as volumes or a periodical"

a) Obviously any work x, which is published in (P1433) any work y must have the same publication date (P577) as work y. Therefore I would like to propose to ensure, that the values for publication date (P577) will be automatically transferred from y to x (and protected against manual changes), if published in (P1433) is used. b) [...] --Nstrc (talk) 14:08, 10 May 2020 (UTC)

  • I suppose a) is true where the work is published in another single work, which has a single publication date, although not if it's published in something that's published in parts such as volumes or a periodical. E.g., the P1433 on Q88368736. For b) occasionally an additional author may be added to later editions of a work. I found an odd case recently of a book where the first edition was by Gonzalez and Wintz [33] but the 2nd by Gonzalez and Woods [34]. I wasn't sure what to put on the work item Digital image processing (Q88534836), but I settled for listing all three authors.[1]
Ghouston (talk) 03:54, 11 May 2020 (UTC)

a) Regarding the example Q88368736: From my point of view there should be:
  • not only a work-level item "Nature Medicine" (Q1633234)
but also
  • an edition-level item "Nature Medicine Vol. 26, (2020)" (Qxxxxxx)
and
  • if applicable, a second, more precise edition-level item "Nature Medicine Issue 16 March 2020" (Qyyyyyy),
and consequently Q88368736 would be published in (P1433) Qxxxxxx or Qyyyyyy respectively. Then what I propose would work: Q88368736 would necessarily have the same publication date (P577) as Qxxxxxx (2020) or Qyyyyyy (16 March 2020) respectively.
b) Cfr.:
--Nstrc (talk) 07:42, 11 May 2020 (UTC)
  • @Nstrc: You cannot force this constraint for recent electronic publication; articles in most scholarly journals are now published online just about every day of the week, so (if the journal chooses to do this) their publication date may be quite different from a nominal date associated with the "issue" they are in. We also generally do not add Wikidata items for individual issues of journals, nor even for their yearly or whatever schedule volumes. Example: Transformation electronics: Tailoring the effective mass of electrons (Q21708333) was published 8 October 2012 (as can be seen from the journal page here), in issue 16 of Volume 86 of Physical Review B (Q2284414); the issue has a publication date of 15 October 2012 as can also be seen from the journal page (and we have no items for that issue nor even of that volume of the journal). ArthurPSmith (talk) 14:20, 11 May 2020 (UTC)
  • "You cannot force this constraint for recent electronic publication"
My main request is not the constraint ("protecting against manual changes"). Rather, my main point is to ensure that the date of publication is inserted automatically.
Again: Obviously any work x which is published in (P1433) any work y must have the same publication date (P577) as work y! Of course, if no date of publication is given in the item for y, then the date can't be transferred.
But if the date is given in the item for y, then it must be the same for x. (There is no difference between book and journal, printed and digital, earlier times and today.)
(However, if there are two or more editions of y with different dates of publication, then it is necessary to create different edition items of x and y as well.)
  • "We also generally do not add Wikidata items for individual issues of journals"
But that's a pity. Tables of contents of individual journal issues could be informative.
But even if it is not allowed to create such items, it only follows that my proposal for automation would not work for journal articles. It would still work for books.
This example does not disprove my argument. Transformation electronics: Tailoring the effective mass of electrons (Q21708333) is published in (P1433) Physical Review B (Q2284414). Of course, the entire journal has no date of publication (only a date of inception (P571)). Therefore what I propose could not cause any mistake.
--Nstrc (talk) 15:10, 11 May 2020 (UTC)
@Nstrc: But the ISSUE publication date in this case IS different from the article publication date, so if as you suggest there should be items for each issue, then your assertion would be in error. However if you are limiting this assertion to books then I suppose I don't have a problem with that. ArthurPSmith (talk) 01:31, 13 May 2020 (UTC)

Proposal for publication date (P577) regarding papers, which are published in books

@ArthurPSmith: Okay, in the case of Transformation electronics: Tailoring the effective mass of electrons (Q21708333) the paper was published online on 8 October 2012. But it is classified as part of "Vol. 86, Iss. 16", which itself was published on 15 October 2012. So my proposal to use the issue's date of publication as the date of publication of the paper as well would not work. I'm convinced, as far as such journals are concerned.

However, for texts (text (Q234460)) and subclasses of texts which are published in books, my proposal would still work and would make items easier to complete. What would be necessary to realise that proposal?

--Nstrc (talk) 08:20, 13 May 2020 (UTC)
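
Editorial sketch: the proposal above (copy publication date (P577) from the containing book to the contained text) could be prototyped as a function that only suggests values and never overwrites anything. This is a minimal sketch over in-memory data, not a Wikidata bot; the item IDs (`Qbook`, `Qchapter`, and so on) and the dictionary layout are assumptions made for illustration.

```python
def propose_publication_dates(items):
    """Suggest P577 values for items that lack one but name a dated container via P1433.

    `items` maps an item ID to a dict of property ID -> list of values
    (a hypothetical in-memory layout, not the Wikidata JSON format).
    """
    proposals = {}
    for item_id, claims in items.items():
        if claims.get("P577"):
            continue  # never overwrite a date that is already present
        containers = claims.get("P1433", [])
        if len(containers) != 1:
            continue  # no container, or an ambiguous one: leave it to a human
        container_dates = items.get(containers[0], {}).get("P577", [])
        if len(container_dates) == 1:
            proposals[item_id] = container_dates[0]
    return proposals


# Hypothetical data: a chapter in a dated book, and an article in an undated journal.
example_items = {
    "Qbook": {"P577": ["1977"]},
    "Qchapter": {"P1433": ["Qbook"]},
    "Qjournal": {},
    "Qarticle": {"P1433": ["Qjournal"]},
    "Qdated": {"P1433": ["Qbook"], "P577": ["1976"]},
}
proposals = propose_publication_dates(example_items)
```

Because a journal item without P577 yields no suggestion, the journal case raised by ArthurPSmith is skipped rather than filled in wrongly; it would only go wrong if issue-level items with their own dates existed.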

Different authors of different editions

a) [...]
b) Seemingly any work x which is an edition or translation of (P629) any work y must have the same author (P50) as work y. If this is true (as I guess), then the value for author (P50) could be automatically transferred from y to x (and protected against manual changes) whenever edition or translation of (P629) is used.
--Nstrc (talk) 14:08, 10 May 2020 (UTC)

For b) occasionally an additional author may be added to later editions of a work. I found an odd case recently of a book where the first edition was by Gonzalez and Wintz [35] but the 2nd by Gonzalez and Woods [36]. I wasn't sure what to put on the work item Digital image processing (Q88534836), but I settled for listing all three authors.
Ghouston (talk) 03:54, 11 May 2020 (UTC)
I would propose the following:
Work level (1 item): without authors.
Edition level (2 items): 1st edition with authors Gonzalez and Wintz / 2nd edition with authors Gonzalez and Woods. (Both items are editions of the work level-item [but not editions of each other].)
--Nstrc (talk) 08:14, 11 May 2020 (UTC)
That would be no luck for Gonzalez then ;)
Maybe I should co-author a reedition of a work by .. --- Jura 11:06, 11 May 2020 (UTC)
  • There's no such thing as "protecting against manual changes" in Wikidata. It's not part of the toolbox. From my perspective, however, I wouldn't see an issue with a bot that copies over values (and that allows users to fix possible exceptions). ChristianKl 11:21, 11 May 2020 (UTC)

Works that have different authors depending on the edition seem closer to a series of works, which could be linked by derivative work (P4969). I would be very much opposed to @Nstrc:'s proposal to have no author on the work, as that would mean that for any operation requiring the author of a work, we would need to infer it from its associated editions, and then handle the (not so) special case where a work has no known edition in Wikidata. -- Maxlath (talk) 12:09, 11 May 2020 (UTC)

"works that have different authors depending on the edition seem closer to a series of works, which could be linked by derivative work (P4969)."
Sounds reasonable. So, we would have:
  • two work level-items: Digital image processing (1st edition with authors Gonzalez and Wintz) and Digital image processing (2nd edition with authors Gonzalez and Woods).
  • 1st edition has derivative work (P4969) 2nd edition
and
isn't it?
--Nstrc (talk) 12:27, 11 May 2020 (UTC)
based on (P144) almost seems too weak in this case, but yes, it would work like that. The problem is that the names "1st edition" and "2nd edition" might lead to bad interpretations, and to people changing the instance of (P31) of those items to version, edition or translation (Q3331189) (which I also wish was possible to lock). -- Maxlath (talk) 12:44, 11 May 2020 (UTC)

I invite people to look at Bayesian Data Analysis (Q29167237), a text book, with 3 editions and where the author list have been extended: Bayesian Data Analysis, first edition (Q33020057), Bayesian Data Analysis, Second Edition (Q33019051) Bayesian Data Analysis, Third Edition (Q29167245). I think it is not uncommon that textbooks may change authors between editions. — Finn Årup Nielsen (fnielsen) (talk) 13:01, 11 May 2020 (UTC)

Thank you. Regarding Bayesian Data Analysis, first edition (Q33020057): Shouldn't "first edition" be moved from the description to the label, and/or "work level" be added to the label of Bayesian Data Analysis (Q29167237), so that the difference between Bayesian Data Analysis (Q29167237) (work level) and Bayesian Data Analysis, first edition (Q33020057) (edition level) is clearer?
--Nstrc (talk) 13:30, 11 May 2020 (UTC)
@Ghouston: In any case Woods is not the author of Digital image processing with publication date (P577) "1977". You should create a further item for the later edition.
--Nstrc (talk) 13:30, 11 May 2020 (UTC)
Multiple edition items are needed because details vary from one version to the next. This can even include the title (especially for translations, but also in other cases), and as seen, the authors, as well as all the obvious properties like publication date, publisher, and external identifiers. So the question then is what the "work" items are for. Firstly, they provide a place to link all the edition items together, if there is more than one. Secondly, it's a way of "summarizing" all the editions into a single abstract "work", so that we can produce a list of e.g. Charles Darwin's works that includes a single entry for On the Origin of Species (Q20124), instead of listing hundreds of different versions of it. But it seems that we aren't exactly sure how to create this summarised version when editions each have different information. In the case of Digital image processing, we would potentially want this work to be listed once for each of its authors, not e.g. twice for one author and not at all for another. We may also attach some significance to certain details, if known, such as the date when a manuscript was completed, in its first version, and the date of first publication. Ghouston (talk) 02:28, 12 May 2020 (UTC)
Then there are sometimes authors on particular editions, such as translators and those who write introductions and commentaries, who we don't really want to put on the "work" item, although we perhaps want to recognise their contribution as a "work" somehow when listing the works for those people. Ghouston (talk) 02:34, 12 May 2020 (UTC)
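
Editorial sketch: one way to read the "summarising" described above is to derive the work-level author list as the union of the authors of all editions, so that each work is listed exactly once per author. The code below is a minimal illustration on plain Python data; the data layout and function names are assumptions, and it deliberately ignores edition-only contributors such as translators.

```python
def work_authors(editions):
    """Union of the authors over all editions of one work, in first-seen order."""
    seen = []
    for edition in editions:
        for author in edition["authors"]:
            if author not in seen:
                seen.append(author)
    return seen


def works_by_author(works):
    """Map each author to the works they contributed to, listing each work once."""
    index = {}
    for title, editions in works.items():
        for author in work_authors(editions):
            index.setdefault(author, []).append(title)
    return index


# The example discussed in this thread, reduced to surnames.
works = {
    "Digital image processing": [
        {"edition": "first", "authors": ["Gonzalez", "Wintz"]},
        {"edition": "second", "authors": ["Gonzalez", "Woods"]},
    ],
}
author_index = works_by_author(works)
```

With this reading Gonzalez gets one entry for the work rather than two, and Wintz and Woods each get one, which matches the listing behaviour asked for above.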


Digital image processing (Q88534836) needs (at least) a second item!

"In the case of Digital image processing, we would potentially want this work to be listed once for each of its authors, not e.g. twice for one author and not at all for another. We may also attach some significance to certain details, if known, such as the date when a manuscript was completed, in its first version, and the date of first publication."
Ghouston (talk) 02:28, 12 May 2020 (UTC)

Again: From my point of view, you must create (at least) two different items:

  • 1977 version by Gonzalez and Wintz (about 450 pages)
  • version by Gonzalez and Woods (about 650 pages) ("Publication date[:] 2002" 1).
  • Wintz is the author of one version (about 450 pages / 1977); Woods is likewise the author of one version (about 650 pages / "2002" [?]).
Nevertheless, Gonzalez is indeed the author of both versions (1977 and [2002?]).

Cfr.: Maxlath (talk) 12:09, 11 May 2020 (UTC).
--Nstrc (talk) 07:01, 12 May 2020 (UTC)

  • Edition items should be created, but to me it seems that they are editions of the same work, and if we were listing works by Gonzalez, it would only be included once. The same applies to Bayesian Data Analysis (Q29167237) where the authors change between editions.
Ghouston (talk) 07:51, 12 May 2020 (UTC)
  • Bayesian Data Analysis
    • work level: Bayesian Data Analysis (Q29167237) ("...37")
    • edition level, 1995: Bayesian Data Analysis, first edition (Q33020057) ("...57")
    • edition level, 2003: Bayesian Data Analysis, Second Edition (Q33019051)
  • Digital image processing
    • work level: Digital image processing
    • edition level, 1977: by Gonzalez and Wintz (about 450 pages)
    • edition level, 2002 (?): by Gonzalez and Woods (about 650 pages)

--Nstrc (talk) 08:23, 12 May 2020 (UTC)

  • I have no substantial comment except to confirm that this is a difficult issue for Wikidata to address. The meta:Wikicite project tries to talk this through, but no one has produced satisfying modeling at WikiProject Source Metadata or anywhere else to manage this recurring issue. Wikicite focuses on academic publications to sidestep most issues of edition or revision of the same work. Books are complicated! Blue Rasberry (talk) 20:50, 14 May 2020 (UTC)

Perceptions of Wikidata reuse -- Research invite

I'd like to talk to Wikidata editors about their knowledge of how Wikidata is used on other wikis and outside of the Wikimedia community. Furthermore, I'm interested in understanding how newcomers learn about Wikidata's policies and guidelines. If you'd be interested in meeting with me virtually for a ~30 minutes interview, please complete a brief questionnaire. For more details, see the description of our study posted on meta: m:Research:Perception on rules, values and motivation structure of Wikidata contributors. Thanks! Chuankaz (talk) 16:23, 11 May 2020 (UTC)

@Chuankaz: I think you might need to change the form settings. When I click on it, I get a note saying "This form can only be viewed by users in the owner's organization." Vahurzpu (talk) 19:18, 11 May 2020 (UTC)
@Vahurzpu: I have fixed the link and it's now working. Sorry for the inconvenience! Chuankaz (talk) 05:04, 14 May 2020 (UTC)

Deleted items being recreated over and over again

There are non-notable items that have been nominated for deletion and deleted more than 15 times, only for the person in question to recreate them a couple of days later using a different IP address.


There has to be a better way to deal with this.@Fralambert: --Trade (talk) 23:26, 12 May 2020 (UTC)

@Trade: The usual way to deal with such situations is the title blacklist. Its effect is similar to full create protection. But this approach might not scale well given how many such items we might have in the future. We might need something more powerful.--Jasper Deng (talk) 00:24, 13 May 2020 (UTC)
Where is it located? And what do you mean by 'it might not scale well' and 'we need something more powerful'?--Trade (talk) 00:27, 13 May 2020 (UTC)
@Trade: MediaWiki:Titleblacklist. Having every prohibited title on this one page might be unwieldy.--Jasper Deng (talk) 00:29, 13 May 2020 (UTC)
I agree that the title blacklist (along with salting) is the usual approach on other projects, but I don't think that works well here, because the new pages have different QIDs but the same label, and the QID is the title of the page so far as the software is concerned. We could create a service that searches the labels (and, perhaps, aliases) of deleted items, and this could support a gadget that annotates a new pages report, or identifies trends. Unfortunately, we normally consider the contents of deleted pages to be (by default) secret because of the concerns that underlie revision deletion. Bovlb (talk) 00:41, 13 May 2020 (UTC)
I believe title blacklist prohibits item titles from being that in any language, but I could be wrong now. The WMF's legal department does not want anyone having access to deleted material without a community (!)vote, last time I checked.--Jasper Deng (talk) 00:54, 13 May 2020 (UTC)
Anything that blocks creation based on title/label alone is a bad idea for a database. We may decide that Person X born on 01-01-2000 is non-notable but we would want to allow the creation of an entry for the notable Person X who was born on 01-07-1800. Unless the software is intelligent enough to check individual statements, this solution could do more harm than good. From Hill To Shore (talk) 02:46, 13 May 2020 (UTC)
@Jasper Deng: I wasn't aware that the title blacklist applied to labels. I cannot see anything in the documentation or the source code. Regarding access to deleted material, perhaps we can craft a compromise whereby we build a special search engine from deleted material, but don't allow general browsing.
@From Hill To Shore: Agreed, although if we're seeing the same item deleted repeatedly, then a time-limited label block might be nice. Bovlb (talk) 04:36, 13 May 2020 (UTC)
Title blacklist can not handle labels. AbuseFilter should be used instead.--GZWDer (talk) 06:28, 13 May 2020 (UTC)
A qid can't really be re-created ;) --- Jura 06:36, 13 May 2020 (UTC)
@GZWDer: I'm actually rather sure Titleblacklist does handle labels. @Jura1: We are talking about repeated recreation of non-notable items with the same label but different QID's (so we cannot just blacklist one particular QID).--Jasper Deng (talk) 09:12, 13 May 2020 (UTC)
@Jasper Deng: No. I'm able to create testwikidata:Q212224, which is a title banned in global title blacklist.--GZWDer (talk) 11:46, 13 May 2020 (UTC)
Okay, it seems I remembered wrong. We can indeed use abuse filter for this, but as brought up before, it is not going to be always practical to blanket prohibit a particular title. Abuse filter can check other things like added claims, but then those would have to be added immediately at the time of item creation (not really possible when manually creating an item, which is the most common use case for such a restriction), and it still leaves open hijacking of existing items with the same title for a different, non-notable person of the same name.--Jasper Deng (talk) 19:24, 13 May 2020 (UTC)
If items are usually created by unregistered or new users, AbuseFilter can disallow addition of names depending on user status, but it may be better to allow creation of items if there are usually sitelinks. If there are existing items with the same name they probably need more people watching them, and if necessary items can be protected or IP addresses can be blocked. Peter James (talk) 21:24, 13 May 2020 (UTC)
That's unlikely to provide a sustainable solution. This is problematic as there is no way to determine at item creation time the exact subject of a new item without identifiers, which can only be added at item creation if done (semi-)automatically. Special:CreateItem may need to be modified to allow for that.--Jasper Deng (talk) 21:29, 13 May 2020 (UTC)
Sometimes they can be determined without identifiers, it depends on the name (and sometimes the description) and there are not always identifiers. Peter James (talk) 22:33, 13 May 2020 (UTC)
How about having a bot that automatically deletes such items when they are created by users that aren't autoconfirmed and have no edits from autoconfirmed users? ChristianKl 08:28, 14 May 2020 (UTC)
I instead proposed to introduce a bot to track spam patterns and report any discovery to a new dedicated page. Not only should edits be reverted or deleted, but also users should be blocked.--GZWDer (talk) 08:35, 14 May 2020 (UTC)

Oppose to decide this via chat. Please use a full RfC. MrProperLawAndOrder (talk) 16:46, 14 May 2020 (UTC)
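
Editorial sketch: the thread above raises two constraints, namely that a label alone must not be enough to block or flag an item (From Hill To Shore's "Person X" example) and that a service over deleted items would have to work from stored records. The fragment below illustrates a label-plus-statement comparison on hypothetical records; the field names, the record format, and the idea of flagging for review after creation (rather than blocking) are all assumptions, not an existing Wikidata or AbuseFilter feature.

```python
def normalize(label):
    """Case-fold a label and collapse runs of whitespace."""
    return " ".join(label.casefold().split())


def matches_deleted(new_item, deleted_records):
    """True only if label AND date of birth both match a previously deleted record."""
    date_of_birth = new_item.get("date_of_birth")
    if date_of_birth is None:
        return False  # a label alone is not enough to identify the subject
    key = (normalize(new_item["label"]), date_of_birth)
    known = {(normalize(r["label"]), r.get("date_of_birth")) for r in deleted_records}
    return key in known


# Hypothetical record of a deleted, non-notable item.
deleted_records = [{"label": "Person X", "date_of_birth": "01-01-2000"}]
```

A new "Person X" born on 01-07-1800 would not be flagged, while a recreated "person  x" born on 01-01-2000 would be. As noted in the discussion, such statements are often missing at creation time, so a check like this could only run once they have been added.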

Blocking policy: partial block by default where shared namespaces are not involved

The Wikidata:Blocking policy currently doesn't appear to mention partial blocks at all. In cases where a user is blocked because of actions in local namespaces (like Wikidata: or User:, namespaces that can't be accessed from other wikis) while having made constructive contributions in the shared namespaces, perhaps the user should by default not be blocked from editing the shared namespaces. That would allow them to focus on other projects unimpaired. Does this sound reasonable? Alexis Jazz please ping me if you reply 20:15, 13 May 2020 (UTC)

@Alexis Jazz: It depends on the situation, but on a very high level I do agree it's best to restrict access only as needed. Allowing someone to address themselves at WD:AN is always good if possible. That said, if it is personal attacks I'm inclined to do a full block even if it's only on discussion pages just because we can't have that as a community. Contrast this with edit warring on one particular item, which can be addressed by blocking editing of only that item.--Jasper Deng (talk) 20:47, 13 May 2020 (UTC)
@Jasper Deng: Can Wikidata claim the shared namespaces as 100% their own? Don't they belong to all the projects (including Wikidata itself), together? And if someone is blocked due to personal attacks, what would be the rationale to stop them from contributing constructively to the shared namespaces? Because that block would also hinder that user when editing, say, eswiki, even if eswiki has no problem with that user. And if a user can contribute constructively to the shared namespaces, does it serve the project to block them from doing so? Doesn't it ultimately amount to a form of punishment if the goal is not to protect the project from harm? I know some users can be toxic sometimes, and I understand how you feel about them (and you're not alone), but without that toxicity, all that's left is just a user. Alexis Jazz please ping me if you reply 21:22, 13 May 2020 (UTC)
Then they would be unable to discuss any edits that others take issue with due to being unable to edit discussion pages. Editing here means participating in the community; by contraposition, if said participation is disruptive, you can't edit here. The community as I know it has been rather defensive of its "home territory", that is, they tend to eschew the application of other projects' local policies here. You can always try to change that, but I don't consider it likely to change.--Jasper Deng (talk) 21:27, 13 May 2020 (UTC)
@Jasper Deng: They may (if blocked from every namespace that isn't shared) indeed be unable to discuss any edits, but then.. maybe they don't even want to? Maybe they just want to add P18 to an item that doesn't have P18 yet so a photo shows up on its page on Wikispecies? Maybe they just want to add an article they translated to the interwiki links? And if anyone reverts them they'll just ping/mail/chat/anything an admin in case of vandalism and shrug otherwise. Maybe they're done discussing things and just want to focus on another project, but other projects depend on Wikidata. Contributing here doesn't always mean participating in the community, not for everyone. Alexis Jazz please ping me if you reply 22:49, 13 May 2020 (UTC)
@Alexis Jazz: No, if someone has a problem with their edits, the burden is on them to defend such edits. They need to be offered a fair opportunity to discuss in that case. But if they prove themselves unable to discuss such issues well, then their edits here are a net negative. It is not possible to disentangle even the most mundane-looking edits (what if someone thinks there's a better value for P18 and the two can't agree?). This is a collaborative project like any other wiki, you cannot decouple editing from participation in the community.--Jasper Deng (talk) 23:29, 13 May 2020 (UTC)
@Jasper Deng: What if they don't want to defend their edits? What if they just wanted there to be a value for P18 where there was none, and they will yield to anyone who thinks there is an even better picture? What if they don't even bother looking at their watchlist and disable notifications? Alexis Jazz please ping me if you reply 23:42, 13 May 2020 (UTC)
Let me rephrase. They have to, or they might continue making similar edits that are problematic. All users have to be able to explain their edits to someone else who is confused about them even in cases where those are in fact good edits. Asking millions of rhetorical questions will not help advance your argument, you're unlikely to change my view on this. It's up to you to get consensus from others.--Jasper Deng (talk) 23:44, 13 May 2020 (UTC)
I'm speechless. Alexis Jazz please ping me if you reply 23:54, 13 May 2020 (UTC)
You gave me the idea that you were open to exchanging arguments. You gave some arguments against, I gave some in favor, and we could both consider our position. But I see only now that wasn't how you saw it. Alexis Jazz please ping me if you reply 01:11, 14 May 2020 (UTC)
Allowing someone to contribute to shared namespaces means trusting them to provide more value than the trouble they cause. It's a frequent criticism that we already have too much vandalism in Wikidata, and your proposal is basically to allow people whom we distrust enough to ban to vandalize our data. ChristianKl 06:43, 14 May 2020 (UTC)
It's about people you don't trust to contribute without disruption in the local namespaces but who haven't disrupted the shared namespaces. Alexis Jazz please ping me if you reply 14:52, 14 May 2020 (UTC)
I don't trust any person I banned to not disrupt the item namespace. From the Wikidata perspective there's no huge distinction between local and shared. The item namespace is governed by Wikidata policy. ChristianKl 20:01, 14 May 2020 (UTC)

Best way to check for redirects en masse

So I'm running a website that's a catalogue of video games and I use Wikidata as my main data source. The problem is, I don't currently have a way to detect redirected Wikidata items, which occasionally leads to duplicate items. (For example, let's say someone creates a new item for an upcoming Call of Duty game, then I run an import, then someone else creates the same game again and ends up merging the original item into their new item. Now when I go to import again, it'll create a second database entry in my site for that game, as I have no way of detecting the redirect right now.)

Essentially, I'd like a way to efficiently check ~35000 Wikidata item IDs to see if any are redirects. What'd be the best way to do that?

Thanks, Nicereddy (talk) 04:59, 14 May 2020 (UTC)
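
Editorial sketch: one possible approach to the question above is the MediaWiki action API on www.wikidata.org, where `action=query` with `redirects=1` reports which of the requested titles are redirects and where they point. The sketch below batches item IDs (50 titles per request is the limit for ordinary accounts, so about 700 requests for 35000 IDs) and collects the `from`/`to` pairs. The function names and the User-Agent string are placeholders chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # titles per request for accounts without higher API limits


def chunked(ids, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_params(batch):
    """Query parameters asking the API to resolve redirects for one batch."""
    return {
        "action": "query",
        "format": "json",
        "redirects": 1,
        "titles": "|".join(batch),
    }


def parse_redirects(response):
    """Map each redirected ID to its target, from one decoded API response."""
    entries = response.get("query", {}).get("redirects", [])
    return {entry["from"]: entry["to"] for entry in entries}


def find_redirects(ids, user_agent="redirect-check-sketch/0.1 (placeholder contact)"):
    """Return {old_id: new_id} for every ID in `ids` that is now a redirect."""
    redirects = {}
    for batch in chunked(ids):
        url = API_URL + "?" + urllib.parse.urlencode(build_params(batch))
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(request) as reply:
            redirects.update(parse_redirects(json.load(reply)))
    return redirects
```

Calling `find_redirects` with the stored IDs before each import would give a mapping from merged-away IDs to their targets, which the importer could use to update existing rows instead of creating duplicates.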

Curid

I have seen on Meta that a "Link by ID" entry appears in the side pane (on the right). Clicking it on a template page goes to meta.wikimedia.org/?curid=204985. Can it be used to create a Q item in Wikidata, and then to easily link to the same template in other sister projects (Wikipedia, Wiktionary...)? Much needed.--BoldLuis (talk) 09:48, 14 May 2020 (UTC)

Improved (Wikidata) item creation in sister projects

  Applause. I have seen in Wiktionary lateral pane to create a Wikidata item. Well done!!!!:

Tools

    What links here
    Related changes
    Upload file
    Special pages
    Permanent link
    Page information
    Wikidata item

--BoldLuis (talk) 10:19, 14 May 2020 (UTC)

QuickStatements

Hello, I am not used to QuickStatements. I would like the exact code that would have been needed to make this and this, because this is the kind of thing I often do, and it would be useful for me to be able to do it on several items at the same time. I would appreciate an example to use as a model, please. Christian Ferrer (talk) 11:55, 14 May 2020 (UTC)

Hello. The help page for QuickStatements is at Help:QuickStatements. To add qualifiers and references you first write the statement, then any number of qualifier properties, then the reference properties: Q22703892 [tab] P225 [tab] "Sibogaster" [tab] P405 [tab] Q91510 [tab] P574 [tab] +1924-00-00T00:00:00Z/9 [tab] S248 [tab] Q94383917 [tab] S6184 [tab] Q1361864. To format a date, expand 1924 to 1924-00-00, make it a timestamp, +1924-00-00T00:00:00Z, and add the precision after a slash (9 for year, 10 for month, 11 for day), giving +1924-00-00T00:00:00Z/9. Reference properties are written like any other except that the P becomes an S. You can only add one reference per QuickStatements statement; repeat the statement with a different reference if you have another one. Levana Taylor (talk) 15:52, 14 May 2020 (UTC)
@Levana Taylor: Thank you, that's exactly what I asked for. I just made a test with Ophialcaea tuberculosa (Q2743163). I will now be able to do batch processing, thank you again. Christian Ferrer (talk) 20:21, 14 May 2020 (UTC)
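
Editorial sketch: for batch work, lines in the tab-separated format described above can be generated with a small script. The helper names below are made up for illustration; the only format rules used are the ones stated in the reply (tab-separated fields, quoted strings, `/9` for year precision, `S` instead of `P` for reference properties).

```python
def year_value(year):
    """Format a year as a QuickStatements time value with year precision (/9)."""
    return f"+{year:04d}-00-00T00:00:00Z/9"


def quoted(text):
    """Wrap a string value in double quotes."""
    return f'"{text}"'


def statement_line(item, prop, value, qualifiers=(), references=()):
    """Build one line: the statement, then its qualifiers, then the parts of one reference."""
    fields = [item, prop, value]
    for qualifier_prop, qualifier_value in qualifiers:
        fields += [qualifier_prop, qualifier_value]
    for reference_prop, reference_value in references:
        # Reference properties are written with S in place of P.
        fields += ["S" + reference_prop[1:], reference_value]
    return "\t".join(fields)


# Rebuilds the example line from the reply above.
line = statement_line(
    "Q22703892",
    "P225",
    quoted("Sibogaster"),
    qualifiers=[("P405", "Q91510"), ("P574", year_value(1924))],
    references=[("P248", "Q94383917"), ("P6184", "Q1361864")],
)
```

Looping over a list of (item, name, author, year) tuples and joining the resulting lines with newlines gives a block that can be pasted into QuickStatements as one batch.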

Use of the future tense to express preference vs. prediction

I recall from past experiences, or at least my memory of them, occasions where people, chiefly in England, have expressed their food preferences by saying something like: "I'll the soup", "I'll the fish", etc., leaving out what Americans would always put in, the "have" "try" or other verb to complete their sentence. Is such a practice familiar to anyone else, or am I misremembering old movies?  – The preceding unsigned comment was added by Scotchmacstra (talk • contribs) at 21:58, 14 May 2020 (UTC).

I lived in England for a number of years and I do not recall ever hearing such words completely omitted. You do hear the "have" unaspirated and liaised, as in "Eye-lav the soup." You might be better off asking your question on en:Wikipedia:Reference desk/Language. Bovlb (talk) 22:38, 14 May 2020 (UTC)

Hundreds of wrong ids on human items by tool edits

One example, Alexander the Alabarch (Q1243557) from Ancient Rome

  • 10:42, 12 February 2020, Thierry Caro (14,988 bytes, +353): Created claim: Deutsche Biographie ID (P7902): pnd137786980, #quickstatements; #temporary_batch_1581486542647 (Tag: QuickStatements [1.5]) [37]
    • human from 13th century
    • no real batch, hard to review, no source stated, but probably from mix-n-match
  • 18:42, 1 March 2020, Reinheitsgebot (15,340 bytes, +352): Created claim: CERL Thesaurus ID (P1871): cnp01170666, #quickstatements; mixnmatch:microsync for catalog 1640
    • this is the same person as for Deutsche Biographie, note that Deutsche Biographie website also links to CERL and CERL to them, but what is behind "mixnmatch:microsync" cannot be seen

This is one of many similar cases. @Kolja21, Epìdosis: can probably confirm. Is there a way to turn off mix-n-match for Deutsche Biographie?

In general, should more care be taken with mix-n-match? It seems to be very powerful but the syncing is very obscure. Casimir Dudevant (Q1047458) 1795-1871 receives IDs for François His (Q93930448) 1725-1803 :

  • 06:55, 12 February 2020, Thierry Caro (21,134 bytes, +353): Created claim: Deutsche Biographie ID (P7902): pnd137582773, #quickstatements; #temporary_batch_1581486542647 (Tag: QuickStatements [1.5])
  • 18:42, 1 March 2020, Reinheitsgebot (21,486 bytes, +352): Created claim: CERL Thesaurus ID (P1871): cnp01168365, #quickstatements; mixnmatch:microsync for catalog 1640

A related problem: Deutsche Biographie uses GND, i.e. IDs can change on merges in GND, and since it is an aggregator, it can lose humans when they are removed from its sources. Can mix-n-match handle ID changes and item removal from the source? MrProperLawAndOrder (talk) 11:29, 11 May 2020 (UTC)

@MrProperLawAndOrder: I confirm that the edits you report are due to Mix'n'match. Of course it is possible, and technically really simple (it can be done by @Magnus Manske:), to turn off the catalog of Deutsche Biographie. I agree about the need to be very prudent in using MnM. However, my impression, seeing the statistics of the catalog, is that the great majority of the matches have been performed not by humans but by the Auxiliary data matcher, so the problem probably lies there. Now the current problem is the following: if you remove wrong IDs from Wikidata, the match isn't automatically removed from MnM; so any user acting in good faith who goes to https://tools.wmflabs.org/mix-n-match/#/sync/1619 sees "825 connections here, but not on Wikidata" (the number shown now) and starts a QuickStatements batch to import them to Wikidata, which results in the disaster you correctly denounce.
My proposal is the following:
I hope I have been clear. Would you agree with this? --Epìdosis 12:22, 11 May 2020 (UTC) P.S. Obviously I again thank you very much for the great cleaning work you are doing!
Oh, I've not answered your last question: unfortunately on MnM big catalogs aren't updated periodically, so new entries in GND-Deutsche Biographie aren't added and old entries remain active; the only way to update a catalog is to create a new one and delete the old one. --Epìdosis 12:26, 11 May 2020 (UTC)
Final note: I would maybe move this thread to Property talk:P7902. --Epìdosis 12:27, 11 May 2020 (UTC)
@Epìdosis: 1) deactivation = OK; 2) I try to clean = OK; currently ~600 DtBio without GND with high error rate. 3) reactivation = not sure. If MnM does not remove items that have been removed from DtBio or are found under a new URL, then it will contain outdated information. MnM could be useful if from time to time it rescanned the DtBio website for the existence of IDs, but in case an ID is not found anymore, it should mark the entry as FETCH error; then we could use MnM for deprecating IDs in Wikidata. Otherwise we must find a way to do this in WD and then also find a way to fix it in MnM, so MnM gives us more work. 4) please leave the discussion here, maybe other properties have a similar problem. But maybe it can be archived at P7902 talk. MrProperLawAndOrder (talk) 15:42, 11 May 2020 (UTC)
Correct me if I am wrong, but my understanding is that all GND entries are eventually federated onto VIAF, then after curation onto ISNI. If so, where is the sense in using GND directly? LeadSongDog (talk) 14:42, 11 May 2020 (UTC)
LeadSongDog, DtBio and Sächsische Biografie use the GND as their IDs. Probably many more systems from Germany, Austria and Switzerland do that. So we are lucky: we can directly compare the IDs with the GND ID. Yes, GND ends up in VIAF, but it is much more stable; VIAF IDs change more often. Regarding ISNI: I have no idea how long it will take until each GND has an ISNI. GND/DNB is not even listed at http://www.isni.org/content/isni-registration-agencies . And I doubt the institutions from Germany, Austria and Switzerland will all switch to ISNI in the next 10 years. So, having the GND in WD is helpful. MrProperLawAndOrder (talk) 15:52, 11 May 2020 (UTC)
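
Editorial sketch: the direct comparison mentioned above could look like the following, assuming (as the examples in this thread suggest, but as an assumption nonetheless) that a Deutsche Biographie ID of the form `pnd…` is the GND ID with a `pnd` prefix. The item layout and all IDs in the usage example are made up.

```python
def gnd_from_dtbio(dtbio_id):
    """Return the GND part of a 'pnd...' Deutsche Biographie ID, or None for other forms."""
    prefix = "pnd"
    if dtbio_id.startswith(prefix):
        return dtbio_id[len(prefix):]
    return None


def dtbio_conflicts(item):
    """List P7902 values whose GND part is not among the item's GND ID (P227) values.

    IDs that do not start with 'pnd' are skipped because they cannot be checked this way.
    An item without any P227 value has all of its 'pnd' IDs flagged for review.
    """
    gnd_ids = set(item.get("P227", []))
    conflicts = []
    for dtbio_id in item.get("P7902", []):
        gnd = gnd_from_dtbio(dtbio_id)
        if gnd is not None and gnd not in gnd_ids:
            conflicts.append(dtbio_id)
    return conflicts


# Made-up item: one consistent ID and one that points at a different GND record.
example_item = {"P227": ["100000001"], "P7902": ["pnd100000001", "pnd100000002"]}
flagged = dtbio_conflicts(example_item)
```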
Thanks for your investigations. Apart from the lack of quality of the mass edits, imho no value should be added without a source and a date.[38] As a maintenance list these hits are useful. VIAF and ISNI have similar problems and we can help their algorithms by creating new items for the namesakes. --Kolja21 (talk) 16:35, 11 May 2020 (UTC)
After checking some more edits: The ones of Magnus look pretty good (some are better than VIAF) but Thierry's edits seem arbitrarily made without comparing the names or the dates of life. Maybe we should undo Thierry's edits and focus on the ones of Magnus. --Kolja21 (talk) 18:01, 11 May 2020 (UTC)
As a reminder please check the following:
That is where I look for mix-n-match errors that are easy to detect. Usually they are caused by CERL and GND/DNB conflations, and occasionally vandalism. These are easy to detect, other errors may be more subtle. --RAN (talk) 20:00, 11 May 2020 (UTC)
BTW, do we have a contact at CERL for reporting typos, like we just set up with The Peerage? --RAN (talk) 00:25, 12 May 2020 (UTC)

@Kolja21: Reinheitsgebot made the same errors with CERL that Thierry made with DtBio. @Richard Arthur Norton (1958- ): not sure what CERL is for. Just another set of bot generated items, as if VIAF, ISNI, the VIAF component national libraries (some as networks, e.g. anglo-LCCN and german-GND), the private business IDs and genealogy focused ids were not enough. I have never seen any new content on a CERL page. MrProperLawAndOrder (talk) 16:24, 13 May 2020 (UTC)

Indeed, there are a lot of edits to be checked. I try not only to delete the wrong IDs but also to add the missing items:
Every biographical article with part of (P361): Allgemeine Deutsche Biographie (Q590208) should have main subject (P921): person concerned. --Kolja21 (talk) 19:40, 13 May 2020 (UTC)

Ancient China - 15th century Germany conflation by same tools

https://www.wikidata.org/w/index.php?title=Q11153345&action=history

Complete nonsense. User:Thierry Caro and User:Reinheitsgebot seem not to fix. @Epìdosis: can MnM DtBio be deactivated? More bad than good. For DtBio all matching can be done via GND. MrProperLawAndOrder (talk) 23:28, 14 May 2020 (UTC)

@MrProperLawAndOrder: Asked, let's wait. --Epìdosis 07:28, 15 May 2020 (UTC)

Different OCLC control numbers for one and the same edition?

Actually there should be only one OCLC control number for each edition, isn't it?

But during working about editions of Louis Althusser's (Q184169) La revolución teórica de Marx I found a lot of different OCLC control numbers for one and the same edition (same year; same quantity of pages[1]; same publisher <Siglo XXI (Q6128174)>; same translator - and, if not otherwise mentioned, always published in Mexico City).

As well the counting of the editions is incoherent in same cases:

coherent counting incorrect counting?
year of publication edition year of publication

edition

OCLC control number OCLC control numbers
* 1967 (VIII, 206 p.) 1
318386974, 651425227, 31082416, 803201674, 1026253040[2]
* 1968 (xv, 206 p.) 2
corr. y aum[3]: 627200374, 802877125, 9102363

corr[4]: 911877087, 1024699146

??? 3
* 1969 (xv, 206 p.) 4
805611161, 318285888, 318250563, 1024618725, 911877166
* 1970 (XV, 206 p.) 5 1970 10
911985651, 243780616[5]; (without edition number 20196181: Madrid) 919768560
* 1971 (XV, 206 p.) 6 1971 („206 pages“): Buenos Aires 3
630484330, 1024598061, 633730540 298924802
* 1972 (XV; 206 p.) 7
634744089, 742494415[6] (1043381701 [without edition number, but with ISBN: 9682301661[7]])
1974 („xv, 206 p.“) 7
57398756: Buenos Aires
1973 (XV, 206 p.) 9
1024641549, 911877169, 802611379[8]
* 1974 (xiii, 206 p.) 8 1974 (XV, 206 p.) 12
964811041: Buenos Aires 1024540608, 892105572, 475691083, 634957496, 629782682, 801941032[9]

What shall I do?

References

  1. But apparently in same case the pages with roman numeral pagination are ignored.
  2. in addition: 916470083 („Los siguientes ejemplares son reimpresiones de la primera edición: ej. 3-4, (4a ed. – 1969); ej. 5: (5a ed. – 1970).“ [„The following copies are reprints of the first edition: copy 3-4 (4th edition – 1969); copy 5 (5th edition – 1970.“]
  3. „revised and expanded“.
  4. „revised“.
  5. In addition: 318449168 („xiii, 206 p.“), 920139261 („206 p.“); 991663784 („La revolución técnica de Marx“).
  6. In addition: 37308247, 933835560, 892268702, 881013776, 801875044 („206 p.“).
  7. http://cataleg.url.edu/search*cat/h(ocolc)1043381701.
  8. In addition: 933867024(„206 p.“).
  9. In addition: 919522669 („206 p.“).

--Nstrc (talk) 16:11, 12 May 2020 (UTC)

I found the same with Solaris (Q93864156). I just linked the lowest (oldest?) OCLC number, which was also linked on Open Library. I suppose adding multiple OCLC numbers would also be an option, but I'm not sure if it would be useful. Ghouston (talk) 01:08, 14 May 2020 (UTC)
Oh, this happens over and over and over -- it's because OCLC imports records from libraries which have different styles of cataloguing so OCLC's automatic matching can't recognize that they're the same edition. They may very well not be using the latest, most sophisticated data-processing but the latest, most sophisticated isn't perfect either. With VIAF, we record all the IDs so that they can hunt for duplicates in our data. I wonder if there's any point to doing that for OCLC editions. Two questions would be, would they ever consult Wikidata looking for corrections to make, and has there been a conversation with anyone on their staff? and if they merge, would they definitely keep the lowest number? If the answers are no and yes, then as Ghouson says, it would be OK to only record the lowest number. Personally, though, I'd put 'em all in but make sure the lowest is at the top of the list on the item page. Levana Taylor (talk) 14:18, 14 May 2020 (UTC)
Anyone knows, whether OCLC displays for all control numbers (of the same edition) the same libraries? Would be listing only one OCLC number at Wikidata sufficient, for providing acess to all copies the relevant edition?
@Ghouston:"I just linked the lowest (oldest?) OCLC number"
This would be, e.g., in the case of the 1971 edition of La revolución teórica de Marx not instructive:
  • 298924802 („206 pages“)
  • 630484330 (XV, 206 p.)
  • 633730540 (XV, 206 p.)
  • 1024598061 (XV, 206 p.)
Number 298924802 is the lowest for 1971, but just one of the cases, where the pages with roman numeral pagination are ignored.
--Nstrc (talk) 04:46, 15 May 2020 (UTC)

Business v. enterprise v. company

I'm really confused how to tag "for profit organizations" e.g. "Walmart". We have:

Also, the hierarchy of these doesn't make sense (which from my perspective would make this a minor detail)? Whereas I think the structure should maybe be company (Q783794) > commercial organization (Q21980538) > business (Q4830453) > enterprise (Q6881511).

Meanwhile at some point a lot of items on enwiki using the company template were marked as "companies" and then that got automatically upgraded to "business" (by someone [40]) which left a lot of broken items. It's all very confused and this point ought to be clear since a lot of our items fall into this category. BrokenSegue (talk) 22:44, 14 May 2020 (UTC)

  Notified participants of WikiProject Companies Something I'd also be interested on seeing clarification for --SilentSpike (talk) 22:57, 14 May 2020 (UTC)

The problem is, a business (Q4830453) is not necessarily a company (Q783794) (or any type of legal person (Q3778211)) and not every company (Q783794) conducts business (Q4830453) (see shelf company (Q2534287)). Thus, commercial organization (Q21980538) seems the best choice for a legal entity that actually conducts business (like Walmart). --MB-one (talk) 07:13, 15 May 2020 (UTC)

Lord Buckethead and Count Binface

We have two of each of these (Lord Buckethead (Q30249394) and Lord Buckethead (Q76131126), and Count Binface (Q76509760) and Count Binface (Q78931230) respectively). To add to the confusion, one of them has the name of one of the others as an alternative name.

Some merging and/or distinguising is probably in order, but its all rather confusing (see en:Lord Buckethead and en:Count Binface) and I'm not sufficiently familiar with the conventions here to do the job properly. There is some ambiguity about whether the two Lord Bucketheads are identical (they may have been portrayed by two different people), but there definitely seems to be only one Count Binhead (who confusingly, seems to have been portrayed by the same person as one of the incarnations of Lord Buckethead).

Perhaps we should make a distinction between the characters and the performers, as the same character may have been performed by multiple candidates, but only the natural person is the actual candidate. -- The Anome (talk) 09:20, 15 May 2020 (UTC)

This article seems to cover the details well. This is a bit of a mess for sure, raises interesting questions on how to model a satire character (inspired by a movie character) played by multiple persons who both ran in different real elections as the same character. I believe Lord Buckethead (Q76131126) is not the same person as the Lord Buckethead who ran in 2017 (comedian Jon Harvey), but is the candidate funded by the author of the original character from the movie (Lord Buckethead (Q30249394)) who forced the 2017 candidate to change to Count Binface. Count Binface (Q78931230) and Count Binface (Q76509760) should probably be merged and modelled as the character not the human (comedian Jon Harvey). --SilentSpike (talk) 10:31, 15 May 2020 (UTC)
One could just add the real name of the person as alias to Q76509760. --- Jura 10:52, 15 May 2020 (UTC)

foreword (Q1358138) and preface (Q670787) / said to be the same as (P460)

F.Y.I.:

Talk:Q1358138 and Talk:Q670787

--Nstrc (talk) 12:18, 15 May 2020 (UTC)

Hundreds of wrong ids on human items by tool edits

One example, Alexander the Alabarch (Q1243557) from Ancient Rome

  • 10:42, 12 February 2020, Thierry Caro (14,988 bytes, +353): Created claim: Deutsche Biographie ID (P7902): pnd137786980, #quickstatements; #temporary_batch_1581486542647 (Tag: QuickStatements [1.5]) [41]
    • the ID belongs to a human from the 13th century
    • no real batch, hard to review, no source stated, but probably from mix-n-match
  • 18:42, 1 March 2020, Reinheitsgebot (15,340 bytes, +352): Created claim: CERL Thesaurus ID (P1871): cnp01170666, #quickstatements; mixnmatch:microsync for catalog 1640
    • this is the same person as the one behind the Deutsche Biographie ID; note that the Deutsche Biographie website also links to CERL and CERL links back, but what lies behind "mixnmatch:microsync" cannot be seen

This is one of many similar cases. @Kolja21, Epìdosis: can probably confirm. Is there a way to turn off mix-n-match for Deutsche Biographie?

In general, should more care be taken with mix-n-match? It seems to be very powerful, but the syncing is very obscure. Casimir Dudevant (Q1047458), 1795-1871, receives IDs for François His (Q93930448), 1725-1803:

  • 06:55, 12 February 2020, Thierry Caro (21,134 bytes, +353): Created claim: Deutsche Biographie ID (P7902): pnd137582773, #quickstatements; #temporary_batch_1581486542647 (Tag: QuickStatements [1.5])
  • 18:42, 1 March 2020, Reinheitsgebot (21,486 bytes, +352): Created claim: CERL Thesaurus ID (P1871): cnp01168365, #quickstatements; mixnmatch:microsync for catalog 1640

A related problem: Deutsche Biographie uses GND, i.e. IDs can change when entries are merged in GND, and because it is an aggregator it can lose humans when they are removed from its sources. Can mix-n-match handle ID changes and the removal of an item from the source? MrProperLawAndOrder (talk) 11:29, 11 May 2020 (UTC)

@MrProperLawAndOrder: I confirm that the edits you report are due to Mix'n'match. Of course it is possible, and technically really simple (it can be done by @Magnus Manske:), to turn off the catalog of Deutsche Biographie. I agree about the need to be very prudent in using MnM. However, my impression from the statistics of the catalog is that the great majority of the matches have been performed not by humans but by the Auxiliary data matcher, so the problem probably lies there. The current problem is the following: if you remove wrong IDs from Wikidata, the match isn't automatically removed from MnM; so any user acting in good faith who goes to https://tools.wmflabs.org/mix-n-match/#/sync/1619 sees "825 connections here, but not on Wikidata" (the number showing now) and starts a QuickStatements batch to import them to Wikidata, which results in the disaster you rightly denounce.
My proposal is the following:
I hope I have been clear. Would you agree with this? --Epìdosis 12:22, 11 May 2020 (UTC) P.S. Obviously I again thank you very much for the great cleaning work you are doing!
Oh, I've not answered your last question: unfortunately on MnM big catalogs aren't updated periodically, so new entries in GND-Deutsche Biographie aren't added and old entries remain active; the only way to update a catalog is to create a new one and delete the old one. --Epìdosis 12:26, 11 May 2020 (UTC)
Final note: I would maybe move this thread to Property talk:P7902. --Epìdosis 12:27, 11 May 2020 (UTC)
@Epìdosis: 1) deactivation = OK; 2) I will try to clean = OK; currently ~600 DtBio IDs without GND, with a high error rate. 3) reactivation = not sure. If MnM does not remove items that have been removed from DtBio or are found under a new URL, then it will contain outdated information. MnM could be useful if from time to time it rescanned the DtBio website for the existence of IDs, but when an ID is no longer found, it should mark the entry as FETCH error; then we could use MnM for deprecating IDs in Wikidata. Otherwise we must find a way to do this in WD and then also find a way to fix it in MnM, so MnM gives us more work. 4) please leave the discussion here, maybe other properties have a similar problem. But maybe it can be archived at P7902 talk. MrProperLawAndOrder (talk) 15:42, 11 May 2020 (UTC)
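The rescan idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `rescan`, `deprecation_candidates` and the `still_exists` callback are hypothetical names, not part of Mix'n'match, and the second ID is made up. In practice the callback would query the source website; here it is replaced by a fixed set so the sketch runs offline.

```python
from typing import Callable, Dict, Iterable, List


def rescan(ids: Iterable[str], still_exists: Callable[[str], bool]) -> Dict[str, str]:
    """Label each catalogue ID as "ok" or "FETCH error" after a rescan."""
    return {i: ("ok" if still_exists(i) else "FETCH error") for i in ids}


def deprecation_candidates(status: Dict[str, str]) -> List[str]:
    """IDs that were not found any more and could be deprecated in Wikidata."""
    return sorted(i for i, s in status.items() if s == "FETCH error")


# Stand-in for a real lookup against the source website (assumption).
# pnd137786980 is quoted in this thread; pnd000000000 is invented for the example.
live_ids = {"pnd137786980"}
status = rescan(["pnd137786980", "pnd000000000"], lambda i: i in live_ids)
candidates = deprecation_candidates(status)
```

Whether such candidates should be deprecated automatically or only listed for human review is a separate decision that the sketch does not take.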
Correct me if I am wrong, but my understanding is that all GND entries are eventually federated onto VIAF, then after curation onto ISNI. If so, where is the sense in using GND directly? LeadSongDog (talk) 14:42, 11 May 2020 (UTC)
LeadSongDog, DtBio and Sächsische Biografie use the GND as their IDs. Probably many more systems from Germany, Austria and Switzerland do that. So we are lucky: we can directly compare the IDs with the GND ID. Yes, GND ends up in VIAF, but it is much more stable; VIAF IDs change more often. Regarding ISNI: I have no idea how long it will take until each GND has an ISNI. GND/DNB is not even listed at http://www.isni.org/content/isni-registration-agencies . And I doubt the institutions from Germany, Austria and Switzerland will all switch to ISNI in the next 10 years. So having the GND in WD is helpful. MrProperLawAndOrder (talk) 15:52, 11 May 2020 (UTC)
Thanks for your investigations. Besides the lack of quality of the mass edits, IMHO no value should be added without a source and a date.[42] As a maintenance list these hits are useful. VIAF and ISNI have similar problems, and we can help their algorithms by creating new items for the namesakes. --Kolja21 (talk) 16:35, 11 May 2020 (UTC)
After checking some more edits: The ones of Magnus look pretty good (some are better than VIAF) but Thierry's edits seem arbitrarily made without comparing the names or the dates of life. Maybe we should undo Thierry's edits and focus on the ones of Magnus. --Kolja21 (talk) 18:01, 11 May 2020 (UTC)
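A check of the kind mentioned above (comparing the dates of life before accepting a match) could look roughly like this. It is a minimal sketch, not how Mix'n'match or any bot actually works; the `Person` class, the `dates_compatible` function and the one-year tolerance are assumptions made for illustration. The two people are the Casimir Dudevant / François His case quoted earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    label: str
    birth_year: Optional[int]
    death_year: Optional[int]


def dates_compatible(a: Person, b: Person, tolerance: int = 1) -> bool:
    """Return False when both records give a year and the years differ by more than `tolerance`."""
    pairs = ((a.birth_year, b.birth_year), (a.death_year, b.death_year))
    for x, y in pairs:
        if x is not None and y is not None and abs(x - y) > tolerance:
            return False
    # Missing years never block a match here; a stricter rule may be wanted.
    return True


item = Person("Casimir Dudevant", 1795, 1871)
catalogue_entry = Person("François His", 1725, 1803)
plausible = dates_compatible(item, catalogue_entry)
```

A match that fails this test would be held back for human review instead of being written to the item.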
As a reminder please check the following:
That is where I look for mix-n-match errors that are easy to detect. Usually they are caused by CERL and GND/DNB conflations, and occasionally vandalism. These are easy to detect, other errors may be more subtle. --RAN (talk) 20:00, 11 May 2020 (UTC)
BTW, do we have a contact at CERL for reporting typos, like we just set up with The Peerage? --RAN (talk) 00:25, 12 May 2020 (UTC)

@Kolja21: Reinheitsgebot made the same errors with CERL that Thierry made with DtBio. @Richard Arthur Norton (1958- ): not sure what CERL is for. Just another set of bot generated items, as if VIAF, ISNI, the VIAF component national libraries (some as networks, e.g. anglo-LCCN and german-GND), the private business IDs and genealogy focused ids were not enough. I have never seen any new content on a CERL page. MrProperLawAndOrder (talk) 16:24, 13 May 2020 (UTC)

Indeed, there are a lot of edits to be checked. I try not only to delete the wrong IDs but also to add the missing items:
Every biographical article with part of (P361): Allgemeine Deutsche Biographie (Q590208) should have main subject (P921): person concerned. --Kolja21 (talk) 19:40, 13 May 2020 (UTC)
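The rule above can be turned into a maintenance query. The sketch below only builds the request URL for the Wikidata Query Service and does not send it; the constant and function names are illustrative, and the `LIMIT 100` is an arbitrary choice to keep the result small.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Articles that are part of (P361) Allgemeine Deutsche Biographie (Q590208)
# but have no main subject (P921) statement yet.
QUERY = """
SELECT ?article WHERE {
  ?article wdt:P361 wd:Q590208 .
  FILTER NOT EXISTS { ?article wdt:P921 ?subject . }
}
LIMIT 100
"""


def build_query_url(query: str) -> str:
    """Return a GET URL asking the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_query_url(QUERY)
```

The same query text can also be pasted directly into the query service's web interface.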

Ancient China - 15th century Germany conflation by same tools

https://www.wikidata.org/w/index.php?title=Q11153345&action=history

Complete nonsense. User:Thierry Caro and User:Reinheitsgebot do not seem to be fixing it. @Epìdosis: can MnM DtBio be deactivated? It does more harm than good. For DtBio all matching can be done via GND. MrProperLawAndOrder (talk) 23:28, 14 May 2020 (UTC)

@MrProperLawAndOrder: Asked, let's wait. --Epìdosis 07:28, 15 May 2020 (UTC)

Different OCLC control numbers for one and the same edition?

Actually, there should be only one OCLC control number for each edition, shouldn't there?

But while working on editions of Louis Althusser's (Q184169) La revolución teórica de Marx I found a lot of different OCLC control numbers for one and the same edition (same year; same number of pages[1]; same publisher <Siglo XXI (Q6128174)>; same translator; and, if not otherwise mentioned, always published in Mexico City).

The numbering of the editions is also inconsistent in some cases:

Coherent counting (year of publication | edition | OCLC control numbers):
  • 1967 (VIII, 206 p.) | 1 | 318386974, 651425227, 31082416, 803201674, 1026253040[2]
  • 1968 (xv, 206 p.) | 2 | corr. y aum[3]: 627200374, 802877125, 9102363; corr[4]: 911877087, 1024699146
  • ??? | 3 |
  • 1969 (xv, 206 p.) | 4 | 805611161, 318285888, 318250563, 1024618725, 911877166
  • 1970 (XV, 206 p.) | 5 | 911985651, 243780616[5]; (without edition number 20196181: Madrid)
  • 1971 (XV, 206 p.) | 6 | 630484330, 1024598061, 633730540
  • 1972 (XV; 206 p.) | 7 | 634744089, 742494415[6] (1043381701 [without edition number, but with ISBN: 9682301661[7]])
  • 1974 (xiii, 206 p.) | 8 | 964811041: Buenos Aires

Incorrect counting? (year of publication | edition | OCLC control numbers):
  • 1970 | 10 | 919768560
  • 1971 („206 pages“): Buenos Aires | 3 | 298924802
  • 1974 („xv, 206 p.“) | 7 | 57398756: Buenos Aires
  • 1973 (XV, 206 p.) | 9 | 1024641549, 911877169, 802611379[8]
  • 1974 (XV, 206 p.) | 12 | 1024540608, 892105572, 475691083, 634957496, 629782682, 801941032[9]

What shall I do?

References

  1. But apparently in some cases the pages with roman numeral pagination are ignored.
  2. In addition: 916470083 („Los siguientes ejemplares son reimpresiones de la primera edición: ej. 3-4, (4a ed. – 1969); ej. 5: (5a ed. – 1970).“ [„The following copies are reprints of the first edition: copy 3-4 (4th edition – 1969); copy 5 (5th edition – 1970).“])
  3. „revised and expanded“.
  4. „revised“.
  5. In addition: 318449168 („xiii, 206 p.“), 920139261 („206 p.“); 991663784 („La revolución técnica de Marx“).
  6. In addition: 37308247, 933835560, 892268702, 881013776, 801875044 („206 p.“).
  7. http://cataleg.url.edu/search*cat/h(ocolc)1043381701.
  8. In addition: 933867024 („206 p.“).
  9. In addition: 919522669 („206 p.“).

--Nstrc (talk) 16:11, 12 May 2020 (UTC)

I found the same with Solaris (Q93864156). I just linked the lowest (oldest?) OCLC number, which was also linked on Open Library. I suppose adding multiple OCLC numbers would also be an option, but I'm not sure if it would be useful. Ghouston (talk) 01:08, 14 May 2020 (UTC)
Oh, this happens over and over and over -- it's because OCLC imports records from libraries which have different styles of cataloguing, so OCLC's automatic matching can't recognize that they're the same edition. They may very well not be using the latest, most sophisticated data-processing, but the latest, most sophisticated isn't perfect either. With VIAF, we record all the IDs so that they can hunt for duplicates in our data. I wonder if there's any point in doing that for OCLC editions. Two questions would be: would they ever consult Wikidata looking for corrections to make (and has there been a conversation with anyone on their staff?), and if they merge, would they definitely keep the lowest number? If the answers are no and yes, then as Ghouston says, it would be OK to only record the lowest number. Personally, though, I'd put 'em all in but make sure the lowest is at the top of the list on the item page. Levana Taylor (talk) 14:18, 14 May 2020 (UTC)
Does anyone know whether OCLC displays the same libraries for all control numbers of the same edition? Would listing only one OCLC number at Wikidata be sufficient to provide access to all copies of the relevant edition?
@Ghouston: "I just linked the lowest (oldest?) OCLC number"
In the case of the 1971 edition of La revolución teórica de Marx, for example, this would not be instructive:
  • 298924802 („206 pages“)
  • 630484330 (XV, 206 p.)
  • 633730540 (XV, 206 p.)
  • 1024598061 (XV, 206 p.)
Number 298924802 is the lowest for 1971, but it is one of the cases where the pages with roman numeral pagination are ignored.
--Nstrc (talk) 04:46, 15 May 2020 (UTC)
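The difference between "lowest number" and "most complete record" for the 1971 edition can be shown with a small sketch. The four records are the ones listed above; the `OclcRecord` type, the helper names and the selection rule (prefer a record whose extent keeps the roman-numeral pages, then the lowest number) are assumptions for illustration, not an established Wikidata or OCLC convention.

```python
import re
from typing import List, NamedTuple


class OclcRecord(NamedTuple):
    number: int
    extent: str  # pagination statement as given in the catalogue record


# The four records for the 1971 edition listed in the comment above.
records_1971 = [
    OclcRecord(298924802, "206 pages"),
    OclcRecord(630484330, "XV, 206 p."),
    OclcRecord(633730540, "XV, 206 p."),
    OclcRecord(1024598061, "XV, 206 p."),
]


def has_roman_pagination(extent: str) -> bool:
    """True if the extent statement starts with a roman-numeral page count."""
    return re.match(r"\s*[ivxlcdm]+\s*[,;]", extent, flags=re.IGNORECASE) is not None


def lowest(records: List[OclcRecord]) -> OclcRecord:
    """The rule discussed above: simply take the lowest control number."""
    return min(records, key=lambda r: r.number)


def lowest_with_full_extent(records: List[OclcRecord]) -> OclcRecord:
    """Hypothetical alternative: prefer records that keep the roman-numeral pages."""
    return min(records, key=lambda r: (not has_roman_pagination(r.extent), r.number))


by_number = lowest(records_1971)
by_extent = lowest_with_full_extent(records_1971)
```

With this data the first rule picks 298924802 and the second picks 630484330, which is the point made in the comment above.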

Business v. enterprise v. company

I'm really confused how to tag "for profit organizations" e.g. "Walmart". We have:

Also, the hierarchy of these doesn't make sense (which from my perspective would make this a minor detail)? Whereas I think the structure should maybe be company (Q783794) > commercial organization (Q21980538) > business (Q4830453) > enterprise (Q6881511).

Meanwhile at some point a lot of items on enwiki using the company template were marked as "companies" and then that got automatically upgraded to "business" (by someone [44]) which left a lot of broken items. It's all very confused and this point ought to be clear since a lot of our items fall into this category. BrokenSegue (talk) 22:44, 14 May 2020 (UTC)

  Notified participants of WikiProject Companies. Something I'd also be interested in seeing clarification for --SilentSpike (talk) 22:57, 14 May 2020 (UTC)

The problem is, a business (Q4830453) is not necessarily a company (Q783794) (or any type of legal person (Q3778211)) and not every company (Q783794) conducts business (Q4830453) (see shelf company (Q2534287)). Thus, commercial organization (Q21980538) seems the best choice for a legal entity that actually conducts business (like Walmart). --MB-one (talk) 07:13, 15 May 2020 (UTC)

Separate items for FitzGerald and Fitzgerald

As part of mass surname additions, I have been trying to figure out which persons to add Fitzgerald (Q55550590) to and which FitzGerald (Q16466739), and I am having the devil of a time deciding. Naturally, internet sources, newspapers, and books aren't consistent in how they capitalize a person's name. But also, "authoritative" genealogies like Burke's Landed Gentry etc. don't necessarily agree with one another, and even official documents looked up on Family Search aren't always consistent! The inevitable conclusion is that it's not a good idea to have separate items for these, they should just be aliases of each other. There are also separate items for Fitzroy/FitzRoy and Fitzsimons/FitzSimons, but not for Fitzherbert, Fitzhubert, etc. The same issue may apply to Mac names, which have a similar but somewhat lesser problem with inconsistency. For most of them, there is only one item, but we have separate ones for Macleod/MacLeod, Macdonald/MacDonald, Maclean/MacLean, and about twenty more. Perhaps there should be a list somewhere of names that have alternate capitalization variants, too? Levana Taylor (talk) 23:48, 14 May 2020 (UTC)

Perhaps it's just a stylistic difference in printing. E.g., one source, during a particular historical period, may always use the spelling FitzGerald, and some other source at a different time may use Fitzgerald. Are there any sources that consistently use different variants for particular people? Ghouston (talk) 02:42, 15 May 2020 (UTC)
But I think that may not be the case, since I've found one person Robert D. FitzGerald (Q3281581) who is more commonly (but not always) written as FitzGerald. Ghouston (talk) 02:47, 15 May 2020 (UTC)
  • As with anything else at Wikidata, follow whatever the reference states. If several can be added, then consider using ranks.
That we have items for surnames spelling it in some way, doesn't mean that there are items for persons using that spelling. Sample: there is Castromartinez. It's just that the reference used spelled it that way or the person who imported it formatted it in a specific way. --- Jura 05:23, 15 May 2020 (UTC)
Still and all, it's rather a nightmare. The work of reconciling references is enormous. There are also Ffoulkes and ffoulkes, LeDrew and Ledrew, etc. etc. etc. A list is a starting place, so here's Wikidata:WikiProject Names/Capitalization variants. That surely doesn't contain all of the existing ones, especially not all the De's and Le's.
Also, if we are supposed to use The Peerage as a reference (which I think is not a good idea), that will require creating MANY more items for variant spellings and name forms. Levana Taylor (talk) 05:30, 15 May 2020 (UTC)
If you think a reference isn't any good, don't use it. --- Jura 05:36, 15 May 2020 (UTC)
Yes, seriously, is there some sort of place to list references which it is strongly advised not to use? Because I know other people have been finding lots of errors, and I've found 9 just today. I thought that adding surnames to Peerage items would help with cleaning up duplicates, but I'm finding that I'm not comfortable adding a surname without checking the item for accuracy first, and that is taking MUCH too long. I wish I could just abandon any data imported from The Peerage, honestly. Levana Taylor (talk) 05:56, 15 May 2020 (UTC) Levana Taylor (talk) 05:45, 15 May 2020 (UTC)
Why not just merge them? Although they appear to both have enwp sitelinks, one of them just redirects to the other. (Also, pinging @Harmonia Amanda: as someone that edits a lot on family names). Thanks. Mike Peel (talk) 06:01, 15 May 2020 (UTC)
@Mike Peel: if you merge them: how do you determine which people misspell their surname? --- Jura 06:03, 15 May 2020 (UTC)
object named as (P1932)? Thanks. Mike Peel (talk) 06:56, 15 May 2020 (UTC)
That would just reflect what a reference reads (if we determine the reference is incorrect). Normally, the native label-statement on the item gives the spelling used. --- Jura 07:03, 15 May 2020 (UTC)
Spelling OK, but capitalization, really? Thanks. Mike Peel (talk) 16:16, 15 May 2020 (UTC)
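A list of capitalization variants like the one proposed above could be started mechanically by grouping surname labels that differ only in case. The sketch below is one possible approach; the function name is made up, and the sample names are taken from this thread, not from a query against Wikidata.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def capitalization_variants(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that are identical apart from letter case; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}


# Sample labels mentioned in this thread.
names = [
    "Fitzgerald", "FitzGerald", "Fitzroy", "FitzRoy",
    "Macleod", "MacLeod", "Fitzherbert", "Ffoulkes", "ffoulkes",
]
variants = capitalization_variants(names)
```

This only finds candidates; deciding whether two spellings should stay separate items still depends on the references, as discussed above.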

Lord Buckethead and Count Binface

We have two of each of these (Lord Buckethead (Q30249394) and Lord Buckethead (Q76131126), and Count Binface (Q76509760) and Count Binface (Q78931230) respectively). To add to the confusion, one of them has the name of one of the others as an alternative name.

Some merging and/or distinguishing is probably in order, but it's all rather confusing (see en:Lord Buckethead and en:Count Binface) and I'm not sufficiently familiar with the conventions here to do the job properly. There is some ambiguity about whether the two Lord Bucketheads are identical (they may have been portrayed by two different people), but there definitely seems to be only one Count Binface (who, confusingly, seems to have been portrayed by the same person as one of the incarnations of Lord Buckethead).

Perhaps we should make a distinction between the characters and the performers, as the same character may have been performed by multiple candidates, but only the natural person is the actual candidate. -- The Anome (talk) 09:20, 15 May 2020 (UTC)

This article seems to cover the details well. This is a bit of a mess for sure, and it raises interesting questions on how to model a satire character (inspired by a movie character) played by multiple persons who both ran in different real elections as the same character. I believe Lord Buckethead (Q76131126) is not the same person as the Lord Buckethead who ran in 2017 (comedian Jon Harvey), but is the candidate funded by the author of the original character from the movie (Lord Buckethead (Q30249394)) who forced the 2017 candidate to change to Count Binface. Count Binface (Q78931230) and Count Binface (Q76509760) should probably be merged and modelled as the character, not the human (comedian Jon Harvey). --SilentSpike (talk) 10:31, 15 May 2020 (UTC)
One could just add the real name of the person as alias to Q76509760. --- Jura 10:52, 15 May 2020 (UTC)

foreword (Q1358138) and preface (Q670787) / said to be the same as (P460)

F.Y.I.:

Talk:Q1358138 and Talk:Q670787

--Nstrc (talk) 12:18, 15 May 2020 (UTC)

Need help adjusting a format constraint

On Federal Register Document Number (P1544), the format constraint is not allowing items from 2020. The value for format as a regular expression (P1793) is "(201\d|9[3-9]|[0E]\d)-[1-9]\d{0,4}". Can somebody help me fix this? gobonobo + c 01:39, 16 May 2020 (UTC)

Just changed "(201\d|9[3-9]|[0E]\d)-[1-9]\d{0,4}" to "(20[12]\d|9[3-9]|[0E]\d)-[1-9]\d{0,4}", which should take care of the problem until 2030. Vahurzpu (talk) 01:51, 16 May 2020 (UTC)
Thank you Vahurzpu! gobonobo + c 02:09, 16 May 2020 (UTC)
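The effect of the change can be checked with a few test values. The two patterns are the ones quoted above; the sample document numbers are made up for illustration, and the sketch assumes the constraint has to match the whole value (hence `re.fullmatch`).

```python
import re

# Patterns quoted in the thread above (format constraint for P1544).
OLD_PATTERN = r"(201\d|9[3-9]|[0E]\d)-[1-9]\d{0,4}"
NEW_PATTERN = r"(20[12]\d|9[3-9]|[0E]\d)-[1-9]\d{0,4}"


def matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical document numbers, invented for this example.
samples = ["2019-12345", "2020-10457", "2029-1", "2030-1", "E9-1234", "94-5"]

for value in samples:
    print(value, matches(OLD_PATTERN, value), matches(NEW_PATTERN, value))
```

Values with a 2020s prefix fail the old pattern and pass the new one, while a 2030 prefix fails both, which is why the fix holds only until 2030.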

ISCO occupation code

The property ISCO-88 occupation class (P952) is overly simple and does not state which standard it follows. There are two different standards in use, ISCO-08 and ISCO-88 (plus two older ones, ISCO-58 and ISCO-68), with different number series, and the property uses numbers from both series. I have not tried to figure out which number series are used in the various work-related items, but the examples given in the property are inconsistent.

Either the existing property must be restricted to follow only one standard, or a new property must be created for a specific standard (ISCO-08 from 2008). I suspect the latter is the easiest solution. There are rather few entries in use, so both solutions should work. Jeblad (talk) 11:54, 14 May 2020 (UTC)

Two standards = two properties. Which one gets the current property? Maybe the older one, so it is in chronological order. Support Jeblad's proposal for new property for 2008 codes. MrProperLawAndOrder (talk) 16:56, 14 May 2020 (UTC)
It is also possible to use a qualifier, but I think it will be a mess. There are other problems with the current definition, as this is not a unique identifier and both single-value constraint (Q19474404) and distinct-values constraint (Q21502410) will fail. A lot of the items where it is used also lack occupation (Q12737077) as class. Jeblad (talk) 17:48, 14 May 2020 (UTC)
Change the existing property into “ISCO-88 occupation code” and create a new “ISCO-08 occupation code”. 46.46.207.162 11:39, 16 May 2020 (UTC)

Unsigned comment

I suggest that wikidata automatically include the User name in talk pages, when unsigned. --BoldLuis (talk) 02:34, 15 May 2020 (UTC)

  • Wikidata just reuses the general MediaWiki software. Unfortunately, not every edit needs a username (correcting spelling errors, for example, doesn't). Maybe in the future there will be a successful, better chat system for MediaWiki, but we aren't there now and it's not Wikidata's focus to develop that. ChristianKl 06:50, 15 May 2020 (UTC)
The WMF are in fact working on this, see Talk pages project and especially /replying. (Also, there is/was Structured Discussions, but we don’t have it enabled on the English version of the Project chat.) --Lucas Werkmeister (WMDE) (talk) 09:23, 15 May 2020 (UTC)
It would be ideal for the talk pages. I keep having to hunt for the "--~~~~" signature button in the menu, which costs time and hurts usability. Thank you for the information. --BoldLuis (talk) 10:01, 16 May 2020 (UTC)

Do we want headers on Help pages?

BoldLuis (talk • contribs • logs) added headers to a bunch of help pages and Eihel (talk • contribs • logs) undid the edits. Do we want these kinds of headers? ChristianKl 09:34, 15 May 2020 (UTC)

Hello @ChristianKl:, BoldLuis (talk • contribs • logs) asked this question here without waiting for a reply. I still gave him a reply which seems adequate: the link is present on each page, since it is in the left menu and in the correct language. --Eihel (talk) 09:49, 15 May 2020 (UTC)
Yes, it is needed for newbies (the target for the help pages; if not a newbie: namespace would be useful). The talk goes on there. --BoldLuis (talk) 09:39, 16 May 2020 (UTC)

Templatedata in template items

Items for Wikimedia templates could link to the template's TemplateData (mainly on Meta-Wiki), which would make the documentation (usage, examples and so on) easier to find. It would be similar to images in items, but for documentation (TemplateData). --BoldLuis (talk) 09:16, 16 May 2020 (UTC)

Planned maintenance operation (read-only time) on May 19 @ 5:00 A.M.

Hi, there's a planned maintenance operation in the upcoming week. It will happen on Tuesday 19th May at 05:00 AM UTC, for 15 minutes. This wiki will go read-only during this operation. Services targeting Wikidata may not work in the meantime. See also: phab:T251981. --Kaartic (talk) 12:17, 16 May 2020 (UTC)

Data preparation

Data preparation is a wide topic. At the Wikimedia Hackathon last weekend there were several sessions related to it, and data preparation and how to do it were mentioned in talks and livestreams during the weekend. Since it is such a wide topic, it is helpful to learn from other people. Is anyone interested in talking about it? I can also tell you how I prepare data before I upload it to Wikidata. For the past few weeks the German Wikipedia has held a weekly digital meeting on a specific topic, which everyone who is interested can join. Maybe something like that could be established for Wikidata, if there is interest. --Hogü-456 (talk) 20:10, 12 May 2020 (UTC)

@Hogü-456: Are we talking about things like data cleaning (e.g. what to do with missing values or obviously corrupted rows of data)? My degree covered that in part.--Jasper Deng (talk) 20:13, 12 May 2020 (UTC)
@Jasper Deng: That is one possible topic to talk about. There are many different topics within it, and I think it is important and would be interesting to talk about. --Hogü-456 (talk) 18:27, 16 May 2020 (UTC)

Wikidata:Feedback

In the same way as wiktionary:Wiktionary:Feedback, I suggest creating Wikidata:Feedback.--BoldLuis (talk) 02:32, 15 May 2020 (UTC)

@BoldLuis: wikt:WT:FB is a kind of page typical of Wiktionaries. Another term for this kind of page is a guest book, Livre d'or. In Wikibooks, these pages can be found with books. For WD, I'm a little doubtful. —Eihel (talk) 03:35, 15 May 2020 (UTC)
Feedback can already be given here on the project chat; there's no need for another page that does roughly the same as the project chat. It creates unnecessary complexity. ChristianKl 06:51, 15 May 2020 (UTC)
OK, now I know this is the same in this project. Thanks. --BoldLuis (talk) 18:27, 16 May 2020 (UTC)

edition or translation of (P629) --- has edition or translation (P747)

If one uses edition or translation of (P629), then the following information is displayed:

"Qxxx should also have the inverse statement has edition or translation (P747) Qyyy."

Why is it necessary to insert the inverse statement manually? Could the inverse statement not be inserted automatically?

--Nstrc (talk) 06:36, 15 May 2020 (UTC)

  1. The constraint is accompanied by a qualifier: constraint status (P2316): suggestion constraint (Q62026391)
  2. Item may be an exception to the constraint
  3. Errors are not propagated to other elements, such as here
  4. Vandalism is not doubled
etc. —Eihel (talk) 09:15, 15 May 2020 (UTC)
@ChristianKl: "the information in it can be easily queried based on edition or translation of (P629)"
But that would require readers (passive users) who know how to write a SPARQL query, and who know that this option exists at Wikidata.
Most people who come to a Wikidata item by chance, or by clicking "Wikidata item" in the Wikipedia sidebar, will not know anything about all this, I guess.
It would be different, perhaps, if each Wikidata item had a link to the "From related items" section of the corresponding Reasonator results, e.g.:
* For Marx (Q5466801) would link to the "From related items" section of https://tools.wmflabs.org/reasonator/?q=Q5466801.
Such links should have an explanation like "Further items which are related to this item [e.g. 'Q5466801']".
With easy access to the "edition or translation of" subsection there, the "has edition" section would indeed not really be necessary.
--Nstrc (talk) 16:30, 15 May 2020 (UTC)
The Wikidata data structure is not optimized for giving a user who visits the page this way the optimal experience. In the last year, after we got the relateditems gadget, we deleted a bunch of unnecessary reverse properties, and I think the reasoning for deleting those should also apply in this case. ChristianKl 17:27, 15 May 2020 (UTC)
@ChristianKl: I agree with you. However, as we delete more and more inverse properties, Wikipedia templates will lose their powers one by one. It is a pity that there isn't a simple way to pull "inversely inserted" values into Wikipedia articles, e.g. editions of Harry Potter and the Philosopher's Stone (Q43361) by displaying all items with edition or translation of (P629):Harry Potter and the Philosopher's Stone (Q43361). The only way to do this is currently via Listeriabot, and that is a fairly clumsy, inflexible one-purpose tool (but better than nothing of course!). Vojtěch Dostál (talk) 18:26, 15 May 2020 (UTC)
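As a rough illustration of the point about inverse properties: the has edition or translation (P747) view can in principle be derived from existing edition or translation of (P629) statements rather than stored a second time. The Python sketch below does this over a small in-memory list of statements. The function name and the edition IDs (`Q_EDITION_A`, `Q_EDITION_B`, `Q_LANGUAGE_X`) are hypothetical placeholders, not real items; a real tool would read the statements from a query or a dump.

```python
from collections import defaultdict

def derive_inverse(statements, forward_property="P629"):
    """Group subjects by object for one property, i.e. build the inverse view.

    statements: iterable of (subject_id, property_id, object_id) tuples.
    Returns a dict mapping each object (the work) to a sorted list of
    subjects (its editions or translations).
    """
    inverse = defaultdict(set)
    for subject, prop, obj in statements:
        if prop == forward_property:
            inverse[obj].add(subject)
    return {work: sorted(editions) for work, editions in inverse.items()}

# Hypothetical sample data: only Q43361 is an item ID taken from the discussion.
statements = [
    ("Q_EDITION_A", "P629", "Q43361"),
    ("Q_EDITION_B", "P629", "Q43361"),
    ("Q_EDITION_A", "P407", "Q_LANGUAGE_X"),  # other properties are ignored
]
editions_by_work = derive_inverse(statements)
print(editions_by_work)
```

Nothing here writes back to Wikidata; it only shows that the inverse direction carries no information beyond what the forward statements already hold.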
Okay. But what about my idea regarding an item-to-Reasonator-results link?
--Nstrc (talk) 18:20, 15 May 2020 (UTC)
There's a gadget that you can activate that provides a Reasonator link in the tools section. ChristianKl 23:44, 16 May 2020 (UTC)

Wikidata COVID-19 recorded deaths pass 1,000 today

The following query uses these:

  • Properties: cause of death (P509)     
    SELECT ?item ?itemLabel WHERE {
      ?item wdt:P509 wd:Q84263196.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }
    

--RAN (talk) 01:05, 16 May 2020 (UTC)
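To check a total like the one in the heading without listing every item, the same pattern can be wrapped in a COUNT. The Python sketch below only builds the query text and a request URL for the public query service endpoint; it does not send the request. The helper names are made up for this example, and the COUNT variant is an assumption about how one might count, not the query posted above.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

def build_count_query(cause_qid: str) -> str:
    """Return a SPARQL query counting items whose cause of death (P509) is cause_qid."""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE { "
        f"?item wdt:P509 wd:{cause_qid} . "
        "}"
    )

def build_request_url(query: str) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

query = build_count_query("Q84263196")  # item ID taken from the query above
url = build_request_url(query)
print(url)
```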

There's a formatted list at Wikidata:Lists/corona virus deaths. Ghouston (talk) 03:53, 16 May 2020 (UTC)
  Applause. Good work!!. --BoldLuis (talk) 09:19, 16 May 2020 (UTC)
This is a Listeria list that shows these deaths and it shows the bias in showing non notable American victims .. "because we can". Yes it is a bias. Thanks, GerardM (talk) 10:41, 16 May 2020 (UTC)
@GerardM: I don't actually understand what your comment is trying to convey. If there are non-notable items according to Wikidata:Notability, the right course of action is for them to be removed. Wikidata is gratis and libre; if you see something wrong or identify some deficiency, you are welcome to address it, and if you clearly state what you want others to do you can even get some help in the process. It is fine to say something if you see something, but you can also do something, or at the very least be clear about what you want others to do. Iwan.Aucamp (talk) 13:29, 16 May 2020 (UTC)
  • Wikidata notability is to actually exist in a referenced source; please do not equate it to being famous or use the English Wikipedia definition of notability. History is written by those who take the time to do it. The only bias is in who sacrifices their free time to record history as it passes. --RAN (talk) 21:09, 16 May 2020 (UTC)

Property as a qualifier

Hello. Where can I request that number of seats in assembly (P1410) be updated so that it can be used as a qualifier? Xaris333 (talk) 23:51, 16 May 2020 (UTC)

Different ISBN-13 (P212) and publication date (P577) for work level items?

@Vojtěch Dostál:: "editions of Harry Potter and the Philosopher's Stone (Q43361)"

Is this item edited as it should be? - There are different ISBN-13 (P212) and publication date (P577) for one and the same work!

Of course, works can have different editions, and each edition has its specific publication date (P577). But that is a different statement from "A certain work has (a lot of) different publication dates (P577)", isn't it?

[*] En passant: I would prefer to say that works have an inception (P571) and only their editions have a publication date (P577). But that's not my main point.

--Nstrc (talk) 06:33, 16 May 2020 (UTC)

ISBN-13 (P212) is in fact not restricted to editions at the moment. See this discussion. Harry Potter and the Philosopher's Stone (Q43361) is probably wrong given how the discussion there is going, the constraints on properties should be corrected. Iwan.Aucamp (talk) 09:50, 17 May 2020 (UTC)

Does lack of objection count as approval?

I proposed changes to a property here Property_talk:P2969 (for Goodreads version/edition ID (P2969)) and notified WikiProject books. There has been no input, negative or positive. Should I then just proceed? Could we formalize rules here? Iwan.Aucamp (talk) 12:53, 16 May 2020 (UTC)

  WikiProject Properties has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Iwan.Aucamp (talk) 13:05, 16 May 2020 (UTC)

how do i add non existent language or new language

For example: on Zoia Ceaușescu (Q218415) I want to add the Persian language. How do I add the data of item Q218415 in Persian? I am not talking about the item's name, description and aliases. Leela52452 (talk) 10:50, 17 May 2020 (UTC)

Hello Jura, I want to add P21, P37, P27, P569, etc. for the Persian language, similar to English and other languages. Leela52452 (talk) 12:22, 17 May 2020 (UTC)
You mean, you want to add statements to the item Zoia Ceaușescu (Q218415) with the properties sex or gender (P21), country of citizenship (P27), date of birth (P569)?
If you switch the interface language to "fa", the sex or gender (P21) and country of citizenship (P27) values should already be in "fa"; try https://www.wikidata.org/wiki/Q218415?uselang=fa . This is because the value of these properties is an item and this item already has labels in fa. For the value of P21, this is the item at https://www.wikidata.org/wiki/Q6581072?uselang=fa . If the label for "fa" were missing, you could add it at female (Q6581072), not at Zoia Ceaușescu (Q218415) --- Jura 12:29, 17 May 2020 (UTC)
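The reason the label belongs on the value item can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `labels` dictionary, the placeholder string `FA_LABEL` and the single-step fallback to English are all made up for illustration, and the real software uses longer language fallback chains.

```python
from urllib.parse import urlencode

def item_url(qid: str, interface_language: str) -> str:
    """Return the item page URL with the interface language set via uselang."""
    return f"https://www.wikidata.org/wiki/{qid}?" + urlencode({"uselang": interface_language})

def display_value(labels, value_qid, language, fallback="en"):
    """Return the text shown for a statement value: the label stored on the
    value item in the requested language, else the fallback, else the bare ID."""
    item_labels = labels.get(value_qid, {})
    return item_labels.get(language) or item_labels.get(fallback) or value_qid

# Hypothetical label store; FA_LABEL stands in for the real Persian label.
labels = {"Q6581072": {"en": "female", "fa": "FA_LABEL"}}

print(item_url("Q218415", "fa"))
print(display_value(labels, "Q6581072", "fa"))
```

The lookup uses only the ID of the value item (Q6581072), which is why a missing translation is fixed there once and then shows up on every item that uses the value.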
  Done ?uselang=fa excellent trick. i will keep in mind not add person name but to add gender. i also just now discovered that items are available for other languages also. if anything is missing i will add and move on. beautiful, simple Leela52452 (talk) 13:17, 17 May 2020 (UTC)

Wikilambda project proposal: extension to Wikidata

There is now a proposal for a new Wikimedia project, Wikilambda, which contains functions and is thus a natural companion to Wikidata, which contains data. In Wikilambda we can then express functions that do something with the data, and can use the results in many different ways. A major use case is to increase the expressivity of Wikidata so that we can create more complete articles for different languages if there is a gap for a given language. Comments, critiques, and statements of support would be very welcome!

The proposal: Wikilambda

Discussion page and vote: Discussion page for Wikilambda

Thank you for your time! Happy to answer any questions! --Denny (talk) 18:32, 17 May 2020 (UTC)

Hungarian translation

Hello, could anyone who speaks Hungarian help translate the page https://www.wikidata.org/wiki/Help:Basic_membership_properties ? It would help explain the difference between P31 and P279 to a Hungarian speaker. Bouzinac (talk) 19:49, 17 May 2020 (UTC)

EN 15907 (Q61732429) as properties for film (Q11424)

I was looking to kick off a conversation about possibly adding properties to reflect where film (Q11424) items exist in relation to cataloguing standard EN 15907 (Q61732429). The standard only lists a fairly small range of possible values, but as it is seeing increased use within the archival community it would make collection mapping much more straightforward for archives in the future. Of course, another option for implementation would simply be to make each entity as an item which can be used as an instance of (P31) (eg instance of (P31) Variant, instance of (P31) Manifestation), although in regards to film records, there is already a bit of confusion between instance of (P31) film (Q11424) and instance of (P31) genre (eg silent film (Q226730)). Pxxlduchesne (talk) 06:19, 14 May 2020 (UTC)

  Notified participants of WikiProject Movies

@Jura1: In regards to the definitions of the standard, they are also clearly explained in the FIAF Moving Image Cataloguing Manual, which is available for free here. Apologies, that is a useful additional detail which I forgot to provide. Pxxlduchesne (talk) 23:13, 14 May 2020 (UTC)
Maybe Jmabel got it for free. --- Jura 06:06, 15 May 2020 (UTC)
I'm sorry, still completely confused, got what for free? - Jmabel (talk) 14:44, 15 May 2020 (UTC)
I think Jura is alluding to the fact that EN 15907 (Q61732429) carries a fee to purchase the standard doc. In any case I'm beginning to think that for the purposes I have described it would make more sense to spin up a dedicated Wikibase instance, which links out to Wikidata where appropriate. Pxxlduchesne (talk) 05:37, 16 May 2020 (UTC)
Got it. That's weird: do we have a specific preference for free sources? Are books then considered inherently inferior sources? - Jmabel (talk) 19:41, 16 May 2020 (UTC)
The question here was about using E #something for editing (see header). Supposedly, every contributor would have to read it (to edit or to comment in this discussion). Apparently it's not a standard that is as open as Wikidata. The question is not about adding some statement to a single item. --- Jura 09:10, 18 May 2020 (UTC)

Propose deleting "unknowns" imported from The Peerage

The data from The Peerage is the output of genealogy software as the site author says on his introductory page, and it contains quite a number of mere placeholder items (where you enter into the software that you know so-and-so had four children, names to be added later -- but the names weren't added before the output was posted to the website). There are also some children who have no name because they died at birth. Neither of these types of items is notable for Wikidata. The searches "unknown" haswbstatement:P4638, ditto "unnamed", plus hunting for things like "a child [surname]," "infant [surname]," and looking for question marks in various lists of names, allow me to estimate that there are 20K of such items. I would propose deleting them all -- which will require some cleanup because they are linked from the items of their relatives. It is true that you could argue that some of these unknowns are useful for structural purposes -- but why should WD do the work of trying to figure out who they are and how they fit in with unquestionably notable people? Let the genealogy be finished elsewhere, and only then imported. Levana Taylor (talk) 18:05, 15 May 2020 (UTC)

  •   Oppose especially without source for "Neither of these types of items is notable for Wikidata." Please focus on verifiability not highly subjective notability in a db where data is presented in structured form. There are biographies containing claims about children that died near birth. MrProperLawAndOrder (talk) 20:25, 15 May 2020 (UTC)
  •   Oppose While true they are a pain, in some cases their ambiguity merely represents the myopia of the source materials consulted (if you're only looking at compilations of direct kin or spouses of European nobility, you're going to overlook the odd American offspring that didn't marry a peer), and I've identified by name a handful of "Peerage unknowns" through alternative sources and inference. Some people have multiple Peerage entries, due to stitched together genealogies (e.g. Hetty Kelly (Q17362620) and her father Arthur Wolseley Kelly (Q76031532)), so a simple merge resolves the "Unknown" name. -Animalparty (talk) 00:44, 16 May 2020 (UTC)
  • Incomplete information is not always a reason to delete (sometimes a name is not known but there is other information to identify them), but some of the pages say they are still under construction and may contain errors, these probably shouldn't have been added to Wikidata. Peter James (talk) 12:12, 16 May 2020 (UTC)
  • If we have a bunch of "unknown" children and we finally identify one of them, how should we determine which of the "unknown" children it is? If we can't, we'd just end up adding an additional child and we end up not even knowing how many children there were. --- Jura 12:19, 16 May 2020 (UTC)
    • It depends on the level of detail at The Peerage and whether we can match up the entry to other databases. I recently managed to put a name to an unknown child through matching the date of death at another database. If the Peerage entry is truly a blank slate (no gender, no dates and no name) then we assign the entry one of the names we have identified from another source then report our proposed name at Wikidata:WikiProject Authority control/The Peerage errors. DarrylLundy will read that page and decide if he wants to set his entry to match ours (based on the quality of the supporting data). That reminds me that I should dig out the unknown child I mentioned above and record it on that page. From Hill To Shore (talk) 14:59, 16 May 2020 (UTC)
      • If we can't make use of Wikidata entries without correcting the source database first, I think this strongly suggests we probably shouldn't have used that source in the first place. --- Jura 17:14, 16 May 2020 (UTC)
        • All datasets contain a degree of error. It is inevitable. When importing data we have to decide "is this good enough" and then correct any errors that are imported. If the owner of the source database is willing to work with us to fix the errors on both sides then that is a positive step. Deciding not to import a source because it has errors is a false choice; if we did that then we might as well shut down Wikidata. Now, if you want to argue that the source quality was too low for a mass import then feel free to argue that with the editors who were on here at the time. I joined Wikidata after the data was imported, so I'm not going to sit in judgement of previous actions. However, I will comment here that your claim that we can't make use of the entry is another false assumption; as pointed out already, the existence of unnamed children in some source material has added a layer of structured data to the entries. From Hill To Shore (talk) 17:37, 16 May 2020 (UTC)
          • No problem with that, it's just that Wikidata isn't really suitable to track such unknowns without any statements as separate items. Similar databases might have 100 more entries and, if we apply the same standard, we would end up with 100* similar unknowns. --- Jura 18:01, 16 May 2020 (UTC)
I have to agree with Jura1, the problem is not so much that it's bad to have items for people even with missing information, it's that Wikidata doesn't have good ways of working with masses of incomplete genealogical information -- the family tree builder is nice but more tools are needed for finding potential relatives; e.g. a relationship suggester: "X is the child of Y and Z, are Y and Z spouses? K is the child of Y, are they also the child of Z? M is the child of N who is the spouse of P, is M the child of P?" And here's another way Wikidata is inferior to dedicated genealogical sites: they have an input form which automatically populates all sorts of structured data such as surname and given name, maiden name, etc. Wikidata has input forms but they aren't much used; so there are a lot more missing statements in person items than there need to be, names in particular. Thirdly, there is not (I think) a quick way of reviewing items for potential duplicate matching; putting birth/death dates in the description certainly helps, but that's not mandatory, and also it's important to find duplicates with missing birth/death dates. Levana Taylor (talk) 23:49, 16 May 2020 (UTC)
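The "relationship suggester" described above could look roughly like the following Python sketch. It is only an illustration of the two kinds of question in the comment; the data structures, the function name and the one-letter IDs are assumptions, and no existing Wikidata tool is implied.

```python
from itertools import combinations

def suggest_relations(parents_of, spouse_pairs):
    """Return a sorted list of questions for a human reviewer.

    parents_of: dict mapping a child ID to the set of its known parent IDs.
    spouse_pairs: set of frozensets, each holding two spouse IDs.
    """
    suggestions = set()
    for child, parents in parents_of.items():
        # Rule 1: two known parents of one child who are not recorded as spouses.
        for a, b in combinations(sorted(parents), 2):
            if frozenset((a, b)) not in spouse_pairs:
                suggestions.add(f"Are {a} and {b} spouses?")
        # Rule 2: the spouse of a known parent who is not recorded as a parent.
        for parent in parents:
            for pair in spouse_pairs:
                if parent in pair:
                    for other in pair - {parent}:
                        if other not in parents and other != child:
                            suggestions.add(f"Is {child} also the child of {other}?")
    return sorted(suggestions)

# The letters follow the examples in the comment above.
parents_of = {"X": {"Y", "Z"}, "M": {"N"}}
spouse_pairs = {frozenset(("N", "P"))}
for question in suggest_relations(parents_of, spouse_pairs):
    print(question)
```

The output is a list of questions rather than edits, so a person still confirms each suggested relationship against a source.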
  • I think there are two different approaches possible here. I would support deleting "unknowns" if they are just single items with connections up to a parent or down to a child, and nothing else. This of course does not mean we could not recreate them later on if we find more information! Prior to deletion, though, it would be good to set a count of children for each parent item (sourced to Peerage); the count of children is one of the useful bits of data we can derive from these.
However, I would oppose deleting "unknowns" if they are structurally useful links - eg Item A and Item B are valid, and we know they are connected by an unknown Item X, daughter of A and mother of B. Keeping X in this case would be valuable and useful. Andrew Gray (talk) 17:33, 16 May 2020 (UTC)
It should always be kept in mind that the Peerage is not in any way, shape, or form a definitive or authoritative source. It's literally the pet project of one very dedicated person. Errors and omissions should be expected. The fact that an identity is not fully known on the Peerage is not in itself evidence the person is unknowable. -Animalparty (talk) 17:46, 16 May 2020 (UTC)
Sure, definitely agree - but then we wouldn't want to delete them on Wikidata :-). I was assuming the suggestion here was "items that were imported as unknown, are not sourced to anything else, and are still labelled as unknown". It would be very silly to delete items where we have independently established an identity just because they were matched to an "unknown" record in Peerage, but I don't think those are in scope of this proposal? Andrew Gray (talk) 18:55, 16 May 2020 (UTC)
The Peerage does not list every one of the (more than 100 million) people descended from Charlemagne. Therefore it does not list all children for every person; there will always be missing children for some people.--GZWDer (talk) 23:11, 16 May 2020 (UTC)
  •   Oppose These need to be considered on a case by case basis depending on the level of information contained in the sources. For example, an unknown child entry with a birth date and death date in the same year tells us the parents had a child that died in infancy. That is useful structured data that can't be conveyed via number of children (P1971). I've also found some of the trailing entries to be useful in linking people together; while part of the name is unknown, there is often a piece of information that can be tied in to another database and allow either population of the entry or a merge with an existing entry. A blanket deletion of unknowns will remove that benefit. As I have already pointed out above, if you expect completion and accuracy in a dataset prior to importation, then we might as well shut down Wikidata now; all datasets contain a degree of error and we need to correct those errors. From Hill To Shore (talk) 17:52, 16 May 2020 (UTC)
  •   Oppose They have a structural need in concatenating generations, and as above they can be filled in later. We have several programs that generate family trees that require concatenated generations. --RAN (talk) 21:18, 16 May 2020 (UTC) See for example:


  • I wonder what will happen when we import a similar database with unknown children: the number will double as each "unknown" will have another identifier and there isn't really a way to match them. --- Jura 09:30, 17 May 2020 (UTC)
If the two unknowns are truly unknown, with no distinguishing attributes, then it doesn't matter how we mix and match external identifiers and they can be merged arbitrarily. If the sources don't know, we shouldn't bend over backwards to make Wikidata relations any more convoluted. If Duke X has 2 unnamed twins, we need not make any more than 2 items. If external identifiers subsequently improve with uniquely identifiable data, then the items can be adjusted/swapped as needed. -Animalparty (talk) 05:50, 18 May 2020 (UTC)

If an unknown child has no attributes and isn't the father/mother or husband/wife of any known person, we don't need such an item; it can be stated as "number of children" in the father's/mother's item. And if the unknown connects some notable persons, then we need it for connectivity. Carn (talk) 07:44, 18 May 2020 (UTC)
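Read as a decision rule, the suggestion above might be sketched like this in Python. The function and parameter names are hypothetical, and what counts as an "attribute" (for instance a birth or death date) is an assumption made for the example, not an agreed policy.

```python
def keep_unknown(has_other_statements, child_ids, spouse_ids):
    """Return True if a placeholder person item should be kept under the rule above.

    has_other_statements: True if the item carries identifying data such as dates.
    child_ids / spouse_ids: sets of IDs of known persons the placeholder links to
    as a parent or as a spouse.
    """
    if has_other_statements:
        return True  # some attribute distinguishes this person
    # Without attributes, keep the item only if it connects other persons.
    return bool(child_ids) or bool(spouse_ids)

# A bare placeholder known only as someone's child: covered by "number of children".
print(keep_unknown(False, set(), set()))
# A placeholder that is the parent of a known person: needed for connectivity.
print(keep_unknown(False, {"B"}, set()))
```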

Something is broken

Wikidata is very slow, and since 2:49 last night (20 hours ago) all scripts, bots and imports stopped working. Does someone know what is going on? Edoderoo (talk) 19:41, 16 May 2020 (UTC)

There is now a phabricator ticket for it as well. Edoderoo (talk) 21:03, 16 May 2020 (UTC)
And it is now resolved :) ·addshore· talk to me! 21:49, 16 May 2020 (UTC)

Rather similarly, for the last 3 days QuickStatements errors-out more than half the statements in every batch on the first run, and I've had to run the batches 4-5 times to get every statement to go through. Is that due to my connection or is the problem on the server end? Levana Taylor (talk) 23:26, 16 May 2020 (UTC)

I think it's QS's interface with the server, again. Use wikibase-cli. --SCIdude (talk) 08:48, 17 May 2020 (UTC)
What was the cause? It is still slow. Why is it happening all the time across all Wikimedia projects? Eurohunter (talk) 12:47, 17 May 2020 (UTC)
Engineers from WMDE and WMF are still investigating. Some elements of answers can be found here. It doesn't seem related to our usual issues with maxlag, it's a specific problem caused by a mariadb bug. Lea Lacroix (WMDE) (talk) 14:21, 18 May 2020 (UTC)

E-ISSN

Is there a Wikidata property for E-ISSN?

E.g:

http://ezb.uni-regensburg.de/ezeit/?2696771

E-ISSN(s): 1948-5840.

If not, can ISSN be used as well for E-ISSN?

--Nstrc (talk) 17:07, 17 May 2020 (UTC)

The e-ISSN is really just "the ISSN assigned for an electronic edition", rather than a different type of identifier, so the normal ISSN property can be used for it. Andrew Gray (talk) 22:25, 17 May 2020 (UTC)
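Since an e-ISSN has the same form as any other ISSN, the usual check-digit test applies to it unchanged. The Python sketch below implements the standard modulo-11 check (weights 8 down to 2 over the first seven digits, with "X" standing for 10); the function name is made up for this example.

```python
def issn_is_valid(issn: str) -> bool:
    """Check the format and the modulo-11 check digit of an ISSN such as 1948-5840."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    if not (chars[7].isdigit() or chars[7] == "X"):
        return False
    # Weighted sum of the first seven digits with weights 8, 7, ..., 2.
    total = sum(int(digit) * weight for digit, weight in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected

print(issn_is_valid("1948-5840"))  # the E-ISSN quoted in the question
```

For 1948-5840 the weighted sum is 187, which is divisible by 11, so the expected check digit is 0 and the number passes.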
Thank you.
--Nstrc (talk) 07:07, 18 May 2020 (UTC)

Elections where number of candidates=numbers of seats

I have asked before about elections with one candidate: Wikidata:Project chat/Archive/2020/05#Elections with one candidate. No problem with that. But there are also Municipal Council elections in which all political parties agree on how to share the seats, so each party fields as candidates only those it chose after the agreement. For example, if one party was taking 5 seats according to the agreement, that party had only 5 candidates. The other case is that all political parties form a political coalition (Q6138528) that has as many candidates as there are seats. In both cases, all candidates are elected without voting. Is that a case of tacit election (Q1760295)? Xaris333 (talk) 02:33, 18 May 2020 (UTC)

"Is that a case of Stille Wahl (Q1760295)?"
Yes, it seems so. But it seems to be a special term from Swiss and Liechtenstein political and constitutional terminology. I had never heard or read the term "Stille Wahl" before.
"Stille Wahl" = "If all lists together name no more candidates than there are seats to be filled, all candidates are declared elected by the cantonal government."
That is, if there are no more candidates than seats, all candidates get into office without an election, by mere appointment.
--Nstrc (talk) 07:05, 18 May 2020 (UTC)
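The quoted definition reduces to a single comparison, shown here as a small Python sketch. The function name and the party names are hypothetical, and requiring at least one candidate is an added assumption.

```python
def is_tacit_election(candidates_per_list, seats):
    """Return True if all lists together name no more candidates than there are seats."""
    total_candidates = sum(candidates_per_list.values())
    return 0 < total_candidates <= seats

# Hypothetical council with 9 seats shared by prior agreement between two parties.
print(is_tacit_election({"Party A": 5, "Party B": 4}, seats=9))
```

Both cases from the question satisfy this test: seats shared out by agreement, and a single coalition list with as many candidates as seats.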

Wikidata:Requests for permissions/Oversight/Kostas20142

In accordance with the instructions at Wikidata:Requests for permissions/Oversight, this is a notification of my candidacy. --Kostas20142 (talk) 14:25, 18 May 2020 (UTC)

Double check property edit

Could someone please double check that my edit Special:Diff/1182487115 was correct? The UI to edit was confusing, the diff is confusing, and Special:EntityData/P2257.json is refusing to update, even after purges. Nirmos (talk) 06:21, 16 May 2020 (UTC)

@Nirmos: It looks fine to me. Thanks for improving our constraints! ArthurPSmith (talk) 17:32, 18 May 2020 (UTC)

international recognition of Transnistria (Q25047789)

@Pridnestrovian editor: reverting wrong version for no reason. Eurohunter (talk) 12:46, 17 May 2020 (UTC)

@Eurohunter: I turned to you on the discussion page with a request to comment on the edit war you are provoking. You ignored it. What are you talking about? Pridnestrovian editor (talk) 13:48, 17 May 2020 (UTC)
Most of the contributions of the user here consist of replacing Transnistria with Pridnestrovye in English, which is incorrect. I already warned them and promised to block them (which is likely to happen, because they do not seem to be interested in stopping and discussing the issue); however, most of their edits need to be reverted, and I am afraid I do not have time for this. This is unfortunately one of the instances where one person can inflict so much damage in a short time that no human can revert it without spending about ten times as much time.--Ymblanter (talk) 14:05, 17 May 2020 (UTC)
I think this is a case where one needs to roll back the entire contribution of the participant with a bot. Carn (talk) 16:54, 17 May 2020 (UTC)

I suggest paying attention to the behavior of this moderator, who:

1) does not understand the problem and does not want to discuss it, attacking the editing participants;
2) threatens users by defending their own point of view (this is his "greetings"):
"A repetition of this will likely result in the block of your account",
"You have to stop immediately, otherwise I will block you indef",
"promised to block them (which is likely to happen";
3) lies in the discussion and makes unfounded accusations:
"damage you inflicted on Wikidata is already too high",
"they do not seem to be interested in stopping and discussing the issue";
4) provokes conflict and destructive activity.

The actions of such moderators can cause real damage to the project. Pridnestrovian editor (talk) 16:28, 17 May 2020 (UTC)

If Ymblanter thinks that you are a problem, then with high probability you are. The fact that you allow yourself unethical statements about him in this discussion confirms this point of view. I hope you listen to his advice and stop what you are doing without any block. Carn (talk) 16:44, 17 May 2020 (UTC)
That is, do you think that his threats are not unethical, but my citations are? I'm already used to the fact that in some wiki projects some moderators use their authority to uphold their own point of view, but for beginners such aggression leads them to stop contributing. At one time (2007-2011) in the Russian-language Wikipedia (not without the participation of this user, who was an administrator there), such actions led to the departure of 2/3 of active users to alternative projects (Traditio, Lurkmore, Cyclopedia, etc.), which are in some respects more successful than that Wikipedia.
I am always ready for discussion, and if there is sound reasoning, I will implement the decisions made without any extra questions. But from this participant I saw nothing but threats. Why are they addressed to me, and not to the second editor? Because he defends his position without even paying attention to the previous discussion of this topic, as a result of which I began to correct incorrect data in the records. Pridnestrovian editor (talk) 18:06, 17 May 2020 (UTC)
Now we need an immediate block, and it would be better if someone else, not me, would block this user for continuous disruption. Whoever is in doubt whether this is disruption or not can check the name of the English Wikipedia article.--Ymblanter (talk) 18:16, 17 May 2020 (UTC)
And its talk page and protection log, if there are still doubts left.--Ymblanter (talk) 18:19, 17 May 2020 (UTC)
In English, the region is overwhelmingly referred to as Transnistria or occasionally Transdniester. The English-language article is at en:Transnistria. I see that User:Pridnestrovian editor has contentiously edited that article with the sort of POV remark in the imperative mode ("People who do not speak Russian or other Slavic languages should keep in mind…") that goes completely against Wikipedia's style.
I cannot claim to be an uninvolved party, since I was involved in the prior discussion, and I'm not an admin on this wiki, but in my opinion Pridnestrovian editor is not genuinely open to discussion: he is en:sea lioning, and if he won't knock it off some uninvolved admin here should block him. - Jmabel (talk) 18:26, 17 May 2020 (UTC)
And what's the point of blocking someone here? This is just stupid. I myself will leave here if the local administrators do not want to engage in a constructive dialogue, but intend to support this user with the help of administrative powers. The only problem is that other users will constantly appear instead of me, adequate enough to understand what is happening here, for many of which such insults to entire countries and peoples will seem extremely wild. Pridnestrovian editor (talk) 18:45, 17 May 2020 (UTC)
Just to mention that after Pridnestrovian editor has been indefinitely blocked, an IP showed up at my page with anti-semitic hate speech in Russian mentioning Transnistria, so that I had to revision-delete the edit.--Ymblanter (talk) 20:38, 17 May 2020 (UTC)
If he starts creating new accounts, you might want to consider filing a checkuser request against him. @Ymblanter:--Trade (talk) 20:58, 17 May 2020 (UTC)
The user has been blocked. Someone probably wants to review these edits
A related discussion has been started by user:ŠJů at Commons:Commons:Categories for discussion/2020/05/Category:Transnistria. That ip range seems to have a special interest in Transnistria. Multichill (talk) 20:49, 17 May 2020 (UTC)
I think I did it, you may check Carn (talk) 16:14, 18 May 2020 (UTC)
Спасибо @Carn:--Ymblanter (talk) 18:47, 18 May 2020 (UTC)

Bob Singleton spam

For some reason, new low-quality items for Bob Singleton (Q28101393) keep getting created:

Many redirects already have been created: Special:WhatLinksHere/Q28101393.

 – The preceding unsigned comment was added by Haansn08 (talk • contribs).

Is it some sample in a manual? Maybe @Matěj_Suchánek: could filter them. --- Jura 09:13, 18 May 2020 (UTC)
In which case similar people Rajat Singla and MD Anan Islam should be included as well. --Trade (talk) 09:39, 18 May 2020 (UTC)
IP addresses look different but all are the same ISP (Windstream Communications) and all are in Iowa. Different cities but information may not be accurate at that level. Peter James (talk) 10:15, 18 May 2020 (UTC)
I deleted all the above as spam--Ymblanter (talk) 18:51, 18 May 2020 (UTC)
Some of their contributions (98.16.49.24, 198.14.240.94) look like fake items, other items created (198.14.244.83, 173.191.207.25, 207.155.115.182, 98.16.49.24, 98.17.32.179) are often duplicate or not notable; if there are no references they should probably be deleted. Peter James (talk) 10:42, 18 May 2020 (UTC)
Also two "Bob Singleton" items Q74748678 and Q74747202 both by 98.16.50.219, 10 minutes apart, a few days after Q73979768 by 98.16.51.0. Peter James (talk) 10:48, 18 May 2020 (UTC)
In some of these only one or two IP addresses have been used so /21 is just a guess, also the list is probably incomplete. Peter James (talk) 11:03, 18 May 2020 (UTC)
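The /21 guess above can be sanity-checked with a short sketch. It uses Python's standard `ipaddress` module and only the addresses quoted in this thread; the /21 prefix length is the guess from the comment above, not a confirmed allocation, and the helper name `group_by_prefix` is made up for illustration.

```python
import ipaddress
from collections import defaultdict

# Addresses quoted in this thread; the list is probably incomplete.
ADDRESSES = [
    "98.16.49.24", "98.16.50.219", "98.16.51.0", "98.17.32.179",
    "198.14.240.94", "198.14.244.83",
    "173.191.207.25", "207.155.115.182",
]


def group_by_prefix(addresses, prefix_len=21):
    """Group addresses by the enclosing network of the given prefix length."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False accepts a host address and returns its enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)


if __name__ == "__main__":
    for net, members in sorted(group_by_prefix(ADDRESSES).items()):
        print(net, members)
```

With these inputs, the three 98.16.x.x addresses share one /21 and the two 198.14.24x.x addresses share another, while each remaining address lands in its own network. That is consistent with the guess but does not prove that a whole /21 is in use.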

Planned maintenance operation (read-only time) on May 19 @ 5:00 A.M.

Hi,

This is a reminder about the planned maintenance operation that is about to happen on Tuesday 19th May at 05:00 AM UTC, for 15 minutes.

This wiki will go read-only during this operation. Services targeting Wikidata may not work in the meantime.

See also: phab:T251981.

--Kaartic (talk) 12:17, 16 May 2020 (UTC)

More specifically, it should be less than 15 minutes. See phab:T251981. I thank all for their help in this. KylieInTheSkylie (talk) 23:34, 18 May 2020 (UTC)

Wikidata weekly summary #416

Revisit of Wikidata:Alternate accounts after CheckUser

In July 2019 I started a discussion to review Wikidata:Alternate accounts. Now that we have CheckUser, I propose again to relax the alternate accounts policy. See Wikidata:Alternate accounts/Draft for my proposal (text in green requires further discussion). Comments welcome.--GZWDer (talk) 07:19, 12 May 2020 (UTC)

  • Your proposal looks good to me. There are edits that need increased privacy, especially in non-Western jurisdictions, and the policy we have now prevents established users from making those edits. Going through a mailing list for CheckUsers should provide enough disclosure. The actions in "Handling illegitimate accounts" seem reasonable to me so that I would be fine with using must to describe how users have to declare accounts to the mailing list. When the wording is finished, there should be an RfC for the adoption of the new policy. ChristianKl 09:00, 12 May 2020 (UTC)
  • I have several comments:
    In the last line, I would change "may" to "should" - if people are using accounts in good faith, they are probably just unaware that they should link them and should definitely be given a chance to fix it.
    I don't agree with the local-only accounts section. If someone is editing in Wikidata and has another account that also edits Wikidata from another project, I still think they should link the two when asked to (I would not immediately block them for it - see above).
    Allowing private disclosure for privacy reasons seems good.
    Personally I have always interpreted alternative accounts to mean simultaneous use of multiple accounts. I don't care about anyone's old abandoned accounts unless they're continuing a pattern of abusive behaviour. I would rather clarify that it's about simultaneous use than add an exception for clean starts.
  • - Nikki (talk) 09:38, 12 May 2020 (UTC)
  • The "Special cases --- Privacy" section is too narrow. For instance, if an editor wanted to edit items related to a certain region (e.g. their place of residence), they might reveal special knowledge that can be used to geolocate them. Using an alternate account for such an editing pattern would IMO also protect the main account's privacy. Thus, should we somehow modify this part of the draft so that the idea of privacy protection is conveyed without limiting it to a few special cases only, as is currently the case? ---MisterSynergy (talk) 11:24, 12 May 2020 (UTC)
@Nikki: The "local-only" clause is moved to the bottom.
@MisterSynergy: "Privacy" section is revised.--GZWDer (talk) 11:42, 12 May 2020 (UTC)
  • The proposed requirement that alternate accounts should unconditionally be reported to CheckUsers via a designated mailing list is highly dangerous. That mailing list and the CU wiki might have private access, but information might nevertheless be leaking from there (software misconfiguration, software hacked, accidentally copied somewhere else, rogue CU, etc.). Given that use of alternate accounts can be an essential part of hiding and protecting someone's real identity from serious trouble, I'd rather not collect such information at all. It would IMO suffice to mention that in the case of a CU investigation related to someone's alternate accounts, a private mailing list can be used to communicate with the CU in order to clear things up privately. ---MisterSynergy (talk) 11:34, 12 May 2020 (UTC)
  • Pinging participants of previous discussion for comment: @Marsupium, Jasper Deng, Rschen7754, VIGNERON, Sextvåetc: and other CheckUsers and candidates: @Sotiale, علاء, BRPever:--GZWDer (talk) 16:13, 12 May 2020 (UTC)
  • Edits I do on other wikis, such as page moves and deletions, are recorded here at Wikidata as local edits. There are also thoughts about being able to edit Wikidata-based infoboxes directly from Wikipedia. How do we handle edits of users who do not even contribute to our community? Can we therefore diverge from the global policy at all? 62 etc (talk) 16:30, 12 May 2020 (UTC)
  • This needs an RFC and should not be decided on Project chat. --Rschen7754 18:33, 12 May 2020 (UTC)
    • It does need an RfC to make the decision but with proposals like this it makes sense to first circulate them to get a wording that then can be voted on. ChristianKl 19:16, 12 May 2020 (UTC)
  • I simply do not see what is wrong with the current policy and oppose the revision. Whether we should allow clean starts is something I am not sure of; those are easily abused and especially on a project where we need transparency, I am wary of relaxing requirements for that. I adamantly oppose any proposal to relax the requirements for declaration of alternate accounts as I feel that exceptions such as "mechanical" edits due to actions on other wikis would be exploited as a loophole. In my experience, the great majority of situations of undeclared multiple account use are abusive and I see no benefit here. Note that the policy says that undeclared accounts are "not considered legitimate alternate accounts" rather than "are considered illegitimate", to allow for gray areas and admin discretion. I don't see how this is not enough.--Jasper Deng (talk) 19:25, 12 May 2020 (UTC)
    • @Jasper Deng: I don't see why Wikidata needs more transparency than other projects, some of which (such as English Wikipedia) do not even require, by policy, declaration of alternative accounts when requesting adminship.--GZWDer (talk) 20:49, 12 May 2020 (UTC)
    • Note I have removed the explicit endorsement of mechanical edits.--GZWDer (talk) 20:51, 12 May 2020 (UTC)
      • (edit conflict) @GZWDer: The nature of our editing, i.e. high amounts of semi-automated small edits, and consequent recent changes volume, inherently makes it harder to audit any one user's edits. I consider such editing bad for transparency but the community has no problem with it.--Jasper Deng (talk) 20:53, 12 May 2020 (UTC)
        • @Jasper Deng: This revision explicitly prohibits "using undisclosed accounts to avoid detection". In fact only two types of accounts do not require public declaration - clean start and privacy account. I also proposed mandatory disclosure of privacy accounts to CheckUsers (though MisterSynergy does not agree).--GZWDer (talk) 20:56, 12 May 2020 (UTC)
          • @GZWDer: I'm sorry, but that does not at all address my concerns (what does "avoid detection" mean?). The nature of even legitimate editing makes things harder to audit on this project.--Jasper Deng (talk) 21:04, 12 May 2020 (UTC)
            • @Jasper Deng: Is auditing still a concern even if the accounts must be declared to CheckUsers, and they must not edit same pages or discussions?--GZWDer (talk) 21:15, 12 May 2020 (UTC)
              • @GZWDer: We can't do that (see my reply and ping above). Even if we could, if multiple accounts by the same user do many automated edits, that is a lot of edits to sift through in order to determine if they've edited the same pages, or perhaps with more difficulty, discussions (since the nature of discussion pages like this one is multiple discussions' edits interleaved, so "editing the same page" is not enough in that case).--Jasper Deng (talk) 21:19, 12 May 2020 (UTC)
                • @Jasper Deng: If this is a concern, I can add a point: such privacy accounts must not perform any automatic or semi-automatic edits at an edit rate higher than an ordinary user can perform manually. (stricken, likely a misunderstanding)--GZWDer (talk) 21:23, 12 May 2020 (UTC)
                • And another point: the privacy accounts may only edit some specific areas, and may not participate in any discussion in which the main account is involved.--GZWDer (talk) 21:28, 12 May 2020 (UTC)
                  • (edit conflict) Please consider adding on to your original comment instead of making new ones, to reduce edit conflicts. @GZWDer: Then the fact that someone is not doing automated edits on one account but doing them on another already compromises privacy somewhat. Also, Wikidata is in the real world; you must be willing to accept real-life consequences of participation on a public website. If in doubt, you should not make edits that could impact yourself in real life. It is better to privately ask a completely independent editor to make such edits. If this is meant as a defense against harassment, this would be practical only if we have an arbitration committee or other body empowered to handle this situation. I am wary of assigning CheckUsers that role; it would make it harder to elect new CheckUsers.--Jasper Deng (talk) 21:29, 12 May 2020 (UTC)
                    • @Jasper Deng: I don't think an ArbCom is required to handle privacy issues. Many large wikis like Commons do not have one. And the original purpose of ArbCom is to act as the last step of dispute resolution; they do not specifically act as a body for privacy issues (e.g. this is most similar to what ArbCom was originally intended to be; though I do not read French, I suspect the French one does not explicitly have the function of handling privacy issues). If a body is really needed, I have a crazy idea: we introduce a "privacy committee", members of which have CheckUser and Oversight rights, and we move all existing CheckUsers and Oversighters to the new committee (if the community approves), then deprecate the current request process for CheckUser and Oversight permissions.--GZWDer (talk) 21:51, 12 May 2020 (UTC)
                      • @GZWDer: I'm not saying we should have an arbcom; it would be IMO sad if we did, just because that would show an inability to resolve disputes on our own. However, I view the whole thing as a solution in search of a problem. Let us not make more bureaucracy for the sake of bureaucracy, and keep the policy as simple as possible.--Jasper Deng (talk) 23:38, 12 May 2020 (UTC)
                        • Just because you personally know about no data that you want to add but for which you need privacy doesn't mean that the same thing goes for other people as well. The CheckUsers right is already highly privacy relevant and it should be hard to give it to more people. ChristianKl 07:43, 13 May 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @ChristianKl: My point still stands. Also, again, note that the entire idea of having a Wikidata-specific CheckUser mailing list is not possible.--Jasper Deng (talk) 09:13, 13 May 2020 (UTC)

@Jasper Deng: We have oversight@wikidata.org as a private Wikidata-specific mailing list for Oversighters. Why do you believe that the same wouldn't be possible for CheckUsers? ChristianKl 15:15, 13 May 2020 (UTC)
@ChristianKl: oversight@wikidata.org actually points to an OTRS queue. But I don't know why we can not have a similar queue for CheckUsers.--GZWDer (talk) 15:46, 13 May 2020 (UTC)
@GZWDer: I don't think an ArbCom is required to handle privacy issues. Many large wikis like Commons do not have one.
Well, meh. At least ArbCom can make decisions. Commons often can't. When a conflict arises, an admin may say "user X should (not) do Y", the thread is closed and archived and user X just continues doing Y without repercussion. That's the best case scenario as many threads get archived and the conflict is assumed to resolve itself somehow. So the same issues flare up again and again. If CheckUsers are involved, those CheckUsers also act as de facto ArbCom for that case. Is ArbCom the solution? I don't know, I imagine that on smaller wikis where everyone knows each other an ArbCom would be overkill. Commons ain't no small wiki and I wouldn't call it a shining example of a big wiki that functions well without having an ArbCom.
Jasper Deng: In my experience, the great majority of situations of undeclared multiple account use are abusive
What a shock. Considering you'd find most undeclared alternate accounts as the result of investigation into abuse and users who have alternate accounts for legitimate reasons don't abuse them, it's... actually totally expected that the majority of undeclared accounts you see are abusive. And it would totally be solved by a declaration requirement. Please sign here if you are planning to steal from our shop. Alexis Jazz please ping me if you reply 04:00, 14 May 2020 (UTC)

'The CheckUsers right is already highly privacy relevant and it should be hard to give it to more people.' In what way do you think it should be harder to get CU? --Trade (talk) 14:36, 13 May 2020 (UTC)@ChristianKl:

By my count English Wikipedia has 26 CU's in contrast to 1141 admins. We have 4 CU's for 58 admins. Our ratio is already a lot higher than that of EnWiki. An outcome where we don't give anybody new CU rights in the next five years seems reasonable to me. 4 CU's are plenty for the needs of Wikidata. ChristianKl 15:15, 13 May 2020 (UTC)
The inability for ordinary users to read and search the labels for deleted QID's makes detecting sock puppets harder. If we could find a way to fix this issue we'll likely see more CU requests and thus need a higher amount of Check Users.--Trade (talk) 20:20, 13 May 2020 (UTC)
@ChristianKl: That means we need more admins, not less CU's. Considering that this ratio is higher on a project like the Simple Wikipedia, this is a very poor argument. Also, note that the change to the socking policy is proposed to be permanent; if we are going to grow as a project, then at some point we will be needing more CheckUsers, and even if not, we will sooner or later have to replace the existing ones.
WMF cares about privacy. Instituting a mailing list where protected information such as IP addresses gets circulated might find opposition. This proposal is however about providing more privacy protection and thus will be supported by the WMF. Practically, that would go over OTRS. ChristianKl 19:52, 13 May 2020 (UTC)
@GZWDer: An OTRS queue might still not be permitted. What I have been told is, a local CheckUser mailing list is not allowed. I am unsure if that extends to OTRS. Either way, I still oppose the specific use cases you propose simply because there are better alternatives that are simpler.--Jasper Deng (talk) 19:22, 13 May 2020 (UTC)
You haven't provided any alternative for how established users can upload data that might get them into trouble when linked to their real identities. CU's can already see IP addresses and thus, whether or not someone discloses their identity explicitly to CU's, they could be tracked down if a CU is compromised. ChristianKl 20:12, 13 May 2020 (UTC)
@ChristianKl: you seem to have missed If in doubt, you should not make edits that could impact yourself in real life. It is better to privately ask a completely independent editor to make such edits. above. And I doubt a local CU mailing list is prohibited for just that reason. I need to seek clarification on why.--Jasper Deng (talk) 20:49, 13 May 2020 (UTC)
In general it's a sign that data is valuable if there's a party that goes through effort to punish its publication; as a result, it's bad for Wikidata when we discourage the addition of such valuable data.
If I would for example want to add data about corruption in the Chinese communist party, I would want some privacy protection. Any person I might ask to help in such a project might also feel like the risk isn't worth it to themselves. Would you be willing to upload such data if somebody asks you personally? ChristianKl 07:15, 14 May 2020 (UTC)
  • @ChristianKl: The claim In general it's a sign that data is valuable if there's a party that goes through effort to punish its publication seems totally wrongheaded to me. Would Nazi propaganda be valuable in Germany? Child pornography in the United States? - Jmabel (talk) 15:20, 14 May 2020 (UTC)
WikiCommons does host content like https://commons.wikimedia.org/wiki/File:Reichspost%2BHakenkreuz.jpg that's banned in Germany and I do think that the encyclopedic value of that image is higher than that of the average random picture on WikiCommons.
I don't think Child pornography images are valuable for the purposes of the goals of Wikimedia but it's still data that's valuable to some people. Things get banned because they matter. Sometimes those are good things, sometimes they are bad things. We forbid the bad things via policy and if someone would upload child pornography we wouldn't try to protect their privacy but help law enforcement. On the other hand, for information that's not violating our policy the fact that someone wants to go through effort to keep the information from being published is a sign that there's value in publishing the information. ChristianKl 19:00, 14 May 2020 (UTC)
I simply strongly oppose the drop of transparency that results from this; after all, avoiding scrutiny is one of the things that alternate accounts may not be used for. With a large enough community I am sure it should not be too hard to find an independent editor to make changes you wish to make.--Jasper Deng (talk) 22:35, 14 May 2020 (UTC)

Per WMF, any mailing list that contains private info (especially IP info) generally does not get archived (this includes mail:checkuser-l). Local CU mailing lists are generally not allowed either, to enforce that policy. That means that they would have to set up a private WMF wiki to store the list of privately declared accounts, unless you want them to keep it in a Google Doc. --Rschen7754 00:14, 14 May 2020 (UTC)

@Rschen7754: I previously said "but they should be or must be declared to CheckUsers via designated mail list; CheckUsers will record the accounts at the private CheckUser wiki." I may add "Account relationships you declared will also be revealed to current CheckUsers in other projects, Stewards and Ombudsmen, but access to the relationships is governed by m:Access to nonpublic personal data policy; if you do not trust them, you should not use private alternative accounts at all."--GZWDer (talk) 05:05, 14 May 2020 (UTC)
Yeah, and I think that a lot of people would not be happy with that - all members of the English Wikipedia ArbCom have CU, and many globally don't trust them or are even banned from there. Plus, the more people who have access to that wiki (over 100), the more likely that it is going to leak, which is bad for easily memorable information like usernames. --Rschen7754 05:28, 14 May 2020 (UTC)
  Comment: WD:IPBE should also be modified. --Liuxinyu970226 (talk) 00:51, 14 May 2020 (UTC)
That page needs to be thoroughly reviewed, I don't know if it now agrees with m:Legal/Statement regarding IP block exemptions. --Rschen7754 00:57, 14 May 2020 (UTC)
I was not aware of that page as it was not ratified as policy. This should not be discussed here; we need a separate discussion over it.--Jasper Deng (talk) 01:20, 14 May 2020 (UTC)
Wikimedia Legal has been contacted by the Ombudsman Commission (“OC”) regarding an investigation by the OC into a practice by some Checkusers (“CUs”), primarily on English Wikipedia, of making preventive mass checks of users who have the IP block exemption (“IPBE”) flag on their Wikimedia user account.
Wow. I am shocked yet somehow not surprised at the same time. w:en:Wikipedia:CheckUser#Fishing was already in place well before 2016, but those CheckUsers apparently figured they were above the rules. Alexis Jazz please ping me if you reply 04:00, 14 May 2020 (UTC)
For what it's worth, I have no intention of "preemptively" checking those who have IP block exemption like that.--Jasper Deng (talk) 05:18, 14 May 2020 (UTC)
Don't ask users to reveal sensitive information and declare alternate accounts. You won't catch any vandals, because vandals don't declare their alternate accounts anyway. Alexis Jazz please ping me if you reply 04:00, 14 May 2020 (UTC)
This is not about vandals. This is about the borderline cases. --Jasper Deng (talk) 05:18, 14 May 2020 (UTC)
  • Given that we have multiple preferred options (do not allow secondary privacy-related accounts / allow secondary privacy-related accounts if disclosed / allow secondary privacy-related accounts without disclosure), we should have an RfC that provides all options and allows everyone to support the option they like best. ChristianKl 07:01, 14 May 2020 (UTC)
  • Just remove the requirement to have all alternate accounts publicly declared. The main restriction on multiple accounts should be disruptive use, and if someone wants to clean start or avoid political/real life sensitivity they shouldn't need to follow a bureaucratic process to do so, particularly since Wikidata isn't the home wiki for most and they are very unlikely to know about/respect procedure we have here. -- Ajraddatz (talk) 13:22, 15 May 2020 (UTC)

What to do with Q43011752?

This item says instance of (P31) → Wikimedia disambiguation page (Q4167410), but judging by its enwiki link, it doesn't look like a disambiguation page; rather, it looks like a shortened form of en:Saterland Frisian (Saterland Frisian (Q27154)). --Liuxinyu970226 (talk) 12:39, 18 May 2020 (UTC)

We try to have an article on every language. It was a bit weird to have separate articles on the dialects but none on the language. Usually it would be the other way around, with the dialects split off when they become developed enough. Kwamikagami (talk) 05:08, 19 May 2020 (UTC)

Recipes

Is it OK to add recipes to Wikidata? I understand that a recipe may or may not be under copyright, but there shouldn't be a problem with adding the title, author, and a list of ingredients? For instance, could I add an item for Chef John's Buttermilk Biscuits and use has part(s) (P527) to list the ingredients? For example: has part(s) (P527) → baking powder (Q29476), quantity (P1114) → 2 teaspoon (Q88296091)? U+1F360 (talk) 02:33, 19 May 2020 (UTC)

Such items already exist, as instances of recipe (Q219239), e.g., Haagse bluf (Q1833211). Ghouston (talk) 05:59, 19 May 2020 (UTC)
@Ghouston: Thanks! U+1F360 (talk) 13:09, 19 May 2020 (UTC)
Another question, would it be better to use has part(s) (P527) or has part(s) of the class (P2670)? U+1F360 (talk) 13:09, 19 May 2020 (UTC)

Currently franchising (Q171947) → subclass of (P279) → business model (Q815823). I guess that since business model (Q815823) is not an instance of (P31) second-order class (Q24017414), this is right, but maybe business model (Q815823) should be an instance of (P31) second-order class (Q24017414)? If not, I guess I should create business model type → instance of (P31) → second-order class (Q24017414) and then add franchising (Q171947) → instance of (P31) → business model type.

  Notified participants of WikiProject Companies

  WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Iwan.Aucamp (talk) 13:24, 19 May 2020 (UTC)

Probably good to note: the way business model (P7936) is set up, it seems like franchising (Q171947) should be an instance of (P31) second-order class (Q24017414), as it has a constraint that values must be ?value → instance of (P31) → business model (Q815823), and currently this includes Zoo & Co. (Q219979), Gong Cha (Q5581670) and Tecnocasa (Q2399658) because they are instance of (P31) franchising (Q171947). Iwan.Aucamp (talk) 13:32, 19 May 2020 (UTC)
It also follows from the above, if business model (Q815823) should not be instance of (P31)second-order class (Q24017414) and "business model type" should serve this role, then the constraint for business model (P7936) should be updated. Iwan.Aucamp (talk) 13:34, 19 May 2020 (UTC)
It's possible that instance of (P31) → franchising (Q171947) was used because business model (P7936) had not been created; would business model (P7936) → franchising (Q171947) be more accurate? Peter James (talk) 16:57, 19 May 2020 (UTC)
@Peter James: It would be more accurate, but that is not really the problem. Even if that were fixed, I think the problem here would remain: Tecnocasa (Q2399658) → business model (P7936) → franchising (Q171947) won't validate because franchising (Q171947) is not an instance of (P31) business model (Q815823). I still think business model (Q815823) should be an instance of (P31) second-order class (Q24017414). Iwan.Aucamp (talk) 17:11, 19 May 2020 (UTC)
It probably should be - or is being a subclass of a variable order metaclass enough? Items such as Zoo & Co. (Q219979) shouldn't be instances of a subclass of business model, which is what the statements now say they are. Peter James (talk) 17:17, 19 May 2020 (UTC)
@Peter James: It is probably enough for now that business model (Q815823) → subclass of (P279) → variable-order class (Q23958852) (transitively). I will change franchising (Q171947) → subclass of (P279) → business model (Q815823) to franchising (Q171947) → instance of (P31) → business model (Q815823) and look at cleaning up bad ?items → instance of (P31) → franchising (Q171947). Thanks for the input. Iwan.Aucamp (talk) 17:41, 19 May 2020 (UTC)
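The validation issue discussed in this thread can be illustrated with a small in-memory sketch. It assumes that the constraint on business model (P7936) accepts a value only when the value reaches business model (Q815823) through one instance of (P31) step followed by any number of subclass of (P279) steps. That reading of the constraint, the tiny statement tables and the function names are assumptions for illustration, not the real constraint checker.

```python
# Statements before the change discussed above (QIDs taken from the thread).
BEFORE = {
    "P31": {"Q219979": {"Q171947"}},   # Zoo & Co. instance of franchising
    "P279": {"Q171947": {"Q815823"}},  # franchising subclass of business model
}
# Statements after the proposed change (the Zoo & Co. statement not yet cleaned up).
AFTER = {
    "P31": {"Q219979": {"Q171947"}, "Q171947": {"Q815823"}},
    "P279": {},
}


def superclasses(graph, cls):
    """Return cls plus everything reachable from it via subclass of (P279)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph["P279"].get(current, ()))
    return seen


def is_instance_of(graph, item, target):
    """True if item reaches target via one P31 step, then zero or more P279 steps."""
    return any(
        target in superclasses(graph, cls)
        for cls in graph["P31"].get(item, ())
    )


if __name__ == "__main__":
    for name, graph in (("before", BEFORE), ("after", AFTER)):
        for item in ("Q171947", "Q219979"):
            print(name, item, is_instance_of(graph, item, "Q815823"))
```

Under these assumptions, franchising (Q171947) only counts as a business model after the change, while Zoo & Co. (Q219979) stops counting as one, which matches the cleanup described above.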

error on connecting

Hi, I want to add fa:رده:بازیکنان والیبال زن اهل روسیه to Q9536257, but it shows an error. Please solve it. Yamaha5 (talk) 16:42, 19 May 2020 (UTC)

There was a duplicate sitelink to a page in the Serbian Wikipedia in Category:Russian women's volleyball players (Q86271542). I merged it, so it should now be possible to add the link. Peter James (talk) 17:08, 19 May 2020 (UTC)

Gene property: Microarray Probe IDs

Wikidata does not appear to have microarray probe IDs and their corresponding mappings to various gene naming conventions (Ensembl ID, HomoloGene ID, Entrez Gene ID).

Are probe IDs and mappings something that could be added to Wikidata? How would one go about this?  – The preceding unsigned comment was added by Reliscu (talk • contribs).

@Reliscu: I'm sure it can, you would have to request a property for those IDs I suppose, if you could give some examples of items with IDs it would be great. See Wikidata:Property_proposal for details on requesting a property. Iwan.Aucamp (talk) 21:17, 19 May 2020 (UTC)


  WikiProject Molecular biology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Wikidata:Property proposal/Person

Could someone please take a look at the page? Many of these proposals have been waiting for a conclusion for months. --Trade (talk) 00:30, 20 May 2020 (UTC)

@Infovarius: You reverted my change adding front matter (Q24033349) → described at URL (P973) → https://en.wikipedia.org/wiki/Book_design#Front_matter here with the reason "bad source". I am not aware of any guidelines that define what good or bad sources are for described at URL (P973). It is not intended to be the same as a reference as far as I can tell, and I don't think there is consensus that Wikipedia should not be allowed there (see this). If you want this rule or guideline, please raise it on the property discussion page.

I do think it is useful to have links to wikipedia which describe concepts and in this case I cannot add Book Design as the en.wikipedia link, because, well, it is just described at that page, and not the page's primary subject. If described at URL (P973) is not the right thing to use, I would like suggestions of what else to use. Iwan.Aucamp (talk) 17:45, 18 May 2020 (UTC)

  • We have sitelinks to link to Wikipedia. Redirects can be used to link to specific sections like this. ChristianKl 18:05, 18 May 2020 (UTC)
    • @ChristianKl: If I try to add en:Front matter (which does redirect to en:Book design) then I get a failure:
      {"code":"wikibase-validator-sitelink-conflict","module":"wbsetsitelink","*":"The link [https://en.wikipedia.org/wiki/Book_design enwiki:Book design] is already used by Item [[Q686831|Q686831]]. You may remove it from [[Q686831|Q686831]] if it does not belong there or merge the Items if they are about the exact same topic."}
      
      so it seems it is trying to add the page it redirects to. Iwan.Aucamp (talk) 18:26, 18 May 2020 (UTC)
      • At the moment you need to temporarily edit the Wikipedia redirect page to not be a redirect and then edit it again to make it a redirect again. ChristianKl 18:55, 18 May 2020 (UTC)
        • I'm not sure this approach is entirely satisfactory or right. I think described at URL linking to en.wikipedia works pretty well in comparison. I think if there is objection to this practice it should be raised on the property instead of being treated as being implicit, so we can update the property. Iwan.Aucamp (talk) 19:14, 19 May 2020 (UTC)
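For anyone scripting such sitelink additions, the error quoted earlier in this thread can be inspected programmatically. The sketch below parses that exact payload and pulls out the item that already holds the sitelink. The assumption that the conflicting item always appears as `[[Q…|Q…]]` in the message is based only on this one example, and the function name `conflicting_item` is made up for illustration.

```python
import json
import re

# Error payload copied from the comment earlier in this thread.
RAW = r'''{"code":"wikibase-validator-sitelink-conflict","module":"wbsetsitelink","*":"The link [https://en.wikipedia.org/wiki/Book_design enwiki:Book design] is already used by Item [[Q686831|Q686831]]. You may remove it from [[Q686831|Q686831]] if it does not belong there or merge the Items if they are about the exact same topic."}'''


def conflicting_item(raw):
    """Return the QID named in a sitelink-conflict error, or None otherwise."""
    error = json.loads(raw)
    if error.get("code") != "wikibase-validator-sitelink-conflict":
        return None
    # Assumption: the message links the item as [[Q123|Q123]], as in the example.
    match = re.search(r"\[\[(Q\d+)\|", error.get("*", ""))
    return match.group(1) if match else None


if __name__ == "__main__":
    print(conflicting_item(RAW))
```

Here the extracted item is the one that already links to en:Book design, which is why adding the redirect en:Front matter fails until the redirect is temporarily turned into a non-redirect page.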

More on meta-classes

Currently, transitively, media of Australia (Q4438762) subclass of (P279) knowledge (Q9081). This seems wrong to me, and I think it goes wrong at mass media (Q11033) subclass of (P279) specialty (Q1047113), which I think should instead be mass media (Q11033) instance of (P31) specialty (Q1047113) (if anything at all). I'm not that familiar with meta classes though, so I would appreciate any opinions on this matter.

  WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Iwan.Aucamp (talk) 17:21, 19 May 2020 (UTC)

  Done: changed mass media (Q11033) subclass of (P279) specialty (Q1047113) to mass media (Q11033) instance of (P31) specialty (Q1047113). Iwan.Aucamp (talk) 19:33, 20 May 2020 (UTC)

Adding car information per body type

I'd like to enrich information about car models and there is something I'd like to consult with the community:

Car classification and all its subclasses provide many types of cars. Query to check these:

SELECT ?type ?typeLabel WHERE {
  ?type wdt:P31 wd:Q836985 ;
        wdt:P279+ wd:Q1420 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

And instances of automobile models inherit from those car classifications. For example, Opel Antara inherits from sport utility vehicle.

However, there are models that would inherit from multiple car classifications. For example, Opel Astra G can inherit from (because it is manufactured in several different body types):

That's OK since we can have multiple inheritance in Wikidata. However, some of the model's features are different between different body types. Example of such features:

  • Length - the Astra G station wagon is longer than the hatchback version
  • Width - is usually the same
  • Height - can be different
  • Image - is different (as illustrated above)

What is the best way in which I can add information regarding images and dimensions of different variants (body styles) of the same automobile model?

One possibility would be to add a new property (like Automobile Body Type or something similar) with a value of one of the car classifications mentioned above. The other per-body characteristics (image, length, width, height) can be qualifiers. I am open to any other suggestions.

One problem with that approach is that dimensions on root level will become obsolete and redundant. There are cars with a single body type for which not specifying a body type and having the dimensions on root level will still be OK (example - Renault Arkana).

@Nikola Tulechki:

Petar-Iv (talk) 20:06, 19 May 2020 (UTC)

I'd say that if vehicles differ in body type, length, etc., then they are basically different models. If you want to record all that information, you'd need a Wikidata item for each variant. The same thing happens with other products like mobile phones and CPUs. Having a property like "Automobile Body Type" would be an alternative to using subclassing from the body type items, but I'm not sure that there'd be any advantage to it. Ghouston (talk) 05:59, 20 May 2020 (UTC)

United Nations membership and successor states

United Nations membership is currently modelled using member of (P463): United Nations (Q1065), with start time (P580) (and sometimes end time (P582)) qualifiers. Generally this information seems fairly complete and accurate. There is, however, a slight oddity surrounding successor states, particularly of the former Soviet Union. At the signing of the Charter of the United Nations (Q171328), not only was Soviet Union (Q15180) a member, but also Byelorussian Soviet Socialist Republic (Q2895) and Ukrainian Soviet Socialist Republic (Q133356) in their own right.

At the dissolution of the Soviet Union, Russia (Q159) took that seat, and the other two were replaced in turn by Belarus (Q184) and Ukraine (Q212). Each of the former three has a relevant 1991 end time (P582) qualifier on its member of (P463): United Nations (Q1065) statement, but each of the latter three has a start time (P580) dating the whole way back to 1945 (in Ukraine's case with a slightly odd subject has role (P2868): Ukrainian Soviet Socialist Republic (Q133356) qualifier). I can understand why each would want to claim that their UN membership started in 1945 rather than 1991, but it seems odd to me for both Q2895 and Q184 (say) to claim to have been members at the same time.

I suspect we probably want to update each member of (P463) for the modern states to their 1990s date, and add relevant replaces (P1365)/replaced by (P1366) qualifiers, but as this seems like it might potentially be contentious, I figure it's worth asking about it here first. Thoughts? Alternatives? --Oravrattas (talk) 10:36, 20 May 2020 (UTC)

I'd say so, since Belarus (Q184) for example only represents the country since 1991, it doesn't make sense for it to have any start date preceding that. Following replaces (P1365) you get Belarus (Q184) -> Byelorussian Soviet Socialist Republic (Q2895) -> Socialist Soviet Republic of Byelorussia (Q68678) and Lithuanian–Byelorussian Soviet Socialist Republic (Q76236) and after that you are in a loop. Ghouston (talk) 10:47, 20 May 2020 (UTC)
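The oddity described above amounts to two items claiming overlapping membership periods. A minimal sketch of that check follows, under stated assumptions: the `Membership` class and `overlaps` function are hypothetical names, only the years of the qualifiers are modelled, and 1991 is used as the assumed corrected start year for the successor state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Membership:
    item: str              # Wikidata item ID
    start: int             # year from start time (P580)
    end: Optional[int]     # year from end time (P582); None means no end time


def overlaps(a: Membership, b: Membership) -> bool:
    """True if the two membership periods share any time (half-open intervals)."""
    a_end = a.end if a.end is not None else float("inf")
    b_end = b.end if b.end is not None else float("inf")
    return a.start < b_end and b.start < a_end


byelorussian_ssr = Membership("Q2895", 1945, 1991)
belarus_current = Membership("Q184", 1945, None)   # as described in the thread
belarus_proposed = Membership("Q184", 1991, None)  # assumed corrected start year

print(overlaps(byelorussian_ssr, belarus_current))   # True
print(overlaps(byelorussian_ssr, belarus_proposed))  # False
```

With the start time moved to the 1990s date, the predecessor and successor no longer both claim membership for the same years.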

Has maintenance of items at Wikidata become impossible?

Given that new items are created at a continuously high pace and edits to correct problems are mostly throttled, I wonder whether we haven't reached a point where Wikidata has become too slow to maintain properly. --- Jura 10:59, 20 May 2020 (UTC)

  • What are you referring to when you say that edits to correct problems are mostly throttled? ChristianKl 11:17, 20 May 2020 (UTC)
    • You can create an item with 50 statements in one go, but fixing 2 statements on such an item is mostly impossible. Maybe it's better to create a correct version and then delete the old one. --- Jura 11:28, 20 May 2020 (UTC)
      • I just fixed some items and did not have many problems. Why is it mostly impossible to fix 2 statements? Is the database too slow? I had this some weeks ago, but right now things are pretty snappy. Iwan.Aucamp (talk) 13:06, 20 May 2020 (UTC)
      • Maybe we need to change the system in a way that if the database is at write capacity it doesn't allow new item creation via bots? ChristianKl 13:19, 20 May 2020 (UTC)
        • I wonder what the right ratio would be. If it's easier to create 100 items than to fix 1, data quality is likely to go down. --- Jura 13:25, 20 May 2020 (UTC)
          • I'm not sure it should be a question of ratio. We could just disable the item creation by bots as long as there's a backlog for edits. ChristianKl 13:36, 20 May 2020 (UTC)
            • Currently all bots are on hold, when there is too much lag on the database. This happens many times a day, for shorter or longer periods. The two bots that create most items are then on hold. Last 24 hours one user started several QS-batches and doubled the daily amount of new items, so maybe that is why we had some issues the last day. Edoderoo (talk) 13:50, 20 May 2020 (UTC)
          • Maybe we should do that. --- Jura 14:01, 20 May 2020 (UTC)

The summary suggests to me another reason why maintenance is becoming increasingly difficult: software rot. External databases are a moving target but, even if the bots create items for new database entries, they almost never update all claims/info on existing items, as this is more difficult and less rewarding. Moreover, some tasks like e.g. replacing aliases are impossible to do with QuickStatements. I have tried to solve this in the molecular biology part of WD, and would like to hear about other efforts, maybe at least a collection of links to code can be made, that would be great help for starters. --SCIdude (talk) 14:52, 20 May 2020 (UTC)

  • If we only added stable identifiers, it wouldn't matter that much. Maybe we should screen that better. --- Jura 15:08, 20 May 2020 (UTC)
    • What about Family Search? Their IDs unfortunately aren't stable because they allow merging in either direction between pairs of duplicates -- and there is an awful lot of merging going on -- but they are far and away too useful to drop, as a source for collections of original official documents. Although you can't view their documents unless you're registered [I have a fake name & email that I use just for them :-)], some of the documents aren't openly available anywhere else either. And at least they don't charge; they're by far the largest free source. Levana Taylor (talk) 15:50, 20 May 2020 (UTC)

Merge

Can we only allow merging into older item (prohibit merge into newer item)? I don't see the benefit of merging into newer item other than to disrupt things. Hddty (talk) 11:49, 20 May 2020 (UTC)

See Wikidata:Requests for comment/Disallow merging into newer entity. Ghouston (talk) 11:55, 20 May 2020 (UTC)

Wikimedia 2030 recommendations have been published

Wikimedia 2030 recommendations guide how the Wikimedia movement should change to meet the Wikimedia vision in the upcoming decade. They are at a high level so that the ideas are flexible enough to be adapted to different global and local settings. The final set clarifies and refines the previous version, which was published in January. The team that finalized the content reviewed and integrated feedback and made the language as clear as possible.

The result is a 40-page document that outlines 10 recommendations, along with 10 underlying principles, a narrative of change, and a glossary. We encourage you to read the recommendations.

There are a couple of other formats for you to take a deeper dive if you wish, such as:

If you would like to comment, you are welcome to do so on the Meta-Wiki talk pages. However please note that these are the final version of the recommendations. No further edits will be made.

The focus of the movement strategy process will shift toward implementation. The Wikimedia Foundation is intending to host a series of virtual events. The goal will be to produce a plan to begin the implementation — to identify what initiatives must come first, and in what sequence, and with what resources and support. This series will have a wide scope to ensure various parts of the movement are engaged. We expect to share more details this month, and begin this work together through the summer and into the fall.

SGrabarczuk (WMF) (talk) 14:52, 20 May 2020 (UTC)

 

deprecate obsolete form of a person's name?

The physicist James Clerk Maxwell (Q9095) is referred to in quite a few 19th-century sources, including the original Dictionary of National Biography, as "Clerk Maxwell, James" or "Clerk-Maxwell, James." But nowadays there is a near-universal consensus, including in the updated Oxford Dictionary of National Biography, to call him "Maxwell, James Clerk." Should the former compound surnames be deprecated? I don't see anything in the list of reasons for deprecation that applies. Levana Taylor (talk) 16:58, 20 May 2020 (UTC)

Wikidata Witches Project

Hi there!

Last weekend Nat and I started our Wikiproject Witches at the WM hackathon.

I made an inventory of all the statements that are used on the items of people who were accused of being a witch. Most of them use charge (P1595), but back in 1500 people would not always make it to trial. Sometimes they died beforehand or were released in the meantime. In some cases they were chased and killed by a mob. So is the alternative to use accusation (Q19357312)? A lot of queries rely on this one specific statement, so I do not want to change it without some feedback from you.

Same goes for exonerated of (P7781) and convicted of (P1399). How to model the data about people that did not have any time in front of a judge, but did receive punishment? Ciell (talk) 18:50, 16 May 2020 (UTC)

This is a weak modeling but perhaps significant event (P793) with hunt of the witches (Q188494) or with a new item of instance of (P31) hunt of the witches (Q188494) describing the specific witch hunt event (especially if there is more than one person concerned). The last solution may be usually better for a rich data project, since it will be easier to link all the entities involved in the event. Alexander Doria (talk) 09:45, 17 May 2020 (UTC)
Yes, I would at least want to describe the victims and their trials. Example per Alexander Hammiltoun (Q43390357) for the accused and Trial of Alexander Hammiltoun (Q43403608) for the trial. Ciell (talk) 11:09, 17 May 2020 (UTC)
I also think it's the best solution (unless the sources are really scant). Using qualifiers within the person item will quickly become limiting and hard to manage. Alexander Doria (talk) 11:35, 17 May 2020 (UTC)
@Ciell: Maybe this could be of use nature of statement (P5102) Iwan.Aucamp (talk) 15:18, 18 May 2020 (UTC)
Iwan.Aucamp How could I use that, do you have an example for me? Ciell (talk) 12:19, 21 May 2020 (UTC)

Can't add qualifier to statement

Trying to add information to Young Galileo (Q6880367) - I changed "Statement:Instance of" from "newspaper" to "magazine", added a reference URL. Now trying to add to magazine the qualifier "publication interval" as month (or unit:month) - and the operation "publish" remains grayed out and doesn't respond to my click. What's wrong? --Deborahjay (talk) 09:16, 21 May 2020 (UTC)

@Deborahjay: I think you need to use a value of 1, with unit of 'month' (and probably as a full statement, rather than a qualifier) See, for example Reader's Digest (Q371820) --Oravrattas (talk) 10:08, 21 May 2020 (UTC)

monograph series (Q1700470) and book series (Q277759)

monograph series (Q1700470) is categorised as subclass of book series (Q277759). - However, what is the differentia specifica, which makes monograph series (Q1700470) more specific than book series (Q277759)?

  • The label of the former is in English and Spanish "monographic series" respec. "serie monográfica". Does this mean, anthologies (Q105420) and essay collections (Q16324495) could be no elements of monograph series (Q1700470)?
  • The label of Q277759 is in English, German, French, Spanish "book series", "Buchreihe", "série de livres" respec. "serie de libros". So far, no problem. - However, for Q1700470 "monografische Reihe" is in German only mentioned as alias, while the main label is "Schriftenreihe".
  • GND treats "Buchreihe" and "Schriftenreihe" as well as "Monographische Reihe" as synonyms (i.e.: Schriftenreihe / Monographische Reihe <Q1700470> are not categorised as subclass of Buchreihe <Q277759>!): http://d-nb.info/gnd/4179998-7.

I noticed this problem while working on Positionen (Marxist book series; written in German) (Q94950782) (http://ld.zdb-services.de/resource/549789-9), which is a book series containing monographs as well as essay collections. Therefore I would tend to use ZDB ID (P1042) and book series (Q277759), rather than ZDB ID (P1042) and Q1700470 (monographic series).

--Nstrc (talk) 17:18, 20 May 2020 (UTC)

"could not be elements".
--Nstrc (talk) 19:41, 20 May 2020 (UTC)
The English Wikipedia has sitelinks for both, and the way it describes them is "A book series is a sequence of books having certain characteristics in common that are formally identified together as a group." and "Monographic series (alternatively, monographs in series) are scholarly and scientific books released in successive volumes, each of which is structured like a separate book or scholarly monograph." Ghouston (talk) 00:04, 21 May 2020 (UTC)
Then "scholarly" / "scientific" (= characteristic feature of "monographic series") would be the differentia specifica between "monographic series" and "book series". - Is this indeed true and valid for other languages as well?
--Nstrc (talk) 08:21, 21 May 2020 (UTC)

Monograph

I see that monograph (Q193495) is monograph in the more colloquial sense of "specialist work of writing on a single subject or an aspect of a subject" (and descriptions in several other languages seem to agree). Do we have an item for monograph in what I'm pretty sure was the original sense, a specialist document where only a single copy was intended to exist (there was no intent of publication)? Or am I wrong about that being the original sense? - Jmabel (talk) 15:00, 21 May 2020 (UTC)

@Jmabel: I don't think that's the original meaning - as far as I know "monograph" has always been "single subject", not "single exemplar". The OED gives it as originating with scientific writers in the early 19th century, a time when mass publication was pretty normal. "Monographer" and "monography" were a little earlier, second half of the 18th century, but still had the same sense of a specialised work. There's no reference to the "single copy" meaning, and it's the sort of thing I would expect the OED to be aware of! Andrew Gray (talk) 17:05, 21 May 2020 (UTC)
Huh. I've certainly heard it used that way for unpublished dissertations on file in one library. I'm surprised that wasn't the original meaning. - Jmabel (talk) 23:04, 21 May 2020 (UTC)

Wikidata:Requests for permissions/Oversight/Esteban16

Notification of my candidacy for oversighter. Esteban16 (talk) 18:15, 21 May 2020 (UTC)

 

Please, Merge

Please, merge United States (Q232865) and Estados Unidos (Q29406895). 217.117.125.72 19:49, 13 May 2020 (UTC)

And États unis (Q41700). 217.117.125.72 19:51, 13 May 2020 (UTC)
Those are not supposed to be merged. Disambiguation pages only get merged when they are disambiguations for the same string. ChristianKl 20:14, 13 May 2020 (UTC)
Some are, others are not; Q41700 has names in different languages. It wouldn't be possible to merge all as frwiki and dewiki have sitelinks at Q232865 and Q41700, and svwiki has sitelinks at Q29406895 and Q41700. Peter James (talk) 21:04, 13 May 2020 (UTC)
There are items that don't follow our general policy, but what I wrote above is still our general consensus. ChristianKl 21:47, 13 May 2020 (UTC)
Per Wikidata:WikiProject Disambiguation pages, its guideline page and talk page. Ghouston (talk) 00:52, 14 May 2020 (UTC)
Currently, the description of Wikimedia disambiguation page (Q4167410) is type of page in the Wikimedia system. We likely should update it in a way that makes this issue clear, so that the information about how it's used is easier to discover. ChristianKl 07:26, 14 May 2020 (UTC)
@Ghouston, ChristianKl: So I also shouldn't merge Meiguo (Q11608888) with either of the two items mentioned above? --Liuxinyu970226 (talk) 02:28, 21 May 2020 (UTC)
@Liuxinyu970226: Yes, the items shouldn't be merged. If someone wants sitelinks between them they can create redirects on the wikis. ChristianKl 07:54, 21 May 2020 (UTC)
@ChristianKl: Well, foreign languages redirects to disambiguation pages aren't allowed on zhwiki as per policy: zh:Wikipedia:重定向#非中文重定向問題 (this policy allows such redirects to be created on zhwiki if and only if: 1. the language of redirect (or its culture) has clear and definite relations to the target **article**, e.g. company/work/person/local name of non-Chinese speaking places; 2. there are reasonable hopes that Chinese-speaking users will usually use this foreign name to describe the target **article**, e.g. in professional literature works, commonly seen foreign person names in Latin script; 3. redirect titles are same as or very similar to which Wikidata items' labels are, which as I asked locally, also only applies to foreign languages **articles**; 4. if foreign Wikipedias already have such redirects to that foreign languages **articles** of same Wikidata item, e.g. zh:苹果 shares with en:Apple, and as en:Malus domestica redirects to enwiki Apple, it's allowed to create zh:Malus domestica to zhwiki 苹果; 5. special cases: timezones, CAS Registry Number (P231) of chemicals, Emojis (follow [47]) and/or airports' ICAO airport code (P239). As far as I believe, all of those cases can't be applied to disambiguation pages). --Liuxinyu970226 (talk) 00:34, 22 May 2020 (UTC)
I have no problem with zhwiki having such a policy. If zhwiki doesn't want someone who types "United States" into their search box to be redirected to the relevant disambiguation page, they are free to not allow those redirects. It just means that they also won't get Wikilinks given our policy. It is also still possible for anyone who wants to link to them to manually create the links. ChristianKl 12:27, 22 May 2020 (UTC)

Question about collection of plans for a building

I am creating or editing pages for my institution's archival collections. Most of them will have a statement P485, archives for, under the person I'm editing. I'm now looking at a collection of plans for a building. P1963, properties for this type, under Q41176, Building, doesn't list either archives or plans. How do I indicate this, or what should I propose? I've only been editing Wikidata for a few weeks. Librarianlois (talk) 15:15, 20 May 2020 (UTC)

You mean you add reference URLs to Wikidata person items' P485 claims. If I understand correctly you now want to indicate that a certain person's collection (or whole archive) consists of building plans. If so I would look for a property that can make statements about a person's work, like notable work (P800) or occupation (P106), and put as value architectural plan (Q47597). But this is not my field! --SCIdude (talk) 17:00, 20 May 2020 (UTC)

I want to add a link to the collection of plans under the entry for the specific building, but haven't found a way to do that. Librarianlois (talk) 18:08, 20 May 2020 (UTC)

I think Librarianlois is saying that she wants to add a link to a building item, indicating an archival object which contains architectural plans or drawings for the building, in the same way that someone would add a link to a person item, to link to an archival object containing their papers. Mary Mark Ockerbloom (talk) 18:42, 21 May 2020 (UTC)
described at URL (P973)? --SCIdude (talk) 07:28, 22 May 2020 (UTC)

Expressing a predicative relation ... the case of "banned in"

 
[Image: basic semantic triple model]

I am very confused about the resistance I have encountered to creating a property called "banned in". Does anybody have further examples of encountering resistance to using "properties" to express "predicates"?

I have created an item of the style "Ban of X from Y" based on the workaround suggested by the person opposed to the property creation. Unfortunately, I cannot see how this will meaningfully capture the predicative relation. It's been a week since I've made the proposal now. Are there any ontology experts out there who can help me understand why we should not be using a property to express this relationship, or who could show me how the workaround with its unresolved triple in the item label would actually work as well as standard practice in ontology? Many thanks! SashiRolls (talk) 14:46, 17 May 2020 (UTC)

Still following the example of the person who proposed Ban of X from Y, I created ban of involuntary servitude (Q94993180). I then was able to generate a list of countries in which slavery was banned by adding the statement instance of (P31) ban of involuntary servitude (Q94993180) to
While this workaround would not be my preferred way of expressing the relationship, it does have the advantage of working.
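SELECT ?item ?itemLabel ?placeLabel
WHERE {
  # instance of (P31): ban of involuntary servitude (Q94993180)
  ?item wdt:P31 wd:Q94993180 .
  # applies to jurisdiction (P1001)
  ?item wdt:P1001 ?place .
  # the label service fills in ?itemLabel and ?placeLabel
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}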
My work was reverted with the claim "this is not the way we use P31" (whose aliases include: "is", "is an example of", "is a specific"). No alternate solution or discussion was provided on any of the talk pages associated with the items in an effort at building consensus. I'm very disappointed that a working solution was removed, even if the solution may not be the final one chosen. I cannot see how it does any harm to leave examples of working solutions, until other working solutions are found. Ideas on other ways to approach the problem of Wikidata's apparent inability to list the dates when involuntary servitude was banned from countries?
Perhaps @ChristianKl: could explain his haste to remove these four statements that increased Wikidata's ability to answer the question? SashiRolls (talk) 17:22, 22 May 2020 (UTC)
Wikidata has a property proposal process to create standardized ways to model relationships. Circumventing that process by abusing properties outside of what they are for leads to inconsistent data. Having a bunch of inconsistent data makes it hard for users to interact with Wikidata's data. Acting with "haste" reduces the amount of mess that gets created by inconsistent modeling. ChristianKl 18:43, 22 May 2020 (UTC)
I mean you no ill will, Christian. Saying that the 13th amendment of the US Constitution (is / is an example of / is a specific) ban of involuntary servitude is fairly clearly not by any stretch of the imagination an abuse of properties. (Perhaps instead you meant to say "misuse", and intended to explain why you thought it was a misuse?) It allows for a joint table to be constructed and populated by foreign keys. I have left one example for discussion in the consensus building process: https://w.wiki/RaN. Please don't circumvent consensus-building again by deleting the single example. If & when another solution is chosen, we can delete the use of P31 you object to (did you explain why?)... should it be decided to create a property to characterize this relation, instead. Until that time, nothing should prevent people from studying the question. SashiRolls (talk)
For a query to provide a good answer to a question the domain of the question has to be expressed in a standardized way. At Wikidata we made a decision to require the property creation process instead of simply allowing users to create properties because we consider such standardization to be important when it comes to expressing relationships we don't already express.
The proper word for using a feature to circumvent a policy is abuse and not misuse. "use (something) to bad effect or for a bad purpose". Circumventing policy is a bad purpose. ChristianKl 06:35, 23 May 2020 (UTC)
Could you look up the proper word for accusing others of bad faith, Christian?
Has WikiData adopted the Five Pillars (cf. #4) or is it still a wild Wild West? SashiRolls (talk) 09:58, 23 May 2020 (UTC)
I can't speak for others, but I have been discouraged from replying to your initial query due to the way in which you communicate. It does not come across as in good faith. That's not an accusation, but just some feedback that you may wish to word things in a way which isn't easily misinterpreted as standoffish (that is, assuming you are communicating in good faith). --SilentSpike (talk) 10:20, 23 May 2020 (UTC)
How about focusing on the issue, then? I looked into the related dissolved, abolished or demolished date (P576) which gave me a type constraint violation when I applied it to slavery (Q8463). (I removed it because I did not wish to introduce errors.) SashiRolls (talk) 10:37, 23 May 2020 (UTC)

I have summed up the user experience of trying to create a property to tame the "chaotic and non-standardized methods" currently in use to classify prohibitions and bans on a separate page. instance of prohibition, qualified by of (e.g. night baking, death, caffeinated alcoholic beverages, ...) seems popular. SashiRolls (talk) 22:22, 23 May 2020 (UTC)

ICTV virus ID (P1076) is using an obsolete identifier?

w:Template_talk:Taxonbar#ICTV_links_not_working reports that links provided by Property_talk:P1076 aren't working — the mirror site isn't providing data. Looking at the current ICTV spreadsheet, there is now a Taxon History URL column with a taxnode_id component, that will link to the current instance of the history for the virus. Older values seem to work. Could P1076 be changed to use taxnode_id and link to the master site ie https://talk.ictvonline.org/taxonomy/p/taxonomy-history?taxnode_id=19931949 , assuming uploading the necessary data from the spreadsheet can be easily done. RDBrown (talk) 06:04, 19 May 2020 (UTC)
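As a rough illustration of what switching to taxnode_id would involve, the sketch below builds the history link from an ID and recovers the ID from a Taxon History URL. The base URL and the example ID are taken from the link quoted above; the function names are hypothetical, and whether the property's formatter URL should follow this pattern is still an open question.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Base of the Taxon History link quoted in this thread.
BASE = "https://talk.ictvonline.org/taxonomy/p/taxonomy-history"


def history_url(taxnode_id):
    """Build the Taxon History URL for a given taxnode_id."""
    return f"{BASE}?{urlencode({'taxnode_id': taxnode_id})}"


def taxnode_id_from_url(url):
    """Extract taxnode_id from a Taxon History URL, or None if absent."""
    values = parse_qs(urlparse(url).query).get("taxnode_id")
    return values[0] if values else None


url = history_url("19931949")
print(url)
print(taxnode_id_from_url(url))
```

The second helper is the direction that would be needed when pulling the IDs out of the spreadsheet's Taxon History URL column.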

Old PubMed interface

How do I add https://pmlegacy.ncbi.nlm.nih.gov/ as the old interface for PubMed (https://www.wikidata.org/wiki/Q180686)? I couldn't work out what qualifier to use. Fences and windows (talk) 16:07, 23 May 2020 (UTC)

Issues

When I try to add data at Emirate of Kilis (Q94718933), I receive the message: "Could not save due to an error. The save has failed." which tends to be a typical issue I get when I try to connect articles. In this case, I'm trying to add the article Mîrektiya Kilîsê on Kurdish wikipedia (.ku) but it won't work. I would appreciate some help. --Semsûrî (talk) 20:32, 23 May 2020 (UTC)

Semsûrî: That kuwiki article is already linked to Emirate of Kilis (Q6974214), that's why you receive the error message. Please confirm whether those items should be merged. Esteban16 (talk) 20:42, 23 May 2020 (UTC)
Yes, same topic so a merge is needed. Thanks. --Semsûrî (talk) 20:47, 23 May 2020 (UTC)
@Semsûrî: I just performed the merge, but in the future you can do it yourself by following the instructions at Help:Merge Vahurzpu (talk) 22:02, 23 May 2020 (UTC)

Flag popped up

At The Student Life (Q7767104), a flag popped up since a bot moved Location to Headquarters location, but the item (a newspaper) also has Collection. How do I address the flag? Sdkb (talk) 07:42, 23 May 2020 (UTC)

Would archives at (P485) be a better property to use? Ghouston (talk) 11:07, 23 May 2020 (UTC)
@Ghouston: I switched to that, and used a more accurate archives location, but removing Collection caused another issue to pop up: the item still has inventory number (P217), and that throws up an exclamation point when it doesn't also have Collection. Sdkb (talk) 04:03, 24 May 2020 (UTC)
You can make inventory number (P217) a qualifier of the archives at (P485) statement. Ghouston (talk) 05:00, 24 May 2020 (UTC)

Microsoft Store product ID (P5885)

The PlayStation Store IDs are split into three properties: one for North America, one for Europe and one for Japan. Should the same be done with this property?--Trade (talk) 01:23, 24 May 2020 (UTC)

@Trade: Good time to ask about iTunes ID. Eurohunter (talk) 05:20, 24 May 2020 (UTC)
If the identifier scheme is different, I don't think they should be combined in one property. If it's the same scheme and the difference merely reflects current marketing decisions by Microsoft, I'm not sure why this should impact Wikidata. What type of trade does it involve? --- Jura 10:36, 24 May 2020 (UTC)
The Microsoft Store does not have language-specific IDs, and thus there's no reason to have multiple properties. With the IDs in the example I have no trouble opening https://www.microsoft.com/de-de/p/xbox-one-s-all-digital-edition/8ps0m24j508z . When you look at the PlayStation Store IDs for Grand Theft Auto V (Q17452) you find UP1004-CUSA00419_00-GTAVDIGITALDOWNL and JP0230-CUSA00880_00-GTAVDIGITALDOWNL. While the last part of the ID is the same, the rest of the ID isn't, and it is also not the same for all store entries. ChristianKl 13:27, 24 May 2020 (UTC)
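The comparison of the two PlayStation Store IDs can be made concrete with a small sketch. It splits each ID on its hyphens and reports which parts match. The part names `prefix`, `title` and `label` are made up for illustration and are not official terminology, and the three-part structure is assumed only from the two IDs quoted above.

```python
def split_id(store_id):
    """Split a PlayStation Store ID into its three hyphen-separated parts."""
    prefix, title, label = store_id.split("-", 2)
    return {"prefix": prefix, "title": title, "label": label}


def shared_parts(a, b):
    """Names of the parts that are identical in both IDs."""
    pa, pb = split_id(a), split_id(b)
    return [name for name in pa if pa[name] == pb[name]]


first = "UP1004-CUSA00419_00-GTAVDIGITALDOWNL"
second = "JP0230-CUSA00880_00-GTAVDIGITALDOWNL"

print(split_id(first))
print(shared_parts(first, second))
```

Only the last part is shared between the two IDs, so one stored ID cannot be turned into the other by swapping a region code.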

Help to reduce the duplication of village items 

I am currently working on village data sets of Karnataka. While working on this I am facing the following issue:

  • I have created items for all the villages in Raichur district, but for some of them an item already existed, imported from Vietnamese Wikipedia. These existing items are not visible in OpenRefine because they have no English label and I reconcile against English Wikipedia. If I reconcile against Vietnamese Wikipedia instead, I will not be able to match the items that already exist because of English Wikipedia. As a result, the chance of duplication is very high. I request someone to help me and guide me on how to go ahead with this.--Ananth subray (talk) 07:34, 24 May 2020 (UTC)

For example Basapur village 1 2 

It is quite possible that I do not understand the issue, but can't you just merge the two items? I know we have a large duplication rate, in part because Geonames was used, and this database has a lot of duplication, so manually merging is annoying, but we have to deal with this somehow.--Ymblanter (talk) 07:43, 24 May 2020 (UTC)
There are really two possible approaches:
  1. if the quality of the viwiki items is low, one could simply ignore them. It's a common problem with cebwiki and similar items.
  2. An alternative could be to set English labels for all these items and then check them.
--- Jura 12:22, 24 May 2020 (UTC)
It seems to me like the first step would be to add names to existing items. For example, you could add an English name for all villages in Raichur district (or maybe even the whole of India) where the item currently doesn't have an English name but has the same name in all the Latin-alphabet languages it currently has. ChristianKl 12:27, 24 May 2020 (UTC)
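
The labelling rule suggested in the last reply can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, the labels are passed in as a plain dictionary rather than fetched from Wikidata, and "Latin alphabet" is approximated by checking Unicode character names.

```python
import unicodedata


def is_latin(text):
    """Return True if every alphabetic character in text is a Latin-script letter."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all("LATIN" in unicodedata.name(ch, "") for ch in letters)


def propose_english_label(labels):
    """Suggest an English label when all Latin-script labels agree.

    labels maps language codes to label strings (hypothetical input shape).
    Returns None if an English label already exists, if there is no
    Latin-script label, or if the Latin-script labels disagree.
    """
    if "en" in labels:
        return None
    latin_labels = {text for text in labels.values() if is_latin(text)}
    return latin_labels.pop() if len(latin_labels) == 1 else None
```

A proposal produced this way would still need a human check before it is saved, since identical spellings do not guarantee that the English name is the same.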

Property as a qualifier

Hello. I have requested an update of number of seats in assembly (P1410) to allow it to be used as a qualifier. Please read the talk page of the property. Xaris333 (talk) 09:20, 24 May 2020 (UTC)

Deprecated rank

I'm in a curious edit war with User:MovieFex concerning the use of deprecated ranks at A Tale of the Wind (Q3801138). Most sources give the French title of this movie as the original title, nevertheless there is one source that gives the Dutch one as the original title. I deprecated the Dutch title according to Help:Ranking#Deprecated_rank and reason for deprecated rank (P2241) cannot be confirmed by other sources (Q25895909). MovieFex insists on deleting it instead of deprecating it, but I think it is useful to keep this statement deprecated (it is not obvious that the Dutch title should not be the original title) and this is actually what deprecated ranks are for - to mark statements that are known to contain errors. Maybe there are other opinions about this. - Valentina.Anitnelav (talk) 13:13, 20 May 2020 (UTC)

Here are more examples of deprecated statements: [48] ; [49] ; [50] ; [51] and so on. What do you mean, should they all be restored? You misinterpret "to mark statements that are known to contain errors". Thank you for your help filling the constraint violation pages because you've got a "NEW" idea. These pages are full of "NEW" ideas and ignorance. I do a lot of work maintaining these pages and I feel so wonderful that users give me that work, because I don't know anything preferable to do. Maybe you could give it a chance and spend a few days (maybe weeks) working on these pages, too. Perhaps it will change your mind about those "special ideas". -- MovieFex (talk) 13:33, 20 May 2020 (UTC)
Well, I do work on constraint violation reports. But these seem to be false positives. They should not trigger a constraint violation. Maybe there is some technical problem. - Valentina.Anitnelav (talk) 13:51, 20 May 2020 (UTC)
  • I see the benefit of including an incorrect but referenced statement with deprecated rank to ensure that it won't be added by error as a correct statement. Deprecated statements are generally ignored for constraint checks. --- Jura 13:37, 20 May 2020 (UTC)
@Jura1: This is wrong and it would also make no sense, as you can see in the examples given above. All were listed at Wikidata:Database reports/Constraint violations/P345. If I give a person a wrong IMDb ID, for example, what reason would there be to deprecate and keep it? -- MovieFex (talk) 13:43, 20 May 2020 (UTC)
At Q16154029#P345, I left the imdb id to avoid that it gets re-added. --- Jura 13:59, 20 May 2020 (UTC)
Yes, I've seen some of those examples and I didn't delete them. But from time to time I try to correct them on IMDb by de-mixing. This process takes time and I have to be in the mood for it. In the last few weeks I have worked very hard on maintaining P345, with many corrections here and on IMDb. But after every update there is a new flood of errors, made by bots and scriptwriters. The mechanism for detecting errors is important and necessary, and these are not false positives. Nevertheless, a wrong entry does not have to be kept at all, or Wikidata will grow into a big rubbish dump. -- MovieFex (talk) 14:43, 20 May 2020 (UTC)
It looks like part of the confusion here is caused by the timing of the bot updates. This edit shows that Wikidata:Database reports/Constraint violations/P345 was "updated" from a dataset of 2020-05-14 to a dataset of 2020-05-10. I'm not sure why that is but the bot restoring 7 day old data will probably be inserting some false positives.
Taking a further look at the history shows that Deltabot and KrBot2 are edit warring over about 500k of text entries. @Pasleim, Ivan A. Krestinin: Can you please take a look at Wikidata:Database reports/Constraint violations/P345 and work out what is causing your bots to conflict? From Hill To Shore (talk) 21:16, 20 May 2020 (UTC)
@From Hill To Shore: This is something different. Deltabot imports less information. KrBot2 needs 5 days to process the data, as Ivan pointed out here, and his updates are made on about 3000 (!) constraint violation pages. This takes at least half a day until he has finished. By the way, there is no need to fill these pages intentionally by playing around. I told V.A. that title (P1476) has a single value and there is no need to create an exception with tricky maneuvers. When the maintenance pages grow and grow, one has to pull in one direction and not give bad examples to imitators and unknowing users. -- MovieFex (talk) 22:57, 20 May 2020 (UTC)
The bots are clearly in conflict; the reversions are too consistent. I'll let the bot owners clarify what is going on.
"By the way, there is no need to fill these sites intentionally by playing around." I have no idea what you are trying to say here. From Hill To Shore (talk) 23:18, 20 May 2020 (UTC)
There's a difference between an incorrect value due to an editor simply making a mistake (which should generally be deleted) and a value that is known to be incorrect, but which is stated in a reputable source. In the latter case Wikidata should store that information (this is a database of claims, not of facts), but it can be marked as deprecated. --Oravrattas (talk) 14:55, 20 May 2020 (UTC)
  • No. Everyone errs at times, if they are doing anything non-trivial. It's important to mark if there is particular apparently erroneous information in a generally reliable source (UNESCO report, NY Times, El País (Spain), Britannica, OCLC, etc.) - Jmabel (talk) 19:32, 20 May 2020 (UTC)
    • I don't think we should measure sources differently if we assess their errors than we do in other cases. Reminds me of some odd habit in one Wikipedia where they mention a reference in the edit summary when adding a fact they think needs a reference, but for some reason they don't think reference meets their standard of their Wikipedia. --- Jura 20:53, 20 May 2020 (UTC)

Certainly should be deprecated rather than deleted if even one source we would normally consider citeable gives the value. - Jmabel (talk) 14:45, 20 May 2020 (UTC)

Deprecated Rank and constraint violations

MovieFex insists on deleting this deprecated statement as he thinks it triggers a constraint violation (see the discussion on his talk page, User_talk:MovieFex#Deprecated_ranks_and_constraint_reports). Maybe somebody else can convince him that deprecated statements are excluded from constraint checks. - Valentina.Anitnelav (talk) 21:03, 23 May 2020 (UTC)

I insist on it because title (P1476) has a single value and VA's edit does not follow the guidelines Wikidata:WikiProject Movies/Properties of monolingual text.
By the way according to the Dutch movie database moviemeter.nl the original title is French and there is also no loss of information because it is given as alternative label and is linked under described at URL (P973), too. -- MovieFex (talk) 13:30, 24 May 2020 (UTC)
Nobody (here) claims that the Dutch title is the original title. It is about deprecating or deleting that claim. To deprecate a claim means to mark it as erroneous, not that it is true. There is one source that gives the Dutch title as the original title, which is not that farfetched as one of the production countries and the original language is Dutch. As other sources don't back up that claim I deprecated that statement with reason for deprecated rank (P2241) cannot be confirmed by other sources (Q25895909). This all conforms to the guidelines given at Help:Ranking, especially Help:Ranking#Deprecated_rank.
To deprecate the statement is also in accordance with the guidelines given at Help:Property_constraints_portal/Single_value, especially Help:Property_constraints_portal/Single_value#Possible_actions. To deprecate the wrong statement is given as a possible action.
There is a loss of information when deleting a deprecated statement. You can look up the information being lost at Help:Ranking#Deprecated_rank. There you find reasons why deprecating is better than deleting.
I don't see why deprecating a statement should be in conflict with any guideline about the use of monolingual text. - Valentina.Anitnelav (talk) 14:30, 24 May 2020 (UTC)
Change the constraint to single best value, problem solved. The deprecated statement seems correct. --SilentSpike (talk) 18:10, 24 May 2020 (UTC)

I think the initial version of my question might not have been clearly worded. I want to know if it's possible to change the sorting of the pages listed on the What Links Here page. By default they are sorted by ID, but I would like to sort them by the amount of incoming links each page has. 2602:306:C541:CC60:4116:2BFD:ABB0:A970 02:13, 25 May 2020 (UTC)

Tool for reciprocal statements

Is there a tool/gadget that would make it easy (or easier) to add a reciprocal statement to another Q-id? For example, when I add different from (P1889) to a WD item, obviously I also want to add that statement to the item this one differs from. Or when I add an owner to a plantation, I want to add that same plantation as a property to the item of the owner. The same goes for the relationships of spouses, unmarried partners and family members. You get it, and you have probably experienced the same annoyance ;). Does anyone know of a tool/gadget to make these edits less tedious? Ecritures (talk) 01:28, 24 May 2020 (UTC)

There is consistency_check_add.js but I don't recommend it because it adds claims even if they already exist, duplicating them. --SCIdude (talk) 06:46, 24 May 2020 (UTC)
There are also bots that take care of that; for example, Deltabot does it for books, and likely there are others. --Misc (talk) 07:05, 24 May 2020 (UTC)
Does consistency_check_add even work anymore? It used to work fine for me, then declined to maybe 50%, now doesn't seem to work at all. -Animalparty (talk) 17:26, 24 May 2020 (UTC)
Automatic addition had bugs, so I disabled it. The on-demand triggers should still work. --Matěj Suchánek (talk) 09:01, 25 May 2020 (UTC)
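
The duplication problem mentioned above comes down to checking for the reciprocal claim before adding it. A minimal sketch of that check follows; it is not how consistency_check_add.js or any bot actually works. The function name and the set-of-triples data model are assumptions made for illustration.

```python
def add_reciprocal(claims, subject, prop, target, inverse_prop=None):
    """Add the reciprocal of (subject, prop, target) unless it already exists.

    claims is a set of (item, property, value) triples standing in for
    Wikidata statements (a simplification). For a symmetric property such
    as different from (P1889), leave inverse_prop unset; for a pair of
    inverse properties, pass the inverse explicitly.
    Returns True if a new claim was added, False if it was already there.
    """
    reciprocal = (target, inverse_prop or prop, subject)
    if reciprocal in claims:
        return False
    claims.add(reciprocal)
    return True
```

Calling it twice with the same arguments adds the reciprocal claim once and then reports that nothing needed to be done.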

Language

Hello. When you use properties like name in native language (P1559) you have to add the language. I am using Wikidata in Greek. I have noticed that a language is wrong in Greek. Where can I ask to correct the language? Xaris333 (talk) 09:13, 24 May 2020 (UTC)

@Xaris333: Hello! What do you mean by "wrong language"? Can you give an example? Vojtěch Dostál (talk) 13:14, 24 May 2020 (UTC)
@Vojtěch Dostál: The name of the language, Macedonian (Q9296). Currently the language is written as "Μακεδονικά", but the proper word in Greek is "Σλαβομακεδονικά". I am not asking for agreement here on whether my opinion is correct (please don't discuss that here). I am just asking where someone can ask to correct a language name. Xaris333 (talk) 13:47, 24 May 2020 (UTC)
You probably mean "Where can I change the Greek label of item Q9296" :-). The answer is: you click on "Edit" at the top of item Q9296 and go to the table just below it. There, you can find the Greek label for the item and change it. And yes, please discuss such a change beforehand. Vojtěch Dostál (talk) 13:50, 24 May 2020 (UTC)
@Vojtěch Dostál: Of course not. I know how to change a label. The label of Macedonian (Q9296) is correct. But it differs from the Greek name of the language shown when you use properties like name in native language (P1559), where you must add the language. Xaris333 (talk) 14:11, 24 May 2020 (UTC)
It comes from CLDR. You may have to have it changed there, alternatively a request in Phabricator (cldr extension) could be a way to override it. --Matěj Suchánek (talk) 09:01, 25 May 2020 (UTC)
Thanks! Xaris333 (talk) 11:12, 25 May 2020 (UTC)

Percussion page in Wikipedia is using an incorrect word even when the citation is using the correct word: tympanitic.

Term is tympanitic not tympanic when describing normal percussion of the abdomen. The source cited - University of California also uses tympanitic, not tympanic when describing percussion over stomach or intestines of the abdomen. (See your citation 4) https://meded.ucsd.edu/clinicalmed/abdomen.html Several non-academic Google sites have copied the incorrect substitution of tympanic for tympanitic here which confuses nursing and medical trainees. Tympanic membrane refers to the ear drum. Tympanitic refers to the resonant sound produced when a medical examiner is percussing the abdomen over structures that contain air (stomach, intestines). - Margaret Mulligan MD, Salus University, email: mmulligan@salus.edu  – The preceding unsigned comment was added by 100.11.33.65 (talk • contribs) at 19:03, 24 May 2020‎ (UTC).

Wikidata weekly summary #417

Ontology of anatomy

I opened a discussion on Wikiproject Anatomy about how our ontology regarding anatomy should be structured. ChristianKl 12:57, 24 May 2020 (UTC)

Since when are WMF projects doing OR? Are there no existing ontologies? MrProperLawAndOrder (talk) 01:36, 26 May 2020 (UTC)

xtools privacy error, chrome

Hey all, I see a privacy error when I want to check the articles I created via my Chrome browser. What is wrong? Is it a cookies issue? --Ruwaym (talk) 19:40, 25 May 2020 (UTC)

Ruwaym: Most likely. I checked the page and didn't see any error. Esteban16 (talk) 02:39, 26 May 2020 (UTC)

Dates in Middle Ages

Recently I was informed about some difficulties in describing dates. There are cases where the day and month are known with high probability, but the year is uncertain. For example, let us look at Richeza of Poland, Queen of Castile (Q80692). Her date of death (P570) should be split into two parts. The first is June 16 (Q2653), which can be assigned sourcing circumstances (P1480) set to probably (Q56644435). The second part is 1185 (Q19729), and its sourcing circumstances (P1480) can be set to circa (Q5727902). Handling the second part with date of death (P570) is easy. But the first, day-only part is not. I found refine date (P4241), but this is a qualifier and it cannot have its own sourcing circumstances (P1480). Is it possible at all to describe this case using the Wikidata schema? I need a property similar to significant event (P793) but with a value of point in time with respect to recurrent timeframe (Q14795564). This property would be supplemental to the main date property. Paweł Ziemian (talk) 12:19, 25 May 2020 (UTC)

Relevant ticket added above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:41, 26 May 2020 (UTC)

Why can't I add link?

I was trying to add b:en:HyperText Markup Language/Lists to Q81957734, but it just said "Error occurred" when saving. Could you tell me why it won't let me add it? --Semi-Brace (talk) 11:26, 26 May 2020 (UTC)

That page had already been added to another item; the unhelpful error message confused me. :p --Semi-Brace (talk) 11:28, 26 May 2020 (UTC)

Item update request

Please open Pope Francis (Q450675) and add San Lorenzo de Almagro (Q218282) as supported sports team (P6758). Thanks!!! --151.49.97.155 18:36, 26 May 2020 (UTC)

Zero date year

There must not be a year zero in Wikidata, because it uses a hybrid time format that is not compatible with ISO 8601. ISO 8601 uses astronomical year numbering, in which year 0 means 1 BCE, year −1 means 2 BCE, and so on. Perhaps such a system was rejected to avoid off-by-one-year errors, which is fine. But if you enter 0-1-1, the system accepts it as "1 Jan 0", which is wrong.

I tried to make a regexp format constraint for a date: (?!\+0000)[+-]\d{4}-[01]\d-[0123]\dT00:00:00Z, but got «Properties with constraint „format constraint“ need to have values of type „String“ or „Monolingual text“.». The "range" constraint cannot be used either, as it is intended to check that a value is inside an interval, not outside it. Perhaps the easiest way is to tighten up the module that recognizes date entries so that it prevents dates with a zero year from being entered. More broadly, this problem could be solved by adding "OR" and "AND" constraints that combine other constraints. Carn (talk) 16:22, 17 May 2020 (UTC)
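
To see what the regular expression quoted above would and would not catch, it can be tried outside Wikidata. The sketch below uses Python's `re` module with the pattern copied verbatim from the comment; the function name is hypothetical, and whether the constraint system would treat the pattern the same way is an assumption.

```python
import re

# Pattern copied verbatim from the comment above.
ZERO_YEAR_GUARD = re.compile(r"(?!\+0000)[+-]\d{4}-[01]\d-[0123]\dT00:00:00Z")


def passes_zero_year_guard(timestamp):
    """Return True if the whole timestamp matches the proposed pattern."""
    return ZERO_YEAR_GUARD.fullmatch(timestamp) is not None
```

Note that the negative lookahead only excludes `+0000`, so a value written with a minus sign and year `0000` would still pass this particular pattern.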

Remember that Wikidata is about what the sources say, not about the true date. February 31 is used on gravestones, also there might be sources using the year 0. --Denny (talk) 18:27, 17 May 2020 (UTC)
I think there will not be very many such cases, just as there are few records of people whose age exceeds 150 years. There is exception to constraint (P2303) for this. Carn (talk) 21:47, 17 May 2020 (UTC)
@Denny: The Wikidata time system is very messed up. It can't record in the time string how people state a date if they state it as "circa 2019", the way it could if it simply implemented ISO 8601-2:2019. This likely leads to a lot more incorrect copies of data than the inability to record 31 February. Yes, you can add qualifiers, but plenty of people don't use them, and they also get stripped away when a user wants a truthy value.
February 31 should violate a constraint, but the constraint likely shouldn't be set to be mandatory. ChristianKl 17:10, 18 May 2020 (UTC)
  • Maybe Help:Dates#Years_BC has some info, you are interested in. --- Jura 19:43, 17 May 2020 (UTC)
    • Thank you, but the phrase "Year 0 Gregorian is invalid in the database would not be exported" is not fully true:
    • ["mainsnak"] = {
        ["datatype"] = "time",
        ["datavalue"] = {
          ["type"] = "time",
          ["value"] = {
            ["after"] = 0,
            ["before"] = 0,
            ["calendarmodel"] = "http://www.wikidata.org/entity/Q1985727",
            ["precision"] = 11,
            ["time"] = "+0000-03-02T00:00:00Z",
            ["timezone"] = 0,
          },
        },
      }

    • It can be imported by lua in infocards, for example, which may cause another wave of indignation "let's get rid of WikiData in our Wikipedia sections" Carn (talk) 21:47, 17 May 2020 (UTC)
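
A consumer of such data, for instance an infobox module, could guard against the zero year before displaying anything. The following is a rough sketch in Python rather than Lua, with hypothetical function names. It assumes the reading discussed in this thread, in which there is no year zero and a negative year N means N BCE.

```python
def signed_year(timestamp):
    """Extract the signed year from a time string such as "+0000-03-02T00:00:00Z"."""
    sign = -1 if timestamp.startswith("-") else 1
    year_digits = timestamp.lstrip("+-").split("-", 1)[0]
    return sign * int(year_digits)


def render_year(timestamp):
    """Return a display label for the year, or None for the invalid year zero."""
    year = signed_year(timestamp)
    if year == 0:
        return None  # caller decides whether to hide the value or flag it
    return f"{year} CE" if year > 0 else f"{-year} BCE"
```

Returning None instead of a label leaves it to the caller to decide whether the statement should be hidden, flagged for maintenance, or shown with a warning.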

  Notified participants of WikiProject property constraints - I don't see how to do it in Module:Constraints. I see that it collects the parameters and sends them to SPARQL, which I do not understand, so I must ask: is it easy to make a "reverted constraint" property that gives a warning when the constraint placed in it is NOT triggered? Carn (talk) 14:48, 22 May 2020 (UTC)

What about decade 0 ? I mean, an event in the decade year 1 to year 9 ? Example : return from Egypt (Q7317265) which is said to be somewhere between 4 BC and 3 BC. Bouzinac (talk) 13:12, 27 May 2020 (UTC)

Grants:Project/Maximilianklein/WHO

In some item page history I found user:VIAFbot, last edits 2016, looked at the operators contributions and found that the last were in 2017 - except for a few edits in 2020 leading to m:Grants:Project/Maximilianklein/WHO - no, has nothing to do with the WHO. Hardware limitations in Wikidata, SPARQL timeouts, QS slow, etc. ... and then non-Wikidatians try to get some money for programming. 50 USD/h. MrProperLawAndOrder (talk) 00:58, 23 May 2020 (UTC)

This is a proposal to get a grant to build an analysis using Wikidata data, with its primary focus being on supporting other projects such as Wikipedia. It will help answer questions those projects have been trying to understand for years. This is a good thing. I am not sure why you think that this will mean resources are taken away from the Query Service; it won't. Andrew Gray (talk) 17:44, 23 May 2020 (UTC)
Hello @MrProperLawAndOrder, ChristianKl: I confirm what Andrew Gray says: the scope of the project is to study biographical content in Wikimedia projects, not Wikidata items. The main idea of the project is to provide statistics, with a user-friendly interface, without the need for its users to know how to write (or rewrite) SPARQL queries, or to rely on someone who knows. I also replied to some of your concerns in the discussion page of the grant, like explaining that some features this tool would provide don't exist in WDQS. On a technical note, I would like to mention that this tool would mainly rely on Wikidata dumps, not WDQS.
About the name WHO, I will just copy-paste the answer that I already gave a few days ago in the discussion page of the grant: This name is temporary and was chosen before the pandemic, as some sort of pun like "Who is in Wikimedia projects?". We still haven't decided on a definitive name, but we are looking for a friendly name (not an acronym) which is easy to remember (not like Denelezh).
About non-Wikidatians: I'm not sure how you define who is and who is not a true™ Wikidatan. For instance, is someone without a proper Wikidata user page or who never went to a WikidataCon, a real™ Wikidatan? Anyway, as already stated, the scope of this project is to study biographical content in all Wikimedia projects, not only Wikidata items. And there are several ways to be a Wikimedian: direct edits in Wikimedia projects for sure, but also organizing events, developing tools that are useful for the community, conducting studies...
Envlh (talk) 09:31, 24 May 2020 (UTC)
@Envlh:, you wrote "I confirm what Andrew Gray says: the scope of the project is to study biographical content in Wikimedia projects " ... but he wrote "This is a proposal to get a grant to build an analysis using Wikidata data, with its primary focus being on supporting other projects such as Wikipedia." - that's not matching. He didn't restrict to "biographical content", nor did he expand to "study ... content in Wikimedia projects". MrProperLawAndOrder (talk) 18:34, 25 May 2020 (UTC)
@MrProperLawAndOrder: I am not sure if it is a language issue but you are coming across as a pedant here. Envlh confirmed Andrew Gray's point and then provided more detail. To insist that both statements have to directly align is a waste of everyone's time. From Hill To Shore (talk) 19:25, 25 May 2020 (UTC)
@From Hill To Shore: No he didn't, and your claim about how I come across and introducing it with "language issue" is going in the direction of a personal attack. I pointed out that B claimed to confirm what A said, but A didn't say it. And no, what I quoted from B was not provided as being more detail, but connected with ":". Maybe it is a language issue indeed. How do you interpret ":" as used in the first sentence? And I didn't insist they align, so your "is a waste of everyone's time" apart of being a personal attack is not even based in any real event. MrProperLawAndOrder (talk) 19:44, 25 May 2020 (UTC)
@MrProperLawAndOrder: I mentioned a language issue because in previous interactions on this talk page you have misunderstood other comments that I have made.[53] I am giving you the benefit of the doubt that English may not be your first language. This is actually the reverse of a personal attack, as I am assuming in good faith that you are misinterpreting the comments that have been made. Misunderstandings are a common occurrence on a multilingual project. If English is your first language, then I think you are perhaps expecting too much precision in a discussion; someone makes a point and another person expands on it. From Hill To Shore (talk) 20:04, 25 May 2020 (UTC)
  • Essentially it re-builds a Query Service II that can break independently of Query Service I. This can be a good thing. OTOH, resources allocated to Query Service I are already somewhat limited, so allocating some of the foundation's resources to build another one isn't necessarily helping to improve #1. --- Jura 18:12, 23 May 2020 (UTC)

@Andrew Gray: How can money spent on that grant also be spent on the general purpose WDQS? MrProperLawAndOrder (talk) 18:35, 25 May 2020 (UTC)

I think you misunderstand what is being asked for here. The pot of "money for grants" is not the same as "money that is going to be spent on WDQS". The grant money is not otherwise going to go to WDQS. WDQS is not losing out because of this. Andrew Gray (talk) 18:59, 27 May 2020 (UTC)

Two ways of using QS:

  1. "run in background": edits have a linked batch id "batch #34562 [...] Tag: QuickStatements [1.5]" [54], easy to track, easier to review, one can even see upcoming edits when looking at the batch page
  2. some other way: edits only have "#quickstatements [...] Tag: QuickStatements [1.5]" [55], hard to track and to review, one cannot see what and how many are upcoming. If an error is found, it is hard to find other edits made via the same method / in the same round

Is there any policy on how to edit? Why does QS allow mass editing without batch id? MrProperLawAndOrder (talk) 18:25, 25 May 2020 (UTC)

As frontend batches are not stored in the database, they are only identified by the time the batch was created. You may however see all such batches here.--GZWDer (talk) 19:29, 25 May 2020 (UTC)
GZWDer, thank you. First column is labeled UID and the values are clickable. Why are these clickable IDs not mentioned in the edit history and linked from there? MrProperLawAndOrder (talk) 01:32, 26 May 2020 (UTC)

@GZWDer: just found a reason: Waiting for several minutes QS refused to start, it said "Running" but nothing was edited. Running via browser instead of "in background" works. MrProperLawAndOrder (talk) 21:01, 27 May 2020 (UTC)

This is because background QuickStatements is processed by a separate worker (backend process), which is currently not working. @Magnus Manske:.--GZWDer (talk) 22:06, 27 May 2020 (UTC)
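
For reviewers, the practical difference between the two kinds of edit summary quoted at the top of this thread is whether a batch number can be extracted from them. A small sketch of that extraction follows; it assumes the summary contains the literal text "batch #" followed by digits, as in the quoted example, and the function name is hypothetical.

```python
import re

# Matches the "batch #34562" form quoted in the thread.
BATCH_ID = re.compile(r"batch #(\d+)")


def batch_id(summary):
    """Return the batch number found in an edit summary, or None if there is none."""
    match = BATCH_ID.search(summary)
    return int(match.group(1)) if match else None
```

Edits whose summary yields None would be the browser-run ones, which can only be grouped by user and time.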

Item for Blood and lymph

blood and lymph (Q29225829) seems to link to an Italian Wikiversity article about blood and lymph. While a title that combines two subjects might be appropriate for Wikiversity, it feels strange as a Wikidata item. ChristianKl 21:56, 25 May 2020 (UTC)

Human item - WorldCat Identities ID

Q6771432#identifiers WorldCat Identities ID is listed before Library of Congress authority ID. Why? MrProperLawAndOrder (talk) 01:23, 26 May 2020 (UTC)

Technically, because it's listed earlier at MediaWiki:Wikibase-SortedProperties. It's in the VIAF members section, which has been sorted alphabetically by English label. I don't know where the labels come from, but Worldcat is under "I" as "Identities - WorldCat Identities ID". Ghouston (talk) 06:47, 26 May 2020 (UTC)

@Ghouston: thank you. Not listed at http://viaf.org/ where many members are shown. MrProperLawAndOrder (talk) 07:14, 27 May 2020 (UTC)

Contemporary constraint

At Q17410937#P22 it violated contemporary constraint maybe because the child is born 4 months after its father died. Can someone fix it? Hddty (talk) 11:41, 26 May 2020 (UTC)

Done, as yet another exception on the constraint. Ghouston (talk) 06:22, 27 May 2020 (UTC)
Can't the constraint be changed to allow for fathers who might die up to 9-10 months before the birth of the child? Piecesofuk (talk) 07:28, 27 May 2020 (UTC)
Some qualifier on Q17410937#P22 could be helpful to indicate it. --- Jura 09:12, 27 May 2020 (UTC)
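
The relaxation suggested above amounts to adding a grace period to the comparison. The sketch below only illustrates the arithmetic and is not how the contemporary constraint is implemented. The function name and the 300-day default (roughly 10 months) are assumptions.

```python
from datetime import date, timedelta


def father_is_plausible(father_death, child_birth, grace_days=300):
    """Return True if the child was born no later than grace_days after the father's death.

    grace_days defaults to 300, an assumed stand-in for the 9-10 months
    mentioned in the thread.
    """
    return child_birth <= father_death + timedelta(days=grace_days)
```

With such a rule, a child born four months after the father's death would no longer need a manual exception.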

Special:FewestRevisions to be disabled

Hello all,

We need to disable the page Special:FewestRevisions because it is extremely resource intensive and causing a huge number of errors in the database (see phab:T238199 and comments for more details).

We had a look at the number of page views and noticed that the page is not often used (phab:T245818), thus we hope that disabling it won’t cause any trouble. However, if you are using this page, please let us know as soon as possible, and tell us more about how and why you are currently using the page, so we can suggest another workflow that would allow you to do it another way. For example, this Quarry query is producing the same result as the special page.

Thanks a lot for your understanding, Lea Lacroix (WMDE) (talk) 07:47, 27 May 2020 (UTC)

SELECT (COUNT(*) as ?count) { ?item wikibase:statements 0 ; wikibase:sitelinks 1 }
  • np. At Wikidata, Query server can do something similar instead (e.g. 1.6 million with the above, most frequent sitelinks: [56]). --- Jura 09:04, 27 May 2020 (UTC)

Item update request

Please open Pope Francis (Q450675) and add San Lorenzo de Almagro (Q218282) as supported sports team (P6758), source is http://www.rainews.it/dl/rainews/media/papa-francesco-tifoso-spera-foto-gallery-san-lorenzo-campione-455aec3a-d2e4-442c-98af-8f425702a9e8.html#foto-1 . Thanks!!! --151.49.97.155 18:35, 27 May 2020 (UTC)

  Done Vahurzpu (talk) 21:26, 27 May 2020 (UTC)

Unremovable incorrect "Wikimedia disambiguation page"

https://www.wikidata.org/wiki/Q6027853 - how can it be fixed? Mateusz Konieczny (talk) 18:44, 27 May 2020 (UTC)

Could not save due to an error. Malformed input

Could not save due to an error.
Malformed input: Neujalis, Juozas

I could only add "Juozas Neujalis" [57], but that is not what the source says. MrProperLawAndOrder (talk) 19:25, 27 May 2020 (UTC)

Sometimes when Wikidata is overloaded it just emits errors randomly. It seems to accept the change now. BrokenSegue (talk) 19:45, 27 May 2020 (UTC)
@MrProperLawAndOrder, BrokenSegue: You should remove leading and trailing spaces and invisible characters. See phab:T47925.--GZWDer (talk) 22:09, 27 May 2020 (UTC)
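
When text is pasted from a source, the invisible characters mentioned above are hard to spot by eye. A minimal sketch of a cleanup step follows. The function name is hypothetical, and it treats every Unicode format (Cf) or control (Cc) character as unwanted, which is an assumption that is too aggressive for scripts that rely on joiner characters.

```python
import unicodedata


def clean_input(text):
    """Drop invisible format/control characters, then trim surrounding whitespace."""
    visible = "".join(
        ch for ch in text if unicodedata.category(ch) not in ("Cf", "Cc")
    )
    return visible.strip()
```

Running a pasted value through such a function before saving would turn a string with a hidden left-to-right mark into the plain "Neujalis, Juozas".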

Bug

There is a bug with wikidata at the moment on at least Wikipedia. When an article is not attached to Wikidata, there is no "add links" button available, so if you want to link a new article you have to do it manually. Not sure what could be causing it. I'll also refer you to the Simple English Wikipedia talk where we have briefly discussed the issue there. ImprovedWikiImprovment (talk) 20:46, 27 May 2020 (UTC)

Perceptions of Wikidata reuse -- Research invite and eliciting feedback & suggestions

I'd like to talk to Wikidata editors about their knowledge of how Wikidata is used on other wikis and outside of the Wikimedia community. Furthermore, I'm interested in understanding how newcomers learn about Wikidata's policies and guidelines. I plan to recruit participants via the project chat page and by sending individual invitations to editors via their user talk pages. The details of the research can be found here: https://meta.wikimedia.org/wiki/Research:Perception_on_rules,_values_and_motivation_structure_of_Wikidata_contributors

Feel free to give suggestions and comments! If you would like to participate (it's an interview of approximately 30 minutes), you can leave a message on my user talk page or fill in a questionnaire to set up an interview. Thanks! Chuankaz (talk) 21:05, 27 May 2020 (UTC)

Do we have a qualifier for named after that specifies the language used?

It feels like it would be useful for terms that have different names in different languages to specify to which language the property named after (P138) refers. Do we have an existing way to store this information? ChristianKl 21:33, 25 May 2020 (UTC)

Sunday (Q132) tries to do so, but the qualifier may be ambiguous, as it may refer to either the source language or the target language.--GZWDer (talk) 00:56, 26 May 2020 (UTC)
Sunday is named after mister Q177053 in Spanish? MrProperLawAndOrder (talk) 01:28, 26 May 2020 (UTC)
That looks like an error. Domingo (Sunday) translates into English as "Lord" or "master" rather than "mister." It would probably be more accurate to link it to Dominus (Q1283380). I am not a Spanish speaker though, so this is more of an educated guess. From Hill To Shore (talk) 07:28, 26 May 2020 (UTC)
  • It seems to me that it becomes important once an item gets names in many different languages. Whenever I see an item where the "named after" value does not correspond to the name in any of the languages Wikidata shows me, it feels strange. ChristianKl 19:00, 26 May 2020 (UTC)
In slouch hat (Q3750832) I've used "named after" as a qualifier on individual names along with "valid in place" and "valid in period" where I have references to support those statements. - PKM (talk) 20:55, 26 May 2020 (UTC)
I added it mostly to geographic places, but yes, on classes, it's crucial. If more detail needs to be provided, lexemes would generally be more suitable.--- Jura 09:10, 27 May 2020 (UTC)
The item Sunday (Q132) is an interesting example, and it made me think about synonyms. I think this name for the day has a religious background in the languages where the item for Mister Q177053 is currently used. There are many different names for God. I would choose "named after God", but I don't know which item is the correct one; I think it is God in Christianity (Q825). In German, the best translation to convey the meaning of the Spanish domingo is Tag des Herrn, so in English something like "the Lord's day". What do you think is the best item to use here? To show what something is named after, should we use the item for the word itself or the item with the same meaning? --Hogü-456 (talk) 21:01, 28 May 2020 (UTC)

child no value

Why have "child=no value", e.g. [58]? For a claim child=XYZ one can do research and add a source, but how would one verify "no value" and add a source for it? I asked the claim's creator on the talk page for his source, or to make transparent that it was OR; he replied but hasn't given a source yet.

Disclosure: I oppose OR.

MrProperLawAndOrder (talk) 15:34, 27 May 2020 (UTC)

  • Basically you are claiming that it's impossible to deduce that a person whom we know to have 0 children (a sourced claim) actually has no child. How about focusing on actually putting effort into understanding claims before you challenge them? ChristianKl 17:25, 27 May 2020 (UTC)

I agree but I don't see why you need to be quite so rude about it. BrokenSegue (talk) 18:39, 27 May 2020 (UTC)

  • I'm annoyed, because it reflects repeated behavior by the OP: entering discussions and complaining that people (three times against me) should behave differently, without investigating the issues that are at stake, and instead trying to press people to classify decisions along lines that are irrelevant for Wikidata policy. ChristianKl 21:04, 27 May 2020 (UTC)
  • In our data model "No value" means that no value exists and a person is childless, you might want to learn about how Wikidata works. Help:Statements even makes that easy and gives an example for child:
Unknown or no values
There are times when an item has either no value or an unknown value for a given property. Depending on the property, these data values still provide important information about an item and should still be recorded in Wikidata. For example, we could say that Elizabeth I of England (Q7207) had no value for the child (P40) property, which is quite different than not recording anything at all.
There's no duty to provide a source just because someone tries to harass you and wants you to do work for them. ChristianKl 21:20, 27 May 2020 (UTC)
@ChristianKl: stop your claims about intentions of others. If you cannot prove them, and the claims have a negative connotation, they are just personal attacks. The community opposes personal attacks. MrProperLawAndOrder (talk) 22:10, 27 May 2020 (UTC)

Yes @MrProperLawAndOrder: seems annoying and unfamiliar with wikidata but people should try to remain calm in the face of it. BrokenSegue (talk) 22:38, 27 May 2020 (UTC)

@BrokenSegue: Re "seems annoying and unfamiliar with wikidata" - I see no evidence that asking the question I asked when starting this section could lead to "seems annoying and unfamiliar with wikidata". I work on human data, and the claim child="no value" is something I have not seen before, despite having seen hundreds of pages. Of course it can exist in large quantities in WD, but it has not been applied to every human that has no child. So I asked for the source on the talk page, the user did not provide it, and I came here. But if you are less "annoying and unfamiliar" than me, please share in which cases such a statement should be made to further the goals of the WMF in sharing knowledge with the world. Thousands of humans could receive it. And last but not least, explain the difference from simply stating "number of children = 0" via Property:P1971. MrProperLawAndOrder (talk) 23:02, 27 May 2020 (UTC)
For your information: Child: No value is used 301 times. This query will show them:
SELECT ?item ?itemLabel
WHERE {
  ?item a wdno:P40.
  ?item wikibase:statements []. # Only entities have this...
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
--Dipsacus fullonum (talk) 11:12, 28 May 2020 (UTC)
User:Dipsacus fullonum, thank you! Very helpful. I made sure each occurrence also has "number of children = 0" (https://w.wiki/S8x = 0) and this is now used 481 times:
SELECT ?item ?itemLabel
{
  ?item wdt:P1971 ?noc.
  FILTER(?noc = 0).
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
MrProperLawAndOrder (talk) 02:10, 29 May 2020 (UTC)
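
For readers comparing the two modelling options raised in this thread, here is a minimal Python sketch of how a "no value" statement differs from a statement that is simply absent, and how it could be cross-checked against number of children (P1971). It assumes the Wikibase JSON layout for entities (claims, then property, then statements, each with a mainsnak and a snaktype). The helper names and the sample data are hypothetical, and statement ranks are ignored for brevity.

```python
# Sketch only: the entity dicts below are invented sample data that follow
# the Wikibase JSON layout (claims -> property -> statements -> mainsnak).

def child_status(entity, pid="P40"):
    """Return 'missing', 'novalue', 'somevalue' or 'value' for a property."""
    statements = entity.get("claims", {}).get(pid, [])
    if not statements:
        return "missing"  # nothing recorded, so nothing is asserted
    snaktypes = {s["mainsnak"]["snaktype"] for s in statements}
    for kind in ("value", "novalue", "somevalue"):
        if kind in snaktypes:
            return kind
    return "missing"


def number_of_children(entity, pid="P1971"):
    """Return the first P1971 amount as an int, or None if absent."""
    for s in entity.get("claims", {}).get(pid, []):
        snak = s["mainsnak"]
        if snak["snaktype"] == "value":
            # Quantity amounts are stored as signed strings such as "+0".
            return int(float(snak["datavalue"]["value"]["amount"]))
    return None


def is_consistent(entity):
    """Check that the child claim and the child count do not contradict."""
    status = child_status(entity)
    count = number_of_children(entity)
    if status == "novalue":
        return count in (None, 0)
    if status == "value":
        return count != 0
    return True


childless = {"claims": {
    "P40": [{"mainsnak": {"snaktype": "novalue", "property": "P40"}}],
    "P1971": [{"mainsnak": {
        "snaktype": "value", "property": "P1971",
        "datavalue": {"type": "quantity",
                      "value": {"amount": "+0", "unit": "1"}}}}],
}}
unrecorded = {"claims": {}}

print(child_status(childless))   # novalue
print(child_status(unrecorded))  # missing
print(is_consistent(childless))  # True
```

The point of the sketch is only that "novalue" is an explicit assertion, whereas "missing" asserts nothing; whether both P40 "no value" and P1971 = 0 should be recorded is the open question of the thread.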

Bots going wild - Reinheitsgebot - MnM - Magnus Manske adding hundreds of wrong claims on human items

Maybe they also add millions of correct claims, but what does the community think about hundreds of edits like [59], caused by MnM and Reinheitsgebot, both operated by Magnus Manske? Here the name "James Roberts" was the same on both items, but there are also cases where only the first names match and the string "II" is part of the name for the person who died in 1193, while for the other person it is part of the description "world war II U-boat commander". WWII took place more than seven centuries after the first person died. The matching seems to be poorly designed.

Requests to at least deactivate certain highly troublesome MnM catalogs have received no response even after 12 days. Topic:Vme7nczx4t8iduvd MrProperLawAndOrder (talk) 17:05, 27 May 2020 (UTC)

Afaik, all matches in MnM need to be verified by a human. If people do not verify and just say "yes, that's the same", the problem is not with the bots or the tools, so it is weird to point at an issue with the tool creator (and not very constructive either). As for maintenance of MnM, I do agree it would be better not to depend on one volunteer, but I think that would require a lot of effort on the infrastructure side, and as someone whose job is exactly that, I know this takes lots of time, and free time is usually a luxury for sysadmins. --Misc (talk) 18:33, 27 May 2020 (UTC)
@Misc: Those examples show the limits of tools like Reinheitsgebot. There was an interest in having such tools at the beginning to import data. Now data curation is more important, and bots should focus on importing curated data only and avoid destroying the curation work done directly in the items without using tools. Snipre (talk) 18:39, 27 May 2020 (UTC)
@Snipre:, I do not follow you; how is adding an item destroying anything? Again, if people are matching things that shouldn't have been matched, the problem is people, not the tools. And if you still think the tools are bad and should be improved, that's free software; the code is at https://bitbucket.org/magnusmanske/mixnmatch/ . Magnus is a volunteer like lots of us are (or should be). And as someone said, "Talk is cheap, show me the code". --Misc (talk) 19:42, 27 May 2020 (UTC)
@Misc: the point of this section is bots going wild. No human is checking, Reinheitsgebot is just acting on what MnM says, and I see no evidence Magnus is fixing his mass insertions of errors into WD. Last but not least, you claim tool creators have no responsibility for what is done with their tools: the tobacco industry claims this too. But why are WMF resources used for operating tools of mass disruption? Isn't there a policy by WMF that requires that hosted tools operate according to the goals of the foundation? MrProperLawAndOrder (talk) 22:04, 27 May 2020 (UTC)
@MrProperLawAndOrder:, as far as I know, a human has to validate the matching before anything gets pushed. I do not think comparing people dying of cancer to wrong edits made by a tool is a valid comparison, nor up to the standard of discussion we want here. Antagonizing people is not really productive. --Misc (talk) 09:51, 28 May 2020 (UTC)
@Misc: "as far as I know, a human has to validate the matching before anything get pushed": the edit summary doesn't reveal that, and I see no proof that there is really a human behind each match in MnM. The Erwin example doesn't look like a human error. I didn't mention people dying, but someone here said tool creators are not responsible for how the tools are used, without giving a source for that claim. I just wanted to highlight how questionable that claim seemed to me. Sorry if the method left a feeling of antagonizing; I regret having done it. It was by no means meant to equate the two situations as a whole. As I wrote in the intro, the tools may have added millions of correct claims. Re "valid comparison": one can compare anything with anything, comparing is not equating, and if one aspect is equal it doesn't mean all aspects are equal. Nevertheless, I wish I had not done it. Please accept my apologies. MrProperLawAndOrder (talk) 10:45, 28 May 2020 (UTC)

I would add that errors in any process (even human driven) are inevitable and so it's important to consider whether the contribution is net-positive. It also seems the errors you are pointing out are somewhat old so it's a bit late to do anything about it other than to try to fix the errors that get through. BrokenSegue (talk) 19:49, 27 May 2020 (UTC)

@BrokenSegue: what does age change about the process of fixing? If they are "old" by whatever definition, isn't it worse the creator didn't fix them? And the report wasn't only about the past, since MnM and Reinheitsgebot are still active, and Magnus Manske still not acting on the catalog deactivation request. There is nothing net positive about that specific catalog. DtBio re-uses GND IDs and the preferred value for these can change. No indication MnM can deprecate IDs. For DtBio it is easier just to directly compare lists of GND IDs. MrProperLawAndOrder (talk) 21:57, 27 May 2020 (UTC)
At Wikidata_talk:Bots#Unattributed_proxy_edits there was a discussion about whether bots that proxy edits for users should (be required to) identify the user, with Reinheitsgebot as one example. Bovlb (talk) 14:06, 28 May 2020 (UTC)
  • Hold on, hold on, hold on. The Erwin diff you give looks like it was all done by Reinheitsgebot, but you have collapsed two edits together; I am sure this is a simple mistake, but it has the result of being very misleading.
The first edit was by a human, in a batch, mistakenly adding Deutsche Biographie (GND) ID (P7902) to the wrong item. (Fair enough - we all make errors. Even Homer nods). Reinheitsgebot saw this, and one week later added CERL Thesaurus ID (P1871) because that value was linked from the P7902 data. The bot did nothing wrong here. There was a human error, it (reasonably) assumed the human-error edit was correct, and made some edits based on it. Andrew Gray (talk) 16:54, 28 May 2020 (UTC)
Andrew Gray, human marked in bold, but the edit wasn't even fully human; the value was coming from MnM (tool by Magnus)[61]. Where is the proof that "Reinheitsgebot saw this, and one week later added CERL Thesaurus ID (P1871) because that value was linked from the P7902 data."? The edit summary for Reinheitsgebot says "Created claim: CERL Thesaurus ID (P1871): cnp01149900, #quickstatements; invoked by Mix'n'match:add_person_dates", with no reference to the edit of the other user. Both wrong edits came from MnM, involving two properties (DtBio and CERL) and two catalogs (1619, 1640). Reinheitsgebot has been shown to insert DtBio + CERL errors into an item on its own, without any human-MnM-based edit being necessary:
Markus Ebner (Q119378):
MrProperLawAndOrder (talk) 02:34, 29 May 2020 (UTC)
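
The kind of false match described in this thread (a person who died in 1193 matched to a World War II commander) could in principle be caught by a date plausibility check before a name match is accepted. The following Python sketch is purely illustrative and is not how Mix'n'match or Reinheitsgebot actually work; the Person type, the plausible_match function, the one-year tolerance, and the 120-year lifespan cap are all assumptions, and the names are simplified.

```python
from dataclasses import dataclass
from typing import Optional

MAX_LIFESPAN = 120  # assumed upper bound on a human lifespan, in years


@dataclass
class Person:
    """Hypothetical record; years are None when unknown."""
    name: str
    born: Optional[int] = None
    died: Optional[int] = None


def plausible_match(a: Person, b: Person, tolerance: int = 1) -> bool:
    """Accept a name match only if the known years do not contradict it."""
    if a.name.casefold() != b.name.casefold():
        return False
    # Years of the same kind, when known on both sides, must roughly agree.
    for x, y in ((a.born, b.born), (a.died, b.died)):
        if x is not None and y is not None and abs(x - y) > tolerance:
            return False
    # A birth year on one side must be compatible with a death year on the other.
    for p, q in ((a, b), (b, a)):
        if p.born is not None and q.died is not None:
            if q.died < p.born or q.died - p.born > MAX_LIFESPAN:
                return False
    return True


medieval = Person("Erwin", died=1193)   # death year taken from the thread
commander = Person("Erwin", born=1910)  # birth year invented for the example
print(plausible_match(medieval, commander))  # False
```

A check like this only rejects matches that are impossible on their face; it says nothing about whether a surviving candidate is actually the same person, so human review would still be needed.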

Namespaces of wikidata.org ... where to place a certain page?

As I am not too familiar with Wikidata, I'd like to ask how to name, or rather where to place, a certain page. meta:User:Manfred Werner (WMAT)/Baustelle is a collection of queries displayed with the help of Integraality. The tables were created for the recent Museum Day Wikidata competition and I would like to keep them available at a place where they can be found - I thought of putting a link at Wikidata:WikiProject_Museums#Example_Queries. But where should I put the page itself on Wikidata? --Manfred Werner (WMAT) (talk) 12:58, 28 May 2020 (UTC)
PS: In case you noticed - the page with the collected tables does not work as intended yet. I'm working on it. When finished I'd still like to make it available here. --Manfred Werner (WMAT) (talk) 13:58, 28 May 2020 (UTC)

There are similar pages under other WikiProjects here, so a subpage of Wikidata:WikiProject_Museums would make sense to me. ArthurPSmith (talk) 18:31, 28 May 2020 (UTC)

Chinese, English, Russian, Arabic -- Herobrine303 (talk) 13:20, 28 May 2020 (UTC)

No, use old-style interwiki. --Matěj Suchánek (talk) 14:41, 28 May 2020 (UTC)

Hello, can someone please give an explanation of why all these unrelated talk pages appear in https://www.wikidata.org/wiki/Special:WhatLinksHere/Q22272508? --SCIdude (talk) 15:43, 28 May 2020 (UTC)

@SCIdude: People added {{Item documentation}} to a bunch of talk pages, which in part displays a concept tree. It's hard to notice, but click "expand" in the lower right-hand corner of the box to see where it's linked. If you're just looking for links from items, you can filter it to just the main namespace. Vahurzpu (talk) 16:01, 28 May 2020 (UTC)
Thanks. Wouldn't the generation of this be an (unnecessary) resource hog? --SCIdude (talk) 17:03, 28 May 2020 (UTC)

Cannot change Q210499

Q210499 has as its Japanese link ja:電子掲示板. This is however incorrect. It should be ja:草の根BBS actually, and ja:電子掲示板 in English is en:textboard. I receive error:

Could not save due to an error. The save has failed.

It is true that the literal translation of 電子掲示板 is "electronic bulletin board". But that's not how the term is used in Japan. It's used to refer to sites like en:2channel, not old 1980's style dial-in BBS. For proof, please see ja:電子掲示板#日本語のインターネット掲示板. It is also occasionally used in a sense similar to "forum", but this is not the default meaning, and if the possibility for confusion exists, you have to explain that you mean specifically a forum by writing e.g. 非匿名掲示板. Another source would be this Yahoo! News article[62], which contains the statement インターネット社会、日本の社会文化は匿名文化なの, meaning, roughly, "Japanese internet culture is anonymous." Which is true, anonymous textboards are dominant in Japan, unlike overseas where en:4chan et cetera are the fringe and Facebook groups the norm. Psiĥedelisto (talk) 09:04, 24 May 2020 (UTC)

  Info Affected items: bulletin board system (Q210499), Q11618061 and textboard (Q3519361). --Liuxinyu970226 (talk) 04:57, 29 May 2020 (UTC)
Looks like Wikidata:WikiProject_Informatics should explain here. --Liuxinyu970226 (talk) 22:58, 29 May 2020 (UTC)

  WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Merge gadget issues?

Are other people seeing the merge gadget? It disappeared for me with no apparent cause earlier this afternoon. There weren't any relevant deployments in that window, per wikitech:Deployments. I want to make sure it's not just me before I file a bug report. Vahurzpu (talk) 18:10, 28 May 2020 (UTC)

Yes, I see the merge gadget.--Ymblanter (talk) 19:03, 28 May 2020 (UTC)
  • Merge gadget is missing on my browser as well. English Wikipedia seems to be having some current sidebar issues too (all the sidebar links jumbled up over the logo, tab display problems). Hopefully a short-lived glitch. -Animalparty (talk) 19:31, 28 May 2020 (UTC)
Ha! I just came here with the same complaint! I wasted 20 minutes looking through my preferences, thinking I must have accidentally turned it off. --RAN (talk) 19:58, 28 May 2020 (UTC)
Can someone merge Q95689708 and Q95690079 for me. --RAN (talk) 20:01, 28 May 2020 (UTC)
@Vahurzpu, Animalparty, Richard Arthur Norton (1958- ): I reported this problem on Phabricator as phab:T253912. It affects only the Vector skin, if you change your skin (e.g. to Monobook) your merge gadget will be back — NickK (talk) 20:22, 28 May 2020 (UTC)
Thanks. According to Phabricator, it should be fine now.--Ymblanter (talk) 07:09, 29 May 2020 (UTC)

Adding a place

Hi, I would like to know how I can add a place that is not on Wikipedia. It is a youth facility that is no longer in operation. I've been doing a lot of research and have every bit of information from beginning to end and beyond. If someone could please help me in this quest, I would greatly appreciate it.

Thanks!  – The preceding unsigned comment was added by Ofarrell2020 (talk • contribs) at 23:57, 28 May 2020‎ (UTC).

@Ofarrell2020: As long as you have solid citations, sure. - Jmabel (talk) 01:05, 29 May 2020 (UTC)


@jmabel, are you able to help me to get the history of this facility on wikipedia?

Just click the "create a new item" button to create a new item. You might want to read the tutorials and the information on what kinds of things are permitted to have entries. People will likely help you figure out how to do things, but typing in the information and linking sources to it will be left to you. Also, making an entry here and making one on Wikipedia are different things and have different rules/procedures. BrokenSegue (talk) 05:47, 29 May 2020 (UTC)
How would I help you get the history of an unnamed "facility" that is not on Wikipedia [in an unspecified language -- there are about 200 of them] onto Wikipedia? If you mean would I write an article on a topic in which I have no particular interest, sorry, no, I would not do that. - Jmabel (talk)

Arabic or English?

birth name (P1477) on Tony Fadell (Q92879) is reported as Arabic. It should be English. --2001:B07:6442:8903:28DE:B6FE:6343:A17A 15:27, 29 May 2020 (UTC)

done. BrokenSegue (talk) 15:29, 29 May 2020 (UTC)

Américo Castro

If someone could fix the record for Américo Castro, vandalized by ELSONRICS, who has replaced the information with that of a certain Eduardo Cárdenas Barajas (apparently himself), I would be grateful.--Enrique Cordero (talk) 18:21, 29 May 2020 (UTC)

The idea would be to restore the record to the state it was in here: https://www.wikidata.org/w/index.php?title=Q482275&diff=prev&oldid=1192878699.  – The preceding unsigned comment was added by Enrique Cordero (talk • contribs) at 19:01 29 May 2019 (UTC).
Hello Enrique Cordero, I have restored the version of the item from just before the vandalism began. I am going to warn the user. If it persists, it should be reported on the administrators' noticeboard. Regards, Ivanhercaz (Talk) 19:13, 29 May 2020 (UTC)
Having seen the rest of ELSONRICS's edits, I am going to review them and revert any possible vandalism. I will proceed to request a block. Regards, Ivanhercaz (Talk) 19:14, 29 May 2020 (UTC)

Bots going wild - KrBot - Ivan A. Krestinin replacing IDs

User talk:Ivan A. Krestinin#GND ID replacement of redirected ids - bot operator not stopping despite several users disagreeing with the removal of IDs that are still valid.

The bot breaks the resolution of deprecated GND IDs via Wikidata, making WD less useful. The deprecated IDs are still in use and have long-term support in the GND database. They result from merging items, and they are kept there rather than deleted. MrProperLawAndOrder (talk) 02:46, 30 May 2020 (UTC)

Youtube user ID

I am not sure how to enter a Youtube user ID. There is no suitable property, only the (unused) qualifier Q65028018. --RolandUnger (talk) 04:44, 29 May 2020 (UTC)

I do not know if there was already a property proposal. --RolandUnger (talk) 04:45, 29 May 2020 (UTC)
What do you mean by user ID? Plenty of pages have channel IDs. Is there not a 1 to 1 relationship there? BrokenSegue (talk) 05:44, 29 May 2020 (UTC)
If I remember right, we decided against listing Youtube user IDs as they provide little value. Channels are important enough to be linked, but the user IDs were not considered important enough to warrant a property, especially given that there are privacy concerns for living people. ChristianKl 10:30, 29 May 2020 (UTC)

There could be a property for user ID (for old channels) and one for channel ID, aka "numeric ID", as in the case of Twitter and Genius. Eurohunter (talk) 11:21, 30 May 2020 (UTC)

Vandalism: Counter Measures

Hello all, I know that there are lots of different counter measures that prevent vandalism from making its way into Wikipedia articles. Basically, they are a combination of measures on Wikidata directly and measures in the Wikidata->Wikipedia interface.

Wikidata:Vandalism is not really helpful in finding more information on these counter measures, such as Patrolled Versions (Wikidata:Patrolled versions does not exist?!), or the possibility of protecting single statements when they are proven to be true/correct and there will not be any demand to change them in the future (for example official census data regarding population).

The measures on the interface are things like the fact that changes on Wikidata are listed on the Wikipedia watchlist, or that the patrolled versions of Wikipedia sometimes mark an article as unpatrolled when there were changes on Wikidata.

Is there a page where I can find a list of these measures? Or would it make sense to create one, as it will be helpful to convince users who are afraid of using data from Wikidata in Wikipedia articles because of potential uncaught vandalism? Thanks in advance, Yellowcard (talk) 12:16, 29 May 2020 (UTC)

Thank you for this.
An example which crossed my way a few weeks ago: the description of People for the Ethical Treatment of Animals (Q151888), which is incorporated in the de-WP article via a template, was vandalised on 14 April and was not corrected until 22 April, when I noticed the vandalism. -- Perrak (talk) 12:37, 29 May 2020 (UTC)
@Yellowcard, Perrak: You might want to check out Wikidata:WikiProject Counter-Vandalism which lists a number of tools in active use. We could always use more people patrolling! ArthurPSmith (talk) 17:44, 29 May 2020 (UTC)
Thank you very much! -- Perrak (talk) 08:38, 30 May 2020 (UTC)

Item for member of ministerial council?

What would be the value of position held (P39) for a person who is a member of a ministerial council -- the small group of advisers second only to the Minister? I have searched for "ministerial council member," "ministerial adviser," "government adviser," etc. to no avail. Levana Taylor (talk) 04:33, 30 May 2020 (UTC)

Hello Levana Taylor, you should probably create an item for the specific position as organisations vary from one state to another. You may want to look around ministerial council (Q2932402). Arpyia (talk) 10:12, 30 May 2020 (UTC)

suggestededit-add 1.0

What is this 'feature' and can it be disabled? Or at least add a captcha to it or something. Most edits on my watchlist that come from this 'feature' are jokes, vandalisms or other incorrect edits that have to be reverted (adding a full stop at the end of the description, changing the first letter to the uppercase etc.). Wostr (talk) 08:38, 27 May 2020 (UTC)

It has been no less than a disaster for Persian. Sometimes vandalism sticks in Wikidata and is shown to users for years. Amir (talk) 10:34, 27 May 2020 (UTC)

  • Thinking about this, it seems especially troubling that edits which capitalize the description or add a full stop, neither of which belongs in a good description, are allowed by the Android app. I think I explicitly told the developers back then that those edits are not up to our standards for descriptions. ChristianKl 14:49, 27 May 2020 (UTC)
I'll write up a short report for the team with the latest quotes and interactions from the community around this. Ping me with any statistics, examples etc you want them to be specifically aware of. The quality filter that was built into this has not at all worked as well as we hoped. Amir, if you've got anything specific – statistics or just a short developer-friendly description of the Persian issue – I'll make sure to pass it along. /Johan (WMF) (talk) 23:34, 27 May 2020 (UTC)
  • We only allow users who have previously made 3+ good (i.e., not reverted) – reverted how? I think there are at least three options to do this: undo, restore and rollback. Does it work with every option other than manual revert or only with rollback? Wostr (talk) 09:43, 28 May 2020 (UTC)

User:Johan (WMF): Special:Diff/1167134589 and Special:Diff/1166571067 are gross insults to a living person (I can translate if you want). Some other: Special:Diff/1180773869 and Special:Diff/1166345494 and Special:Diff/1166345567 and these are only the ones I find by checking my latest 100 contributions. It's waaaay more. Amir (talk) 11:36, 30 May 2020 (UTC)

@Ladsgroup, Wostr: what parts are affected the most, descriptions, labels, anything else? MrProperLawAndOrder (talk) 19:33, 30 May 2020 (UTC)

Suggested edits are for descriptions only AFAIK. Amir (talk) 22:20, 30 May 2020 (UTC)

Item fix request

Please open Karolina (Q1734206) and change "Karolina" to "Karolina (name)" on English Wikipedia entries. Thank you!!! --2001:B07:6442:8903:752A:823B:CFB6:9FF2 10:37, 27 May 2020 (UTC)

We usually don't need brackets on items' labels, better to use description function, thx. --Liuxinyu970226 (talk) 23:57, 30 May 2020 (UTC)

IDs authors

Hi,

I just noticed that Albert Mathiez (Q1343174) Treccani's Enciclopedia Italiana ID (P4223) was signed: see Q1343174#P4223. What do you think about this use case? For me it sounds useful, but I'm not sure this is the right method to fill the IDs authors.

Thanks, Nomen ad hoc (talk) 21:56, 26 May 2020 (UTC).

  Notified participants of WikiProject Italy. Nomen ad hoc (talk) 21:56, 26 May 2020 (UTC).

  Notified participants of WikiProject Wikipedia Sources. Nomen ad hoc (talk) 11:58, 27 May 2020 (UTC).

@Nomen ad hoc: In my opinion it is the correct way to indicate the author of an encyclopedia article; it is used, along with author name string (P2093) (when the author does not yet have an item), not only with Treccani's Enciclopedia Italiana ID (P4223), but also with Treccani's Biographical Dictionary of Italian People ID (P1986), Catholic Encyclopedia ID (P3241), Stanford Encyclopedia of Philosophy ID (P3123) and other similar IDs. --Epìdosis 08:08, 27 May 2020 (UTC)
@Nomen ad hoc, Epìdosis: the standard way is to create an item for the biographical article, isn't it? MrProperLawAndOrder (talk) 17:09, 27 May 2020 (UTC)
... Even for the online-only encyclopedias and directories? Nomen ad hoc (talk) 18:38, 27 May 2020 (UTC).
@MrProperLawAndOrder: An item is created only when there is a page on Wikisource for the encyclopedia article; otherwise this is the standard. --Epìdosis 21:00, 27 May 2020 (UTC)

@Epìdosis, Nomen ad hoc: there is no evidence the person was the author of the id. If no item is created, then use described by to refer to the article and use the id there. MrProperLawAndOrder (talk) 21:05, 27 May 2020 (UTC)

@MrProperLawAndOrder: Why "there is no evidence the person was the author of the id"? --Epìdosis 21:12, 27 May 2020 (UTC)
@MrProperLawAndOrder: I share E.'s question. Best, Nomen ad hoc (talk) 17:26, 30 May 2020 (UTC).
@Nomen ad hoc, Epìdosis: do you have evidence that the article author created the ID? MrProperLawAndOrder (talk) 18:04, 30 May 2020 (UTC)
@MrProperLawAndOrder: It is indicated in the article itself. --Epìdosis 18:35, 30 May 2020 (UTC)
@Nomen ad hoc, Epìdosis: the ID isn't even mentioned in the article. It is only visible in the URL, and in 1934 there was no internet. Of course he could have created the ID, but again, no evidence. MrProperLawAndOrder (talk) 18:40, 30 May 2020 (UTC)
@MrProperLawAndOrder: So, if a person writes an article in an encyclopedia and then the encyclopedia is scanned and each article receives a different webpage, I see no problem of "no evidence" in the statement that the author has written the article. --Epìdosis 19:08, 30 May 2020 (UTC)
@Epìdosis: "I see no problem of "no evidence" in the statement that the author has written the article."? The statement was that there is no evidence the author created the ID. Adding "author" as a qualifier next to the ID would imply that. But the person was the author of an article. E.g. for DtBio IDs one could write: author: GND. ... But for unique value violations one sometimes writes "named as"; this also doesn't refer to the ID but to the object it identifies. Anyway, this work seems to have been created before the internet era, so an item for the biographical article could be created? MrProperLawAndOrder (talk) 19:27, 30 May 2020 (UTC)
@MrProperLawAndOrder: "an item for the biographical article could be created"? No, because it doesn't respect notability criteria: encyclopedia articles can be notable for items only for N1, so they should be on Wikisource; until such articles aren't on Wikisource, they can't be subjects of items. author (P50) can be used not only when the author created the webpage which contains the article, but also when the author created the article which is copied in the webpage, so again I see no problem in the present situation. --Epìdosis 19:42, 30 May 2020 (UTC)
@Epìdosis: I can not make any sense of this: "until such articles aren't on Wikisource, they can't be subjects of items." This seems to say they would have to be removed from Wikisource to get a Wikidata item. That makes no sense at all. - Jmabel (talk) 21:42, 30 May 2020 (UTC)
@Jmabel: OK, I formulated it badly: I wanted to say that "only if these articles are inserted in Wikisource they can be subjects of items; in the meanwhile, not being in Wikisource, they can't be subjects of items". --Epìdosis 21:46, 30 May 2020 (UTC)
@Epìdosis: you might edit "aren't" to "are"; you wrote the opposite of what you meant. I'd change it myself, but I know there is a strong norm here against editing someone else's comments. - Jmabel (talk) 21:49, 30 May 2020 (UTC)
Hi Jmabel, any opinion about it? Nomen ad hoc (talk) 06:31, 31 May 2020 (UTC).
My only opinion is that we should be consistent. Either we use author in this way when an ID refers to a work with articles, to indicate that the article to which the ID refers has that author, or we don't. I don't really care which. - Jmabel (talk) 16:36, 31 May 2020 (UTC)

Arabic, Czech, Russian, and Serbian help please

We are trying to figure out what's the difference between Yuan dynasty (Q7313) and imperial house of the Yuan Empire (Q10531418). Most languages seem to think they are the same thing, but Arabic, Czech, Russian, and Serbian have two separate articles for these two historical designations.

Please help us by joining the discussion at User_talk:Nostalgiacn#Do_not_disrupt_important_items if you know any of these four languages in addition to English. Deryck Chan (talk) 16:35, 29 May 2020 (UTC)

Kinda makes sense to me: Some empires have substantial continuity even if ruled by several different dynasties, but if, in this case, the empire only existed at the same time as the dynasty, many Wikipedias would choose to discuss them together, with a redirect from one to the other. Levana Taylor (talk) 17:53, 29 May 2020 (UTC)
The fact that many Wikipedias chose to discuss one and not the other doesn't mean that it's the same concept. Ontologically the family that runs a country and the country are two different entities. ChristianKl 18:56, 29 May 2020 (UTC)

Merging Israeli settlements

Israeli CBS municipal ID (P3466) has a bunch of cases where one settlement ID currently gets used for multiple items. I think it's likely that most of them should be merged, but it's hard for me to know without knowing Hebrew. Can someone who speaks Hebrew look into it? ChristianKl 19:57, 29 May 2020 (UTC)

@יונה_בנדלאק: ^^ --Liuxinyu970226 (talk) 22:55, 29 May 2020 (UTC)
Which ones? (I'm not a big querying expert.) --Amir E. Aharoni {{🌎🌍🌏}} talk 06:11, 30 May 2020 (UTC)
Amir: a query is now linked from Property talk:P3466. When the constraint bot runs again, Wikidata:Database reports/Constraint violations/P3466 should be larger. That could use some attention. Not sure if merging is the solution. Multichill (talk) 15:48, 31 May 2020 (UTC)
From a quick look, most of these are municipalities that were merged (in real life) into one administrative unit. I'm not sure that merging the items is the right thing. I'll take a closer look some time soon.
Is there some policy about it? Surely Israel is not the only place where this happens :) --Amir E. Aharoni {{🌎🌍🌏}} talk 17:02, 31 May 2020 (UTC)
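
For anyone who wants to list the affected cases offline, here is a minimal sketch in Python. It assumes the (item, P3466 value) pairs have already been exported from a query or dump; the function name and the sample pairs are hypothetical and not real data. It only reports which ID values are shared by several items and does not decide whether those items should be merged.

```python
from collections import defaultdict


def shared_ids(claims):
    """Return {id_value: [items]} for ID values used on two or more items.

    `claims` is an iterable of (item_id, id_value) pairs, for example
    exported from a query on Israeli CBS municipal ID (P3466).
    """
    by_value = defaultdict(set)
    for item_id, id_value in claims:
        by_value[id_value].add(item_id)
    # Keep only the values that appear on more than one item.
    return {
        value: sorted(items)
        for value, items in by_value.items()
        if len(items) > 1
    }


# Hypothetical placeholder pairs, not real Wikidata statements.
sample = [("Q1", "0001"), ("Q2", "0001"), ("Q3", "0002")]
print(shared_ids(sample))
```

Each reported group still needs a human check, since the shared ID may come from a real-life merger of municipalities rather than from duplicate items.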

Request for opinions on alternate accounts and Checkuser

Hello,

In reference to a recent discussion your community had here, the Foundation received a community question whether we could help pull together an overview how alternate accounts are handled on other projects.

In terms of what English Wikipedia does in such cases, their policy prohibits using alternate accounts in a specific list of circumstances and explicitly allows using alternate accounts in another specific list of circumstances. English Wikipedia policy also provides for exactly how legitimate alternate accounts should be disclosed to either the community or to a Checkuser or member of the Arbitration Committee. This disclosure is handled not through an OTRS queue, but rather through privately e-mailing either the Checkuser or the Arbitration Committee's mailing list (or an individual Arbitrator via private email) hosted by the Foundation. I believe English Wikipedia uses Checkusers rather than oversighters for this purpose because Checkusers are the ones charged with handling sockpuppetry investigations and thus would be the people who would need to know if an account is a legitimate alternate account. While it is not, strictly speaking, required that a user of an alternate account notify Checkusers or the Arbitration Committee, users who do not make this notification and also do not publicly disclose their main account run the risk of being blocked as sockpuppets.

There is also an important contextual point. Unlike Wikidata, English Wikipedia has fairly elaborate conduct policies that help coordinate the mitigation of potential damage caused by such accounts. We understand that Wikidata has a draft policy closely resembling the one successfully used by the Wikimedia communities in technical spaces - for years by now. So it might be worthwhile to explore adopting both items together.

We hope this helps next time you are exploring these topics.

Best regards, --PEarley (WMF) (talk) 16:41, 30 May 2020 (UTC)

This is more confusing than it helps, to be honest.
But since you are now interfering here anyways, can you please *actually* provide "an overview how alternate accounts are handled on other projects", exactly as you initially wrote above? You just mentioned English Wikipedia only, and I already wrote in another recent checkuser policy related discussion that this sort of "English Wikipedia imperialism" is not particularly loved by everyone here. If you just want to force us to adapt English Wikipedia policies, you can state that explicitly, rather than by providing imbalanced input. —MisterSynergy (talk) 17:31, 30 May 2020 (UTC)
  • @PEarley (WMF): As MisterSynergy already said, unfortunately this answer doesn't provide clarity.
I contacted the Trust and Safety team because it's unclear to me how this type of information can be, and is allowed to be, managed. If a user discloses their secondary account to one CheckUser/ArbCom member, what kind of channels exist for that CheckUser to share that information with other CheckUsers/ArbCom members? It seems to me that some sharing of this information is necessary to prevent the person getting blocked by other CheckUsers/ArbCom members who aren't the person who gets the initial email.
I asked "Can you as Trust and Safety Team clarify in our discussion what kind of private discussion venues are permissible given our goals of protecting the privacy better?" because I want to know how we can share such information, and your response doesn't answer whether or not we can have some form of a private mailing list/OTRS queue/Wiki for sharing it.
To the extent that you understood the question as "whether we could help pull together an overview how alternate accounts are handled on other projects", is the answer you came to after deliberating for a week "no", given that you only answered about a single project? ChristianKl 08:11, 31 May 2020 (UTC)

HTML elements

Recently Dhx1 imported some basic data about HTML elements. More data is available in MDN that could be imported, although I have many doubts about how that data should be modelled. For example, cite (Q94100306):

Property in MDN | Property in HTML LS | Property used | Example value
Content categories | Categories | instance of (P31) | flow element (Q94102244)
Permitted content | Content model | has part(s) of the class (P2670) | phrasing element (Q94102316)
Tag omission | Tag omission in text/html | Maybe has characteristic (P1552) | <mandatory starting and ending tag>?
Permitted parents | Contexts in which this element can be used | Maybe new property called <can be children of>. I think this is not exactly an inverse of 'permitted content'. |
Implicit ARIA role | | ? |
Permitted ARIA roles | Accessibility considerations § For authors | ? |
DOM Interface | | ? |
Attributes | Content attributes | Maybe has part(s) of the class (P2670) with value <global attributes>, qualifier nature of statement (P5102) possibly (Q30230067)? |
Specifications | | described at URL (P973) | <https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-cite-element>

Also, I wanted to use has use (P366) to add the intended use of the element, but there are cases like i (Q94100343), which has been defined since HTML Internet Draft 1.2 for marking italic (Q344098) but was redefined in HTML5 for marking text with a special use. I was thinking of adding both values, with a qualifier, although maybe I should create another item. Thanks. --Tinker Bell 03:45, 31 May 2020 (UTC)

Universal Decimal Classification (P1190)

Is there a list anywhere of libraries that use the Universal Decimal Classification?

--Nstrc (talk) 07:54, 31 May 2020 (UTC)

Classification of edition items?

Dewey Decimal Classification (P1036), Library of Congress Classification (P1149), Chinese Library Classification (P1189) and Universal Decimal Classification (P1190) are only mentioned at Work_item_properties, but not at Edition_item_properties. This makes sense, if the different editions have the same content.

But sometimes there are considerably enlarged or shortened editions, which would require a different classification for different editions. Or should such "editions" rather be classified as 'works' in their own right?

--Nstrc (talk) 08:20, 31 May 2020 (UTC)

Mmmh. -
Regarding
--Nstrc (talk) 15:21, 31 May 2020 (UTC)
Sorry, since there are three threads on this topic and my English is limited I can't follow. Please ask Wikidata:WikiProject Books. --Kolja21 (talk) 15:39, 31 May 2020 (UTC)

  WikiProject Books has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

: Cfr. as well above Regensburger Verbundklassifikation (P1150).
--Nstrc (talk) 15:46, 31 May 2020 (UTC)
PS: Sorry, I didn't intend to create such a long list. Can someone correct my syntax, please?
--Nstrc (talk) 15:49, 31 May 2020 (UTC)
See also Wikidata:Forum#LoC-Klassifikation and Wikidata:Forum#Unterschied LoC-Klassifikation - RVK. Opening more threads does not help with clarification. --Kolja21 (talk) 15:56, 31 May 2020 (UTC)

True duplicates

Yesterday and today I ran some batches with QuickStatements, and they created quite a few true duplicates, e.g. Cerastium murale (Q95915220) and Cerastium murale (Q95915221). Is there a problem with the database?

I can find and merge items I created, but is there a way to query all current true duplicates? --Robot Monk (talk) 13:38, 31 May 2020 (UTC)

@Robot Monk: It does appear that it's a database issue. Normally it's impossible to have two items with the same label and description. I get "Item QXXXXXX already has label "Cerastium murale" associated with language code en, using the same description text." when trying at Special:NewItem (tested on test Wikidata). --SixTwoEight (talk) 13:59, 31 May 2020 (UTC)
See Topic:Vkhd578n4cv9ndew.--GZWDer (talk) 14:27, 31 May 2020 (UTC)
If it's a lot of items, and you can generate a list somehow, you can put a request at WD:RBOT to get them merged. Edoderoo (talk) 15:58, 31 May 2020 (UTC)
Thanks to all, already merged them. --Robot Monk (talk) 16:54, 31 May 2020 (UTC)
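
On the question of finding such pairs among one's own batch items, here is a minimal sketch in Python. It assumes the labels and descriptions of the created items have already been fetched into a dictionary; the function name, the data layout and the sample description are hypothetical. It applies the rule quoted above: two items count as true duplicates when they share both label and description in the same language.

```python
from collections import defaultdict


def true_duplicates(items, lang="en"):
    """Group item IDs that share the same label and description in `lang`.

    `items` maps an item ID to a dict of the (assumed) form
    {"labels": {lang: text}, "descriptions": {lang: text}}.
    Returns a list of groups, each sorted by numeric item ID.
    """
    groups = defaultdict(list)
    for qid, terms in items.items():
        label = terms.get("labels", {}).get(lang)
        description = terms.get("descriptions", {}).get(lang)
        # Items lacking a label or a description cannot collide under this rule.
        if label is not None and description is not None:
            groups[(label, description)].append(qid)
    return [
        sorted(qids, key=lambda q: int(q[1:]))
        for qids in groups.values()
        if len(qids) > 1
    ]


# The description below is a placeholder, not the real one on these items.
sample = {
    "Q95915220": {"labels": {"en": "Cerastium murale"},
                  "descriptions": {"en": "placeholder description"}},
    "Q95915221": {"labels": {"en": "Cerastium murale"},
                  "descriptions": {"en": "placeholder description"}},
    "Q1": {"labels": {"en": "something else"},
           "descriptions": {"en": "placeholder description"}},
}
print(true_duplicates(sample))
```

The resulting groups could serve as the list for a request at WD:RBOT, as suggested above.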

Scholarly articles with titles in square brackets

Currently a large number of scholarly article items exist where the item label is [wrapped in square brackets]:

# Scholarly articles whose label starts with "[" and ends with "]"
SELECT ?article (LANG(?label) AS ?lang) ?label WHERE {
  ?article wdt:P31 wd:Q13442814;   # instance of: scholarly article
    rdfs:label ?label.
  FILTER((STRSTARTS(?label, "[")) && (STRENDS(?label, "]")))
}
LIMIT 10000

Following the DOI links, this may be a convention for marking article titles that were translated into English. In my opinion Wikidata is able to model multilingual titles (maybe with native label (P1705) or language of work or name (P407)?), so the square brackets can be removed. Sometimes labels like [no title], [kein Titel] or [Not Available] are also used, and these should be removed as well.

Since lots of these items seem to have been created by @Daniel_Mietchen: would it be possible to remove the square brackets beforehand in the future? --Haansn08 (talk) 02:06, 27 May 2020 (UTC)

  • Actually the square brackets have a meaning. If they are dropped, the information needs to be stored otherwise.
"no title" is being fixed (I hope). --- Jura 09:08, 27 May 2020 (UTC)
@Jura1: Could you clarify how the information should be stored if I were to remove the brackets? --Haansn08 (talk) 22:58, 27 May 2020 (UTC)
@Jura1: Since the actual title and language are not known, would it be OK to add language of work or name (P407) = non-English (Q66724591) to all matching items and remove the square brackets from the labels? --Haansn08 (talk) 07:33, 28 May 2020 (UTC)
I'd do so (from labels, not the title statement). --- Jura 08:23, 28 May 2020 (UTC)
@Vladimir Alexiev, Jura1, Haansn08: Hmm, I'd recommend the opposite approach - if we go from enLabel "[Notes on glucose]" to "Notes on glucose", we're implying the 'proper' English title is "Notes on glucose", even if there wasn't any English title originally. I think it's better to either keep the square brackets or to remove the label entirely (assuming we have the correct label in the source language).
language of work or name (P407) = non-English (Q66724591) is a good idea, though, assuming we don't already have any language statement. Andrew Gray (talk) 22:52, 30 May 2020 (UTC)
I don't think removing the label entirely is a good idea. The title statement will still be there to indicate that it's a translation. --- Jura 06:22, 31 May 2020 (UTC)
In that case, is there really any harm in leaving the square brackets? It's a clear indication it's a translation, and it's a pretty normal convention in the field. It looks a little tidier without them, but it gives a misleading impression to anyone who doesn't think to explicitly check title statements. Andrew Gray (talk) 09:33, 31 May 2020 (UTC)
It's fairly common to add translated titles as labels when these are available. Why would we start adding brackets? I do think it's a problem that these are copied to random languages, but that is another question. --- Jura 14:54, 31 May 2020 (UTC)
@Andrew Gray: I've only seen this convention in PubMed. If you can prove it should apply to at least half of WD entities then we could discuss using it. --Vladimir Alexiev (talk) 11:08, 1 June 2020 (UTC)
"half of WD entities" is an absurdly high bar. - Jmabel (talk) 15:34, 1 June 2020 (UTC)
@Vladimir Alexiev: It's certainly not confined to PubMed; until this thread, I had believed it was a pretty universal convention :-). It's used in the major English-language library cataloguing approaches, which presumably is where PubMed gets it from (I think it's in both AACR2 and RDA) - a title that isn't present in the work itself gets square brackets, both translations & descriptive titles. It's also used by APA style.
I guess the point I'm getting at is that there isn't really an "English title" for a paper like this. There's an original name in say German, and a convenience translation in English. But if we take the square brackets out, it suggests there is a "proper" English title, like you would have for a properly translated work. Normally we wouldn't expect to have someone add an English label to something only ever written & published in German. Maybe I'm splitting hairs here, but it really does feel like this isn't the right way to do it. Anyway, I won't press the point, I just wanted to flag it up. Andrew Gray (talk) 20:57, 3 June 2020 (UTC)