Wikidata:Project chat/Archive/2020/06


Automated finding references: new data and a distributed game

 

Hello all,

As previously announced, the Wikidata development team is working on ways to automatically extract references from external websites, so editors can check them and add them on Items. On this feedback loop page, we presented our first batch of references and collected many useful comments.

Following up on the topic, we improved our data batch based on your feedback. On top of that, we created a distributed Wikidata game called Reference hunt!. With this game, you get a suggestion of an Item and a reference based on structured data from an external website. You can accept the reference if it is good (it will then be added on Wikidata), reject the reference if it is not fitting, or pass if you are not sure.

Feel free to try the game and give us feedback on this talk page! We will also track the edits made by the game with the tag reference-game and monitor the results (how many times people click on accept or reject on a suggested reference) to analyze the overall quality of the data batch.

Cheers, Mohammed Sadat (WMDE) (talk) 11:39, 25 May 2020 (UTC)

  • I think it would make sense to show the user some context for the claim. Maybe the sentence in which the "extracted data" appears. Without context it's hard to say whether the extraction worked well.
In addition, it would be great to only show users references in languages that the user can understand. My abilities to read Korean are very limited. ChristianKl 11:53, 25 May 2020 (UTC)
Thanks for your feedback. Since the extraction happens using "hidden" markup in the source code of the website, it is tricky to show the exact context. But we can do some more. Unfortunately, it is not possible at the moment to only show users references in languages that they know; it's best to skip these references and let them be handled by someone who speaks the language. Mohammed Sadat (WMDE) (talk) 10:58, 28 May 2020 (UTC)
  • Thanks for this game. It is quite cool how you could collect the data for this game. As ChristianKl suggested, providing some context for a claim would improve the game a lot. For example, if the extracted data is a date, I'll first have to open the source URL to check whether the date is a date of birth, date of death or some other date. --Pasleim (talk) 14:17, 25 May 2020 (UTC)
Pasleim, ChristianKl: we're taking this into account. Mohammed Sadat (WMDE) (talk) 12:39, 28 May 2020 (UTC)
@GZWDer: Thanks for your feedback. This has now been resolved and it should not happen anymore. Mohammed Sadat (WMDE) (talk) 12:42, 28 May 2020 (UTC)

Pretty cool, but so far not very game-like. Do you have a game designer consultant who's working on adding encouragement and rewards and fun surprises? Especially important because properly verifying the references is slow -- there is pretty much no way around reading the source page carefully in 95% of the ones I've done so far. I have some suggestions:

  • In order to properly manage the data, you should have two yes/nos for "is it reliable source" and "does it support the claim" rather than combining them. If a significant number of players thought the reference source was not reliable, you can take it off your list. And if a certain source is getting lots of "does not support," you can re-evaluate how you extract data from that source.
Thank you for your feedback Levana Taylor! Unfortunately the Wikidata Game does not support this and we're just hooking into it. But we are looking at the declined references to figure this out anyway. Just a bit more manual work.
  • At some point in the future, you could check if the user has Babel set up and only give them references in languages which they said their ability was at least level 2. And when they log in, ask them if they have Babel and if not, have a link to how to do it. If they don't have it show them any language.
As far as we know that's not possible, but we'll look into it more.
  • If you're only hunting references for claims that currently have no references, can you also hunt for ones where the only reference is "imported from Wikimedia project"? I know that might be tricky. Levana Taylor (talk) 14:36, 28 May 2020 (UTC)
These are already being taken into account as well. - Mohammed Sadat (WMDE) (talk) 08:51, 2 June 2020 (UTC)
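The Babel suggestion above can be sketched as a simple filter. This is only an illustration of the proposed rule (show a reference when the user's declared level is at least 2, and show everything when no Babel information exists); the `Suggestion` type, the `babel` mapping and the function name are hypothetical and not part of the Wikidata Game API.

```python
from typing import Dict, List, NamedTuple, Union


class Suggestion(NamedTuple):
    """Hypothetical shape of one suggested reference."""
    item_id: str
    url: str
    language: str  # language code of the source page


def suggestions_for_user(
    suggestions: List[Suggestion],
    babel: Dict[str, Union[int, str]],
    min_level: int = 2,
) -> List[Suggestion]:
    """Keep only references in languages the user declared via Babel.

    `babel` maps language codes to a level (0-5, or "N" for native).
    If the user has no Babel data at all, every suggestion is kept,
    as proposed in the discussion.
    """
    if not babel:
        return list(suggestions)

    def readable(lang: str) -> bool:
        level = babel.get(lang)
        if level == "N":
            return True
        return isinstance(level, int) and level >= min_level

    return [s for s in suggestions if readable(s.language)]
```

Whether the game can actually read a player's Babel settings is left open above, so the mapping is passed in as plain data here.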

Commons Special:MediaSearch or search Commons in any language

Hoi, Special:MediaSearch provides a working search interface. It is a prototype, it supports any and all languages and it is imho the best new functionality for our movement in a long, long time. What I would like for myself is a link like I have for Reasonator. I did find that you can tweak the results in Commons. For instance Mr Frank Vera is linked through "depicts" on Commons and among a lot of noise it is at least the first result.

Please have a look and please consider how it enables anyone to find pictures in any language. Thanks, GerardM (talk) 16:03, 30 May 2020 (UTC)

testing

I'm generally impressed. But let me give you a few terms that didn't have great results, because that might reveal some issues to be looked into:
But on the whole: at least 4 out of every 5 searches I tried in a variety of languages got me good results. - Jmabel (talk) 17:12, 30 May 2020 (UTC)

Some more attempts that didn't work well:

- Jmabel (talk) 21:36, 30 May 2020 (UTC)

This is the first prototype; it works after a fashion and it was brought to us warts and all. It is a mix of wikidatified material that includes a "depicts" statement in Commons and texts with labels superimposed. There is no display like in Hay Kranen's SDSearch. When a user named "Katze" has uploaded hundreds of files, it prevents German speakers from getting a nice display with only cats.
When you look for something and there is no label for an item in your language, you will not find it. So add a liberal sprinkling of labels and this search engine will do much better. You have to wait for the label to make its way into the search engine though.
In my experience so far, it works best for very concrete subjects: people, taxa, buildings. There is no true querying available so this is as good as it gets for now.
It is a prototype and for me it is the best thing that happened to the Wikimedia Foundation since Wikidata took over interwiki links. This makes us more of a movement. Thanks, GerardM (talk) 17:06, 31 May 2020 (UTC)
@GerardM: Do I take that to mean, "No, I don't want feedback on where it fails"? - Jmabel (talk) 15:37, 1 June 2020 (UTC)
@Jmabel: Feedback is good. Given that it is a prototype, and is branded as "it just works", it follows that it does not fail. It just does not work as we could hope for. Therefore what is needed is to find in what way it does work and where it needs improvement. There are a few things I have found:
  • there is text that should not be considered in a multilingual search - usernames
  • there is a need to only show the images that have been tagged as well as there is a need to show off the smarts of the search engine (I am impressed)
  • I am particularly interested in the multi lingual aspects of it all, it requires a lot more labels for it to be useful in most languages.
    • There is anecdotal evidence that an additional label for one scientist invites updates on dozens of pictures. The name used on images was different from the name in English.
    • Potentially we are talking dictionaries per language - How will that affect the performance of Wikidata? - How will that affect the performance of Commons?
    • Search needs descriptions in order to disambiguate. The quality of descriptions in Wikidata proves that they will not scale to 300+ languages. For me there is only one realistic solution; generated descriptions, preferably runtime.
  • There are user stories for different usages. Think GLAM, think Commonists working on license info, think Wikipedians looking for alternate illustrations.
    • user stories should not be in the format of "I want something for my community"; it is better to say "I need to show all material provided by the Tropenmuseum" (or any other GLAM), "I need to show all material for Batavia sorted by date", "I need all the representations of a particular artist" etc.
    • Potentially even pictures of English Wikipedia can be included so that we can start managing "fair use" material.
It helps a lot when tests include links to the search involved. One "nice to have" thing is to indicate the language the search is for.
I am expansive in my reply because feedback is so vital. I have played with the issues that were raised. Thanks, GerardM (talk) 17:32, 1 June 2020 (UTC)
  • I would suggest that usernames should not be entirely eliminated, but should require a qualifier (e.g. "username:"). Often searching on a username is very useful for those of us actually active on Commons.
  • I didn't initially provide links above, but the search strings were verbatim. I've now added the links, if that's the preferred way for you to get this. - Jmabel (talk) 01:49, 2 June 2020 (UTC)

Why does Wikidata:WikiProject_Books#Work_item_properties list only Dewey Decimal Classification (P1036), Library of Congress Classification (P1149), Chinese Library Classification (P1189) and Universal Decimal Classification (P1190), but not Regensburg Classification (P1150)?

Instead, Regensburg Classification (P1150) has a conflicts-with constraint (Q21502838) for book (Q571) and magazine (Q41298).

--Nstrc (talk) 08:03, 31 May 2020 (UTC)

"Proposal B can be viewed just as a transition towards the elimination of library-classification properties on publications, in order to supersede them with main subject (P921)."
I totally disagree with the idea that only main subject (P921) should be used for categorising the content of books (or essays). A given subject can be analysed by different academic disciplines; e.g. it makes a difference whether a book that deals with soccer is classified as
* UDC 79 "Recreation. Entertainment. Games. Sport" (cfr. LCC GV "Recreation. Leisure")
* UDC 33 "Economics. Economic science"
* UDC 94 "General history" (cfr. LCC D "History")
or
* UDC 34 "Law. Jurisprudence" (cfr. LCC K "Law").
Rather than aiming to eliminate library-classification properties on publications, I would prefer to be able to use not only Dewey Decimal Classification (P1036), Library of Congress Classification (P1149), Chinese Library Classification (P1189) and Universal Decimal Classification (P1190), but also Regensburg Classification (P1150).
Cfr. as well (unfortunately in German): Wikidata:Forum#Unterschied LoC-Klassifikation - RVK.
--Nstrc (talk) 13:08, 31 May 2020 (UTC)
PS:
Regarding
--Nstrc (talk) 15:31, 31 May 2020 (UTC)

Dewey Decimal classification is of somewhat limited usefulness because in the old days, a lot of libraries did their own assigning of classifications to books, so they'd vary widely. I don't know whether that's still current practice. But I can certainly see replacing that one with main subject (P921) -- legacy being the only reason it's kept. For LOC, though, you can look up what the Library of Congress itself uses, and so fewer libraries would disagree with it. From what Nstrc says, it sounds like all German libraries use a unified Regensburg classification. So yeah, in general classifications are useful info to have and are not at all the same thing as main subject (P921). Why not keep identifiers for classifications used by major national library systems? Levana Taylor (talk) 16:52, 1 June 2020 (UTC)

Facilitate discussing an edit

Hello. The history tab of an item contains, next to each edit, an undo and a restore link. When clicking either, one is offered a text field in which to explain the action. Confirming the action notifies the editor who performed the original edit. This workflow, once discovered by an editor, makes it easier (if not even incentivizes) to ask about a not-so-obvious edit by undoing it.

How about a discuss link next to each edit which:

  1. creates a new section on the item's talk page
  2. links to the edit
  3. notifies the user who did the original edit

Thanks. Toni 001 (talk) 08:32, 1 June 2020 (UTC)

Seems like a good idea. I would also add on my wishlist: "be able to see the history, but just for 1 property". It is kinda tedious to check the whole history when you just want to know when 1 property was changed, and filtering didn't look obvious to me (maybe there is a trick I do not know). --Misc (talk) 10:37, 1 June 2020 (UTC)
  Support for "be able to see the history, but just for 1 property" --Haansn08 (talk) 06:04, 2 June 2020 (UTC)
Discussing an edit seems to me like a general discussion feature that should be a task for the working group that takes the next stab at reimagining discussions on MediaWiki, and not a Wikidata-specific thing. ChristianKl 21:29, 1 June 2020 (UTC)

Wikidata weekly summary #418

Alert - bot malfunctioning on topic adding VIAF to GND humans

@Epìdosis, Kolja21, MisterSynergy, Ymblanter, ArthurPSmith: Can someone immediately stop it? User not listening to request https://www.wikidata.org/w/index.php?title=Topic:Vn7dpnl9v9dw6fer&topic_showPostId=vnh3cipij0kx1eqo#flow-post-vnh3cipij0kx1eqo MrProperLawAndOrder (talk) 18:55, 1 June 2020 (UTC)

At least 66 edits like [1] MrProperLawAndOrder (talk) 19:06, 1 June 2020 (UTC)

list for "p://schema.org/" MrProperLawAndOrder (talk) 20:11, 1 June 2020 (UTC)

Image attribution text

I like the new property attribution text (P8264). Is there any way to find images already used in WD that should have this property added? - PKM (talk) 21:07, 1 June 2020 (UTC)

Help explaining difference between P279 and P31 in Hungarian

Hello, as there is no Hungarian translation of this page https://www.wikidata.org/wiki/Help:Basic_membership_properties, could anyone explain to Palotabarát why he should not redo what he added, for instance, on this item: Consulate of the United States, Tabriz (Q20986676)? Thank you! Bouzinac (talk) 07:40, 2 June 2020 (UTC)

Yes, sorry, it was a typo. Bouzinac (talk) 09:12, 2 June 2020 (UTC)

Bouzinac: I understand what you are writing, please understand that the type of diplomatic mission should be included in this point. Undoing the edit will not resolve the issue. It’s more important to be in it — even if you’re in the wrong place — than not to be in it.

We have talked about this before, I wrote even then: be sure to have a consulate, a consulate general, a commercial representation, and so on. Palotabarát (talk) 11:14, 2 June 2020 (UTC)

Using Wikidata to create COVID-19 per capita data table

I'm trying to create w:Template:COVID-19 pandemic data/Per capita using Wikidata data (see my current early efforts at the sandbox, and lmk if you'd like to help). I'm having trouble retrieving the most recent case count, though, since unlike country population data, the most recent value doesn't seem to be ranked as preferred. How do I get that to happen? Sdkb (talk) 08:17, 2 June 2020 (UTC)
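One way to work around the missing preferred rank on the consuming side is to pick the statement with the latest point in time (P585) qualifier. The sketch below assumes the statements have already been fetched and simplified into dictionaries with hypothetical `amount` and `point_in_time` keys (ISO dates); it is not the Wikibase JSON format and not how the template currently works.

```python
from datetime import date
from typing import Dict, List, Optional


def latest_statement(statements: List[Dict]) -> Optional[Dict]:
    """Return the statement with the most recent point in time.

    Statements without a `point_in_time` value are ignored.
    """
    dated = [s for s in statements if s.get("point_in_time")]
    if not dated:
        return None
    return max(dated, key=lambda s: date.fromisoformat(s["point_in_time"]))


def per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 inhabitants."""
    return cases * 100_000 / population


# Illustrative values only, not real case counts.
case_counts = [
    {"amount": 120, "point_in_time": "2020-05-30"},
    {"amount": 150, "point_in_time": "2020-06-01"},
    {"amount": 135, "point_in_time": "2020-05-31"},
]
latest = latest_statement(case_counts)
rate = per_100k(latest["amount"], population=1_000_000)
```

Setting the preferred rank on the newest statement in Wikidata itself would make this selection unnecessary for every consumer, which is what the question above is really asking for.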

How to specify that a given infrastructure is a proposal/project and not an operating infrastructure?

I found several infrastructures that do not exist yet but are projects/proposals which may (or may not...) exist in the future. The issue is that I can't find a way to differentiate them from real, operating infrastructures of the same kind. Examples are Q67130279 or Q56063893, which are two projects for spaceports in Europe that have not yet been agreed. I'm looking for a way to flag this fact but can't find anything in the documentation. Is there a qualifier I can add to an existing 'instance of' statement, like 'status'? Or am I supposed to replace the existing 'instance of' with 'project' (or proposal? or something else?) with a qualifier 'of' 'spaceport'? Or is there a property to state that the whole item is a proposal? Thanks in advance for your help. Romain2boss (talk) 12:36, 29 May 2020 (UTC)

Use state of use (P5817). --Jklamo (talk) 17:13, 29 May 2020 (UTC)
Thanks for your answer. You mean a new statement state of use (P5817)=project (Q170584)? Looking at their Wikidata pages I'm not sure it fits well. I mean that it is not really 'used', and the definition and examples given are about existing infrastructures that are or have been used. Can you please point me to somewhere where I can make sure that it is a widespread use in this context? If this is the case, I think the state of use (P5817) definition and examples should be extended to include a project state. Romain2boss (talk) 20:42, 29 May 2020 (UTC)
Finally I used state of use (P5817)=proposed building or structure (Q811683) and am adding an example on property page. Romain2boss (talk) 20:35, 2 June 2020 (UTC)

Tags

I noticed that English Wikipedia has a tag called "possible BLP issue or vandalism" and I was wondering if this tag exists on WD as well. --Trade (talk) 22:06, 1 June 2020 (UTC)

We could create the tag, but the point is that on ENWP it is added by three specific edit filters, and those filters cannot readily be applied to Wikidata content. If you have a sample set of edits that reflect similar problems, we could try creating edit filters. Bovlb (talk) 01:03, 2 June 2020 (UTC)
"If you have a sample set of edits that reflect similar problems, we could try creating edit filters." The edits I had in mind have been suppressed for obvious reasons. --Trade (talk) 01:31, 2 June 2020 (UTC)
I think the tag should cover edits, on an item that is likely to be about a living person, to a statement with a property that has living people protection class (P8274). ChristianKl 08:57, 2 June 2020 (UTC)
@Bovlb: I was referring to edits such as this. I have also seen instances where someone would add words such as "child abuser", "sexual predator" and similar phrases into the descriptions of BLPs. Quite problematic to say the least. --Trade (talk) 15:48, 2 June 2020 (UTC)
@Trade: This sounds like an obvious extension to the existing Special:AbuseFilter/11.  @DannyS712, YMS, Matěj Suchánek - Bovlb (talk) 22:47, 2 June 2020 (UTC)
Needless to say, labels should be covered as well. --Trade (talk) 23:00, 2 June 2020 (UTC)

How to obtain this id? Eurohunter (talk) 17:44, 2 June 2020 (UTC)

@Eurohunter: It appears to be only found in the page source. In the source, search for "og:track_id" and you should find it. The property proposal has a bit more detail. Vahurzpu (talk) 23:44, 2 June 2020 (UTC)
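Searching the page source by hand can also be scripted. The sketch below assumes the ID sits in a `<meta>` tag whose `property` (or `name`) attribute is `og:track_id` and whose `content` attribute holds the value; that markup shape is an assumption, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaFinder(HTMLParser):
    """Collect the content of the first <meta> tag with a given key."""

    def __init__(self, wanted: str) -> None:
        super().__init__()
        self.wanted = wanted
        self.content: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.content is not None:
            return
        attr = dict(attrs)
        if attr.get("property") == self.wanted or attr.get("name") == self.wanted:
            self.content = attr.get("content")


def find_meta(html: str, wanted: str = "og:track_id") -> Optional[str]:
    """Return the content of the matching <meta> tag, or None."""
    finder = MetaFinder(wanted)
    finder.feed(html)
    return finder.content


# Hypothetical page source; the value 12345 is a placeholder.
sample = '<html><head><meta property="og:track_id" content="12345"></head></html>'
track_id = find_meta(sample)
```

If the real page stores the value differently (for example inside a script block), a plain text search for "og:track_id" as described above remains the fallback.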

What is the reverse property for facet of (P1269)? Eurohunter (talk) 19:05, 2 June 2020 (UTC)

do all properties have reverses? I'm not sure this one should have one. BrokenSegue (talk) 23:22, 2 June 2020 (UTC)
There's no inverse property for it and the inverse label is "has aspect". Whenever an inverse property exists it will be listed on the page of the property. ChristianKl 23:52, 2 June 2020 (UTC)

96,000,000th item passed

We just passed Q96000000, but the item itself is AWOL; see Q96000000 & Norbert Krisztián Kovács (Q96000001). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:22, 2 June 2020 (UTC)

Aren't we at 87,105,000+? Eurohunter (talk) 20:25, 2 June 2020 (UTC)
The Q-ID counter is at ~96.00M; according to my counting, there are actually ~86.81M items, ~2.87M redirects, ~1.40M deleted items, and ~4.93M Q-IDs that have been omitted (i.e. the items have never existed, just as Q96000000). The number given on the Wikidata:Main Page, currently ~87.11M, is close to ~86.81M, but for some reason somewhat different—I have no idea how it is calculated. :-) —MisterSynergy (talk) 21:48, 2 June 2020 (UTC)
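The breakdown above can be checked with a little arithmetic. The figures are the rounded values quoted in the reply, expressed in units of 10,000 Q-IDs so that the sums stay exact integers.

```python
# Rounded counts from the discussion, in units of 10,000 Q-IDs
# (8681 means ~86.81 million).
counts = {
    "items": 8681,
    "redirects": 287,
    "deleted": 140,
    "omitted": 493,
}

total = sum(counts.values())            # all Q-IDs ever issued
main_page = 8711                        # ~87.11M shown on the Main Page
gap = main_page - counts["items"]       # unexplained difference

print(f"total Q-IDs: ~{total / 100:.2f}M")   # close to the ~96.00M counter
print(f"Main Page gap: ~{gap / 100:.2f}M")
```

The four categories add up to about 96.01M, which matches the Q-ID counter within rounding, while the Main Page figure is about 0.30M higher than the item count.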

Applicability of RfP voting eligibility

I have stricken Nnadigoodluck's vote at Wikidata:Requests_for_permissions/Oversight/Kostas20142 as this user does not have 100 local non-automated edits. In two recently closed RfCU requests the user Operator873 was also ineligible to vote. Does the community agree with this? On the other hand, as CU and OS require 25 supports, having this requirement may make CU and OS requests a (very little) bit more difficult to succeed.--GZWDer (talk) 07:27, 27 May 2020 (UTC)

How does federation with Wikidata work?

If I wanted to stand up a new Wikibase (Q16354758) specifically oriented around collecting and storing data for a specific type of item (e.g. a database of productivity software (Q7247856)), is there a path where that data would be searchable / integrated into the main Wikidata instance?

I've seen multiple references to being able to link datasets, but I'm not sure it's very clear how that process works? Can someone shed some light on it for me?

- AlphaWeaver (talk) 12:22, 29 May 2020 (UTC)

@AlphaWeaver: There are definitely plans for this but I think it's still in early stages. See Wikidata:Federation input for a discussion here on what is needed. That said, I think you can implement federation at the level of SPARQL queries now, if that's sufficient for your needs. ArthurPSmith (talk) 17:42, 29 May 2020 (UTC)
@AlphaWeaver: Initial development efforts toward Wikibase federation are on the Wikibase roadmap for 2020/2021. Making it possible for one Wikibase to access via API the entities in another Wikibase (e.g. Wikidata) is something we are currently working on in a series of phases. Phase one has been underway since late February, and it consists of making it technically possible to access properties from one Wikibase to reuse in another. We started with properties for a variety of reasons, including stronger Wikibase user demand and [relatively] less technical complexity to develop. The problems we need to solve to enable users to access ~8000 Wikidata properties is on a different scale entirely than the problems we need to solve so that users can access millions of Wikidata items in a stable and reliable way. We strive to make significant progress on this topic in 2020! Samantha Alipio (WMDE) 11:10, 3 June 2020 (UTC)
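Federation "at the level of SPARQL queries", as mentioned above, usually means a `SERVICE` clause that forwards part of a query to another endpoint. The sketch below only builds such a query string; the local property URI is a hypothetical placeholder for whatever a separate Wikibase would use to point at Wikidata entities, and the query is meant to be run on that Wikibase's own query service, which must allow calls to the Wikidata endpoint.

```python
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical property on the local Wikibase that links a local
# item to the corresponding Wikidata entity URI.
LOCAL_LINK_PROPERTY = "https://wikibase.example/prop/direct/P1"


def federated_query(class_qid: str, language: str = "en") -> str:
    """Build a query joining local items with Wikidata labels."""
    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?localItem ?wikidataItem ?label WHERE {{
  ?localItem <{LOCAL_LINK_PROPERTY}> ?wikidataItem .
  SERVICE <{WIKIDATA_ENDPOINT}> {{
    ?wikidataItem wdt:P31 wd:{class_qid} ;
                  rdfs:label ?label .
    FILTER(LANG(?label) = "{language}")
  }}
}}
""".strip()


# productivity software (Q7247856), the example class from the question
query = federated_query("Q7247856")
```

This goes from the local Wikibase out to Wikidata; making local data searchable from within Wikidata itself is the part that depends on the roadmap work described above.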

New feature to be developed: Simple Query Builder

Hello all,

The development team is currently doing research on how to improve the experience of the Query Service for people who don't have knowledge about SPARQL. After evaluating several ideas and asking for your feedback, we decided to focus on providing a visual interface for people who would like to build simple queries, for example for maintenance. You can find the detailed announcement on this page.

We are looking for people to test our first prototype! We are especially looking for people that have a solid understanding of Wikidata’s data model but that don’t necessarily have a lot of experience with using SPARQL. More details on the page linked above. Cheers, Lea Lacroix (WMDE) (talk) 08:19, 4 June 2020 (UTC)

Properties for leaning/inclined towers (Q797765)

After fixing some vandalism on Leaning Tower of Pisa (Q39054), it occurred to me that it could be worth expanding. Beyond angle from vertical (P4183), a few other properties could be useful. I'm interested in more suggestions.

A query displays a few others. --- Jura 11:55, 4 June 2020 (UTC)

Which topics would you like to see Wikidata Tours for?

Hi all

As you may know, Wikidata:Tours are a really great way to learn about aspects of Wikidata; however, there are only a small number of tours and until now there has been no clear way to create them. Navino Evans and I have created a comprehensive guide on how to create Wikidata Tours, with a space to collaborate with others to build tours and to suggest tours you'd like to see. We hope that by increasing the number of tours available, new users will be able to learn how to contribute more easily and add data more consistently, making queries more reliable.

Please take a look and:

  • Suggest tours you would like to see available; this can be anything from a technical aspect of Wikidata, to how to create a specific kind of item (e.g. a person, museum etc.), to a specific task (e.g. adding a date of birth).
  • Share existing guidance on different aspects of Wikidata that could be turned into a tour.
  • Start writing new tours.
  • If you know JavaScript you can help transform a tour draft from a page of text into a working tour.

Thanks very much

--John Cummings (talk) 18:58, 21 May 2020 (UTC)

I think it could be beneficial to newer editors to provide a tour on creating literary source items (i.e. creating the work level item and edition level item and linking them, to use the edition as a reference). It can be a confusing concept to encounter for the first time and I didn't fully grasp the distinction until I understood certain properties can only apply to edition level items (e.g. an ISBN or a publisher). --SilentSpike (talk) 10:19, 22 May 2020 (UTC)
Nevermind, I see suggestions should be made in the table on the page linked! --SilentSpike (talk) 10:24, 22 May 2020 (UTC)
@SilentSpike: thanks very much for the suggestion, I'd be happy to work with you to put together a tour on this, I don't understand the topic well at all so if you could create a draft I can edit and make into the tour format (there's a proforma for a tour linked to from the design and write sections of the instructions). --John Cummings (talk) 15:56, 25 May 2020 (UTC)
Would there be any possibility of getting phab:T246814 resolved before we add more tours? We don't want to waste the time of Recent Change patrollers. Bovlb (talk) 19:45, 1 June 2020 (UTC)
@John Cummings: Forgot to ping OP. Bovlb (talk) 15:53, 4 June 2020 (UTC)
@Bovlb: thanks for the suggestion, I really want this to be fixed also but I don't know who would be able to or what skills would be needed. Do you know which software is affected and who is responsible for it, if it is maintained by WMF or WMDE etc.? --John Cummings (talk) 15:57, 4 June 2020 (UTC)
@John Cummings: From the discussion so far, there seem to be two proposed solutions:
  1. Add an edit filter that checks the item id against a hard-coded list and adds a tag. Any admin can do this, but we would need to (remember to) update the filter whenever we create new topics.
  2. Change the WikiTours interface to add a tag. This would be the more elegant solution, but would require work from a WikiTours interface developer. I don't know who that is, and I don't know how to find out.
Bovlb (talk) 16:07, 4 June 2020 (UTC)
Thanks @Bovlb: could you share this on phabricator? It seems sensible to centralise the discussion there. --John Cummings (talk) 16:27, 4 June 2020 (UTC)
@John Cummings: Done. Bovlb (talk) 16:47, 4 June 2020 (UTC)
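The first option discussed above (a filter that checks the item ID against a hard-coded list and adds a tag) boils down to a membership test. The sketch below is plain Python rather than AbuseFilter syntax; the item IDs and the tag name are placeholders, not the real tour items or an existing tag.

```python
from typing import FrozenSet, Iterable, List

# Placeholder IDs standing in for the items that tours ask users to
# edit. This list must be updated whenever a new tour is created,
# which is the maintenance cost noted in the discussion.
TOUR_ITEM_IDS: FrozenSet[str] = frozenset({"Q_TOUR_EXAMPLE_1", "Q_TOUR_EXAMPLE_2"})

TOUR_TAG = "wikidata-tour"  # hypothetical tag name


def tags_for_edit(item_id: str, existing_tags: Iterable[str] = ()) -> List[str]:
    """Return the edit's tags, adding the tour tag for tour items."""
    tags = list(existing_tags)
    if item_id in TOUR_ITEM_IDS and TOUR_TAG not in tags:
        tags.append(TOUR_TAG)
    return tags
```

With such a tag in place, Recent Changes patrollers could filter tour edits out, which is the concern raised in phab:T246814. The second option would set the tag from the tour interface itself and would not need the hard-coded list.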

Deletions of in use items because they do "not meet notability policy"

 
uses Q90739444

WikiProject sum of all paintings strives to "get an item for every notable painting", and hundreds of thousands of paintings on Commons rely on data pulled from Wikidata. We encourage people to update the metadata about artworks on Wikidata instead of on Commons. That is why I found it quite troubling that some pages on Commons were broken because items on Wikidata were deleted due to lack of notability. More specifically, I noticed pages searching for Q92121816 and Q90739444, which were deleted by User:Pasleim. I had to look up the Wikidata:Notability policy again, and it is unclear whether they meet the policy. Those paintings do not have "valid sitelinks" (very few paintings do), so we have to rely on case #2 with its "clearly identifiable conceptual or material entity" provision. For a painting to be notable, it must be possible for it to be "described using serious and publicly available references". In the case of those 2 items, it is possible to describe them using references, like this one for Q90739444, but whoever created the items did quite a sloppy job and did not add any references or sources. So here is the problem: we cannot be deleting items used by other projects to store metadata about distinct objects, even if those items do not meet our notability criteria. On Commons we have a policy that if an image is used on any project then it is "in scope" and will be hosted on Commons (assuming it also has a valid license). I think we need something like this here: if an item is used by another project to store metadata, then it is in scope. --Jarekt (talk) 03:09, 24 May 2020 (UTC)

We need to avoid circular policies. If Commons has the policy that each image which is used on any project is in scope, and Wikidata had the policy that items for images on Commons are notable, then vandals would just need to create both image and item at the same time and they would be notable on both projects. The above mentioned items were deleted because they were created by a globally blocked user. Such actions need to be possible independent of any Commons usage. --Pasleim (talk) 06:42, 24 May 2020 (UTC)
@Jarekt: I noticed it too and started Topic:Vmw7njupwlcqez67. Pasleim messed this up and should undelete it instead of trying to talk themselves out of this, that will only make things worse. Just restore the incorrectly deleted items. Multichill (talk) 08:44, 24 May 2020 (UTC)
@Multichill: You have also reverted edits by the blocked user [2]. Do you prefer that I restore all items created by them and then revert their edits afterwards? What is your opinion about edits like this? Is it okay that users can create new items on Wikidata, even if they are known to have created hoaxes and blatantly wrong statements for years and across many projects? --Pasleim (talk) 09:00, 24 May 2020 (UTC)
The standing of the user does not have anything to do with the notability of the items they create. The items were sloppily created, but we still need them and they should be improved, not deleted. Is there a policy that makes items created by banned users not notable? --Jarekt (talk) 12:11, 24 May 2020 (UTC)
Wikidata has currently no way to check the use of items at Wikimedia Commons. As long as this isn't possible, Commons contributors need to be aware of that and moderate their use of otherwise unused items. Commons developers are aware of the problem. --- Jura 12:18, 24 May 2020 (UTC)
Sloppy creations and sloppy deletions. The baby has been thrown out with the bathwater. Assuming the question asked by Jarekt is not rhetorical: No such policy exists. I see that undeletions slowly have started. Wikidata:WikiProject sum of all paintings/Creator/Giambattista Pittoni needs checking when that is done.
@Jura: your remark doesn't make sense in this context because the items linked to Commons using image (P18). Multichill (talk) 12:32, 24 May 2020 (UTC)
Good point. I withdraw it. The question seems to be if blocked users can continue to edit or not.--- Jura 12:39, 24 May 2020 (UTC)
Disallowing a user from editing is the whole point of blocking a user. If a user manages to evade a block, it is obvious that edits must be reverted and page creations deleted. English Wikipedia has a policy for this [3] but such a policy is unnecessary as it is common sense that block evasions are not tolerated. --Pasleim (talk) 17:19, 24 May 2020 (UTC)
Pasleim, I am not aware of any policy allowing deletion of otherwise fine items because one of the contributors is banned. Q90739444 had 4 contributors (including Multichill), why are you deleting their contributions? It is a waste of time for other contributors to help with an item which is then deleted to make some banned user feel bad. If you need to delete someone's contributions then that is fine, just replace them with your own. --Jarekt (talk) 00:09, 25 May 2020 (UTC)
I deleted only items where the banned user was the main contributor. I would say, on average 90% of the edits were made by the banned user, 9% by bots and 1% by other users. Q90739444 has one edit by BotMultichillT. Items with a greater contribution from other users I did not delete. --Pasleim (talk) 04:38, 25 May 2020 (UTC)

Speaking as a Commons admin, there is really a fundamental choice here. As Jarekt says in his initial remarks, we on Commons have been encouraged to update the metadata about artworks on Wikidata instead of on Commons. If Wikidata does not consider Commons' usage a sufficient criterion to keep an item, then we on Commons should stop doing that. There is already a lot of skepticism about Wikidata (and even more about structured data for Commons) from a lot of Commons users. If you want to address that skepticism, you pretty much have to accept Commons as a legitimate client whose needs must be met. If you don't want to, then please stop suggesting that Commons lean more and more on Wikidata and Wikibase. - Jmabel (talk) 17:20, 24 May 2020 (UTC)

Is there any Wiki project that doesn't have something against WD?--Trade (talk) 17:24, 24 May 2020 (UTC)
Also on Commons, edits by the here mentioned globally blocked user were reverted on grand scale [4]. --Pasleim (talk) 17:53, 24 May 2020 (UTC)
Wikidata does indeed usually consider Commons' data usage as sufficient for notability, but there are simply cases where Commons does a poor job as well. As this should not be a rant about the Commons project and I think that both projects should drastically improve collaboration with each other, I am not going to list the details here again. Yet, sometimes a deletion seems appropriate regardless of linked Commons data; content of a problematic user, such as in this case, could in fact be such a situation.
That said, once someone offers to review and verify the deleted items, I think they should be undeleted and improved. The concept of notability in Wikidata refers to a technical form that items should have (basically being identifiable and verifiable, based on serious sources), rather than being a measure for admissibility based on intrinsic properties of the described subject. With that in mind, one can understand more easily why quite a lot of items qualify for deletion, although the described entities would certainly be admissible if only a proper item was created. —MisterSynergy (talk) 18:12, 24 May 2020 (UTC)
I Strong oppose mass reverts or deletions of edits by blocked/locked/banned users (whether locally or globally, whether by admins, the community, or the WMF). Such edits may be discussed on their own merits, as if they were made by normal users, for the following reasons:
  • Wikidata data may be in use by various projects (including those where the users are not blocked, and non-WMF ones) and deleting them defeats the purpose.
  • Protection, blocks, etc. should be treated as technical measures to improve the data quality. If the quality is OK there is no need to delete or revert them. This also applies to edits that can be improved later.

In Commons there is strong consensus not to delete any contribution of blocked users. In Chinese Wikipedia (potentially positive) contributions of blocked or even WMF banned users are not removed at all, and the community treats deletion of contributions of banned users as killing every child born in violation of one-child policy (Q221719). Note that only the WMF is obliged to enforce WMF bans; nobody else is required to revert edits by banned users.--GZWDer (talk) 18:28, 24 May 2020 (UTC)

  • Pasleim, deleting items because they were edited by undesirable users is not one of the allowed deletion rationales stated in Wikidata:Deletion_policy#Deletion_of_items. We still need those items. Do we undelete, create new ones from scratch, or should we delete statements by the offending party and recreate them? How do you want to proceed? Keeping the original item ID would mean fewer changes on Commons, but I am OK with creation of new items and redirecting.--Jarekt (talk) 21:03, 26 May 2020 (UTC)
Just undelete them, every item should be kept/deleted on their own merit, not who created them. --RAN (talk) 21:18, 26 May 2020 (UTC)
@Jarekt: I can offer to create new items by bot, if for the paintings external sources exist. Not only with focus on Pittoni, but generally for all missing paintings. --Pasleim (talk) 11:59, 27 May 2020 (UTC)
@Pasleim: The items should be undeleted and then improved.--GZWDer (talk) 12:42, 27 May 2020 (UTC)
I just discovered a whole lot more of those deleted items, removing metadata from a bunch of files on Commons. Some of those files had contributions from several named users including me. So far I undeleted the metadata of the following files, deleted by Pasleim since they did "not meet notability policy", despite the fact that most of them had independent sources and all were being used:
Those few items were among 300 other items deleted for lack of notability in the same couple of days. To me all of those seem notable and all were being used, so I am still concerned that there is some misunderstanding about our notability policy. I do not have time to check 300 other items but many were works by Giambattista Pittoni and our list of his works shrank considerably in the last few days. Pasleim, can you undelete all the other items for artworks by notable artists like Pittoni that you deleted at the end of May? That way, the community can more easily decide if any of the artworks are not notable. --Jarekt (talk) 04:17, 3 June 2020 (UTC)
@Jarekt: I think you can undelete it now unless Pasleim opposes undeletion.--GZWDer (talk) 16:58, 5 June 2020 (UTC)
I was undeleting them each time I ran into a file on Commons which suddenly pointed to a deleted artwork item, but I am working on something else and wanted to avoid crawling through 300 deletions to see which ones are out of scope (some definitely are) and which are OK. Also I do not want to be doing this work if Pasleim is going to be deleting perfectly good items. I am also concerned that undeletion of the item does not restore all the links to the item which might have been deleted. Does anybody know how to look for those and restore them? --Jarekt (talk) 17:31, 5 June 2020 (UTC)

tourism slogan

Would it be correct to use the property motto text (P1451) [or motto (P1546) for when the motto itself is notable enough to have its own WD item] to include city/region/country tourism slogans (with 'start' and 'end' date qualifiers when appropriate)? Here is an English Wikipedia list for every country's current tourism slogan. We DO already use this property for places' formal mottos (e.g. France (Q142) -> Liberté, égalité, fraternité, and United States of America (Q30) -> E Pluribus Unum), and we also already use the Properties for commercial advertising slogans (e.g. McDonald’s (Q38076) has it for 5 different languages).
So, is it appropriate to use the Motto properties for tourism campaign slogans alongside these more formal mottos at the same time, or should they be separate with a different property? Furthermore, would that extend to cities and regional tourism campaign slogans, such as I Love New York (Q1353042)?
If it's not appropriate to 'mix' the 'formal motto' and the 'tourism slogan' of places, should "tourism slogan" be a separate property? Wittylama (talk) 13:02, 29 May 2020 (UTC)

I think the question is whether an entity would be likely to have both a "slogan" and a "motto", or if that's only a rare case where the same property for both would be fine? To me it seems like they would be generally clear from context so using the same property should be fine (i.e. add "tourism slogan" as an alias for the motto properties). ArthurPSmith (talk) 17:39, 29 May 2020 (UTC)
Pretty common to have both. E.g., for Virginia (Q1370), "Sic semper tyrannis" sends a pretty different message than "Virginia is for lovers". - Jmabel (talk) 22:35, 29 May 2020 (UTC)
Could we allow both on the same property and distinguish type of slogan/motto/etc by qualifiers where relevant? This would also let us encompass things like slogans used to market to investors, which are probably different from ones used to market to tourists. Andrew Gray (talk) 10:04, 2 June 2020 (UTC)
Here's a practical example now in use - with the 'latin motto' and the 'tourism slogan' next to each other for the item New York (Q1384): Q1384#P1451. If you think that is confusing and needs to use separate properties, or needs qualifying somehow to differentiate them, please advise. Wittylama (talk) 10:16, 2 June 2020 (UTC)

I've now made a batch-edit (featuring my first use of OpenRefine for reconciliation!) via QuickStatements to add everything from this table of national tourism slogans on English Wikipedia to their respective countries' Wikidata items using the property motto text (P1451). It covers 132 items with two edits each: one for the string of text, and one for the "imported from Wikimedia project -> English Wikipedia" reference. Example from the item Albania. Although I gave the QuickStatements batch a name, it still appears in the edit summary as: "#temporary_batch_1591202694827".
Once the batch has finished, the next question is how do I query for these to visualise on a map, in such a way that it brings the tourism slogans, not the 'formal' Latin mottos? Wittylama (talk) 16:56, 3 June 2020 (UTC)

@Wittylama: This query will get "all values of P1451 that have their language set to 'en'." (I haven't written the rest of it yet). However, a problem emerges pretty quickly - on for example Q31#P1451 or Q25#P1451, we have an English translation of the "formal" motto, plus the English-language tourism motto. Whether those should be there or not is another question (a translation isn't the same as the official motto) but it points up why specific qualifiers might be helpful, especially if the official motto is actually in English! :-)
In terms of what qualifiers might work, perhaps object has role (P3831) pointing to a new item such as "official motto" or "tourism slogan"? Andrew Gray (talk) 19:48, 3 June 2020 (UTC)
Object has role -> tourism slogan works nicely, yeah. Is it possible to select all the statements I’ve just created - and not the other ones you’ve identified in that query - in order to mass-add that qualifier? Wittylama (talk) 20:13, 3 June 2020 (UTC)
Not easily from my end (the query service doesn't know edit histories) but if you have the list you uploaded it might be possible to re-run it through QS or similar with the relevant qualifiers added? That will probably add them to the existing claims. Andrew Gray (talk) 20:15, 3 June 2020 (UTC)
I've set up some examples of qualifiers at Q1370 and adjusted the constraints accordingly. This query has all the current subclasses of "motto" - we have national motto (Q29654714) which would probably do for the main ones, but no generic "official motto". Andrew Gray (talk) 20:27, 3 June 2020 (UTC)
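To illustrate why a role qualifier helps here, the following is a minimal Python sketch, not a query against the live data. It assumes a hypothetical in-memory layout in which each motto statement is a dict with `text`, `lang` and `role` keys; the role labels stand in for the items that object has role (P3831) would point to, and the third entry is a placeholder for an English translation of a formal motto. Filtering by language alone returns both English strings, while filtering by role isolates the tourism slogan.

```python
# Hypothetical in-memory stand-ins for P1451 statements on Virginia (Q1370).
# The "role" key plays the part of an "object has role" (P3831) qualifier.
statements = [
    {"item": "Q1370", "text": "Sic semper tyrannis", "lang": "la",
     "role": "official motto"},
    {"item": "Q1370", "text": "Virginia is for lovers", "lang": "en",
     "role": "tourism slogan"},
    # Placeholder for an unqualified English translation of the formal motto.
    {"item": "Q1370", "text": "(English translation of the formal motto)",
     "lang": "en", "role": None},
]


def by_language(stmts, lang):
    """Return statements whose text is tagged with the given language."""
    return [s for s in stmts if s["lang"] == lang]


def by_role(stmts, role):
    """Return statements carrying the given role qualifier."""
    return [s for s in stmts if s["role"] == role]


english = by_language(statements, "en")
slogans = by_role(statements, "tourism slogan")
print(len(english), "English strings,", len(slogans), "tourism slogan")
```

In this sketch the language filter cannot tell the translation apart from the slogan, which is the ambiguity described above; the role filter can.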
@Andrew Gray: I worked out how to run a mass 'undo' on the batch (fortunately I hadn't closed the tab), added the relevant qualifier statement to all the items and re-exported to QuickStatements. I ran a batch of a single item - Albania - and it worked well: Q222#P1451. So, I then ran it for all the rest... and it broke. Most annoyingly, it broke in an inconsistent way: sometimes not adding the reference, sometimes not adding the qualifier, sometimes neither, sometimes both. So, I 'undid' that batch too. Here's the OpenRefine project if you feel like having a look at the 'schema' tab. Wittylama (talk) 22:52, 3 June 2020 (UTC)
@Wittylama: unfortunately that's a link to your local OpenRefine install :-). If you can get it to export a simple file (csv? excel?) with item name + desired motto text and email it over I can do the upload from this side for you. Andrew Gray (talk) 17:44, 4 June 2020 (UTC)
Ah, sorry Andrew Gray, I had assumed an OpenRefine project was visible if you shared the URL, like googledocs. Here's the code, formatted for Quickstatements: https://pastebin.com/2fCXV5eg . It's really quite a simple batch - each item gets a single statement, and all of them get the same single qualifier and reference. I've stripped away things like 'start date' and former slogans that appear in some items in the original table on En.WP for simplicity's sake. I've also stripped all en.wp's references because they're to each country's tourism website, not a reference to the slogan's existence itself. So, I don't understand why such a simple batch would fail in such varying ways.... Wittylama (talk) 14:02, 5 June 2020 (UTC)
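For readers unfamiliar with the batch shape described above (one statement per item, each with the same qualifier and the same reference), here is a hedged Python sketch that builds tab-separated commands in the QuickStatements version 1 style. The function name is made up for illustration, the slogan text is a placeholder, and `TOURISM_SLOGAN_QID` is a dummy value because the thread does not give the QID of the "tourism slogan" item. It is not the code from the pastebin link.

```python
# Placeholder only: the real QID of the "tourism slogan" item is not stated here.
TOURISM_SLOGAN_QID = "Q0"


def quickstatements_line(item, slogan, role_qid, lang="en"):
    """Build one tab-separated QuickStatements (version 1 style) command.

    item     -- QID of the country item, e.g. "Q222" for Albania
    slogan   -- slogan text to store as motto text (P1451)
    role_qid -- QID used as the value of the P3831 qualifier
    """
    # A double quote or tab inside the text would corrupt the command,
    # so reject such input rather than guess at an escaping rule.
    if '"' in slogan or "\t" in slogan:
        raise ValueError("slogan must not contain double quotes or tabs")
    fields = [
        item,
        "P1451", f'{lang}:"{slogan}"',  # motto text as a monolingual string
        "P3831", role_qid,              # qualifier: object has role
        "S143", "Q328",                 # reference: imported from English Wikipedia
    ]
    return "\t".join(fields)


print(quickstatements_line("Q222", "Example slogan", TOURISM_SLOGAN_QID))
```

Because every row is generated by the same function, a batch built this way is uniform, so inconsistent results across items would point at the upload step rather than at the input.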

@Wittylama: I'm beginning to suspect it was nothing more than coincidental QS problems! I'm running the batch locally (albeit very slowly) and it seems to be working OK.

I haven't filtered it to just countries as for the moment it's only showing them anyway (plus Virginia, which we added earlier). Andrew Gray (talk) 16:55, 5 June 2020 (UTC)

Integration of 20th Century Press Archives (Q36948990)'s country/subject archives metadata

A draft for the integration of the large archives' metadata (> 280.000 articles in 9000 digital folders) into Wikidata is outlined here, linking to details of the proposed data structure. Feedback is very welcome - cheers, Jneubert (talk) 09:24, 2 June 2020 (UTC)

It is meant not as a topical item ("real world object"), but represents the folder for the clippings (= digitized files) collected under that designation. These folders are sometimes rather artificial and may contain material about different items (e.g., 1918-1920 flu pandemic (Q178275), Cholera (Q12090) and other diseases). Jneubert (talk) 10:03, 2 June 2020 (UTC)
For Q91257808, wouldn't "main subject" be more appropriate for statements like "location = German Reich" and "has part = 1918-1920 flu pandemic"? --- Jura 09:22, 4 June 2020 (UTC)
Well, "location" is not a perfect fit here. It would be great to have a property like Dublin Core's dcterms:spatial - however, I was not able to find anything similar among Wikidata's properties. I would have used country (P17), if some of the "countries" were not cities (like Hamburg), regions (like Northern Europe, or the world) or groups of countries (like "Russian peripheral countries"). In my eyes, main subject (P921) would be under-specific here - due to the structure of the archive, each instance of PM20 country/subject folder (Q91257459) is defined by exactly one "country" and one PM20 subject category (Q92707903). In the future, this structure should be supported by a ShEx schema and corresponding validation. Having a mix of different geographical and topical items in "main subject" would make that much more difficult.
The use of "has part" is different. In the folder, clippings about dozens of different diseases have been collected. The subject "cholera", for example, applies only to a sequence of 7 documents, whereas the other 604 documents in the folder have no information about cholera. So this is really a subdivision, not a "primary topic" of the whole folder. You are right that the disease itself (cholera) is not a part of this folder. I suppose the formally correct thing would be to create an item for each subsection, e.g. "Germany : Individual diseases and their control - cholera". But I think that would be overkill.
The downside of the "has part" shortcut however is that the inverse relation is not true (cholera is not "part of" that PM20 folder). I can imagine that could mess up completely unrelated queries about diseases, which rely on the inverse relation. So perhaps using "has part" is not such a good idea.
Is there any property for "subtopic"? Could it make sense to propose one? Jneubert (talk) 17:35, 4 June 2020 (UTC)

QuickStatements thankyou

Thanks to whoever tuned up QuickStatements. It's now running quickly and without errors again. Long may that last. Levana Taylor (talk) 04:57, 3 June 2020 (UTC)

Getting number of cases (P1603) to rank most current figure as preferred

How do I change number of cases (P1603) so that it will automatically rank the most current figure as preferred? This is so that I can import the most updated case count to w:Template:COVID-19 pandemic cases/Per capita. Pinging TomT0m, who I see wrote a related page. Sdkb (talk) 06:55, 3 June 2020 (UTC)

Some kind of bot / script, I guess. Ghouston (talk) 07:01, 3 June 2020 (UTC)
See Wikidata_talk:WikiProject_COVID-19/Data_models/Outbreaks#Should_we_skip_fiddling_with_preferred_rank_until_the_outbreak_is_over? about ranking. --- Jura 07:04, 3 June 2020 (UTC)
I don't think it's possible to set up a property to do this automatically, but as Ghouston suggests, in principle a bot script should be able to do it reasonably simply (look at all values, pick the one with the most recent date, set as preferred, set any other preferred to normal). Andrew Gray (talk) 12:55, 3 June 2020 (UTC)
Technically it is not a problem to write a few lines of Lua code that find the P1603 statement with the maximum value of the P585 qualifier. Ghuron (talk) 18:19, 4 June 2020 (UTC)
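The steps suggested above (look at all values, pick the one with the most recent date, set it as preferred, set any other preferred value back to normal) can be sketched in Python. This is only an illustration on plain in-memory dictionaries, not a working bot: the function name and the `point_in_time` and `rank` keys are assumptions made for the example, and a real bot would have to read and write the statements through the Wikidata API.

```python
from datetime import date


def rerank_latest(statements):
    """Mark the statement with the latest point-in-time date as preferred.

    Each statement is a dict with hypothetical keys 'value',
    'point_in_time' (a date, standing in for the P585 qualifier) and
    'rank' ('preferred', 'normal' or 'deprecated').
    Returns new dicts; the input list is left unchanged.
    """
    # Deprecated or undated statements are never candidates for promotion.
    candidates = [
        s for s in statements
        if s["rank"] != "deprecated" and s.get("point_in_time") is not None
    ]
    if not candidates:
        return [dict(s) for s in statements]

    latest = max(candidates, key=lambda s: s["point_in_time"])

    result = []
    for s in statements:
        updated = dict(s)
        if s is latest:
            updated["rank"] = "preferred"
        elif s["rank"] == "preferred":
            # Any previously preferred value goes back to normal.
            updated["rank"] = "normal"
        result.append(updated)
    return result


cases = [
    {"value": 100, "point_in_time": date(2020, 6, 1), "rank": "preferred"},
    {"value": 150, "point_in_time": date(2020, 6, 3), "rank": "normal"},
    {"value": 120, "point_in_time": date(2020, 6, 2), "rank": "normal"},
]
for s in rerank_latest(cases):
    print(s["value"], s["rank"])
```

The sketch leaves deprecated statements alone on the assumption that a deprecated figure should never be promoted, even if it carries the latest date.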
@Ghuron, Jura1, Andrew Gray: Thanks for the replies. Is my post at WD:Request a query the right place to go to ask for the bot? Sdkb (talk) 03:56, 5 June 2020 (UTC)
@Sdkb: Wikidata:Bot requests is probably the best place to go - query requests is more for "I'd like to know the answer to this" as you can't actually edit through the query service. Unfortunately it's a little outside my competence to write a script for it, but hopefully someone there should be able to help. It might also be worth leaving a note at Wikidata talk:WikiProject COVID-19 if you're not already in touch with them, as they may also have been thinking about the same problem. Andrew Gray (talk) 09:08, 5 June 2020 (UTC)

How to track datasheets and reference manuals?

Hi,

In the properties proposal, there are several properties for identifiers in various external systems (book shops, museums, etc).

However this probably assumes that the URL doesn't change over time, or that if it changes over time, it can just be updated, without necessarily needing to keep a history of URLs and versions of the page.

system on a chip (Q610398) and various other hardware components have hardware documentation which in many cases comes in the form of a technical reference manual, at a given URL which is publicly accessible, in the pdf format.

These documents are crucial as without them there are many things that you cannot understand:

  • Software and hardware documentation relies on them
  • They are referenced in many projects' source code such as the Linux kernel (Q14579), Das U-Boot (Q1170607), sometimes with a URL and sometimes just by the name of the PDF files.

However:

  • Companies making hardware components are bought, merged, change strategies with regard to accessibility of information, etc. So the URLs tend to move around, and sometimes documents can even disappear if they are not archived by archive.org for instance.
  • The PDFs are typically updated over time and carry versioning information inside the file. Older PDFs are sometimes very relevant, as information is sometimes removed in the newer ones.

Would Wikidata be suited to archiving the versions and URLs of such documents?

If so what would be the way to represent something like that for a given company like Freescale Semiconductor (Q863675) which is now part of NXP Semiconductors (Q1155668), or Texas Instruments (Q193412) ?

GNUtoo (talk) 05:01, 4 June 2020 (UTC)

@GNUtoo: I think trying to describe them on the level of the manufacturing company is probably unmanageable, given presumably you have one datasheet per component (or series?), and a large number of components per company? I can see two possible practical ways to do this:
a) Use described at URL (P973) on the item about the particular component, pointing to a URL for the manual, with appropriate qualifiers (publication date (P577), edition number (P393), archive URL (P1065) etc). Add new URLs with new sets of qualifiers as needed for different versions. object has role (P3831) can be used to say "this is a data sheet (Q1172383)" (etc) if it's useful to distinguish between links.
b) For a more detailed approach, create an item for the manual itself, and link to the manual using full work available at URL (P953), with qualifiers as above. Link from the component item using described by source (P1343).
The first of these is probably more straightforward, though I don't know if we have a massive number of items for the right types of thing yet... Andrew Gray (talk) 17:36, 4 June 2020 (UTC)

"Current Uses" in property discussions

Hello, for some time now, the "Current uses" row in {{Property documentation}} seems to give a wrong number by quite a large margin. For example, it says 583,402 at Property talk:P691 but it's actually closer to 375,342 (based on this: https://w.wiki/T2T) or possibly 376298 (based on this: https://w.wiki/H5t). Any idea what might be causing it? If this is difficult to fix in the short-term, maybe we should hide the corresponding row in {{Property documentation}}. --Vojtěch Dostál (talk) 08:55, 5 June 2020 (UTC)

@Vojtěch Dostál: It’s probably counting qualifiers and references as well. This query comes much closer to the number recorded at {{Property uses}}:
SELECT ?countStatements ?countQualifiers ?countReferences (?countStatements + ?countQualifiers + ?countReferences AS ?sum)
WITH { SELECT (COUNT(*) AS ?countStatements) WHERE { ?item p:P691 ?statement. } } AS %countStatements
WITH { SELECT (COUNT(*) AS ?countQualifiers) WHERE { ?statement pq:P691 ?qualifier. } } AS %countQualifiers
WITH { SELECT (COUNT(*) AS ?countReferences) WHERE { ?statement pr:P691 ?reference. } } AS %countReferences
WHERE {
  INCLUDE %countStatements.
  INCLUDE %countQualifiers.
  INCLUDE %countReferences.
}
– 583415 is close enough to 583402 that I think the difference can be ascribed to edits that were made since the bot last updated the usage data. (But note that my query isn’t completely correct, since it doesn’t count "no value" qualifiers and references, I think.) --TweetsFactsAndQueries (talk) 09:11, 5 June 2020 (UTC)

@TweetsFactsAndQueries: Hmm, in that case it should not be taking into account P4876 for the calculation of percentage...Vojtěch Dostál (talk) 14:41, 5 June 2020 (UTC)

See Template talk:Property uses - Template:Number of main statements by property was created, but the current Property uses template needs a history split (and the old version may be updated via ElasticSearch). @Pasleim:.--GZWDer (talk) 16:56, 5 June 2020 (UTC)

Mapping the concept of participant in a non-professional, part-time, unpaid activity onto Wikidata ontology

I'm trying to add data to dungeon monitor (Q5315156), without much success. "Dungeon monitor" is a (typically) unpaid role at BDSM play parties; it's not a profession (requiring expertise) or a job (for which people get paid), nor is it a full-time activity: it would be for a few hours, now and again. Does anyone know how best to encode a participant in such an activity/role? -- The Anome (talk) 12:09, 5 June 2020 (UTC)

I feel like Wikidata uses "occupation" as a very broad term (maybe because nobody ever made anything else). For example bird watcher (Q84321442) is listed as an occupation even though it really isn't. I mean human activity (Q61788060) seems like a good fit but then you won't be able to tag humans with it in a sensible way. Maybe we should just broaden occupation and accept it lists anything they do even non-professionally. Or else make a new property for non-professional activities they engage in (but what would you call it?). BrokenSegue (talk) 15:28, 5 June 2020 (UTC)
It's worth noting that adding information about unpaid activities at BDSM play parties comes with huge privacy concerns. ChristianKl 19:44, 5 June 2020 (UTC)
Concur with ChristianKl. I'd hesitate to add that for anyone who hasn't made a public statement about that themself. @The Anome: were you planning to add it on anything short of that? - Jmabel (talk) 21:36, 5 June 2020 (UTC)
"anyone who hasn't made a public statement about that themself" I mean how else would the person adding the occupation know it? --Trade (talk) 23:54, 5 June 2020 (UTC)
Possibly from libellous or slanderous comments by the general public or unscrupulous media organisations. Even if a court case rules that the original claim is libel, the rumour will persist and may end up in one of our data sources. From Hill To Shore (talk) 02:05, 6 June 2020 (UTC)
Or, more reliably but I still think we shouldn't use it, if it came up in a court case in testimony (e.g. a liability case related to an accident at a club). - Jmabel (talk) 03:17, 6 June 2020 (UTC)
Frivolous or occasional occupations should be used with caution: they will be blindly regurgitated in infoboxes across the Wikimedia ecosystem, no matter how trivial, and will be propagated through every source that draws from Wikidata. The concept of 'definingness' should be invoked: is the subject commonly and consistently known as a dungeon monitor, or did they just do it once or twice? Many people have worked as a cashier (Q1735282) or waiter (Q157195) before becoming notable as an actor (Q33999) or politician (Q82955), and adding trivial occupations (even if verifiable) would be more noise than signal, pollution in my opinion. -Animalparty (talk) 00:12, 6 June 2020 (UTC)
I feel like we are starting to get off topic.--Trade (talk) 01:53, 6 June 2020 (UTC)

Updating Wikidata:Database reports/Constraint violations/P952

I have made many edits on items that have ISCO-88 occupation class (P952) over the last two days, but I cannot see that Wikidata:Database reports/Constraint violations/P952 has been updated as required. Suggestions, anyone? Pmt (talk) 13:25, 6 June 2020 (UTC)

  • It depends when the next database dump is available and gets processed. On June 3, the data from the database dump as of May 27 was used.
In the meantime, you can use query links on the property talk page. --- Jura 13:42, 6 June 2020 (UTC)
Thanks a lot, I am running the queries but a (readable and explainable) update is rather urgently needed for the Norwegian Wikipedia in accordance with ISCO-08 occupation class (P8283) and also https://mix-n-match.toolforge.org/#/catalog/148. So all kinds of help would be appreciated. Pmt (talk) 14:11, 6 June 2020 (UTC)
  • I just clicked on the "SPARQL" links of the three remaining constraints:
    • One shows 6 type violations
    • the second 6 format violations
    • and the third violations that aren't relevant.
One could past the result on some other page. --- Jura 14:20, 6 June 2020 (UTC)
@Jura1: "past" => "paste"? But I still don't understand what you are suggesting. - Jmabel (talk) 17:06, 6 June 2020 (UTC)
"paste" the results of the queries on some page to have an updated "list". --- Jura 17:08, 6 June 2020 (UTC)

Relator tool

Has anyone else experienced trouble with Relator tool recently? For the last couple of days I haven't been able to edit items or create new family members, and am not even sure if I am staying logged in. Even before this, when creating a new item (e.g. "child of X"), the relationship would not be reciprocal, requiring the manual addition of "father/mother of Y". -Animalparty (talk) 21:11, 4 June 2020 (UTC)

All the tools made by Magnus that use WiDaR / OAuth are having trouble: as the URLs of these tools are going to change, they need another OAuth token, and every tool needs its own token. I know he is working on getting this fixed. There are more tools that do not work like before, like adding properties with PetScan, managing items with Duplicity, etc. Magnus reported on his blog yesterday that Tabernacle (TABernacle (Q26882268)) is the first tool that he got fixed. Edoderoo (talk) 06:18, 5 June 2020 (UTC)
Same here. I didn't realize how much I depended on this tool until now. Gamaliel (talk) 12:49, 5 June 2020 (UTC)
You may try User:Frettie/consistency check add.js, family properties adding is easier with that tool.--Jklamo (talk) 18:19, 5 June 2020 (UTC)
User:Frettie/consistency check add.js has not worked for me for months, at all. -Animalparty (talk) 18:40, 5 June 2020 (UTC)
It works for me.--Jklamo (talk) 16:19, 7 June 2020 (UTC)
Me too. I am not able to log into the tool at all. ミラP 20:30, 5 June 2020 (UTC)

Yoon Chan-Young, Korean actor

I am new and don't know how to edit. This person was a young actor but he is now known as Jeon Sung Woo and is currently very active in tv series, movies, theatre and musicals. Can someone combine the two Wikipedia sites so his whole history is on the same page?  – The preceding unsigned comment was added by Kcjones801 (talk • contribs) at 06:52, 6 June 2020 (UTC).

@Kcjones801: Hi. As a first step we will need to know which pages you are talking about. I am guessing that you are asking us to combine two Wikidata pages; take a note of the "Q" reference at the top of each page and post the two references here in the format {{Q|123456}}. If you are talking about combining pages outside of Wikidata, please post the weblinks here. From Hill To Shore (talk) 13:32, 6 June 2020 (UTC)
I can only find Yoon Chan-yeong (Q18328892) (born 2001) and Jeon Sung-woo (Q55733983) (born 1987), unlikely to be the same person. Peter James (talk) 12:45, 7 June 2020 (UTC)

List of bots

Hello. Is there a list of bots who are importing authority control data from diverse language wikipedias? Something where I can see which bots do it and for which language versions they are still running. Best regards --Christian140 (talk) 05:55, 7 June 2020 (UTC)

Q29870196

Is 70 years or more after author(s) death (Q29870196) only meant to be used for written works? It uses the word "author" rather than "creator". I associate "creator" with a sculpture or a photograph, something tangible that you could touch. --RAN (talk) 17:56, 7 June 2020 (UTC)

Copyright law uses the term author, with other creators understood to be included in the definition. See, for example https://www.copyright.gov/title17/title17.pdf, which is the main body of the US' current law on copyright. It uses "author" 537 times while "creator" does not appear once. Matthias Winkelmann (talk) 19:10, 7 June 2020 (UTC)

OK, thanks! --RAN (talk) 19:14, 7 June 2020 (UTC)

5pb. is Mages' old name. They are the same thing. Thanks. GhostP. (talk) 02:26, 8 June 2020 (UTC)

From this site: http://5pb.jp/ it seems that both names exist. Apparently, 5pb. is the company and Mages is a label or brand by that company.

Reality television and reality show might be a same thing. In different interwikis this term is called either "reality television" or "reality show". For example in English Wikipedia article is called "reality television" and "reality show" is just a redirection to the main article. What do you think? --Ivannnnl (talk) 11:11, 2 June 2020 (UTC)

...although in this case, I think the case could be made for said to be the same as (P460). —Scs (talk) 12:39, 2 June 2020 (UTC)
I have the impression that frwiki sees one as a subclass of the other. It would likely be beneficial to define the items better so that it's clear which one should be used when, instead of going the said to be the same as (P460) route. Unfortunately, my French is awful. ChristianKl 13:55, 2 June 2020 (UTC)
frwiki distinguishes between the two concepts:
  • reality television: a type of documentary, talk show, etc.; it can be described as a programme about the normal life of people in their original environment.
  • reality show: a programme about people put in a particular environment, which shows on air what happens.
Snipre (talk) 21:02, 2 June 2020 (UTC)
  • Also, the "Big Brother" type shows are artificial setups created for entertainment: they aren't at least nominally attempting to depict reality like a documentary or news programme. One isn't necessarily a subclass of the other. Ghouston (talk) 12:41, 8 June 2020 (UTC)

Merging Q88696160 into Q7482890

Conflicting descriptions for certain languages prevent the merger. Would anyone please help in any way?--Jusjih (talk) 23:23, 6 June 2020 (UTC)

"Error: Conflicting descriptions for language fa." I do not understand Persian. Thanks for having answered.--Jusjih (talk) 04:05, 7 June 2020 (UTC)

Requesting small help

Hi,

I tried but could not add the new article 基塔卜 on zh (Chinese Wikipedia) to "Kithaab" (Q60686734). Could someone please help us connect the new entry?

Thanks and warm regards

Bookku (talk) 04:35, 8 June 2020 (UTC)

I've done it, by inserting the item into the Wikipedia section. Ghouston (talk) 04:44, 8 June 2020 (UTC)

Merging items: Lucius Iulius Graecinus (de) with Julio Grecino (es)

Good morning,

These entries correspond to the same person; I don't know how to merge them. Could someone please do it?

Thanks --scutum (talk) 10:31, 8 June 2020 (UTC)

Copied to Wikidata:Café. —Scs (talk) 14:16, 8 June 2020 (UTC)
He's just asking for a merger on Lucius Julius Graecinus (Q1241495) which has apparently already happened. - Jmabel (talk) 14:58, 8 June 2020 (UTC)

Wikidata items for temporary maintenance categories

Hi. How can we avoid the creation of QIDs such as Q32927605, or at least find an easier way to delete them after maintenance? Such items shouldn't be created, right? In this instance, I deleted the only linked category on enwiki today, which is how I stumbled over this QID. Rehman 11:56, 8 June 2020 (UTC)

We have established workflows to bring such items to admin attention without any user action being necessary. It usually takes a couple of days until these items are deleted, sometimes a bit longer. But we do not pile up any significant or concerning number of such cases. ---MisterSynergy (talk) 12:08, 8 June 2020 (UTC)
Thanks, MisterSynergy. Would you be able to link me to the page/discussion regarding this process please? I'm interested to know how this works. Cheers, Rehman 12:18, 8 June 2020 (UTC)
Per Wikidata:Notability/Exclusion_criteria, they may be deleted if they have only one sitelink. If totally empty they will be listed in Wikidata:Database reports/to delete/empty category items.--GZWDer (talk) 18:05, 8 June 2020 (UTC)
(Edit conflict) It is not a formally defined process, more like a set of individual routines developed over time. There is, for instance, the bot-maintained work list User:Pasleim/Items for deletion/Page deleted, which lists items from which the last sitelink has been removed; this is also the case for the mentioned item Q32927605. Some admins regularly work on this list and remove most incoming items in a timely manner. There are also several Listeria lists containing, for instance, "empty category items", "empty template items", and a couple of other similar meta stuff (see Wikidata:Database reports/to delete; some others are in user space as well). Mind that Wikidata does not require a deletion discussion before a deletion, so the removal of such items that are clearly no longer necessary is a process which barely anyone notices most of the time. ---MisterSynergy (talk) 18:08, 8 June 2020 (UTC)
Thank you for the replies, GZWDer and MisterSynergy. Good to know this. Cheers, Rehman 02:26, 9 June 2020 (UTC)

Research Projects on Wikidata

Hi All,

As far as I know, more and more research projects decide to use Wikidata, either creating their research data directly here or just linking it to their own database. Still, if I want to see a list of these projects or want to know how to bring new research to Wikidata, my only options are Wikidata:WikiProject Wikidata for research or somehow finding a WikiProject initiated by academics.

For example, the COURAGE project about cultural opposition created an amazing set of data here, yet you can only stumble upon it by digging in the data. This lack of transparency makes it difficult for non-academic users and researchers to "see" each other or even work together.

I think there could be a separate page, reachable from the main page and the left toolbar, which would serve as a directory for research projects. Here projects could have Mediawiki pages with a short description, members, official sites, queries listing all their edits, etc. A new property could also link the projects' item to these pages. It would also provide great visibility to Wikidata as a research platform or tool, and perhaps the new pages could even serve as places where members of the research teams could communicate.

What do you think?

Best, Adam Harangozó (talk)

  • COURAGE is a project whose creators don't seem to think their relationship with Wikidata is important enough to consider it a partner or otherwise mention that relationship on their main page. I don't see why Wikidata should list it prominently.
Ghazalgf created the property proposal for the relevant external ID property and seems to have added data. He hasn't written anything on his user page saying that he speaks for COURAGE or providing the transparency you are looking for.
Wikidata already has enough channels for people who want to collaborate to do so and adding additional channels risks diluting the existing ones. ChristianKl16:09, 8 June 2020 (UTC)
COURAGE was just an example, though I don't believe that we should point fingers at research projects saying it's their responsibility to communicate when nothing tells them how to on Wikidata (Still they even wrote an article about using Wikidata). There might be many channels but that doesn't equal quality communication and if you look at the page as an outsider, and as the example of COURAGE shows, it would help if there would be a central space for research projects. --Adam Harangozó (talk) 16:40, 8 June 2020 (UTC)
Could you define what you mean with "Research project"? ChristianKl21:33, 8 June 2020 (UTC)
According to the Wikimedia terms of use, it's the responsibility of users who make edits to disclose when they are paid for those edits. COURAGE seems to be a project that's paid for by a grant, and thus there's a responsibility for disclosure which is not being met. ChristianKl21:52, 8 June 2020 (UTC)

Margarethe von Trotta

Q94775666, Q94754322, Q55418 ? --C.Suthorn (talk) 16:14, 8 June 2020 (UTC)

Government information resources

Looking for some interesting ideas on teaching about government information resources.

Scope and Content as a qualifier for Archives At

Hello, Why is scope and content not allowed as a qualifier for Archives At? If you have archives at an institution and are allowed to list the title of the archival collection and its URL, why would you not want the collection’s scope and content included therein? This allows researchers to know if they should follow the link or not to the collection. Scope and content helps explain the title of the archival collection at a particular institution. Thank you, Jane

Two items on separate print copies of same portrait

Hi. I have just stumbled across Sir John Owen Knt (Q55008437) and Sir John Owen Knt (Q55008452) which describe separate print copies of the same portrait in the Welsh Portrait Collection (Q54859927). Should the items be merged as copies of the same portrait or should they be retained as describing separate print copies? If they should be kept separate, what justification would be best to use on different from (P1889)? From Hill To Shore (talk) 01:50, 6 June 2020 (UTC)

I've now found a third item on a print of the same portrait; Sir John Owen (Q77435063). From Hill To Shore (talk) 02:19, 6 June 2020 (UTC)
And Sir John Owen (Q55006249) makes four. From Hill To Shore (talk) 02:27, 6 June 2020 (UTC)
You might get better answers at Wikidata talk:WikiProject Visual arts, but keep in mind, nobody really knows anything, and Wikidata in many ways is a gigantic stumbling toddler. Anything goes, really, until it doesn't. Books can have separate editions, sculptures can have separate castings, and prints can have as many copies as lithographers or Xerox machines allow. Some copies are in libraries with fancy accession numbers. Some are scanned on Internet Archive. Some are on my wall or in millions of books in bookstores and libraries around the world. -Animalparty (talk) 07:20, 6 June 2020 (UTC)
  • My vote is to merge, since they are all the same page from a book. The entry would then show each collection that has a copy, but this is just my opinion. Think of it as a book with multiple identical copies, not new editions that may have a different foreword or corrections made. --RAN (talk) 07:47, 6 June 2020 (UTC)
  • The question occasionally comes up and, depending on how much one thinks one knows about the print(s) and their basis, different approaches are suggested.
The current situation for the above is standard if one starts with a "bottom up" approach. It's fairly easy to do when one just describes a single print (or thousands of prints in a given collection).
If you merge them, you'd need to change P31 to P279. Would the identifiers then be kept or deleted (or qualified somehow)? Also, you could end up doing separate subclasses for some of the cases mentioned or variations of the image. --- Jura 09:27, 6 June 2020 (UTC)
print (Q11060274) is already a group of works, not just one copy, as a subclass of work with multiple executions (Q28886448). Peter James (talk) 12:51, 7 June 2020 (UTC)
Interesting point, especially as work with multiple executions (Q28886448) has group of works (Q17489659) instead of work (Q386724).
I don't think the description of print (Q11060274) reflects that, nor that contributors who add prints are aware of it. Probably Q28886448#P279 needs fixing. --- Jura 05:00, 8 June 2020 (UTC)
Hi all. This issue has come up before, and I understand the desire to merge; however, from our point of view we would prefer them not to be merged, as the data items describe two different unique physical artifacts in our collection, each with its own catalog entry. Like other GLAMs, we have started round-tripping our Wikidata to power or enrich other services such as this, so from our perspective, and probably that of other GLAMs who share on Wikidata, merging multiple catalog items would be problematic. Happy to discuss the best way forward though. Best, Jason.nlw (talk) 10:53, 11 June 2020 (UTC)

The new domain structure for tools fails us - Reasonator among others no longer available

Hoi, Reasonator no longer works. This is the Phabricator ticket. What is the plan to bring back functionality that is broken by what to me is a cosmetic change? Thanks, GerardM (talk) 11:14, 8 June 2020 (UTC)

So, I can't really say if that's the reason, but having different sub-domains usually does help from a web security point of view, since cookies would no longer be shared with all applications if configured correctly. With the previous system, someone finding an XSS bug in one application could have escalated it to access all other applications in the domain with XHR. Having separate domains means that CORS and other policies can be deployed to protect against that.--Misc (talk) 11:44, 9 June 2020 (UTC)
The rationale for a change in domain is not relevant when a tool is not available. What is relevant is that Reasonator is now available again and, just as important, that the huge number of Reasonator URLs are automagically translated to the new domain. Thanks, GerardM (talk) 11:57, 9 June 2020 (UTC)
"The rationale for a change in domain is not relevant when a tool is not available"; yes it is Gerard. You might not be privy to the rationale or perhaps understand it, but it IS relevant. I know you care, but some things just don't happen instantly and/or perfectly. We are standing on a swamp of volunteer labor here, ppl have day jobs to attend as they do this on the side and wikimedia engineers have credentials to protect from people who register for the tool services with the sole goal to get access to non-public information. And when these 2 things combine, then sometimes something is offline for a bit. TheDJ (talk) 12:48, 9 June 2020 (UTC)
There is a difference between a rationale for a change (and, arguably, tools have improved significantly) and acceptance of an essential tool not being available. Support for understanding the information in Wikidata rests on tools like Reasonator. It is imho the best of breed when you want to understand our data in a language other than English. It is telling that we rely on what is described as a "swamp of volunteer labour" for essential functionality. What Special:MediaSearch Commons provides us is an option to give a use for the labels in Wikidata and find material in any and all languages. When we are to support the sum of all knowledge for everyone, this enables us to ensure that "All of Commons is available to every single person on the planet". So yes, I care, but what we do has to have a purpose. It is not only to bring the fastest queries to white dudes who understand SPARQL and support special interests that exist in our data. Thanks, GerardM (talk) 19:30, 9 June 2020 (UTC)

Wikidata weekly summary #419

The weekly update contains a query with the description "MWAPI searches in wikidata about people descibed as slave traders by citizenship" which produces a list of people. Some of the people in the list are living people. But the query only finds people with the word "slave" in a label, description or alias, so there is no reason to assume that they are slave traders. Some can be slaves themselves, some can be known for helping slaves, some can be named "Slave" in their own language, or the word could mean anything else in some languages. I think that it is very wrong to present a list of persons and say that they are described as slave traders when that isn't true. --Dipsacus fullonum (talk) 19:10, 8 June 2020 (UTC)

The middle name is misspelled. It should be "Gerard" - source: Austrian Centre for Digital Humanities and Cultural Heritage. Please correct this item. -MaxxL (talk) 04:55, 10 June 2020 (UTC)

  Done--Ymblanter (talk) 06:59, 10 June 2020 (UTC)

Can someone help me getting the URL of this identifier working?--Trade (talk) 11:47, 10 June 2020 (UTC)

I had no luck with this either. Weird eh? Thanks for proposing another good Aussie property! --99of9e (talk) 13:40, 10 June 2020 (UTC)
Well, whatever you did, it works now. Now I just need to create a sufficient number of content descriptors. Please give me a call if you find a full list of them. @99of9:--Trade (talk) 21:35, 10 June 2020 (UTC)

Matching with OpenRefine

We used OpenRefine to auto-match our list of entities (eg. names of people, organisations, etc) with that of Wikidata. There were instances where the names were exactly the same, yet the search results showed a non-match. Is there something we can do to improve the results for such instances? Some examples that we encountered where our labels are exactly the same as that in wikidata: “Dick Lee”; “S. Rajaratnam”; “Zouk”.

To refine the search, we have tried adding “Singapore” under the property “Country” (see table below) as a parameter to increase the probability of a match, but it does not seem to work.

Singapore Infopedia ID | Singapore Infopedia Title | Country
SIP_1595_2009-10-30 | Dick Lee | Singapore
Example | Example | Example

 – The preceding unsigned comment was added by 116.88.233.116 (talk • contribs) at 14:45, June 10, 2020‎ (UTC).

Duplicate Entities

A question that our colleagues ask us is how Wikidata prevents the creation of more than one Wikidata record for the same entity. For example, if a record has been created for Gregory Peck (actor), there should not be another record with a different "Q" identifier for him. Are any bots or other technologies currently being applied to prevent this from happening?  – The preceding unsigned comment was added by 116.88.233.116 (talk • contribs) at 14:47, June 10, 2020 (UTC).

There are no technical processes stopping the creation of duplicate items (after all, there presumably are multiple people called Gregory Peck...), but there are various active measures that help ensure it doesn't happen or that it's caught and the records are merged reasonably quickly:
  • Wikipedia links. Two items can't link to the same Wikipedia item, so if you try and link your new page to w:Gregory Peck you'll quickly discover it's a duplicate. Similarly, there's a presumption that anything with a Wikipedia page will already have an item, so for most things which are "prominent enough" you'll know to look for a match rather than assume you need to create a new item in the first instance.
  • Internal links. It is quickly apparent that if we have two Gregory Pecks listed on the cast for To Kill a Mockingbird (Q177922), there must be something wrong. So hopefully once you start linking the new item to/from other items, it will become clear there's an issue.
  • Identifiers. Most identifier properties have reports set up to spot two items with the same ID, so if you try and link your new Gregory Peck to his IMDB entry, it'll flag it up as a probable error needing reconciled, because two different people shouldn't have the same IMDB ID.
  • Date matching. Reports like Wikidata:Database reports/identical birth and death dates/1 will spot that there are two people born on 5 April 1916 & died on 12 June 2003 - these can then be manually checked to see if they're the same person
  • Human observation - if all else fails, there's a good chance that someone looking for Gregory Peck's Q-number will spot that there are two of them and get it fixed.
As you can see, not all of these are likely to work in all circumstances, but if we have more than a minimal amount of data on the new item, they're likely to work. If we just have two entries with no metadata barring a name, it's more likely they'll be missed and remain unmerged. The best solution, of course, is for people to screen new items before creating them to make sure they're not a duplicate! Andrew Gray (talk) 15:23, 10 June 2020 (UTC)
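The date-matching check in the list above can be illustrated with a minimal sketch. It assumes the input is a plain list of (item, birth date, death date) tuples; the item IDs are hypothetical placeholders, and this is not how the actual database report is generated.

```python
from collections import defaultdict


def same_dates_report(people):
    """Group item IDs that share both birth and death date.

    `people` is an iterable of (item_id, birth_date, death_date) tuples.
    The IDs used below are hypothetical placeholders, not real Q-numbers.
    """
    groups = defaultdict(list)
    for item_id, birth, death in people:
        if birth is None or death is None:
            continue  # nothing to compare without both dates
        groups[(birth, death)].append(item_id)
    # Only groups with more than one item are candidates for manual review.
    return {dates: ids for dates, ids in groups.items() if len(ids) > 1}


sample = [
    ("item-A", "1916-04-05", "2003-06-12"),
    ("item-B", "1916-04-05", "2003-06-12"),
    ("item-C", "1916-04-05", None),
]
report = same_dates_report(sample)
print(report)
```

Each reported group still needs a human check, since two different people can share both dates.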
It's worth noting that an entry that just contains the name of a person and no metadata could be referring to multiple different people and thus doesn't fulfill our notability guidelines, so it's likely going to be deleted.
In addition to what Andrew said, we also don't allow two items with the same name and description. ChristianKl20:31, 10 June 2020 (UTC)
Yes, I forgot this one, though it's a very narrow technical limit - "John Smith / artist, 1880-1950" and "John Smith / artist (1880-1950)" would be allowed even though to a human they're basically the same, as would two items called "John Smith" with no description at all. It can certainly help flag up some duplicates, though! Andrew Gray (talk) 21:23, 10 June 2020 (UTC)
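How narrow the label-plus-description limit is can be shown with a small sketch. It assumes the check is an exact string comparison per language and that items without a description are not compared, as described in the comments above; the function name and data structure are hypothetical and do not reflect the actual Wikibase implementation.

```python
def violates_uniqueness(existing, language, label, description):
    """Return True if (language, label, description) exactly matches an existing item.

    `existing` is a set of (language, label, description) tuples.
    Assumption from the thread: items without a description are not caught.
    """
    if not description:
        return False
    return (language, label, description) in existing


existing = {("en", "John Smith", "artist, 1880-1950")}

# Exact duplicate: rejected.
print(violates_uniqueness(existing, "en", "John Smith", "artist, 1880-1950"))
# Same meaning to a human, different string: allowed.
print(violates_uniqueness(existing, "en", "John Smith", "artist (1880-1950)"))
# No description at all: allowed.
print(violates_uniqueness(existing, "en", "John Smith", ""))
```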

Extended search

I have been looking at items about laws, bulletins and other law-related things. I found the item Q19144054, whose label is not the one used in German Wikisource. But it has the property title (P1476), which contains the title from Wikisource, and there is a link to Wikisource. I haven't been able to find it with the search. When I look for Wikipedia articles, it works: I find them when I search with the article name, including names from other languages, so article titles are searched as well. What is included in the Wikidata search and what is not, when I use the general search in the top right corner of the page? --Hogü-456 (talk) 20:51, 10 June 2020 (UTC)

Merge or not?

Q21577184 and Gazprom Transgaz Samara (Q4131793). --Liuxinyu970226 (talk) 04:58, 11 June 2020 (UTC)

One is about the organization, another one is about the building. I would say no, but it should be some way of connecting them via a property.--Ymblanter (talk) 08:53, 11 June 2020 (UTC)
How about "Headquarters location"? -- The Anome (talk) 09:53, 11 June 2020 (UTC)

Modelling a supplier relation?

Hi, I'd like to state that organization A is a "supplier" of organization B (A provides goods or services to B), specifically Q96176445 supplies Q96176400 with vegetables. Is there a property or qualifier for this? For now I'm using partnership with (P2652) but it's obviously quite imprecise. Thanks! --A3nm (talk) 08:27, 9 June 2020 (UTC)

Wikidata Resolver - batch mode

Hello, is there a tool that returns all items from a list of possible values for a given external identifier? E.g. I would like to know all the items that have ISSN (P236) = ("0261-3077" OR "0028-0836" OR "2535-7492" OR ...) and have it return The Guardian (Q11148), Nature (Q180445), Toons Mag (Q64692267) ... I tried to do this through OpenRefine, but it also returns approximate values. Through Special:Search (haswbstatement prefix), I'm limited to 300 characters; Wikidata resolver only allows you to look up one value at a time; and I couldn't do it through Petscan. --Pablo Busatto (talk) 06:07, 11 June 2020 (UTC)
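One possible approach, not suggested in the thread itself, is to send the whole list to the query service in a single SPARQL query using a VALUES clause, which matches exact values only. The sketch below only builds the query text; the function name is hypothetical, and running the result on the Wikidata Query Service is left to the reader.

```python
def build_identifier_query(property_id, values):
    """Build a SPARQL query matching items whose `property_id` equals any of `values` exactly."""
    if not values:
        raise ValueError("at least one identifier value is required")
    # Escape backslashes and double quotes so each value is a valid SPARQL string literal.
    literals = " ".join(
        '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"' for v in values
    )
    return (
        "SELECT ?item ?itemLabel ?value WHERE {\n"
        f"  VALUES ?value {{ {literals} }}\n"
        f"  ?item wdt:{property_id} ?value .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}"
    )


query = build_identifier_query("P236", ["0261-3077", "0028-0836", "2535-7492"])
print(query)
```

Very long lists may still need to be split into several queries.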

Type constraint false positives (not respecting class inheritance)

Why is Q1304258#P669 showing that Adenauerallee (Q355872) should be an instance of a subclass of geographical feature (Q618123) when it already is? It's an instance of street (Q79007) which you can see is a subclass here: https://w.wiki/UCW --SilentSpike (talk) 17:55, 15 June 2020 (UTC)

The type constraint is about the item not the value. --SCIdude (talk) 19:59, 15 June 2020 (UTC)
Ah, I noted on the talk page that this seems confusing, as it is being referred to as a faculty of law AND as a building. I see now that the error states "but Q1304258 currently isn't", and I assumed that Q1304258 was Adenauerallee and not the Juridicum. Any chance of using something like Juridicum (Bonn) (Q1304258) here, to make this clearer? --WiseWoman (talk) 20:11, 15 June 2020 (UTC)
Only slightly embarrassed I overlooked this in my haste 🤦‍♂️ --SilentSpike (talk) 20:15, 15 June 2020 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. SilentSpike (talk) 21:46, 15 June 2020 (UTC)

coordinate location (P625) constraints and mappings

Hi, stumbled across longitudes > 180 (San Pablo-Toledo Magnetic Observatory (Q59508672)), which I changed: [5] to map the longitude to [-180°,+180°]. The problem occurs on the map, where a second planet is used to show longitudes > 180 (there is no second planet). I would assume that there are similar flaws when sorting coordinates, checking nearby places, etc. This is not an exceptional case:

#defaultView:Map
SELECT ?item ?itemLabel ?coord (COUNT(?item) AS ?countries) (IF(?countries = 1, SAMPLE(?country), ?countries) AS ?layer) {
  VALUES ?isa { wd:Q62832 }
  ?item wdt:P31/wdt:P279* ?isa .
  ?item wdt:P17 ?country .
    ?item p:P625 [
           psv:P625 [
             wikibase:geoLatitude ?lat ;
             wikibase:geoLongitude ?lon ;
           ] ;
           ps:P625 ?coord
         ]
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?item ?itemLabel ?coord

Questions:

  1. Is a longitude of >180° and <=360° a legal value in some contexts? Or is this a broken contract?
  2. How should wikipedias using such values proceed?
  3. What is the range of coordinate location (P625)? Adding longitude > 360° is rejected, but 180° < longitude <= 360° is not?
    1. Similarly, a latitude of > 90° or < -90° is not rejected, but should be: there is nothing above or below the poles (while for a longitude > 180° there is a possible interpretation).
  4. How to proceed?:
    1. Fix the rendering of longitudes by converting all values into [-180°,+180°] in the code for coordinate location (P625) or
    2. Tighten the constraint for longitude (I did not find any range constraint in the meta data) and
    3. Change the values manually / per bot to map into the usual range.
  5. and technically: How to get all longitudes >180° or <-180°, see next post.

best --Herzi Pinki (talk) 06:58, 11 June 2020 (UTC)

Coordinates may be used for other astronomical bodies, like the Moon or Mars. mw:Extension:GeoData#Glossary provides some information. Wikidata is meant to reflect information in sources, where possibly invalid values may come up, that's why entering invalid values is allowed, too. --Matěj Suchánek (talk) 09:31, 11 June 2020 (UTC)

checked some of the geohack listed apps. the following fail with longitude > 180°:

So I fear that allowing longitude > 180° breaks the contract. --Herzi Pinki (talk) 20:27, 11 June 2020 (UTC)

How to proceed? Fixing all the apps above (and more) or changing the rendering of longitudes lon > 180° to lon-360°? --Herzi Pinki (talk) 19:39, 12 June 2020 (UTC)
I think they are mostly magnetic observatories. @Trilotat:: would you double-check your additions? --- Jura 20:44, 12 June 2020 (UTC)
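The "lon > 180° to lon-360°" mapping discussed above can be sketched as follows. This is a minimal illustration for Earth coordinates only; the function name is hypothetical, and, as noted earlier in the thread, values on other globes may follow different conventions and should not be rewritten this way.

```python
def normalize_longitude(lon):
    """Map a longitude in degrees into [-180, 180] (intended for Earth coordinates only)."""
    wrapped = (lon + 180.0) % 360.0 - 180.0
    # The modulo maps +180 to -180; keep it at +180 for positive inputs.
    if wrapped == -180.0 and lon > 0:
        return 180.0
    return wrapped


for value in (10.0, 190.0, 355.6, 180.0, -180.0):
    print(value, "->", normalize_longitude(value))
```

Latitudes cannot be wrapped in the same way, since values beyond ±90° have no meaningful interpretation.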

Get all longitudes > 180° fails

I want to retrieve all longitudes > 180° (of coordinate location (P625)) with SPARQL, but I run into the time limit.

SELECT ?item ?itemLabel{
    ?item p:P625 [
           psv:P625 [
             wikibase:geoLatitude ?lat ;
             wikibase:geoLongitude ?lon ;
           ] ;
           ps:P625 ?coord
         ]
  FILTER ( abs (?lon) > 180 )
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
limit 100

any idea? --Herzi Pinki (talk) 06:58, 11 June 2020 (UTC)

According to Wikidata:Database_reports/List_of_properties/Top100 there are >8 million P625 claims in WD. This is beyond the WDQS capabilities. You need to work with a full WD dump yourself or use WDumper. --SCIdude (talk) 07:08, 11 June 2020 (UTC)
SELECT ?item ?itemLabel ?globeLabel ?lon 
WITH
{
  SELECT ?stv ?lon
  {
    ?stv  wikibase:geoLongitude ?lon .
    FILTER ( abs (?lon) > 180 )
  }
  limit 10000
} as %select 
WHERE
{
  INCLUDE %select 
  ?item p:P625 [ psv:P625 ?stv ] .
  ?stv wikibase:geoGlobe ?globe 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
SELECT ?item {
  ?item p:P625 [ psv:P625 ?node ] .
  ?node wikibase:geoLatitude ?lat .
  ?node wikibase:geoLongitude ?lon . hint:Prior hint:rangeSafe true .
  FILTER( ?lon < "-180"^^xsd:double || ?lon > "180"^^xsd:double ) .
}
--Matěj Suchánek (talk) 09:32, 11 June 2020 (UTC)
but this proposal runs into timeout ... --Herzi Pinki (talk) 20:39, 11 June 2020 (UTC)
@Matěj Suchánek: Your query doesn't work. It will time out because it doesn't handle the triples with wikibase:geoLongitude separately in a subquery before finding the statements, as the query by Jura does. But even if it could run without a timeout, it wouldn't return any results because of the wrong use of the hint:Prior hint:rangeSafe true . hint. When you compare a variable declared "rangeSafe" with a constant, Blazegraph will assume that all values of the variable have the same type as the constant. Here the constants (180 and -180) are of type xsd:integer and ?lon contains values of type xsd:double, violating that assumption and resulting in the filter removing all results. You should instead compare with constants that are also of type xsd:double in the filter. --Dipsacus fullonum (talk) 06:10, 12 June 2020 (UTC)
It was just a suggestion for optimization to start with. Improved in place according to Dipsacus fullonum's comment. --Matěj Suchánek (talk) 08:14, 12 June 2020 (UTC)

--- Jura 20:50, 12 June 2020 (UTC)
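For the dump-based route mentioned above, a minimal sketch could look like this. It assumes the standard Wikidata JSON dump layout (one entity per line inside a JSON array, coordinates stored under claims, mainsnak, datavalue, value); the function name and the sample entity IDs are hypothetical placeholders.

```python
import json


def out_of_range_longitudes(lines, limit=180.0):
    """Yield (item_id, longitude, globe) for P625 statements with |longitude| > limit."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # array brackets and blank lines carry no entity
        entity = json.loads(line)
        for statement in entity.get("claims", {}).get("P625", []):
            snak = statement.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # skip "somevalue" / "novalue" snaks
            value = snak["datavalue"]["value"]
            if abs(value["longitude"]) > limit:
                yield entity["id"], value["longitude"], value.get("globe")


def _entity(item_id, lat, lon):
    # Helper that builds a tiny dump-like entity for the demonstration below.
    value = {"latitude": lat, "longitude": lon,
             "globe": "http://www.wikidata.org/entity/Q2"}
    snak = {"snaktype": "value", "datavalue": {"value": value}}
    return json.dumps({"id": item_id, "claims": {"P625": [{"mainsnak": snak}]}})


sample_lines = ["[", _entity("Q_EXAMPLE_A", 39.5, 355.6) + ",",
                _entity("Q_EXAMPLE_B", 48.2, 16.4), "]"]
hits = list(out_of_range_longitudes(sample_lines))
print(hits)
```

On a real dump, `lines` would be the opened (and decompressed) dump file, read line by line so that the whole file never has to fit in memory.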

No Vandalism on Wikidata?

On the German Wikipedia there is currently a survey going on about whether to obtain authority controlled data from Wikidata. In the course of the discussion, many have criticized that vandalism can too easily go unnoticed here on Wikidata, making the overall data quality questionable. I think this is something that should (also) be discussed here, not only there. --77.3.167.180 10:58, 11 June 2020 (UTC)

  • The problem with discussing this with dewiki is that it's not clear if dewiki actually meets the basic requirements for all WMF wikis "that anyone can edit". If a wiki is mostly protected, obviously there can't be any vandalism. Is there some page that explains how dewiki meets the requirement? --- Jura 11:07, 11 June 2020 (UTC)
  • This may also be related to German Wikipedia having a general dislike for infoboxes, the most Wikidata-like content in Wikipedia language editions. So the attraction to have Wikidata integration may not be so compelling. However, on a bigger level I'm tired with this argument about this mythical "unreliable Wikidata" from Wikipedia-only editors, as if Wikipedia is or was always free of vandalism or strict sourcing. One doesn't need to look far to find high profile, long-lasting vandalism on Wikipedia (eg. "he looked himself up on Wikipedia, and someone had edited his entry to describe him as 'a Martian technology journalist.'" CBS News) yet Wikidata is being measured by a standard that Wikipedia itself would fail. Rather than devolving into a petty wiki-war, why not recognize that greater participation would mean more errors can be fixed, or tools can be created to mitigate them. Kind of like how things evolved in Wikipedia since 2001, when we made vandal fighting tools, user scripts, bots, technical protection features, and machine learning systems like ORES. We know each Wikipedia edition maintaining millions of dates, scores, prizes, place of birth, etc. is inefficient, inaccurate, and untenable. Let's stop infighting and get to collaborating. -- Fuzheado (talk) 12:48, 11 June 2020 (UTC)
    • This is especially sad because I do remember people criticizing Wikipedia because "you cannot trust it, anybody can edit it", so the irony is quite strong. Interestingly too, I never see any sources given on the unreliability of Wikidata. And I heard folks from Wiktionary saying they had the same type of remark and toxic interactions with some folks in the Wikipedia community, so it's not just us. I know that's likely just a tiny vocal minority, but this is tiring indeed and not at the level of collaboration I would expect in the community. --Misc (talk) 17:29, 12 June 2020 (UTC)
  • This discussion is on the same subject: ro:Wikipedia:Cafenea#Articole_cu_date_de_la_Wikidata. The problems are real, the solutions are not straightforward.--Strainu (talk) 15:44, 11 June 2020 (UTC)
  • Over the last few days I went through English Wikipedia's Category:VIAF different on Wikidata. When I started it had ~400 members, and I found that in about 2/3 of the conflicts, the template's value was wrong or otherwise substandard (linking to a single authority record with barely any information). This also caused me to make a couple authority control fixes in dewiki when I noticed mistakes they had made there had been copied to enwiki. Wikidata has people actively maintaining its authority control, and it's integrated with VIAF in a way that templates are not. I also question how much people want to vandalize authority control; while I've seen it and reverted it, it's not common. Vahurzpu (talk) 17:26, 11 June 2020 (UTC)

Protected page contains an error

PubChem's logo should be changed to File:PubChem logo.svg. Batreeq (talk) 00:42, 13 June 2020 (UTC)

how to describe inferences drawn from references?

Without being able to add notes to my edits, I sometimes have a hard time explaining why the references support the statements. The current case is the birth date of Rose Maria Wallop Powlett (Q75439556). Her birth was registered in the 4th quarter of 1881 (October-December); we know that the registration wouldn't be delayed very long, so the birth is not likely to have been before September. Her mother died on 23 September, so that gives us a latest date. If I state "born in September 1881," how do I best explain that the references [the birth registration record and mother's death record] support it? Levana Taylor (talk) 02:32, 12 June 2020 (UTC)

Qualifiers like latest date (P1326) and based on heuristic (P887) may be useful, and there's always the Talk page if you find something hard to explain with structured data. Ghouston (talk) 02:46, 12 June 2020 (UTC)
I'd add inferred from (P3452) vital record (Q18562479) as the opening line of the reference and then follow it up with details of the reference material. Instead of vital record (Q18562479) there may be a more specific item that describes the document or source you are looking at. From Hill To Shore (talk) 09:45, 12 June 2020 (UTC)
inferred from (P3452) vital record (Q18562479) seems wrong. inferred from (P3452) is about the item that's linked and not about what's represented by the item. If you want to refer to information that's listed in another Wikidata item, use inferred from (P3452). based on heuristic (P887) seems the best property for this use-case, and if one of the existing values doesn't do what you want you can create a new one. based on heuristic (P887) has the advantage that it's easier to translate a linked item once than to translate free text every time into dozens of languages. ChristianKl10:20, 12 June 2020 (UTC)
Good thought; I've created inferred from date of registration (Q96242113), which is straightforward; and inferred from death date of participant (Q96242333) with the description "determines latest possible date of an event from the death of a person involved in it" -- there must be a better label for that, ideas? Levana Taylor (talk) 18:33, 12 June 2020 (UTC)
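The reasoning behind the two heuristics can be written out as a small date calculation. This is only a sketch: the 30-day maximum registration delay is an assumption chosen to mirror "not likely to have been before September", the mother's death is assumed to be in 1881, and the function name is hypothetical.

```python
from datetime import date, timedelta


def birth_window(reg_year, reg_quarter, max_delay_days, not_after=None):
    """Return (earliest, latest) plausible birth dates from a quarterly registration.

    max_delay_days is an assumed upper bound on the delay between birth and
    registration; not_after is an optional independent latest bound.
    """
    if reg_quarter not in (1, 2, 3, 4):
        raise ValueError("reg_quarter must be 1, 2, 3 or 4")
    first_month = 3 * (reg_quarter - 1) + 1
    quarter_start = date(reg_year, first_month, 1)
    if reg_quarter == 4:
        quarter_end = date(reg_year, 12, 31)
    else:
        quarter_end = date(reg_year, first_month + 3, 1) - timedelta(days=1)
    earliest = quarter_start - timedelta(days=max_delay_days)
    latest = quarter_end if not_after is None else min(quarter_end, not_after)
    return earliest, latest


# Registered in Q4 1881; mother died on 23 September (year assumed to be 1881).
earliest, latest = birth_window(1881, 4, 30, not_after=date(1881, 9, 23))
print(earliest, latest)
```

With these assumptions the window falls entirely within September 1881, which is the month-precision value proposed above.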

Unifying heuristic items

ChristianKl's suggestion made me notice something that needs fixing: the description of based on heuristic (P887) says "indicates that the property value is determined based on some heuristic (Q201413)," but most items that are used as values of based on heuristic (P887) are currently instance of (P31) reasoning (Q1156402), with just a couple of heuristic (Q201413) and a couple of heuristic (Q1981968). These should be unified and/or the description changed. Levana Taylor (talk) 18:14, 12 June 2020 (UTC)

I think it may be useful to create a new item for the Wikidata heuristics, equivalent to Wikibase reason for deprecated rank (Q27949697). Ghouston (talk) 03:07, 13 June 2020 (UTC)

One particular form of vandalism that is particularly harmful to Wikipedias using data from Wikidata is bio data changes that go unnoticed for months or years. The most recent example I have is Alec Baldwin (Q170572), which was vandalized in October 2019. This could be easily fixed by a robot that checks known bibliographic sources (starting with the ones mentioned as source) and either reverts or raises an alarm on some dedicated page when the value from Wikidata does not correspond with what the sources have. Further improvements might include limiting bio data changes without sources for dead people to certain user groups (TBD which ones).

I'm opening this discussion for 2 reasons:

  1. To have a basis for a bot request
  2. To gather more properties that can easily be checked this way, and other improvement ideas.

Looking forward to your feedback. --Strainu (talk) 15:18, 11 June 2020 (UTC)
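A minimal sketch of the decision step such a bot might use, assuming the values from Wikidata and from each source have already been fetched and normalized to comparable strings. The names `Verdict` and `check_statement` and the four action labels are hypothetical; nothing here describes an existing bot.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Verdict:
    action: str  # "ok", "revert", "alert" or "skip"
    reason: str


def check_statement(wikidata_value: str,
                    source_values: Mapping[str, Optional[str]]) -> Verdict:
    """Compare one Wikidata value with the values found in external sources.

    source_values maps a source name to the value it reports, or to None
    when the source could not be consulted.
    """
    known = {name: v for name, v in source_values.items() if v is not None}
    if not known:
        return Verdict("skip", "no source could be consulted")
    if all(v == wikidata_value for v in known.values()):
        return Verdict("ok", "all consulted sources agree with Wikidata")
    if len(set(known.values())) == 1:
        agreed = next(iter(known.values()))
        # Sources are unanimous and differ from Wikidata: candidate revert.
        return Verdict("revert", f"all consulted sources say {agreed}")
    # Sources disagree among themselves: leave it to a human.
    return Verdict("alert", "sources disagree; list on a report page")


print(check_statement("1950-01-01", {"source A": "1949-01-01",
                                     "source B": "1949-01-01"}))
```

Separating "revert" from "alert" keeps the automatic action limited to the case where every consulted source agrees, which matches the "either reverts or raises an alarm" idea in the proposal.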

  • If the date has several refs, it's odd that it can be edited in that way. I think I had asked at Wikidata:Contact the developers to look into a similar one. Maybe re-ask them about it? --- Jura 16:10, 11 June 2020 (UTC)
  • Good initiative, keep us informed; I would move the thread to Wikidata talk:WikiProject Counter-Vandalism, which is not much frequented but should be the most appropriate place for these really needed discussions. I think the most frequent targets of vandalism are labels/descriptions/aliases, which are probably more difficult for bots to check but fortunately often less sensitive on the Wikipedia side. --Epìdosis 16:25, 11 June 2020 (UTC)
    • I do think that one of the issues is that we are not equipped to check modifications per property, only per item. I do use Listeria and SPARQL for that, but that's not as trivial as a watchlist. Maybe one solution could be to replace the infobox with a bot. Right now, if someone makes a change on Wikidata, the change appears on Wikipedia immediately but not in the watchlist of people who watch the article. If we replace the infobox with a bot that writes the change (like Listeria), then any Wikidata modification would appear on people's watchlists (made by the bot), and be fixed faster. --Misc (talk) 17:35, 12 June 2020 (UTC)
      • On the one hand it's bad that vandalism on Wikidata is directly visible in Wikipedia infoboxes. On the other hand that also means that if a Wikipedian corrects vandalism on Wikidata that affects a Wikipedia infobox, that correction is immediately visible. It would be bad if Wikipedians complained that even though they fixed the vandalism over at Wikidata the same vandalism is still visible for hours on Wikipedia.
I think the problem of watchlists not working correctly should be targeted more directly rather than hacking around it by using a bot. ChristianKl 22:46, 12 June 2020 (UTC)
It's pretty easy to see why Wikipedians would be unhappy that an article on their watchlist has effectively been vandalized without them getting any notification that it changed. Any chance someone can do something on either the Wikipedia side or here that would track when the content of a Wikidata-based infobox changes? - Jmabel (talk) 00:40, 13 June 2020 (UTC)
  • We have to check for more subtle errors or lock the fields for editors with less than 1,000 edits. I track and fix less subtle errors here: --RAN (talk) 19:22, 13 June 2020 (UTC)
Wikidata:Database reports/items with P569 greater than P570 People who died before they were born.
Wikidata:Database reports/unmarked supercentenarians People over 120 years old.
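The two reports above correspond to simple sanity checks on date of birth (P569) and date of death (P570). A rough sketch, assuming full-precision dates are already available as Python `date` objects (the function name and the messages are illustrative, and real Wikidata dates often have only year precision):

```python
from datetime import date


def lifespan_issues(birth, death=None, today=None, max_age=120):
    """Return a list of plausibility problems for a birth/death pair."""
    issues = []
    end = death or today or date.today()
    if death is not None and death < birth:
        issues.append("died before being born")
        return issues
    # Completed years between birth and the end date.
    age = end.year - birth.year - ((end.month, end.day) < (birth.month, birth.day))
    if age > max_age:
        issues.append(f"age {age} exceeds {max_age} years")
    return issues


print(lifespan_issues(date(1900, 5, 1), date(1890, 1, 1)))
print(lifespan_issues(date(1850, 1, 1), today=date(2020, 6, 13)))
print(lifespan_issues(date(1950, 1, 1), date(2000, 1, 1)))
```

Checks like these only catch the less subtle errors mentioned in the comment; a date shifted by a few years passes both.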

being pejorative

Hello,

There are 102 Wikidata items (not lexemes) which have instance of (P31) pejorative (Q545779):

SELECT ?item WHERE {  ?item wdt:P31 wd:Q545779 .  }

Thoughts? Visite fortuitement prolongée (talk) 17:09, 12 June 2020 (UTC)

I checked a few items, and clearly, most do not seem to respect WD:D. I think they should be cleaned up and descriptions be used instead; Wikidata is not here to replace the Wiktionaries. --Misc (talk) 17:21, 12 June 2020 (UTC)
I went through the list and found a handful to delete. shovelware video game (Q61838206) might also be changed, but given that it's used as main subject, I'm not sure what to do about it. The rest of them have links to other Wikiprojects. As long as we can't add links to lexemes we unfortunately need the items for the links. ChristianKl 18:52, 12 June 2020 (UTC)
At least some of these -- maybe most -- link to Wikipedia articles about pejorative terms. E.g. Xarnego (Q1067614), Republican In Name Only (Q3365273), McJob (Q641593). Why shouldn't they have instance of (P31) pejorative (Q545779)? Or am I misunderstanding the complaint? Can someone indicate ones that are problematic? - Jmabel (talk) 00:49, 13 June 2020 (UTC)

Mapping different aspects and roles in the slave trade

Hi all

Given the current higher public awareness and discussion of the slave trade I'd like to try and improve mapping and defining different aspects and roles of the slave trade and identifying some sources and databases to import data into Wikidata from. A few questions:

  1. How can I understand how it is currently mapped?
  2. Does anyone have any suggestions on how to map it better?
  3. Does anyone have subject matter knowledge that could help with mapping the topic and suggesting sources?

Thanks very much

--John Cummings (talk) 17:51, 12 June 2020 (UTC)


Update: I've found an extremely extensive database on many aspects of the slave trade which I've documented here Wikidata:Dataset Imports/Slave Voyages

John Cummings (talk) 18:03, 12 June 2020 (UTC)

@John Cummings: The http://enslaved.org/ project connects datasets about the transnational slave trade as Linked Open Data, and uses Wikibase for that. I think they're already covering a lot of what you're talking about. Spinster 💬 18:44, 13 June 2020 (UTC)

asteroids showing coords on maps of Earth

Hi, there seem to be some objects like Q64763312 and Ciinkwia Saxum (Q89208823) where, along with the coordinates, a map of old mother Earth is shown. I have no idea where to suppress the rendering of the map and the link to geohack with globe:earth for e.g. objects of type asteroid (Q3863). Help needed. --Herzi Pinki (talk) 19:34, 12 June 2020 (UTC)

This query can be used to tell where located on astronomical body (P376) and coordinate location (P625) have different values for globe:
SELECT ?item ?itemLabel ?locatedOn ?locatedOnLabel ?coordinatesGlobe ?coordinatesGlobeLabel
{
  ?item wdt:P376 ?locatedOn.
  ?item p:P625 ?coordinatesStatement.
  ?coordinatesStatement a wikibase:BestRank.
  ?coordinatesStatement psv:P625 / wikibase:geoGlobe ?coordinatesGlobe.
  FILTER (?locatedOn != ?coordinatesGlobe)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
--Dipsacus fullonum (talk) 06:24, 13 June 2020 (UTC)

I tried to update the pages, but I received this error message "Unknown globe http://www.wikidata.org/entity/Q11558". Anyone know how to fix it? --β16 - (talk) 12:21, 13 June 2020 (UTC)

I had no problems changing globe to 101955 Bennu (Q11558) in Special:diff/1206648783 using the script from User talk:Molarus/globe.js. The globe cannot be changed in the normal UI (phab:T56097). --Dipsacus fullonum (talk) 13:08, 13 June 2020 (UTC)
BTW: it seems to be a kind of luxury problem to have such violations only because of artificial redundancy. Wouldn't it be possible to interpret all coordinates with respect to located on astronomical body (P376) (defaulting to Earth (Q2))? --Herzi Pinki (talk) 15:46, 13 June 2020 (UTC)
That would require reading two properties each time one reads a map. Normally, the bot took care of them. Personally, I think we should move non-Earth coordinates to another property. --- Jura 17:00, 13 June 2020 (UTC)
Either way: putting the burden of performance optimization on the end user instead of on the system is not what I would expect from one of the most accessed websites on the globe. As seen from the point of view of WP, each object is read with a single request delivering the JSON representation of all properties of that object, so accessing one or ten properties does not make a difference. Accessing it twice also does not make a difference, since JSON objects are cached. Maybe getting a map here in Wikidata is much more optimized to reduce the amount of data that needs to be transferred to the client, but if filtering is done on the server, does it really make a difference? --Herzi Pinki (talk) 19:38, 13 June 2020 (UTC)
It would not make sense to me to build a complicated workaround because of a few cases of wrong data instead of correcting the wrong data, which would be much simpler and less error-prone. I am tempted to deprecate the wrong data because the numerical values are also implausible. The statement where I corrected the globe was for a regio on an asteroid with a diameter of 0.56 km, and the precision of the coordinates was given as 1 arcsecond, which is roughly 1 mm on the surface. That makes no sense at all. --Dipsacus fullonum (talk) 20:11, 13 June 2020 (UTC)
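The plausibility argument about coordinate precision can be checked with a short calculation. This sketch assumes a perfectly spherical body, which is a simplification for an asteroid; the function name and the Earth diameter used for comparison are illustrative.

```python
import math


def arc_length_m(diameter_km, angle_arcsec):
    """Length in metres of an arc on a sphere's surface for a given angle."""
    radius_m = diameter_km * 1000 / 2
    angle_rad = math.radians(angle_arcsec / 3600)  # arcseconds -> degrees -> radians
    return radius_m * angle_rad


asteroid = arc_length_m(0.56, 1)    # body with the diameter quoted above
earth = arc_length_m(12742, 1)      # mean Earth diameter, for comparison
print(f"{asteroid * 1000:.2f} mm on the asteroid, {earth:.1f} m on Earth")
```

Under these assumptions one arcsecond covers a little over a millimetre on the small body, against about 31 m on Earth, so a precision that is reasonable for terrestrial coordinates is indeed implausibly fine there.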

Json Error in Oppose template

The Oppose template shows a syntax error on https://www.wikidata.org/wiki/Template:Oppose. What's the problem with it? Can someone fix it? ChristianKl 09:27, 13 June 2020 (UTC)

  Done --Dipsacus fullonum (talk) 09:34, 13 June 2020 (UTC)

Commons Category

Is it possible (and appropriate) to remove (or redirect) Category:Kat Von D (Q55287955), so we just have Kat Von D (Q234527)? It just seems confusing to have Category:Kat Von D (Q55287955). If I have misunderstood how Wikidata works, please correct me. I'm finding this confusing. --Thivierr (talk) 14:02, 13 June 2020 (UTC)

@Thivierr: No, as a Commons gallery exists, so the category item is needed to sitelink to the Commons category. Thanks. Mike Peel (talk) 14:21, 13 June 2020 (UTC)
  • Ha! People have been arguing about having Wikidata entries for categories since the project was started! It was, and still is, very confusing. --RAN (talk) 18:22, 13 June 2020 (UTC)

Parliament or municipal council seats

I have been trying since April (Wikidata:Project chat/Archive/2020/04#legislative election) to find a way to add the seats that a political party won in legislative or municipal council elections, or how many seats a constituency has in a body. Some examples,

I had proposed Wikidata:Property proposal/parliament seats. I got an answer to use number of seats in assembly (P1410), but the property cannot be used as a qualifier. I requested an update of P1410 to allow it to be used as a qualifier at Property talk:P1410#Property as a qualifier (Wikidata:Project chat/Archive/2020/05#Property as a qualifier), but got no answer.

I just need a way to add the above information, especially how many seats a political party or a movement won in legislative or municipal council elections. Xaris333 (talk) 20:05, 13 June 2020 (UTC)

You said on Property_talk:P1410 that you wanted to use it as a qualifier, and nobody has objected after a few weeks; I'd just go ahead and make the change at that point. It's already being used that way on 132 items, according to "scope" violations listed at Wikidata:Database_reports/Constraint_violations/P1410#scope. Ghouston (talk) 01:48, 14 June 2020 (UTC)
Nobody has objected after a few weeks, but also nobody has agreed. Xaris333 (talk) 02:20, 14 June 2020 (UTC)

I wonder if this property should only have one language (or only the very relevant ones), that of the country where it is/has been named? https://www.wikidata.org/wiki/Q588#P1448 Or should it be one text per language? Bouzinac (talk) 08:36, 8 June 2020 (UTC)

  • The nature of official names is that there's going to be some document that specifies which names are official. Some entities have official names in multiple languages while others only have an official name in one language. ChristianKl 09:46, 8 June 2020 (UTC)
Yeah, but in practice, if in country Z, a city was named X until date xxxxx and then Y after date xxxxx, should we duplicate X and Y in other languages ? Have a look at this query to see the problem raised : https://www.wikidata.org/wiki/Wikidata:Request_a_query#Town_that_have_changed_of_name_in_the_past

Bouzinac (talk) 11:23, 8 June 2020 (UTC)

  • This potentially gets much more complicated in countries like the U.S. that have no official language. Of course, English is customary here, but there is nothing official about it. - Jmabel (talk) 14:54, 8 June 2020 (UTC)
I get your point, but bear in mind that this property is about the official name of the town, not the name of the town in its official language. It's all the more funny as I went randomly through some major US cities' official websites. They all offer a Google Translate option... And the Google translation into Spanish for NYC gives "Nueva-York" ^^
Moreover, and more seriously, NYC can be given an official name through embassies. For instance, "Consulado de Espana en Nueva York" Bouzinac (talk) 19:21, 8 June 2020 (UTC)
I'm not sure what is supposed to be "funny" about "Nueva York". I'm originally from there, native in English, a little short of fluent in Spanish, and I can assure you that when people -- including New Yorkers -- are speaking Spanish, they pretty consistently call it "Nueva York". - Jmabel (talk) 02:36, 9 June 2020 (UTC)
The funny thing is that it seems US city websites are using Google Translate instead of having their own properly translated pages... Bouzinac (talk) 06:49, 9 June 2020 (UTC)
Another funny thing is that the preferred official name of Norway (Q20) is written in Kven only. And some of the "normal rank" official names are in Estonian and Romanian. 62 etc (talk) 09:21, 9 June 2020 (UTC)
I don't know the Kven language, but it seems logical to me, as there are multiple P1448 statements for the Kven language, so the editor might have wanted to highlight the right Kven statement... which does not mean he considers that statement to be the most preferred for other languages... Bouzinac (talk) 19:28, 9 June 2020 (UTC)
  Oppose Limit to only one language, at least Canadian items do have their both English and French official names, likely for Switzerland they may have four (German, French, Italian and Romansh). --Liuxinyu970226 (talk) 05:02, 11 June 2020 (UTC)
Hello Liuxinyu970226, I don't agree with you because, for instance in France, we have [7], which states names for geographical things. So Myanmar (Q836) is officially Birmanie (Q836) in French, whilst French is obviously not an official language (P37) in Myanmar (Q836). That's an example. I would however agree with you for native label (P1705), which would be only in the languages of that country ~~~~
@Bouzinac: Not sure if this "unsigned user" is you or not, well and but then, for example, Swiss Federal Railways (Q83835) has names in 1. German: Schweizerische Bundesbahnen; 2. French: Chemins de fer fédéraux suisses; 3. Italian: Ferrovie federali svizzere and 4. Romansh: Viafiers federalas svizras, yes that item is currently using native label (P1705), but to the best of my knowledge, they should really and originally be official name (P1448) values. --Liuxinyu970226 (talk) 05:08, 14 June 2020 (UTC)
Yes, that was me. I've put a proposal for a slight change to this property here: https://www.wikidata.org/wiki/Property_talk:P1448#Slightly_(but_important)_change_proposed
For common geographical things (typically big cities/countries), there would be as many languages as official languages (embassies, government bodies that have wording about this city). So Beijing (Q956) would be Pékin as P1448 in French because of https://cn.ambafrance.org/, which is the French embassy in Beijing. For other less international things, such as business (Q4830453), there would be only the languages where the company has registered. Well, take the case of Peugeot (Q6742): that company would be Peugeot in fr@P1448, and I wonder if 寶獅汽車 is its Chinese official name at some official Chinese company registration office. Bouzinac (talk) 15:44, 14 June 2020 (UTC)

Violation of Help:Description

What to do when users keep adding full sentences in descriptions? Even if they've been told not to do so?

Should that be reverted? Does this warrant report on admin noticeboard?--Roy17 (talk) 10:25, 12 June 2020 (UTC)

They should be as terse as possible to distinguish people with similar names. However when there is no Wikipedia entry or Wikimedia Commons entry with a mini-biography, a longer description is very helpful. --RAN (talk) 19:17, 13 June 2020 (UTC)
I noticed that some bots create new entries using the first sentence from Wikipedia, that may be the source of the sentence versions of descriptions. --RAN (talk) 22:10, 14 June 2020 (UTC)

Can Government Budgets be imported into Wikidata?

I was looking for tools to compare income/expenditure of different govt departments over the years and was wondering if Wikidata has been used/can be used for such queries?

Are there good examples of entire government budgets being imported into Wikidata? I am interested in adding/querying govt income/spending data so looking for guides/examples/advice. Thanks! F102 (talk) 11:32, 14 June 2020 (UTC)

I suspect there might not be any good (or even mediocre) examples - we don't handle this sort of purely numerical data very well, and we don't really have the properties to cover it. Andrew Gray (talk)
@F102: Hmm, I take it back! It looks like budget (P2769) is used for this purpose on government ministries - reporting spending for each of them plus a date qualifier. The most extensively modelled one here is South Korea - see this report for a full query. I guess I'd just managed not to notice it :-)
So the model seems to be to report spending on a departmental level, and presumably you could construct a national budget by summing them all (it isn't apparently applied to countries or to items for "budgets"). Looking at items that aren't government departments, we get this report, where it's a lot more varied - lots of regional governments, but also independent official institutions, universities, libraries, and a few oddities like films or research projects. Andrew Gray (talk) 12:42, 14 June 2020 (UTC)
@Andrew Gray: thanks a lot! This looks very useful. Thanks especially for the cool query. I would have taken a while to work it out myself. But now can easily change the prop to income, expenditure, debt etc to help search for stuff. Most helpful!
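As a rough illustration of "summing them all": assuming the departmental budget (P2769) values and their date qualifiers have already been exported from a query into plain records, a national figure per year could be aggregated as below. The record layout, the ministry names and the amounts are made up for the example; amounts in different currencies are deliberately kept apart rather than converted.

```python
from collections import defaultdict


def totals_by_year(records):
    """Sum departmental budgets per (year, currency)."""
    totals = defaultdict(int)
    for rec in records:
        totals[(rec["year"], rec["currency"])] += rec["amount"]
    return dict(totals)


# Hypothetical exported rows: one budget statement per ministry and year.
rows = [
    {"ministry": "Ministry A", "year": 2019, "currency": "KRW", "amount": 500},
    {"ministry": "Ministry B", "year": 2019, "currency": "KRW", "amount": 300},
    {"ministry": "Ministry A", "year": 2020, "currency": "KRW", "amount": 550},
]
print(totals_by_year(rows))
```

Such a sum is only as complete as the set of ministries that carry a budget statement for the year in question, so it is best read as a lower bound.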

Creating new qualifier

I want to add this severity qualifier to medical terms (disease, syndrome, clinical sign, symptom), with the listed values being mild, moderate, severe, profound etc., so that a disease with a symptom like pain can have constraints like moderate / excluding mild. I am not sure what I have done wrong. Similarly, I would like to create an intensity qualifier for color, e.g. bright, strong, pale, vivid. Amousey (talk) 15:58, 14 June 2020 (UTC)

Modeling lean-tos

Hi, I'm working on lean-tos in Sweden. I wonder what is the best set to indicate the capacity of benches. See https://www.wikidata.org/wiki/Q96280230 I found the property "maximum capacity" but it does not seem like a good fit because of its description. Also sometimes there are benches inside the lean-to, how could I best model that? --So9q (talk) 20:30, 14 June 2020 (UTC)

Please add Commons category (P373) Categories by country to the protected item. Compare by city (Q18683478), where a similar link is useful for metacategory infoboxes on Commons. --84.184.110.23 08:25, 16 June 2020 (UTC)

done. --Robot Monk (talk) 10:27, 20 June 2020 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. SilentSpike (talk) 17:38, 20 June 2020 (UTC)

Structured data bots in the Commons

I notice that for a lot of files in Wikimedia Commons, structured data is being added by bots. This takes over file data such as copyright licence, creation date, etc. and puts it in structured data. Example: automatically adding structured data claims based on file information: date, source, copyright license. This kind of data is not very useful, as the Commons files need to have a correct copyright licence anyway. A more useful bot action would be to scan all postcard categories and add (P31) (Q192425) to structured data. There are other ways in which bots can be useful to convert the information contained in Commons categories into useful Wikidata. Real 'content' is only incidentally added by Commons uploaders. Smiley.toerist (talk) 10:38, 19 June 2020 (UTC)

See also discussion in: Village_pump Commons. Smiley.toerist (talk) 10:37, 20 June 2020 (UTC)
This section was archived on a request by:
Discussion about Commons continued in its village pump/project chat --- Jura 14:04, 20 June 2020 (UTC)

Should we really advise people to edit in Norsk?

For some time now, Norsk (no) has been one of the languages I am suggested to edit in (the others are Swedish, English and British English), and I am a little confused here. Shouldn't we prefer that people edit in Bokmål (nb) or Nynorsk (nn) instead of Norsk? I am not saying that we should never be allowed to edit in Norsk, but I would prefer that it is done by those who know what they are doing. A random user might think Norsk (no) is the language of nowiki, when it is instead Bokmål (nb). (Nowikisource on the other hand may have texts in Norsk). 62 etc (talk) 04:55, 2 June 2020 (UTC)

  • The key question is why you get suggested Norsk (no). I would expect it's due to your browser sending that this is one of the languages you speak. ChristianKl 09:01, 2 June 2020 (UTC)
    • Somehow the general change from "no" to "nb" isn't done everywhere, so in some contexts "no" appears instead of "nb". It shouldn't actually be possible to add "no" labels, but I might be mistaken. Contact_the_development_team might be able to fix it. You can explicitly select what you get with boxes on your user page. --- Jura 09:19, 2 June 2020 (UTC)
      • @ChristianKl: I do not speak Norsk, I do not even always understand spoken Norsk. But I can without problem read most texts both in Bokmål and Nynorsk. In my work here with Norwegian municipalities, I have visited many webpages in different versions of Norsk, but not changed any settings in my browser. (Believe me, it gets much worse after a machine translation, than reading it directly.)
      • @Jura1: I know I can change the settings by adding #Babel to my user page; I am talking about why random users (without #Babel) are advised to edit in Norsk.
      • Earlier, I was getting Swedish, English, Finnish and Meänkieli. All of these make sense based on where I live. Bokmål (nb) isn't very far-fetched either, geographically speaking. But Norsk should never be an option unless you explicitly want to edit in it. 62 etc (talk) 10:21, 2 June 2020 (UTC)
        • The question whether users should get "no" instead of "nb"/"nn" is different from what a user actually gets by default. --- Jura 09:34, 4 June 2020 (UTC)
  • So, why is nowiki using Bokmål and not Norsk, or rather, if it's using Bokmål, shouldn't it be nbwiki instead (my understanding was that the prefix corresponds to the language used)? --Misc (talk) 17:10, 5 June 2020 (UTC)

Pinging @Jeblad: to make sure we are not anti-ing Norwegians' NPOV, indeed, the de facto no.wikipedia.org is purely using Bokmål, and probably have articles in Riksmål, but as they don't mix Bokmål and Nynorsk, this wiki should consider renaming their domain to be nb.wikipedia.org, what I think that items are affected by this section, are Norwegian Wikipedias (Q191769) and Norwegian Wikipedias (Q32176383). --Liuxinyu970226 (talk) 05:11, 11 June 2020 (UTC)

The language code “no” is a meta-language for Norwegian, and is used as prefix at nowiki due to historical reasons. The correct code would be “nb”, that is Norwegian Bokmål, which is also used for the unofficial Norwegian Riksmål. Nnwiki, the other project in Norwegian, has language code “nn”, and covers Norwegian Nynorsk and the unofficial Norwegian Høgnorsk. People at nowiki and nnwiki can't agree on how to interpret the “no” code, and that makes it slightly difficult to do necessary cleanup. One problem is the browser: when it is set to “no”, does it mean “nb” or “nn”? Should the user be asked to create articles for nowiki or nnwiki? And so forth. The fierce fighting between the language communities does not make the problem simpler, not forgetting that some users have some really strange ideas about how this should work, and some claim the projects will disintegrate into thin air if anyone considers cleaning up the mess.
I believe the correct interpretation is that a meta-language is the set of all specific languages within the group, and for “no” they are “nb” and “nn”. The meta-language itself has no written variant, so users should not be asked to add strings with this code. It is, however, possible to imagine a system where identical strings (names for example) can be given a meta-language, and then reused for all languages within the group. Sounds nice, but the descriptions would often diverge, and the label-description pair forms the signature of the item (unless it is changed – haven't checked).
So, in short, I believe references to the language code “no” should be removed from language lists, and users should only be given “nb” and “nn”. (And Norwegian Wikipedias (Q32176383) should be redirected to Norwegian Wikipedias (Q191769); the former doesn't make much sense and is deleted at nowiki.) Jeblad (talk) 09:44, 11 June 2020 (UTC)
@1997kB, Alexis Jazz, Eurohunter, Jon Harald Søby: ^^ --Liuxinyu970226 (talk) 05:01, 14 June 2020 (UTC)
@DannyS712: There were 3 "keep" votes in the last deletion discussion. Jura who wanted to keep the item in case nowiki were to recreate a silly stub at some point (which doesn't seem like a valid reason to keep an item), 1997kB who wanted to keep it for a "replaced by" statement on Q191769 which was promptly and rightfully so removed by Jeblad and Asav who didn't provide a relevant rationale for keeping.
Btw, I support removing the "no" language code from Wikidata/MediaWiki. It appears that "nb" (Norwegian Bokmål), "nn" (Norwegian Nynorsk) and "no" (norsk, not capitalized) all three exist. Let me know if you organize a vote on that. Alexis Jazz please ping me if you reply 14:24, 14 June 2020 (UTC)
"no" seems to be an outdated code whose removal was not finished, and it is misleading. Eurohunter (talk) 08:10, 14 June 2020 (UTC)
Interesting, why did someone claim that nowiki was "Bokmål and Nynorsk Norwegian Wikipedia"? --Liuxinyu970226 (talk) 02:32, 16 June 2020 (UTC)
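The interpretation proposed in this thread, that “no” stands for the set {“nb”, “nn”} and should never be offered for editing itself, could be sketched as a preprocessing step on a user's language list. This is only an illustration of the proposal; `META_LANGUAGES` and `expand_language_codes` are hypothetical names and do not describe how MediaWiki or Wikibase currently pick suggested languages.

```python
# Meta-language code -> concrete written languages it groups together.
META_LANGUAGES = {"no": ("nb", "nn")}


def expand_language_codes(codes):
    """Replace meta-language codes by their concrete members.

    Order is preserved and duplicates are dropped, so a list that already
    contains "nb" does not get it twice.
    """
    result = []
    for code in codes:
        for concrete in META_LANGUAGES.get(code, (code,)):
            if concrete not in result:
                result.append(concrete)
    return result


print(expand_language_codes(["sv", "no", "en", "nb"]))
```

With such a step a browser that reports “no” would lead to both “nb” and “nn” being suggested, leaving the choice to the user instead of guessing one of them.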

Identifier with links to two databases

I have a question on how to handle an identifier that can link to two databases, specifically PCP reference number (P381), which right now links to an internal database on the tools server but also has an official database from the Swiss government with more information: see the property discussion. Is there a good way to deal with this: single identifier, multiple outlinks? --Hannes Röst (talk) 16:21, 12 June 2020 (UTC)

@Hannes Röst: The way other properties deal with this is by using third-party formatter URL (P3303). In this example, the official Swiss link would be formatter URL (P1630) and the WLM link would be on third-party formatter URL (P3303). The interface would only display the official link, though. In principle I'm sure someone could put together a gadget/userscript to display all the third-party links, though I don't think one currently exists. Vahurzpu (talk) 16:58, 12 June 2020 (UTC)
OK, I think this is a workaround that should work for now, but of course it's not great. It seems the only other option would be to create another property that has the exact same identifier and then duplicate the identifier on the item page (one for DB1, one for DB2). Maybe we need a more generic solution for these cases; I assume this is not the only such case? --Hannes Röst (talk) 13:31, 15 June 2020 (UTC)
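The "single identifier, multiple outlinks" idea discussed above is roughly what a gadget or userscript would have to do: take the one stored identifier and fill it into every formatter pattern. Formatter URLs on Wikidata use `$1` as the placeholder for the identifier. The function name and the example.org patterns below are made up for illustration, and percent-encoding the identifier is an assumption of this sketch.

```python
from urllib.parse import quote


def build_links(identifier, formatter_url, third_party_formatters=()):
    """Expand one identifier into its official and third-party links."""
    def fill(pattern):
        # Substitute the (percent-encoded) identifier for the $1 placeholder.
        return pattern.replace("$1", quote(identifier, safe=""))

    return {
        "official": fill(formatter_url),
        "third_party": [fill(p) for p in third_party_formatters],
    }


links = build_links(
    "12345",
    "https://official.example.org/object?id=$1",        # role of formatter URL (P1630)
    ["https://tools.example.org/monuments/$1"],         # role of third-party formatter URL (P3303)
)
print(links)
```

This keeps a single identifier value on the item, so no duplicate property is needed for the second database.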

Why are people in Poland advised to edit in Russian?

Russian is one of the languages, along with Polish, English and German, that users are suggested to edit in. How was it chosen? I believe there should be French, Spanish, Italian or even Ukrainian instead of Russian. Eurohunter (talk) 08:54, 14 June 2020 (UTC)

  • I guess it's chosen by the geographical principle. Poland is bordered by the Russian-speaking regions from north (Kaliningrad region of Russia) and from east (Belarus). I'm not sure that this list of suggested languages would depend on which languages may be spoken by the Polish diaspora. --Wolverène (talk) 09:33, 14 June 2020 (UTC)
  • What database are these drawn from? --- Jura 10:22, 14 June 2020 (UTC)
I guess the data is taken from https://unicode-org.github.io/cldr-staging/charts/37/supplemental/territory_language_information.html --Pasleim (talk) 06:59, 15 June 2020 (UTC)

Constraint

Please help with "mandatory qualifier constraint: This named_after statement is missing a qualifier applies_to_name." I don't understand what it is asking for at Richard J. Hughes Justice Complex (Q48740715) --RAN (talk) 23:22, 14 June 2020 (UTC)

Just add the qualifier "applies to name: Richard J. Hughes Justice Complex" on the "named after" entry. This is to clarify the source of comparison for people in other languages. If an item has different names in different languages, there may be several different entries for "named after," so the qualifier points out which one is being compared. I'd make the edit myself but I'm on mobile at the moment. From Hill To Shore (talk) 00:33, 15 June 2020 (UTC)
I added it. ChristianKl17:44, 15 June 2020 (UTC)

Formatter URL for property won't update

I've made a change to formatter URL (P1630) for HBO Max ID (P8298), but the URL won't update for the property examples or items using the property. Oddly, it has updated in the documentation. I've tried purging numerous times with no effect. Is there another way to force an update? Trivialist (talk) 12:59, 15 June 2020 (UTC)

Have you purged both the property & the items using it? It could be that both are needed. Otherwise, I would recommend making a small edit to the item (say tweaking the description) and seeing if that does it. Andrew Gray (talk) 18:19, 15 June 2020 (UTC)

ISNI (P213) proposed format change

There has been recent discussion on the P213 talk page that seems to have settled on removing the spaces that Wikidata currently preserves in ISNI identifiers. This would affect over a million items. If you have any opinions on this change, please weigh in on the P213 talk page, thanks. ArthurPSmith (talk) 18:07, 15 June 2020 (UTC)
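For illustration, the proposed change amounts to a normalization like the one below: an ISNI has 16 characters (15 digits plus a final digit or "X"), currently stored in four space-separated groups. The function names are hypothetical, the sample value is only an example, and the check character is not verified in this sketch.

```python
import re

# 15 digits followed by a digit or "X" (the check character).
_ISNI_RE = re.compile(r"^\d{15}[\dX]$")


def compact_isni(value):
    """Return the ISNI without any whitespace, e.g. for the new format."""
    compact = "".join(value.split())
    if not _ISNI_RE.match(compact):
        raise ValueError(f"not a well-formed ISNI: {value!r}")
    return compact


def spaced_isni(value):
    """Return the ISNI in the four-group display form currently stored."""
    compact = compact_isni(value)
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))


print(compact_isni("0000 0001 2103 2683"))
print(spaced_isni("0000000121032683"))
```

Because the two forms convert into each other without loss, the spaced form could still be produced for display after the stored values are changed.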

Mapping Wikidata terms to DCMI terms

Hi Everyone! I've made a spreadsheet mapping Wikidata terms to those of DCMI and am now looking to add this to Wikidata. I can't decide what would be the best format for this and would like to hear everyone's thoughts! Right now we are trying to decide between a three-column query on our documentation page or a four-column static table (DC property, mapping property, Wikidata property, notes). Let me know what you all think!

Click Here to request to view the spreadsheet Gjohnson1110 (talk) 13:27, 16 June 2020 (UTC)

exact match (P2888) seems to be intended for this purpose, at least for mappings that are 1:1 equivalent. There is also narrower external class (P3950), and the more specific equivalent property (P1628) and equivalent class (P1709). This assumes that DCMI also uses URIs as identifiers. See for example Bus Stop, where "equivalent class" is used to link to https://schema.org/BusStop. Here's a query that counts the different hostnames in a sample of items. So, for example, schema.org is used in 489 items in this sample. The first two columns show data of a randomly drawn example from each group.

(The spreadsheet is not publicly readable, unfortunately. And you can sign your messages here with "~~~~" or by selecting "signature" from the "Insert" menu if you're using the visual editor.) Matthias Winkelmann (talk) 02:47, 16 June 2020 (UTC)

Small improvements

Hello,

I had a dream:

  • A talk page, CVN meeting (for those who are not fluent in a language, or new ones who are not sure but who want to participate, etc.)
  • A presence of the different CVN links on the first pages (WD:CP, Main Page, WD:H)
  • An improvement of the already existing tools for warnings on user TPs
  • Also add links for mass rb tools in WD:CVN

Comments, remarks… —Eihel (talk) 15:05, 16 June 2020 (UTC)

Restrict editing of properties to autoconfirmed users?

Over the last four months I've protected a few highly visible properties so that they can only be edited by autoconfirmed users. Specifically, in order of protection date: country of citizenship (P27), birth name (P1477), end time (P582), family name (P734), short name (P1813), reference URL (P854), founded by (P112) and logo image (P154) - see the log for the exact dates. This was based on erroneous anon edits to those properties that I spotted via my enwp watchlist. I think the protection helped reduce the erroneous edits (like this), while not preventing regular editors from making changes. I don't think that full protection is appropriate, but this light protection seems to help.

I think it would be beneficial to protect all properties in the same way. The main downside is probably that we won't get translated property labels from new editors, but I'm not sure how often that happens.

What do you think? Thanks. Mike Peel (talk) 21:28, 31 May 2020 (UTC)

As a result of such a decision, let us not forget that Wikidata:Property creators should be changed accordingly:
Administrators may add the property creator flag to the accounts of users who:
are generally always trusted members of the community, preferably with at least some history in working with properties, and
have shown a satisfactory understanding of how Wikidata works, especially regarding the property namespace, datatypes, and related concepts.
Users should generally not be made property creators solely or primarily based on their work on other projects; property creator is currently the only right unique to Wikidata, and this should be taken into consideration when requesting or granting it.
In general, property creators should be rollbackers before the flag is assigned, but they must be autoconfirmed. Eihel (talk) 07:47, 2 June 2020 (UTC)
The paragraph isn't very precise, but I think as a community we generally see the bar for giving out the property creator right as a lot higher than auto-confirmation. I don't think adding a requirement for that right adds much additional information. If we reword that paragraph I think it should say "Trusted members with regular participation on property creation discussions." ChristianKl 15:59, 2 June 2020 (UTC)
+1 Eihel (talk) 15:11, 16 June 2020 (UTC)
  •   Support, either for all properties, or alternatively a rule like "once a property is used on more than a thousand items" (like many wikis do with templates) or "a month after property creation". The first approach would be very simple to manage and easy to understand, the second and third would give us some time to get labels and things sorted out before light protection kicks in. One bonus side-effect is that this might help reduce server load a little - changing the label on something like P31 or P18, even if it's quickly reverted, means a huge number of pages needing updated. Andrew Gray (talk) 10:34, 1 June 2020 (UTC)
  •   Support The data stored about properties needs to change far less often than that stored on items, and is typically metadata having no direct effects on information on actual entities.--Jasper Deng (talk) 10:39, 1 June 2020 (UTC)
  •   Support Snipre (talk) 17:02, 1 June 2020 (UTC)
  •   Support The properties' ability to give life to entries also gives them so much potential to cause a lot of damage. ミラP 17:06, 1 June 2020 (UTC)
  •   Support While it's nice to have more labels for more languages, patrolling changes in lower-used languages works less well. It's better for new users to learn how Wikidata works and which labels are good by making less central changes than property changes. ChristianKl 17:25, 1 June 2020 (UTC)
  •   Support. Maybe we should use AbuseFilter for protecting and suggesting the usage of {{Edit request}}? Bencemac (talk) 17:30, 1 June 2020 (UTC)
  •   Support Properties (and anything else besides main space items) don't have the sitelinks redirection problem, and there are a lot of subtleties in their constraints, formatter URL's, etc., so restricting changes to autoconfirmed users makes a lot of sense to me. ArthurPSmith (talk) 20:08, 1 June 2020 (UTC)
  • super   Strong support Eihel (talk) 20:35, 1 June 2020 (UTC)
  •   Support --Epìdosis 20:48, 1 June 2020 (UTC)
  •   Support Absolutely. - PKM (talk) 20:49, 1 June 2020 (UTC)
  •   Comment On a technical level, I expect that this would be implemented using mw:Manual:$wgNamespaceProtection as is currently done to prevent the Query: namespace from being edited by anyone. While the abuse filter could technically be used to implement this, it would perform much worse than this more native solution and slow down the parsing of all edits.--Jasper Deng (talk) 21:14, 1 June 2020 (UTC)
    @Bencemac, Jasper Deng: If this got sufficient support, I was expecting that I would have to implement it either manually or by pywikibot. I've started phab:T254280 about what would be the best technical solution here. Thanks. Mike Peel (talk) 21:06, 2 June 2020 (UTC)
    @Bencemac, Jasper Deng: From phab:T254280 it sounds like $wgNamespaceProtection is probably going to be the best solution. Thanks. Mike Peel (talk) 07:39, 4 June 2020 (UTC)
    For the record: I mentioned AF because I tried to figure out how to inform e.g. IPs about the edit request template/template. I'm open to alternatives. :) Bencemac (talk) 07:46, 4 June 2020 (UTC)
  •   Support Daask (talk) 21:16, 1 June 2020 (UTC)
  •   Oppose; we have relatively few edits by anons and new users on property pages, which I think are usually watched by enough users to revert bad edits. Let's keep it open for everyone, as there are indeed useful edits being made by non-confirmed users. Also mind that we do not have a proper "edit request" system in place. —MisterSynergy (talk) 21:40, 1 June 2020 (UTC)
    @MisterSynergy: In the Item-declaration-value aggregate, I prefer that a newbie is interested first at the ends of the set than at what links the two. There is so much RC that "goes between the drops" and the properties are increasing, this has already been revealed since the beginning of this year, but I have been thinking about it for a long time: autoconfirmed is such a stage easily reachable. —Eihel (talk) 07:51, 2 June 2020 (UTC)
    This is a dangerous reasoning; you can pour excessive semi-protection policies to a large amount of Wikidata content as there is practically always another thing which is "easier" to do for "beginners". I prefer to have Wikidata generally open to be edited by everyone, including newbies and IP users, except for cases that require temporary protection due to actual issues related to that very affected page. Mind that there has already been the RfC "semi-protection to prevent vandalism on most used Items" which was set up with a similar precautious intention, and left us with a totally unworkable policy that nobody can enforce, and a large amount of items whose indefinite protection status is extremely difficult to review even after one year has passed since that RfC has been concluded. ---MisterSynergy (talk) 11:32, 2 June 2020 (UTC)
    @MisterSynergy: Please look through the histories of the properties I linked to, they were seeing substantial anon vandalism before I protected them. They were being reverted, but not before the vandalism was seen on other wikis, which damages Wikidata's reputation within the Wikimedia projects. I'd much prefer that everything was open to everyone to edit, but since we have to use server caching, that doesn't seem to work right now. Thanks. Mike Peel (talk) 21:21, 2 June 2020 (UTC)
    I don't mind protecting property pages that are vandalized often, and consequently reviewing the protection status regularly as well. I just don’t see the point in a systematic protection of thousands of property pages when we talk about some hundreds of edits per month in total by anons and newbies in that namespace. It is not helpful to provide only some extreme cases here for review when most properties that would be affected by your proposal never see any vandalism. —MisterSynergy (talk) 21:51, 2 June 2020 (UTC)
    @MisterSynergy: You might remember that I was one of the most vocal opponents of blanket protection of items. I am supporting this because, unlike that proposal, this can be done cleanly and the criterion (being in the property namespace) is a well-defined and natural one. There are well-established reasons for wanting to protect properties; for one, the metadata we store on them rarely needs to be changed and can be sensitive (for example, a claim of format as a regular expression (P1793) should not be changed without clear consensus at the very least).--Jasper Deng (talk) 09:38, 4 June 2020 (UTC)
    You can of course judge this case differently than the other one, but I do not do this. In order to understand the matter even better, I have just evaluated some more numbers about the scale of the problem:
    • Out of the 7592 properties which we have, more than 76% have never seen an IP edit in more than seven years (5779 absolute); more than 91% have seen 3 or less IP edits in that time (6927 absolute)
    • Since 2013, there are 3 IP edits per day (median) on property pages (the mean value is 5.2 edits per day on property pages); there is a very modest increase with time: in this year 2020 we have 7 IP edits per day (median) on property pages (the mean value is 8.7 edits per day on property pages)
    • I have no information about newcomer accounts, as this information cannot be queried from the Mediawiki SQL database for the entire lifespan of Wikidata; based on the RC data from the last month, I'd infer that registered newcomers contribute significantly less than IPs in the property namespace, thus the above numbers are a fairly good representation of the problem.
    And I have to say that this looks much more like a "non-problem" that clearly does not require us to abandon one of the core principles of wiki projects, namely that it is freely editable even without accounts.
    I wish that User:Mike Peel had provided us with a similarly detailed input here, instead of showcasing some extreme outliers that are clearly not representative for the majority of our property pages. Now that he has collected lots of support votes on a biased proposal, it seems that this will become the second unconvincing precedent of bulk protections. —MisterSynergy (talk) 22:52, 4 June 2020 (UTC)
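The percentages quoted in the comment above can be rechecked from the absolute counts it gives. This is a minimal sketch; the variable names are made up for illustration, and only the three counts come from the discussion.

```python
# Counts as stated in the comment above.
total_properties = 7592
never_ip_edited = 5779        # properties with no IP edit at all
at_most_three_ip_edits = 6927  # properties with 3 or fewer IP edits

share_never = never_ip_edited / total_properties * 100
share_at_most_three = at_most_three_ip_edits / total_properties * 100

print(f"never edited by an IP: {share_never:.1f}%")
print(f"3 or fewer IP edits:   {share_at_most_three:.1f}%")
```

Both results are consistent with the "more than 76%" and "more than 91%" figures stated in the comment.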
    @MisterSynergy: Most properties haven't been around for 7 years yet. Perhaps the ones I've pointed to are the most visible here and on other Wikimedia projects, but all properties will become more visible over time, and it looks really bad if they get vandalised and it sticks around in cache for a long time. Can you assess how many of those anon edits were productive? All proposals are biased, and this wasn't meant to be particularly biased, more it was aimed at starting a discussion. It's been interesting how many people feel strongly about this issue - I wasn't expecting it. Thanks. Mike Peel (talk) 17:17, 5 June 2020 (UTC)
    First of all, I did not mean to say that you intentionally set up a biased proposal. I assume that you did it more or less accidentally, but the outcome is now pretty frustrating to me.
    As far as I know, there is no way to classify IP edits as good or bad with a simple query. I would not be surprised if the majority of IP edits on property pages had to be reverted or improved, but this is not the point here. It's about the idea that pages can be edited, as long as they are not affected by problematic edit patterns. With the current protection policy, we can protect frequently vandalized property pages anyways, so there is simply no need to expand such a protection to thousands of properties that are not affected by vandalism at all. We are really about to give up a core principle of our project here, without any need to do so. —MisterSynergy (talk) 20:19, 5 June 2020 (UTC)
  •   Strong support Less vandalism, more reliability.--Jklamo (talk) 06:42, 2 June 2020 (UTC)
  •   Question I suppose we would indeed need some evidence that this is actually justified, given the general WMF principle that sites are freely editable. Clearly, it's an error to think there is any vandalism on formatter urls. --- Jura 08:07, 2 June 2020 (UTC)
    @Jura1: Look at the history of the examples I gave. In particular, changes to country of citizenship (P27) were regularly being reverted until I semi-protected it. I don't like protecting items, but they are really visible. While formatter urls don't currently get vandalised, they could be (and potentially with a big impact), and we don't have any way to only protect labels. Thanks. Mike Peel (talk) 20:45, 2 June 2020 (UTC)
    • @Mike Peel: we have means to protect en labels, it just needs to be updated, similar to what is being done for formatter URLs. I think there is a difference between protecting some aspects and protecting an entire namespace. Bear in mind that WMF sites are meant to be edited freely/wikily. --- Jura 20:51, 2 June 2020 (UTC)
      @Jura1: Can you demo it somewhere, please? Thanks. Mike Peel (talk) 20:56, 2 June 2020 (UTC)
      • Have a look at the relevant filters. --- Jura 20:58, 2 June 2020 (UTC)
        Sure, where should I look? Thanks. Mike Peel (talk) 21:07, 2 June 2020 (UTC)
        • It's #73 and #101. --- Jura 06:01, 3 June 2020 (UTC)
        • If we used edit filters for this, I'd strongly recommend going beyond just English labels - we often have similar kinds of issues in Spanish, plus occasionally other languages as well. Aliases/descriptions are often altered in the same way, although that has less of an immediate impact on readers of the site. Andrew Gray (talk) 13:05, 3 June 2020 (UTC)
          • English labels get edited when users can't figure out the user interface. --- Jura 13:09, 3 June 2020 (UTC)
            • @Jura1: See phab:T254280 and concerns about abuse filter performance. Thanks. Mike Peel (talk) 07:39, 4 June 2020 (UTC)
              • @Mike Peel: apparently it's practically not an issue. Interesting that a user who doesn't edit thinks that editing should be restricted even further. Aren't we missing the point of wikis entirely? --- Jura 08:13, 4 June 2020 (UTC)
                • @Jura1: Interesting that a user who doesn't edit thinks that editing should be restricted This is ad hominem (Q189183), i.e. a red herring (Q572959). Whether they edit much or not is irrelevant to the strength of their argument; as this could be construed as a personal attack, please refrain from further such comments. And as someone who helps maintain abuse filters, and as an experienced programmer, I can tell you unequivocally that abuse filter checks are expensive and we should minimize use of them for things that can be served by other means, such as this one.--Jasper Deng (talk) 08:58, 4 June 2020 (UTC)
                  • Please excuse if you feel I attacked a user's character or motive by a comment on a user's non-action and proposal. BTW, you omitted even further in should be restricted even further.
                    As far as filter performance is concerned, apparently it's currently a non-issue. Why should we make it one? --- Jura 09:09, 4 June 2020 (UTC)
                    • @Jura1: I took issue with your comment because this is far from the only time you have received a warning about this issue; consider this your only warning (you really ought to know better about what is or is not a personal attack; if in doubt, avoid making a statement about someone else). I didn't quote the rest of your text because I only quoted the relevant part. And filter performance is definitely not a "non-issue"; these two filters may be relatively lightweight but their scope is smaller, and as we add other filters for other reasons, filters of this sort add up in their effect and become a burden. The solution using $wgNamespaceProtection is also better in that non-autoconfirmed users cannot even attempt such edits in the first place.--Jasper Deng (talk) 09:30, 4 June 2020 (UTC)
                      • At least we agree that the current filters aren't problematic as such. Given that one merely needs to be fixed, the impact should be limited.
                        Can you suggest a better way to express the apparent paradox between editing should be restricted even further and the absence of edits? --- Jura 09:39, 4 June 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Jura1: I am not entirely sure what you mean, but if you meant Mike's edit count here: again, their amount of edits is not relevant to this discussion, please stop bringing it up. If you mean the amount of non-autoconfirmed edits to properties, there's no reason to assume it will stay that low. Our project's growth will include increases in such attempted edits if we do not implement this proposal.--Jasper Deng (talk) 09:42, 4 June 2020 (UTC)

  • Mike is probably one of the most productive contributors. In general, I don't think we should worry about performance before it's actually an issue. My question was about the statement you took offence with. What would be a better way to formulate it? --- Jura 09:50, 4 June 2020 (UTC)
    • As an experienced programmer, I cannot condone that practice. Good software design does not include defects that are easy to correct, like this one. On the subject of your statement, I do not know how it could have been read other than "it is strange that someone (Mike) who doesn't edit much thinks editing should be further restricted". Even if that is not your intended text, I cannot read your statement as not being predicated on some trait of Mike's. Therefore, whatever its intent, it should be avoided.--Jasper Deng (talk) 09:55, 4 June 2020 (UTC)
      • Please refrain from stating that I attacked Mike. I feel this is highly inappropriate.
        Also, can you provide some links to support your claim And as someone who helps maintain abuse filters? I'm not sure why mention it (it's not an argument in support or in opposition of the proposal, nor does it insert it in a wider context). --- Jura 10:02, 4 June 2020 (UTC)
        • @Jura1: Except I have clear evidence. In your original statement, you said "an editor who doesn't edit". Who is referred to by "an editor"? Context clearly indicates the person you just replied to, namely Mike. If you bothered to look at Special:Log/abusefilter/Jasper Deng you'll see that I have an informed perspective on how filtering works. For example, compiling a regex is known to take exponential time and space in the length of the regex so regex checking is an expensive operation. If you had managed filters yourself, you would come to the same conclusion: abuse filters should be written with performance in mind, so we avoid future problems with it.--Jasper Deng (talk) 10:07, 4 June 2020 (UTC)
          • You are mistaken about Mike and it seems that he doesn't seem to be offended either.
            I only found 2 filters with you as most recent editor (one in ~2013 and one recently). I hadn't thought of looking at Special:Log/abusefilter/Jasper Deng (I didn't even recall that that log existed). It does show 2 edits in the last five years, supposedly this clarifies what you mean with someone who helps maintain abuse filters. It seems you didn't deem it necessary to intervene on the existing filters. --- Jura 10:25, 4 June 2020 (UTC)
  •   Support And I would also advocate for protection of common items that are the target of said properties (e.g. Q5 etc). Vandalism confuses me and if it happens across domains then it gets hard to track down too. Jane023 (talk) 04:55, 3 June 2020 (UTC)

When/how to close this?

To be honest, I wasn't expecting this to turn into a !vote quite as much as it has, otherwise I'd have started at Wikidata:Requests for comment. Given where we are, I'd appreciate thoughts on how long it should last (since it's only been up for 6 days), and whether it needs a formal close or if I can just ask on phabricator for a change of the setting for $wgNamespaceProtection. Or perhaps it still needs a formal RfC? Thoughts? Thanks. Mike Peel (talk) 17:21, 5 June 2020 (UTC)

Since it was a heavily biased proposal, I would like to see an actual vote (i.e. RfC) with a more neutral description of the problem. —MisterSynergy (talk) 20:08, 12 June 2020 (UTC)
I agree with MisterSynergy that an RfC would be good. RfCs that are also listed on top of the watchlist get seen by a lot more people than the project chat, and core policies such as protecting all properties should be made not only by the people who care to check the project chat at a given time. ChristianKl 23:47, 12 June 2020 (UTC)
@MisterSynergy, ChristianKl: OK, I've drafted User:Mike Peel/sandbox, do you want to make any changes to it before I start the RfC? Feel free to edit it directly. Thanks. Mike Peel (talk)
Thanks. I am going to add some input, but not today. —MisterSynergy (talk) 21:46, 13 June 2020 (UTC)
It's kind of odd that you ask for feedback and then mostly ignore the one given. Seems like a waste of time to do it again. --- Jura 12:58, 14 June 2020 (UTC)

Administrators' activity

This is a topic that I've thought about for a while. Currently, administrators must perform five admin/crat actions over six months to be considered active.[1] Previously, the minimum was 10 actions,[2] but redirects were implemented on Wikidata,[3] which reduced the amount of work for administrators. I would like to know the current point of view of the community about this. I think that the minimum is way too low, considering the amount of janitorial work this project has. I can tell that, for example, Wikidata:Requests for deletion can easily reach more than 100 unresolved requests within a few days. Vandalism is another big issue we have, because as a global project, we face a lot of cross-wiki vandalism/spam/LTA cases. Maybe a few years ago it wasn't like that, and a low minimum made sense, but nowadays the situation has changed, and a new reform may be useful. Esteban16 (talk) 04:57, 14 June 2020 (UTC)

  • I think we should have much more admins here, and not risk to lose them due to a high activity requirement. I am happy with the current policy. —MisterSynergy (talk) 09:40, 14 June 2020 (UTC)
  • @Esteban16: Could you explain how the backlog would be helped by removing some people's administrative rights? I don't see the connection. - Jmabel (talk) 16:51, 14 June 2020 (UTC)
    @Jmabel: I think the goal is to motivate inactive admins to do some work. --Haansn08 (talk) 17:38, 14 June 2020 (UTC)
    • There are reasons to be an admin that do not lead to taking a lot of admin actions. For example, I'm active mainly on Commons these days but also remain an administrator on en-wiki mainly so that I can see deleted content there, or edit protected pages there when someone on Commons needs something done there. I imagine there are a good number of admins who are focused on some one wiki who also have admin privileges on another wiki and use them in a similar adjunct manner. I don't think we are in any way benefited by removing their admin privileges. By the way, if you strongly disagree, please feel free to nominate me to have my en-wiki admin privileges taken away, as a test case. I won't take it personally, though I will certainly oppose it. - Jmabel (talk) 18:23, 14 June 2020 (UTC)
  • (edit conflict) Jmabel: The aim of an activity policy should be to make administrators fulfill it (thus more help), not to make them lose their flag, and would be even better if they can make more than the minimum. However, the minimum is low. And if someone gets their admin flag removed due to inactivity, and becomes more active afterwards, chances to regain the flag are high (this has already happened). Esteban16 (talk) 17:42, 14 June 2020 (UTC)
    • I don't think we should see admin rights as a reward for doing deletion work. Carrot-and-stick motivation can crowd out other intrinsic motivation, and deletion work should largely be motivated by the admin intrinsically thinking certain items should be deleted and not because they have to hit a quota. ChristianKl 22:43, 14 June 2020 (UTC)
    • ChristianKl: I agree with you. That was just an example I gave to explain the lot of work that is handled, at least in that area. Editors (and sysops) have their preferred areas, but in the case of sysops, doing a considerable amount of work is effective. And "considerable amount" should be based on how much work there is. Esteban16 (talk) 23:45, 14 June 2020 (UTC)
      • The real issue is not many users participated in (potentially controversial) RfD discussions so that admins can not make an unbiased decision.--GZWDer (talk) 00:22, 15 June 2020 (UTC)
        • Few users participate in RfD because the page quickly gets flooded with mindless spam and self-promotional items. There are two possible solutions:
        • Create a Speedy Delete template that doesn't make blatant spam items appear on RfD
        • Create a new RfD for items made in (presumably) good faith and other cases where the nominator is in doubt about the notability rather than being 100% sure about it--Trade (talk) 07:41, 15 June 2020 (UTC)
  • @Esteban16: There are multiple ways to make admins do more deletion work. Withdrawing admin flags because admins don't delete enough would be a carrot and stick way. One other way would be to streamline deletion work. We could for example change the deletion template in a way that displays all the information about the item, so that users don't have to visit the page of the item to make a judgement about the item.
Solving https://phabricator.wikimedia.org/T254434 would also reduce the friction when it comes to deleting items.
If both of those things would be fixed I would only have to open 1 page instead of having to open 3 pages when deleting an item from requests for deletion.
Given that we have a policy not to delete items with Wikilinks, I could imagine having a bot that moves all entries on request for deletion that have Wikilinks to a second page and then moves them automatically back when the Wikilinks are removed. Entries that stay longer than X months on the second page could be automatically cleared. That would increase the percentage of entries on the request for deletion page that are actionable. ChristianKl 10:03, 15 June 2020 (UTC)
Now since WD:RfD is being discussed here: that page is not a problem in my opinion and its mode of operation should not be changed. Most sections are directly actionable, some require input, and very few ones are actually difficult to close and sit there for some weeks (or months). However, the vast majority of items is being deleted without having seen that WD:RfD page at all, and they create the actual workload in the deletion field. So far we have deleted 74084 pages this year, which corresponds to ~444 pages per day on average, and clearly *much* more than what is cycled through WD:RfD. ---MisterSynergy (talk) 11:10, 15 June 2020 (UTC)
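The daily average quoted above can be reproduced from the stated total. This is a minimal sketch that assumes the count covers 1 January 2020 through the date of the comment (15 June 2020), counting both endpoints; that counting window is an assumption, not something stated in the comment.

```python
from datetime import date

deleted_pages = 74084  # total stated in the comment above

# Assumed counting window: 1 January 2020 up to and including 15 June 2020.
start = date(2020, 1, 1)
as_of = date(2020, 6, 15)
days_elapsed = (as_of - start).days + 1

pages_per_day = deleted_pages / days_elapsed
print(days_elapsed, round(pages_per_day))
```

Under that assumption the result rounds to the "~444 pages per day" figure given in the comment.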
  • As one of those who contributed to that RfC, I agree with MisterSynergy, and we need more admins, not less. People are not going to magically start doing more RfD work just because of this policy and any work done only for the purpose of avoiding inactivity is usually of lesser quality.--Jasper Deng (talk) 12:02, 15 June 2020 (UTC)
  • I think my words have been misunderstood. I pointed RfD (maybe I should have said deletions), which was later highlighted, because that area has a lot of work, and I wanted to provide an example like that. There will be always areas with more work to be done than others, and due to the nature of our project, deletions is one of them. I never meant that admins should do more deletion work, but the activity criteria could be increased. That said, my aim wasn't to discuss the current RfD process, statistics, etc... but all feedback is welcome. Some have stated that it would be better to have more admins, which I agree too, and most have said that this wouldn't help to increase the activity of inactive sysops, and rather would make them lose their flags. I can't tell that some wouldn't lose their flags, but also, others could become more active. I don't think it is bad when sysops get their flags removed for this, after all, this is all voluntary, and if they passed RfA without major concerns, they could do the same easily if they want to. And finally, I consider that all the tasks done by sysops should be appreciated equally, because in the end, they are serving the community. Esteban16 (talk) 14:47, 15 June 2020 (UTC)
    • Deletions is an activity where it's easy to count activity. Stepping into an edit war and mediating the conflict is also an admin task but it wouldn't show up in statistics. Summarizing an RfC likely also wouldn't show up. If you go through a list like https://www.wikidata.org/wiki/User:Pasleim/notability it takes less time to delete items than it takes to Google to bring the item up to our standards. ChristianKl 18:28, 15 June 2020 (UTC)
    • I do think it is a bad thing when we lose admins for this. People have real-life circumstances (if the current pandemic is any indication) and an admin active at one time might not be able to be active at another. There's no advantage to forcing them to lose adminship.--Jasper Deng (talk) 20:33, 15 June 2020 (UTC)
    • One idea is to define a Criteria for speedy deletion (which should be very objective and specific, unlike WD:N), so that admins can easily act on some easy ones, which will be in a new page and not clutter the main RFD.--GZWDer (talk) 23:32, 15 June 2020 (UTC)
      • All item deletion requests are "speedy" here at Wikidata anyways. There is neither a minimum discussion period, nor do items need to be listed on WD:RfD at all in order to be deleted. According to my experience, users will probably list their cases pretty often in the wrong section anyways. ---MisterSynergy (talk) 10:20, 16 June 2020 (UTC)
  • I'm not a fan of inactivity policies generally. I go in and out of activity (more time out these days, but such is life), but I still help out here and there and so do the other less active admins. The answer is more admins, not a stricter inactivity policy. -- Ajraddatz (talk) 15:13, 16 June 2020 (UTC)

Known age at time?

Hi! On Wikipedia, if a person has a known age at a certain date, there is the template w:Template:Birth based on age as of date where you can put in the age and the date and it will be inferred what their age is currently and when they were born. Is there a way to do this on Wikidata? For example, how could I put an age on Sean McElwee (Q64747234)? Thanks, DemonDays64 | Talk to me 04:53, 15 June 2020 (UTC) (please ping on reply)

@DemonDays64: I put something in the date of birth, although there's more than one way to record such approximate dates in Wikidata. Ghouston (talk) 05:36, 15 June 2020 (UTC)
In another interview he's 25: https://www.nbcnews.com/think/amp/ncna908331, but I don't think it establishes the year of birth: it's either late 1992, or very early 1993, if both interviews have correct ages. Ghouston (talk) 06:12, 15 June 2020 (UTC)
@Ghouston: thanks. It would be great to have an "age at date" thing you could put into the birth date thing so it's more friendly for developers who have to query, but at least having the data in some form now is good. DemonDays64 | Talk to me 14:57, 15 June 2020 (UTC)
See the example at Gerallt Davies (Q91336641) (note both birth and death date qualifiers). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 17 June 2020 (UTC)
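
The kind of inference discussed above (an age known on a given date bounds the date of birth, and a second age from another source narrows it further) can be sketched in Python. This is only an illustration, not how the Wikipedia template or Wikidata works internally; the function names and the two interview dates in the usage note are hypothetical and are not taken from the sources mentioned in this thread.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

DateRange = Tuple[date, date]


def _shift_year(d: date, year: int) -> date:
    """Move d to another year, falling back to 28 Feb for 29 Feb in a non-leap year."""
    try:
        return d.replace(year=year)
    except ValueError:
        return d.replace(year=year, day=28)


def birth_date_range(age: int, as_of: date) -> DateRange:
    """Earliest and latest possible birth dates for someone aged `age` on `as_of`."""
    # Latest: the person turned `age` exactly on `as_of`.
    latest = _shift_year(as_of, as_of.year - age)
    # Earliest: the person turns `age + 1` on the day after `as_of`.
    earliest = _shift_year(as_of, as_of.year - age - 1) + timedelta(days=1)
    return earliest, latest


def intersect(a: DateRange, b: DateRange) -> Optional[DateRange]:
    """Overlap of two ranges, or None if the two sources contradict each other."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None
```

For instance, with made-up dates, `intersect(birth_date_range(25, date(2018, 9, 10)), birth_date_range(27, date(2020, 1, 20)))` narrows the birth date to a window spanning late 1992 and early 1993, which is the sort of result that could then be recorded as an approximate date of birth with earliest date (P1319) and latest date (P1326) qualifiers.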

Faked

Identifiers Statistics

In the “Identifiers” section of some Wikidata records, we have added our property “National Library Board Singapore ID” to link our URIs to the Wikidata record of the same entity. For an example, please see the record for writer Catherine Lim (Q5052793). Is it possible to extract from Wikidata the usage statistics of the URIs we added, i.e. the number of times our URIs were accessed (either by clicking or through other means such as the API or bots) for each entity that we linked to Wikidata?  – The preceding unsigned comment was added by Nlbkos (talk • contribs) at 08:23, 15 June 2020 (UTC).

No, in principle this is unknown on Wikidata's side. You can probably see a fraction of the incoming traffic in your server logs, based on the Referer headers that browsers send along with their requests. --MisterSynergy (talk) 15:38, 16 June 2020 (UTC)
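
A minimal sketch of the server-side approach mentioned above, in Python. It assumes the web server writes access logs in the common "combined" format (request line, status, size, then the Referer in quotes); the regular expression, the function name and the example paths are assumptions made for illustration, not a description of any particular library's setup. It counts, per requested path, the hits whose Referer points at a wikidata.org host. Requests that carry no Referer (many bots and API clients) are invisible to this method, so the result is only a lower bound.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches: "GET /path HTTP/1.1" 200 512 "referer" (combined log format, assumed)
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def count_wikidata_referrals(lines):
    """Count requests per path whose Referer host is wikidata.org or a subdomain."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host = urlsplit(match.group("referer")).hostname or ""
        if host == "wikidata.org" or host.endswith(".wikidata.org"):
            counts[match.group("path")] += 1
    return counts
```

Calling `count_wikidata_referrals(open("access.log"))` (the file name is hypothetical) returns a `Counter` keyed by requested path, which can then be mapped back to the linked entities.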

Question on intended use of follows (P155) and followed by (P156)

Hello, I noticed a while ago that for Korean television series, the properties follows (P155) and followed by (P156) are used differently than I would expect. So I want to check whether they are currently used as intended in these objects or not. I would expect something like: Star Wars 4 follows Star Wars 3 and is followed by Star Wars 5. But for Korean television series objects, their use refers to time slots. E.g. in Korea, there are drama series that are broadcast on Wednesdays and Thursdays at prime time. Then there is a slot for Friday and Saturday prime time dramas, and so on. Now, for a Korean drama series that airs Wed-Thu, follows gives the series that ran in this time slot before it, and followed by gives the series that starts after the current drama has finished. E.g. My Mister (Q43111245) (started March 21) follows Mother (Q30949692) (ended March 15). Is it correct to use the properties like this? For time slots, even if there is no story or production connection? --Christian140 (talk) 06:26, 16 June 2020 (UTC)

I think if they are used as statements, instead of qualifiers, it should be more obvious how the series is constructed. Typically the item would have a part of the series (P179) statement. Ghouston (talk) 07:00, 16 June 2020 (UTC)
This is an area where I think some more consistency could be helpful. The examples given on follows (P155) give two where the items have a full-statement version of follows (P155), and two where the follows (P155) is a qualifier on a part of the series (P179). The latter version seems to me to be the better modelling in many cases than having a series ordinal (P1545), follows (P155) and followed by (P156) on the main item (especially as things can be part of more than one series), but it's still very much a minority use: currently p:P179/pq:P156 has under 25k uses compared to almost a million as a main statement. Interestingly, there are about 75k items with separate part of the series (P179) and follows (P155) claims. Am I correct in thinking it would be preferable to migrate a lot of those to having the ordinal/following as qualifiers to the series statement, rather than as top level statements? --Oravrattas (talk) 08:27, 16 June 2020 (UTC)
So what would the series of series be? --- Jura 09:11, 16 June 2020 (UTC)
@Jura1: No, there is no thematic link. What they have in common is that they both were broadcasted in the Wednesday-Thursday prime time slot of the South Korean TV channel tvN, while My Mister (Q43111245) started after Mother (Q30949692) ended. There is also the object TvN Wednesday-Thursday drama (Q10853090) for that specific time slot. Translated to English, the name of the Korean article is "tvN Wed-Thu Drama". Objects and articles like this also exist for other broadcasting stations and days, e.g. JTBC Monday-Tuesday drama (Q6863671) which is like "JTBC Mon-Tue Drama". --Christian140 (talk) 08:09, 16 June 2020 (UTC)
I read the property proposal as they are intended to be used as main values. I agree with Oravrattas that they are better suited to use as qualifiers as that would make the meaning more unambiguous. Will it require a new property proposal to change the use intention? Should new properties be made instead of change how existing ones are used? --Dipsacus fullonum (talk) 09:14, 16 June 2020 (UTC)
So you'd create a property JTBC TV schedule? --- Jura 09:27, 16 June 2020 (UTC)
My question is now if it is okay as it is or if something needs to be done. Another example could be Welcome to Waikiki 2 (Q61100037). It actually follows Welcome to Waikiki (Q31176615). However, the follows (P155) and followed by (P156) refer to time-slots, The Light in Your Eyes (Q60063045) and The Wind Blows (Q63079920). I would actually expect that Welcome to Waikiki 2 follows Welcome to Waikiki, instead it just refers to the drama that was premiere broadcasted in the timeslot before. Would be the same problem with Iris Ⅱ (Q139403) or Dream High 2 (Q491084) --Christian140 (talk) 17:30, 16 June 2020 (UTC)

baptismal record (Q11966269) and baptismal record (Q50292403) were just merged but I think they were meant to be two different things. One was the certificate given to you to keep when you are baptized and the other was the portion kept by the church in their registry books. Like birth, marriage and death records there are sometimes two different versions, one for the family and one for the person/agency to keep. So, should I restore them? --RAN (talk) 03:35, 17 June 2020 (UTC)

I don't think that's the case, baptismal record (Q11966269) before the merge only had a sitelink to no:Dåpsattest, which at Google Translate seems like the same thing. Ghouston (talk) 04:06, 17 June 2020 (UTC)
Ah, I see ... thanks! --RAN (talk) 04:31, 17 June 2020 (UTC)

Merging issue with the gadget

I have recently enabled the gadget named 'Merge', which is listed under the Wikidata-centric section in User Preferences. Then I randomly selected an item and chose the option 'Select for Merging'. It showed that a .js file had been created. Now, when I visit any random item, it shows a symbol near the 'Read' link with the hover text 'process the postponed merge'. I would like to know how I can cancel this merge. Adithyak1997 (talk) 05:04, 17 June 2020 (UTC)

You may use mw.storage.remove('merge-pending-id'); in your browser console. Alternatively, it will disappear once you perform another merge.--GZWDer (talk) 08:22, 17 June 2020 (UTC)

college town (Q1187811) currently has has facility (P912) college (Q189004) and university (Q3918) which violates the constraint. I'm not sure whether the constraint should be extended or whether another property should be used here. ChristianKl09:29, 17 June 2020 (UTC)

  • It could be interesting to add statements to classes that don't fit the "has part of class"/"has part" properties. This could cover organizations, functions and facilities generally found in instances of these classes. --- Jura 11:03, 17 June 2020 (UTC)

floruit (fl.)

Following a difference of opinion on the proper use of property P1317 (floruit), here in Dutch and here in English, I think it is good to continue and broaden the discussion here in the Project chat. I especially ask historians for their input.

I have come to know the term "floruit", more often abbreviated to "fl.", as indicating the period in which a person has been found to have been active, when dates of birth and death are not known. Most of the time it pertains to historical persons who are only known by some works that have survived the turmoil of history, and/or who are referred to in a more or less select set of publications, without records of life data. In short, I think this is also what is reflected in, for instance, en:floruit. I have not been able to consult dictionaries or handbooks on the proper and historical use of the term or on its history.

Now, I encountered the use of the property 'floruit' by User:Hannolans in the item on Hans Blom, a philosopher who is most likely still alive. Hannolans merely copied the given date of his doctorate, 1995, to the property floruit. Of course via the given WorldCat Identity many publications by Blom after 1995 are known, but to keep it short: in defence of Hannolans, User:DanielleJWiki stated that the property floruit on Wikidata is used to express the date of the latest known sign of life of the subject.

I think this is completely opposed to a centuries-long usage, a non-controversial tradition really, in scientific publications. Now I don't mind having a property 'Last known sign of life', even when it seems rather superfluous for people who are most likely alive, but why name it after an established and longstanding term with quite a clear meaning and a distinct, exclusive usage? But more than that, I think on Wikidata the property was not introduced to demarcate the 'last sign of life' – in 2014, the creation edit summary says "flourit [sic], time when a person was actively publishing works", and the current description in my view also does not mean 'last known sign of life'.

Given the confusion I experience, I think it's justified to presume that others on Wikidata are using floruit in the traditional way as well: to indicate a span of time in which works or life evidence of some subject have been found. A suggestion by DanielleJWiki was to use work period (start) (P2031) and work period (end) (P2032) to express a period elsewhere given as floruit, but floruit is not meant to express fixed dates: it does not mean the subject started at that time, only that earlier works are not known. A later proposal to rename the property (to 'Last sign of life'?) would be an option if other users did not use floruit in what is in my opinion the correct way. A suggestion by Hannolans that I propose splitting it into two properties would be an option if I favored a property 'Last sign of life', which I actually do not (Hannolans and DanielleJWiki state that it is an important property in relation to possible later copyright issues, but I think there are other properties that cover that).

Which is the proper use of P1317 and how to solve the given problem? Thanks, Eissink (talk) 20:32, 11 June 2020 (UTC).

The property states that a person is alive at that point in time. And that is useful information for cultural institutions. --Hannolans (talk) 21:05, 11 June 2020 (UTC)
But if that is the sole use of the property (which by the way totally defies the description), don't you agree that the name is rather unfortunate? Eissink (talk) 21:12, 11 June 2020 (UTC).
  • Just because Hans Blom (Q57249521) might likely be alive doesn't mean that we know whether or not he's alive. His entry has no date of death (P570) statement, so for the purpose of modeling within Wikidata it's unknown to us whether or not he died. Our concept of floruit might be a bit broader than the way some historians use the term, but all dates for floruit under the narrower definition that some historians use, which only covers times when the person worked and where we know the person is dead, are still valid under our broader definition. ChristianKl21:18, 11 June 2020 (UTC)
But, apart from the property name, what is the use of deriving a 'last known sign of life'-date from another existing property? (Especially when that date is already outdated by other dates that can be found in already existing identification properties?) Eissink (talk) 22:36, 11 June 2020 (UTC). P.S. Maybe it's good to add that the case of Hans Blom is not isolated, it is just one example of hundreds or thousands of similar edits. Eissink (talk) 01:07, 12 June 2020 (UTC).
If you see that a date is outdated by another date, feel free to replace it with the newer date and remove the outdated one. The use of having that information on an item is that you don't need to think about how to get the information from other items when you want to use it but can query it directly. The datum seems to be useful for dealing with copyright. ChristianKl11:50, 12 June 2020 (UTC)
When there's an abundance of identity codes and a plethora of works surrounding a subject that is or was clearly active in recent years, or is even known to have been born in say the latter half of the 20th century, I don't think that copyright argument has any power here, even if you'd wish to rename 'floruit' to 'Last sign of life'. Eissink (talk) 12:58, 12 June 2020 (UTC).

There are probably six different questions:

  1. do we need a property to add statements to indicate when a given person was alive?
  2. do we need a property to add statements to indicate when a given person was alive, but is known to be dead now?
  3. is there use for two separate properties for the above?
  4. if not, how should we label the property in English? What aliases should it have?
  5. how should this be translated into other languages?
  6. also, is the use of Property:P1317 consistent with its definition?

Hope this helps. --- Jura 21:34, 11 June 2020 (UTC)

  • There's also 7: what does it mean that a person is known to be dead or alive? ChristianKl21:50, 11 June 2020 (UTC)
    • Isn't that answered once we query the data? --- Jura 21:53, 11 June 2020 (UTC)
      • I’d say for example that a person known to be born in the 10th century is known to be dead even if there is no reference for a « date of death » statement whatsoever. I’d add (and I did, actually) a « date of death » statement with a date of death like « year 1000 with century precision » or unknown value. Maybe with a qualifier earliest date (P1319) for the last attested date on which the person is said to be alive in a reference. It would appear in a query with a date of death. Would that be OK for you ? author  TomT0m / talk page 07:27, 12 June 2020 (UTC)
        • That makes sense to me. Still it feels awkward to apply it as 'floruit' to persons who by all reasonable expectations, and by further evidence not yet disclosed on Wikidata, are most likely still alive and kicking today, if you know what I mean. In my opinion, the term 'floruit' should be used only with historical persons whose whereabouts have mostly vanished in the mist of time, with the exception of those more recent persons that have operated in any culture that doesn't have a well developed system of archives and public records, because I believe that is the traditional exclusive scientific use of floruit. Eissink (talk) 09:07, 12 June 2020 (UTC).
          • So you wouldn't use it say for an author of a book published in 1950, since they may be still alive, but for a book published in 1910 it would be fine? Ghouston (talk) 11:56, 12 June 2020 (UTC)
            • It's less the date that is important, actually, it's the circumstances. If one or several books have been published by [Pseud. Name] and if that is all that is known, then it's good to say floruit then and then, also when it's in the 1950s or in the 1980s for that matter. User:Bouzinac just made a good and useful addition to property P1317 (floruit), as it is used in tradition, namely "ce qui permet de présupposer une période où elle vivait", "which allows us to presuppose a period when he or she lived". That reflects exactly how 'fl.' is and always has been used in scientific literature (which is why I oppose the very peculiar, unwarranted use – and appropriation – of the term on Wikidata). Eissink (talk) 12:30, 12 June 2020 (UTC).
          • I'd agree with this. It feels very awkward to apply it to modern figures, and kind of odd to apply it to anyone where we know either a birth or death date. Andrew Gray (talk) 12:07, 12 June 2020 (UTC)
            • It's the nature of language that as use-cases change, meanings often get broader. Gene used to be a term that referred to a sequence that codes for a protein. Given that it was later found that DNA frequently codes for RNA that doesn't code for proteins, the common definition of gene has now been broadened to also apply to sections of DNA that code for RNA that doesn't get translated into proteins.
It seems clear to me that the current usage of the property points to a well-defined concept, and there's not much value in having a second property that does roughly the same for historic people. On the other hand, the question of what the best name for the property is seems a more open one. One possibility would be to name it "known alive in" and use "floruit" as an alias. If we do rename it, it might also be worth looking through prior art to see whether there's another established term somewhere that might be used. ChristianKl13:27, 12 June 2020 (UTC)
"Known alive in" for the (apparent?) current usage seems fine to me, but I wouldn't use "floruit" as an alias: why not create a new property for the (traditional) use of floruit, which then should be used only direct from the to be given source? It would certainly be a general improvement – it should then have two date fields, with the option to fill the fields with the same date in case of only one reference year. That property then also should have to allow multiple entries, for the probably more rare cases where different sources make different inferences. (As far as the example of 'gene' goes: I think we have here rather an example of a narrowing, not a broadening of definition, going from a possible period of activity to a single point in time, which is part of the problem raised, not to mention other consequences.) Eissink (talk) 14:46, 12 June 2020 (UTC).
The existing usage of floruit (P1317) does allow entering two values. The property doesn't have single-best-value constraint (Q52060874) or single-value constraint (Q19474404). You can easily enter two dates to specify a range. floruit (P1317) generally only make sense when an item has neither date of birth (P569) nor date of death (P570)
All the traditional ways to use floruit imply that the person was "Known alive in" at the specific point in time. Having two properties that mean the same thing when used for historic purposes is problematic because when people query the relevant data they would have to make more complex queries to be able to deal with both ways of modeling the data.
I personally think that sourcing requirements shouldn't be defined on a per-property basis but based on general principles. Privacy protection is a general principle that warrants forbidding certain unsourced claims, but there's no such requirement for claims that are unlikely to do damage and where it's just desirable to have sources. ChristianKl22:45, 12 June 2020 (UTC)
Now you say exactly what my point is from the beginning: "floruit (P1317) generally only make sense when an item has neither date of birth (P569) nor date of death (P570)", which means that the current usage (by Hannolans and maybe others) indeed doesn't make sense, because it is applied (by the hundreds or thousands) to items on persons whose dates of birth are perfectly known (as are many, many other life signs), thereby polluting and degrading the traditional use and meaning of floruit. It's the only reason I started this discussion. Such current usage does not make sense, but in the earlier discussions I encountered two users who defended Hannolans' no sense application as exactly the reason the property existed, even when the description (and common sense, really) states otherwise. I was not able to convince those users, but if the result of this discussion would be that Hannolans and others just have to stop adding floruit on items where the life dates are given, or at least not a matter of concern, then this case could be closed. Eissink (talk) 23:24, 12 June 2020 (UTC).
en:Floruit gives an example with a date range, "John Jones (fl. 1197–1229)", basically substituting either a work period (in artistic usage) or a period with signs of life (in historical or genealogical usage) for the unknown birth and death dates. Ghouston (talk) 03:40, 13 June 2020 (UTC)
And that's easily modeled by giving an item floruit (P1317) 1197 and floruit (P1317) 1229. ChristianKl08:58, 13 June 2020 (UTC)
@Eissink: Let's say we discover that John Jones from the above example was born in 1185. It feels strange and undesirable to me that this should result in us deleting 1229 from our database, as that date does provide more information about the person and deleting it means losing information we previously had. ChristianKl08:58, 13 June 2020 (UTC)
In that case, I agree. But as Ghouston says: floruit is basically a substitute "for the unknown birth and death dates". What I object to is that Hannolans uses floruit where the date of birth is perfectly known (this is just one example of many) and where the date of death, by all reasonable expectation, will or could be perfectly known as well; he is using floruit on contemporary people with an abundance of life signs. Moreover: he maintains that the floruit property in such cases should only provide one date, being the last sign of life date. And such use is defended by DanielleJWiki, who says that date should be removed once the date of death becomes known. And DanielleJWiki also adds that for floruit in the traditional sense we should use work period (start) (P2031)/ work period (end) (P2032), which in my opinion is turning things upside down completely. I object to both those things, and I hope you would too. Eissink (talk) 12:06, 13 June 2020 (UTC).
  • Maybe it's worth clarifying that in Wikidata statements shouldn't be re-written every time new information from some other source is known and/or evaluated. Accordingly, if the actual date(s) become known, one shouldn't re-write history at Wikidata. Statements can obviously be re-ranked, but this is different from deletion. --- Jura 16:53, 14 June 2020 (UTC)
  • Well, that makes my objection to the particular use of floruit by Hannolans, namely as a single "known alive in" date, even more clear, because in that case we would have a date of birth and (presumably) many other life dates by works etc., yet the 'flourishing' period would suddenly only span the years between the previous "known alive in"-date and the latter. I think it becomes also clear that Hannolans and DanielleJWiki c.s. should refrain from using floruit in an unestablished and unwarranted way. I hope they will agree too and I'd like to hear from them. The main guideline should be: don't use floruit on subjects whose life dates are not problematic. Eissink (talk) 18:18, 14 June 2020 (UTC).
    • I think the clarification Jura mentioned is a good recommendation. Reranking the "outdated" statement and giving preferred ranking to the most relevant one (instead of deleting the less recent one) seems like a sensible thing to do.
    • Not having a date of death in wikidata, but having a date when the person was still known to be alive deserves floruit (P1317). Let's consider the following example: if a person was born in 1900 and we know they showed a sign of life in 1951 but we don't know (if or) when they died exactly, floruit (P1317) at least gives us the determination method for deciding their work is still under copyright in the EU. DanielleJWiki (talk) 09:03, 15 June 2020 (UTC)
  • @DanielleJWiki: I see you are pushing the date back a little, from persons who are probably alive (and flourishing) today to people born in 1900, to try to uphold using floruit as a copyright indicator, but it still doesn't work. The method you propose for decisions on copyright can only be indicative and never decisive, unless we have a date of death. In any case, your proposed use of floruit, practically giving as 'flourishing period' a rather random span of time between two points believed to be the last known signs of life, is a slap in the face of the very meaning of floruit, as I think I have elaborated at sufficient length above. Is it really that hard to understand that your application defies logic? Or don't you want to give up a wrong practice only because it has been your practice for some time? I think "I have always done it like this" is never an argument for continuing bad practice. If you want a field for copyright purposes, for giving a date for "last known alive", and if you really want to abuse an established concept for that without academic discussion, what else can I do but start an RfC, but do I really have to? Am I mistaken when I conclude from the above discussion that you should stop using floruit as you do? Please try to develop a new property for your purposes and leave floruit to its traditional use. Thanks. Eissink (talk) 09:48, 15 June 2020 (UTC).
  • Maybe it's good to add that this property was not created to be used in the sense you and Hannolans prefer to use it. Eissink (talk) 09:59, 15 June 2020 (UTC).
  • What they use the date for doesn't really matter. The field can be useful for different things. Personally, I had used it to check items about people born <1910 or something, but without a P570. The question is if the reference supports a statement consistent with the property description. That the label can have a different meaning in some contexts isn't that important. --- Jura 10:02, 15 June 2020 (UTC)
  • It's true that it doesn't matter what the date is ultimately used for, but it was you who said here that dates shouldn't be rewritten, the consequence of which is that when people only focus on floruit as the last signs of life, the given floruit period will always be a distorted one for those who (rightly so) use it as it has been used in the field for a long time. And I think the property description is meant to, or at least should, reflect the established use. Wikidata should not detach itself from the outside world, and it should not give some self-oriented meaning to subjects that do have a larger context. Eissink (talk) 10:15, 15 June 2020 (UTC).
    • The property is for a point in time, not a period. As we learn more about a topic, dates get added .. It seems to me that you're mainly concerned with question #4 in relation to non-genealogists. --- Jura 10:20, 15 June 2020 (UTC)
  • Question 4, yes, and therefore also question 6, but both in relation to general historical science, so including genealogy. I object to hijacking a well established concept. And I do think that the property was created to fit that concept. Using it for more contemporary subjects from areas with less written history (archives, etc.) surely fits the concept, but using it on contemporary persons in the way of DanielleJWiki is stretching the concept beyond the point of breaking: it doesn't make sense, you cannot fit that into the concept. (And saying, as you do, that the property "is for a point in time, not a period" is also contrary to the concept of 'floruit' – and to be clear: I don't mean my concept.) That is indeed my point. I don't fight any purpose, I object to the fatal abuse of a concept. Eissink (talk) 10:37, 15 June 2020 (UTC).
So far I don't see an argument for having different properties for the same concept. You suggest having a historic cut-off date somewhere around 1850-1900, after which a new property needs to be used by cultural institutions because otherwise there is confusion? I don't see how that will help. Background information: we add this floruit for National Museum of World Cultures (Q17153751), as for many artists from Africa, Japan, Indonesia and Canada (Inuits) only a floruit is known by the museum, based on the artwork, and no date of birth or death is available. We hope to find more exact information when we connect those artists to other collections, and floruit helps a lot. It makes no sense to me that we should only use this property for Japanese print makers, but not for African sculptors because the wood sculptures are from a more recent date, after the cut-off date. The property is about storing information on when a person was alive, useful for institutions to identify creators. And as said above, some people will have both a floruit and a date of birth because of differences in sources and because more and more information becomes available. Rendering that information can be done in applications, and apps (infobox etc.) can make use of floruit how they wish, for example by not displaying floruit when a date of birth is available. --Hannolans (talk) 11:33, 15 June 2020 (UTC)
You haven't seen an argument to have different properties for the same concept, because that is not what I have aimed for. We don't need different properties for one concept. I say we need different properties for different concepts. And also I didn't say we need some cut off date (perhaps I used a date on your Talk page as example, but never as some hard border): even for contemporaries like f.i. Banksy or other anonymous creators floruit is perfectly fine (but, again, not if you only want to give the last known date!), I have not meant otherwise and I don't think I could have raised that impression. All the examples you give now, do more or less fit perfectly under floruit (at least if you would not ignore the first sign of life and only stick to the date of the last known sign of life). What does not fit, as I have tried to make clear in many words already, is applying the concept of floruit to for instance, again, Hans Blom or any other contemporary (or even historic, for that matter) figure on which life dates are perfectly known or at least potentially not problematic at all, but that is also what you do – that is what I object to. Why don't you address that concern of mine? You are avoiding the main cause for this discussion – did you even follow it, or is it really that hard to understand what I am saying? I'm not interested in how one could render or apply this or that app, I'm talking about the bare misuse of the concept of floruit in the database, because you are stretching the common sense, established definition of that concept (which, by consequence, no doubt muddles the possible use of the data also: we don't have definitions and descriptions for no reason). Eissink (talk) 12:03, 15 June 2020 (UTC).
Elsewhere the question has more or less been resolved, that is: I will request a new property for the alternative use of floruit. Starting the request might take a week, or even longer. For now, I consider this discussion closed, so please ping me if you want to add something. Thanks, Eissink (talk) 16:38, 15 June 2020 (UTC).
@Eissink: there's also another method of representing unknown dates of birth/death but with signs of life, which is to create an approximate date of birth statement with a latest date (P1326) qualifier, or an approximate date of death statement with earliest date (P1319). Ghouston (talk) 00:37, 17 June 2020 (UTC)
@Ghouston: thank you so much. I think earliest date (P1319) perfectly fits the purposes of Hannolans and DanielleJWiki. I hope they can let us know they agree, so we can end this discussion finally, preventing further abuse of floruit (P1317). Eissink (talk) 09:04, 17 June 2020 (UTC).
@Ghouston: Using earliest date (P1319) with unknown value implies that we know that a person died and that it's unknown when they died. It's no solution for the case where we don't know whether or not a person has died, which happens to be the case in the usage of floruit (P1317) that gets criticized in this thread. ChristianKl22:06, 17 June 2020 (UTC)

Hannolans refuses to react here and refuses to consider P1319 as an option because "we don't know if someone is still alive": that was my point in asking to stop using floruit to speculate on someone's last sign of life! I cannot believe this is happening and I am totally pissed off. Somehow it seems okay to abuse a concept to speculate on someone's death, but using the proper property to do that is rejected as outrageous. Can someone please stop this madness? What else can I do but ask for disciplinary measures? Can someone just go on abusing properties and defying common sense? Eissink (talk) 11:44, 17 June 2020 (UTC).

  • It's also not appropriate to neglect the description of floruit – namely to only use it "when birth or death [are] not documented" – and misuse the property to basically speculate on someone's date of death, yet that is exactly what Hannolans is doing. You cannot disapprove the one and agree with the other without being totally hypocritical. This all started because I was shocked to see floruit used on a person whose work I know: I thought he must have died, as would anyone familiar with the use of floruit, yet it appeared someone only wanted to indicate a possible future copyright border by copying an outdated date from another property. This has to stop, I don't care how. Eissink (talk) 12:11, 17 June 2020 (UTC).
FYI, I want to add for all non-Dutch readers in this "discussion" that user:Eissink in the original discussion on the use of floruit (P1317) (on the talk page of Hannolans) has started swearing (godverdomme), making severe insinuations, using very denigrating comments and even threatening another (I might add very respected and very experienced) Wikidata colleague that he (Eissink) will make sure one way or another that this colleague will be stopped in using floruit as a 'sign of life' when there is no date of death mentioned in Wikidata. In my opinion the repetitive ways user:Eissink is discussing this matter is seriously damaging colleagues and the professional work atmosphere on Wikidata; this is not how we should discuss a matter like this. Ecritures (talk) 12:31, 17 June 2020 (UTC)
The one 'godverdomme' ("goddamnit") is real (and I choose my own words), the insinuations are imagined, the so called denigrating comments are not that and the "threatening" is just an announcement that I will not give up this discussion. I challenge the way you use a property, and I think there is all reason for me to do so, but if you think your refusal to come to a solution can be obscured by accusing me of damaging colleagues, then I think the time to converse is over. And for god's sake stop alternating between Ecritures and DanielleJWiki – you entered the discussion as the latter, your 'professional' account, yet you refused to reply in this Project chat earlier. You shouldn't have let the discussion come to this point to start with. Eissink (talk) 12:55, 17 June 2020 (UTC).
Concerning the so called "threat": I did not say "I will make sure", I said "I will try to make sure", the meaning of which is totally different than the poisonous frame you like to stick me in. Your reaction is a disgrace. Anyhow, I have just followed up on "trying to make sure" that the concept of floruit is stopped being abused, see Wikidata:Administrators'_noticeboard#Report_concerning_Hannolans. Now at least I think I have tried to defend a concept to the maximum extent possible. If somehow the administrators think there's no problem, I have at least done all I could, so it will not haunt me, but may it haunt the dreams of everyone who keeps defending the abuse of floruit for speculating on a death date, for they are raising their middle fingers to historical science. My job on this issue is over now, but let me add that any work place where it is not allowed to criticize a "very respected and very experienced colleague" will sooner or later start to stink like hell. Eissink (talk) 14:07, 17 June 2020 (UTC).

Tutorial

Is there a tutorial on importing a new data set? I want to try it for the first time. --RAN (talk) 18:26, 13 June 2020 (UTC)

Do you have any ideas on what data you would like to import, and how? Your question currently is rather vague. Generally I can for example advise OpenRefine as an import tool but it surely depends on the size and form of your dataset. Ecritures (talk) 17:04, 17 June 2020 (UTC)

Islamic Republic of Afghanistan

In the recent-ish history of Afghanistan we have a series of separate items: Kingdom of Afghanistan (Q1138904) replaced in 1973 by Republic of Afghanistan (Q1415128), then in 1978 by Democratic Republic of Afghanistan (Q476757), in 1992 by Islamic State of Afghanistan (Q1415585), and in 2002 by Transitional Islamic State of Afghanistan (Q4689103). In 2004 that was replaced by the Islamic Republic of Afghanistan, but the item we currently use there is Afghanistan (Q889), which isn't really the modern state specifically, but a much more abstract representation of a wider concept of "Afghanistan" (see, for example, the multiple "inception" dates it includes). This means, for example, we get statements like:

⟨ Afghanistan (Q889)      ⟩ head of state (P35)   ⟨ Amānullāh Khān (Q153620)      ⟩
start time (P580)   ⟨ 9 June 1926 ⟩
end time (P582)   ⟨  14 January 1929 ⟩

even though that seems like it could/should live on Kingdom of Afghanistan (Q1138904) instead or as well.

I can think of a few possible options here:

  1. We create a new item specifically for the Islamic Republic of Afghanistan as the modern sovereign state, and Q889 can be for the more abstract concept of "Afghanistan". Information about political concepts like Head of State/Head of Government etc should then be moved to the relevant predecessor items.
  2. We create the new item for the Islamic Republic of Afghanistan for continuity with the other versions, but those items shouldn't really contain much data: specifically all the HoS/HoG info etc should live solely on Q889 for all periods
  3. A combination of #1 and #2, where we duplicate information across both the historic items, and Q889
  4. We declare that Q889 *is* the modern Islamic Republic of Afghanistan, and anything anachronistic be moved to the predecessor items
  5. Q889 remains a mixture of everything either because it's not actually a problem or confusing, or because it is, but we don't really have a good solution
  6. Something else entirely

My preference is probably for #1 or #4 (at least until someone else comes up with a much better #6), but #5 (the current situation) seems like the worst version. Thoughts/suggestions/alternatives? --Oravrattas (talk) 10:51, 2 June 2020 (UTC)
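As a rough illustration of what option #1 would involve, the following Python sketch picks the predecessor item that a dated statement could be moved to. It is only a sketch under stated assumptions: the period years are taken from the post above, except the 1926 start of the Kingdom, which is an assumption added for illustration, and the helper name `item_for_span` is hypothetical.

```python
from typing import Optional

# (start_year, end_year, item) for each period named in the post above.
# Assumption: 1926 as the start of the Kingdom; the other years come from the post.
PERIODS = [
    (1926, 1973, "Q1138904"),  # Kingdom of Afghanistan
    (1973, 1978, "Q1415128"),  # Republic of Afghanistan
    (1978, 1992, "Q476757"),   # Democratic Republic of Afghanistan
    (1992, 2002, "Q1415585"),  # Islamic State of Afghanistan
    (2002, 2004, "Q4689103"),  # Transitional Islamic State of Afghanistan
]


def item_for_span(start_year: int, end_year: int) -> Optional[str]:
    """Return the predecessor item whose period fully contains the span.

    Returns None when the span crosses a succession point or falls outside
    every listed period, so that a human has to decide.
    """
    for period_start, period_end, item in PERIODS:
        if period_start <= start_year and end_year <= period_end:
            return item
    return None


# The head of state (P35) statement for Amānullāh Khān runs from 1926 to 1929.
amanullah_target = item_for_span(1926, 1929)
```

A span that straddles two periods deliberately returns `None` rather than guessing, since those are exactly the cases that would need discussion.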

That's unfortunately a problem that we have with many (most?) countries. While we usually have separate items for their predecessor states, the items for the current countries are almost always mixing the current iteration and the general concept of the country (including its prior governments, its current and former territories, the culture and everything that ever happened within the boundaries of the current country all the way back to the dawn of time). IIRC there have been various small-scale discussions, but nothing that resulted in actual solutions or efforts to tackle this problem. --Kam Solusar (talk) 16:53, 2 June 2020 (UTC)
Seems like we need something like the work/edition solution from FRBR? :) ArthurPSmith (talk) 20:19, 2 June 2020 (UTC)
Oh, that's an interesting analogy. I suspect a key part of the issue here is that we're largely reflecting that Wikipedias tend to use the main country page for the wider 'general concept' as well as the current sovereign state. So the question is how to best untangle those. With a bit more thought, I don't think #4 above would really make sense, as we'll need to keep the broader item as the target for all those links. But if we already have separate items for most "editions" anyway, making sure we also have one for the current incarnation, and moving lots of the data out into those (#1 above), sounds workable. --Oravrattas (talk) 21:30, 2 June 2020 (UTC)
I think a possible solution is to identify when a state is considered as a "continuing state" in international politics, i.e., generally inherits things like treaties (including membership of international organizations, such as the UN), debts and citizens from the predecessor entity. Typically, a change of government or name, or change of territory, is not sufficient alone to invalidate the succession. Then there should be a Wikidata item to represent the continuing state, until such time as non-successive discontinuity is found (such as when an empire splits into multiple successor states). Wikipedia will typically have a separate item for each historical name that a state has been known by: these can be marked in Wikidata as historical periods of the continuing state instead of separate states in their own right. Ghouston (talk) 05:21, 3 June 2020 (UTC)
@Ghouston: I suspect I'm missing something in your suggestion, but I'm not really seeing how that helps with the overarching "Bonnie-and-Clyde problem". As you say, there'll pretty much always already be separate Wikidata items for the historic versions of a state/country, and deciding that some of those should be marked as something less than an actual historical country (Q3024240) (or whatever better set of items we come up with) is certainly plausible. But we're still left with the core problem that the primary item for the country is doing double duty as being about the current sovereign state as well as the more abstract concept of the country. I think the deeper question is really much more about how we untangle those two concepts, rather than how we decide which countries to do that for or which historic versions to move which claims to. --Oravrattas (talk) 07:13, 3 June 2020 (UTC)
It's common enough for Wikipedia articles to have a main topic, but to also digress into related areas. I think separate items for country and state would be very confusing, which would you link to in any particular situation? Is the distinction between a country/state, as an area within a border, and the entity that governs it, e.g., United Kingdom (Q145) and Government of the United Kingdom (Q6063) sufficient? Ghouston (talk) 07:29, 3 June 2020 (UTC)
The immediate context of the question for me is about office held by head of government (P1313), head of government (P6), office held by head of state (P1906), and head of state (P35), but there are lots of other properties (e.g. flags/anthems/emblems/mottos etc) that generally also change at these succession points. We can, of course, have the entire history of all these things on the main item for the country, with suitable start/end dates, but then the question remains whether we also duplicate them onto those earlier items as well, or leave those items largely bare? Or do we have the older versions on the separate items, and only include these details on the primary/current item from when it was last 'split'? To take a practical example, which item or items should have the claim: office held by head of state (P1906): King of Afghanistan (Q17323565)? I would certainly expect to see it on Kingdom of Afghanistan (Q1138904), but should it be on Afghanistan (Q889) and if so is that as well, or instead of? Or do you think that shouldn't live on any of the country items at all, and instead be moved off to separate government pages, or elsewhere? --Oravrattas (talk) 08:54, 3 June 2020 (UTC)
I'd be inclined to put these things like flags on both the main item, with start and end dates, and on the item for the historical name. Maybe you have a state that has changed names and flags a few times, but it's still recognised as the same state. I doubt that the flag etc., really need to go on the item for the government too, but such duplication happens a lot in Wikidata. Andy Foster (Q39073800) is declared to be mayor of Wellington City on 3 different items (and could be 4 if we made him the incumbent of his position.) Ghouston (talk) 09:10, 3 June 2020 (UTC)
Generally I'm fairly happy with duplication: saying the same thing in different ways can be a useful "double-entry bookkeeping"-style error detection check, and there are often also subtle distinctions between the different statements (that become clearer, for example, where the function of an office changes in the middle of someone's term in it). Such cases generally also make it easier to query the data, as you can get at what you want from different directions, so to speak. The sort of duplication you're proposing worries me a bit more, however, where we would add exactly the same statement across several items, and thus many queries will bring back both, and you'd need to know how to filter some of the entries out. To stretch the Bonnie-and-Clyde analogy perhaps a bit too far, it also seems odd to me to say that the solution is to have a "Bonnie and Clyde" item, and a "Clyde" item, but not a "Bonnie" item. The Islamic Republic of Afghanistan, the Republic of Ireland, and the Russian Federation, for example, are well defined entities that are significantly different from the wider concept and history of "Afghanistan", "Ireland" or "Russia". Having Wikidata items for the latter concepts, but not the former, seems decidedly odd, and leads to all sorts of inconsistencies and anachronisms in the data. I certainly agree that splitting them out could potentially be confusing in some scenarios, but sidestepping those by pretending that there's not a difference doesn't seem to be a great answer, and as ArthurPSmith suggests, the work/edition split in books is a useful parallel here. That certainly also has similar problems, where editors often add statements at the wrong level: but that's generally fairly easy to fix up, and we gain much more clarity and consistency in both modelling and querying.--Oravrattas (talk) 10:41, 3 June 2020 (UTC)
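The querying concern above can be sketched in a few lines of Python. This is a toy model and not how Wikidata stores or queries statements: the `Claim` tuple and both helper functions are hypothetical, and the 1926 start year is an assumption (the 1973 end year comes from the opening post). It shows that when exactly the same statement sits on two items, a query by year returns it twice and the caller has to know how to filter.

```python
from typing import List, NamedTuple, Optional


class Claim(NamedTuple):
    item: str             # item the statement is stored on
    prop: str             # property, e.g. P1906
    value: str            # value item
    start: Optional[int]  # start year, None if open
    end: Optional[int]    # end year (exclusive), None if open


# The same "office held by head of state" statement duplicated on two items.
CLAIMS = [
    Claim("Q889", "P1906", "Q17323565", 1926, 1973),      # on Afghanistan
    Claim("Q1138904", "P1906", "Q17323565", 1926, 1973),  # on Kingdom of Afghanistan
]


def claims_in_year(claims: List[Claim], prop: str, year: int) -> List[Claim]:
    """Return every claim for prop that is valid in the given year."""
    return [
        c for c in claims
        if c.prop == prop
        and (c.start is None or c.start <= year)
        and (c.end is None or year < c.end)
    ]


def dedupe(claims: List[Claim]) -> List[Claim]:
    """Keep one claim per (property, value, start, end), ignoring the host item."""
    seen = set()
    unique = []
    for c in claims:
        key = (c.prop, c.value, c.start, c.end)
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique


raw = claims_in_year(CLAIMS, "P1906", 1950)
filtered = dedupe(raw)
```

Here `raw` holds two entries and `filtered` holds one, which is the extra filtering step the comment above is worried about.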
In Wikidata, it seems to me that only the main country item would be needed, with start and end dates for the things that have changed. The items for the old names would be just for the sake of Wikipedia, and for templates on those Wikipedia articles. It generally wouldn't be desirable to link other Wikidata items to them, or to query them. The items wouldn't turn up in queries if they were declared only to be "historical periods", or some such thing, instead of representing separate countries. Ghouston (talk) 10:55, 3 June 2020 (UTC)
Sometimes a country does simply change its name without the entity really changing (e.g. North Macedonia, Eswatini), but that's not really what we're talking about here. It seems much too reductionist to suggest that the changes from the Kingdom of Afghanistan down to the modern Islamic Republic of Afghanistan are merely that, and there weren't actually distinct states in between. Even purely for the purposes of serving Wikipedia, never mind wider considerations, wouldn't it make sense for Republic of Dahomey et al to be able to populate their infoboxes from direct lookups on Republic of Dahomey (Q798431) rather than indirect date searches against Benin (Q962)? And of course with something like Russia (Q159) it gets even trickier. Should it have all the data for Russian Empire (Q34266) and Russian Republic (Q139319)? What about Russian Soviet Federative Socialist Republic (Q2184) and Russian Socialist Federative Soviet Republic (Q2305208), or even Soviet Union (Q15180)? --Oravrattas (talk) 11:36, 3 June 2020 (UTC)
Sometimes it will make sense to treat them as continuing entities, and sometimes as newly formed entities, especially when the formation is the result of a merge or split. I'm not sure that this is really a decision that Wikidata can make on its own: it should simply reflect how the change has been treated in international politics, e.g., whether a state has been able to continue to hold a membership in bodies like the United Nations, or whether it had to apply as a new entity. Wikipedia articles are a different issue: Wikidata usually has an item for any Wikipedia article, and the data on that item can be used to feed its templates, but such items don't need to be treated as representing separate countries. Ghouston (talk) 11:51, 3 June 2020 (UTC)
The principle that's typically used now is that countries get additional Wikipedia articles and hence Wikidata items for every name change. So we only have one item for the United States, since it never changed its name, but we have several for the United Kingdom and Afghanistan and a couple for the Republic of Ireland. Major changes in territory are typically ignored, in cases where a state retained the same name. Ghouston (talk) 12:06, 3 June 2020 (UTC)
I think there's a distinction between continuing states, such as the UK through the independence of the Irish Free State, and successor states, such as the Soviet Union and Russia. If I remember correctly, Russia was considered the successor state of the Soviet Union, and was able to take over the UN seat and privileged position on the Security Council. The UK on the other hand just continued as though nothing had happened. The newly created states, Ireland, Ukraine, etc., had to start from scratch in creating new treaties and memberships. Ghouston (talk) 12:12, 3 June 2020 (UTC)
I'm afraid I'm having a lot of difficulty in working out what you're suggesting looks like in practice. Would you mind spelling out in a bit more detail what that would actually look like for the specific case of the historic office held by head of government (P1313) information for Ukraine (Q212), Russia (Q159), and Soviet Union (Q15180)? That is, if we want to have enough information to be able to replicate the sorts of lists that are on List of heads of government of Russia, List of prime ministers of Ukraine, and Premier of the Soviet Union, against which country item(s) should each of the relevant offices (Prime Minister of Ukraine (Q1145714), Chairman of the Council of Ministers of the Ukrainian SSR (Q62113182), etc), be listed? --Oravrattas (talk) 16:52, 3 June 2020 (UTC)
I think in the case of Ukraine, it's probably best to consider Ukraine (Q212) to be a newly created sovereign state from when it declared or achieved independence from the USSR. The preceding entity was non-sovereign. List of prime ministers of Ukraine consists of several lists consisting of officeholders of several different positions, so it would be constructed from items like Prime Minister of Ukraine (Q1145714) and a newly created item for the prime minister of the soviet state (if the places have separate items, so should the institutions of the places). Likewise for Russia. Afghanistan is more interesting as a country/state which has changed names a number of times, with corresponding items in Wikidata, but which is probably better considered as a continuing state. But all this is just how I think it could be done: alternative proposals would be interesting. Ghouston (talk) 00:09, 4 June 2020 (UTC)
@Kam Solusar: can you (or indeed anyone else) remember (or point to) any of the issues that have come up before around where conflating the modern state with the abstract history of a country has been problematic? It would be good to have as wide an overview of this as possible when looking at any potential approaches. Do you have a view on whether the "work/edition" split, similar to that used on Wikidata:WikiProject_Books might help (where the abstract 'country' could be, like the 'work', the main target of the Wikipedia articles, but most of the data would relate to a specific 'version' of the country)? --Oravrattas (talk) 06:31, 4 June 2020 (UTC)
I think in any case, items are needed for the administrative entities, such as provinces and states, and when an entity changes from say a province to a sovereign state, then the items for each should be separate. For example, if Autonomous Region of Bougainville (Q18826) becomes independent some day, it would be preferable to create a new item for the new state instead of renaming that one. There will also be additional items for geographical concepts, such as islands, e.g., Bougainville Island (Q201766), and well-defined areas. Areas may have retained a name persistently despite being parts of different states historically, but the boundaries of such areas may not be very well defined. Ghouston (talk) 07:30, 4 June 2020 (UTC)
One potentially useful data point here is that France (Q142) does have a separate item for its current political incarnation: French Fifth Republic (Q200686). --Oravrattas (talk) 08:29, 6 June 2020 (UTC)
That's a good example of such an item that isn't being treated in Wikidata as a separate state. Ghouston (talk) 10:11, 6 June 2020 (UTC)
Unfortunately the history is much more inconsistent, though: French Fourth Republic (Q69829) is a regime, a period, and a historical country; French Third Republic (Q70802) is (preferred) a historical country, and secondarily a sovereign state, and a regime; French Second Republic (Q58326) is a historical country and a regime; French First Republic (Q58296) is a (preferred) historical country, and secondarily a sovereign state (no regime this time); and there are a few other items in-between those that are inconsistent with each other, never mind anywhere else. So the question is: is this a good example to follow elsewhere, and if so can we nail down a bit more what the hierarchy really is (e.g. should everything get back to regime (Q5589178)), and which item should which properties preferably live on (e.g. basic form of government (P122), legislative body (P194), office held by head of government (P1313) etc should live on the "regime" items, rather than the more abstract country ones)? Should we be resurrecting (or creating a successor to) Wikidata:WikiProject Countries to help sort all this out? --Oravrattas (talk) 12:16, 6 June 2020 (UTC)
It's not a bad idea, to help avoid double counting, having multiple versions of the same country apparently in existence at the same time. Ghouston (talk) 13:08, 6 June 2020 (UTC)
I've also just found Wikidata:WikiProject Historical Place which hasn't been particularly active for a while, but could be a useful place for deeper discussion on a lot of this. --Oravrattas (talk) 20:56, 6 June 2020 (UTC)
@Oravrattas: searching for "country" in this page's discussion archive will probably show various relevant discussions about this topic. I think it's a common problem with country of citizenship (P27) and historical persons that died long before the modern-day country was founded. Or with ancient events and things that stopped existing before the foundation of the modern country, mostly with properties like country (P17) or located in the administrative territorial entity (P131). As a German, I've come across problems with Germany (Q183) various times. Mostly because a unified German state didn't even exist until 1871 (when the German Empire (Q43287) was founded) and the current day country of Germany (the Federal Republic of Germany) was founded in 1949. But quite a lot of historical persons do have the statement country of citizenship (P27): Germany (Q183), like Heinrich Heine (Q44403) (died 1856), Arminius (Q68880) (died ca. 21 AD) or Richard Wagner (Q1511) (died 1883). Distinct items exist for past iterations (German Empire (Q43287), Weimar Republic (Q41304), Nazi Germany (Q7318) plus various territories occupied by Allies between 1945 and 1949), but the current FRG gets mixed together with the general history of the geographical/cultural area. And then there's West Germany (Q713750), but that's a whole other can of worms. Germany (Q183) has many statements referring only to the current country, like capital, heads of government/state, GDP, etc. while others like diplomatic relationships or flags seem to include everything back to the German Empire. And I vaguely remember that quite a lot of older stuff has already been cleaned up/moved. --Kam Solusar (talk) 18:28, 6 June 2020 (UTC)
I think using the item Germany (Q183) to represent the state of Germany back to the German Empire, with its various historical names and regimes, is a reasonable thing to do. It certainly simplifies the country of citizenship (P27) and country (P17) statements for this period. It would mean accepting that Germany remained a continuous state through the events following WW2 and losing and regaining the DDR, but you have similar issues with many other countries with occupation during WW2. But that means that the other items, such as West Germany (Q713750), shouldn't be instances of sovereign state (Q3624078), or country (Q6256), or subclasses of those. Ghouston (talk) 04:50, 7 June 2020 (UTC)
I suppose continuity between Nazi Germany and the Federal Republic of Germany is highly debatable, given the occupation and replacement of government, with the republic not founded until 23 May 1949. Ghouston (talk) 05:50, 7 June 2020 (UTC)
On the question of countries vs states vs governments, it does seem more correct to say that a state is a particular type of government, one that has sovereignty over a geographical area. So it may be better to say that Government of the United Kingdom (Q6063) is a state, and United Kingdom (Q145) is a geographical area controlled by a state (also a country). This would make it easier to have an item like Germany (Q183) remaining constant as various states and regimes come and go. Ghouston (talk) 06:45, 7 June 2020 (UTC)
That sounds really odd to me, but this might be as much of a language issue as a modelling issue. To me "the state" (as in "Smash the state!") can have that sort of meaning, but generally talking about somewhere being "a state" (as in a Member State of the United Nations) has a somewhat different meaning, and is quite distinct from the government of that state (with a "has a" rather than "is a" relationship). I also don't know how possible (or even sensible) it is to try to resolve all these sorts of issues right now. Are there even quite small steps we can start to take to try to improve the situation right now, and see how they go? I think there's generally quite widespread support for cutting down the size of existing country pages, which are starting to become quite unwieldy, and whilst that certainly shouldn't be a primary driving force for modelling decisions, it does seem like moving more in the direction of reducing the number of concepts those pages are about, and reducing the number of anachronistic statements on them would be helpful. What the instance of (P31) and subclass of (P279) hierarchies of the various items actually ends up being is much less important to me than getting some sort of consistency over which entries things like basic form of government (P122), office held by head of state (P1906), head of state (P35), office held by head of government (P1313), head of government (P6), legislative body (P194), diplomatic relation (P530), currency (P38), flag (P163), anthem (P85) etc should live on, particularly as we start to add significantly more historic information for these, and especially when that data predates the current regime/state/country/whatever. 
Given that we already have a split between France (Q142) and French Fifth Republic (Q200686) (and predecessors) (which are in turn distinct from Government of France (Q1450662) and specific instances of that such as first Philippe Government (Q29949210)) with quite a lot of duplication between them and inconsistencies amongst the historic versions, I think there are a couple of parallel sets of discussions/decisions necessary on (a) how to best arrange these and split data between them, and (b) whether/how to replicate that for other countries. --Oravrattas (talk) 14:19, 7 June 2020 (UTC)
Even in the context of member states of the UN, it's governments that sign up as members, not geographical areas. However, the UN does keep track of the borders of the areas controlled by its member states, and doesn't generally accept applications from new governments in areas already "allocated" to some other government (think Somalia, Iraq, Syria, Libya). This makes "countries" somewhat persistent, so that the country of Iraq was considered still to exist, with no change in borders, even when its government lost control of much of it to Islamic State (Q2429253). I don't think technical issues like the size of country pages should be considered, since it offers nothing to countries/states that never change their name, like the United States. Ghouston (talk) 04:00, 8 June 2020 (UTC)
en:Sovereign state gives a definition: "International law defines sovereign states as having a permanent population, defined territory, one government, and the capacity to enter into relations with other sovereign states", which comes from the 1933 Montevideo Convention. You can interpret this as being the combination of the territory, a government and its population, or as describing the government as a state as long as it controls an inhabited territory. In common usage, we have "smash the state" as you mentioned, as well as state institutions and heads of state, which are parts of government, and it seems hard to consider something part of a state unless it's part of the government. I may live in Australia, but I'm not part of the Australian state (not a citizen, in fact, but I'm not sure if that would make any difference). Ghouston (talk) 04:26, 8 June 2020 (UTC)
But anyway, I guess it would freak everybody out too much to consider attaching the "state" property to governments alone, and making something like Government of the United Kingdom (Q6063) a member of the UN. So we've got to conclude that items like United Kingdom (Q145) do represent states, and not just countries, and so should be connected to a single continuous government. The areas that they represent, in some cases, will be a theoretical area that the government controls, as recognized by the UN and other states, and not the actual area of control in practice. Ghouston (talk) 05:45, 8 June 2020 (UTC)
State (polity) also notes that “Speakers of American English often use the terms "state" and "government" as synonyms, with both words referring to an organized political group that exercises authority over a particular territory. In British and Commonwealth English, "state" is the only term that has that meaning, while "the government" instead refers to the ministers and officials who set the political policy for the territory, something that speakers of American English refer to as "the administration".” For added confusion, of course, "government" is also highly ambiguous, and is often used synonymously with the executive, e.g. in a government vs parliament split, even though they are both also referred to in other contexts as branches of government. And from what I can gather, the equivalent commonly used words for all these things in other languages also carry subtle, but often quite important, differences, especially in the nuances. So even though I don't really disagree with anything you've said there, resolving these sorts of hierarchies is certainly not going to be simple. But as I've said, that's not really my main question or issue. Even if item size doesn't matter, we still have the question of which properties should be used on France (Q142) and which on French Fifth Republic (Q200686) and its predecessors. --Oravrattas (talk) 08:13, 8 June 2020 (UTC)
There's also French State (Q3591845), which may be some kind of specific legal concept and not linked much in Wikidata, but which is still an instance of state (Q7275). As long as items like French Fifth Republic (Q200686) are considered to be regimes, not states, they don't need to be linked much in Wikidata, and its properties are likely to be a subset of what can be found on France (Q142). Ghouston (talk) 10:02, 8 June 2020 (UTC)
This seems to be going round and round in circles without really getting anywhere. Do you accept that there is a problem with how countries are currently modelled in Wikidata — particularly in relation to historic information about them — or do you think everything is mostly fine at the moment, other than some erroneous data entry that is fairly easily changed? --Oravrattas (talk) 16:30, 8 June 2020 (UTC)
Hmm, and I notice that others are wisely staying out of this discussion. Well, we can have items for states, items for regimes or historical periods, items for geographical areas, is anything missing? Maybe not, it's just a matter of data clean-up to make these items better reflect political history (whether states were continuing, successor, etc.), adding establishment / disestablishment dates, and to avoid situations where more than one item is representing the same state at any particular point in time. However, that can only be done on a case-by-case basis, one political transition at a time, and could be a lot of work. Ghouston (talk) 04:45, 9 June 2020 (UTC)
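The check described above, finding cases where more than one item represents the same state at the same time, could be sketched as an interval-overlap test. This is a minimal Python sketch with placeholder data: the item names and years are hypothetical and not taken from Wikidata, and the function name `overlapping_pairs` is made up for illustration.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# item -> (establishment year, disestablishment year or None if still existing).
# Hypothetical placeholder data, not real Wikidata values.
PERIODS: Dict[str, Tuple[int, Optional[int]]] = {
    "state-A": (1922, 1937),
    "state-B": (1937, None),
    "state-C": (1930, 1940),
}


def overlapping_pairs(
    periods: Dict[str, Tuple[int, Optional[int]]]
) -> List[Tuple[str, str]]:
    """Return pairs of items whose periods overlap.

    An end year is treated as exclusive, so one item ending in 1937 and
    another starting in 1937 count as a clean succession, not an overlap.
    """
    pairs = []
    for (a, (a_start, a_end)), (b, (b_start, b_end)) in combinations(
        sorted(periods.items()), 2
    ):
        a_last = float("inf") if a_end is None else a_end
        b_last = float("inf") if b_end is None else b_end
        if a_start < b_last and b_start < a_last:
            pairs.append((a, b))
    return pairs


conflicts = overlapping_pairs(PERIODS)
```

With the placeholder data, `state-C` overlaps both other entries, while `state-A` and `state-B` form a clean succession. Each reported pair would still need the case-by-case review mentioned above.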
Consider even cases that seem easy, such as Irish Free State (Q31747) renaming to Eire, or Ireland, in 1937.[8] No new state was created. The state wasn't disbanded and replaced. I'd describe it as a continuing state with a change of name and constitution, and the item Q31747 represents a regime or historical period, and the establishment date on Republic of Ireland (Q27) can be moved back. There's no need for Charles Haughey (Q333735) to have two countries of nationality. But making that change wouldn't be easy, since I don't know who may object to it, and a lot of items like Q333735 may need to be edited. Ghouston (talk) 05:25, 9 June 2020 (UTC)
Yes, it would definitely be good to get some input from more people, though as Kam Solusar noted above there's a long history of people raising problems with the current set-up, and little actually changing, so I think it's worth struggling through and finding at least a few actions that we can actually take. From my point of view, the data is significantly less valuable if it can't be queried, and I think by now we should really be capable of replicating lists like those at List of state leaders in 1903 in a consistent manner. I agree it's unlikely to be a quick or easy process to actually fix up all the data, but unless and until we get some agreement on what we're even aiming for, the quantity of duplicated, inconsistent, and anachronistic statements will just continue to grow, and people won't be confident enough in what we should be doing to correct it (or, indeed, multiple people will correct it in contradictory ways).
I definitely agree that country of citizenship (P27) on Charles Haughey (Q333735) is a good example of the sort of problem we're facing, and needs to be cleaned up, but I do also fear slightly that there are parallel long-running issues around nationality that could easily derail this entire discussion, so for now I'd prefer to focus on properties that should hopefully be less contentious. So let's look at office held by head of government (P1313). That seems to me to be directly connected to the regime, and it's at changes of regime that something like this tends to change, so to me it makes more sense to have separate statements on French Fifth Republic (Q200686) and on each of its predecessors (etc, back through the historical regimes), than to have those on France (Q142) with valid in period (P1264) qualifiers. I'm open to persuasion, though, as to why it should be the latter, and that we should move them off any regime items instead: my primary goal here is having a consistent approach that we can document, and write useful queries for. --Oravrattas (talk) 05:15, 10 June 2020 (UTC)
My two cents: I think the key here is to separate out (sticking with English terms here) "territory" and "regime"; also: "empire", given that we are interested in a long view of history, and in many cases that is a necessary concept. The English "country" and "state" are too ambiguous to be useful.
A given place can be in more than one (overlapping or nested) territory, and at a given time a territory can have full sovereignty, limited sovereignty of various types, or no sovereignty.
I think that the bulk of situations can be modeled with these concepts. A tricky part is how to model the shifting of borders: e.g. Alsace-Lorraine moving back and forth between France and Germany; Norway calving off from Sweden or Zanzibar uniting with Tanganyika; the splitting of several countries shortly after World War II (Germany, Austria, Vietnam, Korea) and (in most cases) their later reuniting, etc.
Citizenship is even trickier, though: how do we model Jews not being citizens in 19th-century Romania, or Native Americans in the U.S. in the same period, or the Rohingya today in Myanmar? - Jmabel (talk) 16:01, 10 June 2020 (UTC)
I don't think that changing borders is necessarily a difficult issue: borders can't be represented directly in Wikidata, you'd just have a link to an external shape file or something, and that can have start and end dates. The different degrees of sovereignty is something that needs to be sorted out: we should be able to make a list, for any point in time, of the different entities that existed at that time, adjusting whether dependent states are included. Such as the dominion of the British Empire (Q223832) in the British Empire, and whether we should have separate items for dominions and their subsequent sovereign states. Was being a dominion of the British Empire equivalent to being a member of the European Union, or was it significantly different in some way? At least we can take a look at France, where such issues may not arise. Ghouston (talk) 01:29, 11 June 2020 (UTC)
An interesting thing about Dominion of New Zealand (Q2594990) and some others is that you couldn't be a citizen of it, just a British citizen / subject. Yet people were still described as "New Zealanders", back into the 19th century, as an informal grouping without definition, like "Tasmanians" today. New Zealand citizenship didn't become possible until 1 January 1949, although Statute of Westminster Adoption Act 1947 (Q7604668), which made New Zealand more independent and basically ended the dominion status, was passed on 25 November 1947. Ghouston (talk) 01:50, 11 June 2020 (UTC)
"The land was ours before we were the land's."
Question about NZ: pre-1949, were the Maori non-citizens, or British citizens? - Jmabel (talk) 15:23, 11 June 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I'd like to keep "country of citizenship" issues slightly separate for now because they're a bit of a minefield. They'll come up regularly, and need to be sorted out at some stage, so obviously the discussion should still continue, but resolving those issues seems like it's going to be an ongoing problem independently of the more general modelling, so I'd like to separately check whether we have at least a rough consensus on standardising around "territory" vs. "regime" (with lots of nuances still to resolve), or whether there are any objections to this approach. As there will be a lot of detail to work out, I think we also want to move the detailed discussion to a WikiProject. Wikidata:WikiProject Countries seems like the most obvious home, despite the problems with "country" as a term. However, it's currently marked as 'complete'. Is it OK to re-open that again, or should we find something else? --Oravrattas (talk) 17:23, 11 June 2020 (UTC)

I would say from the discussion here that it is not just OK to reopen it, it is essential. - Jmabel (talk) 22:26, 11 June 2020 (UTC)
I'm not sure what that project was about, but it was completed. I suggest trying to break up this open-ended discussion into smaller topics, e.g., how to handle France can be discussed at Talk:Q142 and generic questions about how states should be represented in Wikidata could be discussed at Talk:Q7275 or some other place (with pointers from Project chat as required). Ghouston (talk) 01:18, 14 June 2020 (UTC)
One concern with moving the discussion to each country in isolation is that that could solidify even further the current situation, where each country is modelled differently, and the data becomes even more difficult to query unless you already know exactly which modelling choices have been taken for each country. This seems to me to be such a core area that we need at least some degree of consistency of approach so that we can document a standard approach to various types of queries, without losing the ability to get the deeper nuance of a single country where that's important. --Oravrattas (talk) 08:09, 18 June 2020 (UTC)

Other people getting 'exceeded limit' error messages too?

Since this morning (CEST) I very regularly (think once every 5-10 manual edits) get an error message while saving - 'Could not save due to an error. The save has failed. As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes.' I'm only doing manual edits at the moment, no batch stuff. Does this happen to anyone else as well, and can I do something to fix this? Cheers, Spinster 💬 18:38, 13 June 2020 (UTC)

The message is MediaWiki:Actionthrottledtext:
"""As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit.

Please try again in a few minutes."""

in your interface language. The limit should be 90 edits per minute for you so this sure is weird. I wonder if other people have this too. Multichill (talk) 19:27, 13 June 2020 (UTC)
Not with manual edits but I have had this with QuickStatements at around 36 edits per minute. QuickStatements sometimes attempts the same edit more than once and occasionally creates duplicate statements - are there also attempted edits that result in no change and are not recorded in the page history and would they be counted? Also are there limits to the number of edits to the same page? Peter James (talk) 22:54, 13 June 2020 (UTC)
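For anyone wondering how a per-minute limit such as the 90 edits per minute mentioned above could be evaluated, here is a minimal Python sketch of a sliding-window counter. It is only an illustration: the class name and the window logic are assumptions made for this example, and it does not describe how the server-side throttling is actually implemented, nor whether failed or no-change attempts are counted.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical throttle: allow at most `limit` actions per `window` seconds."""

    def __init__(self, limit=90, window=60.0):
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of the actions counted so far

    def allow(self, now):
        """Return True and record the action if it still fits in the window."""
        # Forget actions that have dropped out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=90, window=60.0)
# 90 actions spread over 45 seconds all fit...
results = [limiter.allow(i * 0.5) for i in range(90)]
# ...but a 91st action inside the same minute is refused.
blocked = limiter.allow(45.0)
# Once the oldest action is a full minute old, there is room again.
allowed_later = limiter.allow(60.0)
```

Under a model like this, a manual editor would have to save far more often than once every few seconds to be refused, which is why the reports in this thread look surprising.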
The message seems to come from an AbuseFilter. Maybe one of them has been set up in a too strict way. @Matěj Suchánek: does that ring a bell? Lea Lacroix (WMDE) (talk) 13:30, 16 June 2020 (UTC)
Maybe Special:AbuseFilter/history/121 by @Pintoch, DannyS712: who flags every OpenRefine edit. [9][10] --- Jura 13:34, 16 June 2020 (UTC)
These edits don't seem to be recent though. --- Jura 13:37, 16 June 2020 (UTC)
Setting up abuse filters was the only solution available to tag all edits made by a tool which does not use OAuth, before Lucas Werkmeister (WMDE) implemented support for tags in Wikibase editing actions. Thanks to that, OpenRefine 3.4 and above do not use abuse filters to tag their edits anymore. I do not think this is related to Spinster's issues, but if anyone has issues with these filters being triggered for every OpenRefine edit they make, they should just upgrade OpenRefine to 3.4 or above. Which also lets them configure the maxlag setting used by the tool, by the way (the default value being still 5). In my experience, the latter is the main obstacle to getting edits published. − Pintoch (talk) 14:01, 16 June 2020 (UTC)
No, this seems to have nothing to do with abuse filters. It is server-side throttling. Matěj Suchánek (talk) 07:56, 18 June 2020 (UTC)
@Lea Lacroix (WMDE): can you provide some data to support your claim? --- Jura 08:09, 18 June 2020 (UTC)
Abuse filter was my first suspect. That was indeed wrong. I've filed phabricator:T255804 to investigate further. --Lydia Pintscher (WMDE) (talk) 17:55, 18 June 2020 (UTC)

position held

Hello. Suppose a person is elected president (or mayor, etc.) in two consecutive elections. Which structure is best?

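First Structure: a single position held (P39) statement covering both terms.
Second Structure: two position held (P39) statements, one per term.
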
Xaris333 (talk) 14:07, 14 June 2020 (UTC)

People have used both models, but I would recommend the second, as it allows a 1:1 relationship of elections to terms and makes the resulting queries a bit cleaner. Andrew Gray (talk) 15:09, 14 June 2020 (UTC)
  • Most presidents use the first model. It's cleaner in queries and integrates better with Wikipedia. --- Jura 15:13, 14 June 2020 (UTC)

  • The second is more complete. If integration with Wikipedia is better with the first, then I think there is a problem with the Wikipedia-template, not with the second structure. Eissink (talk) 16:04, 14 June 2020 (UTC).
    • The second somewhat artificial structure suggests there is an interruption between the terms. This may be appropriate for some offices in some countries, while it's not in others. --- Jura 16:45, 14 June 2020 (UTC)
We inevitably have to have separate terms if anyone has disconnected periods of office (the Grover Cleveland problem), so the queries and Wikipedia integration will always have to cope with them in some cases. My feeling has always been that they're both valid approaches; it's reasonable to start with consolidated terms and split them up into separated terms later on, as and when it becomes necessary to express that extra level of detail. Once you get to the point of using parliamentary term (P2937) and elected in (P2715) qualifiers, that's a good sign that separated terms might be valuable. Andrew Gray (talk) 17:21, 14 June 2020 (UTC)
We can't use parliamentary term (P2937) for presidential system (Q49892) (for the president). Xaris333 (talk) 17:32, 14 June 2020 (UTC)
Definitely - sorry, didn't mean to imply you should if they weren't relevant. (In some countries they presumably do align). Andrew Gray (talk) 17:36, 14 June 2020 (UTC)

So, every user chooses the structure he/she thinks is better? Xaris333 (talk) 21:59, 15 June 2020 (UTC)

@Xaris333: It's better to say that you choose the level of detail that you think is appropriate, and items can get better-modelled over time. So for a simple entry, we can just say P39:xyz, started 1993, finished 2003. But once you start wanting to add more detail, like "elected in" qualifiers, or electoral terms if they exist, then I would strongly recommend splitting it up into multiple P39 statements. Otherwise, things will get very confusing and hard to query if you try and model all those different qualifiers in a single statement. Andrew Gray (talk) 21:32, 17 June 2020 (UTC)
@Andrew Gray: thanks. I usually add more detail, like "elected in" qualifiers, or electoral terms if they exist so I will use the second structure. Xaris333 (talk) 16:20, 18 June 2020 (UTC)
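To illustrate why splitting helps once "elected in" qualifiers are added, here is a small Python sketch. It uses a simplified stand-in for a position held (P39) statement; the class, the function and the sample data are hypothetical and only borrow the 1993/2003 dates from the example above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PositionStatement:
    """Simplified stand-in for one position held (P39) statement."""
    start: date
    end: date
    elected_in: list = field(default_factory=list)  # labels of elections


def term_for_election(statements, election):
    """Return (start, end) of the term won in `election`, or None when the
    statements cannot answer that unambiguously."""
    matches = [s for s in statements if election in s.elected_in]
    if len(matches) == 1 and len(matches[0].elected_in) == 1:
        return (matches[0].start, matches[0].end)
    return None


# Illustrative data: one office holder, two consecutive five-year terms.
consolidated = [
    PositionStatement(date(1993, 3, 1), date(2003, 2, 28),
                      ["1993 election", "1998 election"]),
]
separated = [
    PositionStatement(date(1993, 3, 1), date(1998, 2, 28), ["1993 election"]),
    PositionStatement(date(1998, 3, 1), date(2003, 2, 28), ["1998 election"]),
]

ambiguous = term_for_election(consolidated, "1998 election")
precise = term_for_election(separated, "1998 election")
```

With the consolidated statement there is no way to tell which dates belong to which election, so the lookup gives up; with separated statements each election maps to exactly one term.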

Start time and end time for positions in infoboxes

See for example: Commons:Category:Cory Booker, where the dates for the various positions he held are separated by a comma instead of a dash: (1900, 2000) instead of (1900-2000). This usage is wrong: the comma indicates holding the position two times, once in each of the two years, with no continuity in office. The dash, like the one we use in birth and death dates, indicates continuity. --RAN (talk) 16:30, 16 June 2020 (UTC)

This looks like it's an issue with the code on Commons infoboxes - probably best to raise it on the template talkpage there. Andrew Gray (talk) 16:54, 16 June 2020 (UTC)
At some point, someone added countless additional statements to items like Q1135767 making infobox uses at WMF projects mostly impractical and existing uses unreadable. --- Jura 11:08, 17 June 2020 (UTC)
@Jura1: For parliament members' positions we prefer the second structure, and we can also use parliamentary term (P2937). But doesn't that structure also suggest there is an interruption between the terms? In my opinion, since the start date of the second statement is the day after the end date of the first statement, the structure does not suggest that there is an interruption between the terms (in all cases: presidents, parliament members etc). Moreover, if the second structure is problematic for Wikipedia templates, that applies also to parliament members' positions, where we are supposed to use the second structure. Xaris333 (talk) 21:02, 17 June 2020 (UTC)
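The "next day" rule described here can be checked mechanically, which is one way an infobox could decide between a dash and a comma. The following Python sketch is an assumption about how that might be done, not a description of the actual Commons infobox code; the function names and dates are made up for illustration.

```python
from datetime import date, timedelta


def merge_contiguous(terms):
    """Merge (start, end) date pairs when a term begins no later than
    the day after the previous one ended."""
    merged = []
    for start, end in sorted(terms):
        if merged and start <= merged[-1][1] + timedelta(days=1):
            # Continuous with the previous term: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def format_terms(terms):
    """Render each continuous period as 'YYYY-YYYY', separated by commas."""
    return ", ".join(f"{s.year}-{e.year}" for s, e in merge_contiguous(terms))


# Illustrative dates only, not taken from any real item.
back_to_back = [
    (date(2008, 3, 1), date(2013, 2, 28)),
    (date(2013, 3, 1), date(2018, 2, 28)),
]
with_gap = [
    (date(2001, 1, 1), date(2004, 12, 31)),
    (date(2009, 1, 1), date(2012, 12, 31)),
]
continuous = format_terms(back_to_back)
interrupted = format_terms(with_gap)
```

Here the back-to-back terms collapse into one dashed range, while the terms with a gap stay separate and are joined by a comma, so the separated structure need not imply an interruption in the displayed output.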

Verifiability question about source of lean-to shelters

Hi. In Sweden there is a huge number of lean-tos in nature. Often these are built and maintained by the municipalities or some local civil organisation. This data has not yet been released as open data by most of the municipalities. I found data collected by individuals and published in a closed Facebook group as well as online[11][12] and obtained it under CC0; the pictures were released under CC-BY.

Now my question is: is it ok to import this data into Wikidata based on these references?

These sources have several problems: they lack unique identifiers, and they are not a particularly authoritative source. On the other hand, if we have to wait for the municipalities to release this data, it may take years before all 290 municipalities release it in a way we can reliably refer to.

This would really help OpenStreetMap also, because Wikidata would be the sole source of a unique identifier and because it is much better suited to model the details of the object, like I did here for example: https://www.wikidata.org/wiki/Q96278337 (this has way more data than I would be able to input into OSM with the current tagging system: floor area, whether sheltered on the front, images, what type of lean-to (in OSM there is only one)). I could of course put in a lot of energy and try evolving the tagging system of OSM, but it is honestly not as flexible or as powerful as Wikidata's data structure.

See another example item Q96279866 where the campsite has a name but the lean-to does not and is a separate object with a separate coordinate and details like floor area.

Importing this data would enable data consumers to do much more detailed analysis of e.g. a hiking route and the possibilities for sleeping under a roof along the way.--So9q (talk) 07:57, 16 June 2020 (UTC)

  • How many items are we talking about? Is there maybe a page of the municipalities that can be linked with described at URL (P973)?
If I understand correctly, you can have an external ID towards OSM. In addition you might have stated in (P248) for an item that represents the Facebook group. ChristianKl08:57, 16 June 2020 (UTC)
  • There are about 1000 lean-to shelters in Sweden. I don't know about an ID towards OSM, because OSM IDs are not stable unless the object is a relation (and shelters never are). P248 sounds like a good idea to add too.--So9q (talk) 11:00, 16 June 2020 (UTC)
  • Removed from where? I think the data can be kept up to date and that Wikidata is a good place to aggregate it. I would prefer if the municipalities would publish the locations of the campsites like this (with a free license, which this currently lacks). I plan to advocate inside OSM for use of Wikidata to add details and then sync to OSM as much as the community wants. It is not possible to do the reverse because of the license, and honestly not that much detail about the average lean-to shelter currently exists in OSM. You are lucky if you find a picture or anything other than the position. Names very seldom have sources and might be completely made up by the contributor.--So9q (talk) 11:00, 16 June 2020 (UTC)
  • Removed from physical reality ;) I think 1000 shelters would be okay via QuickStatements with the above stated in (P248) sourcing. Ideally, the people in the Facebook group could be convinced that it's nice to have this data in Wikidata and contribute to it. ChristianKl11:46, 16 June 2020 (UTC)
  • OK, I understand. They are sometimes burned to the ground but usually stay for 30 years or more. I could advocate in the associations that keep them up and functioning that they update Wikidata.
Most people in the group have learned the Facebook interface and not much else, it seems, and to them it is much easier to put up a post there than to add a Wikidata object for the campsite, add an object for a lean-to, add an image on Commons with all its (annoying) questions and then link the two objects to the relevant photos. This is way too cumbersome and hard for the average joe. I could make a mobile Cradle-like app that makes it easier to contribute and automatically creates items, uploads the photos etc., and that would probably work way better, but then they argue that Google Maps layers are way easier to navigate than a WDQS query output map. In the end they just keep posting, and one person puts it into a layer in Google Maps and puts the link in the group description, quick and dirty!
@ChristianKl:Does this all mean that we are good to go importing more campsites and shelters as long as a "stated in" is provided? :D--So9q (talk) 13:01, 16 June 2020 (UTC)
When it comes to low-resolution items like this, I think a key concern is whether data can be kept up-to-date. If some of those burn down and then that status isn't marked in Wikidata nobody can really rely on the data in Wikidata and the set would be worthless. It sounds to me like there's a reasonable belief for this data-set that data will be kept up-to-date.
As far as I'm concerned we are good to go but maybe wait a week to see whether someone besides me sees things differently.
One of the great features of Wikidata is that its data can be easily reused. The Google Maps layer could automatically be updated based on Wikidata. ChristianKl13:39, 16 June 2020 (UTC)
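As a rough idea of what such reuse could look like, the Python sketch below converts query-result rows into GeoJSON, a format many map tools can read. It assumes the rows come from the Query Service with coordinate location (P625) values as WKT literals of the form "Point(longitude latitude)"; the row keys, the function name and the sample rows are hypothetical, and whether a particular Google Maps layer can be refreshed automatically from such a file is not something this sketch demonstrates.

```python
import json
import re

# Assumed input format: WKT point literals such as "Point(18.07 59.33)",
# with longitude first.
WKT_POINT = re.compile(r"^Point\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)$")


def to_geojson(rows):
    """Convert rows with 'item', 'label' and 'coord' keys into a GeoJSON
    FeatureCollection; rows with unparseable coordinates are skipped."""
    features = []
    for row in rows:
        match = WKT_POINT.match(row["coord"])
        if match is None:
            continue
        lon, lat = float(match.group(1)), float(match.group(2))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"item": row["item"], "name": row["label"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up rows standing in for query results.
rows = [
    {"item": "Q_EXAMPLE_1", "label": "Example shelter",
     "coord": "Point(18.07 59.33)"},
    {"item": "Q_EXAMPLE_2", "label": "Broken row", "coord": "unknown"},
]
layer = to_geojson(rows)
geojson_text = json.dumps(layer, ensure_ascii=False)
```

Regenerating a file like this on a schedule would keep a map in step with the items, provided the items themselves are kept up to date.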
One aspect I notice when looking at the current Google Maps layer is that it only includes lean-tos in Sweden. Does Norway have them as well? If so, that might provide a benefit over the status quo if you autogenerated one layer in Norwegian and another in Swedish and both showed all the lean-tos. ChristianKl13:47, 16 June 2020 (UTC)
Thank you very much for taking the time to answer thoroughly. Now I understand the concerns. There are 4000+ members in the group and the need for this exists in multiple countries, so I feel quite sure that they could be persuaded to upload to WD instead of to a hidden layer in Google, for the benefit of everyone. I made this statistic detailing lean-tos per 1000 km2 a few days ago from OSM data, but there are probably many more (in Sweden) out there yet to be mapped/cataloged.--So9q (talk) 19:07, 16 June 2020 (UTC)
I thought a bit more, and I think it could be beneficial to have a WikiProject for lean-tos. That project could also be in Swedish if that's desirable to this community. Then the WikiProject can host both a map and a list of all the lean-tos. The list is beneficial because backlinks show an admin who might think about deleting an item that the item gets used. ChristianKl09:01, 17 June 2020 (UTC)
Note that this kind of data may be well suited to https://www.openstreetmap.org/, which is better suited for maintenance of geographic data, if it is rejected here (or in addition to Wikidata entries). Such objects are certainly welcome in OSM, though imports need to pass a higher bar than on Wikidata. Mateusz Konieczny (talk) 09:14, 17 June 2020 (UTC)
OpenStreetMap might be willing to host data, but it's closed up and you can't update a Google Maps layer automatically based on OpenStreetMap data. ChristianKl09:52, 17 June 2020 (UTC)
I feel this is a problem, because when we have a Wikidata tag on an OSM object we add so much value to that OSM object... maybe the future is that OSM also supports linking to Wikibase and other linked data sources. I guess we will see the same "problem" with structured data on Commons: when it gets more "mature" we would probably like to add more linked data, describing the picture, that is not "good enough to be in Wikidata". Yesterday I listened to a Europeana project that is trying to add metadata to pictures; maybe they will have their own knowledge graph that we can link to from a Wikimedia Commons picture when Wikidata does not contain what we want to describe.
cc: Question @Abbe98: where does Europeana see linked data limitations... and what plans are there to have a knowledge graph curated by museums good enough for describing details in pictures? - Salgo60 (talk) 10:01, 17 June 2020 (UTC)
I doubt it would be a good idea for Europeana to have a bunch of random self-promotional items for businesses sprinkled around its database in an inconsistent manner. So "accept everything" likely wouldn't be a good policy for it either. ChristianKl14:40, 17 June 2020 (UTC)
OT @ChristianKl: the challenge I see for a network like Europeana is that more than 3000 museums would have to start communicating about something as difficult as metadata and agree on it. When I populated Europeana entity (P7704) we realized that today they have problems coordinating between two museums when both have the same artist; see blogpost. When I read the Europeana 2020-2025 strategy I feel they have given up: "The lack of sufficient high-quality content and structured descriptive metadata highly affects the access to and visibility and reusability of digital content". They hope AI will fix it... I feel Europeana and Wikipedia are a perfect match, but the Europeana people need to be more "visible"; see the feedback I got. I feel they are more academics than doers willing to do the dirty work ;-) .. - Salgo60 (talk) 19:54, 17 June 2020 (UTC)
As far as I know there are no such plans at Europeana. I'm not sure where you have heard such a thing but I don't think it's from Europeana. Abbe98 (talk) 19:07, 18 June 2020 (UTC)

Find items without a value for a property?

Hi! With the Query Service, how can I do a search for items without a property?

I can do this:

SELECT ?human ?humanLabel
WHERE
{
  ?human wdt:P31 wd:Q5.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}

to find items that have human (Q5) as the value for instance of (P31).

How can I search for items that do not have a statement of a certain property, for example items without a value for date of death (P570) to find just living people? I'm having trouble finding this in the guides on Wikidata.

Thanks, DemonDays64 | Talk to me 17:44, 16 June 2020 (UTC) (please ping on reply)

@DemonDays64: To find items that don't have any kind of P570 claim, you can use "filter not exists":
filter not exists { ?item wdt:P570 ?died }
To find items which have P570 set to the special unknown value, like John Bingham, 7th Earl of Lucan (Q371406) (ie there is a P570 statement, but it says "don't know"), you can use:
?item wdt:P570 ?died . FILTER isBLANK(?died)
Note that the "filter not exists" approach will time out on very broad queries like your "all people" one. This query demonstrates how to use them both - it searches for all people with Oxford Dictionary of National Biography ID (P1415), who by definition are all deceased, and reports whether they have a blank death-date, or an "unknown" death date. For various reasons I don't quite understand, "unknown" dates are all reported as things like t1977580164.
It is also possible to set claims to be "no value" rather than "unknown" - this explicitly says "there shouldn't ever be a value for this". In theory you should not find any P570/P569 claims using this statement, as it doesn't make sense for birth/death dates, but if you wanted to search for it you could use a syntax like
?human rdf:type wdno:P570
There are currently two of those errors, so I will go and fix them :-) Andrew Gray (talk) 19:44, 16 June 2020 (UTC)
@Andrew Gray: thanks a ton!! I now have made a query to find all US senators without Twitter accounts, which works great using your advice:
SELECT ?human ?died ?twitter ?born
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P39 wd:Q4416090.
  ?human wdt:P569 ?born.
  filter not exists { ?human wdt:P570 ?died. }
  filter not exists { ?human wdt:P2002 ?twitter. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}
Thanks so much! DemonDays64 | Talk to me 21:42, 16 June 2020 (UTC)
@DemonDays64, Andrew Gray: You should use the function wikibase:isSomeValue instead of isBLANK to find somevalue (unknown values):
 ?item wdt:P570 ?died . FILTER wikibase:isSomeValue(?died)
That is because it is planned to migrate the internal representation of these special values from blank nodes to special URIs to make updates easier as detailed in phab:T244341. --Dipsacus fullonum (talk) 08:06, 17 June 2020 (UTC)
@Dipsacus fullonum: Wikidata:SPARQL query service/queries/examples has a few example queries using isBlank. Presumably these should also now be migrated to isSomeValue? --Oravrattas (talk) 09:15, 17 June 2020 (UTC)
@Oravrattas: Yes, I think so and updated two of the queries. --Dipsacus fullonum (talk) 10:43, 17 June 2020 (UTC)
This is really helpful, thanks. I can never remember what code to use so I look it up in the examples each time :-) Andrew Gray (talk) 22:15, 17 June 2020 (UTC)

Don't display a column for a variable

(pinging @Andrew Gray: who gave such good help)

Hi! I have another kind of related question about the query service. How can I not display a column?

For instance, my query uses filter not exists { ?item wdt:P2002 ?twitter. }, which makes it so that there is a column for Twitter usernames; however, as you could predict, this column is always going to be empty, so it just takes up space.

Is there a way to declare a variable but not have a column be rendered for it? Thanks, DemonDays64 | Talk to me 04:09, 18 June 2020 (UTC) (please ping on reply)

@DemonDays64: simply binding ?twitter like that doesn't add a column in the output for it - for that you have to also include ?twitter in the list following your initial SELECT. So to remove it from the output, you can just remove it from there:
SELECT ?human ?humanLabel ?born
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P39 wd:Q4416090.
  ?human wdt:P569 ?born.
  filter not exists { ?human wdt:P570 ?died. }
  filter not exists { ?human wdt:P2002 ?twitter. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}
Note also that if you're explicitly not wanting to ever use the ?twitter or ?died, like here, you can use the "unique blank node" syntax ([]), like this:
SELECT ?human ?humanLabel ?born
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P39 wd:Q4416090.
  ?human wdt:P569 ?born.
  filter not exists { ?human wdt:P570 [] }
  filter not exists { ?human wdt:P2002 [] }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}
--Oravrattas (talk) 06:33, 18 June 2020 (UTC)
@Oravrattas: oh thanks! I assumed you had to declare all variables in the SELECT (maybe I should read the docs more thoroughly lol). DemonDays64 | Talk to me 06:41, 18 June 2020 (UTC)

Copyright question

Is it explained/documented anywhere what kind of content is allowed on Wikidata from a copyright point of view?

I was unable to find out:

(1) Is Wikidata respecting or ignoring the advice in https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights?
(2) Is Wikidata officially forbidding or allowing the import of copyrighted databases?
(3) Is Wikidata data supposed to be free of copyright and copyright-like restrictions under the US legal situation?
(4) Is Wikidata data supposed to be free of copyright and copyright-like restrictions under the EU legal situation?
(5) Where is Wikidata on the spectrum from OpenStreetMap/Wikimedia Commons ("we respect copyright, even stupid parts - we are not an experiment in copyright breaching"), through the Internet Archive ("we ignore copyright where it is stupid according to our opinion"), to a "Pirate Bay warez site"? Is "it is copyrighted data" a valid reason for removal?

Question triggered by https://www.wikidata.org/wiki/Wikidata:Requests_for_deletions#Q76939332 that in turn was triggered by a copyright complaint on another open data project.

Yes, I am aware of https://phabricator.wikimedia.org/T193728 that went nowhere.

Mateusz Konieczny (talk) 09:09, 17 June 2020 (UTC)

I think there are a bunch of grey areas. In the case of data that's clearly licensed as share-alike by an entity outside of Wikimedia like OSM I consider copyright a good reason to remove the data. ChristianKl 09:42, 17 June 2020 (UTC)
As I understand it, WD definitely intends to respect database copyright. However, there's some grey area when it comes to individual facts and identifiers. When I first joined WD and wasn't as versed in this as I now am, I started importing data from The Database of British and Irish Hills (Q61667995), which is released under CC BY 3.0. I've since stopped until I feel confident that it wouldn't be breaching copyright to do so. However, I've actually spoken to a representative of the editorial team of this database in the past (about importing data into OSM) and he proved to be quite insightful on the topic:
There is greater clarity in the more litigious US, where it has been established in US copyright law that most data are considered "facts", i.e. belonging to the domain of knowledge (a public benefit), and therefore not copyrightable. This was clarified in an addendum that reads “In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.” The effort in obtaining the data is irrelevant. In essence, observational and experimental data are “facts” that are free to be shared and reused without copyright restriction. The only data that are copyrightable are those containing what the US calls expressive choice, such as photographs, drawings, graphs or visualisations (but you would be free to make your own drawing from a photograph or a map, as Wainwright did). There is nothing to contradict this in the UK Copyright and Patents Act, which specifies "Any literary, dramatic, design, musical or artistic work". The requirement in both the US and Europe is creativity. So the way data is structured (e.g. the grouping of hills into Catchments and Watersheds in the DoBIH) might be copyrightable.

Database right, which is applicable to the DoBIH in the EU, is something different. It protects the structure of a database, provided there is sufficient intellectual creativity (dumping data into a spreadsheet is not enough!), but not on its own the data inside it. However, database right protects against the abstraction of a substantial proportion of a database without permission.

There are certainly grey areas! There have been some legal battles, particularly in the US, around algorithms and software. The main arguments seem to be around the minimum level of creativity required. As a statistician and ex-scientist I would argue that many statistical analyses involve a considerable element of creativity. However it appears statistics are not copyrightable unless a substantial subjective element was involved in their creation, i.e. that they cannot be reproduced by someone applying standard methods to the data. It can be difficult to obtain impartial advice, especially from publishers who will generally interpret copyright law in the way most favourable to themselves. The music industry is particularly bad in this respect.
I would definitely like to see clarification on what regional copyright distinctions WD intends to respect (such as how Commons generally needs a file to be free in both the US and its origin country). --SilentSpike (talk) 10:32, 17 June 2020 (UTC)
An EU website describes the right as: 1. the right to prevent the extraction of either all or a substantial part of the database. There the key question would be what substantial means.
A legal article suggests that in addition there needs to be "substantial investment in obtaining and verifying data".
When it comes to having a clarification it's quite tricky because of the separation of concerns between the Wikidata community on the one hand and WMDE and WMF on the other hand. In some sense copyright is a content matter and would fall under the concern of the community. On the other hand it involves complicated legal issues. Maybe we should have an RfC stating that the Wikidata community welcomes WMDE to provide a copyright policy? @Lydia Pintscher (WMDE): how would you see such an RfC? ChristianKl 12:56, 17 June 2020 (UTC)
@ChristianKl: I'm not sure I understand completely so let me try to clarify: You're asking if WMDE would be able to write a policy to help the community decide what's ok and what is not ok to import into Wikidata? And if I would be ok with an RfC to basically ask WMDE to do that? If that's what you're asking then I'd need to check with our legal people what they think about this task :D Does anything comparable exist for another Wikimedia project or another project so I have some examples to make it concrete? --Lydia Pintscher (WMDE) (talk) 18:23, 18 June 2020 (UTC)
I am biased here as an OSM mapper, but I think that the "substantial investment in obtaining and verifying data" test is met for the OSM dataset or its subsets. And as the Wikidata community is unable to coordinate "we copy this tiny subset but nothing else", copying from OSM would be out as long as database rights are supposed to be respected. What "substantial part" means is, I suspect, not clear even for real lawyers. But I am pretty sure that "location of 1700+ objects" is likely to count as "substantial". Mateusz Konieczny (talk) 14:51, 17 June 2020 (UTC)
A website of the EU Commission suggests what substantial means: « Substantial part » is usually interpreted as around 10% of the whole database (which, in case of a huge database, can be quite a lot of data). Given that OSM has around 6 billion nodes, 1700 doesn't seem to be a substantial number; it is orders of magnitude away from the 10%. ChristianKl 16:45, 17 June 2020 (UTC)
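To make the orders-of-magnitude comparison above concrete, here is a minimal arithmetic sketch. It only uses the rough figures quoted in this thread; note that OSM nodes and "objects" are not necessarily the same unit, and the 10% figure is a rule of thumb rather than a legal test, so this says nothing about how a court would read "substantial".

```python
# Rough figures quoted in the comments above; none of them are exact.
OSM_NODES = 6_000_000_000      # "around 6 billion nodes"
IMPORTED_OBJECTS = 1_700       # "location of 1700+ objects"
THRESHOLD_PERCENT = 10         # "around 10% of the whole database"

# Number of records that the 10% rule of thumb would correspond to.
threshold_count = OSM_NODES * THRESHOLD_PERCENT // 100

# Share of the database that the import represents.
share = IMPORTED_OBJECTS / OSM_NODES

print(f"10% of the database: {threshold_count:,} records")
print(f"imported share: {share:.7%}")
print(f"threshold / import: {threshold_count / IMPORTED_OBJECTS:,.0f}x")
```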
@Mateusz Konieczny: To directly answer your questions as far as I know: (1) Yes we (should) respect the WMF legal statement, (2) Wikidata officially forbids the full import of any databases which are not provided under CC-0 or similar public domain rights. However, partial imports of data which is judged "not copyrightable" may occur. An example would be dictionaries and lexemes where the logical data (part of speech, inflections, etc.) may be imported, but the definitions which can be considered "creative" should not be, unless some CC-0 compatible arrangement is available. Also "non-substantial" imports may be done as noted by others here. (3) and (4) Wikidata data should be free of copyright in the US and EU. (5) Wikidata provides data under CC-0, which Commons and OSM generally do not; otherwise Wikidata should be the same as Commons as another Wikimedia platform. ArthurPSmith (talk) 18:55, 17 June 2020 (UTC)
This is helpful but still does not answer all questions about database subsets: 1) can we assume that a full import of all identifiers is always possible (i.e. only linking to the external database through external ids)? 2) is an external ontology DAG (i.e. only the hierarchy and the names without further information) copyrightable? These two questions are the most important IMHO. --SCIdude (talk) 08:34, 18 June 2020 (UTC)

Do we have a process for requesting merges in Ceb-Wiki?

I was just looking through distinct values violations in VIAF-ID. I found Grellingen (Q66821) and Q22440045 which seem to exist as separate items in Wikidata because of CebWiki. Do we have a process of how to get entries like this merged? ChristianKl 11:19, 18 June 2020 (UTC)

Might be worth using https://www.wikidata.org/wiki/Wikidata:Tools/Edit_items tool named "Merge". Bouzinac (talk) 11:31, 18 June 2020 (UTC)
Since when can that tool merge CebWiki entries? ChristianKl 11:41, 18 June 2020 (UTC)
@ChristianKl: it merges Wikidata entries and puts a warning message on cebwiki pages, requesting a merge down there. Bouzinac (talk) 12:17, 18 June 2020 (UTC)
When I use it and there are two cebwiki pages I get Error while "Please wait...": A conflict detected on cebwiki: Q12901 with cebwiki:Assenede (parokya), Q21770102 with cebwiki:Assenede (lungsod) ChristianKl 12:24, 18 June 2020 (UTC)

CebWiki has a page for lots and lots of distinct GeoNames IDs that other wikis combine (those Grellingen pages are 7285972 and 2660513); but just because wikidata thinks those GeoNames IDs should only be one item, why does CebWiki have to agree? Levana Taylor (talk) 14:36, 18 June 2020 (UTC)

And it actually is a good idea to keep the municipality (administrative structure) and the settlement as different items - municipalities are split or merged, and the settlement is usually much older than the municipality which is named after it, and it is quite common that municipalities contain several settlements which could have separate items, but then why not the main settlement? There are of course many real duplicates in ceb/geonames, but the one most stumble upon is the municipality/settlement "duplication". Ahoerstemeier (talk) 19:51, 18 June 2020 (UTC)

Cleaning up distinct value constraints from VIAF

It has been proposed to evaluate the possibility of cleaning the distinct value constraint violations of VIAF ID (P214) through a bot. I've moved the thread to Property talk:P214#Cleaning up distinct value constraints from VIAF. --Epìdosis 14:44, 18 June 2020 (UTC)

QuickStatements and Constraint Violations

  • While looking into issues with VIAF ID (P214) I get the impression that a lot of distinct value constraint violations appear when data gets copied over from other databases. Maybe it's time to block QuickStatements from adding data that causes mandatory constraint violations and only allow mandatory constraint violations to be added by hand for the cases where a human has looked at the specific case and made a judgement that the data is okay for the particular context? ChristianKl 12:12, 18 June 2020 (UTC)
I looked at a few, but most of the time the issue is that we have multiple items for one VIAF-id. For people this is not possible, but for geographical places it sometimes is. The few examples I've examined were not wrong because of QuickStatements, and also did not suggest a big problem. For me it was a clear example that most constraints are vague, and can be a sign that something needs to be double checked, but often is no "list we should work down to be empty a.s.a.p.". Edoderoo (talk) 12:27, 18 June 2020 (UTC)
There are also ways to enforce the double checking. There could be a check-box in QuickStatements that allows constraint violations but if the checkbox is checked it's only possible to do 100 statements at a time.
It's much easier to work on a list that can easily be emptied than working on a list that's too long anyway. ChristianKl 13:16, 18 June 2020 (UTC)

I'm trying to signify that a nonprofit organization's (International Student House of Washington, D.C. (Q65085150)) mission is intercultural education (Q5819932). However, statutory purpose (P6346) has data type "monolingual text", which doesn't allow linking to the item for intercultural education, and the other alternatives I found (has goal (P3712) and has use (P366)) don't really seem to fit. Help? Sdkb (talk) 19:55, 18 June 2020 (UTC)

Perhaps field of work (P101)? From Hill To Shore (talk) 21:18, 18 June 2020 (UTC)
That works alright; thanks! I'll add it as a related property to the others where relevant. Sdkb (talk) 21:34, 18 June 2020 (UTC)

Aleksandar Yanev Georgiev

Could someone better versed in Bulgarian basketball please help me sort out whether Aleksandar Georgiev (Q2832491) and Aleksandar Yanev (Q16233667) are the same entity? Looking at the various sitelinks and identifiers, there are at least three different dates of birth, and (I think) at least two career progressions, but they seem rather muddled. Thanks! Bovlb (talk) 20:19, 20 June 2020 (UTC)

  Notified participants of WikiProject Basketball; WikiProject Bulgaria: @Tobias1984, Nk, Vladimir Alexiev, Datawiki30, StanProg: Bovlb (talk) 16:29, 21 June 2020 (UTC)
@Bovlb: There are two different basketball players, Aleksandar Yanev Georgiev (known as Aleksandar Yanev) b. 1990 (6/20 January) and Aleksandar Georgiev b. 1991, whose data is mixed within these 2 items. All articles are about Aleksandar Yanev. "Aleksandar Georgiev", whose birth year we see in one of the items, is not notable, so we can delete the item and link the articles to the notable one. --StanProg (talk) 18:58, 21 June 2020 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Bovlb (talk) 17:54, 23 June 2020 (UTC)

How should we note that George Floyd (Q95677819) was killed by police?

Currently, George Floyd (Q95677819) doesn't provide information that he was killed by police. As a result we can't run a query for people who were killed by police, which is undesirable for Wikidata. One way to go about it is to use killed by (P157) with object has role (P3831) police officer (Q384593). For people where we don't know the name of the officer we could still do it with unknown value. ChristianKl 09:31, 13 June 2020 (UTC)

  • While I am all for correct information being added, we should make sure the information is correct. For example, was he killed by police, or did he die while in police custody? There is a difference, and the last I heard (admittedly, I have tuned it all out) he wasn't killed but died of natural causes. Quakewoody (talk) 10:55, 13 June 2020 (UTC)
    • another issue is also that some countries (France being one, maybe we are the only one) may have more than 1 police force. And I suspect we should also express a difference when the death did happen with a police officer on duty vs off duty. --Misc (talk) 12:01, 13 June 2020 (UTC)

We need to obviously tread carefully in as far as the officers' rights to a trial, and to be assumed innocent in the meantime. However, at the very least we can obviously document that the officer was charge (P1595) with 2nd-degree murder (and others with lesser crimes, respectively). Using this on perpetrator (P8031), which also requires object has role (P3831) which would be suspect (Q224952) would seem to give the situation justice? Then, add member of (P463) to the police department.

As far as I can tell, this avoids discussions that tend to provoke border-line bad faith interjections whose authors themselves feel the need to immediately disown their words by invoking how ill-informed they are. Matthias Winkelmann (talk) 12:12, 13 June 2020 (UTC)

quickstatements limit

I want to add statements for 10000 items. It never completes loading. If I try several thousand at a time it still looks the same. It only works when there are a few items, but that seems too few to me. Maybe someone can help to add them for me? Thanks.

CSV file: https://drive.google.com/file/d/18JNqc3VbZn63WlpQyo9JJP0DwJJBiDfp/view?usp=sharing

--The Master (talk) 12:36, 16 June 2020 (UTC)
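One possible workaround, sketched here under assumptions, is to split the CSV into smaller files and submit them one after another. The snippet below repeats the header row in every chunk; the column names (`qid`, `P31`) and the batch size of 100 are placeholders for illustration, not values taken from the linked file or limits documented by QuickStatements.

```python
import csv
import io
from typing import List


def split_csv(text: str, batch_size: int) -> List[str]:
    """Split CSV text into chunks of at most batch_size data rows.

    The header row is repeated at the top of every chunk so that each
    chunk can be submitted on its own.
    """
    all_rows = list(csv.reader(io.StringIO(text)))
    if not all_rows:
        return []
    header, rows = all_rows[0], all_rows[1:]

    chunks = []
    for start in range(0, len(rows), batch_size):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[start:start + batch_size])
        chunks.append(out.getvalue())
    return chunks


# Hypothetical input: 250 rows with placeholder column names.
example = "qid,P31\n" + "".join(f"Q{i},Q5\n" for i in range(1, 251))
for number, chunk in enumerate(split_csv(example, 100), start=1):
    print(f"batch {number}: {len(chunk.splitlines()) - 1} rows")
```

Whether smaller batches actually load depends on the tool and the browser; the batch size would need to be found by trial.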

Need help and advice on how to deal with self promotional SPAM on a small wiki

Last year I briefly asked about this problem: Wikidata:Project_chat/Archive/2019/03#Spam_or_hoax_items. By this year my attempt to fix the problem has failed because a rogue sysop has persistently blocked deletion request of the spam article, even though the same spam has been deleted on five other wikis.

Wikidata items host the sitelinks so I guess such scenario may not be uncommon. Can you please give me tips on how to delete that spam? The problem also forces wikidata to include the spam item.

Meta deletion request is not possible since the small wiki has sysops. Local deletion request is blocked by sysop. What's a common approach to handle such problems?--Roy17 (talk) 10:25, 12 June 2020 (UTC)

Roy17: As said above, you can request it to stewards, but they may not get involved unless you have reached the community/local sysops, that said, to (try to) handle this issue locally first. If the community insists on keeping the spam, or sysops are abusing their tools, stewards can act. Esteban16 (talk) 19:01, 12 June 2020 (UTC)
Thank you all so much for the advice. I have exhausted all local means. A formal complaint will be filed on meta soon.--Roy17 (talk) 17:42, 19 June 2020 (UTC)

Wish there was a form for references from FamilySearch records

I've been doing a lot of adding statements and references from FamilySearch's vital records lately, and it sure would be handy if there was a form for adding a whole bunch of referenced statements at once. Here's what it would have to contain: The QID of the person to whom it refers; the URL of the record; the FamilySearch records collection (e.g. "England Births and Christenings, 1538-1975"); then all of the information that can be gathered. Given names (with series ordinal); family name; parents, children, spouse; date and place of birth, baptism, marriage, death, burial, with the option to add qualifiers like "sourcing circumstances: circa" and "latest date;" age at death; residence; and in the case of censuses, occupation. Also, easy creation of reciprocal statements (parent-child/child-parent, etc.) Any form-makers out there want to take this on? Levana Taylor (talk) 13:39, 18 June 2020 (UTC)

  Support In general adding references in Wikidata should be a lot easier than it is! ArthurPSmith (talk) 20:31, 18 June 2020 (UTC)
I will suggest this.--GZWDer (talk) 23:29, 18 June 2020 (UTC)
That'd work as a format. I have been using this, but any standardized format can be converted to another. Levana Taylor (talk) 23:44, 18 June 2020 (UTC)
@Levana Taylor: I made this form approximately according to the specification above. It's very hacky and exports to QuickStatements, rather than directly applying the edits, as I don't have a server-side component. If there's any way I could improve the form, let me know. ―Vahurzpu (talk) 20:55, 19 June 2020 (UTC)
Thank you!! That is certainly a vast improvement over assembling QuickStatements statements by hand. But yeah, there are some things that need fixing or changing--most notably, that adding qualifiers like "latest date" to dates isn't working, ditto creation of place of birth/death; and your date format doesn't allow for entering less than the full day-month-year. All other issues are just adjustments in what data to include. Where's a discussion page where we can hash out the details? Levana Taylor (talk) 22:12, 19 June 2020 (UTC)

Rebrand of the Wikimedia Foundation

As many of us know, the WMF has stated that they will rebrand into something which contains the name Wikipedia (the Wikipedia Foundation or similar). There have been some animated discussions on Meta, which I will not reference. We got a statement yesterday that the rebrand would go through irrespective of what the global community thinks of it. What I would like to probe here are two issues: (i) How does this community feel about being part of the Wikipedia movement or acting under the umbrella of the Wikipedia Foundation; (ii) what is the take of WMDE on this, and, in particular, whether they think about rebranding into WPDE. --Ymblanter (talk) 08:21, 19 June 2020 (UTC)

I have myself a strong opinion on this, but I will not provide it now in order not to influence the discussion.--Ymblanter (talk) 08:21, 19 June 2020 (UTC)
The consultation can be found at meta:Communications/Wikimedia brands/2030 movement brand project/Naming convention proposals. Deryck Chan (talk) 19:08, 19 June 2020 (UTC)
  • It's hard to explain Wikidata without mentioning Wikipedia (also to distinguish it from non-WMF wikis). Mentioning Wikimedia doesn't really help with that. That the foundation is called "wmf fondation" or "xyz fondazione" won't matter for that. --- Jura 08:30, 19 June 2020 (UTC)
Calling the organization Wikipedia Foundation suggests to outsiders that they are responsible for the editorial decisions on Wikipedia when that's not the case. ChristianKl 10:07, 19 June 2020 (UTC)
  • In contrast to the WMF, WMDE is an organisation with members. The decision whether or not to rebrand WMDE should be up to its members. I would expect the members not to be in favor of rebranding.
While I consider the rebranding of the WMF unfortunate, there's little we can do about it. ChristianKl 10:05, 19 June 2020 (UTC)
I predict that we will lose valuable contributors over this petty and (doubtless) expensive change. - Jmabel (talk) 15:41, 19 June 2020 (UTC)
  • I don't care. Seems like a reasonable change. I myself confuse Wikimedia and Mediawiki on occasion. BrokenSegue (talk) 16:55, 19 June 2020 (UTC)
  • While I don't care one way or the other what they do... I am big on "separation". I don't think Pedia should be the boss of Data or Commons or Quote or etc. And changing the 'parent company' name to one of the subsidiary's name will just confuse the issue to make it seem that Pedia is the boss. Quakewoody (talk) 17:53, 19 June 2020 (UTC)

This item is only used on two pages. It seems like it might be better as a property rather than an item, since currently it needs to be structured awkwardly as "subject has role" "founder of" with qualifier "of" [thing they founded]. How would we convert it, if that's the right thing to do? Sdkb (talk) 03:21, 19 June 2020 (UTC)

See Wikidata:Property proposal/inverse label. The proper solution will be multilingual text datatype which is currently unavailable. An alternative workaround is using multiple monolingual text statements.--GZWDer (talk) 03:56, 19 June 2020 (UTC)
If you want to link the item for a company that a person founded on the page of the person the way to do it is to use employer (P108) with position held (P39). ChristianKl 10:46, 19 June 2020 (UTC)
I'm not sure if founders of companies always become employees of those companies. Sometimes they might help with the founding and invest money but don't take any active role in the company. --Kam Solusar (talk) 14:59, 19 June 2020 (UTC)
There might be cases where it doesn't happen but in most cases I think that relationship exists. ChristianKl 14:08, 20 June 2020 (UTC)

I proposed a new project which may be a complement of Wikidata. Feel free to discuss it in Meta.--GZWDer (talk) 06:34, 19 June 2020 (UTC)

The edit summary of the merge tool doesn't show that a merge happened

I undid a merge yesterday and I didn't notice that I undid a merge. When I undid it, the edit summary didn't say anything about it being a merge, so the task of dealing with the other item wasn't on my radar. Did we used to have a better edit summary? Was it changed? ChristianKl 10:31, 20 June 2020 (UTC)

  • It shows me on the history page:
22:46, 19 June 2020‎ ChristianKl (A) talk contribs block‎ -1,963‎ - instance of: ·legislationcountry: ·Netherlands·Belgiumapplies to jurisdiction: ·police undo Tag: Undo (restore)
14:47, 19 June 2020‎ AnarchistiCookie talk contribs block‎ +1,963‎ - instance of: ·legislationcountry: ·Netherlands·Belgiumapplies to jurisdiction: ·police undothank (restore) ChristianKl13:29, 20 June 2020 (UTC)
22:46, 19 June 2020‎ ChristianKl talk contribs‎ 20,801 bytes -1,963‎ ‎Undo revision 1211267323 by AnarchistiCookie (talk) undothank Tag: Undo (restore)
14:47, 19 June 2020‎ AnarchistiCookie talk contribs‎ 22,764 bytes +1,963‎ ‎Merged Item from Q96441995 undothank (restore)
I get the same if logged out. Can't think what might be changing this... Andrew Gray (talk) 13:37, 20 June 2020 (UTC)
I found the issue. A script I had activated via mw.loader.load( '//www.wikidata.org/w/index.php?title=User:Yair_rand/DiffLists.js&action=raw&ctype=text/javascript', 'text/javascript' ); // User:Yair rand/DiffLists.js caused the problem. ChristianKl 13:47, 20 June 2020 (UTC)

Transposition of date formats as an error

As seen in Peter II of Yugoslavia (Q223183) for date of death, is there a specific name for the typographical error where you transpose say 1 2 2020 as "2 January 2020" instead of "1 February 2020" or the other way around? --RAN (talk) 06:04, 20 June 2020 (UTC)

I would tend just to call it a transcription error. Andrew Gray (talk) 15:52, 20 June 2020 (UTC)
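Whatever the error is called, the ambiguity itself is easy to demonstrate. The following is a minimal sketch using Python's standard `datetime.strptime`; the helper name `possible_dates` is made up for illustration. It parses the same string day-first and month-first and returns every reading that is a valid calendar date, so a value is only at risk of this kind of transposition when more than one reading survives.

```python
from datetime import date, datetime
from typing import Set


def possible_dates(text: str) -> Set[date]:
    """Return every valid reading of a 'N N YYYY' string.

    The string is tried both as day-month-year and as month-day-year;
    readings that are not valid calendar dates are dropped.
    """
    readings = set()
    for fmt in ("%d %m %Y", "%m %d %Y"):
        try:
            readings.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # e.g. there is no month 13
    return readings


for value in ("1 2 2020", "13 2 2020", "3 3 2020"):
    found = sorted(possible_dates(value))
    label = "ambiguous" if len(found) > 1 else "unambiguous"
    print(value, "->", [d.isoformat() for d in found], label)
```

A check along these lines could flag dates worth verifying against the source, but it cannot tell which reading is the correct one.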

Main subject/primary topic of an exhibition

What would be the most appropriate property to use for indicating the main subject of an exhibition? I was considering to use main subject (P921) but according to the description that should be used to 'indicate the primary topic of a work'. I settled for now on field of work (P101) but that doesn't feel correct either. In my eyes the primary subject/topic of an exhibition can at least be a (or more) person(s) but also more conceptual subjects or ideas. Thanks in advance for your help, Ecritures (talk) 21:48, 20 June 2020 (UTC)

I've used main subject (P921). exhibition (Q464980) is a subclass of work. --Shinnin (talk) 22:36, 20 June 2020 (UTC)

Importing comprehensive knot property data from knotinfo

Note: I posted this at the Mathematics project talk page, and no-one responded, so I'm reposting it here

https://knotinfo.math.indiana.edu/ is a fabulous knot theory resource containing comprehensive data about almost 3000 knots. At the moment, we only seem to have a very few knot properties -- Alexander polynomial (P5350), Conway polynomial (P5351), Jones polynomial (P5352), and Alexander–Briggs notation (P6432) -- and remarkably few articles on specific mathematical knots. defining formula (P2534) is also used for the braid word of the knot.

What do other editors here think about the practicality of importing the complete content of Knotinfo into Wikidata? This would add perhaps 90 more properties to Wikidata, and almost 3000 new items. This seems like an excellent candidate for Wikidata: knots have so many different naming schemes that assigning Wikidata items to the simplest couple of thousand knots sounds like a great way to unify the naming schemes (pace https://xkcd.com/927/ , of course).

I think mathematical knots are a sufficiently fundamental kind of object that they surely all pass the notability criteria: I think the nearest equivalent to this is the several hundred items listing polyhedra, or the listings of thousands of individual chemical compounds. At the same time, this is not an unbounded set: knots with crossing number up to 12 are about as far as most mathematicians want to enumerate the knots individually.

Importing the data is trivial -- everything is downloadable from the site as a single spreadsheet in a simple format, and I can munge the data and use a bot to add it in a few hours -- but the preliminaries of adding several dozen new properties seems quite onerous, and we would also need to be sure that we are on firm grounds as regards copyright and crediting. (We would, of course, be acknowledging knotinfo as a reference for every single property of every item.)

I can also take care to merge in the entries for knots Wikidata has with the items that already exist, so there wouldn't be any duplication. As a final bonus, we could also link new items to MathWorld ID (P2812) for those knots that have them.

See User:The Anome/knot properties for a very provisional list of the properties, as yet neither cleaned up nor checked for consistency with existing properties.

Any thoughts? -- The Anome (talk) 23:48, 9 June 2020 (UTC)

  • I consider the knots themselves to be notable. Whenever we create new properties we don't create a new property to just solve a particular use case but think how to best model a specific relationship for Wikidata as a whole. The case for why a new property is needed should also be made for every new property individually. ChristianKl 20:44, 10 June 2020 (UTC)
  • Just having the items with external link to the database would be worth it. But why duplicate all properties? That's why you have external databases. --SCIdude (talk) 07:14, 11 June 2020 (UTC)

Thanks. Instead of copying everything, how about this subset? (see https://knotinfo.math.indiana.edu/index.php and https://knotinfo.math.indiana.edu/knotinfo_data_complete.xls for reference)

-- The Anome (talk) 09:25, 11 June 2020 (UTC)

    • You can easily copy the subset that can be expressed with existing properties. For things that can't be expressed with existing properties, we need to discuss each one in detail in a property proposal. ChristianKl 12:31, 11 June 2020 (UTC)
That makes sense to me. I've got a bot, User:The Anomebot 3 that's suitable for the purpose, and I will put in a request for permission to do this. There are currently 22 "legacy" knots already in the database that will need hand-annotation to avoid creating duplicates: see User:The Anome#Knots for a query. -- The Anome (talk) 11:12, 12 June 2020 (UTC)
  •   Support adding the knots, though not necessarily with 90 new properties, which seems excessive. (Agree with SCIdude, "Why duplicate all properties? That's why you have external databases.") —Scs (talk) 16:26, 12 June 2020 (UTC)
Another interesting task will be getting images for all ~3,000 of them. Not sure whether the images at knotinfo.math.indiana.edu are PD. Commons has plenty, e.g. commons:File:Blue Figure-Eight Knot.png, though it's not immediately clear how many, or how they're organized. —Scs (talk) 16:35, 12 June 2020 (UTC)
Cool! (And I presume you've seen w:List of prime knots, for comparison?) —Scs (talk) 10:50, 15 June 2020 (UTC)
Thank you for reminding me about that page: yes, I can add the image parameter for each of the knots to match those, as well as cross-checking the data fields. -- The Anome (talk) 18:47, 21 June 2020 (UTC)

Leave of absence

Person A is elected member of parliament for the period 2018-2022, then takes a leave of absence in 2019, during which time person B takes their place in parliament. How should one model that? Two possibilities that immediately come to mind are (using Sweden as an example):

First option

Person A → position held (P39) → member of the Swedish Riksdag (Q10655178)

start time (P580) → 2018-01-01
end time (P582) → 2018-12-31

Person A → position held (P39) → member of the Swedish Riksdag (Q10655178)

start time (P580) → 2020-01-01
end time (P582) → 2022-12-31

Person B → position held (P39) → member of the Swedish Riksdag (Q10655178)

start time (P580) → 2019-01-01
end time (P582) → 2019-12-31


Second option

Person A → position held (P39) → member of the Swedish Riksdag (Q10655178)

start time (P580) → 2018-01-01
end time (P582) → 2022-12-31

Person B → position held (P39) → member of the Swedish Riksdag (Q10655178)

start time (P580) → 2019-01-01
end time (P582) → 2019-12-31

Maybe there exists some dedicated property for these kinds of cases that I've not found? Popperipopp (talk) 11:56, 16 June 2020 (UTC)

  • I think it depends on the status of the "absent" person. If they are still considered in office, it would be Second, otherwise, I'd rather go for First. --- Jura 12:54, 16 June 2020 (UTC)
  • Sometimes a LoA is an extended vacation. Sometimes it is a polite way of saying "I quit". So, really, like above, first determine what the 'status' is. Quakewoody (talk) 13:12, 16 June 2020 (UTC)
  • I agree with Jura/QW that it depends on the situation of the "normal" members and also the alternates. My reading of this summary is that the alternates have all the powers of a normal member of the Riksdag (they're not a special class of member with less responsibility) and also that the "normal" member takes no part in the Parliament while on leave.
Given this, I think the first option makes most sense - these people are functionally not holding the office of member of the Swedish Riksdag (Q10655178) during this period. One person hands over the job, another one takes over, and then they give it back later on, so there is only one member in the "slot" at any given time. If you add an appropriate end cause (P1534) qualifier - something like "standing down to take a leave of absence" on person A, and then "alternate member standing down" on person B - along with replaces/replaced by to link A and B together, it will be clear exactly what has happened and any reports can match them all up. You could also add a subject has role (P2868) or significant event (P793) qualifier to B's P39 to indicate that they're an alternate member, so that a query can easily find all the alternates. Andrew Gray (talk) 17:40, 16 June 2020 (UTC)
Thanks for all the great feedback. I think it makes sense to tackle this on a case-by-case basis, and the end cause (P1534) goes a long way in clarifying the situation. Popperipopp (talk) 17:36, 21 June 2020 (UTC)
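The practical difference between the two options can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Term` class and the end-cause wording are hypothetical stand-ins for position held (P39) statements with start time (P580), end time (P582) and end cause (P1534) qualifiers, and the dates are the ones from the example above. The check looks for terms whose date ranges overlap, which is what a report assuming "one member per slot at any given time" would trip over.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """Hypothetical stand-in for one position held (P39) statement."""
    person: str
    start: date                      # start time (P580)
    end: date                        # end time (P582)
    end_cause: Optional[str] = None  # end cause (P1534); wording is illustrative


def overlapping(terms: List[Term]) -> List[Tuple[str, str]]:
    """Return pairs of persons whose terms share at least one day."""
    return [
        (a.person, b.person)
        for a, b in combinations(terms, 2)
        if a.start <= b.end and b.start <= a.end
    ]


first_option = [
    Term("Person A", date(2018, 1, 1), date(2018, 12, 31), "leave of absence"),
    Term("Person B", date(2019, 1, 1), date(2019, 12, 31), "alternate member standing down"),
    Term("Person A", date(2020, 1, 1), date(2022, 12, 31)),
]
second_option = [
    Term("Person A", date(2018, 1, 1), date(2022, 12, 31)),
    Term("Person B", date(2019, 1, 1), date(2019, 12, 31)),
]

print("first option overlaps:", overlapping(first_option))
print("second option overlaps:", overlapping(second_option))
```

With the first option no two terms overlap, so the seat always has exactly one holder; with the second option the overlap is expected and a consumer has to know that Person A was on leave during 2019.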

PetScan installation

Like a few other tools, PetScan has stopped working since the ongoing URL domain name change.

Is there a way to install the tool locally? Supposedly some of the services it uses would still need to query remotely, but the combination and editing could run locally. --- Jura 09:53, 21 June 2020 (UTC)

See code, but you need a Cloud VPS instance with Wikimedia replica connection. Alternatively export the result to PagePile and generate QuickStatement commands (for creating new items, see also newitem.py; for adding claims, see also claimit.py).--GZWDer (talk) 11:14, 21 June 2020 (UTC)
Thanks for the links. As most tools I had been using stopped working, I will have to figure out some of the others. --- Jura 11:48, 21 June 2020 (UTC)

Property for an informal leader

Does anyone have a suggestion for a property for a leader/headed by of an action like Mutiny on the Bounty (Q749811), where the disaffected crewmen, led by Acting Lieutenant Fletcher Christian (Q316070), seized control of the ship from their captain, or like Richard Parker (Q7328273)? Pmt (talk) 13:53, 21 June 2020 (UTC)

I would agree with @Jura1: - a qualifier on "participant" seems the best thing to use for identifying the leader of an event or movement. Ditto things like "instigator", etc. Andrew Gray (talk) 21:19, 21 June 2020 (UTC)

How to indicate if a death toll or number of disease cases is cumulative (total no of cases since start) or an annual frequency (cases per year)?

Regarding number of deaths (P1120): The number of deaths given for 2010, for example, sometimes refers to the annual frequency (number per year) during that year, and sometimes to a cumulative number, i.e. the total number since the start (for example of a war, a long-term catastrophe or an epidemic) until that year. Cumulative numbers are most common, especially in historical catastrophes that lasted several years, but both types of numbers occur for the same item and property, and it is not clear which type a certain number is. For example in War in Donbas (Q16335075) it is an annual number, while in 1918-1920 flu pandemic (Q178275) it is a cumulative number, and in HIV/AIDS (Q12199) both types of numbers are mixed. The same problem goes for many other properties, such as number of injured (P1339), number of casualties (P1590), number of cases (P1603) and victim (P8032). A third type of value is en:incidence frequency (number of cases per year and capita), and a fourth, en:cumulative incidence (total number of cases per capita since start).

Which of the following options would be the best approach to indicate this?

1. These properties should all be cumulative, but new frequency properties may be created for each of them, named "death frequency"/"annual deaths"/"cases per year" etc. Incidence properties (per capita) may also be created. That approach would be easiest for module and template designers.

2. Another option would be to allow a frequency unit such as "deaths per year", "cases per year", "events per day", "events per capita and week", etc. Why is that not possible today?

3. A third option would be to require that the user always states nature of statement (P5102) -> either "cumulative" or "annual frequency", or "incidence" or "cumulative incidence" (Q1106825). Some of these items then need to be created.

4. Both start date and end date should be indicated. For example the full duration of a war, or the duration of a year. Similar to HIV/AIDS (Q12199). Tomastvivlaren (talk) 21:22, 21 June 2020 (UTC)
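To make the difference between the two kinds of values concrete, here is a small sketch of how a data consumer could convert one into the other once it knows which kind it has (which is exactly what the options above try to make explicit). The figures are made up for illustration, and the sketch assumes one value per consecutive year starting at the first year of the event.

```python
def annual_from_cumulative(cumulative_by_year):
    """Convert {year: total since start} into {year: count during that year}.

    The first year is taken as-is, because nothing precedes it.
    """
    annual = {}
    previous_total = 0
    for year in sorted(cumulative_by_year):
        total = cumulative_by_year[year]
        annual[year] = total - previous_total
        previous_total = total
    return annual


def cumulative_from_annual(annual_by_year):
    """Convert {year: count during that year} into {year: total since start}."""
    cumulative = {}
    running_total = 0
    for year in sorted(annual_by_year):
        running_total += annual_by_year[year]
        cumulative[year] = running_total
    return cumulative


if __name__ == "__main__":
    # Made-up numbers, not taken from any Wikidata item.
    totals = {2014: 100, 2015: 250, 2016: 300}
    print(annual_from_cumulative(totals))
```

If years are missing from the series, each difference covers the whole gap rather than a single year, which is one more reason to record start and end dates as in option 4.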

Wikidata weekly summary #420

@Mohammed Sadat (WMDE): For two weeks in a row now, the weekly status update has called persons who are not slave traders, including slaves, slave traders. This time it is in the query example "Monuments named after/commemorating/depicting slave traders". The query finds monuments named after, commemorating or depicting slaves, not slave traders. I find it very grave to call other people slave traders when they are not, and I am most disturbed that it has occurred two times in a row. I had hoped to find in this update an apology to the innocent people you called slave traders last week, and certainly not a new grave accusation against innocent people. --Dipsacus fullonum (talk) 21:40, 15 June 2020 (UTC)
I agree. Slave trading is repellent, but falsely labeling someone a slave trader is repellent also. I'm not Mohammed Sadat, but I went ahead and deleted the offending items. (If this was wrong, someone can explain, and reinsert them.) —Scs (talk) 23:54, 15 June 2020 (UTC)
@Dipsacus fullonum, Scs: I’ve restored the queries – deleting them here doesn’t fix the other copies of the weekly summary, and it makes this discussion harder to follow.
In the case of this week’s “monuments named after/commemorating/depicting slave traders” entry, it looks like two queries were mixed up. This query, from the tweet linked in the weekly summary, is for slave traders; but the weekly summary actually links to this query instead, from the next tweet in the thread, which is for enslaved people. An unfortunate mixup (but one that, after the weekly summary already went through MassMessage, I have no idea how to fix).
That said, you also removed my “Statues and monuments of slaveholders or traders” query. Apart from me accidentally using the wrong word in the tweet (should be “sculptures”, not “monuments”, as sculpture (Q860861) is the parent class of statue (Q179700)), this query is correct as far as I’m aware, and if you see a problem with it, I’d like to know what it is. --TweetsFactsAndQueries (talk) 06:08, 16 June 2020 (UTC)
@TweetsFactsAndQueries: Just to be clear, I have no problem with the query described as “Statues and monuments of slaveholders or traders”. On the contrary I admire how it is written technically, specially the second graph pattern of the union (although I wonder why you choose to include deprecated statements with P3716). Sometimes the selected query examples have technical problems, so it is good to see illustrative examples also.
The other query that I do have a problem with is “MWAPI searches in wikidata about people descibed as slave traders by citizenship“ in issue #419. That one makes a list of persons, including living persons, that aren't necessarily described as slave traders. --Dipsacus fullonum (talk) 07:04, 16 June 2020 (UTC)
@TweetsFactsAndQueries: Of course I understand that there are multiple copies of the Weekly Summary posted on individual talk pages and elsewhere. I'm not sure how to track them all down. But the copies here on the project chat page are certainly higher-visibility, and now that you have restored the problematic queries here, it is up to you to defend them. I will not edit war to delete them, but it's going to bother me the longer they are publicly visible here, inviting anyone to click on and be misled by them.
There are three queries, one identified by Dipsacus fullonum as problematic, one identified by me as problematic, and one which I belatedly see is perfectly fine. (I deleted the third along with the first two, I confess, on the mere assumption that where two were bad, the third was also likely to be so.)
  1. The query labeled "MWAPI searches in wikidata about people descibed as slave traders by citizenship" in Weekly Summary #419. Dipsacus fullonum has already explained this one.
  2. The query labeled "Monuments named after/commemorating/depicting slave traders" in Weekly Summary #420. The first two of its monuments I checked were Dred and Harriet Scott Statue (Q75117035) and Crispus Attucks High School (Q5186067). Needless to say there is nothing in the entities for Dred Scott (Q480427), Harriet Robinson Scott (Q27839143), or Crispus Attucks (Q288241) to suggest that they were slave traders.
  3. The query labeled "Statues and monuments of slaveholders or traders" in Weekly Summary #420. I apologize for attempting to delete this one too hastily, without first checking any of its results, which I now see must be based, properly, on actual slave ownership.
So, what shall we do about the first two? Redelete them here? Push out a correction to all the recipients on the global message delivery list? There has been no response from Mohammed Sadat; I'm guessing he's been away from his computer for a few days. —Scs (talk) 10:29, 16 June 2020 (UTC)
  • It seems a bit ridiculous to have two people employed as community communications and having no communication with the community in a case like this. Is "community communications" about reading Twitter and copy-pasting it? If community communication is about speaking to people outside of Wikidata it would be good to have an overview on Wikidata about what communication those people are actually doing about Wikidata. ChristianKl 11:53, 16 June 2020 (UTC)
Hi everyone, I would like to apologise for both of the problematic queries ("MWAPI searches in wikidata about people described as slave traders by citizenship" in Weekly Summary #419 and "Monuments named after/commemorating/depicting slave traders" in Weekly Summary #420) that were posted in the newsletter. The queries were included to show the power of Wikidata in topical discussions. It is with the deepest shock that I have become aware of the issues with these queries. It is very clear to me the horrific impact of incorrectly labelling those who are innocent of the crime of slave trading and abhorrent that slaves themselves should be so labelled. I would like to express my deep regret over what has happened and give you my promise that I will do my utmost to avoid such hurt in the future. I would like to thank Dipsacus fullonum and Scs for bringing this topic forward and for digging into the queries and the results to uncover the errors. You have helped to demonstrate the strength and tenacity of the community, as well as the desire for truth and knowledge that drives all of us within the movement. I would also like to apologise for the delay in writing this response. Four days is a long time for a topic this sensitive to go unattended and I will endeavour to be more aware in the future. Next week we will be pushing the corrected queries to all recipients of the newsletter along with an explanatory statement. Please be patient while I put this in motion. Thanks again for your help. Let's work together to prevent issues like this in the future. Mohammed Sadat (WMDE) (talk) 21:01, 19 June 2020 (UTC)
Hi ChristianKl, as the root causes were not obscured or difficult to find, we've decided not to use 5 Whys. We prefer to use 5 Whys for larger retrospectives. We have determined that we should have (1) discussed more thoroughly before writing about such a sensitive issue (2) reviewed the queries ourselves before publishing them (3) responded to the community's questions much earlier than we did. Moving forward, we've put processes in place to help prevent similar issues happening again. Mohammed Sadat (WMDE) (talk) 09:00, 22 June 2020 (UTC)

two-part recipe

The usual WD recipe (Q219239) has part(s) (P527) some ingredients. My problem is I need to specify a two-part recipe (i.e. two sets of ingredients without any order). How can I do this?

Example:

Any hints appreciated. Bonus would be to be able to stick names on these sets. --SCIdude (talk) 16:18, 21 June 2020 (UTC)

I'm still thinking about a good way to do it on hybrid of (P1531). --- Jura 18:04, 21 June 2020 (UTC)
I think there is the general problem of defining a set by its members without creating an item for it (it would be easy to create "chocolate cake batter" and "chocolate cake icing"). --SCIdude (talk) 06:25, 22 June 2020 (UTC)
Depending on the context, the qualifiers "object has role", "series ordinal", "criterion used" can help. --- Jura 11:16, 22 June 2020 (UTC)

This kind of converting case should perhaps be prevented somehow

special:diff/1202523271 via #suggestededit-add 1.0 .--RZuo (talk) 01:34, 22 June 2020 (UTC)

Awkward plural

Q169962#P3020 should be written in the plural, "years". I see the data is entered as 700 with unit = year, so the website should automatically display the singular or plural form based on the number.--RZuo (talk) 01:40, 22 June 2020 (UTC)

On Wikidata, grammar absolutely does not matter. It is the relationship between items that is important. Wikidata is not meant to be read like a book, but to hold data and express relationships. -Animalparty (talk) 01:51, 22 June 2020 (UTC)
It would be possible to fix this by looking at lexemes, but it's likely more work than it's worth compared to other areas where development work can be expended by WMDE. However, Wikibase is open-source, so if anybody cares deeply about getting the plural shown, implementing a lookup of the plural for units would work. ChristianKl 11:32, 22 June 2020 (UTC)
See phab:T141597.--GZWDer (talk) 15:20, 22 June 2020 (UTC)
Displaying plural forms of the unit would not be a good idea in Swedish. "centimetrar" is the plural form of "centimeter", but you do not write/say: "Jag är 198 centimetrar". (Yes, I am almost 2 meters.) You in most cases treat the unit as if it is uncountable. There are exceptions, like time units and currencies. To work around this, our templates show the short version of the unit in Wikipedia. "cm" instead of "centimeter" and "h" instead of "timme/timmar". 62 etc (talk) 16:21, 22 June 2020 (UTC)
Time to paste my favorite article about plurals and translation: A Localization Horror Story: It Could Happen To You. 21 years, still relevant :) --Misc (talk) 21:45, 22 June 2020 (UTC)
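The lookup idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `UNIT_FORMS` table is a hypothetical stand-in for whatever source (for example lexemes) would supply the forms, and the plural rule is the simple "one"/"other" split, which is not enough for languages with more plural categories. The Swedish entry reflects the point made above that the unit stays unchanged after a number.

```python
# Hypothetical lookup table: (language, unit) -> form per plural category.
UNIT_FORMS = {
    ("en", "year"): {"one": "year", "other": "years"},
    # Swedish treats "centimeter" as uncountable after a number.
    ("sv", "centimeter"): {"one": "centimeter", "other": "centimeter"},
}


def plural_category(amount):
    """Simplified rule: exactly 1 is "one", everything else is "other"."""
    return "one" if amount == 1 else "other"


def format_quantity(amount, unit, lang):
    """Render an amount with the unit form chosen for the language.

    Falls back to the bare unit name when no forms are known.
    """
    forms = UNIT_FORMS.get((lang, unit))
    if forms is None:
        return f"{amount} {unit}"
    return f"{amount} {forms[plural_category(amount)]}"


if __name__ == "__main__":
    print(format_quantity(700, "year", "en"))
    print(format_quantity(198, "centimeter", "sv"))
```

Keeping the forms per language, rather than appending a suffix, is what lets the Swedish case differ from the English one without special-casing in the formatting code.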

AOU+AOS: Two bodies in one item

This 2016 edit, or those preceding it in ca & es, conflated the American Ornithologists' Union with its successor body, the American Ornithological Society.

[13] tells us that the AOU merged with the Cooper Ornithological Society to form the American Ornithological Society on 11 October 2016.

Given the passage of time, what's the best way to resolve this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:23, 11 June 2020 (UTC)

[Restored from archive] No answers? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:47, 22 June 2020 (UTC)
American Ornithological Society (Q465985) was originally about the American Ornithologists' Union, but was conflated with the American Ornithological Society after the merger [14]. Given that this only happened in 2017, and some language labels still refer to the American Ornithologists' Union, maybe restore it to that and create a new item for the American Ornithological Society? Ghouston (talk) 00:52, 23 June 2020 (UTC)

Wikidata weekly summary #421

You are late, the RfA was closed already a few days ago ;) Stryn (talk) 16:44, 22 June 2020 (UTC)

Lexical masks now in JSON

We have released lexical masks as ShEx files before, schemata for lexicographic forms that can be used to validate whether the data is complete.

We saw that it was quite challenging to turn these ShEx files into forms for entering the data, such as Lucas Werkmeister’s Lexeme Forms. So we adapted our approach slightly to publish JSON files that keep the structures in an easier to parse and understand format, and to also provide a script that translates these JSON files into ShEx Entity Schemas.

Furthermore, we published more masks for more languages and parts of speech than before.

Full documentation can be found in Wikidata:Lexical Masks.

Background can be found in the paper.

Thanks Bruno, Saran, and Daniel for your great work! --Denny (talk) 20:59, 22 June 2020 (UTC)

Applications section of Wikidata article in wikipedia needs to be updated.

The applications section of w:wikidata needs updating. The examples are somewhat dated and not the most interesting or broad-reaching. Currently:

  • Mwnci extension can import data from Wikidata to LibreOffice Calc spreadsheets
  • There are (at October 2019) discussions about using QID items in relation to what are being called QID emoji
  • Wiki Explorer - Android application to discover things around you and micro editing Wikidata

For readers, it'd be better to use examples whose impact is easier to understand. I reckon the people on this discussion page probably have the broadest knowledge of examples of interesting applications. Google infoboxes? Use in medicine? Maps of witches? Answering otherwise unanswerable questions? Open versions of otherwise proprietary databases (wikicite)? Please add examples directly to the Wikipedia page, or Wikipedia discussion page. T.Shafee(evo&evo) (talk) 04:59, 23 June 2020 (UTC)

Google infoboxes come from the Google Knowledge Graph. While some of that data comes from Wikidata, plenty doesn't. Giving the appearance that those infoboxes are Wikidata infoboxes also has the problem of encouraging SEO spam. ChristianKl 10:51, 23 June 2020 (UTC)

Even if it's included in the text above, maybe it could start with a chronological enumeration of key functions at WMF:

  • provide interwikis for Wikipedias
  • provide translations of concepts
  • provide short descriptions of labels of concepts
  • provide a rosetta stone of identifiers
  • provide infoboxes for WMF projects
  • provide lists for WMF projects
  • provide items about resources used as reference
  • provide items about sites in WikiVoyage

Then continue with more complex stuff ... --- Jura 11:10, 23 June 2020 (UTC)

I have seen a lot of statements that use this property as their sole reference. Can anyone here explain to me the logic behind this?--Trade (talk) 22:47, 18 June 2020 (UTC)

Web pages are the reference for themselves, but only if a date is specified. The statement also states that the person editing did actually visit the page. --SCIdude (talk) 14:24, 19 June 2020 (UTC)
That does not work when there's no web page included in the first place. --Trade (talk) 20:50, 19 June 2020 (UTC)
Burning Car Theme (Q85044875)@Richard Arthur Norton (1958- ):--Trade (talk) 15:27, 20 June 2020 (UTC)
You meant Q85044875#P444? Presumably the PlayStation Store (Q1052025) is also some kind of "page" that can be visited by the general public. That you don't use a web browser makes no difference IMO. --SCIdude (talk) 07:45, 21 June 2020 (UTC)
Q18649951#P5161--Trade (talk) 11:16, 22 June 2020 (UTC)
I think these sources are actually not very meaningful. A data consumer should ideally be able to understand the source from the group of source statements alone. Just P813 is unclear and forces an assumption that whoever added it means to say that the statement the source is on is the source itself, but this isn't explicitly stated in the data (it could equally just be a mistakenly added source). --SilentSpike (talk) 16:35, 24 June 2020 (UTC)

Which property do I use to include biomes?

Hi guys,

I'm fixing some National Parks of Brazil, but I couldn't find the best property for the biomes that a park is in. This is important information that is not present in any of the National Parks I looked at here. Rodrigo Tetsuo Argenton (talk) 00:15, 23 June 2020 (UTC)

Hello. The part of (P361) property might be useful for this purpose. Modeum (talk) 17:38, 24 June 2020 (UTC)

Odd constraint violation

Why is

⟨ Seattle Municipal Archives (Q19979269)      ⟩ Commons Institution page (P1612)   ⟨ "Seattle Municipal Archives" ⟩

giving a constraint violation warning? - Jmabel (talk) 00:31, 24 June 2020 (UTC)

The constraint is that the Commons page should exist. As the Commons page and the link to it were both created in the last few hours it could just be an issue of database lag. I get this all the time with certain types of Commons links (mainly creator pages). If it hasn't resolved itself in 24 hours, flag the issue again. From Hill To Shore (talk) 01:50, 24 June 2020 (UTC)
@Jmabel, From Hill To Shore: For what it’s worth, the usual purge will also purge cached constraint check results. But in this case I’m afraid the issue is due to the Commons link constraint (Q21510852) constraint type being generally broken for Commons Institution page (P1612) (and some other properties) – see T242518 and T237920. --Lucas Werkmeister (WMDE) (talk) 09:12, 24 June 2020 (UTC)
@Lucas Werkmeister (WMDE): Shouldn't we turn off the constraint if it is routinely giving bogus warnings? - Jmabel (talk) 14:53, 24 June 2020 (UTC)

Reason for script error

I would like to know the reason for the script error that is present on the page Property talk:P6391. Adithyak1997 (talk) 18:32, 24 June 2020 (UTC)

@Adithyak1997: Can you please give more information about the problem you're having? Bovlb (talk) 20:40, 25 June 2020 (UTC)
@Bovlb:, please visit the page Property talk:P6391. I am able to see a hidden category named Pages with script errors. I can't actually give more info, since all I am seeing is that hidden category on that page. Lots of talk pages are actually present in that category. Adithyak1997 (talk) 05:27, 26 June 2020 (UTC)
@Adithyak1997: Aha! To reproduce your problem, it is necessary to enable the preference "Show hidden categories". Apparently one of the causes of this is having too many {{P|P123}} to expand on a page, but I'm not the expert. Bovlb (talk) 15:59, 26 June 2020 (UTC)

How to state this?

I am adding an item for George Bennett Gosling (Q96649516), the British army officer who is probably best known for having provided the first English-language report of seeing an okapi alive in the wild. How should I describe that accomplishment in structured data? I have information about the date and place where he wrote the letter in question (February 1906 at Angu (Q96650182)). Levana Taylor (talk) 05:32, 26 June 2020 (UTC)

Why does this very basic query timeout?

SELECT ?ref ?refURL WHERE {
  ?ref pr:P854 ?refURL .
  FILTER (CONTAINS(str(?refURL),'philatlas.com')) .       
} LIMIT 20

Psiĥedelisto (talk) 12:43, 26 June 2020 (UTC)

@Jura1: So is there no way to figure out what we're sourcing to PhilAtlas, which has been blacklisted on enwiki? Psiĥedelisto (talk) 13:03, 26 June 2020 (UTC)
There's Special:Linksearch - not quite as useful for heavily used domains, but it works for URLs used in reference URL (P854). Special:Linksearch/*.philatlas.com currently doesn't find any pages containing this domain. --Kam Solusar (talk) 14:11, 26 June 2020 (UTC)
@Psiĥedelisto:: OK, this is weird. On en:Wikipedia talk:Tambayan Philippines#PhilAtlas (thanks for the ping there) it was pointed out that items like Aborlan (Q111338) do indeed have references containing that domain. I've experimented with the external link search and it seems searching for "philatlas.com", "*.philatlas.com" or "www.philatlas.com" yields no results, but using "https://*.philatlas.com" as input, it finds hundreds of results: [15]. Apparently, it only searches for URLs with the http:// protocol by default, not https. I had initially tried it with a few big newspaper websites to see if the link search works for URLs in reference statements before posting, but didn't notice that it only found older http results in references that haven't been updated to https. --Kam Solusar (talk) 15:58, 26 June 2020 (UTC)
@Kam Solusar: Yes, that's correct, Special:LinkSearch is protocol-specific (and defaults to http) so you need to search both *.example.com and https://*.example.com to cover both protocols. ArthurPSmith (talk) 18:03, 26 June 2020 (UTC)
@Psiĥedelisto: The query times out because searching inside text strings is very slow. The solution is to find the relevant items with an MWAPI call. This rewritten query does the same and is relatively fast:
SELECT ?item ?itemLabel ?property ?propertyLabel ?refURL
WITH
{
  SELECT ?item ?property ?refURL
  WHERE
  {
    SERVICE wikibase:mwapi
    {
      bd:serviceParam wikibase:endpoint "www.wikidata.org" .
      bd:serviceParam wikibase:api "Generator" .
      bd:serviceParam mwapi:generator "exturlusage" .
      bd:serviceParam mwapi:geuprop "title" .
      bd:serviceParam mwapi:geunamespace "0" .
      bd:serviceParam mwapi:geuprotocol "https" .
      bd:serviceParam mwapi:geuquery "*.philatlas.com" .
      bd:serviceParam mwapi:geulimit "max" .
      ?item wikibase:apiOutputItem mwapi:title .
    }
    hint:Prior hint:runFirst "true" .

    ?item ?claim ?statement .
    ?property wikibase:claim ?claim .
    ?statement prov:wasDerivedFrom ?reference .
    ?reference pr:P854 ?refURL .
    FILTER CONTAINS(STR(?refURL),'philatlas.com') .
  }
} AS %query
WHERE
{
  INCLUDE %query
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"  }  
}
--Dipsacus fullonum (talk) 19:28, 26 June 2020 (UTC)
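For use outside the query service, the same external-link lookup can be made directly against the MediaWiki API. The sketch below only builds the request URLs (no network call) for the `list=exturlusage` module, whose `eu*` parameters correspond to the `geu*` generator parameters in the query above. Because the search is protocol-specific, as noted earlier in the thread, it produces one request for http and one for https; the function names are illustrative.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def exturlusage_url(domain_pattern, protocol, namespace=0):
    """Build an API URL listing pages that link to the given domain pattern."""
    params = {
        "action": "query",
        "format": "json",
        "list": "exturlusage",
        "euquery": domain_pattern,   # e.g. "*.philatlas.com"
        "euprotocol": protocol,      # the search covers one protocol at a time
        "eunamespace": namespace,    # 0 = main namespace (items)
        "euprop": "title|url",
        "eulimit": "max",
    }
    return f"{API}?{urlencode(params)}"


def both_protocols(domain_pattern):
    """Return the request URLs needed to cover http and https links."""
    return [exturlusage_url(domain_pattern, p) for p in ("http", "https")]


if __name__ == "__main__":
    for url in both_protocols("*.philatlas.com"):
        print(url)
```

Results beyond the per-request limit would need the usual API continuation handling, which is left out here.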

Web of Sciences Qualifier

(This discussion is moved from Wikidata:Request a query)

Hey,

am I correct that there is no such thing as a qualifier for Web of Science-*publication*-ID in Wikidata? I found WOS-ResearcherID (P1053) but nothing for WOSUID.

Thank you!

-Eva (talk)

@EvaSeidlmayer: there isn't a property for it now, though there is an open proposal to create one, so it's likely to be created soon. Vahurzpu (talk) 20:22, 22 June 2020 (UTC)
Eva FYI, I was thinking about it, I might do it myself in a few months if it is a solid ID. (I had no time to check it carefully)--Alexmar983 (talk) 16:04, 27 June 2020 (UTC)

Add Scopus and other IDs from ORCID

Similar to this request from a few months ago, could we maybe also have Scopus author ID (P1153), ResearcherID (P1053) (and potentially others) added by a bot from an individual's ORCID iD (P496)? These IDs are often listed in a person's ORCID, such as here for Peter F. Orazem (Q30071694). --Bender235 (talk) 19:51, 24 June 2020 (UTC)

We should. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 25 June 2020 (UTC)
According to this FAQ page on ORCID, it seems they allow for the inclusion of Authenticus ID (P7834), CIÊNCIAVITAE ID (P7893), Dialnet author ID (P1607), GitHub username (P2037), and Loop ID (P2798), in addition to the ones mentioned above. It would be great if we could have a bot scraping all of these. --Bender235 (talk) 22:42, 27 June 2020 (UTC)

Adjacency

I'd like to specify that sidewalk (Q177749) is adjacent to road (Q34442). I can't find a property to use, though; shares border with (P47) seems to apply only to countries, and I'm not sure how I'd use adjacent (Q78532896) since it's not a property. Help? {{u|Sdkb}}talk 22:15, 24 June 2020 (UTC)

connects with (P2789) is a general property that can be used, but I'm not sure that this is how roads work. My instinct would be that a sidewalk is part of a road, and EnWiki also seems to think that sidewalks are not adjacent to roads but part of roads. ChristianKl 00:10, 25 June 2020 (UTC)
Typically a part of a street (Q79007), which is an urban road (Q34442). part of (P361) could be used, although the lack of symmetry always bothers me on such classes: maybe a sidewalk (Q177749) is always part of a street (Q79007), but a street (Q79007) doesn't necessarily have a sidewalk (Q177749). Ghouston (talk) 00:54, 25 June 2020 (UTC)
Well, "typically" part of a street (Q79007), but I can think of a case where a highway has one. Ghouston (talk) 00:57, 25 June 2020 (UTC)
I added it as has part(s) of the class (P2670) sidewalk (Q177749) quantity (P1114) 1±1. 1±1 doesn't read as user-friendly and it might be nicer if the UI displayed it as 0-2, but it does record the information. ChristianKl 11:03, 25 June 2020 (UTC)
You need to be careful here with international usage. While the USA uses "sidewalk" with the implication of it being on the side of a street, the international aliases are pavement, footpath or footway. In the UK those may be part of the same highway as a street or road or they can be an independent highway for foot traffic that meanders away from the flow of vehicles. I haven't looked into how these are modelled here but we may need to split the items based on national usage if you want to be creative with the supporting properties. From Hill To Shore (talk) 14:25, 25 June 2020 (UTC)
Whereas in the U.S., a "footpath" is specifically not along a street: anything from a separate path a few meters from the street and parallel to it, to a path in a park, or even sometimes a hiking trail. - Jmabel (talk) 15:18, 25 June 2020 (UTC)
Word usage isn't primary. Relationships between items in Wikidata are more important. There are items that are instance of (P31) sidewalk (Q177749) and we care about what those describe. It would be nice if someone modelled the different entities and conceptions that exist. Afterwards we can decide on what the best names for them happen to be. ChristianKl 20:31, 25 June 2020 (UTC)
Yes, but you need to make sure the model matches the situation before you start playing around with specialised properties. If an item represents X and Y but an editor comes along and sets properties for X only then all the links to that item expecting Y will now be wrong. If you want to create an instance solely for X then create a new item for it. From Hill To Shore (talk) 21:01, 25 June 2020 (UTC)
I think it's pretty clear, at least in English, from the label (which is apparently reasonably unambiguous in American English), the description and the linked enwiki article. Footpaths have been difficult at Commons, due to English language usage variations; some time ago some categories were merged into c:Category:Paths, and Category:Footpaths is now a redirect to that. The items path (Q5004679) and footpath (Q3352369) are apparently duplicates, but the usage on eswiki needs investigation. Ghouston (talk) 00:20, 26 June 2020 (UTC)
I don't think there are errors that come from saying that any road has 0-2 sidewalks given that 0 happens to be in the range. ChristianKl 11:20, 27 June 2020 (UTC)
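The "1±1 reads better as 0-2" remark earlier in this thread amounts to a small display transformation. A sketch of it, with hypothetical function names and assuming a symmetric tolerance as in the statement discussed:

```python
def quantity_bounds(amount, tolerance):
    """Return (lower, upper) for a quantity written as amount±tolerance."""
    return amount - tolerance, amount + tolerance


def display_as_range(amount, tolerance):
    """Format amount±tolerance as "lower-upper", or a single number if exact."""
    lower, upper = quantity_bounds(amount, tolerance)
    if lower == upper:
        return str(lower)
    return f"{lower}-{upper}"


if __name__ == "__main__":
    print(display_as_range(1, 1))
```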

Why am I getting the warning at Q96118741

Why at Portrait of Dr. Ernst Schwerin (1846-1920) (Q96118741) is the image field giving me a warning? --RAN (talk) 20:17, 25 June 2020 (UTC)

It looks like the word "carte" is triggering a constraint in French for "carte de localisation." Basically, the system thinks you have used a map so is trying to tell you to use locator map image (P242) instead. From Hill To Shore (talk) 20:25, 25 June 2020 (UTC)
Richard Arthur Norton (1958- ) In general, it's not a good idea to change file names for the sole sake of appeasing the fickle gods of Wikidata. Commons:File renaming discusses valid reasons. If Wikidata has a problem with a single word in a single language, that's a problem best solved on Wikidata's end, not Commons. -Animalparty (talk) 20:18, 27 June 2020 (UTC)
Just add another exception to the constraint on image (P18). However, it's accumulating so many exceptions that the value of the constraint has to be questioned. Ghouston (talk) 02:36, 28 June 2020 (UTC)

Wikidata Vandalism Dashboard

I don't know where I can report this, there is no info anywhere who's the author or where problems should be reported, so I'm posting it here. It's a problem with Wikidata Vandalism Dashboard. For some reason 'Entity title' column does not function properly, the whole column is empty. Wostr (talk) 21:22, 25 June 2020 (UTC)

User:Wostr: Hey, User:Lucas Werkmeister and I maintain this code. I just tried it with this url and it works fine for me. Can you elaborate more on which URL you see the issue? Amir (talk) 21:38, 25 June 2020 (UTC)
@Ladsgroup: okay, I see now that it doesn't work with &limit= in the url > 50. I always had the limit set to greater value than 50. Wostr (talk) 23:38, 25 June 2020 (UTC)
@Wostr: this commit fixes the issue. Let me know if you encounter more issues. Amir (talk) 20:11, 27 June 2020 (UTC)

Changing format of ISNI values

I've proposed to change the format in which ISNI (P213) values are stored from display format (= with spaces) to storage format (= without spaces). Please write your opinion here. Thanks, --Epìdosis 00:06, 26 June 2020 (UTC)

Please do the same for ISBN values, Google uses no dashes, we use dashes, and eventually a bot adds in the dashes. --RAN (talk) 01:55, 28 June 2020 (UTC)
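The two ISNI formats in the proposal differ only in whitespace, so converting between them is mechanical. A minimal sketch, assuming the usual 16-character form (15 digits plus a final digit or X) shown in four groups of four; it does not verify the check character, and the sample value in the usage lines is only there to show the format.

```python
import re

# Storage format: 15 digits followed by a digit or "X", no spaces.
ISNI_STORAGE = re.compile(r"[0-9]{15}[0-9X]")


def isni_to_storage(value):
    """Strip whitespace from an ISNI and check its overall shape."""
    compact = re.sub(r"\s+", "", value)
    if not ISNI_STORAGE.fullmatch(compact):
        raise ValueError(f"not a 16-character ISNI: {value!r}")
    return compact


def isni_to_display(value):
    """Format an ISNI in display form: four groups of four characters."""
    compact = isni_to_storage(value)
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))


if __name__ == "__main__":
    print(isni_to_storage("0000 0001 2103 2683"))
    print(isni_to_display("0000000121032683"))
```

The same idea would apply to ISBN hyphens, except that hyphen positions cannot be restored by fixed-width grouping.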

Property for Judged by

Does anyone know about a property for Judged by within judiciary (Q105985)? Pmt (talk) 20:56, 26 June 2020 (UTC)

P7859

I tried to add WorldCat Identities ID (superseded) (P7859) to Jacob Chanai (Q96654298) but it was unsuccessful. I have this. I could not work out what the correct value should be: viaf-233939274 is not working, and neither is np-... Can someone help? Geagea (talk) 18:02, 27 June 2020 (UTC)

@Geagea: What is the actual value that you need to add? Adithyak1997 (talk) 18:30, 27 June 2020 (UTC)
That's the point. I don't know. I tried "viaf-" and "np-" but was not successful. Geagea (talk) 18:43, 27 June 2020 (UTC)
I think it's added mostly by bot when one exists. --- Jura 18:45, 27 June 2020 (UTC)
The problem is that in Wikidata, the property Worldcat Entities ID is accessed using the domain/url https://www.worldcat.org/identities/. But for this item, the ID is actually present in the domain/url http://experiment.worldcat.org/entity/work/data/. This is what is creating the problem. I have asked the owners of the website whether they can provide an ID with the needed domain; I don't know whether that's possible. If it's not, I think the regex given for the field needs to be modified. Adithyak1997 (talk) 18:50, 27 June 2020 (UTC)

Link an item to a paragraph

Hello,

at the moment I am trying to get an overview of Wikisource and am thinking about sources and how it can be made easier to find a topic. There are some projects where the different parts do not have separate pages. This is an example. The page includes several song texts. How is it possible to show that the text exists in Wikisource? Usually there are interwiki links, but that is not possible if several topics are covered on one page. --Hogü-456 (talk) 19:38, 28 June 2020 (UTC)

The position replaces another position

Is there a diverse inverse of substitute/deputy/replacement of office/officeholder (P2098)? Or something else to show the information.

Xaris333 (talk) 20:52, 26 June 2020 (UTC)

Any ideas? Xaris333 (talk) 14:22, 29 June 2020 (UTC)

Czech and Arabic help needed -- for error fixing

Somehow, at one point, the items for the travel writer Charles H. Baker (Q5078504) and the American politician Charles H. Baker (Q96693981) got merged. I just finished separating them out, but meanwhile, the conflation has been around long enough to have propagated all over the place :-{ I've sent correction notifications to everyone I could; but it would be good to have a Czech speaker to notify Národní autority ČR, BDČZ, Obalky Knih, and Archiv výtvarného umění (one of the less obvious errors in those is stating that he resided in and died in Clifton, Kentucky, USA when in fact it was Clifton, Bristol, UK). Also could an Arabic speaker please fix arzWiki. Since it's Wikidata's fault in the first place ... Levana Taylor (talk) 06:23, 28 June 2020 (UTC)

Also, if Q5078504 is a conflation, it shouldn't become an item about an individual. --- Jura 09:43, 28 June 2020 (UTC)
@Meno25: for arzwiki point --Alaa :)..! 10:29, 28 June 2020 (UTC)
Q5078504 was originally Charles, and linked two Wikipedia articles, until a bot added an identifier for James, and other bots added identifiers based on that. Peter James (talk) 12:25, 28 June 2020 (UTC)
Current status: Charles H. Baker (Q5078504) is a conflation item containing identifiers for ISNI and the four Czech sites; Charles H. Baker (Q96693981) is Charles H. Baker; James Baker (Q88218745) is James Baker. Levana Taylor (talk) 16:37, 28 June 2020 (UTC)
The abART person ID (P6844) is only about James Baker, the only error I can see is the wrong Clifton. P6656 (P6656) is also James, with the exception of the image and link to the NL CR AUT ID (P691) conflation. Peter James (talk) 18:10, 28 June 2020 (UTC)
Looks much better. Thanks for fixing it! --- Jura 09:19, 29 June 2020 (UTC)

Revision of Alexander-Briggs notation property

There is a huge mess-up with the Alexander–Briggs notation (P6432) property in the range 10_161 to 10_166, for historical reasons.

This is because an error in all tables up to Rolfsen's resulted in one knot being listed twice, in forms known as 10_161 and 10_162, prior to being identified as being the same knot. This is where it gets nasty. Some authorities just omitted the knot 10_162, but others renumbered 10_163 to 10_166 into the range 10_162 to 10_165. This has led to enduring confusion, made worse by occasional attempts to fix the problem that pile one mistake on the other. See http://katlas.org/wiki/The_Rolfsen_Knot_Table and w:Perko pair for more information.

This means that all Alexander-Briggs notations in the range 10_162 - 10_166 (and only those) are now ambiguous, and without information as to the source, worse than useless as knot identifiers.

As part of my knot consolidation project, I now propose to add a qualifier to the Alexander-Briggs name property on Wikidata, to try to ensure that we don't propagate this any more. I propose to add the string " (Rolfsen original)" to refer to the original names in the range 10_162 - 10_166, and " (re-numbered)" to the new, re-numbered names in the range 10_162 - 10_165. Please note that the Alexander-Briggs notation for all other knot articles will remain the same as before, without any suffix, since none of them need any disambiguation.

This will require a change to the property's regex, and the editing of five knot items to add the new, tagged forms of the names, which diverge for those particular knots.

The old regex is:

[0-9]+(\^([0-9]|{[0-9]+})?_([0-9]|{[0-9]+})

and my proposed new regex is:

[0-9]+(\^([0-9]|{[0-9]+})?_([0-9]|{[0-9]+})(?:| \(Rolfsen original\)| \(re-numbered\))

Does this seem OK to people?

  Notified participants of WikiProject Mathematics

-- The Anome (talk) 12:53, 28 June 2020 (UTC)

  • Could you add a few sample values this is meant to match? I'm not sure what { } in there will match.
    Literal text in tex/math would generally be written \text{blabla}. However, if there are just different criteria to determine an expression, one would generally use a qualifier for that. --- Jura 13:24, 28 June 2020 (UTC)

Sure. And you're absolutely right, I hadn't thought the markup through.

That would then make the new regex this:

[0-9]+(\^([0-9]|{[0-9]+})?_([0-9]|{[0-9]+})(?:| \\text\{\(Rolfsen original\)\}| \\text\{\(re-numbered\)\})

which is truly horrible.

Trying again:

  • Unaltered:  ,  ,  
  • Rolfsen:  
  • Renumbered:  

Yes, I've thought of using qualifiers. The problem is that there are so many different sources, and everything gets fuzzy once you've gone beyond the original printed Rolfsen tables. Even online sources calling themselves the Rolfsen tables can contain either numbering. From my perspective, the Alexander-Briggs symbols from 10_162 to 10_166 are basically meaningless, and other unambiguous identifiers such as the Dowker-Thistlethwaite names need to be used instead, or appeal to invariants, which can be more difficult than it seems due to multiple different ways to write the invariants themselves.

Having two different properties, Alexander-Briggs notation (Rolfsen original) and Alexander-Briggs notation (re-numbered) might be another way of doing it, but also seems inelegant.

Maybe Help:Deprecation is the way to go? -- The Anome (talk) 13:42, 28 June 2020 (UTC)

Thank you! I'd also make the regex something like this:

[0-9]+(\^([0-9]|{[0-9]+})?_([0-9]|{[0-9]+})( \\text\{\([-A-Za-z ]\)\})?

to allow other tags to be added if necessary, without further overhead. The alternative is a nest of multiple statements with different cites, qualifiers and deprecation annotations; I'd like to avoid that if at all possible. -- The Anome (talk) 14:04, 28 June 2020 (UTC)
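The last proposed pattern can be tried against sample values with a minimal Python sketch like the one below. Two points are assumptions rather than the property's actual constraint: the pattern as quoted leaves the group opened before `\^` unclosed, so the sketch closes it before the `?`, and it adds a `+` after the character class so that a tag of more than one character can match. The sample values are illustrative guesses at the intended TeX-style forms.

```python
import re

# Assumed repair of the quoted base pattern: the group opened before \^ is
# closed before the ?, and literal braces are escaped explicitly.
BASE = r"[0-9]+(\^([0-9]|\{[0-9]+\}))?_([0-9]|\{[0-9]+\})"

# Optional free-text tag of the form " \text{(...)}".
# The + after the character class is an assumed addition.
TAG = r"( \\text\{\([-A-Za-z ]+\)\})?"

AB_PATTERN = re.compile(BASE + TAG)


def is_valid_ab_notation(value: str) -> bool:
    """Return True if the whole string matches the sketched pattern."""
    return AB_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    samples = [
        "3_1",
        "10_{161}",
        r"10_{162} \text{(Rolfsen original)}",
        r"10_{162} \text{(re-numbered)}",
        "10_162",  # multi-digit subscript without braces
    ]
    for sample in samples:
        print(sample, is_valid_ab_notation(sample))
```

Because the tag is free text, any string of letters, spaces and hyphens in parentheses would be accepted, which is the flexibility the proposal asks for but also means typos in a tag would not be caught.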

  • This "Rolfsen original" and "re-numbered" should probably be qualifiers: 1) they are not part of identifier; 2) they are language-dependent. But I do not know which qualifier to use: "knot smth" "Alexander-Briggs notation" "10_163" "???" "original Alexander-Briggs notation". Isn't there a necessary property for this qualifier? Something like applies to part (P518) Wikisaurus (talk) 14:06, 28 June 2020 (UTC)
    • This is why I'm a bit perplexed. The A-B notation is arbitrary, as it's based on the order that various knot tabulators originally wrote these knots into their tables. But at least they were unambiguous. Even when the Perko pair was found, that didn't really break the system. But the resulting mess from the various efforts to try to fix it has made the A-B notation almost useless -- but for only six knots! For everything else, it's still a useful system, at least up to the limits of the older tables. I'm working on adding the D-T names to Wikidata, which should at least give us a systematically derived unique key for these entries, and is extensible to all knots, known or future. But I'd like at least to clear the A-B notation mess up while I'm tidying the rest of the knot properties. (See my property proposal Wikidata:Property proposal/Dowker-Thistlethwaite name.)

      I'm now inclining more to using qualifiers and references, since it is only six knots, and if I use the "deprecated" and "conflation" qualifiers, together with lots of cites, that might be enough. -- The Anome (talk) 15:43, 28 June 2020 (UTC)

      • If P6432 was an identifier, this would definitely need to be a qualifier. However, if it's just a way to describe the knot, including it in one way or the other in the string itself doesn't seem problematic to me. As it's about knots, I can't really tell ;) --- Jura 09:23, 29 June 2020 (UTC)

Duplicate children

I have been noticing duplicate children in many of the genealogical trees generated at Commons, and find the error comes from Wikidata. Some people are listed twice as a child because they also appear in The Peerage. The problem is that after merging duplicates they still appear later as a duplicate child= at each parent. One entry links to the redirect and the other is the record that survived the merge. Is a bot supposed to remove the duplicate after a merge? --RAN (talk) 22:17, 28 June 2020 (UTC)

This was done by User:KrBot, but it is currently blocked.--GZWDer (talk) 09:59, 29 June 2020 (UTC)
The bot was unblocked yesterday (link), so there's a good chance this will start up again soon. Vahurzpu (talk) 14:30, 29 June 2020 (UTC)

Wikidata weekly summary #422

Preferred Rank for P31

During a discussion of a SPARQL query the question came up whether P31 should have "Preferred Rank" or not. We found a few instances such as Cremin (Q57119), which has preferred rank locality (Q3257686) and normal rank municipality of Switzerland (Q70208). It is not clear to me why one should be more important than the other. Therefore I would like to ask whether there is a consensus regarding the use of "Preferred rank" for instance of (P31). Should this be used if there is a clear primary meaning of an item? Does this help external tools to reason about the item? What do you think? Best regards --Hannes Röst (talk) 18:15, 24 June 2020 (UTC)

  • There are cases where external tools and data-users want to have a truthy answer for "What's the instance of (P31) of an item?" Setting the "Preferred rank" helps to provide the best answer to the query. Generally we don't want truthy answers to return values that aren't true at the moment the query is made. Today, Cremin (Q57119) isn't a municipality of Switzerland and thus it's better when the truthy answer is that it's a locality. ChristianKl 18:26, 24 June 2020 (UTC)
Yes I see that point, but it is difficult to say what is really the "best truth" here. It may just as well be a truthy answer to say that Cremin (Q57119) is a former municipality of Switzerland (Q685309) (and may even provide more information) since that implies that it's a locality but also provides information that it used to be a municipality. I just wanted to get a general opinion on whether P31 should have a preferred rank at all or not. I can see instances where it makes sense to describe an item that is well known for quality A as "preferred" when it has a less well known quality B. E.g. Geneva (Q71) is described as "city in Switzerland and capital of its canton" and for automatic tools to provide the same answer, it should have preferred rank city of Switzerland (Q54935504) and cantonal capital of Switzerland (Q14770218) while the other P31 of college town and municipality of Switzerland should be less important. Btw: is it possible to set two "preferred ranks" as in this case? --Hannes Röst (talk) 19:13, 24 June 2020 (UTC)
Truthy answers to questions always give one or zero answers and strip away qualifiers.
The concept of ranks is quite central, and not using it for instance of (P31) would make the way statements affect queries hard for users to understand. It's possible to set multiple preferred ranks but the truthy value is still not both results but the first. Having only one value with the preferred rank makes it easier to understand what's likely returned by a query.
former municipality of Switzerland (Q685309) is a bit internally contradictory given that it subclasses municipality of Switzerland (Q70208). I think there are good reasons against relying on "former X"-items for instance of (P31) claims. ChristianKl 19:43, 24 June 2020 (UTC)
@ChristianKl: Perhaps I'm misunderstanding what you're saying, but it's not true that truthy queries only bring back at most one result: they bring back all the results at the highest rank. See, for example,
SELECT ?location ?locationLabel WHERE {
  wd:Q42442324 wdt:P276 ?location.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
This currently gives three results. --Oravrattas (talk) 21:24, 24 June 2020 (UTC)
It does seem that I remember it wrongly. https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Truthy_statements seems to be the actual definition. ChristianKl 21:39, 24 June 2020 (UTC)
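The selection rule discussed in this thread (a truthy query returns every statement of the best available rank, not just one) can be sketched in Python as follows. This is an illustration of the rule as described on the linked page, not Wikibase code; the `Statement` class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Statement:
    """Hypothetical stand-in for one statement of a property on an item."""
    value: str
    rank: str  # "preferred", "normal" or "deprecated"


def truthy_values(statements: List[Statement]) -> List[str]:
    """Return the values a truthy (wdt:) lookup would expose.

    If any statement has preferred rank, only the preferred ones are
    returned. Otherwise all normal-rank statements are returned.
    Deprecated statements are never returned.
    """
    preferred = [s.value for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s.value for s in statements if s.rank == "normal"]


if __name__ == "__main__":
    # The Cremin (Q57119) situation described above.
    cremin_p31 = [
        Statement("locality (Q3257686)", "preferred"),
        Statement("municipality of Switzerland (Q70208)", "normal"),
    ]
    print(truthy_values(cremin_p31))
```

With two preferred statements, both values are returned, which matches the three-result query quoted above where all statements share the same rank.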
Often this is a problem, but not always. I think Geneva (Q71) is a good example of being multiple things and we often see this in articles where the village and the political entity (municipality) have a single entry and also a single Wikipedia article. And I tend to agree that using ranks for P31 is probably a bad idea in general, since the ranking system seems to be used for cases where there are multiple entries and one of them is clearly the most up to date and truly correct one, while for P31 it is almost never the case that one "instance-of" is more recent/more true than another. It seems the situation for P31 is like the children of Barack Obama: neither one is more correct than the other. This link also seems to indicate that correct historic information should not be indicated with rank: This does not apply to correct historical information, such as previous values of a statement, as long as they represent accurate information for the indicated time period. Such statements should instead be annotated with the appropriate start time (P580)/end time (P582) qualifiers. I would agree that a clarification in Help:Ranking would be a good idea. --Hannes Röst (talk) 18:51, 29 June 2020 (UTC)
PS: due to the aforementioned issues with villages and political structures, I don't think that locality (Q3257686) should be removed from Essert-Pittet (Q68715) since municipality of Switzerland (Q70208) is orthogonal to being a village: sometimes a municipality contains multiple villages and we therefore cannot assume that municipality of Switzerland (Q70208) *implies* locality (Q3257686) necessarily. --Hannes Röst (talk) 18:51, 29 June 2020 (UTC)

Books from the Biodiversity Heritage Library

A mass upload of books from the Biodiversity Heritage Library, to Commons, is underway. Please assist in categorising them, and linking them or their categories to Wikidata items. They may also be useful as a source of illustrations, for species and other subjects, where we currently have no image. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:12, 29 June 2020 (UTC)

@Pigsonthewing: Are the pdfs being systematically linked from Wikidata? Jheald (talk) 15:24, 29 June 2020 (UTC)
Not that I'm aware of. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:03, 29 June 2020 (UTC)

Taxon question (Marah oreganus/oregana)

I hesitate to weigh in on anything about taxons, because I've seen a big fight here almost every time it is opened up, but we have three wikipedias linked from Marah oregana (Q17430120) and three others, plus Commons, linked from Marah oregana (Q3845272). That cannot be good: they all refer to the same species. In the case of Commons, Commons' own label actually matches Marah oregana (Q17430120). User:MILEPRO indicates in a discussion on Commons that oregana is accepted by Catalogue of Life, IPNI, Kewscience, and Tropicos. I have no idea which sources support the other taxon name, but presumably there are some. Could someone who knows far more about this than I do please sort this out? - Jmabel (talk) 15:40, 29 June 2020 (UTC)

The gender problem: „although Kellogg originally (1854) used masculine gender, he later (1863) corrected this to feminine, which is the classical gender of the Hebrew name“. I merged the items. --Succu (talk) 16:30, 29 June 2020 (UTC)

The two items regard respectively the taxon Serpentes and "snake", that is the vulgar name of the taxon. I think they should be merged.--R5b43 (talk) 00:41, 30 June 2020 (UTC)

  WikiProject Taxonomy has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. ^^ --Liuxinyu970226 (talk) 02:44, 30 June 2020 (UTC)

Estate appraiser and consultant

I am trying to find two items. I have translated the Greek words of my source. The translations are:

  • real estate consultant
  • real estate appraiser

Do you know if we have items for them? I have found

but I don't know if they are the same (I think they are not the same)

Xaris333 (talk) 15:28, 29 June 2020 (UTC)

Judging by enwiki articles, real estate agent (Q519076) may be about estate agents / brokers in the United States, and estate agent (Q16148831) about estate agents / brokers in the United Kingdom. Most other language Wikipedias have linked with real estate agent (Q519076) and ignored the other. "Real estate consultant" may be the same as an agent / broker; "real estate appraiser" is probably appraiser (Q10855106), but specifically for real estate. Ghouston (talk) 08:09, 30 June 2020 (UTC)

Deprecated Rank

I have come across a situation in Hans Krüsi (Q931779) where I am pretty confident that the correct place of birth is Zurich, Switzerland (listed in a semi-official biography). However, GND lists Speicher as place of birth. In a situation like this, should we ignore GND or should we add the wrong place of birth with deprecated rank? Adding deprecated rank seems correct; the question is whether we want to add incorrect information on purpose just so that we can label it as incorrect, or just leave it be and list the correct information. --Hannes Röst (talk) 18:02, 30 June 2020 (UTC)

In that situation list both as valid items and provide a reference for each. Then set the one you believe is accurate to "preferred" rank. "Deprecated" should be used only where there is clear evidence of an error, not where you have a personal view of the validity of one source over another. I have had plenty of examples where I have found a statement I thought was wrong is later supported by more reliable sources. From Hill To Shore (talk) 18:30, 30 June 2020 (UTC)
GND is just a tertiary resource. There are few complex ways to identify the actual reference they used.
In the meantime, you could set it to deprecated with the reason for deprecated rank (P2241)=possibly invalid entry requiring further references (Q35779580) or cannot be confirmed by other sources (Q25895909) --- Jura 19:06, 30 June 2020 (UTC)

Merge request

Hi I just created Huize Ivicke (Q96758035) after creating an English language wikipedia page for the monumental Huize Ivicke but when entering values found the place had already been created with no name at Huize Ivicke (Q18774472), presumably when someone was creating Qnumbers for all Dutch monuments. Can the two Qnumbers be merged? By the way, there isn't a relevant Dutch wikipedia page yet. Thanks Mujinga (talk) 19:01, 30 June 2020 (UTC)

@Mujinga: Yes, see Help:Merge for instructions. Vahurzpu (talk) 19:09, 30 June 2020 (UTC)
@Vahurzpu: Unfortunately I did it. I will try to make users do it next time. Adithyak1997 (talk) 19:13, 30 June 2020 (UTC)
Well thanks to both in any case Mujinga (talk) 19:17, 30 June 2020 (UTC)

How to sync a Wikidata query to a Google Sheet?

Hi all

There are a lot of amazing data visualisation websites and pieces of software which can sync to data from Google Sheets; they're able to fetch the data from the Sheet so the visualisation stays up to date. In the example of Carto, if the Google Sheet changes, the Carto map changes. I created this map of GLAMs around the world which links to the Wikidata item (the colour is based on when the data was added to Wikidata).

 

Currently the only way I'm aware of being able to do this is to run a query, download the .csv and upload that to Google Docs which then syncs to the software.

  1. Run query
  2. Download query results
  3. Upload query results to Google Sheet
  4. Carto map syncs to Google Sheet

My question is: is there a way of syncing the query to the Google Sheet so the whole process is automated, including any new items that appear in the results? This would mean the process would be:

  1. Query runs
  2. New query results replace the old ones on the Google Sheet
  3. Carto uses the Google Sheet to update the map

Also, if this is possible, what would be the best way of doing it?

Thanks

--John Cummings (talk) 12:25, 29 June 2020 (UTC)

@John Cummings: I have created a Google Sheet that fetches any updates made in Wikidata. I have also automated it: it checks for updates every 10 minutes and refreshes if there is a change. Here is the link to that Google Sheet, created for fetching COVID-19 updates in India.--❙❚❚❙❙ JinOy ❚❙❚❙❙ 13:48, 29 June 2020 (UTC)
@Gnoeee: this is really amazing, thanks so much, myself and @NavinoEvans: will go through it and work out how it's done and write up some instructions for others to use. Would it be ok to come back to you with questions? --John Cummings (talk) 12:13, 1 July 2020 (UTC)
@John Cummings: Yup sure. Let me know if you have any questions. I'll be happy to help :) -❙❚❚❙❙ JinOy ❚❙❚❙❙ 12:40, 1 July 2020 (UTC)
@Gnoeee: could I ask you to explain how it works, the different parts of it, what to read to learn how to get each part working etc? I'm going to put together a guide with Nav on how to do this kind of thing but first we need to learn ourselves :) Thanks very much --John Cummings (talk) 22:42, 3 July 2020 (UTC)
@John Cummings: I have added a new sheet, 'Documentation', in the GSheet that describes the work to be done to get a clean result. I have also added comments to each step that will be useful for creating a similar new sheet. Ping me if you have any questions. :) -❙❚❚❙❙ JinOy ❚❙❚❙❙ 20:30, 4 July 2020 (UTC)
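For readers who want to script the "query runs" step themselves, here is a minimal Python sketch, independent of the Google Sheet linked above, that fetches query results as CSV from the Wikidata Query Service. The endpoint URL and the `text/csv` Accept header are the standard ones for that service; the function names and the User-Agent string are hypothetical, and writing the rows into a Google Sheet is deliberately left out because it depends on which Sheets client or Apps Script setup is used.

```python
import csv
import io
import urllib.parse
import urllib.request
from typing import Dict, List

ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical identifier; replace with your own tool name and contact details.
USER_AGENT = "example-sheet-sync/0.1 (contact: example@example.org)"


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks the query service for CSV results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "text/csv", "User-Agent": USER_AGENT}
    )


def parse_csv(text: str) -> List[Dict[str, str]]:
    """Turn the CSV response body into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rows(query: str) -> List[Dict[str, str]]:
    """Run the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        return parse_csv(response.read().decode("utf-8"))


if __name__ == "__main__":
    example_query = """
    SELECT ?location ?locationLabel WHERE {
      wd:Q42442324 wdt:P276 ?location.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """
    for row in fetch_rows(example_query):
        print(row)
```

Running a script like this on a schedule and replacing the sheet contents with the returned rows would cover steps 1 and 2 of the automated process; step 3 is then handled by Carto as before.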

term length of office

Hello. How can I add the information that the term length of office (P2097) of Attorney General of Cyprus (Q19241145) is until the age of the retirement of the person? (Until the 68th birthday of the person in that case). Xaris333 (talk) 20:38, 26 June 2020 (UTC)

Any ideas? Xaris333 (talk) 14:22, 29 June 2020 (UTC)

You could use it with unknown value and a determination method (P459) qualifier saying "mandatory retirement age" or similar? Andrew Gray (talk) 18:26, 30 June 2020 (UTC)
@Xaris333: Actually, if you look at the examples over at term length of office (P2097), you'll find an example for exactly that situation. Life tenure lasting until a predefined retirement age is included in the definition of life tenure (Q5365387), see for example Associate Justice of the Supreme Court of the United States (Q11144). Circeus (talk) 19:00, 7 July 2020 (UTC)

Turkish names

In Turkish we have 29 letters, of which some are somewhat "different" from other Latin script languages. In several languages letters like these may be accepted as letters with diacritical signs, in Turkish script no. They are "letters" as such for their proper right. These "different" letters are: ç, ş, ı (besides i), ü, ö and ğ. There are many Turkish proper names in which these letters are used. We cannot use one or the other of these letters in a "manipulated" way like c instead of ç or s instead of ş or g instead of ğ etc. This is not only against the alphabet itself and a lack of respect to the people who have those names as part of their identity but also a serious mistake, because we have many names which "look like" each other but they are different. A few examples of these similar but different names, most of which are both given and surnames:

<<Ergun and Ergün, Tuncer and Tunçer, Tuncel and Tunçel, Gülsen and Gülşen, Aysen and Ayşen, Sengül and Şengül, Sanlı and Şanlı, Sina and Sına, Senay and Şenay, Ersan and Erşan, Erksan and Erkşan, Nursen and Nurşen, Umran and Ümran, Seval (lit. "love and marry") and Şeval (springtime) (or Şevval / 10th month of the Islamic lunar calendar) etc.>>

Some of these are not only formal differences but also the meanings are different. For example Sengül and Gülsen m/l mean "laugh, smile" (imperative) as a "wish" for her future when a baby girl is born. However Şengül or Gülşen mean "gay rose, happy rose, cheerful rose" m/l. (The differences between the meanings of Seval, Şeval and Şevval are mentioned above already.) How can one call somebody with a name that means something different than the name of the person? Please do not create names in languages you do not know. "Names in this country" or "Names in that country" internet pages are about names used in those countries, not pertaining to those "languages". Therefore please do not invent a name and add it to an irrelevant language. Say, Netherlands or Germany or the States may be countries where people from many other places of the world live, but that does not make "Jian-shu" (I am inventing the name, coincidences may occur) an English name if it is Japanese, Chinese or Korean or whatever. Once I met an American army official with a Japanese face and name. When we had some chat, and I dared to ask him about his name, his answer was: "It's a Japanese name attached to an American citizen". Very correct. It was not an American name nor -less- an English name. Continues to be a Japanese name. As I added an anecdote to this chat, to give a more "human" character to an internet interaction, let me also recall that the Turkish diplomat and writer (PBUH) Ergun Sav wrote in one of his memoirs how he disliked it and reacted when people called him "Ergün" or wrote his name wrongly. We cannot have "Ergun and Ergün" as one same name. They are not. Please let us all treat such sensitive things with more care and in case of Turkish names please ask me in case of any hesitation.

Many thanks and regards. E4024 (talk) 16:29, 29 June 2020 (UTC)

Hi, thanks for your post but it unfortunately sometimes happens that a name is transcribed into another language in a way that does not capture the full original meaning. Therefore an entity may be known by a name in the foreign-speaking world that is different from the name it is known in its home country. For example, Vladimir Putin (Q7747) is correctly written as Владимир Владимирович Путин but is known in the German speaking world as "Wladimir Putin" and in the English speaking world as "Vladimir Putin" and in the French speaking world as "Vladimir Poutine". This is why we have name in native language (P1559) to indicate the correct name in the native language. I think something similar happens with Turkish names and entities, these are known in different countries by different names and it makes sense to populate these fields accordingly (otherwise a German speaker would never find what they are looking for if name in native language (P1559) did not have an entry for "Wladimir Putin"). I think we can solve your problem by using name in native language (P1559) where appropriate to make sure the correct native name is used, what do you think? --Hannes Röst (talk) 15:56, 30 June 2020 (UTC)
Thanks for taking the time. However, did you read the part between << and >>? These similar names in Turkish are so very many. BTW in making name entries we have "Latin script" and "Turkish alphabet" choices and when I began to work on these names I asked a very very active WD user which one to use and s/he indicated the Latin script. Then why do we have the "Turkish alphabet" as a choice? How do we separate Gülşen and Gülsen and hundreds of other similar names? In the end we are talking about a language which has an adapted Latin alphabet (like many other languages, Spanish, Portuguese, German, etc.) and not Russian or Chinese. IMHO P1559 should not be for Latin script languages like Turkish and I only correct the spelling of a handful of languages, I am not interested in German or Dutch, I am interested especially in English, the basic language here and I almost always add, as "alias" some easier form, as in the case of Beril Böcekler (Q56173992). Look, maybe I was not very clear, I was speaking about "names" not "people". See Tuncer (Q32945156); it is not the "name without diacritical marks" of another name. "Tuncer" is one Turkish name and "Tunçer" is "another" Turkish name! What is so difficult in getting this? If a person's name is Tunçer Öztunç you may write it as an "alias", "Tuncer Oztunc", -in languages other than mine, Turkish- but the "label" is Tunçer Öztunç because that is the name of the person, and an important part of his/her identity. And the label of Tuncer is Tuncer and that of Tunçer is Tunçer and the names in Latin script do not need aliases nor can be claimed as "a version without diacritical mark", because they are not. Let me make this clearer: I did not come here for "aliases" of people's names. I came here for "Turkish proper names" (before being added to anyone).

BTW I somehow feel a prejudice against my language but I prefer to keep to myself some disruptive edits and attitudes I observed and noted by certain people, until the point when I will get really very annoyed. Keeping everybody else apart, I even suspect a lack of goodwill on the part of some people. If not, how, say, a name like x (put any language name here, even written in other alphabets) could appear here as "Dutch"?! Please. E4024 (talk) 18:27, 30 June 2020 (UTC)

I am sorry if I don't understand your problem completely; but we have similar issues in German as well (a word with and without ö may have different meaning, for example schön is different from schon). I think it would also help to give an example where the problem you are describing occurs. For your example of Tuncer, there are two different items Tunçer (Q71888499) and Tuncer (Q32945156) that describe the two different concepts (two different names) and as you have indicated above, these are distinct names. So you are able to model this in Wikidata and as you describe the label for an entity Tunçer Öztunç would be correctly given as Tunçer Öztunç in Turkish and possibly a translation of that in another language (similar to my example of Vladimir Putin).
name in native language (P1559) does not presume that the name is in a different script and is used on examples such as George Washington (Q23) which are English names but of course also have their names listed in Turkish (George Washington), Russian (Джордж Вашингтон), Sanskrit (जार्ज वाशिंगटन) and many other languages. In order to make it clear what the name is in the native language (here English), name in native language (P1559) is used and listed as "George Washington (English)". As you can see, usage of name in native language (P1559) does not make any assumption about one script being preferred to others, it simply states what the name looks like in the native language. I hope that helps.
For your example Beril Böcekler (Q56173992) I think everything is in order here, the Wikipedia articles in different languages list her as Beril Böcekler (en,ca,it,pt,tr) and so that should be the label used, adding an alias Beril Bocekler probably helps since this is the name used by Anadolu Agency (the state-owned Turkish publications in English) and therefore should be added since she is "also known as" Beril Bocekler at least by Anadolu Agency and also some others.
To answer your question: "How do we separate Gülşen and Gülsen and hundreds of other similar names? " -> I would say in the same way as for Tuncer, we create two different items Gülşen (Q70501229) and Gülsen (Q83354020) and thus describe both names. You can make it more clear for non-Turkish editors by using different from (P1889) to indicate that these are two different names. This means that Zekiye Gülsen (Q15731569) links to Gülsen (Q83354020) and Olcay Gülşen (Q3674593) links to Gülşen (Q70501229). The wikidata model even allows you to create an item for Gülşen (Q13435177), the female first name which is carried by Gülşen Değener (Q29722) and is of course different from both Gülşen (Q70501229) and Gülsen (Q83354020), the two last names. Does that work for you?
Does this solve your issue or is there an example where a problem still occurs that you cannot solve? --Hannes Röst (talk) 22:06, 30 June 2020 (UTC)
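The point that such names are distinct strings, and that they only collide once diacritics are stripped, can be illustrated with a small Python sketch. The function names are hypothetical and the folding shown (Unicode NFD decomposition followed by dropping combining marks) is just one common way tools produce "ASCII-like" forms; it is not how Wikidata itself compares labels.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List


def strip_diacritics(name: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def colliding_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group distinct names that become identical after folding."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[strip_diacritics(name)].append(name)
    return {folded: group for folded, group in groups.items() if len(group) > 1}


if __name__ == "__main__":
    names = ["Gülşen", "Gülsen", "Tunçer", "Tuncer", "Ergun", "Ergün"]
    # As plain strings the names are all different ...
    print(len(set(names)) == len(names))
    # ... but folding maps each pair onto a single form.
    print(colliding_names(names))
```

Note that the dotless ı has no Unicode decomposition, so this particular folding leaves a pair such as Sına and Sina distinct; other folding schemes treat it differently. Keeping separate items linked by different from (P1889), as suggested above, avoids relying on any folding at all.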
(Look. I have, by now, enough experience with WPs, WD and Commons, where I was asked by several people to run for admin. I also had the bad luck in my first times to be expelled from three WPs for stupidities made due to lack of experience. Just like my first impression in those platforms, I still believe not everybody tries to do things correctly in these realms but there are people with agendas or "ego"s. Only now I know how to avoid them and make my way through sharing knowledge without problems, at least most of the time. All this may not mean anything to you but I am sure someone will read and get a message; someone that had been blocked and left the place for a time, and I went to call them back. Now I am being rewarded by hits below the belt all the time. All this in parentheses. Forget this part.) Regarding our case, indeed it is me who is trying most to reflect these subtleties of the Turkish language -those that you and I refer above- to WD in the most correct way possible. Only within the last hours I opened both surname Saracoğlu (Q96761116) and surname Saraçoğlu (Q96761094). Probably some of those names I referred <<above>> as being different names in Turkish were also opened or developed (including their Commons categories) by me. The problem is not about people themselves (their names), of course we can write them in different forms; the problem is some people are insistently opening "wrongly spelled Turkish name items", say, if a name has ğ or ç in it they rush (literally) to open the wrong-titled item before I make the correct one. How do I know that? Simple: If some people are only opening items about Turkish names with those "strange" letters but making the items "without the strange letters" and if I have noticed that this activity has begun or sped up since we had a clash and if the same party has not shown any interest in Turkish names which do not carry those strange letters, what should I think? 
I am happy to see and hear that there are other people like you guys working here, because somehow I only see the one same person patrolling around and trying to impose their choices on Turkish names. Take a simple example, without giving either the user name or the title of the item in question: Let's suppose there is a Turkish or German, uhm no no better a Spanish name with a strange letter, say "Calíope" and this person imposes "Caliope" on the item of a Spanish female whose name is "Calíope". (S/he opens an item for the wrong name or uses one that is there and should be deleted.) For better understanding, even the surname of "Calíope" is Gonzales Fernandes. When you try to correct the wrong first name without accent (which was BTW deleted by an admin, but somehow revived upon request) this colleague of ours adds to the item the wrong name and a property that I had never seen before, something like "unidentified name" or "unknown language name" (I really forgot the term, and prefer not to go back there to see, because it gives me a stomach-ache) rather than putting that small accent where it belongs and recognizing that it is a Spanish (or Turkish, or German) name/person. The surname is not important for them either. Do Gonzales and Fernandes sound Spanish to you? No, they are "unidentified language". Well, this is more or less my problem. A continuous production and/or defence of wrongly-spelled Turkish proper names... I must confess that I recognize my behaviour sounds like I doubt the goodwill of a colleague. I hope I don't. However this is not a place to satisfy personal egos. "First name" means the name your parents gave you; and "family name" means the name your family proudly carries. We cannot invent new names because the press or official organizations of this or that country (generally the Netherlands, somehow) write those names with different letters.
The name items must be written just like they are in the original language; writing the names of people differently in different languages, even if the person himself/herself does that, s/he throws away the strange letters, for personal choice or for practical reasons, has no importance at all. Her/his name item in WD is only one and should be with the unwanted strange letters. I see that you people understand and agree with me, and I needed to know that there are such people in WD. I'm not alone. Thanks. E4024 (talk) 05:03, 2 July 2020 (UTC)
It seems like we agree on most things and I hope that most people here would agree that a large goal of this project is to ensure that we are cross-language compatible and not just focused on English. However, "The name items must be written just like they are in the original language; writing the names of people differently in different languages, even if the person himself/herself does that", is contrary to the Wikidata project idea, which is to write names, places and concepts using the language and script that is understandable across the world, see my example on Vladimir Putin (Q7747). See also Help:Label, which clearly states that the label in a given language should be the one that is most common in that language. I hope that solves your question. --Hannes Röst (talk) 03:25, 5 July 2020 (UTC)
Of course "It seems like we agree on most things", indeed I agree on most things with everybody who tries to do good voluntary work in goodwill (like yourself). I have changed eye-glasses several times trying to help with WD, Commons and WPs (look at my global edit counts). However, there are things I am not capable of understanding. Let me give an example, but please take the names as "imaginary": Let's suppose you open an article on a person called Hörst Something in DE:WP. We have no WD item for the person, nor do we have an item for the given name Hörst. A bot opens the item as "Horst" instead of "Hörst", and some user, as if he/she cannot see that the article in DE:WP says the guy's name is "Hörst", literally "rushes" to open a "given name item" for HORST (without umlaut). Orrrr, there is a Turkish lady with the name Tuğba Göçel. In 10 different language WPs she has an article with her name spelled "correctly". As there are no (still let us suppose :) WD items for her name or surname, this same user rushes to open one item for "Tugba" instead of "Tuğba" and another for "Gocel" instead of "Göçel" even though all the Wikipedias use the spelling with ğ, ö and ç. This is the part I cannot understand. Danke schön. --E4024 (talk) 03:55, 5 July 2020 (UTC)
Since we agree on most points, I don't think this discussion leads anywhere unless you are able to name a few examples of where you think something went wrong and where you would like to improve things. Also, I don't think it's helpful to call out all of the project as being biased or prejudiced, since most people here are committed to collecting and representing information as correctly as possible. Best --Hannes Röst (talk) 20:28, 6 July 2020 (UTC)
  • @E4024: I haven't been completely following this, but it seems like one or more people or bots have been acting incorrectly regarding Turkish names? If they are not listening to you, please bring the specific complaint to the Administrators' noticeboard, as Hannes Röst suggested above. It's hard to address the problem if we don't have a bit more detail on who has been doing things wrong. ArthurPSmith (talk) 18:47, 7 July 2020 (UTC)

Public user name for YT channel

Why can't I use Property:P554 on Property:P2397? — Alexis Jazz (talk or ping me) 07:12, 30 June 2020 (UTC)

@Alexis Jazz: I think it might be because 'public username' can only be applied to items having the property 'website account on'. Adithyak1997 (talk) 09:47, 30 June 2020 (UTC)
@Adithyak1997: Okay, confusing. I don't know how this should be solved. Maybe a new property for "YouTube channel name" needs to be created. I think some channels don't even have an associated username anymore. — Alexis Jazz (talk or ping me) 10:06, 30 June 2020 (UTC)
@Alexis Jazz: Since you mentioned that it's 'confusing', please wait for a more experienced user to reply. It's actually mentioned in this page too. Adithyak1997 (talk) 10:16, 30 June 2020 (UTC)
@Liuxinyu970226: I'm not so sure that actually solves the issue? Though there should probably be a property for YT user ID as well. The way I understand (but I could be wrong), one user may have multiple channels. A channel does not need to have an associated user, at least, not publicly. As Property:P2397 says: "ID of the YouTube channel of a person or organisation (not to be confused with the name of the channel)", we can add the channel ID (for example UCgIIsBhcseFH1Kghmo0ULbA) but the human readable name of that channel can't be added. That's acceptable when the channel name is (nearly) identical to the label of the Wikidata item, but in other cases it isn't. — Alexis Jazz (talk or ping me) 00:30, 8 July 2020 (UTC)

Deletion policy

What exactly are the rules here for deletion? Can admins just delete non-vandalism items for notability reasons without listing them on the requests for deletion page? Is there no process that has to be followed? Looking through the logs I'm seeing a lot of deletions of things that look potentially notable to me that I don't think ever touched the RfD page for discussion. In particular I'm looking at @MisterSynergy:'s recent deletions. BrokenSegue (talk) 23:27, 29 June 2020 (UTC)

  • There are the Wikidata:Deletion policy and the Wikidata:Notability policy. Neither of them requires a listing on a certain page (such as WD:RfD) prior to a deletion.
  • Wikidata:Requests for deletions is not predominantly a discussion page either, and cases listed there can be processed immediately with no minimum "discussion period". Its main purpose is to bring items to admin attention; discussions only emerge in complicated situations.
  • I look for items with problems regularly, and have several worklists to collect them. Usually I delete something on the order of 2000 items per week; most of them are not listed on any project page before I delete them. Currently around 0.1% of my deletions (~1 item per 1000 deletions) are being restored due to user requests.
  • You can always ask for undeletion in case you think there is something to add to an item that was deleted. The undeletion process is as unbureaucratic as the deletion process.

MisterSynergy (talk) 06:55, 30 June 2020 (UTC)

A low undelete rate doesn't mean there's a low error rate. I'm not going to spend energy requesting undeletion for these things. That said, I don't see the value in many of these deletions and I can't even evaluate them for undeletion given I can't see deleted edits. I might take a look at your SPARQL queries and make sure I mark up items I create such that they won't show up in them. I just wish there were any oversight or notification here, e.g. an admin flags an item and then a bot deletes it after 48 hours. BrokenSegue (talk) 07:45, 30 June 2020 (UTC)
My focus is usually on items that have lost all sitelinks (via User:Pasleim/Items for deletion/Page deleted and archives), or empty items (i.e. items that do not contain any statements or sitelinks). I collect some of the queries on User:MisterSynergy/sysop/items for deletion, which also contains stuff that I don't really use. What follows is effectively an evaluation of whether the item technically meets the notability requirements, in particular whether there is anything available *in the item* that helps identify the subject (a sitelink, identifiers (with exceptions), references on any statement, or backlinks from other items).
From the content side, there is practically everything you can imagine involved. Larger fractions of content include purely technical items (such as category items without sitelinks), but there are also quite a lot of promotional items about humans or startups; I also regularly see items with severe BLP issues that are completely unsourced.
My attempts to contact involved users in advance of a deletion were not particularly successful, as they often don't want to do anything because they have abandoned the content anyway, or don't know what to do, or forget to do something in spite of having been notified. There is also a substantial number of items which have been created based on (meanwhile deleted) Wikipedia content solely by different bots or import scripts/tools; there is simply no editor who is still interested in an improvement or who could be contacted. Also mind that there is a constant influx on the order of thousands of unrelated new cases per week.
The described procedure is quite established; I have been doing it more or less since I became an admin a little more than three years ago. A low undeletion rate of course does not mean that everything is fine, yet I think my assessments are usually correct, and otherwise I do not hesitate to undelete in most situations, as described above. —MisterSynergy (talk) 08:28, 30 June 2020 (UTC)
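The "anything available in the item" check described above can be sketched in code. This is only an illustration, not MisterSynergy's actual worklist logic: the function name `identifying_signals` is hypothetical, the item is assumed to be a dict shaped like Wikidata's entity JSON (`sitelinks`, `claims`, `mainsnak.datatype`, `references`), the backlink count is assumed to be supplied by the caller, and the "with exceptions" rule for identifiers is not modelled.

```python
from typing import Any


def identifying_signals(item: dict[str, Any], backlink_count: int = 0) -> list[str]:
    """List the kinds of identifying evidence present in an item.

    `item` is assumed to follow the shape of Wikidata entity JSON.
    `backlink_count` (links from other items) must be looked up
    separately, so it is passed in by the caller.
    """
    signals: list[str] = []
    if item.get("sitelinks"):
        signals.append("sitelink")
    for statements in item.get("claims", {}).values():
        for statement in statements:
            datatype = statement.get("mainsnak", {}).get("datatype")
            # Every external-id statement counts here; the real
            # assessment excludes some identifiers.
            if datatype == "external-id" and "identifier" not in signals:
                signals.append("identifier")
            if statement.get("references") and "reference" not in signals:
                signals.append("reference")
    if backlink_count > 0:
        signals.append("backlink")
    return signals
```

An empty result only flags an item as a candidate; as the thread makes clear, the actual decision still rests on a manual review of each item.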
In my view, deletion of items without mentioning them in Requests for deletion is not a problem. The main reason, I think, is already stated by MisterSynergy, i.e. it involves deletion of thousands of items. I think it's quite a natural action that currently occurs in all wikis that are populated, meaning the ones that have thousands of edits per day/week. Adithyak1997 (talk) 09:59, 30 June 2020 (UTC)
With proper effort it would be possible for admins to delete just as quickly while also providing authors/others the ability to object. Undelete isn't the same. BrokenSegue (talk) 17:38, 30 June 2020 (UTC)
Can you please elaborate on what this could look like? I would agree that the process could be more transparent, but whatever we do needs to scale well. —MisterSynergy (talk) 18:13, 30 June 2020 (UTC)
What we would need is a system that allows only admins to "tag" items and allows non-admins to remove those tags. Could be done via a new protected property of items with the rationale as the property value. Items with that property for more than some time would get deleted by a bot. Users could remove the property (or request a bot do so somehow) and migrate to RfD. This would seem to be just as fast as deletion but provides some window for dispute/oversight. I believe this would be similar to Wikipedia's "Proposed deletion" system. BrokenSegue (talk) 18:20, 30 June 2020 (UTC)
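A minimal sketch of the workflow proposed above, written as a decision function a bot could run per tagged item. Everything named here (`Nomination`, `Action`, `next_action`, `GRACE_PERIOD`) is hypothetical; the proposal leaves the waiting time open ("more than some time"), so the 48 hours used below is an assumption borrowed from the earlier comment in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum

# Assumption: the proposal does not fix a waiting time.
GRACE_PERIOD = timedelta(hours=48)


class Action(Enum):
    WAIT = "wait"                # tag is still within the objection window
    DELETE = "delete"            # nobody objected in time
    MOVE_TO_RFD = "move_to_rfd"  # someone removed the tag, so discuss at RfD


@dataclass
class Nomination:
    item_id: str
    rationale: str          # value of the hypothetical protected property
    tagged_at: datetime
    tag_removed: bool = False


def next_action(nomination: Nomination, now: datetime) -> Action:
    """Decide what the bot should do with one tagged item."""
    if nomination.tag_removed:
        return Action.MOVE_TO_RFD
    if now - nomination.tagged_at >= GRACE_PERIOD:
        return Action.DELETE
    return Action.WAIT
```

Checking for a removed tag before checking the deadline means an objection always wins, even if the bot only runs after the grace period has expired.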
Is there a way for a normal user to see their deleted edits? I create many items, too many to just keep an eye on them if they're deleted (or vandalised, or just extended). I have no idea how many of my items were deleted, but reading this discussion I am starting to be curious about that. Edoderoo (talk) 16:51, 30 June 2020 (UTC)
xtools has such a functionality, but it works only for users with up to 400k edits. No idea whether other tools are around, but users with more than 400k edits can make direct SQL queries such as this one. —MisterSynergy (talk) 18:09, 30 June 2020 (UTC)
If you add items to your watchlist (things I create are auto-added), you can at least see when an item gets deleted. It doesn't help with what it was or why it was deleted. But at least you'd know that something was deleted. Quakewoody (talk) 18:13, 30 June 2020 (UTC)
A watchlist helps if you create an item a day. I create up to 1000 items a day if I'm up to speed. A watchlist doesn't help anymore. I checked: in 5 years' time 3160 items were deleted, some very recently. If these were items where I added one property, label or description, then it would be no big deal for over 2.5M edits in total. But those were all items I created, so it's a complete waste of my time to share knowledge that gets deleted by others, without any notice. Edoderoo (talk) 19:38, 30 June 2020 (UTC)
I can only hope that there is something that I cannot see as a regular user that will justify the deletion of all those 3160 items that I have created. I try to spend a lot of time adding properties and sources to these properties, so for me it now feels like 3160 minutes of my life have been deleted by a sysop that didn't even want to let me know my life was shortened by 53 hours. I'm extremely pissed off right now. Edoderoo (talk) 06:50, 2 July 2020 (UTC)
Though this is more a rant than an actionable request, I have quickly looked into your deleted items. Based on random sampling, the clear majority of them have lost all of their sitelinks and did not meet the notability criteria any longer. You also seem to have created a lot of items for Wikinews articles that have been deleted; those were created with Petscan apparently, thus not really manual editing. ---MisterSynergy (talk) 07:23, 2 July 2020 (UTC)
What is the worth of notability if we create 100.000 scientists a day that only have one label, one property P31=Q5 and one OrcID filled? Edoderoo (talk) 07:31, 2 July 2020 (UTC)
Whether we like it or not, it is a valid project policy. Whataboutism does not help to undermine its applicability to other cases; if you think it should be modified, you can trigger a discussion (RfC) if you'd like to. —MisterSynergy (talk) 07:41, 2 July 2020 (UTC)
Plus, I suspect that the plan is to link them to publications in the future, with the help of the identifier. But importing such datasets takes a lot of time. --Misc (talk) 09:15, 2 July 2020 (UTC)
@Edoderoo: Where is the bot request for having 100.000 scientists that only have one label, one property P31=Q5 and one OrcID? If you are creating items in amounts where you can't track them via the watchlist, you should file a bot request to find consensus for adding large numbers of items. I don't think anybody deleted items for which there's bot approval consensus that they should exist. ChristianKl 09:41, 4 July 2020 (UTC)
As far as I know, I did not create a single scientist with the specifications you give, and I'm not talking about items created by my Edoderoobot account. And I don't plan to defend myself against accusations that are missing any base of clear sense. Sorry for that. Edoderoo (talk) 09:50, 4 July 2020 (UTC)
  • The RfD page is filled with things that should be deleted. It is so rare that anything is worth keeping, that I don't even "vote" to delete. I only say when I think something is a keeper, or if there is something others should see - like "previously deleted", "created by spammer", or "is this related to". Quakewoody (talk) 17:36, 30 June 2020 (UTC)
Which is why I put the word "vote" in quotes. Quakewoody (talk) 18:14, 30 June 2020 (UTC)
  • Would it be possible to configure an abuse filter that allows only admins to add "P:Item_to_be_deleted" and every autoconfirmed user to remove such statements? Then a bot could delete all items with such claims after 3 days. If a user deletes such a statement the bot could automatically open a thread in items for deletions. In cases where an admin deletes 1000 items / week, adding such statements should be a similar or lower amount of work than the current process. Using QuickStatements would make it easier to delete larger amounts of items.
    Having functionality like this would also allow us to make it its own right, so we could allow non-admins to use this new delete process if a non-admin wants to run cleanup processes that delete large numbers of items. ChristianKl 08:32, 3 July 2020 (UTC)
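The permission rule asked for here (only admins may add the deletion statement, any autoconfirmed user may remove it) can be stated as a small predicate. This is plain Python modelling the intended rule, not AbuseFilter syntax, and whether an abuse filter can express it is left open in this thread. The function name `edit_allowed` and its parameters are hypothetical; `sysop` and `autoconfirmed` are the usual MediaWiki group names.

```python
def edit_allowed(user_groups: set[str], adds_tag: bool, removes_tag: bool) -> bool:
    """Return True if an edit touching the deletion statement may be saved.

    adds_tag / removes_tag describe what the edit does to the
    hypothetical "item to be deleted" statement.
    """
    is_admin = "sysop" in user_groups
    if adds_tag:
        # Only admins may nominate an item this way.
        return is_admin
    if removes_tag:
        # Any autoconfirmed user (or admin) may object by removing the tag.
        return is_admin or "autoconfirmed" in user_groups
    # Edits that leave the statement alone are not restricted by this rule.
    return True
```

Extending the add right to a separate trusted group, as suggested above, would only change the `adds_tag` branch.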
    • This is probably a reply to User:BrokenSegue's proposal further up in this topic. Another similar proposal was recently made at Wikidata talk:Requests for deletions#The deletion process should be made more visible by User:Discostu, with the difference that they proposed to use the item talk page and a template instead of direct claims. That would be much simpler technically and conceptually, yet it would allow the same sort of oversight.
      The problem with all of these ideas is that there needs to be a proper notability assessment, and with both proposals it needs to be done twice (before adding the tag, and before the actual deletion), and thus it *at least* doubles the actual effort we need to invest here. In case this is not clear: I may be deleting 2000 items per week regularly, but I do not delete items just because they happen to appear in a query result; I do make a thorough check of several aspects for each individual item before I delete it. This is the part which takes the most time, not the querying or clicking the "delete" button. —MisterSynergy (talk) 09:27, 3 July 2020 (UTC)
      • @MisterSynergy: I think the actual deletion should be done automatically by a bot that mainly relies on the first notability assessment. The bot might further not delete the item if it has sitelinks or external links but it's okay if it relies for the qualitative judgement on the first deletion. That's why we wouldn't want to give the right to delete this way to all autoconfirmed users but only admins and users that we consider to be trustworthy based on being rollbackers and having participated in RfD in the past.
I think using claims on items has benefits over using the talk page because claims on items interact better with other tools. As I for example care a lot for our anatomy items I could set up a listeria page that lists all anatomy items that are nominated for deletion and put that page on my watchlist.
Putting the nomination on a talk page would also mean that we need to create and then delete a talk page for many items that didn't have a talk page before. ChristianKl 09:41, 4 July 2020 (UTC)
@ChristianKl: Right this is basically my proposal above. Plus it means that if I'm watching an item I'll get a notification before it's deleted. I'd be willing to write/maintain such a deletion bot. To me the biggest issue is that it kinda abuses the property system (and I'm unsure if it's possible to set permissions to deny adding but allow removing). Is the next step to advance this idea to write an RFC document or something? BrokenSegue (talk) 08:58, 6 July 2020 (UTC)
  • To change something that important, you need an RfC.
  • Claims (properties) are usually not meant to describe the item page, thus using them would be conceptually wrong.
  • Most items are unwatched. I don't know actual numbers, but I would not be surprised if 95+% of all items have zero page watchers. If you put properties on them (or templates on the talk page), nobody will take notice.
MisterSynergy (talk) 09:38, 6 July 2020 (UTC)
I'm also afraid that MisterSynergy's subpages are more helpful for merging (or maybe even auto-merging using AWB?) than for RfD work. --Liuxinyu970226 (talk) 12:42, 5 July 2020 (UTC)

For me it would already make quite a difference if regular users could just see deleted items. I believe 99+% of deleted items contain no information that should be hidden, and I'm confident that most sysops and most deletions are no issue. But one sysop who has time on his hands, and has the belief that "no sitelinks means no project scope", can make quite a mess before it is ever seen. Sure, I can become a sysop myself again, but that will only solve the problem for me (and would not be a fair reason to become a sysop in the first place). Edoderoo (talk) 08:02, 8 July 2020 (UTC)