Wikidata:Project chat/Archive/2020/11

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Project ping does not work?

I am a member of the Wikidata Ontology and Mathematics projects and I've seen "{{Ping project|Ontology}}" and "{{Ping project|Mathematics}}" templates used several times but have never received any notifications from these "pings." Is this feature working correctly for anybody? — The Erinaceous One 🦔 07:03, 27 October 2020 (UTC)

I don't think so. I've also noticed the same with {{Ping project|Energy}}. Rehman 07:14, 27 October 2020 (UTC)

Testing:   WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. ChristianKl 11:54, 27 October 2020 (UTC)

In addition, ping does not work if you have personalized your signature. Ayack (talk) 16:08, 27 October 2020 (UTC)
@Ayack: That doesn't make sense. I suspect you have two issues confused, but I'm not sure what problem you ran into. As long as you use the person's account name, ping should work fine. - Jmabel (talk) 01:02, 28 October 2020 (UTC)

  Notified participants of WikiProject Mathematics anyone receiving them? --- Jura 07:37, 28 October 2020 (UTC)

I've just received a notification that "Jura1 mentioned you" in this page section. -- The Anome (talk) 07:41, 28 October 2020 (UTC)
I also received a ping, and there were pings for maths project users once in a month or so. Wikisaurus (talk) 08:13, 28 October 2020 (UTC)
Ping worked for me. Lymantria (talk) 11:49, 28 October 2020 (UTC)
+1 author  TomT0m / talk page

@Rehman, ChristianKl, Jura1, Jklamo: It looks like I do receive Mathematics pings--there's just nothing in the notification that indicates it is a result of {{Ping project|Mathematics}}, so I didn't connect the dots. (It would be nice if the notification was more informative.) On the other hand, Ontology project pings are definitely not coming through. Turns out, per Template:Ping project/doc, that if more than 50 people are in a project list, then none of them receive notifications. Should we start pruning inactive members from lists? — The Erinaceous One 🦔 11:12, 28 October 2020 (UTC)

Alternatively, there is another project notification possibility, used by some projects on frwiki: a ping template adds a category to the page, and a template puts the number of pages in this category on the project main page. The category is removed when someone considers the discussion closed and adds a parameter to the ping template. This works, but requires the pingee to actively watch the project page. author  TomT0m / talk page 15:18, 28 October 2020 (UTC)

  • @TomT0m: That approach does sound a little less convenient, but the current system is simply not working for large projects--which arguably are also the most important projects--so I would support adapting that approach. Do you know if we could combine the two approaches so that notifications are sent out and the page is added to the relevant category? Seems like it shouldn't be too hard. Maybe we could even just modify the existing project ping template. — The Erinaceous One 🦔 11:22, 1 November 2020 (UTC)

Checking the general existence of a property from Lua using the Wikibase Client

Hi! I am trying to make the dewp Wikidata integration more robust and that includes checking if the passed arguments are valid. As far as I can see there is no way to check if a property actually exists (not statements for the current entity with that property, but the property in general). mw.wikibase.resolvePropertyId() seems to return the supplied property ID even if it does not exist, so using that does not work. Any idea? --Count Count (talk) 11:58, 1 November 2020 (UTC)

Use mw.wikibase.entityExists(id), it returns either true or false. Seems to be the most efficient way of doing that.--Snaevar (talk) 12:13, 1 November 2020 (UTC)
Thanks! But I have to correct myself, mw.wikibase.resolvePropertyId(idOrLabel) returns nil if the supplied idOrLabel does not exist. --Count Count (talk) 12:49, 1 November 2020 (UTC)

Papers cited on Wikispecies

There are many thousands of pages on Wikispecies, about taxonomic papers. Over 20,000 of them have DOIs. For example:

species:Template:Zhang et al., 2020h

whose content begins:

* {{a|Rui-Yan Zhang|Zhang}}, {{a|Ya-Dong Zhou|Zhou}}, {{aut|Ning Xiao|Xiao}} & {{a|Chun-Sheng Wang|Wang}}. 2020. A New Sponge-associated Starfish, Astrolirus patricki sp. nov. (Asteroidea: Brisingida: Brisingidae), from the northwestern Pacific Seamounts. ''[[ISSN 2167-8359|PeerJ]]''. 8:e9071. {{doi|10.7717/peerj.9071}}

I have just manually matched that to A new sponge-associated starfish, Astrolirus patricki sp. nov. (Asteroidea: Brisingida: Brisingidae), from the northwestern Pacific seamounts (Q96231087).

Is there some tool that can match such pages to items, based on DOIs? If not, can someone make one?

That would be a good first step, after which we can both start looking at the remaining papers, on Wikispecies, with DOIs and create new items for them; and start to apply the recent decision to templatise citations on Wikispecies. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:13, 2 November 2020 (UTC)

In Scholia we now show where a page is used in the English and Danish Wikipedia based on the DOI. So yes, it can be done. Ask the Scholia team. Thanks, GerardM (talk) 05:22, 2 November 2020 (UTC)
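
A minimal Python sketch of the DOI-based matching asked about above. It assumes that DOIs appear in Wikispecies wikitext as {{doi|...}}, as in the example citation, and that Wikidata stores DOI (P356) values in upper case. The function names are hypothetical and do not refer to any existing tool; the generated query is only built here, not sent anywhere.

```python
import re

# Matches {{doi|10.xxxx/yyyy}} and tolerates extra template parameters.
DOI_TEMPLATE = re.compile(
    r"\{\{\s*doi\s*\|\s*([^|}]+?)\s*(?:\|[^}]*)?\}\}", re.IGNORECASE
)


def extract_dois(wikitext):
    """Return the DOIs found in {{doi|...}} templates, upper-cased.

    Upper-casing is an assumption about how P356 values are stored.
    """
    return [m.group(1).upper() for m in DOI_TEMPLATE.finditer(wikitext)]


def build_doi_query(dois):
    """Build a SPARQL query that looks up items by DOI (P356)."""
    quoted = []
    for doi in dois:
        escaped = doi.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"{}"'.format(escaped))
    return (
        "SELECT ?item ?doi WHERE {\n"
        "  VALUES ?doi { " + " ".join(quoted) + " }\n"
        "  ?item wdt:P356 ?doi .\n"
        "}"
    )


sample = "''[[ISSN 2167-8359|PeerJ]]''. 8:e9071. {{doi|10.7717/peerj.9071}}"
dois = extract_dois(sample)
query = build_doi_query(dois)
print(query)
```

Pages whose DOI returns no item would be the candidates for new items; pages returning more than one item would need a manual check for duplicates.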

These items should be separated (disambig × Croatian road network), could anyone revert this merge? — Draceane talkcontrib. 10:09, 2 November 2020 (UTC)

Alexis900, Asqueladd, BeneBot*, Detcin, Dough4872, Gz260, Happy5214, Imzadi1979, Jakec, Labant, Liuxinyu970226, Ljthefro, mxn, naveenpf, Puclik1, Rschen7754, Scott5114, SounderBruce, TCN7JM, TimChen, Bodhisattwa, Daniel Mietchen, Tris T7, Izolight, Gnoeee

  Notified participants of WikiProject Roads --- Jura 14:03, 2 November 2020 (UTC)

Query shows entries which should be filtered out; number of entries in result set changes when executed repeatedly

Hello, the following query should return all German streets which have a Commons sitelink but no commonscat property (P373):

SELECT ?item ?commonscat ?sitelink  WHERE {
  ?item wdt:P31 wd:Q79007. # street (within a locality)
  ?item wdt:P17 wd:Q183.   # Germany
  ?sitelink schema:about ?item .
  ?sitelink schema:isPartOf <https://commons.wikimedia.org/> .
  OPTIONAL {?item wdt:P373 ?commonscat }  
  FILTER (!bound(?commonscat))   # only those WITHOUT a commonscat property (P373)
}

The query currently returns sometimes 22, sometimes 30 entries, depending on when and how often it is executed, although the objects have not been changed in between. Moreover, the listed objects actually do have a commonscat property, so they should not be listed at all, i.e. the result set should be empty.

For just one city (e.g. ?item wdt:P131 wd:Q61724. ) the query returns 2, 3 or 4 entries if the query is repeatedly executed.

Could this be a caching problem? Why are entries with commonscat-properties listed, while they should be filtered out?

  • Purging some of the objects did not change anything.
  • Using Cache Busting ( SELECT ?item ?commonscat ?sitelink (MD5(CONCAT(STR(?item), STR(RAND()))) as ?random) WHERE { ) does not change anything.
  • Replag currently shows no problems.

Thanks a lot! --M2k~dewiki (talk)

I've spotted a similar caching(?) issue with this query on P5008, so it's not just limited to P373 - it currently should have zero items but in fact returns 13, all of which had the relevant claim removed sometime in October - 11 on October 12th, two on October 27th. This simplified query only returns 11, this one returns 10, all from the same set. None of the items have been edited since, I think, and all were processed as part of a larger batch of similar edits. I assume something glitched and the query service index didn't update properly, but that's only my guess. Andrew Gray (talk) 12:33, 2 November 2020 (UTC)
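
A small Python sketch that may help narrow down reports like the two above. It holds an equivalent formulation of the street query using FILTER NOT EXISTS, and a hypothetical helper, unstable_items, that compares the item lists from repeated runs and reports the items that appear in some runs but not in all. Fetching the results from the query service is deliberately left out, and the rewritten query is not expected to fix a stale index; it only rules out the OPTIONAL/bound pattern as the cause.

```python
# Equivalent formulation of the street query (assumption: same intent as the
# OPTIONAL + !bound version, expressed with FILTER NOT EXISTS).
QUERY_NOT_EXISTS = """
SELECT ?item ?sitelink WHERE {
  ?item wdt:P31 wd:Q79007 ;   # street
        wdt:P17 wd:Q183 .     # Germany
  ?sitelink schema:about ?item ;
            schema:isPartOf <https://commons.wikimedia.org/> .
  FILTER NOT EXISTS { ?item wdt:P373 [] }   # no commonscat property
}
"""


def unstable_items(runs):
    """Return items that appear in at least one run but not in every run.

    `runs` is a list of result lists (item IDs), one per query execution.
    """
    sets = [set(run) for run in runs]
    if not sets:
        return set()
    return set.union(*sets) - set.intersection(*sets)


# Illustrative item IDs only, not real query results.
example_runs = [["Q1", "Q2"], ["Q2"], ["Q2", "Q3"]]
print(sorted(unstable_items(example_runs)))
```

Items that show up in every run are consistently stale in the index; items that come and go between runs suggest that different query service backends hold different states.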

Mayor as a position, should it be capitalized?

Mayor of Seattle (Q32945293) See its use at https://www.wikidata.org/wiki/Special:WhatLinksHere/Q32945293 where it is used as a position title. See all the other Mayor entries where the job title is capitalized in Washington state: https://www.wikidata.org/wiki/Special:WhatLinksHere/Q99629971 and in New Jersey: https://www.wikidata.org/wiki/Special:WhatLinksHere/Q42688505. See also President of the United States (Q11696) and Chief Justice of the United States (Q11147). Does anyone else agree it should be capitalized? If you agree, please change it. --RAN (talk) 14:14, 2 November 2020 (UTC)

  • It is both the title of the position and a description of the position; the former gets capitalized, the latter does not, so either is equally correct. - Jmabel (talk) 16:34, 2 November 2020 (UTC)
Shouldn't we harmonize on one so we have consistency? Every entry is both a description and a title. The same can be said of each example I showed above like President of the United States (Q11696) and Chief Justice of the United States (Q11147).
  • EnWiki does write "The Mayor of Seattle is the head..." which means that they consider it to be a proper name. I see no reason to deviate from the EnWiki judgement here and would capitalize it as well on Wikidata. I think it's more useful to treat that item as being about the position, which has a proper name, and not about a description of it. ChristianKl 20:09, 2 November 2020 (UTC)

Can we automate the importation of Flickr urls for the Bain collection

The Library of Congress is crowdsourcing context for the Bain Collection at Flickr Commons, volunteers identify the people in the images and then find the date for the event photographed by matching it to newspaper articles from archives. All the images are already at Wikimedia Commons but the context is missing and the dates at Wikimedia Commons all are set at the year "1900" even though the events in the current tranche are now at 1922. If we can automate the importation of the Flickr url we will have a link to the correct date and who is in the image. Here is an example of an image at Flickr Commons crowdsourced and File:Burning Woodfill mortgage - Palace Theater LCCN2014715364.jpg here is the matching image at Wikimedia Commons, missing the correct date and without any context and without any of the people identified. I have been adding links one by one, but with a backlog of 20,000 images, it will never be completed without automation. Here is an example where I added the link by hand: File:Porter Hoagland & wife LCCN2014711875.jpg We currently have three links to the Library of Congress website where the images originate, but we need the fourth. Can anyone assist me? --RAN (talk) 15:00, 2 November 2020 (UTC)

I proposed it there last year, no one responded, and the best bot creation people are here at Wikidata, so a little cross-pollination to see if anyone is interested here. Ultimately the images are displayed here, especially where only one or two images exist. See today's new entry Edward Cecil Moore (Q101094208). --RAN (talk) 18:10, 2 November 2020 (UTC)

Wikidata weekly summary #440

Data import questions page

I recently posted a question to Wikidata:Partnerships and data imports, ostensibly a place to ask questions about data imports. However, it appears to be very low-activity and essentially unmonitored. Should we deprecate it? {{u|Sdkb}}talk 03:48, 2 November 2020 (UTC)

New Feature: Watchlist Expiry

Hello, everyone! The Community Tech team will be releasing a new feature, which is called Watchlist Expiry. With this feature, you can optionally select to watch a page for a temporary period of time. This feature was developed in response to the #7 request from the 2019 Community Wishlist Survey. To find out when the feature will be enabled on your wiki, you can check out the release schedule on Meta-Wiki. To test out the feature before deployment, you can visit mediawiki.org or testwiki. Once the feature is enabled on your wiki, we invite you to share your feedback on the project talk page. For more information, you can refer to the documentation page. Thank you in advance, and we look forward to reading your feedback! --IFried (WMF) (talk) 21:55, 2 November 2020 (UTC)

Hurrah! {{u|Sdkb}}talk 23:49, 2 November 2020 (UTC)

Value label constraint

Please see Help:Property constraints portal/Value label for a draft description. --- Jura 11:39, 30 October 2020 (UTC)

Hello @Jura1:, you created a documentation page about a constraint that is not implemented in the software, without clearly indicating it on the page. This can be pretty confusing for people who would stumble upon the page and think that the feature already exists. I think it would be good to add a mention at the top of the page, and to remove it from the template, so it doesn't mislead people. Thanks. Lea Lacroix (WMDE) (talk) 13:31, 2 November 2020 (UTC)
@Lea Lacroix (WMDE): What part do you consider "not implemented"? --- Jura 13:35, 2 November 2020 (UTC)
This constraint is not implemented in the quality constraint extension. Lea Lacroix (WMDE) (talk) 07:16, 3 November 2020 (UTC)
How is that relevant? --- Jura 07:19, 3 November 2020 (UTC)
It has a {{Draft}} template, but maybe one can make it even clearer that this is just a proposal at this point by moving the page somewhere else (user space, or add " (draft)" to the title). —MisterSynergy (talk) 13:44, 2 November 2020 (UTC)

New Project?

I wanted to contact the Wikidata community to get input on the data I have. The data is an educational resource that has data and rankings for colleges and tech bootcamps. Called "Optimal." Do you normally want info about a company, product, or specifically their research? HigherEd.3 (talk) 22:17, 30 October 2020 (UTC)

If from https://www.optimal.com/ I suspect it'd be under a licence incompatible with wikidata. I see you have created an item Optimal (Q101017985). --Tagishsimon (talk) 01:49, 31 October 2020 (UTC)
@Tagishsimon: yes I was referring to https://www.optimal.com/ . I did create an item but wanted to ask questions in this project chat page before proceeding any further. Can you elaborate what you mean by Optimal being "under a license incompatible with wikidata"? 98.246.168.31 16:41, 2 November 2020 (UTC)
@HigherEd.3: Data added to wikidata needs to be in the public domain. The proprietary results of a proprietary scoring system are most unlikely to be in the public domain. Nothing I saw on their website suggested to me that they had open licenced their work; rather, it appeared to me that their business model was based on selling said results. If so, vanishingly small possibility they'd licence them such that they could leak out through wikidata. --Tagishsimon (talk) 16:47, 2 November 2020 (UTC)
@Tagishsimon: it looks like their methodology and rankings results are in the public domain based on what they say on their websites https://www.gradreports.com/ & https://www.onlineu.com/degrees/bachelors & https://www.switchup.org/rankings. Looks like their business model is matching users with colleges/bootcamps but the rankings are free to view on their sites. Does that change things and make the data available to add to wikidata? HigherEd.3 (talk) 17:32, 2 November 2020 (UTC)
@HigherEd.3: "Free to view" does not imply "public domain".
At the bottom of https://www.gradreports.com/ it says, "Copyright © 2000-2020 Optimal", with no waiver of any rights. - Jmabel (talk) 04:00, 3 November 2020 (UTC)

Merge

  Done

Can someone merge Q101096641 into Q65707922 ? I accidentally created a new Q-page. -- 65.92.244.236 02:10, 3 November 2020 (UTC)

  Resolved Thanks. -- 65.92.244.236 13:38, 3 November 2020 (UTC)

Need feedback on project about mortality

We have parsed Wikidata and extracted data on persons who have died (looking for items with {P570} - date_of_death, {P20} - place_of_death etc.). We then organized that data into charts and lists which represent the mortality rate of VIPs.

Do you think that information reflects the big picture of the total human mortality rate, especially for the current Covid year 2020? What could be improved for better representation?

Here is the link: Wikimeters.com

I would appreciate a response. Thanks!  – The preceding unsigned comment was added by [[User:|?]] ([[User talk:|talk]] • contribs).


For COVID, there is a short comparison from back in May at Wikidata_talk:WikiProject_COVID-19#Age_chart_(based_on_Wikidata_items_for_individuals). The numbers involved are rather low. --- Jura 10:52, 3 November 2020 (UTC)

U. S. agricultural extension agencies

Cooperative Extension System (Q54703957) is a program of the US Department of Agriculture in partnership with land-grant universities. There is a list of URLs for these partnership agencies at en:Cooperative State Research, Education, and Extension Service. I was only able to find a few of these on Wikidata, and the only two that are modeled are Texas AgriLife Extension Service (Q7707494) and Kentucky Cooperative Extension Service (Q92111554). They are both modeled as "government agency" but in fact they are part of the university systems. We have an item for the concept agricultural extension (Q4693964) and for agricultural extension system (Q101067829), but not for "agricultural extension service". I suggest we create "agricultural extension service" AKA "agricultural extension agency" defined as <instance of> partnership and part of Cooperative Extension System (Q54703957), and that these state agencies should be modeled as <instance of> "agricultural extension service" and <part of> their universities and of Cooperative Extension System (Q54703957) Any objection, or better ideas?

(Asking because I want to add one as an author of a publication...) [UPDATE:EDITED] - PKM (talk) 23:52, 31 October 2020 (UTC)

@PKM: I'm somewhat familiar with these organizations (we've several times sent samples of diseased plants to our extension service). As far as I understand it, your proposal makes sense to me. ArthurPSmith (talk) 15:43, 2 November 2020 (UTC)
@ArthurPSmith: thanks, I'll probably proceed as described above. - PKM (talk) 20:05, 2 November 2020 (UTC)
  Done for now. Documented at Wikidata:WikiProject Universities/Recommended statements#Extension service of the institution. Cooperative Extension System (Q54703957) has four items for state-level extension services so far. - PKM (talk) 22:51, 3 November 2020 (UTC)

Q846492

Just realized Daxia is mixing up an animal species with a Greco-Bactrian Central Asian kingdom. Not sure how separations are made, so I hope you can fix it. Thanks. Serg!o (talk) 19:18, 3 November 2020 (UTC)

I'll fix it. - PKM (talk) 20:08, 3 November 2020 (UTC)
  Done - PKM (talk) 22:52, 3 November 2020 (UTC)

Native labels for positions held

I'm working to upload a list of c850 Egyptian government ministers from the 1870s to 1950s. The names of the ministries changed frequently during this period, but the ministry itself remained the same. I've given all the variants in the aliases for the positions. My problem: I want to show (for each minister) what label was used _during his ministry_. I used a native label qualifier, but I get this error:

Qualifier native label (P1705) is incompatible with position held (P39).

Statements using position held (P39) such as the one on Ahmad Abdul Ghaffar Pasha (Q101052677) should not have a native label (P1705) qualifier as they are incompatible.

I uploaded a couple of samples, and they show error flags. See this one, for example, under positions: https://www.wikidata.org/wiki/Q4115731

Any suggestions? Will Hanley (talk) 04:12, 2 November 2020 (UTC)

I think the short answer is we don't have a clear model for describing this. I know in some cases there is a long series of ministries for each time the post changed name, even if they're sometimes treated as being the same thing - so we'd have Minister for Trade; Minister for Trade and Communications; Minister for Trade, Communications and Fisheries, etc - but that seems excessive if the post itself doesn't change. In other cases there is a single item for all the variants of "Minister for Trade" without worrying about recording what it's called at any specific point.
I wonder if using subject named as (P1810) would work here? Andrew Gray (talk) 13:32, 2 November 2020 (UTC)
I don't think subject named as (P1810), or object named as (P1932) are ideal, because a) they are really referring to how the name was recorded in an external source b) they are monolingual. It's also a big problem with Australian ministers, where the department and minister title change very frequently, e.g., see List of agriculture ministers on en:Minister for Agriculture, Drought and Emergency Management. I'm inclined to think it'd be best to create a new Wikidata item for each variation, although another complication is that some get reused after a period of disuse. Ghouston (talk) 21:06, 3 November 2020 (UTC)

Merge?

Castle Clinton National Monument (Q12183776) and Castle Clinton (Q1049423)? --RAN (talk) 01:45, 4 November 2020 (UTC)

Some tidying was needed. Each item is now in much better shape. The Library of Congress sees the so-called castle and monument as different things and I believe we have to agree, as properties such as inception (P571) have different values. Thierry Caro (talk) 13:58, 4 November 2020 (UTC)

Properties for legislation or acts?

Acts of the Indonesian legislative bodies can basically have three dates associated with their inception: the decision date (tanggal penetapan), usually when the text of the law is agreed by the body; the legislation date (tanggal diundangkan), when the law is registered in the State Gazette (Lembaran Negara) or Region Gazette (Lembaran Daerah) (a practice inherited from the Dutch "Staatsblad" system); and the "in effect" date (tanggal berlaku), when the law comes into effect. The three dates may be the same but in many cases differ. I have been looking for the correct properties to record these three dates in Wikidata but am not sure which to use. Patient Protection and Affordable Care Act (Q1414593) uses start time (P580) for, I assume, when it came into effect, while Obergefell v. Hodges (Q19866992) uses publication date (P577) for the date it was decided. Does anyone know whether there are more specific properties, or do I need to propose some new ones? RXerself (talk) 16:41, 4 November 2020 (UTC)

Possibly three areas to look at:
Not sure what to say about inception (P571) versus start time (P580). In part, there's a problem that the item represents two distinct things - the law, and the expression of law as a published document. Whatever choice is taken, consistency within the item members of a legislative series is obvs useful, as is qualifying the value to better specify exactly what it means, or exactly what its derivation was. --Tagishsimon (talk) 17:16, 4 November 2020 (UTC)
See also effective date (P7588). --Shinnin (talk) 17:23, 4 November 2020 (UTC)

Nomination for a WikiProject

Fashion. I don't know if it's possible to create a project about fashion. See the bright light (talk)

We have one: Wikidata:WikiProject Fashion. - PKM (talk) 19:48, 4 November 2020 (UTC)

Multiple start and end dates

Q4956214 has a variety of start and end dates, 1887, 1890 - 1897, 1903 - 1921, 1947 - . How do I represent that? All iterations are considered one organization, to the point where the entity celebrated its 125th anniversary in 2012, despite it being various decades less than that. -- Zanimum (talk) 01:05, 4 November 2020 (UTC)

I recently encountered the same problem with determining the inception date for a periodical. Then I realized one could do it as is done in genealogy when dealing with conflicting dates: Add them all, sourcing each one. In that way the editor does not have to decide what is the correct date - the editor is merely reporting information. The user can then deal with the situation. - Kosboot (talk) 01:37, 4 November 2020 (UTC)
Kosboot's answer is correct but probably not applicable in this case. The problem is that you don't want to have several inception (P571) and dissolved, abolished or demolished date (P576) statements if they correspond to different events in the history of the organization. My suggestion would be to consider only the first inception for inception (P571) and the last dissolution for dissolved, abolished or demolished date (P576). The events in between can be modelled using significant event (P793) Vojtěch Dostál (talk) 12:08, 4 November 2020 (UTC)
w:Brampton_Board_of_Trade#History seems odd.
If Brampton Board of Trade (Q4956214) is about a specific organization, then just the most recent foundation date applies.
If it's about a specific type of organization at this place, it would be the earliest date .. then you could have several "significant events" afterwards.
That the Wikipedia article amalgamates all of them, just determines where to link it from. --- Jura 13:26, 4 November 2020 (UTC)
If the organization considers itself as the same entity over the whole time span, it's a legitimate point of view and we can have an entity for this. Vojtěch Dostál (talk) 15:22, 4 November 2020 (UTC)
If so, I don't think it would have multiple foundation dates. They are obviously free to consider them as successors to some other organization. --- Jura 16:48, 4 November 2020 (UTC)
dormancy (Q55909176)? - Jmabel (talk) 15:59, 4 November 2020 (UTC)
@Zanimum: Have you considered using a series ordinal (P1545) qualifier on the start and end dates (i.e first is 1, second 2, etc.)? ArthurPSmith (talk) 00:12, 5 November 2020 (UTC)
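
The series ordinal suggestion can be sketched in Python as follows. This is only an illustration of the data shape, not a settled model: the periods are taken from the question above, the first entry (1887) is treated as a start with no stated end, and the dictionary layout and the function name ordinal_statements are hypothetical. Each period yields an inception (P571) statement and, where an end is known, a dissolved, abolished or demolished date (P576) statement, both qualified with the same series ordinal (P1545).

```python
# Periods from the question: (start year, end year or None if not stated).
PERIODS = [(1887, None), (1890, 1897), (1903, 1921), (1947, None)]


def ordinal_statements(periods):
    """Pair each start/end date with a series ordinal (P1545) qualifier."""
    statements = []
    for ordinal, (start, end) in enumerate(periods, start=1):
        qualifier = {"P1545": str(ordinal)}  # series ordinal
        statements.append(
            {"property": "P571", "year": start, "qualifiers": dict(qualifier)}
        )
        if end is not None:
            statements.append(
                {"property": "P576", "year": end, "qualifiers": dict(qualifier)}
            )
    return statements


statements = ordinal_statements(PERIODS)
for s in statements:
    print(s["property"], s["year"], s["qualifiers"]["P1545"])
```

A consumer could then pair a start with its end by matching ordinals, which a plain list of several P571 and P576 values would not allow.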

Q101110997

Why no automatic WD entry since 22 August? Eurohunter (talk) 16:45, 3 November 2020 (UTC)

What is "automatic WD entry"? Do you mean the item would be created automatically (e.g. by a bot)? --Matěj Suchánek (talk) 11:49, 5 November 2020 (UTC)
@Matěj Suchánek: Exactly. Eurohunter (talk) 15:28, 5 November 2020 (UTC)

List of active Pakistan Air Force aircraft (CAC/PAC JF-17 Blunder) Show on this portal

Dear

When I checked the PAF jets on Wikidata today, I saw "CAC/PAC JF-17 Blunder". What is this?

And when we check the incidents for the JF-17 on Wikidata, we find that a total of 4 aircraft have crashed; actually only three aircraft have crashed.

Regarding the reference from Wikidata: no JF-17 Thunder incident occurred on 15 Feb 2020. Please correct your data.

 – The preceding unsigned comment was added by 115.186.146.10 (talk • contribs).

Dear IP,
Could you explain a bit more what the problem is?
I'm guessing this is something related to list of aircraft of the Pakistan Air Force (Q17008259), CAC/PAC JF-17 Thunder (Q744253) and linked items but I don't see any mention of crash... I also see 4 crashes on en:PAC_JF-17_Thunder#Accidents (but this is on Wikipedia, not on Wikidata).
Cheers, VIGNERON (talk) 08:52, 5 November 2020 (UTC)

Companies and businesses are using wrong properties

Hello, I was adding some new labels to the "infobox company" template on Czech Wikipedia and found out that many companies and businesses are using the wrong properties on Wikidata -> follows (P155)/followed by (P156) (and the English infobox also uses them). But the right ones are replaces (P1365)/replaced by (P1366), because they are about replacement: one company replaces/succeeds another. The first pair is about continuation: you have the movie and the sequel or prequel. What do you think? Is it possible to use a bot to change P155/P156 -> P1365/P1366 for companies and businesses?

Usage in companies (P1365/P1366 is used more often) and usage in businesses (it's fifty-fifty). P.S.RiniX (talk) 17:01, 4 November 2020 (UTC)

See Wikidata:WikiProject_Companies/Properties#Legal_and_reporting_hierarchy_linkage. follows (P155)/followed by (P156) usage for company items is not "wrong" but, on the contrary, preferred. As far as I know, replaces (P1365)/replaced by (P1366) usage in company items is merely a consequence of a single import. --Jklamo (talk) 22:11, 5 November 2020 (UTC)

Wiki of functions naming contest - Round 2

22:10, 5 November 2020 (UTC)

Cleaning up US Senate data

@Oravrattas: @Andrew Gray:

I plan on deleting all of the statements coming out of this query:

SELECT * WHERE {
  ?sen p:P39 ?statement.
  ?statement ps:P39/wdt:P279+ wd:Q4416090.
  ?statement ps:P39/p:P31/pq:P4649/wdt:P31 wd:Q15238777.
}

These largely duplicate statements which set position held (P39) to United States senator (Q4416090) directly instead of to a subclass of it. The extra semantic info from the subclass can be found in parliamentary term (P2937). A survey of those statements shows that the only extra qualifiers these statements have are replaces (P1365) and elected in (P2715). This data is derivable from the source of the previous upload once we settle on how to represent which Senate class a senator has been elected to.

I'm open to suggestions on the latter as well as any advice regarding this data set in general.

Regards, Gettinwikiwidit (talk) 06:52, 3 November 2020 (UTC)

I'm not following this suggestion at all. The link to what query? Are you referring to your own post? My proposal is to remove precisely the statements referred to above. Gettinwikiwidit (talk) 08:55, 3 November 2020 (UTC)
  • Can you replace the malformed items we previously discussed? e.g. Q98077491. Once replaced, they can be deleted.
Also, please delete fictive and/or guessed end dates. --- Jura 06:55, 3 November 2020 (UTC)
Nothing is malformed. It's not clear what you mean by replace. Have you parsed what I'm talking about above? I'm not inclined to engage low energy requests. Gettinwikiwidit (talk) 07:49, 3 November 2020 (UTC)
Q98077491 should be replaced with United States senator (Q4416090). Q98077491 is malformed, but we seem to have a hard time to explain it to you. --- Jura 08:17, 3 November 2020 (UTC)
Just as others over the years have had a hard time explaining, over your objections, why Q98077491 is both valid and useful. --Tagishsimon (talk) 08:28, 3 November 2020 (UTC)
Nobody tried afaik. Probably because it's fairly obvious that it's not compliant with Help:Label. --- Jura 08:31, 3 November 2020 (UTC)
Yes, well, 26 Jan 2019 &c. Which leaves us only to state that you, at least, prefer one approach; others prefer the approach currently implemented. Gettinwikiwidit should be informed that your suggestion to replace these values is not uncontroversial and is opposed. See Wikidata:WikiProject every politician/Political data model#Holding Political Office. --Tagishsimon (talk) 08:42, 3 November 2020 (UTC)
Can you quote the sections you consider relevant for the label and the P31 qualifier? I think we already discussed the use of Q98077491 before and the proposal of its use wasn't considered acceptable. --- Jura 08:46, 3 November 2020 (UTC)
I'm not inclined to continue a conversation which continually misrepresents the facts. Nor do I see how this represents the best interest of the community at large. As has been stated elsewhere, this project was in limbo for two years where you had plenty of time to put in whatever effort you were inclined to put in. Waiting for someone to take an interest and then hounding them with demands doesn't seem like a model this site should encourage. I've stated my intent here and am not inclined to fill the discussion with your proposals. You're welcome to pursue your own projects. Gettinwikiwidit (talk) 08:53, 3 November 2020 (UTC)
I don't mind cleaning up behind you. --- Jura 08:54, 3 November 2020 (UTC)
Your incendiary language is not appropriate or appreciated. Please try to be more civil. Gettinwikiwidit (talk) 08:57, 3 November 2020 (UTC)
What are you referring to? --- Jura 09:02, 3 November 2020 (UTC)
Q98077491 might follow the model of Q30524710. I don't know which model predominates for officeholding for the US Senate, but fwiw accept that there should be a consistent model & as a corollary non-standard membership statements should be amended or deleted as appropriate. My preference is for the subclassed holder position but it's not a hill to die on. --Tagishsimon (talk) 09:00, 3 November 2020 (UTC)
Maybe the proposal above is actually to replace and delete Q98077491. Go figure .. --- Jura 09:06, 3 November 2020 (UTC)
My understanding is that it's still early days for this project and that the priority is on getting the data uploaded over a particular form of the data. If there is some project to organize across uses of position held (P39), I don't know about it but I support the idea in any event. Without a coordinated effort, though, it seems like we're doomed to have dozens and dozens of conversations just like this. Without more info on a coordinated effort, I plan to continue to pursue my goal to have all the United States senator data uploaded and self-consistent. Gettinwikiwidit (talk) 09:05, 3 November 2020 (UTC)
@Gettinwikiwidit: I think the standardisation you've suggested here is sensible, and I'm glad you're taking it on - thanks!
In terms of defining which class a Senator belongs to - hmm. I think there are a couple of possible ways to do this (option 1 would be a qualifier on all P39 claims to tie them to the class, option 2 would be to split the "seat" into two items for each state, a Class 1 senate seat and Class 2 senate seat, and assign people to the relevant one as needed). Not sure which would work best, though, and I can't think of any items where we have modelled a similar situation. Andrew Gray (talk) 22:44, 3 November 2020 (UTC)
  • This is done. Without better guidance on how this work should be coordinated, I'll post before doing other batch updates.  – The preceding unsigned comment was added by ? (talk • contribs).

Sibling marking in animals

There are notable animals whose families we also know. For them, the designation of a sibling is problematic because a sibling (P3373) can only be a human being. How can this be marked? Do we need to create a new property called "animal brother"?

Example:

Problem list: Wikidata:Database reports/Constraint violations/P3373

(Same problem with mother (P25) father (P22) marking.) Thanks! Palotabarát (talk) 14:04, 4 November 2020 (UTC)

This looks like it's probably just an omission on the constraints for sibling (P3373) - the constraints for mother (P25), father (P22), and child (P40) all seem to explicitly allow "animal" (as well as fictional people, deities, etc). Andrew Gray (talk) 17:56, 4 November 2020 (UTC)
The issue is that we do not have a single "class" for animal. See phab:T173593.--GZWDer (talk) 10:44, 5 November 2020 (UTC)

@Andrew Gray, GZWDer: Thanks for the reply, I set the class to be an animal too - we’ll see. Palotabarát (talk) 07:47, 6 November 2020 (UTC)
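
For anyone who wants to check which classes a property's constraints currently allow, a query along the following lines may help. It is only a sketch: it assumes that constraints are stored as property constraint (P2302) statements on the property, with allowed classes in class (P2308) qualifiers. Neither of those property numbers is quoted in the discussion above.

```sparql
# List every constraint on sibling (P3373), with any class qualifiers.
# Assumption: constraints are modelled via P2302 statements and P2308 qualifiers.
SELECT ?constraint ?constraintLabel ?class ?classLabel
WHERE
{
  wd:P3373 p:P2302 ?statement .
  ?statement ps:P2302 ?constraint .
  OPTIONAL { ?statement pq:P2308 ?class . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Replacing wd:P3373 with wd:P25, wd:P22 or wd:P40 should allow a side-by-side comparison with the parent and child properties mentioned above.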

Wikidata tutorial on Wikiversity

Might some of you have time to review and improve Wikiversity:Create and use a Wikidata item?

I've been using Wikidata for a couple of years, and I love it -- except that I have yet to find a good source of documentation and a good way to get answers to questions. Getting started by myself was hopeless, because the basic idea of adding one "statement" at a time did not occur to me until someone showed me how. I've encountered many other questions since then, and guessed at answers that seemed to work.

However, I'd be happier -- and some of you might be happier also -- if that article can be improved to correct any deprecated practices recommended there.

Tomorrow, I plan to create a similar tutorial on "Wikidata item for a legal document". With documents tied to court cases, this can be quite complex. I'm not an attorney, and I've so far been unable to obtain clear guidance for what I want to do. In particular, w:Fish v. Kobach began, I think, as a case in the w:United States District Court for the District of Kansas. If I understand correctly, it generated several appeals to the w:United States Court of Appeals for the Tenth Circuit. Each of those is probably a separate case with a separate case number.

Then after w:Kris Kobach was replaced by w:Scott Schwab as w:Secretary of State of Kansas, Schwab decided he wanted to file more appeals. I've heard that it has been appealed to the w:Supreme Court of the United States, but I have not yet found documentation of that. This creates a number of separate cases, with different names and numbers at different times in different courts. If I were competent to write this document, I might not ;-) However, since I've so far been unable to get the help I think I need, I will document my understanding and hope that others who know better will say, "Oh, that's wrong: I must fix it";-) DavidMCEddy (talk) 05:47, 6 November 2020 (UTC)

Is there a property for adding quotes from people?

Hi all

Is there a property for adding quotes from people? I know there's one for quoting from books but not sure if that is usable for this purpose.

Thanks

--John Cummings (talk) 23:18, 28 October 2020 (UTC)

No & no, afaics. You're thinking quotation or excerpt (P7081). --Tagishsimon (talk) 00:20, 29 October 2020 (UTC)
Probably best to simply use Wikiquote (Q369), rather than clutter items with what could easily be hundreds of trivial, unstructured quotes. -Animalparty (talk) 04:13, 29 October 2020 (UTC)
Aren't quotes closer to encyclopedic content than to structured data? Lymantria (talk) 08:23, 29 October 2020 (UTC)
Related proposal: meta:Structured Wikiquote --Pyfisch (talk) 12:54, 29 October 2020 (UTC)
Thanks all, it looks like there currently isn't an option for recording quotes on Wikidata that do not meet Wikiquote's notability requirements. --John Cummings (talk) 15:47, 6 November 2020 (UTC)

JavaScript user wanted

If you don't know what JavaScript is then you're likely using it. Would somebody add the latest release to Q20204906. I would do it myself if I didn't think of JS as a threat. --Palosirkka (talk) 08:13, 3 November 2020 (UTC)

@Palosirkka: You can edit Wikidata without relying on browser JavaScript. There is a libre CLI. (Or you could just edit in a VM, but I'm sure you know that.) BrokenSegue (talk) 20:49, 6 November 2020 (UTC)

Street junctions

Do we have properties for saying such things as street A continues as street B, crosses street C or terminates in T-junction with street D? I found terminus (P559), but that does not seem adequate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:44, 5 November 2020 (UTC)

Example of your Street D case, done in terminus (P559) - Q4649588#P559 --Tagishsimon (talk) 22:51, 5 November 2020 (UTC)
For the streets in Bangkok I modeled it using terminus (P559) - Thai Wikipedia also has articles on the intersections. From the intersections then linking the streets with connects with (P2789). See for example Phran Nok road (Q13015240) for the street and Phran Nok intersection (Q16305728) for the intersection. Ahoerstemeier (talk) 14:45, 6 November 2020 (UTC)
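
The Bangkok model described above can be inspected with a short query. This is a sketch under the assumption that the street item carries terminus (P559) statements pointing to intersection items, and that those intersection items in turn carry connects with (P2789) statements pointing to streets, as described in the reply above.

```sparql
# Starting from Phran Nok road (Q13015240), list its termini and
# the other streets each terminus connects with.
SELECT ?terminus ?terminusLabel ?street ?streetLabel
WHERE
{
  wd:Q13015240 wdt:P559 ?terminus .
  OPTIONAL {
    ?terminus wdt:P2789 ?street .
    FILTER(?street != wd:Q13015240)   # skip the starting street itself
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```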

Jind State (question to those who know India well)

Are they the same / should they be merged? Jind State (Q1252744) and Q3191136 Bouzinac💬✒️💛 20:41, 5 November 2020 (UTC)

Given what's on the items now, I'd merge ..
If you want an expert view, try WikiProject India. --- Jura 07:13, 6 November 2020 (UTC)
You'd have to decide what to do with the two linked hiwiki articles: hi:जींद_की_रियासत and hi:जीन्द राज्य. Ghouston (talk) 00:49, 7 November 2020 (UTC)
  Done Bouzinac💬✒️💛 13:27, 7 November 2020 (UTC)

Q96218663

Hi there, I stumbled upon this data entry. An entry about a standard news article (from the Verge) doesn't seem very relevant to store in this database. Is everything eligible for entry, or are there relevance criteria? Could a moderator otherwise delete the entry? Best regards, Mikalagrand (talk) 19:13, 6 November 2020 (UTC)

It was created because the item is used as a source here. It is pretty common for a new item to be created, especially when the same source is used multiple times. --Zache (talk) 19:49, 6 November 2020 (UTC)
Wikidata has many entries on news stories, web sites, etc. I cite many such in Wikipedia and Wikiversity using a syntax like {{cite Q|Q96218663}}<!-- Spider-Man: Miles Morales is coming to the PS5 this year -->. (The latter part between <!-- and --> is an optional comment. I included it, so I can see what it is; the QID is too terse to parse easily.)
That Wikipedia:Template:Cite Q is available in 19 other language Wikipedias. (In French, it's fr:Wikipedia:Modèle:Bibliographie.)
I do this because, for an item used more than once, I think it's better to have the citation information in a central location. For the long term, I think we get better citations that are easier to maintain, e.g., against Wikipedia:link rot.
If you delete that Spiderman item, you'll have to delete many more, and you'll have to convince many editors like me that we should not put information like that into Wikidata. DavidMCEddy (talk) 20:10, 6 November 2020 (UTC)
  • While it's often useful for Wikipedia to have a lot of information on one page, it's generally useful for Wikidata to spread out information over multiple pages. Items for sources are notable under our structural needs section and are thus welcome. ChristianKl14:35, 7 November 2020 (UTC)

Oceanne Iradukunda (Q101245570)

Greetings!

I am requesting the deletion of this item, which I created using an anonymous IP address (129.205.113.241) at 13:47, 7 November 2020, after creating the item on Wikipedia. I intended to create it with my registered user name but got it all mixed up. I want to create it under my present username. I am new to editing on Wikidata. Please help me out. Thanks. Kambai Akau (talk) 14:20, 7 November 2020 (UTC)

merge problems with duplicate entries for counties and their county seats

I stumble a lot on merge problems with duplicate entries for counties and their county seats (I'm mainly editing items in Turkey and PRC). An example is Göksun (Q16005686) and Göksun (Q939677) in Turkey - one would have to read every single xxwiki article to decide if the focus is on the county (as administrative area) or on the county seat (as residential area and administrative centre). Most duplicates seem to have come from wikipedias like svwiki or cebwiki which might have imported geonames (GNS) data years ago. This problem makes merging very difficult and time-consuming... Any thoughts about this problems appreciated ;) --Katpatuka (talk) 14:39, 7 November 2020 (UTC)

Don't merge them - a county isn't the same as the settlement which serves as its seat. It's of course a bit technical to have them in two different items, and most Wikipedias will cluster together both aspects into one article - but just imagine what to do if the government merges two counties into one. The settlement stays the same, it just changes its located in the administrative territorial entity (P131), but if both settlement and county are in one item it will look like the settlement ceased to exist when the county was dissolved. Ahoerstemeier (talk) 18:36, 7 November 2020 (UTC)
Like Ahoerstemeier, I think these items should not be merged. I tried to make some clarification on Göksun (Q16005686) and Göksun (Q939677), but there are still articles on the German and English wikis that have the Bonnie and Clyde (Q219937) problem for Göksun. Pmt (talk) 23:08, 8 November 2020 (UTC)

Translation problem exposed by SPARQL

Wikidata:SPARQL tutorial included a query that identified Q52773763 as distinct from Q961957. The code is:

SELECT ?book ?bookLabel
WHERE
{
  ?book wdt:P50 wd:Q35610.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

When I ran this just now (on 2020-11-07) I got 293 results. These were all the Wikidata items for which "author (P50)" = "Arthur Conan Doyle (Q35610)".

However, many were translations, etc. There are almost certainly many deficiencies in these 293 items. For example, Q52773763 is "instance of (P31)" = "version, edition, or translation (Q3331189)". I wonder if that should not be "instance of (P31)" = "literary work (Q7725634)". I also suspect that they may not all be properly linked, cross referenced, etc. At minimum, "Q52773763" appears in the results of the SPARQL query with "bookLabel" = "Q52773763", presumably because it does not include an English language title.

Is there documentation of the proper representation in Wikidata of the 293 results from this search? If yes, how can I find it? If no, what would you suggest be done to create such?

Yesterday I created "Wikiversity:Create and use a Wikidata item". I put it in Wikiversity, because that accepts teaching materials, and Wikipedia does not. If it should belong as part of Wikidata, where and how should it appear?

I'd be happy to collaborate with someone to create documentation on how these should be cross linked and documented otherwise, and then making sure all these entries meet that standard.

After that, I hope I can get help with a more ambitious project: Developing similar documentation for cross linking Wikipedia:Fish v. Kobach as a case in the Wikipedia:United States District Court for the District of Kansas with all its appeals and related documents, e.g., "Findings of fact and conclusions of law in Fish v. Kobach (Q97940156)". This would include changing the name to "Fish v. Schwab" when Wikipedia:Scott Schwab took over as Wikipedia:Secretary of State of Kansas and decided he wanted to appeal that case.

To this end, I'm studying SPARQL. I hope to find all Wikidata items with "instance of (P31)" = "legal case (Q2334719)", so I can see what properties are used with the statements associated with each such Wikidata item. From that I can then see if a model for this already exists in Wikidata. So far I've found 26,913 Wikidata items with "instance of (P31)" = "legal case (Q2334719)". However, I don't know yet how to find out what properties are used in statements associated with those 26,913. Without that, I can't do a complete search for a comparable model.

Of course, if you think I'm not approaching this problem in the best way, I would be thrilled to get help with it. Thanks, DavidMCEddy (talk) 03:35, 8 November 2020 (UTC)
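
One way to see how the 293 results are currently classified is to add the instance of (P31) value to the tutorial query. This is a sketch, not an established workflow: the extra fallback languages in the label service ("fr", "de", "es") are an arbitrary assumption, added only so that items lacking an English label may show a label in another language rather than the bare Q-ID.

```sparql
# Works with author (P50) = Arthur Conan Doyle (Q35610),
# together with what each item is an instance of.
SELECT ?book ?bookLabel ?type ?typeLabel
WHERE
{
  ?book wdt:P50 wd:Q35610 .
  OPTIONAL { ?book wdt:P31 ?type . }
  # Fallback languages after "en" are an illustrative choice.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,de,es". }
}
```

Sorting the result table by the ?typeLabel column in the query service interface should make it easier to separate items marked as version, edition, or translation (Q3331189) from those marked as literary work (Q7725634).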

WikiProject Jupyter on Meta

Hi all, I started WikiProject Jupyter over on Meta in order to help coordinate Jupyter-related Wikimedia activities. --Daniel Mietchen (talk) 02:36, 9 November 2020 (UTC)

How best to handle conflations in external identifiers?

I'm aware of Help:Conflation of two people, but have a question of how best to handle cases where external sources conflate 2 or more people. Widely-used databases of actors like Internet Movie Database (Q37312) or Internet Broadway Database (Q31964) occasionally conflate performers (especially for obscure early film actors who have similar names to other, more famous actors). For instance, This IMDb entry conflates Lillian Kemble (Q72596595) and Lillian Kemble-Cooper (Q6548149), while This IBDB entry conflates roles by Louise Allen (Q70776916) and Louise Allen (Q70760915). Even though these identifiers are misleading, they still have some useful content. Should the shared identifiers (e.g. IMDb ID (P345)) be deprecated, or left as normal rank but qualified with something like partially coincident with (P1382) or identifier shared with (P4070)? -Animalparty (talk) 23:47, 7 November 2020 (UTC)

It’s mostly done with deprecation and reason for deprecation = conflation --Emu (talk) 23:54, 7 November 2020 (UTC)
  • I think identifier shared with (P4070) should mostly be used for identifiers that actually concern several entities by design (e.g. pairs of sports people when apply to each individual).
Using normal rank for imdb identifiers that conflate several people would probably lead to proliferate the conflation into Wikidata. --- Jura 05:32, 8 November 2020 (UTC)

  WikiProject Movies has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. --- Jura 05:32, 8 November 2020 (UTC)

  • The good news - many external sites are editable. And those who are not (such as ISNI) can be contacted and will gladly fix their errors because it makes them look bad. It defeats the purpose of identifying someone if they are misidentifying people. So, before adding it here, I would give them a chance to fix their errors. Quakewoody (talk) 06:52, 8 November 2020 (UTC)
    • Somehow that invalidates the point of using their IDs as stable identifiers.
      In any case, I think a solution should enable Wikidata contributors to contribute to Wikidata to handle this. --- Jura 07:02, 8 November 2020 (UTC)
    • ISNI seems less and less happy to fix their conflations. The big problem, however, is the Library of Congress, as they never seem to react to mails --Emu (talk) 12:15, 8 November 2020 (UTC)
      • Actually, I have just recently had excellent, prompt interactions with ISNI where they fix their data. I got better results from writing to their general contact email than using the reporting feature on the page with the error. But the real person to write to is one of the people in the ISNI project, Stavroula Angoura at the British Library, who told me to contact her again if there was anything else to fix. — Levana Taylor (talk) 05:48, 9 November 2020 (UTC)
    • Another problem to solve is what to do when we know that a previously ambiguous identifier was somehow patched up and can't really be relied upon. --- Jura 14:20, 8 November 2020 (UTC)
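
To see how the IMDb ID (P345) statements on the two example items are currently ranked and qualified, a query like the following could be used. It is a sketch that assumes the deprecation reason is recorded with the qualifier reason for deprecated rank (P2241); that property number is not quoted in the discussion above.

```sparql
# Show rank and any deprecation reason for the IMDb IDs of
# Lillian Kemble (Q72596595) and Lillian Kemble-Cooper (Q6548149).
SELECT ?person ?personLabel ?imdb ?rank ?reason ?reasonLabel
WHERE
{
  VALUES ?person { wd:Q72596595 wd:Q6548149 }
  ?person p:P345 ?statement .
  ?statement ps:P345 ?imdb ;
             wikibase:rank ?rank .
  OPTIONAL { ?statement pq:P2241 ?reason . }   # assumed qualifier for the reason
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Going through the p: and ps: prefixes rather than wdt: matters here, because wdt: only returns best-rank values and would hide deprecated statements.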

Grandfather, grandson

Hello, I am quite confused by this property: is it relative, direct relative, father or something else?--DDupard (talk) 10:06, 9 November 2020 (UTC)

Practically, it means that Joe Biden (Q6279) wouldn't have Joseph Harry Biden (Q100276774) as value as it's already available through father (P22).
For your sample, the suggestion means that it's preferable to create an item for the missing link (is it Roland C.?). --- Jura 13:45, 9 November 2020 (UTC)
Jura, does this mean then that grandfather, grandmother, grandchild, uncle or nephew do not show up in the infobox? What about Reasonator?  – The preceding unsigned comment was added by ? (talk • contribs).
Most do. Which ones are missing for Joe Biden (Q6279)? --- Jura 14:52, 9 November 2020 (UTC)
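
The point that grandparents are already reachable through father (P22) and mother (P25) can be illustrated with a property path. This is a minimal sketch and only returns results where items exist for both generations.

```sparql
# Grandparents of Joe Biden (Q6279), derived by following
# father (P22) or mother (P25) twice.
SELECT ?grandparent ?grandparentLabel
WHERE
{
  wd:Q6279 (wdt:P22|wdt:P25) / (wdt:P22|wdt:P25) ?grandparent .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

If the intermediate parent item is missing, the path returns nothing, which is why creating an item for the missing link is suggested above.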

Wrap up of the feedback about bug reports and feature requests

Hello all,

Here’s some news from the review we’re currently running about our support process. Our goal is to understand how the current processes to report bugs and feature requests work for you, and how we could improve them. You can read more about the project on this page.

Thanks a lot to everyone who gave feedback or answered our survey. We collected 23 anonymous answers on the survey, and 5 people gave feedback on the talk page or on the social media channels. The majority of the survey respondents (87%) had already submitted a bug report or a feature request, the most used platform being Phabricator (42%), followed by WD:Contact the development team and the Project chat.

Asked to think of an example report or feature request that they made in the last 2 years that generally represents their overall opinion of interacting with the Wikidata development team, people mentioned a long delay of response, issues taking too long to be fixed, problems with tagging the report on Phabricator or difficulty getting in touch with people who could solve the issue. Overall, the lack of response or delay in the response seems to be the main pain point encountered by the respondents.

People were asked to rate on a scale of 1 to 5 how satisfied they are with the current process, and the results are pretty polarized: one third of the respondents declared being very satisfied, one third declared being not satisfied at all. Some respondents mentioned that they couldn’t get an answer before pinging people several times. Some people declared that they had a positive experience and the issue they reported was solved quickly.

Respondents were asked about improvements they thought could be made to the existing bug report/feature request process. The main suggestions revolve around:

  • Having a better/clearer overview of the open bugs and already reported issues
  • Improving the way tasks are triaged and monitored, as well as the creation of Phabricator tickets
  • Being able to suggest and vote for the most important features and bugs to fix
  • The possibility of communicating in other languages than English

On the positive side, some respondents declared that the existing process works well for them, and a few people mentioned Phabricator being functional as a bug tracking system. People highlighted the importance of an on-wiki channel to report problems.

Respondents who had not made a feature request or reported a bug in the last two years had varied reasons for not doing so. 50% said that they expected someone else to do it, 16% said they did not know how to report a problem, while the rest of the respondents said that they didn’t encounter the need.

Thanks again to people who took some time to answer the survey. Our next steps are to work on some concrete suggestions to improve the process, and to come back to you so we can discuss how to implement them.

If you have any questions or more suggestions, feel free to add them on this talk page. Cheers, Mohammed Sadat (WMDE) & Lea Lacroix (WMDE), 10:08, 9 November 2020 (UTC)

Wikidata weekly summary #441

Help needed from those very Chinese-knowers about 诸侯國

Hello, it would be greatly appreciated if someone could fill in inception (P571) + dissolved, abolished or demolished date (P576) for the items in that query about ancient Chinese state (Q836688). Even a novalue would be helpful. 鼎力 ! Bouzinac💬✒️💛 21:17, 9 November 2020 (UTC)
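
A query of the kind referred to above might look like the following sketch. It assumes the relevant items are direct instances of ancient Chinese state (Q836688), and it may differ from the query originally linked.

```sparql
# Ancient Chinese states lacking an inception (P571) or a
# dissolution (P576) statement. The p: prefix matches statement
# nodes, so items that already carry a "no value" statement
# are not reported as missing.
SELECT DISTINCT ?state ?stateLabel
WHERE
{
  ?state wdt:P31 wd:Q836688 .
  OPTIONAL { ?state p:P571 ?inceptionStatement . }
  OPTIONAL { ?state p:P576 ?dissolutionStatement . }
  FILTER(!BOUND(?inceptionStatement) || !BOUND(?dissolutionStatement))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],zh,en". }
}
```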

Social science

I want to ask questions because it is first time I use the wiki data so that it and what I learnt about social science it is that it teaches you bought history, what happened long time ago so that it  – The preceding unsigned comment was added by 41.113.151.60 (talk • contribs) at 09:37, 6 November 2020‎ (UTC).

Adding a name in Wiradjuri?

Parkes Radio Telescope (Q41804148) is now known as 'Murriyang' in Wiradjuri (Q957298) [3], is there a way to record that in the Wikidata item, please? At the moment I've just added it as an English alias, but that's not correct! Thanks. Mike Peel (talk) 12:19, 9 November 2020 (UTC)

Contribute to Wikidata

A change was made to this page as part of a paid research pilot study, in line with the Wikimedia Foundation's paid contributions policy.  – The preceding unsigned comment was added by NancyMacy (talk • contribs) at 21:10, 9 November 2020‎ (UTC).

Saeb Erekat

When you search for the late Saeb Erekat in Swedish Wikipedia, the birth and death years from Wikidata come out backwards: (2020-1955) instead of (1955-2020). How do you change it?

https://sv.wikipedia.org/w/index.php?search=saeb+erekat&title=Special%3AS%C3%B6k&fulltext=S%C3%B6k&ns0=1 194.69.14.62 10:48, 10 November 2020 (UTC)

Coherent SI units

Hello. I've been improving quantities and units in Wikidata for 1.5 years now, making it more consistent and standard compliant and adding more data. While doing that I learned a lot. Therefore I thought I'd share some of the motivation and give an overview of some aspects and the progress so far. Some of this is pretty technical and a casual user of units might not be aware of it (and does not need to be). For more details on the technical aspects see the SI brochure and the ISO/IEC 80000 standard series or simply ask here.

Quantities and the units in which those quantities are expressed are the basic building blocks of quantitative science and communication. To unambiguously communicate the statement that a study determined the value of quantity X, expressed in unit Y, one needs unambiguous identifiers for those concepts. There are various ontologies which provide identifiers for quantities and units. Wikidata is one of them and also links to the others using external identifier properties.

An example quantity is the linear expansion coefficient whose recommended unit of measurement is the reciprocal kelvin.

Note how in this phrase I linked to the respective Q-IDs to make it clear exactly which concept I was referring to. This facilitates correct translation (by humans and machines alike) of this sentence into other languages (note how there are multiple "expansion coefficients" - here we mean the linear one).

In Wikidata we currently have about 800 quantities and 1700 units which are "notable" in that they are linked to other ontologies, thereby fixing their precise meaning.

An important subset of those units are the SI-accepted units, which include SI units. Those can be further restricted to the coherent SI units. The coherent SI units are those formed by multiplying and dividing the 7 SI base units. For instance, dividing m (metre) and s (second) one gets m/s (metre per second), the unit of speed. Then there are certain units that received a special name, for instance N (newton) is a special name for kg m/s² (all SI units with special name are coherent). From the coherent SI units all the other SI units can be derived by inserting SI prefixes (k, m, μ and so on). The units derived using prefixes are related to the coherent ones by the property conversion to SI unit (P2370): For instance, 1 km is 1000 m. Non-SI units also have conversions to coherent SI units, like the inch. This makes the coherent SI units the most important set of all units.
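
As a small illustration of conversion to SI unit (P2370), the following sketch reads the conversion factor and target unit for two prefixed units. It assumes the conversion is stored as a quantity value whose amount is the factor and whose unit is the coherent SI unit; centimetre (Q174728) and millimetre (Q174789) are used only as examples.

```sparql
# Conversion factor and target unit for centimetre and millimetre.
SELECT ?unit ?unitLabel ?factor ?siUnit ?siUnitLabel
WHERE
{
  VALUES ?unit { wd:Q174728 wd:Q174789 }
  ?unit p:P2370 / psv:P2370 ?value .          # full quantity value node
  ?value wikibase:quantityAmount ?factor ;    # numeric conversion factor
         wikibase:quantityUnit ?siUnit .      # unit the factor refers to
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```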

Here is a query which lists all the coherent SI units, together with their unit symbol and the quantities which are expressed in this unit. Replace all instances of "en" (except next to ?symbol) with the code corresponding to your favorite language. Then review the labels and descriptions and see if you can improve the existing or add a missing one.

select
  ?unit (sample(?label) as ?label) (sample(?desc) as ?desc)
  (sample(?conversion) as ?conversion) (sample(?symbol) as ?symbol)
  (group_concat(distinct ?quantity; separator = "; ") as ?quantities)
where {
  ?unit wdt:P31 / wdt:P279* wd:Q69197847 .       # coherent SI unit
  optional { ?unit rdfs:label ?label . filter (lang(?label) = "en") }
  optional { ?unit schema:description ?desc . filter (lang(?desc) = "en") }

  optional { ?unit wdt:P2370 ?conversion }       # 1 for every coherent SI unit
  optional { ?unit wdt:P5061 ?symbol . filter (lang(?symbol) = "en") }
  optional { ?unit wdt:P111 / rdfs:label ?quantity . filter (lang(?quantity) = "en") }
} group by ?unit

Best wishes. Toni 001 (talk) 12:35, 5 November 2020 (UTC)

  • Good work. Thanks for doing that. I think we have come a long way since having random abbreviations on unit items [4].
Maybe you want to provide an update list of conversions for Wikidata Query Service to Wikidata:Contact the development team.
What are your plans for historic units? --- Jura 07:17, 6 November 2020 (UTC)
Thanks @Jura1:
Best wishes. Toni 001 (talk) 11:58, 6 November 2020 (UTC)
About the conversion, the question was asked at Wikidata:Contact_the_development_team/Archive/2019/10#Is_the_unit_conversion_file_published_somewhere_?. Lucas would know if it's still the currently used code. --- Jura 12:17, 6 November 2020 (UTC)
(It's obviously possible to query the currently used definitions directly on WQS). --- Jura 12:23, 6 November 2020 (UTC)
@Toni 001: Unit conversion happens in the Wikibase RDF output, not in the query service. It’s configured by unitConversionConfig.json, which was generated about three years ago by the updateUnits.php maintenance script (and then hand-edited to remove degree Celsius (Q25267) as a unit). I tried running the maintenance script again just now, and the file grows by a factor of three, but looking at the diff it also seems to convert units like centimetre (Q174728) and millimetre (Q174789) to themselves now (with a factor of one), rather than to appropriate multiples of metre (Q11573), so I assume the data model on Wikidata changed in the meantime and the script needs to be updated before we can use it to generate a new version of the config. (Also note that the new conversions would probably only become fully effective after a complete query service reload, which is not often done. I’m not sure if changing the configuration without a reload would cause issues.) Are you aware of any specific problems with the current configuration that would make an update more urgent? --Lucas Werkmeister (WMDE) (talk) 12:22, 6 November 2020 (UTC)
Curious that centimetre (Q174728) gets incorrectly converted. I think we can rely on incremental updates unless some much used units are incorrectly converted or missing. BTW, any chance we could add °F/°C/°K conversion? --- Jura 12:29, 6 November 2020 (UTC)
To fix centimetre (Q174728), I suppose the option --base-unit-types Q223662,Q208469 needs updating --- Jura 12:36, 6 November 2020 (UTC)
Thanks @Lucas Werkmeister (WMDE), Jura1: I had a quick look at the script but not enough to figure out whether with "base unit" it is referring to the 7 SI base units or something else. In any case, a query like this can be used to re-generate the above mentioned .json file. As of today, there would be 32 removals (of the 440 original entries) and 1079 additions (or less if we take the usage into account - the script requires 10 usages of the unit). All changes that I spot checked are improvements. About urgency: So far I have not seen a strong need for an update, so no need (in my opinion) to halt other important projects. On the other hand, I do think that now we reached an important milestone in that we cover most units recommended by ISO/IEC 80000, covering space and time, mechanics, thermodynamics, light and radiation, acoustics, physical chemistry and molecular physics, and atomic and nuclear physics. In addition, by aligning units to other ontologies we can easily verify the conversion factors (I did this for one ontology). Overall, this will be a big update, but all updates following (say, once a year) will be much smaller because we now have a well-curated, stable collection of the most important units. Toni 001 (talk) 22:10, 9 November 2020 (UTC)
@Toni 001: I filed phabricator:T267644 for doing an update of the config, without prioritizing it. --Lucas Werkmeister (WMDE) (talk) 14:49, 10 November 2020 (UTC)
Thanks. Toni 001 (talk) 11:40, 11 November 2020 (UTC)

Hi,
can a Polish speaker please check this item? I think the anonymous user either fixed or introduced a conflation with another person, but I am not sure. --Pyfisch (talk) 08:02, 9 November 2020 (UTC)

it seems (by having a look at some webpages) that she was born Klementyna Franciszka Tyszkiewicz and then she married Stanisław Antoni Wróblewski and became Klementyna Franciszka Wróblewska. Kpjas (talk) 13:09, 11 November 2020 (UTC)

How to get all properties for Wikidata items?

How can I get all the properties for Wikidata items?

In particular, I want to know what properties are used with Wikidata items for which (instance of(P31) = legal case (Q2334719)). I can get the cases as follows:

SELECT ?lglcase 
WHERE
{
  ?lglcase wdt:P31 wd:Q2334719.
} 

I can add the labels as follows:

SELECT ?lglcase ?lglcaseLabel
WHERE
{
   ?lglcase wdt:P31 wd:Q2334719.
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

I want to see what properties are being used by each. How can I get that? Thanks, DavidMCEddy (talk) 08:57, 9 November 2020 (UTC)

In R (Q206904), I can pass SPARQL code as a character string to function "query_wikidata" in the WikidataQueryServiceR (Q74696020) package of R functions. That returns a "list" in R, the first component of which is a character vector of URLs. I can write my own code to scrape that web page to get what I want.
However, if there is a way to do this in SPARQL, I naively think that would be better, especially since I plan to document (presumably on Wikiversity) what I'm doing so others can learn from it. See my (unanswered) question re. "Wikidata tutorial on Wikiversity" above.
Thanks, DavidMCEddy (talk) 13:50, 9 November 2020 (UTC)
@DavidMCEddy: A very simple query would be:
select distinct ?prop where {
  ?item wdt:P31 wd:Q2334719; ?prop ?o.
}
Maybe somebody knows how to filter this correctly. --Haansn08 (talk) 13:33, 11 November 2020 (UTC)
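
One possible way to filter the query above, offered only as a sketch: restrict ?prop to "truthy" wdt: predicates by mapping each one back to its property entity with wikibase:directClaim, then count how many legal cases use each property. The helper below only builds the query text; the function name is made up for illustration, and the resulting query has not been checked against WDQS timeouts for very large classes.

```python
import re


def properties_used_query(class_qid):
    """Build a SPARQL query listing properties used on instances of class_qid.

    Hypothetical helper: it returns query text only and does not contact WDQS.
    """
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError("expected an item ID such as 'Q2334719'")
    lines = [
        "SELECT ?p ?pLabel (COUNT(DISTINCT ?item) AS ?items)",
        "WHERE {",
        "  ?item wdt:P31 wd:%s ;" % class_qid,
        "        ?prop ?o .",
        "  # Keep only direct (wdt:) predicates by linking them to their property entity.",
        "  ?p wikibase:directClaim ?prop .",
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }',
        "}",
        "GROUP BY ?p ?pLabel",
        "ORDER BY DESC(?items)",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the printed text into the query service, or pass it to query_wikidata in R.
    print(properties_used_query("Q2334719"))
```

Without the wikibase:directClaim line, the result also contains predicates such as rdfs:label and schema:description, which is probably the filtering problem mentioned above.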

WikiProject Amarna

If enough people are interested, we should create WikiProject Amarna or WikiProject AmarnaMD (Metadata), to create and maintain a repository of metadata about all the items (people, places, objects, etc...) from the Amarna Period of Ancient Egypt. The project can also include scientific research papers about the Amarna Period.

This period includes Nefertiti, Akhenaten, & Tutankhamon, but hopefully this project can help expand their relatives, consorts such as Ankhesenamun, architecture, culture, technology, and the work they produced and patronised like the examples of Amarna art.

The project itself could be divided into three parts; Pre-Amarna (1391 to 1353 BCE), Amarna (1353 to 1332 BCE) & Post-Amarna (1332 to 1322 BCE) periods, which can help expand the context of the era.

Eventually, the project can grow or become a part of a dedicated project for the entire Eighteenth Dynasty of Egypt, which would include Hatshepsut and her contemporaries and works, such as the Djeser-Djeseru. Further, it can grow to encapsulate the entire New Kingdom of Egypt and then most of Ancient Egypt.


The idea for the project came when I wanted to start adding and expanding statements on the individual Amarna letters, which I will still do. But I thought that more eyes should be on these items, as I do not have enough expertise in this area, only a keen interest in the topic. And seeing that no dedicated Wikidata project exists for Ancient Egypt (unless I have missed something), I thought this could be a good starting point. There is also only a finite amount of information regarding this era, and as such the project could be semi-completed until more information is uncovered. I am open to any and all suggestions, including a different title for the project to future-proof it, or creating a chat group rather than a WikiProject.

Walkuraxx (talk) 05:06, 27 October 2018 (UTC) Daniel Mietchen (talk) 22:44, 27 October 2018 (UTC) Marsupium (talk) Pipimurphy (talk) 13:51, 30 January 2019 (UTC) EditaLeiden (talk) 09:38, 16 May 2019 (UTC) Tris T7 (talk) 09:55, 19 August 2019 (UTC) Spinster 💬 MassiveEartha (talk) 05:50, 11 February 2020 (UTC)

  Notified participants of WikiProject Africa

Susannaanas (talk) 07:40, 29 March 2016 (UTC) Abbe98 (talk) 16:53, 29 March 2016 (UTC) Iberti (talk) 17:05, 29 March 2016 (UTC) Eetu Mäkelä (talk) 18:13, 29 March 2016 (UTC) Jura. Good idea! Interested in place names (mainly). Humphrey Southall (talk) 13:43, 11 April 2016 (UTC) Fralambert (talk) 22:36, 11 April 2016 (UTC) Rainer Simon (talk) 10:10, 17 April 2016 (UTC) Melderick (talk) 13:56, 7 June 2016 (UTC) Llywelyn2000 (talk) 16:47, 7 June 2016 (UTC) Vladimir Alexiev (talk) 10:36, 15 June 2016 (UTC) Capankajsmilyo (talk) 21:04, 1 November 2016 (UTC) Ainali (talk) 08:59, 24 November 2016 (UTC) Jheald (talk) 01:15, 17 February 2017 (UTC) B20180 (talk) 05:33, 27 July 2017 (UTC) Elinese (talk) 19:19, 23 March 2018 (UTC) Salgo60 (talk) 10:34, 13 April 2018 (UTC) PKM (talk) 01:31, 15 April 2018 (UTC) Tris T7 TT me Tris T7 (talk) 19:53, 29 January 2019 (UTC) Sp!ros (talk) 16:07, 3 April 2019 (UTC) Pollockc (talk) 18:26, 25 November 2019 (UTC) ChristianSW (talk) 14:10, 10 December 2020 (UTC) Stephen Gadd (talk) 10:31, 12 January 2022 (UTC) Iain Hallam (talk) 01:36, 5 April 2022 (UTC)

  Notified participants of WikiProject Historical Place

  Notified participants of WikiProject Ancient Greece

  Notified participants of WikiProject Ancient Rome


I do not possess the knowledge to create a WikiProject, so I am posting this here. If this information belongs somewhere else, please guide me. And if anyone is interested in the above, please reply to this or leave a message on my talk page. Wallacegromit1 (talk) 04:32, 11 November 2020 (UTC)

You could just create a page. Eventually, some people might join, others might just benefit from reading the outline it provides. --- Jura 07:00, 11 November 2020 (UTC)

Hello, I want a knowledge panel for myself

I am very pleased to use this opportunity. I want a panel for myself and I have already created the item for myself. So I want to ask this: how many days will it take to get a panel after all necessary information is submitted? unsigned comment by Murtadoh on 20:59, 10 November 2020

It used to take 30 days for Google to add the information from Wikidata into a panel. I assume they thought that if it survived 30 days it was no longer likely to be a hoax. I have not tested an entry in over two years to see how long it takes now, or whether they are still using Wikidata entries. All you have to do is create something new and do a Google search every day until it shows up. --RAN (talk) 21:33, 11 November 2020 (UTC)

When does 1953-01-01T00:00:00Z mean 1953 (no month/day) and when 1 Jan 1953?

Partial birth dates where only a year is given seem to be stored/ reported in the same way as a full birth date on the 1st of January, eg 1953-01-01T00:00:00Z

How can I tell them apart, which is a year only, and which a full date? Scarabocchio (talk) 17:40, 12 November 2020 (UTC)

@Jura1: Thanks!, but I haven't been able to see how to get this into my result set. How do I do that? Scarabocchio (talk) 18:48, 12 November 2020 (UTC)
PS: I'm using WDQS to submit a query. The two date fields, human-readable and ISO 8601, are coming from the ?bd variable and its label ?bdLabel. Scarabocchio (talk) 18:52, 12 November 2020 (UTC)
SELECT * { wd:Q2861756 p:P569 ?st . 
          ?st psv:P569 ?dtnode . 
          ?dtnode ?a ?b . }

To get the full date, try the above.

SELECT * { wd:Q2861756 p:P569/psv:P569 [ wikibase:timeValue ?date ; wikibase:timePrecision ?dateprecision ] }

A shorter version. --- Jura 18:58, 12 November 2020 (UTC)

Blimey. Thank you for that. Scarabocchio (talk) 19:37, 12 November 2020 (UTC)
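
For anyone post-processing the result set outside WDQS, here is a minimal sketch of how the two columns from the shorter query (?date and ?dateprecision) could be combined. It relies on the Wikibase precision codes 9 (year), 10 (month) and 11 (day); the function name is made up for illustration, and other precisions are simply passed through unchanged.

```python
def format_wikidata_date(time_value, precision):
    """Trim a WDQS timestamp such as '1953-01-01T00:00:00Z' to its stated precision.

    precision is the integer from wikibase:timePrecision:
    9 = year, 10 = month, 11 = day. Anything else is returned untouched.
    """
    date_part = time_value.split("T", 1)[0]
    # rsplit keeps a leading '-' (BCE years) attached to the year.
    year, month, day = date_part.rsplit("-", 2)
    if precision == 9:
        return year
    if precision == 10:
        return "%s-%s" % (year, month)
    if precision == 11:
        return "%s-%s-%s" % (year, month, day)
    return time_value


if __name__ == "__main__":
    # The same stored timestamp reads differently depending on the precision.
    print(format_wikidata_date("1953-01-01T00:00:00Z", 9))
    print(format_wikidata_date("1953-01-01T00:00:00Z", 11))
```

So a year-only birth date and a genuine 1 January birth date share the same timestamp and differ only in the precision value.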

Located inside buildings

I want to model the relationship of a business/restaurant/hotel located inside a specific building; in this case Hotel Garni Frohsinn (Q101539973) is located inside Gasthaus Frohsinn (Q29883460) (a hotel (Q27686) located inside Gasthaus Frohsinn (Q29883460)). Is Gasthaus Frohsinn (Q29883460) contains (P4330) Hotel Garni Frohsinn (Q101539973) the correct approach here? Alternatively one could use Hotel Garni Frohsinn (Q101539973) location (P276) Gasthaus Frohsinn (Q29883460), but this does not seem to capture the hierarchical relationship properly. Any suggestions? --Hannes Röst (talk) 20:15, 12 November 2020 (UTC)

I use “location” for this (and, inversely, “occupant” on the item for the physical structure). - PKM (talk) 21:02, 12 November 2020 (UTC)
I like occupant (P466) for this, maybe we should then clarify that contains (P4330) should not be used in this case? --Hannes Röst (talk) 22:27, 12 November 2020 (UTC)

Surnames of saints

See Saint Benedict of Cupra (Q3946827), where the named person has the family_name "Benedict" and the given name "Benedict". I want to remove the family_name and leave it empty. In the Middle Ages there were no surnames for ordinary people. This is similar to Scandinavia, where people were named by a patronym: you were Arthur, son of Uther of England. --RAN (talk) 21:28, 11 November 2020 (UTC)

<no value> is your friend. --Tagishsimon (talk) 21:46, 11 November 2020 (UTC)
Thanks for noticing and fixing. Should do some more cleanup --- Jura 06:03, 13 November 2020 (UTC)

Et al as author

This query lists 78 papers where we have an author name string (P2093) value containing "et al". At least some are errors, as the full lists of names are available via a DOI. Are any not?

For example, the paper referred to in Lysenin (Q76846397) explicitly lists "et al", without enumerating the names of relevant individuals, so I have given that author (P50)=et al. (Q311624). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:25, 13 November 2020 (UTC)

Subclass of “male” and “female”

Once upon a time (in 2015) the description for female (Q6581072) stated “person who is female (use with Property:P21 sex or gender). For groups of females use with subclass of (P279)”. Is this still a best practice? We have lots of groups or pairs of women marked <subclass of> “female” https://w.wiki/kgy, and ditto for “male”. - PKM (talk) 04:05, 9 November 2020 (UTC)

There's a slight issue with grammar, of "female" as a concept, vs "a female" as an individual. I think we use the same item to represent both the female concept and the class of all female individuals. I don't think that this is unusual or undesirable. Ghouston (talk) 06:56, 9 November 2020 (UTC)
I also don't consider Wendi and Brenda Turnbaugh (Q267678) subclass of (P279) female (Q6581072) to be good modeling. I see valid subclasses as items like trans woman (Q1052281) and maybe classes like wet nurse (Q472898). @Jura1: given that you set the value in this example, and a few others, back in 2015/2016, what do you think now? ChristianKl 10:08, 9 November 2020 (UTC)
Not sure who first came up with that. I don't think adding P21 directly on such items is helpful. --- Jura 10:24, 9 November 2020 (UTC)
@Jura1: This is not about sex or gender (P21) but about subclass of (P279) and half of the uses of it seem to be from you. ChristianKl 12:29, 9 November 2020 (UTC)
I hope all my edits are consistent with the above. What is the open question? --- Jura 15:05, 9 November 2020 (UTC)
The open question is whether Wendi and Brenda Turnbaugh (Q267678) subclass of (P279) female (Q6581072) is still considered good modeling practice now. I acknowledge that it was the standard five years ago. - PKM (talk) 21:23, 9 November 2020 (UTC)
It's always good practice to follow an established model. Beyond that, I'm not sure if the general question is within project scope. The question may rather be how? --- Jura 20:11, 10 November 2020 (UTC)
@Jura1: the question is whether most of the results in https://w.wiki/kuc and https://w.wiki/kuf should be removed (those that are about groups of people). ChristianKl 11:06, 10 November 2020 (UTC)
How would the information be made available? --- Jura 20:11, 10 November 2020 (UTC)
The information should be available on each member of the group. - PKM (talk) 21:31, 10 November 2020 (UTC)
I don't think it's sufficient. We may have all parts or not. Items for these may be complete or not. --- Jura 07:15, 11 November 2020 (UTC)

  WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

I note that the instructions on sex or gender (P21) do still say to use <subclass of> for groups. Personally, I think this is bad modeling. I would prefer to add new items “group of men” and “group of women” as subclasses of “group of humans” to carry this information. Or possibly to add a new property “members have gender”. - PKM (talk) 21:23, 12 November 2020 (UTC)

  • If you use “group of men” or “group of women” how would you link that to female (Q6581072)? Why would this be different from other groups? I don't think it's good modelling just to state that something is good or bad modelling. --- Jura 06:01, 13 November 2020 (UTC)

I agree with ChristianKl @10:08, 9 Nov: Wendi and Brenda, Vika and Linda (Q7929372) and The Breeders (Q1332134) are instances, not classes. Only something that “is a” class should be a subclass of another class. Pelagic (talk) 21:56, 13 November 2020 (UTC)

Widening the question in a different direction: Why tag all-male and all-female groups, but leave mixed-gender groups unmarked? Is that somehow meta-sexist? Should Vika and Linda Bull be marked as group of women, but Jack and Meg White (The White Stripes (Q268160)) not be marked as mixed group? While we're at it, how about adding statements for all-black or mixed-race groups? I would suggest that should only be done where the gender, race, ethnicity, etc. is a notable feature of the group. The Breeders were notable for being an all-girl band whilst not a “girl band”. One Direction was a “boy band”: One Direction (Q146027) instance of (P31) boy band (Q216337) is more helpful than One Direction (Q146027) subclass of (P279) male (Q6581097), and “boy band” could subclass “group of men”. Australian women's cricket team in England in 2015 (Q18701926) is by definition a group of women, it already has competition class (P2094) mens cricket (Q8031140), but I don’t see a handy way to chain “women's cricket” to “female”. In many sports, mixed-gender teams/competitions are the exception rather than the norm. Indigenous All-Stars (Q6024570) doesn’t have any link to Indigenous Australians (Q170355). Is there some property for “parts are members of class”? Pelagic (talk) 22:33, 13 November 2020 (UTC)

Can I download only articles with more than 10 edits?

Is there a way I can download only articles that have more than X edits? Or could I download only articles that exist in more than X languages?  – The preceding unsigned comment was added by James Kalapgs (talk • contribs) at 16:24, November 11, 2020 (UTC).

@James Kalapgs: We hold sitelinks in Wikidata, but don’t pull in other info like edit history. You’d need to see if there was some way to query that via MediaWiki API. Pelagic (talk) 23:51, 13 November 2020 (UTC)

River coordinates

We don't have properties for the coordinates of the source and of the confluence of a river; I'm happy to draft proposals, but before I do, is anyone using a workable alternative model? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:07, 11 November 2020 (UTC)

See Wikidata talk:WikiProject Rivers for answers or more questions. Thierry Caro (talk) 20:02, 13 November 2020 (UTC)

For linear features generally, if the junction point has an Item then you could make statements like ⟨junction of A and B⟩ ⟨co-ordinate location⟩ ⟨some point⟩; ⟨junction of A and B⟩ ⟨junction type⟩ ⟨tee | cross | etc.⟩. But then how to say ⟨A⟩ ⟨joins⟩ ⟨B⟩ ⟨at⟩ ⟨junction of A and B⟩? To reduce the 5-tuple to triples, we need to decide which verb/property/relation is primary and which is going to be the qualifier. RDF allows general statements about statements (as long as they have URIs), so you can use a statement as the object rather than subject: ⟨A⟩ ⟨starts at | ends at | crosses through⟩ ( [the point where] ⟨A⟩ ⟨joins⟩ ⟨B⟩ ). For Wikidata, only the subject can be a triple, and qualifiers only go one level deep. Do you say “A ends at Point (where it meets B)” or “A ends at B (at Point)”? Sorry if this is too abstract or off-topic. Pelagic (talk) 00:58, 14 November 2020 (UTC)

"Greenlanders" Q101444758

Not entirely sure how to fix the above item. Somehow the amalgamation there seems more suitable for lexemes or a Wikipedia disambiguation page.

I removed "instance of demonym" as words would generally go into lexeme namespace. @UWashPrincipalCataloger: who created a series of similar items. --- Jura 11:57, 12 November 2020 (UTC)

fuse manoir de Bricquebost (Q22979516) with manoir de Bricqueboscq (Q101580954)

Hello, please fuse manoir de Bricquebost (Q22979516) with manoir de Bricqueboscq (Q101580954). Thanks. Best regards.--Thierry74 (talk) 05:48, 14 November 2020 (UTC)

The question is whether it's really the same. P31 for one is about the main building, for the other about the entire estate. Sitelinks and identifiers might not be aligned and possibly need to be moved between the two. --- Jura 09:03, 14 November 2020 (UTC)

Duplicate items?

Are those two duplicates: Shulamith Firestone (Q63044114)

and

Shulamith Firestone (Q271551)

If so what should be done? Thanks in advance, (sorry it looks like my keyboard is broken, can't sign)

No, Shulamith Firestone (Q63044114) is a journal article about Shulamith Firestone (Q271551) --Emu (talk) 00:24, 16 November 2020 (UTC)

Wikipedia is linking to Wikidata for references, how to facilitate the best experience

In a collaboration, a group of developers and User:Pigsonthewing developed a template to link a Wikipedia reference to Wikidata. What it does, it gets the details of the publication from Wikidata. I think it features among the best news of 2020. What I want is to facilitate publications not in Wikidata that are used or to be used in this way in Wikipedia. What I am looking for is that a DOI is provided to a tool that will immediately get all pertinent information into Wikidata.

An example of such a paper is DOI: 10.1126/science.abe5901. It describes what is known about the link between mink and SARS (Covid-19). That paper is not yet picked up but it may be of relevance to editors at Wikipedia. When there is a process to speed up the process, it would help acceptance of the use of templates at Wikidata.

At the same time Scholia now indicates for a paper where it is used in (English and Danish) Wikipedia. When you then look at this presentation in Scholia and compare it to this item, you will agree that it provides an interface that makes sense for the people who go down the rabbit hole to learn more about a paper. They are NOT editors who improve Wikidata; when they are they can click in the link to the item.

In conclusion:

  • can we have a process to fast track DOI missing in Wikidata
  • can we prioritise and consider processes where people elsewhere benefit from our data.

Thanks, GerardM (talk) 07:01, 11 November 2020 (UTC)

At Wikidata:Bot_requests#weekly_import_of_new_articles_(periodic_data_import), a user suggested importing all new articles on a weekly basis. That would be one solution to this. Another initiative I'd like to see is to regularly import all papers cited in Wikipedia into Wikidata. I understand this is done for all papers with DOI, but some citations in Wikipedia do not have DOI, so a tool which helps reconcile existing references to DOI would be very desirable too. Vojtěch Dostál (talk) 10:09, 11 November 2020 (UTC)
"can we have a process to fast track DOI missing in Wikidata" What, like SourceMD? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:35, 11 November 2020 (UTC)
I have been playing with the "cites work" statement. What I find is that all too often a cited work does not exist. What I propose is similar to the "author name string": to introduce a "cites work string" that includes the name of the work. A DOI may be used as a qualifier. It can be used by a bot to import the work to Wikidata and replace the string with the new object.
I have added several new "scholarly articles" and I included the information that seemed most vital to me. What we need is a process that processes new DOI and enriches them based on the resources available to us.
Thanks, GerardM (talk) 13:47, 15 November 2020 (UTC)
@GerardM: When you say "a cited work does not exist" do you mean that people are citing nonexistent works, or works whose reality you doubt, or merely that people are citing works that lack a Wikidata item? - Jmabel (talk) 18:53, 15 November 2020 (UTC)
It is about papers that do have a DOI and are not known to Wikidata. eg "Impact of feral water buffalo on juvenile trees in savannas of northern Australia: an experimental field study in Kakadu National Park" it is cited from this paper I am enriching. With so many people impacted from wild fires, it makes sense to know the science around this subject. Thanks, GerardM (talk) 07:28, 16 November 2020 (UTC)

Modeling of video game modes

Should items such as Teamfight Tactics (Q67650639) be an instance of game mode (Q1971694)? Or should it be a subclass of multiplayer video game (Q6895044) and player versus player (Q719798)? --Trade (talk) 08:46, 16 November 2020 (UTC)

Same locomotives

'SNCB Class 55' and 'CFL Class 1800' are identical; see source [5]. Q840835 There is a separate Commons (sub)category: CFL Class 1800. I have adapted the NL and EN Wikipedia articles. The only problem is the number of locomotives produced: 42 Belgian and 20 for Luxembourg, giving a total of 62. There are two sources, [6] and [7]. Can I somehow add up the figures from the two sources? I find it very frustrating that there is no possibility to add a comment or explanation in Wikidata. If I set the value to 62 and add the two sources, there will be confusion. Smiley.toerist (talk) 23:11, 12 November 2020 (UTC)

@Smiley.toerist: You can add a note on the talk page of the item. ArthurPSmith (talk) 17:02, 13 November 2020 (UTC)
  Done Smiley.toerist (talk) 11:50, 16 November 2020 (UTC)
I would like to qualify the manufacturer ACEC (Q848270) as the manufacturer which did the electrical engineering (Q43035). I have added 'field of work' to ACEC, but this is a very indirect link. Smiley.toerist (talk) 11:50, 16 November 2020 (UTC)
Solved. Smiley.toerist (talk) 11:54, 16 November 2020 (UTC)

Victor Scholderer (Q18922365)

The Writings of Victor Scholderer: An Addendum (Q58632531) and To Victor Scholderer: A Birthday Greeting (Q58632493) concern Victor Scholderer (Q18922365) and should be linked there. Broichmore (talk) 13:58, 13 November 2020 (UTC)

@Broichmore: Indeed so. What's stopping you? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:27, 13 November 2020 (UTC)
@Pigsonthewing: Simply because this is not a project that "anyone can edit", it's a closed system for technical experts. As explained eloquently at length here by . Broichmore (talk) 16:13, 13 November 2020 (UTC)
Poppycock; I teach non-technical undergraduates to edit Wikidata in a few minutes. It takes less time than Wikipedia - and certainly less time than teaching them to upload an image to Commons. This 2m15s video covers the basics, with time to spare. You have already shown yourself capable of adding statements to items. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:22, 13 November 2020 (UTC)
Broichmore Both of those articles can be connected to their subject by main subject (P921) = Victor Scholderer (Q18922365). And I agree that Wikidata is anything but intuitive or orderly, as a regular contributor for 4+ years, I find concrete guidelines to be often absent or hidden deep within the bowels of rarely-visited (and themselves hard to find) subpages. -Animalparty (talk) 01:13, 14 November 2020 (UTC)
So I did that, connecting by main subject (P921). Did I do it right? Because it doesn't look it. --Broichmore (talk) 12:49, 16 November 2020 (UTC)

Political graveyard

I may have asked before, but I never saw an answer. How do I go from http://politicalgraveyard.com/bio/fitzgerald.html, which is where Google takes me, to the decimal version of the entry on the page. See for example http://politicalgraveyard.com/bio/fitzgerald.html#096.28.47 ? So I can enter it in Wikidata as per this example: David Edward Fitzgerald (Q98133137) --RAN (talk) 04:48, 16 November 2020 (UTC)

Curious arrangement. Best I can suggest is you inspect the element (i.e. look at the HTML markup) for the politician you're after, having manually located their entry on the page. So for Fitzgerald, Andrew, the element is a link <a name="194.19.01">Fitzgerald, Andrew</a>, from which you can derive the Political Graveyard politician ID (P8462) value, fitzgerald.html#194.19.01. --Tagishsimon (talk) 09:39, 16 November 2020 (UTC)
Thanks! Just what I was looking for. There are a few other websites that use the same software to create webpages from a database, now that I know the trick, I can dissect them. --RAN (talk) 13:59, 16 November 2020 (UTC)
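
The manual "inspect the element" step can be sketched in code for pages that follow the same pattern. This assumes the page markup really looks like the quoted snippet, i.e. each entry is wrapped in an <a name="..."> anchor; the class and function names are made up for illustration, and fetching the page is left out.

```python
from html.parser import HTMLParser


class NamedAnchorCollector(HTMLParser):
    """Collect (anchor name, link text) pairs from <a name="..."> elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._name = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            name = dict(attrs).get("name")
            if name:
                self._name = name
                self._text = []

    def handle_data(self, data):
        if self._name is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._name is not None:
            self.anchors.append((self._name, "".join(self._text).strip()))
            self._name = None


def political_graveyard_ids(page_name, html):
    """Return (entry text, 'page.html#anchor') pairs, the shape P8462 values take."""
    parser = NamedAnchorCollector()
    parser.feed(html)
    parser.close()
    return [(text, "%s#%s" % (page_name, name)) for name, text in parser.anchors]


if __name__ == "__main__":
    sample = '<p><a name="194.19.01">Fitzgerald, Andrew</a> of Connecticut.</p>'
    print(political_graveyard_ids("fitzgerald.html", sample))
```

A list of pairs is returned rather than a dictionary because one page can hold several people with the same name.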

Wikidata vandalism is perplexing

There seems to be quite a lot of Wikidata vandalism that's just people changing short descriptions to something totally unrelated, e.g. [8]. I'd love to know more about how those sorts of edits come about. Like, is it some first-time editor coming from Wikipedia and making a test edit, in which case how did they even find their way here? Or is it an LTA who knew Wikidata would be easier to vandalize, in which case why do something as boring as adding a description of not sold in stores (Q7062333) as "magic vase"? If we knew more about how these edits are coming about they might be easier to combat. {{u|Sdkb}}talk 23:34, 12 November 2020 (UTC)

I suspect based on this example, it's disgruntled poets. hth. --Tagishsimon (talk) 23:39, 12 November 2020 (UTC)
It was an unregistered editor in this particular case, and edit tags tell us that they were using the iOS (Wikipedia) app. Editing descriptions is offered in some places in both the Android and iOS Wikipedia app, although it is not very transparent that one actually edits Wikidata rather than Wikipedia. Maybe the user has just tested this function.
More patrollers would be helpful; for instance, currently there are a little over 10.000 modifications to English labels/descriptions/aliases from the past 30 days that have not been patrolled by an experienced editor. —MisterSynergy (talk) 23:53, 12 November 2020 (UTC)
@Sdkb: Also see Wikidata vandalism dashboard. --M2k~dewiki (talk) 00:26, 13 November 2020 (UTC)
@M2k~dewiki: Looks like a lot of false positives. {{u|Sdkb}}talk 00:31, 13 November 2020 (UTC)
Well, the clear majority of contributions by unregistered and new editors is indeed useful and just needs to be patrolled. —MisterSynergy (talk) 08:15, 13 November 2020 (UTC)
@Sdkb: and others: nice! Structuring of unpatrolled edits is probably most effective way to fight against massive vandalism in WD. Eg would be nice to see all unpatrolled edits which have "instance of" taxon (Q16521). PS! over 95% of unpatrolled edits are related to foreign languages, so unable to patrol by regular English speaking user --Estopedist1 (talk) 06:33, 13 November 2020 (UTC)
Re. languages you can try the reCh tool (select "terms" and enter a language code such as "en" or "en,en-gb,en-ca" into the form). Volume-wise, all languages except for English can be completely patrolled by a single editor, although for Spanish, German, French, Italian, Chinese and Russian, three to five regular patrollers would be better in the long run. However, filtering by topic (such as "taxons") is not that simple indeed. —MisterSynergy (talk) 08:15, 13 November 2020 (UTC)
The forced reload and the loading time are impacting the usefulness of the reCh tool. And yes, I am using 'no auto-reload'. --Trade (talk) 10:49, 13 November 2020 (UTC)
The tool has a glitch that if you select "reload after: x minutes" it will continue to do so even after you switch auto-reload off. Just log-out and log-in again to fix the issue. --Pyfisch (talk) 11:28, 13 November 2020 (UTC)
I suspect that we see a lot of perplexing edits to label or description because the user is attempting to change a data value and instead changes the text on the (wrong) data value to match the desired data value. See, for example, how often people change the label on specific hair or eye colours to match a different colour. Bovlb (talk) 19:42, 16 November 2020 (UTC)

Wikidata weekly summary #442

Worth noting that the Map of Welsh Railways query is slightly borked, per this thread: https://twitter.com/watty62/status/1326979452476280832 ... this version works slightly better: https://w.wiki/mGa --Tagishsimon (talk) 18:36, 16 November 2020 (UTC)

Item label embeds a non-breaking space (0xa0)

I ran a WDQS query that returned a line for Aleksandra Dąbrowska (Q61709735). The character between the two words arrived as a non-breaking space (0xa0) rather than the conventional character. I cannot see how to confirm/ fix that in the original source data item. Scarabocchio (talk) 19:43, 14 November 2020 (UTC)

I have run into this also when doing some automated bot work. I'm unsure how they got introduced but there's a non-trivial number of items with this issue. I'm unsure if there is ever a legit reason to have this in an alias (or almost anywhere?). Fixing it is tricky. BrokenSegue (talk) 01:53, 16 November 2020 (UTC)
I've run into it in author name string (P2093) values which look like somebody cut-and-pasted the names in from the article web page where they used non-breaking spaces (which is a sensible thing to do for web display). Maybe somewhere in the Wikidata UI or API there ought to be a way to confirm that you really meant to enter invisible alternate characters like this? Tricky though. ArthurPSmith (talk) 13:31, 16 November 2020 (UTC)
I don't think it's ever legit to have a non-breaking space in a name. I would support a bot clearing this out and then just banning them going forwards. I'm no name expert though and maybe there's some weird case. BrokenSegue (talk) 19:03, 16 November 2020 (UTC)
I wouldn't want to ban non-breaking spaces as there might very well be some language that declares certain terms to be non-breaking. https://english.stackexchange.com/a/28475/1502 suggests that a non-breaking space can validly be used in English in "J. Q. Adams" between "J." and "Q.". ChristianKl 01:34, 17 November 2020 (UTC)
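
For confirming whether a label really contains a non-breaking space, a small offline check can make the invisible character visible. This is only a sketch with made-up function names. It reports positions rather than changing anything automatically, since, as noted above, some non-breaking spaces may be intentional.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the 0xa0 character mentioned above


def find_nbsp(text):
    """Return the character positions of non-breaking spaces in text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]


def replace_nbsp(text):
    """Return text with every non-breaking space swapped for a plain space."""
    return text.replace(NBSP, " ")


if __name__ == "__main__":
    label = "Aleksandra\u00a0Dąbrowska"
    print(ascii(label))         # the escape makes the invisible character visible
    print(find_nbsp(label))     # positions to review by hand
    print(replace_nbsp(label))  # candidate replacement, for review before editing
```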

Global ban proposal for Kubura

Hello. This is to notify the community that there is an ongoing global ban proposal for User:Kubura who has been active on this wiki. You are invited to participate at m:Request for comment/Global ban for Kubura. Thank you. Blablubbs (talk) 21:30, 14 November 2020 (UTC)

@Blablubbs: I have moved this from the admin noticeboard to the project chat, which is probably the better location for community notifications --DannyS712 (talk) 23:40, 16 November 2020 (UTC)

Does every wikidata item need a reference?

I'm used to adding some references here and there on Wikidata items, but sometimes I get the feeling that some of the data I add is superfluous. Take my latest edit on Wikisource (Q263), for example: does instance of (P31) = website (Q35127) need a reference? I added one. We have 11250 Wikidata items with instance of = website. Will a bot add something to the reference field for these items, or should we leave references out of these items? LotsofTheories (talk) 23:49, 11 November 2020 (UTC)

I don't know if there is an official policy on this, but I know that there is substantial resistance to use of Wikidata by other Wikimedians because it does not seem to have the same level of control and documentation as Wikipedia. A few years ago I got several notices of changes to a Wikipedia article, which I could not see, because it was apparently a change to a Wikidata item used in an infobox. I haven't seen any of these recently, which suggests to me that the WMF may have decided to deal with this problem by concealing it rather than fixing it. I would never accuse the WMF of being better than the Vatican or the US government on things like this. And that raises a red flag for me.
However I think every Wikidata item, but not every Wikidata statement, should have a reference.
Does that answer your question, at least from the perspective of one User who has made 13,395 edits since 2010-03-26, most of them in Wikidata? DavidMCEddy (talk) 02:12, 12 November 2020 (UTC)
"suggests to me that the WMF may have decided to deal with this problem by concealing it " That supposition is false. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:55, 12 November 2020 (UTC)
(edit conflict): Every item needs to have at least one external source establishing and describing its existence (see Wikidata:Notability), regardless of whether or not each statement of an item is explicitly referenced. Not every statement requires a reference, especially undisputed or self-evident properties like instance of = human (Q5) or External identifiers, although sources can certainly be added if appropriate. See Help:Sources and Help:Sources/Items not needing sources for more info. -Animalparty (talk) 02:20, 12 November 2020 (UTC)
Unless the item is purely Wikidata related to help group certain items, almost all items require verifiable external sources, but not all statements within the item, as discussed above. Recommend you watch the below video completely, however, the information for your question starts at 11 min 27 sec (When is it okay not to add a reference?).
Wallacegromit1 (talk) 02:36, 12 November 2020 (UTC)
Thank you. I watched a bit of the video. It was helpful to me. LotsofTheories (talk) 07:35, 12 November 2020 (UTC)

What is the purpose of our data; references are secondary

When the data quality of Wikidata is assessed, it goes from one to five. The elephant in the room is where we have no data or the data is based on Wikipedia where they have no data. Another quite frequent situation is where the data is not precise in one place and more precise in another. All of that is relevant but it starts with the question: why do we have this data? When Wikidata is to reflect BOTH the references of Wikipedia and the data that is in Wikipedia, a reference is linked to a Qid but it is best expressed in the associated Scholia. A fully developed Scholia adds value in many ways; it includes links to the authors and papers it cites. It includes the topics that are associated with the paper. So a proper reference is associated with more than just "this paper is supposed to say this or that".

When a Wikipedia list includes "all people who held an office" or "all people who won an award", it seems obvious that the website for the award should be the source. That or the website for an office and there often is nothing to be found. So we have a Wikipedia list with black or red links, she is called Karen or Mohammed and the only thing we can do to have a complete list is add Karen or Mohammed. The award is linked to Wikipedia it includes a black mark and in order to complete a list we do what we have to do. There is no reference but it is ok.

In a Reasonator page, a query, in a Scholia page and in hundreds of other applications this works fine. It is better than a level 0 failure; it is this notion of compulsory references that makes us fail. Thanks, GerardM (talk) 08:30, 12 November 2020 (UTC)

It is better to have no data, than to have wrong data. Frequently data from dubious sources, data with systematic errors and outright made up data is added to Wikidata. If these claims don't have sources I have no better way of checking than googling for the claim most of the time. This is extremely inefficient and prevents substantial improvements to the data quality. If on the other hand the claim has a source, I just need to check the source and know if the claim is ok. The insufficient standard for sources harms verifiability and damages the reputation of Wikidata. --Pyfisch (talk) 11:40, 13 November 2020 (UTC)
It is a false assumption that the opposite of data that is not "properly" referenced is poor data. The notion that data with references is without problems is also problematic. For instance, when a scholarly paper has an "author string" with references, it is not certain that whatever author is indeed the author suggested by the paper. In that case a reference to ORCID provides much better assurance. We "accept" Wikipedia as a source and it provides assurance but not a reference.
The assumption of "better to have no data, than to have wrong data" is hugely problematic -it assumes that all data without references is wrong. Problematic is the bias that makes you blind for what does not conform to your notions of reference. Thanks, GerardM (talk) 15:36, 13 November 2020 (UTC)
I have made the experience that there are some claims in Wikidata I just can't verify. Especially if the claims are remarkable, I have no choice but to consider this data to be poor. If there is a reference – whatever is given: a website, authority control record, or book – I can look this reference up, see if it makes the claim and make my own judgment if I believe this claim. (My sentiment mirrors that of the essay Verifiability, not truth.)
One example of such an unverifiable claim would be the eye color of Francis Biden (Q100741824). How does Wikidata know that? I have no idea; it might as well be green, purple or gray. I think for all data that does not have references right now, references should be added (statements not needing references by policy excluded, of course). I am not sure what you mean by "my notions of reference"? --Pyfisch (talk) 19:48, 13 November 2020 (UTC)
When you make it about what you believe, then there is little point in having an argument. At some stage a taxonomy of quality was proposed and imho it does not apply to single statements. You can consider an unreferenced statement suspect but even with what we have as references, the correctness of a statement does not change with or without a statement of reference. When you compare the Scholia for "red deer" with the references on the article of a major herbivore, it is obvious that the Wikipedia article references are piss poor and that the subject does not cover much ground. (I made a comparison, considered the breadth of what there is available for a reference and made this judgement). When you argue that you read an essay and use it as a reference point of consensus, Mr Trump is a master of consensus building, he is the president of the USA, can be used as a reference and is proven to be a liar. Thanks, GerardM (talk) 06:42, 14 November 2020 (UTC)
You misrepresent every argument I made. Nowhere do I state that the correctness of a statement depends on the fact that it has a reference (but its verifiability often depends on it), neither do I say that we should use claims made by incorrigible liars to establish consensus or avoid good sources. It still remains mysterious to me why you conclude that because statements with references can still be wrong, statements don't need references (or need not be verifiable). There is one thing, however, on which I can agree with you: there is little point in having an argument with you. --Pyfisch (talk) 08:19, 14 November 2020 (UTC)
Your whole argument starts with "It is better to have no data, than to have wrong data". My point is that references are secondary to having the data. Your arguments are inconsistent and it is easy enough to show how references can be problematic. So going with your argument; it is better to have no references than problematic references. Thanks, GerardM (talk) 08:34, 14 November 2020 (UTC)

If someone would prefer no data over unsourced data they can just discard all unreferenced statements. Clearly unsourced data is generally preferred to no data, especially if it's common knowledge and the time spent not adding the reference is spent adding more good data. BrokenSegue (talk) 01:58, 16 November 2020 (UTC)

When someone discards unreferenced statements, it is vandalism pure and simple. Particularly when it should be obvious that the data is likely correct, or at least unlikely to be incorrect. Thanks, GerardM (talk) 10:06, 17 November 2020 (UTC)

Content in a museum

Looking at some museums like Frog Museum (Q125748) or Museum Tinguely (Q63058423) and Landslide Museum (Q27479538), I noticed that there is no connection between the museum and the content of the museum, for example Landslide Museum (Q27479538) is about Goldauer Bergsturz (Q820308) but there is no link between the two. What would be the right property here? I see that we have main subject (P921) and commemorates (P547) (or contains (P4330)) that may fit, but none of these really fit well. To use these we would have to expand the definition of some of these. It just seems strange that Museum Tinguely (Q63058423) is not really linked to Jean Tinguely (Q163938) except through Museum Tinguely (Q63058423)named after (P138)Jean Tinguely (Q163938). Any thoughts? --Hannes Röst (talk) 19:25, 13 November 2020 (UTC)

Main subject fits very well for the examples you've given. You say it yourself: Landslide Museum (Q27479538) is about Goldauer Bergsturz (Q820308). The main subject (P921) of Landslide Museum (Q27479538) is Goldauer Bergsturz (Q820308). --Tagishsimon (talk) 19:39, 13 November 2020 (UTC)
Thanks, I am a bit hesitant because (a) out of 1800 museums, only 5 currently use this approach and (b) main subject (P921) is defined as "primary topic of a work" and does not seem to be intended for museums but really more for books, works of art etc. I am happy to use main subject (P921) but wanted to get some input on whether we should change and expand the scope of main subject (P921). Best --Hannes Röst (talk) 21:49, 13 November 2020 (UTC)
I agree, main subject seems to fit well here. A curated collection is “built by humans”, so it is a “work” in the broad sense. But I wouldn’t object to an updated description on main subject (P921) saying “primary topic of a work, collection, or anthology” to make it clear that it applies to collective works as well as creative works. Pelagic (talk) 01:36, 14 November 2020 (UTC)
For museums with more general scope: say a modern art museum is an instance of (P31)art museum (Q207694), should we also say that it has main subject (P921)modern art (Q38166) &/or genre (P136)modern art (Q38166) (taking note of Pmt below for genre)? I would lean toward preferring genre where applicable, and main-subject for museums that have a focus which is more specific than a genre. Or would this be an undesirable conflation of “about genre”, “belongs to genre”, and “has things belonging to genre”? Pelagic (talk) 02:19, 14 November 2020 (UTC)
Following up on that, it is clear that a frog museum is about frogs and a landslide museum is about the landslide, but to some degree a Tinguely Museum is generally not about the person and using "main subject -> Tinguely" would be wrong, since the museum would generally be about the artists work. Should in this case a separate item be created that describes the oeuvre of an artist? --Hannes Röst (talk) 21:58, 13 November 2020 (UTC)
I think that is done in many cases. Both as a list with has list (P2354): Wikimedia list related to this subject, with genre (P136): creative work's genre or an artist's field of work (P101), and as a separate item for each work. Use main subject (P921) to relate creative works to their topic. You will also have has works in the collection (P6379): collection that has works of this person or organisation (use archives at [P485] for archives), and notable work (P800): notable scientific, artistic or literary work, or other work of significance among subject's works. catalog code (P528): catalog name of an object, used with qualifier P972, is also available. Pmt (talk) 22:28, 13 November 2020 (UTC)
Looking at your suggestions, I think you mean the inverse of has works in the collection (P6379) which is contains work(s) by (Q83404986) since the statement should be Jean Tinguely (Q163938)has works in the collection (P6379)Museum Tinguely (Q180904) which is a start and implies the inverse. But it does not specify yet that the whole museum is *mainly about this* part of art. As stated below, Museum Tinguely (Q180904)main subject (P921)Jean Tinguely (Q163938) is also not correct since its not about the artist but his work.
Secondly, at the moment art museum (Q207694)field of work (P101)art (Q735) which seems quite reasonable "specialization of a person or organization" and could also be used for Landslide Museum (Q27479538)field of work (P101)Goldauer Bergsturz (Q820308)
I also think genre (P136) could be appropriate in some cases but not for technical / scientific museums but for artistic museums. In German it even clearly specifies "genre in which a person or project is working" Landslide Museum (Q27479538)genre (P136)Goldauer Bergsturz (Q820308) seems wrong. --Hannes Röst (talk) 03:19, 14 November 2020 (UTC)
For technical / scientific museums, as Pelagic suggests, main subject (P921) can be used in a broad way for a museum, and should be able to carry multiple main subjects (?). With reference to the property proposal Property_talk:P921, the domains for this property are set to, among others, cultural institution (Q5193377), Q2668072, Q18593264, Wikipedia article covering multiple topics (Q21484471) and theme park (Q2416723). Pmt (talk) 20:24, 14 November 2020 (UTC)
A general technological museum could be "instance of = museum" with "main subject = technology (Q11016)". "Smallville Automobile Museum . main subject = motor vehicle (Q752870) or motor car (Q1420)" feels appropriate.
Seems so far that people are generally comfortable (?) about using P921 with cultural institutions, and as Pmt points out, it's explicitly mentioned in the property talk. Multiple P921 might sometimes be useful (e.g. science and technology).
How to say that a museum is about "works of Jean Tinguely" versus about "Jean Tinguely" is trickier.
Pelagic (talk) 12:47, 17 November 2020 (UTC)


The Museum Tinguely (Q63058423) is for the building. So the collection should be a separate item located in the building, like Louvre Palace (Q1075988) contains Musée des Arts décoratifs (Q1319378) and other non-museum institutions. The Landslide Museum (Q27479538) has a collection exhibiting things related to the Goldauer Bergsturz (Q820308). Pmt (talk) 21:51, 13 November 2020 (UTC)

Sorry, yes that should have been Museum Tinguely (Q180904) - I mixed them up. Sorry about that. --Hannes Röst (talk)

voice actor (Q2405480) for movie items

Is Q2405480 usable this way, or was it created only for actor items? Say I could theoretically add a "dubbed by" qualifier for some actor on Tor (Q217020). But it is useless unless I can right away add a dubbing language subqualifier or something like that, especially for a movie dubbed into a bunch of languages. Am I missing the right procedure, or just trying to use the property/qualifier in a way it was not intended to be used? --NeoLexx (talk) 08:46, 17 November 2020 (UTC)

חוימחעחחיו ןמהטי לחימ טללןכחכוימ חחחכחוצחסחח להחהItalic textחכמצצבחצלצבםחל םןח לחכןםחיחףדםך

It would be best to create an own item for the dubbed version(s), see for instance Thor (Q24046091). - Valentina.Anitnelav (talk) 09:55, 17 November 2020 (UTC)
Oh, I see. And how do I connect it with the original Tor (Q217020), so it would be accessible from the main wiki article? Maybe it is already connected, but there are so many filled properties in the item that it is really hard to find even if it is there. What is that Hebrew text a bit above? --NeoLexx (talk) 10:11, 17 November 2020 (UTC)
edition or translation of (P629) on the dubbed version, pointing back to the original version. has edition or translation (P747) on the original version, pointing to the dubbed versions. --Tagishsimon (talk) 10:30, 17 November 2020 (UTC)
Got it, thank you to all of you! --NeoLexx (talk) 10:48, 17 November 2020 (UTC)
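For anyone who wants to check which dubbed or translated versions are already linked to an original, here is a minimal sketch that builds a query for items pointing back at the original via edition or translation of (P629), as described above. The function name is hypothetical and made up for this illustration; the query only finds versions that actually carry a P629 statement.

```python
def dubbed_versions_query(original_qid):
    """Build a SPARQL query listing items whose P629 points at original_qid."""
    # Hypothetical helper: accept only plain QIDs such as "Q217020".
    if not (original_qid.startswith("Q") and original_qid[1:].isdigit()):
        raise ValueError(f"not a QID: {original_qid!r}")
    return (
        "SELECT ?version ?versionLabel WHERE {\n"
        # P629 = edition or translation of (set on the dubbed version).
        f"  ?version wdt:P629 wd:{original_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}"
    )


if __name__ == "__main__":
    # Paste the printed query into the Wikidata Query Service.
    print(dubbed_versions_query("Q217020"))
```

The inverse direction, has edition or translation (P747) on the original, can be queried the same way by swapping subject and object.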

Wikidata property for arbitrary lists?

This is a beginner question - I do not know what I am doing. Thanks.

I would like some thoughts from others on bringing arbitrary lists into Wikidata. I am especially interested if anyone can point me to previous discussions or similar ideas.

The general idea is that I want to run queries on arbitrary lists of academic articles. For example, I want to be able to take classroom reading lists of about 30 pages, or review article collections of ~50-300 papers, and run queries on the Wikidata items for those papers. I would compile the list then I want a way to run queries.

Other circumstances where someone might want to compile a similar list to query include the following:

  • List of works (movies, museum objects, books) which a particular artist casually says are their favorites
  • List of biographies profiled in a "top 100" people list
  • List of ~100 subjects of Wikipedia articles which some internal wiki working group deems to be important

One way to make a list is to create a Wikidata item titled for each list, then that Wikidata item has part(s) (P527) for each item in the list. In this way, each Wikidata item for a list contains all the elements of that list. The advantage of this is that there is little mess to other items if people bring in many lists of low importance, as all of the editing only happens on the one item.

Another way to make a list is to have a property which tags various items in a collection as being part of a particular list. Advantages of this is that I feel like queries are easier to do this way, however, I am inexperienced in querying and I am not sure how complicated it would get to start writing queries to instead always pull out the list of parts from an item. I think there is no Wikidata property for arbitrary lists of this sort, and perhaps this is intentional.

Wikidata properties which I found which seem like they could be lists include

What is the precedent, and which way should I explore? Is there a Wikidata property which holds arbitrary lists? If not, are people writing queries which instead of querying for a property, they instead get all their Q IDs from one Wikidata item which itself contains a list, then further query all the items in that list?

Thanks for any response.

Blue Rasberry (talk) 13:05, 17 November 2020 (UTC)

Part of the question you ask - the mechanics of representing lists in wikidata - has a vast answer. Really, wikidata is all about lists, and everything we do here contributes to the production of lists. There is not the time or space to meaningfully answer this part of the question, beyond to note that there are very many ways of pointing one item to another saying, this is part of that, or this has that as a part, &c.
As to creating items so as to create arbitrary lists, the clear answer is yes. And no. Mainly no. I think it is the case that wikidata supports a couple of sorts of lists, being 1) sets that have well-defined membership criteria and 2) lists of things of interest to wikiprojects - on focus list of Wikimedia project (P5008). It does not, I think, support the idea of me adding an item for 'Tagishsimon's 100 favourite items in wikidata'. So perhaps we can say that list definition items need to meet WD:N.
Wikipedias and, though I don't know much about it yet, tabular data on wikipedia commons, are very good places to put arbitrary tabular lists. Recreating that in wikidata is a poor solution
Touching on a specific point you raised, my own view is that in most circumstances, it is better to point child records at parents, than parent records at children, mainly for the reason that very long lists of property statements in a single item are ugly to navigate and have negative impacts for the performance of wikidata, to do with its RDF serialisation. If the parent, list definition, item is notable, then there can be no objection to finding it pointed to from a child record.
I very strongly urge you to learn SPARQL and get to grips with WDQS, if you have real interest in wikidata. I think the reason you are reaching out for lists in the way you have described them, above, is because you cannot yet write a report. (And this is fine - we all started where you are starting). Wikidata:Request a query is very much your friend here: it helps to familiarise yourself with the sorts of questions being raised; the patterns found in the data; the specifics of the SPARQL language. Trying to work out why answers given on that page fit the questions being raised; or trying to work out what the various bits of a query are doing, is a fine way to learn. And there are ample users on that board who have great knowledge and an inextinguishable willingness to answer questions, show methods, hand-hold you through the learning curve. hth --Tagishsimon (talk) 14:33, 17 November 2020 (UTC)
It is relatively simple to provide input QIDs to WDQS and perform the query on this input set. The easiest would be to input directly in the SPARQL query using a VALUES clause—this even works with longer lists of roughly up to 10k input items. Another option would be to read the input items from an onwiki page in Wikidata (or probably also Wikipedia), e.g. in your user space, directly to the Query Service using the "mwapi" service; I am not aware of limits here, but "hundreds of items" should definitely be possible. In both cases, you do not need to place your lists into any production workspace.
If you need assistance, please let us know. I think we can write some basic examples once you come up with actual use cases. —MisterSynergy (talk) 17:35, 17 November 2020 (UTC)
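As a rough illustration of the VALUES approach mentioned above, here is a minimal sketch in Python that turns a privately kept list of QIDs into a query. The function name and the choice of publication date (P577) as the example property are assumptions made for this illustration; any other property could be queried the same way.

```python
import re

# A QID is "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")


def build_values_query(qids, language="en"):
    """Return a SPARQL query that restricts ?item to the given QIDs."""
    cleaned = []
    for qid in qids:
        qid = qid.strip()
        if not QID_PATTERN.match(qid):
            raise ValueError(f"not a QID: {qid!r}")
        if qid not in cleaned:  # drop duplicates, keep input order
            cleaned.append(qid)
    values = " ".join(f"wd:{qid}" for qid in cleaned)
    return (
        "SELECT ?item ?itemLabel ?published WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        # P577 = publication date; chosen only as an example property.
        "  OPTIONAL { ?item wdt:P577 ?published . }\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )


if __name__ == "__main__":
    # The list itself lives outside Wikidata, e.g. in a spreadsheet column.
    print(build_values_query(["Q84", "Q22654"]))
```

The list never has to be stored in an item: it stays in a file or spreadsheet and is only pasted into the query at run time.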
@Bluerasberry: See the above responses for some thoughts on this. The way I would approach it is probably to create the list elsewhere (in a spreadsheet, for example) and then work with it through tools like OpenRefine, which can pull in data from Wikidata via reconciled items. There are also tools to work with Google spreadsheets that may be helpful (I haven't tried those myself). ArthurPSmith (talk) 18:11, 17 November 2020 (UTC)

Community Wishlist Survey 2021

Hi everyone,

The proposals phase of the Wikimedia Foundation Community Wishlist Survey 2021 has started. You can submit proposals for features and changes in various categories including for Wikidata - Community Wishlist Survey 2021/Wikidata.

The survey is open until November 30th and the evaluation phase will take place between November 23rd and December 7th.

Cheers, -Mohammed Sadat (WMDE) (talk) 17:09, 17 November 2020 (UTC)

Also relevant:
in the Commons section, because we need to find an alternative, and make it workable, instead of large time-series getting dumped in Wikidata items, causing endless item bloat. Jheald (talk) 19:28, 17 November 2020 (UTC)

Language availability

Hello,

Does Wikidata exist in a Korean version (like Wikipedia does: https://ko.wikipedia.org/)?

Thank you

Wikidata by itself is language-independent, as it is only about pure data. Only the meta discussions are in various languages, for example the Korean version of this page at Wikidata:사랑방. Ahoerstemeier (talk) 12:01, 18 November 2020 (UTC)

Call for insights on ways to better communicate the work of the movement

The Movement Strategy recommendations published this year made clear the importance of establishing stronger communications within our movement. To this end, the Foundation wants to gather insights from communities on ways we all might more consistently communicate about our collective work, and better highlight community contributions from across the movement. Over the coming months, we will be running focus groups and online discussions to collect these insights. Visit the page on Meta-Wiki to sign up for a focus group or participate in the discussion.

ELappen (WMF) (talk) 18:56, 18 November 2020 (UTC)

cloud-vps projects purge

There are several unclaimed projects that appear to be wikidata-related listed here, which will be deleted at the end of this month (November 2020) unless somebody claims them. The ones I noticed still unclaimed are:

wdqs-scaling
wikidata-federation
wikidata-history-query-service
wikidata-primary-sources-tool
wikidata-realtime-dumps

If these are important to anybody here please check into what's needed to prevent them from being removed. ArthurPSmith (talk) 19:47, 18 November 2020 (UTC)

Horse-drawn railways and Horse-drawn trams

One is for a Commons category: Q7959671 (Category:Horse-drawn railways) and the other, Q7959671 (Wagonway), has two Wikimedia articles. I cannot link Q7959671 to the Commons category because it is already linked to Q7959671. Why are these Wikidata items for Commons categories even allowed when there is a Wikidata item about the subject? Smiley.toerist (talk) 01:18, 17 November 2020 (UTC)

Q832003 (horsecar) has as topic's main category: Category:Horse-drawn railways, again a Wikidata item for a Commons category. It should be Category:Horse-drawn trams. This should be fairly simple to solve:

In the Commons you have:

level 1: Animal-powered rail transport (in practice nearly always horses, but there are trams drawn by donkeys)
level 2: Horse-powered rail transport
level 3: Horse-drawn railways, Horse-drawn trams

To confuse things there is also tramway (Q7833250), which is a bush tramway, not associated with the infrastructure for trams.

How to bring some clarity into this?Smiley.toerist (talk) 01:18, 17 November 2020 (UTC)

Confusion seems to be the name of the game here... Your first two links are to the same item. I believe one should be Category:Horse-drawn railways (Q8522379), instead?
Categories are regarded as distinct from their main topic, so they shouldn't be merged. If you are irritated by the different terms, it would usually be fine to put the cart before the horse or, specifically, align them in whichever way you see fit and to record disfavoured terms as aliases. I would advise, however, that you try to glean the meaning of any terms in other languages that may be linked and consider that most of those linked projects have far more editors than Wikidata. The increased scrutiny that comes with all those eyeballs should receive some deference.
From quickly gleaning at that category page on enwiki, it appears that the horse-drawn railway isn't a synonym, but rather the superset including "plateways, tramroads, tramways, wagonways and dramways". Two of those terms are blue links, so these things may actually exist and/or are a prank so elaborate you don't want to ruin it. --Matthias Winkelmann (talk) 19:38, 18 November 2020 (UTC)
There is no precise way to define 'trams' and 'railway'. Trams are a big subset of railways, but most cases can be treated separately. Historically, nearly every city of any size had a horse-drawn tramway, which in most cases was later electrified. These horse-drawn trams were always single vehicles used for passenger transport. Horse-drawn railways, however, were almost exclusively used for freight transport, generally with many small freight wagons. Mostly such lines carried freight downhill and needed the horses mainly to pull the empty wagons back uphill. (Horses can only pull; they are of no use in braking. They certainly don't like to be pushed from behind.) These railways were of little use for passenger transport as the speed was quite low (except downhill). The only advantage was that the ride could be smoother than on the roads. I have been splitting the data items into the two sets on the basis of content (not the confusing names such as tramway):

How does one convert a big list of articles on a specific wiki to Wikidata IDs and vice versa?

? WikiJunkie (talk) 17:07, 18 November 2020 (UTC)

This sort of thing?
# Sitelink name (i.e. EN article name on EN Wikipedia) to article URL and QId
SELECT DISTINCT ?item ?itemLabel ?article ?sitelink
WHERE 
{
  VALUES ?sitelink {"London"@en "New York"@en}
  ?article schema:about ?item ; 
          schema:isPartOf <https://en.wikipedia.org/> ;
          schema:name ?sitelink .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
# QId to article URL & sitelink for EN wikipedia
SELECT DISTINCT ?item ?itemLabel ?article ?sitelink
WHERE 
{
  VALUES ?item {wd:Q84  wd:Q22654} # this could be a query yielding QIds
  ?article schema:about ?item ; 
          schema:isPartOf <https://en.wikipedia.org/> ;
          schema:name ?sitelink .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
--Tagishsimon (talk) 17:15, 18 November 2020 (UTC)
https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine might be a good idea too, if you are speaking about a list of articles pasted from a wiki page. Bouzinac💬✒️💛 17:55, 18 November 2020 (UTC)
Here's how to do it using the Mediawiki API: https://en.wikipedia.org/w/api.php?action=query&format=json&prop=pageprops&meta=&continue=&titles=Berlin|New%20York|Paris|London&ppcontinue=&ppprop=wikibase_item
You can get IDs for up to 50 pages with a single request by adding their titles, separated by the pipe symbol ("|"). As far as I can tell it's far faster than the query service which, in my impression, is a bad match for such queries anyway.
Here's the link to the API Sandbox for that request where you can play around with it and see what else it can do: Sandbox --Matthias Winkelmann (talk) 19:12, 18 November 2020 (UTC)
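For lists longer than 50 titles, the request above has to be split into batches. Here is a minimal sketch in Python of that batching, plus a parser for the JSON that comes back. The function names are made up for this illustration, the URLs are only built (not fetched), and the parser assumes the default response layout in which pages are keyed by page ID. Titles that the API normalizes or that are redirects may come back under a different title than the one submitted.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
BATCH_SIZE = 50  # per-request title limit mentioned above


def batch_titles(titles, size=BATCH_SIZE):
    """Split a list of titles into chunks of at most `size`."""
    return [titles[i:i + size] for i in range(0, len(titles), size)]


def build_request_urls(titles):
    """Return one pageprops request URL per batch of titles."""
    urls = []
    for batch in batch_titles(titles):
        params = {
            "action": "query",
            "format": "json",
            "prop": "pageprops",
            "ppprop": "wikibase_item",
            "titles": "|".join(batch),  # pipe-separated, as in the example URL
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls


def extract_qids(response_json):
    """Map page title -> QID from one decoded API response."""
    result = {}
    pages = response_json.get("query", {}).get("pages", {})
    for page in pages.values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:  # pages without a Wikidata item are skipped
            result[page["title"]] = qid
    return result
```

Each URL can then be fetched with any HTTP client and the decoded JSON passed to `extract_qids`; the results from all batches are merged into one dictionary.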
I used to be able to do this a couple of years ago with Petscan, if I'm not mistaken, but the tool has been modified over the last few years and now I can't seem to figure out how to do this. What I want it to do is fetch the Wikidata page IDs of thousands of articles using Pagepiles. Does such a simple feature not exist anymore? Should I try to discuss this with the Petscan developers? If so, where can I address questions to them? WikiJunkie (talk) 01:23, 19 November 2020 (UTC)
@WikiJunkie: You can actually do articles -> QIDs directly within PagePile, by checking the "Translate result to Wikidata items immediately" checkbox down at the bottom of the new pile form. In PetScan, you need to put your articles in the "manual list" box of "Other sources" (and fill in the wiki box right below), then go to the Wikidata tab and select "Add items, where available" (or similar) at the top. Items to articles is trickier; I don't know of any graphical tools that do that, so the query Tagishsimon put up above is probably your best bet. Vahurzpu (talk) 05:00, 19 November 2020 (UTC)
You can do QIDs -> articles in Petscan as well. "Other sources", paste a list of QIDs into manual list, set to wikidatawiki, and then lower down that same page "use wiki > enwiki" (or wherever). Example. Andrew Gray (talk) 09:58, 19 November 2020 (UTC)

My first property proposal

I created the data-item Q102109830 and coupled this to a new property proposal in Wikidata:Property proposal/Transportation: 'date posted'. I have a few questions:

  • I put this proposal in the transport section on the basis that the postal service is a transport service, but maybe the proposal should be put in the generic section.
  • Are the parameters for the data-item correct? On some I have my doubts, such as 'occurrence', 'transport', 'administration' and 'legal concept'. There are connections there, but I am not certain. Smiley.toerist (talk) 14:13, 19 November 2020 (UTC)

Tube zither (Q30034781) and Category:Tube zithers (Q8874118)

I created the page Tube zither (Q30034781) a while ago. While trying to get a picture to show up on Wikimedia Commons I realized there are two different pages. Normally I wouldn't care, but I think both pages need to be linked to the category pages on the English Wikipedia and on Commons. In the larger scheme of things, when pages are created for categories and also for articles about the thing that the categories cover, what should be represented on Wikidata? Should the category page stand and link to the article page, or should the other page stand and link to the category? Is there a standard? Thanks Jacqke (talk) 15:19, 19 November 2020 (UTC)

@Jacqke: So one item represents the concept of tube zithers and one represents just the wikimedia category. Both should stand and link to one another. The commons link should go to the concept not the category (unless the commons media is like pictures of the category or something weird). The enwiki should be linked to the concept. BrokenSegue (talk) 15:26, 19 November 2020 (UTC)
@BrokenSegue: I appreciate the explanation. I think I've got it set that way now. Thank you. Jacqke (talk) 15:31, 19 November 2020 (UTC)

Template for paid editing

At https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Alternate_disclosure_policy there's currently a debate about adopting a template for paid disclosure. Having input from more people would be great. ChristianKl20:31, 19 November 2020 (UTC)

Statistics on maxlag

Are there statistics available on how often maxlag=5 is exceeded? I think having those numbers would be useful for understanding whether and when we need to do things to reduce the edit count. ChristianKl 18:45, 19 November 2020 (UTC)

https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&refresh=1m has about the last 4 months of data, or https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=8&orgId=1&refresh=1m&from=now-6M&to=now if maxlag is all you want to see. Any bot should already know what to do when maxlag > 5, no? --Tagishsimon (talk) 18:48, 19 November 2020 (UTC)
Bots should know what to do, but if, for example, we were over maxlag 12 hours a day, then we would likely need to take more action to bring the overall level down. When thinking about whether or not to approve a new bot request that would create a lot of new edits, it matters whether or not we are at capacity. ChristianKl 20:27, 19 November 2020 (UTC)
My experience of looking into the detail of maxlag > 5 generally finds a bot completely ignoring maxlag. My understanding is that if bots obey maxlag, as they should, then that is all that is required. High maxlag is a problem that fixes itself if bots obey the rule. Even if we are 'at capacity' - and we have not been for some months now ... the maxlag situation is an order of magnitude better than it was, say, a year ago - but even if we were, the solution is still for bots to apprehend maxlag, rather than to ration new bot permissions. YMMV. --Tagishsimon (talk) 20:32, 19 November 2020 (UTC)
I'm not sure how best to measure it, but it would be great if we had a clear metric for when we are 'at capacity' so that we are able to react to it. ChristianKl 21:08, 19 November 2020 (UTC)
Well, I guess that would be when maxlag > 5 for long periods of time, and Grafana is the place to look out for it. Looking back through my tweets, I see I spent much of November 2019 and probably the months before, ranting about performance and maxlag design when we had repeated very protracted periods of > 5 ... & e.g. https://lists.wikimedia.org/pipermail/wikidata/2019-November/013658.html ... this year, performance has been very much better. I don't think you need to worry too much about the absence of people coming forward if we get into a repeat of the previous pattern. --Tagishsimon (talk) 21:21, 19 November 2020 (UTC)
(edit conflict) As Tagishsimon said, the status quo is a setting that is sort of self-stabilizing—as long as bots and tools actually respect the maxlag value per policies. In my opinion, nothing else needs to be done on our side now that we have the 90 edits/min hard limit anyways. The observed on/off edit pattern of bots "at capacity" is certainly not ideal, but it is effectively the best we can have and most bots are meanwhile not failing when it happens. It is important to look for non-complying bots when maxlag>5, as those are the ones who continue to cause trouble when they really shouldn't. When we had these problems roughly half a year ago, this happened more often than acceptable and I did engage with some bot operators in order to make their bots editing compliant with the policies. However, recently there have barely been any problems, as the edit load decreased significantly in the past months and potentially also because the WMF team might have streamlined the critical WDQS update process by a bit. —MisterSynergy (talk) 21:23, 19 November 2020 (UTC)
That description of the timeline sounds like the 90 edits/min hard limit was established at a time when the problems had already been reduced. If that's the case, is it good that we have the limit?
Are new bots that add 10,000,000 academic papers unlikely to cause problems in the current setup? ChristianKl 23:07, 19 November 2020 (UTC)
Empirically they - LargeDatasetBot et al - are not causing problems. The max limit is a sensible way of rationing capacity. It would be ideal to push the max rate up; but it is as or more important to prevent large maxlag which will tend to arise as the limit of capacity is neared. Given that demand for resource is very changeable - depends on how many bots & users are running simultaneously, and exactly what they're doing - WMDF need to keep considerable resource in hand to manage the peaks. We need to trust that WMDF is monitoring load metrics throughout the system, and working out how to provide more capacity; but consistent with the notion that it has limited capacity (people, capital) to make changes. I think most of us are mainly glad that the maxlag induced outages of the past seem to be in the past, rather than immediately desirous of more throughput. --Tagishsimon (talk) 00:04, 20 November 2020 (UTC)
Yes I think it is good to have the hard limit. I've witnessed a bot running at 1000 edits/min in the past twice, and values such as 500/min by various operators several times. There has been kind of a race for resources for a while, in order to get as many edits as possible to the servers while maxlag<5 (it was the time when maxlag showed this sawtooth pattern which made all bots running for like 10 minutes and then waiting for 10–20 minutes). All of that is now impossible, as 90/min is the server-side upper limit.
More items or more bots do not cause immediate problems at this point, particularly not regarding maxlag. There are some technical scaling issues which are rather serious: graph databases generally do not scale well horizontally, or in other words there are basically some "entire Wikidata needs to fit into a single computer" situations which cannot simply be avoided with more hardware. In the long run I would expect more federation to be necessary, i.e. distribute the knowledge over different databases/projects/whatever and make them talk to each other via federated queries. The required technology is not very mature yet, but the WMDE team is very much aware of the problem and they spend quite some efforts for Wikibase development in order to make it fit for the future. —MisterSynergy (talk) 00:26, 20 November 2020 (UTC)
Always interesting to ponder on when the federation point will come. I see a user has started adding every road in some country or other, which looks like the thin end of a large wedge. I've added a gazillion laws. Sweden has added the same number of parliamentary motions &c. GZWDer is mining genealogical sites for all they contain. Project every academic paper still has plural active contributors. GLAMs and their catalogues. I stand amazed that WDQS continues, nonchalantly, to give us results. --Tagishsimon (talk) 01:14, 20 November 2020 (UTC)
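
For readers unfamiliar with what "obeying maxlag" means in practice, here is a minimal sketch, not taken from any bot discussed above. It assumes the client sends maxlag=5 with every request and that a lagged server answers with an error whose code is "maxlag", usually together with a Retry-After header. The send callable and both function names are hypothetical stand-ins for whatever HTTP layer a bot really uses.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

MAXLAG_SECONDS = 5  # the threshold discussed in this thread


def retry_delay(body: Mapping, headers: Mapping[str, str],
                default: float = 5.0) -> Optional[float]:
    """Return the seconds to wait if the API reported a maxlag error, else None."""
    error = body.get("error")
    if not isinstance(error, Mapping) or error.get("code") != "maxlag":
        return None
    try:
        # Prefer the server's hint; fall back to a fixed pause.
        return max(float(headers.get("Retry-After", default)), 0.0)
    except ValueError:
        return default


def edit_with_maxlag(send: Callable[[dict], Tuple[Mapping, Mapping[str, str]]],
                     params: Mapping, max_attempts: int = 5,
                     sleep: Callable[[float], None] = time.sleep) -> Mapping:
    """Send one request, pausing and retrying while the server reports lag.

    `send` is a hypothetical callable returning (parsed JSON body, headers).
    """
    request = dict(params, maxlag=MAXLAG_SECONDS)
    for _ in range(max_attempts):
        body, headers = send(request)
        delay = retry_delay(body, headers)
        if delay is None:
            return body  # not a maxlag error: hand the response back
        sleep(delay)     # back off instead of hammering the servers
    raise RuntimeError(f"server still lagged after {max_attempts} attempts")
```

The point of the design is the one made in the thread: a compliant bot stops editing while lag is high, which is what makes the setting self-stabilizing.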

Has Part and Reference Tools?

1. Is there a tool/gadget/userscript that with one click would add the inverse has part statement to a parent item when part of is added to another item, but only if the statement does not already exist in the parent item?

The MoveClaim and MoveClaim2 user scripts only provide the option to directly duplicate the statement, or change the statement within the same item. I do not believe there is an option for inverse statements, unless I have missed something?


2. Is there a tool/gadget/userscript similar to MoveClaim2 that allows only the reference to be moved/copied to another item, rather than just to another statement in the same item?

@Melderick: & @Matěj_Suchánek:, Thank you for the userscripts! Wallacegromit1 (talk) 03:27, 17 November 2020 (UTC)

You are welcome. User:Frettie/consistency check add.js does sort of (1). --Matěj Suchánek (talk) 09:13, 17 November 2020 (UTC)
@Matěj_Suchánek: Consistency check provides more of a suggestion. Would you be interested in creating MoveClaim3 with the above options included?  – The preceding unsigned comment was added by Wallacegromit1 (talk • contribs) at 17. 11. 2020, 16:32‎ (UTC).
No, at least not for now. --Matěj Suchánek (talk) 09:57, 20 November 2020 (UTC)
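
For what it's worth, the check asked for in (1) is small. The sketch below is not an existing gadget; it only illustrates the "add the inverse only if it is missing" logic on a simplified in-memory structure. The Claims layout and the function name are hypothetical; P361 (part of) and P527 (has part) are the real property IDs.

```python
from typing import Dict, List, Optional, Tuple

PART_OF = "P361"
HAS_PART = "P527"

# Hypothetical simplified shape: item id -> property id -> list of target item ids.
Claims = Dict[str, Dict[str, List[str]]]


def missing_inverse(claims: Claims, child: str, parent: str) -> Optional[Tuple[str, str, str]]:
    """Return the (subject, property, object) statement to add, or None.

    None means either the child has no "part of" statement pointing at the
    parent, or the parent already lists the child under "has part".
    """
    if parent not in claims.get(child, {}).get(PART_OF, []):
        return None  # nothing to mirror
    if child in claims.get(parent, {}).get(HAS_PART, []):
        return None  # inverse already present, do not duplicate
    return (parent, HAS_PART, child)
```

A real tool would still have to fetch the claims and write the new statement through the API; only the decision step is shown here.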

Community Wishlist Survey 2021

 

The 2021 Community Wishlist Survey is now open! This survey is the process where communities decide what the Community Tech team should work on over the next year. We encourage everyone to submit proposals until the deadline on 30 November, or comment on other proposals to help make them better. The communities will vote on the proposals between 8 December and 21 December.

The Community Tech team is focused on tools for experienced Wikimedia editors. You can write proposals in any language, and we will translate them for you. Thank you, and we look forward to seeing your proposals!

SGrabarczuk (WMF) 05:52, 20 November 2020 (UTC)

2021 Community Wishlist Survey

 

The 2021 Community Wishlist Survey is now open! The survey is the way communities determine what the Community Tech team should work on next year. We encourage everyone to submit proposals before the deadline on 30 November, or to comment on other proposals to help improve them. The communities will vote on the proposals between 8 December and 21 December.

The Community Tech team is focused on tools for experienced Wikimedia editors. You can write proposals in any language and we will translate them. Thank you, and we look forward to your proposals!

SGrabarczuk (WMF) 05:11, 20 November 2020 (UTC)

This item seems to conflate two people: George Harrison, Lord Provost of Edinburgh, and another George Leib Harrison whose life was spent entirely in the US. Can someone clean it up? --GZWDer (talk) 16:01, 20 November 2020 (UTC)

@GZWDer: Thank you, it should be done! --Emu (talk) 11:46, 21 November 2020 (UTC)

Recency constraint

Please see Help:Property constraints portal/Recency for a draft description. --- Jura 09:13, 21 November 2020 (UTC)

Representing United States Senate classes

I'd like to have statements such as Bernie Sanders (Q359442) position held (P39) United States senator (Q4416090) indicate which United States Senate class the given senator belongs to. This could be done by making a qualifier for this statement indicating which class, but there is no obvious qualifier that applies. Further, a qualifier on the statement puts it on a par with elements describing the statement itself such as start time (P580) or end time (P582), which describe details of the actual holding of the seat, whereas the Senate class seems to describe the seat itself, which is only occupied by one senator at a time. Thus, while the electoral district (P768) for both senators in a given state covers the same geography, it feels more natural to split each electoral district (P768) in two, one for each class, since it's not possible to have qualifiers of qualifiers.

Thus while representing senate classes as qualifiers is undeniably simpler to manage it seems less true to what we are modeling.

Thus my proposal is to re-label entries like Vermont (Q58425109) as the "Vermont delegation" which will have two subclasses: "Vermont Senate class 1" and "Vermont Senate class 2", both of which will subclass "Vermont delegation". Relabeling will make it easier to disambiguate the United States senate constituency (Q58425000) from administrative territory. Having each class subclass from the delegation will allow us to do queries with electoral district (P768)/subclass of (P279) when we don't care about the particular class.

This should make it easier to generate tables like https://en.wikipedia.org/wiki/List_of_United_States_senators_from_Vermont from Wikidata information. @Andrew Gray: @Oravrattas: @Tagishsimon:

  • I think there is a misunderstanding of basic concepts about the way Wikidata works. "electoral districts" are unrelated to the seniority of a senator. To determine a senator's seniority, one should generally be able to compare start dates. Please don't create any fictional electoral districts or other fictional data. --- Jura 11:47, 6 November 2020 (UTC)
I think you misunderstand what United States senate classes are. They have nothing to do with seniority. I won't be entertaining more comments from you in this project. Gettinwikiwidit (talk) 12:55, 6 November 2020 (UTC)
@Jura1: - "seniority" in terms of time served isn't anything to do with it. As you say, that can easily be determined by looking at start dates. Please try reading a bit more about these issues before accusing people of "misunderstanding basic concepts".
The issue is that there are three classes of senate seats; the class dictates the year they are elected. "Class 1" senators are equal in all other respects to "Class 2" or "Class 3", and it doesn't affect seniority; at the moment, the most senior senator holds a Class 3 seat. The two seats are defined as having different terms (see en:Classes of United States senators) which is why we end up with strange things like this week having two different elections for senators in Georgia, one for Class 2 and one for Class 3. This is quite an unusual approach for a multi-member system - most countries with systems like this elect everyone at once, and don't distinguish between the seats - but it's not "fictional data".
@Gettinwikiwidit: I think this proposal sounds sensible and I support it. Breaking down the "constituency" into two overlapping seats seems to make a lot more sense, since we can then do queries like "find the split in party affiliation among Class 2 senators"; the alternative is to hack something together based on dates of first election but that would break down for anyone who started in a special election. I would quibble with "Vermont delegation" - maybe "Vermont delegation to the US Senate"? - but otherwise it sounds fine.
(PS - the pings won't trigger if you don't sign! So re-pinging @Oravrattas, Tagishsimon: in case they have comments.) Andrew Gray (talk) 13:06, 6 November 2020 (UTC)
  Oppose fictional districts. This can easily be handled by one of the available qualifiers. --- Jura 13:08, 6 November 2020 (UTC)
Before adding further data, please create a full sample with a sandbox item. These highly visible items aren't suitable for editing experiments and cleaning them up takes me and others too much time. --- Jura 13:13, 6 November 2020 (UTC)
Saying something is easy without making a suggestion isn't at all useful. Gettinwikiwidit (talk) 13:14, 6 November 2020 (UTC)
I think you finally implemented most of my suggestions. Congrats to that. We just need to clean up the fictional dates. --- Jura 13:18, 6 November 2020 (UTC)
Please try to read more carefully. You have just suggested that this is more easily done with qualifiers and that was the reason for your opposing. Please clarify what qualifiers you mean. These conversations don't have to be so painful. Gettinwikiwidit (talk) 13:36, 6 November 2020 (UTC)

State + class is a seat and is analogous to a district, but it isn't a district as such. The state is the district. We probably need a distinct way to model seats that are not districts. Same problem arises (if we care) about modeling the present structure of the Seattle City Council, which has seven district seats plus two distinct at-large seats. - Jmabel (talk) 16:58, 6 November 2020 (UTC)

I agree that, as there are two distinct seats per state, we should have a distinct item for each. I'm less convinced by the subclassing suggested here. I think we only need three new items, named something like, "US Senate class 1 seat", "US Senate class 2 seat", and "US Senate class 3 seat", and then each of the seats can subclass the relevant one, and be connected to the relevant geographic item for the state via coextensive with (P3403). Other than the Cook PVI column, this would then allow us to generate an equivalent table to that at https://en.wikipedia.org/wiki/Classes_of_United_States_senators#List_of_current_senators_by_class --Oravrattas (talk) 18:49, 6 November 2020 (UTC)

Seems safest to keep P131 as well, since both are true and a lot of queries will expect it. But I think this model makes sense and gives us a good framework to work with. Andrew Gray (talk) 20:49, 6 November 2020 (UTC)
So we completely abandon entries like Vermont (Q58425109)? Gettinwikiwidit (talk) 02:26, 7 November 2020 (UTC)
@JMabel: @Oravrattas: @Andrew Gray: Also it's not clear how to connect a senator with one of these entries if we don't use electoral district (P768). What exactly is being suggested? To make a new "seat" property? I'm not sure what value we get in the model by making this distinction. I don't think there's much risk of confusion and I'm not sure who would practically want to use the distinction in a query. Gettinwikiwidit (talk) 02:31, 7 November 2020 (UTC)
In any event I don't think it's worth holding up making this change on getting this property added. Someone can revisit later if they feel strongly about it. Gettinwikiwidit (talk) 02:33, 7 November 2020 (UTC)
How does this look: United States Senate state classes? It's easy enough to take this table and generate QuickStatements or load it into OpenRefine. If we have a consensus, I'll create the new entries. (FWIW, I have no problem with other people using the info in this repository to upload to Wikidata. It's all ultimately taken from either Government sources or Wikipedia.) Gettinwikiwidit (talk) 03:07, 7 November 2020 (UTC)
I think we'd want to keep electoral district (P768) for linking all of these claims; the fine nuances of differences between districts and seats, geographic and at-large areas, etc, are best modelled on the item rather than by using a different linking property. The existing property already has seat, etc as aliases, and I think it's always been used to cover the broad span of such things.
In terms of what to do with the old items, I guess if we're not creating an intermediate "Senate representation in Virginia" item then there's no real point in keeping them - I guess we can switch to the new items and then list for deletion, or blank and redirect/merge them into the state. Andrew Gray (talk) 10:48, 7 November 2020 (UTC)
I agree with Andrew Gray on electoral district (P768). Most standard queries for getting members of a legislature will expect that, and I see no need to force a different approach here. --Oravrattas (talk) 12:44, 7 November 2020 (UTC)
@Jura1: Seeking community input is precisely what this thread is all about. I think you misunderstand how a community works. There is a proposal currently being discussed. If there is something you don't understand or have questions about, feel free to ask. You made a proposal which lacked specificity and ignored my comments directed precisely to such a proposal. When asked to be specific you dropped the topic altogether, at which point I wrote it off as being unserious. More directly, a community is about a discussion back and forth. I'm more than happy to entertain questions which are focused on the utility to the community at large. Further, we all understand that this is volunteer work. None of us are in a position to give assignments to anyone else here. Regards, Gettinwikiwidit (talk) 06:53, 8 November 2020 (UTC)
It's probably also worth mentioning that before I started this work the U.S. Senate info was spotty and inconsistent. It had been left that way for two years despite your involvement. I'll be so bold as to suggest that my supplying of the entire history of the U.S. Senate and continuing efforts to make it consistent do in fact provide a service to the community. You have focused on the fact that some entries have dates in the future (of which there are many examples in Wikidata. They are not "fictitious".) Refusing to acknowledge accomplishments does a disservice to the community, potentially discouraging future work which would also benefit the community. Gettinwikiwidit (talk) 07:00, 8 November 2020 (UTC)
Per Jura, please publish a coherent proposal before launching into precipitate action. I have nodding familiarity with political office P106 but I cannot make head nor tail of the proposal here. Bernie Sanders currently has a set of United States senator (Q4416090) statements, qualified by electoral district (P768). Is the suggestion now that he will have instead P106=Q4416090 PQ768="Vermont Class 1 senate seat"? In other news, if the issue is a need to qualify a P106 Senate statement to represent seat class, would there be anything objectionable about using, instead, object has role (P3831) as the qualifier - object has role: class 1 seat? --Tagishsimon (talk) 12:25, 8 November 2020 (UTC)
@Tagishsimon: For what it's worth, this seemed to be a coherent proposal to me. It pretty much works as you described, except with P39 claims, not P106. The change is for those statements to be qualified electoral district (P768):"Vermont Class 1 senate seat" rather than electoral district (P768):"Vermont senate seat".
In terms of using object has role (P3831) on the statement, this would work, but I think it is conceptually a bit shaky. It suggests that his membership of the Senate (the object) is "Class 1", "Class 2", etc; but that qualifier properly attaches to the seat, not to his incumbency of it, so it makes more sense to describe it on the item for the seat. This is consistent with how we handle modelling the situation where there are different "kinds" of constituency existing alongside one another - eg with the NZ Māori seats, we classify Te Tai Tonga (Q7690987) as being a Māori seat, rather than qualifying Rino Tirikatene (Q3527980) as object has role (P3831):holder of a Māori seat. Andrew Gray (talk) 14:48, 8 November 2020 (UTC)
Colour me unconvinced. Whilst I accept that the current proposal is workable &c, and that the different approaches discussed here might be seen as - reporting aside - a 2*3s & a 6 affair ... Te Tai Tonga seems to be the south island Maori seat, and its item describes it as such. We have not had to go out and invent a new item coextensive with Te Tai Tonga as we're purposing to do here. Your dismissal of object has role is valid, but only with the frame you selected. I could as easily frame it as 'A senatorial position for the Vermont electoral district which takes the role of furnishing a class II senator' or somesuch. The proposal we see here is to invent Class I, II & III districts which, strictly speaking, do not exist. That seems a bigger crime than merely taking a slight tyre-wrench to the meaning of object has role. To the extent it matters, coining new districts hitherto unknown in the history of US Senatorial history becomes problematic for the naive report writer who considers, not unreasonably, that the electoral district for a senator for Vermont is Vermont, rather than our confection. Find me the senators who have a seat that is coextensive with Vermont will just have to be one of those very unfortunate and IMO unnecessary ontological complications that those that come after us have to deal with ... when there were other, how to say, non-breaking, means of conveying class. I don't think you're going to get unanimity on this, and I think I've now said all I have to say; Gettinwikiwidit - although I'm not a fan of the proposed approach, nevertheless I'm happy that you're giving this corner of wikidata the attention it needs. --Tagishsimon (talk) 23:45, 9 November 2020 (UTC)
@Tagishsimon: Not a new invention at all. It's straight out of the U.S. Constitution. See en:Classes of United States senators. - Jmabel (talk) 06:39, 10 November 2020 (UTC)
Well, no. The district is Vermont. The class is I, II or III. We are choosing to conflate class into district ... they are distinct things. The solution prevents us from a simple query for Senators from Vermont. Instead the user must know to search for senators from a district that is coterminous with Vermont. That's a very poor and unnecessary solution. --Tagishsimon (talk) 08:17, 10 November 2020 (UTC)
@Tagishsimon: Are you saying that Vermont (as constituency) should have a single item as a multi-member constituency, with number of representatives in the Senate equal to 2? I think it's slightly more accurate to say that Vermont has two single-member seats than one multi-member seat, and whilst there are definitely pros-and-cons of each approach, including wrt querying, I do think it's already generally accepted practice to have to take a separate step to get from constituency to geography: consider producing a list of all the United States House of Representatives (Q11701) members for Vermont constituencies. --Oravrattas (talk) 08:57, 10 November 2020 (UTC)
I'm saying that there is a single electoral district; and that it is more important to be able to support plain queries such as "find me the senators from Vermont" than it is to conflate class - the specific nature of the seat in the district - into 'electoral district'. --Tagishsimon (talk) 09:32, 10 November 2020 (UTC)
@Tagishsimon: This sounds a lot like the discussion of the difference between a "seat" and an "electoral district" mentioned above. I don't disagree that the two separate concepts are being conflated here, but the decision was to punt for now because of all the places the two concepts are being conflated. Being a model, it seems to me to be less a question of whether it's real or not than of whether modeling the distinction is useful or not. Do you have any examples of useful queries which would be made more difficult by this conflation? Gettinwikiwidit (talk) 07:31, 13 November 2020 (UTC)
@Tagishsimon: We currently differentiate Vermont (Q58425109) from Vermont (Q16551) so even saying "find me the senators from Vermont" paints over the need to work with the model. With the proposed model, it's still possible to fetch the senators from Vermont using wdt:P39/wdt:P131 wd:Q16551 Gettinwikiwidit (talk) 07:43, 13 November 2020 (UTC)
It's probably worth pointing out that this query works with both models. Gettinwikiwidit (talk) 07:46, 13 November 2020 (UTC)
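
A note on that path for anyone trying it: because electoral district (P768) is recorded as a qualifier on the position held (P39) statement, the query probably has to go through the statement node rather than a plain wdt: path. The sketch below is one way to spell that out and is not the query the poster ran. It assumes the district or seat item carries P131 pointing at the state, as discussed above; the function name is hypothetical.

```python
def senators_for_state_query(state_qid: str, position_qid: str = "Q4416090") -> str:
    """Build a SPARQL query for holders of `position_qid` whose P768
    qualifier points at an item located in (P131) `state_qid`."""
    for qid in (state_qid, position_qid):
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not an item id: {qid!r}")
    return f"""SELECT DISTINCT ?senator ?seat WHERE {{
  ?senator p:P39 ?statement .
  ?statement ps:P39 wd:{position_qid} ;
             pq:P768 ?seat .
  ?seat wdt:P131 wd:{state_qid} .
}}"""


# Vermont (Q16551), as in the comment above.
print(senators_for_state_query("Q16551"))
```

Since the query only asks that the qualifier value be located in the state, it should not matter whether ?seat is the existing per-state constituency item or one of the proposed per-class seat items.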
@Tagishsimon: Thanks for the more specific suggestions about how this could be done with qualifiers but as mentioned in my initial proposal at the start of this thread and as Andrew says here this is more about a property of the "seat" itself and not about the incumbency of the seat. Also discussed above is the distinction between "seat" and electoral district (P768). If you have anything to add on that topic, feel free. The proposal is to change all the electoral district (P768) qualifiers in all US senator entries to reflect the class of the seat being held. Rather than have it be a United States senate constituency (Q58425000) representing only the state in question such as Vermont (Q58425109) it would be one representing the state and the class of the seat. All seats of the same class across all states would subclass a single entry representing the class itself. (I'm personally less convinced of the usefulness of this, but I'm happy to play along.) The new seat entries thus would look exactly like the Vermont (Q58425109) entry but also have a coextensive with (P3403) property with the same value as the located in the administrative territorial entity (P131) property. Gettinwikiwidit (talk) 21:25, 8 November 2020 (UTC)
More generally though, I think the priority should be getting the info into the Wikidata store over arguing over properties which can easily be modified later on. I understand the drive to get it right the first time to minimize work later on, but I would argue that it's pretty hard to know what works well until you've seen it in action for a while. As @Andrew Gray: alludes to above this proposal hews closely to other models which seem to be working well. Gettinwikiwidit (talk) 21:25, 8 November 2020 (UTC)
I'm happy to mock this up for one senator. Is there a preferred place to create scratch items? Searching for "scratch" in Help and Wikidata didn't help much. Gettinwikiwidit (talk) 21:59, 8 November 2020 (UTC)
@Gettinwikiwidit: P3403 should be to the "real state", I think - Vermont (Q16551) is the "master thing" that determines what the borders are, and all the things that are defined as being coterminous with Vermont get defined by pointing to it.
On your other question, we don't really have a scratch items setup (there's the sandbox, but that's mostly for testing individual edits). I'd normally make some sample items to demonstrate with, but for now I've made a stab at sketching it out below based on this discussion:
  • [new] Vermont Class I senate seat
  • [new] US class I senate seat
This is how the NZ setup handles it, though I can see an argument for dropping the instance of (P31):United States senate constituency (Q58425000) on the individual seats and just having them be instances of Class I seats (it's a bit cleaner and simpler). But as you say, we can easily fiddle with the upper structure once the new items are in place in the P39 claims. Andrew Gray (talk) 22:24, 9 November 2020 (UTC)
@Oravrattas: @Andrew Gray: I made one example: Vermont Class 1 senate seat (Q101435082). I left off coextensive with because it has a symmetric constraint that I don't want to deal with at the moment. I'm also not clear on its purpose. I've also not linked it to the current US senator, but this should give the idea. Gettinwikiwidit (talk) 09:19, 10 November 2020 (UTC)
@Gettinwikiwidit: I think this seems good. One tweak I'd suggest is that United States senate seat (Class 1) (Q101434824) should be subclass of (P279) United States senate constituency (Q58425000) rather than instance of (P31) - it's a class of Senate seats, rather than a single instance of a Senate seat - but otherwise seems good to me. Andrew Gray (talk) 13:38, 10 November 2020 (UTC)
@Andrew Gray: Done. Also given the symmetry constraint, I wonder whether coextensive with (P3403) shouldn't exist between the senate seats entries for a given state. I don't know that I like having the state entry pointing to the seat. Gettinwikiwidit (talk) 01:44, 11 November 2020 (UTC)
I have QuickStatements ready to go to create items for all the other seats. Changing all the position held (P39) statements is more work I think because I believe I'll need to generate all new statements and then delete the old ones. I'll have to pull all the current data to copy the position held (P39) statements, edit the electoral district (P768) qualifier. I can do this by generating an appropriate SPARQL query to pull all the relevant fields from these statements. Is there a better way? @MisterSynergy: @Andrew Gray: @Oravrattas: Gettinwikiwidit (talk) 06:54, 11 November 2020 (UTC)
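
As an illustration of the item-creation step, here is a minimal sketch that emits QuickStatements V1 commands (tab-separated lines) for one seat item. It is not the poster's actual script. Which statements each seat should carry is still under discussion in this thread, so the P31 and P131 lines are an assumption, and the function name is hypothetical.

```python
from typing import List


def seat_commands(label: str, class_item: str, state_item: str) -> List[str]:
    """QuickStatements V1 lines that create one seat item.

    Assumed model: the seat is an instance of (P31) the per-class item and
    is located in (P131) the state.
    """
    return [
        "CREATE",
        f'LAST\tLen\t"{label}"',        # English label
        f"LAST\tP31\t{class_item}",     # instance of: class-specific seat item
        f"LAST\tP131\t{state_item}",    # located in: the state
    ]


# Illustration only: this particular seat already exists as Q101435082.
for line in seat_commands("Vermont Class 1 senate seat", "Q101434824", "Q16551"):
    print(line)
```

Looping this over a state/class table would give one CREATE block per seat.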
@Gettinwikiwidit: In general, I find wikibase-cli to be by far the best tool for working with P39 statements. I'm happy to help out with doing the import/migration of any of these if that's useful. --Oravrattas (talk) 07:42, 11 November 2020 (UTC)
@Oravrattas: Thanks for the offer. I may take you up on it, but I'm also trying to learn here. I'll have a look at wikibase-cli. FWIW, this (somewhat messy) SPARQL query says that in this case it shouldn't be terribly difficult.
SELECT DISTINCT ?qualLabel WHERE {
  ?sen wdt:P39 wd:Q4416090;
    p:P39 ?statement.
  ?statement ps:P39 wd:Q4416090;
    pq:P2937 ?term;
    ?pred ?val.
  ?qual wikibase:qualifier ?pred.
  ?statement prov:wasDerivedFrom ?ref.
  ?ref ?p1 ?v1;
       pr:P248 ?v2.
  ?prov wikibase:reference ?p1.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
There aren't that many reference types either. I also learned that I can reference the Biographical Directory of the United States Congress (Q1150348) directly as well as supply the US Congress Bio ID (P1157). Gettinwikiwidit (talk) 07:56, 11 November 2020 (UTC)
I'm suggesting wikibase-cli because it makes it easy to edit existing statements. That's generally a much better idea than the copy/delete approach. See, for example, update-qualifier. --Oravrattas (talk) 08:47, 11 November 2020 (UTC)
I just looked it up right after you first posted. I'll probably use that. I had suggested similar functionality to the QuickStatements folks. Thanks again. Gettinwikiwidit (talk) 09:18, 11 November 2020 (UTC)
This all looks great. Agree with the suggestion that wd-cli is probably the way to go for handling multiple statements - it's incredibly good at this sort of thing. Andrew Gray (talk) 17:21, 11 November 2020 (UTC)
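
To make the edit-in-place approach concrete, here is a rough sketch of planning the qualifier changes before handing them to a tool. Everything in it is illustrative: the Statement tuple, the seat lookup and the placeholder GUID are hypothetical. The command line it prints follows the wikibase-cli argument order (claim GUID, property, old value, new value) only as far as I understand it, so check wd update-qualifier --help before relying on it.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

ELECTORAL_DISTRICT = "P768"


class Statement(NamedTuple):
    guid: str          # GUID of the existing P39 statement
    district: str      # current P768 qualifier value
    senate_class: int  # 1, 2 or 3


def plan_updates(statements: Iterable[Statement],
                 seat_for: Dict[Tuple[str, int], str]) -> List[Tuple[str, str, str, str]]:
    """Return (guid, property, old value, new value) rows for statements
    whose district has a known per-class seat item."""
    plan = []
    for st in statements:
        new_seat = seat_for.get((st.district, st.senate_class))
        if new_seat is None or new_seat == st.district:
            continue  # unknown seat or already migrated: leave untouched
        plan.append((st.guid, ELECTORAL_DISTRICT, st.district, new_seat))
    return plan


def as_wd_commands(plan: Iterable[Tuple[str, str, str, str]]) -> List[str]:
    # Assumed argument order; GUIDs contain "$", hence the single quotes.
    return ["wd update-qualifier '%s' %s %s %s" % row for row in plan]


# Vermont (Q58425109) -> Vermont Class 1 senate seat (Q101435082); placeholder GUID.
seats = {("Q58425109", 1): "Q101435082"}
stmts = [Statement("Q359442$00000000-0000-0000-0000-000000000000", "Q58425109", 1)]
print("\n".join(as_wd_commands(plan_updates(stmts, seats))))
```

Keeping the plan as plain rows makes it easy to review before any edit is made, and existing references and other qualifiers on the statement stay in place.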
  • Okay, I think I have this all lined up to make the change. I've created all the new senate seat objects.
    SELECT ?seat ?seatLabel WHERE {
      ?seat (wdt:P31/(wdt:P279+)) wd:Q101500234.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }

I've also lined up the data to make the changes to all the located in the administrative territorial entity (P131) qualifiers. Last chance to comment before I make this change. Regards, @Tagishsimon: @Andrew Gray: @Oravrattas: @Jmabel: Gettinwikiwidit (talk) 07:23, 13 November 2020 (UTC)

  • Please see the note on your talk page. I asked an admin to stop this given that you have been requested to seek input of the wider community before making such changes to our data. --- Jura 09:05, 14 November 2020 (UTC)
@Jura1:, this is literally a discussion on Project Chat - it is what constitutes "input of the wider community". It is the third such thread since @MisterSynergy: asked for wider discussion (see October 2-14, November 3-6). The model has developed over the course of these discussions - for example, the 'XXth Congress member' items have been removed. As you yourself said above, "I think you finally implemented most of my suggestions".
In this thread, we have discussed the proposed model for another week and provided examples. You have not posted anything since the example was provided, and now you turn up demanding an admin intervene. This is simply disruptive.
I would also remind you that you are under an editing restriction, agreed by admins in 2019, which says you are "forbidden from undoing another autoconfirmed editor's edits to items about politicians, politics, and government without prior discussion" This revert clearly breaks those restrictions. Andrew Gray (talk) 11:37, 14 November 2020 (UTC)
I don't think this thread satisfies what was asked of Gettinwikiwidit, but MisterSynergy is best placed to state that. I suggest the restriction for Gettinwikiwidit and Oravrattas is extended to Andrew Gray as well. --- Jura 11:55, 14 November 2020 (UTC)
@Jura1: I don't believe I'm under any restriction. Please point me at the relevant discussion. Regards, Gettinwikiwidit (talk) 13:20, 14 November 2020 (UTC)


The change that has been made to the electoral district (P768) of US Senate P39s continues to be really really stupid, predicated on nothing more than one person's desire to encode the class of the seat, and a feeble grasp of semantics.

An electoral district is a geographic area. electoral district (P768), to remind, is in EN labelled "electoral district this person is representing, or of the office that is being contested". The electoral district of the Oklahoma class 2 seat is, unambiguously, Oklahoma. Electoral district != seat. Was never designed to = seat. Is documented as not being seat. Yet here we are with the electoral district (P768) now taking values such as Oklahoma Class 2 senate seat (Q101498943), which is not a district, but a seat.

If you just want to make things up as you go along, and decide that you'll stuff inappropriate data into property statements that were never designed for that data, we might as well all pack up and go home.

This is simply the worst large-scale change I've seen in wikidata. It really is appalling. --Tagishsimon (talk) 21:35, 14 November 2020 (UTC)

More. The class applies to the seat, not the district. Just as the district for the AG of Texas is Texas, so the district for the class 1 senatorial seat in Texas is Texas. The solution to the class problem is to be found in a seat - P39s main property statement - rather than in its district. The solution would be three items to be used as the main values in P39s - "US senator for a class 1 seat", "US senator for a class 2 seat", "US senator for a class 3 seat". --Tagishsimon (talk) 21:55, 14 November 2020 (UTC)
Let's see if I can make this debacle any clearer. All of these different seats have the same electoral district.
Positions versus Districts in Texas - IRL
Position | District
Governor | Texas (or Texas electoral district)
Lieutenant Governor | Texas (or Texas electoral district)
Attorney General | Texas (or Texas electoral district)
Comptroller | Texas (or Texas electoral district)
Land Commissioner | Texas (or Texas electoral district)
Agriculture Commissioner | Texas (or Texas electoral district)
Railroad Commissioner(s) | Texas (or Texas electoral district)
Senator for a Class 1 seat | Texas (or Texas electoral district)
Senator for a Class 2 seat | Texas (or Texas electoral district)
And now look at the mess we've made:
Positions versus Districts in Texas - current debacle
Position | District
Governor | Texas (or Texas electoral district)
Lieutenant Governor | Texas (or Texas electoral district)
Attorney General | Texas (or Texas electoral district)
Comptroller | Texas (or Texas electoral district)
Land Commissioner | Texas (or Texas electoral district)
Agriculture Commissioner | Texas (or Texas electoral district)
Railroad Commissioner(s) | Texas (or Texas electoral district)
Senator | Texas class 1 senator seat
Senator | Texas class 2 senator seat
It's very plainly nonsense on stilts. To perform some reductio ad absurdum, we might as well go the whole hog:
Positions versus Districts in Texas - future debacle
Position | District
Governor | Texas Governor seat
Lieutenant Governor | Texas Lieutenant Governor seat
Attorney General | Texas Attorney General
Comptroller | Texas Comptroller seat
Land Commissioner | Texas Land Commissioner seat
Agriculture Commissioner | Texas Agriculture Commissioner seat
Railroad Commissioner(s) | Texas Railroad Commissioner seat
Senator | Texas class 1 senator seat
Senator | Texas class 2 senator seat
I just do not know how else to say, senatorial class is an attribute of the position, not of the district which elects a person to that position. --Tagishsimon (talk) 23:11, 14 November 2020 (UTC)
@Tagishsimon: You seem to be saying different things and conflating them in odd ways. If we could take the histrionics down a notch or two maybe we can have a rational discussion here. First and foremost I think discussions on the data should be based on comprehensibility and usefulness to the community at large and intend to center my arguments on these precepts. Second, the only reason this change is so large is because I supplied the vast majority of the US Senate entries just a short while ago. There were inconsistencies before I arrived but they didn't seem to raise anyone's ire. Completing this data was a boon for the community. Hopefully we can all agree on that.

Thirdly (I'll stop numbering points from here on in), I think we can all agree that a US Senate seat is not a fictional thing, no one is inventing anything here. The only question is how to model it. If anyone needs convincing, just look at the Georgia special elections this year. Both seats are up and candidates run for *one seat or another*. The class III seat has two more years left on it and the class II seat begins a fresh six years. They are distinct.

Is there a difference between a district and a seat? Yes. I believe this was stipulated by everyone in the thread above. Is the district identical to the geographic area? I would argue no. An electoral district has a property of the area it represents but is not equal to it. Entities in Wikidata which predate my contributions seem to conform to that notion as seen by the difference between Vermont (Q58425109) and Vermont (Q16551). (Incidentally, some of the inconsistencies which predate me were that the two were used interchangeably.) Even here in order to avoid philosophical rabbit holes (stating what an abstract concept *is* gets into murky territory) it's probably best to say this was *modeled* by an electoral unit (Q192611) entity which had electoral district (P768) property pointing at Vermont (Q16551).

In many cases there is a single representative for the entire district. In others there are more than one but they are elected at the same time. The US Senate is relatively rare in that two people holding the same "office" covering the same "district" occupy distinct seats. We're left asking how to model this. Your last suggestion is that the "office" should incorporate the "seat". The rest of the discussion in this thread has thought that the "district" can model the seat. I think both of these are compromises. I don't believe anyone thinks of the office as being "Senator from Vermont Class III" and the difference between a "district" and a "seat" has already been stipulated.

Again, these are two different approaches to modeling the reality. As for the relative merits of either, I think there's a healthy degree of subjectivity which goes into that. As people talk about the "office" more often than they do the "district", I'm inclined to prefer the model which leaves the position as United States senator (Q4416090). In any event I haven't heard any strong arguments about how one model is more useful than another.

Also, I think a more precise way to critique the chosen method is that the property electoral district (P768) doesn't apply, not that the seat is non-existent. One solution might be to make a new property "seat" which has an "electoral district" thereby distinguishing the "position", from the "seat" and the "electoral district". In the vast vast majority there will be a 1-1 correspondence between "seat" and "electoral district". That probably explains why it wasn't done before. I personally am for such clarification which might help to address Jmabel's concerns mentioned above.

To get back to the community, I think it's better to have this data in the data store rather than not. Having the data in the store means it's available for future modifications. I make no claims that this data should remain immutable for time immemorial. And while I agree it could be more precisely modeled, I'd argue that the form it's in now is largely comprehensible and useful. For instance, we can now sort each seat by start time to calculate which senator followed and was followed by which senator and augment the data with even more useful information. You mentioned one query above, which I mentioned was no more difficult in this model than it was in the preexisting model. I'm happy to further discuss the comparative merits of the models based on how difficult it is to extract info out of them. Regards, Gettinwikiwidit (talk) 03:10, 15 November 2020 (UTC)

So your reductio ad absurdum version probably is the most consistent, though at that point I would call it a "seat" and not a "district" and the "seat" would have a property of "district". But again, all models contain compromises though I appreciate that not all compromises are created equal. Gettinwikiwidit (talk) 03:23, 15 November 2020 (UTC)


@Tagishsimon: My first thought is that focusing on electoral district (P768) as a "district" not a "seat", and emphasising the geographic aspect of it, is I think a bit of a red herring. We already use it for non-geographic electorates - eg Irish Seanad panel seats like Administrative Panel (Q4683477), UK university seats like Combined Scottish Universities (Q5150880), Hong Kong functional constituencies like Medical (Q3248957). It's better to think of it, IMO, as an item for all forms of "electoral division", however that division is handled locally; it happens to use "district", but "seat", "electorate", and "constituency" are also aliases.
I can see the arguments for using the administrative geography (the state) as the value for electoral district (P768) for senators, but I don't think that splitting out a distinct item for the electoral entity necessarily leads to the silly effect of us having to create a new district for every statewide elected post. In the examples you give, we would normally say that someone is eg Texas Attorney General (Q7707525), and then set it up so that that item is connected to Texas, rather than saying they were elected to the post of Attorney General for the electoral district of Texas. (If we don't have an item, the usual approach is to say they hold generic post:of (P642):place, and use the item for the place; note P642 not P768). So I don't think the Texas example here does suggest this approach is absurd.
To my mind, the thing that makes sense for Wikidata is to use distinct electoral entities, even if it does mean adding a bit of abstraction. It means we have a consistent model ("a person holds a post with P768 electoral district, and the electoral district is P131 of an admin geography") across countries, and within parliaments (eg we don't have to have a different approach for House seats in Virginia and Vermont). We already did this abstraction by having items set up so that Texas senators were previously using Texas (Q58425106), not Texas (Q1439). (I appreciate you're contending we shouldn't have been doing that in the first place, but just to emphasise it's not a new feature.) And if we're doing that abstraction, splitting the Senate seat into two overlapping class seats isn't any more complicated than having one. So I think this is all reasonable and consistent with existing modelling approaches.
@Gettinwikiwidit:, I would be cautious about creating a new "seat" property to try and resolve this ambiguity. It would just cause endless confusion further down the line as we have to redefine all existing uses of "district" against a new property, and figure out which one to use in the majority of cases where the two are treated as being the same, or how to handle cases where some people would use "seat" and others "district" within the same context (eg should Vermont's at-large congressional district (Q4311018) use "seat" but New York's 1st congressional district (Q12357474) use "district"?). Far easier IME to say "yes, we accept there's a tiny bit of ambiguity between 'seat' and 'district' in some cases, but it makes sense from a modelling perspective to use a single property"; this is similar to the way that we collapse all existing subtleties of party affiliation into the parliamentary group (P4100) qualifier. Andrew Gray (talk) 10:01, 15 November 2020 (UTC)
In the short term, I'm going to reinstate the vandalized entry because it's more useful to have a consistent set of data while we continue to discuss. Gettinwikiwidit (talk) 21:22, 16 November 2020 (UTC)

Options for modelling

I've taken the liberty of dropping in a section heading to break this up a bit, and tried to set out where we are and the options for modelling this. I hope everyone is OK with this.

At the start of the year, we had (incomplete?) Senate data which did not reflect class, and had a single "electoral unit" per state in electoral district (P768) qualifiers. As of this weekend, with the edits discussed above, we have (complete) Senate data which has two different "electoral units" per state, each embedding a particular Senate class, in electoral district (P768) qualifiers.

There are two overlapping issues here, and I think it helps to consider them separately. We've discussed all of them at some point and they all have pros and cons.

Should we model senate class, and if so how?

Option 1 would mean we couldn't reliably work out class in a query (we could infer it sometimes, but not always); options 2-4 would mean we could easily query for it. Option 4 makes most Senate queries a little more complex but not massively so.

What type of items should we use in electoral district (P768) for senators?

Option A is simplest, options B/C make some Senate queries a little more complex. There are examples of both options A and B in other parliaments, though B is generally more common. Option D has implications for all political data, not just Senators.


Personally, I have a mild preference for some approaches over others, but to be honest I think I would be happy with almost any of them assuming we were consistent. Most combinations of approaches make sense and can be justified from a modelling perspective.

The only one I have major reservations about is option D, implementing a new property model to distinguish seats and (geographic) districts - I think the current broad use of P768 is very well established, and trying to substantially change this up to accommodate an unusual edge case could be a lot more trouble than it is worth. Andrew Gray (talk) 12:56, 15 November 2020 (UTC)

Thanks, @Andrew Gray:. I would like to suggest that when advocating one approach or another we supply sample queries to demonstrate the benefits/demerits of the model. I believe the query to list senators sorted by class and start time (in order to fill in follows/followed by qualifiers) is unwieldy with option 2, since you need to fetch another qualifier out of the statement. As mentioned above, anything you could do with the previous model you can do with the latest model using p:P768/wdt:P131. (or changing p:P768 to p:P768/wdt:P131 if using geography over state)
Moreover, I think there is value in separating the district from the geography mostly because it allows us to distinguish the political entity from the geographic one so that over time as these objects get more and more descriptive the whole thing doesn't become a garbled mess.
Lastly, I'll reiterate that I believe the primary concern should be the data's comprehensibility and usefulness, and I thus clearly lean towards having more information modeled rather than less, as well as making the most common queries the easiest rather than the less common ones. Comparing people's passion for one model over another isn't likely to get us anywhere. Regards, Gettinwikiwidit (talk) 21:01, 15 November 2020 (UTC)
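As a rough sketch of the kind of sample query suggested here, the following lists senators with their seat, the state the seat belongs to, and the start time. It rests on assumptions not confirmed in this thread: that the seat is stored as the electoral district (P768) qualifier on the position held (P39) statement (hence pq:P768 rather than p:P768), that each seat item has a located in the administrative territorial entity (P131) statement pointing at the state, and that the start is recorded with the start time (P580) qualifier. The variable names are illustrative only.

```sparql
SELECT ?senator ?senatorLabel ?seat ?seatLabel ?state ?stateLabel ?start WHERE {
  # P39 statements whose value is United States senator (Q4416090)
  ?senator p:P39 ?statement.
  ?statement ps:P39 wd:Q4416090;
             pq:P768 ?seat.        # assumed: seat item stored as the P768 qualifier
  ?seat wdt:P131 ?state.           # assumed: seat item points at its state via P131
  OPTIONAL { ?statement pq:P580 ?start. }   # start time qualifier, if present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?seat ?start
```

Sorting by seat and then start time puts the holders of each seat in succession order, which is the ordering needed to work out who followed whom.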
  • I think it would be good to see references for each option especially for electoral districts.
Also, if information is repeated several times (as options 2, 3, 4 do), it would be good to see a use case for that. --- Jura 12:04, 22 November 2020 (UTC)

IM channel datatype

A few months ago, IM channel URL (P8009) was created with string datatype instead of external id datatype, because the latter datatype didn't support many URI schemes. In this discussion Lea Lacroix said that it would be possible to fix that problem. So is it OK to change the P8009 datatype to external-id?  – The preceding unsigned comment was added by Tinker Bell (talk • contribs) at 03:54, 14 November 2020‎ (UTC).

Did you mean "URL"-datatype? external-id has no limitation. Please mention it on the property's talk page as well. --- Jura 11:59, 22 November 2020 (UTC)

Denoting the domain of awards and prizes

I am wondering which property should be used to denote the domain of awards and prizes in Wikidata. In Governor General's Performing Arts Award (Q3405815), the performing arts domain for the award is denoted with field of work (P101). However, the description for field of work (P101) is: "specialization of a person or organization". This description does therefore not seem to apply to awards. At the same time, the subject type constraint (Q21503250) does include award (Q618779) (as well as museum (Q33506)) as accepted values. Is field of work (P101), then, the right property to denote the domain of an award? --Fjjulien (talk) 14:51, 16 November 2020 (UTC)

  Notified participants of WikiProject Award --- Jura 11:49, 22 November 2020 (UTC)

Making US general elections consistent

I propose to model all the United States general election (Q26252880) instances on the 1790 United States elections (Q18356754) example so that the labeling is consistent. In addition I propose to connect them with follows (P155) and followed by (P156) qualifiers. I would also like to collect elections by which Senate class is up for election as collected here. I propose connecting them to the respective United States senate seat (Class 1) (Q101434824) using significant event (P793) but am open to suggestions.

SELECT ?election ?electionLabel WHERE {
  ?election wdt:P31 wd:Q26252880.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?electionLabel

Gettinwikiwidit (talk) 07:33, 17 November 2020 (UTC)
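To see how much linking work the proposal involves, a query along these lines could list the general election items that do not yet point at a predecessor. This is only a sketch under assumptions: it treats follows (P155) as a main statement on the election item (if it is stored as a qualifier instead, the pattern would need adjusting) and uses point in time (P585) purely for ordering.

```sparql
SELECT ?election ?electionLabel ?date WHERE {
  ?election wdt:P31 wd:Q26252880.              # instance of United States general election
  OPTIONAL { ?election wdt:P585 ?date. }       # point in time, if recorded
  # assumed: "follows" is stored as a main statement, not a qualifier
  FILTER NOT EXISTS { ?election wdt:P155 ?previous. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?date
```

Swapping wdt:P155 for wdt:P156 gives the corresponding list for followed by.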

  • Please make a sample item and link to the revision you consider a sample.
https://www.wikidata.org/w/index.php?title=Q18356754&oldid=884540192 has currently just three samples. Description of that version isn't useful. See Help:Description for that. --- Jura 11:46, 22 November 2020 (UTC)

Merge problems

Note: originally posted at WD:AN. Nomen ad hoc (talk) 11:45, 21 November 2020 (UTC).

Hello, could you undo the merge of these pages and redirect them to the older ones ([9] to [10], [11] to [12])? There is also this page whose content is wrong: the French link is not the right one ([13] to [14]). I would have done it myself, but since I installed the merge.js function I can no longer do it manually. Thanks in advance, Méphisto38 (talk) 5 October 2020, 16:00 (UTC)

Modelling of holocaust victim

I was looking around how to model a Holocaust victim (Q5883980) and I noticed that Anne Frank (Q4583) has instance of (P31) set to Holocaust victim (Q5883980). That seems to be completely out of line with our normal modelling of humans. I propose we change this to significant event (P793) set to The Holocaust (Q2763) with qualifier subject has role (P2868) set to Holocaust victim (Q5883980) (example edit). This affects about 5200 items, see https://w.wiki/nXB . Please comment at Talk:Q2763#Modelling of holocaust victim. Multichill (talk) 19:38, 21 November 2020 (UTC)
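For tracking such a migration, a query like the following could list items that still use the current modelling but do not yet have the proposed one. It is a sketch only: it assumes the proposed statement is exactly significant event (P793) = The Holocaust (Q2763) with a subject has role (P2868) qualifier of Holocaust victim (Q5883980), as described above, and the LIMIT is arbitrary.

```sparql
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q5883980.            # current modelling: instance of Holocaust victim
  FILTER NOT EXISTS {                   # proposed modelling not present yet
    ?item p:P793 ?event.
    ?event ps:P793 wd:Q2763;
           pq:P2868 wd:Q5883980.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```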

Question

Does WD:QS no longer support adding sources? I got stuck at the step of adding the source during execution. (`・ω・´) (talk) 04:04, 22 November 2020 (UTC)

you mean references? it should BrokenSegue (talk) 04:05, 22 November 2020 (UTC)
@BrokenSegue: Yep I mean reference, Is it no longer supported? (`・ω・´) (talk) 04:10, 22 November 2020 (UTC)
worked the last time I used it. what issue are you having? BrokenSegue (talk) 04:48, 22 November 2020 (UTC)
@BrokenSegue: It stays on a page like this one, and then there is no progress. (`・ω・´) (talk) 05:51, 22 November 2020 (UTC)
Well, I found the problem, I entered an extra "?" and it didn't work properly. (`・ω・´) (talk) 05:55, 22 November 2020 (UTC)

P5058 needs reboot

Due to recent changes in the way Instytut Teatralny im. Zbigniewa Raszewskiego (Q11713227) make their data available on-line to the public, our property Encyklopedia Teatru Polskiego person ID (P5058) requires some technical work and I would like to ask for help (maybe some bot edits would be possible)? In fact all the P5058 values are now redirects to a different website run by the same institution, namely encyklopediateatru.pl. So the first task would be to create a new format as a regular expression (P1793) for P5058. Secondly we need to check (hopefully using a bot) if the numbers given as P5058 values are still valid. From my random tests I can tell some are valid, but some are not - the bot should take new values from the new website. Finally, it may be appropriate to change the labels and descriptions for P5058, to reflect the new website the property is now linking to. I would be more than happy to assist as a native Polish speaker, but I definitely lack the technical skills. Powerek38 (talk) 19:58, 19 November 2020 (UTC)

P276 for online events

To state the obvious, many events have been moved to online mode due to COVID-19. I tried to add location (P276) to one such event (namely Jednodniówki edukacyjne (Q102227964)), but values such as Internet (Q75) or online (Q73368) activated constraints warnings. So how do we say in Wikidata that something happened online? Powerek38 (talk) 21:09, 22 November 2020 (UTC)

an online event has no location. maybe indicate the location property has no value. then maybe use broadcast by (P3301)? BrokenSegue (talk) 00:58, 23 November 2020 (UTC)
The location of an online event is online, as Powerek38 rightly suggests. We are here to try to model the real world. In the real world, the location of an online event is online. If you check out Property talk:P276, it gives the location of 'Passover' as being 'worldwide', and of 'bronchitis' as the 'bronchus'. P276's description points to P131 for administrative locations and P706 for geographical locations. Taking all of these together, I cannot see any good reason why P276 should not support the plainly accepted truth that the location of an online event is online. The cure is to remove the erroneous constraint. --Tagishsimon (talk) 01:07, 23 November 2020 (UTC)

Identifiers

Did Identifiers on Wikidata get removed? I see that new creations don't have a section for them. Thanks--PremierePrush (talk) 22:15, 22 November 2020 (UTC)PremierePrush

The section will appear after an identifier is added. Ghouston (talk) 23:05, 22 November 2020 (UTC)

Modelling unseated politicians

Should we be using position held (P39) for politicians who were never seated? If so, is there an appropriate qualifier to indicate this? A few of the people on this list have position held (P39) United States representative (Q13218630): John Willis Menard (Q6264430) and Matthew Vincent O'Malley (Q6791362). FWIW, they came to my attention because there is no corresponding US Congress Bio ID (P1157) for them.

significant event (P793) is being used as a qualifier for P39 to indicate UK MPs who were elected but did not take their seats. https://w.wiki/nSZ --Tagishsimon (talk) 03:30, 22 November 2020 (UTC)
@Tagishsimon: Awesome. Thanks very much. Gettinwikiwidit (talk) 04:57, 22 November 2020 (UTC)
I would add that there's a difference here between "never turned up" and "unseated". In the latter case, we'd model them as being an MP until the day their election was overturned and use a suitable end cause (P1534), with the new one then taking effect from the date of the ruling. We also have the phenomenon of contemporary MPs who intentionally never take their seats for political reasons, but are still considered MPs by everyone involved; the plan is to represent those with electoral district (P768) as well (edit: now added post-1950). I don't think the US has anything similar. Andrew Gray (talk) 14:17, 22 November 2020 (UTC)
If they never held the position, the rank should be deprecated. --- Jura 11:29, 22 November 2020 (UTC)
@Jura1: Also a good idea. Thanks. Gettinwikiwidit (talk) 12:26, 22 November 2020 (UTC)
It's not really an idea, it's just what comes from Help:Ranking.
BTW, are you asking about election winners who never took their seats ("who_never_took_their_seats" in the text) or who were "unseated" (heading)? The latter could be that somehow their mandate ended prematurely after they took office.
For the latter, the end date should explain it. --- Jura 12:34, 22 November 2020 (UTC)
@Jura1: By "unseated", I meant "not seated" not "removed from seat". Again, I ran across these entries. I did not create them. Gettinwikiwidit (talk) 05:23, 23 November 2020 (UTC)
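Both suggestions in this thread can be checked with a query. The sketch below lists United States representative (Q13218630) statements that are either deprecated or carry a significant event (P793) qualifier. Which qualifier values would actually mark "elected but never seated" for US data is not settled here, so the query simply returns whatever values are present.

```sparql
SELECT ?person ?personLabel ?rank ?event ?eventLabel WHERE {
  ?person p:P39 ?statement.
  ?statement ps:P39 wd:Q13218630;        # position held: United States representative
             wikibase:rank ?rank.
  OPTIONAL { ?statement pq:P793 ?event. }   # significant event qualifier, if any
  # keep statements that are deprecated or that carry a P793 qualifier
  FILTER(?rank = wikibase:DeprecatedRank || BOUND(?event))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Note that the p:/ps: path is needed here because the truthy wdt: predicates do not return deprecated statements.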

How do I open all Wikidata entries from a Wikipedia category?

How do I open all Wikidata entries from all articles in a Wikipedia category? How would I add the same description to all of them quickly? Thanks, --Tastenlöwe (talk) 20:23, 22 November 2020 (UTC)

Try PetScan (you will need to change "Pages with items" and/or "Use wiki: Wikidata"). --Matěj Suchánek (talk) 09:18, 23 November 2020 (UTC)

Is Playmates.com the only database of its kind out there?

As there does not seem to be a WikiProject Pornography (Q14942913) then I leave the following message here. According to this query:

SELECT distinct ?human ?humanLabel ?playmate_id
WHERE
{ 
  {
    # Human
    ?human wdt:P31 wd:Q5.
    # Playmate
    ?human wdt:P106 wd:Q728711.
    # with optional id in playmates.com
    OPTIONAL { ?human wdt:P5346 ?playmate_id }
  }
  UNION
  { 
    # Additional humans with id in playmates.com
    ?human wdt:P5346 ?playmate_id 
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Of the 567 "playmates"¹ on Wikidata, only 230 have a Playboy Plus ID (P5346); moreover, according to its web map, the site has only 301 registered playmates.

Do you know of a more complete playmates database that can be added to Wikidata as a property?--190.31.204.3 03:01, 18 November 2020 (UTC)

  1. It is worth clarifying that the query also contains 81 people who have a playmates.com ID but whose occupation (P106) is not Playboy Playmate (Q728711), so you might want to check that.
And then you may have a look here for more related properties. Thierry Caro (talk) 12:31, 19 November 2020 (UTC)
Looks like playmates.com is regional-based. For example, Cathy Lugner (Q20798793) was published in Playboy Slovakia, and there is no official website for Playboy Slovakia. However Cathy Lugner (Q20798793) is described at playboy.de for whatever reason. There are other regional websites for Playboy, like playboyrussia.com. Due to unclear url stability and seemingly low amount of identifiers per each website (<500), I suggest to use described at URL (P973) instead of creating a separate external identifier for each regional variant. --Lockal (talk) 12:54, 23 November 2020 (UTC)

About people with two or more eye colors

Through the following query:

SELECT ?human ?eyeColorLabel
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P1340 ?eyeColor.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

And filtering the results externally to obtain people with two or more eye colors I got the following:

I do not know how to make the query return only items that have a property with multiple values, but if someone knows how to do it, it would be good to explain it to me for future reference.

I don't think that all the people in the list have heterochromia iridum (Q461486) so it may be necessary to check in other sources to see if this is due to a bad import.--190.231.242.204 14:41, 19 November 2020 (UTC)

@190.231.242.204: Here's a query to get people with multiple eye colours:
SELECT ?human ?colours WHERE {
  {
    # Count the eye color (P1340) values recorded for each human
    SELECT ?human (COUNT(?eyeColor) AS ?colours)
    WHERE
    {
      ?human wdt:P31 wd:Q5.
      ?human wdt:P1340 ?eyeColor.
    }
    GROUP BY ?human
  }
  # can further filter ?human here if needed
  FILTER(?colours > 1).
}
--SixTwoEight (talk) 14:52, 19 November 2020 (UTC)
can also be done without a subquery, using HAVING:
SELECT ?human (COUNT(?eyeColor) AS ?eyeColorCount) WHERE {
  ?human wdt:P31 wd:Q5;
         wdt:P1340 ?eyeColor.
}
GROUP BY ?human
HAVING(?eyeColorCount > 1)
--Lucas Werkmeister (WMDE) (talk) 10:49, 23 November 2020 (UTC)

Item with wrong value

.NET Framework (Q5289) has a wrong value: version 5.0 is for ".NET" and not ".NET Framework". It should be removed. --2001:B07:6442:8903:C186:597:CF6A:C596 10:30, 23 November 2020 (UTC)

Some language names are not translated to other languages

Hello, I see that some language names, such as:

  • Central Bikol
  • Korean (North Korea)
  • Belarusian (Taraškievica orthography)
  • Chinese (Min Nan)
  • Egyptian Arabic
  • South Azerbaijani

among others, are not translated into other languages. How can I translate them? Thanks. --Rodney Araujo (talk) 20:59, 14 November 2020 (UTC)

Language names are translated via CLDR, https://st.unicode.org/cldr-apps/v#/es// is the edit interface (for Spanish), request an account at http://cldr.unicode.org/index/survey-tool/accounts. Editing there needs to wait as there is an edit freeze due to CLDR rolling out a new release.--Snaevar (talk) 22:32, 15 November 2020 (UTC)
Hey, thanks for pinging me. @Rodney Araujo: can you confirm that the examples you mentioned are taken from the termbox (the list at the top of an Item page, including labels, descriptions and aliases)? In which language was your interface when you noticed it?
If it's languages in the termbox: indeed, we're taking the content from CLDR, except for the languages that are not listed in CLDR, in that case we add them directly in Wikidata. Lea Lacroix (WMDE) (talk) 16:50, 23 November 2020 (UTC)

Which one of these external identifiers can be considered reliable reference?

Hey, I'm adding references that I got from the Wikidata:Automated finding references input dump. So far 40K have been added (and I will add around 30K more). Here's an example. I have been adding references from external identifiers that are obviously reliable (e.g. National Library of Spain), but I'm not 100% sure about these ones. If you know them, can you tell me if they are reliable enough to have their refs added to unreferenced statements?

I feel Internet Archive ID (P724) and MusicBrainz label ID (P966) are clearly unreliable but I'm not sure about the rest. Amir (talk) 15:32, 21 November 2020 (UTC)

  • The question is what statement (property) you want to reference with it.
Unfortunately, I don't recall the Automated finding references input-team providing us with the list to review. Maybe this is available in the meantime.
You can obviously add all of the above identifiers to items. --- Jura 15:40, 21 November 2020 (UTC)
Sources are not simply reliable or unreliable. Some sources are self-published and can be used about information on themselves (such as ORCID). Note VIAF and sources based on them are not always reliable as they sometimes pull data from Wikidata.--GZWDer (talk) 16:49, 21 November 2020 (UTC)
Eh, we have minds, we should use them. Be not a robot. -Animalparty (talk) 02:28, 22 November 2020 (UTC)

Thanks User:Kirilloparma and User:Powerek38. I started the bot with Igromania ID (P6827) and Catholic Hierarchy person ID (P1047) which adds 10K more refs. Amir (talk) 03:28, 22 November 2020 (UTC)

How to make "connection" between them? Eurohunter (talk) 11:44, 23 November 2020 (UTC)

practiced by (P3095), field of this occupation (P425) Jheald (talk) 18:09, 23 November 2020 (UTC)
(strictly speaking, those are maybe the relations between playtesting and playtester). Jheald (talk) 18:11, 23 November 2020 (UTC)
@Jheald: Thanks. Eurohunter (talk) 20:45, 23 November 2020 (UTC)

Wikidata weekly summary #443

Q94521341

At Azariah Dunham (Q94521341) the conflict is giving a warning, can someone fiddle with the subclasses for American Revolutionary War (Q40949) so that the warning disappears? --RAN (talk) 20:20, 23 November 2020 (UTC)

It looks like it should work? American Revolutionary War (Q40949) is an instance of war of national liberation (Q1006311), which is a subclass of war of independence (Q21994376) > war (Q198) > armed conflict (Q350604) > conflict (Q180684) - which is a permitted class. Michael Collins (Q173196) has conflict:Irish War of Independence (Q208297), which is also instance of (P31):war of national liberation (Q1006311), and that's not showing any errors. Wonder if it's a caching problem somewhere? Andrew Gray (talk) 21:32, 23 November 2020 (UTC)
It looks like the fiddling fixed it, and the cache kept it un-fixed longer than expected. --RAN (talk) 23:40, 23 November 2020 (UTC)

How long does it take a change to a formatter URL qualifier to take place?

The formatter URL (P1630) for US Congress Bio ID (P1157) was no longer working so I supplied a new one with preferred rank per the documentation. If I reload an entity using this property (Frank Leslie Smith (Q3068384) for instance), I don't yet see the updated URL. I don't see anything in the documentation about how this is technically implemented. Do I have to wait for some cache to be flushed? Are the pages generated on some sort of schedule? Thanks, Gettinwikiwidit (talk) 08:47, 21 November 2020 (UTC)

Some caches are really to blame. The developers should probably be made aware that neither purging an item nor even making an edit will force the new formatter URL to be used (so it is possible there is a cache entry for this which was not invalidated correctly). --Matěj Suchánek (talk) 13:27, 21 November 2020 (UTC)
@Matěj Suchánek: Thanks very much. It seems to be working now. I still don't understand the mechanism. It would be good if the documentation had a few words to set expectations on when changes can be expected to show up. If someone who knows would do that, I'd be greatly appreciative. Gettinwikiwidit (talk) 21:32, 21 November 2020 (UTC)
Confirmed, now I can see the URLs correctly formatted. So judging from timestamps the delay could be somewhere between 5 and 13 hours but I really don't know. --Matěj Suchánek (talk) 11:48, 22 November 2020 (UTC)
Longstanding problem: see phab:T112081. As I understand it, the URL will never be reformatted, until the page is next edited. @Andrew Gray: Do you know if that is still the case? Jheald (talk) 18:20, 23 November 2020 (UTC)
I think that's still the case. Although it's possible this is kind of a blessing in disguise - if someone changes eg the VIAF formatter url to go to some spam site, it's probably good that doesn't instantaneously trigger changes to a million linked items! I'll run the purging script for P1157 tonight to clean up this one, though. Andrew Gray (talk) 19:09, 23 November 2020 (UTC)
@Jheald, Gettinwikiwidit: Purging script is now churning away, and should take about 24 hours to touch every item and refresh it. I've had a look at a couple of items and they seem to be correctly displaying the new URL. Andrew Gray (talk) 21:21, 23 November 2020 (UTC)
As I understand it, the URL will never be reformatted, until the page is next edited. As I commented above, I did an edit and even that didn't help. --Matěj Suchánek (talk) 09:58, 24 November 2020 (UTC)
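For readers unfamiliar with the mechanism being discussed: a formatter URL (P1630) value contains the placeholder $1, which is replaced by the identifier value when the link is built, and a statement with preferred rank takes precedence over normal-rank ones. The following is only a minimal Python sketch of that substitution. The function name, the example URLs and the identifier are made up for illustration, and it says nothing about how or when Wikibase caches the result.

```python
def build_link(formatters, identifier):
    """Build an external link from formatter URL statements.

    formatters: list of (url_pattern, rank) tuples, where rank is
    'preferred', 'normal' or 'deprecated' and url_pattern contains '$1'.
    """
    # Deprecated formatter URLs are never used.
    usable = [f for f in formatters if f[1] != "deprecated"]
    # Preferred-rank statements win over normal-rank ones.
    preferred = [f for f in usable if f[1] == "preferred"]
    candidates = preferred or usable
    if not candidates:
        return None
    pattern = candidates[0][0]
    return pattern.replace("$1", identifier)


# Hypothetical patterns and identifier, for illustration only.
example = [
    ("https://old.example.org/bio?id=$1", "normal"),
    ("https://new.example.org/member/$1", "preferred"),
]
print(build_link(example, "X000001"))
```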

Vaudeville performer

Is there any way I can automate adding occupation=vaudeville_performer where description contains "vaudeville performer"? I was doing it by hand, but there are too many. --RAN (talk) 19:06, 23 November 2020 (UTC)

@Richard Arthur Norton (1958- ): This report will list them. I presume you can quickstatements? 2 minute job. --Tagishsimon (talk) 10:03, 24 November 2020 (UTC)
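As a rough illustration of the QuickStatements step, here is a Python sketch that turns a list of item IDs (for example, copied from such a report) into tab-separated QuickStatements commands adding occupation (P106). The function name is hypothetical, the two item IDs in the usage example are sandbox items used purely as stand-ins, and Q999999999 is a placeholder that must be replaced with the real item ID for vaudeville performer.

```python
import re

OCCUPATION = "P106"  # occupation
QID = re.compile(r"Q[1-9]\d*")


def occupation_commands(item_ids, occupation_id):
    """Return one QuickStatements line (item, property, value) per item."""
    item_ids = list(item_ids)
    for qid in [*item_ids, occupation_id]:
        if not QID.fullmatch(qid):
            raise ValueError(f"not an item ID: {qid!r}")
    return "\n".join(f"{qid}\t{OCCUPATION}\t{occupation_id}" for qid in item_ids)


# Placeholder IDs only; paste the output into QuickStatements after review.
print(occupation_commands(["Q4115189", "Q13406268"], "Q999999999"))
```

It is worth checking the report by eye before running the batch, since a description containing "vaudeville performer" does not guarantee that the occupation applies.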

Template subpage to be used as language link?

Maybe someone could answer to user:Wolverène's question: Talk:Q15692365?--Estopedist1 (talk) 16:39, 24 November 2020 (UTC)

It's Special:AbuseFilter/36; it should block addition of non-notable subpages, which according to Wikidata:Notability are "/doc, /XML, /meta, /sandbox, /testcases or /TemplateData" for templates, but it is also blocking links to others such as /end and /header for which the requirement in the policy is that the item has more than one sitelink. Peter James (talk) 20:27, 24 November 2020 (UTC)

I'm posting it here as this page hasn't been edited for over two (2) months, but is there any interest in proposing it? Should all the individual changes be proposed individually or could these be proposed as a package? Perhaps there are some people here that want to propose their own changes here, as I think that the Notability policy needs updating.

Wikimedia Commons uses a Wikidata-based system for organising media files now and having a much broader Notability policy could benefit this project. Furthermore, Wikidata has items for individual celestial bodies that may have only been observed by a handful of astronomers, while there are major local buildings and monuments with entries on Wikimedia Commons but not Wikidata. I'm not proposing the change here in the Project Chat, I'd rather see more eyes on this page so more improvements could be made and the discussion could continue. Especially since discussions about the Notability policy have been held for years that seem to end abruptly with no changes happening often due to disinterest and all parties moving on. -- Donald Trung/徵國單  (討論 🀄) (方孔錢 💴) 22:09, 24 November 2020 (UTC)

What problem, exactly, are you trying to cure? --Tagishsimon (talk) 22:59, 24 November 2020 (UTC)
That’s the big question. I might be wrong, but this might be a solution looking for a problem. @Donald Trung: Could you summarize which concrete discussions of the last few months could have been resolved by applying your proposal? --Emu (talk) 14:39, 25 November 2020 (UTC)
I think the only difference between the cases is that somebody has bothered to create items for the astronomical objects (probably by importing from another database), but nobody has bothered to create items for the "major local buildings and monuments". I don't think it's a notability issue. Ghouston (talk) 00:52, 25 November 2020 (UTC)

Both pages contain the Wikipedia disambiguation pages of the cell.

And936 (talk) 06:39, 25 November 2020 (UTC)

Probably not, according to Help:Modelling/Wikipedia_and_Wikimedia_concepts#Disambiguation_pages: "Disambiguation pages are grouped by form, not by meaning. All the Wikipedia and Wikivoyage pages with sitelinks to a Wikidata page should have exact same spelling apart from the exceptions listed in the guidelines of the respective projects." Ghouston (talk) 06:46, 25 November 2020 (UTC)
No, disambiguation pages are for exact string matches. It might be possible to link those together by way of redirect pages, but that requires a change in redirect policies on some wikis. ChristianKl13:53, 25 November 2020 (UTC)
@ChristianKl: So you say I should merge these 2 pages. Am I right? And936 (talk) 16:31, 25 November 2020 (UTC)
No, you are exactly wrong. You should not merge them. As ChristianKl and Ghouston have explained. --Tagishsimon (talk) 16:36, 25 November 2020 (UTC)

Tourism

Hello. I have just initiated Wikidata:WikiProject Tourism/Participants. There is no actual project yet but at least we can now ping people interested by this topic. If you are one of these people, don't hesitate to put your name in the list. Thierry Caro (talk) 18:41, 25 November 2020 (UTC)

SciTLDR: one line summaries of research papers

Has anyone played with this? Semantic Scholar is rolling it out now (Nature article). Would make a nice addition, and could help adding topics? --SCIdude (talk) 18:36, 23 November 2020 (UTC)

Are you suggesting adding the one-sentence summaries to Wikidata items on papers? Assuming the text is public domain (AI and other non-human creations are generally non-copyrightable), how would this play out in practice? -Animalparty (talk) 20:14, 24 November 2020 (UTC)
So, there's an open question here about storing computed results in wikidata. A while ago I proposed computing pagerank and storing that as a new property. Now this SciTLDR has an advantage over that that for a given version of the summarization model the output is static (since papers are immutable). I'm generally in favor of doing this sort of stuff but I think we need to be careful to properly version and attribute it. BrokenSegue (talk) 19:24, 25 November 2020 (UTC)
The idea was that a summary statement would somehow make the process of finding "main subject" claims easier, or on the other hand, improve any search over all articles. Having abstracts might be similar---but abstracts have an introduction, these summaries apparently don't. Note also that you need to train the software yourself, data is not provided. --SCIdude (talk) 16:11, 26 November 2020 (UTC)

Bulk import on data from Sports-Reference.com College Basketball site - data issues

Someone appears to have done a bulk import of data from this site. Problem is they did not account for data already on the site, leading to significant duplication.

At a minimum, there is now significant duplication under P118 (league), and P54 (member of sports team). For example : https://www.wikidata.org/wiki/Q58008659

Other issues include duplication of players in cases where the existing record did not have P3696 populated, and the inclusion of 2020 as end time (P582) for all players, including those still active.

Can someone suggest the syntax of a SPARQL search to find all cases where Q94861615 is entered twice for a page under P118 to at least start to understand the scope of the issues/duplication? CanadianCodhead (talk) 17:48, 25 November 2020 (UTC)

--GZWDer (talk) 18:15, 25 November 2020 (UTC)

Thanks, 9000+ duplicated, ugh...CanadianCodhead (talk) 18:37, 25 November 2020 (UTC)
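For reference, a query of the kind requested above can be sketched as follows. This is not necessarily the query that was posted in reply; it is a minimal Python helper (the function name is hypothetical) that builds a SPARQL query counting statements per item through the p:/ps: prefixes, so that two statements with the same value are counted separately.

```python
def duplicate_statement_query(prop, value):
    """Return SPARQL listing items with more than one `prop` statement for `value`."""
    return f"""SELECT ?item (COUNT(?statement) AS ?count) WHERE {{
  ?item p:{prop} ?statement .
  ?statement ps:{prop} wd:{value} .
}}
GROUP BY ?item
HAVING (COUNT(?statement) > 1)"""


# League (P118) entered more than once with the value Q94861615.
print(duplicate_statement_query("P118", "Q94861615"))
```

The printed query can be pasted into the Wikidata Query Service. Because it goes through the statement nodes, it also counts deprecated statements, which may or may not be wanted.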

9000 duplicated for that league. There are other leagues. Fortunately, per league, it's a job of moments to get something like QuickStatements to delete the duplicate statement - Help:QuickStatements#Removing statements. --Tagishsimon (talk) 19:02, 25 November 2020 (UTC)

Technically duplication is possible for other leagues, you can play in the Bundesliga, leave to go play in Ligue 1 and then return to the Bundesliga. So long as the start and end times are populated it is perfectly valid. You can not however go play somewhere else and then return to college basketball. Your entire stay is contiguous. CanadianCodhead (talk) 19:33, 25 November 2020 (UTC)

I've spent a few hours in the last few days trying to clean up hundreds of duplicate newspapers the same user had imported. They've been appropriately contrite but have done nothing to help in the cleanup, as far as I can tell. --Matthias Winkelmann (talk) 20:43, 25 November 2020 (UTC)
@Matthias Winkelmann: when complaining about any user it usually makes sense to ping them to allow them to respond to the criticism. ChristianKl10:03, 26 November 2020 (UTC)

What's the proper syntax/format to use QuickStatements to remove just statement 1 from the items? So for example, to remove from Q16205693 the statement Q16205693-266E0290-2DCA-4AC9-B1E1-D404C7A4C3C1? I can't figure out how to do that. It seems straightforward in the help file, but it is generating an error when I try it. I use QuickStatements to load data frequently, so I am familiar with the tool, but I can't seem to get the remove syntax correct. CanadianCodhead (talk) 16:54, 26 November 2020 (UTC)

As below. Seemed to work. The first hyphen gets changed to a $. diff --Tagishsimon (talk) 17:12, 26 November 2020 (UTC)
-STATEMENT Q16205693$266E0290-2DCA-4AC9-B1E1-D404C7A4C3C1

Thanks - it was the $ I was missing. Should not try to read documentation on 75 minutes of sleep...CanadianCodhead (talk) 17:32, 26 November 2020 (UTC)
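The conversion described above (replace only the first hyphen of the statement ID with a $) is easy to get wrong by hand across thousands of rows, so here is a small Python sketch of it. The function name is hypothetical; the command format is the one shown in the reply above.

```python
def remove_statement_command(statement_id):
    """Turn 'Q123-ABCD-...' into the QuickStatements line '-STATEMENT Q123$ABCD-...'."""
    # Split on the first hyphen only; the rest of the ID keeps its hyphens.
    item, sep, rest = statement_id.partition("-")
    if not sep or not item or not rest:
        raise ValueError(f"unexpected statement ID: {statement_id!r}")
    return f"-STATEMENT {item}${rest}"


print(remove_statement_command("Q16205693-266E0290-2DCA-4AC9-B1E1-D404C7A4C3C1"))
```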

Merging duplicate items

Hi! I wanted to request the merge of items Q15684647 and Q5955430, which are duplicated but apparently reject an automatic merge. Thanks in advance! --NoonIcarus (talk) 09:05, 26 November 2020 (UTC)

They're not the same people. Different DoBs, amongst other things. That is why someone has added Q15684647#P1889 to prevent erroneous merges such as your suggestion. You *really* need to look harder at items before suggesting merges. --Tagishsimon (talk) 09:12, 26 November 2020 (UTC)

New external identifier

I would like to have a new external identifier added. How do I go about this (or preferably have someone who knows how to do this do it)? The identifier would be: ArtFacts ID. DoSazunielle (talk) 14:54, 26 November 2020 (UTC)

See WD:PP.--GZWDer (talk) 15:58, 26 November 2020 (UTC)

Using sex or gender (P21) in Lexemes, pronunciation audio: I get "Potential issues"(constraints?)

While editing the Lexeme for English: "frog" I added to a pronunciation audio file that the gender/sex is female. I get the impression that Property P21 isn't supposed to be used in Lexemes or it needs an update in constraints? LotsofTheories (talk) 08:03, 22 November 2020 (UTC)

Use voice type (P412).--GZWDer (talk) 08:50, 22 November 2020 (UTC)
There's a citation needed constraint here. I'm a bass (Q27911). Should I add to my user page I'm a bass? In Sweden I once visited a choir and they told me I'm "2nd bass"(guessing the translation) or "andra bas"(Swedish). Does this help in creating a reference for my voice type? LotsofTheories (talk) 11:23, 22 November 2020 (UTC)
Given that there seems to be no justification for the citation needed constraint, I removed it. ChristianKl17:37, 22 November 2020 (UTC)
@GZWDer the pronunciation audio (P443) doesn't have voice type (P412) in allowed qualifiers constraint (Q21510851). Should voice type be added as an allowed qualifier to pronunciation audio? LotsofTheories (talk) 06:03, 27 November 2020 (UTC)
I realized now that on the talk page for pronunciation audio (P443) there's an example about Lviv (Q36036) which happens to have voice type in it. I guess I might try to add it. If I can't edit it for some reason please add it if you can. LotsofTheories (talk) 06:40, 27 November 2020 (UTC)

US senator statements clean-up needed

There are dozens of subclasses listed with has part(s) (P527) at United States senator (Q4416090). Would someone kindly delete these malformed statements?

Also, the still existing malformed and unused items like Q98082299 should be listed for deletion. --- Jura 19:38, 27 November 2020 (UTC)

Not sure why you reverted my move to the relevant section, but you seem to be starting a topic about the same thing. You seem to be leaving out why you are not doing this yourself. Maybe because the deletion request got half done by user:Wiki13 and half not done by user:MisterSynergy? Just starting a new topic out of the blue sure looks fishy. Multichill (talk) 10:05, 28 November 2020 (UTC)
This is about the item United States senator (Q4416090).
The other section is about electoral districts called "class A Montana" and similar. (There is also one about general elections in the US).
I have been asked not to clean up such stuff myself, but post it to the forum. So here it is. --- Jura 10:16, 28 November 2020 (UTC)
I have deleted the remaining ones as another data model has meanwhile been chosen and implemented. —MisterSynergy (talk) 12:42, 28 November 2020 (UTC)
Thanks. As long as they are subclasses, the P527 statements are wrong in any datamodel.
Can you also delete the two remaining statements: e.g. United States senator (Q4416090) has part(s) (P527) Majority Leader of the United States Senate (Q28530268)? --- Jura 12:48, 28 November 2020 (UTC)

Consistency of dates between point in time and temporal processes

Hello, this might have already been discussed, but I have a question: is there a way to state that an instance of (P31) value (e.g. historical event (Q13418847)) is about a point in time, so that its instances should have a single point in time (P585)? Conversely, if an instance of (P31) value (e.g. temporary exhibition (Q29023906)) is about a temporal process (it has started and has ended or will end, i.e. two dates), its instances should get both start time (P580) and end time (P582). Would a property "timeframe of that element" = point in time (Q186408) OR time interval (Q186081) do the job? Bouzinac💬✒️💛 11:14, 28 November 2020 (UTC)

"merge" of two erroneous items to a new item?

The items Premil Petrović (Q53110639) and Premil Petrović (Q95770358) seem to refer to the same musical conductor, actually Premil Petrović according to his premilpetrovic.com website. (The c-cedilla usually takes the c-acute form when at the end of slavic language words).

As a neophyte, can I request someone else merge these/ sort this out please? Scarabocchio (talk) 12:54, 26 November 2020 (UTC)

done --Matthias Winkelmann (talk) 13:34, 26 November 2020 (UTC)

Thank you! That looks like a merge into one of the existing entries plus an edit on the label? (I should have tried that). Scarabocchio (talk) 14:16, 26 November 2020 (UTC)
Yes... The merge function should be in the top-right corner, possibly under "more". It will automatically merge into the item with the lower ID, and throw an error if there are specific indicators that the items are not, in fact, duplicates (such as "different from" statements). After that, I straightened out the labels per your explanation and also added the website you mentioned. I tend to also check for duplication of data and remove excessive "imported from / XY Wikipedia" references, but there's no specific process prescribed here.
As always: start with just a few (thousand). And, if nobody comes screaming, assume you're doing ok. Like childcare, basically. --Matthias Winkelmann (talk) 20:01, 26 November 2020 (UTC)
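The "merge into the item with the lower ID" rule mentioned above compares the numeric part of the IDs, not the strings. A tiny Python sketch of that comparison, assuming the rule as described in this thread (the function name is hypothetical):

```python
def merge_target(qid_a, qid_b):
    """Return the item that survives a merge: the one with the lower numeric ID."""
    # Compare the number after the leading 'Q'; comparing the raw strings
    # would wrongly rank 'Q100' before 'Q99'.
    return min(qid_a, qid_b, key=lambda qid: int(qid[1:]))


print(merge_target("Q53110639", "Q95770358"))
```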
We do have https://www.wikidata.org/wiki/Help:Merge that describes merging in more detail. One thing that's worth noting is that the merge function is only available if the gadget is activated. ChristianKl23:19, 26 November 2020 (UTC)
Thanks, both. I think that I will be spending more and more time here, so need to master the basic operations. Scarabocchio (talk) 20:31, 27 November 2020 (UTC)
  • Great. Merging items is indeed one of them. So, in general, I think it's preferable to help users find Help:Merge than merge them directly for them.
    Thanks to that, we recently revised it and collapsed some sections users sometimes got lost with .. if there are other parts that need clarification, don't hesitate to mention them. --- Jura 17:37, 28 November 2020 (UTC)
I enabled the gadget following the links above, and have just merged my first pair of items (to Sanna Gibbs (Q64682995)) .. very smooth, very clear, no problems! Scarabocchio (talk) 18:09, 28 November 2020 (UTC)

Two cats for one thing

Hi all. Item Q9032983 and item Q7696450 are about the same issue: Category:Universities in Turkey/Universities of Turkey/Turkish universities or whatever name you like. Look at the WP links and you will see that some language WPs use one of these items and the others the other one, although the contents in the respective languages are the same. I could not merge them. Probably some WP has links to both? I do not know. For some reason I feel myself not so good today; therefore help of colleagues would be most welcome. Thanks in advance. --E4024 (talk) 17:18, 28 November 2020 (UTC)

They're not the same things, at least according to their EN labels - universities, versus universities and colleges. Greek has sitelinks for both, calling one a category of universities, and another a category of higher educational institutes. Were someone to wish to improve the situation, they would have to work out sitelink by sitelink what was found on each language wiki, and variously redistribute sitelinks and possibly create new items. Merging is unlikely to figure in the solution. --Tagishsimon (talk) 17:28, 28 November 2020 (UTC)
EN labels? I just added EN label to one item that I saw there orphan as it was. Now am I the one who created a "differentiation" between the two? :) I had just opened a cat for Universities of Turkey in CA:WP and went to make the WD link and saw one item without any EN label and linked it. I also thought I should write a label there. Oh my! --E4024 (talk) 17:57, 28 November 2020 (UTC)

Need help removing aliases

Is there any automated tool that can remove aliases in bulk? I ran an OpenRefine job that accidentally added the item ID of several items as an alias (e.g. it added "Q151521" as an alias to Texas A&M Aggies (Q151521)). QuickStatements apparently doesn't support using "-Aen" as a header to remove an alias, so I'm at a loss as to how I might remove these bad aliases. Any suggestions would be much appreciated. Thanks, IagoQnsi (talk) 20:06, 28 November 2020 (UTC)

I have a pywikibot script which can do this. All I need is a list that maps the aliases to remove to QIDs and the language code of the alias to be removed (any txt based format and formatting possible). If you can provide such a list, I could feed my script with it and remove the aliases. —MisterSynergy (talk) 21:41, 28 November 2020 (UTC)
@MisterSynergy: Oh thank you! Here's a CSV of the IDs and aliases: https://pastebin.com/98WifNWW. Not all of those aliases actually exist -- my batch upload only affected a few hundred of the 922 instances of university and college sports club (Q2367225) that exist on Wikidata. Thanks, IagoQnsi (talk) 23:09, 28 November 2020 (UTC)
  DoneMisterSynergy (talk) 23:34, 28 November 2020 (UTC)

Q item academic degree held by (Q66499315) currently claims it is the inverse property label item (Q65932995) of academic degree (P512). If that was the case, it should be a P item, not a Q item. Am I correct? Should this be changed? Mateussf (talk) 20:44, 28 November 2020 (UTC)

You are not correct. Properties may get an inverse property label item associated with them, whether or not there is an inverse property. This is one of the 'not' cases. All is good; we do not really have a use for the inverse property, but a user might well wish to reach for the label. --Tagishsimon (talk) 22:33, 28 November 2020 (UTC)
I don't fully understand the concept but I appreciate the answer and the attention, and I'm glad I didn't do anything wrong before asking. Thanks! Mateussf (talk) 01:46, 29 November 2020 (UTC)
I made an item about the gadget. Maybe that helps: relateditems (Q102435390). --- Jura 05:44, 29 November 2020 (UTC)

New badges for templates and modules

Some templates and modules are developed in one wiki (Wikipedia) and re-used in other wikis. I think it could be interesting to note this on sitelinks of items with badges.

Sample: Module:Biblio is a copy from fr:Module:Biblio. Badges on sitelinks at Q102438737 could indicate it. --- Jura 08:51, 29 November 2020 (UTC)

Comment about mass import

Hi, last week B.Zsolt did a mass edit on some railway stations, importing some data. I saw that he edited the item about the railway station in my small city adding toilets. I think that he imported mass data from OSM without checking whether statements are valid (toilets are always closed). Moreover, I don't think he can import data from OSM as license are incompatible. As soon as I checked that edit, I wrote in the user page of the user but he ignored my request, so I write here. --★ → Airon 90 08:57, 27 November 2020 (UTC)

I do not think we can mass-import OSM data--Ymblanter (talk) 20:57, 27 November 2020 (UTC)
You can only import data from OSM if every single contributor of an object has stated that their contribution is released as CC0. ATM very few do this and there is no simple way of machine-checking if the user released as CC0 in the profile which means that it is practically nearly impossible to import anything you did not create yourself that has version 1 (no other contributors). Most objects have version >1 and multiple contributors. I tried as a test to import hospitals from Chad in WD from OSM and asked all contributors of a few items for permission. Unfortunately few answered, so I deem it impossible to use OSM data in WD in practice because of license incompatibility.--So9q (talk) 06:16, 30 November 2020 (UTC)

Q12142126 seems to cover two unrelated topics

Q12142126 covers both en:Hillpark, Auckland, a suburb in New Zealand, and the Politburo of the Central Committee of the Communist Party of the Ukraine. I've removed the latter link, but then realised there's lots of metadata in the item for both. Rather than do any further damage to the Wikidata item, I'm posting here to ask someone to sort it out. The article on the suburb does not appear damaged now that I've removed the link, but the article on the Politburo is pulling a location and a picture which are irrelevant to it. I have no understanding of the Ukrainian language. -Gadfium (talk) 02:55, 30 November 2020 (UTC)

Looking at the history of the qid, it seems that it is supposed to be for the politburo and for some reason was merged with the suburb qid a few days ago.-Gadfium (talk) 02:58, 30 November 2020 (UTC)
I reverted Politburo of the Central Committee of the Communist Party (Q12142126) and Hillpark (Q28180683), hopefully that's enough. Ghouston (talk) 05:14, 30 November 2020 (UTC)
Likewise for User:Wemyang's other merge, which isn't any better. Ghouston (talk) 05:19, 30 November 2020 (UTC)

Query always timing out

Hi folks - I've been using this query for a few years, and suddenly in the last few days it's timing out every time I try it:

Non-fiction writers (&subclasses) with labels in Spanish-but-not-English

Are other people having trouble getting queries to run? Or any suggestions for optimizing this one?

Thanks - Kenirwin (talk) 16:21, 21 November 2020 (UTC)

Try something like https://w.wiki/nWc --- Jura 17:03, 21 November 2020 (UTC)
Thanks @Jura1: -- can you tell me anything about what you did here or if there's any documentation on getting the queries to run better? It looks like you took out the part about being an instance of a person, but I'm not sure what else is happening here. Thanks! - Kenirwin (talk) 20:56, 23 November 2020 (UTC)
@Kenirwin: It looks like the human clause is still there at the end. It looks like the big practical difference is the ORDER BY clause is removed. The problem is that you have to pull all the entries in order to order them, so the LIMIT 10 isn't helping. Also, he used the wdt: prefixes rather than trying to pull out full statements as you did. On the query services page, you can find a link to "list of prefixes" to get to the documentation. Lastly he used the hint:Query which you can read about here. Gettinwikiwidit (talk) 11:14, 30 November 2020 (UTC)

Need Help With Authority Control

Hello out there in Wikidata land! I'm very new to WD and know almost nothing about Authority Control. While I've tried to learn as I go, I must admit to failure. I added the Authority Control template to the "Rocky Kramer" Wikipedia article and then did my best to update each of the data inputs on the corresponding WD page. That resulted in nothing propagating on the article, so I asked at the WP Help Desk, where I was told I hadn't inserted identifiers. I tried looking up identifiers and only became further confused. Someone at the WP Help Desk gave me some WD usernames to ping and one directed me to this chat. Therefore, I humbly ask if there's anyone who can help me get the Authority Control working for this article? Thanks much. --Warriorboy85 (talk) 09:44, 26 November 2020 (UTC)

Identifiers are the ID properties found, for example, at Q42#P214 and below. Rocky Kramer (Q100744874) lacks any of these. Find some values and appropriate properties for your musician and, if they are included in the IDs which the wikipedia template pulls through, then all will be happy. --Tagishsimon (talk) 10:08, 26 November 2020 (UTC)
That's where I get lost. First, where do I find the identifiers and once I have them, where do I insert them to make the Authority Control populate? I'm assuming I enter the Identifiers on the WD page and they will automatically propagate to the article? I apologize for being so ignorant when it comes to Authority Control, but it just seems complicated to me. Thank you for your help! --Warriorboy85 (talk) 10:51, 26 November 2020 (UTC)
P.S., I mean where do I find them for a musician? --Warriorboy85 (talk) 10:54, 26 November 2020 (UTC)
@Warriorboy85: For a musician, probably places like discogs https://www.discogs.com/artist/7596392-Rocky-Kramer or musicbrainz https://musicbrainz.org/artist/f55827bb-4224-4db8-bf72-0a19a3f106ef/works ... for these two the value 7596392 would be added with Discogs artist ID (P1953) and f55827bb-4224-4db8-bf72-0a19a3f106ef would be the value for MusicBrainz artist ID (P434). If you look at the property items, you'll see use examples which hint at which bit of the URL you need. To get ideas for other IDs, look at items for other musicians; does Rocky have an entry at the same places they do? Does that help? --Tagishsimon (talk) 10:59, 26 November 2020 (UTC)
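To make "which bit of the URL you need" concrete, here is a Python sketch that pulls the identifier out of the two kinds of URL quoted above and pairs it with the matching property. The regular expressions are assumptions based only on the shape of those two example URLs, and the function name is hypothetical.

```python
import re

# Discogs artist URLs: numeric ID, optionally followed by '-Name'.
DISCOGS = re.compile(r"discogs\.com/artist/(\d+)")
# MusicBrainz artist URLs: a UUID in 8-4-4-4-12 hexadecimal form.
MUSICBRAINZ = re.compile(
    r"musicbrainz\.org/artist/"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def extract_identifier(url):
    """Return (property ID, value) for a recognised URL, else None."""
    match = DISCOGS.search(url)
    if match:
        return ("P1953", match.group(1))  # Discogs artist ID
    match = MUSICBRAINZ.search(url)
    if match:
        return ("P434", match.group(1))  # MusicBrainz artist ID
    return None


print(extract_identifier("https://www.discogs.com/artist/7596392-Rocky-Kramer"))
print(extract_identifier(
    "https://musicbrainz.org/artist/f55827bb-4224-4db8-bf72-0a19a3f106ef/works"
))
```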
@Tagishsimon: Sorry for the delay getting back to you. I went to sleep. Yes, that's very helpful. I have now looked at both Discogs and MusicBrainz and I believe I'm starting to catch on. However, now that I have those two identifiers, where do I add them in Rocky's WD page? I see lots of places I can add numbers as references, but I don't see anything listed for either MusicBrainz or Discogs. Should I add them as new categories? --Warriorboy85 (talk) 16:40, 26 November 2020 (UTC)
@Warriorboy85: Right at the bottom of the Rocky WD page, there is a link saying "add statement". Click that. The cursor will focus on the property cell of a statement box. Enter part of the 'Discogs artist ID' string into that, and choose the property from the list it provides. Then tab through to the central cell and add the value. Hit publish. --Tagishsimon (talk) 16:45, 26 November 2020 (UTC)
@Tagishsimon: Okay, now that really helped me understand a lot! I can't thank you enough. I've added those numbers and now have a basic idea of what I'm looking for. I'll look at other musicians to find the types of identifiers that might be out there. I really appreciate your help. This WP and WD community is exceptionally helpful and I appreciate and value it greatly. Happy Thanksgiving! --Warriorboy85 (talk) 17:13, 26 November 2020 (UTC)
@Tagishsimon: Well, I got Rocky's ISNI and ORCID numbers posted and they now show along with his MBA number. Although I added his Discogs number and it took the number, it isn't showing. He also has a BMI Repertoire at [Kramer BMI]. I have attempted to input his CAE/IPI # and even tried listing individual songs by their ISWC numbers, but the exclamation point keeps coming up saying there is missing information. Any idea what I am doing wrong? Thanks again. --Warriorboy85 (talk) 20:37, 26 November 2020 (UTC)
The authority control template on Wikipedia decides which of the external IDs stored on Wikidata it wants to show. Without having looked into the issue, my first hypothesis would be that Wikipedians don't like to include the Discogs number. If there are other Wikipedia pages that do show the Discogs number but the Discogs number isn't shown here, it would make sense to point to those examples here. Otherwise, I would just expect that this is a content decision on Wikipedia's end. ChristianKl 12:11, 30 November 2020 (UTC)

List of Q-items by number of statements

Hi, for Wikidata_talk:Lexicographical_data#Semi-automated_import_of_missing_lexemes_based_on_the_Q-items_with_most_statements I would like a list of Q-items (optionally that has a label in Esperanto) sorted by the number of statements. Is that possible without having to download the dump? If someone has access to a database where this information is available I would be very happy to receive a list of the first 50.000 items or so.--So9q (talk) 10:59, 26 November 2020 (UTC)

Here's what I came up with (because how can you not invest time in a request that includes the phrase "optionally has a label in Esperanto"): Query
This is not quite what you asked for, unfortunately. I couldn't find an existing list. Absent that, the straightforward query (as above without the line that includes "P31") is certain to time out, because it involves reading all of the data.
I am not entirely convinced that list would be helpful for your purposes, though. Intuitively, I believe a list by count of statements will include (1) countries, (2) movies (lots of actors etc.), and (3) other artefacts. Every paper coming out of the Large Hadron Collider has every single project participant as a co-author, for example. If that is reflected in our data, the list would be dominated by [Y particle flux in plasma conjunction is tri-symetric under Gedfeller conditionality]. There may also just be items with proper names at the top of such a list, because these are the same in any language and therefore have lots of (auto-generated) labels.
In the query above, I limited the list to instances of subclasses of human activity, which allows it to complete within the 60s we get. The chosen class(es) additionally restrict it to items that would seem to be meaningful for your purposes. You would vary that restriction to look at other promising segments. --Matthias Winkelmann (talk) 12:48, 26 November 2020 (UTC)
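The linked query is not reproduced in this archive, so the following is only a sketch of the kind of restriction described: a small Python helper that builds a SPARQL string for the Wikidata Query Service, limiting results to instances of subclasses of one class and ordering by statement count. The function name, its parameters, and the choice to order by `wikibase:statements` are assumptions for illustration (the original query may have counted something else); the caller supplies the class QID.

```python
import re


def build_statement_count_query(class_qid, language="eo", limit=100):
    """Build a SPARQL query listing items of a class, most statements first.

    class_qid: QID of the restricting class (supplied by the caller).
    language:  only items with a label in this language are returned.
    limit:     maximum number of rows.
    """
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a QID: {class_qid!r}")
    return (
        "SELECT ?item ?statements WHERE {\n"
        # The P31 line is the restriction that keeps the query within the timeout.
        f"  ?item wdt:P31/wdt:P279* wd:{class_qid} ;\n"
        "        wikibase:statements ?statements ;\n"
        "        rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{language}")\n'
        "}\n"
        "ORDER BY DESC(?statements)\n"
        f"LIMIT {int(limit)}"
    )


# "Q12345" is a placeholder QID for demonstration, not the class used above.
print(build_statement_count_query("Q12345", limit=5))
```

Swapping the class QID is how one would "vary that restriction" to look at other segments; the helper only builds the text and does not send it anywhere.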
Oh, thanks for pointing out my flawed strategy :) You are completely right that sorting by most statements does not seem suitable for my needs. I really like your query, as it seems to count the labels instead. I will try to use that, and maybe a list of the statistically most frequently used words in Esperanto, which seems to be exactly what I want.

@So9q: I was playing around with a tool doing things with the Wikidata dumps and used your question as a test. I've put the data I came up with on my userpage. This includes the top 3500 items or so, with repetitive items omitted. Japanese surnames figure prominently, as does Unicode. Top billing goes to a research paper, however. Clicking it may kill your browser. --Matthias Winkelmann (talk) 21:31, 29 November 2020 (UTC)

Thanks Matthias, that was an interesting list. The labels seem to be counted as statements in your list, but they are not when searching on Wikidata using the searchbox. What would be the result if you ignored labels in the count of statements?--So9q (talk) 05:51, 30 November 2020 (UTC)
I didn't think to record that data unfortunately. The research papers, countries, and "other" items (the sun, for example) on that list "score" mostly for "real" statements. --Matthias Winkelmann (talk) 13:12, 30 November 2020 (UTC)
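For anyone wanting to repeat the dump-based count with and without labels, here is a minimal Python sketch. It is not the tool used above; it only assumes the standard Wikidata JSON dump layout (one entity object per line inside a JSON array, with `claims` and `labels` keys), and the function names are made up for this example.

```python
import heapq
import json


def count_statements(entity):
    """Number of statements on one entity: the sum over all properties in 'claims'."""
    return sum(len(statements) for statements in entity.get("claims", {}).values())


def top_items(lines, n=10, include_labels=False):
    """Return the n (item id, count) pairs with the highest counts.

    lines: iterable of text lines from a JSON dump.
    include_labels: if True, each label is added to the count as well.
    """
    def scored():
        for line in lines:
            line = line.strip().rstrip(",")  # entity lines end with a comma
            if line in ("", "[", "]"):       # skip the array brackets
                continue
            entity = json.loads(line)
            total = count_statements(entity)
            if include_labels:
                total += len(entity.get("labels", {}))
            yield entity["id"], total

    # nlargest keeps only n candidates in memory at a time.
    return heapq.nlargest(n, scored(), key=lambda pair: pair[1])
```

With a real dump, `lines` would be an open (decompressed) file object, for example from `gzip.open(path, "rt", encoding="utf-8")`; comparing the output for `include_labels=False` and `include_labels=True` shows how much the labels shift the ranking.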

I stumbled upon this item All your base are belong to us (Q54) and found that the spoken text audio (P989) missed having pronunciation variety (P5237). I didn't manage to find any "computer generated English" or "speech synthesis English" dialect/accent in Wikidata so I just added the general English (Q1860). Did I do the right thing? Does anybody know if there is a pronunciation variety which includes English pronounced by software? I did a query about subclass of English with little luck. Maybe broken English (Q20504733) can be used in cases where speech synthesis is being used? LotsofTheories (talk) 09:21, 30 November 2020 (UTC)

@LotsofTheories: We have Chinese speech synthesis (Q16369) so I don't see why we couldn't have a similar item for English. On the other hand if you know a specific piece of software was used then the item for that software perhaps should be used? ArthurPSmith (talk) 14:32, 30 November 2020 (UTC)

Wikidata descriptions changes to be included more often in Recent Changes and Watchlist on Wikimedia wikis

Hello all,

As you may know, you can include changes coming from Wikidata in your Watchlist and Recent Changes on other Wikimedia projects. Until now, this feature didn’t always include changes made on Wikidata descriptions. This is due to how Wikidata tracks what data is used in a given article.

Starting on December 3rd, the Watchlist and Recent Changes will include changes on the descriptions of Wikidata Items that are used in the pages that you watch on the client wiki. This will only include descriptions in the language of your wiki to make sure that you’re only seeing changes that are relevant to your wiki.

This improvement was requested by many users from different projects. We hope that it can help contributors of Wikipedia and the Wikimedia projects to monitor the changes on Wikidata descriptions and participate in the effort of improving the data quality.

If you encounter any issue or want to provide feedback, feel free to use this Phabricator ticket. Thanks! Lea Lacroix (WMDE) (talk) 14:50, 30 November 2020 (UTC)

Are these two the same

Natural History of New York (Q51508328) and Natural History of New York (Q51431792) or are they different printings of the same set, or is one the set and one an individual volume? --RAN (talk) 23:12, 29 November 2020 (UTC)

Good question. Where we have different scans of (some of) the same text, with different BHL identifiers and Internet Archive links, I think there's a choice. They could be bundled together, as a single item for the edition of the book (or for the periodical), with qualifiers (e.g. collection (P195)) to indicate different copies on different statements. Or you could regard the items as being for individual copies (or individual sets of copies), and have a third item for the edition, which these two would each be exemplars of. I'll be interested to see what the balance of preferences is among the community. Jheald (talk) 12:15, 30 November 2020 (UTC)
Example of the first approach: internet archive IDs on Animate creation; popular edition of "Our living world" a natural history (Q51401008) or Bulletin of the Museum of Comparative Zoology (Q21385585)
Example of the second approach: On the laws and practice of horse racing, etc., etc (Q51425849) and On the laws and practice of horse racing (Q51514189) both with exemplar of (P1574) = On the laws and practice of horse racing (1866 edition) (Q53738443). Jheald (talk) 12:28, 30 November 2020 (UTC)
  • I will let it be until we come up with a standardized way of handling them. I harmonized the titles so that they will be easier to recognize in the future. --RAN (talk) 19:54, 30 November 2020 (UTC)

Wikidata weekly summary #444