Wikidata:Project chat/Archive/2022/11


Please merge the two items. 2800:A4:339D:AA00:C8AD:84B4:1DC:EE8 17:38, 4 November 2022 (UTC)

  Done Michgrig (talk) 19:23, 4 November 2022 (UTC)
This section was archived on a request by: —‍Mdaniels5757 (talk • contribs) 20:14, 4 November 2022 (UTC)

Occupation: hostage taker

Is hostage taker (Q111007953) an occupation (Q12737077)? The item currently says so, but I’m not convinced; slave owner (Q10076267), which I’d consider similar, has instance of (P31) social class (Q187588) (which doesn’t really fit hostage taker (Q111007953) either, though). Same problem, by the way: kidnapper (Q65622885). --2A02:8108:50BF:C694:7062:2BB6:D365:3A7E 14:14, 31 October 2022 (UTC)

I think it could be. Personally I view occupation pretty expansively. It's not just something you do for gain. And anyways hostage taking could be a career. BrokenSegue (talk) 22:42, 31 October 2022 (UTC)
I think it's more of a context-dependent role (Q4897819) (or role (Q214339)?), rather than what is most commonly evoked by "occupation" (i.e. career or livelihood). If we normally wouldn't classify hostage (Q192620) or victim (Q1851760) as occupations, neither should we classify the opposite as such (and especially not for occupation (P106)). Similarly, employer (Q3053337), employee (Q703534) and lessee (Q45574934) probably shouldn't be instances of occupations ("Bob is an accountant" or "Bob is a chemist" is semantically informative while "Bob is an employee" is essentially meaningless unless qualified in relation to a particular business/employer). If a P31 must be stated for this item (and I'm not necessarily convinced it needs to have one), role (Q4897819) or role (Q214339) might be more appropriate, and Q111007953 relegated to specific qualifiers on incident statements or items about crimes, using subject has role (P2868). But an equally important question: does it actually matter in the long run whether we classify anything as an occupation or not? Are we just rearranging deck chairs on the Titanic? -Animalparty (talk) 03:19, 1 November 2022 (UTC)
You'd not expect symmetricality between doctor and patient. But I'd argue criminals have occupations if they do something major or many times, and kidnapping/hostage taking counts. Vicarage (talk) 09:32, 1 November 2022 (UTC)
Well, criminal (Q2159907) currently has instance of (P31) occupation (Q12737077). Does any criminal action someone could engage in many times have to follow suit? In any case, I’m not sure where to draw the line between occupations and other “things people do more or less regularly in a more or less professional way, and which could be used to describe their livelihood”. I’d remark, though, that most hostage takers probably do not take hostages “regularly”, be it because they are caught and imprisoned (prisoner (Q1862087) happens to be an occupation (Q12737077), too) or because the hostage-taking is just a part/byproduct of another action (such as bank robbery). --2A02:8108:50BF:C694:658F:7048:504A:F96A 10:45, 1 November 2022 (UTC)

Property is an occupation

discoverer or inventor (P61) instance of (P31) occupation (Q12737077) – is this intended to be so? I happened to find this property with a SPARQL query for all occupations. --2A02:8108:50BF:C694:658F:7048:504A:F96A 10:34, 1 November 2022 (UTC)

inventor is a common occupation of notable people Vicarage (talk) 12:53, 1 November 2022 (UTC)
Undoubtedly – but the subject of the above statement is a property… --2A02:8108:50BF:C694:658F:7048:504A:F96A 13:04, 1 November 2022 (UTC)
Thanks for noticing. I restored the previous version. - Valentina.Anitnelav (talk) 13:11, 1 November 2022 (UTC)
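
For anyone who wants to repeat this kind of check, here is a minimal sketch of how such a query could be put together. The exact query used above was not posted, so the query shape, the helper name instances_query and the optional restriction to properties are assumptions for illustration; the sketch only builds the SPARQL text and does not contact the query service.

```python
OCCUPATION = "Q12737077"  # occupation, as cited in the thread


def instances_query(class_qid: str, only_properties: bool = False) -> str:
    """Build SPARQL text listing entities whose instance of (P31) is class_qid.

    Hypothetical helper: the original poster's query is not known.
    """
    lines = [
        "SELECT ?entity WHERE {",
        f"  ?entity wdt:P31 wd:{class_qid} .",
    ]
    if only_properties:
        # Assumption: the query service types properties as wikibase:Property.
        lines.append("  ?entity a wikibase:Property .")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the variant that would only return properties such as P61.
    print(instances_query(OCCUPATION, only_properties=True))
```

The printed text can be pasted into the query service; restricting to properties is one way a stray statement on a property like P61 might be spotted without scrolling through every occupation item.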

Is it possible to add labels in Transylvanian Saxon (Q260942)?

Hi there. The Romanian OSM community would like to introduce settlement names in Transylvanian Saxon for the relevant settlements in Romania in order for the names to appear in OSM.

The dialect does not have an ISO code, but based on a discussion in the German community, OSM is currently using the code gem (reserved for the Germanic languages group).

Is it currently possible to add labels using the ISO code gem on Wikidata? If not, is there another way to introduce labels in Transylvanian Saxon (Q260942)? Thanks. Strainu (talk) 23:14, 26 October 2022 (UTC)

Help:Monolingual_text_languages suggests that one might use "mis" for monolingual text that has no language code, but I don't think I would encourage you to use that for labels. Is there any prospect of a language code being assigned? Bovlb (talk) 04:13, 27 October 2022 (UTC)
In general it would be helpful if you would link back to the OSM discussion. At Wikimedia we generally don't use language codes outside of their stated purpose.
https://www.rfc-editor.org/rfc/rfc4646#section-3.5 describes the process for getting new official language tags. I would recommend you go through that process. It's bureaucratic and takes some time, but it will do the Transylvanian Saxon community good outside of Wikimedia as well. ChristianKl 14:51, 27 October 2022 (UTC)
The latest (Romanian) discussion is on a Telegram channel, I doubt it's logged somewhere. The previous discussion which led to the use of gem is linked from https://wiki.openstreetmap.org/wiki/Key:name:gem . Strainu (talk) 21:11, 27 October 2022 (UTC)
The way to request new languages for labels is by opening a ticket on Phabricator, but the backlog for this has been accumulating for over a year.
In the meantime, I think it would be fine to start adding this data now using name (P2561) set to "mis" on the item, and a qualifier with language of work or name (P407) linking to the Transylvanian Saxon item. The query to get these names would be slightly more convoluted, but you would be able to start using this data right away, whereas it's anyone's guess if labels might be usable for this. عُثمان (talk) 07:25, 30 October 2022 (UTC)
Wikimedia has Langcom and the policy is to ask them in Phabricator tickets before adding new codes. Langcom puts value on not misusing codes and only using valid IETF tags, so it's very unlikely that a ticket would lead to us adding gem for Transylvanian Saxon. ChristianKl 20:10, 1 November 2022 (UTC)
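
As a rough illustration of the "slightly more convoluted" query mentioned for the name (P2561) workaround, the sketch below builds SPARQL text that reads the name statements together with their language of work or name (P407) qualifier. The helper name qualified_names_query is hypothetical, and the language filter assumes that "mis" values are exposed with the language tag "mis"; nothing here has been run against live data.

```python
TRANSYLVANIAN_SAXON = "Q260942"  # item cited in the thread


def qualified_names_query(language_qid: str = TRANSYLVANIAN_SAXON) -> str:
    """Build SPARQL text for name (P2561) statements qualified with P407.

    Hypothetical helper; the statement/qualifier prefixes (p:, ps:, pq:)
    are the ones the query service uses for full statements.
    """
    return "\n".join([
        "SELECT ?item ?name WHERE {",
        "  ?item p:P2561 ?statement .",
        "  ?statement ps:P2561 ?name ;",
        f"             pq:P407 wd:{language_qid} .",
        # Assumption: monolingual text entered as 'mis' carries that tag.
        '  FILTER(LANG(?name) = "mis")',
        "}",
    ])


if __name__ == "__main__":
    print(qualified_names_query())
```

The extra hop through ?statement is what makes this longer than a plain label lookup: the qualifier lives on the statement, not on the item.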

What to do about a reference (possibly) not supporting a statement?

I’ve just converted the labels of Abiodun Folashade Tokunbo (Q112240139) to title case (used to be uppercase). Having no idea about Yoruba naming (particularly their naming order), I was grateful for the given name (P735) and family name (P734) statements, but I’m a bit taken aback by their references. Given name and family name have references to two different websites; both of them contain the subject’s (full) name, but as far as I can tell, neither explicitly states which part of the name is which. At best, both are valid references for the whole name, but not (at all) for the given/family name assignment. The fact that there are two separate references for the two statements would imply they are… I’m not even sure whether the name (as a whole) of a person must be sourced, as the alternative (creating an unnamed item if the name cannot be sourced) wouldn’t make much sense.

Sorry if my posting comes across as confused; essentially my question is what to do about those references: Add both to both statements? Delete them? Leave everything as it is? I have no reason to believe that the statements in question are incorrect. --2A02:8108:50BF:C694:90D3:B504:2290:3B75 20:52, 1 November 2022 (UTC)

Sitelinks to Redirects are now available on Wikidata

Hi everyone,

We're excited to announce that the much-requested Sitelinks to Redirects are now available on Wikidata.

As you may recall, we implemented the functionality on test.wikidata.org to directly add a sitelink to a Redirect to an Item if you also add a Redirect badge in the same edit. (see the previous announcement). Many of you tested the feature and helped us find a good implementation. We would like to thank everyone who participated and provided feedback.

We hope that this will make it easier to find and connect related information across different wikis.

This new feature will likely require some updates of guidelines and help pages. If you have any questions or suggestions please let us know here or on Wikidata:Report a technical problem.

Cheers, -Mohammed Sadat (WMDE) (talk) 19:06, 27 October 2022 (UTC)

Seems to work! Thanks a lot, —MisterSynergy (talk) 20:50, 27 October 2022 (UTC)
Great to have it finally working after all those years. ChristianKl 19:58, 1 November 2022 (UTC)
@Mohammed Sadat (WMDE) @Lydia Pintscher (WMDE) Currently the error message that gets shown when the flag isn't set is https://www.wikidata.org/wiki/MediaWiki:Wikibase-validator-sitelink-conflict . We could either generalize that error message so that it links to https://www.wikidata.org/wiki/Wikidata:Sitelinks_to_redirects, or add a new error message that links there. ChristianKl 19:30, 2 November 2022 (UTC)

I'm looking for good translations for the German Überwachungsdruck, which describes the feeling of being surveilled. Probably most Westerners don't know this, but suffragettes, Black Panther activists or Uyghurs certainly do.

IMHO "surveillance strain" is not a good translation. -- SF Nudist (talk) 14:42, 1 November 2022 (UTC)

I'm guessing the term is distinct from unfounded fears of surveillance. How about surveillance pressure, or surveillance anxiety? --William Graham (talk) 17:59, 2 November 2022 (UTC)

“Mixed” disambiguation items

Are items such as chekist (Q4508280) or fashionista (Q5436793) desirable, which have instance of (P31) Wikimedia disambiguation page (Q4167410), but also statements apparently describing a (partially disambiguated?) concept (instance of (P31) profession (Q28640) in this case, for chekist (Q4508280) also subclass of (P279) intelligence agent (Q392651))? --2A02:8108:50BF:C694:658F:7048:504A:F96A 15:34, 1 November 2022 (UTC)

We clearly need a new item Чекист to avoid any mix-up.--Oberbefehlshaber Ost (talk) 16:02, 1 November 2022 (UTC)
Thanks In Klebensgefahr stecken and Oberbefehlshaber Ost for splitting these items. There seem to remain 2232 others, though. --2A02:8108:50BF:C694:E5D0:1415:830F:FE1C 11:29, 2 November 2022 (UTC)
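
To make the "mixed" condition concrete, here is a small sketch of the check applied to already-fetched instance of (P31) values. How the 2232 remaining items were counted is not stated above, so this is only one plausible reading; the function name extra_classes and the second sample entry are hypothetical.

```python
DISAMBIGUATION_PAGE = "Q4167410"  # Wikimedia disambiguation page


def extra_classes(p31_values):
    """Return the P31 values an item has besides 'disambiguation page'.

    An empty set means the item is either not a disambiguation item at all
    or is a clean one with no additional class.
    """
    values = set(p31_values)
    if DISAMBIGUATION_PAGE not in values:
        return set()
    return values - {DISAMBIGUATION_PAGE}


if __name__ == "__main__":
    # Sample data: Q4508280 as described above (before the split);
    # 'Q_CLEAN_EXAMPLE' is a made-up placeholder, not a real item.
    sample = {
        "Q4508280": ["Q4167410", "Q28640"],
        "Q_CLEAN_EXAMPLE": ["Q4167410"],
    }
    mixed = {qid: extra_classes(v) for qid, v in sample.items() if extra_classes(v)}
    print(mixed)  # {'Q4508280': {'Q28640'}}
```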

Language policy for disambiguation items?

I’ve noticed that Avocat (Q417973) is linked to en:Lawyer (disambiguation), while (some) other languages link a disambiguation page avocat even if that word does not exist in the language. Is it intended to be this way? I’d expect disambiguation page items to be linked only to pages disambiguating the same term – an avocat in one language may have nothing to do with a lawyer in English. --2A02:8108:50BF:C694:90D3:B504:2290:3B75 21:32, 1 November 2022 (UTC)

According to our policy, the links shouldn't be created this way. "Avocat" should only be linked to wikis that have a disambiguation page on "Avocat". Sitelinks to redirects can be used when links are desired between those pages. ChristianKl 21:41, 1 November 2022 (UTC)
Thanks. I’ve now removed az:Vəkil (dəqiqləşdirmə), en:Lawyer (disambiguation), ln:Avoká and nds-nl:Avvekaot from the item. What about different writing systems? Should ko:로이어 and ru:Адвокат (значения) be removed, too? --2A02:8108:50BF:C694:E5D0:1415:830F:FE1C 11:20, 2 November 2022 (UTC)
https://www.wikidata.org/wiki/Wikidata:WikiProject_Disambiguation_pages/guidelines is our policy document. It does suggest allowing transcription for different scripts. On the other hand, it's my impression that we didn't put that much effort into cleaning up disambiguation in different scripts. ChristianKl 20:10, 2 November 2022 (UTC)

Modelling an award longlist versus shortlist

I do a lot of work on awards and am currently importing the Booker Prize (Q160082) database. For nominated for (P1411), this award has two levels of nomination: a longlist of about a dozen entries and a shortlist of six. Both of these are important and so is the difference between them. I have created Booker Prize shortlist (Q115010579) and Booker Prize longlist (Q115010563) to model this difference but am struggling to find a suitable qualifier property. assessment outcome (P9259) has a restricted list of allowed values. ranking (P1352) requires an ordinal number. How should I mark the difference between these two levels of nomination? I've looked at qualifiers that are frequently used with P1411 but none seem appropriate for this task. MartinPoulter (talk) 15:59, 2 November 2022 (UTC)

I'd be inclined to use has characteristic (P1552) or object has role (P3831) - and probably the first of these. --Tagishsimon (talk) 18:02, 2 November 2022 (UTC)

Awareness of deletions

I have a couple of questions regarding entity deletions:

- Is there a way to see a list of entity names that I previously created and have since been deleted?

- Is there a way to get notified about entities I've added that have been posted to Requests for deletion?

Thanks! Pauljmackay (talk) 07:39, 29 October 2022 (UTC)

To answer your first question: Here’s a list with all your item creations including those deleted. --Emu (talk) 08:11, 29 October 2022 (UTC)
Very useful, thank you. Can anyone help on when an entity may be put forward for deletion? How can authors defend or improve an entry before it gets deleted? Pauljmackay (talk) 08:15, 30 October 2022 (UTC)
@Pauljmackay: we don't have automatic user notification when an item is requested for deletion (as e.g. Wikimedia Commons has). I personally notify the user manually when checking and resolving requests at Wikidata:Requests for deletions. Estopedist1 (talk) 06:00, 31 October 2022 (UTC)
Don't most users get notified when the item is nominated? (It doesn't happen for bulk or manual nominations, though.) BrokenSegue (talk) 06:13, 31 October 2022 (UTC)
@BrokenSegue e.g. Q114995774 was nominated, but the creator is not automatically notified. Estopedist1 (talk) 06:54, 3 November 2022 (UTC)

Duplicate items for web analytics

There's web analytics (Q10719477) which is a bit more fleshed out (is a subclass of analytics) and there's Web analytics (Q2382498) which just has four sitelinks. I think the latter should probably be merged with the former but the Spanish, Russian and Ukrainian Wikipedias actually seem to have separate articles. Can somebody look into this? --Push-f (talk) 10:12, 3 November 2022 (UTC)

@Push-f These do seem to be duplicate entries but I'm only using Google Translate. Personally I'd mark Web analytics (Q2382498) as Wikimedia duplicated page (Q17362920). Let's see if Russian/Ukrainian/Spanish speakers can spot a difference between the two concepts. Vojtěch Dostál (talk) 11:17, 3 November 2022 (UTC)

IMHO meal delivery service is now obsolete due to the deletion of the English article, and is a duplicate of food delivery service (Q10932402). What is your opinion? -- Wootrition (talk) 17:08, 2 November 2022 (UTC)

I could see meal delivery service and grocery delivery service being valid subclasses of food delivery service. Since they're distinguishable that way, it would be fine for both to stay and not merge them. --William Graham (talk) 17:54, 2 November 2022 (UTC)
If we look at what the WP article was before its merge - https://en.wikipedia.org/w/index.php?title=Meal_delivery_service&oldid=1024777465 - it is dealing with "a service that sends customers fresh or frozen pre-portioned prepared meals", so meal delivery service (Q30594545) is a subclass of food delivery service (Q10932402). --Tagishsimon (talk) 17:57, 2 November 2022 (UTC)
I disagree that they are the same BrokenSegue (talk) 18:23, 2 November 2022 (UTC)
I created the missing grocery delivery (Q115016960). It's really great to finally be able to sitelink to redirects in cases like this. ChristianKl 19:12, 2 November 2022 (UTC)

Subquestion

 
@ChristianKl: How do I add a sitelink to a redirect? Does it not work for new/untrusted users?

Thanks in advance.--Angstspinner (talk) 18:52, 3 November 2022 (UTC)

https://www.wikidata.org/wiki/Wikidata:Sitelinks_to_redirects should explain it. If anything feels unclear to you, I'm happy to hear it to potentially improve it. ChristianKl 21:06, 3 November 2022 (UTC)

What does S. N. A. K. (Q86719099) really mean?

S. N. A. K. (Q86719099)

Asked by a Wikidata beginner. Thanks JuguangXiao (talk) 07:53, 3 November 2022 (UTC)

"some notation about knowledge" Germartin1 (talk) 07:54, 3 November 2022 (UTC)
If you look at the item you find the "described at URL" statement. It gets you to https://www.wikidata.org/wiki/Wikidata:Glossary#Snak . Is there anything unclear about that? ChristianKl 12:28, 3 November 2022 (UTC)
Equally, there is more information at https://www.mediawiki.org/wiki/Wikibase/DataModel#Snaks and having read it I'm tempted to ask is there anything clear about that? --Tagishsimon (talk) 13:30, 3 November 2022 (UTC)

As Germartin1 said, "some notation about knowledge". Interestingly, the only result when you look for that on the Web seems to be a preprint for an ACL submission on a system called "Hansel". Maybe we should make a better reference for that :) --Denny (talk) 19:34, 3 November 2022 (UTC)

ORES Score

Why is the ORES score system only detecting 8/54 edits of blatant vandalism by 213.221.251.2 (talk • contribs • logs), but flagging many correct link removals?

This is an excerpt from the vandalism dashboard, keeping only the entries that were not reverted:

| Item | action used | Move | assessment | ORES | prediction correct? | current location of the sitelink |
|---|---|---|---|---|---|---|
| slack water (Q432343) | Removed sitelink in dewiki: de:Stauwasser | | different things | 0.965 | | Q115044926 |
| Q913015 | Removed sitelink in dewiki: de:Bundestrainer | | Dissolution of a disambig page | 0.928 | | |
| Fiat Powertrain Technologies (Q1410621) | Removed sitelink in dewiki: de:FPT Industrial | | Mother / daughter company mixup | 0.946 | | FPT Industrial (Q80281) |
| moonshine (Q917046) | Schwarzbrennerei, now Q107573569 | X | different things | 0.955 | | illicit distilling (Q107573569) |
| tattooing (Q43006) | Tätowierung, now Q72941682 | X | different things | 0.953 | | tattoo (Q72941682) |
| microphilia (Q1478636) | Removed sitelink in dewiki: de:Mikrophilie | | redirect to macrophilia (Q1263008) | 0.989 | | |
| Burgbernheim (Q55606161) | Removed sitelink in dewiki: de:Bahnhof Burgbernheim | | only a Haltepunkt (train stop) (Q27996460), and these are not relevant | 0.954 | | |
| Shannon Lucid (Q233795) | Removed sitelink in dewikinews: Kategorie:Shannon Lucid | | not part of it as a category | 0.987 | | |
| economy of Russia (Q461731) | Removed sitelink in dewiki: Wirtschaftspolitik Russlands | | different things (economy / economic policy), German article made to a redirect | 0.787 | | |
| beer measurement (Q878975) | Removed sitelink in dewiki: de:Bittereinheit | | different things | 0.956 | | International Bittering Units scale (Q2000486) |
| mud building (Q12019766) | Removed sitelink in dewiki: de:Lehmbau, now Q65805058 | X | different things | 0.943 | | earthen architecture (Q65805058) |
| Q53666410 | Removed sitelink in dewiki: de:Kabinett Brnabić I | | different things (List of ministers) | 0.318 | | |
| Catalan vault (Q2362415) | Removed sitelink in dewiki: de:Katalanisches Gewölbe | | different understanding of the concept: de:Hyperbolische Paraboloidschale OR de:Flachziegelgewölbe | 0.969 | | |
| Palace (Q1669287) | Removed sitelink in dewiki: de:Scandic Palace, Moving sitelink to Q111400862 | X | different things | 0.890 | | Palace Hotel Tallinn (Q111400862) |

I think we need a general discussion about the patrolling burden. Do we really need anonymous and especially mobile IP edits? Angstspinner (talk) 09:56, 4 November 2022 (UTC)

  • ORES is known to be not perfect, and the per-revision score is maybe not even the right approach.
  • ORES scores are available for edits of the past 30 days only in the database. You can still retrieve scores for older revisions individually via the API at https://ores.wikimedia.org/.
  • The "possible vandalism" tag that you are seemingly referring to is added via several AbuseFilters, but it does not (necessarily) depend on ORES scores.
  • IP editors are a substantial part of our editor community (probably between 25–50% of all editors). Since they do not use automation for editing, they only account for ~1–3% of the edits, but they have their eyes at many places where registered users do not look. And their contributions are overwhelmingly okay.
MisterSynergy (talk) 10:13, 4 November 2022 (UTC)
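
For the point about retrieving scores for older revisions individually, a request URL could be assembled roughly as follows. This is a sketch under assumptions: the /v3/scores/ path layout, the wikidatawiki context name and the damaging/goodfaith model names are recalled from the ORES documentation rather than stated in this thread, the service may have changed since, and the helper ores_scores_url is hypothetical. The code only builds the URL and makes no request.

```python
from urllib.parse import urlencode

# Assumed base path; only the host https://ores.wikimedia.org/ is given above.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def ores_scores_url(rev_ids, wiki="wikidatawiki", models=("damaging", "goodfaith")):
    """Build a URL asking for scores of specific revisions (hypothetical helper)."""
    query = urlencode({
        "models": "|".join(models),                     # pipe-separated model names
        "revids": "|".join(str(r) for r in rev_ids),    # pipe-separated revision IDs
    })
    return f"{ORES_BASE}/{wiki}/?{query}"


if __name__ == "__main__":
    # Revision IDs below are placeholders, not real vandalism edits.
    print(ores_scores_url([1, 2]))
```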
In my little corner of Wikidata IP edits are the main source of uncaught vandalism. Having people not register an account was all very nice and cool when we started Wikipedia, but for Wikidata it's doing more harm than good. We had ten years to set up a system to properly handle IP vandalism and we failed. Time for a different approach.
I'm happy properties can no longer be edited by IPs. I would support restricting IPs from changing existing labels, descriptions and statements, as this is the most common vandalism. I would also support restricting IPs from editing items, as we did with properties. Multichill (talk) 21:36, 4 November 2022 (UTC)
@Angstspinner @MisterSynergy @Multichill if we restrict IP edits for Wikidata, then vandals will just create a user name. Creating a new user account is stupidly easy in Wikimedia. I guess the account creation process should be changed. Estopedist1 (talk) 07:08, 5 November 2022 (UTC)
Some might, but I think creating an account is the extra barrier that will significantly reduce the amount of uncaught vandalism. Multichill (talk) 10:12, 5 November 2022 (UTC)
Generally, the idea of inhibiting a two-digit percentage of our editors from contributing to Wikidata is problematic. On the other hand, I don't think that the app should allow making edits without creating an account. The app should automatically create accounts when users want to use it to make edits and be able to handle the account creation without discouraging users from contributing. ChristianKl 13:18, 5 November 2022 (UTC)

Created items for review...

I discovered a batch of ~400 recently created items by @Obedmakolo that should be reviewed : https://www.wikidata.org/w/index.php?title=Special:Contributions&end=&namespace=all&newOnly=1&start=&tagfilter=&target=Obedmakolo&offset=&limit=500

This user looks to have created pages automatically based on photography metadata. Some items should be merged together and/or with preexisting items (eg. Q114960121, Q114960043, Q114960039, Q114960024, Q114960131, Q114960061, Q114955673, Q114960036 and Q114960030), some other items do not reach the criteria of notability (eg. Q114960355), some items are admissible but with wrong statements, and fortunately some look correct. Have fun... Louperivois (talk) 05:23, 5 November 2022 (UTC)

@Magnus_Manske can you look into making WikiShootMe less likely to encourage users to behave like this? Allowing users to create items that match an existing name and description, like Q114955673, Q114960131 and Q114960024, seems clearly wrong. When it comes to Q114960043 and Q114960061, which are items whose names differ by one word, maybe the user should also be told about them and have to say explicitly that the new item is different? ChristianKl 13:00, 5 November 2022 (UTC)
  Info Mass deleted by Mahir256 --Emu (talk) 15:10, 5 November 2022 (UTC)

New RFC: Create items for property proposals

Wikidata:Requests for comment/Create items for property proposals Lectrician1 (talk) 20:41, 5 November 2022 (UTC)

Request for move with permission

Please move the wikiquote link from Redner (Q2136324) to orator (Q12859263). Thanks a lot. Oure Ladye (talk) 20:53, 5 November 2022 (UTC)

  Done --Ameisenigel (talk) 21:11, 5 November 2022 (UTC)

Reverting lots of edits without reverting newer edits

hi, i noticed Vincent Lefèvre (talk • contribs • logs) set the rank for software version identifier (P348) to deprecated on statements that are release candidate version (Q1072356) on the item Mastodon (Q27986619). i don't see the reasoning behind it (the release candidates are actual releases that were actually made, and still exist as i just checked). i would like to revert the edits, but restoring would also include the newer (unrelated & valuable) edits, and individually undoing would be pretty repetitive. is there a way to undo a large amount of edits in a fast manner (i.e. not manually or without extra prompts on each undo or not repetitively) while not reverting the newer edits? i haven't found any tool that seems like it could do that. JacksonChen666 (talk) 00:12, 6 November 2022 (UTC)

https://github.com/maxlath/wikibase-cli will do that sort of thing - set the rank for a statement - if you collect the affected statement IDs e.g. via SPARQL. The command is wb update-claim. --Tagishsimon (talk) 00:16, 6 November 2022 (UTC)
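
As an alternative to SPARQL for the "collect the affected statement IDs" step, the IDs could also be picked out of the item's statement JSON. The sketch below assumes the standard Wikibase JSON layout for statements (id, rank, qualifiers with datavalue entries); the function names and the sample GUIDs are hypothetical, and how the resulting IDs are passed to wb update-claim is deliberately left out, since the exact options should be taken from the tool's own documentation.

```python
RELEASE_CANDIDATE = "Q1072356"  # release candidate version
VERSION_TYPE = "P548"           # version type qualifier


def qualifier_item_ids(statement, prop):
    """Item IDs used as values of the given qualifier property on a statement."""
    ids = []
    for snak in statement.get("qualifiers", {}).get(prop, []):
        if snak.get("snaktype") == "value":  # skip 'somevalue' / 'novalue' snaks
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


def deprecated_release_candidates(statements):
    """IDs of statements ranked 'deprecated' and qualified as release candidates."""
    return [
        s["id"]
        for s in statements
        if s.get("rank") == "deprecated"
        and RELEASE_CANDIDATE in qualifier_item_ids(s, VERSION_TYPE)
    ]


if __name__ == "__main__":
    # Made-up P348 statements; the GUIDs are placeholders.
    sample = [
        {"id": "Q27986619$hypothetical-1", "rank": "deprecated",
         "qualifiers": {"P548": [{"snaktype": "value",
                                  "datavalue": {"value": {"id": "Q1072356"}}}]}},
        {"id": "Q27986619$hypothetical-2", "rank": "normal",
         "qualifiers": {"P548": [{"snaktype": "value",
                                  "datavalue": {"value": {"id": "Q2804309"}}}]}},
    ]
    print(deprecated_release_candidates(sample))  # ['Q27986619$hypothetical-1']
```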
The release candidates typically have deprecated rank (except when a release has not been done yet, i.e. possibly for the latest release candidate), and templates using wikidata are written taking this point into account. See fr:Discussion Projet:Informatique#Mastodon (réseau social) : liste de versions à rallonge dans l'infoboîte (in French). — Vincent Lefèvre (talk) 00:40, 6 November 2022 (UTC)
@Vincent Lefèvre: Then that is a very clear misuse of rank - which does not exist to satisfy a single use on a single wikimedia platform, but instead follows a rule base documented at Help:Ranking#Deprecated_rank - "The deprecated rank is used for statements that are known to include errors". That there were release candidates is not erroneous. --Tagishsimon (talk) 00:50, 6 November 2022 (UTC)
@Tagishsimon: This seems to have already been discussed on Property talk:P348, where "deprecated" was regarded as OK. — Vincent Lefèvre (talk) 01:04, 6 November 2022 (UTC)
@Vincent Lefèvre: No it has not; no it was not. The rules are specified at Help:Ranking#Deprecated_rank. You do not get to bork them just because you've elected to write a template somewhere else that has an ill-conceived logic. It's perfectly possible to conceive of a data structure, including qualifiers, which will satisfy your requirements. I suggest you do that. --Tagishsimon (talk) 01:10, 6 November 2022 (UTC)
@Tagishsimon: I hope that Genium (talk • contribs • logs), who did some of the changes to deprecated rank (at least for GNU Emacs (Q1252773) on 2022-09-12 and Apache Subversion (Q46794) on 2020-05-27), could comment. — Vincent Lefèvre (talk) 01:43, 6 November 2022 (UTC)
This issue is well explained by Vincent there. AFAIK, it's currently the best way to avoid displaying misinformation on Wikipedia… Genium. 14:35, Nov 6, 2022 (UTC+01:00)
I searched for both rank and deprecated on the talk page of software version identifier (P348), and the only indication that one person thought that deprecated would be okay was one statement from 2015 saying "Use three small boxes on the left of the date to mark the latest version as "Preferred rank". Don't forget to mark the older version(s) as "Normal (or deprecated) rank"". That is followed up by another person saying that using deprecated throws a warning, which indicates that it's not right.
Later Jura linked to https://www.wikidata.org/wiki/Help:Ranking and nobody on the talk page objected that the rule applies. Deprecated has a meaning on Wikidata and it doesn't mean that something is old. Release candidates are not errors and thus using that rank is wrong for them.
If you have a problem in Wikipedia where you only want stable versions to be shown, you can decide to only show versions with the qualifier version type (P548) stable version (Q2804309); if you just want to filter out release versions, you can also do that via {{P|548}}. Entering wrong claims into Wikidata is not the way to get Wikipedia templates to behave like you want them to. ChristianKl 18:57, 6 November 2022 (UTC)
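
The qualifier-based filtering suggested here can be sketched as follows. A real infobox would implement this in the template or module on the Wikipedia side, so this Python version only illustrates the selection logic; it assumes the standard Wikibase JSON layout for statements, and the function names and sample version strings are made up.

```python
STABLE_VERSION = "Q2804309"  # stable version
VERSION_TYPE = "P548"        # version type qualifier


def version_types(statement):
    """Item IDs given as version type (P548) qualifiers on a P348 statement."""
    return [
        snak["datavalue"]["value"]["id"]
        for snak in statement.get("qualifiers", {}).get(VERSION_TYPE, [])
        if snak.get("snaktype") == "value"
    ]


def stable_version_strings(statements):
    """Version strings of statements qualified as stable, whatever their rank."""
    return [
        s["mainsnak"]["datavalue"]["value"]
        for s in statements
        if STABLE_VERSION in version_types(s)
    ]


if __name__ == "__main__":
    # Hypothetical statements: one release candidate, one stable release.
    sample = [
        {"mainsnak": {"datavalue": {"value": "4.0.0rc1"}}, "rank": "normal",
         "qualifiers": {"P548": [{"snaktype": "value",
                                  "datavalue": {"value": {"id": "Q1072356"}}}]}},
        {"mainsnak": {"datavalue": {"value": "4.0.0"}}, "rank": "preferred",
         "qualifiers": {"P548": [{"snaktype": "value",
                                  "datavalue": {"value": {"id": "Q2804309"}}}]}},
    ]
    print(stable_version_strings(sample))  # ['4.0.0']
```

Selecting on the qualifier keeps the rank free for its documented purpose, so release candidates can stay at normal rank without showing up in the infobox.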
As I already stated on w:fr:Discussion_modèle:Infobox_Logiciel#Version_avancée, I agree with Tagishsimon: Wikidata's ranks should remain as normal rank (and preferred rank for the latest version), and frwiki's template should be fixed. I plan to work further on this template in the coming weeks, as soon as I can, and this won't be an issue anymore.
"AFAIK, it's currently the best way to avoid displaying misinformation on Wikipedia…" This is not misinformation. It would be misinformation if the statement were wrong (e.g. the version stated has never been published, or the qualifier publication date (P577) has a wrong value).
Also, I think an abuse filter could be deployed to warn contributors about the deprecated rank of this specific property, because it is easy for new contributors (not those involved above, of course) to confuse the deprecation of the information on Wikidata with the deprecation of the software version itself, and because real cases of misinformation on software versions are not really frequent. — Metamorforme42 (talk) 23:20, 6 November 2022 (UTC)

Taxobox module

A lot of work has been done on Module:Taxobox with the goal of replacing the classic template on Wikipedia. Only 16 language versions have Module:Taxobox (Q18091359) (a fraction of them actually calling it in articles) compared to 183 with Template:Taxobox (Q52496). The revival of this project could mean removing literally millions of template subpages on Wikipedia (and numerous corresponding items here), and easier maintenance. Is there interest in setting a development and WP-testing roadmap? (ping @Vikiolog) Louperivois (talk) 02:07, 7 November 2022 (UTC)

Scholar

de:Scholar and eo:Vagolernanto are clearly not the same as scholar (Q2248623), but I am not allowed to edit them out. - Aunty Gormint (talk) 15:28, 6 November 2022 (UTC)

Which item do these sitelinks belong to? clerici vagantes (Q2078646) looks feasible but there is already de:Vaganten and eo:Vagantaj klerikoj attached to this item. Are these duplicates? — Martin (MSGJ · talk) 09:38, 7 November 2022 (UTC)

Can someone merge Q92732011 into Q10134?

Can someone merge Q92732011 into Q10134?

Q92732011 claims to be a list of vocabulary, which is not correct, as all three indicated project pages are not lists of vocabulary; rather, they are language pages with various information about the language, including some vocabulary. That is covered by Q10134 for the language.

-- 65.92.246.191 08:06, 7 November 2022 (UTC)

Okay   Done — Martin (MSGJ · talk) 09:09, 7 November 2022 (UTC)

Titles erroneously used as occupation (P106), correction to noble title (P97) reverted

Sometimes, titles are erroneously used as occupations (example: academician (Q414528) in Muhammad Alhamid (Q61055308), Doctor (Q4618975) in Michelle Labbé (Q109937320)). I used to change occupation (P106) to noble title (P97) in such cases but have frequently been reverted (e.g. by Fralambert). Which is the property to go for in such a situation? academic degree (P512)? I also wonder about titles that neither belong to nobility nor are academic degrees (such as varatuomari (Q10714990), which appears to be a title for lawyers). Is there another property for titles? --2A02:8108:50BF:C694:A5D3:3F4B:D71C:DFCD 14:20, 6 November 2022 (UTC)

I am sure that no one becomes noble by obtaining a university degree. Maybe honorific suffix (P1035) or honorific prefix (P511) would be better, but I'm not sure. Fralambert (talk) 14:32, 6 November 2022 (UTC)
I’d be happy to use a different property, I just don’t know which. (Still waiting for other suggestions.) --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 09:50, 7 November 2022 (UTC)
@2A02:8108:50BF:C694:A5D3:3F4B:D71C:DFCD Can you please stop making this incorrect change? Doctor IS an occupation in several cases in France, and I’m completely sure that it is not a noble title (P97). Lyokoï (talk) 14:59, 6 November 2022 (UTC)
Depends which one: Doctor (Q4618975) isn't a profession. It's an academic title. physician (Q39631) is the profession. Mbch331 (talk) 15:51, 6 November 2022 (UTC)
@Fralambert Just as a side-note: There were cases where obtaining a university degree indeed meant becoming a noble. This was the case for certain degrees of the universities of Bologna and Avignon, see w:fr:Noblesse pontificale. --Emu (talk) 19:38, 6 November 2022 (UTC)

Harry Potter

Harry Potter: Wikipedia has no exact description of, for example, the spells, which is a disgrace for real Potterheads. This should be corrected immediately. 85.255.151.66 09:31, 7 November 2022 (UTC)

Please post German-language requests at Wikidata:Forum. At the moment, though, this one already fails at the sentence structure. --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 10:08, 7 November 2022 (UTC)

Where is the right place to request a mass link transfer?

Interaccoonale (talk) 14:48, 7 November 2022 (UTC)

@Interaccoonale Why don't you transfer the links yourself? If you activate the move gadget it's fairly straightforward. ChristianKl 14:55, 7 November 2022 (UTC)
I transferred some items manually. But it took a long time. I didn't know the gadget, thank you for it. Interaccoonale (talk) 14:57, 7 November 2022 (UTC)

Mangadex?

Hello. There is a property MangaDex title ID (P10589). However, this website looks barely legal. It seems like you can access copyright protected works for free without consent of the copyright holder. It is a website for scanlation (Q557923) which is clearly criminal from my point of view. Christian140 (talk) 11:23, 4 November 2022 (UTC)

@Christian140: Do you demand a Q228916 disclaimer which is commonplace on German websites? Don't stir up problems.--Pizos (talk) 12:10, 4 November 2022 (UTC)
I don't think wikidata or a wikidata editor should be held liable for linking to illegal websites. However, I don't see why wikidata should have a property to link to an illegal website. If something is allowed or not also depends on the country and their law, but this is clear copyright infringement. --Christian140 (talk) 13:46, 4 November 2022 (UTC)
I would like to point out that some works are not available in their translated form for purchase. Hence Christian140's claims are ignorant. But yes, I do think artists deserve getting paid for their work. Infrastruktur (talk) 13:51, 4 November 2022 (UTC)
Whether or not the works are available in their translated form for purchase changes little about the fact that it's a copyright violation. As the EnWiki article describes, there has been previous legal action against scanlation aggregator websites. Hosting links to individual manga might easily be understood as implying that Wikidata is itself a scanlation aggregator website that also deserves to be sued. I don't see a good reason why Wikidata/Wikimedia should take that legal risk.
@Christian140 please open a request for deletion at https://www.wikidata.org/wiki/Wikidata:Properties_for_deletion . That's our process for removing undesirable properties and the place where proper voting can take place. ChristianKl 17:42, 4 November 2022 (UTC)
He asserts that the content is not legal but fails to say which law it supposedly breaks. As far as I know, unless the content has a licensing deal outside of Japan, the legality of hosting translations is very much unclear. Most of the sites I know about take care to remove material that has been licensed abroad. Infrastruktur (talk) 00:36, 5 November 2022 (UTC)
No, the legal situation is very clear in pretty much all countries that are signatories to the Berne convention. Translations are derivative works that cannot legally be published without permission from the copyright holder. Japanese publishers and artists might not care much about these translations into foreign languages and don't always bother with pursuing legal action against these sites, but the sites are nonetheless clearly violating copyrights. And if disseminating copyvios is the main purpose of a website, it might not be a good fit for Wikidata. --2A02:810B:580:11D4:BC47:2B30:672C:6F9C 18:42, 5 November 2022 (UTC)
I don’t think it’s that clear. If the material in question is not commercially published (sold) in a country, the copyright holder cannot gain money from there, so no monetary rights are infringed upon, which leaves us with infringements of non-material intellectual property rights. Such infringements are hardly ever investigated unless the copyright holder requests that they be, opening up a legal grey area. That said, I don’t think either that Wikidata should touch that grey area. I’m not sure though whether it does here, since linking to a site does not mean endorsing it. --2A02:8108:50BF:C694:A5D3:3F4B:D71C:DFCD 10:37, 6 November 2022 (UTC)
Work doesn't need to be commercially published to be protected by copyright. Through the Berne convention there's mutual recognition of copyright between countries. Work that's copyrighted in Japan is also copyrighted automatically in other signatories of the convention.
It's unclear to me what you mean by "monetary rights". That's not a standard term in copyright law. ChristianKl 15:26, 7 November 2022 (UTC)

I started a request for deletion at Wikidata:Properties for deletion/P10589. --Christian140 (talk) 08:06, 7 November 2022 (UTC)

Main classes of data inconsistency in Wikidata

I'm drafting a paper aimed at getting academics into Wikidata. One of the sections aims to cover some aspects to be cautious of when querying, and I'm including a box of some common aspects (below).

Box 1: Main classes of data inconsistency in Wikidata

A) Item incompleteness. Since Wikidata is still in an exponential growth phase, it can be difficult to predict which topics will have already been well-developed by the community, and which are not yet well covered or linked out to external databases. For instance, at time of writing, many common bioinformatics techniques and equipment types are currently missing.

B) Statement incompleteness. The issue of incompleteness can affect any part of Wikidata's data model. For instance, for many items about people, there are no statements about their date or place of birth. In cases where multiple statements are common for a given property on a given item (e.g. someone's employer), some or all of them might be missing or outdated.

C) Language incompleteness. The language coverage for item labels and descriptions also varies, where core items (e.g. evolution) will be in hundreds of languages, whereas items towards the edge of the network (e.g. evolvability) may be in only a few languages or even just English.

D) Referencing incompleteness. For example at time of writing: the fact that the SARS-CoV-2 NSP9 complex is found in SARS-CoV-2 is referenced (from the EBI complexportal), but that it contains two NSP9 subunits isn’t referenced.

E) Classification & description disparity. For example at time of writing, principal component analysis is listed as a subclass of multivariate statistics, used for dimensionality reduction, but factor analysis is listed as a subclass of statistical method, used for looking for latent variables. Some inconsistency also stems from the fundamental lack of a single universal classification (‘how many countries exist?’ being a classic example).

F) Depth, specificity & granularity. For example at time of writing, ribosomal binding site is part of the untranslated region of prokaryotes and physically interacts with the ribosome of prokaryotes. But the 50S ribosomal subunit didn’t yet indicate that it physically interacts with the 30S subunit, or that it is part of the 70S ribosome, or that it’s found only in prokaryotes.

G) External databases & controlled vocabularies. Mapping to external databases can vary since some are proprietary or have other licensing issues, whilst others are simply incomplete, for example there is currently minimal mapping over to Research Resource Identifiers.

What do people think about this categorisation? Any ideas for other aspects to flag? T.Shafee(evo&evo) (talk) 02:49, 7 November 2022 (UTC)

I don’t understand F, but that’s mainly because I fail to map the example to “depth, specificity & granularity”. As for G, I remember there having been a discussion in German Wikipedia about the toil of getting external identifiers right, since some (VIAF was the example back then) tend to have complex data consistency issues (erroneously split/merged clusters and such). In addition, it seems that some aggressive bots would frequently override manual corrections (at least when claims were removed instead of deprecating them), a common source of frustration.
I’d like to add a point H: Hierarchical inconsistencies. Even minor errors in entity linking can lead to nonsensical conclusions when paths are followed. Examples (currently for instance of (P31) and subclass of (P279)) are being collected here. A related issue, which I’m unsure what to call, is the sloppy use of property values, such as having occupation (P106) statements with a value that is not actually an occupation (but a field of work, a good produced, a genre, a sport etc.). Admittedly, the relevance of this latter issue for getting academics into Wikidata is marginal at best. --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 10:06, 7 November 2022 (UTC)
There are areas in Wikidata where someone put work into organising them and others where that didn't happen. In areas where it didn't happen you often still have a bunch of errors and I would also count that as data inconsistency. ChristianKl 13:06, 7 November 2022 (UTC)
@Evolution and evolvability One additional thing to keep in mind is the problem of duplicates and conflations. --Emu (talk) 16:18, 7 November 2022 (UTC)
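As an illustration of the hierarchical inconsistencies raised as point H above, here is a minimal Python sketch of how a single wrong subclass of (P279) link can propagate when paths are followed. The graph, its labels, and the function names are all invented for this example and do not reflect real Wikidata statements.

```python
from collections import deque

# Toy graph: item -> list of "subclass of (P279)" parents.
# Every entry is invented for illustration, not taken from Wikidata.
SUBCLASS_OF = {
    "violinist": ["musician"],
    "musician": ["artist"],
    "artist": ["occupation_holder"],
    "violin": ["string instrument"],
    "string instrument": ["musical instrument"],
}

def superclasses(item, graph):
    """Return every class reachable from item by following parent links."""
    seen = set()
    queue = deque(graph.get(item, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen

def conclusions_added_by_link(graph, child, wrong_parent):
    """Return the extra superclasses that appear once one bad link is added."""
    before = superclasses(child, graph)
    # Copy the graph so the original toy data stays untouched.
    patched = {key: list(parents) for key, parents in graph.items()}
    patched.setdefault(child, []).append(wrong_parent)
    after = superclasses(child, patched)
    return after - before

extra = conclusions_added_by_link(SUBCLASS_OF, "violinist", "violin")
```

In this toy data, one erroneous link from "violinist" to "violin" makes every ancestor of "violin" an apparent ancestor of "violinist" as well, which is the kind of nonsensical conclusion described above.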

Invitation to attend “Ask Me Anything about Movement Charter” Sessions

You can find this message translated into additional languages on Meta-wiki.

Hello all,

During the 2022 Wikimedia Summit, the Movement Charter Drafting Committee (MCDC) presented the first outline of the Movement Charter, giving a glimpse on the direction of its future work, and the Charter itself. The MCDC then integrated the initial feedback collected during the Summit. Before proceeding with writing the Charter for the whole Movement, the MCDC wants to interact with community members and gather feedback on the drafts of the three sections: Preamble, Values & Principles, and Roles & Responsibilities (intentions statement). The Movement Charter drafts will be available on the Meta page here on November 14, 2022. Community wide consultation period on MC will take place from November 20 to December 18, 2022. Learn more about it here.

With the goal of ensuring that people are well informed to fully participate in the conversations and are empowered to contribute their perspective on the Movement Charter, three “Ask Me Anything about Movement Charter” sessions have been scheduled in different time zones. Everyone in the Wikimedia Movement is invited to attend these conversations. The aim is to learn about Movement Charter - its goal, purpose, why it matters, and how it impacts your community. MCDC members will attend these sessions to answer your questions and hear community feedback.

The “Ask Me Anything” sessions accommodate communities from different time zones. Only the presentation of the session is recorded and shared afterwards, no recording of conversations. Below is the list of planned events:

  • Asia/Pacific: November 4, 2022 at 09:00 UTC (your local time). Interpretation is available in Chinese and Japanese.
  • Europe/MENA/Sub Saharan Africa: November 12, 2022 at 15:00 UTC (your local time). Interpretation is available in Arabic, French and Russian.
  • North and South America/ Western Europe: November 12, 2022 at 15:00 UTC (your local time). Interpretation is available in Spanish and Portuguese.

On the Meta page you will find more details; Zoom links will be shared 48 hours ahead of the call.

Call for Movement Charter Ambassadors

Individuals or groups from all communities who wish to help include and start conversations in their communities on the Movement Charter are encouraged to become Movement Charter Ambassadors (MC Ambassadors). MC Ambassadors will carry out their own activities and get financial support for enabling conversations in their own languages. Regional facilitators from the Movement Strategy and Governance team are available to support applicants with MC Ambassadors grantmaking. If you are interested please sign up here. Should you have specific questions, please reach out to the MSG team via email: strategy2030@wikimedia.org or on the MS forum.

We thank you for your time and participation.

On behalf of the Movement Charter Drafting Committee,

MNadzikiewicz (WMF) (talk) 15:40, 7 November 2022 (UTC)

Wikidata weekly summary #545

Citizenship references

This discussion concerns the quality of sources in general, and country of citizenship (P27) specifically.

I have noted that various library catalogs (like Libris-URI (P5587) and NL CR AUT ID (P691)) are often used as sources for country of citizenship (P27). It is also common that various other websites and databases are used, such as AllMusic artist ID (P1728) and stated in (P248) People of Sweden database (Q10686037).

But only sources that actually make claims about citizenship can be used for claims about citizenship on Wikidata. None of the aforementioned sources do this, and very few sources do in general. Using these sources anyway would not be acceptable on Wikipedia, and I don't see why Wikidata should have drastically lower standards.

This is one of the reasons I find the property country of citizenship (P27) quite problematic. I wish that country (P17) or some other easily substantiated property/claim could be used instead, and that it would be supplemented by e.g. work location (P937) as needed. It seems much better to state which country or countries a person is associated with by reliable sources (and, if needed, the specific kinds of associations made), than to state their citizenship, for which reliable sources are few to none in the vast majority of cases. Förbätterlig (talk) 12:30, 28 October 2022 (UTC)

  • When we take an educated guess we have heuristic= , which explains how we have come to that conclusion. We had at one point over 1,000 people where we guessed their gender based on the heuristic, that they had a boy's given name or girl's given name. All of our data, even coming from the most reliable sources, have a built in error rate. --RAN (talk) 13:17, 28 October 2022 (UTC)
    Three observations:
    1. It is true that country of citizenship (P27) is sometimes sourced with sources that on closer inspection don’t really confirm the citizenship of a person. This should not happen.
    2. It is true that we rely heavily on heuristics for country of citizenship (P27) because there aren’t many sources that really care that much about the precise citizenship. But in such a case the source should be flagged as a heuristic, not as a positive fact from a given source.
    3. Sometimes I also want a property that can be used for a vague idea like “Italian in a somewhat cultural sense” or “German but in a sense that’s neither necessarily language nor ethnic group nor citizenship nor cultural background but somehow tangentially geographically linked to modern or past borders but not in every case” or “kinda sorta British but including some portions of Britain before Britain proper existed and also the Empire but only people who feel like British” or “feels like Austrian even though there is nothing Austrian about them”. But if I contemplate such a proposal, I invariably come to the conclusion that this would be a terrible idea. Not least because such a property would result in never-ending quarrels.
    --Emu (talk) 14:06, 28 October 2022 (UTC)
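    To make the second observation concrete, here is a minimal Python sketch that tells apart a reference flagged as a heuristic from one presented as a positive statement of a source. Representing a reference as a plain dictionary keyed by property ID is a simplification assumed for this example, and the item IDs used as values are hypothetical placeholders.

```python
def classify_reference(reference):
    """Classify one reference, given as a dict of property ID -> value.

    The dict layout is a simplification assumed for this sketch.
    P887 = based on heuristic, P248 = stated in, P854 = reference URL.
    """
    if "P887" in reference:
        return "heuristic"
    if "P248" in reference or "P854" in reference:
        return "stated source"
    return "unsourced"

# Hypothetical placeholder values, not real item IDs.
flagged_guess = {"P887": "Q_SOME_HEURISTIC"}
catalog_claim = {"P248": "Q_SOME_CATALOG"}

labels = [
    classify_reference(flagged_guess),
    classify_reference(catalog_claim),
    classify_reference({}),
]
```

    A check like this only shows how a statement is flagged. It cannot tell whether the cited source actually makes a claim about citizenship, which is the concern of the first observation.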
Answering both RAN and Emu.
Yes, there is always a margin of error, but gender and citizenship are not comparable in this regard.
When it comes to the types of sources I have mentioned, the information therein as well as the sources themselves aren't nearly sufficient enough for the guess to be called educated. They do not contain, nor can they be assumed to have access to, information about any person's legal status.
For example, Libris-URI (P5587) entries are deliberately vague and say e.g. "Nationalitet/verksamhetsland: Sverige", which covers everyone whose nationality is assumed to be Swedish and/or anyone who has worked in Sweden, i.e. everybody who is somehow associated with Sweden in the long term. It is very vague, yet still it is repeatedly used as source for country of citizenship (P27). Adding properties like based on heuristic (P887) and determination method (P459) does not fix this problem, because these kinds of sources do not provide a basis for a citizenship heuristic at all. What we are inevitably left with, in most cases, is an incorrectly sourced and quite uneducated guess.
This is not nearly the same thing as assuming a person's gender based on a gendered name. (Just like assuming someone's formal relationship with a particular state is not the same thing as assuming that they can reasonably be associated with a particular country.) Also, many sources do in fact contain information about sex or gender.
I understand that a less specific property also presents certain problems, but those problems are already part of Wikipedia and the other projects, and have been since the beginning: WP has always been calling people British, American, Chinese, etc without necessarily providing sources. Again, I'm not saying that this isn't problematic in itself, just that using a vague country property doesn't add a problem that wasn't already there. However, adding and preferring country of citizenship (P27) definitely does. And what exactly do we gain in return?
Furthermore, the focus on citizenship also presents other, minor problems, such as non-independent countries. The possible values are limited to entities where a person can become a citizen. For example, everyone from Greenland and the Faroe Islands are citizens of Kingdom of Denmark (Q756617), but can generally only be meaningfully categorized as Greenlandic and Faroese. I assume that ethnic group (P172) can solve this problem, but this particular claim currently requires "a VERY high standard of proof", to the point that poorly based claims are being deleted by bots. country of citizenship (P27) does not have the same requirement, interestingly.
So to me, relatively vague statements like country (P17) seem like the least complicated alternative, i.e. disregarding citizenship and ethnicity completely, not adding them unless adequate sources can be provided. For those few cases where there isn't a single reliable source that even mentions a nationality or country (P17), the person in question most likely isn't notable enough anyway. Förbätterlig (talk) 15:12, 28 October 2022 (UTC)
Yes, I take your point. But on the other hand: What would we gain by introducing a property that is so intentionally vague that it encompasses pretty much everything from cases like Faroese to concepts like “German”? I find it hard to imagine a plausible use case for a property devoid of any meaning. It might be better to focus on clear problems like the Greenlandic/Faroese case and similar cases (South Tyrol springs to mind). Much of the rest could be dealt with by residence (P551) and work location (P937) anyway. --Emu (talk) 15:26, 28 October 2022 (UTC)
Thank you for your response. I actually think you make a very good point about South Tyrol, because the specific state or autonomous and/or cultural region is more relevant than the country in many biographies, especially in areas where state borders have changed a lot, and so it seems to me that a "vague property" could actually be more accurate in this regard.
But the vagueness of country (P17) or another country (P17)-like property is still problematic, yes. I can understand how and why country of citizenship (P27) has become the preferred option, so the reason I object to it is mainly because of secondary problems related to its implementation. It's still a good idea. With country (P17) instead, we would have to be crystal clear in the description of what claim the property actually makes and what it doesn't mean (kind of like my library example, "nationality/country of operation"). Förbätterlig (talk) 10:33, 29 October 2022 (UTC)

I've checked a few statements referenced by NL CR AUT ID (P691) and it seems that many of them were added by Robbot operated by Andre Engels. I would be very careful about this. That someone is called "American" in the catalog description doesn't mean anything. Perhaps we should remove all references with NL CR AUT ID (P691) pertaining to country of citizenship (P27). Vojtěch Dostál (talk) 16:14, 28 October 2022 (UTC)

@Neme12: I would suppose that P887 is more appropriate, but that's just because I frequently see it on script-assisted edits that use it for sex or gender (P21) (e.g. Q113499189). Unfortunately P3452 lacks exemplar usages, but it seems to be often used for discrete items like a specific person, place, or event rather than more abstract, general concepts like "place of birth" or "date of baptism". The sad truth of the matter, I fear, is that nobody really knows what's "best", or those that do don't document it well. Wikidata still has massive structural confusion and many properties lack well codified, readily accessible norms (lots of "norms" are buried deep in the archives of talk pages, or obscure abandoned projects, or the minds of power editors, rather than anything like a manual of style). Almost everything is a hodgepodge of "whatever the previous person did" and "whatever works". -Animalparty (talk) 02:22, 8 November 2022 (UTC)
Thanks, I see, the difference of based on heuristic (P887) being limited to a well-defined set of heuristics, or any item that is an instance of heuristic (Q201413) (even if it starts with the words "inferred from") whereas inferred from (P3452) pointing to a specific item makes sense to me, although I can't quite imagine a scenario for the second one (How is it inferred from the other item? Does the other item have a statement that is related (maybe an inverse statement) to the one being referenced? If so, why not just use the same reference as the other statement?) But yeah, I'm new here and noticed that the data isn't really described in any consistent way or consistent structure at all :( to the point that I'm having a hard time imagining Wikidata being useful for any machine use (unless it's something that incorporates AI and is able to understand unstructured data anyway like Google Knowledge Graph), because the same thing is often described in a few different ways on different items and that's something only a human can recognize. :( Neme12 (talk) 03:03, 8 November 2022 (UTC)
  • Getting notified of this discussion because my username was linked, I tend to agree with Förbätterlig that it would be good to have country (P17) or something similar opened up for people that are connected to a country, but of which we do not know whether they were citizens or just long-time inhabitants. As an extra argument I would like to bring in that the whole concept of 'citizenship' as we know it nowadays is relatively recent. People, or at least most people, from before the time of the French revolution, would not be a citizen of any country. They could be a citizen of a city in the country, or a member of the nobility of the country, or something like that, but a citizenship of a country such that you either are or are not a citizen from birth, and if you're not you can become one through some formal procedure, I don't think such a thing really existed. So if we go by the rules as they officially are, we currently cannot say anything about the country of people from the 18th century or before. - Andre Engels (talk) 19:56, 28 October 2022 (UTC)
    If they were long term inhabitants (and this wouldn't exclude the possibility of them additionally being a citizen as well), the correct property to use would be residence (P551). I really can't see why something meaningless like country (P17) would be needed or useful at all, because everyone who presented an example of its usefulness has in fact presented a scenario where the subject has been an inhabitant of that country and therefore residence (P551) is the correct property. Neme12 (talk) 02:16, 6 November 2022 (UTC)
  • For a period of history, I think people in quite a few countries were British subjects, only becoming citizens of particular countries in the late 1940s, after the independence of India and Pakistan. That was the case for Australia and New Zealand. Before that time, a "New Zealander" or "Australian" in many cases was just a British subject who happened to live or be born in those places (the attitude to indigenous people or people from other countries such as China may have varied). I'm not sure that you can really speak of a separate ethnicity, either, since it's basically just British. Dates of birth can also be badly sourced. In Wikipedia, you aren't supposed to cite sites like IMDB, but apparently an unsourced date of birth is often accepted (and probably just copied from a site like IMDB). Ghouston (talk) 10:03, 29 October 2022 (UTC)
    "That someone is called 'American' in the catalog description doesn't mean anything."
    True! That's precisely what I meant. But, as for Wikidata, I don't think we should be afraid of a "meaningless" property. Imagine a person who is born in Somalia, relocates to Sweden and becomes a notable writer there, before they eventually move back to Somalia and die there. I find this to be the most meaningful and informative way to describe this situation:
    place of birth (P19): Somalia
    country (P17): Somalia
    country (P17): Sweden
    place of death (P20): Somalia
    In this scenario, however, there is a reliable source (such as one based on documents from Swedish authorities) which states that this person was actually a citizen of Ethiopia. This is how the Wikidata item will have to look:
    place of birth (P19): Somalia
    country of citizenship (P27): Ethiopia
    place of death (P20): Somalia
    The first solution doesn't really mean a lot compared to the second one, true, but it still makes a lot more sense. (country (P17) can be swapped for a more suitable property.) Förbätterlig (talk) 12:12, 29 October 2022 (UTC)
    Why not just use residence (P551) instead of country (P17) in those cases? --Emu (talk) 13:29, 29 October 2022 (UTC)
    Yes, that is a more suitable property in this case. But the point here is to address the issues that arise when country of citizenship (P27) is the norm. Förbätterlig (talk) 20:21, 29 October 2022 (UTC)
    This made-up example reminds me of a real one: Jon Milos (Q107458533), a writer who was born in the Serbian part of the traditionally Romanian-speaking region Banat, in then Yugoslavia. Banat is the most accurate and relevant place name in his case, but Wikidata would probably prefer Kingdom of Yugoslavia (Q191077), because of the focus on citizenship. I see now that I've only added Sweden (Q34)! (I actually have no idea what his citizenship status was, so it was just an assumption.)
    In his Swedish Libris-URI (P5587), the "nationality/place of operation" is simply "unknown", but under his GND ID (P227), the "Country" section says Serbia, Romania and Sweden. He was born in present-day Serbia, lived and worked most of his life in Sweden, where he died, but he was also a prominent poet in Romania. I'd prefer mentioning Banat and/or Yugoslavia rather than Serbia, but I do agree with the decision to mention several countries in that manner. Förbätterlig (talk) 12:24, 29 October 2022 (UTC)
    I have expanded Jon Milos (Q107458533) a little bit. Wikidata now knows his place of birth (P19) (and through this item all the necessary historic country information), his place of death (P20) (and hence the country he died in), his country of citizenship (P27) (although it’s unsourced and probably incomplete), his ethnic group (P172), some of the schools/universities he attended via educated at (P69) (and through those items their location including country information), also residence (P551), native language (P103), languages spoken, written or signed (P1412) and writing language (P6886). That should cover the information you want to include in a more nuanced way, I think. --Emu (talk) 15:15, 29 October 2022 (UTC)
    Thank you for your contributions!
    I've been trying to figure out how to source someone's Swedish citizenship, but I'm still not entirely sure how. In Sweden, this information is normally easily accessible, so anyone can write to Swedish Tax Agency (Q836916) and ask for it. I assume it's then possible to cite the source like this:
    But there are some limits to this: If someone is a Swedish citizen, their Swedish citizenship is the only one that will be registered in Sweden. Additional sources will be needed. However, if that someone is not a Swedish citizen but a registered inhabitant of Sweden, their foreign citizenship(s) will be registered. All this information can be accessed and verified from abroad as well.
    Several questions remain:
    • Is it even likely that users will do the necessary research for all instances of country of citizenship (P27)?
    • What to do with the multitude of cases where information isn't accessible and verifiable? Wikidata should not keep accepting any type of source regardless of what it actually says or how reliable it actually is.
    • I'm not sure about automatically adding/sourcing citizenship based on various legal heuristics, unless the country of citizenship (P27) description is changed to "the object is a country that recognizes or has recognized the subject as its citizen". A person's citizenship status can change over time.
    • Everything else that has been mentioned, but not addressed, in this discussion.
    Förbätterlig (talk) 15:34, 30 October 2022 (UTC)
    It is true that country of citizenship (P27) causes a lot of problems, to name a few:
    1. It doesn’t really make sense pre-1800s and next to no sense pre-1648 but is used anyway for those people.
    2. Wrong values abound. There are 69,398 items with a citizenship of a state that didn’t exist during the life of the person in question. There are 13,801 items with Austria-Hungary (Q28513) although there has never been a unified citizenship for this country. And we rely on highly dubious sources in many cases if we (including me) aren’t just using our very own gut feeling heuristics.
    3. It's often very hard to source the citizenship of many dead people (although it’s better for jus soli (Q604971) countries where you can at least assume citizenship for free, upperclass non-diplomats in most cases).
    4. It’s almost impossible to properly source the citizenship of many living people unless they are providing the information themselves (although they may lie or even be unsure in some cases). Sweden seems to be an exception (its openness even about people’s tax returns is legendary and often cited in Austrian media). In Austria, you can safely assume citizenship for people who hold public office on the state or national level and you can find out for a fee if people own real estate. That’s about it.
    So yes, country of citizenship (P27) is a problem. But it’s here to stay with all its headaches. --Emu (talk) 18:13, 30 October 2022 (UTC)
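    The second problem in the list above, a citizenship of a state that did not exist during the person's life, can be sketched as a simple interval-overlap check. This is a minimal Python illustration under stated assumptions: years only, invented example data, and hypothetical class and function names. It is not the query behind the counts quoted above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Lifespan:
    birth_year: int
    death_year: Optional[int]  # None: still alive or death year unknown

@dataclass
class State:
    name: str
    inception_year: int
    dissolved_year: Optional[int]  # None: the state still exists

def state_overlaps_life(person, state, current_year=2022):
    """Return True if the state existed at any time during the person's life."""
    end_of_life = person.death_year if person.death_year is not None else current_year
    end_of_state = state.dissolved_year if state.dissolved_year is not None else current_year
    return state.inception_year <= end_of_life and person.birth_year <= end_of_state

# Invented example data, not real Wikidata values.
example_state = State("Example State", 1800, 1900)
too_early = Lifespan(1700, 1760)
contemporary = Lifespan(1850, 1920)

flags = (
    state_overlaps_life(too_early, example_state),
    state_overlaps_life(contemporary, example_state),
)
```

    A check of this kind only catches anachronisms. It says nothing about whether the citizenship is correctly sourced, which is the subject of the other points.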
    I do think it should stay, and I understand that consensus must have been reached at some point and why. But it shouldn't be the go-to property. I seriously doubt that anyone intended for it to have these consequences, i.e. the widespread poor-quality data that we have described in this discussion. It has to be possible to reevaluate. Förbätterlig (talk) 01:17, 31 October 2022 (UTC)
    Indeed, I don't think anyone is proposing to remove or even significantly redefine country of citizenship (P27). At least I am not. Rather, what I would be proposing is having another property (whether country (P17), another one or one to be defined) for 'nationality', 'closely related country' or somesuch, and then use that for cases where we have a hunch of someone's citizenship but no sufficiently hard evidence, and keep country of citizenship (P27) (presumably in combination with the other one) for those cases where we do have solid evidence. - Andre Engels (talk) 06:59, 31 October 2022 (UTC)
    @Andre Engels https://www.wikidata.org/wiki/Wikidata:Property_proposal/Person#cultural_identity is the latest discussion about adding a new property about this kind of information. ChristianKl 19:23, 2 November 2022 (UTC)
    Regarding this part:
    • I'm not sure about automatically adding/sourcing citizenship based on various legal heuristics, unless the country of citizenship (P27) description is changed to "the object is a country that recognizes or has recognized the subject as its citizen". A person's citizenship status can change over time.
    As far as I understand it, unless the statement specifies one of point in time (P585), start time (P580) or end time (P582) as qualifiers, it makes no statement about *when* or *at which point in time* this fact about the citizenship was true, just that it was true *at one point*. If none of those time-related qualifiers is specified, it doesn't automatically mean that the statement was true by the end of the person's life or that it is true today if the person is still alive; it means that we don't know the time range, so it actually *is* equivalent to "the object is a country that recognizes or has recognized the subject as its citizen" unless a time-related qualifier is specified. Of course, if someone is adding country of citizenship based on assignment of citizenship at birth by being born in that country, they could add the qualifier start time (P580) with the value of the person's birth date. This would represent the fact that they were born with that citizenship but we don't know if the citizenship was ever revoked. It would still make no statement about it having never been revoked; that could still be a possibility. Neme12 (talk) 01:55, 6 November 2022 (UTC)
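    Editorial sketch: the reading of time qualifiers described above can be written down as a small Python function. The class and function names are hypothetical and years stand in for full dates; this is only an illustration of the interpretation in the comment, not how Wikibase itself evaluates statements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CitizenshipStatement:
    """Hypothetical, simplified stand-in for a P27 statement with time qualifiers."""
    country: str
    start_time: Optional[int] = None     # start time (P580), year only for brevity
    end_time: Optional[int] = None       # end time (P582)
    point_in_time: Optional[int] = None  # point in time (P585)


def describe_validity(stmt: CitizenshipStatement) -> str:
    """Spell out what the statement does and does not claim about time."""
    if stmt.point_in_time is not None:
        return f"citizen of {stmt.country} in {stmt.point_in_time}"
    if stmt.start_time is None and stmt.end_time is None:
        # No time qualifier at all: true at some point, time range unknown.
        return f"citizen of {stmt.country} at some point (time range unknown)"
    start = stmt.start_time if stmt.start_time is not None else "unknown"
    end = stmt.end_time if stmt.end_time is not None else "unknown"
    return f"citizen of {stmt.country} from {start} until {end}"
```

    For example, `describe_validity(CitizenshipStatement("Sweden", start_time=1950))` reports a known start and an unknown end, which matches the "born with that citizenship, revocation unknown" case.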
    Regarding:
    Imagine a person who is born in Somalia, relocates to Sweden and becomes a notable writer there, before they eventually move back to Somalia and die there. I find this to be the most meaningful and informative way to describe this situation:
    place of birth (P19): Somalia
    country (P17): Somalia
    country (P17): Sweden
    place of death (P20): Somalia
    You're proposing adding the country (P17) property based on where that person has *lived*, but there is already a suitable property to represent where the subject has lived, and that is residence (P551). In my opinion, a better way to describe this scenario would be this:
    place of birth (P19): Somalia
    residence (P551): Somalia
    residence (P551): Sweden
    place of death (P20): Somalia
    Having read all of this discussion, I fail to see why using some meaningless property like country (P17) would be useful or have any purpose at all. Correct me if I'm wrong but basically everyone who proposed using a meaningless property such as country (P17) in this discussion has used it to represent the countries where the person has lived, but as I said, there's already a perfectly good property for that - residence (P551). Neme12 (talk) 02:09, 6 November 2022 (UTC)
    Also, residence (P551) has an advantage in that it doesn't have to be a country. It can be more specific, such as a state, province, county, municipal area, city/town/village etc. depending on what is known. It also doesn't have the issue of being tied to a political entity (like a country) that may not have existed when the subject lived, as it can point to a geographical area or cultural region instead of a political entity. Anything that has an associated geographical area or location. Neme12 (talk) 02:59, 6 November 2022 (UTC)
    Oh, I assumed there is already a property called "nationality" which would represent what country/nation the person (most) identified with. Now I see that there's no such thing and the closest properties are country of citizenship (P27) and ethnic group (P172), both of which are inappropriate for different reasons. Now I understand the need for a generic property like country, although I would call it nationality so that it doesn't have to be tied to recognized countries and it wouldn't have the problem in cases where that country didn't even exist yet when the subject lived. Such a vague property (because it's subjective based on how the person identified) could have vague values, values that aren't necessarily tied to actual countries. Neme12 (talk) 16:59, 6 November 2022 (UTC)

@Infovarius Please read this discussion and do not restore any more insufficient sources for claims about citizenship specifically! Förbätterlig (talk) 11:02, 6 November 2022 (UTC)

What are you talking about? I'm not talking about citizenship and didn't say anything about specific sources, and I'm definitely not restoring any actual data about citizenship anywhere. Can you please give me a quote of what you're replying to? (And I'd appreciate if you didn't use exclamation marks when talking to me as if you're raising your voice). Neme12 (talk) 15:38, 6 November 2022 (UTC)
@Neme12 I addressed Infovarius, who has been using Libris/Allmusic to substantiate a citizenship claim. I apologize for the confusion. I have now adjusted the indentation so it doesn't seem like I'm replying to your post. Förbätterlig (talk) 15:44, 6 November 2022 (UTC)
Oh I see, I'm sorry, I thought it was a reaction to me since it looked like a reply to my comment by the indentation. Neme12 (talk) 15:52, 6 November 2022 (UTC)
@Neme12 Yes, I used the reply function without thinking! You actually make good points (both here and in the paragraphs above). Förbätterlig (talk) 11:16, 7 November 2022 (UTC)

formatter URL

Can someone explain why the link to the external site is not working? See Shironet song ID (P4035): the formatter URL (P1630) = https://shironet.mako.co.il/artist?type=lyrics&lang=1&$1, but in Q61126138 (and all the others) the link does not work properly. Geagea (talk) 12:54, 3 November 2022 (UTC)

I suspect this is @ArthurPSmith: territory; the ID value is being URL encoded when it should not be? See https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2022/03#URL_encoding_snafu --Tagishsimon (talk) 13:22, 3 November 2022 (UTC)
So theoretically https://wikidata-externalid-url.toolforge.org/?p=10431&url_prefix=https://shironet.mako.co.il/artist?type=lyrics&lang=1&$1 should work? Geagea (talk) 13:31, 3 November 2022 (UTC)
I defer to Arthur, who has the advantage of knowing what they're talking about. The problem seems to me to be unwanted URL encoding of the statement values; on the face of it the formatter URL is fine, although it may be that a different formatter URL property, such as third-party formatter URL (P3303), is required. --Tagishsimon (talk) 13:36, 3 November 2022 (UTC)
@Geagea, Tagishsimon: The problem is the '&' - the Wikidata UI URL-encodes it and so it's not passed in to the site correctly. Geagea's suggestion for a wikidata-externalid-url link should work, yes. ArthurPSmith (talk) 14:58, 3 November 2022 (UTC)
t/y. Another thing @Geagea: might wish to know is there is, iirc, latency in the job queue, so an update to the formatter URL may take ~24 hours to propagate to items? --Tagishsimon (talk) 15:04, 3 November 2022 (UTC)
ok, thanks to all. Geagea (talk) 17:04, 3 November 2022 (UTC)
ArthurPSmith, it didn't work. See this for example. Geagea (talk) 16:28, 5 November 2022 (UTC)
@Geagea: Sorry, your formatter was missing the 'id=' piece. I've edited it on the property page, so it should work going forward (with the usual delay). ArthurPSmith (talk) 21:51, 7 November 2022 (UTC)
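Editorial sketch: the '&' problem discussed in this thread can be reproduced with Python's standard library. The ID value below is hypothetical (it is not taken from any item), and using `urllib.parse.quote` to stand in for whatever encoding the Wikidata UI applies is an assumption; the thread only states that the '&' gets URL-encoded.

```python
from urllib.parse import quote

# Formatter URL quoted in the thread; $1 is the placeholder for the ID value.
FORMATTER_URL = "https://shironet.mako.co.il/artist?type=lyrics&lang=1&$1"


def expand(formatter_url: str, external_id: str, encode: bool) -> str:
    """Substitute the ID for $1, optionally percent-encoding it first."""
    value = quote(external_id, safe="") if encode else external_id
    return formatter_url.replace("$1", value)


# Hypothetical ID value that itself contains '&' and '='.
example_id = "prfid=1&wrkid=2"

encoded_link = expand(FORMATTER_URL, example_id, encode=True)
raw_link = expand(FORMATTER_URL, example_id, encode=False)

print(encoded_link)  # the ID arrives as one opaque, percent-encoded chunk
print(raw_link)      # the ID arrives as separate query parameters
```

With encoding, the site receives a single parameter name containing `%3D` and `%26` instead of two separate parameters, which is why the generated links fail.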

Instance of ontology?

scanlation (Q557923) has instance of (P31) ontology (Q324254) (“specification of a conceptualization”), which I don’t understand. Scanlation probably isn’t a specification of a conceptualization. That item being linked, however, prominently from Wikidata:Properties for deletion/P10589, I wonder whether anyone else has just overlooked this or whether there is some sense to it. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 10:21, 9 November 2022 (UTC)

Scanlation can't possibly be an instance of that, as it is an action that one can take, not a specific thing, according to enwiki. -wd-Ryan (Talk/Edits) 22:17, 9 November 2022 (UTC)
Most specific things aren’t ontologies either (but Wikidata is one, or aims at being one). So this statement is indeed incorrect. Which leaves us with the question which item to use instead for instance of (P31) here. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 11:55, 10 November 2022 (UTC)
I notice that there has been no objection to removing scanlation (Q557923) instance of (P31) ontology (Q324254). What remains is the question what scanlation (Q557923) is an instance of instead. I’m unsure about this given the recently added subclass of (P279) statements (scanlation (Q557923) subclass of (P279) translation (Q7553), scanlation (Q557923) subclass of (P279) scanning (Q11873863)). Possibly scanlation (Q557923) instance of (P31) activity (Q1914636). One could also argue for scanlation (Q557923) instance of (P31) copyright infringement (Q647578) – or should that be a subclass of (P279), too? --2A02:8108:50BF:C694:29E6:BE9C:1625:78C8 20:14, 12 November 2022 (UTC)
Okay, I’ll now go for scanlation (Q557923) instance of (P31) activity (Q1914636) and scanlation (Q557923) subclass of (P279) copyright infringement (Q647578). --2A02:8108:50BF:C694:C89D:B1AC:1FCB:92D7 23:32, 13 November 2022 (UTC)
This section was archived on a request by: --2A02:8108:50BF:C694:C89D:B1AC:1FCB:92D7 23:39, 13 November 2022 (UTC)

Lancashire is a fortification, to my surprise

The chain is Lancashire (Q67311452), county palatine (Q2991489), kaiserpfalz (Q1796442), castle (Q23413). The problem seems to be county palatine (Q2991489) as a territory being a subclass of a building type, but I'd like someone to check the implications of breaking the link. Vicarage (talk) 08:06, 7 November 2022 (UTC)

In my opinion, county palatine (Q2991489) subclass of (P279) kaiserpfalz (Q1796442) is nonsense, not only because of the territory vs. building type issue, but also because the latter is specific to the Holy Roman Empire in the Middle Ages, which is not true for a general county palatine (Q2991489). (I suspect a mixup from the German side here.) --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 10:39, 7 November 2022 (UTC)
  Fixed by removing subclass. Vicarage (talk) 08:03, 8 November 2022 (UTC)
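Editorial sketch: the effect of breaking the link can be checked on a toy copy of the chain from the opening post. The graph below contains only the four items named in this thread, with instance-of and subclass-of collapsed into one kind of edge for brevity; the function names are hypothetical and nothing here queries Wikidata.

```python
from typing import Dict, List, Optional

# Simplified "is a kind of" links named in the thread (P31/P279 collapsed).
LINKS: Dict[str, List[str]] = {
    "Lancashire (Q67311452)": ["county palatine (Q2991489)"],
    "county palatine (Q2991489)": ["kaiserpfalz (Q1796442)"],
    "kaiserpfalz (Q1796442)": ["castle (Q23413)"],
}


def find_chain(links: Dict[str, List[str]], start: str, target: str) -> Optional[List[str]]:
    """Depth-first search for a path from start to target; None if unreachable."""
    stack = [[start]]
    seen = set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            return path
        if node in seen:
            continue
        seen.add(node)
        for parent in links.get(node, []):
            stack.append(path + [parent])
    return None


def without_link(links: Dict[str, List[str]], child: str, parent: str) -> Dict[str, List[str]]:
    """Return a copy of the graph with one child -> parent edge removed."""
    return {
        node: [p for p in parents if not (node == child and p == parent)]
        for node, parents in links.items()
    }


before = find_chain(LINKS, "Lancashire (Q67311452)", "castle (Q23413)")
fixed = without_link(LINKS, "county palatine (Q2991489)", "kaiserpfalz (Q1796442)")
after = find_chain(fixed, "Lancashire (Q67311452)", "castle (Q23413)")
```

In this toy graph `before` is the full four-item chain and `after` is `None`, i.e. Lancashire no longer resolves to a castle once the subclass statement is removed.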

How to most easily make bulk changes to incoming items?

David Sanders (Q5239461) currently conflates at least two different people with the same name: an American biochemist (David Sanders (biologist) on English Wikipedia) and a South African public health researcher (the former David Sanders (Q47817416), ORCID id: 0000-0003-1094-7655), which was errantly merged in 2018. Most of the incoming links appear to be papers (co-)authored by the South African researcher or other people with the same name. A simple revert of the merge is a first step, but won't change incoming links. Are there simple tools to make bulk reassignments, e.g. correct the author (P50) for dozens of papers? -Animalparty (talk) 01:12, 8 November 2022 (UTC)

Author Disambiguator can do that for author (P50) statements. See the last sentence of the first paragraph here. --Quesotiotyo (talk) 09:19, 8 November 2022 (UTC)

Please add this flag image to (Q490920) Berks County

Could someone please add this file to (Q490920) Berks County, Pennsylvania as the flag? Thanks!

 
Flag of Berks County, Pennsylvania

Physeters (talk) 04:14, 8 November 2022 (UTC)

@Physeters:   Done Vahurzpu (talk) 04:52, 8 November 2022 (UTC)
Thanks @Vahurzpu! Physeters (talk) 08:53, 8 November 2022 (UTC)

What to use instead of P642 here?

disease X (Q50410669) previously had instance of (P31) infectious disease (Q18123741) - which it is not, it is a placeholder name for a hypothetical future disease. So I've turned it to have instance of (P31) placeholder name (Q1318274) instead, but I feel it should still be connected to "infectious disease" somehow. How do you model "Is a placeholder name for a (hypothetical) infectious disease"? I was about to use of (P642) as a qualifier, but that one yells "don't use me!" El Grafo (talk) 09:06, 3 November 2022 (UTC)

What about subclass of (P279) : infectious disease (Q18123741), does it make sense to you? It sort of does to me. Vojtěch Dostál (talk) 11:19, 3 November 2022 (UTC)
Mmmh, no, not quite. "Disease X" is not a specific disease, nor a kind/group of diseases. As far as I understand, this is not like calling an unidentified patient "John Doe" (patient has materialized but you don't currently know their name). It's more like putting "Pastor/Rabbi/... X" on the seating plan of your wedding because you have rented the place and want to arrange the tables, but you don't know yet who will be your priest (once the priest has been chosen, you will also know their name). Does that make any sense? El Grafo (talk) 12:28, 3 November 2022 (UTC)
I’d recommend a qualifier, but I’m not sure either which one to use. Possibly replaces (P1365), or subject has role (P2868) perhaps. It seems that so far, there are no qualifiers associated with a statement instance of (P31) placeholder name (Q1318274). --2A02:8108:50BF:C694:530:6744:9D46:5AA1 12:28, 3 November 2022 (UTC)
To me subclass makes the most sense. I do think that one can speak of various possible diseases as cases of disease X and thus it's a class for those. ChristianKl 16:45, 3 November 2022 (UTC)
Maybe; if “every disease X is an infectious disease” makes sense, and it can. So far, I’ve understood “disease X” as a placeholder, not a class; Covid may have been a “disease X” when the first wave of infections swamped the world, but certainly isn’t any more – on the other hand we don’t know the next “disease X” yet. The class an item is an instance of should not conceptually be something time-dependent. It can obviously change as our knowledge evolves – hard to tell which is the case for Covid (or any other “disease X”). --2A02:8108:50BF:C694:530:6744:9D46:5AA1 17:48, 3 November 2022 (UTC)
Perhaps in the same way that a wargame (Q1501543) isn't a war, and doesn't bother to link to War. Ghouston (talk) 10:04, 4 November 2022 (UTC)
I don't think wargames are an equivalent for disease X. Event 201 exercise is the kind of thing that would be equivalent to a wargame. ChristianKl 14:24, 5 November 2022 (UTC)
I suppose one can think of it like that - better than nothing, imho. Thanks, everyone. El Grafo (talk) 11:22, 8 November 2022 (UTC)

Definition question (Dutch speakers)

Hello,

may you please help me differentiating ground (Q3646210) and soil (Q36133)? If you miss information, there is also soil type (Q10926413).~~ Батальйон Нахтігаль (talk) 19:35, 8 November 2022 (UTC)

According to [1], the two terms tend to be used interchangeably, but "bodem" can be viewed as the combination of "grond", water, air and soil life. "Grond" is used more as "ondergrond", presumably deeper layers of relatively lifeless soil. Ghouston (talk) 09:15, 9 November 2022 (UTC)

ResPublicae.eu mastodon addresses

Back in April @Nemo bis: created a Mastodon instance - respublicae.eu - that mirrored a few thousand Twitter accounts into the Fediverse, and then added these Mastodon addresses to Wikidata (edit group). To be fair, these are qualified: item → Mastodon address (P4033): x@respublicae.eu, with qualifiers object has role (P3831): unofficial (Q29509080) and object has role (P3831): mirror storage (Q654822). However, I'd like to propose that these are all removed/the edit group reverted.

First, because it's somewhat spammy and (IMHO) a conflict of interest to add thousands of (what are essentially) links to one's own service. Secondly it renders queries like this, posted above in Wikidata weekly summary #545 essentially unusable because almost every response to that query is for x@respublicae.eu and these aren't the subject's actual mastodon addresses.

Thoughts? --M2Ys4U (talk) 20:52, 9 November 2022 (UTC)

If these were removed I'd understand, but I don't think it's just "spam".
The reason I added the links to Wikidata is to show how useful Wikidata is. I had to do a lot of scouting to find the correct accounts of (current or former) EU officials because this kind of information wasn't open data anywhere. I could have done it without Wikidata, but why would I? I found dozens of mistakes here in Wikidata and on the official list of EC accounts (I reported them and they were grateful; now the European Commission knows that Wikidata has a better grasp of their own social media accounts than they do).
People are now passing around manually curated lists which take a lot of effort and are often hard to use, when instead they could pool their efforts on Wikidata. We should be proud of it and showcase it.
As for queries, yes, they need to be more specific, but that's easily solved. By using this data more, we were able to find some mistakes which were more problematic/misleading (if not vandalism). Nemo 21:53, 9 November 2022 (UTC)
What tips it over into spam territory, for me at least, is that you didn't just go and find accounts for these people, you created them on your own service and then put the addresses of the newly created accounts in WD. Don't get me wrong, I'm sure you did it in good faith, but I'm afraid I don't think it's appropriate, hence my comment here. Linking people's addresses is fine by me - a good thing even! - as long as those accounts are actually their accounts. --M2Ys4U (talk) 22:11, 9 November 2022 (UTC)

Empty scientist items

Dear Community,

Neferkheperre is creating very meagre items about scientists with links to Wikispecies. On 23 September 2017 he was questioned about this, but never answered the thread.

What shall be done? Many items are possibly duplicates. Польские манеры (talk) 12:48, 8 November 2022 (UTC)

Duplicates are unfortunate but not fatal; making the assertion that 'many items are possibly duplicates' without providing any evidence that any are duplicates is problematic. Maybe there are some or many duplicates. Maybe there are none. I have created about one gazillion items based on En wiki articles, with P21 & P31 statements only; and sometimes only with a P31. The question of incidence of duplicates aside, I don't see how the items of concern to you differ from those I created.
If duplicates are being created, eventually they'll be merged; or else they'll sit there and not do very much harm, beyond isolating the article to which they're sitelinked. WD items come from many different directions, and it is very common to encounter duplicates; the issue is very widespread and it goes with the territory. There are very few domains I've worked with where I have not come across sets of duplicates which reflect the vectors from which the articles arose - EN wiki, DE wiki, this database, that website. Deduplication, especially where the source data is sparse, as I expect is the case on wikispecies, is hard or not conceivably possible. You can argue that deduplication should always be done before item creation, but the effect will be to exclude subjects that should have an item. Or you can live with deduplication following item creation.
What shall be done? Maybe accept the user is acting in good faith, does not feel inclined to talk; and maybe put some effort into deduplication if you have a real interest in it. --Tagishsimon (talk) 14:12, 8 November 2022 (UTC)
Oh, and there's the rather important convention that if you bring a user's behaviour to a forum like this for discussion, you inform the user. You do not seem to have done that in this instance. I'm not going to ping the user b/c I don't see that there's any cause to drag them here b/c there's no case to answer. --Tagishsimon (talk) 14:16, 8 November 2022 (UTC)
Providing Wikilinks is one of the purposes for Wikidata. While it would be better to have no duplicates, automatic item creation for Wikimedia pages is something we generally allowed in the past. ChristianKl 14:23, 8 November 2022 (UTC)

While there is probably no sharp boundary between “meager” and “not meager” items, I feel inclined to say that if you know enough about a person to write a description like “Brazilian annelidologist”, you should not only be able to but also do provide equivalent statements (country (P17) Brazil (Q155), occupation (P106) annelidologist (Q38687772) in this case). --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 10:27, 9 November 2022 (UTC)

Well, yes, but also, mainly, no. There's no correct order in which to do things; adding just two more statements may well, for instance, halve the number of items the user has time to create. It's fine if you want to ascribe more value to the additional statements than to the item creations & so would welcome that rate halving. Not so fine if you value item creation over statement depth. Eventually the job will be done 'properly', as it were; but it does not do to be too prescriptive about how other people should be doing that corner of the task they have elected to do. --Tagishsimon (talk) 19:55, 9 November 2022 (UTC)
I’m not saying the user should add these statements immediately (I probably would, because otherwise I’d lose track), but Neferkheperre could have stated intents to do so when asked (as linked by the OP, there is a thread on the user talk page, without answers). Anything else is probably a question of whether Wikidata needs a lot of mostly uninformative items, a philosophical question we’re not going to settle in this discussion. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 22:12, 9 November 2022 (UTC)
(sarcasm) Very happy we decided to bring back the pillory! Of course there should be Tarring and feathering. On a serious note, not bothering to check if items exist creates unnecessary work for others, so that is not ok. An administrator on a wikimedia project is expected to know better. And also you should have taken this up with the person in question on their talk page and certainly not here. Infrastruktur (talk) 18:24, 10 November 2022 (UTC)

How best to merge stumbling stone (Q26703203) and Stolpersteine (Q314003) and then convert the first one into a Wikimedia Disambiguation Page

I just came across stumbling stone (Q26703203) and Stolpersteine (Q314003) which are basically duplicates of each other but both have iw-links to de and frr. Q314003 seems in fact to be links to redirect or disambiguation pages, so should probably instead be converted into an instance of: Wikimedia disambiguation page. A bit unsure about the best way to go about this. Unlink the two redirect and disambiguation articles on de and frr, then merge the two items, and then finally create a new item for them, or is there another, better way? TommyG (talk) 07:38, 10 November 2022 (UTC)

Isn’t one the project and the other one the concept of a single stumbling stone? --Emu (talk) 08:08, 10 November 2022 (UTC)
The problem here seems to be that the German sitelink of stumbling stone (Q26703203) is a redirect to the article that is the German sitelink of Stolpersteine (Q314003), and the Northern Frisian sitelink of stumbling stone (Q26703203) is a disambiguation page. Removing the latter and adding it to a newly created item “Snöfelstiin” with instance of (P31) Wikimedia disambiguation page (Q4167410) should do. Then Stolpersteine (Q314003) is the project and stumbling stone (Q26703203) is the (class of) commemorative plaque. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 11:52, 10 November 2022 (UTC)
  Done (Q115156346). --Emu (talk) 14:06, 10 November 2022 (UTC)

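How to describe the data in the October 2022 Hangzhou passenger car quota lottery results announcement (2022年10月杭州市小客车增量指标摇号结果公告)

The October 2022 Hangzhou passenger car incremental quota lottery results announcement contains:

  • Individual quotas: 5106
  • Organization quotas: 931
  • Valid individual application codes: 835823
  • Valid organization application codes: 23714

How should these be described? 118.143.233.74 01:12, 11 November 2022 (UTC)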

Strange edits

Hi,

Could someone take a look at Tourbillon Chapel (Q29892233) (it looks like repurposing - if so, it should be reverted - but I'm not entirely sure).

Cheers, VIGNERON (talk) 10:59, 7 November 2022 (UTC)

Yes, it does look like repurposing. It was changed from a castle to a chapel in a different location. @HwætGrimmalkin this will need to be reverted and a new item created for the chapel. — Martin (MSGJ · talk) 12:44, 7 November 2022 (UTC)
Yes, Granary of Kloster Sankt Katharinental (Q29525812) seems to have the same problem. ChristianKl 13:20, 7 November 2022 (UTC)

Just tried, but Q29892233 can't be properly reverted because it was a duplicate to Leuk Castle (Q6534487).--Deutscher Dicksack (talk) 21:28, 7 November 2022 (UTC)

Hi, Nicolas! I object to merge. "Appendix:Klingon" is not a language, it is a kind of Wikimedia appendix page (Q35243371). --Infovarius (talk) 10:02, 11 November 2022 (UTC)
Hi Infovarius, I'm not sure to understand. What has klingon language to do with this chapel in Switzerland? Cheers, VIGNERON (talk) 10:07, 11 November 2022 (UTC)

English Wikivoyage article for Cognac

Q285, the Wikidata page for the city of Cognac in France, is locked so I can't edit it (vandalism in the past I guess?). About two weeks ago I created an article on English Wikivoyage and wanted to add it. As it wasn't possible, I added a comment to Talk:Q285 asking someone who has the editing rights to add the article (usually comments get noticed here pretty quickly) but this time nobody seems to have noticed it, so I'm posting here instead. Ypsilon (talk) 06:02, 11 November 2022 (UTC)

@Ypsilon: Cognac (Q285) is only semi-protected, I added the sitelink. You will be able to edit semi-protected items very soon (you only need 20 more edits on Wikidata). Cheers, VIGNERON (talk) 11:35, 11 November 2022 (UTC)

How to tag toll-free numbers

Hi, does anyone know what qualifier I would use to indicate the region of a toll-free/0800 number? Like in Q111869798 CoderThomasB (talk) 06:09, 11 November 2022 (UTC)

I'd probably qualify the statement with
⟨ telephone number ⟩ has characteristic (P1552)   ⟨ toll-free telephone number (Q348308)      ⟩
--M2Ys4U (talk) 11:30, 11 November 2022 (UTC)

Metal sector / Metal crafts

In my language metal sector (Q19707051) is identical to metal crafts (Q57261186). Please check for yours and tell me.--Strom auf der Gurke (talk) 14:08, 11 November 2022 (UTC)

Kapellmeister

I’ve noticed that chapelmaster (Q215793), according to its German and English descriptions, refers to a secular profession (“leader of a musical ensemble, often smaller ones used for TV, radio, and theatres”), while according to its French description, it refers to an ecclesiastical one. Should the item be split? --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 12:51, 9 November 2022 (UTC)

The job has evolved into the Q767026 for sacred and the court chapel master (Q1002228) for secular music.--Budy Greene (talk) 15:34, 9 November 2022 (UTC)
Okay, so what we need now are consistent descriptions and sitelinks for chapelmaster (Q215793) – given the large number of them in languages I don’t understand, this isn’t something I can do alone… --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 16:19, 9 November 2022 (UTC)
The French description of chapelmaster (Q215793) doesn't refer to an ecclesiastical profession, it refers to profession originally linked to a chapel (which is indeed an ecclesiastical place). So the descriptions and sitelinks look fairly consistent to me. Cheers, VIGNERON (talk) 11:30, 11 November 2022 (UTC)
@VIGNERON: Probably a problem of my French, but « personne chargée, dans un cadre religieux chrétien, d'enseigner et de faire chanter la musique, et de composer des partitions polyphoniques au sein de la chapelle musicale d'une église » does sound fairly ecclesiastical to me… --2A02:8108:50BF:C694:142C:9BB9:428D:1C6A 16:40, 11 November 2022 (UTC)

Botola

There are too many Botola's in this changeset at trapdoor (Q1666492). How may I fix this?~~ Батальйон Нахтігаль (talk) 20:53, 8 November 2022 (UTC)

Revert. All of that is bullshit. See #Language policy for disambiguation items? above for the reason why. Ah, I see, the item isn’t a disambiguation item anymore. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 10:51, 9 November 2022 (UTC)
I used the undo function on the changeset, restored the disambiguation item Q13489665 to move the sitelink, and added the Italian label to Q1666492. There was a new disambiguation item (Q109569876) which I merged to Q13489665. Peter James (talk) 14:45, 12 November 2022 (UTC)

Swiss National Databank of Vascular Plants dataset

Hi all, during the GLAMhack2022 in Mendrisio (CH), 4-5/11/2022, a pair of teams worked on transferring and making educational use of the data coming from the big dataset of the Swiss National Databank of Vascular Plants (about 7 million samples, across two centuries). Each record contains mainly sample id, plant name, place of finding, time of finding (several), altitude of finding, author of recording, and for several of them there is also a 2D image or drawing board. Most of them are from Switzerland and surrounding territories. All released under CC BY 4.0.

One team worked specifically on automating the detection of the text present in the 2D scanned drawing sheets (often old-style handwriting); it is a work in progress (Wikimedia Commons can benefit from it).

We have started to inject data for three plants only, within the property taxon range, adding only location and reference (see Q49630098, Q15547283, Q161797). First questions:

  1. How fine-grained should the taxon range be? We used the district (province) level (not cantons/regions/states, nor counties) so that ranges can be mapped more precisely than by large political regions, where geography may vary significantly.
  2. Is the time when the sample was found of interest? May each taxon range location have several point-in-time attributes? This information has not been transferred here yet, but I would argue that it is necessary (since a plant might have been present in a territory in the past but no longer be there, or vice versa).

Does all this make sense to you? Suggestions, corrections, remarks are all welcome.

Thank you all, sNappy (ML) 22:47, 11 November 2022 (UTC)

@Mattia Luigi Nappi Hello. I love plant distribution data but I am a little sceptical that your selected approach would be scalable in Wikidata. In taxon range (P9714), you already have 28 statements and that is just Switzerland. If we apply the same logic in other countries, the item will collapse, especially for plant taxa growing worldwide. I don't know precisely why, but there seems to be a technical problem with items that have too many statements. I think that the data you describe (geographical and temporal) are quite unique and would best be represented in a different type of database (maybe something GIS-based). If you insist on using Wikibase, I think the data would deserve their own Wikibase instance (of course federated with Wikidata). But in the case of a similar Czech database (Nálezová databáze ochrany přírody (Q12041572)), we ended up only importing identifiers to Wikidata (maybe we could consider also importing taxon range (P9714) : Czech Republic (Q213) but that's the limit of granularity that Wikidata can probably take). Vojtěch Dostál (talk) 07:53, 12 November 2022 (UTC)
Hi @Vojtěch, that is my concern too. But, at the same time, I imagine the case "Taxon range: Russia": that would be almost useless information, since some nations are so large.
I would rather imagine that, as a community, we define the minimum and maximum area of the entity identified as a taxon range, so that it is possible to collect useful information without being too granular or too broad (e.g. by putting on the taxon range discussion page: "The taxon range should be a territory that is bigger than 10'000 km2 and smaller than 100'000 km2, e.g. New York State (141'300 km2) OK, New York City (778 km2) NOT OK"). That could be useful for creating maps with SPARQL queries. Opinions?
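Editorial sketch: the proposed area rule is easy to state as a check. The function name and the default thresholds are taken from the wording of the proposal above and are only placeholders; note that with exactly these bounds New York State (141'300 km2) would fall outside the range, so the upper limit would need adjusting for the post's own example to pass.

```python
def is_suitable_range_area(
    area_km2: float,
    min_km2: float = 10_000,
    max_km2: float = 100_000,
) -> bool:
    """Return True if a territory's area lies strictly between the two bounds."""
    return min_km2 < area_km2 < max_km2


print(is_suitable_range_area(778))      # New York City
print(is_suitable_range_area(141_300))  # New York State, under the stated bounds
```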

Wikidata Day in NYC covered on the radio

Of possible interest here, the public radio program On the Media attended Wikimedia NYC's Wikidata Day event for a segment about universal knowledge projects. Listen to it here if you're interested. Appearances by a few Wikimedians. — Rhododendrites talk \\ 13:57, 12 November 2022 (UTC)

Solo performance

I think there is some mixup in solo performance (Q2319401), whose English description implies it referring to a person/occupation. The German label and description clearly do so, too, while the French refer to a kind of show. Until recently, the item also had instance of (P31)profession (Q28640) and instance of (P31)occupation (Q12737077). Please review this item and split it accordingly. Currently there is the item Fips Asmussen (Q105689) using it as occupation (P106), which should refer to the occupation (i.e. person) item afterwards. --2A02:8108:50BF:C694:29E6:BE9C:1625:78C8 13:33, 13 November 2022 (UTC)

Unknown date but known period

Hi all,
If there is no known date of a certain event (for example, opening a railway station) but there is a definite period (say, not earlier than 1947 and not later than 1953), how to better specify this in Wikidata?
I'd say "value = unknown value" + qualifier earliest date (P1319) + qualifier latest date (P1326). But this raises an error flag for the property. See, for example, Q94232905. Michgrig (talk) 17:40, 13 November 2022 (UTC)

You can specify dates as within a certain range, e.g. the 1900s. I think the error flags you are seeing are wrong and the constraints should be amended. BrokenSegue (talk) 17:55, 13 November 2022 (UTC)
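
For readers who want to see what the "unknown value plus earliest/latest date" pattern looks like as data, here is a rough Python sketch that builds such a statement in the Wikibase JSON layout. The choice of date of official opening (P1619) as the main property and the helper names are assumptions for illustration; the qualifiers P1319 and P1326 and the years 1947 and 1953 are the ones mentioned in the question. Nothing here is sent to Wikidata.

```python
import json

# Gregorian calendar model used in Wikibase time values.
CALENDAR_GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def year_value(year: int) -> dict:
    """Build a Wikibase time datavalue with year precision (9)."""
    return {
        "type": "time",
        "value": {
            "time": f"+{year:04d}-00-00T00:00:00Z",
            "timezone": 0,
            "before": 0,
            "after": 0,
            "precision": 9,
            "calendarmodel": CALENDAR_GREGORIAN,
        },
    }


def unknown_date_claim(main_property: str, earliest_year: int, latest_year: int) -> dict:
    """Hypothetical helper: 'unknown value' main snak bounded by two qualifiers."""
    if earliest_year > latest_year:
        raise ValueError("earliest year must not be after latest year")
    return {
        "type": "statement",
        "rank": "normal",
        # snaktype "somevalue" is how "unknown value" is stored.
        "mainsnak": {"snaktype": "somevalue", "property": main_property},
        "qualifiers": {
            "P1319": [{"snaktype": "value", "property": "P1319",
                       "datavalue": year_value(earliest_year)}],
            "P1326": [{"snaktype": "value", "property": "P1326",
                       "datavalue": year_value(latest_year)}],
        },
    }


# P1619 is an assumed choice of main property for a station opening.
claim = unknown_date_claim("P1619", 1947, 1953)
print(json.dumps(claim, indent=2))
```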

Do we need different wikidata items for terms that are structurally different but have the same content?

I am actually an author of the German Wikipedia and landed here because I have a general problem with linking Wikipedia articles to other languages.

I will first discuss the question using the item d:Q1537963. The Wikipedia articles linked there all describe the same object. But the names are structurally different. In the German Wikipedia the name of the article is de:Hölderstetigkeit (Hölder continuity). This is a property of a mathematical function, in France the article has the name fr:Hölder-jatkuva funktio, so it is the function itself with the property Hölder continuity. In the English Wikipedia the name of the article is en:Hölder condition. This is the condition used to formulate Hölder continuity. Is it correct that these articles have the same Wikidata item? Actually I would prefer this approach.

But in the German Wikipedia there are other opinions on this. And this opinion was applied to the links of the articles de:Absolut konvergente Reihe and de:Bedingt konvergente Reihe. For the German article de:Bedingt konvergente Reihe there is the item d:Q91134419 but for the English article en:Conditional convergence with the same topic there is the item d:Q2425336. This division was done by another user because he said that a property in a mathematical context is not a mathematical function. Because of that division most of the interwiki links for the German article de:Bedingt konvergente Reihe are missing now.

Besides the missing interwiki links, I see the following disadvantages with this division of Wikidata items:

  • Many properties of the Wikidata data object would have to be maintained twice. In the example of Absolute Convergence (d:Q332465), at least all identifiers would have to be maintained twice. This is time-consuming and error-prone.
  • Sources for statements such as conditional convergence is the opposite of unconditional convergence (and not necessarily of absolute convergence) would have to be maintained twice. This is also time-consuming and error-prone, as the current state of affairs shows.
  • For new people in Wikipedia and Wikidata getting started is made even more difficult.

In the mathematical context, there are many more such examples: de:Vage Konvergenz (Maßtheorie), de:Differenzierbarkeit, de:Halbstetigkeit, de:Glatte Funktion, de:Lipschitzstetigkeit, de:Gleichmäßige_Stetigkeit, de:Absolut stetige Funktion and many more. I linked to the German articles because it was easier for me, but all of those articles have an English article and a Wikidata item.

What is the correct procedure here? --Christian1985 (talk) 20:49, 7 November 2022 (UTC)

  • The technically correct approach is to create a separate data item for each individual concept. If these concepts are related, try to link the data items with suitable statements.
  • Surprisingly often, there is some Wikipedia which actually has articles on both concepts, even if they are somehow closely related and described in a joint article in many other Wikipedias. These sitelinks need to go somewhere anyway.
  • A very fresh feature here at Wikidata is to use redirects as sitelinks in Wikidata (available just for a week or so). If you separate sitelinks about distinct, but related concepts into separate data items, you can now create local redirects and use these to maintain good interwiki linking. This feature was developed exactly with these sort of situations in mind.
  • All of that said: for rather abstract concepts we often have situations where articles about closely related, but technically distinct concepts are all together in one data item. Articles that conflate different concepts are difficult to assign to any "single concept item" anyway, and we would rather avoid classifying items in larger numbers as "conflations". This is not ideal for many reasons, but interwiki management is simpler for most involved users if related articles are gathered in a single item.
MisterSynergy (talk) 21:10, 7 November 2022 (UTC)
In future, Wikipedia may also devise a more automatic system to group together very similar Wikidata items and display interwiki links from them in all related articles. In the meantime, Wikipedias will have to use the redirecting sitelink solution, which is actually already a significant step forwards compared to what the situation was a few months ago. Vojtěch Dostál (talk) 21:49, 7 November 2022 (UTC)
Wikidata policy is pretty clear about not merging Wikidata items if those Wikidata items aren't referring to the same entity even if Wikipedia articles linked to it are about the same concept. Usually, the properties for both items are not the same. As far as I see it the decision to which Wikidata item a Wikipedia article should be linked should be up to the relevant Wikipedia. So there's no need to create new items in the case like "hölder continuity", it's just that if there are two items, merging them is wrong.
If you do want Wikilinks you can get them via sitelinks to redirects. ChristianKl14:35, 8 November 2022 (UTC)

Thanks for your answers. @ChristianKl can you please give me the link to the Wikidata policy you are talking about? --Christian1985 (talk) 19:48, 14 November 2022 (UTC)

Pizzle

A filter has kept me from changing the description of pizzle (Q925901) to flogging instrument made from a bull's penis. I have gone for flogging instrument made from a bull's p***s to be able to change it at all (the previous description Middle English word was unacceptable since Wikidata items do not describe words). Could someone please correct the description? --2A02:8108:50BF:C694:E4D3:B18:6DDC:71F 00:16, 14 November 2022 (UTC)

  Done, reluctantly. -wd-Ryan (Talk/Edits) 00:38, 14 November 2022 (UTC)
If you have a better description, go ahead :) --2A02:8108:50BF:C694:64CE:A0F0:AE59:1900 09:46, 14 November 2022 (UTC)
The enwiki article seems like a wordy disambiguation page. Ghouston (talk) 09:49, 14 November 2022 (UTC)
"penis of a non-human animal", perhaps, but the linked Dutch article is about something more specific. Ghouston (talk) 09:54, 14 November 2022 (UTC)

Wikidata weekly summary #546

Stained glass

There was a mixup in stained glass (Q1473346): the technique/art and the artwork produced by it. I have created a new item, glass staining (Q115200950), for the technique so that stained glass (Q1473346) refers only to the artwork. The German (and Alemannic) Wikipedia article refers to the technique, so I moved it to glass staining (Q115200950), while the English and French articles refer to the artwork, so they stay connected to stained glass (Q1473346). Please check labels, descriptions and sitelinks for your language. (And I would also appreciate further contributions to the two items, as I have done only the bare minimum to separate them.) --Wiltbeider Thierry Dubois 39 (talk) 00:08, 14 November 2022 (UTC)

@Wiltbeider Thierry Dubois 39: thanks for sorting that out. This is a common problem, see Wikidata:WikiProject_Visual_arts/Item_structure#Works_of_visual_art for an overview. Multichill (talk) 21:54, 14 November 2022 (UTC)

Country for Crown Dependencies and British Overseas Territories

https://www.wikidata.org/wiki/Property_talk:P17#P17_for_places_in_Jersey_(Q785) discussed whether the country (P17) for things in Jersey (Q785) should be United Kingdom (Q145) or not, but no decision was reached. There is general confusion for Crown Dependencies (Q185086) and British overseas territories (Q46395). For example, checking the countries for items located in the administrative territorial entity (P131) of Isle of Man (Q9676): 211 are in the Isle of Man, 232 in the UK, 34 in both. The definitions of the terms state clearly they aren't in the UK, but equally they aren't countries, though we do have a liberal view of what a country is, and the names are not flagged. I often do queries for maps using country (P17) and it's a pain having results scattered across the globe, so I'd prefer they weren't UK, but I don't want to embark on changing them without consensus here. Vicarage (talk) 10:12, 7 November 2022 (UTC)

I scent a general issue with the definition of country (P17) here, in terms of what is a country and what makes a country “the” country of a geographical item. This opens up a new and potentially much bigger can of worms, but what about debatable territories? --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 10:34, 7 November 2022 (UTC)
The discussion and the situation are quite clear: Jersey is not (even remotely) in the UK. Whether "Jersey" is a good value or not for P17 is a bit more complex but still widely accepted I think.
« its a pain having results scattered across the globe » well, for this general issue, there will always be countries scattered across the globe (France for instance is the most extended country in the world ;) ). It's unrelated to the Jersey question.
Cdlt, VIGNERON (talk) 11:11, 7 November 2022 (UTC)
Well, this depends on whether country (P17) is assigned based on location (the German and French descriptions of the property say so, the English one is relatively vague) or other criteria such as jurisdiction, military control etc.; even the location criterion bears conflict potential e.g. in case of countries with separated remote parts (such as France with its Overseas territories or the US with Hawaii and Guam) and in connection with territorial waters / exclusive economic zones. --2A02:8108:50BF:C694:E0F2:7F6B:7EAD:26F9 11:38, 7 November 2022 (UTC)
P17 has always been more political than geographical (see for instance the description "sovereign state", plus there is already located in the administrative territorial entity (P131) or location (P276) for the purely geographic data), hence why it's also used on non-geographical (and even some intangible) items. Cheers, VIGNERON (talk) 13:23, 7 November 2022 (UTC)
The problem with located in the administrative territorial entity (P131) as a globbing tool is that it's at the whim of the editors. Sometimes it's a city in common knowledge, sometimes it's an obscure unitary authority like Halton, sometimes a village, and it's hard to chase back to a common branch. Everyone has a clear idea what a country is, and you can choose historic counties, states or departements below that, but then you get caught with the edge cases like here when things fall through the gaps. It does seem clumsy to have to assign historic county (P7959) to every museum as I'm finding it hard to deduce a regional scale id through the located in the administrative territorial entity (P131) route, as you get one-to-many mappings. I wish there was a selection I could apply worldwide for "divide the world into a set of regions of about 5 million people, give me the one this thing is in". Certainly if P17==UK is political, I expect something in NW Europe, not the Caribbean. Vicarage (talk) 14:15, 7 November 2022 (UTC)
@Vicarage: First, what are you trying to do? What tool do you use? It seems a bit like you're using the wrong property for the wrong things (which obviously gives wrong results). If you want something only in Europe, then you should probably use coordinate location (P625) (which is truly geographic) or located in the administrative territorial entity (P131) (which is mostly geographic) to filter that. For example, here is a query for the British Museums within 1000 km of London https://w.wiki/5wn3
P131 is not « at the whim of the editor », it's supposed to be the smallest administrative unit(s) (and if it exists, because yes, edge cases are everywhere). Plus, common knowledge is tricky, for me it's common sense that « something in NW Europe » is a geographic concept not a political one (so « if P17==UK is political, I expect something in NW Europe » doesn't make sense to me).
Cheers, VIGNERON (talk) 11:17, 11 November 2022 (UTC)
I want to present people with a reasonable number of results on a map, which for my dataset is about 100 items and UK county size, though the site is global. I want the names to be recognisable and similar sizes, which administrative regions are not, and who's ever heard of Rushden apart from locals. So historic county is good, and coordinate location (P625) bad. I want places to be single value, so historic county is good until you get Gibraltar which has none or London which has 6. I use SPARQL queries and Mediawiki.
And located in the administrative territorial entity (P131) does seem to be added by whim. Few people understand or can picture parish/district/town/unitary authority/regional boundaries, and put in something random, and I see no evidence of it being made consistent using coordinate location (P625). If I want something in Kent I don't want Medway excluded. Vicarage (talk) 08:51, 15 November 2022 (UTC)
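
As an aside on the "within 1000 km of London" filter suggested earlier in this thread, the same idea can be sketched outside SPARQL. The Python below is only an illustration under stated assumptions: the item names are placeholders, the coordinates are approximate city-centre values, and the haversine formula on a spherical Earth is one possible distance measure, not necessarily what the linked query uses.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


LONDON = (51.5074, -0.1278)  # approximate coordinates


def within_radius(items: dict, centre: tuple, radius_km: float) -> list:
    """Return the names of items whose coordinates lie within radius_km of centre."""
    return [
        name
        for name, (lat, lon) in items.items()
        if haversine_km(lat, lon, centre[0], centre[1]) <= radius_km
    ]


# Placeholder items; only the coordinates matter for the filter.
museums = {
    "placeholder museum in Paris": (48.8566, 2.3522),
    "placeholder museum in Gibraltar": (36.1408, -5.3536),
}

nearby = within_radius(museums, LONDON, 1000)
print(nearby)
```

A filter like this keeps results geographically compact regardless of what country (P17) says, which is the distinction the thread is drawing.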

One village in two municipalities (population statement)

Hi data friends,

In Harkstede (Q2594189) there is currently a P1082 population statement with two references. To support detailed reporting in Wikipedia tables, we are looking for the best way to record this. The village is an unusual example, as it is part of two adjacent municipalities. At the top level I've added the total population, and within the statement the two different amounts. For now I've chosen two references (applies to part (P518) for the name of the municipality and including (P1012) for the population value). However, Wikidata throws a property scope constraint. Therefore I am looking for a better way to do this. Should I use a second P518 for the population value, should I use another allowed qualifiers constraint or should a new qualifier constraint be added to the P1082 allowed qualifiers constraints? Your help and advice will be highly appreciated. Best regards, Démarche Modi (talk) 19:59, 13 November 2022 (UTC)

Those qualifiers shouldn't be on the reference, because that implies they're qualifying the source that you're referencing and not the main statement.
I would use multiple population (P1082) statements: one for the entire village (with preferred rank), and one each for the parts in the two municipalities (qualified with applies to part (P518) and with normal ranks). --M2Ys4U (talk) 02:32, 15 November 2022 (UTC)
Ok, did it like that. Will now take it back to the module programmer and see if it can be used in there. Thank you for your advice! Démarche Modi (talk) 08:39, 15 November 2022 (UTC)
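
For whoever consumes this from a module, here is a small Python sketch of the layout recommended above: one preferred-rank total plus normal-rank statements qualified with applies to part (P518). The class, the helper and all numbers are made up for illustration and are not the real Harkstede figures; the check simply verifies that the parts add up to the total.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PopulationStatement:
    amount: int
    rank: str  # "preferred", "normal" or "deprecated"
    applies_to_part: Optional[str] = None  # value of P518, if present


def total_matches_parts(statements: List[PopulationStatement]) -> bool:
    """Check that exactly one preferred total equals the sum of the qualified parts."""
    totals = [s.amount for s in statements
              if s.rank == "preferred" and s.applies_to_part is None]
    parts = [s.amount for s in statements
             if s.rank == "normal" and s.applies_to_part is not None]
    return len(totals) == 1 and bool(parts) and totals[0] == sum(parts)


# Placeholder numbers and municipality names, not real data.
statements = [
    PopulationStatement(3000, "preferred"),
    PopulationStatement(2000, "normal", "municipality A (placeholder)"),
    PopulationStatement(1000, "normal", "municipality B (placeholder)"),
]

print(total_matches_parts(statements))
```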

Manufacturing by people/places

Before 1900 it was common for industrial concerns like shipbuilding to be known by the name of their founder or location. So ships will be said to be built by Buckler's Hard (Q4983328) or Henry Adams (Q5717107). And Wikipedia articles will describe the industrial activities as part of their biography/place description. But when WD uses them as the manufacturer (P176) of HMS Indefatigable (Q2707827) do we want the place/person labelled as a shipyard (Q190928) with the properties of the organisation, like workforce or 3rd party ids? We could create new items for the organisation, using the same names, to keep the structure purer, even though they might not be a formal company. This doesn't just occur for company towns: sources will say a ship was manufactured by Toulon (Q44160), and using location of creation (P1071) rather than manufacturer (P176) distorts the graph. manufacturer (P176) itself accepts both people and organisations, but was originally created as an occupation. Vicarage (talk) 08:35, 8 November 2022 (UTC)

v.good question, Vicarage. No contribution from me right now other than to agree that the issues you raise are real, the current modelling mostly inadequate. --Tagishsimon (talk) 20:00, 9 November 2022 (UTC)
I have changed the manufactured by person to designed by person, and added a few companies associated with the people. Places TBD Vicarage (talk) 11:44, 15 November 2022 (UTC)

Friends' Groups

Many places have friends groups distinct from the formal operators, and for some places like nature reserves the Friends website is more informative than the official local council one. What's the best way of coding this? described at URL (P973) with a new item "friends' group" perhaps. We want to avoid supporters' group (Q1070414) which seems to be narrowly for sports. Or would a new property 'Friends Group URL' be better, though I don't like all these *URL proposals. Vicarage (talk) 23:09, 9 November 2022 (UTC)

yeah personally I don't like the idea of adding yet another URL type. the described at URL idea seems best. Or else make a new item for the friends' group itself BrokenSegue (talk) 06:20, 10 November 2022 (UTC)
I guess the Friends Groups that are charities might merit their own items, but I suspect the rest are too small, if their only properties were their site and their URL. Vicarage (talk) 08:23, 10 November 2022 (UTC)
friends' group (Q115221744) created, its talk page describes its use as suggested here. Vicarage (talk) 19:28, 15 November 2022 (UTC)
I agree qualifying the type of URL is better than a devoted property, at least for now. There may be several different types of "official" or affiliated web pages for the same subject. Potential other URL qualifiers for official website (P856) or described at URL (P973) include: official website (Q22137024), home page (Q11439), personal web page (Q2737701), faculty web page (Q109647055), lab website (Q6466676), alumni association (Q447877), and corporate website (Q5172507). -Animalparty (talk) 21:17, 15 November 2022 (UTC)

I can't add statement

Please check and revert, there is some bug issue  – The preceding unsigned comment was added by Vijayak1 (talk • contribs) at 20:30, 9 November 2022 (UTC).

Please describe more in detail what you were trying to do. For reference, I deleted Q115145326, and Q114001691 was deleted earlier. --Matěj Suchánek (talk) 17:04, 16 November 2022 (UTC)

Block the Vodafone IP 5.91 and 37.159

I don't know if this is the right place, but I'll be direct. I edit from this and similar Vodafone IP addresses (I can't choose the address, it is assigned at random) and have written many pages on Italian Wikipedia; with this unusual range block, I can't link the new pages I create to other wikis. On Italian Wikipedia, a lot of template/infobox data is pulled automatically from Wikidata, and many authority-control and external links such as Britannica or Treccani, which are very important to a page, all go through Wikidata and can decide the relevance of an item. Please restore the ability to link a new page to other wikis. I have contributed to Wikidata for years; I don't know why there is this block, but I didn't do anything wrong and I respected the rules. 5.90.234.1 17:46, 12 November 2022 (UTC)

Unblocking these IP ranges is questionable. Maybe you should create an account on Wikidata. Established users are not blocked. Estopedist1 (talk) 14:56, 13 November 2022 (UTC)
I don't understand why you have blocked all the ip, can't you at least make a selective block leaving only the possibility to add links to other wikis and commons? So it is totally counterproductive and goes to damage the work one does on Italian wiki; unfortunately most of the stuff goes from wikidata on ITwiki, not having the link to his page on wikidata is a big problem. I warmly renew my invitation to make a modification or reshaping of the block in such a way as to allow at least this to be modified. Because the damage this block does outweighs the benefits. Can you make an abuse filter, maybe is the best solution. If one wants to vandalize it does so with any ip or account, this block has the placebo effect only. And harms those who want to do and contribute by following the rules with common sense and for improvement purposes.5.90.237.81 15:14, 14 November 2022 (UTC)
please read/use Wikidata:IP block exemption Estopedist1 (talk) 07:03, 15 November 2022 (UTC)
Estopedist1 ok but, I have always contributed without registering for some many years, I don't find it logical and correct to contribute from ip on itwiki and with an account registered on wikidata. If there is the possibility to contribute without subscribing I don't see why it can't be done. I would like to ask how and why of such a "super block" and if one can at least add links to other wikis with a selective block. I didn't understand what harm it is to add the various pages and languages ​​and why this cannot be excluded from the block. moreover, I am against total blockades for months, then here it was done without indicating a prior consensual discussion or even a reason. This causes inconvenience in contributing to Wikipedia. 5.90.236.78 04:44, 16 November 2022 (UTC)

Operator or owned by

National heritage bodies like English Heritage or the National Trust both own (owned by (P127)) and operate (operator (P137)) their sites (there is also maintained by (P126), which seems the least useful). There is a 98% overlap in the properties, but occasionally something is owned by one body and operated by another. Should we aim to have both properties fully populated, or only one, with the exceptions noted, like Wakehurst (Q7961105), and which should dominate? Vicarage (talk) 09:42, 16 November 2022 (UTC)

@Vicarage Generally, sources will specify either owner, or operator. If you have a source claiming that some subject is "both owner and operator", then it makes sense to me to populate both properties. Vojtěch Dostál (talk) 09:56, 16 November 2022 (UTC)
Concur that WD should aim to have both statements fully populated. They're different concepts, even if the position is often held by one & the same organisation. Trying to handle it by exception involves knowledge - that there is a weird information handling hack going on here - which is difficult and for most purposes impossible to promulgate. --Tagishsimon (talk) 17:15, 16 November 2022 (UTC)

Importing and Exporting Greek Author Names in Multiple Languages

The Canada Research Chair in Digital Textualities produced an online scholarly edition of the Greek Anthology powered by an API. Their data is complementary to Wikidata, and we are planning on contributing to the project by importing a CSV with our information on our more than 300 authors from Ancient Greece.

Here is the issue: names in different languages do not have properties associated with them. This is problematic both when importing and exporting data about our authors; how can one contribute with the names of an author in multiple languages? Thank you and have a great day. Yann Audin (talk) 16:27, 16 November 2022 (UTC)

I’m not sure if I understand the problem, but there is name in native language (P1559), which is language-tagged, see e.g. [2] (try clicking edit) or specifically for Ancient Greek [3]. There can be more than one name in native language (P1559) statement per item with different language tags. There is also name (P2561), which is language-tagged, too, but not restricted to the subject’s native language. That way you can have items for the authors declaring their name(s) in different languages. These author items can then, if needed, be linked using author (P50). --2A02:8108:50BF:C694:A4EE:1267:E564:785E 10:37, 17 November 2022 (UTC)

Historic Environment Scotland doing a website consultation

Historic Environment Scotland, the government department responsible for historic sites and listing, are looking for volunteers to shape their website redesign, including Canmore. As I'm interested in fortifications I've offered to help, and will try to represent Wikidata. https://canmore.org.uk/ Vicarage (talk) 07:41, 17 November 2022 (UTC)

Two slightly related topics

How do we handle the reclassification of an object? For instance, if a species is reclassified into another genus, do we update the existing item, or do we create a new one? Similarly, if a person transitions from male to female? Both involve a name change and the possibility of some changes to statements. - UtherSRG (talk) 12:34, 17 November 2022 (UTC)

So I guess in the first part, we should have multiple objects, because old names are still valid (in the Wikidata sense, not in the taxonomic sense) and should be listed as a synonym. So I'll amend my question to be: How do we indicate that Euoticus inustus is a (junior) synonym of Galago matschiei? - UtherSRG (talk) 13:01, 17 November 2022 (UTC)
I can't help with the taxonomy stuff, but for your second question, we would qualify official name (P1448) and sex or gender (P21) with start time (P580) and end time (P582) — Martin (MSGJ · talk) 13:07, 17 November 2022 (UTC)
And the currently correct statement should get preferred rank. (Personally I think “sex or gender” is something start time (P580) and end time (P582) do not make sense for, but a cleaner solution would require separating sex, gender, and publicly communicated value thereof, which does probably not have any chance for consensus at the moment.) --2A02:8108:50BF:C694:A4EE:1267:E564:785E 13:58, 17 November 2022 (UTC)

Erroneous merge

I merged two items in this edit; however, I now see that I have merged two separate editions of the same work in different languages, and that this was probably a mistake. Can anyone undo this, and if so, can you also tell me how to do it? The Anome (talk) 16:19, 17 November 2022 (UTC)

@The Anome: Done. Method is to access the histories for each item, and restore a pre-merge version for each. Accessing the history of the merged (and disappeared / turned into a redirect) item involves: search for the QID, select it, get taken to the extant item, which will at the top give you a link back to the redirect item, from where you can get to its history. Please have a look at both items - I see one has no EN label. --Tagishsimon (talk) 16:39, 17 November 2022 (UTC)
Thank you! The Anome (talk) 17:11, 17 November 2022 (UTC)

Murder entry

Does anyone know how Hall–Mills murder (Q4157206) is supposed to be modeled, I look at other murders and they have the same error flags and seemingly contradictory guidelines about how to use various properties and qualifiers. RAN (talk) 19:12, 11 November 2022 (UTC)

maybe significant person (P3342) instead of participant makes more sense? that seems to be the main error flag BrokenSegue (talk) 19:48, 11 November 2022 (UTC)
I think Hall–Mills murder (Q4157206) now looks great, so I added it as model item for murder (Q132821). That means in the future there's a clear model to be followed. ChristianKl11:56, 12 November 2022 (UTC)
Are the object has role (P3831)murder victim (Q73153647) qualifiers for the two victim (P8032) statements really necessary? When the subject is a murder, the object of a victim (P8032) statement should be a murder victim by default. A qualifier would be needed in other cases (e.g. someone accidentally injured during the same event). --2A02:8108:50BF:C694:142C:9BB9:428D:1C6A 13:43, 12 November 2022 (UTC)
Yes, they are necessary. Because there's no way to differentiate between cases where information/statements/qualifiers were left out on purpose and cases where they simply haven't been added yet. Also better to outright state information than to hope that whoever is using the information will make the same assumptions about the lack of statements. --2A02:810B:580:11D4:5586:A932:349F:91E2 12:48, 13 November 2022 (UTC)

Wikidata in wrong place for 2 articles

I'm looking for someone who has permission to edit semi-protected page on Wikidata. There are 2 articles in foreign languages that are in wrong place. Both Hungarian and New-Norwegian article pages need to be removed from here Q3824358 and they need to be re-added here Q115244455. I can't do it myself, since I don't have permission to edit semi-protected pages. --Pek (talk) 15:48, 17 November 2022 (UTC)

@Pek: Done. intracellular space (Q115244455) has few statements, and no P31 nor P279. If you could fix that issue, that would be good; it's not excellent to move sitelinks to an item which has little or no information on it. --Tagishsimon (talk) 16:43, 17 November 2022 (UTC)
Currently, on intracellular space (Q115244455) the dewiki and enwiki seem to be about two different subjects. The enwiki article looks like it means the room within one cell while the dewiki article is about the internal room of all cells in an organism together. I moved the dewiki article into its own item. ChristianKl16:11, 19 November 2022 (UTC)

Awards given in one year for works in a different year

The Retro Hugo Award (Q21163264) (https://fancyclopedia.org/Retro_Hugos) is an awkward beast. It tries to expand on the science fiction Hugo Award (Q188914), which is awarded in year X for works produced in year X-1, but looks back in time 50, 75 or 100 years. So in 1996 the "1946 Retro Hugos" were awarded, for works produced in 1945. I think the point in time should be the date of the award, 1996, but I'm not sure how to model the date of work. Colloquially it's the "1946 Retro Hugo for Best Novel", but I can't find a good date qualifier, as publication date (P577) will be out by one. effective date (P7588) and announcement date (P6949) are possibilities, but the former is legal, and the latter makes the award inconsistent with other literary awards. Vicarage (talk) 09:52, 19 November 2022 (UTC)

covered period (P7643) could be an option, I think. It was created with collections and historical books in mind, but maybe it is not too much a stretch to use it with awards, too. The value-type constraint would need to be broadened to include other temporal entities than historical periods. - Valentina.Anitnelav (talk) 10:53, 19 November 2022 (UTC)
awarded for period (P4566) seems what is needed here, I think? --M2Ys4U (talk) 21:26, 19 November 2022 (UTC)
Yep, your update crossed with mine. I'll go with that. Thanks Vicarage (talk) 21:39, 19 November 2022 (UTC)
Thanks, I completely missed this. - Valentina.Anitnelav (talk) 22:32, 19 November 2022 (UTC)

How to document a software instance

Do we use instance of (P31) to relate a software with its instance?

For example, Wikimedia Gerrit (Q106171018)instance of (P31)Gerrit (Q1164920)?

That wouldn't really work though since Gerrit (Q1164920) doesn't have subclass of (P279)...

Also, we could use software engine (P408), but then what would we put in instance of (P31)? Lectrician1 (talk) 20:24, 19 November 2022 (UTC)

@Lectrician1: Instance of an instance (Q1664689)? Or deployment environment (Q3055454), which looks like the result of software deployment (Q2297740). – Minh Nguyễn 💬 00:20, 20 November 2022 (UTC)
lol no Lectrician1 (talk) 01:11, 20 November 2022 (UTC)

Help with template for country

We have a template for countries on the Icelandic Wikipedia is:Snið:Land. Can someone help me pull the population of the country into the template so we don't need to update it manually (We call the element fólksfjöldi). Where is a good place to learn how to structure templates to pull information from Wikidata? Steinninn (talk) 05:30, 11 November 2022 (UTC)

Wikidata:How to use data on Wikimedia projects is probably a good place to start. From a quick glance at the page, it looks like you'll need a #property or a #statement tag, like {{#statements:population|from=Q142}} to get the population of France (Q142). I'm not sure which tag is better for this situation though. DoublePendulumAttractor (talk) 03:02, 15 November 2022 (UTC)
A bunch of wikipedias also have the Module:WikidataIB (Q25714577) module, which seems to fulfill this for infoboxes. It doesn't look like it's installed on the Icelandic Wikipedia though. DoublePendulumAttractor (talk) 03:08, 15 November 2022 (UTC)
Thank you DoublePendulumAttractor for the help. I'll take a look at it. Hopefully I'll get it working on the Icelandic Wikipedia. --Steinninn (talk) 21:59, 20 November 2022 (UTC)

tricky Wikidata External ID redirector

Chamber of Deputies of Romania person ID is a kind of 2-in-1 ID.

It identifies deputies ("cam=2"), but also some senators ("cam=1").

In short, the senators seem to be divided between Senate of Romania person ID and Chamber of Deputies of Romania person ID.

"chamber" ("camera" in Romanian) → "cam" parameter

  • for senators : "cam=1" (mandatory)
  • for deputies : "cam=2" (this is the default, so the parameter can be omitted).

In conclusion, wikidata-externalid should include the "cam" (=1) parameter for senators and no "cam" parameter for deputies. - Coagulans (talk) 18:28, 20 November 2022 (UTC)
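The rule above can be sketched in Python. This is only an illustration of the proposed behaviour, not the redirector's actual code: the base URL, the `id` parameter name and the function name are all hypothetical, and only the "cam" handling comes from the description above.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration; the real target is not given here.
BASE_URL = "https://example.org/parliament/member"


def build_member_url(person_id: str, is_senator: bool) -> str:
    """Build a link for one person ID, adding cam=1 only for senators."""
    params = {"id": person_id}  # "id" is an assumed parameter name
    if is_senator:
        # Senators need the mandatory cam=1 parameter.
        params["cam"] = "1"
    # Deputies rely on the default (cam=2), so no "cam" parameter is added.
    return f"{BASE_URL}?{urlencode(params)}"
```

The point of the sketch is that the redirector needs one extra piece of information (senator or deputy) alongside the ID itself.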

Error message for missing "replaces"

At Talk:Q7013056, Egbert Benson (Q328370) gives an error saying "replaces" is empty, but we have "no value", since he was the first. This didn't use to give an error message; did something change? RAN (talk) 03:18, 18 November 2022 (UTC)

I suspect the three rows which have no start/end dates are throwing PHH. --Tagishsimon (talk) 18:17, 18 November 2022 (UTC)
@Richard Arthur Norton (1958- ), Tagishsimon: I think there are two slightly different issues at play here. The main problem is that PositionHolderHistory doesn't currently handle an explicit no value being set on replaces (P1365) or replaced by (P1366). However, it also never complains about a missing 'replaces' on the final row of any table, so the issue was presumably historically disguised by that, and now that there are other rows, the warning has appeared. Cleaning up those extra rows should make the warning vanish again, although obviously the better long-term solution is to properly handle wdno. Suggestions for how to adjust the underlying SPARQL to sensibly differentiate that from the value simply being missing are very welcome. --Oravrattas (talk) 18:49, 18 November 2022 (UTC)
@Oravrattas: This sort of thing. Presumably could be coalesced into the replaces/replaced columns if you'd prefer - https://w.wiki/5zAQ --Tagishsimon (talk) 19:02, 18 November 2022 (UTC)
  • I see, I need to create "Attorney General of Colonial New York" and migrate the two unnumbered ones. We had the same problem at Governor of New York and Mayor of New York City, where we had entries prior to the canonical numbering system. Figuring out how the numbering system works can be difficult if there is no canonical list provided by the government. --RAN (talk) 06:42, 21 November 2022 (UTC)
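To illustrate the distinction between an explicit "no value" and a value that is simply missing, here is a minimal Python sketch that works on a statement in Wikibase JSON form, where each qualifier snak carries a `snaktype` of `value`, `novalue` or `somevalue`. It is not how PositionHolderHistory works (that tool is SPARQL-driven); the function name and the returned labels are assumptions made for this example.

```python
def replaces_state(statement: dict, prop: str = "P1365") -> str:
    """Classify a qualifier on a statement as value, novalue, somevalue or missing."""
    snaks = statement.get("qualifiers", {}).get(prop, [])
    if not snaks:
        # No qualifier at all: the information is simply absent.
        return "missing"
    types = {snak.get("snaktype") for snak in snaks}
    if "value" in types:
        return "value"
    if "novalue" in types:
        # Explicit "no value": e.g. the first holder of a position.
        return "novalue"
    return "somevalue"


# Hypothetical statement for a first office holder with an explicit "no value".
first_holder = {
    "qualifiers": {
        "P1365": [{"snaktype": "novalue", "property": "P1365"}]
    }
}
print(replaces_state(first_holder))
```

A report could then warn only on the `missing` case and stay silent for `novalue`.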

should we merge standard with item being standardized?

(Continuing https://www.wikidata.org/wiki/Talk:Q6770697 with @Mormegil:)

There are cases where a standard is about exactly one thing, eg:

So two questions:

  • Should we merge (identify) the standard with its subject item in such cases?
  • Can we use ISO standard (P503) on the subject item in cases where nobody bothered to make a separate item about the standard? (see last bullet).

--Vladimir Alexiev (talk) 09:21, 20 November 2022 (UTC)

A) If ISO invented the term, you could merge them and have the commonly used version as the label 'Market Identifier Code', with aliases for 'MIC', 'ISO 10383' and the ponderous 'ISO 10383 Securities and related financial instruments — Codes for exchanges and market identification (MIC)'. And it should be an instance of ISO standard (P503).
B) But if ISO are standardising existing common usage, you can't, because there may be other interpretations of the phrase.
If there are many more B than A, we might as well have them all Bs. We certainly want every ISO standard in WD, and I expect most don't have catchy names. Vicarage (talk) 10:13, 20 November 2022 (UTC)
I would not merge them: The ISO standard is a document by its nature, the standardized object may get very different statements.--अक्कू यादव (talk) 13:22, 20 November 2022 (UTC)
Agreed on don't merge. There are often numerous standards documents or other informal sources that describe a common specification, e.g. the format of a file could be described in both an ISO and an RFC standard. ISO standards in particular are known for just copying (more or less) pre-existing industry standards. Dhx1 (talk) 04:46, 21 November 2022 (UTC)

Labelling of gender items

There are a number of items being used as objects of sex or gender (P21), mostly male (Q6581097) and female (Q6581072) of course. These two have an adjective as label (“male”, “female”). Other often used items include male organism (Q44148), female organism (Q43445), trans woman (Q1052281), trans man (Q2449503), or eunuch (Q179294), which are labelled with nouns. (Personally, I wouldn’t call eunuch a “sex or gender”, but that’s another can of worms.) Shouldn’t gender labels be consistent, either nouns or adjectives? It seems most items used as sex or gender (P21) objects have noun labels (SPARQL query), which would imply male (Q6581097) should be “man” and female (Q6581072) “woman”, but I’d like statements like “sex or gender: female” better than “sex or gender: woman”, so I personally would opt for renaming trans woman (Q1052281) to “trans female”, trans man (Q2449503) to “trans male” and so on. (Not sure about male organism (Q44148) and female organism (Q43445) and what the reason for them being distinct from male (Q6581097) and female (Q6581072) is in the first place.) Are there other opinions? --2A02:8108:50BF:C694:A4EE:1267:E564:785E 10:04, 17 November 2022 (UTC)

trans woman was recently relabeled as such. personally I think the old label "trans female" was better. BrokenSegue (talk) 17:49, 17 November 2022 (UTC)
I think this discussion should be held at https://www.wikidata.org/wiki/Wikidata:WikiProject_LGBT . Changing the names of trans identities around can easily cause offense. Wanting to have more consistent labels is, for me, not valuable enough to unnecessarily offend trans people. That WikiProject has enough people who care about the issue and thus can give good advice about what labels are good and will cause the least offense. ChristianKl 15:49, 19 November 2022 (UTC)
I don’t think changing a noun to an adjective can be offensive (or at least changing “man” to “male” etc. shouldn’t be), but talking to the WikiProject is probably a good idea. --2A02:8108:50BF:C694:9507:ACDA:C240:EA63 09:28, 21 November 2022 (UTC)
And Tagishsimon has already informed the WikiProject. --2A02:8108:50BF:C694:9507:ACDA:C240:EA63 09:33, 21 November 2022 (UTC)

Adjectives as items

According to Help:Items, items are supposed to be "things", including "topics, concepts, and objects"; in other words, nouns. For the most part, this is true, but then we have items like polychromy (Q21157531), round (Q59564206), and hot (Q28128222), which are clearly adjectives. Should these be changed to "colorfulness", "roundness", and "hotness"? At least in the case of polychromy (Q21157531), this seems to be an awkward hack to get translations of the term for Commons,[4] which I doubt is the correct way to do that. Nosferattus (talk) 02:06, 20 November 2022 (UTC)

pizza (Q177) has serving temperature (P7767) set to hot (Q28128222). I'm not sure what to make of it. Ghouston (talk) 06:59, 20 November 2022 (UTC)
In my language hot (Q28128222) is named as "high temperature". D6194c-1cc (talk) 07:21, 20 November 2022 (UTC)
@Nosferattus: I don't think you can just add the '-ness' extension to the English labels as that (noun) is a measure of that attribute which might be low or high, whereas the items themselves seem clearly to refer to the high value state. In general I'm not convinced that Wikidata items necessarily must be nouns - it would actually be helpful for lexeme translations to allow adjectives and verbs and other parts of speech also (of which we do have some like this). The vast majority of items will always be nouns of course. ArthurPSmith (talk) 18:51, 21 November 2022 (UTC)
@ArthurPSmith: But that's exactly the problem. If an item corresponds to a particular measurement or range of measurements, that not only breaks the data model in some cases, but it necessitates additional items for all the other possible measurements. If we have "hot", we also need "warm", "cool", "cold", "lukewarm", "boiling", "frigid", "icy", "mild", and eventually "70°F", "71°F", etc. But that's not what items are for. And regarding lexemes, Wikidata lexemes have their own system for translation which links to other lexemes (for example Lexeme:L2). Items shouldn't have anything to do with that. Nosferattus (talk) 20:08, 21 November 2022 (UTC)
@Nosferattus: We have items for 1969 (Q2485) and February 29, 1936 (Q69271647), not to mention freezing (Q1135221) and absolute zero (Q81182); I don't see a problem if we have items for many specific temperatures or ranges. If it makes sense as the value of a property then why not? ArthurPSmith (talk) 20:26, 21 November 2022 (UTC)

Wikidata weekly summary #547

I wanted to add the 2012 Sight & Sound Greatest Films of All Time list as an item to Wikidata and then add the part of (P361) property to the various films on the list. I think I was vaguely aware that there was already an item that referred to the list (The Sight & Sound Greatest Films of All Time 2012 (Q4835528)). However, since it is an instance of (P31) a Wikimedia list article (Q13406463), I didn't think much of it. My thought was that The Sight & Sound Greatest Films of All Time 2012 (Q4835528) isn't an item for the thing itself, but instead for a description of the thing in Wikimedia. Hence, I went ahead and created an item for the thing itself (The Sight & Sound Greatest Films of All Time 2012 (Q115273251)). When I tried to add the Wikipedia article for the list (https://en.wikipedia.org/wiki/The_Sight_%26_Sound_Greatest_Films_of_All_Time_2012) I ran into an error message that told me that I couldn't add it, because the article was already linked on The Sight & Sound Greatest Films of All Time 2012 (Q4835528).
So what is the official policy here? Should there be both an article for the Wikimedia list and the thing itself? Or should The Sight & Sound Greatest Films of All Time 2012 (Q4835528) and The Sight & Sound Greatest Films of All Time 2012 (Q115273251) be merged? -- Zamomin (talk) 00:51, 19 November 2022 (UTC)

Seems like there is a clusterfuck going on here. The record Q4835528 needs to decide what it is; right now, few of the sitelink article names marry up with the EN wikidata label. Presuming that "The Sight & Sound Greatest Films of All Time 2012" is not the same thing as "BFI The Top 50 Greatest Films of All Time", then you are right to coin a new item for it - The Sight & Sound Greatest Films of All Time 2012 (Q115273251) ... all sitelinks from The Sight & Sound Greatest Films of All Time 2012 (Q4835528) that are about the "The Sight & Sound Greatest Films of All Time 2012" should be moved to that item (i.e. delete from existing item, then add to new item). The EN wiki article https://en.wikipedia.org/wiki/The_Sight_%26_Sound_Greatest_Films_of_All_Time_2012 is not a list article (i.e. not a wikimedia list). It's unlikely that Q4835528 should be a Wikimedia list article (Q13406463), but we won't know that until we've made up our mind what it is about. Policy is that wikimedia lists take Wikimedia list article (Q13406463) and third-party lists take values such as you have used in The Sight & Sound Greatest Films of All Time 2012 (Q115273251). --Tagishsimon (talk) 16:02, 19 November 2022 (UTC)
@Tagishsimon: Thanks for taking the time to answer. A bit of domain knowledge from my side might be helpful here (there is a TLDR at the end, in case things get too nerdy): "Sight & Sound" is a publication by the British Film Institute (BFI). Hence, saying "The Sight & Sound Greatest Films" is the same as saying "BFI Greatest Films". I looked into the history of the Wikipedia article about the list and it turns out that it originally contained only 50 entries. As its sole reference it links to http://www.bfi.org.uk/news/50-greatest-films-all-time, which these days redirects to https://www.bfi.org.uk/sight-and-sound/greatest-films-all-time. The current version of the website lists 100 entries. So my best guess is that originally a shorter version of the list (50 entries) was published and that it was later expanded to 100 entries. The Sight & Sound Greatest Films of All Time 2012 (Q4835528) was created based on this older version of the list. I compared the top 50 films listed on the modern version of the BFI site with the old version of the Wikipedia article and they are identical. I would say it makes no sense having a Wikidata item for a short and one for the long version of the same list. TLDR: Q4835528 and Q115273251 refer to the same thing.
My best guess is that the best way to move forward would be to delete the new item that I created (Q115273251) and bring the title of The Sight & Sound Greatest Films of All Time 2012 (Q4835528) in line with the Wikipedia article (i.e. rename it to "The Sight & Sound Greatest Films of All Time 2012"). -- Zamomin (talk) 01:41, 22 November 2022 (UTC)

Matching identifiers as reference

NIOSH publications have the feature that, when a publication is translated into a new language, both versions have the same NIOSH Numbered Publication ID (P4596). So I've been identifying sets of items with matching P4596 values and preparing to add has edition or translation (P747) and edition or translation of (P629) statements. The problem is what to use as a reference.

I suppose the best choice would be to use based on heuristic (P887) with value "inferred from identifier match" (which doesn't currently exist), although inferred from statements (Q114963892) seems to be the closest existing value. The problem is that, in either case, there should be an "identifier used" property that somehow has the value P4596 to specify what was the thing that was matching. There doesn't seem to currently be such a property. Is there another way to do this, or should such a property be created? John P. Sadowski (NIOSH) (talk) 04:55, 21 November 2022 (UTC)

@John P. Sadowski (NIOSH) I'd suggest using the identifier directly as a property in the reference. And maybe also include some explanatory based on heuristic (P887). Vojtěch Dostál (talk) 11:32, 22 November 2022 (UTC)
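The matching step described above (finding sets of items that share one identifier value) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up, and the item IDs and publication IDs in the usage example are placeholders, not real NIOSH data.

```python
from collections import defaultdict


def group_by_identifier(pairs):
    """Group item IDs by a shared identifier value, keeping only real matches.

    `pairs` is an iterable of (item_id, identifier_value) tuples.
    """
    groups = defaultdict(list)
    for item_id, identifier in pairs:
        groups[identifier].append(item_id)
    # Only identifiers used by two or more items indicate an edition/translation set.
    return {ident: items for ident, items in groups.items() if len(items) > 1}


# Placeholder data for illustration only.
sample = [
    ("Q_ITEM_A", "PUB-001"),
    ("Q_ITEM_B", "PUB-001"),
    ("Q_ITEM_C", "PUB-002"),
]
print(group_by_identifier(sample))
```

Each resulting group is a candidate set for has edition or translation (P747) and edition or translation of (P629) statements, with the shared identifier available to cite in the reference.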

wikipedia list generator handling multi-value qualifiers

In https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Ships#End_date_of_a_ship we are discussing how to add dates to ships in such a way that it's compatible with wikipedia list generators (ships built by a shipyard). It makes a lot of sense to use significant event (P793), but apparently the wikipedia list generator, while it can handle qualifiers to a limited extent, struggles with the multiple dates in a ship's history (See Emma Maersk in https://no.wikipedia.org/wiki/Mal%3AListeboks_Odense_Staalskibsv%C3%A6rft). Is there a more flexible list generator @Cavernia: can use? Vicarage (talk) 17:42, 21 November 2022 (UTC)

For those who don't understand Norwegian: The column "Utfaset" fetches service retirement (P730) and the qualifier point in time (P585), "Hendelser" lists all significant event (P793) entries with qualifier point in time (P585), and "Sjøsetting" lists the qualifier point in time (P585) for significant event (P793) entries with ship launching (Q596643). The goal is to be able to present the final destiny of a ship the way it is presented in column "Utfaset", but by using the property significant event (P793) instead of service retirement (P730). --Cavernia (talk) 17:55, 21 November 2022 (UTC)
Listeria, aka Wikidata list, does not seem to be struggling with qualifiers for Emma Mærsk (Q477173). The item has 4 dated significant event statements, and Listeria reports all four values & all four dates. Listeria is driven from a SPARQL report, and can almost certainly be made to do *exactly* what you want it to do, albeit at the cost of more specification within the SPARQL report and less in the columns= parameter. So if, for instance, you wished for all significant events except the launch in one column, and the launch, derived from a significant event statement, in another column ... that's completely possible. So, idk. Which direction do you want to go in? --Tagishsimon (talk) 18:20, 21 November 2022 (UTC)
Yes, in the documentation it is mentioned that you can use SPARQL to generate whatever results you want by defining variables. I have tested it, but it doesn't seem to work. If you can give me an example of how this is done in a Wikidata List, I would be thankful. --Cavernia (talk) 18:47, 21 November 2022 (UTC)
@Cavernia: For example, in https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Women_in_Red/Missing_articles_by_occupation/Architects the SPARQL fetches the country label, place of birth label, and place of death label manually (because accessing large items is sometimes a failure mode in Listeria, afaik), as well as the number of sitelinks. The variables are in the SELECT. The columns= parameter references these variables, e.g. columns=?pobLabel:place of birth,?linkcount:site links. If you'd like SPARQL/Listeria help, put a note on Wikidata:Request a query. hth --Tagishsimon (talk) 19:37, 21 November 2022 (UTC)
Should say, it's mainly the three OPTIONAL{} at the end of the query, in the above example, not all the gook which comes before. --Tagishsimon (talk) 19:40, 21 November 2022 (UTC)
Thanks, after some struggling I was able to use variables from Wikidata List now. Trying to request a query to solve the rest. --Cavernia (talk) 10:19, 22 November 2022 (UTC)

The Wikidata Library

Currently, access to resources is via The Wikipedia Library. When will Wikidata have representation regarding who gets access to the resources? RAN (talk) 03:28, 17 November 2022 (UTC)

Are you looking for Wikipedia Library content to be expanded to include access to closed databases/data sets that are generally most useful to Wikidata? Or concerned that access to existing Wikipedia Library content is being prioritised towards Wikipedia editors updating Wikipedia articles, as opposed to Wikidata editors updating Wikidata items? Dhx1 (talk) 12:01, 17 November 2022 (UTC)
  • I have been blocked from using the Library since being blocked from English Wikipedia, despite donating over 5,000 hours to the project here. The one person there decided to remove all my library rights. I no longer have access to birth, marriage and death dates via Ancestry, Fold3, and newspaper.com. I would fix a dozen missing dates a day, and fill in missing middle names. I need an advocate from Wikidata to speak out for me. The Library doesn't respect my work here. --RAN (talk) 09:31, 19 November 2022 (UTC)
    Have you considered posting on the talk page for the Library? I would include an acknowledgement that you may be better suited to WD and Commons than to WP. DS (talk) 15:17, 22 November 2022 (UTC)

Where should I put the instances list for regions of Italy?

There are 2 obvious items for that, as below, but I am not sure of the difference.



Another question is which property to use. Should I use contains the administrative territorial entity (P150) or has part(s) (P527) ?


Thanks JuguangXiao (talk) 21:12, 20 November 2022 (UTC)

One item is a list of regions of Italy and one is the item representing the concept of regions of Italy. You probably want to use region of Italy (Q16110). BrokenSegue (talk) 21:17, 20 November 2022 (UTC)
Thanks. But region of Italy (Q16110) is semi-protected. Is there any way to unlock it? JuguangXiao (talk) 04:01, 21 November 2022 (UTC)
What change do you want to make to it? I don't understand. BrokenSegue (talk) 07:10, 21 November 2022 (UTC)
I want to add the instances of region of Italy, such as Lazio (Q1282), etc., to answer the question _what are_ the regions of Italy, just like United States of America (Q30), which lists all states. JuguangXiao (talk) 08:13, 21 November 2022 (UTC)
OK. I realized Italy (Q38) also has the list via P150. But what is region of Italy (Q16110) or list of regions of Italy (Q21235336) for? Going by the names, I would expect to find the instances of regions of Italy there. JuguangXiao (talk) 08:19, 21 November 2022 (UTC)
I know I can run a query, like `SELECT ?region ?regionLabel WHERE { ?region wdt:P31 wd:Q16110 . SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } }`, but it requires people to run a query... JuguangXiao (talk) 09:23, 21 November 2022 (UTC)
I'm still not sure I understand what you want to know. The items don't need to be "for" anything. We need to have an item for every Wikipedia article. We could use region of Italy (Q16110) as instance of (P31) for the various regions of Italy but we don't have to. In practice it is being used for example in Lombardy (Q1210). list of regions of Italy (Q21235336) likely will have no use. There are lots of questions that can only be answered by running a query. Wikidata does not store all data in tabular form and it's ok if a query is sometimes needed to get an answer. BrokenSegue (talk) 17:54, 21 November 2022 (UTC)
Thanks for your thoughts. As for my requirement, I simply want to know all instances of region of Italy (Q16110). Yes, I can run a query, but without a query it seems impossible to find what I am asking for. This is a discoverability issue: it is a natural thing to ask for a list of finite elements (I did not ask for a list of people), and there is an item with such a name, but it fails to provide what its name implies. JuguangXiao (talk) 15:24, 22 November 2022 (UTC)
@JuguangXiao Please consider that Wikidata is a database, not a website dedicated to presenting information. You have not explained why it would be practical to curate lists of objects in class items such as region of Italy (Q16110), when this information can be retrieved by a more sophisticated and always up-to-date method. Vojtěch Dostál (talk) 19:41, 22 November 2022 (UTC)
Thanks. If, as you said, Wikidata/base is a database, then running a query is the natural way to do it. You are right that maintaining bidirectional relationship data by pointing to each other is not a good idea. I'll do queries. JuguangXiao (talk) 21:08, 22 November 2022 (UTC)
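For readers who land on a class item and want its instances, the inline query quoted earlier in this thread generalizes to any class. Below is a minimal Python sketch that only builds the query text and does not contact the query service; the helper name `instances_query` and its validation rule are assumptions made for this example.

```python
def instances_query(class_qid: str, language: str = "en") -> str:
    """Return a SPARQL query listing direct instances (P31) of a class item."""
    # Basic sanity check so that values like "16110" (missing the Q) are rejected.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        "  SERVICE wikibase:label { "
        f'bd:serviceParam wikibase:language "{language}" . '
        "}\n"
        "}"
    )


# Query text for the instances of region of Italy (Q16110).
print(instances_query("Q16110"))
```

The printed text can be pasted into the Wikidata Query Service, which keeps the list current without curating it by hand on the class item.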

Require autoconfirmed to create items

I'm the creator of the content creator (Q109459317) item and so I get notifications when people link to it - particularly people who want to create Wikidata items for themselves or others for self-promotion purposes and who are not notable. I'm seeing the trend that they'll create an account to create the item and that self-promotion item is the first edit they make. It's really annoying at this rate (happens 4-5 times per week) and it should stop.

I think we should make it so that only autoconfirmed users can create items, showing they at least have a bit of experience with Wikidata and by then they should figure out they can't create items that aren't notable. Creating an account is too easy at this point. Lectrician1 (talk) 18:17, 21 November 2022 (UTC)

'Make newbies life more difficult so that Lectrician1 does not get so annoyed' does not sound a very compelling proposal. There's no analysis of the issues involved here, just a knee-jerk. --Tagishsimon (talk) 18:27, 21 November 2022 (UTC)
You should not be creating items if you don't even know how to add statements. Lectrician1 (talk) 18:58, 21 November 2022 (UTC)
They seem to be capable of adding a content creator (Q109459317) statement :). I take from MisterSynergy's contribution, below, that newbies do a lot of good work, and so you need to set your concern against a) dissuading newbies by adding another hurdle b) losing contributions b/c hurdle c) whether or not the hurdle's imposition makes newbies better once they are autoconfirmed d) whether or not forcing users to work on some other items is rational if their impetus is to add this item and they have no particular interest in or knowledge of other items e) whether the hurdle will result in users making howevermany really bad edits simply to get confirmed ... all that versus how many non-notable subject item creations will be stopped (and, perhaps, what proportion of non-notable subject item creations that represents). --Tagishsimon (talk) 19:53, 21 November 2022 (UTC)
but we don't even require you to be logged in to create new items. BrokenSegue (talk) 18:42, 21 November 2022 (UTC)
What we really need is (personal opinion follows) a ban of IP edits in Wikidata. But we'll need to make a stronger case than anecdotal evidence and personal feelings. Vojtěch Dostál (talk) 18:53, 21 November 2022 (UTC)
Informational only: IP users and registered newcomer users (not yet (auto)confirmed) have created over 8200 data items in the past 30 days, excluding already-deleted items. This is a fairly usual number from my patrolling experience. —MisterSynergy (talk) 19:05, 21 November 2022 (UTC)
This might of course curb creation of spam items (at least in the short run). But it would also be a burden for our efforts to attract bona fide new users. Think of projects, academic and otherwise, that are set up by Wikidata novices who, naturally, have to create new items for their sometimes pretty niche (yet important) areas of interest. It’s also not good policy to impede participation without really solid evidence. --Emu (talk) 19:54, 21 November 2022 (UTC)
What would constitute "solid evidence" here? — The Erinaceous One 🦔 20:43, 22 November 2022 (UTC)
@The-erinaceous-one There's many things we don't know. For example, % of items created by new users/autoconfirmed users, which are later deleted. Or % of edits by IPs, new users and autoconfirmed users, which are later reverted. And last but not least, time spent by other users cleaning up the 'mess' which might be otherwise spent more productively. Vojtěch Dostál (talk) 20:55, 22 November 2022 (UTC)
Some actual numbers:
  • 800+ items created by IP users in the past 30 days have been deleted meanwhile
  • IP+newcomer users usually make 175,000–250,000 edits per month
  • The revert ratio of IP+newcomer users is usually <10% of all their edits; most reverts happen during the first two days after an edit has been made
  • 21k different IPs have been used during the past 30 days, and 26k different registered accounts; both are fairly usual numbers for a 30-day period. IPs can change of course, but I'd estimate that around 25% or more of all individuals editing Wikidata do this without being logged in which is a substantial part of the workforce here.
  • Nevertheless, IP+newcomer editors make up for only ~1% of Wikidata's edits; mainly because they usually do not use automated editing tools, and if so they are strictly ratelimited.
Personal opinion: I use both the patrol function and the delete function more than any other community member here (by quite a margin in fact), and I do not think that IP/newcomer editing is a problem to worry about. It does make some extra efforts necessary, but nothing that even would remotely threaten the integrity of this project. —MisterSynergy (talk) 21:16, 22 November 2022 (UTC)
Do you have statistics with separate data for IPs and newcomers? Most newcomers are one-edit wonders, some are vandals, very few become long-term editors.
I'm currently a new user. IMHO it's not logical that newcomers can't remove statements without getting schooled by a big red warning, but are free to add statements. Adding trash is also vandalism.--Pinda Chinees (talk) 23:49, 22 November 2022 (UTC)
I can distinguish between IPs and newcomers in most situations, but deleted contributions are in fact a bit difficult. Usually there are twice as many IP edits as newcomer edits per period of time, and a first guess would be that this ratio somehow translates to deleted contributions as well. —MisterSynergy (talk) 00:03, 23 November 2022 (UTC)
For reference, you can look through the list of new pages recently created by unregistered users here. — The Erinaceous One 🦔 04:17, 23 November 2022 (UTC)
Always good to be lectured on the characteristics of newcomers ("Most ... are one-edit wonders, some are vandals, very few become long-term editors") by a user who joined wikidata <checks notes> yesterday. --Tagishsimon (talk) 05:07, 23 November 2022 (UTC)
@Tagishsimon: Newcomer or not, you still need to be polite to them. Please avoid sarcastic comments. — The Erinaceous One 🦔 06:25, 23 November 2022 (UTC)
@MisterSynergy Thanks for the numbers, although it would be good to see them in perspective (comparing IPs vs newcomers vs autoconfirmed users in all statistics). I am still making up my mind over this whole thing. I must say I value your patrolling a lot and it is impressive that you are making such a difference. What would happen if you stopped editing today? If the number of unreverted poor edits/non-deleted poor items rose, would that not indicate that the current mode of operation is not sustainable? Vojtěch Dostál (talk) 09:05, 23 November 2022 (UTC)
No question that IPs/newcomers are responsible for practically all vandalism in this project. Experienced editors simply do not risk their reputation with such behavior; newcomers and IP users are pretty similar in behavior as much as I am aware, but I could look for quantitative numbers if you are interested. (I do have a tool that periodically performs a lot of data analysis on IP+newcomer recentchanges edits, and I could easily extend it).
Regarding my own efforts: I am using "patrol" a lot and more than others, but this does not mean that I am the only one here who looks for vandalism. In fact, I am relatively selectively patrolling certain types of edits, particularly those involving German terms and a couple of other things. If I see collateral vandalism involving other content, I remove it as well of course. Regarding patrolling, there would likely be some cases which remain unattended for a somewhat longer time if I were to leave, but nothing would seriously break. Yet, if more users engaged in this activity, I would not complain of course. Main problems of patrolling are: generally poorly understood, poor tooling, not a rewarding activity, tedious since most edits are actually fine.
Re. deletions the situation might be different. I am no longer making 50-60% of all deletions as I used to a couple of years ago after Dexbot started deleting items as well, but it is still a very significant part and more than what others do per [5]. Not sure whether the other admins would be willing to take this over. There would probably be an increasing amount of items that are not notable remaining in main namespace with no attention. This is not limited to IP editors, however, since many or even most of my deletions affect items of seasoned editors/bots.
There is a general problem in this community that maintenance tasks are not very popular, and this is not limited to the admin job and patrolling (property creations come to mind immediately, but there are other fields as well). Yet, this is a bit unfortunate, but not dangerous to the project at this point. ---MisterSynergy (talk) 09:30, 23 November 2022 (UTC)

Why are these items getting corrupted like this?

I recently used my wikidata model (discussed above) to identify items that are misclassified. The most suspicious items are listed at User:BrokenSegue/PsychiqConflicts. But one really common pattern is the following:

Now I thought that when pages moved the reference gets changed on Wikidata. But this isn't happening? Is there something we can change? Can we ask Wikipedia users to be more careful? There are probably hundreds or thousands of errors like this. BrokenSegue (talk) 01:21, 23 November 2022 (UTC)

Probably worth raising at Wikidata:Report a technical problem. The pattern /seems/ to be: pagemove to a hitherto unused name -> WD sitelink is updated; pagemove over an existing page -> WD sitelink is not updated. If so: a pagemove over an existing page is a routine event; why is it not being recognised / acted on? --Tagishsimon (talk) 02:26, 23 November 2022 (UTC)
Krdbot updates links after pages are moved over redirects (and possibly others that are not automatically updated). There seems to be a delay updating Wikidata when a page is moved, and it looks like Wikidata is not updated when a page is deleted and another page immediately moved over it, or when a page is moved and the redirect replaced with other content such as a disambiguation. The bot wouldn't be able to update the moved page to a title still linked to another item. In most cases it would be correct to remove the other sitelink, but occasionally these are partial history merges. Peter James (talk) 17:27, 23 November 2022 (UTC)

Problem creating an article

I would like to create the Spanish section of the article "Normes ortogràfiques", but there is an error and it won't let me. Could you help me, please? Mireiavilabonet (talk) 13:24, 22 November 2022 (UTC)

If you ask for help online with an error, always copy the text of the error so that the people you are asking can understand your problem. ChristianKl 15:02, 22 November 2022 (UTC)
I assume you are referring to Normes ortogràfiques (Q19257154)? By the way, there is also Wikidata:Café for discussions in Spanish. --Data Consolidation Officer (talk) 22:07, 23 November 2022 (UTC)

Warning for editing while not logged in: Status quo not ideal

Currently, users that aren’t logged in aren’t warned until after the Publish button is pressed. This is not ideal, since after a clean browser start central login would fail to log in to Wikidata (e.g., on modern privacy-oriented browsers), so the first edit is almost always guaranteed to be an anonymous edit. For people concerned with IP leakage this is a problem.

I suggest that the warning be displayed after the Edit button is pressed, so that anonymous users are immediately warned if they expect to have been already logged in but are not. Al12si (talk) 22:44, 21 November 2022 (UTC)

Is this a Wikidata issue or a general Mediawiki issue? Does Wikidata behave differently than Wikipedia in this regard? ChristianKl11:22, 22 November 2022 (UTC)
This is specific to editing Wikidata properties as far as I can tell. (It could be a general Mediawiki issue but Wikipedia isn’t using the properties editor, if this is a general Mediawiki feature.) Sorry for forgetting to mention this detail. Al12si (talk) 15:33, 22 November 2022 (UTC)
@Al12si: I can’t reproduce this issue – in a private window, as soon as I start to edit the labels or statements of an item, I get a warning “You are not logged in” (upper right corner of the screen). The warning doesn’t go away on its own either. What does the warning look like that you get after publishing? (I’m wondering if it’s something else.) Lucas Werkmeister (WMDE) (talk) 15:48, 23 November 2022 (UTC)
There's a warning in the upper right of the screen, but nothing happens when the user presses "publish". I think Al12si is asking for some sort of popup that asks "Do you really want to make this edit anonymously? Yes/No". ChristianKl 16:59, 23 November 2022 (UTC)
That’s odd. I’m not seeing any warnings that pop up at the top-right corner (similar to what I see on Wikipedias), only a popup after I hit publish when I change a data item. I’ll take detailed notes next time my browser crashes and report back here. Al12si (talk) 18:34, 23 November 2022 (UTC)
@Al12si: You can also open a Phabricator task – I find it easier to attach images there, which might be helpful in this case. (But then, I’m quite used to Phabricator ^^) Lucas Werkmeister (WMDE) (talk) 11:45, 24 November 2022 (UTC)
@ChristianKl @Lucas Werkmeister (WMDE) My browser crashed today and I took notes. Someone seems to have taken note of this and changed something; Wikidata is now doing a pop-up warning at the top-right hand corner. There is also no longer a warning after I pushed Publish.
I’m not sure if this is ideal, since as a user I guess I’d have preferred something to pop up closer to the Wikidata property (where my eye would be focusing at), but I guess this is an improvement. Thanks! Al12si (talk) 19:50, 1 December 2022 (UTC)

Item to be restored?

Q24699193 has the same subject as Q108804436; shouldn't it be restored hence? Nomen ad hoc (talk) 07:13, 24 November 2022 (UTC)

As I said here: If Q24699193 in its original form represents the same person as Q108804436, then it should be recreated, rolled back to its stable state, and the items merged into the older version. If the contention is that the subject of Q108804436 is not notable, then that should be nominated for deletion and discussed accordingly. There is no logical basis for keeping Q108804436 and not Q24699193, if the subjects are the same. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:48, 24 November 2022 (UTC)
It all depends on your philosophy of the meaning of an item. In this case, the item was repurposed for some reason and changed its meaning completely for quite some time. While I generally agree that the original meaning of the item is the true meaning and deletions after splits should generally be avoided, in this case, the deletion was in all probability warranted. --Emu (talk) 09:22, 24 November 2022 (UTC)
Agree, Q24699193 had originally been about the same person as Q108804436, but User:Nomen_ad_hoc has repurposed it at some point so it has described the same person as in Q57079316 for a substantial amount of time. Q24699193 is a conflation due to editorial mistakes made in the past and should not be restored or redirected anywhere. ---MisterSynergy (talk) 09:33, 24 November 2022 (UTC)

Inferring instance/subclass of using machine learning

I've been working on a small project to try to improve our categorization of items linked to Wikipedia articles.

Take a look at psychiq. There's a UI where you can play with the model. Try to look at the examples.

My thinking is that if this gets to be high quality enough we could do this automatically (like a smarter version of the NoClaims bot). And even if it never is good enough I could link it up to my extension wwwyzzerdd and provide a semi-automated experience for adding P31/P279 to articles.

Feedback appreciated. It has only had time to train on 1/4th of the total training data and I made a few mistakes when building the training data so it could get much better. BrokenSegue (talk) 07:12, 10 November 2022 (UTC)

Interesting idea. Could you change the interface so it deduces the categories from the page itself, and adds the item labels to the output rather than just use Qs. And do you have examples where the item really should be in multiple classes? Vicarage (talk) 08:07, 10 November 2022 (UTC)
So the current user interface is totally notional and just for demonstration. The desired endpoint would be a slick UI (either a plugin or browser extension) that automatically loads a dropdown of suggested QIDs (tagged with the appropriate language labels) somewhere on the enwiki page itself. The model is trained on cases where there are multiple classes but this situation is pretty rare. BrokenSegue (talk) 14:36, 10 November 2022 (UTC)
A thought on this. How good does it have to get “to be high quality enough we could do this automatically”? For example, with 99.9% accuracy – which I’d consider tremendous –, when adding statements to one million items you’d still expect 1000 errors (erroneous statements added or statements with at least one erroneous statement added, depending on how you count). Manually correcting 1000 statements would be quite cumbersome, and you’d have to be aware of which are erroneous in the first place. On the other hand, I’ve no idea how many items are currently lacking instance of (P31) or subclass of (P279) statements or could otherwise benefit from the proposed automation, so I don’t know whether one million is a realistic estimate. There could also be a way to get “confidence” estimates out of the predictor. So I’d say the potential is there but the applicability depends on the details. --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 11:48, 10 November 2022 (UTC)
How good automated edits need to be is already an open question. The NoClaimsBot is already doing work like this and it is definitely not 100% accurate (I recently corrected one of its edit types that it had been doing "wrong" for a long time) and we don't even know how inaccurate it is. I don't think there is a consensus on how accurate automated edits need to be. There are lots and lots of items with enwiki articles that have effectively zero statements. I think there is value in this tool even if we decide never to 100% automate it. We also can set a very conservative threshold such that it only edits when it is supremely confident. BrokenSegue (talk) 14:44, 10 November 2022 (UTC)
Also accuracy should be thought of more broadly. If we tag something as an instance of "event" but it's labeled as a "battle" then it should be awarded partial credit since one is a subclass of the other. BrokenSegue (talk) 14:53, 10 November 2022 (UTC)
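The "partial credit" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the toy hierarchy, the function names and the decay factor of 0.5 per level are all hypothetical and are not part of the psychiq model or its evaluation.

```python
# Toy subclass-of (P279) edges, child -> parent. Hypothetical data, not real Wikidata.
PARENT = {
    "battle": "historical event",
    "historical event": "occurrence",
    "house cat": "mammal",
    "human": "mammal",
}

def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain

def credit(predicted, actual, decay=0.5):
    """1.0 for an exact match, decay**k if predicted is k levels above actual, else 0.0."""
    if predicted == actual:
        return 1.0
    chain = ancestors(actual)
    if predicted in chain:
        return decay ** (chain.index(predicted) + 1)
    return 0.0

def partial_credit_accuracy(pairs):
    """Average credit over (predicted, actual) pairs."""
    return sum(credit(p, a) for p, a in pairs) / len(pairs)
```

With this scoring, predicting a superclass of the true class earns some credit, while predicting an unrelated class (or a sibling) earns none; whether siblings should also earn credit is left open here.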
Yes, defining accuracy is not easy when dealing with hierarchical data. Erroneously substituting event for battle is much more venial than, say, cat for human. I’d say that a sufficiently accurate system with a confidence estimate correlating closely with actual errors is most promising. The bot could keep a record (in its user namespace) of statements added with low confidence, to be double-checked. Thresholds could then pragmatically be set in such a way that the number of low-confidence edits remains tractable. (Reverting an erroneous bot edit is probably easier than making the same edit oneself.) --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 15:47, 10 November 2022 (UTC)
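A minimal sketch of the threshold idea above, assuming the predictor returns a confidence value between 0 and 1 for each prediction. The function name, the tuple layout and the default threshold of 0.99 are illustrative assumptions, not how any existing bot works.

```python
def triage(predictions, threshold=0.99):
    """Split (item, predicted_class, confidence) tuples into two lists.

    Predictions at or above the threshold would be applied automatically;
    the rest would go to a list for human double-checking.
    """
    auto, review = [], []
    for item, cls, conf in predictions:
        if conf >= threshold:
            auto.append((item, cls, conf))
        else:
            review.append((item, cls, conf))
    return auto, review
```

Raising the threshold shrinks the automatic list and grows the review list, so it could be tuned until the number of low-confidence edits stays tractable.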
I'd start by using the bot to flag up existing classifications it disagreed with, for human confirmation. A checked set could be added to the training set to improve its weaknesses. That way it wouldn't be actually adding information until it matched a human classifier. Vicarage (talk) 18:32, 10 November 2022 (UTC)
If we filter down to cases where the model is confident I expect most of the errors to be in confusing very similar QIDs like written work (Q47461344) and literary work (Q7725634). BrokenSegue (talk) 20:05, 10 November 2022 (UTC)
By the way, are there accuracy or other performance metrics available? I don’t see any in the model card… --2A02:8108:50BF:C694:E83C:EBFF:49AF:23CC 15:52, 10 November 2022 (UTC)
I totally agree with you. As for performance metrics, I am having trouble getting enough GPU time to finish training the model. The current epoch zero training cross-entropy loss is around 0.54. I don't have any code to compute the more expansive version of accuracy/precision. BrokenSegue (talk) 16:36, 10 November 2022 (UTC)
@ 65.109.60.7 17:08, 24 November 2022 (UTC)

Ok I managed to finish training the model. Accuracy is 84% and top 3 accuracy is 97%. I updated the model on huggingface (note that I changed the input format slightly). BrokenSegue (talk) 05:50, 13 November 2022 (UTC)

Good job! I’m afraid 97 % top-3 accuracy doesn’t help much (we still don’t know which of the first three predictions is the right one; however, the probabilities of the three highest-ranking predictions could possibly be used for some kind of confidence measure), but 84 % accuracy is also quite high, especially given that the number of possible output categories is very high (I presume) and they are probably not too skewed. It would also be interesting to have data for some kind of, say, “badness-of-substitution-aware” accuracy (see above), which we still haven’t defined yet, unfortunately. --2A02:8108:50BF:C694:C89D:B1AC:1FCB:92D7 23:31, 13 November 2022 (UTC)
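For illustration, top-k accuracy and one possible confidence measure built from the highest-ranking probabilities might look like the sketch below. It assumes each prediction is available as a dict mapping class QIDs to probabilities; that layout and the function names are assumptions, not the model's actual output format.

```python
def top_k_accuracy(prob_rows, labels, k=3):
    """Share of rows whose true label is among the k most probable classes."""
    hits = 0
    for probs, label in zip(prob_rows, labels):
        top = sorted(probs, key=probs.get, reverse=True)[:k]
        hits += label in top
    return hits / len(labels)

def margin(probs):
    """Gap between the two highest probabilities; a small gap suggests an uncertain prediction."""
    first, second = sorted(probs.values(), reverse=True)[:2]
    return first - second
```

The margin is only one candidate for a confidence measure; whether it correlates well with actual errors would have to be checked on held-out data.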
I quoted top-3 accuracy because there are interactive modes where the model can be used where we can display 3 choices to the user and they will pick amongst them. Yeah the number of possible outputs for this model is 1000 so even 84% is quite good. I'm still unsure how to model "badness-of-substitution-aware" other than manually going through and labeling all pairs of answers with a "badness" metric BrokenSegue (talk) 18:18, 16 November 2022 (UTC)
Oh, yes, I wasn’t thinking about an interactive mode. That makes sense. “Badness of substitution”… possibly EMD, or something similar. Intuitively, only going up in the hierarchy (“event” instead of “battle”: battle (Q178561) → historical event (Q13418847) → past occurrence (Q110227435) → occurrence (Q1190554) alias “event”) should be penalized less than going down somewhere (“cat” instead of “human”: human (Q5) → mammal (Q110551885) → captive mammal (Q57812611) → domesticated mammal (Q57814795) → house cat (Q146) alias “cat”). Computing that could be a pain though, having to search for a shortest undirected subclass of (P279) path between the predicted and the correct class. On the other hand, this is only necessary for the 16 % items with incorrect predictions (but the shortest-path code would have to be written anyway). --2A02:8108:50BF:C694:6999:87ED:F823:6BBD 22:05, 16 November 2022 (UTC)
yeah I haven't written the code for this. Computing the distance between all pairs of the 1000 classes I'm predicting sounds really unfun. And a lot of the time "acceptable substitutes" aren't (grand)parents of one another but are cousins. The memory required for doing this math is bad. Basically have to store the entire graph in memory. Really not fun. BrokenSegue (talk) 20:24, 20 November 2022 (UTC)
The sophisticated measure doesn’t have to be computed for the items where the prediction was correct, and for the others, the path(s) between the actual and the predicted class could be determined on-the-fly (that would at least not be that memory-intensive). As for the cousins, moving in the hierarchy (along a path from the actual to the predicted class) could have costs/penalties associated, with upward steps being relatively cheap and downward steps expensive; the cost of downward steps could even be different depending on where in the hierarchy they are taken (in order to not penalize “acceptable” cousins too much), but all of that remains pretty abstract (what are good values for the costs?).
Another idea: I don’t know whether it would be possible to “gamify” this evaluation once the bot is running. Instead of just correcting predictions by the bot, Wikidata users could rate the acceptability of the substitute on some Likert scale. This couldn’t be used for an evaluation before introducing the new bot, but maybe afterwards for finding remaining weaknesses. I have no idea, though, whether such a gamification would be possible technically; if users have to manually detect whether an instance/subclass they correct was added by the bot, and if they have to leave their assessment on some distant user page, many won’t do this. There would have to be some kind of gadget to support such an evaluation.
Or the 16 % incorrect predictions in your evaluation (if not too many in absolute numbers, alternatively only a sample of them) could be collected somewhere and assessed by volunteer users (if there are any). --2A02:8108:50BF:C694:15E5:5457:C449:FFAA 11:50, 23 November 2022 (UTC)
Ah, the latter is actually something like what you already have, plus assessment column(s). --2A02:8108:50BF:C694:15E5:5457:C449:FFAA 13:32, 23 November 2022 (UTC)
all good ideas. I'll think about it some more. BrokenSegue (talk) 22:53, 23 November 2022 (UTC)
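The cost idea discussed in this thread (cheap upward steps, expensive downward steps along a subclass-of path) could be sketched with a standard Dijkstra search. Everything here is an assumption for illustration: the toy hierarchy, the function name and the cost values 1.0 and 3.0 are hypothetical, and a real evaluation would need the relevant part of the P279 graph.

```python
import heapq

# Toy subclass-of (P279) edges, child -> parents. Hypothetical labels, not real Wikidata data.
PARENTS = {
    "battle": ["historical event"],
    "historical event": ["occurrence"],
    "human": ["mammal"],
    "house cat": ["domesticated mammal"],
    "domesticated mammal": ["mammal"],
}

def substitution_cost(actual, predicted, up_cost=1.0, down_cost=3.0):
    """Cheapest path cost from actual to predicted, or None if they are unconnected."""
    # Invert the edges so that downward steps (parent -> child) can be followed too.
    children = {}
    for child, parents in PARENTS.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    best = {actual: 0.0}
    queue = [(0.0, actual)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == predicted:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        steps = [(p, up_cost) for p in PARENTS.get(node, [])]
        steps += [(c, down_cost) for c in children.get(node, [])]
        for nxt, step in steps:
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return None
```

In this toy graph, predicting "occurrence" for a battle costs two cheap upward steps, while predicting "house cat" for a human costs one upward and two expensive downward steps. As noted in the thread, the search only needs to run for the incorrect predictions.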

Nonbinary genders vs. groups of humans

I’ve turned vakasalewalewa (Q108266757) into an instance of (P31) group of humans (Q16334295) (instead of instance of (P31) gender identity (Q48264)) because the articles behind all sitelinks seem to describe a social group rather than a gender identity (English: “are people from Fiji”, French: “sont des personnes des Fidji”, Spanish: “son personas de Fiyi”, Catalan: “són persones de Fiji”). What characterises this group, though, seems to be their members having “a traditional third gender identity, culturally specific to the country” (enWP). Should there be some (separate) item for that gender identity, and if so, how should it be linked to vakasalewalewa (Q108266757)? Plus: Should such an item be used for sex or gender (P21), or is third gender (Q48279) preferable? --2A02:8108:50BF:C694:A4EE:1267:E564:785E 14:07, 17 November 2022 (UTC)

I do not have a good answer for you. I think that the talk page of Wikidata:WikiProject LGBT is the most likely place to eventually attract comment. Alternatively, you can propose thoughts here. If you like, you can ping anyone who participated in any related discussion you find in the list of links I just shared.
There is no WikiProject for organizing social groups. There is a Wikidata:WikiProject Biography, and all the related biographical WikiProjects are listed at Wikidata:WikiProject Biography/Model.
The point that I think you are trying to make is that there are two classifications for different purposes - "gender identity" and demographic profile (Q5932254). I think you are doing demographics. We do not have a WikiProject for demographics but we need one, especially for its appearance in academic literature in health, economics, climate change, and everything else. We do not have strong models for this. Thoughts? Bluerasberry (talk) 14:28, 24 November 2022 (UTC)
I don’t know whether what I’m trying to do here is demographics, but I have the (maybe wrong) impression that the concept of a gender (identity) is different from the concept of a social group characterised by the gender. Or at least I doubt that people would define male (Q6581097) or female (Q6581072) by starting with “Males are people” (or “Females are people”, respectively), let alone “Males are people from …”. But I think I’ll leave a more general comment at Wikidata Talk:WikiProject LGBT. --2A02:8108:50BF:C694:A549:3462:AD30:9E21 18:48, 24 November 2022 (UTC)

Vandalism alert

IP: 80.24.11.169 TaronjaSatsuma (talk) 16:45, 24 November 2022 (UTC)

  blocked, but next time please write such kind of alert messages at Wikidata:Administrators' noticeboard @TaronjaSatsuma Estopedist1 (talk) 19:22, 24 November 2022 (UTC)
I will. Thanks!--TaronjaSatsuma (talk) 21:09, 24 November 2022 (UTC)

Reducing the backlog of unconnected pages on a regular basis

Hello, the statistics like

regularly show some thousands of unconnected pages.

  • Are there any systematic approaches to reduce the backlogs for different languages on a regular basis (except Pi bot for humans in some languages) while trying to avoid creating duplicates?
  • Does anyone know what the horizontal axis on these graphs actually means, i.e. which dates are connected to which values?

Also see:

Thanks a lot! M2k~dewiki (talk) 21:09, 24 November 2022 (UTC)

  • The date axis on the plots starts roughly in mid-2015 (the raw data is in the source; you need to convert UNIX timestamps). The x axis labels are mm/dd without year, unfortunately, so you need to guess which year this is.
  • User:GZWDer_(flood) used to create new items as well, but it has been blocked since Jan 2021.
MisterSynergy (talk) 21:36, 24 November 2022 (UTC)
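Converting the UNIX timestamps mentioned above into readable dates can be done with Python's standard library. This sketch assumes the raw values are in seconds; if the page source stores milliseconds (an assumption worth checking), divide by 1000 first. The function name is made up for illustration.

```python
from datetime import datetime, timezone

def to_date(unix_seconds):
    """Format a UNIX timestamp (seconds since 1970-01-01 UTC) as YYYY-MM-DD."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime("%Y-%m-%d")

print(to_date(1435708800))  # 2015-07-01
```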
Note: for the German-language Wikipedia, the VIAF entry at the bottom of biographical articles can be used to find the related Wikidata object. For example:
For films, IMDb and filmportal.de can be used to find the objects; for chemical substances, the CAS ID; for monuments, the monument ID; for flight incidents, the ASN; and so on. If the Wikidata object is connected to a Commonscat (which is often used in articles), the object can be found and connected through the Commonscat.
Also see de:Benutzer:M2k~dewiki/Tipps and de:Benutzer:M2k~dewiki/FAQ#Wie_finde_ich_ein_bestehendes_Wikidata-Objekt_zu_einem_Artikel?
In my opinion, another way to reduce the backlog is to continuously explain to authors of new articles why and how they could connect the articles to the (existing) objects (de:Benutzer:M2k~dewiki/Kopiervorlagen) M2k~dewiki (talk) 00:07, 25 November 2022 (UTC)

Why is honorary member (Q10519151) only "for a learned society"?

Why is honorary member (Q10519151) only "for a learned society"? I think it should also be applicable to other membership organisations. Can I change the definition?
Otherwise the question is: What is the English word for an honorary member of a society that is not learned?
(This discussion was started on Talk:Q10519151 by GerardM in March 2017. I joined about three weeks ago, but there have been no reactions at all up till now, so I hope there will be more chance here.) JopkeB (talk) 11:46, 18 November 2022 (UTC)

I agree with you. It's a common feature of universities and informal groups like clubs. Vicarage (talk) 13:17, 18 November 2022 (UTC)
I agree too. This could apply to many organizations worldwide. Vojtěch Dostál (talk) 08:09, 19 November 2022 (UTC)
+1 GerardM (talk) 05:44, 22 November 2022 (UTC)

Thanks for all your contributions. I am happy with the results. The most important change has already been made, I'll include this discussion in the Talk page of the item. --JopkeB (talk) 05:20, 25 November 2022 (UTC)

Is...

... this claim right?

Thanks in advance! 92.184.102.111 21:14, 23 November 2022 (UTC)

The Maitron page does indeed say "par Maxime Ravel".
So unless there's some compelling reason to believe otherwise, I would say that yes, the claim is right. DS (talk) 03:52, 24 November 2022 (UTC)
OK, thank you DragonflySixtyseven.
And see also: Wikidata:Requests for deletion#Q115396311.
92.184.100.110 11:48, 25 November 2022 (UTC)

Birth Defects Research

Does WD have an item for "Birth Defects Research"? I am not looking for journals on this topic etc, but for articles on the topic at wikimedias / wikiversities, etc. Thanks in advance, Ottawahitech (talk) 20:24, 24 November 2022 (UTC)

WD does not seem to have an item for "Birth Defects Research", and journal article items with BDR in their title don't point to main subject (P921) items responsive to your question - https://w.wiki/629E . I suspect birth defects as a general area of research is not well covered on WPs, although specific types of birth defects (or causes thereof - w:en:Folate deficiency) will be covered. --Tagishsimon (talk) 01:55, 25 November 2022 (UTC)
Generally, if you are seeking an item and don't find it, just create a new item. In case there's an existing item that's named in a strange way it can be merged later. ChristianKl12:30, 25 November 2022 (UTC)

Duplicate english wikipedia article

The items for Lillehammer municipal council (Q16896781) and Lillehammer Municipality (Q101341) are duplicates, but they also link to duplicate articles on enwiki. Can someone who is active there merge them? Infrastruktur (talk) 15:42, 25 November 2022 (UTC)

I think that Lillehammer municipal council (Q16896781) - or at least the EN article - is about the local government organisation, and Lillehammer Municipality (Q101341) about the municipality governed by the organisation. If so, the P31 for Q16896781 is incorrect, and the two need to be joined together, e.g. by applies to jurisdiction (P1001) & legislative body (P194). --Tagishsimon (talk) 17:37, 25 November 2022 (UTC)
Yeah, it looks that way. Thanks. I've changed the P31 to "city council". Infrastruktur (talk) 18:32, 25 November 2022 (UTC)

Rename a page / renommer une page

How do I rename Q2958982? This person is called Maurice and not Charles. I know how to rename a page in Wikipedia, but not in Wikidata. Could you help me? What is the process? Arrakis (talk) 17:11, 25 November 2022 (UTC)

@Arrakis: Go to the record. Click "edit", found just above the box with the name in. Change the name. Add aliases for other names he may conceivably be known by, for he was a man of many names. Add a very brief description. Click 'publish'. Welcome to Wikidata. I have updated FR and EN names and aliases, fwiw, so maybe job already done. --Tagishsimon (talk) 17:32, 25 November 2022 (UTC)
@Tagishsimon: Merci. --Arrakis (talk) 21:48, 25 November 2022 (UTC)

Contemporary constraint for father and child

I brought this up once before, but it didn't get enough attention for a change. See for instance: "The entities Johann Conrad Hahnenkratt and Wilhelmina Christine Hahnen should be contemporary to be linked through child, but the latest end value of Johann Conrad Hahnenkratt is 20 June 1842 and the earliest start value of Wilhelmina Christine Hahnen is 19 October 1842." We need to program in a nine-month grace period before the contemporary constraint gives a warning; we have several instances where a father inseminated the mother, then died shortly after. We see this a lot for fathers who die in wars. RAN (talk) 20:56, 20 November 2022 (UTC)

is that grace period technically doable? BrokenSegue (talk) 01:08, 21 November 2022 (UTC)
The complex constraint 'recency' seems to do date-maths, so on the face of it, yes, it is technically possible. --Tagishsimon (talk) 01:20, 21 November 2022 (UTC)
@Lydia Pintscher (WMDE): what do you think about the technical feasibility? ChristianKl 15:27, 21 November 2022 (UTC)
This sounds like phab:T275392. Would one way to address this to change the contemporary constraint to allow for 9 or 10 months of non-overlap? Would we be ok with this for the other cases where the constraint is used? Does anyone have other ideas? Lydia Pintscher (WMDE) (talk) 11:42, 26 November 2022 (UTC)
What about cases where various sources contradict each other (or itself)? Mateusz Konieczny (talk) 16:54, 22 November 2022 (UTC)
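The grace period proposed in this thread amounts to a simple date comparison. The sketch below is only an illustration of that logic, not how the constraint system is implemented; the function name and the default of 300 days (roughly the "9 or 10 months" mentioned above) are assumptions.

```python
from datetime import date, timedelta

def contemporary_with_grace(parent_death, child_birth, grace_days=300):
    """True if the child was born no later than grace_days after the parent's death."""
    return child_birth <= parent_death + timedelta(days=grace_days)

# The example from the thread: father died 20 June 1842, child born 19 October 1842.
print(contemporary_with_grace(date(1842, 6, 20), date(1842, 10, 19)))  # True
```

With `grace_days=0` the same pair fails the check, which corresponds to the warning currently shown. Imprecise or contradictory dates, as raised above, are not handled by this sketch.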

Hi from your new Wikidata Junior Product Manager

Hi 👋🏽

I am Arian, I just joined the Wikidata team as the new Junior Product Manager.

In this new position, I’ll be helping Lydia and Manuel day-to-day to support you and Wikidata’s development and learning as much as I can from them.

Like a lot of you, I am passionate about free knowledge and learning. I have worked as a product designer and community manager and in my previous lives, I have worked as a curator, writer and designer.

I am excited to work with all of you to continue our mission of sharing free knowledge and open data with the world.

Feel free to reach out to me anytime – looking forward to meeting you all! Arian Bozorg (WMDE) (talk) 10:15, 22 November 2022 (UTC)

For what would it make sense to reach out to you, for what to Lydia and for what to Manuel? ChristianKl11:10, 22 November 2022 (UTC)
In general you can approach any one of us for anything and we'll sort it out among ourselves. But as a rule of thumb: Manuel for anything that's analytics-related, Arian for anything hands-on on specific projects that are in development (once he's properly onboarded) and me for strategy and overarching topics. Lydia Pintscher (WMDE) (talk) 11:33, 26 November 2022 (UTC)
Hello! :-) Syced (talk) 02:19, 25 November 2022 (UTC)

muting notifications from an IP address user

An IP address user is reverting, one by one, some 600 QuickStatements changes I made. They are making lots of edits to Japanese ships, many detrimental, but I'm not prepared to argue with them at the moment. I want to block their revert notifications, but it seems I can either switch off all revert notifications or block them from a particular username, but not from an IP address. Is there a way round this? Vicarage (talk) 07:08, 26 November 2022 (UTC)

might be easier just to ask them to hold off on the changes and give you time to address it. we can block if they keep going. BrokenSegue (talk) 07:16, 26 November 2022 (UTC)

Pen names

Should we split Luke Rhinehart (Q1855263), for example, between items about Luke Rhinehart, the pen name, and George Cockcroft, the real person? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:33, 26 November 2022 (UTC)

If there is a one-to-one mapping, one item. It's harder if the pseudonym is actually a collaboration (e.g. Lewis Padgett https://en.wikipedia.org/wiki/C._L._Moore#Marriage_to_Henry_Kuttner_and_literary_collaborations), or a publisher's house name, when the pseudonym really needs its own entry. Vicarage (talk) 13:33, 26 November 2022 (UTC)

Many items incorrectly labeled as Instances of musical term (Q20202269)

There are many music-related concepts that have statements labeling them as instances of musical term (Q20202269). For example: equal temperament (Q723441) instance of (P31) musical term (Q20202269). These statements are incorrect because the items do not represent terms (i.e., words). They are, in fact, translated into many languages and represent the underlying concepts. Is there an easy way to remove all of them? — The Erinaceous One 🦔 10:36, 11 November 2022 (UTC)

Labelling as instances of musical term (Q20202269) is definitely not very informative (and musical term (Q20202269) itself doesn’t seem to be particularly well-kept). It should probably be equal temperament (Q723441) instance of (P31) musical temperament (Q1080022) instead (the existing equal temperament (Q723441) subclass of (P279) musical temperament (Q1080022) looks wrong, given that there is only one equal temperament – equal temperament isn’t a class of temperaments, presumably). Since Wikidata mainly deals with concepts and not with terms (except for Lexemes), renaming musical term (Q20202269) to “musical concept” could be another solution. --2A02:8108:50BF:C694:142C:9BB9:428D:1C6A 16:49, 11 November 2022 (UTC)
In my view, removal is acceptable. Even better, if you can come up with an alternative subclass or instance. Vojtěch Dostál (talk) 17:02, 11 November 2022 (UTC)
Without diving into the broader discussion, I note that equal temperament (Q723441) is a class of musical temperaments or tuning systems, and so would be a subclass of whatever, rather than an instance of whatever. --Tagishsimon (talk) 17:07, 11 November 2022 (UTC)
But then it’s a singleton class – there is only one equal temperament. With all the caveats with respect to natural language when talking about the instance–class distinction, English Wikipedia seems to support the instance of (P31) variant: “An equal temperament is a musical temperament or tuning system” --2A02:8108:50BF:C694:142C:9BB9:428D:1C6A 17:14, 11 November 2022 (UTC)
Not according to my reading of https://en.wikipedia.org/wiki/Equal_temperament ... Equal temperament is octaves divided into equal steps, but different equal temperaments have different numbers of steps, notably 12-TET, 19-TET & 31-TET. --Tagishsimon (talk) 17:49, 11 November 2022 (UTC)
Yes, I think 2A02:8108:50BF:C694:142C:9BB9:428D:1C6A is thinking that equal temperament (Q723441) is actually twelve-tone equal temperament (Q29571370). (In fact, he is not the only one. Some of the statements on equal temperament (Q723441) should probably be moved to twelve-tone equal temperament (Q29571370).) — The Erinaceous One 🦔 08:25, 13 November 2022 (UTC)
Ah, or twelve-tone equal temperament (Q21087560). Looks like we have a duplicated item. — The Erinaceous One 🦔 08:31, 13 November 2022 (UTC)
I just merged the duplicate. — The Erinaceous One 🦔 10:35, 14 November 2022 (UTC)
The problem with musical term (Q20202269) and similarly with medical term (Q52193405) is that both seem to be often used for items that are about concepts and not about the terms that refer to those concepts. Given that a majority of the claims might be wrong, I would support removing them in bulk. ChristianKl11:53, 12 November 2022 (UTC)
  Support The Erinaceous One 🦔 08:29, 13 November 2022 (UTC)
@ChristianKl: What's the process for removing the statements in bulk? — The Erinaceous One 🦔 10:36, 14 November 2022 (UTC)
Someone has to run a bot. There's no special process. ChristianKl12:47, 14 November 2022 (UTC)
@Moebeus, Solidest, Galaktos, Sergey kudryavtsev, JAnDbot, JAn Dudík: @NANöR, Ginevra Colleoni: you have all created statements using "instance of musical term (Q20202269)". Would you like to weigh in on the discussion before we bulk remove the statements? — The Erinaceous One 🦔 10:58, 14 November 2022 (UTC)
  Neutral I added this statement to articles in a category where many of them had no statements. If there is a more accurate statement, I have no problem with removing it. JAn Dudík (talk) 12:28, 14 November 2022 (UTC)
@The-erinaceous-one: Can you share some of those statements (or whatever query you used to find that list of names)? I don’t remember what I’m supposed to have done here, so I can’t participate in the discussion very well. Galaktos (talk) 19:26, 14 November 2022 (UTC)
Some of the current instances of musical term (Q20202269) are (in no particular order): crescendo (Q2347888), allegro (Q108525384), a piacere (Q300635), ad libitum (Q108883749), epilogue (Q106206684), printed sheet music (Q4327689), polyphony (Q1917591), musical analysis (Q1544924), coda (Q852466), percussion notation (Q7167190), imitation (Q1049742), round (Q1145381), piano (Q2707020), downbeat (Q2416838), walking bass (Q2164509), music therapy (Q209642). You can find more by going to musical term (Q20202269) and opening "Show Derived Statements." — The Erinaceous One 🦔 23:28, 14 November 2022 (UTC)
Hello @The-erinaceous-one, all items for which I have added the statements as musical term are in this dictionary and I have adopted it as a reference. Best. NANöR (talk) 07:18, 19 November 2022 (UTC)
  Oppose. I gradually remove "musical term" from items, replacing it with a more suitable P31, so it is gradually becoming less common. I also add P31=musical term when I can't find any suitable P31/P279 (for example, if it hasn't been created yet or if my knowledge of the subject is limited). I don't see any reason for mass removal, as many of these items will be left completely without statements and will be lost for years again. This has to be dealt with on a case-by-case basis. Solidest (talk) 11:20, 14 November 2022 (UTC)
Would a compromise be mass removal of all claims iff there's another P31 or P279? ChristianKl 12:48, 14 November 2022 (UTC)
@Solidest, ChristianKl: It would be better to use a subclass of concept (Q151885) instead of "musical term". I just created musical concept (Q115211517) for this purpose. Perhaps we could transition all of the statements to musical concept (Q115211517) instead of deleting them completely? (A similar approach could be used for "medical terms".) — The Erinaceous One 🦔 23:45, 14 November 2022 (UTC)
Removing "musical term" where there are already other P31s, and replacing it with "musical concept" in all other cases: I would support that. (If an item has a P279 but its only P31 is "musical term", I would still keep "musical concept" in P31 in such cases, as it can often be replaced with other existing P31s.) Solidest (talk) 08:19, 15 November 2022 (UTC)
This sounds like a good plan to me. @ChristianKl: could you help us with running a bot to make the proposed change? — The Erinaceous One 🦔 06:51, 18 November 2022 (UTC)
I'm not used to running bots. There are people who actually have a routine for it. ChristianKl 14:05, 19 November 2022 (UTC)
@The-erinaceous-one I can run the bot and do the replacement. But should the references be kept in the statements or removed? Vojtěch Dostál (talk) 15:03, 22 November 2022 (UTC)
@Vojtěch Dostál: Thanks for volunteering! I don't have a strong opinion either way regarding the references. Most of the statements don't have a reference or use imported from Wikimedia project (P143). A few list a dictionary as a reference (see allegro (Q108525384)), which would still be accurate so I'd suggest leaving the references. — The Erinaceous One 🦔 20:34, 22 November 2022 (UTC)
Should we, at the same time, also replace medical term (Q52193405) in instance of (P31) statements with medical concept (Q111796709)? — The Erinaceous One 🦔 20:39, 22 November 2022 (UTC)
No, an item like medial rectus muscle (Q1090541) shouldn't get medical concept (Q111796709). Outside of anatomy, heart-lung machine (Q29513568) or eyepatch (Q760330) shouldn't get it either. When I looked at examples of statements with medical term (Q52193405), I didn't find a single one that would benefit from medical concept (Q111796709), even if such a case might exist theoretically. ChristianKl 19:11, 24 November 2022 (UTC)
@ChristianKl The examples you give all have other instance of (P31), so a plain removal of the statement would follow (as you all seem to have agreed here). Or do I get it wrong? Vojtěch Dostál (talk) 08:12, 26 November 2022 (UTC)
@Vojtěch Dostál: I think having either instance of (P31) or subclass of (P279) should be enough for a plain removal without replacement. ChristianKl 12:05, 26 November 2022 (UTC)
I get that, but I don't see a strong enough consensus for that here - we can do that later. In the meantime, I removed terms from items which had other P31 and replaced terms with concepts in the rest. Vojtěch Dostál (talk) 20:12, 26 November 2022 (UTC)
  Support ArthurPSmith (talk) 17:58, 18 November 2022 (UTC)
  Support There's a lot of "term" items. Remove them as you please as long as you make lexemes to replace them. Lectrician1 (talk) 18:18, 21 November 2022 (UTC)
The items are not, generally, modeled as lexemes, though, and it would be quite cumbersome to generate lexemes for each item. I think we should just correct the items using a bot and let the corresponding lexemes be created organically. — The Erinaceous One 🦔 06:40, 22 November 2022 (UTC)
Then go ahead and do that. Lectrician1 (talk) 14:00, 22 November 2022 (UTC)
I don't see a reason to require creation of lexemes in cases where items were incorrectly modelled. Vojtěch Dostál (talk) 15:04, 22 November 2022 (UTC)
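
For readers who want to see the rule from this thread in concrete form, here is a minimal sketch in Python. It only models the decision for one item's instance of (P31) values as described above (drop "musical term" when another P31 exists, otherwise swap it for "musical concept"); it is not the bot that was actually run, and the function name `plan_p31_cleanup` is made up for illustration. Following Solidest's comment, a P279 on the item is deliberately not taken into account.

```python
# Item IDs quoted in the discussion above.
MUSICAL_TERM = "Q20202269"
MUSICAL_CONCEPT = "Q115211517"


def plan_p31_cleanup(p31_values):
    """Return the P31 values an item should have after the cleanup.

    Hypothetical helper: works on plain lists of QIDs, makes no edits itself.
    """
    if MUSICAL_TERM not in p31_values:
        # Item is not affected by the cleanup.
        return list(p31_values)
    others = [v for v in p31_values if v != MUSICAL_TERM]
    if others:
        # Another P31 exists: plain removal of "musical term".
        return others
    # "musical term" was the only P31: replace it with "musical concept".
    return [MUSICAL_CONCEPT]


# Example with a made-up second class ID "Q999999999":
print(plan_p31_cleanup([MUSICAL_TERM]))
print(plan_p31_cleanup([MUSICAL_TERM, "Q999999999"]))
```

References on the original statement are outside the scope of this sketch; as discussed above, they would be carried over unchanged.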

Members of a city council

How do we model members of a city council? I tried looking up a few current and former members of the Seattle City Council who have entries here, and haven't found any where their membership on the council is in their Wikidata item. - Jmabel (talk) 22:43, 25 November 2022 (UTC)

WD uses position held (P39) on the human item, with value "member of X city council". --Tagishsimon (talk) 23:20, 25 November 2022 (UTC)
See also Wikidata:WikiProject every politician/Political data model --M2Ys4U (talk) 03:13, 26 November 2022 (UTC)
  • I created a new Wikidata entry called "member of the Seattle City Council". See Lorena González (Q23762614). If you want to split them up by "1st position of the Seattle City Council", you can do that too, but it is more work. If you break it up smaller, you can create a list for that position, like at Talk:Q32945293. I also created "President of the Seattle City Council"; since there is only one president at a time, you can create a concatenated list: Talk:Q115459552. You don't edit the list, you just edit the entry for the person and add start_date, end_date, "replaces" and "replaced by". When you hit the refresh button at the talk page, it formats all the new data. --RAN (talk) 18:33, 26 November 2022 (UTC)
  • If you have time Talk:Q32945293 needs cleanup, most of the people do not have the position in their entry. --RAN (talk) 21:20, 26 November 2022 (UTC)

Submitting a few hundred GPS coordinates from LMÍ

I'm preparing to submit a few hundred, maybe thousands, of GPS coordinates from LMÍ (National Land Survey of Iceland). They released the data under CC BY 4.0. They have 162,940 coordinates. This includes cities, mountains, rivers, valleys, farms, lighthouses and so much more. I'm sifting through them to try to find the ones that have articles on the Icelandic Wikipedia. I was wondering: if I had the GPS coordinates and the title of the Icelandic Wikipedia article, would that be enough for a bot to go and add all of them to Wikidata, or would you need the Wikidata identifier? Also, I noticed that a lot of them already have GPS coordinates listed with a weak reference, usually imported from a Wikipedia article. Should I ignore those, have both GPS coordinates, or delete the old one in favor of coordinates with a better source (reference)? Steinninn (talk) 15:33, 21 November 2022 (UTC)

coordinate location (P625) should have single best value. With such an impeccable source I'd make it the only value by deleting other values, but I'd write the bot matcher to require both matching labels and co-ordinates matching to within some limit, say 10km, to avoid problems with 'New Farm' occurring multiple times in the dataset. Vicarage (talk) 15:47, 21 November 2022 (UTC)
@Vicarage, Steinninn: Wikidata collects statements, not facts; it’s perfectly acceptable to have multiple statements about the same data. You should not remove data just because you have more accurate data. It’s okay (and IMO preferable) to remove unreferenced or poorly referenced (references using imported from Wikimedia project (P143)) data, but those that have reliable sources should remain. So if you can make it that only unreferenced or poorly referenced statements are removed, it’s great, but if you can’t, it’s better not to remove anything. —Tacsipacsi (talk) 17:39, 27 November 2022 (UTC)
So do I need the identifier from Wikidata? It would make my life a lot easier if I didn't. Or if I could get a list of items that are rivers in Iceland, mountains in Iceland etc. --Steinninn (talk) 16:07, 21 November 2022 (UTC)
A query with instance of (P31) of geographical feature (Q618123) for Iceland should get you everything WD has that a Land Survey might cover. Vicarage (talk) 16:26, 21 November 2022 (UTC)
Sorry, I'm a total newbie (Q718943). How do I query that? Thank you in advance. --Steinninn (talk) 16:43, 21 November 2022 (UTC)
Have a play with https://query.wikidata.org/querybuilder/?uselang=en, it's friendly. Vicarage (talk) 17:05, 21 November 2022 (UTC)
@Steinninn You could match against the 9000 items with country:Iceland and coordinates (https://w.wiki/5$QF). Vojtěch Dostál (talk) 13:33, 22 November 2022 (UTC)
For matching to items based on distance, I recommend csv-reconcile-geo for OpenRefine. Here's my written summary of the required steps. Vojtěch Dostál (talk) 18:57, 21 November 2022 (UTC)
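
As a rough illustration of the matching idea suggested earlier in this thread (require both a matching label and coordinates within some limit, say 10 km), here is a small Python sketch. It is not how csv-reconcile-geo or OpenRefine work internally; the record layout, the function names and the QIDs in the example are all made up, and the distance is a plain haversine great-circle distance on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_record(record, candidates, max_km=10.0):
    """Return QIDs of candidates with the same label that lie within max_km.

    `record` is one survey row, `candidates` are Wikidata items with
    coordinates; both are plain dicts in this hypothetical layout.
    """
    name = record["name"].casefold()
    return [
        c["qid"]
        for c in candidates
        if c["label"].casefold() == name
        and haversine_km(record["lat"], record["lon"], c["lat"], c["lon"]) <= max_km
    ]


# Made-up example: two rivers with the same name, only one of them nearby.
survey_row = {"name": "Hvítá", "lat": 64.0, "lon": -21.0}
items = [
    {"qid": "Q1", "label": "Hvítá", "lat": 64.05, "lon": -21.0},
    {"qid": "Q2", "label": "Hvítá", "lat": 65.5, "lon": -20.0},
    {"qid": "Q3", "label": "Ölfusá", "lat": 64.0, "lon": -21.0},
]
print(match_record(survey_row, items))
```

A row that returns exactly one QID can be treated as a confident match; rows with zero or several hits are the ones worth checking by hand.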
Hello again friends. I've started working on the data from LMÍ. I started with all rivers. I'm surprised at how many rivers have the same name! It took a few hours just to confirm which GPS coordinates connect to which Wikidata item. What would you suggest is the best way for me to dump 136 datapoints to Wikidata? I have them in an Excel sheet with the Wikidata Q code, the GPS data and the LMÍ unique ID (32 digits; I think it's good to include it for future updates and maintenance). I'd also add the reference to LMÍ, but the references are all the same, just saying that the data came from there, with a link to their website. --Steinninn (talk) 10:19, 25 November 2022 (UTC)
https://quickstatements.toolforge.org/#/batch is easy to use if you can produce text in its Q123|P345|"value" format Vicarage (talk) 12:57, 25 November 2022 (UTC)
You mentioned I should remove any GPS that is already listed, I don't seem to be able to do that in this tool unless I identify what the coordinate is that I want to remove. Can I list all 136 objects and have the tool remove any coordinates if they exist? --Steinninn (talk) 13:18, 25 November 2022 (UTC)
I'd ask the friendly folks on the query advice page; they can suggest a suitable format that can be fed back into QuickStatements. Vicarage (talk) 14:20, 26 November 2022 (UTC)
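
For the spreadsheet-to-QuickStatements step mentioned above, a minimal Python sketch might look like the following. It assumes the version 1 batch syntax, in which a coordinate value is written as `@LAT/LON`, fields may be separated by `|`, and a reference URL is given with `S854`; please check the QuickStatements help page before running a real batch. The function name, the example QID and the example URL are placeholders, and the sketch only adds statements (it does not remove existing coordinates).

```python
def quickstatements_line(qid, lat, lon, ref_url=None, sep="|"):
    """Build one QuickStatements (assumed V1 syntax) line adding coordinate location (P625)."""
    parts = [qid, "P625", f"@{lat}/{lon}"]
    if ref_url:
        # S854 is the source form of "reference URL" (P854); string values are quoted.
        parts += ["S854", f'"{ref_url}"']
    return sep.join(parts)


# Placeholder rows as they might come out of the spreadsheet: (QID, latitude, longitude).
rows = [("Q123", 64.1, -21.9), ("Q456", 65.68, -18.1)]
for qid, lat, lon in rows:
    print(quickstatements_line(qid, lat, lon, ref_url="https://example.org/"))
```

The printed lines can be pasted into the batch form linked above.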

“dick slang”

Two days ago an IP user edited some labels, aliases and one description of dick slang (Q15333). Apparently the intention was to use the vulgar vocabulary as item labels, instead of “penis slang” (which was probably not correct either in Danish and Swedish). Or at least that way it would have been consistent with the German, Spanish, French, Italian, Japanese labels (and probably in other languages as well), but they stopped short of changing the English label (due to abuse filter?), which still reads “dick slang”.

I refrain from changing that label (to “dick”) and bring this up here instead, since the item itself seems strange, not only for the inconsistent labelling. The English description reads “cultural expression using penis form or words”, and descriptions in several other languages even describe it as an expression for referring to the penis itself. I’m not sure whether this is a valid concept; Wikidata items should probably not be about mere expressions (that would be lexemes, though I have to admit I don’t know Wikidata’s lexemes well). This item, on the other hand, is not about an expression in a certain language, but about a cross-lingual class of expressions (vulgar expressions referring to the male sexual organ). I’m also unhappy with some of the item’s statements (e.g. has part(s) (P527)dick pic (Q85506520), Commons category (P373)Khuy). What do you think?

By the way, there is also dick (Q108404149), which is a duplicate at best, and diu (Q112151075), linked from dick slang (Q15333) (as has part(s) (P527)); as a matter of fact, “Diu” used to be an English (!) alias of dick slang (Q15333) (removal), but it seems diu (Q112151075) should be a Cantonese lexeme instead (?). Any suggestions for improving this unsatisfactory situation? --Data Consolidation Officer (talk) 10:04, 26 November 2022 (UTC)

PS. I have the impression that there is some idiosyncratic modelling going on here: cultural expression (Q108404405), the current class of dick slang (Q15333), is only linked from dick slang (Q15333) and diu (Q112151075), and English noun (Q12045976), the class of dick (Q108404149), only from dick (Q108404149) and court (Q10460907). I think these close-to-isolated classes should be gotten rid of entirely. --Data Consolidation Officer (talk) 19:47, 27 November 2022 (UTC)

Unsuspended/Restored Twitter accounts

Not certain if this was tackled here or elsewhere but is there a way to indicate that a Twitter account that was previously suspended has been restored as in the case of Carl Benjamin (Q28924867) among other persons and entities? -Ianlopez1115 (talk) 11:12, 27 November 2022 (UTC)

@Ianlopez1115: I think the way to do this would be discrete statements for each period of the account's life, with appropriate start time (P580), end time (P582), end cause (P1534) and possibly has characteristic (P1552), as well as appropriate rank. So, something like:
  • value=Name, start time x, end time y, end cause: suspended, rank: normal
  • value=Name, start time y, end time z, has quality: suspended account, end cause: suspension lifted, rank: normal
  • value=Name, start time z, rank: preferred
--Tagishsimon (talk) 19:28, 27 November 2022 (UTC)

buildings whose existence is disputed

There are a lot of castle entries for the UK that came from the respected Gatehouse Gazetteer (Q59259501) whose very existence is in doubt. An example is Cae Tump Placename 2, Gladestry (Q38611878), which has a vague label, a description of "rejected castle" and the unoptimistic "Clwyd Powys Archaeological Trust record of ? motte reads 'Name suggests motte or barrow. Nothing seen on ground (CPAT site visit 1979)' ... Does not seem to have ever really been suggested as a motte". Not only are there no visible remains, no archeology has ever been found. I did consider using a conservation state (Q55553838) value, but the closest is unlocated, probably destroyed (Q106959824), and the entry was given a provisional location, and we don't want to assert it was once there to be destroyed. Is there a good way of stating that something may not be there at all? Vicarage (talk) 16:48, 23 November 2022 (UTC)

Interesting question. I'm not aware of any prior art. My inclination is there are two reasonable methods:
My preference would be the second of these, so that reports on castles do not have to deal with excluding rows by reference to a qualifier. --Tagishsimon (talk) 17:00, 23 November 2022 (UTC)
I'd prefer a "supposed architectural structure" to avoid creating entries for castles/mead halls/barrows/roundhouses etc to be used as a P31 in combination with them. Then it would be easy to eliminate them. Vicarage (talk) 17:19, 23 November 2022 (UTC)
Sensible to nest a set of "supposed x" beneath "supposed architectural structure", but also beneath the x they are supposed to be. Using only "supposed architectural structure" renders WD unable to say that it was supposed to be a castle / house / whatever. --Tagishsimon (talk) 18:23, 23 November 2022 (UTC)
I'd just add "supposed architectural structure" as well as "castle". Easy SPARQL to pick items without the former and with the latter. I tend to think of P31 as combinatorial properties. And it would be much easier to extend a "forts" and "palaces" and "castles" query to a "really exists" or "just supposed" query. Vicarage (talk) 18:43, 23 November 2022 (UTC)
No. It's not easy to remember or even know that you need to watch out for this combination. It's a trap for the unwary. It is, in short, pointless to coin and use "supposed architectural structure" as well as "castle". "supposed castle" will do what we want: turn up the item in a report on P31/P279* of castle and/or P31/P279* of supposed things, and class trees are the way that WD works best. --Tagishsimon (talk) 19:17, 23 November 2022 (UTC)
Here I agree with Vicarage's solution more. To prevent new classes of supposed objects from being created. From the standpoint of querying this data, it may be sometimes more advantageous than Tagishsimon's solution (eg. when you're trying to query all buildings BUT those disputed). Vojtěch Dostál (talk) 10:56, 24 November 2022 (UTC)

I am conscious of course that most people asking for castles don't want these iffy entries, so it does seem a shame they have to add an unobvious caveat to each query. But if we had "supposed castle", that would still be a castle unless we'd forked the whole tree at architectural structure, which makes a lot of duplication for one property change. And of course it's not just buildings: there are written works, musical scores or paintings that might or might not have been created. Some we are certain of, so they are "lost"; others are just rumours. So perhaps "supposed thing" with a qualifier "instance of" castle would be a better approach. Then the keen would ask for both P31 castle and P31 supposed P31 castle. Vicarage (talk) 19:23, 23 November 2022 (UTC)

A "supposed castle" is a subclass of a "castle". That really is the way it works. Anything else is manifestly suboptimal. I cannot believe we're even having this conversation. --Tagishsimon (talk) 19:46, 23 November 2022 (UTC)
This may be more a matter of terminology than of modelling, but this sounds back-to-front to me. The defining characteristic of X being a subclass of Y is that every X is also a Y. So to me this approach would be declaring that everything that might or might not actually be a castle, actually is one, which I presume is the opposite of what's intended. --Oravrattas (talk) 20:45, 23 November 2022 (UTC)
I agree. Pragmatically I'm going to create a "notwanted" list in my software to eliminate these iffy entries based on my quality criteria, but it would be good to have a WD-wide solution to allow normal queries to provide good results transparently, with an option for extra uncertain results. Vicarage (talk) 09:26, 24 November 2022 (UTC)
Indeed it’s more like every castle is (also) a supposed castle (as silly as this statement may sound, but it’s just the special case where the supposition has turned out to be correct), which would imply castle instance-of supposed-castle. I can understand that this looks suboptimal, too. A better solution could be to not (ab)use instance of (P31) and subclass of (P279) for this kind of modelling, and instead creating a new property “supposition of” (or something like that) to link supposed-castle to castle (i.e. supposed-castle is the “supposition of” castle). That way regular items would not have to be cluttered with references to their supposed-quality counterparts, without the need for questionable subclassing. --2A02:8108:50BF:C694:A549:3462:AD30:9E21 10:47, 24 November 2022 (UTC)
By the way, this would also affect human whose existence is disputed (Q21070568). --2A02:8108:50BF:C694:A549:3462:AD30:9E21 10:49, 24 November 2022 (UTC)
When a user queries for all castles in the UK, they usually only want to see those castles that actually exist, so having supposed castle as a subclass of castle is problematic for that use case. For human whose existence is disputed (Q21070568) we have the construction that we deprecated the subclass claim, but users frequently break it. I proposed https://www.wikidata.org/wiki/Wikidata:Property_proposal/subclass_of_with_uncertain_existance to have a specific property for that relationship. ChristianKl 11:42, 27 November 2022 (UTC)
That problem (a user querying for all castles doesn’t want non-existing ones) wouldn’t arise with castle as a subclass of supposed castle or with the alternative approach of linking them with an entirely different property (be it supposition of or the newly proposed subclass of with uncertain existance). I maintain that the class of certain somethings which might exist is not a subclass of the class of the same certain somethings that do exist. It’s the other way round. --2A02:8108:50BF:C694:55AC:C874:B5ED:83CD 11:06, 28 November 2022 (UTC)
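
To make the difference between the two subclass directions discussed in this thread easier to see, here is a toy Python sketch. It imitates what a P31/P279* query does on a tiny hand-made graph; the item and class names are labels rather than real QIDs, the data is invented, and the helper names are hypothetical.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls via subclass-of edges, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def instances_of(target, p31, subclass_of):
    """Items whose P31 class is target or a (transitive) subclass of target."""
    return sorted(
        item
        for item, classes in p31.items()
        if any(target in superclasses(c, subclass_of) for c in classes)
    )


# Toy data: one real castle and one doubtful entry.
p31 = {"Windsor Castle": ["castle"], "Cae Tump": ["supposed castle"]}

# Direction A: "supposed castle" is a subclass of "castle".
direction_a = {"supposed castle": ["castle"]}
# Direction B: "castle" is a subclass of "supposed castle".
direction_b = {"castle": ["supposed castle"]}

print(instances_of("castle", p31, direction_a))           # doubtful entry included
print(instances_of("castle", p31, direction_b))           # only the real castle
print(instances_of("supposed castle", p31, direction_b))  # both items
```

With direction A, a plain query for castles also returns the doubtful entry; with direction B it does not, and the doubtful entries only appear when the broader class is asked for.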

WikidataCon 2023 will be organized by Wikimedia Taiwan and Wikimedia Germany

Hello all,

Following up on the search for a partner to co-organize the WikidataCon 2023 and after talking to several enthusiastic affiliates and local Wikidata communities, I am happy to announce, Wikimedia Germany will support Wikimedia Taiwan and the local Wikidata community in Taiwan in designing and running the next iteration of the Wikidata conference, which will take place on October 28-29, 2023.

The Wikidata community in Taiwan has been active for years and regularly organises events in Taiwan, collaborating with other open source and open data communities. They are particularly interested in geographical data, human settlements and local languages. Wikimedia Taiwan will provide the administrative coordination and legal framework to support the Wikidata community in Taiwan.

Many people from different fields are working on Wikidata: Researchers, cultural institutions, linked open data enthusiasts and more. WikidataCon 2023 attendees will be able to discover how people from other disciplines interact with Wikidata, learn from them, exchange ideas and find inspiration.

With this partnership, we’ll try out a hybrid event format. We will hold an on-site event that will take place in Taipei, welcoming participants from East Asia. Meanwhile, participants from other regions of the world will be able to join the conference online. Finally, we’re encouraging other local Wikidata groups to prepare satellite events and on-site gatherings.

More information and updates will be provided in the upcoming months on Wikidata:WikidataCon 2023 and its talk page. Meanwhile, if you have any questions, feel free to reach the organizers on the talk page.

We’re extremely happy to be collaborating with the enthusiastic Wikidata community in Taiwan. We look forward to this new iteration of the conference.

Léa Lacroix & Alan Ang for Wikimedia Germany


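Hello everyone!

Since publicly announcing the search for a WikidataCon 2023 partner, Wikimedia Germany has been in contact with several affiliates and local Wikidata communities. We at Wikimedia Germany are pleased to announce that we will collaborate with Wikimedia Taiwan; the local Wikidata community in Taiwan (Wikidata Taiwan) will design and run the next WikidataCon, to be held on October 28-29, 2023.

In recent years the Wikidata community in Taiwan has been very active, regularly holding events in Taiwan and collaborating with other open source and open data communities. They focus on administrative divisions, local languages and river data. The local chapter, Wikimedia Taiwan, will provide the necessary administrative coordination and legal framework to support the whole community.

This jointly organized conference will highlight the Wikidata projects under way in East Asia and encourage cross-disciplinary collaboration, for example between people in different fields such as cultural institutions, educational organizations and research institutions.

Working together, we will try a hybrid event format: an on-site event in Taipei, welcoming guests from the East Asia region, while other interested people will be able to take part online. We will also encourage other local Wikidata communities to hold satellite events and on-site gatherings.

The Wikidata:WikidataCon 2023 page and its talk page will provide more information and news over the coming months; in the meantime, if you have any questions or feedback, you can contact the organizers on the talk page.

We are very happy to be working with the active Wikidata community in Taiwan, and we look forward to the next iteration of the conference.

Léa Lacroix and Alan Ang, Wikimedia Germany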

Lea Lacroix (WMDE) (talk) 14:14, 28 November 2022 (UTC)

Dear Wikimedia Community members,
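(apologies for cross-posting - the Mandarin version of this announcement follows below)
Wikimedia Taiwan is excited about this collaboration!
When Wikidata Taiwan became aware that Wikimedia Germany was seeking a partner for a co-organized WikidataCon event in 2023, there was intense enthusiasm within the community. Following several discussions with Wikimedia Germany, the Wikimedia Taiwan chapter and the representatives of the Taiwan Wikidata community confirmed their intent to collaborate. The preparations are all going quite well right now!
Wikidata development in Taiwan took on more systematic goals and initiatives only around 2018. Since then, the monthly OpenStreetMap x Wikidata meetup, organized by the Wikidata Taiwan community and the OpenStreetMap Taiwan community, together with the yearly Open Data Day, the COSCUP community track and other events, has consistently promoted the growth of related editing and project development: for instance, the import of data from a village-level dataset, datasets of schools at various levels, art datasets and exhibition information. The community has also come to collaborate with other open-data communities, primarily GLAM institutions and government agencies. Examples include the g0v-Hackathon and g0v-Summit events of the citizen open source group g0v, and collaboration with the Soil and Water Conservation Bureau to promote data governance for river datasets. The community also holds the Wikidata Cross-Domain Forum at the end of October to celebrate Wikidata's birthday with the global Wikidata community.
Wikimedia Taiwan is delighted to work with Wikimedia Germany to co-organize the 2023 WikidataCon with Asia as the focal point, following in the footsteps of the Wikimedia movement user group in Brazil. We will continue the development of Wikidata in a variety of cultural contexts, particularly with regard to the cross-disciplinary exchange and dialogue that matter to us in Taiwan, and we advocate for a more diverse and prosperous global Wikidata community.
As Léa mentioned in her words, WikidataCon will host both online and on-site events the following year, as we want to maximize the capacity for participation. The on-site event will take place in Taipei. When the time comes, all Wikidata contributors and friendly organisations around Taiwan are welcome to join our events!
Stay tuned for upcoming events!
Dennis Chen for Wikimedia Taiwan
Allen Wang for Wikidata Taiwan Community
Dear friends of the Wikimedia community, hello everyone:
Wikimedia Taiwan is very excited about this collaboration!
When the Taiwan Wikidata community (Wikidata Taiwan) first learned that Wikimedia Germany was looking for a partner for WikidataCon 2023, there was strong interest within the community. After several discussions of intent between Wikimedia Germany and representatives of Wikimedia Taiwan and the Taiwan Wikidata community, both sides confirmed their wish to collaborate, and all preparations are currently proceeding smoothly!
The development of Wikidata in Taiwan acquired more systematic goals and measures only around 2018. Since then, the monthly OpenStreetMap x Wikidata meetup, co-organized by the Wikidata Taiwan community and the OpenStreetMap Taiwan community, along with the yearly Open Data Day, the COSCUP community track and other events, has steadily driven related editing and project development, such as importing village-level data for Taiwan's basic administrative divisions, data on schools at all levels, art data and exhibition information. The community has also gradually begun to collaborate with other open communities, GLAM institutions and government agencies in Taiwan, for example at the g0v Hackathon and g0v Summit events of the civic open source group g0v, and with the Soil and Water Conservation Bureau on data governance for river data. Every year at the end of October, the community also organizes, on its own initiative, the Wikidata Cross-Domain Forum to celebrate Wikidata's birthday together with the global Wikidata community.
Wikimedia Taiwan is deeply honored to follow the Wikimedia movement user group in Brazil in co-organizing WikidataCon 2023 with Wikimedia Germany, with Asia at its core. We will continue to support the development of Wikidata in different cultural contexts, especially the cross-disciplinary exchange and dialogue that we care about in Taiwan, and to advocate for a more diverse and mutually thriving global Wikidata community.
As Léa said in her message, next year's WikidataCon will hold both online and on-site events, as we hope to extend the forms of participation as far as possible. The on-site venue will be in Taipei. When the time comes, if you are in our region, all Wikidata contributors and friends from partner organizations are very welcome to join us on site in Taiwan!
Please stay tuned for upcoming event information!
Dennis Chen, Wikimedia Taiwan
Allen Wang, Taiwan Wikidata community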
Supaplex (talk) 14:43, 28 November 2022 (UTC)

New entry

Hello, I would like to create an artist entry for a friend who is a singer known in the vintage scene; she has appeared on a television programme on a major French channel. How can I get help, please? Thank you. 92.184.98.10 14:48, 24 November 2022 (UTC)

Welcome to Wikidata! You should start by looking at our notability criteria. Does your friend meet them? As a rough guide, can you find two independent newspaper articles that provide substantial coverage? Then look for similar people to see how they are represented. You can also get better help in French at the Bistro. Bovlb (talk) 16:03, 28 November 2022 (UTC)

your text is distorted, meaning too large to be legible

your text is distorted, meaning too large to be legible. what should i do. i am 78 years old and not good at using the computer. thank you. myron martin. Ventura21c (talk) 18:41, 25 November 2022 (UTC)

Pigsonthewing has responded at the user's talk page. Bovlb (talk) 19:41, 28 November 2022 (UTC)

Wikidata weekly summary #548

Doubled item

Could someone delete item Q115486491, it is a duplicate of Q85858207. I don't find an option for a proper deletion request, therefore I write that here. Carl Ha (talk) 21:35, 28 November 2022 (UTC)

@Carl Ha   Merged by M2k~dewiki --Emu (talk) 21:49, 28 November 2022 (UTC)

ISBN-13 and Hardcover vs paperback

Some books have separate ISBN-13 (P212) for the hardcover and for the paperback. The property distribution format (P437) accepts only printed matter (Q1261026). How to differentiate those ISBN numbers? How to specify whether it's a hardcover or a paperback? D6194c-1cc (talk) 21:20, 27 November 2022 (UTC)

I found print book format (Q82046811) item, but it has no associated property. Should I make a property proposal? D6194c-1cc (talk) 09:55, 28 November 2022 (UTC)
I would prefer adding new possible values to distribution format (P437) over introducing a new property. ChristianKl 13:35, 28 November 2022 (UTC)
There should be a separate item for each edition which is an instance of version, edition or translation (Q3331189), linked with has edition or translation (P747) and edition or translation of (P629) to the item of the work. Edition-specific properties like ISBN, distribution format etc. belong on edition items. The whole setup is described in more detail at Wikidata:WikiProject Books. Since hardback (Q193955) and softcover (Q990683) are both subclasses of printed book (Q11396303), which is a subclass of printed matter (Q1261026), I don't see a problem with using distribution format (P437). Pfadintegral (talk) 06:33, 29 November 2022 (UTC)

Tabular style data on items?

On the Wikidata 10th birthday event in Utrecht we had a discussion about "tabular style data" on items. That is, for example, time series that can add up to a very large number of statements. I got around today to summarize the discussion. Whether you agree or disagree with what we said, your opinion would be valuable so that we later can forward our conclusions to the developers. Ainali (talk) 21:37, 27 November 2022 (UTC)

I do think tabular data is good when there are more than 10 values. ChristianKl 21:39, 27 November 2022 (UTC)
Thanks @ChristianKl! Do you mean, the 11th statement should be moved to Commons? Or that if there are ten or more statements, all but one should move? Ainali (talk) 21:46, 27 November 2022 (UTC)
Tabular data in items under current arrangements is a curse. However a prescriptive 'put it in a CSV on Commons' seems a bit weird - it's not clear to me that it is for users to specify a technical solution to devs. I'd have thought a triplestore solution might be more effective. The major problem, at least from my limited knowledge, is one of getting any interest in the issue from whoever makes decisions on development resource allocation. The problem is not insuperable, just neglected. --Tagishsimon (talk) 21:48, 27 November 2022 (UTC)
Thank you Tagishsimon, I agree. The discussion, and my proposal, was triggered by the inaction. But during the discussion, we all agreed on that solving the actual problem would be better. And if we can get broad agreement on that with an "or else we want this", it might get the attention it deserves. Ainali (talk) 22:03, 27 November 2022 (UTC)
Well on the one-hand, chapeau for taking the initiative. On the other, it's as depressing af that we have to contemplate going through a pantomime like this to get developer focus on an obvious canker of the wikidata environment. --Tagishsimon (talk) 22:24, 27 November 2022 (UTC)
See also Commons:Think big - open letter about Wikimedia Commons. Interesting anthropology. --Tagishsimon (talk) 22:58, 27 November 2022 (UTC)
The current tabular data implementation is English-centric. Lectrician1 (talk) 04:25, 29 November 2022 (UTC)
  • For quite some time there has been a property data type for links to tabular data at Commons, see Help:Data type#Tabular data. We even have 22.8k such statements [6] for six properties of this data type.
  • It is just not as convenient to use as "internal" data at Wikidata and does not integrate with the query service; however, using a script for postprocessing, it is not difficult to parse the raw data from Commons that is referenced in a Wikidata statement and do whatever you want with it.
  • It is unclear to me whether tabular data inside Wikidata would enhance usability of large pages. The bulky data is still going to be there, so it would probably not be much easier for browsers to handle it.
  • One of the difficulties with tabular data outside Wikidata is that some schema information needs to be provided, and that is not always straightforward. A limitation of the situation at Commons is also that data size is limited to 2M (as far as I am aware), so chunking can quickly become an issue that adds further complexity.

MisterSynergy (talk) 23:28, 27 November 2022 (UTC)
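
The postprocessing step mentioned above (parsing the raw Commons data that a Wikidata statement links to) could look roughly like the sketch below. It assumes the raw page content is JSON with a `schema.fields` list and a `data` list of rows, which is how Commons tabular data pages are laid out as far as I know. The sample document, its field names and the helper `parse_tabular` are made up for illustration, and the step of fetching the page from Commons is left out on purpose.

```python
import json

# Hypothetical sample mimicking the raw JSON of a Commons "Data:....tab" page.
# The field names and values are invented for this illustration.
SAMPLE_TAB = """
{
  "license": "CC0-1.0",
  "description": {"en": "Hypothetical yearly population series"},
  "schema": {
    "fields": [
      {"name": "year", "type": "number", "title": {"en": "Year"}},
      {"name": "population", "type": "number", "title": {"en": "Population"}}
    ]
  },
  "data": [[2019, 1200], [2020, 1250], [2021, 1310]]
}
"""


def parse_tabular(raw: str) -> list[dict]:
    """Turn raw tabular-data JSON into a list of {field name: value} rows."""
    doc = json.loads(raw)
    # Column names come from the schema, in the same order as the row values.
    names = [field["name"] for field in doc["schema"]["fields"]]
    return [dict(zip(names, row)) for row in doc["data"]]


rows = parse_tabular(SAMPLE_TAB)
for row in rows:
    print(row["year"], row["population"])
```

Once the rows are plain dictionaries, they can be filtered, aggregated or joined with query service results in the same script.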

  • Wikidata items have a size limit also - about 4.5M, though it's hard to guess exactly what goes into that. Statements with references can be quite large (each reference is stored separately adding to size). If there's some way to increase that size limit (and also adjust how references are handled) then I can see a use for a "tabular data type" within Wikidata that could store significant data as part of an item. In fact I was a little surprised when I first started using Wikidata that there wasn't a "list" or other array-type value allowed. The current Commons-based solution isn't unreasonable - something comparable could be done within Wikidata with a new namespace perhaps? Anyway I do feel this is an area where we need something to change. ArthurPSmith (talk) 16:34, 28 November 2022 (UTC)
    Wikidata items have a size limit and as far as I can remember from the last conversation about performance, adding a new statement to an item means that Wikidata has to recreate the whole item. Adding a statement to a 1MB item is going to be roughly as costly as adding a statement to 100 10KB items.
    Given that creating new statements has often been the performance bottleneck for Wikidata, having items that are multiple MBs big is a bad idea. ChristianKl 17:46, 28 November 2022 (UTC)
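
The arithmetic behind this comparison can be sketched with a toy model. It assumes, as the comment above does, that every edit rewrites the whole item, so the cost of one edit is proportional to the item's size. The function names and the 500-byte statement size are illustrative assumptions, not measurements of Wikidata.

```python
def rewrite_cost(item_size_bytes: int) -> int:
    """Toy model: one edit costs as many bytes as the whole item occupies."""
    return item_size_bytes


def cumulative_cost(statements: int, statement_size_bytes: int) -> int:
    """Total bytes rewritten when statements are added one at a time
    to an initially empty item that is fully rewritten on every edit."""
    total = 0
    size = 0
    for _ in range(statements):
        size += statement_size_bytes  # the item grows by one statement
        total += rewrite_cost(size)   # and is then rewritten in full
    return total


# One edit to a 1 MB item versus one edit to each of 100 items of 10 KB.
one_large = rewrite_cost(1_000_000)
hundred_small = sum(rewrite_cost(10_000) for _ in range(100))
print(one_large, hundred_small)

# Building a long time series statement by statement grows quadratically.
series_cost = cumulative_cost(statements=1000, statement_size_bytes=500)
print(series_cost)
```

Under this model the two cases in the comparison cost the same, while building up a single large item one statement at a time costs far more than its final size suggests.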
    @ChristianKl: If you are referring to the way WDQS updating works (regarding performance) that was true in the past, but the switch to the new incremental updater a couple of years back means this is much less of a problem than it once was. There are still areas where the size of the item in itself is an issue though. Particularly UI performance right now. ArthurPSmith (talk) 18:37, 28 November 2022 (UTC)
    Very interesting. These are topics that should really be discussed at a meeting of the community and the WMDE/WMF development team. Only together can we come to a viable solution for this kind of data. Vojtěch Dostál (talk) 19:11, 28 November 2022 (UTC)
    I agree that a discussion needs to be had with the development team. But I think that we, as a community, should first agree on what an ideal scenario would be. Then we can discuss the possibilities of that, and if it gets a definite no, we'll have to find a workaround (hopefully better informed about what the current bottlenecks are). Ainali (talk) 19:41, 28 November 2022 (UTC)
    @ArthurPSmith, there is still the revisions ("text") table, which stores the whole revision (not just a diff). Just look at how much data can be added to a single place - https://datacommons.org/place/geoId/4805000 . And here is just a single time series - https://api.datacommons.org/v1/observations/series/wikidataId/Q987/Mean_Rainfall?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI (you can also check tab files, but they won't give enough information). Lockal (talk) 05:38, 2 March 2023 (UTC)
    (ouch, sorry for necroposting, I forgot that I am in archive...) Lockal (talk) 05:40, 3 March 2023 (UTC)

It is a very important article but there are too many missing or unnumbered authors. Combinato (talk) 03:17, 30 November 2022 (UTC)