Wikidata:Project chat/Archive/2021/03

Change on formatter URL

On February 10, Rashinseita changed the formatter URL (P1630) statement for the property BSD Portal athlete ID (P4650). I reverted this change two days ago. Even so, the old formatter is still used. For example, if you follow the preferred link for the item Susanne Kreher (Q60193698), you are taken to the page https://web.archive.org/web/*/http://www.bsd-portal.de/sport/skeleton/athleten/%23536 and not to the page http://www.bsd-portal.de/sport/skeleton/athleten/%23536. How long will it take until the new formatter is used, or do I need to change something? --Gymnicus (talk) 14:19, 4 March 2021 (UTC)

The question is settled. It was due to the cache. After clearing the cache, the link now points to the correct page. --Gymnicus (talk) 14:33, 4 March 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. --Gymnicus (talk) 14:33, 4 March 2021 (UTC)

Error with JSON dump

Hey, I'm working with the latest JSON dump today (https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.json.bz2, downloaded February 25, 2021), and I notice that at least one, and possibly more, of the entries have invalid JSON. The item with ID Q105653764, for example, is missing a lot of closing quotes. The first few bytes look like {"type:"item,"id:"Q105653764,"labels:{"en:, when they should look like {"type":"item","id":"Q105653764","labels":{"en":. I'm not sure what happened here, but I figured you might like to know about it. --50.35.70.129 04:11, 26 February 2021 (UTC)

bzgrep Q105653764 latest-all.json.bz2 gives me after 8 hours: {"type":"item","id":"Q105653764","labels":{... so this seems to have changed for the better. --SCIdude (talk) 07:53, 1 March 2021 (UTC)

easy SPARQL wrapper

While current efforts for making access to SPARQL easier are highly appreciated and anticipated, would it not be easy to place an input field on every property page that takes a value and starts a SPARQL query for items with that property and that value? Many of my personal queries are of this type, and the implementation seems like a no-brainer. --SCIdude (talk) 16:18, 28 February 2021 (UTC)

That's not a million miles away from what Preferences / Gadgets / EasyQuery gives you. --Tagishsimon (talk) 16:52, 28 February 2021 (UTC)
Maybe it should be activated by default. --- Jura 22:32, 28 February 2021 (UTC)
Probably. It produces execrable SPARQL:
SELECT ?item ?label ?_image WHERE {
  ?item wdt:P31 wd:Q39614.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" . 
    ?item rdfs:label ?label
  }
  
OPTIONAL { ?item wdt:P18 ?_image. }
}
LIMIT 100
Try it!
--Tagishsimon (talk) 01:13, 1 March 2021 (UTC)
Would prefer (and maybe without ?image, which only causes items to be duplicated across rows; add ?itemDescription instead):
SELECT ?item ?itemLabel ?image WHERE {
  ?item wdt:P31 wd:Q39614.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . } 
  OPTIONAL { ?item wdt:P18 ?image. }
} LIMIT 100
Try it!
--Tagishsimon (talk) 01:13, 1 March 2021 (UTC)
and no image; no embedding/autorun, and it opens in a separate tab. I think it's based on an older version of the script, but I still prefer that.
Anyways, I think either would be a plus for users. --- Jura 13:14, 1 March 2021 (UTC)
I left an edit request at MediaWiki_talk:Gadget-EasyQuery.js. --- Jura 13:27, 1 March 2021 (UTC)

Wikidata weekly summary #457

What does "20. century" mean?

I've been using "20. century" for approximate dates, under the impression that "20. century" means "20th century". This appears not to be the case, as a constraint clash shows it apparently starts after 1977. Can anyone tell me what it actually means? Is it the years 2000-2099? Or 2001-2100? Or something else? Either way, this seems very confusing to me. -- The Anome (talk) 06:49, 26 February 2021 (UTC)

Looking at, for example, Andrea Bergamasco (Q61472634), the date of birth "20. century" is stored as 1 January 2000. I viewed the value using a SPARQL DESCRIBE:
DESCRIBE wd:Q61472634
Try it!

Piecesofuk (talk) 07:31, 26 February 2021 (UTC)

@The Anome: There is a precise date, to the day, stored with every date value on Wikidata, along with a precision. Something shown as « 20th century » can have any underlying precise date, such as « 3 May 1949 » or « 4 July 1960 », with a « century » precision. The « problem » is that it is difficult to take the precision into account in every computation, and in many of them only the precise date is used, for example for sorting.
For example, imagine you want to sort people by age … you have one person with century precision on the date of birth, and another with the precise date 1 January 1950. Is the first older than the other? It's impossible to know …
If you naively sort by value, only the precise date is used. One solution could be to take advantage of the fact that the year is not significant and make an educated guess about the precise date of birth: if you don't know the year of birth, or even the decade, you can set the precision to « century », but set the year to 1910 if you guess the person was probably born around that period, or to 1960 if you guess the person was probably born in the second half of the century. That way you get a more plausible sorting using the (non-significant) year/month/day of the precise value. author  TomT0m / talk page 08:42, 26 February 2021 (UTC)
a constraint clash shows it apparently starts after 1977 - this is a bug in a specific implementation of the constraint checker. There are multiple implementations - a naïve SPARQL-based one, the element-based one (mw:Extension:Wikibase Quality Extensions), dump-based analysis by KrBot, complex constraints - and some of them suffer from precision issues. --Lockal (talk) 08:56, 26 February 2021 (UTC)
Perhaps we should force the default for "20th century" to be 1950, "19th century" to be 1850, etc, and mass-convert all "nth century" dates with sorting-values of eg n00 or (n+1)00 to be n50 ? It could be done as a very big bot job. (Obviously if there were qualifiers like earliest date (P1319) or refine date (P4241) they could be taken into account too)? Jheald (talk) 11:09, 28 February 2021 (UTC)
Quite pointless and misleading, IMO. According to this logic, year precision should store the value not on the 1st of January, but on July 2. Just exclude low-precision values when doing comparisons, like here. It is expected that well-written clients know about precision anyway. Moreover, your suggestion may not be possible due to the Wikidata:Stable Interface Policy. --Lockal (talk) 19:30, 1 March 2021 (UTC)
Well, as somebody who runs queries that often run close to the 60-second limit, with little time for any extra clauses, I would find it useful.
Also, it would be good if the WDQS output UI automatically indicated the precision for dates (it's there in the downloads, but not on the screen) -- it's embarrassing when a query is full of 1 January dates being presented as if they had full precision. But I guess, on your terms, the WDQS UI is just "not a well written client". Jheald (talk) 19:58, 1 March 2021 (UTC)
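For anyone hitting the same confusion in queries, the stored precision discussed above can be read from a statement's full value node and used to exclude century-precision dates from comparisons. A minimal sketch, assuming date of birth (P569) and the Wikibase convention that precision 7 means century and 9 means year:
SELECT ?person ?dob ?precision WHERE {
  ?person wdt:P31 wd:Q5 ;
          p:P569/psv:P569 ?dobValue .
  ?dobValue wikibase:timeValue ?dob ;
            wikibase:timePrecision ?precision .
  FILTER(?precision >= 9)   # keep year precision or better; "20. century" values (precision 7) are excluded
}
LIMIT 100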

Colonial Mayor of New Brunswick or Mayor of Colonial New Brunswick

At Talk:Q6797896, for example, the numbering system for politicians of localities begins with independence. Should we call the earlier politicians "position=Colonial Mayor of New Brunswick" or "position=Mayor of Colonial New Brunswick"? I tried to find examples as a guide; any help would be appreciated if you can find some, or any input on which is more widely used. In the end they will be synonyms, but it is best to get the right name from the start. --RAN (talk) 22:42, 28 February 2021 (UTC)

@Richard Arthur Norton (1958- ): Looking at the Wikipedia article, New Brunswick was incorporated, chartered, reincorporated, and rechartered several times. But in my best estimation, New Brunswick was the same municipality, but reauthorized under newer laws. I guess if you really wanted to distinguish those periods it would be "Mayor of the Town of New Brunswick" and "Mayor of the City of New Brunswick". At that point you'd be going down the road to split up the Wikidata and Wikipedia articles into "Town of New Brunswick" and "City of New Brunswick". I don't think that is necessary and would just stick with treating the mayoralty as the same continuity throughout all the incarnations of New Brunswick. William Graham (talk) 00:42, 1 March 2021 (UTC)
The numbering system of the canonical list starts with independence. The list does not restart with each change of government style. --RAN (talk) 01:22, 1 March 2021 (UTC)
Which, or whose, canon are we talking about? --Tagishsimon (talk) 01:30, 1 March 2021 (UTC)
  • You can search the New York Times archive for a phrase like "3rd mayor", "2nd governor", "1st president", or "100th pope", or visit the website for that entity, and recognize that there is a canonical numbering system, no matter who started it or who decided it is canonical. I don't think we have to worry about integrating antimayors to mirror antipopes, or pretender mayors. See "he was sworn in as the 21st Mayor of Montclair."; all you need is the current holder of the office and their number to work out the sequence. --RAN (talk) 13:37, 1 March 2021 (UTC)
Are you actually able to point to a canonical list for the Mayor of New Brunswick, RAN, or are you just blowing smoke? Can you point to a NYT search for, e.g. 3rd Mayor of NB? You say "I don't think we have to worry about integrating antimayors to mirror antipopes, or pretender mayors", but I'm old enough to remember that it was you who raised this whole question of how we cater for colonial-era Mayors of New Brunswick. It seems to me that you have decided such colonial-era Mayors of New Brunswick are holders of an office distinct from post-independence Mayors of NB, and are perhaps unhappy that others think that the Mayor of NB is the mayor of NB, irrespective of the situation with the national government. --Tagishsimon (talk) 01:54, 2 March 2021 (UTC)
@Tagishsimon: I agree there are many different ways of thinking about the history of a named populated place. It feels like a corollary to the "Ship of Theseus" question. I generally prefer to begin by building a model that matches well to existing articles in the language Wikipedia of the subject, because I'm not a legal or historical scholar: a model where all officeholders have replaces (P1365) and replaced by (P1366) properties with start and end dates, so that the sequence can be walked. William Graham (talk) 02:07, 1 March 2021 (UTC)

How do I connect the video game developer Iron Gate Studio (Q105725001) to the publisher Coffee Stain Studios (Q3178586)?

I know for a fact that the publisher mentioned is connected to the video game developer. There's no mention on the internet about the video game developer being a subsidiary of this publisher. Should I mention the publisher and the developer only for the game Valheim (Q105100327), or should I also mention their partnership in their QIDs? Tetizeraz (talk) 21:15, 1 March 2021 (UTC)

Wikifunctions logo contest

01:45, 2 March 2021 (UTC)

A demoted page

Greetings to all. I wanted to change the status of this item, last.fm on the Farsi Wikipedia, from good article, because the page has been demoted and no longer has good article status, but the page is protected and I am not able to do so. So, to anyone who is able: could you please be a dear and remove the good article badge? Thanks in advance Vahid (talk) 12:10, 2 March 2021 (UTC)

This item is not protected, so as an autoconfirmed user you can change it yourself at Special:SetSiteLink/Q183718/fawiki.--GZWDer (talk) 20:49, 2 March 2021 (UTC)

Academic publishers preprint policies

The page on ENWP: w:List of academic publishers by preprint policy has a structured table of publisher policies. How would it be best to encode these into Wikidata? A possible way of organising the data on the item of a publisher (e.g. Wiley (Q1479654)):

  • Statement: permits (P8738) preprint (Q580922) (or possibly create item for "submission from / sharing of preprint")
    • Qualifier: prohibits (P8739) create items for common restrictions e.g. "commercial preprint server" or "version after peer review"
    • Qualifier: has characteristic (P1552) alternatively the inverse of the above e.g. "Non-commercial preprint server only" or "version before peer review only"
    • Qualifier: Not sure how to encode conditions such as "If preprint is CCBY, then must pay APC"
    • Qualifier: start time (P580) if people want to add in when different publishers/journals changed policy
    • Reference: quotation (P1683) if people want to quote the policy text (along with reference URL (P854), obviously)

Please reply at w:Talk:List_of_academic_publishers_by_preprint_policy#Draft_wikidata_encoding to keep discussion centralised. Thanks in advance! T.Shafee(evo&evo) (talk) 05:04, 2 March 2021 (UTC)
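If the modelling proposed above is adopted, the policies become queryable. A rough sketch, assuming statements of the form permits (P8738) = preprint (Q580922) with the prohibits (P8739) qualifier:
SELECT ?publisher ?publisherLabel ?prohibitedLabel WHERE {
  ?publisher p:P8738 ?statement .
  ?statement ps:P8738 wd:Q580922 .                 # permits: preprint
  OPTIONAL { ?statement pq:P8739 ?prohibited . }   # prohibits: e.g. a "commercial preprint server" item
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100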

Sorting by a listeria table property

Question originally asked at Template_talk:Wikidata_list#Sorting_by_a_qualifier_property

Any ideas on how to sort a listeria table by a qualifier property (e.g. |sort=P106/Q3809586/P582)? The example below still seems to just sort by ascending QID (see my sandbox). T.Shafee(evo&evo) (talk) 09:59, 3 March 2021 (UTC)

File:HMS 'Sappho' capturing the Danish brig 'Admiral Jawl', 2 March 1808- ships engaged RMG A4461.tiff

This picture has been assigned a Q number, it is one of two. One q number has been assigned to two different pictures. The artist for both is Francis Sartorius (the younger). Commons has 5 images of two pictures. Broichmore (talk) 11:17, 2 March 2021 (UTC)

Are you sure? HMS Sappho capturing the Danish brig 'Admiral Jawl', 2 March 1808: surrender of the brig (Q50919903) and HMS Sappho capturing the Danish brig 'Admiral Jawl', 2 March 1808: surrender of the brig (Q50919903) each point to a single distinct image on commons. Where exactly is the " One q number has been assigned to two different pictures"? --Tagishsimon (talk) 21:09, 2 March 2021 (UTC)
@Broichmore, Tagishsimon: you're both being very unclear. Broichmore, is talking about multiple pictures and one qid without providing any links. Tagishsimon just linked the same picture twice.
I see that Commons:Category:Admiral_Yawl_(ship,_1808) was created, which has 5 pictures so I guess we're talking about these.
Looks like one of the image was linking to the wrong one. You can just change it to the right one. Multichill (talk) 19:37, 3 March 2021 (UTC)

Archiving options ?

What archiving options are currently supported for talk pages on Wikidata ?

Is there one where a thread will stay on the talk page until someone adds an "I think this thread can be archived now" template, and then it gets automatically archived eg 2 days later ?

I think I've seen this on other WMF wikis. Jheald (talk) 22:59, 3 March 2021 (UTC)

Aha! {{Section resolved}} is what I'm after. And it looks like User:SpBot will take care of things from there (Special:Contributions/SpBot). Jheald (talk) 23:23, 3 March 2021 (UTC)

Focus languages for improvements to the lexicographic extension of Wikidata and Abstract Wikipedia

Hi. We would like to find two or three language communities who would be good matches to help to start and guide some long-term improvements to the lexicographic data part of Wikidata, and the closely related work in the Wikifunctions wiki and the Abstract Wikipedia project, over the next few years. Participating communities will hopefully find that this project will lead to long-term growth in content in Wikipedia and Wiktionary in and about their language. See Wikidata:Lexicographical data/Focus languages for more information. Please help us identify potential good matches. More details are on that page. Thank you! Quiddity (WMF) (talk) 00:09, 4 March 2021 (UTC)

Underscore Smith and unknown Smith (label)

Interesting find: Q100924725 (and a few others) use "_" for the missing first name in the label. A past import went with "unknown" (Q75646725). I also came across others that merely used the family name in these cases. Either has its advantages. "_" probably works better when copied to nl. @IagoQnsi, Edoderoo, GZWDer: --- Jura 09:41, 2 March 2021 (UTC)

Script (style of handwriting)

I'd like to be able to record that a manuscript is written in secretary hand (Q16933853). The current property writing system (P282) is restricted to <instance of> alphabet, writing system, or orthographic transcription and has a short list of one-of constraints.

This seems to have been solved for manuscripts in uncial script (Q784235) by adding <instance of>Latin script (Q8229) to "uncial script". Is that the way to go about this? - PKM (talk) 19:58, 3 March 2021 (UTC)

@PKM: I think a new property "script style", to take values that were instances or subclasses of handwriting style (Q33260112) would be the right way forward. It could be overloaded onto property writing system (P282), but I am not sure it would be right -- item writing system (Q8192) says that it encompasses script (Q63801299): set of symbols of a writing system and orthography (Q43091): set of conventions for writing a language, but not handwriting style (Q33260112): style of handwritten document -- so it looks to me like something to identify the latter would be usefully orthogonal to writing system (P282). Jheald (talk) 23:38, 3 March 2021 (UTC)
@PKM: Proposed: Wikidata:Property proposal/writing style. Jheald (talk) 09:03, 4 March 2021 (UTC)

Logo Update

Hello, I am a new user trying out and still learning Wikidata. I am trying to create a Wikidata item for a company, just for information; I am not associated with the company in any way, this is part of my learning process. I will also be willing to use some other profile as well.

Please help me to upload the logo; it is not on Commons.  – The preceding unsigned comment was added by Keith8221 (talk • contribs) at 12:16, 2 March 2021‎ (UTC).

Hi, an Indonesian editor noticed vandalism in a BLP entry. The editor wondered if the following words could be added to the AbuseFilter/11:

Is this the correct place to ask for this kind of addition to the abuse filter? Also, please block Ixarising1 (talkcontribslogs) for vandalism. Thanks. Bennylin (talk) 15:12, 4 March 2021 (UTC)

DannyS712 is the last person to edit that filter, so if we ping him he can probably fix that. :-) Jon Harald Søby (talk) 15:33, 4 March 2021 (UTC)
I was just updating deprecated variables - I'm not familiar with the filter, sorry. DannyS712 (talk) 17:39, 4 March 2021 (UTC)

What's the general consensus regarding blocking/filtering profanities and such on Wikidata? I suppose this is also a problem in many languages' short descriptions? Is there a general strategy taken by Wikidata to combat such vandalism? Maybe each language could be assigned a specific filter of bad words (after making sure they are really bad and such). I don't want this to be just a stopgap solution.

@DannyS712: I manage such filters in id.wiki, so if you need some general syntax help, I can probably help with that. Bennylin (talk) 18:28, 4 March 2021 (UTC)

Systematic human abuse at Wikidata and Metawiki

How negligent can WMF staff be?

176.15.141.169 19:50, 4 March 2021 (UTC)

Sometimes they don't clean the coffee cups before going home. Did you have a point, or are you just grinding an axe? --Tagishsimon (talk) 20:01, 4 March 2021 (UTC)
If you see the issue happening on one page repeatedly, you may request protection at WD:AN.--GZWDer (talk) 20:34, 4 March 2021 (UTC)

Library of Congress identifier

I want to connect https://www.loc.gov/item/02004585/ to Records of Officers and Men in the Civil War, 1861-1865 (Q105755689), but it doesn't seem to fit any of the Library of Congress identifier schemes. Am I missing something, or should I just use the URL? --RAN (talk) 15:39, 4 March 2021 (UTC)

The formatter URL in Library of Congress Control Number (LCCN) (bibliographic) (P1144) is https://www.loc.gov/item/$1/ so that property would seem to achieve the link. Heaven knows whether the regex will be happy. --Tagishsimon (talk) 20:40, 4 March 2021 (UTC)
Excellent, thanks. I always wondered what the other LCCN number was for. --RAN (talk) 00:03, 5 March 2021 (UTC)

Q1687151

Two people were improperly merged; I am not sure I know how to de-merge properly yet. See Jens Dietrich (Q1687151). --RAN (talk) 23:57, 4 March 2021 (UTC)

Hopefully sorted. Now Jens Dietrich Zimmermann (Q3177184) and Jens Dietrich (Q1687151). --Tagishsimon (talk) 00:10, 5 March 2021 (UTC)

Invitation to join new Wikiproject: Early Modern England and Wales

We are building out a new project, Wikidata:WikiProject Early Modern England and Wales. The purpose of this project is to improve Wikidata's coverage of England and Wales in the 1500s and 1600s, with a particular focus on building out items needed to understand developing transport networks in the period before mass industrialisation. We'll be working in partnership with the collaborative Viae Regiae (Q105547906) project which is building a Gazetteer of Early Modern England and Wales (Q105548625). You can learn more about the Viae Regiae project on its website.

We want to enrich Wikidata's coverage of maps, cartography, manuscript itineraries, and historic places and people. We're just getting started - come join us! - PKM (talk) 00:45, 5 March 2021 (UTC)

Heads up: page protections of "highly used items"

A couple of months ago I proposed a bot task to indefinitely semiprotect "highly used items" as per Wikidata:Page protection policy#Highly used items. I have not been able to kick this off due to Covid lockdown constraints that leave little time for Wikidata, unfortunately, but now I am pretty much ready to go and want to let you know that this is going to happen within the next few days. There are roughly 25,000 items to receive indefinite semiprotection, and fewer than 100 which no longer qualify for protection under this scheme and need to be unprotected. The plan is to ramp this up over a couple of days with a series of manually executed bot runs; once this is done, an unsupervised weekly job will manage the protections (protect items which newly fall into this scheme, and unprotect those which no longer do).

I do not actively seek input at this point; previous thorough discussions have already taken place in the bot approval process and the request for admin rights for the bot account User:MsynABot. Nevertheless, you may comment here and I will read all the input :-) —MisterSynergy (talk) 09:29, 5 March 2021 (UTC)

thanks for doing this. BrokenSegue (talk) 17:25, 5 March 2021 (UTC)

Update: the first ~1.5k protections have been added, but I did receive complaints about how the protection log entries render on watchlists on English Wikipedia. There is now a phabricator task at phab:T276613 where I want to wait for some more input by the dev team before I can proceed. —MisterSynergy (talk) 21:18, 5 March 2021 (UTC)

The Wikidata consultation about the Universal Code of Conduct is closed

As you can read from the title, the second phase of the consultation about the Universal Code of Conduct is now closed. This means that the phase in which several communities (Arabic, Bengali, Italian, Korean, Malay, Nepalese, Polish, Yoruba, plus Wikidata and Wikimedia Commons) commented with their concerns, suggestions, ideas and opinions is now over.

Now, we facilitators will work to collect and sum up all of your feedback and answers. The results and the related data will then be published on Meta (you'll be notified when this happens), and will also be available to the Committee that will draft the second part of the UCoC, regarding the implementation of the Code and the reporting system.

This doesn't mean that the discussion is over! There will be new rounds of consultations on Meta and with our numerous affiliates, and there is also space for a community discussion here on Wikidata about implementing the UCoC ourselves. I'll be glad to help as a volunteer, if that happens.

Meanwhile, I want to thank all the people who took part in the discussions, reached out to me in private, answered the survey or were victims of my requests to participate! You've been great, and I truly thank you for your help!

If you want to keep in touch, keep following the page about the UCoC here or on Meta, and please let your voice be heard!

Cheers, --Sannita (WMF) (talk) 16:01, 5 March 2021 (UTC)

Ordering statement values

I am currently developing a page for the fictional solar system depicted in Kerbal Space Program (I must have done this link incorrectly, so just search for the above item. ERBuermann (talk) 18:11, 5 March 2021 (UTC)). I wanted to list the planetary systems in order, but I forgot to link the page about the central sun. How do I change the order in which the statement values appear? ERBuermann (talk) 18:10, 5 March 2021 (UTC)

https://www.wikidata.org/wiki/User:Tohaomg/rearrange_values.js --Tagishsimon (talk) 18:18, 5 March 2021 (UTC)
order doesn't have meaning on wikidata. you can reorder things but it's not meaningful and you should assume it'll get shuffled. BrokenSegue (talk) 19:57, 5 March 2021 (UTC)
Order does have meaning in the UI. It will not be shuffled unless someone shuffles it. --Tagishsimon (talk) 20:16, 5 March 2021 (UTC)
@Tagishsimon: I mean it has meaning only in that it is preserved. But it has no semantic meaning to wikidata and a bot or another user would be totally free to mess with the order as part of a refactor. Honestly I think the backend API should shuffle the order just to avoid people trying to assign meaning to the statement order. BrokenSegue (talk) 20:47, 5 March 2021 (UTC)
Sure, I know what you mean. But equally, when consulting items with, for instance, 10s of P39s, it's easier when they're in date order. When reading qualifiers, start date before end date is probably more intuitive & arguably requires less brain-power to comprehend than end before start. All manner of standardised ordering and standardised wording (e.g. for descriptions) helps the user; which is (also) what we should be about. Agreed that more could be done server-side, were we to agree a rule base. --Tagishsimon (talk) 21:06, 5 March 2021 (UTC)
hmmm, maybe we should implement sorting of statements front-end (alphabetical/series ordinal/etc). could be a plugin or something. BrokenSegue (talk) 21:11, 5 March 2021 (UTC)
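One front-end-independent option is to impose the order at query time from qualifiers rather than relying on statement order. A small sketch along those lines, using Elizabeth II (Q9682) purely as an example of an item with many position held (P39) statements:
SELECT ?position ?positionLabel ?start WHERE {
  wd:Q9682 p:P39 ?statement .
  ?statement ps:P39 ?position .
  OPTIONAL { ?statement pq:P580 ?start . }   # start time qualifier
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?start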

Q55107400 and Q28772011

Hello, I'm working on palaces in Bologna with QAnswer. I merged Q55107400 with Q28772011 because both are about the same palace. But Q28772011 is also about a museum. Perhaps it would be possible to use the Q55107400 redirect as an item only for the museum?--Patafisik (talk) 18:21, 5 March 2021 (UTC)

Probably better not to have merged them. The palace is a building. It has within it - at the moment - a museum. These are two distinct things, which can be linked, as separate items, by properties such as 'location'. I see you have made many label changes since the merge, but I'm inclined to wind back the merge. --Tagishsimon (talk) 18:29, 5 March 2021 (UTC)
I agree with you; I will use one item for the museum and the other for the palace.--Patafisik (talk) 08:03, 6 March 2021 (UTC)

Possibly different things linked to same item

The four Wikipedia articles linked to OpenFL (Q6954720) (OpenFL / NME / F4L) seem to be about two or maybe three different things. I don't know the context so I have no idea if they're actually the same thing. Is the item supposed to be split somehow? Overcast07 (talk) 11:14, 6 March 2021 (UTC)

I undid this merge. Now F4L has its own item. --Shinnin (talk) 11:33, 6 March 2021 (UTC)

Verifiability of data about myself

I've changed my name and gender. How can I provide verifiable citations about these changes to my own data? Marnanel (talk) 21:32, 4 March 2021 (UTC) (aka Marnanel Thurman (Q90844917))

That's a great question. Because Wikidata includes items for people who are not actually notable (e.g. relatives of notable people, candidates for minor offices, people imported from random databases, etc.), how do we expect those items to be maintained, especially given that incorrect data about living people can cause actual harm to those people? Kaldari (talk) 23:09, 4 March 2021 (UTC)
@Marnanel: Assuming there are no public sources such as newspaper articles that you can refer us to, the only suggestion I can think of at the moment would be to email privacy@wikimedia.org with evidence of the changes and request that they update the information. If that doesn't work, I would suggest posting a request at the Administrators' noticeboard. I went ahead and removed the gender from Q90844917 (as it was unreferenced anyway). Good luck. Kaldari (talk) 23:23, 4 March 2021 (UTC)
Surely privacy@wikidata.org? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:43, 6 March 2021 (UTC)
  • Help:Sources explains how to reference information on Wikidata; emailing the WMF isn't mentioned, and I don't quite see how that could be an acceptable way of referencing. --- Jura 07:06, 5 March 2021 (UTC)
  • It depends on the current statements - see Wikidata:Autobiography#On_Wikidata. If statements are unsourced you can just remove them. If statements are sourced, you can contact the source owner and ask them to update the info (otherwise even the WMF cannot do anything - you would just trigger algorithms which will reimport old data, either into the same item or by creating a duplicate). --Lockal (talk) 07:31, 5 March 2021 (UTC)
    I doubt that it would be possible to change the name of a candidate in a 2016 election. Ghouston (talk) 10:07, 5 March 2021 (UTC)
    Perhaps if you had something like a twitter account, which could be verified as belonging to the candidate in the 2016 election, then it would be possible to publish whatever biographical information about yourself that you liked, and this would be accepted as a source for Wikidata. Ghouston (talk) 10:26, 5 March 2021 (UTC)
  • One issue is, with the best will in the world, we have no proof that User:Marnanel is the same person as that represented by Q90844917; it would be just as easy for a malicious person to make that claim in order to present false data. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:43, 6 March 2021 (UTC)

generic sugar vs. table sugar (sucrose)

sugar (Q11002) in the English Wikipedia is about generic sugar, which can be a monosaccharide or a disaccharide. The Wikidata description describes table sugar, which is sucrose (Q4027534), a very specific disaccharide. The Wikipedia links are a mixture of articles in various languages on generic sugar and on the exact chemical sucrose (table sugar). Can another chemist confirm, before I tackle the difficult task of teasing apart the links and changing sugar (Q11002) (the generic one) to "a monosaccharide or disaccharide"? --RAN (talk) 00:50, 6 March 2021 (UTC)

Wikipedia articles being conflations of several concepts is ubiquitous. I'd suggest to wait until we can link to Wikipedia redirects, or create an item instance of Wikipedia overview article (Q20136634) and list all concepts as main subject (P921). --SCIdude (talk) 15:21, 6 March 2021 (UTC)
  • also I think that the articles are not talking about the chemical compounds but the food ingredient(s), and the item does reflect that. What is missing is that as a plant product it is not a pure chemical compound or a group of such, but a group of mixtures. Even table sugar is probably a mixture. --SCIdude (talk) 16:58, 6 March 2021 (UTC)
Table sugar is a 100% pure compound; that is why it crystallizes. The leftover material is sold separately as molasses and turbinado, which contain the other sugars as well as other chemicals. I changed a few links so that now most of them point to the proper "generic sugar" or to the pure chemical "sucrose" (table sugar). A few more may have to be changed. And as User:SCIdude points out, some Wikipedia articles are conflations that have no exact target. --RAN (talk) 20:49, 6 March 2021 (UTC)
    • I have adapted the description and statements. The items linking to it support my assessment. Fortunately we didn't need an overview item. --SCIdude (talk) 17:17, 6 March 2021 (UTC)

Grant application

I applied for a project to add and improve content on Nigerian female choreographers, dancers, musicians etc. on Wikidata. I will appreciate advice and suggestions. The link is https://meta.wikimedia.org/wiki/Grants:Project/Rapid/Anurikaonu/Nigerian_Female_Choreographers,_Dancers_and_Musicians --Anurikaonu (talk) 18:39, 6 March 2021 (UTC)

How to identify a man who impregnated a woman

I have an 1878 court case and I want to link the QID for the man in the entry for the woman. How would I do that? Normally we would use "Spouse". --RAN (talk) 21:03, 6 March 2021 (UTC)

Via their child? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:45, 6 March 2021 (UTC)
The child was anonymously adopted, I suppose I could make an entry for the child as "unnamed child" like we have at The Peerage. --RAN (talk) 00:25, 7 March 2021 (UTC)

It is now possible to manage interwiki links to/from multilingual Wikisource

Hi everyone,

As you may know, you can include Sitelinks (also known as interwiki links or interlanguage links) that go from individual Items in Wikidata to pages on other Wikimedia sites including Wikisource. Until now, it was not possible to manage the interwiki links to/from multilingual Wikisource on Wikidata. This was due to the unusual technical setup of multilingual Wikisource as well as the fact that we can only link to one page for a topic on any given wiki (multilingual Wikisource can theoretically have several pages covering the same topic in different languages though this is very rare).

We resolved the technical constraint and starting today you can connect pages on multilingual Wikisource with the other language-specific versions of Wikisource so these interwiki links don't have to be maintained in the wikitext anymore. After adding a sitelink, the sitelink appears on the Item page (in the Sitelinks section) with language code mul. The restriction of only being able to link to one page on multilingual Wikisource per Item stays in place and was considered acceptable based on feedback by the Wikisource community.

This feature was requested by users from the Wikisource community. We hope that it can give them more access to Wikidata and the freedom to add multiple links to different Items.

If you encounter any issues or want to provide feedback, feel free to use this Phabricator ticket.

Cheers,

-Mohammed Sadat (WMDE) (talk) 16:26, 23 February 2021 (UTC)

Nine million people

[2]--GZWDer (talk) 00:11, 1 March 2021 (UTC)

Or: Potential merges. --Succu (talk) 21:16, 1 March 2021 (UTC)
Cool. How many have functionally identifiable data like birth or death dates? How many are linked to zero items besides human (Q5)? How many lack external identifiers (not that these are required: not every human who has ever lived has yet been given a code in a database). Increasing the absolute numbers of items should not distract from structuring the data in a meaningful manner. -Animalparty (talk) 22:14, 1 March 2021 (UTC)
200,000 merges?
One third have family name (P734), two thirds given name (P735). It seems it never gets done ;)
99,991,000,000 to go? --- Jura 00:49, 2 March 2021 (UTC)

"Also known as" for historical names?

When an entity (e.g. a newspaper) has a historical name it no longer uses, should that name be included in the "also known as" alias field near the top, or should it only be included as title (P1476) with start time (P580) and end time (P582) qualifiers? {{u|Sdkb}}talk 21:12, 3 March 2021 (UTC)

I usually do both, especially for the "official name" of businesses. [edited] - PKM (talk) 21:21, 3 March 2021 (UTC)
If the alternates are functionally interchangeable besides name, then it's probably a good idea to list them as aliases (but nobody really knows anything here, we're all just blind fools doing whatever we think makes sense: so I could be wrong). If the entities are more substantially different, then it might make more sense to create a separate item for the historic entity. For instance, scholarly journals and newspapers may go through different titles over the years, but keep the same volume and issue numbering. Each title may have its own set of catalogue identifiers, and for ideal bibliographic display of the accurate title, separate items are warranted, connected by replaces (P1365), follows (P155), etc. Businesses and organizations that change names as a result of mergers or corporate restructuring probably also warrant distinct items, as the parent organization(s) and notable executives may only be applicable to one of the items. Simple name changes of schools, companies, and geographic places that don't change materially following name changes may not warrant separate items. -Animalparty (talk) 22:54, 3 March 2021 (UTC)

Baron Frolik is an American film director who has created films such as Attempted Murder[1] and The Microchip.  – The preceding unsigned comment was added by Johnnatas1 (talk • contribs).

Interpreting "described at URL" (P973)

I have some questions about the proper use of described at URL (P973). As I interpret this property, it should be used to point to a web page or document that is about the Q item. On items for subjects, e.g. data-driven modeling (Q102047184) and safari industry (Q105347324), I have added P973 and linked out to some articles about these topics that didn't have Wikidata items themselves. Jura1 has been removing these statements and creating new items for the articles, and in those items he is using P973 to give the URL that provides the actual full text of the article. I wrote to him saying that this didn't make sense to me, because the page at that URL isn't describing the article; it is the article itself. Instead, I think that full work available at URL (P953) is the correct property to use for a URL that provides the full text of an article. As an example, I changed P973 to P953 on Safari tourism (Q105534089). Jura1 said to me that "it isn't consistent with the purpose of described at URL (P973)." Am I misunderstanding the purpose of described at URL (P973)? UWashPrincipalCataloger (talk) 00:37, 17 February 2021 (UTC)

I use and have used described at URL (P973) the same way you use it. To me it's a top level property that is roughly analogous to a reference URL or further reading URL you'd see on English Wikipedia. If he's going to move them to a new item, I would recommend adding described by source (P1343) with a value of the newly created item. William Graham (talk) 01:13, 17 February 2021 (UTC)
That's a very good suggestion. But on the new item, shouldn't P973 just be P953? UWashPrincipalCataloger (talk) 01:21, 17 February 2021 (UTC)
I forgot to ping @Jura1: as is good etiquette. William Graham (talk) 01:25, 17 February 2021 (UTC)
I use “described at URL” the same way you do. - PKM (talk) 01:30, 17 February 2021 (UTC)

  WikiProject sum of all paintings has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. (given the uses for paintings) @Jarekt: (given the use on Commons). --- Jura 09:56, 19 February 2021 (UTC)

The Night Watch (Q219831) is described at https://www.rijksmuseum.nl/nl/collectie/SK-C-5 so that's the contents of described at URL (P973). Looks like only Jura thinks it should be done differently? Multichill (talk) 17:22, 19 February 2021 (UTC)
Isn't it? --- Jura 12:20, 23 February 2021 (UTC)
  • @Jura1: I would say in response that some of these URLs I would consider valid for use in "described at URL", but only the ones that really describe the original painting. Links to information about books or TV shows with some relationship to the painting would be a bit too far of a stretch for me to use here. UWashPrincipalCataloger (talk) 19:55, 28 February 2021 (UTC)
  • IMO exact match (P2888) is an indication of a bad smell, and should be avoided wherever possible. Its primary function is to indicate relationships that probably ought to have their own external-id property, but for some reason haven't had one created yet. Wherever possible (IMO) exact match (P2888) should be emptied, and its contents turned into appropriate external-id statements. Where that is not possible (IMO) P2888 should be avoided where possible, so as not to add distraction and noise, that make it harder to see the hard core of P2888 statements that still need external-id properties to be created. mapping relation type (P4390) = exact match (Q39893449) can be used as a qualifier to underline that a relationship really is an exact match.
Pinging @UWashPrincipalCataloger, William Graham, PKM, Ghouston, Jura1, Multichill: -- Jheald (talk) 11:25, 28 February 2021 (UTC)
@Jheald: --- Jura 14:01, 7 March 2021 (UTC)
@Jura1: I see "described by URL" as a parallel to "described by source". One source (eg a book) may describe many things. We compensate to some extent by allowing page number as qualifiers. But I have no expectation that described by source, even with a page-range qualification, will only describe one thing. Jheald (talk) 22:09, 7 March 2021 (UTC)

Warning: the last edit to the page has not yet been patrolled.

Hello, how do I patrol the last edit, for example the one concerning Q2847390? Thanks Cquoi (talk) 15:57, 6 March 2021 (UTC)

Go through the revisions and click on 'Mark as patrolled'.-❙❚❚❙❙ GnOeee ❚❙❚❙❙ 19:08, 7 March 2021 (UTC)

BD-08

Hello! I wanted to create an item for the BD-08 rifle, which is a redirect to en:Type 81 assault rifle (en:BD-08) on enwiki, but is a separate article on bnwiki (bn:বিডি-০৮). Should I link that redirect page to the item (while creating it)? Adibhai • আদিভাই (Talk • আলাপ) 16:40, 7 March 2021 (UTC)

P.S. Same about Mahiravana. (Please ping to Meghmollar2017.) Meghmollar2017Talk 16:56, 7 March 2021 (UTC)

@Meghmollar2017: yeah this is a known issue. Yes you should link to the redirect page but wikidata won't let you do that. Your best bet is to temporarily unredirect the enwiki article, link it and then restore the redirect. BrokenSegue (talk) 17:08, 7 March 2021 (UTC)

Constraint error

Can someone peek at Desert Sentinel (Q100293098) and see why I get the constraint error at "Chronicling America newspaper ID"? The links work. I know why I get the "expecting single value" error. --RAN (talk) 00:25, 7 March 2021 (UTC)

Have they changed their numbering recently? It looks like the regular expression was recently changed but not in the property constraint field. I've changed it to match but am still getting the error?!? Piecesofuk (talk) 18:27, 7 March 2021 (UTC)
I very vaguely recall there's a lag after the regex is changed. --Tagishsimon (talk) 19:00, 7 March 2021 (UTC)
I updated the regular expression. --Lockal (talk) 11:19, 8 March 2021 (UTC)

Wikidata weekly summary #458

Beginner question on second author

I am trying to enter that Practical Astronomy (Q104744987), a 1902 book, has two authors. The first is already listed, and now I'm trying to add the less famous Frank Stowell Harlow (Q105826927) as the second author. How? Jim.henderson (talk) 21:30, 8 March 2021 (UTC)

Add Q105826927 to author (P50), and use series ordinal (P1545) to denote order. See for example A new species of Kali (Salsoloideae, Chenopodiaceae) from Sicily, supported by molecular analysis (Q22077741). If authors are known but do not currently have Wikidata items, author name string (P2093) is preferred to "unknown" value until it can be replaced by an item. -Animalparty (talk) 21:41, 8 March 2021 (UTC)
Ah. Thanks. I thought my cleverness in creating a poor but valid personal Q would take care of all, but clearly there are many tricks that are not clear to the beginner. Gradually I learn. Jim.henderson (talk) 00:37, 9 March 2021 (UTC)
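For what it's worth, once both authors and their series ordinals are in place, consumers can read them back in order; a minimal sketch for Practical Astronomy (Q104744987):
SELECT ?author ?authorLabel ?ordinal WHERE {
  wd:Q104744987 p:P50 ?statement .
  ?statement ps:P50 ?author .
  OPTIONAL { ?statement pq:P1545 ?ordinal . }   # series ordinal
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?ordinal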

Modeling complex inscriptions

I have posted a number of questions about modeling complex inscriptions on maps or paintings at Property talk:P1684#Help with modeling inscriptions. I am especially interested in how WD statements for these items can be automatically used to populate the Commons {{Inscription}} template in the future. Comments and suggestions welcome on the Property Talk page. - PKM (talk) 23:42, 8 March 2021 (UTC)

Searching for ISBNs

Searching Wikidata for an ISBN requires an exact match, with hyphens (and spaces) being relevant; for example, in Resolver:

resolves correctly, whereas:

does not.

Similarly:

vs:

Is there any tool that can resolve the latter form? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:17, 6 March 2021 (UTC)

This might be overkill and not the right tool for the job, but I found this naive query from WDQS to work - it retrieves in 23 seconds for me
SELECT ?item ?ISBN13
WHERE 
{
  ?item wdt:P212 ?ISBN13.
  BIND(REPLACE(?ISBN13, "-", "") as ?isbnnum)
  FILTER (?isbnnum = "9781849383707" )
}
Try it!
Alternatively, I queried all items with ISBN-13s, of which there are about 60,000 records, and downloaded the CSV. I processed this file with regex in Python. Xenmorpha (talk) 04:08, 9 March 2021 (UTC)

Hotel rating

How can hotel ratings (hotel stars) and the rules for these ratings be entered (the latter as a qualifier)? I think these ratings are not awards (award received (P166)) like the Michelin star (Q20824563). --RolandUnger (talk) 08:36, 9 March 2021 (UTC)

Scientist

The item scientist (Q901) is used very broadly, with subclasses like brazier, byzantinist, church historian, collector of folk music, ironmaster, law theorist, land surveyor, management engineer, mathemagician, mental calculator, mythologist, sport historian, special educator, script kiddie and utopian (full list).

The demarcation between science and other fields is fuzzy, but it would be helpful to come up with some guidelines about what should be included. The guidelines could consider research vs. application to help determine the applicability of chemist vs. chemical engineer or animal scientist vs. farrier. It may also be helpful to create three subclasses for scientists in the natural, the social and the formal sciences. (But right now, natural scientist (Q55002844) is a fictional operatic character.) --Ariutta (talk) 19:22, 27 February 2021 (UTC)

  • You can easily create several items for "scientist" each with its definition and references. --- Jura 22:34, 27 February 2021 (UTC)
    • Since most entries are using scientist (Q901), I'm hesitant to make a major disruptive change, and I want to get some consensus before making any change, because the boundaries can be fuzzy. The most minimal change would be adding some guidelines for the item, so we don't end up with results like scientist as a superclass of theologian (theologian -> religious studies scholar -> social scientist -> scientist). A more major change would be adding three subclasses for scientists in the natural, the social and the formal sciences. While this would be useful for me, I don't have the bandwidth to create and maintain those items myself right now.--Ariutta (talk) 23:59, 1 March 2021 (UTC)
  • Seems to me that it's mostly a "downstream" thing. At one point a subclass-of link is applied a little abusively (e.g. land surveyor as a subclass of surveyor (Q11699606)). Circeus (talk) 18:10, 3 March 2021 (UTC)
    • Yes, that's true. When following the links from superclass to subclass, each individual link may be (at least somewhat) reasonable, but the final result (top-most superclass and lowest subclass) can be absurd.--Ariutta (talk) 18:44, 9 March 2021 (UTC)
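For anyone reviewing the tree, the full set of (transitive) subclasses of scientist (Q901) mentioned above can be listed with a simple query; a sketch (a LIMIT is kept because the class tree is large):
SELECT ?class ?classLabel WHERE {
  ?class wdt:P279+ wd:Q901 .   # all transitive subclasses of scientist (Q901)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 500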

querying and displaying images with author/copyright

When querying image (P18), the result at query.wikidata.org will be an image thumbnail with a larger size in a lightbox, which restricts its usage to very private occasions because I cannot fulfill CC attribution requirements (especially when embedding the result). What would be a feasible approach to solving this? I placed a similar question at Commons talk:Structured data, but there might be a different solution here, like opening the image in the Commons media viewer (as on the item page). Any ideas are appreciated … thanks! Please ping me for an answer --Elya (talk) 21:22, 3 March 2021 (UTC)

People have said that "click through" attribution is sufficient, and that's what Wikipedia uses. If you click the image you go to the page on Commons with the details. Ghouston (talk) 01:50, 4 March 2021 (UTC)
@Elya: The solution would be either or both of representing the copyright status and/or licence for the image in Commons structured data, and as a qualifier of the P18 value (copyright license (P275), for instance) such that reports could access one or other value. --Tagishsimon (talk) 02:06, 4 March 2021 (UTC)
@Tagishsimon, yes, we have this data in structured data on Commons, but how do we retrieve it? I could start a project completely outside the Query Service and combine two APIs; however, it would be nice to retrieve it via the Query Service. Writing a bot that copies structured data from Commons to Wikidata as qualifiers … ? Rather not, I think.
@Ghouston, thanks, but not exactly. Check out this random map query: Churches in Wittenberg. There is no chance to get to the image attributions.
Yes, this particular output is giving no way to get to the Commons image page, so it's inadequate. Perhaps the underlying software could be modified. Ghouston (talk) 09:15, 4 March 2021 (UTC)
I might however try to additionally output the Commons file page as a hyperlink in the map, which would be a no-API but sufficient solution. Thanks for thinking along with me ;-) If anybody is aware of an API approach, it is still highly appreciated! --Elya (talk) 06:22, 4 March 2021 (UTC)
@Elya: this technically works, but it is not as optimal as making multiple requests (first to WDQS, then to the Commons API, maybe then to WCQS).
SELECT ?item ?itemLabel ?pbLabel ?cat ?coord ?img (IF(BOUND(?img), ?usage, "") AS ?usg)
WITH {
  SELECT ?item (SAMPLE(?cat) AS ?cat) (SAMPLE(?coord) AS ?coord) (SAMPLE(?img) AS ?img)
               WHERE {
                 wd:Q75849591 wdt:P527 [ wdt:P527 ?item; wdt:P361 ?pb ].
                 ?pb wdt:P31 wd:Q76598130.
                 ?item wdt:P625 ?coord.
                 OPTIONAL { ?item wdt:P373 ?cat. }
                 OPTIONAL { ?item wdt:P18 ?img. }
               } GROUP BY ?item
} as %items
WHERE {
  INCLUDE %items .

  BIND(STRAFTER(wikibase:decodeUri(STR(?img)), "http://commons.wikimedia.org/wiki/Special:FilePath/") AS ?fileTitle)

  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "commons.wikimedia.org";
                    wikibase:api "Generator";
                    wikibase:limit "once";
                    mwapi:generator "allpages";
                    mwapi:gapfrom ?fileTitle;
                    mwapi:gapnamespace 6; # NS_FILE
                    mwapi:gaplimit "1";
                    mwapi:prop "imageinfo";
                    mwapi:iiprop "extmetadata" ;
                    mwapi:iiextmetadatafilter "UsageTerms" .

    ?usage wikibase:apiOutput "imageinfo/ii/extmetadata/UsageTerms/@value".
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de". }
}
Try it!
There are a few other interesting fields in extmetadata that you may want to use. Note that it does not take multiple licenses into account; for example, for https://commons.wikimedia.org/wiki/File:Wittenberg_Schlosskirche.JPG it shows only "Public Domain". --Lockal (talk) 16:11, 4 March 2021 (UTC)
Lockal, oh, wow … I'll have to dig into that, maybe with the folks at „request a query", whom I have been bothering a lot recently, because my own map query is a bit more complex than this church example. Only one thing: is it possible to get not only the license but also the author? --Elya (talk)
Oops, I forgot to add a link I used for debugging - https://commons.wikimedia.org/w/api.php?action=query&titles=File:Lutherstadt_Wittenberg,Kirchplatz,Stadtpfarrkirche_St._Marien.jpg%7CFile:Wittenberg_Schlosskirche.JPG&prop=imageinfo&iilimit=50&iiprop=extmetadata (api help). There is an "Artist" field, so updated query is here. Also, you may want to start with WCQS + P180 (depicts) instead of WDQS + P18 (image). --Lockal (talk) 18:15, 4 March 2021 (UTC)
@Lockal:, one little thing: my original query had 199 results; after querying the image authors, only 179 … the missing ones must be those with no images. I tried to make everything in the second request optional: here, but with no success. Any idea? I pulled everything from the output into a custom Leaflet map and it looks very promising :-) --Elya (talk) 18:47, 9 March 2021 (UTC)
@Elya: Found an easier query (in terms of readability), and the outer query always has the same size as the inner query - [4]. However, again, now it looks more like a "challenge query", for when you are limited to a single SPARQL call (e.g. for Listeria). If you are writing a separate application, you can fetch the same info much faster with 2 separate requests: first to SPARQL, second to api.php. --Lockal (talk) 19:23, 9 March 2021 (UTC)
@Lockal:, thanks! Now it's 192 vs. 199 items; I'll have to find out what's going on. I agree with your suggestion about the separate query; for the moment I'll just use the static result, as it's a relatively stable data set (every 3-5 years a couple of buildings receive this architectural award). At the beginning of this thread I did not intend to make my own "app" but to use the iframe from the query service. However, every couple of years I start learning Leaflet from scratch and have to build something ;-) Maybe I'll try the double API request … --Elya (talk) 21:22, 9 March 2021 (UTC)
I beg to suggest that SPARQL queries federated across Wikidata and Commons are not the way to go. If the licence / author / whatever data is in Commons structured data, then it can be reported on and added to the Wikidata P18 statements as qualifiers, such that your report can be based on Wikidata only. Q78227747#P18 --Tagishsimon (talk) 20:13, 4 March 2021 (UTC)
Hmmmm … Tagishsimon, really? I see the performance issue; my data soul shouts „but redundancy" … however, apart from my feelings, is there a bot that does this already? If this became a standard procedure, it would make things easier, indeed. It would, however, be necessary to have an automated check in place in case someone changes the image but not the respective license/author qualifiers. Still a bit hesitant … --Elya (talk) 17:11, 7 March 2021 (UTC)
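A sketch of the qualifier-based approach suggested above, reading attribution data directly from the P18 statement of the example item Q78227747; copyright license (P275) is mentioned earlier in this thread, while creator (P170) is assumed here as the qualifier used for the author:
SELECT ?image ?licenseLabel ?creatorLabel WHERE {
  wd:Q78227747 p:P18 ?statement .
  ?statement ps:P18 ?image .
  OPTIONAL { ?statement pq:P275 ?license . }   # copyright license qualifier
  OPTIONAL { ?statement pq:P170 ?creator . }   # creator qualifier (assumed)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,en" . }
}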

Is case sensitivity considered acceptable for multiple statements, or merge ?

Having multiple entries under the taxon common name property (P1843) which differ only in capitalization is not an uncommon issue, usually driven by imports from different sources which use different standards.

For example, the page for Corvus corax under taxon common name, language English has entries for 'Common Raven' and 'common raven'. Obviously these refer to the same data and the duplication seems to just be taking up space.

Is it acceptable to combine these and their references into a single entry, or is there some rule or guideline that says no, you have to leave them both in place?

Thanks CanadianCodhead (talk) 19:04, 5 March 2021 (UTC)

if two items represent the same concept they should be merged. even if they have totally different labels (or nearly the same labels). BrokenSegue (talk) 19:56, 5 March 2021 (UTC) oh I was confused ignore that. Oh yeah I think you want to delete one but don't ask me which. BrokenSegue (talk) 20:01, 5 March 2021 (UTC)
  • I would say merge, no need to delete, the external sources we loaded here, may be linking back to us. --RAN (talk) 22:40, 5 March 2021 (UTC)
  • Deleting one common name over another could cause issues. Common names of different organisms in different countries can have different standards. So removing one capitalization in favor of another would be favoring one country's norms over another country's. --EncycloPetey (talk) 22:55, 5 March 2021 (UTC)
  • We don't upload or store dates in different formats to reflect different standards in different places when they clearly represent the same thing; why would we not adopt the same standard for text? CanadianCodhead (talk) 03:25, 6 March 2021 (UTC)
    Because we have the ability to convert dates using standard protocols. There is no uniform nor standard protocol for conversion of common names; you'd have to encode lots of additional information about the grammar of individual parts of the name. For example "Common Raven", "common raven", and "Common raven" would suggest a rule about capitalization, but that rule would not apply to "California condor", where the first word must be capitalized. Words do not have uniform display rules in the way that dates do. Further, any rule that did apply would be specific to names in only one language. Some languages never capitalize common names, but in German all nouns are capitalized. So again, there are no uniform rules that can be universally applied in the way that uniform standards are applied to different dates. --EncycloPetey (talk) 20:25, 6 March 2021 (UTC)
  • I'm not suggesting choosing or forcing a standard display format of capitalization. I'm questioning why duplicated data is allowed and the need for it. Having the same name entered 2, 3, 4 times just varying in capitalization is unnecessary and confusing. If they do need to be there, then they should be disambiguated in some way to reflect where they are used, be that through different language strings, some property such as the 'stated as' qualifier, or another qualifier that indicates where this format is used, etc.
  • We would not support or allow separate numerical values for a statement to be entered as '35.4' and '35,4' to reflect different uses of a decimal point in display, so why support it for text? CanadianCodhead (talk) 14:39, 7 March 2021 (UTC)
    35.4 and 35,4 are the same data with different uniform standards applied that follow regular and predictable rules under defined situations. My point is that there are no such regular or predictable rules that can be applied to words in the way they are applied to dates or measurements. "Common raven", "Common Raven", and "common raven" are not duplicates; they are equally valid values used under different systems. The variation in capitalization is necessary to preserve information that cannot be collapsed without losing signal. Capitalization in English (and many other languages) carries signal: for example China is a country but china is a form of dinnerware. Language does not follow strict or regular rules; and even "rules" such as "i before e except after c" have more exceptions in English than examples that follow the rule. Artificially deciding to collapse two different capitalizations as "duplicates" is applying a "rule", and would be choosing or forcing a standard. --EncycloPetey (talk) 00:39, 9 March 2021 (UTC)
  • China the nation and china the tableware clearly refer to 2 different concepts. The fact that the tableware can be expressed in a sentence as either 'During the earthquake, my china was broken' or 'China fell out of the cabinet during the earthquake' does not mean that 'China' and 'china' in those 2 cases are different things. Nor does it mean that in the second somehow it is the nation being referenced, or that we need to somehow maintain both as separate entities to reflect the possibility of it being used either way. 'Common Raven', 'Common raven', 'common raven', 'COMMON RAVEN' and any other combination thereof do refer to the same thing. Even in places where it is agreed that 'common raven' is the "correct" format, it can be expressed differently. You would still write 'Common ravens are found here.', that does not mean it is a different thing. Rules about how to display items do not mean the items they express become different things or refer to different things.
  • We would not create separate Q items for each of the spellings. Why again is it not right to somehow disambiguate these, with the 'Stated as' qualifier being the most viable? CanadianCodhead (talk) 17:39, 9 March 2021 (UTC)

30 Lexic-o-days, events and challenges about lexicographical data

Hello all,

I'm glad to announce that 30 Lexic-o-days, a series of events, projects and challenges around lexicographical data, will start on March 15th. There will be discussions, presentations, but also activities like improving the documentation of Lexemes or editing challenges. The goal of this event is to gather people editing Lexemes to have discussions around the content and work together. You can find the schedule and all relevant links on this page.

This format is a first experiment and its content is powered by the community: if you have ideas or wishes for the discussions, you're very welcome to set up an appointment or to create a task on the related Phabricator board! We're also keeping an open list of ideas here. Discussions about Lexemes, or summaries of future discussions that will take place during the event, should be documented on the project page or its talk page.

If you have questions or need help to participate, feel free to contact me. I'm looking forward to your participation! Cheers, Lea Lacroix (WMDE) (talk) 12:32, 9 March 2021 (UTC)

Wikidata items for all Debian packages? Does notability point 2 still apply to these packages?

From the previous discussion "Wikidata items for all Debian Packages?" about whether all Debian packages should be included or not I'd like to document some of my progress and ask for further participation.

On the Mix'n'Match Debian stable package page, specifically the Preliminarily matched section I started doing some edits. One of the first items I wanted to check there was the package acm but there was a notability concern, point 2:

  • A "It refers to an instance of a clearly identifiable conceptual or material entity." and
  • B "The entity must be notable, in the sense that it can be described using serious and publicly available references."

I'd say acm satisfies A, but I'm not at all sure about B. How do I know if a package satisfies B? I will be pinging users who participated in the other project chat section I created; if you believe this pinging is unnecessary, please let me know. @ChristianKl:

@SCIdude: quoting from the previous project chat (now in archives) "I consider distribution packages as normal software releases (specialized to the distribution), so before collecting these it would be more important to have all the general releases (provided we even have the software item)"

If I did not misunderstand you, you mean a general item, like e.g. sudo (Q300883), and to that item we can add identifiers for Debian, Fedora, etc.?
The software has releases/versions designated by their authors. Distribution packages depend on specific versions. I don't think you should add anything about distribution packages before you have all the normal versions of the software. Distribution packages are in my opinion less notable than proper original software versions. --SCIdude (talk) 08:14, 9 March 2021 (UTC)

@Lockal: quoting from the previous project chat (now in archives) "Definitely not for "transitional dummy packages", or internal dependencies like https://packages.debian.org/buster/wesnoth-1.14-data"

Agree, I find that to be common sense and thanks for mentioning Mix'n'Match, I really enjoy having discovered this tool.

@MisterSynergy: any tips on detecting non-notable packages is appreciated. LotsofTheories (talk) 06:58, 9 March 2021 (UTC)

If this doesn't reach a conclusion, or I don't get the insight or community participation that I seek, I will stop editing Debian packages and related things in Wikidata, and I will lay this topic to rest. LotsofTheories (talk) 17:40, 9 March 2021 (UTC)

levels of specificity with P131

There is a (probably very large) number of items of geographical places with absurdly/unhelpfully broad values for located in the administrative territorial entity (P131), e.g. a building, park, or cemetery that is stated as being simply located in Mexico (Q96) or New York (Q1384) (many because they were auto-populated by Wikipedia categories of "Xs in Y" at a time when "Y" was only as coarse as Mexico or New York State). Someone more adept with querying, feel free to add some hard numbers. I would imagine this is a suboptimal data structure: ideally a discrete entity should be located in the narrowest (most exclusive) locality, which in turn is nested in the next most inclusive jurisdiction, and so on, and so on. Someone trying to query all churches in New York City is going to be stymied if half of the churches are only located in "New York State". Right? So I have a handful of questions: 1) How big a problem is this? If reasonably big, would someone be able to generate a table of discrete geographical entities (e.g. buildings, structures, basically any geographical place that cannot be placed in a more exclusive P31) that have P131 values equal to or greater than federated state (Q107390), province (Q34876), U.S. state (Q35657), and/or maybe some sort of system for flagging/detecting such entities when they are created? And 2) if an item has an explicit value for coordinate location (P625) but a vague/broad P131 value, is querying improved at all? Does the mother brain of Wikidata recognize that just because the lat-long of a statue is situated in Central Park (Q160409), that therefore the statue is in Manhattan (Q11299)? -Animalparty (talk) 02:43, 6 March 2021 (UTC)

You make a huge presumption, that the narrowest P131 value should be used, and all higher values found by following the P131* path. Before we analyse the presumed problem, we might want to dig into that somewhat. First, I don't think there is consensus that only the narrowest value should be used. Second, you "imagine this is a suboptimal data structure" but do not state the nature of the suboptimality. The cost and scope for inconsistency in maintaining multiple layers of P131 per item might be considered suboptimal. The benefits of maintaining multiple layers of P131 per item when reporting might be considered optimal. Redundancy is often used in data structures to diminish runtimes and make report construction simpler.
In other news, there is no "mother brain of Wikidata". Coordinates are of some help in establishing which P131 values are appropriate for an item, but there is no automated way of establishing P131 values from P625 values.
I point to Scottish items as a case in point: the vast majority of geolocatable items have P131 values for their civil parish and their local authority area. P131s in Scotland tend to use object has role (P3831) such that it is clear what level is being represented. Civil parish is, of course, the lowest level of P131 - the narrowest. But in my experience, the local authority area is the most useful P131 value. Bluntly, few care which parish a thing is in. Many care which local authority area it is in. Your proposal / unilateral declaration of a problem currently seems completely unhinged from any consideration of the way in which data is consumed.
I have to tell you, too, that any idea that there should only be a single P131 value is for the birds. Geolocatable items tend to be within the bounds of multiple orthogonal administrative territorial entities - in the UK, for instance, parishes, local authority areas, health provider areas, flood defence areas &c &c &c - and P131 users will have to become adept at ascertaining which P131 values they consider, based on their type, as Wikidata moves forward with the collection of richer datasets.
This is not to say that there is not massive scope for improving P131 data, which should, ideally, always identify the narrowest entity for the item. --Tagishsimon (talk) 03:09, 6 March 2021 (UTC)
As someone who has been trying to point P131 to narrower items, I do think it is better for narrower items to be used in P131. Note that machines can easily go down the sublevels assuming a hierarchical structure of P131 exists. Even if "people do not care which parish it is in" - it is easy for a machine to get to the local authority area, if this data is modeled, which can be added. Re: "is no automated way of establishing P131 values from P625 values." - if you mean no one is doing it on Wikidata, then this is probably true. However if one has a way of partitioning New York State into subdivisions - by first querying for all items with a P625 value while having a P131 value of New York State, then it should be quite possible to classify all the points based on the boundaries of those partitions. In short, you probably need to write a bot or script to do it, although I'm not sure how one would clear administrative barriers such as Request for Bot Permissions as well. Xenmorpha (talk) 03:16, 9 March 2021 (UTC)
@Xenmorpha: Indeed, this should be simple enough for an off-line script. historic county (P7959) was previously populated in just this way, I think. Any assignments made in this way should probably have the reference added based on heuristic (P887) = "inferred from coordinates", so that it is clear where the information came from. (Important, because some wd coordinates may not be completely accurate, some indeed completely inaccurate). Jheald (talk) 08:33, 10 March 2021 (UTC)
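A rough sketch of such an off-line point-in-polygon pass (the GeoJSON file and its property key are made-up; shapely is just one library that can do the containment test):

# Rough sketch: classify items that only have a state-level P131 into boroughs,
# given a GeoJSON file of borough boundaries. The file name and the "name"
# property key are assumptions; any point-in-polygon library would do.
import json
from shapely.geometry import shape, Point

with open("nyc_boroughs.geojson") as f:          # hypothetical boundary file
    boroughs = [(feat["properties"]["name"], shape(feat["geometry"]))
                for feat in json.load(f)["features"]]

def narrowest_p131(lon, lat):
    pt = Point(lon, lat)                         # note: GeoJSON order is lon, lat
    for name, poly in boroughs:
        if poly.contains(pt):
            return name
    return None

# e.g. for each (item, P625 coordinate) pair returned from WDQS:
print(narrowest_p131(-73.9712, 40.7831))         # a point in Manhattan

Each match could then be mapped to the corresponding Q-id and written back with a based on heuristic (P887) = "inferred from coordinates" reference, as suggested above.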
@Tagishsimon, Xenmorpha: I think the Scottish parishes are a slightly special case. They lost their last actual administrative functions in 1930; but were never comprehensively replaced by anything at that level of localness, so they have continued to be used (including officially) to eg organise lists of heritage buildings; or, on wiki, to organise pictures on Commons. And after so many hundred years of existence, they often retain quite an identification and resonance on the ground too, as a local-sized unit with defined boundaries that people do identify with.
But of course it's also important to identify what the local unit that actually currently does administration is. (Currently "council area" in Scotland -- much larger areas, with borders that don't necessarily recognise old parish areas).
We got told off by people looking after the infoboxes on Russian wikipedia for having multiple P131s on items -- and I think rightly so. IMO there should at most be one clear 'preferred' P131 value, for the most local entity in the most significant present-day administrative hierarchy. IMO other P131s may exist, but not with best-value rank (so that infoboxes can have one clear value to present). Some of the hierarchies you talk about -- "health provider areas, flood defence areas &c &c &c" -- I think are alternative hierarchies that, for querying, it would be useful to have a different property for, other than P131. (Because one so often wants to use P131* or P131+ in queries, which don't allow qualifiers to be checked). But the important thing is to make sure that there is only one 'best value' for P131, and that that corresponds to the current principal hierarchy. Then P131* and P131+ can at least be used for that. Jheald (talk) 08:14, 10 March 2021 (UTC)

Invalid claims

Hi, can something please be done about user:Mapenite? He uses 5 digits for Property:P1368 in articles instead of 9 digits and causes the following message to appear in red at the bottom of articles: "The LNB id 47485 is not valid." See en:Linguistics. He ignores requests to use 9 digits and puts his invalid claims on hundreds of pages. Thank you. ~ Burgert (kontak) 10:46, 10 March 2021 (UTC)

Numeric IDs

I'm interested in proposing new properties for Bandcamp and SoundCloud stable numeric IDs, but I'm not sure how they would work with existing properties.

Bandcamp has numeric identifiers for artists, albums and tracks; e.g. C418 (Q1847436) is 3385865266, Minecraft – Volume Alpha (Q1523255) is 1349219244, Gospel for a New Century (Q104878749) is 3437627917. There's already a property for Bandcamp artists, but the ID that's used is the subdomain, which is unstable. The URLs that use the numeric IDs don't redirect to the normal pages, and there aren't embeds for artist pages so I've had to use the contact form.

SoundCloud has numeric identifiers for users, playlists (sets/albums) and tracks; e.g. Leona Lewis (Q183519) is 5528551, Day69 (Q49422702) is 449464896, Mouth Moods (Q28532736) is 304209955. There's already a property for SoundCloud pages, but the ID is the entire URL after the domain name, so it'd be used for all three types.

  • Is there a rule for when to have both numeric ID and non-numeric ID properties for something?
  • Should the numeric ID properties be used as statements, as qualifiers, or both?
  • Does a bot usually maintain the uses of these pairs of properties? Would someone be expected to run one for all of the new properties?

My thinking at the moment is that it would make sense to create all six properties and to use the Bandcamp artist, SoundCloud user, SoundCloud playlist and SoundCloud track properties in qualifiers, which would be consistent with how Twitter user, tweet and topic IDs are handled. However, I don't know if it'd be appropriate to use all three new SoundCloud properties as qualifiers of the one existing property, or if it'd make sense to not create extra properties for Bandcamp albums and tracks (which would have to use the full URLs). Overcast07 (talk) 17:21, 10 March 2021 (UTC)

If there are two different identifiers for the same database, with similar coverage, I would tend to make one a qualifier of the other, just to group things together on the item page. Jheald (talk) 18:04, 10 March 2021 (UTC)
We have a similar problem for Twitter user IDs. We have both X username (P2002) and X user numeric ID (P6552), where the latter is "stable" and the former is easy for an end user to get. I think we should do as Jheald says and have both and make one a qualifier. In the end we should make sure it's easy to contribute a new identifier and have the rest get populated by a bot if people don't want to look up the internal numeric id (but I don't think a bot is a prerequisite for making the props). BrokenSegue (talk) 19:22, 10 March 2021 (UTC)
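For the Wikidata side of such a bot, a minimal pywikibot sketch of the qualifier pattern (it assumes the item already has an X username (P2002) statement; the numeric ID value here is made up, and the SoundCloud/Bandcamp numeric-ID properties under discussion do not exist yet):

# Sketch: attach a numeric user ID (P6552) as a qualifier of the username
# statement (P2002), the pattern discussed above. The ID value is made up;
# don't run this against production items without checking the data first.
import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()
item = pywikibot.ItemPage(repo, 'Q183519')        # Leona Lewis, from the example above
item.get()

for claim in item.claims.get('P2002', []):        # X (Twitter) username
    if not claim.qualifiers.get('P6552'):         # numeric ID not yet attached
        qual = pywikibot.Claim(repo, 'P6552')
        qual.setTarget('25566444')                # made-up numeric ID
        claim.addQualifier(qual, summary='add numeric user ID as qualifier')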

White field

On the right half of the screen on item pages I have a big white field. I use Firefox 86.0 on Windows 10. Firefox 56.0.2 with the same OS did not have this problem. 217.117.125.88 17:55, 5 March 2021 (UTC)

Hi! To report technical issues, please use Wikidata:Contact the development team. Also, do you mind explaining more about what you mean -- possibly with a screenshot, so that we can understand better. Thanks. -Mohammed Sadat (WMDE) (talk) 08:32, 11 March 2021 (UTC)

Wikidata-Game?

Hello. Apparently, there is something called #wikidata-game that stores at least wrong gender information, e.g.: [5], [6]. I corrected it for these two people, but I would assume that many more mistakes were made by whatever wikidata-game is. --Christian140 (talk) 09:35, 11 March 2021 (UTC)

This: https://wikidata-game.toolforge.org/distributed/# ... essentially it's a user interface which enables users to make decisions such that data is added to wikidata. It does not store wrong gender information. It allows users to make the wrong choice, much like any user interface. It is arguably not a neutral thing. It is also arguably not to blame. --Tagishsimon (talk) 09:41, 11 March 2021 (UTC)

Membership in the list

There is a list of some elements (for example, a list of cultural heritage sites — list of cultural heritage sites in Sortavala (Karelia, Russia) (Q27601176)). Is it possible to indicate that a specific entity is included in this list (e.g. Wooden Town Hall of Sortavala (Q105836102))? Obviously, "has list (P2354)" is not suitable for this purpose, is it? --Avsolov (talk) 09:05, 10 March 2021 (UTC)

Just put list of cultural heritage sites in Sortavala (Karelia, Russia) (Q27601176) as object of stated in (P248) in a reference claim of heritage designation (P1435). I have done this for you in https://www.wikidata.org/wiki/Q105836102#P1435. --SCIdude (talk) 08:30, 11 March 2021 (UTC)
@Avsolov: Would be better to source it to the source of the list, but I recall that wasn't easy for Russia.
Now that I found someone who is working on cultural heritage in Russia, I might as well ask some questions:
kulturnoe-nasledie.ru ID (P1483) is the same as the one used in Commons:Template:Cultural Heritage Russia, right? I see that kulturnoe-nasledie.ru ID (P1483) is used only 8800 times, but the monuments database has 185946 entries for Russia. Do you know if anyone is working on importing these missing cultural heritage monuments? Multichill (talk) 16:31, 11 March 2021 (UTC)
@Multichill:, you are right, kulturnoe-nasledie.ru ID (P1483) is exactly that identifier. At the moment the list of Russian cultural heritage monuments is maintained by Russian Wikivoyage team. And our team is now discussing the process of importing cultural heritage monuments into Wikidata. --Avsolov (talk) 17:55, 11 March 2021 (UTC)

Timeout problems

I am currently importing a large set of data using a script, but I frequently encounter timeouts ("MaxlagTimeoutError: Maximum retries attempted due to maxlag without success."). This significantly impairs the import and, honestly, also my motivation for working in Wikidata, because it causes manual work to get the script back on track. Why do such problems occur and how can I avoid them? Steak (talk) 12:28, 11 March 2021 (UTC)

What tool are you using?
Maxlag is the API asking for a slower data intake b/c it's getting behind in serialising wikidata's RDF into triples for the reporting service. chart here. In general tools are written to apprehend maxlag and decrease their input rate when maxlag is >5. So although this can slow things down, it does not cause them to stop. The possibility is that your tool is not apprehending or reacting to maxlag as it might? --Tagishsimon (talk) 12:34, 11 March 2021 (UTC)
I am using Anaconda Spyder with the pywikibot module. I have several try/except loops in my script that should avoid immediate crashes at timeout, e.g.

# Qitem and newElo are defined earlier in the script
while True:
    try:
        Qitem.addClaim(newElo)
        break
    except pywikibot.exceptions.MaxlagTimeoutError as e:
        print(e)
        time.sleep(35)
    except pywikibot.exceptions.OtherPageSaveError as e:
        print(e)
        time.sleep(35)

But this does not really help. Steak (talk) 12:39, 11 March 2021 (UTC)

I would recommend updating pywikibot to the latest version. I used to see this very same behavior on PAWS and Toolforge in the past, but meanwhile it just keeps on trying—even without try/except handling… —MisterSynergy (talk) 12:57, 11 March 2021 (UTC)
@Steak: yes, update to the latest version of Pywikibot and use a bot account. The wait time should increase and if you increase the number of retries (for example max_retries = 150 in your user-config.py), it might take longer, but it will complete. Multichill (talk) 16:20, 11 March 2021 (UTC)
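For reference, the relevant fragment of user-config.py might look roughly like this (maxlag, max_retries, retry_wait and put_throttle are standard Pywikibot settings; the values and the account name are just examples):

# Sketch of the relevant bits of user-config.py (values are examples only)
family = 'wikidata'
mylang = 'wikidata'
usernames['wikidata']['wikidata'] = 'ExampleBot'   # hypothetical bot account name

maxlag = 5          # back off when the reported server lag exceeds this
max_retries = 150   # keep retrying instead of giving up with MaxlagTimeoutError
retry_wait = 30     # seconds before the first retry (then backs off further)
put_throttle = 2    # minimum seconds between writes

With a bot account and settings like these, Pywikibot should slow down and keep retrying rather than raising the error.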
Okay, I have set up Anaconda new from scratch and added the max_retries parameter to my user config. Let's see if it has a positive effect. Steak (talk) 18:32, 11 March 2021 (UTC)
Unfortunately it has no effect. It's still crashing with the same error. Steak (talk) 19:00, 11 March 2021 (UTC)

Conscientious objection to military service (Q5576246) and conscientious objection (Q2930613)

There's a screwed-up mess between Q5576246 (conscientious objection to military service) and Q2930613 (conscientious objection), the latter in principle a broader category not necessarily limited to military service objection, but the links show this is not the case. I don't have time to deal with this, maybe someone else would like to jump in and unscramble the situation. Thanks, Mathglot (talk) 01:52, 11 March 2021 (UTC)

Should we take conscientious objector (Q2930613) to be what its English label and linked ca and es Wikipedia articles say, as conscientious objection to anything, and not what its linked en Wikipedia says, as applying to military service only? Both Wikipedia articles have been linked since the item was created in 2013. Ghouston (talk) 09:09, 11 March 2021 (UTC)
Or should we alternatively say that "conscientious objection" is generally used in English in the context of military service, and the two items should be merged as duplicates, and a new item created for the broader concept (which apparently includes objections to things like performing abortions or same-sex marriage)? Ghouston (talk) 21:46, 11 March 2021 (UTC)
Lurking, and hoping that others more familiar with WD than I am will resolve this. I can jump in later if needed, but from a quick glance at your questions, I think you've pinpointed the problem, which is of course the first step to coming up with a solution. What I remember from looking at it before giving up, is that whatever is decided, a bunch of articles will have to be delinked and relinked. I seem to remember that it's complicated by the fact that some wikipedias have articles on both concepts, some have just one (which might be linked to the more general, or the more specific concept), so my *hunch* is that we should start first with the wikipedias that have both concepts, and interlink all of them properly (delinking 1-concept 'pedias in this step only where necessary to get the 2-concept pedias connected properly), and once that is done, see what has to be done with the 1-concepters. Mathglot (talk) 05:58, 12 March 2021 (UTC)
Yes, and I'd say a new item is needed given that the meaning of conscientious objector (Q2930613) is unclear. conscience clause (Q5162726) is a related item that doesn't seem to be in good shape either. Ghouston (talk) 08:28, 12 March 2021 (UTC)

Items for colours

 
Not all reds are #FF0000

white (Q23444) has the alias and the hex value #FFFFFF, but surely the item is describing a range of colours described as white, such as #FFFFFE?

Likewise, we state that red (Q3142) has the hex value #FF0000, but the image on that item is the one shown here.

What's the best way to resolve this ambiguity? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:08, 11 March 2021 (UTC)

Perhaps said to be the same as (P460) could help in specifying a range of values; though whether there exist references tying the hex colour to the colour name is another q. Your point is well made: white extends further than #ffffff. --Tagishsimon (talk) 10:14, 11 March 2021 (UTC)

I'm also grateful to User:Jheald, who points out that both magenta (Q3276756) and fuchsia (Q5005364) have the hex value FF00FF.

I have just created #FFFFFE (Q105883127) as an example of how we could model colours (assuming Q23444 is taken to represent a range, and that a separate item representing #FFFFFF would be created).

On a related note, perhaps someone could make a user script or gadget that would display a square of the appropriate colour, next to sRGB color hex triplet (P465) values? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:39, 11 March 2021 (UTC)

P465 should have an "external ID" datatype, not "string"; this would allow the use of formatter URLs like https://www.colorhexa.com/$1 (e.g. linking to https://www.colorhexa.com/fffffe). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:48, 11 March 2021 (UTC)

red (Q3142) and other natural color names should really be classes, with individual hex points being instances of them.--GZWDer (talk) 22:58, 11 March 2021 (UTC)

OSM question

Perhaps someone can answer to this question? Lymantria (talk) 21:33, 11 March 2021 (UTC)

Done. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:26, 12 March 2021 (UTC)

Another OSM question

I was trying to add this relation to our element Q4174348. I am using the relation ID provided by Wikidata which is 1433267622. However, when I add this to our element, I (i) get a constraint violation notification saying the ID should not be longer than 8 characters, and (ii) if I click on the link I get to the correct place on OSM but get an error. What am I doing wrong?--Ymblanter (talk) 16:53, 12 March 2021 (UTC)

@Ymblanter: I'm not an OSM expert but I think the issue is that you are adding a "node" when that property is for "relations". Is there a relation for that item? Or maybe we need to add a wikidata property to capture OSM nodes. BrokenSegue (talk) 16:58, 12 March 2021 (UTC)
Thank you. I think you are right - this is indeed a node and not a relation (and this object probably should not have a relation).--Ymblanter (talk) 17:20, 12 March 2021 (UTC)

The {{Item documentation}} template doesn't support displaying any language other than English

Whatever language I choose, the "parent class" section is in English. --173.68.165.114 20:39, 12 March 2021 (UTC)

Creating interwikis via bots on Wiktionary

I have a list of Kurdish wikt. categories and I want to connect these with en.wikt. Is there any script to connect them automatically using pywikibot or AWB? --Balyozxane (talk) 17:08, 9 March 2021 (UTC)

If the categories contain old-style interwiki, you can use interwikidata.py script from the Pywikibot distribution. If not, you may have to manually map them one-by-one to Wikidata items (which for a list of categories in the other language can be obtained using e.g. PetScan) and then use Help:QuickStatements. --Matěj Suchánek (talk) 15:37, 12 March 2021 (UTC)
Thank you for the reply. Those tools would work, but it's just impossible without extensive coding knowledge.--Balyozxane (talk) 23:41, 12 March 2021 (UTC)
@Matěj Suchánek: Do we need bot flag to connect/create interwikis using pywikibot? I tried some links then realized they needed to be "patrolled" so I filed a bot request instead. --Balyozxane (talk) 06:40, 13 March 2021 (UTC)
I believe for a few hundreds of links you don't really have to. As for code knowledge, these tools don't require you to code anything. I'm currently connecting the categories to Wikidata, but you will have to remove the links from the category pages by a bot approved on your wiki. --Matěj Suchánek (talk) 09:21, 13 March 2021 (UTC)
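For the PetScan + QuickStatements route, a small sketch of how a ku↔en category mapping could be turned into QuickStatements commands (the item IDs and category titles are made-up examples, and the Skuwiktionary column assumes the usual v1 syntax for setting sitelinks):

# Sketch: turn a mapping of Wikidata item -> Kurdish Wiktionary category title
# into QuickStatements (v1) commands that add the kuwiktionary sitelink.
# The item IDs and titles below are made-up examples.
mapping = {
    "Q8253890": "Kategorî:Navdêrên kurdî",   # hypothetical pair
    "Q7139615": "Kategorî:Lêkerên kurdî",    # hypothetical pair
}

for qid, title in mapping.items():
    # one tab-separated command per line
    print(f'{qid}\tSkuwiktionary\t"{title}"')

The output lines can then be pasted into the QuickStatements tool in one batch.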

Merge request

Please can someone merge National Tutoring Programme (Q106047711) with National Tutoring Programme (Q106047780), they're the same thing.

Thanks

John Cummings (talk) 16:55, 19 March 2021 (UTC)

  Done--Ymblanter (talk) 17:08, 19 March 2021 (UTC)

Please see Help:Merge for next time.

This section was archived on a request by: --- Jura 19:26, 19 March 2021 (UTC)

Igreja Paroquial de Ferreira do Alentejo

  Done--Ymblanter (talk) 17:09, 19 March 2021 (UTC)

Please see Help:Merge for next time.

This section was archived on a request by: --- Jura 19:26, 19 March 2021 (UTC)

How to describe Pastel QAnon in statements?

Hi all

I just created an English Wikipedia article for Pastel QAnon Pastel QAnon (Q105958828), but I'm not really sure how to describe it on Wikidata. If anyone has any experience in describing conspiracy theories please help. My main issue in describing it on Wikidata is that it's not a conspiracy theory in itself; it's a collection of strategies for indoctrination into QAnon. --John Cummings (talk) 22:16, 14 March 2021 (UTC)

Survey

I would suggest not to participate in a 'Short survey: The Wikidata Community in 2021 : Help Wikimedia Deutschland and the rest of the Wikidata community to find out who edits Wikidata. Fill out a 2-minute, anonymous survey (Available in English, Chinese, Hindi, Spanish, Portuguese, French, Arabic and German)' (with link privacy statement below 'Take survey →' button).

A reason is because the survey is found at https://wikimedia.sslsurvey.de/Wikidata_Community_in_2021/en (sslsurvey.de is domain name with https://sslsurvey.de leading to 'The survey could not be found' error page titled 'LamaPoll' at https://sslsurvey.de/#/).

Maybe it is a usable website for surveys, but it looks kind of sketchy.

--5.43.83.177 02:28, 15 March 2021 (UTC)

How to monitor constraint violations

I have recently discovered many edits producing constraint violations. See Wikidata_talk:WikiProject_France/Communes#Communes_with_multiple_INSEE_codes_with_none_as_preferred_rank

They violate the single value constraint because the current value was not set as preferred rank.

They were added nearly a year ago (2020-04-11) without being noticed/addressed until now.

I am wondering if it would be possible to set some monitoring mechanism to raise an alert as soon as these violations appear, so that they can be fixed immediately.

Thank you. --Albert Villanova del Moral (talk) 08:41, 12 March 2021 (UTC)

Does https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P374#Single_value work? AFAIK all constraints produce a violation report, linked from the talk page - e.g. search on that page for 'single value'. --Tagishsimon (talk) 08:52, 12 March 2021 (UTC)
Thanks a lot for your answer, Tagishsimon. It is very useful to know that list exists. I am wondering if it would be possible to automate the detection of the violations as soon as they appear. Regards, --Albert Villanova del Moral (talk) 09:20, 12 March 2021 (UTC)
These lists are produced by SPARQL queries. If you want the immediate result run the query yourself. Usually the results are not older than 5 minutes. --SCIdude (talk) 07:46, 13 March 2021 (UTC)
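For this specific case (items with more than one normal-rank INSEE code (P374) statement and none preferred), a small script could run a query along these lines on a schedule and diff or mail the result to raise an alert — a sketch, not the exact query behind the database reports:

# Sketch: ask WDQS for items that have more than one normal-rank P374
# statement and no preferred-rank one.
import requests

QUERY = """
SELECT ?item (COUNT(?st) AS ?values) WHERE {
  ?item p:P374 ?st .
  ?st wikibase:rank wikibase:NormalRank .
  FILTER NOT EXISTS { ?item p:P374/wikibase:rank wikibase:PreferredRank . }
}
GROUP BY ?item
HAVING (COUNT(?st) > 1)
"""

r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "constraint-watch-sketch/0.1"})
for row in r.json()["results"]["bindings"]:
    print(row["item"]["value"], row["values"]["value"])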

Import from ISNI

Forgive me as I couldn't find the answer. Is there a way to import ISNI data?

http://isni.org/isni/0000000079977033

Or even better (more complete), from here:

https://catalogue.bnf.fr/ark:/12148/cb161929407

Thank you! Meskalam (talk) 13:11, 14 March 2021 (UTC)

@Meskalam: At the moment no, as far as I know. --Epìdosis 07:47, 15 March 2021 (UTC)
  • You can add them manually. As it's unclear how ISNI data is compiled and sourced, it might not be advisable to import without checking. Maybe it's just an impression of mine, but while VIAF cleverly reclusters incorrect matches every month or so, ISNI just collects all past matching names (correct or not). --- Jura 08:02, 15 March 2021 (UTC)
@Epìdosis:, @Jura1:. Thanks for the quick answer. Done! Q105963069

Bold lawn-scold newbie requests quick edit-history review of my vigorous but wrinkled wet toe

I'm a long-time editor at en.wiki, but new here.

I figure the best way to learn is to dive into the deep end. Within about an hour of sizing the place up, I created the new item:

And then I ran around perhaps overdoing the "facet of" property, snaring in my wide net:

My edit history is so far fairly small.

If an experienced hand wishes to take a quick scan of my edit history to see if I'm "doing it right" any guidance would be appreciated. MaxEnt (talk) 19:28, 11 March 2021 (UTC)

@MaxEnt: generally items should have either an instance of (P31) or a subclass of (P279). Maybe subclass of morality (Q48324). Personally that's not how I would use part of (P361) (more an aspect of kinda thing imo). Welcome! BrokenSegue (talk) 20:26, 11 March 2021 (UTC)
Are we sure it's not a duplicate of sexual ethics (Q967177)? Ghouston (talk) 21:41, 11 March 2021 (UTC)
I am not sure. BrokenSegue (talk) 23:52, 11 March 2021 (UTC)
It looks to me like a duplicate. The distinction the items draw is that sexual ethics is an academic discipline that studies sexual morality, but I don't really buy that. {{u|Sdkb}}talk 03:38, 12 March 2021 (UTC)
To me this looks fine. I don't think that it is a duplicate: The academic discipline studying sexual morality in a descriptive or normative sense is different from the explicit and implicit moral rules of a society or culture at a given point of time. - Valentina.Anitnelav (talk) 12:00, 15 March 2021 (UTC)
I agree with BrokenSegue's statement above that it should be a subclass of morality (Q48324) and added this. - Valentina.Anitnelav (talk) 12:27, 15 March 2021 (UTC)

Transliteration properties

The implied schema for transliteration properties seems to be broken. But it is also so confusing and undocumented that I can’t really figure out how it is supposed to work. For example:

transliteration or transcription (P2440) “conversion of text to alternate script (use as a qualifier for monolingual text statements; please use specific property if possible).” This property has examples, but they are very simple. It seems to work in items that are names, where each transliteration is a statement that can have its own (required) qualifier determination method (P459) associated. For example, Natalka (Q81330018) has a single native label, so transliteration statements each with their own determination method apply to it.

And transliteration can't be a top-level statement when an item has more than one thing to transliterate. It has to be a qualifier on an edition item with title and subtitle, or a geographical-entity item with multiple official names, traditional names, nicknames, demonyms, etc.

When used as a qualifier, transliteration is unable to represent the necessary relationships. One of the property examples is Moscow (Q649)/native label → Москва/transliteration → Moskva, determination method → ALA-LC romanization. This example happens to use a restricted set of Cyrillic letters that are transliterated the same by every system. Fine, because there is only a single transliteration in Latin script entered, and no transliterations into other scripts have been entered, yet (but in addition to romanization, there can be Sinizations, Hebraizations, Japanifications, etc., to be entered).

But scroll down in the city’s item a bit to the statement Demonym. The Russian word for Muscovite is москвич, romanized moskvich according to the ALA-LC and BGN/PCGN systems, but moskvič per scientific transliteration. There is no way to represent this with the given framework and properties because all transliterations are grouped together, and all determination methods lumped together separately, each displayed in order of entry. The result fails to associate the different transliterations with their (required) determination methods; the information captured is missing the structure of what went in.

Well, at least I can partially resolve this by “using the specific property if possible,” and noting that transliteration or transcription (P2440) → See also includes the more-specific property ALA-LC romanization (P8991). Tried it, but this violates both allowed entity types and property scope constraints. At this point I can’t really tell what is the right way to fix this.

(I presume phonetic/phonemic transcription may have many of the same issues, but I haven’t even looked at that.)

So, does anyone have some insight in how this should work? Am I missing something important? Or is this a problem to be fixed, or replaced with lexemes, or something else? —Michael Z. 17:09, 12 March 2021 (UTC)

Programmatic transliteration

By the way, many transliteration systems are deterministic, and could be implemented programmatically (en.Wiktionary has done this). Is there any way Wikidata can automatically generate transliterations from text? (Some systems are language-specific, and some are multilingual. They range from dead-simple character replacement, to more complex rules based on word-initial, medial, or word-final position, preceding or following letters, preceding or following vowels or consonants, part of speech, and so on.) —Michael Z. 17:09, 12 March 2021 (UTC)

@Amire80:--Ymblanter (talk) 17:19, 12 March 2021 (UTC)
I'm not sure how programmatic transliteration fits here. Sure, a lot of it can be auto-created, but I'd be unenthusiastic about getting it created by bots, for example. If it can be reliably created by a piece of software, it probably shouldn't be published in Wikidata as a value, but created on the fly in other tools. --Amir E. Aharoni {{🌎🌍🌏}} talk 09:43, 14 March 2021 (UTC)
But it is being entered in Wikidata, unreliably, potentially millions of times (every single personal and place name can be transliterated into dozens of writing systems by multiple transliteration systems). You seem to be suggesting that all of the transliteration values be removed from Wikidata because they belong to the domain of other applications? I doubt that will happen, although it is worth thinking about. And according to the principle of editors will use the framework to do everything it is capable of, many more transliterations will keep appearing here than we are currently imagining.
We can improve the way it is entered. There’s just a massive opportunity to replace millions of person-hours of error-prone work. There are at least some precedents for programmatically generated information here, e.g., URL links generated by applying regular-expression substitutions to identifier strings.
So let me suggest another conceptual approach, or at least a good reason to consider automation.
Since,
then it makes sense that something like either:
etcetera. And therefore it follows, and can be programmatically determined that, for any monolingual text, or text in a known writing system:
  1. Л (Q172000) Unicode character (P487): “Л”→ (qualifier ALA-LC romanization (P8991): “L”), and
  2. Larysa (Q86430977) native label (P1705): “uk:Лариса”→ (qualifier ALA-LC romanization (P8991): “Larysa”)
This database is a framework for implementing transliteration tables. If an item can have an identifier ID that presents an automatically derived URL link, why can’t a native name or other monolingual text statement have a transliteration qualifier that displays the automatically derived transliteration? —Michael Z. 16:20, 14 March 2021 (UTC)
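To illustrate the "dead-simple character replacement" end of the spectrum, a table-driven sketch (the table covers only the letters of the two examples above and is not the full ALA-LC standard, which also needs positional rules and diacritics):

# Sketch: table-driven romanization of Ukrainian Cyrillic. The table covers
# only the letters needed for the two examples above; a real ALA-LC
# implementation would need the full table plus context-dependent rules.
ALA_LC = {
    "Л": "L", "л": "l", "а": "a", "р": "r", "и": "y", "с": "s",
}

def romanize(text):
    return "".join(ALA_LC.get(ch, ch) for ch in text)

print(romanize("Л"))        # -> L
print(romanize("Лариса"))   # -> Larysa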

Python / PAWS

 

Hi there! I want to update the graphic to the right. For this I installed Pywikibot in PAWS.

To get the number of pages I want to run this script:

import pywikibot
wikidata = pywikibot.Site('wikidata', 'wikidata')
#Page number one
page = pywikibot.Page(wikidata, 'q%i' % 1)
print(str(page.getVersionHistory(reverse=True, total=1)[0][1]) + '\t' + str(1))
#One page after 100000 other pages
for i in range(0, 80000000, 100000):
    page = pywikibot.Page(wikidata, 'q%i' % i)
    if page.exists():
        print(str(page.getVersionHistory(reverse=True, total=1)[0][1]) + '\t' + str(i))

and get the Error: AttributeError: 'Page' object has no attribute 'getVersionHistory'. One year ago it was working well. Has anyone an idea what I can do to run this thing? Bigbossfarin (talk) 22:58, 13 March 2021 (UTC)

@Bigbossfarin: It appears str(list(page.revisions())[-1]['timestamp']) gets what you want; substituting the existing str(page...[0][1]) fragment with what I provided should give you something useful. Mahir256 (talk) 23:27, 13 March 2021 (UTC)
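Dropping that replacement into the original loop would look roughly like this (the dict-style access to the Revision object is taken from the snippet above; recent Pywikibot versions should also accept rev.timestamp):

# Sketch of the loop with getVersionHistory() replaced by revisions()
import pywikibot

wikidata = pywikibot.Site('wikidata', 'wikidata')
for i in range(0, 80000000, 100000):
    page = pywikibot.Page(wikidata, 'q%i' % i)
    if page.exists():
        # oldest revision = page creation
        first_rev = next(page.revisions(reverse=True, total=1))
        print(str(first_rev['timestamp']) + '\t' + str(i))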
This seems to be an inefficient way to retrieve that information, since a simple database query would also do the job in like two minutes or so. See https://public.paws.wmcloud.org/User:MisterSynergy/tmp/QID%20creation%20timeline.ipynb for example. —MisterSynergy (talk) 23:57, 13 March 2021 (UTC)

30 Lexic-o-days starting today

Hello all,

As a reminder, the month dedicated to Lexicographical Data is starting today! On this page, you can find plenty of sessions and discussions taking place in the next few weeks. There will be for example live editing and querying sessions, a presentation of Lingua Libre, a Q&A about Abstract Wikipedia, and plenty of open discussions. There's also a Phabricator board where we will track tasks, for example on improving the documentation.

The opening session will take place today at 15:00 GMT/UTC on Jitsi (the first part will be recorded).

We're looking forward to your participation! Lea Lacroix (WMDE) (talk) 09:48, 15 March 2021 (UTC)

Merging

Hi, can we merge these three links? They are actually the same: Category:Ministers of Foreign Affairs of Azerbaijan (Q105962070), Category:Ministers of Foreign Affairs of Azerbaijan (Q105962326) and Category:Ministers of Foreign Affairs of Azerbaijan (Q8634037). And also these two: Category:Education ministers of Azerbaijan (Q10033115) and Category:Education ministers of Azerbaijan (Q105962051). --Dr.Wiki54 (talk) 11:29, 15 March 2021 (UTC)

Air crashes

What’s best between these two properties to tell which aircraft has been involved in an aviation accident (Q744913): would vessel (P1876) suit better than item operated (P121)? My feeling is that P1876 was mostly aimed at spacecraft, not aircraft... --Bouzinac💬✒️💛 21:11, 10 March 2021 (UTC)

Most uses are item operated (P121). Concur that vessel (P1876) started from the narrow perspective of spacecraft and see that the proposers did not appear to consider the general case; its scope has been widened to the general case per its talkpage, label change and aliases - and rightly so. For me vessel (P1876) is a more apt and obvious property than item operated (P121).
Remarkably (presuming no error in my SPARQL) there are 31 main property pointers from air accident items to aeroplane items. 2,347 air accident items have no main property for the aircraft type involved. --Tagishsimon (talk) 23:57, 10 March 2021 (UTC)
Do you mean vessel (P1876) is OK for the plane model (Lockheed, Airbus A380, Boeing 747, etc) and that the labels of vessel (P1876) are somewhat false (some languages sticking only to spacecraft, and other languages widening to general craft)? Pinging here the creators of that property @Adert, Joshbaumgartner, Ivan A. Krestinin, Pasleim:.--Bouzinac💬✒️💛 07:39, 11 March 2021 (UTC)
Yes. IMO vessel (P1876) is OK for the plane model. --Tagishsimon (talk) 07:44, 11 March 2021 (UTC)
In my mind "vehicle" is the individual aircraft, "item operated" is the type. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:18, 11 March 2021 (UTC)
That's a reasonable point. I did think through the same instance versus type issue w.r.t. vessel (P1876) on the accident item but concluded that the context makes it clear, not least in the absence of items for instances of most aeroplanes. Appreciate others may disagree with that take. --Tagishsimon (talk) 10:27, 11 March 2021 (UTC)
Couple of more points: 1) not sure why you find 'item operated' more type-y than vehicle, not least since it is not labelled 'type operated'; and 2) if we lack a property which properly distinguishes between item and type, then use of the qualifier object has role (P3831) <aircraft type> might help? --Tagishsimon (talk) 10:36, 11 March 2021 (UTC)
1) Because all of the examples on the property definition page are types, not individual vehicles; and otherwise, why do we need a "vehicle" property? 2) If we have an item for an individual vehicle, then its type should be declared there; no need to repeat it. As to your preceding point, we may not have items for instances of most aeroplanes, but should create them as needed for describing accidents and other significant events. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:12, 11 March 2021 (UTC)

In my opinion, we should never use item operated (P121) as such on accident items. It is not the accident that operates the aircraft. Thierry Caro (talk) 10:30, 11 March 2021 (UTC)

Are such items generally about the accident, or about the flight which had the accident? We seem to use labels and P31 ambiguously (based on a small sample). Wikipedia content does likewise ("Pan Am Flight 6 [...] was an around-the-world airline flight that ditched...") Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:12, 11 March 2021 (UTC)
Excellent point. Pan Am flight 6 was a recurring event, of which only the last instance was famous because it ended in a crash. Clearly, the article and the WD item are really about the crash (the article may have some supporting text about the flight in general), and if necessary, supporting items for the recurring flight, or other related events, can be created. I do agree that a crash (or indeed any event) does not 'operate' the equipment, and hence why vessel (P1876) is the better option since its inception (far more recent than item operated (P121), which was probably used on several crash items in earlier days). Josh Baumgartner (talk) 01:33, 16 March 2021 (UTC)

Wikidata weekly summary #459

Deleting my userpage

Could someone please delete my userpage so that my global userpage can show instead? I prefer it to what I have now. Chicdat (talk) 12:01, 20 March 2021 (UTC)

You can use a redirect. Haideronwiki (talk) 17:09, 19 March 2021 (UTC)
@Chicdat: Done. - Nikki (talk) 07:07, 21 March 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 12:47, 21 March 2021 (UTC)

How to indicate a typo in a source

Hello. Typos appear even in reliable sources. Is there a way to use the reference but indicate that a typo has been corrected - that is, that it is intentional that the value does not appear exactly as given in the source? If a source makes a completely wrong claim, then we would simply deprecate the statement. And then we just wait for a source with the correct statement. However, in a few cases of a typo, like a missing minus-sign, I've fixed the value and stated on the item's talk page which correction I did (latest example: the Richardson constant).

We have the qualifier sourcing circumstances (P1480) which goes into the right direction: Could we use that or a similar one on the source to indicate the correction of a typo?

Best wishes. Toni 001 (talk) 07:57, 13 March 2021 (UTC)

@Toni 001: object named as (P1932) is available, or object stated in reference as (P5997) if you want to be specific that the form appeared in that specific reference, not necessarily elsewhere. Also subject named as (P1810) if the entity in question is the subject-value of the item, as opposed to the object-value of the statement. Jheald (talk) 17:33, 13 March 2021 (UTC)
Thanks @Jheald: For use with defining formula (P2534), the reference qualifier (or whatever the terminology might be) object stated in reference as (P5997) is interesting: Sometimes different sources give formulas in different but equivalent forms while in Wikidata we might want to list only one. Toni 001 (talk) 04:55, 16 March 2021 (UTC)
  • I put in the incorrect value, add the source, deprecate it, then there is a list of "reasons for deprecation" such as typographical error. I have to do it all the time for birth and death dates, so many people type 1987 instead of 1787, or something similar. Some people switch birth and death dates. --RAN (talk) 02:13, 14 March 2021 (UTC)
Thanks @Richard Arthur Norton (1958- ): Deprecation indeed seems to be the best approach. Toni 001 (talk) 04:55, 16 March 2021 (UTC)

What to do with items that are a mishmash and mirror VIAF's errors?

Thomas Hutchinson (Q63343166) is a problematic item, and the underlying VIAF data is a mishmash of various authors, and I'm not sure where to start unpicking and reverting, or even how to flag the problem to VIAF. The stuff for the literary scholar/critic in the 1910s belongs to Thomas Hutchinson (Q28735465); I'm uncertain about the 1890s. Anyway, there is a range of works for a vast range of years, and there is at least one date that says author Tom Hutchinson b. 1948 in the VIAF and our records. It is not right.  — billinghurst sDrewth

Make Thomas Hutchinson (Q63343166) a conflation (Q14946528) item, perhaps. Ghouston (talk) 22:08, 14 March 2021 (UTC)
Done (probably inexpertly). I am interested in our combination of a concept and a WM reason into the same item. I would have thought that we would have our deprecation reasons / flags as separate items.  — billinghurst sDrewth 23:41, 14 March 2021 (UTC)
We have Wikidata:VIAF/cluster/conflating entities where these things are tracked. I've no idea what happens to items listed there, but it's a start. --EncycloPetey (talk) 23:45, 14 March 2021 (UTC)
@billinghurst: That’s a tough one. First, we have to sort out what the entries originally meant:
So from my somewhat limited perspective, the two might very well be the same person, now conflated with other persons through VIAF. You might want to ask the university library of Leeds for more information. Once the identity of the persons (and their bibliography) is clear, we can work on VIAF – which is really hard. --Emu (talk) 23:47, 14 March 2021 (UTC)
Thanks emu. I know that it is a tough one, and it is hard to tell whether we conflated the issues first, or VIAF did. I have spent about six hours picking through bits of genealogical data trying to work it out. Made worse as Thomas Hutchinson (Q28735465), the scholar and expert on Wordsworth, may or may not be related to Wordsworth's nephew (through his wife), a Thomas Hutchinson who died in 1903.

I believe that the Hutchinson (Q28735465) is the author of the works on Shelley, as it appears that it is the field of work started under Edward Dowden (Q970472) at the University of Dublin. I just cannot find Hutchinson (Q28735465) at UoD or reliable records for him in Ireland. [All done sorting works for enWS, rather than a specific interest in the author themself].  — billinghurst sDrewth 00:12, 15 March 2021 (UTC)

@billinghurst: I see, sounds like a lot of fun :-) Well, VIAF is conflated anyway as (at least) SUDOC/IDref mixes the scholar (or indeed, one or two of them, as the case may be) and Thomas Hutchinson (Q1892842). It is possible to have SUDOC/IDref correct their entries, but there’s probably even more trouble somewhere. --Emu (talk) 00:22, 15 March 2021 (UTC)
Fun = keyboard headbutting. Aaaaaaand what I find really amazing is that an author who wrote and had published two definitive works, one on Wordsworth, and one on Shelley, is basically not known or noted one hundred years later, nor particularly at the time.  — billinghurst sDrewth 00:26, 15 March 2021 (UTC)

Okay, I have sorted out some of these works and done a summation at Wikidata:VIAF/cluster/conflating entities#Thomas Hutchinsons for someone who knows how to unbreak or alert to onward notify.  — billinghurst sDrewth 04:40, 15 March 2021 (UTC)

And sometimes it gets worse: we have Thomas Hutchinson (Q52222331), which by the dates is the brother-in-law of William Wordsworth, and he definitely wasn't the author of the 20th-century analyses. And I have no idea what to do with that one beyond saying CORRUPTED by bot edits.  — billinghurst sDrewth 05:00, 15 March 2021 (UTC)

Q25620949

Hr̥dayacandasiṃha Pradhāna (Q25620949) The Nepali Wikipedia entry uses three different birth and death dates, so is it a fake entry, or just someone that does not know what they are doing? --RAN (talk) 23:14, 14 March 2021 (UTC)

Feel free to go to the article, and then fix the item. Done on this occasion, it was reasonably simple fix.  — billinghurst sDrewth 23:46, 14 March 2021 (UTC)
How are you determining which of the multiple values is correct? We have (1916-1960) at age 44 and (1972-2016) at age 44, using two different dating systems, the Gregorian calendar and the Nepali calendar. I see the problem: (BS 1972 - BS 2016) is in the Nepali calendar, which we do not support. The Nepali article uses a mix of the two dating systems without any consistency, hence the confusion. Wikidata imported both values as Gregorian. --RAN (talk) 23:55, 14 March 2021 (UTC)

There are no properties for movie ratings, why?

We have properties like IMDb ID (P345), Rotten Tomatoes ID (P1258) and Metacritic ID (P1712). I think it would be useful to have properties like "IMDb rating", "Tomatometer" or "Metascore". I understand that those numbers can change over time, but that is also true for things like population. Can I open a proposal, or has it been discussed already? --ԱշոտՏՆՂ (talk) 22:16, 15 March 2021 (UTC)

I wasn't able to find previous discussions so I have opened a new one. --ԱշոտՏՆՂ (talk) 08:35, 16 March 2021 (UTC)

Influenzavirus A

Q17139796 and Q834390 seem to be the same Influenza A. Can they be joined? Smiley.toerist (talk) 21:09, 14 March 2021 (UTC)

looks the same to me but I'm not a subject matter expert. BrokenSegue (talk) 21:46, 14 March 2021 (UTC)
They have different taxon rank (P105). Ghouston (talk) 22:03, 14 March 2021 (UTC)
I modified the labels to avoid confusion. There is also Q51916564, 'alfa' = 'A'. Smiley.toerist (talk) 11:08, 15 March 2021 (UTC)
I doubt Help:Label/nl supports this. The description should make the difference. Bear in mind that Wikidata item labels are different from Wikipedia article titles. --- Jura 11:13, 15 March 2021 (UTC)

In 2017 the genus Influenzavirus A (Q17139796) was renamed to Alphainfluenzavirus (Q51916564). I moved the sitelinks. --Succu (talk) 19:46, 16 March 2021 (UTC)

Unusual edits

Does anyone know what these edits and these edits are supposed to be for? Some of them have shown up in my watchlist, and a lot of the changes seem to be arbitrary or destructive. Overcast07 (talk) 11:43, 8 March 2021 (UTC)

Mostly destructive. Bug in mobile editing? --SCIdude (talk) 14:58, 8 March 2021 (UTC)
Does anybody know what the difference between the tags Mobile edit and Mobile web edit is? ChristianKl18:28, 8 March 2021 (UTC)
iOS Wikipedia app edits (to mainspace pages) get tagged as “mobile edit, mobile app edit, iOS app edit”. For Wikidata though, I imagine m.wikidata.org is the only mobile client (?) so that the two tags “mobile edit, mobile web edit” would usually occur together. (I.e., “mobile edit” is a superclass of “mobile web edit” + “mobile app edit”.) Pelagic (talk) 02:28, 9 March 2021 (UTC) Add: app edits to non-English short descriptions do get written back to wikidata; I haven’t tested to confirm how they are tagged. Pelagic (talk) 02:33, 9 March 2021 (UTC)
Mmm, maybe bug. I can’t see a way to edit other-language labels in the mobile UI, so I don’t know how the user(s) could be doing it manually. Unless mobile web edit is mis-tagged. Here’s another one: 125.162.168.168 Pelagic (talk) 03:00, 9 March 2021 (UTC)
So should we add a filter for these? --- Jura 08:17, 9 March 2021 (UTC)
What should trigger the filter? Lymantria (talk) 17:09, 10 March 2021 (UTC)
@Lymantria: If there is no valid way to do mobile edits in multiple languages, trigger would be the tags for mobile edits and an automatic edit summary that indicates changes in several languages. @Pelagic, SCIdude, Overcast07: what do you think? --- Jura 14:44, 16 March 2021 (UTC)
That would not be the easiest way to program, AFAIK edit filter does not see tags. Lymantria (talk) 15:49, 16 March 2021 (UTC)
@Lymantria: Looking at Special:AbuseLog/16624165 it has a few other elements that could be used. --- Jura 16:34, 16 March 2021 (UTC)
I could replicate an edit like the above using the mobile interface Special:Diff/1383404926. What is the filter you suggest supposed to prevent? Lymantria (talk) 18:19, 16 March 2021 (UTC)
So there actually is a valid way to do it. Maybe we should just limit IPs to a few label changes at once (e.g. 3). --- Jura 09:34, 17 March 2021 (UTC)

New WDQS Streaming Updater now available on pre-production test server for feedback

Hello all,

It has been exciting to see the increased adoption and usage of the Wikidata Query Service (WDQS) in the past year. To support this growing demand, on 15 March 2021 the Search Platform team released a new Streaming Updater to a test server (https://query-preview.wikidata.org) for feedback before going to production on 15 April 2021 (pending any major blockers discovered during testing). Once in production, WDQS will become less of a bottleneck for Wikidata updates, and we’re looking forward to better facilitating Wikidata’s continued growth as a more complete knowledge graph. Your relevant feedback on the following changes is important to us to ensure we continue to best support your needs while scaling up the service in production:

  1. New Streaming Updater: [1]
    • This improvement to the Updater will allow WDQS to better handle the volume of edits to Wikidata, improving data consistency and decreasing update latency: while the existing Updater fluctuates between 5-15 updates/sec (averaging 10 updates/sec), the new Updater will be able to handle a throughput of 40-130 updates/sec (88 updates/sec on average). Without these performance improvements, edits to Wikidata were being throttled, approaching the point where they could become impossible. With the new Updater, edits to Wikidata will on the whole be more consistent and have less lag, reducing the WDQS bottleneck to improving Wikidata content.
    • We don’t anticipate this to adversely affect workflows or usage, but it is a big update, and we would like you to let us know if you find any related bugs or problems so that we can properly address them.
  2. Blank node skolemization: [2]
    • To reliably use the new Streaming Updater to minimize the throttling of edits to Wikidata, skolemization of blank nodes was required, as detailed in the phabricator ticket. For more detail on why this was necessary, you can also refer to another attempt to design a “diff” format for RDF, where the solution suggested to handle blank nodes is also skolemization. We understand that this solution may unfortunately introduce breaking changes to your usage of WDQS, RDF dumps, and Special:EntityData; however, given the severe risk of the edits to Wikidata becoming impossible, we felt this was the best course of action to take in the timeframe we had. We acknowledge that this approach has its shortcomings, and invite you to provide us with feedback on how we can improve future usage of Wikidata and WDQS while maintaining their scalability and reliability.
    • From a user perspective of this change, (1) queries using isBlank() will need to be rewritten; (2) queries using isIRI/isURI will need to be verified; (3) WDQS results will no longer include blank nodes. If these changes affect your workflows, and/or you need to know how to modify your workflows to account for the blank node skolemization, please let us know what your specific use case is.
    • For more detail on how to modify your workflows, including examples, please refer to the following page: https://www.mediawiki.org/wiki/Wikidata_Query_Service/Blank_Node_Skolemization (a short query sketch illustrating the rewrite also follows this announcement below)
  3. Constraint fetching [3]
    • Constraints are a Wikibase concept that allows entities to be validated based on definable properties, e.g. all astronauts must be human. Ideally, constraint fetching would be used to ensure data quality for Wikidata edits. The reality is that the current implementation of constraint fetching is not meeting our production quality standards and was generating detrimental noise in our logs.
    • As a result of the sub-par implementation, and the fact that the new Flink-based Streaming Updater doesn’t support it, current constraint fetching functionality will be disabled with the new Updater release, until we can expose constraint violations in a more production-ready way [4][5][6]. We recognize that even functionality that doesn’t meet our production quality standards is still potentially useful for some, and we would like to hear your feedback if you are adversely affected by this change.

We’re looking forward to these new changes improving WDQS, and your relevant feedback on these updates will help us make sure we can continue to support your needs. If you have any questions, issues or suggestions, feel free to reach out to us on the WDQS contact page. MPham (WMF) (talk) 13:52, 15 March 2021 (UTC)
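A minimal sketch of the isBlank() rewrite mentioned under point 2 above, in Python against the public SPARQL endpoint (the example class/property and the limit are my own choices, not part of the announcement):

import requests

# Production endpoint; point this at https://query-preview.wikidata.org instead to
# exercise the skolemized test server (path assumed to mirror /sparql).
WDQS = "https://query.wikidata.org/sparql"

# Humans (Q5) whose date of death (P570) is an "unknown value". Such values used
# to surface as blank nodes and were matched with FILTER(isBlank(?dod)); the
# documented replacement on the skolemized service is wikibase:isSomeValue().
query = """
SELECT ?item ?dod WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P570 ?dod .
  FILTER(wikibase:isSomeValue(?dod))
} LIMIT 10
"""

reply = requests.get(WDQS, params={"query": query, "format": "json"}, timeout=60)
for row in reply.json()["results"]["bindings"]:
    print(row["item"]["value"], row["dod"]["value"])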


To summarize "Constraint fetching [3]": "wikibase:hasViolationForConstraint " triples that never quite worked will be dropped. --- Jura 09:57, 17 March 2021 (UTC)

How fast "ought" it to be possible to edit Wikidata ?

At the moment I'm sitting waiting for a batch of 4500 edits to go through QuickStatements. I don't regard that number of edits as very many -- more like a routine medium-sized batch job.

I'm using QuickStatements in direct (unbatched) mode, the query servers are having a reasonably good day (grafana dashboard / dashboard 2), and I'm getting about 15 edits a minute. This is the system when it's running well.

My 'routine medium-sized job' is going to take about 5 hours to complete. If I was adding qualifiers or references it would be double that. It feels glacial.

This is before one even starts to think about jobs involving 100,000 items -- which are not so unusual.

I appreciate all the work done by everybody to keep all the services up and running as well as they do. But question: if this is the current 'normal', is it okay? Or merely 'tolerable'? What kind of throughput level would make for a 'good' contributor experience? And, if we can succeed in making the data here as robust and systematic and reliable as we would like, so wikidata's reputation improves, attracting new external users, what happens then? If the time for medium-sized batches of edits is barely tolerable now, what would happen if we reached a level where suddenly lots more people were willing to see wikidata as a respectable place to put data associated with them? Jheald (talk) 11:12, 12 March 2021 (UTC)

I think a job of 4500 items is quite small. And I also think that the time that these jobs take is much much much too long. If I want to change let's say 20000 items, this should in my opinion be done in half an hour. Instead, it takes around a day or so. Which is actually the main reason why I most of the time refrain from importing that much data. In the past I ran imports that took one month, although the data was ready, it was just because of the slow and unstable import (MaxLag issues etc.). Steak (talk) 11:53, 12 March 2021 (UTC)
It "ought" to be possible to make 90 edits per minute. This is the hard line for all non-admins and it's fixed at that value for technical reasons. Your problems rather come from usage of QuickStatements which has several shortcomings, for example you cannot add multiple claims on one item using a single edit. QS is no longer maintained, from all what I know so don't expect it to change. Your solution is to write your own bot, which I personally find very easy using wikibase-cli. --SCIdude (talk) 07:39, 13 March 2021 (UTC)
@SCIdude: 90 edits a minute would be a much happier rate. But I am not sure the infrastructure can sustain that rate from a lot of people. My understanding is that QS could edit faster, but has been throttled to about 15 edits per minute per user, to make sure the system stays responsive. Are individuals still being allowed to run bots at 90 edits a minute? I'm not sure. Also, does wikibase-cli automatically respect maxlag, or is that something that you have to be aware of and handle manually?
But your point about being able to add multiple claims in a single edit if one doesn't use QS is a fair one. Jheald (talk) 17:28, 13 March 2021 (UTC)
@Jheald: wikibase-cli maxlag default is 5 and configurable. I usually get 30-40/min with multiple claims. --SCIdude (talk) 17:45, 13 March 2021 (UTC)
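For anyone rolling their own script instead, the two ideas discussed here (respecting maxlag and packing several statements into one edit) can be sketched directly against the MediaWiki API. This is only an illustration under stated assumptions: it presumes an already-authenticated session with a valid CSRF token, and the back-off policy is my own choice, not what QuickStatements or wikibase-cli actually does.

import json
import time
import requests

API = "https://www.wikidata.org/w/api.php"
session = requests.Session()
csrf_token = "..."  # placeholder: obtain via action=query&meta=tokens after logging in

def edit_with_maxlag(qid, data, maxlag=5, retries=5):
    """One wbeditentity call carrying several statements at once,
    backing off when the servers report replication lag."""
    for attempt in range(retries):
        reply = session.post(API, data={
            "action": "wbeditentity",
            "id": qid,
            "data": json.dumps(data),
            "token": csrf_token,
            "maxlag": maxlag,         # the API rejects the edit while lag exceeds this
            "format": "json",
            "summary": "batch edit: several statements in one call",
        }).json()
        if reply.get("error", {}).get("code") == "maxlag":
            time.sleep(2 ** attempt)  # wait, then retry
            continue
        return reply
    raise RuntimeError("gave up: servers lagged for too long")

# Two claims added in a single edit (instance of = human, sex or gender = male):
claims = {"claims": [
    {"type": "statement", "rank": "normal",
     "mainsnak": {"snaktype": "value", "property": "P31",
                  "datavalue": {"type": "wikibase-entityid",
                                "value": {"entity-type": "item", "id": "Q5"}}}},
    {"type": "statement", "rank": "normal",
     "mainsnak": {"snaktype": "value", "property": "P21",
                  "datavalue": {"type": "wikibase-entityid",
                                "value": {"entity-type": "item", "id": "Q6581097"}}}},
]}
# edit_with_maxlag("Q4115189", claims)  # Q4115189 is the Wikidata sandbox item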

"Your solution is to write your own bot, which I personally find very easy using wikibase-cli" isn't so much a sure-fire cure as it is a symptom of the paucity of support for editing wikidata. It's nonsense on stilts. --Tagishsimon (talk) 17:51, 13 March 2021 (UTC)

"Personally" is the key in the above sentence, the definition of "bot" may vary to your liking. --SCIdude (talk) 07:24, 14 March 2021 (UTC)

After adding the new property Open Tree of Life ID (P9157) for a month now, I guess I got a ¼ (Q13048260) of them (half a million). Yes, it's frustrating to do this in a lame way. --Succu (talk) 20:22, 13 March 2021 (UTC)

Second round of Wikidata updates on the mass data imported a while ago/ Wikidata history query service

Hi, I am looking for the most efficient and reliable way of editing the data that the organization I work for imported two years ago. I want to reference some of that data, but in order to do so, I need to make sure the data was not edited by someone else during the last two days. I first tried to do this using SPARQL queries, but I can only achieve time precision at the level of items, not at the level of properties and their statements. I was wondering if anyone has a solution for me and my project. Thank you very much.

Fatemeh 94.142.212.90 12:28, 18 March 2021 (UTC)

I think edit timestamp at the item level is the best & only option through WDQS. Unsure in what way that does not evidence an edit within the last 2 days - if a statement in the item has been edited, then the item has been edited? --Tagishsimon (talk) 12:36, 18 March 2021 (UTC)
See the answer to my question at https://phabricator.wikimedia.org/T270003 --SCIdude (talk) 17:05, 18 March 2021 (UTC)
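To illustrate the item-level check mentioned above (a sketch only; the endpoint, the example Q-ids and the two-day window are placeholders): schema:dateModified in WDQS records when an item was last edited, so anything from the original batch that was touched in the last two days can be set aside before referencing.

import datetime
import requests

WDQS = "https://query.wikidata.org/sparql"

# Items from a (hypothetical) import batch; in practice this would come from
# your own upload logs or a catalog query.
batch = ["Q42", "Q64", "Q1"]

cutoff = (datetime.datetime.utcnow() - datetime.timedelta(days=2)).strftime("%Y-%m-%dT%H:%M:%SZ")
values = " ".join(f"wd:{q}" for q in batch)
query = f"""
SELECT ?item ?modified WHERE {{
  VALUES ?item {{ {values} }}
  ?item schema:dateModified ?modified .
  FILTER(?modified > "{cutoff}"^^xsd:dateTime)
}}
"""

reply = requests.get(WDQS, params={"query": query, "format": "json"}, timeout=60)
recently_edited = {row["item"]["value"].rsplit("/", 1)[-1]
                   for row in reply.json()["results"]["bindings"]}
print("edited in the last 2 days:", recently_edited)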

Signed Statements

Edit: In order to provide more context and consolidate discussion on this topic I put together the following RfC: Wikidata:Requests_for_comment/Signed_Statements_(T138708) ElanHR (talk) 06:50, 16 March 2021 (UTC)


For anyone interested, I put together a brief proposal (https://phabricator.wikimedia.org/T138708#6911561) and some demo schema on wikidata-test to support signed statements and am interested in any thoughts/feedback people have. :)

I'm primarily interested in the use case of a batch data donation by an authoritative source (e.g. GLAM institutions) and enabling them to sign uploaded statements.

Cheers, ElanHR (talk) 02:12, 15 March 2021 (UTC)

  • So to check if one actually has the signed version of the statement one would need to retrieve a page in an item's revision history? --- Jura 11:34, 15 March 2021 (UTC)
    • Thanks for asking; how verification is done was certainly not clear. This would enable two levels of verification to make sure a claim is still supported by a given reference (a toy sketch of the hashing step is included after this comment):
  1. Simply checking that the claim matches the provided hash (e.g. the object IDs and qualifiers have not been altered). This verification would not require looking at revision history.
  2. Checking that the identity of subject/objects have not been altered since the claim was made. This would involve comparing the current revisions to the marked revision and determining the identity has not changed.
For example, let's say a bad actor wanted to change Wikidata to state "Barack Obama was born in Kenya" instead of Barack Obama (Q76)place of birth (P19)Kapiolani Medical Center for Women and Children (Q6366688) which is currently supported by a reference to a birth certificate. There are a few ways they could do this:
  1. Simply replace the object ID with Kenya (Q114). This would immediately be flagged by f(claim, reference) no longer matching the provided hash.
  2. Change the identity of Kapiolani Medical Center for Women and Children (Q6366688) to match Barack Obama (Q76). In this case the hash would still match but could be detected by checking if the identity of Kapiolani Medical Center for Women and Children (Q6366688) still matches the identity when the reference was made.
  3. Change the identity of Barack Obama (Q76) to something else. Same as case #2
Per @BrokenSegue:'s comment I will put together an RfC on this and make sure to distinguish these cases. ElanHR (talk) 20:46, 15 March 2021 (UTC)
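As mentioned above, here is a toy sketch of what f(claim, reference) could look like. This is not the schema actually used on test.wikidata, just an illustration of the idea: serialize the endorsed parts of a claim in a canonical order, hash them, and later recompute the hash to see whether those parts still match.

import hashlib
import json

def claim_fingerprint(subject, prop, value, qualifiers=None):
    """Canonical serialization + SHA-256 of the endorsed parts of a claim.
    Which parts are included, and in what order, is exactly what a real
    scheme would have to standardize."""
    payload = {
        "subject": subject,
        "property": prop,
        "value": value,
        "qualifiers": sorted((qualifiers or {}).items()),
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The hash stored in the reference when the claim was endorsed:
endorsed = claim_fingerprint("Q76", "P19", "Q6366688")

# Later, recompute from the live claim; changing the object to Q114 (Kenya)
# immediately stops matching the stored hash:
tampered = claim_fingerprint("Q76", "P19", "Q114")
print(endorsed == tampered)  # False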
  • Is there an RfC or document outlining the value of such a proposal? I see the top comment for the ticket, but it doesn't explain how they imagine it working in practice. Is there demand for this? Do we plan to lock down signed statements? How will we create a chain of trust? Etc. BrokenSegue (talk) 13:36, 15 March 2021 (UTC)
    • Not that I've seen but I agree that would be a better place to hold a discussion on this. I will try to flesh this out some and put together an RfC either today or tomorrow!
To somewhat address the "chain of trust" question, I see a number of benefits of this, all supporting the larger goal of improving Data Quality and Trust (https://phabricator.wikimedia.org/T76230):
  1. It enables institutions donating data to ensure they will not be misattributed down the road.
  2. It shortens the chain of trust by removing the question of whether a middleman (reference) is trustworthy. Essentially this would move from the current scheme of "someone (via this reference) says authoritative source X claims Y" to "authoritative source X claims Y (and you can immediately verify it)". That said, whether or not a user trusts a particular source is still up to them!
  3. It allows for data audits providing an additional tool to counter these forms of vandalism.
These are great questions and I will definitely try to tackle them more in depth in the RfC. ElanHR (talk) 20:46, 15 March 2021 (UTC)
  • @ElanHR: One of the issues with signed statements is that there may be legitimate changes that can and should be made to a statement without invalidating its reference. For example, suppose a qualifier is added -- should that always invalidate the signature? (Perhaps the signature should indicate which qualifiers it does or does not encompass). Or suppose there is a legitimate merge of the value of the item. While that could be an attack, it's also often just housekeeping. Or suppose a statement gets deprecated with "reason for deprecation"; or preferred, with "reason for preferred rank". Would either of those affect the signature? Curious as to your thoughts on this. Jheald (talk) 21:25, 15 March 2021 (UTC)
I think in the case of making a claim more specific (e.g. adding qualifiers, updating a value to be more specific) we should be wary about attributing more to a source than it actually says. For instance, if one source says someone "was born in the 1300s" and later information suggests that they "were born in 1317", we should be wary of updating the claim without removing the reference, to avoid misattribution. In these cases I feel the proper course of action would be to leave both statements, to accurately represent each source's claim.
In the second case preferred/deprecation reasons are a case that I definitely overlooked and you're right these should not affect the signature. In order to avoid this the serialization function should be defined to ignore these housekeeping properties. Thanks for pointing this out! ElanHR (talk) 06:50, 16 March 2021 (UTC)
@ElanHR: It's not just qualifiers like reason for preferred rank (P7452) or reason for deprecated rank (P2241) though, is it? What about subject named as (P1810), object named as (P1932), applies to part (P518), nature of statement (P5102), subject has role (P2868), object has role (P3831), statement is subject of (P805), statement disputed by (P1310), statement supported by (P3680), follows (P155), followed by (P156), replaces (P1365), replaced by (P1366), relative position within image (P2677) and all sorts of other qualifiers that we may think it is legitimate to add to a statement, although they go beyond just 'housekeeping'. Jheald (talk) 08:01, 16 March 2021 (UTC)
@Jheald: I see your point. I think the easiest way to do this would be to add a separate reference with the additional information. I feel this is a bit clunky as an approach, but it provides a way of adding additional context without invalidating an endorsement. In my test item I've added an "official website" claim to illustrate this approach.
While I don't think the addition of any of the properties you listed is likely to be particularly harmful, I am wary of a final result where someone "endorses" something they did not explicitly say.
Alternatively, there is a field for "snaks-order", so it may be possible to serialize only the claim plus the reference statements made before the hash/signature property, but I think this approach would be brittle and unintuitive. ElanHR (talk) 04:46, 17 March 2021 (UTC)
@ElanHR: Particularly on a property like position held (P39) (which can already have a lot of values) I think it would be unfortunate to say that adding replaces (P1365) or follows (P155) or various other not-very-controversial qualifiers should require a new statement. Pinging @Andrew Gray:, who does a lot of work in this area. Slightly OT, it does seem to be an issue that surfaces from time to time, how to specify that a particular reference is supporting a particular sub-set of qualifiers on the statement -- see eg this in Project Chat last December (and independently somebody had asked me exactly the same thing at just the same time), or this 2019 property proposal, which fell principally because no examples were given. Jheald (talk) 10:46, 17 March 2021 (UTC)
I agree that adding a new value isn't a particularly elegant way to show which qualifiers are supported by a reference, but I'm not sure of a better way of doing it. Additionally, I think in the worst case this could contribute to ambiguity.
For example: Barack Obama (Q76)position held (P39)United States senator (Q4416090) occurs twice with different qualifiers. If I have a reference that supports the qualifier electoral district (P768) Illinois Class 3 senate seat (Q101499034) which one should it support? In this case it's valid for both but I don't think that would always be the case.
At a high level I think we might be hitting a wall with what is possible to model by only being able to point to individual references. Even for the "Supports qualifier" proposal I could imagine problem cases that are not possible to model, because a qualifier property could potentially occur multiple times for a value. E.g. for a cast member (P161) statement, what if the actor had multiple character role (P453) values? ElanHR (talk) 05:41, 19 March 2021 (UTC)
  • @ElanHR: I am also suspicious of the whole rationale here, which has always seemed to smack of "Ooooh, crypto. Shiny!". We don't actually need a cryptographic signature to say who added a statement -- we have an edit history for that, and it applies to everybody's edits, not just some hallowed few. What we actually need are better tools to look at the edit history of a single statement, whoever added it. Jheald (talk) 08:01, 16 March 2021 (UTC)
    Well I certainly can't bemoan anyone's skepticism to crypto-hype (that is essentially my life in the tech industry lol). In this case I think it might actually be an elegant solution to trusting facts that anyone can alter.
    As for the "hallowed few", the proposed schema would definitely be available for anyone who is interested in using it. The reason for highlighting data donations from GLAM institutions is this would likely results in a higher claims/endorser ratio and make it easier to showcase the schema's value.
    I totally agree that improved tools for reviewing edit history would be desirable. Unfortunately reviewing edit history to track provenance has some complexities that make signed statements appealing (keeping track of this small amount of extra data to make it computationally more efficient). I go into (perhaps too much) detail on this under [#Discussion on the RfC] ElanHR (talk) 05:06, 17 March 2021 (UTC)
  • This is actually something that comes up fairly frequently, but unfortunately hasn't been tackled by devs yet. Accordingly, I don't quite see why an RFC would be needed. Maybe it's a solution to a problem we have now when a statement that is supported by several references gets changed to something slightly or completely different. --- Jura 09:14, 16 March 2021 (UTC)
@Jura1: One thing that is different this time is User:ElanHR trying to see just how far we can get towards this in user-space, without dev input; and trying to get the community to think about what we would actually want, & what issues come up, rather than devs just presenting a finished package for the community to take or leave.
I have to say I do find the additional reference claims on https://test.wikidata.org/wiki/Q214700 a bit ugly, and they would make the regular reference claims harder to see -- but that's a UI thing that could be easily enough fixed with an appropriate UI patch, either as a gadget or in the main code.
The one thing I think ultimately would need dev intervention is that one would probably want the system to keep track of whether the statements still meet the signed hash (and to subtly modify the statement presentation if they do), rather than that calculation having to be done by every browser for every signed statement every time the page is opened. But the latter approach is okay as a userland proof-of-concept.
The real thing the community needs to think about is the social dimension -- is this a thing we think is actually worth spending any time on (& a justifiable complexity increase) ? and secondly, as I have been trying to do above, how well does it play with how we actually use references in reality?
A worked demo should certainly help us focus on those two questions, but should not be given any presumption of inevitability that it would actually go forward. Jheald (talk) 11:29, 16 March 2021 (UTC)
  • I don't really care who ultimately does the development, and if WMDE hasn't done it in 8 years, they are unlikely to do it. Personally, I think the problem I mentioned above needs to be addressed; semi- or fully protecting MediaWiki pages isn't going to solve it. --- Jura 11:35, 16 March 2021 (UTC)
  • @Jheald:/@Jura1: I totally agree that the demo endorsement scheme is uglier than I'd like - this is the minimum information necessary to make it work, and all that being exposed in userspace is a bit much. :\
I was considering assigning the phab ticket to myself because I think it'd be useful for the community but wanted to wait for a response to my proposal from the community plus the devs who are already on the ticket (that and my plate is already moderately full). While I can't say I've coded a gadget before I'm fairly confident I could eke one out given some time.
As for keeping track of violations: one of my first goals for this would be to populate a list of violations via a cron job that:
  • runs a SPARQL query for references that use this schema
  • checks for a hash/signature match
  • verifies that the identities of the subject/object have not been changed (probably at a lower frequency because it's more computationally intense).
Violations could then be reviewed manually. For cases where the change is valid the endorsement could either be moved to a deprecated claim or simply removed.
I also fleshed out some of the examples and motivations on Wikidata:Requests_for_comment/Signed_Statements_(T138708)#Discussion. ElanHR (talk) 05:30, 17 March 2021 (UTC)
  • I'm completely unconvinced by what I've read so far. I don't see how the supposed 'chain of trust' is shortened or strengthened by signed statements, as opposed to well-referenced statements. I have concerns that 'signed statement' = 'owned statement' ... institutions that submit data to Wikidata do not own the item nor the statement, and even a whiff of protecting items or statements is anathema for the "do not own the item nor the statement" reason. I'm not even clear whose interests are being served here. Refs work well enough for the uninvolved user. The institution should be well enough served by periodically checking their database holdings against WD holdings to see what's changed. Uninvolved users must be free to amend statements, not least since data originating from institutions often enough has errors within it. --Tagishsimon (talk) 06:16, 17 March 2021 (UTC)
    @Tagishsimon: I disagree wholeheartedly with the characterization that signed statements are 'owned' statements and would argue a more accurate description is 'endorsed'. This signature in no way implies protection from edits; it is just a quick way of easily recognizing whether a statement is what was said by an individual/institution. The data donation process is still the same: I have some data I would like to make freely available, I put it in a shared schema, and then put it out for anyone to use/modify/edit/remove/etc.
    I think an appropriate analogy is an open letter. I can craft a letter and put it out to the world for anyone to sign on to; however, if someone changes what the letter says, I want my signature removed until I can read it (people say some crazy stuff). That said, no one is stopping you from copying/editing it or writing your own conflicting letter.
    Per your comment "The institution should be well enough served by periodically checking their database holdings against WD": This can be an incredibly laborious/technically difficult process depending on what has changed and I think you are overestimating the technical expertise of institutions and willingness to put forward the effort necessary for this ongoing process. My impression from attending Wiki conferences is the most common batch uploaders are small GLAM institutions (think local historical society/libary) that just want to put their data into a CSV, upload it, and be done.
    Re: "whose interests are being served here": People using, producing, and curating data - see my comment in the RfC. ElanHR (talk) 06:59, 17 March 2021 (UTC)
    @Tagishsimon: Just to clarify, I don't think you're incorrect to say that "Refs work well enough for the uninvolved user" (heck, a lot of use cases don't even require references); it's just that not everyone falls into this bucket, and I want to make Wikidata as usable as possible. Vandalism exists, and when used this feature could be one step to counter it, help avoid misattribution, and help users who can't blindly trust references without verifying. ElanHR (talk) 07:10, 17 March 2021 (UTC)
    My personal opinion is that this will do little for vandalism detection. I can just add a new statement or change the label or... We would need a very high density of signed statements for someone to reasonably only consider signed statements. The answer to vandalism detection here is better ML. BrokenSegue (talk) 13:25, 17 March 2021 (UTC)
    The usage of "Revision ID of signed subject/object value" properties could help somewhat with detecting this form of vandalism (see #2+3 of #How would Signed Statements be verified?) but your point stands.
    From personal experience, a major bottleneck in developing ML solutions is that labeled positive examples of vandalism are exceedingly few compared to good examples, and labeling new examples is incredibly laborious/time-consuming and often requires domain expertise. I totally agree that this would not solve the problem of vandalism on its own, but it could help find new cases and hopefully reduce the workload of items being reviewed. As for the density question, fortunately (or unfortunately) bots automatically importing datasets are becoming the majority of edits/contributions (http://datakolektiv.org/app/WD_HumanEdits), so it may take only a handful of sources adopting this schema to have a noticeable impact.
    While I agree ML approaches will be necessary to tackle the more general problem of vandalism I believe both approaches can be used in conjunction resulting in an improvement over our current process.
    PS: The following isn't a criticism specifically against ML-based vandalism detection but just a fun example that recently came up in a talk where we were discussing how confident we should be in current ML solutions: https://www.theverge.com/2021/3/8/22319173/openai-machine-vision-adversarial-typographic-attacka-clip-multimodal-neuron ElanHR (talk) 05:58, 19 March 2021 (UTC)
    @Tagishsimon: ["#Second round of Wikidata updates on the mass data imported a while ago/ Wikidata history query service"] actually provides a pretty great case study of how signing statements would be beneficial by providing checks at the statement level. If the data were originally signed than finding these unedited statements would be a two step process:
    1. Querying for statements signed by this institution.
    2. Checking if the statement hash is still valid.
    ElanHR (talk) 06:38, 19 March 2021 (UTC)

Bot idea - automatically add language of work to reference URLs

I would like to automate tedious tasks as much as possible. Entering "lang of work" for every URL is no fun. Since most pages these days are HTML and often contain a lang="langcode" attribute, I would like to make a bot that follows WD and adds the language right after someone adds a URL. WDYT?--So9q (talk) 08:48, 18 March 2021 (UTC)

citoid (Q21679984) may help. Unfortunately, its Wikidata deployment is stalled, see T199197.--Jklamo (talk) 13:29, 18 March 2021 (UTC)
I have considered doing this. But I'm not sure what the most robust way of inferring language is. You can look at the content (and there are models to infer language) or various pieces of metadata etc. But I would be amenable to implementing it. Some experimentation would be needed to see what's the most accurate. It would also take forever to run but we do have tons of time. BrokenSegue (talk) 14:02, 18 March 2021 (UTC)
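A first, naive pass could simply fetch the page and read the lang attribute of the <html> tag. This is a sketch only: it ignores redirects, multi-language pages, HTTP language headers and the content-based models mentioned above, and mapping codes like "en-GB" to Wikidata language items would still be needed.

from html.parser import HTMLParser
import requests

class HtmlLangSniffer(HTMLParser):
    """Grabs the lang attribute from the first <html> tag encountered."""
    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")

def detect_page_language(url):
    resp = requests.get(url, timeout=30,
                        headers={"User-Agent": "lang-sniff-sketch/0.1"})
    sniffer = HtmlLangSniffer()
    sniffer.feed(resp.text)
    return sniffer.lang  # None when the page declares no language

print(detect_page_language("https://www.wikidata.org/wiki/Wikidata:Main_Page"))  # e.g. "en"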
We're not losing information because of link rot, since Internet Archive archives all pages for all URLs added to wikidata. We may well have a backlog on adding archive URL data, but that's another story. --Tagishsimon (talk) 18:02, 18 March 2021 (UTC)
The Internet Archive may archive all URL-datatype statements (and that has yet to be proven) but, for sure, it does not archive external IDs. Ayack (talk) 08:29, 19 March 2021 (UTC)
@Cyberpower678: would have more information on whether User:InternetArchiveBot checks external IDs. ElanHR (talk) 08:36, 19 March 2021 (UTC)

Recommendations for mapping the contents of a non-WikiMedia MediaWiki to WikiData?

I was surprised not to have (yet) found recommendations/guidelines/tools for mapping the items in a non-WikiMedia MediaWiki instance to the appropriate Q-names on WikiData. The best workflow I've found so far is to paste the wikitext of the pages into Magnus's LinkedItems tool, then save the resulting list, open it in AutoList (thereby getting the descriptions), manually re-sort them (as AutoList seems to put them in a weird, unsorted order), then manually go thru them removing the ones that don't seem right (like disambiguation pages), then update the saved list with the linking property, and apply the result with QuickStatements. This ... works, kinda, but has a bunch of manual steps and is finicky. I would have expected there was a nice, already existing tool for doing this. I looked into OpenRefine, but wasn't able to see how exactly to use it for this sort of purpose. Any pointers appreciated. JesseW (talk) 14:06, 18 March 2021 (UTC)

It looks like Mix-n-Match is what I'm looking for, but there isn't any documentation on using it for other MediaWikis, and when I try to use the scraper functionality, it seems to be broken. :-/ JesseW (talk) 03:46, 19 March 2021 (UTC)

Two meanings of Volume Property:P478

On a related question to the series issue above: We've been using volume (P478) to mean two different things. For a set of volumes, it is the volume number of the book, but for a journal article, it is the volume of the journal in which the article appears. So a value of "3" for volume (P478) could mean that the data item is itself the 3rd volume of something, or could mean that the data item is something contained within the 3rd volume. --EncycloPetey (talk) 21:05, 18 March 2021 (UTC)

I think this is a non-issue, because the data item itself is established with instance of (P31). An individual scholarly article (Q13442814) has properties of appearing in a certain issue of a certain volume of a publication. An item with instance of (P31) scholarly article (Q13442814) should not be confused with a volume (Q1238720), just as we don't mistake a publication date of 1900 as being the actual year 1900. -Animalparty (talk) 03:01, 19 March 2021 (UTC)
Some of what you say is correct, but both books and articles can go through editions, in which case they will be an instance of (P31) version, edition or translation (Q3331189), and in that case the meaning is not going to be clear. We can also have multi-volume sets of collected works, where the contents include novels, poems, and other items. In such a multi-volume set, a poem might be in volume 3 of the set, but a novel might be volume 3 of the set, and in some multi-volume sets a novel can be more than one volume long, so the first volume of the novel might be volume four of the set, in which case the data item's property would be both volume 1 and volume 4. --EncycloPetey (talk) 03:46, 19 March 2021 (UTC)

notice for lost/theft of identity from delhi india

my identity can be misused for things or services i do not avail or subscribe . so kindly take into notice that on the basis of my identity no contracts or agreements should come into effect . i dont work for any organisation and am a resident of india . so any person willingly representing a digital copy of any government approved document shall be cross checked and with virtual and written expression with posessing a signature in presence of the registerar. no voice enabled or text or email enabled transactions made through telecommunication. i hold no share or intrest or salary or benfit in any organisation .



name: akshay father name: virendra kumar mewal resident of : 21 bihari park , devli road , delhi, india 110062


i have not purchased any domain names as well, so any transactions made under representation of my identity, being a driving license, passport or an identity card, shall not be entertained. i disclaim any agreement with any person or organisation.


  • I'm sorry to hear that Akshay but I think you may be in the wrong place. Is it that you have a presence on Wikidata that you would like removed? ElanHR (talk) 06:22, 19 March 2021 (UTC)

Wikipedia category interwiki control

Hi there.

Suggestion: This is a suggestion to improve how items on categories are created. We are missing at this point the relation between categories and subcategories on these items. Example: in this category item (Category:Crimes by century), we should have a way to connect to the item for its subcategory on English Wikipedia (Category:21st-century crimes), and reciprocally.

Rationale/Context: I had this idea as I was skimming through Pi bot, per Mike Peel's request, and realized that an interwiki mess-up had happened between the categories I have used as examples in the first paragraph. The ptwp category Categoria:Crimes do século XXI had been mistakenly connected to Category:Crimes by century, which led Pi bot to create a new item for Categoria:Crimes por século for ptwp without any interwiki. I have undone the mess-up. I kept wondering how many cases like this we might have, and whether Pi bot's creation of items for categories could actually lead to a strategy to provide better interwiki control for different Wikipedias. This could be done as we might be able to envision some sort of mismatch check tool.

Caveat: I am not sure how to model this. I thought about the "part of" property relation, and its reciprocal property, with a qualifier indicating which Wikipedia version this relates to. I am not sure it makes sense, though. I have no idea what a mismatch check tool would look like.

I know this is very vague and not necessarily helpful, but anyway I thought it might resonate to some of you.

I hope everyone is safe and well. Cheers. --Joalpe (talk) 16:30, 6 March 2021 (UTC)

If we did that, how would we coordinate all the possible category structures across all Wikimedia projects? Not every project has the same subcategories or supercategories. --EncycloPetey (talk) 20:19, 6 March 2021 (UTC)
@EncycloPetey: I don't think uniformity is the goal here. We can replicate the category structure existing on Wikipedias as they are, not force upon them any uniform structure. The structure of categories and subcategories is knowledge, and at this moment we are not modeling this knowledge as we build items for categories on Wikidata. --Joalpe (talk) 22:57, 6 March 2021 (UTC)
I understand that, but how many category structures do you want involved, because we're not just talking the Wikipedias, but also the Wikisources, Wikiquotes, Commons, and every other project with categories. How will anyone detect useful signal with so many potentially conflicting systems coded? --EncycloPetey (talk) 05:13, 7 March 2021 (UTC)
@EncycloPetey: I don't know for sure. You are one step ahead of me. The first step to find interwiki mistakes would be indeed to identify differences in category structures. The big question, and again I don't know how to answer this, would be how to parse what is just a legitimate different strategy of category structuring and what is a mistake once you have differences identified. Do you have any ideas, or you think this is impossible? Thanks for responding, by the way. --Joalpe (talk) 13:25, 7 March 2021 (UTC)
All that assumes that the goal is always the same, but it isn't. The reasons for certain category structures are fundamentally different between Wikisource (which uses library categorization) and Wikipedia (which uses encyclopedia topical categorization) and Commons (which categorizes images, data files, and audio). What counts as a "mistake" in one might be correct for another, as the projects are using the categories for different purposes and in different ways. Look for example at what is linked on the data item for Oedipus Rex (Q148643): the Wikipedias have articles about the play, but Wikisource has listings for editions of the play, and Wikiquote has quotations from or about the play. The content on the sites differs sufficiently that pages linked together through the same data item will be categorized in different ways on different projects. --EncycloPetey (talk) 21:34, 7 March 2021 (UTC)
A long time ago, I thought it might be useful to have a relation Xsub-category ofY, with qualifier "applies to wiki" (taking a list of wikis it applied to), and a qualifier "nature of sub-categorisation" = one of various different types of pattern, to capture the way in which the sub-category was a narrowing of its parent category. In such a way it seemed to me that on wikidata one could build up a representation of the nature of all of the category structures on all of the wikis, and to try to annotate them. Having identified the sub-categorisations, one could then look to see whether or not there were corresponding statements linking corresponding non-category items X1 to Y1.
I think creating such a representation within Wikidata would have been quite useful -- at the very least it would have helped identify missing statements. It also would have told us more about what types of sub-categorisation humans like to make. And on the wikis' side, it might have helped them get closer to wikidata-assisted subcategorisation.
But my thoughts were dismissed. I was told that what was relevant to wikidata to try to represent was the real world, not different human editorial choices. (And there were questions about maintainability). I still think those objections are wrong, and the idea was a worthwhile one. But I have moved on to other things. Jheald (talk) 21:57, 7 March 2021 (UTC)

Category:21st-century crimes (Q6976196) should absolutely be linked via a statement to Category:Crimes by century (Q30101322). It is inconceivable to me that in any wiki which has both of these categories, the former would not be a subcategory of the latter. MSGJ (talk) 09:41, 11 March 2021 (UTC)

@MSGJ: What is the best way to model this? --Joalpe (talk) 01:15, 13 March 2021 (UTC)
I don't know! A property called "is a Wikimedia subcategory of" might be a start. I think it should be unidirectional. But I share the concerns of maintainability ... I suppose it would only be of use to projects which had both parent and child categories, although you could use chains of these statements, e.g.
Category:21st-century crimes (Q6976196) is a subcat of Category:Crimes by century (Q30101322) is a subcat of Category:Crimes (Q7023064)
So if a project was using Category:21st-century crimes (Q6976196) and Category:Crimes (Q7023064), then the subcat relationship could be inferred by skipping the unused category. MSGJ (talk) 12:01, 19 March 2021 (UTC)

Hello,

I’m not sure this is the best place to post about this issue; if it’s not, where should it be posted instead?

I wonder what we should do to fix the mess made by Mirisa56 on both the Q-items I've linked in the title above (they probably didn't notice or understand the difference between the concepts of meteorology and metrology); after their edits, a bot performed a merge, and then someone created a new Q-item. So should we revert all of that first (and check whether anything in those edits should be applied to the other Q-item instead), or leave it as is and then also add to the newly created Q-item the data removed by Mirisa56 from the old one before the merge (see the history of World Meteorological Day (Q1135422)/World Metrology Day (Q575238)/World Metrology Day (Q106006264) for more details)?

Thanks for your help,

109.89.95.35 01:28, 19 March 2021 (UTC)

  • I think I sorted it out.
At some point, hiwiki and fawiki sitelinks were on the wrong item. In the meantime, two other items were created for metre day.
Please double check the sitelinks. --- Jura 10:31, 19 March 2021 (UTC)

Ranking of Constraint Violations

Could constraint violations get reported in ranked order, going from most to least important? Ideally, one would fix each and every single potential problem in Wikidata; but given limited time, and the size of the generated reports, focusing on fixing the most important issues first might be a good idea. For ranking entities, perhaps Wikidata QRank could be used. It’s not a perfect signal, but I think it would still be better than the current order by entity IDs. (Disclaimer: I wrote QRank, please excuse my shameless plug.) Asking this on Project chat because the dev input page is marked as inactive. —Sascha (talk) 10:39, 19 March 2021 (UTC)
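In the meantime, anyone producing their own violation lists can already apply this kind of ordering offline. A sketch under stated assumptions: it expects a local copy of the QRank dump as a gzipped two-column CSV (column names "Entity" and "QRank" assumed here) plus a plain list of violating Q-ids; the file path and example Q-ids are placeholders.

import csv
import gzip

def load_qrank(path="qrank.csv.gz"):
    """Read the QRank dump into a {Q-id: rank} dictionary."""
    ranks = {}
    with gzip.open(path, mode="rt", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            ranks[row["Entity"]] = int(row["QRank"])
    return ranks

def sort_violations_by_rank(violating_qids, ranks):
    """Most prominent entities first; unknown entities sink to the bottom."""
    return sorted(violating_qids, key=lambda q: ranks.get(q, 0), reverse=True)

ranks = load_qrank()
violations = ["Q42", "Q64", "Q105653764"]  # e.g. parsed from a constraint report
for qid in sort_violations_by_rank(violations, ranks):
    print(qid, ranks.get(qid, 0))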

  • There are currently four or five approaches to tackle them. The idea is that contributors pick the one they prefer. Above it's announced that one of them will be discontinued. --- Jura 11:10, 19 March 2021 (UTC)

Series ordinal

At 1930 Ford National Reliability Air Tour (Q68255449) I have series_ordinal=6, but it gives a constraint error. Is there a better way to show it? Or should I change the constraint parameters to allow it as a main value? I notice all the Super Bowls also need this parameter. --RAN (talk) 13:39, 19 March 2021 (UTC)

Which country (P17) is preferred? (sample: Sofia, capital of Bulgaria, Q472)

Q472#P17 has several values, each qualified with a different start time and all but Bulgaria (Q219) with an end time. [7]

For this fairly clear case, I think the value for the current country, Q219, should have preferred rank.

I fixed this the other day, but noticed that similar non-preferred values are present for places in Bulgaria (and possibly elsewhere). @Спасимир: --- Jura 08:11, 20 March 2021 (UTC)

Hello, Jura! Thanks for the note. The separation of the dates is necessary when using the data in bg wiki, because in certain periods the official name of the country was different. Isn't it usually the first item with a preferred rank? Yes, this is the preferred rank item.--Spasimir (talk) 11:21, 20 March 2021 (UTC)
Dates are fine. Ok, I will fix the ranks where missing. --- Jura 12:42, 20 March 2021 (UTC)

Call for review, comment and discuss my PhD thesis on Wikimedia movement

Hello,

Just a short message to call people interested to review, comment and discuss my PhD thesis on Wikimedia movement. All the best ! Lionel Scheepmans Contact Désolé pour ma dysorthographie, dyslexie et "dys"traction. 12:20, 20 March 2021 (UTC)

@Lionel Scheepmans: did you forget to include a link?--So9q (talk) 12:39, 20 March 2021 (UTC)
It looks like the main content is available on French Wikiversity: fr:v:Recherche:Imagine un monde. (A summary is found on English Wikiversity.) whym (talk) 13:47, 20 March 2021 (UTC)
  • Courageous ! --- Jura 12:47, 20 March 2021 (UTC)
  • Here is @So9q, thanks @So9q, but the Wikimedia movement is also a passion for me. It looks like the last free and open space in global-north civilization, and I have already noticed certain drifts that I want to point out. Wikidata, with its CC0 license, is one of them. CC0 should be replaced by a CC-SA license (which unfortunately disappeared for lack of use). Otherwise, all the work done in this space is a direct profit for the giants who manage our digital lives. Lionel Scheepmans Contact Désolé pour ma dysorthographie, dyslexie et "dys"traction. 12:58, 20 March 2021 (UTC)

Dubious URL as source

In Q105954358 there is one source, 'https://www.vernoeming.nl/voornaam/arrien', which is flagged by McAfee as untrustworthy, as it links through to 'https://check-your-prizes.life/' (under Firefox). Under Edge I get the notice "Deze site is gerapporteerd als onveilig" ("This site has been reported as unsafe"), hosted by pipe.travelfornamewalking.ga. Smiley.toerist (talk) 12:45, 18 March 2021 (UTC)

That kind of links should be removed, indeed! --Dick Bos (talk) 11:06, 19 March 2021 (UTC)
There is a Dutch (NL) website to check whether a first name (Arriën) is used: the Nederlandse Voornamenbank; however, in which other languages this first name is used is uncertain. It is not a frequently used name. Smiley.toerist (talk) 16:26, 20 March 2021 (UTC)

I need your help.

I want to remove the link (P1324) between Q1206660 and its English Wikipedia page. I searched a lot and read articles but didn't find a way to do it.
Reason: Its English Wikipedia page has two infoboxes, and this property applies to only one infobox, but by default it is shown on both infoboxes.
Haideronwiki (talk) 17:06, 19 March 2021 (UTC)

  • Most infoboxes don't display the data from Wikidata if local (Wikipedia) data is present. Maybe it's sufficient to add it. Alternatively, you could create an item for the other version. --- Jura 17:48, 20 March 2021 (UTC)
  Done Haideronwiki (talk) 21:40, 19 March 2021 (UTC)

Turkey pulls out of Istanbul convention

 
Presidential decree published in the official gazette to withdraw from the Istanbul Convention.

How could I specify in Convention on preventing and combating violence against women and domestic violence (Q1697391) that Turkey pulled out?

It would be "great" to have two dates. Decision to pull out was signed on 2021-03-19 and published on 2021-03-20 in the Journal Official?

Or something different?

Meskalam (talk) 17:38, 20 March 2021 (UTC)

@Meskalam: I took a stab at it using repealed by (P2568) and has characteristic (P1552) but honestly I'm not sure this level of detail is needed or helpful. Is the publication date meaningful? BrokenSegue (talk) 18:57, 20 March 2021 (UTC)
@BrokenSegue: Thank you! In this case, it looks a bit "overdone" I agree, but not so long ago, there could be a week or more between the decision and the publication (before the computerized era...). It can even get very complicated (Paris Peace Conference (Q199820)), but really not well documented in wikidata. Probably the level of details of Blender (Q173136) is more helpful...

getLabel on non-existent labels on vepwiki returns results from etwiki and enwiki

 
This image from the Lua manual suggests that this is intentional, but it should be changed to Russian instead

I noticed that the following returns Estonian or English labels when Veps labels are missing.

   {{#invoke:WikidataIB |getLabel |Q429281}}

Where is this vep > et > en fallback hierarchy defined? In the case of Veps, Russian would be a more appropriate fallback than Estonian. How can I temporarily disable fallback labels? —Iketsi (talk) 04:57, 19 March 2021 (UTC)

Where is this hierarchy defined? MessagesVep.php has $fallback = 'et';. MessagesEt.php has nothing like that, therefore 'en' is the default.
How can I temporarily disable fallback labels? The module should use mw.wikibase.getLabelByLang. --Matěj Suchánek (talk) 12:53, 21 March 2021 (UTC)

WikiProject for local community

I am interested in creating a WikiProject that looks at organisations and physical resources of particular benefit to local communities, for the purpose of 1. improving the coverage of those entity types for a given community and 2. enabling lists to be extracted from Wikidata for use in other data contexts, e.g. maps that highlight specific features of an area. Would using focus list of Wikimedia project (P5008) as a way to build a list of relevant items be the right way to proceed? Pauljmackay (talk) 10:43, 21 March 2021 (UTC)
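That approach should work: once items carry focus list of Wikimedia project (P5008) pointing at the WikiProject's own item, they can be pulled back out for reuse with a query along these lines (a sketch: the project Q-id is a placeholder, and the coordinate property is just one example of what a map layer might need).

import requests

WDQS = "https://query.wikidata.org/sparql"
PROJECT_ITEM = "Q4115189"  # placeholder: the item created for the WikiProject

# Everything tagged for the project, with coordinates where available,
# ready to feed into a map layer or another data context.
query = f"""
SELECT ?item ?itemLabel ?coords WHERE {{
  ?item wdt:P5008 wd:{PROJECT_ITEM} .
  OPTIONAL {{ ?item wdt:P625 ?coords . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""

rows = requests.get(WDQS, params={"query": query, "format": "json"},
                    timeout=60).json()["results"]["bindings"]
for row in rows:
    print(row["itemLabel"]["value"], row.get("coords", {}).get("value", ""))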

Memorials?

I went to add a memorial in a church to a person's item, but all I was offered was plaque image (P1801). We have Q5003624 - memorial (a general item) but not a property to attach, e.g. The Albert Memorial to Albert, Prince Consort. Obviously there's a risk of overloading an item with these, but that's a matter of control. Or have I missed something? Rodhullandemu (talk) 19:36, 20 March 2021 (UTC)

maybe make an item for the memorial and link it to the person? BrokenSegue (talk) 22:10, 20 March 2021 (UTC)
That would work but it's a specific solution for one person. As I see it, for the memorial to deserve an item it would have to be independently notable, and I don't see that notability here is inherited. Rodhullandemu (talk) 19:14, 21 March 2021 (UTC)
There is also image of grave (P1442) which may be useful in some cases. But in general I agree with BrokenSegue - Wikidata:Notability#3 covers this MSGJ (talk) 21:09, 21 March 2021 (UTC)

Project Grant application for SDC support in OpenRefine: feedback and endorsements welcome

 

Hello everyone! Since 2019, it is possible to add structured data to files on Wikimedia Commons (SDC = Structured Data on Commons). But there are no very advanced and user-friendly tools yet to edit the structured data of very large and very diverse batches of files on Commons. And there is no batch upload tool yet that supports SDC.

The OpenRefine community wants to fill this gap: in the upcoming year, we would like to build brand new features in the open source OpenRefine tool, allowing batch editing and batch uploading SDC :-) As these are major new functionalities in OpenRefine, we have applied for a Project Grant. Your feedback and (if you support this plan) endorsements are very welcome. Thanks in advance, and many greetings – Pintoch (as OpenRefine developer) and SFauconnier (talk) 09:21, 16 March 2021 (UTC) (aka Spinster, as member of the OpenRefine steering committee)

Mess after User:Andreasmperu (false accusation?)

Andreasmperu (talkcontribslogs)

It is still here:

None of the admins addressed this. None of the stewards addressed this. Nobody at the WMF addressed this.

Why would anyone trust anything to Wikidata? 95.29.9.63 00:48, 22 March 2021 (UTC)

Some (all?) of the above complaints are wrong. "User:Andreasmperu" seems to be a sysop here. I have not found any vandalism by em. The item quadruple (Q1987256) about football is clean now. Taylor 49 (talk) 08:21, 22 March 2021 (UTC)

Merge request

Please merge Bella Devyatkina (Q41933106) and Bella Devyatkina (Q105501463). --2001:B07:6442:8903:D461:A359:70CC:BAE1 11:43, 22 March 2021 (UTC)

Serie of a journal

Hi, how should we model the series of a journal (with a new item?)? As an example, see Deux genres nouveaux d’Échinides du Lias (Q106031902), where I put it within volume (P478). Shouldn't we have a "series" (or "journal series") property? Christian Ferrer (talk) 19:56, 18 March 2021 (UTC)

The Source MetaData WikiProject does not exist. Please correct the name.
The Source MetaData/More WikiProject does not exist. Please correct the name. --Succu (talk) 21:29, 18 March 2021 (UTC)
Thanks for the link,   Notified participants of WikiProject Periodicals. Christian Ferrer (talk) 10:31, 19 March 2021 (UTC)
It would be really great if this could be resolved! There are many journals that have different "series". See the examples named or linked above and American Journal of Science. See also for instance Virginia journal of science and The Edinburgh Journal of Science, that have "new series". --Dick Bos (talk) 11:04, 19 March 2021 (UTC)
It's possible a property for this would be helpful, but I think it would be better to create a separate item for each series, similar to edition items for books? ArthurPSmith (talk) 13:38, 19 March 2021 (UTC)
@ArthurPSmith: so if we have Bulletin de la Société géologique de France (Q4035500) instance of (P31) scientific journal (Q5633421), what would be the instance of "Bulletin de la Société géologique de France, serie 5" and what would be the relation with the former journal? Christian Ferrer (talk) 21:16, 19 March 2021 (UTC)
part of (P361) has been used on some, probably due to lack of any better idea. See some of these items: [8]. Ghouston (talk) 05:10, 20 March 2021 (UTC)
Maybe a new item for each new series is a more suitable idea; furthermore, not all series are numbered series, e.g. Memorias del Museo de Ciencias Naturales de Barcelona, Serie Geológica (Q106066630) vs Memorias del Museo de Ciencias Naturales de Barcelona, Serie Botanica (Q16942031). Christian Ferrer (talk) 07:26, 20 March 2021 (UTC)
One question is about formatting citations. If the series is a separate property, then it can be formatted in different ways as required. Are there two types of series, one where each series is generally treated as a separate journal, and another where it's more like a volume identifier for a single journal? In the first case, the series are all published simultaneously, in the 2nd, they are published in sequence and the journal has continuity. Ghouston (talk) 23:42, 20 March 2021 (UTC)
Agreed, there are already a lot of examples for the first case, e.g. Stuttgarter Beiträge zur Naturkunde (A) (Q21385082) and Stuttgarter Beiträge zur Naturkunde (B) (Q21385087). For the second case, if we take Deux genres nouveaux d’Échinides du Lias (Q106031902), that gives → (5)6(6–8): 419–442 → or → ser.5 vol.6(6–8): 419–442. This journal is a good example as it has a lot of series; in BHL they go up to series 4, but in fact there are 7 series. The following question is legitimate: do we create 7 items? Without a property, the answer is yes, of course, otherwise we can't differentiate the volumes that have the same numbers. :( The only remaining question is therefore: do we create a property for that kind of case? Christian Ferrer (talk) 08:20, 21 March 2021 (UTC)
I proposed this as a property two years ago. Unlike Arthur, I very much oppose the idea of making them separate items. Circeus (talk) 13:52, 21 March 2021 (UTC)
There is talk of series, but sometimes it is more useful to define individual issues, such as for Q62015763, which is part of Q63440244. Smiley.toerist (talk) 16:42, 21 March 2021 (UTC)
So the property was already proposed and rejected, but it doesn't seem it attracted much interest at the time. Without the property, we have to create separate items, which is possibly invalid because they are basically duplicates that represent a single journal. It also limits how citations can be formatted, since the series will be fixed as part of the journal "title". Ghouston (talk) 21:39, 21 March 2021 (UTC)
I just created Heterosalenia alloiteaui sp. nov. du Jurassique moyen du Liban nord et un cas de croissance excessive de plaques arretant le developpement des zones ambulacraires (Q106116443) with SourceMD; the series has been added with the volume: "S6-II". Christian Ferrer (talk) 21:12, 22 March 2021 (UTC)

Structuring mandatory language qualifiers

There are a bunch of items with a required language qualifier. Some, such as title (P1476) have a box that drops down and requires you to enter the two-digit language code (e.g. "en" for English) before you can save the item. This works nicely. Others, such as official website (P856), seem to have a cruder system that just yells at you with the flag if you don't include language of work or name (P407) as a qualifier. Could we change the properties that use the cruder system to the nicer one? {{u|Sdkb}}talk 20:04, 19 March 2021 (UTC)

@Sdkb: The difference in system is due to the nature of the properties. By definition the "nicer system" cannot appear for other item types unless it is expanded to every property of that type. Doing what you propose for official website (P856) would also require that the language be defined for every single use of streaming media URL (P963) or property proposal discussion (P3254), which is a sweeping software change that is, to say the least, unlikely to happen. Circeus (talk) 14:24, 21 March 2021 (UTC)
@Circeus: Thanks for the reply. I don't totally understand what you're saying, but I take it that there's some issue, so I guess this is doomed to remain on the pile of non-ideal aspects of Wikidata's interface indefinitely. Oh well. {{u|Sdkb}}talk 04:57, 22 March 2021 (UTC)
@Sdkb: I'll try to explain in more details. Properties function differently because the type of things (called "data type" in the software) you enter for them varies. Some properties demand dates, some properties demand existing items, some properties demand language-less text, some properties demand URLs... All this variety because what the software does with the data entered for these properties varies.
(And yes, some needed data types do not exist. One example of a data type that currently does not exist is a dimensionless integer number, so that you don't have to specify a unit.)
The "nicer system" you see on properties like title (P1476) is not merely a random switch that can be flipped for any given property, but something attached directly to the data type Monolingual text itself, so all properties with that data type (e.g. trading name (P6427) or unit symbol (P5061)) have it. So aside from this being a significant software issue, you cannot apply this nicer system to official website (P856) without affecting every other property with the URL datatype. Circeus (talk) 12:15, 22 March 2021 (UTC)
Ah, that makes more sense. Thanks for the explanation! {{u|Sdkb}}talk 16:32, 22 March 2021 (UTC)
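For readers wondering what "attached directly to the data type" means in practice, here is a small SPARQL sketch listing the properties that share the monolingual-text data type (the ones that get the built-in language box):
SELECT ?property ?propertyLabel WHERE {
  ?property a wikibase:Property ;
            wikibase:propertyType wikibase:Monolingualtext .   # same data type as P1476, P6427, P5061
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 200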

Wikidata weekly summary #460

Wikidata user diversity survey will be activated on Wednesday 10th March, 2021

Hello,

As you may recall, the Wikidata development team at Wikimedia Germany requested a CentralNotice banner earlier in February to deploy the Wikidata user diversity survey. The request has been approved, and the banner will be activated and displayed for 10 days, from Wednesday 10th March, 2021 until Saturday 20th March, 2021.

Your responses to this survey will help us to understand the core demographics that make up the Wikidata community in order to provide us with a baseline for future diversity efforts.

The survey is very brief and will only take 1-2 minutes to complete.

Link to Survey (open until Monday 22nd March, 2021): https://wikimedia.sslsurvey.de/Wikidata_Community_in_2021

More information about the survey, the use of the data, and the questions asked: Wikidata:Usability and usefulness/2021-2-Survey

Many thanks for your participation. If you have any questions please do not hesitate to ask.

Cheers,

-Mohammed Sadat (WMDE) (talk) 11:33, 9 March 2021 (UTC)

  • In other contexts, some users were concerned about the provider(s) being used, partially due to policies that are incompatible with those of WMF, precluding some Wikidata contributors from participating. What's the situation here? --- Jura 12:27, 9 March 2021 (UTC)
Hi Jura, Thanks for the feedback. We are using Lampoll which is certified to follow the European data protection GDPR law. All the data will be collected anonymously and all the questions are voluntary for participants. -Mohammed Sadat (WMDE) (talk) 12:20, 10 March 2021 (UTC)
  • From your point of view (or from that of this month's WMF contractor doing some project), it may make sense to simply use a provider you consider suitable and accept its policy. From my point of view, it may not matter if I won't participate in one of the offsite activities/sub-projects, but if you look at it on a Wikidata project level, it gets problematic: we can't feature a different policy each month on all Wikidata pages because someone just considers it suitable that month. --- Jura 10:27, 16 March 2021 (UTC)

Can we create an item on any topic?

Do we not consider the notability of a subject before creating a new item here on Wikidata? For example, we have an item on TYC 2691-2676 that has no Wikipedia article. I've seen a few more such items here that have nothing more than a description, label and a few statements with no reference. Should they not be deleted? Btw, I am new here. Lightbluerain (talk) 14:07, 23 March 2021 (UTC)

See Wikidata:Notability. --Tagishsimon (talk) 14:31, 23 March 2021 (UTC)
The linked item does have references and an external ID from SIMBAD. ChristianKl 14:39, 23 March 2021 (UTC)
Help:Sources tries to explain "reference" in Wikidata. --- Jura 14:45, 23 March 2021 (UTC)

Same person under different number

Q104762815 (Amir Hossein Mahmoudi) is the same person as Q105906947 (Amirhossein Mahmoodi). I don't know what to do when one person has two items, so can someone fix this? Thanks, and have a good day. ShadowBallX (talk) 17:42, 23 March 2021 (UTC)

Goal and done. Taylor 49 (talk) 20:01, 23 March 2021 (UTC)

College admissions statistics

For w:Infobox U.S. college admissions, I'm seeking to eventually automate the data presentation so that it can be imported, saving us from the laborious task of manually updating it every year. Storing the data at Wikidata, however, would seemingly require the creation of a lot of properties. Some (such as yield rate) seem a reasonable enough extension of admission rate (P5822), but others, such as the 75th percentile SAT math score among enrolling students, seem really hyperspecific and thus not the best property candidates. Any suggestions about how best to structure this data? The core issue is just that there's a ton of salient data related to college admissions, and I'm not sure how we'd represent it without having a really excessive number of properties. {{u|Sdkb}}talk 22:59, 23 March 2021 (UTC)

@Sdkb: maybe use lots of qualifiers instead? We could use test score (P5022) and then qualify it with applies to part (P518) "50th percentile", assessment (P5021) "SAT", and such. BrokenSegue (talk) 01:08, 24 March 2021 (UTC)
@BrokenSegue: That would help somewhat, but having test score (P5022) is a little weird, since it's not the college that has the test score, but rather the entering class (and just using test score (P5022) on the college, someone might reasonably mistakenly think that it's the average score of the entire student body rather than just the entering class). {{u|Sdkb}}talk 01:20, 24 March 2021 (UTC)
@Sdkb: Yeah, that occurred to me, but I was thinking the qualifier would say that it applies to part (P518) "50th percentile admitted student". The students are a part of the college, so the qualifier is accurately describing what that score applies to. In some sense it is the "school's score". Yeah, it's not the most clear, but the other option (making a ton of properties) isn't appetizing. BrokenSegue (talk) 01:29, 24 March 2021 (UTC)
@BrokenSegue: It would have to be "50th percentile first-year enrolled student"; the statistic for admitted students is different because it includes those who were accepted but did not enroll (colleges often publish it instead of the more meaningful enrolled stat because it's generally higher). I don't know how to express that on Wikidata, though. {{u|Sdkb}}talk 01:49, 24 March 2021 (UTC)
Yeah I'm suggesting making an item called "50th percentile first-year enrolled student" or something and to use it as a qualifier. BrokenSegue (talk) 02:13, 24 March 2021 (UTC)
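A hedged SPARQL sketch of how the qualifier-based modelling suggested above could be read back. The percentile item is hypothetical (it would still have to be created), so this only shows the intended shape of the data, not something that works today:
SELECT ?college ?collegeLabel ?score ?partLabel WHERE {
  ?college p:P5022 ?statement .                  # test score (P5022) as a full statement
  ?statement ps:P5022 ?score ;
             pq:P518 ?part .                     # e.g. the hypothetical "50th percentile first-year enrolled student" item
  OPTIONAL { ?statement pq:P5021 ?assessment . } # e.g. "SAT"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100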

I've recently seen a large number of items switched from using coordinate location (P625) to using headquarters location (P159). This has broken things at Wikipedia, which is (predictably) causing grumbling and calls to stop integration with Wikidata. When there are big changes here, it may be a good idea to give more of a heads up to Wikipedia (yes, we're a separate project, but for Wikidata to succeed we need integration that allows us to make use of Wikipedia's large editor base) and try to resolve these things in advance as much as possible, and it'd be good to resolve any lingering threads from this particular change. {{u|Sdkb}}talk 01:45, 16 March 2021 (UTC)

I don't think using headquarters location (P159) is anything new. It could probably be fixed with a minor change to the module on the Wikipedia side. Ghouston (talk) 05:28, 16 March 2021 (UTC)
The item in question, COSI Columbus (Q5013508), is claimed in Wikidata to be a business, and businesses don't have locations as such, but they may have an HQ or other spaces that they use, perhaps multiple. I assume that was the reasoning behind the bot change.[10] Ghouston (talk) 05:35, 16 March 2021 (UTC)
Well, companies do have a registered office (Q65616986), in jurisdictions that I'm familiar with, but that's not necessarily a place where they do business or have staff, and Wikidata doesn't seem to have a specific property for it. Ghouston (talk) 05:48, 16 March 2021 (UTC)
P31 on COSI Columbus (Q5013508) is problematic. That probably explains some of the changes and why they are not really understood. --- Jura 09:25, 16 March 2021 (UTC)
Simply fix the templates at Wikipedia. For juridical person (Q155076) coordinate location (P625) as qualifier of headquarters location (P159) is nothing new, it has been repeatedly discussed many times. The second thing is mixing institution/building/exposition in one item.--Jklamo (talk) 11:32, 16 March 2021 (UTC)
The point is that these statements should not be changed without giving some warning/notice to the Wikipedias that are using them. How can they get fixed unless they are notified? Changing existing statements (while completely logical and made in good faith) can cause breaking changes to articles on other projects, which Wikidata would be wise to avoid. A technical solution would be to prevent removal of any statement which is in use by any other project (or to require the editor to acknowledge a warning that their actions may cause breakage). This would encourage editors to fix the templates at Wikipedia before making changes to statements. MSGJ (talk) 12:57, 16 March 2021 (UTC)
We lack a mechanism right now afaik, to know where data is being used. What is the editor supposed to do? Cast runes? Wikipedia template writers need to write their templates defensively - Postel's law. --Tagishsimon (talk) 13:21, 16 March 2021 (UTC)
@Tagishsimon: That's an utterly impossible ask. Wikipedians who have written templates that draw coordinates of institutions from Wikidata that have been tested and worked perfectly until now cannot possibly be expected to magically intuit that Wikidata might someday decide that Princeton University (Q21578) doesn't actually have a coordinate location but rather only a headquarters location (no, that's not a hypothetical). {{u|Sdkb}}talk 07:13, 18 March 2021 (UTC)
Ditto. The suggestion that "Wikipedia template writers need to write their templates defensively" is unrealistic and ridiculous! There are any number of ways that data can be modelled and template editors will work with the method that is currently used. They will not be able to anticipate that a Wikidata editor will pull the rug from under them by changing the data model. The onus must be on the editor making the change to fix the templates or at least give fair warning before making breaking changes. MSGJ (talk) 11:35, 19 March 2021 (UTC)
Not a complete solution, but in order to better estimate the impact of these sorts of changes it might be helpful to have a property to indicate this on Wikidata (e.g. "Template uses/relies on property"). Since templates may have different implementations it should probably include which site as a qualifier. ElanHR (talk) 19:04, 24 March 2021 (UTC)
@MSGJ: That would definitely be a good start. But it's not just Wikipedia using data. When changes break, say, Google search results, they won't do us the courtesy of complaining, they'll just stop using us. {{u|Sdkb}}talk 07:13, 18 March 2021 (UTC)
Well, we can't undertake never to change the data model MSGJ (talk) 11:52, 19 March 2021 (UTC)
I don't think this should be a main driver of whether to update Wikidata's schema/representation. While having a frequently changing/unstable schema would be detrimental to usage, if there is a more accurate way to represent something we should strive to do that and try to solve the deployment question separately. One possible approach would be to give a heads up to Wikipedias + known users, temporarily have both (possibly marking the old one with "soon to be deprecated"), and eventually remove them after a reasonable time period. ElanHR (talk) 21:37, 24 March 2021 (UTC)
Yes, having it as a qualifier of P159 is really bad (breaks pretty much every tool on wikipedia/commons that uses coordinates!). Much better to keep coordinate location (P625) with applies to part (P518) as a qualifier. Thanks. Mike Peel (talk) 13:44, 16 March 2021 (UTC)
It's bad for Commons because Commons infoboxes don't handle coordinates as qualifiers? Isn't that something that should be fixed there? --- Jura 09:31, 17 March 2021 (UTC)
It's bad in *every* *single* *case* I have seen P625 being used. On Wikipedias, Commons, in tools like Wikishootme, in SPARQL queries, everywhere. Literally no-one uses coordinates that are qualifiers. I don't know how it came about, but there is no way that should be the convention here. Thanks. Mike Peel (talk) 19:00, 17 March 2021 (UTC)
Yeah. About that, Mike. Jheald and I are using coordinates as qualifiers in what seems to be a very useful fashion, for GB1900 data. It allows for very easy crosschecking of GB1900 coord data against the item's coord - example - and can indicate either an invalid GB1900 ID attached to an item, or an invalid main P625 in that item. Not very inclined to think we should not be using this because of templates. --Tagishsimon (talk) 19:30, 17 March 2021 (UTC)
However, in part the point about the GB1900 qualifiers is that in that case we don't want them to show up as regular coordinates, because we know that they will be slightly off (the GB1900 coordinates are for the first letter of the label on the map, not the actual thing). Jheald (talk) 19:45, 17 March 2021 (UTC)
OK, that sounds like a specialised case, I'm not sure it applies to the general situation. Thanks. Mike Peel (talk) 19:48, 17 March 2021 (UTC)
My personal view is that adding statements with the qualifier "applies to part" instead of adding them directly to the part is an unstructured approach (insert whatever adjective you prefer). --- Jura 19:39, 17 March 2021 (UTC)
I have used coordinate location (P625) to qualify origin of the watercourse (P885) and mouth of the watercourse (P403) on rivers. @Mike Peel: is that wrong? MSGJ (talk) 11:43, 19 March 2021 (UTC)
@MSGJ: There should also be a coordinate for the river as a whole as well... Thanks. Mike Peel (talk) 11:48, 19 March 2021 (UTC)
Rivers are long things. It's not intuitive to put a coordinate on such things. (I believe the convention is to use the mouth of the river.) MSGJ (talk) 11:51, 19 March 2021 (UTC)
Rivers are also physical things that have locations, or a range of locations which can even change occasionally. Companies on the other hand are abstract entities that exist in minds and databases of accountants, lawyers and governments. Ghouston (talk) 05:14, 20 March 2021 (UTC)
It's common to use P625 as a qualifier on place of burial (P119), see for example Marilyn Monroe (Q4616). Most users agree that humans should not have a P625 statement, but their birth place, place of residence, or place of burial have coordinates. That's why we add the coordinate as a qualifier to these properties and not as a statement on the item. With the same consideration, we should not add coordinates to business items. A business doesn't have a coordinate, but its headquarters, factories or shop buildings have one. I understand that it is more complex to write SPARQL queries and Wikipedia templates to retrieve qualifiers, but imho structured data is more important than how easily the data can be retrieved. --Pasleim (talk) 16:59, 23 March 2021 (UTC)
@Pasleim: I'm not sure I agree with humans, particularly place of burial, but anyway. Businesses and headquarters are often synonymous, so a coordinate there makes perfect sense, with a suitable qualifier. It's exactly the same logic that you are saying, just with the structured data the other way around - you have a coordinate, then you qualify what it applies to, rather than trying to qualify something with a coordinate. Why should someone have to look through every property to see if it has a coordinate, rather than just asking for the coordinate(s) directly? Thanks. Mike Peel (talk) 19:11, 24 March 2021 (UTC)
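To illustrate the retrieval cost being discussed: with both conventions in circulation, a cautious re-user has to look for the coordinate both as a direct P625 statement and as a P625 qualifier on headquarters location (P159). A minimal sketch, using COSI Columbus only as an example item:
SELECT ?org ?coord WHERE {
  VALUES ?org { wd:Q5013508 }          # COSI Columbus (Q5013508), as an example
  { ?org wdt:P625 ?coord . }           # coordinate as a direct statement
  UNION
  { ?org p:P159 / pq:P625 ?coord . }   # coordinate as a qualifier of headquarters location
}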
I removed the problematic job. I didn't find any bot approval or community consensus for these moves. I'll ask the operator (User:Pasleim) to undo the recent edits. Multichill (talk) 20:42, 17 March 2021 (UTC)

Item edit request

Please add "Romania" for Wikipedia Tay language on Romania (Q218). Thanks in advance!!! --2001:B07:6442:8903:89B2:A6E0:1C69:BED4 08:45, 23 March 2021 (UTC)

Please add "Tunisia" for Wikipedia Tay language on Tunisia (Q948). Thanks in advance!!! --2001:B07:6442:8903:89B2:A6E0:1C69:BED4 08:47, 23 March 2021 (UTC)

Please add "Cyprus" for Wikipedia Tay language on Cyprus (Q229). Thanks in advance!!! --2001:B07:6442:8903:89B2:A6E0:1C69:BED4 08:49, 23 March 2021 (UTC)

Please add "Cuba" for Wikipedia Tay language on Cuba (Q241). Thanks in advance!!! --2001:B07:6442:8903:89B2:A6E0:1C69:BED4 08:50, 23 March 2021 (UTC)

Please add "Dominica" for Wikipedia Tay language on Dominica (Q784). Thanks in advance!!! --2001:B07:6442:8903:89B2:A6E0:1C69:BED4 08:52, 23 March 2021 (UTC)

The Tay wikipedia is still in the incubator and as such is not a valid target for Wikidata. Circeus (talk) 23:28, 23 March 2021 (UTC)

@Circeus: It is no longer in the Incubator; the main page is here.--2001:B07:6442:8903:1CDA:B98F:9D5A:D11 08:53, 24 March 2021 (UTC)

If you create an account, you can ask for confirmed status at Wikidata:Requests_for_permissions/Other_rights#Confirmed and then add them directly yourself. --- Jura 10:10, 24 March 2021 (UTC)
+ added it to Wikidata:Status_updates/Next#Other_Noteworthy_Stuff. --- Jura 11:48, 24 March 2021 (UTC)

Change property type

I created Deutsche Bahn station code (P8671) as a string but apparently it should be an external identifier. Is there any method to change the data type or would it be necessary to create a new property of the correct type? MSGJ (talk) 21:12, 15 March 2021 (UTC)

@MSGJ: It is possible to have the datatype amended between string and external ID. I did the opposite for Irish Grid Reference (P4091) in the past (see talk page). Basically you should try to look for some consensus for the change and then raise it with the development team. To get you started, I would support this change --SilentSpike (talk) 11:55, 18 March 2021 (UTC)
@Entbert, Jheald: please can you confirm your support for this change? MSGJ (talk) 11:36, 19 March 2021 (UTC)
  Support --Entbert (talk) 19:35, 19 March 2021 (UTC)
  Support -- Jheald (talk) 20:46, 19 March 2021 (UTC)

Thank you. I have opened a ticket to get this changed. MSGJ (talk) 11:29, 25 March 2021 (UTC)

If an official website URL is on the spam blacklist, you can't edit its qualifiers

/pol/ (Q39045417) has the constraint error "This official website statement is missing a qualifier language of work or name". I try to fix it by adding the qualifier language of work or name = "English", but I get an error that the URL is on the spam blacklist. But I am not adding a URL on the spam blacklist, just trying to modify the qualifiers of an existing statement! I think this is a bug. The spam blacklist should not prevent modifying the qualifiers and/or references of existing statements, when the added qualifier/reference doesn't itself reference anything on the spam blacklist. Mr248 (talk) 03:06, 20 March 2021 (UTC)

How to add a property to Template:Anatomy properties

I want to add new property Cephalopod Ontology ID (P9334) to Template:Anatomy properties, but when I try to edit it I get this: "#invoke:Property navigation|navbox|anatomy"

Is there documentation somewhere that I can read to learn what this means and how I can actually add the property to this template? UWashPrincipalCataloger (talk) 08:42, 20 March 2021 (UTC)

  • @GZWDer: I noticed a while ago that your proposed solution makes it hard to discover how to edit the template. Maybe we should revert to the old format till we have a new format that allows users to discover how to add new entries? ChristianKl12:18, 21 March 2021 (UTC)
  • It would have been good to link to the template being discussed. Clicking edit there should have made it possible to find the module being called. --- Jura 11:41, 25 March 2021 (UTC)

Automating language of work annotation

In response to Wikidata:Project_chat#Bot_idea_-_automatically_add_language_of_work_to_reference_URLs I'm considering writing a bot to add the language of work or name (P407) qualifier to relevant URL properties by fetching the URL and using an off the shelf language detection model. But I figured such a proposal would be controversial and since so few people look at Wikidata:Requests_for_permissions/Bot I thought I'd bring it up here.

I did a short write-up of my proposal at User:BrokenSegue/P407 analysis. The results are a little more muddy than I'd like, honestly, but doing it more cleanly would require more time than I have right this second.

General Questions: Is this a non-starter? Do I need to do more work? Or would people support such a bot as-is? BrokenSegue (talk) 02:30, 24 March 2021 (UTC)
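As a rough illustration of how the candidate statements could be selected (this is only a sketch of the selection step, not the full bot), for example official website (P856) claims without a language of work or name (P407) qualifier can be listed with:
SELECT ?item ?url WHERE {
  ?item p:P856 ?statement .
  ?statement ps:P856 ?url .
  FILTER NOT EXISTS { ?statement pq:P407 [] . }   # no language qualifier yet
}
LIMIT 1000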

  • Personally, if I look for the official website, I don't really care about the language it is in.
For references, most of the time, there is just one and, if there is a need to check the statement, one has to work with that reference whatever language it is in.
Obviously, some re-users might have different views on this, but might not depend on the language qualifier to determine the website language.
OTOH, I don't really see a problem with adding such qualifiers. --- Jura 10:54, 24 March 2021 (UTC)
    • I think the value of having language on official website is so that wikis consuming the information can annotate the link or select the proper link. I agree the value on references is lower. BrokenSegue (talk) 14:32, 24 March 2021 (UTC)
  • I've been wondering how this is supposed to work with sites that seem to do language auto-detection - there are a number of places where if I go to the website I get redirected to a /en/ url (because presumably it thinks I want English) but I suspect other people would go to a different language. How do we handle this? ArthurPSmith (talk) 19:48, 24 March 2021 (UTC)

Merge Q105729329 and Q348579

Please someone check and merge the following Wikidata pages:

Thank you in advance. Naluna (talk) 16:16, 24 March 2021 (UTC)

  • trawling (Q105729329): method of catching fish; principle by which netting bags are being towed in water to catch different species of fishes in their path
  • trawl net (Q348579): type of fishing net; construction of nets dragged behind a ship through a sea allowing large-scale capturing of fish

Above, the same items with their English descriptions. That might explain the difference between the two. --- Jura 17:12, 24 March 2021 (UTC)

They should NOT be merged because they are clearly different ... but some wikipedia pages are connected to the wrong one. Taylor 49 (talk) 10:31, 25 March 2021 (UTC)
Improved them. Some more wikipedia links probably have to be moved. Taylor 49 (talk) 10:57, 25 March 2021 (UTC)

suggestions for Q1430613

May I suggest someone updates the "official website" entries for the Dutch Society for Sexual Reform (Q1430613). Personally, I likely lack the Wikidata skills to accurately pull this off, given the qualifiers, references, multiple entries, etc. Currently, Wikidata just says http://www.nvsh.nl/english, but that is a dead link (404 page). First of all, their (native) Dutch website is https://www.nvsh.nl/. But additionally, they have four internationally-oriented websites (domains); something you can verify by visiting their native website and then clicking the related language flags in the top right. The four other languages are French, German, English, and Spanish. Respectively https://www.artdelamour.fr/, https://www.zaertlichsein.de/, https://www.sexualskills.co.uk/, and https://www.sobrehacerelamor.es/. Thank you. --143.176.30.65 10:58, 25 March 2021 (UTC)

I updated the info. Data are only kept up to date thanks to the community. I encourage you to register and have a look at Project:Community_portal. The interface is relatively intuitive, for next time. --Meskalam (talk) 12:11, 25 March 2021 (UTC)

Disambiguate wikidata and wikipedia

I would like to document blocking (Q2630964) (currently 0 statements). When looking at the label translations I run into trouble: the linked Wikipedia pages (for those languages I can read) are "misleading". I am not the first to notice (it was already mentioned in 2019) and we are thinking of splitting it. Discussion is here: https://en.wikipedia.org/wiki/User_talk:Tigraan#Didascalies_%E2%80%B3Blocking_(Theater)%E2%80%B3_(&_caption_') Is it recommendable to create a new item in this case? Should I look for further discussion somewhere first? Meskalam (talk) 10:58, 25 March 2021 (UTC)

Backlog of property creations?

There are over 40 property proposals that haven't been modified for more than 100 days -- does anyone know more about what is causing the backlog? Are there insufficient property creators, or some technical issue, or what? JesseW (talk) 04:31, 6 March 2021 (UTC)

We're now down to 232 open proposals, and only 23 that haven't been edited in more than 100 days! (95 of them have been edited in the last 2 weeks.) Progress. JesseW (talk) 19:29, 15 March 2021 (UTC)

  • What's the progress if you merely close them as not done? --- Jura 14:37, 16 March 2021 (UTC)
    • @Jura1: See @BrokenSegue:'s comment above. As a non-property-creator, all I can do is judge whether or not the given discussion suggests consensus. In many cases, it doesn't. Marking them as such moves them off the list, making it clearer which ones still need to be considered, and (as it has in some cases) may prompt interested parties to further discussion or to clarify their support. That's the progress. I'd love your help with it. JesseW (talk) 03:44, 19 March 2021 (UTC)
    • @JesseW: this is a list of modeling questions to solve, with or without suggestions on how to solve them. The actual number or age of the proposals doesn't really matter. From property creators it's expected that they provide some level of support to this instead of merely closing them. E.g. at Wikidata:Property proposal/population connectée you could have asked the proposer if they want to revise the proposal. Noting the absence of consensus without pinging them isn't really getting things done. --- Jura 09:19, 19 March 2021 (UTC)
      • You are welcome to ping them; if they have the proposal on their watchlist, the mere act of editing does so already. They (and you) had four months to respond to the opposition expressed, and didn't. And marking a proposal as not done doesn't prevent it from being re-opened, or a new proposal created. It just allows people looking for proposals to understand what is and isn't actively being considered. You may consider the list of property proposals as not actually proposals, but "modeling questions to solve", but that's not what the guidelines for them say, or what is asked of people creating them. What is asked there is to provide a very specific proposal to solve a specific lack in Wikidata, and to do so in a manner that generates minimal to no objections and at least some active support. Lots of the ones there don't (yet) do that, and distinguishing the ones that do from the ones that don't (yet) I still think is helpful. JesseW (talk) 13:26, 19 March 2021 (UTC)
        • Proposals are generally not re-opened, but new proposals written or the current one re-formulated. Property proposal discussions are also there to help formulate a suitable proposal. I think it's better that you stop closing them, especially as you think it's for others to ping people in your place. --- Jura 13:36, 19 March 2021 (UTC)
            • @Jura1: I urge you to ask @BrokenSegue:, who suggested I take on those closing duties. If you have an objection, bring it up with them, not me. I am glad to be more aggressive with using {{Ping}}. Also, please work to clarify the instructions on when to mark a proposal as "not done" if you consider the current instructions to be unacceptable. JesseW (talk) 13:50, 19 March 2021 (UTC)
            • BrokenSegue didn't ask you to close any of them. Maybe you can add a diff if I missed something. --- Jura 13:54, 19 March 2021 (UTC)
              • I don't see you mentioned in that sentence, nor you being asked to do anything or being yelled at. They are just expressing their view on some of the proposals. --- Jura 14:16, 19 March 2021 (UTC)
              • It was in direct response to my asking what the cause of there being so many proposals open, some for multiple years. I don't see how that can be interpreted as anything other than "You can and should close the ones lacking consensus.". But you have successfully pressured/intimidated me into, at least, staying far away from any proposal you have commented on, so congratulations on that, at least. JesseW (talk) 14:30, 19 March 2021 (UTC)
  • @Jura1: @JesseW: To clarify, I wasn't specifically suggesting that JesseW close them but I am perfectly fine with non-admins/property creators closing proposals given that the admins are so incredibly backlogged. Such a backlog is unhealthy in itself especially if conversations have stalled out for months. I haven't read all of the particular closed proposals but the linked ones seem fine to me. Note that this was also brought up at Wikidata:Administrators'_noticeboard#Special:Contributions/JesseW where zero admins felt the need to chime in. If admins cannot even be bothered to comment on this then I don't think it's especially problematic. BrokenSegue (talk) 15:23, 19 March 2021 (UTC)
  • It's not particularly an admin task. Given the responses to feedback by JesseW (or the absence thereof), I still feel it's preferable that they don't close these. If a user feels they are being "yelled at" or "intimidated" when their doing is being discussed, they may not be suitable for this. --- Jura 12:46, 20 March 2021 (UTC)
    • "Discussed" is a interesting term for one single person (you) trying (repeatedly) to prevent this work (closing proposals that have been open for far longer than the minimum, but have very clearly lacked consensus or support) from being done. I didn't, and don't, object any of the other people who have brought up concerns (and I don't even object to you arguing for your (idiosyncratic) view of the Property Proposal process as a set of open-ended modeling questions). I objected to you telling me, repeatedly, to just not do the work. But as I said, you already won -- all you have to is comment on every single proposal (as you nearly do already), and you can safely avoid having me, at least, ever touch them. JesseW (talk) 14:26, 20 March 2021 (UTC)
      • I don't find your tone and approach particularly constructive and concerns about you closing property proposals were raised on the admin noticeboard (which is why I commented here again). --- Jura 15:06, 20 March 2021 (UTC)
  • At least one (Wikidata:Property proposal/ESPN men's college basketball team ID) was unopposed; there were no comments at all so should it be reopened? Peter James (talk) 13:21, 24 March 2021 (UTC)

This process is opaque. I proposed two properties at Wikidata:Property proposal/Ukrainian romanization that have been sitting idle with no opposition for 135 days. Am I supposed to read somebody’s body language to understand that it “should be closed as "no consensus" or rejected”? —Michael Z. 14:01, 24 March 2021 (UTC)

Property for the refuse-type scope of a recycling centre

Is there a suitable property to specify the set of types of material handled by a recycling centre? We have analogous properties - product or material produced or service provided (P1056), typically sells (P7163) - but I've not spotted a suitable property for this occasion. --Tagishsimon (talk) 23:57, 22 March 2021 (UTC)

Maybe uses (P2283)? But it sounds like it may be a property we are missing. ArthurPSmith (talk) 17:55, 23 March 2021 (UTC)
Wouldn't that be the input of a process? Input and output of processes are properties that are certainly needed. --SCIdude (talk) 15:18, 24 March 2021 (UTC)
@Tagishsimon: what about the following: some process --> has part(s) (P527) --> material, with qualifier: object has role (P3831) --> reactant (Q45342565) / artificial object (Q15401930)? This could be stated either on the plant item or on a process item which is part of or used by the plant. --SCIdude (talk) 16:43, 25 March 2021 (UTC)

Entries for 'state-owned enterprise'

I'm not sure what the difference is between state-owned enterprise (Q270791) and public enterprise (Q17990971) - the descriptions seem to refer to the same thing, worded in different ways. I notice with the interwiki links there are almost no language overlaps - suggesting that they are the same concept - but there are 3 or 4 languages where each one points to a different article. I also notice that each has the property partially coincident with (P1382) with the other - which suggests someone has looked at both and deemed that they do refer to different concepts. Any thoughts? At the bare minimum the English descriptive text could provide a bit more clarity. -- Chuq (talk) 01:16, 23 March 2021 (UTC)

The items look like duplicates, but additional items are probably needed for Wikipedia articles which are about slightly different concepts: some language Wikipedias have articles sitelinked to both. Ghouston (talk) 04:54, 23 March 2021 (UTC)
languages with two articles:
Having had a glance at the Estonian, Italian, Portuguese and French ones: in the first three cases, one is the general term as understood in English (a technically commercial enterprise that happens to be owned by a state), the other is a specific organisation of state-owned enterprise. In Italian it may be that "Azienda pubblica" is more general than a commercial state-owned company, as it seems (?) to encompass public administrations like administrative regions... In French, one of the articles is on the mirror aspect; literally it's "shareholder state". I cannot say about the Japanese pair, Google being too unhelpful. It kinda looks like one refers to state-owned services (e.g. state-owned television, postal services, public transportation...) and the other to general companies?
Either way, none of the extraneous ones in these pairs look to be synonymous to each other. Circeus (talk) 17:32, 24 March 2021 (UTC)
Did you notice that the English label of Q17990971 was "public enterprise" until two months ago when someone changed it into the current one? [11] To my eyes, the difference is not negligible. For example, it seems to make more sense to call Canadian Broadcasting Corporation a public enterprise (as fr:Entreprise_publique says), than a state-owned enterprise. I think the change only made things confusing and I'd revert it. whym (talk) 12:44, 24 March 2021 (UTC)
Can you establish exactly what the difference is? en:Public enterprise is just a redirect. In any case, it doesn't seem that all of the Wikipedias with duplicate articles are referring to the same two concepts, so it's likely that more than 2 items are needed (although it's possible suitable items already exist somewhere.) Ghouston (talk) 21:24, 24 March 2021 (UTC)
One alternative model is that an organization may not be directly owned by a government, e.g., it's registered as some kind of non-profit organization, but which obtains a lot of its funding from the government. I assume we'd just declare these to be whatever kind of non-profit organization they are. University of Tasmania (Q962011) is an example I know of, nobody "owns" it, and it's not a branch of government, it's incorporated within the state of Tasmania. Ghouston (talk) 21:45, 24 March 2021 (UTC)
Yeah, but universities are rarely, if ever considered as "enterprises". That term is, in this case, mostly a fancy word for "company" given the treatment in most languages I've glanced at. Circeus (talk) 01:49, 25 March 2021 (UTC)
In the case of the Canadian Broadcasting Corporation, the opinion at en:Crown corporations of Canada is that they are a "specific form of state-owned enterprise", i.e., there's no obvious distinction from "public enterprise". The BBC (Q9531) is a British version. Ghouston (talk) 21:47, 25 March 2021 (UTC)
The BBC is a "corporation incorporated under Royal Charter" [12] ... my observation of UK non-governmental organisation is that most of our P31s are wild-assed guesses & usually wrong, and our ontology in this area greatly lacking. I've come across one or two reliable sources detailing the specific nature of 600 or so UK non-government orgs, but haven't had the time to do much with them yet. --Tagishsimon (talk) 22:47, 25 March 2021 (UTC)
It seems that the Japanese Wikipedia article ja:公企業 (public enterprise) makes a specific distinction between "公企業" (public enterprise) and "国有企業" (state-owned enterprise). According to ja:公企業#所有と経営の主体, "state-owned enterprise" refers to enterprises owned by the national government, while "public enterprises" include enterprises owned by both national and local governments (the latter is called "地方公有企業" or "local public enterprise"). --Stevenliuyi (talk) 02:58, 25 March 2021 (UTC)

Volleyball league points system

Hello. I need help for adding the league points system (league points system (P3195)) for volleyball when the system is:

a) when a team wins 3-0 or 3-1 in sets, it takes 3 points and the other team 0 points; b) when a team wins 3-2 in sets, it takes 2 points and the other team 1 point.

I created volleyball point system according to sets (Q106069676), but I am not sure if it is correct (using 3-0 sets (Q27210420), 3-1 sets (Q27210425) and 3-2 sets (Q27210429)). (Please also read Property talk:P3195#Difficult point system).

Thanks. Data Gamer play 22:59, 19 March 2021 (UTC)

@Data Gamer: Yes, the volleyball point system according to sets (Q106069676) is a little difficult to implement here in Wikidata, you're right. The current proposal is fine from my point of view, but I would also go with you and say that it is not yet perfect. I find it a little inconvenient that, for example, there is a 3-2 sets (Q27210429) for two points as well as for one point. It is actually logical which points would be awarded, but it still does not really come out of the statement itself.
A first suggestion from me would be that perhaps the data objects 2-3 sets, 1-3 sets and 0-3 sets are also created in addition to the previous data objects. This would make it clear that the "first named" team is always the team that gets the points.
A second suggestion would be to take the statements about the property points awarded (P3260) out of the data object volleyball point system according to sets (Q106069676) and say that the volleyball point system according to sets (Q106069676) consists of the possible results 3-0 sets (Q27210420), 3-1 sets (Q27210425) and 3-2 sets (Q27210429). Only in these data objects do you then go into the points gained. In addition, the property won sets could also be proposed in this context. --Gymnicus (talk) 10:36, 26 March 2021 (UTC)
@Gymnicus: Thanks for your suggestions. I think I have added them all. Please check volleyball point system according to sets (Q106069676). And please also check the possible results objects like 2-3 sets (Q106211942). I have added applies to part (P518) as a qualifier with home team (Q24633211) and away team (Q24633216). Data Gamer play 20:26, 26 March 2021 (UTC)
@Data Gamer: This looks very good. Then today or tomorrow - let's see when I get to it - I will propose the property won sets. Because these can then be built into the possible results. --Gymnicus (talk) 20:36, 26 March 2021 (UTC)

Which property/value to set to indicate a statue of a fictional character without their own item?

Hello, I was wondering what the best practice is for adding a statement to a statue to indicate that it depicts a fictional character. For example, the following statue, Indian Hunter (Q17194242), is of a fictional hunter. I have previously used depicts (P180) with the item of another page for the character (e.g. The Tempest (Q27940684)), but this character isn't named. Assigning depicts (P180) to fictional character (Q95074) doesn't seem right, but that may be it. Akadouri

why not make an item for them? i don't think there's a solution without that. BrokenSegue (talk) 02:36, 25 March 2021 (UTC)
There are statues that depict vague figures like an anonymous man/woman/soldier/etc. It seems like the entry for the statue and the depiction would be duplicating a lot of information, and the item wouldn't exist outside the context of the statue either. I am not too familiar with Wikidata, but if that is the way, I can do that. Akadouri 3:14, 25 March 2021 (UTC)
If the subject is truly anonymous then I would suggest making a "statue of soldier" item (a subclass of statue like statue of Jesus (Q29168168)) and marking it as an instance of that. If the subject is uniquely identifiable but unnamed then you could consider making an item for the subject depicted. BrokenSegue (talk) 03:29, 25 March 2021 (UTC)
Why not directly "depicts=hunter"? --- Jura 12:17, 25 March 2021 (UTC)
That's the question I had: is that OK to do, since depicts (P180) usually refers to an instance of a character or person? There is also additional metadata like sex or gender (P21), which would end up on the statue item itself if that were the case. I don't know if that's really an issue either, or if those properties are meant to only be on people/characters. Akadouri 15:52, 25 March 2021 (UTC)
I don't think there's any such restriction, and the value can be a class, e.g., as on Cat Statuette intended to contain a mummified cat (Q29385765). Ghouston (talk) 21:38, 25 March 2021 (UTC)
I think depicting a non-specific thing is exactly depicting a member of a class of things. Depicts human (Q5) (qualifiers man (Q8441) Indigenous peoples of the Americas (Q36747) hunter (Q1714828)) + dog (Q144). Oddly, I can’t find an item for American Indians (the above is a broader category that also includes Inuit and Yupik peoples). —Michael Z. 03:00, 26 March 2021 (UTC)
It's an alias on Indigenous peoples of the Americas (Q36747). Ghouston (talk) 09:36, 26 March 2021 (UTC)

Notability misunderstanding to be cleared up

I want to lay a subject to rest (get something off my chest).

Regarding Wikidata:Notability, and also regarding the edits I made to Temple Grandin (Q232810) attempting to add her parents, I'd like to say this.

I interpret the Wikidata notability guidelines as follows, now with the real example of Temple Grandin:

This is an explanation of what I originally intended, so with this I also say I don't get the "Adam and Eve" argument, since with this we won't add "Adam and Eve". So if you have no further feedback, I hereby lay this subject to rest. LotsofTheories (talk) 05:51, 25 March 2021 (UTC) + minor edit LotsofTheories (talk) 05:52, 25 March 2021 (UTC) + 1 more minor edit LotsofTheories (talk) 05:54, 25 March 2021 (UTC)

Wikidata notability, in regard to families, means you have to prove they really exist with a reference, or an external identifier like FamilySearch or one of the other genealogical sites. Wikidata notability has nothing to do with being famous; just prove that the entry is not a hoax. In theory everyone who is dead can be added, but there is no value in adding everyone, since it will dilute finding the person you are actually searching for. Perhaps one day we will be able to properly add and disambiguate all dead people and somehow flag the famous people that most researchers are looking for. Findagrave, for instance, flags famous people. --RAN (talk) 20:06, 26 March 2021 (UTC)
It is already discussed at Wikidata:Project_chat/Archive/2020/04#Help,_should_I_add_parents_of_all_notable_persons?. Wikidata uses a very different standard to determine notability than Wikipedia. Please however note there are no contemporary people with known generation-by-generation descent from Adam and Eve.--GZWDer (talk) 16:34, 25 March 2021 (UTC)
  • Don't forget to add in the family tree template at Commons. See Commons:Category:Temple Grandin. It provides an easy to use navigation device between generations, and lets you know if you have an error or omission in concatenating generations at Wikidata. Recognizing an error using raw Wikidata entries can be maddening, but once you see the family displayed graphically, errors and omissions are easy to spot. --RAN (talk) 19:58, 26 March 2021 (UTC)
  • "Wikidata:Notability point 3. does not apply to Temple Grandin's parents unless they are notable" for Wikidata the notability policy defines what's notable. Adding information about parents of a person like Temple Grandin is a task that our notability policy is designed to allow. ChristianKl13:20, 27 March 2021 (UTC)

ART+sport+FEMINISM, in the year with (out) OLYMPICS

Are you interested in sport? The Olympics? Their failures and successes? Consider supporting and joining the creative and critical work on ART+sport+FEMINISM https://w.wiki/38f6 (also get in touch if you have interesting content donations/refs). -- Zblace (talk) 09:38, 27 March 2021 (UTC)

Public domain

I am looking for something along the lines of copyright status with "determination method = released into public domain" for a document that is voluntarily released into the public domain. Does anyone know if we already have a QID for something like that? --RAN (talk) 20:10, 26 March 2021 (UTC)

Use copyrighted, dedicated to the public domain by copyright holder (Q88088423). Multichill (talk) 16:42, 27 March 2021 (UTC)

How to represent measurement units in Wikidata?

For example, gigatonnes of carbon (GtC) or GtC/yr: should they each have a different object, and what kind of relationship should they have? FischX (talk) 15:44, 27 March 2021 (UTC)

See Help:Data_type#quantity if you haven't already. JesseW (talk) 16:30, 27 March 2021 (UTC)
I'd like to query properties and names of units, not use them on objects. FischX (talk) 20:29, 27 March 2021 (UTC)
Hello @FischX: Each unit receives a Wikidata ID. For instance gigatonne per year (Q106232369). You can query properties using the query service:
select * where {
  wd:Q106232369
    rdfs:label ?label ;
    wdt:P5061 ?symbol ;
    p:P2370 [
      ps:P2370 ?conversionFactor ;
      psv:P2370 / wikibase:quantityUnit / rdfs:label ?coherentSIUnit ;
    ]
  .
  filter (lang(?label) = "en" && lang(?coherentSIUnit) = "en")
}
Try it!
If a given unit does not exist feel free to create it or ask for help. Eventually we'd like to have an item for each unit which has ever been used, but we are not close to that, yet.
About "gigatonnes of carbon": As a general recommendation from ISO 80000, the qualification "carbon" would be part of the quantity, not the unit. For example:  . Practically this means that a Wikidata property would indicate "carbon" either in its name or using a qualifier and then simply use the item for gigatonne. Toni 001 (talk) 21:33, 27 March 2021 (UTC)
About the question how units are related: They are indirectly related via the quantities they measure. For instance, mass change rate (Q92020547) has a defining formula (P2534) which indicates how this quantity relates to mass and time. Then, say, gigatonne (Q106232521) is related to mass via measured physical quantity (P111). Toni 001 (talk) 21:45, 27 March 2021 (UTC)
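A small follow-up sketch of that indirect relation, starting from gigatonne (Q106232521) and walking to the quantity it measures and that quantity's defining formula (the OPTIONAL is there because not every quantity item has a formula yet):
SELECT ?unit ?unitLabel ?quantity ?quantityLabel ?formula WHERE {
  VALUES ?unit { wd:Q106232521 }               # gigatonne, as an example
  ?unit wdt:P111 ?quantity .                   # measured physical quantity
  OPTIONAL { ?quantity wdt:P2534 ?formula . }  # defining formula, if present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}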
Thank you! That helps me a lot! FischX (talk) 22:44, 27 March 2021 (UTC)

Undo merger

Can someone undo the merger at Vincenzo Trani (Q4013558)? It merged a contemporary businessman and a police officer from the 1800s. --RAN (talk) 01:46, 30 March 2021 (UTC)

Done. --Tagishsimon (talk) 01:57, 30 March 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 10:26, 2 April 2021 (UTC)

Undo merger of Q16938506 and Q21892966

Can someone undo the merger at Dog Creek (Q16938506)? I merged a river in Australia with a tributary of the Fraser River in BC, Canada. Iketsi (talk) 15:23, 31 March 2021 (UTC)

I think I have done this, pls check--Ymblanter (talk) 19:04, 31 March 2021 (UTC)
Fully fixed now ... correct instances and different descriptions for different rivers. Taylor 49 (talk) 20:16, 31 March 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 10:26, 2 April 2021 (UTC)

Hi,

An anonymous editor removed a lot of data items from this item. Lisa Ann ended her career last year, but I'm not sure how to act on the removal of all this data... Can some one help? Ciell (talk) 20:56, 31 March 2021 (UTC)

This has been reverted by Succu. MSGJ (talk) 21:15, 31 March 2021 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 10:26, 2 April 2021 (UTC)

Wikimedia disambiguation page

When we have something like Richard Freudenberg (Q65018447) should we be somehow listing all the entities that are being disambiguated? --RAN (talk) 18:39, 28 March 2021 (UTC)

Based on my experience, I would say you can, but you don't have to. That is certainly always a matter of taste. --Gymnicus (talk) 20:54, 28 March 2021 (UTC)
There is no good reason to, IMHO. The item represents a disambiguation page on 1 or more Wikimedia pages. It is not in itself a disambiguation page (nor should it function as one, in my opinion). Different Wikis will have different numbers and combinations of entries on their respective disambiguation pages. We needn't and shouldn't try to model every permutation on Wikidata, just as we shouldnt try to reproduce the various levels of detail found on idiosyncratic Wikipedia articles: a stub in English Wiki may be a 100 kb Featured Article in French Wiki, both entirely independent of the number of properties on the corresponding Wikidata item. -Animalparty (talk) 04:38, 29 March 2021 (UTC)
different from (P1889) might be a simple solution. Should help prevent "false" merges --Bouzinac💬✒️💛 07:32, 29 March 2021 (UTC)
different from (P1889) should be used for item pairs that are likely to be confused, and/or that are commonly confused in other databases and literature, not merely any number of arbitrary items with a similar name. John Smith (Q228024) and John Smith (Q376158) are unlikely to be confused as they share almost no characteristics beyond name. But as humans, neither should be confused with a non-human Wikimedia object like John Smith (Q245903). Similarly, we need not and should not list every item in a given category or list like Category:People of World War I (Q8756687) or list of mountains in Albania (Q1304622). -Animalparty (talk) 17:17, 29 March 2021 (UTC)

Properties for registration conditions and view conditions

I want to know if the site has properties and/or constraints to describe registration restrictions (e.g. region, mobile phone number / email) and access restrictions (publicly visible, visible to ordinary users, paid viewing and trial, etc.) on services (website, software, or wider). For example, I may want to know if a service requires registration with a mobile phone number or credit card, and whether a website or area is generally limited to certain users.--YFdyh000 (talk) 20:59, 28 March 2021 (UTC)

Partly relevant are use restriction status (P7261) and access restriction status (P7228). Circeus (talk) 18:10, 29 March 2021 (UTC)

Wikidata weekly summary #461

Suggestion for human (Q5) items

I would suggest a bot for faster name in native language (P1559) imports: for each item that has a human (Q5) statement, the bot reads the item's native language (P103), reads the name from the Wikipedia article, and adds it as name in native language (P1559). --2001:B07:6442:8903:D853:F41A:2B53:217F 08:56, 26 March 2021 (UTC)

native language (P103) is often missing, though. I sort of don't like setting it because it seems like basically a guess in most cases. If I guess, I use inferred from association (Q91611664) as the heuristic, as on Feodosy Krasovsky (Q358004). Ghouston (talk) 09:31, 26 March 2021 (UTC)
I think this definitely could be a helpful way to improve coverage but I agree with @Ghouston: that any heuristic used to increase coverage should be marked as such (personally I wish this were mandatory) in order to allow people to ignore these inferred statements if they're looking for a high precision signal. ElanHR (talk) 03:15, 27 March 2021 (UTC)
@ElanHR: any samples of references you consider "high precision signal" for this statement? --- Jura 13:18, 27 March 2021 (UTC)
You may get lucky and find an interview where the subject explains how he was actually brought up with Chinese as his first language despite being born in Germany and having a German name. If somebody is brought up in the language you'd expect, I doubt it would ever be mentioned. Some people may put it on their CVs. Ghouston (talk) 23:25, 27 March 2021 (UTC)
@Ghouston: If native language (P103) is often missing, the bot could use country of citizenship (P27) or place of birth (P19) to derive native language (P103). Could that work? --2001:B07:6442:8903:A838:1DF0:E90D:9189 08:49, 29 March 2021 (UTC)
It would just be guessing in bulk. A lot of correct statements would be added, and also quite a lot of errors. Not every country has a single language that everybody speaks. Some people move from one country to another. Some people have immigrants as parents and maybe learn an official language of the country as a 2nd language. Ghouston (talk) 10:33, 29 March 2021 (UTC)
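To give a sense of the scale of guessing involved, here is a sketch query listing a small sample of humans that have a country of citizenship but no native language (P103) statement yet (only a sample; the full set is huge, and the query may need further restriction to avoid timeouts):
SELECT ?person ?countryLabel WHERE {
  ?person wdt:P31 wd:Q5 ;                        # human
          wdt:P27 ?country .                     # country of citizenship
  FILTER NOT EXISTS { ?person wdt:P103 [] . }    # no native language statement yet
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100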

Unparished areas

I wonder of I could ask for a sanity check. I've been recently adding a lot of statements like:

Teversal Manor (Q15979549) located in the administrative territorial entity (P131) Ashfield (Q600996)

with qualifier

statement is subject of (P805) unparished part of Ashfield (Q105909200)

(eg: Q15979549#P131).

Question: is this an appropriate use of statement is subject of (P805) ? Or should P805 be kept for when statement and item have a 1:1 relationship (as for eg statement <--> encyclopedia article item links) ? Is it appropriate to use if many statements will link to eg unparished part of Ashfield (Q105909200), so the item is not distinctly about any single one of the statements?

Background
see also: Wikidata:WP EMEW/Subdivisions/Unparished

For most of England the lowest level in the "located in the administrative territorial entity (P131)" chain is a civil parish (Q1115575). But for historical reasons, not all of the country has civil parishes. (eg in some towns there were municipal authorities instead, which then got abolished, and not replaced).

So in some parts of the country, the lowest level in the P131 chain is a local authority. In some of those cases not all of the local authority is unparished, so there is a distinction between eg Ashfield (Q600996) and unparished part of Ashfield (Q105909200).

The question is how to represent this in wikidata. In particular how to indicate that when P131 = <local authority>, that is because there is no local parish. It's also useful to be able to query just which places are in eg the unparished part of Ashfield (Q105909200), eg to produce a map like https://w.wiki/36zB (note also that the 'unparished part' of an LA is not necessarily even a single contiguous unit -- see eg the area around Canterbury).

One way to model this would be to treat the 'unparished part' of an LA like a 'quasi-parish' and include it as a level in its own right in the P131 chain, eg:

Teversal Manor (Q15979549) --> (P131) : unparished part of Ashfield (Q105909200) --> (P131) : Ashfield (Q600996) --> (P131) : Nottinghamshire (Q21272736)

However, this approach has not been taken, for at least a couple of reasons:

  • Firstly, because some (including for example OSM) take a strong view that an unparished area (Q7897276) represents the absence of an administrative entity: it is not an administrative entity in its own right, so should not be presented as one.
  • Secondly, because keeping unparished part of Ashfield (Q105909200) out of the P131 chain may be friendlier to reusers. With the qualifier structure the infobox at c:Category:Teversal_Manor accurately presents the location as "Teversal, Ashfield, Nottinghamshire, East Midlands, England". In contrast, the repetition in a form like "Teversal, unparished part of Ashfield, Ashfield, Nottinghamshire" may seem clumsy and mechanistic, and tends to cause complaints, unless special code is written to avoid it.

So if instead we have just

Teversal Manor (Q15979549) --> (P131) : Ashfield (Q600996)

with some appropriate qualifier(s) to indicate that there is no civil parish, and to link to unparished part of Ashfield (Q105909200), are there any views as to what that qualifier structure should be; and, in particular, which qualifier or qualifiers are most appropriate to use?

Thanks, Jheald (talk) 21:31, 18 March 2021 (UTC)

I'm confused... why is Teversal Manor (Q15979549) located in the administrative territorial entity (P131) unparished part of Ashfield (Q105909200) not the correct statement here? unparished area (Q7897276) is still ultimately a descendant of administrative territorial entity (Q56061) in wikidata, even if it's not got an administration of its own. Circeus (talk) 14:03, 21 March 2021 (UTC)

It's in the label of the qualifier, at least in English: "located in the administrative territorial entity", so non-administrative entities are invalid. I'm not sure a parish really qualifies either. Ghouston (talk) 21:26, 21 March 2021 (UTC)
Then your problem is with the ontological hierarchy of wikidata, not anything about the property or the way it should be formatted when the target is these items. Compare various items like unincorporated territory of the United States (Q783733), unorganized territory (Q1550680) or Unorganized North Sudbury District (Q3518813), all of which are proper descendants of administrative territorial entity (Q56061) and defined in much the same way as unparished part of Ashfield (Q105909200). Circeus (talk) 01:08, 22 March 2021 (UTC)
  • @Jura1, Sic19, Ghouston, Circeus: Thanks for all taking the time to look and think about this. Given that there are now 41,000 items using the form above, I'm afraid I haven't rushed to change anything unless we can all agree a change would be an improvement.
@Circeus: I do agree that the norms are pretty loose about what sorts of items can be values for P131, so long as they fit into a good hierarchical structure. Thus in the UK neither region of England (Q48091) nor ceremonial county of England (Q180673) have much administrative role, but it can be very useful to locate them in the hierarchy from districts up to the country as a whole, especially to allow querying of places within them, and easy ordering and grouping of counties and districts etc. So IMO views like OSM's stance, that an unparished area is the absence of an administrative area, are interesting and worth noting, but not decisive.
But what did weigh more heavily with me was what would be more useful for reusers. Per a discussion I had with User:Mike Peel ([13]), people do "complain about redundant information in the category tree (e.g., Bob, City of Bob, Municipality of Bob, Greater Bob, etc.)" That pushed me towards trying to avoid "Teversal Manor, unparished part of Ashfield, Ashfield, Nottinghamshire". Even though Commons infoboxes could probably be hand-tweaked to avoid showing "unparished part of ...", in general most reusers would probably not be so aware. Do we really want "unparished part of Ashfield" showing up in their output, when this is such an esoteric non-everyday fact to note about the place? I didn't think we did; but I am happy to be over-ruled if people think this would be the best solution here.
As for "part of" (@Jura1:) -- to me "part of" as a qualifier tells me about the object of the statement, that it (all of it) is part of some wider entity. I do think there can be a bit of a perennial issue here: is a qualifier informing about the subject or the object of the statement? For some predicates, it can be far from consistent or clear. But seeing "part of" here, I would expect it to be telling me, as an information note, that "Ashfield" was part of "unparished part of Ashfield", not something about "Teversal manor".
@Sic19: You suggested valid in place (P3005), which again is an interesting thought. But again, it doesn't seem to me particularly comfortable. That "Teversal Manor is in Ashfield" would appear to be true regardless of where one is; its validity does not seem particularly conditional on being in the unparished part of Ashfield.
One other qualifier I have used recently was location (P276) on historic county (P7959), eg where <place> historic county (P7959) Yorkshire, to specify which riding it was in; or where <place> historic county (P7959) Hampshire, to specify if it was on the Isle of Wight; since in both cases these have been historic counties, but are not the historic counties intended for P7959. But I am not sure P276 would work particularly well here either.
What I suppose we're really looking for is "applies to part of object". But given that that doesn't exist, is the use of statement is subject of (P805) acceptable here, or is it something where we really do need to find a better semantic? Jheald (talk) 19:37, 22 March 2021 (UTC)
statement is subject of (P805) is, at least in my opinion, definitely not appropriate. In the way I read that property, here it would mean that an item exists for "location of Teversal manor", or at least about a relevant border conflict that impacts how the location is defined (e.g. Newfoundland and Labrador–Quebec border (Q16025037) with regards to some areas of Quebec and Labrador, or territorial disputes in the South China Sea (Q2405450)). An entirely distinct territorial entity to the one in the base property certainly isn't a valid target. What you seem to want is a weird equivalent to directions (P2795)... Circeus (talk) 21:15, 22 March 2021 (UTC)
I find myself leaning towards Teversal Manor (Q15979549) located in the administrative territorial entity (P131) Ashfield (Q600996) applies to part (P518) unparished part of Ashfield (Q105909200) as indicating that the immediate administrative unit is Ashfield because there is no appropriate parish-level unit. But I may be reading that into the statement knowing what I am looking for. - PKM (talk) 23:58, 22 March 2021 (UTC)
Thanks @PKM: But I really think applies to part (P518) should consistently be kept to indicating a part of the subject, not part of the object of the statement. Jheald (talk) 08:39, 23 March 2021 (UTC)
@Jura1: I think P361 makes sense for comparing an area with an area. But it seems a bit odd for a specific monument to be "part of" an unparished area. Particularly as we might be wanting to use "part of" to identify it being part of a particular small group of items -- eg an assemblage of heritage items all part of a particular estate (very common). Using P361 as a P131 substitute would interfere with that. Jheald (talk) 09:33, 23 March 2021 (UTC)
Good point. Maybe the qualifier idea should be dropped entirely and just "location" used as the main value. --- Jura 12:08, 25 March 2021 (UTC)

Islamic time

I guess that this question has already been discussed, but I couldn't find any help. Thank you for any pointers.

How can I enter an Islamic / Judaic / Rumi calendar date? I would like to use it for publication dates. Of course there are conversion tools (link below), but this is not optimal (and not very inclusive).

https://www.aoi.uzh.ch/de/islamwissenschaft/hilfsmittel/tools/kalenderumrechnung.html Meskalam (talk) 09:06, 30 March 2021 (UTC)

@Meskalam: I’m afraid currently only the Julian and (proleptic) Gregorian calendars are supported. T252627 is the general tracking task for additional calendar models, and T131593 is the specific task for the Hebrew calendar. (As far as I can tell, we have no dedicated tasks for supporting the Islamic or Rumi calendar yet.) --Lucas Werkmeister (WMDE) (talk) 09:17, 30 March 2021 (UTC)
@Lucas Werkmeister (WMDE): Thanks for the quick reply. For what it's worth, this (diff) is my attempt to record it until new calendar models become available: İstanbul'dan Asyâ-yi Vustâya seyâhat (1st book edition) (Q106282126). I am happy to modify it if there are other ideas.
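(Purely as an illustration of what is stored today, a small query sketch showing the calendar model recorded for that item's publication date; with the current limitation it can only be the proleptic Gregorian calendar (Q1985727) or the Julian calendar (Q1985786).)

SELECT ?time ?calendarModel WHERE {
  wd:Q106282126 p:P577/psv:P577 ?value .           # publication date, full value node
  ?value wikibase:timeValue ?time ;
         wikibase:timeCalendarModel ?calendarModel .
}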

Outreachy

Hello everyone, my name is Jannath. I am so excited to be at this level of the application and am looking forward to contributing and learning.

hi. BrokenSegue (talk) 16:22, 30 March 2021 (UTC)

Suggested addition to our bot policy concerning descriptions

Hi. I would like all our bots to add descriptions in addition to labels.

I'm unsure how to proceed. I could write a policy suggestion on a userpage. If I go ahead and do that, where is it discussed and voted on? Here?

In Wiktionary they have a special place for votes, but I cannot find anything similar here. --So9q (talk) 07:50, 29 March 2021 (UTC)

We are more taciturn. You can put together an RfC, but probably few will turn up. You can discuss at Wikidata talk:Bots, but ditto. Here is good, though. One of the issues with 'all our bots to add descriptions in addition to labels' is that it may well drive the addition of useless descriptions simply to surmount the new hurdle, rather than make an actual improvement. Besides which, descriptions in which languages? And if not all, why are you selectively promoting one or a few languages? Why are you not trusting the process? One person does part of the job. Another person does the next part of the job. That's how things tend to work around here. Why are you wanting to require person one to do more than they otherwise would, simply because you happen to think it is a good idea? I predict this suggestion will not fly. --Tagishsimon (talk) 07:58, 29 March 2021 (UTC)
@Tagishsimon: Thanks for your comment. I see that Wikidata.org is terrible for the user to search on at the moment. How hard can it be to fix a description? @belteshassar: recently imported a ton of stuff from Riksdagen and the Swedish Courts, and he added descriptions to every single item he created, if I'm not mistaken. If he can, then I guess everyone willing to improve the project also can. How hard can it be? Do we really want millions of indistinguishable items that turn up in CirrusSearch and result in a bad UX? I say, if the bot authors cannot fulfil this requirement they are better off not adding data and letting someone else do it properly. Let's discuss further below.--So9q (talk) 08:18, 29 March 2021 (UTC)
'How hard can it be' does not answer any of the questions I posed. The prism through which you are looking at wikidata is not shared by all: CirrusSearch is a very small part of the way in which wikidata is used. --Tagishsimon (talk) 08:33, 29 March 2021 (UTC)
@Tagishsimon: Thanks for your reply. I agree that CirrusSearch is a small part, and perhaps the UX that it produces should not guide/influence policy decisions as I suggested below. See my comment to Belteshassar below, which would sidestep the issue (by fixing the CirrusSearch drop-down) and make the suggested change to the policy unnecessary.--So9q (talk) 08:43, 30 March 2021 (UTC)
I can only take credit for the Swedish Supreme Court decisions. I think descriptions should be added in at least one language when creating a new item, whether the user is silicon or carbon based, but I think your suggested requirement is unnecessarily strict. A label in an additional language is useful, even without a description. Belteshassar (talk) 08:44, 29 March 2021 (UTC)
@Belteshassar: I see; the problem is that searching in a language where only a label exists will also result in a bad UX, if I'm not mistaken. It might be a better solution to change CirrusSearch to completely ignore labels that do not have descriptions in the same language, effectively sidestepping the whole problem. The labels could then still be found using the "normal" search (that is, if the user presses enter in CirrusSearch, the normal search is run and everything shows up there, but labels without descriptions are excluded from the drop-down suggestions). Better yet, see the suggestion of auto-generating descriptions outlined by BrokenSegue below.--So9q (talk) 08:43, 30 March 2021 (UTC)

According to the policy-template on Wikidata:Bots: All changes made to it (except for minor edits such as fixing typos) should reflect consensus. When in doubt, discuss your idea on the project chat.

I therefore ask the community if there is consensus to add the following to Wikidata:Bots under the section Bot requirements under a new subsection Labels and descriptions:

  • All bots adding or changing labels for a language must make sure that there is a relevant description also in the same language. (This is to help avoid a lot of entries in CirrusSearch where items with the same label become indistinguishable and thus an annoyance to the user browsing on wikidata.org and searching for something specific.)

Discussion

Asserting that a user has added 267,137 new items mostly (or entirely?) without adding any descriptions, without providing any evidence whatsoever that the user has added 267,137 new items mostly (or entirely?) without adding any descriptions is suboptimal. Singling out a single user, without evidence, for a general issue, is sketchy at best. --Tagishsimon (talk) 09:29, 29 March 2021 (UTC)
@Tagishsimon:Thanks for reminding me to provide evidence for my statements. I have removed the sentence because as you say this is a general issue not related to one or a few contributors.--So9q (talk) 08:43, 30 March 2021 (UTC)
  Comment I have struck my support because I no longer believe the proposal to be necessary. We should fix CirrusSearch as suggested by @brokenSegue: below (auto-generating descriptions on the fly from P31/P279 where none exist yet) and/or have good bots that fill in meaningful descriptions instead. If anyone has an idea how to write such a good bot, I would be happy to contribute to writing it (in Python using WikibaseIntegrator). I'm only aware of @edoderoo:'s description bot working on nl descriptions; would that be a good start?--So9q (talk) 08:43, 30 March 2021 (UTC)
There used to be a descriptioner tool, but it seems to have disappeared recently :-( I do have my own scripts for this, and if a certain SPARQL query (or any other generator) can auto-generate a description in any language for items where the description is currently blank, I can be of help in either providing the Python script and/or running it. Feel free to contact me on my talk page/Telegram/etc. Edoderoo (talk) 08:52, 30 March 2021 (UTC)
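As a starting point, here is a minimal sketch of the kind of generator query meant above: it lists items of one (arbitrarily chosen) class that lack an English description and proposes the English label of the instance of (P31) class as a raw candidate. This is exactly the kind of crude output a real bot would need to refine per class and language.

SELECT ?item ?itemLabel ?candidateDescription WHERE {
  VALUES ?class { wd:Q11424 }                      # arbitrary scope for the sketch: films
  ?item wdt:P31 ?class .
  FILTER NOT EXISTS {                              # no English description yet
    ?item schema:description ?existing .
    FILTER(LANG(?existing) = "en")
  }
  ?class rdfs:label ?candidateDescription .        # class label as crude candidate description
  FILTER(LANG(?candidateDescription) = "en")
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100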
  •   Support While I support the motion, note that CirrusSearch could be fixed to show automated descriptions derived from P31/P279 statements, like Mix'n'Match shows for items potentially matching an unmatched catalog entry. If there are no P31/P279 statements in the item then it's pathological anyway. --SCIdude (talk) 16:07, 29 March 2021 (UTC)
  •   Comment Why not, as has been discussed before, have bots go around populating descriptions? A bot that fills in a label might not know how to best fill in a description properly. We might end up with bots populating very unhelpful descriptions. BrokenSegue (talk) 16:13, 29 March 2021 (UTC)
  •   Oppose Having descriptions is good, of course, but writing good automatic descriptions is difficult, which is why we don't have more bots doing this already; saying "it must be done" doesn't make it possible to do it well. This will just encourage automatic addition of badly-generated descriptions, or discourage people from doing the label work in the first place. Andrew Gray (talk) 17:08, 29 March 2021 (UTC)
    • Yeah, this is basically my thinking. Is there a list of bots trying to do automatic description completion and the logic behind them? Maybe we can unify the logic and make contributing to the logic easier for non-programmers. BrokenSegue (talk) 17:11, 29 March 2021 (UTC)
Interesting idea about unification. I'm not sure it's the best way forward though. See Edo De Roo's nl-description bot code.--So9q (talk) 08:59, 30 March 2021 (UTC)
@So9q: man, that code is complex. I wonder if we couldn't simplify it though. A lot of that information should be included in wikidata already (e.g. the mapping from "Japan" to "Japans"). There's also a lot of boilerplate code there. I'm imagining a basic domain-specific language (Q691358) that non-coders could edit to manage these. So a user would write configuration that says if is(Q1983062) and uses(P179) then { english: 'an episode of ${labelOfValue(P179)}' } . Maybe it's not possible or worth it. You clearly have more expertise here. BrokenSegue (talk) 13:42, 30 March 2021 (UTC)
  Oppose I very much want to have descriptions for all items. But creating a description might be tangential to creating the item, and requiring it might cause friction in item creation. Anecdotally, I just saw on my watchlist that a bot had added simple descriptions in over 50 languages to an item for an article that I had created - somehow I was waiting for that to happen because it's a natural thing to be done by bots. The bot did it probably because some other data was there, say, "instance of" or an external identifier. I'd keep the bar for bot edits as low as: the item needs to be identifiable, be it by a good description or by (a combination of) properties. Toni 001 (talk) 21:37, 29 March 2021 (UTC)
  •   Oppose This is an ill- or barely-thought-out proposal, the proposer for which doesn't seem willing to engage with questions already raised about description language, the perverse incentive to create poor-quality descriptions, and the opportunity cost of promoting descriptions to must-have-at-creation status. Neither has any evidence been adduced that there is actually a problem of any real significance to be solved by the proposed rule. How is my search improved by a rule which requires a Swedish contributor to add a Swedish description? I don't speak Swedish. I don't search in Swedish. Why is a description more important than the coining of a descriptionless item with a sitelink in response to a language wikipedia article? Why should we lose the ability to search for the text of the sitelink in the situation that the cost for the bot operator of calculating a worthwhile description is so great as to render the addition of the item unfeasible? Why, if per Help:Description, "the description on a Wikidata entry is a short phrase designed to disambiguate items with the same or similar labels", is it proposed that a description be mandatory for items with labels completely distinct from all other items? Why is the proposer's interest in descriptions of greater moment than my interest in getting done whatever the hell I'm doing, for my own legitimate reasons? Why do I suddenly need to satisfy the whim of someone who, based on the proposal, thinks that descriptions are something (strings to aid CirrusSearch) that, again, per Help:Description, they are not? Why is a description more important than, say, a mandatory P31 or P279? -- Tagishsimon (talk) 23:46, 29 March 2021 (UTC)
  Oppose - it looks to me that we would be moving the issue that CirrusSearch has to other tools. It would be good to have decent descriptions, but adding them manually does not work (maybe it will when we have 100 times more active users in a language, and maybe it works for English), but making it mandatory just to make a tool (that I never heard of until today) work better is not a good policy. Edoderoo (talk) 10:41, 30 March 2021 (UTC)
  Comment Some bots unfortunately frequently change their mind about what the optimal description is, and mass-replace a useless description added by the same or another bot with another similarly useless description. When 2 items have to be merged, there are conflicting descriptions in an unknown number of languages (since they are reported one by one), both based on the same "instance of" and added by bots having no understanding of the item at all. Taylor 49 (talk) 16:42, 30 March 2021 (UTC)
I never had this issue with merges, but I'm sure that manual edits will give you the same behaviour, and probably even worse? Edoderoo (talk) 17:20, 30 March 2021 (UTC)
Sure they will, but they should at least add relevant, useful descriptions (as opposed to bot-guessed ones), and the human editors should improve the quality of the item, and maybe even discover the need for a merge and perform it. Taylor 49 (talk) 17:27, 30 March 2021 (UTC)
  Oppose The way the text is written now it would prevent many improvements to Wikidata. If an item has label and description in one language, then just adding labels in any additional language is adding value to the item and should be encouraged rather than discouraged. Ainali (talk) 14:14, 31 March 2021 (UTC)

Outreachy project

Hi all. I'm mentoring an Outreachy student project that is focused on improving the synchronization of content between Wikipedias and Wikidata, partially based on my work with Pi bot. If you're interested, you can follow the project at phab:T276329 - at the moment the project has been advertised and interested applicants are getting in touch and trying the introduction tasks. Feel free to comment on the phabricator tasks, improve the intern task instructions, and/or watch my user talk page - but remember that this is an educational project so please don't give answers too easily. ;-) (Also, I'm still looking for a co-mentor if anyone is very interested in this!) Thanks. Mike Peel (talk) 20:08, 30 March 2021 (UTC)

Are they going to have to apply for bot access? Or is this going to happen under one of your accounts? Mainly I'm just worried about data quality. Otherwise it seems like a great project. BrokenSegue (talk) 20:18, 30 March 2021 (UTC)
@BrokenSegue: To start with they are trying different starting tasks, which will involve a small number of edits under their own usernames. Then there will be a selection based on how well the tasks have gone, and one will work on the main task. That will need bot approval, either separately or as a new pi bot task. Thanks. Mike Peel (talk) 10:04, 31 March 2021 (UTC)

BUG: "intentional sitelink to redirect" does not work

On Special:SetSiteLink I check "intentional sitelink to redirect", but the result is still "Site link XXX:YYY is already used by item ZZZ (Q000). Perhaps the items should be merged. Ask at d:Wikidata:Interwiki conflicts if you believe that they should not be merged." I do not believe that they should be merged. They are different on some wikis. I believe that this is a bug. Otherwise the "intentional sitelink to redirect" badge is useless. Taylor 49 (talk) 17:01, 30 March 2021 (UTC)

That is correct. I understand that this feature is not yet working. MSGJ (talk) 17:12, 30 March 2021 (UTC)
Thanks. The status quo of having to vandalize the target wiki, add the link, and then revert one's own vandalism, with the "intentional sitelink to redirect" hint/arrow not appearing, is suboptimal. This really should be fixed. And other-than-intentional links to redirects should be disallowed, unless I am missing something crucial. Taylor 49 (talk) 17:23, 30 March 2021 (UTC)
@Taylor 49: The latter badge was created to allow it to be added, at scale, to existing sitelinks-to-redirects that need evaluating. Existing sitelinks to redirects may have been created intentionally; or they may have been created by merges on the target wiki. In some cases an automated process might be able to make a strong guess that the "intentional sitelink to redirect" is appropriate (eg, especially if the redirecting wiki page contains the template "redirect from wikidata"). But in other cases a bot would not be able to make such a guess. It is useful in the latter case to be able to flag the sitelinks as linking to redirects, even if such an automated process cannot determine any more about them, just flag them as a caution for users, and an indication for further investigation.
Unfortunately, as I understand it, the system currently refuses to allow any badges to be easily added for existing sitelinks to redirects, which is why (as of the present) the badges have not yet been widely deployed by bots. Jheald (talk) 21:44, 30 March 2021 (UTC)
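Once the badges are set they are straightforward to query. A minimal sketch listing English Wikipedia sitelinks that carry any badge at all; restricting it to the two redirect badges would just mean substituting the relevant badge items for ?badge.

SELECT ?item ?sitelink ?badge ?badgeLabel WHERE {
  ?sitelink schema:about ?item ;
            schema:isPartOf <https://en.wikipedia.org/> ;
            wikibase:badge ?badge .                # any badge, including the redirect ones
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100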

Redirects

@Lydia Pintscher (WMDE): Any update on Wikidata:Project chat/Archive/2021/02#Redirects 2? Eurohunter (talk) 23:06, 30 March 2021 (UTC)

Sorry I got swamped with some other things and didn't manage to adjust the task to reflect the new consensus to get it ready for the developers :( I'll do this today. --Lydia Pintscher (WMDE) (talk) 13:05, 31 March 2021 (UTC)
I see. This has been discussed a few times before. The status quo (tested yesterday) is that it is not possible to add links to redirects, even to a section, without performing junk edits (AKA vandalism) at the target wiki [14]. Ideally, changing
  • an ordinary page to a redirect (I consider ca 99% of such edits as vandalism ...)
  • a redirect to an ordinary page
  • page redirect to section redirect
  • section redirect to page redirect
should adjust the badges at wikidata (in the same way as moves and deletes are handled), with possible values:
  • redirect level (2 mutually exclusive options):
    • page
    • section
  • reason for redirect (2 mutually exclusive options):
    • deliberate
    • dubious (is this needed at all? should be permanently prominently deprecated at least?)
I still see no point in having 2 independent badges, "redirect" and "deliberate redirect", allowing for 4 distinct states. Furthermore, there are too many redirects on some wikis, particularly -en- wikipedia, making it difficult to find relevant information and to link via wikidata. Taylor 49 (talk) 20:19, 31 March 2021 (UTC)
Deliberate links from wikidata to redirects should probably be allowed ONLY if the redirect CANNOT BE FOLLOWED, ie an attempt to follow it gives the infamous "already taken" error. Taylor 49 (talk) 20:36, 31 March 2021 (UTC)

Kamikaze pilots

Hello, I don't know if there is a data problem, but it's odd seeing only a minority of deaths (12 WWII deaths out of 62 people) in that query...--Bouzinac💬✒️💛 14:43, 31 March 2021 (UTC)

I imagine that those that lived, matured into careers that made them more notable. --RAN (talk) 01:04, 4 April 2021 (UTC)
Some of the entries are a little questionable. According to the enwiki article on Noboru Ando (Q7046038), he trained as a pilot and then transferred to a suicide-frogman unit, but was not deployed before the war ended. Nothing in the article suggests he was a kamikaze pilot. However, I am not familiar with the source material, so it is possible that he was briefly assigned to a kamikaze unit. Also, as an aside, are we clear on the terminology here? "Kamikaze" is normally applied to Japan's suicidal air units but I have noticed some amateur historians refer to Japan's suicidal land or sea based units as "kamikaze" as well. Are we applying the broader definition for him being in a frogman unit? From Hill To Shore (talk) 01:33, 4 April 2021 (UTC)

How to add an identifier to a wikidata page

Hello everyone. I'm a relatively new user and am asking for help because I can't figure out how to add identifiers to data pages. Any help would be greatly appreciated.  – The preceding unsigned comment was added by PunkWillNeverDie44 (talk • contribs).