Wikidata:Project chat/Archive/2020/09

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Hi. Someone has differentiated these items into the literary character and the work. Whilst I understand that this can be done in the item sense, I don't believe that it reflects the WPs and sister interwiki links, and here I am looking at Commons, English Wikipedia and English Wikiquote. Can someone who deals with these blurred/ambiguous categorisations please have a look.

The English Wikisource edition needs to have its parent edition, and let me say that linking to an unlinked supposed representative item of the work is not appropriate when comparing with the content at other sites. Thanks.  — billinghurst sDrewth 22:36, 27 August 2020 (UTC)

  • I believe there is a need for two different items here. The character has certainly had a life of its own outside of the poem. - Jmabel (talk) 00:07, 28 August 2020 (UTC)
    @Jmabel: Apologies for not being clear, my issue isn't the separation, it is the allocation. The articles at enWP and enWQ are more of an amorphous mass, talking about the rhyme, then may link into some aspects of the character, but it isn't the focus. Similarly, the images at Commons have a stronger alignment with the poem, or its representations than the character. At the moment I would have said that the articles don't clearly distinguish between the character and the rhyme, and it is only recently that the differentiation occurred, and I am not sure that the way that the differentiation has occurred aligns with the links.  — billinghurst sDrewth 21:38, 1 September 2020 (UTC)
    So we probably need a "Bonnie and Clyde" item for the conflation. - Jmabel (talk) 00:17, 2 September 2020 (UTC)

Almost a year ago I asked about a change in this gadget and got no answer, so I'm asking here now, as there is another property that needs to be addressed:

  1. canonical SMILES (P233) and isomeric SMILES (P2017) should be UrlEncoded, because there are symbols in SMILES that can break the URL (an example here: Q32089#P2017). Right now many URLs for isomeric SMILES (P2017) are broken.
  2. new property SMARTS notation (P8533) should be added to this gadget and also UrlEncoded (at least 5 symbols have to be encoded, more info here).

Could someone help with this? Wostr (talk) 11:43, 31 August 2020 (UTC)
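
To illustrate the breakage described in point 1, here is a minimal Python sketch. The formatter URL is a made-up placeholder (it is not the gadget's real target), and the helper names are hypothetical. It shows how an unencoded `#` in a SMILES string is parsed as the start of a URL fragment, so the query loses part of the value, while percent-encoding keeps it intact.

```python
from urllib.parse import quote, urlsplit

# Hypothetical formatter URL, for illustration only (not the gadget's real target).
FORMATTER_URL = "https://example.org/structure?smiles=$1"


def link_raw(value: str) -> str:
    """Substitute the value as-is, which is what breaks for some SMILES."""
    return FORMATTER_URL.replace("$1", value)


def link_encoded(value: str) -> str:
    """Substitute the percent-encoded value; safe="" also encodes '/'."""
    return FORMATTER_URL.replace("$1", quote(value, safe=""))


if __name__ == "__main__":
    smiles = "C#N"  # hydrogen cyanide; '#' denotes a triple bond in SMILES
    for url in (link_raw(smiles), link_encoded(smiles)):
        parts = urlsplit(url)
        print(url, "| query:", parts.query, "| fragment:", parts.fragment)
```

In the raw link everything after `#` ends up in the fragment and is not part of the query at all; in the encoded link the whole string stays in the query.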

Here's a query for properties where an example value includes a /: https://w.wiki/b2f .
For most of the properties here the / character should not be URL-encoded, eg: Handle ID (P1184), Rotten Tomatoes ID (P1258), Swedish Open Cultural Heritage URI (P1260), History of Parliament ID (P1614). (It is an ongoing annoyance that Reasonator fails to get this right, always encoding such links, which breaks them).
Perhaps we need a qualifier field on the property items to indicate when such characters are expected to be encoded? Jheald (talk) 18:08, 1 September 2020 (UTC)
      • @Jheald: Thanks for the list of properties with slashes. In this case we need to indicate whether the value should be urlencoded or not. A qualifier would be one way to achieve this. Alternatively we could continue using the $1 escape for properties that must not be escaped and introduce $2 for properties that must be escaped. This has the advantage of being easier to implement, as you don't have to consider qualifiers but just the link template itself. The downside is that the meaning of $2 needs to be documented. Tools that work right now for some instances will stop working at all for urlencoded values unless they are updated. 18:52, 1 September 2020 (UTC)
  • @Wostr: Regarding SMARTS notation (P8533), Uni Hamburg only offers a REST-API for downloading images. I'd find it surprising if clicking on a statement downloads a file. It may be worthwhile to ask them to offer an option to view these items with a link. Alternatively we could operate a viewer on Toolforge that relays requests to the viewer (and caches results). Pyfisch (talk) 09:35, 1 September 2020 (UTC)
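
The $1/$2 idea from this thread could look roughly like the following Python sketch. This is only an assumption about how such a convention might be implemented: `$2` is the proposed placeholder (not an existing feature), `expand_formatter` is a hypothetical helper, and the second formatter URL is a placeholder.

```python
from urllib.parse import quote


def expand_formatter(formatter_url: str, value: str) -> str:
    """Expand a formatter URL under the proposed convention.

    $1 -> value inserted unchanged (e.g. identifiers whose '/' must survive)
    $2 -> value percent-encoded (e.g. SMILES or SMARTS strings)
    """
    encoded = quote(value, safe="")
    # Replace $2 first: the encoded text cannot contain '$' (it becomes %24),
    # so it can never be mistaken for a $1 placeholder afterwards.
    return formatter_url.replace("$2", encoded).replace("$1", value)


if __name__ == "__main__":
    # A handle keeps its slash because the formatter uses $1.
    print(expand_formatter("https://hdl.handle.net/$1", "10419/1234"))
    # A SMARTS pattern is encoded because the (placeholder) formatter uses $2.
    print(expand_formatter("https://example.org/view?smarts=$2", "[#6]"))
```

With this scheme existing formatter URLs keep working unchanged, and only properties that opt in to `$2` get encoded values.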

updating entry for the Royal Society of Sculptors formerly Royal British Society of Sculptors

The name has changed, dropping the 'British'; the name of the Vice President also needs to be changed to Almuth Tebbenhoff. How can I do that?  – The preceding unsigned comment was added by Almuthtebbenhoff (talk • contribs) at 10:42, 1 September 2020 (UTC).

The VP is at w:Royal_Society_of_Sculptors, not Wikidata.
To add the (old and new names) to Wikidata, you could add two statements with official name (P1448) to Royal Society of Sculptors (Q7373851). --- Jura 17:51, 1 September 2020 (UTC)
I added the name changes to Royal Society of Sculptors (Q7373851) and added positions on Almuth Tebbenhoff (Q98761401). Ghouston (talk) 03:57, 2 September 2020 (UTC)

How does WikiData know which four languages to show initially?

When browsing WikiData items, the first four languages shown in the language/label/description/alias table for me are English, German, Persian and French.

How does WikiData determine the four languages to show? --79.249.159.29 11:38, 1 September 2020 (UTC)

If you are logged in, it uses the #babel settings on your user page, if set. Otherwise, it seems to display any preferred languages from the browser settings, but I don't know where they come from otherwise. If I log out, I'm offered Traditional Chinese and Chinese as the 3rd and 4th choices, for some reason. Ghouston (talk) 03:02, 2 September 2020 (UTC)
Perhaps based on geolocation of the IP address and most-spoken languages in particular areas. Ghouston (talk) 04:14, 2 September 2020 (UTC)

Item that need total rewriting

I noticed that the software version values on Rufus (Q25209781) are in random order. How can they be reordered from the earliest to the latest version? Does this need to be done manually? I would like to earn the trust of the Wikidata community, and I am willing to work hard to finish this! How can I add a notice so that nobody reverts my edits until I finish? --151.49.44.242 19:05, 1 September 2020 (UTC)

@151.49.44.242: order does not matter on wikidata. please do not try to re-order them. this data is maintained by a bot. BrokenSegue (talk) 19:36, 1 September 2020 (UTC)
@BrokenSegue: remember you can't ping IPs. --Prahlad (tell me all about it / private venue) (Please {{ping}} me) 00:42, 2 September 2020 (UTC)
oh. BrokenSegue (talk) 01:22, 2 September 2020 (UTC)

See Wikidata:Contact_the_development_team/Archive/2020/06#Sort_statements_of_same_property_when_saving_(June_7). --- Jura 11:22, 2 September 2020 (UTC)

  • If you want trust from the Wikidata community in general it makes sense to register an account. I trust people when I know that they did decent work in the past and the only way I can know that is if the person has registered an account. ChristianKl 18:34, 2 September 2020 (UTC)

Rollback

On enwiki, w:en:RB states that Rollback could be used to mass-undo a misguided editor's edits. I recently found an editor who keeps capitalizing descriptions against the conventions at Help:Description#Capitalization. Can I use rollback to mass-revert them? Let me know. Thanks, --Prahlad (tell me all about it / private venue) (Please {{ping}} me) 01:00, 2 September 2020 (UTC)

Given that WD:D is more or less ignored by a lot of contributors, it might be wise to discuss it first with said person. --Emu (talk) 09:44, 2 September 2020 (UTC)
@Emu: that does not make it right. --Prahlad (tell me all about it / private venue) (Please {{ping}} me) 14:29, 2 September 2020 (UTC)
It's not right but the way errors from well-meaning users get fixed is through discussing the errors with the person (and undoing) and not through rollback. ChristianKl 18:38, 2 September 2020 (UTC)
@ChristianKl: they have already been warned by Hasley. --Prahlad (tell me all about it / private venue) (Please {{ping}} me) 19:59, 2 September 2020 (UTC)
If a person continues editing in a way that violates our policy after that's pointed out to them, it makes sense to bring up the case on WD:AN. ChristianKl 20:01, 2 September 2020 (UTC)
@ChristianKl:   Brought up at AN. --Prahlad (tell me all about it / private venue) (Please {{ping}} me) 20:15, 2 September 2020 (UTC)

Ahmed Ghanem

An editor on enwiki pointed out that the article en:Ahmed Ghanem, about an Olympic hurdler, was linked to articles in other languages about a lawyer and writer. I investigated and found that Q2827576 indeed seemed to blend two different people. Only the enwiki link was about the hurdler, so I removed that link - I also edited the description in English and French, and removed the properties that relate to the hurdler. All the other articles except arz-wiki are about the lawyer (and look to be mechanical translations of each other).

I am pretty sure I was right to remove the properties relevant to the other Ghanem, but should I have created a new entry for the hurdler? (I suspect that he does not meet en-wiki's notability, and am probably going to nominate the article for deletion, but I don't know if that is important to Wikidata.)

A further point: the arz-wiki article linked does not seem to be about him. I think (from Google Translate) that it's about his foundation; but I think that it therefore shouldn't be linked. I have left it, though, as I am not sure. --ColinFine (talk) 15:45, 2 September 2020 (UTC)

Update: I've checked en:WP:NSPORT, and as an Olympian, he is notable. --ColinFine (talk) 15:52, 2 September 2020 (UTC)
This is a case of conflation. We'll need to create new items for all the people involved. Whether or not a subject is notable in their own right (according to WD:N), if they have been involved in a conflation it is probably best to retain their item, if only to ensure the details aren't conflated again later. From Hill To Shore (talk) 16:25, 2 September 2020 (UTC)
Thanks for replying, From Hill To Shore. So what happens now? Does it just stay as a conflation until somebody feels like fixing it? Are there editors who watch for conflations and sort them out? I haven't been active on Wikidata for a couple of years, so I don't know how we do these things. --ColinFine (talk) 17:13, 2 September 2020 (UTC)

We now have Ahmed Ghanem (Q98834041), Ahmed Ghanem (Q98834310), Ahmed Ghanem and cultural program Egypt in the eyes of the World (Q98834514) and cultural program Egypt in the eyes of the World (Q98834482). Ahmed Ghanem (Q2827576) remains with residual statements; it is unclear which subject they relate to. From Hill To Shore (talk) 19:00, 2 September 2020 (UTC)

one Wikipedia entry for two separate things

What do we do when a single Wikipedia entry relates to entities sufficiently different that two different Wikidata entries would be sensible? I'm looking at https://es.wikipedia.org/wiki/Real_F%C3%A1brica_de_Relojes, which covers both a manufacturer and a school. Given the relationship between the two, it makes sense for there to be a single Wikipedia entry, but I'd rather have two Wikidata entries with separate names for linked-data purposes. Since there's a WP redirect from Real Escuela de Relojería to Real Fábrica de Relojes, I thought I could create a Wikidata page for the Escuela (Real Escuela de Relojería (Q98834518)) separate from the Fábrica (Q6101931), but the system didn't like that.

This must be a common problem, but I don't know the common solution. Is there guidance on this someplace? Thanks. -Kenirwin (talk) 18:09, 2 September 2020 (UTC)

Similar to the topic a few sections above, you'll need to set up an item like Ahmed Ghanem and cultural program Egypt in the eyes of the World (Q98834514). From Hill To Shore (talk) 18:33, 2 September 2020 (UTC)
On Wikidata this is known as the "Bonnie and Clyde" (Q219937) problem. Generic guidance is at Help:Modelling/Wikipedia and Wikimedia concepts#Compound Wikipedia articles. (There's also more that could be said, I think.) —Scs (talk) 19:10, 2 September 2020 (UTC)
Thanks @Scs:, @From Hill To Shore: - Kenirwin (talk) 20:24, 2 September 2020 (UTC)
Re linking to redirect pages: it is possible to get around the technical problem by temporarily removing the redirect template from the Wikipedia page, creating the link here, and then restoring the page. There are quite a few such examples here -- Greenwich Hospital (Q96311011) is one that springs to mind. I suppose it's not a bad idea, especially if the redirect page ever gets turned into a full article. — Levana Taylor (talk) 00:00, 3 September 2020 (UTC)

sr-el

Doesn't sr-el work in Babel? Eurohunter (talk) 16:21, 2 September 2020 (UTC)

If there is something that makes you think so and you believe it should work, report it to WD:DEV or Phabricator. --Matěj Suchánek (talk) 13:07, 3 September 2020 (UTC)
@Matěj Suchánek: Done, thanks. Eurohunter (talk) 14:51, 3 September 2020 (UTC)

suggestededit-add 1.0

What is suggestededit-add 1.0? Eurohunter (talk) 16:08, 3 September 2020 (UTC)

Wikidata:Suggestededit-add 1.0. --Matěj Suchánek (talk) 16:56, 3 September 2020 (UTC)

Aliases

Is it ok to add "George Hamlin (1869-1923)" as an alias for the entry George John Hamlin (Q41804451) and other humans? I add them so I can cut and paste the string and use it on photographs and obituaries I load to Commons such as "George Hamlin (1869-1923) in 1920" and "George Hamlin (1869-1923) obituary in the New York Times". I noticed another editor removes them in other entries. --RAN (talk) 15:42, 29 August 2020 (UTC)

Please don't. Aliases should be actual alternative names that are used to refer to the person or subject. It may be worthwhile to add structured data like I did to the files. Pyfisch (talk) 16:18, 29 August 2020 (UTC)
You can simply use the QID on Structured Data on Commons, e.g. depicts Q41804451, and George John Hamlin will appear. -Animalparty (talk) 19:23, 29 August 2020 (UTC)
How is that any different from "Hamlin, George", which I also see in other entries? "George Hamlin (1869-1923)" would be how he would appear in an encyclopedia. We also sometimes use the description "(1869-1923)" or "American musician (1869-1923)" and also "(1869-1923) American musician". There doesn't appear to be a standard to harmonize on. --RAN (talk) 19:59, 29 August 2020 (UTC)
Labels and aliases aren't invisible aids that serve only to assist in identifying an item. They also appear in some infoboxes (e.g. Commons:Creator templates), and look at the atrocity that is Rembrandt (Q5598) (probably dutifully compiled by robots). What good would be accomplished by adding a parenthetical disambiguator or inverted name order that can't be accomplished by simply using QID? While Wikidata is still largely the Wild West (and the sheriff just got shot, and the mayor's a horse...), some proposed guidelines are at Help:Aliases and Help:Label. -Animalparty (talk) 20:29, 29 August 2020 (UTC)
By the way a lot of items have aliases like ########### IPI.--GZWDer (talk) 13:06, 30 August 2020 (UTC)
Yes, we have a lot of robots. Humankind is certainly inefficient. Maybe one day AI will take over and free us from the burden of thought. The IPI values are better off transferred to IPI name number (P1828) -Animalparty (talk) 18:29, 30 August 2020 (UTC)
@GZWDer: The IPI aliases are probably mostly added by me and a few other music editors. They are aliases on songwriters as a timesaver and quick lookup-method when adding composer / lyrics credits. It's simply a helpful tool to have the composer show up in the dropdown when pasting the IPI, especially for people with multiple pseudonyms or very common names. I'm aware that it is a source of irritation to some, but I don't really see the harm ;) Moebeus (talk) 13:23, 4 September 2020 (UTC)
I disagree. Aliases may have (maybe already have) the function to identify the item in texts. And many scientific texts refer to concepts by some database ID instead of its name (label). This may not happen with texts about humans, though. --SCIdude (talk) 06:37, 31 August 2020 (UTC)
Also disagree with Animalparty. The primary function of aliases is for discoverability, i.e. for items here to show up as search hits, and/or be found by item-matching algorithms. In particular: aliases can and should include common mis-spellings and incorrect versions of the name, so that our item appears for them in searches as we would want it to. Actually used names, nicknames, pseudonyms etc should appear as referenced statements, not just aliases. Finally, on the question of Rembrandt, despite all the aliases here, the Commons infobox at c:Category:Rembrandt still looks good, because it knows not to include them. Jheald (talk) 08:27, 31 August 2020 (UTC)

How to rank WPs referencing?

When checking WPs for recently deceased persons, I find that some WPs more consistently reference deaths than others.

jawiki seems by far the most consistent in its referencing. plwiki and enwiki generally work quite well too. enwiki is sometimes a bit inconsistent. dewiki and ruwiki tend to trail, but I'm not really good with Cyrillic. The problem with dewiki is that they tend not to add references to articles, but mention the source somewhere else. --- Jura 07:00, 31 August 2020 (UTC)

You mean which Wikipedia to rely on if there are several given dates? I don’t think a general rule would be appropriate, Wikidata should present the best obtainable version of the truth without relying on any Wikipedia language version (but ideally on third-party sources). --Emu (talk) 09:48, 2 September 2020 (UTC)
No, no. The question is how rigorous WPs are in referencing the dates, e.g. jawiki has (almost) always a reference. --- Jura 10:46, 2 September 2020 (UTC)
Ah. Well, that’s a very good question. I have slightly different anecdotal evidence than you (enwiki is more willing to accept dubious references, dewiki is trying to decrease Newstickeritis, cswiki is often very thoroughly referenced) – but it’s anecdotal. But a clear answer to the differences would require some sort of systematic research. --Emu (talk) 15:47, 3 September 2020 (UTC)
You could try this too. I don't recall cswiki appearing there frequently - possibly because its contributors update Wikidata directly. Some wikis only appear rarely, so it's hard to say something more general about them.
If avoiding Newstickeritis in dewiki means that they add the information, but not the reference they had been using, we probably don't have the same understanding of the purpose of referencing. Sometimes, we see newbie contributors lecturing us with detailed views on referencing and then you see them adding mainly the "official website" and Britannica. ;) --- Jura 08:25, 4 September 2020 (UTC)

Shopping pages as reference

I noticed that in Xu Yuhua (Q238736), the statement sex or gender (P21) is referenced with 3 weblinks to alibaba.com (which of course does not reference the sex or anything else for this person). The user that added this is not active anymore. But besides this case, how can we ensure that no useless or even harmful spam links are added as references? Steak (talk) 19:05, 2 September 2020 (UTC)

@Steak: There is Special:AbuseFilter which prevents links to certain websites. I recall a user asking about changing a link to 4chan/8chan in the project chat a few days ago, which was blocked by the filter. Additionally I tried to find all links to Alibaba in Wikidata references using the query service but only got a timeout. --Pyfisch (talk) 09:09, 3 September 2020 (UTC)
There are 12 links to Alibaba.com in Wikidata. I've also updated Wikidata:SPARQL query service/query optimization#A query that has difficulties section about such queries. --Lockal (talk) 13:05, 4 September 2020 (UTC)
Other references were added by trusted users, [1][2][3][4], @Crazy1880:, @Florentyna:, could you comment about your sources? --Lockal (talk) 13:37, 4 September 2020 (UTC)
From my side it looks like it was added via the Primary sources tool. This time I guess this tool linked to crazy websites. And this is also the reason why I did not use it anymore since then I guess. Alibaba.com in general looks not very reasonable as source. I would swear that I never used Alibaba.com - but reality shows I used it. Florentyna (talk) 13:52, 4 September 2020 (UTC)
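
As a rough illustration of the kind of check discussed in this thread, the Python sketch below tests whether a reference URL points at a given domain or one of its subdomains. It is not how Special:AbuseFilter works; `is_on_domain` is a hypothetical helper and the sample URLs are made up.

```python
from urllib.parse import urlsplit


def is_on_domain(url: str, domain: str) -> bool:
    """Return True if the URL's host is the domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    # Match on the full host label boundary so "notalibaba.com" is not a hit.
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    # Made-up reference URLs, for illustration only.
    references = [
        "https://www.alibaba.com/product/123",
        "https://notalibaba.com/page",
        "https://example.org/biography",
    ]
    suspicious = [u for u in references if is_on_domain(u, "alibaba.com")]
    print(suspicious)
```

Comparing the parsed host rather than searching the whole URL string avoids false hits where the domain name only appears in a path or in a longer, unrelated host name.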

Property like “listed in [database]”

A number of protein superfamilies, such as Metal-dependent hydrolase (Q24768309), have a description of “InterPro Domain”, which is not really a description, but only seems to indicate that they are listed in InterPro (Q3047275) – rather like describing Aldebaran (Q12170) as “GSC star”. The fact that they are in InterPro is already provided by InterPro ID (P2926), so wouldn't it be better to remove those bulk descriptions? Or would it make sense to have a property like “listed in [database]” for them? ◅ SebastianHelm (talk) 03:43, 4 September 2020 (UTC)

  • What is a minimal description, a basic description and good description for this item is a different question from what statements/properties it has. Obviously information from statements can be used for descriptions, but the purpose of them is entirely different. Descriptions should mainly help understand the concept together with the label. There is or was some debate if no description is better than a minimal description.
Maybe @Andrawaag: can explain why that particular description was chosen. --- Jura 04:56, 4 September 2020 (UTC)
  • Ideally all information that's in the description is already provided by the statements in an item. The description field doesn't exist to add information that's not expressed in the statements, even though that sometimes happens for valid reasons.
On Wikidata descriptions help with disambiguation. The information that an item is an “InterPro Domain” helps users to immediately see that the item is not about a particular protein, which is helpful information. It would be possible to have a better description but this description is better than nothing. Hopefully in 2-3 years we will have WikiLambda descriptions that automatically provide better information, so I don't think it's worth investing much effort into improving descriptions like this. ChristianKl 11:53, 4 September 2020 (UTC)

Why does Wikidata differentiate fictional beings from fictional beings in fiction?

For every type of fictional or mythological being on Wikidata, there is also a corresponding fictional version of that fictional being:

There is no clear distinction between the scope of these items as all folklore and mythology is also technically fiction, and even modern depictions of these creatures are still based on the mythological/folkloric versions. Having them awkwardly split in two like this doesn't make sense to me and I think we should merge them. Otherwise, someone wanting to query Wikidata for all ghosts or all orcs is only going to get a subset of relevant results. Kaldari (talk) 17:59, 3 September 2020 (UTC)

Some of these aren't purely fictional as they can form part of the beliefs of a religion (Demon being the most obvious example of the ones you identified). An atheist may describe God (Q190) as a fictional being but it wouldn't be wise to merge the item with God (Q4088979). Even if the religion is extinct (for example, Ancient Greek worship of the Olympian gods) we still need the ability to represent their beliefs in concepts in order to describe history. A purely fictional character based on a mythological figure is a different concept to the mythological figure itself and must be kept separate. Otherwise we could end up describing the legend of Perseus (Q130832) and his flying horse, Rainbow Dash (Q12739035) (a slightly exaggerated analogy there to illustrate the point). From Hill To Shore (talk) 19:29, 3 September 2020 (UTC)
To clarify my point a little, we need to represent two concepts. One, a mythical entity that either our ancestors or our modern day counterparts believe is or was real. Two, a purely fictional representation of the mythical entity that the creator intended for people to think wasn't real. From Hill To Shore (talk) 19:38, 3 September 2020 (UTC)
From Hill To Shore, you seem to feel quite certain about this. Would you feel comfortable adding a criterion to the different from (P1889) statements for each of those? ◅ SebastianHelm (talk) 01:25, 4 September 2020 (UTC)
Project Chat is an area for users to discuss issues and form a consensus. I have given my interpretation of the situation and it is for others to either agree or put forth an alternative interpretation. I'm always happy to abide by consensus. In terms of editing the items, I'm not going to have access to a personal computer for the next 16 hours, so I'll see how the discussion has developed at that point. From Hill To Shore (talk) 01:39, 4 September 2020 (UTC)
This could raise some tricky epistemological questions. Presently we do classify religious texts as works of fiction. We have religions, mythologies, conspiracy theories, and generally rejected scientific theories, all of which people currently believe are true or have believed in the past that they were true. We could have a base class that would be basically "something thought true by a significant number of people but we know it's false", but how do "we" justify knowing them to be false? Because they aren't confirmed by modern science to be true? There exist modern scientists who practice religion regardless. But something like Big Ears (Q4905608) seems to be in a different class, in that even people who believe in sprite (Q20828805) may see him as fictional. What do religious people think about the religious texts of religions they don't follow? Do they consider them works of fiction? We also have things like global warming (Q7942) which maybe we don't want to treat the same way, as something that a lot of people apparently "know" is false, because it's a currently widely-accepted theory among climatologists. Ghouston (talk) 03:31, 4 September 2020 (UTC)
(edit conflict) Big Ears should be easy to classify as fictional, since he has a human (Q5) creator (P170). But doesn't such a criterion, and maybe a few similar ones, already suffice – do we in addition really need to duplicate every type of fictional or mythological being? ◅ SebastianHelm (talk) 04:02, 4 September 2020 (UTC)
I suppose it's just following the current practice in Wikidata: fictional character (Q95074) has 241 direct subclasses and 2214 in total. fictional human (Q15632617) alone is quite popular. Without that, you'd be declaring fictional characters as instance of human (Q5), which seems dubious. Ghouston (talk) 04:59, 4 September 2020 (UTC)
The answer is probably that we can't be in the business of deciding what's true or false, or real or non-existent, but only in grouping them with other types of similar entities. Scientific theories can have varying degrees of support, which can change greatly over time, but they are always scientific theories. Likewise religions and conspiracy theory (Q159535). Conspiracy theories often seem crazy, but occasionally one may turn out to be true, since conspiracies do exist in the real world. Outright fictional works are also in a class of their own, even if a few people may end up believing in some fictional concepts. Ghouston (talk) 03:51, 4 September 2020 (UTC)
So as an aside, I'd make religious literature (Q12617225) a subclass of literary work (Q7725634), instead of fiction literature (Q38072107), since it's really in a class of its own. Ghouston (talk) 05:19, 4 September 2020 (UTC)

  Notified participants of WikiProject Narration --- Jura 05:39, 4 September 2020 (UTC)

Hello. I would like to add to the conversation that things are not necessarily true (be it a religious or scientific truth) or false (fiction or scientific speculation). They can also be in a state of uncertainty and collapse into a certain definition depending on the belief system you use. For example, you could say:

⟨ Perseus (Q130832) ⟩ instance of (P31) ⟨ demigod of Greek mythology (Q23015925) ⟩
according to belief system ⟨ Greek mythology (Q34726) ⟩

⟨ Perseus (Q130832) ⟩ instance of (P31) ⟨ human whose existence is disputed (Q21070568) ⟩
according to belief system ⟨ scientism (Q193626) ⟩

Or even:

⟨ Perseus (Q130832) ⟩ instance of (P31) ⟨ literary character (Q3658341) ⟩
according to belief system ⟨ archetypal literary criticism (Q776439) ⟩

Hopefully my contribution helps to move the conversation forwards. Kind regards. --MathTexLearner (talk) 09:58, 4 September 2020 (UTC)

Depending on the definition of "fiction" (broader or narrower) it is not true that all folklore and mythology is technically also fiction. Encyclopedia Britannica defines fiction as "created from the imagination, not presented as fact, though it may be based on a true story or situation" [5]. (emphasis by me). Casper the Friendly Ghost (Q1442531) does fall into this definition, but not Brown Lady of Raynham Hall (Q991048).
Moreover, even if you think of fiction in its broadest sense, there would be still the need to distinguish between mythological beings and fictional beings in a narrower sense, just to be able to query for all mythical/folkloristic beings without ending up with Count von Count (Q12345), Rainbow Dash (Q12739035), etc.
If you ask for a criterion for the different from (P1889) statement I would give you doxastic attitude (Q5303685). Elements from myth, folk belief (or religion) are/were part of a world view treating them as somehow real. They may be the object of sincere investigation. The same is not true of characters from fiction (probably nobody thinks or thought that Casper the Friendly Ghost (Q1442531) is real). There may be individual cases that are difficult to classify (e.g. Rudolph Fentz (Q673181), a character from a short story that became an urban legend afterwards), but this should and can be dealt with on a case by case basis.

I thought about ways to avoid creating too many classes, but they still have their shortcomings. E.g. to use ghost (Q45529) for both Casper the Friendly Ghost (Q1442531) and Brown Lady of Raynham Hall (Q991048) you would need to delete all subclass statements linking ghost (Q45529) to the sphere of mythology and folk belief and we would need a new way to express that these concepts have their origin in these spheres. One could probably create a new property for that. One could also abandon the current way of expressing that a certain character is represented as human/orc/dog or whatever using P31 and instead use a new property (e.g. "represented as"/"imagined as"). This way one could also stop creating fictional analogues of classes for real entities. But then there would still be the question of how to express that a certain fictional ape species is imagined as a subclass of ape. One could probably create a new property for that as well (e.g. "imagined as subclass of ..."). In the end one would also need to deal with the circumstance that there are Wikipedias having distinct articles for a certain being from folklore and from fiction (e.g. elf in a work of fiction (Q3050815) vs. elf (Q174396)). Should we ignore these items when classifying imaginary entities? Other proposals are welcome. - Valentina.Anitnelav (talk) 14:00, 4 September 2020 (UTC)

Hello Valentina.Anitnelav, thank you for your contribution. It depends. I would not ignore these items when classifying imaginary entities, but I would not make it too complicated either. There are infinite possibilities regarding belief systems, however that is too much for us to handle. We could restrict ourselves to three main areas: science, spirituality, and art. All three are well established and have clear boundaries and ways to generate knowledge and data. Science is verifiable through experiments conducted in the outer world, spirituality has also some degree of verifiability through the direct experience of a person (self-development) or group (religion), and art normally bridges the gap between those two.

From the scientific point of view a "ghost" might be an "instance of -> hypnagogic hallucination", and from the spiritual point of view a ghost can be seen as a "supernatural being". However, from the artistic point of view, a ghost can be seen as an "element of mystery", as it doesn't matter what it is on a rational level, as long as it captures the emotional interest of the reader in the context of a story.

For a scientist studying the case of the Brown Lady, it would be interesting to know how autosuggestion works in his or her own mind. Could he or she find a way to enter the shared hallucination of those who reported seeing the Brown Lady? For the amateur spiritist, it could be more interesting to study the ecology of ghosts. Why are ghosts more prevalent in places of power, and how can that be reproduced in other places? And for the storyteller, it might be more useful to know which kind of atmosphere they can evoke in their readership by inviting this or that ghost into their story.

When does something stop being art and start being spirituality? When enough people want it that way. Lovecraft started the Cthulhu Mythos as a shared fictional universe to make sense of his personal experiences that neither science nor religion could account for, and now there is a group of people bringing it to the next level by developing a set of spiritual practices centered around his literary canon. Rainbow Dash might be seen as a "fictional pony" by rational-minded people, yet if enough people would start considering it as a deity, that is what it would become too.--MathTexLearner (talk) 14:55, 4 September 2020 (UTC)

MathTexLearner, I'm not against classifying phenomena from different perspectives, but there still remains the need to express that some entities are from mythology and folklore as opposed to fiction. There are several branches in the humanities that investigate them as cultural phenomena, e.g. anthropology (esp. folkloristics), religious studies and cultural studies. Wikidata should be a database for all domains, also the humanities and not just the sciences. Mythological, religious and folkloristical entities are described in scholarly literature, they have distinct attributes from fictional entities (in the narrow sense) and should be treated like that. If somebody would classify an entity from fantasy literature as a mythological being this would be simply wrong. - Valentina.Anitnelav (talk) 17:30, 4 September 2020 (UTC)
Valentina.Anitnelav, thank you for sharing your impressions. I wish you good luck with your quest, and with the implementation of your ideas in a practical way for Wikidata. I trust that you will be able to come up with a proposal that satisfies your personal wishes, and those of the community at large. Could you please notify me once your approach is ready to use? I do not have time to continue this discussion any longer, but I will be very willing to use your approach once it is ready.--MathTexLearner (talk) 22:54, 4 September 2020 (UTC)
@From Hill To Shore: So how are we supposed to resolve the huge overlapping grey area between fiction and folklore? Is Cinderella fiction or folklore? What about the Headless Horseman? What about Puss in Boots? What about Mother Goose? Kaldari (talk) 01:59, 5 September 2020 (UTC)
Can you provide works, references and statements you want to model? --- Jura 02:52, 5 September 2020 (UTC)
@Kaldari: From reading the comments above, it appears that we are treating the works of any religion as fiction (or in other words, an untrue invention of humans). That looks like a disaster waiting to happen once people outside of Wikidata pick up on it. Headlines of "Wikipedia's sister project calls religion false" or "Wikidata is run by atheists" could end up triggering a prolonged vandal campaign. While dumping all entities into a big bucket of "fictional" is an easy approach in the short term it is storing up trouble for later. By splitting up the entities to cover concepts of belief then that will create grey areas and they can be tackled as they arise. I'm not pretending to have the answers here; I am just reporting that our current approach is flawed. Your original question was "Why does Wikidata differentiate..." and I have given my view. This view may differ from others and a consensus that disagrees with me may exist. That is fine; I never edit against consensus except by mistake, and I then go back and fix any mistakes I have made. From Hill To Shore (talk) 09:11, 5 September 2020 (UTC)
I made the change I suggested above at religious literature (Q12617225), to separate religious texts from general fiction. I think even atheists will generally see a difference between not believing Qur’an (Q428) and not believing Harry Potter (Q8337). Ghouston (talk) 11:37, 5 September 2020 (UTC)

Property proposal doesn't show on the page

Hello. I have just made a property proposal (rendered TeX string), however it doesn't show in the generic page Wikidata:Property proposal/Generic. Did I do something wrong?--MathTexLearner (talk) 09:22, 4 September 2020 (UTC)

No, it seems there are too many proposals. We have to get some of them closed or moved to another group. --Matěj Suchánek (talk) 10:52, 4 September 2020 (UTC)
Ok, I hope it can be done.--MathTexLearner (talk) 14:58, 4 September 2020 (UTC)

General seems to be a bad place for "rendered TeX string"; https://www.wikidata.org/wiki/Wikidata:Property_proposal/Natural_science#Mathematics would be a better place. ChristianKl 11:46, 4 September 2020 (UTC)

I do not agree with your judgment. TeX is not used only by mathematics. For instance, the copyright symbol is not used in mathematical concepts, and it is still a "rendered TeX string".--MathTexLearner (talk) 14:58, 4 September 2020 (UTC)
Even if the proposed property is more broadly applicable, people in mathematics and natural sciences will have the most to contribute to the discussion, so it is a good idea to put it in Wikidata:Property_proposal/Natural_science#Mathematics. The-erinaceous-one (talk) 07:34, 5 September 2020 (UTC)
As The-erinaceous-one said. The point of listing a proposal is to get feedback from people who understand the modeled domain and see possible issues with it. People in math are more likely to have a good idea. ChristianKl 11:28, 5 September 2020 (UTC)

RfD page is huge

The Requests for deletions page is almost 300 kB now and takes a considerable amount of time to render, diff, and update. It would probably be a good idea to use separate pages for discussions. --grin 10:36, 2 September 2020 (UTC)

Why is WD:RFD so huge in the first place? Seemingly there are many non-notable items and/or spam items in the list that should be deleted. I therefore assume that there are not enough administrators who regularly work through the list and delete these items. I considered applying for admin privileges to process the deletion requests, but I don't think I have enough edits in recent months or enough experience fighting vandalism. --Pyfisch (talk) 10:59, 2 September 2020 (UTC)
Wikidata entities are expected to be stable and therefore not to be deleted except on rare occasions. I don't think the problem is that there is a lack of people to delete, but that either too much low-quality content is created (something that could be solved by requiring, for example, at least one "instance of" or "subclass of" statement, or at least one reference), or the requirements for an Item to remain are too demanding. On top of this, admins cannot easily know whether entities are linked from external services or are being used in some way in other projects, so entities that should have been preserved end up being deleted. As a general rule, I never delete entities except in cases of vandalism, test edits or author requests. --abián 12:15, 6 September 2020 (UTC)
Well, what about when there is crosswiki spam, with multiple articles created about a non-notable subject that have been linked (and all the links ultimately being speedy deleted)? Surely these should be deleted, along with any other cases of blatant non-notability. Also, aren't items automatically created without "instance of" when linking two projects using "add links"? How could that be solved? --IWI (talk) 16:48, 6 September 2020 (UTC)

Representing indigenous Australian languages in title (P1476)

Hi. I just created Coolangatta (Q98925803) and was trying to set title (P1476), which requires a language code. According to w:en:Coolangatta, New South Wales, the name comes from a local aboriginal word. How is it best to represent this in title (P1476)? I can't find any ISO codes for indigenous Australian languages and even if there were, I wouldn't know which one to choose. In this situation would a choice of "mul" be best or is there another alternative? From Hill To Shore (talk) 22:32, 5 September 2020 (UTC)

@From Hill To Shore: I use "und", another one of those magic lang codes. It renders as "Unknown language" in English, but I believe it's supposed to say "Undetermined". Moebeus (talk) 23:07, 5 September 2020 (UTC)
@From Hill To Shore, Moebeus: Use "mis" as the language code and it'll show up in English as "Unsupported language". Then use language of work or name (P407) as a qualifier to specify the language. An example is here: Lingnoonganee Island (Q21961048). --Dhx1 (talk) 11:10, 6 September 2020 (UTC)
Thanks, I've added that as an example on title (P1476). From Hill To Shore (talk) 14:25, 6 September 2020 (UTC)

prices in £sd

I can't find this in Help anywhere (is there a currency help section? Neither the help portal nor Google turned it up). How do I enter the price (P2284) of a book whose list price (as printed on the cover) is in pounds, shillings, and pence? The data entry field wants a decimal number, so it's not possible to enter the price as written. Convert it to a decimal fraction of a pound (10s 6d being 0.525)? — Levana Taylor (talk) 17:14, 6 September 2020 (UTC)

If you want to make calculations with normalized price (P2284) data (i.e. all in decimal), you had better enter it in decimal, with a qualifier stating the original price in pounds, shillings, and pence. Besides, it might be noteworthy that there are few non-decimal currencies currently: Maltese scudo (Q967990), Mauritanian ouguiya (Q207024) and ariary (Q4584). Bouzinac💬✒️💛 19:49, 6 September 2020 (UTC)
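As a minimal sketch of the conversion discussed above, the following assumes the pre-decimal British system of 12 pence to the shilling and 20 shillings to the pound (240 pence per pound). The function name `lsd_to_decimal_pounds` is made up for illustration; it is not part of any Wikidata tool.

```python
from fractions import Fraction

PENCE_PER_SHILLING = 12   # pre-decimal British system (assumption stated in the lead-in)
SHILLINGS_PER_POUND = 20
PENCE_PER_POUND = PENCE_PER_SHILLING * SHILLINGS_PER_POUND  # 240


def lsd_to_decimal_pounds(pounds=0, shillings=0, pence=0):
    """Convert a price in pounds, shillings and pence to an exact fraction of a pound."""
    total_pence = (pounds * PENCE_PER_POUND
                   + shillings * PENCE_PER_SHILLING
                   + pence)
    # Fraction keeps the result exact, so no rounding error creeps in before data entry.
    return Fraction(total_pence, PENCE_PER_POUND)


# The example from the question: 10s 6d
price = lsd_to_decimal_pounds(shillings=10, pence=6)
print(float(price))  # 0.525
```

Using an exact fraction first and converting to a decimal only at the end avoids accumulating floating-point error when many prices are converted in a batch.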

What is the most efficient way to update "Specialty" across several mental disorders?

For most mental disorders, under "specialty", Wikidata lists psychiatry only. This is not a "bad" thing—it's good that at least one relevant specialty is currently listed. At the same time, clinical psychology is almost always a primary specialty for mental disorders. (My psychiatry colleagues agree, i.e., this is not a controversial point.) There are exceptions, e.g., for specific treatments such as electroconvulsive therapy (ECT) or psychopharmacological treatment (medication), clinical psychology is not a relevant specialty.

Is there an efficient method to add clinical psychology to several mental disorders at once? If it's a one-disorder-at-a-time process, that's fine. I just wanted to make sure I don't miss a faster method if one exists. Many thanks. - Markworthen (talk) 20:28, 6 September 2020 (UTC)

Are you familiar with QuickStatements? It's a way to add a batch of statements to any number of items at one go, using a list of text commands -- instructions at Help:QuickStatements. Perfect for repetitive commands. — Levana Taylor (talk) 20:54, 6 September 2020 (UTC)
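For illustration, a batch of this kind could be prepared with a short script that writes one tab-separated command per item (item, property, value), which is the simple text format described at Help:QuickStatements. This is only a sketch: the item IDs below are hypothetical placeholders, and the property ID (assumed here to be P1995, health specialty) and the item ID for clinical psychology should be checked on Wikidata before anything is submitted.

```python
def quickstatements_rows(item_ids, property_id, value_id):
    """Return one tab-separated 'item<TAB>property<TAB>value' command per distinct item."""
    rows = []
    seen = set()
    for qid in item_ids:
        qid = qid.strip()
        if not qid or qid in seen:
            continue  # skip blanks and duplicates so no statement is added twice
        seen.add(qid)
        rows.append("\t".join((qid, property_id, value_id)))
    return rows


# Hypothetical placeholder IDs, for illustration only.
disorder_items = ["Q111", "Q222", "Q111"]
SPECIALTY_PROPERTY = "P1995"   # assumption: health specialty; verify before use
CLINICAL_PSYCHOLOGY = "Q999"   # placeholder; look up the real item ID

print("\n".join(quickstatements_rows(disorder_items,
                                     SPECIALTY_PROPERTY,
                                     CLINICAL_PSYCHOLOGY)))
```

The printed lines can then be pasted into the QuickStatements input box. Reviewing the list of disorders by hand first is advisable, given the exceptions mentioned above.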

Wikidata item usage on other wikis?

How can I find uses of wikidata items on other Wikis?

For example, I'd like to find all the articles in Wikimedia Foundation projects that cite Siva Vaidhyanathan (12 June 2018). Antisocial Media: How Facebook Disconnects Us and Undermines Democracy. Oxford University Press. ISBN 978-0-19-084118-8. OL 29796727M. Wikidata Q56027099. I cited it in v:Confirmation bias and conflict, and I'm pretty sure I cited it in other articles, but I can't remember where.

Wikimedia Commons includes "File usage on other wikis" with each item. For example, File:Election integrity, the Missouri Voter Protection Coalition, and the Kansas ACLU.webm#File usage on other wikis says:

The following other wikis use this file:

  • Usage on en.wikiversity.org
    • Election integrity, the Missouri Voter Protection Coalition, and the Kansas ACLU

Is there anything similar for Wikidata items?

Thanks, DavidMCEddy (talk) 21:06, 6 September 2020 (UTC)

Campsite items

I've noticed some new entries (June and July) by So9q about campsites and shelters in Sweden. I'm not sure if these items are within the scope of WD – any opinions? — Draceane talkcontrib. 12:03, 7 September 2020 (UTC)

I think this data should be included in OpenStreetMap rather than Wikidata. Additionally I am a bit uneasy about including it in Wikidata with only Facebook as a source (Wikidata:Verifiability). --Pyfisch (talk) 12:33, 7 September 2020 (UTC)
Guys. I've spent the last two years building a comprehensive map of American campgrounds, starting with national parks. It is way too late to ponder whether or not they are within the reach of this project. De facto, they are. We even have properties for them, such as Recreation.gov campground ID (P6747) and Campendium ID (P6842). Thierry Caro (talk) 13:30, 7 September 2020 (UTC)

Wikidata weekly summary #432

Countries vs. sovereign states

We currently have a number of items like Spain (Q29), which has instance of (P31)=sovereign state (Q3624078) but not instance of (P31)=country (Q6256). I noticed this as commons:Module:WikidataIB's 'location' code looks for instance of (P31)=country (Q6256) to know where to stop following location properties; without this you end up with cases like "Adeje, Santa Cruz de Tenerife Province, Canary Islands, Spain, Iberian Peninsula, Europe, Northern Hemisphere" in the infobox at commons:Category:Casa Fuerte de Adeje - it should stop after 'Spain'.

It seems that country (Q6256) has been removed from these items initially randomly and then systematically (see [6]), with the rationale that sovereign state (Q3624078) has subclass of (P279)=country (Q6256). One solution to the infobox problem would be to add a check for sovereign state (Q3624078) as well. However, I don't think that the modelling here is correct, as it seems to confuse a government system with a territory. So I'd like to remove subclass of (P279)=country (Q6256), add back instance of (P31)=country (Q6256) to the affected items, and remove any 'preferred' ranks to avoid shadowing of the country (Q6256) value. Does that make sense?

Pinging @RexxS, Oravrattas: as this follows up on previous discussions with them. Thanks. Mike Peel (talk) 10:45, 15 August 2020 (UTC)

  • Depending on the POV, some can be described as one, others as both. Accordingly, I'd add both values in P31. --- Jura 10:57, 15 August 2020 (UTC)
  • When you come to actually use the data that is in Wikidata, rather than playing around trying to organise it into whatever the latest ontological fad is, you end up with algorithms that have tests like "is the entity a country?". I want to be able to recognise when the entity is a country - a straightforward property - with a simple test like "does (instance of) contain a value equal to (country)?". I don't want to have to test if "(instance of) contains a value equal to (country) or (Mediterranean country) or (Baltic country) or (sovereign state) or (unitary state) or (social state), etc." and be forced to supply an open-ended list of all the possible synonyms for a country, just to try to make sure that I catch a country that has had the value of "country" removed from P31. Isn't it about time that some thought was put into how we retrieve information, and that we stopped making life impossible for anybody trying to write code to deal with this mess? Can we please roll back the removal of P31=country from countries? --RexxS (talk) 12:40, 15 August 2020 (UTC)
    I don't necessarily disagree with your conclusion, but clearly the query could just use wdt:P31/wdt:P279* rather than a long list of ORs. --99of9 (talk) 12:47, 15 August 2020 (UTC)
    A key part of the issue is that "is the entity a country?" is actually pretty far from a straightforward question. Lists of countries vary dramatically depending on who's producing them, and why. In most cases, it's necessary to ask a more precise question depending on what entities you actually expect to get back. The lists returned from the various forms of wdt:P31, wdt:P31/wdt:P279*, and p:P31/ps:P31/wdt:P279* with a target of Q6256 have always been problematic, and no matter what way someone tries to 'fix' them, someone else sees something there they're surprised by, or thinks that somewhere else is obviously missing, and the cycle continues again. I suspect it also doesn't help that the descriptions on that item are quite wildly different across different languages. I have no attachment to any particular solution, and I doubt it's even something that we can fix quickly or simply anyway, but it would be good to actually have an agreed direction of travel. For other similarly problematic fields we've come to accept that items should ideally only represent a single concept, and even where Wikipedia articles tend to conflate multiple concepts within a single page, Wikidata's preference is to split these up. There seems to be some resistance to doing likewise with countries, but I think a lot of the recurring problems in this area stem from this approach of trying to shoehorn too many separate concepts into a single item. --Oravrattas (talk) 13:33, 15 August 2020 (UTC)
    @99of9: Clearly it couldn't, because it's not a SPARQL query: it's a Lua module. Why should I have to test against all possibilities like this:
    • if instid == "Q6256" or instid == "Q3624078" or instid == "Q619610" or instid == "Q179164" or instid == "Q51576574" or (skip and instid == "Q3336843") then
    instead of:
    • if instid == "Q6256" then
    Have we reached the point where SPARQL queries are assumed to be the only means of retrieval of the information in Wikidata? This is what I mean by putting some thought into how users might want to retrieve the information. I ought to be able to decide on what programming language I want to use to work with Wikidata, not be straitjacketed into a system where the data storage restricts the tools I can use on it. --RexxS (talk) 13:47, 15 August 2020 (UTC)
    • As definitions of "country" indeed vary, maybe we could try to come up with different items labelled "country" with clear definitions. Each applicable one could then go into P31. --- Jura 13:54, 15 August 2020 (UTC)
    @Oravrattas: you claim: "is the entity a country?" is actually pretty far from a straightforward question - actually it is. There's no doubt that Italy, Zimbabwe, Zambia, Vietnam and the hundreds of other entities that you removed "country" from, are countries. No sane reader is going to think "Oh, Italy isn't a country". Again, you're working on the presumption that the only way that users make use of Wikidata is by SPARQL queries. Well, that's not the case. I write an algorithm that follows a location chain upwards from any entity that has a location of some sort as one of its properties to the next higher location, then the next, and so on until I reach an entity that is a country, then it stops. Are you maintaining that there are readers who think "Adeje, Santa Cruz de Tenerife Province, Canary Islands, Spain, Iberian Peninsula, Europe, Northern Hemisphere" is a sensible address because they don't think that Spain is a country? Seriously? Nobody thinks Spain isn't a country, and no matter what your ontological objections are, it is important to be able to recognise that in a simple manner within the structured data. Recognising that Spain belongs to the set of countries costs virtually nothing and makes life much simpler for programmers who have to deal with retrieving Wikidata information for real world applications. --RexxS (talk) 14:07, 15 August 2020 (UTC)
    @RexxS: You can certainly cherry pick those examples if you want to sidestep lots of cases where it's a matter of much more debate (Kosovo, Scotland, Taiwan, Transnistria, Palestine, SADR/Western Sahara, etc), but Wikidata also isn't only about the present: when did Spain and Italy, to continue your examples, become countries? If it's so obvious what a country is, then it should be easy to answer that sort of question too, right? So we can easily produce a list of what countries existed in 1984? 1784? 1584? And what about Denmark? It seems pretty obvious to most people that it's a country, right? But which Wikidata item are we talking about? Q35 or Q756617? What about France (Q142)? French Fourth Republic (Q69829)? French Fifth Republic (Q200686)? You might have your own understanding of a country that makes all these questions have simple, obvious answers. But I doubt they're shared by everyone else here. Look in the archives even just of Project Chat, never mind Request a Query and lots of other places where these questions come up again and again and again. Every approach that has been taken so far has had problems. Pretending this is all simple doesn't help us. Nor does solving only a single use case. --Oravrattas (talk) 14:56, 15 August 2020 (UTC)
    The easy way around this is to pick a country code that matches one's preferred definition. The disadvantage of that is that it won't help one with P31/P279 queries. --- Jura 15:36, 15 August 2020 (UTC)
    @Oravrattas: Cherry-pick??? You removed "instance of country" from hundreds of countries, and have the nerve to accuse me of cherry-picking just four of them. Did you really expect me to list all of them from Zimbabwe to Afghanistan? Every single one of those is a country and your removal of country from them is plain vandalism. Do you want to pick any of those that you are going to say isn't a country? You'll be laughed at. It doesn't matter when Spain and Italy became countries, because they are countries now. If we want to record the history of it, there are qualifiers like start date that are used to do just that job for thousands of other properties.
    If I want to produce location chains for Adeje in the year 1492, I'll code it to use those entities that existed at the time (as soon as Wikidata stores the necessary date information). Just because the data hasn't been added yet, it doesn't mean that you get the right to wreck functioning code for an algorithm that works with current data.
    I don't care what delusions folks here may have about countries. I care about readers, end-users, of whom at least 99.99999999% agree with me that Italy is a country. This isn't difficult. You should never have removed en masse hundreds of perfectly accurate property values, and it's a testament to how dysfunctional this site is that you haven't been banned for such conduct. --RexxS (talk) 18:03, 15 August 2020 (UTC)
    @RexxS: you are massively misrepresenting what I have said, what happened, and what the current situation is. No-one is saying that Italy is not a country. Wikidata still believes Italy is a country. Some tools may not understand how Wikidata:Item classification works, but that doesn't make any of your claims correct. Italy has instance of (P31):sovereign state (Q3624078) (at Preferred Rank). sovereign state (Q3624078) has subclass of (P279): country (Q6256). --Oravrattas (talk) 18:36, 15 August 2020 (UTC)
    @Oravrattas: and you haven't even bothered to listen to what I'm saying. As a result of your meddling, the entry for Italy does not contain the statement that it is an instance of a country, even though it most certainly is. I'm not advocating removing information like Italy (Q38):instance of (P31):sovereign state (Q3624078) or sovereign state (Q3624078):subclass of (P279):country (Q6256) because it's useful for some purposes. I'm complaining about you vandalising entries by removing information that is useful for other purposes. You're asserting the value of the procedures you're interested in and denying any value to the procedures I want to use. There is absolutely no reason why Italy should not have a statement that it is an instance of a country, and any sane person would think it crazy to claim otherwise. --RexxS (talk) 19:42, 15 August 2020 (UTC)
    @RexxS, Oravrattas: Please take a step back and listen to what each other is saying. I think you both have valid points, but accusations aren't going to get us anywhere. Thanks. Mike Peel (talk) 19:48, 15 August 2020 (UTC)
    @Mike Peel: no, Mike, the only step I'll be taking is out of here. I have no interest in debating with someone who makes a decision that their way of working is the only correct one, and that no other ways are valid, then imposes their decision by automated edits. Now we've reached the stage where it's deemed reasonable to deny that Italy is an instance of a country. The lunatics have really taken over the asylum, and there's no place for me here any more. --RexxS (talk) 20:06, 15 August 2020 (UTC)
  • @Mike Peel, RexxS: As for the specific issue of the infobox, might it be enough to check for member of (P463):United Nations (Q1065) instead? That is much more clearly defined and more stable than any P31-based solution is likely to be any time soon, and if you want a few additional entities beyond that, they could presumably be added by hand (and those ones are likely to be contentious in a P31-based scenario too) --Oravrattas (talk) 16:22, 15 August 2020 (UTC)
  • On the specific point of checking within templates etc., I don't think Wikidata, per se, would be taking any position about recognition. This would be a question for each tool that needs to generate a list of countries, or check somewhere for country-ness. If a tool wants to restrict to UN members, that's their choice. If they want to include Taiwan or Somaliland or any other combination, that is also their choice. Wikidata's role should be providing them sufficient information to make that choice. (A tool used primarily in Russia, for example, will likely want to work with a different list of countries to one used primarily in the US.)
  • On the more general points: few people are likely to have a problem with the idea that Australia (Q408) is a country, but I'm not sure there would be quite so widespread support for the idea that that is purely about the area (as I think you're saying by "is defined as the area controlled by Australia the state"). Note that we do also have separate geographical items as well (e.g. mainland Australia (Q2872203), Australian continent (Q3960), Oceania (Q55643)). In Australia's case we do not specifically have an item for the combination of places that currently make up the country's territory, but in other cases we do: see for example Nauru (Q697) (the country/state) and Nauru (Q30151675) the island that is its entire territory; or Barbados (Q244) vs Barbados (Q30151210). This is also not restricted solely to places where the territory is a single island: we also have, for example, Marshall Islands (Q709) for the country, and Marshall Islands (Q27577733) for the group of islands.
  • To confuse matters still more, I also don't believe that "we have items for governments" is as true as you might think or hope. Again we are hampered somewhat by language here, as in English "government" is typically used in at least two distinct ways. In most countries there are three branches of government (legislative, executive, and judicial), but "government" can be used to mean (a) the combination of all three (this is the sense when saying that a state is required to have a government); but also, and more commonly, (b) the executive branch specifically, with a head of government, a cabinet of ministers, the ministries they run, etc. Wikidata almost always has items for the "government" of a country in sense (b), but it is much rarer to have an item for sense (a).
  • Again I will note that I am not advocating for any particular solution to all this. I don't really know what that should look like. But we already have "a lot of confusion" due to the current model, even for current countries/nations/states/territories/governments, and significantly more so for historic ones (conflating multiple distinct concepts into one item starts to break down quickly once one of those things changes significantly). And, as I suspect has been overlooked in some of the recent conversations here, the issue is not only about whether any model copes adequately with the "simple" cases (many of which cease to be as simple once we need to answer historic questions, and not just current ones): it's also about how well it works for the not-so-simple cases. This is always going to be a balancing act. The world is an incredibly messy place to model even today, and we don't just need to do that: we need to model the entire history of it. And we need to do so in a way that doesn't require postdoctoral levels of knowledge of history, geography, and politics to be able to formulate useful queries against the data. That's not going to be simple. But it's a challenge we need to actually confront and take on, not just gloss over. --Oravrattas (talk) 08:26, 16 August 2020 (UTC)
  • As someone who is relatively new to Wikidata and recently tried to use it to catalog sovereign states and their dependent territories (by some definition) using this data, I can offer a bit of perspective on how this appears to such a user. It at first seemed strange that Wikipedia had made choices about which entities are sovereign states and which are dependent states which weren't reflected in Wikidata. If one of the goals of Wikidata is to systematize "raw information" to ensure that Wikipedia is consistent I would have expected more alignment there. That said, it wasn't too disorienting to take Wikidata on its own terms. It took some playing around to understand how to chain properties in SPARQL queries such as wdt:P31/wdt:P279* but that was governed more by my rush to see some results and avoid doing a full study of SPARQL. Understanding linked data as I do now, I don't think it's much to expect people to use such chaining to aggregate information which is disparately categorized. In fact, I understand this is precisely the promise of using linked data. Still I sympathize with people wanting simple queries. Perhaps it's enough not to worry about vague terms because they will never capture information with precision. Sure by some measures they produce "inaccurate" results, but perhaps it's enough to note as much in the description. As a long time programmer, I'm less sympathetic with advanced programmers insisting on avoiding the complexity. In the end, Wikipedia in its current state is full of inconsistencies and I'm okay to deal with that for the time being. I do think it important to leave prior work intact until a consensus is reached though because it's precisely in the work itself that we're likely to find the answers for what works best. I personally don't hold a lot of faith that people (myself included) are very good at divining such complex classifications out of whole cloth.
I don't have any skin in this game, but hopefully my contribution is useful nonetheless. -- Gettinwikiwidit (talk) 10:15, 16 August 2020 (UTC)
On the one hand, it's true, we are always at risk of getting so wrapped up in stratospheric ontological purity that our data is unnecessarily hard for mere users to use -- but I don't think that's the issue here. If Italy (Q38) is a sovereign state (Q3624078) and a sovereign state is a country (Q6256), then Italy is a country, and that's all there is to it. If you want to know whether A is-a B, in something like 95% of all cases you are going to have to use wdt:P31/wdt:P279* or the moral equivalent. If Infoboxes can't easily make queries like that (if Lua doesn't have that capability), then we've got to get that capability created, pronto, because, as I said, most of the "obvious" questions you might want to ask about entities absolutely require that capability.
@RexxS: I'm sorry you feel that things are insane, but really, the situation you're complaining about is absolutely normal. Here are some other examples:
Again, the problem goes away if you use a query that takes subclass of (P279) into account, and imho that's what everyone wants to do by default.
Ontology is hard, so the "type categories" are constantly getting rearranged. Even if a simple instance of (P31) query works today, it's not guaranteed to work tomorrow -- and I gather that's what's happened with country (Q6256) recently. But when these rearrangements happen, I think it's safe to say that proper functioning of wdt:P31/wdt:P279* queries is always supposed to be preserved -- which is one more reason to use them. —Scs (talk) 12:00, 16 August 2020 (UTC)
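As a rough illustration of the difference between a direct instance of (P31) check and the wdt:P31/wdt:P279* path discussed above, here is a minimal Python sketch. It uses a tiny hand-written graph rather than live Wikidata data, and the helper names (`is_a`, `is_direct_instance`) are hypothetical, chosen only for this example.

```python
from collections import deque

# Toy data, not fetched from Wikidata: direct "instance of" (P31) and
# "subclass of" (P279) values, matching the Italy example in the thread.
P31 = {"Q38": ["Q3624078"]}        # Italy -> sovereign state
P279 = {"Q3624078": ["Q6256"]}     # sovereign state -> country


def is_direct_instance(item, target, p31=P31):
    """True only if target appears as a direct P31 value of item."""
    return target in p31.get(item, [])


def is_a(item, target, p31=P31, p279=P279):
    """True if item reaches target via one P31 step, then zero or more P279 steps."""
    queue = deque(p31.get(item, []))
    seen = set()
    while queue:
        cls = queue.popleft()
        if cls == target:
            return True
        if cls in seen:
            continue  # guard against cycles in the subclass graph
        seen.add(cls)
        queue.extend(p279.get(cls, []))
    return False
```

With this toy graph, `is_direct_instance("Q38", "Q6256")` is false while `is_a("Q38", "Q6256")` is true, which is the situation the infobox code ran into once the direct statement was removed.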

The very notions of sovereign state (Q3624078) and country (Q6256) make sense only at particular points in history. And I would point out that there are also nation (Q6266) and nation state (Q179671). It's easy to say when Jamaica became sovereign because it did so all at once in an era where national sovereignty had a well-defined meaning, but much harder to say when it became a country and whether it is a nation state; it's a lot harder to say when France became any of these things (I'd say "gradually over time," and that in doing so it helped to form these concepts), and whether the French are a "nation" and if so who that nation includes (Bretons? French Basques? Algerian immigrants? Haitian immigrants? French Canadians? Cajuns?). - Jmabel (talk) 15:31, 16 August 2020 (UTC)

Might it make sense to make a new property which is simply an alias for wdt:P31/wdt:P279*? I don't know, does this add a lot of overhead or could it be handled with a pre-processor? Then people could use wdt:P31 when they want precision and the new property when they want the fuzzier relationship. -- Gettinwikiwidit (talk) 22:04, 16 August 2020 (UTC)
@Gettinwikiwidit: As Jura pointed out above, in the case specifically of 'country', there actually is a separate property for this: country (P17) (though I assume you really mean something slightly different to a property in that sense here?), and depending on the requirements that may be good enough in lots of cases. The main problem with these sorts of approaches though is that often that still means essentially asking a fuzzy question, rather than a well defined one. Let's take an example of this photo on Commons. This is a photograph of Václav Havel Airport Prague (Q99172), which has country (P17):Czech Republic (Q213). And in some contexts that might well be the 'country' someone would expect or want to display. But the photo is from 1985, so there are lots of other contexts when you would actually want Czechoslovakia (Q33946). Or take this Commons photo. That's William Wallace Statue, Aberdeen (Q8019940), which by lots of routes will get you to a country of United Kingdom (Q145). But there are many scenarios in which Scotland (Q22) would be the preferred answer to what country that's in. This isn't really a technical problem, or certainly not entirely a technical one. It's very tempting to want to be able to ask a question like "what country is X in?" and be able to get the right answer. But the problem isn't with how to generate an answer to that question. It's that the question itself needs to be better formulated in the first place. Do you want to know what country it's in now, or at some other point in time? What do you want to do about constituent countries, or places where there is, or was, a dispute over sovereignty generally, or around the particular territory in question? Different consumers of the data will want different answers based on these, and many other questions. 
We can certainly try to make that process as smooth as possible for certain use cases, but the primary goal should be to make sure that we have sufficient data that all the permutations of the question can get the appropriate answers, and ideally also then provide great documentation on the best approaches people should take depending on their needs. --Oravrattas (talk) 12:23, 17 August 2020 (UTC)
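To make the "country now, or at some other point in time?" question concrete, here is a small Python sketch. The statements below are hypothetical toy data written for this example (the thread notes that the real airport item only carries the Czech Republic value); the start and end dates stand in for the kind of qualifiers that would be needed, and `country_at` is an invented helper name.

```python
from datetime import date

# Hypothetical statements for illustration only; the real item may be
# modelled differently. None means "no bound on that side".
COUNTRY_STATEMENTS = {
    "Q99172": [  # Václav Havel Airport Prague
        {"value": "Q33946", "start": None, "end": date(1992, 12, 31)},  # Czechoslovakia
        {"value": "Q213", "start": date(1993, 1, 1), "end": None},      # Czech Republic
    ]
}


def country_at(item, when, statements=COUNTRY_STATEMENTS):
    """Return the country values whose start/end range contains `when`."""
    result = []
    for st in statements.get(item, []):
        after_start = st["start"] is None or st["start"] <= when
        before_end = st["end"] is None or when <= st["end"]
        if after_start and before_end:
            result.append(st["value"])
    return result
```

Asking for a 1985 date returns the Czechoslovakia item and asking for a 2020 date returns the Czech Republic item, so the same "what country?" question has two defensible answers depending on what the consumer specifies.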
@Oravrattas: I'm not sure where we crossed wires here, but the suggestion was to have a "property" to be used instead of instance of (P31) when you explicitly don't want to bother specifying wdt:P31/wdt:P279* for precision. (I don't see where 'country' factors into this specific proposal.) The idea being that it's more concise to use a single property and hence a little more user-friendly for the uninitiated and/or typing-averse user. Since as was suggested above you almost always want wdt:P31/wdt:P279* for finely defined ontologies, having some syntactic sugar might be useful. You describe asking a fuzzy question as a problem, but for many it's a benefit. It allows them to not fuss over the details. I agree that if you are only allowed to ask fuzzy questions then the problem is that Wikidata is reduced to a novelty without much real usefulness. -- Gettinwikiwidit (talk) 20:39, 17 August 2020 (UTC)
If there's confusion, it may be due to the fact that wdt:P31/wdt:P279* is not a property, so there's no way to define a new property to "replace" it. wdt:P31/wdt:P279* is actually a certain way of writing a query on properties. (I believe the technical term is that it is a "path expression".) —Scs (talk) 20:54, 17 August 2020 (UTC)
@Scs: To be clear, I never said that this path expression was a property. I was suggesting that a pseudo-property could be made either with pre-processing or auto-injection (I'm not sure what the options are), while acknowledging that the overhead might not be worth it. It's my understanding that the wdt: properties somehow or another incorporate preferred rank which could otherwise be expressed in more complicated expressions. SilentSpike seems to be making a similar suggestion below. -- Gettinwikiwidit (talk) 22:57, 18 August 2020 (UTC)
I certainly agree that things get decidedly muddier as we head further back in time, but even if we were to take Westphalia as a cut-off point, that's still almost 400 years now of the modern system of statehood. This is also an area where various Wikipedias already have a lot of good information we can draw on: list of former sovereign states (Q62630) varies in depth and in quality across languages, as does list of sovereign states by date of formation (Q668266), but there's no real reason why we shouldn't be able to generate those sorts of lists from Wikidata. My own interest would be to be able to get as far as being able to replicate all the lists from lists of state leaders by year (Q879370), but currently we can't even accurately generate the lists of countries/states for a given year, never mind who their heads of government were. If we need to more explicitly separate out "countryhood" from "nationhood" or make some other splits to be able to do this, then let's do so. But let's decide something, so we can get on with the task of actually building up the data and seeing if we can generate these sorts of lists, and answer useful and interesting queries about them. --Oravrattas (talk) 20:53, 16 August 2020 (UTC)
  • This (the original problem at least) all seems to boil down to, there needs to be a convenient (and quick) generalised way in lua code to check if some QID is a subclass (and not only a direct subclass) of some QID. Rather than checking if the value of P31 is in a fixed lua table containing all the subclasses or by using chained comparisons to do the same. Either that, or you need another property that can be checked to determine if the current entity is a country level item (something like if the value of P17 is the same entity as the subject). --SilentSpike (talk) 21:20, 17 August 2020 (UTC)
Absolutely. Who here knows, or knows someone who knows, enough about Lua to know whether this feature exists, or if not, what it might take to add it? —Scs (talk) 00:24, 18 August 2020 (UTC)
The catalogue of Lua functions offers mw.wikibase.getReferencedEntityId and it seems close to what you want but I never figured it out. --Matěj Suchánek (talk) 07:57, 18 August 2020 (UTC)
It looks like
if instid == "Q6256" or mw.wikibase.getReferencedEntityId( instid, 'P279', { 'Q6256' } ) ~= nil then
should test if instid is 'country' or a (possibly indirect) subclass of 'country'. @RexxS:, is this workable? You only have to know country (Q6256) to test if instid is or is a subclass of it. -- Gettinwikiwidit (talk) 08:31, 18 August 2020 (UTC)
Unfortunately, no. Lua error: Too many "mw.wikibase.getReferencedEntityId" calls, only up to 3 allowed.. --RexxS (talk) 16:04, 19 August 2020 (UTC)
The only mention of this error I can find is phab:T226925, more information is needed there about when this happens. Peter James (talk) 13:33, 27 August 2020 (UTC)
  • OK, so there doesn't seem to be a working solution for Lua access to this data. It may not be a problem for queries, but Lua access is really important. So I plan to revert the removal of country (Q6256) values soon. Any last comments? Thanks. Mike Peel (talk) 16:14, 26 August 2020 (UTC)
    • @Mike Peel: What items will you be adding instance of (P31):country (Q6256) to, and does it matter to Lua what Ranking these (or other P31 values) are set to? --Oravrattas (talk) 16:44, 26 August 2020 (UTC)
      • Reverting the edit group linked to at the start of all this, and then checking other country items by hand I guess (or maybe a sparql query of sovereign states that aren't countries to start with). Thanks. Mike Peel (talk) 17:02, 26 August 2020 (UTC)
        • @Mike Peel: If your goal is simply for the template to stop at sovereign states, then is there a reason it can't be changed to check for instance of (P31):sovereign state (Q3624078), or member of (P463):United Nations (Q1065) instead? Both of those approaches were suggested earlier in the thread, and I'm not sure whether they've just been overlooked, or can't be done in Lua, or won't return the correct countries, or won't work for some other reason. Without knowing much more specifically what your goal is, and why certain approaches aren't suitable, it's hard to suggest other possible solutions. --Oravrattas (talk) 17:21, 26 August 2020 (UTC)
          • @Oravrattas: Those approaches were replied to in @RexxS's first comment on this thread. Thanks. Mike Peel (talk) 18:09, 26 August 2020 (UTC)
            • @Mike Peel: Do you mean the comment at 12:40, 15 August 2020? If so, then I don't see how these were replied to then, other than in a completely circular manner. If the only acceptable outcome is that every country has to have instance of (P31): country (Q6256) then there doesn't seem to be much point in seeking alternative solutions. Presumably we should also update Wikidata:Item classification to note that items should duplicate the P31 statements from their parent classes when enwiki requires this? --Oravrattas (talk) 18:23, 26 August 2020 (UTC)
              • @Oravrattas: Yes. The only acceptable outcome is a *working* (and stable) solution, none others apart from adding back p31=country seem to have been suggested here. It's not enwp-specific, I'm mostly worried about it because locations are used a *lot* in the infoboxes on Commons. Thanks. Mike Peel (talk) 19:03, 26 August 2020 (UTC)
                • @Mike Peel, Oravrattas: I can create a guaranteed working solution by making further database calls to fetch the values of the "subclass of" property for each value of "instance of" and check whether that matches "country". It makes the code far less efficient of course, because it has to make extra database calls for each value of "instance of" at every step in the location chain. That is far from ideal, but it will work. The real problem here is that I had a piece of code that had been working since I added it in June 2018 based on the reasonable assumption that countries were an instance of "country". Suddenly, one editor decides to void that assumption without notification for no discernible benefit (at least to me), and my code now fails. I should not be required to anticipate every possible schema change implemented without my knowledge just to have working code. The code is in use on millions of pages on 100+ projects and breaking changes to the database it uses should not take place on a whim. This isn't the Wild West and editors need to be responsible when making significant changes. How you sort out this particular screw-up is up to you; but I won't be worrying about it further while it's perfectly possible that another similar idea can be implemented without affording me any opportunity to comment. It's pointless wasting my time trying to second-guess possible changes on an unstable dataset. --RexxS (talk) 19:51, 26 August 2020 (UTC)
                • @Mike Peel: As has been pointed out several times by several different people, instance of (P31):country (Q6256) is probably the least stable approach, as what a country is is so poorly defined, and means very different things to different people. I have asked repeatedly throughout this what the requirements for "countryhood" in this context are, but no-one seems willing to answer that, instead effectively asserting that it's not even a valid question. By far the most stable would be to check member of (P463):United Nations (Q1065) — that one will not vary based on different editors' interpretation of what places are countries or states. If that is not sufficient, and you need it to also include some other places, then please engage with the question regarding which other places those would be. Simply reinstating instance of (P31):country (Q6256) to everything that already has instance of (P31):sovereign state (Q3624078) will still leave a lot of inconsistencies around other places: e.g. England (Q21), Scotland (Q22) and Wales (Q25) are all countries by this model, but not Northern Ireland (Q26) (nor other parallel constituent countries such as Greenland (Q223) or Curaçao (Q25279) which go through intervening autonomous country within the Kingdom of Denmark (Q66724388) / country of the Kingdom of the Netherlands (Q15304003) items. Is your plan to also add instance of (P31):country (Q6256) to each of these? If that is not needed here, but someone else wants this for some other purpose, will it be acceptable for them to add it?). American Samoa (Q16641) is a country (Q6256), but Guam (Q16635) isn't (nor are most of the other US territories). Kosovo (Q1246) is, but Taiwan (Q865) isn't. Catalonia (Q5705) is, but with qualifiers to say that that's not accepted by Spain, the EU, or the UN.
                These go to both strands of your stated requirements. Without more information on what working means in relation to any of these (or over a hundred other places that have some level of dispute or ambiguity as to their country status), it's not possible to know what a solution might actually look like. And even if someone carefully goes through and adds instance of (P31):country (Q6256) statements to the ones they think are deserving of it in this context, that on its own seems very unlikely to be a stable solution. Other editors will simply add the same statement to other places, regardless of whether that is desirable or not in these infoboxes. --Oravrattas (talk) 19:50, 26 August 2020 (UTC)
                • One of the main points of Wikidata's data model is that you can handle things that mean "very different things to different people" - this is why you can enter multiple values for the same property, and why ranks etc. exist. I think p31=country should be added where it would naturally be expected to be set (i.e., those cases that are countries by most people's definition) - which may well be the same as you're finding through subclasses, and then you can go argue about the edge cases as you want. Thanks. Mike Peel (talk) 20:01, 26 August 2020 (UTC)
                  @Oravrattas: Hmm, per Wikidata:Edit_groups#Can_I_undo_edit_groups_made_by_others? you'll get a lot of revert notices if I undo [7], but if you do it then you won't. To save you the notices, can you click the undo button please? Thanks. Mike Peel (talk) 20:04, 26 August 2020 (UTC)
                  • Having read through this discussion a couple of times, it's still not clear to me why checking for member of (P463):United Nations (Q1065) as suggested above is not sufficient. I think Oravrattas has made some convincing arguments as for why relying on the country class is inherently unstable. Popperipopp (talk) 20:25, 26 August 2020 (UTC)
                    • If the primary concern is that this template does the expected thing for the places that most people would expect, then switching to checking member of (P463):United Nations (Q1065) seems by far the simplest and most stable solution. Scs has eloquently pointed out above why relying on a plain P31 value is always going to be problematic, so this seems like it would be significantly superior anyway, even if the instance of (P31):country (Q6256) values are reintroduced. (I'm happy to reintroduce those if there's consensus for that, but I think we need a few more people to weigh in first on that. I think it does raise a few significant wider questions about the general P31/P279* data model in lots of other places too, as per the other examples given above, so it would be good if we could also find a way to separate it slightly from this specific use-case.) --Oravrattas (talk) 20:39, 26 August 2020 (UTC)
                • (edit conflict) You keep claiming that instance of (P31):country (Q6256) is unstable, but it only became unstable when you started deleting it. How many times have Italy, Spain, India or any other commonly recognised country had P31 removed from them? None, until you unilaterally decided to create the instability. Nobody except you think that the hundreds of countries you vandalised are not countries. It is untrue that "Italy is a country" is an ill-defined statement, but you still removed it. As far as edge cases go, they simply don't matter for the purposes of determining a location. If I preview {{#invoke:WikidataIB |location |first=yes}} in w:Academy of Our Lady of Guam it returns Academy of Our Lady of Guam, Hagåtña, GU which most people would consider a sensible location. We only need to stop the chain at a country if the entity isn't already at the top of the chain: Guam doesn't link upwards to a higher level. If I preview the code in w:Sermitsiaq Island, I get Sermitsiaq Island, Sermersooq, Greenland, Kingdom of Denmark. It doesn't matter that Greenland isn't a country; the chain will stop at Denmark. But I do need countries to be somebody's idea of a country. It's not acceptable to decide that Italy isn't a country because Greenland isn't. There's no point in quibbling about whether England is a country or not. If it's a country the chain will stop at England; if it's not then the location chain will terminate at UK. No big deal for a reader who just wants to know where something is, and I don't care about your imagined problems with edge cases. You can call Northern Ireland a country or not as you choose; I can deal with it. But I'm not prepared to work with a system that denies that Italy is a country, because any sensible reader knows damn well that it is. 
Pasting {{#invoke:WikidataIB |location |first=yes}} into w:Leaning Tower of Pisa readers expect it to return Leaning Tower of Pisa, Pisa, Province of Pisa, Tuscany, Italy, not Leaning Tower of Pisa, Pisa, Province of Pisa, Tuscany, Italy, Southern Europe, Europe, Northern Hemisphere. How many sovereign states are not countries? So why shouldn't every entity that is a sovereign state also be a country? You have made no case whatsoever for what advantage results from removing that fact from the database. A stable solution existed for two years until you broke it and you still haven't enunciated what benefit resulted. --RexxS (talk) 20:40, 26 August 2020 (UTC)
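The behaviour described above can be sketched roughly as follows. This is not the WikidataIB implementation; it is a simplified Python illustration with toy data keyed by label, and `location_chain` is a hypothetical name. It walks upwards from a place and stops at the first entity whose direct types include the stop type.

```python
# Toy data keyed by label; a real module would read these relations from Wikidata.
PARENT = {
    "Leaning Tower of Pisa": "Pisa",
    "Pisa": "Province of Pisa",
    "Province of Pisa": "Tuscany",
    "Tuscany": "Italy",
    "Italy": "Southern Europe",
    "Southern Europe": "Europe",
    "Europe": "Northern Hemisphere",
}

# Direct "instance of" values before and after the disputed removal.
DIRECT_TYPES_BEFORE = {"Italy": {"country", "sovereign state"}}
DIRECT_TYPES_AFTER = {"Italy": {"sovereign state"}}


def location_chain(start, direct_types, stop_type="country", parent=PARENT):
    """Follow parent links until an entity has stop_type as a direct type."""
    chain = [start]
    current = start
    while stop_type not in direct_types.get(current, set()) and current in parent:
        current = parent[current]
        chain.append(current)
    return chain
```

With `DIRECT_TYPES_BEFORE` the chain ends at Italy; with `DIRECT_TYPES_AFTER` the same code runs on through Southern Europe, Europe and Northern Hemisphere, because nothing in the chain carries the direct value it is looking for.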
                  Excuse me, what's the rationale behind removing country (Q6256) from Italy (Q38)? Plus, couldn't your template stop the chain at sovereign state (Q3624078) OR country (Q6256)? Bouzinac (talk) 20:56, 26 August 2020 (UTC)
                • The reason why a different solution isn't sufficient is that it would require checking for so many edge cases for little reason. If we were to switch to member of (P463):United Nations (Q1065) to use as our stopping place, then every location in the Cook Islands (also Kosovo, Taiwan, Antarctica, and so on) would look something like Avarua, Cook Islands, Polynesia, Pacific Ocean, Earth, inner Solar System. I don't find that "significantly superior", do you?
                  What arrogance to remove hundreds of statements without consensus, then demand we get consensus to put them back! Seriously, is that how this place works? You should be ashamed of yourselves. This "specific use-case" is so specific that it affects 2,000,000 pages of content on 100 projects. You've ducked this question every time, so far, so please tell me what projects/users/others gained benefit from your interference with a stable structure on this database? --RexxS (talk) 21:10, 26 August 2020 (UTC)
                • @Bouzinac: I'd like to know the rationale for removing country (Q6256) from Italy (Q38) as well. If I patch the code to test for sovereign state (Q3624078) or country (Q6256), what happens when Oravrattas decides to remove "country" from all of the "Mediterranean countries" next? I have to test for Mediterranean country (Q51576574) OR sovereign state (Q3624078) OR country (Q6256). Then Baltic countries, unitary states, social states, republics ... where does it stop? --RexxS (talk) 21:17, 26 August 2020 (UTC)
  • @RexxS: You are clearly upset about this, which is understandable, but your repeated attacks and over the top remarks are also upsetting, and not conducive to finding a productive resolution. Yes, if any code wishes to only look at direct instance of (P31) claims, then it will also need to explicitly enumerate a potentially unending number of superclasses. As per the example given below by Jean-Fred, to check if someone is/was a scientist like this would currently require checking hundreds of possible P31 values, and also regular maintenance of that list as it will doubtless expand further over time. That is simply the nature of the Wikidata data model. In the general case the solution is to also pay attention to superclasses. With countries there are potentially other properties or relationships that might provide an alternative solution (or part of one), but if you are unwilling to engage in a reasonable manner with any suggestions, then it seems pointless to continue to try to help on that front. --Oravrattas (talk) 06:09, 27 August 2020 (UTC)
I sympathise with RexxS’ issue in that he built something on top of Wikidata, which worked for years, and then suddenly stopped − of course, this worked on the basis of some assumptions (reasonable ones indeed). Whether these assumptions might be wrong to begin with is, to a degree, irrelevant: it should be harder to make changes that significantly break something widely used.
At the same time, “Italy is clearly a country, hence we must have Italy (Q38)instance of (P31)country (Q6256)” − expecting straight P31s is not the reality in most cases: everyone knows that Marie Curie was a scientist, yet we do not have Marie Curie (Q7186)occupation (P106)scientist (Q901) − because we have Marie Curie (Q7186)occupation (P106)chemist (Q593644), and chemist (Q593644)subclass of (P279)scientist (Q901). The Golden Gate Bridge (Q44440) is obviously a bridge, yet it’s 1 subclass away from bridge (Q12280) ; Westminster Abbey (Q5933) is obviously a place of worship, yet it’s 3 subclasses away from structure of worship (Q1370598). Stuffing the P31 with every possible value is not how we do things in a graph database.
Now, of course, we can agree that country is a special case where the redundancy is useful/necessary/convenient/whatever (personally I don’t really mind) ; but I can also imagine an editor wanting an easy/quick/convenient way to check that a building is a place of worship, that a person is a scientist, or that a structure is a bridge without having to climb up the subclass tree − should we then add scientist (Q901) and bridge (Q12280) and structure of worship (Q1370598) everywhere for that sake?
Jean-Fred (talk) 21:26, 26 August 2020 (UTC)
I think it's pretty clear that everyone in this debate is at least a little bit right, and that's what makes the debate difficult, because some people are taking an absolute, my-way-or-the-highway approach, while what we pretty clearly need here is some degree of compromise.
I think we can reasonably conclude that all three of the following are true:
  1. In Wikidata, if you want to ask whether A is a B, for all intents and purposes you must use a query that uses a wdt:P31/wdt:P279* search path. You must not use a simpler and more obvious query involving P31 alone.
  2. Due to #1, there absolutely needs to be an easy way (perhaps multiple easy ways) for Lua modules, Wikipedia infobox templates, and other such machinery to perform wdt:P31/wdt:P279* queries. Until there is, those tools are not going to be able to make effective use of Wikidata data.
  3. Until #2 can be realized, the "superfluous" P31 country assignments should be temporarily restored. They should be restored even though they're redundant. They should be restored even though they're inconsistent. But that's what a compromise is: it's an acceptance (in this case, I hope, a temporary one) of something imperfect, in order that the pursuit of absolute perfection doesn't stand in the way of actually getting something done. —Scs (talk) 13:58, 27 August 2020 (UTC)
    +1 - Jmabel (talk) 15:12, 27 August 2020 (UTC)
    As I said earlier, I was happy to revert once there was more consensus to do so, and this is enough of it for me, so those are rolling back now. I would note, though, that there are still going to be edge cases this template doesn't handle, for the reasons previously noted. Somewhere like Taiwan (Q22502), for example, is still not going to have a direct instance of (P31):country (Q6256) claim, and nor, as far as I can see, is it in a location (P276), located in the administrative territorial entity (P131), or located in/on physical feature (P706) chain of anywhere else that is. We may also wish to look into adding such claims, at least temporarily, but I think it's worth emphasising that no matter what approach is taken in the Lua, there will likely always be edge cases due to inconsistent modelling. The example given earlier of Cook Islands (Q26988) is a good one. Its relationship to New Zealand is currently fairly well hidden, behind a of (P642) qualifier to instance of (P31): associated state (Q1138279); and a country (P17) at normal rank, behind a Preferred country (P17) to itself (despite not being a sovereign state). This stuff is messy, in part because reality is messy, but also because lots of the awkward cases have been modelled organically, likely arising out of some disputes, and without much overarching attention paid to the larger issues and sorts of queries that people will be writing etc. It's been mooted previously that Wikidata:WikiProject Countries be rebooted, or a new version started, and perhaps this is a good prompt to revisit that. --Oravrattas (talk) 15:44, 27 August 2020 (UTC)
Thank you, Jean-Fred for your insight, and Jmabel for your analysis. My hands are tied by the extent to which the Wikidata database is exposed to Scribunto. The relatively expensive part of any code is always fetching something from a database. It is quite "cheap" to read the entire set of values of a particular property of a particular entity into a table. Scanning through that table using Lua is very fast, even when there are hundreds of entries in the table. However, if I need to read a value of P31 (for example) which is another entity, and then fetch a property from that entity, check it and repeat, I have to make another database fetch for each value of P31 in the original entity. So the indirect method of determining that Marie Curie was not an astronomer (for example) would require 5 database fetches (her occupation; is physicist a subclass of astronomer? is chemist? is teacher? is nuclear physicist?). Of course, we can determine that she was a scientist by 3 database fetches at most, but it's easier to find positive matches than to eliminate everything to get a negative match. If we could expect Marie Curie to have scientist (or astronomer if she were one) as one value of occupation, we could naturally make a determination with one database fetch.
A structured query language is designed to efficiently return a set of results across multiple entities in a single query, and so if all you are using is SPARQL, it's easy to fall into the mindset that wdt:P31/wdt:P279* queries are feasible database fetches; but that's only true for a sql query. From the point-of-view of Lua, it is massively more efficient to have redundant values for a particular property, than to have to do indirect lookups on each value of a property to get the same result. Nevertheless, I'm not asking for redundancy to be introduced into property values just to make code more efficient; I am asking that where redundancy already exists, serious consideration is given to the likely consequences for code that has been optimised to take advantage of that redundancy. --RexxS (talk) 17:58, 27 August 2020 (UTC)
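The fetch-counting argument above can be illustrated with a small Python sketch. It is only a model: `CountingStore` and `has_occupation` are hypothetical names, the data is a toy table keyed by label, and the "subclass of" links are simplified assumptions rather than the actual Wikidata statements. Each call to `fetch` stands in for one database access.

```python
class CountingStore:
    """Wraps toy data and counts how many property fetches are made."""

    def __init__(self, data):
        self._data = data
        self.fetches = 0

    def fetch(self, entity, prop):
        self.fetches += 1
        return self._data.get((entity, prop), [])


# The four occupations are the ones named in the comment above;
# the "subclass of" values are assumptions made for this example.
DATA = {
    ("Marie Curie", "occupation"): ["physicist", "chemist", "teacher", "nuclear physicist"],
    ("physicist", "subclass of"): ["scientist"],
    ("chemist", "subclass of"): ["scientist"],
    ("teacher", "subclass of"): ["educator"],
    ("nuclear physicist", "subclass of"): ["physicist"],
}


def has_occupation(store, person, target):
    """Check direct values first, then one 'subclass of' hop per value."""
    occupations = store.fetch(person, "occupation")
    if target in occupations:
        return True
    for occ in occupations:
        if target in store.fetch(occ, "subclass of"):
            return True
    return False


def count_fetches(person, target, data=DATA):
    """Return (answer, number of fetches used) for a fresh store."""
    store = CountingStore(data)
    return has_occupation(store, person, target), store.fetches
```

In this model a direct value such as physicist costs one fetch, a positive match one hop away can stop early, and a negative answer such as astronomer needs the occupation fetch plus one fetch per occupation value.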
@RexxS: If you are writing code to chase the P31/P279 chain yourself, I'm sorry, but you are doing it wrong. You have my sympathy, because I have written that code myself, and I know, sometimes it seems like you have no other choice, but it's still wrong. It's wrong because it's inefficient, it's wrong because it's hard to get right, and it's wrong because you simply shouldn't have to write it -- you should have an easy way of asking if A is-a B.
Given the state of Wikidata today, with its heavy reliance on subclass relationships, anybody who's using the data, and no matter what programming language or framework they're using, absolutely needs a good way of doing the moral equivalent of wdt:P31/wdt:P279*. That "good way" (whatever it is) must be easy, concise, and efficient. It must be easy to use, it must look (syntactically) like you're just asking whether entity A has type B, and it must be implemented efficiently, using the same sorts of techniques SPARQL uses to implement wdt:P31/wdt:P279*. If you don't have that mechanism (or if it's not "good" in all three respects), then over the long term, Wikidata is simply not going to be useful to you.
The short-term stopgap we're talking about today, of ensuring that there are direct, one-step statements of Q6256ship for most countries, is just that: a short-term stopgap. It doesn't lessen the rather immediate need for everyone's query mechanism to have or acquire a properly-implemented moral equivalent of wdt:P31/wdt:P279*. And any such stopgap won't work over the longer term, because too many Wikidata data maintainers won't know about it, and will do various well-intentioned things that end up breaking it, because they'll keep assuming that everyone else is doing queries that will honor P279 subclass relationships. —Scs (talk) 18:08, 27 August 2020 (UTC), updated 18:54, 27 August 2020 (UTC)
@Scs: I'm trying really hard not to write code to chase subclass relationships, which is the entire point of this thread. I've also done it enough times to know it's an inefficient implementation. I usually end up creating a list of entities that have that subclass relationship and testing each one against each P31 value, returning true when I get a match (or returning false if it loops all the way through without a match). The list is equivalent to making a local cache of all the database calls to each entity that has the required subclass, and assumes that set is relatively stable. Lua handles those sort of tables very efficiently, but it may need an update whenever some bright spark decides to change the data relationships or adds a new subclass value. Unfortunately as I'm maintaining the principal module used to get Wikidata information into infoboxes on 100+ Wikipedias, I can't afford to take the view that Wikidata is not going to be useful to me.   --RexxS (talk) 19:34, 27 August 2020 (UTC)
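The "local cache" approach described above might look something like the following Python sketch: compute the set of classes that count as the target once, then test each direct P31 value against that set with no further lookups. The edge list is toy data (treating Mediterranean country as a subclass of country is an assumption for the example), and `subclasses_of` and `has_type` are hypothetical names.

```python
def subclasses_of(target, p279_edges):
    """All classes that reach target through 'subclass of' edges, plus target itself."""
    children = {}
    for child, parent in p279_edges:
        children.setdefault(parent, []).append(child)
    found = {target}
    stack = [target]
    while stack:
        cls = stack.pop()
        for child in children.get(cls, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Toy (child, parent) edges; the real subclass tree is larger and changes over time.
EDGES = [
    ("Q3624078", "Q6256"),   # sovereign state -> country
    ("Q51576574", "Q6256"),  # Mediterranean country -> country (assumed for this example)
]

# Built once and reused; it goes stale whenever the subclass tree changes.
COUNTRY_CLASSES = subclasses_of("Q6256", EDGES)


def has_type(p31_values, allowed=COUNTRY_CLASSES):
    """True if any direct P31 value is in the precomputed set."""
    return any(value in allowed for value in p31_values)
```

The trade-off is the one noted in the comment: membership tests are cheap, but the precomputed set has to be rebuilt by hand whenever a subclass is added or rearranged.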

I think that it wouldn't be a bad idea to have a single class "state", in the same way that there's a single class human (Q5) for humans. A state would be any entity with a territory, government, and population, which is not under the administrative control of another state. States can be considered sovereign, vassal and de facto, but a state that changes status from one to another doesn't really change its nature sufficiently that you'd create a new item for it. E.g., Taiwan was a member of the UN until 1971, but is now generally considered part of PRC, but still functions as a state. If its international status changed back to "generally considered sovereign", we probably wouldn't want to create a new item for the "new" Taiwan. On the other hand, the likes of Aruba and England are countries but not states, and if they became independent, they'd probably be represented with a new item. One could still argue about the point at which a rebel movement which has taken over some territory should be considered a state. What is lacking, however, is a way to describe non-administrative relationships between states, such as Taiwan considered part of the PRC. Maybe located in the administrative territorial entity (P131) with a qualifier. Ghouston (talk) 23:35, 27 August 2020 (UTC)

@RexxS: I think User:Scs probably deserves more credit for analysis here than I do: on the most important point here, I just seconded him.
On the larger issue: my own "deep" database work is entirely in relational DBs, so YMMV, but (pace User:99of9) this whole thing about wdt:P31/wdt:P279* : we could cache the result of relatively expensive queries like that, and maybe extend the query language to say how old a cached value is acceptable (with a maximum of maybe 1 hour, so we have a time when we know it is OK to clear the cache; anything that only comes by once an hour, we aren't saving much by caching it). For something like this that should seldom change, caching for an hour or so should be fine. Since any cache size would be finite, we might want to apply an LRU approach so that anything that is getting hit often doesn't get rotated out, and then if anything asks for something that is in the cache but the value is "too old" then that is necessarily a time when we do a full query and update the cache entry. Whether we do best to do this on partial queries or full queries would require analysis I'm in no position to make, in terms of how often the same full query is repeated; my intuition is that some queries are made very often even as the "full" query (e.g. the ones that determine that any given major city is in a particular country) but that if (for example) we can use this so that once we chain up to a U.S. state or French department we can go the next step by a cheap cache hit, there might be an even bigger payoff. With a gigabyte or so of memory someplace, we should be able to wildly outperform having to go to the DB for expensive but often-repeated queries. - 00:01, 28 August 2020 (UTC)
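To make the caching idea above concrete, here is a minimal sketch in Python. It is not a description of how the Wikidata Query Service or any existing tool actually works: `AgedLRUCache` is a hypothetical name, and the design (a caller-supplied maximum age per lookup, with least-recently-used eviction once a fixed capacity is exceeded) simply follows the post.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Tuple


class AgedLRUCache:
    """LRU cache where each lookup states how old a result it will accept."""

    def __init__(self, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.clock = clock  # injectable so the behaviour can be tested
        self._entries: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: Hashable, max_age: float,
            compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] <= max_age:
            # Fresh enough: mark as recently used and return the cached value.
            self._entries.move_to_end(key)
            return entry[1]
        # Missing or too old: run the expensive query and refresh the entry.
        value = compute()
        self._entries[key] = (now, value)
        self._entries.move_to_end(key)
        # Evict least recently used entries beyond the capacity.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
        return value
```

A caller would wrap the expensive lookup, for example `cache.get(("Q_item", "Q_class"), 3600, run_query)`, where `run_query` stands in for whatever actually performs the full query and the key is whatever identifies the repeated question.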
@Jmabel: I'm no database expert, but I assume SPARQL has an efficient way of implementing search-path-based queries, meaning that end users like us shouldn't have to do any of our own caching. I assume either that SPARQL can do its own caching, or that the database administrator has set up appropriate indices, or that exotic but indispensable techniques are in use that I've never even heard of. So I furthermore assume that tools like Lua could, if their implementors so desired, chain through to SPARQL to effectively perform search-path-based queries, leveraging SPARQL's power without anyone else having to do much extra work. (But that's a lot of assumptions, so I don't know.) —Scs (talk) 19:39, 2 September 2020 (UTC)

What are the next steps here? We have a temporary work-around in place for now, but what needs to happen next? --Oravrattas (talk) 04:45, 2 September 2020 (UTC)

@Oravrattas: I believe we should make a formal request of the Lua maintainers, to add something new to Lua's Wikibase client to make it easy (and efficient!) to perform search-path-based operations such as "A is-directly-or-indirectly-a B". But I'm not sure this makes sense, and I also don't know how/where to make such a request.
I don't know anything about the infobox machinery @RexxS: is trying to develop, but I can also imagine it might be useful to have a way (again, an efficient one) to simply enumerate all of an entity's types, that is, generate a list of all other entities reachable by a wdt:P31/wdt:P279* path. —Scs (talk) 19:39, 2 September 2020 (UTC) updated 19:52, 2 September 2020 (UTC)
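As a rough illustration of what "enumerate all of an entity's types" could mean, the Python sketch below walks one instance-of step followed by any number of subclass-of steps, which is the in-memory analogue of the wdt:P31/wdt:P279* path. The item and class names are invented toy data, and `all_types` is a hypothetical helper, not an existing Wikibase or Lua client function.

```python
from collections import deque
from typing import Dict, List, Set


def all_types(entity: str,
              instance_of: Dict[str, List[str]],
              subclass_of: Dict[str, List[str]]) -> Set[str]:
    """Classes reachable by one instance-of step, then zero or more subclass-of steps."""
    seen: Set[str] = set()
    queue = deque(instance_of.get(entity, ()))
    while queue:
        cls = queue.popleft()
        if cls in seen:  # already expanded; also protects against cycles
            continue
        seen.add(cls)
        queue.extend(subclass_of.get(cls, ()))
    return seen


# Toy data (assumption): direct statements only, as stored on each item.
TOY_INSTANCE_OF = {"Freedonia": ["island_nation"]}
TOY_SUBCLASS_OF = {
    "island_nation": ["sovereign_state"],
    "sovereign_state": ["country", "state"],
    "country": ["territorial_entity"],
    "state": ["territorial_entity"],
}

FREEDONIA_TYPES = all_types("Freedonia", TOY_INSTANCE_OF, TOY_SUBCLASS_OF)
```

With such a set in hand, "A is-directly-or-indirectly-a B" reduces to a membership test (`"country" in FREEDONIA_TYPES`). The cost lies in fetching the subclass statements, which is why the post asks for an efficient server-side implementation.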
@Oravrattas: Thanks for rolling the changes back. It does sound like some sort of request to the devs for better Lua functions is needed, I'd suggest either filing a ticket on phabricator or posting at Wikidata:Contact the development team. Probably best done by someone that can explain the equivalent in SPARQL (i.e., not me). Thanks. Mike Peel (talk) 15:53, 7 September 2020 (UTC)

"permission denied"

Hey, when I try to edit an item using Wikibase-cli (for example: wb edit-entity ./Q48080592) I'm told, I do not have permission ("permissiondenied: You do not have the permissions needed to carry out this action."). I am logged in with my User@bot-username and my bot-password.

Do I need to add any other specification in the credentials? Thank you for help! --Eva (talk) 16:59, 7 September 2020 (CET)

  • You're right. Sorry --Eva (talk) 17:52, 7 September 2020 (CET)

Tom Scott best thing video talks about Wikidata

Tom Scott (Q7817504) uploaded a video to YouTube today that leverages Wikidata to figure out "what is the best thing". It's an amusing watch and I think it has some useful feedback for Wikidata hidden in it (he found blatant vandalism on important items, and found our categorization system of mythical beings confusing in a way that is currently being discussed on this very page). Also, we're likely to see some vandalism because of the video. BrokenSegue (talk) 17:04, 7 September 2020 (UTC)

The instances of vandalism he mentioned are confusing me: it looks like the poll was run August 21, but the vandalism appears to be from today (see graphics (Q1027879), pipe organ (Q281460)). Am I missing something? Anyways, good video! Sdkb (talk) 21:38, 7 September 2020 (UTC)
I think the vandalism today is copycat vandalism due to the video itself. BrokenSegue (talk) 21:56, 7 September 2020 (UTC)
For pipe organ (Q281460) the vandal description was in place from 3 June 2020 to 17 June 2020. The other vandal edits mentioned in the video were in place for a similar time period. From Hill To Shore (talk) 00:06, 8 September 2020 (UTC)

JSTOR topic ID (P3827) creates broken links

JSTOR topic ID (P3827) creates invalid links to JSTOR. Look at the Wikidata property example (P1855) for JSTOR topic ID and follow each of the 3 links. You will see that JSTOR gives an error

We’re Sorry—We Couldn’t Find the Page You’re Looking For

Datariumrex (talk) 08:49, 2 September 2020 (UTC)

If the formatter URL no longer works, please set it to deprecated rank. --- Jura 10:48, 2 September 2020 (UTC)
I did and I added the reason why. Thanks for helping me Datariumrex (talk) 09:53, 8 September 2020 (UTC)

Merge two entities on WikiData

Could someone help me here? I'm trying to merge two entities on WikiData but don't have the faintest idea on how to do it.

The WikiData entities https://www.wikidata.org/wiki/Q2971063 and https://www.wikidata.org/wiki/Q50356342 are about the exact same building in Taagepera, southern Estonia. Yet WikiData fails to understand this and suggests the entities be merged. I just don't know how to. WikiData claims there's a conflict on the English descriptions and a conflict on WikiMedia Commons.

Could someone either do this for me or explain to me step by step how to merge the entities and what's wrong with the conflicts? JIP (talk) 11:22, 4 September 2020 (UTC)

They are not the same and are not supposed to be merged. According to the statements on Wikidata, Taagepera Castle (Q2971063) is a part of Taagepera Manor (Q50356342). Taagepera Manor (Q50356342) also includes Taagepera manor cemetery (Q18623460). ChristianKl 11:45, 4 September 2020 (UTC)
Maybe the label of the "Manor"-item isn't ideal. --- Jura 11:47, 4 September 2020 (UTC)
Would it be possible to move the Estonian and Finnish articles to Taagepera Castle (Q2971063)? I found a similar example, Alatskivi Manor (Q12358592), which includes Alatskivi Castle (Q723577). The main Wikipedia articles are all linked to Alatskivi Castle (Q723577), although the Estonian wiki also has a stub linked to Alatskivi Manor (Q12358592). For Taagepera, Taagepera Castle (Q2971063) was created in 2013 and linked to 12 Wikipedia articles. In 2018 Taagepera Manor (Q50356342) was created and linked to the Finnish and Estonian articles, leaving 10 articles attached to Taagepera Castle (Q2971063). TSventon (talk) 12:21, 7 September 2020 (UTC)
Pinging @JIP:, @ChristianKl: and @Jura1:. TSventon (talk) 11:14, 8 September 2020 (UTC)
I think Wikipedia pages should be linked with the items that the relevant Wikipedia considers desirable. I haven't looked into the particular items but if you have looked and believe that moving the sitelinks would improve things, feel free to do so.

Is this the appropriate place to ask for my userpage to be deleted?

I have moved most of the contents of my userpage to meta wikimedia org and would appreciate if my userpage could be deleted so it uses my global user page. Thanks in advance and I hope this was the appropriate place to ask for help Datariumrex (talk) 09:56, 8 September 2020 (UTC)

Now   Done. Usually you just need to place the {{Delete}} template on the page to be deleted. The page will then appear in Category:Wikidata:Deletion and at Wikidata:Requests for deletions in the first section. ---MisterSynergy (talk) 11:05, 8 September 2020 (UTC)

Page explaining suggestions, potential issues, and issues

Does anyone know of a Wikidata page explaining the differences between suggestions, potential issues, and issues on values, and also their icons? I have been searching for it but only found some hints on Help:Property constraints portal. Pressing ctrl + f and typing the word "potential" yields nothing on the page. I've tried searching on Help pages and on Google, which turned up only small mentions in past discussions. I'm making training material for editing in Wikidata and only need some pages for reference. I'm familiar with suggestions and potential issues myself, but a dedicated help page would be easier for beginners. RXerself (talk) 12:35, 4 September 2020 (UTC)

There are:
  • constraint status (P2316): qualifier to define a property constraint in combination with P2302. Use values "mandatory constraint" or "suggestion constraint"
  • suggestion constraint (Q62026391): status of a Wikidata property constraint: indicates that the specified constraint merely suggests additional improvements, and violations are not as severe as for regular or mandatory constraints
  • mandatory constraint (Q21502408): status of a Wikidata property constraint: indicates that the specified constraint applies to the subject property without exception and must not be violated
--- Jura 15:42, 4 September 2020 (UTC)
Not exactly what I'm looking for but still better than nothing. I hope there'll be a Help page for it to also explain the icons. RXerself (talk) 19:53, 4 September 2020 (UTC)
Whenever there's an icon for a constraint the icon should provide a way to click on it to get more information. When a user sees the icon they can learn what the constraint is about. When it comes to training people I don't see a good reason to tell them about all the kinds of constraints they might encounter beforehand. ChristianKl 21:14, 4 September 2020 (UTC)
I want to give a quick introduction to the different icons they might encounter and the differences between them. Right now, it still hasn't been translated to Indonesian and I tried to find it in translatewiki but couldn't. I had difficulties understanding them myself when I first came across them. RXerself (talk) 03:44, 5 September 2020 (UTC)
If you think it's useful, feel free to write one. Just bear in mind that the label and description of a property might define its scope without that being reflected entirely with constraints. Sometimes, it can be more educational to try to enter data and then see how it goes. --- Jura 04:01, 5 September 2020 (UTC)
Good idea. I think I can make one in the future. Maybe I'll try to draft one for the next training. RXerself (talk) 03:22, 9 September 2020 (UTC)
Can you point to something specific that isn't translated into Indonesian? If so we might find a way to get it translated. ChristianKl 21:26, 5 September 2020 (UTC)
The thing that pops up after you save a value that is constrained, some of the text like the term "Potential issues" and the text after it haven't been translated. So it is still, "Entities using the [property name] property should be instances of [item], but [current item] currently isn't." Property and item names are in Indonesian as they are queried from their respective label while the rest of the text is still in English. RXerself (talk) 03:22, 9 September 2020 (UTC)

referent vs sign

when you go to the page Q3248845 and scroll down to -part of- it says -everything- which suggests we are talking about the referent 'Omniverse' not the word "Omniverse". however when you click on -everything- and scroll down to -part of- it should be empty as 'everything' is not a proper part of anything but instead it says -psychology terminology- which suggests we are talking about "everything".

the most straightforward way to deal with this is deleting the -part of- statement box and not allowing one to be created for that page but there may be a better way to do this or this may be a systemic issue. --Logicdoglogic (talk) 08:33, 9 September 2020 (UTC)

In my opinion, all articles should refer to the concepts rather than the terms themselves, so omniverse should be an instance of hypothetical entity (Q18706315). In terms of part-of, I think anything could be listed as part-of everything, so my feeling is that the statement is redundant, since all x are part-of everything (Q2165236). Everything, I think, should again be the concept rather than the term, so part-of psychology terminology (Q77468620) should also be removed in my opinion, but instance of philosophical concept should remain. --Cdo256 (talk) 05:49, 8 September 2020 (UTC)
Re: "all articles should refer to the concepts rather than the terms themselves," absolutely not. Articles about individual ethnic slurs as obvious examples are a perfect example of why this is not true of "all articles." - Jmabel (talk) 16:01, 8 September 2020 (UTC)
The more I think about it the more I feel that slurs and words with strong connotations should really be senses and that if there's a Q item that links to Wikipedia articles in different languages then they're referring to different slurs. I guess I mean when it's more important to have the concept than the word as a Q node. If the corresponding Wikipedia Articles refer all to the same word then it seems reasonable to make the word a Q node, otherwise the Q nodes should be separated by language (probably by creating lexemes). My feeling is Q nodes of racial slurs for example should, for example, be renamed from '<ethnicity-slur>' to 'racial slur for <ethnicity>' and be changed to be subclasses rather than instances of pejorative (Q545779). --Cdo256 (talk) 04:00, 9 September 2020 (UTC)
@Cdo256: I wish I saw a way to make my point here without getting into using terms here that are likely to offend someone, but certainly en:Nigger, es:Nigger, and de:Nigger refer to the same English-language pejorative. - Jmabel (talk) 15:35, 9 September 2020 (UTC)
@Jmabel: I agree. --Cdo256 (talk) 15:54, 9 September 2020 (UTC)

Notability standards for inclusion of books

I am trying to understand the standards of notability for inclusion of books.

Can any book published for which the existence can be independently verified (e.g. national libraries entries exist) be included? Bogdan (talk) 18:49, 2 September 2020 (UTC)

this is a great question. and points to the weaknesses in our inclusion/notability policies. for example if we decide not all books are ok for inclusion are all works by an already included author eligible under the "structural need"? But it's hard to see how all these scholarly papers can be included but not all books. BrokenSegue (talk) 18:57, 2 September 2020 (UTC)
Probably the number of books in existence is over 100 million; maybe even towards 200 million if we include Chinese books. The Library of Congress alone has 39 million cataloged books and, by our current standards, all of them are verifiable (having LoC catalog entries). Are we ready for such an influx? Bogdan (talk) 19:04, 2 September 2020 (UTC)
I mean I think we should be asking "would it be useful to have them" (in addition to can we technically handle the volume). What would we add above what the LoC etc already are doing. E.g. would we link the books to subjects or authors or other things we already have items for. That kind of thinking though isn't encoded anywhere in our policies right now... My view is that we probably should import books for authors we have items for but not just all books. BrokenSegue (talk) 19:20, 2 September 2020 (UTC)
I'd expand that scope to include books that we are using as references for claims, whether we have the author or not. Once the book is here we can use described by source (P1343) and stated in (P248). From Hill To Shore (talk) 19:33, 2 September 2020 (UTC)
While I agree on this criterion it makes bulk import a no-no, and I would support such a decision. --SCIdude (talk) 07:40, 3 September 2020 (UTC)
See also Wikidata:Project_chat/Archive/2020/03#Notability_of_news_articles.--GZWDer (talk) 07:56, 3 September 2020 (UTC)

When talking about books, don't forget that we make a difference between work and edition (see Wikidata:WikiProject Books). So any number must be multiplied by two at least. Ayack (talk) 07:58, 3 September 2020 (UTC)

  • IMO it may be quite useful to make records for all books which are available online. Often this fact might not be particularly accessible, or might be available only in a silo created by the digitising project; but might not be queryable with a SPARQL-style query, nor a query that can systematically return and combine the information from multiple projects and multiple sources. So I think these are books that it would be useful to have consistent systematic items for.
As an example of a group of titles that we currently have systematic items for, we currently have items for about 60,000 non-periodical titles from the Biodiversity History Library (stats). IMO this is entirely appropriate.
In practical terms, it's also useful to widen the criterion from "books that we are using as references for claims", to allow items to be systematically created for all books from groups that appear to be collecting a dense number of references, or might reasonably be expected to. So for example, if we find that particular works in the Victoria County History (Q7926668) series are being significantly referenced, it makes sense to do a systematic extraction, and make sure that we systematically have items for every work in the whole group. I would argue it makes sense to in fact go further than that, and have a record for every work available in digital form at British History Online (Q2925724), and for all their authors and their publishers. Jheald (talk) 15:37, 3 September 2020 (UTC)

I do think that the ultimate goal should be to cover every book ever published in an in-depth way. The catalogue of a national library is a serious and publicly available reference per WD:N # 2. However, at the moment we do lack the manpower (and technical gadgets, entering books is really hard in Wikipedia, partially because of the reason Ayack mentioned) to do so. I see three main use cases:

  • If needed for references (per above)
  • If needed to disambiguate persons or to help automatic matching (e.g. VIAF). One of the more extreme examples (in most cases just one or two books are required): Harro Müller-Michaels (Q41142793) and Harro Müller (Q83428104) work almost in the exact same field and have a similar name. Even some libraries have a very hard time differentiating between the two. Now the two VIAF entries are still conflated but at least Wikidata got it right.
  • At least for well-known authors, all of their main works should be covered

I agree that mass-imports are generally not a good idea.--Emu (talk) 15:40, 3 September 2020 (UTC)

In practical terms it's much easier and less work overall to identify a group of titles that all ought to have items, and to extract the metadata and create all of the items all at once in one go, than to haphazardly create items in ones and twos, creating from scratch each time, without systematic method or organisation or direction. Plus, as well as the labour-saving advantages of mass production, being systematic and directed about the creation of groups of items also lends itself to the possibility of systematic quality control and project-driven quality improvement. Jheald (talk) 15:51, 3 September 2020 (UTC)
In principle yes. That is why I created several batches with hundreds of persons (and in one case, organizations) myself. However:
  1. Such mass creations work best if they are part of a specific collection and are carefully curated. The mass imports of GND ID (P227) were a more than mixed blessing, the mass imports for NL CR AUT ID (P691) are far better on average because they are on average better curated.
  2. We have a working general concept for instances of human (Q5). Books are different, Wikidata’s model is still in its early stages, I have yet to see any real best practice example for more complex books. (I think User:Kolja21 has written on this, but I can’t find it at the moment).
Again, I agree with you in principle, but the moment isn’t ripe for mass creation of books. --Emu (talk) 16:32, 3 September 2020 (UTC)
Imho every book is notable. (Millions of scientific articles have been imported, articles that never have been cited in Wikipedia.) There are standards for description (like Ayack noted, see Wikidata:WikiProject Books) but we are far from applying them consistently. LibraryThing has a tool that helps to import data from library catalogs. As far as I know there is nothing comparable in Wikidata. --Kolja21 (talk) 18:33, 3 September 2020 (UTC)
We already struggle to curate the items for scientific articles (issues) and I doubt Wikidata is a suitable place for books in general. WikiProject Books seems to struggle for years now to get things in place. That doesn't mean Wikibase itself is unsuitable for this, it might just work better in a dedicated Wikibase installation. --- Jura 05:07, 4 September 2020 (UTC)
I do consider every published book notable according to our standards. At the same time bulk adding hundreds of thousands of books should go through the bot approval process where we decide about whether we have the capacity for the corresponding edits given that we have technical limitations for the amount of edits. ChristianKl 11:57, 4 September 2020 (UTC)
I agree, eventually all books should be in Wikidata and each individual book is notable. The reason is also pretty clear: current well curated collections are focused on a single country and have a different objective than Wikidata (eg LoC or GND). Given the total number of books in the order of 100 million it would "only" double the number of Wikidata items but that is probably not a huge technical issue, the issue is rather how to do the import consistently since an import on that scale needs to be done once and done properly. Fixing it after the fact is a much larger undertaking than an initial import. Therefore: adding single books is fine, adding a few thousand books using imports here and there is not fine since it makes a true large-scale import much harder. --Hannes Röst (talk) 14:30, 9 September 2020 (UTC)
PS: what is the technical limit for the number of edits? --Hannes Röst (talk) 14:30, 9 September 2020 (UTC)
@Hannes Röst: "all books should be in Wikidata"? All self-published romance novels? All single-copy family memoirs? Unpublished fan fiction sitting in someone's drawer? It seems to me that there has to be a line somewhere, and the question is where. - Jmabel (talk) 15:22, 9 September 2020 (UTC)
@Jmabel: How do you feel about fan fiction with their own identifier values? Should they be allowed?--Trade (talk) 19:25, 9 September 2020 (UTC)
Yes there is a line somewhere, if I make a note on a post-it sheet I cannot simply call it a "book". Maybe I should have qualified my statement in agreement with ChristianKl, every published book should be in Wikidata (as well as some unpublished ones). Wikidata:Notability clearly states "It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references." -- so no in most cases, unpublished fan fiction sitting in a drawer does not have serious and publicly available references. There may be some exceptions where a serious and public reference exists for "stuff in a drawer", eg. Nachlass (Q3827332) of notable people and other archival documents / letters / diaries / fragments which were never officially published. Self-published books have to be evaluated by the same Wikidata:Notability criterion: if they have an ISBN, then yes; in other cases it will depend on whether they are listed in a "serious and publicly available" list of such books. --Hannes Röst (talk) 15:38, 9 September 2020 (UTC)
I do not regard ISBN (a rather recent concept) as a useful selection criterion. In the near term we should IMO prioritize non-fiction works that could be used as references here on Wikidata, focusing more on scholarly bibliographies and less on (general) library catalogues. --HHill (talk) 15:51, 9 September 2020 (UTC)
I totally agree with the short-term focus but I think it's important to have a long-term vision as well. I agree ISBN is recent, my argument was not to exclude everything without an ISBN but rather use it as one inclusion criterion. Probably every book published before some date (eg 1900 to 1970) should be notable by default simply because publishing was so comparatively expensive and difficult that only important works were published in book form. Maybe 1972 would be a good cutoff date since this was when ISBN were introduced? I agree that some books that have an ISBN are probably not of the highest literary standard, but so what? Wikidata:Notability doesn't say anything about the literary standard to achieve to be included. --Hannes Röst (talk) 17:45, 9 September 2020 (UTC)
I don't think ISBN is a sufficient criterion to weed out the ephemera I mentioned in my comment below, as well as spam and pirated works and so on. Self-promoters are often willing to pay for an ISBN. And there are many, many ISBNs that have no actual existing book corresponding to them. — Levana Taylor (talk) 21:33, 9 September 2020 (UTC)

The rise of easy self-publishing has led to an enormous amount of ephemera -- works that appear on Amazon or Smashwords for a few days before being removed; works that are revised after being offered for sale without any indication on the title page or in the description; works that are announced but aren't actually available; authors banding together to create promotional "anthologies" of the first chapters of their books, often limited-time-only; promotional book excerpts being listed on Amazon or Goodreads as independent works ...

I recall that this is something that ISFDB has wrestled with a lot, since it is their mission to catalog every published edition of speculative fiction. They originally limited the scope of the problem by not cataloging works that appeared in self-published electronic, rather than paper, form, but that's obviously become untenable in the last few years, with considerable significant writing appearing in that form along with the ephemera. (The distinction between electronic and print is even increasingly arbitrary, since you can order P.O.D. copies of many publications that were primarily conceived as e-books -- and that raises the issue that you don't even know if any physical copies of a P.O.D. book exist, quite possibly no-one's ordered one.) So they've had to give a lot of thought to how to develop a method for determining whether an edition exists in stable form, and also use the method in a systematic manner to triage the flood of new Amazon releases each month to find ones needing to be catalogued in ISFDB. It might be worth trying to find their forum discussions and their latest policy for helpful ideas.

One possible way to cut down on the flood in Wikidata would be to require libraries as references, especially since we're more interested in works than editions: a work must be stable in order to be included in a library's collections. This would define a class for bulk import; exceptions for things like works that are references or have sitelinks could be managed by hand. — Levana Taylor (talk) 17:34, 9 September 2020 (UTC)

I agree in part, and the idea of using national libraries to deal with the problem seems like a good one at first. But from my understanding libraries are dealing with the same issues and becoming overwhelmed trying to sift through all the publications, and may not necessarily do a much better job at this than a crowdsourced place like Wikidata. However, to me it is clear that POD is not sufficient in case no book was ever printed -- I would say that there needs to be a physical object printed somewhere for the concept of a "book" to make sense. --Hannes Röst (talk) 17:45, 9 September 2020 (UTC)
"I would say that there needs to be a physical object printed somewhere for the concept of a "book" to make sense." Hmm. Just as there are some trivial ephemera that exist in multiple copies, there are some substantive works whose only existence is online. Nile Thompson and Carolyn J. Marr wrote pretty much the definitive work on the history of Seattle Public Schools' buildings, Building for Learning: Seattle Public School Histories, 1862-2000. I've seen it described as "published in 2002 by Seattle Public Schools", and there may even be a physical copy someplace, but I do a lot of work on Seattle history and I've never seen one. Instead, it's been hosted in a variety of places online, and seems to have found a permanent home serialized as a set of web pages at historylink.org. - Jmabel (talk) 23:01, 9 September 2020 (UTC)
What I'm suggesting is that the set of all books that have been added to library collections would make an adequate starting data set -- it wouldn't contain everything of potential interest, far from it, but it would contain most scholarly writings, and the fact that it would be heavier on those than on fiction isn't a problem if one of the main purposes of a mass import is to have reference-citations available. The main reason I'm proposing libraries is just that if something's been added to their collections it almost guarantees that the book exists and isn't spam or an excerpt.
I think fiction and popular nonfiction would be better found via some curated set of author bibliographies. Libraries purchase only very few of the legitimate e-books (or, in older days, mass market paperbacks) that exist, so another source for non-scholarly works would be needed if we wanted them, and author bibliographies seem like the best chance for being comprehensively systematic. — Levana Taylor (talk) 08:17, 10 September 2020 (UTC)
Ideally there should always be an easy way to get from a Wikidata item to library shelfmarks or a digital object on a stable platform. Items on books without such links should probably be suspected of being non-notable unless they are treated like Deperdita (Q26877876) in relevant bibliographies and catalogues such as Gesamtkatalog der Wiegendrucke (Q1421256). --HHill (talk) 09:55, 10 September 2020 (UTC)

Terms and Conditions May Apply (Q15048686) has cast member (P161) set to Edward Snowden (Q13424289). It's got a little flag, so I'm trying to figure out how to improve it and would appreciate guidance. He appeared as an interview subject in a documentary, yet I can't find the Q for interviewee in the dropdown. What do? Psiĥedelisto (talk) 01:36, 10 September 2020 (UTC)

I've added interviewee (Q55534929) and interviewer (Q46034607) to Property:P161#P161$af8ac6f9-43e1-d25c-f558-847edae71c95 so it should now be in the drop down. --Cdo256 (talk) 06:25, 10 September 2020 (UTC)
Okay, I've reverted my above change, as this is a little more complicated than I would have hoped. The constraint requires an occupation (P106) qualifier, but interviewee is a role rather than an occupation (Q12737077), so adding it to the constraint breaks the constraint that the occupation (P106) qualifier has to be an instance of occupation (Q12737077). I think the proper solution is changing the constraint to require object has role (P3831) instead of occupation (P106), and changing all instances to match. I'm not confident doing this without discussion first though. --Cdo256 (talk) 06:43, 10 September 2020 (UTC)

Railway electrification systems

There is a general data-item 'railway electrification system' Q388201. But there are also country articles on the subject with a historic overview, such as Q46858768 (Elektrificatie van spoorlijnen in België) and Q2496090 (Elektrificatie van spoorlijnen in Nederland). The Danish 'Elektrificering af danske jernbaners togdrift' is wrongly coupled to Q388201.

I try to use the property 'subclass of' (P279) to link to the general article 'railway electrification system'. But I have my doubts. Is there a property for a regional/local specific article of a main subject? Smiley.toerist (talk) 17:39, 9 September 2020 (UTC)

There are of course country railway articles which have sections on electrification systems and history. Unfortunately, linking to these sections is not permitted. A major limitation of Wikidata. Smiley.toerist (talk) 17:39, 9 September 2020 (UTC)

In this case instance of (P31) would be more appropriate since Q46858768 and Q2496090 (I assume) are fully specified classification systems, whereas the general railway electrification system (Q388201) isn't fully specified so should use subclass. I'm not sure what you mean by the third paragraph, "There are of course country railway articles which have sections on electrification systems and history. Unfortunately, linking to these sections is not permitted." Do you mean linking to sections of the Wikipedia article? --Cdo256 (talk) 06:09, 10 September 2020 (UTC)

Yes, as most Wikipedias have some information about the electrification systems of the railways, but not necessarily in one specific article. Smiley.toerist (talk) 06:57, 10 September 2020 (UTC)
I've added electrification of railway network (Q99196116) as I think it's more appropriate. --Cdo256 (talk) 02:56, 11 September 2020 (UTC)

Adding oneself to WikiData

Hello all,

a bot recently added structured data to one of my photos over on Wikimedia Commons including me as the author. This metadata field does not link anywhere as I don't exist as a WikiData Object. This is why I started thinking about just adding myself to WikiData and started to do some reading on the rules on relevance. According to rule 3 at Wikidata:Notability me adding myself would be fine. I do not, however, feel comfortable just adding myself without asking anywhere first, as in many other projects changing data about oneself is not allowed.

Based on that I wanted to start this discussion on if adding yourself to WikiData is an acceptable thing to do.

Kindest regards, Unkn0wnCat (talk) 11:44, 4 September 2020 (UTC)

Jmabel, re: " I don't see why posting one's own pictures on Commons should make someone notable": I believe some people feel 'structural want' = 'structural need'. And almost any entity with an internet presence or directory listing can arguably satisfy Wikidata:Notability #2. The concept of a notability threshold is becoming more illusory every day. -Animalparty (talk) 17:22, 4 September 2020 (UTC)
  • There's apparently a bit more notability required for living people, especially if they are creating their own record. If you don't succeed now, I'd suggest trying again posthumously. There are items for historical people about whom very little is known, or ever likely to be known, such as imports from the British peerage database or Nathaniel Oldham (Q55445723): which is literally a record from a single census, and which survived a deletion request. Ghouston (talk) 12:09, 5 September 2020 (UTC)
  • @Unkn0wnCat: We have no prohibition on self-authored items, and our - admittedly vague - notability criteria are at WD:N. Please do not be put off by inapplicable pejorative terms like "vanity items". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:55, 6 September 2020 (UTC)
  • Unlike free-form writing (such as a Wikipedia article), Wikidata isn't going to become a press article. What's your name - enter it. What's your DOB - enter it. Do you have a website - enter it. It is all just facts and figures, no fluff. Which is why WD is easier on "CoI" rules than other projects. You still need to pass notability, but as of today (because changes have been discussed in the past) there is no rule against adding/editing yourself. Quakewoody (talk) 12:15, 6 September 2020 (UTC)
  • Thanks for all of your responses! From the responses to my post it is clear there is no consensus on what the notability rules deem qualified or unqualified, so for now I will just stick to editing stuff about others and hold off on editing stuff about myself. Maybe one day someone will create an object about me, or the rules will get redefined to take a clear stance on something like this, and I can fulfill my inherent need for everything to be linked. :P Hope everyone is having a nice day and again thanks for all of your responses, Unkn0wnCat (talk) 12:43, 6 September 2020 (UTC)
  • Also, there is no stopping someone who has deletion rights from deleting your entry. I write as someone who recently had my entry deleted without debate. Try having your own entry restored post mortem and it will be more difficult, being dead and all. I think 50 and 100 years from now historians will want to know more about the various photographers for Commons. The LCCN database from the Library of Congress is full of people with no known birth and death dates, who wrote a single book or are mentioned in other books. The Library of Congress has a project at Flickr Commons to identify people in historical photos and provide a date and context. Historians in the future will want to know more about the people of today, let's make it easier for them. --RAN (talk) 03:24, 11 September 2020 (UTC)

What’s your experience with reporting bugs or feature requests to the Wikidata development team?

Hello all,

Currently, various channels exist to report bugs or make feature requests related to the Wikidata software. You can for example leave a message on Wikidata:Contact the development team, or create a ticket on our task tracking system Phabricator with the Wikidata tag. You can also ask questions on the various social channels run by the Wikidata community, like the Facebook group, Twitter, or the Telegram group.

We identified some issues with the current process from our perspective, and we would like to hear about your own experience, whether or not you already submitted a bug report or a feature request. After collecting all of this feedback, we will propose a reviewed and improved process for you to interact with the Wikidata development team.

You can find more information about the project here (current status, problems we identified, timeline). You are very welcome to give us feedback, either using this anonymous form, or answering the questions publicly on this talk page. This feedback loop will run until September 30th.

If you prefer giving feedback in person, we can also offer you a live call to talk about your experience! This call will take place on September 15th at 18:00 UTC on Jitsi.

Thanks for your attention, Lea Lacroix (WMDE) & -Mohammed Sadat (WMDE) (talk) 13:22, 10 September 2020 (UTC)

One data point: there was a long-running thread on this page (now archived) that has pretty clearly identified a need for a new feature, but I don't think anything has been formally requested yet, in part because none of the people involved (myself included) are quite sure how to do so. (This particular issue probably needs some brainstorming between developers and users, since the users aren't 100% sure what they need, or what is feasible.) —Scs (talk) 14:56, 10 September 2020 (UTC)
Thanks for mentioning this thread. We had a brief look at it, but as you mention, it's quite long, and I couldn't identify a clear call to action for the development team here. In these kinds of cases, we don't intervene in discussions about the content itself, and we let the community find a consensus, then bring a clear request. I also understand that on some topics, more discussion with the development team is needed. Unfortunately, talk pages onwiki are not the best channel for brainstorming; this is the kind of thing where an IRL discussion is much more efficient. Typically, we have these kinds of discussions during hackathons and conferences. Maybe you could also bring it up during the next office hour or even Wikidata's birthday online meetup. In any case, a summary of the issue will definitely help us understand the needs. Lea Lacroix (WMDE) (talk) 07:09, 11 September 2020 (UTC)

Is it okay to use takes place in fictional universe (P1434) or from narrative universe (P1080) on instances of Wikimedia list of fictional characters (Q63032896)?--Trade (talk) 09:24, 11 September 2020 (UTC)

@Trade: I don't think so. Use is a list of (P360) like here. --Haansn08 (talk) 12:57, 11 September 2020 (UTC)

Help requested for several close disambiguation pages

Hi. I currently don't really know what to do with Craterus (Q352303), Krater (Q537876), Cratère (Q16265833) and Krateros (Q18590183), some having different pages in one project. I feel like a bunch should be merged or at least sitelinks should be rearranged, but I'm unsure. Could someone look at it? Thanks! --Jahl de Vautban (talk) 19:50, 11 September 2020 (UTC)

Those who do not login cannot see the data and interface in their own language.

As it says in the topic title, if the visitor is not a member, he or she sees the interface and all content in English. A warning called dialect appears on tes.wikidata.org and the visitor can click here to view the data in his own language (interface again in English). Will a solution be found for this? Thank you. --Sezgin İbiş (talk) 00:13, 12 September 2020 (UTC)

This seems to have been already requested (a couple of years ago) at phab:T196536. Ghouston (talk) 00:47, 12 September 2020 (UTC)

Hello all,

Per request of editors and because we often get the question "how can we get language X to be added in part Y of the Wikidata interface?", I drafted a table where my colleagues and I tried to list all the possible lists of languages that are used here and there in the Wikidata interface and data model. This document is still a work in progress, and as you will see, we are still collecting information before we can eventually turn it into an official documentation page. Feel free to help if you have knowledge in this area:

  • add more content or links for the existing use cases
  • add extra lines if you noticed that a list of languages is missing
  • ping me if you have any questions, or something is unclear

Cheers, Lea Lacroix (WMDE) (talk) 15:07, 10 September 2020 (UTC)

English labels

Due to a conversation above I checked some of the English labels of government departments and found them to be quite inconsistent. For example for the justice ministry:

SELECT ?item ?itemLabel 
WHERE 
{
  ?item wdt:P31 wd:Q1413677.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

we see about half of them are labelled "Ministry of Justice" while the other half have qualifiers ("Ministry of Justice (France)" or "Ministry of Justice of Madagascar"), which seems quite inconsistent. What is the preferred solution here? Looking at WD:L, it seems to me that "Ministry of Justice" is correct, since in most statements it would be clear from context which country's ministry is referred to. Any thoughts? --Hannes Röst (talk) 15:16, 11 September 2020 (UTC)

How about checking the website of the ministry and using whatever is mentioned there? --- Jura 15:19, 11 September 2020 (UTC)
That is easy for the native language, but often the website is not in English: see for example Ministry of Justice of France (Q509816) http://www.justice.gouv.fr/ which lists correctly "ministère de la Justice" in French but there is no official English title that I could find. Furthermore, they may use multiple names (e.g. https://www.hamburg.de/bjv/ lists both "Behörde für Justiz und Verbraucherschutz Hamburg" and "Behörde für Justiz und Verbraucherschutz"). Should we just translate that and try from that to figure out whether they use a qualifier for their country/jurisdiction or not? In general, it is very uncommon for a highest level (country level) ministry to use a qualifier in their name. --Hannes Röst (talk) 15:59, 11 September 2020 (UTC)
The French government has an English language website that does translate it as "Ministry of Justice". --- Jura 07:38, 12 September 2020 (UTC)
In my opinion parenthetical qualifiers like (France) should be removed from the labels - make sure the descriptions include the country name though! If the official name in the native language includes the country name then there's no reason it shouldn't be retained in English also. Names of government departments or ministries are unlikely to be completely consistent from country to country even within a single language, so I don't think we should expect them all to look exactly alike. ArthurPSmith (talk) 17:30, 11 September 2020 (UTC)
For positions, the country/territory is generally added, see Special:Search/Prime Minister haswbstatement:P279=Q14212. Maybe the same should be done for offices. --- Jura 07:38, 12 September 2020 (UTC)
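As a rough aid for the label question raised at the top of this thread, the three label styles mentioned (bare, parenthetical qualifier, "of Country" suffix) could be tallied mechanically. The following is only a minimal sketch: the sample labels are hypothetical stand-ins for the query results, and the function names and the category names are assumptions made for illustration, not part of any Wikidata tool.

```python
import re
from collections import Counter

# Hypothetical sample; real input would be the ?itemLabel values from the query.
SAMPLE_LABELS = [
    "Ministry of Justice",
    "Ministry of Justice (France)",
    "Ministry of Justice of Madagascar",
    "Ministry of Justice",
]

BASE = "Ministry of Justice"


def classify_label(label, base=BASE):
    """Classify a label as 'bare', 'parenthetical', 'suffix', or 'other'."""
    text = label.strip()
    if text == base:
        return "bare"
    # e.g. "Ministry of Justice (France)"
    if re.fullmatch(re.escape(base) + r"\s*\(.+\)", text):
        return "parenthetical"
    # e.g. "Ministry of Justice of Madagascar"; this also catches longer
    # official names that merely begin with the base string.
    if text.startswith(base + " "):
        return "suffix"
    return "other"


def summarize(labels, base=BASE):
    """Count how many labels fall into each style."""
    return Counter(classify_label(label, base) for label in labels)
```

Calling `summarize(SAMPLE_LABELS)` gives a per-style count, which would make it easier to see how far the labels are from any one convention before deciding which to prefer.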

Is there a way to see items or properties by index?

I've just discovered Wikidata, and I think it's great. I'm now curious to know whether it's possible to query Wikidata to view a selection of items or properties by index. For example, say I wanted to list items Q1 ... Q100. Is there a way of doing that through SPARQL, or otherwise? Mynotoar (talk)

You can use Special:AllPages to view the index. --Pyfisch (talk) 09:46, 12 September 2020 (UTC)
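Another possible route for the Q1 ... Q100 question is the MediaWiki API's `wbgetentities` action, which accepts a pipe-separated list of IDs. The sketch below only builds the request URLs and makes no network call; the helper names are made up for illustration, and the batch size of 50 is an assumption about the usual per-request limit for ordinary accounts.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def qid_range(start, stop):
    """Return the item IDs Q<start> .. Q<stop>, inclusive."""
    return [f"Q{n}" for n in range(start, stop + 1)]


def batched_urls(ids, batch_size=50):
    """Build one wbgetentities URL per batch of IDs (batch size is an assumption)."""
    urls = []
    for i in range(0, len(ids), batch_size):
        params = {
            "action": "wbgetentities",
            "ids": "|".join(ids[i:i + batch_size]),
            "props": "labels",   # fetch labels only, to keep responses small
            "languages": "en",
            "format": "json",
        }
        urls.append(API_ENDPOINT + "?" + urlencode(params))
    return urls
```

For example, `batched_urls(qid_range(1, 100))` yields two URLs. Note that not every number in such a range necessarily corresponds to an existing item.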

Quick question: Which property should be used for a TV channel shutting down?

I was just editing The Comedy Channel (Q7726918) to include the date that the channel shut down, but noticed there's at least a few different properties that could represent an "end date" (such as either end time (P582) or dissolved, abolished or demolished date (P576), but there might be others). Which one would fit best in this situation? -- numbermaniac (talk) 06:12, 11 September 2020 (UTC)

As there is inception (P571), I'd use dissolved, abolished or demolished date (P576) --- Jura 06:32, 11 September 2020 (UTC)
How about a game being removed from a digital distribution platform? --Trade (talk) 09:20, 11 September 2020 (UTC)
an end time (P582) on the distributed by (P750) property? BrokenSegue (talk) 14:22, 11 September 2020 (UTC)

Missing increments for newly created items

I was watching the recent changes and limited the changes to only show new items. I noticed that some were missing. Noticing that they were bot edits, I turned that on.
However, some items are still missing:

According to the log there has never been an item Q99229481 or Q99229482. Does anybody have any idea what happened with these numbers? If it existed at one point in time it should look like Q97229431 (log)
BTW: This does not happen once, but all the time. (Q99229521, Q99229518, Q99229515, Q99229511, all from Q99229489 to Q99229485, Q99229481, from Q99229477 to Q99229479, Q99229468 and Q99229469, Q99229463 and Q99229462) And this short list is only out of the last 50 edits.

Also: Is there any way to add content to these (as it looks right now) unused item numbers - or maybe create an item with the number Q99229481? --D-Kuru (talk) 10:33, 12 September 2020 (UTC)

This happens when someone sends some input to the servers in order to create an item, but the software finds that this cannot be done. Reasons could be: user has exceeded the rate limit, user is blocked, input is invalid in some form (e.g. label/description collision with another item, etc), and potentially others. The Q-ID counter has already been incremented before the edit failed.
There is no way to add content to these Q-IDs anymore. This event has meanwhile happened roughly 6 million times, i.e. on average 1 out of roughly 17 items fails to be created. There are times when the fail rate is significantly higher, and others when it is lower. —MisterSynergy (talk) 11:24, 12 September 2020 (UTC)
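The mechanism described above (the counter advances first, then the edit may fail) can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name, the random failure model and the parameters are invented for illustration and say nothing about how Wikibase is actually implemented.

```python
import random


def simulate_creations(attempts, failure_rate, seed=0):
    """Toy model: each attempt reserves the next ID, then may fail validation."""
    rng = random.Random(seed)
    next_id = 1
    created = []   # IDs that ended up as real items
    skipped = []   # IDs that were reserved but never used
    for _ in range(attempts):
        qid = next_id
        next_id += 1  # the counter is incremented before the edit is checked
        if rng.random() < failure_rate:
            skipped.append(qid)  # e.g. rate limit, block, invalid input
        else:
            created.append(qid)
    return created, skipped
```

With a failure rate of 1/17, a run such as `simulate_creations(1000, 1 / 17)` leaves gaps scattered through the ID sequence, much like the ones listed in the question. As a consistency check on the figures quoted: 99,229,481 divided by 17 is about 5.8 million, in line with "roughly 6 million".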
Already tracked on Phabricator. --Epìdosis 11:28, 12 September 2020 (UTC)
Seems to be an old, stale ticket not really related to the more severe problem these days. --- Jura 11:35, 12 September 2020 (UTC)
I don't blame the staff for not being too worried about this: It's an aberration, a minor annoyance. I don't believe it's causing any problems. A fix would likely be complicated and would carry the significant risk of additional, more serious bugs. If resources are limited (and, face it, resources are almost always limited) there are bound to be more important things to work on.
Paying some attention to the bots that are prone to making the errors that cause the "lost" IDs, yes, that's something we could certainly work on. —Scs (talk) 13:08, 12 September 2020 (UTC)
A cause for this problem might be that in the last few days some indexes have not been updated immediately, but with a delay of some days. For example, this Petscan result should show only articles without Wikidata objects. In the last few days, entries have also been shown where the articles have already been connected to a (new or existing) Wikidata object. --M2k~dewiki (talk) 13:33, 12 September 2020 (UTC)
This is a long-term issue. The largest problem is that it is impossible to find the users doing this, and even if they are found, blocking the user does not solve it.--GZWDer (talk) 13:55, 12 September 2020 (UTC)
Hammering the server by bot in a severe situation isn't going to help solve it either. --- Jura 14:02, 12 September 2020 (UTC)

Deletion of entries from databases we agree to upload

Recently there has been debate at Wikidata:Requests_for_deletions concerning entries from a database we agree to upload: The Peerage (Q21401824). Once we agree to upload a dataset should we be deleting entries on an ad hoc basis, or are they sacrosanct once we have reached consensus to upload them? This will come up in the future as we upload more large datasets of people, so best to discuss it now. We already correct errors and merge duplicates within the dataset. --RAN (talk) 02:35, 28 August 2020 (UTC)

In my opinion discussion should be centralized at Wikidata:Administrators'_noticeboard#Please_restore_the_red_link/links_in_this_family_tree.--GZWDer (talk) 03:58, 28 August 2020 (UTC)
This is a general question open to everyone, and it affects future entries. Administrators noticeboard is for deletion/restoration arguments among administrators for specific cases. --RAN (talk) 04:17, 28 August 2020 (UTC)
I don't see why we should be bound by past decisions forever. BrokenSegue (talk) 04:20, 28 August 2020 (UTC)
  • Nothing is "sacrosanct" just because there has been a discussion earlier that led to an "agreement" of whatever form. If issues arise from a certain import, we need to address them and that can also result in deletions of parts or even a complete import. It is pretty difficult to make an educated decision for or against an import, as many users do not really have an overview about the dataset to be imported.
    For The Peerage (Q21401824) in particular, I have serious concerns about the many items about minors that really should not have been imported here at all. It is not that much a matter of notability; I am concerned about BLP. Not sure whether it has been discussed earlier or before the import, but I really think that all items about minors imported from ThePeerage should be deleted. —MisterSynergy (talk) 08:22, 28 August 2020 (UTC)
  • English Wikipedia or ThePeerage should not publish such information either, but it is their problem if they decide to do so. In general, I do not think that another project should force us to host data that we find problematic; if they want to publish such data, they need to host it locally.
    Publishing information about living people is always an ethically problematic act and we are already extremely liberal in this regard. The general notion here is that being described by a Wikidata item is a desirable situation and we only assess whether an item is admissible based on the notability policy. This is, however, not generally valid and we regularly see persons who ask to have their item or specific information from it deleted from Wikidata—decisions are made on a case-by-case basis. For minors in particular, we cannot expect them to make an educated decision about this question, thus we should rather err on the side of caution and not host any (personal) information about them here. You can still assign number of children (P1971) claims to an item with a sitelink. —MisterSynergy (talk) 18:13, 28 August 2020 (UTC)
  • The names of Cumberbatch's two children have been widely reported in the press, as the citations on the en-wiki article show. So why should en-wiki (or any other wiki, deriving from us) not report that already widely-reported information? In fact, this typically will be the case in general: that the information has either been in the press, or publications like Who's Who, Burke's Peerage, or Debrett's Distinguished People, otherwise neither en-wiki nor ThePeerage would know those names. So why should we suppress it? Jheald (talk) 21:19, 28 August 2020 (UTC)
  • Gossip magazines (and occasionally also the more "serious" press) also infringe on the privacy of Cumberbatch's children as they earn quite some money that way. This is generally so much the case that it is questionable whether being born as a descendant of a prominent person is a gift—or a liability. Here at Wikimedia we should not be part of this problem and be extremely careful with such information, even if it is well-known on the Internet anyways. —MisterSynergy (talk) 21:33, 28 August 2020 (UTC)
    • I'm fairly certain that this wasn't discussed before upload (except maybe between RAN and GZWDer privately).
      At some point, all items were created and we had to start repairing and completing them. Some of this still needs to be done.
      The database is fairly important for some aspects of the UK before Tony Blair and can be useful for that. Still it includes a large number of items after Tony Blair or unrelated to the UK. Also, apparently TP includes third party database imports that couldn't be referenced otherwise. --- Jura 09:05, 28 August 2020 (UTC)
Please stop the libelous speculation by writing "wasn't discussed before upload ... except maybe between RAN and GZWDer privately". I had nothing to do with the upload and had no communication privately or publicly with anyone on the subject. --RAN (talk) 15:54, 28 August 2020 (UTC)
Can you provide links to what you meant with "entries from databases we agree to upload"? You have left unanswered the question about your claim before. I'm really curious about who agreed with GZWDer about "databases we agree to upload". You did write "we". --- Jura 16:17, 28 August 2020 (UTC)
Ah, I see. I apologize. I meant we, as in Wikidata. Sorry for the confusion. --RAN (talk) 18:33, 28 August 2020 (UTC)
    • BLP does not preclude information covered by reliable source, such as Burke's.--GZWDer (talk) 11:03, 28 August 2020 (UTC)
      • True, but a lot of TP's cited sources are private e-mails. I do wonder about the ethics of turning word-of-mouth gossip into something apparently authoritative by citing TP as the source here, from where it will again be picked up with Wikidata cited as the source. — Levana Taylor (talk) 01:44, 29 August 2020 (UTC)
  • @Richard Arthur Norton (1958- ): We didn't agree to upload The Peerage (Q21401824). The decision to upload was made without seeking an agreement via the bot approval process or an agreement on the project chat. Given the amount of imported data, I think an agreement should have been sought via the bot approval process. If that had been the case, I think the outcome of that process should be taken into account when discussing whether to delete items, but even then we can change our minds. ChristianKl 11:16, 28 August 2020 (UTC)
If for some reason, we agree to delete entries, the criteria should be objective and done by a bot. When an individual chooses what to keep, and what to delete, from a curated collection like TP, we introduce subjective bias. For instance, someone may think wives are not notable enough to be included, which removes them from history. These are in addition to whatever biases may already be in the TP database. --RAN (talk) 18:40, 28 August 2020 (UTC)
If Wikidata imports someone else's database, I don't think there should be deletions unless there are substantive legal issues involved in retaining such.
If someone thinks that certain QIDs should not be in the database, there should be procedures for defining a separate property (or properties) to indicate the source and utility for different purposes.
EXAMPLE: I once quoted Jones and Libicki (2008)[1] to someone who had spent years in the US military including in senior positions in the US Department of Defense. My respondent complained that the DoD did not think highly of the work of the Rand Corporation on something like this. In my judgment the best response to that kind of complaint would be to get the data used by Jones and Libicki (2008) into Wikidata and add a property or properties to allow others to flag individual cases and redo the analysis using different definitions of how individual cases should be coded, which cases should be included and which not. That should help elevate the debate from "We don't trust a particular source" to a focus on the sources of distrust.
The "Coalition of the Willing" has killed hundreds of thousands, and maybe millions, of people and spent over three trillion dollars, if we believe the estimate by Stiglitz and Bilmes (2008).[2] If Jones and Libicki are correct, this entire exercise has made the world poorer and less safe.
I believe this kind of research could be crowdsourced on Wikidata. I have so far not been able to initiate such a project, but I hope to in the future if someone else doesn't do it without me.
Secondarily, what are the notability requirements for Wikidata?
I had understood that there weren't any. I've been routinely creating Wikidata items for authors of publications I cite when they do not already have a Wikidata item. Many if not all of the people for whom I create Wikidata entries are not (yet) the subject of a Wikipedia article. (Of course, having them in Wikidata should make it easier for someone in the future to decide if a given author was sufficiently notable to deserve a new Wikipedia article based on the number of publications they've authored that are in Wikidata. However, I don't see that as super relevant to any notability assessment.)
Similarly, a photographer at Wikimania Montreal uploaded File:Spencer Graves-2.jpg to Wikimedia Commons on 2017-08-29. I created a companion Wikidata entry, Spencer Graves, Wikidata Q56452480, on 2018-09-03. This person was the author of material that I cited in Wikiversity articles, so I created a Wikidata entry on him. [By the way, "him" is "me", in this case.]
Should I not be using Wikidata in these ways?

References

  1. Seth Jones; Martin C. Libicki (2008). How Terrorist Groups End: Lessons for Countering al Qa'ida. RAND Corporation. ISBN 978-0-8330-4465-5. JSTOR 10.7249/mg741rc. OL 16910145M. Wikidata Q57515305. 
  2. Joseph E. Stiglitz (2008). The Three Trillion Dollar War. W. W. Norton & Company. ISBN 978-0-393-06701-9. OCLC 181139407. OL 624824W. Wikidata Q7769107. 
DavidMCEddy (talk) 20:20, 28 August 2020 (UTC)
  • All large databases, from The Peerage, Geni.com, WikiTree, and even the Encyclopedia Britannica have errors, and nothing is sacrosanct. All entries imported from crowd-sourced databases like Find a Grave and Geni.com should be viewed with a degree of caution and skepticism: duplicate persons are common, and hoaxes (entirely fictional people) are certainly possible (and as I've said before, The Peerage is literally the work of just one guy!). For duplicates we can merely merge items, and note that they may have 2 or more external identifiers. Hoaxes and unverifiable items should probably be deleted, regardless of what disreputable websites claim. External databases should not dictate policies or data curation on Wikidata. It is appropriate to delete items on a case-by-case basis, even if it means mildly inconveniencing a data query. We have to have standards and be willing to say no, otherwise it is inevitable that one day, a devoted bot handler will scrape every person ever named in print or online, from birth records to yearbooks to Facebook profiles (all of you will be items, hooray!), to feed the ever hungry beast that is Wikidata. -Animalparty (talk) 21:20, 28 August 2020 (UTC)
    • Even Kindred Britain, a more formal genealogy project hosted at Stanford University, contains errors. In particular, Kindred Britain has a tendency to record affairs as marriages, which resulted in Oscar Wilde (Q30875) having simultaneously a wife and a husband. I deprecated the "husband" and inserted the name as unmarried partner (P451). Hoax entries should be retained but deprecated in some way. If you identify it as a hoax, the data has been cleaned and can be ignored; if you delete the hoax, the data will be restored by another user when you aren't paying attention and will again show up in reports. From Hill To Shore (talk) 23:41, 28 August 2020 (UTC)
So you're advocating that if I were to go online and create a profile for the love child of Richard Nixon (Q9588) with Margaret Thatcher (Q7416) (let's call him Baby Adolf Hussein Thatcher-Nixon), and I (or some dumb robot) dutifully created a corresponding Wikidata item, that that should forever be on Wikidata? That Nixon and Thatcher should have Baby Adolf as a (deprecated) child (P40) because someone on some website said so? Why would we put rubbish on the same level as research? This speaks to a deeper question about the guiding, foundational philosophy of Wikidata (if there is any): should Wikidata have all information ever, or good information? -Animalparty (talk) 02:28, 29 August 2020 (UTC)
That's a great example: If a QID is created for such with a reference to a weblink that actually makes such a claim, I think Wikidata should have a procedure for marking it as "Misattributed", as is done in Wikiquote, e.g., Wikiquote:Abraham Lincoln#Misattributed.
If a QID contains no claimed source, then it might be sensible to delete it, provided the most recent change was at least, say, 48 hours old, so we don't delete a QID that a volunteer is in the process of creating.
Such a "Misattributed" property may need to distinguish between an unreliable source and a source that seems not to provide the claimed information. DavidMCEddy (talk) 05:34, 29 August 2020 (UTC)
@DavidMCEddy: Common values for the reason for deprecated rank (P2241) qualifier: https://w.wiki/afu One of these should allow you to make the distinction you wish. Jheald (talk) 08:49, 29 August 2020 (UTC)
As to Animalparty's question, if a widely-used website (like ThePeerage) or source (like old editions of Burke's Peerage) makes a claim that we can establish to be false, it is good and useful for us to record that here, precisely so that downstream readers know that this claim does exist, and may be widely repeated, but we have examined it, and established it to be false. Also, long experience tells us that, at least as far as "high-visibility" sites and sources go, if we don't include the claim and note why it is false, then sooner or later somebody will add it assuming it to be true. That doesn't mean that every nonsense on every no-mark website should be included. But any claim from a website or source that people may be likely to find and take seriously probably should be. Jheald (talk) 08:56, 29 August 2020 (UTC)
@Animalparty: For an example of a hoax entry that we should retain in deprecated form, see Sigrid of Halland (Q75437282). The entry looks plausible but Scandinavian editors advise that it is probably fictitious and there is no evidence to support it beyond sites that refer to the original fictitious account. If more reliable sources appear later, we can restore the details to normal rank. From Hill To Shore (talk) 11:42, 29 August 2020 (UTC)
User:From Hill To Shore: Thanks very much. This makes a valuable and perhaps definitive contribution to this discussion (with both the property used for such purposes and an example). DavidMCEddy (talk) 11:57, 29 August 2020 (UTC)
Sometimes I think we need a reason for deprecation that goes beyond hoax (Q190084): an item for "complete and utter bullshit". - Jmabel (talk) 16:58, 29 August 2020 (UTC)
From Hill To Shore: So you would have no problem with me adding ? -Animalparty (talk) 23:49, 4 September 2020 (UTC)
@Animalparty: So long as you make sure to properly deprecate it, as with any other statement that is poorly sourced, then go ahead. By deprecating it, you will stop a bot from adding it later as a valid statement. I am curious why you are trying to prove a point with this though. I gave a valid example earlier in the conversation so your attempt to provoke a reaction with an intentionally incorrect statement is confusing me a little. From Hill To Shore (talk) 00:05, 5 September 2020 (UTC)
It is not appropriate for Wikidata to decide what is or is not correct, only what has been said. Wikidata is a place for machines to faithfully regurgitate the output of other machines. -Animalparty (talk) 04:24, 5 September 2020 (UTC)
@Animalparty: If that is your belief then start a separate discussion to remove the functionality that allows deprecation. I'm getting the impression that you are trying to provoke me into an argument with these statements but I am not going to indulge you. If you genuinely believe what you are writing, go get a consensus to support you. From Hill To Shore (talk) 08:46, 5 September 2020 (UTC)

TP cleanup

As RAN made us realize, there was actually no consensus to upload this into Wikidata.

The question is now how to fix it. It seems that there is no agreement to include items about minors from that database and possibly anyone born since Blair. The cutoff could also be calculated as people that are children of the generation born in the 1950s. Further, people born in the 20th century that are not British should probably not have been included either. Are there other groups of people we should identify? --- Jura 04:30, 5 September 2020 (UTC)

A key issue if we are considering the deletion of entries, is that criteria have to be flexible enough to consider different scenarios. For example, if we have an item for a 10 year old child where the Peerage is the only source, then I would feel uncomfortable about retaining it. If that 10 year old child is independently notable (perhaps a movie star) then the Peerage entry could appear on that item. I am not sure why nationality is an issue here; I have come across many notable entries of non-British people that have a Peerage ID. Surely, if an entry shouldn't be here for a valid reason, the same reason would apply if the person was from any country? From Hill To Shore (talk) 08:56, 5 September 2020 (UTC)
Another issue is that such data may be included in numerous other genealogical databases, plus Burke's Peerage, plus other books.--GZWDer (talk) 09:35, 5 September 2020 (UTC)
So then, criteria for identifying a set of items to further examine: birthdate 2000-present, birthdate has no references except TP, there are no external identifiers except TP. How hard is it to write a query to find those? — Levana Taylor (talk) 13:19, 5 September 2020 (UTC)
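A rough sketch of such a query, written as a small Python helper that only builds SPARQL text for the Wikidata Query Service. It assumes P4638 is the property for The Peerage person ID (not stated in this thread, so check before use), uses P569 for date of birth, and covers only two of the three criteria: birth year 2000 or later, and no external identifier other than TP. The check on birthdate references is left out, and the function name and parameters are made up for illustration. A query of this shape may well time out on the full dataset.

```python
def build_tp_review_query(tp_property="P4638", born_since=2000, limit=100):
    """Build a SPARQL query listing TP-only items for people born recently.

    tp_property is an assumption: P4638 is taken to be the property for
    The Peerage person ID. Verify it before running the query.
    """
    return f"""
SELECT ?item ?birth WHERE {{
  ?item wdt:{tp_property} ?tpId ;
        wdt:P569 ?birth .
  FILTER(YEAR(?birth) >= {born_since})
  # Drop items that carry any external identifier other than the TP one.
  FILTER NOT EXISTS {{
    ?item ?claim ?otherId .
    ?otherProp wikibase:directClaim ?claim ;
               wikibase:propertyType wikibase:ExternalId .
    FILTER(?otherProp != wd:{tp_property})
  }}
}}
LIMIT {limit}
"""


if __name__ == "__main__":
    # Paste the printed text into https://query.wikidata.org/ to try it.
    print(build_tp_review_query(limit=50))
```

The LIMIT keeps the result small enough to review by hand; the reference criterion would need an extra pattern over the full date-of-birth statements and their references.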
  • I think the main problem with TP is that it's also a genealogical database. If we include them, there is no notability left. Obviously, if items are otherwise notable, they won't be deleted, but I don't think they have been created in the TP batch anyway. --- Jura 09:07, 6 September 2020 (UTC)
Why is Blair a relevant factor here? BrokenSegue (talk) 14:26, 5 September 2020 (UTC)
Blair or rather House of Lords Act 1999 (Q120826) changed the relevancy of TP. This also explains why non-British TP entries aren't notable per se. P27 is fairly easy to query. --- Jura 09:07, 6 September 2020 (UTC)
There is still no consensus for special treatment of living minors. If this is a concern, deletion is also probably not the only solution - anonymization is another.--GZWDer (talk) 08:48, 6 September 2020 (UTC)
Anonymisation is not an option. The objective of Wikidata is to link. Thanks, GerardM (talk) 09:24, 6 September 2020 (UTC)
So do you agree that we cannot have items with labels such as "Eldest daughter of Kobe Bryant" that link to (external) webpages mentioning them?--GZWDer (talk) 13:10, 6 September 2020 (UTC)
When a name is not mentioned and it is this easy to find one, it is not anonymisation. I do not agree to anything, what I did / do is point out that it is not an option. Thanks, GerardM (talk) 16:49, 6 September 2020 (UTC)
So this may mean we will store every name widely published in reliable sources, without any need for agreement. In Wikidata the subject of an article does not need to agree when an article is created. But given how Wikidata works, we do not require significant coverage, and having such items for children of notable people will be convenient for describing articles (e.g. newspaper articles) in which the children are mentioned.--GZWDer (talk) 20:26, 6 September 2020 (UTC)
It does not mean that. For me it is not a given that every database that is free is to be included in toto in Wikidata. With biased information like "peerage" and information that hardly serves a purpose like "German companies", we really need to be more exclusive (for the import of entire datasets). Thanks, GerardM (talk) 08:39, 13 September 2020 (UTC)

Which property to use in order to link submerged city (Q22674939) and the associated waterbody?

I'm not sure about it. located in or next to body of water (P206) or significant place (P7153) with a qualifier? Which qualifier to use? Here are a few examples:

--Stevenliuyi (talk) 03:26, 12 September 2020 (UTC)

located in or next to body of water (P206). It is a catch-all property for stuff under water, surrounded by water at the surface or close to water but actually not even in contact with it. And by 'water' I mean everything from lakes to rivers to seas. Thierry Caro (talk) 13:11, 12 September 2020 (UTC)
What about cause of destruction (P770)? --Jklamo (talk) 11:50, 13 September 2020 (UTC)

Family name and disambiguation merges

User:Materialscientist seems to be systematically merging family names with disambiguation pages. Is this the way Wikidata is meant to be used? I thought there was a purpose to having separate items. Just see their user contributions. One example I stumbled upon is Kapanen (Q21491226)/Kapanen (Q1728350). --Kissa21782 (talk) 09:12, 13 September 2020 (UTC)

No, they should definitely be kept separate. All those edits need to be reverted. The user was already informed about this by Charles Matthews last year but didn't reply. Pyfisch (talk) 09:33, 13 September 2020 (UTC)
That's right. Some items that are said to be for disambiguation pages are actually for family names, and the instance of (P31) statement can be changed in those cases: but merges are a big negative and cannot necessarily be undone quickly (because of the items linking to the family name). Charles Matthews (talk) 09:51, 13 September 2020 (UTC)
I've left a message at Materialscientist's enwiki page to inform them of this discussion and a related discussion at Wikidata:Administrators' noticeboard#Vandalism bei User:Materialscientist. A note on their enwiki page says they have turned off ping notifications on their account, so they may not have seen the message in 2019; hopefully the user will now engage in discussion. From Hill To Shore (talk) 10:41, 13 September 2020 (UTC)
I've started to revert some of their merges but we now have a secondary problem. Because of the redirect, bots have gone through and switched the link to what is now the disambiguation item instead of the family name item. See https://www.wikidata.org/wiki/Special:WhatLinksHere/Q1233159 as an example. From Hill To Shore (talk) 11:04, 13 September 2020 (UTC)

Please discuss first and stop reverting. Some wikis, e.g. de.wiki, mark pages as disambiguation even if the pages clearly say they are lists of family names (and manual checking confirms that). As a result, wikidata editors automatically separate those pages, without checking. Materialscientist (talk) 11:11, 13 September 2020 (UTC)

Well, your merge of Dobbs (Q1233159) and Dobbs (Q56245396) was definitely wrong on two counts. 1. You merged the family name item into the disambiguation item when you wanted to clear the disambiguation; doing it the other way around would have stopped all the links from having to be redirected by bot (and now reversed). 2. The Italian wiki page w:it:Dobbs is a disambiguation page that mentions a ferry and some other sort of item; this one should not have been merged. From Hill To Shore (talk) 11:22, 13 September 2020 (UTC)
First, I am not perfect. Second, the ferry was listed as "see also", which does not make it a disambig (the ferry should be moved to the article body) - it is still a family name page. Third, I was advised to use a script for merging, which I do, so either fix or disable the script. Fourth, if you manually check Q1233159, you will see that the interwikis there are not disambigs. Materialscientist (talk) 11:29, 13 September 2020 (UTC)
I have checked. The Ukrainian wiki is also not a family name page as it includes a link to Richard Dobbs Spaight (Q878563) with Dobbs as a given name and not a surname. The standard script for merging allows you to reverse the merge target with a click of the button. However, rather than using the merge script, I would advise you to use the "Move" gadget (in your preferences) to move individual links to the correct page. If you are able to read the language and confirm that it is linked to the wrong page then move that link. If you are unable to read the language, leave it for another editor to make the move. If all the links have gone, we can then decide how to handle the empty disambiguation page. From Hill To Shore (talk) 11:36, 13 September 2020 (UTC)

Just to add one piece of information: Yes, de.wiki has a rather strict distinction between family name and disambig pages. That’s not some sort of faulty behavior as suggested here but rather a choice that makes a lot of sense. --Emu (talk) 11:43, 13 September 2020 (UTC)

I guess Dobbs in Richard Dobbs Spaight is a middle name or surname, not a given name (I would prefer professional opinion though). Materialscientist (talk) 11:48, 13 September 2020 (UTC)
  • Even if one or the other sitelinks are on the wrong item, this doesn't mean that items should be merged. Sitelinks can be moved between items.
I don't think it's a reason to move sitelinks if one has a different view of what a Wikipedia language version ought to do with their articles or pages.
Merging items for different types breaks internal and external links and generally leads to wrong descriptions in countless languages. This is clearly destructive.
For wikis that want sitelinks to any type of page, Wikidata offers a Lua module. --- Jura 11:56, 13 September 2020 (UTC)

Category:Ministers of a country

I have added information on what is in specific categories of ministers of a country, e.g. Health Ministers of Cameroon. This category is included in a category Government ministers of Cameroon. Does it make sense to define it like I did? Is it possible to query it? A query could find people who are a minister but not in a subcategory, or it could find specific types of ministers (a subcategory that is not defined as that specific kind of minister for that country). Any suggestions or comments? Thanks, GerardM (talk) 11:40, 13 September 2020 (UTC)

COVID cases

I've noticed that ranking (P1352) is used in pandemic in France (Q83873593) (I don't know if it is used somewhere else) to rank daily updated count of cases, but IMO series ordinal (P1545) would be more suitable there. — Draceane talkcontrib. 12:55, 10 September 2020 (UTC)

Also COVID-19 pandemic in Norway (Q86886544) is/was using ranking (P1352). Pmt (talk) 05:29, 11 September 2020 (UTC)
@Pmt: Do you have any idea how to fix it? — Draceane talkcontrib. 16:24, 13 September 2020 (UTC)

Items for specific YouTube video

See history of Q96362432. Should we instead create items for each YouTube video describing specific people?--GZWDer (talk) 20:25, 11 September 2020 (UTC)

That said, YouTube video ID (P1651) seems close to video (P10), so I don't see why one couldn't use it in this case.
Neither property should be used to add dozens of videos to a single item,
nor add videos of type "item [] explained by John Doe" to dozens of items. --- Jura 20:34, 11 September 2020 (UTC)
Corrected my comment above. --- Jura 07:35, 12 September 2020 (UTC)
  • Why is that clear? It's not at all clear to many editors who use this property in this way. Gamaliel (talk)
That doesn't address the issue of YouTube video ID (P1651) being used incorrectly. Gamaliel (talk) 22:40, 13 September 2020 (UTC)
If items for specific YouTube videos are allowed, you should create items for them and link the person item via described by source (P1343).--GZWDer (talk) 23:03, 13 September 2020 (UTC)
Again, that has nothing to do with category YouTube video ID (P1651) being used incorrectly. Gamaliel (talk) 23:24, 13 September 2020 (UTC)

duplicate any Japanese elections

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Eien20 (talk) 15:49, 19 September 2020 (UTC)

I want to merge those items. --Eien20 (talk) 00:29, 19 September 2020 (UTC)

There is information about how to merge items and where to get more help, if needed, at Help:Merge. ~ The Erinaceous One 🦔 05:45, 19 September 2020 (UTC)
@Eien20: These are not the same things, and should not be merged. In each case only one is the election; the other is the legislative term of the House of Representatives that then follows that election. --Oravrattas (talk) 09:22, 19 September 2020 (UTC)
Thank you for telling me. I almost used it by mistake. --Eien20 (talk) 15:49, 19 September 2020 (UTC)

Citing from Ancestry.com

Now that wiki-projects have been allowed access to Ancestry, we ought to agree on a standard way of citing information found in their scanned historical documents. I don't have an answer yet, just throwing it out there. Also I'm not clear on whether the terms of use for the wiki users allow us to download and share pages (like for example this one which was stored in order to be cited by Wikitree). If so, then there ought to be a specialized property for the link. — Levana Taylor (talk) 15:32, 5 September 2020 (UTC)

We might want to choose a path that’s also suitable for others like Matricula (for Austria). It won’t be easy and I don’t have any good solutions yet. --Emu (talk) 21:13, 5 September 2020 (UTC)
Is there a tutorial on how to make the documents shareable, or does the standard url to the document allow you access without an account? --RAN (talk) 21:59, 11 September 2020 (UTC)
I've checked a bit more about document sharing on Ancestry, and here's what I see. When you're logged in, you open the "toolbox" on any document image and go to Share > Email, where you can create a public link and e-mail it to yourself or anyone. Anyone can use that link without an account. I think that it will even remain available if your account is closed, but I haven't found anywhere in their help pages that says so explicitly. The url format is like this: https://www.ancestry.com/sharing/21488143?h=85516f (you get sent a link url with a whole lot of other stuff on it but this part is all that's necessary). You can also download the page image as a jpg. — Levana Taylor (talk) 09:49, 14 September 2020 (UTC)
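As an illustration of the "this part is all that's necessary" point, here is a minimal Python sketch that trims an e-mailed share link down to the form shown above: the /sharing/ path plus the h parameter. The function name is made up for this example, the extra parameters in the sample link are hypothetical stand-ins for the "whole lot of other stuff", and whether Ancestry keeps accepting the trimmed link is only what the comment above reports, not something the code checks.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def minimal_share_link(url):
    """Keep only the /sharing/<id> path and the h parameter of a share link."""
    parts = urlsplit(url)
    h_values = parse_qs(parts.query).get("h")
    if not parts.path.startswith("/sharing/") or not h_values:
        raise ValueError("not a share link of the expected form: " + url)
    # Rebuild the URL without any other query parameters or fragment.
    query = urlencode({"h": h_values[0]})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    # "extra" and "src" are invented parameters, used only for the demo.
    long_link = "https://www.ancestry.com/sharing/21488143?h=85516f&extra=1&src=mail"
    print(minimal_share_link(long_link))
```

A trimmed link in this form could then go into the reference URL of a citation, alongside the collection named in the reference.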
For what it's worth, I've been using stated in (P248):Ancestry (Q26878196) and section, verse, paragraph, or clause (P958) for the specific collection in Ancestry (e.g. "Oregon, Death Index, 1898-2008"). Ideally a document level instead of a collection level citation should be used, but I couldn't figure out how to do that. Gamaliel (talk) 12:19, 14 September 2020 (UTC)
We do have items for all of the specific collections on FamilySearch; so then the document citation is the collection and the ARK location of the record. In Ancestry, there are 32,825 collections (list). Potentially same citation format, you just have to create the document link instead of finding it. — Levana Taylor (talk) 14:51, 14 September 2020 (UTC)

Queries with Cirrus Search

Hello,

I have been thinking about queries over the last few weeks. [9]. From my point of view this is the most efficient way to do that. I found a query that counts the number of people in Wikidata, and this query [10] times out. Is it possible to query Wikidata with CirrusSearch, and what happens to the results I don't see on the first page? Are they also stored when I get the first page of results, or are they queried only when I ask for the next results? In the cases where I tried CirrusSearch I got the number of results quickly, and now I want to understand why it is faster than the query service, and whether it would still be faster if there were a direct way to export the results.--Hogü-456 (talk) 21:49, 11 September 2020 (UTC)

Here are some answers:
  1. SELECT (COUNT(?item) AS ?count){?item wdt:P31 wd:Q13442814} times out, but SELECT (COUNT(*) AS ?count){?item wdt:P31 wd:Q13442814} returns the result immediately. The first query materializes all the values of ?item and counts each one that is bound (not NULL, in SQL terms). The second query just returns the precalculated number of triples. An ideal SPARQL engine could deduce that all selected variables in the first query are bound, but no real engine is ideal.
  2. CirrusSearch will be progressively slower for each next page and will eventually time out after 20 seconds. How soon it times out depends on query complexity: regex queries might time out on the first page, while some simple queries may never time out.
  3. CirrusSearch coexists with WDQS: mw:Help:CirrusSearch#Deepcategory uses the SPARQL service, and mw:Wikidata Query Service/User Manual/MWAPI allows using CirrusSearch in WDQS. In cases where you actually need materialization and the query times out, you can use MWAPI, like here: Wikidata:SPARQL query service/query optimization#A query that has difficulties. Note that MWAPI is limited to 10000 rows. --Lockal (talk) 08:09, 14 September 2020 (UTC)
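To make the first point concrete, here is a small Python sketch that builds both forms of the count query and the corresponding request URL for the public endpoint at https://query.wikidata.org/sparql. The helper names are invented for this example, and nothing is sent over the network; the URL can be opened in a browser or passed to any HTTP client.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def count_query(class_qid, materialize=False):
    """Return a SPARQL count of items that are an instance of class_qid.

    materialize=True gives the COUNT(?item) form, which was reported to
    time out above; the default COUNT(*) form was reported to return at once.
    """
    counted = "?item" if materialize else "*"
    return f"SELECT (COUNT({counted}) AS ?count) {{ ?item wdt:P31 wd:{class_qid} }}"


def request_url(query):
    """Build a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    fast = count_query("Q13442814")
    slow = count_query("Q13442814", materialize=True)
    print(fast)
    print(slow)
    print(request_url(fast))
```

The two queries differ only in what is counted, which is why the COUNT(*) form is the one to reach for when no variable actually has to be materialized.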

Wikidata weekly summary #433

Temporary and acting

I have come across a problem with modelling the military ranks of Bertram Francis Eardley Keeling (Q96084085). He was a British engineer who joined the army at the outbreak of the First World War and was given a temporary rank. During the war he was given promotions to his temporary ranks. Overlapping his temporary ranks, he was also given acting ranks. For example, in April 1917 he was a temporary Captain but was given an acting rank of Major.[11] In February 1918 he was promoted to temporary Major, with the implication that his acting rank ceased.[12] I am not sure how to model this in the system; so far I have used acting (Q4676846) for these transitory ranks on other human items but here we have a case of someone holding two different transitory ranks at the same time. Any suggestions on how to model this? I am guessing that I will need to have separate items for "acting" and "temporary" but I'm not sure how to express them both as distinct concepts that won't cause confusion elsewhere. Perhaps I should start with a focused concept and create a new item with label "Temporary rank" and description "status of a rank in the British armed forces" with a said to be the same as (P460) to acting (Q4676846). Future editors can then expand the concept of that item, if they feel it is appropriate. From Hill To Shore (talk) 21:39, 14 September 2020 (UTC)

Possibly rewriting Wikidata:Notability?

Hi all. It might be worth thinking about rewriting Wikidata:Notability - from my experience here, our treatment of Commons is a bit odd, as is how Wikidata:Requests for deletions works in practice. In general it's weird that the guideline focuses on sitelinks rather than concepts/items. I've started a sandbox at Wikidata:Notability/sandbox - input/edits would be welcome. Thanks. Mike Peel (talk) 22:58, 5 September 2020 (UTC)

I think we should discuss Wikidata:Verifiability first as notability relies on that.--GZWDer (talk) 01:27, 6 September 2020 (UTC)
You are mixing several issues here which should not be covered in a single policy:
A policy that tries to conflate all these issues in one place is crap and very difficult to work with. Do not make the mistake and take WD:N as the only important policy to govern admissibility here. There are several policies to consider, in fact.
The problem with Commons is that their content which we treat here (mainly Category pages) is auxiliary content at Commons and usually not subject to any identifiability or verifiability requirements, unlike practically all other Wikimedia projects. There is plenty of dubious content at Commons, and we are consequently a little more cautious with their categories than we are with content from other projects.
MisterSynergy (talk) 08:03, 6 September 2020 (UTC)
  • Also (probably BTW), even if sensitive data about minors is a concern, deletion is not the only solution. Anonymizing them is another, which means removing sensitive information and, if the name should not be included, replacing it with a general one like "Son of John Smith and Jane Doe".--GZWDer (talk) 08:41, 6 September 2020 (UTC)
The biggest issue that I have with the notability is that the order in which the three requirements are is wrong. First should be does it fulfill a structural need. After that the other two arguments, the most hairy ones, are moot.
In Commons we can search in any language using Special:MediaSearch. It relies on Wikidata and, as a consequence, every use of a Wikidata item in Commons fulfills a structural need. When Commons for its own reasons decides to remove media files, it may result in a lack of notability. That is however not for us to consider. Thanks, GerardM (talk) 08:28, 6 September 2020 (UTC)
Also be careful not to freeze too much in stone, I'm afraid that this will attract people who could be described as temple guardians and who would spend too much time on the community pages and do not contribute anything in term of contents in the elements. I also think that the idea is not excellent to create rules whereas it would be for example a question of discussing before importing a database. Jérémy-Günther-Heinz Jähnick (talk) 08:32, 6 September 2020 (UTC)
Hello Jérémy-Günther-Heinz Jähnick. I am curious about your concept of "temple guardians": do you feel they are necessary? Should they be recognised as such and held accountable? What is the right balance between community interactions and content contributions? So far I have only made "community interactions" because I do not have the "tools" that I need to do my work (properties, structure, language code...). Does that mean that I am a candidate for temple guardian?--MathTexLearner (talk) 13:24, 6 September 2020 (UTC)
At this date, you are just a new user, and we can't judge on a few contributions. Jérémy-Günther-Heinz Jähnick (talk) 16:22, 6 September 2020 (UTC)
Who will judge me when I have more contributions? Are there temple guardians over there that oversee membership to the Wikidata Order?--MathTexLearner (talk) 17:34, 6 September 2020 (UTC)
@Jérémy-Günther-Heinz Jähnick: There's always a balance between letting people do whatever they want and coming to consensus. We are at a point where the number of edits that we have holds some people back from contributing, because Wikidata can't handle as many edits as people want to make. Given that the number of edits that Wikidata can handle is a scarce resource, enforcing rules for bigger uploads does become more important. The bot-approval process was designed in the beginning to handle large database uploads and currently often gets circumvented by people using QuickStatements.
A situation where we don't have an import of German companies because the person who wanted to upload actually sought consensus, while we do have big uploads from people who don't seek consensus, is unfair and not desirable. If we revise policy then it makes sense to revise it in a way where people are not punished for seeking consensus and rewarded for circumventing it.
Over the last year, Wikidata did well at growing overall editorship but we didn't do well in community consensus building and the quality and consistency that comes from good consensus finding. ChristianKl 18:16, 6 September 2020 (UTC)
These arguments are not about notability, they are about fear. Yes, we do not want nor need all German companies or companies of any other country but we do want a mechanism whereby the data can be linked to Wikidata. I am sure that German companies like Dutch companies are known by a number issued by a registry (in the NL the KVK number), for those companies that are notable at a Wikipedia level, we want those numbers. We have had big uploads without consensus? Export them to an instance of Wikibase and have links to Wikidata where on the Wikipedia level we want this link.
We have a really large amount of data about science, scientists and scientific publications. This data is becoming increasingly irrelevant because tools to maintain them became unavailable. The argument was that because of too many duplicate ORCID IDs we had an unmaintainable situation. Except, the absence of the tool did not diminish the problem, and with a tool, these duplicates were merged. There was silence when it was requested to reinstate the tool. So we have a situation where Wikidata's data is increasingly problematic because of a lack of consensus. Yes, we will get more data but it will also help establish relevance in notability issues at Wikipedia.
As to this genealogy thing (Burkes), at the time it was pointed out that it is an extremely biased publication. It is lily white and irrelevant outside of England. The use of the data makes only sense when you intend to link notable people that are inside Wikipedia. A link to Burkes establishes such credentials and that is all the use I can think of.
So yes, when a large collection is imported that is not notable, bin it. That is not "temple guardianship", it is weeding out the bad stuff. What would help is when we are more explicit in what data we do want. Mind you, everything that is equivalent to what we have for the first world is notable when it is from the second or third world. Thanks, GerardM (talk) 06:03, 7 September 2020 (UTC)
According to Mike Peel's draft all German companies (which do have official IDs) would be notable. The government register for companies is a reliable source. ChristianKl 13:54, 7 September 2020 (UTC)
This is the same for many other countries, and also for the chairmen of those companies (strictly speaking, many items deleted as "spam" or "promotional" would be notable by this criterion). I once proposed two properties - this is a fairly reliable database describing more than 300 million companies and more than 200 million people, but it is under a paywall.--GZWDer (talk) 16:57, 7 September 2020 (UTC)
The question should not be whether it is notable and whether we can duplicate it; the question should be whether we want it and whether it scales. Germany is only one country; do you want all businesses of all countries all the time? What is the point, what do we achieve, what does it cost and is it worth it? Who is going to maintain the companies for countries that currently have no interest, like Senegal, Angola et al.? Thanks, GerardM (talk) 06:29, 8 September 2020 (UTC)
@GerardM: Our notability policy is the policy for what items we keep and not delete. If you want to delete certain items you need to word the policy in a way to allow deleting those items. ChristianKl 12:13, 9 September 2020 (UTC)
My point is that something can be notable in its own right but not as part of an import duplicating what we can find elsewhere. What possible added value is given to German companies when we can link to the authoritative source for companies we include for other reasons? When German companies are notable, so are Chinese companies. Ask yourself, what is the point. Thanks, GerardM (talk) 12:45, 9 September 2020 (UTC)
@GerardM: if we want to have a policy according to which "something can be notable in its own right but not as part of an import duplicating what we can find elsewhere" we need to decide what such a policy looks like. One way of doing that is to require consensus decisions for larger data-uploads. What do you think a policy that enforces such a criterion should look like? ChristianKl 10:04, 10 September 2020 (UTC)
  • We are talking about the import of databases. The first requirement is that, for the topic of the data, there is already a subset in Wikidata. There is a property that links to the database whose import is requested. The data serves a purpose; there must be an application for the data. The notion that "it is free, so we can have it" does not cut it. When the data is biased to one country or area, it needs to be considered what happens when it is scaled up to a global level. How will the database be maintained, at first and at scale? Will we have active cooperation with the organisation that provides us with the data?
  • A positive example
We link to a subset of the content of OpenLibrary. It wants people to read books, we want to share in the sum of all knowledge. Our aim is achieved when we enable more people to read OL books. Thanks to links to VIAF and LoC, we have the means to easily link much more than we do. We have cooperated with OL and it reciprocates by storing our identifier in its database. Our and their data has been synchronised in the past.
  • A positive example
We link to a lot of scholarly publications. ORCID is a database where we can discover many publications and their associated authors. ORCID is interested in working with us. What they can offer, what we can offer is for a scientist to start a process and update/import his or her publications and associated co-authors. This continuous process will allow us to discover modern sources to the subjects written in Wikipedia. Thanks, GerardM (talk) 18:42, 10 September 2020 (UTC)
  • Yes, a list of all German companies could possibly fit in Wikidata under the current rules, but it is not a priority. I call this the telephone book conundrum. We are looking for information dense data sets. A list of companies and a tax ID number is information sparse and not very useful, just as a telephone directory is sparse, with just two data points. It also takes us a year of work to merge entries from large databases like The Peerage, so it is always best to be slow and get them done properly before we move to the next large project. --RAN (talk) 02:51, 11 September 2020 (UTC)
  • I think a full list of companies, with any significant information like directors and ownership, would be very hard to keep up-to-date. A single large company group can have dozens or hundreds of subsidiary companies, constantly changing. Ghouston (talk) 22:19, 11 September 2020 (UTC)
  • We are not looking for data dense sets of data. There is no point in being all inclusive. What we should be looking for is data that serves a purpose, data that scales to the whole world. When an external database (data rich or not) has identifiers, we should always link. Maybe adopt when the data is no longer maintained but only adopt when it serves a purpose, our purpose. Thanks, GerardM (talk) 05:27, 14 September 2020 (UTC)
  • Is ORCID really a positive example? Granted, the author items work within the confinement of the “papers portion” of Wikidata – but reusing them in other contexts or simply avoiding duplicates is a nightmare not unlike The Peerage. --Emu (talk) 08:15, 14 September 2020 (UTC)
I think you do not get that the data from ORCID is and has been the basis of what and how we know about scientific developments in Wikidata. Tooling like Scholia exposes not only the individual's position (papers and co-authors) but does so for awards, organisations, topics.. Your problem is not with the data from ORCID, it is with the substandard way we import data into Wikidata. We do not maintain integrity; technically we have a big issue there. This is about the value of the data; check. This is about cooperation with other orgs; check. Name me one other example that has this kind of impact on our mission, sharing the sum of all knowledge. Thanks, GerardM (talk) 05:39, 15 September 2020 (UTC)

category description merge conflicts

What's the best way to deal with the subj?

I tried to use Special:MergeItems to merge Category:Gases (Q9713877) into Category:Gases (Q7215014) and was not able to do so, because both contain some generic descriptions ("Wikimedia category" etc) that for some reason differ for a couple of languages (e.g. zh: "维基媒体项目分类" in Category:Gases (Q9713877) vs "维基媒体分类" in Category:Gases (Q7215014)). I tried to clear conflicting descriptions in Category:Gases (Q9713877) first, but quickly abandoned that idea for a number of reasons:

  • There are multiple collisions that need to be resolved, and Special:MergeItems is only showing one. Sure, I can write some clever UNIX shell one-liner to find them all in an automated fashion¹, but not every future Wikidata user would be comfortable with solving this problem by copy-pasting some strange text from the Internet into their terminal emulators.
  • I simply have no idea what description should be preferred for each language. What if I should clear some descriptions in Category:Gases (Q7215014) instead? Or discard both and put in something else?
    • And why would I even want to spend my time on cherry-picking automatically inserted descriptions?
  • And even if it's OK to just clear out all descriptions from a single page, it's still a lot of (keyboard, mouse) button presses to do.
    • Also, rate limits. After clearing just all zh-* descriptions using regular interface² I got, after some delay, a message about abuse and waiting and whatnot. Amused, I went to the history and found a lot of individual change entries, one for each language. (Sure, not everyone does these kinds of changes without bots; still surprising.)

In this particular case I'm tempted to follow manual merge instructions and discard descriptions for Category:Gases (Q9713877) altogether. This, however, is a pretty simple case (just 1 label and 1 WP link to move), so I think it's still worth documenting these obstacles here.

¹ in fact, here is one:

$ q() { curl -s "https://www.wikidata.org/wiki/Special:EntityData/$1.json" | jq -S '.entities[].descriptions'; };  diff -u <(q Q7215014) <(q Q9713877)
--- /dev/fd/63	2020-09-15 04:17:40.312486792 +0300
+++ /dev/fd/62	2020-09-15 04:17:40.312486792 +0300
@@ -37,7 +37,7 @@
   },
   "be-tarask": {
     "language": "be-tarask",
-    "value": "катэгорыя ў праекце Вікімэдыя"
+    "value": "катэгорыя Вікімэдыя"
   },
   "bg": {
     "language": "bg",
@@ -61,7 +61,7 @@
   },
   "bs": {
     "language": "bs",
-    "value": "Kategorija Wikipedije"
+    "value": "kategorija na Wikimediji"
   },
   "bug": {
     "language": "bug",
@@ -161,7 +161,7 @@
   },
   "gl": {
     "language": "gl",
-    "value": "categoría de Wikipedia"
+    "value": "categoría de Wikimedia"
   },
   "gn": {
     "language": "gn",
(…etc etc)

² That change has since been reverted

--46.151.157.21 02:06, 15 September 2020 (UTC)
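
For readers who would rather not paste a shell one-liner, the same comparison can be sketched in Python. This is only an illustration, not an existing tool: the function names `fetch_descriptions` and `description_conflicts` are made up here, the `bs` and `gl` sample values come from the diff above, and the identical `en` entry is an assumed value added so that a non-conflicting language appears in the demonstration.

```python
import json
from urllib.request import urlopen


def fetch_descriptions(qid):
    """Download an item's descriptions as {language: text} from Special:EntityData."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    with urlopen(url) as response:
        entities = json.load(response)["entities"]
    # The response holds a single entity; take it regardless of its key.
    item = next(iter(entities.values()))
    return {lang: d["value"] for lang, d in item.get("descriptions", {}).items()}


def description_conflicts(a, b):
    """Return {language: (text_a, text_b)} where both items have differing text."""
    return {
        lang: (a[lang], b[lang])
        for lang in sorted(a.keys() & b.keys())
        if a[lang] != b[lang]
    }


# Offline demonstration; "en" is an assumed, identical value on both sides.
first = {
    "bs": "Kategorija Wikipedije",
    "gl": "categoría de Wikipedia",
    "en": "Wikimedia category",
}
second = {
    "bs": "kategorija na Wikimediji",
    "gl": "categoría de Wikimedia",
    "en": "Wikimedia category",
}
conflicts = description_conflicts(first, second)
for lang, (left, right) in conflicts.items():
    print(f"{lang}: {left!r} vs {right!r}")
```

To run it against the live items, one would call `description_conflicts(fetch_descriptions("Q7215014"), fetch_descriptions("Q9713877"))`; that call needs network access and is deliberately left out of the demonstration.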

  • Merging should be done with the merge gadget (see Help:Merge). --- Jura 08:49, 15 September 2020 (UTC)
    • Well, it did the job, although I'm not quite happy about it. There's no (obvious) way to activate it without logging in (I don't visit Wikidata frequently enough to not be bothered about doing so), and its interface is confusing (which way would it merge if the "always merge into the older entity" checkbox is deselected? Unlike the gadget, Special:MergeItems is clear about the direction). Also, did it just disregard all description conflicts? If so, is it some special case for instance of (P31) Wikimedia category (Q4167836) or something like that? If not, that's quite a dangerous instrument, and I'd rather avoid it in the future. --MetaWat (talk) 16:35, 15 September 2020 (UTC)
      • There is some discussion with developers about what should be done automatically and what shouldn't. The result is that the gadget now does what I personally would have expected from the special pages.
In any case, I think at some point a group of users want to block people who aren't logged in and/or autoconfirmed from merging items. --- Jura 16:41, 15 September 2020 (UTC)
It seems to me that logging in is less bother than performing a merge manually, writing shell scripts, etc. Ghouston (talk) 02:08, 16 September 2020 (UTC)

Tool to identify Wikipedia articles without interlanguage links

Hello, I would be interested to see a list, eg, of all the articles in English Wikipedia Category:Japan (or its eg Portuguese Wikipedia equivalent) and its subcategories (following the English Wikipedia Category:Contents>Articles>Main topic classifications>World>Countries>Countries by continent>Countries in Asia>Japan etc structure), all such articles, that lack an interlanguage link, supported via Wikidata, to an article on eg Japanese Wikipedia. This would make the task of identifying (and addressing) missing interlanguage links much easier. Many thanks, Maculosae tegmine lyncis (talk) 08:54, 13 September 2020 (UTC)

You can query for items with country (P17) Japan (Q17), an article on English Wikipedia but none on Japanese Wikipedia. Or you can use petscan to find Japanese people from English Wikipedia without an article on Japanese Wikipedia. The category w:Category:Japan contains too many pages, so it is likely easier to only query subcategories and find interlanguage links for these articles. --Pyfisch (talk) 09:54, 13 September 2020 (UTC)
I believe this type of query is possible with PetScan. Charles Matthews (talk) 09:53, 13 September 2020 (UTC)
For smaller Wikipedia categories you can make a SPARQL query to find all articles in the category and subcategories using the MWAPI service and deepcategory search, and then remove articles with language links to a certain language from the results. However, w:Category:Japan has too many subcategories (more than 256) to make a deep category search. --Dipsacus fullonum (talk) 11:00, 16 September 2020 (UTC)
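
The first suggestion in this thread (items with country Japan that have an English article but no Japanese one) could be written roughly as the query built below. This is a sketch under assumptions: the helper name `missing_sitelink_query` and the `LIMIT` default are invented for illustration, and the query uses the `schema:about` / `schema:isPartOf` sitelink pattern of the Wikidata Query Service. It covers only the country (P17) approach, not the deep category search.

```python
def missing_sitelink_query(country_qid, have_wiki, lack_wiki, limit=100):
    """Build a SPARQL query for items of a country that have an article on
    one Wikipedia language edition but none on another."""
    site = "https://{}.wikipedia.org/"
    return f"""SELECT ?item ?article WHERE {{
  ?item wdt:P17 wd:{country_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <{site.format(have_wiki)}> .
  FILTER NOT EXISTS {{
    ?other schema:about ?item ;
           schema:isPartOf <{site.format(lack_wiki)}> .
  }}
}}
LIMIT {limit}"""


# Japan (Q17): article on English Wikipedia, none on Japanese Wikipedia.
query = missing_sitelink_query("Q17", "en", "ja")
print(query)
```

The printed text can be pasted into the query service; without a tighter filter than the country alone it may well time out, which is one reason to keep the `LIMIT`.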

One abuse filter

Please, move Abuse filter №123 to special user right for group Users which is necessary to edit User name space (see also: Special:ListGroupRights#Namespace restrictions). 217.117.125.72 18:08, 13 September 2020 (UTC)

Merge request

Marge doesn't work for me, so could somebody merge Allium siculum (Q2291021) and Allium siculum (Q12292911)? Thanks, Abductive (talk) 03:05, 15 September 2020 (UTC)

  Done @Abductive: merging done without problems--Estopedist1 (talk) 05:16, 15 September 2020 (UTC)
Thanks. Did I type "Marge"? Yeah, she doesn't work for me... Abductive (talk) 05:19, 15 September 2020 (UTC)
@Estopedist1, Abductive: This merge was apparently in error. Allium siculum (Q12292911) was originally Nectaroscordum siculum bulgaricum before an IP changed the English label to Allium siculum. On Wikidata, we generally have separate items for every synonym of a taxon, as those multiple names generally have unique external identifiers, as well as distinct histories, and may be considered valid by different authorities. See GBIF values for Nectaroscordum siculum subsp. bulgaricum vs. Allium siculum subsp. dioscoridis vs. Allium dioscoridis. Properties like taxon synonym (P1420) and basionym (P566) can connect various names. In truth it's quite a mess, with some thinking the items for names should only refer to the name alone, while biological traits like distribution, mass, litter size, etc. should be split into new "organismal" items, regardless of how many names it has, for ontological purity (but utter chaos for mere mortals). See a previous discussion, What heart rate does your name have?. -Animalparty (talk) 18:31, 15 September 2020 (UTC)
@Abductive, Animalparty: strange that your pinging doesn't work. To the topic: yes, maybe it should be reverted. The principle that "every synonym means a distinct Wikidata entry" is clear, but unfortunately I guess chaos would be the result. I can't even imagine how many synonyms + combinations for taxons there could be. Maybe over 1,000,000,000. Some taxons have over 30 synonyms. I guess the rational solution is one taxon with its numerous synonyms + combinations together (like Wikispecies already does). But I am not sure ... --Estopedist1 (talk) 05:33, 16 September 2020 (UTC)
What matters is that all the wikis are on one or the other. Abductive (talk) 05:35, 16 September 2020 (UTC)

Application of television show media content ratings on WD

Let's say that a television show (whether a single season or the entire series) has been released on DVD and said DVD release has received a 'Mature' rating from a governmental media content rating organization.

Should the rating of said DVD release apply to:

  • The items of the individual episodes included on the DVD
  • The items of the season(s) included on the DVD release
  • The item of the television show
  • All of the above

When an episode/season has received a rating, it's usually understood as applying to the television show as well. If the rating only applies to specific seasons or episodes then this can be specified with applies to part (P518) and excluding (P1011). Hence why I prefer the fourth option. --Trade (talk) 08:23, 7 September 2020 (UTC)

I have several DVD sets in my collection where each disk has its own age rating; the collection then takes its age rating from the strictest rating of those disks. To qualify for a particular age rating, producers may edit the original show to remove certain scenes or they may include bonus scenes not shown in the original broadcast. Because of this, we probably need to apply age ratings to a particular version, edition or translation (Q3331189) that represents the DVD set rather than to an item about the show itself. It is also worth noting that several different editions of a DVD will be released at the same time for different countries, each with their own edits and rating (the DVD set released in Australia may have slightly different content to that released in the UK and USA). I think it will be a pain to map this in Wikidata though. From Hill To Shore (talk) 09:43, 7 September 2020 (UTC)
What are the chances of two episodes on the same disc having different ratings? @From Hill To Shore: --Trade (talk) 08:38, 14 September 2020 (UTC)
  • Interesting point. My first pick would probably have been season, but maybe show is sufficient (both with appropriate qualifiers). @Máté: has probably a more qualified view on this. --- Jura 08:44, 14 September 2020 (UTC)

It really depends on the system. Most countries rate either by the episode or by the release, which usually contains one season. Now, these seem quite straightforward: if they rate by the episode, let's add the rating on the item of the episode, but if they rate by the season, let's add it to the item of the season (or to the series if it only consists of one season). In the former case, usually the strictest rating will appear on the DVD, while in the latter it is uniform, so no trouble. The issue arises when they release DVDs that only contain, say, four episodes that do not make up an entire season (i.e. the season is released in several parts). The systems that rate by release will have a rating for, say, season 1 episodes 8-12. We don't have an item for that. I would discourage adding the rating to each episode (conflicting certificate IDs and statements that are not strictly true). I personally don't think those ratings should be included in Wikidata since the subject of the statement is just not notable. Alternatively, we could add these to the smallest unit that includes all concerned episodes, and use P518. – Máté (talk) 12:18, 14 September 2020 (UTC)

Q61642210#P8326 lists the name of every episode in the first (and only) season as an 'alternate title'. Would it be reasonable to assume that the rating applies to all episodes on the DVD release? @Máté: --Trade (talk) 07:44, 15 September 2020 (UTC)
In this case the rating applies to the series. – Máté (talk) 18:13, 15 September 2020 (UTC)
And the first (and only) season I presume?--Trade (talk) 12:56, 16 September 2020 (UTC)
As there is only one season, they are the same entity, and we won't have separate items for them. Máté (talk) 16:48, 16 September 2020 (UTC)
I have yet to see any database that rates per episode. Which ones are you referring to?--Trade (talk) 16:54, 16 September 2020 (UTC)
E.g. NMHH in my native Hungary does that but I've seen it in a few other markets as well. Máté (talk) 17:11, 16 September 2020 (UTC)

YouTube Category ID

I noticed that we do not currently have a way to map Wikidata Entities to a YouTube Category ID. YouTube has expanded its Data API and now provides query by content type such as a Video Category[13].

It would be nice to hold mappings of the YouTube categories to Wikidata entities so that applications can be built more easily without having to reconcile (i.e. we apply human logic, reconcile the categories to Wikidata entities and store the mapping inside Wikidata). This is similar to some of the Schema.org work and other relational mapping projects we have done in the past. Mappings can be N:1 (many to one), judging from the titles that I see from the API.

Some example mappings that could be done with this new proposed property:

Wikidata entities: music (Q638) --> YouTube Category ID: 10 (title: music)

Wikidata entities: sport (Q349) --> YouTube Category ID: 17 (title: sports)

Wikidata entities: film (Q11424) animation (Q11425) --> YouTube Category ID: 1 (title: Film & Animation)

Wikidata entities: motor car (Q1420) vehicle (Q42889) --> YouTube Category ID: 2 (title: Autos & Vehicles)

Wikidata entities: science and technology (Q34104) --> YouTube Category ID: 28 (title: Science & Technology)

For example, pet (Q39201) and animal (Q729) would both have this proposed new property YouTube Category ID: 15

   {
     "kind": "youtube#videoCategory",
     "etag": "ra8H7xyAfmE2FewsDabE3TUSq10",
     "id": "15",
     "snippet": {
       "title": "Pets & Animals",
       "assignable": true,
       "channelId": "UCBR8-60-B28hp2BmDPdntcQ"
     }
   },


There is support for regional language as well. This is the same ID 15 but returned in ES - Spanish

   {
     "kind": "youtube#videoCategory",
     "etag": "c2Mmk_FJb3mloyX5XIxQpJ4QFT0",
     "id": "15",
     "snippet": {
       "title": "Mascotas y animales",
       "assignable": true,
       "channelId": "UCBR8-60-B28hp2BmDPdntcQ"
     }
   },

And in DE - German

   {
     "kind": "youtube#videoCategory",
     "etag": "CMisNBqXGAfiHedBBZqtmUssOjc",
     "id": "15",
     "snippet": {
       "title": "Tiere",
       "assignable": true,
       "channelId": "UCBR8-60-B28hp2BmDPdntcQ"
     }
   },

Thoughts? --Thadguidry (talk) 16:16, 13 September 2020 (UTC)

not entirely sure what you're proposing. but a bot fetching all of this and storing it (as a string or mapped to a property) seems useful to me. are you suggesting we make an item for each of their codes? BrokenSegue (talk) 19:03, 13 September 2020 (UTC)
No, we have the items already. We need a single new property to map with the identifier code (like the 3 other YouTube properties we already have), I'd propose "YouTube Category ID". --Thadguidry (talk) 22:39, 13 September 2020 (UTC)
I'm not sure I like that implementation because (if I'm understanding you right) it won't make a link between the youtube channel and the concepts. Instead both will have a property mapping themselves to an integer/string. If instead we made properties for the various youtube categories we could link the channel to the category items and the category to the underlying concepts. We could maybe re-use has characteristic (P1552) for this but optimally we'd make a new property also. Thoughts? Do we do things like this elsewhere? I'm considering importing "music genre" information from a source and there might need to be a similar thing. BrokenSegue (talk) 23:11, 13 September 2020 (UTC)
You are misunderstanding the need here. It's not to make formatter links, but simply hold identifiers for programmatic purposes by others in the community. On YouTube's side, the Data API already has those links of channel ids to their category id. That is part of YouTube's existing internal process and channel content creators can create new categories. You might want to look at the links I provided above for more details. On Wikidata's side, our need is simply to map the various YouTube Category IDs to their semantic Wikidata Entity equivalents, so that application developers both in Wikidata community and elsewhere will not have to hold their own reconciled mapping of QIDs <--> YouTube Category IDs. Instead, the Wikidata community maintains the mapping inside Wikidata statements, for the whole community to enjoy and use, through a single new proposed property. Regarding Genres: that is entirely different and we already have entities for the various instances of music genre (Q188451) like jazz (Q8341) and a property to use genre (P136), ready for your needs. --Thadguidry (talk) 01:08, 14 September 2020 (UTC)
Hmm, perhaps I just disagree on the scope of the need. Yes, we should map their IDs to QIDs but the mapping should be done in an item. So we will have an item representing, say, id 15 with all the names of the id in different languages and with linked concepts. Then we can scrape the API and tag wikidata yt video items with those entities. This seems like a better way of representing the data more generally. I bring up the genre issue as how do we map spotify's "genre id" cleanly to one of our genre QIDs. It's a similar problem solved in a similar way. Imagine solving that the same way you are proposing solving the yt category id problem. BrokenSegue (talk) 01:56, 14 September 2020 (UTC)
We don't need or want to replicate other services' entities as Wikidata entities (your suggestion of making a Wikidata item "15" that is constrained to API endpoint URL (P6269) "YouTube"). Instead, we link to them from existing Wikidata items that have the same semantic meaning (this helps in other areas such as Latent semantic analysis (Q1806883) and pragmatics (Q181839)). That's how Wikidata and Linked Data best practices work[14]. --Thadguidry (talk) 21:06, 16 September 2020 (UTC)
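
To make the N:1 mapping discussed in this thread concrete, here is a small sketch of how an application might consume it. It assumes the proposed property exists and that its statements have already been retrieved as (QID, category ID) pairs; the pairs are the examples given earlier in the thread, while the names `PROPOSED_STATEMENTS` and `items_by_category` are invented for illustration.

```python
from collections import defaultdict

# (Wikidata QID, YouTube Category ID) pairs copied from the examples above.
PROPOSED_STATEMENTS = [
    ("Q638", "10"),     # music
    ("Q349", "17"),     # sport
    ("Q11424", "1"),    # film
    ("Q11425", "1"),    # animation
    ("Q1420", "2"),     # motor car
    ("Q42889", "2"),    # vehicle
    ("Q34104", "28"),   # science and technology
    ("Q39201", "15"),   # pet
    ("Q729", "15"),     # animal
]


def items_by_category(statements):
    """Invert the many-to-one mapping: category ID -> list of QIDs."""
    index = defaultdict(list)
    for qid, category_id in statements:
        index[category_id].append(qid)
    return dict(index)


lookup = items_by_category(PROPOSED_STATEMENTS)
print(lookup["15"])
```

An application that receives category ID 15 from the Data API could then look up both related items without maintaining its own reconciliation table.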

Proposed change of procedures for CheckUser requests

Hello all. Coming to think of it, I think we need a better way of doing followup requests for users. Right now, reporting further socks after the initial report takes the form of additional comments on the original request. This looks unwieldy, especially for long-term sockpuppeteers. Thus, I propose that all new requests be filed using the (n) nomenclature, e.g. the second request for John Doe would be Wikidata:Requests for checkuser/Case/John Doe (2). This will provide better organization and will also provide a measure of a user's sockpuppetry (one could then say "this user has had seven CheckUser cases opened" or something like that). I'm open to alternatives too.

Separately, we've gotten a lot more requests than I expected. I think we need to make the archiving more aggressive; having new pages for new cases would make it easier for us to tell what cases need to be fulfilled, and fulfilled requests can be archived immediately without fear of having to re-transclude for a new case.--Jasper Deng (talk) 19:22, 14 September 2020 (UTC)

How do you feel about 'threads get archived after 7 days without activity'? @Jasper Deng: --Trade (talk) 07:29, 15 September 2020 (UTC)
So, why use subpages at all?--GZWDer (talk) 10:30, 15 September 2020 (UTC)
@Trade: That seems like a sensible archiving scheme. @GZWDer: Not using subpages would be even less organized, and importantly, all the edit histories for different cases would be on the same page, making it hard to keep track of any individual case.--Jasper Deng (talk) 16:16, 15 September 2020 (UTC)

Significant Reasonator bug

Where is the best place to report problems with Reasonator? I ask because a number of other tools, e.g. Mix'n'match, use the Reasonator-generated text summaries, so problems with the summaries are significant. I just noticed that Reasonator does not disregard deprecated statements, and if for example you have several birth date statements for a person, it will put the first one into the summary even if it is deprecated (example). — Levana Taylor (talk) 14:40, 16 September 2020 (UTC)
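
For illustration, the behaviour asked for here (ignore deprecated statements, and prefer preferred-rank ones) can be sketched as below. This is not Reasonator's code. The `rank` field with the values `preferred`, `normal` and `deprecated` follows the Wikibase JSON format, but the flat `value` field and the sample dates are simplifications invented for the example; real statements nest the value under `mainsnak`.

```python
def best_statements(statements):
    """Drop deprecated statements; if any preferred ones remain, return only those."""
    usable = [s for s in statements if s.get("rank") != "deprecated"]
    preferred = [s for s in usable if s.get("rank") == "preferred"]
    return preferred or usable


# Hypothetical birth date statements, with the deprecated one listed first.
birth_dates = [
    {"rank": "deprecated", "value": "1850-01-01"},
    {"rank": "normal", "value": "1851-03-04"},
]
summary_value = best_statements(birth_dates)[0]["value"]
print(summary_value)
```

A summary generator that takes the first element of `best_statements(...)` rather than of the raw list would no longer pick the deprecated date.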

Constraint violation

I have just added Gasser (Q99371476) family name (P734) Gasser (Q21506814) but there is a constraint violation since the Q-item Gasser (Q99371476) doesn't have a first name. From family name (P734) this seems to be the right property to link lineage / noble families to the "common name" (see here) but obviously Gasser (Q99371476) doesn't have a first name. Can we somehow change the constraint so that it does not apply to instances of family (Q8436)? Best --Hannes Röst (talk) 15:50, 16 September 2020 (UTC)

Country property for boxers

Hi. Is there any property to specify that a boxer is fighting under the flag of X country? --ԱշոտՏՆՂ (talk) 16:06, 16 September 2020 (UTC)

I'm very tired of this situation now.

More than 250 characters can be written in the English description section. But other languages have limits. I have asked this before again here. Also, Q9175810 has the same label on Q8478926. But when I do this in Turkish, he doesn't accept it. Why does it accept it in English? Why is there discrimination in language?

In addition, all the institutions of the United States are mentioned as if they control the whole world. So generalizations are always made. The same is done a lot for Britain and Australia. You have to see this kind of injustice. USA, UK and Australia are not unique in the world. Each country has its own institution and starts with the country name.

So I'm really tired of this situation. When I see these, I don't want to contribute. --Sezgin İbiş (talk) 21:39, 10 September 2020 (UTC)

For the second point, do you mean labels like on Department of Statistics (Q17010279), where it's just "Department of Statistics" and doesn't include the country name? I assume that it's done that way because the organization is known as simply "Department of Statistics" within Bermuda. Ghouston (talk) 23:29, 10 September 2020 (UTC)
For the first point, you may want to read the section further up the page where Wikidata developers are asking for feedback on how easy/difficult you have found it to request improvements. If you have made a formal request and it has been ignored, this is a good opportunity to raise it and find out what happened. If you have just complained on project chat and expected something to happen, this is a good opportunity to make it a formal request.
Another example of your second point is Ministry of Justice of Spain (Q3181035), which in its native Spanish label is listed solely as "Ministerio de Justicia." Many items in their native language do not indicate their jurisdiction in the label. A generalisation that it is only applicable to English language labels suggests that you are not considering other languages to the same degree. From Hill To Shore (talk) 00:00, 11 September 2020 (UTC)
For your point regarding Template:Wikidata (Q8478926) and Template:Wikidata (Q9175810) the duplication error considers the Label and Description together. If both the Label and Description are the same on both items, an error is generated. I had assumed that the same behaviour worked for all languages. Have you tried setting the same label but different descriptions in Turkish? From Hill To Shore (talk) 00:07, 11 September 2020 (UTC)
Thanks again for your work labelling properties in Turkish. Every property you label will appear on thousands or millions of items, so it is a great service to Turkish language speakers. From a personal point of view, it also means that Wikidata:Entity Explosion will work really well in Turkish now. I wonder if the different character limit is related to different character encoding. Because some languages have more characters, they require more bytes to specify each character. In any case, usually 250 characters would be too long for a good description, so it's better if we all keep them short. --99of9 (talk) 01:54, 11 September 2020 (UTC)
OK. Q9175810 and Q8478926 issue was caused by the label and description being the same. I tested 250 characters on test.wikidata.org and here. There is a 250-character limit for new entries for English. I've overcome these problems. OKAY. Selfishness, which does not have a language problem and is a problem for the USA, England and Australia, is not the subject here. It's the problem of these countries. (If you look carefully, you can see that other countries do not have such a problem anyway.) Understood. Thank you. --Sezgin İbiş (talk) 07:23, 11 September 2020 (UTC)
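
The rule described above (two items clash only when label and description are both identical in the same language) can be sketched as follows. This merely illustrates the rule as stated in this thread and is not the actual Wikibase check; the function name, the data layout and all sample labels and descriptions are assumptions.

```python
def conflicting_languages(item_a, item_b):
    """Return languages where both items share the same label AND description.

    Each item is a dict of {language: (label, description)}.
    """
    conflicts = []
    for lang in sorted(item_a.keys() & item_b.keys()):
        if item_a[lang] == item_b[lang]:
            conflicts.append(lang)
    return conflicts


# Hypothetical values: same label everywhere, same description only in English.
item_one = {
    "en": ("Template:Wikidata", "Wikimedia template"),
    "tr": ("Şablon:Vikiveri", "Vikimedya şablonu"),
}
item_two = {
    "en": ("Template:Wikidata", "Wikimedia template"),
    "tr": ("Şablon:Vikiveri", "Vikimedya şablonu (2)"),
}
clashes = conflicting_languages(item_one, item_two)
print(clashes)
```

In this sketch the Turkish pair passes because only the labels match, which is the workaround suggested above: keep the label and vary the description.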

Please assume good faith (not just because it's policy). Of course all ministries of finance should just have the label "Ministry of Finance", because that's their (translated) name. But everyone is probably apt to think of other countries' ministries as ".. of <country>", more so than for their own. I doubt that US/UK editors are any more likely to do so. It happening more for English labels than others can be explained by there just being more editors from those countries.

In any case, I just checked a search for Ministry of Finance, and the vast majority of English labels are correct. As a contrast, searching for Finanzministerium (the German equivalent) returns country-qualified labels almost exclusively. --Matthias Winkelmann (talk) 21:52, 16 September 2020 (UTC)

Wikipedia in described at URL (P973) ?

Somehow I thought statements like [16] should be avoided. There are now 2000+ pointing to dewiki. @Anvilaquarius:. I think @GZWDer: once made a property proposal for this type of section link. --- Jura 11:14, 15 September 2020 (UTC)

This is useful because it opens up Interwiki links in cases where one Wiki has a page for a topic and another Wiki handles the topic as a subpage. Policy-wise, I think we should remove the existing links to dewiki and add a note on described at URL (P973) that it's not to be used for linking to Wikimedia projects. ChristianKl 12:06, 15 September 2020 (UTC)
Yes, but only if there's some sort of sitelink to replace the P973 link. Otherwise, removing these links would be mere vandalism. --Anvilaquarius (talk) 15:26, 15 September 2020 (UTC)
Cleaning up after incorrect uses is never vandalism. Please refrain from such qualifications. How did you get to use this property? --- Jura 15:34, 15 September 2020 (UTC)
Doing it automatically would likely need a dewiki bot approval. Those seem to be generally hard to get, but there is no principled reason why it can't be done. Given that it's not straightforward I do favor removing the incorrect uses. ChristianKl 10:20, 16 September 2020 (UTC)
Even if one could get bot approval, it might be tricky to get the redirects properly set up (depending on how rigorous the wiki is about the title to use).
Personally, I'd favor an approach like Wikidata:Property proposal/described in Wikimedia article. We could easily migrate all there. --- Jura 10:57, 16 September 2020 (UTC)
Many users on the German Wikipedia are unfortunately heavily opposed to redirects with disambiguation parentheses, arguing that they clutter up the search results and cause other problems. So they invariably will get listed on maintenance lists and then deleted sooner or later. --Kam Solusar (talk) 00:50, 17 September 2020 (UTC)
« Somehow I thought statements like should be avoided. » →‎ I clearly think that statements like this are useful. Visite fortuitement prolongée (talk) 18:48, 15 September 2020 (UTC)
If statements are noted that way instead of being noted with sitelinks then Interwiki links don't work. If OpenStreetMap for example would want to display a link to the German Wikipedia for Galeriegebäude Herrenhausen (Q98809154) they can when the proper links are used but they can't when described at URL (P973) is used. ChristianKl 10:00, 16 September 2020 (UTC)

Qualifier for type of postal address

Hello all. I am wondering if there is currently a way to qualify a street address to indicate the type of address it is? For entities at the University of Washington that we are creating items for, there are frequently multiple addresses, one for the physical address and one for the mailing address. Today I had a department that gives a mailing address and an address for deliveries on its website. I would like to indicate that one address is for mail and the other is for deliveries: see https://familymedicine.uw.edu/about/contact/ Here's another example where there is a physical address and a mailing address: https://globalhealth.washington.edu/contact

Thanks for any advice you could give. --Adam Schiff (talk) 19:09, 16 September 2020 (UTC)

I can't see any sign that anyone has tried to do that before. I think the qualifier object has role (P3831) would be suitable, if you created new Q items for the address types. address (Q319608), street address (Q24574749), and post office box (Q1162282) do exist. Ghouston (talk) 23:51, 16 September 2020 (UTC)

Map publisher Wagner & Debes

Is there no data-item for the map publisher? See Commons:Category:Wagner & Debes. Smiley.toerist (talk) 14:56, 16 September 2020 (UTC)

There is one now. :-) Wagner & Debes (Q99398846). - PKM (talk) 22:00, 16 September 2020 (UTC)
Thanks, I now use the item in structured data. Smiley.toerist (talk) 09:29, 17 September 2020 (UTC)

Modeling place of first and last match for a sportsperson

Hello! I'm trying to model the date and place for the professional debut of a sportsperson and also the date and place of the last match. I find it especially difficult: Alexis Apraiz (Q12253165) is one way to do it, but I'm not sure if it is the best one. Another option is at Aimar Olaizola (Q3752396), but I can't find a way to show the last location. What do you think? -Theklan (talk) 16:52, 17 September 2020 (UTC)

Tools for faster wikidata entry?

I'm fairly new to Wikidata and SPARQL. Just wondering if there are any tools that speed up the process of adding statements to wikidata. Ideally, I don't want to be doing data entry one by one.

For example: I want to add a statement of 'part of' with value of 'Sengoku period' to a list of 1000 items. Is this possible?

I've heard of QuickStatements but am wondering if there are other tools.

Cheers  – The preceding unsigned comment was added by 2600:1700:3520:1550:481a:6319:9fda:aa25 (talk • contribs) at 08:32, 17 September 2020‎ (UTC).

The way I do things like your example is to use a SPARQL query to create the list of 1000 QIDs, use a text editor to mass-transform them into the desired QuickStatements commands, and run those through QuickStatements. — Levana Taylor (talk) 17:21, 17 September 2020 (UTC)
Yes. Thank you. That's what I ended up doing. Not ideal but it sort of works. The trick is trying to get lucky in finding a common statement for all the IDs I want to add new statements to. I guess this can be done with a general query and then filtering the query down.  – The preceding unsigned comment was added by Nonoumasy (talk • contribs).
Check out Help:Navigating Wikidata/en#Searching with statements. The "haswbstatement" search in some cases is more appropriate than SPARQL, and certainly easier to use! — Levana Taylor (talk) 23:23, 17 September 2020 (UTC)
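
As a rough illustration of the workflow described above (SPARQL results, a text transformation, then QuickStatements), the transformation step might be scripted along these lines. This is only a sketch: it assumes the tab-separated QuickStatements V1 command format, the function name is made up for this example, and the QIDs shown are hypothetical placeholders rather than real items to edit. P361 is the "part of" property mentioned in the question.

```python
def to_quickstatements(qids, prop, value):
    """Build one tab-separated QuickStatements V1 line per item.

    qids  -- iterable of QIDs or full entity URIs (as SPARQL results often return)
    prop  -- property ID, e.g. "P361" (part of)
    value -- QID of the target item
    """
    lines = []
    for raw in qids:
        # Keep only the trailing QID if a full entity URI was supplied.
        qid = raw.strip().rsplit("/", 1)[-1]
        if not qid:
            continue  # skip blank rows from the exported result list
        lines.append(f"{qid}\t{prop}\t{value}")
    return "\n".join(lines)


# Hypothetical placeholder IDs, for illustration only.
sample_items = ["http://www.wikidata.org/entity/Q111", " Q222 ", ""]
TARGET_QID = "Q999"  # placeholder; replace with the real QID of the target item

print(to_quickstatements(sample_items, "P361", TARGET_QID))
```

The printed lines can then be pasted into QuickStatements; checking a handful of them by hand before running the whole batch is probably wise.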

Hello @Nonoumasy, Levana Taylor: another tool besides QuickStatements to add statements is PetScan. (PetScan#Add/remove_statements_for_Wikidata_items) This tool can filter, for example, based on categories of the article, age of the article, a manual list of articles, a SPARQL query, and so on. In the small box in the bottom right corner you can enter a list of statements (format Pxxx:Qyyy) to execute on the selected entries. --M2k~dewiki (talk) 23:32, 17 September 2020 (UTC)

Thanks @Levana Taylor, Nonoumasy:

Odd infobox result

Any idea why the Wikidata infobox at C:Category:Fires in the 1900s gets the particular image it does (an early airplane hanging from the ceiling of a museum?) - Jmabel (talk) 15:59, 17 September 2020 (UTC)

It's because that was the image for 1900s (Q36574). Anyone know why the category infobox uses the decade image? — Levana Taylor (talk) 17:40, 17 September 2020 (UTC)
It's because the Wikidata Infobox tries to shove as much info as possible in front of people's faces with little regard for relevance. Any intersecting categories have especially redundant or irrelevant data barf. -Animalparty (talk) 17:50, 17 September 2020 (UTC)
…well, that sounds like a brilliant plan… - Jmabel (talk) 23:21, 17 September 2020 (UTC)

Merge two objects

Q2386476 and Q86966830 are the same object. I have copied the appropriate information so that Q2386476 is now the correct one. I would like someone to merge these so that Q2386476 is the only one remaining. Thanks in advance! Pjotr'k (talk) 18:07, 17 September 2020 (UTC)

One item is for a mountain and the other is for a protected area. Do the boundaries of the mountain perfectly align with the boundaries of the protected area? If not, they should be kept as separate items. There may also be other reasons that I haven't thought of for why they should be separate. From Hill To Shore (talk) 18:19, 17 September 2020 (UTC)
The protected area is for the mountain. Pjotr'k (talk) 19:35, 17 September 2020 (UTC)
Usually a protected area has specific borders, and a mountain does not. - Jmabel (talk) 23:23, 17 September 2020 (UTC)
An inception (P571) statement would make it obvious whether an item is for the mountain or the protected area. Protected areas and mountains are quite different things. Ghouston (talk) 06:22, 18 September 2020 (UTC)
Thank you all for the input. I withdraw my request and have also withdrawn my edits in the objects in question. Pjotr'k (talk) 13:20, 18 September 2020 (UTC)

Notification of global ban proposal who were active on this wiki

This is a notification of global ban discussion per the global ban policy.

Regards, smb99thx email 09:27, 18 September 2020 (UTC)

This is copied from WD:AN. smb99thx email 09:27, 18 September 2020 (UTC)

Ranks and the UI

Currently, ranks are only displayed through a small icon that's easily ignored by newcomers. What do you think about bolding the item name of every statement with preferred rank and using strikethrough for deprecated statements? ChristianKl 21:03, 4 September 2020 (UTC)

This problem is being discussed at phab:T206392. ---MisterSynergy (talk) 21:16, 4 September 2020 (UTC)
  • Maybe using a gray background for deprecated statements could do. Somehow I'd avoid making preferred statements too prominent as it could lead users to conclude that there should always be a preferred statement. --- Jura 08:51, 12 September 2020 (UTC)
    • I think making the text lighter, rather than the background darker, would be clearer. If we could make the whole statement 70% transparent, that would be very clear, but I don't know if that is possible. Another approach would be to add a bold red "⊗" next to the statement to make it clear that it's not valid. The Erinaceous One 🦔 21:33, 18 September 2020 (UTC)

Section for lexicographical data in the sidebar

It's very common for new editors to mistakenly create Lexemes instead of Items. We need an introductory explanation on Special:NewLexeme (MediaWiki:Wikibase-newlexeme-summary) similar to MediaWiki:Wikibase-newitem-summary but, apart from this, right now both options ("Create a new Item" and "Create a new Lexeme") appear one below the other in the same section of the left navigation bar. Would you like to have a specific section "Lexicographical data" with the option "Create a new Lexeme" and possibly a link to recent changes focused on the Lexeme namespace? --abián 10:10, 17 September 2020 (UTC)

  Done. --abián 17:45, 18 September 2020 (UTC)

"Meaning overlaps" relation

Question for the ontologists: Getty AAT has a relation property "meaning/usage overlaps with". (See example at liqueur glasses.) Is there a mapping relation equivalent we can use in Wikidata? I have been tempted to use both "said to be the same as" and "different from" but I am not sure that really conveys the proper message. - PKM (talk) 22:17, 15 September 2020 (UTC)

Yes, it's partially coincident with (P1382). Ghouston (talk) 01:48, 16 September 2020 (UTC)
partially coincident with (P1382) equivalent property (P1628) http://purl.obolibrary.org/obo/RO_0002008 appears to be about spatial extent, not conceptual. Pelagic (talk) 00:43, 19 September 2020 (UTC)
For that we have territory overlaps (P3179). If identical, there is coextensive with (P3403). --- Jura 05:15, 19 September 2020 (UTC)

Sidebar

Does anyone know where I can edit the translation of the sidebar? I have been looking in Special:Translate and Translatewiki but couldn't find it. Right now the translation for Indonesian is inconsistent with the word "item" being translated both into butir and item at the same time. RXerself (talk) 00:04, 17 September 2020 (UTC)

@RXerself: Adding ?uselang=qqx to the page URL will show where the messages are.--GZWDer (talk) 00:14, 17 September 2020 (UTC)
Yes, I have been looking in "Translate to Bahasa Indonesia" but couldn't find the terms from the sidebar. RXerself (talk) 00:43, 17 September 2020 (UTC)
@RXerself: Maybe https://www.wikidata.org/wiki/Special:AllMessages?prefix=wikibase&filter=all&lang=id&limit=50 is a good start? It has links to translatewiki where this type of update should be done. --- Jura 10:52, 17 September 2020 (UTC)
I searched there for the term "sembarang" and there was no match. :( I'm also going to ask in the Indonesian Wikipedia community whether anyone there knows the place, since I'm sure one of the users there translated it first years ago. RXerself (talk) 11:46, 18 September 2020 (UTC)
Hi, RXerself! Are you talking about “Item sembarang” from d:mediawiki:Randompage/id versus “Buat butir baru” from d:mediawiki:Special-newitem/id? (I’m not an interface guru, I just remember the information about where these strings are stored did arise during a discussion on w:en.) Pelagic (talk) 02:54, 19 September 2020 (UTC)
Yes! There are also some other links that still use "item" instead of "butir". RXerself (talk) 08:07, 19 September 2020 (UTC)
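
For anyone following the ?uselang=qqx tip above, a small helper can add the parameter to any page URL without breaking an existing query string. This is a minimal sketch using only the Python standard library; the helper name is made up for this example, and the URL in the usage line is just an illustrative page address.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_qqx(url):
    """Return the URL with uselang=qqx set, replacing any existing uselang."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous uselang value.
    params = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    params.append(("uselang", "qqx"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_qqx("https://www.wikidata.org/wiki/Wikidata:Main_Page"))
```

With the parameter set, interface strings are shown as their message keys, which can then be looked up on translatewiki.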

Discography of American Historical Recordings database at the Library of Congress

The Discography of American Historical Recordings (Q42800691) is a database curated by University of California, Santa Barbara Library. It contains historical recording artists. We have Property:DAHR artist ID as an identifier for humans already in Wikibase. I have been discussing loading the entire DAHR artist database into Wikibase with the person that is currently maintaining it. How do we get started, where do I propose it, how do we go about it? I am assuming like SCOPUS and The Peerage, the description will just be "DAHR artist ID=XXXX". You can go to Property:DAHR artist ID and click on an entry to see the contents. Mix'n'match will be easy since the DAHR database contains LCCN numbers. --RAN (talk) 00:14, 18 September 2020 (UTC)

Great project! I'm curious about how you plan to deal with the peculiarity that DAHR assigns multiple artist IDs based on roles? ("as composer", "as arranger", "as conductor", etc.) Moebeus (talk) 00:54, 19 September 2020 (UTC)
They realized that was not the way to go, and assigned new IDs. The new IDs have combined the disparate roles. I was curious why the old IDs were being deleted here at Wikidata, and am in contact with the point person for the new project. They like our linkage because they can import the images we house at Commons into their database. We imported the entire Library of Congress image database and I have been working on linking the images to Wikidata. The project to identify the people named in the photos is happening at Flickr Commons with 50 new images each week. The last batch added is here. The Bain collection is heavy with opera singers. --RAN (talk) 21:19, 19 September 2020 (UTC)

Information retrieval (SPARQL query) on multiple datasets?

When you want to do a SPARQL query on Wikidata, you go to https://query.wikidata.org/ . When you want to do a SPARQL query on OSM, you go to https://query.wikidata.org/ . When you want to do a SPARQL query on DBpedia you go to https://dbpedia.org/sparql .

Is there a way to do a SPARQL query on multiple datasets or can you do it from any one of these locations?

What I'm hoping to do is to be able to query any dataset from one place to leverage the benefits of this LOD using RDF.

--Nonoumasy (talk) 02:54, 19 September 2020 (UTC)

The concept you are looking for is federated queries. You can query one endpoint and, within the query, reach out to other endpoints. There is a description with examples for WDQS, too. Jneubert (talk) 14:48, 19 September 2020 (UTC)
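
To make the idea of a federated query a little more concrete, here is a hedged sketch that only assembles the query text; it does not contact any endpoint. The SERVICE keyword is standard SPARQL 1.1 federation syntax. The helper name and the two graph patterns are assumptions chosen for illustration, the prefixes are assumed to be predefined by the endpoint that runs the query, and whether a given endpoint is willing to call out to another one (WDQS, for instance, restricts federation to approved endpoints) has to be checked separately.

```python
def federated_query(remote_endpoint, local_pattern, remote_pattern):
    """Assemble a SPARQL query whose second pattern is evaluated remotely."""
    return (
        "SELECT * WHERE {\n"
        f"  {local_pattern}\n"
        f"  SERVICE <{remote_endpoint}> {{\n"
        f"    {remote_pattern}\n"
        "  }\n"
        "}"
    )


# Illustrative patterns only: match items locally, then look up the
# corresponding DBpedia resources through owl:sameAs links.
query = federated_query(
    "https://dbpedia.org/sparql",
    "?item wdt:P361 ?whole .",
    "?dbpediaResource owl:sameAs ?item .",
)
print(query)
```

The resulting text would be pasted into the query editor of the endpoint acting as the starting point, which then forwards the inner block to the remote endpoint.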

Thanks.--Nonoumasy (talk) 20:59, 19 September 2020 (UTC)

Wikipedia Infoboxes to Wikidata?

Is it possible to extract Wikipedia Infoboxes to Wikidata?

I'm interested in extracting the properties and values for Belligerents, Commanders, and Strength, e.g. from https://en.wikipedia.org/wiki/Battle_of_Un_no_Kuchi

I'm hoping to add this data to the respective Wikidata item: https://www.wikidata.org/wiki/Q2890433

Or would this be best done using the Wikipedia REST API?

--Nonoumasy (talk) 10:11, 19 September 2020 (UTC)

Are there any links on how to use Harvest Templates? The help button is disabled.  – The preceding unsigned comment was added by Nonoumasy (talk • contribs).

Main points are: login, click load, wait, once it's loaded, start, .. eventually try "stop", "publicly save".
Give it a try with some edits. If something goes terribly wrong, one can easily revert them. BTW please sign your posts with ~~~~ --- Jura 15:52, 19 September 2020 (UTC)

Learning SPARQL

Hi,

I'm just wondering if there are some resources to learn SPARQL. I'm happy with my progress but would like to accelerate the process. I'm sure just 'doing it' has its virtue, but if there are video tutorials on intermediate/advanced queries, please let me know. Also, would learning more SQL help? --Nonoumasy (talk) 20:55, 19 September 2020 (UTC)

Maybe Wikidata:SPARQL_query_service/Query_Helper works for you. --- Jura 15:55, 19 September 2020 (UTC)
I have found this wikibook answering all my questions: https://en.wikibooks.org/wiki/SPARQL --SCIdude (talk) 16:25, 19 September 2020 (UTC)

Images from the State Library and Archives of Florida

I am making entries for images from the State Library and Archives of Florida that correspond to an identifiable person. I make an entry in Wikidata and import the image. They may be sparse in information like Elizabeth Harris (Q99481187) when I can't find more information, or more information dense like Oscar Fitzalan Peek (Q99479376) where I can create an entry in Familysearch and link them to primary documents. Does anyone have any objections? --RAN (talk) 21:31, 19 September 2020 (UTC)

How many? It would be good if you could at least put a century on the person to avoid future conflations. I think we have a property that just indicates they existed at a certain point in time (if the birth and death dates are totally unknown). Ah, it's floruit (P1317). BrokenSegue (talk) 01:29, 20 September 2020 (UTC)
Sure! I am doing them one at a time, I don't see a way to automate them, so no count yet. I think I will be able to get a birth year for everyone from the census, Florida was sparsely populated in the early 1900s. --RAN (talk) 02:26, 20 September 2020 (UTC)
Thank you for doing this!!! I hope you find some existing duplicates to merge along the way. — Levana Taylor (talk) 03:07, 20 September 2020 (UTC)
I am sure the State Library and Archives of Florida will appreciate the additional information on the subjects.

I think they are duplicates of each other and should be combined. Smiley.toerist (talk) 08:37, 16 September 2020 (UTC)

  • Given that EnWiki has separate pages for each of them, they aren't duplicates for Wikidata. It might be necessary to clarify either item to make it clearer how they differ, but they aren't duplicates. ChristianKl 09:25, 16 September 2020 (UTC)
The main difference seems to be that a motor coach can pull other trailers and can function as a locomotive. Railcars are lighter. The definitions are not airtight, as some railcars can exceptionally pull a trailer. The distinction between an articulated vehicle and a multiple unit is also not clear. Smiley.toerist (talk) 11:55, 16 September 2020 (UTC)
I think a lot of 'railcars' are wrongly classified as 'motor coaches'. Motor coaches are used in combination with several other rail vehicles, not singly. Railcars are easily identifiable by having a driving post at both ends of the vehicle, as they have to be driven in both directions as a single car. They can often be coupled to other 'railcars' to form a train or pull a trailer. Mostly one, as railcars are motorised for only a single vehicle. Only vehicles powered by internal combustion engines can be considered railcars. A counterexample is File:BT BDe 3-4 43.jpg, where a motor coach is used as a locomotive. Diesel-powered multiple units quite often have a 'motor coach' for the engine power. (File:01.08.92 Liefkenshoek 4006 (5804300936).jpg). These motor coaches never have a separate article, but are part of a multiple unit train type. Smiley.toerist (talk) 09:49, 17 September 2020 (UTC)
PS: The example used in eo:Relaŭto is articulated and should be considered a multiple unit, not a railcar. As I don't know the language I cannot check if the text is correct. Smiley.toerist (talk) 09:58, 17 September 2020 (UTC)
I have now transferred from 'motor coaches' to 'railcars' for the following languages: cs, da and pt. The definitions vary widely. 'Motor coaches' have mostly electric examples. In many languages there is no distinction between an electric and a combustion engine energy source. Smiley.toerist (talk) 11:32, 17 September 2020 (UTC)
  • It would be helpful if non-specialists could understand from both statements on the items and (English) descriptions the difference between the two items. --- Jura 09:34, 20 September 2020 (UTC)

Importing a template

I'd like to import the w:Template:Please see template to Wikidata so that we can use it here. I haven't done this before, but I can't find any help pages at all about templates on Wikidata. Is there anything I should know before I go ahead and do it? (And I assume there's no alternative to just copying and pasting, as much as forking pains me?) {{u|Sdkb}}talk 03:36, 20 September 2020 (UTC)

Missouri "The Ozarks" current role in Luminate

I have lived in Missouri and the Ozarks all my life. There is alot of unspokens going on. I've done my one research from development of my state, counties to the current events the fake news won't report. There are so many things that connect from top CIA, LSD experiments, to banking in such rural areas, to the known cave systems, and dums , to local access options really only access from family property. To the main Luamite head quarters down the road owned by one the main army, CIA intelligence agency worked in White House still owns security system company in Texas. I'm the only one seeing this , it's nerve racking to have been awoken to everything but still in a dark hole hole what's happening in my back yard  – The preceding unsigned comment was added by Dkp29 (talk • contribs).

I think we already know about most of these and have items for them such as Central Intelligence Agency (Q37230). But what is Luminate/Luamite? Is it Luminate (Q6703268), a "Contemporary Christian rock band"? Ghouston (talk) 03:31, 21 September 2020 (UTC)
Illuminati (Q133957)? 193.126.51.63 11:09, 21 September 2020 (UTC)

Resources on Schema.org

Just found out about schema.org. Some people have helped me on some SPARQL query using schema as part of the query. I've also heard that it's more standardized and easier to use than the semantic web RDF standards.

  • How do I learn about how schema is used with respect to SPARQL queries or linked data in general?

There are articles on using it for SEO but I'm more interested in how it's used in linked data or information retrieval. Thanks --Nonoumasy (talk) 11:57, 21 September 2020 (UTC)

Wikidata weekly summary #434

There is a bit of a mess here. Most of the wikis currently located at the child (anecdote joke (Q2374151)) seem to belong to the parent (anecdote (Q193206)), the property topic's main category (P910) clashes, etc. Please help to assign the interwikis correctly. Cheers, Henry Merrivale (talk) 10:46, 17 September 2020 (UTC).

  • It seems these wiki articles actually form quite a spectrum of descriptions: short accounts that are not at all necessarily humorous (en:Anecdote, simple:Anecdote, it:Aneddoto); short, usually humorous story (fi:Anekdootti, de:Anekdote); short, (necessarily) humorous story, but without any national specifics (ru:Анекдот, be_x_old:Анэкдот, be:Анекдот, sk:Anekdota_(zábavný_príbeh), bg:Анекдот, mk:Анегдота); short, humorous story, just like the previous one, but in the context of Russian and Soviet culture (en:Russian_jokes, tl:Mga_birong_Ruso). This probably needs more wikidata items, but I'm not sure how much. --MetaWat (talk) 19:31, 17 September 2020 (UTC)
  • Thanks, MetaWat for the analysis. I participated in these items but now I see that the situation is even more complex. I have no will here to separate into more items but it would probably solve the issue. --Infovarius (talk) 15:39, 21 September 2020 (UTC)

Q47012826 and Q5471247 (Fort Green)

Are Fort Green (Q47012826) and Fort Green (Q5471247) the same? --RAN (talk) 13:04, 18 September 2020 (UTC)

They have separate enwiki pages, so a merge of those articles would need to be considered before any merge here. From Hill To Shore (talk) 14:34, 18 September 2020 (UTC)
It's possible that one can refer to the historic fort itself, while the other refers to the modern census-designated place. Although these may not need to be treated separately, as in the case of Fort Liberty (Q991369) (edit: oh, and see Fort Bragg (Q8962062)). -Animalparty (talk) 16:44, 18 September 2020 (UTC)
I will make one the fort and the other the modern populated area, currently the Wikipedia links are a mix of the two and the descriptions are the same. --RAN (talk) 20:56, 19 September 2020 (UTC)
You might need to sort out the identifiers as well. --- Jura 15:44, 21 September 2020 (UTC)

Creating a database for scientific purpose.

How do I create an accurate database for biological research purposes, and what are the main steps to follow? I am basically a biologist (PharmD student).  – The preceding unsigned comment was added by Kone Boï (talk • contribs) at 1:54, September 19, 2020 (UTC).

If you are trying to create a database outside of Wikidata, this isn't the place to get help. Maybe try stackoverflow.com? If you want to use Wikidata to achieve your goals, then you'll need to provide more details about what you are trying to do, what you have tried, and where you got stuck. Nobody will do your work for you, but there are many people who will help you once they see you've put in the necessary effort.
P.S. Please sign your comments by typing "~~~~" at the end. The Erinaceous One 🦔 05:52, 19 September 2020 (UTC)
If you want to use any of the data on Wikidata for research, you might as well just quit grad school. Massively incomplete, haphazard, and next to zero quality control here. Wikidata is simply a way to link Wikipedia articles, and a platform for nerds to amuse themselves with trivia like how many roads in Sweden start with the letter T and are less than 4 km long. It serves no greater function. -Animalparty (talk) 07:10, 19 September 2020 (UTC)
Animalparty, I wouldn't be too pessimistic about the utility of Wikidata. I have a friend who develops natural language processing AIs and uses Wikidata as a source. But if you are trying to use it for a case where you need 99.99% reliable data, then yeah, this isn't the place! — The Erinaceous One 🦔 10:15, 19 September 2020 (UTC)
Animalparty Wikidata is already in use by industry and researchers. Turns out you don't need a perfect or complete dataset to be useful for various things. BrokenSegue (talk) 14:57, 19 September 2020 (UTC)
I would like to see a reference for this statement. --SCIdude (talk) 16:28, 19 September 2020 (UTC)
@SCIdude: Which? The use in industry or research? Research has evidence everywhere. For example: Facebook research published work on wikidata. For industrial use it's harder to demonstrate because I have insider knowledge I can't share. But you probably interact with services using wikidata more often than you think. BrokenSegue (talk) 17:01, 19 September 2020 (UTC)

Newcomer and edition request Q1737946 [FR]

Hey,

I'm requesting some insight and discussion on the sadly unlabelled Q1737946 : https://www.wikidata.org/wiki/Q1737946

This term seems to be a French exception for more generic stone-masonry terms

Best, Antoine Gros

Is claveau (Q1737946) different from keystone (Q220919)? - PKM (talk) 19:27, 21 September 2020 (UTC)

Calendar for Events

Hello,

I want to get an overview of the upcoming events related to Wikidata. As far as I can tell, the Wikidata event page does not show all events. In German Wikipedia, special and regular events have pages with information about how to attend and what the event is about. So that other people can find the events, there is a calendar, and for the calendar there is a template. Is there a template like this in Wikidata, or do you think that such a calendar would be helpful for finding events easily? --Hogü-456 (talk) 18:35, 21 September 2020 (UTC)

Making IIIF manifests (P6108) link to something viewable

There was chat today in the zoom call about IIIF and Commons (cf some background and notes at Commons VP), that it would be nice if IIIF manifest URL (P6108) actually linked to an app displaying the image (with its metadata), rather than to the raw JSON of the manifest.

i.e. it would be nice if going to Q21627180#P6108, clicking on the blue link took you to eg

displaying the image, with metadata in the manifest available under the (i) button at the top right, rather than the raw JSON of

even though the latter would still be the value of the property, returned by SPARQL etc.

I was about to simply add a formatter URL (P1630) statement for the IIIF manifest URL (P6108) property. But I wanted to check: will this break anything, given that the property is already URL-valued?

For instance, should I also set say formatter URI for RDF resource (P1921) = $1 to make sure that RDF applications get the underlying URL, not the viewer URL? Would this sort it out? Would it be enough? Or would some applications and/or queries still get the wrong URL? Jheald (talk) 19:48, 21 September 2020 (UTC)

Hmm. Experimenting with the sandbox & sandbox url property, User:Jon Harald Soby has shown that that (probably) doesn't work. As a fallback attempt: can anyone put together a gadget to give a drop-down choice of link destinations, when a property has choices of third-party formatter URL (P3303) defined? Jheald (talk) 21:58, 21 September 2020 (UTC)
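
For readers unfamiliar with formatter URLs, the substitution under discussion amounts to replacing the "$1" placeholder in a template with the statement's value. The sketch below only illustrates that idea and is not how the Wikidata interface is implemented: the function name and the viewer address are hypothetical (no real IIIF viewer is implied), and percent-encoding the manifest URL before embedding it is an assumption of this example.

```python
from urllib.parse import quote


def apply_formatter(formatter_url, value, encode=True):
    """Replace the $1 placeholder in a formatter template with the value."""
    if encode:
        # Encode the value so it can sit safely inside a query parameter.
        value = quote(value, safe="")
    return formatter_url.replace("$1", value)


manifest = "https://example.org/iiif/1/manifest.json"  # placeholder manifest URL

# Hypothetical viewer template: the link a reader would click.
print(apply_formatter("https://viewer.example/?manifest=$1", manifest))

# A template of just "$1" without encoding returns the raw value,
# which is the behaviour wanted for RDF consumers.
print(apply_formatter("$1", manifest, encode=False))
```

The second call shows why setting the RDF formatter to a bare "$1" was proposed: it would leave the underlying manifest URL unchanged for applications that need it.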

Evolution of the community communications roles for Wikidata

Hello all,

For the past four years, I’ve been working in the software department at Wikimedia Germany, taking care of the communication between the Wikidata development team and the community, announcing new features, collecting bug reports and feature requests from you. On top of that, I’ve been coordinating various projects, bringing the WikidataCon to life, coordinating the Wikidata decentralized birthday, creating a prototype for Wikidata Train the Trainers, and taking care of various onsite or online workshops, meetups and other events.

Over the past years, with the Wikidata community growing, the development team growing as well, more and more events happening, and the ecosystem of Wikibase users forming a distinct group with different needs, it became pretty clear that one person was not enough to keep track of everything and provide the best support for the Wikidata editors. That’s the reason why, earlier this year, we had the pleasure to announce the arrival of a new colleague who you already know from being an active Wikidata editor, Mohammed Sadat (WMDE).

We already started a smooth transition of our roles: while Mohammed will become the main person in charge of community communications for the Wikidata and Wikibase communities, I will focus more on organizing Wikidata-related events and supporting community members with their own events and projects. As you may have noticed, Mohammed already took over editing the weekly newsletter, monitoring the social media, and various announcements for Wikidata and Wikibase. As for myself, I will not disappear completely from the Wikidata channels: I will keep supporting Mohammed on community communication, for example with projects like the Wikidata Bridge, in which I’ve been involved since the start.

During this transition phase, we will review and improve our existing communication processes, and you can for example give feedback on the experience you had while reporting bugs or feature requests. Feel free to reach out to Mohammed if you have any questions regarding Wikidata’s development roadmap.

I’m looking forward to continuing working with you on various projects: feel free to contact me if you want to discuss Wikidata-related events, training, online events, or any other ideas you have in mind to gather the Wikidata community and onboard new editors.

Cheers, Lea Lacroix (WMDE) (talk) 08:08, 11 September 2020 (UTC)

@Lea Lacroix (WMDE): thanks for the update and great to see you are still onboard! − Pintoch (talk) 08:30, 11 September 2020 (UTC)
@Lea Lacroix (WMDE): Thanks for all your hard work, I was wondering what you'd be doing now that Mohammed is on board, glad to see what your plans are, keep up the great work! ArthurPSmith (talk) 17:26, 11 September 2020 (UTC)
@Lea Lacroix (WMDE): (and @Mohammed Sadat (WMDE):) I have a question regarding Wikidata Bridge, are you going to adapt it to be used also in Wikisource? I have seen that users are trying to connect Wikidata with book templates, however it seems that the Bridge could work better in that context. What do you think?--MathTexLearner (talk) 20:43, 11 September 2020 (UTC)
  • A suggestion for this year's birthday party (as it can't really happen in person): how about handing out 1000 EUR to each of the 100 most active Wikidata contributors and/or asking them to vote on which development to assign it to? --- Jura 11:38, 12 September 2020 (UTC)
@MathTexLearner: Thanks for your question. Theoretically, the Bridge is a feature that can be used on any wiki that is connected to Wikidata. So in the future, it could very well be used on Wikisource. However, I don't see that happening any time soon: the Bridge is at its very first steps (first version is currently deployed on one Wikipedia and supports one data type), so there will be quite a lot of development, time and community feedback before it reaches a point where it can be deployed on other wikis.
@Jura1: The idea of asking the Wikidata community about what their priorities would be for the roadmap is definitely interesting. However, I don't think that it should go with money attribution. Also, in the development world, 1000€ is basically nothing. As one can see for example in the project grants related to software, the cost range is much larger. And why should it be the top 100 editors, aren't the newcomers and casual editors interested in selecting projects that could make their life easier as well? Finally, I think throwing money at people to develop features is not the right way to go. Features should be developed in a consistent way, attached to the existing codebase and making sure that the existing development team has the resources to maintain them in the future. It is less a matter of money than priorities and sustainability. And BTW, the birthday celebrations are definitely happening - plenty of cool events taking place online and offline. Lea Lacroix (WMDE) (talk) 07:01, 14 September 2020 (UTC)
Supposedly the 100 people don't all have different views ..
It seems to me the last birthday celebrations were mainly attended by some of the 100 most active contributors and did seem to cost 1000 per person. Essentially all money that wasn't spent on development, so we still haven't gotten to the bottom of the Query Server problem .. (after 2 or 3 years?) --- Jura 17:26, 15 September 2020 (UTC)
  • It's quite easy to understand being a part of the community what the developers do. We can look at Phabricator and see what tickets they are working on and when needed give feedback. On the other hand the amount of community tasks that are visible to the community don't seem to be big enough to occupy two people in addition to a project manager. Northcote Parkinson studied how the British Civil Service manages to increases it's headcount while the amount of thing it actually got done shrank. The WMF has a history of engaging in make-work for an increasing number of employees while editorship numbers fell in the early part of the last decade that to me seems like it mirrors what Parkison studied.
It seems like Jura1 is suggesting that a lot of money was spent for the birthday party that could have been spent better on development.
When it comes to "community communication" it would be good to have a clearer idea about what tasks are done. I good portion is likely Parkinsonian make-work that wouldn't need to be done. Another portion could be tasks that the community could do and where facilitating the community to do the tasks would be better then doing them with paid labor.
When it comes to missing community communication, we have performance issues with the Query Server. It would be good to have a general guide about how data modeling decisions made on Wikidata affect the performance as that allows us to get most out of what the query server can deliever.
Earlier this year Mohammed Sadat (WMDE) posted two weeks queries that suggested that certain people who aren't slave holders are slave holders. He promized to do everything he could to not do the error again. The suggestion to do through a standard quality assurance process like the 5-Why's to fix whatever went wrong was ignored. That leaves me with the question of whether I should believe that future promises to do things will also be insincere.
I also do believe that it's worthwhile to be transparent about the community communication processes to allow for improvement. ChristianKl 11:14, 16 September 2020 (UTC)
@Jura, ChristianKl: I don't want to discuss WMDE's (or WMF's) budget and strategy here. It's much more complex than "throwing more money at deployment" (just like the issues with the Query Service are much more complex than "let's buy more servers") and it's a topic where it is necessary to have the full picture in mind to have constructive discussions. It's not my role to discuss these choices with the community, and I'm sure that people who want to discuss how money is spent have been following the various discussions and projects around the movement strategy and the Wikidata strategy.
About the Query Service, our colleagues from WMF are putting a lot of effort into communicating about the status of the Wikidata and Commons query service (see emails from Guillaume Lederey on the Wikidata mailing-list). They also stay very accessible by email or during office hours, and they are not hiding anything about the issues and doubts they encounter, so people who want more technical details about the Query Service can definitely find information.
We saw your suggestion about the 5 whys and we decided not to answer it. The apologies have been made, the issue has been discussed internally with our manager, and we will not expose the details of these discussions here.
Providing transparency regarding Wikidata's development does not mean being accountable to the editors for our daily work, nor disclosing precise information about the tasks we do. I don't think knowing exactly what we do with our day is going to support you directly in having a good experience while editing Wikidata. If you disagree with that, and since you seem to think that our work is useless, "Parkinsonian" whatever that means, and easily replaceable, feel free to contact our manager who will be happy to continue this discussion. As for me, you can consider this message as being my last interaction with you two on this topic. Lea Lacroix (WMDE) (talk) 13:57, 16 September 2020 (UTC)

Sitelink to multi-language Wikisource

Hi; how would I go about making a sitelink to the multi-language Wikisource, which has the interwiki shortcut "s:mul"? I'm trying to link Q99526042, the Wikidata item for a multi-language book, to the Swahili-language portions of the text there: s:mul:Swahili Tales. Typing "mul" at the prompt when editing Wikisource sitelinks doesn't work. I'm able to link to English Wikisource but that only contains the English-language portions of the text. Thanks, Struthious Bandersnatch (talk) 08:50, 22 September 2020 (UTC)

Unfortunately it’s not yet possible as far as I’m aware, see T138332. --Lucas Werkmeister (WMDE) (talk) 09:43, 22 September 2020 (UTC)

Make Copy Qid a gadget or core function

Abbe98 has created a nifty user script that allows you to copy the Qid with one click. It saves the time of selecting the Qid and pressing Ctrl+C. Adding up only the time I do this every day would be considerable and on a sitewide scale this must be a lot of clicks and time saved. You can see it in use here and explained here. I suggest that we make it a gadget, or perhaps even make a feature request to add it natively to the Wikibase software. Ainali (talk) 17:50, 20 September 2020 (UTC)

I double left-click to copy a qid, and middle-click to paste, but this may be a Linux or X-Windows thing. Ghouston (talk) 23:16, 20 September 2020 (UTC)
Prior to writing this user script I double left-clicked to select it then pressed Ctrl+c to actually copy it (on Ubuntu). Abbe98 (talk) 08:53, 21 September 2020 (UTC)
I just right-click - Copy link address. When I paste a url to QID instead of QID, API finds what I want. --Lockal (talk) 11:18, 21 September 2020 (UTC)
Sure, that works in one very specific use case. But it won't help somebody who wants to put the Qid in a template, a query, in QuickStatement just to name a few very common tasks. Ainali (talk) 18:34, 21 September 2020 (UTC)
@Abbe98: I love this gadget! Now I just need to train my fingers to go there reflexively. - PKM (talk) 01:18, 22 September 2020 (UTC)
Thanks @PKM:! I'm thinking of adding a keyboard shortcut as well, but will need to do some research to make sure it won't conflict with other popular scripts. Abbe98 (talk) 09:27, 22 September 2020 (UTC)
A keyboard shortcut would definitely be preferable, because you have to scroll up to the top of the page to use the button. — Levana Taylor (talk) 01:35, 23 September 2020 (UTC)

Acknowledging the number of proposals and the need to qualify the proposals, I am still asking whether there is anything slowing down proposed properties for Natural science? Pmt (talk) 18:01, 21 September 2020 (UTC)

  • It seems to me that we had/have plenty that were/are incomplete .. If there are some that you think are ready for creation, please ping a property creator. Avoid pinging me for external-id ones. --- Jura 18:45, 21 September 2020 (UTC)
  • You can see all the property proposals that are marked as ready here [17]. I only see two that are in the natural sciences, but we have been waiting for over two weeks for Wikidata:Property proposal/is solution to to be created. — The Erinaceous One 🦔 09:05, 22 September 2020 (UTC)
  • First of all, honour to all property creators and the work they are doing. My observation, as far as I can see, is that proposals posted under Property proposal/Authority control go through easily, and understandably so: it is just an ID. But IDs posted under other "categories" are not so easily handled. And for the sake of good order, I do not want to discuss specific proposals here, but rather to learn what the creators are struggling with and their workload. And for properties other than IDs, things should take some time to ensure quality. Pmt (talk) 21:08, 22 September 2020 (UTC)

Can't edit alias

Why can't I edit the second alias of Jarrow Hall (Q6161118)? MSGJ (talk) 09:01, 23 September 2020 (UTC)

worked for me. --- Jura 09:03, 23 September 2020 (UTC)
@MSGJ: It is a known error if the text overflows, the glitch appears sometimes in some browsers, the secret workaround is to right click on the edit button and it will take you to the "|" separated version of the names, where you can edit freely. --RAN (talk) 18:04, 23 September 2020 (UTC)
Thank you! Will try that next time MSGJ (talk) 19:46, 23 September 2020 (UTC)

Importing review scores

So we support review score (P444) but it's not widely populated (a few thousand uses). Are there legal/copyright issues that prevent us from mass importing these scores from places like Metacritic (Q150248) and Rotten Tomatoes (Q105584)? What about aggregations of user reviews available on Goodreads (Q2359213)? There might be ToS issues with some sources but do we care? Also, how do we deal with the fact that these scores change over time? I don't think we can keep multiple timestamped scores from the same source sanely. BrokenSegue (talk) 19:45, 23 September 2020 (UTC)

Model item for criminals

What do you others think about this modeling? Too many qualifiers in 'significant events'? Paul Moore (Q99343501)--Trade (talk) 07:37, 17 September 2020 (UTC)

  • I see no reason to have a significant event for the conviction as that's already covered by convicted of (P1399).
place of detention (P2632) is currently not used and if it would be used with start time (P580) you wouldn't need imprisonment as significant event.
charge (P1595) is currently not used.
Using occupation (P106) for something that a person did once in their life feels wrong. The person likely has an occupation in which he spent much more time than in any of the three currently listed. ChristianKl 08:26, 17 September 2020 (UTC)
So, when am I allowed to use murderer as an occupation?--Trade (talk) 14:02, 17 September 2020 (UTC)
That's not an occupation, is it? I mean an occupation is something that gives an income. If you are a hired killer, that's an occupation.--So9q (talk) 10:26, 24 September 2020 (UTC)
We don't know what counts as an occupation, see Wikidata:Project_chat/Archive/2020/01#Occupation=photographer. I don't think being paid is a necessary requirement, not everybody is part of a market economy. Ghouston (talk) 10:53, 24 September 2020 (UTC)
Not to mention Bill Gates (Q5284) with occupation philanthropist (Q12362622) ... not a big source of income, I'd say. Ghouston (talk) 10:59, 24 September 2020 (UTC)

I'm trying to understand the difference between the above three ways of classifying collections of things. My current understanding is as follows:

However sets are class (Q5127848), so should part of (P361) be a subproperty of instance of (P31)? Or should we say elements are instance of (P31) sets? Should groups be instance of (P31) class (Q5127848)? At the moment metaclass (Q19478619) subclass of (P279) class (Q5127848) of class (Q16889133). Should it instead be metaclass (Q19478619) subclass of (P279) class (Q16889133) of class (Q5127848) so set (Q36161) can be instance of (P31) metaclass (Q19478619)? (edited to include class (Q16889133)) --Cdo256 (talk) 06:26, 18 September 2020 (UTC)


Maybe looking at the (English language) descriptions (in addition to the statements on these items) can help:
  • set (Q36161): well-defined mathematical collection of distinct objects
  • group (Q16887380): well-defined, enumerable collection of discrete entities that form a collective whole
  • class (Q5127848): group of things derived from extensional or intensional definition (philosophy)
--- Jura 06:33, 18 September 2020 (UTC)
I got confused between class (Q16889133), and class (Q5127848), but I meant class (Q16889133).
The en description of class (Q16889133): collection of items defined by common characteristics is very terse and doesn't distinguish itself well from group (Q16887380): well-defined, enumerable collection of discrete entities that form a collective whole.
I guess what I really want is examples of:
--Cdo256 (talk) 06:57, 18 September 2020 (UTC)


Cdo256, these are really good questions; I've been considering some of the same items and how to clarify the distinctions between them. So, what are the differences between set (Q36161), group (Q16887380), class (Q16889133), and class (Q5127848) and when should or shouldn't each be used? (I will also include class (Q217594) in my response).
  • Based on the linked Wikipedia page, class (Q16889133) is specifically a class in a "knowledge representation" (i.e. an ontology). Thus Q16889133 is a good item to use when talking about Wikidata classes. You can be more specific, however, by using metaclass (Q19361238) for classes of classes and first-order class (Q21522908) for classes of instances.
  • The item class (Q5127848) refers to the philosophical concept of a "class," and is broader than class (Q16889133). A class has to actually be included in an ontology to be an instance of class (Q16889133) (which presumably means its instances have some shared characteristic) but every collection of things is a class (Q5127848), so the set "{1, apple, every left shoe}" is a class (Q5127848) but not a class (Q16889133).
  • The Wikidata items class (Q217594) and set (Q36161) are types of mathematical objects that should be used only within the scope of mathematics. There are several formal (not-quite-equivalent) definitions of "class" and "set," but for the sake of understanding the difference between a set and a class, a "set" is defined such that it generally matches your intuition for a collection of items except that certain collections are prohibited. You cannot, for example, have a set that contains all sets that don't contain themselves (otherwise you break mathematics!). A "class" is then either a set or a collection of sets that is not itself a set. (An example of a class that is not a set is the class that contains all sets.) See Wikipedia:Class (set theory) for details.
  • To reverse engineer a definition for group (Q16887380), I translated all the descriptions to English. A summary of common descriptions is "entities with similar characteristics," "set of things or people," "group of living things," "two or more objects," or "entities with similar characteristics and coexistence." (My favorite Google translation, however, is: "what is and what is what is" 😂) My assessment, then, is that a "group" is an exhaustive collection of two or more concurrent physical things (real or fictional) that have a mutual association or defining characteristic. So examples of groups are: The Beatles (Q1299), the stars in our galaxy, and Bonnie and Clyde (Q219937). Examples that are not groups are John Lennon (only one item), the set {John Lennon and Paul McCartney} (not exhaustive), the set of real numbers (not physical), presidents of the United States (not concurrent), and the set { New York City (Q60) and Julius Caesar (Q1048) } (no association). So every group is a class (Q16889133), but not every class (Q16889133) is a group.
I'll leave the question of the difference between part of (P361) and instance of (P31) for another time and/or person :)
The-erinaceous-one (talk) 10:53, 18 September 2020 (UTC) (please ping me in responses)
The-erinaceous-one, thank you for the very detailed response. That's cleared up my main confusion. I'll have to have a think about this for a few days for it to sink in properly. --Cdo256 (talk) 02:52, 19 September 2020 (UTC)
Cdo256, no problem! I appreciated the nudge to start sorting out our various types of collections on Wikidata.
My assessment of group (Q16887380) was a bit off, however: it matched the descriptions, but looking at all the direct subgroups [18], we find that groups can also contain events and abstract objects, and there are subclasses set of 0 (Q39604693) and monad (Q39604065) which have fewer than 2 items by definition. This seems to be the result of inconsistent modeling, however, and I am trying to fix it.
In addition to the items mentioned above, I've discovered that there's also class (Q28813620): collection of items defined by common characteristics, which is not clearly defined at all. So this all goes to say that there's a lot of muddled modeling when it comes to collections on Wikidata. In order to organize all the information about these various types of collections, I've made a page where I'll be trying to sort this all out: User:The-erinaceous-one/types of collections. I would welcome any contributions!
The Erinaceous One 🦔 09:50, 19 September 2020 (UTC)
  • More generally, Wikidata doesn't work that well for general abstract concepts with several definitions for the same/similar names. This can be because the Wikipedia articles tend to combine them, because various Wikipedia versions combine them differently or because different Wikidata contributors try to insert these elements in various ways into the P279-tree. It can be solved, but generally requires creating new well-defined and described items and having some sitelinks on items for Wikimedia page relating two or more distinct concepts (Q37152856). Individual instances are generally better maintained (and easier to maintain). --- Jura 08:45, 19 September 2020 (UTC)
  • Yes, it's difficult to create really clear modeling in Wikidata (and probably anywhere), especially when we're dealing with foundational abstract concepts in the ontology. I think we'll be able to sort out the various types of collections, though, in such a way that they are defined clearly and editors understand how to use them. For example, the description for group (Q16887380), "summarizes entities with similar characteristics together," has never been revised and clearly has room for improvement. — The Erinaceous One 🦔 07:35, 21 September 2020 (UTC)
There's the nice Q+-template that gives class (Q16889133): collection of items defined by common characteristics / class (Q5127848): group of things derived from extensional or intensional definition (philosophy) / class (Q16889133): collection of items defined by common characteristics. This discussion would likely be easier to read if the template were used more where it is otherwise necessary to click on the item to know which is meant. ChristianKl 18:07, 24 September 2020 (UTC)

Sci-hub.st is in the spamfilter blocklist

Why? Is it an error? I was trying to use it as a reference URL on Sci-Hub (Q21980377)--So9q (talk) 11:52, 23 September 2020 (UTC)

Could you add more details? I have no issues with this domain[19]. --Lockal (talk) 12:40, 23 September 2020 (UTC)
I still get this: "An error occurred while saving. Your changes could not be applied. The save has failed. The page you wanted to save was blocked by the spam filter. This was probably caused by a link to a blacklisted website. The following text triggered our spam filter: sci-hub.st/". I use the palemoon browser and it usually works fine.--So9q (talk) 14:00, 23 September 2020 (UTC)
I do not know who put this site on the spamlist, but as far as I am concerned it must remain there since its material massively violates all kinds of intellectual property rights.--Ymblanter (talk) 16:02, 23 September 2020 (UTC)
why would that matter to whether we can link there? BrokenSegue (talk) 16:16, 23 September 2020 (UTC)
Hosting direct links on our items about academic papers that point, via full work available at URL (P953), to a paper on Sci-Hub might open us up to being charged with violating copyright. It's worth noting that the spam filter doesn't seem to block "Sci-hub.st" but "Sci-hub.st/". ChristianKl 17:55, 24 September 2020 (UTC)

"Last, First" for references rather than "First Last"

I'm trying to build out a complete reference so that I'll be able to use Wikidata for a Wikipedia application. However, for the author (P50) field, it displays on Wikipedia as "First Last", rather than "Last, First" as Wikipedia style normally dictates. I've filled out the first and last name fields on the author's item, so Wikidata should be able to handle this, but it doesn't seem able to yet. Could this be addressed? {{u|Sdkb}}talk 08:01, 19 September 2020 (UTC)

Using en:Template:Cite Q? It's using the author (P50) field on the publication (and you'd have to check the template source code to find what it takes from the author item), but it's an issue for that template specifically. There must be lots of such examples throughout Wikipedia. It says on the documentation:

Order of precedence for rendering author names:

   stated as (P1932) qualifier on author (P50)
   author name string (P2093)
   author (P50) label in English
   author (P50) label in any other language

Ghouston (talk) 08:35, 19 September 2020 (UTC)
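The order of precedence quoted above can be sketched as a simple fallback chain. This is only an illustration, not the actual code of en:Template:Cite Q or en:Module:Cite Q: the function name and the dictionary keys (`stated_as`, `author_name_string`, `author_labels`) are hypothetical stand-ins for data that would really come from the publication's statements.

```python
from typing import Optional


def render_author_name(entry: dict) -> Optional[str]:
    """Pick a display name for one author entry of a publication.

    `entry` is a hypothetical, simplified record:
      stated_as          -> value of a stated as (P1932) qualifier on author (P50)
      author_name_string -> value of author name string (P2093)
      author_labels      -> {language code: label} of the author (P50) item
    """
    # 1. stated as (P1932) qualifier on author (P50)
    if entry.get("stated_as"):
        return entry["stated_as"]
    # 2. author name string (P2093)
    if entry.get("author_name_string"):
        return entry["author_name_string"]
    labels = entry.get("author_labels") or {}
    # 3. author (P50) label in English
    if labels.get("en"):
        return labels["en"]
    # 4. author (P50) label in any other language (sorted only to be deterministic)
    for lang in sorted(labels):
        if labels[lang]:
            return labels[lang]
    return None
```

Note that none of these steps reorders the name: whatever string is found is returned as written, so "First Last" input stays "First Last".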

@Ghouston: Thanks for the link. I'm not sure how I'd use that template within w:Template:Wikidata, and in any case, it doesn't appear to handle the author name properly itself.
If you all want some incentive to fix this, the page where I'm trying to do this is on the path to soon becoming a featured list. The Wikidata links are much less likely to survive the upcoming FLC review if en-WP editors can point to a way in which the Wikidata-derived citations are inferior than if they display identically. So this issue will potentially determine whether or not we're able to get some featured-level Wikidata-integrated content on en-WP. {{u|Sdkb}}talk 03:24, 20 September 2020 (UTC)
I don't think that using en:Template:Wikidata to generate text, as on that page, is permitted in the English Wikipedia in any case (en:Wikipedia:Wikidata#Appropriate_usage_in_articles). At best, Wikidata can supply data for infoboxes, and references e.g., via en:Template:Cite Q. I guess the names in a citation should be taken as written in the work itself, instead of generated from everything that we may know about the author. But trying to convert a value from object named as (P1932) automatically may be complex and error-prone. The text I quoted from Cite Q above is actually from the "issues" section on the template, so may not be the way it's currently done. Another listed issue is "Author name should display as "Last, First Middle" to match Wikipedia house style". The template authors have probably realised that it's difficult. Ghouston (talk) 04:17, 20 September 2020 (UTC)
@Ghouston: I just added that last listed issue earlier today haha; there wasn't anything previously. I'm not sure who to ping who might be interested in working on this, but there are probably quite a few instances of w:Template:Wikidata that include a reference with an author name (and if there aren't currently, we hope there ultimately will be), so fixing this will have a widespread impact. {{u|Sdkb}}talk 06:53, 20 September 2020 (UTC)
Well, I think the only way it could be done reliably is by specifying the names manually in the right order, either in a new qualifier for the publication items in Wikidata, or in Wikipedia as a parameter to the template. Ghouston (talk) 08:03, 20 September 2020 (UTC)
en:Template:Wikidata uses en:Module:Cite Q to display some references (those with the right properties). It may also compile reference statements in a best-effort way (those which have divergent properties). In that case, it simply takes the label from the author's Wikidata item. To convert that into "Last, First" is tricky indeed, as there's not one rule which can be applied to all names (think of first names that consist of two detached parts and last names consisting of multiple parts). It would help if all author items have a given name (P735) and a family name (P734) statement, but that's not the case (also because not all given and family names have a Wikidata item). And if you would want to use those statements and retrieve them, then that means that you would first need to load the author's Wikidata item, which is an expensive operation. Maybe an option to give a Wikidata item a person's name (with one field for the given name and one for the family name) instead of a single label would be a nice feature. But for now, some best-effort approach may be thought of perhaps. Also pinging @Mike Peel as he's planning to improve en:Module:Cite Q soon. Thayts (talk) 17:56, 20 September 2020 (UTC)
Using given name (P735) and family name (P734) wouldn't be a reliable way to reproduce an author's name, as written on a particular text. It may not even be in the same script, or if you use the English label instead of the native version, it may not be the same transliteration. If Jane Brown writes an article, and puts her name on it like that, would you really want to reference it as "Brown, Jane Elizabeth de Pfeffel", just because the middle names are available in Wikidata? What if she changed her surname in the meantime? Ghouston (talk) 23:03, 20 September 2020 (UTC)
Loading the author's Wikidata item would be a cost that would probably be paid anyway, to allow linking to the author's Wikipedia page if they have one. Ghouston (talk) 23:09, 20 September 2020 (UTC)
That does not require the whole item to be loaded, so that's less expensive. Regarding author name changes etc., maybe we would need to have separate string-based properties for author names, but that would be a maintenance hell. Thayts (talk) 09:00, 21 September 2020 (UTC)
They would only have to be set on items that are used as references in Wikipedia, and only when somebody cares enough to want perfect formatting. Once set they won't need to change, so no maintenance is needed. Ghouston (talk) 09:46, 21 September 2020 (UTC)
It'd still be a violation of w:Don't repeat yourself, create busywork for perfectionists, etc. So generally a bad idea. {{u|Sdkb}}talk 23:55, 21 September 2020 (UTC)
Although, I suppose it's not just Wikipedia (in its various languages with different requirements), but anybody who wants to generate reference lists from Wikidata would have the same problem. Ghouston (talk) 09:52, 21 September 2020 (UTC)
Maybe there's a way that name formats could be put on a person item, to avoid repetition. We could have a new property "name style" which would have a multilingual string similar to object named as (P1932), and matching the way that the name is written in works (books, films, whatever). E.g., for Iain Banks (Q312579) the first such statement could be "Iain Banks" and the second "Iain M. Banks". This property would have qualifiers like "name format last first" which would take a monolingual string value, in this case "Banks, Iain" (English). The algorithm for formatting an author name on a given publication would be: a) for each name associated with a work, get a name style, from a "name style" qualifier if set, otherwise from object named as (P1932) (if neither is set, use the label of the author item following a language fallback list); b) find a matching "name style" statement on the author item, pick the preferred format qualifier, e.g., "name format last first" or "name format last initials", and select the preferred language by following a language fallback list. So then you can get a formatted name like "Banks, Iain M." from Consider Phlebas (Q261728) or "Liu, Cixin" from The Three-Body Problem (Q607112), or something different if you were prioritizing Chinese. Ghouston (talk) 00:50, 22 September 2020 (UTC)
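As a rough illustration of steps a) and b) in the proposal above, here is a minimal sketch. Everything in it is an assumption made for the example: the "name style" property and its format qualifiers do not exist in Wikidata, and the dictionary layout, the key names (`name_style`, `stated_as`, `name_styles`, `formats`, `last_first`) and the sample data are invented. For simplicity the name style value is modelled as a plain string rather than a multilingual one.

```python
from typing import Mapping, Optional, Sequence


def first_in_fallback(strings: Mapping[str, str], languages: Sequence[str]) -> Optional[str]:
    """Return the first non-empty string found by walking the language fallback list."""
    for lang in languages:
        if strings.get(lang):
            return strings[lang]
    return None


def format_author(work_author: dict, author_item: dict,
                  format_key: str = "last_first",
                  languages: Sequence[str] = ("en",)) -> Optional[str]:
    # Step a: take the name style from the (hypothetical) "name style" qualifier,
    # otherwise from the "stated as" value recorded on the work.
    style = work_author.get("name_style") or work_author.get("stated_as")
    if style is None:
        # Neither is set: use the author item's label via the fallback list.
        return first_in_fallback(author_item.get("labels", {}), languages)
    # Step b: find the matching "name style" statement on the author item and
    # pick the requested format in the preferred language.
    for statement in author_item.get("name_styles", []):
        if statement["value"] == style:
            formats = statement.get("formats", {}).get(format_key, {})
            formatted = first_in_fallback(formats, languages)
            if formatted:
                return formatted
    # No matching statement or format: return the name as written on the work.
    return style


# Invented sample data for the two authors mentioned in the proposal.
banks = {
    "labels": {"en": "Iain Banks"},
    "name_styles": [
        {"value": "Iain Banks", "formats": {"last_first": {"en": "Banks, Iain"}}},
        {"value": "Iain M. Banks", "formats": {"last_first": {"en": "Banks, Iain M."}}},
    ],
}
liu = {
    "labels": {"en": "Liu Cixin", "zh": "刘慈欣"},
    "name_styles": [
        {"value": "Liu Cixin", "formats": {"last_first": {"en": "Liu, Cixin", "zh": "刘慈欣"}}},
    ],
}
```

With this data, `format_author({"stated_as": "Iain M. Banks"}, banks)` selects the second name style, and passing `languages=("zh", "en")` for the `liu` item prioritizes the Chinese form.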
@Ghouston: May I hope that you did not mean to suggest that we would translate the Scottish Iain to Ian for an English-language label, and that was just a typo? - Jmabel (talk) 01:51, 22 September 2020 (UTC)
Oh no, I started out with "Ian" throughout and then noticed the mistake and fixed most of them. Putting a language label on a name, especially when it's "en" used as a default for all kinds of names, will always look wrong though. But I wouldn't like it to be like labels, where there are dozens of identical labels for the various languages using Latin-based scripts. Ghouston (talk) 04:03, 22 September 2020 (UTC)
But still, how would you handle changed names and maybe pseudonyms for different works by the same person? Thayts (talk) 07:51, 22 September 2020 (UTC)
On the item for the person, there'd be a separate "name style" statement for each name variant. Perhaps the details could be improved somehow, but basically there's a collection of formatted names on the person's item, and a method of selecting the one required. Ghouston (talk) 08:39, 22 September 2020 (UTC)
Looking at Douglas Adams (Q42), we have "Douglas" and "Noel" as given names, with series ordinal (P1545) as qualifiers and "Douglas" given preferred rank, and of course "Adams" as family name. Should we set a standard that the preferred rank would mean translate that to "Adams, Douglas" rather than "Adams, Douglas Noel"? With that, I think we'd have all the needed information already in the item.
Does anyone know how the "Last, First" format typically works for countries in which the last name is normally listed first? I.e. is it "Mao, Zedong" or just "Mao Zedong"? If it's the latter, we might have an issue. But that is I think a level of detail that even Wikipedia citations aren't yet built to handle; if you do {{cite book|last1=Mao|first1=Zedong}}, it'll spit out "Mao, Zedong". That's the main edge case I can think of. I also recently encountered an instance at Frank Parkhurst Brackett where I got a value type constraint error for listing "Parkhurst" as the second given name (what I think we're supposed to do for middle names) because it's classified as a family name, but that's tangential. {{u|Sdkb}}talk 17:05, 22 September 2020 (UTC)
I don't think this is a problem. But for non-Latin names, one common issue is that there could be multiple Latin transliterations for a name, and the person may use a specific transliteration for their name. For instance, the family name of Steven Chu (Q172466) is Zhu (Q13391907), which is most commonly transliterated as "Zhu" nowadays, but he uses "Chu" instead. --Stevenliuyi (talk) 20:40, 22 September 2020 (UTC)
The preferred rank shouldn't be used like that, because "Noel" is just as valid as "Douglas". Besides that, it also offers no help if you want to match the specific form of the name on a particular work, e.g., "Iain M. Banks", when M. is a made-up initial used on only certain works, or works by "Richard Bachmann" (Q39829; do you cite them with the pseudonym or the real name? I have no idea), or for people who change their names when you want to cite the earlier works with the original name. I'm assuming you want to cite works using the name as given on that work, although I don't know for sure that that's the preferred practice. Ghouston (talk) 22:33, 22 September 2020 (UTC)
George Eliot (Q131333) is another that comes to mind, should her works be cited as "Evans, Mary"? Ghouston (talk) 22:38, 22 September 2020 (UTC)
"Douglas", "Noel" ought to be handled as a sequence, not as one being more preferred. - Jmabel (talk) 22:40, 22 September 2020 (UTC)
Yes, they already have series ordinal (P1545) qualifiers, and that's enough. Ghouston (talk) 22:57, 22 September 2020 (UTC)
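To make the "sequence" reading concrete, here is a small sketch that orders given names by their series ordinal (P1545) qualifier and builds a "Last, First Middle" string. The function name and the input layout are hypothetical, and the sketch deliberately ignores the harder cases raised in this thread (pseudonyms, name changes, transliterations, names as written on a specific work).

```python
def last_first(given_names: list, family_name: str) -> str:
    """Build "Family, Given1 Given2" from given names carrying series ordinals.

    `given_names` is a hypothetical list of dicts such as
    {"name": "Douglas", "series_ordinal": "1"}; series ordinal values are
    stored as strings in Wikidata, so they are converted to int for sorting.
    """
    ordered = sorted(given_names, key=lambda g: int(g["series_ordinal"]))
    given = " ".join(g["name"] for g in ordered)
    return f"{family_name}, {given}" if given else family_name


# Given names in arbitrary order, as they might be returned for Douglas Adams (Q42).
adams_given = [
    {"name": "Noel", "series_ordinal": "2"},
    {"name": "Douglas", "series_ordinal": "1"},
]
print(last_first(adams_given, "Adams"))
```

Whether the result should be trimmed to the first given name only is exactly the open question in the comments above; this sketch keeps the full sequence.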
Steven Chu was born in the United States, and maybe his birth certificate reads "Chu". Perhaps family name (P734) should be set to a surname item for Chu in the Latin alphabet, which doesn't seem to exist at present. Ghouston (talk) 23:13, 22 September 2020 (UTC)
It would be harder to decide which family name to use for those who have more mixed background, such as Charles K. Kao (Q16389) (whose family name is Gao (Q713706)). I guess maybe we could set multiple family name (P734) and use qualifiers to differentiate? --Stevenliuyi (talk) 00:48, 23 September 2020 (UTC)
Yes, multiple versions of the surname are probably needed in some cases, such as when an author has had works published in multiple scripts. Ghouston (talk) 01:17, 23 September 2020 (UTC)
  • I'm surprised nobody here has suggested that the enwiki "Last, First" standard should be abandoned. Names are far harder than any normal person assumes - see this page of falsehoods. The only reliable programmatic approach to handling a name is to spit it out exactly as it was originally provided. On works that means use author name string (P2093) or the object named as (P1932) qualifier on author (P50) and hope that whatever process got those values into Wikidata didn't already munge the names (which sadly often happens). ArthurPSmith (talk) 13:27, 24 September 2020 (UTC)
    @ArthurPSmith: Wow, that essay is a humbling read for anyone who, like me (and I assume most of us here), likes to fit things neatly into boxes. False assumption 18 is speaking directly to us: People’s names have an order to them. Picking any ordering scheme will automatically result in consistent ordering among all systems, as long as both use the same ordering scheme for the same name. Number 30 is also relevant: There exists an algorithm which transforms names and can be reversed losslessly. (Yes, yes, you can do it if your algorithm returns the input. You get a gold star.) It's a reason to be very glad we use Q identifiers, where the only assumption is one person per person.
    Regarding the circumstances here, w:Template:Wikidata is still obviously in beta, so our goal shouldn't be to get something that'll work perfectly in every possible situation, but rather something good enough that it's capable of spitting out "Seery, John Evan" when presented with John Evan Seery (Q97940622) (possibly with a parameter flipping the switch), and future-proof enough that it can be improved over time. I could always replace author (P50) with author name string (P2093) at the reference being drawn from at Pomona College (Q7227384), but that's not future-proof, since it'd disconnect the reference from the author's item. {{u|Sdkb}}talk 22:04, 24 September 2020 (UTC)
    Thanks for posting the "Falsehoods Programmers Believe About Names" link, it's a classic. It seems that this discussion is still at the stage that some people still think that given name (P735) and family name (P734) are sufficient. The suggested alternative is to declare that producing references, or reference lists, with names formatted to a particular Wikipedia or academic standard is too hard, and we don't even try. The alternative that I proposed above was to allow names, formatted manually into different styles, to be stored in the author's item and selected as required. But that requires a whole new mechanism. Ghouston (talk) 22:37, 24 September 2020 (UTC)

Initials for a person

I have a question: how do I model middle name initials for a person? I have David Tabb (Q42887966), which is stated as "David L. Tabb" in ORCID, but we don't know what the middle name actually is. Is there a way to model this? Should I use David Tabb (Q42887966) given name (P735) L. (Q19803509) for this? Why wasn't this properly imported from ORCID? --Hannes Röst (talk) 16:05, 23 September 2020 (UTC)

Not sure on how to model it; however many of our researcher items with ORCID links had the names actually imported from elsewhere (PubMed records for example), so the names often differ at least slightly from the name as recorded in ORCID. This isn't always a bad thing - sometimes the name publicly shown in ORCID is just a first name for example. However, I have run into some cases where the link was clearly wrong for some reason, i.e. a completely different name (a coauthor on some article, and the wrong name linked for some reason). ArthurPSmith (talk) 17:19, 23 September 2020 (UTC)

Label/description conflict with a deleted item?

I wanted to fix some labels/descriptions on Nicki Nicole (Q75211884), but when I try to change the Spanish label and description to "Nicki Nicole" and "cantante argentina", I get the error message "Could not save due to an error. Item Q67179790 already has label "Nicki Nicole" associated with language code es, using the same description text." The problem here is that Q67179790 was deleted last year. And using the preview of this post, I can now see that the link to the deleted item somehow still shows the label. Tried it with some other recently deleted items, but can't reproduce the problem with those. Is this a known bug? --Kam Solusar (talk) 06:58, 24 September 2020 (UTC)

Seems weird. WD:DEV is the venue for this. --Matěj Suchánek (talk) 08:34, 24 September 2020 (UTC)
I happened to see it here anyways :) curious indeed, I filed T263730 for it. (I think I know what’s going on, and if I’m right, it should only affect old items. I’ll post more details on Phabricator.) --Lucas Werkmeister (WMDE) (talk) 10:08, 24 September 2020 (UTC)
As a work-around, you can change the description first, then change the label. Maybe include the date of birth, or just reword it. Ghouston (talk) 22:50, 24 September 2020 (UTC)
Yes, of course. That's what I'd normally do. But since there's obviously a bug involved in this case, I decided not to change it for now to make it easier to reproduce the bug. --Kam Solusar (talk)

Items concerning autism

Hi,

I noticed last night that things are misconnected with our autism-related articles: the interwikis are mixed up.

We've got classic autism (Q38404) and the autism spectrum (Q1436063) mixed up with each other; at least the English, German and Dutch articles don't match. Also Q1104126 (Kanner's syndrome) seems to be in the mix. The diagnostic manual DSM was changed in 2013, and translations of this change came out only in recent years, which may be why. Can someone untangle this? Ciell (talk) 07:30, 20 September 2020 (UTC)

Sooo... someone, maybe? I'm available through Telegram and can get on IRC to do this together, Ciell (talk) 20:36, 23 September 2020 (UTC)
Found the problem: there appears to be no German article on the Autism Spectrum yet (Q1436063), which got me all confused. This question can go into the archive now! Ciell (talk) 19:07, 25 September 2020 (UTC)

Quickstatements does not work in palemoon

Hi, has anyone else had problems getting it to start? I found this error in the console after clicking "run":

TypeError: d.command is undefined
quickstatements.toolforge.org:218:10
	runSingleCommand/< https://quickstatements.toolforge.org/:218:10
	u https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.3.1/jquery.min.js:2:27452
	fireWith https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.3.1/jquery.min.js:2:28202
	k https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.3.1/jquery.min.js:2:77649
	n/< https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.3.1/jquery.min.js:2:79907

--So9q (talk) 07:47, 25 September 2020 (UTC)

I tried doing exactly the same steps from PetScan->QS in Chrome and it worked without errors in the console.--So9q (talk) 07:51, 25 September 2020 (UTC)

Invitation to endorse a WikiCite grant application to mass upload DBLP and OpenCitations to Wikidata

Hello, I am applying for a WikiCite grant to mass upload the DBLP and OpenCitations scholarly databases to Wikidata. I invite you to endorse the proposal at https://meta.wikimedia.org/wiki/Wikicite/grant/Adding_support_of_DBLP_and_OpenCitations_to_Wikidata. WikiCite requires several endorsements from the Wikimedia Community to fund grant proposals. This is mainly to ensure that the mass upload of data will not cause controversy among the Wikimedia Community. --Csisc (talk) 19:08, 25 September 2020 (UTC)

Asking for some help in my import project

Hello - This is my first import project, and I am not clear on what the spreadsheet should look like to make the data import as easy as possible. My intention is to populate the hardiness zone (P8194) property for US cities. I have a spreadsheet of US cities with their QIDs and labels, and I used the postal code data to create a mapping to USDA Plant Hardiness Zones. I have both the IDs and labels of the Zones in the spreadsheet here https://lite.framacalc.org/9j5d-usda-hardiness-zones-jdc. Here is a sample of the data in the spreadsheet

Unique ID | Name | zipcode | zone | zone_qid
http://www.wikidata.org/entity/Q60 | New York City | 11004 | 7b | Q96279219
http://www.wikidata.org/entity/Q60 | New York City | 11005 | 7a | Q96578142


I am looking at this step in the import process: https://www.wikidata.org/wiki/Wikidata:Data_Import_Guide/Step_5:_Format_the_data_to_be_imported. Ideally, the 'Unique ID' will be the subject in a statement with the P8194 property and the zone_qid column would be the object. Should I remove the 'zipcode' and 'zone' columns and just rename the 'zone_qid' to 'hardiness zone' and consider it good? Looking at the example spreadsheet in Format the data to be imported, I don't see any property IDs, so should I just use the labels?

In addition, what about adding a postal code qualifier to the statement, since there might be multiple zones associated with a city? This usage doesn't look in line with the property constraints on postal code (P281), though. Is there something I need to do to make that happen? And how do I indicate the qualifier in the spreadsheet?

Thanks for any help! (Still trying to figure out how to sign this properly, will fix as soon as I know.)  – The preceding unsigned comment was added by Jdeecooper (talk • contribs) at 19:07, 25 September 2020 (UTC).

You can sign by adding ~~~~ at the end of a post, it will be converted automatically. - Jmabel (talk) 23:13, 25 September 2020 (UTC)
@John Cummings: - Jmabel (talk) 23:14, 25 September 2020 (UTC)

thanks much! Jdeecooper (talk) 23:22, 25 September 2020 (UTC)
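One possible way to prepare the data above, sketched under assumptions: instead of the Data Import Guide spreadsheet, the rows are turned into QuickStatements V1 commands (tab-separated item, property, value, then optional qualifier pairs). The column names come from the sample in the post; using postal code (P281) as a qualifier is only the idea floated there and may still conflict with that property's constraints. The function names are made up for illustration.

```python
import csv
import io

HARDINESS_ZONE = "P8194"  # hardiness zone, the property named in the post
POSTAL_CODE = "P281"      # postal code, used as a qualifier here (an assumption)

# The two sample rows from the post, written as CSV for this sketch.
SAMPLE = """Unique ID,Name,zipcode,zone,zone_qid
http://www.wikidata.org/entity/Q60,New York City,11004,7b,Q96279219
http://www.wikidata.org/entity/Q60,New York City,11005,7a,Q96578142
"""


def row_to_command(row, with_qualifier=True):
    """Turn one spreadsheet row into a tab-separated QuickStatements V1 line."""
    subject = row["Unique ID"].rsplit("/", 1)[-1]  # keep only the QID
    parts = [subject, HARDINESS_ZONE, row["zone_qid"]]
    if with_qualifier:
        # String values are wrapped in double quotes in V1 syntax.
        parts += [POSTAL_CODE, '"{}"'.format(row["zipcode"])]
    return "\t".join(parts)


def sheet_to_commands(text, with_qualifier=True):
    """Convert the whole CSV text into a list of command lines."""
    reader = csv.DictReader(io.StringIO(text))
    return [row_to_command(row, with_qualifier) for row in reader]


if __name__ == "__main__":
    for line in sheet_to_commands(SAMPLE):
        print(line)
```

The label columns ('Name', 'zone') are not needed in the output, since the commands refer to items and properties by ID only.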

Handling obsolete official website (P856)

Hi. I am trying to tidy up Life&Style (Q1824415), which an IP user tried to blank in error and merge with an English language edition of the same magazine. As part of the clean up I noticed that the official website (P856) is no longer valid. According to dewiki, the publication ceased in 2012 and the webarchive copy shows the link has been redirecting visitors to an alternate site since at least 2013. The official website (P856) is still trying to redirect visitors but my browser throws up an error and stops me visiting the site (and I am not going to override my browser to see if the code on the current target is safe). However, there is an archive copy of the original site at https://web.archive.org/web/20090305130649/http://www.lifeandstyle-online.de/
What is the best way to handle this? Should I deprecate the old website (which is now redirecting visitors) or perhaps set the end time (P582) as 2012 to match dewiki's claim that it ceased publication? Where should I insert the archive copy? From Hill To Shore (talk) 20:06, 25 September 2020 (UTC)

Deprecation isn't recommended for items which were valid at some time but no longer valid, but end date should be set, even if it's just an unknown value. The archive link can be added as a qualifier. If there was a new site, it could have preferred rank. Especially for entities like former publications, it's not unexpected that much of the data will be out-of-date (purely historical). Ghouston (talk) 22:07, 25 September 2020 (UTC)

Possible duplicated information

See Q3132861#P1343 and Q3132861#P4823. Now we have three ways to link an item to an ANB article (using described by source (P1343)=American National Biography (Q465854); using American National Biography ID (P4823); creating an item for the specific ANB article). In my opinion the second is useful (convenient for queries); the third can provide some meta-information about ANB articles, but what is the proper way to link such items from the subject item? @Gamaliel:.--GZWDer (talk) 04:04, 18 September 2020 (UTC)

Yes, I think when we have an identifier statement like American National Biography ID (P4823), then described by source (P1343) is redundant and should be omitted -- keep described by source (P1343) for sources without an identifier property. The statement is subject of (P805) qualifier on the American National Biography ID (P4823) statement for the item specifically representing this article is a nice touch, and IMO a good way to record this linkage. Jheald (talk) 20:28, 18 September 2020 (UTC)
I agree, generally when we add a new identifier we delete the now redundant way of linking. We remove "described at url" and "described by source". --RAN (talk) 21:35, 19 September 2020 (UTC)
I perfectly agree, in such cases described by source (P1343) is redundant and should be removed; some time ago I saw a different view here, but I think P1343 is as superfluous with American National Biography ID (P4823) as with Dictionary of Swedish National Biography ID (P3217). --Epìdosis 10:24, 26 September 2020 (UTC)

Data quality qualifiers

Hello, please feel free to review and complete the examples at https://www.wikidata.org/wiki/Property:P1480#P1855 Bouzinac💬✒️💛 11:07, 23 September 2020 (UTC)

This looks good. I've made one change, to pi (Q167) numeric value (P1181) 3.14 sourcing circumstances (P1480) average (Q54835811), to use approximately (Q60070514) instead, and added a description to average (Q54835811) so that it means what I'd expect average to mean, as it is something in the category containing mean, median, mode, and geometric mean. - cdo256 11:31, 26 September 2020 (UTC)
I don't think pi (Q167) numeric value (P1181) 3.14 sourcing circumstances (P1480) approximately (Q60070514) is appropriate, as approximately (Q60070514) is defined as "the source specified value and explicitly stated that value is a rough approximation or estimate". Appropriate values would be "rounded to 2 decimal places" or "rounded", but they should not be used with sourcing circumstances (P1480) right now. --Pyfisch (talk) 13:32, 26 September 2020 (UTC)

Citing a source: signs and information boards at the object

Wie gebe ich eine Quelle richtig an, wenn ich die Daten (z.B. Name, Künstler und Jahr einer Skulptur) von einem Schild vor Ort habe, wie z.B. hier unten im links im Bild zu sehen: https://commons.wikimedia.org/wiki/File:Evolver_Hans_Germer_Dorfmark.jpg Was gebe ich an, wenn die Quelle eine Informationstafel ist? Beispiel: https://commons.wikimedia.org/wiki/File:Tumulus_of_Vierde_(1),_Heidekreis,_2020.jpg RogerWiki (talk) 13:02, 25 September 2020 (UTC)

Edit: Sorry, I have forgotten to select the correct language. Here is the translation:

How do I specify a source correctly when I have the data (e.g. name, creator and year of a sculpture) from a plaque/label on site, as shown in the picture in the lower left corner: https://commons.wikimedia.org/wiki/File:Evolver_Hans_Germer_Dorfmark.jpg How do I specify a source if the source is an information board? Example: https://commons.wikimedia.org/wiki/File:Tumulus_of_Vierde_(1),_Heidekreis,_2020.jpg RogerWiki (talk) 13:15, 25 September 2020 (UTC)

hmm, well you could make a wikidata item for the sign, attach the image and then cite that item. or you could just use the URL of the image/commons page? not optimal. BrokenSegue (talk) 14:08, 25 September 2020 (UTC)
An approach as described in Help:Sources#Headstones at Commons would probably be the easiest. There are not many items in use with "type of reference" in reference qualifiers, so here is an overview of potential candidates. —MisterSynergy (talk) 15:17, 25 September 2020 (UTC)
Since the information board is the responsibility of the publisher of the board, it could be the organization placing the board, or the actual local organization behind the location of the board (not the same thing for art objects in museums: museums generally take responsibility for their own items in exhibitions and credit "home institutions" for objects on loan to them). For special exhibitions, then the source is the curator. The url to the info board can be used as a reference using reference URL (P854) as a qualifier. Jane023 (talk) 07:56, 26 September 2020 (UTC)
Thank you for your answers. Unfortunately, my English is not the best. So, there is no simple way to say: "I've got this information (e.g. creator) from a plaque located next to the sculpture" or "I've got the information from an information board on site" without having a photo/URL? RogerWiki (talk) 11:08, 26 September 2020 (UTC)

Can P31 properties have a preferred rank?

I noticed several cities, notably Cairo (Q85), Montevideo (Q1335), Copenhagen (Q1748) and Lima (Q2868), don't show up as an instance of (P31) of city (Q515). It turns out they are listed as cities, but the capital city (Q5119) value has the preferred rank set, so it supersedes all other values. Is this correct? I changed it for other cities, but these are semi-protected.

Svízel přítula (talk) 21:01, 15 September 2020 (UTC)

I suppose preferred ranks would be needed if there were instance of (P31) statements that were outdated and no longer valid, but then all the currently valid statements should be preferred. I don't see any reason to give preferred rank in a case like capital vs city. Ghouston (talk) 01:52, 16 September 2020 (UTC)
An example of outdated-vs.-permanent P31 values is Dadra and Nagar Haveli district (Q46107). But the historical value is not marked as deprecated (nor the current value preferred). —Scs (talk) 11:53, 16 September 2020 (UTC)
When we use rank, we don't deprecate historical values but generally qualify them with end time (P582) and then set the current value as preferred. There are plenty of cases like Dadra and Nagar Haveli district (Q46107) where it would make sense to use ranks but we currently don't as nobody did the task to set the rank. ChristianKl12:01, 16 September 2020 (UTC)
With Dadra and Nagar Haveli district (Q46107), it may be better split into separate items, given significant changes in status. It also has a dissolved, abolished or demolished date (P576) which means that "current value" has no meaning. Ghouston (talk) 12:29, 16 September 2020 (UTC)
I'd also prefer using the dedicated property capital of (P1376) to indicate that relationship instead of instance of (P31). Ghouston (talk) 01:54, 16 September 2020 (UTC)
I agree. The property is better to communicate the information. ChristianKl12:01, 16 September 2020 (UTC)
As an aside, remember that straight P31 relationships are not generally a reliable way to query for is-a relationships, anyway. For example, Boston (Q100) isn't (directly) a city, either, and the reason has nothing to do with ranks. wdt:P31/wdt:P279* is your friend! —Scs (talk) 12:09, 16 September 2020 (UTC)
True, but that also won't work here, since capital city (Q5119) is not a subclass of city (Q515). I also query for "is a capital" using capital of (P1376) and "is a location" using coordinate location (P625), as that's faster. Svízel přítula (talk) 15:39, 16 September 2020 (UTC)
A big "meh" to P31=city in the United States (Q1093829) and anything like it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:42, 20 September 2020 (UTC)
Unfortunately there are many like this, including the beginnings of a hierarchy of them via city of Oregon (Q63440326), city in the state of New York (Q15063611) etc. --Oravrattas (talk) 05:49, 21 September 2020 (UTC)
Lovely. Let's recreate the entire set of Wikimedia project category trees, why not? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:21, 26 September 2020 (UTC)
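The query patterns mentioned in this thread can be sketched as small helpers that build SPARQL text for the Wikidata Query Service. This is only an illustration: the function names are invented here, and nothing is sent over the network. Note that the wdt: prefix matches only best-ranked statements, which is why a preferred capital city (Q5119) value hides a normal-rank city (Q515) value in such queries.

```python
def instances_query(class_qid, limit=100):
    """Build a query for items that are instances of class_qid or of any of its subclasses."""
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError("expected a QID such as Q515")
    lines = [
        "SELECT ?item WHERE {",
        # instance of (P31), followed by zero or more subclass of (P279) steps
        "  ?item wdt:P31/wdt:P279* wd:" + class_qid + " .",
        "}",
        "LIMIT " + str(int(limit)),
    ]
    return "\n".join(lines)


def capitals_query(limit=100):
    """Build a query for capitals via capital of (P1376) rather than instance of (P31)."""
    lines = [
        "SELECT ?item ?place WHERE {",
        "  ?item wdt:P1376 ?place .",
        "}",
        "LIMIT " + str(int(limit)),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(instances_query("Q515", limit=10))
    print(capitals_query(limit=10))
```

As noted above, the first pattern will still miss items whose only best-ranked P31 value is capital city (Q5119), since that class is not a subclass of city (Q515).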

Gitter room

Would anyone be interested in a gitter room for Wikidata (e.g. https://gitter.im/wikimedia/wikidata)? Currently there are some rooms under https://gitter.im/wikimedia.

IRC is a bit of a schlep in comparison and there is already a substantial linked data community on gitter with json-ld.org, linkeddata/chat shex and rdf4j to name a few. Iwan.Aucamp (talk) 16:34, 26 September 2020 (UTC)

Sitelinks for properties (not just items)

I'm new to Wikidata; please advise me where I can find an existing discussion on this topic if appropriate:

For context, I was looking over the item page for Visual Studio Code (Q19841877), then followed the subreddit (P3984) identifier property to see what it could do.

I noticed that while there is a Sitelink section for the item, there are no Sitelinks in the properties that I examined.

I understand that the semantics of a property don't, themselves, denote a connection to the reference (Wikipedia), but for many cases, a connection to a reference topic seems immensely helpful.

For this particular property, I happen to know what a subreddit is, and where it comes from. And yes, I can find links to the website at reddit.com. But there are properties, especially identifier properties, that are scoped to a domain (like this one is), and aren't general ontological properties (like "instance of", "inception", "authority" and so on). Some of these domains are unknown to me and, I'm sure, many others.

I thought that for some properties, a Sitelink entry would be appropriate. As an example, for this property, I'd expect to see a Sitelink to en.wikipedia.org/wiki/Reddit, or, at the very least, to the item Reddit (Q1136). RedCrystal (talk) 18:25, 26 September 2020 (UTC)

subreddit (P3984) has both a link to Reddit (Q1136) and subreddit (Q28464970). The relevant site links are placed on the items. From Hill To Shore (talk) 18:32, 26 September 2020 (UTC)
In the general case it's a mistake to assume that site-links fully tell you what an item is about. There are plenty of cases where different Wikis have different scopes for articles that are linked to the same Wikidata item. If you want to understand the scope of a property reading the property proposal discussion and the property talk page is where you find the information. ChristianKl19:34, 26 September 2020 (UTC)

correct way to indicate "mayor of [some place]"

working on: Imre Gálffy (Q1031047)

the subject was the mayor of a town; I tried using: position held (P39) mayor (Q30185) of (P642) Miskolc (Q102397)

It throws a "none of constraint" saying that position held cannot be "mayor". How does Wikidata like to represent this kind of information?

Does it want a separate Q entry for every mayorship in the world? (e.g. Mayor of New York City (Q785304), Mayor of London, Mayor of Podunk, etc.)? If so, is it ok to blithely create that item on the basis of just knowing one of the previous mayors, and not knowing much more about the government of the town?

Thanks -Kenirwin (talk) 00:31, 26 September 2020 (UTC)

It is expecting something like mayor of a place in Canada (Q98880424) or mayor of a place in India (Q98848902). I can't spot one for Hungary though, so that one may need a new item. From Hill To Shore (talk) 01:27, 26 September 2020 (UTC)
@Kenirwin: Yes, it's perfectly fine (and indeed recommended) to create a new item for "Mayor of (wherever)" even if you only currently know one officeholder for that, and nothing much else about it other than applies to jurisdiction (P1001). There are also tools, such as Petscan, that will allow us to, for example, set everyone in Kategória:Miskolc_polgármesterei to have a position held (P39) to such an item once it exists. --Oravrattas (talk) 07:50, 26 September 2020 (UTC)
@Oravrattas: -- thanks for this; I'd not heard of Petscan before, but it seems pretty great. Do you know of any good English-language instructions for doing the kind of thing you're talking about? It's pretty easy to search PetScan, but I'm finding it harder to understand how to use it to add properties/values in WikiData. I read the manual, but I'm finding it tough to understand. I've created mayor of Miskolc (Q99653360) and added it to the one entry, but PetScan finds about a dozen more where it would be relevant. Thanks! - Kenirwin (talk) 18:12, 26 September 2020 (UTC)
@Kenirwin: I'm afraid I don't really know if there are any good instructions anywhere: I've mostly figured it out from trial and error, and watching other people use it. The most important (but most easily overlooked) toggle when using it to create claims in Wikidata, is to make sure that "Use wiki: Wikidata" is selected in the "Other sources" tab, before running the "Do it" (yes, it's rather well hidden!) That will then give you a "Command list" box at the top of the result sets, where you can then enter something like position held (P39):mayor of Miskolc (Q99653360), to apply that to all (or a selected subset — e.g. removing the "list of mayors of Miskolc" article first) of the results. (You don't need to worry about filtering out ones that might already have it: QuickStatements will skip anything that would create a duplicate statement.) --Oravrattas (talk) 20:33, 26 September 2020 (UTC)
@Oravrattas: -- that's fantastic -- thanks. Yeah, I was never gonna find that... Thanks so much for your help! - Kenirwin (talk) 00:29, 27 September 2020 (UTC)

String to external-id

Not sure where I should ask about this. We have Open Food Facts food additive ID (P1820), Open Food Facts food category ID (P1821) and Open Food Facts ingredient ID (P5930). Two of them are external-id; one is string datatype. Open Food Facts food additive ID (P1820) should be changed to external-id too. Wostr (talk) 16:28, 26 September 2020 (UTC)

iNaturalist-Taxon-ID duplicate Q15447825#P3151

Can someone please check that these two iNaturalist records indeed refer to the same plant? --Pyfisch (talk) 17:02, 27 September 2020 (UTC)

Help in mapping relationships between Pokémon, variants and items

Hi, I wrote this help request two years ago and nobody answered me. I try to explain everything:

  1. In generation 6, devs introduced a new game mechanic named "Mega evolution (Q16577590)": a Pokémon holding a Mega Stone (Q56676211) can change its form and also its types. This can only be done during a battle, and when the battle finishes or the Pokémon is KO'd, it returns to its normal form. I need to link items between original forms, their mega evolutions and their Mega Stones (e.g. Altaria (Q2345405), Mega Altaria (Q56676709) and Altarianite (Q56676294))
  2. In gen 7 and 8, devs introduced new Pokémon forms, regional variant (Q56707607), which change a Pokémon's form and types from the original ones. I need to link items between original forms and regional forms (e.g. Meowth (Q877650) and Alolan Meowth (Q79234337), and vice versa)
  3. Some moves or some class of moves, when used with a Z-Crystal (Q56676155), change to Z-Move (Q26206570). How to link them (e.g. Aloraichium Z (Q56706849), Thunderbolt (Q26156769) and Stoked Sparksurfer (Q26945322))?

Could someone help me? --★ → Airon 90 17:50, 27 September 2020 (UTC)

1) Looks similar to water vapor (Q190120). "Mega-Evolved Pokémon" is a subclass of (P279) Pokémon species (Q3966183). Mega Altaria (Q56676709) is an instance of (P31) "Mega-Evolved Pokémon", of (P642) Altaria (Q2345405).
Altarianite (Q56676294) here is similar to vaporization (Q6452502).
2) Looks like you are saying that Alolan Meowth (Q79234337) is a "regional form of Pokémon" of (P642) Meowth (Q877650).
3) Z-Move (Q26206570) uses (P2283) Z-Crystal (Q56676155), Stoked Sparksurfer (Q26945322) uses (P2283) Aloraichium Z (Q56706849). Stoked Sparksurfer (Q26945322) instance of (P31) Z-Move (Q26206570) of (P642) Thunderbolt (Q26156769)
--Lockal (talk) 09:52, 28 September 2020 (UTC)

Property proposal Birth rate

Can a property creator have a look at this proposal [20] birth rate and eventually make it ready? Pmt (talk) 08:52, 28 September 2020 (UTC)

Wikidata weekly summary #435

It seems to me that most identifiers on Kingdom of Denmark (Q756617) don't apply to this item, but to Greenland (Q223) or Faroe Islands (Q4628), possibly Denmark (Q35).

This applies (notably) to ISO 3166-1 alpha-2 code (P297) "GL" and "FO". For that property, it also violates a distinct-values constraint, as the codes are already on Greenland (Q223)/Faroe Islands (Q4628).

Accordingly they should be deleted from Q756617.

I will add a note at Wikidata:WikiProject Danmark, Talk:Q756617 and Property talk:P297. @BALMAINM, Fnielsen: --- Jura 09:55, 28 September 2020 (UTC)

Editing items with a form

Is it possible to edit an item with a form? I know that in Wikidata Cradle it is possible to create new items with forms, and from my point of view defining such a form is easy with that tool. I think it would also be helpful for editing items. A form is something editors are familiar with, so I think they would know how to edit the content. Is there a tool for that, or do you think this would be helpful? At the weekend I was at a meeting with editors of the FürthWiki, a local wiki; they use SemanticMediaWiki, and I think that forms are helpful for editing the data. At the moment editing Wikidata is not so easy, as they told me, and they said it is mainly because of the missing structure. A form can help to provide that structure. --Hogü-456 (talk) 19:33, 28 September 2020 (UTC)

Military specialty=sapper, rifleman, sniper, medic, cook, radioman

Should we create a new property called Military specialty, or some similar name to hold: chaplain, sapper, rifleman, sniper, medic, cook, radioman and other military occupations? Or should we model it as: conflict=World_War_II with subject_has_role=chaplain ? Let me know which one you think would work best. --RAN (talk) 18:13, 23 September 2020 (UTC)

@Richard Arthur Norton (1958- ), ChristianKl, Andrew Gray: I guess the problem here is: how would you query for the occupations a particular set of people had while they were serving in the military? Can this query generally be achieved, given a particular pattern of qualifiers? (And if so, how robust or brittle would the query be, if it were relying on such a pattern?) Or is another approach needed?
The key thing may be to make sure that all military occupations are identifiable as military occupations. So "military chaplain" rather than generic chaplain, so that the occupations are all in the subclass tree of an instance of military profession (Q6857706). (Also making sure that no non-military or ambiguous occupations -- eg hospital chaplains / generic chaplains -- can be found in that subtree). Not sure if that is achievable; but if it were, it would achieve what was needed for the query. Jheald (talk) 21:53, 28 September 2020 (UTC)
  • Currently we list occupation=soldier, so we can query for military people as a profession. I think occupation=cook or [occupation=medic start_time=1941 end_time=1945] may possibly give people the answer they are looking for, but there must be a better way. --RAN (talk) 01:10, 29 September 2020 (UTC)

description of "described by source"

Currently the English description of described by source (P1343) is "reference work where this item is described." Do I understand correctly that values for this property don't have to literally fall into the class of reference work (i.e. a dictionary, encyclopedia, or other such summum of knowledge) but can be any work which is significantly useful as a reference on the topic? If that is in fact so, then the description is misleading. Most of the other-language descriptions that I'm able to read more strongly lean towards specifying a literal reference work; they're something like "dictionary, encyclopedia, or other reference work the item is described in," or even "dictionary or encyclopedia the item is described in." — Levana Taylor (talk) 11:20, 28 September 2020 (UTC)

There was a discussion a few months ago at Wikidata:Project_chat/Archive/2020/02#Described_by_source. Ghouston (talk) 22:42, 28 September 2020 (UTC)
Also at Property_talk:P1343. Neither seems conclusive. Ghouston (talk) 22:45, 28 September 2020 (UTC)
OK thanks, I hope this gets sorted soon. If a topic has an entire scholarly paper or well-researched book about it, why should we not be able to mention that on the item page, while we can mention a brief encyclopedia article? — Levana Taylor (talk) 23:16, 28 September 2020 (UTC)
Yes, it doesn't make much sense to me that we have to use an external link to refer to a particular resource when it's already described by a Wikidata item, as I said in the discussion. I'm not really convinced an additional property is needed, but it's about time it was proposed if so. Ghouston (talk) 23:39, 28 September 2020 (UTC)

How to add support for Local language to Wikidata

I have been assisting at Wikidata editathons, only to discover that most of the local languages we were hoping to edit in were not supported. That prompted this question. Since we have Wikidata as our initial point of involving the community, we wish to find out the necessary steps to get these languages on Wikidata.  – The preceding unsigned comment was added by Eugene233 (talk • contribs).

You may find User:Lea Lacroix (WMDE)/List of lists of languages interesting. --Matěj Suchánek (talk) 08:33, 29 September 2020 (UTC)

submission via URL using LAST

Copied from Help_talk:QuickStatements - please reply there if possible.

The following URL submission isn't working: https://quickstatements.toolforge.org/#/v1=CREATE%7C%7CLAST%7CLen%7C%22Digital%20media%20use%20and%20mental%20health%22%7C%7CLAST%7CDen%7C%22Scholarly%20article%22%7C%7CLAST%7CP31%7CQ580922%7C%7CLAST%7CP7347%7C%22https:%2F%2Fen.wikiversity.org%2Fwiki%2FTalk:WikiJournal_Preprints%2FDigital_media_use_and_mental_health%22%7C%7CLAST%7CP793%7CQ76903164%7CP585%7C2019-06-08T00:00:00Z%2F11

However, each aspect of it works alone if I just create a new item and add a single statement (=CREATE%7C%7CLAST%7C[statement]). Is there something obvious I'm doing wrong? I've merged the items into Q99557809, but I'm making a template that auto-formats up these submission URLs so I really want to get it working (see further example here). T.Shafee(evo&evo) (talk) 03:43, 28 September 2020 (UTC)

"Is there something obvious I'm doing wrong?"
I think you didn't read the help page you are commenting on. --- Jura 06:27, 28 September 2020 (UTC)
@Jura1: hah, sadly the solution wasn't in the #Running_QuickStatements_through_URL on that help page. In this case it was that Senwikiv wasn't an accepted sitelink abbreviation for en.wv (I'd extrapolated from the given example of Senwiki for en.wp). And that the url submission method doesn't infer a "+" in front of dates (I'd extrapolated from the CSV import method which does). Luckily some troubleshooting at the talkpage managed to get it working (the now working example after troubleshooting in case it's useful for reference some time in the future). T.Shafee(evo&evo) (talk) 05:38, 29 September 2020 (UTC)
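For reference, a minimal sketch of how such a submission URL might be assembled, assuming the layout visible in the URL quoted above: fields joined with "|", commands joined with "||", and the whole text percent-encoded after "#/v1=". The helper names are hypothetical. The "+" date prefix follows the fix described in the reply above; whether the tool accepts a given command is not checked here.

```python
from urllib.parse import quote

BASE = "https://quickstatements.toolforge.org/#/v1="


def wd_date(iso_day, precision=11):
    """Format a day as a time value with an explicit leading '+' (11 = day precision)."""
    return "+{}T00:00:00Z/{}".format(iso_day, precision)


def build_qs_url(commands):
    """commands is a list of field lists; returns the encoded submission URL."""
    text = "||".join("|".join(fields) for fields in commands)
    # ':' is left unencoded to mirror the URL quoted in the post; '|', '"',
    # spaces, '/' and '+' are all percent-encoded.
    return BASE + quote(text, safe=":")


if __name__ == "__main__":
    url = build_qs_url([
        ["CREATE"],
        ["LAST", "Len", '"Digital media use and mental health"'],
        ["LAST", "P31", "Q580922"],
        ["LAST", "P793", "Q76903164", "P585", wd_date("2019-06-08")],
    ])
    print(url)
```

Encoding the "+" as %2B matters because a bare "+" in a URL is often read as a space.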

Duplicate items created by Quickstatements

In the last few weeks, I've run some QS batches to create items. These batches (100 items at a time) created a variable number of duplicates, sometimes even triplicates. An example is batch #43384, which created 7 duplicates. The duplicates do not stem from the input file. Has anybody had similar experiences? Are there perhaps known workarounds? Can anybody spot weaknesses in the input file? I'd very much appreciate help here; I want to avoid creating the other 3500 items in the queue and then having to delete probably more than 200 items to clean up afterwards. Cheers, Jneubert (talk) 15:51, 28 September 2020 (UTC)

See Topic:Vkhd578n4cv9ndew.--GZWDer (talk) 16:21, 28 September 2020 (UTC)
User:Tagishsimon noticed this too, on the 26th: twitter.
Does QS do this all the time, or is it having bad days? Jheald (talk) 16:36, 28 September 2020 (UTC)
I've experienced that on at least four different days in the last two weeks, so apparently it isn't just a hiccup. @GZWDer: Thanks a lot for the link - I searched that queue, but not far enough back. Will try without batch mode, perhaps that is the workaround I'm looking for. Jneubert (talk) 16:46, 28 September 2020 (UTC)
@GZWDer, Tagishsimon, Jheald: The workaround of using "Run" instead of "Run in background" worked: batches of 100 and 1000 items were created without duplicates. Thanks again, Jneubert (talk) 07:06, 29 September 2020 (UTC)

Time composite maps

 

This map shows a railway system, Q2149865 (from 1913 to 1938; in 1938 the first line was dismantled), drawn on a 2012 OSM background map. Is there a property 'background map' to record the OSM date? How can we deal with this? Quite a lot of maps are drawn on a more recent background map. Smiley.toerist (talk) 20:11, 28 September 2020 (UTC)

Merging items

Q97446514 and Q47742130 refer to the same person. How to merge them into one item? Regards, TryKid (talk) 07:09, 29 September 2020 (UTC)

Proper way to contribute surnames with a bot

We ran into a problem with nobiliary particles in the surnames of artists when we started to import data with a bot. Sometimes it is hard to separate them from the surname because we don't know every possible surname prefix in every language. For example, Van is a very common prefix in Dutch, and Di in Italian. My questions are:

  1. Can last names be compound at all (that is, consist of several words separated by spaces)?
  2. Any suggestions on how to extract only the surname in general?

 – The preceding unsigned comment was added by an unidentified user.

If you aren't confident in your approach, it's better not to use a bot for this.
What may be easier is to limit yourself to people of a given nationality and to personal names that consist of only two words in Latin script. Also, you could test for specific family names. --- Jura 15:03, 29 September 2020 (UTC)
Jura, thanks for the update. We have already limited our functionality as you described, but for us it is still an open question how to improve the automation of this process. Wdrupal (talk) 16:05, 29 September 2020 (UTC)
@Wdrupal: It really depends on your dataset. The way personal names work evolves over time and is different from one country to another.
If your items already have name in native language (P1559) and country of citizenship (P27), it might be possible to determine further steps.
If you have primarily Cyrillic script names, you might need to create quite a few new items even for given names. --- Jura 18:15, 29 September 2020 (UTC)
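The conservative approach suggested above (only two-word names in Latin script, everything else left for manual review) could be sketched like this. It is an illustration under stated assumptions, not a recommended bot: the `PARTICLES` set is a hypothetical and deliberately incomplete list, `simple_surname` is a made-up name, and the Latin-script test only covers basic and Latin-1 letters.

```python
import re

# Hypothetical, deliberately incomplete list of surname prefixes.
PARTICLES = {"van", "von", "de", "di", "da", "del", "della", "le", "la"}

# Simplified Latin-script check: ASCII and Latin-1 letters, apostrophe, hyphen.
LATIN_WORD = re.compile(r"^[A-Za-zÀ-ÖØ-öø-ÿ'\-]+$")


def simple_surname(full_name):
    """Return the surname for unambiguous two-word Latin-script names, else None."""
    words = full_name.split()
    if len(words) != 2:
        return None  # possible compound surname: leave for manual review
    if not all(LATIN_WORD.match(w) for w in words):
        return None  # other scripts need their own handling
    if words[0].lower() in PARTICLES:
        return None  # first word looks like a particle, not a given name
    return words[1]


print(simple_surname("Claude Monet"))
print(simple_surname("Vincent van Gogh"))
```

Returning None instead of guessing means the bot skips anything ambiguous, which matches the advice not to automate cases you are not confident about.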

Movement Strategy - What Are Your Choices For Implementation

Hi Wikidata friends,

Apologies to make this page busier, but the time has come to put Strategy into work and we would love to hear from you.

The Movement Strategy Design Group and Support Team are inviting you to organize virtual meetings with your community and colleagues before the end of October. The aim is for you to decide what ideas from the Movement Strategy recommendations respond to your needs and will have an impact in the movement. The recommendations are available in different formats and in many languages. There are 10 recommendations and close to 50 recommended changes and actions or initiatives. Not everything will be implemented. The aim of prioritization is to create an 18-month implementation plan to take some of the initiatives forward starting in 2021.

Prioritization is at the level of your group, affiliate, and community. Afterwards, we will come together in November to co-create the implementation plan. More information about November’s global events will be shared soon. For now and until the end of October, organize locally and share your priorities with us.

You can find guidance for the events, the simple reporting template, and other supporting materials here on Meta. You can share your results directly on Meta, by email, or by filling out this survey. Please don’t hesitate to get in touch with us if you have any questions or comments, strategy2030 wikimedia.org

We will be hosting office hours to answer any questions you might have, Thursday October 1 at 14.00 UTC (Google Meet).

Thank you.

MPourzaki (WMF) (talk) 16:49, 29 September 2020 (UTC)

Icons for properties

Hi y'all,

I've added icon (P2910) on Palissy ID (P481) and Mérimée ID (P380), as it was already done on ORCID iD (P496) and format as a regular expression (P1793). It could be very useful for re-users, starting with template like the Wikidata Infobox on Commons or authority box on French Wikipedia and Wikisource.

What do you think? It seems pretty trivial, but are there any comments or objections?

Cheers, VIGNERON (talk) 19:32, 29 September 2020 (UTC)

Sounds good to me! - PKM (talk) 19:57, 29 September 2020 (UTC)
Good idea. Might even be a good idea to find some way to display icons when editing Wikidata (at least opt-in) --Emu (talk) 20:09, 29 September 2020 (UTC)

Wiki of functions naming contest

21:25, 29 September 2020 (UTC)

Characters in movies/series

Hi, I'm trying to understand how to properly construct movie items. Movies/series have cast member (P161) with a list of actors and every actor can be assigned to a character with character role (P453). However, in 99% the character is not really that important to create separate Q item for it. What if I want to assign an actor to "man from the crowd" character role - how to do that properly, to have ability to localize phrase "man from the crowd" into other languages later? Please, ping on reply. --Kanzat (talk) 13:09, 28 September 2020 (UTC)

Duplicate entries

Hello. I found a couple of what appear to be duplicate entries:

Perhaps it is easiest to delete (Q67601241), which is the newest entry created last year.

Also:

Perhaps a merge would be appropriate here?

Regards, --Malcolmxl5 (talk) 16:33, 28 September 2020 (UTC)

Thank you. I merged Goddards, as it looked like an unambiguous duplicate. Concerning the first pair, we have the perennial issue of a museum vs. the building which hosts the museum. We recently had a long discussion of building vs. function; does anybody remember what the outcome was (if any)?--Ymblanter (talk) 19:17, 28 September 2020 (UTC)
Keep both items, a building can change function and a museum can change its location.--Jklamo (talk) 20:10, 28 September 2020 (UTC)
That is logical. I have removed the property 'architect' from (Q2086562), the function, and added it to (Q67601241), the building. Similarly, I have removed 'heritage designation' and 'National Heritage List for England number' from (Q2086562), the function, as they relate to the building (they are already included in Q67601241). Thanks, --Malcolmxl5 (talk) 17:57, 30 September 2020 (UTC)
From Special:WhatLinksHere/Q24699794 I found Wikidata:WikiProject Heritage institutions/Data structure/Data modelling issues, which is linked to from Wikidata:Project chat/Archive/2019/09#Cleaning up Wikidata Entries for Heritage Institutions - Your Help is Needed! but that isn't particularly long or recent so there may have been another discussion. I think it's just a coincidence that Yorkshire Museum building (Q67601241) was created around the same time - it was so I could add a statement to St Mary's Lodge and attached railings, gates and gate piers (Q67600995), a separate building which according to the heritage list description is part of the museum; there are others I have split more recently because they weren't always museums. With historic house museums, there is a stronger association with the building, and I agree with merging the house and garden item; exceptions would be if the building and garden were under separate ownership, or if any of the sitelinks were specifically for the garden. Peter James (talk) 22:23, 30 September 2020 (UTC)

Can't get a Wikipedia page to include a language link

It all began when I wanted to know the French word for watermelon (which I eventually found through Wiktionary, but that's long since become irrelevant). I had begun by heading to the article on English-language Wikipedia and scanned the "Languages" list in the sidebar, but though it included entries for gajillions of other languages, it bore nothing for français. In trying to figure out how to remedy that, I learned (where had I been?) that all of the links to corresponding pages on other languages' Wikipedia sites are managed via this site, Wikidata.

OK, so I ended up at https://en.wikipedia.org/wiki/Help:Interlanguage_links#Adding_a_new_link, dutifully followed those directions, but got nowhere. The Wikidata page to which the sidebar at https://en.wikipedia.org/wiki/watermelon sent me is watermelon (Q38645), but when I tried to add an entry for fr to its Wikipedia table, I got groused at because it seems that French Pastèque is already claimed by Citrullus lanatus (Q17507129). Well, those bad tidings were at least accompanied by the advice that, "Hey, maybe the two should be merged." And indeed, when I glanced at their discussion pages I learned that User:Brinerustle had suggested the merger in July 2019. But then I discovered that User:Cloht had advised against it, arguing that one page is for the plant itself and the other is for its fruit.

Well I get that User:Cloht's distinction is one worth maintaining. Indeed, the French language makes such distinctions right and left (e.g., an apple is une pomme, but an apple tree is un pommier). But that doesn't do me any good in resolving the original problem: In the status quo, the myriad innocent users of English-language Wikipedia who visit the article about watermelon have no convenient way to navigate to its cousin in French Wikipedia. It makes no sense for French to be singled out for this bogosity. But being utterly brand new to Wikidata, I haven't the slightest idea how to fix it. Help??—PaulTanenbaum (talk) 18:47, 29 September 2020 (UTC)

See en:w:Help:Interlanguage links. This sounds like a situation where you will want to add a local link. From Hill To Shore (talk) 19:16, 29 September 2020 (UTC)
Well, that was easy enough. A bit kludgy, but hey, it worked. :-) Thanks, From Hill to Shore! —PaulTanenbaum (talk) 20:31, 29 September 2020 (UTC)
en:Watermelon should probably be linked to Citrullus lanatus (Q17507129), since it seems to be about the plant and not just the fruit. Ghouston (talk) 23:10, 29 September 2020 (UTC)
No, because there are probably much more taxa than this making a watermelon. --SCIdude (talk) 05:52, 30 September 2020 (UTC)
@Ghouston: is correct; there is only the one species of Watermelon; the English link at Citrullus lanatus (Q17507129) is currently just a redirect to en:Watermelon, which means a lot of the interwiki connection doesn't work well - MPF (talk) 10:11, 30 September 2020 (UTC)

VIAF and KrBot

This is potentially a serious problem. You might recall a discussion (here for one) about how best to indicate that VIAF clusters contain authority records for multiple different people, and how best to stop bots from taking those VIAF clusters as reason to add the authority IDs to the wrong Wikidata items. We settled on putting the erroneous IDs in the item with a deprecated rank in the hope that this would block the bots from adding them again. But it doesn't seem to work, at least in the case of KrBot. Have a look at the history of James B. Baker (Q96925509) (silviculturist) and James B. Baker (Q60438918) (architect). Not only did KrBot re-add the deprecated VIAF ID, but worse, the confusion seems to have travelled back to VIAF again. Earlier this summer, VIAF separated out the architect's Getty ID from the silviculturist's LoC ID, but now they've combined them again. I am having a hard time figuring out what exactly the bots did to VIAF, but it appears in any case that our attempts to be a source of correct and clear data for other authorities may not be working. There can't be unrestricted bidirectional bot-editing, because then bots "correct" Record B according to the erroneous Record A and take the second version of Record B as confirmation that Record A is correct and if someone fixes Record A, they "correct" Record A according to Record B, and and and — Levana Taylor (talk) 15:37, 29 September 2020 (UTC)

Can you check with Ivan? I think it should have ignored the deprecated statement. --- Jura 15:49, 29 September 2020 (UTC)
The problem's not just KrBot really -- that isn't what's editing VIAF, is it? Even if we figure out what happened in this case, what's to stop it happening in some other form? We need to block bidirectionality and I'm not sure that deprecated rank does that effectively. — Levana Taylor (talk) 15:55, 29 September 2020 (UTC)
If I understand you the issue is that KrBot doesn't check for deprecated entries before writing back. BrokenSegue (talk) 15:57, 29 September 2020 (UTC)
maybe we need a software change to prevent bots from adding entries that already exist and are deprecated? Optimally the authors of these bots would be "doing the right thing" and we would have common software libraries that make doing the right thing the default. Looking at KrBot it seems they haven't actually been approved to do what they are doing? I thought bots needed approval per-task? Why did @Ivan A. Krestinin: not need such re-approval. Also, the lack of open source code for this and similar bots makes checking they are doing the right thing hard. BrokenSegue (talk) 15:57, 29 September 2020 (UTC)
If a change is needed at VIAF, it's something that needs to be discussed with them. Still, I don't quite get why you add conflated identifiers to a clean item. This is specifically not what Help:Conflation_of_two_people#Keep suggests doing. --- Jura 16:00, 29 September 2020 (UTC)
Adding the conflated identifiers, deprecated, was an attempt at a kluge solution to stop bots adding them -- suggested in the previous discussion but apparently not the right answer. — Levana Taylor (talk) 16:04, 29 September 2020 (UTC)
If you think Help:Conflation_of_two_people should be revised, please suggest updates. If you follow some other approach, you can just hope VIAF and bot operators guess it, but that is somewhat unlikely. --- Jura 16:08, 29 September 2020 (UTC)
Oh, I agree. Luckily the proposal from the previous discussion hasn't been used much yet. The question is what to do instead. — Levana Taylor (talk) 16:12, 29 September 2020 (UTC)
VIAF bots are unlikely to work with your kludges, nor should they rely on emails from Wikidata contributors. Strange that people suggested that approach to you, as I think it's known that it isn't working. --- Jura 16:27, 29 September 2020 (UTC)
Help:Conflation of two people doesn’t necessarily apply here as it handles cases of Wikidata entries that are conflated and need to be untangled. Generally, it is useful information to know that an identifier appears to be about a given person but really is not, so the deprecation serves a purpose beyond locking out bots. VIAF is a special case, though: I stopped trying to make VIAF happy as it is too erratic. The time is better spent trying to correct other identifiers (that mostly aren’t that much conflated, which makes flagging the odd one out even more useful). --Emu (talk) 20:05, 29 September 2020 (UTC)


Bots and circularity

Could there be a hand-setting that means "I notice there are errors in external authorities' values for this statement, I have made Wikidata correct to the best of my ability, and I am now blocking bots from changing it on the basis of those external authorities" -- i.e. a "human editing only" setting? — Levana Taylor (talk) 16:21, 29 September 2020 (UTC)

Although that still isn't a general solution to the problem of bidirectional bot editing. Let me put this another way: If Wikidata bots took data from elsewhere but others didn't take data from us, we'd constantly be fighting errors but at least we wouldn't be the means of propagating those errors back to external sources. Conversely, if we didn't take data from outside but others took from us, we'd have a smaller dataset but the others could rely on us to try to make it correct. Whereas if the dataflow is both ways, not only do errors go round and round forever, but no one knows who to rely on. Being able to manually block the back-and-forth in specific cases would be better than nothing. — Levana Taylor (talk) 17:26, 29 September 2020 (UTC)

Yet further thought: We don't want to have a blanket ban on Wikidata taking any data from external sources that in turn take data from Wikidata, because those external sources in many cases have multiple ways of getting information and contain a lot that is useful for Wikidata. But would it be a good idea to find out at least some of the major ones that use Wikidata as a source, for example VIAF, and ban bot imports from them altogether, rather than just case by case as I suggested in the paragraphs above? — Levana Taylor (talk)

@Levana Taylor: That's throwing out the baby with the bathwater. There is a huge amount of valuable information on VIAF etc that we don't have, that we should import. The problem is making sure that information doesn't get re-imported that has already been identified as incorrect. The way to do that is, first, to make sure that information that is incorrect but widespread should be visible here and marked as deprecated (ideally with a reason spelling out why); and, second, to require the major bot platforms not to add a statement to an item, if the statement is already here with deprecated rank. Jheald (talk) 19:38, 29 September 2020 (UTC)
But above, Jura was arguing that adding incorrect external identifiers deprecated was not the way to deal with the VIAF-conflation issue. — Levana Taylor (talk) 19:51, 29 September 2020 (UTC)
We should be aware of Citogenesis. Multichill (talk) 19:43, 29 September 2020 (UTC)
Yup, I should've known that XKCD would already have found a name for the process. And yes, humans can do it just as well as bots can, merely not on quite such a massive scale. — Levana Taylor (talk) 19:48, 29 September 2020 (UTC)
Property_talk:P214/Archive_1#synchronize_with_VIAF has some discussion about it from 2016. --- Jura 09:08, 30 September 2020 (UTC)
That's informative, but besides that discussion being more concerned with initial imports than maintenance, it was assumed back then that VIAF would be working fairly closely with Wikidata. Is there currently any human being at VIAF who is actively responding to correspondence? If not, then that makes it very hard to presently set up WD in any way that's useful to VIAF, since we wouldn't know much about how VIAF is reading and using WD data and how it is responding to changes here. It's more important, I think, to just protect WD statements that have been hand-corrected from being re-altered by a bot, which would pretty much solve our end of the VIAF problem. As to whether that's best done by adding deprecated statements and making sure bots don't duplicate them ... aren't there quite a few external processes and even Wikimedia gadgets that have trouble interpreting deprecation? Maybe it's best, then, to avoid as much as possible proliferating deprecated statements, using them only in cases where they'd be informative to humans, and elaborately tracking VIAF cluster errors probably isn't such a case. — Levana Taylor (talk) 18:43, 30 September 2020 (UTC)

Deprecated rank mechanics

Maybe I haven't read thoroughly above. What are good reasons not to declare it a bug that deprecated values don't prevent the same value being added with a different rank, or added at all? It would be the first real database feature preventing bot stupidity, for which there is no good other remedy. --SCIdude (talk) 15:18, 1 October 2020 (UTC)

a bug in what? i agree it is a bug in the bots doing the mirroring. BrokenSegue (talk) 15:27, 1 October 2020 (UTC)
A bug in that the attempt to add the same value should produce an error. --SCIdude (talk) 15:39, 1 October 2020 (UTC)
Ah, so a bug in wikibase/wikidata. I'm not sure it should always be prohibited. Could produce a bad user experience if someone was about to add a new qualifier or change the rank. I would suggest only enforcing it for bots. BrokenSegue (talk) 15:48, 1 October 2020 (UTC)
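The bot-side guard discussed above (do not add a value that is already on the item, in particular with deprecated rank) could look roughly like this. It is a sketch, not KrBot's code or any existing library: it assumes the entity is available as a dict in the standard Wikibase JSON shape (`claims`, `mainsnak`, `datavalue`, `rank`), the function names are hypothetical, and the VIAF IDs in the sample are made up.

```python
def existing_rank(entity, prop, value):
    """Return the rank of an existing statement with this value, or None."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue"/"novalue" snaks carry no datavalue
        if snak.get("datavalue", {}).get("value") == value:
            return statement.get("rank")
    return None


def bot_may_add(entity, prop, value):
    """Allow the edit only if the value is absent in every rank."""
    return existing_rank(entity, prop, value) is None


# Sample item fragment; the identifier values are invented for illustration.
entity = {
    "claims": {
        "P214": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P214",
                    "datavalue": {"value": "000000001", "type": "string"},
                },
                "rank": "deprecated",
            }
        ]
    }
}

print(bot_may_add(entity, "P214", "000000001"))
print(bot_may_add(entity, "P214", "000000002"))
```

Checking in the bot rather than in Wikibase keeps human editors free to re-add or re-rank a value, which is the distinction drawn in the last reply.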

I have closed the above RfC as consensus to semi-protect all properties. Currently, all properties older than a year can (and should) be semi-protected. However, I feel that a proposal to have a grace period between creation and protection of a property (which I now arbitrarily set to a year) has not been sufficiently explored. I opened a new section on this RfC in which I request all of you to come up with the ideas on the duration of this grace period. Thank you in advance.--Ymblanter (talk) 20:02, 25 September 2020 (UTC)

Now closed. All property pages must be semi-protected. I assume this will be realized in an automatic regime via the Phabricator ticket which has already been open.--Ymblanter (talk) 20:55, 2 October 2020 (UTC)

Heads of state/government up to date

Is there a system for keeping heads of state and heads of government up to date, after new elections or deaths? Systems depending on wikidata tend to get a lot of queries regarding new leaders immediately after they're designated.

Currently Kuwait (Q817), Togo (Q945), and Somalia (Q1045) are out of date.  – The preceding unsigned comment was added by High surv (talk • contribs).

Where would we source that data from? Unless volunteers do it manually, which I believe is what we rely on. BrokenSegue (talk) 20:02, 30 September 2020 (UTC)
Apparently volunteers like the person who commented above can't edit country items. --- Jura 07:36, 1 October 2020 (UTC)
To be fair, asking for a "system" to do it is a lot easier than monitoring the political changes in 200 or so states and updating it yourself. The data also tends to be stored redundantly on multiple items, so you'd likely have to make several edits in each case, as well as creating items for any missing politicians. Ghouston (talk) 22:36, 1 October 2020 (UTC)
Here's the new prime minister of Somalia, we don't know much about him at the moment: Mohamed Hussein Roble (Q99605101). He replaced somebody else who was acting for a few months. Ghouston (talk) 22:45, 1 October 2020 (UTC)
I have completed the information for "prime minister of Somalia" and also added a picture for the new PM. On the talk page for many office holders like these, you can find what Wikidata knows about that office. This is the one for the Somali prime minister.
For a long time I manually added information to Wikidata about people who had died. I stopped doing this when adequate software was added that makes it easy to maintain date of death information. It is linked to many Wikipedias. A similar job can be done for office holders. It will make the information that we hold that much more useful. It will also make it easier to include the information in infoboxes, as they do on the French Wikipedia. Thanks, GerardM (talk) 11:32, 6 October 2020 (UTC)
  • @High surv: if someone builds a system that depends on Wikidata and finds that Wikidata's data doesn't provide satisfactory answers to certain questions, there's an easy fix: edit the data, so that the answers are of higher quality. Either do it yourself or encourage your users to go back to Wikidata and update the data. ChristianKl 23:51, 12 October 2020 (UTC)
@ChristianKl: country pages are protected. --High surv (talk) 00:27, 13 October 2020 (UTC)
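For anyone who wants to monitor the values mentioned in this thread, a query for the currently preferred head of government (P6) of the three countries named above could be generated as below. This is only a sketch: the function name is hypothetical, the resulting SPARQL is meant to be pasted into the Wikidata Query Service, and it only reads data; it does not solve the updating problem discussed here.

```python
def head_of_government_query(country_qids):
    """Build a SPARQL query listing the best-rank P6 value for each country."""
    values = " ".join(f"wd:{qid}" for qid in country_qids)
    return f"""
SELECT ?country ?countryLabel ?head ?headLabel WHERE {{
  VALUES ?country {{ {values} }}
  ?country wdt:P6 ?head .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
""".strip()


# Kuwait, Togo and Somalia, as listed in the original comment.
query = head_of_government_query(["Q817", "Q945", "Q1045"])
print(query)
```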