Wikidata:Project chat/Archive/2017/06

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Edit request: Property proposal preload

It's not clear where to place requests for edits to Wikidata:Property proposal/Proposal preload/en, so please can someone tell me; or better still, make this edit to the page:

After the |formatter URL row, add a line:

|external links = <!-- search string to pass to sister projects' Special:LinkSearch pages, e.g. example.com -->

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:35, 3 June 2017 (UTC)

  Done in Wikidata:Property proposal/Proposal preload. (I'm sorry that I didn't add it there as soon as I made that possible.) Matěj Suchánek (talk) 15:47, 3 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 06:32, 6 June 2017 (UTC)

Cities by population

I'm trying to get the largest cities by population (filtered to Ukrainian ones, so I can check that everything is correct) like this:

# Cities by population
SELECT ?city ?cityLabel ?countryLabel ?population
WHERE
{
    ?city wdt:P31 wd:Q515 . # is a city
    ?city wdt:P17 ?country . # show me the country
    ?city wdt:P17 wd:Q212 . # let it be Ukraine
    ?city wdt:P1082 ?population . # and get me the population
    FILTER (?population >= 1000000) # remove cities with a population of less than one million
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY DESC(?population)

And I don't see Kiev (Q1899) in the results. Is it not an instance of (P31) city (Q515)? It looks like it is only an instance of (P31) city with special status (Q5124045), which is a subclass of (P279) cities of Ukraine (Q200209), which in turn is a subclass of (P279) city (Q515). How do I update the query so that it selects not only items that are an instance of (P31) city (Q515), but also instances of any subclass of it? Or should I add instance of (P31) city (Q515) to Kiev (Q1899) and the other cities of Ukraine? --Bunyk (talk) 08:05, 6 June 2017 (UTC)

Use ?city wdt:P31/wdt:P279* wd:Q515 together with SELECT DISTINCT. d1g (talk) 08:36, 6 June 2017 (UTC)
SELECT DISTINCT ?city ?cityLabel ?countryLabel ?population
WHERE
{
    ?city wdt:P31/wdt:P279* wd:Q515 . # changed: instance of city or of any subclass of city
    ?city wdt:P17 wd:Q212 .
    ?city wdt:P17 ?country .
    ?city wdt:P1082 ?population .
    FILTER (?population >= 1000000)
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY DESC(?population)

Thanks a lot! --Bunyk (talk) 08:47, 6 June 2017 (UTC)
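
To illustrate what the path wdt:P31/wdt:P279* matches, here is a minimal Python sketch over a hand-written toy graph. The three statements are the ones described in this thread; the dictionaries and function names are hypothetical, and nothing here queries Wikidata.

```python
# Toy data copied from the thread; not fetched from Wikidata.
INSTANCE_OF = {"Q1899": ["Q5124045"]}  # Kiev -> city with special status
SUBCLASS_OF = {
    "Q5124045": ["Q200209"],  # city with special status -> cities of Ukraine
    "Q200209": ["Q515"],      # cities of Ukraine -> city
}


def superclasses(cls):
    """Return cls plus every class reachable via zero or more P279 steps."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, []))
    return seen


def matches_direct(item, target):
    """Mimics the pattern `?item wdt:P31 target`."""
    return target in INSTANCE_OF.get(item, [])


def matches_path(item, target):
    """Mimics the pattern `?item wdt:P31/wdt:P279* target`."""
    return any(target in superclasses(c) for c in INSTANCE_OF.get(item, []))


print(matches_direct("Q1899", "Q515"))  # False
print(matches_path("Q1899", "Q515"))    # True
```

The direct pattern misses Kiev because its only P31 value is a subclass two steps below city (Q515); the path pattern follows those P279 steps. SELECT DISTINCT is still needed in the real query because an item can reach Q515 along more than one route.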

This section was archived on a request by: Matěj Suchánek (talk) 09:33, 6 June 2017 (UTC)

Plotting property usage over time

Constraint reports like Wikidata:Database reports/Constraint violations/P496 include an item count parameter, rendered as, for example, Items processed: 2125. For example in this edit it was incremented from 2122 to 2125.

Is there a tool that can extract this data over time, so that it can be plotted? Or perhaps someone has a bot that could do that? Perhaps the figures could be added to each property page on, say, a monthly or quarterly basis? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:13, 20 May 2017 (UTC)

The number used on the constraint pages isn't accurate. It doesn't get updated if there are no changes to the contents of the page, for example. Every property talk page shows {{Property uses}} though; you can use the revisions to get an overview. Sjoerd de Bruin (talk) 21:48, 21 May 2017 (UTC)
It's accurate enough; and although it's not updated daily, plotting the number against time is good enough for my purposes. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 06:31, 23 May 2017 (UTC)
A selection of properties (external identifiers with a BARTOC ID (P2689)) has recently been plotted on http://coli-conc.gbv.de/concordances/wikidata/, created by User:JakobVoss. -- Jneubert (talk) 16:05, 24 May 2017 (UTC)
@JakobVoss, Jneubert: Thank you. Funnily enough, I learned of that tool only last week, in another context. I see thumbnail plots on that page, but no links to larger versions, and not the raw data as plotted in your image here. What am I missing? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:51, 31 May 2017 (UTC)
@Pigsonthewing: Raw data is available at http://coli-conc.gbv.de/concordances/wikidata/statlog.csv The source code to update raw data is available. I recorded your feature request. -- JakobVoss (talk) 08:36, 1 June 2017 (UTC)

A secular tree has been cut down, how to reflect this?

This secular tree Q28919865 has been recently cut down as it collapsed during the heavy snowfall of April 2017 in Moldova (snowfall in Moldova, April 2017 (Q29516088)). How do I reflect this on the item's page? Preferably also specifying the cause. --Gikü (talk) 15:16, 29 May 2017 (UTC)

I had not found anything helpful on The Senator (Q2267723), another tree taken down, this one by a fire. --Gikü (talk) 15:18, 29 May 2017 (UTC)
I assume the property to use would be significant event (P793) maybe with target logging (Q845249) and appropriate qualifiers? ArthurPSmith (talk) 14:09, 30 May 2017 (UTC)
end time (P582) is the generic property for things that have ceased to exist. It is generic enough that most SPARQL writers know to use it when they want only items that still exist. Syced (talk) 07:37, 1 June 2017 (UTC)

Enable Wiktionary sitelinks in Wikidata

After enabling the extension Cognate that provides the interlanguage links for the main pages of all the Wiktionaries, we will now move forward to the next step of supporting Wiktionary with Wikidata. From June 20th, we are going to store the Wiktionary interwiki links (all the namespaces but main, user and talk) in Wikidata.

Just like Wikipedia a few years ago, a “Wiktionary” links section will be created for the items, the links will be migrated to Wikidata, new items will be created for this purpose, and Wiktionary editors will be able to add and modify Wiktionary links.

How can you help?

  • First of all, you can help us translate this documentation page into the languages you know.
  • If you know tools, scripts, or bots that could be useful for the migration process and for removing the manual sitelinks, please share your information on the page and offer help to people who need to use them.
  • From June 20th, you may want to pay special attention to the newly created items and all recent changes that will result from this new feature available for Wiktionaries.
  • Be friendly and welcoming with the Wiktionary editors :) Help them if necessary, make them feel part of the great Wikidata community.

Thank you very much! Lea Lacroix (WMDE) (talk) 08:13, 1 June 2017 (UTC)

Property 4000

Another milestone: has fruit type (P4000) was recently created. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:14, 1 June 2017 (UTC)

Multiple items refer to Kingdom of Israel

There are multiple QIDs 'Kingdom of Israel (Q230407)', 'Kingdom of Israel (Q3185305)' which have Kingdom of Israel as the label.  – The preceding unsigned comment was added by Mbkv (talk • contribs) at 12:14, 1 June 2017‎ (UTC).

Labels need not be unique; see Help:Label. The description should disambiguate them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:48, 1 June 2017 (UTC)

Video game release regions

Also, in the WikiProject Video Games, we use regions to separate game releases, such as Japan, North America, Australia, and Europe. What item corresponds to these regions? SharkD (talk) 12:23, 1 June 2017 (UTC)

applies to jurisdiction (P1001). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:47, 1 June 2017 (UTC)
@Pigsonthewing: So place of publication (P291) (alias 'release region') would not be acceptable for this purpose? Mahir256 (talk) 19:41, 1 June 2017 (UTC)
That could be used, but what if it's published in London and made available to all of Europe? Or globally? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:50, 1 June 2017 (UTC)
Maybe you can use P291 together with NTSC-J (Q6955505), NTSC (Q185796), PAL region (Q2729044) and NTSC-C (Q6955502)? Q.Zanden questions? 20:23, 1 June 2017 (UTC)
Also, I was assuming that the OP would use the requested property as a qualifier on the release date(s). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:01, 2 June 2017 (UTC)
What I meant to say was, what should I use for country of origin (P495)? But it is a stupid question because it's self-explanatory. Sorry. SharkD (talk) 20:32, 1 June 2017 (UTC)

Join my Reddit AMA about Wikipedia and ethical, transparent AI

Hey folks, I'm doing an experimental Reddit AMA ("ask me anything") in r/IAmA on June 1st at 21:00 UTC. For those who don't know, I create artificial intelligences that support the volunteers who edit Wikidata. I've been studying the ways that crowds of volunteers build massive, high quality information resources like Wikipedia and Wikidata for over ten years.

This AMA will allow me to channel that for new audiences in a different (for us) way. I'll be talking about the work I'm doing with the ethics and transparency of the design of AI, how we think about artificial intelligence on Wikimedia projects, and ways we’re working to counteract vandalism. I'd love to have your feedback, comments, and questions—preferably when the AMA begins, but also through the ORES talkpage on MediaWiki.

If you'd like to know more about what I do, see my WMF staff user page, this Wired piece about my work or one of my more impactful research papers, "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline". --EpochFail (talk) 07:38, 25 May 2017 (UTC)

The AMA is live! I'll be around for a couple of hours answering questions. See https://www.reddit.com/r/IAmA/comments/6epiid/im_the_principal_research_scientist_at_the/ --EpochFail (talk) 20:29, 1 June 2017 (UTC)

Treaties - differentiating original signatories from later signatories

Regarding the signatory (P1891) property on the item treaty (Q131569), how do I distinguish later signatories of a treaty from its original participants? Should I add a "point in time" reference for each country I list as a signatory? Also, if I am aware that a group of countries joined a treaty after its creation, but don't know the dates, how do I indicate that? EU explained (talk) 22:20, 1 June 2017 (UTC)

@EU explaind: I would use point in time (P585) for all of them. —Justin (koavf)TCM 23:18, 1 June 2017 (UTC)

Regex question

Hello everybody. Can anybody please check if I've gotten this regex right?

For the new format of BBF ID (P1650), which looks like e.g. bcaec648-5c7d-46d8-8a80-3d4b38f7f1b1, I've identified the string pattern as (8 hexadecimal digits)-(4 hexadecimal digits)-(4 hexadecimal digits)-(4 hexadecimal digits)-(12 hexadecimal digits). Hence, I think as regex it should look like this:

\d{8}[0-9a-f]-\d{4}[0-9a-f]-\d{4}[0-9a-f]-\d{4}[0-9a-f]-\d{12}[0-9a-f] Right? Jonathan Groß (talk) 06:43, 2 June 2017 (UTC)

Nope, it should be [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}, which you can simplify to [0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12} (and even further but that wouldn't be readable). Matěj Suchánek (talk) 06:46, 2 June 2017 (UTC)
@Matěj Suchánek: Thanks! It's just like learning a new language: I have to think differently. Jonathan Groß (talk) 09:25, 2 June 2017 (UTC)
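
As a quick check of the patterns discussed above, here is a small Python sketch. The sample value and the three patterns are taken from this thread; the constant and function names are only illustrative, and `fullmatch` is used so that the pattern has to cover the whole string.

```python
import re

# Long and shortened forms from the reply above.
LONG_FORM = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
SHORT_FORM = re.compile(r"[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}")

# Pattern from the question. \d{8}[0-9a-f] means "eight decimal digits
# followed by one hexadecimal digit", which is not what was intended.
ORIGINAL = re.compile(
    r"\d{8}[0-9a-f]-\d{4}[0-9a-f]-\d{4}[0-9a-f]-\d{4}[0-9a-f]-\d{12}[0-9a-f]"
)

SAMPLE = "bcaec648-5c7d-46d8-8a80-3d4b38f7f1b1"


def is_valid_id(value):
    """Return True if value matches the shortened pattern in full."""
    return SHORT_FORM.fullmatch(value) is not None


print(is_valid_id(SAMPLE))                      # True
print(LONG_FORM.fullmatch(SAMPLE) is not None)  # True
print(ORIGINAL.fullmatch(SAMPLE) is not None)   # False
```

The quantifier applies to the token immediately before it, so the character class has to come first and the count second.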

Google Doodles

Today it is the 66th birthday of Gilbert Baker (Q4081194) and therefore he has his own Doodle on Google. I seem to remember there was a property for that, but I can't remember which one it was. For now I fixed it like this. Is there a better way to describe a Google Doodle event? Q.Zanden questions? 12:09, 2 June 2017 (UTC)

The urban population (city population).

Hello! There is a website https://www.citypopulation.de/ which displays data on the population of all countries, their regions and large settlements (data from several recent censuses or estimates). The website information is updated. It would be nice if all these data would be moved to Wikidata. (Original text discussion - https://www.wikidata.org/wiki/Wikidata:%D0%A4%D0%BE%D1%80%D1%83%D0%BC - Население городов) 14:11, 2 June 2017 (UTC) And S Yu (talk)

I think that there is a problem with the license: CC BY 3.0 and some data --ValterVB (talk) 16:54, 2 June 2017 (UTC)

A QUERY

After how many edits will we get paid by wikipedia and how much.  – The preceding unsigned comment was added by Sri sudershan (talk • contribs) at 16:28, 2 June 2017‎ (UTC).

No, Wikipedia is a volunteer-based project. The same applies to Wikidata. Q.Zanden questions? 16:29, 2 June 2017 (UTC)

Does petscan allow you to add items to watchlist?

Just wondering. MechQuester (talk) 18:37, 2 June 2017 (UTC)

I don’t think so. However, the tool output is typically convertible for raw import within a minute using some regex voodoo in a reasonable text editor. —MisterSynergy (talk) 18:51, 2 June 2017 (UTC)

Translatable help pages

I need template and translation experts to organize the Translatable help pages in Category:Property Translatable Help:

Since I am not that much experienced with translations and template programming, I need someone to help me with {{Property documentation/help page template}} (the help page template) in a way that it fulfills the categorization scheme shown above automatically. It should be almost fine right now, but I’d appreciate if someone with more experience could verify (or optimize) this.

There are also a couple of pages to re-categorize in Category:Property Translatable Help; I suspect that translation administrators need to mark those for translation again. Thanks! —MisterSynergy (talk) 14:01, 6 June 2017 (UTC)

I was thinking about this. I really like and endorse the idea that the headers and footers should be always and only used, so that we can let users focus on the help itself. Although you introduce syntax with <noinclude>, I would consider wrapping the help text in <onlyinclude>. Then we will be 100% sure the text is the only thing that gets transcluded; the rest wouldn't have to make sure it's visible. Matěj Suchánek (talk) 12:14, 7 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 16:16, 8 June 2017 (UTC)

Our policy of items about living people

I'm planning to write a RFC for the adoption of Wikidata:Living_people. Does anybody here think that the policy needs changes before it goes into the RFC? ChristianKl (talk) 21:58, 29 May 2017 (UTC)

Please refer to the comments on the previous RfC; and the question I put to you on the above page's talk page on 14 April 2017. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:27, 29 May 2017 (UTC)
Sorry, for not replying sooner, I answered you on the talk page. ChristianKl (talk) 08:31, 30 May 2017 (UTC)
The previous policy proposal was rejected not even two months ago. This new proposal has none of the provisions in it that will make a dent in the scale of the things as we have it and consequently it is as problematic. It considers several aspects in isolation but it forgets how controversial nationality is while ethnicity is included. No thank you, not at this time. Thanks GerardM (talk) 04:59, 30 May 2017 (UTC)
The previous RFC was rejected in August 2016. That's 8 months ago and not two. ChristianKl (talk) 08:31, 30 May 2017 (UTC)
That is only one of two arguments and imho the least relevant. I oppose this proposal because it does not include a way whereby we concentrate on the issues that are likely problematic. This is too heavy handed. Thanks, GerardM (talk) 08:35, 30 May 2017 (UTC)
i see the same adverse negative feedback. would you care to rework as a wikiproject and quality circle approach? do you have any history justifying "repeated or egregrious incidents by a user may lead to blocks"? if not, then you are bringing a solution where the facts are not indicated. Slowking4 (talk) 16:35, 30 May 2017 (UTC)
Can you be more specific about what you mean with "rework as a wikiproject and quality circle approach?" ChristianKl (talk) 17:53, 30 May 2017 (UTC)
w:Quality circle i.e. w:Wikipedia:WikiProject Women in Red and w:Wikipedia:Teahouse -- Slowking4 (talk) 18:05, 1 June 2017 (UTC)
@Slowking4: The WMF resolution is quite clear that every project in the Wikimedia universe should have its own explicit policy and having an explicit policy is good for the relationship to the individual Wikipedias. If you feel like there are changes to the draft that would improve the policy and reduce the chances for harm I invite you to make your changes, but I think sooner or later we will have an adopted policy. ChristianKl (talk) 11:36, 3 June 2017 (UTC)
@ChristianKl: When we are to do better re living people, fine. The big issue is that the way this proposal is put forward is like an edict with no indication on how this is to be implemented. When we are to improve our data, it is best done by concentrating on the known issues. These are not a subset of what is considered "problematic" but they are based on indicating the errors and having a path to improving the data.
I feel extremely uncomfortable by imposed policies that are only words and have no plan behind them. When the plan is nuke everything that does not comply, I am dead against it because our data set is limited as it is. So by comparing what exists in Wikipedias and other sources, finding the differences and NOT accepting them but curating them you have a way out of this mess. Thanks, GerardM (talk) 12:25, 3 June 2017 (UTC)

Protection against vandalism

Has Wikidata any protection against vandalism? I just found this obvious case: [1]. If Wikidata wants to be a useful reliable resource then it must have a kind of protection. --178.9.86.238 08:37, 1 June 2017 (UTC)

Protection exists. What we need is more patrolling. Sjoerd de Bruin (talk) 09:57, 1 June 2017 (UTC)
And more editors with rollback... Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:14, 1 June 2017 (UTC)
Wikidata has 30 times more "articles" than English Wikipedia. The latter struggles just to watch every page created.
  1. If we issue semi-protected flags manually before "vandalism", we will spend a lot of time.
  2. Who is able to speak that many languages in order to "lock" them all? If we lock languages independently it will result in up to (30*10^6)*(languages) records.
The simplest technical solution with reasonable involvement of human resources is to correct mistakes when necessary. d1g (talk) 08:59, 3 June 2017 (UTC)

How to send requests to https://tools.wmflabs.org/fatameh/add/*NUMBER* through account "PokestarFan" through a command interface/bot?

I know nothing about coding, or OAuth, but I have figured out how to send requests to URLs. However, fatameh needs OAuth, so I have no idea as to how to even append an OAuth token to a URL. Could someone help me? PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 20:07, 2 June 2017 (UTC)

This sounds like you want help implementing bot functionality without actually using a bot. This is not how it's supposed to be done. ChristianKl (talk) 16:57, 3 June 2017 (UTC)
I have a bot, but so far I have to create the script manually, then run it through the bot, and I have to use Chrome's "open" parameter. It takes too much CPU. PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 17:56, 3 June 2017 (UTC)

Modelling historical place names

We are starting a project to save historical Finnish place names in a Wikibase. In the project, we seek to come up with modelling solutions that can inform the same activities in Wikidata or make importing items from the repository straightforward.

The key design issue is whether or not to model the place names as items, that is, whether or not an individual place name will have its own URI. Here I present thoughts that are applicable to both Wikidata and the project, but I will write them from the perspective of the project.

Yes, the names should have their own items/URIs in the project:

  • Related national resources have unique identifiers for names, it would be natural to follow that in the historical place name repository project.
  • Modelling with individual items is more flexible than with properties only. There are not enough levels for complex definitions: Hiidenniemi (Q28031790) > name (P2561) > "Hiisi" (fi) >IPA transcription (P898) > "hi:si" > language of work or name (P407) > Q28031759. Most of the data can be expressed by qualifiers to the name property, but there is never room for a definition that requires a qualifier.
  • The solution in this external project can differ from Wikidata, there can be URIs in the historical names project and not in Wikidata. When importing to Wikidata only most relevant information will be imported.

No, the names should only be expressed as properties in the project:

  • If Wikidata would follow the same principle, there would be too many meaningless name items. Similar names should be kept separate, not merged. That could confuse the users.
  • In Wikidata most of the existing names are defined as textual properties already.

Both:

  • There should be ways to declare names both as textual properties and as individual items. Currently there is a guideline to use the property P794 (P794) to link to a possible place name item. Perhaps a new property?

Using lexemes:

  • Lexemes are not in use yet
  • It is too early and possibly technically complex to start using them in an external project
  • If used in Wikidata, the data from the external project can be mapped to lexemes?

Thanks for any assistance in thinking about this!

Cheers, Susanna Ånäs (Susannaanas) (talk) 07:53, 30 May 2017 (UTC)

I don't see the problem with qualifiers in your example. Anytime you have a definition, it's the definition of a concept that has a name. That concept warrants its own item.
What's your core motivation to have this project in your own Wikibase installation instead of having it in Wikidata? ChristianKl (talk) 08:27, 30 May 2017 (UTC)
Thanks for the answers!
  1. I have found native label (P1705) so ambiguous that I have reverted to name (P2561). And these items are not official name (P1448). Vernacular might be the best adjective. I will study the name properties more carefully, and the practices around them. Where would the latest discussion live?
  2. I will study your reply about the qualifier problem before I can reply!
  3. This project will act as the reference database. There are 3 million items, and the site will be set up to invite users to enrich and enhance the data. There will be duplicates etc. that are better cleaned before introducing to Wikidata. Also, the amount of items is such that it's better to proceed step by step. / Susanna Ånäs (Susannaanas) (talk) 09:20, 30 May 2017 (UTC)
I think a name with external identifiers would be ok as its own item rather than just embedded in a statement with qualifiers. We do have items for family names, etc. I don't think this is the same as a lexeme where there's language-specificity. ArthurPSmith (talk) 14:17, 30 May 2017 (UTC)
Thanks for your thoughts! I think one thing to avoid with unique identifiers would be that Pyhäjärvi (name) for "place A" would be the same item as Pyhäjärvi (name) for "place B". The reason to have the unique identifier would be to allow rich descriptions of this specific name for this specific place. The practice with last names would encourage thinking of these as items to be treated as one.
For the lexeme issue: I have a hunch it would be useful to model place names as lexemes, but it's definitely too early to apply the idea. The place names are language-specific par excellence, for example their conjugation is an art form of its own! – Susanna Ånäs (Susannaanas) (talk) 06:39, 4 June 2017 (UTC)

Hepburn romanization

For Japanese labels, which writing system should we use? There are several to choose from, including hiragana, katakana and Hepburn romanization. Can I use more than one? SharkD (talk) 10:26, 1 June 2017 (UTC)

@SharkD: there are properties Revised Hepburn romanization (P2125) and name in kana (P1814) that are used for romaji and kana. The Japanese label should include whatever kanji is in its official name (P1448). Mahir256 (talk) 17:34, 1 June 2017 (UTC)
@Mahir256: Thanks! Also, I am trying to add the Japanese name for Cry On (Q3005700), but I can't find the button to add a new label. Am I missing something? SharkD (talk) 18:20, 1 June 2017 (UTC)
@SharkD: Go to the top of the page and next to 'read' you can find 'Labels list'. There you can add 'ja' for Japanese and then the name. Good luck! Q.Zanden questions? 19:06, 1 June 2017 (UTC)
@QZanden: I don't have a 'labels list'. It just says 'Read', 'View history' and a star. SharkD (talk) 19:28, 1 June 2017 (UTC)
@SharkD: Go to your preferences and in the gadgets list enable 'labelLister'. Mahir256 (talk) 19:39, 1 June 2017 (UTC)
That works. But what language code do I use for Japan? JA, JPN or JP? Here is a list. (I think it's the correct list.) The documentation for the labelLister doesn't say. SharkD (talk) 20:22, 1 June 2017 (UTC)
You need ja for Japan. (If you would have read my previous comment, you would have known it.) Q.Zanden questions? 20:52, 1 June 2017 (UTC)
In addition to using the labelLister the other way is to add Japanese to your babelbox. ChristianKl (talk) 09:14, 4 June 2017 (UTC)

Statistics don't update anymore

https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaWIKIDATA.htm#editor_activity_levels had its last update in March. It's linked from https://www.wikidata.org/wiki/Wikidata:Statistics . Is there something to be done to fix the statistics? ChristianKl (talk) 12:22, 3 June 2017 (UTC)

The end of stats.wikimedia.org was announced on meta:Tech/News/2017/16 --Pasleim (talk) 14:12, 4 June 2017 (UTC)
@Pasleim:It seems like the new version isn't yet operational and we are at the moment without statistics. Is that assessment correct? ChristianKl (talk) 19:00, 4 June 2017 (UTC)
@ChristianKl: I don't know if it covers your needs, but you can use Wikiscan which has a lot of statistics about Wikidata editors. — Envlh (talk) 18:37, 5 June 2017 (UTC)

Tool to import citations?

Is there a tool that can convert Wikipedia references to Wikidata references? It is the single most time consuming activity I perform on Wikidata, and if the references use citation templates they are already structured. Thanks. SharkD (talk) 05:09, 4 June 2017 (UTC)

I think it would be difficult to create a tool like this, because there are many items that must be searched for, such as the book, the edition of the book, and the authors. For each item, it's hard to search for the item because of minor variations in spelling, abbreviation, or completeness. For example, is the title The Oxford Companion to the Year: An exploration of calendar customs and time-reckoning or just Oxford Companion to the Year? Is the first author "Blackburn, B." or "Bonnie Blackburn"? These searches are best performed by humans, not tools. Jc3s5h (talk) 12:13, 4 June 2017 (UTC)
Yeah, I didn't consider that. In that case, something to at least make copying/pasting easier would be great. I can resolve discrepancies manually. SharkD (talk) 13:01, 4 June 2017 (UTC)
You might be interested in the Drag'n'drop gadget at Special:Preferences#mw-prefsection-gadgets; it adds links next to the sitelinks section in the web interface which make Wikipedia article references accessible. This is likely the best you could have right now. In future (time frame ~2–5 years) there will be WikiCite, I have some hope that this will drastically improve the reference management. —MisterSynergy (talk) 18:47, 4 June 2017 (UTC)
How does this work? Do I need two browser windows? Is there documentation? Thanks. SharkD (talk) 15:33, 5 June 2017 (UTC)
Activate the gadget. You’ll then see extra links called “[ref]” after each connected sitelink, which opens an overlay in the same browser window containing the connected articles. Some elements of them can indeed be dragged and dropped into the item, such as (online) references and wikilinks. I don’t know of any documentation. —MisterSynergy (talk) 16:37, 5 June 2017 (UTC)

Merge tool doesn't merge dates as expected

When merging two items with the same publication date (P577), what I expect is that if both entities have the same date, the resulting entity will have the date only once with all the references combined, but it keeps both dates separately. I would expect this to happen when the precision of the dates is different, but I don't get why they are not merged together in this example. -- Agabi10 (talk) 12:53, 4 June 2017 (UTC)

Yes, the tool can be error-prone if nothing is done beforehand.
Such conflicts are resolved manually (remove the duplicate/wrong claim from one item, then perform the merge). d1g (talk) 13:13, 4 June 2017 (UTC)
Because of their internal representation: one is 1975-00-00 and one is 1975-01-01. Matěj Suchánek (talk) 13:27, 4 June 2017 (UTC)
@Matěj Suchánek: Is there any feasible way of fixing the internal representation of the year and month precision dates? That would fix the problem and it would decrease the number of duplicate dates that are added... -- Agabi10 (talk) 13:36, 4 June 2017 (UTC)
I believe KrBot was doing this at some point in the past. It's unclear to me, however, which one is correct; maybe that's why it's no longer done by the bot. Note that the first step should be to prevent one of the representations from being added. Matěj Suchánek (talk) 14:13, 4 June 2017 (UTC)
@Matěj Suchánek: Based on this diff I would say that the correct one is the one with this format 1975-00-00 instead of the other. I created that one using the GUI. If anything is creating entities in the other format it probably is a bot. -- Agabi10 (talk) 14:47, 4 June 2017 (UTC)
The format depends on how you type it in. This diff was also created using the GUI by typing "1.1.2017" and then changing the precision to year. --Pasleim (talk) 14:53, 4 June 2017 (UTC)
Yes, but the one I did is using the automatic detection for the precision, the one of your diff is probably a bug that should be fixed... maybe... -- Agabi10 (talk) 15:30, 4 June 2017 (UTC)
The data model description mediawikiwiki:Wikibase/DataModel/JSON states 'That is, 1988-07-13T00:00:00 with precision 8 (decade) will be interpreted as 198?-??-?? and rendered as "1980s". 1981-01-21T00:00:00 with precision 8 would have the exact same interpretation. Thus the two dates are equivalent, since year, month, and days are treated as insignificant.'
Thus, a tool that treats characters that are insignificant as being significant is faulty, and the merge software should be fixed. Jc3s5h (talk) 11:59, 5 June 2017 (UTC)
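
The comparison described in the quoted data model text can be sketched in Python as follows. This is a simplified illustration, not the merge tool's code: the function names are hypothetical, only precisions 8 to 11 (decade, year, month, day) are handled, and calendar model, time zone and the before/after fields are ignored.

```python
import re

# Wikibase precision codes handled here: 8 = decade, 9 = year, 10 = month, 11 = day.
_TIME = re.compile(r"^([+-]?)(\d+)-(\d{2})-(\d{2})T")


def significant_part(time_value, precision):
    """Return only the components of a time string that matter at this precision."""
    match = _TIME.match(time_value)
    if match is None:
        raise ValueError(f"unrecognised time value: {time_value!r}")
    sign = match.group(1) or "+"
    year, month, day = (int(match.group(i)) for i in (2, 3, 4))
    if precision == 8:
        return (sign, year // 10)
    if precision == 9:
        return (sign, year)
    if precision == 10:
        return (sign, year, month)
    if precision == 11:
        return (sign, year, month, day)
    raise ValueError(f"precision {precision} is not handled in this sketch")


def same_date(a, b, precision):
    """Compare two time strings, ignoring components below the given precision."""
    return significant_part(a, precision) == significant_part(b, precision)


# The two representations from this thread, at year precision.
print(same_date("+1975-00-00T00:00:00Z", "+1975-01-01T00:00:00Z", 9))   # True
# The same strings compared as full days differ.
print(same_date("+1975-00-00T00:00:00Z", "+1975-01-01T00:00:00Z", 11))  # False
# The decade example from the data model description.
print(same_date("1988-07-13T00:00:00", "1981-01-21T00:00:00", 8))       # True
```

A merge that compared values this way would treat 1975-00-00 and 1975-01-01 at year precision as one statement and could combine their references.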

Request for bureaucrat

Dear all,

there is an ongoing request for bureaucrat. Please express your opinion about the candidate at the request page. The most recent request for bureaucrat in January failed due to a lack of quorum. Although there was 80% support the total number of support votes was too low to make it pass. The quorum for a bureaucrat request is nearly double that of an admin request. The current candidate also seems to have a decent percentage of support, but there has been no new input for over a day. As such I would like to ask everyone to express their opinion about the candidate at the request page.

Sincerely, Taketa (talk) 13:18, 4 June 2017 (UTC)

Non English terms/words used as the English label

If an item has no English name, should a name in another language be used as the English label? For instance Kimi ga Yobu, Megido no Oka de (Q847161). Thanks. SharkD (talk) 18:55, 4 June 2017 (UTC)

In the case of a creative work, if there is no standard English translation available, I would use the non-English title for the label, as you have done here. - - PKM (talk) 22:16, 4 June 2017 (UTC)
Thank you. SharkD (talk) 14:10, 5 June 2017 (UTC)

Twin buildings

Hi. Looking at an item like Petronas Towers (Q83063) (a twin tower), could someone help guide me with regards to:

  • How to add individual information for each tower (like height, date of official opening, floors, etc)
  • If it is possible to add more than just two towers, such as 3 or more identical towers

I'm editing articles relating to buildings, and I often come across such multi-tower complexes which have slightly varying information for each tower. Thanks in advance! Rehman 23:11, 4 June 2017 (UTC)

@Rehman: See World Trade Center (Q11235). It uses has part (P527) -- LaddΩ chat ;) 01:07, 5 June 2017 (UTC)
I see. Thanks! So that means, a separate page should be created for each tower, which has the needed specifics? As opposed to things like height/floors/etc to be all displayed on one page for all towers... Rehman 04:09, 5 June 2017 (UTC)
@Rehman: That is correct. You may want to check here first to see if CTBUH has an identifier for a tower you want to add. Mahir256 (talk) 04:28, 5 June 2017 (UTC)
Many thanks! I'll look at that link. Have a good day. Rehman 04:30, 5 June 2017 (UTC)

Why don't we get any info on USA or America when using the API

Going to this link returns missing="": https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=USA&props=descriptions%7Cclaims&languages=en  – The preceding unsigned comment was added by 103.205.152.154 (talk • contribs) at 10:55, 5 June 2017‎ (UTC).

"USA" is not a valid title on Wikidata. You have to use the Q numbers as titles, e.g. https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=Q30&props=descriptions%7Cclaims&languages=en --YMS (talk) 09:12, 5 June 2017 (UTC)
But what if I don't know the Q number of the item and only know the alias?  – The preceding unsigned comment was added by 103.205.152.154 (talk • contribs) at 11:34, 5 June 2017 (UTC).
You have to use the correct enwiki page title, not redirects; https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=United_States&props=descriptions%7Cclaims&languages=en works for instance. —MisterSynergy (talk) 09:47, 5 June 2017 (UTC)
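For anyone scripting these lookups, here is a small sketch that builds the request URLs discussed above without sending anything over the network. The function names are made up for illustration. The title-based form mirrors the working URL given above; the ID-based form uses the `ids` parameter of wbgetentities, which is the usual way to ask for an entity when the Q number is already known.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_url_by_title(title, site="enwiki", language="en"):
    """Build a wbgetentities URL that looks up an item via a sitelink title.

    The title must be the real page title on the given site, not a
    redirect such as "USA".
    """
    params = {
        "action": "wbgetentities",
        "format": "json",
        "sites": site,
        "titles": title,
        "props": "descriptions|claims",  # "|" is percent-encoded as %7C
        "languages": language,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def entity_url_by_id(entity_id, language="en"):
    """Build a wbgetentities URL for a known entity ID such as "Q30"."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": entity_id,
        "props": "descriptions|claims",
        "languages": language,
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

`entity_url_by_title("United_States")` reproduces the URL from the reply above; the result can be passed to any HTTP client.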

P2593 talk page error

Can anyone see what's up at Property talk:P2593? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:16, 10 June 2017 (UTC)

  Done Special:Diff/499344303. Matěj Suchánek (talk) 11:17, 11 June 2017 (UTC)
Still showing "Lua error in Module:Wikidata at line 648: bad argument #1 to 'pairs' (table expected, got string).". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:33, 11 June 2017 (UTC)
Not for me. I think you need to do the purge trick. Stryn (talk) 14:37, 11 June 2017 (UTC)
Yes. Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:31, 11 June 2017 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:31, 11 June 2017 (UTC)

Breaking change: "wb_entity_per_page" table will not be updated and replicated on ToolLabs anymore

Hello all,

This is an important message to all the people running external tools.

On July 12th, we are going to stop updating the wb_entity_per_page table from the Wikibase database and stop its replication on ToolLabs. At a later point we will remove it completely.

wb_entity_per_page was a secondary database table, mapping Wikibase entity IDs (e.g. "Q42") to MediaWiki page IDs (e.g. 138, which can be seen at https://www.wikidata.org/wiki/Q42?action=info). wb_entity_per_page stored entity IDs as numbers, while page titles are always full entity IDs.

This mapping existed because Wikibase was designed with the possibility to have entity pages where the ID does not match the title. This idea was never used, and finally removed in 2015 (documented here). We decided to get rid of the table because it contains outdated information that could mislead users, it costs resources and could conflict in the future with our new entity types for lexicographical data.

Please check if you are maintaining any code that accesses the wb_entity_per_page table, and replace it with lookups to MediaWiki's page and redirect tables.

We will drop the replica of the table on ToolLabs for test.wikidata.org on June 28th. We will do the same for wikidata.org on July 12th.

If you have any question or issue, feel free to add a comment on the ticket or to ping me. Thanks for your understanding Lea Lacroix (WMDE) (talk) 12:51, 1 June 2017 (UTC)

@Lea Lacroix (WMDE): This seems to break 89 queries in my query directory on toollabs. Not very thrilled about that. So how am I supposed to join the page table with wb_items_per_site and wb_terms? Multichill (talk) 19:49, 1 June 2017 (UTC)
wb_terms has new term_full_entity_id. Matěj Suchánek (talk) 06:21, 2 June 2017 (UTC)
@Matěj Suchánek, Lea Lacroix (WMDE): I can't yet see term_full_entity_id. Is it already deployed? Will there also be ips_full_entity_id on wb_items_per_site? --Pasleim (talk) 10:07, 6 June 2017 (UTC)
A few points to clarify:
  • term_full_entity_id is not yet deployed. We have decided to push back the removal of wb_entity_per_page until after term_full_entity_id has become available, so people only have to change their code once.
  • We currently have no plan to change wb_property_info or wb_items_per_site to use full entity IDs. They will keep using numeric IDs for now. If you need to JOIN against them, you will have to use CONCAT or SUBSTR (which will not be good for performance, depending on the query). Please let us know your concrete use cases, so we can try to find a solution. Please also consider using the MediaWiki API or the Query Service as an alternative.
  • Any code that currently uses wb_entity_per_page to find the page title for a given entity can simply skip this step now. Wikibase now guarantees that the entity's page title can always be computed from the entity ID; for now, the title will always be the ID itself. You will have to know which entity type corresponds to which namespace, though.
  • In general, using the database directly is always a tradeoff: you gain querying power, but you lose stability. The internal DB schema is not designed or intended to be as stable as an external API. We are aware of the cost of such changes to tool authors (and DBAs), so we are trying to keep them to a minimum. But in the end, the DB schema is an internal structure, and not designed to provide backwards compatibility.
  • The move away from numeric IDs is driven by the need for supporting entities that do not have simple numeric IDs: The new Lexeme entity type will contain sub-entities, Forms and Senses, which are stored on the same page as the Lexeme, and use structured IDs (e.g. L762343-F6) for addressing. We could not fit these into the existing database tables.
In general: please tell us in detail what you are using wb_entity_per_page for, and why you decided to use it over some other method of getting the information you need. This will help us to know how to best support the migration, and minimize breakage. -- Daniel Kinzler (WMDE) (talk) 15:04, 6 June 2017 (UTC)
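For tool authors wondering what "compute the title from the ID" could look like in practice, here is a minimal sketch. The helper names and the two lookup tables are illustrative assumptions based on how wikidata.org is set up (items in the main namespace, properties under "Property:"); other Wikibase installations may differ, and nothing here is taken from the Wikibase code.

```python
# Assumed mappings for wikidata.org; adjust for other installations.
ID_PREFIX_BY_TYPE = {"item": "Q", "property": "P"}
TITLE_PREFIX_BY_ID_PREFIX = {"Q": "", "P": "Property:"}


def full_entity_id(entity_type, numeric_id):
    """Turn a numeric ID (as in wb_items_per_site) into a full ID like "Q42"."""
    return f"{ID_PREFIX_BY_TYPE[entity_type]}{int(numeric_id)}"


def numeric_entity_id(full_id):
    """Turn "Q42" into 42 for joins against tables that keep numeric IDs.

    Structured IDs such as "L762343-F6" have no single numeric form,
    so they are rejected with ValueError.
    """
    prefix, rest = full_id[:1], full_id[1:]
    if prefix not in TITLE_PREFIX_BY_ID_PREFIX or not rest.isdigit():
        raise ValueError(f"not a plain item or property ID: {full_id!r}")
    return int(rest)


def page_title(full_id):
    """Compute the prefixed page title for an item or property ID."""
    numeric_entity_id(full_id)  # validates the ID shape
    return TITLE_PREFIX_BY_ID_PREFIX[full_id[:1]] + full_id
```

Doing this conversion in the tool rather than with CONCAT or SUBSTR inside the SQL query keeps the join columns untouched, which may help with the performance concern mentioned above.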

list based on Q21073501?

Q21073501

I don't understand reference to list here.

97 tools are here. d1g (talk) 01:59, 3 June 2017 (UTC)

This shows it much better. I fail to appreciate why this is a subclass. Please explain. Thanks, GerardM (talk) 17:46, 3 June 2017 (UTC)
  • There is probably a better way to do these items, but I wonder if there isn't a confusion (or at least an odd translation) involved with use (P366) ("use" in English) and uses (P2283) ("uses" in English). Tools for Wikidata generally have uses (P2283)="Wikidata", as do other applications, but their use (P366) is more or less limited to maintaining Wikidata or, at least, it includes it. One can't just replace "tool for Wikidata" with "uses=Wikidata".
    --- Jura 14:31, 5 June 2017 (UTC)
P31 isn't the best place to store information about use (P366).
We need to create items for every aspect of data model and use them in P366.
Actions for Wikidata editor (Q28859215)    . d1g (talk) 18:24, 6 June 2017 (UTC)

Vandalism

Could a few more editors add pages to their watchlists from topviews? It's very surprising to me how blatantly obvious vandalism (e.g. changing labels to "caca" or other nonsense) manages to stay on important/high-profile items for days, and there should ideally be more editors dealing with it. (It's also useful to get lists of pages from the large Wikipedias and convert them to Wikidata item IDs with QuickStatements v1.) Thanks, Jc86035 (talk) 09:24, 4 June 2017 (UTC)

We should allow semi-protecting the top 10 items by monthly views. Perhaps even by bots. d1g (talk) 11:32, 4 June 2017 (UTC)
@d1gggg: I think top 10,000 might be better, considering the amount of vandalism that goes unnoticed… Jc86035 (talk) 12:51, 4 June 2017 (UTC)
I actually confused columns "views" and editors.
Yes, the first 1000 items have 100k views, which is huge.
I'm not sure where exactly to stop after the first 1000. d1g (talk) 13:02, 4 June 2017 (UTC)
@d1gggg: To be honest I think it's probably a higher priority to improve recent changes tools, since so many edits even on the more important items manage to go unnoticed, but protecting pages with somewhere around 10 views or more per day should work. (Topviews wouldn't work for this since it only goes to 696 for Wikidata, for some reason.) Jc86035 (talk) 13:37, 4 June 2017 (UTC)
I don't think it's a good idea to semi-protect all those items. It raises the barriers to entry for new users. ChristianKl (talk) 13:08, 5 June 2017 (UTC)
Registration is possible on every modern site. The current CAPTCHA is readable.
The biggest question is whether we want to allow semi-anonymous edits (using an IP). Other projects, e.g. Wikipedia, might want this, but why Wikidata? d1g (talk) 13:27, 5 June 2017 (UTC)
You need more than just registration to be able to post on a semi-protected page. You also need to be autoconfirmed. ChristianKl (talk) 14:32, 5 June 2017 (UTC)
@ChristianKl: The main problem is that vandalism stays unnoticed for ages. I have no idea how many people are patrolling recent changes, but I've had to revert really obvious four-month-old vandalism (on labels etc.) which just doesn't get noticed by anyone and stays there. Granted, it only happens on a small minority of items, and many logged-out editors contribute constructively, but it's not very good and the flood of blatant vandalism has to be dealt with at some point. Semiprotection is just one option. Jc86035 (talk) 16:28, 5 June 2017 (UTC)
I don't think you understand the point. Semi-protecting not only prevents logged-out users from editing. It also prevents new users with registered accounts from editing. Preventing new users from contributing isn't a valid solution to the problem of vandalism. ChristianKl (talk) 16:58, 5 June 2017 (UTC)
@ChristianKl: To be clear, I think mass semiprotection isn't the best way of dealing with vandalism, and from a more Wikipedia-centric perspective it would go against allowing anyone to edit and would prevent page infoboxes from being modified by logged-out users and non-autoconfirmed users for no reason. Jc86035 (talk) 04:22, 6 June 2017 (UTC)
@ChristianKl: oh, then it should be something less restrictive than semi-protection. d1g (talk) 17:34, 6 June 2017 (UTC)

Genres

Would it be okay to create a separate type of genre for video games? You have the literary genres such as "mystery" or "tragedy" already. But games are usually categorized based on their mechanics. For instance, the difference between a "strategy" or a "shooter" game. (A game of course could be both a "strategy" and a "shooter" game at the same time.) We could call it a "gameplay genre". Thanks. SharkD (talk) 18:43, 4 June 2017 (UTC)

If a game is_a shooter game then instance of (P31) might be fitting. If you think we need a new property, you can create a property proposal and make your case why we need a new property: https://www.wikidata.org/wiki/Wikidata:Property_proposal ChristianKl (talk) 18:47, 4 June 2017 (UTC)
Thanks for the link. SharkD (talk) 18:56, 4 June 2017 (UTC)
It seems there is already a video game genre (Q659563). However, I am not able to add it as a statement to video games. For instance Ghostbusters: The Video Game (Q1514853). Can this be changed? SharkD (talk) 14:08, 5 June 2017 (UTC)
video game genre (Q659563) is an item and not a property. The ID's of properties start with P and the ID's of items with Q. Properties get only created after a successful property proposal. ChristianKl (talk) 19:53, 5 June 2017 (UTC)
Thanks for the correction. I guess I'll need to find a different way to filter the different types of genre. SharkD (talk) 20:49, 5 June 2017 (UTC)

Property for league of player

Is there a property which designates the sport league a player played in? I am aware of league (P118), but this is for clubs and teams. It would be nice to have a property like this for generating lists of, e.g., players who played in the Premier League etc. Steak (talk) 10:56, 5 June 2017 (UTC)

Please correct me if I'm wrong but if a team plays in a league and a player plays for that team at that time, they play in that league. So queries could use member of sports team (P54)/league (P118). Matěj Suchánek (talk) 06:15, 6 June 2017 (UTC)
You are wrong, this conclusion is not necessarily true. Suppose a club plays in the domestic league and a (national or international) cup in one season. If a player is a member of that club in that season, there is no information on whether this player played in the league, in the cup or in both. Steak (talk) 13:06, 6 June 2017 (UTC)
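For reference, the P54/P118 property path suggested above could be written as the following query template. This is a sketch only: the function name is hypothetical, and, as the objection above points out, the result lists players of teams that play in the league, which is not proof that each player appeared in it.

```python
def players_in_league_query(league_qid, limit=100):
    """Return a SPARQL query using the member of sports team / league path.

    league_qid is the item ID of the league, supplied by the caller.
    The query over-approximates: it matches any player whose team has
    the given league (P118), regardless of when or whether they played.
    """
    if not (league_qid.startswith("Q") and league_qid[1:].isdigit()):
        raise ValueError(f"not an item ID: {league_qid!r}")
    return (
        "SELECT DISTINCT ?player ?playerLabel WHERE {\n"
        f"  ?player wdt:P54/wdt:P118 wd:{league_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )
```

The returned string can be pasted into the query service; narrowing it with qualifiers on the team membership would be needed to address the cup-versus-league problem.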

Contemporary fiction

There are tags for science fiction, historical fiction, fantasy, etc. Is there one for fiction set in the modern day as well? Thanks. SharkD (talk) 15:39, 5 June 2017 (UTC)

To indicate the genre of a work of contemporary literature you can use this item: Q5165031. Valentina.Anitnelav (talk) 07:34, 6 June 2017 (UTC)

How to prevent users from adding values that should not be there

 
The coat of arms of the City of Malmö, but NOT the urban area of Malmö.

Yesterday, I had a discussion on Talk:Q54339 about the removal of claims such as coat of arms and official website from items like Trollhättan (Q54339). The same could be said about such things as sister city and head of government. If I simply remove such claims, automatic tools tend to add them again. If I add "novalue" here, that happens less often. But it still happens sometimes. What is the best way to add a source or qualifier saying that this "novalue" should not be removed? -- Innocent bystander (talk) 07:09, 6 June 2017 (UTC)

We should be able to chat with bot writers and get changes made, though I have found that they can be indignant, indifferent, or silent to pings on items. It certainly is an issue where bad information is propagated by bot (re)addition. I would agree on your approach of "novalue" where it should be left empty.  — billinghurst sDrewth 08:19, 6 June 2017 (UTC)
Keep "no value" claims.
Wikipedias could be slow to update, wrong values would be added again until every wiki is fixed.
Not a problem of bot owners (e.g. infobox importing tools could copy wrong values again) d1g (talk) 08:29, 6 June 2017 (UTC)
Yes, infobox importing tools are a larger problem than bot owners. The use of "novalue" often prevents adding of data that way.
One problem here is: How do I source or note that "novalue" is the only correct statement? It would be helpful to tell that such information is more valid in items like "Municipality of Gothenburg" and/or "City of Gothenburg" than in the urban area of Gothenburg. -- Innocent bystander (talk) 08:45, 6 June 2017 (UTC)
Use rank and add source. Matěj Suchánek (talk) 09:34, 6 June 2017 (UTC)
@Matěj Suchánek: What rank and what source? If Italian Wikipedia in some page version says that Tokyo was ruined in 1954 by Godzilla, that is not of value even for a claim with deprecated rank. Why should I even look for sources that say otherwise? Some Wikipedia versions and Wikidata are the only places where you can read that Malmö urban area has a coat of arms. Statistics Sweden is the only authority on Swedish urban areas. They say nothing about coats of arms and web pages. How do I prove that? -- Innocent bystander (talk) 10:11, 6 June 2017 (UTC)
(edit conflict) My thoughts:
  • We are bound to build our data from multiple sources, not only the highest authorities. But we do decide if we can trust their data.
  • If they do say nothing, they cannot even say that it doesn't exist.
  • Marking values imported from Wikipedia as deprecated, not deleting them, unless they are obviously wrong, is possible.
  • Statement that "novalue" is the only correct statement needs to be supported.
Maybe others have different opinions, which I will be glad to hear. Matěj Suchánek (talk) 10:57, 6 June 2017 (UTC)
Well, if some source tells you "Malmö" has File:Malmö fulla vapen.svg as COA, that is a correct claim as such, but which item at Wikidata is (s)he then talking about? That claim is valid in Q10576166 and Q503361, but not in Q2211. Parts of the urban area of Malmö are located in Burlöv Municipality, which has File:Burlöv vapen.svg as COA. In the same way, if somebody tells you there has been a terror attack in Paris, you have to check which item you should add that to. One option is Paris (Q13107162), but that would most likely not be correct. -- Innocent bystander (talk) 13:22, 6 June 2017 (UTC)
Maybe we can set constraints to signal that an urban area has no coat of arms? Afterwards the tools shouldn't add constraint violating claims. ChristianKl (talk) 10:42, 6 June 2017 (UTC)
Property talk:P775 already has "conflicts with sister city" as a constraint! P6, COA and "official web page" can be added too. I have only added sister city so far because I intended to move such claims to the proper item. It was a very hard job, since I found it very hard to find good sources for such claims. -- Innocent bystander (talk) 13:22, 6 June 2017 (UTC)

Adding new interwikis way too difficult to figure out

I created an article "Arnel Pineda" in the Finnish Wikipedia. Then I wanted to add interwiki links to the sidebar (knew there was a corresponding article in the English Wikipedia). Adding the interwikis failed through clicking the link in the sidebar in the Finnish article. That was a way to create a new Wikidata item. Who would figure that out? Please, put some red box there that says "you are creating a new item. are you sure? please, try to find if the item already exists before you proceed" Then I went to the English Wikipedia and clicked the link there in the corresponding article's sidebar and that was the way to go, but the mechanism to add an entry was way hidden. It should be way more visible and not just a line at the end of the list. You could add a colored box around the last line and the text "ADD NEW ENTRY". The box could be green. Just common sense IMO, these are too user-unfriendly because everything is white and small icons etc. Please, add some red and green boxes to help people figure out. --Hartz (talk) 07:38, 6 June 2017 (UTC)

Wikidata weekly summary #263

Exclude redirects and deleted items from query service results

It happens occasionally that deleted items or merged redirects appear in query service results weeks and months after deletion/merging action was performed. What can I do to remove those items from results sets? Would purging potentially help? —MisterSynergy (talk) 15:03, 6 June 2017 (UTC)

Report them to Smalyshev (WMF). Full database reloads happen a few times per year. Sjoerd de Bruin (talk) 17:35, 6 June 2017 (UTC)
any reason for MINUS { ?item owl:sameAs [] } not to work? 1 d1g (talk) 17:42, 6 June 2017 (UTC)

I now left a message on Stas’ talk page. The MINUS hack might help for redirects, but theoretically they should not appear at all in the results. It would not help for deleted items at all. —MisterSynergy (talk) 17:46, 6 June 2017 (UTC)

"point in time" vs. last update (P585)

Hey folks, currently the property point in time (P585) is used for two purposes at the same time: On the one hand, it defines a point of time when an event took place (see the examples on the talk page, e.g. 2012 United States presidential election (Q4226) → November 6, 2012), and on the other hand it is used as a qualifier to determine when a statement was true or last updated (for example the population of a city or the number of goals a soccer player has scored – these data are subject to frequent changes, so it is important to state when the given information was true or last updated respectively). The property is used more and more for the first purpose, while its original intention was the latter (see property proposal: here). Especially for finding an appropriate label for the property in the various languages, the specific use of the property is important. In English, for example, the original label was as of to determine when a statement was true or last updated, but with the usage to determine when an event took place, it was changed to point in time. The same issue we're facing for the German label and probably for all other languages also. Therefore, I think it might be reasonable to create another property to determine when a statement was true (e.g. as of, last update or something like this) and so use two properties for those two purposes. As this would be quite a big change (the property is used extremely often), I thought a prior discussion here would be reasonable before requesting the creation of the new property. What do you think? Yellowcard (talk) 10:08, 25 May 2017 (UTC)

There are really three cases: "as of" implies that the truth of a statement was checked at a particular time. "Date of an arbitrary event" could be applied to most anything that doesn't have a devoted property for that kind of event, such as date of birth (P569). And the range of time when a status is known to be true (in the sense that it became true on a known date, and became false on a later known date) could be indicated with start time (P580) and end time (P582).
If we're going to fix some of this stuff, it might be worth pointing out that the example cited by Yellowcard, "2012 United States presidential election (Q4226) → November 6, 2012)" is false because Wikidata dates are interpreted as w:Universal Time (UT) and, in UT, the polls were open in the USA on November 5, 6, and 7. This problem could be solved by implementing my proposal on Mediawiki, that is, interpret Wikidata dates as local time rather than UT. Jc3s5h (talk) 11:42, 25 May 2017 (UTC)
Much that would require "as of" is covered by retrieved (P813). - Brya (talk) 16:36, 25 May 2017 (UTC)
I wonder if "as of" is at all translatable into Swedish. -- Innocent bystander (talk) 19:10, 25 May 2017 (UTC)
@Innocent bystander: I would propose "per" as a Swedish translation of "as of". Example: "The Sweden population as of December 31, 2016 was..." -> "Sveriges befolkning per 31 december 2016 var..." --Larske (talk) 16:14, 26 May 2017 (UTC)
@Larske: That would work, but 'per' is not much of a property label. It looks as generic as 'som' and 'av'. -- Innocent bystander (talk) 06:43, 27 May 2017 (UTC)
When I see a case like this, I think it might be worth to have a more formal process for changing the label and description of a property. ChristianKl (talk) 16:59, 25 May 2017 (UTC)

Is there a consensus that we change the "as of" uses of P585 to retrieved (P813)? Does it have to be discussed on a broader level somewhere, as it might cause some applications / modules to break? Yellowcard (talk) 10:57, 28 May 2017 (UTC)

No. "As of" means a certain status was true on the date stated. "Retrieved" means the date information was looked up in a source. One could write that as of May 20, Donald Trump was in Saudi Arabia, and that the information was retrieved from https://www.whitehouse.gov/potus-abroad on May 30. Putting it another way, the "as of" date is based on what the source says, and the "retrieved" date is based on the calendar of the editor who obtained the information from the source. Typically, but not always, the editor would add it to Wikidata right away, but if there is a delay in adding the information, the retrieved date would be earlier than the date of the edit that adds the information. Jc3s5h (talk) 11:57, 30 May 2017 (UTC)
Good point, Jc3s5h. Would you prefer creating a new property, then? Yellowcard (talk) 17:26, 6 June 2017 (UTC)
After looking at Help:Sources#Web page I believe the necessary properties already exist. point in time (P585) is satisfactory both for the date an event took place, and for the date a certain status (such as population) was true. The best way to show when a source made a statement is to make sure the item for the source contains the publication date (P577) property, but if that cannot be determined, retrieved (P813) can be added as a qualifier to the reference. Also, if one is using reference URL (P854) rather than stated in (P248), one can add publication date (P577), retrieved (P813), or both as qualifiers in the reference. Jc3s5h (talk) 19:59, 6 June 2017 (UTC)

Soccer data for wikidata

Hi, we are a new startup; our core product is a soccer stats center: scorum.co.uk

We want to make a bot for Wikidata which will update soccer stats. We can start with the Japanese J1 League (2017 J1 League) and update the league table and top scores. You can check if the stats are correct on our website at https://scorum.co.uk/football/tourneys/1103-2/j-league. It would be much easier for you, as you won't need to update them manually and all the data will be up to date. MaybeVlad (talk) 08:03, 27 May 2017 (UTC)

Delusion23
WFC
happy5214
Fawkesfr
Xaris333
A.Bernhard
Cekli829
Japan Football
HakanIST
Jmmuguerza
H4stings
Unnited meta
محمد آدم
Wolverène
Grottem
Petro
Сидик из ПТУ
Sakhalinio
Gonta-Kun
CanadianCodhead
  Notified participants of WikiProject Association football
John Vandenberg (talk) 04:17, 28 November 2013 (UTC)
Bill william compton (talk • contribs • logs)
--►Cekli829 23:32, 31 January 2014 (UTC)
VicVal (talk) 17:14, 24 October 2015 (UTC)
AmaryllisGardener talk 19:03, 24 April 2016 (UTC)
Tubezlob (🙋) 16:06, 6 October 2016 (UTC)
Sannita - not just another it.wiki sysop 17:24, 14 July 2017 (UTC)
Jmmuguerza (talk) 03:34, 24 August 2017 (UTC)
MisterSynergy
Xaris333
Migrant
Mad_melone
Сидик из ПТУ

  Notified participants of WikiProject Sport results

I pinged linked projects. Tubezlob (🙋) 08:56, 27 May 2017 (UTC)
@MaybeVlad: Hi, first I think that we should create Wikidata properties to link your website to Wikidata (matches, teams, persons, tourneys) in order that you can use our data.
Then you have to create items for each match (with the property Scorum match ID) and import your data. For example, you can see this item: France v Romania (Q24201656) (Group A match of UEFA Euro 2016).
You can find information about bots here and about how to create one here.
Sincerely, Tubezlob (🙋) 12:40, 27 May 2017 (UTC)
Thanks for reply) Can we start without final community decision about the topic?) MaybeVlad (talk) 13:27, 27 May 2017 (UTC)
@MaybeVlad: No, you have to request permission here and explain exactly what you want to do (which properties you want to use, etc.). For the properties, the rule is to wait one week after the request before creating them. The four properties can be requested here.
Tubezlob (🙋) 13:55, 27 May 2017 (UTC)
This discussion has been too passive. If this task is approved, it will affect a lot of items, with multiple statements changed and a lot of edits per item, so more input is needed. Maybe we should ping the en.wiki Footy project? XXN, 15:27, 28 May 2017 (UTC)
"lot of edits per each item": not always, it depends on the bot. It's possible to add a lot of statements, labels, references etc. with one edit. --ValterVB (talk) 18:10, 28 May 2017 (UTC)
Yes, but items will be edited frequently, generally on a weekly basis (a common span between league rounds in almost all countries), so everything should work well and be agreed upon. --XXN, 20:38, 28 May 2017 (UTC)

I'd like to see the list of all entities on an item page which are planned to be updated. Maybe some property is missing and a new one should be created. XXN, 20:38, 28 May 2017 (UTC)

@MaybeVlad: Do you think about updating the data of soccer players, add items for matches or what exact purpose do you have in your mind? Potentially, this could be a great help. Yellowcard (talk) 17:29, 6 June 2017 (UTC)

@Yellowcard: yes, we want to update all data for soccer. Add items not only for matches but for players, teams etc. Now we are looking for a developer, to do all of this. I think we will start in one week. MaybeVlad (talk) 06:48, 7 June 2017 (UTC)

Not sure every match from every league is notable enough to create items for them, but at least already existent items certainly could be updated. Even 'popular' items like this could be greatly improved (number of matches played, goals scored, participating teams, etc). XXN, 13:36, 7 June 2017 (UTC)

@XXN: I think that all matches of notable leagues are notable. "It refers to an instance of a clearly identifiable conceptual or material entity" and " it can be described using serious and publicly available references" (Wikidata:Notability). It is sure that it could be very interesting if it's done correctly. Tubezlob (🙋) 13:57, 7 June 2017 (UTC)

Reporting errors in external databases

T.seppelt (talk) 21:00, 18 February 2016 (UTC) Vladimir Alexiev (talk) 11:59, 13 March 2017 (UTC) GerardM (talk) 15:58, 26 March 2017 (UTC) Jonathan Groß (talk) 17:52, 26 March 2017 (UTC) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits Jneubert (talk) 13:47, 29 April 2017 (UTC) Framawiki (please notify !) (talk) BrillLyle (talk) 10:09, 10 July 2017 (UTC) Sic19 (talk) 20:42, 12 July 2017 (UTC) Wikidelo (talk) 21:15, 8 May 2018 (UTC) salgo60 Salgo60 (talk) 07:09, 10 June 2018 (UTC) ArthurPSmith (talk) 19:52, 22 August 2018 (UTC) PKM (talk) 19:40, 23 August 2018 (UTC) Ettorerizza (talk) 06:44, 8 October 2018 (UTC) Fuzheado (talk) 03:47, 19 December 2018 (UTC) Daniel Mietchen (talk) 16:30, 7 April 2019 (UTC) Eihel (talk) 15:13, 19 June 2019 (UTC)

  Notified participants of WikiProject Authority control

Do you know any places that help in reporting errors in external databases to their maintainers? For a long time now I have reported errors in Union List of Artist Names (Q2494649), Art & Architecture Thesaurus (Q611299) and recently Sandrart.net (Q17298559) and RKDartists (Q17299517)[1].

A place would be great that has these explanations for each source:

On English and German Wikipedia there are en:Wikipedia:VIAF/errors (does not look to be active) and de:Wikipedia:GND/Fehlermeldung (active). They seem to have this purpose for VIAF and GND respectively only.

Shall there be some page in the Wikidata namespace, a WikiProject, a subpage of WD:WikiProject Authority control for those errors in its domain? --Marsupium (talk) 13:13, 30 May 2017 (UTC)

  • Database properties and their corresponding items should contain some information about the maintainers. You might find information on their websites, but this has to be looked up case by case.
  • You can then try to establish a relationship to them (via email), as you apparently already did in some cases. Some will respond, while others won’t.
  • I’d suggest to leave a note on the talk page of the property, containing the following details: who wrote to them, when did that happen, what was the outcome, …. We should try to bundle our conversation with external database maintainers as much as possible, and the individual property talk pages seem to be much more suitable for that than a separate WikiProject.
  • How to deal with multiple IDs: all current IDs should not have deprecated rank; if useless (valid) IDs are among useful ones, prefer the latter ones with preferred rank; if an ID has been fixed in the external database, we might want to remove it from Wikidata or apply deprecated rank, maybe with a qualifier that indicates the reason.
MisterSynergy (talk) 13:45, 30 May 2017 (UTC)
OCLC are aware of en:Wikipedia:VIAF/errors; I was discussing this with them only last week. For the equivalent on Wikidata, we have constraint reports like Wikidata:Database reports/Constraint violations/P214. I always draw the attention of ID-issuers to such pages, where I can. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:20, 30 May 2017 (UTC)
I have created User:Marsupium/External references error reporting with a subpage per maintainer as a draft that can easily be moved elsewhere, I will answer more in detail.
@Pigsonthewing: Fine! Then en:Wikipedia:VIAF/errors should be linked from the top section of Property talk:P214 probably.
I've also sent links to the constraint violation reports to external maintainers, especially to the single value constraint section where duplicates are collected. Two problems: 1) External errors should be added to the exceptions. But ok, those could be checked by the maintainers then. 2) The ratio of actual duplicates varies and the maintainers might not have enough resources or interest to sort them out. Thanks for the reply, --Marsupium (talk) 15:10, 30 May 2017 (UTC)
I like the metadata collected on your new page, but I think it might be better in a template, on property talk pages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:54, 30 May 2017 (UTC)
Fine! And yes, templates on property talk pages are definitely a good idea! --Marsupium (talk) 16:02, 30 May 2017 (UTC)
@MisterSynergy:
  • Should some property maintained by (P126) some maintainer / e-mail address (P968) errorreport@maintainer.org and bug tracking system (P1401) if available be used on the property pages to add contact information? In fact the website look-up is not the best way and sometimes a conversation might indicate a better way than that given on a website. If a way given on a website is the best the website can be referenced on the property page. Probably this should show up somewhere in the Template:Property documentation?
  • I’d suggest to leave a note on the talk page of the property, containing the following details: who wrote to them, when did that happen, what was the outcome, …. We should try to bundle our conversation with external database maintainers as much as possible, and the individual property talk pages seem to be much more suitable for that than a separate WikiProject.   Support completely. I've already represented that information in this draft: Tables aren't the fastest way to do that though. Probably a template could help.
  • all current IDs should not have deprecated rank; if useless (valid) IDs are among useful ones, prefer the latter ones with preferred rank OK. The question is about cases like Q26822078#P836: The formerly obviously valid ID E05005938 now throws a 404. For ULAN ID (P245) the content of records silently disappears when the Getty Vocabulary Program merges records. The use in keeping invalid IDs is obviously not any further information, for IDs whose issuers don't use redirects or otherwise indicate the former use of an ID like GSS code (2011) (P836) and ULAN ID (P245) that might be interesting to third parties who can find that information here then. Thus we could deprecate the statement and set the qualifier X Y / reason for deprecation (P2241) withdrawn identifier value (Q21441764). (Interestingly Q26822078#P836 uses the qualifier but hasn't deprecated the statement.)
Thanks for your replies, --Marsupium (talk) 16:02, 30 May 2017 (UTC)
Maintainer information best fits into the item about the database, which is linked from the property with subject item of this property (P1629). This is, however, neither very obvious nor easy to find. Example (which I keep a close eye on): property FISA rower ID (P2091) which lists World Rowing database (Q21008628) as the corresponding item to hold this information.
Although I am not very experienced with template and module coding, I guess it would be easily possible to make a template that pulls maintainer information from the database item and displays it on the property talk page. This would make maintainer information available where you would typically expect it to show up. —MisterSynergy (talk) 16:27, 30 May 2017 (UTC)
  Comment I will say that a number of issues in the VIAF database are caused by Wikimedians' incorrect additions. @Pigsonthewing: do you know the turnaround time (or time range) for us deleting our incorrect assignations to be reflected in the VIAF database?  — billinghurst sDrewth 23:25, 30 May 2017 (UTC)
I have created Template:External reference error reports and Template:Error report row. Their quality and functionality is very basic and should definitely be improved if it turns out they are useful in principle. Unfortunately I didn't manage to hide the reports table in the case it is empty.
The templates are at work now on Property talk:P214, Property talk:P227, Property talk:P245, Property talk:P350, Property talk:P650, Property talk:P1014 and Property talk:P1422 (live source) and everything from User:Marsupium/External references error reporting is now on these property talk pages. --Marsupium (talk) 23:47, 30 May 2017 (UTC), 01:40, 31 May 2017 (UTC)
@Marsupium: I like it and left a couple suggestions for improvement.
I think having a table row for each occurrence will not scale; for some properties, especially before post-import clean-up, we have hundreds, if not thousands. And the manual work is too much, when constraint reports automate much of it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 31 May 2017 (UTC)
@Pigsonthewing: I think this template is for deeper discussions on difficult cases. It's not necessary to fill it for all cases. Also, it's the first time we're tracking reactions from source database maintainers, so let's see how it goes. (It's not a substitute for a proper issue tracking system, so if such collaborations really take off, we'll need a better system) --Vladimir Alexiev (talk) 11:58, 7 June 2017 (UTC)
Do you have some examples of errors in VIAF data being introduced by Wikipedians? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 31 May 2017 (UTC)
VIAF:111588127 here is a beauty, it is a munge of Yoshio Nishi (Q15440100) and Yoshio Nishi (Q15440100), the dates are one person, and the works are a mix of both, and if it was like that prior to our adding data one can see how the error was made. In this sort of example, I have removed the VIAF data from both, and I feel that we just have to let things sort themselves out prior to rebinding. Well, that and sticking different from (P1889) on both.  — billinghurst sDrewth 00:08, 31 May 2017 (UTC)
Such values should not be removed; they should be marked as deprecated. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 31 May 2017 (UTC)
Deprecated? I must have a different understanding of the word. It is wrong to have our item assigned to an amalgam of people at VIAF. If it continues to exist I see little hope to resolution. I would like to see a stronger argument put forward to why 'deprecation'/retention is a path to resolution.  — billinghurst sDrewth 11:09, 31 May 2017 (UTC)
Help:Ranking#Deprecated rank. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:17, 31 May 2017 (UTC)
I hardly think that applies for this situation where two items point to the same incorrect VIAF identifier. 1) it sits outside our violation process, 2) we regularly remove removed VIAF data; 3) the ranking examples relate to factual components, eg. dates of birth; theories; etc. not some dynamic authority control series.  — billinghurst sDrewth 13:01, 31 May 2017 (UTC)
The examples are just that; they are neither definitive nor restrictive. The scenario at hand fits precisely into the definition at the linked page. As for "we regularly remove removed VIAF data" this is wrong and should stop. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:53, 31 May 2017 (UTC)

For duplicate IDs what do you think about tagging them like this:

Is there a better way to get them with SPARQL? --Marsupium (talk) 19:13, 1 June 2017 (UTC)
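One possible SPARQL starting point, sketched as a small Python helper that only builds the query text and does not contact any query service. It assumes that "duplicate" means a single item carrying more than one value for the same identifier property; the function name `multi_value_query` and the default limit are hypothetical, and the tagging scheme asked about above is not reproduced here.

```python
import re


def multi_value_query(property_id: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items with more than one value for a property.

    Only the query text is produced; running it against a SPARQL endpoint
    is left to the caller.
    """
    # Accept only well-formed property IDs such as "P245", so that
    # arbitrary text cannot end up inside the query.
    if not re.fullmatch(r"P[1-9][0-9]*", property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?item (COUNT(DISTINCT ?value) AS ?count)\n"
        "WHERE {\n"
        f"  ?item wdt:{property_id} ?value .\n"
        "}\n"
        "GROUP BY ?item\n"
        "HAVING (COUNT(DISTINCT ?value) > 1)\n"
        "ORDER BY DESC(?count)\n"
        f"LIMIT {limit}\n"
    )


if __name__ == "__main__":
    # ULAN ID (P245) is one of the properties discussed in this thread.
    print(multi_value_query("P245"))
```

Because the `wdt:` prefix only matches best-rank statements, values that have been given deprecated rank would not be counted by such a query.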

Getty does track merged items (AAT, TGN, ULAN), see documentation and ULAN query, counting query (total 8.5k). Does anyone want to do something with them? --Vladimir Alexiev (talk) 11:51, 7 June 2017 (UTC)

Any volunteers to move this to project Authority Control and organize it a bit (split up to topics)? Thanks! --Vladimir Alexiev (talk) 11:51, 7 June 2017 (UTC)

Unpublished works and release date

I have been using publication date (P577) for video games that have been published. Can anyone recommend how to indicate in this field that a project was never published because, 1) it is a future product that will be published in the future, or 2) the development of the product was cancelled before it could be published? Or an alternate solution? Thanks. SharkD (talk) 23:09, 1 June 2017 (UTC)

 
Here you can set your value
You can set the value for P577 to novalue for unpublished items by clicking on the small blue blocks and to unknown value for items that will be released in the future. Q.Zanden questions? 00:52, 2 June 2017 (UTC)
Awesome, thanks! SharkD (talk) 00:58, 2 June 2017 (UTC)
We discovered here that there is a third possibility where an item's publication date simply has not been defined yet, and that catching all three cases using a query is very very slow. The query that we defined timed-out when trying to do this. Does anyone think a new "publication status" property would be a good idea? Values could include "unreleased", "released" and "cancelled". SharkD  Talk  03:09, 7 June 2017 (UTC)
Or how about a "development status" property, with values of "in progress", "released" and "cancelled"? SharkD  Talk  03:14, 7 June 2017 (UTC)
I started a proposal here. SharkD  Talk  03:30, 7 June 2017 (UTC)
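The three cases mentioned above (no value, unknown value, and no statement at all) can also be told apart on the client side from an item's JSON rather than in one slow query. The following is a minimal sketch, assuming the usual Wikibase entity JSON layout (`claims`, then the property ID, then statements whose `mainsnak` has a `snaktype`); the function name `publication_date_state` is hypothetical.

```python
def publication_date_state(entity: dict, property_id: str = "P577") -> str:
    """Return the snak type of the first non-deprecated statement.

    Possible results are "value", "somevalue" (shown in the UI as
    "unknown value"), "novalue" (shown as "no value"), or "missing"
    when the item has no usable statement for the property.
    """
    statements = entity.get("claims", {}).get(property_id, [])
    for statement in statements:
        # Deprecated statements are skipped because they are known to be wrong.
        if statement.get("rank") != "deprecated":
            return statement["mainsnak"]["snaktype"]
    return "missing"


if __name__ == "__main__":
    # Invented sample item with an explicit "no value" publication date.
    sample = {
        "claims": {
            "P577": [{"rank": "normal", "mainsnak": {"snaktype": "novalue"}}]
        }
    }
    print(publication_date_state(sample))
```

How each result should be interpreted (cancelled, not yet released, or simply not entered) is a modelling decision that this sketch does not make.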

Ministers of EU countries

I want to post here first, but in the process of trying to make an article on an institution that gathers the ministers of education (or equivalent) from all 28 EU countries, I discovered that many of these posts don't have a Wikidata item, or if they do, their Wikidata item corresponds to a list from Wikipedia of people who've held that position rather than an instance of that position itself. I have also discovered that many pages don't differentiate between the government (the executive and administrative authority of a country responsible for day to day governance) and the cabinet of a country (the collective decision making body of senior ministers), having a page combining them both. Essentially, I don't think many of the ground rules have been set or clearly defined in this area and a lot of the data being imported from Wikipedia is messed up.

It would be great to have pages on all ministerial positions within EU countries at the sovereign state level, with a record of their officeholders and the times they were in office (present and past), as well as what positions they replaced and the time they came into being (e.g. if a minister of education becomes minister of human capacities - yes, Hungary really does have a position by that name). This would be great as you could, for example, work out who was at a meeting of EU ministers and their official title, based simply on the date.

Let me know if anyone is doing something similar. I am biased towards starting with EU countries first - partly because I'm working on that and also for the good reason that ministers from the government of the EU's 28 countries form its main legislature and so having a record of who was in office at any time is quite useful for writing articles on it. EU explained (talk) 10:35, 4 June 2017 (UTC)

It would definitely be good to clean those up - I've seen some of those "list" items but I haven't done anything about them myself at this point. ArthurPSmith (talk) 15:14, 5 June 2017 (UTC)
The query by @Oravrattas: at Wikidata:Request_a_query#Ministers_of_Education_in_the_European_Union looks like a good start.
--- Jura 15:31, 5 June 2017 (UTC)
I've been doing some work in this area over the last year, and it's actually something that I'd been hoping to get around to in earnest within the next couple of weeks (currently I'm trying to sort out lots of issues with Heads of Government, which have a lot of the same problems, particularly around the "Prime Minister of X" vs "List of Prime Ministers of X" conflation, which is particularly time-consuming to untangle as often the "equivalent" page in different Wikipedias are actually about one or the other and all need split up. And of course that one has a whole bunch of other issues due to there being at least three different ways to say who the head of government of a country actually is… Come help out over at Wikidata:WikiProject Heads of state and government if you're interested)
The situation with Ministerial positions is actually even worse than many people realise, in that the position held (P39) information in lots of cases isn't even just the "simple" wrong version of pointing at a "List Of" page, but often says that the person was the "Ministry of Education" rather than the Minister (as lots of these were imported automatically from Wikipedia infoboxes without checking that the link was to the post rather than the department). I've been tidying a lot of those up as I come across them, and linking them together with organization directed from the office or person (P2389) and office held by head of the organization (P2388), etc, but this has been very ad-hoc so far. The worst is when the item page is a mishmash of information on the minister, the ministry, and the list of people who've been the minister.
I also raised the issue about Cabinet vs Government a while back, though the response was a little inconclusive. I think it's mainly just a matter of going through and making sure that entries are created and linked correctly for each. I'd be very happy to be involved in working out the best way to do all that, putting together queries to track which ones need tidied up, with daily reports to spot any future problems etc, if there's general interest in turning it into a WikiProject.
In terms of tracking historic names, it's never been quite clear to me when we want a new "Minister of Education and Science of Placeistan" post as a replacement for the previous "Minister of Education of Placeistan", or when we want to just make that a change of name on the existing item. Either approach brings problems both for data maintenance, and for being able to write queries against it, but it's something that happens frequently enough that it would be good to resolve. Another thing to discuss on a WikiProject page, perhaps…
--Oravrattas (talk) 13:20, 6 June 2017 (UTC)
Yeah, that name change issue is definitely another one I've run into a lot. I think in general if it is simply a name change, and not a reorganization of responsibilities (for example pulling some pieces from other departments or merging several departments) then it should probably be a single item with the historical name identified with start and end dates. But if the name change is a consequence of an actual structural change of some sort, then it probably should be a new item. ArthurPSmith (talk) 13:39, 6 June 2017 (UTC)
It could be very complicated sometimes. Here in Sweden (Q34) we often have two ministers of education, one for education of children and one for higher education and research. The latter normally having a higher rank than the first. As far as I know, we have never had any known as "minister of space". When the EU ministers of space meet, it is either the "minister of research" or the "minister of industry" who participate depending on how the current government has selected to organise their work. -- Innocent bystander (talk) 15:31, 6 June 2017 (UTC)
@ArthurPSmith: — that sounds like a sensible approach, though in practice I'm not sure how easy it is to know that, especially for historic renamings. To pick a fairly random example, The Australian health minister has been renamed 12 times in the last 30 years: https://en.wikipedia.org/wiki/Minister_for_Health_and_Aged_Care — adding all those names with dates to a single article is relatively easy (though tedious, unless we script it), and can be done by anyone vaguely interested, but working out which ones were actually structural changes that deserve a separate article is a lot more work, and requires a much deeper understanding of Australian political affairs. --Oravrattas (talk) 07:42, 7 June 2017 (UTC)

Fractaler's new findings

How do you like Q30126951 and Q30127019? Should we now classify some (all?) items into one of these 2 groups? :) --Infovarius (talk) 16:16, 5 June 2017 (UTC)

Other than the constituent elements which make up "bad thing" and "good thing" what could be a practical application of these? —Justin (koavf)TCM 16:21, 5 June 2017 (UTC)
@Infovarius: Do you propose that we classify items this way? Otherwise, what's the motivation in starting this thread? ChristianKl (talk) 19:57, 5 June 2017 (UTC)
It's a Q30126951 to judge?! --Succu (talk) 20:53, 5 June 2017 (UTC)
Ask a judge.. It is imho silly and full of point of view. Thanks, GerardM (talk) 05:28, 6 June 2017 (UTC)
WTH? Subjective claptrap. How does it pass Wikidata:Notability? Both items are bad things and should be nominated for deletion.  — billinghurst sDrewth 08:14, 6 June 2017 (UTC)

I guess the best way to solve this would be to nominate the following items for deletion: Q30126982, Q30126973, Q30127019, and Q30126951. The latter two are already present at WD:RfD#Q30126951 and WD:RfD#Q30127019, so just add the others over there (preferably in a combined section). We will then find out whether those items are desired or not. —MisterSynergy (talk) 08:34, 6 June 2017 (UTC)

Yes, RfD is a proper thing for these items, thanks. Infovarius (talk) 11:17, 7 June 2017 (UTC)

We can remove duplicates, but such items could be used to store opinions: what is a bad thing or a good thing. We cannot have any definition for these items, only to cite opinions IMO. d1g (talk) 08:41, 6 June 2017 (UTC)

Then for neutrality we should use both in each item?? --Infovarius (talk) 11:17, 7 June 2017 (UTC)

RfD nominations are complete now. Feel free to discuss at Wikidata:Requests for deletions#good or bad?. —MisterSynergy (talk) 08:46, 6 June 2017 (UTC)

  • @Infovarius: Did you know about evil (Q15292)/good (Q15290) (2012 year of birth)? How about "Should we now classify some (all?) items into one of these 2 groups? :)"? No preliminary discussion with the author of the item - is bad or good thing? For whom? Discussion behind the author's back, without pinging - is bad or good thing? For whom? Items Q30126951/Q30127019 had ([2]/[3]) pointer for whom (subject). Without such items, the user will not be able to ask such a question to the wikidata's model of the world and get an answer - Q30126951/Q30127019 for whom? --Fractaler (talk) 13:07, 6 June 2017 (UTC)
  • Wikidata is not making statements about goodness of something, so these items are meaningless - they cannot be used in Wikidata. About pinging: I am sorry, but you are known for controversial things (notions, disambiguations, categories and others), so your participation in the discussion is often meaningless. Sorry, but you are walking on the edge. --Infovarius (talk) 11:17, 7 June 2017 (UTC)

I kindly ask all of you not to continue this discussion here, since it is happening too much on a personal level. The issue itself is discussed at WD:RfD (already linked and backlinked). Thanks, MisterSynergy (talk) 11:24, 7 June 2017 (UTC)

Cannot merge Q6849468 into Q1458946

Please try to resolve this. There is no need to separate East Asian and Western languages in this topic.--Jusjih (talk) 00:12, 7 June 2017 (UTC)

I merged both. --Liuxinyu970226 (talk) 05:09, 7 June 2017 (UTC)
Thanks. I just wonder how it has been done.--Jusjih (talk) 15:42, 7 June 2017 (UTC)

Resuscitating WikiProject:Heads of state and government

Wikidata:WikiProject Heads of state and government was an early project to make sure that all national heads of government and heads of state were added to Wikidata. At this it was largely successful (though there are still lots of historic gaps). Unfortunately, however, although there are items for all current Heads, the actual position held (P39) information about their role, and which ones are set via head of government (P6)/head of state (P35) on each country, or via the combination of officeholder (P1308) with office held by head of government (P1313)/office held by head of state (P1906), is often missing or inconsistent. (Having three ways to express who is the Head of Government of a country is a little redundant, but it certainly offers useful ways to check for data that's almost certainly out-of-date.)

To help with this I've tried rebooting the Project, and created a series of tasks to help track down the errors and omissions. If anyone is interested in helping out, please come join in! For the current missions, there are still almost 40 countries with no head of government (P6) entries, and about 30 where the P6 entry doesn't match the officeholder (P1308) of the office held by head of government (P1313), and/or the position held (P39) entries. All assistance very gratefully received… --Oravrattas (talk) 10:08, 7 June 2017 (UTC)

Quite happy to collaborate particularly when it is about former countries. I only use "position held" so far. Thanks, GerardM (talk) 11:48, 7 June 2017 (UTC)
Oooh, getting more information on former countries would be excellent. Only a tiny fraction of them have a office held by head of government (P1313), for example:
SELECT ?item ?itemLabel ?office ?officeLabel
WHERE { 
  ?item wdt:P31 wd:Q3024240 .
  OPTIONAL { ?item wdt:P1313 ?office . }
  OPTIONAL { ?item wdt:P576 ?dissolved . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY DESC(?dissolved)
LIMIT 500
--Oravrattas (talk) 16:12, 7 June 2017 (UTC)

An attempt at structuring kanji/hanzi

I've been toying with (Q3595028). I don't know of other efforts to work on kanji/hanzi on Wikidata, so I added some relevant statements to this one. Certainly we will need more properties in the future, especially with the incoming Wiktionary integration. In any case, commentary, criticism and help are appreciated. ~nmaia d 16:45, 7 June 2017 (UTC)

Space missing

In Module:Wikidata is shown a table of examples. In line 12 appears a mistake in German interface:

Code Render
{{#invoke:Wikidata|formatStatementsE|item=Q12418|property=p186|lang=de}} Ölfarbe und Pappelholz

before "und" a space is missing ("Ölfarbe und Pappelholz" would be correct). That mistake appears several times.--Bigbossfarin (talk) 18:22, 14 June 2017 (UTC)

  Done Thanks for your report, there was a wrong localization in Module:I18n/linguistic which I have just fixed. Matěj Suchánek (talk) 18:36, 14 June 2017 (UTC)
@Matěj Suchánek: That module has also some copies: Module:I18n/linguistic (Q26156411). Should these also be replaced? --Bigbossfarin (talk) 18:44, 14 June 2017 (UTC)
I'll check all of them but this error was introduced at the beginning of May, so it's not likely. Matěj Suchánek (talk) 18:47, 14 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 18:47, 14 June 2017 (UTC)

Indexing languages on Wikidata

Hi all

Currently there are two databases of languages being imported into Wikidata, both at the stage of Mix n' Match.

  1. Identifiers for human languages listed in Glottolog
  2. UNESCO Atlas of World Languages in Danger

Two things:

  1. I think it is likely that there will be many items that will be in both catalogues, is this going to cause an issue?
  2. I think it would be really nice to have these databases imported in time for the Celtic Knot languages conference on the 6th of July. Does anyone have any suggestions of how to encourage people to do some mixing and matching? There are about 1,500 items needing matching on the UNESCO dataset and 15,000 on the Glottolog dataset, so it's quite a lot of matching that needs doing. I will keep chipping away at them but I feel a bit like I'm digging a hole with a teaspoon.

Thanks

--John Cummings (talk) 14:12, 1 June 2017 (UTC)


  1. No, that's the point; Wikidata acts as the "hub" for the entries in each of the two (or more) external systems. Obviously, there should be one Wikidata item, not two!
  2. Use the UK mailing list, and ask WMUK to promote this. Get them to mail the booked attendees.

Good luck! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:33, 1 June 2017 (UTC)

Thanks very much Pigsonthewing (talkcontribslogs), I'll put something together this weekend. --John Cummings (talk) 15:26, 2 June 2017 (UTC)
If I am understanding the UNESCO data correctly, when a language population extends into two adjacent countries, UNESCO assigns an AWLD ID for the language in each country. Wikidata will generally have a single item for the language (correctly in my view), and I believe we should put both AWLD IDs on that item with qualifiers to indicate which country each ID is associated with. My question is, what location property should be used in this qualifier? In Cofán (Q2669254) I used located in the administrative territorial entity (P131) but I know that can't be right. @John Cummings:, FYI.
Ooh, I just found applies to jurisdiction (P1001) and valid in place (P3005). I like "valid in place". Would that be it? - PKM (talk) 20:14, 8 June 2017 (UTC)
Also, adding two values causes a constraint violation. Is there any way to write the constraint such that multiple values are allowed if they have a specific qualifier? - PKM (talk) 19:39, 8 June 2017 (UTC)
Hi all, this has got a bit technical for me, you may find some relevant information on the UNESCO website here. Thanks--John Cummings (talk) 23:07, 8 June 2017 (UTC)
Thanks for that link. UNESCO page says "In the case of outlying communities, the editors had the possibility to create separate entries, indicating respective levels of endangerment and numbers of speakers." In some cases, WP editors have created two articles and thus two wikidata items. Where they have not, I think both UNESCO numbers should be attached to the single wikidata item with location qualifiers. And since the alternate label for valid in place (P3005) is "applies to location", that is the qualifier I'll use. Mix'n'Match makes it really easy to find these, so I'll try to keep them cleaned up as more are matched. - PKM (talk) 00:40, 9 June 2017 (UTC)

Date qualifiers

I have a question around which combinations of date qualifiers to use which I'm hoping someone here might help out with.

I want to describe the water quality of a lake WFD Ecological status (P4002) where the given value is the decided one for "2016" but relies on measurements and assessments made between 2008 and 2013. I know I could fall back on using only point in time (P585) "2016". But I would ideally want to include the data collection period as well. Any suggestions are welcome. /André Costa (WMSE) (talk) 14:34, 2 June 2017 (UTC)

@André Costa (WMSE): Maybe something like this can help you out? But a different order, 1. point in time (P585), 2. determination method (P459), 3. start time (P580), 4. end time (P582), may be better? Q.Zanden questions? 14:51, 2 June 2017 (UTC)
@QZanden: Does the order of qualifiers matter (i.e. do we tell consumers to take the order into account). If not then it risks getting confusing =) /André Costa (WMSE) (talk) 15:03, 2 June 2017 (UTC)
@André Costa (WMSE): No, we don't tell consumers if the order matters. But in this case my example of the order would get confusing because there are many interpretations possible. The order I gave later is to make clear that the date that this situation is for 2016, but was measured between 2008 and 2013. I hope this explains a bit? Q.Zanden questions? 16:15, 2 June 2017 (UTC)
I don't think that it makes sense to depend on the order in any way. There are no guarantees anywhere that the order is stable. SPARQL queries don't know the order. I think it's very easy for the proposed solution to lead to problems.
I would prefer some solution with significant event (P793) qualified with start time (P580) and end time (P582), and use point in time (P585) "2016" on the actual measurement. ChristianKl (talk) 09:31, 3 June 2017 (UTC)
@QZanden, ChristianKl:. I agree with ChristianKl that relying on the order is risky. Using significant event (P793) is problematic however since we are only talking about measurement data used in the determination of WFD Ecological status (P4002). Other properties might rely on other dates. Or were you thinking something like with further valid in period (P1264) "2016", start time (P580) "2008", end time (P582) "2013" qualifiers? (possibly with WFD Ecological status category (Q30092063) replaced by something even closer to WFD Ecological status (P4002)) /André Costa (WMSE) (talk) 12:52, 5 June 2017 (UTC)
Yes, I think such a solution would work. ChristianKl (talk) 13:09, 5 June 2017 (UTC)
@ChristianKl: My only worry with this is that it becomes complicated if other statements have data collection ranges. There is also the question whether data collection for a particular statement really is a significant event for the whole item. I have no better solution merely wondering if this one is preferred over excluding the data collection dates completely./André Costa (WMSE) (talk) 00:01, 8 June 2017 (UTC)
@André Costa (WMSE): this was fine except point in time (P585) (which is point, similar to P580/P582), publication date (P577) would be better.
alternately we can use references and say "stated in" <item> where item describes dataset. d1g (talk) 13:13, 5 June 2017 (UTC)
@D1gggg: The problem with using publication date (P577) is that it misrepresents what the 2016 stands for. It is not the publishing date but the date/year for which the label (e.g. WFD Ecological status: Good status (Q30092128)) was decided to apply (it was actually published in 2017, which I'll add to the reference). /André Costa (WMSE) (talk) 00:01, 8 June 2017 (UTC)
@André Costa (WMSE): Then I wouldn't relate 2016 with data collection (Q4929239), 2-3 qualifiers are enough IMO. d1g (talk) 07:47, 8 June 2017 (UTC)
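Since the thread agrees that qualifier order should not carry meaning, a consumer can look qualifiers up by property ID instead of by position. The following is a minimal sketch, assuming the usual Wikibase statement JSON (a `qualifiers` mapping from property ID to a list of snaks, with time values such as "+2016-00-00T00:00:00Z"); the function name `qualifier_years` and the sample statement are invented, and the qualifier combination shown is only one of the options floated above, not an agreed model.

```python
def qualifier_years(statement: dict) -> dict:
    """Map each time-valued qualifier property ID to the list of years it holds."""
    years = {}
    for property_id, snaks in statement.get("qualifiers", {}).items():
        for snak in snaks:
            if snak.get("snaktype") != "value":
                continue  # skip "somevalue" and "novalue" qualifiers
            datavalue = snak["datavalue"]
            if datavalue["type"] != "time":
                continue  # only time qualifiers are of interest here
            # The time string starts with a sign, then the year.
            time_string = datavalue["value"]["time"]
            sign = -1 if time_string.startswith("-") else 1
            year = sign * int(time_string[1:].split("-", 1)[0])
            years.setdefault(property_id, []).append(year)
    return years


def _time_snak(year: int) -> dict:
    """Build a minimal time snak for the invented sample below."""
    return {
        "snaktype": "value",
        "datavalue": {"type": "time", "value": {"time": f"+{year:04d}-00-00T00:00:00Z"}},
    }


if __name__ == "__main__":
    sample = {
        "qualifiers": {
            "P580": [_time_snak(2008)],  # start time
            "P582": [_time_snak(2013)],  # end time
            "P585": [_time_snak(2016)],  # point in time
        }
    }
    print(qualifier_years(sample))
```

The result is keyed by property ID, so it is the same whichever order the qualifiers were entered in.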

UI discussion?

Where is the place to discuss the main GUI for Wikidata, where people typically enter in new data? Thanks. SharkD  Talk  02:53, 7 June 2017 (UTC)

You can have any discussion about the GUI at this place. If you have concrete suggestion for changes, you can also fill a phabricator ticket. ChristianKl (talk) 08:45, 7 June 2017 (UTC)
My main issue is that the dropdown lists don't disappear when you tab between form elements using the keyboard. I prefer using the keyboard more than the mouse, and with all the lists visible, I can't see what I'm typing. SharkD  Talk  09:00, 7 June 2017 (UTC)
@Lea_Lacroix_(WMDE):ChristianKl (talk) 09:28, 7 June 2017 (UTC)
I second this complaint. It can be very annoying. Jon Harald Søby (talk) 10:03, 7 June 2017 (UTC)
Is this phab:T149798? Sjoerd de Bruin (talk) 12:11, 7 June 2017 (UTC)
Hard to say. I think the dropdown list stays open whenever you don't actually select something from it. You have to select something from the list to make it go away. But the form changes focus sometimes without your intervention. SharkD  Talk  06:29, 8 June 2017 (UTC)

Check out the history of Dispepsi (Q5282564)

All bots, except for one human edit. So funny. PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 01:12, 8 June 2017 (UTC)

Now to find all other items with that quality. Mahir256 (talk) 01:47, 8 June 2017 (UTC)
Which one is the human edit? Matěj Suchánek (talk) 06:39, 8 June 2017 (UTC)
It's mine! I win something? :) --ValterVB (talk) 06:59, 8 June 2017 (UTC)

Difference of date of official opening (P1619) and opening (Q15051339)

Hi. Could someone explain the above difference please? I want to add a date to World Trade Center (Q1542258), while comparing World Trade Center (Q11235). Thanks in advance, Rehman 08:11, 8 June 2017 (UTC)

I also wonder whether opening (Q15051339) and opening ceremony (Q3010369) are different. Do you have the number of viewers/observers for the latter? d1g (talk) 11:02, 8 June 2017 (UTC)
Unfortunately I don't :( Rehman 11:10, 8 June 2017 (UTC)

Constraint report lag

The constraint report for ORCID iD (P496) (for example) was updated at 8:09 UTC this morning and still includes an earlier format violation for Subhash Khot (Q7631228), which I fixed at 12:13 yesterday. What causes this lag? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:19, 8 June 2017 (UTC)

  • This is pretty normal in my experience. The covi page diff you’ve provided indicates that the evaluation of the covi report was performed at 2017-06-07T11:58:59Z, which was before you applied the fix in Subhash Khot (Q7631228).
  • AFAIR @Ivan A. Krestinin uses an offline copy of the database for evaluation. This might cause further delays, although I am not sure how much this could be.
MisterSynergy (talk) 08:29, 8 June 2017 (UTC)
There is a 12 hour delay in constraint reports due to a predefined lag, see here. Sjoerd de Bruin (talk) 11:33, 8 June 2017 (UTC)
There's nothing about lag in that section. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:20, 8 June 2017 (UTC)
Sorry, this section. The page height was just at that level at my side... Sjoerd de Bruin (talk) 15:23, 8 June 2017 (UTC)
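The timing argument above can be checked with a few lines of arithmetic. This is only a sketch using the timestamps quoted in this thread; the seconds of the 12:13 edit are not given and are assumed to be zero, and the names `snapshot`, `fix`, `published` and `seen_by_report` are made up for illustration.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Timestamps quoted in the discussion (seconds of the fix are an assumption).
snapshot = datetime(2017, 6, 7, 11, 58, 59, tzinfo=UTC)  # evaluation time shown in the report diff
fix = datetime(2017, 6, 7, 12, 13, tzinfo=UTC)           # edit to Subhash Khot (Q7631228)
published = datetime(2017, 6, 8, 8, 9, tzinfo=UTC)       # when the report page was updated


def seen_by_report(edit_time, snapshot_time):
    """An edit can only show up in a report if it predates the data snapshot."""
    return edit_time <= snapshot_time


fix_visible = seen_by_report(fix, snapshot)  # False: the fix came after the snapshot
fix_margin = fix - snapshot                  # how narrowly the fix missed the snapshot
snapshot_age = published - snapshot          # how old the data was when the report appeared

print(fix_visible, fix_margin, snapshot_age)
```

So the report published on the morning of 8 June was built from data that was already older than the fix, which is consistent with both explanations given above.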

Breaking change: improving the schema of wb_terms table

Hello,

This is important information regarding our database, for people running tools and scripts.


The change

In the wb_terms table, a new column term_full_entity_id containing strings with the full ID including the letter (i.e. “Q42”) will be created, and will be used instead of the current column term_entity_id that only stores the numeric part of the ID (i.e. “42”).

This change is made, among other things, to support the new entity types to come, and then to store terms of entities that have non-numeric IDs, for example Forms (“L42-F1”).


Implications
  • This change only affects tools that directly access the database. Other tools, for example those getting data through the API, pywikibot, gadgets, etc. will not be affected and will work as before with no action required from the maintainers.
  • In order to adapt tools to the new database structure, database queries using wb_terms should be changed so that they use the term_full_entity_id column instead of term_entity_id. This might simplify queries. For example, instead of having a condition WHERE term_entity_type='item' AND term_entity_id=42, one could now do: WHERE term_full_entity_id='Q42'.
  • term_entity_type is not affected by this change, and will still be available as before.
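The query change described in the list above could look roughly like this in a tool. It is a minimal sketch, not the actual schema: the toy table is an in-memory SQLite stand-in with only the three columns named in this announcement plus a `term_text` column assumed for illustration, and `terms_filter`, `fetch_terms` and the `use_full_id` switch are hypothetical names. The real replicas are MariaDB, so the parameter placeholder style would differ there.

```python
import sqlite3

# Entity types for the ID prefixes that the old numeric column can express.
PREFIX_TO_TYPE = {"Q": "item", "P": "property"}


def terms_filter(entity_id, use_full_id):
    """Return (WHERE clause, parameters) selecting one entity's rows in wb_terms."""
    if use_full_id:
        # New style: one condition on the full ID, e.g. 'Q42'.
        return "term_full_entity_id = ?", (entity_id,)
    # Old style: entity type plus the numeric part of the ID.
    entity_type = PREFIX_TO_TYPE[entity_id[0]]
    return "term_entity_type = ? AND term_entity_id = ?", (entity_type, int(entity_id[1:]))


def fetch_terms(conn, entity_id, use_full_id):
    """Fetch term texts for one entity using either the old or the new filter."""
    where, params = terms_filter(entity_id, use_full_id)
    rows = conn.execute(
        f"SELECT term_text FROM wb_terms WHERE {where} ORDER BY term_text", params
    )
    return [text for (text,) in rows]


# Toy stand-in for the table; rows are placeholders, not real terms.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wb_terms (term_entity_type TEXT, term_entity_id INTEGER,"
    " term_full_entity_id TEXT, term_text TEXT)"
)
conn.executemany(
    "INSERT INTO wb_terms VALUES (?, ?, ?, ?)",
    [
        ("item", 42, "Q42", "placeholder item term"),
        ("property", 42, "P42", "placeholder property term"),
    ],
)

old_rows = fetch_terms(conn, "Q42", use_full_id=False)
new_rows = fetch_terms(conn, "Q42", use_full_id=True)
print(old_rows == new_rows)
```

Keeping the choice behind a single switch would let a tool keep the old condition on wikidatawiki while testing the new one on testwikidatawiki during the transition. Note that the old form cannot express an ID like "L42-F1" at all, which is the motivation given above.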


The process

Starting now, we will work through several steps to achieve this change. Note that the dates indicated below may not be exact. We will announce each of the steps separately. Here is a rough timeline:

  • June 22nd: the new column, term_full_entity_id, becomes visible on Labs. It will be fully populated in the testwikidatawiki database replica on Labs. Note that this column will remain incomplete in the wikidatawiki database until later! Tools that use the wb_terms table can test new code that uses the new term_full_entity_id column with the testwikidatawiki database, but must keep using the old code with the wikidatawiki database.
  • Some time later: term_full_entity_id should be fully populated, and become usable on the wikidatawiki database. Tools that use the wb_terms table can now switch to using new code that no longer uses the old term_entity_id column on all databases.
  • Eventually (not before July 6th): the old term_entity_id column is removed from the testwikidatawiki and wikidatawiki database replicas on labs. Any code that still uses the old term_entity_id will break.


If you have any questions or problems regarding this change, feel free to ping me. Lea Lacroix (WMDE) (talk) 10:54, 8 June 2017 (UTC)

problem with human-readable identifiers is that they could/will collide eventually. This is unavoidable for short strings in Latin.
"L42-F1" could be one item and another from other country with Latin script.
it is nice to have redirects from "L42-F1" to real page like http://www.wikidata.org/entity/Q42
we can use data from current labels/aliases (and then data from external ids) to make unambiguous "names" d1g (talk) 11:19, 8 June 2017 (UTC)
@D1gggg: I think you are misunderstanding something here. "L42-F1" would be a "real" page - or at least part of the page for the lexeme L42. See for example the wiktionary development proposal though I'm not sure that's completely up to date. ArthurPSmith (talk) 13:41, 8 June 2017 (UTC)
I'm not sure if new concepts (L42) will be accessible along with Q42 (old).
Everything is fine as long as we have at least one way to access core concepts language-independently (currently with Q/P and numbers). d1g (talk) 14:06, 8 June 2017 (UTC)
About the Lexemes, Forms and Senses, you can have a look at the data model to understand how the lexicographical data will be structured :) Lea Lacroix (WMDE) (talk) 07:32, 9 June 2017 (UTC)

Unknown value / No value

Does it make sense to use unknown value or no value in properties such as image (P18) or place of birth (P19)? See the example at Bivin of Gorze (Q605836). I understand no value for an invariable situation, like spouse (P26) for a dead person who never married, because it can't change. However, an image may not be known now, but someone could find one in the future. Should we fill every empty property with unknown/no value? Obviously not. So what is the point of doing it in some items, as in the example? Thanks,--Amadalvarez (talk) 11:39, 8 June 2017 (UTC)

If no known depiction of someone exists, or if the place of death is not known, then I can see using unknown value or no value to indicate that, hopefully with references. Maybe we need a more specific item like "no known depiction exists" to make it clearer. --Jarekt (talk) 12:37, 8 June 2017 (UTC)
All people were born somewhere, so all of them should have place of birth (P19), even somevalue (novalue would always be wrong). However, since we are still building our database (we will never reach the finish, I am concerned), adding place of birth (P19): somevalue to all items about people would make this process harder. Thus, special values usually require sources. Matěj Suchánek (talk) 16:14, 8 June 2017 (UTC)
I'd use no value in cases we specifically want to express absence of value - e.g. see George Washington (Q23) property position held (P39), the value for President of the United States (Q11696) has "predecessor" as "no value", because George Washington (Q23) was the first President. In the same manner, someone who never married can have "no value" as spouse, but I would add it only if this fact is specifically notable, since a lot of people are unmarried and by itself it's not that important. However, for Isaac Newton (Q935) it might be notable to mention he was a lifelong bachelor and as such spouse (P26) has "no value". For image (P18) I can't imagine many situations where "no value" would be appropriate - maybe for description of something that can not have any image for some notable reason? "Unknown/somevalue" is used when we know the value is there, but we don't know what it is - e.g. if we know somebody was married, but the identity of the spouse is unknown. Mass-adding "somevalue" for place of birth (P19) though wouldn't make sense IMO because it's not an interesting data by itself. However, when we have somebody who we know died, but don't know when exactly, "unknown" may be appropriate, since knowing whether somebody is alive is usually important. Laboramus (talk) 20:28, 8 June 2017 (UTC)
@Jarekt, Matěj Suchánek, Laboramus: If I understood correctly, we all share the opinion that these values should be used when the value is significant and unchangeable. We can't write down "all the things we don't know", because the list is infinite. We can record what we do know: for instance, we know that George Washington (Q23) has "predecessor" = "no value", and that will not change because it happened in the past. In the case of image (P18), nobody knows when we'll find an image (or drawing) of the person or place; so "no value" or "unknown" are circumstantial values. We also agree that filling every theoretically applicable property with "no info for now" is not a good idea. So, if you don't mind, I'll delete this value where it doesn't make sense. Thanks for your cooperation. --Amadalvarez (talk) 21:49, 8 June 2017 (UTC)
  Comment I do get concerned with things like date of death where "unknown value" is used, or a really rough approximation is used, and it is because the person adding the data themselves does not know. Our guidance on the use of these criteria of "no value", "unknown value" and approximations needs to be stronger. I would also think that there should be guidance given on the property pages to demonstrate 1) whether it is acceptable, and 2) the cases where it would be used. Even better would be constraint monitoring. I know the VIAF property has a use example.  — billinghurst sDrewth 03:56, 9 June 2017 (UTC)
@billinghurst: Totally agree. When you talk about "Our guidance on the use...", do you mean "Help:Statements#Unknown_or_no_values", or some other, more specific content? Thanks,--Amadalvarez (talk) 04:31, 9 June 2017 (UTC)
I do think that we can do more, as we still have poor use. A couple of redirects to there are probably good, and I have just done a couple. I think that we need some level of over-arching statements that for data fields that the use of the tags is based on the sum of human knowledge, not an individual's knowledge base of the subject. For look-up fields, eg. authorities identifiers, absence of data allows or disallows the use.

For specific, I would like to see that guidance provided, so if no date of death, we say that it is not okay to use "no value", we test for it; there is limited scope for adding "unknown value", that is supported by examples of appropriate use, and state that it needs to be referenced to be a valid use. Similarly that guidance should cover the application of vague date ranges, so when is it appropriate to give a date of death of "20th century"? Or maybe it is never appropriate as vague dates are problematic for data recall and for testing whether the field is being used or not (and that is a problem that we see at the Wikisources).  — billinghurst sDrewth 12:50, 9 June 2017 (UTC)

How do I look up what city a person was born or died in?

I am writing c:Module:Creator, which creates infoboxes on Commons with basic biographical info about people, including place of birth and place of death. For places of birth/death the smallest unit is traditionally a city, and if that is unknown, then the country. The Wikidata place of birth (P19) and place of death (P20) properties are similar but significantly different. Their description asks that the location be "the most specific known (e.g. city instead of country, or hospital instead of city)", so infoboxes now show names of specific hospitals, neighborhoods or even houses as places of birth/death. For example, for Pyotr Ilyich Tchaikovsky (Q7315) place of death (P20) is Malaya Morskaya Street (Q3449016), but traditionally his place of death is listed as Saint Petersburg (Q656) (see for example here). Infoboxes that rely on Wikidata incorrectly show Malaya Morskaya Street (Q3449016) (see for example here). I am trying to figure out some logic to get Lua code to consistently round up all the places of birth/death to show the city, not a specific building or street. Any idea how? I guess I can follow located in the administrative territorial entity (P131) until I get to an item whose instance of (P31) is a subclass of (P279) city (Q515). But that seems awfully complicated just to look up a place of birth or death, not to mention that I might have to load a bunch of large items to look it up. Any solutions I am missing? Any Lua code out there that does this? --Jarekt (talk) 12:33, 8 June 2017 (UTC)

We don't need to show unspecific place of death, where exact coordinate or area is known.
@Jarekt: exact fragment from Russian page: Болезнь протекала тяжело, и Чайковский скончался в 3 часа пополуночи 25 октября (6 ноября) от холеры «неожиданно и безвременно» в квартире своего брата Модеста, в доме № 13 на Малой Морской улице. Claimed source is Правительственный Вѣстникъ 1893, № 235, p 2. 26 октября (7 ноября) d1g (talk) 15:19, 8 June 2017 (UTC)
The infobox in w:ru:Чайковский, Пётр Ильич does not use Wikidata and lists the place of death as Saint Petersburg (Q656), as it should. w:es:Piotr Ilich Chaikovski relies on Wikidata and lists the place of death as "Malaya Morskaya Street, Rusia" (no links), which makes little sense to me. I support the policy of storing the most detailed location possible, but how do we round it up to the city level? --Jarekt (talk) 15:40, 8 June 2017 (UTC)
it should be located in the administrative territorial entity (P131) (or if absent country (P17)).
It seems like a bug that Malaya Morskaya Street (Q3449016) (now Malaya Morskaya Street, 13 (Q30159190)) wasn't linked. Try to add Spanish names? d1g (talk) 16:15, 8 June 2017 (UTC)
The problem with located in the administrative territorial entity (P131) is that it is a chain, and as you move up the chain you need to recognize when you arrive at an item that is at the "city" level, i.e. one whose instance of (P31) is a subclass of (P279) city (Q515). I think I can create a query to find the city-level place of death of Pyotr Ilyich Tchaikovsky (Q7315), but it is harder to do in Lua. --Jarekt (talk) 16:25, 8 June 2017 (UTC)
Based on your requests to get city of place of death; your logic is based roughly on assumption that humans only die inside Cities, which is incorrect :-)
Value of P131 should be specific and abstract for any purpose. d1g (talk) 16:38, 8 June 2017 (UTC)
In the majority of cases people die in some human settlement like a city, town or village. If not, then people usually list a nearby place (and state that it is "near") or list a region, country, ocean, etc. My concern is with places described with precision finer than city. By the way, the current place of death on w:es:Piotr Ilich Chaikovski is listed as "sin etiquetar, Unión Soviética", which seems even worse. It seems like that page should override the Wikidata location to get the expected location. --Jarekt (talk) 17:14, 8 June 2017 (UTC)
Description of Wikidata item can be used to disambiguate individual cases.
that creates infoboxes on Commons about basic biographical info
I think that such information is less expected at commons (not expected at all)
I created several test templates (e.g. c:Template:PersonWD at c:File:Albrecht Dürer - Portrait of Maximilian I - Google Art Project.jpg
information about Maximilian I (Q150726)    
  • fetched from Wikidata
  • comes with interface-specific translations. d1g (talk) 15:09, 8 June 2017 (UTC)
Sometimes both values (the most correct, sourced and the "client-friendly") are listed (even with different ranks), sometimes qualifiers are used but either is actually wrong. The tree approach should definitely be used. I understand coding this up is going to be very hard. If it shows up to be impossible or unsustainable, though, we can sit down with the developers. Matěj Suchánek (talk) 16:37, 8 June 2017 (UTC)
Matěj Suchánek this almost calls for a property derived from place of birth (P19) or place of death (P20) which is precalculated by a query and stored and which is not manually edited. "Country of death" is even more problematic since it could be either Soviet Union (Q15180) or Russia (Q159). By the way a query to find "place of death" (rounded to a city level) would be:
SELECT ?item ?itemLabel
{
    wd:Q7315 wdt:P20 ?podItem .     # "place of death item"
    ?podItem wdt:P131+ ?item .      # item we are looking for is a parent of "place of death item"
    ?item wdt:P31 ?cItem .          # item is an instance of "city item"
    ?cItem wdt:P279+ wd:Q515        # "city item" is defined as subclass of city
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
Try it!
--Jarekt (talk) 17:14, 8 June 2017 (UTC)
Very good question, I would be interested in an answer as well. As long as it is difficult to retrieve information from a different item than the one which the sitelink is connected to (particularly if this includes extensive searches in the knowledge tree), users tend to model data in a way that somehow fits their needs, but does not make use of Wikidata’s full potential. A “flat Wikidata” would be the outcome, in which all relevant information is collected within the item by itself, even it is utterly redundant or misplaced there. Example: place of birth (P19) with qualifiers located in the administrative territorial entity (P131) and country (P17) to indicate which district and country the place of birth was located in when the person was born (plenty of examples); there are a couple of similar other situations. It would be useful if SPARQL queries were available in Lua modules. Is this actually the case (I am totally inexperienced with modules), and if so how would it work? A fairly simple query identifies the city in which Pyotr Ilyich Tchaikovsky (Q7315) died, and I guess using the query service would be the most effective way to browse the knowledge tree. —MisterSynergy (talk) 16:53, 8 June 2017 (UTC)
MisterSynergy, Wow, your query is much better than mine. --Jarekt (talk) 17:17, 8 June 2017 (UTC)
As far as I know, Lua modules cannot run database queries and there are no plans to allow that. However, a module can load an item (an expensive operation), look up some properties and, based on the findings, repeat the process until the proper item is found. But that would be ugly code that might have to traverse a lot of items, which is very expensive resource-wise. Also, the Commons Creator templates that would use it are transcluded in millions of pages, and all those pages would have to go through the same process to look up the place of birth/death. --Jarekt (talk) 17:29, 8 June 2017 (UTC)
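For readers who want to see the "load an item, check it, move up and repeat" loop spelled out, here is a rough sketch. It is written in Python rather than Lua and runs on a tiny in-memory graph, purely to illustrate the logic; the dictionaries, the placeholder class IDs (`Q_STREET`, `Q_CITY_KIND`, `Q_COUNTRY`), the parent links and the function names are all invented for the example and do not reflect the real statements on Wikidata. Unlike the P131+ query above, the sketch also accepts the starting place itself if it is already a city.

```python
from collections import deque

# Toy statements, invented for illustration only.
P131 = {"Q3449016": ["Q656"], "Q656": ["Q159"]}  # located in the administrative territorial entity
P31 = {"Q3449016": ["Q_STREET"], "Q656": ["Q_CITY_KIND"], "Q159": ["Q_COUNTRY"]}  # instance of
P279 = {"Q_CITY_KIND": ["Q515"]}                 # subclass of


def is_city_class(cls, p279, city="Q515"):
    """True if cls is the city class itself or reaches it via subclass-of links."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current == city:
            return True
        if current in seen:
            continue
        seen.add(current)
        todo.extend(p279.get(current, []))
    return False


def round_up_to_city(place, p131, p31, p279, max_hops=10):
    """Walk up the P131 chain from place; return the first city-level item, or None."""
    seen = set()
    queue = deque([(place, 0)])
    while queue:
        item, hops = queue.popleft()
        if item in seen or hops > max_hops:
            continue  # guard against cycles and runaway chains
        seen.add(item)
        if any(is_city_class(cls, p279) for cls in p31.get(item, [])):
            return item
        queue.extend((parent, hops + 1) for parent in p131.get(item, []))
    return None


print(round_up_to_city("Q3449016", P131, P31, P279))
```

Every step of the walk corresponds to loading another item, which is exactly the cost concern raised above; the `max_hops` limit is one way a module could cap that cost.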
Do you know of any discussions regarding SPARQL queries within Lua modules? I am interested to read more about it… Thanks, MisterSynergy (talk) 18:47, 8 June 2017 (UTC)

SPARQL in Lua @MisterSynergy: based on possibly outdated documentation, the answer is that there is no SPARQL in Lua on Wikimedia projects as of now. d1g (talk) 19:55, 8 June 2017 (UTC)

Thanks. I can’t find an explicit request to integrate SPARQL into Lua in phabricator. Can anyone else? We should consider to open such a request otherwise. —MisterSynergy (talk) 20:20, 8 June 2017 (UTC)
@Jarekt: I apologize if this doesn't answer exactly what you asked. On Catalan WP we are changing all infoboxes to get as many properties as possible from WD. Recently we built ca:template:infotaula geografia política, which handles all kinds of administrative divisions of territory. To resolve the higher administrative divisions that a given city or region belongs to, we implemented in our ca:module:wikidata a function called getParentValues that recursively retrieves the located in the administrative territorial entity (P131) and instance of (P31) values, starting from the current item, for up to "n" levels or until it finds the country (P17). It even accepts an optional list of instance of (P31) values to exclude from the results if you consider them not significant. If you are familiar with Lua (not my case, sorry), the solution we applied may give you some ideas. To see the results of this feature in the mentioned infobox, see ca:Spiere (without parameters) or ca:Londres (with just 2 parameters of additional info). I hope it is useful to you. --Amadalvarez (talk) 21:33, 8 June 2017 (UTC)

I created phabricator:T167521 related to this issue. Please comment with ideas. --Jarekt (talk) 16:40, 9 June 2017 (UTC)

'What links here' with properties

Are there possibilities, and if so which, to get 'What links here' like information for an item that not only shows the other items linking to it but also for which property they are the value? And/or an overview ONLY of the properties? - Andre Engels (talk) 06:35, 9 June 2017 (UTC)

+1 I would include respective properties and sort/group "relevant" items by properties.
SPARQL can do this.
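As a rough sketch of what such a query could look like, the helper below only builds the SPARQL text for an "overview ONLY of the properties" and does not contact any server. The function name `incoming_links_query` and its parameters are made up for this example; the query groups incoming truthy statements by property and is meant to be pasted into the query service.

```python
import re

# Item IDs look like Q followed by a positive number.
QID = re.compile(r"Q[1-9][0-9]*")


def incoming_links_query(item_id, language="en"):
    """Build a SPARQL query counting, per property, the items that point at item_id."""
    if not QID.fullmatch(item_id):
        raise ValueError(f"not an item ID: {item_id!r}")
    return f"""SELECT ?property ?propertyLabel (COUNT(?source) AS ?uses) WHERE {{
  ?source ?claim wd:{item_id} .
  ?property wikibase:directClaim ?claim .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" }}
}}
GROUP BY ?property ?propertyLabel
ORDER BY DESC(?uses)"""


print(incoming_links_query("Q42"))
```

Dropping the GROUP BY and selecting ?source alongside ?property would give the full "what links here, and through which property" list instead of the summary.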
"from related entities" SQID (Q24298088) d1g (talk) 12:39, 9 June 2017 (UTC)

Using Wikipedia as a source

When using Wikipedia as a source, do I tag each claim using "imported from" and "English Wikipedia"? Or is there a preferred method? SharkD (talk) 22:18, 2 June 2017 (UTC)

In general it's better to use sources that are external to Wikipedia but if you take information directly from Wikipedia "imported from" is the way to go. ChristianKl (talk) 22:21, 2 June 2017 (UTC)
I just saw the previous topic on this page. Should I use "retrieved" in conjunction with "imported from"? I don't fully understand the other discussion. But it seems people want to also distinguish between bot versus user edits. SharkD (talk) 22:46, 2 June 2017 (UTC)
When it comes to bot edits, it's relatively little effort to tell a bot to use "retrieved" every time it makes an edit. With human editing, on the other hand, it unfortunately takes additional time. I think the quality is a bit better if you do use retrieved, but it might not be worth the effort for manual editing. ChristianKl (talk) 09:25, 3 June 2017 (UTC)
Using User:TMg/currentDate.js makes manual application of retrieved (P813) less burdensome Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:18, 3 June 2017 (UTC)
And this script can easily be activated in Special:Preferences#mw-prefsection-gadgets, option currentDate. Very useful indeed. —MisterSynergy (talk) 18:20, 3 June 2017 (UTC)
MisterSynergy (talkcontribslogs) Is there a script to automatically input "English Wikipedia" when using "imported from" property? Thanks! SharkD  Talk  00:21, 10 June 2017 (UTC)

Mac OS

I think Mac OS (Q43627) and macOS (Q14116) should be merged, but am unsure which should be merged into which, and whether it will create complications. Your thoughts? SharkD  Talk  18:40, 8 June 2017 (UTC)

  Not done. They refer to related but distinct concepts. Mac OS (Q43627) refers to an umbrella term for all Mac OS, starting in 1984. macOS (Q14116) refers to the current generation of Mac OS that started in 2001. MechQuester (talk) 18:44, 8 June 2017 (UTC)

Maybe we can update the descriptions to make it more clear which happens to be which? ChristianKl (talk) 06:18, 9 June 2017 (UTC)
Mac OS (Q43627) links to different concepts in various languages: "Macintosh operating systems" (classic+macOS+etc) in English, "Classic Mac OS" in Italian, and even "MacOS x" in Ukrainian. Anyone willing to clean this mess? :-) Syced (talk) 06:49, 9 June 2017 (UTC)
The Ukrainian one looks correct. The Italian one for Q43627 needs to be corrected. I will remove it. MechQuester (talk) 15:26, 9 June 2017 (UTC)
And you made a mistake: you removed the sitelink, but if there is an error, you must move the valid sitelink to another item, or to a new item with the relevant properties. If the situation is messy you can use this page --ValterVB (talk) 07:15, 10 June 2017 (UTC)
In my opinion, the English label for macOS (Q14116) should be "OS X". The term "macOS" (as written) is almost never used in English discussion about computers and "OS X" is Apple's official brand name for it.--Jasper Deng (talk) 16:53, 9 June 2017 (UTC)
It was rebranded to macOS last year. Sjoerd de Bruin (talk) 16:56, 9 June 2017 (UTC)
So be it then. That's news to me. --Jasper Deng (talk) 18:38, 10 June 2017 (UTC)
Thanks. I think the new brand name is strange however, especially since it affects computers going back to 2001. SharkD  Talk  22:26, 9 June 2017 (UTC)

No property for party leader?

I see we have chairperson (P488), but I can't find a property for political party leader. Evidently, we need one: See here. Danrok (talk) 12:43, 9 June 2017 (UTC)

aliases at chairperson (P488) say "chairman chairwoman president leader".
Also, examples state Christian Democratic Union (Q49762)     with Angela Merkel (Q567)     as leader. d1g (talk) 12:47, 9 June 2017 (UTC)
You can make a position item to use with position held (P39) though. Sjoerd de Bruin (talk) 12:48, 9 June 2017 (UTC)
@D1gggg:. chairperson (P488) is no good for party leader, because some parties have both a chairperson and a leader. They're not the same thing. As shown in the example I gave. Danrok (talk) 13:16, 9 June 2017 (UTC)
@Sjoerddebruin:. position held (P39) is of no use for making claims within the political party's item. Danrok (talk) 13:18, 9 June 2017 (UTC)
Oh right, forgot to say to use it on the item of the person. Sjoerd de Bruin (talk) 13:18, 9 June 2017 (UTC)
One may also consider the new secretary general (P3975). Thierry Caro (talk) 14:08, 9 June 2017 (UTC)
  • When it comes to the example of the German CDU, I think it makes sense to model Angela Merkel as chairperson (P488) and Peter Tauber as secretary general (P3975). The CDU doesn't have a title of party leader; Angela Merkel's title is "VORSITZENDE DER CDU DEUTSCHLANDS", and "Vorsitzender" is the German word for chairperson. ChristianKl (talk) 22:05, 9 June 2017 (UTC)
The issue is that some parties have both a chairperson and a party leader, some have one or the other. So, it seems to me that we need a new property for party leader. Danrok (talk) 12:51, 10 June 2017 (UTC)

Worldwide place of publication?

What should I enter for place of publication (P291) when a creative work has been released internationally? Is there a "world-wide" or some other parameter for these cases? Thanks. SharkD  Talk  00:30, 10 June 2017 (UTC)

Are you really sure you need a place of publication (P291) for that kind of item ? The more correct value should be Earth (Q2) but as we don't have any example of creative work published or created outside of Earth I think this is not a critical question. Snipre (talk) 01:03, 10 June 2017 (UTC)
World-wide or international release is not a concept I just invented. SharkD  Talk  01:14, 10 June 2017 (UTC)
I think you are mixing up release and publication. A book can have an international release, but it will always have one place of publication. A place of publication for a video game should be more related to the place where the final version was edited. I think we have reached a problem of property definition here. Snipre (talk) 03:39, 10 June 2017 (UTC)
@SharkD: worldwide (Q13780930) is intended for exactly this purpose. Mahir256 (talk) 05:00, 10 June 2017 (UTC)
Wow! I looked for this but missed it! Thanks. Unfortunately, "place of publication" is not showing up in my Contributions list, so I can't go back and fix them. SharkD  Talk  10:48, 10 June 2017 (UTC)

Request constraint check gadget

I would like to request turning the user script for constraint checks into a gadget. --Jonas Kress (WMDE) (talk) 08:34, 12 June 2017 (UTC)

Maybe add it to Wikidata:Tools/User_scripts first, Jonas Kress (WMDE)? --Atlasowa (talk) 07:25, 14 June 2017 (UTC)
  Done Matěj Suchánek (talk) 14:31, 16 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 14:31, 16 June 2017 (UTC)

Merging help

Please merge Q15625195 to Q8203760.--Jusjih (talk) 17:32, 7 June 2017 (UTC)

They don't seem to be about the same subject. Sjoerd de Bruin (talk) 17:34, 7 June 2017 (UTC)
There are different phrases when categorizing court case laws by year on English Wikipedia and English Wikisource. Renaming the categories on either wiki requires major works, so we have not tried this. Please reconsider it and also from Q15625194 to Q8202761? Thanks.--Jusjih (talk) 17:38, 7 June 2017 (UTC)
Hmm, in this case, let's do it. --Liuxinyu970226 (talk) 04:48, 8 June 2017 (UTC)
Thanks. Also Q15625192 to Q8200959, Q15625190 to Q8198649, please. What does "language mk" mean?--Jusjih (talk) 02:55, 11 June 2017 (UTC)
Macedonian (Q9296)? --Liuxinyu970226 (talk) 12:36, 11 June 2017 (UTC)

Page protection

Is there a gadget or user script, like Wikipedia's Twinkle, which can aid in requesting the protection of a page? Jc86035 (talk) 17:13, 10 June 2017 (UTC)

Requesting administrator rights is probably much easier. Sjoerd de Bruin (talk) 17:15, 10 June 2017 (UTC)
There are plenty of people not eligible for admin rights - not to mention those who should have them but would not be able to due to the petty enmities of others in this community - who would still like to request page protection from time to time. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:27, 11 June 2017 (UTC)

Death categories

A series of en.wiki categories "Category:Death in xxx" are incorrectly interconnected in Wikidata with categories from other languages which are equivalent to "Category:Deaths in xxx", meanwhile the corresponding enwiki "Category:Deaths in xxx" are in other items connected /or not/ with sitelinks from other wikis.

For clarity, categories "Category:Death in X" are for ~anything about death in the place X (e.g. including sub-categories for cemeteries, burials, funerals, mausoleums, etc; example en:Category:Death in the United States), while "Category:Deaths in X" are for people deceased in the place X. Many wikis currently have ~only categories for "Category:Deaths in ...", so, from these two types of items this one should be the "most popular" one, with the most sitelinks, but the current situation shows that we have mostly mixed concepts in items related to this subject.

So, what we have; examples:

A similar situation in:

Other items have no enwiki sitelink, only wrong English label "Death" instead of "Deaths", having correct sitelinks and labels in other languages, for example Q6240650, Q17388715, Q5760689, Q6433900. Seems that in these cases the bad label was added by bot(s), following the most widespread pattern (which is wrong).

Some observations:

  • I am not sure, but seems that besides en.wiki - Turkish, Farsi, and Arabic Wikipedias also have both types of such categories, judging by interwikis from Category:Deaths in the United States (Q6334989) + Category:Death in the United States (Q8366740).
  • Most frequently (almost always or even always), sitelinks for "Category:Deaths in xxx" in Russian (Категория:Умершие в xxx), Ukrainian (Категорія:Померли в xxx), Belarusian (Катэгорыя:Памерлі ў xxx) are correctly interconnected between them, but they may be in the same item with a wrong enwiki sitelink. XXN, 15:43, 11 June 2017 (UTC)
Please don't fix these examples yet; let other people understand the problem.
This could be probably corrected in mass using a bot, if the correct patterns for interconnecting pages from wikis will be known with certitude. XXN, 15:54, 11 June 2017 (UTC)

Power stations

Hi. I've been working with power station articles for some time, and thought of helping move data of individual power stations to WikiData, but I'm facing a problem... Each type of power station (i.e. Geothermal, Nuclear, Pumped-storage, Solar, Thermal, Tidal, Wave, Wind) has different specifics. For example a wind farm will have number of wind turbines/turbine manufacturer/turbine type/etc, or a nuclear plant will have number of nuclear reactors/reactor type/etc (see here for full list of such specifics).

My problem is, I am not able to figure out a way of adding those entries to WikiData. Are they purposely not included, or not included yet? I'd love to add myself, but I don't know how. Can someone guide me on how to go about with this please? Thanks! Rehman 10:59, 8 June 2017 (UTC)

Each station should probably have an instance of (P31) value that is the appropriate type (Geothermal, Nuclear, etc). If there are specific properties needed to support aspects of a given plant you could suggest them as new property proposals - see the Wikidata:Property proposal process. But most likely you can use existing properties - for example has part (P527) with qualifier quantity (P1114) - see day (Q573) for an example where a "day" is indicated as having 24 hours. ArthurPSmith (talk) 13:48, 8 June 2017 (UTC)
I personally think new properties would help in this case, as there are literally hundreds of thousands of power station articles that need proper syncing with Wikidata, in order to easily integrate with infoboxes, etc. I'll look at the property proposals page. If anyone has time, please do feel free to lend a hand :) Rehman 23:36, 8 June 2017 (UTC)
@Rehman: the pattern
< the station > has parts of the class (P2670)   < type of turbine >
quantity (P1114)   < xx >
is meant to solve this kind of case. That allows you to enter all the information you want, I think. author  TomT0m / talk page 06:15, 9 June 2017 (UTC)
Hi TomT0m. That only partially works, unfortunately. For example, Lakvijaya Power Station (Q6479997):
I'm totally stuck on how to proceed from there, as this is blocking the possibility of using Wikidata to auto-fill the power station infoboxes at the Wikipedias. Rehman 13:42, 12 June 2017 (UTC)
Start by modeling not the specifics of each turbine but the specifics of each power plant. --Izno (talk) 14:15, 12 June 2017 (UTC)
Izno, I've added some details to the above power station and the generation unit. Could you assist to check if that's the correct way to do it? I'll then see if I can figure out how to extract those details on infoboxes at Wikipedia... Rehman 14:23, 12 June 2017 (UTC)
That actually looks pretty reasonable. The one issue I see is powerplant: You've added a powerplant which is the company, not a system designed/built by that company. Perhaps qualify it with "designer/constructor/whatever we have for such a property" property rather than powerplant, since I would guess the specific models of turbines are basically one-off designs. If they're not one-offs, then it might be worthwhile to create items for those and then the use of the powerplant property becomes fine. --Izno (talk) 14:29, 12 June 2017 (UTC)
@Rehman: Create an item for each turbine. Each will then have their own claims. Link them to the power plant by « has part ». author  TomT0m / talk page 17:58, 12 June 2017 (UTC)
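A minimal sketch of how the pattern suggested above, has parts of the class (P2670) qualified with quantity (P1114), could be read back for an infobox. It assumes the statements have actually been added to Lakvijaya Power Station (Q6479997) in that shape; the variable names are illustrative only.

```sparql
# Part classes of one power station, with the quantity qualifier if present
SELECT ?partClass ?partClassLabel ?quantity
WHERE
{
    wd:Q6479997 p:P2670 ?statement .               # statement node, so qualifiers are reachable
    ?statement  ps:P2670 ?partClass .              # e.g. a type of turbine
    OPTIONAL { ?statement pq:P1114 ?quantity . }   # how many of them, if stated
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
```

If each turbine gets its own item instead, the same shape should work with has part (P527) in place of P2670, without the quantity qualifier.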

Modeling partners at law firms

Hi all, how are people best modeling partners at a law firm? Right now, position held (P39) -> partner (Q7140693) yields only two people - Tracy Atkinson and Hillary Clinton. I just added Christopher A. Wray (Q30148517) but there should be many more. Thanks. -- Fuzheado (talk) 13:17, 8 June 2017 (UTC)

Looks like the correct way to go. Feel free to add more :-) I guess famous lawyers have their "occupation" set to "lawyer" but few editors have cared about their exact position, so far. Syced (talk) 06:56, 9 June 2017 (UTC)
Thanks much @Syced: -- Fuzheado (talk) 14:04, 9 June 2017 (UTC)
@Syced: @Fuzheado: There are more modelled with this as a qualifier on employer (P108), e.g. Ric Clark (Q20962810) or Jonathan Blattmachr (Q24014264), which seems like a slightly better version to me. However, I don't think position held (P39) is the correct property in either version, though, as it's meant to be for public office — i.e. it's defined as representing a political mandate — rather than a normal employment position. I'm not sure if there's a better qualifying property here than P794 (P794), though. --Oravrattas (talk) 16:13, 10 June 2017 (UTC)
@Oravrattas: - Thanks for that. You're right that position held (P39) is not the right property, though my gut tells me that "position held" is being misused all over the place in Wikidata given the ambiguity of the property label. The employer (P108) with position held (P39) qualifier is better, but could also be seen as non-ideal since a partner is not just an "employer/employee" status. We also don't have a clean breakdown of legal partner vs financial/business partner. Therefore, a query to find all people who are partners in a legal firm is somewhat complex, as it would have to drill down into the subclass property of the firms in which someone is a partner. Unfortunately, a quick spot check of several law firms shows that they are quite bare, and don't even have any statements that indicate they are law firms (Milbank, Tweed, Hadley & McCloy (Q6850819), Rose Law Firm (Q7367828), Debevoise & Plimpton (Q5248071)). We have a long way to go to properly model this, it seems. -- Fuzheado (talk) 12:07, 11 June 2017 (UTC)
A related issue is directors of firms - these are often not "employees" as such & an interesting relationship to model. Don't have a great solution at the moment. Andrew Gray (talk) 12:28, 11 June 2017 (UTC)
@Andrew Gray: - Good point. This is another example of our general systemic bias – lack of interest in making "corporation" entries detailed and complete. -- Fuzheado (talk) 15:32, 11 June 2017 (UTC)
@Fuzheado: Hoi, it is more a sign of immaturity than of systemic bias. Systemic bias is in the lack of data to do with the global south for instance or any other diversity issue. Thanks, GerardM (talk) 15:48, 11 June 2017 (UTC)
@GerardM: - True, but systemic bias is a much larger issue than just global south vs Western European notions of knowledge. At least on English Wikipedia, WikiProject:Corporations is pretty much a dormant project, with few people finding the motivation, or the sympathy, to improve articles about Fortune 500 companies in substance or quality. I suppose Wikidata has the chance to balance this out somewhat. It will be interesting to see. -- Fuzheado (talk) 17:54, 11 June 2017 (UTC)
@Fuzheado: There is already bias in concentrating on the Fortune 500 companies. They are American in scope. There is a world out there. Thanks, GerardM (talk) 18:02, 11 June 2017 (UTC)
@GerardM: I think you're missing the point. The quality of those Fortune 500 articles is generally spotty and poor given their prominence on the world stage. They are not just "American" in scope. See en:Fortune_Global_500#Fortune_Global_500_list_of_year_2016. -- Fuzheado (talk) 07:04, 12 June 2017 (UTC)
Spotty data is imho immaturity; the lack of data for countries like Brazil, Angola or Mozambique is bias. This absolute lack of data is really problematic because none of our tools grasp how much is lacking. With the Fortune 500 we know about them and consequently tools can indicate what is missing. Quite a difference. Thanks, GerardM (talk) 12:29, 12 June 2017 (UTC)
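For the two modelling options discussed earlier in this thread, a query sketch that finds people recorded as partner (Q7140693) either through a direct position held (P39) statement or through a P39 qualifier on employer (P108). It is an illustration under those assumptions, not an agreed model; statements that use P794 as the qualifier would need an additional branch.

```sparql
# People modelled as "partner", by either of the two patterns discussed
SELECT DISTINCT ?person ?personLabel ?firm ?firmLabel
WHERE
{
    {
        ?person wdt:P39 wd:Q7140693 .          # position held: partner (no firm recorded here)
    }
    UNION
    {
        ?person     p:P108  ?employment .      # employer statement node
        ?employment ps:P108 ?firm ;            # the firm
                    pq:P39  wd:Q7140693 .      # qualified as partner
    }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
```

Restricting the result to law firms would additionally require the firm items to carry a suitable instance of (P31) statement, which, as noted above, many of them currently lack.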

Changing the type of “relation”

As we are preparing for migrating constraints to statements on properties, Jonas noticed that the data type of relation (P2309) is not ideal: it would make more sense for it to have the data type “property”, and link directly to instance of (P31) and subclass of (P279), instead of having the type “item” and linking to instance of (Q21503252) and subclass of (Q21514624). (It would even be possible to link to other “relation” properties, though I can’t think of a good example for that right now.)

The ideal time to make that change would be pretty much right now – once we start using constraint statements, such a change would have to be synchronized between the change on the Wikidata entities and the change to the code using those entities (and the time when that code is deployed).

Do you think this change makes sense? Should it be done? Ping Ivan A. Krestinin. --Lucas Werkmeister (WMDE) (talk) 13:43, 8 June 2017 (UTC)

This makes sense to me. The property has only been used 31 times (and some of those uses look incorrect) so it should be easy to change when we can. However, changing datatype is I think not a simple thing in itself - it may have to be deleted and a new property created. ArthurPSmith (talk) 13:54, 8 June 2017 (UTC)
Sure – if it’s easier to create a new property, that’s fine by me. --Lucas Werkmeister (WMDE) (talk) 15:58, 8 June 2017 (UTC)
  Support If the backend implementation should be as simple as possible, then I definitely support this. Matěj Suchánek (talk) 16:08, 8 June 2017 (UTC)
Could you give examples of how you see that this property will be used if we change the datatype? ChristianKl (talk) 17:05, 8 June 2017 (UTC)
  • You'd need two or more properties to do the same. Please see Property:P247#P2302 for how it's meant to be used. A problem we currently have with type constraints is that it's either "P31" or "P279", but not any of the two. Still, some are clearly for instances (P31) only, others not.
    --- Jura 17:52, 8 June 2017 (UTC)
@Jura1, Ivan A. Krestinin: we already use instance of (P31) and subclass of (P279) to check the type – the question is just, why should we have those two items that are really just item versions of the properties? (Practically speaking, it means two extra configuration variables of the Wikibase Quality Constraints extension – two for P31 and P279, and then two more for Q21503252 and Q21514624: the items are redundant, in my opinion.) --Lucas Werkmeister (WMDE) (talk) 10:24, 9 June 2017 (UTC)
instance or subclass of (Q30208840) is an option current templates lack.
--- Jura 09:48, 10 June 2017 (UTC)
@Jura1: The QualityConstraints extension doesn’t support it either – you can’t just create the item and have it work magically, we’ll have to implement support for it if intended :)
And I’d say that two “relation” parameters, one with “instance” and one with “subclass”, would actually be the better way to model this. That’s still possible with a property datatype. --Lucas Werkmeister (WMDE) (talk) 08:43, 12 June 2017 (UTC)

gene (Q7187) uses instance of (P31) differently from rest of Wikidata

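# Count items that are an instance of their own grandparent class
SELECT ?parent2 (COUNT(?item) AS ?number)
WHERE
{
    ?item    wdt:P279 ?parent1 .  # item is a subclass of parent1
    ?parent1 wdt:P279 ?parent2 .  # parent1 is a subclass of parent2

    ?item    wdt:P31  ?parent2 .  # strange relation, avoids ?parent1
}
GROUP BY ?parent2
ORDER BY DESC(?number)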

Try it!

What should we do about it? 507166 links as of now. d1g (talk) 17:32, 9 June 2017 (UTC)

this is an aspect of a very general problem with chemical compounds and related items - it's not clear whether P31 or P279 is the correct relationship, so people use both (in the case of genes, apparently quite methodically). Obviously when one talks about for example BRCA1 (Q227339) the subject is not a specific collection of atoms at one physical location, it is a general arrangement of atoms into a DNA sequence, or perhaps even more abstractly as an information object. So it represents a subset (subclass) of all the things that are "genes" in the sense of specific arrangements of atoms into parts of DNA molecules. Yet it seems obvious BRCA1 "is a" gene as we would understand it - an instance of the concept "gene". So what should the relationship be? Similar discussions have run for a while quite inconclusively under the chemistry wikiproject for example here. ArthurPSmith (talk) 18:08, 9 June 2017 (UTC)
My world view is very primitive: I use P31 only to most isolated physical entity (or abstract if physical manifestations aren't possible), constrained or limited in space and time.
One reason for this is to get information back from ... P31 ... statements easily.
If we use P31 without prioritization, then we have a hard time telling whether instance of (P31) BRCA1 (Q227339) means a physical object or something still abstract.
With my view, (... P31 ...) is always physical, without the need to walk the full P279 tree.
Ideally anything physical should use P279 and physical object (Q223557), but there are dark corners on Wikidata to make it work.
Abstract concepts (gene) should be even higher than physics abstract object (Q7184903), because they are abstract.
With genes we have everything linked to one item, gene (Q7187), which is ambiguous: an abstract gene can be different from subclasses of physical genes.
Does it make sense? d1g (talk) 19:37, 9 June 2017 (UTC)
This is not entirely correct. Metaclasses exist as real concepts in human discussion; there are some examples on Help:Basic Membership Properties. See also Wikidata:WikiProject Ontology/Modelling and references there, particularly on the experience with Cyc's higher-level classes. All that said, in general you are right that instance of (P31) should be used sparingly for abstract things - metaclasses and higher-order or ambiguous classes (as gene perhaps is) should be rare. ArthurPSmith (talk) 20:15, 9 June 2017 (UTC)
That Help:Basic Membership Properties page is rather nice, and somehow I hadn't managed to see it before. Do you know if there's another similar page anywhere that is even more example driven, ideally with lots and lots of examples, and from many different fields, rather than just one or two for each concept? Mostly I've been muddling through pages like Wikidata:Item classification and User_talk:TomT0m/Classification, but they tend to get very technical and very abstract very quickly, so in practice I've largely just been trying to follow the examples already in place, and have sort-of built up a mental model over time of how to use them. But occasionally someone will revert a change I make, and it can sometimes be quite difficult to know where (or whether) my understanding is actually wrong, or when the examples I'm copying might themselves be faulty. (It was a major revelation to me, for example, when Infovarius pointed out that position held (P39) being a subproperty of instance of (P31) essentially means that anyone holding a position that way is also an instance of it.) Or is there a different part of Wikidata where questions about these sorts of things tend to be raised, other than here? --Oravrattas (talk) 20:30, 9 June 2017 (UTC)
@Oravrattas: WD is a wiki, meaning this is collaborative work: discuss your topic with other contributors in a dedicated wikiproject where a common view can be defined. I think one of the big problems we have in WD is that very few people tend to work with other contributors, so it is very difficult to create a coherent model. The worst case is people who can work with a bot: they are able to apply a specific structure to a large number of items without any preliminary discussion about how items should be defined, or which properties should be used with which values. Snipre (talk) 00:53, 10 June 2017 (UTC)
By the way, people with more expertise in medical areas than I have should review recent edits by Special:Contributions/EricSadou - this user has changed a lot of "subclass of" statements to "instance of" in a way that I think is not correct. I've fixed one or two of them but it probably should be more systematically reviewed. ArthurPSmith (talk) 20:19, 9 June 2017 (UTC)
@Daniel Mietchen, Andrawaag: Mahir256 (talk) 23:36, 9 June 2017 (UTC)
@D1gggg, ArthurPSmith: Please don't forget the granularity characteristic of an ontology: this means we are free to decide down to which level of detail we want to describe a concept.
Something like "use P31 only to most isolated physical entity" can be true but is not always the truth. If I take the example of a molecule of ethanol which was in my glass of wine yesterday evening, I can in theory create an item for that molecule because I can describe its position at a certain time, so I can isolate it, but is there any interest in doing so? Do we want to, and are we able to, identify each molecule in the universe, or at least some of them? The answer is no: there are no famous or historical molecules like we have for humans or animals. So even if we are able to distinguish one molecule among billions of similar molecules, there is no interest in creating an item for each unique molecule. So defining ethanol (Q153) as an instance of chemical compound (Q11173) is not wrong if we decide not to go deeper in the level of detail, if we choose to set the lowest granularity at the group of identical molecules and not at the isolated molecules.
I just want to continue the explanation with another example. Currently we accept creating one item for each person (respecting some criteria, see the notability rule), and in the classification of humans the lowest entity is the individual. But this is a choice and not a rule: why can't we create different items for the different aspects of one unique person? One item for "Albert Einstein as scientist", one for "Albert Einstein as student": in that organization, the existing item Albert Einstein (Q937) won't be an instance of human (Q5) but a subclass of human (Q5), as one person can be considered through different aspects (profession, life steps, ...).
So for me the Help:Basic Membership Properties page is not a good classification rule for WD because it tries to create one unique way to define what is an instance and what is a subclass without taking care of how we really model the concepts in WD. It doesn't respect the first question we have to answer when speaking about ontology: what do we want to model, to classify? First we have to answer that question in each field of WD, and then we can create the classification rules to distinguish between instance and subclass.
My proposition is to use a more pragmatic rule to distinguish an instance from a subclass: the instance is the more detailed concept in a classification. I can distinguish the class of methanol molecules from the class of ethanol molecules using different values for a set of common properties in WD, but currently I can't describe two different molecules of ethanol using the existing properties, like the precise location of each of them yesterday in my glass of wine.
So the question of instance/subclass is not a universal definition but more a dynamic question which is different depending on the concepts we are modelling. Snipre (talk) 00:20, 10 June 2017 (UTC)
@Snipre: definitions like "more detailed concept" aren't self-explanatory.
< ethanol (Q153)     >  Wikidata property  < chemical compound (Q11173) >
But why not use P279 between abstract classes?
My approach is to follow what was said at Help_talk:Basic_membership_properties#Proposition_of_definition
"The simple difference between an instance and a class is that an instance has a unique location in space and time, while a class does not."
Because abstract ethanol doesn't have a place and time, it is a class.
Because it is a class, it should be ethanol (Q153) P279 chemical compound (Q11173)
The vast majority of scientific things in Wikidata are abstract, so relations between them should be P279.
storm (Q81054) is an abstract object without a place in time, so it should be linked to phenomenon (Q483247) using subclass of (P279) directly or indirectly.
An exact event of storm (Q81054) with a place and time would be instance of (P31) storm (Q81054).
Direct P279 claims efficiently replace what is sometimes wanted as "abstract thing P31 of abstract class".
@Snipre: one property P279 for abstract concepts is much simpler than P279/P31, isn't it? d1g (talk) 12:47, 10 June 2017 (UTC)
@Snipre, D1gggg: naturally I disagree somewhat with both of you here (yes there is a third intermediate option :) - Snipre, "ethanol" is not actually the "lowest granularity" in several respects even without talking about individual molecules isolated to a particular physical location etc. Does "ethanol" refer to the abstract arrangement of atoms in an individual molecule, or to the general concept of the liquid substance (or possibly solid or gas state) composed purely of those molecules? Is "ethanol" in different contexts (mixed with water for example, or with various other chemical compounds) the same concept? If some of the hydrogen atoms are actually deuterium, if a C is C-13 or the O O-17 or O-18, is that a different concept? For scientific purposes those different substance states, contexts, or isotopic arrangements imply differences in properties, so that "ethanol" is not really a uniform consistent thing under all conditions. We might (at least hypothetically) want different wikidata items for each of those conditions, in order to specify precisely relevant properties. I think those different items, were they to be created, would all be "subclasses" of "ethanol", because they are describing a subset of the conditions covered by the general concept "ethanol", so their specific instances would also be instances of the general concept. But is a specific instance of a given type of molecule actually an instance of the general concept "chemical compound"? I'm not so sure, and our definitions aren't so clear. There are similar issues in the area of "products" - for example blue cheese (Q746471) is an "instance of" type of cheese (Q3546121) while a "subclass of" cheese (Q10943) which seems right to me - so is "chemical compound" a "type" in that sense? ArthurPSmith (talk) 13:43, 12 June 2017 (UTC)
@ArthurPSmith: yes, it seems best to use a separate item to make P31 claims over "types" of cheese or chemical compounds.
I also think that ethanol can be "physical", i.e. we can have at least one claim
< ethanol (Q153) > subclass of (P279)   < eventually physical class >
So that P31 Q153 claims would mean a physical arrangement of ethanol molecules. d1g (talk) 14:41, 12 June 2017 (UTC)

I wrote an entire help page on the matter, help:classification. author  TomT0m / talk page 14:15, 10 June 2017 (UTC)

My understanding after User:TomT0m/Classification#Classifying_classes was that we need "type of gene" item, not - is it wrong? d1g (talk) 14:39, 10 June 2017 (UTC)
Apparently the «gene» term is polysemous. The cleanest way I can see to handle this is to create an item for each sense, to make the polysemous item a superclass of all of them, and to look at existing ontologies and definitions to check whether all of this matches.
My guess would be: at the concrete level, a gene instance is a (part of) molecule. Then a gene like «LacY» is the class of all these molecules, usually defined by their common sequence. If «LacY» has alleles, they are all subclasses of «LacY» (say «LacY1» «LacY2»). As a consequence, «LacY1» and «LacY2» are both classes of molecules, just as «LacY». They are defined by a more specific sequence.
I think there are two ways to see the « gene » concept: either as a class of molecules, a superclass of «LacY», «LacY1», «LacY2» and all the similar classes, like « lacZ ». All the instances of « gene » would share things like their opening and ending sequence, so it’s no problem to clearly define the class. Now there may be other subclasses of it that are neither gene classes nor alleles, so it’s probably useful to have a distinct item « gene » that is the metaclass of all «LacY» « lacZ » … classes. And an « allele » item, metaclass for all «LacY1» «LacY2» … classes that would not make sense as a class of molecules. author  TomT0m / talk page

Inconsistent parsing of dates

I've just noticed this when working with start time (P580): entering a date with different punctuation can change the way it's interpreted.

  • 9 7 2017 - 9 July 2017
  • 9/7/2017 - 7 September 2017
  • 9.7.2017 - 9 July 2017
  • 9-7-2017 - 9 July 2017
  • 7 9 2017 - 7 September 2017
  • 7/9/2017 - 9 July 2017
  • 7.9.2017 - 7 September 2017
  • 7-9-2017 - 7 September 2017

I understand that this sort of date is ambiguous and we shouldn't rely on it being interpreted consistently, but why does one set of punctuation give a different result to another? It can't be that using slashes is unique to MDY notation - I've been writing DMY dates this way all my life. Andrew Gray (talk) 21:37, 9 June 2017 (UTC)

I think Wikidata has to go by common international standards, not necessarily what you've been doing all your life. SharkD  Talk  22:25, 9 June 2017 (UTC)
Well, yes, I agree entirely :-). The international standard is YYYY-MM-DD; we support the "traditional" xx-xx-YYYY for convenience, but there's no standard saying "dashes mean D-M-Y but slashes mean M/D/Y" - my comment was meant as a demonstration that punctuation between numbers isn't at all linked to a local style of date notation. Apologies if it was unclear... Andrew Gray (talk) 22:33, 9 June 2017 (UTC)
It's not consistent for any given punctuation either, e.g. "9/7/2017" gives "7 September 2017" but "19/7/2017" gives "19 July 2017" while "7.9.2017" gives "7 September 2017" but "7.19.2017" gives "19 July 2017". I would like to see it give the user a list of plausible interpretations when a date format is ambiguous and ask them to pick which one is right, instead of acting like there is only one way to interpret it. - Nikki (talk) 00:00, 10 June 2017 (UTC)
Remind me - how many countries in the world use MMDDYYYY format? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:10, 10 June 2017 (UTC)
Unclear, but on behalf of us Americans who are perhaps the most prominent users of MM/DD/YYYY, I apologize. :) -- Fuzheado (talk) 10:26, 12 June 2017 (UTC)

Video game remakes

Sometimes a video game is remade a few years after release, with improved graphics, additional dialogue between different characters, etc. What property should I use to indicate when this has been done? Is based on (P144) the property I should use? Thanks. SharkD  Talk  22:23, 9 June 2017 (UTC)

You can use significant event (P793) with an appropriate value describing this kind of re-edition, or you can create a new item like we do for books when a new edition is released. This should be discussed with the video game wikiproject, Wikidata:WikiProject Video games, in order to define a common rule. Snipre (talk) 01:09, 10 June 2017 (UTC)
That project is dead as far as I can tell. SharkD  Talk  01:16, 10 June 2017 (UTC)
There are no dead projects. There are only projects waiting for your participation :-) Syced (talk) 01:50, 13 June 2017 (UTC)

exact match (P2888) and external-ids

Do we need to use exact match (P2888) if a corresponding external-id property exists? For example, see the Entrez Gene ID (P351) and exact match (P2888) properties on Ngp (Q18252644). Ping @Andrawaag, Sebotic:

  Notified participants of WikiProject Ontology. — Ivan A. Krestinin (talk) 08:47, 10 June 2017 (UTC)

  • In this case exact-match is redundant with Entrez Gene ID. However, the exact-match URL (http://identifiers.org/ncbigene/) looks more canonical than the Entrez Gene URL. I would recommend changing the formatter URL of Entrez Gene to use this more canonical URL, if the scope of Entrez Gene is subsumed by the scope of identifiers.org/ncbigene (I know nothing of life sciences). In fact it may be useful to create an external-id for identifiers.org and track that too: this ID would be "ncbigene/18054". The value of such a redundant ID would be that it gives access to all kinds of life science entities. --Vladimir Alexiev (talk) 07:31, 12 June 2017 (UTC)

Idea for new editing tool

Here's a query for people newly elected as United Kingdom MPs, rendered as a table with - at the time of writing - gaps where data is missing for things like "website", "Twitter name", "Facebook profile".

We could turn that into an editing interface, where people could enter text where gaps exist, and either have that saved directly into Wikidata, if the users are signed in, or into a Wikidata-game like interface for checking by a signed-in editor.

The columns requiring Q values would not be editable (or we could have an autosuggest-style method of adding them), neither would the fields already holding data.

This would enable us to crowd-source the collection of discrete sets of data, attracting people not currently contributing to Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:55, 10 June 2017 (UTC)

We have Tabernacle. --Edgars2007 (talk) 16:37, 10 June 2017 (UTC)
That is not something that could be presented to novice editors to start their engagement with Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:17, 10 June 2017 (UTC)
I have always thought that this would be a wonderful idea, indeed. Many people would take on the challenge to find a picture for each type of cat, an address for every museum, a batting average for every baseball player. Right now, people entering data are Wikidata experts. With such an interface, people entering the data would be cat/museum/baseball experts, not Wikidata experts. I really hope someone makes it a reality. I often write such queries, but for now I only post them to the relevant WikiProject, where they aren't exposed enough to domain experts. Syced (talk) 03:38, 11 June 2017 (UTC)
I think it would be good if the SPARQL tool would have the option to provide data entry. Otherwise there's work on automatically generated lists for Wikipedia. A good UI for lists might also solve this use-case. ChristianKl (talk) 12:11, 13 June 2017 (UTC)

User script help

Could somebody please help me with this user script I am working on? It is based on the currentDate script. It is supposed to automatically insert "English Wikipedia" when I add the "imported from" property to a reference. It is adding the text correctly I think, but the "save" link stays gray instead of turning blue. Thank you! SharkD  Talk  18:56, 11 June 2017 (UTC)

Try inserting "Q328", instead. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:30, 11 June 2017 (UTC)
Thanks! Which method do I use to insert it? Also, is there a manual for Wikidata-specific jquery API? SharkD  Talk  20:00, 11 June 2017 (UTC)
You are aware that "imported from" should only be used for bot additions? Sjoerd de Bruin (talk) 20:28, 11 June 2017 (UTC)
I was told this was not the case. Link. SharkD  Talk  13:36, 12 June 2017 (UTC)
@Sjoerddebruin: Can you tell me where such a decision was made? I agree that when it comes to manually adding facts it's a lot better to add them from a source outside of Wikipedia but if someone adds a fact that he knows from a Wikipedia source, "imported from" is the best way to provide provenance. ChristianKl (talk) 12:31, 13 June 2017 (UTC)
Inserting the string "Q328" does not help any. The "save" link is still grayed out. SharkD  Talk  14:34, 12 June 2017 (UTC)

Correcting external databases

Are there any plans to notify external database owners in some kind of "official" way about errors in their databases? Here in Wikidata there are many constraint violation reports, sometimes with dozens of entries stemming from duplicate entries in other databases. I know that there are ongoing efforts with VIAF, GND and partially also IMDb to fix their databases by removing duplicate entries, but what about other databases (e.g. those of big sport organizations (FIFA etc.), federal databases or databases of science organizations)? Of course users can do this, but at least I got no reaction to such efforts (which is not really surprising I think, such organizations probably get tons of emails every day). I think this should be done in a more coordinated and more "official" way by Wikimedia or at least some representative Wikidata people to have an impact. Steak (talk) 20:08, 11 June 2017 (UTC)

Interesting idea. I imagine a "report card" could be generated that could publicize to database owners what inconsistencies we have found with their data. I discussed something similar at the recent Wikicite conference about how we might report back to web sites how good/bad their metadata is for Citoid/Zotero. It's hard to find a direct channel back to an organization's "responsible person," but a periodic public report might do the trick. -- Fuzheado (talk) 07:10, 12 June 2017 (UTC)
@Fuzheado: The 'report card' is our constraint report. For example, the DBLP folk you and I met at WikiCite are using Wikidata:Database reports/Constraint violations/P2456 to clean up their data (remove duplicates, etc). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:01, 12 June 2017 (UTC)
The problem is not the collection of corrupted data, but the transfer to the organizations. Steak (talk) 19:58, 12 June 2017 (UTC)
Not all data that triggers a constraint report is "corrupt"; indeed, most is not. Several of us are working with external data managers to inform them of relevant constraint reports and other methods of cooperation; case studies are also being collected; see past issues of the 'Wikidata weekly summary'. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:21, 12 June 2017 (UTC)
#Reporting errors in external databases Matěj Suchánek (talk) 07:28, 12 June 2017 (UTC)
i have emailed some photo libraries for metadata correction on an ad hoc basis. but we should have a contact process for error correction as a part of the upload process. they should have an email as a fall back. Slowking4 (talk) 21:25, 12 June 2017 (UTC)

Call for submissions for the WikidataCon program

Hello there,

From now and until July 31st, you can submit projects for the WikidataCon program!

Our focus is sharing your knowledge and experience within the community. We suggest a lot of different formats, topics, and we also have room for unusual formats and new ideas.

Feel free to submit your projects, support other submissions, ask the program committee for any question.

We hope that many of you will contribute to this program to make the WikidataCon amazing, community-driven and diverse :)

Thanks, Lea Lacroix (WMDE) (talk) 10:22, 12 June 2017 (UTC)

Administrative territorial entities: How to add links to geoshapes (that are not located on Commons)?

Since February 2017, Swisstopo is publishing the geoshapes of administrative territorial entities in Switzerland as linked data: [4]. They have approached me to find out if and how links to these geoshapes could be added to the corresponding items on Wikidata (i.e. cantons, districts, and municipalities in Switzerland).

With a quick search I was able to dig up: geoshape (P3896). But despite its more general label, this property seems to be reserved for geoshapes that are hosted on Wikimedia Commons. – What is the general idea behind having such a property, specifically targeted at Wikimedia Commons? Don't we want to link to geoshapes (available as open data) elsewhere?

How should we proceed in the Swisstopo case? --Beat Estermann (talk) 11:10, 12 June 2017 (UTC)

Same reason as why you can only use Commons files and not images somewhere else on the internet. Multichill (talk) 12:38, 12 June 2017 (UTC)
So, would you advise in favor of ingesting them all into Wikimedia Commons? - If yes, is there a way to automatize the process? Given that the data is published as linked open data and that it will be updated in the future when changes to the municipality structure are made, this would be helpful.
Wouldn't it also be helpful in this case to have a statement for each Wikidata item that points to the official shapefile - if only to facilitate the automated ingest of the shapefiles into Commons? – For experimental purposes I have for now added a second formatter URI for RDF resource (P1921) statement to Swiss municipality code (P771), which actually provides the link to the item on the Swisstopo site. It's rather well hidden from the user; so I'm not entirely sure whether that's the way to go.
And there is yet another issue that would need to be resolved on the Wikidata side: on Wikipedia and Wikidata, Swiss municipality codes are "normalized" to four digits (thus, "351" becomes "0351"), while this is not the case in the official reference database. Of course, you can deal with this specifically when writing SPARQL queries pulling data from both services (see https://tinyurl.com/jqkkwrv for an example), but this requires some extra code to be written by people who are aware of the issue to begin with. – I guess it would make sense to correct the Swiss municipality codes on Wikidata.
Any advice on how to further proceed with this case is appreciated. --Beat Estermann (talk) 14:02, 12 June 2017 (UTC)
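The padding mismatch described above ("351" versus "0351") can be bridged with a small normalization step on either side. What follows is only a minimal sketch in Python, assuming the codes arrive as plain digit strings; the function names are hypothetical and not part of any Wikidata or Swisstopo tooling.

```python
def to_padded_form(code: str) -> str:
    """Zero-pad a municipality code to four digits, e.g. "351" -> "0351"."""
    code = code.strip()
    if not code.isdigit():
        raise ValueError(f"not a numeric municipality code: {code!r}")
    return code.zfill(4)


def to_unpadded_form(code: str) -> str:
    """Drop leading zeros, e.g. "0351" -> "351"."""
    code = code.strip()
    if not code.isdigit():
        raise ValueError(f"not a numeric municipality code: {code!r}")
    return str(int(code))


def same_municipality(code_a: str, code_b: str) -> bool:
    """Compare two codes regardless of which padding convention each uses."""
    return to_unpadded_form(code_a) == to_unpadded_form(code_b)
```

Comparing on the unpadded form keeps the check independent of which convention a given source happens to use.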
Try to make test edits with something from Commons:Commons:Upload_tools or Commons:Commons:Upload_Wizard?
Let us know what works the best now. d1g (talk) 15:44, 12 June 2017 (UTC)

Wikidata weekly summary #264

Building under construction

I am not sure how to indicate that a building is under construction. I think we need such information, for example for queries like tallest skyscrapers.

I have these ideas:

Third solution seems the best to me.--Jklamo (talk) 10:42, 11 June 2017 (UTC)

World Trade Center (Q11235)
significant event (P793) and construction (Q385378)
missing end date can mean still in progress d1g (talk) 10:49, 11 June 2017 (UTC)
How can I query this (filter out incomplete buildings, note that majority of completed buildings does not have significant event (P793) and construction (Q385378) filled)? How can I record other states - proposed/approved/under construction/on-hold/architecturally topped out/structurally topped out/demolished etc. ?--Jklamo (talk) 15:06, 12 June 2017 (UTC)
Exclude items with event Q331483 regardless of qualifiers, then exclude those with Q385378 that have P580 but not P582.
They should be subclasses of occurrence (Q1190554), direct or not. d1g (talk) 02:21, 14 June 2017 (UTC)
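The exclusion rule suggested in this thread can be written out as a filter. This is only a sketch in Python over a simplified, hypothetical in-memory representation (a dict with an "events" list holding "event", "start" and "end" keys standing in for P793, P580 and P582); it is not a query against the live service.

```python
EXCLUDED_EVENT = "Q331483"  # the event item named in the reply above
CONSTRUCTION = "Q385378"    # construction


def is_finished(building: dict) -> bool:
    """Apply the two exclusion steps; buildings with no events pass through."""
    for event in building.get("events", []):
        if event["event"] == EXCLUDED_EVENT:
            return False  # excluded regardless of qualifiers
        started = event.get("start") is not None
        ended = event.get("end") is not None
        if event["event"] == CONSTRUCTION and started and not ended:
            return False  # construction has a start time but no end time
    return True
```

Buildings without any significant event statement are kept, which matches the observation that most completed buildings do not have one.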

500,000,000 edits

We have made it to number 500,000,000. Matěj Suchánek (talk) 14:53, 12 June 2017 (UTC)

No wonder my wrists ache! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:43, 12 June 2017 (UTC)
Sweet! In comparison, the English Wikipedia is currently at around 780 million edits, so I don't think it'll take too long for us to pass them. Someone may even be able to chart the number of edits over time and make a graph (I'd do it myself, but I'd spend way too much time researching how to properly do it to bother). Jon Harald Søby (talk) 21:39, 12 June 2017 (UTC)

[Chart: Timeline of Wikidata by number of edits. Edits (hundreds of millions) / point in time]

[Chart: Milestones (edits). +100M / date]

XXN, 22:36, 12 June 2017 (UTC)

However, we should recognize the fact that this number is heavily inflated, and we could have acquired all the current content of WD with fewer than 100 million edits. optimization (Q24476018) is a foreign word for Wikidata. XXN, 14:23, 13 June 2017 (UTC)
Not for everybody ;). Emijrp (talk) 15:22, 13 June 2017 (UTC)
Emijrp  . Wikidata missed you in previous years! XXN, 21:20, 13 June 2017 (UTC)
Hey XXN: So what is the difference between optimization (Q24476018) and mathematical optimization (Q141495)? -or optimum (Q1186200) --Succu (talk) 21:39, 13 June 2017 (UTC)
@Succu: not sure about optimum (Q1186200) (I don't speak any of the languages in the linked articles), but mathematical optimization (Q141495) is considered a subclass of optimization (Q24476018) (this should be a more generic term applicable in life). I should have used a normal word instead of some poor item so as not to confuse people, my apologies. XXN, 22:10, 13 June 2017 (UTC)

Physical dimensions of prints

A two-dimensional print (Q11060274) (or in general instances of artwork superclasses) has physical dimensions height (P2048) and width (P2049). There are cases where different physical dimensions can be found in sources, e.g. for image size and paper size.

  • Which ones are typically used for physical dimension properties?
  • One could add multiple values and distinguish them with qualifiers. Good idea? Which qualifiers fit best (property/value pairs)? (barely done for instances of image (Q478798) and subclasses thereof until now)
  • Should ranks be used to prefer one type of dimensions over another?
  • Which WikiProject or users have experience in this field?

Thanks, MisterSynergy (talk) 16:51, 12 June 2017 (UTC)

@Jane023: ! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:20, 12 June 2017 (UTC)
Well in the paintings project we avoid these issues by ignoring all framing sizes altogether (and most images of paintings have their frames cut off as "sculpture" on Commons anyway), so I would go with image size for consistency. If you have both measurements per print and aren't afraid of the trouble to model them in such a way that others can follow what you are doing, go ahead. Modelling prints on Wikidata is still a bit up in the air, since no one wants to tackle them. Potentially it could be a nightmare if every library in the world wants to upload their print-of-a-print, but then some prints of old originals might be notable if the original prints are lost or quite rare, etc. Someone I spoke with recently felt that we should upload the entire Hollstein catalog (Gah...) and then each library can link their version to the iconic item for that print. I can't decide, but good luck to you! Jane023 (talk) 17:37, 12 June 2017 (UTC)
Thanks for your answer.
The task I work on is in fact a repair job which came to me via #Unitless claims (section above). There was a large set (~4000) of print items identified which are part of (P361) Welsh Landscape Collection (Q21542493), but they have a problem of missing units for their physical dimensions (P2048 and P2049). Since all data is equipped with an external identifier and it looked consistent, I decided to invest some time and “fix it” (the missing unit problem). However, I was made aware of the fact that all (?) of them have bad data for width and height, mixing up image height (assigned as width) and paper width (assigned as height). This was imported by another user from Commons, where it is still wrong as well.
Fortunately, uniform sources are provided and they are even accessible in JSON format, so I now have information of both image size and paper size for all ~4000 items in a local database (I crawled all of them). They “just” need to be prepared for the correction job, which should include the addition of a source as well.
One more question: the physical dimensions of a 2D artwork is typically given as “width × height”, isn’t it? I just want to make sure that it isn’t the other way round ;-) —MisterSynergy (talk) 18:30, 12 June 2017 (UTC)
Sorry, no idea about prints, but with paintings, it is generally H x W and not the other way around (see e.g. here). Having said that, just do a spot check on the data looking at the prints to find out which way your dataset does it - most prints are pretty obvious, since they tend towards the page size and rectangular shape. 4000 is a lot - thanks for your work on that! Jane023 (talk) 21:53, 13 June 2017 (UTC)

Mediawiki API Service for WDQS

We have deployed the first iteration of Mediawiki API support for Wikidata Query Service. Please see the manual for the full documentation; the main highlights are outlined below.

The service allows calling out to some Mediawiki APIs from SPARQL in order to obtain information not contained in the RDF data and the WDQS database. See the list of the supported APIs in the manual.

Currently a small subset of existing APIs is supported, and we expect the community to nominate more services and contribute service templates to extend the API. Please see the manual for description of service templates. Note that we do not plan to support any APIs that modify data, edit wikis, etc. - only read-only querying APIs and only APIs that do not require any authorization can be supported.

Currently supported hosts are: *.wikipedia.org, commons.wikimedia.org, www.mediawiki.org, www.wikidata.org, test.wikidata.org. If any other wikis need to be supported, please leave a comment to the developers and we will enable them.

Example service query (more in the docs):

The following query uses these:

  • Properties: subclass of (P279)    , instance of (P31)    
SELECT * WHERE {
  SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:api "EntitySearch" .
      bd:serviceParam wikibase:endpoint "www.wikidata.org" .
      bd:serviceParam mwapi:search "cheese" .
      bd:serviceParam mwapi:language "en" .
      ?item wikibase:apiOutputItem mwapi:item .
  }
  ?item (wdt:P279|wdt:P31) ?type
}

If there are any problems or questions, please contact the developers on the wikidata list, #wikidata on IRC, or on wiki, or submit a Phabricator issue.

TODOs:

  • Add more services (nominations welcome)
  • Support services that accept multiple titles as input in one query
  • Implement parameter types

--Smalyshev (WMF) (talk) 18:55, 12 June 2017 (UTC)

@Smalyshev (WMF): Very nice! This opens up all sorts of new possibilities. Not sure yet about use cases, but I'm sure we'll find some good ones! Multichill (talk) 19:22, 12 June 2017 (UTC)
One use-case (probably already possible via generator) - confirm, that all transclusions of "Template:Infobox park" are marked as parks at WD. --Edgars2007 (talk) 11:20, 13 June 2017 (UTC)
Fantastic! This should yield a nice speedup for the OpenRefine reconciliation interface once I find the time to migrate to that. − Pintoch (talk) 19:48, 13 June 2017 (UTC)

Userscript for marking duplicates

I have created a userscript which simplifies the process of marking two pages as duplicates. This is necessary when two pages about the same topic exist in the same wiki and are connected to two duplicate items (example).

You can enable it by adding this to your scripts:

mw.loader.load( '//www.wikidata.org/w/index.php?title=User:Matěj_Suchánek/markasduplicate.js&action=raw&ctype=text/javascript' );

You will see "Mark as duplicate" link in the sidebar. When you click on it and type the id of the second item, the script will mark the current item as duplicate (using Wikimedia duplicated page (Q17362920)), identify which wikis have duplicate articles and try to request merging them (using templates from Template:Merge (Q6919004)).

Please be careful when using it; you can make a mess in many wikis in a short time. Feel free to report any problems you find. Good hunting! Matěj Suchánek (talk) 15:17, 9 June 2017 (UTC)

  • Excellent! Thanks for writing this. With all the new upcoming redirects, we might need it even more.
    --- Jura 15:43, 12 June 2017 (UTC)
Thank you, Matěj Suchánek! You could add it to Wikidata:Tools/User scripts (with a screenshot?) --Atlasowa (talk) 07:15, 14 June 2017 (UTC)

Described by source

described by source (P1343) has a constraint that the value should be an item that is an <instance of> work (Q386724). Since the contents of dictionaries and encyclopedias vary by edition, shouldn't we reference a specific edition rather than the work? (With a page number and specific entry to boot, to my way of thinking). - PKM (talk) 21:32, 13 June 2017 (UTC)

version, edition, or translation (Q3331189) is in the subclass hierarchy of work (Q386724) which is all that the property constraint requires. So yes, it probably should be as specific as possible in practice, but I don't think that needs to be reflected in the constraint. ArthurPSmith (talk) 13:30, 14 June 2017 (UTC)

Contempt for our upper classes

Have a look at for instance "Sultan of the Ottoman Empire" and ask yourself if this is for real? This structure is so baroque I feel it has nothing to do with the things I am working on. When you compare it with other sultans / monarchs these have different classes. So the application of classes is inconsistent as well. When you look at the upper classes, it cannot be explained with a straight face without a straight jacket.. How did we get here? Thanks, GerardM (talk) 07:35, 14 June 2017 (UTC)

Reasonator seems to be picking out a particularly long thread of class relationships there - if you follow the subclass hierarchy through "activity" for example you get a much shorter list. In any case, yes, wikidata class relationships are rather messy, though I think they are slowly improving as people see problems and address them. Please participate in Wikidata:WikiProject Ontology (see in particular the Problems page) to help. ArthurPSmith (talk) 13:40, 14 June 2017 (UTC)

Next IRC office hour on June 28th

Hello,

Our next Wikidata IRC office hour will take place on June 28th, 18:00 UTC (20:00 in Berlin), on the channel #wikimedia-officeconnect.

During one hour, you'll be able to chat with the development team about the past, current and future projects, and ask any question you want.

See you there, Lea Lacroix (WMDE) (talk) 13:17, 15 June 2017 (UTC)

Guide for QuickStatements 2?

Does anyone know if there is a step-by-step guide for making QuickStatements 2 batch edits? I've been writing commands up using the version 1 format, then importing them into the new version, but I can't seem to figure out how these people are making batches that show up in the batch history. – Pizza1016 (talk | contribs) 03:18, 14 June 2017 (UTC)

I'm pretty sure the option 'Run in background' gives you the option to describe the set of commands you wish to run, after which the set is queued as a single batch. Mahir256 (talk) 04:47, 14 June 2017 (UTC)
I also would love to see a guide for QuickStatements 2. I am one of "these people" that make batch runs, but I am still clueless about what that means. I use version 1 syntax, and at the bottom there are 2 buttons: "Run" and "Run in the background". I usually click "Run", but once or twice I clicked the other button instead; the result was the same but it took a bit longer. --Jarekt (talk) 02:11, 17 June 2017 (UTC)

Facto Post – Issue 1 – 14 June 2017

Facto Post – Issue 1 – 14 June 2017
 

Editorial

This newsletter starts with the motto "common endeavour for 21st century content". To unpack that slogan somewhat, we are particularly interested in the new, post-Wikidata collection of techniques that are flourishing under the Wikimedia collaborative umbrella. To linked data, SPARQL queries and WikiCite, add gamified participation, text mining and new holding areas, with bots, tech and humans working harmoniously.

Scientists, librarians and Wikimedians are coming together and providing a more unified view of an emerging area. Further integration of both its community and its technical aspects can be anticipated.

While Wikipedia will remain the discursive heart of Wikimedia, data-rich and semantic content will support it. We'll aim to be both broad and selective in our coverage. This publication Facto Post (the very opposite of retroactive) and call to action are brought to you monthly by ContentMine.

Links
Editor User:Charles Matthews. Please leave feedback for him.

If you wish to receive issues of Facto Post on English Wikipedia, please add your name to our mailing list. You can always remove it.
Newsletter delivered there by MediaWiki message delivery

Charles Matthews (talk) 14:19, 14 June 2017 (UTC)

What do you mean exactly by "post-Wikidata"? Good luck with the Facto Post, looking forward to the next issues! Syced (talk) 05:42, 15 June 2017 (UTC)
Ah. I have tried thinking about Wikimedia integration around Wikidata. I think this is happening, but it is hard to explain to anyone not already a Wikimedian working on several of the sister projects. The WMF is doing a review of prospects for 2030. I have contributed to both stages so far; in the first stage I said Wikidata should by then have one billion statements. Why not? But this also doesn't say enough.
So here I'm also talking about integration, but in a more encyclopedic way. In its 15th edition, the Encyclopædia Britannica innovated (Micropædia, Macropædia, and Propædia); but this was not really a success. Wikipedia has innovated also, but we need to look at both technologies and groups of people, to understand the potential for successfully building a new kind of reference work.
The input-output issues around Wikidata seem like a good way to understand things. Wikidata inputs (automated, semi-automated, and via the fact mining which I'm working on at WikiFactMine). Holding areas such as mix'n'match, potentially LibraryBase. Wikidata outputs, not just to infoboxes but via SPARQL, and some form of WikiCite export.
So what I meant about "post-Wikidata" is that, after about five years, there is a new perspective available. Charles Matthews (talk) 07:12, 15 June 2017 (UTC)
@Charles Matthews: Thanks a lot for starting this initiative. I think the movement needs people as courageous as you for advancing in its mission. Are you coming to the Wikidata conference? Did you apply for a grant already? If not, please do so, as this is the kind of submissions that really need the attention of the community.--Micru (talk) 04:48, 16 June 2017 (UTC)
I was hoping to – Lydia invited me when she was in Cambridge. Charles Matthews (talk) 04:59, 16 June 2017 (UTC)

Secure IRC URLs

For IRC channel (P1613) on Pepper&Carrot (Q24289654) I wanted to enter ircs://chat.freenode.net:6697/peppercarrot, but saving failed with the message “An URL scheme "ircs" is not supported.” In the meantime I added the non-TLS URL instead. What can be done about this? FranklyMyDear... (talk) 11:48, 15 June 2017 (UTC)

It is meant to be formatted as a regular expression ... ircs?://[^ \t\r\n\v\f/]+/\S+, so it will take a developer to find the issue.  — billinghurst sDrewth 08:30, 16 June 2017 (UTC)
There is a list of allowed protocols for properties of the URL data type. Right now irc is allowed but ircs is not. Can you file a ticket on phabricator.wikimedia.org? Then we can add it. --Lydia Pintscher (WMDE) (talk) 11:49, 16 June 2017 (UTC)
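The regular expression quoted above can be checked directly against the URL from the original question. A minimal sketch using Python's re module (the helper name is hypothetical) suggests the pattern itself accepts the ircs scheme, which is consistent with the rejection coming from the separate list of allowed protocols rather than from the format pattern.

```python
import re

# Pattern exactly as quoted in the reply above.
IRC_URL = re.compile(r"ircs?://[^ \t\r\n\v\f/]+/\S+")


def matches_irc_pattern(url: str) -> bool:
    """True if the whole string matches the quoted format pattern."""
    return IRC_URL.fullmatch(url) is not None
```

For example, `matches_irc_pattern("ircs://chat.freenode.net:6697/peppercarrot")` evaluates to `True`, as does the plain `irc://` form of the same address.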

Surfacing more of our data in Wikipedia templates

For values such as population, Wikipedia templates often only list the most recent value. We could display an icon like 📊 after the number. If the user clicks on the icon, we could provide him with a chart in a popup that shows the historical data for the population. What do you think about such a feature? Would it be well-received by the Wikipedians? ChristianKl (talk) 15:29, 16 June 2017 (UTC)

It could be an opt-in feature, or one that users can turn off themselves. MechQuester (talk) 15:59, 16 June 2017 (UTC)

Question about electoral districts

What property would you use to indicate in which electoral district a town is located? Thanks, Amqui (talk) 16:43, 16 June 2017 (UTC)

I would suggest d:Property:P131, like in Selkirk, Varsity View, and Inuvik. --Larske (talk) 17:13, 16 June 2017 (UTC)
I've been wondering about this for UK constituencies - I haven't got a good answer yet, but I'm a little cautious about located in the administrative territorial entity (P131) because it might confuse the main hierarchy of administrative areas. In these Canadian cases, are the electoral divisions the same as the next level of administrative area, or is there otherwise nothing between the city and the state? I think it might be a bit messier in other countries and lead to items having multiple chains of located in the administrative territorial entity (P131) as a result. E.g. Oxford (Q34217) (city) is located in Oxford (Q20986484) (district, functionally the same) located in Oxfordshire (Q23169). But electorally it is located both within Oxford West and Abingdon (Q1050814) and Oxford East (Q1050889) (both of which also contain areas outside the city, so you can't use contains administrative territorial entity (P150)). So that's three. I can see this getting very complicated...
@Oravrattas:, any thoughts? @Jheald:, I know you've been looking at UK areas - am I missing a trick here? Andrew Gray (talk) 21:34, 16 June 2017 (UTC)
I'm not entirely sure that this is something that makes sense to model directly like this, any more than it would make sense to model what electoral district a lake is in. Any given geographic point will usually be within multiple electoral districts (i.e. at various levels of government), which will often have geographies that intersect in wildly different ways. Depending on the level in question, a town will often either contain many electoral districts, or overlap with multiple others. As there isn't usually a direct mapping like this, this seems like something that should really be queried a different way — e.g. by having shapefiles for the various geographic concepts you want to compare. --Oravrattas (talk) 22:47, 16 June 2017 (UTC)
There is a proposal for a new property at Wikidata:Property proposal/District.
--- Jura 05:21, 17 June 2017 (UTC)

Districts are not fixed. It is what gerrymandering is about. So the best thing is to have maps associated with electoral districts. Also, a town does not need to be fully in one district.. Just consider the electoral district of Kensington as an example. Thanks, GerardM (talk) 07:18, 17 June 2017 (UTC)

NB An example of a map with electoral districts of Kerala..

How to add value for language not in MediaWiki?

How would I go about doing that? I want to add pī Lín pī Kǒng yùndòng in Pinyin (Q42222) with title (P1476), but it is not in the list of languages. PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 02:03, 17 June 2017 (UTC)

This is same on the talk page. Dave Bansg 02:54, 17 June 2017 (UTC)

Huh, isn't pinyin transliteration (P1721) enough? --Liuxinyu970226 (talk) 05:48, 17 June 2017 (UTC)
Pinyin (Q42222) isn't a work and thus not in the domain of title (P1476) and thus there's no reason to use it in this context. In general I would recommend you (PokestarFan) to stick to cases where the right edit is more obvious. ChristianKl (talk) 17:22, 17 June 2017 (UTC)

How to use Tabular data

The "Tabular data" datatype has been enabled in Wikidata, allowing links to tabular data files stored on Wikimedia Commons (currently as raw JSON files, but probably a spreadsheet-like editor to simplify data editing will be made available at some point). Besides being a more convenient way of storing raw data tables than Wikidata, it also allows some nice features like interactive graphs or filtered lists. Currently, just a sandbox property exists (Sandbox-Tabular data (P4045)), and AFAIK there are no conventions yet about how to use this datatype in actual properties. Properties that are likely candidates for this datatype are those representing the evolution over time of a quantity (e.g. population (P1082), visitors per year (P1174), employees (P1128), total revenue (P2139), net profit (P2295)...), but for sure others are also possible (e.g. a series of strings in software version identifier (P348)).

The question is how to indicate the contents of each column in the property / item, including:

  1. See for example c:Data:Wikidata/St.Petersburg.tab, if a new "population table" property is created for linking to these type of tables, I think new properties (similar to Wikidata property (P1687)) are needed to link as statements (in the "population table" property) to the point in time (P585), and population (P1082) properties, with a qualifier to indicate the name id of the column (i.e. "year" and "population") and the unit used ("human").
  2. The JSON table in commons allows to add an optional "sources" file for referencing the source of the material, but in many cases each row of the table will have a different source (e.g. census of that year), so probably by convention in those cases a specific column (or columns) will be needed to store the specific source (and a specific property to indicate the format, e.g. reference URL (P854) or/and determination method (P459)).
  3. The same table can contain different data values (e.g. temperature and precipitation in c:Data:Ncei.noaa.gov/weather/New York City.tab), but maybe each property should only be allowed to describe a single one (e.g. separate "temperature data" and "precipitation data" properties, instead of a single "weather data" property) ignoring the other columns, even if useful for data visualization purposes. But I think this should be discussed also from a technical point of view.
  4. These description properties for simple tables are meant to be specified in each property definition, as each table file will refer just to a single Wikidata item, but for complex tables containing data for multiple items (e.g. c:Data:Bea.gov/GDP_by_state.tab or c:Data:Dolmens_of_the_Preseli_Hills.tab) another qualifier will be needed in the item to indicate which column contains the actual data for this item (e.g. in the Alabama (Q173) item, a qualifier will be used for the "GDP table" property to indicate that the "AL" column contains the applicable data). It is also possible to forbid the use of these complex tables, requiring by convention that each table only contains data for a single item, but if tables imported from recognized sources usually contain data for multiple items this may be impractical, requiring conversions.

I just want to discuss some alternatives first, with more experienced users, before creating a new RfC with a more mature approach. Thanks for your ideas!! —surueña 07:33, 12 June 2017 (UTC)
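Item 4 of the list above (one table holding data for several items, with a qualifier naming the relevant column) can be illustrated with a short sketch. It is written in Python and assumes the table has already been loaded as a dict with a "schema" entry listing the fields and a "data" entry listing the rows; the function name, the "year" key column and all sample figures are made up for illustration.

```python
def column_for_item(table: dict, key_column: str, value_column: str) -> list:
    """Return (key, value) pairs for the one column that applies to an item."""
    names = [field["name"] for field in table["schema"]["fields"]]
    key_index = names.index(key_column)      # raises ValueError if absent
    value_index = names.index(value_column)  # raises ValueError if absent
    return [(row[key_index], row[value_index]) for row in table["data"]]


# Invented sample shaped like a multi-item table; the numbers are not real data.
sample_table = {
    "schema": {"fields": [{"name": "year"}, {"name": "AL"}, {"name": "AK"}]},
    "data": [[2015, 1.0, 3.0], [2016, 2.0, 4.0]],
}
```

Here `column_for_item(sample_table, "year", "AL")` would pick out only the "AL" column, which is the role the proposed qualifier on the Alabama (Q173) item would play.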

@Suruena: great questions and ideas there! One additional thing that I think would be really useful would be some way to tie the data tables into SPARQL queries, so that the data from the table could actually be queryable and/or returnable. Maybe we can get the SPARQL endpoint folks to weigh in? In response to your first 3 questions/statements it seems perhaps we ought to encourage the use of the wikidata property id's as column headings in the data tables or otherwise have some way to tie them directly to the files, rather than using qualifiers within the wikidata item (I'm not sure how you'd even do that, as a qualifier can only have one value, but you're trying to link two values together??) Another option is to use file format (P2701) as qualifier to point to very specific file formats that list the individual properties used in those files. Your 4th issue is also something that could be important, I think there is a good case for one or two new properties there - 'key field' and 'key value' perhaps, to be qualifiers on the table statement? ArthurPSmith (talk) 14:16, 12 June 2017 (UTC)
Tabular data is data that can't be easily queried with SPARQL. As a result, I don't support moving properties like population (P1082) to the new data type. The motivation for using the new datatype is that we think keeping the relevant data inside of Wikidata takes up too much space and the ability to query the data with SPARQL isn't useful enough to pay that price. ChristianKl (talk) 15:21, 12 June 2017 (UTC)
That's a very good point, I agree that those properties meant to replace current facts like population shouldn't be created unless table files can be queried with SPARQL. RDF support for tabular data is currently being implemented (T163921), is this enough to query this data from SPARQL? In any case, my proposal is to replicate the data model, so table files are just a more convenient way to put a long statement group, i.e. one column is the value of the property (e.g. population count), and any additional column is a qualifier (e.g. point in time) or reference (e.g. reference URL) for that value. The aim is to make it as simple as possible to support in queries (but in a more convenient way, and allowing interactive graphs and filtered lists without the need to duplicate information). @ChristianKl: It's also true that we can start with tabular properties for data that takes a lot of space but is not very relevant for queries, do you have examples please?
@ArthurPSmith: I've been elaborating the proposal in my mind, and I was thinking of something like this: Suppose a new property "population evolution" that links to .tab files like c:Data:Wikidata/St.Petersburg.tab. We need to describe, in the statements of any property with the Tabular data type, the format of the files that can be linked, i.e. we need to indicate that one column will have the population value and another will be the qualifier with the point in time. Then, something like the following statements and qualifiers would be in the "population evolution" property:
Then, in the Saint Petersburg (Q656) item, the statement "population evolution" would be just a link to c:Data:Wikidata/St.Petersburg.tab. It's important to describe the allowed format, so constraint violations can be detected (e.g. mandatory column not present, or the values of the column do not meet the constraints required by the property population (P1082)). Other qualifiers will probably be needed, e.g. whether the .tab column must have string/number/boolean/etc format or the allowed units, and probably also a convention for how to write a "no value" / "unknown value" and ranks in the .tab files. A more complex approach will be needed to support complex tables, so it is probably better to just allow simple tables (all columns referring to a single item) and have scripts that take complex tables and split them into simple tables.
I think that the use of wikidata property id's as column headings is a possibility, but this would be more difficult for casual editors to modify. Can you give an example of file format (P2701) usage? Would it be something like the description of each column in statements, but as a separate item? Thank you very much to both of you for your great feedback! —surueña 06:57, 13 June 2017 (UTC)
Yes, each column in the file described by statements in a separate item for the file format is what I was thinking. I think that might be a better option than creating a distinct property for each data format; on the other hand doing it via the property as you suggest would mean the relationships were very solidly defined from the start, constraints could be easily checked, etc. So either way could work. As to ChristianKl's specific domain comment - rather than population we have an old property proposal for temperature data here which I think would be ideal to try to get something like this working. On the other hand I'm not sure what existing commons tabular files for temperature exist that would be a good starting point there.... ArthurPSmith (talk) 15:34, 13 June 2017 (UTC)
  • I don't think that it's important to seek for use-cases of the new feature. It's much better to focus on how to model specific domains well. But if you want a use-case, if we had the goal of integrating most of the historic weather data that's in the public domain at the highest resolution that's available, that's likely too much data to store it directly in Wikidata. ChristianKl (talk) 12:01, 13 June 2017 (UTC)
@ChristianKl: I agree that storing historic weather data is a good candidate for tabular data, even in case it cannot be queried from SPARQL. In any case, I've asked the developers what the current plan is for this (see Wikidata:Contact the development team#Tabular data in SPARQL queries?). @ArthurPSmith: Thanks for pointing out that property discussion, the data file proposed seems representable by the approach I proposed (c:Data:Ncei.noaa.gov/weather/New York City.tab, to generate the Wikipedia table en:User:Yurik/WeatherDemo through Lua scripts), either by defining a different wikidata property per column ("highest temperature evolution", "average high temperature evolution", "Lowest temperature evolution", "precipitation evolution"...), or a single property "weather table" (with probably optional columns when data is not available for some locations). I prefer one property per column, and the scripts for printing the Wikipedia tables can probably work fine with multiple properties, but that is another point to be clarified. —surueña 06:07, 19 June 2017 (UTC)
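
The format check discussed in this thread (a mandatory column is missing, or a column has the wrong type) could look roughly like the sketch below. It assumes the table is already loaded as a dictionary with "schema" / "fields" and "data" keys, in the style of Commons .tab pages. The column names, the REQUIRED mapping and the sample numbers are made up for illustration and are not part of any proposal.

```python
# Hypothetical description of what a "population evolution" table must contain:
# column name -> expected field type. These names are assumptions.
REQUIRED = {"year": "number", "population": "number"}

# Placeholder table in the style of a Commons .tab page (values are invented).
SAMPLE = {
    "schema": {
        "fields": [
            {"name": "year", "type": "number"},
            {"name": "population", "type": "number"},
        ]
    },
    "data": [[1900, 1000], [1910, 1200]],
}


def check_table(table, required):
    """Return a list of problems; an empty list means the table passes."""
    problems = []
    fields = {f["name"]: f["type"] for f in table["schema"]["fields"]}
    # Every mandatory column must exist and have the expected type.
    for name, expected in required.items():
        if name not in fields:
            problems.append(f"missing column: {name}")
        elif fields[name] != expected:
            problems.append(
                f"column {name} has type {fields[name]}, expected {expected}"
            )
    # Every row must have exactly one cell per declared column.
    width = len(table["schema"]["fields"])
    for index, row in enumerate(table["data"]):
        if len(row) != width:
            problems.append(f"row {index} has {len(row)} cells, expected {width}")
    return problems


print(check_table(SAMPLE, REQUIRED))
```

Value-level constraints (allowed units, ranges, "no value" markers) are left out here because the thread has not settled how they would be written in the files.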

How to harvest instance of (P31) from an infobox?

All articles using this infobox should be made instances of park (Q22698). How to do that?

  • I can't use Harvest Templates as it wants me to specify a parameter.
  • By specifying a widely-used parameter such as "name" I can trick Harvest Templates into giving me a rather complete CSV list of pages, but then I can't feed that list to QuickStatements, because QuickStatements does not have the capability to detect whether each item has a P31 or not already (so I would end up adding park (Q22698) also to items that already have a more specific P31 such as national park (Q46169), which would be bad).

How can I do this? I encounter this situation very often. Thanks in advance! Syced (talk) 09:44, 12 June 2017 (UTC)

Use petscan (already pre-filled). You can use this tool even to add claims as desired. —MisterSynergy (talk) 09:49, 12 June 2017 (UTC)
Wikidata:Database reports/items without claims categories/jawiki lists a few categories that might help you find other candidates.
We could also set up a template-based report for jawiki (compare with Wikidata:Database reports/templates and items with 0 claims/nlwiki for nlwiki).
User:NoclaimsBot regularly adds P31 for a few templates from specific wikis. Maybe it could be configured for jawiki as well.
--- Jura 11:00, 12 June 2017 (UTC)
@Syced, Jura1: yes, it is quite easy to set up the bot for another language. I do need someone who understands the language to make a page like en:User:NoclaimsBot/Template claim and keep an eye on it. Multichill (talk) 11:09, 12 June 2017 (UTC)
@Multichill: Like this? I speak Japanese and can maintain this page, so please add it to the input list of your bot, thanks a lot! :-) Syced (talk) 14:58, 12 June 2017 (UTC)
@Syced: yes, like that. Please update ja:利用者:NoclaimsBot/Template claim, that's the page the bot looks at.
The bot does need Wikidata:Database reports/without claims by site/jawiki (and Wikidata:Database reports/templates and items with 0 claims/jawiki) to function. @Pasleim: can you generate these reports? Multichill (talk) 15:20, 12 June 2017 (UTC)
@Multichill: When editing that page I am told "This filter was triggered because you tried editing other user's userpage. Make sure you're now editing your userpage with your account. If you surely believe you should edit the page, you can request sysop to edit the page (Add new section). Add {{AllowEdit}} to your userpage if you want others to edit yours." Could you please add that or point to my page? Thanks a lot! Where is the source code of your bot? In particular, I would like to know how it reacts when the item already has a P31 (or the given property). If an article is about both a radio broadcast and a TV broadcast, thus having both infoboxes, will the tool add both? Will the bot add "TV broadcast" if the item already is an instance of "NHK TV broadcast"? Thanks! Syced (talk) 02:20, 13 June 2017 (UTC)
Right, the Jawp went completely overboard with abusefilter. You should be able to edit the page now. As for source and other documentation: All on User:NoclaimsBot.
The bot loops over all pages and tries all templates so will probably add both? I could probably just return after a successful claim to be a bit more efficient. Multichill (talk) 11:44, 18 June 2017 (UTC)
I'd be very wary of that. The en.Wikipedia version of that template, for instance, is also used on things that are not parks, including nature reserves, botanical gardens, arboretums, and even a biography in which a park is discussed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:52, 12 June 2017 (UTC)
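
For the worry raised at the start of this thread (QuickStatements would add park (Q22698) even where a more specific P31 is already present), one workaround is to filter the list before pasting it. The sketch below assumes you already have, for each item, the list of its current P31 values (for example exported from a query). The function name and the second item ID are placeholders, and, as noted above, a human still needs to check that each page really is about a park.

```python
def quickstatements_for_missing_p31(items, target="Q22698"):
    """Build tab-separated QuickStatements lines only for items without any P31.

    `items` is an iterable of (item_id, existing_p31_values) pairs.
    """
    lines = []
    for item_id, existing_p31 in items:
        if existing_p31:
            # Skip items that already have an instance-of value, such as a
            # more specific class like national park (Q46169).
            continue
        lines.append(f"{item_id}\tP31\t{target}")
    return "\n".join(lines)


# Placeholder input: the first item has no P31 yet, the second already has one.
candidates = [
    ("Q4115189", []),
    ("Q13406268", ["Q46169"]),
]
print(quickstatements_for_missing_p31(candidates))
```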

Germans (Q42884), only for the group or also individuals?

Would this item only refer to Germans as a collective group, or does it also refer to individual Germans of that group? More specifically, would the singular noun "German" have the same referent as this item? CodeCat (talk) 20:22, 14 June 2017 (UTC)

No. This item is for the ethnic group, not for individuals. You can use it with ethnic group (P172), but not with instance of (P31). Steak (talk) 20:33, 14 June 2017 (UTC)
Is there an item for an individual German person, then? CodeCat (talk) 20:55, 14 June 2017 (UTC)
What kind of knowledge do you want to express? ChristianKl (talk) 21:33, 14 June 2017 (UTC)
I want to link a sense of the word "German" (on Wiktionary) to a Wikidata item: "A member of the Germanic ethnic group which is the most populous ethnic group in Germany; a person of German descent." If the item is only for the group itself (for which Wiktionary has no entry since it's just the plural), then I don't know what to do. CodeCat (talk) 21:41, 14 June 2017 (UTC)
I would propose to create a new item titled "citizen of Germany" that subclasses citizen (Q1020994) and that has country (P17) Germany. ChristianKl (talk) 08:52, 15 June 2017 (UTC)
What is the difference with "country of citizenship" "Germany" ? It is imho redundant. Thanks, GerardM (talk) 11:09, 15 June 2017 (UTC)
Firstly, being German by citizenship and German by ethnicity are separate things, even if they aren't usually distinguished. Secondly, the point is that there is a noun in English which refers to: 1. a person of the Germans (Q42884) ethnic group and 2. a person who is a citizen of Germany (Q183). Currently, Wikidata does not seem to have an item for either of these referents, i.e. there is no "German citizen" item nor is there a "person of the German ethnic group" item. This works fine when you want to express these things with a property, but I'm looking to connect the referents of these two senses of the noun "German" directly to Wikidata items. I have already done so with other words, such as wikt:Paris where in the English section the sense "The capital city of France." is tagged with {{senseid|en|Q90}} to indicate that it refers to the same thing as Paris (Q90). CodeCat (talk) 13:00, 15 June 2017 (UTC)
The core difference is that "country of citizenship" "Germany" is no item but a statement. The project of interlinking with Wiktionary means that we need a variety of new items for concepts for which we currently don't have items but for which Wiktionary has words. @CodeCat: If you want an item for the concept of an individual that's of German ethnicity, the principle of creating a new item is the same. ChristianKl (talk) 13:58, 15 June 2017 (UTC)
Why include Wiktionary terms? It is probably more relevant doing it the other way around. All Wikidata items have labels that can be conjugated. This is seeking a problem for a solution. Thanks, GerardM (talk) 15:05, 15 June 2017 (UTC)
Because it's useful to link meanings of words to the corresponding Wikidata item. CodeCat (talk) 15:13, 15 June 2017 (UTC)
For applications of text-mining it's very useful to be able to go from Wiktionary terms to the concept behind those terms. The concept of a German citizen is also a "clearly identifiable conceptual entity" and thus notable. ChristianKl (talk) 11:30, 16 June 2017 (UTC)
@GerardM: It's interesting how fast you shift from wanting to include all data that has possible usage, to exclude data in this case. ChristianKl (talk) 13:35, 16 June 2017 (UTC)
I don't think this is the right thing to do. Wikidata does not model things by creating new subclasses for every possible intersection of things, it primarily describes things using properties. If the way Wiktionary links to Wikidata can't handle that, it sounds like something which needs fixing in Wiktionary. If we want to have an item "citizen of Germany", the argument for it should be more than just "Wiktionary wants to say that "German" means "citizen of Germany" instead of linking "citizen" and "Germany" separately" because that reasoning can be applied to lots of words (e.g. "female French citizen" could be added for wikt:en:Französin, "multiple female French citizens" for wikt:en:Französinnen...) and Wikidata is not a dictionary. Work is being done to add support for lexical information, but that will use new prefixes, not the existing Q and P ones. - Nikki (talk) 09:08, 16 June 2017 (UTC)
Adjectives seem to lend themselves well to being mapped to properties. The adjective "German" in one of its senses, would be ethnic group (P172) = Germans (Q42884), i.e. the adjective sense applies when an entity has that property with that value. In a roundabout way, the noun "German" can then be defined as any entity for which that property holds, i.e. a German person. I have no idea how that would be expressed on Wiktionary's side, but it's an idea. It could also help to distinguish the noun "green" green (Q3133) from the adjective "green" color (P462) = green (Q3133). CodeCat (talk) 21:02, 16 June 2017 (UTC)
Wikidata holds information about concepts in the Q-namespace. The noun and adjective green both refer to the same concept. Wikidata will soon have additional datatypes like the lexeme datatype and then we have separate items for noun and adjective. ChristianKl (talk) 18:33, 18 June 2017 (UTC)

@CodeCat: plural forms are not so useful to link when singular form is linked to Wikidata item. At least Russian "немец" "немка" (colloquial "германец") are aliases of Germans (Q42884); "немцы" is used here. Yes, many items are missing separate item and subclasses of Q17519152 because one/many distinction is implemented using objective properties (e.g. collection or exhibition size (P1436)) most of the time. d1g (talk) 18:59, 15 June 2017 (UTC)

In Wiktionary, both the singular "German" and the plural "Germans" are combined under one lemma, so these are a single unit from Wiktionary's point of view. Thus, the distinction one German versus many Germans is not made, they are the same sense. The Wikidata item, however, seems to refer specifically to all the Germans, the entire set of people whose ethnicity is German. This is a nuance that Wiktionary doesn't have, although perhaps it should. CodeCat (talk) 19:24, 15 June 2017 (UTC)
This lemma corresponds to Germans (Q42884) very well
Consider adding major languages at your user page ({{#babel:en-0|fr-0}}) in order to view all languages with their Help:Labels and Help:Aliases. d1g (talk) 19:52, 15 June 2017 (UTC)
@CodeCat: to answer your two missing items, we are using properties for this matter:
1. a person of the ... ethnic group: human (Q5) item with ethnic group (P172)
2. resident of ... country: human (Q5) item with country of citizenship (P27) or rarely country for sport (P1532) d1g (talk) 20:02, 15 June 2017 (UTC)
Yes, I'm aware of that. It doesn't work for Wiktionary though, since we're dealing with a specific concept, which requires its own item. I see three possibilities: 1. we link the Wiktionary sense straight to Germans (Q42884), understanding them to be the same, 2. we create a new item for a person of German ethnicity and link the Wiktionary sense to that, and also do this for every other ethnicity, or 3. we don't link the Wiktionary sense at all and just leave it as it is. CodeCat (talk) 20:22, 15 June 2017 (UTC)
1 or 3 now; maybe 2 if we really need these items for other reasons (as values in claims)
1 because a. ethnic group (Q41710) can consist of last survivor; b. ethnic group (Q41710) is studied by (P2579) ... even after each member dies. d1g (talk) 20:41, 15 June 2017 (UTC)
  • I have been thinking.. Suppose someone is of Turkish ancestry and a third generation German.. Would you call him a German or a Turk. If you make a choice is that not racist? Thanks, GerardM (talk) 11:08, 16 June 2017 (UTC)
    • We should use whichever one (or both!) we have a reliable source for, and reference the source(s). - PKM (talk) 20:15, 16 June 2017 (UTC)

Query to keep Commons Creator page (P1472) property and Commons creator templates in synch

The Wikidata Commons Creator page (P1472) property and Commons creator templates have a kind of reciprocal relationship: Commons Creator page (P1472) points from an item to one of the Commons Creator templates, and each creator template has a "Wikidata" field with the q-code of the item. If a creator template has a "Wikidata" field with the q-code of some item but that item does not have a matching Commons Creator page (P1472) property, then the creator template is placed in c:Category:Creator templates with Wikidata link: item missing linkback. However, I cannot figure out how to write a query or generate a list of q-codes for items with a Commons Creator page (P1472) property that points to a template that does not have a "wikidata" field pointing back. Any idea if such a query can be written or if some tool like petscan, etc. could help me? --Jarekt (talk) 02:39, 16 June 2017 (UTC)

If I understand correctly what you're looking for, the new MWAPI for WDQS can be of some help.
The following query uses these:
  • Properties: Commons Creator page (P1472)    
SELECT (IRI(concat("https://commons.wikimedia.org/wiki/", ?template)) as ?templateLink) ?templateName ?creatorItem ?creatorItemLabel {
  # List the members of the Commons maintenance category through the MediaWiki API
  SERVICE wikibase:mwapi {
     bd:serviceParam wikibase:api "Generator" .
     bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
     bd:serviceParam mwapi:gcmtitle "Category:Creator templates without Wikidata link" .
     bd:serviceParam mwapi:generator "categorymembers" .
     bd:serviceParam mwapi:gcmlimit "max" .
     ?template wikibase:apiOutput mwapi:title  .
  }
  # Drop the 8-character "Creator:" prefix (substr positions start at 1)
  BIND(substr(?template,9) as ?templateName) .
  OPTIONAL {
    ?creatorItem wdt:P1472 ?templateName .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
  }
  # Keep only templates that some item already points to
  FILTER ( BOUND(?creatorItem) ) .
}
    
However it may not return all wanted entries, due to limitations with MWAPI call results limit. This limit really doesn't make sense in the SPARQL context in my opinion, especially for generators. @Smalyshev (WMF): any chance the engine can iterate by using continue parameter? -- Nono314 (talk) 19:39, 16 June 2017 (UTC)
Theoretically, it is possible, practically, different APIs seem to do continuations differently, so it may be hard to implement it in generic way. E.g. for this API, continue is in gcmcontinue, but for search it's sroffset, and for querypage it's qpoffset. If I figure out a way how to generalize it, I can implement it. Though one needs to be careful as result may be too big and lead to timeouts. --Smalyshev (WMF) (talk) 22:54, 16 June 2017 (UTC)
Sure, as for any SPARQL query! @Smalyshev (WMF): Thanks for having a look at it. I think it really makes sense for generators. So we could pass the right parameter in the query, and you would just iterate until you get batchcomplete in result? And maybe a way to specify a maximum number of iterations? -- Nono314 (talk) 23:38, 16 June 2017 (UTC)
Nono314 this is great, that is exactly what I needed. Thanks a lot! I think I fixed all that were showing up, so the query does not return anything now. --Jarekt (talk) 03:13, 17 June 2017 (UTC)
Hopefully in the near future this will deliver complete result sets. In the meantime, Jarekt you can fiddle with gcmsort/gcmdir/gcmstart/gcmend parameters to get some additional results, like this. By the way, it seems like a number of the missing wikidata ids have actually been recently removed by your "reset linkback" bot run (eg [5]), so you may want to check that you are not undoing with one hand what you do with the other. -- Nono314 (talk) 10:46, 17 June 2017 (UTC)
I know that that bot job removed a few Wikidata links (it affected only pages where the Wikidata field was on the same line as the "Linkback" field, in a template where each field is supposed to be on a separate line), so there is a new urgency to the query to fix things I have broken. --Jarekt (talk) 14:46, 17 June 2017 (UTC)
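
The continuation idea discussed above (pass the continue values back in, stop when nothing is left to continue, and cap the number of iterations) can be sketched as a plain client-side loop. This is not how WDQS implements anything; it is only an illustration. The `fetch` argument stands in for whatever performs the HTTP request, `fake_fetch` is a stub with invented titles, and the response shape assumes the API returns "pages" as a list.

```python
def fetch_all(fetch, params, max_batches=10):
    """Collect pages across API continuation batches, up to max_batches requests."""
    results = []
    request = dict(params)
    for _ in range(max_batches):
        response = fetch(request)
        results.extend(response.get("query", {}).get("pages", []))
        cont = response.get("continue")
        if not cont:
            # No "continue" block: the server has nothing more to send.
            break
        # Merge the continuation values (e.g. gcmcontinue) into the original
        # parameters, so the loop does not need to know the parameter name.
        request = {**params, **cont}
    return results


def fake_fetch(request):
    """Stub standing in for a real API call; returns two batches."""
    if "gcmcontinue" not in request:
        return {
            "continue": {"gcmcontinue": "page|2", "continue": "gcmcontinue||"},
            "query": {"pages": [{"title": "Creator:A"}]},
        }
    return {"batchcomplete": True, "query": {"pages": [{"title": "Creator:B"}]}}


pages = fetch_all(fake_fetch, {"generator": "categorymembers"})
print([page["title"] for page in pages])
```

Merging the whole "continue" block avoids the problem mentioned above that each API module names its continuation parameter differently (gcmcontinue, sroffset, qpoffset), and `max_batches` bounds the result size.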
I think the second query I offered should have fixed them, as it was listing recently modified templates. Additionally, you may be interested by the following query listing creator templates that do not have wikidata id, but for which there is an item with Commons category (P373) pointing to their home category.
The following query uses these:
  • Properties: Commons category (P373)    
SELECT (IRI(concat("https://commons.wikimedia.org/wiki/", ?template)) as ?templateLink) ?templateName ?categoryName ?commonsCatItem ?commonsCatItemLabel {
  # First call: members of the maintenance category, sorted by category timestamp, newest first
  SERVICE wikibase:mwapi {
     bd:serviceParam wikibase:api "Generator" .
     bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
     bd:serviceParam mwapi:gcmtitle "Category:Creator templates without Wikidata link" .
     bd:serviceParam mwapi:generator "categorymembers" .
     bd:serviceParam mwapi:gcmtype "page" .
     bd:serviceParam mwapi:gcmlimit "max" .
     bd:serviceParam mwapi:gcmsort "timestamp" .
     bd:serviceParam mwapi:gcmdir "descending" .
     ?template wikibase:apiOutput mwapi:title  .
  }
  hint:Prior hint:runFirst 1 .
  # Second call: the non-hidden categories of each template page
  SERVICE wikibase:mwapi {
     bd:serviceParam wikibase:api "Categories" .
     bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
     bd:serviceParam mwapi:titles ?template .
     bd:serviceParam mwapi:clshow "!hidden" .
     ?category wikibase:apiOutput mwapi:category  .
  }
  # Drop the "Creator:" (8 characters) and "Category:" (9 characters) prefixes
  BIND(substr(?template,9) as ?templateName) .
  BIND(substr(?category,10) as ?categoryName) .
  OPTIONAL {
    ?commonsCatItem wdt:P373 ?categoryName .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
  }
  # Keep only templates whose home category is used as Commons category (P373) on some item
  FILTER ( BOUND(?commonsCatItem) ) .
}
    

-- Nono314 (talk) 15:41, 17 June 2017 (UTC)

Nono314 that is a good one too and I already added the connections. I have a way of finding creator templates that do not have wikidata id, but for which there is an item with sitelink pointing to their home category, but this one is better and can catch more connections. Another good query might be using c:Category:Creator template home categories without Wikidata link, as the creator template names often do not match category names. Thanks again. --Jarekt (talk) 00:32, 19 June 2017 (UTC)
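
A small note on the substr(?template,9) and substr(?category,10) calls in the queries in this thread: SPARQL counts string positions from 1, so a start of 9 skips the eight characters of "Creator:" and a start of 10 skips the nine characters of "Category:". The helper below only mimics that behaviour in Python to make the offsets easy to check; the page titles are made-up examples.

```python
def sparql_substr(text, start):
    """Mimic SPARQL substr(text, start): 1-based start, runs to the end."""
    return text[start - 1:]


# The offsets used in the queries are the prefix length plus one.
creator_offset = len("Creator:") + 1
category_offset = len("Category:") + 1

print(creator_offset, sparql_substr("Creator:Example Artist", creator_offset))
print(category_offset, sparql_substr("Category:Example Artist", category_offset))
```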

3 year old bad bot edits

I was just looking at the strange case of Q4747853 and w:en:Amos_Doolittle. Three years ago the blocked bot User:GPUBot, run by User:GautierPoupeau, "imported" dates from English Wikipedia. However, the dates were somehow corrupted in transfer. Then a French article was written citing the wrong dates. I corrected the dates on Wikidata, but I do not speak French to correct them there. Shall we run some bot jobs searching for cases where source wiki dates no longer match Wikidata? I wonder how many more cases like that were created by that bot. --Jarekt (talk) 04:45, 18 June 2017 (UTC)

Looks like enwiki was wrong when the bot ran - the import was in February 2014, and the enwiki article was corrected in April 2014. So it's not a corruption issue, thankfully :-)
I think a script like this would be a good idea - does anything for it exist at the moment? Andrew Gray (talk) 08:21, 18 June 2017 (UTC)
Andrew Gray thank you for correcting me. I could not figure out where the issue came from, but now it makes sense, and I am less concerned about other edits by the bot.
This is the only way where we curate both Wikipedias and Wikidata. It is not only for one it is needed for any project. Thanks, GerardM (talk) 10:38, 18 June 2017 (UTC)

Missing years

The future years are a bit of a mess. Every once in a while a tiny Wikipedia creates a bunch of them. These get imported as items, duplicates occur, stuff gets deleted, etc etc. We're wasting time here. I propose to properly create and link all years from now to 3017. Why 3017? If you look at my notepad (pl) you'll notice the highest year is 3001 (Q29976877) so we might as well do now + 1000 to not run into this for a very very long time. Taking 2017 (Q25290) as an example. Each item would have:

Opinions please and please check if 2017 (Q25290) is correct in your language. Multichill (talk) 11:08, 18 June 2017 (UTC)

I looked at some of the years. 2017 and several are correct (for Chinese). MechQuester (talk) 11:37, 18 June 2017 (UTC)
Sounds good - I like the idea of creating these now so that we have a solid framework. Will it also be worth going backwards as well to cover historic years and make sure they all have the basic metadata? Andrew Gray (talk) 11:51, 18 June 2017 (UTC)
It looks like early years are quite a mess. A few problems I noticed with a cursory look:
  1. There are separate items for 0 (Q23104) and 1 BC (Q25299) even though they are two names for the same year.
  2. 0 (Q23104) contains a claim that it follows 1 BC, even though they are the same year.
  3. 1 BC (Q25299) contains a claim that it is an instance of year BC (Q29964144). "Pre-Julian Roman calendar year" conflates several ideas: the Julian calendar (Q11184), the Roman calendar (Q200966), and the notation used to name years.
Jc3s5h (talk) 12:43, 18 June 2017 (UTC)
@Jc3s5h: The Swedish article starts by telling that the year never existed. That is not a big deal. This is an item about a Swedish municipality that never has existed. This is one of at least two Moons of the planet Saturn that never has existed. -- Innocent bystander (talk) 18:34, 18 June 2017 (UTC)
Why year (Q577) instead of calendar year (Q3186692)? year (Q577) seems to be defined with a fixed length, while different years in our calendar don't have the same length. It might also be valuable to fill duration (P2047). ChristianKl (talk) 13:20, 18 June 2017 (UTC)

I would create even more, until year 10,000 at least. There are plenty of astronomical events and predictions that could benefit from this. We have also other calendar years, like Hebrew 4372 AM (Q12405196) and Islamic 175 AH (Q6005370). Emijrp (talk) 13:39, 18 June 2017 (UTC)

We rarely use these items ourselves because our date/time properties don't use items. The main reason these items exist is because of the sitelinks, so I don't think there's much point going far beyond the range of years which have sitelinks. Filling in the gaps makes sense to me, but going all the way to 10,000 (and beyond) seems like overkill. - Nikki (talk) 16:37, 18 June 2017 (UTC)
The Wikipedias that do have a lot of articles for individual years frequently store information about what a year is called in different calendars in the respective articles. Maybe we can store the same information to make them more useful? ChristianKl (talk) 18:02, 18 June 2017 (UTC)
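
On the point raised in this thread that 0 (Q23104) and 1 BC (Q25299) are two names for the same year: that is the astronomical year numbering convention, where year 0 is 1 BC, year -1 is 2 BC, and so on. A minimal sketch of the conversion follows. The function names are invented for illustration and say nothing about how the items should be modelled.

```python
def bc_to_astronomical(year_bc):
    """Convert a BC year (1, 2, 3, ...) to astronomical numbering (0, -1, -2, ...)."""
    if year_bc < 1:
        raise ValueError("BC years start at 1")
    return 1 - year_bc


def astronomical_label(year):
    """Name an astronomical year number in BC/AD notation."""
    if year <= 0:
        return f"{1 - year} BC"
    return f"AD {year}"


for year in (-1, 0, 1, 2017):
    print(year, astronomical_label(year))
```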

Can Quickstatements remove items?

is it possible? MechQuester (talk) 16:24, 18 June 2017 (UTC)

Yes - prefix the line you want to remove with "-" like so.
 Q4115189	P39	Q16707842  [6]
 - Q4115189	P39	Q16707842  [7]
Andrew Gray (talk) 16:39, 18 June 2017 (UTC)
Thanks @Andrew Gray:/ MechQuester (talk) 17:51, 18 June 2017 (UTC)
ERROR: - is not a Wikidata item (Qxxx). Did you forget to set a wiki to convert from articles to items?

This is what I got. @Andrew Gray:. MechQuester (talk) 20:22, 18 June 2017 (UTC)

@MechQuester: Interesting! Looks like it isn't supported in the original version of QuickStatements but works fine in the new one. Andrew Gray (talk) 20:41, 18 June 2017 (UTC)
Figured it out. Thanks. MechQuester (talk) 21:09, 18 June 2017 (UTC)
On the page https://tools.wmflabs.org/quickstatements/# Import commands / Version 1 format there are the following instructions now:
  • Paste a tab/newline-delimited command sequence from the original QuickStatements here.
  • You can pass such commands as URL parameters by appending "#v1=COMMANDS" to the URL. For convenience, you can replace tabs by "|" and newlines by "||".
  • Note: MERGE command does NOT work yet.
  • You can remove specific statements by prefixing a line with "-"
  • Quantities with error can be entered as 1.2~0.3 (for 1.2±0.3)
--Jarekt (talk) 03:13, 19 June 2017 (UTC)
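
The URL form described in the instructions above (tabs replaced by "|", newlines by "||", appended after "#v1=") can be built with a few lines of Python. This is only a sketch under those stated rules. The helper names are made up, and the removal helper assumes the "-" goes directly in front of the item ID, which is how the new version is described here.

```python
from urllib.parse import quote

BASE = "https://tools.wmflabs.org/quickstatements/"


def removal(command):
    """Prefix a tab-separated command with "-" to request removal of that statement."""
    return "-" + command


def v1_url(commands):
    """Join tab-separated commands into a "#v1=" URL, using | for tabs and || for newlines."""
    flat = "||".join(line.replace("\t", "|") for line in commands)
    # Percent-encode anything unusual, but keep the | separators readable.
    return f"{BASE}#v1={quote(flat, safe='|')}"


add = "Q4115189\tP39\tQ16707842"
print(v1_url([add, removal(add)]))
```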

Indigenous to

The type constraint for indigenous to (P2341) specifies that it should be applied to items that are in subclasses of folk culture (Q4384751), languoid (Q17376908), or dish (Q746549). I'd like to add clothing (Q11460) to this list for the many clothing items that are (or originated as) traditional dress of various places. Any objections? - PKM (talk) 18:14, 18 June 2017 (UTC)
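
To illustrate what adding clothing (Q11460) to the list would change, here is a rough sketch of a type check that walks subclass of (P279) links upward from an item's classes. The tiny graph and the class ID "Q_TRADITIONAL_GARMENT" are invented for the example; the real constraint check is done by the constraint tooling, not by this code.

```python
def in_allowed_class(item_classes, allowed, subclass_of):
    """Return True if any class of the item is, or is a subclass of, an allowed class."""
    seen = set()
    stack = list(item_classes)
    while stack:
        cls = stack.pop()
        if cls in allowed:
            return True
        if cls in seen:
            continue  # guard against cycles in the subclass graph
        seen.add(cls)
        stack.extend(subclass_of.get(cls, []))
    return False


# Hypothetical subclass graph: a traditional garment class under clothing (Q11460).
SUBCLASS_OF = {"Q_TRADITIONAL_GARMENT": ["Q11460"]}

current = {"Q4384751", "Q17376908", "Q746549"}  # folk culture, languoid, dish
proposed = current | {"Q11460"}                 # plus clothing

print(in_allowed_class(["Q_TRADITIONAL_GARMENT"], current, SUBCLASS_OF))
print(in_allowed_class(["Q_TRADITIONAL_GARMENT"], proposed, SUBCLASS_OF))
```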

mass (P2067) and molar mass

I see that mass (P2067) has alias "molar mass", but in fact molar mass (Q145623) is not the same as mass, it even has different units (g/mol or kg/mol). Should there be a separate property for it? Laboramus (talk) 20:55, 18 June 2017 (UTC)

Wikidata:Property proposal/Archive/36#atomic mass Matěj Suchánek (talk) 21:04, 18 June 2017 (UTC)
AFAIR you just need to replace unit molar mass (Q145623) by atomic mass unit (Q483261) and these claims should be fine again. —MisterSynergy (talk) 21:57, 18 June 2017 (UTC)
So, just to be sure, we use value in daltons as "molar mass" and use it in mass (P2067)? I guess since they are defined as numerically equal, I can live with that, though it feels a bit iffy. But ok, I'll assign dalton-based masses to the appropriate items then, thanks. Laboramus (talk) 01:06, 19 June 2017 (UTC)
It is no longer the molar mass, it is an atomic or molecular mass (depends on subject item). As you said, they are numerically equal due to the definitions of a mole and the atomic mass unit. Maybe we should discuss the alias molar mass for mass (P2067), but I don’t see that much of a problem with it. —MisterSynergy (talk) 04:54, 19 June 2017 (UTC)
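
The numerical equality mentioned above can be checked with a short calculation: a mass in daltons, converted to kilograms and multiplied by the Avogadro constant, gives the molar mass in g/mol with practically the same number. The constants below are the CODATA 2018 values as I remember them, so treat them as an assumption, and 18.015 is just a sample mass roughly that of a water molecule.

```python
AVOGADRO = 6.02214076e23       # particles per mole
DALTON_KG = 1.66053906660e-27  # mass of one dalton in kilograms (assumed CODATA 2018 value)


def molar_mass_g_per_mol(mass_da):
    """Convert a particle mass in daltons to a molar mass in grams per mole."""
    kg_per_mol = mass_da * DALTON_KG * AVOGADRO
    return kg_per_mol * 1000.0


print(molar_mass_g_per_mol(1.0))
print(molar_mass_g_per_mol(18.015))
```

With these constants the two numbers agree to better than one part in a billion, which is why a value entered in atomic mass units reads the same as the old g/mol figure.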

Constraint checks updated, please test!

Hi everyone! We recently made a lot of updates to the constraints extension, to hopefully make the messages much easier to understand (“This property must only be used on items that are in the relation to the item (or a subclass of the item) defined in the parameters” – what? :D ), and also to improve the look of the report that this user script displays on item pages.

Since we’d like to turn that user script into a gadget soon, we’d appreciate it a lot if some more people tested the current state and gave us feedback on it :)

To use the script, add the following line to your user/common.js:

mw.loader.load( '//www.wikidata.org/w/index.php?title=User:Jonas_Kress_(WMDE)/check_constraints.js&action=raw&ctype=text/javascript' );

There are a few known issues that also came up when the user script was first announced:

  • We currently only check the main statement, not qualifiers or references. One consequence of this is that the “value only” constraint is not implemented, since in the cases we check it would always be satisfied.
  • We currently don’t distinguish between different constraint states (e. g. “regular” and “mandatory”). This is tracked in T164254.
  • Constraints are only checked when the page is loaded. Edited or new statements aren’t immediately checked, and editing a statement without changing the main value will probably still make the constraint report icon disappear.

If you notice any other issues, or think some of the messages could be improved further, then please let us know (e. g. by replying here)!

Thanks for your help :) --Lucas Werkmeister (WMDE) (talk) 09:52, 12 June 2017 (UTC)

@Lucas Werkmeister (WMDE): please stop asking people to edit their common.js, no seriously, you shouldn't do that. Make a gadget. Me or someone else can help you with that. Multichill (talk) 11:14, 12 June 2017 (UTC)
Why shouldn't he do so? Did I violate something when I introduced a new one three days ago? Matěj Suchánek (talk) 11:20, 12 June 2017 (UTC)
We chose to provide this as a user script first, because we wanted to have a collaborative process, and we wanted the community to try it and provide feedback before enabling it as a gadget. That's what happened (thanks again to people who provided ideas and helped us fix bugs). After this round of tests, we will indeed suggest creating a gadget, so more people can use it :) Lea Lacroix (WMDE) (talk) 13:30, 12 June 2017 (UTC)
We already requested the gadget, too (here and here) – we just wanted to get another round of feedback on the script before it’s exposed to a lot more users as a gadget. --Lucas Werkmeister (WMDE) (talk) 14:53, 12 June 2017 (UTC)
Much better than daily updated reports!
Also, this saves time walking from item page to report page.
The only thing I miss is the ability to support complex constraints, and a custom help page for the more complex things in the Wikidata model. d1g (talk) 16:11, 12 June 2017 (UTC)
Hm, complex constraints are a bit different from other constraints because it’s a single SPARQL query that returns all violating items at once, not something that you check on a single statement of a single item… I’m not sure if there’s a good way to apply the same idea to per-statement constraints. How much do people use complex constraints?
Also, once we move to constraint statements, the length limit on the string datatype (500, I believe) will probably limit the usefulness of this constraint type – how long are typical queries for that constraint?
Do you think a custom help page would also be useful for other constraint types, or is that just for complex constraints? --Lucas Werkmeister (WMDE) (talk) 17:17, 12 June 2017 (UTC)
We can write the checks so that ?item can be replaced with a particular instance, with few possible issues.
The biggest use of it is to warn about a suboptimal structure of an item with respect to related items.
At least the P31 check Constraint:Target required claim|property=P279 needs a separate help page.
About 6-10 graph patterns can be included in a single 500-character sequence; I think this is enough for an average check.
We can fit primitive checks into multiple 500-character claims. d1g (talk) 19:09, 12 June 2017 (UTC)
@Lucas Werkmeister (WMDE):
  • incorrect suggestion at Q223557#P366
  • I would prefer to have "autofix" feature for "inverse of is missing" suggestions
d1g (talk) 16:53, 12 June 2017 (UTC)
Thanks for your feedback! --Lucas Werkmeister (WMDE) (talk) 17:11, 12 June 2017 (UTC)
@Lucas Werkmeister (WMDE): Hi! This is a very nice tool. One small bug: the constraint type is not translated. For example, with the user interface in French, in Rennes Métropole (Q12208) for the property headquarters location (P159), it displays "Conflicts with" in English in place of a French translation. I didn't find it on Translatewiki. — Envlh (talk) 09:39, 13 June 2017 (UTC)
@Envlh: yeah, the constraint type is just a plain string at the moment (directly from the title of e. g. {{Constraint:Conflicts with}}) – but soon we’ll import constraint statements, and then we can use the localized label of the constraint item, e. g. conflicts-with constraint (Q21502838). --Lucas Werkmeister (WMDE) (talk) 10:21, 13 June 2017 (UTC)
I get the impression that an edit to constraints that I made on May 27 hasn't yet been integrated into the constraint table. Is this expected behavior? If so, I would advocate running the script that updates the constraint table on a weekly basis. ChristianKl (talk) 17:38, 17 June 2017 (UTC)
Yes, it’s been a few weeks since we last updated the constraints table. We plan to migrate to constraint statements soon (see this phabricator comment for implementation progress), which will mean instantaneous constraint updates. --Lucas Werkmeister (WMDE) (talk) 13:26, 19 June 2017 (UTC)

Notability of template subpages

There are quite a lot of template subpages which cannot clearly be classified as "/doc" or something else that is mentioned in WD:Notability, and which have two sitelinks, so they are notable according to the criteria. However, I don't think that pages like Q21845026 or Q23758062 should have an item. Furthermore, /XML, /Meta and /preload subpages are also not covered by the current criteria. I would suggest changing the criteria so that subpages of templates are not notable in general. By subpage I mean pages which are not themselves templates used in the article main space. This of course does not include pages like Q13413893, where in German the template is formally a subpage ("Vorlage:Navigationsleiste Flughäfen nach Staat/Europa"), but is actually a normal template. Steak (talk) 07:15, 14 June 2017 (UTC)

So you're asking that subpages of /doc subpages, like /doc themselves, are not notable? --Liuxinyu970226 (talk) 08:19, 14 June 2017 (UTC)
In this case the item provides interwiki links. The policy seems to be written with the assumption that the use-case of interwiki links warrants the item. Having the item has little cost. Can you argue why you think the interwiki links are useless? ChristianKl (talk) 10:10, 14 June 2017 (UTC)
Fully agree with Steak on this matter. 1) They are not templates; 2) They are not individually notable, see Wikidata:Notability; 3) They are components of their parents, and cannot sensibly exist in isolation. The addition of subpages should not be automatic; it should depend on the wiki's thoughts and the usage of the page.  — billinghurst sDrewth 10:36, 14 June 2017 (UTC)
I don't like template subpages on Wikidata but that's my personal bias. There was a RfC a long time ago. Matěj Suchánek (talk) 15:33, 18 June 2017 (UTC)
I agree with Steak and billinghurst, if wikipedia wants them, they can add them in wikitext... Q.Zanden questions? 20:45, 18 June 2017 (UTC)
I am going to slightly change the notability wording to make clear that XML and other types of subpages are also not notable. Steak (talk) 06:56, 19 June 2017 (UTC)

WikidataCon: registration open and new deadline for scholarships

Hello all,

I'm glad to announce that registration for the WikidataCon is now open! You can access all the information and fill in the form.

Important: because of the time people need to get visas (about 3 months), we changed the deadline for the scholarship applications. You can apply for a scholarship before July 16th. We will then make sure that the applicants receive a response on July 25th.

Reminder: you can still propose a project for the program until July 31st!

If you have any questions or problems, or see that some information is missing, feel free to contact me.

Cheers, Lea Lacroix (WMDE) (talk) 11:53, 19 June 2017 (UTC)

It seems to me like forcing people to add data about emergency contacts isn't in the spirit of the WMF position around privacy. The same goes for the birth date. I can understand why you might need to know the year of birth but I don't see how the exact birth date is within the need to know. ChristianKl (talk) 12:58, 19 June 2017 (UTC)
Thanks for noticing. These fields are now optional. Lea Lacroix (WMDE) (talk) 13:09, 19 June 2017 (UTC)

Wikidata weekly summary #265

Mass adding a property to items

(comes from Wikidata:Café#Herramienta_para_a.C3.B1adir_a_varios_elementos_una_misma_propiedad where I couldn't get a positive reply). I've been reviewing some bios from eswiki whose subjects were awarded the Great Cross of the Royal Order of Sports Merit (Q30278709), but that data is not yet here in Wikidata (see es:Categoría:Grandes cruces de la Real Orden del Mérito Deportivo for the current list). Is there any tool to mass add this property to the bios in that category? And if yes, can someone teach me how to use it? Best regards, —MarcoAurelio (talk) 11:17, 24 June 2017 (UTC)

Yes there is! There are two approaches.
  • If you can generate the list of items from a category (eg "Every entry in Categoría:Grandes cruces de la Real Orden del Mérito Deportivo should have P166:Q30278709") then you can use PetScan. After logging in through WIDAR (link at the top of the results), enter the WP language and category, then on "Other sources" select "Use wiki > Wikidata", and you will get a list of all the wikidata items representing pages in the category. Be careful with category depth - if you set this too high it can get very unexpected results. It's a good idea to quickly skim down the list and check that the items all look right and they're not, eg, lists of winners, or articles on the award. If there are any, uncheck them.
Then you can fill in the "process commands" box, which is a very simple syntax - P166:Q30278709 means "add this property:item pair to every item". Then hit "Process commands", and it will do them all for you. Any which already have this property:value pair will be skipped.
  • If you have a manually developed list of items that doesn't match a category, you can use "Other sources: Manual list" in PetScan, or else you can use QuickStatements (v1, v2). This takes a tab-separated list of item-property-value entries which you can generate using a spreadsheet. QS is a bit more complicated to use but a) it's harder to accidentally do items you didn't intend, and b) you can add qualifiers and sources, which you can't do with PetScan. Andrew Gray (talk) 11:28, 24 June 2017 (UTC)
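The tab-separated item-property-value list mentioned above does not have to come from a spreadsheet. As a minimal sketch, assuming one item ID per input entry and the same property and value for every item, it could also be generated like this (the function name and the item IDs in the usage line are placeholders, not part of any tool):

```python
def quickstatements_rows(item_ids, prop, value):
    """Build tab-separated item-property-value lines, one line per item."""
    rows = []
    for qid in item_ids:
        qid = qid.strip()
        if not qid:
            # Skip blank entries, e.g. empty lines in a pasted list.
            continue
        rows.append("\t".join([qid, prop, value]))
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder item IDs; replace them with the reviewed list of bios.
    print(quickstatements_rows(["Q4115189", "Q13406268"], "P166", "Q30278709"))
```

The output can be pasted into the tool after checking the list by hand, which keeps the "harder to accidentally do items you didn't intend" advantage.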
Dear @Andrew Gray, thanks for your detailed reply. I've built http://petscan.wmflabs.org/?psid=1131753 but after generating that list (I did the query myself to learn and checked with the links you already provided to me to see if I did it okay) I cannot find the "process commands" box in PetScan. Is it from a different tool? Regards, —MarcoAurelio (talk) 14:21, 24 June 2017 (UTC)
Never mind, it was a problem with cookies. I now see the box indeed. Thanks. —MarcoAurelio (talk) 15:18, 24 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 07:58, 25 June 2017 (UTC)

Wikidata Zurich

HackZurich is one of the largest hackathons in Europe and it will take place in Zurich, during the Digital Festival 2017 in September 2017 (15.09 - 17.09).

HackZurich is a great opportunity to make open data fans and hackers in general aware of Wikidata, and also to hack for Wikidata. We invite Wikidata volunteers to apply for HackZurich and code for the city.

The application process of HackZurich will close soon! See: https://twitter.com/hackzurich/status/874493802710528000 So, be quick and send your application as soon as possible.

One day before HackZurich, we will run a Wikidata workshop organised by and at the University of Zurich. You can participate in our workshop independently of whether you join HackZurich or not. We will have talks, small hands-on sessions to learn how to code with Wikidata and a brainstorming session to discuss what we could hack during HackZurich. You can register for our workshop here: Wikidata Zurich Workshop

If you are a Wikidata volunteer and you would be willing to help running a mini hands-on session during the workshop (14.09.2017), please contact us, we would be happy if you join us!

All the information about the event can be found here at the Wikidata Zurich Wiki Page.

Cristina Sarasua (talk) and Leon Kastler (talk) 13:47, 5 June 2017 (UTC)

Unitless claims

I would like to start a discussion - or rather, renew, as I think I talked about it here before - about claims having no units for properties that require units. We have quite a number of such claims, and something like "height" or "area" without a unit is mostly useless - granted, in some cases it can be derived from context, but that requires human intervention, and the whole point of Wikidata is so that it won't be the case.

I am mainly talking about such properties as (the number is how many unitless claims there are):

I think such claims should be either mass-deleted or assigned some kind of default unit (e.g. kilogram (Q11570) for mass (P2067)), but I would like to hear opinions on this. Maybe there's a better way of handling it. It's not hard to change them (I actually have the code that can find them, and deleting or changing them would be very easy) but I'd like to see what the consensus is. I think we should do something, because having useless data in the database is bad, and something recorded as "mass = 42" is useless: it cannot be used for any work where it is important to know the mass.
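To illustrate what "finding" such claims could look like (this is not the code referred to above, only a sketch), the check below assumes the claim is given in the Wikibase JSON form, where a quantity without a unit carries the unit string "1". The function name and the sample claim are made up for the example:

```python
def is_unitless(claim):
    """Return True if the claim's main value is a quantity without a unit."""
    snak = claim.get("mainsnak", {})
    if snak.get("snaktype") != "value":
        # "somevalue" / "novalue" snaks carry no quantity at all.
        return False
    datavalue = snak.get("datavalue", {})
    if datavalue.get("type") != "quantity":
        return False
    # In Wikibase JSON, a quantity without a unit has unit "1".
    return datavalue.get("value", {}).get("unit") == "1"


if __name__ == "__main__":
    # Hypothetical claim: "mass = 42" with no unit.
    sample = {
        "mainsnak": {
            "snaktype": "value",
            "property": "P2067",
            "datavalue": {"type": "quantity", "value": {"amount": "+42", "unit": "1"}},
        }
    }
    print(is_unitless(sample))
```

A check like this only flags the claims; whether to delete them or assign a unit is the question under discussion.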

I am not sure how the process of consensus gathering should work in this case - should I make an RFC, or discussion here is enough? Laboramus (talk) 05:10, 11 June 2017 (UTC)

It is better to have data than not having data. Just deleting for an opinion is imho the last thing we should do. With data we can indicate the ones as problematic and seek help. With no data there is not even a chance of that. Thanks, GerardM (talk) 07:54, 11 June 2017 (UTC)
To be more frank than I usually am: "It is better to have data than not having data." is bullshit! We could add "42" as a value to every question here, but it does not make any sense at all. It would be a mockery of Wikidata. I just removed this before I saw this new thread. The claim had no source, so I could not fix it. I sometimes remove such fully useless claims when they have no source. -- Innocent bystander (talk) 08:56, 11 June 2017 (UTC)
I think the « assume good faith » approach implies that if it’s not obvious vandalism, there is at least something true in it. Finding the pattern and then mass-correcting is a possible approach. You should not care too much about the laughter. We all know why we are here, don’t we? Better to laugh with them; a little bit of self-mockery is often a good thing.
First they ignore you, then they laugh at you, then they fight you, then you win.
Mahatma Gandhi (Q1001)     author  TomT0m / talk page 09:26, 11 June 2017 (UTC)
@TomT0m: When I remove a claim like this, that makes it easier to find this item as lacking this property. "good faith" works in both directions, you know! -- Innocent bystander (talk) 10:37, 11 June 2017 (UTC)
Well, I guess you should avoid calling other people's opinions "bullshit", then :) Unitless claims appear in the constraint reports, so they are also very visible. author  TomT0m / talk page 14:04, 11 June 2017 (UTC)
@Innocent bystander: Such strong opinions need the backing of strong arguments. You fail to provide them. Please explain why you think you understand the issue, what the ramifications are of both our opinions. So far you insult but fail to convince. Thanks, GerardM (talk) 15:45, 11 June 2017 (UTC)
Constraint reports give us limited support in many cases. The report for "area" is in my watchlist, and I have seen very little progress there. And the attitude "the more data the merrier" has not made Wikidata more useful, it has only improved the numbers in the statistics-reports. That attitude has also given us Wikipedia as the main source for our data. If Wikidata should be useful at all, for anybody except for ourselves, we have to do much better than that. The class I am working with at the moment is filled with bad coordinates imported from nlwp. I have asked bot-owners to replace those claims with better sourced data, but it is not until I start to remove such bad claims that they are replaced with anything at all. -- Innocent bystander (talk) 19:00, 11 June 2017 (UTC)
I disagree that it is better to have bad data than none. Bad data is not useful in any application, but it makes it look as there is data, instead of showing the real situation - that we have no data. If for specific cases we can add units, I'm all for it, but if it's not possible, I'd rather have it explicit that we don't have good data than having meaningless numbers which can not be used for anything. -- Laboramus (talk) 20:44, 11 June 2017 (UTC)

@Innocent bystander: OK, but in your example, Seav just forgot the unit and the source. However, it's obvious that it is km². Instead of deleting it, you can add the unit. There is no source but it's not difficult to go to Wikipedia. And yes, "It is better to have data than not having data.". Tubezlob (🙋) 09:40, 11 June 2017 (UTC)

@Tubezlob: What's so obvious about it? Statistics Sweden often uses hectare (Q35852), which therefore is the area unit I use the most here. -- Innocent bystander (talk) 10:37, 11 June 2017 (UTC)
@Innocent bystander: OK maybe it's not so obvious, but I think that it's a bad idea to delete statements like this. It might be better to fix them instead of deleting them. Tubezlob (🙋) 06:28, 12 June 2017 (UTC)
@Tubezlob: I definitely agree that a fix is better than to delete. But how should we fix when there is no source provided? And as you see below, looking up who the user was who added the data is not always simple. Do you yourself know what you used as source in a QuickStatement-run two years ago? I can tell that I don't! -- Innocent bystander (talk) 07:58, 12 June 2017 (UTC)
@Tubezlob:, I was not the one who added the area. It was Exec8 who added the data. —seav (talk) 19:21, 11 June 2017 (UTC)
@Seav: OK sorry. Tubezlob (🙋) 06:28, 12 June 2017 (UTC)
In one specific case, it may be easy (sometimes it's not). In 4000 cases, it's not possible to go through all of them and manually reconsider each one, and nobody is going to do it (unless somebody volunteers to do it? please feel free to :). So I am looking for better solution. -- Laboramus (talk) 20:44, 11 June 2017 (UTC)
In case of height (P2048) and width (P2049) more than 3500 of each are repairable. However, data is in fact partially wrong, so besides an addition of the unit, the values need to be fixed as well. Takes some time, but still: doable. I am working to fix it in the next days. —MisterSynergy (talk) 20:50, 11 June 2017 (UTC)

Here you have another very poor example. "area=1,74±0,01". I changed it now to 331 hectares as of 2015-12-31. This example had a poor source (enwp) but was still not correct. -- Innocent bystander (talk) 10:51, 11 June 2017 (UTC)

Unitless data should simply be removed. If Height = 120, this could be meter, feet, cm or whatever obscure unit. Having such statements is even dangerous, because some user may change (in good faith) the unit to meter, when it's actually feet. Then there is a real problem, because the data looks then good, but is in fact bullshit. Steak (talk) 07:21, 20 June 2017 (UTC)

Some structured data is better than none

First a few assertions:

  1. There is no data store without problems. This includes Wikipedia and Wikidata.
  2. The data we hold is best understood by applying set theory. The data in Wikidata consists of many subsets; probably the most valuable subset for the WMF are the interwiki links.
  3. The error rate in each subset can be assessed and is by definition different from the overall Wikidata error rate
  4. The absence of data often indicates a bias in what Wikidata holds. A good example is the lack of data relevant to the global south.
  5. Given the huge influx of data from Wikipedia, the biggest imports have been from English Wikipedia and it is one reason for the existing biases in Wikidata.
  6. An absence of data prevents the application of tools. Tools may suggest writing a Wikipedia article, tools may compare data with other sources.
  7. Concentrating on the differences between Wikidata and any other data source is the most optimal way of improving the quality of existing data.
  8. Having an application for the data in Wikidata is the best way for improving the usefulness for a subset of data.
  9. Each contributor to Wikidata works on the data set(s) of his/her own choice, these data sets interact in the whole of Wikidata. This may raise issues and this can not always be avoided.
  10. Examples of problematic data must be seen in the light of the total of the dataset they are part of. Statistically they may be irrelevant.
  11. Never mind how "bad" an external datasource is, when they are willing to cooperate on the identification and curation of mutual differences, they are worthy of collaboration
  12. Wikidata improves continually and as such it is purrfect but it will never be perfect.

consequences

When people assert that all data in Wikidata should have a source or must be complete, they have a point. Once this point is made, it is then followed up with a proposed action.

  • This typical approach is often seen as problematic by others because it violates the assertions above. When problematic data is inside or outside of Wikidata and we have a healthy collaboration going with an external party it follows that after curation our data is improved or their data is improved.
  • When sources are lacking both inside and outside Wikidata, it is not really problematic when the data is the same. Sources are however of particular importance when there is a need for reconciliation and curation

One of the Wikimedia values is "be bold".

  • Much of our data is incomplete. When a constraint insists on a combination, the constraint is a warning and not really a constraint. A genuine constraint is "must have" but it can be taken in two ways. For instance: an award must be given to a person or an organisation. When this instance is lacking, it must not be a fictional person or fictional organisation. The second approach removes many constraints but is imho more relevant
  • There are several conventions where a bot could help out. When multiple initials exist in a label for a person, is there a space between them or not.  – The preceding unsigned comment was added by GerardM (talk • contribs).

discussion

If there's a healthy collaboration with the third party, then data gets improved. On the other hand, simply importing all GeoNames data doesn't provide for such a healthy collaboration, and the same goes for other ways to import a lot of low quality data. ChristianKl (talk) 21:28, 13 June 2017 (UTC)
So you insist on maintaining the current bias. You insist on preventing tools built to support our movement from working because of a lack of data. Why do you think we can not collaborate? How do you define "low quality data" when our own record in this sub set is abysmal? Thanks, GerardM (talk) 00:24, 14 June 2017 (UTC)
Basically GeoNames is a dataset that combines everything on which they could get their hands. It's optimized for quantity not quality. It also doesn't have the manpower to collaborate to remove errors. If we add to our cache of geo data a lot of low quality data it becomes less useful for actual tools.
The way to solve our problem of having no good data on Ethiopia isn't to seek to import a lot of low quality data about the country, but to be nice to Amharic speakers and give them the freedom to contribute data in the way they want. If they have some use-case for an existing data set it can be useful to help them to import that data set, but I don't think the way to get rid of Western bias is for Western people to create a lot of items about non-Western countries. ChristianKl (talk) 11:37, 14 June 2017 (UTC)
Much of the GeoNames data should be in Wikidata already, warts, and all because of its inclusion in Wikipedias. As we link to GeoNames already, we do include its source ID, we can compare the data that we share, collaborate on the curation of differences. When you insist on not having data about countries like Somalia, Ethiopia, Eritrea et al, you insist on maintaining our existing bias. No data is 100% incorrect, we can not do worse. Our tools cannot operate on no data, we cannot suggest to people to write an article. No data is the epitome of failure in what we aim to achieve; share in the sum of all knowledge. Thanks, GerardM (talk) 12:30, 14 June 2017 (UTC)
Last thing I want is deal with awfully fake Geonames data; especially if it is freemium licensed part of GeoNames.
Yes, it is the best example how "no data" is the only viable option. d1g (talk) 14:52, 14 June 2017 (UTC)
You provide the exact arguments that make your opinion irrelevant. First you only want to work only on sourced information and second you make an absolute statement based on emotion not on facts. I have asserted that when cooperating with external Sources, sources are not crucial but differences are. I have also asserted that everywhere you will find vandalism/errors (choose your label), it is however the percentage that counts. I have also asserted that when we collaborate with another community, even this is largely irrelevant. So you do not want to work on this? ... Fine. Thanks, GerardM (talk) 07:59, 15 June 2017 (UTC)
Fake objects and non-existing objects in real life from GeoNames is a known fact. d1g (talk) 15:55, 15 June 2017 (UTC)
I honestly don't think that it is possible to evaluate every dataset using same set of rules; especially geodata.
Best part of the Wikidata is that I can get slice of dataset where all data has great or acceptable sources; then I don't care how bad rest of Wikidata is. d1g (talk) 02:16, 14 June 2017 (UTC)
It is positive that you think but it is sad that you do not share your arguments. Wikidata includes an enormous amount of geodata that is of the highest quality that you cannot use. Most of its labels are in Chinese. Is that what you mean? Does it help us when we share data, compare, evaluate, curate and improve at both ends by concentrating on differences? ABSOLUTELY. Is there any proof that throws my assertion in GEODATA.. Really and how?
When you only care about the data with "great and acceptable sources" fine, go and do your thing. Especially for you I have added an additional assertion. Thanks, GerardM (talk) 07:47, 14 June 2017 (UTC)
Well, this postulates that the data we have is well structured. But in many cases it isn't. Thousands of articles have now been deleted on svwiki since the quality of GeoNames proved to be too unreliable. No data there proved better than the unstructured mess of GeoNames. The whole basis for some of our structures is shipwrecked in some cases. On a daily basis I have to deal with the postulate that administrative divisions follow a hierarchy from the top to the bottom. In a subset of the present world, that is maybe true. But we cannot model reality based on a small simple subset of reality. -- Innocent bystander (talk) 08:07, 14 June 2017 (UTC)
The experiences with importing data directly to Wikipedia is not part of this discussion. We would be talking cooperation between Wikidata and Geodata.
I am on record that data to be used to import data to a Wikipedia is best imported in Wikidata first and the texts for articles are then generated and cached. This has as a benefit that all the curation will affect the data presented to readers. Only when an editor decides to change the cached text, it becomes an article (and it is out of our hands).
Administrative divisions do follow a hierarchy from the top down. I have spent some time on such hierarchies in the Ottoman empire. They change. So far we have incomplete datasets in Wikidata because of incorrect assumptions inherited from one or more Wikipedias. All the more reason to consider these and do better as a result. Thanks, GerardM (talk) 08:40, 14 June 2017 (UTC)
@GerardM: Are you telling me that "Administrative divisions do follow a hierarchy" because they did so in the Ottoman Empire? Maybe so, but the system should also work outside of the Ottoman Empire, and in all times. Some areas close to the Swedish/Norwegian border were co-administered by the two nations for some decades. The hierarchy here is therefore not simple even at the national level. And if we go even further back in history, we find even more complex relations between Sweden, Norway and Novgorod. -- Innocent bystander (talk) 10:23, 20 June 2017 (UTC)

Linna Muuseum URLs

We have a large number of items with described at URL (P973) having values starting with http://www.linnamuuseum.ee/eestipiltnik/?action=view&id= (for example, Eduard Ilves (Q27864972) has the value http://www.linnamuuseum.ee/eestipiltnik/?action=view&id=635).

These links are all returning a 404 error.

Do the IDs persist in new URLs, and if so, how are they structured?

If not, The Wayback Machine seems to have archived copies.

@WikedKentaur: who added at least some of them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:24, 18 June 2017 (UTC)

Mass renaming is needed from http://www.linnamuuseum.ee/eestipiltnik/?action=view&id=635 to http://vanaveeb.linnamuuseum.ee/eestipiltnik/?action=view&id=635 etc. --WikedKentaur (talk) 14:04, 19 June 2017 (UTC)
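As a sketch of that renaming (not an existing bot), the host can be swapped while the path and the id parameter are left untouched. The function name is hypothetical, and the check on the /eestipiltnik/ path is an assumption meant to avoid touching other pages of the same site:

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "www.linnamuuseum.ee"
NEW_HOST = "vanaveeb.linnamuuseum.ee"


def migrate_url(url):
    """Move an eestipiltnik URL to the new host; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc != OLD_HOST or not parts.path.startswith("/eestipiltnik/"):
        return url
    # Only the host changes; scheme, path and query (action, id) are kept.
    return urlunsplit(parts._replace(netloc=NEW_HOST))


if __name__ == "__main__":
    print(migrate_url("http://www.linnamuuseum.ee/eestipiltnik/?action=view&id=635"))
```

Writing the new values back to the items would still need a bot or a batch tool; this only covers the string transformation.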
welcome to the maintenance of rotting urls. how many folks calling for references will pitch in? do we need an automated internet archive bot as a stop gap? Slowking4 (talk) 15:23, 19 June 2017 (UTC)
Thank you. The question now is: do we have a bot update the URLs, or create a property and migrate all the ID values? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:26, 19 June 2017 (UTC)
I vote for a new property - we should be able to map the property values from the "bad" url and then delete the URLs. But in general - an automatic archive bot isn't a bad idea either. - PKM (talk) 22:17, 19 June 2017 (UTC)
building on the work done on english https://blog.archive.org/2016/10/26/more-than-1-million-formerly-broken-links-in-english-wikipedia-updated-to-archived-versions-from-the-wayback-machine/ maybe we should have a word. -- Slowking4 (talk) 17:25, 20 June 2017 (UTC)

Property proposal - template transclusion failure?

I created my first property proposal here and attempted to transclude it into Wikidata:Property_proposal/Creative_work with this edit. However, the property proposal page is still showing the following message near the top of the page:

You have not transcluded your proposal on Wikidata:Property proposal/Creative work yet. Please do it.

How can I fix this? Zazpot (talk) 10:17, 24 June 2017 (UTC)

In such cases, it is always worth trying a null edit to the page. (Press the edit button, and save again without changing anything.) Whatever caused the problem, it looks like it is solved now. -- Innocent bystander (talk) 10:21, 24 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 19:09, 26 June 2017 (UTC)

Documenting old names of TV stations

Las Estrellas (Q80478) was named Canal de las Estrellas until 22 August 2016, when it changed its name to Las Estrellas. In the infobox of TV shows that started before that date the name of the station should be displayed as Canal de las Estrellas, and for the ones after that date it should be displayed as Las Estrellas. When loading the data from Wikidata we use the label, but that way it's impossible to know when each of the names should be displayed. How should the name change be added as Wikidata properties to allow us to automatically pick the correct one? I've been thinking of using official name (P1448), but I don't know if there is any property that would be more appropriate for the current use case. Thanks. -- Agabi10 (talk) 11:39, 27 June 2017 (UTC)

You have the correct property, be sure to use date qualifiers and you are all set! :D Sjoerd de Bruin (talk) 12:19, 27 June 2017 (UTC)
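A rough sketch of how an infobox could pick the right name once the official name (P1448) statements carry date qualifiers, assuming start time (P580) and end time (P582) are the qualifiers used. The data structure, the function name and the change date in the example are assumptions for illustration, not data read from Wikidata:

```python
from datetime import date


def name_on(names, when):
    """Return the name valid on the given date, or None if none applies.

    Each entry is (name, start, end); None means "no qualifier on that side".
    The start date is inclusive and the end date exclusive.
    """
    for name, start, end in names:
        if (start is None or start <= when) and (end is None or when < end):
            return name
    return None


# Assumed change date; replace it with the sourced value from the item.
CHANGE = date(2016, 8, 22)
NAMES = [
    ("Canal de las Estrellas", None, CHANGE),
    ("Las Estrellas", CHANGE, None),
]

if __name__ == "__main__":
    print(name_on(NAMES, date(2010, 1, 1)))
```

The show's own start date would be passed as the second argument, so older shows get the older station name.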
This section was archived on a request by: Agabi10 (talk) 14:42, 27 June 2017 (UTC)

GeoNames and, the first to throw a stone...

I exchanged e-mails with Marc Wick, the founder of GeoNames. It turns out that he is looking into connecting GeoNames to Wikidata. He finds problems that need to be resolved because of errors on our end. Errors are to be expected.

I asked permission and this is where you can find the e-mails so far. As I have asserted before, when there is cooperation any complementary data is welcome because it improves a 100% error rate to something that is more manageable. The whole history with Wikipedias importing data from GeoNames is imho our own problem. When we welcome the data they want to import and provide the platform to generate texts, texts that are cached and not saved, our efforts to curate data will have the biggest effect.

What is quite clear to me is that only the first to throw a stone at GeoNames should be the one that makes no mistakes. As we all make mistakes collaborating is the only sane way forward. Thanks, GerardM (talk) 08:31, 17 June 2017 (UTC)

Since I have already tried to work with this dataset, I would like to start by pinpointing some weak points. The links to Wikipedia in GeoNames can, a little too often, not be trusted. Unfortunately, some years ago we started to add GeoNames ID (P1566) here based on those WP-links. A little too often, the only thing the WP-articles and the GeoNames-items have in common is the label. A second problem is that the coordinates are rounded to minutes. That is not very useful when it is used to pinpoint a specific summit (Q207326) among others on a mountain. It also makes it difficult to find where a specific reef is IRL. A large problem is how the dataset in GeoNames has been collected. It looks like any name on a map has been harvested. But what those names describe has not always been identified in a good way. Beyne-Heusay (Q681039) is a commune in Belgium. There is no populated area with this name. I live fairly close to this point. I can tell you that there is no populated place with the name "Ytterlännäs". Ytterlännäs is the name of several existing and former administrative units, but there has never existed a populated place with this name. The namesake of these administrative units is Näs, located next to the old church of Ytterlännäs.
The problems on svwiki have not only been the quality of this source. How the project was implemented and what complementary sources were added, together with the algorithms used by the bot, have also affected the project. I do not in any way oppose any cooperation between GeoNames and Wikidata. In fact, I would have preferred if Lsjbot had started here instead of on Wikipedia. That would have included more experienced users in the discussions about how the project should have been implemented. -- Innocent bystander (talk) 10:25, 17 June 2017 (UTC)
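As a rough worked example of why minute-rounded coordinates are too coarse for summits and reefs, the sketch below estimates the worst-case position error. It assumes a spherical Earth where one arc minute of latitude is about 1.852 km, and it assumes rounding to the nearest minute (truncation would double the figures); the function name and the sample latitude of 63°N are illustrative only.

```python
import math

KM_PER_ARCMIN = 1.852  # approx. length of one arc minute of latitude (spherical-Earth assumption)


def max_rounding_error_km(latitude_deg: float) -> float:
    """Worst-case distance between a true point and its minute-rounded position."""
    half_minute = 0.5  # rounding to the nearest minute is off by at most half a minute
    north_south = half_minute * KM_PER_ARCMIN
    # A minute of longitude shrinks with the cosine of the latitude.
    east_west = half_minute * KM_PER_ARCMIN * math.cos(math.radians(latitude_deg))
    return math.hypot(north_south, east_west)


error_at_63n = max_rounding_error_km(63.0)     # roughly 1 km
error_at_equator = max_rounding_error_km(0.0)  # roughly 1.3 km
```

An uncertainty on the order of a kilometre is enough to confuse neighbouring peaks on one mountain.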
Thank you for your thoughtful reply. Yes, there are issues and these issues are found at both Wikidata and GeoNames (see blogpost). What we really need is a way to communicate issues in both directions. We are good at the things we do but we communicate badly. I am not a Wikipedian and I find many issues with awards that are fine in Wikidata but still problematic at the Wikipedias I frequent.
If there is one thing I wish for the strategic process, it is that communication of issues gets better facilitation. Thanks, GerardM (talk) 11:15, 17 June 2017 (UTC)
Yes, the definition of "city" is one of the largest problems we have here. The Swedish article Stad mentions several different meanings of the word in Swedish alone. It could for example mean either an administrative entity or a larger settlement that is identified with one single name. The original meaning is "place".
Another problem worth talking about is the difficulty of identifying what a "University" or "School" is. Both Wikidata and GeoNames have this problem. Universities and Schools are something other than a building. The Lsjbot-articles about Universities became something to laugh about. A University is located in B, it has climate C and is surrounded by D. I worked for some time at Linköping University (Q782600). Telling that it is located at a specific place and describing the nature around it does not make sense. It has activity in several places in Sweden. The same thing could also be said about many things. Is a Municipality an organisation or a geographic feature? I work with a lot of people in the local municipality, and we mainly discuss such things as social care, economy and health care. We never discuss geography. -- Innocent bystander (talk) 15:04, 17 June 2017 (UTC)
In practice I think we should subclass the concept of city more frequently. When in German a city is a human settlement with at least X people and the French concept is one with at least Y people it makes sense to have a concept of "German city" and one of "French city". ChristianKl (talk) 17:31, 17 June 2017 (UTC)
Subclassing based on a local understanding? It should suffice to have the number of inhabitants for places. It is the user who understands the concept of city in a particular way. Having a subclass of French city only means city in France to me. Thanks, GerardM (talk) 18:40, 17 June 2017 (UTC)
Well, it is not as simple as body counting. If a place has a population of 200,000 but still has the infrastructure of a village, it is still today called a village ("by") in Swedish. A place here that does not have a liquor store and a pharmacy can hardly be called a city. -- Innocent bystander (talk) 10:05, 18 June 2017 (UTC)
We do normally do this. I'm not sure why we don't yet for Germany. - Nikki (talk) 09:55, 18 June 2017 (UTC)

While working on the Thai localities, I also noticed several wrong entries in geonames when it comes to the populated place category - luckily I could stop the ceb-wikipedia bot from importing the whole of geonames in Thailand before it was too late. However, it seems most of these were imported from yet another database, which we have as GNS Unique Feature ID (P2326), so the blame for wrong entries goes one step further. Also, the above-mentioned "Ytterlännäs" originates from GNS-UFI -2536857. Sadly, it seems the original ID from GNS isn't found in the geonames data anymore. That said, it is still a good idea to link geonames more closely with wikidata, but don't blindly import from geonames. Ahoerstemeier (talk) 21:03, 17 June 2017 (UTC)

When we do not have data, we have a 100% failure. When we are not to blindly import, what scenario do you propose? Thanks, GerardM (talk) 10:36, 18 June 2017 (UTC)
One common mistake in GeoNames is that they mix up items with the same label. This item has the name "Viksäter" and GeoNames tells it has a population of 309. But this is a village with only a handful of houses. If they do not hide 300 aliens in the barns, that data is completely wrong. The Viksäter that had a population of 309 is this as of 2005. Note that the data was removed in the history of the item. But it was rolled back. There are no official records of how many people live in villages like this, at least we have not had any for 100 years. Wrong data is only useful as fiction, and that is not what I am interested in here! -- Innocent bystander (talk) 11:59, 18 June 2017 (UTC)
We don't have "no data". Wikidata has some data. If we take data about cities, there are use-cases where it's more important that the data we have is trustworthy than that there's a lot of data. False positives can be more harmful than false negatives, and it's usually more work to remove false positives as it takes more human analysis. ChristianKl (talk) 18:40, 18 June 2017 (UTC)

I have no idea about the quality of the GeoNames database but it seems that the GeoNames ID (P1566) we already have in Wikidata are also to blame. Following up on User:ArthurPSmith's import of GRID I have set out to add headquarters location (P159) to the company items he created, using GRID's data. GRID provides GeoNames ids, but it turns out that I get more accurate reconciliation results by ignoring them, because of the poor quality of the claims on Wikidata. This is a shame, because in many cases GeoNames ids bring a real value by successfully disambiguating between cities. For instance, we have

but GRID uses 1862415, which seems to be the correct value. I observe a disagreement in 20% of the reconciled cities.

It seems that many disagreements are of this kind (the confusion between a big city and an administrative territorial entity that contains it and has the same name). That could probably be fixed automatically, but I do not have the time to look into that. − Pintoch (talk) 17:48, 18 June 2017 (UTC)

Some facts about GeoNames ID (P1566) and Lsjbot (Q17430942)

The mass creation of bot-generated local Wikipedia articles based only on GeoNames (Q830106) is similar to that of the ones based only on Catalogue of Life (Q38840) by Lsjbot (Q17430942). In both cases we are forced to have a (sometimes dubious) labeled item and to get in (by bots) more questionable property values or relations. Obviously the creation was done without sanity checks. Collaborating with external data providers is a good thing, but first of all I'm missing a way to give direct feedback to the Wikimedia communities that are responsible. --Succu (talk) 21:55, 18 June 2017 (UTC)

It is now June 2017. This happened and as I said before, I prefer to have the data first in Wikidata and use caching to provide the texts into the Wikipedias. At GeoNames they have noticed that our data is wrong in places, they can collaborate with us. My assertion is that when there is collaboration, we can do better for both our projects.
This is not about feedback to the Wikipedias involved. This is about collaborating with GeoNames. Thanks, GerardM (talk) 01:08, 19 June 2017 (UTC)
I think this collaboration would be most welcome! − Pintoch (talk) 23:37, 19 June 2017 (UTC)
@Succu: The articles are not only based on GeoNames. They are based on data from Nasa and some other sources, together with some algorithms that interpret in what range a mountain is located, the size of lakes, etc. A deep review of how it handled Finland shows that the algorithms had severe flaws, and so did the complementary sources (which say Finland has savanna). Add to that that much of the data is based on coordinates that are sometimes wrong. I stopped supporting it when we reached Faroe Islands (Q4628) and I discovered that big parts of the poor nation were located on the bottom of the Atlantic Ocean.
I would love to see a collaboration with GeoNames. But if it means that much of the data from that database is imported here without review of the quality of each part, I recommend against it. The flaws are not only on our end. Au contraire, they are much worse at GeoNames! -- Innocent bystander (talk) 19:19, 20 June 2017 (UTC)

Release date of computers

Hi. I can't find a common criterion for a property to mark the release date of computers:

  1. ZX Spectrum (Q23882)
  2. Amiga (Q100047)
  3. Apple II series (Q201652)
  4. Didaktik (Q576901)

Which one would be the preferred one? {{subst:Usuario:Roberpl/Firma}} 07:33, 19 June 2017 (UTC)

inception (P571). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 07:54, 19 June 2017 (UTC)
The start of production isn't the release date. Apart from that inception works as per Pigsonthewing. ChristianKl (talk) 11:01, 21 June 2017 (UTC)

Phone, tablet, laptops technical specifications

Hi, I'm planning to buy a new tablet and decided to use the query.wikidata.org service to find the best option. Unfortunately I found that there is not much data about tablets, even for the popular ones. For example, look at Samsung Galaxy Tab S3: at the time I looked at it, it had zero statements. The Wikipedia infobox has all the data I need, but it is not queryable. Then I decided to fill in the missing data manually for a few devices I'm interested in. But unfortunately there are no properties for that kind of thing either. So my question is: is it possible to create and use the needed properties for phones, tablets and laptops without a long approval period? I don't know how long it takes to get a proposed property accepted, and I need a bunch of properties. Also, I have good Python skills and could write a script or a bot to import all this data from Wikipedia infoboxes if someone gives me directions on how to do that. But of course, without properties, it will not be possible anyway. Sirexo (talk) 18:51, 20 June 2017 (UTC)

Properties are given a minimum of one week for review. If you have a good case, good examples, and fill out the proposal template correctly (see some of the existing proposals at WD:PP) then it should be straightforward. Do search the existing properties first to see if there's already something that meets your needs though. ArthurPSmith (talk) 20:54, 20 June 2017 (UTC)
Sirexo - you might also want to check out meta:WikiObject proposed by Qupro. It has a list of useful wikidata properties for specific products - however we have so far not had a lot of detailed product information added to wikidata in general. ArthurPSmith (talk) 21:01, 20 June 2017 (UTC)
As ArthurPSmith said accepting a property takes at least a week.
When Wikipedia writes a template it's only important to keep the specific items for which the template is created in mind. On the other hand when we create properties for Wikidata it's important to also keep in mind how the property is used in other items as well. It's also important to find names that make it unlikely that the property will get misused.
You could start by modeling one example in https://test.wikidata.org/wiki/Wikidata:Main_Page . ChristianKl (talk) 11:17, 21 June 2017 (UTC)

A visual way to query Wikidata

Hey everyone, i've made a tool that allows you to query Wikidata in a visual way without using SPARQL. It's called VizQuery.

The possibilities of using Wikidata to do interesting queries are endless, and the current query service allows for very powerful queries indeed. However, i feel that for the general public, especially those who are not that technical, it might be a bit overwhelming and difficult for them to learn a complex language such as SPARQL. To make people familiar with the concept of queries i believe a somewhat less intimidating approach might be useful, hence this tool.

VizQuery is only capable of doing a subset of possible queries. It's basically simple triples, variables (prefixed with '?') and literals (between "quotes"). You can do pretty powerful queries with only those things though. For example, here's a query with vegetarians who are married to a vegetarian.

Under the hood VizQuery uses Ruben Verborgh's SPARQL.js library to convert between JSON and SPARQL, so theoretically every SPARQL query you could do in the regular query service can be done in VizQuery. However, many queries won't work because the visual interface only supports a subset of options: it's pretty hard to create user-friendly GUI representations of many of the complex SPARQL features. :)
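To make the "simple triples" idea concrete, here is a small sketch that turns a list of triples into a SPARQL query of the kind the vegetarian example would need. This is not VizQuery's code and does not use SPARQL.js; `triples_to_sparql` is a hypothetical helper. Spouse is P26; the IDs P1576 (lifestyle) and Q83364 (vegetarianism) are quoted from memory and should be checked before use.

```python
from typing import Iterable, Sequence, Tuple

Triple = Tuple[str, str, str]


def triples_to_sparql(triples: Iterable[Triple], select: Sequence[str]) -> str:
    """Render plain (subject, predicate, object) triples as a SELECT query."""
    lines = ["SELECT " + " ".join(select) + " WHERE {"]
    for subject, predicate, obj in triples:
        lines.append(f"  {subject} {predicate} {obj} .")
    lines.append("}")
    return "\n".join(lines)


# Vegetarians married to a vegetarian (property and item IDs are assumptions, see above).
triples = [
    ("?person", "wdt:P1576", "wd:Q83364"),  # person has lifestyle: vegetarianism
    ("?spouse", "wdt:P1576", "wd:Q83364"),  # spouse has lifestyle: vegetarianism
    ("?person", "wdt:P26", "?spouse"),      # person is married to spouse
]
query = triples_to_sparql(triples, ["?person", "?spouse"])
print(query)
```

Variables keep their `?` prefix exactly as in the visual interface, so each row in the GUI maps to one line of the generated query.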

Anyway, i'd like to hear what you think. Bugs, feature requests and pull requests are also welcome on my Github page.

Husky (talk) 23:47, 20 June 2017 (UTC)

Thanks, Husky! Pretty cool! It's useful for showing people the most basic stuff of queries in Wikidata. Strakhov (talk) 14:38, 21 June 2017 (UTC)
This is really great! I've bookmarked it. - PKM (talk) 18:53, 21 June 2017 (UTC)
Nice tool! Consider adding support for translating the interface and making it work in more browsers (it does not work for me in Internet Explorer (Q1575) or Mozilla Firefox (Q698)).

An Idea - User statistics based on wikidata

hello,
I'm thinking about a possibility to use the query service for statistical issues related to Users of Wikimedia & partner projects. For example, it would be interesting to view the activity of users based on their projects, activity type (edits), locations, etc. I know some statistical tools exist already but with limited abilities. I think that involving wikidata in this task could bring encouraging results to present to the different wiki communities. My question is: has this idea been discussed before? and if yes, what is the actual state of this issue? If no, I'd like to listen to your opinions. Greeting! --Sky xe (talk) 01:36, 21 June 2017 (UTC)

Wikimedia cares strongly about privacy. If you propose a tool to analyze people's locations that's unlikely to get very far. Fleshing out a proposal that produces an added benefit, doesn't take too many technical resources and that doesn't violate privacy is non-trivial. ChristianKl (talk) 10:36, 21 June 2017 (UTC)
I guess Sky xe (talk • contribs • logs) means locations of the subjects edited, such as the coordinate location (P625) of articles/items edited by a specific user. A user like Salgo60, who edits a lot about people buried in a burial place in Stockholm, will then be very sharply located. I, who in my early years here at WMF-wikis edited a lot in the Asteroid belt and in the realms of the (ex)planet Pluto, will have a very interesting pattern.
-- Innocent bystander (talk) 10:46, 21 June 2017 (UTC)
I agree with ChristianKl here. There are already plenty of tools around which allow stalking users to an uncomfortable extent. While those tools' results are based on a lot of publicly available information, the accumulation and evaluation of this information often goes too far in my opinion. For me this is in fact the main reason not to reveal my real life identity, and not to attend real life community gatherings such as the WikidataCon. —MisterSynergy (talk) 10:46, 21 June 2017 (UTC)
Agreeing with ChristianKl and MisterSynergy if this is related to contributors' identity, IPs, editing patterns etc. And still waiting for the arrival of structured data on Commons. Let's see how that stuff handles dates, places, coordinates and lets stalkers picture a "chronomap" of a contributor with a handy app. Strakhov (talk) 14:27, 21 June 2017 (UTC)

Thank you all for the answers. I agree that privacy is indeed a critical point. But imagine we could view and evaluate statistics, without violating the privacy of any user identities, such as:

  • Which wiki project has been most active in the last period (week, month, year) relative to the number of active users and which languages are basically involved.
  • Which fields have been edited most, e.g. in the English Wikipedia.
  • What is the relation between users and their activity on the different wiki projects (to bypass the privacy challenge, e.g. identification only by countries instead of geographic coordinates based on IP addresses)
  • etc. A lot more could be possible.

I think that the idea deserves to be discussed, and I'm sure more aspects will come to light thereafter. Greetings! --Sky xe (talk) 16:47, 21 June 2017 (UTC)

Given current practice, I think the country in which a user resides is seen as private. Once you allow queries, they can also be used for things besides your examples. ChristianKl (talk) 10:15, 22 June 2017 (UTC)

Identifier for FamilySearch

I may have asked this before, but I can't find the results of the discussion. Should we have an identifier for FamilySearch even if there is no landing page for the link unless you are registered (registration is free)? For biographies it would provide the data and the actual image of the record for people in the census, and other primary records for confirming birth, marriage, and death. Here is the page for Elizabeth Coleman White at https://familysearch.org/tree/person/29WD-R5W/details To see it you would need to register, but the site is a cornucopia of biographical information based on primary documents. We link to Geni.com, but that does not have the original documents for free. --Richard Arthur Norton (1958- ) (talk) 23:41, 21 June 2017 (UTC)

Properties are created after a property proposal in Wikidata:Property proposal. Feel free to create a property proposal for FamilySearch. ChristianKl (talk) 09:01, 22 June 2017 (UTC)
We already have FamilySearch person ID (P2889). —MisterSynergy (talk) 09:05, 22 June 2017 (UTC)
Thanks. --Richard Arthur Norton (1958- ) (talk) 17:52, 22 June 2017 (UTC)

guitar/bass

Does anyone know why the item guitar (Q6607) that describes the guitar is called bass? Mycomp (talk) 06:44, 27 June 2017 (UTC)

It no longer is. It was vandalised yesterday. Matěj Suchánek (talk) 07:05, 27 June 2017 (UTC)
I see. Thank you. Mycomp (talk) 07:48, 27 June 2017 (UTC)
This section was archived on a request by: Matěj Suchánek (talk) 07:07, 28 June 2017 (UTC)

Item/property descriptions on mouseover?

I noticed today that the watchlist has a feature where hovering over an item or property link gives the label and description in the user's default language - so, for example, Cédric Villani (Q334065) has a little label saying "Cédric Villani | French mathematician". RecentChanges and Contributions both have the same feature.

Elsewhere on Wikidata (such as here, or on an item page), hovering over an item/property link just gives the Q-number or P-number. It feels like it would be useful to turn the detailed version on everywhere, particularly when something links to an ambiguously titled item. Any thoughts? If there's interest I'll file a bug request. Andrew Gray (talk) 15:58, 21 June 2017 (UTC)

I would prefer disambiguation without hovers, but (in curly braces).
description, if absent then P31, if absent then P279 d1g (talk) 16:25, 21 June 2017 (UTC)
I don't think your description is accurate. Sometimes on the item page I see the description displayed. I think there's a script that tries to load the description the moment you hover over it. If the loading takes too long, the ID gets displayed. At least that's the best hypothesis for its behavior. ChristianKl (talk) 20:02, 21 June 2017 (UTC)
Interesting - I've never spotted this on an item page. Andrew Gray (talk) 12:20, 23 June 2017 (UTC)
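The fallback order suggested earlier in this thread (description, if absent then P31, if absent then P279, shown in curly braces) can be sketched as follows. The function name and arguments are hypothetical and only illustrate the order of preference, not how the Wikidata interface actually builds its link text.

```python
from typing import Optional, Sequence


def disambiguated_label(
    label: str,
    description: Optional[str] = None,
    instance_of: Sequence[str] = (),   # labels of instance of (P31) values
    subclass_of: Sequence[str] = (),   # labels of subclass of (P279) values
) -> str:
    """Append a hint in curly braces: description, else first P31, else first P279."""
    if description:
        hint = description
    elif instance_of:
        hint = instance_of[0]
    elif subclass_of:
        hint = subclass_of[0]
    else:
        return label  # nothing to disambiguate with
    return f"{label} {{{hint}}}"


with_description = disambiguated_label("Cédric Villani", description="French mathematician")
with_p31_only = disambiguated_label("Cédric Villani", instance_of=["human"])
```

With a description present the link would read "Cédric Villani {French mathematician}"; without one it falls back to the class of the item.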

Good and bad news about cross-wiki search results in Wikipedia

Good news: The cross-wiki search results from other projects are now live in Wikipedia. Bad news: The search results from Wikidata are not part of the plan, meaning users won't see those results in any Wikipedia site. Feel free to share your thoughts here. --George Ho (talk) 20:36, 22 June 2017 (UTC)

George Ho, does the "Discovery Team" know that wikidata exists? But I think this is somewhat moot - or at least up to the language wikipedias how to deal with. In, for example, svwiki, if your search turns up few or no results it also does a wikidata search - for example [10]. ArthurPSmith (talk) 13:18, 23 June 2017 (UTC)
They have enabled a gadget on the Swedish Wikipedia to show results from Wikidata. And it seems the Discovery team is aware of Wikidata per this. Stryn (talk) 14:55, 23 June 2017 (UTC)

Nutritional Information

We currently don't have any nutritional information in Wikidata. It seems USDA Food Composition Databases contains a great database by the US government. US government data is in the public domain, so we could import it. Do we currently have the necessary properties? ChristianKl (talk) 12:59, 22 June 2017 (UTC)

you might want to check out properties that are instances of Wikidata property related to food and eating (Q23038310) - there don't seem to be many yet though. ArthurPSmith (talk) 17:54, 22 June 2017 (UTC)
Pierre from Wikidata & Open Food Facts, reporting for duty :-)
- We recently imported all the USDA db into Open Food Facts. We're looking for contributors who can scan their food, and add pictures to augment the data. Millions of photographic evidence to take. There are also some generic profiles (without barcodes) in that database, which we haven't imported into OFF, but that I've put up for matching on Mix N'Match. More info at https://www.wikidata.org/wiki/Wikidata:WikiProject_Food/Properties
- Also, feel free to help Open Food Facts: we need contributors, translators, coders...
>> Web: https://world.openfoodfacts.org
>> Android: https://android.openfoodfacts.org (also on FDroid)/ iOS: https://ios.openfoodfacts.org
>> Translators needed @ https://crowdin.com/project/openfoodfacts
--Teolemon (talk) 15:32, 23 June 2017 (UTC)
Unfortunately, Open Food Facts seems to have a license that's not compatible with Wikidata. I think having data under CC-0 so that everybody can use it is a more worthy project. ChristianKl (talk) 17:42, 23 June 2017 (UTC)
The USDA db is public domain, as all works by US Civil Servants, although derived (for the part with barcodes) from producer data. So you can import it into Wikidata. For Open Food Facts, I do get your point: We have chosen the ODbL to ensure the Db keeps growing, and that commercial apps play the game of sending pictures back. --Teolemon (talk) 18:45, 23 June 2017 (UTC)
Yes, importing from USDA would work for Wikidata.
I think you are wrong about the effects of licensing. Any company that uses a data set has an interest in that data being high quality. We likely wouldn't be able to exchange data with Songkick if we were ODbL-licensed.
In many cases, I would estimate that a serious company will build their own database instead of using OpenFoodFacts. They will use BSD/Apache-licensed libraries but they won't use GPL-licensed ones. ChristianKl (talk) 21:20, 23 June 2017 (UTC)

date of death of Honoré de Balzac

I was just looking at date of death (P570) of Honoré de Balzac and we seem to have two values: 18 August, which all Wikipedia articles state, and 19 August, "stated in" "Integrated Authority File" 3 years ago. I assume "Integrated Authority File" means this link, but I could not find any mention of date of death there. Is it OK to delete that date if the source no longer states it? If there is a controversy over his date of death, then we need to keep both dates, but otherwise this looks like a typo that someone corrected a long time ago. What is the proper procedure for removing the data? --Jarekt (talk) 12:50, 23 June 2017 (UTC)

I think it actually references this link (see last triple) --Nono314 (talk) 17:08, 23 June 2017 (UTC)
OK, so that date is still returned by d-nb.info. I wonder if it is a typo or there is a real controversy about the date of his death. --Jarekt (talk) 18:37, 23 June 2017 (UTC)
He passed during the night. According to frwikisource, it would have occurred between 22:30 and 22:40 on 18th. --Nono314 (talk) 19:56, 23 June 2017 (UTC)

Best practice etiquette related to choosing image (P18) property

I am working on importing some of the data stored in Commons Creator templates to Wikidata, and I am now looking at images used in creator infoboxes. I imported all the images that were missing on Wikidata, but now I am looking at cases where both the Creator template and the Wikidata item have an image and the two differ. You can see the Creator templates in question in here and the images at c:User:Jarekt/f. I think I have a good idea of what is a superior image to depict a person, but would like to double-check if others feel the same way. My current thinking is to replace the Wikidata image if:

  • new image depicts person and current image depicts work created by the person
  • new image is a better quality version of the old image
  • new image shows a single person and old image multiple people
  • favor images which show face.

Any other suggestions, to help pick "better" image (P18)?

  • Shall we favor "portrait" vs. "landscape" aspect ratio?
  • In the case of painters and photographers I prefer realistic self-portraits to depict a person. How do others feel about it?
  • Is it a good idea to have multiple good images stored in P18? I think not, but just checking.

--Jarekt (talk) 16:56, 21 June 2017 (UTC)

I completely support your first four points. Additionally, I think a color portrait is better than an engraving of a portrait in most cases. And if we have a mediocre "head shot" and a good full-length portrait, we should consider using a head-and-shoulders crop from the full-length portrait. I also prefer "portrait" rather than "landscape" orientation. On self-portraits, I generally agree; but I'd make an exception for Dante Gabriel Rossetti and use the Watts portrait rather than the self-portrait of DGR as a very young man. - PKM (talk) 01:29, 22 June 2017 (UTC)
I agree with all four suggestions. Also portrait ratio should be preferred, because of wide use image (P18) in infoboxes. No preference about self-portraits. Single image is preferred.
Watch out for pictures not representing the person like images of signature or grave (without picture of bust of person).--Jklamo (talk) 11:25, 22 June 2017 (UTC)

Thank you both for confirming my rules of thumb. I am replacing hundreds of images with (hopefully) better ones. I thought I would share a little shortcut I find very useful which I use in c:User:Jarekt/f. That page was created from an Excel spreadsheet pairing up images from Commons with images from Wikidata for the same person. A little (+) can be used to perform a 2-click replacement of the Wikidata image with the Creator one (I am not suggesting the actual replacement in the case of those 2 images, so only do one click). That technique can probably be used for other mass tasks with a human in the loop. --Jarekt (talk) 15:33, 22 June 2017 (UTC)

For Classical Greek and Roman authors, a contemporary portrait at bust length (Q241045) (sculpture) is usually the best option over a mosaic, painting, etching, or later form of depiction. The best photographs of busts have the face pointed towards the viewer, and the best choice of bust will have the nose intact. When a bust is not available, a depiction on a coin may be useful. Care has to be exercised though, because some sculptures have been incorrectly identified, and later found to be so. The current data accompanying some of these sculptures is not always present yet in Commons or Wikidata. --EncycloPetey (talk) 21:02, 22 June 2017 (UTC)

Agree, I will keep that in mind when picking images. --Jarekt (talk) 12:11, 23 June 2017 (UTC)

To start with, do we all agree that there should be only one file in P18 for every item? I can easily see that if not, Wikidata can become a trash can similar to Commons; on the other hand, I see added value in, say, having an exterior and an interior of a building. Has this actually ever been discussed?--Ymblanter (talk) 08:49, 23 June 2017 (UTC)

Well, I would prefer to see both a male and a female file in Panthera leo (Q140) like it already has.
Replacing an image which already has a caption in a language you cannot interpret and change could be very troublesome.
A related topic: I would like to allow such things as depicts (P180) as a qualifier to P18. Often we have pictures of churches to illustrate a village. But how to tell which church? The discussions thus far have rejected such usage, but I think it could be very useful since you cannot add links into the caption-property. And what a picture illustrates depends on which item/article you use the file in. File:Carlito syrichta on the shoulder of a human.jpg has for example been used to illustrate everything from Lsjbot to Primates and everything in between. -- Innocent bystander (talk) 11:49, 23 June 2017 (UTC)
I agree that male and female images (if different) for living organisms is a great idea. I guess that is an example of a case where multiple images stored in image (P18) are a good idea. In context of Commons Creator templates images will be mostly of people, and for such I find multiple images troublesome. I did not pay much attention to captions, as so few images have them I did not notice you could add them. To me a new better image outweighs the loss of a caption, but it should be considered in case of minor improvements. One thing I do not like in case of P18 images of people is to see images depicting their artworks instead of depicting the artists. When the artwork shows someone else that is hard to spot. --Jarekt (talk) 12:11, 23 June 2017 (UTC)
Agree, if it isn't a good self-portrait, an artwork made by the subject is not a good file at all in P18. I guess they have been imported from WP-articles and things like that. No big problem in WP if the caption tells it isn't the subject, but such things go unnoticed by these rough tools. -- Innocent bystander (talk) 12:22, 23 June 2017 (UTC)
I like your rules of thumb and I agree that "image of the artist's work" should never be in P18. However, would it be worth creating a new property for "image of an example of this person's work", to go alongside images of plaques, graves, signatures, etc? This would give us somewhere to put these, which might be useful for the cases when we have no other appropriate image but still want to have some kind of illustration (eg for an infobox). Andrew Gray (talk) 12:19, 23 June 2017 (UTC)
Well, don't we already have a property for "Notable work"? I guess P18 could be used as a qualifier to those. Or even better, be put inside those "work-items". -- Innocent bystander 12:25, 23 June 2017 (UTC)
Agree, Works by the creators should be added to notable work (P800). In case of paintings user:Multichill and other Wikidata:WikiProject sum of all paintings participants are adding a ton of items for those. I do not know if we have other projects for sculptures, drawings, books, photographs, etc., but we should. --Jarekt (talk) 13:06, 23 June 2017 (UTC)
  •   Comment @Jarekt: I don't see that we are exactly limited to a single image, though where adding more than one it is important that one is marked as preferred.  — billinghurst sDrewth 13:00, 24 June 2017 (UTC)

Link to SQID tool in the left hand bar

I really like the gadget that displays the Resonator link in the left hand bar. I think it would be good to have the same for the tool at https://tools.wmflabs.org/sqid/ as well. Is there a reason against having such a gadget? ChristianKl (talk) 18:08, 22 June 2017 (UTC)

I think that SQID (Q24298088) should be default view, when non-editors view site (to explore data and "like this" buttons)
What we have now should be "edit" mode (with many edit buttons) d1g (talk) 07:18, 24 June 2017 (UTC)

Annoyed with the amount of merges that can be done

I really hate it when people request an item to be deleted when all they could do is merge it. Waste of bytes for system, waste of my time, and just a waste of time for the person submitting because they have to find the duplicate and write that in reason. PokestarFan • Drink some tea and talk with me • Stalk my edits • I'm not shouting, I just like this font! 12:59, 24 June 2017 (UTC)

Data donation: Giphy

I'm pleased to report that I have secured another data donation: over 4.7K of Giphy user account IDs, for brands and artists, to be matched to Giphy username (P4013). These are already in Mix'n'Match, where over 20% are already matched and uploaded to Wikidata. The rest await attention! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:38, 24 June 2017 (UTC)

"Single value" constraints and deprecated values

We could do with either changing our "single value" constraint so that deprecated values are (optionally) ignored, or having an alternative constraint that could be applied to that end (without the need to write complex constraint code). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:43, 12 June 2017 (UTC)

Not just for deprecated please. E.g. for Thailand central administrative unit code (P1067), there is one current single value, but there may be several older ones - which then have normal rank, not deprecated, as they were correct in past, whereas the current value has rank high. So single value should just check for single value in the highest used rank. Ahoerstemeier (talk) 21:43, 12 June 2017 (UTC)
I'd also suggest that it pay attention to whether there's a start/end date on entries. There could be two valid historic values for something, both marked 'normal', and with suitable start/end dates, but no current version to mark as preferred. --Oravrattas (talk) 06:52, 13 June 2017 (UTC)
In that case, why not add a "no value" with high rank to mark that there is no current value? Then it'd work without problems by checking on the highest used rank. Ahoerstemeier (talk) 07:52, 13 June 2017 (UTC)
That is fine in the case where it's definitively known that there is no valid value at the moment, but it's possible to have a case where there are historic versions that are known to no longer be valid, and where people have correctly added the information for those. Forcing them to then either find a new value, or explicitly choose between 'no value' and 'unknown value', when they may not have that knowledge, simply to satisfy a constraint check, violates the "missing is not broken" principle. --Oravrattas (talk) 13:51, 13 June 2017 (UTC)
The constraints are not enforced, so they don't force anyone to do anything. - Nikki (talk) 14:07, 16 June 2017 (UTC)
I think we can go with this: "single value" now means "single best value", ie. the highest rank in a statement group should only occur once. Matěj Suchánek (talk) 14:16, 16 June 2017 (UTC)
When it comes to authority control, where there should be only one correct value, I don't think 2 statements with rank normal and 1 statement with the preferred rank should pass the constraint.
I think that the default "single value" constraint should allow deprecated values but limit the number of other values to one. We can have an additional "single best value" constraint that also allows multiple statements with the normal rank alongside one statement with the preferred rank. ChristianKl (talk) 13:32, 18 June 2017 (UTC)

@Lydia Pintscher (WMDE): FYI. I suspect you'll want a Phabricator ticket, but this is your opportunity to comment first. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:08, 23 June 2017 (UTC)

We have phabricator:T167653 already it seems. My current thinking is that "single value" means there should only be one best ranked statement. Are there cases where this leads to data we do want to flag as not being flagged? --Lydia Pintscher (WMDE) (talk) 09:23, 26 June 2017 (UTC)
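To make the two readings in this thread concrete, here is a minimal Python sketch. It is not the actual constraint code: the statement layout (a list of dicts with a "rank" key) and the function names are assumptions made purely for illustration. It contrasts the "single best value" reading (only the highest used rank is counted) with the stricter reading suggested above (deprecated values are ignored, but at most one other value is allowed).

```python
# Hypothetical model: each statement is a dict with a "rank" of
# "deprecated", "normal" or "preferred". This is an illustration only,
# not how the constraint reports are implemented.
RANK_ORDER = {"deprecated": 0, "normal": 1, "preferred": 2}


def best_statements(statements):
    """Return the non-deprecated statements that share the highest used rank."""
    live = [s for s in statements if s["rank"] != "deprecated"]
    if not live:
        return []
    top = max(RANK_ORDER[s["rank"]] for s in live)
    return [s for s in live if RANK_ORDER[s["rank"]] == top]


def passes_single_best_value(statements):
    """'Single best value': the highest used rank occurs at most once."""
    return len(best_statements(statements)) <= 1


def passes_strict_single_value(statements):
    """Stricter reading: ignore deprecated values, allow at most one other."""
    live = [s for s in statements if s["rank"] != "deprecated"]
    return len(live) <= 1


# One current (preferred) code plus two older codes at normal rank,
# as in the P1067 example above.
history = [
    {"value": "current", "rank": "preferred"},
    {"value": "old-1", "rank": "normal"},
    {"value": "old-2", "rank": "normal"},
]
print(passes_single_best_value(history))    # True
print(passes_strict_single_value(history))  # False
```

Under this sketch, two historic values that are both at normal rank with no preferred value would still be flagged by the "single best value" reading, which is the case raised earlier in the thread about start/end dates.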

Wiktionary sitelinks enabled in Wikidata

Hello all,

As mentioned previously, we are now able to store the interwiki links of all the Wiktionaries namespaces (except main, citations, user and talk) in Wikidata.

Important: even if it is technically possible, you should not link Wiktionary main namespace pages from Wikidata. The interwiki links for them are already provided by Cognate, and in the future, Wikidata will also have special entity types for lexemes (see Wikidata:Wiktionary and mw:Extension:WikibaseLexeme/Data Model).

How can you help?

  • First of all, you can help us translate this documentation page into the languages you know.
  • If you know tools, scripts, bots, that could be useful for the migration process and removing the manual sitelinks, please share your information on the page and offer help to people who would need to use them.
  • You may want to pay special attention to the newly created items and all recent changes that will result from this new feature available for Wiktionaries.
  • Be friendly and welcoming with the Wiktionary editors :) Help them if necessary, make them feel part of the great Wikidata community.

Thank you very much! Lea Lacroix (WMDE) (talk) 08:36, 21 June 2017 (UTC)

What's the motivation to enable the feature before the lexeme data type exists? ChristianKl (talk) 10:25, 21 June 2017 (UTC)
@ChristianKl: This is for sitelinks for everything that's not in the main namespace. So categories, templates, modules and project pages basically. Jon Harald Søby (talk) 10:43, 21 June 2017 (UTC)
Are you aware of any bot that could transfer automatically all the interwiki links to wikidata? I guess that bots which did the job for other projects can be adapted for this task. Pamputt (talk) 00:59, 22 June 2017 (UTC)
@Jon Harald Søby: But it's still really possible to add Wiktionary main ns pages, phab:T158323 said that an AbuseFilter should prevent such edits but where's that AF? --Liuxinyu970226 (talk) 02:24, 22 June 2017 (UTC)
@Liuxinyu970226: seems that such an abuse filter wasn't created yet. You can ask at WD:AN and someone with needed skills and rights will do it, probably. XXN, 11:53, 26 June 2017 (UTC)

Members of the European Parliament

I just saw this query on Twitter:

Does anyone have a spreadsheet of MEPs sorted by committee? [...] everything on EP site is so unhelpful.

How far are we from being able to answer that, and who is working on MEP data? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:00, 23 June 2017 (UTC)

I haven't looked into European Parliament information in Wikidata in any great depth yet, but my impression is that we're quite far away from being able to answer something like this. Most of the position held (P39) entries that I've seen for member of the European Parliament (Q27169) are completely bare — i.e. not even qualifiers that will tell us when this was true, or which constituencies they were elected from — so we have no way to even produce a list of who we think the current MEPs are to see how complete it looks. And committee memberships is something that we don't really have in much depth for any legislature yet (and not at all for the vast majority of them). My impression is that some people have done work tidying up data on the MEPs from individual countries, but no-one is actively working on the European Parliament as a whole. --Oravrattas (talk) 06:30, 24 June 2017 (UTC)
We even have a few hundred with no country, as well. There was a push to get them all in based on matching to a database produced by the EP - so we do have them all, give or take some errors or duplication, but beyond that the data is pretty light. Andrew Gray (talk) 10:32, 24 June 2017 (UTC)

3600 via P1186 / MEP directory ID. Links at Wikidata:EveryPolitician need updating. --Atlasowa (talk) 13:09, 26 June 2017 (UTC)

Wikidata weekly summary #266

Dispute about webapp-related "said to be the same as" claims

I don't know if I'm in the right place, but I wanted to get a third opinion on a dispute involving three statements added this morning by User:D1gggg:

The first two showed up on my watchlist; they sounded wrong to me so I reverted. But the user re-reverted again without an explanation. So I started a discussion on their talk page (Topic:Tt6ir4v8yjzxodgf), but made no progress, it seems we're talking past each other. Can anyone chime in with a third opinion please? Intgr (talk) 15:23, 26 June 2017 (UTC)

  Comment be aware that my questions were ignored 123 times... d1g (talk) 15:43, 26 June 2017 (UTC)

I don't see a reason for why they should be tagged with said to be the same as (P460). Not all software that runs on a server is a web framework or a web server. The German Wikipedia lists 17 different types of servers and web servers are only one of those 17. The server software that manages access to printers on a network is neither a web framework nor web server software. print server (Q1162303) is a subclass of server software (Q1371279) and it would also make sense to classify web application (Q189210) as subclass. ChristianKl (talk) 15:56, 26 June 2017 (UTC)
we can change these 3 claims or every claim, I don't see a reason to use these items.
"web framework" or "web server" add no value over server software (Q1371279)/client (Q528166).
I don't see classification for Rich Site Summary (Q45432) in German Wikipedia.
Print server are using print protocols Internet Printing Protocol (Q1667982).
depends on software (P1547) should be used for "frameworks". d1g (talk) 16:10, 26 June 2017 (UTC)
I agree these things are not the same, and here are a few examples for User:D1gggg:
So these concepts are related, but too loosely to be identified with this property. − Pintoch (talk) 16:30, 26 June 2017 (UTC)
I don't think web framework (Q1330336) is in meaning equivalent to web server (Q11288) and Django does qualify as web framework (Q1330336) but otherwise I agree with Pintoch. ChristianKl (talk) 16:47, 26 June 2017 (UTC)
@Pintoch: HTTP server, so definitely server software (Q1371279) - which I'm pointing
"web framework" - is a subjective game around server (Q44127) and server software (Q1371279) in non-Java (Q251) world.
Servlets are the Java platform technology of choice for extending and enhancing Web servers. - then what it is if not a server software (Q1371279)? subclass of "web framework"? Synonym? d1g (talk) 17:19, 26 June 2017 (UTC)
I do not understand your sentence "HTTP server, so definitely server software (Q1371279) - which I'm pointing" (and the following ones are also unclear). Can you please write full sentences? − Pintoch (talk) 17:42, 26 June 2017 (UTC)
When somebody says "HTTP" it can only mean two things: HTTP client (Q528166) or HTTP server software (Q1371279)
  • client A program that establishes connections for the purpose of sending requests.
  • server An application program that accepts connections in order to service requests by sending back responses. Any given program may be capable of being both a client and a server; our use of these refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.
When one says "web framework" it can mean anything depending on the perspective of the speaker and his grip in technology. There is no standard for "web framework". 17:54, 26 June 2017 (UTC)
Thanks a lot, this is much clearer! But I do not really understand why you keep bringing up the HTTP protocol. Not every web-related concept is described by the standard for the HTTP protocol. (As a random pick, I would take Gangnam Style (Q890): not sure whether it is more of a server or a client. But it is definitely a very important component of the web!) − Pintoch (talk) 18:32, 26 June 2017 (UTC)

  Comment I'm not changing my opinion that "web ..." terms aren't useful, because of how complicated "web standards" are now.

We can spend time on references and who says what, but the results wouldn't be as useful as specifying each protocol and its client/server implementation. Java Servlet (Q375673), IPv6 (Q2551624), HTTP/2 (Q739120), HTTPS (Q44484). It was HTTP 1.1 long, long time ago to call everything "web". d1g (talk) 16:56, 26 June 2017 (UTC)

d1g the world wide web was invented by Tim Berners-Lee in 1989 or thereabouts (en:World Wide Web). Servers have been around since the internet began, at least 20 years earlier - see en:Server (computing) ("In computing, "server" dates at least to RFC 5 (1969)"). An email server, for example, responds to SMTP requests, not to HTTP, so it is not a web server, and certainly not a web application, but it is server software. Your use of "said to be the same as" here is clearly wrong, and your arguments are not making any sense at all. ArthurPSmith (talk) 18:18, 26 June 2017 (UTC)
That's my point?.. "server"/"client" was there since day 1 or so; you can't define and source "web framework" consistently.
"web framework" is meaningless: commonly it means "server", but sometimes it is used for "client" too. d1g (talk) 18:28, 26 June 2017 (UTC)
so why are you trying to claim something that you assert is "meaningless" is "the same as" something that is clearly well defined and "was there since day 1"? In fact, while "web framework" may be somewhat buzzwordish, in my experience it does have a tangible meaning distinguishing that kind of software environment from the other components of an http(s)-based internet service. Which is in fact why we have a whole enwiki page on it: en:Web framework. enwiki doesn't usually have entire long pages about things that are "meaningless". ArthurPSmith (talk) 19:12, 26 June 2017 (UTC)
I used "roughly same as", not exactly as (exact match (P2888)), and I removed these claims and replaced them with facet of (P1269).
http/https changed a lot, and "web framework" gained popularity only after 1992. I also wonder why there is a peak in the 60s...
So, "web framework" usage is very limited (1992-) as opposed to "servers" and "clients".
Furthermore, number of application programming interface (Q165194) is grown so that "client" in HTTP sense is more relevant than before.
Basically, Internet of Things (Q251212) is a call to API-ize every hardware, even more programs would be "clients" of everything else. d1g (talk) 20:19, 26 June 2017 (UTC)

Qualifiers in QuickStatements 2?

QuickStatements is a great tool and I use it extensively; however, it is lacking documentation (as we already discussed at Project chat), so the main way of figuring things out is trial and error. Reading the User:Magnus Manske/quick statements2 page suggests that the tool might support qualifiers. Has anybody figured out how to do it? I would like to add statements like date of birth is "circa 500 BC" (as in Pythagoras (Q10261)). --Jarekt (talk) 18:46, 26 June 2017 (UTC)

  • Add TAB P1480 TAB Q5727902 behind the statements that would normally add 500 BC and before the sources. ChristianKl (talk) 18:59, 26 June 2017 (UTC)
  • You will find the documentation in the old version. Matěj Suchánek (talk) 19:07, 26 June 2017 (UTC)
Matěj Suchánek & ChristianKl, thank you. I misunderstood the documentation in the old version. I thought one can add multiple properties with a single statement, but it is one property and multiple qualifiers. That makes much more sense. Thanks again. --Jarekt (talk) 19:37, 26 June 2017 (UTC)
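As an illustration of the layout described above (one item, one property, one value, then TAB-separated qualifier pairs, with any sources after that), here is a small Python sketch that builds such a line. The helper name is made up for this example, and the property ID P569 for date of birth and the time string "-0500-00-00T00:00:00Z/9" (year precision) are assumptions to check against the QuickStatements documentation before use.

```python
def quickstatements_line(item, prop, value, qualifiers=()):
    """Join one statement and its qualifier pairs into a TAB-separated line.

    Hypothetical helper for illustration; it does not validate IDs or values.
    """
    fields = [item, prop, value]
    for qualifier_prop, qualifier_value in qualifiers:
        # Each qualifier is appended as its own property/value pair.
        fields.extend([qualifier_prop, qualifier_value])
    return "\t".join(fields)


# Pythagoras (Q10261), born circa 500 BC.
# P569 and the time format are assumptions; P1480/Q5727902 come from the reply above.
line = quickstatements_line(
    "Q10261",
    "P569",
    "-0500-00-00T00:00:00Z/9",
    qualifiers=[("P1480", "Q5727902")],
)
print(line.split("\t"))
```

Several qualifiers can be passed in the same call; they all attach to the single statement at the start of the line.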

Integrate the merge function into the default UI

Currently, a user has to activate a gadget to be shown the merge function. This means that new users are discouraged from merging items. Given that we have a huge merge backlog, it might be worthwhile to integrate the merge function by default. What do you think? ChristianKl (talk) 07:30, 18 June 2017 (UTC)

I agree this would be very useful. We can enable the gadget by default, that's by far the easiest way to do it. Opinions please! :-) Multichill (talk) 10:55, 18 June 2017 (UTC)
I think the gadget lacks some quickly accessible explanations how to use it, what it does and some warnings when not to use it. This is necessary to prevent accidental misuse. Matěj Suchánek (talk) 15:28, 18 June 2017 (UTC)
@Matěj Suchánek: Do you think there's significant accidental misuse in the status quo? ChristianKl (talk) 20:33, 18 June 2017 (UTC)
@ChristianKl: Have a look at these contributions. In the last month there were 50 unpatrolled redirects created by IPs, whereas in the last 2.5 hours 50 redirects were created by users who were logged in. Q.Zanden questions? 21:33, 18 June 2017 (UTC)
(edit conflict) Just a question (unclear to me): Do you propose enabling for all logged in users or just everybody?
There will always be some misuse since mistakes always happen. But the new audience to get this gadget enabled (ie. newbies) is more likely to make mistakes. This is my concern, hence my proposal. Matěj Suchánek (talk) 20:57, 18 June 2017 (UTC)
My preference would be to enable it for everybody. Sometimes this will indeed lead to mistakes but I think that the amount of increased engagement is more valuable than the mistakes that will be made. I would also be okay with enabling it for logged in users. ChristianKl (talk) 22:44, 18 June 2017 (UTC)
  Support. − Pintoch (talk) 20:47, 18 June 2017 (UTC)
ChristianKl - while I think this would ultimately be the right thing to do, there are several features that I believe are not now present in the merge gadget that I think should be addressed before it receives more widespread use:
  • I don't believe it recognizes different from (P1889) which should prevent merges between the respective items if it is present on either of them. There are perhaps some other properties that should also prevent or at least raise a question about merges (for example if country (P17) is set for both and differs, or if coordinate location (P625) differs by more than, say 10 km (?)).
  • the description in a given language on the merged item is completely lost if the item it is being merged into already has any sort of description in that language (and many old items have bare-bones descriptions auto-generated from P31's). There should be a way to pick the better description or merge descriptions in some way if appropriate.
  • merging should sensibly handle the case where one item has a wikilink to a redirect and the other has the direct link (i.e. allow the merge in this case even though both pages have links in the same language).
otherwise I think merge really needs to be limited to more experienced wikidata users who can recognize and know how to deal with these issues. ArthurPSmith (talk) 23:04, 18 June 2017 (UTC)
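The pre-merge checks suggested in the list above could look roughly like the following Python sketch. It is not how the merge gadget works: the item layout (a dict with "id", lists of item IDs under "P1889" and "P17", and a (lat, lon) tuple under "P625"), the function names, and the sample items are all assumptions for illustration; the 10 km threshold is the tentative figure from the comment.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def merge_warnings(a, b, max_km=10.0):
    """Return reasons why merging items a and b deserves a second look."""
    warnings = []
    # different from (P1889) on either item pointing at the other
    if b["id"] in a.get("P1889", []) or a["id"] in b.get("P1889", []):
        warnings.append("different from (P1889) links these items")
    # country (P17) set on both items with no value in common
    countries_a, countries_b = set(a.get("P17", [])), set(b.get("P17", []))
    if countries_a and countries_b and countries_a.isdisjoint(countries_b):
        warnings.append("country (P17) differs")
    # coordinate location (P625) further apart than the threshold
    if "P625" in a and "P625" in b:
        if haversine_km(a["P625"], b["P625"]) > max_km:
            warnings.append("coordinate location (P625) differs by more than %g km" % max_km)
    return warnings


# Made-up placeholder items, not real Wikidata entries.
item_a = {"id": "Q900001", "P17": ["Q212"], "P625": (50.45, 30.52)}
item_b = {"id": "Q900002", "P17": ["Q212"], "P625": (50.46, 30.53)}
item_c = {"id": "Q900003", "P17": ["Q36"], "P625": (52.23, 21.01), "P1889": ["Q900001"]}
print(merge_warnings(item_a, item_b))
print(merge_warnings(item_a, item_c))
```

Returning a list of warnings rather than a yes/no answer leaves the decision with the editor, which fits the "raise a question about merges" wording above. The description and redirect-sitelink points from the list are not covered by this sketch.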
  •   Comment ... "we have a huge merge backlog ..." citation needed! Or maybe it is language specific. If we think it is true, can we demonstrate that?
  • +1 to Matěj Suchánek and ArthurPSmith's statements. I would prefer that there was a new right developed for that, which could be auto-assigned after reaching an edit count criterion, rather than giving it to all users at any time. There definitely needs to be some better supporting information. A number of us more experienced users have fallen over with categories to main; taxonomy; settlement to municipal where there have been errors in the existing links, or other errors that make them look the same.
  • We need an improved "danger" list or better builds to prevent merges apart from "different from". I have seen "book/literary work/..." (or similar) and "version, edition, or translation (Q3331189)" merged, which is problematic.  — billinghurst sDrewth 01:47, 19 June 2017 (UTC)
  • The fact that you see books being merged with version, edition, or translation (Q3331189) doesn't necessarily mean that someone used the merge tool to do it. Magnus merge game suggests items to be merged when they share labels or aliases. When the book and the edition share a name it would suggest them to be merged. ChristianKl (talk) 07:53, 19 June 2017 (UTC)
      Comment @ChristianKl: then please have those instructions updated or removed as it is simply inappropriate, and we need a stronger means to prevent such. Further, I would suggest that version, edition, or translation (Q3331189) should basically not be presented with the suggestion to be merged, it is very unlikely we will need to merge editions  — billinghurst sDrewth 13:05, 24 June 2017 (UTC)
  • I think we have a merge backlog, because in the SPARQL query that's supposed to look for items that are supposed to be deleted that's shared on the German Project Chat the event that items should be merged happens more frequently than that they should be deleted. In the recent RFC about allowing redirects there was also a concern that it increases an already existing merge backlog. ChristianKl (talk) 07:53, 19 June 2017 (UTC)
    Please provide a link to the query so we can all see the evidence. PLbot produces lists of works to merge, and from my observations of English language components it would appear to be in a managed state.  — billinghurst sDrewth 13:12, 24 June 2017 (UTC)
  • PLbot can only find items that have actually the same name. When working through tinyurl.com/ycr58tmo I found a bunch of items where merging made more sense. At the moment, for some reason that query seems to time out. ChristianKl (talk) 11:32, 27 June 2017 (UTC)
  • If we want to couple merging to a user right, it might make sense to use the already existing auto-patrol flag. It could work for both the merge tool and for the merging game. ChristianKl (talk) 13:03, 19 June 2017 (UTC)
    Yes you could merge it if you thought that they should be co-assigned. That said, that is an applied right either by application or grant of administrator, so it will not necessarily meet your needs unless administrators are pro-actively assigning the right. And to note that you would need to do a little magic through your gadgets as people should still be able to turn it off if they do not wish to see/have the merge functionality. As it is already a gadget, I don't think that you will get best value to co-assign it that way.  — billinghurst sDrewth 22:53, 26 June 2017 (UTC)
    Another thought. If we have some editing criteria, eg. 1000 edits, as a point when we think that someone knows enough to merge, we could just run a bot through finding users who pass a milestone value during the past <name your time period> and leave a note about the gadget.  — billinghurst sDrewth 22:58, 26 June 2017 (UTC)
  • Given that the user has to click on "More" before they see the possibility to merge, I can't imagine why someone wouldn't want to have the merge function enabled. ChristianKl (talk) 11:32, 27 June 2017 (UTC)

Bulk create items without statements, but only a single sitelink ?

In the past, occasionally, bots create thousands of items merely with sitelinks, but no further statements.

The situation has improved. Items without any statements for enwiki are at 4.6% (see Wikidata:Database_reports/without_claims_by_site). Still, we have items created years ago without any statements.

The question is whether we should continue to create items without any statements (and wait till some day someone adds statements to these). Alternatively, we could decide that bulk created items should include at least one statement, even if a sitelink is present.
--- Jura 22:01, 25 June 2017 (UTC)

For Wikipedia, the answer may be "yes". For Wikisource, the answer is a resounding "no". Wikisource items without any statements are valueless, and many, many, many items at Wikisource sites are chapters of books, or acts within plays, and really should not have data items. --EncycloPetey (talk) 02:50, 26 June 2017 (UTC)
Clearly the tool you mentioned, Wikidata:Database reports/without claims by site, does not work for unlinked pages. In fact, only a few tools (e.g. Duplicity) work for unlinked pages, but it's a huge backlog (multiplied by hundreds of wikis), and it's not practical unless someone periodically checks them (like nlwiki). Also, another way to kill unlinked items is QuickStatements, which requires that both the subject and the object are connected.--GZWDer (talk) 04:25, 26 June 2017 (UTC)
@Jura1: I think that every item in Wikidata without any statement is useless (even when it has multiple sitelinks, but the more sitelinks, the more easy it is to add any statement), and therefore can be deleted. Maybe a way to decrease the number of un-statemented items is to make a list with 50 items on it and give it a week to be improved. If the week has expired, and there are still items with zero statements they can be deleted without processing through RfD. If there are others with any other idea to clean the list of empty items, feel free to add it!
@EncycloPetey:, why should chapters or acts not have any statements? They can have part of (P361), instance of (P31) chapter (Q1980247)/act (Q421744) and with series ordinal (P1545). I don't see a problem why they shouldn't have any statements. Q.Zanden questions? 10:22, 26 June 2017 (UTC)
"I think that every item in Wikidata without any statement is useless (even when it has multiple sitelinks...), and therefore can be deleted" So you'd be happy to break the interwiki linking between two Wikipedias, or other sister projects? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:33, 26 June 2017 (UTC)
@Pigsonthewing: No, not at all, I see how much Wikidata can do to connect all the different languages. But if there are no statements, I think that the items are quite useless. And as I also said above, the more sitelinks an item has, the easier it should be to create some statements about it because there are multiple articles about that topic. So I think we should put more work in adding statements to (almost) empty items. Q.Zanden questions? 10:42, 26 June 2017 (UTC)
@QZanden: Is your position that you see what Wikidata can do to connect different languages but you still want to remove items that provide sitelinks between different articles in different languages?
Given that descriptions of items can be displayed in the Android App, items with a single sitelink and no statements can provide benefits to users. ChristianKl (talk) 11:06, 26 June 2017 (UTC)
@Q.Zanden: Your two statements are mutually exclusive. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:52, 26 June 2017 (UTC)
The number of items without any sitelinks is growing steadily. It is impossible to guess what articles exist in any and all of the 285+ Wikipedias. When there are items, they are found when you search for them. We should more aggressively add items for articles that are not linked to an item. In that respect the English Wikipedia is not of special relevance; the German Wikipedia, for instance, is in many fields more complete. Thanks, GerardM (talk) 11:20, 26 June 2017 (UTC)
@Q.Zanden: Because for any given chapter of a book, there must be (a) a general data item for the chapter as a work, and then (b,c,...) separate data items for each edition of that chapter. So, if we have a 1910 UK edition of The Time Machine, with 12 chapters, then we need to create 12 data items for the 12 chapters as works, then another 12 data items for the actual chapters of the 1910 UK edition. The 12 work chapters will be part of the book as a work, but the 12 edition data items for chapters will be part of the specific edition, and also editions of the work chapters. This runs into further problems when you realize that there is more than one way that The Time Machine is divided into chapters, namely editions with 12, 14, or 16 chapters, which are not analogous to each other. So we need one data item for the novel as a whole, one data item for the serialized version as a work, one data item for the Heinemann text as a work, one for the Holt text as a work, one for the Atlantic text as a work, then each individual edition can be added. But then that whole discussion about chapters as works and editions? That gets multiplied by 4 because of the three novel and one serialized edition, and each set of chapter data items needs to clearly identify to which edition of the text it belongs. And that just deals with English editions. For translations into other languages, there's no guarantee that it will fit any of those editions. And because translations are separate data items, and editions are separate data items, there will never be any interwiki links to reduce the number of data items through merging. Create chapter data items? The creation of those for a single novel can run into the hundreds. And if a text is misidentified and all its item have to be located and edited? The work and maintenance are prohibitive beyond belief with no real benefit. 
What is the advantage to having a data item for a chapter 3 from the 1910 UK publication of the Heinemann text of H. G. Wells' The Time Machine? It will have no interwiki links, because only the English Wikisource will have English copies of chapter 3 of the 1910, etc., and neither will it have external links to any data items because libraries don't do that sort of thing. --EncycloPetey (talk) 11:19, 26 June 2017 (UTC)
I think the situation for WikiSource is somewhat different and not my primary concern when I started this thread.
Numbers for enwiki are actually going down (333789 / 4.6%) yesterday compared to ( 411047 / 6.2%) a year ago. Still, I don't think the current level is necessary a good base level. There are benefits of having a single sitelink items, but I'm less convinced that it outweighs disadvantages ..
--- Jura 13:41, 26 June 2017 (UTC)
  • I strongly find even items with nothing but a label and a single sitelink to in themselves be useful in wikidata. Yes they require some human attention to become more useful, but in themselves they ensure that (if they are complete) everything that any wikipedia has written about has some record in wikidata, and can therefore be found and enhanced. If otherwise one has to search through every language wikipedia to see if there is a matching entity for the external identifier you are looking at, I would find that nearly hopeless. So I would strongly oppose any effort to stop adding these items. ArthurPSmith (talk) 14:36, 26 June 2017 (UTC)
  • I find these useful as well. As long as our policy is that every article in every Wikipedia should link to one and only one Wikidata item, we should create these as we go. - PKM (talk) 19:06, 26 June 2017 (UTC)
For the Dutch Wikipedia we have an approach:
  • Monitor new article creations and try to link them to an existing item or create a new item with some statements. This is all done by humans with tools like duplicity
  • When an article is at least 28 days old and hasn't been edited for 21 days a bot comes along and creates the missing item. In practice this rarely happens, but this prevents a backlog
  • People work on Wikidata:Database reports/without claims by site/nlwiki, this is just a couple of new items and mostly old. The oldest is currently from February 2013
User:NoclaimsBot also works on the Dutch Wikipedia, but rarely has any hits these days. It would probably be good to run the newitem bot for other language Wikipedias too. This way the new items come in a steady flow instead of a huge pile of items every once in a while. Multichill (talk) 19:36, 26 June 2017 (UTC)
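The waiting rule described above (article at least 28 days old, no edits for 21 days, still no item) could be sketched roughly as follows. This is only an illustration of the rule as stated here; the function name and its parameters are made up for the example and are not taken from the actual bot.

```python
from datetime import datetime, timedelta

# Thresholds as described in the nlwiki approach above.
MIN_ARTICLE_AGE = timedelta(days=28)
MIN_QUIET_PERIOD = timedelta(days=21)


def needs_bot_created_item(created, last_edited, has_item, now):
    """Hypothetical check: should a bot create the missing item for this article?"""
    if has_item:
        # Humans (or tools like duplicity) already linked the article.
        return False
    old_enough = now - created >= MIN_ARTICLE_AGE
    quiet_enough = now - last_edited >= MIN_QUIET_PERIOD
    return old_enough and quiet_enough
```

An article created eight weeks ago and untouched for 25 days would qualify; one created two weeks ago would not, which leaves time for humans to link it first.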
  •   Comment @QZanden: Chapters in a novel don't have items as they are predominantly not individually notable. How many "Chapter 1" labels would you like to see through Wikidata? Where an individual component of a work is notable, eg. a biographical entry, a paper, a poem, then WSes does look to list them individually.  — billinghurst sDrewth 21:38, 26 June 2017 (UTC)
  • Maybe we can try to adopt some of the exemplary practices of nlwiki for other Wikipedias. Obviously, it works best if contributors from a given Wikipedia regularly check duplicity. On the Wikidata side, maybe for some categories, items could be created regularly by bot after a fairly short time (include a P31 value). For others, the bot could create them after a wait time (even without P31/P279).
    --- Jura 11:47, 27 June 2017 (UTC)

Titles of films in other regions

When adding has quality (P1552) with title for Spain (Q27847754) and title for Hispanic America (Q27834579) for the film Saving Mr. Banks (Q3474574), as Jura1 said at the Infobox film of eswiki, I realized that the addition triggers a constraint report. Is this way of defining the titles in other regions incorrect? And if it's correct, where can the constraint be fixed? I can find it neither on the talk page nor on the property page. -- Agabi10 (talk) 19:47, 27 June 2017 (UTC)

Once in a while, constraint definitions get re-loaded from property talk pages. Should be ok next month ..
--- Jura 19:52, 27 June 2017 (UTC)

Named time and style periods

There is a tiny problem with named time periods (read HUGE problems). Such periods not only have different names in different countries, but they also have different start and end times. They can even appear in different order in different countries. Very confusing. Very very confusing. In Norway we had the Merovinger time between younger and older iron age. In Sweden they had the Vendel time. In Denmark they had the Younger Germanic iron age. Iron age in Norway goes from 500 BC to approx 1000 AD, unless we are talking about Sami people which had an iron age from 0 AD to 1500 AD. It is a complete mess.

So what to do? It seems like Iron Age (Q11764) has a sort of solution, but incoming references might be wrong if they don't use an additional specifier to identify which period and where it applies. And to whom, like for the Sami people. Use applies to part (P518) with an ethnic group?

Another strange example. Assume a building described as belonging to a specific architectural period like Swiss chalet style (Q2256729). This architectural period starts in a different year in a neighboring country. Use Norway and Sweden as an example. In Norway the period started about 1840 and in Sweden about 1850. In 1855 it had already started to fall out of popularity in Norway, being replaced by Dragestil (Q4562834). In Norway there were factories mass producing such houses and exporting them to Sweden, where they were sold and built as Swiss chalet style. So, is a building built in Norway (or Sweden) in late 1880-ish belonging to the Swiss chalet style if it is described on the Swedish vs Norwegian Wikipedia? What style period does Villa Fridheim (Q14942980) belong to on Swedish vs Norwegian Wikipedia? Can we say that it is the same style period anyhow?

So, the description for a style period should probably both have a reference to where the object is, and where the object originated. There are probably some problems associated with the observer too, but this is more than complex enough for now.

This starts to get really messy. Anyone with an idea how this could be simplified? Jeblad (talk) 20:16, 27 June 2017 (UTC)

I've run into this exact problem with assigning clothing to periods. I like the solution at Iron Age (Q11764), although I'd prefer to see valid in place (P3005) rather than applies to part (P518) for the regional time periods.
As to styles, a style may originate in a time period, but as you note, styles migrate, and things can be created in an outdated or old-fashioned style. I think we need to think about styles and historical periods separately. We haven't yet built out a complete hierarchy of styles. We could do that, possibly using the Getty Art & Architecture Thesaurus, and then associate the styles with a period using <inception>. - PKM (talk) 20:52, 27 June 2017 (UTC)
Yeah, historical periods and style periods are different, but both relate to time, and it is confusing. There are probably more such periods. Are style periods from architecture, visual art, and literature the same? I suspect they're not, even if architecture is an art. Perhaps this is related to model vs instantiation. Jeblad (talk) 21:26, 27 June 2017 (UTC)

Q31000000

We now have this.--GZWDer (talk) 05:04, 28 June 2017 (UTC)

Help with instance of (P31) in "Jerusalem in religion" items

What instance of (P31) and/or subclass of (P279) do I add to Jerusalem in Islam (Q6185220), Jerusalem in Judaism (Q2920442) and Jerusalem in Christianity (Q6185216)? DGtal (talk) 08:47, 27 June 2017 (UTC)

Use facet of (P1269). Sjoerd de Bruin (talk) 12:19, 27 June 2017 (UTC)
Thanks. DGtal (talk) 11:22, 28 June 2017 (UTC)

Confusion about "one of" constraint on manner of death (P1196)

Given the following premises:

I would conclude that:

However, no such constraint appears among the item's statements. Given that my premises listed above are true, how is it that my conclusion was false? (In other words, where is the constraint specified, if not at the page Property:P1196?) More importantly, why is the constraint not defined among the statements at the page Property:P1196?

N.B. I see that Property_talk:P1196 contains the template {{Constraint:One of|values={{Q|3739104}}, {{Q|171558}}, {{Q|10737}}, {{Q|149086}}, {{Q|15729048}}, {{Q|8454}}, somevalue }}. Is this where the constraint is formally specified? If so, why is it done in this totally non-obvious manner, on the property's talk page, and not using property constraint (P2302) on the item's page?

Please notify me in your reply. Thanks! Zazpot (talk) 15:23, 28 June 2017 (UTC)

@Zazpot: Hi! Two things:
  1. Constraints are currently defined in templates (like {{Constraint:One of}}) on property talk pages, but will soon (within a few weeks, hopefully) be migrated to constraint statements. Some properties already have constraint statements (e. g. COSPAR ID (P247)), but even in those cases the constraints defined in statements and on the talk pages may differ. This will hopefully be sorted out once we (the WikibaseQualityConstraints extension as well as Ivan A. Krestinin’s report generation tool) use constraint statements.
  2. The documentation on the property constraints portal (and the subpage) is independent of those constraint definitions – I copied over the manner of death (P1196) constraint since it was a nice, short example, but decided to leave out the unknown value Help for now since it wasn’t yet clear how the constraint would deal with this value. Now that this has been cleared up, I should probably update the documentation to remove this confusing mismatch.
I hope this helps… --Lucas Werkmeister (WMDE) (talk) 15:51, 28 June 2017 (UTC)
@Lucas Werkmeister (WMDE): many thanks indeed for this reply, which makes things much clearer to me. Given that Help:Property_constraints_portal currently has no mention whatsoever of {{Constraint:<whatever>}} templates, it is likely to come as a surprise to other users (as it did to me, before you replied) to find (a) that they exist at all, and (b) that they are the mechanism by which some (all?) of the examples given at Help:Property_constraints_portal are implemented. While your reply has definitely helped me, this documentation/implementation mismatch should be addressed, so that this issue will be clearer for other new users, too :) Zazpot (talk) 17:11, 28 June 2017 (UTC)
@Zazpot: well, the documentation is only a few weeks old, and I didn’t want to spend much time on documenting a soon to be obsolete system… constraint statements really shouldn’t take more than two weeks by now, I think (perhaps three depending on deployment schedule). --Lucas Werkmeister (WMDE) (talk) 17:18, 28 June 2017 (UTC)
@Lucas Werkmeister (WMDE): thanks, and understood. Hoping for a smooth deployment :) Zazpot (talk) 18:11, 28 June 2017 (UTC)
It shouldn't be that much of a surprise as they were mentioned to you at Wikidata:Project_chat#One-of_constraint_not_working.3F.
Usually the documentation lags behind, but here it's a bit ahead. The old version is at https://www.wikidata.org/w/index.php?title=Help:Property_constraints&oldid=374302843 . JakobVoss redirected it to the sci-fi version.
--- Jura 17:20, 28 June 2017 (UTC)
@Jura1: when ChristianKl said "only the constraints on the talk page get interpreted by the constraint tools", it was not clear to me at the time what that meant, because I did not know what "constraints on the talk page" were. Now it makes sense :) Thanks for the link to that helpful revision of the Constraints Portal page, which spared me from wading through many dozens of intermediate revisions! Zazpot (talk) 18:11, 28 June 2017 (UTC)

Convert Q to P

Hi. capacity factor (Q1755148) should be a property. Could someone help convert this, or guide me on how to go about it please? The term is used frequently in the energy industry. Thanks! Rehman 11:16, 26 June 2017 (UTC)

It's not a matter of converting. The existing item is fine. There's the option for creating a corresponding property but that wouldn't mean we would get rid of the existing item that's linked to interwiki links. If you want a new property, the process is to write a property proposal. ChristianKl (talk) 11:56, 26 June 2017 (UTC)
Thanks Christian. Rehman 00:15, 29 June 2017 (UTC)

Ugh category duplication! Surely we can do better

It has been previously requested of Emaus and GZWDer to watch their category creations as they have had a tendency to create duplicates. In the past few days EmausBot (talk · contribs · logs) and GZWDer (flood) (talk · contribs · logs) have created 840 duplicate categories for Commons <-> EnWiki alone [11], and at a guess many others. This is unsustainable and an abuse of volunteer time and patience to have to resolve that sort of rubbish. There needs to be a more mindful approach to the creation of categories than the current apparent "wham bam thank you ma'am" approach that is taking place.  — billinghurst sDrewth 04:18, 27 June 2017 (UTC)

Given the amount of created category-items that seems to be an error rate that's less than 1%. It would be good if it would be lower, but it's not that bad.
If I look at the created items I don't understand why Q30652379 got created by EmausBot (talk · contribs · logs) without any labels. ChristianKl (talk) 08:41, 27 June 2017 (UTC)

Maybe @Magnus Manske:'s merge game can be changed in a way that lists of merge candidates like the one that Pasleim created go to the top? Given that currently, the game has a high false positive rate, merging might get more productive with presorted lists. ChristianKl (talk) 08:59, 27 June 2017 (UTC)

Umm, how about we ensure that there is a checking mechanism and that it is both updated and current. While your percentage of errors may appear reasonable, the specific number is not. Both users need to update their code and practice. Having the creation of 800+ duplicated category items is not a good use of volunteer time to laboriously merge them.  — billinghurst sDrewth 04:10, 28 June 2017 (UTC)
BTW, by the end of the job of creating items for all bot-created articles in ceb.wiki (I am not the one working on this now; post updated on 14:58, 29 June 2017 (UTC)), there will be tens of thousands of duplicated items, and a big part of them will not appear on any page of User:Pasleim/projectmerge. XXN, 22:26, 28 June 2017 (UTC)
I have found that many of the cebWP pages have aligned with svWP so that hasn't been the most horrendous, though if that is continuing unregulated, then you just need to have PLbot generate reports. All that said, fix the problem, not propagate the problem. We need to require bot operators look to match, not just blithely keep creating because it is easier.  — billinghurst sDrewth 00:18, 29 June 2017 (UTC)

Weird vandalism caught thanks to listeriabot

See this nonsensical edit, found thanks to https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia:Le_Bistro/18_janvier_2017&curid=10535308&diff=138525196&oldid=138061797 . An account that wikidata datas has to be used. author  TomT0m / talk page 10:02, 28 June 2017 (UTC)

This happens daily, for example with given names. More patrolling or better tools for that is the answer. Sjoerd de Bruin (talk) 10:16, 28 June 2017 (UTC)
You’re closing your eyes. New patrollers won’t pop up just because you say we need more patrolling. It’s usually a mistake to say « this is the answer » to a complex problem. A complex problem has multiple faces, hence multiple entry points to be solved. We should take advantage of each of them to arrive at efficient vandalism fighting. author  TomT0m / talk page 11:07, 28 June 2017 (UTC)
I think you mean this set of edits. However, I don't think that's "vandalism"; it looks more like a new editor not understanding what they're doing. I've now welcomed them with the standard {{Welcome}} template, and wonder why no-one else already did so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:32, 28 June 2017 (UTC)
thank you. we need to welcome good faith newbies, and train them, rather than labeling them as vandals and whinging at project chat. after all, where will we get the editors to work the backlogs, if we do not recruit them? Slowking4 (talk) 03:12, 29 June 2017 (UTC)

Bot reversion of vandalism

On the English Wikipedia and other large wikis, there are a multitude of bots such as ClueBot NG which revert obvious vandalism (e.g. nonsense, rude words, page blanking) quickly after its insertion into articles. Is there a bot currently doing this on Wikidata; and if not, is it possible for this to be done (for statements, as well as labels and descriptions) given the obvious differences in data format? Jc86035 (talk) 14:37, 24 June 2017 (UTC)

That's a very interesting topic! I guess a machine learning approach could go a long way. There are many interesting features to take into account: the constraint violations (not just format), the editor's experience, the tags, the edit type, size… − Pintoch (talk) 18:18, 24 June 2017 (UTC)
ORES already does the machine learning work and is supposed to have an interface that communicates the likelihood that an edit is vandalism. ChristianKl (talk) 09:04, 25 June 2017 (UTC)
Oh right, I forgot about this one. Is there a bot that uses ORES to actually revert things? − Pintoch (talk) 09:11, 25 June 2017 (UTC)
At the moment there isn't a bot, but if I understand the ORES mission the right way, they see it as their role to provide the necessary information for a potential bot to do a task like this. ChristianKl (talk) 10:03, 25 June 2017 (UTC)
@Ladsgroup: You might want to chime in. --Lydia Pintscher (WMDE) (talk) 09:28, 26 June 2017 (UTC)
I've found ORES marking a large number of my recent merge edits in red - I think it needs to be adjusted for how wikidata works, because it is badly wrong about vandalism in a lot of cases here. ArthurPSmith (talk) 14:21, 26 June 2017 (UTC)

The trouble with vandalism has always been that overt vandalism is easy but covert vandalism is hard to detect. I suspect that there is less overt vandalism on Wikidata. That's not to say I am not interested in the problem. All the best: Rich Farmbrough 12:31, 26 June 2017 (UTC).

Yes, that's a correct assessment. I will check what threshold can be used to run such a bot. Amir (talk) 12:33, 26 June 2017 (UTC)

I started the bot with a threshold of 98.7%, which will catch around 10% of vandalism. I can change it to 97% and it will catch more than half of it, but then one in ten reverts will be wrong and we don't want that. Amir (talk) 13:17, 29 June 2017 (UTC)
@Ladsgroup: What's the expected false positive rate at the threshold of 98.7%? ChristianKl (talk) 15:11, 29 June 2017 (UTC)
2% Amir (talk) 15:13, 29 June 2017 (UTC)
Thanks a lot Amir! What is the username of the bot? I'd be interested to watch its contributions for a while. − Pintoch (talk) 15:15, 29 June 2017 (UTC)
(: here is an example Amir (talk) 15:21, 29 June 2017 (UTC)
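The threshold trade-off discussed in this thread can be pictured with a very small sketch. It assumes a score between 0 and 1 expressing how likely an edit is to be vandalism; the function name and the way the score is obtained are hypothetical and do not show the real ORES interface or the actual bot code.

```python
def should_revert(vandalism_probability, threshold=0.987):
    """Hypothetical rule: revert only when the score reaches the threshold.

    A higher threshold means fewer wrong reverts but also fewer caught edits;
    0.987 mirrors the 98.7% figure mentioned in the thread.
    """
    return vandalism_probability >= threshold
```

With the default threshold an edit scored 0.98 is left alone, while lowering the threshold to 0.97 would revert it, which is the change that was said to catch more vandalism at the cost of more wrong reverts.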

Interacting with the Data Import Hub

Under https://www.wikidata.org/wiki/Wikidata:Data_Import_Hub there are presently a few projects that are about importing data into Wikidata, but most of the projects don't get any comments. It seems to me that it would be useful if we provided more guidance on this page. ChristianKl (talk) 15:49, 27 June 2017 (UTC)

Thanks for pointing that out! I really need to write a tutorial to explain how to use OpenRefine for Wikidata imports. This is exactly what this tool is good for, but people are just not aware it exists. − Pintoch (talk) 18:10, 27 June 2017 (UTC)
It's not entirely clear whether the hub is for people trying to figure out how to do it (without asking for help on specific points) or people trying to find someone else to do it for them. Supposedly, the second group would better ask at Wikidata:Bot requests.
--- Jura 18:19, 27 June 2017 (UTC)
Some imports are large enough that we normally want a bot approval for the import to happen. It seems like currently people think that if they announce the import at the Data hub everything is fine. ChristianKl (talk) 12:37, 29 June 2017 (UTC)

One-of constraint not working?

I created the item Open Definition conformant license (Q30939938) and applied a one-of constraint (Q21510859) to it, intending that only certain items could thereafter be assigned instance of (P31) Open Definition conformant license (Q30939938).

SIL Open Font License (Q1150837) was not one of those items.

I then attempted to test whether the constraint was working, by adding the false statement: SIL Open Font License (Q1150837) instance of (P31) Open Definition conformant license (Q30939938). (I intend to undo the addition of that false statement, once I am sure the constraint is working.)

I was disappointed twice:

  1. Wikidata let me add that statement, despite the constraint;
  2. Special:ConstraintReport/Q1150837 does not list the statement SIL Open Font License (Q1150837) instance of (P31) Open Definition conformant license (Q30939938) as a constraint violation.

How can I fix this so that Wikidata will report that false statement as a constraint violation?

Please w:WP:NOTIFY me in your reply. Thanks! Zazpot (talk) 20:36, 27 June 2017 (UTC)

@Zazpot: In our data model items don't have constraints. Adding constraints to items does nothing. Even for properties, constraint statements are currently ignored and only the constraints on the talk page get interpreted by the constraint tools. Constraints on Wikidata also don't prevent anybody from adding new statements but only show that a constraint was broken. ChristianKl (talk) 21:04, 27 June 2017 (UTC)
ChristianKl, thanks for your quick reply. How, then, can I maximise the likelihood that false statements added to Wikidata about Open Definition conformant license (Q30939938) will be detected and corrected? Zazpot (talk) 21:10, 27 June 2017 (UTC)
I think as creator of the item you get notifications whenever someone adds a link. There's no mechanism to tag an item in a way that leads to it not being a valid value as instance of (P31) for other items. ChristianKl (talk) 21:18, 27 June 2017 (UTC)
ChristianKl (or anyone else following this thread): as you say, it is not possible in Wikidata at the moment for items to have constraints. That being so, is it possible for me to add a property constraint to instance of (P31) such that statements giving the object of that property as Open Definition conformant license (Q30939938) will be constrained to taking only certain items as their subject? Zazpot (talk) 15:35, 28 June 2017 (UTC)
I don't think we have a constraint that can be put on instance of (P31) that would do such a thing. ChristianKl (talk) 11:04, 29 June 2017 (UTC)

Model of time properties

 
Yet another incorrect time model displayed at mw:Wikibase/DataModel.

I am trying to understand the Wikidata model of time properties. We have a lot of documentation like Help:Wikidata_datamodel, Help:Dates or Help:Modelling/general/time, and they all seem to be outdated and possibly written before we had a time model. All the help pages (and some phabricator pages like phabricator:T73867) talk about "before" and "after" fields used for specifying a range of time. That task is currently done with start time (P580) / end time (P582) and earliest date (P1319) / latest date (P1326) qualifiers. So is all the talk about before / after fields some kind of evolutionary dead-end, or is it some kind of parallel system? If it is an evolutionary dead-end then we should move descriptions of it from Help namespace pages to Help_talk namespace pages. --Jarekt (talk) 14:45, 28 June 2017 (UTC)

Thanks Jura, I did not find that page. Ok, so the "before" and "after" (and "timezone") fields are stored, just not used. That is a bit confusing, but other help pages are not flat wrong as I thought. --Jarekt (talk) 12:08, 29 June 2017 (UTC)
I think it would be great if we would use the field. Many times, we know that a person was born either in year 19xx or in 19xx+1. ChristianKl (talk) 12:41, 29 June 2017 (UTC)
The current method is to make the date more generic, for example specify only decade or century, and then use earliest date (P1319) / latest date (P1326) qualifiers for a more precise description. Use of qualifiers is more flexible, but I wish the unused before/after fields were removed to avoid confusion. --Jarekt (talk) 15:36, 29 June 2017 (UTC)
I find the user experience of that solution quite bad. It takes a lot more effort than writing 1981-1982. It's also harder to learn for new people. ChristianKl (talk) 16:36, 29 June 2017 (UTC)
I agree, it is not very intuitive. We probably could write smarter parsers of the dates. They already recognize "1950s" as a decade and "18 century" as a century and convert "1 May 1999" to "1999-05-01". We could make it recognize "~1999" as 1999 with sourcing circumstances (P1480): circa (Q5727902) qualifier or "1981-1982" as either start time (P580) / end time (P582) or earliest date (P1319) / latest date (P1326) pair (depending if it is discrete or continuous event). --Jarekt (talk) 16:51, 29 June 2017 (UTC)
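To make the "smarter parser" idea a bit more concrete, here is a rough sketch of how such loose inputs might be mapped to a year, a precision and qualifiers. It is only an illustration: the function name, the returned dictionary layout and the `continuous` flag are invented for this example, and it is not how the existing Wikibase date parser works. The precision codes 8 (decade) and 9 (year) follow the Wikibase time datatype.

```python
import re

# Wikibase time precision codes: 8 = decade, 9 = year.
PRECISION_DECADE = 8
PRECISION_YEAR = 9


def parse_loose_date(text, continuous=False):
    """Hypothetical helper mapping loose date input to year, precision, qualifiers."""
    text = text.strip()

    # "~1999" -> 1999 with sourcing circumstances (P1480): circa (Q5727902)
    match = re.fullmatch(r"~\s*(\d{1,4})", text)
    if match:
        return {"year": int(match.group(1)), "precision": PRECISION_YEAR,
                "qualifiers": {"P1480": "Q5727902"}}

    # "1981-1982" -> start/end time for continuous events,
    # earliest/latest date for discrete ones; no main value is chosen here.
    match = re.fullmatch(r"(\d{1,4})\s*-\s*(\d{1,4})", text)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        keys = ("P580", "P582") if continuous else ("P1319", "P1326")
        return {"year": None, "precision": None,
                "qualifiers": {keys[0]: first, keys[1]: last}}

    # "1950s" -> decade precision
    match = re.fullmatch(r"(\d{1,3}0)s", text)
    if match:
        return {"year": int(match.group(1)), "precision": PRECISION_DECADE,
                "qualifiers": {}}

    # plain year
    if re.fullmatch(r"\d{1,4}", text):
        return {"year": int(text), "precision": PRECISION_YEAR, "qualifiers": {}}

    return None  # unrecognized input
```

The caller still has to say whether a range describes a continuous or a discrete event, since that cannot be read off the string itself.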


Taking Berthold, Duke of Bavaria (Q221328) as an example, the edit that created the date of birth property states the timestamp is +1000-00-00T00:00:00Z, the calendar is Julian, and the precision is 100 years. As I read the specifications for the JSON and RDF model, this precision means that only the hundred years digit, and more significant digits, are significant. In other words, the time stamp could be rewritten +10dd-dd-ddTdd:dd:ddZ where "d" means "don't care". Thus, the range of years would be 1000 to and including 1099. But many authorities consider the 10th century to comprise the years 1001 to and including 1100. Thus it is controversial for you to interpret this time value as 10th century.
You might argue that's what the user interface does, but the user interface has a long list of flaws that have been sitting around for years without action, so the user interface cannot serve as an example of what is correct. Jc3s5h (talk) 18:12, 29 June 2017 (UTC)
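The "don't care digits" reading described above can be written down as a small sketch. It assumes positive years and the precision codes 6 (millennium) to 9 (year); the function name is made up for the illustration, and this shows only that one reading of the specification, not what the user interface displays.

```python
def year_range_digit_reading(year, precision):
    """Hypothetical helper: range of years if less significant digits are ignored.

    precision 9 = year, 8 = decade, 7 = century, 6 = millennium.
    Only meant for positive years and precisions 6 to 9.
    """
    span = 10 ** (9 - precision)   # 1, 10, 100 or 1000 years
    start = year - year % span     # zero out the "don't care" digits
    return start, start + span - 1
```

For the timestamp +1000-00-00T00:00:00Z with century precision this gives the years 1000 to 1099, as stated in the comment above.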
Jc3s5h, I think you are referring to my edits on Help:Dates. I think you have it a bit backward. When I add a property stating that someone was born in 19-th century (meaning years "1801-1900" or "1800-1899") then the software saves the string +1900-00-00T00:00:00Z with precision 7. I expected it to save +1800-00-00T00:00:00Z, but that is not what happened. Similarly, when I typed "2-nd millennium" (meaning years "1001-2000") then the software saved +0003-00-00T00:00:00Z (with precision 6), not +1000-00-00T00:00:00Z as one might expect. My guess is that the JSON documentation is only correct for precision higher than a "year". (I tried it some time ago and kept notes, but now I could not reproduce it. Either something changed or I did something wrong while testing.) If you think that is wrong then you can file a bug report, but my edits to Help:Dates were trying to capture what is currently there and not what it would be logical to be there. --Jarekt (talk) 20:10, 29 June 2017 (UTC)
I think the documentation should focus on what is stored in the database (which we can't actually see, but we can see an approximation with JSON or RDF output). Any discrepancy between what is typed into the user interface and what is stored should be addressed separately. Bear in mind that the user interface is only one way to enter data into the database. Also, what was entered into the user interface is fleeting; later editors can't know if the user interface was used at all, or if so, what the editor typed. Jc3s5h (talk) 21:01, 29 June 2017 (UTC)
Jc3s5h, I did some more experiments on a test item and dates, and we are both right. I just did not do enough experiments to fully understand the pattern. Let me use quick_statements notation of timestamp/precision. So +1700-01-01T00:00:00Z/7 shows as 17-th century and 2000-00-00T00:00:00Z/6 shows as second millennium. Also 0001-00-00T00:00:00Z/6 shows as first millennium. But +1701-01-01T00:00:00Z/7 shows as 18-th century and 2001-00-00T00:00:00Z/6 shows as third millennium. So you are right that the first segment of the timestamp just shows the year and the precision shows how to round it up. --Jarekt (talk) 04:13, 30 June 2017 (UTC)
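The rounding-up pattern observed in these experiments can be summarized in a few lines. This is just a sketch that reproduces the five observations listed above; the function name is invented, and it is an assumption that the display follows this rule in general rather than only for the tested values.

```python
def displayed_ordinal(year, precision):
    """Hypothetical helper: century (precision 7) or millennium (precision 6)
    number obtained by rounding the stored year up."""
    span = {7: 100, 6: 1000}[precision]
    return (year + span - 1) // span  # integer ceiling of year / span


# Observations from the thread, as (year, precision) -> ordinal shown
observations = {
    (1700, 7): 17,  # 17-th century
    (1701, 7): 18,  # 18-th century
    (2000, 6): 2,   # second millennium
    (1, 6): 1,      # first millennium
    (2001, 6): 3,   # third millennium
}
```

Under this rule the year stored in the timestamp can be any year inside the period, and the last year of the period (1700 for the 17-th century) still rounds to that period.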

Lua code for Wikidata dates

By the way, I am working on c:Module:Wikidata date Lua code to parse Wikidata dates. Currently I can recognize:

See c:Module talk:Wikidata date/testcases for some examples. Am I missing any other ways people are using to specify dates? If so please provide me with item / property IDs. I found several cases where dates are hard to interpret. See for example:

Any other strange constructs people use? --Jarekt (talk) 16:01, 29 June 2017 (UTC)

Member of a legislative body

How would I indicate that Nick Licata was a member of the Seattle City Council? Property:P463 says it's not for this purpose, and Property:P39 doesn't seem to have a provision for being a member of a legislative body, only for holding a unique office. - Jmabel (talk) 00:04, 29 June 2017 (UTC)

position held (P39). You may need to create the role of "councillor of Seattle City Council", or whatever it is titled. There are plenty of examples around.  — billinghurst sDrewth 00:13, 29 June 2017 (UTC)
Another approach is to use position held (P39):councillor (Q708492) with an appropriate qualifier - of (P642):Seattle City Council (Q7442079) - if you don't feel it's appropriate to create a specific item for this. Andrew Gray (talk) 11:38, 29 June 2017 (UTC)
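The second approach (a generic position plus an "of" qualifier) could be entered in bulk with a tab-separated QuickStatements line, where qualifier property/value pairs follow the main statement. The sketch below only builds such a line as a string; the helper name is made up, and "Q_NICK_LICATA" is a placeholder, since the item ID for Nick Licata is not given in this thread.

```python
def quickstatement_line(item, prop, value, qualifiers=()):
    """Hypothetical helper: build one tab-separated statement line,
    with qualifier (property, value) pairs appended after the main triple."""
    fields = [item, prop, value]
    for qualifier_prop, qualifier_value in qualifiers:
        fields.extend([qualifier_prop, qualifier_value])
    return "\t".join(fields)


# position held (P39): councillor (Q708492), of (P642): Seattle City Council (Q7442079)
line = quickstatement_line(
    "Q_NICK_LICATA",  # placeholder, replace with the real item ID
    "P39", "Q708492",
    qualifiers=[("P642", "Q7442079")],
)
```

If a dedicated item for the council seat is created instead, the qualifier list would simply be left empty and the new item used as the P39 value.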

COH Challenge

Hi!

Between 1 July and 31 July, Wikimedia Sverige and UNESCO are co-arranging the second writing challenge of the Connected Open Heritage project, the COH Challenge.

As part of the Connected Open Heritage project a large number of images under a free license have been uploaded, e.g. of world heritage sites and of important archaeological and built heritage sites in Syria, Mexico, Cyprus and Sweden (the images can be found here). The purpose of this challenge is to get as many of these images as possible to be used in Wikipedia articles (however, at most five images – with caption – per article). But a lot of points are also to be gained from adding those images to Wikidata items, and adding Wikidata Item numbers on the images' file pages.

If you'd like to participate, you can find the participation page here, where you also register your points. Participate in any language you’d like! The winner receives the honor as well as great prizes.

Best, Eric Luth (WMSE) (talk) 14:02, 29 June 2017 (UTC)

Simple Query

Hello everyone,
a simple query for scientific articles with dates of publication but without a description works well for some languages and not for others. I need the output for "Arabic", but for some reason it is not correct and the result includes items with descriptions (try it). When I change the filtered language to "it" or "de" it works fine. What is the reason and how can I fix that? Note: it would work correctly if the date variable were taken out, but it's needed. --Sky xe (talk) 12:22, 29 June 2017 (UTC)

FILTER(LANG(?desc)= "it,de") is likely not what you want, LANG returns "it" https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#func-lang
MINUS{?item schema:description ?desc . FILTER(LANG(?desc) IN ("it", "de")) }
d1g (talk) 21:57, 29 June 2017 (UTC)
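The point about LANG can be illustrated outside SPARQL with a small Python analogy. LANG returns a single language tag per description, so comparing it with the combined string "it,de" never succeeds, while a membership test (like IN) does. The item IDs and descriptions below are made-up sample data, and the function is only a stand-in for the MINUS pattern, not a real query.

```python
# Made-up sample: item ID -> {language tag: description}
items = {
    "Q_A": {"it": "articolo scientifico"},
    "Q_B": {"en": "scientific article"},
    "Q_C": {},
}


def without_description_in(items, languages):
    """Keep items that have no description in any of the given languages."""
    return [qid for qid, descriptions in items.items()
            if not any(lang in languages for lang in descriptions)]


# Equality against the combined string never matches a single tag ...
equality_hits = [qid for qid, d in items.items() if any(lang == "it,de" for lang in d)]
# ... whereas a membership test behaves like FILTER(... IN ("it", "de")).
remaining = without_description_in(items, ("it", "de"))
```

Here `equality_hits` stays empty, so nothing would be removed by the original filter, while the membership version removes only the item that has an Italian description.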

@ d1g: thanks for the answer. Could you please edit the query to include what you suggested? Still don't get it exactly.

@Sky xe: otherwise your query is fine, this is some strange limitation
code should be ar, but it doesn't work with it (ara doesn't work either) d1g (talk) 11:44, 30 June 2017 (UTC)

How to distinguish sub-topics

E.g. having these two sub-categories of Category:C++ software (Q6259215):

How to distinguish them from the parent? Is has part (P527) good as a qualifier? Do you suggest some other qualifier? Do you suggest proposing another new qualifier, e.g. "sub-topic focus"? --Valerio Bozzolan (talk) 13:34, 29 June 2017 (UTC)

We don't document structures of categories on Wikidata; there were discussions about this in the past. Every Wikipedia has its own structure of categories. Sjoerd de Bruin (talk) 13:57, 29 June 2017 (UTC)
@Sjoerddebruin: Even if this is such a simple and globally recognized bifurcation that it can exist only from that parent? --Valerio Bozzolan (talk) 17:32, 29 June 2017 (UTC)
using qualifiers (Q30824740#P301), but this is not necessary for every category d1g (talk) 21:23, 29 June 2017 (UTC)
You can use category combines topics (P971): C++ (Q2407) and free software (Q341). Matěj Suchánek (talk) 13:14, 30 June 2017 (UTC)

Changing constraints in a property

I'd like to add either clothing (Q11460) (my preference) or folk costume (Q3172759) as acceptable types for indigenous to (P2341). Do I need to formally propose this somewhere? - PKM (talk) 18:47, 27 June 2017 (UTC)

@PKM: You can propose it on its talk page, pinging those involved in its proposal. Mahir256 (talk) 19:54, 27 June 2017 (UTC)
Done. Property_talk:P2341. No comments so far. - PKM (talk) 21:35, 30 June 2017 (UTC)

gymnasium (Q14092)

Have I used Library of Congress authority ID (P244) and Library of Congress Classification (P1149) on gymnasium (Q14092) correctly? --Senator2029 (talk) 18:02, 30 June 2017 (UTC)

For me, the Library of Congress Classification (P1149) gives a dead link... Library of Congress authority ID (P244) seems correct to me. Q.Zanden questions? 20:00, 30 June 2017 (UTC)

Manual edit summary in quickstatements

Would it be possible to leave an additional, but optional, edit summary in quickstatements? MechQuester (talk) 20:36, 30 June 2017 (UTC)

When you hit Run in background you will be asked if you want to set an edit summary. It will then display, beside the #quickstatement label, the batch number with a link to that batch, where the summary is shown. For examples, see the latest edits from QuickStatementsBot, where I ran a batch of three test edits. The summary links to the batch number where you can find the summary. It is not possible to add a summary when running QS by hand. Q.Zanden questions? 22:46, 30 June 2017 (UTC)